guidetogamblingonline.com
robots.txt

Robots Exclusion Standard data for guidetogamblingonline.com

Resource Scan

Scan Details

Site Domain guidetogamblingonline.com
Base Domain guidetogamblingonline.com
Scan Status Ok
Last Scan 2026-03-26T03:28:58+00:00
Next Scan 2026-04-25T03:28:58+00:00

Last Scan

Scanned 2026-03-26T03:28:58+00:00
URL https://guidetogamblingonline.com/robots.txt
Domain IPs 185.181.254.116
Response IP 185.181.254.116
Found Yes
Hash 9190915864a2a80312a894236329f46d87241f5518b47c853e24ea838e831ced
SimHash 681de313d760
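
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm or say which bytes were hashed, so the sketch below assumes SHA-256 over the raw response body. The names `sha256_hex` and `matches_report` are hypothetical helpers for illustration, not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "9190915864a2a80312a894236329f46d87241f5518b47c853e24ea838e831ced"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a locally fetched robots.txt body against the reported hash.

    Assumption: the scanner hashed the unmodified response body with SHA-256.
    A mismatch may only mean that this assumption is wrong.
    """
    return sha256_hex(body) == reported.lower()
```

To use it, pass the bytes of a downloaded copy of https://guidetogamblingonline.com/robots.txt to `matches_report`. A `False` result may mean the file changed since the scan, or simply that the scanner normalizes the content before hashing.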

Groups

scrapy

Rule Path
Allow /

scrapy

Rule Path
Allow /

googlebot

Rule Path
Allow /sitemap.xml
Allow /sitemap.xml.gz

*

Rule Path
Disallow /cgi-bin/
Disallow /contact/
Disallow /trackback/
Disallow /*?*
Disallow */trackback/
Disallow /auto/

googlebot

Rule Path
Disallow /*.inc$
Disallow /*.gz$
Disallow /*.cgi$
Disallow /*.wmv$
Disallow /*.cgi$
Disallow /*.xhtml$
Disallow */trackback*
Disallow /z/

googlebot-image

Rule Path
Allow /*

*

Rule Path
Disallow /cgi-bin/
Disallow /about/
Disallow /contact/
Disallow /wp-
Disallow /feed/
Disallow /trackback/

ia_archiver

Rule Path
Disallow /

duggmirror

Rule Path
Disallow /

googlebot

Rule Path
Allow /*.php$
Allow /*.js$
Allow /*.inc$
Allow /*.css$
Allow /*.gz$
Allow /*.wmv$
Allow /*.cgi$
Allow /*.xhtml$
Allow /visit/
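
Several googlebot groups above contain overlapping Allow and Disallow rules (for example, both `Disallow /*.gz$` and `Allow /*.gz$`). The sketch below shows one way such rules could be evaluated, assuming the precedence described in RFC 9309: groups for the same user agent are merged, the longest matching pattern wins, and Allow wins a tie. `*` is treated as "any sequence of characters" and a trailing `$` as "end of path". The function names are hypothetical, and a real crawler may behave differently.

```python
import re

# Rules copied from the three googlebot groups in the report, merged in order.
GOOGLEBOT_RULES = [
    ("allow", "/sitemap.xml"), ("allow", "/sitemap.xml.gz"),
    ("disallow", "/*.inc$"), ("disallow", "/*.gz$"), ("disallow", "/*.cgi$"),
    ("disallow", "/*.wmv$"), ("disallow", "/*.cgi$"), ("disallow", "/*.xhtml$"),
    ("disallow", "*/trackback*"), ("disallow", "/z/"),
    ("allow", "/*.php$"), ("allow", "/*.js$"), ("allow", "/*.inc$"),
    ("allow", "/*.css$"), ("allow", "/*.gz$"), ("allow", "/*.wmv$"),
    ("allow", "/*.cgi$"), ("allow", "/*.xhtml$"), ("allow", "/visit/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Longest matching pattern wins; on equal length, allow beats disallow.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as prefix matching requires.
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed(GOOGLEBOT_RULES, "/sitemap.xml.gz")` is `True` because the 15-character Allow rule outranks `Disallow /*.gz$`, while `is_allowed(GOOGLEBOT_RULES, "/z/page.html")` is `False` because only `Disallow /z/` matches.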

Comments

  • disallow all files in these directories
  • disallow all files ending with these extensions
  • Allow google image bot to search all images
  • Allow archiving site
  • Allow duggmirror