gaycalgary.com
robots.txt

Robots Exclusion Standard data for gaycalgary.com

Resource Scan

Scan Details

Site Domain gaycalgary.com
Base Domain gaycalgary.com
Scan Status Ok
Last Scan 2024-11-20T14:45:45+00:00
Next Scan 2024-11-27T14:45:45+00:00

Last Scan

Scanned 2024-11-20T14:45:45+00:00
URL http://gaycalgary.com/robots.txt
Domain IPs 184.71.230.122
Response IP 184.71.230.122
Found Yes
Hash c551c5845aabb64e60f924a32a12b042742062937dd28c103713b7e8f0938eee
SimHash 6033d1828187

Groups

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

googlebot

Rule Path
Disallow

googlebot-image

Rule Path
Disallow

googlebot-mobile

Rule Path
Disallow

msnbot

Rule Path
Disallow

slurp

Rule Path
Disallow

teoma

Rule Path
Disallow

gigabot

Rule Path
Disallow

robozilla

Rule Path
Disallow

nutch

Rule Path
Disallow

ia_archiver

Rule Path
Disallow

baiduspider

Rule Path
Disallow

naverbot

Rule Path
Disallow

yeti

Rule Path
Disallow

yahoo-mmcrawler

Rule Path
Disallow

psbot

Rule Path
Disallow

yahoo-blogs/v3.9

Rule Path
Disallow

*

Rule Path
Disallow
Disallow /cgi-bin/
Disallow /gs/
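
The groups above can be read as follows: a "Disallow /" rule blocks the whole site for that agent, while a "Disallow" rule with an empty path blocks nothing. A minimal sketch of that reading, using Python's standard urllib.robotparser, is shown here. The ROBOTS_LINES list is a hypothetical reconstruction of three of the listed groups, not the actual file, and the helper name is_allowed is illustrative only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of three groups from the scan (not the real file).
ROBOTS_LINES = [
    "User-agent: GPTBot",
    "Disallow: /",      # whole site blocked for this agent
    "",
    "User-agent: CCBot",
    "Disallow: /",
    "",
    "User-agent: Googlebot",
    "Disallow:",        # empty path: nothing is blocked
]


def is_allowed(agent, url="http://gaycalgary.com/"):
    """Return True if `agent` may fetch `url` under ROBOTS_LINES."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


results = {agent: is_allowed(agent) for agent in ("GPTBot", "CCBot", "Googlebot")}
print(results)
```

Different parsers resolve overlapping rules within one group differently, so the "*" group (an empty Disallow followed by path rules) is left out of this sketch on purpose.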

Other Records

Field Value
crawl-delay 720
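
A crawl-delay value is conventionally read as a number of seconds between requests, which would make 720 a twelve-minute pause. It is a non-standard directive and not every crawler honours it. The sketch below shows how a client might read it with Python's standard urllib.robotparser. The scan lists the field under "Other Records", so which group it belongs to in the real file is not known; placing it under "*" in SAMPLE is an assumption, and "ExampleBot" is a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# Assumed placement: crawl-delay attached to the "*" group (not confirmed by the scan).
SAMPLE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /gs/
Crawl-delay: 720
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())  # parse in-memory text, no network access

# "ExampleBot" is hypothetical; it has no group of its own, so "*" applies.
delay = parser.crawl_delay("ExampleBot")
minutes = delay / 60
print(f"crawl-delay: {delay} (about {minutes:.0f} minutes if read as seconds)")
```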

Comments

  • robots.txt generated by www.seoptimer.com and https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/