findingaugustine.org
robots.txt

Robots Exclusion Standard data for findingaugustine.org

Resource Scan

Scan Details

Site Domain findingaugustine.org
Base Domain findingaugustine.org
Scan Status Ok
Last Scan 2025-10-17T00:06:11+00:00
Next Scan 2025-11-16T00:06:11+00:00

Last Scan

Scanned 2025-10-17T00:06:11+00:00
URL https://findingaugustine.org/robots.txt
Domain IPs 153.104.7.32
Response IP 153.104.7.32
Found Yes
Hash 794f43a20e373a3a67de76bc6cffd9cd6fb4546033eaa4a623775d9eb05584ed
SimHash 72a7195180e4

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bytespider

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sogou inst spider

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claudebot/1.0

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

omgili

Rule Path
Disallow /

academicbotrtu

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

test-bot

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

timpibot/0.9

Rule Path
Disallow /

ai2bot
ai2bot-dolma
amazonbot
applebot
applebot-extended
cohere-ai
facebookexternalhit
friendlycrawler
googleother
googleother-image
googleother-video
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
oai-searchbot
omgilibot
perplexitybot
scrapy
sidetrade indexer bot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /

Other Records

Field Value
sitemap http://findingaugustine.org/sitemapIndex.xml

Comments

  • ---
  • Dark Visitors
  • https://darkvisitors.com/
  • Claude
  • Common Crawl
  • ChatGPT user prompt research
  • Google AI training data crawl
  • ---
  • Islandora Issues
  • https://github.com/Islandora/documentation/issues/2286
  • OpenAI training data crawl
  • --- From https://github.com/ai-robots-txt/ai.robots.txt/blob/main/robots.txt, deduplicated with stanzas above, updated 10/28/24
  • ---
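
The groups and records listed in this report can be checked programmatically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `urllib.robotparser`. The `ROBOTS_EXCERPT` text is a hand-written approximation built from a few of the groups above, not a copy of the live file: the report does not show the original casing, ordering, or comment lines. The names `load_rules`, `PAGE`, and the `ExampleBrowser` agent are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt in robots.txt syntax, based on a few groups from the
# report. ASSUMPTION: casing and ordering are guesses; the live file may differ.
ROBOTS_EXCERPT = """\
User-agent: *
Crawl-delay: 10

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
User-agent: YouBot
Disallow: /

Sitemap: http://findingaugustine.org/sitemapIndex.xml
"""

# Hypothetical page URL, used only to exercise the rules.
PAGE = "https://findingaugustine.org/some/page"


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_EXCERPT)

if __name__ == "__main__":
    # "ExampleBrowser" is a made-up agent that matches only the "*" group.
    for agent in ("ExampleBrowser", "GPTBot", "ClaudeBot", "YouBot"):
        allowed = rules.can_fetch(agent, PAGE)
        delay = rules.crawl_delay(agent)
        print(f"{agent}: allowed={allowed}, crawl_delay={delay}")
    print("sitemaps:", rules.site_maps())
```

In this sketch, an agent that matches only the `*` group is allowed everywhere and inherits the crawl delay, while the agents named in a `Disallow /` group are refused for every path. Python's parser matches agent names case-insensitively and by substring, which may not be how a given crawler interprets the same file.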