wilburwilliams.uk
robots.txt

Robots Exclusion Standard data for wilburwilliams.uk

Resource Scan

Scan Details

Site Domain wilburwilliams.uk
Base Domain wilburwilliams.uk
Scan Status Ok
Last Scan 2025-09-02T04:15:45+00:00
Next Scan 2025-09-09T04:15:45+00:00

Last Scan

Scanned 2025-09-02T04:15:45+00:00
URL https://wilburwilliams.uk/robots.txt
Domain IPs 2a03:90c0:999c::12, 81.28.12.12
Response IP 81.28.12.12
Found Yes
Hash 92843dedbb037503ab3c7207591d5cd501d7ece5543952140109ef8200b67f6b
SimHash 354a8911c3f4
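
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a rough sketch under that assumption, a fetched copy of the file could be compared against the recorded hash as follows. The helper names `sha256_hex` and `matches_recorded_hash` are hypothetical, and whether the scanner hashes the raw response bytes or a normalized form is not known.

```python
import hashlib

# Hash value copied from the scan details above.
RECORDED_HASH = "92843dedbb037503ab3c7207591d5cd501d7ece5543952140109ef8200b67f6b"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a fetched robots.txt body against the recorded hash.

    Assumption: the scanner hashed the raw response body with SHA-256.
    A mismatch may simply mean the scanner normalizes the file first.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch here would not prove that the file changed, only that the assumption about how the hash was computed, or the file content, differs from the scan.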

Groups

gptbot
oai-searchbot

Rule Path
Disallow /

google-extended
googleother
googleother-image
googleother-video
bardbot

Rule Path
Disallow /

anthropic-ai
claudebot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ai2bot
ai2bot-dolma
amazonbot
applebot
applebot-extended
ccbot
cohere-ai
diffbot
duckassistbot
facebookbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
omgili
omgilibot
pangubot
perplexitybot
petalbot
scrapy
sidetrade indexer bot
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /
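
Each group above pairs a list of user agents with a single `Disallow /` rule. The sketch below shows how rules of that shape can be checked with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` fragment is a reconstruction of three of the groups listed above, not the original file: directive order, spacing, and the lowercase agent names are assumptions taken from the scan output, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment (assumption): three of the groups from the scan,
# written in standard robots.txt syntax. The real file may differ in layout.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: oai-searchbot
Disallow: /

User-agent: anthropic-ai
User-agent: claudebot
Disallow: /

User-agent: bytespider
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "ExampleBrowser"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://wilburwilliams.uk/"))
```

Since the scan lists no wildcard (`*`) group, an agent that matches none of the listed names is not restricted by these rules. Agent matching in `urllib.robotparser` is case-insensitive, so `GPTBot` matches the `gptbot` group.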

Comments

  • OpenAI (I still hate that they're called that, it's not open anymore) GPT scraper
  • Google's "bard" AI scraper
  • Claude
  • Bytespider just cuz fck you bytedance
  • Other bots