genome.ucsc.edu
robots.txt

Robots Exclusion Standard data for genome.ucsc.edu

Resource Scan

Scan Details

Site Domain genome.ucsc.edu
Base Domain ucsc.edu
Scan Status Ok
Last Scan 2024-05-22T10:59:06+00:00
Next Scan 2024-06-21T10:59:06+00:00

Last Scan

Scanned 2024-05-22T10:59:06+00:00
URL https://genome.ucsc.edu/robots.txt
Domain IPs 128.114.119.131, 128.114.119.132
Response IP 128.114.119.131
Found Yes
Hash dccbf87ecaaa565770711b4d7b27e4f2e7d39b46a19bb8c202e5b2c1fe8c57ae
SimHash 540db9e1e2a1

Groups

adsbot-google
ahrefsbot
amazonbot
anthropic-ai
applebot
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
semrushbot
friendlycrawler
google-extended
googleother
gptbot
img2dataset
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
scoop.it
seekr
yandexbot
youbot

Rule Path
Disallow /
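
The group above applies a single rule, Disallow /, to every listed user agent. As a minimal sketch, a client could check this with Python's standard urllib.robotparser module. The robots.txt text below is reconstructed from this report for only three of the listed agents; it is not copied from the live file, whose layout may differ. The helper name is_allowed and the agent string "ExampleBrowser" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the report for three of the listed agents;
# the live file's layout and ordering may differ.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, "https://genome.ucsc.edu" + path)


print(is_allowed("GPTBot", "/"))                      # listed agent
print(is_allowed("CCBot/2.0", "/cgi-bin/hgGateway"))  # listed agent with version suffix
print(is_allowed("ExampleBrowser", "/"))              # agent not named in this group
```

Agent names are compared case-insensitively and a version suffix after "/" is ignored, so "GPTBot" and "CCBot/2.0" both fall under this group. This sketch contains no * group, so an unlisted agent is treated as unrestricted here.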

*

Rule Path
Disallow /admin/stats/
Disallow /goldenPath/certificate.html
Disallow /goldenPath/certificates/
Disallow /cgi-bin/hgTracks*.customText*.
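
The last rule in this group uses * wildcards, which Python's urllib.robotparser treats as literal characters rather than patterns. The sketch below shows one way to match such rules, assuming the common convention that * matches any run of characters and a trailing $ anchors the end of the URL. The names rule_to_regex and is_disallowed are hypothetical, and the sample URLs are illustrative only.

```python
import re

# Rule paths copied from the "*" group in this report.
DISALLOWED = [
    "/admin/stats/",
    "/goldenPath/certificate.html",
    "/goldenPath/certificates/",
    "/cgi-bin/hgTracks*.customText*.",
]


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    Assumed convention: '*' matches any sequence of characters and a
    trailing '$' anchors the match at the end of the URL.
    """
    anchored = rule_path.endswith("$")
    body = rule_path[:-1] if anchored else rule_path
    # Escape literal pieces, then join them with '.*' where '*' appeared.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path_and_query: str) -> bool:
    """True if any rule matches from the start of the path (plus query)."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in DISALLOWED)


print(is_disallowed("/admin/stats/index.html"))
print(is_disallowed("/cgi-bin/hgTracks?db=hg38"))
print(is_disallowed("/cgi-bin/hgTracks?hgt.customText=http://example.org/x.bed"))
```

The sketch ignores Allow rules and longest-match precedence, since this group lists only Disallow rules.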

Other Records

Field Value
crawl-delay 5
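
The crawl-delay record is commonly read as a request to wait that many seconds between successive requests, although not every crawler honors the field. A minimal sketch of pacing a list of URLs under that reading follows; the generator name paced and its sleep parameter are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Iterable, Iterator

CRAWL_DELAY_SECONDS = 5  # value of the crawl-delay record in this report


def paced(urls: Iterable[str],
          delay: float = CRAWL_DELAY_SECONDS,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield each URL, pausing `delay` seconds between consecutive items.

    `sleep` is injectable so the pacing can be checked without waiting.
    """
    first = True
    for url in urls:
        if not first:
            sleep(delay)  # wait before every request except the first
        first = False
        yield url
```

A caller would iterate with for url in paced(url_list): and issue one request per iteration, so three URLs incur two pauses of 5 seconds each.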