genome.ucsc.edu
robots.txt
Robots Exclusion Standard data for genome.ucsc.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | genome.ucsc.edu |
Base Domain | ucsc.edu |
Scan Status | Ok |
Last Scan | 2024-05-22T10:59:06+00:00 |
Next Scan | 2024-06-21T10:59:06+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-22T10:59:06+00:00 |
URL | https://genome.ucsc.edu/robots.txt |
Domain IPs | 128.114.119.131, 128.114.119.132 |
Response IP | 128.114.119.131 |
Found | Yes |
Hash | dccbf87ecaaa565770711b4d7b27e4f2e7d39b46a19bb8c202e5b2c1fe8c57ae |
SimHash | 540db9e1e2a1 |
Groups
adsbot-google
ahrefsbot
amazonbot
anthropic-ai
applebot
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
semrushbot
friendlycrawler
google-extended
googleother
gptbot
img2dataset
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
scoop.it
seekr
yandexbot
youbot
Rule | Path |
---|---|
Disallow | / |
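As a rough illustration of how a crawler might read the group above, the sketch below feeds a hand-written fragment to Python's standard `urllib.robotparser`. The fragment is an assumption: it mirrors only three of the listed user agents plus the prefix rules of the `*` group, and `ExampleBot` is a hypothetical agent name that does not appear in the report.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment mirroring part of the report (assumption: the live
# file lists many more user agents in the first group).
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
User-agent: claudebot
Disallow: /

User-agent: *
Disallow: /admin/stats/
Disallow: /goldenPath/certificate.html
Disallow: /goldenPath/certificates/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://genome.ucsc.edu"

# A listed agent is blocked from the whole site by "Disallow: /".
listed_blocked = not parser.can_fetch("GPTBot", BASE + "/index.html")

# An unlisted agent (hypothetical name) falls through to the "*" group.
other_allowed = parser.can_fetch("ExampleBot", BASE + "/index.html")
other_blocked = not parser.can_fetch("ExampleBot", BASE + "/admin/stats/")

print(listed_blocked, other_allowed, other_blocked)
```

The standard-library parser compares agent names case-insensitively and matches paths by simple prefix, which is enough for the rules used in this fragment.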
*
Rule | Path |
---|---|
Disallow | /admin/stats/ |
Disallow | /goldenPath/certificate.html |
Disallow | /goldenPath/certificates/ |
Disallow | /cgi-bin/hgTracks*.customText*. |
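The last rule in the table above contains `*` wildcards, which a plain prefix comparison does not expand. The following is a minimal sketch of wildcard matching in the style described by RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end). The function name `rule_matches` and the sample URLs are hypothetical and are not taken from the report.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    "*" matches zero or more characters; a trailing "$" anchors the
    pattern to the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so rules behave as prefix patterns.
    return re.match(regex, path) is not None


RULE = "/cgi-bin/hgTracks*.customText*."

# Hypothetical request paths, used only for illustration.
print(rule_matches(RULE, "/cgi-bin/hgTracks?hgt.customText=http://example.org/x.bed"))
print(rule_matches(RULE, "/cgi-bin/hgTracks?db=hg38"))
```

Under this reading, the rule only applies to `hgTracks` requests whose remainder contains `.customText` followed later by another dot; ordinary `hgTracks` requests are not affected.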
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
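A crawler that honours the crawl-delay record could space its requests as sketched below. This is an assumption-laden example: the report does not say which group the record belongs to, so the fragment places it in the `*` group, and `ExampleBot` and `fetch_offsets` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the crawl-delay record is placed in the "*" group here; the
# report lists it under "Other Records" without naming a group.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /admin/stats/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds for the matching group, or None.
delay = parser.crawl_delay("ExampleBot")  # hypothetical agent name


def fetch_offsets(n_requests: int, delay_seconds: int) -> list[int]:
    """Earliest start offsets, in seconds, for requests spaced by the delay."""
    return [i * delay_seconds for i in range(n_requests)]


offsets = fetch_offsets(4, delay or 0)
print(delay, offsets)
```

Computing offsets instead of sleeping keeps the sketch fast to run; a real crawler would wait at least `delay` seconds between consecutive requests to the host.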