sciencebase.gov
robots.txt

Robots Exclusion Standard data for sciencebase.gov

Resource Scan

Scan Details

Site Domain sciencebase.gov
Base Domain sciencebase.gov
Scan Status Ok
Last Scan 2024-11-06T16:18:35+00:00
Next Scan 2024-12-06T16:18:35+00:00

Last Scan

Scanned 2024-11-06T16:18:35+00:00
URL https://sciencebase.gov/robots.txt
Redirect https://www.sciencebase.gov/robots.txt
Redirect Domain www.sciencebase.gov
Redirect Base sciencebase.gov
Domain IPs 137.227.229.83
Redirect IPs 137.227.248.21, 2001:49c8:8000:121d::1083
Response IP 137.227.248.21
Found Yes
Hash 50d47f70ce4527d38dfa1da942d75b81d529fe92f229dec531f6c7cc730d8489
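
The 64 hexadecimal digits of the Hash value are consistent with a SHA-256 digest, although the scan does not name the algorithm or say whether the body was normalised before hashing. A minimal sketch, under that assumption, of how such a fingerprint could be used to detect a changed robots.txt between scans (the function name and the two sample bodies are hypothetical, not taken from the scan):

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    SHA-256 is an assumption based on the 64-hex-digit length of the
    Hash field; the scan itself does not state the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two successive scans.
previous = b"User-agent: *\nDisallow: /arcgis/\n"
current = b"User-agent: *\nDisallow: /arcgis/\nCrawl-delay: 10\n"

# Any byte-level difference produces a different digest.
changed = body_fingerprint(previous) != body_fingerprint(current)
print(changed)
```

An exact digest only signals that something changed. The shorter SimHash field presumably serves the complementary purpose of judging how similar two versions are, but the scan does not document how it is computed.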
SimHash 688c9c438682

Groups

*

Rule Path
Disallow /confluence/
Disallow /arcgis/
Disallow /flexviewer/
Disallow /catalogMaps/
Disallow /vocab/
Disallow /catalog/file/
Disallow /catalog/folder/
Disallow /catalog/item/imap/
Disallow /catalog/item/feedMap/
Disallow /catalog/items/searchOverlay
Disallow /catalog/*?community=*
Disallow /catalog/*%26community%3D*
Disallow /catalog/*?format=*
Disallow /catalog/*%26format%3D*
Disallow /catalog/*?view=*
Disallow /catalog/*%26view%3D*
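
Several of the paths above use `*` as a wildcard. The sketch below shows one way such patterns could be evaluated, following the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a literal prefix match. It is an illustration only: the helper names are hypothetical, only a subset of the listed rules is included, and it compares strings literally without normalising percent-encoding, so `?format=` and `%26format%3D` are treated as distinct patterns, just as they are listed.

```python
import re

# Subset of the Disallow paths listed for the "*" group.
DISALLOW = [
    "/arcgis/",
    "/catalog/file/",
    "/catalog/items/searchOverlay",
    "/catalog/*?format=*",
    "/catalog/*%26format%3D*",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the path (query string included)."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


print(is_disallowed("/arcgis/rest/services"))          # plain prefix rule
print(is_disallowed("/catalog/item/abc?format=json"))  # wildcard rule
print(is_disallowed("/catalog/item/abc"))              # no rule applies
```

A full evaluator would also handle Allow rules and prefer the longest matching pattern; that is left out here because this group lists only Disallow rules.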

Other Records

Field Value
crawl-delay 10
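
A crawler that is not named in one of the agent-specific groups falls back to the `*` group and would be expected to wait 10 seconds between requests, while a named agent uses its own group instead. A rough sketch of that lookup with Python's standard `urllib.robotparser` follows. The robots.txt text in it is a hand-written reconstruction from part of this listing, not the file as served, and `examplebot` is a hypothetical agent name. Only plain-prefix rules are included because this parser matches rule paths by prefix and does not interpret `*` inside a path.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption: the real file's layout may differ).
ROBOTS_TXT = """\
User-agent: *
Disallow: /arcgis/
Disallow: /catalog/file/
Crawl-delay: 10

User-agent: googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://www.sciencebase.gov"

# An unnamed agent falls back to the "*" group and its crawl delay.
print(parser.crawl_delay("examplebot"))
print(parser.can_fetch("examplebot", base + "/catalog/"))
print(parser.can_fetch("examplebot", base + "/arcgis/rest"))

# A named agent is governed by its own group, which blocks everything.
print(parser.can_fetch("googlebot", base + "/catalog/"))
```

Because the `googlebot` group carries no crawl-delay record of its own, `parser.crawl_delay("googlebot")` returns `None` rather than inheriting the value from the `*` group.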

dotbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

uptimerobot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

googlebot

Rule Path
Disallow /

bingbot

Rule Path
Disallow /

yahoo! slurp

Rule Path
Disallow /

duckduckbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

yandex

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.sciencebase.gov/sitemaps/sitemap_index.xml