eurekalert.org
robots.txt

Robots Exclusion Standard data for eurekalert.org

Resource Scan

Scan Details

Site Domain eurekalert.org
Base Domain eurekalert.org
Scan Status Ok
Last Scan 2024-04-27T12:17:54+00:00
Next Scan 2024-05-27T12:17:54+00:00

Last Scan

Scanned 2024-04-27T12:17:54+00:00
URL https://eurekalert.org/robots.txt
Redirect https://www.eurekalert.org/robots.txt
Redirect Domain www.eurekalert.org
Redirect Base eurekalert.org
Domain IPs 52.8.62.150, 54.151.119.244
Redirect IPs 54.183.108.71, 54.241.196.89
Response IP 54.241.196.89
Found Yes
Hash 363e79564872354854d745aa2cbcec6b12aa2f1feed400328f2dda914f724087
SimHash 8811f8d58614
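
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the scan report does not name the algorithm. Under that assumption, a minimal sketch for checking a locally fetched copy against the reported value might look like this (the function names are hypothetical):

```python
import hashlib

# Hash value as reported in the scan details above.
REPORTED_HASH = "363e79564872354854d745aa2cbcec6b12aa2f1feed400328f2dda914f724087"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes) -> bool:
    """Compare a fetched robots.txt body with the reported hash.

    Assumption: the scanner hashes the raw bytes with SHA-256. If it
    normalizes the content first, the digests will differ even for an
    unchanged file.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would not by itself prove the file changed, since the exact bytes the scanner hashed are not documented here.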

Groups

*

Rule Path
Disallow /build
Disallow /bundles
Disallow /images
Disallow /js
Disallow /pdfs
Disallow /advancedSearch
Disallow /simplesearch
Disallow /reporter
Disallow /pio
Disallow /admin
Disallow /news-releases/browse
Disallow /language
Disallow /multimedia/all
Disallow /multimedia/images
Disallow /multimedia/video
Disallow /multimedia/audio
Disallow /meetings
Disallow /specialtopic
Disallow /newsroom
Disallow /newsportal
Disallow /press
Disallow /tipsheet
Disallow /funderportal
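
To illustrate how a crawler might apply the rules listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule lines are reconstructed from the table, so their exact order and formatting in the real file are assumptions, and `example-bot` and `is_allowed` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Paths taken from the Disallow table above for the "*" group.
DISALLOWED = [
    "/build", "/bundles", "/images", "/js", "/pdfs",
    "/advancedSearch", "/simplesearch", "/reporter", "/pio", "/admin",
    "/news-releases/browse", "/language",
    "/multimedia/all", "/multimedia/images",
    "/multimedia/video", "/multimedia/audio",
    "/meetings", "/specialtopic", "/newsroom", "/newsportal",
    "/press", "/tipsheet", "/funderportal",
]

# Rebuild a robots.txt body from the table (layout is an assumption).
robots_lines = ["User-agent: *"] + [f"Disallow: {p}" for p in DISALLOWED]

parser = RobotFileParser()
parser.parse(robots_lines)


def is_allowed(path: str, agent: str = "example-bot") -> bool:
    """Return True if the reconstructed rules permit fetching this path."""
    return parser.can_fetch(agent, "https://www.eurekalert.org" + path)
```

Because this parser matches rules by path prefix, `is_allowed("/multimedia/all")` is rejected while `is_allowed("/multimedia")` is permitted, and a rule such as `/press` would also cover any longer path that begins with those characters.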

Other Records

Field Value
crawl-delay 1

Other Records

Field Value
sitemap https://www.eurekalert.org/sitemap.xml
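
The crawl-delay and sitemap records above can also be read with `urllib.robotparser`. The sketch below assumes the crawl-delay belongs to the `*` group and uses a shortened, reconstructed rule list; `example-bot` is a hypothetical user agent, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction of the file; only one Disallow rule is kept
# for brevity, and the placement of Crawl-delay is an assumption.
robots_lines = [
    "User-agent: *",
    "Disallow: /admin",
    "Crawl-delay: 1",
    "Sitemap: https://www.eurekalert.org/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(robots_lines)

# Seconds a polite crawler would wait between requests, or None if unset.
delay = parser.crawl_delay("example-bot")

# List of sitemap URLs declared in the file, or None if there are none.
sitemaps = parser.site_maps()
```

A crawler honoring these records would sleep for `delay` seconds between requests and could seed its URL queue from the entries in `sitemaps`.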

Comments

  • robots.txt for https://www.eurekalert.org