benjamins.com
robots.txt

Robots Exclusion Standard data for benjamins.com

Resource Scan

Scan Details

Site Domain benjamins.com
Base Domain benjamins.com
Scan Status Ok
Last Scan 2025-10-10T04:24:55+00:00
Next Scan 2025-11-09T04:24:55+00:00

Last Scan

Scanned 2025-10-10T04:24:55+00:00
URL https://benjamins.com/robots.txt
Domain IPs 104.26.2.42, 104.26.3.42, 172.67.69.65, 2606:4700:20::681a:22a, 2606:4700:20::681a:32a, 2606:4700:20::ac43:4541
Response IP 104.26.3.42
Found Yes
Hash 2933b4a24de909c1b8a387e9bff27b3f1e19a98426edae50b91bcb9f16fbf142
SimHash 684293870653
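
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. As a minimal sketch under that assumption, a content fingerprint of a fetched robots.txt body could be computed as follows. The function name `fingerprint` is hypothetical and not part of the scanner.

```python
import hashlib

def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()

# Two scans can be compared by digest to detect a changed file.
previous = fingerprint(b"User-agent: *\nDisallow: /search\n")
current = fingerprint(b"User-agent: *\nDisallow: /search\nDisallow: /feed\n")
changed = previous != current
```

Comparing digests between scans only shows that the file changed, not how much. That may be the role of the separate SimHash field, though the report does not describe it.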

Groups

*

Rule Path
Disallow /search
Disallow /catalog/search
Disallow /series
Disallow /*/getpdf
Disallow /feed
Disallow /jbp/temp/
Disallow /claire
Disallow /online/login
Disallow /cgi-bin
Disallow /cbc/sc/
Disallow /content/about/team
Disallow /catalog-inquiry
Disallow /online/*/publications
Disallow /online/*current-filter
Disallow /online/*/pdf/
Disallow /online/getpdf/
Disallow /online/*/authors
Disallow /online/*/journals
Disallow /online/*/series
Disallow /online/*/keywords
Disallow /online/*/persons
Disallow /online/*/subjects
Disallow /online/*/languages
Disallow /online/*/thesaurus
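
Several of the paths above contain `*`, which crawlers following RFC 9309 treat as "any sequence of characters"; every rule is also a prefix match against the URL path. The sketch below shows one way to test a path against a subset of these rules. It assumes RFC 9309-style matching and relies on the fact that this group has no Allow rules, so no longest-match precedence is needed. The helper names `rule_to_regex` and `is_disallowed` are illustrative. Python's standard `urllib.robotparser` is not used because it does not expand `*` inside paths.

```python
import re

# A subset of the Disallow paths listed for the "*" group above.
DISALLOW = [
    "/search",
    "/catalog/search",
    "/*/getpdf",
    "/online/*/pdf/",
    "/online/*current-filter",
    "/online/getpdf/",
]

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the rule is a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any Disallow rule matches the given URL path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)
```

For example, `is_disallowed("/online/jbp/pdf/1")` is true because of `/online/*/pdf/`, while `is_disallowed("/catalog/books")` is false. Because rules are prefixes, `/search` also covers paths such as `/searching`.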

Other Records

Field Value
sitemap https://benjamins.com/sitemap.xml
sitemap https://benjamins.com/sitemaps/sitemap-bbr-handbooks.xml

Comments

  • robots.txt for John Benjamins
  • Disallow: */resources
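
The Groups, Other Records, and Comments sections above correspond to three kinds of lines in a robots.txt file: user-agent groups with their rules, standalone records such as `sitemap`, and `#` comments. A minimal sketch of that split follows. The `SAMPLE` text is a hypothetical reconstruction based on this report, not the actual file, and `parse_robots` is an illustrative helper, not the scanner's parser.

```python
# Hypothetical reconstruction of a few lines; the real file is not shown here.
SAMPLE = """\
# robots.txt for John Benjamins
User-agent: *
Disallow: /search
Disallow: /feed
# Disallow: */resources
Sitemap: https://benjamins.com/sitemap.xml
"""

def parse_robots(text: str):
    """Split robots.txt text into (groups, other_records, comments)."""
    groups = {}      # user-agent -> list of (rule, path)
    other = []       # (field, value) records outside groups, e.g. sitemap
    comments = []    # comment text without the leading '#'
    agents = []      # user-agents of the group currently being read
    last_was_agent = False
    for raw in text.splitlines():
        # Simplification: everything after the first '#' is a comment.
        line, _, comment = raw.partition("#")
        if comment.strip():
            comments.append(comment.strip())
        line = line.strip()
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if not last_was_agent:
                agents = []  # a new group starts here
            agents.append(value)
            groups.setdefault(value, [])
            last_was_agent = True
        elif field in ("allow", "disallow"):
            for agent in agents:
                groups[agent].append((field, value))
            last_was_agent = False
        else:
            other.append((field, value))
    return groups, other, comments

groups, other, comments = parse_robots(SAMPLE)
```

In this sketch the commented-out `Disallow: */resources` line ends up under comments rather than rules, which matches how the report lists it.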