new.publishersweekly.com
robots.txt

Robots Exclusion Standard data for new.publishersweekly.com

Resource Scan

Scan Details

Site Domain new.publishersweekly.com
Base Domain publishersweekly.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Request timed out.
Last Scan 2024-03-19T04:46:02+00:00
Next Scan 2024-06-17T04:46:02+00:00

Last Successful Scan

Scanned 2022-07-24T10:40:29+00:00
URL http://new.publishersweekly.com/robots.txt
Redirect https://www.publishersweekly.com:443/robots.txt
Redirect Domain www.publishersweekly.com
Redirect Base publishersweekly.com
Response IP 100.24.111.39
Found Yes
Hash 7a24e8faccc08f2bb78c21bc955764c797f9ed7e284c8a24e9588e3754170332
SimHash 2877c824e71b

Groups

ahrefsbot

Rule Path
Disallow /

zibber-v0.1(www.zibb.com/crawler/)

Rule Path
Disallow /

mlbot*

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

*

Rule Path
Disallow /paper-copy/
Disallow /pw/papercopy/
Disallow /pw/papercopy_bestseller/
Disallow /pw/by-topic/1-legacy/
Disallow /cgi-bin/
Disallow /pw/mobile/
Disallow /iowa-edit/
Disallow /binary-data/EGALLEY/
Disallow /binary-data/DIY/
Disallow /pw/bookit/
Disallow /pw/search/
Disallow /pw/emailtemplates/
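
The groups listed above can be checked programmatically. The following is a minimal sketch, assuming the rules are rebuilt into robots.txt syntax from this report rather than fetched from the site. The reconstruction is approximate: the report lowercases agent names and does not preserve comments or ordering details from the original file. The helper names (build_robots_txt, build_parser, is_allowed) and the agent name "examplebot" are hypothetical and chosen for illustration only. Parsing uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rules rebuilt from the Groups listing in this report,
# not copied from the live robots.txt file.
FULLY_BLOCKED_AGENTS = [
    "ahrefsbot",
    "zibber-v0.1(www.zibb.com/crawler/)",
    "mlbot*",
    "yandexbot",
    "dotbot",
    "scrapy",
    "gigabot",
]

WILDCARD_DISALLOWS = [
    "/paper-copy/",
    "/pw/papercopy/",
    "/pw/papercopy_bestseller/",
    "/pw/by-topic/1-legacy/",
    "/cgi-bin/",
    "/pw/mobile/",
    "/iowa-edit/",
    "/binary-data/EGALLEY/",
    "/binary-data/DIY/",
    "/pw/bookit/",
    "/pw/search/",
    "/pw/emailtemplates/",
]

# Host taken from the Redirect field of the last successful scan.
BASE_URL = "https://www.publishersweekly.com"


def build_robots_txt() -> str:
    """Render the reported groups back into robots.txt syntax."""
    lines = []
    for agent in FULLY_BLOCKED_AGENTS:
        # Each named agent gets its own group with a site-wide Disallow.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in WILDCARD_DISALLOWS]
    return "\n".join(lines) + "\n"


def build_parser() -> RobotFileParser:
    """Parse the reconstructed rules without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    parser = build_parser()
    for agent, path in [
        ("ahrefsbot", "/pw/home/index.html"),
        ("examplebot", "/pw/search/index.html"),
        ("examplebot", "/pw/by-topic/index.html"),
    ]:
        print(agent, path, is_allowed(parser, agent, path))
```

Note that urllib.robotparser matches paths by simple prefix and matches agent names by substring, so it treats the trailing asterisk in "mlbot*" as a literal character. Other parsers may interpret that group differently.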