archives.post-gazette.com
robots.txt

Robots Exclusion Standard data for archives.post-gazette.com

Resource Scan

Scan Details

Site Domain archives.post-gazette.com
Base Domain post-gazette.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-08-07T17:59:53+00:00
Next Scan 2024-11-05T17:59:53+00:00

Last Successful Scan

Scanned 2023-10-06T14:23:23+00:00
URL https://archives.post-gazette.com/robots.txt
Domain IPs 104.18.43.168, 172.64.144.88, 2606:4700:4400::6812:2ba8, 2606:4700:4400::ac40:9058
Response IP 172.64.144.88
Found Yes
Hash caacf2ff90b06ac6dd79a281580ff795b60f8407e524f427821b96717c372d2c
SimHash 500eaf41d17f

Groups

*

Rule Path
Disallow /busy.html
Disallow /error.html
Disallow /error.php
Disallow /download/
Disallow /clippings/download/
Allow /newspage/

ahrefsbot

Rule Path
Disallow /busy.html
Disallow /error.html
Disallow /error.php

googlebot-image

Rule Path
Allow /*

applebot

Rule Path
Allow /*

facebot

Rule Path
Allow /*
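
The groups above can be checked programmatically. The following is a minimal sketch, assuming the rules are reassembled into robots.txt syntax as shown; the exact layout and directive order of the original file are not recorded in this report, and the names ROBOTS_TXT, BASE, build_parser, and is_allowed are hypothetical helpers introduced only for illustration. It uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups listing in this report. The original
# file's exact layout and directive order are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /busy.html
Disallow: /error.html
Disallow: /error.php
Disallow: /download/
Disallow: /clippings/download/
Allow: /newspage/

User-agent: ahrefsbot
Disallow: /busy.html
Disallow: /error.html
Disallow: /error.php

User-agent: googlebot-image
Allow: /*

User-agent: applebot
Allow: /*

User-agent: facebot
Allow: /*
"""

BASE = "https://archives.post-gazette.com"


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text held in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    parser = build_parser()
    checks = [
        ("ExampleCrawler", "/newspage/123"),       # falls under the * group
        ("ExampleCrawler", "/download/file.pdf"),  # falls under the * group
        ("AhrefsBot", "/download/file.pdf"),       # uses its own group
        ("AhrefsBot", "/busy.html"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
```

A crawler with its own group (here ahrefsbot) is evaluated against that group only, so it is not bound by the /download/ rules in the * group. To my understanding, urllib.robotparser matches rules by path prefix rather than by wildcard expansion, so the "Allow /*" groups rely on its default of permitting unmatched paths; a crawler's own parser may interpret them differently.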

Comments

  • Slow Bots see https://ahrefs.com/robot for more info
  • Updated 2/25/2019