thedcregister.com
robots.txt

Robots Exclusion Standard data for thedcregister.com

Resource Scan

Scan Details

Site Domain thedcregister.com
Base Domain thedcregister.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-06-12T13:21:17+00:00
Next Scan 2024-08-11T13:21:17+00:00

Last Successful Scan

Scanned 2024-04-14T12:48:50+00:00
URL https://thedcregister.com/robots.txt
Redirect https://www.registerpublications.com/robots.txt
Redirect Domain www.registerpublications.com
Redirect Base registerpublications.com
Domain IPs 104.196.37.2
Redirect IPs 74.84.144.198
Response IP 74.84.144.198
Found Yes
Hash 889d28ab84c8fb29de3ab7a15f5b64833de6f1d7e51c16416353d10615b4b744
SimHash 0818c9500135

Groups

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

*

Rule Path
Allow /ads.txt
Disallow /ads
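
The groups above can be checked with Python's standard-library urllib.robotparser. This is a minimal sketch, not the site's actual file: ROBOTS_TXT is an assumed reconstruction from the listed rules (the original wording, group order, comments, and the invalid line are not reproduced), and "examplebot", build_parser, and is_allowed are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the rules listed under "Groups".
# The real file's exact text and ordering are unknown.
ROBOTS_TXT = """\
User-agent: google-extended
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: chatgpt-user
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: *
Allow: /ads.txt
Disallow: /ads
"""


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the assumed rules."""
    return build_parser().can_fetch(agent, path)


if __name__ == "__main__":
    # "examplebot" is a hypothetical crawler that falls through to the "*" group.
    for agent, path in [
        ("ccbot", "/"),
        ("examplebot", "/ads.txt"),
        ("examplebot", "/ads/banner.js"),
        ("examplebot", "/news"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Note that urllib.robotparser applies the first matching rule within a group, so listing Allow /ads.txt before Disallow /ads is what keeps /ads.txt fetchable in this sketch. Other parsers may resolve the overlap differently, for example by preferring the longest matching path.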

Comments

  • User-agent: Googlebot
  • Disallow: /

Warnings

  • 1 invalid line.