thecharlottepost.com
robots.txt

Robots Exclusion Standard data for thecharlottepost.com

Resource Scan

Scan Details

Site Domain thecharlottepost.com
Base Domain thecharlottepost.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-08-05T10:25:39+00:00
Next Scan 2024-11-03T10:25:39+00:00

Last Successful Scan

Scanned 2023-07-13T10:23:36+00:00
URL https://thecharlottepost.com/robots.txt
Redirect http://www.thecharlottepost.com/robots.txt
Redirect Domain www.thecharlottepost.com
Redirect Base thecharlottepost.com
Domain IPs 104.21.41.172, 172.67.165.232, 2606:4700:3033::ac43:a5e8, 2606:4700:3036::6815:29ac
Redirect IPs 104.21.41.172, 172.67.165.232, 2606:4700:3033::ac43:a5e8, 2606:4700:3036::6815:29ac
Response IP 104.21.41.172
Found Yes
Hash d2e90f89586319da5d63eefd9b78652da55163a30191a6929fdf4d3803372169
SimHash a5e89ac0e534

Groups

*

Rule Path
Disallow /*print%3Dpdf*
Disallow /admin
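
The two Disallow rules above can be read as path patterns in which `*` matches any run of characters and a rule without a trailing `$` matches as a prefix. The following is a minimal sketch of that matching, not the scanner's or any crawler's actual implementation. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch does no percent-encoding normalization and ignores Allow rules, since none are listed for this group.

```python
import re

# Disallow paths listed for the "*" group in this record.
DISALLOW = ["/*print%3Dpdf*", "/admin"]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    '*' matches any sequence of characters; a trailing '$' anchors the end.
    Without '$', the pattern matches as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces, then join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)
```

Under these assumptions, `is_disallowed("/admin/login")` and `is_disallowed("/news?print%3Dpdf")` both return True, while `is_disallowed("/news/2023/story")` returns False. Because `/admin` is a prefix rule, a path such as `/administrator` is also matched.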

Other Records

Field Value
crawl-delay 5
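
The crawl-delay field is a non-standard extension that some crawlers honor and others ignore; where it is honored, the value is usually read as a number of seconds to wait between requests. A minimal sketch of that reading follows, assuming the value 5 means five seconds. The function name `wait_time` is hypothetical.

```python
# Assumption: the crawl-delay value is interpreted as seconds between requests.
CRAWL_DELAY_SECONDS = 5.0

def wait_time(last_fetch: float, now: float,
              delay: float = CRAWL_DELAY_SECONDS) -> float:
    """Seconds still to wait before the next request may be sent.

    last_fetch and now are timestamps in seconds (for example from
    time.monotonic()). The result is never negative.
    """
    elapsed = now - last_fetch
    return max(0.0, delay - elapsed)
```

A polite crawler would pass the result to `time.sleep` before each request to the same host.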

Comments

  • ROBOTS.TXT
  • www.thecharlottepost.com
  • Google
  • User-agent: Googlebot
  • Disallow:
  • Yahoo
  • User-agent: Slurp
  • Disallow:
  • Alta-Vista
  • User-agent: Scooter
  • Disallow:
  • Excite
  • User-agent: ArchitextSpider
  • Disallow:
  • InfoSeek
  • User-agent: UltraSeek
  • Disallow:
  • Lycos
  • User-agent: Lycos_Spider_(T-Rex)
  • Disallow:
  • LookSmart
  • User-agent: MantraAgent
  • Disallow:
  • Alltheweb
  • User-agent: FAST-WebCrawler
  • Disallow:
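
The entries above are comment text from the file, so records such as `User-agent: Googlebot` listed here were commented out and are not active rules. A minimal sketch of how such comments could be pulled from a robots.txt body follows. The function name `extract_comments` is hypothetical, and `SAMPLE` is an assumed layout built for illustration, not the site's actual file.

```python
def extract_comments(robots_txt: str) -> list[str]:
    """Collect the text after '#' on each line, skipping empty comments."""
    comments = []
    for line in robots_txt.splitlines():
        _, marker, comment = line.partition("#")
        comment = comment.strip()
        if marker and comment:
            comments.append(comment)
    return comments

# Hypothetical sample; the real file's layout is not shown in this record.
SAMPLE = """\
# Google
# User-agent: Googlebot
# Disallow:
User-agent: *
Disallow: /admin
Crawl-delay: 5
"""
```

With this sample, `extract_comments(SAMPLE)` yields the three commented lines and skips the active `User-agent: *` group.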