thenewbaguette.com
robots.txt

Robots Exclusion Standard data for thenewbaguette.com

Resource Scan

Scan Details

Site Domain thenewbaguette.com
Base Domain thenewbaguette.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-11-27T23:10:53+00:00
Next Scan 2026-01-26T23:10:53+00:00

Last Successful Scan

Scanned 2025-09-29T19:34:59+00:00
URL https://thenewbaguette.com/robots.txt
Domain IPs 104.21.55.212, 172.67.173.39
Response IP 104.21.55.212
Found Yes
Hash 5d5473cce3badef00a9e66251083bcc712b9a1e9ea7ae50ace66ba39e3001eff
SimHash 59049a40cb77

Groups

gptbot
chatgpt-user
oai-searchbot
claudebot
claude-user
perplexitybot
perplexity-user
google-extended
googleother

Rule Path
Allow /
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

*

Rule Path
Disallow /ebooks/
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
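
To illustrate how the two groups above might be evaluated, here is a minimal Python sketch. It assumes the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The names `AI_AGENTS`, `AI_RULES`, `DEFAULT_RULES`, `rules_for`, and `is_allowed` are hypothetical and chosen for illustration; the rule data is transcribed from the tables above, not from the original file. It matches user-agent tokens exactly and paths by plain prefix, which is enough here because the listed rules contain no wildcards.

```python
# Rule data transcribed from the scan tables; the helper names are
# hypothetical and not part of any real library.
AI_AGENTS = {
    "gptbot", "chatgpt-user", "oai-searchbot",
    "claudebot", "claude-user",
    "perplexitybot", "perplexity-user",
    "google-extended", "googleother",
}

# Group for the named AI crawlers.
AI_RULES = [
    ("allow", "/"),
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]

# Group for "*" (every other crawler).
DEFAULT_RULES = [
    ("disallow", "/ebooks/"),
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def rules_for(user_agent: str):
    """Pick the named group if the token matches, else the '*' group."""
    return AI_RULES if user_agent.lower() in AI_AGENTS else DEFAULT_RULES


def is_allowed(user_agent: str, path: str) -> bool:
    """Assumed longest-match evaluation: the longest matching prefix wins,
    Allow wins ties, and a path with no matching rule is allowed."""
    best_len = -1
    allowed = True
    for kind, prefix in rules_for(user_agent):
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [
        ("GPTBot", "/ebooks/guide.pdf"),
        ("SomeOtherBot", "/ebooks/guide.pdf"),
        ("ClaudeBot", "/wp-admin/options.php"),
        ("ClaudeBot", "/wp-admin/admin-ajax.php"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, the only difference between the two groups is `/ebooks/`: the named AI crawlers may fetch it, while every other crawler may not. Python's standard `urllib.robotparser` is deliberately not used in this sketch, since it applies the first matching rule in file order, so the leading `Allow /` would shadow the `/wp-admin/` rule.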

Other Records

Field Value
sitemap https://thenewbaguette.com/sitemap_index.xml

Comments

  • Explicitly allow reputable AI crawlers
  • Default policy for everyone else