havahavai.com
robots.txt

Robots Exclusion Standard data for havahavai.com

Resource Scan

Scan Details

Site Domain havahavai.com
Base Domain havahavai.com
Scan Status Ok
Last Scan 2025-08-22T16:03:09+00:00
Next Scan 2025-09-05T16:03:09+00:00
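
The gap between the two timestamps above can be checked directly. This is a minimal Python sketch. The variable names are illustrative, and the interval is inferred from these two values only, not from any documented scanner setting.

```python
from datetime import datetime

# Timestamps copied from the Scan Details block above.
last_scan = datetime.fromisoformat("2025-08-22T16:03:09+00:00")
next_scan = datetime.fromisoformat("2025-09-05T16:03:09+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)  # 14
```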

Last Scan

Scanned 2025-08-22T16:03:09+00:00
URL https://havahavai.com/robots.txt
Domain IPs 104.21.52.153, 172.67.201.32, 2606:4700:3030::6815:3499, 2606:4700:3036::ac43:c920
Response IP 104.21.52.153
Found Yes
Hash 3828d5ebfba937a20a05b75492db67abf21396c8a38f8deff8399baf45b12a73
SimHash 60207973e600

Groups

*

Rule Path
Allow /
Disallow /api/
Disallow /_next/
Disallow /admin/
Disallow /.well-known/
Disallow /private/
Allow /blog/
Allow /images/
Allow /favicon.ico
Allow /manifest.json
Allow /ai.txt
Allow /humans.txt

Other Records

Field Value
crawl-delay 1
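
The `*` group mixes a broad `Allow /` with narrower `Disallow` paths, so the outcome for a given URL depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching path wins, and `Allow` wins a tie). The `RULES` list and the `is_allowed` helper are hypothetical names for illustration, and plain prefix matching is enough here only because none of the listed paths use wildcards.

```python
# Rules for the "*" group, copied from the table above.
RULES = [
    ("allow", "/"),
    ("disallow", "/api/"),
    ("disallow", "/_next/"),
    ("disallow", "/admin/"),
    ("disallow", "/.well-known/"),
    ("disallow", "/private/"),
    ("allow", "/blog/"),
    ("allow", "/images/"),
    ("allow", "/favicon.ico"),
    ("allow", "/manifest.json"),
    ("allow", "/ai.txt"),
    ("allow", "/humans.txt"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed by default
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        n = len(prefix)
        # A longer match replaces a shorter one; on equal length, allow wins.
        if n > best_len or (n == best_len and kind == "allow"):
            best_len = n
            allowed = kind == "allow"
    return allowed


for p in ["/", "/blog/post", "/api/users", "/_next/static/app.js"]:
    print(p, is_allowed(p))
```

Under this assumption, `/api/users` is blocked because `/api/` is a longer match than `/`, while `/blog/post` stays crawlable. A parser that instead applies the first matching rule in listed order would let `Allow /` shadow the later `Disallow` lines, assuming the file lists the rules in the order shown here.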

gptbot

Rule Path
Allow /
Allow /blog/
Allow /ai.txt

Other Records

Field Value
crawl-delay 2

chatgpt-user

Rule Path
Allow /
Allow /blog/
Allow /ai.txt

ccbot

Rule Path
Allow /
Allow /blog/
Allow /ai.txt

anthropic-ai

Rule Path
Allow /
Allow /blog/
Allow /ai.txt

claude-web

Rule Path
Allow /
Allow /blog/
Allow /ai.txt

Other Records

Field Value
sitemap https://havahavai.com/sitemap.xml
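
The `crawl-delay` and `sitemap` records can be read back with Python's standard `urllib.robotparser`. The sketch below parses a partial, hand-reconstructed fragment rather than the live file. The `ROBOTS_TXT` string is an assumption built from the groups listed above (only part of the `*` group is included), and `SomeOtherBot` is a made-up agent name used to show the fallback to the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical partial reconstruction from the scan data above,
# not a verbatim copy of the file served by the site.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /api/
Crawl-delay: 1

User-agent: gptbot
Allow: /
Allow: /blog/
Allow: /ai.txt
Crawl-delay: 2

Sitemap: https://havahavai.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Agent-specific delay for gptbot; other agents fall back to the "*" group.
gptbot_delay = parser.crawl_delay("gptbot")
default_delay = parser.crawl_delay("SomeOtherBot")

# site_maps() requires Python 3.8 or later.
sitemaps = parser.site_maps()

print(gptbot_delay, default_delay, sitemaps)
```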

Comments

  • Sitemap
  • Disallow crawling of API routes and internal files
  • Allow crawling of all public content
  • Crawl-delay for respectful crawling
  • Specific directives for AI crawlers