intervuai.org
robots.txt

Robots Exclusion Standard data for intervuai.org

Resource Scan

Scan Details

Site Domain intervuai.org
Base Domain intervuai.org
Scan Status Ok
Last Scan 2025-11-13T18:55:31+00:00
Next Scan 2025-11-20T18:55:31+00:00

Last Scan

Scanned 2025-11-13T18:55:31+00:00
URL https://intervuai.org/robots.txt
Domain IPs 35.219.200.1
Response IP 35.219.200.1
Found Yes
Hash 6228e9e3b3dab850e6b87d902999bd50ad5c0672465474d748e4d76235a03581
SimHash 44740a51ac97

Groups

*

Rule Path
Allow /

Other Records

Field Value
sitemap /sitemap.xml

Comments

  • Allow all user agents to crawl all content by default
  • Disallow crawling of specific paths if needed, for example:
  • Disallow: /admin/
  • Disallow: /tmp/
  • Sitemap location
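
The records above can be checked with Python's standard-library robots.txt parser. The sketch below is only an approximation: the `ROBOTS_TXT` string is a hypothetical reconstruction from the Groups, Other Records, and Comments fields (the scan does not show the raw file body), and `ExampleBot` is a made-up user agent. It illustrates that the `Disallow` lines in the comments are inactive examples, so every path remains crawlable, and that the relative `sitemap` value has to be resolved against the scanned URL before it can be fetched.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the scan fields, not the verbatim file.
ROBOTS_TXT = """\
# Allow all user agents to crawl all content by default
User-agent: *
Allow: /

# Disallow crawling of specific paths if needed, for example:
# Disallow: /admin/
# Disallow: /tmp/

# Sitemap location
Sitemap: /sitemap.xml
"""

# URL taken from the "Last Scan" details above.
ROBOTS_URL = "https://intervuai.org/robots.txt"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The commented-out Disallow lines are not rules, so /admin/ is still allowed.
admin_allowed = parser.can_fetch("ExampleBot", "https://intervuai.org/admin/")

# site_maps() (Python 3.8+) returns the raw Sitemap values as written.
sitemaps = parser.site_maps()

# Resolve the relative value against the robots.txt URL.
sitemap_url = urljoin(ROBOTS_URL, sitemaps[0])

print(admin_allowed, sitemaps, sitemap_url)
```

If the site later enables either `Disallow` line, `can_fetch` would start returning `False` for the matching paths, so the same check can serve as a quick regression test after a rescan.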