env-int.com
robots.txt

Robots Exclusion Standard data for env-int.com

Resource Scan

Scan Details

Site Domain env-int.com
Base Domain env-int.com
Scan Status Ok
Last Scan 2026-02-03T21:45:37+00:00
Next Scan 2026-02-17T21:45:37+00:00

Last Scan

Scanned 2026-02-03T21:45:37+00:00
URL https://env-int.com/robots.txt
Domain IPs 104.26.0.207, 104.26.1.207, 172.67.74.30, 2606:4700:20::681a:1cf, 2606:4700:20::681a:cf, 2606:4700:20::ac43:4a1e
Response IP 104.26.1.207
Found Yes
Hash 5704d2224896fd48982df3e2fda468a42e42a514da3e02e01dd0f4fc51282d94
SimHash 471dcb506bc6

Groups

*

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

gptbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

claude-web

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

ccbot

Rule Path
Allow /

bingbot

Rule Path
Allow /

googlebot

Rule Path
Allow /

applebot

Rule Path
Allow /

facebookbot

Rule Path
Allow /

youbot

Rule Path
Allow /

neevabot

Rule Path
Allow /

yandexbot

Rule Path
Allow /

duckduckbot

Rule Path
Allow /
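
A crawler is generally expected to obey the group whose user-agent value matches its own product token, and to fall back to the * group when no named group matches. The following is a minimal Python sketch of that selection step, assuming case-insensitive exact matching on the token. The function name select_group and the sample token examplebot are hypothetical and do not come from the scan data.

```python
# Named groups as listed in this report, in order of appearance.
NAMED_GROUPS = [
    "gptbot", "chatgpt-user", "claude-web", "perplexitybot", "ccbot",
    "bingbot", "googlebot", "applebot", "facebookbot", "youbot",
    "neevabot", "yandexbot", "duckduckbot",
]


def select_group(product_token: str) -> str:
    """Return the group name a crawler with this token would obey.

    Assumption: tokens are compared case-insensitively and must match a
    group name exactly; anything else falls back to the "*" group.
    """
    token = product_token.strip().lower()
    return token if token in NAMED_GROUPS else "*"


if __name__ == "__main__":
    for name in ("GPTBot", "DuckDuckBot", "examplebot"):
        print(name, "->", select_group(name))
```

Under this assumption, each crawler named above would read only its own Allow / group, while an unlisted crawler would be governed by the * groups.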

*

Rule Path
Disallow /api/
Disallow /_next/
Disallow /server/
Disallow /test/
Disallow /sentry-*
Disallow /*.json$
Disallow /*?*utm_*
Disallow /*?*ref=*
Disallow /*?*session=*
Disallow /*?*token=*

Other Records

Field Value
sitemap https://env-int.com/sitemap.xml
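
The Disallow paths above use the * wildcard and the $ end anchor. The following Python sketch shows one way such patterns could be evaluated. It makes two assumptions: the two * groups in this report are merged into one rule set, and the longest matching pattern wins, with ties going to Allow. The helper names pattern_to_regex and is_allowed are illustrative, and the sample paths are made up for the example.

```python
import re

# Rules copied from the two "*" groups in this report.
ALLOW = ["/"]
DISALLOW = [
    "/api/", "/_next/", "/server/", "/test/", "/sentry-*",
    "/*.json$", "/*?*utm_*", "/*?*ref=*", "/*?*session=*", "/*?*token=*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Everything else is matched literally from the start of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str) -> bool:
    """Longest matching pattern decides; a tie goes to Allow."""
    best_len, allowed = -1, True  # no matching rule means allowed
    for rules, verdict in ((ALLOW, True), (DISALLOW, False)):
        for pattern in rules:
            if pattern_to_regex(pattern).match(path):
                longer = len(pattern) > best_len
                tie_for_allow = len(pattern) == best_len and verdict
                if longer or tie_for_allow:
                    best_len, allowed = len(pattern), verdict
    return allowed


if __name__ == "__main__":
    for sample in ("/about", "/api/users", "/data/config.json",
                   "/page?utm_source=x", "/sentry-tunnel"):
        print(sample, "->", "allowed" if is_allowed(sample) else "disallowed")
```

In this sketch, Allow / matches every path but has length 1, so any matching Disallow pattern from the list takes precedence over it.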

Comments

  • Environmental Intellect - Robots.txt
  • Last updated: 2024
  • We welcome AI and search engine crawlers to index our content
  • Allow all web crawlers by default
  • Explicitly welcome AI/LLM crawlers for better AI search visibility
  • OpenAI GPTBot (ChatGPT)
  • ChatGPT Browser
  • Anthropic Claude
  • Perplexity AI
  • Common Crawl (used by many AI companies)
  • Microsoft Bing/Copilot
  • Google (Bard/Gemini)
  • Apple AI
  • Meta AI
  • You.com
  • Neeva AI
  • Yandex
  • DuckDuckGo
  • Disallow access to certain paths for all bots
  • Sitemap location