ansiblebyexample.com
robots.txt

Robots Exclusion Standard data for ansiblebyexample.com

Resource Scan

Scan Details

Site Domain ansiblebyexample.com
Base Domain ansiblebyexample.com
Scan Status Ok
Last Scan 2025-12-01T15:37:21+00:00
Next Scan 2025-12-08T15:37:21+00:00

Last Scan

Scanned 2025-12-01T15:37:21+00:00
URL https://ansiblebyexample.com/robots.txt
Redirect https://www.ansiblebyexample.com/robots.txt
Redirect Domain www.ansiblebyexample.com
Redirect Base ansiblebyexample.com
Domain IPs 104.21.77.170, 172.67.210.130, 2606:4700:3031::6815:4daa, 2606:4700:3037::ac43:d282
Redirect IPs 104.21.77.170, 172.67.210.130, 2606:4700:3031::6815:4daa, 2606:4700:3037::ac43:d282
Response IP 104.21.77.170
Found Yes
Hash 43bfd7121aa006d0450ba97613731b1d7f6ab2ef66a096bee90fdad699be6758
SimHash 7b16d055cdcd
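
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a fetched robots.txt body could be fingerprinted and compared with a stored digest to detect changes between scans. The function names body_digest and has_changed are hypothetical, and hashing the raw response bytes is also an assumption.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Report whether the body differs from the one that produced previous_digest."""
    return body_digest(body) != previous_digest.lower()


if __name__ == "__main__":
    # Illustrative body only, not the real file content.
    sample = b"User-agent: *\nAllow: /\n"
    stored = body_digest(sample)
    print(has_changed(stored, sample))          # same bytes, so no change
    print(has_changed(stored, sample + b"\n"))  # one extra byte changes the digest
```

An exact digest only says whether the bytes differ at all. The SimHash field presumably serves the complementary purpose of estimating how similar two versions are, which this sketch does not attempt.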

Groups

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

twitterbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

*

Rule Path
Allow /
Disallow /api/
Disallow /_next/
Disallow /admin/
Disallow /tutorials/

gptbot

Rule Path
Allow /

openai

Rule Path
Allow /

anthropic

Rule Path
Allow /

llamaindex

Rule Path
Allow /

ai

Rule Path
Allow /

ai

Rule Path
Allow /
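
The groups above can be checked against a path with a small evaluator. The sketch below assumes the longest-match semantics described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie) and simplifies user-agent selection to an exact, case-insensitive token match with a fallback to the * group. The names GROUPS, rules_for, and is_allowed are hypothetical, and wildcard characters in paths are not handled because none appear in this report.

```python
# Rules copied from the Groups listing in this report.
ALLOW_ALL = [("allow", "/")]

GROUPS = {
    "googlebot": ALLOW_ALL,
    "bingbot": ALLOW_ALL,
    "twitterbot": ALLOW_ALL,
    "facebookexternalhit": ALLOW_ALL,
    "gptbot": ALLOW_ALL,
    "openai": ALLOW_ALL,
    "anthropic": ALLOW_ALL,
    "llamaindex": ALLOW_ALL,
    "ai": ALLOW_ALL,
    "*": [
        ("allow", "/"),
        ("disallow", "/api/"),
        ("disallow", "/_next/"),
        ("disallow", "/admin/"),
        ("disallow", "/tutorials/"),
    ],
}


def rules_for(agent: str):
    """Pick the group for a crawler token, falling back to the * group."""
    return GROUPS.get(agent.lower(), GROUPS["*"])


def is_allowed(agent: str, path: str) -> bool:
    """Apply longest-prefix matching; Allow wins when two matches tie in length."""
    best_len = -1
    allowed = True  # a path with no matching rule is allowed by default
    for kind, prefix in rules_for(agent):
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for agent in ("googlebot", "gptbot", "examplebot"):
        print(agent, is_allowed(agent, "/tutorials/intro"))
```

Under these assumptions, only crawlers that fall through to the * group are kept out of /api/, /_next/, /admin/, and /tutorials/; every named group carries a single Allow / rule. Parsers that apply rules in file order rather than by longest match may read the * group differently, since Allow / is listed first.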

Other Records

Field Value
sitemap https://www.ansiblebyexample.com/sitemap.xml
sitemap https://www.ansiblebyexample.com/video-sitemap.xml

Comments

  • LLM crawler guidance (detailed policy in llms.txt)
  • For LLM crawlers, see: https://www.ansiblebyexample.com/llms.txt
  • Premium content - restricted for training/reuse without permission
  • AI indexing: explicitly allow known AI crawlers to index the site
  • This permits model builders / large-scale crawlers to fetch content. Remove or restrict these entries if you want to block specific providers.
  • Known AI crawlers (explicitly allowed):
  • Generic 'AI' crawler tokens