ligeher.nu
robots.txt

Robots Exclusion Standard data for ligeher.nu

Resource Scan

Scan Details

Site Domain ligeher.nu
Base Domain ligeher.nu
Scan Status Ok
Last Scan 2024-09-21T06:51:20+00:00
Next Scan 2024-09-28T06:51:20+00:00

Last Scan

Scanned 2024-09-21T06:51:20+00:00
URL https://ligeher.nu/robots.txt
Domain IPs 52.233.184.181
Response IP 52.233.184.181
Found Yes
Hash 985fa24f5309db528a4b872a3b260fd90ce0d2b134ded98ca7f8894c5e542851
SimHash 30329a3087e7

Groups

*

Rule Path
Allow /

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

Other Records

Field Value
sitemap https://ligeher.nu/sitemap.xml
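To illustrate how the groups and the sitemap record above might be evaluated, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt text is reconstructed from the scan data, so its exact layout, letter casing, and comments are assumptions, not the file as served. The helper names build_parser and access_report are hypothetical, as is the choice of Googlebot to stand in for a crawler that matches only the * group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups and records listed above. The real file's
# layout, casing, and comments are not known, so treat this as an approximation.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: anthropic-ai
Disallow: /

Sitemap: https://ligeher.nu/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def access_report(parser: RobotFileParser, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user agent to whether the parser lets it fetch the URL."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    agents = ["Googlebot", "CCBot", "GPTBot", "ChatGPT-User", "anthropic-ai"]
    report = access_report(parser, agents, "https://ligeher.nu/")
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
    # site_maps() requires Python 3.8 or later.
    print(parser.site_maps())
```

Under these assumptions, only the agent that falls through to the * group is allowed, and the four named AI crawlers are blocked for every path. Real crawlers may match user-agent tokens differently from Python's parser, so this shows the intent of the rules rather than guaranteed crawler behavior.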

Comments

  • AI crawler reference
  • The link below provides instructions to what kind of content can be used to train AI models on this website
  • https://ligeher.nu/ai.txt
  • Search engines
  • Common crawl
  • OpenAI (ChatGPT)
  • OpenAI (ChatGPT realtime search)
  • Anthropic
  • Sitemap