ndla.no
robots.txt

Robots Exclusion Standard data for ndla.no

Resource Scan

Scan Details

Site Domain ndla.no
Base Domain ndla.no
Scan Status Ok
Last Scan 2026-01-01T18:41:41+00:00
Next Scan 2026-01-15T18:41:41+00:00

Last Scan

Scanned 2026-01-01T18:41:41+00:00
URL https://ndla.no/robots.txt
Domain IPs 216.137.52.125, 216.137.52.29, 216.137.52.6, 216.137.52.84
Response IP 13.226.2.125
Found Yes
Hash a7535697448912882d84f508848becf0e571f070a022032d990c3ea66efc40a7
SimHash 44098f40a888

Groups

*

Rule Path
Disallow /health/
Disallow /oembed/
Disallow /lti/
Disallow /search
Disallow /*/search*
Disallow */article-iframe/*
Disallow */embed-iframe/*
Disallow *login*
Disallow *logout*
Disallow *minndla/
Disallow /history/
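
Several rules in the group above use the `*` wildcard (for example `/*/search*` and `*login*`). As a minimal sketch of how such patterns are commonly interpreted, assuming the widely used convention that `*` matches any run of characters, that a trailing `$` anchors the end of the path, and that every rule is otherwise a prefix match: the helper names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical and are not part of the scan data. Allow rules and longest-match precedence are not modelled, since this group contains only Disallow rules.

```python
import re

# Disallow paths copied from the "*" group above.
RULES = [
    "/health/", "/oembed/", "/lti/", "/search", "/*/search*",
    "*/article-iframe/*", "*/embed-iframe/*", "*login*", "*logout*",
    "*minndla/", "/history/",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, rules: list[str]) -> bool:
    """Return True if any rule matches the path from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


# Hypothetical sample paths, chosen only to exercise the patterns.
for sample in ["/nb/search?query=x", "/nb/login", "/subject/math"]:
    print(sample, is_disallowed(sample, RULES))
```

Anchoring with `re.match` keeps the prefix semantics: a rule such as `/search` matches `/search` and anything that starts with it, while a leading `*` lets the rule match anywhere in the path.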

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

omigilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /
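
The per-crawler groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt text is reconstructed from a subset of the groups listed here, so the live file's ordering and comments may differ. Rules containing `*` are left out because the standard-library parser matches rule paths as literal prefixes. The agent name `examplebot` and the helper `build_parser` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed subset of the scanned rules (assumption: not the verbatim file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /health/
Disallow: /oembed/
Disallow: /lti/
Disallow: /search
Disallow: /history/

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler with its own group is blocked from the whole site.
print(parser.can_fetch("gptbot", "https://ndla.no/"))

# A crawler without its own group falls back to the "*" group.
print(parser.can_fetch("examplebot", "https://ndla.no/health/"))
print(parser.can_fetch("examplebot", "https://ndla.no/subject/math"))
```

Parsing from a string keeps the check reproducible against the scanned snapshot; to test the live file instead, `RobotFileParser.set_url` followed by `read` fetches it over the network.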

Comments

  • Updated ndla.no 03.04.2024 with additions
  • ndla-frontend paths
  • Min NDLA
  • status.ndla.no
  • AI/LLM CRAWLERS, fetched from snl.no
  • Common Crawl
  • OpenAI
  • Anthropic
  • Bytedance
  • Hive
  • Webz
  • Perplexity
  • Diffbot
  • Diffbot