noldus.com
robots.txt

Robots Exclusion Standard data for noldus.com

Resource Scan

Scan Details

Site Domain noldus.com
Base Domain noldus.com
Scan Status Ok
Last Scan 2025-11-23T06:17:13+00:00
Next Scan 2025-12-23T06:17:13+00:00

Last Scan

Scanned 2025-11-23T06:17:13+00:00
URL https://noldus.com/robots.txt
Domain IPs 172.66.40.86, 172.66.43.170, 2606:4700:3108::ac42:2856, 2606:4700:3108::ac42:2baa
Response IP 172.66.40.86
Found Yes
Hash 3d90d112ce650c0b926c61e63c17dc0bb4888a99a619ea8f24d9b08aeadc7c2a
SimHash 38549c60d476

Groups

*

Rule Path
Disallow /private/
Disallow /tmp/
Disallow /cache/

gptbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

claudebot

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

google-extended

Rule Path
Allow /

ccbot

Rule Path
Allow /

bytespider

Rule Path
Allow /

Other Records

Field Value
sitemap https://noldus.com/sitemap.xml
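
The groups and the sitemap record listed above can be checked with Python's standard-library `urllib.robotparser`. The sketch below rebuilds a robots.txt body from the reported rules. The exact ordering, casing, and layout of the real file are assumptions, and the non-standard custom field is left out because the report does not show its value. The agent name `ExampleBot` and the paths `/private/report` and `/products` are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report; the real file's layout may differ (assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
Disallow: /cache/

User-agent: gptbot
Allow: /

User-agent: chatgpt-user
Allow: /

User-agent: claudebot
Allow: /

User-agent: perplexitybot
Allow: /

User-agent: google-extended
Allow: /

User-agent: ccbot
Allow: /

User-agent: bytespider
Allow: /

Sitemap: https://noldus.com/sitemap.xml
"""


def build_parser(text):
    """Parse a robots.txt body held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A named AI crawler matches its own group, which allows everything.
print(parser.can_fetch("gptbot", "https://noldus.com/private/report"))      # True
# An unlisted crawler (hypothetical name) falls back to the "*" group.
print(parser.can_fetch("ExampleBot", "https://noldus.com/private/report"))  # False
print(parser.can_fetch("ExampleBot", "https://noldus.com/products"))        # True
# Sitemap records are collected separately from the groups (Python 3.8+).
print(parser.site_maps())
```

Because each named crawler has its own group, the `Disallow` rules in the `*` group do not apply to it. Under this reconstruction, the listed AI bots may fetch `/private/`, `/tmp/`, and `/cache/` while unlisted crawlers may not.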

Comments

  • General access for all bots
  • Sitemap for standard crawlers
  • Custom field (non-standard, but LLMs may check it)
  • --- Explicit allowances for known AI bots ---
  • OpenAI GPTBot (ChatGPT browsing)
  • OpenAI ChatGPT Plugins
  • Anthropic's Claude
  • Perplexity AI
  • Google AI crawlers (future-proofing)
  • Common LLM-related bots

Warnings

  • `llms` is not a known field.
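
A warning like the one above could be produced by comparing each field name in the file against a list of recognized fields. The sketch below is one possible way to do that, not the scanner's actual logic. The `KNOWN_FIELDS` set is an assumption, and the `llms` value in the sample is a placeholder, since the report does not show the real value.

```python
# Fields treated as recognized here; the scanner's own list is unknown (assumption).
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def find_unknown_fields(text, known=KNOWN_FIELDS):
    """Return field names not in `known`, in order of first appearance."""
    unknown = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        content = line.split("#", 1)[0].strip()
        if not content or ":" not in content:
            continue
        # Split on the first colon only, so URL values stay intact.
        field = content.split(":", 1)[0].strip().lower()
        if field not in known and field not in unknown:
            unknown.append(field)
    return unknown


# Hypothetical sample; the real value of the `llms` field is not shown in the report.
SAMPLE = """\
User-agent: *
Disallow: /tmp/
# Custom field (non-standard)
llms: placeholder-value
Sitemap: https://noldus.com/sitemap.xml
"""

for name in find_unknown_fields(SAMPLE):
    print(f"`{name}` is not a known field.")
```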