doc.aljazeera.net
robots.txt

Robots Exclusion Standard data for doc.aljazeera.net

Resource Scan

Scan Details

Site Domain doc.aljazeera.net
Base Domain aljazeera.net
Scan Status Ok
Last Scan 2025-06-13T06:44:20+00:00
Next Scan 2025-06-27T06:44:20+00:00

Last Scan

Scanned 2025-06-13T06:44:20+00:00
URL https://doc.aljazeera.net/robots.txt
Domain IPs 104.83.197.149, 2600:1413:5000:683::2392, 2600:1413:5000:68e::2392
Response IP 184.51.97.153
Found Yes
Hash 2ce6a02072968d5e0d255aa2f10134bdadbc4d194c6e18effb3c857757d83d9a
SimHash 70081d7def33
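
The Hash field above is 64 hexadecimal digits, which is consistent with a SHA-256 digest of the fetched robots.txt body, although the report does not name the algorithm. A minimal sketch, assuming SHA-256 over the raw response bytes, of how a later fetch could be compared against the recorded value (the helper names `sha256_hex` and `has_changed` are hypothetical, not part of the scanner):

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "2ce6a02072968d5e0d255aa2f10134bdadbc4d194c6e18effb3c857757d83d9a"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded_hash: str = RECORDED_HASH) -> bool:
    """True when the body does not hash to the recorded value.

    Assumption: the scanner hashes the unmodified response body with SHA-256.
    If it normalizes line endings or uses another algorithm, this comparison
    will report a change even when the file is the same.
    """
    return sha256_hex(body) != recorded_hash.lower()
```

Passing the bytes of a freshly downloaded https://doc.aljazeera.net/robots.txt to `has_changed` would then indicate whether the file differs from the scanned copy, under the assumption stated above.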

Groups

*

Rule Path
Disallow /api
Disallow /asset-manifest.json
Allow /search/$
Disallow /search/
Disallow /home/search?q=

anthropic-ai

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /
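
In the `*` group, `Allow /search/$` sits beside `Disallow /search/`. Under the longest-match precedence described in RFC 9309, the bare /search/ page stays crawlable while every path beneath it is blocked; the named AI crawlers are blocked everywhere by `Disallow /`. The sketch below illustrates that evaluation. It is a simplified model, not the scanner's or any crawler's implementation: the `GROUPS` table is transcribed by hand from the groups above (only some agents are included), `pattern_to_regex` and `is_allowed` are hypothetical helper names, and user-agent selection is reduced to an exact lowercase lookup.

```python
import re

# Rules transcribed from the groups listed above (subset of agents).
GROUPS = {
    "*": [
        ("disallow", "/api"),
        ("disallow", "/asset-manifest.json"),
        ("allow", "/search/$"),
        ("disallow", "/search/"),
        ("disallow", "/home/search?q="),
    ],
    "gptbot": [("disallow", "/")],
    "claudebot": [("disallow", "/")],
    "bytespider": [("disallow", "/")],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of
    the path; everything else is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(user_agent, path):
    """Apply longest-match precedence; on a tie, Allow wins."""
    # Simplification: exact lowercase lookup, falling back to the '*' group.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best_len, verdict = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, verdict = length, kind == "allow"
    return verdict
```

With these rules, `is_allowed("Googlebot", "/search/")` is true because the nine-character Allow pattern outranks the eight-character Disallow, while `is_allowed("Googlebot", "/search/news")` and `is_allowed("GPTBot", "/")` are both false.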

Other Records

Field Value
sitemap https://doc.aljazeera.net/sitemap.xml
sitemap https://doc.aljazeera.net/news-sitemap.xml
sitemap https://doc.aljazeera.net/sitemaps/article-archive.xml
sitemap https://doc.aljazeera.net/sitemaps/article-new.xml
sitemap https://doc.aljazeera.net/sitemaps/video-archive.xml
sitemap https://doc.aljazeera.net/sitemaps/video-new.xml

Comments

  • Al Jazeera Media Network content is made available for your personal, non-commercial use subject to our Terms and Conditions: https://www.aljazeera.com/terms-and-conditions/
  • Any other uses are not permitted, including but not limited to:
  • (1) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
  • (2) text and data mining activities;
  • (3) creating or providing archived or cached data sets containing our content to others; and/or
  • (4) any commercial purposes.
  • Use of any device, tool, or process designed to data mine or scrape the content using automated means is prohibited without prior written permission from Al Jazeera Media Network. Contact https://network.aljazeera.net/en/contact for assistance.