aljazeeramubasher.net
robots.txt

Robots Exclusion Standard data for aljazeeramubasher.net

Resource Scan

Scan Details

Site Domain aljazeeramubasher.net
Base Domain aljazeeramubasher.net
Scan Status Ok
Last Scan 2024-09-21T21:35:22+00:00
Next Scan 2024-09-28T21:35:22+00:00

Last Scan

Scanned 2024-09-21T21:35:22+00:00
URL https://aljazeeramubasher.net/robots.txt
Redirect https://www.aljazeeramubasher.net:443/robots.txt
Redirect Domain www.aljazeeramubasher.net
Redirect Base aljazeeramubasher.net
Domain IPs 3.130.248.147, 3.131.164.61, 3.18.131.139
Redirect IPs 2600:1413:b000:1e::17d1:2e49, 2600:1413:b000:1e::17d1:2e5f, 72.247.127.201, 72.247.127.250
Response IP 23.45.207.176
Found Yes
Hash 4b1e5d1a0141c6e2e201cdb6914ace6995f1837b9c57011ae244a6372a08e2fd
SimHash 7908157fe7b1

Groups

*

Rule Path
Disallow /api
Disallow /asset-manifest.json
Allow /search/$
Disallow /search/
Disallow /home/search?q=
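
The pairing of "Allow /search/$" with "Disallow /search/" only makes sense under longest-match precedence, where the most specific matching pattern wins and "$" anchors the end of the path. The following is a minimal sketch of that evaluation, assuming the precedence described in RFC 9309 (longest pattern wins, Allow wins a tie). The names RULES, pattern_to_regex, and is_allowed are hypothetical and not part of the scan report, and measuring specificity by raw pattern length (including the "$") is a simplification.

```python
import re

# Rules copied from the "*" group in this report, in listed order.
RULES = [
    ("disallow", "/api"),
    ("disallow", "/asset-manifest.json"),
    ("allow", "/search/$"),
    ("disallow", "/search/"),
    ("disallow", "/home/search?q="),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal text, then treat each "*" as "any run of characters".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path may be fetched under longest-match precedence."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Assumption: longer pattern wins; on a tie, allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, best_allow = length, kind == "allow"
    return best_allow
```

Under these assumptions the bare search landing page "/search/" stays crawlable while anything beneath it, such as "/search/foo", is blocked. Parsers that use first-match or plain prefix matching may evaluate the same rules differently.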

anthropic-ai

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /
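
The groups above can be read as a lookup: a crawler uses the group whose name matches its product token and falls back to "*" only when no named group matches. A minimal sketch of that selection follows, assuming case-insensitive exact matching on the product token as described in RFC 9309. GROUPS and select_group are hypothetical names, and the function expects the bare token rather than a full User-Agent header.

```python
# Groups as listed in this report; each rule is a (kind, path) pair.
BLOCK_ALL = [("disallow", "/")]

GROUPS = {
    "*": [
        ("disallow", "/api"),
        ("disallow", "/asset-manifest.json"),
        ("allow", "/search/$"),
        ("disallow", "/search/"),
        ("disallow", "/home/search?q="),
    ],
    "anthropic-ai": BLOCK_ALL,
    "chatgpt-user": BLOCK_ALL,
    "claudebot": BLOCK_ALL,
    "claude-web": BLOCK_ALL,
    "cohere-ai": BLOCK_ALL,
    "gptbot": BLOCK_ALL,
    "perplexitybot": BLOCK_ALL,
    "bytespider": BLOCK_ALL,
}


def select_group(product_token: str, groups=GROUPS):
    """Return the rule list for a crawler, falling back to the "*" group."""
    # Assumption: tokens compare case-insensitively and must match exactly.
    return groups.get(product_token.lower(), groups["*"])
```

For example, select_group("GPTBot") yields the single "Disallow /" rule, while a crawler not named in the file, such as one identifying as "Googlebot", receives the "*" rules. A named group replaces the "*" group for that crawler rather than adding to it.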

Other Records

Field Value
sitemap https://aljazeeramubasher.net/sitemap.xml
sitemap https://aljazeeramubasher.net/news-sitemap.xml
sitemap https://aljazeeramubasher.net/sitemaps/article-archive.xml
sitemap https://aljazeeramubasher.net/sitemaps/article-new.xml
sitemap https://aljazeeramubasher.net/sitemaps/video-archive.xml
sitemap https://aljazeeramubasher.net/sitemaps/video-new.xml

Comments

  • Al Jazeera Media Network content is made available for your personal, non-commercial use subject to our Terms and Conditions: https://www.aljazeera.com/terms-and-conditions/
  • Any other uses are not permitted, including but not limited to:
  • (1) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
  • (2) text and data mining activities;
  • (3) creating or providing archived or cached data sets containing our content to others; and/or
  • (4) any commercial purposes.
  • Use of any device, tool, or process designed to data mine or scrape the content using automated means is prohibited without prior written permission from Al Jazeera Media Network. Contact https://network.aljazeera.net/en/contact for assistance.
  • Disallow Rules
  • Sitemaps