aljazeera.com
robots.txt

Robots Exclusion Standard data for aljazeera.com

Resource Scan

Scan Details

Site Domain aljazeera.com
Base Domain aljazeera.com
Scan Status Ok
Last Scan 2024-04-23T21:47:08+00:00
Next Scan 2024-04-30T21:47:08+00:00

Last Scan

Scanned 2024-04-23T21:47:08+00:00
URL https://aljazeera.com/robots.txt
Redirect https://www.aljazeera.com:443/robots.txt
Redirect Domain www.aljazeera.com
Redirect Base aljazeera.com
Domain IPs 13.59.165.215, 18.222.11.211, 3.143.125.237
Redirect IPs 184.26.20.157, 2600:1413:b000:385::2392, 2600:1413:b000:39c::2392
Response IP 23.44.0.67
Found Yes
Hash e370377cdd0c3ec3d714b449db5650e60c983a0678e3c1ce4b1fcc931afefe75
SimHash 5104cf744c93

Groups

*

Rule Path
Disallow /api
Disallow /asset-manifest.json
Allow /search/$
Disallow /search/
Disallow /home/search?q=
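
The Allow /search/$ and Disallow /search/ pair is easiest to read with an example. The sketch below assumes the longest-match precedence described in RFC 9309, where the matching rule with the longest pattern wins and Allow wins a tie. It is not the matcher of any particular crawler. The names RULES, pattern_to_regex and is_allowed are hypothetical helpers introduced for illustration.

```python
import re

# Rules copied from the "*" group above, as (kind, path pattern) pairs.
RULES = [
    ("disallow", "/api"),
    ("disallow", "/asset-manifest.json"),
    ("allow", "/search/$"),
    ("disallow", "/search/"),
    ("disallow", "/home/search?q="),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces, then join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under the assumed precedence."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            # Longer pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for path in ["/search/", "/search/gaza", "/api/v1", "/news/", "/home/search?q=x"]:
        print(path, is_allowed(path))
```

Under these assumptions the bare /search/ page stays crawlable, while any path that continues past /search/ is blocked. Python's standard urllib.robotparser is not used here because it treats rule paths as plain prefixes and does not interpret the trailing $.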

Other Records

Field Value
sitemap https://www.aljazeera.com/sitemap.xml
sitemap https://www.aljazeera.com/news-sitemap.xml
sitemap https://www.aljazeera.com/sitemaps/article-archive.xml
sitemap https://www.aljazeera.com/sitemaps/article-new.xml
sitemap https://www.aljazeera.com/sitemaps/video-archive.xml
sitemap https://www.aljazeera.com/sitemaps/video-new.xml
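
The sitemap records can be read back with Python's standard urllib.robotparser, as a rough sketch. ROBOTS_TXT below is a reconstruction from the records listed in this report. The real file's ordering, comments and blank lines are not shown by the scan, so they are assumptions. site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the file from the records in this report;
# the actual layout of the live robots.txt is not known from the scan.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api
Disallow: /asset-manifest.json
Allow: /search/$
Disallow: /search/
Disallow: /home/search?q=

Sitemap: https://www.aljazeera.com/sitemap.xml
Sitemap: https://www.aljazeera.com/news-sitemap.xml
Sitemap: https://www.aljazeera.com/sitemaps/article-archive.xml
Sitemap: https://www.aljazeera.com/sitemaps/article-new.xml
Sitemap: https://www.aljazeera.com/sitemaps/video-archive.xml
Sitemap: https://www.aljazeera.com/sitemaps/video-new.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made here.
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the Sitemap values in file order, or None if there are none.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```

Sitemap records are independent of any user-agent group, which is why the report lists them under Other Records and not under the * group.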