aljazeera.it.com
robots.txt

Robots Exclusion Standard data for aljazeera.it.com

Resource Scan

Scan Details

Site Domain	aljazeera.it.com
Base Domain	it.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2025-03-25T09:26:46+00:00
Next Scan	2025-06-23T09:26:46+00:00

Last Successful Scan

Scanned	2024-11-03T09:22:54+00:00
URL	https://aljazeera.it.com/robots.txt
Domain IPs	104.21.51.208, 172.67.185.228, 2606:4700:3032::ac43:b9e4, 2606:4700:3035::6815:33d0
Response IP	172.67.185.228
Found	Yes
Hash	41ef0b5001ee828c9af11bcefc0700c38fbf2441701748d6567617ddcf166720
SimHash	79a8ea0080b3

Groups

*

Rule	Path
Disallow	/wp-admin/
Allow	/wp-admin/admin-ajax.php

ccbot

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/
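
The groups listed above can be evaluated with a small matcher. The following is a minimal sketch, not the scanner's or any crawler's actual implementation: it assumes longest-match precedence as described in RFC 9309 (the longest matching path wins, and Allow wins a tie), handles plain path prefixes only (no `*` or `$` wildcards), and matches the user-agent token by exact, case-insensitive comparison. The names `GROUPS` and `is_allowed` are hypothetical and chosen for illustration.

```python
# Groups as listed in the scan (user-agent token -> ordered rules).
# GROUPS and is_allowed are illustrative names, not part of any library.
GROUPS = {
    "*": [("disallow", "/wp-admin/"), ("allow", "/wp-admin/admin-ajax.php")],
    "ccbot": [("disallow", "/")],
    "gptbot": [("disallow", "/")],
    "chatgpt-user": [("disallow", "/")],
}


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if `path` may be fetched by `user_agent`.

    Assumption: longest-match precedence with plain prefix rules only.
    """
    # Fall back to the wildcard group when no specific group matches.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        # A longer prefix wins; on equal length, an allow rule wins.
        if len(prefix) > best_len or (len(prefix) == best_len and kind == "allow"):
            best_len, allowed = len(prefix), kind == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [
        ("GPTBot", "/"),
        ("Googlebot", "/wp-admin/"),
        ("Googlebot", "/wp-admin/admin-ajax.php"),
        ("Googlebot", "/news/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, ccbot, gptbot, and chatgpt-user are blocked from the whole site, while every other agent is blocked only from `/wp-admin/`, with `/wp-admin/admin-ajax.php` carved out by the longer Allow rule. The matcher is written by hand so that the precedence rule is explicit rather than left to a library's defaults.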

Other Records

Field	Value
sitemap	https://aljazeera.it.com/sitemap.xml
sitemap	https://aljazeera.it.com/sitemap-news.xml

Comments

XML Sitemap & Google News version 5.4.9 - https://status301.net/wordpress-plugins/xml-sitemap-feed/
START YOAST BLOCK
---------------------------
---------------------------
END YOAST BLOCK
