Robots Exclusion Standard data for counterterrorismblog.org

Resource Scan

Scan Details

Site Domain counterterrorismblog.org
Base Domain counterterrorismblog.org
Scan Status Ok
Last Scan 2025-10-18T22:52:39+00:00
Next Scan 2025-11-17T22:52:39+00:00

Last Scan

Scanned 2025-10-18T22:52:39+00:00
URL https://counterterrorismblog.org/robots.txt
Domain IPs 104.21.17.247, 172.67.178.229, 2606:4700:3030::6815:11f7, 2606:4700:3036::ac43:b2e5
Response IP 172.67.178.229
Found Yes
Hash 9549ad4adab0630ae9350a08c216c9e1417ff4ba39a63af3c3a53daf60ac9b5b
SimHash 68645e82d712

Groups

*

Rule Path
Disallow /comments/feed
Disallow /feed/$
Disallow /*/feed/$
Disallow /*/feed/rss/$
Disallow /*/trackback/$
Disallow /*/*/feed/$
Disallow /*/*/feed/rss/$
Disallow /*/*/trackback/$
Disallow /*/*/*/feed/$
Disallow /*/*/*/feed/rss/$
Disallow /*/*/*/trackback/$
Disallow /trackback/
Disallow /wp-admin/
Disallow /*?
Disallow /*.inc$
Disallow */trackback/
Disallow /go/
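
Several of the paths above use the `*` wildcard and the `$` end-of-path anchor. As a rough illustration of how such patterns are commonly interpreted (following the matching described in RFC 9309), the sketch below translates each pattern into a regular expression and tests a few paths. The helper names `pattern_to_regex` and `is_disallowed` and the sample paths are hypothetical and not part of the scan. The sketch covers only the Disallow rules of the `*` group; it ignores Allow rules, longest-match precedence, and percent-encoding.

```python
import re

# Disallow paths of the "*" group, copied from the table above.
RULES = [
    "/comments/feed", "/feed/$", "/*/feed/$", "/*/feed/rss/$",
    "/*/trackback/$", "/*/*/feed/$", "/*/*/feed/rss/$",
    "/*/*/trackback/$", "/*/*/*/feed/$", "/*/*/*/feed/rss/$",
    "/*/*/*/trackback/$", "/trackback/", "/wp-admin/", "/*?",
    "/*.inc$", "*/trackback/", "/go/",
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters (including '/'),
    and a trailing '$' anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.compile(regex)

def is_disallowed(path: str, patterns=RULES) -> bool:
    """Return True if any pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)

if __name__ == "__main__":
    for sample in ["/feed/", "/2010/05/post/", "/2010/05/post/trackback/",
                   "/index.php?p=1", "/lib/db.inc", "/go/somewhere"]:
        print(sample, is_disallowed(sample))
```

Because `*` is assumed to span path separators, `/*/feed/$` already matches feeds at any depth, which would make the deeper `/*/*/feed/$` variants redundant under this interpretation.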

mediapartners-google

Rule Path
Allow /

adsbot-google

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

ia_archiver

Rule Path
Disallow /
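
The named groups above take precedence over the `*` group for the crawlers they name. A minimal sketch of that selection, using Python's standard `urllib.robotparser`, is shown below. The robots.txt text is reconstructed from the tables in this report rather than fetched, and it includes only two non-wildcard rules from the `*` group because `urllib.robotparser` does not interpret `*` or `$` in paths. The agent name `ExampleBot` and the sample URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed subset of the groups listed in this report (an assumption:
# the original file's exact layout and casing are not shown in the scan).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /go/

User-agent: googlebot-image
Allow: /

User-agent: ia_archiver
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

BASE = "https://counterterrorismblog.org"

def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on the scanned site."""
    return rp.can_fetch(agent, BASE + path)

if __name__ == "__main__":
    for agent in ["ia_archiver", "googlebot-image", "ExampleBot"]:
        for path in ["/about/", "/wp-admin/"]:
            print(agent, path, allowed(agent, path))
```

In this sketch `ia_archiver` is refused everywhere, `googlebot-image` is allowed everywhere because its own group applies instead of `*`, and an unlisted agent falls back to the `*` group.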

Other Records

Field Value
sitemap http://counterterrorismblog.org//sitemap.xml

Warnings

  • `noindex` is not a known field.