statesman.com
robots.txt

Robots Exclusion Standard data for statesman.com

Resource Scan

Scan Details

Site Domain statesman.com
Base Domain statesman.com
Scan Status Ok
Last Scan 2024-04-28T17:21:32+00:00
Next Scan 2024-05-05T17:21:32+00:00

Last Scan

Scanned 2024-04-28T17:21:32+00:00
URL https://statesman.com/robots.txt
Redirect https://www.statesman.com/robots.txt
Redirect Domain www.statesman.com
Redirect Base statesman.com
Domain IPs 151.101.202.62
Redirect IPs 151.101.130.62, 151.101.194.62, 151.101.2.62, 151.101.66.62
Response IP 199.232.46.62
Found Yes
Hash 5f4da7fa23973dda5bc866538cb17c9978b163035832c4c97ba44280770b7fc7
SimHash 3b1e1fe72583

Groups

anthropic-ai

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

googlebot-news

Rule Path
Disallow /story/sponsor-story/
Disallow /picture-gallery/sponsor-story/
Disallow /videos/sponsor-story/
Disallow /longform/sponsor-story/
Disallow /pages/interactives/sponsor-story/
Disallow /interactives/sponsor-story/
Disallow /videos/embed/

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

*

Rule Path
Disallow /errors
Disallow /interactive/
Disallow /userauth/
Disallow /ugc/
Disallow /feeds/
Disallow /services/
Disallow /facebook/
Disallow /version-info/
Disallow /longform/draft/
Disallow /story/draft/
Disallow /topic/*/smart/
Disallow /search
Disallow /module-showcase/
Disallow /newsletter/
Disallow /blended-newsletter/
Disallow /story/nletter/
Disallow /sports/services/photos/
Disallow /optimus
Disallow /ux-train
Disallow /story/advisory/
Disallow /.cam-tangent/
Disallow /pbd/
Disallow /gciaf/
Disallow /*?refresh
Disallow /*template%3Dprintart
Disallow /.cache/
Disallow /a/
Disallow /advertising/
Disallow /apps/pbcs.dll/art_tips
Disallow /apps/pbcs.dll/classifieds
Disallow /apps/pbcs.dll/css/
Disallow /apps/pbcs.dll/error
Disallow /apps/pbcs.dll/events
Disallow /apps/pbcs.dll/exec
Disallow /apps/pbcs.dll/index
Disallow /apps/pbcs.dll/js/
Disallow /apps/pbcs.dll/misc
Disallow /apps/pbcs.dll/netguest
Disallow /apps/pbcs.dll/ptshowguide
Disallow /apps/pbcs.dll/ptshowguideitem
Disallow /apps/pbcs.dll/related
Disallow /apps/pbcs.dll/search
Disallow /apps/pbcs.dll/temaoversikt
Disallow /apps/pbcsad.dll
Disallow /apps/pbcsi.dll
Disallow /apps/rub.dll
Disallow /binary/
Disallow /blotter/
Disallow /carousel/
Disallow /default_rewrites.ini
Disallow /doubleclick/
Disallow /error
Disallow /external-search/
Disallow /factbox/
Disallow /images/
Disallow /includes/
Disallow /interact/
Disallow /interstitial/
Disallow /localdata.ini
Disallow /logs/
Disallow /mal/
Disallow /navigation/
Disallow /notices/
Disallow /overlay/
Disallow /pagination/
Disallow /peel/
Disallow /publicus.ini
Disallow /relatedimage/
Disallow /scr/
Disallow /search/
Disallow /section?template=emailfriend
Disallow /shareable/
Disallow /skyslider/
Disallow /submit/
Disallow /templates/
Disallow /tmp/
Disallow /verticals/
Disallow /w/
Disallow /xsendmail.ini
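
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. It works on a short excerpt reconstructed from the tables in this report, not on the raw file, so the `User-agent:` / `Disallow:` spelling and the ordering are assumptions. The agent name `examplebot` and the sample paths are hypothetical. Wildcard rules such as `/topic/*/smart/` are left out of the excerpt because this sketch relies on plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the scan report; the raw file's exact
# spelling and ordering are assumptions, and only a few rules are shown.
ROBOTS_EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: googlebot-news
Disallow: /story/sponsor-story/
Disallow: /videos/embed/

User-agent: *
Disallow: /search
Disallow: /newsletter/
Disallow: /feeds/
"""

BASE = "https://www.statesman.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)

# (user agent, path) pairs; "examplebot" is a hypothetical crawler that
# has no group of its own and therefore falls under the "*" group.
checks = [
    ("gptbot", "/news/"),
    ("googlebot-news", "/story/sponsor-story/example"),
    ("googlebot-news", "/search"),
    ("examplebot", "/search"),
    ("examplebot", "/news/"),
]

for agent, path in checks:
    allowed = parser.can_fetch(agent, BASE + path)
    print(f"{agent:15} {path:30} {'allowed' if allowed else 'blocked'}")
```

In this sketch a crawler that has its own group is evaluated against that group only, which is why `googlebot-news` is not held to the `/search` rule listed under `*`.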

Other Records

Field Value
sitemap https://www.statesman.com/news-sitemap.xml
sitemap https://www.statesman.com/web-sitemap-index.xml
sitemap https://www.statesman.com/video-sitemap-index.xml
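
The sitemap records can be read with the same standard-library parser. This is a small sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available) and assuming the raw file writes each record as a `Sitemap:` line; the report shows only the field and value.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the report, written in the usual "Sitemap:" form
# (the exact spelling in the raw file is an assumption).
SITEMAP_LINES = """\
Sitemap: https://www.statesman.com/news-sitemap.xml
Sitemap: https://www.statesman.com/web-sitemap-index.xml
Sitemap: https://www.statesman.com/video-sitemap-index.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES.splitlines())

# site_maps() returns None when no sitemap records are present,
# so fall back to an empty list.
sitemaps = sitemap_parser.site_maps() or []

for url in sitemaps:
    print(url)
```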

Comments

  • robots.txt file for https://www.statesman.com/