widgets.scmp.com
robots.txt

Robots Exclusion Standard data for widgets.scmp.com

Resource Scan

Scan Details

Site Domain widgets.scmp.com
Base Domain scmp.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-12-04T07:47:21+00:00
Next Scan 2025-12-05T07:47:21+00:00

Last Successful Scan

Scanned 2025-10-17T21:26:33+00:00
URL https://widgets.scmp.com/robots.txt
Domain IPs 170.33.12.207
Response IP 170.33.12.207
Found Yes
Hash 761041dcb7c9e0afb7fbd14c6d4fedeccc6bb6f0e79bc59e334d2f965fa3577b
SimHash b4805c61d5ee

Groups

*

Rule Path
Disallow /video/
Disallow /5g/
Disallow /21hk/
Disallow /crossword/
Disallow /demo/
Disallow /factsandfigures/
Disallow /fbInstantArticleFeeder/
Disallow /GoldthreadFeeder/
Disallow /GoogleNewsstandFeeder/
Disallow /images/
Disallow /infographic/
Disallow /inside-china/
Disallow /june4/
Disallow /log/
Disallow /misc/
Disallow /mobileapp/
Disallow /mobileappdownload/
Disallow /newsletters/
Disallow /NewsRepublicFeeder/
Disallow /partials/
Disallow /racing/
Disallow /record/
Disallow /rss/
Disallow /RSSSyndicationFeed/
Disallow /RSSSyndicationFeedDemo/
Disallow /scmp_hd_footer/
Disallow /scmpir/
Disallow /scmpjob/
Disallow /scmpnext/
Disallow /series/
Disallow /SmartNewsFeeder/
Disallow /sport/
Disallow /syndication/
Disallow /tmp/
Disallow /tools/
Disallow /TopBuzzRSSFeed/
Disallow /UCnewsFeeder/
Allow /newsletters/archive/scmpglobalimpact/
Disallow /newsletters/archive/scmpglobalimpact/api/
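
The three /newsletters/ rules above overlap, so the outcome for a given path depends on how a crawler resolves precedence. The following is a minimal sketch, assuming the longest-match convention described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). It covers plain prefix rules only, with no wildcard handling. The function name is_allowed and the sample paths are hypothetical and not part of the scanned file; individual crawlers may resolve these rules differently.

```python
# Illustrative only: a subset of the rules listed for the "*" group.
RULES = [
    ("disallow", "/newsletters/"),
    ("allow", "/newsletters/archive/scmpglobalimpact/"),
    ("disallow", "/newsletters/archive/scmpglobalimpact/api/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence.

    Assumes plain prefix rules (no '*' or '$'). Matching is case-sensitive.
    """
    best = None  # (kind, prefix) of the most specific matching rule so far
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = best is None or len(prefix) > len(best[1])
        tie_prefers_allow = (
            best is not None and len(prefix) == len(best[1]) and kind == "allow"
        )
        if longer or tie_prefers_allow:
            best = (kind, prefix)
    # A path matched by no rule is allowed by default.
    return best is None or best[0] == "allow"


if __name__ == "__main__":
    for sample in (
        "/newsletters/weekly",
        "/newsletters/archive/scmpglobalimpact/issue-1",
        "/newsletters/archive/scmpglobalimpact/api/v1",
    ):
        print(sample, is_allowed(sample))
```

Under this assumption, the Allow rule opens up the scmpglobalimpact archive inside the otherwise blocked /newsletters/ tree, and the longer Disallow rule closes its api/ subtree again.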

Comments

  • Disallow: /