scmp.com
robots.txt

Robots Exclusion Standard data for scmp.com

Resource Scan

Scan Details

Site Domain scmp.com
Base Domain scmp.com
Scan Status Ok
Last Scan 2024-09-21T17:21:40+00:00
Next Scan 2024-09-28T17:21:40+00:00

Last Scan

Scanned 2024-09-21T17:21:40+00:00
URL https://scmp.com/robots.txt
Redirect https://www.scmp.com/robots.txt
Redirect Domain www.scmp.com
Redirect Base scmp.com
Domain IPs 104.18.204.43, 104.18.205.43, 2606:4700::6812:cc2b, 2606:4700::6812:cd2b
Redirect IPs 104.18.204.43, 104.18.205.43, 2606:4700::6812:cc2b, 2606:4700::6812:cd2b
Response IP 104.18.204.43
Found Yes
Hash 2c1855d1eb689c8bd13c2057d48607d7cf201a3f0b471c116fe8ac7c9eeacb9e
SimHash 1cbc97185f4c

Groups

*

Rule Path
Disallow /public/
Disallow /static/
Disallow /login
Disallow /signin
Disallow /register
Disallow /logout
Disallow /login/facebook
Disallow /login/facebook/*
Disallow /styleguide/*
Disallow /healthz
Disallow /.well-known/*
Disallow /*/firebase-messaging-sw.js
Disallow /google97d8d43559c9b155.html
Allow /static/*.css$
Allow /static/*.css?
Allow /static/*.js$
Allow /static/*.js?
Allow /static/*.gif
Allow /static/*.jpg
Allow /static/*.jpeg
Allow /static/*.png
Allow /public/*.css$
Allow /public/*.css?
Allow /public/*.js$
Allow /public/*.js?
Allow /public/*.gif
Allow /public/*.jpg
Allow /public/*.jpeg
Allow /public/*.png
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /sites/default/files/*.pdf
Disallow /sites/default/files/*.doc
Disallow /sites/default/files/*.docx
Disallow /sites/default/files/*.swf
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow *?destination=*
Disallow /ajax_comments/
Disallow /scmp_comments/
Disallow *Article_type%3D*
Disallow *field_article*
Disallow /label/
Disallow /node/*/nodequeue
Disallow /node/*/edit
Disallow /ajax
Disallow /ajax/*
Disallow /facebook-instant-articles/feed/*
Disallow /epaper
Disallow /epaper/*
Disallow /story/style/*
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
Disallow /?q=node%2F*%2Fedit
Disallow /?q=node%2F*%2Fnodequeue
Disallow /?q=epaper
Disallow /?q=epaper%2F*
Disallow /?q=facebook-instant-articles%2Ffeed%2F*
Disallow /*/logSend$
Disallow /*/spmException$
Disallow /*/spmact$
Disallow /*/antiSpam$
Disallow /*/nameStorage$
Disallow /*/spmMonitor$
Disallow /*/pvData$
Disallow /*/goldlog$
Disallow /*/initLoad$
Disallow /*/beforeUnload$
Disallow /*/util$
Disallow /*/metaInfo$
Disallow /*/beaconBase$
Disallow /*/spm$
Disallow /*/makeid$
Disallow /*/referrer$
Disallow /*/pvid$
Disallow /*/etag$
Disallow /*/iframe$
Disallow /*/client$
Disallow /*/windvane$
Disallow /*/cookie$
Disallow /*/sendpv$
Disallow /*/personality/index$
Disallow /*/misc$
Disallow /*/client$
Disallow /*/log$
Disallow /*/compose$
Disallow /*/lib_b/*$
Disallow /print/
Disallow /?q=print%2F
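
The rules above mix plain prefixes with `*` wildcards and `$` end anchors, and several Allow lines carve exceptions out of broader Disallow lines (for example `/static/*.css$` inside `/static/`). The following is a minimal sketch of how a crawler might evaluate them, assuming the longest-match precedence described in RFC 9309, with ties going to Allow. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and `RULES` is only a small subset of the table copied here for illustration. Individual crawlers may resolve conflicts differently.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$' the pattern behaves as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules: list) -> bool:
    """Decide whether `path` may be fetched.

    `rules` is a list of (kind, pattern) pairs, kind being 'allow' or
    'disallow'. Assumption: the longest matching pattern wins, an
    equal-length tie goes to 'allow', and no match means allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = (kind == "allow")
    return allowed

# A few rules copied from the "*" group above (illustrative subset).
RULES = [
    ("disallow", "/static/"),
    ("allow", "/static/*.css$"),
    ("allow", "/static/*.png"),
    ("disallow", "/login"),
    ("disallow", "/*/logSend$"),
]

if __name__ == "__main__":
    for p in ["/static/app.css", "/static/data.json", "/login/facebook"]:
        print(p, is_allowed(p, RULES))
```

Under these assumptions a stylesheet under `/static/` stays fetchable because the Allow pattern is longer than the `/static/` Disallow, while other files in that directory remain blocked.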

Other Records

Field Value
crawl-delay 10
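
The `crawl-delay 10` record asks crawlers in the `*` group to leave roughly ten seconds between requests. The directive is not part of the core Robots Exclusion Protocol and some crawlers ignore it, so the sketch below only shows one way a polite client could honor it. The generator name `paced` and its injectable `sleep` and `clock` parameters are assumptions made for this example.

```python
import time

CRAWL_DELAY_SECONDS = 10  # value reported for the "*" group

def paced(urls, delay=CRAWL_DELAY_SECONDS, sleep=time.sleep, clock=time.monotonic):
    """Yield URLs no faster than one per `delay` seconds.

    `sleep` and `clock` are injectable so the pacing logic can be
    exercised without actually waiting.
    """
    last = None
    for url in urls:
        if last is not None:
            wait = delay - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield url

if __name__ == "__main__":
    # Short delay here so the demo finishes quickly.
    for u in paced(["/a", "/b"], delay=0.1):
        print("fetch", u)
```

Measuring from the previous yield rather than sleeping a fixed amount means time already spent processing a response counts toward the delay.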

newsnow

Rule Path
Allow /print/
Allow /?q=print%2F

grapeshot

Rule Path
Allow /*/article/*$
Disallow /*?*cid=*
Disallow /*?*showonlyads=*
Disallow /*?*nograpeshot=*
Disallow /*?*noixwrapper=*
Disallow /*?*nogtm=*
Disallow /*?*nochartbeat=*
Disallow /*?*noga=*
Disallow /*?*nomoatyi=*
Disallow /*?*nomoat=*
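
The report lists three groups: `*`, `newsnow`, and `grapeshot`. Under RFC 9309, a crawler uses the group whose user-agent line matches its own product token and falls back to `*` only when no named group matches, so the named groups replace the `*` rules instead of adding to them. The sketch below illustrates that selection step. The function `select_group` is hypothetical, the match is a simple case-insensitive comparison, and `GROUPS` holds only a few rules copied from the report.

```python
def select_group(product_token: str, groups: dict) -> list:
    """Return the rule list for the group matching `product_token`.

    Assumption: group names are compared case-insensitively against the
    crawler's product token, and "*" is used only as a fallback.
    """
    token = product_token.lower()
    for name, rules in groups.items():
        if name != "*" and name.lower() == token:
            return rules
    return groups.get("*", [])

# Illustrative subset of the groups in this report.
GROUPS = {
    "*": [("disallow", "/print/"), ("disallow", "/?q=print%2F")],
    "newsnow": [("allow", "/print/"), ("allow", "/?q=print%2F")],
    "grapeshot": [("allow", "/*/article/*$"), ("disallow", "/*?*cid=*")],
}

if __name__ == "__main__":
    print(select_group("NewsNow", GROUPS))
    print(select_group("examplebot", GROUPS))
```

Read this way, the `newsnow` group is not bound by the `/print/` Disallow lines in the `*` group, since only its own two Allow rules apply.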

Other Records

Field Value
sitemap https://www.scmp.com/sitemap.xml
sitemap https://www.scmp.com/sitemap_explained.xml
sitemap https://www.scmp.com/sitemap_podcasts.xml
sitemap https://www.scmp.com/sitemap_announcements.xml
sitemap https://www.scmp.com/sitemap_infographics.xml
sitemap https://www.scmp.com/sitemap_news.xml
sitemap https://www.scmp.com/sitemap_economy.xml
sitemap https://www.scmp.com/sitemap_business.xml
sitemap https://www.scmp.com/sitemap_comment.xml
sitemap https://www.scmp.com/sitemap_tech.xml
sitemap https://www.scmp.com/sitemap_lifestyle.xml
sitemap https://www.scmp.com/sitemap_culture.xml
sitemap https://www.scmp.com/sitemap_sport.xml
sitemap https://www.scmp.com/sitemap_property.xml
sitemap https://www.scmp.com/sitemap_photos.xml
sitemap https://www.scmp.com/sitemap_video.xml
sitemap https://www.scmp.com/sitemap_destination_macau.xml
sitemap https://www.scmp.com/sitemap_magazines.xml
sitemap https://www.scmp.com/sitemap_this_week_in_asia.xml
sitemap https://www.scmp.com/sitemap_directories.xml
sitemap https://www.scmp.com/sitemap_weather.xml
sitemap https://www.scmp.com/sitemap_about_us.xml
sitemap https://www.scmp.com/sitemap_lists.xml
sitemap https://www.scmp.com/sitemap_special_reports.xml
sitemap https://www.scmp.com/sitemap_country_reports.xml
sitemap https://www.scmp.com/sitemap_video_comments.xml
sitemap https://www.scmp.com/sitemap_video_scmp_originals.xml
sitemap https://www.scmp.com/sitemap_video_hong_kong.xml
sitemap https://www.scmp.com/sitemap_video_china.xml
sitemap https://www.scmp.com/sitemap_video_asia.xml
sitemap https://www.scmp.com/sitemap_video_world.xml
sitemap https://www.scmp.com/sitemap_video_business.xml
sitemap https://www.scmp.com/sitemap_video_arts_culture.xml
sitemap https://www.scmp.com/sitemap_video_technology.xml
sitemap https://www.scmp.com/sitemap_video_lifestyle.xml
sitemap https://www.scmp.com/sitemap_video_sport.xml
sitemap https://www.scmp.com/sitemap_video_offbeat.xml
sitemap https://www.scmp.com/sitemap_video_style.xml
sitemap https://www.scmp.com/sitemap_video_post_mag.xml
sitemap https://www.scmp.com/sitemap_video_presented.xml
sitemap https://www.scmp.com/sitemap_article.xml
sitemap https://www.scmp.com/sitemap_gallery.xml
sitemap https://www.scmp.com/sitemap_poll.xml
sitemap https://www.scmp.com/sitemap_promotion.xml
sitemap https://www.scmp.com/sitemap_webform.xml
sitemap https://www.scmp.com/sitemap_video_format.xml
sitemap https://www.scmp.com/sitemap_sections.xml
sitemap https://www.scmp.com/sitemap_topics.xml
sitemap https://www.scmp.com/sitemap_authors.xml
sitemap https://www.scmp.com/sitemap_archive.xml
sitemap https://www.scmp.com/sitemap_abacus.xml
sitemap https://www.scmp.com/sitemap_recipe.xml
sitemap https://www.scmp.com/sitemap_better_life.xml
sitemap https://www.scmp.com/sitemap_podcast.xml
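
The sitemap records above are not tied to any user-agent group; they apply to the file as a whole. As a rough sketch of how such a list could be pulled from raw robots.txt text, the hypothetical helper `extract_sitemaps` below scans for `Sitemap:` lines case-insensitively and skips comments. The sample text is a shortened stand-in, not the full file.

```python
def extract_sitemaps(robots_txt: str) -> list:
    """Collect the values of all `Sitemap:` lines in a robots.txt body."""
    urls = []
    for raw in robots_txt.splitlines():
        # Drop trailing comments, then split "field: value" once.
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

SAMPLE = """\
User-agent: *
Crawl-delay: 10
# Sitemap
Sitemap: https://www.scmp.com/sitemap.xml
sitemap: https://www.scmp.com/sitemap_news.xml
"""

if __name__ == "__main__":
    for url in extract_sitemaps(SAMPLE):
        print(url)
```

Splitting on the first colon only keeps the `https://` in each URL intact.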

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • PWA
  • Directories
  • Path
  • CSS, JS, Image
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • NewsNow
  • GrapeShot
  • Ads
  • Sitemap