media-self-com.cdn.ampproject.org
robots.txt

Robots Exclusion Standard data for media-self-com.cdn.ampproject.org

Resource Scan

Scan Details

Site Domain media-self-com.cdn.ampproject.org
Base Domain ampproject.org
Scan Status Ok
Last Scan 2024-10-21T18:41:17+00:00
Next Scan 2024-11-20T18:41:17+00:00

Last Scan

Scanned 2024-10-21T18:41:17+00:00
URL https://media-self-com.cdn.ampproject.org/robots.txt
Domain IPs 2404:6800:4003:c0f::84, 74.125.200.132
Response IP 142.250.195.193
Found Yes
Hash 224c5bd0e99e00d47c3487baf02055ba9c57ca5f17e08caa84f8b757292e1a71
SimHash f85c5917cfe1

Groups

*

Rule Path
Disallow /a/
Disallow /action/
Disallow /c/
Disallow /crt/
Disallow /h/
Disallow /i/
Disallow /ii/
Disallow /m/
Disallow /r/
Disallow /v/
Disallow /wp/

twitterbot

Rule Path
Allow /a/
Allow /action/
Allow /c/
Allow /crt/
Allow /h/
Allow /i/
Allow /ii/
Allow /m/
Allow /r/
Allow /v/
Allow /wp/
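
The two groups above can be checked with a small sketch. It rebuilds a robots.txt body from the listed rules and evaluates it with Python's standard urllib.robotparser. The reconstructed text is an assumption based on the tables in this report, not a byte-for-byte copy of the scanned file (so it will not reproduce the hash above), and the helper names build_robots_txt and is_allowed, the agent name ExampleBot, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Host taken from this report; the path list mirrors the rule tables above.
HOST = "https://media-self-com.cdn.ampproject.org"
PATHS = ["/a/", "/action/", "/c/", "/crt/", "/h/", "/i/",
         "/ii/", "/m/", "/r/", "/v/", "/wp/"]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the two groups (assumed layout)."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in PATHS]
    lines.append("")  # a blank line separates the groups
    lines.append("User-agent: twitterbot")
    lines += [f"Allow: {path}" for path in PATHS]
    return "\n".join(lines) + "\n"


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the reconstructed rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser.can_fetch(user_agent, HOST + path)


if __name__ == "__main__":
    for agent in ("ExampleBot", "Twitterbot"):
        for path in ("/c/s/example/page", "/robots.txt"):
            print(agent, path, is_allowed(agent, path))
```

Under these assumptions, a generic crawler falls into the * group and is blocked from every listed prefix, while a crawler identifying as twitterbot matches its own group and is allowed. Paths outside the listed prefixes are not covered by any rule and are treated as allowed by this parser.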

Comments

  • Dear Robots (and legal human guardians),
  • The Google AMP Cache is roboted to crawlers. We recommend that search engines
  • process cache links according to the guidelines in https://goo.gl/G40cwD.
  • If you only access the Google AMP Cache for user initiated requests,
  • please contact us at amphtml-robots@googlegroups.com.