scienceblog.com
robots.txt

Robots Exclusion Standard data for scienceblog.com

Resource Scan

Scan Details

Site Domain scienceblog.com
Base Domain scienceblog.com
Scan Status Ok
Last Scan 2024-11-12T20:02:03+00:00
Next Scan 2024-11-19T20:02:03+00:00

Last Scan

Scanned 2024-11-12T20:02:03+00:00
URL https://scienceblog.com/robots.txt
Domain IPs 104.24.18.87, 104.24.19.87, 2606:4700:20::6818:1257, 2606:4700:20::6818:1357
Response IP 104.24.19.87
Found Yes
Hash 079552bc3845f158aff9a5ce454505363887e71de1aff7b0ce29e100c2480308
SimHash 7b0850d4c290

Groups

*

Rule Path
Allow /wp-content/uploads/
Allow /wp-json/
Disallow /wp-admin/
Disallow /wp-login.php
Disallow /wp-register.php
Disallow /admin/
Disallow /login/
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /xmlrpc.php
Disallow /readme.html
Disallow /?author=*
Disallow *?s=*
Disallow *?p=*
Disallow */trackback/
Disallow */feed/
Disallow */comments/
Disallow /*?*
Disallow /*.php$
Disallow /wp-content/debug.log
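
The interaction between the Allow and Disallow rows in the `*` group is easier to see when evaluated against concrete paths. The following is a minimal sketch, assuming the longest-match semantics described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The function names `pattern_to_regex` and `is_allowed` are hypothetical, and real crawlers may treat patterns that do not begin with `/` (such as `*?s=*`) differently.

```python
import re

# Rules copied from the "*" group table above, in listed order.
RULES = [
    ("Allow", "/wp-content/uploads/"),
    ("Allow", "/wp-json/"),
    ("Disallow", "/wp-admin/"),
    ("Disallow", "/wp-login.php"),
    ("Disallow", "/wp-register.php"),
    ("Disallow", "/admin/"),
    ("Disallow", "/login/"),
    ("Disallow", "/wp-includes/"),
    ("Disallow", "/wp-content/plugins/"),
    ("Disallow", "/wp-content/themes/"),
    ("Disallow", "/xmlrpc.php"),
    ("Disallow", "/readme.html"),
    ("Disallow", "/?author=*"),
    ("Disallow", "*?s=*"),
    ("Disallow", "*?p=*"),
    ("Disallow", "*/trackback/"),
    ("Disallow", "*/feed/"),
    ("Disallow", "*/comments/"),
    ("Disallow", "/*?*"),
    ("Disallow", "/*.php$"),
    ("Disallow", "/wp-content/debug.log"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under the given rules.

    Assumed semantics: the longest matching pattern wins; on equal
    length Allow beats Disallow; no match means allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ["/2024/11/some-post/", "/wp-admin/options.php",
              "/wp-json/wp/v2/posts?page=2", "/?s=neutrino"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

Under these assumptions, `/wp-json/wp/v2/posts?page=2` comes out allowed even though it contains a query string, because `Allow /wp-json/` is a longer pattern than `Disallow /*?*`. An ordinary post URL with no query string matches no rule and is allowed by default.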

jetpack

Rule Path
Allow *

amazonbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60

dataforseobot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 60
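
The agent-specific groups above can be checked with Python's standard `urllib.robotparser`. This is only a sketch: the `RECONSTRUCTED` text is rebuilt from the tables in this report and is an assumption, not the verbatim file served at the scanned URL, and the helper name `summarize` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the group tables above; NOT the verbatim robots.txt.
RECONSTRUCTED = """\
User-agent: amazonbot
Disallow: /

User-agent: petalbot
Disallow: /

User-agent: ahrefsbot
Crawl-delay: 60

User-agent: dataforseobot
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED.splitlines())


def summarize(agent, url="https://scienceblog.com/"):
    """Return (may_fetch, crawl_delay_seconds) for a user agent."""
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ["amazonbot", "petalbot", "ahrefsbot", "dataforseobot"]:
        print(agent, summarize(agent))
```

In this sketch, amazonbot and petalbot are refused everywhere with no delay recorded, while ahrefsbot and dataforseobot may fetch any path but are asked to wait 60 seconds between requests. Crawl-delay is not part of RFC 9309, so whether a given crawler honors it is up to that crawler.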

Other Records

Field Value
sitemap /sitemap.xml

Comments

  • Only allow minimal required resources
  • Block WordPress system directories and files
  • Allow Jetpack (most Jetpack features will work through wp-json endpoint)
  • WordPress sitemap