themarketingheaven.com
robots.txt

Robots Exclusion Standard data for themarketingheaven.com

Resource Scan

Scan Details

Site Domain themarketingheaven.com
Base Domain themarketingheaven.com
Scan Status Ok
Last Scan 2024-05-22T23:30:38+00:00
Next Scan 2024-06-21T23:30:38+00:00

Last Scan

Scanned 2024-05-22T23:30:38+00:00
URL https://themarketingheaven.com/robots.txt
Domain IPs 104.26.14.49, 104.26.15.49, 172.67.70.162, 2606:4700:20::681a:e31, 2606:4700:20::681a:f31, 2606:4700:20::ac43:46a2
Response IP 104.26.14.49
Found Yes
Hash b98e6e34019a958dad0bcac3f0ea4a3d156100b8d7618815a91a30546ce3e3d7
SimHash b04051eb0cbb
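
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a content hash of a fetched robots.txt body could be computed as follows. The function name `robots_digest` and the sample body are hypothetical and are not taken from the scan.

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the report's "Hash" field is SHA-256 over the raw bytes.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, NOT the site's actual robots.txt.
sample = b"User-agent: *\nDisallow: /counter/\n"
digest = robots_digest(sample)
print(len(digest))  # a SHA-256 hex digest is always 64 characters
```

Comparing digests between scans is a cheap way to detect that the file changed at all; it says nothing about how much it changed.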

Groups

*

Rule Path
Disallow /wp-content/plugins/counter/
Disallow /counter/

rogerbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

xovibot

Rule Path
Disallow /
Allow *.js*
Allow *.css*
Allow *.png*
Allow *.jpg*
Allow *.gif*
Allow /wp-admin/admin-ajax.php
Disallow /?s=
Disallow /search/
Disallow /wp-admin
Disallow */feed/
Disallow */page/
Disallow /wp-login.php
Disallow /wp-register.php
Disallow /trackback/
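
The last group mixes a blanket Disallow with wildcard Allow rules, so the outcome for a given URL depends on rule precedence. The sketch below illustrates the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie), using a subset of the rules as they are listed above. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may resolve conflicts differently.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


# Subset of the rules listed in the last group of the report.
RULES = [
    ("disallow", "/"),
    ("allow", "*.js*"),
    ("allow", "*.css*"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-admin"),
    ("disallow", "*/feed/"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; Allow wins when lengths tie."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt rules do.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


print(is_allowed("/wp-admin/admin-ajax.php"))
print(is_allowed("/wp-admin/options.php"))
print(is_allowed("/assets/app.js"))
print(is_allowed("/blog/feed/"))
```

Under this reading, the AJAX endpoint and script files stay crawlable for the group, while other admin paths and feed URLs do not.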

Other Records

Field Value
sitemap https://themarketingheaven.com/sitemap_index.xml
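
As a rough sketch of how the sitemap record and the simple (non-wildcard) groups can be read programmatically, Python's standard `urllib.robotparser` can parse the equivalent robots.txt lines. The lines below are reconstructed from the report for illustration rather than copied from the live file, and `site_maps()` requires Python 3.8 or later. Note that this parser does plain prefix matching and does not expand `*` wildcards, so the wildcard rules are left out.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report for illustration; not the verbatim file.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /wp-content/plugins/counter/",
    "Disallow: /counter/",
    "",
    "User-agent: rogerbot",
    "Disallow: /",
    "",
    "Sitemap: https://themarketingheaven.com/sitemap_index.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() returns None when no Sitemap record is present.
sitemaps = parser.site_maps() or []
print(sitemaps)

BASE = "https://themarketingheaven.com"
# "SomeOtherBot" is a made-up agent that falls back to the "*" group.
print(parser.can_fetch("rogerbot", BASE + "/"))
print(parser.can_fetch("SomeOtherBot", BASE + "/counter/"))
print(parser.can_fetch("SomeOtherBot", BASE + "/blog/"))
```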

Comments

  • Allow files critical for rendering
  • Allow AJAX - Do Not Remove
  • Prevent private admin areas from being crawled
  • Prevent duplicate /feed/ pages from being crawled
  • Prevent paging pages from being crawled
  • Prevent login page crawls etc
  • Prevent register page crawls etc
  • Prevent Trackback Neg SEO