survivingthestores.com
robots.txt

Robots Exclusion Standard data for survivingthestores.com

Resource Scan

Scan Details

Site Domain survivingthestores.com
Base Domain survivingthestores.com
Scan Status Ok
Last Scan 2024-11-15T16:26:47+00:00
Next Scan 2024-11-22T16:26:47+00:00

Last Scan

Scanned 2024-11-15T16:26:47+00:00
URL https://survivingthestores.com/robots.txt
Domain IPs 104.21.23.53, 172.67.209.54, 2606:4700:3033::6815:1735, 2606:4700:3036::ac43:d136
Response IP 104.21.23.53
Found Yes
Hash 3c44b145eaa69309e112d8244fa7f080b774b30e4c03f579c28baf2716625514
SimHash 200d2693c2e5
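
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal characters are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. `body_fingerprint` is a hypothetical helper name, and nothing here is confirmed to reproduce the value listed above.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 is used because the reported hash is 64 hex
    characters long; the scanner's actual algorithm is not documented.
    """
    return hashlib.sha256(body).hexdigest()


# Two fetches with identical bytes give the same fingerprint, so a changed
# fingerprint between scans signals that the file was edited.
previous = body_fingerprint(b"User-agent: *\nDisallow: /wp-admin\n")
current = body_fingerprint(b"User-agent: *\nDisallow: /wp-admin\n")
unchanged = previous == current
```

Comparing fingerprints from consecutive scans is a cheap way to detect edits without diffing the file itself.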

Groups

*

Rule Path
Disallow /wp-admin
Disallow /wp-content/cache
Disallow /trackback
Disallow /comments
Disallow /category/*/*
Disallow */trackback
Disallow */comments
Disallow /wp-content/dnld
Disallow /wp-content/dnld/
Allow /wp-content/uploads
Disallow /*?*
Disallow /*?
Disallow /*.php$
Disallow /*.inc$
Disallow /*.gz$
Disallow /*.wmv$
Disallow /*.cgi$
Disallow /*.xhtml$

Other Records

Field Value
crawl-delay 20
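
The path rules in the `*` group use two special characters: `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the path. The sketch below shows one way such rules could be evaluated, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the "*" group in this report, in listed order.
RULES = [
    ("disallow", "/wp-admin"),
    ("disallow", "/wp-content/cache"),
    ("disallow", "/trackback"),
    ("disallow", "/comments"),
    ("disallow", "/category/*/*"),
    ("disallow", "*/trackback"),
    ("disallow", "*/comments"),
    ("disallow", "/wp-content/dnld"),
    ("disallow", "/wp-content/dnld/"),
    ("allow", "/wp-content/uploads"),
    ("disallow", "/*?*"),
    ("disallow", "/*?"),
    ("disallow", "/*.php$"),
    ("disallow", "/*.inc$"),
    ("disallow", "/*.gz$"),
    ("disallow", "/*.wmv$"),
    ("disallow", "/*.cgi$"),
    ("disallow", "/*.xhtml$"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex matched from the start."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces, then rejoin them with ".*" where "*" appeared.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(url_path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; a path with no matching rule is allowed."""
    best = None  # (pattern length, is_allow): longer wins, Allow wins a tie
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(url_path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


blocked_query = is_allowed("/?s=coupons")                     # matched by /*?*
blocked_deep_category = is_allowed("/category/deals/grocery/")  # /category/*/*
open_upload = is_allowed("/wp-content/uploads/photo.jpg")     # Allow rule
```

Under this precedence, a path such as `/wp-content/uploads/photo.jpg?w=300` would still be allowed, because the Allow pattern is longer than either query-string pattern.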

googlebot-image

Rule Path
Disallow
Allow /*

mediapartners-google*

Rule Path
Disallow
Allow /*

Other Records

Field Value
sitemap http://www.survivingthestores.com/sitemap.xml
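
As a rough illustration of how a client might read the crawl-delay and sitemap records listed in this report, the sketch below uses Python's standard `urllib.robotparser`. The `ROBOTS_EXCERPT` text is an assumed reconstruction from the fields above, not the site's literal file, and `ExampleBot` is a hypothetical user agent. The sketch reads only these two records and does not evaluate the path rules.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt rebuilt from the report's fields; the real file's layout
# and ordering are not shown in the scan output.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /wp-admin
Crawl-delay: 20

Sitemap: http://www.survivingthestores.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# "ExampleBot" has no group of its own, so it falls back to the "*" group.
delay = parser.crawl_delay("ExampleBot")
# site_maps() is available in Python 3.8 and later.
sitemaps = parser.site_maps()
```

A crawler honoring these records would wait `delay` seconds between requests and could use the entries in `sitemaps` to discover URLs.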

Comments

  • disallow all files with ? in url
  • disallow all files ending with these extensions
  • allow google image bot to search all images
  • allow Google adsense bot on entire site