stpgoods.com
robots.txt

Robots Exclusion Standard data for stpgoods.com

Resource Scan

Scan Details

Site Domain stpgoods.com
Base Domain stpgoods.com
Scan Status Ok
Last Scan 2024-09-20T13:55:34+00:00
Next Scan 2024-10-20T13:55:34+00:00

Last Scan

Scanned 2024-09-20T13:55:34+00:00
URL https://stpgoods.com/robots.txt
Domain IPs 104.21.4.28, 172.67.131.149, 2606:4700:3033::6815:41c, 2606:4700:3034::ac43:8395
Response IP 104.21.4.28
Found Yes
Hash 83880de226b18b7c65cb82b8be9ee89f3fe79e627e8b331f0f9ee3406724900a
SimHash af44bb324663

Groups

*

Rule Path
Disallow /blog/author/
Disallow /blog/tag/
Disallow /blog/category/
Disallow /home/
Disallow /wishlist/
Disallow /review*
Disallow /blog/tag*
Disallow /blog/author*
Disallow /*openstat
Disallow *review/product*
Disallow *?p=1$
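
Several of the rules above rely on the `*` wildcard and the `$` end anchor. The following is a minimal sketch of how such patterns could be evaluated against a URL path, assuming the common convention that `*` matches any run of characters and a trailing `$` pins the match to the end of the URL. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and real crawlers may differ in details (for example, in how they weigh Allow rules, which this group does not use).

```python
import re
from typing import Iterable

# Disallow rules copied from the "*" group above.
RULES = [
    "/blog/author/", "/blog/tag/", "/blog/category/", "/home/", "/wishlist/",
    "/review*", "/blog/tag*", "/blog/author*", "/*openstat",
    "*review/product*", "*?p=1$",
]

def rule_to_regex(path_pattern: str) -> "re.Pattern[str]":
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is literal.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_disallowed(url_path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if any rule matches from the start of the path."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)

if __name__ == "__main__":
    for path in ("/blog/tag/sale", "/toys?p=1", "/toys?p=10", "/blog/post-1"):
        print(path, is_disallowed(path, RULES))
```

Under these assumptions, `*?p=1$` blocks only URLs that end in `?p=1`, so a path such as `/toys?p=10` is not affected by that rule.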

blexbot

Rule Path
Disallow /

baiduspider
baiduspider-video
baiduspider-image

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

swebot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

twengabot-discover

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

bender

Rule Path
Disallow /

discobot

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

searchwebengine.net

Rule Path
Disallow /

mlbot

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

speedy

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

sindicebot

Rule Path
Disallow /

plukkie

Rule Path
Disallow /

findfiles.net

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

goodzer

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

lemurwebcrawler

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

discobot

Rule Path
Disallow /

fast enterprise crawler 6

Rule Path
Disallow /

sensis.com.au web crawler

Rule Path
Disallow /

worio bot heritrix

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /
Disallow /*?cover*
Disallow /*?subtitles*
Disallow /*?p=3*
Disallow /CVS
Disallow /*.svn$
Disallow /*.idea$
Disallow /*.sql$
Disallow /*.tgz$
Disallow /admin/
Disallow /app/
Disallow /downloader/
Disallow /errors/
Disallow /includes/
Disallow /lib/
Disallow /pkginfo/
Disallow /shell/
Disallow /var/
Disallow /api.php
Disallow /cron.php
Disallow /cron.sh
Disallow /error_log
Disallow /get.php
Disallow /install.php
Disallow /LICENSE.html
Disallow /LICENSE.txt
Disallow /LICENSE_AFL.txt
Disallow /README.txt
Disallow /RELEASE_NOTES.txt
Disallow /customer/
Disallow /*?dir*
Disallow /*?mode*
Disallow /*?order*
Disallow /*?limit*
Disallow /*?brand_games*
Disallow /*?book_language*
Disallow /*?recommended_age*
Disallow /*?made_in*
Disallow /*?material*
Disallow /*?t_shert_size*
Disallow /*?subcat*
Disallow /index.php/
Disallow /*?SID=
Disallow /checkout/
Disallow /onestepcheckout/
Disallow /customer/
Disallow /customer/account/
Disallow /customer/account/login/
Disallow /*catalogsearch/
Disallow /catalog/product_compare/
Disallow /catalog/category/view/
Disallow /catalog/product/view/
Disallow /cgi-bin/
Disallow /cleanup.php
Disallow /apc.php
Disallow /memcache.php
Disallow /phpinfo.php
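
The groups above are keyed by user agent, with `*` acting as the fallback. As a rough sketch of how a crawler might pick its group, the snippet below does a case-insensitive substring match of the group name against the crawler's user-agent string and falls back to `*` when nothing more specific matches. This is a simplification rather than a full implementation of the Robots Exclusion Protocol; `select_group` and the `GROUPS` dictionary (a small, abbreviated subset of the groups listed here) are illustrative assumptions.

```python
from typing import Dict, List

# Abbreviated, illustrative subset of the groups listed above.
GROUPS: Dict[str, List[str]] = {
    "*": ["/home/", "/wishlist/"],
    "ahrefsbot": ["/"],
    "baiduspider": ["/"],
    "baiduspider-image": ["/"],
}

def select_group(user_agent: str, groups: Dict[str, List[str]]) -> List[str]:
    """Return the Disallow rules of the group that applies to user_agent.

    Assumption: a group applies when its name occurs (case-insensitively)
    in the user-agent string; the longest such name wins, and '*' is used
    only when no named group matches.
    """
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name in ua]
    if matches:
        return groups[max(matches, key=len)]
    return groups.get("*", [])

if __name__ == "__main__":
    print(select_group("Mozilla/5.0 (compatible; AhrefsBot/7.0)", GROUPS))
    print(select_group("Googlebot/2.1", GROUPS))
```

With this selection logic, a crawler named in its own group (such as ahrefsbot) sees only `Disallow /`, while any crawler without a named group is governed by the `*` rules.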

Other Records

Field Value
sitemap https://www.stpgoods.com/sitemap.xml

Comments

  • added by SD on 12/30/04 bot causing a lot of CF errors to occur
  • Do not crawl development files and folders: CVS, svn directories and dump files
  • GENERAL MAGENTO SETTINGS
  • Do not crawl Magento admin page
  • Do not crawl common Magento technical folders
  • Do not crawl common Magento files
  • MAGENTO SEO IMPROVEMENTS
  • Do not crawl sub category pages that are sorted or filtered.
  • Do not crawl the second copy of the home page (example.com/index.php/). Uncomment it only if you activated Magento SEO URLs.
  • Do not crawl links with session IDs
  • Do not crawl checkout and user account pages
  • Do not crawl search pages and non-SEO-optimized catalog links
  • SERVER SETTINGS
  • Do not crawl common server technical folders and files
  • IMAGE CRAWLERS SETTINGS
  • Extra: Uncomment if you do not wish Google and Bing to index your images
  • User-agent: Googlebot-Image
  • Disallow: /
  • User-agent: msnbot-media
  • Disallow: /