Robots Exclusion Standard data for epet.hk

Resource Scan

Scan Details

Site Domain epet.hk
Base Domain epet.hk
Scan Status Ok
Last Scan 2024-09-25T09:49:41+00:00
Next Scan 2024-10-25T09:49:41+00:00

Last Scan

Scanned 2024-09-25T09:49:41+00:00
URL https://epet.hk/robots.txt
Redirect https://www.epet.hk/en/robots.txt
Redirect Domain www.epet.hk
Redirect Base epet.hk
Domain IPs 104.26.14.225, 104.26.15.225, 172.67.71.106, 2606:4700:20::681a:ee1, 2606:4700:20::681a:fe1, 2606:4700:20::ac43:476a
Redirect IPs 104.26.14.225, 104.26.15.225, 172.67.71.106, 2606:4700:20::681a:ee1, 2606:4700:20::681a:fe1, 2606:4700:20::ac43:476a
Response IP 104.26.14.225
Found Yes
Hash 8409c2838fa7fc12c6c0f045b05ce30a4d62a7373bd1594e5cc3f976b1ad30ef
SimHash 69347733c663

Groups

*

Rule Path
Disallow /*?dir*
Disallow /*?dir=desc
Disallow /*?dir=asc
Disallow /*?limit=all
Disallow /*?mode*
Disallow /*?___from_store=*
Disallow /*?SID=
Disallow /checkout/
Disallow /onestepcheckout/
Disallow /customer/
Disallow /customer/account/
Disallow /customer/account/login/
Disallow */wishlist/
Disallow */catalogsearch/
Disallow /catalog/product_compare/
Disallow /catalog/category/view/
Disallow /catalog/product/view/
Disallow */l/
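
Several of the paths above use the `*` wildcard. As a rough sketch of how a crawler might evaluate them, the snippet below translates each pattern into a regular expression, assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names (`rule_to_regex`, `is_disallowed`) and the sample paths are hypothetical, and the sketch ignores `Allow` rules and longest-match precedence because this group contains only `Disallow` lines.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str, rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

# A subset of the Disallow rules listed in the table above.
RULES = ["/*?dir*", "/checkout/", "*/wishlist/", "*/l/"]

# Hypothetical request paths, for illustration only.
for path in ["/en/dogs?dir=asc", "/checkout/cart/", "/en/wishlist/", "/en/dogs/food.html"]:
    print(path, is_disallowed(path, RULES))
```

Matching is done with `re.match`, so a pattern only applies from the beginning of the path; a rule such as `*/wishlist/` reaches deeper path segments because of its leading wildcard.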

Other Records

Field Value
crawl-delay 30

Other Records

Field Value
sitemap https://www.epet.hk/sitemap.xml
sitemap https://www.epet.hk/sitemap_ch.xml
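
The crawl-delay and sitemap records can be read programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `ROBOTS_TXT` string is a hypothetical excerpt reconstructed from the records in this report, not the file's actual layout, and `ExampleBot` is a made-up user agent. Note that this standard-library parser matches paths by prefix and does not interpret `*` wildcards, so it is used here only for the delay and sitemap fields.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt assembled from the scan records; the real file's
# ordering and remaining rules are not reproduced here.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 30
Disallow: /checkout/

Sitemap: https://www.epet.hk/sitemap.xml
Sitemap: https://www.epet.hk/sitemap_ch.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" has no group of its own, so the '*' group applies.
delay = parser.crawl_delay("ExampleBot")   # seconds between requests, or None
sitemaps = parser.site_maps()              # list of sitemap URLs, or None (Python 3.8+)
allowed = parser.can_fetch("ExampleBot", "https://www.epet.hk/checkout/")

print(delay, sitemaps, allowed)
```

A polite crawler would sleep for `delay` seconds between successive requests to the host when the value is not `None`.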

Comments

  • GENERAL SETTINGS
  • Enable robots.txt rules for all crawlers
  • Crawl-delay parameter: number of seconds to wait between successive requests to the same server.
  • Crawl-delay: 10
  • Set a custom crawl rate if you're experiencing traffic problems with your server.
  • MAGENTO SEO IMPROVEMENTS
  • Do not crawl sub category pages that are sorted or filtered.
  • Disallow: /*?___store=*
  • Do not crawl 2-nd home page copy (example.com/index.php/). Uncomment it only if you activated Magento SEO URLs.
  • Disallow: /index.php/
  • Do not crawl links with session IDs
  • Do not crawl checkout and user account pages
  • Do not crawl search pages and not-SEO optimized catalog links
  • IMAGE CRAWLERS SETTINGS
  • Extra: Uncomment if you do not wish Google and Bing to index your images
  • User-agent: Googlebot
  • Disallow:
  • User-agent: Googlebot-Image
  • Disallow:
  • User-agent: msnbot-media
  • Disallow: /