hcpro.fi
robots.txt

Robots Exclusion Standard data for hcpro.fi

Resource Scan

Scan Details

Site Domain hcpro.fi
Base Domain hcpro.fi
Scan Status Ok
Last Scan 2025-11-04T23:13:32+00:00
Next Scan 2025-12-04T23:13:32+00:00

Last Scan

Scanned 2025-11-04T23:13:32+00:00
URL https://hcpro.fi/robots.txt
Redirect https://www.hcpro.fi/robots.txt
Redirect Domain www.hcpro.fi
Redirect Base hcpro.fi
Domain IPs 104.21.3.43, 172.67.130.57, 2606:4700:3031::6815:32b, 2606:4700:3034::ac43:8239
Redirect IPs 104.21.3.43, 172.67.130.57, 2606:4700:3031::6815:32b, 2606:4700:3034::ac43:8239
Response IP 172.67.130.57
Found Yes
Hash 8eb87c31499fe27b9dac9c8e85ebea958c42186d5c0216ffe7eb8399fd1c388d
SimHash 51080c3146fb
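
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. The sketch below shows, under that assumption, how such a fingerprint could be used to detect a changed robots.txt between scans. The function name `content_fingerprint` and the sample bytes are hypothetical and are not the real file contents.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw response body.

    Assumption: SHA-256. The report only shows a 64-character hex string.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the actual hcpro.fi robots.txt.
sample = b"User-agent: *\nDisallow: /checkout/\n"

previous = content_fingerprint(sample)
current = content_fingerprint(sample + b"Disallow: /customer/\n")

# A differing digest signals that the file changed since the last scan.
print(len(previous), previous != current)
```

An exact digest changes on any byte difference, including whitespace, so it only answers whether the file changed, not by how much.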

Groups

ahrefsbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

backdoorbot/1.0

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

black hole

Rule Path
Disallow /

blowfish/1.0

Rule Path
Disallow /

botalot

Rule Path
Disallow /

builtbottough

Rule Path
Disallow /

bullseye/1.0

Rule Path
Disallow /

bunnyslippers

Rule Path
Disallow /

cherrypicker

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

crescent

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

erocrawler

Rule Path
Disallow /

exabot

Rule Path
Disallow /

facebot

Rule Path
Disallow /

foobot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

infonavirobot

Rule Path
Disallow /

kenjin spider

Rule Path
Disallow /

linguee

Rule Path
Disallow /

lwp-trivial

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

mozilla/4

Rule Path
Disallow /

mozilla/5

Rule Path
Disallow /

netants

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

wget

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

yandexbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

gptbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
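
The three groups above (yandexbot, bingbot, gptbot) carry only a crawl-delay record and no path rules. A crawler that matches one of these named groups uses that group instead of the `*` group, so the default Disallow list does not apply to it. A minimal sketch of how a client might read this with Python's standard `urllib.robotparser` follows. The fragment is a hypothetical reconstruction from the groups listed in this report, not the file's exact text, and `ExampleBot` is an invented user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fragment rebuilt from this report; the real file's
# exact wording, casing, and ordering are not shown in the scan.
ROBOTS_FRAGMENT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: bingbot
Crawl-delay: 20

User-agent: *
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# Delay (in seconds) requested for bingbot.
bing_delay = parser.crawl_delay("bingbot")

# bingbot has its own group with no rules, so the "*" rules are not used.
bing_checkout = parser.can_fetch("bingbot", "https://www.hcpro.fi/checkout/")

# AhrefsBot is blocked everywhere; an unlisted bot falls back to "*".
ahrefs_root = parser.can_fetch("AhrefsBot", "https://www.hcpro.fi/")
other_checkout = parser.can_fetch("ExampleBot", "https://www.hcpro.fi/checkout/")

print(bing_delay, bing_checkout, ahrefs_root, other_checkout)
```

Note that `urllib.robotparser` does not interpret `*` or `$` inside paths, so it is only suitable here for the plain prefix rules.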

*

Rule Path Comment
Disallow /404/ -
Disallow /app/ -
Disallow /cgi-bin/ -
Disallow /checkout/ -
Disallow /customer/ -
Disallow /errors/ -
Disallow /includes/ -
Disallow /lib/ -
Disallow /media/captcha/ -
Disallow /media/customer/ -
Disallow /media/downloadable/ -
Disallow /media/import/ -
Disallow /media/pdf/ -
Disallow /media/sales/ -
Disallow /media/tmp/ -
Disallow /media/xmlconnect/ -
Disallow /pkginfo/ -
Disallow /report/ -
Disallow /shell/ -
Disallow /stats/ -
Disallow /var/ -
Disallow /customer/section/ -
Disallow /page_cache/block/ -
Disallow /amasty_shopby/ -
Disallow /*?p= -
Disallow /*%26p%3D -
Disallow /*utm_ -
Disallow /*gclid%3D -
Disallow /*_%3D cache-busting timestamps
Disallow /*___ internal debug/variant
Disallow /*?SID= -
Disallow /*%26SID%3D -
Disallow /*?price= -
Disallow /*%26price%3D -
Disallow /*?manufacturer= -
Disallow /*%26manufacturer%3D -
Disallow /*?product_list_order= -
Disallow /*%26product_list_order%3D -
Disallow /*?product_list_mode= -
Disallow /*%26product_list_mode%3D -
Disallow /*?product_list_limit= -
Disallow /*%26product_list_limit%3D -
Disallow /*?dir= -
Disallow /*%26dir%3D -
Disallow /*?cat= -
Disallow /*%26cat%3D -
Disallow /*.cvs$ -
Disallow /*.zip$ -
Disallow /*.svn$ -
Disallow /*.idea$ -
Disallow /*.sql$ -
Disallow /*.tgz$ -
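
Many of the paths above rely on two special characters: `*` matches any run of characters, and a trailing `$` anchors the pattern to the end of the URL. The sketch below illustrates that matching for a few of the listed patterns. It is a simplified illustration, not a full parser: it ignores Allow rules, longest-match precedence, and percent-encoding normalization, and the helper names and test paths are hypothetical.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    All other characters are matched literally from the path start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, disallow_patterns):
    """Return True if any Disallow pattern matches the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# A subset of the Disallow paths from the "*" group in this report.
PATTERNS = ["/checkout/", "/*?p=", "/*utm_", "/*.zip$"]

# Hypothetical request paths for illustration.
for path in ["/checkout/cart", "/shoes?p=2", "/shoes?color=red",
             "/backup.zip", "/backup.zip?x=1"]:
    print(path, is_disallowed(path, PATTERNS))
```

The last two paths show the effect of `$`: `/*.zip$` only matches when the URL ends in `.zip`, so the same file requested with a query string is not covered by that rule.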

Other Records

Field Value
sitemap https://www.hcpro.fi/pub/sitemap_fi.xml

Comments

  • =========================
  • robots.txt — tuned
  • =========================
  • --- Bad / unnecessary bots ---
  • Common spoofed/abused generic UAs
  • Bulk downloaders
  • SEO tooling (block if not needed)
  • Region-specific bots (block if not targeting them)
  • Rate-limit big engines (optional)
  • AI / research crawlers (adjust to taste)
  • If you want to constrain Google Images to product/category media only (optional):
  • User-agent: Googlebot-Image
  • Allow: /media/catalog/product/
  • Allow: /media/catalog/category/
  • Disallow: /
  • --- Defaults for everyone else ---
  • Core system/sensitive areas (Magento-friendly)
  • Heavy AJAX / internal fragments
  • Pagination / infinite scroll
  • Duplicate/tracking params
  • Facets/filters that explode crawl space (enable if your site uses these)
  • Sort/view controls (duplicate content)
  • Source-control / backups / archives

Warnings

  • 2 invalid lines.