samuelgroves.com
robots.txt

Robots Exclusion Standard data for samuelgroves.com

Resource Scan

Scan Details

Site Domain samuelgroves.com
Base Domain samuelgroves.com
Scan Status Ok
Last Scan 2024-05-12T00:59:10+00:00
Next Scan 2024-06-11T00:59:10+00:00

Last Scan

Scanned 2024-05-12T00:59:10+00:00
URL https://samuelgroves.com/robots.txt
Redirect https://www.samuelgroves.com/robots.txt
Redirect Domain www.samuelgroves.com
Redirect Base samuelgroves.com
Domain IPs 185.120.74.32
Redirect IPs 104.26.8.140, 104.26.9.140, 172.67.73.172
Response IP 104.26.9.140
Found Yes
Hash a426e48666a048915350af672d05c152d50eb1460c596207b69d5e68ad48f71d
SimHash 487eca00c892
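
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes SHA-256 to illustrate how a scanner might detect that a robots.txt body changed between scans. The function names `body_digest` and `has_changed` are hypothetical and do not come from the report.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return a hex digest of a robots.txt body (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest against the digest of a freshly fetched body."""
    return body_digest(body) != previous_digest


# Illustrative bodies only; these are not the contents of the scanned file.
old_body = b"User-agent: *\nDisallow: /admin/\n"
new_body = b"User-agent: *\nDisallow: /admin/\nDisallow: /search.php\n"

stored = body_digest(old_body)
unchanged = has_changed(stored, old_body)
changed = has_changed(stored, new_body)
```

An exact digest like this flags any byte-level change. The SimHash field presumably serves the complementary purpose of spotting near-duplicate files, which this sketch does not attempt.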

Groups

mj12bot

Rule Path
Disallow /

twengabot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

*

Rule Path
Disallow /myaccount/
Disallow /admin/
Disallow /forgottenpassword.php
Disallow /quicklookup.php
Disallow /search.php
Disallow /addtobasket.php
Disallow /basket.php
Disallow /checkout*.php
Disallow /file.php
Disallow /quicklookup.php
Disallow /removefrombasket.php
Disallow /usersettings.php
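
The groups above can be checked with Python's standard-library `urllib.robotparser`. The sketch below is a hypothetical reconstruction of the rules, not the scanned file itself: the user-agent capitalisation is assumed, the duplicate `/quicklookup.php` entry is listed once, and the comments, invalid lines, and `crawl-delay` record are left out because the report does not show where they sit in the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the groups listed in the report.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /

User-agent: TwengaBot
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: ShopWiki
Disallow: /

User-agent: *
Disallow: /myaccount/
Disallow: /admin/
Disallow: /forgottenpassword.php
Disallow: /quicklookup.php
Disallow: /search.php
Disallow: /addtobasket.php
Disallow: /basket.php
Disallow: /checkout*.php
Disallow: /file.php
Disallow: /removefrombasket.php
Disallow: /usersettings.php
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://www.samuelgroves.com"

# Named bots are blocked from the whole site.
blocked_bot = parser.can_fetch("MJ12bot", BASE + "/")

# Any other crawler falls through to the "*" group.
home_allowed = parser.can_fetch("ExampleBot", BASE + "/")
admin_allowed = parser.can_fetch("ExampleBot", BASE + "/admin/login")
basket_allowed = parser.can_fetch("ExampleBot", BASE + "/basket.php")
```

`ExampleBot` is a made-up user agent used only to exercise the `*` group. Note that `urllib.robotparser` matches rule paths by prefix and does not expand `*` inside a path, so the `/checkout*.php` rule is not exercised here; a crawler that honours wildcard patterns would treat it differently.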

Other Records

Field Value
crawl-delay 10

Comments

  • Block MJ12 Bot - too aggressive
  • Block 80Legs - too aggressive
  • Block Twenga Bot - too aggressive
  • Block Baiduspider Bot - too aggressive
  • Block Ahrefs Bot - too aggressive

Warnings

  • 4 invalid lines.