erikberg.com
robots.txt

Robots Exclusion Standard data for erikberg.com

Resource Scan

Scan Details

Site Domain erikberg.com
Base Domain erikberg.com
Scan Status Ok
Last Scan 2024-10-06T05:33:40+00:00
Next Scan 2024-11-05T05:33:40+00:00

Last Scan

Scanned 2024-10-06T05:33:40+00:00
URL https://erikberg.com/robots.txt
Domain IPs 54.164.145.196
Response IP 54.164.145.196
Found Yes
Hash 46f4514c95216174dccc638755b160d6fd68024ab1982e0c17ebea2196cefe45
SimHash 121ce121dfe1

Groups

msiecrawler

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

sbider

Rule Path
Disallow /

msnbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 20

omniexplorer_bot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 20

applebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

surveybot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /mlb
Disallow /nba

archive_crawler

Rule Path
Disallow /mlb
Disallow /nba

msrbot

Rule Path
Disallow /nba
Disallow /mlb

turnitinbot

Rule Path
Disallow /nba
Disallow /mlb

ccbot

Rule Path
Disallow /mlb
Disallow /nba

yandex

Rule Path
Disallow /nba
Disallow /mlb

baiduspider

Rule Path
Disallow /nba
Disallow /mlb

sistrix

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

special_archiver

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

linguee

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

aboundexbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

xovibot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

seositecheckup

Rule Path
Disallow /

zmeu

Rule Path
Disallow /

zeus

Rule Path
Disallow /
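
As a rough illustration of how a crawler might interpret the groups above, here is a minimal sketch using Python's standard urllib.robotparser. The excerpt is reconstructed from three of the listed groups and is an assumption, not the verbatim file; the path /mlb/standings and the agent name somebot are hypothetical and used only for the check.

```python
from urllib.robotparser import RobotFileParser

# Excerpt rebuilt from the groups listed above. This is an assumption:
# the real file's casing, ordering, and comments may differ.
ROBOTS_EXCERPT = """\
User-agent: ahrefsbot
Disallow: /

User-agent: msnbot
Disallow:
Crawl-delay: 20

User-agent: ccbot
Disallow: /mlb
Disallow: /nba
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)
BASE = "https://erikberg.com"

# "Disallow /" blocks the whole site for that agent.
print(parser.can_fetch("ahrefsbot", BASE + "/"))

# An empty Disallow blocks nothing; only the crawl delay applies.
print(parser.can_fetch("msnbot", BASE + "/nba"))
print(parser.crawl_delay("msnbot"))

# Path rules match by prefix (/mlb/standings is a hypothetical path).
print(parser.can_fetch("ccbot", BASE + "/mlb/standings"))
print(parser.can_fetch("ccbot", BASE + "/"))

# No group is listed for "*", so this parser lets unlisted agents through.
print(parser.can_fetch("somebot", BASE + "/nba"))
```

Other parsers may resolve edge cases differently (for example, agent-name matching or the non-standard crawl-delay field), so treat this as one interpretation rather than the definitive behavior of any particular crawler.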

Comments

  • robots.txt for erikberg.com
  • crawl way too fast
  • make the data available if you want to archive it
  • special_archiver/3.1.1 +http://www.archive.org/details/archive.org_bot -- makes no mention of special_archiver. 20140705
  • too aggressive
  • too many 404 requests, bad implementation
  • seononsense