jacobsen.no
robots.txt

Robots Exclusion Standard data for jacobsen.no

Resource Scan

Scan Details

Site Domain jacobsen.no
Base Domain jacobsen.no
Scan Status Ok
Last Scan 2025-11-04T00:22:53+00:00
Next Scan 2025-12-04T00:22:53+00:00

Last Scan

Scanned 2025-11-04T00:22:53+00:00
URL https://jacobsen.no/robots.txt
Domain IPs 96.127.186.146
Response IP 96.127.186.146
Found Yes
Hash f16a3e2e61af597442191344be4932f13099469aa1861c897840719d8904c1dc
SimHash 02b2b1c0cdd0

Groups

msrbot
haste
infonavirobot
marcopolo
nutch
zao
semanticdiscovery
pubcrawl
turnitinbot
npbot
psbot
baiduspider
larbin
ia_archiver
nationaldirectory
lnspiderguy
teleport
miixpc
asterias
lwp-trivial
linkwalker
cosmos
msiecrawler
sitecheck.internetseer.com
pompos
generic
websearchbench
almaden
k2spider
curl
wget
ubicrawler

Rule Path
Disallow /

roverbot

Rule Path
Disallow /

*

Rule Path
Disallow /src/
Disallow /anders/src/
Disallow /weather/src/
Disallow /stats/
Disallow /hosts/
Disallow /priv/
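
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. It assumes a robots.txt reconstructed from the rules listed here, with only two of the fully blocked agents (wget and curl) included for brevity; the agent name "examplebot" and the file names in the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan above; an abbreviated assumption,
# not the verbatim file served by jacobsen.no.
ROBOTS_TXT = """\
User-agent: wget
User-agent: curl
Disallow: /

User-agent: roverbot
Disallow: /

User-agent: *
Disallow: /src/
Disallow: /anders/src/
Disallow: /weather/src/
Disallow: /stats/
Disallow: /hosts/
Disallow: /priv/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Agents named in a "Disallow /" group are blocked from the whole site.
wget_root = parser.can_fetch("wget", "https://jacobsen.no/")
rover_page = parser.can_fetch("roverbot", "https://jacobsen.no/index.html")

# Any other agent ("examplebot" is a made-up name) falls through to the
# "*" group and is blocked only under the listed path prefixes.
other_root = parser.can_fetch("examplebot", "https://jacobsen.no/")
other_priv = parser.can_fetch("examplebot", "https://jacobsen.no/priv/notes.txt")
other_weather = parser.can_fetch("examplebot", "https://jacobsen.no/weather/")

print(wget_root, rover_page, other_root, other_priv, other_weather)
```

Rules are matched as path prefixes, so /weather/ itself stays fetchable while /weather/src/ does not.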

Comments

  • This file set up by andersja according to proposed Standard of Robot
  • Exclusion at http://web.nexor.co.uk/mak/doc/robots/norobots.html
  • created 1997-03-01 16:00
  • updated 2003-12-22 10:50
  • updated 2006-04-11 added hosts
  • updated 2006-04-19 added priv
  • Currently: allow all well-behaved robots.
  • (An empty 'Disallow' line, looking like this:)
  • User-agent: * # Means: All robots.
  • Disallow: # Means: Disallow nothing.
  • http://www.webmasterworld.com/robots.txt has a long list of active
  • robots you might want to block.
  • Some of these (and many others) ignore robots.txt, and are forcibly
  • blocked in .htaccess.
  • (see also
  • http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell )
  • Rover is a bad dog
  • Cf. http://www.roverbot.com/user/baddog.html
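
The empty 'Disallow' form described in the comments can be illustrated with a short sketch, again using Python's standard urllib.robotparser. The two-line file below is copied from the comment text; the agent name "examplebot" and the URL path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The allow-everything form quoted in the comments, including its
# trailing "#" remarks, which the parser strips before reading a line.
ALLOW_ALL = """\
User-agent: * # Means: All robots.
Disallow: # Means: Disallow nothing.
"""

allow_parser = RobotFileParser()
allow_parser.parse(ALLOW_ALL.splitlines())

# An empty Disallow value blocks nothing, so every path is fetchable.
empty_disallow_result = allow_parser.can_fetch(
    "examplebot", "https://jacobsen.no/priv/notes.txt"
)
print(empty_disallow_result)
```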