jacobsen.no
robots.txt

Robots Exclusion Standard data for jacobsen.no


Resource Scan

Scan Details

Site Domain	jacobsen.no
Base Domain	jacobsen.no
Scan Status	Ok
Last Scan	2025-11-04T00:22:53+00:00
Next Scan	2025-12-04T00:22:53+00:00

Last Scan

Scanned	2025-11-04T00:22:53+00:00
URL	https://jacobsen.no/robots.txt
Domain IPs	96.127.186.146
Response IP	96.127.186.146
Found	Yes
Hash	f16a3e2e61af597442191344be4932f13099469aa1861c897840719d8904c1dc
SimHash	02b2b1c0cdd0

Groups

msrbot
haste
infonavirobot
marcopolo
nutch
zao
semanticdiscovery
pubcrawl
turnitinbot
npbot
psbot
baiduspider
larbin
ia_archiver
nationaldirectory
lnspiderguy
teleport
miixpc
asterias
lwp-trivial
linkwalker
cosmos
msiecrawler
sitecheck.internetseer.com
pompos
generic
websearchbench
almaden
k2spider
curl
wget
ubicrawler

Rule	Path
Disallow	/


roverbot

Rule	Path
Disallow	/


*

Rule	Path
Disallow	/src/
Disallow	/anders/src/
Disallow	/weather/src/
Disallow	/stats/
Disallow	/hosts/
Disallow	/priv/
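
The three groups above can be checked with Python's standard `urllib.robotparser`. The sketch below is a reconstruction from the listed rules, not the verbatim file: `BLOCKED_AGENTS` holds only a subset of the first group's user agents, and the names `build_robots_txt`, `make_parser`, and the agent `examplebot` are hypothetical, chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Subset of the user agents from the first group (assumption: abbreviated
# for brevity; the scan lists many more).
BLOCKED_AGENTS = ["wget", "curl", "nutch", "baiduspider", "ia_archiver"]

# Paths disallowed for every other robot (the "*" group).
STAR_DISALLOWS = [
    "/src/", "/anders/src/", "/weather/src/",
    "/stats/", "/hosts/", "/priv/",
]


def build_robots_txt():
    """Rebuild a robots.txt body from the groups in the scan report."""
    lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS]
    lines += ["Disallow: /", ""]
    lines += ["User-agent: roverbot", "Disallow: /", ""]
    lines += ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in STAR_DISALLOWS]
    return "\n".join(lines) + "\n"


def make_parser():
    """Parse the reconstructed rules without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    checks = [
        ("wget", "https://jacobsen.no/"),
        ("roverbot", "https://jacobsen.no/index.html"),
        ("examplebot", "https://jacobsen.no/stats/"),
        ("examplebot", "https://jacobsen.no/index.html"),
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
```

Agents named in the first two groups are refused everywhere, while any other agent falls through to the `*` group and is refused only under the six listed directories.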



Comments

This file set up by andersja according to proposed Standard of Robot
Exclusion at http://web.nexor.co.uk/mak/doc/robots/norobots.html
created 1997-03-01 16:00
updated 2003-12-22 10:50
updated 2006-04-11 added hosts
updated 2006-04-19 added priv
Currently: allow all well-behaved robots.
(An empty 'Disallow' line, looking like this:)
User-agent: * # Means: All robots.
Disallow: # Means: Disallow nothing.
http://www.webmasterworld.com/robots.txt has a long list of active
robots you might want to block.
Some of these (and many others) ignore robots.txt, and are forcibly
blocked in .htaccess.
(see also
http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_unwanted_robots_to_go_to_hell )
Rover is a bad dog
Cf. http://www.roverbot.com/user/baddog.html
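
The comment about an empty 'Disallow' line can be illustrated with a small sketch using Python's standard `urllib.robotparser`. The helper name `can_fetch_with` and the agent `examplebot` are hypothetical, and the second rule set (`Disallow: /`) is added only for contrast; it is not taken from the comments.

```python
from urllib.robotparser import RobotFileParser

# The two lines quoted in the comments, including their trailing remarks.
ALLOW_ALL = """\
User-agent: * # Means: All robots.
Disallow: # Means: Disallow nothing.
"""

# Contrast case (assumption, not from the file comments): block everything.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_fetch_with(robots_txt, path, agent="examplebot"):
    """Return whether `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(can_fetch_with(ALLOW_ALL, "/priv/"))  # empty Disallow: allowed
    print(can_fetch_with(BLOCK_ALL, "/priv/"))  # Disallow: / : refused
```

The parser strips everything after `#`, so the empty `Disallow:` value matches no path and every URL stays fetchable.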
