katolsk.no
robots.txt

Robots Exclusion Standard data for katolsk.no

Resource Scan

Scan Details

Site Domain	katolsk.no
Base Domain	katolsk.no
Scan Status	Ok
Last Scan	2024-06-02T16:52:38+00:00
Next Scan	2024-07-02T16:52:38+00:00

Last Scan

Scanned	2024-06-02T16:52:38+00:00
URL	https://katolsk.no/robots.txt
Domain IPs	2a02:270:2019::15, 77.88.108.15
Response IP	77.88.108.15
Found	Yes
Hash	c8ecda1276a988ace9d3fe7c4738a19a2c5e3c302ac2a44bfd2b46f5baf2ae72
SimHash	2e30ef524ca0

Groups

googlestackdrivermonitoring-uptimechecks

Rule	Path
Disallow	/

simplepie

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

buck

Rule	Path
Disallow	/

serpstatbot

Rule	Path
Disallow	/

seekport

Rule	Path
Disallow	/

tweetmemebot

Rule	Path
Disallow	/

yoozbot

Rule	Path
Disallow	/

jamesbot

Rule	Path
Disallow	/

yacybot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

slurp

Rule	Path
Disallow	/

screaming frog seo spider

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

ucrawler

Rule	Path
Disallow	/

itslearning

Rule	Path
Disallow	/

scanmine

Rule	Path
Disallow	/

pywikibot

Rule	Path
Disallow	/

wlc pywikibot

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

cliqzbot

Rule	Path
Disallow	/

linkdexbot

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

gocrawl

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

ltx71

Rule	Path
Disallow	/

garlikcrawler

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

barkrowler

Rule	Path
Disallow	/

mediatoolkitbot

Rule	Path
Disallow	/

deusu

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

mojeekbot

Rule	Path
Disallow	/

bubing

Rule	Path
Disallow	/

spbot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

yeti

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

coccocbot

Rule	Path
Disallow	/

proximic

Rule	Path
Disallow	/

awariorssbot

Rule	Path
Disallow	/

yak

Rule	Path
Disallow	/

yandex

Rule	Path
Disallow	/

go-http-client

Rule	Path
Disallow	/

genieo

Rule	Path
Disallow

mail.ru_bot

Rule	Path
Disallow

mojeekbot

Rule	Path
Disallow

qwantify

Rule	Path
Disallow

changedetection

Rule	Path
Disallow

coccocbot-image

Rule	Path
Disallow

coccocbot-web

Rule	Path
Disallow

proximic

Rule	Path
Disallow

spbot

Rule	Path
Disallow

riddler

Rule	Path
Disallow

sogou

Rule	Path
Disallow

sogou web spider

Rule	Path
Disallow

istellabot

Rule	Path
Disallow

ia_archiver

Rule	Path
Disallow	/biografier/
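
As a rough illustration of how per-agent groups like the ones above are evaluated, the sketch below feeds a hand-written fragment to Python's standard `urllib.robotparser`. The fragment only mirrors two of the listed groups (semrushbot and ia_archiver) and is not the full file served by katolsk.no; the helper `allowed`, the agent name `examplebot`, and the paths `/nyheter` and `/biografier/example` are hypothetical and used only for demonstration.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment mirroring two groups from the scan above.
# Assumption: this is NOT the complete robots.txt of katolsk.no.
ROBOTS_FRAGMENT = """\
User-agent: semrushbot
Disallow: /

User-agent: ia_archiver
Disallow: /biografier/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the fragment above."""
    return parser.can_fetch(agent, "https://katolsk.no" + path)


print(allowed("semrushbot", "/"))                     # False: whole site disallowed
print(allowed("ia_archiver", "/biografier/example"))  # False: prefix /biografier/ matches
print(allowed("ia_archiver", "/nyheter"))             # True: outside the disallowed prefix
print(allowed("examplebot", "/biografier/example"))   # True: no group applies in this fragment
```

A group whose only rule is `Disallow: /` shuts the named agent out of the whole site, while the ia_archiver group blocks only paths that begin with `/biografier/`.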

*

Rule	Path
Disallow	/biografier/historisk/minnedager/list.html
Disallow	/biografier/historisk/minnedager/past.html
Disallow	/biografier/historisk/minnedager/day.html
Disallow	/biografier/historisk/minnedager/date.html
Disallow	/biografier/historisk/minnedager/week.html
Disallow	/biografier/historisk/minnedager/month.html

*

Rule	Path
Disallow	/services/
Disallow	/%2B%2B*
Disallow	/*sendto_form$
Disallow	/*folder_factories$
Disallow	/*ics_view$
Disallow	/*vcs_view$
Disallow	%40%40
Disallow	/%40%40search
Disallow	/search
Disallow	/*?*
Disallow	/*%3D*
Disallow	/biografier/historisk/minnedager
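
Several paths in this group use the `*` wildcard and the `$` end anchor. Python's standard `urllib.robotparser` does plain prefix matching, so the sketch below shows one possible way to evaluate such patterns by hand, assuming the common interpretation in which `*` matches any run of characters and a trailing `$` pins the end of the path. The function names `robots_pattern_to_regex` and `is_blocked` and the sample request paths are hypothetical, and percent-encoding normalization is deliberately ignored.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters,
    a trailing '$' anchors the end, anything else is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the request path."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)


# A subset of the Disallow paths listed in the group above.
patterns = ["/services/", "/*sendto_form$", "/*?*", "/search"]

print(is_blocked("/services/feed", patterns))             # True: prefix match
print(is_blocked("/nyheter/sendto_form", patterns))       # True: ends with sendto_form
print(is_blocked("/nyheter/sendto_form/extra", patterns)) # False: '$' requires the end
print(is_blocked("/nyheter?page=2", patterns))            # True: contains '?'
print(is_blocked("/nyheter", patterns))                   # False: no pattern applies
```

Under this reading, `/*?*` excludes any URL that carries a query string, and the `$`-terminated patterns exclude only URLs that end in the named view.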

Other Records

Field	Value
crawl-delay

Comments

Define access-restrictions for robots/spiders
http://www.robotstxt.org/wc/norobots.html
Exclude biographies from Web.Archive.org for GDPR compliance
Exclude stuff that uses much CPU
By default we allow robots to access all areas of our site
already accessible to anonymous users
Add Googlebot-specific syntax extension to exclude forms
that are repeated for each piece of content in the site
the wildcard is only supported by Googlebot
http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling

Warnings

2 invalid lines.
