koutouby.tn
robots.txt

Robots Exclusion Standard data for koutouby.tn

Resource Scan

Scan Details

Site Domain	koutouby.tn
Base Domain	koutouby.tn
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a server error.
Last Scan	2024-10-22T10:49:06+00:00
Next Scan	2024-12-21T10:49:06+00:00

Last Successful Scan

Scanned	2024-08-01T02:58:39+00:00
URL	https://koutouby.tn/robots.txt
Domain IPs	34.107.231.181
Response IP	34.107.231.181
Found	Yes
Hash	7628cbb7e793cf35e9f15479327b5c55d2f1cd4918bcc83887a79b80d5c52a39
SimHash	a21a715beef7

Groups

*

Rule	Path
Disallow	/Accounts/
Disallow	/Accounts/*
Disallow	/Subscription/
Disallow	/Subscription/*
Disallow	/Facebook/
Disallow	/Products/Read/
Disallow	/Products/Read/*
Disallow	/Catalog/Explore/Filters
Disallow	/Explorer/
Disallow	/Explorer/*
Disallow	/Livre/
Disallow	/Livre/*
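
The rules for the `*` group can be checked against candidate paths with a short script. The following is a minimal sketch, not the scanner's own method: it assumes the table rows map one-to-one onto `Disallow:` directives in the raw file (the report shows only the parsed table), and the bot name `ExampleBot` and the test paths are hypothetical. Python's standard-library parser matches rules by plain path prefix and does not expand `*` inside a path, so under this parser the trailing `/*` rows add nothing beyond their prefix rows.

```python
from urllib.robotparser import RobotFileParser

# Directives reconstructed from the "*" group table above.
# Assumption: the raw robots.txt lists them in this form and order.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /Accounts/",
    "Disallow: /Accounts/*",
    "Disallow: /Subscription/",
    "Disallow: /Subscription/*",
    "Disallow: /Facebook/",
    "Disallow: /Products/Read/",
    "Disallow: /Products/Read/*",
    "Disallow: /Catalog/Explore/Filters",
    "Disallow: /Explorer/",
    "Disallow: /Explorer/*",
    "Disallow: /Livre/",
    "Disallow: /Livre/*",
]


def build_parser(lines):
    """Parse robots.txt lines held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def is_allowed(parser, path, user_agent="ExampleBot"):
    """Return True if user_agent may fetch the given path on koutouby.tn.

    "ExampleBot" is a hypothetical crawler name that falls under the "*" group.
    """
    return parser.can_fetch(user_agent, "https://koutouby.tn" + path)


parser = build_parser(ROBOTS_LINES)

if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ["/", "/Accounts/Login", "/Livre/123", "/Catalog/"]:
        verdict = "allowed" if is_allowed(parser, path) else "disallowed"
        print(path, verdict)
```

Matching is case-sensitive, so a path such as `/accounts/` would not be covered by the `/Accounts/` rule in this sketch.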

ubicrawler

Rule	Path
Disallow	/

doc

Rule	Path
Disallow	/

zao

Rule	Path
Disallow	/

sitecheck.internetseer.com

Rule	Path
Disallow	/

zealbot

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

sitesnagger

Rule	Path
Disallow	/

webstripper

Rule	Path
Disallow	/

webcopier

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

offline explorer

Rule	Path
Disallow	/

teleport

Rule	Path
Disallow	/

teleportpro

Rule	Path
Disallow	/

webzip

Rule	Path
Disallow	/

linko

Rule	Path
Disallow	/

httrack

Rule	Path
Disallow	/

microsoft.url.control

Rule	Path
Disallow	/

xenu

Rule	Path
Disallow	/

larbin

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

zyborg

Rule	Path
Disallow	/

download ninja

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

shopwiki

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

npbot

Rule	Path
Disallow	/

webreaper

Rule	Path
Disallow	/

Comments

Sorry, wget in its recursive mode is a frequent problem.
Please read the man page and use it properly; there is a
--wait option you can use to set the delay between hits,
for instance.
The 'grub' distributed client has been *very* poorly behaved.
Doesn't follow robots.txt anyway, but...
Hits many times per second, not acceptable
http://www.nameprotect.com/botinfo.html
A capture bot, downloads gazillions of pages with no public benefit
http://www.webreaper.net/
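
The first comment above asks recursive downloaders to space out their requests, as wget's `--wait` option does. The same idea can be sketched in a few lines of Python. This is an illustration only: the function name `fetch_with_wait`, its parameters, and the 2-second default are assumptions, and the `fetch` callable is supplied by the caller so the sketch makes no network requests itself.

```python
import time


def fetch_with_wait(urls, fetch, wait_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing wait_seconds between requests.

    No pause is inserted before the first request. The sleep function is
    injectable so the pacing can be checked without real delays.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(wait_seconds)  # delay between consecutive hits
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    # Stand-in fetcher: returns the length of the URL string instead of
    # downloading anything.
    sizes = fetch_with_wait(
        ["https://example.org/a", "https://example.org/b"],
        fetch=len,
        wait_seconds=0.1,
    )
    print(sizes)
```

A fixed pause is the simplest pacing policy; a crawler that also wants to honor the rules listed in this report would check each path against them before calling `fetch`.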
