allthewebsites.org
robots.txt

Robots Exclusion Standard data for allthewebsites.org

Resource Scan

Scan Details

Site Domain allthewebsites.org
Base Domain allthewebsites.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-11T23:40:26+00:00
Next Scan 2024-12-10T23:40:26+00:00

Last Successful Scan

Scanned 2022-10-13T04:59:34+00:00
URL https://allthewebsites.org/robots.txt
Redirect https://www.allthewebsites.org/robots.txt
Redirect Domain www.allthewebsites.org
Redirect Base allthewebsites.org
Response IP 45.86.69.56
Found Yes
Hash ffa6a1cfdaa7d912c814fa00b47fbf9bd985f93de26c049ba3eb62431a329a2e
SimHash 6a16e102c1d3

Groups

*

Rule Path
Disallow /visit.php
Disallow /info.php
Disallow /jewelry/buy.php
Disallow /jewelry/go.php
Disallow /magazines/shopping.php
Disallow /magazines/subscriber.php
Disallow /webdirectory/report2.php
Disallow /shopping/shoppingvisit.php
Disallow /shopping/buy.php
Disallow /cgi-bin/

shopwiki

Rule Path
Disallow /

linksmanager

Rule Path
Disallow /

linksmanager_bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

mojeekbot

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

integromedb

Rule Path
Disallow /

cms

Rule Path
Disallow /

linkpadbot

Rule Path
Disallow /

comodospider

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

spbot

Rule Path
Disallow /

catexplorador

Rule Path
Disallow /

imrbot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

solomonobot

Rule Path
Disallow /

netcraftsurveyagent

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

plukkie

Rule Path
Disallow /

dcpbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

ncbot

Rule Path
Disallow /

komodiabot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

catchbot

Rule Path
Disallow /

linguee

Rule Path
Disallow /

flightdeckreportsbot

Rule Path
Disallow /

screenerbot

Rule Path
Disallow /

bingbot

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

aboundexbot

Rule Path
Disallow /

snipebot

Rule Path
Disallow /

ip-web-crawler.com

Rule Path
Disallow /

pagesinventory

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

msnbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

daumoa

Rule Path
Disallow /

exabot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

squirrobot

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

psbot

Rule Path
Disallow /

eventgurubot

Rule Path
Disallow /

discobot

Rule Path
Disallow /

gslfbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

huaweisymantecspider

Rule Path
Disallow /

baidu spider

Rule Path
Disallow /

baidu

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider+

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

solfo-linkchecker

Rule Path
Disallow /

soso

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

yetibot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

genieo

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

slurp

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

procogbot

Rule Path
Disallow /

feedfetcher-google

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

cliqusbot

Rule Path
Disallow /

compspybot

Rule Path
Disallow /

docomo

Rule Path
Disallow /

presto

Rule Path
Disallow /

jooblebot

Rule Path
Disallow /

toscrawler

Rule Path
Disallow /

ltbot

Rule Path
Disallow /

domaincrawler

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

awcheck

Rule Path
Disallow /

acoonbot

Rule Path
Disallow /

xenu link sleuth

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /
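The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. It assumes a small excerpt reconstructed from the groups listed in this report, not the original file, whose exact text and ordering are not shown here. The agent name `examplebot`, the path `/index.html`, and the helper names `build_parser` and `is_allowed` are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the groups listed in this report. The original
# file's exact text and ordering are not available here, so this is an
# assumption made for illustration only.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /visit.php
Disallow: /info.php
Disallow: /cgi-bin/

User-agent: bingbot
Disallow: /

User-agent: ahrefsbot
Disallow: /

Sitemap: http://www.allthewebsites.org/sitemap.xml.gz
"""

# Redirect target recorded in the last successful scan.
BASE = "https://www.allthewebsites.org"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch BASE + path under the parsed rules."""
    return parser.can_fetch(user_agent, BASE + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_EXCERPT)
    checks = [
        ("bingbot", "/"),             # named group, Disallow /
        ("examplebot", "/visit.php"), # hypothetical agent, falls back to *
        ("examplebot", "/index.html"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
    # site_maps() requires Python 3.8 or later.
    print(parser.site_maps())
```

An agent with its own group (here `bingbot`) is governed by that group alone, while an agent not named anywhere falls back to the `*` group and is blocked only from the listed paths. Note that `urllib.robotparser` matches agent names by case-insensitive substring, so results against the full list may differ from other parsers for short names such as `cms`.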

Other Records

Field Value
sitemap http://www.allthewebsites.org/sitemap.xml.gz

Comments

  • NO access (All Spiders)

Warnings

  • 8 invalid lines.