big-lies.org
robots.txt

Robots Exclusion Standard data for big-lies.org

Resource Scan

Scan Details

Site Domain big-lies.org
Base Domain big-lies.org
Scan Status Ok
Last Scan 2024-09-22T02:34:57+00:00
Next Scan 2024-10-22T02:34:57+00:00

Last Scan

Scanned 2024-09-22T02:34:57+00:00
URL https://big-lies.org/robots.txt
Domain IPs 173.254.35.54
Response IP 173.254.35.54
Found Yes
Hash 5250a7bccd94b99f39c423f1a7b8aa005cfd469aac2c1d34fe6862d44ee1d9b2
SimHash 22319b5165f0
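
The 64-hex-digit Hash above is consistent with SHA-256, though the report does not name the algorithm. A minimal sketch of how a scanner might fingerprint a fetched robots.txt, using a placeholder body rather than the real file:

```python
# Sketch: fingerprinting a fetched robots.txt body (assumed SHA-256).
# The body below is a stand-in, not the actual file served by big-lies.org.
import hashlib

body = b"User-agent: *\nDisallow:\n"  # placeholder response body
fingerprint = hashlib.sha256(body).hexdigest()
print(fingerprint)  # 64 hex characters; any change to the file changes the hash
```

Comparing the stored hash against a fresh fetch is enough to detect that the file changed between scans; the SimHash additionally allows estimating how much it changed.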

Groups

applebot
applenewsbot
baiduspider
baiduspider-image
bingbot
bingpreview
ccbot
cliqzbot
coccoc
coccocbot-image
coccocbot-web
daumoa
dazoobot
deusu
duckduckbot
duckduckgo-favicons-bot
euripbot
exabot
exploratodo
facebot
feedly
findxbot
gooblog
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
haosouspider
ichiro
istellabot
jikespider
lycos
mail.ru
mediapartners-google
mojeekbot
msnbot
msnbot-media
orangebot
pinterest
plukkie
qwantify
rambler
sosospider
blexbot
slurp
sogou blog
sogou inst spider
sogou news spider
sogou orion spider
sogou spider2
sogou web spider
sputnikbot
teoma
twitterbot
wotbox
yacybot
yandex
yandexmobilebot
yeti
yioopbot
yoozbot
youdaobot

Rule Path
Disallow

*

Rule Path
Disallow SemrushBot

*

Rule Path
Disallow Dotbot
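
The effect of whitelist-style rules like these can be checked with Python's standard-library robots.txt parser. The file content below is a hypothetical two-group reconstruction (an empty Disallow for a listed bot, a blanket Disallow for SemrushBot), not the exact file served by big-lies.org:

```python
# Sketch: evaluating whitelist-style robots.txt rules with urllib.robotparser.
# ROBOTS is a hypothetical excerpt, not the real big-lies.org file.
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: googlebot
Disallow:

User-agent: SemrushBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

# An empty Disallow allows everything; "Disallow: /" blocks everything.
print(parser.can_fetch("googlebot", "https://big-lies.org/"))   # True
print(parser.can_fetch("SemrushBot", "https://big-lies.org/"))  # False
```

An empty `Disallow:` line permits the named agent to fetch any path, which is what makes the per-bot whitelist pattern work.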

Comments

  • ROBOTS.TXT
  • Alphabetically ordered whitelist of legitimate web robots that obey the
  • Robots Exclusion Standard (robots.txt). Each bot is briefly described in a
  • comment above the (list of) user-agent(s). Uncomment or delete bots you do
  • not wish to allow on your website / that do not need to visit your website.
  • Important: Blank lines are not allowed in the final robots.txt file!
  • Updates can be retrieved from: https://github.com/jonasjacek/robots.txt
  • This document is licensed with a CC BY-NC-SA 4.0 license.
  • Test 10 Oct 2018 - RW
  • so.com chinese search engine
  • google.com landing page quality check
  • User-agent: AdsBot-Google
  • stopped 27 jul 2020
  • google.com app resource fetcher
  • User-agent: AdsBot-Google-Mobile-Apps
  • bing ads bot
  • User-agent: adidxbot
  • apple.com search engine
  • baidu.com chinese search engine
  • bing.com international search engine
  • commoncrawl.org open repository of web crawl data
  • cliqz.com german in-product search engine
  • coccoc.com vietnamese search engine
  • daum.net korean search engine
  • dazoo.fr french search engine
  • deusu.de german search engine
  • duckduckgo.com international privacy search engine
  • eurip.com european search engine
  • exabot french added 11 oct 2018
  • exploratodo.com Latin American search engine
  • facebook.com social network
  • feedly.com feed fetcher
  • findx.com european search engine
  • goo.ne.jp japanese search engine
  • google.com international search engine
  • so.com chinese search engine
  • goo.ne.jp japanese search engine
  • istella.it italian search engine
  • jike.com / chinaso.com chinese search engine
  • lycos.com & hotbot.com international search engine
  • mail.ru russian search engine
  • google.com adsense bot
  • mojeek.com search engine
  • bing.com international search engine
  • orange.com international search engine
  • pinterest.com social network
  • botje.nl dutch search engine
  • qwant.com french search engine
  • rambler.ru russian search engine
  • seznam.cz czech search engine
  • User-agent: SeznamBot
  • soso.com chinese search engine
  • webmeup
  • yahoo.com international search engine
  • sogou.com chinese search engine
  • sputnik.ru russian search engine
  • ask.com international search engine
  • twitter.com bot
  • wotbox.com international search engine
  • yacy.net p2p search software
  • yandex.com russian search engine
  • search.naver.com south korean search engine
  • yioop.com international search engine
  • yooz.ir iranian search engine
  • youdao.com chinese search engine
  • crawling rule(s)
  • allow all other bots - put disallow / to disallow all others
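
The layout the comments describe — a descriptive comment above each (list of) user-agent(s), no blank lines inside the group, and a single trailing Disallow — can be sketched as follows. This is a hypothetical excerpt of the template pattern, not the exact file served by big-lies.org:

```
# apple.com search engine
User-agent: applebot
User-agent: applenewsbot
# baidu.com chinese search engine
User-agent: baiduspider
User-agent: baiduspider-image
# crawling rule(s)
Disallow:
# allow all other bots - put disallow / to disallow all others
```

Because consecutive User-agent lines with no blank line between them form a single group, the one empty `Disallow:` at the end applies to every listed bot at once — which is why the comments warn that blank lines are not allowed in the final file.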

Warnings

  • 3 invalid lines.