anged.org
robots.txt

Robots Exclusion Standard data for anged.org

Resource Scan

Scan Details

Site Domain anged.org
Base Domain anged.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't establish SSL connection.
Last Scan 2025-08-21T07:37:53+00:00
Next Scan 2025-09-20T07:37:53+00:00
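
The Last Scan and Next Scan timestamps above are exactly 30 days apart. The short check below only confirms that arithmetic; the variable names are illustrative, and reading the gap as a fixed 30-day rescan schedule is an assumption, not something the report states.

```python
from datetime import datetime

# Timestamps copied from the Scan Details fields above.
last_scan = datetime.fromisoformat("2025-08-21T07:37:53+00:00")
next_scan = datetime.fromisoformat("2025-09-20T07:37:53+00:00")

# Difference between the two scans as a timedelta.
interval = next_scan - last_scan
print(interval.days)
```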

Last Successful Scan

Scanned 2025-06-30T03:28:44+00:00
URL https://anged.org/robots.txt
Domain IPs 162.0.209.188
Response IP 162.0.209.188
Found Yes
Hash 8a260bca124a467560c0e2e4411c67df52ca3bb5389aaca0033a1deec23df274
SimHash 5314d1cb6331

Groups

*

Rule Path
Disallow /*pmin
Disallow /*pmax
Disallow /*cart
Disallow /*srule
Disallow /*account
Disallow /*checkout
Disallow /*view%3Dgrid
Disallow /*view%3Dlist
Disallow /*quickview
Disallow /*search
Disallow /*wishlist
Disallow /*shipping-tracking
Disallow /*contentsearch
Disallow /*search-ajax
Disallow /*payment-details
Disallow /*add-payment
Disallow /*order
Disallow /*newsletter
Disallow /*login
Disallow /*logout
Disallow /*404
Disallow /*giftregistry
Disallow /*error
Disallow /*offline
Disallow /*help
Disallow /*compare
Disallow /*cart
Disallow /*revieworder
Disallow /*billing
Disallow /*addressbook
Disallow /*add-address
Disallow /*register
Disallow /*profile
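
Every rule in this group uses the `*` wildcard, which stands for any run of characters, so a rule such as `/*cart` applies to any path containing `cart` after the leading slash. The following is a minimal sketch of that matching, assuming the usual wildcard semantics (`*` matches anything, a trailing `$` anchors the end). The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch compares paths literally without normalising percent-encoding, which a real crawler would do.

```python
import re

def rule_to_regex(path_pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    # A trailing '$' anchors the match at the end of the path.
    anchored = path_pattern.endswith("$")
    core = path_pattern[:-1] if anchored else path_pattern
    # '*' matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(url_path, disallow_patterns):
    """Return True if any pattern matches from the start of url_path."""
    return any(rule_to_regex(p).match(url_path) for p in disallow_patterns)

# A few rules copied from the '*' group above.
SAMPLE_RULES = ["/*cart", "/*checkout", "/*view%3Dgrid"]

print(is_disallowed("/shop/cart", SAMPLE_RULES))
print(is_disallowed("/products/shoes", SAMPLE_RULES))
print(is_disallowed("/cartoons", SAMPLE_RULES))
```

Because nothing anchors the end of these patterns, a path such as `/cartoons` is also caught by `/*cart`. This sketch checks Disallow rules only and does not weigh them against Allow rules.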

googlebot-image

Rule Path
Allow /
Allow /

yandexbot

Rule Path
Allow /
Allow /

ahrefsbot

Rule Path
Allow /

seokicks-robot

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

ezooms robot

Rule Path
Disallow /

perl lwp

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

turnitin robot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

turnitin bot

Rule Path
Disallow /

turnitinbot/3.0 (http://www.turnitin.com/robot/crawlerinfo.html)

Rule Path
Disallow /

turnitinbot/3.0

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

eccp/1.0 (search@eniro.com)

Rule Path
Disallow /

baiduspider
baiduspider-video
baiduspider-image
mozilla/5.0 (compatible; baiduspider/2.0; +http://www.baidu.com/search/spider.html)
mozilla/5.0 (compatible; baiduspider/3.0; +http://www.baidu.com/search/spider.html)
mozilla/5.0 (compatible; baiduspider/4.0; +http://www.baidu.com/search/spider.html)
mozilla/5.0 (compatible; baiduspider/5.0; +http://www.baidu.com/search/spider.html)
baiduspider/2.0
baiduspider/3.0
baiduspider/4.0
baiduspider/5.0

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

gsa-crawler (enterprise; t4-knhh62cdkc2w3; gsa_manage@nikon-sys.co.jp)

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

mail.ru_bot/2.0

Rule Path
Disallow /

mail.ru

Rule Path
Disallow /

mail.ru_bot/2.0; +http://go.mail.ru/help/robots

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mj12bot/v1.4.3

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

nutch

Rule Path
Disallow /

spock

Rule Path
Disallow /

omniexplorer_bot

Rule Path
Disallow /

becomebot

Rule Path
Disallow /

geniebot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mlbot

Rule Path
Disallow /

linguee bot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

sbider/nutch

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

magent

Rule Path
Disallow /

speedy spider

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

huasai

Rule Path
Disallow /

datacha0s

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

atomic_email_hunter

Rule Path
Disallow /

mp3bot

Rule Path
Disallow /

winhttp

Rule Path
Disallow /

betabot

Rule Path
Disallow /

core-project

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

java

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

Comments

  • Google Image Crawler Setup - having crawler-specific sections makes it ignore generic e.g *
  • User-agent: Googlebot
  • Yandex tends to be rather aggressive, may be worth keeping them at arm's length
  • User-agent: Pinterest
  • Crawlers Setup
  • User-agent: *
  • Block Ahrefs
  • Block SEOkicks
  • Block SISTRIX
  • Block Uptime robot
  • Block Ezooms Robot
  • Block Perl LWP
  • Block BlexBot
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • Block WiseGuys Robot
  • Block Turnitin Robot
  • Block Heritrix
  • Block pricepi
  • Block Searchmetrics Bot
  • User-agent: SearchmetricsBot
  • Disallow: /
  • Block Eniro
  • Block Baidu
  • Block SoGou
  • Block Youdao
  • Block Nikon JP Crawler
  • Block MegaIndex.ru
  • unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy
  • entire sites or download them for offline viewing. Please obey robots.txt.
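
The first comment notes that a crawler with its own section ignores the generic `*` group. A minimal sketch of that selection step follows, assuming a crawler picks the group whose name equals its token (case-insensitively) and otherwise falls back to `*`. The `GROUPS` mapping is a small illustrative subset of the groups listed above, and `select_group` is a hypothetical helper; real crawlers may apply looser matching.

```python
# Illustrative subset of the groups in this report (rules abbreviated).
GROUPS = {
    "*": ["Disallow /*cart", "Disallow /*checkout"],
    "googlebot-image": ["Allow /"],
    "blexbot": ["Disallow /"],
}

def select_group(crawler_token, groups):
    """Return the name of the group a crawler would follow, or None."""
    token = crawler_token.lower()
    # A crawler-specific group takes precedence over the generic one.
    if token in groups:
        return token
    # Otherwise fall back to the catch-all group, if one exists.
    return "*" if "*" in groups else None

print(select_group("Googlebot-Image", GROUPS))
print(select_group("SomeOtherBot", GROUPS))
```

Under this model, the `googlebot-image` group's `Allow /` is the only rule set that crawler applies, so none of the `*` group's Disallow rules reach it.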

Warnings

  • 4 invalid lines.