1r1g.com
robots.txt

Robots Exclusion Standard data for 1r1g.com

Resource Scan

Scan Details

Site Domain 1r1g.com
Base Domain 1r1g.com
Scan Status Ok
Last Scan 2024-06-26T12:12:25+00:00
Next Scan 2024-07-03T12:12:25+00:00

Last Scan

Scanned 2024-06-26T12:12:25+00:00
URL https://1r1g.com/robots.txt
Domain IPs 43.138.133.127
Response IP 43.138.133.127
Found Yes
Hash a659b3c542035f774c6caeb8edfcb1ee4ef0b2b08878ab37b9591e713cc7803a
SimHash 3a981cd1b0e2

Groups

proximic
bizinformation
blexbot
riddler
ltx71
magpie-crawler
grapeshot
grapeshotcrawler
gigablastopensource
bubing
linkdexbot
linkdexbot/2.2
seokicks
seokicks-robot
panscient.com
webdatastats
zoominfobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

dotbot
dotbot
dotbot/1.1
mj12bot
ahrefsbot
seokicks-robot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

dotbot/1.1

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

megaindex.ru
megaindex.ru/2.0

Rule Path
Disallow /

spbot

Rule Path
Disallow /

linguee

Rule Path
Disallow /

deusu

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

pinterest

Rule Path
Disallow /

yandex
yandexbot
yandexmobilebot
yandeximageresizer
coccocbot
coccocbot-web
coccocbot-image
yeti
seekbot
seekport
seekport crawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

*

Rule Path
Disallow /wp-admin/

*

Rule Path
Disallow /upload/

*

Rule Path
Disallow /media/

mauibot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

treato-bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

linguee bot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

jobs.de-robot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

seodiver

Rule Path
Disallow /

spbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

obot

Rule Path
Disallow /

fr-crawler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

cloudservermarketspider

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

lipperhey-kaus-australis

Rule Path
Disallow /

seoscanners.net

Rule Path
Disallow /

metajobbot

Rule Path
Disallow /

spiderbot

Rule Path
Disallow /

linkstats

Rule Path
Disallow /

jobboersebot

Rule Path
Disallow /

iccrawler

Rule Path
Disallow /

plista

Rule Path
Disallow /

domain re-animator bot

Rule Path
Disallow /

lipperhey-kaus-australis

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

um-ic

Rule Path
Disallow /

mindupbot

Rule Path
Disallow /

sg-orbiter

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

kraken

Rule Path
Disallow /

plukkie

Rule Path
Disallow /

safednsbot

Rule Path
Disallow /

haosouspider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

openhosebot

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

thumbsniper

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

implisensebot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

adscanner

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

python/3.5 aiohttp

Rule Path
Disallow /

toweya.com

Rule Path
Disallow /

netestate

Rule Path
Disallow /

bubing

Rule Path
Disallow /

linguee

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

sentibot

Rule Path
Disallow /

sentibot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

domaincrawler

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

indeedbot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

gosign-security-crawler

Rule Path
Disallow /

siteliner

Rule Path
Disallow /

sabsimbot

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

*

Rule Path
Disallow /wp-admin/
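
The groups listed above can be checked programmatically. What follows is a minimal sketch, assuming Python's standard `urllib.robotparser` module. The `SAMPLE` text is a hand-written reduction of the scanned file (one blocked crawler plus a single `*` group), and `examplebot` is a hypothetical user agent that does not appear in the scan. The `*` rules are merged into one group here because Python's parser keeps only the first `*` group it encounters; other parsers may combine repeated groups differently.

```python
from urllib.robotparser import RobotFileParser

# Hand-written reduction of the scanned file (an assumption for illustration):
# one fully blocked crawler and one merged "*" group.
SAMPLE = """\
User-agent: blexbot
Disallow: /

User-agent: *
Crawl-delay: 1
Disallow: /wp-admin/
Disallow: /upload/
Disallow: /media/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(SAMPLE)

# A crawler named in a "Disallow /" group is blocked everywhere.
blexbot_home = rules.can_fetch("blexbot", "https://1r1g.com/")

# "examplebot" is hypothetical; it falls through to the "*" group.
other_home = rules.can_fetch("examplebot", "https://1r1g.com/")
other_admin = rules.can_fetch("examplebot", "https://1r1g.com/wp-admin/")
other_delay = rules.crawl_delay("examplebot")

print(blexbot_home, other_home, other_admin, other_delay)
```

Parsing from a string keeps the sketch offline; to test the live file instead, `RobotFileParser.set_url()` followed by `read()` would fetch https://1r1g.com/robots.txt.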

Other Records

Field Value
sitemap https://qa.1r1g.com/sf/sitemap.xml
sitemap https://qa.1r1g.com/sf/sitemap-question.xml
sitemap https://qa.1r1g.com/sf/sitemap-questionpage.xml
sitemap https://qa.1r1g.com/sf/sitemap-questiontags.xml
sitemap https://qa.1r1g.com/sf/sitemap-answers.xml
sitemap https://qa.1r1g.com/sf/sitemap-users.xml
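
The sitemap records can be read back with the same standard-library parser. This is a sketch under assumptions: it uses `RobotFileParser.site_maps()`, which requires Python 3.8 or later, and `SITEMAP_LINES` is a shortened, hand-written excerpt rather than the full scanned file.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt (an assumption for illustration), using two of the
# sitemap URLs reported in the scan.
SITEMAP_LINES = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "",
    "Sitemap: https://qa.1r1g.com/sf/sitemap.xml",
    "Sitemap: https://qa.1r1g.com/sf/sitemap-question.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns None when no Sitemap records exist, so fall back to [].
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```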

Comments

  • _____
  • | . . |
  • |_____|
  • ____ ___|_|___ ____
  • ()___) __ ()___)
  • // /| ____ |\ \\
  • // / | ____ | \ \\
  • (___) |___________| (___)
  • (___) (_______) (___)
  • (___) (___) (___)
  • (___) |_| (___)
  • (___) ___/___\___ | |
  • | | | . . . | |.|
  • |.| |___________| /___\
  • /___\ ||| ||| // \\
  • // \\ ||| ||| \\ //
  • \\ // ||| ||| \\ //
  • \\ // ()__) (__()
  • /// \\\
  • /// \\\
  • _///___ ___\\\_
  • |_______| |_______| robots.txt
  • Bad bots ------> BLOCK
  • proximic Documentation https://www.comscore.com/proximic-spider
  • BLEXBot (very bad bot) Documentation http://webmeup-crawler.com/
  • Common Crawl ----> Block Documentation http://commoncrawl.org/faq/
  • SEO bots ------> BLOCK
  • MOZ.com (DotBot) Documentation https://moz.com/researchtools/ose/dotbot
  • MAJESTIC.com Documentation http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
  • Ahrefs.com Documentation https://ahrefs.com/robot
  • SEOkick.de Documentation http://www.seokicks.de/robot.html
  • MegaIndex.com Documentation https://megaindex.com/crawler
  • Semrush.com Documentation https://fr.semrush.com/bot/
  • OpenLinkProfiler Documentation http://openlinkprofiler.org/bot
  • MOZ.com (DotBot) Documentation https://moz.com/researchtools/ose/dotbot
  • Semrush.com Documentation https://fr.semrush.com/bot/
  • MegaIndex.ru (very bad bot) ---> Respects only the following directive "User-agent: *"
  • OpenLinkProfiler Documentation http://openlinkprofiler.org/bot
  • Crawl-delay: 10
  • Crawler
  • Linguee ----> Block https://www.linguee.com/bot
  • DeuSu ----> Block Documentation https://deusu.de/robot.html
  • TurnitinBot ----> Block Documentation https://turnitin.com/robot/crawlerinfo.html
  • Social Network
  • Pinterest ----> Temporary off Documentation https://help.pinterest.com/fr/articles/about-pinterest-crawler#Web
  • Crawl-delay: 20
  • Search engines
  • Yandex ----> Block Documentation https://yandex.com/support/webmaster/controlling-robot/robots-txt.xml
  • Coccobot ----> Block Documentation http://help.coccoc.com/en/search-engine/robots-txt
  • Yeti ----> Block Documentation http://naver.me/bot
  • Baidu ----> Block Documentation http://robots-txt.com/ressources/robots-txt-baidu/
  • 360Spider ----> Block Documentation http://www.botreports.com/user-agent/360spider.shtml
  • Sogou ----> Block
  • Seekbot ----> Block Seekport Crawler http://seekport.com/ - no documentation
  • User-agent: 360Spider
  • User-agent: Baiduspider
  • User-agent: Sogou web spider
  • User-agent: Sogou
  • Picsearch ----> Temporary off Documentation http://www.picsearch.com/bot.html image search service
  • Tiscali IstellaBot
  • User-agent: IstellaBot
  • Disallow: /
  • Cliqzbot Documentation https://cliqz.com/en/cliqzbot
  • User-agent: Cliqzbot
  • Disallow: /
  • MojeekBot Documentation https://www.mojeek.com/bot.html
  • User-agent: MojeekBot
  • Disallow: /
  • Applebot Documentation http://robots-txt.com/ressources/robots-txt-apple/
  • User-agent: Applebot
  • Disallow: /
  • Qwant Documentation http://robots-txt.com/ressources/robots-txt-qwant/
  • User-agent: Qwantify
  • Disallow: /
  • Exalead Documentation https://www.exalead.com/search/webmasterguide
  • User-agent: Exabot
  • Crawl-delay: 60
  • Yahoo Documentation http://robots-txt.com/ressources/robots-txt-yahoo/ https://help.yahoo.com/kb/SLN22600.html
  • User-agent: Slurp
  • Crawl-delay: 60
  • Orange Documentation http://blog.lemoteur.fr/orangebot/
  • User-agent: OrangeBot
  • User-agent: OrangeBot-Collector
  • Crawl-delay: 60
  • Bing Documentation http://robots-txt.com/ressources/robots-txt-bing/
  • User-agent: bingbot
  • User-agent: msnbot
  • Crawl-delay: 20
  • Disallow: /
  • Google Documentation https://support.google.com/webmasters/answer/1061943?hl=fr
  • User-agent: Googlebot
  • User-agent: Googlebot-Image
  • User-agent: Googlebot-Video
  • Disallow: /
  • Wayback Machine (archive.org) ---> ALLOW
  • Documentation http://robots-txt.com/ressources/robots-txt-alexa/
  • https://archive.org/details/archive.org_bot
  • User-agent: ia_archiver
  • User-agent: ia_archiver-web.archive.org
  • User-agent: archive.org_bot
  • Disallow: /
  • https://www.skimble.com/robots.txt
  • https://github.com/jfqd/robots.txt/blob/master/robots.txt
  • https://www.transindus.co.uk/robots.txt
  • http://www.columbec.com/robots.txt
  • http://www.mecatrouve.com/robots.txt
  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-Agent: *
  • Disallow: /
  • SITEMAP: http://www.skimble.com/sitemap_index.xml.gz
  • Slow down bots
  • User-Agent: Baiduspider
  • Disallow: /
  • Disallow: Sistrix
  • Disallow: Sistrix
  • Disallow: Sistrix
  • Disallow: SEOkicks-Robot
  • Disallow: jobs.de-Robot
  • Backlink Analysis
  • Bot der Leipziger Unister Holding GmbH
  • http://www.opensiteexplorer.org/dotbot
  • http://www.searchmetrics.com
  • http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
  • http://www.domaintools.com/webmasters/surveybot.php
  • http://www.seodiver.com/bot
  • http://openlinkprofiler.org/bot
  • http://www.wotbox.com/bot/
  • http://www.meanpath.com/meanpathbot.html
  • http://www.backlinktest.com/crawler.html
  • http://www.brandwatch.com/magpie-crawler/
  • http://filterdb.iss.net/crawler/
  • http://webmeup-crawler.com
  • https://megaindex.com/crawler
  • http://www.cloudservermarket.com
  • http://www.trendiction.de/de/publisher/bot
  • http://www.exalead.com
  • http://www.career-x.de/bot.html
  • https://www.lipperhey.com/en/about/
  • https://www.lipperhey.com/en/about/
  • https://turnitin.com/robot/crawlerinfo.html
  • http://help.coccoc.com/
  • ubermetrics-technologies.com
  • datenbutler.de
  • http://searchgears.de/uber-uns/crawling-faq.html
  • http://commoncrawl.org/faq/
  • https://www.qwant.com/
  • http://linkfluence.net/
  • http://www.botje.com/plukkie.htm
  • https://www.safedns.com/searchbot
  • http://www.haosou.com/help/help_3_2.html
  • http://www.haosou.com/help/help_3_2.html
  • http://www.moz.com/dp/rogerbot
  • http://www.openhose.org/bot.html
  • http://www.screamingfrog.co.uk/seo-spider/
  • http://thumbsniper.com
  • http://www.radian6.com/crawler
  • http://cliqz.com/company/cliqzbot
  • https://www.aihitdata.com/about
  • http://www.trendiction.com/en/publisher/bot
  • http://seocompany.store
  • https://github.com/yasserg/crawler4j/
  • http://warebay.com/bot.html
  • http://www.website-datenbank.de/
  • http://law.di.unimi.it/BUbiNG.html
  • http://www.linguee.com/bot; bot@linguee.com
  • www.sentibot.eu
  • http://velen.io
  • https://moz.com/help/guides/moz-procedures/what-is-rogerbot
  • http://www.garlik.com
  • https://www.gosign.de/typo3-extension/typo3-sicherheitsmonitor/
  • http://www.siteliner.com/bot
  • https://sabsim.com
  • http://ltx71.com/
  • Rule 1 Records must be separated by empty lines
  • Rule 2 Robots.txt sitemap list is very important! 202205 debug a month !

Warnings

  • 2 invalid lines.