Robots Exclusion Standard data for avenirdeseglisesdebruxelles.be

Resource Scan

Scan Details

Site Domain avenirdeseglisesdebruxelles.be
Base Domain avenirdeseglisesdebruxelles.be
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-15T02:34:24+00:00
Next Scan 2024-12-14T02:34:24+00:00

Last Successful Scan

Scanned 2022-04-27T11:55:23+00:00
URL https://avenirdeseglisesdebruxelles.be/robots.txt
Redirect https://www.avenirdeseglisesdebruxelles.be/robots.txt
Redirect Domain www.avenirdeseglisesdebruxelles.be
Redirect Base avenirdeseglisesdebruxelles.be
Response IP 172.67.169.151
Found Yes
Hash 1eec7667fb1b839a40aecb0729e1e357d45d7077ed39c6cc4c3109d324aa7e93
SimHash 5567214b65ba

Groups

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

adsbot-google

Rule Path
Disallow /js/

alphaseobot

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

sitesucker

Rule Path
Disallow /

openindexspider

Rule Path
Disallow /

booglebot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

hubspot crawler

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

mail.ru

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

yandex

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

*

Rule Path
Allow /
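
To see how groups like these are evaluated in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. The excerpt is hand-written from a few of the groups listed above and is not the site's actual file; attaching crawl-delay to the bingbot group is an assumption based on where the scan shows it, and the request paths (/js/app.js, /index.html) and the agent name examplebot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt modeled on a few groups from the scan above.
# It is NOT the site's real robots.txt; placing Crawl-delay under
# bingbot is an assumption taken from the scan layout.
ROBOTS_EXCERPT = """\
User-agent: semrushbot
Disallow: /

User-agent: adsbot-google
Disallow: /js/

User-agent: bingbot
Crawl-delay: 5

User-agent: *
Allow: /
"""

# Redirect target reported by the last successful scan.
BASE = "https://www.avenirdeseglisesdebruxelles.be"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp


parser = build_parser(ROBOTS_EXCERPT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the excerpt."""
    return parser.can_fetch(agent, BASE + path)


# Hypothetical agents and paths, for illustration only.
for agent, path in [
    ("semrushbot", "/"),
    ("adsbot-google", "/js/app.js"),
    ("adsbot-google", "/index.html"),
    ("bingbot", "/index.html"),
    ("examplebot", "/index.html"),
]:
    print(agent, path, allowed(agent, path), parser.crawl_delay(agent))
```

Note that urllib.robotparser matches agent names by case-insensitive substring, so its answers can differ from those of crawlers that implement their own matching.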

Comments

  • -----------------------------------------------------------
  • robots.txt, last refresh 2021/11/20
  • -----------------------------------------------------------
  • not all bots below may obey robots.txt in general
  • or specific rules, respectively
  • cat /home/wwwlogs/access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -nr | head -20
  • -----------------------------------------------------------
  • semrush bot
  • ahrefs bot
  • moz bot
  • Wayback Machine
  • https://velen.io/
  • Baiduspider
  • Block SoGou
  • Block Youdao
  • AdsBot
  • http://alphaseobot.com/bot.html
  • http://siteexplorer.info/about.html
  • http://www.sitesucker.us/mac/limitations.html
  • https://www.openindex.io/saas/about-our-spider/
  • http://www.backlinktest.com/crawler.html
  • http://napoveda.seznam.cz/
  • http://www.website-datenbank.de
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • Block BlexBot
  • https://megaindex.com/crawler
  • ------------
  • not exclude
  • ------------
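
The awk one-liner quoted in the comments above counts the most frequent user agents in an access log by splitting each line on double quotes and taking the sixth field. A rough Python equivalent is sketched below, assuming the log uses the common "combined" format in which the user agent is the last quoted field; the sample lines are made up for illustration, and the function name top_user_agents is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_user_agents(lines: Iterable[str], limit: int = 20) -> List[Tuple[str, int]]:
    """Count user agents, mirroring: awk -F'"' '{print $6}' | sort | uniq -c | sort -nr | head."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split('"')
        # awk's $6 is index 5 here; unlike awk, lines with too few
        # fields are skipped instead of being counted as empty strings.
        if len(fields) >= 6:
            counts[fields[5]] += 1
    return counts.most_common(limit)


# Made-up sample lines in combined log format (assumption).
SAMPLE_LOG = [
    '203.0.113.5 - - [20/Nov/2021:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "AhrefsBot/7.0"',
    '203.0.113.6 - - [20/Nov/2021:10:00:01 +0000] "GET /a HTTP/1.1" 200 128 "-" "SemrushBot/7"',
    '203.0.113.5 - - [20/Nov/2021:10:00:02 +0000] "GET /b HTTP/1.1" 404 0 "-" "AhrefsBot/7.0"',
]

for agent, hits in top_user_agents(SAMPLE_LOG):
    print(hits, agent)
```

To run it against a real log, pass an open file object instead of SAMPLE_LOG, for example the /home/wwwlogs/access.log path named in the comment.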