nosdevoirs.fr
robots.txt

Robots Exclusion Standard data for nosdevoirs.fr

Resource Scan

Scan Details

Site Domain nosdevoirs.fr
Base Domain nosdevoirs.fr
Scan Status Ok
Last Scan 2024-11-13T19:38:31+00:00
Next Scan 2024-11-20T19:38:31+00:00

Last Scan

Scanned 2024-11-13T19:38:31+00:00
URL https://nosdevoirs.fr/robots.txt
Domain IPs 104.19.182.63, 104.19.183.63
Response IP 104.19.183.63
Found Yes
Hash 638b31ef6b72128f1ef679aa8b53d816ca2e2d87668ea01cac340b054a7c0913
SimHash e0b0927048f3
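
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest. The report does not say which algorithm or which bytes were hashed, so the following is only a sketch under that assumption. `sha256_hex` and `sample_body` are hypothetical names, and the sample body is illustrative, not the real file.

```python
import hashlib

def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()

# Illustrative content only; this is NOT the actual robots.txt of the site.
sample_body = b"User-agent: *\nDisallow: /api\n"
digest = sha256_hex(sample_body)

# A SHA-256 hex digest always has 64 characters, like the Hash field above.
print(len(digest))
```

Comparing such a digest between two scans would show whether the file changed at all, but not how much.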

Groups

*

Rule Path
Allow /ads.txt
Disallow /advertisements/gift_clicks
Disallow /app/ask?*
Disallow /buddies/invite/
Disallow /buddies_new/invite/
Disallow /cdn-cgi/l/email-protection
Disallow /login?*
Disallow /question/add?*
Disallow /signup?*
Disallow /tasks/prev_task/
Disallow /tasks/next_task/
Disallow /tasks/latex/
Disallow /tasks/solve_dynamic/
Disallow /users/thank/
Disallow /users/view_awards
Disallow /api
Disallow /bff
Disallow /experts
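
Several paths in this group end in `?*`. Under RFC 9309, rules are matched as prefixes of the URL path and `*` stands for any run of characters, so a trailing `*` does not change what is matched, while `?` is a literal question mark. A minimal Python sketch of that matching follows. `pattern_to_regex` and `path_matches` are hypothetical helper names, not part of any library, and the sample URLs are made up.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters and a trailing '$' anchors the end.
    Every other character, including '?', is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def path_matches(pattern: str, path: str) -> bool:
    # re.match anchors only at the start, which gives prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None

print(path_matches("/login?*", "/login?next=/tasks"))  # True
print(path_matches("/login?*", "/login"))               # False
print(path_matches("/api", "/api/v1/users"))            # True
```

With these semantics, `/login?*` blocks only login URLs that carry a query string, while plain `/login` stays crawlable.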

yandex

Rule Path
Disallow /advertisements/gift_clicks
Disallow /app/ask?*
Disallow /buddies/invite/
Disallow /buddies_new/invite/
Disallow /cdn-cgi/l/email-protection
Disallow /login?*
Disallow /question/add?*
Disallow /signup?*
Disallow /tasks/prev_task/
Disallow /tasks/next_task/
Disallow /tasks/latex/
Disallow /tasks/solve_dynamic/
Disallow /users/thank/
Disallow /users/view_awards/

semrushbot-sa

Rule Path Comment
Disallow / Semrush
Allow /ads.txt -
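
This group, like many below, pairs `Disallow /` with `Allow /ads.txt`. Under RFC 9309 the most specific rule (the longest matching path) wins, and an Allow wins a tie, so `/ads.txt` stays fetchable while everything else is blocked. The sketch below illustrates that precedence with literal prefixes only (no wildcard handling). `is_allowed` is a hypothetical helper and `/question/123` is a made-up path.

```python
def is_allowed(rules, path):
    """Decide access using longest-match precedence.

    rules: list of (kind, prefix) tuples, kind being 'allow' or 'disallow'.
    The longest matching prefix wins; on equal length, 'allow' wins.
    A path matched by no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed

# The two rules of the semrushbot-sa group, in the order listed.
group_rules = [("disallow", "/"), ("allow", "/ads.txt")]

print(is_allowed(group_rules, "/ads.txt"))       # True
print(is_allowed(group_rules, "/question/123"))  # False
```

A parser that applies rules in file order (first match) would instead block `/ads.txt` here, because `Disallow /` is listed first.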

semrushbot

Rule Path Comment
Disallow / Semrush
Allow /ads.txt -

rogerbot

Rule Path Comment
Disallow / MOZ
Allow /ads.txt -

dotbot

Rule Path Comment
Disallow / MOZ
Allow /ads.txt -

blexbot

Rule Path Comment
Disallow / Webmeup.com
Allow /ads.txt -

spbot

Rule Path Comment
Disallow / Openlinkprofiler
Allow /ads.txt -

seodiver

Rule Path Comment
Disallow / SEOdiver
Allow /ads.txt -

dataprovider

Rule Path Comment
Disallow / DataProvider.com
Allow /ads.txt -

magpie-crawler

Rule Path Comment
Disallow / BrandWatch.com
Allow /ads.txt -

getintent crawler

Rule Path
Disallow /
Allow /ads.txt

grapeshot

Rule Path
Disallow

doubleverify

Rule Path
Disallow

white ops

Rule Path
Disallow

moatbot

Rule Path
Disallow

ias_crawler

Rule Path
Disallow

forensiq

Rule Path
Disallow

duckduckbot

Rule Path
Disallow

leikibot

Rule Path
Disallow

baidu-yunguance-scanbot(ce.baidu.com)

Rule Path
Disallow

baidu-yunguance-slabot(ce.baidu.com)

Rule Path
Disallow

baidu-yunguance-perfbot(ce.baidu.com)

Rule Path
Disallow

baidu-yunguance-vsbot(ce.baidu.com)

Rule Path
Disallow
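
The groups above list a Disallow rule with no path. In the robots.txt convention an empty Disallow value blocks nothing, so these agents may crawl the whole site. A small sketch of a rule parser that drops empty values is shown below. `parse_rules` is a hypothetical helper, and the input lines are reconstructed from the table above rather than copied from the actual file.

```python
def parse_rules(lines):
    """Collect (field, path) pairs for allow/disallow lines.

    Comments after '#' are stripped. Rules with an empty value are
    skipped, because an empty Disallow does not restrict anything.
    """
    rules = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules

# Reconstructed from the report; the real file's exact text may differ.
open_group = ["User-agent: grapeshot", "Disallow:"]
closed_group = ["User-agent: getintent crawler", "Disallow: /", "Allow: /ads.txt"]

print(parse_rules(open_group))    # []
print(parse_rules(closed_group))  # [('disallow', '/'), ('allow', '/ads.txt')]
```

An empty rule list means nothing is restricted for that agent, which is the opposite of `Disallow /`.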

seznambot

Rule Path
Disallow /
Allow /ads.txt

sogou web spider

Rule Path
Disallow /
Allow /ads.txt

baiduspider

Rule Path
Disallow /
Allow /ads.txt

naverbot

Rule Path
Disallow /
Allow /ads.txt

yeti

Rule Path
Disallow /
Allow /ads.txt

coccocbot-web

Rule Path
Disallow /
Allow /ads.txt

qwantify

Rule Path
Disallow /
Allow /ads.txt

exabot

Rule Path
Disallow /
Allow /ads.txt

linguee

Rule Path Comment
Disallow / Language tool
Allow /ads.txt -

surdotlybot

Rule Path Comment
Disallow / Sur.ly
Allow /ads.txt -

bubing

Rule Path Comment
Disallow / Bubing academic crawler
Allow /ads.txt -

twitterbot

Rule Path
Disallow

mediapartners-google

Rule Path
Disallow

facebot

Rule Path
Disallow

applebot-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

Comments

  • Brainly Robots.txt 31.07.2017
  • Disallow Marketing bots
  • Disallow exotic search engine crawlers
  • Disallow other crawlers
  • Good bots whitelisting:
  • Other bots
  • Neticle Crawler v1.0 ( http://bot.neticle.hu/ ) https://bot.neticle.hu/ - brand monitoring
  • Mega https://megaindex.com/crawler - link indexer tool (supports directives in user-agent:*)
  • Obot - IBM X-Force service
  • SafeDNSBot (https://www.safedns.com/searchbot)

Warnings

  • 3 invalid lines.
  • `clean-param` is not a known field.
  • `host` is not a known field.
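
`clean-param` and `host` are Yandex-specific extensions rather than fields defined by RFC 9309, which is a plausible reason for the warnings above. The sketch below shows how a checker might produce such messages. The set of known fields is an assumption (the scanner's actual list is not given), `check_line` is a hypothetical helper, and the sample lines are invented, since the report does not show the offending lines.

```python
# Assumed set of recognised fields; the scanner's real list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

def check_line(line):
    """Return a warning string for a robots.txt line, or None if it is fine."""
    stripped = line.split("#", 1)[0].strip()
    if not stripped:
        return None  # blank line or comment only
    if ":" not in stripped:
        return "invalid line"
    field = stripped.split(":", 1)[0].strip().lower()
    if field not in KNOWN_FIELDS:
        return f"`{field}` is not a known field."
    return None

# Invented sample lines for illustration only.
sample = [
    "User-agent: yandex",
    "Host: example.com",
    "Clean-param: ref /question/",
    "this is not a directive",
]
for line in sample:
    print(repr(line), "->", check_line(line))
```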