college-saint-colomban.fr
robots.txt

Robots Exclusion Standard data for college-saint-colomban.fr

Resource Scan

Scan Details

Site Domain college-saint-colomban.fr
Base Domain college-saint-colomban.fr
Scan Status Ok
Last Scan 2024-10-20T18:07:46+00:00
Next Scan 2024-11-19T18:07:46+00:00

Last Scan

Scanned 2024-10-20T18:07:46+00:00
URL https://college-saint-colomban.fr/robots.txt
Domain IPs 162.159.134.42
Response IP 162.159.134.42
Found Yes
Hash 88da8e9cab195e6709e55f0eea6250635291242b55ebecd6cca7503d7fd8fad8
SimHash d25a54c0cd32

Groups

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

discobot

Rule Path
Disallow /

blekkobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

ezooms robot

Rule Path
Disallow /

perl lwp

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

turnitin robot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

babya discoverer

Rule Path
Disallow /
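
The groups above can be checked programmatically. The following is a minimal sketch, assuming Python's standard urllib.robotparser and a robots.txt text rebuilt from three of the listed groups. The agent names use the lowercased forms shown in this report, the original file's casing and layout are not reproduced, and the helper names build_parser and is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Three of the groups listed above, rewritten as robots.txt text.
# Assumption: agent names are the lowercased forms from this report,
# not necessarily the exact spelling used in the original file.
ROBOTS_TXT = """\
User-agent: mj12bot
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: blexbot
Disallow: /
"""

SITE = "https://college-saint-colomban.fr/"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_allowed(agent: str, url: str = SITE) -> bool:
    """Return True if the parsed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("MJ12bot", "ahrefsbot/7.0", "ExampleBot"):
        print(agent, "allowed" if is_allowed(agent) else "blocked")
```

The report lists no wildcard (`*`) group, so under these rules a crawler that is not named, such as the hypothetical ExampleBot, is not restricted. Different parsers match agent names differently, so results for names containing a slash or a space may vary between implementations.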

Comments

  • Block MJ12bot as it is just noise
  • Block Ahrefs
  • Block Sogou
  • Block SEOkicks
  • SEOkicks
  • Discoveryengine.com
  • Blekkobot
  • Block BlexBot
  • Block SISTRIX
  • Block Uptime robot
  • Block Ezooms Robot
  • Block Perl LWP
  • Block netEstate NE Crawler
  • Block WiseGuys Robot
  • Block Turnitin Robot
  • Exabot
  • Babya Discoverer

Warnings

  • 1 invalid line.