emufrance.com
robots.txt

Robots Exclusion Standard data for emufrance.com

Resource Scan

Scan Details

Site Domain emufrance.com
Base Domain emufrance.com
Scan Status Ok
Last Scan 2024-11-16T18:32:37+00:00
Next Scan 2024-11-23T18:32:37+00:00

Last Scan

Scanned 2024-11-16T18:32:37+00:00
URL http://emufrance.com/robots.txt
Redirect http://www.emu-france.com/robots.txt
Redirect Domain www.emu-france.com
Redirect Base emu-france.com
Domain IPs 217.70.184.38
Redirect IPs 51.91.73.21
Response IP 51.91.73.21
Found Yes
Hash 262769516bc92e8f5316e1f5e5a85f6601b10b0812c329bf1299a944f8c8dc67
SimHash a4195dd9c5f9
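
The Hash field above is a content fingerprint: comparing it between scans shows whether robots.txt changed. A minimal sketch follows, assuming SHA-256; the 64-hex-digit length of the recorded value is consistent with that, but the scanner's actual algorithm is not stated here. The function name `fingerprint` and the sample bodies are hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


# Two hypothetical robots.txt bodies; any byte-level difference changes the digest.
previous = b"User-agent: *\nDisallow: /forum/\n"
current = b"User-agent: *\nDisallow: /forum/\nDisallow: /_temp/\n"

changed = fingerprint(previous) != fingerprint(current)
```

An exact hash only signals that something changed; it does not measure how much changed.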

Groups

*

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

Other Records

Field Value
crawl-delay 10
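
The effect of the rules in this group on a request path can be sketched as below. This is a simplified illustration, not the scanner's or any crawler's actual matcher: it handles plain path prefixes only (no `*` or `$` wildcards, no Allow rules) and treats an empty Disallow value as matching nothing. The names `STAR_GROUP` and `is_allowed` are hypothetical.

```python
# Disallow values of the "*" group as listed above; "" is the empty Disallow.
STAR_GROUP = ["", "/forum/", "/_temp/"]


def is_allowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True unless a non-empty Disallow prefix covers `path`."""
    # An empty rule restricts nothing, so it is skipped before prefix matching.
    return not any(rule and path.startswith(rule) for rule in disallow_rules)


forum_ok = is_allowed("/forum/index.php", STAR_GROUP)  # covered by /forum/
home_ok = is_allowed("/index.php", STAR_GROUP)         # no rule covers it
```

Because the group contains only Disallow rules, checking for any matching prefix gives the same answer as longest-match evaluation would. The crawl-delay record is separate from path matching; crawlers that honor it commonly read the value as seconds between requests.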

ia_archiver

Rule Path
Disallow

bingbot

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

Other Records

Field Value
crawl-delay 10

msnbot

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

mediapartners-google*

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

israbot

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

orthogaffe

Rule Path
Disallow
Disallow /forum/
Disallow /_temp/

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

fast

Rule Path
Disallow /

Other Records

Field Value
sitemap http://www.emu-france.com/auto_sitemap.xml
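
How a crawler picks one of the groups above can be sketched as follows: it looks for a group named after its own user-agent token, compared case-insensitively, and falls back to `*` when none matches. This is a minimal sketch under stated assumptions: only three of the groups are copied in, matching is exact (so a name with a trailing `*` such as `mediapartners-google*` is not handled), and the names `GROUPS`, `select_rules` and `may_fetch` are hypothetical.

```python
# A subset of the groups listed above, keyed by lowercase user-agent name.
GROUPS = {
    "*": ["", "/forum/", "/_temp/"],
    "ia_archiver": [""],   # empty Disallow: nothing is blocked
    "httrack": ["/"],      # whole site is blocked
}


def select_rules(product_token: str, groups: dict[str, list[str]] = GROUPS) -> list[str]:
    """Return the Disallow list for this crawler, falling back to "*"."""
    return groups.get(product_token.lower(), groups["*"])


def may_fetch(product_token: str, path: str) -> bool:
    """Prefix-only check against the selected group (no wildcards, no Allow)."""
    rules = select_rules(product_token)
    return not any(rule and path.startswith(rule) for rule in rules)


archive_forum = may_fetch("ia_archiver", "/forum/")   # own group, unrestricted
httrack_home = may_fetch("HTTrack", "/index.php")     # own group, blocked
other_forum = may_fetch("SomeBot", "/forum/topic")    # falls back to "*"
```

Once a specific group matches, the `*` group is ignored for that crawler, which is why ia_archiver may fetch /forum/ even though the default group disallows it.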

Comments

  • Sitemap: http://www.emu-france.com/sitemap.xml.gz
  • archive.org = ok (important for emu-france !!)
  • advertising-related bots:
  • Wikipedia work bots:
  • Crawlers that are kind enough to obey, but which we'd rather not have unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.
  • Misbehaving: requests much too fast:
  • User-agent: Amazonbot
  • Disallow: /
  • User-agent: facebookexternalhit
  • Disallow: /
  • User-agent: meta-externalagent
  • Disallow: /
  • User-agent: SemrushBot
  • Disallow: /