mein-jobmarkt.de
robots.txt

Robots Exclusion Standard data for mein-jobmarkt.de

Resource Scan

Scan Details

Site Domain mein-jobmarkt.de
Base Domain mein-jobmarkt.de
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-10-30T14:18:47+00:00
Next Scan 2024-11-29T14:18:47+00:00
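
The gap between the two timestamps above can be checked with a short calculation. This is a minimal sketch using only the values shown in the report. The variable names are illustrative, and the report itself does not say how the next scan time is chosen.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table (ISO 8601, UTC offset +00:00).
last_scan = datetime.fromisoformat("2024-10-30T14:18:47+00:00")
next_scan = datetime.fromisoformat("2024-11-29T14:18:47+00:00")

# Difference between the scheduled next scan and the last (failed) scan.
rescan_interval = next_scan - last_scan
print(rescan_interval.days)
```

For these two values the interval is a whole number of days with no leftover seconds.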

Last Successful Scan

Scanned 2024-10-01T14:15:08+00:00
URL https://www.mein-jobmarkt.de/robots.txt
Domain IPs 217.182.184.195
Response IP 217.182.184.195
Found Yes
Hash cf31b443464157feebbb27b1a917241d8974a21173d3c645f2e21e2fbb18a892
SimHash 13615d51c736
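
A content hash like the one above is typically used to detect whether the file changed between scans. The report does not name the algorithm; 64 hexadecimal digits is the length of a SHA-256 digest, so the sketch below assumes SHA-256. The function names are hypothetical and not taken from the scanning service.

```python
import hashlib

# Hash value copied from the report; the algorithm (SHA-256) is an assumption.
REPORTED_HASH = "cf31b443464157feebbb27b1a917241d8974a21173d3c645f2e21e2fbb18a892"

def body_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a fetched robots.txt body."""
    return hashlib.sha256(body).hexdigest()

def has_changed(body: bytes, previous_hash: str = REPORTED_HASH) -> bool:
    """True if the body's digest differs from the previously recorded hash."""
    return body_fingerprint(body) != previous_hash
```

An exact hash flags any byte-level change. The SimHash value listed beside it is a different kind of fingerprint, intended for near-duplicate comparison, and is not reproduced here.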

Groups

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /ScriptResource
Disallow /WebResource
Disallow /Verlag/Datenschutz
Disallow /Marktplatz
Disallow /Verlag/OAA-gesperrt
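
To see how a crawler would apply these rules, the group can be rebuilt as robots.txt lines and evaluated with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the line layout is reconstructed from the table rather than copied from the original file, and the user agent `ExampleBot` and the helper `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule table above; the original file layout may differ.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /User",
    "Disallow: /Dateien",
    "Disallow: /Nachrichten/Suche",
    "Disallow: /ScriptResource",
    "Disallow: /WebResource",
    "Disallow: /Verlag/Datenschutz",
    "Disallow: /Marktplatz",
    "Disallow: /Verlag/OAA-gesperrt",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in memory, no network request

def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check a site-relative path against the reconstructed rules."""
    return parser.can_fetch(agent, "https://www.mein-jobmarkt.de" + path)

print(is_allowed("/Marktplatz"))   # covered by a Disallow rule
print(is_allowed("/Nachrichten"))  # only /Nachrichten/Suche is disallowed
```

Rules match by path prefix, so `Disallow /User` also covers anything beneath it, such as `/User/Login` (a made-up path used only for illustration).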

Other Records

Field Value
crawl-delay 2
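
The `crawl-delay` record is a non-standard extension that some crawlers honor and others ignore. The file's own comment describes the value as seconds. A minimal sketch of reading it with `urllib.robotparser` and spacing requests accordingly follows. The two-line robots.txt is a reduced reconstruction, and `ExampleBot` and `earliest_fetch_times` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Reduced reconstruction: only the wildcard group and its crawl-delay record.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Crawl-delay: 2"])

# crawl_delay() returns None when no delay applies, so fall back to 0.
delay = parser.crawl_delay("ExampleBot") or 0

def earliest_fetch_times(start: float, count: int, delay_seconds: float) -> list:
    """Earliest start time, in seconds, for each of `count` sequential requests."""
    return [start + i * delay_seconds for i in range(count)]

print(earliest_fetch_times(0, 3, delay))
```

A real crawler would sleep until each of these times before issuing the next request. The schedule is computed rather than slept through so the example runs instantly.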

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Max crawler Time per page in sec
  • Sitemap
  • Sitemap: https://www.mein-jobmarkt.de/Sitemap_Index.xml.gz