gaeubote.de
robots.txt

Robots Exclusion Standard data for gaeubote.de

Resource Scan

Scan Details

Site Domain gaeubote.de
Base Domain gaeubote.de
Scan Status Ok
Last Scan 2024-05-27T17:43:23+00:00
Next Scan 2024-06-03T17:43:23+00:00

Last Scan

Scanned 2024-05-27T17:43:23+00:00
URL https://gaeubote.de/robots.txt
Redirect https://www.gaeubote.de/robots.txt
Redirect Domain www.gaeubote.de
Redirect Base gaeubote.de
Domain IPs 54.36.43.50
Redirect IPs 54.36.43.50
Response IP 54.36.43.50
Found Yes
Hash 9b1553dfe6bcc6c3e61390e764e589d72385fc5ade7c5a7d4437b3fc068397d1
SimHash 11495d10c72c
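
The Hash field above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Assuming SHA-256 over the raw response bytes, a rescan could detect a changed robots.txt roughly as sketched below. The function names `content_hash` and `has_changed` are hypothetical and not part of the scanner.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of the raw response body.

    Assumption: the report's Hash field is SHA-256 over the unmodified bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against the hash stored by the last scan."""
    return content_hash(body) != previous_hash.lower()
```

An exact hash only reports whether any byte differs. The separate SimHash field suggests the scanner also keeps a similarity fingerprint, which is not reproduced here.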

Groups

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /ScriptResource
Disallow /WebResource

Other Records

Field Value
crawl-delay 2
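
To illustrate how a crawler might interpret the records above, the following sketch rebuilds the `*` group from the two tables and checks a few paths with Python's standard `urllib.robotparser`. The reconstructed text is an approximation, not the verbatim file, and the user agent `ExampleBot` and the helper `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records tables; an approximation,
# not the verbatim file served by the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /User
Disallow: /Dateien
Disallow: /Nachrichten/Suche
Disallow: /ScriptResource
Disallow: /WebResource
Crawl-delay: 2
"""

BASE = "https://www.gaeubote.de"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given path may be fetched by the (hypothetical) agent."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for path in ("/User/Login", "/Nachrichten/Suche", "/Nachrichten"):
        print(path, is_allowed(path))
    # Seconds a polite crawler should wait between requests.
    print("crawl-delay:", parser.crawl_delay("ExampleBot"))
```

Disallow rules match by path prefix, so `/Nachrichten/Suche` is blocked while other paths under `/Nachrichten` stay crawlable.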

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Max crawler Time per page in sec
  • Sitemap
  • Sitemap: https://www.gaeubote.de/Sitemap_Index.xml.gz