ems-vechte-surfer.de
robots.txt

Robots Exclusion Standard data for ems-vechte-surfer.de

Resource Scan

Scan Details

Site Domain ems-vechte-surfer.de
Base Domain ems-vechte-surfer.de
Scan Status Ok
Last Scan 2024-10-31T20:07:42+00:00
Next Scan 2024-11-07T20:07:42+00:00

Last Scan

Scanned 2024-10-31T20:07:42+00:00
URL https://ems-vechte-surfer.de/robots.txt
Redirect https://www.ems-vechte-surfer.de/robots.txt
Redirect Domain www.ems-vechte-surfer.de
Redirect Base ems-vechte-surfer.de
Domain IPs 217.182.187.117
Redirect IPs 217.182.187.117
Response IP 217.182.187.117
Found Yes
Hash c31a34b526ae3dd5252c00d36ae18a9c924b17b27d7a4d18bd15fd36986a9f6b
SimHash 71105d10cfa5
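
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The scan does not say which algorithm produced it or which bytes were hashed, so the following is only a sketch under the assumption that it is a SHA-256 digest of the raw robots.txt response body. The function names are hypothetical.

```python
import hashlib

# Value reported in the "Hash" field of the scan.
REPORTED_HASH = "c31a34b526ae3dd5252c00d36ae18a9c924b17b27d7a4d18bd15fd36986a9f6b"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes) -> bool:
    """Compare a fetched body against the reported hash.

    Assumption: the scanner hashed the unmodified response bytes with SHA-256.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would not prove that the file changed, since the scanner may normalise the content or use a different algorithm before hashing.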

Groups

amazonbot
applebot
applebot-extended
bytespider
ccbot
chatgpt-user
claude-web
claudebot
diffbot
friendlycrawler
gptbot
icc-crawler
imagesiftbot
oai-searchbot
perplexitybot
petalbot
scrapy
timpibot
velenpublicwebcrawler
youbot
anthropic-ai
cohere-ai
img2dataset
omgili
omgilibot

Rule Path
Disallow /

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /ScriptResource
Disallow /WebResource

Other Records

Field Value
crawl-delay 2

Other Records

Field Value
sitemap https://www.ems-vechte-surfer.de/Sitemap_Index.xml.gz
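
The groups, rules, and other records listed above can be checked with Python's standard urllib.robotparser. The sketch below rebuilds a robots.txt from the scan fields; the original file's exact wording, casing, and line order are not shown in the scan, so the reconstructed text is an assumption, as is the placement of crawl-delay inside the * group. "ExampleBot" and the path /User/Login are hypothetical values used only for illustration. site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Agents listed under "Groups" in the scan (shown lower-cased there).
BLOCKED_AGENTS = [
    "amazonbot", "applebot", "applebot-extended", "bytespider", "ccbot",
    "chatgpt-user", "claude-web", "claudebot", "diffbot", "friendlycrawler",
    "gptbot", "icc-crawler", "imagesiftbot", "oai-searchbot", "perplexitybot",
    "petalbot", "scrapy", "timpibot", "velenpublicwebcrawler", "youbot",
    "anthropic-ai", "cohere-ai", "img2dataset", "omgili", "omgilibot",
]

# Reconstructed robots.txt lines (an assumption, not the original file text).
lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS]
lines += ["Disallow: /", ""]
lines += [
    "User-agent: *",
    "Disallow: /User",
    "Disallow: /Dateien",
    "Disallow: /Nachrichten/Suche",
    "Disallow: /ScriptResource",
    "Disallow: /WebResource",
    "Crawl-delay: 2",
    "",
    "Sitemap: https://www.ems-vechte-surfer.de/Sitemap_Index.xml.gz",
]

parser = RobotFileParser()
parser.parse(lines)

BASE = "https://www.ems-vechte-surfer.de"

# A listed agent is denied everywhere, including the site root.
gptbot_root = parser.can_fetch("GPTBot", BASE + "/")

# An unlisted agent falls through to the "*" group.
generic_root = parser.can_fetch("ExampleBot", BASE + "/")
generic_user = parser.can_fetch("ExampleBot", BASE + "/User/Login")
generic_delay = parser.crawl_delay("ExampleBot")

print("GPTBot may fetch /:", gptbot_root)
print("ExampleBot may fetch /:", generic_root)
print("ExampleBot may fetch /User/Login:", generic_user)
print("ExampleBot crawl delay:", generic_delay)
print("Sitemaps:", parser.site_maps())
```

Note that urllib.robotparser matches rules by simple path prefix and agents by substring, so other crawlers may interpret the same file slightly differently.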

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Max crawler Time per page in sec
  • Sitemap
  • Legal notice: ems-vechte-surfer.de expressly reserves the right to use its content for commercial text and data mining (§ 44b UrhG).
  • The use of robots or other automated means to access ems-vechte-surfer.de or collect or mine data without the express permission of ems-vechte-surfer.de is strictly prohibited.
  • If you would like to apply for permission to crawl ems-vechte-surfer.de, collect or use data, please contact datenschutz@gn-online.de