peterpane.de
robots.txt

Robots Exclusion Standard data for peterpane.de

Resource Scan

Scan Details

Site Domain peterpane.de
Base Domain peterpane.de
Scan Status Ok
Last Scan 3/15/2025, 12:08:34 PM
Next Scan 4/14/2025, 12:08:34 PM

Last Scan

Scanned 3/15/2025, 12:08:34 PM
URL https://peterpane.de/robots.txt
Domain IPs 104.26.4.126, 104.26.5.126, 172.67.68.186, 2606:4700:20::681a:47e, 2606:4700:20::681a:57e, 2606:4700:20::ac43:44ba
Response IP 104.26.4.126
Found Yes
Hash ce024cf4f77929533d244501308807c33c9c6bc328b77a710b7ca400f4e461f5
SimHash f4985d51dfa2

Groups

*

Rule Path
Disallow /wp-admin/
Allow /
Disallow /*gclid*
Disallow /*?utm*
Disallow /*%26utm*
Disallow /*?from*
Disallow /*%26from*
Disallow /*?fbclid=*
Disallow /?s*
Disallow /*?tid*
Disallow /*%26tid*
Disallow /*?PageSpeed*
Disallow /*%3Dnoscript*
Allow /*.*jpg*
Allow /*.*png*
Allow /*.*jpeg*
Allow /*.*gif*
Allow /*.*pdf*
Allow /*.*js*
Allow /*.*css*
Allow /*.*woff*
Allow /*.*ttf*
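
The rules in the * group mix plain prefixes with wildcard patterns, so which rule wins for a given URL is not obvious from the table alone. The following is a minimal sketch of how a crawler might evaluate them, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names RULES, pattern_to_regex and is_allowed are hypothetical, only a subset of the rules is included, and percent-encoding normalization is left out.

```python
import re

# Subset of the "*" group rules from the table above: (rule, path pattern).
RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/"),
    ("Disallow", "/*gclid*"),
    ("Disallow", "/*?utm*"),
    ("Disallow", "/?s*"),
    ("Allow", "/*.*css*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's beginning.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumption: longest matching pattern wins; on equal length, Allow wins.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for rule, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), rule == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in ["/wp-admin/options.php", "/blog/post/", "/?s=brot",
                 "/page?utm_source=x", "/style.css?utm_source=x"]:
        print(path, is_allowed(path))
```

Under these assumptions, a tagged stylesheet URL such as /style.css?utm_source=x stays crawlable, because the pattern /*.*css* is one character longer than /*?utm*. Crawlers that resolve conflicts differently may reach another result.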

seobility

Rule Path
Allow /

grub-client

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

zookabot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

larbin

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /
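
Each group above applies to one user agent, with * as the fallback. As a rough sketch of how a crawler could pick its group, the snippet below assumes case-insensitive substring matching of the group name against the User-Agent string and prefers the longest matching name. This is one common interpretation, not necessarily what any particular crawler does; GROUPS holds only a few of the groups listed here, and select_group is a hypothetical helper.

```python
# A few of the groups from the listing above, keyed by lowercase agent name.
GROUPS = {
    "*": "default rules",
    "seobility": [("Allow", "/")],
    "ahrefsbot": [("Disallow", "/")],
    "httrack": [("Disallow", "/")],
    "teleport": [("Disallow", "/")],
    "teleportpro": [("Disallow", "/")],
}


def select_group(user_agent, groups=GROUPS):
    """Return the name of the group that applies to `user_agent`.

    Assumption: a group matches if its name occurs (case-insensitively)
    in the User-Agent string; the longest matching name is preferred,
    and "*" is used when no named group matches.
    """
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name in ua]
    return max(matches, key=len) if matches else "*"


if __name__ == "__main__":
    print(select_group("Mozilla/5.0 (compatible; AhrefsBot/7.0)"))
    print(select_group("TeleportPro/1.29"))
    print(select_group("Googlebot/2.1"))
```

Substring matching makes very short group names such as doc, zao or fetch match more agents than intended, which is one reason to prefer the longest match.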

Other Records

Field Value
sitemap https://www.peterpane.de/sitemap_index.xml

Comments

  • no indexing for dynamic URLs
  • The 'grub' distributed client has been *very* poorly behaved.
  • FROM http://de.wikipedia.org/robots.txt
  • Crawlers that are kind enough to obey, but which we'd rather not have unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/

Warnings

  • 2 invalid lines.