Robots Exclusion Standard data for lehmann.cx

Resource Scan

Scan Details

Site Domain lehmann.cx
Base Domain lehmann.cx
Scan Status Ok
Last Scan 2025-10-10T00:15:49+00:00
Next Scan 2025-10-24T00:15:49+00:00

Last Scan

Scanned 2025-10-10T00:15:49+00:00
URL https://lehmann.cx/robots.txt
Domain IPs 104.21.1.101, 172.67.129.5, 2606:4700:3030::ac43:8105, 2606:4700:3031::6815:165
Response IP 104.21.1.101
Found Yes
Hash c7b68c0827214c6b6585954e3ed2e490086fc8c58afbb654dca8a5bde2f692a3
SimHash 68516d5b9676

Groups

petalbot

Rule Path
Disallow /wiki/

*

Rule Path
Disallow /wiki/*?

*

Rule Path
Disallow /wiki/lib/

*

Rule Path
Allow /
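
To see how the groups listed above would apply to a given request, here is a minimal sketch of rule evaluation. It assumes the longest-match behavior described in RFC 9309 (the most specific matching rule wins, Allow wins a tie, `*` in a path matches any run of characters) and assumes that the three `*` groups are merged into one. The names `RULES`, `pattern_to_regex`, `rules_for`, `is_allowed`, and the crawler name `examplebot` are hypothetical and exist only for this illustration; real crawlers may interpret the file differently.

```python
import re

# Groups as listed in the scan above: (user-agent, rule, path).
RULES = [
    ("petalbot", "disallow", "/wiki/"),
    ("*", "disallow", "/wiki/*?"),
    ("*", "disallow", "/wiki/lib/"),
    ("*", "allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rules_for(agent):
    """Return the merged rules for `agent`, falling back to the '*' groups.

    Simplification (assumption): `agent` is compared as an exact,
    case-insensitive product token rather than searched for in a full
    User-Agent header.
    """
    agent = agent.lower()
    own = [(kind, path) for ua, kind, path in RULES if ua == agent]
    if own:
        return own
    return [(kind, path) for ua, kind, path in RULES if ua == "*"]


def is_allowed(agent, path):
    """Longest-match evaluation; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules_for(agent):
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed
```

Under these assumptions, `is_allowed("examplebot", "/wiki/start")` is true, while any `/wiki/` URL that carries a query string, or anything under `/wiki/lib/`, is refused. A crawler identifying as `petalbot` uses only its own group, so it is refused everything under `/wiki/` and nothing else.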

Other Records

Field Value
sitemap https://www.lehmann.cx/sitemap.xml.gz
sitemap https://www.lehmann.cx/wiki/doku.php?do=sitemap
sitemap https://www.lehmann.cx/blog/sitemap.xml

Comments

  • some kind of crawling issue, sorry
  • overeager bots go through the whole history, no point
  • why lib? there is nothing relevant in there
  • anything else should be ok