de.infobyip.com
robots.txt

Robots Exclusion Standard data for de.infobyip.com

Resource Scan

Scan Details

Site Domain de.infobyip.com
Base Domain infobyip.com
Scan Status Ok
Last Scan 2025-11-22T13:50:16+00:00
Next Scan 2025-12-22T13:50:16+00:00

Last Scan

Scanned 2025-11-22T13:50:16+00:00
URL https://de.infobyip.com/robots.txt
Domain IPs 104.26.2.221, 104.26.3.221, 172.67.70.146, 2606:4700:20::681a:2dd, 2606:4700:20::681a:3dd, 2606:4700:20::ac43:4692
Response IP 172.67.70.146
Found Yes
Hash 20a692a9e716deb62770a2021e8c720d46f5a3b84ef1c0b4959b0edb96d68274
SimHash e2762d46c91c
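
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal characters are consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response bytes; that is an assumption, not something the scan states. The helper names sha256_hex and matches_report are hypothetical.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "20a692a9e716deb62770a2021e8c720d46f5a3b84ef1c0b4959b0edb96d68274"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a locally fetched body against the reported hash.

    Assumption: the report hashes the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == reported.lower()
```

Passing the bytes of a freshly fetched https://de.infobyip.com/robots.txt to matches_report would show whether the file has changed since the scan, provided the assumption about the algorithm holds. A mismatch can also mean the report normalizes the body before hashing.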

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

gptbot
claudebot

Rule Path
Disallow /balancedchemicalequations

gptbot
claudebot

Rule Path
Disallow /molecularweightcalculated

*

Rule Path
Disallow /cite.php
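
The groups listed above repeat the same user agents in separate blocks. Under RFC 9309, a crawler combines all groups that name it and falls back to the "*" groups only when none do. The sketch below illustrates that reading with the data from this report. It is a minimal illustration, not a full parser: it uses plain prefix matching, ignores wildcards, and treats crawl-delay (a non-standard field) as applying only to the group where it was listed. The names GROUPS, CRAWL_DELAYS, rules_for, is_allowed, and crawl_delay are hypothetical.

```python
# Groups as listed in the scan report: (user agents, disallowed path prefixes).
GROUPS = [
    (["*"], []),
    (["gptbot", "claudebot"], ["/balancedchemicalequations"]),
    (["gptbot", "claudebot"], ["/molecularweightcalculated"]),
    (["*"], ["/cite.php"]),
]

# crawl-delay is reported as an "other record" of the first "*" group.
CRAWL_DELAYS = {"*": 1}


def rules_for(agent: str) -> list[str]:
    """Merge every group naming the agent; fall back to "*" if none do."""
    agent = agent.lower()
    named = [rules for agents, rules in GROUPS if agent in agents]
    if not named:
        named = [rules for agents, rules in GROUPS if "*" in agents]
    return [prefix for rules in named for prefix in rules]


def is_allowed(agent: str, path: str) -> bool:
    """Prefix match against Disallow rules (the file has no Allow rules)."""
    return not any(path.startswith(prefix) for prefix in rules_for(agent))


def crawl_delay(agent: str):
    """Return the delay in seconds for the agent, or None if unspecified."""
    agent = agent.lower()
    named = any(agent in agents for agents, _ in GROUPS)
    return CRAWL_DELAYS.get(agent if named else "*")
```

With this reading, gptbot and claudebot are blocked from both chemistry paths but not from /cite.php, while every other agent is blocked only from /cite.php and asked to wait 1 second between requests. Parsers that do not merge repeated groups may give different answers for this file.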

Comments

  • Prevent msn from overwhelming the server, e.g. some msn bot IPs hit the site 99558 times per day in Feb 2015
  • Changed to any agent since mail.ru started to overload it as well
  • Since Jan 7 2025 a few hosts like rate-limited-proxy-209-85-238-1.google.com started issuing 200K+ daily requests.
  • Prevent building site content into LLMs