pt.infobyip.com
robots.txt

Robots Exclusion Standard data for pt.infobyip.com

Resource Scan

Scan Details

Site Domain pt.infobyip.com
Base Domain infobyip.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-09-03T02:20:08+00:00
Next Scan 2025-10-03T02:20:08+00:00

Last Successful Scan

Scanned 2025-07-12T22:32:04+00:00
URL https://pt.infobyip.com/robots.txt
Domain IPs 104.26.2.221, 104.26.3.221, 172.67.70.146, 2606:4700:20::681a:2dd, 2606:4700:20::681a:3dd, 2606:4700:20::ac43:4692
Response IP 104.26.3.221
Found Yes
Hash 87068570061eaae5fb6db102d231ba8ff955f1316d43aa2cea48791fe25f0c60
SimHash e4722944c91c

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

gptbot

Rule Path
Disallow /balancedchemicalequations

*

Rule Path
Disallow /cite.php
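
To illustrate how a crawler might interpret the groups listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a hypothetical reconstruction, not the site's actual file: it assumes the two `*` groups can be merged into one (Python's parser keeps only the first `*` group it sees), and the agent name `examplebot` is made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the rules in the scan above. The real
# file's layout and ordering are not shown; the two "*" groups are merged
# here as an assumption so that Python's parser sees both records.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 1
Disallow: /cite.php

User-agent: gptbot
Disallow: /balancedchemicalequations
"""

BASE = "https://pt.infobyip.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "examplebot" is a made-up agent that falls through to the "*" group.
for agent in ("gptbot", "examplebot"):
    for path in ("/balancedchemicalequations", "/cite.php"):
        allowed = parser.can_fetch(agent, BASE + path)
        print(f"{agent:10s} {path:28s} allowed={allowed}")
    print(f"{agent:10s} crawl-delay={parser.crawl_delay(agent)}")
```

Under this reconstruction, an agent that matches a named group (here `gptbot`) is evaluated against that group only, so it does not inherit the `*` group's `/cite.php` rule or its crawl delay. Other parsers may resolve groups differently.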

Comments

  • Prevent msn from overwhelming the server, e.g. some msn bot IPs hit the site 99558 times per day in Feb 2015
  • Changed to any agent since mail.ru started to overload it as well
  • Since Jan 7 2025 a few hosts like rate-limited-proxy-209-85-238-1.google.com started issuing 200K+ daily requests.
  • Prevent yandex from using too many resources
  • User-agent: yandex
  • Crawl-delay: 0.1
  • Prevent building site content in LLM