Robots Exclusion Standard data for caspian.aero

Resource Scan

Scan Details

Site Domain caspian.aero
Base Domain caspian.aero
Scan Status Ok
Last Scan 2025-11-28T21:53:23+00:00
Next Scan 2025-12-28T21:53:23+00:00

Last Scan

Scanned 2025-11-28T21:53:23+00:00
URL https://caspian.aero/robots.txt
Domain IPs 91.201.52.137
Response IP 91.201.52.137
Found Yes
Hash fcf0a480123ee5bd58bf9edfbcf3b6e2a70d5ec616b618edfe3e25f381475060
SimHash bb0c47fcde81

Groups

teoma

Rule Path
Disallow /control/
Disallow /report/

*

Rule Path
Disallow /control/
Disallow /report/
Disallow /details/goldenbull2007john/
Disallow /stream/goldenbull2007john/
Disallow /download/goldenbull2007john/
Disallow /14/items/goldenbull2007john/goldenbull2007john_djvu.txt
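The two groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `ROBOTS_TXT` string is reconstructed from the rule tables in this report rather than copied from the live file, and `ExampleBot` and the helper `build_parser` are hypothetical names used only for illustration. Note that `urllib.robotparser` applies simple prefix matching, so other crawlers may interpret the same rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the "Groups" tables in this report,
# not a verbatim copy of https://caspian.aero/robots.txt.
ROBOTS_TXT = """\
User-agent: teoma
Disallow: /control/
Disallow: /report/

User-agent: *
Disallow: /control/
Disallow: /report/
Disallow: /details/goldenbull2007john/
Disallow: /stream/goldenbull2007john/
Disallow: /download/goldenbull2007john/
Disallow: /14/items/goldenbull2007john/goldenbull2007john_djvu.txt
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://caspian.aero"

# The teoma group only lists /control/ and /report/.
teoma_control = parser.can_fetch("teoma", base + "/control/")
teoma_details = parser.can_fetch("teoma", base + "/details/goldenbull2007john/")

# Any other agent (hypothetical "ExampleBot") falls back to the "*" group.
other_details = parser.can_fetch("ExampleBot", base + "/details/goldenbull2007john/")
other_elsewhere = parser.can_fetch("ExampleBot", base + "/details/")

print(teoma_control, teoma_details, other_details, other_elsewhere)
```

Because a crawler uses only the most specific group that matches its user agent, the `teoma` group does not inherit the extra `goldenbull2007john` rules listed under `*`.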

Other Records

Field Value
sitemap http://archive.org/sitemap/sitemap.xml
sitemap http://archive.org/sitemap/sitemap.xml

Comments

  • Welcome to the Archive!
  • Please crawl our files.
  • We appreciate if you can crawl responsibly.
  • Stay open!
  • slow down the ask jeeves crawler which was hitting our SE a little too fast via collection pages. --Feb2008 tracey--