catalog.cpp.edu
robots.txt

Robots Exclusion Standard data for catalog.cpp.edu

Resource Scan

Scan Details

Site Domain catalog.cpp.edu
Base Domain cpp.edu
Scan Status Ok
Last Scan 2025-09-29T20:59:34+00:00
Next Scan 2025-10-29T20:59:34+00:00

Last Scan

Scanned 2025-09-29T20:59:34+00:00
URL https://catalog.cpp.edu/robots.txt
Domain IPs 35.174.48.113, 52.202.242.122, 52.3.71.54
Response IP 52.202.242.122
Found Yes
Hash 1dfe956a6b5e20dc3c081043faf71e746170ae4dfefa0f00c2bacc8dc8c8a0c3
SimHash 7b30d8c9d3f3

Groups

archive.org_bot

Rule Path
Disallow /portfolio.php
Disallow /portfolio_nopop.php
Disallow /ajax/
Disallow /search_advanced.php

Other Records

Field Value
crawl-delay 15

occ-crawler

Rule Path
Disallow /portfolio.php
Disallow /portfolio_nopop.php
Disallow /ajax/
Disallow /search_advanced.php

Other Records

Field Value
crawl-delay 15

*

Rule Path
Disallow /portfolio.php
Disallow /portfolio_nopop.php
Disallow /ajax/
Disallow /search_advanced.php

Other Records

Field Value
crawl-delay 120

Comments

  • Oberlin's bot.
  • Owens Community College
  • Everyone else.