oclc.org
robots.txt
Robots Exclusion Standard data for oclc.org
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | oclc.org |
Base Domain | oclc.org |
Scan Status | Ok |
Last Scan | 2024-06-14T07:02:15+00:00 |
Next Scan | 2024-07-14T07:02:15+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-06-14T07:02:15+00:00 |
URL | https://oclc.org/robots.txt |
Redirect | https://www.oclc.org/robots.txt |
Redirect Domain | www.oclc.org |
Redirect Base | oclc.org |
Domain IPs | 132.174.0.132 |
Redirect IPs | 132.174.0.132 |
Response IP | 132.174.0.132 |
Found | Yes |
Hash | eafb305f3d2815c4e9b786c372ebad3a4f69caf063a347589306966ac0f8be72 |
SimHash | 6b08fc74a353 |
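The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan record does not name the algorithm. Assuming it is SHA-256 over the raw response body, a re-fetched copy of the file could be compared against the recorded value roughly as follows. The function names are hypothetical, and whether the scanner normalizes the body before hashing is not known.

```python
import hashlib

# Hash value recorded in the scan above.
RECORDED_HASH = "eafb305f3d2815c4e9b786c372ebad3a4f69caf063a347589306966ac0f8be72"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes) -> bool:
    """Check a fetched robots.txt body against the recorded hash.

    Assumption: the recorded hash is SHA-256 of the unmodified body.
    """
    return sha256_hex(body) == RECORDED_HASH
```

A mismatch would only indicate that the bytes differ from what was scanned (or that the assumption about the algorithm is wrong), not which rules changed.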
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | */contacts/all-contacts.*.html$ |
Disallow | */contacts/all-contacts/us-sales.*.html$ |
Disallow | */member-stories.*.html$ |
Disallow | */services/a-z.*.html$ |
Disallow | */services/a-z/* |
Disallow | */developer/gallery.*.en.html$ |
Disallow | */developer/news.*.en.html$ |
Disallow | */developer/develop/web-services.*.en.html$ |
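The paths above use the two special characters defined for robots.txt matching: `*` stands for any sequence of characters and a trailing `$` anchors the pattern to the end of the URL path. The following is a minimal sketch of how a crawler might evaluate them, by translating each pattern into a regular expression. The helper names and the sample paths in the usage note are hypothetical, and real crawlers may differ in details such as percent-encoding and Allow/Disallow precedence.

```python
import re

# Disallow patterns listed for the "*" group in the scan above.
DISALLOW_PATTERNS = [
    "*/contacts/all-contacts.*.html$",
    "*/contacts/all-contacts/us-sales.*.html$",
    "*/member-stories.*.html$",
    "*/services/a-z.*.html$",
    "*/services/a-z/*",
    "*/developer/gallery.*.en.html$",
    "*/developer/news.*.en.html$",
    "*/developer/develop/web-services.*.en.html$",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match at the end of the
    path. Without '$', the pattern only has to match a prefix.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        regex += r"\Z"
    return re.compile(regex)


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW_PATTERNS)
```

Under this sketch, a made-up path such as `/en/services/a-z/example.html` would be disallowed by the `*/services/a-z/*` rule, while `/en/member-stories.en.html?x=1` would not match `*/member-stories.*.html$`, because the `$` requires the path to end in `.html`.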
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
sitemap | https://www.oclc.org/sitemap.xml |
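To illustrate how a client might read the crawl-delay and sitemap records, the sketch below feeds a reconstructed robots.txt to Python's standard `urllib.robotparser`. The reconstruction is an assumption: the scan records list the rules and fields but not the original file's layout, so placing `Crawl-delay` inside the `*` group is a guess. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan records above. The real file's ordering,
# comments, and the position of Crawl-delay are not known (assumption).
RECONSTRUCTED_ROBOTS_TXT = """\
User-agent: *
Disallow: */contacts/all-contacts.*.html$
Disallow: */contacts/all-contacts/us-sales.*.html$
Disallow: */member-stories.*.html$
Disallow: */services/a-z.*.html$
Disallow: */services/a-z/*
Disallow: */developer/gallery.*.en.html$
Disallow: */developer/news.*.en.html$
Disallow: */developer/develop/web-services.*.en.html$
Crawl-delay: 5

Sitemap: https://www.oclc.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED_ROBOTS_TXT.splitlines())

# Seconds a crawler is asked to wait between requests, for the "*" group.
crawl_delay = parser.crawl_delay("*")

# List of sitemap URLs declared in the file, or None if there are none.
sitemaps = parser.site_maps()
```

Note that `urllib.robotparser` compares rule paths as plain prefixes and does not interpret `*` or `$`, so its `can_fetch()` result should not be relied on for the wildcard rules in this file; it is used here only for the crawl-delay and sitemap fields.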