saintclements.net
robots.txt
Robots Exclusion Standard data for saintclements.net
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | saintclements.net |
Base Domain | saintclements.net |
Scan Status | Ok |
Last Scan | 2024-10-31T08:41:34+00:00 |
Next Scan | 2024-11-07T08:41:34+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-31T08:41:34+00:00 |
URL | http://saintclements.net/robots.txt |
Domain IPs | 141.8.192.236 |
Response IP | 141.8.192.236 |
Found | Yes |
Hash | b5ff5b7df16950162967244303aabeaede9ad3d15037b105d96eb7bbf55cc7dc |
SimHash | b30bff53032b |
Groups
User-agent: `*`
Rule | Path |
---|---|
Disallow | /attachment/* |
Disallow | /page/*/ |
Disallow | */page/* |
Disallow | /author/* |
Disallow | /arhives/* |
Disallow | /wp-service/* |
Disallow | /wp-service/ |
Disallow | /wp-service/* |
Disallow | /wp-admin/ |
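
The Disallow rules above rely on the `*` wildcard. As a rough illustration of how a crawler might evaluate them, the sketch below converts each pattern to a regular expression, treating `*` as "any sequence of characters" and a trailing `$` as an end anchor, and matching from the start of the path. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and real crawlers may differ in details such as percent-encoding and Allow/Disallow precedence. The patterns are copied exactly as listed, including the `/arhives/*` spelling.

```python
import re

# Disallow patterns copied verbatim from the `*` group in the report
# (the duplicate /wp-service/* entry is listed once here).
DISALLOW = [
    "/attachment/*",
    "/page/*/",
    "*/page/*",
    "/author/*",
    "/arhives/*",
    "/wp-service/*",
    "/wp-service/",
    "/wp-admin/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    Everything else is matched literally, as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches the start of `path`."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/attachment/photo.jpg", "/blog/page/2", "/about/", "/archives/2020/"]:
        print(path, is_disallowed(path))
```

Under these assumptions, `/archives/2020/` is not blocked, because the listed rule is spelled `/arhives/*`.
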
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
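
`crawl-delay` is a non-standard extension that not every crawler honors. For crawlers that do, a value of 10 is commonly read as "wait at least 10 seconds between requests". The sketch below reads the value with Python's standard `urllib.robotparser` and computes the remaining wait time. The raw file is not shown in this report, so the `ROBOTS_TXT` text is a hypothetical reconstruction that assumes the record belongs to the `*` group; `crawl_delay_for`, `seconds_to_wait`, and the `ExampleBot` agent name are illustrative only.

```python
import urllib.robotparser

# Hypothetical reconstruction: assumes the crawl-delay record sits inside
# the `*` group. The real file layout is not shown in the report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the crawl delay (seconds) that applies to `user_agent`, or None."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def seconds_to_wait(last_fetch: float, now: float, delay: float) -> float:
    """Seconds still to wait before the next request is polite."""
    return max(0.0, delay - (now - last_fetch))


delay = crawl_delay_for(ROBOTS_TXT, "ExampleBot")

if __name__ == "__main__":
    # Timestamps are plain numbers here; a real crawler might use time.monotonic().
    print(delay, seconds_to_wait(last_fetch=100.0, now=103.0, delay=delay))
```

Note that `urllib.robotparser` only picks up `Crawl-delay` when it appears inside a user-agent group, which is why the reconstruction places it there.
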
Warnings
- 4 invalid lines.
- `host` is not a known field.