in.canon
robots.txt
Robots Exclusion Standard data for in.canon
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | in.canon |
Base Domain | in.canon |
Scan Status | Ok |
Last Scan | 2024-10-31T19:35:04+00:00 |
Next Scan | 2024-11-30T19:35:04+00:00 |
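
The Last Scan and Next Scan timestamps above are exactly 30 days apart, which suggests (but does not confirm) a fixed 30-day rescan interval. A minimal Python sketch of that arithmetic follows; `RESCAN_INTERVAL` is an assumed name and value inferred from this single pair of timestamps, not something the report states.

```python
from datetime import datetime, timedelta

# Assumption: the scanner reschedules on a fixed 30-day interval.
RESCAN_INTERVAL = timedelta(days=30)

# Last Scan value copied from the table above.
last_scan = datetime.fromisoformat("2024-10-31T19:35:04+00:00")

# Adding the assumed interval reproduces the reported Next Scan value.
next_scan = last_scan + RESCAN_INTERVAL
print(next_scan.isoformat())  # 2024-11-30T19:35:04+00:00
```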
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-31T19:35:04+00:00 |
URL | https://in.canon/robots.txt |
Domain IPs | 13.33.88.13, 13.33.88.20, 13.33.88.32, 13.33.88.40 |
Response IP | 13.33.88.40 |
Found | Yes |
Hash | e411abd942afa00bb8b17c50de885722bbdfa52879d364192ddd4ce36b242840 |
SimHash | 3a589587e98b |
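
The 64-hex-digit Hash value is consistent with a SHA-256 digest, although the report names neither the algorithm nor the exact bytes that were hashed. Under that assumption, a fetched copy of the file could be compared against the reported value roughly as follows. `sha256_hex` and `matches_reported_hash` are hypothetical helper names, and the SimHash is not reproduced here because its algorithm is not described.

```python
import hashlib

# Hash value copied from the table above.
REPORTED_HASH = "e411abd942afa00bb8b17c50de885722bbdfa52879d364192ddd4ce36b242840"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported_hash(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Assumption: the report hashes the unmodified response body with SHA-256."""
    return sha256_hex(body) == reported.lower()
```

A mismatch would not necessarily mean the file changed; it could equally mean the scanner normalizes the body or uses a different algorithm.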
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | *sort%3Daz* |
Disallow | *sort%3Dza* |
Disallow | *sort%3Dnewest* |
Disallow | *sort%3Doldest* |
Disallow | *sort%3DhighestPrice* |
Disallow | *sort%3DlowestPrice* |
Disallow | */business/search?q=* |
Disallow | */consumer/search?q=* |
Disallow | */support/search?q=* |
Disallow | */support/get-search-result-content* |
Disallow | */support/download?* |
Disallow | */admin/* |
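
Every path in the table above relies on the `*` wildcard, which the Python standard library's `urllib.robotparser` does not expand (it compares path prefixes only). The following is a minimal sketch of wildcard matching for these rules, assuming `*` matches any run of characters, a trailing `$` anchors the end of the URL, and patterns are compared against the path and query string exactly as written, with no percent-decoding. `pattern_to_regex` and `is_disallowed` are hypothetical names.

```python
import re

# Disallow patterns copied from the table above.
DISALLOW_PATTERNS = [
    "*sort%3Daz*",
    "*sort%3Dza*",
    "*sort%3Dnewest*",
    "*sort%3Doldest*",
    "*sort%3DhighestPrice*",
    "*sort%3DlowestPrice*",
    "*/business/search?q=*",
    "*/consumer/search?q=*",
    "*/support/search?q=*",
    "*/support/get-search-result-content*",
    "*/support/download?*",
    "*/admin/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str) -> bool:
    """Return True if any Disallow pattern matches the given path and query."""
    return any(pattern_to_regex(p).match(path_and_query) for p in DISALLOW_PATTERNS)


print(is_disallowed("/en/consumer/search?q=printer"))  # True
print(is_disallowed("/en/consumer/cameras"))           # False
```

Because the sketch does no decoding, a URL containing `sort=az` is not matched by `*sort%3Daz*`; only the literal percent-encoded form is.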
Other Records
Field | Value |
---|---|
crawl-delay | 30 |
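
The `crawl-delay` record is a non-standard extension that not all crawlers honor. A client that chooses to respect it could space its requests at least 30 seconds apart, as in this sketch. `Throttle` is a hypothetical class, and interpreting the value as seconds between consecutive requests is an assumption, since the report does not define the unit.

```python
import time

# Value copied from the crawl-delay record above; assumed to be in seconds.
CRAWL_DELAY_SECONDS = 30


class Throttle:
    """Blocks so that consecutive requests are at least `delay` seconds apart."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last_request = None

    def wait(self):
        """Sleep for whatever remains of the delay, then record the request time."""
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            if elapsed < self.delay:
                self._sleep(self.delay - elapsed)
        self._last_request = self._clock()
```

Calling `Throttle(CRAWL_DELAY_SECONDS).wait()` before each request would keep a single-threaded crawler within the requested pace.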