Robots Exclusion Standard data for th.canon
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | th.canon |
Base Domain | th.canon |
Scan Status | Ok |
Last Scan | 2024-10-29T04:10:56+00:00 |
Next Scan | 2024-11-28T04:10:56+00:00 |
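
The two timestamps above are exactly 30 days apart. The page does not state the rescan interval, so the 30-day value in this short Python sketch is inferred from those two rows; the names `RESCAN_INTERVAL` and `next_scan` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the rescan interval is 30 days, inferred from the two
# timestamps in the table; the report itself does not state it.
RESCAN_INTERVAL = timedelta(days=30)


def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one rescan interval after last_scan_iso."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()


print(next_scan("2024-10-29T04:10:56+00:00"))  # 2024-11-28T04:10:56+00:00
```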
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-29T04:10:56+00:00 |
URL | https://th.canon/robots.txt |
Domain IPs | 13.33.88.101, 13.33.88.108, 13.33.88.110, 13.33.88.128 |
Response IP | 13.33.88.128 |
Found | Yes |
Hash | e411abd942afa00bb8b17c50de885722bbdfa52879d364192ddd4ce36b242840 |
SimHash | 3a589587e98b |
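
The Hash value is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming SHA-256 over the raw response body, a change check between scans could be sketched as below. The function names are hypothetical, and the SimHash field is left out because its exact construction is not documented here.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Hex SHA-256 of a fetched robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """True when the new body's digest differs from the stored one."""
    return body_digest(body) != previous_hash.lower()
```

Storing only the digest lets a scanner detect that the file changed without keeping every previous copy of its contents.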
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | *sort%3Daz* |
Disallow | *sort%3Dza* |
Disallow | *sort%3Dnewest* |
Disallow | *sort%3Doldest* |
Disallow | *sort%3DhighestPrice* |
Disallow | *sort%3DlowestPrice* |
Disallow | */business/search?q=* |
Disallow | */consumer/search?q=* |
Disallow | */support/search?q=* |
Disallow | */support/get-search-result-content* |
Disallow | */support/download?* |
Disallow | */admin/* |
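
Every rule in this group relies on the `*` wildcard. The following is a minimal sketch of how a crawler might evaluate them, assuming the wildcard semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the URL) and a literal, case-sensitive comparison with no percent-encoding normalization. The helper names and the sample paths are hypothetical and are not part of the scan data.

```python
import re

# Disallow patterns copied from the group table.
DISALLOW_PATTERNS = [
    "*sort%3Daz*",
    "*sort%3Dza*",
    "*sort%3Dnewest*",
    "*sort%3Doldest*",
    "*sort%3DhighestPrice*",
    "*sort%3DlowestPrice*",
    "*/business/search?q=*",
    "*/consumer/search?q=*",
    "*/support/search?q=*",
    "*/support/get-search-result-content*",
    "*/support/download?*",
    "*/admin/*",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str) -> bool:
    """True when any Disallow pattern matches from the start of the path."""
    return any(
        pattern_to_regex(p).match(path_and_query) for p in DISALLOW_PATTERNS
    )
```

Under these assumptions, a path such as `/en/admin/users` is blocked and `/en/products` is not. Because the comparison is literal, the `sort%3D…` rules only match URLs that contain the percent-encoded form; a real crawler may normalize encodings differently.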
Other Records
Field | Value |
---|---|
crawl-delay | 30 |
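
`crawl-delay` is not defined by RFC 9309, and crawlers differ in whether they honor it. For a crawler that chooses to, the value is conventionally read as the minimum number of seconds between requests to the site. A minimal Python sketch under that assumption follows; the function name `seconds_to_wait` is hypothetical.

```python
import time
from typing import Optional

CRAWL_DELAY_SECONDS = 30.0  # value from the crawl-delay record


def seconds_to_wait(
    last_request_at: Optional[float],
    now: float,
    delay: float = CRAWL_DELAY_SECONDS,
) -> float:
    """Seconds still to wait before the next request is permitted."""
    if last_request_at is None:
        return 0.0  # nothing fetched yet, so no wait is needed
    return max(0.0, delay - (now - last_request_at))


if __name__ == "__main__":
    last = time.monotonic()
    # Right after a request, close to the full 30 seconds remain.
    print(seconds_to_wait(last, time.monotonic()))
```

A fetch loop would call `time.sleep(seconds_to_wait(last, time.monotonic()))` before each request and then record the new request time.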