all4.com
robots.txt
Robots Exclusion Standard data for all4.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | all4.com |
Base Domain | all4.com |
Scan Status | Ok |
Last Scan | 2024-09-24T02:11:20+00:00 |
Next Scan | 2024-10-01T02:11:20+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-09-24T02:11:20+00:00 |
URL | https://all4.com/robots.txt |
Redirect | https://www.channel4.com/robots.txt |
Redirect Domain | www.channel4.com |
Redirect Base | channel4.com |
Domain IPs | 52.213.2.162, 54.247.164.82 |
Redirect IPs | 184.51.97.37 |
Response IP | 23.54.57.184 |
Found | Yes |
Hash | e8d33369a253562578e768334518d47bf9893a1088c21a24c1f961035d113094 |
SimHash | e9015c70c333 |
Groups
*
Rule | Path |
---|---|
Disallow | /news/?* |
Disallow | /news/*/?* |
Disallow | /press/unregistered-image-search?* |
Disallow | /press/content-search?* |
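
The four rules above can be checked against a request path with a small matcher. The following is a minimal sketch, assuming the wildcard semantics of the Robots Exclusion Protocol (RFC 9309): `*` matches any run of characters, a trailing `$` anchors the end, and `?` is a literal character, not a wildcard. The names `DISALLOW`, `pattern_to_regex` and `is_disallowed` are hypothetical helpers for illustration, not part of any scanner.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/news/?*",
    "/news/*/?*",
    "/press/unregistered-image-search?*",
    "/press/content-search?*",
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*', a trailing '$' anchors the end, and every other
    character (including '?') is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path_and_query):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path_and_query) for p in DISALLOW)

if __name__ == "__main__":
    for candidate in ("/news/?page=2", "/news/uk/?page=2",
                      "/news/uk/story", "/press/content-search"):
        print(candidate, is_disallowed(candidate))
```

Because the group contains no Allow rules, the sketch skips the longest-match precedence step that a full implementation would need. Under these assumptions only URLs carrying a query string are blocked: `/news/uk/story` and `/press/content-search` stay crawlable, while their `?`-suffixed variants do not.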
Other Records
Field | Value |
---|---|
sitemap | https://www.channel4.com/news/sitemap.xml |
sitemap | https://www.channel4.com/sitemap.xml |
Warnings
- 1 invalid line.
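
The split into Groups, Other Records and Warnings can be illustrated with a rough line classifier. This is a sketch only: the report does not say which line was invalid or how the scanner decides validity, so the rule used here (a line without a `field: value` colon, or an Allow/Disallow rule before any User-agent line) is an assumption. `SAMPLE` is a hypothetical input built for illustration, not the actual robots.txt contents, and `parse_robots` is a hypothetical helper.

```python
def parse_robots(text):
    """Classify robots.txt lines into groups, other records, and invalid lines."""
    groups = {}            # user-agent -> list of (rule, path)
    other = []             # (field, value) records outside groups, e.g. sitemap
    invalid = 0            # count of lines that could not be classified
    current_agents = []
    last_was_agent = False

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            invalid += 1                      # assumed invalid: no field separator
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()

        if field == "user-agent":
            if not last_was_agent:
                current_agents = []           # a new group starts here
            current_agents.append(value)
            groups.setdefault(value, [])
            last_was_agent = True
            continue

        last_was_agent = False
        if field in ("allow", "disallow"):
            if not current_agents:
                invalid += 1                  # assumed invalid: rule outside a group
                continue
            for agent in current_agents:
                groups[agent].append((field.capitalize(), value))
        else:
            other.append((field, value))

    return groups, other, invalid

# Hypothetical sample, NOT the real file behind this report.
SAMPLE = """\
User-agent: *
Disallow: /news/?*
this line has no colon
Sitemap: https://www.channel4.com/sitemap.xml
"""

if __name__ == "__main__":
    groups, other, invalid = parse_robots(SAMPLE)
    print(groups, other, invalid)
```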