theweek.com
robots.txt
Robots Exclusion Standard data for theweek.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | theweek.com |
Base Domain | theweek.com |
Scan Status | Ok |
Last Scan | 2024-10-26T15:12:01+00:00 |
Next Scan | 2024-11-02T15:12:01+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-10-26T15:12:01+00:00 |
URL | https://theweek.com/robots.txt |
Domain IPs | 199.232.194.114, 199.232.198.114 |
Response IP | 199.232.198.114 |
Found | Yes |
Hash | 9fdc8ec98016c3423d5a1fdabead6a96671f6a6f77b8ad99735d55c8b43e0eaf |
SimHash | 2424e480ad99 |
Groups
*
Rule | Path |
---|---|
Disallow | */deals/compare |
Disallow | */html/ |
Disallow | */p/*/embed/captioned |
Disallow | *searchTerm%3D* |
Disallow | *sortBy%3D* |
Disallow | *productBrand%3D* |
Disallow | *%7B*%7D* |
Disallow | /infinite-scroll-article/* |
Disallow | /infinite-scroll-review/* |
Disallow | /infinite-scroll-recipe/* |
*
Rule | Path |
---|---|
Disallow | /search/ |
Disallow | /359/ |
Disallow | /content/ |
Disallow | /blaize/datalayer |
Disallow | /*?*xhr=* |
*
No rules defined. All paths allowed.
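The groups above all apply to the `*` user agent, and RFC 9309 says that multiple groups matching the same user agent are combined into one rule set. As a rough illustration of how a crawler might test a path against these rules, here is a minimal sketch in Python. It assumes the common wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end, and patterns match from the start of the path), and it compares percent-encoded text literally rather than normalizing it. The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical helpers, not part of any library or of the scan itself.

```python
import re

# Disallow patterns copied from the "*" groups listed above, merged into one list.
DISALLOW = [
    "*/deals/compare",
    "*/html/",
    "*/p/*/embed/captioned",
    "*searchTerm%3D*",
    "*sortBy%3D*",
    "*productBrand%3D*",
    "*%7B*%7D*",
    "/infinite-scroll-article/*",
    "/infinite-scroll-review/*",
    "/infinite-scroll-recipe/*",
    "/search/",
    "/359/",
    "/content/",
    "/blaize/datalayer",
    "/*?*xhr=*",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex (hypothetical helper)."""
    # A trailing "$" anchors the pattern to the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


# Compile once; there are no Allow rules here, so any match means "disallowed".
_COMPILED = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a Disallow pattern."""
    return any(rx.match(path) for rx in _COMPILED)


if __name__ == "__main__":
    for sample in ["/search/results", "/us/deals/compare", "/news/some-article"]:
        print(sample, is_disallowed(sample))
```

Because this record contains no Allow rules, the sketch skips the longest-match precedence that a complete parser needs when Allow and Disallow patterns overlap.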
Other Records
Field | Value |
---|---|
sitemap | https://theweek.com/sitemap.xml |
sitemap | https://theweek.com/uk/sitemap.xml |