space.com
robots.txt
Robots Exclusion Standard data for space.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | space.com |
Base Domain | space.com |
Scan Status | Ok |
Last Scan | 2024-05-11T00:16:48+00:00 |
Next Scan | 2024-05-18T00:16:48+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-11T00:16:48+00:00 |
URL | https://space.com/robots.txt |
Redirect | https://www.space.com/robots.txt |
Redirect Domain | www.space.com |
Redirect Base | space.com |
Domain IPs | 199.232.194.114, 199.232.198.114 |
Redirect IPs | 199.232.194.114, 199.232.198.114 |
Response IP | 146.75.42.114 |
Found | Yes |
Hash | 3477922f49e6953f542a8a0a4e7f944e8a1bf2d88e4215093fd5f335b47a1966 |
SimHash | 6404bcc0af9d |
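The Hash value above is 64 hexadecimal characters, which matches the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below assumes SHA-256 over the raw response body. The function names `robots_hash` and `has_changed` are hypothetical and only illustrate how a scanner might detect that a robots.txt file changed between scans.

```python
import hashlib


def robots_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of the raw robots.txt bytes.

    Assumption: the report's Hash field is SHA-256; the report itself
    does not state which algorithm produced it.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """True when a freshly fetched body differs from the last recorded digest."""
    return robots_hash(body) != previous_hash.lower()
```

An exact digest changes on any byte difference, which is presumably why the report also carries a SimHash, a similarity-preserving fingerprint. How that value is derived is not stated here, so it is not reproduced.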
Groups
*
Rule | Path |
---|---|
Disallow | */deals/compare |
Disallow | */html/ |
Disallow | */p/*/embed/captioned |
Disallow | *searchTerm%3D* |
Disallow | *sortBy%3D* |
Disallow | *productBrand%3D* |
Disallow | *%7B*%7D* |
Disallow | /infinite-scroll-article/* |
Disallow | /infinite-scroll-review/* |
Disallow | /infinite-scroll-recipe/* |
*
Rule | Path |
---|---|
Disallow | /search.php |
Disallow | /social.php |
Disallow | /newsletter-signup |
Disallow | /_proxy* |
*
No rules defined. All paths allowed.
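The rules above can be read as follows: all three groups apply to the `*` user agent, and several paths use `*` as a wildcard. The sketch below is a minimal illustration, assuming the common interpretation in which groups for the same user agent are merged, `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule otherwise matches as a prefix. The names `pattern_to_regex` and `is_disallowed` are hypothetical. No percent-encoding normalization is performed, and because the report lists no Allow rules, longest-match precedence is not implemented.

```python
import re

# Disallow paths copied from the two non-empty "*" groups listed above.
DISALLOW = [
    "*/deals/compare",
    "*/html/",
    "*/p/*/embed/captioned",
    "*searchTerm%3D*",
    "*sortBy%3D*",
    "*productBrand%3D*",
    "*%7B*%7D*",
    "/infinite-scroll-article/*",
    "/infinite-scroll-review/*",
    "/infinite-scroll-recipe/*",
    "/search.php",
    "/social.php",
    "/newsletter-signup",
    "/_proxy*",
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns=DISALLOW):
    """True if any pattern matches from the start of the path (prefix match)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under these assumptions, `is_disallowed("/search.php?q=mars")` and `is_disallowed("/news/deals/compare")` are true, while `is_disallowed("/news/mars-rover")` is false.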
Other Records
Field | Value |
---|---|
sitemap | https://www.space.com/sitemap.xml |
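As an illustration of how the sitemap record could be extracted, the sketch below uses Python's standard `urllib.robotparser` (its `site_maps()` method requires Python 3.8 or later). The input lines are a reconstructed fragment based on the records in this report, not the verbatim file, and the helper name `sitemap_urls` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment: one rule from the groups above plus the sitemap
# record. This is not the verbatim robots.txt served by the site.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /search.php",
    "Sitemap: https://www.space.com/sitemap.xml",
]


def sitemap_urls(lines):
    """Return every Sitemap URL declared in the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() returns None when no Sitemap field is present.
    return parser.site_maps() or []
```

Calling `sitemap_urls(ROBOTS_LINES)` yields a one-element list containing the sitemap URL shown in the table above.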
Comments