theonion.com
robots.txt
Robots Exclusion Standard data for theonion.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | theonion.com |
Base Domain | theonion.com |
Scan Status | Ok |
Last Scan | 2024-05-14T23:27:04+00:00 |
Next Scan | 2024-05-21T23:27:04+00:00 |
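
The Last Scan and Next Scan timestamps are exactly seven days apart, which suggests a weekly rescan. A minimal Python sketch of that arithmetic follows. The 7-day interval is inferred from the two dates, not stated by the scanner, and the names `SCAN_INTERVAL` and `next_scan` are illustrative only.

```python
from datetime import datetime, timedelta

# Assumption: the scanner revisits a domain every 7 days (inferred from the dates).
SCAN_INTERVAL = timedelta(days=7)

def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one scan interval after last_scan_iso."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + SCAN_INTERVAL).isoformat()

print(next_scan("2024-05-14T23:27:04+00:00"))  # prints 2024-05-21T23:27:04+00:00
```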
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-14T23:27:04+00:00 |
URL | https://theonion.com/robots.txt |
Redirect | https://www.theonion.com/robots.txt |
Redirect Domain | www.theonion.com |
Redirect Base | theonion.com |
Domain IPs | 151.101.130.166, 151.101.194.166, 151.101.2.166, 151.101.66.166 |
Redirect IPs | 151.101.130.166, 151.101.194.166, 151.101.2.166, 151.101.66.166 |
Response IP | 151.101.2.166 |
Found | Yes |
Hash | ec0ac4643765c73e1524ae509bf2178687f6de3540358557baeae9a3cea8fd1f |
SimHash | 01155a51ef92 |
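
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Under that assumption, a fetched copy of the file could be compared against the reported value roughly as sketched below. The helper names are hypothetical, and it is not known whether the scanner hashes the raw response bytes or a normalized form, so a mismatch does not by itself prove the file changed.

```python
import hashlib
import string

# Value copied from the Hash row of the scan report.
REPORTED_HASH = "ec0ac4643765c73e1524ae509bf2178687f6de3540358557baeae9a3cea8fd1f"

def sha256_hex(body: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(body).hexdigest()

def looks_like_sha256(value: str) -> bool:
    """True if value has the shape of a hex SHA-256 digest (64 hex characters)."""
    return len(value) == 64 and all(c in string.hexdigits for c in value)

def matches_report(body: bytes) -> bool:
    """Assumption: the report hashes the raw robots.txt bytes with SHA-256."""
    return sha256_hex(body) == REPORTED_HASH
```

To use it, download https://www.theonion.com/robots.txt with any HTTP client and pass the response body to `matches_report`.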
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /stats/ |
Disallow | /api/ |
Disallow | /ajax/ |
Disallow | /embed/ |
Disallow | /setbucket* |
Disallow | /game/score/* |
Disallow | /game/summary/* |
Disallow | /advisor/* |
Allow | /advisor/sitemap.xml |
Disallow | /search$ |
Disallow | /search? |
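
Several of these rules use the `*` wildcard and the `$` end-of-path anchor, and `/advisor/sitemap.xml` is allowed even though `/advisor/*` is disallowed. The Python sketch below shows one way to evaluate the table, assuming the matching behavior described in RFC 9309: the longest matching pattern wins, and Allow wins a tie. It is an illustration, not the scanner's or any crawler's actual implementation, and `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative names.

```python
import re

# Rules copied from the "*" group above, in file order.
RULES = [
    ("disallow", "/stats/"),
    ("disallow", "/api/"),
    ("disallow", "/ajax/"),
    ("disallow", "/embed/"),
    ("disallow", "/setbucket*"),
    ("disallow", "/game/score/*"),
    ("disallow", "/game/summary/*"),
    ("disallow", "/advisor/*"),
    ("allow", "/advisor/sitemap.xml"),
    ("disallow", "/search$"),
    ("disallow", "/search?"),
]

def pattern_to_regex(path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into an anchored regular expression."""
    anchored = path.endswith("$")  # a trailing "$" pins the match to the end
    if anchored:
        path = path[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(url_path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; on equal length, Allow beats Disallow."""
    best = None  # (pattern length, is_allow)
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            candidate = (len(path), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

print(is_allowed("/advisor/sitemap.xml"))  # True: the Allow pattern is longer
print(is_allowed("/search?q=news"))        # False: matches /search?
print(is_allowed("/searching"))            # True: neither /search$ nor /search? matches
```

The path passed to `is_allowed` should include the query string when there is one, since `/search?` only matches when the `?` is present.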
Other Records
Field | Value |
---|---|
sitemap | https://www.theonion.com/sitemap.xml |
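
For reference, the records above can be written back out as a robots.txt file and the sitemap entry read with Python's standard `urllib.robotparser` module. The text below is a reconstruction from the tables, not the original file: field capitalization, blank lines, and comments are assumptions, so its hash will not necessarily equal the one reported. `RobotFileParser.site_maps()` requires Python 3.8 or later, and `sitemap_urls` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the scan tables; the real file may differ in layout.
RECONSTRUCTED_ROBOTS_TXT = """\
User-agent: *
Disallow: /stats/
Disallow: /api/
Disallow: /ajax/
Disallow: /embed/
Disallow: /setbucket*
Disallow: /game/score/*
Disallow: /game/summary/*
Disallow: /advisor/*
Allow: /advisor/sitemap.xml
Disallow: /search$
Disallow: /search?

Sitemap: https://www.theonion.com/sitemap.xml
"""

def sitemap_urls(robots_text: str) -> list:
    """Return the Sitemap URLs declared in a robots.txt document."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap field is present.
    return parser.site_maps() or []

print(sitemap_urls(RECONSTRUCTED_ROBOTS_TXT))
```

The standard-library parser is used here only to read the sitemap field; it is not relied on for the wildcard rules.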