similarweb.com
robots.txt
Robots Exclusion Standard data for similarweb.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | similarweb.com |
Base Domain | similarweb.com |
Scan Status | Ok |
Last Scan | 2024-11-14T00:28:10+00:00 |
Next Scan | 2024-12-14T00:28:10+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-14T00:28:10+00:00 |
URL | https://similarweb.com/robots.txt |
Redirect | https://www.similarweb.com:443/robots.txt |
Redirect Domain | www.similarweb.com |
Redirect Base | similarweb.com |
Domain IPs | 52.38.194.239, 54.70.231.135 |
Redirect IPs | 23.209.46.69, 23.209.46.97 |
Response IP | 23.45.207.174 |
Found | Yes |
Hash | 5f1b5626821aaf0e909e197ab1ff2384e447b2c5f12b374674ef43ec75a8c216 |
SimHash | 0d174c2ea572 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | */search/ |
Disallow | */adult/* |
Disallow | /corp/*.pdf$ |
Disallow | /corp/solution/ |
Disallow | /corp/lps/ |
Disallow | /corp/get-data/ |
Disallow | /corp/unlock-growth/ |
Disallow | /silent-login/ |
Disallow | /signin-oidc/ |
Disallow | /signout-oidc/ |
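
The paths in the table above use two wildcards described in RFC 9309: `*` stands for any run of characters, and a trailing `$` anchors the match at the end of the path. The following is a minimal Python sketch of how such patterns could be evaluated, assuming matching starts at the beginning of the URL path and that no Allow rules need to be weighed (the group lists none). The names `pattern_to_regex` and `is_disallowed` are illustrative only; they are not part of any library or of the scanning tool.

```python
import re

# Disallow patterns copied from the "*" group in the table above.
DISALLOW = [
    "*/search/",
    "*/adult/*",
    "/corp/*.pdf$",
    "/corp/solution/",
    "/corp/lps/",
    "/corp/get-data/",
    "/corp/unlock-growth/",
    "/silent-login/",
    "/signin-oidc/",
    "/signout-oidc/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (hypothetical helper)."""
    anchored = pattern.endswith("$")            # trailing "$" pins the end of the path
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" where the pattern had "*".
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule.match(path) for rule in RULES)


print(is_disallowed("/corp/report.pdf"))      # blocked by /corp/*.pdf$
print(is_disallowed("/corp/report.pdf?x=1"))  # "$" no longer matches the end
print(is_disallowed("/blog/"))                # no rule applies
```

Under these assumptions, `/corp/*.pdf$` blocks only paths that end in `.pdf`, while patterns without `$` behave as prefixes, so `/corp/solution/` also covers everything beneath it.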
Other Records
Field | Value |
---|---|
sitemap | https://www.similarweb.com/corp/sitemap_index.xml |
sitemap | https://www.similarweb.com/blog/sitemap_index.xml |
sitemap | https://www.similarweb.com/sitemaps/sitemap_index.xml.gz |
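
The sitemap records above can be read back out of a robots.txt file with Python's standard `urllib.robotparser`. The sketch below works on an abridged, reconstructed file body: the exact layout, ordering, and capitalization of the real file are assumptions, since the report shows only the parsed fields. The helper name `list_sitemaps` is hypothetical. The standard-library parser is used here only for the sitemap records, because it matches rule paths as plain prefixes rather than expanding the `*` and `$` wildcards.

```python
from urllib.robotparser import RobotFileParser

# Abridged reconstruction of the file body (assumed layout, not the verbatim file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /silent-login/

Sitemap: https://www.similarweb.com/corp/sitemap_index.xml
Sitemap: https://www.similarweb.com/blog/sitemap_index.xml
Sitemap: https://www.similarweb.com/sitemaps/sitemap_index.xml.gz
"""


def list_sitemaps(robots_txt: str) -> list[str]:
    """Return the Sitemap URLs declared in a robots.txt body (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


SITEMAPS = list_sitemaps(ROBOTS_TXT)
# A ".gz" suffix suggests the index is gzip-compressed and needs decompressing first.
GZIPPED = [url for url in SITEMAPS if url.endswith(".gz")]

for url in SITEMAPS:
    print(url)
```

Parsing from a string keeps the example offline; fetching the live file would go through the redirect to `www.similarweb.com` noted in the scan details.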
Comments