similarweb.com
robots.txt
Robots Exclusion Standard data for similarweb.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | similarweb.com |
Base Domain | similarweb.com |
Scan Status | Ok |
Last Scan | 2024-04-18T00:11:22+00:00 |
Next Scan | 2024-05-18T00:11:22+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-04-18T00:11:22+00:00 |
URL | https://similarweb.com/robots.txt |
Redirect | https://www.similarweb.com:443/robots.txt |
Redirect Domain | www.similarweb.com |
Redirect Base | similarweb.com |
Domain IPs | 35.81.94.158, 52.27.240.80 |
Redirect IPs | 184.87.193.74, 184.87.193.77 |
Response IP | 184.27.123.51 |
Found | Yes |
Hash | 5f1b5626821aaf0e909e197ab1ff2384e447b2c5f12b374674ef43ec75a8c216 |
SimHash | 0d174c2ea572 |
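The redirect fields above can be derived from the two URLs alone. The following is a minimal sketch, not the scanner's actual method: `naive_base_domain` and `describe_redirect` are hypothetical helpers, and taking the last two host labels as the base domain is an assumption that holds for `similarweb.com` but fails for suffixes such as `co.uk`.

```python
from urllib.parse import urlsplit

# URLs copied from the scan table.
REQUESTED = "https://similarweb.com/robots.txt"
REDIRECTED = "https://www.similarweb.com:443/robots.txt"


def naive_base_domain(host: str) -> str:
    """Return the last two labels of a host name.

    Assumption: good enough for a .com host; a real scanner would
    consult the Public Suffix List instead.
    """
    return ".".join(host.split(".")[-2:])


def describe_redirect(source: str, target: str) -> dict:
    """Summarise how a redirect target relates to the requested URL."""
    src, dst = urlsplit(source), urlsplit(target)
    return {
        "redirect_domain": dst.hostname,
        "redirect_port": dst.port,  # None when the URL has no explicit port
        "redirect_base": naive_base_domain(dst.hostname),
        "same_base": naive_base_domain(src.hostname)
        == naive_base_domain(dst.hostname),
        "same_path": src.path == dst.path,
    }


info = describe_redirect(REQUESTED, REDIRECTED)
```

Under these assumptions the redirect only adds the `www` host label and an explicit port 443; the base domain and the `/robots.txt` path are unchanged.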
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | */search/ |
Disallow | */adult/* |
Disallow | /corp/*.pdf$ |
Disallow | /corp/solution/ |
Disallow | /corp/lps/ |
Disallow | /corp/get-data/ |
Disallow | /corp/unlock-growth/ |
Disallow | /silent-login/ |
Disallow | /signin-oidc/ |
Disallow | /signout-oidc/ |
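Several of these rules rely on the `*` wildcard and the `$` end anchor, which Python's standard `urllib.robotparser` treats as literal prefix characters as far as I can tell. The sketch below shows one way to evaluate the rules under the common interpretation (`*` matches any run of characters, a trailing `$` anchors the end of the URL path, and every rule matches from the start of the path). The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and the sample paths are made up for illustration.

```python
import re

# Disallow patterns copied from the "*" group in the table above.
DISALLOW_PATTERNS = [
    "*/search/",
    "*/adult/*",
    "/corp/*.pdf$",
    "/corp/solution/",
    "/corp/lps/",
    "/corp/get-data/",
    "/corp/unlock-growth/",
    "/silent-login/",
    "/signin-oidc/",
    "/signout-oidc/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if any Disallow pattern matches the path.

    The group has no Allow rules, so no longest-match
    tie-breaking is needed in this sketch.
    """
    return any(pattern_to_regex(p).match(path) for p in DISALLOW_PATTERNS)


# Hypothetical paths, chosen only to exercise the rules.
samples = ["/corp/report.pdf", "/corp/report.pdf?x=1", "/website/search/foo", "/blog/"]
results = {path: is_disallowed(path) for path in samples}
```

Note that `/corp/*.pdf$` blocks only paths that end in `.pdf`, so the same path with a query string appended is not matched by that rule.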
Other Records
Field | Value |
---|---|
sitemap | https://www.similarweb.com/corp/sitemap_index.xml |
sitemap | https://www.similarweb.com/blog/sitemap_index.xml |
sitemap | https://www.similarweb.com/sitemaps/sitemap_index.xml.gz |
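The sitemap records can be read out of a robots.txt body with the standard library. This is a minimal sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available). `ROBOTS_EXCERPT` is a reconstruction from the tables on this page, not the verbatim file, and `list_sitemaps` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed excerpt, not the exact file contents.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /silent-login/

Sitemap: https://www.similarweb.com/corp/sitemap_index.xml
Sitemap: https://www.similarweb.com/blog/sitemap_index.xml
Sitemap: https://www.similarweb.com/sitemaps/sitemap_index.xml.gz
"""


def list_sitemaps(robots_text: str) -> list:
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


sitemaps = list_sitemaps(ROBOTS_EXCERPT)
# The .gz suffix suggests a gzip-compressed index that needs
# decompressing before it can be parsed as XML.
compressed = [url for url in sitemaps if url.endswith(".gz")]
```

Sitemap records are independent of the user-agent groups, so they apply regardless of which group a crawler matches.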