gw.lightinthebox.com
robots.txt
Robots Exclusion Standard data for gw.lightinthebox.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | gw.lightinthebox.com |
Base Domain | lightinthebox.com |
Scan Status | Ok |
Last Scan | 2024-11-12T12:52:08+00:00 |
Next Scan | 2024-11-26T12:52:08+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-12T12:52:08+00:00 |
URL | https://gw.lightinthebox.com/robots.txt |
Domain IPs | 96.17.96.26, 96.17.96.30 |
Response IP | 104.81.138.81 |
Found | Yes |
Hash | 3e6c1bef5f1050b8ad92bed559aa72051811e897eba1cdffd13fa0687cc9dd20 |
SimHash | d37197fec631 |
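
The Hash value is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. A minimal sketch of how a fetched copy of the file could be compared against the reported value, assuming SHA-256 over the raw response body (the function names `content_hash` and `matches_report` are hypothetical):

```python
import hashlib

# Hash value as shown in the Last Scan table.
REPORTED_HASH = "3e6c1bef5f1050b8ad92bed559aa72051811e897eba1cdffd13fa0687cc9dd20"


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of the raw robots.txt bytes.

    Assumption: the report hashes the unmodified response body with SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """True if the given bytes hash to the value listed in the report."""
    return content_hash(body) == REPORTED_HASH
```

A mismatch would not necessarily mean the rules changed; any byte-level difference (whitespace, line endings, comments) changes the digest.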
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /cache/ |
Disallow | /api/ |
Disallow | /plugins/ |
Disallow | /newproducttags/ |
Disallow | /ns/ |
Disallow | /*/ns/ |
Disallow | /narrow/ |
Disallow | /n/ |
Disallow | /*/n/ |
Disallow | /index.php?main_page=login* |
Disallow | /*/index.php?main_page=login* |
Disallow | /index.php?main_page=shopping_cart* |
Disallow | /*/index.php?main_page=shopping_cart* |
Disallow | /index.php?main_page=shopping_cart_add* |
Disallow | /*/index.php?main_page=shopping_cart_add* |
Allow | /*%26litb_from%3Dpaid_adwords_shopping |
Allow | /*%26litb_from%3Dbing_shopping |
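
Several of the rules above use `*` wildcards, and the Allow rules overlap with Disallow rules, so the outcome for a given path depends on precedence. The sketch below evaluates a path against the table using the longest-match rule from RFC 9309 (the most specific pattern wins, and Allow wins a tie). It is an illustration, not the crawler logic of any particular search engine: it compares paths as given, without percent-encoding normalization, and the names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical.

```python
import re

# Rules copied from the "*" group in the table above.
RULES = [
    ("disallow", "/cache/"),
    ("disallow", "/api/"),
    ("disallow", "/plugins/"),
    ("disallow", "/newproducttags/"),
    ("disallow", "/ns/"),
    ("disallow", "/*/ns/"),
    ("disallow", "/narrow/"),
    ("disallow", "/n/"),
    ("disallow", "/*/n/"),
    ("disallow", "/index.php?main_page=login*"),
    ("disallow", "/*/index.php?main_page=login*"),
    ("disallow", "/index.php?main_page=shopping_cart*"),
    ("disallow", "/*/index.php?main_page=shopping_cart*"),
    ("disallow", "/index.php?main_page=shopping_cart_add*"),
    ("disallow", "/*/index.php?main_page=shopping_cart_add*"),
    ("allow", "/*%26litb_from%3Dpaid_adwords_shopping"),
    ("allow", "/*%26litb_from%3Dbing_shopping"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; a path with no matching rule is allowed."""
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed
```

Under these assumptions, `/n/shoes` is blocked by `/n/`, while `/n/shoes%26litb_from%3Dbing_shopping` is allowed because the Allow pattern is longer than `/n/`. The Allow rules do not override every Disallow: a path that matches a longer Disallow pattern, such as one of the `index.php?main_page=shopping_cart*` rules, stays blocked.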
Other Records
Field | Value |
---|---|
sitemap | https://www.lightinthebox.com/sitemap.xml |
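
For reference, the parsed groups and records in this report can be rendered back into robots.txt syntax. The sketch below is a reconstruction from the tables only: the original file's comments, blank lines, field capitalization, and line endings are not shown in the report, so the output is not expected to reproduce the reported Hash. The names `GROUPS`, `OTHER_RECORDS`, and `render_robots` are hypothetical.

```python
# Parsed data as listed in the Groups and Other Records tables.
GROUPS = {
    "*": [
        ("Disallow", "/cache/"),
        ("Disallow", "/api/"),
        ("Disallow", "/plugins/"),
        ("Disallow", "/newproducttags/"),
        ("Disallow", "/ns/"),
        ("Disallow", "/*/ns/"),
        ("Disallow", "/narrow/"),
        ("Disallow", "/n/"),
        ("Disallow", "/*/n/"),
        ("Disallow", "/index.php?main_page=login*"),
        ("Disallow", "/*/index.php?main_page=login*"),
        ("Disallow", "/index.php?main_page=shopping_cart*"),
        ("Disallow", "/*/index.php?main_page=shopping_cart*"),
        ("Disallow", "/index.php?main_page=shopping_cart_add*"),
        ("Disallow", "/*/index.php?main_page=shopping_cart_add*"),
        ("Allow", "/*%26litb_from%3Dpaid_adwords_shopping"),
        ("Allow", "/*%26litb_from%3Dbing_shopping"),
    ],
}
OTHER_RECORDS = [("Sitemap", "https://www.lightinthebox.com/sitemap.xml")]


def render_robots(groups, other_records) -> str:
    """Render groups and non-group records as robots.txt text."""
    lines = []
    for agent, rules in groups.items():
        lines.append(f"User-agent: {agent}")
        lines.extend(f"{rule}: {path}" for rule, path in rules)
        lines.append("")  # blank line closes the group
    lines.extend(f"{field}: {value}" for field, value in other_records)
    return "\n".join(lines) + "\n"
```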