jaanuu.com
robots.txt

Robots Exclusion Standard data for jaanuu.com

Resource Scan

Scan Details

Site Domain jaanuu.com
Base Domain jaanuu.com
Scan Status Ok
Last Scan 2024-09-27T17:27:32+00:00
Next Scan 2024-10-27T17:27:32+00:00

Last Scan

Scanned 2024-09-27T17:27:32+00:00
URL https://jaanuu.com/robots.txt
Redirect https://www.jaanuu.com:443/robots.txt
Redirect Domain www.jaanuu.com
Redirect Base jaanuu.com
Domain IPs 18.155.68.120, 18.155.68.26, 18.155.68.36, 18.155.68.72
Redirect IPs 13.35.186.14, 13.35.186.15, 13.35.186.81, 13.35.186.85
Response IP 18.155.68.72
Found Yes
Hash 93cca915298023023fefb6c2348a5efc2e4c12d44c9d189392dbb33bb83a4df3
SimHash ea812c8f79d0
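
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. This page does not name the algorithm, so the sketch below only assumes SHA-256 over the raw response body; the function name body_fingerprint is hypothetical, and the sketch is not claimed to reproduce the scanner's value. It shows how a stored digest could be compared between scans to detect a changed robots.txt.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body.

    Assumption: SHA-256 over the unmodified bytes. The scan report
    does not state which algorithm or normalization it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return body_fingerprint(body) != previous_digest.lower()
```

An exact digest changes on any single-byte edit, which is presumably why the report also lists a SimHash for near-duplicate comparison.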

Groups

*

Rule      Path                      Comment
Disallow  /reports_new/             disable folder
Disallow  /products/review_ajax     entire route
Disallow  /*?sizes%5B%5D=*          any URL contains sizes param
Disallow  /*%26sizes%5B%5D%3D*      any URL contains sizes param
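
Two of the rules above use the * wildcard, which Python's standard urllib.robotparser does not interpret. The following is a minimal sketch of wildcard matching for these four paths, assuming the common crawler convention that * matches any run of characters and a trailing $ anchors the end. The names pattern_to_regex and is_disallowed are hypothetical, and the comparison is literal: it does no percent-encoding normalization, which real crawlers may perform.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/reports_new/",
    "/products/review_ajax",
    "/*?sizes%5B%5D=*",
    "/*%26sizes%5B%5D%3D*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the end; otherwise the rule is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and rejoin them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path_and_query: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    return any(rule.match(path_and_query) for rule in RULES)
```

Under this literal comparison, a URL such as /scrubs?color=blue&sizes%5B%5D=M matches neither sizes pattern, because the third rule expects the parameter immediately after ? and the fourth expects a percent-encoded ampersand (%26).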

Other Records

Field Value
sitemap http://www.jaanuu.com/sitemap.xml.gz
sitemap https://www.jaanuu.com/blog/sitemap.xml

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *
  • Disallow: /
  • Disallow: /sales/