in.canon
robots.txt

Robots Exclusion Standard data for in.canon

Resource Scan

Scan Details

Site Domain in.canon
Base Domain in.canon
Scan Status Ok
Last Scan 2024-10-31T19:35:04+00:00
Next Scan 2024-11-30T19:35:04+00:00

Last Scan

Scanned 2024-10-31T19:35:04+00:00
URL https://in.canon/robots.txt
Domain IPs 13.33.88.13, 13.33.88.20, 13.33.88.32, 13.33.88.40
Response IP 13.33.88.40
Found Yes
Hash e411abd942afa00bb8b17c50de885722bbdfa52879d364192ddd4ce36b242840
SimHash 3a589587e98b

Groups

*

Rule Path
Disallow *sort%3Daz*
Disallow *sort%3Dza*
Disallow *sort%3Dnewest*
Disallow *sort%3Doldest*
Disallow *sort%3DhighestPrice*
Disallow *sort%3DlowestPrice*
Disallow */business/search?q=*
Disallow */consumer/search?q=*
Disallow */support/search?q=*
Disallow */support/get-search-result-content*
Disallow */support/download?*
Disallow */admin/*

Other Records

Field Value
crawl-delay 30
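
The Disallow paths in the `*` group rely on the `*` wildcard. As a rough illustration of how such patterns are commonly interpreted (`*` matches any run of characters, a trailing `$` anchors the end of the path), here is a minimal Python sketch. The function names `pattern_to_regex` and `is_disallowed` and the sample paths are hypothetical, and the sketch compares strings literally without any percent-encoding normalization, so real crawlers may behave differently.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW_PATTERNS = [
    "*sort%3Daz*",
    "*sort%3Dza*",
    "*sort%3Dnewest*",
    "*sort%3Doldest*",
    "*sort%3DhighestPrice*",
    "*sort%3DlowestPrice*",
    "*/business/search?q=*",
    "*/consumer/search?q=*",
    "*/support/search?q=*",
    "*/support/get-search-result-content*",
    "*/support/download?*",
    "*/admin/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    # A trailing "$" means the pattern must match up to the end of the path.
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape every literal piece and join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))


def is_disallowed(path: str, patterns=DISALLOW_PATTERNS) -> bool:
    """Return True if any Disallow pattern matches the path (plus query string)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under these assumptions, a hypothetical path such as `/en/consumer/search?q=camera` would be treated as disallowed, while `/products/cameras` would not. Because matching is literal, `sort=az` is not caught by the `sort%3Daz` pattern in this sketch.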

semrushbot

Rule Path
Disallow /

siteauditbot

Rule Path
Disallow /

semrushbot-ba

Rule Path
Disallow /

semrushbot-si

Rule Path
Disallow /

semrushbot-swa

Rule Path
Disallow /

semrushbot-ct

Rule Path
Disallow /

semrushbot-bm

Rule Path
Disallow /

splitsignalbot

Rule Path
Disallow /

semrushbot-coub

Rule Path
Disallow /

go-http-client

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /
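
The groups above split crawlers into two sets: the named agents are blocked from the whole site with `Disallow /`, and every other crawler falls back to the `*` group. A minimal Python sketch of that selection follows, assuming a case-insensitive exact match on the crawler's product token. The names `FULLY_BLOCKED_AGENTS` and `select_group` are hypothetical, and extracting the token from a full User-Agent header is left out.

```python
# Agent names copied from the groups listed above; each has "Disallow /".
FULLY_BLOCKED_AGENTS = {
    "semrushbot",
    "siteauditbot",
    "semrushbot-ba",
    "semrushbot-si",
    "semrushbot-swa",
    "semrushbot-ct",
    "semrushbot-bm",
    "splitsignalbot",
    "semrushbot-coub",
    "go-http-client",
    "mj12bot",
    "yandex",
    "yandexbot",
    "dataforseobot",
}


def select_group(product_token: str) -> str:
    """Return the name of the group that applies to a crawler's product token.

    Assumption: tokens are compared case-insensitively and must match a group
    name exactly; anything else falls back to the "*" group.
    """
    token = product_token.lower()
    return token if token in FULLY_BLOCKED_AGENTS else "*"


def is_fully_blocked(product_token: str) -> bool:
    """True when the crawler's own group disallows the entire site."""
    return select_group(product_token) != "*"
```

For example, `select_group("SemrushBot-BA")` returns `"semrushbot-ba"`, whereas a crawler not listed here, such as a hypothetical `ExampleBot`, gets `"*"` and is subject only to the wildcard rules and the crawl-delay value.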