th.canon
robots.txt

Robots Exclusion Standard data for th.canon

Resource Scan

Scan Details

Site Domain th.canon
Base Domain th.canon
Scan Status Ok
Last Scan 2024-10-29T04:10:56+00:00
Next Scan 2024-11-28T04:10:56+00:00

Last Scan

Scanned 2024-10-29T04:10:56+00:00
URL https://th.canon/robots.txt
Domain IPs 13.33.88.101, 13.33.88.108, 13.33.88.110, 13.33.88.128
Response IP 13.33.88.128
Found Yes
Hash e411abd942afa00bb8b17c50de885722bbdfa52879d364192ddd4ce36b242840
SimHash 3a589587e98b

Groups

*

Rule Path
Disallow *sort%3Daz*
Disallow *sort%3Dza*
Disallow *sort%3Dnewest*
Disallow *sort%3Doldest*
Disallow *sort%3DhighestPrice*
Disallow *sort%3DlowestPrice*
Disallow */business/search?q=*
Disallow */consumer/search?q=*
Disallow */support/search?q=*
Disallow */support/get-search-result-content*
Disallow */support/download?*
Disallow */admin/*

Other Records

Field Value
crawl-delay 30
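
The Disallow paths in the * group rely on the * wildcard, which Python's standard urllib.robotparser does not interpret. The following is a minimal sketch of how such patterns could be checked against a URL path and query, assuming the common convention that * matches any run of characters and a trailing $ anchors the end. The helper names pattern_to_regex and is_disallowed are hypothetical, and the comparison is literal, so a raw "sort=az" is not treated as equivalent to the encoded "sort%3Daz" listed above.

```python
import re

# Disallow paths copied from the "*" group above.
DISALLOW_PATTERNS = [
    "*sort%3Daz*",
    "*sort%3Dza*",
    "*sort%3Dnewest*",
    "*sort%3Doldest*",
    "*sort%3DhighestPrice*",
    "*sort%3DlowestPrice*",
    "*/business/search?q=*",
    "*/consumer/search?q=*",
    "*/support/search?q=*",
    "*/support/get-search-result-content*",
    "*/support/download?*",
    "*/admin/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing '$'
    anchors the match at the end; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


COMPILED = [pattern_to_regex(p) for p in DISALLOW_PATTERNS]


def is_disallowed(path_and_query: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rx.match(path_and_query) for rx in COMPILED)


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the patterns.
    for candidate in ("/en/admin/login", "/en/products", "/en/support/download?id=1"):
        print(candidate, is_disallowed(candidate))
```

Because every pattern here both starts and ends with *, each one behaves as a substring test on the path and query; the regex translation only matters if anchored or mid-pattern wildcards are added later.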

semrushbot

Rule Path
Disallow /

siteauditbot

Rule Path
Disallow /

semrushbot-ba

Rule Path
Disallow /

semrushbot-si

Rule Path
Disallow /

semrushbot-swa

Rule Path
Disallow /

semrushbot-ct

Rule Path
Disallow /

semrushbot-bm

Rule Path
Disallow /

splitsignalbot

Rule Path
Disallow /

semrushbot-coub

Rule Path
Disallow /

go-http-client

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /
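
The groups above split crawlers into two cases: named agents that are blocked outright with "Disallow /", and everything else, which falls back to the * group. A rough sketch of that selection follows, assuming a crawler is identified by its product token and matched case-insensitively against the group names. The names group_for, fully_blocked, and delay_for are hypothetical, and reading the crawl-delay value of 30 as seconds follows common convention rather than anything stated in the scan.

```python
from typing import Optional

# Agent names copied from the groups above; each has a single "Disallow /" rule.
BLOCKED_AGENTS = {
    "semrushbot",
    "siteauditbot",
    "semrushbot-ba",
    "semrushbot-si",
    "semrushbot-swa",
    "semrushbot-ct",
    "semrushbot-bm",
    "splitsignalbot",
    "semrushbot-coub",
    "go-http-client",
    "mj12bot",
    "yandex",
    "yandexbot",
    "dataforseobot",
}

# crawl-delay recorded for the "*" group (assumed to be seconds).
WILDCARD_CRAWL_DELAY = 30


def group_for(product_token: str) -> str:
    """Return the group name that applies to a crawler's product token.

    Assumption: exact, case-insensitive match on the token; anything not
    listed falls back to the "*" group.
    """
    token = product_token.strip().lower()
    return token if token in BLOCKED_AGENTS else "*"


def fully_blocked(product_token: str) -> bool:
    """True when the crawler lands in one of the 'Disallow /' groups."""
    return group_for(product_token) != "*"


def delay_for(product_token: str) -> Optional[int]:
    """Crawl delay for crawlers in the "*" group; None for blocked agents,
    whose groups record no delay."""
    return None if fully_blocked(product_token) else WILDCARD_CRAWL_DELAY


if __name__ == "__main__":
    # Hypothetical tokens, used only to exercise the lookup.
    for name in ("MJ12bot", "Googlebot", "SemrushBot-BA"):
        print(name, group_for(name), fully_blocked(name), delay_for(name))
```

Real crawlers differ in how they pick a group and in whether they honor crawl-delay at all, so this lookup describes the intent of the file rather than the behavior of any particular bot.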