clustrmaps.com
robots.txt

Robots Exclusion Standard data for clustrmaps.com

Resource Scan

Scan Details

Site Domain clustrmaps.com
Base Domain clustrmaps.com
Scan Status Ok
Last Scan 2024-09-15T05:27:01+00:00
Next Scan 2024-09-22T05:27:01+00:00

Last Scan

Scanned 2024-09-15T05:27:01+00:00
URL https://clustrmaps.com/robots.txt
Domain IPs 104.22.72.194, 104.22.73.194, 172.67.43.119, 2606:4700:10::6816:48c2, 2606:4700:10::6816:49c2, 2606:4700:10::ac43:2b77
Response IP 172.67.43.119
Found Yes
Hash edece5cc41b134bfd9959de9c319cff90bac6c1c7b554e24a91555aeef5d38da
SimHash 0b0a80203fb9

Groups

*

Rule Path
Disallow /website_directory
Disallow /map_v2.png
Disallow /map_v3.png
Disallow /a/hm/
Disallow /a/jx/
Disallow /bl/tools/r
Disallow /bl/tools/bv
Disallow /bl/opt-out
Disallow /c/
Disallow /persons/i/
Disallow /bv/
Disallow /details/

mj12bot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

garlikcrawler/1.1 (http://garlik.com/, crawler@garlik.com)

Rule Path
Disallow /

linguee

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /
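The groups above can be checked programmatically. The following is a minimal sketch, assuming the rules are written back out in standard robots.txt syntax; the `ROBOTS_TXT` string is a reconstruction from this report (showing only two of the full-block groups), not the file as served, and `ExampleBot` and the `allowed` helper are hypothetical names used for illustration. It uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report (assumption: the
# served file uses the usual "User-agent:" / "Disallow:" line syntax).
ROBOTS_TXT = """\
User-agent: *
Disallow: /website_directory
Disallow: /map_v2.png
Disallow: /map_v3.png
Disallow: /a/hm/
Disallow: /a/jx/
Disallow: /bl/tools/r
Disallow: /bl/tools/bv
Disallow: /bl/opt-out
Disallow: /c/
Disallow: /persons/i/
Disallow: /bv/
Disallow: /details/

User-agent: mj12bot
Disallow: /

User-agent: ahrefsbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` on this site."""
    return parser.can_fetch(agent, "https://clustrmaps.com" + path)


# A crawler with no dedicated group falls under the "*" rules.
print(allowed("ExampleBot", "/"))             # home page is not disallowed
print(allowed("ExampleBot", "/details/abc"))  # blocked by "Disallow /details/"
# Crawlers with their own group are blocked from the whole site.
print(allowed("MJ12bot", "/"))
```

A crawler named in its own group is governed by that group only, which is why the bots listed with `Disallow /` are shut out entirely while other crawlers are restricted only on the specific paths in the `*` group.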

Warnings

  • `host` is not a known field.
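
A warning of this kind typically comes from a line whose field name is outside the set a parser recognises. The sketch below shows one way such a check could work. It is an assumption about the general technique, not the scanner's actual implementation: the `KNOWN_FIELDS` set, the `unknown_field_warnings` function, and the sample `Host: example.com` line are all hypothetical.

```python
# Assumed set of recognised field names; the scanner's real list is not
# given in this report.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_field_warnings(text: str) -> list[str]:
    """Return one warning per distinct unrecognised field name."""
    warnings = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Field names are compared case-insensitively.
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            message = f"`{field}` is not a known field."
            if message not in warnings:
                warnings.append(message)
    return warnings


# Hypothetical input containing a `Host` line.
sample = "User-agent: *\nDisallow: /c/\nHost: example.com\n"
print(unknown_field_warnings(sample))
```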