freetheocean.com
robots.txt

Robots Exclusion Standard data for freetheocean.com

Resource Scan

Scan Details

Site Domain freetheocean.com
Base Domain freetheocean.com
Scan Status Ok
Last Scan 2024-11-12T18:05:34+00:00
Next Scan 2024-11-19T18:05:34+00:00

Last Scan

Scanned 2024-11-12T18:05:34+00:00
URL https://freetheocean.com/robots.txt
Redirect https://www.freetheocean.com/robots.txt
Redirect Domain www.freetheocean.com
Redirect Base freetheocean.com
Domain IPs 23.185.0.3, 2620:12a:8000::3, 2620:12a:8001::3
Redirect IPs 23.185.0.3, 2620:12a:8000::3, 2620:12a:8001::3
Response IP 23.185.0.3
Found Yes
Hash 39e9970af95f6f182d06a04a9df303bd8c41279f72c4563f929703030b7e8a29
SimHash 2276dd40c68a

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /tmp/
Disallow /junk/

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

megaindex

Rule Path
Disallow /

spbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

linkfluence

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

exabot

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

proximic

Rule Path
Disallow /

sogou

Rule Path
Disallow /

seekport

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

twengabot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

zumbot

Rule Path
Disallow /

wget

Rule Path
Disallow /

httrack

Rule Path
Disallow /

wget

Rule Path
Disallow /

curl

Rule Path
Disallow /

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
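
To see how the groups and the crawl-delay record listed above would affect a crawler, the following is a minimal sketch using Python's standard urllib.robotparser module. The ROBOTS_TXT string is an abbreviated reconstruction from this report, not the site's actual file: the directive layout, the placement of crawl-delay inside the first * group, and the agent name ExampleBot are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the rules in this report. ASSUMPTIONS: the real
# file's layout and casing are not shown in the scan, only two of the blocked
# agents are included, and crawl-delay is placed in the first "*" group here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Crawl-delay: 10

User-agent: ahrefsbot
Disallow: /

User-agent: wget
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A named agent with "Disallow: /" is blocked from every path.
print(parser.can_fetch("ahrefsbot", "https://www.freetheocean.com/"))

# "ExampleBot" is a hypothetical agent with no group of its own,
# so it falls back to the "*" group.
print(parser.can_fetch("ExampleBot", "https://www.freetheocean.com/about"))
print(parser.can_fetch("ExampleBot", "https://www.freetheocean.com/tmp/x"))

# Crawl delay, in seconds, that applies to the hypothetical agent.
print(parser.crawl_delay("ExampleBot"))
```

Under these assumptions, the named agent is refused everywhere, while an unlisted agent is only kept out of /cgi-bin/, /tmp/ and /junk/ and is asked to wait 10 seconds between requests. Other parsers may treat a repeated * group or a standalone crawl-delay record differently.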

Comments

  • Block Bad Bots
  • Block all user agents trying to access the site too frequently