Robots Exclusion Standard data for 5dca.org

Resource Scan

Scan Details

Site Domain 5dca.org
Base Domain 5dca.org
Scan Status Ok
Last Scan 2024-05-22T15:12:43+00:00
Next Scan 2024-06-21T15:12:43+00:00

Last Scan

Scanned 2024-05-22T15:12:43+00:00
URL https://5dca.org/robots.txt
Redirect https://5dca.flcourts.gov/robots.txt
Redirect Domain 5dca.flcourts.gov
Redirect Base flcourts.gov
Domain IPs 52.71.164.101
Redirect IPs 18.244.51.100, 18.244.51.49, 18.244.51.78, 18.244.51.89
Response IP 13.227.254.117
Found Yes
Hash 82c5c1857991a287b20cd1f2e3c323ca9a93233f4fdbb05859beab4e29ce569b
SimHash 6811d1c1c3f5
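
The Hash field is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched robots.txt body, although the scan does not name the algorithm. As a rough sketch under that assumption, a change detector could fingerprint each fetch as below; the function name content_fingerprint is hypothetical and not part of the scan tool.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex SHA-256 digest of a fetched robots.txt body.

    Assumption: the scan's Hash field is SHA-256 over the raw response
    bytes. The report itself does not confirm the algorithm or any
    normalisation applied before hashing.
    """
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    sample = b"User-agent: *\nDisallow: /\n"
    print(content_fingerprint(sample))
```

Comparing the digest from one scan with the digest from the next is enough to tell whether the file changed at all; it says nothing about how much it changed, which is presumably what the SimHash field is for.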

Groups

*

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 5

googlebot
bingbot
bingpreview
msnbot
slurp
duckduckbot
applebot
ia_archiver
facebookexternalhit
twitterbot
linkedinbot

Rule Path Comment
Allow / -
Disallow /admin/ -
Disallow /site_admin/ -
Disallow /search -
Disallow /search/ -
Disallow /content/search -
Disallow /content/advancedsearch -
Disallow /content/tipafriend -
Disallow /layout/set/print -
Disallow /rss -
Disallow /media/ -
Disallow /ezinfo/ -
Disallow /user/ -
Disallow /test-area/ -
Disallow /content/download/347260/file/11-404_JurisIni.pdf 11404

Comments

  • Disallow all
  • But allow only important bots
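
The two groups above can be read as: any crawler not named is blocked from the whole site, while the named crawlers may fetch everything except the listed paths. The sketch below checks a path against those rules using the longest-match precedence described in RFC 9309. It is a minimal illustration, not the logic any particular crawler uses: the names NAMED_AGENTS, NAMED_RULES, WILDCARD_RULES and is_allowed are hypothetical, user-agent matching is simplified to an exact lowercase token comparison, and wildcards in paths are not handled because the scanned file contains none.

```python
# Rules copied from the scan above. Each entry is (rule, path prefix).
WILDCARD_RULES = [("disallow", "/")]

NAMED_AGENTS = {
    "googlebot", "bingbot", "bingpreview", "msnbot", "slurp", "duckduckbot",
    "applebot", "ia_archiver", "facebookexternalhit", "twitterbot",
    "linkedinbot",
}

NAMED_RULES = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/site_admin/"),
    ("disallow", "/search"),
    ("disallow", "/search/"),
    ("disallow", "/content/search"),
    ("disallow", "/content/advancedsearch"),
    ("disallow", "/content/tipafriend"),
    ("disallow", "/layout/set/print"),
    ("disallow", "/rss"),
    ("disallow", "/media/"),
    ("disallow", "/ezinfo/"),
    ("disallow", "/user/"),
    ("disallow", "/test-area/"),
    ("disallow", "/content/download/347260/file/11-404_JurisIni.pdf"),
]


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the scanned rules."""
    # Simplification: exact token match; real crawlers match product tokens.
    rules = NAMED_RULES if agent.lower() in NAMED_AGENTS else WILDCARD_RULES

    best_len = -1   # length of the most specific matching prefix so far
    allowed = True  # no matching rule means the path is allowed
    for rule, prefix in rules:
        if not path.startswith(prefix):
            continue
        # Longest prefix wins; on a tie, allow takes precedence.
        if len(prefix) > best_len or (len(prefix) == best_len and rule == "allow"):
            best_len = len(prefix)
            allowed = rule == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [
        ("googlebot", "/"),
        ("googlebot", "/admin/settings"),
        ("bingbot", "/search"),
        ("SomeOtherBot", "/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Longest-match evaluation matters here because the named group begins with Allow /. A parser that stops at the first matching rule, rather than the most specific one, would treat every path as allowed for the named crawlers and ignore the Disallow lines that follow.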