Robots Exclusion Standard data for llli.org

Resource Scan

Scan Details

Site Domain llli.org
Base Domain llli.org
Scan Status Ok
Last Scan 2025-02-24T00:10:21+00:00
Next Scan 2025-03-26T00:10:21+00:00
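
The last and next scan timestamps above are exactly 30 days apart. A minimal Python check of that interval, using only the two values shown in this report (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2025-02-24T00:10:21+00:00")
next_scan = datetime.fromisoformat("2025-03-26T00:10:21+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)
```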

Last Scan

Scanned 2025-02-24T00:10:21+00:00
URL https://llli.org/robots.txt
Domain IPs 192.124.249.20
Response IP 192.124.249.20
Found Yes
Hash 43c43c4b3ecd21d3a5d8a8ee793a9e4ceec42694028855a5f9af1f6c919f2075
SimHash 5c5cda114542
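
The report does not name the algorithm behind the Hash value. Its length, 64 hexadecimal characters, matches a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. Both the algorithm and the exact bytes being hashed are assumptions, and the sample body is a placeholder rather than the real llli.org file, so the digest it prints is not expected to match the value above.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of body as 64 lowercase hex characters."""
    return hashlib.sha256(body).hexdigest()


# Placeholder content, not the actual robots.txt served by llli.org.
sample_body = b"User-agent: *\nDisallow: /\n"
digest = sha256_hex(sample_body)
print(len(digest), digest)
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect whether the file changed at all, which is presumably the purpose of storing it here.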

Groups

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

slurp

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

baiduspider

Rule Path
Allow /

yandex

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

*

Rule Path
Disallow /
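
The groups above amount to an allowlist: seven named crawlers may fetch everything, and every other user agent is disallowed. The sketch below rebuilds an equivalent rule set and checks it with Python's standard urllib.robotparser. The rule text is reconstructed from this report, so the wording, ordering, and comments of the live file may differ; the URL and the crawler name ExampleBot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents that the report lists with "Allow /".
ALLOWED_AGENTS = [
    "googlebot", "bingbot", "slurp", "duckduckbot",
    "baiduspider", "yandex", "facebookexternalhit",
]

# Rebuild one group per allowed agent, separated by blank lines,
# followed by the catch-all group that disallows everything.
lines = []
for agent in ALLOWED_AGENTS:
    lines += [f"User-agent: {agent}", "Allow: /", ""]
lines += ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(lines)

url = "https://llli.org/example-page"  # hypothetical URL
googlebot_ok = parser.can_fetch("Googlebot", url)
other_ok = parser.can_fetch("ExampleBot", url)  # hypothetical crawler
print(googlebot_ok, other_ok)
```

With these rules the named crawler is permitted and the unlisted one falls through to the `*` group, so the script prints `True False`.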

Other Records

Field Value
crawl-delay 10
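
The crawl-delay record is listed under Other Records rather than inside any of the groups. One plausible reading, and it is only an assumption about how this scanner classifies lines, is that the directive is not attached to a User-agent group in the file. The sketch below shows how Python's urllib.robotparser treats the two placements. The rule text and the crawler name ExampleBot are illustrative, not taken from the real llli.org file.

```python
from urllib.robotparser import RobotFileParser


def delay_for(lines, agent):
    """Parse robots.txt lines and return the crawl delay seen by agent."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.crawl_delay(agent)


# Crawl-delay separated from the group by a blank line: the group has
# already ended, so this parser does not attach the delay to it.
detached = ["User-agent: *", "Disallow: /", "", "Crawl-delay: 10"]

# Crawl-delay inside the group: the delay belongs to that group.
attached = ["User-agent: *", "Disallow: /", "Crawl-delay: 10"]

detached_delay = delay_for(detached, "ExampleBot")
attached_delay = delay_for(attached, "ExampleBot")
print(detached_delay, attached_delay)
```

Under this parser the detached form yields `None` and the attached form yields `10`. Other crawlers may interpret an ungrouped crawl-delay differently, and some ignore the directive altogether.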

Comments

  • Allow Googlebot
  • Allow Bingbot
  • Allow Yahoo's bot (Slurp)
  • Allow DuckDuckGo bot
  • Allow Baidu Spider (optional - remove if not needed)
  • Allow Yandex Bot
  • Allow Facebook external crawler
  • Disallow all other bots