little.cloud
robots.txt

Robots Exclusion Standard data for little.cloud

Resource Scan

Scan Details

Site Domain little.cloud
Base Domain little.cloud
Scan Status Ok
Last Scan 2025-10-22T17:08:59+00:00
Next Scan 2025-11-21T17:08:59+00:00

Last Scan

Scanned 2025-10-22T17:08:59+00:00
URL https://little.cloud/robots.txt
Domain IPs 104.21.78.169, 172.67.136.4, 2606:4700:3030::ac43:8804, 2606:4700:3033::6815:4ea9
Response IP 104.21.78.169
Found Yes
Hash fb7e73b0df208f5a075b72ab3f651b5c3c88eef7c82e43dea0c5d5f99fc45b90
SimHash 641ebb412574

Groups

*

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

slurp

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

baiduspider

Rule Path
Allow /

yandexbot

Rule Path
Allow /
Disallow /node_modules/
Disallow /src/
Disallow /assets/scss/
Disallow /assets/vendor/
Disallow /.git/
Disallow /dev/
Disallow /*.bak
Disallow /*.backup
Disallow /*.old
Disallow /*.tmp
Disallow /*~
Disallow /*.log
Disallow /package.json
Disallow /package-lock.json
Disallow /gulpfile.js
Disallow /*.md
Allow /assets/css/
Allow /assets/js/
Allow /assets/img/
Allow /assets/json/
Allow /assets/favicon/
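
The yandexbot group mixes a blanket Allow / with narrower Disallow and Allow paths, so the outcome for a given URL depends on how a crawler resolves overlapping rules. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names YANDEXBOT_RULES, pattern_to_regex, and is_allowed are hypothetical helpers written for this illustration; they are not part of the scanner or of any crawler.

```python
import re

# Rules copied from the yandexbot group above, in the order listed.
YANDEXBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/node_modules/"),
    ("disallow", "/src/"),
    ("disallow", "/assets/scss/"),
    ("disallow", "/assets/vendor/"),
    ("disallow", "/.git/"),
    ("disallow", "/dev/"),
    ("disallow", "/*.bak"),
    ("disallow", "/*.backup"),
    ("disallow", "/*.old"),
    ("disallow", "/*.tmp"),
    ("disallow", "/*~"),
    ("disallow", "/*.log"),
    ("disallow", "/package.json"),
    ("disallow", "/package-lock.json"),
    ("disallow", "/gulpfile.js"),
    ("disallow", "/*.md"),
    ("allow", "/assets/css/"),
    ("allow", "/assets/js/"),
    ("allow", "/assets/img/"),
    ("allow", "/assets/json/"),
    ("allow", "/assets/favicon/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Assumed longest-match evaluation: the longest matching pattern
    decides, Allow wins a tie, and an unmatched path is allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed


for path in ["/index.html", "/node_modules/x.js",
             "/assets/css/site.css", "/README.md"]:
    print(path, is_allowed(path, YANDEXBOT_RULES))
```

Under this assumption, /index.html and /assets/css/site.css are allowed, while /node_modules/x.js and /README.md are disallowed. A crawler that applies rules in file order instead would stop at the leading Allow / and treat every path as allowed, so the result depends on the parser in use.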

Other Records

Field Value
crawl-delay 1
sitemap https://little.cloud/sitemap.xml
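
The crawl-delay and sitemap records can be read programmatically with Python's standard urllib.robotparser module. The sketch below parses a short hypothetical fragment, not the actual file: the scan does not show which group the crawl-delay line belongs to, so attaching it to yandexbot here is an assumption made only so the parser has a group to report it for.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fragment for illustration only. The real file's layout is
# not shown in the scan; the crawl-delay placement is an assumption.
SAMPLE = """\
User-agent: yandexbot
Allow: /
Crawl-delay: 1

Sitemap: https://little.cloud/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

delay = parser.crawl_delay("yandexbot")  # seconds between requests, or None
sitemaps = parser.site_maps()            # list of sitemap URLs, or None (Python 3.8+)

print(delay)
print(sitemaps)
```

Note that urllib.robotparser applies Allow and Disallow lines in file order and does not expand * wildcards, so it is used here only to read the crawl-delay and sitemap values, not to evaluate the path rules.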

Comments

  • Robots.txt for Little Cloud website
  • https://little.cloud/robots.txt
  • Allow all major search engine bots
  • Disallow access to sensitive directories
  • Disallow access to backup files
  • Disallow access to log files
  • Disallow access to configuration files
  • Allow access to CSS and JS files (important for rendering)
  • Sitemap location
  • Crawl delay (optional - helps with server load)