4cee.com
robots.txt

Robots Exclusion Standard data for 4cee.com

Resource Scan

Scan Details

Site Domain 4cee.com
Base Domain 4cee.com
Scan Status Ok
Last Scan 2025-10-23T01:41:53+00:00
Next Scan 2025-11-22T01:41:53+00:00

Last Scan

Scanned 2025-10-23T01:41:53+00:00
URL https://4cee.com/robots.txt
Redirect https://www.4cee.com/robots.txt
Redirect Domain www.4cee.com
Redirect Base 4cee.com
Domain IPs 199.60.103.19
Redirect IPs 199.60.103.225, 199.60.103.31, 2606:2c40::c73c:671f, 2606:2c40::c73c:67e1
Response IP 199.60.103.225
Found Yes
Hash 15583c8b74b0b31439ce3a46964556f759b3ee268ee23468963b239604b73f74
SimHash 68499f1545ea

Groups

amazonbot

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

claudebot

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

baiduspider

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

bytespider

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

facebookexternalhit

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

facebot

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*

yandexbot

Rule Path
Disallow /
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*
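
Each named group above starts with "Disallow /", which blocks the whole site for that agent and makes the narrower paths in the same group redundant. A minimal sketch of how a crawler might evaluate this with Python's standard urllib.robotparser follows. The robots.txt fragment is rebuilt from two of the groups in this report rather than fetched from the site, and the helper name is_allowed and the "ClaudeBot/1.0" agent string are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Fragment rebuilt from two of the groups listed above (assumption:
# the live file may be laid out differently).
ROBOTS_FRAGMENT = """\
User-agent: claudebot
Disallow: /
Disallow: /_hcms/preview/

User-agent: yandexbot
Disallow: /
Disallow: /_hcms/preview/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on www.4cee.com?"""
    return parser.can_fetch(agent, "https://www.4cee.com" + path)


# "Disallow /" is a prefix of every path, so nothing is fetchable.
print(is_allowed("claudebot", "/"))
print(is_allowed("ClaudeBot/1.0", "/any/page"))  # agent match ignores case and version
print(is_allowed("yandexbot", "/_hcms/preview/x"))
```

Note that urllib.robotparser does simple prefix matching only, so it is suitable for the plain paths in these groups but not for the wildcard rules.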

*

Rule Path
Disallow /_hcms/preview/
Disallow /hs/manage-preferences/
Disallow /search?
Disallow /hs/preferences-center/
Disallow /*?*hs_preview=*
Disallow /*?*hsCacheBuster=*
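
The last two rules in each group use the "*" wildcard, which in the Robots Exclusion Protocol matches any sequence of characters, while a trailing "$" anchors the end of the URL. As a rough sketch of how such rules can be tested against a path and query string, the code below translates a rule into a regular expression. The function names rule_to_regex and rule_matches and the sample URLs are assumptions made for illustration, not part of the scanned file.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into an anchored regex (sketch)."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def rule_matches(rule: str, path_and_query: str) -> bool:
    """Return True if the rule applies to the given path plus query string."""
    return rule_to_regex(rule).match(path_and_query) is not None


# Rules taken from the "*" group above; the URLs are made-up samples.
print(rule_matches("/*?*hs_preview=*", "/blog/post?hs_preview=abc"))
print(rule_matches("/*?*hsCacheBuster=*", "/page?x=1&hsCacheBuster=9"))
print(rule_matches("/*?*hs_preview=*", "/blog/post"))
print(rule_matches("/search?", "/search?q=shoes"))
print(rule_matches("/search?", "/search"))
```

Under these assumptions, any URL whose query string contains hs_preview= or hsCacheBuster= is disallowed, and "/search?" only catches search URLs that carry a query string.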

Other Records

Field Value
sitemap https://www.4cee.com/sitemap.xml

Comments

  • robots.txt - Search Engine Crawler Access Control
  • Different bots are blocked to meet legal compliance and risk reduction as indicated in the EU.
  • Amazon – USA (Alexa + AWS AI)
  • Mainly used for training
  • Anthropic – USA (Claude AI)
  • Blocked because mainly used for training
  • Baidu – China (Search + Ernie LLM)
  • Blocked because only used for LLM in China
  • ByteDance (TikTok) – China (TikTok, Doubao LLM)
  • Blocked because only used for LLM in China and for TikTok
  • Facebook (Meta) – USA (Facebook)
  • Blocked because mainly used for scraping and not used for LLMs
  • Yandex – Russia
  • Blocked because only used for LLM in Russia
  • General rules for all bots