italki.com
robots.txt

Robots Exclusion Standard data for italki.com

Resource Scan

Scan Details

Site Domain italki.com
Base Domain italki.com
Scan Status Ok
Last Scan 2025-10-20T16:11:47+00:00
Next Scan 2025-11-03T16:11:47+00:00

Last Scan

Scanned 2025-10-20T16:11:47+00:00
URL https://italki.com/robots.txt
Redirect https://www.italki.com:443/robots.txt
Redirect Domain www.italki.com
Redirect Base italki.com
Domain IPs 35.83.65.236, 44.229.224.22, 54.185.193.26
Redirect IPs 104.18.8.37, 104.18.9.37, 2606:4700::6812:825, 2606:4700::6812:925
Response IP 104.18.8.37
Found Yes
Hash 35931863917da8e27900b4b7aea9559c638105b1e969a3e115c9f11c4c1fb832
SimHash 61147855ad55

Groups

*

Rule Path
Disallow /notebook/*
Disallow /dashboard
Disallow /user/*
Disallow /settings
Disallow /messages
Disallow /lessons
Disallow /finance
Disallow /calendar
Disallow /contacts
Disallow /following
Disallow /followers

googlebot
bingbot

Rule Path
Allow /

oai-searchbot
chatgpt-user
perplexitybot
firecrawlagent
andibot
exabot
phindbot
youbot
gptbot
google-extended

Rule Path
Allow /

ccbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.italki.com/sitemapindex.xml

Comments

  • Default rules - restrict private areas
  • Allow traditional search indexing (inherits default restrictions)
  • Allow AI search and agent use
  • Disallow AI training data collection
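
The groups listed above can be checked against a parser to see how each crawler would be treated. The following is a minimal sketch using Python's standard `urllib.robotparser`, not a description of how any particular crawler evaluates the file. The `ROBOTS_TXT` string is a hand-written reconstruction of a subset of the scanned groups (the wildcard paths are left out because this parser does plain prefix matching), and the helper names `build_parser` and `is_allowed`, the agent name `SomeBot`, and the path `/teachers` are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed subset of the groups in the scan above,
# not the verbatim file served by www.italki.com.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dashboard
Disallow: /settings
Disallow: /messages

User-agent: googlebot
User-agent: bingbot
Allow: /

User-agent: gptbot
Allow: /

User-agent: ccbot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` according to `parser`."""
    return parser.can_fetch(agent, "https://www.italki.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # "SomeBot" and "/teachers" are hypothetical examples.
    for agent in ("SomeBot", "googlebot", "gptbot", "ccbot"):
        for path in ("/dashboard", "/teachers"):
            print(agent, path, is_allowed(parser, agent, path))
```

In this sketch, an agent that matches a named group is evaluated against that group only, and the `*` group is consulted only when no named group matches. With this parser, `googlebot` is therefore not bound by the default `Disallow` rules, so the "inherits default restrictions" comment may not hold for every parser.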