how.to
robots.txt

Robots Exclusion Standard data for how.to

Resource Scan

Scan Details

Site Domain how.to
Base Domain how.to
Scan Status Ok
Last Scan 2026-01-09T02:50:18+00:00
Next Scan 2026-01-16T02:50:18+00:00

Last Scan

Scanned 2026-01-09T02:50:18+00:00
URL https://how.to/robots.txt
Domain IPs 185.158.133.1
Response IP 185.158.133.1
Found Yes
Hash 42331c38191e9e5796bc325145c79751ef56397a2cb5ecfac17391e42b9211d0
SimHash 6a191b01043e
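
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The scan does not name the algorithm or say which bytes were hashed, so the following is only a sketch under the assumption that the hash is SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_scan_hash` are hypothetical.

```python
import hashlib

# Hash value reported in the scan above.
SCAN_HASH = "42331c38191e9e5796bc325145c79751ef56397a2cb5ecfac17391e42b9211d0"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_scan_hash(body: bytes, expected: str = SCAN_HASH) -> bool:
    """Compare a fetched robots.txt body against the reported hash.

    Assumption: the scanner hashed the raw response body with SHA-256.
    """
    return sha256_hex(body) == expected.lower()
```

To use it, fetch the body (for example with `urllib.request.urlopen("https://how.to/robots.txt").read()`) and pass the bytes to `matches_scan_hash`. A mismatch may mean the file has changed since the scan, or that the hashing assumption is wrong.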

Groups

gptbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

google-extended

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

twitterbot

Rule Path
Allow /

facebookexternalhit

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

claudebot

Rule Path
Allow /

anthropic-ai

Rule Path
Allow /

applebot

Rule Path
Allow /

*

Rule Path
Allow /

Other Records

Field Value
sitemap https://how.to/sitemap.xml
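
The groups and the sitemap record above can be rebuilt into an equivalent robots.txt and checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the scan does not preserve the original file's exact layout or comments, so the reconstructed text is an assumption, and the page URL used for the check is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents listed in the Groups section of the scan, in reported order.
AGENTS = [
    "gptbot", "chatgpt-user", "google-extended", "googlebot", "bingbot",
    "twitterbot", "facebookexternalhit", "perplexitybot", "claudebot",
    "anthropic-ai", "applebot", "*",
]


def build_robots_txt(agents, sitemap):
    """Rebuild an equivalent robots.txt: one 'Allow: /' group per agent."""
    lines = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines)


robots_txt = build_robots_txt(AGENTS, "https://how.to/sitemap.xml")

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical page URL, used only to exercise the rules.
allowed = {a: parser.can_fetch(a, "https://how.to/any/page") for a in AGENTS}

# site_maps() requires Python 3.8 or later.
sitemaps = parser.site_maps()
```

Because every group contains only `Allow: /`, each listed agent, and any unlisted agent falling back to the `*` group, is permitted to fetch every path.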

Comments

  • AI Crawlers - Welcome!
  • Sitemap (submit this in Google Search Console)
  • LLM Documentation
  • For AI assistants: see /llms.txt for site structure and citation guidelines
  • Extended documentation: /llms-full.txt