how.to
robots.txt

Robots Exclusion Standard data for how.to

Resource Scan

Scan Details

Site Domain	how.to
Base Domain	how.to
Scan Status	Ok
Last Scan	2026-01-09T02:50:18+00:00
Next Scan	2026-01-16T02:50:18+00:00

Last Scan

Scanned	2026-01-09T02:50:18+00:00
URL	https://how.to/robots.txt
Domain IPs	185.158.133.1
Response IP	185.158.133.1
Found	Yes
Hash	42331c38191e9e5796bc325145c79751ef56397a2cb5ecfac17391e42b9211d0
SimHash	6a191b01043e

Groups

gptbot

Rule	Path
Allow	/

chatgpt-user

Rule	Path
Allow	/

google-extended

Rule	Path
Allow	/

googlebot

Rule	Path
Allow	/

bingbot

Rule	Path
Allow	/

twitterbot

Rule	Path
Allow	/

facebookexternalhit

Rule	Path
Allow	/

perplexitybot

Rule	Path
Allow	/

claudebot

Rule	Path
Allow	/

anthropic-ai

Rule	Path
Allow	/

applebot

Rule	Path
Allow	/

*

Rule	Path
Allow	/
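
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`, assuming the file consists of one `User-agent` / `Allow: /` pair per listed group plus the reported sitemap line. The `AGENTS` list, the helper names `build_robots_lines` and `make_parser`, and the test URL are illustrative assumptions; the real file's exact casing, ordering, and comment placement are not shown in this scan.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the groups reported by the scan.
# The real robots.txt may differ in casing, ordering, and comments.
AGENTS = [
    "gptbot", "chatgpt-user", "google-extended", "googlebot", "bingbot",
    "twitterbot", "facebookexternalhit", "perplexitybot", "claudebot",
    "anthropic-ai", "applebot", "*",
]


def build_robots_lines(agents):
    """Build robots.txt lines: one 'Allow: /' group per user agent."""
    lines = []
    for agent in agents:
        # A blank line closes each group so the parser starts a new entry.
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    lines.append("Sitemap: https://how.to/sitemap.xml")
    return lines


def make_parser(lines):
    """Parse in-memory robots.txt lines without any network access."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = make_parser(build_robots_lines(AGENTS))

# Ask whether each listed agent may fetch an arbitrary (made-up) path.
allowed = {
    agent: parser.can_fetch(agent, "https://how.to/any/page")
    for agent in AGENTS
}

for agent, ok in allowed.items():
    print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

To check the live file instead of the reconstruction, `RobotFileParser` also offers `set_url()` followed by `read()`, which fetches the file over the network.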

Other Records

Field	Value
sitemap	https://how.to/sitemap.xml

Comments

AI Crawlers - Welcome!
Sitemap (submit this in Google Search Console)
LLM Documentation
For AI assistants: see /llms.txt for site structure and citation guidelines
Extended documentation: /llms-full.txt
