to-markdown.com
robots.txt

Robots Exclusion Standard data for to-markdown.com

Resource Scan

Scan Details

Site Domain	to-markdown.com
Base Domain	to-markdown.com
Scan Status	Ok
Last Scan	2025-09-24T05:22:26+00:00
Next Scan	2025-10-01T05:22:26+00:00

Last Scan

Scanned	2025-09-24T05:22:26+00:00
URL	https://to-markdown.com/robots.txt
Domain IPs	104.21.96.7, 172.67.150.29, 2606:4700:3036::6815:6007, 2606:4700:3037::ac43:961d
Response IP	172.67.150.29
Found	Yes
Hash	b51ae7bdf1c4924024430d823c09f7254d60248aba96572847cf4b06b15f9981
SimHash	40014fe28b82
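
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal characters are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. That assumption is unconfirmed, and the function names `body_digest` and `matches_report` are hypothetical. The SimHash field is not covered here.

```python
import hashlib

# Value copied from the "Hash" row of the Last Scan table.
REPORTED_HASH = "b51ae7bdf1c4924024430d823c09f7254d60248aba96572847cf4b06b15f9981"


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check whether a fetched robots.txt body matches the reported hash."""
    return body_digest(body) == REPORTED_HASH
```

If the assumption holds, passing the bytes fetched from the URL above to `matches_report` would indicate whether the file has changed since the scan.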

Groups

*

Rule	Path
Allow	/
Allow	/docs/
Allow	/examples
Allow	/privacy
Allow	/terms
Allow	/markitdown
Disallow	/api/
Disallow	/v1/
Disallow	/static/
Disallow	/404
Disallow	/500
Disallow	/*.json$
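
When Allow and Disallow rules overlap, as they do in this group, crawlers that follow RFC 9309 apply the longest matching pattern, with Allow winning ties. The sketch below illustrates that precedence against the rules listed above. It is a simplified model, not the scanner's or any crawler's actual implementation, and the names `pattern_to_regex`, `is_allowed`, and `STAR_RULES` are hypothetical.

```python
import re


def pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Compile a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Apply longest-match precedence; Allow wins ties; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# Rules copied from the "*" group above.
STAR_RULES = [
    ("Allow", "/"), ("Allow", "/docs/"), ("Allow", "/examples"),
    ("Allow", "/privacy"), ("Allow", "/terms"), ("Allow", "/markitdown"),
    ("Disallow", "/api/"), ("Disallow", "/v1/"), ("Disallow", "/static/"),
    ("Disallow", "/404"), ("Disallow", "/500"), ("Disallow", "/*.json$"),
]

print(is_allowed(STAR_RULES, "/docs/intro"))   # True
print(is_allowed(STAR_RULES, "/api/convert"))  # False
print(is_allowed(STAR_RULES, "/data.json"))    # False
```

Under the same model, a group containing only `Allow /llms.txt` and `Disallow /` permits `/llms.txt` and blocks every other path, because `/llms.txt` is the longer match.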

gptbot

Rule	Path
Allow	/llms.txt
Disallow	/

anthropic-ai

Rule	Path
Allow	/llms.txt
Disallow	/

googlebot

Rule	Path
Allow	/
Disallow	/api/
Disallow	/v1/
Disallow	/static/
Disallow	/404
Disallow	/500
Disallow	/*.json$

Other Records

Field	Value
sitemap	https://to-markdown.com/sitemap.xml

Comments

Format-specific subdomains

Warnings

`host` is not a known field.
`llm-content` is not a known field.
`llm-full-content` is not a known field.
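
Warnings of this kind can be produced by comparing each field name in the file against a set of recognized fields. The sketch below shows one way to do that. The `KNOWN_FIELDS` set is an assumption (the scanner's real list is not given in this report), the function name `unknown_fields` is hypothetical, and the sample text uses placeholder values because the report does not show the values of the flagged fields.

```python
# Assumption: a minimal set of recognized fields; the scanner's list may differ.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(robots_txt: str) -> list[str]:
    """Return unrecognized field names in order of first appearance."""
    seen = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical sample; the values are placeholders, not the site's real ones.
SAMPLE = """\
User-agent: *
Allow: /
Host: example.com
LLM-Content: https://example.com/a.txt
LLM-Full-Content: https://example.com/b.txt
Sitemap: https://example.com/sitemap.xml
"""

for name in unknown_fields(SAMPLE):
    print(f"`{name}` is not a known field.")
```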
