to-markdown.com
robots.txt

Robots Exclusion Standard data for to-markdown.com

Resource Scan

Scan Details

Site Domain to-markdown.com
Base Domain to-markdown.com
Scan Status Ok
Last Scan 2025-09-24T05:22:26+00:00
Next Scan 2025-10-01T05:22:26+00:00

Last Scan

Scanned 2025-09-24T05:22:26+00:00
URL https://to-markdown.com/robots.txt
Domain IPs 104.21.96.7, 172.67.150.29, 2606:4700:3036::6815:6007, 2606:4700:3037::ac43:961d
Response IP 172.67.150.29
Found Yes
Hash b51ae7bdf1c4924024430d823c09f7254d60248aba96572847cf4b06b15f9981
SimHash 40014fe28b82
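
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not say which algorithm the scanner uses or exactly which bytes it hashes. As a minimal sketch under that assumption, a digest of the same shape could be computed from a fetched response body like this. The helper name `sha256_hex` and the sample body are hypothetical and are not the real file contents.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical robots.txt body, used only to show the shape of the result.
sample_body = b"User-agent: *\nAllow: /\n"
digest = sha256_hex(sample_body)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing digests from two scans is a cheap way to detect that the file changed at all; the shorter SimHash field presumably serves to estimate how much it changed, but the report does not document it.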

Groups

*

Rule Path
Allow /
Allow /docs/
Allow /examples
Allow /privacy
Allow /terms
Allow /markitdown
Disallow /api/
Disallow /v1/
Disallow /static/
Disallow /404
Disallow /500
Disallow /*.json$
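
The rules of the `*` group overlap (for example `Allow /` and `Disallow /api/`), so the outcome for a path depends on how a crawler resolves conflicts. The sketch below assumes the longest-match convention described in RFC 9309: the matching rule with the longest pattern wins, `Allow` wins a tie, `*` matches any run of characters, and a trailing `$` anchors the end of the path. Individual crawlers may behave differently. The names `STAR_GROUP`, `pattern_to_regex`, and `is_allowed` are illustrative and do not come from the scan.

```python
import re

# Rules of the `*` group exactly as listed in the scan, in file order.
STAR_GROUP = [
    ("allow", "/"),
    ("allow", "/docs/"),
    ("allow", "/examples"),
    ("allow", "/privacy"),
    ("allow", "/terms"),
    ("allow", "/markitdown"),
    ("disallow", "/api/"),
    ("disallow", "/v1/"),
    ("disallow", "/static/"),
    ("disallow", "/404"),
    ("disallow", "/500"),
    ("disallow", "/*.json$"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each `*` match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Apply longest-match precedence; a path with no matching rule is allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, so patterns act as prefixes.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


print(is_allowed(STAR_GROUP, "/docs/intro"))     # True
print(is_allowed(STAR_GROUP, "/api/convert"))    # False
print(is_allowed(STAR_GROUP, "/data.json"))      # False
print(is_allowed(STAR_GROUP, "/data.json?x=1"))  # True
```

Under this convention the broad `Allow /` rule only decides paths that no longer pattern matches, which is why the specific `Allow` entries such as `/docs/` do not change the outcome on their own.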

gptbot

Rule Path
Allow /llms.txt
Disallow /

anthropic-ai

Rule Path
Allow /llms.txt
Disallow /

googlebot

Rule Path
Allow /
Disallow /api/
Disallow /v1/
Disallow /static/
Disallow /404
Disallow /500
Disallow /*.json$
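
A crawler obeys only one of the groups listed above. The following sketch assumes the common convention that the crawler's product token is compared case-insensitively against the group names and that the `*` group is the fallback when nothing matches. The names `GROUPS` and `select_group` are hypothetical, and `bingbot` is used only as an example of a crawler with no group of its own.

```python
# Rules shared by the `*` and `googlebot` groups in the scan.
COMMON_DISALLOWS = [
    ("disallow", "/api/"),
    ("disallow", "/v1/"),
    ("disallow", "/static/"),
    ("disallow", "/404"),
    ("disallow", "/500"),
    ("disallow", "/*.json$"),
]

# Groups as reported by the scan, keyed by lowercase user-agent token.
GROUPS = {
    "*": [
        ("allow", "/"),
        ("allow", "/docs/"),
        ("allow", "/examples"),
        ("allow", "/privacy"),
        ("allow", "/terms"),
        ("allow", "/markitdown"),
    ] + COMMON_DISALLOWS,
    "gptbot": [("allow", "/llms.txt"), ("disallow", "/")],
    "anthropic-ai": [("allow", "/llms.txt"), ("disallow", "/")],
    "googlebot": [("allow", "/")] + COMMON_DISALLOWS,
}


def select_group(groups, product_token: str):
    """Return the rules for a crawler, falling back to the `*` group."""
    return groups.get(product_token.lower(), groups.get("*", []))


print(select_group(GROUPS, "GPTBot"))        # [('allow', '/llms.txt'), ('disallow', '/')]
print(len(select_group(GROUPS, "bingbot")))  # 12, the size of the `*` group
```

Read this way, `gptbot` and `anthropic-ai` are limited to `/llms.txt`, while `googlebot` gets the same disallow list as the default group.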

Other Records

Field Value
sitemap https://to-markdown.com/sitemap.xml

Comments

  • Format-specific subdomains

Warnings

  • `host` is not a known field.
  • `llm-content` is not a known field.
  • `llm-full-content` is not a known field.
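
The warnings above indicate that the file contains `field: value` lines whose field names the scanner does not recognize. A minimal sketch of such a check follows. The set of known fields is an assumption, since the report does not publish the scanner's list, and the sample values use the placeholder domain `example.invalid` because the scan does not show the real values of these fields.

```python
# Assumed set of recognized fields; the scanner's actual list is not published.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(text: str) -> list[str]:
    """Return unrecognized field names in first-seen order, lowercased."""
    seen: list[str] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical file body; only the field names mirror the warnings.
SAMPLE = """\
# Format-specific subdomains
User-agent: *
Allow: /
Host: example.invalid
LLM-Content: https://example.invalid/llms.txt
LLM-Full-Content: https://example.invalid/llms-full.txt
Sitemap: https://example.invalid/sitemap.xml
"""

print(unknown_fields(SAMPLE))  # ['host', 'llm-content', 'llm-full-content']
```

Crawlers that follow the standard are expected to skip fields they do not understand, so warnings of this kind flag nonstandard extensions and do not mean the file is invalid.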