protocols.io
robots.txt

Robots Exclusion Standard data for protocols.io

Resource Scan

Scan Details

Site Domain	protocols.io
Base Domain	protocols.io
Scan Status	Ok
Last Scan	2026-02-14T09:56:36+00:00
Next Scan	2026-02-28T09:56:36+00:00

Last Scan

Scanned	2026-02-14T09:56:36+00:00
URL	https://protocols.io/robots.txt
Redirect	https://www.protocols.io:443/robots.txt
Redirect Domain	www.protocols.io
Redirect Base	protocols.io
Domain IPs	18.223.137.131, 3.151.171.238
Redirect IPs	18.223.137.131, 3.151.171.238
Response IP	3.151.171.238
Found	Yes
Hash	32fe5eee904730436a415434c9e5ecc55d5ebbf8c965c5ce4d7f6920ddcbbfc6
SimHash	73285b81eff7
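The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, although the scan page does not name the algorithm. A minimal sketch under that assumption, showing how a later fetch could be compared against the recorded value (the helper name `content_hash` is hypothetical, and the sample body is a stand-in, not the real file):

```python
import hashlib

# Hash value reported by the scan. Treating it as SHA-256 is an assumption.
REPORTED_HASH = "32fe5eee904730436a415434c9e5ecc55d5ebbf8c965c5ce4d7f6920ddcbbfc6"


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of the raw response body."""
    return hashlib.sha256(body).hexdigest()


def is_unchanged(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """True if the body hashes to the reported value (case-insensitive)."""
    return content_hash(body) == reported.lower()


# Stand-in body for illustration only; the real file must be fetched separately.
sample_body = b"User-agent: *\nDisallow: /private/\n"
print(content_hash(sample_body))
print(is_unchanged(sample_body))
```

The comparison only works if the scanner hashed the exact bytes served, with no normalisation of line endings or whitespace, which this page does not confirm.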

Groups

*

Rule	Path
Disallow	/private/
Disallow	/blind/
Disallow	/api/
Disallow	/download
Disallow	/pubchase
Disallow	/spectro
Disallow	/neb
Disallow	/career/
Disallow	/essays
Disallow	/editorials
Disallow	/test
Disallow	/flux

gptbot

Rule	Path
Disallow	/private/
Disallow	/api/

anthropic-ai

Rule	Path
Disallow	/private/
Disallow	/api/

ccbot

Rule	Path
Disallow	/private/
Disallow	/api/
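The four groups above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt body from the tables on this page; it is a reconstruction, not the verbatim file, and the user-agent names are written as the scan displays them. The example paths and the `allowed` helper are hypothetical. It illustrates two points: rules match by path prefix, and a crawler that matches a named group is evaluated against that group only, so the named AI crawler groups here are narrower than the default `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables; not the verbatim robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /blind/
Disallow: /api/
Disallow: /download
Disallow: /pubchase
Disallow: /spectro
Disallow: /neb
Disallow: /career/
Disallow: /essays
Disallow: /editorials
Disallow: /test
Disallow: /flux

User-agent: gptbot
Disallow: /private/
Disallow: /api/

User-agent: anthropic-ai
Disallow: /private/
Disallow: /api/

User-agent: ccbot
Disallow: /private/
Disallow: /api/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Check a path on www.protocols.io for the given user agent."""
    return parser.can_fetch(agent, "https://www.protocols.io" + path)


# Hypothetical paths, chosen only to exercise the rules.
for agent, path in [
    ("Googlebot", "/api/v3/protocols"),  # default group, under /api/
    ("Googlebot", "/testing"),           # prefix match on /test
    ("Googlebot", "/career"),            # no trailing slash, so /career/ does not match
    ("gptbot", "/download"),             # named group lists only /private/ and /api/
    ("gptbot", "/private/x"),            # blocked in the named group
]:
    print(agent, path, allowed(agent, path))
```

Other parsers may differ in detail, for example in wildcard handling, so this shows how one standard-library implementation reads the rules rather than how any particular crawler behaves.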

Other Records

Field	Value
sitemap	https://www.protocols.io/sitemaps/protocols_sitemap.xml
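The sitemap record is a non-group line, so it applies regardless of user agent. As a minimal sketch, `urllib.robotparser` exposes such lines through `site_maps()` (available in Python 3.8 and later). The one-line input below is reconstructed from the record above rather than copied from the file:

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "Other Records" table; not the verbatim file.
ROBOTS_TXT = "Sitemap: https://www.protocols.io/sitemaps/protocols_sitemap.xml\n"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file lists none.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```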

Comments

=========================================================
robots.txt for https://www.protocols.io
Purpose:
- Allow discovery of public scientific content
- Protect private, authenticated, and system areas
- Provide explicit guidance to AI crawlers
=========================================================
-------------------------
Default rule (all crawlers)
-------------------------
AI crawlers
-------------------------
Sitemap
-------------------------
