gwern.net
robots.txt
Robots Exclusion Standard data for gwern.net
Resource Scan
Scan Details
| Field | Value |
|---|---|
| Site Domain | gwern.net |
| Base Domain | gwern.net |
| Scan Status | Ok |
| Last Scan | 2025-11-24T09:11:09+00:00 |
| Next Scan | 2025-12-24T09:11:09+00:00 |
Last Scan
| Field | Value |
|---|---|
| Scanned | 2025-11-24T09:11:09+00:00 |
| URL | https://gwern.net/robots.txt |
| Domain IPs | 104.26.10.177, 104.26.11.177, 172.67.71.248, 2606:4700:20::681a:ab1, 2606:4700:20::681a:bb1, 2606:4700:20::ac43:47f8 |
| Response IP | 104.26.11.177 |
| Found | Yes |
| Hash | b97e30a11dd5969b162bfbabff3637558cb2c26004376ac439897efb6b34bea4 |
| SimHash | 60200a3aced0 |
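The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. Assuming (the scan does not say so) that it is SHA-256 over the raw bytes of the fetched robots.txt, a later fetch could be checked for changes roughly as follows. The helper names `sha256_hex` and `is_unchanged` are hypothetical, and fetching the file is left out so the sketch stays offline.

```python
import hashlib

# Hash recorded by the scan above. That it is SHA-256 of the raw
# response body is an assumption, not something the scan states.
RECORDED_HASH = "b97e30a11dd5969b162bfbabff3637558cb2c26004376ac439897efb6b34bea4"


def sha256_hex(body: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def is_unchanged(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True if the body hashes to the recorded value (under the assumption above)."""
    return sha256_hex(body) == recorded.lower()
```

A caller would pass the bytes of a fresh robots.txt download to `is_unchanged`; a `False` result means either the file changed or the assumption about the hash function does not hold.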
Groups
User-agent: *
| Rule | Path |
|---|---|
| Disallow | /fulltext |
| Disallow | /*.md |
| Disallow | /*.md.html |
| Disallow | /static/*.*.html |
| Disallow | /static/nginx/* |
| Disallow | /static/redirect/* |
| Disallow | /metadata/* |
| Disallow | /metadata/annotation/backlink/* |
| Disallow | /metadata/annotation/similar/* |
| Disallow | /metadata/annotation/link-bibliography/* |
| Disallow | /confidential/* |
| Disallow | /private/* |
| Disallow | /secret/* |
| Disallow | /doc/www/* |
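To illustrate how the wildcard rules above apply to concrete paths, here is a minimal sketch of a matcher. It assumes the common convention that `*` matches any run of characters, a trailing `$` anchors the end of the path, and every pattern is otherwise a prefix match. The function names are hypothetical. Because the group contains no Allow rules, the sketch skips longest-match precedence and only asks whether any Disallow pattern matches.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/fulltext",
    "/*.md",
    "/*.md.html",
    "/static/*.*.html",
    "/static/nginx/*",
    "/static/redirect/*",
    "/metadata/*",
    "/metadata/annotation/backlink/*",
    "/metadata/annotation/similar/*",
    "/metadata/annotation/link-bibliography/*",
    "/confidential/*",
    "/private/*",
    "/secret/*",
    "/doc/www/*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """True if any pattern matches the path from its start (prefix match)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under these assumptions, `is_disallowed("/essay.md")` and `is_disallowed("/doc/www/example.com/page.html")` are true, while `is_disallowed("/index")` is false. Python's standard `urllib.robotparser` is not used here because it does not interpret `*` inside paths.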
Other Records
| Field | Value |
|---|---|
| sitemap | https://gwern.net/sitemap.xml |