gnu-linux.org
robots.txt
Robots Exclusion Standard data for gnu-linux.org
Resource Scan
Scan Details
| Field | Value |
|---|---|
| Site Domain | gnu-linux.org |
| Base Domain | gnu-linux.org |
| Scan Status | Ok |
| Last Scan | 2025-11-08T15:14:41+00:00 |
| Next Scan | 2025-12-08T15:14:41+00:00 |
Last Scan
| Field | Value |
|---|---|
| Scanned | 2025-11-08T15:14:41+00:00 |
| URL | https://gnu-linux.org/robots.txt |
| Domain IPs | 104.21.57.202, 172.67.192.48, 2606:4700:3030::ac43:c030, 2606:4700:3031::6815:39ca |
| Response IP | 104.21.57.202 |
| Found | Yes |
| Hash | b717db179a242e688b8e56900457866f244d3b7116424b7412d59104bb53fa27 |
| SimHash | b800da40eb1a |
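The report does not say which algorithm produced the Hash value. Its length (64 hexadecimal characters) is consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. That is an assumption, as is the idea that the scanner hashes the bytes without normalizing them first. The function names are hypothetical.

```python
import hashlib

# Hash value copied from the "Last Scan" table above.
REPORTED_HASH = "b717db179a242e688b8e56900457866f244d3b7116424b7412d59104bb53fa27"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Check a fetched robots.txt body against the reported hash.

    Assumes the report's Hash is SHA-256 of the unmodified body bytes.
    """
    return sha256_hex(body) == reported.lower()
```

To use it, fetch https://gnu-linux.org/robots.txt with any HTTP client and pass the raw bytes to `matches_report`. A mismatch may mean the file changed since the scan, or simply that the assumption about the algorithm is wrong.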
Groups
User-agent: *
| Rule | Path |
|---|---|
| Disallow | /reg/ |
| Disallow | */pg-num* |
| Disallow | */id-girl/* |
| Disallow | /profile-click/* |
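The rules above can be read as path patterns that apply to every crawler. As a minimal sketch, the matcher below shows which paths they would block, assuming the wildcard semantics of RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end, and a pattern otherwise matches from the start of the path. The function names and the sample paths are hypothetical, and the group has no Allow rules, so no precedence handling is included.

```python
import re

# Disallow patterns copied from the rule table for the "*" group.
DISALLOW = ["/reg/", "*/pg-num*", "*/id-girl/*", "/profile-click/*"]


def pattern_to_regex(pattern: str):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes RFC 9309 semantics: '*' is a wildcard, a trailing '$'
    anchors the end, and everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches from the start of path."""
    # re.match anchors at the beginning of the string, giving prefix matching.
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for sample in ["/reg/signup", "/register", "/catalog/pg-num-2", "/about"]:
        print(sample, is_disallowed(sample))
```

A hand-rolled matcher is used because two of the rules start with `*`, so they block a path wherever `/pg-num` or `/id-girl/` appears, not only at the root. A plain prefix comparison would treat the `*` literally and miss those cases.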