gnlug.org
robots.txt
Robots Exclusion Standard data for gnlug.org
Resource Scan
Scan Details
| Field | Value |
|---|---|
| Site Domain | gnlug.org |
| Base Domain | gnlug.org | 
| Scan Status | Ok | 
| Last Scan | 2025-10-24T06:56:36+00:00 | 
| Next Scan | 2025-11-23T06:56:36+00:00 | 
Last Scan
| Field | Value |
|---|---|
| Scanned | 2025-10-24T06:56:36+00:00 |
| URL | https://gnlug.org/robots.txt | 
| Domain IPs | 2a01:4ff:f0:dd3a::1, 5.161.180.234 | 
| Response IP | 5.161.180.234 | 
| Found | Yes | 
| Hash | dcb3ab319c4f046f0f5e95308842d281aa8390f386b8bd43ad2e6a49f901fb22 | 
| SimHash | 76715a00c1c0 | 
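The Hash value above is 64 hexadecimal digits long, which matches the length of a SHA-256 digest, although the report does not name the algorithm. Assuming it is SHA-256 over the raw response body, a fetched copy of the file could be compared against the reported value with a minimal sketch like the following. The helper names `sha256_hex` and `matches_report` are hypothetical, and the scanner may normalize the body before hashing, in which case the comparison would not match.

```python
import hashlib

# Hash reported for https://gnlug.org/robots.txt in the 2025-10-24 scan.
REPORTED_HASH = "dcb3ab319c4f046f0f5e95308842d281aa8390f386b8bd43ad2e6a49f901fb22"


def sha256_hex(body: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """True if the body hashes to the reported value.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would mean either that the file changed since the scan or that the assumption about the algorithm does not hold.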
Groups
*
| Rule | Path |
|---|---|
| Disallow | /panel/ | 
| Disallow | /bin/ | 
| Disallow | /conf/ | 
| Disallow | /data/ | 
| Disallow | /inc/ | 
| Disallow | /lib/ | 
| Disallow | /vendor/ | 
| Disallow | /.htaccess | 
| Disallow | /.htaccess.dist | 
| Disallow | /COPYING | 
| Disallow | /README | 
| Disallow | /SECURITY.md | 
| Disallow | /VERSION | 
| Disallow | /alrojovivo.html | 
| Disallow | /composer.json | 
| Disallow | /composer.lock | 
| Disallow | /index_old.html | 
| Disallow | /mc-legacy.html | 
| Disallow | /mc-player-counter.min.js | 
| Disallow | /mc.html | 
gptbot
claudebot
claude-web
ccbot
googlebot-extended
applebot-extended
facebookbot
meta-externalagent
meta-externalfetcher
diffbot
perplexitybot
omgili
omgilibot
webzio-extended
imagesiftbot
bytespider
amazonbot
youbot
semrushbot-ocob
petalbot
velenpublicwebcrawler
turnitinbot
timpibot
oai-searchbot
icc-crawler
ai2bot
ai2bot-dolma
dataforseobot
awariobot
awariosmartbot
awariorssbot
google-cloudvertexbot
pangubot
kangaroo bot
sentibot
img2dataset
meltwater
seekr
peer39_crawler
cohere-ai
cohere-training-data-crawler
duckassistbot
scrapy
| Rule | Path |
|---|---|
| Disallow | / | 
*
No rules defined. All paths allowed.
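To see how the groups above apply to a given crawler, the rules can be fed to Python's standard `urllib.robotparser`. This is a minimal sketch, not the verbatim file: `ROBOTS_TXT` is reconstructed from the tables in this report, it lists only a handful of the blocked user agents for brevity, and it leaves out the empty `*` group. The agent name `ExampleBrowser` is a made-up placeholder for any client not named in the file.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from this report (assumption: not the verbatim robots.txt).
# Only a subset of the blocked user agents is listed here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /panel/
Disallow: /bin/
Disallow: /conf/
Disallow: /data/
Disallow: /inc/
Disallow: /lib/
Disallow: /vendor/
Disallow: /.htaccess
Disallow: /.htaccess.dist
Disallow: /COPYING
Disallow: /README
Disallow: /SECURITY.md
Disallow: /VERSION
Disallow: /alrojovivo.html
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /index_old.html
Disallow: /mc-legacy.html
Disallow: /mc-player-counter.min.js
Disallow: /mc.html

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
User-agent: bytespider
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://gnlug.org"

# Print whether each (agent, path) pair may be fetched under these rules.
for agent, path in [
    ("GPTBot", "/"),
    ("ExampleBrowser", "/"),
    ("ExampleBrowser", "/panel/login"),
    ("ExampleBrowser", "/index.html"),
]:
    print(agent, path, parser.can_fetch(agent, BASE + path))
```

The standard-library parser matches user agents case-insensitively and treats each `Disallow` path as a prefix, so the named crawlers are refused everywhere while other clients are refused only under the listed paths. Other parsers may resolve group precedence differently.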
Warnings
- `disallowaitraining` is not a known field.
Comments