club.lavanguardia.com
robots.txt
Robots Exclusion Standard data for club.lavanguardia.com
Resource Scan
Scan Details
Site Domain | club.lavanguardia.com |
Base Domain | lavanguardia.com |
Scan Status | Ok |
Last Scan | 2024-11-09T08:20:54+00:00 |
Next Scan | 2024-11-23T08:20:54+00:00 |
Last Scan
Scanned | 2024-11-09T08:20:54+00:00 |
URL | https://club.lavanguardia.com/robots.txt |
Domain IPs | 23.210.99.57 |
Response IP | 104.103.151.13 |
Found | Yes |
Hash | 4698aee3b41a3c277c63721be2aa9095cb08f7b943aa01f9d3901cf3c46e3f08 |
SimHash | a099788bc6d3 |
Groups
*
Rule | Path |
---|---|
Disallow | /*/search* |
Disallow | /*.swf$ |
Disallow | /*.tif$ |
Disallow | /*.mp3$ |
Disallow | /*.flv$ |
Disallow | /*.mp4$ |
Disallow | /*.avi$ |
Disallow | /*.ics$ |
Disallow | /.sql$ |
Disallow | /.tgz$ |
Disallow | /.gz$ |
Disallow | /.tar$ |
Disallow | /*.svn$ |
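
The wildcard rules in this group can be checked against sample paths with a short matcher. This is a minimal sketch, assuming the common robots.txt semantics in which `*` matches any run of characters and a trailing `$` anchors the end of the URL path. The function names `robots_pattern_to_regex` and `is_disallowed` and the sample paths are hypothetical and not part of the scan data.

```python
import re

# Disallow patterns copied from the "*" group above.
RULES = [
    "/*/search*", "/*.swf$", "/*.tif$", "/*.mp3$", "/*.flv$", "/*.mp4$",
    "/*.avi$", "/*.ics$", "/.sql$", "/.tgz$", "/.gz$", "/.tar$", "/*.svn$",
]

def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into an anchored regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' pins the match to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path: str, patterns=RULES) -> bool:
    """Return True if any pattern matches the given path."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)

if __name__ == "__main__":
    for sample in ["/es/search?q=x", "/media/intro.swf", "/backup/dump.sql", "/.sql"]:
        print(sample, is_disallowed(sample))
```

Under these assumptions, `/.sql$`, `/.tgz$`, `/.gz$` and `/.tar$` contain no `*`, so they match only the literal paths `/.sql`, `/.tgz`, `/.gz` and `/.tar`. A path such as `/backup/dump.sql` is not covered by them.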
orthogaffe
msiecrawler
ia_archiver
ubicrawler
doc
zao
sitecheck.internetseer.com
zealbot
msiecrawler
sitesnagger
webstripper
webcopier
fetch
offline explorer
teleport
teleportpro
webzip
linko
httrack
microsoft.url.control
xenu
larbin
libwww
zyborg
download ninja
slurp
maxthon
cncdialer
ahrefsbot
wget
grub-client
k2spider
npbot
webreaper
mj12bot
claudebot
Rule | Path |
---|---|
Disallow | / |
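
The effect of this blanket group can be reproduced with Python's standard `urllib.robotparser`. This is a sketch only: `ROBOTS_LINES` is a hand-written reconstruction using three of the listed user agents, not the fetched file, and the helper `may_fetch` is hypothetical. The `*` group is left out, so an agent that matches no entry here is reported as allowed.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the group above (assumption: the original
# file lists each agent on its own "User-agent:" line).
ROBOTS_LINES = [
    "User-agent: ahrefsbot",
    "User-agent: mj12bot",
    "User-agent: claudebot",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

def may_fetch(agent: str, url: str = "https://club.lavanguardia.com/") -> bool:
    """Return True if the reconstructed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    for agent in ["ClaudeBot", "AhrefsBot", "Googlebot"]:
        print(agent, may_fetch(agent))
```

User-agent names are compared case-insensitively, so `ClaudeBot` matches the `claudebot` entry and is blocked from every path.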