Robots Exclusion Standard data for ftp.acc.umu.se

Resource Scan

Scan Details

Site Domain ftp.acc.umu.se
Base Domain umu.se
Scan Status Ok
Last Scan 2024-05-23T04:35:26+00:00
Next Scan 2024-06-22T04:35:26+00:00

Last Scan

Scanned 2024-05-23T04:35:26+00:00
URL https://ftp.acc.umu.se/robots.txt
Domain IPs 194.71.11.163, 194.71.11.165, 194.71.11.173, 2001:6b0:19::163, 2001:6b0:19::165, 2001:6b0:19::173
Response IP 194.71.11.163
Found Yes
Hash 306fb23d750b7fa2bf714b74e206a9e31f1b8dfb8b214781d9be861dfb3b074b
SimHash 2805a3c09574
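
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not name the algorithm, so treating it as SHA-256 is an assumption. Under that assumption, a scanner could detect changes between scans roughly as sketched below; `fingerprint` and `has_changed` are hypothetical helper names, not part of any scanning tool.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a fetched robots.txt body.

    Assumption: SHA-256 is used only because its 64-character hex
    output matches the length of the Hash field in the report.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a newly fetched body."""
    return fingerprint(body) != previous_hash
```

A digest like this only shows that the bytes differ, not how much they differ. A similarity hash such as the SimHash field is the kind of value intended for near-duplicate comparison.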

Groups

*

Rule Path
Disallow /cdimage/.debian-mirror
Disallow /cdimage/snapshot
Disallow /mirror/cdimage/.debian-mirror
Disallow /mirror/cdimage/snapshot
Disallow /debian/pool
Disallow /mirror/debian/pool
Disallow /mirror/ubuntu/pool
Disallow /pub/debian/pool
Disallow /ubuntu/pool
Disallow /mirror/archlinux/pool
Disallow /mirror/debian-multimedia/pool
Disallow /mirror/linuxdeepin/packages/pool
Disallow /mirror/linuxmint.com/packages/pool
Disallow /mirror/osdn.net
Disallow /mirror/parrotsec.org/parrot/pool
Disallow /mirror/raspbian/mate/pool
Disallow /mirror/raspbian/multiarchcross/pool
Disallow /mirror/raspbian/raspbian/pool
Disallow /mirror/solydxk.com/repository/pool
Disallow /mirror/temp
Disallow /mirror/trisquel/packages/pool
Disallow /mirror/videolan.org/debian/pool
Disallow /mirror/videolan.org/ubuntu/pool
Disallow /mirror/opensuse.org/tumbleweed/repo
Disallow /mirror/fedora/enchilada/linux
Allow /icons/*.png$
Allow /icons2/*.png$
Disallow /*.iso$
Disallow /*.deb$
Disallow /*.rpm$
Disallow /*.gz$
Disallow /*.bz2$
Disallow /*.xz$
Disallow /*.arj$
Disallow /*.rar$
Disallow /*.zip$
Disallow /*.lzh$
Disallow /*.lha$
Disallow /*.7z$
Disallow /*.avi$
Disallow /*.wmv$
Disallow /*.mpg$
Disallow /*.mpeg$
Disallow /*.mp4$
Disallow /*.mkv$
Disallow /*.flv$
Disallow /*.qt$
Disallow /*.mov$
Disallow /*.m4v$
Disallow /*.webm$
Disallow /*.mp3$
Disallow /*.ogg$
Disallow /*.gif$
Disallow /*.png$
Disallow /*.jpg$
Disallow /*.dmg$
Disallow /*.zim$
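
The table mixes plain prefix rules with wildcard rules, where `*` matches any run of characters and a trailing `$` anchors the pattern at the end of the path. The sketch below shows one way a crawler might evaluate such rules, assuming the common convention that the longest matching pattern wins and that Allow wins a tie. Only a few rules from the table are included, and `pattern_to_regex` and `is_allowed` are illustrative names, not an existing library API.

```python
import re

# A small subset of the rules listed above, for illustration only.
RULES = [
    ("Disallow", "/debian/pool"),
    ("Allow", "/icons/*.png$"),
    ("Disallow", "/*.png$"),
    ("Disallow", "/*.iso$"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any character sequence.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; paths with no matching rule are allowed."""
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start, so unanchored rules act as prefixes.
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "Allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed
```

With these sample rules, `/icons/folder.png` matches both `/icons/*.png$` and `/*.png$`, and the longer Allow pattern takes precedence, which is why the icon exceptions can coexist with the blanket `.png` exclusion.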

Comments

  • robots.txt for http://ftp.acc.umu.se/
  • This file specifies what harvesting robots are allowed to index and not.
  • For information on the original format:
  • http://www.robotstxt.org/
  • For information on updated format that allows wildcards:
  • https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
  • Rules without wildcards - original spec
  • Rules with wildcards - enhanced spec