Robots Exclusion Standard data for agfa.com

Resource Scan

Scan Details

Site Domain agfa.com
Base Domain agfa.com
Scan Status Ok
Last Scan 2024-09-07T23:32:44+00:00
Next Scan 2024-10-07T23:32:44+00:00

Last Scan

Scanned 2024-09-07T23:32:44+00:00
URL https://agfa.com/robots.txt
Redirect http://static.agfa.com/robots.txt
Redirect Domain static.agfa.com
Redirect Base agfa.com
Domain IPs 134.54.224.20
Redirect IPs 134.54.224.20
Response IP 134.54.224.20
Found Yes
Hash 2a696afa9d15e3342ad63d2ac67602d12918993117b3f27eee9b461a53f728c4
SimHash 821a9f2b8f7a

Groups

blp_bbot/0.1

Rule Path
Disallow /

blp_bbot

Rule Path
Disallow /

psbot/0.1

Rule Path
Disallow /

psbot

Rule Path
Disallow /

exabot/3.0

Rule Path
Disallow /

baiduspider+

Rule Path
Disallow /

yeti

Rule Path
Disallow /

snapbot/1.0

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin
Disallow /agfa.belgium
Disallow /agfa.netherlands
Disallow /error
Disallow /graphics_home
Disallow /graphics_home2
Disallow /guestbook
Disallow /images
Disallow /img
Disallow /img_menu
Disallow /img_promo
Disallow /look
Disallow /moreimg
Disallow /photo
Disallow /thermoconduct
Disallow /co/global/en/internet/main/plus
Disallow /co/global/en/binaries/PLUS
Disallow /plus
Disallow /co/global/en/internet/main/icims.jsp
Disallow /docs-graphics/alfresco
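
As a rough illustration of how a crawler might evaluate the groups above, the sketch below feeds a partial reconstruction of the file to Python's standard `urllib.robotparser`. The raw robots.txt text is not shown in this report, so the `User-agent:` / `Disallow:` layout is an assumption, only a subset of the rules is included, and `ExampleBot` is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Assumed partial reconstruction of the file from the groups listed above;
# the raw robots.txt text itself is not part of this report.
ROBOTS_TXT = """\
User-agent: psbot
Disallow: /

User-agent: yeti
Disallow: /

User-agent: *
Disallow: /cgi-bin
Disallow: /images
Disallow: /img
Disallow: /plus
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler with no group of its own,
# so it falls through to the "*" group.
checks = [
    ("psbot", "https://agfa.com/"),
    ("ExampleBot", "https://agfa.com/"),
    ("ExampleBot", "https://agfa.com/cgi-bin/test"),
    ("ExampleBot", "https://agfa.com/img_menu/logo.png"),
]

for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:<12} {url:<40} {verdict}")
```

Python's parser matches rule paths by prefix, so in this sketch `Disallow /img` also covers paths such as `/img_menu`. Other crawlers may interpret the same file differently, so treat the results as indicative only.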

Comments

  • This file is to guide robots
  • The * means 'any robot'
  • Basically, we do not want robots to execute scripts
  • For more info on how to configure a robots.txt file, see http://info.webcrawler.com/mak/projects/robots/robots.html
  • Trying to get rid of PLUS pages in the results (24/06/2013)
  • excluding graphics library documents (20/04/2015)