upc.edu
robots.txt

Robots Exclusion Standard data for upc.edu

Resource Scan

Scan Details

Site Domain upc.edu
Base Domain upc.edu
Scan Status Ok
Last Scan 2024-10-29T16:33:04+00:00
Next Scan 2024-11-28T16:33:04+00:00

Last Scan

Scanned 2024-10-29T16:33:04+00:00
URL https://upc.edu/robots.txt
Redirect https://www.upc.edu/robots.txt
Redirect Domain www.upc.edu
Redirect Base upc.edu
Domain IPs 147.83.2.135
Redirect IPs 147.83.2.135, 2001:40b0:7500:1::21
Response IP 147.83.2.135
Found Yes
Hash 479e6335f8c7d37f68c5d6be193748162c0aa2c470d4a376bd3e73462b41a6ed
SimHash af54aa554d65

Groups

*

Rule Path
Disallow */noindex-upc/*
Disallow /*sendto_form$
Disallow /*folder_factories$
Disallow /sga/*.pdf
Disallow /slt/gina*
Disallow /slt/assessorament*
Disallow /slt/aeronautica*
Disallow /slt/informe-mundial*
Disallow /slt/mobility*
Disallow /sostenible2015*
Disallow *manage_translations_form$
Disallow *folder_contents$
Disallow /sri/ca/estudiantat/mobilitat-pas/resolucions*
Disallow /sri/ca/estudiantat/mobilitat-pdi/personal-upc/erasmus-accio-ka10*
Disallow /genweb-pre*
Disallow /ca/media*
Disallow /es/media*
Disallow /en/media*
Disallow */content/*
Disallow */continguts-home/*
Disallow */contenidos-home/*
Disallow */home-contents/*
Disallow /ca/serveis/content/intro-serveis*

googlebot

Rule Path
Disallow /*?
Disallow /*atct_album_view$
Disallow /*folder_factories$
Disallow /*folder_summary_view$
Disallow /*login_form$
Disallow /*mail_password_form$
Disallow /*search
Disallow /*search_rss$
Disallow /*sendto_form$
Disallow /*summary_view$
Disallow /*thumbnail_view$
Disallow /*view$
Disallow /genweb-pre*
Disallow /ca/media*
Disallow /es/media*
Disallow /en/media*
Disallow */content/*
Disallow */continguts-home/*
Disallow */contenidos-home/*
Disallow */home-contents/*
Disallow /ca/serveis/content/intro-serveis*

Other Records

Field Value
sitemap https://www.upc.edu/sitemap.xml.gz

Comments

  • Define access-restrictions for robots/spiders
  • http://www.robotstxt.org/wc/norobots.html
  • By default we allow robots to access all areas of our site already accessible to anonymous users
  • Add Googlebot-specific syntax extension to exclude forms that are repeated for each piece of content in the site
  • the wildcard is only supported by Googlebot
  • http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
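
The Disallow paths listed above rely on two pattern characters: `*`, which matches any sequence of characters, and a trailing `$`, which anchors the rule to the end of the URL path. A minimal Python sketch of that matching follows. The function names `rule_to_regex` and `is_disallowed` are hypothetical, the rule list is only a small sample taken from the googlebot group, and the sketch ignores Allow rules, longest-match precedence, and percent-encoding, so it is an illustration rather than how any particular crawler is implemented.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, and a
    trailing '$' pins the match to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path: str, rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

# Sample of the googlebot rules listed above (not the full group).
SAMPLE_RULES = ["/*?", "/*login_form$", "/*view$", "/en/media*"]

if __name__ == "__main__":
    for path in ["/ca/login_form", "/en/media/photo.jpg",
                 "/ca/page?lang=en", "/en/about"]:
        print(path, is_disallowed(path, SAMPLE_RULES))
```

Under these assumptions, `/ca/login_form` is blocked by `/*login_form$`, while `/ca/login_form/help` is not, because the `$` requires the path to end right after `login_form`.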