open.ac.uk
robots.txt

Robots Exclusion Standard data for open.ac.uk

Resource Scan

Scan Details

Site Domain open.ac.uk
Base Domain open.ac.uk
Scan Status Ok
Last Scan 2025-07-06T20:29:45+00:00
Next Scan 2025-08-05T20:29:45+00:00

Last Scan

Scanned 2025-07-06T20:29:45+00:00
URL https://www.open.ac.uk/robots.txt
Domain IPs 137.108.200.90
Response IP 137.108.200.90
Found Yes
Hash 94dd13fd85db94cc77628afa958aee4072d1a71e99b08cbd05ad9f6d4a9a5dda
SimHash b2c5194bc775
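
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The sketch below shows how such a digest could be computed for a fetched robots.txt body. It assumes the scanner hashes the raw response bytes, which the report does not state, and the function name body_digest is hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a raw robots.txt body.

    Assumption: the digest is taken over the unmodified response bytes.
    The scan report does not say which bytes are hashed.
    """
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    sample = b"User-agent: *\nDisallow: /wikis/*\n"  # illustrative body, not the real file
    print(body_digest(sample))
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed, without storing or diffing the full body.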

Groups

rightnow_webindexer
twitterbot

Product Comment
rightnow_webindexer RightNow # CUSTOM
Rule Path
Allow /ouheaders/gui/

*

Product Comment
* applies to all robots
Rule Path
Disallow /*cgi-bin*
Disallow /*CFIDE*
Disallow /account/*
Disallow /contact/new*
Disallow /request/*
Disallow /openlearn/profiles/*
Disallow /*feed-items*
Disallow /*feed%3D*
Disallow /library/news/feed*
Disallow /libraryservices/feeds*
Disallow /*feed?*
Disallow /*Tooltip-feed-atom*
Disallow /library/digital-archive/search*
Disallow /Arts/reading/UK/search_basic_results*
Disallow /Arts/reading/UK/browse_reader*
Disallow /libraryservices/beta/search/*
Disallow /outbound/article/*
Disallow /author/admin/
Disallow /libraryservices/feedback/poll/*
Disallow /*hello-world
Disallow /*sort%3D*
Disallow /*URL%3D*
Disallow /*url%3D*
Disallow /*MEDIA%3D*
Disallow /*KWCAMPAIGN%3D*
Disallow /*CATCODE%3D*
Disallow /*payments?rid=*
Disallow /*replytocom*
Disallow /*attachment_id%3D*
Disallow /*ajaxCalendar%3D*
Disallow /*timein%3D*
Disallow /*field_category_value*
Disallow /*pid%3D*
Disallow /*tag%3D*
Disallow /wikis/*
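
Most of the rules above rely on the * wildcard, which standard prefix matching does not cover. The following is a minimal sketch of how a crawler might test a URL path against such patterns, treating * as "any sequence of characters" and a trailing $ as an end-of-path anchor. The function names pattern_to_regex and is_disallowed are hypothetical, only a handful of the listed rules are included, percent-encoding is compared literally without normalisation, and Allow/Disallow precedence is ignored.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for the wildcards.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the path from its start."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# A small subset of the Disallow rules listed for the '*' group.
RULES = ["/*cgi-bin*", "/account/*", "/*feed?*", "/*sort%3D*", "/wikis/*"]

if __name__ == "__main__":
    for candidate in ["/account/settings", "/blog/feed?type=rss", "/courses"]:
        print(candidate, is_disallowed(candidate, RULES))
```

Because matching starts at the beginning of the path, a rule such as /account/* blocks /account/settings but not /account itself, while a leading /* lets a rule like /*cgi-bin* match the token anywhere in the path.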

Comments

  • This file is to prevent the crawling and indexing of certain parts of our site by web crawlers and spiders run by sites like Google.
  • By telling these "robots" where not to go on the site, we save bandwidth and server resources.
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/wc/robots.html
  • additional
  • feeds
  • search results
  • Paths
  • parameters
  • wikis