open.ac.uk
robots.txt

Robots Exclusion Standard data for open.ac.uk

Resource Scan

Scan Details

Site Domain	open.ac.uk
Base Domain	open.ac.uk
Scan Status	Ok
Last Scan	2025-07-06T20:29:45+00:00
Next Scan	2025-08-05T20:29:45+00:00
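The two timestamps above imply a rescan interval, which can be checked with a short sketch. The variable names are illustrative and not part of the report, and a fixed schedule is only inferred from this single pair of values.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2025-07-06T20:29:45+00:00")
next_scan = datetime.fromisoformat("2025-08-05T20:29:45+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)
```

For this pair of values the difference is exactly 30 days.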

Last Scan

Scanned	2025-07-06T20:29:45+00:00
URL	https://www.open.ac.uk/robots.txt
Domain IPs	137.108.200.90
Response IP	137.108.200.90
Found	Yes
Hash	94dd13fd85db94cc77628afa958aee4072d1a71e99b08cbd05ad9f6d4a9a5dda
SimHash	b2c5194bc775
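The report does not say which algorithm produced the Hash value. Its 64 hexadecimal characters are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. The function name robots_digest is hypothetical, and the exact bytes the scanner hashed (for example, before or after any normalization) are unknown.

```python
import hashlib

def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

# Any byte string yields a 64-character hex digest, the same length
# as the Hash value shown in the report.
example = robots_digest(b"User-agent: *\nDisallow: /cgi-bin\n")
print(len(example))
```

Comparing such a digest across scans would be one way to detect that the file changed, assuming the scanner hashes the same bytes.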

Groups

rightnow_webindexer
twitterbot

Product	Comment
rightnow_webindexer	RightNow # CUSTOM

Rule	Path
Allow	/ouheaders/gui/

*

Product	Comment
*	applies to all robots

Rule	Path
Disallow	/cgi-bin
Disallow	/CFIDE
Disallow	/account/*
Disallow	/contact/new*
Disallow	/request/*
Disallow	/openlearn/profiles/*
Disallow	/feed-items
Disallow	/feed%3D
Disallow	/library/news/feed*
Disallow	/libraryservices/feeds*
Disallow	/feed?
Disallow	/Tooltip-feed-atom
Disallow	/library/digital-archive/search*
Disallow	/Arts/reading/UK/search_basic_results*
Disallow	/Arts/reading/UK/browse_reader*
Disallow	/libraryservices/beta/search/*
Disallow	/outbound/article/*
Disallow	/author/admin/
Disallow	/libraryservices/feedback/poll/*
Disallow	/*hello-world
Disallow	/sort%3D
Disallow	/URL%3D
Disallow	/url%3D
Disallow	/MEDIA%3D
Disallow	/KWCAMPAIGN%3D
Disallow	/CATCODE%3D
Disallow	/payments?rid=
Disallow	/replytocom
Disallow	/attachment_id%3D
Disallow	/ajaxCalendar%3D
Disallow	/timein%3D
Disallow	/field_category_value
Disallow	/pid%3D
Disallow	/tag%3D
Disallow	/wikis/*
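As a rough illustration of how a crawler might evaluate rules like these, the sketch below turns a robots.txt path pattern into a regular expression, treating `*` as "any sequence of characters" and a trailing `$` as an end anchor. The helper names pattern_to_regex and is_disallowed are hypothetical, only a handful of the rules from the table are included, and Allow precedence and percent-encoding normalization are deliberately ignored.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a robots.txt path pattern into a compiled regex (simplified)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text; each '*' becomes '.*' (match any characters).
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# A small subset of the Disallow rules listed for the '*' group.
DISALLOW = [
    "/cgi-bin",
    "/account/*",
    "/contact/new*",
    "/*hello-world",
    "/payments?rid=",
    "/wikis/*",
]

def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the path from its start."""
    return any(pattern_to_regex(rule).match(path) for rule in rules)

print(is_disallowed("/cgi-bin/test.cgi"))
print(is_disallowed("/courses"))
```

Patterns are matched from the start of the path, so "/cgi-bin" acts as a prefix rule while "/*hello-world" can match at any depth. A production crawler would also need to apply longest-match precedence between Allow and Disallow rules.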

Rule	Path
Disallow	/*cgi-bin*
Disallow	/*CFIDE*
Disallow	/account/*
Disallow	/contact/new*
Disallow	/request/*
Disallow	/openlearn/profiles/*
Disallow	/*feed-items*
Disallow	/*feed%3D*
Disallow	/library/news/feed*
Disallow	/libraryservices/feeds*
Disallow	/*feed?*
Disallow	/*Tooltip-feed-atom*
Disallow	/library/digital-archive/search*
Disallow	/Arts/reading/UK/search_basic_results*
Disallow	/Arts/reading/UK/browse_reader*
Disallow	/libraryservices/beta/search/*
Disallow	/outbound/article/*
Disallow	/author/admin/
Disallow	/libraryservices/feedback/poll/*
Disallow	/*hello-world
Disallow	/*sort%3D*
Disallow	/*URL%3D*
Disallow	/*url%3D*
Disallow	/*MEDIA%3D*
Disallow	/*KWCAMPAIGN%3D*
Disallow	/*CATCODE%3D*
Disallow	/*payments?rid=*
Disallow	/*replytocom*
Disallow	/*attachment_id%3D*
Disallow	/*ajaxCalendar%3D*
Disallow	/*timein%3D*
Disallow	/*field_category_value*
Disallow	/*pid%3D*
Disallow	/*tag%3D*
Disallow	/wikis/*

Comments

This file is to prevent the crawling and indexing of certain parts
of our site by web crawlers and spiders run by sites like Google.
By telling these "robots" where not to go on the site,
we save bandwidth and server resources.
For more information about the robots.txt standard, see:
http://www.robotstxt.org/wc/robots.html
additional
feeds
search results
Paths
parameters
wikis
