jisc.ac.uk
robots.txt

Robots Exclusion Standard data for jisc.ac.uk

Resource Scan

Scan Details

Site Domain jisc.ac.uk
Base Domain jisc.ac.uk
Scan Status Ok
Last Scan 2024-06-02T22:28:27+00:00
Next Scan 2024-07-02T22:28:27+00:00

Last Scan

Scanned 2024-06-02T22:28:27+00:00
URL https://jisc.ac.uk/robots.txt
Redirect https://www.jisc.ac.uk:443/robots.txt
Redirect Domain www.jisc.ac.uk
Redirect Base jisc.ac.uk
Domain IPs 13.248.204.98, 2a05:d028:5:8000:80d0:427f:9fad:83db, 2a05:d028:5:8001:e325:4e7a:17f:bd59, 76.223.88.61
Redirect IPs 2a05:d028:5:8000:63d4:8e03:79c8:dffa, 2a05:d028:5:8001:44ad:32a0:8f2a:5b18, 52.212.11.19, 54.247.77.60
Response IP 52.212.11.19
Found Yes
Hash 44aa6570e4236179f73d4e121ab861f769f3056ee5742f02e78077b6ce676b0f
SimHash f8901d00c764
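
A minimal Python sketch for checking the recorded content hash: the 64-hex-character Hash above is assumed to be a SHA-256 digest of the robots.txt response body, and the live file may have changed since the scan date, so a mismatch is not necessarily an error.

    import hashlib
    from urllib.request import urlopen

    RECORDED = "44aa6570e4236179f73d4e121ab861f769f3056ee5742f02e78077b6ce676b0f"

    # Fetch the current robots.txt body from the redirect target noted above.
    with urlopen("https://www.jisc.ac.uk/robots.txt") as response:
        body = response.read()

    digest = hashlib.sha256(body).hexdigest()
    print("sha256:", digest)
    print("matches recorded hash:", digest == RECORDED)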

Groups

*

Rule Path
Disallow /admin
Disallow /admin/
Disallow /connect-more/programme-2023/connect-more-2023-programme-file
Disallow /contact/your-relationship-manager/import-salesforce-data
Disallow /dcdc-conference/programme-2023/dcdc23-programme-file
Disallow /networkshop/programme-2023/networkshop-2023-programme-file
Disallow /security-conference/programme-2023/security-conference-programme-2023-file
Disallow /about-us/stakeholder-strategic-updates/stakeholder-strategic-update-event-2023
Disallow /cyber-security-for-independent-education
Disallow /cyber-security-for-local-authorities
Disallow /cyber-security-for-research-and-innovation
Disallow /heidi-plus/planned-maintenance
Disallow /heidi-plus/unavailable
Disallow /purchasing-consultancy-services-terms
Disallow /supplying-consultancy-services-terms
Disallow /website/accessibility-statement/data-explorer
Disallow /website/accessibility-statement/eduroam-app
Disallow /website/accessibility-statement/govroam-app
Disallow /website/accessibility-statement/heidi-plus
Disallow /website/accessibility-statement/jisc-support
Disallow /website/accessibility-statement/study-goal
Disallow /website/privacy-notice/data-explorer

Other Records

Field Value
crawl-delay 10
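
A minimal sketch of how a generic crawler would interpret the group and record above, using Python's standard-library urllib.robotparser. The rule list is abridged to a few of the Disallow entries, and the crawl-delay record (conventionally seconds between requests) is assumed to apply to the "*" group.

    from urllib.robotparser import RobotFileParser

    # Abridged reconstruction of the "*" group listed above.
    lines = [
        "User-agent: *",
        "Crawl-delay: 10",
        "Disallow: /admin",
        "Disallow: /admin/",
        "Disallow: /heidi-plus/unavailable",
        "Disallow: /website/privacy-notice/data-explorer",
    ]

    parser = RobotFileParser()
    parser.parse(lines)

    print(parser.can_fetch("*", "https://www.jisc.ac.uk/admin/"))    # False
    print(parser.can_fetch("*", "https://www.jisc.ac.uk/about-us"))  # True
    print(parser.crawl_delay("*"))                                   # 10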

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • Non-user-facing pages
  • User-facing pages
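
As the comments note, the file is only honoured at the root of the host. A short, hedged sketch of how a polite client would consult it before fetching a page; the "ExampleBot" user-agent string is illustrative.

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://www.jisc.ac.uk/robots.txt")  # must live at the host root
    rp.read()                                        # fetches and parses the file

    url = "https://www.jisc.ac.uk/contact/your-relationship-manager/import-salesforce-data"
    if rp.can_fetch("ExampleBot", url):
        print("allowed to fetch", url)
    else:
        print("blocked by robots.txt:", url)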