
Robots Exclusion Standard data for nhm.ac.uk

Resource Scan

Scan Details

Site Domain nhm.ac.uk
Base Domain nhm.ac.uk
Scan Status Ok
Last Scan 2024-09-23T22:28:00+00:00
Next Scan 2024-10-07T22:28:00+00:00

Last Scan

Scanned 2024-09-23T22:28:00+00:00
URL https://nhm.ac.uk/robots.txt
Redirect https://www.nhm.ac.uk/robots.txt
Redirect Domain www.nhm.ac.uk
Redirect Base nhm.ac.uk
Domain IPs 20.108.76.200
Redirect IPs 13.107.246.59, 2620:1ec:bdf::59
Response IP 13.107.246.59
Found Yes
Hash 577867f481ff51a95d2df6035a36610ab38e61f612e79edcd75878baef98a724
SimHash 645d7857c6d3
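
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so treat that as an assumption. Under that assumption, a minimal sketch of how such a fingerprint could be computed and compared between scans follows; `robots_digest` and the sample body are hypothetical and not part of the scanner.

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical file contents, NOT the real nhm.ac.uk robots.txt.
previous = b"User-agent: *\nDisallow: /cgi-bin/\n"
current = b"User-agent: *\nDisallow: /cgi-bin/\nDisallow: /search/\n"

# A changed digest signals that the file was edited between two scans.
changed = robots_digest(previous) != robots_digest(current)
print("changed:", changed)
```

An exact digest changes on any byte difference, including whitespace, which is presumably why the report also lists a separate SimHash for near-duplicate comparison.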

Groups

*

Rule Path
Disallow /uksf-bin/
Disallow /cgi-bin/
Disallow /search/
Disallow /about-us/search/
Disallow /jdsml/mils/
Disallow /natureplus/people/
Disallow /research-curation/research/projects/british-insect-mines/database/
Disallow /research-curation/research/projects/solanaceaesource/
Disallow /research-curation/research/projects/species-dictionary-new/
Disallow /print-version/
Disallow /resources-rx/*
Disallow /resources-rx/files/*
Disallow /research-curation/scientific-resources/collections/library-collections/wallace-letters-online/
Disallow /visit/whats-on/programs/*
Disallow /sso/login
Disallow /gallery-api/*
Disallow /discover/macgillivray/*
Disallow /discover/endeavour/*
Disallow *.pdf
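
Several of the paths above use `*` as a wildcard. As a rough sketch of how a crawler might evaluate them, the following follows the common convention (as in RFC 9309) that `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The function names and test paths are hypothetical, and real parsers differ in how they treat a pattern such as `*.pdf` that does not start with `/`.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and anything else is treated as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


# A few Disallow paths copied from the "*" group above (abridged).
STAR_GROUP = ["/cgi-bin/", "/search/", "/resources-rx/*", "/sso/login", "*.pdf"]

# Hypothetical request paths, for illustration only.
for path in ["/search/results", "/visit.html", "/docs/guide.pdf"]:
    print(path, is_disallowed(path, STAR_GROUP))
```

This sketch checks Disallow rules only. The group above contains no Allow rules, so longest-match precedence between Allow and Disallow does not come into play here.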

Other Records

Field Value
crawl-delay 8

googlebot

Rule Path
Disallow /bin/*
Disallow /discover/macgillivray/*
Disallow /discover/endeavour/*
Disallow /our-science/data/*
Disallow /CalmView/*
Disallow /content/*
Disallow /*?*tags=
Disallow /*?*gclid=
Disallow /*?*unique_ID=
Disallow /about-us/contact-enquiries/forms/emailform.jsp?*

Other Records

Field Value
sitemap https://www.nhm.ac.uk/sitemap.xml
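
The two groups are not cumulative under the usual reading of the standard: a crawler that matches a named group obeys that group alone and ignores `*`. A small sketch of that selection follows, assuming simple substring matching on the user-agent string. The rule lists are abridged from the groups above, while the helper names, user-agent strings, and request paths are hypothetical.

```python
import re

# Disallow paths transcribed from the two groups in this scan (abridged).
GROUPS = {
    "*": ["/search/", "/sso/login", "/discover/endeavour/*", "*.pdf"],
    "googlebot": ["/bin/*", "/discover/endeavour/*", "/content/*", "/*?*gclid="],
}


def select_group(user_agent: str) -> str:
    """Return the name of the group a crawler should obey."""
    ua = user_agent.lower()
    named = [name for name in GROUPS if name != "*" and name in ua]
    # Prefer the longest matching token; fall back to the wildcard group.
    return max(named, key=len) if named else "*"


def blocked(user_agent: str, path: str) -> bool:
    """True if the selected group has a Disallow pattern matching the path."""
    for pattern in GROUPS[select_group(user_agent)]:
        # '*' matches any run of characters; the rest is a prefix match.
        # A trailing '$' anchor is not handled, since no rule here uses one.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.match(regex, path):
            return True
    return False


for ua in ["Googlebot/2.1", "ExampleBot/1.0"]:
    for path in ["/search/dinosaurs", "/visit.html?gclid=abc"]:
        print(ua, path, blocked(ua, path))
```

Read this way, `/search/` is closed to generic crawlers but not to googlebot, while URLs carrying a `gclid=` parameter are closed to googlebot only. The `crawl-delay 8` record likewise sits in the `*` group, so it would not apply to a crawler that selects the googlebot group.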

Comments

  • robots.txt for www.nhm.ac.uk/