commons.datacite.org
robots.txt

Robots Exclusion Standard data for commons.datacite.org

Resource Scan

Scan Details

Site Domain commons.datacite.org
Base Domain datacite.org
Scan Status Ok
Last Scan 2025-09-23T06:03:37+00:00
Next Scan 2025-10-23T06:03:37+00:00

Last Scan

Scanned 2025-09-23T06:03:37+00:00
URL https://commons.datacite.org/robots.txt
Domain IPs 66.33.60.35, 66.33.60.67
Response IP 66.33.60.129
Found Yes
Hash 0808522c26e204bd824ccd1c841030f48c58d6253db5b374e65ba5258c122580
SimHash b8908fd56f46

Groups

*

Rule Path
Allow /$
Allow /doi.org$
Allow /orcid.org$
Allow /ror.org$
Allow /repositories$
Disallow /
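
The effect of these rules can be checked with a short sketch. Python's standard urllib.robotparser is not used here because it treats "$" as a literal character, so the sketch below implements its own matcher. It assumes the matching behaviour described in RFC 9309: "$" anchors the end of the path, "*" matches any run of characters, the longest matching pattern wins, and Allow wins a tie. The names RULES, pattern_to_regex, and is_allowed are illustrative only, and the test paths other than the five allowed ones are hypothetical.

```python
import re

# Rules of the "*" group, copied from the Rule/Path table above.
RULES = [
    ("allow", "/$"),
    ("allow", "/doi.org$"),
    ("allow", "/orcid.org$"),
    ("allow", "/ror.org$"),
    ("allow", "/repositories$"),
    ("disallow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: a trailing '$' anchors the end of the path and '*'
    matches any run of characters; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without '$' the pattern is a prefix match, so no end anchor is added.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if a crawler may fetch `path` under `rules`."""
    best = None  # (pattern length, is_allow) of the best match so far
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            # Longer pattern wins; on equal length, allow (True) wins.
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule at all is allowed by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in ["/", "/doi.org", "/repositories", "/doi.org/10.1234/example"]:
        print(path, "->", "allowed" if is_allowed(path) else "disallowed")
```

Under these assumptions only the five exact landing-page paths are allowed; anything beneath them, such as a path that continues after /doi.org, falls through to the catch-all Disallow rule.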

Comments

  • DataCite by default denies robot access to Commons unless a prior agreement has been made
  • Our data is publicly available for machine access via our various APIs
  • Please get in touch at support@datacite.org to discuss your use-cases
  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • Disallow every robot except for the landing pages