SBN DOI Procedures

From The SBN Wiki
Revision as of 20:14, 5 June 2020 by Raugh

This page describes the nominal procedures SBN will follow in reserving and publishing DOIs under the most common circumstances. Uncommon circumstances are, ironically, fairly common in small bodies data, so if you have one please do not hesitate to contact us.

DOI Milestones

The primary goal of assigning DOIs to archive data sets is to give credit to those who produced a high-quality research data set (PDS archived data sets are refereed publications), and to make the data easier to discover in the larger world outside the PDS archives. With that in mind, DOIs should be part of the design considerations for data production, and DOI milestones should be included in planning for review and publication of the data.

Large data production efforts - those which produce one or more complete bundles, typically each with multiple collections, each of those with many thousands of individual products - will generally be governed by a signature document that includes the archive design. This document is often referred to as the System Interface Specification (SIS), Interface Control Document (ICD), or Data Management Plan (DMP). Small data production efforts should still have some less voluminous but equally important design and scheduling document. Key activities related to DOI generation should be included in both the design and the development schedule in the relevant document.

The major DOI-related events that should be included in project schedules are described below.

Determine which collections or bundles will receive DOIs.

This should be part of the archive design. The archive units (collections and bundles) that will be assigned DOIs should be noted in the design or control document. Because a DOI gives credit to the people who produced the archive unit it is assigned to, authorship is an important criterion to consider when determining collection and bundle boundaries.

In general, DOIs are assigned to either a collection or a bundle, but not both. Remember that PDS DOIs are counted as refereed publications, so anything that looks like it might be artificially inflating an author's refereed publication count should be avoided. If you have a reason for wanting DOIs at both levels, let us know - it may well be appropriate and beneficial, and if so that should be included in the archiving plan.

On rare occasions you might also have a single document or product (like a press-release image) for which you would like a DOI, or an additional high-level product collection over and above what was in the original controlling document. SBN can accommodate these requests, with a rapid turnaround if needed (to meet a publishing deadline, for example). Collect the metadata and give us a call.

Reserve the DOIs.

DOIs can be reserved with partial metadata. Reserved DOIs are not findable in public databases, but they allow the data preparers to include the DOI in the metadata for the archive unit. The metadata XML file that SBN uses to submit DOI requests is also a very useful tool for collecting and validating the complete metadata required before the DOI can be published. Note that unpublished DOIs can, if needed, be expunged: drafting and deleting unpublished DOIs is a relatively painless process.

Metadata requirements for reserving a DOI are covered below.

Complete the metadata.

The intention is to update the PDS4 Information Model and label structures to include as many of the metadata fields as is reasonable, but it is likely that this will never be a completely automatic process. In the meantime, SBN is using the DataCite metadata schema to define XML files containing the needed metadata, with an additional Schematron file to help enforce requirements and consistency in terminology. The sort of data that will likely need to be added manually includes such things as: contributors (beyond those who are included in the author/editor list); affiliations for creators and contributors; ORCIDs, where available; subject keywords relevant to the archive unit; data volumes; and possibly funding information, depending on how NASA evolves on that question. In addition, it will be important to ensure that titles and abstracts can logically stand alone in a non-PDS context (in an ADS listing, for example).

The metadata should be completed as part of review preparation, so that the DOIs can be published as soon as the data are accepted for archiving. SBN will not publish a DOI that does not have sufficient metadata. "Sufficient", in the case of refereed archive data, means "rich enough to meet FAIR data principles." (We realize, however, that "sufficient" may require context-dependent interpretation in some cases.)
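
To make the manually added fields more concrete, here is a minimal Python sketch that assembles a DataCite-style XML record covering several of the items listed above (creators with affiliations and ORCIDs, a contributor, subject keywords, and a data volume). It is an illustration only, not the file layout or tooling SBN actually uses. The function name build_record, the input dictionary layout, and every value in the example (including the DOI, ORCID, names, and the "DataCurator" contributor type chosen here) are placeholders invented for this sketch. The element and attribute names are taken from the DataCite 4.x schema, but the output is not validated against that schema or against SBN's Schematron rules.

```python
import xml.etree.ElementTree as ET

# DataCite kernel-4 namespace; the output below is NOT schema-validated.
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def build_record(meta):
    """Build a DataCite-style <resource> element from a plain dict.

    The dict layout is a hypothetical convenience for this sketch.
    """
    ET.register_namespace("", DATACITE_NS)

    def q(tag):
        # Qualify a tag name with the DataCite namespace.
        return f"{{{DATACITE_NS}}}{tag}"

    root = ET.Element(q("resource"))
    ET.SubElement(root, q("identifier"), identifierType="DOI").text = meta["doi"]

    creators = ET.SubElement(root, q("creators"))
    for person in meta["creators"]:
        creator = ET.SubElement(creators, q("creator"))
        ET.SubElement(creator, q("creatorName")).text = person["name"]
        if person.get("orcid"):  # ORCIDs are included only where available
            ET.SubElement(
                creator, q("nameIdentifier"), nameIdentifierScheme="ORCID"
            ).text = person["orcid"]
        if person.get("affiliation"):
            ET.SubElement(creator, q("affiliation")).text = person["affiliation"]

    titles = ET.SubElement(root, q("titles"))
    ET.SubElement(titles, q("title")).text = meta["title"]

    subjects = ET.SubElement(root, q("subjects"))
    for keyword in meta.get("subjects", []):
        ET.SubElement(subjects, q("subject")).text = keyword

    contributors = ET.SubElement(root, q("contributors"))
    for person in meta.get("contributors", []):
        contributor = ET.SubElement(
            contributors, q("contributor"), contributorType=person["type"]
        )
        ET.SubElement(contributor, q("contributorName")).text = person["name"]

    sizes = ET.SubElement(root, q("sizes"))
    ET.SubElement(sizes, q("size")).text = meta["size"]

    descriptions = ET.SubElement(root, q("descriptions"))
    ET.SubElement(
        descriptions, q("description"), descriptionType="Abstract"
    ).text = meta["abstract"]
    return root


# Placeholder values only; none of these identify a real data set or person.
example = {
    "doi": "10.5072/example-doi",
    "title": "Example Small-Body Photometry Collection",
    "creators": [
        {"name": "Doe, Jane", "orcid": "0000-0000-0000-0000",
         "affiliation": "Example University"},
        {"name": "Roe, Richard"},
    ],
    "contributors": [{"name": "Poe, Pat", "type": "DataCurator"}],
    "subjects": ["asteroids", "photometry"],
    "size": "12 GB",
    "abstract": "A stand-alone description written for readers outside PDS.",
}

xml_text = ET.tostring(build_record(example), encoding="unicode")
print(xml_text)
```

Keeping the metadata in a simple structure like this and generating the XML from it is one way to let authors review titles, abstracts, and author lists in isolation, which helps when checking that they stand alone in a non-PDS context.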

Metadata Requirements