Filling Out the Identification Area Class

From The SBN Wiki
Revision as of 14:56, 31 July 2020 by Raugh (talk | contribs) (Update for IM 1.14.0.0)

The <Identification_Area> class is required in all PDS4 labels. It contains the unique logical identifier (LID) for the product, its version identifier (VID), a title, and a link to the PDS4 Information Model version of the master schema used to create or verify the label. It can optionally contain a modification history, and additional information used to cite the product or for cataloging in a database like the Astrophysics Data System.

For additional explanation, see the PDS4 Standards Reference, or contact your PDS node consultant.

Following are the attributes and subclasses you'll find in the Identification_Area, in label order.

Note that in the PDS4 master schema, class names are always capitalized and attribute names never are.
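As an illustration of "label order", here is a minimal Python sketch that assembles the five required attributes of the Identification_Area in the order they are described on this page. The function name and all of the values are hypothetical, and XML namespaces are omitted for brevity, so treat this as an outline rather than a complete label generator.

```python
import xml.etree.ElementTree as ET

# Required attributes of Identification_Area, in the label order used on this page.
REQUIRED_IN_ORDER = (
    "logical_identifier",
    "version_id",
    "title",
    "information_model_version",
    "product_class",
)


def build_identification_area(values):
    """Return an <Identification_Area> element with required attributes in label order.

    `values` is a dict keyed by attribute name (hypothetical helper, no namespaces).
    """
    missing = [name for name in REQUIRED_IN_ORDER if name not in values]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    area = ET.Element("Identification_Area")
    for name in REQUIRED_IN_ORDER:
        ET.SubElement(area, name).text = values[name]
    return area


# Hypothetical values, for illustration only.
example = build_identification_area({
    "logical_identifier": "urn:nasa:pds:example_bundle:data:image_001",
    "version_id": "1.0",
    "title": "Example image product",
    "information_model_version": "1.14.0.0",
    "product_class": "Product_Observational",
})
print(ET.tostring(example, encoding="unicode"))
```

The optional classes (Alias_List, Citation_Information, Modification_History) would follow the required attributes, in that order.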

<logical_identifier>

REQUIRED

Abbreviated "LID", this identifier is unique to the product. All LIDs consist of a series of colon-separated segments. The complete rules for formulating LIDs are in the Standards Reference. Here's a brief summary:

urn:nasa:pds:{BundleID}:[{CollectionID}:[{ProductID}]]

If you're formulating the LID for a bundle, you stop at BundleID, which has to be unique across the PDS. If you're formulating a collection LID, you'll need the LID of the bundle containing this collection, to which you will add a CollectionID that is unique within the bundle. And if you're working on a product that will be part of a collection, you'll need the LID of the collection, to which you will add a ProductID that is unique within the collection.
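The nesting described above can be sketched in Python as follows. This only checks the overall shape from the summary (the fixed `urn:nasa:pds:` prefix followed by one to three non-empty segments); it does not implement the complete rules in the Standards Reference. The function names and example identifiers are hypothetical.

```python
LID_PREFIX = "urn:nasa:pds:"
LEVELS = {1: "bundle", 2: "collection", 3: "product"}


def lid_level(lid):
    """Return 'bundle', 'collection', or 'product' based on the segment count."""
    if not lid.startswith(LID_PREFIX):
        raise ValueError("LID must start with " + LID_PREFIX)
    segments = lid[len(LID_PREFIX):].split(":")
    if len(segments) not in LEVELS or any(s == "" for s in segments):
        raise ValueError("expected one to three non-empty segments after the prefix")
    return LEVELS[len(segments)]


def child_lid(parent_lid, new_id):
    """Append a CollectionID to a bundle LID, or a ProductID to a collection LID."""
    if lid_level(parent_lid) == "product":
        raise ValueError("a product LID cannot be extended")
    if not new_id or ":" in new_id:
        raise ValueError("the new ID must be a single non-empty segment")
    return parent_lid + ":" + new_id


# Hypothetical identifiers, for illustration only.
bundle = "urn:nasa:pds:example_bundle"
collection = child_lid(bundle, "data")
product = child_lid(collection, "image_001")
print(lid_level(bundle), lid_level(collection), lid_level(product))
```

Uniqueness (of the BundleID across PDS, of the CollectionID within its bundle, and of the ProductID within its collection) cannot be checked from the string alone and is left out of this sketch.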

<version_id>

REQUIRED

Abbreviated "VID", this is a version number in the form "M.m", where M and m are integers representing the major and minor revision levels, respectively. This version number is for tracking archived versions, not development versions, so the first version accepted into the archive must be "1.0", and it should be incremented only when a new version is archived. For development tracking, you can use the <Modification_History> entries, or large projects may want to incorporate development tracking into the mission/project dictionary.
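Because M and m are integers, VIDs should be compared numerically rather than as strings or decimal numbers. A minimal Python sketch, assuming only the "M.m" shape described above (the Standards Reference may impose further constraints that this does not check); `parse_vid` is a hypothetical helper name:

```python
import re

# Two runs of ASCII digits separated by a single full stop.
VID_PATTERN = re.compile(r"([0-9]+)\.([0-9]+)")


def parse_vid(vid):
    """Split an 'M.m' version identifier into a (major, minor) tuple of integers."""
    match = VID_PATTERN.fullmatch(vid)
    if match is None:
        raise ValueError("version_id must have the form M.m: " + repr(vid))
    return int(match.group(1)), int(match.group(2))


print(parse_vid("1.0"))
# Comparing tuples of integers orders 2.10 after 2.9, which a string comparison would not.
print(parse_vid("2.10") > parse_vid("2.9"))
```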

<title>

REQUIRED

This is a display title (or name) for the product. It should be user-friendly, in that it should help users quickly distinguish between similar products in a list of search results. Valid characters are any ASCII printable characters; formatting is not preserved; 255 bytes maximum.
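The character and length limits can be checked before a label is written. The Python sketch below assumes that "ASCII printable" means the range from space (0x20) through tilde (0x7E); that reading is an assumption, not something this page spells out, and under it line breaks inside a title are flagged as well. `title_problems` is a hypothetical helper name.

```python
def title_problems(title):
    """Return a list of problems found in a candidate <title> value (empty if none)."""
    problems = []
    # Assumption: printable ASCII is the range 0x20 (space) through 0x7E (tilde).
    if any(not (0x20 <= ord(ch) <= 0x7E) for ch in title):
        problems.append("contains characters outside printable ASCII")
    # The limit is stated in bytes, so measure the encoded length.
    if len(title.encode("utf-8")) > 255:
        problems.append("longer than 255 bytes")
    return problems


print(title_problems("Example image product"))
print(title_problems("x" * 256))
```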

<information_model_version>

REQUIRED

This must be identical to the value of the version= attribute in the <xs:schema> element near the top of the PDS4 master schema. (Do not include the double quotes in the value.) For example, as of this writing the current IM version number is 1.14.0.0. As things currently stand, attempting to validate a label against any version of the master schema other than the one cited in this attribute, whether it's an earlier or later release, is a fatal error.
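One way to avoid a mismatch is to read the version string straight from the schema file instead of typing it by hand. The Python sketch below does this with the standard library; the one-line schema stub stands in for the real master schema, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema itself, as ElementTree spells qualified tag names.
XS = "{http://www.w3.org/2001/XMLSchema}"


def schema_version(schema_text):
    """Return the version= attribute of the root <xs:schema> element."""
    root = ET.fromstring(schema_text)
    if root.tag != XS + "schema":
        raise ValueError("root element is not xs:schema")
    return root.get("version")


def im_version_matches(label_value, schema_text):
    """True if the label's information_model_version is identical to the schema version."""
    return label_value == schema_version(schema_text)


# Stand-in for the master schema: only the root element and its version attribute.
SCHEMA_STUB = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.14.0.0"/>'

print(schema_version(SCHEMA_STUB))
print(im_version_matches("1.14.0.0", SCHEMA_STUB))
```

The comparison is deliberately an exact string match, since the two values must be identical.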

<product_class>

REQUIRED

The value for this must be identical to the name of the root document element of the label. In other words, if you're working on a label for an observational data product, then this attribute must have the value "Product_Observational". Case and underscores must be exact; do not include any quotes around the value.
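This rule is easy to check mechanically. The following Python sketch compares the product_class value with the name of the root element, ignoring whatever XML namespace the label declares. The helper names are hypothetical, and the namespace URI in the sample labels is a placeholder, not the real PDS4 namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tag names."""
    return tag.rsplit("}", 1)[-1]


def find_child(parent, name):
    """Return the first direct child with the given local name, or None."""
    for child in parent:
        if local_name(child.tag) == name:
            return child
    return None


def product_class_matches_root(label_text):
    """True if <product_class> in the Identification_Area equals the root element name."""
    root = ET.fromstring(label_text)
    area = find_child(root, "Identification_Area")
    if area is None:
        return False
    product_class = find_child(area, "product_class")
    # Exact comparison: case and underscores must match.
    return product_class is not None and product_class.text == local_name(root.tag)


# Stripped-down sample labels; the namespace URI is a placeholder.
GOOD = (
    '<Product_Observational xmlns="http://example.org/placeholder">'
    "<Identification_Area><product_class>Product_Observational</product_class>"
    "</Identification_Area></Product_Observational>"
)
BAD = GOOD.replace(
    "<product_class>Product_Observational", "<product_class>product_observational"
)

print(product_class_matches_root(GOOD), product_class_matches_root(BAD))
```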


<Alias_List>

OPTIONAL

This class is used to map the PDS4 LID to an ID in some other system (internal or external). In general, this class will be needed only rarely. It contains a series of one or more <Alias> classes.

<Alias>

REQUIRED

This class holds the particulars for one product alias - an alternate name by which the product is known. For example, if a PDS3 product can be migrated to PDS4 by simply adding a PDS4 label, a node may opt to use this class to provide the PDS3 identification for the product.

<alternate_id>

OPTIONAL

The identifier in the other system

<alternate_title>

OPTIONAL

The title in the other system

<comment>

OPTIONAL

An explanatory comment about the system, identifier, or relationship


<Citation_Information>

OPTIONAL

This class contains authorship and publication information for use in citing the product and its contents, and also for generating DOIs and registering the product with external catalog databases like the Astrophysics Data System. It is required in all collection, bundle, and individual document products, where the <description> is used as part of the information displayed by search routines for users selecting data. It may be used in any product - including individual observational data products - at the data preparer's discretion.

Of particular interest for SBN data is the <keyword> attribute, which should generally be used whenever a full citation (i.e., one with an author or editor list) is provided for a document or an individual data product. Keywords should be selected from the Unified Astronomy Thesaurus.


<author_list> / <editor_list>

OPTIONAL

For SBN data preparers, if the intent is to create a DOI for this specific product, then at least one of author_list or editor_list must occur; both may occur.

In general, the author_list credits the people responsible for collecting the data and putting it in its present object-file form with the label; the editor_list credits the people responsible for gathering existing data or documents into a PDS4 product. In either case, formatting follows the ADS conventions with the principal author/editor coming first. The format for a single name is "surname [suffix], initials", with each initial followed by a full stop ('.') and an optional space between initials. Multiple names are separated by a semi-colon (';').
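The name format can be generated from structured data rather than typed by hand. The Python sketch below makes two choices the text leaves open, both of which are assumptions: no space between initials (the space is optional), and a space after each semi-colon. The function names and the people are hypothetical.

```python
def format_name(surname, initials, suffix=""):
    """Format one name as 'surname [suffix], initials', each initial followed by '.'."""
    family = surname + " " + suffix if suffix else surname
    dotted = "".join(initial + "." for initial in initials)
    return family + ", " + dotted


def format_name_list(names):
    """Join formatted names with semi-colons, principal author or editor first.

    Each entry is (surname, initials) or (surname, initials, suffix).
    Assumption: a single space follows each semi-colon.
    """
    return "; ".join(format_name(*name) for name in names)


# Hypothetical people, for illustration only.
print(format_name_list([("Doe", "JQ"), ("Roe", "R", "Jr.")]))
```

For the two entries shown, this produces the string `Doe, J.Q.; Roe Jr., R.`, which could be used as the value of author_list or editor_list.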

<publication_year>

REQUIRED

The publication year is the year that PDS released the product for public consumption. For documents, this should be the year of publication of the PDS4-labelled product, which is likely to be different from the year of publication of a document previously released in some other form (the document label contains that information elsewhere).

<keyword>

OPTIONAL

Use this attribute for the sort of keywords you would include in a manuscript submission. This is an opportunity to get additional, specific search terms into the metadata to help users locate data without having to rely on words in free-format descriptive fields.

SBN strongly recommends that keywords be selected from the Unified Astronomy Thesaurus.

Some examples:

<keyword>Comet nuclei</keyword>
<keyword>Comet tails</keyword>
<keyword>Near-Earth objects</keyword>

and so on.

<description>

REQUIRED

This should be a brief description - an abstract for a document, or the equivalent for an observational, collection, or bundle product.

Note that this description (or at least the first few hundred characters) will be returned as a brief description by the central search interface. So make it pithy and don't bury the lead.


<Modification_History>

OPTIONAL

This class is here for use in tracking the revision history of this product, as desired. It must have at least one <Modification_Detail> subclass, and can have as many as needed. PDS has no official opinion on whether your Modification_Details are listed in chronological or reverse chronological order. We do ask that, within a label (and preferably within a collection), you use one order consistently.

When deciding when and how to use this class, remember that the label is an archival document. Proper documentation here could help future users understand why new versions of the product were created, or how one version improves on a previous one. Comments that will have no meaning or significance to a user 20 years from now should probably not be recorded for confused posterity in the <Modification_History>.

<Modification_Detail>

REQUIRED

This class contains details of one episode of modification.

<modification_date>

REQUIRED

Date, in YYYY-MM-DD format, of the described modification. No time part may be included in the value. The same date can be repeated in another <Modification_Detail> for this product, if that makes sense.

No chronological order is enforced, but it's probably a good idea to stick with either "most recent first" or "oldest first" in any given label, and preferably throughout the products of a single collection.
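A date-only value in YYYY-MM-DD form can be checked with the Python standard library. This is a minimal sketch with a hypothetical function name. It pairs a strict pattern with the `date` constructor so that values carrying a time part, unpadded values such as 2020-7-31, and impossible dates are all rejected.

```python
import re
from datetime import date

# Exactly four, two, and two ASCII digits separated by hyphens; nothing else allowed.
DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_modification_date(text):
    """Return a datetime.date for a strict YYYY-MM-DD string, else raise ValueError."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("expected YYYY-MM-DD with no time part: " + repr(text))
    year, month, day = (int(part) for part in match.groups())
    # date() itself raises ValueError for impossible dates such as 2020-02-30.
    return date(year, month, day)


print(parse_modification_date("2020-07-31"))
```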

<version_id>

REQUIRED

The value of <version_id> in the <Identification_Area> after the described change was made. There may be several entries corresponding to separate changes made to produce a particular version, if that makes sense.

Logically, <version_id> should increase with <modification_date>, but as of this writing there are no plans to attempt to check or enforce that.
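Since nothing enforces these conventions, a data preparer may want to check them locally. The Python sketch below takes the (modification_date, version_id) pairs in the order they appear in a label. It reports which ordering is in use and whether the versions are non-decreasing with date. The function names and sample entries are hypothetical, and the values are assumed to be well-formed YYYY-MM-DD and M.m strings.

```python
def history_order(entries):
    """Classify label order as 'oldest first', 'most recent first', or 'mixed'.

    `entries` is a list of (modification_date, version_id) string pairs.
    YYYY-MM-DD strings sort chronologically, so plain string sorting is enough.
    """
    dates = [entry_date for entry_date, _ in entries]
    if dates == sorted(dates):
        return "oldest first"
    if dates == sorted(dates, reverse=True):
        return "most recent first"
    return "mixed"


def vid_key(vid):
    """Turn 'M.m' into a (major, minor) tuple of integers for numeric comparison."""
    major, minor = vid.split(".")
    return int(major), int(minor)


def versions_follow_dates(entries):
    """True if version_id never decreases as modification_date increases."""
    ordered = sorted(entries, key=lambda entry: (entry[0], vid_key(entry[1])))
    versions = [vid_key(vid) for _, vid in ordered]
    return all(earlier <= later for earlier, later in zip(versions, versions[1:]))


# Hypothetical history, listed most recent first.
history = [("2020-07-31", "1.1"), ("2019-01-15", "1.0")]
print(history_order(history), versions_follow_dates(history))
```

Repeated versions are allowed here because several entries may describe separate changes that went into the same version.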

<description>

REQUIRED

A free-format text field (ASCII characters only) for describing the change(s) made. Any formatting (like tables or paragraph breaks) will be preserved.