PDS4 Product Labels, Step by Step

Revision as of 17:01, 26 August 2014

This page provides step-by-step information for filling out a PDS4 label. There are several distinct types of labels that you will encounter, each with different structural details, and there is some overlap. In each case, the label is divided into several major sections, which you'll find listed below in the order in which they appear in the label.

You will find this and the associated pages most useful if you have a label in front of you to work on or reference. You can start with a template, or create a label of one of the following types from scratch using the PDS4 master schema.

The collection of SBN PDS4 templates is on the Templates page of this Wiki. The PDS4 master schema can be downloaded from http://pds.jpl.nasa.gov/pds4/schema/released/ (see the Questions topic Where is the Master Schema? for additional help).

If you have a schema-aware editor like Eclipse, you can get a general idea of how to create a new XML label from scratch from our Eclipse: Creating a New XML File from an XSD Schema File page.


As of 2014-07-08, these pages are updated to reflect the Release 1.2 version of the schemas and documents.


Observational Product Label Structure

Product_Observational is by far the most common label of this type you will encounter, and also the most complex in terms of descriptive options. Product_SPICE_Kernel is closely related.

In a typical observational product, there are four major sections, and occasionally there's a fifth:

  1. Identification Area - contains identifiers that distinguish this product from all others. This area is required in all labels regardless of type.
    Filling Out the Identification_Area Class
  2. Observation Area - contains information used to describe the observation and subsequent processing at a high level. This area is required in observational products.
    Filling Out the Observation_Area Classes
  3. Reference List - contains cross-references to internal products (e.g., calibration observations or documents) and/or external publications that are not already referenced elsewhere in the label. This area is always optional. Think of these as "Additional References".
    Filling Out the Reference_List Classes
  4. File Area - identifies the data file(s) and defines the data structures within observational products. Observational product labels must have at least one File Area, and may have more than one in the case of complex data supplied in multiple files.
    Filling Out the File_Area_Observational Classes
  5. Supplemental File Area - The structure of this area is identical to that of the File Area, but there are additional data structures available in this class. This section is optional - use it if you have supplemental data that is not science data but should usually be supplied with the observational data (reduced-precision browse images, for example).
    Filling Out the File_Area_Observational_Supplemental Classes
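
The ordering and the required/optional status listed above can be checked mechanically. The following is a minimal Python sketch and not a substitute for validating against the PDS4 master schema: it looks only at the names of the top-level classes in a label and ignores XML namespaces. The function and constant names are invented for illustration, and the bare skeleton label is a hypothetical stand-in for a real one.

```python
import xml.etree.ElementTree as ET

# Top-level classes of a Product_Observational label, in label order,
# with the minimum number of occurrences stated on this page.
OBSERVATIONAL_SECTIONS = [
    ("Identification_Area", 1),                   # required in all labels
    ("Observation_Area", 1),                      # required in observational products
    ("Reference_List", 0),                        # always optional
    ("File_Area_Observational", 1),               # at least one, possibly more
    ("File_Area_Observational_Supplemental", 0),  # optional
]


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def check_sections(label_xml, expected=OBSERVATIONAL_SECTIONS):
    """Return a list of problems found in the top-level sections of a label."""
    root = ET.fromstring(label_xml)
    names = [local_name(child.tag) for child in root]
    order = [name for name, _ in expected]

    problems = [f"unexpected section: {n}" for n in names if n not in order]

    # Known sections must appear in the listed order.
    positions = [order.index(n) for n in names if n in order]
    if positions != sorted(positions):
        problems.append("sections are out of order")

    for name, minimum in expected:
        if names.count(name) < minimum:
            problems.append(f"missing required section: {name}")
    return problems


# Hypothetical, contentless label used only to exercise the check.
SKELETON = """
<Product_Observational>
  <Identification_Area/>
  <Observation_Area/>
  <File_Area_Observational/>
</Product_Observational>
"""

print(check_sections(SKELETON))  # prints []
```

An empty list means the top-level outline matches the list above. It says nothing about whether the contents of each area are correct, which is what the schema and the linked pages are for.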

Document Product Label Structure

Documents, which consist of at least a single file and may comprise an entire directory tree, replace the File_Area of the observational product label with a Document_Format_Set that describes the set of files constituting one complete version of a document. Product_Document is the label format for traditional (i.e., mainly textual) documentation.

The major sections of a document label are:

  1. Identification Area - as in the observational product label. Note that for documents that might plausibly be cited in the literature, which is most of them, the <Citation_Information> class should be included in this area.
    Filling Out the Identification_Area Class
  2. Context Area - This is the more general form of the Observation_Area of the observational product label. Fields required for observational products are optional here. In fact, this class is optional in document products.
    Filling Out the Context_Area Classes
  3. Reference List - as in the observational product label. This section is also optional in document products.
    Filling Out the Reference_List Classes
  4. Document Area - This area contains what could be thought of as the "card catalog" information for the document being labeled - that is, the description of the logical context of the document, like title and author.
    Filling Out the Document Class
  5. Document Format Set - This class structure identifies and describes all the files that constitute one copy of the document in one electronic format. Often this will be a single UTF-8 text file or a PDF/A file, but it may also describe a text file with a series of graphics or image files. There must be at least one of these in the document label, and all formats of the document should be described in the same label.
    Filling Out the Document_Format_Set Classes
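
To make the "one Document_Format_Set per format, all in the same label" rule concrete, here is a minimal Python sketch that builds an empty outline of a document label. It uses only the class names given in the list above. The function name, its parameters, and the format names passed in are assumptions made for illustration, and the result is an outline to be filled in, not a valid label.

```python
import xml.etree.ElementTree as ET


def document_label_skeleton(formats, with_context=False, with_references=False):
    """Build an empty Product_Document outline with one Document_Format_Set per format."""
    if not formats:
        raise ValueError("a document label needs at least one Document_Format_Set")

    root = ET.Element("Product_Document")
    ET.SubElement(root, "Identification_Area")
    if with_context:                 # optional in document products
        ET.SubElement(root, "Context_Area")
    if with_references:              # optional in document products
        ET.SubElement(root, "Reference_List")
    ET.SubElement(root, "Document")  # title, author and similar information
    for fmt in formats:
        # All formats of the document are described in the same label.
        format_set = ET.SubElement(root, "Document_Format_Set")
        format_set.append(ET.Comment(f" files making up the {fmt} copy go here "))
    return root


outline = document_label_skeleton(["PDF/A", "UTF-8 text"], with_references=True)
print([child.tag for child in outline])
# prints ['Identification_Area', 'Reference_List', 'Document',
#         'Document_Format_Set', 'Document_Format_Set']
```

Raising an error on an empty format list mirrors the requirement that at least one Document_Format_Set be present.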

There are also specific product types for documentation that is image-based, and a special-case label for XML Schema files. These labels are simpler than the standard Product_Document label, and use a File_Area-type class rather than the Document_Format_Set class to point to their constituent files:

  • Product_Browse and Product_Thumbnail are used to label images used for supporting user access and file selection.
  • Product_XML_Schema is used to label schema documents (either XSD or Schematron) when they go into the archive.

Context Product Label Structure

Context products provide high level descriptive information (suitable for use in a search interface to provide a brief overview or list of distinctive characteristics) about the hardware, facilities, targets and phenomena involved in producing the observational data products. Existing context products can and should be referenced by data preparers creating new products whenever possible. When you have to write your own, you will create a Product_Context label. Here's what's in it:

  1. Identification Area - as in the observational product label, with one note: The <logical_identifier> must follow a set of conventions used for context product logical_identifiers to ensure uniqueness across the entire PDS archive, so you will likely be told exactly what logical_identifier and version_id values must be used in your label.
    Filling Out the Identification_Area Class
  2. Discipline Area - An optional area where classes from discipline dictionaries can be inserted, as needed, to provide additional parameters for information or for searching. In all other product types, this area is inside the <Observation_Area> class, but there is no Observation_Area in a context product. The procedure, requirements, and constraints are the same, though.
    Filling Out the Discipline_Area
  3. Reference List - as in the observational product label. This is optional in context product labels.
    Filling Out the Reference_List Classes
  4. Context Data Object - This will be a descriptive class depending on the type of context concept you're describing. Each context product will have exactly one of these.
    Filling Out Context Object Classes

Collection Product Label Structure

A collection is itself a product - it's a product that defines and describes a significant relationship among a list of other, selected products (specifically, products that are not themselves collections or bundles). Products listed in a collection will be all (or nearly all) of the same type (observational, document, context, etc.), and thus comprise a collection of that type (i.e., an observational collection, a document collection, and so on).

Regardless of type, every collection uses a Product_Collection label. A collection product has a structure that is similar to an observational product:

  1. Identification Area - as in the observational product label, but note that you will have to include a <Citation_Information> class in order to provide the required <description> in that class.
    Filling Out the Identification_Area Class
  2. Context Area - as in document product labels. This class is optional in collection products, but is very useful for associating the collection as a whole with things like the observing instrument, spacecraft, mission, and so on. Collections of observation products in particular should make use of this class.
    Filling Out the Context_Area Classes
  3. Reference List - as in the observational product label. This is optional in collection product labels, but should be used to make high-level associations between collections when that's appropriate - for example, to reference a calibration documentation collection from an observational data collection, or to reference calibration collections from corresponding raw or reduced data collections.
    Filling Out the Reference_List Classes
  4. Collection Area - This class defines the collection type. It is required, as you might expect.
    Filling Out the Collection Class
  5. Inventory File Area - Similar in structure to the File_Area_Observational, this area has many more constraints on it, reflecting the requirements for Collection inventory table formatting and content. It is required to be present.
    Filling Out the File_Area_Inventory Classes

Bundle Product Label Structure

As with collections, each bundle is itself a product, but in this case a product that defines a high-level relationship among collection products. Bundles, however, do not have inventory files - rather, the members are enumerated in the label. All bundles use the Product_Bundle label structure:

  1. Identification Area - as in the collection product label.
    Filling Out the Identification_Area Class
  2. Context Area - as in the collection label. This class is optional in bundle products.
    Filling Out the Context_Area Classes
  3. Reference List - as in the collection label. This is optional in bundle product labels.
    Filling Out the Reference_List Classes
  4. Bundle - This required area provides a type and the option for top-level description of the bundle.
    Filling Out the Bundle Class
  5. Text File Area - This is an optional area that just points to a descriptive text file (think "ReadMe").
    Filling Out the File_Area_Text Class
  6. Bundle Member Entries - There will be an entry for each collection comprising the bundle. There must be at least one of these.
    Filling Out the Bundle_Member_Entry Class
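
As a side-by-side summary of the five structures described on this page, the Python sketch below records the top-level classes of each label type and answers two simple questions: which sections every label type shares, and which label types use a given section. The table and function names are made up for illustration, and the table reflects only what this page lists, not the full schema.

```python
# Top-level sections of each label type, in the order listed on this page.
# Product_Context also carries exactly one context object class, whose name
# depends on what is being described, so it is left out of this table.
LABEL_SECTIONS = {
    "Product_Observational": [
        "Identification_Area", "Observation_Area", "Reference_List",
        "File_Area_Observational", "File_Area_Observational_Supplemental",
    ],
    "Product_Document": [
        "Identification_Area", "Context_Area", "Reference_List",
        "Document", "Document_Format_Set",
    ],
    "Product_Context": [
        "Identification_Area", "Discipline_Area", "Reference_List",
    ],
    "Product_Collection": [
        "Identification_Area", "Context_Area", "Reference_List",
        "Collection", "File_Area_Inventory",
    ],
    "Product_Bundle": [
        "Identification_Area", "Context_Area", "Reference_List",
        "Bundle", "File_Area_Text", "Bundle_Member_Entry",
    ],
}


def shared_sections(structures=LABEL_SECTIONS):
    """Sections that occur in every label type, in the order of the first type."""
    lists = list(structures.values())
    return [s for s in lists[0] if all(s in other for other in lists[1:])]


def label_types_with(section, structures=LABEL_SECTIONS):
    """Label types whose top-level structure includes the given section."""
    return [product for product, sections in structures.items() if section in sections]


print(shared_sections())
# prints ['Identification_Area', 'Reference_List']
print(label_types_with("Context_Area"))
# prints ['Product_Document', 'Product_Collection', 'Product_Bundle']
```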