Designing Labels

From The SBN Wiki
Revision as of 20:17, 25 October 2012 by Raugh (talk | contribs) (Creation - Safety Save)

This page describes one approach to designing PDS4 labels. It applies both to one-off labels, such as those typically written for documents, and to templates that a pipeline will use to mass-produce labels.

Considerations

Essential Higher-Level Info

These things are determined at a higher level than the individual data product, so they may be out of your control. Collecting this information beforehand, where you can, may save you some annoying fishing around when you sit down to start your label.

  • You will need the logical_identifier of the collection containing the product you're designing. This is usually defined during the archive design process. You need it because it will be the first part of the logical_identifier you will have to define for the product.
  • You will need a list of standard names and abbreviations being used for things like instruments, spacecraft, phases and such. These sorts of things are also generally decided during the archive design process.
  • You will need a copy of the data dictionary for the mission (or equivalent), if there is one, so you can see what attributes are already defined. You will also need to know who is in charge of updating this dictionary, since it is likely you will be requesting that new attributes be added to it.
  • You will probably want a copy of one or more PDS discipline dictionaries, as applicable, to see what attributes may already be defined in those. You may also want to request that new attributes be added to one of these dictionaries, so...
  • You might find it handy to have the name and contact info for your PDS data engineering consultant close by, especially if this is your first PDS4 label.


Know Thy Data

  • You should be well-versed in the format and content of the data you want to label, down to the byte level. For example, if you're labeling an image, you will need to know the hardware data type (like "MSB 2-byte integer", or "IEEE 4-byte float") of the pixels as well as the dimensions of the image array.
  • Consider what additional information you will want in the data structure description for validation. For example, if you are going to be labeling a data table, are there fields for which you want to specify extrema, or define flag values for missing information?
  • Also consider any additional description that would make an end-user's life easier - like including display formats for binary table values, or max/min values for fields that users are likely to want to scale. If such things are easy to supply, then doing so will earn you some brownie points with the external reviewers.
  • You should also know what meta-data (that is, what descriptive attributes) you need and/or want to include in the label above the data structure. Certain fields, like time of observation, will be required, but most observational data contain additional information like temperatures, mode settings, pointing, and so on. Non-observational products (documents, e.g.) will need to have useful text in their description attributes.
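
As a rough illustration of what "down to the byte level" means in practice, here is a minimal Python sketch that decodes an image of MSB 2-byte integers, checks the file size against the stated dimensions, and finds the extrema while skipping a missing-data flag. The function names, the 2 x 3 test image, the assumption that the pixels are signed, and the flag value of -32768 are all made up for the example; they are not part of any PDS tool or standard.

```python
import struct


def decode_msb_int16_image(raw, lines, samples):
    """Decode a row-major image of signed, big-endian (MSB) 2-byte integers.

    Assumes signed pixels; an unsigned image would use format code "H".
    """
    expected = lines * samples * 2
    if len(raw) != expected:
        # A size mismatch usually means the dimensions or data type are wrong.
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    flat = struct.unpack(f">{lines * samples}h", raw)
    return [list(flat[r * samples:(r + 1) * samples]) for r in range(lines)]


def valid_extrema(image, missing_flag):
    """Return (min, max) over pixels that are not the missing-data flag."""
    valid = [p for row in image for p in row if p != missing_flag]
    if not valid:
        return None
    return min(valid), max(valid)


# Hypothetical 2-line by 3-sample image; -32768 is an assumed flag value.
raw = struct.pack(">6h", 10, -32768, 7, 300, -5, 42)
image = decode_msb_int16_image(raw, lines=2, samples=3)
print(image)
print(valid_extrema(image, missing_flag=-32768))
```

If a check like this fails on your real files, that is a sign the data type, dimensions, or flag values you were planning to put in the label need another look before you start writing it.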


Things to Keep in Mind, Generally

  • This is a label for archival data - it needs to be useful and intelligible to future generations.
  • PDS4 is an archival data format. Not every file that comes out of a mission pipeline will be structured in a way compatible with the PDS4 format standards, but many will, and most will be close. Still, in some complex cases you may have to consider reordering bytes to meet the archival format requirements. If you can get the pipeline programmers to do this, all the better.


Things to Consider Before Mass-Producing Labels

If what you're working on is going to be a template fed into a program to write many labels, then you should bear these things in mind.

  • The logical_identifier of each product must be unique within the collection containing it. Whatever formula you use to generate logical_identifier values must ensure this is true. Conflicts will likely not be discovered until PDS attempts to load the data products into the registry, at which point even a single such conflict could cause the entire delivery to be rejected.
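
One way to catch such conflicts before delivery is to run your identifier formula over the full file list and look for repeats. The sketch below is only an illustration: the collection identifier, the file names, and the formula itself (lower-cased file name stem appended to the collection's logical_identifier) are hypothetical, and the character filter reflects a common reading of the PDS4 identifier rules that you should confirm against the current standards.

```python
import re
from collections import Counter


def make_lid(collection_lid, file_name):
    """Build a product logical_identifier from a file name (assumed formula)."""
    stem = file_name.rsplit(".", 1)[0].lower()
    # Assumed rule: keep lowercase letters, digits, period, underscore, hyphen.
    product_id = re.sub(r"[^a-z0-9._-]", "_", stem)
    if not product_id:
        raise ValueError(f"cannot derive an identifier from {file_name!r}")
    return f"{collection_lid}:{product_id}"


def find_duplicate_lids(lids):
    """Return, in sorted order, every identifier that occurs more than once."""
    counts = Counter(lids)
    return sorted(lid for lid, n in counts.items() if n > 1)


# Hypothetical collection identifier and file list.
COLLECTION_LID = "urn:nasa:pds:example_bundle:data_raw"
files = ["IMG_0001.dat", "img_0001.DAT", "IMG_0002.dat"]

lids = [make_lid(COLLECTION_LID, name) for name in files]
print(find_duplicate_lids(lids))
```

Here the first two file names differ only in case, so they collapse to the same identifier and are reported. A check like this is cheap to run inside the pipeline every time labels are generated.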

Basic Label Design

Suggested sequence:

  1. Determine compliant data structure(s)
  2. Select base Product type or template label
  3. Fill in required and optional fields
  4. Add discipline attributes
  5. Add mission (or equivalent) attributes