Difference between revisions of "Creating the Ingest LDD Dictionary Input File"

From The SBN Wiki
(Creation - Safety Save)
 
(Creation)
Line 1: Line 1:
This page provides step-by-step information for filling out the basic '''Ingest_LDD''' template.  This XML file is the input to the ''LDDTool'' local dictionary generation tool, which in turn generates the XSD Scheme and Schematron files that comprise the working version of the dictionary.
+
This page provides some background and advice on creating the input file for the ''LDDTool'', and links to step-by-step information for filling out the basic '''Ingest_LDD''' template.  The ''Ingest_LDD'' XML file is the input to the ''LDDTool'' local dictionary generation tool, which in turn generates the XSD Schema and Schematron files that comprise the working version of the dictionary.
  
These pages are organized in working order - that is, working from the top of the file down.  Classes and attributes are addressed in the lexical order in which they are required to appear in the file.  An XML validating editor like ''Eclipse'' or ''oXygen'' will make your life considerably easier for finishing this task.
+
= Prerequisites =  
 
 
= Preparation =  
 
  
 
If you haven't already gotten familiar with a validating editor, or at least a schema-aware editor, you should take the time to do so first.  You'll thank me later.
 
If you haven't already gotten familiar with a validating editor, or at least a schema-aware editor, you should take the time to do so first.  You'll thank me later.
Line 15: Line 13:
 
You may find useful context information on the [[Using Local Dictionaries]] page elsewhere on this wiki.
 
You may find useful context information on the [[Using Local Dictionaries]] page elsewhere on this wiki.
  
----
 
----
 
The following assumes that you have initialized an XML file for your validation environment with the required PDS4 namespace and Schematron references.  It also assumes that the default namespace has been declared to be the PDS4 namespace, so that no "<code>pds:</code>" prefix is used in the tags.
 
----
 
----
 
  
= Filling Out the Ingest_LDD Class =
+
= Basic Strategies =
  
 
The '''Ingest_LDD''' class is the root of the XML document you're creating.  In it, you are going to be defining a series of attributes and classes.  You can think of attributes as scalars - single-valued entities that can be used to build classes.  Classes can be nested, so one class can contain other classes as well as attributes.  Note that classes can not be nested recursively - either directly or indirectly.
 
The '''Ingest_LDD''' class is the root of the XML document you're creating.  In it, you are going to be defining a series of attributes and classes.  You can think of attributes as scalars - single-valued entities that can be used to build classes.  Classes can be nested, so one class can contain other classes as well as attributes.  Note that classes can not be nested recursively - either directly or indirectly.
 
== Basic Strategies ==
 
  
 
Following are some emerging best practices we've noted at SBN.
 
Following are some emerging best practices we've noted at SBN.
  
=== Organization of the File Contents ===
+
== Organization of the File Contents ==
  
All attributes must be defined first; classes must be defined afterward.  Within each section, though, there is no enforced ordering. For non-trivial dictionaries, then, you should give some thought to organizing the definitions in each section to facilitate development and maintenance. Attributes can be sorted alphabetically, for example, or grouped by purpose, subsystem, or some other mission-specific/whimsical criterion. You may also find it convenient to organize your class definitions in some sort of hierarchy - defining lowest level subclasses first, for example, then defining each successive level of containing classes.
+
All attributes must be defined first; classes must be defined afterward.  Within each section, though, there is no enforced ordering. In fact, it is not necessary to define one class before including it as a component of another class - though it has to be defined in the file somewhere.
  
 +
For non-trivial dictionaries, then, you should give some thought to organizing the definitions in each section to facilitate development and maintenance. Attributes can be sorted alphabetically, for example, or grouped by purpose, subsystem, or some other mission-specific/whimsical criterion. You may also find it convenient to organize your class definitions in some sort of hierarchy - defining lowest level subclasses first, for example, then defining each successive level of containing classes.
  
=== Naming Attributes and Classes ===
+
== Naming Attributes and Classes ==
 
 
  
 
The PDS and discipline namespaces employ a simple naming convention with these features:
 
The PDS and discipline namespaces employ a simple naming convention with these features:
Line 41: Line 32:
 
* Words in attribute names are all lower-case.
 
* Words in attribute names are all lower-case.
 
* Words in class names are capitalized.
 
* Words in class names are capitalized.
 +
* Abbreviations are generally avoided.
 +
* Word order usually follows English grammar unless there's a good reason not to.
  
There are rare exceptions, but on the whole these are followed pretty consistently in the shared namespaces.  There is no requirement that you use these conventions in your attribute names and classes, but you might consider the aesthetics of the resulting label if you use conventions that are very far removed from the above.
+
There are rare exceptions, but on the whole these are followed pretty consistently in the shared namespaces.  There is no requirement that you use these conventions in your attribute names and classes, but you might consider the aesthetics and readability of the resulting label if you use conventions that are very far removed from the above.
  
 
You are constrained to ASCII characters, so don't go nuts.
 
You are constrained to ASCII characters, so don't go nuts.
  
=== Grouping Attributes into Classes ===
+
== Grouping Attributes into Classes ==
  
 
In general, you can group attributes into classes and subclasses, or not, pretty much any way you want.  The functional groupings established by comments in most PDS3 labels are, in general, a pretty reasonable approach to class definitions.
 
In general, you can group attributes into classes and subclasses, or not, pretty much any way you want.  The functional groupings established by comments in most PDS3 labels are, in general, a pretty reasonable approach to class definitions.
Line 52: Line 45:
 
SBN recommends that all attributes be a member of some class from the dictionary.  It may be valid to have "naked" attributes from mission dictionaries hanging around in the label <code>&lt;Mission_Area&gt;</code>, but from a user's point of view these attributes usually seem to be lacking something in context.
 
SBN recommends that all attributes be a member of some class from the dictionary.  It may be valid to have "naked" attributes from mission dictionaries hanging around in the label <code>&lt;Mission_Area&gt;</code>, but from a user's point of view these attributes usually seem to be lacking something in context.
  
=== Beware of Slacking Off ===
+
== Beware of Slacking Off ==
  
 
You ''must'' provide real, substantive definitions for all your defined attributes, and should endeavor to do so for all your classes.  These definitions are part of the external peer review - reviewers will judge them.  You do not want to be found wanting in this respect.
 
You ''must'' provide real, substantive definitions for all your defined attributes, and should endeavor to do so for all your classes.  These definitions are part of the external peer review - reviewers will judge them.  You do not want to be found wanting in this respect.
  
If your attribute has a unit of measure, you '''''must''''' include the <code>&lt;unit_of-measure_type&gt;</code> in the definition.  It is '''''not''''' sufficient to say, for example, "measured in nm" in the definition text without including <code>unit_of_measure_type</code> in the attribute definition.
+
If your attribute has a unit of measure, you '''''must''''' include the <code>&lt;unit_of_measure_type&gt;</code> in the definition.  It is '''''not''''' sufficient to say, for example, "measured in nm" in the definition text without including <code>unit_of_measure_type</code> in the attribute definition.
  
 +
Use ''TBD'' sparingly, if at all.  For non-trivial dictionaries, letting the population of ''TBD''s grow unchecked can lead to major issues just before - or worse, at - review.
  
Use ''TBD'' sparingly, if at all.  For non-trivial dictionaries, letting the population of ''TBD''s grow unchecked can lead to major issues just before - or worse, ''at'' - review.
 
  
== A Note About Complexity ==
 
  
 +
= A Note About Complexity =
 +
 +
The [[Filling Out the Ingest_LDD Class: Basic]] page referenced below only describes the basic techniques available in the '''Ingest_LDD''' class.  It is possible to define fairly complex relationships within this dictionary and to other namespaces.  In general, though, SBN discourages this for mission dictionaries.  Internal complexity very often leads to confusion on the part of users not affiliated with the original team, and referencing other namespaces establishes a connection between dictionaries that are under the control of two or more different stewards who may or may not be aware of any underlying assumptions that would be invalidated by future development, especially after end of mission.  So on the whole, try very hard to keep it simple and localized.  If that seems impossible for your metadata or mission organization, check in with your PDS node consultant for examples of more complex procedures that haven't ripped asunder the fabric of space-time.
 +
 +
 +
 +
= Filling Out the Ingest_LDD Class =
  
The sections following only describe the basic techniques available in the '''Ingest_LDD''' class.  It is possible to define fairly complex relationships within this dictionary and to other namespaces.  In general, SBN discourages this for mission dictionaries.  Internal complexity very often leads to confusion on the part of users not affiliated with the original team, and referencing other namespaces establishes a connection between dictionaries that are under the control of two or more different stewards who may or may not be aware of any underlying assumptions that would be invalidated by future development, especially after end of mission.  So on the whole, try very hard to keep it simple and localized.  If that seems impossible for your metadata or mission organization, check in with your PDS node consultant for examples of more complex procedures that haven't ripped asunder the fabric of space-time.
+
This is actually a fairly long topic, so I've split it out to a separate page.  Before going there, get your template file with all its schema references ready to go.  Then follow along here:
 +
;;;;;* [[Filling Out the Ingest_LDD Class: Basic]]

Revision as of 17:33, 25 March 2015

This page provides some background and advice on creating the input file for the LDDTool, and links to step-by-step information for filling out the basic Ingest_LDD template. The Ingest_LDD XML file is the input to the LDDTool local dictionary generation tool, which in turn generates the XSD Schema and Schematron files that comprise the working version of the dictionary.

Prerequisites

If you haven't already gotten familiar with a validating editor, or at least a schema-aware editor, you should take the time to do so first. You'll thank me later.

If you haven't tried to write or read a PDS4 XML label before, you should probably at least walk through one with a knowledgeable guide. Contact your PDS node consultant, if you have one, or you can contact Anne Raugh at the Small Bodies Node (raugh at astro.umd.edu).

If you're using a schema-aware or validating editor, you should know how to create a new XML file from a schema file using your editor, and how to reference the schema and the related Schematron file if you're validating using your editor. You should use the PDS4 namespace schema to create your XML file, and select the Ingest_LDD element as your root element.

If you're using a plain text editor, you should know how to create a new XML file from scratch with all the appropriate schema references, or have a good PDS4 template to follow. Your root element - the one containing the XSD schema references - is called Ingest_LDD. In this case you're also going to need some way to validate the result before you try to run it through LDDTool.

You may find useful context information on the Using Local Dictionaries page elsewhere on this wiki.


Basic Strategies

The Ingest_LDD class is the root of the XML document you're creating. In it, you are going to be defining a series of attributes and classes. You can think of attributes as scalars - single-valued entities that can be used to build classes. Classes can be nested, so one class can contain other classes as well as attributes. Note that classes can not be nested recursively - either directly or indirectly.
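If your class structure is getting deep enough that you can't eyeball it, a few lines of script can check the "no recursive nesting" rule for you. The following is only a sketch: the <code>contains</code> mapping is a hypothetical summary you would build yourself (class name to the list of classes it includes directly), not anything LDDTool reads or writes.

```python
def find_nesting_cycle(contains):
    """Return one cycle of class names as a list, or None if nesting is acyclic.

    `contains` is a hypothetical mapping: class name -> list of the class
    names it includes directly. It is not an LDDTool input or output format.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    state = {}

    def visit(name, path):
        state[name] = GREY
        path.append(name)
        for child in contains.get(name, []):
            if state.get(child, WHITE) == GREY:
                # child is already on the current path: that is a cycle
                return path[path.index(child):] + [child]
            if state.get(child, WHITE) == WHITE:
                found = visit(child, path)
                if found:
                    return found
        path.pop()
        state[name] = BLACK
        return None

    for name in contains:
        if state.get(name, WHITE) == WHITE:
            found = visit(name, [])
            if found:
                return found
    return None
```

A return value of <code>None</code> means the nesting is fine; anything else is the chain of class names that loops back on itself, directly or indirectly.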

Following are some emerging best practices we've noted at SBN.

Organization of the File Contents

All attributes must be defined first; classes must be defined afterward. Within each section, though, there is no enforced ordering. In fact, it is not necessary to define one class before including it as a component of another class - though it has to be defined in the file somewhere.

For non-trivial dictionaries, then, you should give some thought to organizing the definitions in each section to facilitate development and maintenance. Attributes can be sorted alphabetically, for example, or grouped by purpose, subsystem, or some other mission-specific/whimsical criterion. You may also find it convenient to organize your class definitions in some sort of hierarchy - defining lowest level subclasses first, for example, then defining each successive level of containing classes.
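A validating editor will flag an out-of-order definition for you, but if you're assembling the file with scripts it can be handy to check the "attributes first, classes afterward" rule yourself. Here is a minimal sketch using Python's standard XML parser. It assumes the attribute and class definitions appear as <code>DD_Attribute</code> and <code>DD_Class</code> elements directly under the root; check those names against your own template before relying on it.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def attributes_precede_classes(xml_text):
    """Return True if no attribute definition follows a class definition.

    Assumes the definitions are direct children of the root element named
    DD_Attribute and DD_Class (an assumption about the template).
    """
    root = ET.fromstring(xml_text)
    seen_class = False
    for child in root:
        name = local_name(child.tag)
        if name == "DD_Class":
            seen_class = True
        elif name == "DD_Attribute" and seen_class:
            return False
    return True
```

This only looks at ordering. It says nothing about whether the file is valid against the PDS4 schema, so it is no substitute for real validation.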

Naming Attributes and Classes

The PDS and discipline namespaces employ a simple naming convention with these features:

  • Words in the name are separated by underscores.
  • Words in attribute names are all lower-case.
  • Words in class names are capitalized.
  • Abbreviations are generally avoided.
  • Word order usually follows English grammar unless there's a good reason not to.

There are rare exceptions, but on the whole these are followed pretty consistently in the shared namespaces. There is no requirement that you use these conventions in your attribute names and classes, but you might consider the aesthetics and readability of the resulting label if you use conventions that are very far removed from the above.

You are constrained to ASCII characters, so don't go nuts.
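If you'd like to check a long list of names against these conventions, something like the following sketch would do it. The regular expressions are my reading of the bullets above (underscore-separated words, lower-case for attributes, capitalized for classes, ASCII only), not an official PDS rule, and the function name is made up for illustration.

```python
import re

# Attribute names: lower-case words (digits allowed) joined by underscores.
ATTRIBUTE_NAME = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")
# Class names: each underscore-separated word starts with a capital or digit.
CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9]*(_[A-Z0-9][A-Za-z0-9]*)*")

PATTERNS = {"attribute": ATTRIBUTE_NAME, "class": CLASS_NAME}


def follows_convention(name, kind):
    """Return True if `name` matches the convention for `kind`.

    `kind` must be "attribute" or "class"; anything else raises KeyError.
    """
    pattern = PATTERNS[kind]
    return name.isascii() and pattern.fullmatch(name) is not None
```

For example, <code>follows_convention("unit_of_measure_type", "attribute")</code> and <code>follows_convention("Mission_Area", "class")</code> both pass, while a camel-case name like <code>missionArea</code> does not. The check can't tell you whether you've avoided abbreviations or followed English word order; that part is still on you.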

Grouping Attributes into Classes

In general, you can group attributes into classes and subclasses, or not, pretty much any way you want. The functional groupings established by comments in most PDS3 labels are, in general, a pretty reasonable approach to class definitions.

SBN recommends that all attributes be a member of some class from the dictionary. It may be valid to have "naked" attributes from mission dictionaries hanging around in the label <Mission_Area>, but from a user's point of view these attributes usually seem to be lacking something in context.

Beware of Slacking Off

You must provide real, substantive definitions for all your defined attributes, and should endeavor to do so for all your classes. These definitions are part of the external peer review - reviewers will judge them. You do not want to be found wanting in this respect.

If your attribute has a unit of measure, you must include the <unit_of_measure_type> in the definition. It is not sufficient to say, for example, "measured in nm" in the definition text without including unit_of_measure_type in the attribute definition.

Use TBD sparingly, if at all. For non-trivial dictionaries, letting the population of TBDs grow unchecked can lead to major issues just before - or worse, at - review.
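A small lint pass can catch the most common lapses before a reviewer does. The sketch below works on a hypothetical dictionary you would build for each attribute (with <code>definition</code> and, optionally, <code>unit_of_measure_type</code> keys); the unit-phrase pattern is a rough heuristic of my own and will miss plenty of wordings.

```python
import re

# Phrases that suggest the definition text is talking about units (heuristic).
UNIT_PHRASE = re.compile(r"\b(measured in|in units of|expressed in)\b", re.IGNORECASE)
TBD = re.compile(r"\bTBD\b")


def lint_attribute(attr):
    """Return a list of problem strings for one attribute description.

    `attr` is a hypothetical dict with optional keys "definition" and
    "unit_of_measure_type"; it is not an LDDTool data structure.
    """
    problems = []
    definition = attr.get("definition", "").strip()
    if not definition:
        problems.append("missing definition")
    if TBD.search(definition):
        problems.append("definition contains TBD")
    if UNIT_PHRASE.search(definition) and not attr.get("unit_of_measure_type"):
        problems.append("definition mentions units but unit_of_measure_type is absent")
    return problems
```

An empty list means nothing obvious was found, which is not the same as a good definition. Whether a definition is real and substantive is still a judgment call for you and your reviewers.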


A Note About Complexity

The Filling Out the Ingest_LDD Class: Basic page referenced below only describes the basic techniques available in the Ingest_LDD class. It is possible to define fairly complex relationships within this dictionary and to other namespaces. In general, though, SBN discourages this for mission dictionaries. Internal complexity very often leads to confusion on the part of users not affiliated with the original team, and referencing other namespaces establishes a connection between dictionaries that are under the control of two or more different stewards who may or may not be aware of any underlying assumptions that would be invalidated by future development, especially after end of mission. So on the whole, try very hard to keep it simple and localized. If that seems impossible for your metadata or mission organization, check in with your PDS node consultant for examples of more complex procedures that haven't ripped asunder the fabric of space-time.


Filling Out the Ingest_LDD Class

This is actually a fairly long topic, so I've split it out to a separate page. Before going there, get your template file with all its schema references ready to go. Then follow along here:

  • Filling Out the Ingest_LDD Class: Basic