Difference between revisions of "Filling Out the Array 2D Data Structure"

From The SBN Wiki
m (typos)
(Update for IM 1.14.0.0)
 
(24 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
The '''''<Array_2D>''''' class is the generic base on which all the specific ''Array_2D_*'' classes are built.  Use this class only when one of the more specific flavors cannot be reasonably applied.
 
The '''''<Array_2D>''''' class is the generic base on which all the specific ''Array_2D_*'' classes are built.  Use this class only when one of the more specific flavors cannot be reasonably applied.
  
 +
In most cases, even a generic ''Array_2D'' data structure should be accompanied by a ''<Display_Settings>'' class in the ''Discipline_Area'' of the label, to define the correct way to draw the data on a display device.  If you think this does not apply to your Array_2D, please contact your SBN consultant ASAP for an argument.  See [[Filling Out the Display Dictionary Classes]] for more information.
  
 
For additional explanation, see the PDS4 ''Standards Reference'', or contact your PDS node consultant.
 
For additional explanation, see the PDS4 ''Standards Reference'', or contact your PDS node consultant.
Line 6: Line 7:
 
Following are the attributes and subclasses you'll find in ''<Array_2D>'', in label order.
 
Following are the attributes and subclasses you'll find in ''<Array_2D>'', in label order.
  
''Note that in the PDS4 master schema, all classes have capitalized names; attributes never do.
+
''Note that in the PDS4 master schema, all classes have capitalized names; attributes never do.''
 +
 
 
== <name> ==
 
== <name> ==
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This attribute can be used to give a descriptive name to the array.
 
This attribute can be used to give a descriptive name to the array.
 +
 
== <local_identifier> ==
 
== <local_identifier> ==
 +
 
''OPTIONAL''
 
''OPTIONAL''
If you need to reference this array class from somewhere else in the label, use this attribute to define a local identifier to use as a hook. Follow the rules for naming a variable in a typical programming language and you should be OK.
+
 
 +
If you need to reference this array class from somewhere else in the label (like in a ''<Display_Settings>'' class describing the correct display orientation), use this attribute to define a local identifier to use as a hook. Since nearly all arrays ''should'' have display information included, you should get in the habit of providing ''<local_identifier>'' attributes for all array-type objects. Follow the rules for naming a variable in a typical programming language and you should be OK.
 +
 
 +
== <md5_checksum> ==
 +
 
 +
''OPTIONAL''
 +
 
 +
Use this attribute to supply an MD5 checksum for the data object ''only''.  In general, if the data object occupies the entire file, then the checksum should be given as an attribute of the ''<File>'' class.  This checksum is calculated using only the bytes defined by this ''Array'' data structure.
 +
 
 
== <offset> ==
 
== <offset> ==
 +
 
''REQUIRED''
 
''REQUIRED''
This is the offset in bytes from the beginning of the file containing the array data to the beginning of the array.  You must specify a unit of "bytes" for this attribute, thus:<pre>   <offset unit="bytes">0</offset></pre>
+
 
 +
This is the offset in bytes from the beginning of the file containing the array data to the beginning of the array.  You must specify a unit of "byte" for this attribute, thus:
 +
<pre>
 +
    <offset unit="byte">0</offset>
 +
</pre>
 +
 
 
== &lt;axes&gt; ==
 
== &lt;axes&gt; ==
 +
 
''REQUIRED''
 
''REQUIRED''
 +
 
This attribute is required to be present and must have a value of "2".
 
This attribute is required to be present and must have a value of "2".
 +
 
== &lt;axis_index_order&gt; ==
 
== &lt;axis_index_order&gt; ==
 +
 
''REQUIRED''
 
''REQUIRED''
This attribute is required to be present and must have a value of '''Last_Index_Fastest'''."Last" is with respect to the ''&lt;sequence_number&gt;'' values in the ''&lt;Axis_Array&gt'' classes.
+
 
== &lt;encoding_type&gt; ==
+
This attribute is required to be present and must have a value of '''Last Index Fastest'''. "Last" is with respect to the ''&lt;sequence_number&gt;'' values in the ''&lt;Axis_Array&gt;'' classes.
''REQUIRED''
+
 
This attribute may have a value of either '''Binary''' or '''Character'''.
 
The correct answer is almost certainly '''Binary'''.  If you think you have a case of '''Character''', contact your PDS consultant ''first''.
 
 
== &lt;description&gt; ==
 
== &lt;description&gt; ==
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This is a place where additional description can be included, if desired.
 
This is a place where additional description can be included, if desired.
 +
 
== &lt;Element_Array&gt; ==
 
== &lt;Element_Array&gt; ==
 +
 
''REQUIRED''
 
''REQUIRED''
 +
 
This class defines the attributes of the array element.
 
This class defines the attributes of the array element.
 +
 
=== &lt;data_type&gt; ===
 
=== &lt;data_type&gt; ===
 +
 
''REQUIRED''
 
''REQUIRED''
The value here must be one from the list in the [[Standard_Values_Quick_Reference#In_array_elements|Standard Values Quick Reference]].
+
 
{| class="wikitable" style="background-color: thistle"| '''''Note:''''' ''While '''Character''' it ostensibly a valid ''encoding_type'', only binary data types are valid in this attribute.''|}
+
The value here must be one of the binary numeric types from the list in the [[Standard_Values_Quick_Reference#In_array_elements|Standard Values Quick Reference]].
 +
 
 
=== &lt;unit&gt; ===
 
=== &lt;unit&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
If there is a unit of measure associated with the array element values, use this attribute to specify it.
 
If there is a unit of measure associated with the array element values, use this attribute to specify it.
''If the data are unitless, '''DO NOT INCLUDE THIS ATTRIBUTE'''!''. The SBN will not accept data sets containing these or the equivalent:<pre>   <unit/>   <unit>N/A</unit></pre>
+
 
 +
''If the data are unitless, '''DO NOT INCLUDE THIS ATTRIBUTE'''!''  The SBN will not accept data sets containing these or the equivalent:
 +
<pre>
 +
    <unit/>
 +
    <unit>N/A</unit>
 +
</pre>
 +
 
 
=== &lt;scaling_factor&gt; ===
 
=== &lt;scaling_factor&gt; ===
  
 
''OPTIONAL''
 
''OPTIONAL''
If the data have been scaled (multiplied or divided by a constant), put the scaling factor in this attribute.   
+
 
When reading the data, ''&lt;scaling_factor&gt;'' is applied to the value before adding the ''&lt;offset&gt;'' value.
+
If the data have been scaled (divided by a constant), put the scaling factor in this attribute.   
 +
 
 +
When reading the data, the value is multiplied by the ''&lt;scaling_factor&gt;'' value before adding the ''&lt;value_offset&gt;''.
 +
 
 
=== &lt;value_offset&gt; ===
 
=== &lt;value_offset&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
If an offset has been applied to the data, put the offset value in this attribute.  Offsets may be positive or negative.
+
 
When reading the data, ''&lt;scaling_factor&gt;'' is applied to the value before adding the ''&lt;offset&gt;'' value.
+
If an offset has been subtracted from the data, put the offset value in this attribute.  Offsets may be positive or negative.
 +
 
 +
When reading the data, the value is first multiplied by ''&lt;scaling_factor&gt;'', then the ''&lt;value_offset&gt;'' is added.
  
 
== &lt;Axis_Array&gt; ==
 
== &lt;Axis_Array&gt; ==
 +
 
''REQUIRED''
 
''REQUIRED''
This class describes one dimension of the two-dimensional array.  There must be exactly to instances of this class in any ''Array_2D_*'' object.
+
 
=== &lt;name&gt; ===
+
This class describes one dimension of the two-dimensional array.  There must be exactly two instances of this class in any ''Array_2D_*'' object.
 +
 
 +
=== &lt;axis_name&gt; ===
 +
 
 
''REQUIRED''
 
''REQUIRED''
This is the name of the array axis being described.  The ''name'' is typically something like "Wavelength" or "Distance".  The value should be useful for labelling the axis in a display.
+
 
 +
This is the name of the array axis being described.  The ''axis_name'' is typically something like "Wavelength" or "Distance".  The value should be useful for labelling the axis in a display. For some data structures derived from the &lt;Array_2D&gt; structure, the names of the axes may be fixed.  The value may contain imbedded spaces and UTF-8 characters, unless it is further constrained by the specific type of ''Array_2D'' objects you're describing.
 +
 
 +
{| class="wikitable" style="background-color: yellow"
 +
| '''Note:''' ''Axis names should be unique (within the array object), but this is not currently enforced.  Cut and paste carefully.''
 +
|}
 +
 
 +
=== &lt;local_identifier&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This is a unique (within the label) name for the axis.  So if your label contains three different arrays, all three can have an axis with an ''axis_name'' value of "Line", but they may not have the same values for ''local_identifier''.  Include this attribute when you will be referencing specific axes of your data structure(s) from some other part of the label. 
 +
 
 
=== &lt;elements&gt; ===
 
=== &lt;elements&gt; ===
 +
 
''REQUIRED''
 
''REQUIRED''
This attribute must contain the number of elements along this axis of the array.  For example, if the ''Array_2D'' in question have dimensions 112x256, then the ''&lt;elements&gt;'' value in the first ''&lt;Axis_Array&gt;'' would be "112".
+
 
=== &lt;unit&gt; ===
+
This attribute must contain the number of elements along this axis of the array.  For example, if the ''Array_2D'' in question has dimensions 112x256, then the ''&lt;elements&gt;'' value in the first ''&lt;Axis_Array&gt;'' would be "112".
''OPTIONAL''
+
 
If there is a unit associated with this axis, here's the place to put it.  Once again, the intention is to provide a string to use in labelling axes in a display.
+
=== &lt;sequence_number&gt; ===
=== &lt;sequence_number&lt; ===
+
 
 
''REQUIRED''
 
''REQUIRED''
This number defines an order for the axes so that the ''&lt;axis_index_order&gt;'' value can be interpreted correctly for this ''Array''.  One of the axes must have a ''sequence_number'' of "1", the other "2".  It is not necessary that the first ''Axis_Array'' class in the label has ''&lt;sequence_number&gt;'' equal to 1, but it would tend to make like easier for reviewers and users.
+
 
=== &lt;Band_Bin_Set&gt; ===
+
This number defines an order for the axes so that the ''&lt;axis_index_order&gt;'' value can be interpreted correctly for this ''Array''.  One of the axes must have a ''sequence_number'' of "1", the other "2".  It is required that the first ''Axis_Array'' class appearing in the label have ''&lt;sequence_number&gt;'' equal to "1"; the second equal to "2".
''OPTIONAL''
 
This placeholder class contains no attributes.  Until it does, do not use it in SBN data sets.  If you think you need it, contact your PDS consultant.
 
  
 
== &lt;Special_Constants&gt; ==
 
== &lt;Special_Constants&gt; ==
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
Use this class to define any flag values that appear in the data to indicate drop outs, saturation, and other conditions that render a single pixel unknown.  Every attribute in this class is optional.  If you don't need any of the special constants, don't include this class in your ''Array_2D_*''.
 
Use this class to define any flag values that appear in the data to indicate drop outs, saturation, and other conditions that render a single pixel unknown.  Every attribute in this class is optional.  If you don't need any of the special constants, don't include this class in your ''Array_2D_*''.
{| class="wikitable" style="background-color: thistle"| '''''Note:''''' ''The differences between "unknown" and "missing", and between "error" and "invalid" are not at all clear. Each pair seems to consist of synonyms.  If you know of or can think of a distinction, let me know. Otherwise it seems like we could get by quite well with half as many of these as we currently have.''|}
+
 
 +
'''Note:''' All the special constant fields are defined as strings with the implicit assumption that the string can be converted to the same data type as defined for the corresponding ''&lt;Element_Array&gt;''. This, however, requires that a character string be transformed into a numeric format before it can be compared to values in the data object itself.  While integer conversions (within the hardware storage representation limits) are precise, floating-point representations are not unless the values are chosen with exquisite care.  The figurative floating point values "NaN" and "+/-INF" are not permitted for use in these constants. So, if you are planning to provide ''Special_Constants'' for floating point data, please keep in mind that comparisons to values in the data file will more than likely need to take into account the conversion error involved in going from string to floating point hardware representation.
 +
 
 
=== &lt;saturated_constant&gt; ===
 
=== &lt;saturated_constant&gt; ===
  
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates the data value was lost because of detector saturation.
 
This value indicates the data value was lost because of detector saturation.
 +
 
=== &lt;missing_constant&gt; ===
 
=== &lt;missing_constant&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates the data value is known to be missing for some reason not covered by the other constants available in this class.
 
This value indicates the data value is known to be missing for some reason not covered by the other constants available in this class.
 +
 
=== &lt;error_constant&gt; ===
 
=== &lt;error_constant&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates the data value originally reported was known to be in error for some reason, and was replaced by this flag.
 
This value indicates the data value originally reported was known to be in error for some reason, and was replaced by this flag.
 +
 
=== &lt;invalid_constant&gt; ===
 
=== &lt;invalid_constant&gt; ===
  
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates the data value originally recorded or calculated was outside the valid range for array elements.
 
This value indicates the data value originally recorded or calculated was outside the valid range for array elements.
 +
 
=== &lt;unknown_constant&gt; ===
 
=== &lt;unknown_constant&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates the data value in this file is unknown because it was unknown in the source and cannot be recovered.
 
This value indicates the data value in this file is unknown because it was unknown in the source and cannot be recovered.
 +
 
=== &lt;not_applicable_constant&gt; ===
 
=== &lt;not_applicable_constant&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
 +
 
This value indicates that the concept underlying the datum is not applicable in a particular context.
 
This value indicates that the concept underlying the datum is not applicable in a particular context.
{| class="wikitable" style="background-color: thistle"| '''''Note:''''' ''No ''Array_2D_*'' should ever have a reason to use this constant.  If you disagree, let me know.''|}
+
 
 +
{| class="wikitable" style="background-color: thistle"
 +
| '''''Note:''''' ''No ''Array_2D_*'' should ever have a reason to use this constant.  If you disagree, let me know.''
 +
|}
 +
 
 +
=== &lt;valid_maximum&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value is the maximum possible observational value that ''might'' be in the data.  This is useful if your flag values are greater than this value and you want to simplify the exclusion logic.
 +
 
 +
=== &lt;high_instrument_saturation&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value indicates the original datum was in the high-end saturation range of the instrument.
 +
 
 +
=== &lt;high_representation_saturation&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value is used to indicate that, while the original observed value was valid, it is out of range of the numeric format chosen for this ''Array_2D'' in a way that would be considered "too high" - absolute magnitude too great, positive value too large, or positive exponent too large to be represented.
 +
 
 +
=== &lt;valid_minimum&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value is the minimum possible observational value that ''might'' be in the data.  This is useful if your flag values are less than this value and you want to simplify the exclusion logic.
 +
 
 +
=== &lt;low_instrument_saturation&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value indicates the original datum was in the low-end saturation range of the instrument.
 +
 
 +
=== &lt;low_representation_saturation&gt; ===
 +
 
 +
''OPTIONAL''
 +
 
 +
This value is used to indicate that, while the original observed value was valid, it is out of range of the numeric format chosen for this ''Array_2D'' in a way that would be considered "too low" - negative value too large or negative exponent too large to be represented.
 +
 
 
== &lt;Object_Statistics&gt; ==
 
== &lt;Object_Statistics&gt; ==
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
 
This class provides a place for statistical values calculated from the real data values of the pixels in the array. Every attribute in this class is optional.  If you don't need any of the statistics, don't include this class in your ''Array_2D_*''.
 
This class provides a place for statistical values calculated from the real data values of the pixels in the array. Every attribute in this class is optional.  If you don't need any of the statistics, don't include this class in your ''Array_2D_*''.
 +
 
=== &lt;local_identifier&gt; ===
 
=== &lt;local_identifier&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
 
If you need to refer to this specific set of ''Object_Statistics'' from elsewhere in this label, this is the place to attach an identifier to it.  If your identifier looks like a variable name in a typical programming language, you should be OK.
 
If you need to refer to this specific set of ''Object_Statistics'' from elsewhere in this label, this is the place to attach an identifier to it.  If your identifier looks like a variable name in a typical programming language, you should be OK.
 +
 
=== &lt;maximum&gt; ===
 
=== &lt;maximum&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
Maximum real data value found in the array as it exists in its file.  That is, any flag values identified in the corresponding ''&lt;Special_Constants&gt;'' class are ignored, and no ''offset'' or ''scaling_factor'' is applied before comparing values read from the file.
+
Maximum real data value found in the array as it exists in its file.  That is, after any flag values identified in the corresponding ''&lt;Special_Constants&gt;'' class are ignored and any relevant bit mask is applied, but before ''offset'' or ''scaling_factor'' are applied.
{| class="wikitable" style="background-color: thistle"
+
 
| '''''Note:''''' ''The data dictionary says that ''empty'' fields are also ignored, but to my knowledge there is no way to define an array element as being ''empty'' without using a special constant.  The data dictionary also does not indicate whether any ''bit_mask'' should be applied before comparing values to determine the maximum. There's also a reference to "repeating fields" that doesn't make any sense. Ignore it.''
 
|}
 
 
=== &lt;minimum&gt; ===
 
=== &lt;minimum&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
Minimum real data value found in the array it exists in its file.  That is, any flag values identified in the corresponding ''&lt;Special_Constants&gt;'' class are ignored, and no ''offset'' or ''scaling_factor'' is applied before comparing values read from the file.
+
Minimum real data value found in the array as it exists in its file.  That is, after any flag values identified in the corresponding ''&lt;Special_Constants&gt;'' class are ignored and any relevant bit mask is applied, but before ''offset'' or ''scaling_factor'' are applied.
{| class="wikitable" style="background-color: thistle"
+
 
| '''''Note:''''' ''The data dictionary says that ''empty'' fields are also ignored, but to my knowledge there is no way to define an array element as being ''empty'' without using a special constant.  The data dictionary also does not indicate whether any ''bit_mask'' should be applied before comparing values to determine the maximum.  There's also a reference to "repeating fields" that doesn't make any sense. Ignore it.''
 
|}
 
 
=== &lt;mean&gt; ===
 
=== &lt;mean&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
This is the arithmetic mean of the values in the array, excluding those elements containing flag values defined in the associated ''&lt;Special_Constants&gt;'' class, in the same units as the element.
+
This is the arithmetic mean of the values in the array, excluding those elements containing flag values defined in the associated ''&lt;Special_Constants&gt;'' class, in the same units as the element.  Any bit mask is applied before the calculation, but offset and scaling factor are not.
  
{| class="wikitable" style="background-color: thistle"
 
| '''''Note:''''' ''The data dictionary does not specify whether this is the mean of the stored values, or the values after bit masks, scaling factors and offsets have been applied.  There's also a reference to "repeating fields" that doesn't make any sense.  Ignore it.''
 
|}
 
 
=== &lt;standard_deviation&gt; ===
 
=== &lt;standard_deviation&gt; ===
 +
 
''OPTIONAL''
 
''OPTIONAL''
  
This is the statistical standard deviation of the data values in the array, excluding those elements containing flag values defined in the associated ''&lt;Special_Constants&gt;'' class, in the same units as the element.
+
This is the standard deviation of the ''&lt;mean&gt;'', excluding those elements containing flag values defined in the associated ''&lt;Special_Constants&gt;'' class, in the same units as the element. Bit mask is applied; offset and scaling factor are not.
  
{| class="wikitable" style="background-color: thistle"
+
=== &lt;median&gt; ===
| '''''Note:''''' ''The data dictionary does not specify whether this is the standard deviation of the stored values, or the values after bit masks, scaling factors and offsets have been applied.  There's also a reference to "repeating fields" that doesn't make any sense.  Ignore it.''
 
|}
 
  
=== &lt;bit_mask&gt; ===
 
 
''OPTIONAL''
 
''OPTIONAL''
  
For values not aligned on word boundaries, this attribute contains the bit mask used to recover the value from the words after reading them into memory.  Bit masks are formulated as a simple string of ones and zeroes.  For example:
+
This attribute contains the median value of the real data values (excluding flag values) in the array, in the same units as the element. Any bit mask is applied prior to determining the median, but offset and scaling factor are not.
<pre>
 
    <bit_mask>00011111</bit_mask>
 
</pre>
 
Bit masks are applied before scaling factors and offsets, using the standard bitwise-and logical operation.
 
{| class="wikitable" style="background-color: thistle"
 
| '''''Notes:''''' ''Obviously, a bit mask isn't a "statistic".  This belongs in the array element definition class, not here, as it is essential to being able to read the data properly.''
 
''The data dictionary doesn't actually say that bit masks are applied first, but nothing else makes any sense, really.  The definition for the data type of ''bit_mask'' does not allow any syntax to flag that the value is a binary integer, nor does it constrain the value to have a length equal to the number of bits in the element to be masked.  Neither does the Schematron file place any constraints on bit masks. Avoid them in SBN data unless absolutely, positively necessary.  And then don't use them.''
 
|}
 
=== &lt;median&gt; ===
 
''OPTIONAL''
 
  
This attribute contains the arithmetic mean of the real data values (excluding flag values) in the array, in the same units as the element.
+
=== &lt;maximum_scaled_value&gt; ===
 
 
{| class="wikitable" style="background-color: thistle"
 
| '''''Note:''''' ''The data dictionary does not specify whether this is the mean of the stored values, or the values after bit masks, scaling factors and offsets have been applied.  There's also a reference to "repeating fields" that doesn't make any sense.  Ignore it.''
 
|}
 
=== &lt;md5_checksum&gt; ===
 
''OPTIONAL''
 
  
This is the checksum of just the array (that is, it might be a checksum of part of a file), calculated using the MD5 algorithm.
 
For the hex digits in the value, you ''must'' use the lowercase letters ''a-f''.
 
=== &lt;maximum_scaled_value&gt; ===
 
 
''OPTIONAL''
 
''OPTIONAL''
  
This is the maximum value in the array, excluding flag values, after scaling factors and offsets have been applied to the values read in from the file.
+
This is the maximum observational value represented in the array.  Flag values are excluded; bit mask, scaling factor, and offset are all applied before determining this value.  
This value should be equal to the ''&lt;maximum&gt;'' value multiplied by the ''scaling_factor'', added to the ''offset''.  Note, however, that you are not required to include both ''maximum'' and ''maximum_scaled_value''.
 
  
{| class="wikitable" style="background-color: thistle"
+
=== &lt;minimum_scaled_value&gt; ===
| '''''Note:''''' ''The data dictionary does not mention that bit masks must also be applied, but by this point you should have been expecting that.''
 
|}
 
  
=== &lt;minimum_scaled_value&gt; ===
 
 
''OPTIONAL''
 
''OPTIONAL''
  
This is the minimum value in the array, excluding flag values, after scaling factors and offsets have been applied to the values read in from the file.
+
This is the minimum observational value represented in the array. Flag values are excluded; bit mask, scaling factor, and offset are all applied before determining this value.  
This value should be equal to the ''&lt;minimum&gt;'' value multiplied by the ''scaling_factor'', added to the ''offset''.  Note, however, that you are not required to include both ''minimum'' and ''minimum_scaled_value''.
 
  
{| class="wikitable" style="background-color: thistle"
+
=== &lt;description&gt; ===
| '''''Note:''''' ''The data dictionary does not mention that bit masks must also be applied, but by this point you should have been expecting that.''
 
|}
 
  
=== &lt;description&gt; ===
 
 
''OPTIONAL''
 
''OPTIONAL''
  
 
If you need to provide any additional information or ''caveats'' about the statistics, this is the place to do it.
 
If you need to provide any additional information or ''caveats'' about the statistics, this is the place to do it.

Latest revision as of 17:29, 3 August 2020

The <Array_2D> class is the generic base on which all the specific Array_2D_* classes are built. Use this class only when one of the more specific flavors cannot be reasonably applied.

In most cases, even a generic Array_2D data structure should be accompanied by a <Display_Settings> class in the Discipline_Area of the label, to define the correct way to draw the data on a display device. If you think this does not apply to your Array_2D, please contact your SBN consultant ASAP for an argument. See Filling Out the Display Dictionary Classes for more information.

For additional explanation, see the PDS4 Standards Reference, or contact your PDS node consultant.

Following are the attributes and subclasses you'll find in <Array_2D>, in label order.

Note that in the PDS4 master schema, all classes have capitalized names; attributes never do.

<name>

OPTIONAL

This attribute can be used to give a descriptive name to the array.

<local_identifier>

OPTIONAL

If you need to reference this array class from somewhere else in the label (like in a <Display_Settings> class describing the correct display orientation), use this attribute to define a local identifier to use as a hook. Since nearly all arrays should have display information included, you should get in the habit of providing <local_identifier> attributes for all array-type objects. Follow the rules for naming a variable in a typical programming language and you should be OK.

<md5_checksum>

OPTIONAL

Use this attribute to supply an MD5 checksum for the data object only. In general, if the data object occupies the entire file, then the checksum should be given as an attribute of the <File> class. This checksum is calculated using only the bytes defined by this Array data structure.
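
As a rough sketch of what "only the bytes defined by this Array" means in practice, the following Python function hashes a byte range of a file. The function name and its parameters are hypothetical; it assumes you already know the array's byte offset and its total length in bytes (the product of the two axis lengths and the element size).

```python
import hashlib


def array_md5(path, offset, n_bytes):
    """Return the MD5 hex digest of n_bytes starting at byte `offset` of `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        f.seek(offset)  # skip everything before the array
        remaining = n_bytes
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                raise ValueError("file ends before the array does")
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

If the array occupies the whole file, the same function called with an offset of 0 and the file length gives the value that would go in the <File> class instead.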

<offset>

REQUIRED

This is the offset in bytes from the beginning of the file containing the array data to the beginning of the array. You must specify a unit of "byte" for this attribute, thus:

    <offset unit="byte">0</offset>

<axes>

REQUIRED

This attribute is required to be present and must have a value of "2".

<axis_index_order>

REQUIRED

This attribute is required to be present and must have a value of Last Index Fastest. "Last" is with respect to the <sequence_number> values in the <Axis_Array> classes.

<description>

OPTIONAL

This is a place where additional description can be included, if desired.

<Element_Array>

REQUIRED

This class defines the attributes of the array element.

<data_type>

REQUIRED

The value here must be one of the binary numeric types from the list in the Standard Values Quick Reference.

<unit>

OPTIONAL

If there is a unit of measure associated with the array element values, use this attribute to specify it.

If the data are unitless, DO NOT INCLUDE THIS ATTRIBUTE! The SBN will not accept data sets containing these or the equivalent:

    <unit/>
    <unit>N/A</unit>
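
A quick pre-submission check for this problem might look like the sketch below. The function name is hypothetical, and the set of strings treated as "the equivalent" of N/A is an assumption for illustration, not an SBN list. Tag names are compared without their XML namespace so the check works whether or not the label declares one.

```python
import xml.etree.ElementTree as ET

# Assumed stand-ins for "no unit"; adjust to taste.
PLACEHOLDER_UNITS = {"N/A", "NA", "NONE"}


def bad_unit_elements(label_xml):
    """Return the text of every <unit> element that is empty or a placeholder."""
    root = ET.fromstring(label_xml)
    bad = []
    for elem in root.iter():
        # Strip any "{namespace}" prefix from the tag before comparing.
        if elem.tag.rsplit("}", 1)[-1] != "unit":
            continue
        text = (elem.text or "").strip()
        if text == "" or text.upper() in PLACEHOLDER_UNITS:
            bad.append(text)
    return bad
```

An empty list means no offending <unit> elements were found.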

<scaling_factor>

OPTIONAL

If the stored data have been scaled (divided by a constant), put that constant in this attribute as the scaling factor.

When reading the data, the value is multiplied by the <scaling_factor> value before adding the <value_offset>.

<value_offset>

OPTIONAL

If an offset has been subtracted from the data, put the offset value in this attribute. Offsets may be positive or negative.

When reading the data, the value is first multiplied by <scaling_factor>, then the <value_offset> is added.
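
The order of operations described above can be written out as a small Python sketch. The function names are hypothetical, and the defaults of 1 and 0 for an absent <scaling_factor> or <value_offset> are an assumption made for illustration.

```python
def to_physical(stored, scaling_factor=1.0, value_offset=0.0):
    """Reader side: multiply by the scaling factor first, then add the offset."""
    return stored * scaling_factor + value_offset


def to_stored(physical, scaling_factor=1.0, value_offset=0.0):
    """Writer side: subtract the offset, then divide by the scaling factor."""
    return (physical - value_offset) / scaling_factor
```

For example, a stored value of 100 with a scaling factor of 0.5 and a value offset of 10 reads back as 100 * 0.5 + 10 = 60.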

<Axis_Array>

REQUIRED

This class describes one dimension of the two-dimensional array. There must be exactly two instances of this class in any Array_2D_* object.

<axis_name>

REQUIRED

This is the name of the array axis being described. The axis_name is typically something like "Wavelength" or "Distance". The value should be useful for labelling the axis in a display. For some data structures derived from the <Array_2D> structure, the names of the axes may be fixed. The value may contain embedded spaces and UTF-8 characters, unless it is further constrained by the specific type of Array_2D object you're describing.

Note: Axis names should be unique (within the array object), but this is not currently enforced. Cut and paste carefully.

<local_identifier>

OPTIONAL

This is a unique (within the label) name for the axis. So if your label contains three different arrays, all three can have an axis with an axis_name value of "Line", but they may not have the same values for local_identifier. Include this attribute when you will be referencing specific axes of your data structure(s) from some other part of the label.
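
Since a duplicated local_identifier is an easy cut-and-paste mistake, a check along the following lines may help. This is a minimal sketch with a hypothetical function name; it looks at every local_identifier in the label text it is given, ignoring XML namespaces.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_local_identifiers(label_xml):
    """Return a sorted list of local_identifier values used more than once."""
    root = ET.fromstring(label_xml)
    ids = [
        (elem.text or "").strip()
        for elem in root.iter()
        # Strip any "{namespace}" prefix from the tag before comparing.
        if elem.tag.rsplit("}", 1)[-1] == "local_identifier"
    ]
    return sorted(name for name, count in Counter(ids).items() if count > 1)
```

An empty list means every identifier in the label is unique.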

<elements>

REQUIRED

This attribute must contain the number of elements along this axis of the array. For example, if the Array_2D in question has dimensions 112x256, then the <elements> value in the first <Axis_Array> would be "112".

<sequence_number>

REQUIRED

This number defines an order for the axes so that the <axis_index_order> value can be interpreted correctly for this Array. One of the axes must have a sequence_number of "1", the other "2". It is required that the first Axis_Array class appearing in the label have <sequence_number> equal to "1"; the second equal to "2".
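
To make "Last Index Fastest" concrete: the index along the axis with sequence_number 2 varies fastest as you step through the file. The sketch below locates one element under that rule. The function names are hypothetical, indices are zero-based, and the default format string "<h" (little-endian signed 16-bit integer) is only an assumed stand-in for whatever <data_type> your label actually declares.

```python
import struct


def element_byte_offset(array_offset, i, j, n_axis2, element_size):
    """Byte position of element (i, j); j runs along the axis with sequence_number 2."""
    return array_offset + (i * n_axis2 + j) * element_size


def read_element(buf, array_offset, i, j, n_axis2, fmt="<h"):
    """Unpack one element from a bytes-like buffer holding the whole file."""
    size = struct.calcsize(fmt)
    pos = element_byte_offset(array_offset, i, j, n_axis2, size)
    return struct.unpack_from(fmt, buf, pos)[0]
```

For the 112x256 example with two-byte elements and an <offset> of 0, element (1, 0) starts 256 * 2 = 512 bytes into the file.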

<Special_Constants>

OPTIONAL

Use this class to define any flag values that appear in the data to indicate drop outs, saturation, and other conditions that render a single pixel unknown. Every attribute in this class is optional. If you don't need any of the special constants, don't include this class in your Array_2D_*.

Note: All the special constant fields are defined as strings with the implicit assumption that the string can be converted to the same data type as defined for the corresponding <Element_Array>. This, however, requires that a character string be transformed into a numeric format before it can be compared to values in the data object itself. While integer conversions (within the hardware storage representation limits) are precise, floating-point representations are not unless the values are chosen with exquisite care. The figurative floating point values "NaN" and "+/-INF" are not permitted for use in these constants. So, if you are planning to provide Special_Constants for floating point data, please keep in mind that comparisons to values in the data file will more than likely need to take into account the conversion error involved in going from string to floating point hardware representation.
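To see why the conversion matters, consider a flag of "-9999.9" in an array of 4-byte IEEE reals. The sketch below uses only the Python standard library; the helper names are hypothetical, and rounding the constant to the element's precision before comparing is one possible approach (a tolerance-based comparison is another), not a PDS requirement.

```python
import struct


def as_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def is_flag(element, constant_text, element_is_float32=True):
    """Compare a decoded array element against a Special_Constants string."""
    constant = float(constant_text)
    if element_is_float32:
        # Put the constant through the same rounding the stored element received.
        constant = as_float32(constant)
    return element == constant


# What a 4-byte real holding the flag looks like after it is read back.
element = as_float32(-9999.9)

naive_match = element == float("-9999.9")   # double vs. single rounding differ
careful_match = is_flag(element, "-9999.9")
```

Here the naive comparison fails and the careful one succeeds. Choosing a flag that is exactly representable in the element's format, such as -9999.0, sidesteps the problem entirely.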

<saturated_constant>

OPTIONAL

This value indicates the data value was lost because of detector saturation.

<missing_constant>

OPTIONAL

This value indicates the data value is known to be missing for some reason not covered by the other constants available in this class.

<error_constant>

OPTIONAL

This value indicates the data value originally reported was known to be in error for some reason, and was replaced by this flag.

<invalid_constant>

OPTIONAL

This value indicates the data value originally recorded or calculated was outside the valid range for array elements.

<unknown_constant>

OPTIONAL

This value indicates the data value in this file is unknown because it was unknown in the source and cannot be recovered.

<not_applicable_constant>

OPTIONAL

This value indicates that the concept underlying the datum is not applicable in a particular context.

Note: No Array_2D_* should ever have a reason to use this constant. If you disagree, let me know.

<valid_maximum>

OPTIONAL

This value is the maximum possible observational value that might be in the data. This is useful if your flag values are greater than this value and you want to simplify the exclusion logic.

<high_instrument_saturation>

OPTIONAL

This value indicates the original datum was in the high-end saturation range of the instrument.

<high_representation_saturation>

OPTIONAL

This value is used to indicate that, while the original observed value was valid, it is out of range of the numeric format chosen for this Array_2D in a way that would be considered "too high" - absolute magnitude too great, positive value too large, or positive exponent too large to be represented.

<valid_minimum>

OPTIONAL

This value is the minimum possible observational value that might be in the data. This is useful if your flag values are less than this value and you want to simplify the exclusion logic.
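As an illustration of the simplified exclusion logic, here is a minimal Python sketch. The detector, the flag values, and the real_values function are all hypothetical, and treating the valid limits as inclusive is an assumption.

```python
def real_values(stored, valid_minimum=None, valid_maximum=None):
    """Keep only elements inside the inclusive [valid_minimum, valid_maximum] range."""
    kept = []
    for value in stored:
        if valid_minimum is not None and value < valid_minimum:
            continue
        if valid_maximum is not None and value > valid_maximum:
            continue
        kept.append(value)
    return kept


# Hypothetical 12-bit detector stored in 16-bit unsigned integers,
# with both flag values placed above the detector's range.
SATURATED = 65534
MISSING = 65535
stored = [12, 4095, SATURATED, MISSING, 0]

# One range test excludes every flag; no need to test each constant separately.
observations = real_values(stored, valid_maximum=4095)
```

This only works when all the flag values fall outside the valid range, which is worth keeping in mind when you choose them.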

<low_instrument_saturation>

OPTIONAL

This value indicates the original datum was in the low-end saturation range of the instrument.

<low_representation_saturation>

OPTIONAL

This value is used to indicate that, while the original observed value was valid, it is out of range of the numeric format chosen for this Array_2D in a way that would be considered "too low" - negative value too large or negative exponent too large to be represented.

<Object_Statistics>

OPTIONAL

This class provides a place for statistical values calculated from the real data values of the pixels in the array. Every attribute in this class is optional. If you don't need any of the statistics, don't include this class in your Array_2D_*.

<local_identifier>

OPTIONAL

If you need to refer to this specific set of Object_Statistics from elsewhere in this label, this is the place to attach an identifier to it. If your identifier looks like a variable name in a typical programming language, you should be OK.

<maximum>

OPTIONAL

Maximum real data value found in the array as it exists in its file. That is, after any flag values identified in the corresponding <Special_Constants> class are ignored and any relevant bit mask is applied, but before offset or scaling_factor are applied.

<minimum>

OPTIONAL

Minimum real data value found in the array as it exists in its file. That is, after any flag values identified in the corresponding <Special_Constants> class are ignored and any relevant bit mask is applied, but before offset or scaling_factor are applied.

<mean>

OPTIONAL

This is the arithmetic mean of the values in the array, excluding those elements containing flag values defined in the associated <Special_Constants> class, in the same units as the element. Any bit mask is applied before the calculation, but offset and scaling factor are not.

<standard_deviation>

OPTIONAL

This is the standard deviation of the values in the array about the <mean>, excluding those elements containing flag values defined in the associated <Special_Constants> class, in the same units as the element. Any bit mask is applied before the calculation, but offset and scaling factor are not.

<median>

OPTIONAL

This attribute contains the median value of the real data values (excluding flag values) in the array, in the same units as the element. Any bit mask is applied prior to determining the median, but offset and scaling factor are not.

<maximum_scaled_value>

OPTIONAL

This is the maximum observational value represented in the array. Flag values are excluded; bit mask, scaling factor, and offset are all applied before determining this value.

<minimum_scaled_value>

OPTIONAL

This is the minimum observational value represented in the array. Flag values are excluded; bit mask, scaling factor, and offset are all applied before determining this value.
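The distinction between the stored-value statistics and the scaled ones can be sketched as follows. This is a rough illustration using Python's statistics module, with several assumptions: the function name and sample numbers are hypothetical, no bit mask is involved, and the population form of the standard deviation is used because the text above does not say which form is intended.

```python
import statistics


def object_statistics(stored, flags=(), scaling_factor=1.0, value_offset=0.0):
    """Compute Object_Statistics-style values from the stored elements of an array."""
    # Real data values: the stored elements with flag values excluded.
    real = [value for value in stored if value not in flags]
    # Observational values: scaling factor applied first, then the offset.
    scaled = [value * scaling_factor + value_offset for value in real]
    return {
        "maximum": max(real),
        "minimum": min(real),
        "mean": statistics.fmean(real),
        "standard_deviation": statistics.pstdev(real),  # population form (assumed)
        "median": statistics.median(real),
        "maximum_scaled_value": max(scaled),
        "minimum_scaled_value": min(scaled),
    }


# Hypothetical array contents with 65535 used as the missing_constant.
stats = object_statistics([10, 20, 30, 65535], flags={65535},
                          scaling_factor=0.5, value_offset=-5)
```

With these inputs the maximum is 30 while the maximum_scaled_value is 10.0, which shows why the two should not be confused.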

<description>

OPTIONAL

If you need to provide any additional information or caveats about the statistics, this is the place to do it.