Difference between revisions of "Python PDS4 Tools"

From The SBN Wiki
m (Added warning that software is alpha)
(Updated for v0.4 of pds4_tools)
Line 14: Line 14:
 
=== Requirements ===
 
=== Requirements ===
  
Python 2.6 or 2.
+
Python 2.6+ or 3.3+
  
 
pds4_read: None <br>
 
pds4_read: None <br>
Line 49: Line 49:
 
| Yes
 
| Yes
 
| Yes
 
| Yes
| 2D and 3D only
+
| Yes, N-dims
 
| Under Development
 
| Under Development
 
|-
 
|-
Line 97: Line 97:
 
=== Download ===
 
=== Download ===
  
Download the ZIP file [[File:PDS4_tools-0.3.zip]]
+
Download the ZIP file [[File:PDS4_tools-0.4.zip]]. Released on December 3, 2015.
  
Note: This is Alpha quality software that is actively being developed, use at your own risk.  
+
Note: This is software that is still actively being developed.
 +
 
 +
Note: A distributable version of the viewer only, which does not require Python, is [[PDS4 Viewer|available]].
  
 
=== Installation ===
 
=== Installation ===
Line 105: Line 107:
 
==== Option 1 ====
 
==== Option 1 ====
  
Use "<tt>pip install PDS4_tools-0.3.zip</tt>" or "<tt>easy_install PDS4_tools-0.3.zip</tt>". You can also extract the ZIP file and use "<tt>python /path/to/extracted/setup.py install</tt>". Note that there is no uninstall script provided (although "<tt>pip uninstall pds4_tools</tt>" should work), and that this tool will be updated in the future.  
+
Use "<tt>pip install PDS4_tools-0.4.zip</tt>" or "<tt>easy_install PDS4_tools-0.4.zip</tt>". You can also extract the ZIP file and use "<tt>python /path/to/extracted/setup.py install</tt>". Note that there is no uninstall script provided (although "<tt>pip uninstall pds4_tools</tt>" should work), and that this tool will be updated in the future.  
  
 
==== Option 2 ====
 
==== Option 2 ====
Line 126: Line 128:
 
<pre>
 
<pre>
 
     Reads PDS4 compliant data into a `StructureList`
 
     Reads PDS4 compliant data into a `StructureList`
 
+
   
 
     Given a PDS4 label, reads the PDS4 data described in the label and
 
     Given a PDS4 label, reads the PDS4 data described in the label and
 
     associated label meta data into an `StructureList`, with each PDS4 data
 
     associated label meta data into an `StructureList`, with each PDS4 data
 
     structure (e.g. Array_2D, Table_Binary, etc) as its own `Structure`. By
 
     structure (e.g. Array_2D, Table_Binary, etc) as its own `Structure`. By
 
     default all data structures described in the label are read-in.
 
     default all data structures described in the label are read-in.
 +
   
 +
    Notes
 +
    -----
 +
    Currently supports Array structures, Table_Character and Table_Binary.
 +
    Packed bit fields in Table_Binary are not yet supported, all other
 +
    features of previously mentioned structures are fully supported.
 +
   
 +
    Parameters
 +
    ----------
 +
    filename : str
 +
        The filename, including full or relative path if necessary, of
 +
        the PDS4 label describing the data.
 +
    quiet : bool, optional
 +
        Suppresses all info/warnings from being output.
 +
    use_numpy : bool, optional
 +
        Returned data will be an ndarray and use NumPy data types.
 +
        Defaults to True if NumPy is installed.
 +
    structure_num : integer, optional
 +
        Instead of reading all data structures, only read the n^th
 +
        structure, where n = structure_num and is zero-based.
 +
    structure_name : str, optional
 +
        Instead of reading all data structures, only read the structure
 +
        with a name equal to structure_name.
 +
    structure_lid : str, optional
 +
        Instead of reading all data structures, only read the structure
 +
        with a local identifier equal to structure_lid.
 +
   
 +
    Returns
 +
    -------
 +
    StructureList
 +
        Contains PDS4 data `Structure`s, each of which contains the data,
 +
        the meta data and the label portion describing that data structure.
 +
        `StructureList` can be treated/accessed/used like a ``dict`` or
 +
        ``list``.
 +
   
 +
    Examples
 +
    --------
 +
   
 +
    Below we document how to read data described by an example label
 +
    which has two data structures, an Array_2D_Image and a Table_Binary.
 +
    An outline of the label, including the array and a table with 3
 +
    fields, is given.
 +
   
 +
    >>> struct_list = pds4_read('/path/to/Example_Label.xml')
 +
   
 +
    Example Label Outline:
 +
   
 +
        Array_2D_Image: unnamed
 +
        Table_Binary: Observations
 +
            Field: order
 +
            Field: wavelength
 +
            Group: unnamed
 +
                Field: pos_vector
 +
   
 +
    All below documentation assumes that the above outlined label,
 +
    containing an array that does not have a name indicated in the label,
 +
    and a table that has the name 'Observations' with 3 fields as shown,
 +
    has been read-in.
 +
   
 +
    Accessing Example Structures:
 +
   
 +
        To access the data structures in `StructureList`, which is returned
 +
        by pds4_read(), you may use any combination of `dict` or `list`.
 +
   
 +
        >>> unnamed_array = struct_list[0]
 +
        >>>              or struct_list['ARRAY_0']
 +
   
 +
        >>> obs_table = struct_list[1]
 +
        >>>          or struct_list['Observations']
 +
   
 +
    Label or Structure Overview:
 +
   
 +
        To see a summary of the data structures, which for Arrays shows the
 +
        type and dimensions of the array, and for Tables shows the type
 +
        and number of fields, you may use the info() method. Calling
 +
        info() on a specific `Structure` instead of `StructureList` will
 +
        provide a more detailed summary, including all Fields for a table.
 +
   
 +
        >>> struct_list.info()
 +
        >>> unnamed_array.info()
 +
        >>> obs_table.info()
 +
   
 +
    Accessing Example Label data:
 +
   
 +
        To access the read-in data, as an array-like (either list,
 +
        array.array or ndarray), you can use the data attribute for a
 +
        PDS4 Array data structure, or the field() method to access a field
 +
        for a table.
 +
   
 +
        >>> unnamed_array.data
 +
        >>> obs_table.field('wavelength')
 +
        >>> obs_table.field('pos_vector')
 +
   
 +
    Accessing Example Label meta data:
 +
   
 +
        You can access all meta data in the label for a given PDS4 data
 +
        structure or field via the `OrderedDict` meta_data attribute. The
 +
        below examples use the 'description' element.
 +
   
 +
        >>> unnamed_array.meta_data['description']
 +
   
 +
        >>> obs_table.field('wavelength').meta_data['description']
 +
        >>> obs_table.field('pos_vector').meta_data['description']
 +
   
 +
    Accessing Example Label:
 +
   
 +
        The XML for a label is also accessible via the label attribute,
 +
        either the entire label or for each PDS4 data structure.
 +
   
 +
        Entire label:
 +
            >>> struct_list.label
 +
   
 +
        Part of label describing Observations table:
 +
            >>> struct_list['Observations'].label
 +
            >>> struct_list[1].label
 +
   
 +
        The returned object is similar to an ElementTree instance. It is
 +
        searchable via find() and findall() methods and XPATH. Consult
 +
        ElementTree manual for more details. For example,
 +
   
 +
        >>> struct_list.label.findall('.//disp:Display_Settings')
 +
   
 +
        Will find all elements in the entire label named 'Display_Settings'
 +
        which are in the 'disp' namespace. You can additionally use the
 +
        to_dict() and to_string() methods.
 +
</pre>
 +
 +
==== pds4_viewer ====
  
    NOTES:
 
  
        Currently supports Array structures, Table_Character and Table_Binary.
+
To display the objects in a label you may call <tt>pds4_viewer</tt> from the command line, or import it in the Python interpreter:
        Packed bit fields in Table_Binary are not yet supported, all other
 
        features of previously mentioned structures are fully supported.
 
  
 +
<pre>
 +
    Displays PDS4 compliant data in a GUI
 +
   
 +
    Given a PDS4 label, displays PDS4 data described in the label and
 +
    associated label meta data in a GUI. By default all data structures described
 +
    in the label are read-in and displayed. Can be called without any
 +
    parameters, opening a GUI that has a File->Open function to select
 +
    desired label to be read-in and displayed.
 +
   
 
     Parameters:
 
     Parameters:
 
+
   
         filename: str
+
         filename: str, optional
 
             The filename, including full or relative path if necessary, of
 
             The filename, including full or relative path if necessary, of
             the PDS4 label describing the data.
+
             the PDS4 label describing the data to be viewed.
 +
        from_existing_structures: StructureList, optional
 +
            An existing StructureList, as returned by pds4_read(), to view. Takes
 +
            precedence if given together with filename.
 
         quiet: bool, optional
 
         quiet: bool, optional
             Suppresses all info/warnings from being output.
+
             Suppresses all info/warnings from being output and displayed.
        use_numpy: bool, optional
 
            Returned data will be an ndarray and use NumPy data types. On
 
            by default if NumPy is installed.
 
 
         structure_num: integer, optional
 
         structure_num: integer, optional
 
             Instead of reading all data structures, only read the n^th
 
             Instead of reading all data structures, only read the n^th
Line 157: Line 293:
 
             Instead of reading all data structures, only read the structure
 
             Instead of reading all data structures, only read the structure
 
             with a local identifier equal to structure_lid.
 
             with a local identifier equal to structure_lid.
 
    Returns:
 
 
        `StructureList`
 
            Contains PDS4 structures and label data. Can be treated/accessed/used
 
            like a `dict` or `list`.
 
 
    Example usage:
 
 
        Below we document how to read data described by an example label
 
        which has two data structures, an Array_2D_Image and a Table_Binary.
 
        An outline of the label, including the array and a table with 3
 
        fields, is given.
 
 
        >>> struct_list = pds4_read('/path/to/Example_Label.xml')
 
 
        Example Label Outline:
 
 
            Array_2D_Image: unnamed
 
            Table_Binary: Observations
 
                Field: order
 
                Field: wavelength
 
                Group: unnamed
 
                    Field: pos_vector
 
 
        All below documentation assumes that the above outlined label,
 
        containing an array that does not have a name indicated in the label,
 
        and a table has the name 'Observations' with 3 fields as shown
 
        has been read-in.
 
 
        Accessing Example Structures:
 
 
            To access the data structures in `StructureList`, which is returned by
 
            pds4_read(), you may use any combination of `dict` or `list`.
 
 
            >>> unnamed_array = struct_list[0]
 
            >>>              or struct_list['ARRAY_0']
 
 
            >>> obs_table = struct_list[1]
 
            >>>          or struct_list['Observations']
 
 
        Label or Structure Overview:
 
 
            To see a summary of the data structures, which for Arrays shows the
 
            type and dimensions of the array, and for Tables shows the type
 
            and number of fields, you may use the info() method. Calling
 
            info() on the `Structure` instead of `StructureList` will provide
 
            a more detailed summary, including all Fields for a table.
 
 
            >>> struct_list.info()
 
            >>> unnamed_array.info()
 
            >>> obs_table.info()
 
 
        Accessing Example Label data:
 
 
            To access the read-in data, as an array-like (either list,
 
            array.array or ndarray), you can use the data attribute for a
 
            PDS4 Array data structure, or the field() method to access a field
 
            for a table.
 
 
            >>> unnamed_array.data
 
            >>> obs_table.field('wavelength')
 
            >>> obs_table.field('pos_vector')
 
 
        Accessing Example Label meta data:
 
 
            You can access all meta data in the label for a given PDS4 data
 
            structure or field via the `OrderedDict` meta_data attribute. The
 
            below examples use the 'description' element.
 
 
            >>> unnamed_array.meta_data['description']
 
 
            >>> obs_table.field('wavelength').meta_data['description']
 
            >>> obs_table.field('pos_vector').meta_data['description']
 
 
        Accessing Example Label:
 
 
            The XML for a label is also accessible via the label attribute,
 
            either the entire label or for each PDS4 data structure.
 
 
            Entire label:
 
                >>> struct_list.label
 
 
            Part of label describing Observations table:
 
                >>> struct_list['Observations'].label
 
                >>> struct_list[1].label
 
 
            The returned object is similar to an ElementTree instance. It is
 
            searchable via find() and findall() methods and XPATH. Consult
 
            ElementTree manual for more details. For example,
 
 
            >>> struct_list.label.findall('.//disp:Display_Settings')
 
 
            Will find all elements in the entire label named 'Display_Settings'
 
            which are in the 'disp' namespace. You can additionally use the
 
            to_dict() and to_string() methods.
 
</pre>
 
 
==== pds4_viewer ====
 
 
 
To display the objects in a label you may call <tt>pds4_viewer</tt> from the command line, or import it in the Python interpreter:
 
 
<pre>
 
usage: pds4_viewer.py [-h] [--quiet] [--structure_num STRUCTURE_NUM]
 
                      [--structure_name STRUCTURE_NAME]
 
                      [--structure_lid STRUCTURE_LID]
 
                      [filename]
 
 
positional arguments:
 
  filename              Filename, including full path, of the label
 
 
optional arguments:
 
  -h, --help            show this help message and exit
 
  --quiet              Suppresses all info/warnings
 
  --structure_num STRUCTURE_NUM
 
                        Only reads the data structure specified by zero-based
 
                        order (integer)
 
  --structure_name STRUCTURE_NAME
 
                        Only reads the data structure specified by name
 
  --structure_lid STRUCTURE_LID
 
                        Only reads the data structure specified by local identifier
 
 
</pre>
 
</pre>
  
Line 299: Line 313:
  
 
struct_list = pds4_read('label.xml')
 
struct_list = pds4_read('label.xml')
pds4_viewer('label.xml', from_existing_structures=struct_list) # Won't re-read the data
+
pds4_viewer(from_existing_structures=struct_list) # Won't re-read the data
 
</pre>
 
</pre>

Revision as of 15:02, 3 December 2015

Introduction

This document describes the current status and usage of Python tools developed at PDS-SBN to read and visualize PDS4 data in Python. Please note that a PDS4 reader and visualizer for IDL is also available.

Reading and Displaying PDS4 Data

Introduction

This section describes a Python package that can read and display PDS4 data and meta data. In the future this tool is expected to support all PDS4 objects; currently, support is limited to the objects listed in the Supported Data Structures section. The package expects labels that pass PDS4 Schema and Schematron validation.

Contact Lev Nagdimunov with questions or comments regarding this code or its description.

Requirements

Python 2.6+ or 3.3+

pds4_read: None
pds4_viewer: NumPy, matplotlib

You may use pds4_read to read in data without any extra packages; pds4_viewer requires recent versions of NumPy and matplotlib.
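The requirements above can be checked before installing. The following is a minimal sketch, not part of pds4_tools: the helper name `check_requirements` is hypothetical, and it only tests that the interpreter version falls in the stated ranges and that the named packages import. It does not check that they are "recent" versions.

```python
import sys


def check_requirements(modules=("numpy", "matplotlib")):
    """Report whether the interpreter and the given packages look usable.

    Hypothetical helper for illustration; pds4_tools does not provide it.
    """
    version = sys.version_info[:2]
    report = {
        # Python 2.6+ or 3.3+, i.e. anything from 2.6 up except 3.0 to 3.2.
        "python_ok": version >= (2, 6) and not ((3, 0) <= version < (3, 3)),
    }
    for name in modules:
        try:
            __import__(name)
            report[name] = True
        except ImportError:
            report[name] = False
    return report
```

Calling `check_requirements()` returns a dictionary such as `{'python_ok': True, 'numpy': True, 'matplotlib': False}`, which in that case would indicate that pds4_read is usable but pds4_viewer is not.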

Optional Features

pds4_read: NumPy
Recommended for Arrays and Tables containing GROUP fields to allow for multi-dimensional indexing. Can result in significant improvements in memory usage and read-in speed for some data objects.

pds4_viewer: None

Supported Data Structures

PDS4 Data Standards < v1.3 are not officially supported but may work.
PDS4 Data Standards >= v1.3 are supported.
PDS3 Data Standards are not supported.
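The support rules above can be expressed as a small version check. This is only a sketch under assumptions: the function names are hypothetical, and the version is assumed to be a dotted string such as the information model version recorded in a label (for example "1.5.0.0").

```python
def parse_version(text):
    """Turn a dotted version string such as '1.5.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def support_status(version_text, standard="PDS4"):
    """Map a data standard and version onto the support levels listed above."""
    if standard != "PDS4":
        # PDS3 (or anything else) is not supported at all.
        return "not supported"
    if parse_version(version_text) >= (1, 3):
        return "supported"
    return "not officially supported, may work"
```

For instance, `support_status("1.2.0.1")` falls in the "may work" category, while `support_status("1.3.0.0")` and later are reported as supported.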

The table below lists the main PDS4 data objects and the current status.

The Read-in column indicates support by pds4_read().
The Display columns indicate support by pds4_viewer().

Structure       | Read-in               | Display as Table   | Display as Image   | Display Columns as Plot
Array           | Yes                   | Yes                | Yes, N-dims        | Under Development
Array_2D        | Yes                   | Yes                | Yes                | Under Development
Array_2D_*      | Yes                   | Yes                | Yes                | Under Development
Array_3D        | Yes                   | Yes                | Yes                | Under Development
Array_3D_*      | Yes                   | Yes                | Yes                | Under Development
Table_Character | Yes                   | Yes                | No                 | Under Development
Table_Binary    | Yes, except BitFields | Yes                | No                 | Under Development
Table_Delimited | Future Development    | Future Development | Future Development | Future Development

Download

Download the ZIP file File:PDS4_tools-0.4.zip. Released on December 3, 2015.

Note: This is software that is still actively being developed.

Note: A distributable version of the viewer only, which does not require Python, is available.

Installation

Option 1

Use "pip install PDS4_tools-0.4.zip" or "easy_install PDS4_tools-0.4.zip". You can also extract the ZIP file and use "python /path/to/extracted/setup.py install". Note that there is no uninstall script provided (although "pip uninstall pds4_tools" should work), and that this tool will be updated in the future.

Option 2

Extract the downloaded file to a directory Python can find. To use it, follow the instructions in Example Usage, but first add the following lines:

import sys
sys.path.extend(['/path/to/your/extraction/directory'])

# On a Windows machine use forward slashes (/) instead of Windows' normal backslashes (\) to specify paths
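A slightly more defensive variant of the lines above is sketched here. The helper name `add_extraction_dir` is hypothetical; it normalizes the path, refuses a directory that does not exist, and avoids adding the same entry twice. On Windows, a raw string such as r'C:\path\to\dir' is an alternative to forward slashes, since it stops Python from treating backslashes as escape characters.

```python
import os
import sys


def add_extraction_dir(path):
    """Append the directory holding the extracted package to sys.path.

    Hypothetical helper for illustration; returns the normalized path.
    """
    full_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(full_path):
        raise ValueError("Not a directory: %s" % full_path)
    if full_path not in sys.path:
        sys.path.append(full_path)
    return full_path
```

After `add_extraction_dir('/path/to/your/extraction/directory')` succeeds, the imports shown in Example Usage should work as they would for an installed copy.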

Example Usage

pds4_read

You may call pds4_read from the command line or from your own script. The following is the docstring for pds4_read:

    Reads PDS4 compliant data into a `StructureList`
    
    Given a PDS4 label, reads the PDS4 data described in the label and
    associated label meta data into a `StructureList`, with each PDS4 data
    structure (e.g. Array_2D, Table_Binary, etc) as its own `Structure`. By
    default all data structures described in the label are read-in.
    
    Notes
    -----
    Currently supports Array structures, Table_Character and Table_Binary.
    Packed bit fields in Table_Binary are not yet supported; all other
    features of the previously mentioned structures are fully supported.
    
    Parameters
    ----------
    filename : str
        The filename, including full or relative path if necessary, of
        the PDS4 label describing the data.
    quiet : bool, optional
        Suppresses all info/warnings from being output.
    use_numpy : bool, optional
        Returned data will be an ndarray and use NumPy data types.
        Defaults to True if NumPy is installed.
    structure_num : integer, optional
        Instead of reading all data structures, only read the n^th
        structure, where n = structure_num and is zero-based.
    structure_name : str, optional
        Instead of reading all data structures, only read the structure
        with a name equal to structure_name.
    structure_lid : str, optional
        Instead of reading all data structures, only read the structure
        with a local identifier equal to structure_lid.
    
    Returns
    -------
    StructureList
        Contains PDS4 data `Structure`s, each of which contains the data,
        the meta data and the label portion describing that data structure.
        `StructureList` can be treated/accessed/used like a ``dict`` or
        ``list``.
    
    Examples
    --------
    
    Below we document how to read data described by an example label
    which has two data structures, an Array_2D_Image and a Table_Binary.
    An outline of the label, including the array and a table with 3
    fields, is given.
    
    >>> struct_list = pds4_read('/path/to/Example_Label.xml')
    
    Example Label Outline:
    
        Array_2D_Image: unnamed
        Table_Binary: Observations
            Field: order
            Field: wavelength
            Group: unnamed
                Field: pos_vector
    
    All below documentation assumes that the above outlined label,
    containing an array that does not have a name indicated in the label,
    and a table that has the name 'Observations' with 3 fields as shown,
    has been read-in.
    
    Accessing Example Structures:
    
        To access the data structures in `StructureList`, which is returned
        by pds4_read(), you may use any combination of `dict` or `list`.
    
        >>> unnamed_array = struct_list[0]
        >>> unnamed_array = struct_list['ARRAY_0']   # equivalent, by key
    
        >>> obs_table = struct_list[1]
        >>> obs_table = struct_list['Observations']  # equivalent, by key
    
    Label or Structure Overview:
    
        To see a summary of the data structures, which for Arrays shows the
        type and dimensions of the array, and for Tables shows the type
        and number of fields, you may use the info() method. Calling
        info() on a specific `Structure` instead of `StructureList` will
        provide a more detailed summary, including all Fields for a table.
    
        >>> struct_list.info()
        >>> unnamed_array.info()
        >>> obs_table.info()
    
    Accessing Example Label data:
    
        To access the read-in data, as an array-like (either list,
        array.array or ndarray), you can use the data attribute for a
        PDS4 Array data structure, or the field() method to access a field
        for a table.
    
        >>> unnamed_array.data
        >>> obs_table.field('wavelength')
        >>> obs_table.field('pos_vector')
    
    Accessing Example Label meta data:
    
        You can access all meta data in the label for a given PDS4 data
        structure or field via the `OrderedDict` meta_data attribute. The
        below examples use the 'description' element.
    
        >>> unnamed_array.meta_data['description']
    
        >>> obs_table.field('wavelength').meta_data['description']
        >>> obs_table.field('pos_vector').meta_data['description']
    
    Accessing Example Label:
    
        The XML for a label is also accessible via the label attribute,
        either the entire label or for each PDS4 data structure.
    
        Entire label:
            >>> struct_list.label
    
        Part of label describing Observations table:
            >>> struct_list['Observations'].label
            >>> struct_list[1].label
    
        The returned object is similar to an ElementTree instance. It is
        searchable via find() and findall() methods and XPATH. Consult
        ElementTree manual for more details. For example,
    
        >>> struct_list.label.findall('.//disp:Display_Settings')
    
        Will find all elements in the entire label named 'Display_Settings'
        which are in the 'disp' namespace. You can additionally use the
        to_dict() and to_string() methods.
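Since the docstring describes the label object as similar to an ElementTree instance, the namespace-prefixed search can be illustrated with the standard library alone. The sketch below does not use pds4_tools: the XML is a heavily simplified stand-in for a real label, and the namespace URIs are assumed to be the usual PDS4 ones. With plain ElementTree the prefix-to-URI mapping has to be passed explicitly; whether the pds4_tools label object needs it is not stated in the docstring.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a label; a real PDS4 label contains far more.
LABEL_XML = """<Product_Observational
    xmlns="http://pds.nasa.gov/pds4/pds/v1"
    xmlns:disp="http://pds.nasa.gov/pds4/disp/v1">
  <Observation_Area>
    <Discipline_Area>
      <disp:Display_Settings>
        <disp:Display_Direction>
          <disp:horizontal_display_direction>Left to Right</disp:horizontal_display_direction>
        </disp:Display_Direction>
      </disp:Display_Settings>
    </Discipline_Area>
  </Observation_Area>
</Product_Observational>"""

# Prefix-to-URI map used to resolve 'disp:' in the search expressions.
NAMESPACES = {
    "pds": "http://pds.nasa.gov/pds4/pds/v1",
    "disp": "http://pds.nasa.gov/pds4/disp/v1",
}

root = ET.fromstring(LABEL_XML)

# Every Display_Settings element in the 'disp' namespace, at any depth.
settings = root.findall(".//disp:Display_Settings", NAMESPACES)

# First matching descendant; find() returns None if nothing matches.
direction = root.find(".//disp:horizontal_display_direction", NAMESPACES)

print(len(settings), direction.text)
```

The same `.//prefix:Element_Name` pattern applies to other discipline namespaces, provided the prefix is mapped to the URI declared in the label.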

pds4_viewer

To display the objects in a label you may call pds4_viewer from the command line, or import it in the Python interpreter:

    Displays PDS4 compliant data in a GUI
    
    Given a PDS4 label, displays PDS4 data described in the label and
    associated label meta data in a GUI. By default all data structures described
    in the label are read-in and displayed. Can be called without any
    parameters, opening a GUI that has a File->Open function to select
    desired label to be read-in and displayed.
    
    Parameters:
    
        filename: str, optional
            The filename, including full or relative path if necessary, of
            the PDS4 label describing the data to be viewed.
        from_existing_structures: StructureList, optional
            An existing StructureList, as returned by pds4_read(), to view. Takes
            precedence if given together with filename.
        quiet: bool, optional
            Suppresses all info/warnings from being output and displayed.
        structure_num: integer, optional
            Instead of reading all data structures, only read the n^th
            structure, where n = structure_num.
        structure_name: str, optional
            Instead of reading all data structures, only read the structure
            with a name equal to structure_name.
        structure_lid: str, optional
            Instead of reading all data structures, only read the structure
            with a local identifier equal to structure_lid.

It is not necessary to include the filename parameter for pds4_viewer; you may simply call it without any options or arguments and a GUI will open from which you can open labels.
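For command-line use, the argument list can be assembled programmatically. The sketch below is hypothetical: the helper `build_viewer_command` is not part of pds4_tools, the flag names are taken from the pds4_viewer.py usage text shown in the page history, and both the script location and the continued availability of these flags in v0.4 are assumptions.

```python
def build_viewer_command(filename=None, quiet=False, structure_num=None,
                         structure_name=None, structure_lid=None,
                         script="pds4_viewer.py"):
    """Build an argument list for running the viewer script.

    Hypothetical helper; 'script' is an assumed path to pds4_viewer.py.
    """
    command = ["python", script]
    if quiet:
        command.append("--quiet")
    if structure_num is not None:
        command.extend(["--structure_num", str(structure_num)])
    if structure_name is not None:
        command.extend(["--structure_name", structure_name])
    if structure_lid is not None:
        command.extend(["--structure_lid", structure_lid])
    if filename is not None:
        # The label filename is the single positional argument and is optional.
        command.append(filename)
    return command
```

The resulting list can be passed to `subprocess.call()`. Called with no arguments, the helper yields just the interpreter and the script, which corresponds to opening the GUI without a label.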

You may also call pds4_viewer from another module or script. All the above arguments are available as optional named parameters. A basic example usage is as follows:

""" Basic pds4_viewer example """

from pds4_tools import pds4_read, pds4_viewer

pds4_viewer()

# or

pds4_viewer('/path/to/label.xml')

# or 

struct_list = pds4_read('label.xml')
pds4_viewer(from_existing_structures=struct_list) # Won't re-read the data