Python PDS4 Tools
Python and PDS4
This document describes the current status and usage of Python tools developed at PDS-SBN to read and visualize PDS4 data in Python. Please note that a PDS4 reader and visualizer for IDL is also available.
Reading and Displaying PDS4 Data
This section describes a Python package that can read and display PDS4 data and meta data. In the future this tool is expected to support all PDS4 data structures; currently, support is limited to the structures listed in the Supported Data Structures section. The package expects valid PDS4 labels formatted according to the PDS4 Standard.
Contact Lev Nagdimunov with questions or comments regarding this code or its description.
Python 2.6+ or 3.3+
Supported Data Structures
PDS4 Data Standards >= v1.0 are supported.
PDS3 Data Standards are not supported.
The table below lists the main PDS4 data structures and their current status.
The Read-in column indicates support by pds4_tools.read().
The Display columns indicate support by pds4_tools.view().
|Structure||Read-in||Display as Table||Display as Image||Display Columns as Plot|
|Array||Yes||Yes||Yes, N-dims||Yes, 1-D only|
|Table_Binary||Yes, except BitFields||Yes||No||Yes|
Online usage documentation, both for scientists and for developers, is available.
Download the ZIP file. Released on October 4, 2020.
Note: A distributable version of the viewer only, which does not require Python, is available.
Use "pip install PDS4_tools-1.2.zip"
Extract the downloaded file to a directory Python can find. To use it, follow the instructions in Example Usage, but run the following lines first:
    import sys
    sys.path.extend([r'/path/to/extraction_directory/'])

    import pds4_tools
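The manual installation step above can be wrapped in a small helper so that the directory is added to the search path only once and the result can be checked. This is a minimal sketch using only the standard library (it needs Python 3.4+ for `importlib.util.find_spec`); the function names `add_to_search_path` and `is_importable` are hypothetical, and the path is the same placeholder used above.

```python
import importlib.util
import os
import sys


def add_to_search_path(directory):
    """Append `directory` to sys.path unless it is already listed.

    Returns the absolute path that is now on sys.path.
    """
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


def is_importable(module_name):
    """Return True if Python can locate the top-level module `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# Placeholder path: replace with the actual extraction directory.
extraction_dir = add_to_search_path(r'/path/to/extraction_directory/')
print('pds4_tools importable:', is_importable('pds4_tools'))
```

If the last line reports False, the extraction directory most likely does not directly contain the pds4_tools package folder.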
See also the User Manual.
Import via "import pds4_tools". You may then call pds4_tools.read() from your own code. The following is the docstring for pds4_tools.read():
Reads PDS4 compliant data into a `StructureList`.

Given a PDS4 label, reads the PDS4 data described in the label and associated label meta data into a `StructureList`, with each PDS4 data structure (e.g. Array_2D, Table_Binary, etc) as its own `Structure`. By default all data structures described in the label are immediately read into memory.

Notes
-----
Python 2 v. Python 3: Non-data strings (label, meta data, etc) in Python 2 will be decoded to ``unicode`` and in Python 3 they will be decoded to ``str``. The return type of all data strings is controlled by *decode_strings*.

Remote URLs are downloaded into an on-disk cache which is cleared on Python interpreter exit.

Parameters
----------
filename : str or unicode
    The filename, including full or relative path, or a remote URL to the PDS4 label describing the data.
quiet : bool, optional
    Suppresses all info/warnings from being output.
lazy_load : bool, optional
    If True, then the data of each PDS4 data structure will not be read-in to memory until the first attempt to access it. Additionally, for remote URLs, the data is not downloaded until first access. Defaults to False.
no_scale : bool, optional
    If True, returned data will be exactly as written in the data file, ignoring offset or scaling values. Defaults to False.
decode_strings : bool, optional
    If True, string data types contained in the returned data will be decoded to the ``unicode`` type in Python 2, and to the ``str`` type in Python 3. If False, leaves string types as byte strings. Defaults to True.

Returns
-------
StructureList
    Contains PDS4 data `Structure`'s, each of which contains the data, the meta data and the label portion describing that data structure. `StructureList` can be treated/accessed/used like a ``dict`` or ``list``.

Examples
--------
Below we document how to read data described by an example label which has two data structures, an Array_2D_Image and a Table_Binary. An outline of the label, including the array and a table with 3 fields, is given.
# Local file
>>> struct_list = pds4_tools.read('/path/to/Example_Label.xml')

# Remote URL
>>> struct_list = pds4_tools.read('http://url.com/Example_Label.xml')

Example Label Outline::

    Array_2D_Image: unnamed
    Table_Binary: Observations
        Field: order
        Field: wavelength
        Group: unnamed
            Field: pos_vector

All below documentation assumes that the above outlined label, containing an array that does not have a name indicated in the label, and a table that has the name 'Observations' with 3 fields as shown, has been read-in.

Accessing Example Structures:

    To access the data structures in `StructureList`, which is returned by `pds4_read()`, you may use any combination of ``dict``-like or ``list``-like access.

    >>> unnamed_array = struct_list[0]    # or struct_list['ARRAY_0']
    >>> obs_table = struct_list[1]        # or struct_list['Observations']

Label or Structure Overview:

    To see a summary of the data structures, which for Arrays shows the type and dimensions of the array, and for Tables shows the type and number of fields, you may use the `StructureList.info()` method. Calling `Structure.info()` on a specific ``Structure`` instead will provide a more detailed summary, including all Fields for a table.

    >>> struct_list.info()
    >>> unnamed_array.info()
    >>> obs_table.info()

Accessing Example Label data:

    To access the read-in data, as an array-like (subclass of ``ndarray``), you can use the data attribute for a PDS4 Array data structure, or list-like access and the field() method to access a field for a table.

    PDS4 Arrays
    >>> unnamed_array.data

    PDS4 Table fields
    >>> obs_table['wavelength']
    >>> obs_table.field('wavelength')

    PDS4 Table records
    >>> obs_table[0:1000]

Accessing Example Label meta data:

    You can access all meta data in the label for a given PDS4 data structure or field via the ``OrderedDict`` meta_data attribute. The below examples use the 'description' element.

    >>> unnamed_array.meta_data['description']
    >>> obs_table.field('wavelength').meta_data['description']
    >>> obs_table.field('pos_vector').meta_data['description']

Accessing Example Label:

    The XML for a label is also accessible via the label attribute, either the entire label or for each PDS4 data structure.

    Entire label:
    >>> struct_list.label

    Part of label describing Observations table:
    >>> struct_list['Observations'].label
    >>> struct_list[1].label

    The returned object is similar to an ElementTree instance. It is searchable via `Label.find()` and `Label.findall()` methods and XPATH. Consult ``ElementTree`` manual for more details. For example,

    >>> struct_list.label.findall('.//disp:Display_Settings')

    Will find all elements in the entire label named 'Display_Settings' which are in the 'disp' prefix's namespace. You can additionally use the `Label.to_dict()` and `Label.to_string()` methods.
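The docstring above points to the ``ElementTree`` manual for details on searching a label. To illustrate what a prefixed search such as './/disp:Display_Settings' means, the following sketch runs the same kind of query with the standard library's xml.etree.ElementTree on a tiny, made-up label fragment. The XML content, the `LABEL_XML` and `NAMESPACES` names, and the element values are illustrative assumptions, not taken from a real product. The docstring's example passes only the search path to `Label.findall()`, whereas the standard library needs the prefix-to-URI mapping passed explicitly.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up label fragment for illustration only (not a valid PDS4 label).
LABEL_XML = """\
<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1"
                       xmlns:disp="http://pds.nasa.gov/pds4/disp/v1">
  <Observation_Area>
    <Discipline_Area>
      <disp:Display_Settings>
        <disp:Display_Direction>
          <disp:horizontal_display_axis>Sample</disp:horizontal_display_axis>
          <disp:vertical_display_axis>Line</disp:vertical_display_axis>
        </disp:Display_Direction>
      </disp:Display_Settings>
    </Discipline_Area>
  </Observation_Area>
</Product_Observational>
"""

# Prefix-to-URI mapping used to resolve 'pds:' and 'disp:' in search paths.
NAMESPACES = {
    'pds': 'http://pds.nasa.gov/pds4/pds/v1',
    'disp': 'http://pds.nasa.gov/pds4/disp/v1',
}

root = ET.fromstring(LABEL_XML)

# './/' searches all descendants; 'disp:' is resolved through NAMESPACES.
display_settings = root.findall('.//disp:Display_Settings', NAMESPACES)

for settings in display_settings:
    axis = settings.find('.//disp:horizontal_display_axis', NAMESPACES)
    print(axis.text)
```

Unprefixed elements in the fragment belong to the default namespace, so with the standard library they are searched with the 'pds:' prefix, for example './/pds:Discipline_Area'.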
Usage is described above. A basic usage example is as follows:
    """ Basic Reader example """

    import pds4_tools

    structures = pds4_tools.read('/path/to/label.xml')
    structures.info()
    # Example output:
    #   0 - Array_3D_Spectrum 'array_name' (3 axes, 21 x 10 x 36)
    #   1 - Table_Binary 'table_name' (5 fields x 1000 records)

    # Table data access
    table = structures['table_name']         # or table = structures[1]
    table.info()

    field_data = table.field('field_name')   # or field_data = table.fields[0], if it is the first field
    record_data = table[0:50]

    # Array data access
    array = structures['array_name']         # or array = structures[0]
    array_data = array.data

    # Meta-data access
    field_meta = table.field('field_name').meta_data   # or field_meta = table.fields[0].meta_data
    array_meta = array.meta_data

    # print() with a single argument works in both Python 2 and Python 3
    print(field_meta['description'])
    print(field_meta['unit'])
    print(array_meta['local_identifier'])

    # Label access
    label = structures.label   # Full label
    label = table.label        # Label section describing the table object

    # findall() returns a list of matches; the first match is converted here
    display_settings = label.findall('.//disp:Display_Settings')
    display_dict = display_settings[0].to_dict()

    label_dict = label.to_dict()
    label_string = label.to_string()
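The no_scale option of pds4_tools.read(), described in the docstring above, controls whether offset and scaling values from the label are applied to the returned data. As a rough illustration of what that scaling means, the PDS4 Standard defines the observed value as the stored value multiplied by scaling_factor, plus value_offset. The sketch below applies that formula to plain Python numbers. The function name `apply_scaling` and the sample values are hypothetical, and this is not the package's own implementation.

```python
def apply_scaling(stored_values, scaling_factor=1.0, value_offset=0.0):
    """Return observed values: stored * scaling_factor + value_offset."""
    return [value * scaling_factor + value_offset for value in stored_values]


# Hypothetical stored integers with a scaling factor of 0.5 and an offset of 100.
stored = [0, 2, 4]
observed = apply_scaling(stored, scaling_factor=0.5, value_offset=100.0)
print(observed)  # [100.0, 101.0, 102.0]
```

With no_scale=True the returned data would correspond to `stored` in this sketch. With the default of False it would correspond to `observed`.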
Import via "import pds4_tools". To display the data structures (such as images, spectra, or tables) in a label you may then call pds4_tools.view() from the Python interpreter, with or without any arguments:
Displays PDS4 compliant data in a GUI.

Given a PDS4 label, displays PDS4 data described in the label and associated label meta data in a GUI. By default all data structures described in the label are read-in and displayed. Can be called without any parameters, opening a GUI that has a File->Open function to select desired label to be read-in and displayed.

Parameters:
filename : str, optional
    The filename, including full or relative path if necessary, of the PDS4 label describing the data to be viewed.
from_existing_structures : StructureList, optional
    An existing StructureList, as returned by pds4_read(), to view. Takes precedence if given together with filename.
lazy_load : bool, optional
    Do not read-in data of each data structure until attempt to view said data structure. Defaults to True.
quiet : bool, optional
    If True, suppresses all info/warnings from being output and displayed. Defaults to False.
It is not necessary to include the filename parameter for pds4_tools.view(); you may simply call it without any arguments, and a GUI will open from which you can open labels.
You may also call pds4_tools.view from another module or script. All the above arguments are available as optional named parameters. A basic example usage is as follows:
    """ Basic Viewer example """

    import pds4_tools

    pds4_tools.view()

    # or
    pds4_tools.view('/path/to/label.xml')

    # or
    struct_list = pds4_tools.read('label.xml')
    pds4_tools.view(from_existing_structures=struct_list)  # Won't re-read the data
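Since pds4_tools.view() can be called from another script, one possible use is a small command-line wrapper that opens the viewer with or without a label. The following is a minimal sketch under stated assumptions: the names `parse_args` and `main`, and the `--quiet` flag, are hypothetical and not part of the package. Only the filename and quiet parameters documented above are passed to pds4_tools.view().

```python
import argparse


def parse_args(argv=None):
    """Parse an optional label path and a --quiet flag (both hypothetical)."""
    parser = argparse.ArgumentParser(description='Open the PDS4 viewer.')
    parser.add_argument('label', nargs='?', default=None,
                        help='path to a PDS4 label; omit to open an empty viewer')
    parser.add_argument('--quiet', action='store_true',
                        help='suppress info/warning output')
    return parser.parse_args(argv)


def main(argv=None):
    """Open the viewer, with the given label if one was supplied."""
    args = parse_args(argv)

    # Imported here so that argument parsing works even without the package.
    import pds4_tools

    if args.label is None:
        # No label given: the GUI opens and File->Open can be used from there.
        pds4_tools.view(quiet=args.quiet)
    else:
        pds4_tools.view(args.label, quiet=args.quiet)


# In a script, finish with a call to main().
```

Keeping the argument parsing separate from the viewer call makes the parsing easy to check without opening a GUI.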