St. Louis XML Training April 2015

From The SBN Wiki

PDS node personnel will be meeting in person and via web conference on Monday, 20 April 2015, prior to the PDS Management Council Meeting, to share some tips and techniques for working with the PDS schema library and (mainly) the oXygen Editor. This page lists some topics we plan to cover, with some references and some file sets we'll use as samples and examples.

The intention is to provide sample file sets and help people work through the exercises with me as desired, so we can trouble-shoot as we go and answer questions as they come up. The structure will be largely informal, and our primary goal is that those who want to will walk away with a working development environment on their laptops that they understand and can continue to modify.


The topics below are not necessarily in chronological order, but this seems like a reasonable organization to start with:

The XML Prolog
The XML prolog is everything that precedes the document root. We'll walk through a typical prolog, what you might find in the wild, and what you should see in a PDS4 label. Here's a detailed breakdown:
Referencing Schemas
The PDS4 schema collection is actually rather complicated. We'll demonstrate how to reference the dozen or so schemas needed to validate a single, non-trivial label, and talk about ways to code those references into labels to maximize transportability. Here's a page on this wiki on this topic, for those who like to read ahead:
Setting Up a Work Environment
At UMD we have half a dozen different people working on various aspects of development, migration, and validation; and each of us will be working on data from a variety of different sources. I'll demonstrate the directory structures and XML catalog files we're using to maximize transportability and commonality within the group. The same technique can be extended to the data provider's side as well.
Label Creation with oXygen
I'll run through configuring the oXygen editor for validation, and demonstrate the detailed process of creating a new, non-trivial PDS4 label and testing that all schema references are working as desired. The process is similar in Eclipse, though at the moment I'm only planning to focus on oXygen. If you want to see the same thing in Eclipse, let me know. You can also read this page elsewhere on this wiki:
Label Templates and Programming Technique
This'll probably be shorter than you'd think, but I will walk through a brute-force program I wrote to convert some Deep Impact labels from PDS3 to PDS4 as a demonstration of what moving to the XML paradigm buys us in development time-savers.
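
For those who want something concrete for the prolog and schema-reference topics before the workshop, here is a minimal Python sketch. It is not part of the workshop file sets. It lists the processing instructions and comments that precede the document root, then the namespace/location pairs in the root's xsi:schemaLocation attribute. The label text, the PDS4_PDS_EXAMPLE file names, and the helper name describe_prolog are all made up for illustration; check the sample label itself for the references it actually carries.

```python
from xml.dom import minidom

# Hypothetical, trimmed-down label text. The schema file names are
# placeholders, not real PDS4 release file names.
SAMPLE_LABEL = """<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="PDS4_PDS_EXAMPLE.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<!-- illustrative label, not a valid PDS4 product -->
<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://pds.nasa.gov/pds4/pds/v1 PDS4_PDS_EXAMPLE.xsd">
</Product_Observational>
"""

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def describe_prolog(xml_text):
    """Return (prolog_items, schema_pairs) for a label given as text.

    prolog_items: processing instructions and comments found before the root.
    schema_pairs: (namespace, location) tuples from xsi:schemaLocation.
    The XML declaration itself is not exposed as a DOM node, so it is not listed.
    """
    doc = minidom.parseString(xml_text)
    root = doc.documentElement

    prolog_items = []
    for node in doc.childNodes:
        if node is root:
            break  # everything after this point is no longer prolog
        if node.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE:
            prolog_items.append(("pi", node.target, node.data))
        elif node.nodeType == minidom.Node.COMMENT_NODE:
            prolog_items.append(("comment", node.data.strip(), ""))

    # schemaLocation is a whitespace-separated list of namespace/location pairs.
    tokens = root.getAttributeNS(XSI_NS, "schemaLocation").split()
    schema_pairs = list(zip(tokens[0::2], tokens[1::2]))
    return prolog_items, schema_pairs


if __name__ == "__main__":
    items, pairs = describe_prolog(SAMPLE_LABEL)
    for item in items:
        print(item)
    for namespace, location in pairs:
        print(namespace, "->", location)
```

A relative location such as the one in this sketch only resolves when the schema sits next to the label, which is one reason the catalog-file techniques in the outline above matter for transportability.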

If you have additional topics or questions along these lines that you'd like to see covered, let me know. I can't guarantee to hit them, but we'll try.

Sample File Sets

Here are some samples, examples, and otherwise useful file sets you can download to follow along with.

  • This zip file contains the sample label we'll be coming back to throughout the day. It is a Deep Impact ITS calibrated spectral image data product label (no data file provided - ask if you want one), done as part of an earlier prototyping project. It references older schemas and draft dictionaries, so don't be surprised if you see stuff in here that you thought was no longer valid. One of the things we'll demonstrate is what happens when you change schema versions to something incompatible.
  • Released Schemas: This zip file contains a schema tree with all the .xsd and .sch files available from the PDS4 schema site as of 16 April 2015. You'll also find some sample XML Catalog files in here to play with that will be used in the workshop.
  • Development Schemas: This zip file contains some schemas still in a development state. We'll use this and the preceding schema collection to demonstrate some XML catalog file techniques for a development environment.
  • Example Python Code: This zip file contains Mark Showalter's PDS3 label reading Python library (which I believe he's further improved since), a template Deep Impact label similar to the example label above (but from an earlier prototype), and a Python subroutine that moves data from the PDS3 label to the PDS4 label using a brute-force DOM navigation approach.
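
The brute-force DOM approach mentioned in the last item can be sketched roughly as follows. This is not the code in the zip file. The PDS3 values are supplied as a plain dictionary rather than read with the PDS3 label library, and the template fragment, the keyword-to-element mapping, the sample values, and the helper name fill_template are hypothetical stand-ins chosen only to show the shape of the technique.

```python
from xml.dom import minidom

# Hypothetical PDS4 template fragment with empty elements waiting for values.
TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1">
  <Identification_Area>
    <title></title>
  </Identification_Area>
  <Observation_Area>
    <Time_Coordinates>
      <start_date_time></start_date_time>
      <stop_date_time></stop_date_time>
    </Time_Coordinates>
  </Observation_Area>
</Product_Observational>
"""

# Assumed mapping from PDS3 keyword to PDS4 element name (illustrative only).
KEYWORD_TO_ELEMENT = {
    "PRODUCT_NAME": "title",
    "START_TIME": "start_date_time",
    "STOP_TIME": "stop_date_time",
}


def fill_template(template_text, pds3_values):
    """Copy PDS3 keyword values into the matching template elements."""
    doc = minidom.parseString(template_text)
    for keyword, element_name in KEYWORD_TO_ELEMENT.items():
        if keyword not in pds3_values:
            continue  # leave the template element empty
        # Brute force: take the first element with that name, wherever it is.
        element = doc.getElementsByTagName(element_name)[0]
        # Drop any placeholder content before writing the new value.
        while element.firstChild:
            element.removeChild(element.firstChild)
        element.appendChild(doc.createTextNode(str(pds3_values[keyword])))
    return doc.toxml()


if __name__ == "__main__":
    # Made-up values standing in for what a PDS3 label reader would return.
    example_values = {
        "PRODUCT_NAME": "EXAMPLE SPECTRAL IMAGE",
        "START_TIME": "2005-07-04T00:00:00Z",
    }
    print(fill_template(TEMPLATE, example_values))
```

Looking elements up by name alone is only safe while each name occurs once in the template; a label with repeated element names would need path-based navigation instead.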

Useful Links

Here are some links you might want to investigate prior to the workshop, especially if you're planning to play along:

Download the oXygen XML Editor
This is the download page for the oXygen XML Editor - a commercial product. You can get a 30-day trial license for free, but you do have to request one - so if you're planning to install this editor for the workshop you should allow an extra 10 minutes to deal with the license form and cut & paste activation required. If you're downloading a trial version, download the "Editor" edition (on the far left). It includes everything.
Download the Eclipse XML Editor
This is the download page for the Eclipse IDE, which is open source. It comes in a wide variety of flavors, and you'll need to know whether you have a 32-bit or 64-bit Java installation. The version you should download is called "Eclipse IDE for Java Developers". Note that you'll need to add a plug-in after the initial install to get Schematron validation. For details see the Using Eclipse for XML Editing section of this wiki. Also, if you don't have control over which version of Java you're running, you may have to use an earlier version of Eclipse to get one compatible with your Java installation. For our purposes, using an earlier version does not make a significant difference - the XML support has not substantially changed in several versions.
Note: Since Eclipse installation takes a bit more effort than the oXygen install, and we're planning to focus on oXygen in the workshop, you should try to do the installation and configuration prior to Monday morning if you're planning to use Eclipse. I will be there early on Monday to help with this, if needed, and we can cover Eclipse vs. oXygen differences during the day as needed.
XML Tutorial at
If you have no clue about XML, its terminology, or what it's supposed to do, this tutorial will walk you quickly and succinctly through the essential concepts.
XML Standards Primer for PDS4
If you're a gory details aficionado, this page (on this wiki) collects pointers to the major standards involved in creating and validating PDS4 labels.