Configuring XML Schema validation
If you haven't already downloaded the sample files used for the "Creating projects and importing files" demo, now would be a good time to do so. We'll be using the schemas and labels in that package to test that our configuration is working.
In the XML File
XML Catalog files provide a map from logical references in the schema and label files to physical copies of the defining PDS4 schemas that you've stashed in a directory somewhere on your system. Since PDS labels should not contain links to local versions of, for example, the namespace-defining schemas, this is a critical component to get right. Plus learning how to create and edit XML Catalog files will make your life easier in the long run if you're planning to do more than a little PDS4 label development.
Let's start by opening the collection_1.0.xml file (double-click it or whatever). If this is the first time you've opened a file in Eclipse, it will probably open in "Design" view, which looks like this:
Click on the Source tab at the bottom left of the editor pane to see the source code:
The very first line declares that this is an XML file following version 1.0 of the XML standard and using UTF-8 character encoding. We address the <?xml-model> processing instruction in Configuring Schematron validation. So for now let's focus our attention on the first line of the actual label:
<Product_Collection xmlns="http://pds.nasa.gov/pds4/pds/v08" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
More specifically, examine the values of the two xmlns attributes. An xmlns attribute tells the processing program (in this case, Eclipse) that a particular namespace will be referenced, the implication being that if you wish to be successful in processing the XML that follows, you should educate yourself on the details of that namespace. If you're a program that wants to, for example, validate the content that follows, you'd better be able to find and read the schema document(s) that define that namespace.
So in the <Product_Collection> tag we see references to two namespaces:
The first reference you will come to know and love, or at least recognize, as the PDS4 common namespace, in this case with a version number ("v08") indicating it is a pre-release draft. Unless otherwise stated, everything in the label is presumed to come from the pds namespace. How do you know that? Because there is no namespace prefix specified, as there is for the second namespace ("xsi", in this case). So any time I see an element name, like Product_Collection, without a namespace prefix, I assume it is from the pds namespace.
The definition for the other namespace, the one associated with the xsi prefix in the line above, is built into Eclipse, so you don't need to provide a schema file to define that namespace. This namespace holds the attributes that XML Schema defines for use in instance documents, such as xsi:schemaLocation and xsi:nil. In fact, we don't use any of those in this label, and the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" part can be removed if you like.
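To make the default-versus-prefixed distinction concrete, here is a minimal sketch of the same root element written with an explicit prefix instead of a default namespace. The pds prefix is an arbitrary choice made for illustration and is not something the sample label uses; a namespace-aware processor treats this form and the unprefixed form identically, because what matters is the namespace URI, not the prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Equivalent to the unprefixed label: every element is explicitly
     bound to the PDS4 common namespace through the "pds" prefix.
     The prefix name is an illustrative assumption. -->
<pds:Product_Collection xmlns:pds="http://pds.nasa.gov/pds4/pds/v08">
  <pds:Identification_Area>
    <!-- child elements omitted -->
  </pds:Identification_Area>
</pds:Product_Collection>
```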
Validating a Label
So let's try to validate this label against the pds namespace schema definition. One way to do that is to right-click on the file name in the Project Explorer window, go about halfway down the menu, and select Validate. When you do that, you'll see a warning in the Problems pane below the editing pane:
This is Eclipse telling you that it can't do much validating unless you tell it what to validate against. Since we have the PDS4 master schema that we want to validate against, now's the time to set up an XML Catalog entry so Eclipse can turn that namespace reference into a schema file location.
Creating XML Catalog Entries
XML Catalog information is configured as one of the items in the Window->Preferences menu. Look towards the bottom for XML, expand it, and click on XML Catalog to see the current catalog entries:
The "User Specified Entries" list is where we can create additional entries to translate namespace URIs, like http://pds.nasa.gov/pds4/pds/v08, into references to the copy of the master PDS4 schema file that you'll find in the Schema subdirectory of our demo project. Click the Add... button to get to this dialogue box:
Eclipse maintains the XML catalog internally, so we'll be adding single entries to it via this dialogue box. We're doing straightforward mappings, so select Catalog Entry from the top of the column of icons on the left.
We want to add a reference to the PDS4 master schema which we know is in our current workspace, so click the Workspace button, navigate down into the Schema folder and select the PDS4_PDS_0800k.xsd file.
(Alternatively, you can click the File System... button and select the file by navigating to it in the usual way. Either method will work, the point being that you can have a local schema repository that is not part of your personal workspace - a handy thing for group development.)
It does matter which of the .xsd files you select. The PDS4_PDS_0800k.xsd file is the file containing the namespace definition that corresponds to the "/v08" test version. The other schemas are there so you can try pointing to the wrong one to see what happens.
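The "namespace definition" in question is the targetNamespace attribute on the schema's root element. Below is a sketch of what the opening of such a schema typically looks like. It is not copied from PDS4_PDS_0800k.xsd, and everything other than the namespace URI (including the xs prefix and the elementFormDefault setting) is an assumption for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://pds.nasa.gov/pds4/pds/v08"
           elementFormDefault="qualified">
  <!-- targetNamespace is the value that has to match the xmlns
       value declared in the label -->
  <!-- element and type definitions omitted -->
</xs:schema>
```

If you pick a schema whose targetNamespace differs from the URI in the label, the catalog entry will not match the label's namespace.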
Once you've selected the schema file and clicked the OK button, Eclipse sets the Key type: value to "Namespace name" in the XML Catalog Element dialogue because it recognizes that the schema defines a namespace. This in turn a) means it successfully found and opened the file, and b) saves you the trouble of cutting and pasting the namespace into the Key: box. At this point, you can click the OK button and you'll be returned to the XML Catalog Preferences dialogue box, this time showing the new User Specified entry:
Note that you can edit and remove the entry you just made if you find that something went wonky. We only need this one catalog mapping for the collection_1.0.xml label, so click OK and let's try it out. Right-click on the file name again and select Validate. The warning in the Problems pane should disappear, and you'll see a pop-up telling you the validation completed with no errors or warnings (you can turn this pop-up off if you like by checking the "Do not show this dialog in future" box in the pop-up).
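Eclipse stores this mapping internally, but for reference, the same entry expressed as a standalone OASIS XML Catalog file would look roughly like the sketch below. The relative path is an assumption: it supposes the catalog file sits in the project root next to the Schema folder, so adjust it to your own layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- name: the namespace URI that appears in the label's xmlns attribute
       uri:  where the local copy of the defining schema lives
             (path is an assumed location) -->
  <uri name="http://pds.nasa.gov/pds4/pds/v08"
       uri="Schema/PDS4_PDS_0800k.xsd"/>
</catalog>
```

The label itself never mentions the local path; only the catalog does, which is what keeps local file references out of PDS labels.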
Watch It Work
To demonstrate that validation is now working, try changing the spelling of any of the tags inside the Product_Collection tag, then re-validate. Remember to change both the opening and the closing tag; otherwise you'll mostly see syntax errors. For example, I removed the underscore from the <Identification_Area> tags, and this is what I see after validating:
You can see the full text of the error message by hovering the mouse over the line beginning with cvc-complex-type, by clicking on it and looking at the bottom of the Eclipse window, or by double-clicking on it.
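For clarity, the test edit described above amounts to something like the following sketch. Only the two renamed tags matter; the rest of the label is elided here and stays as it was in the sample file.

```xml
<Product_Collection xmlns="http://pds.nasa.gov/pds4/pds/v08">
  <!-- Deliberately misspelled: the underscore is removed from both the
       opening and the closing tag, so the file is still well-formed XML
       but no longer matches the schema -->
  <IdentificationArea>
    <!-- child elements unchanged -->
  </IdentificationArea>
</Product_Collection>
```

Restore the underscore in both tags and re-validate to confirm the error goes away.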
Local Dictionary Schemas
Local dictionaries include both discipline dictionaries and dictionaries produced by missions and large data suppliers. In order to validate against the contents of a local dictionary, you must add a namespace declaration to the label and a catalog entry that points to the schema that defines that namespace. The second label in the test file package, hi173794441_9080000_001_rr.xml, references mock-ups for three local namespaces. The schemas for these namespaces are included in the test file package in the Dictionaries/ subdirectory. Add XML catalog entries for these schemas and you can see a more complex validation at work in this product label.
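As a rough sketch of where this ends up, a catalog covering the common namespace plus one local dictionary might look like the following in OASIS XML Catalog file form. The second entry is entirely hypothetical: the namespace URI and the file name are placeholders, not the actual values used by the mock-up dictionaries, so take the real ones from the xmlns declarations in the label and from the files in Dictionaries/.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- PDS4 common namespace, as configured earlier
       (relative path is an assumed location) -->
  <uri name="http://pds.nasa.gov/pds4/pds/v08"
       uri="Schema/PDS4_PDS_0800k.xsd"/>
  <!-- HYPOTHETICAL local dictionary entry: replace both values with the
       namespace URI declared in the label and the matching schema file -->
  <uri name="http://example.org/hypothetical/local-dictionary/v01"
       uri="Dictionaries/hypothetical_local_dictionary.xsd"/>
</catalog>
```

You would add one such entry per local namespace, which means three for the second sample label.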