Configuring Schematron validation

From The SBN Wiki
Revision as of 15:54, 2 August 2017 by Akash (talk | contribs) (updated dead .zip link)

The PDS4 core standards are contained in two separate schema documents. The first is the master schema, an XML Schema document that defines structures (i.e., classes and attributes). The second is a Schematron file. The Schematron Standard provides a syntax for defining relationships between XML document elements, standard value syntax, and conditions more elaborate than the XML Schema standard allows (like co-dependency between elements).

Note: The examples below reference the sample schema files used elsewhere in the Eclipse tutorial pages. The files are available from here:

These are preliminary versions of the released PDS4 Master Schema and related files, and thus have pre-release version numbers. If you're working with a different version of the schema files, you will need to use the appropriate version numbers for file names, information model version, and so on.

Project Configuration

We've already installed the Schematron plug-in and set the validation preferences in Downloading, installing and configuring Eclipse. There is one additional configuration step that must be performed once for each project.

Right-click on the project name in the Project Explorer panel and you should see an Add/Remove Schematron Validation option:


You only have to click on this once for each project, unless, of course, you want to turn Schematron validation off, or you want to bring back errors that you deliberately deleted from the Problems tab. You will not see this option until you have installed the Schematron plug-in.

Unfortunately, there is no easy way to tell if Schematron validation is currently on or off. Below, we'll cover how to tell if it's working in our demo files.

The last step is to tell Eclipse where to find the Schematron file to be applied to each label.

In the XML Label

Open the collection_1.0.xml label and look at the top of the file:


Remember that <?xml-model> processing instruction line we glossed over a couple pages back? Let's look at it now:

  <?xml-model href=""?>

This instruction is trying to tell Eclipse that there is a Schematron file (the .sch suffix is commonly reserved for Schematron files) in the PDS4 v08 namespace that should be used for validation.

Ideally, Eclipse would use the XML Catalog file to translate this reference to a physical file location. Unfortunately, the Schematron plug-in does not appear to use the XML Catalog information at all. To avoid making more copies of the files than necessary, we're going to reference the copy of the Schematron file that is in the Schema/ directory. This way, we only need one copy for anything in the current project.

Note: This is a significant problem for validating PDS labels with Eclipse, since local references should never be left in a file destined for archiving. Until someone updates the plug-in, however, we're stuck with it. Just remember to replace the reference before archiving, or the label will fail PDS validation.
(Commercial editors do a better job with this. The oXygen editor, for example, resolves the href values via the catalog file. If you find this has been fixed by an upgrade to the plugin, let us know.)
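
A small pre-archive check can help catch a local reference that was left in by mistake. The Python sketch below is not part of Eclipse or any PDS tool. It assumes that an archive-ready reference is an absolute http(s) URL, and it treats every other href found in an xml-model instruction as local. The function name local_schematron_refs is made up for this illustration.

```python
import re
from urllib.parse import urlparse

# Matches the href pseudo-attribute inside an <?xml-model ...?> instruction.
XML_MODEL_HREF = re.compile(
    r'<\?xml-model\b[^?]*?\bhref\s*=\s*(["\'])(.*?)\1', re.DOTALL
)


def local_schematron_refs(label_text):
    """Return xml-model href values that are not absolute http(s) URLs.

    Assumption: anything without an http or https scheme is a local
    reference that should be replaced before archiving.
    """
    local = []
    for match in XML_MODEL_HREF.finditer(label_text):
        href = match.group(2)
        if urlparse(href).scheme not in ("http", "https"):
            local.append(href)
    return local
```

Calling local_schematron_refs on the text of the edited collection_1.0.xml label should return a list containing Schema/PDS4_PDS_0800k.sch, which is the reminder that the reference still needs to be replaced.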

So, to get Schematron validation to work in our collection_1.0.xml label, change the second line to this:

  <?xml-model href="Schema/PDS4_PDS_0800k.sch"?>

(The href path is relative to the project root.)


Here's what the top of our label looks like now:


Save the file, and run validation. You should get no errors. (If you do, look for typos in file names and such.)
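
If you suspect a typo in the href, one way to rule it out is to check that the path really resolves to a file under the project root. The following Python sketch does that outside Eclipse. It assumes the href is a relative path, as in the example above, and the function name unresolved_hrefs is hypothetical.

```python
import re
from pathlib import Path
from xml.dom import Node, minidom

# Pulls the href pseudo-attribute out of a processing instruction's data.
HREF = re.compile(r'\bhref\s*=\s*(["\'])(.*?)\1')


def unresolved_hrefs(label_path, project_root):
    """Return xml-model hrefs that do not point at a file under project_root.

    Assumption: hrefs are paths relative to the project root.
    """
    doc = minidom.parse(str(label_path))
    missing = []
    # Processing instructions in the prolog are children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-model"):
            match = HREF.search(node.data)
            if match and not (Path(project_root) / match.group(2)).is_file():
                missing.append(match.group(2))
    return missing
```

An empty list means every referenced Schematron file was found. Anything else is a path to double-check for spelling and case.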

Note that if you remove a Schematron error in the Problems tab of Eclipse, this error will not show up again even if you rerun validation. To get deleted errors to show up again, you must disable and then re-enable Schematron validation.

Also note that all versions of Eclipse require the Java Runtime Environment (JRE), version 1.6 or later. Unless you're working on a machine more than about 3 years old that is not kept updated and patched, you likely will have this available to you.

One of the things enforced by the Schematron file is required values for some system keywords. The <information_model_version> element is one such system keyword: its value has to match the precise version number of the master schema. Try changing one of the characters in the value. Then save the label file and run validation. You should now see an error message that tells you, in a useful amount of detail, that your value is not the same as the required value.
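
To make the nature of this check concrete, here is a rough Python imitation of what such a rule does. It is not the actual PDS4 Schematron rule and does not reproduce the real error message. The function name check_model_version, the simplified sample label, and the required value 0.8.0.0.k (guessed from the file name PDS4_PDS_0800k.sch) are all assumptions, so substitute the value that your schema version requires.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical label fragment used only for illustration.
SAMPLE_LABEL = """<Product_Collection>
  <Identification_Area>
    <information_model_version>0.8.0.0.k</information_model_version>
  </Identification_Area>
</Product_Collection>"""


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix from a tag, if present."""
    return tag.rsplit("}", 1)[-1]


def check_model_version(label_text, required):
    """Return one message per information_model_version value != required."""
    root = ET.fromstring(label_text)
    messages = []
    for elem in root.iter():
        if local_name(elem.tag) == "information_model_version":
            found = (elem.text or "").strip()
            if found != required:
                messages.append(
                    f"information_model_version is '{found}' "
                    f"but must be '{required}'"
                )
    return messages
```

With the sample label unchanged, check_model_version(SAMPLE_LABEL, "0.8.0.0.k") returns an empty list. Change a single character in the value and it returns one message, which mirrors the experiment described above.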