Configuring XML Schema validation

From The SBN Wiki
Latest revision as of 20:29, 3 August 2017

If you haven't already downloaded the sample files used for the "Creating projects and importing files" demo, now would be a good time to do so. We'll be using the schemas and labels in that package to test that our configuration is working.

In the XML File

XML Catalog files provide a map from logical references in the schema and label files to physical copies of the defining PDS4 schemas that you've stashed in a directory somewhere on your system. Since PDS labels should not contain links to local versions of, for example, the namespace-defining schemas, this is a critical component to get right. Plus learning how to create and edit XML Catalog files will make your life easier in the long run if you're planning to do more than a little PDS4 label development.

Let's start by opening the collection_1.0.xml file (double-click it or whatever). If this is the first time you've opened a file in eclipse, it will probably open in "Design" view, which looks like this:

DesignView.png

Click on the Source tab at the bottom left of the editor pane to see the source code:

SourceView.png


The very first line declares that this is an XML file following version 1.0 of the XML standard and using the UTF-8 character encoding. We address the <?xml-model> processing instruction in Configuring Schematron validation. So for now let's focus our attention on the first line of the actual label:

   <Product_Collection xmlns="http://pds.nasa.gov/pds4/pds/v08"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

More specifically, examine the values of the two xmlns attributes. The xmlns attribute tells the processing program (in this case, eclipse) that a particular namespace will be referenced - the implication being that if you wish to be successful in processing the XML to follow, you should educate yourself on the details of that namespace. If you're a program that wants to, for example, validate the content that follows, you'd better be able to find and read the schema document(s) that define that namespace.

So in the <Product_Collection> tag we see references to two namespaces:

  xmlns="http://pds.nasa.gov/pds4/pds/v08"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

The first reference you will come to know and love, or at least recognize, as the PDS4 common namespace, in this case with a version number ("v08") indicating it is a pre-release draft. Unless otherwise stated, everything in the label is presumed to come from the pds namespace. How do you know that? Because there is no namespace prefix specified, as there is for the second namespace ("xsi", in this case). So any time I see an element name, like Product_Collection, without a namespace prefix, I assume it is from the pds namespace.
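To make the default-namespace rule concrete, here is a minimal sketch of the same opening structure written with an explicit prefix instead. The prefix name "pds" is chosen only for illustration and the child content is elided; to a namespace-aware processor this form should be equivalent to the unprefixed one, because both bind the element names to the same namespace URI.

```xml
<!-- Equivalent to the unprefixed form: "pds" is bound to the same namespace URI -->
<pds:Product_Collection xmlns:pds="http://pds.nasa.gov/pds4/pds/v08">
  <pds:Identification_Area>
    <!-- child elements elided -->
  </pds:Identification_Area>
</pds:Product_Collection>
```

Declaring the namespace as the default, as the sample label does, simply saves you from repeating the prefix on every element.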

The definition for the other namespace, the one associated with the xsi prefix in the line above, is built into eclipse - so you don't need to provide a schema file to define that namespace. This namespace holds attributes and elements that are part of the XML Schema language. In fact, we don't use any of those in this label, and the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" part can be removed if you like.
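For context, the sketch below shows one common use of the xsi prefix in XML documents generally: an xsi:schemaLocation attribute that pairs a namespace URI with a schema location. This is not part of the sample label, and the relative path shown is a hypothetical placeholder. It is exactly the kind of local link that PDS labels should not contain, which is why this tutorial relies on XML Catalog entries instead.

```xml
<!-- Illustration only: not recommended for PDS labels -->
<Product_Collection xmlns="http://pds.nasa.gov/pds4/pds/v08"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://pds.nasa.gov/pds4/pds/v08 Schema/PDS4_PDS_0800k.xsd">
  <!-- label content elided -->
</Product_Collection>
```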

Validating a Label

So let's try to validate this label against the pds namespace schema definition. One way to do that is to right-click on the file name in the Project Explorer window, go about halfway down the menu, and select Validate. When you do that, you'll see a warning in the Problems pane below the editing pane:

NoGrammarConstraintsWarning.png

This is eclipse telling you that it can't do much validating unless you tell it what to validate against. Since we have the PDS4 master schema that we want to validate against, now's the time to set up an XML Catalog entry so eclipse can turn that namespace reference into a schema file location.

Creating XML Catalog Entries

XML Catalog information is configured as one of the items in the Window->Preferences menu. Look towards the bottom for XML, expand it, and click on XML Catalog to see the current catalog entries:

XMLCatalogPreferences.png

The "User Specified Entries" list is where we can create additional entries to translate namespace URIs, like http://pds.nasa.gov/pds4/pds/v08, into references to the copy of the master PDS4 schema file that you'll find in the Schema subdirectory of our demo project. Click the Add... button to get to this dialogue box:

AddXMLCatalogElementsBlank.png

Eclipse maintains the XML catalog internally, so we'll be adding single entries to it via this dialogue box. We're doing straightforward mappings, so select Catalog Entry from the top of the column of icons on the left.

We want to add a reference to the PDS4 master schema which we know is in our current workspace, so click the Workspace button, navigate down into the Schema folder and select the PDS4_PDS_0800k.xsd file.

XMLCatalogPref1.png

(Alternatively, you can click the File System... button and select the file by navigating to it in the usual way. Either method will work, the point being that you can have a local schema repository that is not part of your personal workspace - a handy thing for group development.)

It does matter which of the .xsd files you select. The PDS4_PDS_0800k.xsd file is the one containing the namespace definition that corresponds to the "/v08" test version. The other schemas are there so you can try pointing to the wrong one to see what happens.
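A schema "contains the namespace definition" when its root element names that namespace as its target. The fragment below is a sketch of what such a schema header typically looks like. It is not copied from PDS4_PDS_0800k.xsd, so the prefixes and the elementFormDefault setting are assumptions; only the namespace URI comes from the label.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:pds="http://pds.nasa.gov/pds4/pds/v08"
    targetNamespace="http://pds.nasa.gov/pds4/pds/v08"
    elementFormDefault="qualified">
  <!-- element and type definitions follow -->
</xs:schema>
```

If you open one of the other .xsd files in the Schema folder, comparing its targetNamespace value with the xmlns value in the label is a quick way to see why it would be the wrong choice.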

Once you've selected the schema file and clicked the OK button, eclipse sets the Key type: value to "Namespace name" in the XML Catalog Element dialogue because it recognizes that the schema defines a namespace. This in turn a) means it successfully found and opened the file, and b) saves you the trouble of cutting and pasting the namespace into the Key: box. At this point, you can click the OK button and you'll be returned to the XML Catalog Preferences dialogue box, this time showing the new User Specified entry:

XMLCatalogPref2.png
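Eclipse stores this mapping internally, but it may help to see what the equivalent entry would look like in a standalone OASIS XML Catalog file. The sketch below assumes a schema path relative to the catalog file; that path is a placeholder, and eclipse may record the location differently (for example, as a workspace or file URI).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the PDS4 common namespace URI to a local copy of the schema -->
  <uri name="http://pds.nasa.gov/pds4/pds/v08"
       uri="Schema/PDS4_PDS_0800k.xsd"/>
</catalog>
```

The name attribute holds the key (the namespace URI from the label), and the uri attribute holds the physical location the validator should read instead.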

Note that you can edit or remove the entry you just made if you find that something went wonky. We only need this one catalog mapping for the collection_1.0.xml label, so click OK and let's try it out. Right-click on the file name again and select Validate. The warning in the Problems pane should disappear, and you'll see a pop-up telling you the validation completed with no errors or warnings. (You can turn this pop-up off if you like by checking the "Do not show this dialog in future" box in the pop-up.)

Watch It Work

To demonstrate that validation is now working, try changing the spelling of any of the tags inside the Product_Collection tag, then re-validate. Remember to change both the opening and the closing tag, otherwise you'll mostly see syntax errors rather than schema errors. For example, I removed the underscore from the <Identification_Area> tags, and this is what I see after validating:

IDAreaError.png

You can see the full text of the error message by hovering the mouse over the line beginning with cvc-complex-type, by clicking on it and looking at the bottom of the eclipse window, or by double-clicking on it.
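For reference, the deliberately broken label described above would look roughly like the sketch below, with the contents of the element elided. Because the opening and closing tags still match, the file remains well-formed XML, so the only complaint comes from the schema, which has no element by that name.

```xml
<Product_Collection xmlns="http://pds.nasa.gov/pds4/pds/v08">
  <!-- Misspelled on purpose: underscore removed from both tags -->
  <IdentificationArea>
    <!-- child elements elided -->
  </IdentificationArea>
</Product_Collection>
```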

Local Dictionary Schemas

Local dictionaries include both discipline dictionaries and dictionaries produced by missions and large data suppliers. In order to validate against the contents of a local dictionary, you must add a namespace declaration to the label and a catalog entry that points to the schema that defines that namespace. The second label in the test file package, hi173794441_9080000_001_rr.xml, references mock-ups for three local namespaces. The schemas for these namespaces are included in the test file package in the Dictionaries/ subdirectory. Add XML catalog entries for these schemas and you can see a more complex validation at work in this product label.
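As a rough sketch of the pairing involved, the fragment below shows a catalog entry for one local dictionary in OASIS XML Catalog form. The "ex" prefix, the example.org namespace URI, and the schema file name are all hypothetical placeholders, not the actual values used in the test package; substitute the namespace URIs declared in the label and the matching files in Dictionaries/.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- The label would carry a matching declaration on its root element,
       e.g. xmlns:ex="http://example.org/pds4/ex/v0" (hypothetical) -->
  <uri name="http://example.org/pds4/ex/v0"
       uri="Dictionaries/example_dictionary.xsd"/>
</catalog>
```

One entry of this kind is needed per local namespace, in addition to the entry for the PDS4 common namespace.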