Some Things You Should Know About XML Before You Start

From The SBN Wiki
Revision as of 12:21, 22 July 2014 by Raugh (talk | contribs) (Creation)

Following are some things that you should be aware of before you embark on creating your first PDS4 XML label. They matter most if you're coming from the PDS3 world, where things like case and keyword order were largely irrelevant, or from the HTML world, where closing tags can be treated fairly cavalierly.

XML Syntax

XML Tags Must Be Closed

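Unlike HTML, all XML tags must be closed. An opening tag such as <p> with no matching </p> is not valid, and neither is a bare HTML-style <br>. An element with no content must be written either with an explicit closing tag (<br></br>) or in the self-closing form (<br/>).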
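
One quick way to see this rule in action is to hand a fragment to an XML parser and see whether it complains. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample fragments are made up for illustration and are not part of any PDS4 tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return (True, None) if text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# An HTML-style line break with no closing tag is not well-formed XML.
unclosed = "<description>First line<br>Second line</description>"

# The self-closing form (or an explicit </br>) fixes it.
closed = "<description>First line<br/>Second line</description>"

print(is_well_formed(unclosed))  # (False, <parser error message>)
print(is_well_formed(closed))    # (True, None)
```

The parser's message usually includes a line and column number, which is often the fastest way to find the tag that was left open.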


XML Tags Must Be Closed in Order

All XML tags must be opened and closed in strict Last Opened - First Closed order. That is, all tags opened inside one tag must be closed before you close the outside tag. So this is valid:

<em>This is <strong>OK</strong></em>

But this is not:

<em>This is <strong>NOT VALID</em></strong>
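
The "last opened, first closed" rule is the same discipline as a stack, and a small sketch can make that concrete. The code below is an illustration only: the function name first_nesting_error is hypothetical, the regular expression handles plain start, end, and self-closing tags but not comments, CDATA sections, or processing instructions, and real code should rely on an XML parser rather than on this.

```python
import re

# Matches a start tag, an end tag, or a self-closing tag. Attributes are
# tolerated; comments, CDATA, and processing instructions are not handled.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")


def first_nesting_error(text):
    """Return a message for the first nesting problem, or None if tags nest."""
    open_tags = []  # stack: the last tag opened must be the first one closed
    for match in TAG.finditer(text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <tag/> opens and closes itself
        if not closing:
            open_tags.append(name)
        elif not open_tags:
            return f"</{name}> closes a tag that was never opened"
        elif open_tags[-1] != name:
            return f"</{name}> found, but <{open_tags[-1]}> must be closed first"
        else:
            open_tags.pop()
    if open_tags:
        return f"<{open_tags[-1]}> is never closed"
    return None


print(first_nesting_error("<em>This is <strong>OK</strong></em>"))
print(first_nesting_error("<em>This is <strong>NOT VALID</em></strong>"))
```

The first call finds nothing to report; the second reports that the inner tag has to be closed before the outer one.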

Character Restrictions

Do not put the less than (<) or ampersand (&) characters directly in your text fields: an XML parser will always assume that '<' begins a tag and that '&' begins an entity reference (a stand-in for a character that is not available for one reason or another). The greater than (>) character is technically permitted in most text, but the safest practice is to escape it as well. Use the entity references for these characters:

Use "&lt;" for the '<' character.
Use "&gt;" for the '>' character.
Use "&amp;" for the '&' character.

You must make these substitutions in every text field where you want to use these characters. So, for example, in a table field description of a PDS4 label you might see something like this:

This field is set to "-999" when the observed counts are &lt; 10000.

The unescaped form below would cause a syntax error:

This field is set to "-999" when the observed counts are < 10000.
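
If you generate label text from code, it is usually safer to let a library make the substitutions than to do them by hand. The sketch below uses two standard Python facilities: xml.sax.saxutils.escape, which replaces '&', '<', and '>', and xml.etree.ElementTree, whose serializer escapes element text on output. The description string and the element name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# A made-up description that uses all three special characters.
raw = "Set to -999 when counts > 10000 & background < 5."

# Option 1: escape the string yourself before pasting it into a label.
escaped = escape(raw)
print(escaped)
# Set to -999 when counts &gt; 10000 &amp; background &lt; 5.

# Option 2: build the element and let the serializer do the escaping.
field = ET.Element("description")
field.text = raw  # assign the raw text, NOT the pre-escaped text
serialized = ET.tostring(field, encoding="unicode")
print(serialized)
```

Pick one option or the other. Assigning already-escaped text to an element would escape it twice, turning "&gt;" into "&amp;gt;".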

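A conforming XML parser decodes these entity references for you when it hands back the text of a field. If your code reads a label as raw text instead, you will need to remember to decode the entity references yourself before proceeding.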
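
Both routes to decoded text can be sketched in a few lines of Python. The example assumes a made-up description fragment; xml.etree.ElementTree and xml.sax.saxutils.unescape are standard library, and unescape handles only "&amp;", "&lt;", and "&gt;" unless you pass it additional entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

# A made-up fragment as it would appear in a label file.
fragment = (
    "<description>Set to -999 when counts &gt; 10000 "
    "&amp; background &lt; 5.</description>"
)

# Route 1: parse the XML. The parser returns already-decoded text.
parsed_text = ET.fromstring(fragment).text
print(parsed_text)
# Set to -999 when counts > 10000 & background < 5.

# Route 2: you pulled the field out as a raw string (for example with a
# text search), so the entity references are still present.
raw_field = "counts &gt; 10000 &amp; background &lt; 5"
print(unescape(raw_field))
# counts > 10000 & background < 5
```

Where possible, the first route is the less error-prone one, since the parser also takes care of numeric character references that unescape does not touch.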

XML Schema (XSD)

XML Schema is Strictly Ordered

Content defined with the XML Schema definition language (XSD) is, in practice, strictly ordered. That is, attributes and classes (both of which appear in a label as XML elements) must appear in the order in which they are defined in the XSD file. While XSD does offer ways to relax this, they are difficult to use and can have a serious negative impact on validation. So unless otherwise indicated, you should assume that you must put classes and attributes in the order illustrated.

Note that while schema-aware editors can tell you whether any particular class or attribute is a valid choice, they tend to sort the options alphabetically - so it can be very difficult to guess which order attributes should be in for a large class. If you get an error message that an attribute is not valid at a particular place when you know the attribute does belong in the class, then it is almost certainly an ordering error. Check the XSD or the PDS4 Information Model for correct ordering.
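
To make the idea of an ordering error concrete, here is a small sketch that compares the children of one class against an expected order. It is not schema validation: the order list is hard-coded and assumed here for illustration only (the authoritative order is whatever the XSD says), the function name first_out_of_order is hypothetical, and the logical identifier in the sample is made up.

```python
import xml.etree.ElementTree as ET

# Order ASSUMED for illustration; always check the XSD for the real order.
EXPECTED_ORDER = [
    "logical_identifier",
    "version_id",
    "title",
    "information_model_version",
    "product_class",
]


def first_out_of_order(parent, expected_order):
    """Return the name of the first child that comes too late, or None."""
    rank = {name: i for i, name in enumerate(expected_order)}
    highest = -1
    for child in parent:
        name = child.tag.rsplit("}", 1)[-1]  # drop any "{namespace}" prefix
        position = rank.get(name)
        if position is None:
            continue  # not in our list; a real validator would flag this
        if position < highest:
            return name
        highest = position
    return None


# Every element here is legal in the class, but title precedes version_id.
fragment = """
<Identification_Area>
  <logical_identifier>urn:nasa:pds:example_bundle:data:example_product</logical_identifier>
  <title>Example Product</title>
  <version_id>1.0</version_id>
</Identification_Area>
"""

print(first_out_of_order(ET.fromstring(fragment), EXPECTED_ORDER))
# version_id
```

This mirrors the typical validator complaint: the element named in the message is valid for the class, it has simply turned up after something that should follow it.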