Difference between revisions of "User:Raugh"

From The SBN Wiki
=== DOI Stuff ===
* The [[SBN DOI Wiki]]
* [[DataCite Schema]]
* [[SBN DOI Policy]] - actually, more like philosophy
* [[SBN DOI Procedures ]]

Revision as of 20:31, 9 June 2020

XML Tools Notes


Xerces is a suite of tools developed for creating and using XML in a variety of environments. It is an Apache open source project. Home page is:


Currently it is broken into four separate projects:

  1. XML Commons - components used by all projects. These include APIs for various functions and a catalog resolver.
  2. Perl - XML::Xerces, a Perl library for parsing and validating (not the only one available, but it uses the Xerces code, which includes XML 1.1 compatibility)
  3. Java - the same, but for Java
  4. C++ - the same, but for C++

These run on the three major platforms. Installation is non-trivial.


Part of GNOME development, but independently useful, this is an XML parser and toolkit written in C.

Topics to present

Describe the types of parsers and how they are used. Understand SAX vs DOM and other terminology.

Show a working programming example of at least one parser and one validator.

Show an example of a working command-line validator.

List the standards being used and describe each.


  • xmllint
  • libxml

Terms to Define

JAXP: Java API for XML Processing. This is a package of utilities for use in Java coding that includes DOM and SAX parsers and other processing utilities.
JSON: JavaScript Object Notation. A text format, derived from JavaScript object syntax, for describing structured data. It has also been applied to similarly structured things, like XML documents. A well-formed XML document can be described in JSON (using the DOM).
DOM: Document Object Model. This model views the XML document as a series of nodes that together build a tree (descending from the root node). Everything is a node: text content, elements, attributes of elements, and comments. The nodes have types corresponding to their content, and child nodes are inherently ordered according to their lexical order in the file (the attributes of an element are the exception and have no defined order).
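
As a rough illustration of the node model, and of how a well-formed document can be described in JSON by way of the DOM, here is a minimal Python sketch using the standard library's xml.dom.minidom. The sample document, the describe and xml_to_json helpers, and the layout of the JSON output are all invented for this note; this is one possible mapping, not a standard one.

```python
import json
from xml.dom import Node, minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = '<label id="a1"><!-- note --><title>Test</title></label>'


def describe(node):
    """Return a JSON-ready dict describing one DOM node and its children."""
    if node.nodeType == Node.ELEMENT_NODE:
        return {
            "type": "element",
            "name": node.tagName,
            # Attributes are nodes too, but they hang off the element
            # rather than appearing in its ordered list of children.
            "attributes": dict(node.attributes.items()),
            "children": [describe(child) for child in node.childNodes],
        }
    if node.nodeType == Node.TEXT_NODE:
        return {"type": "text", "value": node.data}
    if node.nodeType == Node.COMMENT_NODE:
        return {"type": "comment", "value": node.data}
    # Other node types (processing instructions, etc.) are only flagged.
    return {"type": "other", "nodeType": node.nodeType}


def xml_to_json(xml_text):
    """Parse XML text into a DOM and serialize its root element as JSON."""
    document = minidom.parseString(xml_text)
    return json.dumps(describe(document.documentElement), indent=2)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

Each node reports its own type, and the children list preserves the order in which the nodes appear in the file.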


DOM Parser: A DOM parser works by reading an XML file into a DOM structure in memory. The program can then crawl the tree in various ways, find nodes with specific properties or content, and also create and delete nodes.
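
A small sketch of that workflow, assuming Python's standard xml.dom.minidom as the DOM parser (any DOM implementation would look broadly similar). The catalog document and the edit_catalog function are hypothetical, made up only to show finding, deleting, and creating nodes.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<catalog><item id='1'>alpha</item><item id='2'>beta</item></catalog>"


def edit_catalog(xml_text):
    """Load XML into a DOM tree, delete one node, add another, and re-serialize."""
    document = minidom.parseString(xml_text)  # whole document is now in memory
    root = document.documentElement

    # Find nodes with a specific property: every <item> whose id is "1".
    # Copy the result to a plain list so the tree can be edited while looping.
    for item in list(root.getElementsByTagName("item")):
        if item.getAttribute("id") == "1":
            root.removeChild(item)  # delete the node from the tree

    # Create a new element with an attribute and text content, then attach it.
    new_item = document.createElement("item")
    new_item.setAttribute("id", "3")
    new_item.appendChild(document.createTextNode("gamma"))
    root.appendChild(new_item)

    return root.toxml()


if __name__ == "__main__":
    print(edit_catalog(SAMPLE))
```

Because the whole tree lives in memory, the program can move freely around it, at the cost of memory use that grows with document size.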

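SAX Parser: (Simple API for XML) A SAX parser works sequentially through the document, generating events (such as the start of an element, character data, or the end of an element) that the program handles through callbacks. Originally developed for use in Java, it has since been ported to other languages, but it remains a de facto standard rather than a formal specification.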
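
For comparison with the DOM approach, here is a minimal sketch of event-driven parsing using Python's standard xml.sax module. The EventLogger class, the collect_events function, and the sample document are hypothetical names invented for this note. The handler simply records each event it receives, so no tree is ever built.

```python
import xml.sax

# Hypothetical sample document, invented for this illustration.
SAMPLE = b"<catalog><item id='1'>alpha</item><item id='2'>beta</item></catalog>"


class EventLogger(xml.sax.ContentHandler):
    """Record the SAX events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when an opening tag is read; attrs holds its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Called for character data; a parser may deliver one run of text
        # in several pieces, so callers should not assume one call per run.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        # Called when the matching closing tag is read.
        self.events.append(("end", name))


def collect_events(xml_bytes):
    """Run the parser over the document and return the recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


if __name__ == "__main__":
    for event in collect_events(SAMPLE):
        print(event)
```

The program only ever sees one event at a time, which keeps memory use low but means any state it needs (for example, which element it is currently inside) has to be tracked by the handler itself.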

Local PDS4 Development Topics

Pages I'm Contemplating But Haven't Created Yet

DOI Stuff