User:Raugh

From The SBN Wiki
Revision as of 18:25, 1 November 2012 by Raugh (talk | contribs) (Notes for Moscow prep)

XML Tools Notes

Xerces

Xerces is a suite of tools for creating and using XML in a variety of environments. It is an Apache open source project. Its home page is:

http://xerces.apache.org/

It is currently divided into four separate projects:

  1. XML Commons - components used by all the projects. These include APIs for various functions and a catalog resolver.
  2. Perl - XML::Xerces, a Perl library for parsing and validating. It is not the only Perl XML library available; it simply uses the Xerces code, which includes XML 1.1 compatibility.
  3. Java - the same, but for Java
  4. C++ - the same, but for C++

These run on the three major platforms. Installation is non-trivial.

libxml2

Developed as part of the GNOME project but useful independently, libxml2 is an XML parser and toolkit written in C.



Topics to present

Describe the types of parsers and how they are used. Understand SAX vs DOM and other terminology.

Show a working programming example of at least one parser and one validator.

Show an example of a working command-line validator.

List the standards being used and describe each.

Investigate:

  • xmllint
  • libxml

Parsers

XML DOM: Document Object Model. This model views the XML document as a series of nodes that together build a tree (descending from the root node). Everything is a node - text content, elements, attributes of elements, and comments. The nodes have types corresponding to their content, and the nodes are inherently ordered according to their lexical order in the file.
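
A minimal sketch of the node model, using Python's standard xml.dom.minidom module rather than Xerces or libxml2 (an assumption made only to keep the example short). The sample document, the describe helper, and the NODE_TYPE_NAMES table are hypothetical names invented for illustration. It walks the tree from the root element and records each node's type in document order.

```python
from xml.dom import minidom, Node

# Hypothetical sample document, invented for illustration.
SAMPLE = '<product id="p1"><!-- a note --><title>Image</title></product>'

# Human-readable labels for the node types this sketch cares about.
NODE_TYPE_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}


def describe(node, found):
    """Append (type, name-or-content) pairs to found, in document order."""
    kind = NODE_TYPE_NAMES.get(node.nodeType, "other")
    if node.nodeType == Node.ELEMENT_NODE:
        found.append((kind, node.tagName))
        # Attributes are nodes too, but they hang off the element
        # rather than appearing in its list of children.
        attrs = node.attributes
        for i in range(attrs.length):
            found.append(("attribute", attrs.item(i).name))
    else:
        # Text and comment nodes carry their content in .data.
        found.append((kind, node.data))
    for child in node.childNodes:
        describe(child, found)
    return found


doc = minidom.parseString(SAMPLE)
nodes = describe(doc.documentElement, [])
for kind, value in nodes:
    print(kind, repr(value))
```

The element, its attribute, the comment, the child element, and the text content each show up as a separate node, in the same order they appear in the file.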

DOM Parser: A DOM parser works by reading an XML file into a DOM structure in memory. The program can then crawl the tree in various ways, find nodes with specific properties or content, and also create and delete nodes.
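
The operations described above can be sketched with Python's standard xml.dom.minidom module (again an assumption; the same steps exist in the Xerces and libxml2 APIs under different names). The sample catalog and its item names are invented for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for illustration.
SAMPLE = "<catalog><item name='a'/><item name='b'/><item name='c'/></catalog>"

# Parsing reads the whole document into an in-memory tree.
doc = minidom.parseString(SAMPLE)
root = doc.documentElement

# Find nodes with a specific property: the <item> whose name is "b".
matches = [e for e in root.getElementsByTagName("item")
           if e.getAttribute("name") == "b"]

# Delete the matching node from the tree.
for element in matches:
    root.removeChild(element)

# Create a new node and attach it as the last child of the root.
new_item = doc.createElement("item")
new_item.setAttribute("name", "d")
root.appendChild(new_item)

names = [e.getAttribute("name") for e in root.getElementsByTagName("item")]
print(names)
print(root.toxml())
```

Because the whole tree is held in memory, the program can edit it freely and write it back out, at the cost of memory that grows with the size of the document.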

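SAX Parser: (Simple API for XML) A SAX parser works sequentially through the document, generating an event for each piece of markup or content it encounters; the program handles these events in callback methods, typically on a handler object. Originally developed for use in Java, it has since been generalized to other languages but has never been formalized as a standard.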
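
A minimal sketch of the event-driven style, using Python's standard xml.sax module (an assumption; the Java SAX API that it is modelled on uses the same handler method names). The EventRecorder class and the sample document are hypothetical, invented for illustration. The handler simply records each event as the parser reports it.

```python
import xml.sax

# Hypothetical sample document, invented for illustration.
SAMPLE = b"<catalog><item>first</item><item>second</item></catalog>"


class EventRecorder(xml.sax.ContentHandler):
    """Collects the events the parser fires, in the order they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # The parser may deliver text in more than one chunk;
        # whitespace-only chunks are skipped here.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventRecorder()
xml.sax.parseString(SAMPLE, handler)
for event in handler.events:
    print(event)
```

Unlike the DOM approach, nothing is kept in memory unless the handler chooses to keep it, which is why SAX suits large documents that only need one pass.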



Pages I'm Working on But Haven't Linked in Yet

XML Primer for PDS4