Using Validate Tool

From The SBN Wiki

The Validate tool is the canonical PDS4 validator for single labels, collections, and bundles. It provides additional functionality beyond schematic development, and advanced features are still in active development.

If you haven't already installed and configured Validate, you'll find instructions on the Installing and Configuring Validate Tool page on this wiki, as well as included in the download bundle.

Format

The general command format is:

% validate [files] [options]

"Files" can be a single label file name, a directory, or a comma-separated list of both. This list may include absolute and relative paths, wild cards, and file globs. Alternatively, the product labels to be validated can be specified as arguments to the -t option.

By default, if validate is given a directory as an argument it will attempt to validate all files ending in either ".xml" or ".XML" with the selected options, and will recurse down through all subdirectories doing the same.
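
As a minimal sketch of the format above, the following assembles a command that validates one label and one directory. The paths are hypothetical placeholders, and the command is only printed, so nothing runs until you copy it to your prompt.

```shell
# Hypothetical paths; substitute your own label and directory.
label="bundle_root/bundle.xml"
data_dir="bundle_root/data"

# The targets form a single comma-separated argument placed
# directly after the command name; any options would follow it.
targets="${label},${data_dir}"
cmd="validate ${targets}"

# Print the assembled command for review before running it.
echo "$cmd"
```

Because the second target is a directory, validate would by default check every ".xml" or ".XML" file under it, including those in subdirectories.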

Command Line Options

These options are grouped by functional area. All options have both a short and a long form. Either one or two hyphens can be used with both short and long options; long options cannot be truncated. Most options require arguments, but even those that do not cannot be combined into a single switch. For example, specifying "-Vh" will display only version information, not help information.

Program Information

These options provide information about the program itself.

Option | Argument | Notes
-V, -version | --none-- | This option displays the internal code version, the release date, and the core schemas applied by default by this version of validate, in addition to some licensing boilerplate.
-h, -help | --none-- | This option displays a command and option summary. It's about 50 lines long, so prepare to scroll.

Selecting What to Validate

These options, along with any files and directories included in the argument list, select and refine the files and directories validated. Whether included as arguments to validate or arguments to the -t option (below), the Validate Tool documentation refers to these collectively as "targets".

Note: No merging or duplicate removal is done on the list of files and directories supplied. If the list, when expanded, includes the same file or directory more than once, that element will be validated in its entirety each time it is listed. This is a waste of time - potentially a significant one for large file collections - so type carefully.
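
Since validate does no duplicate removal itself, one option is to clean the list before passing it in. The sketch below is a hypothetical helper, not part of Validate: it removes only entries that are literally identical, and it cannot detect overlap that arises later (for example, a file that also sits inside a listed directory, or two globs that expand to the same file).

```shell
# Remove exact duplicate entries from a comma-separated target list,
# keeping the first occurrence of each entry in its original position.
dedupe_targets() {
    printf '%s\n' "$1" | tr ',' '\n' | awk '!seen[$0]++' | paste -sd, -
}

# Hypothetical target list with one repeated directory.
targets="data/,calib/a.xml,data/"
dedupe_targets "$targets"
```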

Option | Argument | Notes
-t, -target | file/directory list | Use this option as an alternate way to specify which files and/or directories (i.e., "targets") to validate. In the absence of the -t option, the comma-separated list of targets must immediately follow the validate command. By using the option, the target list can appear at any point among the other options. The syntax and globbing options are identical whether the -t switch is used or not. It is possible to provide two target lists: one with the -t option, and one without. In this case the two lists are concatenated, and any options that modify file selection apply to the merged list.
-L, -local | --none-- | Note the uppercase "L" in the short version of this option. When present, this option prevents validate from recursing down into subdirectories of any directories included in the argument list or -t target list.
-e, -regexp | pattern[, pattern] | This option appears to be misnamed. It does not accept regular expressions in general, but rather provides an additional file-globbing filter beyond what might be in the argument or target list. It is applied only to the file name, not the path, and behaves as if anchored to the beginning and end of the file name. So -e "*.XML" will select only files with the explicit uppercase "XML" extension.

Beware of selecting files by name without extension. The option -e "table*" will select all files beginning with "table" regardless of file extension - most likely producing a validation error when validate attempts to validate data files as well as labels.
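
The matching behavior described for -e can be approximated with ordinary shell glob matching. The sketch below is an illustration only: it assumes validate's globbing behaves like a shell "case" pattern applied to the bare file name, and the helper and file names are hypothetical.

```shell
# Approximate the documented -e behavior with shell glob matching:
# the pattern is tested against the file name only (path stripped)
# and must match the whole name.
matches_e() {
    name=$(basename "$1")
    # $2 is left unquoted on purpose so it acts as a glob pattern.
    case "$name" in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical file names and patterns.
for f in data/table_01.xml data/table_01.dat data/image.XML; do
    for p in "table*" "table*.xml" "*.XML"; do
        if matches_e "$f" "$p"; then
            echo "$p selects $f"
        fi
    done
done
```

Under this approximation, "table*" selects both the label and the ".dat" data file, while "table*.xml" selects only the label, which is the safer pattern.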

Output Control

These options affect the format and destination of the output report. Three "INFO" messages produced at the start of every validate run go to the standard error output, but everything else is directed to standard output unless the -r option is present. You can also redirect the output to a file or a process, if desired.

Option | Argument | Notes
-r, -report-file | file name | This option directs the output to the named file.
-v, -verbose | 1 / 2 / 3 | This option affects the "Validation Details" listing in the report by changing the severity level of the messages generated. A value of 1 causes INFO level and above messages to be included in the details; a value of 2 (the default) includes WARNING level and above; a value of 3 includes only ERROR level messages. Note that the PASS and FAIL messages are always generated, irrespective of the verbosity setting.
-s, -report-style | json / xml | The default report format is human-readable. This option allows you to specify an alternate form for the output that is more congenial to programmatic analysis. Two alternatives are currently available: json, which produces JSON output; and xml, which produces output with XML tags (there is no schema published for these tags, but the tag names are formulated from the titles present in the default report format). Note that these values must be in lowercase. Using "JSON" rather than "json", for example, will not produce the expected error message for an unrecognized value, but rather just the literal word "null".
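
Because an uppercase report style fails silently with "null", a wrapper script may want to force the value to lowercase before building the command. The following is a minimal sketch; the helper, the target directory, and the report file name are all hypothetical, and the command is printed rather than run.

```shell
# Normalize a report style to lowercase, since the -s values are
# documented as case-sensitive ("JSON" yields only the word "null").
normalize_style() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

style=$(normalize_style "JSON")

# Hypothetical target directory and report file name.
# -v 1 includes INFO-level messages and above in the details.
cmd="validate bundle_root/ -r report.json -s ${style} -v 1"
echo "$cmd"
```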

Specifying Schemas

The Validate Tool comes with the released schema and Schematron files for the core ("pds") namespace built-in. By default, it will use the latest version of these for validation. If you're not sure what that version is, use the -version option to check. Older releases of the core namespace can be indicated via the -model-version option (below).
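
The -model-version option takes a "version code", which this page describes as the version ID with the '.' characters removed. A minimal sketch of that conversion follows. The helper name and the "labels/" directory are hypothetical, and the sketch declines version IDs containing a multi-digit component, because simple dot removal would be ambiguous there.

```shell
# Convert a dotted core-namespace version ID into the version code
# expected by -m / -model-version (all '.' characters removed).
version_code() {
    case "$1" in
        # Two adjacent digits mean a multi-digit component; refuse to guess.
        *[0-9][0-9]*)
            echo "multi-digit component in '$1': check the schema file name" >&2
            return 1
            ;;
    esac
    printf '%s' "$1" | tr -d '.'
}

code=$(version_code "1.7.0.0")
echo "validate labels/ -m ${code}"
```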

All other dictionary schema/Schematron files, including discipline dictionaries and any local dictionaries you have created, must be read in by validate - so you will have to have access to the relevant schema/Schematron files and will have to direct validate where to find them. A number of options are provided for that.

Note: PDS4 dictionaries consist of two files: a schema (.xsd) file and a Schematron (.sch) file. Both are needed to fully define any PDS4 namespace (core, discipline, or local). If you are validating files that reference non-core namespaces and do not provide the schema file, you will get an "ERROR" message in the "Validation Details" list containing a phrase like "The matching wildcard is strict, but no declaration can be found for element...", followed by an element from the missing namespace. If you omit the Schematron file, however, no notice of any kind will be generated - validate cannot tell that a Schematron file is missing due to the nature of Schematron validation (i.e., further constraints on top of the definitions contained in the schema file).

Triple-check your Schematron file lists to avoid missing validation errors.

Option | Argument | Notes
-m, -model-version | version code | By default, validate will validate all labels against the default version of the core namespace (listed in the -version option output). You can change the core namespace version used for validation to any previously released version with this option. The "version code" is the version ID with all the '.' characters removed - so to validate all labels against the 1.7.0.0 version of the core namespace, use "-m 1700".

Note: Use caution when combining this option with the -x and -S options (see below). This option is ignored without comment when the -f option is present.

-f, -force | --none-- | This option forces validate to validate each label against the schema and Schematron files actually referenced in the label (via, for example, schemaLocation attributes). Those references must be resolvable. So, for example, if you use a schemaLocation style that looks like a URL, validate will attempt to connect to that URL to download the file. If it can't, the program will issue a "FATAL_ERROR" and move on to the next label.

Note: This option cannot be used in conjunction with the -catalog option, so labels may contain only absolute paths or URL references that can be resolved through a network connection for this option to be viable. Neither can this option be used with the -schema and -schematron options.

-C, -catalog | XML catalog file | This option takes an XML catalog file as an argument and attempts to use it to resolve public and system ID references for validation files. (See the "Understanding XML Catalog Files" page on this wiki for more information on catalog files and their use.) In most respects, it fails.

There is an example of a catalog file that might work in the Validate Tool Operations guide, but the same effect can be achieved with less effort by using a configuration file (see below) with the schema file lists. In particular, this option fails to properly apply an XML Catalog file that makes use of the <rewriteURI> element. This, combined with the fact that it cannot be combined with the -force option, where you would likely want to be able to locate multiple different versions of various schema files, makes the functional usefulness of this option extremely limited.

-x, -schema | XSD file list | Use this option to list the schema (i.e., the "*.xsd") files, by location, to be used for validating labels. Typically, this will be used to provide the XSD part of discipline and local dictionary schemas, and should always be used in conjunction with the -schematron option. If you are using an older version of Validate with a new version of the core schema (a version later than the default version built into the tool), you will also need to supply that schema location via this option. The "location" can be either a directory on your system, or a URL (if you have a network connection and valid URL).

Caveats:

  • This option cannot be used in conjunction with the -force option.
  • If you are listing a core schema file, it must come first.
  • If you list more than one version of any single dictionary XSD file, only the last one listed will be applied.
  • Using this option more than once is not an error. The program behaves as if the various arguments to the option were concatenated into a single, comma-separated list in the order specified on the command line.
  • Using this option without the -schematron option is not an error and doesn't generate a warning - even though every PDS4 dictionary that has a schema file also has a Schematron file that must be used for validation as well.
  • Listing an XSD file in this option does not automatically include the corresponding Schematron (.sch) file - you must include it explicitly via the -schematron option.
  • There is no attempt to match the XSD file list in this option with any Schematron list that might or might not be provided via the -schematron option.
  • Attempting to use this option with the -model-version option does not produce an error, but the -model-version will override the core schema version listed by this option (but see the similar note for the -schematron option).
-S, -schematron | SCH file list | Use this option to list the Schematron (i.e., the "*.sch") files, by location, to be used for validating labels. Typically, this option will be used to provide the Schematron part of discipline and local dictionary schemas, and should always be used in conjunction with the -schema option. If you are using an older version of Validate with a new version of the core schema, you will also need to supply that Schematron location via this option. The "location" can be either a directory on your system, or a URL (if you have a network connection and valid URL).

Caveats:

  • This option cannot be used in conjunction with the -force option.
  • If you list more than one version of any single dictionary Schematron file, both will be applied to all labels. This may result in either duplicate or conflicting errors flagged by the Schematron files.
  • Using this option more than once is not an error. The program behaves as if the various arguments to the options were concatenated into a single, comma-separated list in the order specified on the command line.
  • Using this option without the -schema option is not an error and doesn't generate a warning - even though every PDS4 dictionary that has a Schematron file also has a schema file that must be used for validation.
  • Listing a Schematron file in this option does not automatically include the corresponding schema file.
  • There is no attempt to match the Schematron file list in this option with any schema files listed via the -schema option.
  • Attempting to use this option with the -model-version option does not produce an error, but - unlike the case with the -schema option - this option will override the -model-version Schematron.

Additional Validation

Option Argument Notes

Configuration File

Option Argument Notes

Common Error Messages

null

If the only word in your output listing or report file is the string "null", you've got an error in your command line somewhere. Check spelling (no truncation of long-form options is allowed) and case of options and their arguments to find the culprit.

ERROR Uncaught exception while validating: Input file is not a directory: ...

This is usually caused by a typo in your command (or configuration file). Look at the name of the "file" for a clue. This can be caused by a mistyped long option as well as mistyped file or directory names.

ERROR line [number, character]: src-resolve: Cannot resolve the name '[name]' to a(n) 'element declaration' component

There are some variations on this theme, but the "Cannot resolve" message means that a schema (XSD) file you attempted to reference directly via command line or configuration file options, or perhaps indirectly through a label schemaLocation when using the -force option, could not be found and read. Check for typos in the options, or in the case of -force, the label file that generated the error.