Using Validate Tool

From The SBN Wiki
Revision as of 16:09, 28 May 2017 by Raugh (talk | contribs) (Safety Save)

The Validate tool is the canonical PDS4 validator for single labels, collections, and bundles. It provides additional functionality beyond schematic development, and advanced features are still in active development.

If you haven't already installed and configured Validate, you'll find instructions on the Installing and Configuring Validate Tool page on this wiki, as well as in the download bundle.

Format

The general command format is:

% validate [files] [options]

"Files" can be a single label file name, a directory, or a comma-separated list of both. This list may include absolute and relative paths, wildcards, and file globs. Alternatively, the product labels to be validated can be specified as arguments to the -t option.

By default, if validate is given a directory as an argument it will attempt to validate all files ending in either ".xml" or ".XML" with the selected options, and will recurse down through all subdirectories doing the same.
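To preview which files a default directory run would pick up, the selection rule described above can be approximated outside of validate. This is only a sketch of the rule as stated on this page (recursive, names ending in ".xml" or ".XML"); the function name default_targets is hypothetical and is not part of the Validate Tool.

```python
import os

def default_targets(directory):
    """List files under `directory` whose names end in .xml or .XML.

    Approximates the default selection described above; it does not call
    validate and may differ from the tool's own behavior in edge cases.
    """
    selected = []
    # os.walk descends through all subdirectories, mirroring the recursion.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            # Only the two literal extensions are matched; ".Xml" is skipped.
            if name.endswith((".xml", ".XML")):
                selected.append(os.path.join(root, name))
    return sorted(selected)
```

Calling default_targets(".") returns a sorted list of paths, which is a quick way to spot non-label XML files that would be swept into the run.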

Command Line Options

These options have been grouped by functional area. All options have both a short and a long form. Either one or two hyphens can precede short and long options alike; long options cannot be truncated. Most options require arguments, but even those that do not cannot be globbed together into a single switch. That is, specifying "-Vh" will display only version information, not help information.

Program Information

These options provide information about the program itself.

Option Argument Notes
-V, -version --none-- This option displays the internal code version, the release date, and the core schemas applied by default by this version of validate, in addition to some licensing boilerplate.
-h, -help --none-- This option displays a command and option summary. It's about 50 lines long, so prepare to scroll.

Selecting What to Validate

These options, along with any files and directories included in the argument list, select and refine the files and directories validated. Whether included as arguments to validate or arguments to the -t option (below), the Validate Tool documentation refers to these collectively as "targets".

Note: No merging or duplicate removal is done on the list of files and directories supplied. If the list, when expanded, includes the same file or directory more than once, that element will be validated in its entirety each time it is listed. This is a waste of time - potentially a significant one for large file collections.
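
Since validate does no duplicate removal itself, one option is to clean the target list before building the command. The following is a minimal sketch; the helper name unique_targets is hypothetical. It only catches targets that resolve to the same path. It does not expand wildcards, and it does not notice a file that also lies inside a listed directory.

```python
import os

def unique_targets(targets):
    """Drop repeated targets, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for target in targets:
        # Compare normalized absolute paths so that "data", "./data" and
        # "data/" are treated as the same target.
        key = os.path.abspath(target)
        if key not in seen:
            seen.add(key)
            unique.append(target)
    return unique

print(unique_targets(["data", "./data", "data/", "a.xml", "a.xml"]))
# -> ['data', 'a.xml']
```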
Option Argument Notes
-t, -target file/directory list Use this option as an alternate way to specify which files and/or directories (i.e., "targets") to validate. In the absence of the -t option, the comma-separated list of targets must immediately follow the validate command. By using the option, the target list can appear at any point among the other options. The syntax and globbing options are identical whether the -t switch is used or not. It is possible to provide two target lists: one with the -t option, and one without. In this case the two lists are concatenated, and any options that modify file selection apply to the merged list.
-L, -local --none-- Note the uppercase "L" in the short version of this option. When present, this option prevents validate from recursing down into subdirectories of any directories included in the argument list or -t target list.
-e, -regexp pattern[, pattern] This option appears to be misnamed. It does not accept regular expressions in general, but rather provides an additional file-globbing filter beyond what might be in the argument or target list. The pattern is applied only to the file name, not the path, and behaves as if anchored to the beginning and end of the file name. So -e "*.XML" will select only files with the explicit uppercase "XML" extension.

Beware of selecting files by name without extension. The option -e "table*" will select all files beginning with "table" regardless of file extension - most likely producing a validation error when validate attempts to validate data files as well as labels.
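
The matching behavior described for -e (a glob applied to the file name only, anchored at both ends, case-sensitive) can be tried out with Python's fnmatch module before committing to a long run. This is a stand-in for illustration, not validate's own matcher, so details such as character classes may differ; the file names are made up.

```python
import os
from fnmatch import fnmatchcase

def matches_pattern(path, pattern):
    """Apply a glob pattern to the file name only, case-sensitively."""
    # The directory part is stripped first; fnmatchcase then requires the
    # pattern to match the whole name.
    return fnmatchcase(os.path.basename(path), pattern)

names = ["table01.xml", "table01.dat", "data/index.XML", "data/index.xml"]

print([n for n in names if matches_pattern(n, "*.XML")])
# -> ['data/index.XML']
print([n for n in names if matches_pattern(n, "table*")])
# -> ['table01.xml', 'table01.dat']
```

The second result shows the problem noted above: "table*" also selects the data file, whereas a pattern such as "table*.xml" would not.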

Output Control

These options affect the format and destination of the output report. Three "INFO" messages produced at the start of every validate run go to the standard error output, but everything else is directed to standard output unless the -r option (below) is present. You can also redirect the output to a file or a process, if desired.

Option Argument Notes
-r, -report-file file name This option directs the output to the named file.
-v, -verbose 1 | 2 | 3 This option affects the "Validation Details" listing in the report by changing the severity level of the messages generated. A value of 1 causes INFO level and above messages to be included in the details; a value of 2 (the default) includes WARNING level and above; a value of 3 includes only ERROR level messages. Note that the PASS and FAIL messages are always generated, irrespective of the verbosity setting.
-s, -report-style json | xml The default report format is human-readable. This option allows you to specify an alternate form for the output that is more congenial to programmatic analysis. Two alternatives are currently available: json, which produces JSON output; and xml, which produces output with XML tags (there is no schema published for these tags, but the tag names are derived from the titles present in the default report format). Note that these values must be in lowercase. Using "JSON" rather than "json", for example, will not produce the expected error message for an unrecognized value, but rather just the literal word "null".
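
When validate is driven from a script, the output options above can be checked before the tool is launched, which avoids the silent "null" result for a mis-cased report style. The sketch below only assembles an argument list; the function name build_validate_command is hypothetical, and it assumes the comma-separated target list is passed as a single argument with no spaces.

```python
def build_validate_command(targets, report_file=None, verbosity=None, style=None):
    """Assemble a validate argument list using the output options above."""
    if not targets:
        raise ValueError("at least one target is required")
    # Assumption: targets are joined into one comma-separated argument.
    command = ["validate", ",".join(targets)]
    if report_file is not None:
        command += ["-r", report_file]
    if verbosity is not None:
        if verbosity not in (1, 2, 3):
            raise ValueError("verbosity must be 1, 2 or 3")
        command += ["-v", str(verbosity)]
    if style is not None:
        # Reject "JSON", "Xml", etc. here, since validate itself would not
        # report them as unrecognized values.
        if style not in ("json", "xml"):
            raise ValueError("report style must be lowercase 'json' or 'xml'")
        command += ["-s", style]
    return command

print(build_validate_command(["bundle_dir"], report_file="report.json",
                             verbosity=3, style="json"))
# -> ['validate', 'bundle_dir', '-r', 'report.json', '-v', '3', '-s', 'json']
```

The resulting list can be handed to subprocess.run on a system where validate is installed and on the path.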

Specifying Schemas

The Validate Tool comes with the released schema and Schematron files for the core ("pds") namespace built-in. By default, it will use the latest version of these for validation. If you're not sure what that version is, use the -version option to check. Older releases of the core namespace can be indicated via the -model-version option (below).

All other dictionary schema/Schematron files, including discipline dictionaries and any local dictionaries you have created, must be read in by validate - so you will have to have access to the relevant schema/Schematron files and will have to direct validate where to find them. A number of options are provided for that.

Note: PDS4 dictionaries consist of two files: a schema (.xsd) file and a Schematron (.sch) file. Both are needed to fully define any PDS4 namespace (core, discipline, or local). If you are validating files that reference non-core namespaces and do not provide the schema file, you will get an "ERROR" message in the "Validation Details" list containing a phrase like "The matching wildcard is strict, but no declaration can be found for element...", followed by an element from the missing namespace. If you omit the Schematron file, however, no notice of any kind will be generated - validate cannot tell that a Schematron file is missing due to the nature of Schematron validation (i.e., further constraints on top of the definitions contained in the schema file). Triple-check your Schematron file lists to avoid missing validation errors.
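
Because a missing Schematron file produces no message at all, it can help to check the dictionary files yourself before running validate. The sketch below assumes that the two files of each dictionary sit in one directory and share a base name, differing only in extension; that pairing convention is an assumption made for this example, and the function name unpaired_dictionary_files is hypothetical.

```python
from pathlib import Path

def unpaired_dictionary_files(directory):
    """Report dictionary files in `directory` that lack a partner.

    Returns two sorted lists of base names: schemas (.xsd) with no
    Schematron (.sch) file, and Schematron files with no schema.
    Assumes partners share a base name (an assumption of this sketch).
    """
    folder = Path(directory)
    schemas = {p.stem for p in folder.glob("*.xsd")}
    schematrons = {p.stem for p in folder.glob("*.sch")}
    return sorted(schemas - schematrons), sorted(schematrons - schemas)
```

An empty pair of lists means every schema found has a matching Schematron file; it does not confirm that all the namespaces your labels reference are covered.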
Option Argument Notes

Additional Validation

Option Argument Notes

Configuration File

Option Argument Notes

Common Errors

ERROR Uncaught exception while validating: Input file is not a directory: ...

This is usually caused by a typo in your command (or configuration file). Look at the name of the "file" for a clue. This can be caused by a mistyped long option as well as mistyped file or directory names.