Difference between revisions of "Installing and Configuring Validate Tool"

From The SBN Wiki
Revision as of 12:48, 19 May 2017

Introduction

The Validate Tool is the canonical validator for PDS4 products. It can be used to validate single product labels, collections, and bundles. Capabilities are still being developed and added, and will continue to be for the foreseeable future. Being able to run Validate Tool locally can reduce the turn-around time for submitting data to PDS for review and archiving by allowing data preparers to spot-check their labels throughout development, and to run the same canonical tool used by PDS to vet their submissions for standards compliance.

Validate Tool Functions

The primary functions provided by Validate Tool as of this writing are schematic validation (against both XSD and Schematron files), and referential integrity checking of collections and bundles (i.e., ensuring all member products are present and complete, and no non-member products are present). Capabilities still in development include the ability to validate data objects (like images and tables) against their label definitions.

Validate Tool can also, on request, perform a checksum verification as part of the validation process.

Goal

The goal of this page is to start with the Validate Tool installation package and end up with the tool installed for general use on the target system. "General use" in this case means you can invoke the tool from any directory where you happen to be working, with a command line that, in its simplest form, looks something like this:

     % validate product.xml -r report.txt

Note that there is an installation document with a more terse installation process description included in the documentation provided with the code. Those experienced with installing Java-based software on their system may well prefer to reference that. This page is intended for those who are unfamiliar with that process and would like a bit of additional detail and guidance.

Part List

To run the Validate Tool locally, you'll need:

  • Java 6 (1.6) or later. Type java -version at your command line to see what version of Java, if any, you have available. If you don't have Java installed, or want to work with a later version, you'll usually need administrator privileges on your computer to download and install a newer version from the Oracle web site https://java.com/download. Java 7 (1.7) and later includes a handy feature that will help with configuration later on, so if you're still running a (relatively) ancient version, you now have one more reason to upgrade.
  • A text editor that can handle simple text files for batch processing without filling them up with stupid control characters. On linux-based systems, things like vi, pico, or gedit will work; from the Windows DOS command line, you can use the edit command on older systems (pre-Windows 7), or Notepad (which can be invoked from the Windows command line as notepad) on newer ones.
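As a minimal sketch of the Java check described above (Bourne-style shell; the variable name java_report is just for illustration), the following reports either the installed version or the fact that no java executable was found:

```shell
# Report whether a java executable is on the PATH and, if so, its version.
# Note: java -version writes to stderr, so stderr is redirected to capture it.
if command -v java >/dev/null 2>&1; then
    java_report=$(java -version 2>&1) || java_report="java was found but could not be run"
else
    java_report="java not found on PATH"
fi
echo "$java_report"
```

If the reported version is older than 1.6, or nothing is found, install Java before going further.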

General Procedure

Here's the general procedure for setting up the tool:

  1. Unpack the Validate Tool package.
  2. Move the directories you need to run the tool to a permanent location.
  3. Edit the wrapper script for the local environment.
  4. Install the wrapper script.
  5. Test the installation.
  6. Rejoice in the knowledge of a job well done.

Procedure

Unpack the Validate Tool Package

Use a standard ZIP tool (unzip on linux-based systems; the Extract All option in Windows Explorer) to extract the files from the ZIP package. For the tar file, use the z option to uncompress while you extract on a linux system. You should end up with a directory with a name that starts with validate- and ends with the internal version number of the tool. As of this writing, the latest version of the tool is 1.11.0, so the delivery package unpacks into a directory called validate-1.11.0. You can unpack it anywhere - we'll move the directory tree we need to a new home once we've picked one out. If you haven't inspected previous Validate Tool delivery packages, you should probably take a few minutes to familiarize yourself with the contents.
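On a linux-based system the unpacking step might look like the sketch below. The function name unpack_validate is hypothetical, and the actual file name of the package you downloaded may differ from anything shown here; the function simply picks unzip or tar based on the file extension.

```shell
# Unpack a Validate Tool package into the current directory.
# Usage: unpack_validate <package-file>
# (unpack_validate is an illustrative helper, not part of the delivery package.)
unpack_validate() {
    case "$1" in
        *.zip)          unzip -q "$1" ;;      # ZIP package
        *.tar.gz|*.tgz) tar -xzf "$1" ;;      # tar file; z uncompresses while extracting
        *) echo "Unrecognized package type: $1" 1>&2
           return 1 ;;
    esac
}
```

After running it on your package file, an ls should show the new validate-[version] directory.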

Here's what you'll find in the unpacked directory:

Executables

The executable elements of the package include:

bin/
You'll only need one of the two files from this directory, depending on your system type. We'll be modifying the appropriate script to work on your local system.
lib/
This directory contains the Java archive files comprising the Validate Tool code.

Documentation

The doc/ subdirectory contains an HTML directory tree. Point your browser to the index.html file to see it in its intended format.

Peanuts

Like packing peanuts, these files are included in the ZIP but are not directly involved in program operation:

  • LICENSE.txt : Standard boilerplate license (JPL employees produced this code, and JPL is part of the California Institute of Technology)
  • README.txt : This file just directs you to the doc/index.html file.


Install the Executable and Support Directories

Unless you're seriously hardcore about running Java applications, you will be running Validate by invoking a wrapper script (or batch file). This script sets up some environment variables and then invokes Java with the appropriate options and arguments for executing the validate ".jar" file with options and arguments passed on by the wrapper script.

Note: The Java code not only expects to find the environment variables set by the wrapper script, it also expects to be running from a bin/ directory that is adjacent to the lib/ directory containing the jar files - so that aspect of the directory tree must be preserved for the code to execute successfully. There are various ways of accomplishing that in different operating environments if you know what you're doing. This page describes one simple way to achieve that for non-experts.

Choosing an Installation Location

On linux-based multi-user systems, you can install Validate for general use by all users either by installing into one of the standard system locations (/usr/share, for example), or in shared disk space. If the latter, users wanting to execute Validate will likely have to add the appropriate directory location to the $PATH setting. Alternately, you as a single, non-super user can install it into your own ~/bin/ directory. Note that if you haven't created or used a personal ~/bin/ directory before, you may have to add it to your $PATH to use it.

In any event, on a linux-based system you will ultimately have to choose one of these options:

  1. Add the validate-[version]/bin directory to your $PATH, which requires editing your shell resource file; or
  2. Create a link to validate-[version]/bin/validate (i.e., to the script rather than just the directory) in a directory already in your $PATH, which requires an additional edit to the validate wrapper script; or
  3. Type the full path to the validate script every time you want to run it.
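For illustration, the first option might look like this in a Bourne-style shell. The install location in VALIDATE_HOME is an assumption, not a requirement; to make the change permanent, the same lines would go into your shell resource file (for example ~/.profile, depending on your shell).

```shell
# Hypothetical install location; substitute the directory you actually chose.
VALIDATE_HOME="$HOME/validate-1.11.0"

# Option 1: append the Validate bin/ directory to the search path.
PATH="$PATH:$VALIDATE_HOME/bin"
export PATH

# Option 2 (shown only as a comment): link the script into a directory
# that is already in $PATH, for example:
#   ln -s "$VALIDATE_HOME/bin/validate" "$HOME/bin/validate"
```

Appending rather than prepending means an existing command with the same name elsewhere in $PATH would still take precedence.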

On Windows systems, you can install the Validate directory tree into the "Program Files\" directory for general use (this typically requires admin privileges), or in your own directory space for personal use. In either case you will have to modify the %PATH% environment variable setting information to make the validate.bat executable visible to all users or to yourself, respectively. Or you can run the batch file by typing the full path reference each time.

What to Copy/Move

Create a directory in your chosen installation location to hold the Validate Tool tree. You can name this validate, or include a version number, or rename it anything convenient. The name of the directory itself is not significant to the code.

Under this directory, copy over the entire contents of the lib/ directory, and either copy or create a bin/ directory to contain the edited wrapper script - validate for linux-based systems, or validate.bat for Windows systems. For linux systems, you should also make sure the validate script is executable.

At this point you may also want to copy over the contents of the doc/ directory, for easy reference; it is not needed to run the code. I also copy the README and LICENSE files from the root of the install package, just in case I want to find them again later.
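The copy steps above can be sketched for a linux-based system as follows. The function name install_validate and both of its arguments are illustrative; the sketch copies only lib/ and the linux wrapper script, and leaves the optional doc/, README and LICENSE files to you.

```shell
# Copy the parts of an unpacked package needed to run Validate.
# Usage: install_validate <unpacked-package-dir> <install-dir>
# (install_validate is an illustrative helper, not part of the delivery package.)
install_validate() {
    src=$1
    dest=$2
    mkdir -p "$dest/bin" || return 1                  # create the tree, including bin/
    cp -R "$src/lib" "$dest/" || return 1             # entire lib/ directory
    cp "$src/bin/validate" "$dest/bin/" || return 1   # linux wrapper script only
    chmod +x "$dest/bin/validate"                     # make sure it is executable
}
```

This keeps bin/ adjacent to lib/, which is the layout the wrapper script expects.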

Edit the Wrapper Script/Batch File

The validate script (linux) or validate.bat file (Windows) is used to run the tool. This file will need to be edited to conform to the installation environment. Any simple text editor (as described above) can do the job.

Windows Batch File validate.bat

You'll likely want or need to make a couple of changes to this file. Lines beginning with "::" are comments - feel free to add more.

The first executable line in the file is:

@echo off

which stops the shell from printing every executable line to your command window when you run the batch file. Comment this line out if you're trying to trouble-shoot the batch file.

Immediately after this @echo off line, you should probably add this line:

SETLOCAL

This makes sure that any variables that are set by this batch file do not permanently overwrite any environment variables with the same name that might have already existed for other reasons.

Following the next set of comments you'll see the (uncommented) lines that check whether the %JAVA_HOME% environment variable is already set, and kill the batch file if it isn't. See the "Finding and Setting JAVA_HOME" page on this wiki for detailed steps to check the variable, and to find the right value to use if it isn't set.

If %JAVA_HOME% is not currently set, you have two options:

  1. Permanently add the definition to your environment variables. (See, for example, How to set the path and environment variables in Windows, by ComputerHope.com). In this case you don't need to make any changes to the %JAVA_HOME% test in the batch file.
  2. Replace these lines in the batch file:

    if not defined JAVA_HOME (
    echo The JAVA_HOME environment variable is not set.
    goto END
    )

with something like this line:

    set JAVA_HOME="C:\Program Files\Java"

where you replace C:\Program Files\Java with the actual location of your Java home directory. Note that there should be no spaces around the "=", and that the double quotes are not necessary in the set JAVA_HOME line if your path does not contain embedded blanks.

Finally, the last executable line in the batch file before the :END statement looks like this:

"%JAVA_HOME%"\bin\java -Xms256m -Xmx1024m -jar "%VALIDATE_JAR%" %*

Remove the quotes from around %JAVA_HOME%. If the quotes were needed to set the value, then they are already part of the string and the additional quotes will cause a syntax error.

N.B.: Paths with embedded blanks that are missing quotes, and paths with extra sets of quotes, can both cause failures, frequently with messages about unexpected information or invalid paths. If you see that sort of message when you test the batch file, comment out the @echo off line so you can see exactly where the script is failing. You may have to add or remove quotes on that line or an earlier line to adjust for the actual paths in your environment.

Linux validate script

If your $JAVA_HOME environment variable is not already set, the script will exit without invoking Validate. See the "Finding and Setting JAVA_HOME" page on this wiki for gory details on determining the right value to set and how to set it in your environment. Note that the validate script is written to be run in the Bourne shell, so use Bourne shell syntax to set $JAVA_HOME in the script regardless of what your login shell is. So if you prefer to set $JAVA_HOME in the script, replace these lines:

    if [ -z "${JAVA_HOME}" ]; then
        echo "The JAVA_HOME environment variable is not set." 1>&2
        exit 1
    fi

with something like this line:

    JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-2.b11.el7_3.x86_64/jre

where /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-2.b11.el7_3.x86_64/jre should be replaced with your actual Java home directory. Alternately, if you have a handy little java_home script installed (see Finding and Setting JAVA_HOME), this will also work:

    JAVA_HOME=`java_home`
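A possible variation, offered only as a sketch and not as something the delivered script does: keep whatever $JAVA_HOME is already set in the environment, and fall back to a hard-coded location only when it is empty. The name DEFAULT_JAVA_HOME is hypothetical, and the path is the same example path used above.

```shell
# Hypothetical fallback location; replace with your actual Java home directory.
DEFAULT_JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-2.b11.el7_3.x86_64/jre

# Use the existing JAVA_HOME if it is set and non-empty, else the fallback.
JAVA_HOME=${JAVA_HOME:-$DEFAULT_JAVA_HOME}
export JAVA_HOME
```

This uses standard Bourne shell parameter expansion, so it works regardless of your login shell.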

If you're planning to link the validate script into a bin/ directory (as opposed to adding a new element to your $PATH to access this one executable), you'll need to edit a couple more lines in the validate wrapper script. The script crawls the local directory tree to find the related lib/ directory using system functions, but that doesn't quite work if the script was invoked via a link. So in this case, replace these lines:

    SCRIPT_DIR=`dirname $0`
    PARENT_DIR=`cd ${SCRIPT_DIR}/.. && pwd`

with something like these lines:

    PARENT_DIR=/usr/share/pds4tools/validate/validate-1.11.0
    SCRIPT_DIR=${PARENT_DIR}/bin

where you should replace /usr/share/pds4tools/validate/validate-1.11.0 with the absolute path to your installed validate tree (the directory containing the bin/ and lib/ subdirectories). Also note the inverted order of the definitions.
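To see why the original two lines fail when the script is reached through a link, here is a small self-contained demonstration. It builds a throwaway tree under a temporary directory (all names in it are made up for the demo) and runs a stand-in script, containing the same two lines, both directly and via a symbolic link.

```shell
# Build a throwaway tree: validate-x/bin, validate-x/lib, and a separate linkbin/.
demo=$(mktemp -d)
mkdir -p "$demo/validate-x/bin" "$demo/validate-x/lib" "$demo/linkbin"

# Stand-in wrapper containing the same two lines as the real script.
cat > "$demo/validate-x/bin/validate" <<'EOF'
#!/bin/sh
SCRIPT_DIR=`dirname $0`
PARENT_DIR=`cd ${SCRIPT_DIR}/.. && pwd`
echo "${PARENT_DIR}"
EOF
chmod +x "$demo/validate-x/bin/validate"
ln -s "$demo/validate-x/bin/validate" "$demo/linkbin/validate"

direct=$("$demo/validate-x/bin/validate")   # parent of the real bin/ directory
linked=$("$demo/linkbin/validate")          # parent of the directory holding the link
echo "direct: $direct"
echo "linked: $linked"
```

Run directly, the script finds the directory that contains lib/; run through the link, $0 is the path of the link, so the computed parent directory has no lib/ beside it. Hard-coding PARENT_DIR, as described above, sidesteps the problem.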

Finally, make sure the validate script is executable.
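A minimal sketch of that last step, assuming a Bourne-style shell (make_executable is an illustrative helper name, not part of the package):

```shell
# Make a wrapper script executable and confirm the result.
# Usage: make_executable /path/to/validate-[version]/bin/validate
make_executable() {
    chmod +x "$1" || return 1
    [ -x "$1" ] && echo "$1 is executable"
}
```

Afterwards, typing command -v validate shows whether your shell can find the script through $PATH.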