Filling Out the Document Class

From The SBN Wiki
Revision as of 17:27, 22 January 2013 by Raugh (talk | contribs) (Added links)

The <Document> class contains the high-level description of the logical content (as opposed to the physical file formatting) of an archival document. This is the class where you'll find the publication information, like title, author, and abstract. It is required in all Document products.

There is some apparent overlap between the information in this class and information that might be included as part of the <Citation_Information> class in the Identification_Area, but there is a subtle distinction. The Citation_Information should be used to cite the document product - which may contain several physical formats of the same document, and has its own revision history separate from the logical content of the document. The attributes in the <Document> class provide similar information for the document itself - that is, for the source material used to create the PDS product. Often the two will overlap, but there may be cases where, for example, PDS personnel have done significant work editing or restoring a document from the primary source (scanning, editing OCR files, re-keying from paper, etc.), so that the document product will have an editor credit, but the document itself would have only an author credit.

For additional explanation, see the PDS4 Standards Reference, or contact your PDS node consultant.

Following are the attributes and subclasses you'll find in the Document class, in label order.

Note that in the PDS4 master schema, all classes have capitalized names; attributes never do.

<revision_id>

OPTIONAL

This is for the revision number of the document itself, as opposed to the version_id in the Identification_Area, which is the version of the PDS4 product comprising the document.

<document_name>

OPTIONAL

Use this attribute only in cases where the actual title of the document is too long to fit in the title attribute of the Identification_Area, which is limited to 255 bytes.
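
The length test described above can be sketched in a few lines of Python. This is only an illustration: the helper name needs_document_name and the constant TITLE_LIMIT_BYTES are hypothetical, and the sketch assumes the title is stored as UTF-8.

```python
TITLE_LIMIT_BYTES = 255  # limit on the Identification_Area title, as stated above


def needs_document_name(full_title: str, limit: int = TITLE_LIMIT_BYTES) -> bool:
    """Return True when the full title will not fit in the title attribute."""
    # Count encoded bytes rather than characters (assumption: UTF-8 encoding),
    # because a non-ASCII character occupies more than one byte.
    return len(full_title.encode("utf-8")) > limit
```

If the function returns True, a shortened title would go in the Identification_Area and the full title in <document_name>.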

<doi>

OPTIONAL

If the document you're labelling has previously been published, it may have been assigned a Digital Object Identifier (DOI) by that publisher. DOI names are not case sensitive, but it is safest to copy the DOI exactly as the publisher gives it; do not include any "doi:" or "DOI:" prefix.
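
As a small illustration of the prefix rule, the following Python sketch strips a leading "doi:" or "DOI:" from a pasted value and otherwise leaves the identifier untouched. The function name clean_doi is hypothetical, and the DOI in the usage note is a made-up placeholder, not a real identifier.

```python
def clean_doi(raw: str) -> str:
    """Return the DOI without surrounding whitespace or a leading 'doi:' prefix."""
    value = raw.strip()
    # Match the prefix regardless of case, but do not alter the DOI itself.
    if value.lower().startswith("doi:"):
        value = value[len("doi:"):].strip()
    return value
```

For example, clean_doi("DOI: 10.1000/xyz123") would yield "10.1000/xyz123", which is the form to place in the <doi> attribute.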

<author_list>

OPTIONAL

The list of authors, if any, for this document, with the principal author listed first. Names should be in "Surname, Initials" format, with a semi-colon separating names. (That is, the same format as described for the same field in the <Citation_Information> class.)

<editor_list>

OPTIONAL

The list of editors, if any, for this document, with the principal editor listed first. Names should be in "Surname, Initials" format, with a semi-colon separating names. (That is, the same format as described for the same field in the <Citation_Information> class.)
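
The name-list format shared by <author_list> and <editor_list> can be sketched in Python as follows. The helper name format_name_list is hypothetical, the names in the usage note are placeholders, and the single space after each semicolon is an assumption made for readability rather than something the text above requires.

```python
from typing import Iterable, Tuple


def format_name_list(people: Iterable[Tuple[str, str]]) -> str:
    """Join (surname, initials) pairs as 'Surname, Initials; Surname, Initials'.

    The caller supplies the pairs with the principal author or editor first;
    the order is preserved.
    """
    return "; ".join(
        f"{surname.strip()}, {initials.strip()}" for surname, initials in people
    )
```

For example, format_name_list([("Doe", "J.Q."), ("Roe", "R.")]) gives "Doe, J.Q.; Roe, R.".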

<acknowledgement_text>

OPTIONAL

When republishing a copyrighted work, it is polite (and sometimes legally required) to acknowledge the original source. Here's the place to do that, formally or informally.

Even for public domain documents, this area can be used to acknowledge sources, for example: "Reprinted from [URL] with the kind permission of [web master]."

<copyright>

OPTIONAL

If you are reprinting a copyrighted document and the copyright holder wants a formal copyright notification to be included, this is the place to put it. This is also the place to specifically state that a document is in the public domain, if that seems appropriate; or to attach a formal copyright notice to a copyrighted document in the very rare case where one is created for PDS archiving.

The vast majority of documents in the PDS archive are in the public domain as they are government-funded works-for-hire.

<description>

OPTIONAL

This is where you provide a brief abstract for the document. The level of formality should follow the level of formality of the document itself. If you're republishing an article that has a formal abstract, you can simply paste that in here.

Note: While technically optional, SBN will require that all Document products have a description, so that there is something to display to users searching for documents.

<publication_date>

REQUIRED

This is the publication date of the document. If the document has been previously published, outside the PDS, this should be the date of first publication. For documents originating in the archive, this is generally the date when the document is considered to be public - that could be the date of the review, the date of posting on a web site, or the date the product was created, depending on the node or data preparer conventions.
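
To show how the attributes above fit together, here is a minimal Python sketch that assembles a <Document> element using the standard xml.etree.ElementTree module. It rests on several assumptions: the element order simply follows the order used on this page and should be checked against the schema version in use, XML namespaces and any subclasses are omitted, the YYYY-MM-DD date format is assumed, and all values are placeholders. The names LABEL_ORDER and build_document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute order as listed on this page (assumption: confirm against the schema).
LABEL_ORDER = [
    "revision_id", "document_name", "doi", "author_list", "editor_list",
    "acknowledgement_text", "copyright", "description", "publication_date",
]


def build_document(values: dict) -> ET.Element:
    """Build a bare <Document> element from a dict of attribute values."""
    # publication_date is the only attribute marked REQUIRED above.
    if "publication_date" not in values:
        raise ValueError("publication_date is required")
    unknown = set(values) - set(LABEL_ORDER)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    doc = ET.Element("Document")
    # Emit only the attributes supplied, in the listed order.
    for name in LABEL_ORDER:
        if name in values:
            ET.SubElement(doc, name).text = values[name]
    return doc


example = build_document({
    "publication_date": "2013-01-22",            # placeholder date
    "author_list": "Doe, J.Q.; Roe, R.",         # placeholder names
    "description": "Brief abstract of the document.",
})
xml_text = ET.tostring(example, encoding="unicode")
```

A fragment built this way is not a complete label; it would still need to be placed in a full product label and validated against the PDS4 schema.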