Metadata and Predefined text

The front material of documents generated by Metanorma routinely involves templated text: both the front page itself and “predefined text” about legal and other obligations surrounding the document. Those text templates are in turn routinely populated with metadata extracted from the document.

Metadata

The bibdata element in a Metanorma document contains various metadata elements about the document, as a bibliographic description.

These elements are populated either from the document attributes in the Metanorma AsciiDoc input, or with default values.

Specifically, the bibdata element is populated through the Asciidoctor::Standoc::Converter.metadata method, and its inheritors.

The bibdata element is not rendered directly as the document front page. Instead, the document front page, and other templated texts, are populated with elements extracted from the bibdata element. That extraction takes place using the IsoDoc.info method and its inheritors, which invoke the IsoDoc::Metadata class and its inheritors. The extraction results in a Hash of metadata keys and values, which is used to populate any templated text.
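
As a rough illustration of that extraction step, the sketch below reads a few values out of a bibdata fragment into a Hash, using Ruby's REXML library. It is not the IsoDoc::Metadata implementation: the extract_metadata helper is hypothetical, XML namespaces are left out, and only four keys are covered.

```ruby
require "rexml/document"

# Hypothetical helper, not part of isodoc: reads a handful of bibdata
# values into a Hash keyed the same way as the metadata Hash.
def extract_metadata(xml)
  doc = REXML::Document.new(xml)
  # Text of the first node matching the XPath, or nil if there is none.
  text = ->(path) { REXML::XPath.first(doc, path)&.text }
  {
    docnumber: text.call("//bibdata/docidentifier"), # first identifier only
    docnumeric: text.call("//bibdata/docnumber"),
    edition: text.call("//bibdata/edition"),
    docyear: text.call("//bibdata/copyright/from"),
  }
end

BIBDATA = <<~XML
  <bibdata type="standard">
    <docidentifier type="iso">ISO/NWIP 33032</docidentifier>
    <docidentifier type="iso-with-lang">ISO/NWIP 33032 (E)</docidentifier>
    <docnumber>33032</docnumber>
    <edition>1</edition>
    <copyright><from>2020</from></copyright>
  </bibdata>
XML

puts extract_metadata(BIBDATA).inspect
```

Values that are absent from the XML come back as nil in this sketch, which matches how several keys in a metadata Hash can be nil.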

For example, in the Metanorma ISO flavour, the document header

= This title is overridden by :title-main-en:
:docnumber: 33032
:edition: 1
:technical-committee: TC
:technical-committee-number: 399
:technical-committee-type: TC
:docstage: 10
:docsubstage: 20
:title-intro-en: Cybernetics
:title-main-en: Neuro-information interchange interface

generates the following bibdata element:

<bibdata type="standard">
  <title language="en" format="text/plain" type="main">Cybernetics — Neuro-information interchange interface</title>
  <title language="en" format="text/plain" type="title-intro">Cybernetics</title>
  <title language="en" format="text/plain" type="title-main">Neuro-information interchange interface</title>
  <docidentifier type="iso">ISO/NWIP 33032</docidentifier>
  <docidentifier type="iso-with-lang">ISO/NWIP 33032 (E)</docidentifier>
  <docnumber>33032</docnumber>
  <contributor>
    <role type="author"/>
    <organization>
      <name>International Organization for Standardization</name>
      <abbreviation>ISO</abbreviation>
    </organization>
  </contributor>
  <contributor>
    <role type="publisher"/>
    <organization>
      <name>International Organization for Standardization</name>
      <abbreviation>ISO</abbreviation>
    </organization>
  </contributor>
  <edition>1</edition>
  <language>en</language>
  <script>Latn</script>
  <status>
    <stage>10</stage>
    <substage>20</substage>
  </status>
  <copyright>
    <from>2020</from>
    <owner>
      <organization>
        <name>International Organization for Standardization</name>
        <abbreviation>ISO</abbreviation>
      </organization>
    </owner>
  </copyright>
  <ext>
    <doctype>article</doctype>
    <editorialgroup>
      <technical-committee number="399" type="TC">TC</technical-committee>
      <subcommittee/>
      <workgroup/>
    </editorialgroup>
    <structuredidentifier>
      <project-number>ISO 33032</project-number>
    </structuredidentifier>
  </ext>
</bibdata>

In turn, that generates the following metadata Hash:

{
  :agency => "ISO",
  :authors => [],
  :authors_affiliations => {},
  :docnumber => "ISO/NWIP 33032",
  :docnumeric => "33032",
  :docsubtitle => "",
  :docsubtitlemain => "",
  :docsubtitlepartlabel => "Partie&nbsp;",
  :doctitle => "Cybernetics&#x2009;&#x2014;&#x2009;Neuro-information interchange interface",
  :doctitlemain => "Neuro-information interchange interface",
  :doctitlepartlabel => "Part&nbsp;",
  :doctype => "Article",
  :docyear => "2020",
  :draft => nil,
  :draftinfo => "",
  :edition => "1",
  :editorialgroup => ["TC 399"],
  :ics => "XXX",
  :obsoletes => nil,
  :obsoletes_part => nil,
  :revdate => nil,
  :sc => "XXXX",
  :secretariat => "XXXX",
  :stage => "10",
  :stage_int => 10,
  :statusabbr => "NWIP",
  :tc => "TC 399",
  :tc_docnumber => [],
  :unpublished => true,
  :wg => "XXXX"
}

Some metadata hash values are normalized, especially as the contents of the hash are intended for display; dates, for example, are often resolved from the ISO 8601-1 and ISO 8601-2 formats to formats with the month spelled out.
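
A minimal sketch of that kind of date normalisation follows. It assumes English month names and only the YYYY, YYYY-MM and YYYY-MM-DD forms; the display_date helper and the chosen output layout are hypothetical, since the actual format depends on the flavour and language.

```ruby
require "date"

# Hypothetical helper: spells out the month of an ISO 8601 date string.
# Handles YYYY-MM-DD and YYYY-MM; anything else is returned unchanged.
def display_date(iso)
  case iso
  when /\A(\d{4})-(\d{2})-(\d{2})\z/
    Date.new($1.to_i, $2.to_i, $3.to_i).strftime("%B %-d, %Y")
  when /\A(\d{4})-(\d{2})\z/
    Date.new($1.to_i, $2.to_i, 1).strftime("%B %Y")
  else
    iso
  end
end

puts display_date("2020-03-05") # March 5, 2020
puts display_date("2020-03")    # March 2020
puts display_date("2020")       # 2020
```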

Default metadata values

Each gem can customise its own metadata values.

These are the default metadata values extracted by the base IsoDoc::Metadata class, and the corresponding Metanorma XML locations they are populated from:

authors

An array of personal author names, each name extracted from //bibdata/contributor[role/@type = 'author' or xmlns:role/@type = 'editor']/person, and being either ./name/completename or ./name/forename + " " + ./name/surname.

authors_affiliations

A hash of affiliations that personal authors have, each personal affiliation mapping to the array of personal names of authors working there. The affiliations are extracted from the personal author names (see above) as ./affiliation/organization/name plus ./affiliation/organization/address/formattedAddress, comma-delimited, or else either the name or the address. So for example, { "CSIRO" => ["Fred Nerk", "Joe Bloggs"], "University of Auckland" => ["John Doe"] }.
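
The grouping can be sketched in plain Ruby as follows. The input records are hypothetical stand-ins for values already read from the XML, and skipping authors with no affiliation at all is an assumption of this sketch, not documented behaviour.

```ruby
# Hypothetical input: one entry per personal author. :org and :address stand
# for ./affiliation/organization/name and
# ./affiliation/organization/address/formattedAddress.
AUTHORS = [
  { name: "Fred Nerk",  org: "CSIRO", address: nil },
  { name: "Joe Bloggs", org: "CSIRO", address: nil },
  { name: "John Doe",   org: "University of Auckland", address: nil },
  { name: "Jane Roe",   org: "ACME", address: "1 Main St" },
].freeze

# Groups author names under "name, address", or under whichever of the two
# is present. Authors with neither are skipped (an assumption).
def authors_affiliations(authors)
  authors.each_with_object({}) do |a, acc|
    key = [a[:org], a[:address]].compact.reject(&:empty?).join(", ")
    next if key.empty?
    (acc[key] ||= []) << a[:name]
  end
end

puts authors_affiliations(AUTHORS).inspect
```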

{type}date

The date at which the {type} event occurred. The {type} is the name of the lifecycle event modelled by Relaton, including published, accessed, created, implemented, obsoleted, confirmed, updated, issued, received, transmitted, copied, unchanged, and circulated. The date is extracted from //bibdata/date[@type = {type}].

doctype

Flavour-specific document type, from //bibdata/ext/doctype.

doctype_display

Flavour-specific localised document type, from //local_bibdata/ext/doctype [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].

agency

A concatenation of all the agency abbreviations (or, if that is unavailable, agency names) responsible for publishing the document. Extracted from //bibdata/contributor[xmlns:role/@type = 'publisher']/organization, using either ./abbreviation or ./name. E.g. "ISO/IEC".
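
A sketch of that concatenation, using REXML with namespaces omitted. The agency helper is hypothetical, and the "/" separator is inferred from the "ISO/IEC" example rather than taken from the isodoc source.

```ruby
require "rexml/document"

# Hypothetical helper: joins the abbreviation (or, failing that, the name)
# of every publisher organization with "/".
def agency(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//bibdata/contributor").filter_map do |c|
    role = c.elements["role"]
    next unless role && role.attributes["type"] == "publisher"
    org = c.elements["organization"]
    next unless org
    (org.elements["abbreviation"] || org.elements["name"])&.text
  end.join("/")
end

PUBLISHERS = <<~XML
  <bibdata>
    <contributor>
      <role type="publisher"/>
      <organization>
        <name>International Organization for Standardization</name>
        <abbreviation>ISO</abbreviation>
      </organization>
    </contributor>
    <contributor>
      <role type="publisher"/>
      <organization>
        <name>International Electrotechnical Commission</name>
        <abbreviation>IEC</abbreviation>
      </organization>
    </contributor>
  </bibdata>
XML

puts agency(PUBLISHERS) # ISO/IEC
```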

publisher

A concatenation of all the agency names responsible for publishing the document. Extracted from //bibdata/contributor[xmlns:role/@type = 'publisher']/organization/name [added in https://github.com/metanorma/isodoc/releases/tag/v1.0.23].

subdivision

Subdivision of the first agency responsible for publishing the document, extracted from organization/subdivision [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

pub_address

Address of the first agency responsible for publishing the document, extracted from organization/address/formattedAddress [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

pub_phone

Phone number of the first agency responsible for publishing the document, extracted from organization/phone[not(@type = 'fax')] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

pub_fax

Fax number of the first agency responsible for publishing the document, extracted from organization/phone[@type = 'fax'] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

pub_email

Email of the first agency responsible for publishing the document, extracted from organization/email [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

pub_uri

URI of the first agency responsible for publishing the document, extracted from organization/uri [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

unpublished

Boolean value of whether the document is considered to be an unpublished draft or published, based on the status of the document.

keywords

An array of the keywords of the document.

stage

The stage of the document, extracted from //bibdata/status/stage.

stage_display

The localised stage of the document, extracted from //local_bibdata/status/stage [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].

stageabbr

The abbreviation of the stage of the document, as extracted from //bibdata/status/stage. By default, this is the initials of the stage if the document is unpublished, and nil if the document is published.
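
The default rule can be sketched as below. The stage_abbr helper is hypothetical, and the sample stage names are illustrative; how a flavour decides that a document is unpublished is left to the caller here.

```ruby
# Hypothetical helper: initials of the stage name for unpublished documents,
# nil for published ones.
def stage_abbr(stage, unpublished:)
  return nil unless unpublished && stage
  # First letter of each space- or hyphen-separated word, upper-cased.
  stage.split(/[\s-]+/).map { |word| word[0] }.join.upcase
end

puts stage_abbr("Committee Draft", unpublished: true).inspect # "CD"
puts stage_abbr("Published", unpublished: false).inspect      # nil
```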

substage

The substage of the document, extracted from //bibdata/status/substage.

substage_display

The localised substage of the document, extracted from //local_bibdata/status/substage [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.5].

iteration

The iteration of the document stage, extracted from //bibdata/status/iteration.

docnumber

The first document identifier given in the XML for the document, extracted from //bibdata/docidentifier.

docnumeric

The numeric identifier for the document, extracted from //bibdata/docnumber. The canonical document identifier in docnumber is typically the docnumeric value, preceded by an agency abbreviation and/or a document type.

edition

The document edition, extracted from //bibdata/edition.

docyear

The document copyright year, extracted from //bibdata/copyright/from.

draft

The document draft number, extracted from //bibdata/version/draft.

revdate

The document revision date, extracted from //bibdata/version/revision-date.

revdate_monthyear

The document revision date, extracted from //bibdata/version/revision-date, given as month name and year (internationalised where defined).

draftinfo

The draft number and revision date, preceded by the local label for DRAFT.

doctitle

The document title, extracted from the first //bibdata/title[@language='en'] found in the document.

partof

The identifier of the document this document is part of, extracted from //bibdata/relation[@type = 'partOf']//docidentifier.

obsoletes

The identifier of the document this document obsoletes, extracted from //bibdata/relation[@type = 'obsoletes']//docidentifier.

obsoletes_part

The part of this document that has been obsoleted, extracted from //bibdata/relation[@type = 'obsoletes']//locality.

html

The URL for an HTML version of this document, extracted from //bibdata/uri[@type = 'html'].

xml

The URL for an XML version of this document, extracted from //bibdata/uri[@type = 'xml'].

pdf

The URL for a PDF version of this document, extracted from //bibdata/uri[@type = 'pdf'].

doc

The URL for a DOC version of this document, extracted from //bibdata/uri[@type = 'doc'].

url

The URL for an unspecified version of this document, extracted from //bibdata/uri[not(@type)].

keywords

The keywords of the document, extracted from //bibdata/keywords.

title_footnote

Footnotes belonging to the document title, extracted from //bibdata/note[@type = 'title-footnote'] [added in https://github.com/metanorma/isodoc/releases/tag/v1.2.6].

Predefined text processing

The metadata hash is used by the IsoDoc::Convert.populate method to populate all templated text. Templated text is expected to be written in the Liquid template language.

The keys of the metadata hash are the variable names passed into Liquid.

Given the metadata Hash above, the following templated text:

<div class="doctitle-en">
  <div>
    <span class="title">{{ doctitleintro }}{% if doctitleintro and doctitlemain %} — {% endif %}</span><span class="subtitle">{{ doctitlemain }}{% if doctitlemain and doctitlepart %} —{% endif %}</span>
{% if doctitlepart %}
  </div>
  <div class="doctitle-part">
    {% if doctitlepartlabel %}
    <span class="partlabel">{{ doctitlepartlabel }}:</span>
    {% endif %}
    <span class="part">{{ doctitlepart }}</span>
{% endif %}
  </div>
</div>

is populated as:

<div class="doctitle-en">
  <div>
    <span class="title"></span><span class="subtitle">Neuro-information interchange interface</span>
  </div>
</div>

and all the conditional output is ignored, because the metadata Hash contains neither a part component (doctitlepart) nor an introductory component (doctitleintro) of the title: only {{ doctitlemain }} ends up populated.
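
To make the key-to-variable mapping concrete, here is a deliberately tiny stand-in for Liquid written in plain Ruby. It is not the Liquid gem and not what Metanorma runs: the render_sketch helper is hypothetical and handles only {{ name }} output and flat, non-nested {% if a and b %} blocks, which is enough to show how absent keys drop the conditional text.

```ruby
# Tiny stand-in for Liquid (NOT the Liquid gem): supports {{ name }} and
# flat {% if a and b %}...{% endif %} blocks only, with no nesting.
def render_sketch(template, vars)
  resolved = template.gsub(/\{%\s*if\s+(.+?)\s*%\}(.*?)\{%\s*endif\s*%\}/m) do
    match = Regexp.last_match
    names = match[1].split(/\s+and\s+/)
    # As in Liquid, only nil and false count as false; "" is truthy.
    names.all? { |name| vars[name.to_sym] } ? match[2] : ""
  end
  # Substitute each remaining variable; missing keys render as nothing.
  resolved.gsub(/\{\{\s*(\w+)\s*\}\}/) { vars[Regexp.last_match(1).to_sym].to_s }
end

TEMPLATE = '<span class="title">{{ doctitleintro }}' \
           '{% if doctitleintro and doctitlemain %} &#x2014; {% endif %}</span>' \
           '<span class="subtitle">{{ doctitlemain }}</span>'

META = { doctitlemain: "Neuro-information interchange interface" }.freeze

puts render_sketch(TEMPLATE, META)
```

Adding a doctitleintro key to the Hash would make both the introductory title and the separator appear in the first span.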

The IsoDoc::Convert.populate method merges the metadata Hash with the @labels hash used for internationalisation (see Localization how-to guide). This is so that any templated text can also access localised labels defined for the current language.

The metadata hash for a flavour is also populated with the absolute file locations of the gem’s copy of any logo images. That means that any logos are populated in templated text using the metadata hash.

For example, the HTML and Word logo images for the Metanorma M3D flavour are defined in IsoDoc::M3d::Metadata.initialize as:

def initialize(lang, script, labels)
  super
  here = File.dirname(__FILE__)
  set(:logo_html,
      File.expand_path(File.join(here, "html", "m3-logo.png")))
  set(:logo_word,
      File.expand_path(File.join(here, "html", "logo.jpg")))
end

That means that the HTML logo image is populated in the HTML cover page for M3D through a Liquid variable:

<img src="{{ logo_html }}" alt="m3 logo"/>

Note: Although the absolute file location of the image inside the gem is used, postprocessing replaces this with either a local copy or a Data URI, in the case of HTML, and a MIME embedded attachment containing the image, in the case of Word.

The templated text populated through metadata can include:

  • Under the isodoc/*/html directory of the gem:

    • The HTML cover page (html_*_titlepage.html) and Word cover page (word_*_titlepage.html), which are the main destination for bibdata metadata.

    • The introductory page for HTML and Word (html_*_intro.html, word_*_intro.html), although this is usually populated instead via Metanorma predefined text (see below).

    • The Word header (header.html).

    • The HTML and Word stylesheets (*.scss), in case any variables are used either to populate the stylesheet or to conditionally include text. For example, NIST and IEC use the current document status to turn line numbering on or off in the Word stylesheet: draft documents are line-numbered, and whether a document is in draft or not depends on the value of bibdata/status.

  • Under the asciidoctor/* directory of the gem:

    • The Metanorma predefined text file (boilerplate.xml)

Predefined text

The boilerplate element in Metanorma XML follows after bibdata, and contains text that is repeatedly included in each instance of the document class, and that outlines the rules under which the document may be used.

By default, the boilerplate element contains up to four elements:

  • copyright-statement,

  • license-statement,

  • legal-statement, and

  • feedback-statement.

Each of those statements is a Metanorma clause, which can contain a title, multiple paragraphs, and subclauses.

Because the predefined text is repeated for each document in its class, it is not expected to be supplied by the user (although the user can supply their own predefined text file using the :boilerplate-authority: document attribute). Instead, the predefined text is included as a Metanorma XML file within the gem; by default, it is called boilerplate.xml.

Some of the predefined text may be populated with metadata specific to the current document, so the predefined text file is a Liquid template, populated with variables from the current flavour metadata Hash as with other templated text.

The content in the boilerplate element is processed as part of the document preface, and converted to HTML or Word like the rest of the Metanorma XML. However, predefined text usually ends up in the cover page or introductory page of the document instead. The following are the default conventions in Metanorma, although they can be overridden in the IsoDoc::*::Converter.authority_cleanup method (as is currently done in NIST):

  • Content in the copyright-statement element is rendered in a <div class="boilerplate-copyright"> container.

  • The authority_cleanup method, defined in postprocessing for both the HTML and the Word converters, looks for a single element with id attribute boilerplate-copyright-destination.

  • If it finds such an element, it moves the <div class="boilerplate-copyright"> container and its contents to replace that element. This is how predefined text can populate the cover page or introductory page, instead of occurring within the document body.

  • This is repeated for each of license-statement, legal-statement, and feedback-statement.
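
The move described in the list above can be sketched with REXML as follows. This is a hypothetical re-creation for illustration, not the authority_cleanup source: the move_boilerplate helper, the sample markup and its cover and main class names are all assumptions.

```ruby
require "rexml/document"

# Hypothetical re-creation of the move for one statement.
# `name` is "copyright", "license", "legal" or "feedback".
def move_boilerplate(doc, name)
  dest = REXML::XPath.first(doc, "//*[@id='boilerplate-#{name}-destination']")
  src  = REXML::XPath.first(doc, "//div[@class='boilerplate-#{name}']")
  return doc unless dest && src # nothing to do if either is missing
  src.remove                    # detach the container from the document body
  dest.replace_with(src)        # put it where the placeholder was
  doc
end

HTML = <<~XML
  <html>
    <div class="cover"><div id="boilerplate-copyright-destination"/></div>
    <div class="main">
      <div class="boilerplate-copyright"><p>Copyright 2020 Example Org</p></div>
      <p>Body text</p>
    </div>
  </html>
XML

doc = REXML::Document.new(HTML)
%w[copyright license legal feedback].each { |name| move_boilerplate(doc, name) }
puts doc
```

When no destination element exists, the container is left where it is, so the predefined text stays in the document body.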

For example, in Metanorma ISO:

  • the copyright statement for ISO occurs on the second page:

    • <div id="boilerplate-copyright-destination"/> appears accordingly in the introductory page template;

  • the license statement is the warning shown when the document is in draft:

    • <div id="boilerplate-license-destination"/> appears in the title page template for the flavour;

    • the front page draft warning is styled through the boilerplate-license CSS class.

The following predefined text from metanorma-csa exemplifies all four statements in a predefined text, and its processing as a Liquid template.

<boilerplate>
  <copyright-statement>
    <clause>
      <p>© {{ docyear }} Cloud Security Alliance, LLC.</p>
    </clause>
  </copyright-statement>

  {% if unpublished %}
  <license-statement>
    <clause>
    <title>Warning for Drafts</title>

    <p>This document is not a CSA Standard. It is distributed for review and
      comment, and is subject to change without notice and may not be referred to as
      a Standard. Recipients of this draft are invited to submit, with their
      comments, notification of any relevant patent rights of which they are aware
      and to provide supporting documentation.
    </p>
  </clause>
  </license-statement>
  {% endif %}

  <legal-statement>
    <clause>
      <p>All rights reserved. Unless otherwise specified, no part of this
        publication may be reproduced or utilized otherwise in any form or by any
        means, electronic or mechanical, including photocopying, or posting on the
        internet or an intranet, without prior written permission. Permission can
        be requested from the address below.
      </p>
    </clause>
  </legal-statement>

  <feedback-statement>
    <clause>
      <p>Cloud Security Alliance</p>
      <p align="left">
        2212 Queen Anne Ave N<br />
        Seattle<br />
        WA 98109<br />
        United States of America<br />
        <br />
        <link target="mailto:copyright@cloudsecurityalliance.com">copyright@cloudsecurityalliance.com</link><br />
        <link target="www.cloudsecurityalliance.com">www.cloudsecurityalliance.com</link>
      </p>
    </clause>
  </feedback-statement>
</boilerplate>