How can I adapt the StanDoc toolchain for my own publications?

Tip

The easiest way to adopt StanDoc is to use the metanorma-acme gem: https://github.com/metanorma/metanorma-acme, supplying your own stylesheets and HTML files for styling.

If you wish to create a custom gem, in order to customise behaviour further:

  • Clone the metanorma-sample gem: https://github.com/metanorma/metanorma-sample.

  • Change the namespace for RSD documents (RSD_NAMESPACE = "https://open.ribose.com/standards/rsd") to a namespace specific to your organisation’s document standard.

  • Change any references to sample or Sample in the gem to your organisation’s document standard.

  • Change the styling of the document outputs (…​/lib/isodoc/XXX/html).

The tool chains currently available proceed in two steps: map an input markup language (currently Asciidoctor only) into Standoc XML, and map Standoc XML into various output formats (currently Word doc, HTML, PDF via HTML). Running the metanorma tool involves a third step, of exposing the capabilities available in the first two in a consistent format. These two steps are represented as three separate modules, which are included in the same gem; for the Sample gem, they are Asciidoctor::Sample, Isodoc::Sample, and Metanorma::Sample. (In the case of Metanorma-ISO, almost all the content of Isodoc::ISO is in the isodoc gem, so the base classes of the two steps are in separate gems.)

Your adaptation of the toolchain will need to instantiate these three modules. The connection between the two first steps is taken care of in the toolchain, and metanorma explicitly invokes the two steps, feeding the XML output of the first step as input into the second. The metanorma-sample gem outputs both Word and HTML; you can choose to output only Word (as is done in metanorma-m3d), or only HTML (as is done in metanorma-csand), and you can choose to generate PDF from HTML as well (as is done in metanorma-csd).

The modules involve classes which rely on inheritance from other classes; the current gems use Asciidoctor::{Standoc, ISO}::Converter, Isodoc::{Metadata, HtmlConvert, WordConvert}, and Metanorma::Processor as their base classes. This allows the standards-specific classes to be quite succinct, as most of their behaviour is inherited from other classes; but it also means that you need to be familiar with the underlying gems, in order to do most customisation.

In the case of Asciidoctor::X classes, the changes you will need to make involve the intermediate XML representation of your document, which is built up through Nokogiri Builder; e.g. adding different enums, or adding new elements. The adaptations in Asciidoctor::Sample::Converter are limited, and most projects can take them across as is.

The customisations needed for Metanorma::Sample::Processor are minor, and involve invoking methods specific to the gem for document generation.

The customisations needed for Isodoc::Sample are more extensive. Three base classes are involved:

  • Isodoc::Metadata processes the metadata about the document stored in //bibdata. This information typically ends up in the document title page, as opposed to the document body. For that reason, metadata is extracted into a hash, which is passed to document output (title page, Word header) via the Liquid template language.

  • Isodoc::HtmlConvert converts Standoc XML to HTML.

  • Isodoc::WordConvert converts Standoc XML to Word HTML; the html2doc gem then converts this to a .doc document.

The Isodoc::HtmlConvert and Isodoc::WordConvert overlap substantially, as both use variants of HTML. However there is no reason not to make substantially different rendering choices in the HTML and Word branches of the code.

Asciidoctor::Standoc customisation in metanorma-sample

Examples from Asciidoctor::Sample in metanorma-sample. In the following snippets, the parameter node represents the current node of the Asciidoctor document, and xml represents the Nokogiri Builder node of the XML output.

  • The boilerplate representation of the document’s author, publisher and copyright holder names Acme instead of ISO as the responsible organisation.

      def metadata_author(node, xml)
        xml.contributor do |c|
          c.role **{ type: "author" }
          c.organization do |a|
            a.name "Acme"
          end
        end
      end
  • The editorial committees are represented as a single element, as opposed to ISO’s name plus number. (node.attr() recovers Asciidoctor document attribute values.)

      def metadata_committee(node, xml)
        xml.editorialgroup do |a|
          a.committee node.attr("committee"),
            **attr_code(type: node.attr("committee-type"))
        end
      end
  • The document title is monolingual, not bilingual. (asciidoc_sub() is already defined to apply Asciidoctor text substitutions to its contents, such as smart quotes and em-dashes.)

      def title(node, xml)
        ["en"].each do |lang|
          xml.title **{ language: lang, format: "plain" } do |t|
            t << asciidoc_sub(node.attr("title"))
          end
        end
      end
  • The document status is a single element, as opposed to ISO’s two-part code.

      def metadata_status(node, xml)
        xml.status(**{ format: "plain" }) { |s| s << node.attr("status") }
      end
  • The document identifier is a single element.

      def metadata_id(node, xml)
        xml.docidentifier { |i| i << node.attr("docnumber") }
      end
  • A security element is added to the document metadata.

      def metadata_security(node, xml)
        security = node.attr("security") || return
        xml.security security
      end

      def metadata(node, xml)
        super
        metadata_security(node, xml)
      end
  • Title validation and style validation is disabled.

      def title_validate(root)
        nil
      end
  • The root element of the document is changed from iso-standard to sample-standard.

      def makexml(node)
        result = ["<?xml version='1.0' encoding='UTF-8'?>\n<sample-standard>"]
        @draft = node.attributes.has_key?("draft")
        result << noko { |ixml| front node, ixml }
        result << noko { |ixml| middle node, ixml }
        result << "</sample-standard>"
        ....
      end
  • The document type attribute is restricted to a prescribed set of options.

      def doctype(node)
        d = node.attr("doctype")
        unless %w{policy-and-procedures best-practices
          supporting-document report legal directives proposal
          standard}.include? d
          warn "#{d} is not a legal document type: reverting to 'standard'"
          d = "standard"
        end
        d
      end
  • The literal asciidoctor block is processed as a preformatted tag (pre). (The code uses the built-in Asciidoctor literal() method, and embeds pre within a figure tag.)

      def literal(node)
        noko do |xml|
          xml.figure **id_attr(node) do |f|
            figure_title(node, f)
            f.pre node.lines.join("\n")
          end
        end
      end
  • A keyword element is added. (The keyword is encoded through the role attribute of Asciidoc: text)

      def inline_quoted(node)
        noko do |xml|
          case node.type
          ...
          else
            case node.role
            ...
            when "keyword" then xml.keyword node.text
            else
              xml << node.text
            end
          end
        end.join
      end
  • The inline headers of ISO are ignored.

      def sections_cleanup(x)
        super
        x.xpath("//*[@inline-header]").each do |h|
          h.delete("inline-header")
        end
      end

Metanorma::Processor customisation in metanorma-sample

  • initialize names the token by which Asciidoctor registers the standard

      def initialize
        @short = :sample
        @input_format = :asciidoc
        @asciidoctor_backend = :sample
      end
  • output_formats names the available output formats (including XML, which is inherited from the parent class)

      def output_formats
        super.merge(
          html: "html",
          doc: "doc",
          pdf: "pdf"
        )
      end
  • version gives the current version string for the gem

     def version
        "Asciidoctor::Sample #{Asciidoctor::Sample::VERSION}"
      end
  • input_to_isodoc is the call which converts Asciidoctor input into IsoDoc XML

      def input_to_isodoc(file, filename)
        Metanorma::Input::Asciidoc.new.process(file, filename, @asciidoctor_backend)
      end
  • output is the call which converts IsoDoc XML into various nominated output formats

      def output(isodoc_node, outname, format, options={})
        case format
        when :html
          IsoDoc::Sample::HtmlConvert.new(options).convert(outname, isodoc_node)
        when :doc
          IsoDoc::Sample::WordConvert.new(options).convert(outname, isodoc_node)
        when :pdf
          IsoDoc::Sample::PdfConvert.new(options).convert(outname, isodoc_node)
        else
          super
        end
      end

Isodoc::Standoc customisation in metanorma-sample

  • Initialise the HTML Converter:

    • Set @libdir, the current directory of the HTML converter, and the basis of the html_doc_path() method for accessing HTML assets (the html subdirectory of the current directory).

    • Copy the logo JPG from the HTML asset directory to the working directory, so that it can be access by the HTML template; flag the copy for deletion at the end of processing.

      def initialize(options)
        @libdir = File.dirname(__FILE__)
        super
        FileUtils.cp html_doc_path('logo.jpg'), "logo.jpg"
        @files_to_delete << "logo.jpg"
      end
  • Set the default fonts for the HTML rendering, which will be used to populate the HTML CSS stylesheet.

    class HtmlConvert < IsoDoc::HtmlConvert
      def default_fonts(options)
        {
          bodyfont: (options[:script] == "Hans" ? '"SimSun",serif' : '"Overpass",sans-serif'),
          headerfont: (options[:script] == "Hans" ? '"SimHei",sans-serif' : '"Overpass",sans-serif'),
          monospacefont: '"Space Mono",monospace'
        }
      end
  • Set the default HTML assets for the HTML rendering.

    class HtmlConvert < IsoDoc::HtmlConvert
      def default_file_locations(_options)
        {
          htmlstylesheet: html_doc_path("htmlstyle.scss"),
          htmlcoverpage: html_doc_path("html_sample_titlepage.html"),
          htmlintropage: html_doc_path("html_sample_intro.html"),
          scripts: html_doc_path("scripts.html"),
        }
      end
  • Set distinct default fonts and HTML assets for the Word rendering.

    class WordConvert < IsoDoc::WordConvert
      def default_fonts(options)
        {
          bodyfont: (options[:script] == "Hans" ? '"SimSun",serif' : '"Arial",sans-serif'),
          headerfont: (options[:script] == "Hans" ? '"SimHei",sans-serif' : '"Arial",sans-serif'),
          monospacefont: '"Courier New",monospace'
        }
      end

      def default_file_locations(_options)
        {
          wordstylesheet: html_doc_path("wordstyle.scss"),
          standardstylesheet: html_doc_path("sample.scss"),
          header: html_doc_path("header.html"),
          wordcoverpage: html_doc_path("word_sample_titlepage.html"),
          wordintropage: html_doc_path("word_sample_intro.html"),
          ulstyle: "l3",
          olstyle: "l2",
        }
      end
  • Set the content of the HTML head, other than the CSS stylesheets. Note that the head title is given as a Liquid Template reference to metadata (){{ doctitle }}, which we have seen populated above.)

     def html_head
        <<~HEAD.freeze
        <title>{{ doctitle }}</title>
    <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
    <!--TOC script import-->
    <script type="text/javascript" src="https://cdn.rawgit.com/jgallen23/toc/0.3.2/dist/toc.min.js"></script>
    <!--Google fonts-->
    ....
    <style class="anchorjs"></style>
        HEAD
      end
  • Change the default label for annexes from "Annex" to "Appendix".

      def i18n_init(lang, script)
        super
        @annex_lbl = "Appendix"
      end
  • Define rendering for the pre and keyword preformatted text tags.

      def pre_parse(node, out)
        out.pre node.text # content.gsub(/</, "&lt;").gsub(/>/, "&gt;")
      end

      def error_parse(node, out)
        # catch elements not defined in ISO
        case node.name
        when "pre"
          pre_parse(node, out)
        when "keyword"
          out.span node.text, **{ class: "keyword" }
        else
          super
        end
      end
  • Render term headings in the same paragraph as the term heading number

      def term_cleanup(docxml)
        docxml.xpath("//p[@class = 'Terms']").each do |d|
          h2 = d.at("./preceding-sibling::*[@class = 'TermNum'][1]")
          h2.add_child("&nbsp;")
          h2.add_child(d.remove)
        end
        docxml
      end