Metanorma: Aequitate Verum

Output formats

The Metanorma suite compiles documents into the following formats.

Metanorma XML

The Metanorma XML output is the intermediate format which marks up the semantic content of the standards document, and is used to drive the other formats. The Metanorma XML file is also the file which is used during the validation stage.

HTML

The HTML output is in HTML 5. Local images can optionally be Data-URI encoded; images in the output that are not Data-URI encoded are moved to a folder called {filename}_images and renamed with GUID names to prevent collisions. Audio and video files are not supported. All HTML output has a sidebar with a JavaScript-generated table of contents, which is two section levels deep.
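As an illustration of the two image-handling modes described above, here is a minimal Python sketch. It is not Metanorma's own code: the function names to_data_uri and guid_image_path are hypothetical, and it assumes that {filename} refers to the output file name without its extension.

```python
import base64
import mimetypes
import uuid
from pathlib import Path


def to_data_uri(image_bytes: bytes, filename: str) -> str:
    """Encode image bytes as a data: URI, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        # Fall back to a generic type when the extension is unknown.
        mime = "application/octet-stream"
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{payload}"


def guid_image_path(doc_filename: str, image_filename: str) -> Path:
    """Build a path under {filename}_images with a GUID name (hypothetical helper).

    Assumption: {filename} is the document file name without its extension.
    """
    stem = Path(doc_filename).stem
    suffix = Path(image_filename).suffix
    # A random UUID keeps two images with the same original name from colliding.
    return Path(f"{stem}_images") / f"{uuid.uuid4()}{suffix}"
```

For example, to_data_uri(b"abc", "logo.png") yields a string beginning with data:image/png;base64, which can be placed directly in the src attribute of an img element, while guid_image_path("document.html", "logo.png") yields a path such as document_images/<guid>.png.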

PDF

Metanorma generates PDF output from XML. The styling comes from an XSL-FO stylesheet. Apache FOP interprets the stylesheet and generates the PDF.

For more information about how Apache FOP supports the XSL-FO standard, see the Apache FOP documentation.
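To make the pipeline concrete, the following Python sketch assembles a plain Apache FOP command line that applies a stylesheet to an XML file and writes a PDF. It is an illustration only: Metanorma calls FOP through its own tooling, and the helper name fop_command and the file names are hypothetical. The -xml, -xsl and -pdf options are those of the standard FOP command-line interface, where -xsl names an XSLT stylesheet that produces XSL-FO.

```python
from typing import List


def fop_command(xml_path: str, xsl_path: str, pdf_path: str,
                fop_executable: str = "fop") -> List[str]:
    """Return the argument list for a standalone FOP run (hypothetical helper).

    Assumption: an `fop` launcher script is available on the PATH.
    """
    return [
        fop_executable,
        "-xml", xml_path,   # input XML document
        "-xsl", xsl_path,   # XSLT stylesheet that produces XSL-FO
        "-pdf", pdf_path,   # PDF file to write
    ]


# Hypothetical file names; to actually run FOP, pass the list to subprocess.run:
#   import subprocess
#   subprocess.run(fop_command("document.xml", "style.xsl", "document.pdf"), check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.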

Microsoft Word

The Word output uses the legacy DOC format (used in pre-2007 versions of MS Word) rather than DOCX. It is generated using the Microsoft Office flavor of HTML 4, as a Multipart HTML Word Document (MHT, the MIME-encoded counterpart to the HTML obtained when you save a Word document as HTML).
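The MHT container itself is an ordinary MIME multipart message. The sketch below uses Python's standard email package to wrap an HTML string in a multipart/related message, which is the general shape of such a file. It is a simplified illustration under stated assumptions, not the Metanorma implementation: the function name build_mht and the Content-Location value are hypothetical, and the Word-specific markup, styles and auxiliary parts that a real Word HTML document carries are omitted.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(html: str,
              location: str = "file:///C:/doc/document.htm") -> MIMEMultipart:
    """Wrap an HTML string in a multipart/related MIME message (hypothetical helper)."""
    msg = MIMEMultipart("related")
    # The HTML body is the root part; further parts (images, header and
    # footer files) would be attached in the same way.
    part = MIMEText(html, "html", "utf-8")
    # Content-Location lets other parts refer to this one by URL.
    part["Content-Location"] = location
    msg.attach(part)
    return msg


# Serialise the message to text; a real pipeline would write this to a file.
mht_text = build_mht("<html><body><p>Scope</p></body></html>").as_string()
```

Images referenced by the HTML would be attached as further parts with their own Content-Location headers, which is how a single file can carry the document together with its resources.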

Limitations of .doc format

Using .doc imposes some constraints:

  • SVG images are not supported. (Word internally converts them into PNG files to render them in Word HTML.)

  • .doc files cannot be processed by Apple Pages or LibreOffice, only by Microsoft Word.

Tip

To open the Word output in LibreOffice, you first need to convert the .doc file to a .docx file in a specific way:

  1. Open the file with MS Word.

  2. Save it once by changing the filename extension to .doc (confirm the file overwrite when asked).

  3. Use "Save As" and choose the .docx format this time.

  • When you open the document, the table of contents shows every page number as 1. This is because the table of contents needs to be updated from within Word: right-click the table of contents, select "Update Field", and then select "Update entire table".

Why not .docx

Using DOC HTML makes it much easier to generate documents with the advanced formatting requirements of Metanorma (including complex tables, formulas, footnotes, headers and footers, nested list numbering and cross-references) than generating either native DOCX (in OOXML), or the DOCX flavor of MHT.

Tip

For more on why Metanorma uses .doc, see https://github.com/metanorma/html2doc/wiki/Why-not-docx%3F.