Metanorma: Aequitate Verum

Languages and localization

Languages

Metanorma allows generation of standards in different languages.

Document specification

The language of the document is specified using the following document attributes:

:language:

ISO 639-1 two-letter language code. Defaults to en.

:script:

ISO 15924 script code (for languages with more than one script). Defaults to Latn. For example, specify Hant for Traditional Chinese.

:locale:

ISO 3166-1 two-letter country code [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v2.2.4].

Note

As of this writing, the impact of locale on localisation is minor: in templated localised text (e.g. cross-references), a narrow non-breaking space precedes the colon in Swiss French, whereas a full non-breaking space is used in other French locales.

:i18nyaml:

Language template file. Only needed if you’re using a language other than English, Chinese (Simplified) or French.

Content specification

Localised string variants can be specified using the lang:[] inline command [added in https://github.com/metanorma/metanorma-standoc/releases/tag/v1.6.3].

The lang:[] inline command is specified in content as:

... lang:{locale-code}[] ...

Where,

  • {locale-code} is either a:

    • language code. A two-letter ISO 639-1 language code.

      Example 1. French text specified in a document that is not specified as using French
      …​ lang:fr[je pense, donc je suis] …​
    • language code and script code linked with a hyphen. The language code is a two-letter ISO 639-1 code, and the script code is an ISO 15924 code.

      Example 2. Traditional Chinese text specified in a document that is not specified as using Traditional Chinese
      …​ lang:zh-Hant[國有四維] …​

If lang:[] commands appear next to each other in a document, and one of them matches the current document language (and script, if supplied), then only that command is rendered.

Example 3. Encoding language-specific content in a document specified with a different language

Suppose the document is in English:

=== lang:en[husked rice] lang:fr[riz décortiqué]

is rendered as

husked rice
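The selection rule above can be sketched in Python. This is only an illustration of the rule as described here, not Metanorma's implementation: the function names select_variants and split_locale_code are hypothetical, and the behaviour when no variant matches (all variants are kept) is an assumption, since this page does not state it.

```python
from typing import List, Optional, Tuple


def split_locale_code(code: str) -> Tuple[str, Optional[str]]:
    """Split 'zh-Hant' into ('zh', 'Hant'); 'fr' becomes ('fr', None)."""
    language, _, script = code.partition("-")
    return language, (script or None)


def select_variants(
    variants: List[Tuple[str, str]],
    doc_language: str,
    doc_script: Optional[str] = None,
) -> List[str]:
    """Return the texts to render for a run of adjacent lang:[] commands.

    variants is a list of (locale_code, text) pairs in document order.
    """
    for code, text in variants:
        language, script = split_locale_code(code)
        # A script in the locale code must also match the document script.
        if language == doc_language and (script is None or script == doc_script):
            return [text]  # only the matching command is rendered
    # Assumption: with no match, every variant is kept.
    return [text for _, text in variants]


# Example 3: the document language is English.
heading = [("en", "husked rice"), ("fr", "riz décortiqué")]
print(select_variants(heading, "en"))
```

With the document language set to fr instead, the same call would return only the French variant.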

Tip

If you’re looking for information about how to add support for a new language in your custom Metanorma workflow, see localization in Metanorma developer’s documentation.

Number localization

General

Standardization bodies often require numbers to be localized in their display, with the use of a particular decimal marker, groupings of digits, and groupings of decimal digits.

The SI system, defined by the Metre Convention, notably describes acceptable practices for representing numbers and quantities (as documented in the BIPM SI Brochure).

Some standardization bodies also utilize the ISO 80000 series of standards which provide details on representing numbers.

Metanorma supports localization for number representation formats as well as specific formats required by specific standardization bodies.

The number localization format can be overridden in particular flavours of Metanorma to meet flavour-specific requirements.

Specifying numbers to localize

Localization of numbers is only applied if numbers are encoded as stem expressions.

Example 4. Example of representing numbers without localization
The number 60007.1234, entered in that way, is displayed plainly as 60007.1234.

By default, numbers in Metanorma are localized to the display conventions of the language of the document (specified using the attribute :language:). Metanorma automatically applies localization conventions of the particular language, as specified by the Unicode CLDR data repository (via the Twitter CLDR gem).

Typical English localization formats the number 60007.1234 as 60,007.1234. This is achieved in Metanorma by tagging the document as being in English (:language: en), and encoding the number as a stem expression (stem‌:[60007.1234]).

Example 5. Example of localizing numbers using stem encoding
The encoding of stem‌:[60007.1234] in a document with language specified to English is displayed as 60,007.1234.
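For readers who want to see the difference between the plain and the grouped form outside Metanorma, the following Python snippet is a rough analogy only. Metanorma takes its conventions from CLDR; here Python's built-in "," format option is used because it happens to produce the same English-style grouping. The helper name group_english is hypothetical.

```python
from decimal import Decimal


def group_english(number: str) -> str:
    """Group integer digits in threes with commas, keeping '.' as decimal marker."""
    # Decimal avoids binary floating-point rounding of the fractional digits.
    return format(Decimal(number), ",")


print("60007.1234")                 # plain text, no localization
print(group_english("60007.1234"))  # grouped English-style form
```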

Flavor-specific number formats

Flavours can override the language-specific conventions.

Flavor-specific number formats are implemented for the following flavors:

Overriding number representation formats

The language and flavour conventions for number localization can be overridden with the :localize-number: attribute, which uses a template to indicate how numbers should be displayed.

The template is encoded as a string as follows:

:localize-number: #,##0.### ###

Where,

  • The decimal marker is the character immediately after "0". It is typically specified as a decimal point (.), or decimal comma (,).

  • The number of digits grouped together before the decimal marker is 1 plus the number of contiguous hashes immediately before 0. (This is normally three, but two are used in India.)

  • The separator of groups of digits is the character immediately before the contiguous run of hashes before 0. If there is no non-hash character before 0, then there is no grouping of digits before the decimal marker.

  • The number of contiguous hashes after the decimal marker is the number of fractional digits to be grouped together.

  • The first character after the contiguous hashes after the decimal marker is the separator of groups of fractional digits. If there is no non-hash character after the decimal marker, then there is no grouping of digits after the decimal marker.

To use a non-breaking space, enter the corresponding Unicode character directly in the template string. The differences between a normal space and non-breaking spaces are described at Non-breaking space on Wikipedia.

To illustrate, the encoding stem:[60007.1234] will be rendered as:

  • 60 007.123 4 if specified with :localize-number: # ##0.### ###

  • 60 007,123 4 if specified with :localize-number: # ##0,### ###

  • 60007.1234 if specified with :localize-number: ###0.######

  • 60,007.12 34 if specified with :localize-number: #,##0.## ## ##
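The template rules above can be sketched in Python as follows. This is a minimal reading of the rules as written on this page, not Metanorma's own code: the names parse_template and localize are hypothetical, the sketch assumes the template contains a decimal marker right after 0, it takes the number as an unsigned decimal string, and it does not truncate or pad fractional digits.

```python
from typing import NamedTuple, Optional


class NumberTemplate(NamedTuple):
    decimal: str               # character immediately after "0"
    group_size: int            # 1 + contiguous hashes before "0"
    group_sep: Optional[str]   # character before that run of hashes, if any
    frac_size: int             # contiguous hashes after the decimal marker
    frac_sep: Optional[str]    # character after that run of hashes, if any


def parse_template(template: str) -> NumberTemplate:
    zero = template.index("0")
    decimal = template[zero + 1]  # assumes a decimal marker is present
    # Walk left over the hashes immediately before "0".
    i = zero
    while i > 0 and template[i - 1] == "#":
        i -= 1
    group_sep = template[i - 1] if i > 0 else None
    # Walk right over the hashes immediately after the decimal marker.
    start = zero + 2
    j = start
    while j < len(template) and template[j] == "#":
        j += 1
    frac_sep = template[j] if j < len(template) else None
    return NumberTemplate(decimal, 1 + (zero - i), group_sep, j - start, frac_sep)


def localize(number: str, template: str) -> str:
    t = parse_template(template)
    int_part, _, frac_part = number.partition(".")
    if t.group_sep is not None:
        # Integer digits are grouped from the right.
        groups = []
        while len(int_part) > t.group_size:
            groups.insert(0, int_part[-t.group_size:])
            int_part = int_part[:-t.group_size]
        groups.insert(0, int_part)
        int_part = t.group_sep.join(groups)
    if frac_part and t.frac_sep is not None and t.frac_size > 0:
        # Fractional digits are grouped from the left.
        chunks = [frac_part[k:k + t.frac_size]
                  for k in range(0, len(frac_part), t.frac_size)]
        frac_part = t.frac_sep.join(chunks)
    return int_part + (t.decimal + frac_part if frac_part else "")


for tpl in ["# ##0.### ###", "# ##0,### ###", "###0.######", "#,##0.## ## ##"]:
    print(repr(tpl), "->", localize("60007.1234", tpl))
```

The loop prints one line per template from the list above, so the sketch can be compared directly against the four renderings shown.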