Version 0.3, 22 March 2022
Version history:
0.1, 26 August 2021
0.2, 31 August 2021
0.3, 22 March 2022
Introduction
The Mongolian script is not widely supported in page layout applications, nor has the Unicode encoding of Mongolian been settled definitively (see Liang Hai, The Mongolian Script: What’s Going On?!). The following information and instructions must therefore be considered provisional and incomplete until these matters are settled.
About the Mongolian script
Richard Ishida’s Mongolian script summary gives the best overview. Contents:
Sample — Usage & History — Basic Features —
Character index (Letters – Combining marks – Numbers – Punctuation – Separator & other)
Phonology (Vowel sounds – Consonant sounds)
Structure (Vowel harmony – Glyphs vs. phonemes – Spelling vs. pronunciation)
Vowels (Basic vowels – Suffixes – Final vowel separation)
Consonants (Basic Mongolian consonants – QA and GA – Repertoire extension
– Consonants for other languages – Consonant clusters & gemination)
Combining marks — Numbers — Text direction (In horizontal contexts)
Glyph shaping & positioning (Cursive shaping – Context-based shaping
– Context-based positioning – Font styles – Baselines & inline alignment)
Punctuation & inline features (Grapheme boundaries – Word boundaries
– Phrase & section boundaries – Parentheses & brackets – Quotations
– Emphasis – Abbreviation, ellipsis & repetition – Inline notes & annotations
– Other inline ranges – Other punctuation)
Line & paragraph layout (Line breaking & hyphenation – Text alignment & justification
– Letter spacing – Counters, lists, etc. – Styling initials)
Page & book layout (General page layout & progression – Forms & user interaction
– Page numbering, running headers, etc.)
Languages using the Mongolian script — Online resources — References
Sources of confusion
- As Richard Ishida writes: “Unicode encodes separate characters for different sounds for the Mongolian language, regardless of whether the glyph shapes used are identical. (…) The result of this encoding method is that it is impossible to accurately copy Mongolian text from a visual source unless you speak the language well enough to recognise the phonetics of the words involved. It also leads to mistakes when Mongolian speakers type text.” From the glyph alone you cannot be certain which phoneme is signified, and a single glyph shape may stand for either of two different character codes.
- Furthermore, Mongolian orthography, at least in the Mongolian script (not in the Cyrillic script), reflects an archaic pronunciation even if the written Mongolian text is supposed to represent a modern utterance. “For example, if you were to spell out the letters in the following word as written you would get uʤəgulxu, whereas the modern pronunciation is uʤuuləx. (ᠤᠵᠡᠭᠦᠯᠬᠦ)” (Richard Ishida)
- Many Mongolian characters have variant forms, and which variant is to be rendered in each case is not always deducible algorithmically. This forces human intervention by inserting (invisible) control characters.
Unfortunately, no consensus presently exists about which variant forms can be selected automatically by font mechanisms and which others demand that the user select the correct form. This has resulted in different behaviour depending on the font chosen to render Mongolian-script text.
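Because these control characters are invisible, it can help to list them explicitly when checking a manuscript. The following is a minimal sketch in Python. The selection of code points (the Mongolian free variation selectors, the Mongolian vowel separator, the zero-width joiners, and the narrow no-break space) and the short labels are assumptions made for illustration; they are not a list prescribed by this document.

```python
# Invisible characters that commonly influence Mongolian shaping.
# NOTE: this selection and the short labels are illustrative assumptions.
SHAPING_CONTROLS = {
    "\u180B": "FVS1",   # Mongolian free variation selector one
    "\u180C": "FVS2",   # Mongolian free variation selector two
    "\u180D": "FVS3",   # Mongolian free variation selector three
    "\u180F": "FVS4",   # Mongolian free variation selector four
    "\u180E": "MVS",    # Mongolian vowel separator
    "\u200C": "ZWNJ",   # zero width non-joiner
    "\u200D": "ZWJ",    # zero width joiner
    "\u202F": "NNBSP",  # narrow no-break space
}


def find_shaping_controls(text):
    """Return (index, label) for every listed invisible character in text."""
    return [(i, SHAPING_CONTROLS[ch])
            for i, ch in enumerate(text)
            if ch in SHAPING_CONTROLS]


if __name__ == "__main__":
    # Hypothetical sample: letter A, FVS1, letter NA.
    sample = "\u1820\u180B\u1828"
    for index, label in find_shaping_controls(sample):
        print(f"position {index}: {label}")
```

Such a report only shows where a human has intervened; whether a given selector has any effect still depends on the font used.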
Such lack of automation severely impedes robust Mongolian-script text processing. The uncertainties about the encoding of Mongolian text and the divergent font behaviours make for a very unreliable text flow in the publishing chain. Although work is underway to remedy the situation, there is still no end date in sight.
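To illustrate the first source of confusion listed above, the character codes behind a piece of Mongolian text can be spelled out with Python's standard unicodedata module. This is only a sketch; the pairs O/U and OE/UE are chosen here as commonly cited examples of letters whose glyph shapes coincide, and that pairing is an assumption of this example rather than a statement taken from the text above.

```python
import unicodedata

# Pairs of distinct characters that are commonly described as sharing
# glyph shapes. NOTE: the choice of pairs is an illustrative assumption.
LOOKALIKE_PAIRS = [
    ("\u1823", "\u1824"),
    ("\u1825", "\u1826"),
]


def describe(ch):
    """Return the code point and Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def spell_out(text):
    """Describe every character of text, one entry per code point."""
    return [describe(ch) for ch in text]


if __name__ == "__main__":
    for first, second in LOOKALIKE_PAIRS:
        print(describe(first), "<->", describe(second))
```

Running spell_out on a word copied from a digital source shows which of the look-alike characters was actually typed, something that cannot be read off the rendered glyphs.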
Fonts
Microsoft Windows 10 comes bundled with one dedicated Mongolian font, Mongolian Baiti.
Mongolian Baiti font sizes:
Brill 11 pt ~ Mongolian Baiti 16 pt, lowered by 3 pt
Brill 10 pt ~ Mongolian Baiti 14 pt, lowered by 2½ pt
Brill 9 pt ~ Mongolian Baiti 13 pt, lowered by 2½ pt
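The size pairs above can be kept in a small lookup table so that a script or style definition applies them consistently. The sketch below is in Python; the function name, the dictionary keys, and the convention of expressing "lowered" as a negative baseline shift are assumptions of this example.

```python
# Brill size in pt -> (Mongolian Baiti size in pt, lowering in pt),
# copied from the table above.
BAITI_FOR_BRILL = {
    11: (16, 3.0),
    10: (14, 2.5),
    9: (13, 2.5),
}


def baiti_settings(brill_pt):
    """Return the Mongolian Baiti size and baseline shift for a Brill size.

    The shift is returned as a negative number (assumed convention:
    negative means lowered). Sizes not in the table raise ValueError,
    because the table gives no rule for interpolating.
    """
    if brill_pt not in BAITI_FOR_BRILL:
        raise ValueError(f"no Mongolian Baiti setting for Brill {brill_pt} pt")
    size_pt, lowered_pt = BAITI_FOR_BRILL[brill_pt]
    return {"size_pt": size_pt, "baseline_shift_pt": -lowered_pt}


if __name__ == "__main__":
    print(baiti_settings(10))
```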
How to render Mongolian in the Mongolian script?
Given the present state of the Unicode Standard, fonts, and applications with regard to the Mongolian script, all Mongolian text written in its native script should be rendered as an image, preferably a resolution-independent SVG image.
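To find the passages that need to be replaced by images, a manuscript can be scanned for characters from the Mongolian Unicode blocks. The following Python sketch assumes that the blocks Mongolian (U+1800 to U+18AF) and Mongolian Supplement (U+11660 to U+1167F) are the ranges of interest; the function name is hypothetical.

```python
import re

# Runs of characters from the Mongolian and Mongolian Supplement blocks.
# NOTE: restricting the search to these two blocks is an assumption; joiners
# or special spaces from other blocks inside a word will split a run.
MONGOLIAN_RUN = re.compile("[\u1800-\u18AF\U00011660-\U0001167F]+")


def mongolian_runs(text):
    """Return (start, end) index pairs of Mongolian-script runs in text."""
    return [(m.start(), m.end()) for m in MONGOLIAN_RUN.finditer(text)]


if __name__ == "__main__":
    # Hypothetical sample: Latin text around two Mongolian letters.
    sample = "abc \u182E\u1823 def"
    for start, end in mongolian_runs(sample):
        print(f"Mongolian-script text at {start}..{end}: {sample[start:end]!r}")
```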
XML language and script tags
(For future application, once Mongolian can be rendered reliably in its native script: 'mn-Mong', that is, the language subtag 'mn' combined with the script subtag 'Mong'.)
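As a sketch of how the tag might be applied, the following Python snippet uses the standard xml.etree.ElementTree module to write an xml:lang attribute. The element name 'foreign' and the function name are placeholders assumed for this example; the actual element depends on the XML schema in use.

```python
import xml.etree.ElementTree as ET

# Namespace URI to which the reserved 'xml' prefix is bound.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def tag_mongolian(text, element="foreign"):
    """Wrap text in an element carrying xml:lang="mn-Mong".

    The default element name 'foreign' is a hypothetical placeholder.
    """
    el = ET.Element(element)
    el.set(f"{{{XML_NS}}}lang", "mn-Mong")
    el.text = text
    return ET.tostring(el, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample content: a single Mongolian letter.
    print(tag_mongolian("\u182E"))
```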