Version 1.0.2, 27 June 2022

Version history:

  • 1.0, 19 January 2018

  • 1.0.1, 22 January 2018

  • 1.0.2, 27 June 2022

The Burmese script

The Burmese (or ‘Myanmar’: both designations are used) writing system is a complex Southeast Asian abugida. It is written from left to right. For a brief description of the script, see ScriptSource.org. Software support for it is still patchy, especially among page layout applications. As of 19 January 2018, Adobe InDesign (CC 2018, v. 13.0) could not render Burmese text correctly, whether the World-Ready Paragraph Composer or the ‘ordinary’ Paragraph Composer was selected. This rules out most typesetters employed by Brill.

Some recent web browsers, such as Firefox 57, seem to be handling Burmese quite satisfactorily, but Google Chrome does not perform as well.

For typesetting texts, Brill plans to first employ TAT Zetwerk of Utrecht, the Netherlands, because they have proved to be very adept at supporting non-Latin scripts in their LuaTeX setup, mainly because they have access to very low-level program code and to the complete text stack. It is not yet known whether OpenType technology (in the shape of the HarfBuzz library) or SIL’s Graphite software should be used. LuaTeX is capable of accessing both.

Languages written in the Burmese script

Apart from the Burmese language itself, including Old Burmese: Sanskrit, Pali, Aiton, Akha, Asho Chin, Bwe Karen, Geba Karen, Khamti, Lamkang, Manumanaw Karen, Marma, Moken, Mon, Pa’o Karen, Phake, Pwo Eastern Karen, Pwo Western Karen, Rakhine, Ruching Palaung, Rumai Palaung, S’gaw Karen, Shan, Shwe Palaung, Tai Laing, Western Kayah.

Encoding

Unicode and ‘Ad Hoc’ encodings

Although the Burmese script entered the Unicode Standard relatively early (in 1999), the encoding of Burmese-script text has been fraught with difficulties from the beginning. Because computer operating systems and applications lacked complex-script support for approximately the first decade of the 21st century, and support advanced only slowly after that, users of the Burmese script felt forced to adopt fonts and encodings that break important Unicode principles. The main principle relevant for Burmese is that Unicode text must always be input ‘logically’, not ‘visually’.

The script is complex. A cluster (or ‘conjunct’) of characters signifying phonemic values, such as base consonant, vowel, adjunct consonant(s) either in the syllable onset or in final position, tone, etc., forms a typographic unit in the script and may consist of five or even more characters whose positions and even shapes depend on the (typo)graphic characteristics of the others. Sometimes a character must logically be input after another but be rendered before it. Two examples follow, one with the vowel sign e (ေ) and one with the character ‘ya yit’ (ြ), which palatalizes velar consonants; where such combining marks are shown in isolation, the symbol ◌ stands for the base character:

  • နေ /ne/: the logical order is na (န) and then e (ေ), which is how Unicode requires it to be encoded. Note that the vowel sign must appear to the left of the consonant according to the rules of the Burmese script.

  • မြန်မာ /myanma/: the logical order is ma (မ), ‘ya yit’ (ြ), n (န), a syllable-final consonant marker (်), ma (မ) and a low tone marker (ာ). The ya yit palatalization marker must be encoded after the consonant to which it applies, but it must be rendered to the left and around that character.
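The logical order of the two examples above can be inspected with a short script. This is a minimal sketch in Python using only the standard `unicodedata` module; the helper name `describe` is chosen here for illustration and is not part of any Brill tooling.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs in stored (logical) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The two examples, written with escapes so the stored order is explicit.
NE = "\u1014\u1031"                              # na, then vowel sign e
MYANMA = "\u1019\u103c\u1014\u103a\u1019\u102c"  # ma, ya yit, na, asat, ma, aa

for label, word in (("ne", NE), ("myanma", MYANMA)):
    print(label)
    for code_point, name in describe(word):
        print("  ", code_point, name)
```

Although the vowel sign e and ya yit are drawn to the left of their consonant, both are stored after it; the reordering is left to the rendering engine and the font.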

At first, such reordering of characters before rendering was beyond what any text processing application could handle, which was, of course, unacceptable to those who must use the script daily. So workarounds, or plain old hacks, were devised. Most of them involved visual ordering of the characters for input as well as encoding, contrary to what the Unicode Standard prescribes. At the same time, fonts which allowed such visual ordering were developed. A well-known ad hoc encoding along these lines is called Zawgyi. It must be stressed that text encoded in such an ‘illegal’ way is unusable in proper Unicode-oriented text flows.
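Incoming text can be given a rough first check for visual ordering. The sketch below is a heuristic written for this page, not an established Zawgyi detector: it assumes that in logically ordered text the vowel sign e (U+1031) always follows a consonant, an independent vowel or a medial from the core Burmese range, and it flags any occurrence that does not. The function names are hypothetical, letters used only for other languages (Shan, Mon, Karen, etc.) are not covered, and visually ordered text can still slip through.

```python
E_VOWEL = "\u1031"  # MYANMAR VOWEL SIGN E

def _may_precede_e(ch):
    """Assumed set of characters that can come directly before U+1031:
    letters U+1000..U+102A and medials/great sa U+103B..U+103F."""
    return ("\u1000" <= ch <= "\u102a") or ("\u103b" <= ch <= "\u103f")

def looks_visually_ordered(text):
    """Return True if a vowel sign e is stored before its consonant."""
    for i, ch in enumerate(text):
        if ch == E_VOWEL and (i == 0 or not _may_precede_e(text[i - 1])):
            return True
    return False

print(looks_visually_ordered("\u1014\u1031"))  # logical order: na, e
print(looks_visually_ordered("\u1031\u1014"))  # visual order: e, na
```

A positive result is a reason to ask the author for properly encoded Unicode text; a negative result proves nothing on its own.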

Up-to-date information on encoding

For up-to-date information on Burmese in Unicode, for full explanations of and rules for the proper ordering of Burmese characters, and for the Burmese script as used for other languages, the document Unicode Technical Note #11. Representing Myanmar in Unicode: Details and Examples is indispensable.

Line breaking

Line breaking in Burmese-script text demands some attention, and little software has been developed to aid users in this. Burmese-script text does not use spaces between words, but only between phrases which in Latin-script text would be marked with commas or (semi)colons. This makes it difficult for a typesetter, who will in all likelihood not be familiar with the script and the language(s), to properly format paragraphs, especially if these are justified. Therefore, instruct typesetters to format Burmese-script-only paragraphs left-aligned, i.e., ragged-right.

One may break any string of Burmese-script text between syllables (Burmese words are predominantly monosyllabic) using U+200B ZERO WIDTH SPACE (ZWSP); finding syllable boundaries can be quite tricky, however. This requires either knowledge of language and script, or a sophisticated line breaking algorithm along the lines set out in Unicode Technical Note #11. Representing Myanmar in Unicode: Details and Examples, pp. 12-13.
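As an illustration of how ZWSP insertion might be automated, the following Python sketch uses a deliberately simplified rule and is not the algorithm of Unicode Technical Note #11. It assumes that a new syllable starts at any core consonant (U+1000 to U+1021) that is not the lower part of a stack (preceded by U+1039) and is not itself a final or stacking consonant (followed by U+103A or U+1039, optionally after the dot below U+1037). The names `insert_zwsp` and `_SYLLABLE_START` are made up for this example; special spellings and the extended letters used for other languages are not handled.

```python
import re

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Simplified assumption about where a syllable begins (see lead-in).
_SYLLABLE_START = re.compile(
    "(?<!\u1039)"                # not the subjoined consonant of a stack
    "[\u1000-\u1021]"            # a core consonant letter
    "(?!\u1037?[\u103a\u1039])"  # not a final (asat) or stacking (virama) consonant
)

def _is_myanmar(ch):
    return "\u1000" <= ch <= "\u109f"

def insert_zwsp(text):
    """Insert ZWSP before each assumed syllable start inside a Burmese run."""
    def repl(match):
        i = match.start()
        # Only break between two Myanmar characters, never at the start of a
        # run or directly after a space, punctuation mark or existing ZWSP.
        if i > 0 and _is_myanmar(text[i - 1]):
            return ZWSP + match.group(0)
        return match.group(0)
    return _SYLLABLE_START.sub(repl, text)

marked = insert_zwsp("\u1019\u103c\u1014\u103a\u1019\u102c")
print([f"U+{ord(ch):04X}" for ch in marked])
```

Output from a heuristic like this should be checked by someone who reads the language before the text goes to the typesetter.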

Marking emphasis

Emphasis is marked by using bold type. If a tertiary text category must be marked, colour must be used. On no account may underlining be used, because the script has many descenders and also exhibits stacking behaviour: underlining will clash with descenders and stacked conjuncts, become indistinct, and render text less than readable.

Fonts

The typeface chosen for the Burmese/Myanmar script is SIL International’s Padauk. Its most recent version is 5.001. The two fonts Brill typesetters should use in PDF production according to the BTS are Padauk Book and Padauk Book Bold. These fonts are well designed and support not only the Burmese/Myanmar language but also most, if not all, of the languages listed above.

Type specifications

Burmese/Myanmar-script text: single words or short phrases embedded in Latin-script paragraphs

  • Brill 11 pt: ~ Padauk Book 9 pt

  • Brill 10 pt: ~ Padauk Book 9 pt

  • Brill 9 pt: ~ Padauk Book 7 pt

The above values are preliminary. They have been chosen for Burmese-script text appearing within a Latin-dominated context, not for Burmese-script-only paragraphs.
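For anyone scripting a stylesheet, the pairings listed above can be kept as a small lookup table. The sketch below is in Python; the names `EMBEDDED_PADAUK_PT` and `padauk_size_for` are invented for this example, and the values are the preliminary ones given above.

```python
# Approximate Padauk Book size (pt) for embedded Burmese text,
# keyed by the size (pt) of the surrounding Brill type.
EMBEDDED_PADAUK_PT = {11: 9, 10: 9, 9: 7}

def padauk_size_for(brill_pt):
    """Return the Padauk Book size to pair with a given Brill size."""
    try:
        return EMBEDDED_PADAUK_PT[brill_pt]
    except KeyError:
        raise ValueError(f"no Padauk size specified for Brill {brill_pt} pt") from None

print(padauk_size_for(11))
```

Sizes not listed raise an error rather than being interpolated, since no values are specified for them.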

Burmese/Myanmar-script-only paragraphs

Apart from the first line of such paragraphs or contiguous groups of paragraphs, text in these paragraphs will not conform to the regular BTS baseline grid. In order to avoid any clashes between descenders in one line, or very tall character stacks, and ascenders in the next, these paragraphs will have an increased leading. The type sizes for body text and for footnote or index text are also increased slightly to enhance readability of longer runs of Burmese/Myanmar-script text.

  • Body text: ~ Padauk Book 10 pt; leading 20½ pt

  • Appendix with smaller Burmese/Myanmar type: ~ Padauk Book 9 pt; leading 20½ pt

  • Footnotes consisting entirely of Burmese/Myanmar type: ~ Padauk Book 8 pt; leading 16 pt

Paragraph alignment: align left, ragged right. This is meant to avoid any unsightly rivers of white space which would occur otherwise in fully justified text: Burmese/Myanmar-script text contains far fewer spaces than, e.g., English-language text, for spaces in the Burmese script indicate pauses which in Latin-script text would be marked with commas or (semi)colons, but they do not mark Burmese/Myanmar word breaks. Also, there is no hyphenation (word division at the end of lines) in Burmese/Myanmar text.
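The paragraph settings above can be summarised as data, which also makes the leading-to-size ratio easy to compare across text categories. This is an illustrative Python sketch only; the class `ParagraphSpec`, its field names and the dictionary keys are assumptions made for this example and do not correspond to any typesetting system's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParagraphSpec:
    font: str
    size_pt: float
    leading_pt: float
    alignment: str = "left"   # align left, ragged right
    hyphenate: bool = False   # no hyphenation in Burmese/Myanmar text

    @property
    def leading_ratio(self):
        """Leading divided by type size."""
        return self.leading_pt / self.size_pt

SPECS = {
    "body": ParagraphSpec("Padauk Book", 10, 20.5),
    "appendix": ParagraphSpec("Padauk Book", 9, 20.5),
    "footnote": ParagraphSpec("Padauk Book", 8, 16),
}

for name, spec in SPECS.items():
    print(name, spec.size_pt, spec.leading_pt, round(spec.leading_ratio, 2))
```

In every category the leading is at least twice the type size, which is what keeps descenders and tall stacks in one line clear of the next.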