
Version 1.0, 26 January 2018

Version history:

  • 0.5, 25 January 2018

  • 1.0, 26 January 2018

The Khmer script

The Khmer writing system is a complex Southeast Asian abugida. It is written from left to right. For a brief description of the script, see ScriptSource.org.

Languages written in the Khmer script

Khmer, Sanskrit, Pali, Bunong (a.k.a. Phnong), Cham, Krung, Tampuan.

Encoding

Correct ordering of components in Khmer orthographical syllables is essential for correct rendering of Khmer text and for reliable search and sort functions in software. Unfortunately, “[t]he implementation of Khmer Unicode differs from font to font and from operating system to operating system,” at least according to the 2012 document The Mondulkiri Font Family bundled with version 7.1 of the Mondulkiri fonts archive, available from SIL. Unicode-encoded textual data is generally stable, but this may not hold for Khmer texts, and users should be aware of this possibility.

When problems with Khmer text are detected, either because a proofreading author has to mark some corrections consistently, or because a typesetter suddenly notices Khmer text ‘breaking’ and dotted circles (◌) cropping up, the first thing to check is the encoding of the troublesome characters.
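One way to check the encoding is to list the code points of a troublesome string in their stored order. The following is a minimal sketch in Python using the standard unicodedata module; the function name describe_code_points and the sample string are illustrative assumptions, not part of any tool prescribed by this guide.

```python
import unicodedata


def describe_code_points(text):
    """Return (code point, Unicode name) pairs for each character, in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Hypothetical sample: KA + COENG + KA, written as escapes so that the
# stored order is unambiguous regardless of how an editor renders it.
sample = "\u1780\u17D2\u1780"
for code_point, name in describe_code_points(sample):
    print(code_point, name)
```

The resulting list can then be compared by eye with the standard order of syllabic components described in the sources cited below.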

To fix any encoding errors, a standard order of syllabic components must be (re)established; the first reliable source for a standard order is, of course, the text of The Unicode Standard, in the chapter and section dealing with the Khmer script (in the current [2022] version 14.0, chapter 16 “Southeast Asia,” section 4 “Khmer,” pp. 669ff.). An extended document with some improvements on the Unicode Standard definition of the Khmer orthographical syllable is published by SIL, to accompany SIL’s Mondulkiri fonts: The order of components in Khmer orthographical syllables bundled with version 7.1 of the Mondulkiri fonts archive.

Marking Emphasis

Emphasis is marked by using bold type. If a tertiary text category must be marked, colour must be used. On no account may underlining be used, because the script has many descenders and also exhibits stacking behaviour: underlining will clash with descenders and stacked conjuncts, become indistinct, and render text less than readable.

Fonts

The font family chosen for the Khmer script is SIL International’s Khmer Mondulkiri. The two fonts Brill typesetters should use in PDF production according to the BTS are Khmer Busra Regular and Khmer Busra Bold. These fonts have extensive documentation bundled with the downloadable Mondulkiri fonts archive. They also provide alternative shapes and/or positions of various characters, through both OpenType Stylistic Sets and Apple’s AAT Typography options, all of which are described in the documentation.

Type specifications

Khmer-script text: single words or short phrases embedded in Latin-script paragraphs

  • Brill 11 pt: ~ Khmer Busra 8 pt

  • Brill 10 pt: ~ Khmer Busra 8 pt

  • Brill 9 pt: ~ Khmer Busra 7 pt

The above values are preliminary. They have been chosen for Khmer-script text appearing within a Latin-dominated context, not for Khmer-script-only paragraphs.
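For typesetters who script their style definitions, the size pairs above could be kept in a small lookup table. This is only a sketch: the names EMBEDDED_KHMER_SIZE_PT and khmer_size_for are hypothetical, and the values are copied from the preliminary list above.

```python
# Brill body size (pt) -> Khmer Busra size (pt) for embedded words and phrases.
# Values copied from the preliminary list above; names are illustrative only.
EMBEDDED_KHMER_SIZE_PT = {11: 8, 10: 8, 9: 7}


def khmer_size_for(brill_size_pt):
    """Return the Khmer Busra size for a given Brill size, or raise ValueError."""
    try:
        return EMBEDDED_KHMER_SIZE_PT[brill_size_pt]
    except KeyError:
        raise ValueError(
            f"No embedded Khmer size is specified for Brill {brill_size_pt} pt"
        ) from None
```

Raising an error for unlisted sizes, rather than guessing by interpolation, keeps the table limited to the values actually specified.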

Khmer-script-only paragraphs

Apart from the first line of such paragraphs or contiguous groups of paragraphs, text in these paragraphs will not conform to the regular BTS baseline grid. In order to avoid any clashes between descenders in one line, or very tall character stacks, and ascenders in the next, these paragraphs will have an increased leading. The type sizes for body text and for footnote or index text are also increased slightly to enhance readability of longer runs of Khmer-script text.

  • Body text: ~ Khmer Busra 9 pt; leading 20½ pt

  • Appendix with smaller Khmer type: ~ Khmer Busra 8 pt; leading 20½ pt

  • Footnotes consisting entirely of Khmer type: ~ Khmer Busra 8 pt; leading 16 pt

Paragraph alignment: align left, ragged right. This avoids the unsightly rivers of white space that would otherwise occur in fully justified text. Khmer-script text contains far fewer spaces than, e.g., English-language text: spaces in the Khmer script indicate pauses which in Latin-script text would be marked with commas or (semi)colons, and they do not mark Khmer word breaks. Also, there is no hyphenation (word division at the end of lines) in Khmer text.