
Version 0.3.1,

...

12 May 2022

Version history:

  • 0.1, 22 June 2020
  • 0.2, 9 August 2021
  • 0.3, 9 May 2022
  • 0.3.1, 12 May 2022

Scope of this document

The number of linguistics characters in the Unicode Standard is enormous. No attempt is made here to cover all of them. The following are observations of phenomena that have had an impact on Brill’s treatment of linguistic texts. It should be noted that the term ‘linguistics’ can cover the study of specific languages; the study of ‘language’ as such (sometimes called ‘theoretical linguistics’); comparative linguistics; and philology, which is the study of all sorts of language phenomena within the context of traditional scholarly disciplines, such as Classical Studies, theology, Semitic Studies, Arabic Studies, Sinology, and so on.

...

Because these characters differ only subtly in appearance, it is important to check (or spot-check) them by code point. The easiest way to do this in MS Office (Windows) is to copy the character whose Unicode value you wish to know from its source and paste it into a Word document. With the insertion point positioned just after the pasted character, press Alt+X, which converts the character to its Unicode hexadecimal value (pressing Alt+X again toggles it back to the character). On macOS, you can use Character Viewer (sometimes referred to as ‘Emoji & Symbols’): paste the character into its Search field and its code point appears next to ‘Unicode’, as a hexadecimal value prefixed with ‘U+’.
For more information, see Using Unicode hexadecimal codes.
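
Where many characters need checking at once, the same lookup can be scripted. The following is a minimal sketch in Python, not part of Brill’s workflow; the function name `describe` is hypothetical, and the character names come from the Unicode data bundled with the Python interpreter, so very recently encoded characters may be reported as unnamed.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string for each character in text."""
    lines = []
    for ch in text:
        # Fall back to a placeholder when this Python build has no name
        # for the character (for example, an unassigned code point).
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X} {name}")
    return lines


# Latin alpha and Greek alpha look alike but have different code points.
for line in describe("\u0251\u03b1"):
    print(line)
# prints:
# U+0251 LATIN SMALL LETTER ALPHA
# U+03B1 GREEK SMALL LETTER ALPHA
```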

Latin twins in the Brill typeface

...

Character | Code point | Name | Remarks
ɑ | U+0251 | Latin alpha or ‘script a’ | There is a capital, Ɑ, U+2C6D, but this forms part of several Cameroon language orthographies, and it occurs but rarely in strictly linguistic contexts. Note also the existence of ᵅ U+1D45, ɒ U+0252, ᶛ U+1D9B, ꭤ U+AB64, and ꬰ U+AB30.
ꞵ | U+A7B5 | Latin beta | There is a capital, Ꞵ, U+A7B4, but this forms part of Gabonese orthographies, and it occurs but rarely in strictly linguistic contexts. Note the availability of the Latin glyph shape of Greek beta U+03B2 in the pre-version-4 Brill fonts through application of OpenType ss20.
ɣ | U+0263 | Latin gamma | There is a capital, Ɣ, U+0194, but this forms part of some African orthographies, and it occurs but rarely in strictly linguistic contexts. Note also the existence of ˠ U+02E0 Superscript Latin gamma, and ɤ U+0264 ‘Baby gamma’ or ‘ram’s horns’.
ẟ | U+1E9F | Latin delta or ‘script d’ or ‘insular d’ | Note also the existence of ƍ U+018D turned delta.
ɛ | U+025B | Latin epsilon or ‘open e’ | There is a capital, Ɛ, U+0190, but this forms part of some African (Niger-Congo) orthographies, and it occurs but rarely in strictly linguistic contexts. Note also the existence of ᶓ U+1D93, ɜ U+025C, ᶔ U+1D94, ɝ U+025D, ᶟ U+1D9F, ɞ U+025E, ʚ U+029A, ᴈ U+1D08, ᵋ U+1D4B, and ᵌ U+1D4C.
θ | U+03B8 | Latin theta | This character has not yet been encoded in Unicode. The Latin glyph shape of Greek theta U+03B8 in the Brill fonts is accessible by application of OpenType ss20.
ɩ | U+0269 | Latin iota | There is a capital, Ɩ, U+0196, but this forms part of some African (Niger-Congo) orthographies, and it occurs but rarely in strictly linguistic contexts. Note also the existence of ᶥ U+1DA5 and ᵼ U+1D7C. Do not confuse with ꙇ Cyrillic iota U+A647.
λ | U+03BB | Latin lambda | This character has not yet been encoded in Unicode. The Latin glyph shape of Greek lambda U+03BB in the Brill fonts is accessible by application of OpenType ss20. Note also the existence of ƛ U+019B.
ʊ | U+028A | Latin upsilon | There is a capital, Ʊ, U+01B1, but this forms part of some African (Niger-Congo) orthographies, and it occurs but rarely in strictly linguistic contexts. Note also the existence of ᵿ U+1D7F and ᶷ U+1DB7.
ɸ | U+0278 | Latin phi | Note also the existence of ᶲ U+1DB2 and ⱷ U+2C77.
ꭓ | U+AB53 | Latin khi | Note the availability of the Latin glyph shape of Greek khi U+03C7 in the pre-version-4 Brill fonts through application of OpenType ss20. There is a capital, Ꭓ, U+A7B3, but this is only used in German dialectology. Note also the existence of ꭔ U+AB54 and ꭕ U+AB55.
ꞷ | U+A7B7 | Latin omega | There is a capital, Ꞷ, U+A7B6. Both are used in African orthographies. Note also the existence of ɷ U+0277 and ꭥ U+AB65.
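
The pairs in the table above lend themselves to an automated spot-check. The sketch below is an illustration only, assuming the aim is to flag Greek letters in a Latin-script linguistic text so that an editor can decide whether the Latin twin was intended. The names `LATIN_TWINS` and `find_greek_with_latin_twin` are hypothetical, and the Greek code points on the left-hand side are supplied here rather than taken from the table. Theta and lambda are left out because they have no separately encoded Latin twin.

```python
# Greek letter -> separately encoded Latin twin (right-hand values are
# the code points listed in the table; left-hand values are the ordinary
# Greek lowercase letters, added here as an assumption of this sketch).
LATIN_TWINS = {
    "\u03b1": "\u0251",  # alpha   -> Latin alpha
    "\u03b2": "\ua7b5",  # beta    -> Latin beta
    "\u03b3": "\u0263",  # gamma   -> Latin gamma
    "\u03b4": "\u1e9f",  # delta   -> Latin delta
    "\u03b5": "\u025b",  # epsilon -> Latin epsilon
    "\u03b9": "\u0269",  # iota    -> Latin iota
    "\u03c5": "\u028a",  # upsilon -> Latin upsilon
    "\u03c6": "\u0278",  # phi     -> Latin phi
    "\u03c7": "\uab53",  # khi     -> Latin khi
    "\u03c9": "\ua7b7",  # omega   -> Latin omega
}


def find_greek_with_latin_twin(text):
    """Return (index, found code point, Latin twin code point) triples."""
    hits = []
    for i, ch in enumerate(text):
        twin = LATIN_TWINS.get(ch)
        if twin is not None:
            hits.append((i, f"U+{ord(ch):04X}", f"U+{ord(twin):04X}"))
    return hits
```

The function only reports candidates and never replaces anything, since a Greek letter may be entirely correct in a passage of Greek text.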

...

In linguistics, the following non-literal symbols are often confused:

WRONG character | Code point | Name | CORRECT character | Code point | Name | Remarks
Ø | U+00D8 | Latin capital letter O with stroke | ∅ | U+2205 | Empty set | The ‘empty set’ is used in linguistics to denote a zero morpheme (null morpheme) or zero-grade ablaut (or phonological ‘zero’). Often submitted by authors as Capital letter O with stroke.
= | U+003D | Equals sign | ⸗ | U+2E17 | Double oblique hyphen | The ‘double oblique hyphen’ is often used in grammars, as a clitic marker or morpheme boundary marker. Often submitted by authors as Equals sign.
“; ``; ''; " | U+201C; U+0060 (2×); U+0027 (2×); U+0022 | Left double quotation mark; Grave accent (2×); Apostrophe (2×); Quotation mark | ʺ | U+02BA | Modifier letter double prime | Used to transliterate the Cyrillic hard sign Ъ ъ (capital and lowercase) in the Latin script. Note that the double prime ʺ consists of just one U+02BA character and that it exhibits no casing behaviour.
‘; `; ' | U+2018; U+0060; U+0027 | Left single quotation mark; Grave accent; Apostrophe | ʹ | U+02B9 | Modifier letter prime | Used to transliterate the Cyrillic soft sign Ь ь (capital and lowercase) in the Latin script. Note that the single prime ʹ U+02B9 exhibits no casing behaviour.
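
A submitted manuscript can be scanned for these look-alikes before copy-editing. The following Python sketch is a hedged illustration built on the table above; the names `SUSPECTS` and `find_suspects` are hypothetical. It only lists candidates, because an equals sign or a quotation mark is of course correct in most contexts and each hit needs an editor’s judgement.

```python
# (wrong sequence, suggested character) pairs from the table above.
# Two-character sequences come first so that they are matched before
# their single-character counterparts.
SUSPECTS = [
    ("``", "\u02ba"),      # two grave accents    -> double prime
    ("''", "\u02ba"),      # two apostrophes      -> double prime
    ("\u201c", "\u02ba"),  # left double quote    -> double prime
    ('"', "\u02ba"),       # quotation mark       -> double prime
    ("\u2018", "\u02b9"),  # left single quote    -> prime
    ("`", "\u02b9"),       # grave accent         -> prime
    ("'", "\u02b9"),       # apostrophe           -> prime
    ("\u00d8", "\u2205"),  # O with stroke        -> empty set
    ("=", "\u2e17"),       # equals sign          -> double oblique hyphen
]


def find_suspects(text):
    """Return (index, matched text, suggested code point) triples."""
    hits = []
    i = 0
    while i < len(text):
        for wrong, right in SUSPECTS:
            if text.startswith(wrong, i):
                hits.append((i, wrong, f"U+{ord(right):04X}"))
                i += len(wrong)  # skip past the whole matched sequence
                break
        else:
            i += 1  # no suspect starts here; move on one character
    return hits
```

For example, `find_suspects("a=b")` returns `[(1, '=', 'U+2E17')]`, pointing the editor to a possible clitic or morpheme boundary marker.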