The long march to Unicode: a digital approach to variability in Icibemba orthography
Abstract
The study examined the orthography of Icibemba, a Bantu language spoken in Zambia
and Congo-Kinshasa. On the basis of its findings, it recommended adopting Unicode
standards wherever different letters (graphemes) represented the same speech sound
(allophone). First, the monograph symbol <c> and the digraph symbol <ch> represented
the same speech sound, i.e. the voiceless affricate <ch> as in batch (IPA [tʃ]); in the
examples below, the starred form (*) is the inconsistent one:
Cl.3/4 <umucila> → [umucila] → /u–mu–cila/ ‘tail’
Cl.3/4 *<umuchila> → [umucila] → /u–mu–cila/ ‘tail’
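As a minimal sketch of how the <c>/<ch> inconsistency illustrated above could be handled digitally, the Python function below rewrites every <ch> as <c>. The function name `normalize_affricate` is hypothetical, and the sketch assumes that <ch> never stands for anything other than the affricate in Icibemba text; neither is taken from the study.

```python
import re

# Assumption (not from the study): every <ch> in Icibemba text spells the
# same affricate as <c>, so the digraph can be rewritten without a lexicon.
_CH = re.compile(r"ch", re.IGNORECASE)


def normalize_affricate(word):
    """Rewrite the digraph <ch> as the monograph <c>, preserving capitalisation."""

    def _replace(match):
        # Keep an upper-case result if the digraph started with a capital letter.
        return "C" if match.group(0)[0].isupper() else "c"

    return _CH.sub(_replace, word)


# Usage: the starred spelling is mapped onto the consistent one.
normalize_affricate("umuchila")  # -> "umucila"
```

A word that already uses <c> passes through unchanged, so the function can be applied to a whole word list without first separating the consistent spellings from the inconsistent ones.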
Secondly, the monograph symbol <V> and the digraph symbol <VV> were not being
used to distinguish short and long vowels, contrary to the syllable structure of Icibemba:
Cl.14 <ubuuci> → [uβuci] → /u–bu–uci/ ‘honey’
Cl.14 *<ubuchi> → [uβuci] → /u–bu–uci/ ‘honey’
Thirdly, certain graphemes represented allophones irregularly, ignoring more suitable
options dictated by the particular structural environment. Examples included:
(1) The use of <r> instead of <l>:
*<Mporokoso> → <Mpolokoso> → [mpolokoso] → /Ø–mpolokoso/ ‘a district in
northern Zambia’;
(2) The use of <w> instead of <b>:
*<Luwingu> → <Lubingu> → [luβingu] → /Ø–luβingu/ ‘a district in northern Zambia’.
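Because the choice between <r> and <l>, or between <w> and <b>, depends on the structural environment, a blanket letter substitution would be unsafe. One cautious way to sketch the respellings above is a lookup table of attested forms. The table `PREFERRED` and the function `respell` are hypothetical names; the table holds only the two place names cited here, whereas a working tool would presumably draw on a full lexicon.

```python
import re

# Only the two respellings given in the text; everything else is left untouched.
PREFERRED = {
    "Mporokoso": "Mpolokoso",  # <r> replaced by <l>
    "Luwingu": "Lubingu",      # <w> replaced by <b>
}

# Match the listed spellings as whole words only.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, PREFERRED)) + r")\b")


def respell(text):
    """Replace each listed irregular spelling with its preferred form."""
    return _PATTERN.sub(lambda m: PREFERRED[m.group(1)], text)


respell("Luwingu and Mporokoso")  # -> "Lubingu and Mpolokoso"
```

A lookup keeps the sketch from over-applying a rule to words in which <r> or <w> may be the appropriate grapheme.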