Last Updated: 2020-07-02

Background

Language Family: Turkic / Common Turkic / Karluk / Uyghur

Phonology

Consonants

  • Memtimin (2016) lists the aspirated forms of the voiceless stops as the underlying phonemes; Hahn (1991) does not, but does include aspirated variants of those sounds (plus /tʃ/) as common allophones. I have judged it more likely that the plain (unaspirated) forms of these consonants are the underlying ones.
  • According to Hahn (1991), /v/ is phonemic in some nonstandard dialects (p. 59); according to Memtimin (2016), it’s an allophone of /w/. Either way, it does not appear to merit inclusion in the ruleset.
  • Hahn (1991) lists /χ/ and /ɢ/ rather than /x/ and /ɣ/ (p. 59); interestingly though, he mentions that “in most environments /ɢ/ is phonetically realized as a fricative” (p. 60).
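
| Manner / Place of Articulation | Labial | Alveolar | Postalveolar | Palatal | Velar | Uvular | Glottal |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Stops | p b | t d | | | k ɡ | q | |
| Affricates | | | tʃ dʒ | | | | |
| Fricatives | f | s z | ʃ ʒ | | x ɣ | | h |
| Nasals | m | n | | | ŋ | | |
| Trills | | r | | | | | |
| Approximants | w | l | | j | | | |
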
Note: For phonemes that share a cell, those on the left are voiceless, whereas those on the right are voiced.

Vowels

  • /e/ is rare in native words (Hahn 1991, 33).

| | Front | Back |
| --- | --- | --- |
| High | i y | u |
| Mid | e ø | o |
| Low | æ | ɑ |

Note: For phonemes that share a cell, those on the left are unrounded, whereas those on the right are rounded.

Alphabet

The Uyghur Arabic alphabet is the most common orthography for Uyghur; it has been the official script of the language since 1982, and it has by far the largest Crúbadán corpus of the various Uyghur scripts. Cyrillic and Latin scripts also exist and are covered below, though no ruleset was built for the Latin script.

Arabic

  • Vowels in the Uyghur Arabic alphabet are formed from the combination of two characters. The first is ⟨ئ⟩, which carries no phonetic value but serves as a “base” upon which different vowel characters are appended; the second is the affixed character, which marks which of the vowels is being written.
  • For each vowel, the table below shows both the affix and the full combined character. In the Rmd file, the two columns are swapped for phonemes that have both a grapheme and an affix: the affix is in the grapheme column and the grapheme is in the affix column. Rendering the HTML document flips them back, so the table displays the correct representation.
  • Note: /ʒ/ is not represented in the Uyghur (Arabic script) Crúbadán corpus.
Grapheme (Vowel) Affix Phoneme
ئا ا /ɑ/
ئە ە /æ/
ب /b/
پ /p/
ت /t/
ج /dʒ/
چ /tʃ/
خ /x/
د /d/
ر /r/
ز /z/
ژ /ʒ/
س /s/
ش /ʃ/
غ /ɣ/
ف /f/
ق /q/
ك /k/
گ /ɡ/
ڭ /ŋ/
ل /l/
م /m/
ن /n/
ھ /h/
ئو و /o/
ئۇ ۇ /u/
ئۆ ۆ /ø/
ئۈ ۈ /y/
ۋ /w/
ئې ې /e/
ئى ى /i/
ي /j/
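
The table above can be read as a simple lookup, sketched below in Python. This is only an illustration under stated assumptions: the names `HAMZA_CARRIER`, `ARABIC_TO_PHONEME`, and `transcribe_arabic` are hypothetical, the code points are assumed to be the standard Unicode characters for the letters shown, and the carrier ⟨ئ⟩ is simply skipped because it is described above as having no phonetic value. No context-sensitive rules are modeled.

```python
# Illustrative sketch only: names are hypothetical, and the code points are
# assumed to be the standard Unicode characters for the letters in the table.
HAMZA_CARRIER = "\u0626"  # ئ, the vowel "base" with no phonetic value

ARABIC_TO_PHONEME = {
    "\u0627": "ɑ",   # ا
    "\u06d5": "æ",   # ە
    "\u0628": "b",   # ب
    "\u067e": "p",   # پ
    "\u062a": "t",   # ت
    "\u062c": "dʒ",  # ج
    "\u0686": "tʃ",  # چ
    "\u062e": "x",   # خ
    "\u062f": "d",   # د
    "\u0631": "r",   # ر
    "\u0632": "z",   # ز
    "\u0698": "ʒ",   # ژ
    "\u0633": "s",   # س
    "\u0634": "ʃ",   # ش
    "\u063a": "ɣ",   # غ
    "\u0641": "f",   # ف
    "\u0642": "q",   # ق
    "\u0643": "k",   # ك
    "\u06af": "ɡ",   # گ
    "\u06ad": "ŋ",   # ڭ
    "\u0644": "l",   # ل
    "\u0645": "m",   # م
    "\u0646": "n",   # ن
    "\u06be": "h",   # ھ
    "\u0648": "o",   # و
    "\u06c7": "u",   # ۇ
    "\u06c6": "ø",   # ۆ
    "\u06c8": "y",   # ۈ
    "\u06cb": "w",   # ۋ
    "\u06d0": "e",   # ې
    "\u0649": "i",   # ى
    "\u064a": "j",   # ي
}


def transcribe_arabic(word):
    """Map each character to its phoneme, skipping the silent carrier."""
    phonemes = []
    for ch in word:
        if ch == HAMZA_CARRIER:
            continue  # the carrier contributes no sound of its own
        if ch not in ARABIC_TO_PHONEME:
            raise ValueError(f"unmapped character: {ch!r}")
        phonemes.append(ARABIC_TO_PHONEME[ch])
    return phonemes
```

For example, `transcribe_arabic("\u0626\u06c7\u064a\u063a\u06c7\u0631")` (the word ئۇيغۇر) returns `['u', 'j', 'ɣ', 'u', 'r']`.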

Cyrillic

Grapheme Phoneme
а /ɑ/
ә /æ/
б /b/
п /p/
т /t/
җ /dʒ/
ч /tʃ/
х /x/
д /d/
р /r/
з /z/
ж /ʒ/
с /s/
ш /ʃ/
ғ /ɣ/
ф /f/
қ /q/
к /k/
г /ɡ/
ң /ŋ/
л /l/
м /m/
н /n/
һ /h/
о /o/
у /u/
ө /ø/
ү /y/
в /w/
е /e/
и /i/
й /j/
ю /ju/
я /jɑ/
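
Unlike the Arabic table, the Cyrillic table contains graphemes that stand for two phonemes (⟨ю⟩ and ⟨я⟩). A minimal Python sketch of how such a one-to-many mapping might be handled follows. The names `CYRILLIC_SINGLE`, `CYRILLIC_MULTI`, and `transcribe_cyrillic` are hypothetical, lowercasing the input is an assumption, and the vowel of ⟨я⟩ is assumed to be the inventory's /ɑ/.

```python
# Illustrative sketch only; names are hypothetical.
CYRILLIC_SINGLE = {
    "а": "ɑ", "\u04d9": "æ",            # \u04d9 = ә (Cyrillic schwa)
    "б": "b", "п": "p", "т": "t",
    "\u0497": "dʒ",                      # җ
    "ч": "tʃ", "х": "x", "д": "d", "р": "r", "з": "z",
    "ж": "ʒ", "с": "s", "ш": "ʃ",
    "\u0493": "ɣ",                       # ғ
    "ф": "f",
    "\u049b": "q",                       # қ
    "к": "k", "г": "ɡ",
    "\u04a3": "ŋ",                       # ң
    "л": "l", "м": "m", "н": "n",
    "\u04bb": "h",                       # һ
    "о": "o", "у": "u",
    "\u04e9": "ø",                       # ө
    "\u04af": "y",                       # ү
    "в": "w", "е": "e", "и": "i", "й": "j",
}

# Graphemes that expand to a sequence of two phonemes.
CYRILLIC_MULTI = {
    "ю": ["j", "u"],
    "я": ["j", "ɑ"],  # assumption: the low vowel here is the inventory's /ɑ/
}


def transcribe_cyrillic(word):
    """Lowercase the word, then expand each grapheme to one or two phonemes."""
    phonemes = []
    for ch in word.lower():
        if ch in CYRILLIC_MULTI:
            phonemes.extend(CYRILLIC_MULTI[ch])
        elif ch in CYRILLIC_SINGLE:
            phonemes.append(CYRILLIC_SINGLE[ch])
        else:
            raise ValueError(f"unmapped character: {ch!r}")
    return phonemes
```

Keeping the two-phoneme graphemes in a separate dictionary makes the exceptional cases easy to spot; a single dictionary with list values would work equally well.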

Latin

  • I opted not to build a ruleset for the Latin script, because the Crúbadán corpus for Latin-script Uyghur only has around 150,000 words. See Janbaz, Saleh, and Duval (2006) for further insight on this script.
Grapheme Phoneme Comment
a /ɑ/
e /æ/
b /b/
p /p/
t /t/
j /dʒ/
x /x/
d /d/
r /r/
z /z/
s /s/
f /f/
q /q/
k /k/
g /ɡ/
l /l/
m /m/
n /n/
h /h/
o /o/
u /u/
ö /ø/
ü /y/
w /w/
ë /e/ sometimes ⟨é⟩ (Janbaz, Saleh, and Duval 2006, 9)
i /i/
y /j/
Digraph
ch /tʃ/
zh /ʒ/
sh /ʃ/
gh /ɣ/
ng /ŋ/
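
Although no ruleset was built for this script, the digraphs in the table show why a Latin-script mapping cannot be applied one character at a time. The Python sketch below tries two-character graphemes before single letters. It is only an illustration: the names `LATIN_SINGLE`, `LATIN_DIGRAPH`, and `transcribe_latin` are hypothetical, and greedy matching is an assumption that would mis-segment any word in which, say, ⟨n⟩ and ⟨g⟩ are separate sounds.

```python
import unicodedata

# Illustrative sketch only; names are hypothetical.
LATIN_SINGLE = {
    "a": "ɑ", "e": "æ", "b": "b", "p": "p", "t": "t", "j": "dʒ",
    "x": "x", "d": "d", "r": "r", "z": "z", "s": "s", "f": "f",
    "q": "q", "k": "k", "g": "ɡ", "l": "l", "m": "m", "n": "n",
    "h": "h", "o": "o", "u": "u",
    "\u00f6": "ø",   # ö
    "\u00fc": "y",   # ü
    "w": "w",
    "\u00eb": "e",   # ë
    "\u00e9": "e",   # é, the variant spelling noted in the table
    "i": "i", "y": "j",
}

LATIN_DIGRAPH = {"ch": "tʃ", "zh": "ʒ", "sh": "ʃ", "gh": "ɣ", "ng": "ŋ"}


def transcribe_latin(word):
    """Greedy left-to-right scan that prefers digraphs over single letters."""
    # NFC folds "e" + combining diaeresis into the precomposed "ë".
    text = unicodedata.normalize("NFC", word).lower()
    phonemes = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LATIN_DIGRAPH:
            phonemes.append(LATIN_DIGRAPH[pair])
            i += 2
        elif text[i] in LATIN_SINGLE:
            phonemes.append(LATIN_SINGLE[text[i]])
            i += 1
        else:
            raise ValueError(f"unmapped character: {text[i]!r}")
    return phonemes
```

For example, `transcribe_latin("Uyghur")` returns `['u', 'j', 'ɣ', 'u', 'r']`, with ⟨gh⟩ consumed as one grapheme.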

Syllable Structure

Lenition Rules

Misc. Rules

References

Hahn, Reinhard F. 1991. Spoken Uyghur. University of Washington Press.

Janbaz, Waris Abdukerim, Imad Saleh, and Jean Rahman Duval. 2006. “An Introduction to Latin-Script Uyghur.” In.

Memtimin, Aminem. 2016. Language Contact in Modern Uyghur. Harrassowitz Verlag.