Last Updated: 2019-12-05

COMPROMISED: some ambiguity in the transcription of alif; some conflation between /w/ and /uː/ and between /j/ and /iː/

Background

Language Family: Afro-Asiatic / Semitic / Central / South / Arabic

Phonology

Consonants

  • Arabic includes what are called emphatic consonants, which are produced when the back, or the root, of the tongue retracts towards the pharynx (Amayreh and Dyson 1998, 643).
Place of Articulation
Manner of Articulation Labial Dental Alveolar Postalveolar Palatal Velar Uvular Pharyngeal Glottal
Stops (plain) b t tˤ d dˤ k q ʔ
Affricates
Fricatives f θ ð ðˤ s sˤ z ʃ x ɣ ħ ʕ h
Nasals m n
Trills r
Approximants w l j
Note: For phonemes that share a cell, those on the left are voiceless and those on the right are voiced. Phonemes that have the diacritic (ˤ) are emphatic.
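As a small illustration of the notation used in the table, the inventory can be held as a list of IPA strings and the emphatic series picked out by its diacritic. This is only a sketch: the names CONSONANTS, EMPHATIC_MARK, and is_emphatic are hypothetical and do not come from any rule file.

```python
# Hypothetical names, for illustration only.
EMPHATIC_MARK = "ˤ"  # the diacritic that marks emphatic consonants

# Consonant phonemes as listed in the table above, row by row.
CONSONANTS = [
    "b", "t", "t" + EMPHATIC_MARK, "d", "d" + EMPHATIC_MARK, "k", "q", "ʔ",
    "f", "θ", "ð", "ð" + EMPHATIC_MARK, "s", "s" + EMPHATIC_MARK, "z",
    "ʃ", "x", "ɣ", "ħ", "ʕ", "h",
    "m", "n", "r", "w", "l", "j",
]


def is_emphatic(phoneme):
    """Return True if the phoneme carries the emphatic diacritic."""
    return phoneme.endswith(EMPHATIC_MARK)


# The plain counterpart of an emphatic is the same symbol without the mark.
emphatic_pairs = {p: p[:-1] for p in CONSONANTS if is_emphatic(p)}
```

Here emphatic_pairs maps each of the four emphatic phonemes to its plain counterpart, for example the emphatic stop /tˤ/ to /t/.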

Vowels

  • Vowel length is contrastive in Arabic (Amayreh and Dyson 1998, 643).
  • /e/ and /o/ exist in spoken varieties of Arabic, but not in Modern Standard Arabic (R. Ibrahim, Eviatar, and Aharon-Peretz 2002, 323).
Front Central Back
High i u
Low a
Diphthongs
/aj/, /aw/

Alphabet

Grapheme Phoneme Comment
ا /aː/; /ʔ/ /ʔ/: word-initially (not always marked in the orthography, which somewhat compromises the transcription)
ب /b/
ت /t/
ث /θ/
ج /dʒ/
ح /ħ/
خ /x/
د /d/
ذ /ð/
ر /r/
ز /z/
س /s/
ش /ʃ/
ص /sˤ/
ض /dˤ/
ط /tˤ/
ظ /ðˤ/
ع /ʕ/
غ /ɣ/
ف /f/
ق /q/
ك /k/
ل /l/
م /m/
ن /n/
ه /h/
و /w/; /uː/ /w/: word-initially (used as default in the rules); /uː/: preceded by a short /u/ diacritic
ي /j/; /iː/ /j/: word-initially (used as default in the rules); /iː/: preceded by a short /i/ diacritic
ء /ʔ/ called a hamza, this grapheme also exists as a diacritic (explained below)
ة ∅; /t/ called a ta-marbuta, this grapheme appears word-finally, corresponding to /t/ if followed by a diacritic or ∅ otherwise (Biadsy, Habash, and Hirschberg 2009, 3)
ى /a/ called an alif-maqsura, this grapheme occurs word-finally (Habash 2010, 11; Biadsy, Habash, and Hirschberg 2009, 3)
Diacritic
ُ /u/ this diacritic is called a dammah (Yurtbaşı 2016, 146)
َ /a/ this diacritic is called a fatḥah (ibid.)
ِ /i/ this diacritic is called a kasrah (ibid.)
ٰ /aː/ this diacritic is called an alif khanjariyah (ibid.)
ٔ /ʔ/ this diacritic is called a hamza, and only appears (as a diacritic) in combination with ⟨ا⟩, ⟨ي⟩, and ⟨و⟩ (Habash 2010, 5–6)
ٕ /ʔi/
ٓ /ʔ/ this diacritic is called a madda (a variant of the hamza), appearing in combination with ⟨ا⟩ (Habash 2010, 6)
ّ called a shadda, this diacritic indicates gemination of consonants (Habash 2010, 11)
ْ called a sukun, this diacritic indicates that no vowel follows the consonant to which it is attached; it also typically marks syllable boundaries (Habash, Diab, and Rambow 2012, 712)
ٌ /un/ indicates a word-final /un/ (nunation) (Habash, Diab, and Rambow 2012, 713)
ٍ /in/ indicates a word-final /in/ (nunation) (ibid.)
ً /an/ indicates a word-final /an/ (nunation) (ibid.)
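The correspondences above can be sketched as a small left-to-right transcriber for fully diacritized input. This is a minimal illustration, not the actual rule set: every name in it is hypothetical, code points are written as escapes so each character is unambiguous, and two behaviours are assumptions, namely that a short vowel merges with a following long-vowel letter into one long vowel, and that a shadda is typed directly after its consonant. Hamza-bearing forms, the madda, and the alif khanjariyah are left out for brevity.

```python
# Illustrative sketch only; all names are hypothetical.
CONSONANTS = {
    "\u0628": "b", "\u062A": "t", "\u062B": "θ", "\u062C": "dʒ",
    "\u062D": "ħ", "\u062E": "x", "\u062F": "d", "\u0630": "ð",
    "\u0631": "r", "\u0632": "z", "\u0633": "s", "\u0634": "ʃ",
    "\u0635": "sˤ", "\u0636": "dˤ", "\u0637": "tˤ", "\u0638": "ðˤ",
    "\u0639": "ʕ", "\u063A": "ɣ", "\u0641": "f", "\u0642": "q",
    "\u0643": "k", "\u0644": "l", "\u0645": "m", "\u0646": "n",
    "\u0647": "h", "\u0621": "ʔ",
}
ALIF, WAW, YA = "\u0627", "\u0648", "\u064A"
TA_MARBUTA, ALIF_MAQSURA = "\u0629", "\u0649"
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
SHORT_VOWELS = {FATHA: "a", DAMMA: "u", KASRA: "i"}
# Nunation marks: fathatan, dammatan, kasratan.
NUNATION = {"\u064B": ("a", "n"), "\u064C": ("u", "n"), "\u064D": ("i", "n")}
SHADDA, SUKUN = "\u0651", "\u0652"
# Each long-vowel letter with the short vowel that triggers its long reading.
LONG = {ALIF: (FATHA, "aː"), WAW: (DAMMA, "uː"), YA: (KASRA, "iː")}
# Reading used when the trigger is absent.
DEFAULT = {ALIF: "aː", WAW: "w", YA: "j"}


def transcribe(word):
    """Map a diacritized Arabic word to a list of phoneme strings."""
    phones = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
        elif ch == ALIF and i == 0:
            phones.append("ʔ")  # word-initial alif
        elif ch in LONG:
            trigger, long_vowel = LONG[ch]
            if prev == trigger:
                # Assumption: the short vowel just emitted is absorbed.
                phones[-1] = long_vowel
            else:
                phones.append(DEFAULT[ch])
        elif ch == TA_MARBUTA:
            # /t/ only when a vowel or nunation mark follows, otherwise silent.
            if nxt in SHORT_VOWELS or nxt in NUNATION:
                phones.append("t")
        elif ch == ALIF_MAQSURA:
            phones.append("a")
        elif ch in SHORT_VOWELS:
            phones.append(SHORT_VOWELS[ch])
        elif ch in NUNATION:
            phones.extend(NUNATION[ch])
        elif ch == SHADDA and phones:
            phones.append(phones[-1])  # geminate the preceding consonant
        # SUKUN and any unlisted character add no phoneme.
    return phones


# nun + damma + waw + ra
print(transcribe("\u0646\u064F\u0648\u0631"))  # → ['n', 'uː', 'r']
```

Because the waw and ya rules look only at the preceding diacritic, this sketch yields a long vowel even where a glide is intended, which is the /w/ versus /uː/ and /j/ versus /iː/ conflation noted at the top of this page.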

Syllable Structure

Lenition Rules

Misc. Rules

References

Amayreh, Mousa M., and Alice T. Dyson. 1998. “The Acquisition of Arabic Consonants.” Journal of Speech, Language, and Hearing Research.

Awde, N. 2000. The Arabic Alphabet: How to Read and Write It. Lyle Stuart. https://www.ebook.de/de/product/3309537/n_awde_the_arabic_alphabet_how_to_read_and_write_it.html.

Biadsy, Fadi, Nizar Habash, and Julia Hirschberg. 2009. “Improving the Arabic Pronunciation Dictionary for Phone and Word Recognition with Linguistically-Based Pronunciation Rules.” In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 397–405. NAACL ’09. Stroudsburg, PA, USA: Association for Computational Linguistics. http://dl.acm.org/citation.cfm?id=1620754.1620812.

Boudelaa, Sami, and William D. Marslen-Wilson. 2010. “Aralex: A Lexical Database for Modern Standard Arabic.” Behavior Research Methods 42 (2). Springer Science and Business Media LLC: 481–87. doi:10.3758/brm.42.2.481.

Habash, Nizar. 2010. Introduction to Arabic Natural Language Processing. Morgan & Claypool.

Habash, Nizar, Mona Diab, and Owen Rambow. 2012. “Conventional Orthography for Dialectal Arabic.” Proceedings of the Language Resources and Evaluation Conference (LREC), Istanbul, January.

Ibrahim, Abdulateef. 2019. “Glottal Stop in Arabic with Reference to English: Phonological and Orthographical Study” ( 2016 M- 1437 e) (April).

Ibrahim, Raphiq, Zohar Eviatar, and Judith Aharon-Peretz. 2002. “The Characteristics of Arabic Orthography Slow Its Processing.” Neuropsychology 16 (3). American Psychological Association (APA): 322–26. doi:10.1037/0894-4105.16.3.322.

Saiegh-Haddad, Elinor, and Roni Henkin-Roitfarb. 2014. “The Structure of Arabic Language and Orthography.” In Literacy Studies, 3–28. Springer Netherlands. doi:10.1007/978-94-017-8545-7_1.

Yurtbaşı, Metin. 2016. “Sura Yusuf in Full IPA (Segmental-Suprasegmental) Transcription with English Translation.” International Journal of Arts and Humanities and Social Sciences.