ENG 408 Projects: Sounds & Sound Patterns in Twelve Languages

Chapter 6: The Sounds of Swahili

Introduction: About Swahili

Geographic Location 

     According to The World Atlas of Language Structures Online (WALS), Swahili is spoken along the eastern coast of Africa, particularly in Tanzania. John Mugane's 2015 book The Story of Swahili paints a more detailed picture. According to his book, Swahili is spoken across a third of the African continent, including in the heart. Its geography stretches from the east coast to the west coast and spreads into both the Northern and Southern regions. Despite the wide geographic range found today, Swahili has much smaller origins. It began as an island language with its ancestral lands consisting of just 2,500 km of coast from Mogadishu, Somalia to Sofala, Mozambique (1). Note that this historical range does include the Tanzanian coast. Today, Swahili is the national or official language of 4 countries: Tanzania, Kenya, Rwanda, and Uganda ("Swahili Language"). It is also one of the official languages for several organizations including the African Union, the Southern African Development Community, and the East African Community (EAC), which encompasses Burundi, Kenya, Rwanda, South Sudan, Tanzania, and Uganda (ibid). It can also be assumed that with things like migration and diaspora populations, Swahili is spoken to some degree around the world. The image carousel below shows the ancestral lands of Swahili.


Language Family 

     Which language family Swahili falls into is somewhat contested. Some pages, like WALS, classify it as belonging to the Niger-Congo family, while other pages, like Glottolog 4.4, classify it as belonging to the Atlantic-Congo family. This is likely due to the fact that Atlantic-Congo is a subfamily of Niger-Congo. Regardless, both pages agree that Swahili belongs to the subfamily Benue-Congo and the genus Bantoid. The image carousel below shows the Niger-Congo family tree as well as the geographic distribution of the different languages found within this language family. 

Numbers of Speakers 

     Swahili is estimated to have between 50 and 150 million speakers ("Swahili Language"). In 2012, there were an estimated 18 million native speakers, and as of 2015 there were an estimated 90 million speakers of Swahili as a second language (ibid). These numbers are not surprising given its vast geographic range. Historically, immigrants from the interior of Africa, traders from around the world, and colonizers have all adopted this language for their own use as well (Mugane 1). This has greatly expanded not only the geographic range but also the number of speakers outside of tribal members (ibid).
 

Other Relevant Information

      Dialects: There are 24 different dialects of Swahili, although 3 of them are extinct. Most of these dialects exist in their own geographic region. Some of them are also mutually unintelligible, meaning a speaker of another dialect cannot understand that specific dialect. Despite this, many of the dialects can be grouped together based on their linguistic similarities. While standard Swahili is based on the Kiunguja dialect of Zanzibar Town, the ancient Kingozi dialect from the Indian Ocean coast is often considered to be the source of Swahili ("Swahili language"). 
      Endangerment Status: The endangerment status of Swahili is considered to be shifting. However, some dialects, like Mwini, have become endangered ("Spoken L1 Language: Swahili").
      Identifier Codes: According to the WALS page, the identifier codes for this language, which are largely used for documentation purposes, include the following: 
      - Glottocode: swah1253 
      - ISO 639-3 Code: swh
      - WALS Code: swa
 

The Consonants of Swahili 

Consonants are more than just the letters we use in English. This means different languages have different numbers and different kinds of them! But if consonants aren't letters, what are they? According to Ashby, they are a type of speech sound (4) that can be identified via Voice-Place-Manner (VPM) labels (48). Voicing refers to whether the sound is voiced or unvoiced (Ashby 21). Place refers to what part of the mouth is used to make the sound. According to Hildebrandt (Sept 1 2021), there are several different place categories including:
     - bilabial (when both lips are in play)
     - labiodental (when the lower lip and upper teeth are in play)
     - dental (when the tip or blade of tongue comes into contact with teeth)
     - alveolar (when the tongue tip, blade, or front comes into contact with alveolar ridge)
     - retroflex (when the underside of the tongue hits the hard palate)
     - palatal (when the tongue front or center hits the hard palate)
     - velar (when the front or back of tongue hits the soft palate)
     - uvular (when the back of the tongue hits the uvula)
     - pharyngeal (when the back or root of the tongue hits pharyngeal wall)
     - epiglottal (when the epiglottis hits the pharyngeal wall)
Manner refers to the way the sound is made. According to Hildebrandt (Sept. 1 2021, Sept. 9 2021), there are the following manner categories:
     - stops/plosives (characterized by stopped airflow)
     - nasals (characterized by airflow out of the nasal cavity)
     - trills (characterized by the tip of tongue being set in motion)
     - taps/flaps (characterized by a single contraction of the tongue muscles)
     - fricatives (characterized by a turbulent airflow from partial airflow constriction)
     - approximants (characterized by less airflow constriction)
     - laterals (characterized by airstream along the sides of the tongue rather than over)
     - affricates (an in-between of a stop and a fricative)
Therefore, according to the VPM labelling system, the sound /p/ is an unvoiced bilabial plosive. 
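
The VPM labelling scheme described above can be sketched as a small lookup table. This is only a minimal illustration, not a full inventory: the names VPM, VPM_LABELS, and describe are invented for this example, and the five entries use standard IPA values rather than data from the sources cited in this chapter.

```python
from typing import NamedTuple


class VPM(NamedTuple):
    """A Voice-Place-Manner label for one consonant."""
    voice: str
    place: str
    manner: str


# Hypothetical sample table; a real inventory would have many more rows.
VPM_LABELS = {
    "p": VPM("voiceless", "bilabial", "plosive"),
    "b": VPM("voiced", "bilabial", "plosive"),
    "s": VPM("voiceless", "alveolar", "fricative"),
    "z": VPM("voiced", "alveolar", "fricative"),
    "m": VPM("voiced", "bilabial", "nasal"),
}


def describe(symbol: str) -> str:
    """Return the VPM description of a consonant symbol."""
    label = VPM_LABELS[symbol]
    return f"/{symbol}/ is a {label.voice} {label.place} {label.manner}"


print(describe("p"))
```

Storing the three labels as separate fields makes it easy to see that /p/ and /b/ differ only in voicing, while /p/ and /s/ differ in both place and manner.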

Swahili's consonant inventory is surprisingly large considering the inventories of other Bantu languages. In fact, it is twice as large as some of them (Jerro 9). This is because unlike many other Bantu languages, Swahili has both voiced and voiceless stops and fricatives as well as aspirated stops and fricatives (ibid). Another interesting aspect of the Swahili consonant inventory is that all voiced stops are implosives, which is also not found among the other Bantu languages of East Africa (Jerro 10).

There are 29 different consonants identified on pages 38 and 39 of the Swahili Language Handbook. These consonant sounds (called phonemes) include the following: /p/, /pʰ/, /b/, /t/, /tʰ/, /d/, /t̠š/ (which can also be transcribed as /č/), /t̠šʰ/ (which can also be transcribed as /čʰ/), /k/, /kʰ/, /g/, /f/, /v/, /θ/, /ð/, /s/, /z/, /š/, /ɣ/, /h/, /m/, /n/, /ɲ/, /ŋ/, /l/, /r/, /w/, /j/ and /ɟ/. Of the 29 different consonants, 5 (/b/, /d/, /ɟ/, /g/ and /h/) have another form (called an allophone) that appears under specific circumstances. The allophonic alternations include [b] and [ɓ] for /b/, [d] and [ɗ] for /d/, [ɟ] and [d̠ž] for /ɟ/, [g] and [ɠ] for /g/, and [h] and [x] for /h/. There will be more on this later.

It is important to note that the above data is slightly contestable. For example, on page 9 of his study entitled Linguistic complexity: A Case study from Swahili, Kyle Jerro states that there are 30 consonants in Swahili, one more than the list here. The missing consonant could possibly be the palatal fricative sound /sh/ (also transcribed as /ʃ/) that is included on page 386 of Rick Treece's 1990 article Underspecifying Swahili Phonemes. This sound is included in the Phoible database's Swahili inventory, "Language Swahili", which also includes two other sounds, the obsolete voiceless postalveolar affricate /ʧ/ and the aspirated obsolete voiceless postalveolar affricate /ʧʰ/. Another example of contradictory information from Phoible regards the allophonic alternations. According to Phoible's list, there are more of them than are listed in Polome's handbook. This includes three allophones for /f/ ([f], [ʒ], and [d̠ʒ]), two allophones for /m/ ([m] and [ɱ]) and two allophones for /r/ ([r] and [ɾ]). The database also lists [n] and [ŋ] as allophonic alternations of /n/ rather than as separate sounds. Note that this extent of ambiguity is not found with Swahili vowels. The image carousel below shows the IPA chart for the Swahili language consonant sounds in addition to the universal IPA chart (which can be used for comparison).




There are several consonant phonemes mentioned in the above paragraph that are not on this chart. They include 3 implosives, 4 aspirates, and 2 obsoletes. They are: 
     - /ɓ/ - implosive voiced bilabial stop 
     - /ɠ/ - implosive voiced velar stop 
     - /ɗ/ - implosive voiced alveolar stop 
     - /tʰ/ - voiceless aspirate alveolar plosive 
     - /pʰ/ - voiceless aspirate bilabial plosive
     - /t̠šʰ/ - voiceless aspirate postalveolar affricate
     - /kʰ/ - voiceless aspirate velar plosive
     - /ʧ/ - obsolete voiceless postalveolar affricate 
     - /ʤ/ - obsolete voiced postalveolar affricate 
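This data uses a couple of diacritics. [xʰ] indicates aspiration and [x̠] indicates retraction. The haček in [š] and [č] is part of the consonant symbol itself in the Americanist transcription tradition (these correspond to IPA [ʃ] and [tʃ]); the same mark indicates a rising contour tone only when it is placed over a vowel. For each of these examples, the 'x' serves only as a placeholder and is not indicative of a sound. 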

Linguists know these are all different sounds because they are able to create minimal pairs/sets with them. Minimal pairs or sets are words that are exactly the same except for one sound (Gussenhoven 63). Each of the different sounds indicates a different phoneme. The following minimal pairs and sets for Swahili are taken from wordlists 1 and 3 from the Swahili entry of the UCLA Phonetics Lab Archive. 

To highlight /p/, /b/, /k/, /g/, /m/, /n/, /s/, and /z/ with the frame "sɑ́[_]ɑ":
[sɑ́pɑ] - 'make a clean sweep'
[sɑ́bɑ] - 'seven' 
[sɑ́kɑ] - 'catch' 
[sɑ́gɑ] - 'grind'
[sɑ́mɑ] - 'choke' 
[sɑ́nɑ] - 'very much'
[sɑ́sɑ] - 'now'
[sɑzɑ̃] - 'to be left over' 
   
To highlight /m/, /n/, and /ɲ/ with the frame "[_]ɑ́mɑ" 
[ɲɑ́mɑ] - 'meat, flesh'
[nɑ́mɑ] - 'to be flexible' 
[mɑ́mɑ] - 'mama'

To highlight /p/, /pʰ/, /k/, /kʰ/, /f/, and /v/ with the frame "[_]aː":
[paː] - 'roof' 
[pʰaː] - 'gazelle'
[kaː] - 'charcoal' 
[kʰaː] - 'landcrab' 
[faː] - 'be of use'
[vaː] - 'put on, dress'

To highlight /r/ and /l/ with the frame "si[_]a": 
[sira] - 'dregs, lees' 
[sila] - 'pail, bucket' 

To highlight /w/ and /k/ with the frame "wa[_]a": 
[waka] - 'smart, burn'
[wawa] - 'itch'

To highlight /h/ and /ɣ/ with the frame "[_]ɑ́mu":
[ɣɑ́mu] - 'grief' 
[hɑ́mu] - 'desire, longing' 

To highlight /ð/ and /θ/ with the frame "[_]ibíti":
[θibíti] - 'firm, constant, strong'
[ðibíti] - 'guard, protect against' 

To highlight /ɟ/ and /ʃ/ with the frame "[_]írɑ":
[ɟírɑ] - 'cumin seed'
[ʃírɑ] - 'sail of a vessel'

To highlight /t/ and /tʰ/ with the frame "[_]ɛ́mbo":
[tɛ́mbo] - 'alcohol'
[tʰɛ́mbo] - 'elephant' 

To highlight /d/ and /s/ with the frame "[_]ɑ́kɑ":
[dɑ́kɑ] - 'catch, snatch'
[sɑ́kɑ] - 'catch'

To highlight /j/ and /β/ with the frame "[_]ɑ́jɑ": 
[βɑ́jɑ] - 'earthen dish'
[jɑ́jɑ] - 'child's nurse'

To highlight /č/ and /k/ with the frame "[_]ua":
[čua] - 'apply friction'
[kua] - 'grow, grow up' 
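
The minimal-pair test used in the sets above can be sketched in a few lines of code. This is a rough illustration under some assumptions: each word is stored as a tuple of segments (so that a symbol like "pʰ" counts as one sound), stress marks are left out, and the function names is_minimal_pair and minimal_pairs are invented for this example.

```python
from itertools import combinations

# A handful of the transcriptions listed above, split into segments by hand.
WORDS = {
    ("p", "aː"): "roof",
    ("pʰ", "aː"): "gazelle",
    ("k", "aː"): "charcoal",
    ("s", "i", "r", "a"): "dregs, lees",
    ("s", "i", "l", "a"): "pail, bucket",
}


def is_minimal_pair(a, b):
    """True when two words have the same length and differ in exactly one segment."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def minimal_pairs(words):
    """Return every pair of words that forms a minimal pair."""
    return [(a, b) for a, b in combinations(words, 2) if is_minimal_pair(a, b)]


for a, b in minimal_pairs(list(WORDS)):
    print("".join(a), "~", "".join(b))
```

Splitting words into segments by hand is the fragile step here; an automatic splitter would need to know which character sequences count as a single sound in the transcription being used.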


Below are some examples of the consonant sounds, taken from word list 3 of the UCLA Phonetics Lab Archive: 

/p/ and /pʰ/ - 

In this sound file, there are two words. The first word, 'roof', is transcribed as [paː] and represents the /p/ sound, which is the voiceless bilabial plosive. The second word, 'gazelle', is transcribed as [pʰaː] and represents the aspirated /p/ sound. This phoneme is the voiceless aspirate bilabial plosive. Both words are included here to show the contrast between aspirated and unaspirated sounds. 

/ɗ/ - 
In this sound file, there is one word that highlights the /d/ sound. This word translates to 'sister' and is transcribed as [ɗaɗa]. This sound is the implosive voiced alveolar stop. Note that this is not a phoneme, rather it is an allophonic alternation.

/g/


In this sound file, there is one word that highlights the /g/ sound. This word translates to 'lazy, weak person' and is transcribed as [gɔigɔi]. This phoneme is the voiced velar plosive.

/k/
In this sound file, there is one word that highlights the /k/ sound. This is the Swahili word for 'charcoal', which is transcribed as [kaː]. This phoneme is the voiceless velar plosive.

/č/ - 
In this sound file, there is one word that highlights the /č/ (or /t̠š/) sound. This word translates to 'apply friction' and is transcribed as [čua]. This phoneme is the voiceless postalveolar affricate. One important thing to note with this sound is the haček above the symbol: in this transcription style it is part of the consonant symbol and marks the postalveolar articulation (IPA [tʃ]). It does not indicate a rising tone, as the same mark would over a vowel. 

/m/ - 


In this sound file, there is one word that highlights the /m/ sound. This word is the Swahili word for 'mama', which is transcribed as [mɑ́mɑ]. This phoneme is the voiced bilabial nasal. 

/n/ - 


In this sound file, there is one word that highlights the /n/ sound. This word translates to 'to be flexible' and is transcribed as [nɑ́mɑ]. This phoneme is the voiced alveolar nasal. 

/s/ - 


In this sound file, there is one word that highlights the /s/ sound. This word means 'an hour' and is transcribed as [saː]. This phoneme is the voiceless alveolar fricative. 

/z/ - 


In this sound file, there is one word that highlights the /z/ sound. This word translates to 'denotes vital reproduction in male or female' and is transcribed as [zaː]. This phoneme is the voiced alveolar fricative.

/v/ - 



In this sound file, there is one word that highlights the /v/ sound. This is the Swahili word for 'put on, dress', which is transcribed as [vaː]. This phoneme is the voiced labiodental fricative. 

/f/ - 


In this sound file, there is one word that highlights the /f/ sound. This word is the Swahili word for 'be of use', which is transcribed as [faː]. This phoneme is the unvoiced labiodental fricative.

/ð/ -


In this sound file, there is one word that highlights the /dh/ sound. This word means 'guard, protect against' and is transcribed as [ðibíti]. This phoneme is the voiced dental fricative. 

/θ/ -


In this sound file, there is one word that highlights the /th/ sound. This word translates to 'firm, constant, strong' and is transcribed as [θibíti]. This phoneme is the voiceless dental fricative. 

/j/ - 


In this sound file, there is one word that highlights the /y/ sound. This word is the Swahili pronoun '3rd person singular', which is transcribed as [jeje]. This phoneme is the voiced palatal approximant. 

/ɟ/ - 


In this sound file, there is one word that highlights the /j/ sound. This word means 'cumin seed' and is transcribed as [ɟírɑ]. This phoneme is the voiced palatal plosive. 

/h/ -



In this sound file, there is one word that highlights the /h/ sound. This word means 'condition' and is transcribed as [hɑ́li]. This phoneme is the voiceless glottal fricative. 

/ʃ/ -


In this sound file, there is one word that highlights the /sh/ sound. This word translates to 'sail of a vessel' and is transcribed as [ʃírɑ]. This phoneme is the unvoiced postalveolar fricative. 

/ɣ/ -


In this sound file, there is one word that highlights the /gh/ sound. This is the Swahili word for 'grief', which is transcribed as [ɣɑ́mu]. This phoneme is the voiced velar fricative. 

/r/ - 

In this sound file, there is one word that highlights the /r/ sound. This word translates to 'a secret' and is transcribed as [siri]. This phoneme is the voiced alveolar trill. 

/t/ and /tʰ/ - 

In this sound file, there are two words. The first word, 'alcohol', is transcribed as [tɛ́mbo] and represents the /t/ sound. This phoneme is the voiceless dental plosive /t/. The second word, 'elephant', is transcribed as [tʰɛ́mbo] and represents the aspirated /t/ sound. This phoneme is the voiceless aspirate dental plosive /tʰ/. Both words are included here to show the contrast between aspirated and unaspirated sounds. 

/w/ - 

In this sound file, there is one word that highlights the /w/ sound. This word means 'itch' and is transcribed as [wawa]. This phoneme is the voiced labial velar approximant. 

/ɲ/ - 

In this sound file, there is one word that highlights the /ɲ/ sound. This word translates to 'meat, flesh' and is transcribed as [ɲɑ́mɑ]. This phoneme is the voiced palatal nasal. 

/l/ -

In this sound file, there is one word that highlights the /l/ sound. This word means 'pearl' and is transcribed as [lulu]. This phoneme is the voiced alveolar lateral approximant. 

/ɓ/ - 

In this sound file, there is one word that highlights the /b/ sound. This word is the Swahili word for 'dumb person', which is transcribed as [ɓuɓu]. This sound is the voiced bilabial implosive. Note that it is not a phoneme, rather it is an allophonic alternation of /b/. 


The Vowels of Swahili 

Like consonants, vowels are much more than English's a-e-i-o-u (and sometimes y). Once again, they are speech sounds rather than letters. Note that these speech sounds differ from consonants in that they occupy the middle of a syllable rather than the edges and they tend to be the loudest part of a syllable (Hildebrandt, Sept 15 2021). Vowels are also identified by BOR labels (backness-openness-rounding) instead of VPM labels (Ashby 89). Backness refers to whether the highest point of the tongue sits toward the front or the back of the mouth and has the following categories: front, central, and back (Ashby 89). Openness refers to how open or closed the jaw is and has the following categories: close, close mid, open mid, and open (Hildebrandt, Sept 15 2021). Rounding refers to the shape of the lips and has the following categories: rounded and unrounded (Hildebrandt, Sept 15 2021). Therefore, the vowel sound /i/ is a front, close, unrounded vowel. 

According to page 46 of Polome's handbook, there are 5 main vowel sounds in Swahili: /a/, /i/, /e/, /o/, and /u/. Each of these sounds has 2 allophones, although there are 3 in the case of /a/. The allophones include: [a], [ə], and [ɒ] for /a/, [i] and [ɪ] for /i/, [e] and [ɛ̝] for /e/, [o] and [ɔ̝] for /o/, and [u] and [ʊ] for /u/. Note that the number of allophonic variations is slightly contested. The main difference is that Phoible cites an additional allophone for /a/, the more fronted [a̟] (which Polome talks about but doesn't cite as an allophone). For /e/, Phoible also cites a lowered [e] sound ([e̞]) where Polome cites a raised [ɛ] sound ([ɛ̝]), and for /o/ it cites a lowered [o] sound ([o̞]) where Polome cites a raised [ɔ] sound ([ɔ̝]). These are not really differences though, as they are ultimately just two ways of describing the same thing (a sound between [e] and [ɛ] and a sound between [o] and [ɔ]). The image carousel below shows the IPA chart for the vowel sounds of Swahili as well as the universal IPA chart, which shows all the possible vowel sounds for comparison.


There are two sets of variants that are not on this chart but are mentioned in Polome's book. These variants deal with more or less fronting and include: 
     - the more or less fronted varieties for /a/: [a̟] and [a̠] 
     - the more or less fronted varieties for /ɑ/: [ɑ̟] and [ɑ̠]
This data uses a couple of diacritics. [x̟] indicates an advanced (more fronted) tongue position and [x̠] indicates a retracted (less fronted) tongue position. For each of these examples, the 'x' serves only as a placeholder, not as an actual sound.  

Like with consonants, linguists know each of these vowel phonemes is a separate sound because there is a minimal set containing all 5 with the frame "ˈp[_]ta" from Word List 5 of the UCLA Phonetics Lab Archive: 
[ˈpata] - 'hinge'
[ˈpeta] - 'to bend around'
[ˈpita] - 'to pass'
[ˈpota] - 'to twist' 
[ˈputa] - 'to thrash' 

One interesting thing about vowels in Swahili is that they are contrastive in duration. In other words, a lengthened vowel results in a different word meaning than the same vowel at normal length (The UCLA Phonetics Lab Archive). Interestingly, Swahili has lost some of the length (and tonal) contrasts of its ancestor language, Proto-Bantu, although they do still exist to an extent (Jerro 6-7). Other notable features of Swahili include the use of syllabic nasal consonants in place of vowels, the presence of vowel hiatus, and several cases of irregular lexical stress (Jerro 8).

The image carousel below provides a visual representation of the elongation phenomenon mentioned above using the words [paː] ('roof') and [kata] ('cut'). Notice how the highlighted section in the first image of [paː] is much longer than the highlighted section in the second image of the second syllable of [kata], although they are the same vowel. This is because [paː] contains the elongated version of the vowel, which means it is longer than normal. As mentioned above, this difference is contrastive, so theoretically while [paː] means 'roof', the non-elongated /pa/ could mean something else completely. 



Here are some examples of the vowel sounds, taken from word list 3 of the UCLA Phonetics Lab Archive: 

/a/ 
In this audio file, the speaker is asked to say two words that highlight the /a/ sound. The first word, 'cut', is transcribed as [kata]. The second word, 'waddle', is transcribed as [ɓata]. This phoneme is the open front vowel.

/i/ 
In this audio file, the speaker is asked to say two words that highlight the /i/ sound. The first word, 'chair', is transcribed as [kiti]. The second word, 'adder', is transcribed as [pili]. This phoneme is the close unrounded front vowel. 

/e/ 


In this audio file, the speaker is asked to say one word that highlights the /e/ sound. This word means 'mango' and is transcribed as [em͡be]. This phoneme is the close-mid unrounded front vowel. 

/o/ 
In this audio file, the speaker is asked to say one word that highlights the /o/ sound. This word translates to 'fire' and is transcribed as [moto]. This phoneme is the close-mid rounded back vowel. 

/u/ 
In this audio file, the speaker is asked to say two words that highlight the /u/ sound. The first word, 'pearl', is transcribed as [lulu]. The second word, 'dumb person', is transcribed as [ɓuɓu]. This phoneme is the close rounded back vowel. 
 

Allophonic Alternations 

What Are Allophonic Alternations? 

To understand allophonic alternations, we must first have a better understanding of what phonemes are. Phonemes are "an overarching unit that can bring about a change of meaning and includes two variants" (Ashby 12). This means that phonemes are the basic units of a language that contribute to word meaning. As mentioned throughout the text, phonemes can have variants, which are called allophones. Allophones are "context-dependent variants" (Gussenhoven 62). In other words, allophones are the predictable variations of phonemes that present themselves in specific contexts. An example to illustrate this concept is as follows: /l/ is a phoneme. This phoneme has two allophonic variations: [l] and [ɫ]. The allophonic variation [ɫ] only occurs in a specific context: when the /l/ phoneme follows a vowel. 

Just like with phonemes, linguists use minimal pairs or minimal sets to find allophonic alternations. Remember that minimal pairs or sets are words that are distinguishable by only one sound (Gussenhoven 63), such as 'could', 'would', and 'should' in English. If two sounds can be put into minimal pairs or sets, they are not allophonic variations; rather, they are phonemes. In Swahili, some examples of minimal sets/pairs are [saː], [faː], [zaː], [vaː], [paː], and [pʰaː], [taka] and [waka], and [kata] and [ɓata]. These examples, taken from word list 3 of the Swahili entry in the UCLA Phonetics Lab Archive, show that /s/, /f/, /z/, /v/, /p/, /pʰ/, /t/, /w/, /k/, and /b/ are all phonemes, not allophonic variations. Note that some phonemes (like /b/) can have allophonic variations; however, when [b] is acting as an allophonic variant, it will not appear as part of a minimal pair or set with its allophonic partner [ɓ]. 

Some Allophonic Alternations in Swahili

In the consonant and vowel sections of this chapter, sounds with allophonic alternations were discussed. We will revisit these sounds here. Consonantal allophones include [b] and [ɓ] for /b/, [d] and [ɗ] for /d/, [ɟ] and [d̠ž] for /ɟ/, [g] and [ɠ] for /g/, [h] and [x] for /h/, [f], [ʒ], and [d̠ʒ] for /f/, [m] and [ɱ] for /m/, and [r] and [ɾ] for /r/. Vowel allophones include [a], [ə], and [ɒ] for /a/, [i] and [ɪ] for /i/, [e] and [ɛ̝] for /e/, [o] and [ɔ̝] for /o/, and [u] and [ʊ] for /u/. 
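
The relationship between phonemes and their allophones listed above can be sketched as a mapping that is looked up in both directions. This is only an illustrative subset (the entries with diacritics are left out to keep the example simple), and the names ALLOPHONES, phoneme_of, and to_phonemic are invented for this example.

```python
# Phoneme -> allophones, for a subset of the alternations listed above.
ALLOPHONES = {
    "b": ["b", "ɓ"],
    "d": ["d", "ɗ"],
    "g": ["g", "ɠ"],
    "h": ["h", "x"],
    "m": ["m", "ɱ"],
    "r": ["r", "ɾ"],
    "i": ["i", "ɪ"],
    "u": ["u", "ʊ"],
}

# Reverse index: allophone -> the phoneme it belongs to.
PHONEME_OF = {
    phone: phoneme
    for phoneme, phones in ALLOPHONES.items()
    for phone in phones
}


def phoneme_of(phone):
    """Return the phoneme a phone belongs to, or None if it is not in the table."""
    return PHONEME_OF.get(phone)


def to_phonemic(phones):
    """Replace each phone with its phoneme; phones not in the table pass through unchanged."""
    return [PHONEME_OF.get(p, p) for p in phones]


print(to_phonemic(["ɓ", "u", "ɓ", "u"]))
```

The reverse index only works when no phone is listed under two different phonemes, which holds for this subset but would need checking for a full inventory.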

Here are some examples of allophonic alternations from word lists 1 and 3 of the Swahili entry of the UCLA Phonetics Lab Archive. For each set, the first sound file is the phoneme (the general sound) and the second file is the allophonic alternation (the sound that only occurs in a specific pattern). 

/r/ and [ɾ]
Did you hear the difference between the two sounds? The first sound, /r/ (showcased by [siri] - 'a secret'), sounds like a relatively normal r sound to a native English speaker. In contrast, the second sound, [ɾ] (showcased by [siɾa] - 'dregs, lees'), has a rolled sound to it, similar to the rolled r sound that comes to mind when many people think of Spanish. 


/u/ and [ʊ] 
What about the difference between these sounds? The first sound, /u/ (showcased with [lulu] - 'pearl' and [ɓuɓu] - 'dumb person') sounds more like "ooo". The second sound, [ʊ] (showcased with [aʊ] - 'or') sounds more like "ouh". The second sound can be slightly difficult to pinpoint because it is the second vowel in a row. With this word, the "aah" sound comes from the first vowel, not the second (so pay more attention to the end of the word!).  


/e/ and [ɛ̝]
And finally, what about these? The first sound, /e/ (showcased with [em͡be] - 'mango') is a bit longer and sounds more like "aye". The second sound, [ɛ̝] (showcased with [nɛ̝mɑ] - 'bend, yield') is a bit shorter and sounds more like "eh". 


According to Polome's book, there are several kinds of allophonic alternations in Swahili. These alternations fall into 3 main groups: stress or syllabic alternations (43), free variation or unpredictable alternations (46), and social alternations (45). 

The first kind of alternation he discusses is the stress/syllabic alternation. These types of alternations occur depending on the placement of the phoneme in the syllable and whether or not it is in the stressed or unstressed part. There are several examples of these sorts of allophones in Swahili. For example, the nasalized allophones of /m/, /n/, and /ŋ/ are "distributionally predictable only when the nasal carries the stress in words of the type /NC(C)V/" (43). Note that the notation /NC(C)V/ stands for "dissyllabic words in which the nasal appears in initial position and followed only by a consonant or consonant cluster, plus a vowel" (43). In other words, /NC(C)V/ shows a two-syllable word with the nasal sound appearing as the first sound and the first syllable. This nasal sound would then be followed by the second syllable, which is composed of either a single consonant or a set of consonants followed by a vowel. Another example of stress/syllabic alternation in Swahili is the distribution of /i/, /u/, /e/, and /o/. For this discussion, remember that slashes indicate the phoneme and brackets indicate the allophone. In Swahili, /i/, /u/, [e̝], and [o̝] occur in "stressed syllables with a marked lengthening" (47) while their counterparts [ɪ], [ʊ], /e/, and /o/ occur in unstressed syllables. Note that there are modifications like half-long for [ɪ] and [ʊ] or half-low for [e] and [o] that can result in the appearance of these sounds in stressed syllables under certain conditions (47). 
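
The stress-conditioned distribution of /i/, /u/, /e/, and /o/ described above can be sketched as a simple rule. This is a deliberately simplified illustration: it ignores the half-long and half-low modifications just mentioned, and the names RAISED, STRESSED, UNSTRESSED, and realize are invented for this example.

```python
RAISED = "\u031d"  # combining "up tack below", the IPA raising diacritic

# Variant of each vowel in stressed syllables vs. unstressed syllables.
STRESSED = {"i": "i", "u": "u", "e": "e" + RAISED, "o": "o" + RAISED}
UNSTRESSED = {"i": "ɪ", "u": "ʊ", "e": "e", "o": "o"}


def realize(vowel, stressed):
    """Pick the variant of a vowel according to whether its syllable is stressed.

    Vowels not covered by the rule (such as /a/) are returned unchanged.
    """
    table = STRESSED if stressed else UNSTRESSED
    return table.get(vowel, vowel)


print(realize("i", stressed=True), realize("i", stressed=False))
```

Because the choice depends only on stress, the two variants of each vowel never contrast in the same environment, which is what makes them allophones rather than separate phonemes.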

The second kind of alternation discussed in Polome's book is free/unpredictable variation. Unlike other forms of alternation, this type does not occur in a specific pattern; the variants appear unpredictably. There are many examples in Swahili of free variation allophones. One example is "between syllabics, postalveolar voiced fricative [ž] may occur in free variation with [d̠ž]" (42). Another example is "the presence of the velar allophone [x] for the phoneme /h/ in loanwords with Arabic /x/, but is always in free variation with the allophone [h]" (43). Other unpredictable variations include "the neutralization of the voiced/unvoiced contrast for some obstruents", "shifts in the place of articulation in the alveolar/post-alveolar area", and "shifts in the point of articulation between the alveolar and labiodental areas in the case of fricatives" (46).  
      

The third kind of alternation discussed in Polome's book is social alternation. These alternations result from "the prestige of Arabic culture among the Swahili Muslims of the islands and the coastal areas still strongly under the spiritual influence of the Sultanate" (45). Examples of this include "a distinction of emphatic and non-emphatic articulations of the allophones /t/, /s/, and /ð/", "the appearance of a velarized allophone [ɫ] for /l/", "a post-velar articulation of [q] for /k/", and "a dental articulation of [t̪] for /t/" (45). 
 

Syllables and Syllable Structure in Swahili

What are syllables? 

Syllables are structural units of language that provide melodic organization to strings of speech sounds (Blevins 207). They are identified with sonority increases and decreases, where the center of the syllable (called the nucleus) is the sonority peak (or the loudest part) of the sound (ibid). Typically, vowels are found in the nucleus while consonants are found on the edges (called word boundaries) in onset (before the vowel) and coda (after the vowel) positions (Hildebrandt Nov. 10th 2021). The amount and type of onsets and codas that a given word can have depends on the language and its syllable restrictions, but every syllable no matter the language will have a nucleus (ibid).  
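
The idea that each nucleus is a sonority peak can be sketched with a simple numeric scale. This is a rough illustration under several assumptions: the six-step scale (stops lowest, vowels highest) is a common textbook ranking rather than one taken from the sources cited here, the segment table covers only a few plain consonants and vowels, and the strict peak test does not handle two vowels in a row. The names SONORITY, SEGMENT_CLASS, sonority_profile, and nucleus_positions are invented for this example.

```python
# Assumed sonority scale: higher numbers are more sonorous.
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5, "vowel": 6}

# Assumed class of each segment (a small sample only).
SEGMENT_CLASS = {
    "p": "stop", "b": "stop", "t": "stop", "k": "stop",
    "f": "fricative", "s": "fricative", "z": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
    "w": "glide", "j": "glide",
    "a": "vowel", "e": "vowel", "i": "vowel", "o": "vowel", "u": "vowel",
}


def sonority_profile(segments):
    """Map each segment to its sonority value."""
    return [SONORITY[SEGMENT_CLASS[s]] for s in segments]


def nucleus_positions(segments):
    """Return the indexes of segments that are more sonorous than both neighbours."""
    profile = sonority_profile(segments)
    peaks = []
    for i, value in enumerate(profile):
        left = profile[i - 1] if i > 0 else 0
        right = profile[i + 1] if i < len(profile) - 1 else 0
        if value > left and value > right:
            peaks.append(i)
    return peaks


print(sonority_profile(list("moto")), nucleus_positions(list("moto")))
```

Counting the peaks gives the number of syllables for simple words like this one; words with adjacent vowels would need a less strict test.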

Other aspects of languages that relate to syllables are tone and stress. Tone places different pitch values over identical sound segments to create different meanings (Hildebrandt Nov. 10th 2021). Oftentimes, the syllable is the tone bearing unit (TBU) with tone being placed on the vowel sound in particular. Additionally, different types of tones can be restricted to different parts of the syllable. For example, in some languages level tones (high and low tones) can appear in all types of syllables whereas contour tones (falling and rising tones) can only appear on monosyllables or on the final syllable of disyllabic words (Gussenhoven 148-149). Stress is the degree of emphasis that is placed on a syllable or word (Hildebrandt, Nov 17 2021). Like with tone, stress often is represented at the syllable level, with certain syllables (called heavy or stressed syllables) carrying more stress than others (called weak or unstressed syllables) (Gussenhoven 214). 
 

Syllables in Swahili 

According to page 50 of Polome's handbook, the smallest syllables in Swahili are comprised of only a single vowel or syllabic nasal while larger ones are comprised of one or more consonants in the onset position and then the vowel. Every syllable in Swahili ends with a vowel, meaning the consonant immediately after the vowel marks a new syllable. The only exception is with non-Bantu loan words (i.e. words from Arabic), which can have a consonant in the coda position. 
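
The rule that every syllable ends with a vowel can be sketched as a simple splitting procedure. This is a minimal sketch with clear limits: it treats each character as one segment, leaves out stress marks, and does not handle syllabic nasals or loan words with a coda consonant (those would be split incorrectly). The names VOWELS, syllabify, and cv_pattern are invented for this example.

```python
VOWELS = {"a", "e", "i", "o", "u"}


def syllabify(segments):
    """Split segments into open syllables: each syllable closes right after its vowel."""
    syllables, current = [], []
    for seg in segments:
        current.append(seg)
        if seg in VOWELS:
            syllables.append("".join(current))
            current = []
    if current:
        # Leftover consonants with no following vowel are kept as a final chunk.
        syllables.append("".join(current))
    return syllables


def cv_pattern(syllable):
    """Describe a syllable as a string of C (consonant) and V (vowel) slots."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable)


for word in ["fua", "mbuzi", "kiti"]:
    parts = syllabify(list(word))
    print(".".join(parts), ".".join(cv_pattern(p) for p in parts))
```

Closing a syllable as soon as a vowel is reached means all consonants between two vowels are assigned to the onset of the following syllable, which matches the description above for native words.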

The most common types of syllables in Swahili are V (just the vowel or syllabic nasal) and CV (an onset consonant + a vowel) syllables. CCV syllables (two onset consonants + a vowel) are restricted to syllables that have a nasal as the first consonant, or that have the voiced palatal approximant /j/ or the voiced labial velar approximant /w/ as the second consonant. The remaining two types of syllables, C(C)VC (up to two onset consonants + a vowel + one coda consonant) and CCCV (three onset consonants + a vowel), are found only in non-Bantu loan words. There is one exception to that restriction, however: CCCV syllables can in fact occur in a Swahili word when the first consonant is a nasal and the third one is /w/ (Polome 50). Below are examples of these syllable types found in Word List 1 of the UCLA Phonetics Lab Archive entry for Swahili. Note that an example of CCCV was not included in any of the word lists and so it is not shown here.

1.) [fú.a] (hammer, beat)
       This word is comprised of two syllables, [fú] and [a]. The first syllable, [fú], is an example of a CV syllable. The second syllable, [a], is an example of the V syllable. Below is a visual breakdown of these syllables. Note that C means consonant, V means vowel, O means onset, N means nucleus, and . indicates the break between syllables: 
                                                                                        fú.a 
                                                                                      ON.N
                                                                                      CV .V

2.) [nám.ba] (number) 
       As in the previous example, this word is also comprised of two syllables, [nám] and [ba]. The first syllable, [nám], is an example of a CVC syllable (which indicates that it is a non-Bantu loan word). The second syllable, [ba], is another example of a CV syllable. Below is a visual breakdown of these syllables. Note that in the first row of labels O means onset, N means nucleus, and C means coda, while in the second row of labels C means consonant and V means vowel. As before, . indicates the break between syllables: 
                                                                                       nám.ba
                                                                                      ONC.ON
                                                                                      CVC.CV

3.) [mbú.zi] (goat) 
        Here is a third word, again comprised of two syllables, [mbú] and [zi]. The first syllable, [mbú], is an example of a CCV syllable. This syllable abides by the restriction placed on CCV syllables and has the bilabial nasal /m/ as the first consonant. The second syllable [zi] is a third example of the CV syllable. Below is a visual breakdown of these syllables. Note that C means consonant, V means vowel, O means onset, N means nucleus, and . indicates the break between syllables: 
                                                                                        mbú.zi
                                                                                        OON.ON
                                                                                        CCV.CV
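
The CV breakdowns in the three examples above can also be produced mechanically. The following Python sketch is only an illustration and is not taken from Polome or the UCLA archive: the function names are hypothetical, the input is assumed to be already split into syllables with ".", each letter is assumed to stand for one sound, and syllabic nasals (which count as V in Swahili) are not handled. The last function checks the restriction on CCV syllables described earlier (a nasal as the first consonant, or /j/ or /w/ as the second).

```python
import unicodedata

# Assumption: five plain vowel letters; every other letter is one consonant.
VOWELS = set("aeiou")
NASALS = set("mnɲŋ")
GLIDES = set("jw")


def strip_marks(syllable):
    """Drop accents (as in 'ú') and modifier letters such as ʰ and ː."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
        and unicodedata.category(ch) != "Lm"
    )


def cv_skeleton(syllable):
    """Return the CV shape of one syllable, e.g. 'mbú' -> 'CCV'."""
    return "".join("V" if ch in VOWELS else "C" for ch in strip_marks(syllable))


def word_skeleton(word):
    """Return the CV shape of a word whose syllables are separated by '.'."""
    return ".".join(cv_skeleton(s) for s in word.split("."))


def ccv_allowed(syllable):
    """Check the CCV restriction: nasal first, or /j/ or /w/ second."""
    letters = strip_marks(syllable)
    if cv_skeleton(syllable) != "CCV":
        raise ValueError("expected a CCV syllable")
    return letters[0] in NASALS or letters[1] in GLIDES


for word in ["fú.a", "nám.ba", "mbú.zi"]:
    print(word, word_skeleton(word))
# fú.a CV.V
# nám.ba CVC.CV
# mbú.zi CCV.CV
```

Under these assumptions, ccv_allowed("mbú") returns True because the first consonant is the nasal /m/, which matches the description of example 3.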


There are hints of tone in Swahili, although Swahili's tonal system has lost the complexity of its ancestral language (Jerro 6-7). Polome defines what is left as an "intonation system" that has different patterns based on Arabic, English, and the native tongues of non-native Swahili speakers in mainland Africa (51-52). Despite this variation, some general patterns can be found (Polome 52). These patterns include three contrastive levels of pitch, known as high, mid, and low (ibid). Between these levels there are also intermediate tones that can be found in ascending and descending pitch contours (ibid). 

Swahili also has a stress system. According to page 50 of Polome's handbook, "stress occurs as a rule on the penultimate vowel phoneme or syllabic allophone of a nasal phoneme", with penultimate meaning next to last. Note that there are irregular stress patterns resulting from Arabic influence, in which the stress falls on the antepenultimate syllable (the syllable before the penultimate syllable) rather than on the penultimate syllable (Jerro 8-9). Interestingly, stressed syllables in Swahili are also characterized by a slightly higher pitch than the unstressed syllable before them, specifically when in the onset position (Polome 52). 
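
The penultimate stress rule can be written out as a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the word is assumed to be already divided into syllables, the antepenultimate pattern must be requested explicitly because the sketch cannot tell which words follow it, and a word with too few syllables is assumed to carry stress on its first syllable.

```python
def stressed_index(syllables, antepenultimate=False):
    """Return the position of the stressed syllable in a list of syllables.

    By default the next-to-last syllable is stressed. Setting
    antepenultimate=True selects the syllable before that one instead.
    """
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    offset = 3 if antepenultimate else 2
    # Assumption: short words fall back to their first syllable.
    return max(len(syllables) - offset, 0)


def mark_stress(syllables, antepenultimate=False):
    """Join the syllables with '.' and put the IPA stress mark ˈ before the stressed one."""
    target = stressed_index(syllables, antepenultimate)
    return ".".join(
        "ˈ" + syl if i == target else syl
        for i, syl in enumerate(syllables)
    )


# The syllable divisions below are assumed for illustration.
print(mark_stress(["mbu", "zi"]))         # ˈmbu.zi
print(mark_stress(["mwi", "tu", "ni"]))   # mwi.ˈtu.ni
```

Keeping the antepenultimate case as an explicit option reflects the fact that those words are exceptions that have to be known in advance rather than predicted from the shape of the word.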

The image carousel below illustrates stress visually. The first image depicts the unstressed /e/ in the word [em͡be] (meaning 'mango') and the second image depicts the stressed [e̝] in the word [nɛ̝mɑ] (meaning 'bend, yield'). Notice how the second image has a much darker spectrogram, a higher pitch line (the blue line), and a higher frequency line (the yellow line) than the first image of the unstressed /e/. 

Focus on Aspiration  

Aspiration is a form of devoicing where the vocal folds do not vibrate together at first and therefore do not create the sound until a little bit later than normal (Gussenhoven 18). This phenomenon is marked with a superscript h that looks like ʰ (ibid). Aspiration plays a large part in the Swahili sound inventory and helps contribute to its large consonant inventory in particular (Jerro 9). An example of aspiration in Swahili was mentioned earlier in this chapter with the sound /pʰ/, and I will provide it again here, with both an audio and a visual representation. When listening to the sound file, keep in mind that the first word provided ([paː] - 'roof') is pronounced with the unaspirated /p/ sound while the second word ([pʰaː] - 'gazelle') is pronounced with the aspirated /pʰ/ sound. While looking at the image carousel below, note how the highlighted section in the first photo, which represents the /p/ sound, is much larger than the highlighted section in the second photo, which represents the /pʰ/ sound. Also notice how the blue line, which indicates pitch (a phenomenon that is tied to vowels), starts much earlier in the second image than in the first. This is because aspiration shortens the presence of the consonant sound as well as the force behind it, therefore allowing the vowel sound to begin sooner. 







Part of what makes aspiration in Swahili so interesting is that it does not seem to be as well defined a contrastive unit as it used to be (Polome 39). Aspiration can be seen as a contrastive lexical unit within sounds themselves, with minimal pairs such as /p/ and /pʰ/ (ibid), but the more interesting evidence of aspiration is seen with nouns and their augmentatives (Polome 40). For example, pembe pronounced with a /pʰ/ means 'horn' while pembe pronounced with a /p/ means 'big horn', tendu pronounced with a /tʰ/ means 'hole' while tendu pronounced with a /t/ means 'big hole', and kuta pronounced with a /kʰ/ means 'walls' while kuta pronounced with a /k/ means 'big walls' (ibid). Unfortunately, this kind of contrastive value is much less common than that of lexical units like /p/ and /pʰ/, because most speakers generally use it only before a voiceless obstruent, and in Northern Swahili it is only noticeable when the word is lexicalized (ibid). There are, however, some Swahili speakers who carry this form of contrastive aspiration over to the voiceless obstruents of other noun classes (ibid). These changes in the use of aspiration are an excellent example of how even well-established languages are constantly undergoing change. According to Polome, these changes can be partially attributed to "the low functional yield of aspiration in the semantic field" which makes it unknown to many speakers, "the tendency to aspirate initial voiceless stops and affricates" which "leads to the confusion of the augmentative and the ordinary form of some nouns", the absence of a written notation for aspiration, and the growing influence of non-native speakers (41). 

Another interesting aspect of aspiration in Swahili is that it is often connected to stress and initial position. This means aspiration typically falls in stressed syllables rather than in unstressed ones, and it is often the beginning sound of a word rather than a middle or ending sound. Examples of this can be seen with /t/ vs /tʰ/. The word 'mwitu' (meaning 'forest') is pronounced with /t/, while 'mwituni' (meaning 'in the forest') is pronounced with /tʰ/ in the stressed penultimate syllable of the word. Additionally, the phrase 'una takataka' (meaning 'you are dirty') is pronounced with /tʰ/ in the word-initial position and in the stressed penultimate syllable (Polome 41). 
 

References Cited 

Ashby, Patricia. Understanding Phonetics. Hodder Education, 2011.
Blevins, Juliette. “The Syllable in Phonological Theory.” The Handbook of Phonological Theory, edited by John A Goldsmith, Blackwell, London, 1996, pp. 206–234.
Filippo, Cesare de, et al. “Niger Congo Language Tree.” Research Gate, 2011, www.researchgate.net/figure/Niger-Congo-language-tree-Schematic-tree-of-the-NigerCongo-language-phylum-that_fig1_49637635.
"Language Swahili." The World Atlas of Language Structures Online, https://wals.info/languoid/lect/wals_code_swa
Gussenhoven, Carlos, and Haike Jacobs. Understanding Phonology. Hodder Education, 2011.
Hildebrandt, Kristine. "Phonological Analysis." ENG 408. 1 Sept. 2021.
---. "Phonological Analysis." ENG 408. 9 Sept. 2021. 
---. "Phonological Analysis." ENG 408. 15 Sept. 2021. 
---. "Phonological Analysis." ENG 408. 10 Nov. 2021. 
---. "Phonological Analysis." ENG 408. 17 Nov. 2021. 
“International Phonetic Alphabet Chart.” Wikipedia, Wikimedia Foundation, 11 Aug. 2021, en.wikipedia.org/wiki/International_Phonetic_Alphabet_chart#Official_chart.
Jerro, Kyle. “1. Linguistic Complexity: A Case Study from Swahili.” African Linguistics on the Prairie: Selected Papers from the 45th Annual Conference on African Linguistics, edited by Jason Kandybowicz et al., Language Science Press, Berlin, pp. 3–19.
“Language Swahili.” Edited by Steven Moran and Daniel McCloy, PHOIBLE 2.0, https://phoible.org/languages/swah1253.
Mugane, John M. The Story of Swahili. Ohio University Press, in association with the Ohio University Center for International Studies, 2015.
Peterfitzgerald. “East Africa Regions Map.” East Africa, Wikitravel, 19 July 2011, wikitravel.org/en/File:East_Africa_regions_map.png.
Polome, Edgar C. Swahili Language Handbook. Center for Applied Linguistics, 1967.
Soumya-8974. “Swahili Speaking Africa.” Swahili Language, Wikipedia, 29 June 2020, en.wikipedia.org/wiki/Swahili_language#/media/File:Swahili-speaking_Africa.png.
"Spoken L1 Language: Swahili." Glottolog 4.4, https://glottolog.org/resource/languoid/id/swah1253
SUM1. “Map of Niger Congo Languages.” Niger-Congo Languages, Wikipedia, 1 June 2018, en.wikipedia.org/wiki/Niger%E2%80%93Congo_languages#/media/File:Map_of_the_Niger%E2%80%93Congo_languages.svg.
“Swahili Language.” Wikipedia, Wikimedia Foundation, 23 Oct. 2021, https://en.wikipedia.org/wiki/Swahili_language.
The UCLA Phonetics Lab Archive. UCLA Department of Linguistics, 2007, archive.phonetics.ucla.edu/.
 

Created by Hope Krisko, Fall 2021
