Swedish phonology

Swedish has a large vowel inventory, with nine vowels distinguished in quality and to some degree in quantity, making 18 vowel phonemes in most dialects. Another notable feature is the pitch accent, a development which it shares with Norwegian. Swedish pronunciation of most consonants is similar to that of other Germanic languages.

There are 18 consonant phonemes, of which /ɧ/ and /r/ show considerable variation depending on both social and dialectal context.

Finland Swedish has a slightly different phonology.

Vowels


Swedish has nine vowels that, as in many other Germanic languages, exist in pairs of long and short versions. The length covaries with the quality of the vowels, as shown in the table below (long vowels in the first column, short in the second), with short variants being more centered and lax. The length is generally viewed as the primary distinction, with quality being secondary. No short vowels appear in open stressed syllables. The front vowels appear in rounded-unrounded pairs: –, –, – and –.


 * Central Standard Swedish is a near-close near-front compressed vowel  that differs from  by the type of rounding. In other dialects,  may be central.
 * are mid.
 * has been variously described as central and front.

Rounded vowels have two types of rounding:
 * ,, and  are compressed , ,  and
 * ,, and its pre- allophone ,  and its pre- allophone ,  and  are protruded , , , , , ,  and.

Type of rounding is the primary way of distinguishing from, especially in Central Standard Swedish.

, (in stressed syllables),  (with a few exceptions) and  are lowered to, ,  and , respectively, when preceding.


 * ära →  ('honor')
 * ärt →  ('pea')
 * öra →  ('ear')
 * dörr →  ('door')

The low allophones are becoming unmarked in younger speakers of Stockholm Swedish, so that läsa ('to read') and köpa ('to buy') are pronounced and  instead of standard  and. These speakers often also pronounce pre-rhotic and  even lower, i.e.  and. This is especially true for the long allophone. Also, the allophone is sometimes difficult to distinguish from the long.

In some pronunciations, traditionally characteristic of the varieties spoken around Gothenburg and in Östergötland, but today more common e.g. in Stockholm and especially in younger speakers, and  merge, most commonly into  (especially before  and the retroflex consonants). Words like fördömande ('judging', pronounced in Standard Swedish) and fördummande ('dumbing', pronounced  in Standard Swedish) are then often pronounced similarly or identically, as.

In Central Standard Swedish, unstressed is slightly retracted, but is still a front vowel rather than central. However, the latter pronunciation is commonly found in Southern Swedish. Therefore, begå 'to commit' is pronounced in Central Standard Swedish and  in Southern Swedish. Before, southerners may use a back vowel. In Central Standard Swedish, a true schwa is commonly found as a vocalic release of word-final lenis stops, as in e.g. bädd  'bed'.

In many central and eastern areas (including Stockholm), the contrast between short and  is lost. The loss of this contrast has the effect that hetta ('heat') and hätta ('cap') are pronounced the same.

In Central Standard Swedish, long is weakly rounded. The rounding is stronger in Gothenburg and weaker in most North Swedish dialects.

One of the varieties of is made with a constriction that is more forward than is usual. Peter Ladefoged and Ian Maddieson describe this vowel as being pronounced "by slightly lowering the body of the tongue while simultaneously raising the blade of the tongue (...) Acoustically this pronunciation is characterized by having a very high F3, and an F2 which is lower than that in ." They suggest that this may be the usual Stockholm pronunciation of.

There is some variation in the interpretations of vowel length's phonemicity. , for example, treats vowel quantity as its own separate phoneme (a "prosodeme") so that long and short vowels are allophones of a single vowel phoneme.

Patterns of diphthongs of long vowels occur in three major dialect groups. In Central Standard Swedish, the high vowels, , and  are realized as narrow closing diphthongs with fully close ending points:. According to Engstrand, the second element is so close as to become a palatal or bilabial fricative:. Elsewhere in the article, the broad transcription ⟨iː yː ʉː uː⟩ is used.

In Central Standard Swedish,, and  are often realized as centering diphthongs ,  and.

In Southern Swedish dialects, particularly in Scania and Blekinge, the diphthongs are preceded by a rising of the tongue from a central position so that and  are realized as  and  respectively. A third type of distinctive diphthong occurs in the dialects of Gotland. The pattern of diphthongs is more complex than that of southern and eastern Sweden;, and  tend to rise while  and  fall; , ,  and  are not diphthongized at all.

Consonants
The table below shows the Swedish consonant phonemes in spoken Standard Swedish.

are dental, but can be either dental  or alveolar. If is alveolar, then  is also alveolar. Dental realization of is the predominant one in Central Standard Swedish.

Stops
Initial fortis stops (/p, t, k/) are aspirated in stressed position, but unaspirated when preceded by /s/ within the same morpheme. Hence ko ('cow') is [kʰuː], but sko ('shoe') is [skuː]. Compare English cool (aspirated) vs. school (unaspirated). In Finland Swedish, aspiration does not occur and initial lenis stops are usually voiced throughout. Word-medial lenis stops are sometimes voiceless in Finland, a likely influence from Finnish.
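The aspiration rule can be illustrated with a minimal Python sketch. The function name `aspirate_onset`, the use of /p, t, k/ as the fortis stops, and the treatment of Finland Swedish as "never aspirate" are simplifying assumptions made for illustration; the input is taken to be a broad transcription of a morpheme-initial onset.

```python
FORTIS_STOPS = {"p", "t", "k"}  # assumed set of fortis stops


def aspirate_onset(onset: str, stressed: bool = True, finland: bool = False) -> str:
    """Return a narrow transcription of a morpheme-initial onset.

    Only the first segment is inspected, so a fortis stop preceded by
    /s/ (as in "sk") is left unaspirated.
    """
    if not onset or finland or not stressed:
        # No aspiration in unstressed position or (by assumption) in Finland Swedish.
        return onset
    first = onset[0]
    if first in FORTIS_STOPS:
        # Initial fortis stop in a stressed syllable: mark aspiration.
        return first + "ʰ" + onset[1:]
    # Any other onset, including s + stop, is returned unchanged.
    return onset
```

Under these assumptions, `aspirate_onset("k")` gives the onset of ko and `aspirate_onset("sk")` the onset of sko.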

Preaspiration of medial and final fortis stops, including the devoicing of preceding sonorants, is common, though its length and normativity vary from dialect to dialect, being optional (and idiolectal) in Central Standard Swedish but obligatory in, for example, the Swedish dialects of Gräsö, Vemdalen and Arjeplog. In Gräsö, preaspiration is blocked in certain environments (such as an following the fortis consonant or a morpheme boundary between the vowel and the consonant), while it is a general feature of fortis medial consonants in Central Standard Swedish. When not preaspirated, medial and final fortis stops are simply unaspirated. In clusters of fortis stops, the second "presonorant" stop is unaspirated and the former patterns with other medial and final stops (that is, it is either unaspirated or is preaspirated).

The phonetic attributes of preaspiration also vary. In the Swedish of Stockholm, preaspiration is often realized as a fricative subject to the character of surrounding vowels or consonants so that it may be labial, velar, or dental; it may also surface as extra length of the preceding vowel. In the province of Härjedalen, though, it resembles or. The duration of preaspiration is highest in the dialects of Vemdalen and Arjeplog. Helgason notes that preaspiration is longer after short vowels, in lexically stressed syllables, as well as in pre-pausal position.

Fricatives
is dental in Central Standard Swedish, but retracted alveolar  in Blekinge, Bohuslän, Halland and Scania.

The Swedish fricatives and  are often considered to be the most difficult aspects of Swedish pronunciation for foreign students. The combination of occasionally similar and rather unusual sounds as well as the large variety of partly overlapping allophones of often presents difficulties for non-natives in telling the two apart. The existence of a third sibilant in the form of tends to confuse matters even more, and in some cases realizations that are labiodental can also be confused with. In Finland Swedish, is an affricate:  or.

The Swedish phoneme (the "sje-sound" or voiceless postalveolar-velar fricative) and its alleged coarticulation is a difficult and complex issue debated amongst phoneticians. Though the acoustic properties of its allophones are fairly similar, the realizations can vary considerably according to geography, age, gender as well as social context and are notoriously difficult to describe and transcribe accurately. Most common are various sh-like sounds, with occurring mainly in northern Sweden and  in Finland. A voiceless uvular fricative,, can sometimes be used in the varieties influenced by major immigrant languages like Arabic and Kurdish. The different realizations can be divided roughly into the following categories:


 * "Dark sounds" –, commonly used in the Southern Standard Swedish. Some varieties specific, but not exclusive, to areas with a larger immigrant population commonly realize the phoneme as a voiceless uvular fricative.
 * "Light sounds" –, used in the northern varieties and , and (or something in between) in Finland Swedish.
 * Combination of "light" and "dark" – darker sounds are used as morpheme initials preceding stressed vowels (sjuk 'sick', station 'station'), while the lighter sounds are used before unstressed vowels and at the end of morphemes (bagage 'baggage', dusch 'shower').

Sonorants
has distinct variations in Standard Swedish. For most speakers, the realization as an alveolar trill occurs only in contexts where emphatic stress is used. In Central Swedish, it is often pronounced as a fricative (transcribed as ) or approximant (transcribed as ), which is especially frequent in weakly articulated positions such as word-finally and somewhat less frequent in stressed syllable onsets, in particular after other consonants. It may also be an apico-alveolar tap. One of the most distinct features of the southern varieties is the uvular realization of, which may be a trill , a fricative or an approximant. In Finland, is usually an apical trill, and may be an approximant  postvocalically.

In most varieties of Swedish that use an alveolar (in particular, the central and northern forms), the combination of  with dental consonants  produces retroflex consonant realizations, a recursive sandhi process called "retroflexion". Thus, ('map') is realized as,  ('north') as ,  ('Vänern') as , and  ('fresh') as. The process of retroflexion is not limited to just one dental, and e.g. först is pronounced. The combination of and  does not uniformly cause retroflexion, so that it may also be pronounced with two separate consonants, and even, occasionally in a few words and expressions, as a mere. Thus sorl ('murmur') may be pronounced, but also.
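The recursive character of retroflexion can be sketched in Python. The mapping from dental to retroflex symbols below uses the standard IPA retroflex letters and is an assumption of this sketch, as are the function name `retroflex` and the example transcriptions. The variable behaviour of /r/ plus /l/, the effect of boundaries, and the spelling ⟨rr⟩ are not modelled.

```python
# Assumed mapping from dental consonants to their retroflex counterparts.
RETROFLEX = {"t": "ʈ", "d": "ɖ", "n": "ɳ", "s": "ʂ", "l": "ɭ"}


def retroflex(segments: str) -> str:
    """Apply r + dental -> retroflex, scanning left to right.

    A retroflex produced by the rule itself also triggers the rule on a
    following dental, which is what makes the process recursive.
    """
    out = []
    trigger = False  # True if the previous segment was r or a derived retroflex
    for seg in segments:
        if trigger and seg in RETROFLEX:
            if out and out[-1] == "r":
                out.pop()  # the r is absorbed into the retroflex consonant
            out.append(RETROFLEX[seg])
            # trigger stays True so that a further dental is affected too
        else:
            out.append(seg)
            trigger = seg == "r"
    return "".join(out)
```

With an assumed broad transcription of först as `"fœrst"`, the function returns `"fœʂʈ"`: the /s/ becomes retroflex after /r/, and the /t/ becomes retroflex after the derived retroflex.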

In Gothenburg and neighbouring areas (such as Mölndal and Kungälv) the retroflex consonants are substituted by alveolar ones, with their effects still remaining. For example: is  not,  is , not. However,, unlike what many other Swedes believe, is not but , i.e.  is , not.

As the adjacent table shows, this process is not limited by word boundaries, though there is still some sensitivity to the type of boundary between the and the dental in that retroflexion is less likely with boundaries higher up in the prosodic hierarchy. In the southern varieties, which use a uvular, retroflex realizations do not occur. For example, ('map') is realized as  (note that Tone 2 in Malmö sounds like Tone 1 in Stockholm), etc. An  spelled ⟨rr⟩ usually will not trigger retroflexion so that spärrnät  ('anti-sub net') is pronounced. Retroflexion also does not usually occur in Finland.

Variations of are not as common, though some phonetic variation exists, such as a retroflex flap  that exists as an allophone in proximity to a labial or velar consonant (e.g. glad ('glad')) or after most long vowels.

In casual speech, the nasals tend to assimilate to the place of articulation of a following obstruent so that, for example, han kom ('he came') is pronounced.
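As a rough illustration of this assimilation, the Python sketch below changes /n/ to a labial or velar nasal before a labial or velar obstruent, looking across a word boundary written as a space. The obstruent sets, the function name `assimilate_nasal`, and the restriction to /n/ are assumptions of the sketch, not a full account of casual speech.

```python
LABIAL_OBSTRUENTS = set("pbfv")  # assumed labial obstruents
VELAR_OBSTRUENTS = set("kɡg")    # assumed velar obstruents (both g glyphs accepted)


def assimilate_nasal(segments: str) -> str:
    """Assimilate n to the place of a following labial or velar obstruent."""
    out = list(segments)
    for i, seg in enumerate(out):
        if seg != "n":
            continue
        # Find the next segment, skipping any word-boundary spaces.
        j = i + 1
        while j < len(out) and out[j] == " ":
            j += 1
        if j == len(out):
            continue  # utterance-final nasal: nothing to assimilate to
        if out[j] in LABIAL_OBSTRUENTS:
            out[i] = "m"
        elif out[j] in VELAR_OBSTRUENTS:
            out[i] = "ŋ"
    return "".join(out)
```

Given the simplified input `"han kɔm"` for han kom, the final nasal of han comes out as the velar nasal.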

and are pronounced with weak friction and function phonotactically with the sonorants.

Stress
In Swedish, stress is not fixed. Primary stress can fall on one of the last three syllables in a word’s stem. This can lead to surface contrasts based solely on difference in position of stress:


 * formel 'formula'
 * formell 'formal'

Primary stressed syllables are always metrically heavy, i.e. they contain either a long vowel or a short vowel followed by a consonant. In phonological analyses of Swedish, stressed syllables in underived forms are assumed to be associated with a basic moraic trochaic foot [μ μ]σ, e.g. bˈil 'car' (stress marked as (ˈ)). More whole-word based analyses of metrical structure, in which affixes are included, also assume other foot types, in particular syllabic trochaic feet [σ σ]Ft, e.g. bˈil-ar 'cars'. Affixes affect stress to a considerable degree in the sense that inflectional suffixes can never receive primary stress (bˈil-ar-na 'the cars'), whereas many derivational suffixes can (tent-ˈabel 'examinable'). Disyllabic words with accent 2 like ˈandˌe 'spirit', kvˈinnˌa 'woman', bˈilˌar 'cars' have secondary stress on the second syllable. In the Swedish Academy's lexicon, these disyllables are transcribed with the stress pattern 3 2, e.g. kvin3a2, where (3) stands for a primary stressed syllable with accent 2 and (2) represents a 'secondary stressed' syllable in words with accent 2. This secondary stress is assumed to have existed in Old Norse (see  and references therein). Compound words have primary stress on the first element and secondary stress on the last element, e.g. bˈil-dels-butˌiken 'car-part shop' (secondary stress marked as (ˌ)).

Pitch accents
Stressed syllables carry one of two different tones, often described as pitch accents, or tonal word accents. They are called acute and grave accent, or accent 1 and accent 2. The actual realization of these two tones varies from dialect to dialect. In the central Swedish dialect of Stockholm, accent 1 is characterized by a low tone at the beginning of the stressed syllable (fìsken 'the fish') and accent 2 by a high tone at the beginning of the stressed syllable (mátta 'mat'). When the word is in a prominent/focused position, a high tone often occurs following the word accent (fìskén). In accent 2 words, this results in two high tones within the word (e.g. máttá), hence the term "two-peaked" for this dialect. In southern Swedish, a "one-peaked" dialect, accent 1 is realized as a high tone at the beginning of the stressed syllable (físken) and accent 2 as a low tone (màtta). Generally, the grave accent is characterized by a later timing of the word accent pattern as compared with the acute accent.

The phonemicity of this tonal system is demonstrated in the nearly 300 pairs of two-syllable words differentiated only by their use of either grave or acute accent. Outside of these pairs, the main tendency for tone is that the acute accent appears in monosyllables (since the grave accent cannot appear in monosyllabic words) while the grave accent appears in polysyllabic words. Polysyllabic forms resulting from declension or derivation also tend to have a grave accent except when it is the definite article that is added. This tonal distinction has been present in Scandinavian dialects at least since Old Norse though a greater number of polysyllables now have an acute accent. These are mostly words that were monosyllabic in Old Norse, but have subsequently become disyllabic, as have many loanwords. For example, Old Norse kømr ('comes') has become kommer in Swedish (with an acute accent).
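The main tendency described above can be written as a deliberately crude heuristic in Python. It is only a sketch of the stated tendency, not a rule of Swedish: the function name `default_accent` and the idea of counting the syllables of the form before any definite article is added are assumptions, and the many acute polysyllables (such as kommer) are exceptions that the heuristic gets wrong.

```python
ACUTE, GRAVE = 1, 2  # accent 1 and accent 2


def default_accent(syllables_before_article: int) -> int:
    """Guess the word accent from a syllable count.

    The count is taken on the form without the definite article, so a
    monosyllable plus article still counts as 1 and comes out acute.
    """
    if syllables_before_article < 1:
        raise ValueError("a word needs at least one syllable")
    return ACUTE if syllables_before_article == 1 else GRAVE
```

For example, and 'mallard' has one syllable before the article is added, so the heuristic predicts acute for its definite form, whereas ande 'spirit' has two, so it predicts grave.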

The distinction can be shown with the minimal pair anden 'the mallard' (tone 1) and anden 'the spirit' (tone 2).

 * Acute accent: (realized  = ) 'the mallard' (from and 'mallard'). In Central Swedish, this is a high, slightly falling tone followed by a low tone; that is, a single drop from high to low pitch spread over two syllables.
 * Grave accent: (realized  = ) 'the spirit' (from ande 'spirit'). In Central Swedish, this is a mid falling tone followed by a high falling tone; that is, a double falling tone.

The exact realization of the tones also depends on the syllable's position in an utterance. For instance, at the beginning of an utterance, the acute accent may have a rising rather than slightly falling pitch on the first syllable. Also, these are word tones that are spread across the syllables of the word. In trisyllabic words with the grave accent, the second fall in pitch is distributed across the second and third syllables:


 * Grave-accent trisyllable: flickorna (realized  = ) 'the girls'

The position of the tone is dependent upon stress: The first stressed syllable has a high or falling tone, as does the following syllable(s) in grave-accented words.

In most Finland-Swedish varieties, however, the distinction between grave and acute accent is missing.

A reasonably complete list of uncontroversial so-called minimal pairs can be seen below. The two words in each pair are distinguished solely by having different tone (acute vs. grave). In those cases where both words are nouns it would have been possible to list the genitive forms of the words as well, thereby creating another word pair, but this has been avoided. A few word pairs where one of the words is a plural form with the suffix -or have been included. This is due to the fact that many Swedish-speakers in all parts of Sweden pronounce the suffix -or the same way as -er.

Note that karaten/karaten is the only pair with more than two syllables (although we would get a second one if we used the definite forms of the pair perser/pärser, i.e. perserna/pärserna). The word pair länder ('countries', plural of land) and länder ('loins', plural of länd) could have been included, but this one is controversial. For those speakers who have grave accent in the plural of länd, the definite plural forms will also constitute a three-syllable minimal pair: länderna (acute accent, 'the countries') vs. länderna (grave accent, 'the loins'). Although examples with more than two syllables are very few in Standard Swedish, it is possible to find other three-syllable pairs in regional dialects, such as Värmländska: hunnera (acute, 'the Huns') vs. hunnera (grave, 'the dogs'), ändera/ännera (acute, 'the mallards') vs. ändera/ännera (grave, 'the ends'), etc.

Prosody in Swedish often varies substantially between different dialects including the spoken varieties of Standard Swedish. As in most languages, stress can be applied to emphasize certain words in a sentence. To some degree prosody may indicate questions, although less so than in English.

Phonotactics
At a minimum, a stressed syllable must consist of either a long vowel or a short vowel and a long consonant. Like many other Germanic languages, Swedish has a tendency for closed syllables with a relatively large number of consonant clusters in initial as well as final position. Though not as complex as that of most Slavic languages, examples of up to 7 consecutive consonants can occur when adding Swedish inflections to some foreign loanwords or names, and especially when combined with the tendency of Swedish to make long compound nouns. The syllable structure of Swedish can therefore be described with the following formula:


 * (C)(C)(C)V(C)(C)(C)

This means that a Swedish one-syllable morpheme can have up to three consonants preceding the vowel that forms the nucleus of the syllable, and three consonants following it. Examples: skrämts (verb 'scare' past participle, passive voice) or sprängts  (verb 'explode' past participle, passive voice). All but one of the consonant phonemes,, can occur at the beginning of a morpheme, though there are only 6 possible three-consonant combinations, all of which begin with , and a total of 31 initial two-consonant combinations. All consonants except for and  can occur finally, and the total number of possible final two-consonant clusters is 62.
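The template can be checked mechanically once a syllable is given as a sequence of phonemes. The Python sketch below is illustrative only: the vowel inventory in `VOWELS` is an assumed, non-exhaustive set of IPA symbols, the helper names are hypothetical, and the phoneme sequences used for skrämts and sprängts are assumed broad transcriptions rather than forms taken from the text.

```python
import re

# Assumed, non-exhaustive set of vowel symbols; everything else counts as C.
VOWELS = set("iyʉueøoɛœɔaɑɪʏɵʊ")

# (C)(C)(C)V(C)(C)(C) expressed over a string of C and V letters.
TEMPLATE = re.compile(r"C{0,3}VC{0,3}")


def cv_skeleton(phonemes):
    """Map each phoneme to C or V, ignoring a trailing length mark."""
    return "".join("V" if p.rstrip("ː") in VOWELS else "C" for p in phonemes)


def fits_template(phonemes):
    """True if the whole sequence matches (C)(C)(C)V(C)(C)(C)."""
    return TEMPLATE.fullmatch(cv_skeleton(phonemes)) is not None
```

Working on phonemes rather than letters matters here: the spelling ⟨ng⟩ in sprängts is two letters but is treated in this sketch as a single consonant, so the coda has three consonants, not four.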

In some cases this can result in very complex combinations, such as in västkustskt, consisting of västkust ('west coast') with the adjective suffix -sk and the neuter suffix -t.

Central Standard Swedish and most other Swedish dialects have a rare "complementary quantity" feature wherein a phonologically short consonant follows a long vowel and a long consonant follows a short vowel; this is true only for stressed syllables and all segments are short in unstressed syllables. This arose from the historical shift away from a system with a four-way contrast (that is,, , and  were all possible) inherited from Proto-Germanic to a three-way one (,  and ), and finally the present two-way one; certain Swedish dialects have not undergone these shifts and exhibit one of the other two phonotactic systems instead. In literature on Swedish phonology, there are a number of ways to transcribe this complementary relationship, including:
 * A length mark for either the vowel  or the consonant
 * Gemination of the consonant ( vs. )
 * Diphthongization of the vowel ( vs. )
 * The position of the stress marker ( vs. )

With the conventional assumption that medial long consonants are ambisyllabic (that is, penna ('pen'), is syllabified as ), all stressed syllables are thus "heavy". In unstressed syllables, the distinction is lost between and  or between. With each successive post-stress syllable, the number of contrasting vowels decreases gradually with distance from the point of stress; at three syllables from stress, only and  occur.
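Complementary quantity amounts to saying that, in a stressed syllable, exactly one of the vowel and the following consonant is long. A minimal Python sketch, assuming the length-mark transcription convention (ː) and hypothetical helper names, with the vowel and the following consonant passed in separately:

```python
LENGTH = "ː"  # IPA length mark, one of the transcription conventions listed above


def rhyme_pattern(vowel: str, consonant: str) -> str:
    """Classify a vowel + consonant pair as VC, VːC, VCː or VːCː.

    Assumes a following consonant is present; open syllables are not handled.
    """
    v = "Vː" if vowel.endswith(LENGTH) else "V"
    c = "Cː" if consonant.endswith(LENGTH) else "C"
    return v + c


def is_complementary(vowel: str, consonant: str) -> bool:
    """True for the two patterns allowed in the present two-way system."""
    return rhyme_pattern(vowel, consonant) in {"VːC", "VCː"}
```

A pair such as a short vowel plus long consonant (as assumed for the stressed syllable of penna) satisfies the check, while short plus short or long plus long does not.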

Sample
The sample text is a reading of The North Wind and the Sun. The transcriptions are based on the section on Swedish found in the Handbook of the International Phonetic Association, in which a man in his forties from Stockholm is recorded reading out the traditional fable in a manner typical of Central Standard Swedish as spoken in his area. The broad transcription is phonemic, while the narrow is phonetic.

Orthographic version
Nordanvinden och solen tvistade en gång om vem av dem som var starkast. Just då kom en vandrare vägen fram, insvept i en varm kappa. De kom då överens om att den som först kunde få vandraren att ta av sig kappan, han skulle anses vara starkare än den andra. Då blåste nordanvinden så hårt han någonsin kunde, men ju hårdare han blåste, desto tätare svepte vandraren kappan om sig, och till slut gav nordanvinden upp försöket. Då lät solen sina strålar skina helt varmt och genast tog vandraren av sig kappan, och så var nordanvinden tvungen att erkänna att solen var den starkaste av de två.