We have seen that allophones are phones belonging to the same phoneme, and that the distribution of the different allophones is typically determined by their phonetic environment. For example, [pʰ] can only occur at the beginning of a syllable, whereas [p] can occur everywhere else. We have also seen that some phonemes are restricted in where they can occur: [h] can only occur at the beginning of a syllable, and [ŋ] can only occur at the end. These restrictions are not due to a phonological rule, but to general principles of the English language. The study of where in a syllable or word particular phonemes can occur is called phonotactics (-tactics is derived from Ancient Greek taktikós ‘fit for ordering or arranging’).
Universal phonotactics: the sonority sequencing principle
There is a general phonotactic principle that governs the structure of syllables in spoken human languages: the sonority sequencing principle (SSP). In order to understand this principle, we must first note that phones can be ordered in terms of their sonority, an abstract measure of relative prominence that corresponds roughly to loudness. Figure 4.5.1 shows a widely used variant of this ranking, referred to as the sonority hierarchy.

Fig. 4.5.1: The sonority hierarchy
The sonority sequencing principle states that consonants in the onset tend to increase in sonority the closer they are to the nucleus, and that consonants in the coda tend to decrease in sonority the further they are from the nucleus. The principle is illustrated by the English one-syllable word [plænt] plant. Obstruents have the lowest sonority in English, followed by nasal stops, followed by other sonorants, with vowels at the top of the sonority hierarchy. In [plænt], sonority therefore rises from the obstruent [p] through the sonorant [l] to the vowel [æ], and then falls through the nasal [n] to the obstruent [t]. Reversing the segments in the onset and coda to create the attempted syllable *[lpætn] violates the SSP, because the onset has falling rather than rising sonority, and the coda has rising rather than falling sonority. The difference between the sonority profiles of these two sequences is shown in Fig. 4.5.2.
![A grid with the sonority hierarchy from top to bottom: vowels, approximants, nasals, fricatives, affricates, stops. Below the grid, the IPA sequences [plænt] and *[lpætn]. Above each symbol, a dot on the appropriate line of the grid. The dots are connected by lines, in green, where they conform to the SSP, in red, where they do not conform.](https://linguistica.info/b/lei/wp-content/uploads/2024/10/sonority-plaent-lpaent-300x90.png)
Figure 4.5.2: Representations of sonority patterns in the English word [plænt] and the attempted English word *[lpætn].
Note that the SSP is motivated by perceptual considerations. In order to be clearly perceivable, a syllable needs a sonorous nucleus. This is typically a vowel, although consonants are also possible, with decreasing likelihood as we move down the sonority hierarchy. The tendency of the remaining phones in a syllable to rise in sonority towards the nucleus also has perceptual reasons: if a syllable deviates too much from this pattern, the segments that violate it will either be difficult to perceive, or they will have to be articulated with so much acoustic energy that they will be perceived as nuclei of their own syllables. The hypothetical word *[lpætn] shows this: if you try to pronounce it as a single syllable, it will sound almost like [pæt], with the [l] and the [n] being almost inaudible. If you give them more articulatory prominence, the result will quickly turn into a sequence of syllables: [l̩.pæt.n̩]. It is difficult to hit the sweet spot between these two extremes.
Nevertheless, the SSP is only a tendency, not an absolute rule. Most languages allow portions of a syllable to have a sonority plateau (when two adjacent segments have the same sonority) and some languages may have even looser syllable structure, allowing one or more sonority reversals.
Consider, for example, the words in Figure 4.5.3. The English word psyched would be expected to have the form [psaɪkt], as it contains the form psyche, a loanword from Ancient Greek [psyː.kʰɛ̌ː] ψυχή. This would also conform to the SSP, allowing for the sonority plateau at the end (two plosives in a row). However, the sequence [ps] is not allowed by the phonotactic principles of English, so the actual form of the word is [saɪkt]. Conversely, the English word [spaɪkt] spiked is perfectly fine, even though the sequence [sp] violates the SSP.
![A grid with the sonority hierarchy from top to bottom: vowels, approximants, nasals, fricatives, affricates, stops. Below the grid, the IPA sequences *[psaɪkt] and [spaɪkt] Above each symbol, a dot on the appropriate line of the grid. The dots are connected by lines, in green, where they conform to the SSP, in red, where they do not conform.](https://linguistica.info/b/lei/wp-content/uploads/2024/10/psaikd-spaikd-300x90.png)
Figure 4.5.3: Representations of sonority patterns in the attempted English word *[psaɪkt] and the attested English word [spaɪkt].
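The comparisons shown in Figures 4.5.2 and 4.5.3 can be approximated in a few lines of Python. This is only a minimal sketch under several assumptions: the numeric sonority values are arbitrary (only their order, taken from the grid in the figures, matters), the phone inventory `PHONE_CLASS` is a hypothetical mini-table covering just the four example words, the diphthong [aɪ] is treated as a single nucleus, and the most sonorous phone is simply taken to be the nucleus. The function names are our own.

```python
# Sonority levels following the grid in Figures 4.5.2 and 4.5.3
# (higher number = more sonorous). The values are an assumption;
# only their relative order matters.
SONORITY = {
    "stop": 1,
    "affricate": 2,
    "fricative": 3,
    "nasal": 4,
    "approximant": 5,
    "vowel": 6,
}

# Hypothetical mini-inventory covering only the phones in the examples.
# The diphthong [aɪ] is treated as a single vocalic nucleus.
PHONE_CLASS = {
    "p": "stop", "t": "stop", "k": "stop",
    "s": "fricative",
    "n": "nasal",
    "l": "approximant",
    "æ": "vowel", "aɪ": "vowel",
}


def sonority_profile(phones):
    """Map a list of phones to their sonority levels."""
    return [SONORITY[PHONE_CLASS[p]] for p in phones]


def ssp_violations(phones, allow_plateau=True):
    """Return the adjacent pairs of phones that go against the SSP.

    The most sonorous phone is taken to be the nucleus. Sonority should
    rise up to it and fall after it; equal neighbours (a plateau) count
    as violations only if allow_plateau is False.
    """
    profile = sonority_profile(phones)
    nucleus = profile.index(max(profile))
    violations = []
    for i in range(len(profile) - 1):
        left, right = profile[i], profile[i + 1]
        if left == right:
            bad = not allow_plateau
        elif i < nucleus:
            bad = left > right  # onset side: sonority must rise
        else:
            bad = left < right  # coda side: sonority must fall
        if bad:
            violations.append((phones[i], phones[i + 1]))
    return violations


print(ssp_violations(["p", "l", "æ", "n", "t"]))   # plant    -> []
print(ssp_violations(["l", "p", "æ", "t", "n"]))   # *[lpætn] -> [('l', 'p'), ('t', 'n')]
print(ssp_violations(["p", "s", "aɪ", "k", "t"]))  # *[psaɪkt] -> []
print(ssp_violations(["s", "p", "aɪ", "k", "t"]))  # spiked   -> [('s', 'p')]
```

Calling `ssp_violations(["p", "s", "aɪ", "k", "t"], allow_plateau=False)` additionally reports the pair `('k', 't')`, which is the sonority plateau at the end of the word. Note that the sketch only measures conformity to the SSP; it says nothing about whether English actually permits a given sequence.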
Language-specific phonotactics
The phonotactic principles of an individual language can thus be described in terms of the universal SSP on the one hand, and language-specific departures from the SSP on the other: languages may (a) allow something that the SSP forbids (such as the drop in sonority from [s] to [p] in [spaɪkt]), or they may (b) forbid something that the SSP allows (such as the sequence [ps] in the hypothetical word *[psaɪkt]).
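The two kinds of mismatch can be made explicit by comparing, for each onset, what the SSP predicts with what English does. The following is an illustrative sketch only: the lookup table `ENGLISH_ALLOWS` is a hypothetical stand-in limited to the four onsets discussed so far, and the sonority values are arbitrary numbers that respect the order stops < fricatives < approximants.

```python
# Illustrative sonority levels for the phones used here
# (stops lowest, then fricatives, then approximants).
SONORITY = {"p": 1, "s": 3, "l": 5}

# Whether English admits each two-phone onset, as reported in the text.
# Hypothetical lookup table limited to the onsets discussed so far.
ENGLISH_ALLOWS = {
    ("p", "l"): True,   # plant
    ("l", "p"): False,  # *[lpætn]
    ("s", "p"): True,   # spiked
    ("p", "s"): False,  # *[psaɪkt]
}


def ssp_allows(onset):
    """True if sonority rises strictly from the first to the second phone."""
    first, second = onset
    return SONORITY[first] < SONORITY[second]


def classify(onset):
    """Compare the SSP's prediction with the English facts for one onset."""
    ssp, english = ssp_allows(onset), ENGLISH_ALLOWS[onset]
    if ssp and english:
        return "allowed by both"
    if not ssp and not english:
        return "excluded by both"
    if english:
        return "(a) allowed by English, against the SSP"
    return "(b) forbidden by English, although the SSP allows it"


for onset in ENGLISH_ALLOWS:
    print("".join(onset), "->", classify(onset))
```

Only the last two outcomes need to be stated as language-specific principles; the first two follow from the SSP itself.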
In a given language, these violations of the SSP are systematic. Take the sequence [sp]: in English, this is not a quirky exception found in the word spiked (although such word-specific quirks exist). Instead, it is due to a general phonotactic principle: in the onset of an English syllable, [s] can only occur by itself (simple onset) or as the first phone in a sequence (complex onset). For example,
- we have [sp], as in [spaɪkt] spiked, but not *[ps]: as we saw above, psyched is pronounced [saɪkt] rather than *[psaɪkt];
- we have [st], as in [stɑɹ] star, but not *[ts]: for example, the Russian loanword tsar is pronounced [zɑɹ], the Tswana loanword tsetse (fly) is pronounced [ˈtɛt.si] (or [ˈsɛt.si]), the Japanese loanword tsundoku ‘leaving a book unread on a pile of other unread books after buying it’ is pronounced [ˈsʌn.doʊ.ku];
- we have [sk], as in [skaɪ] sky, but not *[ks]: the Greek loanwords xanthan, xenophobia and xylophone, all pronounced with an initial [ks] in Greek, are pronounced [ˈzæn.θən], [ˌzen.əˈfoʊ.bi.ə] and [ˈzaɪ.lə.foʊn] in English.
None of these sequences is very difficult to articulate. All of them occur in the closely related language German, for example in [ˈpsyːçə] Psyche, [ʦaːɐ̯] Zar, and [ˌksyloˈfoːn] Xylophon. Also, English speakers produce them regularly in coda position, in words like [hɑps] hops, [bɪts] bits and [æks] ax. They even produce them in the onsets of a few loanwords such as [tsuˈnɑ.mi] tsunami (from Japanese) or [psaɪ] psi (the name of the Greek letter Ψ).
If [s] occurs at the beginning of a complex onset consisting of two phones, it can be followed by all voiceless plosives (as in the examples above), by the voiceless fricative [f] (although this sequence is only found in loanwords, such as [sfɪɹ] sphere), by the nasals [m] (as in [smaɪl] smile) and [n] (as in [snoʊ] snow), and by the approximants [w] (as in [swɪŋ] swing), [l] (as in [sloʊ] slow), and, in British English, [j] (as in [sjuːt] suit). It cannot be followed by voiced plosives, voiced fricatives (with the single exception of [v] in the word [svelt] svelte, borrowed from Italian via French), the voiceless fricatives [θ], [ʃ] and [h], the approximant [ɹ], or any affricates. With the exception of [sθ] and [sʃ], none of these sequences would be difficult to pronounce, and they are all found in other languages.
If [s] occurs at the beginning of a complex onset consisting of three phones, the options become very limited in English: the second phone can only be [t], [p] or [k]. If it is [t], the third phone must be [ɹ] (in British English, [j] is also possible); if it is [p], the third phone must be [j], [ɹ] or [l]; if it is [k], the third phone must be [j], [ɹ], [l], or [w].
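Because the options for three-phone onsets are so limited, they can be written down as a small lookup table. The sketch below simply transcribes the combinations listed in the paragraph above; the names `THIRD_PHONE`, `BRITISH_EXTRA` and `is_valid_s_cluster` are hypothetical, and the `british` flag is our own way of modelling the additional [stj] option.

```python
# Third-phone options after [s] + plosive, transcribed from the text.
THIRD_PHONE = {
    "t": {"ɹ"},
    "p": {"j", "ɹ", "l"},
    "k": {"j", "ɹ", "l", "w"},
}

# Additional option mentioned for British English only.
BRITISH_EXTRA = {"t": {"j"}}


def is_valid_s_cluster(onset, british=False):
    """Check a three-phone onset beginning with [s] against the table."""
    if len(onset) != 3 or onset[0] != "s":
        return False
    second, third = onset[1], onset[2]
    allowed = set(THIRD_PHONE.get(second, set()))
    if british:
        allowed |= BRITISH_EXTRA.get(second, set())
    return third in allowed


print(is_valid_s_cluster(["s", "t", "ɹ"]))                # True
print(is_valid_s_cluster(["s", "t", "l"]))                # False
print(is_valid_s_cluster(["s", "k", "w"]))                # True
print(is_valid_s_cluster(["s", "p", "w"]))                # False
print(is_valid_s_cluster(["s", "t", "j"]))                # False
print(is_valid_s_cluster(["s", "t", "j"], british=True))  # True
```

The table admits eight three-phone onsets in total, or nine with the British option, out of the much larger number of combinations that the phone inventory of English would make possible.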
There are many other phonotactic principles in English that are equally arbitrary. We can explain a few of them with articulatory difficulties, but most of them just exist, for no apparent reason. They are nevertheless interesting, as they are part of the mental representations of the speakers of a language. This can be seen, for example, in the process of borrowing, where loanwords will normally be reshaped to fit the phonotactic constraints of the borrowing language, as we saw in the examples above.
CC-BY-NC-SA 4.0, Introduction and section on “Language-specific phonotactics” written by Anatol Stefanowitsch, section on “Universal phonotactics” adapted from Catherine Anderson, Bronwyn Bjorkman, Derek Denis, Julianne Doner, Margaret Grant, Nathan Sanders, and Ai Taniguchi, Essentials of Linguistics. 2nd ed. with editing and additions by Anatol Stefanowitsch.