Introduction to consonants

English consonants are deceptively similar to Spanish consonants. The letters are the same, and many of them are even represented by the same phonetic symbols, but only a few actually sound the same. The rest are different. Fortunately for us, they are not very difficult to learn. You can have a first look at them in Table 1. If you click on the symbol, you will hear the sound and the examples.

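Table 1

/p/ (voiceless): pin, pot, appear
/b/ (voiced): bin, about, by
/t/ (voiceless): tin, two, attain, advanced
/d/ (voiced): din, dart, addition, lived
/ʧ/ (voiceless): chin, choice, cheek
/ʤ/ (voiced): gin, job, adjust
/k/ (voiceless): kin, count, character, occur
/g/ (voiced): gill, gone, ghost, again

/f/ (voiceless): fin, fault, fan, afford
/v/ (voiced): van, veil, available
/θ/ (voiceless): think, breath, authority
/ð/ (voiced): this, breathe, other, although
/s/ (voiceless): sell, basic, less, assignment
/z/ (voiced): zoo, desert, dessert, lose
/ʃ/ (voiceless): she, ashamed, shun
/ʒ/ (voiced): vision, pleasure, measure

/m/ (voiced): man, me, amazing
/n/ (voiced): no, nap, announce, moon
/ŋ/ (voiced): hang, song, singing
/h/ (voiceless): hot, house, ahead
/l/ (voiced): lean, aloud, well, fill
/r/ (voiced): red, rip, around
/j/ (voiced): you, yes, use, university
/w/ (voiced): we, one, await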

Let’s now look at the English consonants in more detail. To describe a consonant, we have to consider three elements: 1) manner of articulation (how the sound is produced); 2) place of articulation (where it is produced); and 3) voicing (whether it is voiced or voiceless). All this information is included in Table 2. The concepts are explained below.

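Table 2

English consonants by manner of articulation (rows) and place of articulation

Plosive: bilabial p / b; alveolar t / d; velar k / g
Fricative: labio-dental f / v; dental θ / ð; alveolar s / z; post-alveolar ʃ / ʒ; glottal h
Affricate: post-alveolar ʧ / ʤ
Nasal: bilabial m; alveolar n; velar ŋ
Lateral: alveolar l
Approximant: post-alveolar r; palatal j; labio-velar w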

When symbols are arranged in pairs, they follow the order voiceless – voiced (e.g. p: voiceless / b: voiced)

Manner of articulation

Consonants differ from vowels in that the airflow cannot escape the mouth freely but has to overcome an obstacle. The specific noise that characterizes each consonant is generated in this way. So, in order to analyze a consonant, one of the parameters we have to know is which kind of obstruction is presented to the airflow. This is what constitutes its manner of articulation.

Plosive. There is a complete closure at one point of the vocal tract that prevents the air from passing. The air builds up in the space behind that obstruction and then is suddenly released, which causes a small explosion accompanied by a distinctive noise. Depending on its place of articulation, the blockage can be of three different types: bilabial (both lips together, /p/, /b/), alveolar (blade of the tongue against the alveolar ridge, /t/, /d/) and velar (back of the tongue against the soft palate, /k/, /g/). If you go to Table 1, on the top line you can hear the six English plosives plus the two affricates (they are placed there because affricates start as plosives, as you will see below).

Fricative. The air is forced through a narrow passage, causing a hissing noise. This is because one articulator (e.g. the tongue) comes close enough to another (e.g. the teeth) for the sound to be produced. It is what happens when you say think /θɪŋk/, for instance, for which you use the dental fricative /θ/. This is the same sound you produce when you say the Spanish word zapato (in peninsular Spanish, not in the Spanish spoken in Latin America). Unlike plosives, fricatives can be produced continuously, that is, you can say /θ/, /s/, /f/, etc. for as long as you have enough breath. You can hear the nine fricative phonemes in Table 1, line 2.

Affricate. This is a mixed sound. It starts as a plosive, with a complete closure, and continues as a fricative, because the air is released more slowly than in plosives. The two affricate sounds in English are the voiceless /ʧ/ (China /ˈʧaɪnə/, exactly the same as in Spanish) and the voiced /ʤ/ (judge /ʤʌʤ/), which doesn’t have so clear an equivalent in Spanish and tends to be more troublesome (but you can learn how to do it here). As you see, the fact that they are represented by double symbols reflects their hybrid nature.

Nasal. A complete closure is made in some part of the vocal cavity, but the air escapes through the nose. There are three nasal consonants in English depending on the place where the blockage occurs: the bilabial /m/ (mother, the same sound used to say madre in Spanish), the alveolar /n/ (nose, as in the Spanish word nariz) and the velar /ŋ/ (as in sing), which is a bit more difficult to say correctly (learn how to do it here).

Lateral. The tongue is pressed against the alveolar ridge, so there is an obstruction to the airflow in the centre and the air escapes along the sides of the tongue. The English lateral consonant is /l/ (long, as in the Spanish word largo). It is pronounced like a Spanish l when the next sound is a vowel, but it is different when it is followed by a consonant or a pause (well, gild). Learn about it here.

Approximant. Here the two articulators come close to one another but not so much as to cause any friction. English approximants are /w/ (well), /r/ (red) and /j/ (you). To a certain extent, they resemble vowels. Actually, /w/ and /j/ are also called semivowels because of their similarity to /u/ and /i/.

Place of articulation

This is the location along the vocal tract where the obstruction to the airflow occurs. Notice that the places of articulation are listed from front to back, beginning at the lips and ending at the glottis, so each place is a little farther back than the previous one. The only exception is the labio-velar, which comes last because it involves two places at once.

Bilabial. The two lips come together (/p, b, m/).

Labio-dental. The lower lip touches the upper teeth (/f, v/).

Dental. The tongue is placed either between the teeth or against the back of the upper teeth (/θ, ð/).

Alveolar. The tongue is level with the alveolar ridge, which is the hard ridge just behind the upper front teeth. For /t/, /d/, /n/ and /l/ the tongue is placed against it, which means that the two surfaces touch. In the case of /s/ and /z/, the tongue is very close to the alveolar ridge, but without making actual contact.

Post-alveolar. The tongue is placed just behind the alveolar ridge (/ʧ, ʤ, ʃ, ʒ, r/).

Palatal. The tongue is raised against the hard palate, that is, the middle part of the roof of the mouth (/j/).

Velar. The back of the tongue articulates with the soft palate, which is situated at the back of the mouth (/k/, /g/, /ŋ/).

Glottal. An audible friction is produced between the vocal folds, but without vocal fold vibration (/h/).

Labio-velar. There are two simultaneous constrictions: an open approximation at the lips and an open approximation at the velum. This is called double articulation (/w/).


Voicing

Consonants can be produced with or without vocal fold vibration: in the first case they are said to be voiced, in the second voiceless. You can check the voicing of each consonant in Table 1, where voiced and voiceless consonants are marked differently.

All the voiceless phonemes, except for the /h/, have a voiced counterpart. This means that they share place and manner of articulation but differ in voicing. In other words, they are the same sound except for the fact that in one case the vocal folds vibrate and in the other they don’t. This can be seen in Table 2. Whenever the phonemes are arranged in pairs, they follow the order voiceless-voiced. The phonemes that stand alone are all voiced except for the /h/.
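The three-part description (manner, place, voicing) can also be written down as a small lookup table. What follows is only an illustrative sketch in Python, not part of the original lesson: the names CONSONANTS and voicing_counterpart are hypothetical, and the feature values are simply those given in Table 2 and in the descriptions above.

```python
# Each consonant is described as (manner, place, voiced), following Table 2.
# CONSONANTS and voicing_counterpart are hypothetical names used for illustration.
CONSONANTS = {
    "p": ("plosive", "bilabial", False),
    "b": ("plosive", "bilabial", True),
    "t": ("plosive", "alveolar", False),
    "d": ("plosive", "alveolar", True),
    "k": ("plosive", "velar", False),
    "g": ("plosive", "velar", True),
    "f": ("fricative", "labio-dental", False),
    "v": ("fricative", "labio-dental", True),
    "θ": ("fricative", "dental", False),
    "ð": ("fricative", "dental", True),
    "s": ("fricative", "alveolar", False),
    "z": ("fricative", "alveolar", True),
    "ʃ": ("fricative", "post-alveolar", False),
    "ʒ": ("fricative", "post-alveolar", True),
    "h": ("fricative", "glottal", False),
    "ʧ": ("affricate", "post-alveolar", False),
    "ʤ": ("affricate", "post-alveolar", True),
    "m": ("nasal", "bilabial", True),
    "n": ("nasal", "alveolar", True),
    "ŋ": ("nasal", "velar", True),
    "l": ("lateral", "alveolar", True),
    "r": ("approximant", "post-alveolar", True),
    "j": ("approximant", "palatal", True),
    "w": ("approximant", "labio-velar", True),
}


def voicing_counterpart(symbol):
    """Return the consonant with the same manner and place but the opposite
    voicing, or None if the consonant stands alone."""
    manner, place, voiced = CONSONANTS[symbol]
    for other, (other_manner, other_place, other_voiced) in CONSONANTS.items():
        if (other_manner, other_place) == (manner, place) and other_voiced != voiced:
            return other
    return None
```

For instance, voicing_counterpart("s") gives "z", while voicing_counterpart("h") gives None, which matches the remark that /h/ is the only voiceless phoneme without a voiced partner.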

However, you can easily find out for yourself the voicing of a consonant. Place your fingers against your throat while producing a sound and you’ll be able to feel whether your vocal folds vibrate or not.

If you want to practise, the fricative pairs are wonderful, since they can be produced continuously. Have a look at this little exercise:

From /s/ to /z/ [audio]. As you can hear, the sound begins as an /s/ (voiceless), but at some point the vocal folds start vibrating and it becomes a /z/ (voiced).

In actual words, the difference can be exemplified by the following minimal pair:

[audio] Seal /siːl/     [audio] Zeal /ziːl/

This is the type of work I do with my students in my one-to-one classes. I make them practise these processes with exercises until they improve their comprehension of native speakers and are capable of speaking like that themselves. If you are interested in my classes, you can contact me here.

Devoicing of final consonants. As English pronunciation is seldom simple, we have to deal with an additional element: the devoicing of final consonants. This is a rather complicated issue. But don’t worry. Read this explanation, and it’ll become crystal clear to you.

Voiced consonants which have a voiceless counterpart (that is, plosives, fricatives and affricates) tend to lose their voicing when they come at the end of a word and are followed by silence or a voiceless sound. In other words, they are produced without vocal fold vibration when they come before a voiceless consonant or are the last sound in an utterance. This loss can be partial or total, depending on the context, the speaker, etc., but generally it is very noticeable.

Now, logically, if a voiced consonant loses its voicing, it would sound very similar to its voiceless counterpart: /d/ would sound like /t/; /z/ like /s/; /g/ like /k/; /v/ like /f/, etc. And it stands to reason that this process would make English more difficult to understand.

We find the clearest example of this problem in minimal pairs. The question is: how are we going to distinguish pairs of words which differ only in that single element, a final consonant, if that consonant loses its voicing and becomes similar to its counterpart? Of course we have all the information provided by context, but there is another clue.

Listen to the following examples:

[audio] I’m worried about my bag.

[audio] I’m worried about my back.

Here you can observe two things:

1. /g/ and /k/ sound very similar, almost identical. And the /g/ sound is clearly different from the one we produce in Spanish (we would use a fully voiced /g/, like this: [audio]). Actually, an English /g/ is much closer to a Spanish /k/ than to a Spanish /g/.

2. The /æ/ sound in bag is longer than the /æ/ sound in back, even though the vowel is the same: /bæg/ [audio] vs. /bæk/ [audio].

The second point, the length of the preceding vowel, is the crucial element to distinguish bag from back. It is due to a phenomenon called pre-fortis clipping, whereby vowels become shorter if they are followed by voiceless consonants. By contrast, vowels followed by voiced consonants retain their full length even if those consonants are devoiced. So, strange though it may seem, native speakers don’t distinguish these kinds of pairs by the voicing of the consonant (that is to say, by whether it is a /k/ or a /g/), since this difference tends to be so small that it’s almost always neutralized. The key element to tell the difference between those words is the length of the preceding vowel.
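The reasoning in this paragraph can be summed up as a tiny decision rule. The Python sketch below is a simplified illustration under stated assumptions, not a model of real speech perception: the names VOICELESS_FINALS, is_clipped and pick_word are hypothetical, and vowel length is reduced to a simple short/long choice.

```python
# Voiceless consonants that shorten (clip) the vowel before them.
# All names here are hypothetical and only serve to illustrate the rule.
VOICELESS_FINALS = {"p", "t", "k", "f", "θ", "s", "ʃ", "ʧ"}


def is_clipped(final_consonant):
    """Pre-fortis clipping: the vowel is shortened before a voiceless consonant."""
    return final_consonant in VOICELESS_FINALS


def pick_word(pair, vowel_sounds_short):
    """Choose between the words of a minimal pair using only vowel length.

    `pair` maps each word to its final consonant phoneme.
    """
    for word, final_consonant in pair.items():
        if is_clipped(final_consonant) == vowel_sounds_short:
            return word
    raise ValueError("no candidate matches the vowel length heard")
```

With the pair {"bag": "g", "back": "k"}, a long vowel leads to "bag" and a short one to "back", without ever inspecting the voicing of the final consonant itself.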

Let’s now listen to this example from BBC4 where stand-up comedian David Mitchell describes how he and a friend were trying to upstage each other while performing a show, and so drifting away from the audience. He describes the movement and, at the end, makes a comparison (which I won’t transcribe yet).

[audio] The best place to stand for you is slightly upstage of the other person, so you are as it were addressing the crowd more. And we both were just slightly trying to do that to the other one, which had the result of our both edging gradually away from the audience, like a sort of… ¡¿What?!

Listen to the last part again [audio].

Of course he doesn’t say crap but crab; the final /b/ is devoiced and sounds very much like a /p/. Pay attention to the length of the vowel /æ/, though. Despite being a short vowel, it is pretty long. If he were saying crap, he would certainly use a shorter vowel. It is this length, therefore, which allows the listener to get the word (and, consequently, the meaning) right.

Recovering the voicing. As said, English consonants lose their voicing when they are final or followed by a voiceless phoneme, but this doesn’t happen if they are closely followed by a voiced sound. Have a look at this example.

[audio] He plays. (devoiced, almost like /s/)

[audio] He plays to kill time. (devoiced, almost like /s/)

[audio] He plays alone. (followed by a voiced sound, so pronounced /z/)
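The conditions for losing and recovering the voicing can be sketched as a short rule. This is only a rough Python illustration of the tendency described above, under the assumption that devoicing is all or nothing (in real speech it can be partial); the names VOICED_TO_VOICELESS, VOICELESS and is_devoiced are hypothetical.

```python
# Voiced consonants that have a voiceless counterpart (plosives, fricatives, affricates).
# All names are hypothetical; real devoicing is gradual, not a yes/no switch.
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k", "v": "f",
    "ð": "θ", "z": "s", "ʒ": "ʃ", "ʤ": "ʧ",
}
VOICELESS = set(VOICED_TO_VOICELESS.values()) | {"h"}


def is_devoiced(final_consonant, next_sound=None):
    """Say whether a word-final consonant tends to lose its voicing.

    `next_sound` is the first sound of the following word, or None for a pause.
    """
    if final_consonant not in VOICED_TO_VOICELESS:
        return False  # nasals, /l/ and approximants have no voiceless partner
    return next_sound is None or next_sound in VOICELESS
```

Applied to the three examples, is_devoiced("z") (He plays.) and is_devoiced("z", "t") (He plays to kill time.) are both true, whereas is_devoiced("z", "ə") (He plays alone.) is false.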
