Phoneme
From Wikipedia, the free encyclopedia.
In oral language, a phoneme is the theoretical basic unit of sound that can be used to distinguish words or morphemes; in sign language, it is a similarly basic unit of hand shape, motion, position, or facial expression (formerly termed a chereme). That is, changing a phoneme in a word produces either nonsense or a different word with a different meaning.
Phonemes are not physical sounds, but mental abstractions of speech sounds. A phoneme is a family of speech sounds (phones) that the speakers of a language think of, and usually hear as, a single sound. A "perfect" alphabet is often considered to be one that has one symbol for each such phoneme.
Phonemics, a branch of phonology, is the study of the system of phonemes of a language.
Background and related ideas
The phoneme is a structuralist abstraction that was introduced by the Polish-Russian linguist Jan Niecislaw Baudouin de Courtenay (1845-1929) and elaborated in the works of Nikolai Trubetzkoi (1890-1938). It was later adapted to and formally psychologized in generative linguistics (after Chomsky and Halle). Rather than a basic mental unit of language, however, it may well be a perceptual artifact of alphabetic literacy (see the terms Phonemic awareness and Phonological awareness). If not that, it may be an epiphenomenal aspect of listening removed from face-to-face encounters, that is, text-like listening (q.v. phone and feature). It could be said that the phoneme is a necessary construct if we wish to set a dynamic, complex spoken language into a static, written form expressed at a sub-syllabic level, though the model is a simplification and nowhere near phonologically or phonetically complete.
Variant phones that are not recognized as distinct by a speaker, and which are not meaningfully different in the language, are known as allophones of a phoneme. For example, the English words "pat" and "sat" differ only in their initial consonants. This difference, known as contrastiveness, is sufficient to distinguish words, and therefore the P and S sounds are said to be different phonemes. A pair of words that are identical except for one such sound is known as a minimal pair; this is the most frequent demonstration that two sounds are separate phonemes. If no minimal pair can be found to demonstrate that two sounds are distinct, it may be that they are allophones. This is especially likely if they consistently occur in different environments. For example, the "dark" L sound at the end of the English word "wool" is quite different from the "light" L sound at the beginning of the word "leaf", but this difference is meaningless in English, and is determined by whether the sound is at the beginning or end of a word. A native English speaker might have a hard time hearing the difference at first, but in Turkish the difference between "light" and "dark" L is sufficient to distinguish words. That is, they are two separate phonemes in Turkish, but allophones of a single phoneme in English.
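The minimal-pair test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy lexicon and its transcriptions are hypothetical, each word is represented as a tuple of phoneme symbols, and the function names are invented for this example.

```python
from itertools import combinations

# Hypothetical toy lexicon: each word maps to a tuple of phoneme symbols.
LEXICON = {
    "pat": ("p", "æ", "t"),
    "sat": ("s", "æ", "t"),
    "pit": ("p", "ɪ", "t"),
    "spin": ("s", "p", "ɪ", "n"),
}

def differing_position(a, b):
    """Return the index of the single differing segment, or None."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None

def minimal_pairs(lexicon):
    """Yield (word1, word2, sound1, sound2) for every minimal pair."""
    for (w1, t1), (w2, t2) in combinations(sorted(lexicon.items()), 2):
        i = differing_position(t1, t2)
        if i is not None:
            yield w1, w2, t1[i], t2[i]

pairs = list(minimal_pairs(LEXICON))
for w1, w2, s1, s2 in pairs:
    print(f"{w1} ~ {w2}: /{s1}/ contrasts with /{s2}/")
```

Finding a pair such as "pat" and "sat" is evidence that the two sounds contrast. Failing to find one proves nothing by itself, which is why the distribution of the sounds across environments also has to be examined.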
The phonemic relationship of two sounds may not be obvious to a non-native speaker, which is why minimal pairs and an understanding of phonetic environments are important. For example, in Japanese, there is a phoneme /t/ that sounds rather like an English T before the vowel /a/, but more like a CH before the vowel /i/. Likewise, in Korean there is a phoneme /r/ that sounds like an English N at the beginning of a word, like a flapped R between vowels, and like an L at the end of a word. These sound very different to an English speaker, who is attuned to hearing them because the differences are meaningful in English. Native speakers, however, have learned from an early age to filter out the differences, as they are not meaningful in their own language. In Korean, for instance, it is impossible to distinguish the two words "ram" and "lam", despite the fact that both R and L sounds occur in the language.
The exact number of phonemes in English depends on the speaker and the method of determining phoneme vs. allophone, but estimates typically range from 40 to 45, which is above average across all languages. Pirahã has only 10, while !Xóõ has 141.
Depending on the language and the alphabet used, a phoneme may be written consistently with one letter; however, there are many exceptions to this rule (see Writing systems below).
Some languages use pitch to distinguish words in precisely the same way. In this case, the tones used are called tonemes. Some languages distinguish words made up of the same phonemes (and tonemes) by using different durations of some elements, which are called chronemes. The equivalents of phonemes in sign languages are called cheremes.
Notation
A transcription that indicates only the different phonemes of a language is said to be phonemic. Such transcriptions are enclosed within virgules (slashes), / /; these show that each enclosed symbol is claimed to be phonemically meaningful. On the other hand, a transcription that indicates finer detail, including allophonic variation like the two English L's, is said to be phonetic, and is enclosed in square brackets, [ ].
The common notation used in linguistics employs virgules (slashes) (/ /) around the symbol that stands for the phoneme. For example, the phoneme for the initial consonant sound in the word "phoneme" would be written as /f/. In other words, the graphemes are <ph>, but this digraph represents one sound /f/. Allophones, real speech variants of a phoneme, are often denoted in linguistics by the use of diacritical or other marks added to the phoneme symbols and then placed in square brackets ([ ]) to differentiate them from the phoneme in slant brackets (/ /). The conventions of orthography are then kept separate from both phonemes and allophones by the use of the markers < > to enclose the spelling.
The symbols of the International Phonetic Alphabet (IPA) and extended sets adapted to a particular language are often used by linguists to write phonemes, with the principle being one symbol equals one categorical sound. Due to problems displaying some symbols in the early days of the Internet, systems such as X-SAMPA and Kirshenbaum were developed to represent IPA symbols in plain text. As of 2004, any modern web browser can display IPA symbols (as long as the operating system provides the appropriate fonts), and we use this system in this article.
Examples
Examples of phonemes in the English language would include sounds from the set of English consonants, like /p/ and /b/. These two are most often written consistently with one letter for each sound. However, phonemes might not be so apparent in written English, such as when they are typically represented with combined letters, called digraphs, like <sh> (pronounced /ʃ/) or <ch> (pronounced /tʃ/).
To see a list of the phonemes in the English language, see IPA for English.
Two sounds that may be allophones (sound variants belonging to the same phoneme) in one language may belong to separate phonemes in another language or dialect. In English, for example, /p/ has aspirated and non-aspirated allophones: aspirated as in /pɪn/, and non-aspirated as in /spɪn/. However, in many languages (e.g. Chinese), aspirated /pʰ/ is a phoneme distinct from unaspirated /p/. As another example, there is no distinction between [r] and [l] in Japanese: there is only one /r/ phoneme, although it has allophones that make it sound more like an /l/, /d/, or /r/ to English speakers. The sounds /z/ and /s/ are distinct phonemes in English, but allophones in Spanish. /n/ (as in run) and /ŋ/ (as in rung) are phonemes in English, but allophones in Italian and Spanish.
An important phoneme is the chroneme, a phonemically relevant extension of the duration of a consonant or vowel. Some languages or dialects, such as Finnish or Japanese, allow chronemes after both consonants and vowels. Others, like Italian or Australian English, use it after only one (in the case of Italian, consonants; in the case of Australian English, vowels). Often a change in quantity correlates with a change in quality, and thus it may be contentious whether the quality or the quantity, or both, is phonemically relevant.
Restricted phonemes
A restricted phoneme is a phoneme that can occur only in certain environments.
Restricted phonemes in English include:
/ŋ/ as in sing can occur only at the end of a syllable or word and can never occur at the beginning of a word.
Under most interpretations, /w/ and /j/ can occur only before a vowel and can never occur at the end of a syllable or word.
/h/ can occur only at the beginning of a syllable or word or at the beginning of a cluster and can never occur at the end of a syllable or word.
Under most interpretations, in American accents with the cot-caught merger /ɔ/ can occur only before /r/ and can never occur elsewhere.
In non-rhotic accents, /r/ can only occur before a vowel or intervocalically and can never occur at the end of a word or before a consonant.
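As a rough illustration, the word-level restrictions in the list above can be written as a simple check. This is a minimal sketch, not a full phonotactic model: it ignores syllable boundaries, clusters, and accent-specific rules, and the names used here are invented for the example.

```python
# Word-level restrictions taken from the list above (syllable-level
# and accent-specific restrictions are deliberately left out).
BANNED_WORD_INITIAL = {"ŋ"}
BANNED_WORD_FINAL = {"h", "w", "j"}

def violations(segments):
    """Return a list of restrictions that a sequence of phonemes breaks."""
    found = []
    if segments and segments[0] in BANNED_WORD_INITIAL:
        found.append(f"/{segments[0]}/ cannot begin a word")
    if segments and segments[-1] in BANNED_WORD_FINAL:
        found.append(f"/{segments[-1]}/ cannot end a word")
    return found

print(violations(["s", "ɪ", "ŋ"]))  # a well-formed word such as "sing"
print(violations(["ŋ", "ɪ", "s"]))  # hypothetical form with initial /ŋ/
```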
Neutralization, archiphoneme, underspecification
Phonemes that are contrastive in certain environments may not be contrastive in all environments. In the environments where they don't contrast, the contrast is said to be neutralized. In English there are three nasal phonemes, /m, n, ŋ/, as shown by the minimal triplet,
/sɐm/    sum
/sɐn/    sun
/sɐŋ/    sung
However, these sounds are not contrastive before plosives such as /p, t, k/. Although all three phones appear before plosives, for example in imp, hint, ink, only one of these may appear before each of the plosives. That is, the /m, n, ŋ/ distinction is neutralized before each of the plosives /p, t, k/:
Only [m] occurs before [p],
only [n] before [t], and
only [ŋ] before [k].
Thus there is no evidence that these are distinct phonemes in these environments, nor is there any evidence as to what the underlying representation might be. If we theorize that we are dealing with only a single underlying nasal, there is no reason to pick one of the three phonemes /m, n, ŋ/ over the other two.
(In some languages there is only one phonemic nasal anywhere, and it surfaces as [m, n, ŋ] in just these environments, so this idea is not far fetched.)
In certain schools of phonology, such a neutralized distinction is known as an archiphoneme. Archiphonemes are often notated with a capital letter. Following this convention, the neutralization of /m, n, ŋ/ before /p, t, k/ could be notated as |N|, and imp, hint, ink would be represented as |iNp, hiNt, iNk|. (The |pipes| indicate underlying representation.) Other ways this archiphoneme could be notated are /m-n-ŋ/, {m, n, ŋ}, or /n*/.
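The way the archiphoneme |N| surfaces before each plosive can be sketched as a small rewrite rule. This is only an illustration: underlying forms are written as plain strings in which the capital letter N stands for the archiphoneme, and the function name is hypothetical.

```python
# The nasal that surfaces before each plosive, as described above.
NASAL_BEFORE_PLOSIVE = {"p": "m", "t": "n", "k": "ŋ"}

def realize_nasal(underlying):
    """Replace each 'N' with the nasal that matches the following plosive."""
    out = []
    for i, seg in enumerate(underlying):
        if seg == "N":
            nxt = underlying[i + 1] if i + 1 < len(underlying) else None
            if nxt not in NASAL_BEFORE_PLOSIVE:
                # This sketch only covers the neutralizing environment.
                raise ValueError("|N| is only defined before /p, t, k/ here")
            out.append(NASAL_BEFORE_PLOSIVE[nxt])
        else:
            out.append(seg)
    return "".join(out)

for form in ("iNp", "hiNt", "iNk"):
    print(f"|{form}| → [{realize_nasal(form)}]")
```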
Another example from English is the neutralization of the plosives /k, g/ following /s/. Phonetically, the tenuis plosive in sky is closer to English /g/, which is partially voiceless in initial position, than to aspirated /k/. This can be heard by comparing the sky with this guy, and in the speech of young children who control voicing but not yet consonant clusters, and who pronounce sky as /gai/. That is, /k/ and /g/ are contrastive word-initially,
/kai/    chi
/gai/    guy
But not after an /s/,
/skai/    →    sky
/sgai/    →    sky
Thus one cannot say whether the underlying representation of the plosive in sky is /skai/ without aspiration, or /sgai/ without voicing. This neutralization can instead be represented as an archiphoneme |G|, in which case the underlying representation of sky would be |sGai|.
Another way to talk about archiphonemes involves the concept of underspecification. Phonemes can be considered fully specified segments while archiphonemes are underspecified segments. In Tuvan, phonemic vowels are specified with the features of tongue height, backness, and lip rounding. The archiphoneme |U| is an underspecified high vowel where only the tongue height is specified.
phoneme    height    backness    roundedness
/i/        high      front       unrounded
/ɯ/        high      back        unrounded
/u/        high      back        rounded
|U|        high      -           -
Whether |U| is pronounced as front or back, and as rounded or unrounded, depends on vowel harmony. If |U| follows a front unrounded vowel, it is pronounced as the phoneme /i/; if it follows a back unrounded vowel, as /ɯ/; and if it follows a back rounded vowel, as /u/. This can be seen in the following words:
-|Um|                    'my'         (the vowel of this suffix is underspecified)
|idikUm| → /idikim/      'my boot'    (/i/ is front & unrounded)
|xarUm|  → /xarɯm/       'my snow'    (/a/ is back & unrounded)
|nomUm|  → /nomum/       'my book'    (/o/ is back & rounded)
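The harmony rule illustrated by these words can be sketched as follows. This is a simplified illustration rather than a description of Tuvan phonology as a whole: it covers only the vowels and the three harmony cases mentioned in the text, writes the archiphoneme as a capital U inside a plain string, and uses a hypothetical function name.

```python
# (backness, rounding) for the vowels mentioned in the text only.
VOWEL_FEATURES = {
    "i": ("front", "unrounded"),
    "ɯ": ("back", "unrounded"),
    "u": ("back", "rounded"),
    "a": ("back", "unrounded"),
    "o": ("back", "rounded"),
}

# The high vowel that fills in each combination of harmonizing features.
HIGH_VOWEL = {
    ("front", "unrounded"): "i",
    ("back", "unrounded"): "ɯ",
    ("back", "rounded"): "u",
}

def realize_high_vowel(underlying):
    """Fill in each 'U' using the features of the preceding vowel."""
    out = []
    last = None  # features of the most recent vowel seen so far
    for seg in underlying:
        if seg == "U":
            if last is None:
                raise ValueError("no preceding vowel to harmonize with")
            seg = HIGH_VOWEL[last]
        if seg in VOWEL_FEATURES:
            last = VOWEL_FEATURES[seg]
        out.append(seg)
    return "".join(out)

for form in ("idikUm", "xarUm", "nomUm"):
    print(f"|{form}| → /{realize_high_vowel(form)}/")
```

Only tongue height is stored for |U| itself; backness and rounding are copied from the nearest preceding vowel, which is what it means for the segment to be underspecified.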
Non-phonemes
Prothesis, epenthesis and paragoge due to phonotactics add sounds to words without adding meaning. Nevertheless, a sound is added, and thus its phonemic status may be ambiguous. For example, Spanish prothetic e- must be added before certain initial consonant clusters, e.g. estrés.
Phonological extremes
Languages vary considerably in how many of the sounds that a human vocal tract can create they treat as distinctive phonemes. Ubyx and some dialects of Abkhaz have only two phonemic vowels, and many Native American languages have three. At the other extreme, the Bantu language Ngwe has fourteen vowel qualities, twelve of which may occur long or short, for twenty-six oral vowels, plus six nasalized vowels, long and short, for thirty-eight vowels; while !Xóõ achieves thirty-one pure vowels (not counting vowel length, which it also has) by varying the phonation. Rotokas has only six consonants, while !Xóõ has somewhere in the neighborhood of seventy-seven, and Ubyx eighty-one. French has no phonemic tone or stress, while several of the Kam-Sui languages have nine tones, and one of the Kru languages, Wobe, has been claimed to have fourteen, though this is disputed. The total number of phonemes in languages varies from as few as eleven in Rotokas to as many as 112 in !Xóõ (including four tones). These may range from familiar sounds like [t], [s], or [m] to very unusual ones produced in extraordinary ways (see: Click consonant, phonation, airstream mechanism). The English language itself uses a rather large set of thirteen to twenty-two vowels, including diphthongs, though its twenty-two to twenty-six consonants are close to average. (There are twenty-one consonant and five vowel letters in the English alphabet, but this does not correspond to the number of consonant and vowel sounds.)
The most common vowel system consists of the five vowels /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/, /m/, /n/. Very few languages lack any of these: standard Hawaiian lacks /t/, Mohawk lacks /p/ and /m/, Hupa lacks both /p/ and a simple /k/, colloquial Samoan lacks /t/ and /n/, while Rotokas and Quileute lack /m/ and /n/. While most of these languages have very small inventories, Quileute and Hupa have quite complex consonant systems.
The ways that sounds are pronounced can vary slightly from language to language even if the same IPA symbol is used. For example, the Finnish word maat ("countries") sounds different from the British English (Received Pronunciation) word mart even though both are transcribed as IPA [mɑ:t][1]; the Spanish word sin ("without") has a somewhat different vowel from the American English seen though both are transcribed as [sin].
Writing systems
Languages where a given symbol represents only one phoneme and every phoneme is represented by only one symbol are popularly known as "phonetic languages", though they might be better described as "phonemically written". English is often given as an example of an "unphonetic" language, as its spelling system is highly erratic: there are numerous cases in which it is not possible to predict the pronunciation from the spelling or vice versa. Welsh is also among the least predictable of the languages using the Latin alphabet. In French, the rules for predicting pronunciation from spelling are quite simple and have few exceptions (as long as there are some clues such as context or part of speech), but guessing spelling from pronunciation is quite difficult, especially because of the many silent letters. Italian, Spanish and especially Finnish have a very close letter-to-phoneme correspondence. Karelian has no standard language, but it does have a complete spelling system, which is perfectly phonemic.
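The one-to-one criterion stated at the start of this section can be expressed as a simple check on a list of spelling-to-sound correspondences. This is a minimal sketch under stated assumptions: the sample correspondences are small illustrative fragments rather than complete descriptions of any orthography, and the function and variable names are hypothetical.

```python
from collections import defaultdict

def is_phonemically_written(correspondences):
    """True if each grapheme has one phoneme and each phoneme one grapheme."""
    by_grapheme = defaultdict(set)
    by_phoneme = defaultdict(set)
    for grapheme, phoneme in correspondences:
        by_grapheme[grapheme].add(phoneme)
        by_phoneme[phoneme].add(grapheme)
    return (all(len(p) == 1 for p in by_grapheme.values())
            and all(len(g) == 1 for g in by_phoneme.values()))

# Illustrative fragment of English: <ph> and <f> both spell /f/.
ENGLISH_SAMPLE = [("ph", "f"), ("f", "f"), ("sh", "ʃ"), ("ch", "tʃ")]

# Hypothetical fragment of an ideal one-symbol-per-phoneme system.
IDEAL_SAMPLE = [("p", "p"), ("b", "b"), ("f", "f")]

print(is_phonemically_written(ENGLISH_SAMPLE))
print(is_phonemically_written(IDEAL_SAMPLE))
```

Note that a digraph such as <sh> does not by itself break the criterion in this sketch, as long as it is treated as a single symbol that always stands for the same phoneme.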
However, the split between phonemically-written and non-phonemically-written languages is usually exaggerated. All languages are in fact written with conventional signs that represent meaning and are inspired to some degree by pronunciation. This is true at both ends of the scale: Chinese characters are first and foremost symbols of meaning, but they do also have some minimal phonetic information. At the other extreme, there are some few orthographies which are perfect phonemic representations of the standard accent, but since they make no effort to represent the variation in pronunciation within a language, they too are partially conventional.
All other languages fall somewhere between these extremes. Although English is often given as an example of an "unphonetic" language, in reality its system is nowhere near as close to being a purely conventional system as Chinese writing is. English spelling conveys etymological information, but also vast amounts of phonetic information. Spanish is often given as an example of a "phonetic" language; however, it has numerous imperfections, including silent letters. It is, at least, possible to know the correct pronunciation of any written Spanish word. Another "phonetic" language is Serbian, whose phonemic orthography was established by Vuk Stefanovic Karadzic, the Serbian "Webster"; he followed a strict phonemic principle, best expressed in his own words: "Write as you speak and read as it is written." Hindi, a descendant of Sanskrit, is an example of a "phonetic" language written with a non-Roman alphabet.