Early Modern English or Early New English (sometimes abbreviated EModE, EMnE, or EME) is the stage of the English language from the beginning of the Tudor period to the English Interregnum and Restoration, or from the transition from Middle English, in the late 15th century, to the transition to Modern English, in the mid-to-late 17th century.
Before and after the accession of James I to the English throne in 1603, the emerging English standard began to influence the spoken and written Middle Scots of Scotland.
The grammatical and orthographical conventions of literary
English in the late 16th century and the 17th century are still very
influential on modern Standard English. Most modern readers of English can understand texts written in the late phase of Early Modern English, such as the King James Bible and the works of William Shakespeare, and they have greatly influenced Modern English.
Texts from the earlier phase of Early Modern English, such as the late-15th century Le Morte d'Arthur (1485) and the mid-16th century Gorboduc
(1561), may present more difficulties but are still obviously closer to
Modern English grammar, lexicon, and phonology than are 14th-century
Middle English texts, such as the works of Geoffrey Chaucer.
The change from Middle English
to Early Modern English was not just a matter of changes of vocabulary
or pronunciation; a new era in the history of English was beginning.
An era of linguistic change in a language with large variations
in dialect was replaced by a new era of a more standardised language,
with a richer lexicon and an established (and lasting) literature.
1476 – William Caxton starts printing in Westminster;
however, the language that he uses reflects the variety of styles and
dialects used by the authors who originally wrote the material.
1485 – Caxton publishes Thomas Malory's Le Morte d'Arthur,
the first print bestseller in English. Malory's language, while archaic
in some respects, is clearly Early Modern and is possibly a Yorkshire
or Midlands dialect.
1491 or 1492 – Richard Pynson starts printing in London; his style tends to prefer Chancery Standard, the form of English used by the government.
Henry VIII
c. 1509 – Pynson becomes the King's official printer.
1539 – Publication of the Great Bible, the first officially-authorised Bible in English. Edited by Myles Coverdale,
it is largely from the work of Tyndale. It is read to congregations
regularly in churches, which familiarises much of the population of
England with a standard form of the language.
1549 – Publication of the first Book of Common Prayer in English, under the supervision of Thomas Cranmer
(revised 1552 and 1662), which standardises much of the wording of
church services. Some have argued that since attendance at prayer book
services was required by law for many years, the repetitive use of its
language helped to standardise Modern English even more than the King James Bible (1611) did.
Title page of Gorboduc (printed 1565). The
Tragedie of Gorbodvc, whereof three Actes were written by Thomas
Nortone, and the two laste by Thomas Sackuyle. Sett forthe as the same
was shewed before the Qvenes most excellent Maiestie, in her highnes
Court of Whitehall, the .xviii. day of January, Anno Domini .1561. By
the Gentlemen of Thynner Temple in London.
1582 – The Rheims and Douai Bible is completed, and the New Testament is released in Rheims,
France, in 1582. It is the first complete English translation of the
Bible that is officially sponsored and carried out by the Catholic Church
(earlier translations into English, especially of the Psalms and
Gospels existed as far back as the 9th century, but it is the first
Catholic English translation of the full Bible). Though the Old
Testament is already complete, it is not published until 1609–1610, when
it is released in two volumes. While it does not make a large impact on
the English language at large, it certainly plays a role in the
development of English, especially in the world's heavily-Catholic
English-speaking areas.
1607 – The first successful permanent English colony in the New World, Jamestown, is established in Virginia. Early vocabulary specific to American English comes from indigenous languages (such as moose, racoon).
1611 – The King James Version is published, largely based on Tyndale's translation. It remains the standard Bible in the Church of England for many years.
The English Civil War and the Interregnum were times of social and political upheaval and instability.
The dates for Restoration literature are a matter of convention and differ markedly from genre to genre. In drama, the "Restoration" may last until 1700, but in poetry, it may last only until 1666, the annus mirabilis (year of wonders), and in prose, it may last until 1688, with the increasing tensions over succession and the corresponding rise in journalism and periodicals, or possibly until 1700, when those periodicals grew more stabilised.
The 17th-century port towns and their forms of speech gain influence over the old county towns.
From around the 1690s onwards, England experienced a new period of
internal peace and relative stability, which encouraged the arts
including literature.
The towering importance of William Shakespeare over the other Elizabethan authors was the result of his reception during the 17th and the 18th centuries, which directly contributed to the development of Standard English. Shakespeare's plays are therefore still familiar and comprehensible 400 years after they were written, but the works of Geoffrey Chaucer and William Langland, which had been written only 200 years earlier, are considerably more difficult for the average modern reader.
Orthography
Shakespeare's writings are universally associated with Early Modern English
The orthography
of Early Modern English was fairly similar to that of today, but
spelling was unstable. Early Modern English, as well as Modern English,
inherited orthographical conventions predating the Great Vowel Shift.
Early Modern English spelling was similar to Middle English orthography. Certain changes were made, however, sometimes for reasons of etymology (as with the silent ⟨b⟩ that was added to words like debt, doubt and subtle).
Early Modern English orthography had a number of features of spelling that have not been retained:
The letter ⟨S⟩ had two distinct lowercase forms: ⟨s⟩ (short s), as is still used today, and ⟨ſ⟩ (long s). The short s was always used at the end of a word and often elsewhere. The long s, if used, could appear anywhere except at the end of a word. The double lowercase S was written variously ⟨ſſ⟩, ⟨ſs⟩ or ⟨ß⟩ (the last ligature is still used in German ß). That is similar to the alternation between medial (σ) and final (ς) lowercase sigma in Greek.
⟨u⟩ and ⟨v⟩ were not yet considered two distinct letters but different forms of the same letter. Typographically, ⟨v⟩ was frequent at the start of a word and ⟨u⟩ elsewhere: hence vnmoued (for modern unmoved) and loue (for love). The modern convention of using ⟨u⟩ for the vowel sounds and ⟨v⟩ for the consonant appears to have been introduced in the 1630s. Also, ⟨w⟩ was frequently represented by ⟨vv⟩.
Similarly, ⟨i⟩ and ⟨j⟩ were also still considered not as two distinct letters, but as different forms of the same letter: hence ioy for joy and iust for just. Again, the custom of using ⟨i⟩ as a vowel and ⟨j⟩ as a consonant began in the 1630s.
The letter ⟨þ⟩ (thorn)
was still in use during the Early Modern English period but was
increasingly limited to handwritten texts. In Early Modern English
printing, ⟨þ⟩ was represented by the Latin ⟨Y⟩ (see Ye olde), which appeared similar to thorn in blackletter typeface ⟨𝖞⟩. Thorn had become nearly totally disused by the late Early Modern English period, the last vestiges of the letter being its ligatures, ye (the), yt (that), yu (thou), which were still seen occasionally in the 1611 King James Version and in Shakespeare's Folios.
A silent ⟨e⟩ was often appended to words, as in ſpeake and cowarde. The last consonant was sometimes doubled when the ⟨e⟩ was added: hence manne (for man) and runne (for run).
The sound /ʌ/ was often written ⟨o⟩ (as in son): hence ſommer, plombe (for modern summer, plumb).
The final syllable of words like public was variously spelt but came to be standardised as -ick. The modern spellings with -ic did not come into use until the mid-18th century.
⟨y⟩ was often used instead of ⟨i⟩.
The vowels represented by ⟨ee⟩ and ⟨e_e⟩ (for example in meet and mete) changed, and ⟨ea⟩ became an alternative.
Many spellings had still not been standardised, however. For example, he was spelled as both he and hee in the same sentence in Shakespeare's plays and elsewhere.
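The positional letterform conventions listed above (long s, ⟨u⟩/⟨v⟩ and ⟨i⟩/⟨j⟩) can be illustrated with a minimal Python sketch. The function name is hypothetical, and the sketch assumes lowercase input, treats ⟨u⟩/⟨v⟩ purely by position, and uses the long s in every non-final position, although printers of the period often used the short s elsewhere too. It does not model silent ⟨e⟩, consonant doubling or any other spelling variation.

```python
def early_modern_letterforms(word: str) -> str:
    """Apply three printing conventions to a lowercase modern spelling.

    Assumptions of this sketch (not rules every printer followed):
    - <v> is used word-initially and <u> elsewhere, whatever the sound;
    - <j> is not a separate letter, so <i> is used for both;
    - long s is used everywhere except at the end of a word.
    """
    out = []
    last = len(word) - 1
    for pos, ch in enumerate(word):
        if ch in "uv":
            # One letter with two shapes: <v> initially, <u> elsewhere.
            ch = "v" if pos == 0 else "u"
        elif ch == "j":
            # <i> and <j> were still forms of the same letter.
            ch = "i"
        elif ch == "s" and pos != last:
            # Long s may appear anywhere except word-finally.
            ch = "ſ"
        out.append(ch)
    return "".join(out)


for modern in ["unmoved", "love", "joy", "just", "sinnes"]:
    print(modern, "->", early_modern_letterforms(modern))
```

With these assumptions, unmoved and love come out as vnmoued and loue, matching the examples given above.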
Phonology
Consonants
Most
consonant sounds of Early Modern English have survived into present-day
English; however, there are still a few notable differences in
pronunciation:
Today's "silent" consonants found in the consonant clusters of such words as knot, gnat, sword were still fully pronounced up until the mid-to-late 16th century and thus possibly by Shakespeare, though they were fully reduced by the early 17th century. The digraph ⟨ght⟩, in words like night, thought, and daughter, originally pronounced [xt] in much older English, was probably reduced to simply [t] (as it is today) or at least heavily reduced in sound to something like [ht], [çt], or [ft]. It seems likely that much variation existed for many of these words.
The now-silent l of would and should may have persisted in being pronounced as late as 1700 in Britain and perhaps several decades longer in the British American colonies. The l in could, however, first appearing in the early 16th century, was presumably never pronounced.
The modern phoneme /ʒ/ was not documented as occurring until the second half of the 17th century. Likely, that phoneme in a word like vision was pronounced as /zj/ and in measure as /z/.
Most words with the spelling ⟨wh⟩, such as what, where, and whale, were still pronounced [ʍ], rather than [w]. That means, for example, that wine and whine were still pronounced differently, unlike in most varieties of English today.
Early Modern English was rhotic. In other words, the r was always pronounced, but the precise nature of the typical rhotic consonant remains unclear. It was, however, certainly one of the following:
The "R" of most varieties of English today: [ɹ̠]
In Early Modern English, the precise nature of the light and dark variants of the l consonant, respectively [l] and [ɫ], remains unclear.
Word-final ⟨ng⟩, as in sing, was still pronounced /ŋɡ/ until the late 16th century, when it began to coalesce into the usual modern pronunciation, [ŋ].
H-dropping at the start of words was common, as it still is in informal English throughout most of England. In loanwords taken from Latin, Greek, or any Romance language, a written h was usually mute well into modern English times, e.g. in heritage, history, hermit, hostage, and still today in heir, honor, hour etc.
With words originating from or passed through ancient Greek, th was commonly pronounced as t, e.g. theme, theater, cathedral, anthem; this is still retained in some proper names as Thomas, Anthony and a few common nouns like thyme.
Pure vowels and diphthongs
The following information primarily comes from studies of the Great Vowel Shift; see the related chart.
The modern English phoneme /aɪ/, as in glide, rhyme and eye, was [ɘi] and later [əi]. Early Modern rhymes indicate that [əi] was also the vowel that was used at the end of words like happy, melody and busy.
/ɛ/, as in fed, elm and hen, was more or less the same as the phoneme represents today or perhaps a slightly higher [ɛ̝], sometimes approaching [ɪ] (as it still does in the word pretty).
/eɪ/, as in name, case and sake, was a long monophthong. It shifted from [æː] to [ɛː] and finally to [eː]. Earlier in Early Modern English, mat and mate were near-homophones, with a longer vowel in the second word. Thus, Shakespeare rhymed words like haste, taste and waste with last and shade with sad. The more open pronunciation remains in some dialects, notably in Scotland, Northern England, and perhaps Ireland. During the 17th century, the phoneme variably merged with the phoneme [ɛi] as in day, weigh, and the merger survived into standard forms of Modern English, though a few dialects kept these vowels distinct at least to the 20th century (see pane–pain merger).
/iː/ (typically spelled ⟨ee⟩ or ⟨ie⟩) as in see, bee and meet, was more or less the same as the phoneme represents today, but it had not yet merged with the phoneme represented by the spellings ⟨ea⟩ or ⟨ei⟩ (and perhaps ⟨ie⟩, particularly with fiend, field and friend), as in east, meal and feat, which were pronounced with [eː] or [ɛ̝ː]. However, words like breath, dead and head may have already split off towards /ɛ/.
/ɪ/, as in bib, pin and thick, was more or less the same as the phoneme represents today.
/oʊ/, as in stone, bode and yolk, was [oː] or [o̞ː]. The phoneme was probably just beginning the process of merging with the phoneme [ou], as in grow, know and mow, without yet achieving today's complete merger. The old pronunciation remains in some dialects, such as in Yorkshire and Scotland.
/ɔɪ/, as in boy, choice and toy, is even less clear than other vowels. By the late-16th century, the similar but distinct phonemes /ɔɪ/, /ʊi/ and /əɪ/ all existed. By the late-17th century, only /ɔɪ/ remained.
Because those phonemes were in such a state of flux during the whole
Early Modern period (with evidence of rhyming occurring among them as
well as with the precursor to /aɪ/), scholars often assume only the most neutral possibility for the pronunciation of /ɔɪ/ as well as its similar phonemes in Early Modern English: [əɪ] (which, if accurate, would constitute an early instance of the line–loin merger since /aɪ/ had not yet fully developed in English).
/ʌ/ (as in drum, enough and love) and /ʊ/ (as in could, full, put) had not yet split and so were both pronounced in the vicinity of [ɤ].
/uː/ was about the same as the phoneme represents today but occurred in not only words like food, moon and stool but also all other words spelled with ⟨oo⟩ like blood, cook and foot. The nature of the vowel sound in the latter group of words, however, is further complicated by the fact that the vowel for some of those words was shortened: either beginning or already in the process of approximating the Early Modern English [ɤ] and later [ʊ]. For instance, at certain stages of the Early Modern period or in certain dialects (or both), doom and come rhymed; this is certainly true in Shakespeare's writing. That phonological split among the ⟨oo⟩ words was a catalyst for the later foot–strut split and is called "early shortening" by John C. Wells.[20] The ⟨oo⟩ words that were pronounced as something like [ɤ] seem to have included blood, brood, doom, good and noon.
/ɪʊ̯/ or /iu̯/ occurred in words spelled with ew or ue such as due and dew. In most dialects of Modern English, it became /juː/ and /uː/ by yod-dropping and so do, dew and due
are now perfect homophones in most American pronunciations, but a
distinction between the two phonemes remains in other versions of
English.
Rhotic vowels
The r sound (the phoneme /r/) was probably always pronounced with following vowel sounds (more in the style of today's General American, West Country English, Irish and Scottish accents, although in Scottish accents the R is rolled, and less like today's typical London accent or Received Pronunciation). Furthermore, /ɛ/, /ɪ/ and /ʌ/ were not necessarily merged before /r/, as they are in most modern English dialects. The stressed modern phoneme /ɜːr/, when it is spelled ⟨er⟩, ⟨ear⟩ and perhaps ⟨or⟩ (as in clerk, earth, or divert), had a vowel sound with an a-like quality, perhaps about [ɐɹ] or [äɹ]. With the spelling ⟨or⟩, the sound may have been backed, more toward [ɒɹ] in words like worth and word. In some pronunciations, words like fair and fear, with the spellings ⟨air⟩ and ⟨ear⟩, rhymed with each other, and words with the spelling ⟨are⟩, such as prepare and compare, were sometimes pronounced with a more open vowel sound, like the verbs are and scar. See Great Vowel Shift § Later mergers for more information.
Particular words
Nature was pronounced approximately as [ˈnɛːtəɹ] and may have rhymed with letter or, early on, even latter. One may have been pronounced own, with both one and other using the era's long GOAT vowel, rather than today's STRUT vowels. Tongue derived from the sound of tong and rhymed with song.
Grammar
Pronouns
Beginning of the Epistle to the Hebrews in the 1611 King James Version. God
who at sundry times, and in divers manners, spake in times past unto
the Fathers by the Prophets, Hath in these last dayes spoken unto us by
his Sonne, whom he hath appointed heire of all things, by whom also he
made the worlds, who being the brightnesse of his glory, and the
expresse image of his person, and upholding all things by the word of
his power, when hee had by himselfe purged our sinnes, sate down on ye
right hand of the Maiestie on high, Being made so much better then the
Angels, as hee hath by inheritance obtained a more excellent Name then
they.
Early Modern English had two second-person personal pronouns: thou, the informal singular pronoun, and ye, the plural (both formal and informal) pronoun and the formal singular pronoun.
"Thou" and "ye" were both common in the early-16th century (they can be seen, for example, in the disputes over Tyndale's translation of the Bible in the 1520s and the 1530s) but by 1650, "thou" seems old-fashioned or literary. It has effectively completely disappeared from Modern Standard English.
The translators of the King James Version of the Bible (begun
1604 and published 1611, while Shakespeare was at the height of his
popularity) had a particular reason for keeping the informal
"thou/thee/thy/thine" forms that were slowly beginning to fall out of
spoken use, as it enabled them to match the Hebrew and Ancient Greek
distinction between second person singular ("thou") and plural ("ye").
It was not to denote reverence (in the King James Version, God addresses
individual people and even Satan as "thou") but only to denote the
singular. Over the centuries, however, the very fact that "thou" was
dropping out of normal use gave it a special aura and so it gradually
and ironically came to be used to express reverence in hymns and in
prayers.
Like other personal pronouns, thou and ye have different forms dependent on their grammatical case; specifically, the objective form of thou is thee, its possessive forms are thy and thine, and its reflexive or emphatic form is thyself.
The objective form of ye is you, its possessive forms are your and yours and its reflexive or emphatic forms are yourself and yourselves.
The older forms "mine" and "thine" had become "my" and "thy" before words beginning with a consonant other than h, and "mine" and "thine" were retained before words beginning with a vowel or an h, as in mine eyes or thine hand.
From the early Early Modern English period up until the 17th century, his was the possessive of the third-person neuter it as well as of the third-person masculine he. Genitive "it" appears once in the 1611 King James Bible (Leviticus 25:5) as groweth of it owne accord.
Verbs
Tense and number
During the Early Modern period, the verb inflections became simplified as they evolved towards their modern forms:
The third-person singular present lost its alternate inflections: -eth and -th became obsolete, and -s survived. (Both forms can be seen together in Shakespeare: "With her, that hateth thee and hates us all".)
The plural present form became uninflected. Present plurals had been marked with -en and singulars with -th or -s (-th and -s survived the longest, especially with the singular use of is, hath and doth). Marked present plurals were rare throughout the Early Modern period and -en was probably used only as a stylistic affectation to indicate rural or old-fashioned speech.
The second-person singular indicative was marked in both the present and past tenses with -st or -est (for example, in the past tense, walkedst or gav'st). Since the indicative past was not and still is not otherwise marked for person or number, the loss of thou made the past subjunctive indistinguishable from the indicative past for all verbs except to be.
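The present-tense endings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rule is purely orthographic (add -st or -th after a final ⟨e⟩, otherwise -est or -eth), consonant doubling (as in runne) is ignored, and the small irregular table is supplied for the example rather than taken from any exhaustive list.

```python
# Irregular present-tense forms. "hath" and "doth" are mentioned in the text;
# the matching thou-forms are added here for illustration only.
IRREGULAR = {
    "have": ("hast", "hath"),
    "do": ("dost", "doth"),
}


def second_person_singular(verb: str) -> str:
    """Form used with 'thou', e.g. walk -> walkest, hate -> hatest."""
    if verb in IRREGULAR:
        return IRREGULAR[verb][0]
    return verb + ("st" if verb.endswith("e") else "est")


def third_person_singular_th(verb: str) -> str:
    """Older third-person form in -eth/-th, e.g. grow -> groweth."""
    if verb in IRREGULAR:
        return IRREGULAR[verb][1]
    return verb + ("th" if verb.endswith("e") else "eth")


for v in ["walk", "hate", "grow", "have", "do"]:
    print(f"thou {second_person_singular(v)}, he {third_person_singular_th(v)}")
```

The competing -s ending (as in hates) is deliberately left out, since the sketch only aims to show the inflections that were lost.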
Modal auxiliaries
The modal auxiliaries
cemented their distinctive syntactical characteristics during the Early
Modern period. Thus, the use of modals without an infinitive became
rare (as in "I must to Coventry"; "I'll none of that"). The use of
modals' present participles to indicate aspect (as in "Maeyinge suffer
no more the loue & deathe of Aurelio" from 1556), and of their
preterite forms to indicate tense (as in "he follow'd Horace so very
close, that of necessity he must fall with him") also became uncommon.
Some verbs ceased to function as modals during the Early Modern period. The present form of must, mot, became obsolete. Dare also lost the syntactical characteristics of a modal auxiliary and evolved a new past form (dared), distinct from the modal durst.
Perfect and progressive forms
The perfect
of the verbs had not yet been standardised to use only the auxiliary
verb "to have". Some took as their auxiliary verb "to be", such as this
example from the King James Version: "But which of you... will say unto
him... when he is come from the field, Go and sit down..." [Luke
XVII:7]. The rules for the auxiliaries for different verbs were similar
to those that are still observed in German and French (see unaccusative verb).
The modern syntax used for the progressive aspect ("I am walking") became dominant by the end of the Early Modern period, but other forms were also common such as the prefix a- ("I am a-walking") and the infinitive paired with "do" ("I do walk"). Moreover, the to be + -ing
verb form could be used to express a passive meaning without any
additional markers: "The house is building" could mean "The house is
being built".
Vocabulary
A number of words that are still in common use in Modern English have undergone semantic narrowing.
The use of the verb "to suffer" in the sense of "to allow"
survived into Early Modern English, as in the phrase "suffer the little
children" of the King James Version, but it has mostly been lost in
Modern English.
This period also offers one of the earliest Russian borrowings into English, itself historically rare: the word "steppe" (Russian: степь) first appeared in English at least as early as 1600, in William Shakespeare's comedy "A Midsummer Night's Dream". It is believed to be a possible indirect borrowing via either German or French.
The substantial borrowing of Latin and sometimes Greek words, begun in Middle English, continued unabated, often supplying terms for abstract concepts for which English had no word.
The genitives my, mine, thy, and thine are used as possessive adjectives before a noun, or as possessive pronouns without a noun. All four forms are used as possessive adjectives: mine and thine are used before nouns beginning in a vowel sound, or before nouns beginning in the letter h, which was usually silent (e.g. thine eyes and mine heart, which was pronounced as mine art) and my and thy before consonants (thy mother, my love). However, only mine and thine are used as possessive pronouns, as in it is thine and they were mine (not *they were my).
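The choice between the short and long possessive forms can be sketched as a small Python function. The names below are hypothetical, and the sketch decides by the first letter of the noun as written (a vowel letter or ⟨h⟩), which only approximates the rule stated in terms of vowel sounds.

```python
# (short form, long form) for the first- and second-person singular.
FORMS = {
    "first": ("my", "mine"),
    "second": ("thy", "thine"),
}


def possessive_adjective(person: str, noun: str) -> str:
    """Pick my/mine or thy/thine for use before a noun.

    Assumption: spelling stands in for sound, so any noun written with an
    initial vowel letter or <h> takes the long form.
    """
    short, long_form = FORMS[person]
    return long_form if noun[0].lower() in "aeiouh" else short


def possessive_pronoun(person: str) -> str:
    """Without a following noun only the long form is used (it is thine)."""
    return FORMS[person][1]


for person, noun in [("second", "eyes"), ("first", "heart"),
                     ("second", "mother"), ("first", "love")]:
    print(possessive_adjective(person, noun), noun)
```

Applied to the examples in the text, this gives thine eyes, mine heart, thy mother and my love.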
A language is a structured system of communication used by humans. Languages can be based on speech and gesture (spoken language), sign, or writing. The structure of language is its grammar and the free components are its vocabulary. Many languages, including the most widely-spoken ones, have writing systems that enable sounds or signs to be recorded for later reactivation. Human language is unique among the known systems of animal communication in that it is not dependent on a single mode of transmission (sight, sound, etc.), is highly variable between cultures and across time, and affords a much wider range of expression than other systems.
Estimates of the number of human languages in the world vary
between 5,000 and 7,000. Precise estimates depend on an arbitrary
distinction (dichotomy) being established between languages and dialects. Natural languages are spoken, signed, or both; however, any language can be encoded into secondary media using auditory, visual, or tactile stimuli – for example, writing, whistling, signing, or braille. In other words, human language is modality-independent, but written or signed language is the way to inscribe or encode the natural human speech or gestures.
Depending on philosophical perspectives
regarding the definition of language and meaning, when used as a
general concept, "language" may refer to the cognitive ability to learn
and use systems of complex communication, or to describe the set of
rules that makes up these systems, or the set of utterances that can be
produced from those rules. All languages rely on the process of semiosis to relate signs to particular meanings. Oral, manual and tactile languages contain a phonological system that governs how symbols are used to form sequences known as words or morphemes, and a syntactic system that governs how words and morphemes are combined to form phrases and utterances.
The scientific study of language is called linguistics. Questions in the philosophy of language, such as the relationship between language and thought and how words represent experience, have been debated at least since Gorgias and Plato in ancient Greek civilization. Thinkers such as Rousseau (1712–1778) have argued that language originated from emotions, while others, like Kant (1724–1804), have held that it originated from rational and logical thought. Twentieth-century philosophers such as Wittgenstein (1889–1951) argued that philosophy is really the study of language itself. Major figures in contemporary linguistics of these times include Ferdinand de Saussure and Noam Chomsky.
Language is thought to have gradually diverged from earlier primate communication systems when early hominins acquired the ability to form a theory of mind and shared intentionality.
This development is sometimes thought to have coincided with an
increase in brain volume, and many linguists see the structures of
language as having evolved to serve specific communicative and social
functions. Language is processed in many different locations in the human brain, but especially in Broca's and Wernicke's areas. Humans acquire
language through social interaction in early childhood, and children
generally speak fluently by approximately three years old. Language and
culture are codependent. Therefore, in addition to its strictly
communicative uses, language has social uses such as signifying group identity, social stratification, as well as use for social grooming and entertainment.
Languages evolve and diversify over time, and the history of their evolution can be reconstructed by comparing
modern languages to determine which traits their ancestral languages
must have had in order for the later developmental stages to occur. A
group of languages that descend from a common ancestor is known as a language family; in contrast, a language that has been demonstrated to not have any living or non-living relationship with another language is called a language isolate. There are also many unclassified languages whose relationships have not been established, and spurious languages
may not have existed at all. Academic consensus holds that between 50%
and 90% of languages spoken at the beginning of the 21st century will
probably have become extinct by the year 2100.
As an object of linguistic study, "language" has two primary
meanings: an abstract concept, and a specific linguistic system, e.g. "French". The Swiss linguist Ferdinand de Saussure, who defined the modern discipline of linguistics, first explicitly formulated the distinction using the French word langage for language as a concept, langue as a specific instance of a language system, and parole for the concrete usage of speech in a particular language.
When speaking of language as a general concept, definitions can be used which stress different aspects of the phenomenon.
These definitions also entail different approaches and understandings
of language, and they also inform different and often incompatible
schools of linguistic theory. Debates about the nature and origin of language go back to the ancient world. Greek philosophers such as Gorgias and Plato
debated the relation between words, concepts and reality. Gorgias
argued that language could represent neither the objective experience
nor human experience, and that communication and truth were therefore
impossible. Plato maintained that communication is possible because
language represents ideas and concepts that exist independently of, and
prior to, language.
During the Enlightenment and its debates about human origins, it became fashionable to speculate about the origin of language. Thinkers such as Rousseau and Herder
argued that language had originated in the instinctive expression of
emotions, and that it was originally closer to music and poetry than to
the logical expression of rational thought. Rationalist philosophers
such as Kant and Descartes
held the opposite view. Around the turn of the 20th century, thinkers
began to wonder about the role of language in shaping our experiences of
the world – asking whether language simply reflects the objective
structure of the world, or whether it creates concepts that it in turn
imposes on our experience of the objective world. This led to the
question of whether philosophical problems are really firstly linguistic
problems. The resurgence of the view that language plays a significant
role in the creation and circulation of concepts, and that the study of
philosophy is essentially the study of language, is associated with what
has been called the linguistic turn and philosophers such as Wittgenstein
in 20th-century philosophy. These debates about language in relation to
meaning and reference, cognition and consciousness remain active today.
Mental faculty, organ or instinct
One definition sees language primarily as the mental faculty
that allows humans to undertake linguistic behaviour: to learn
languages and to produce and understand utterances. This definition
stresses the universality of language to all humans, and it emphasizes
the biological basis for the human capacity for language as a unique
development of the human brain.
Proponents of the view that the drive to language acquisition is innate
in humans argue that this is supported by the fact that all cognitively
normal children raised in an environment where language is accessible
will acquire language without formal instruction. Languages may even
develop spontaneously in environments where people live or grow up
together without a common language; for example, creole languages and spontaneously developed sign languages such as Nicaraguan Sign Language. This view, which can be traced back to the philosophers Kant and Descartes, understands language to be largely innate, for example, in Chomsky's theory of Universal Grammar, or American philosopher Jerry Fodor's extreme innatist theory. These kinds of definitions are often applied in studies of language within a cognitive science framework and in neurolinguistics.
Formal symbolic system
Another definition sees language as a formal system
of signs governed by grammatical rules of combination to communicate
meaning. This definition stresses that human languages can be described
as closed structural systems consisting of rules that relate particular signs to particular meanings. This structuralist view of language was first introduced by Ferdinand de Saussure, and his structuralism remains foundational for many approaches to language.
Some proponents of Saussure's view of language have advocated a
formal approach which studies language structure by identifying its
basic elements and then by presenting a formal account of the rules
according to which the elements combine in order to form words and
sentences. The main proponent of such a theory is Noam Chomsky, the originator of the generative theory of grammar, who has defined language as the construction of sentences that can be generated using transformational grammars. Chomsky considers these rules to be an innate feature of the human mind and to constitute the rudiments of what language is. By way of contrast, such transformational grammars are also commonly used in formal logic, in formal linguistics, and in applied computational linguistics.
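The idea of a formal account of "the rules according to which the elements combine" can be illustrated with a minimal sketch. The grammar below is a hypothetical toy set of rewrite rules invented for this example; it uses plain phrase-structure rules only, has no transformations, and is not a reconstruction of Chomsky's own formalism.

```python
from itertools import product

# Hypothetical toy grammar (an assumption for illustration only):
# each symbol maps to a list of alternative expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["bird"]],
    "V": [["sees"], ["sings"]],
}

def expand(symbol):
    """Return every word sequence that the grammar derives from `symbol`."""
    if symbol not in GRAMMAR:
        # A symbol without rules is a terminal, i.e. an actual word.
        return [[symbol]]
    results = []
    for rule in GRAMMAR[symbol]:
        # Expand each part of the rule, then combine all alternatives.
        parts = [expand(part) for part in rule]
        for combo in product(*parts):
            results.append([word for seq in combo for word in seq])
    return results

sentences = [" ".join(words) for words in expand("S")]
```

Because this toy grammar has no recursive rule, it generates a finite set of sentences; adding a rule in which a symbol can reappear inside its own expansion would make the set unbounded, and the exhaustive `expand` function would then no longer terminate.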
In the philosophy of language, the view of linguistic meaning as
residing in the logical relations between propositions and reality was
developed by philosophers such as Alfred Tarski, Bertrand Russell, and other formal logicians.
Yet another definition sees language as a system of communication
that enables humans to exchange verbal or symbolic utterances. This
definition stresses the social functions of language and the fact that
humans use it to express themselves and to manipulate objects in their
environment. Functional theories of grammar
explain grammatical structures by their communicative functions, and
understand the grammatical structures of language to be the result of an
adaptive process by which grammar was "tailored" to serve the
communicative needs of its users.
This view of language is associated with the study of language in pragmatic, cognitive, and interactive frameworks, as well as in sociolinguistics and linguistic anthropology.
Functionalist theories tend to study grammar as dynamic phenomena, as
structures that are always in the process of changing as they are
employed by their speakers. This view places importance on the study of linguistic typology, or the classification of languages according to structural features, as it can be shown that processes of grammaticalization tend to follow trajectories that are partly dependent on typology. In the philosophy of language, the view of pragmatics as being central to language and meaning is often associated with Wittgenstein's later works and with ordinary language philosophers such as J. L. Austin, Paul Grice, John Searle, and W. V. O. Quine.
A number of features, many of which were described by Charles Hockett and called design features, set human language apart from communication used by non-human animals.
Communication systems used by other animals such as bees or apes are closed systems that consist of a finite, usually very limited, number of possible ideas that can be expressed. In contrast, human language is open-ended and productive,
meaning that it allows humans to produce a vast range of utterances
from a finite set of elements, and to create new words and sentences.
This is possible because human language is based on a dual code, in
which a finite number of elements which are meaningless in themselves
(e.g. sounds, letters or gestures) can be combined to form an infinite
number of larger units of meaning (words and sentences). However, one study has demonstrated that an Australian bird, the chestnut-crowned babbler, is capable of using the same acoustic elements in different arrangements to create two functionally distinct vocalizations. Additionally, pied babblers
have demonstrated the ability to generate two functionally distinct
vocalisations composed of the same sound type, which can only be
distinguished by the number of repeated elements.
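The arithmetic behind this productivity can be sketched briefly. The functions below are illustrative assumptions, not a model of any real language: they treat every sequence of elements as possible, so the counts are upper bounds, since actual languages permit only some combinations.

```python
from itertools import product

def count_sequences(inventory_size, max_length):
    """Count all sequences of 1..max_length elements drawn from the inventory."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))

def sequences(inventory, length):
    """List every sequence of exactly `length` elements from `inventory`."""
    return ["".join(p) for p in product(inventory, repeat=length)]

# Two meaningless elements already yield four two-element combinations.
pairs = sequences("ab", 2)

# An inventory of 10 elements allows 111,110 sequences of up to 5 elements.
total = count_sequences(10, 5)
```

The count grows exponentially with the permitted length, which is why a small inventory of meaningless elements is enough to support an unbounded number of larger units when no length limit is imposed.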
Several species of animals have proved to be able to acquire forms of communication through social learning: for instance a bonobo named Kanzi learned to express itself using a set of symbolic lexigrams.
Similarly, many species of birds and whales learn their songs by
imitating other members of their species. However, while some animals
may acquire large numbers of words and symbols,
none have been able to learn as many different signs as are generally
known by an average four-year-old human, nor have any acquired anything
resembling the complex grammar of human language.
Human languages differ from animal communication systems in that they employ grammatical and semantic categories, such as noun and verb, present and past, which may be used to express exceedingly complex meanings. Human language is also distinguished by the property of recursivity:
for example, a noun phrase can contain another noun phrase (as in
"[[the chimpanzee]'s lips]") or a clause can contain another clause (as
in "[I see [the dog is running]]"). Human language is the only known natural communication system whose adaptability may be referred to as modality independent.
This means that it can be used not only for communication through one
channel or medium, but through several. For example, spoken language
uses the auditive modality, whereas sign languages and writing use the visual modality, and braille writing uses the tactile modality.
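The recursive embedding mentioned above, in which a phrase contains another phrase of the same kind, can be modeled with nested lists. This is a minimal sketch under the assumption that a phrase is a list whose items are either words or embedded phrases; the representation and function names are hypothetical and chosen only for illustration.

```python
def depth(phrase):
    """Return how many levels of phrases are nested inside one another."""
    inner = [depth(item) for item in phrase if isinstance(item, list)]
    return 1 + max(inner, default=0)

def flatten(phrase):
    """Return the words of a phrase in order, ignoring the bracketing."""
    words = []
    for item in phrase:
        if isinstance(item, list):
            words.extend(flatten(item))
        else:
            words.append(item)
    return words

# "[[the chimpanzee]'s lips]": a noun phrase inside a noun phrase.
noun_phrase = [["the", "chimpanzee"], "'s", "lips"]

# "[I see [the dog is running]]": a clause inside a clause.
clause = ["I", "see", ["the", "dog", "is", "running"]]

# Embedding can be repeated: "[[[the chimpanzee]'s keeper]'s lips]".
deeper = [[["the", "chimpanzee"], "'s", "keeper"], "'s", "lips"]
```

Both functions call themselves on embedded phrases, mirroring the way the structure itself is defined in terms of smaller structures of the same type.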
Human language is unusual in being able to refer to abstract
concepts and to imagined or hypothetical events as well as events that
took place in the past or may happen in the future. This ability to
refer to events that are not at the same time or place as the speech
event is called displacement, and while some animal communication systems can use displacement (such as the communication of bees
that can communicate the location of sources of nectar that are out of
sight), the degree to which it is used in human language is also
considered unique.
Theories about the origin of language differ in regard to their basic
assumptions about what language is. Some theories are based on the idea
that language is so complex that one cannot imagine it simply appearing
from nothing in its final form, but that it must have evolved from
earlier pre-linguistic systems among our pre-human ancestors. These
theories can be called continuity-based theories. The opposite viewpoint
is that language is such a unique human trait that it cannot be
compared to anything found among non-humans and that it must therefore
have appeared suddenly in the transition from pre-hominids to early man.
These theories can be defined as discontinuity-based. Similarly,
theories based on the generative view of language pioneered by Noam Chomsky
see language mostly as an innate faculty that is largely genetically
encoded, whereas functionalist theories see it as a system that is
largely cultural, learned through social interaction.
Continuity-based theories are held by a majority of scholars, but they
vary in how they envision this development. Those who see language as
being mostly innate, such as psychologist Steven Pinker, hold the precedents to be animal cognition, whereas those who see language as a socially learned tool of communication, such as psychologist Michael Tomasello, see it as having developed from animal communication in primates: either gestural or vocal communication to assist in cooperation. Other continuity-based models see language as having developed from music, a view already espoused by Rousseau, Herder, Humboldt, and Charles Darwin. A prominent proponent of this view is archaeologist Steven Mithen. Stephen Anderson states that the age of spoken languages is estimated at 60,000 to 100,000 years and that:
Researchers
on the evolutionary origin of language generally find it plausible to
suggest that language was invented only once, and that all modern spoken
languages are thus in some way related, even if that relation can no
longer be recovered ... because of limitations on the methods available
for reconstruction.
Because language emerged in the early prehistory
of man, before the existence of any written records, its early
development has left no historical traces, and it is believed that no
comparable processes can be observed today. Theories that stress
continuity often look at animals to see if, for example, primates
display any traits that can be seen as analogous to what pre-human
language must have been like. Early human fossils can be inspected for
traces of physical adaptation to language use or pre-linguistic forms of
symbolic behaviour. Among the signs in human fossils that may suggest
linguistic abilities are: the size of the brain relative to body mass,
the presence of a larynx capable of advanced sound production and the nature of tools and other manufactured artifacts.
It was mostly undisputed that pre-human australopithecines did not have communication systems significantly different from those found in great apes in general. However, a 2017 study on Ardipithecus ramidus challenges this belief. Scholarly opinions vary as to the developments since the appearance of the genus Homo
some 2.5 million years ago. Some scholars assume the development of
primitive language-like systems (proto-language) as early as Homo habilis (2.3 million years ago) while others place the development of primitive symbolic communication only with Homo erectus (1.8 million years ago) or Homo heidelbergensis (0.6 million years ago), and the development of language proper with anatomically modern Homo sapiens with the Upper Paleolithic revolution less than 100,000 years ago.
Chomsky is one prominent proponent of a discontinuity-based theory of human language origins.
He suggests that for scholars interested in the nature of language,
"talk about the evolution of the language capacity is beside the point."
Chomsky proposes that perhaps "some random mutation took place [...]
and it reorganized the brain, implanting a language organ in an
otherwise primate brain."
Though cautioning against taking this story literally, Chomsky insists
that "it may be closer to reality than many other fairy tales that are
told about evolutionary processes, including language."
The study of language, linguistics, has been developing into a science since the first grammatical descriptions of particular languages in India more than 2000 years ago, after the development of the Brahmi script.
Modern linguistics is a science that concerns itself with all aspects
of language, examining it from all of the theoretical viewpoints
described above.
Subdisciplines
The academic study of language is conducted within many different
disciplinary areas and from different theoretical angles, all of which
inform modern approaches to linguistics. For example, descriptive linguistics examines the grammar of single languages, theoretical linguistics
develops theories on how best to conceptualize and define the nature of
language based on data from the various extant human languages, sociolinguistics
studies how languages are used for social purposes informing in turn
the study of the social functions of language and grammatical
description, neurolinguistics studies how language is processed in the human brain and allows the experimental testing of theories, computational linguistics
builds on theoretical and descriptive linguistics to construct
computational models of language often aimed at processing natural
language or at testing linguistic hypotheses, and historical linguistics
relies on grammatical and lexical descriptions of languages to trace
their individual histories and reconstruct trees of language families by
using the comparative method.
The formal study of language is often considered to have started in India with Pāṇini, the 5th-century BC grammarian who formulated 3,959 rules of Sanskrit morphology. However, Sumerian scribes already studied the differences between Sumerian and Akkadian grammar around 1900 BC. Subsequent grammatical traditions developed in all of the ancient cultures that adopted writing.
In the 17th century AD, the French Port-Royal Grammarians
developed the idea that the grammars of all languages were a reflection
of the universal basics of thought, and therefore that grammar was
universal. In the 18th century, the first use of the comparative method by British philologist and expert on ancient India William Jones sparked the rise of comparative linguistics. The scientific study of language was broadened from Indo-European to language in general by Wilhelm von Humboldt. Early in the 20th century, Ferdinand de Saussure introduced the idea of language as a static system of interconnected units, defined through the oppositions between them.
By introducing a distinction between diachronic and synchronic
analyses of language, he laid the foundation of the modern discipline
of linguistics. Saussure also introduced several basic dimensions of
linguistic analysis that are still fundamental in many contemporary
linguistic theories, such as the distinctions between syntagm and paradigm, and the langue-parole distinction, distinguishing language as an abstract system (langue) from language as a concrete manifestation of this system (parole).
Modern linguistics
Noam Chomsky is one of the most important linguistic theorists of the 20th century.
In the 1960s, Noam Chomsky formulated the generative theory of language.
According to this theory, the most basic form of language is a set of
syntactic rules that is universal for all humans and which underlies the
grammars of all human languages. This set of rules is called Universal Grammar;
for Chomsky, describing it is the primary objective of the discipline
of linguistics. Thus, he considered that the grammars of individual
languages are only of importance to linguistics insofar as they allow us
to deduce the universal underlying rules from which the observable
linguistic variability is generated.
In opposition to the formal theories of the generative school, functional theories of language
propose that since language is fundamentally a tool, its structures are
best analyzed and understood by reference to their functions. Formal theories of grammar
seek to define the different elements of language and describe the way
they relate to each other as systems of formal rules or operations,
while functional theories seek to define the functions performed by
language and then relate them to the linguistic elements that carry them
out. The framework of cognitive linguistics
interprets language in terms of the concepts (which are sometimes
universal, and sometimes specific to a particular language) which
underlie its forms. Cognitive linguistics is primarily concerned with
how the mind creates meaning through language.
Physiological and neural architecture of language and speech
Speaking is the default modality for language in all cultures. The
production of spoken language depends on sophisticated capacities for
controlling the lips, tongue and other components of the vocal
apparatus, the ability to acoustically decode speech sounds, and the
neurological apparatus required for acquiring and producing language. The study of the genetic bases for human language is at an early stage: the only gene that has definitely been implicated in language production is FOXP2, which may cause a kind of congenital language disorder if affected by mutations.
The brain is the coordinating center of all linguistic activity; it
controls both the production of linguistic cognition and of meaning and
the mechanics of speech production. Nonetheless, our knowledge of the
neurological bases for language is quite limited, though it has advanced
considerably with the use of modern imaging techniques. The discipline
of linguistics dedicated to studying the neurological aspects of
language is called neurolinguistics.
Early work in neurolinguistics involved the study of language in
people with brain lesions, to see how lesions in specific areas affect
language and speech. In this way, neuroscientists in the 19th century
discovered that two areas in the brain are crucially implicated in
language processing. The first area is Wernicke's area, which is in the posterior section of the superior temporal gyrus in the dominant cerebral hemisphere. People with a lesion in this area of the brain develop receptive aphasia,
a condition in which there is a major impairment of language
comprehension, while speech retains a natural-sounding rhythm and a
relatively normal sentence structure. The second area is Broca's area, in the posterior inferior frontal gyrus of the dominant hemisphere. People with a lesion to this area develop expressive aphasia, meaning that they know what they want to say but cannot get it out.
They are typically able to understand what is being said to them, but
unable to speak fluently. Other symptoms that may be present in
expressive aphasia include problems with word repetition.
The condition affects both spoken and written language. Those with this
aphasia also exhibit ungrammatical speech and show inability to use
syntactic information to determine the meaning of sentences. Both
expressive and receptive aphasia also affect the use of sign language,
in analogous ways to how they affect speech, with expressive aphasia
causing signers to sign slowly and with incorrect grammar, whereas a
signer with receptive aphasia will sign fluently, but make little sense
to others and have difficulties comprehending others' signs. This shows
that the impairment is specific to the ability to use language, not to
the physiology used for speech production.
With technological advances in the late 20th century, neurolinguists have also incorporated non-invasive techniques such as functional magnetic resonance imaging (fMRI) and electrophysiology to study language processing in individuals without impairments.
Real time MRI scan of a person speaking in Mandarin Chinese
Spoken language relies on human physical ability to produce sound, which is a longitudinal wave propagated through the air at a frequency capable of vibrating the ear drum. This ability depends on the physiology of the human speech organs. These organs consist of the lungs, the voice box (larynx),
and the upper vocal tract – the throat, the mouth, and the nose. By
controlling the different parts of the speech apparatus, the airstream
can be manipulated to produce different speech sounds.
The sound of speech can be analyzed into a combination of segmental and suprasegmental
elements. The segmental elements are those that follow each other in
sequences, which are usually represented by distinct letters in
alphabetic scripts, such as the Roman script. In free flowing speech,
there are no clear boundaries between one segment and the next, nor
usually are there any audible pauses between them. Segments therefore
are distinguished by their distinct sounds which are a result of their
different articulations, and can be either vowels or consonants.
Suprasegmental phenomena encompass such elements as stress, phonation type, voice timbre, and prosody or intonation, all of which may have effects across multiple segments.
Consonant and vowel segments combine to form syllables, which in turn combine to form utterances; these can be distinguished phonetically as the space between two inhalations. Acoustically, these different segments are characterized by different formant structures, which are visible in a spectrogram of the recorded sound wave. Formants are the amplitude peaks in the frequency spectrum of a specific sound.
Vowels are those sounds that have no audible friction caused by
the narrowing or obstruction of some part of the upper vocal tract. They
vary in quality according to the degree of lip aperture and the
placement of the tongue within the oral cavity. Vowels are called close when the lips are relatively closed, as in the pronunciation of the vowel [i] (English "ee"), or open when the lips are relatively open, as in the vowel [a] (English "ah"). If the tongue is located towards the back of the mouth, the quality changes, creating vowels such as [u] (English "oo"). The quality also changes depending on whether the lips are rounded as opposed to unrounded, creating distinctions such as that between [i] (unrounded front vowel such as English "ee") and [y] (rounded front vowel such as German "ü").
Consonants are those sounds that have audible friction or closure
at some point within the upper vocal tract. Consonant sounds vary by
place of articulation, i.e. the place in the vocal tract where the
airflow is obstructed, commonly at the lips, teeth, alveolar ridge, palate, velum, uvula, or glottis. Each place of articulation produces a different set of consonant sounds, which are further distinguished by manner of articulation, or the kind of friction, whether full closure, in which case the consonant is called occlusive or stop, or different degrees of aperture creating fricatives and approximants. Consonants can also be either voiced or unvoiced,
depending on whether the vocal cords are set in vibration by airflow
during the production of the sound. Voicing is what separates English [s] in bus (unvoiced sibilant) from [z] in buzz (voiced sibilant).
Some speech sounds, both vowels and consonants, involve release of air flow through the nasal cavity, and these are called nasals or nasalized sounds. Other sounds are defined by the way the tongue moves within the mouth such as the l-sounds (called laterals, because the air flows along both sides of the tongue), and the r-sounds (called rhotics).
By using these speech organs, humans can produce hundreds of
distinct sounds: some appear very often in the world's languages,
whereas others are much more common in certain language families,
language areas, or even specific to a single language.
Modality
Human language is plastic in its choice of the mode used to convey it. Two modes of communication appear to be fundamental: oral (speech and mouthing) and manual (sign and gesture). It is common for oral language to be accompanied by gesture, and for sign language to be accompanied by mouthing.
In addition, some language communities use both modes to convey lexical
or grammatical meaning, each mode complementing the other. Such bimodal
use of language is especially common in genres such as story-telling
(with Plains Indian Sign Language and Australian Aboriginal sign languages
used alongside oral language, for example), but also occurs in mundane
conversation. For instance, many Australian languages have a rich set of
case
suffixes that provide details about the instrument used to perform an
action. Others lack such grammatical precision in the oral mode, but
supplement it with gesture to convey that information in the sign mode.
In Iwaidja,
for example, 'he went out for fish using a torch' is spoken as simply
"he-hunted fish torch", but the word for 'torch' is accompanied by a
gesture indicating that it was held. In another example, the ritual
language Damin
had a heavily reduced oral vocabulary of only a few hundred words, each
of which was very general in meaning, but which were supplemented by
gesture for greater precision (e.g., the single word for fish, l*i, was accompanied by a gesture to indicate the kind of fish).
Secondary modes of language, by which a fundamental mode is conveyed in a different medium, include writing (including braille), sign (in manually coded language), whistling and drumming. Tertiary modes – such as semaphore, Morse code and spelling alphabets
– convey the secondary mode of writing in a different medium. For some
extinct languages that are maintained for ritual or liturgical purposes,
writing may be the primary mode, with speech secondary.
Structure
When described as a system of symbolic communication, language is traditionally seen as consisting of three parts: signs, meanings, and a code connecting signs with their meanings. The study of the process of semiosis, how signs and meanings are combined, used, and interpreted is called semiotics.
Signs can be composed of sounds, gestures, letters, or symbols,
depending on whether the language is spoken, signed, or written, and
they can be combined into complex signs, such as words and phrases. When
used in communication, a sign is encoded and transmitted by a sender
through a channel to a receiver who decodes it.
Some of the properties that define human language as opposed to other
communication systems are: the arbitrariness of the linguistic sign,
meaning that there is no predictable connection between a linguistic
sign and its meaning; the duality of the linguistic system, meaning that
linguistic structures are built by combining elements into larger
structures that can be seen as layered, e.g. how sounds build words and
words build phrases; the discreteness of the elements of language,
meaning that the elements out of which linguistic signs are constructed
are discrete units, e.g. sounds and words, that can be distinguished
from each other and rearranged in different patterns; and the
productivity of the linguistic system, meaning that the finite number of
linguistic elements can be combined into a theoretically infinite
number of combinations.
The rules by which signs can be combined to form words and phrases are called syntax or grammar. The meaning that is connected to individual signs, morphemes, words, phrases, and texts is called semantics.
The division of language into separate but connected systems of sign
and meaning goes back to the first linguistic studies of de Saussure and
is now used in almost all branches of linguistics.
Languages express meaning by relating a sign form to a meaning, or
its content. Sign forms must be something that can be perceived, for
example, in sounds, images, or gestures, and then related to a specific
meaning by social convention. Because the basic relation of meaning for
most linguistic signs is based on social convention, linguistic signs
can be considered arbitrary, in the sense that the convention is
established socially and historically, rather than by means of a natural
relation between a specific sign form and its meaning.
Thus, languages must have a vocabulary of signs related to specific meaning. The English sign "dog" denotes, for example, a member of the species Canis familiaris. In a language, the array of arbitrary signs connected to specific meanings is called the lexicon, and a single sign connected to a meaning is called a lexeme.
Not all meanings in a language are represented by single words. Often,
semantic concepts are embedded in the morphology or syntax of the
language in the form of grammatical categories.
All languages contain the semantic structure of predication:
a structure that predicates a property, state, or action.
Traditionally, semantics has been understood to be the study of how
speakers and interpreters assign truth values
to statements, so that meaning is understood to be the process by which
a predicate can be said to be true or false about an entity, e.g. "[x
[is y]]" or "[x [does y]]". Recently, this model of semantics has been
complemented with more dynamic models of meaning that incorporate shared
knowledge about the context in which a sign is interpreted into the
production of meaning. Such models of meaning are explored in the field
of pragmatics.
A spectrogram showing the sound of the spoken English word "man", which is written phonetically as [mæn].
Note that in flowing speech, there is no clear division between
segments, only a smooth transition as the vocal apparatus moves.
Depending on modality, language structure can be based on systems of
sounds (speech), gestures (sign languages), or graphic or tactile
symbols (writing). The ways in which languages use sounds or signs to
construct meaning are studied in phonology.
Sounds as part of a linguistic system are called phonemes.
Phonemes are abstract units of sound, defined as the smallest units in a
language that can serve to distinguish between the meaning of a pair of
minimally different words, a so-called minimal pair. In English, for example, the words bat [bæt] and pat [pʰæt] form a minimal pair, in which the distinction between /b/ and /p/
differentiates the two words, which have different meanings. However,
each language contrasts sounds in different ways. For example, in a
language that does not distinguish between voiced and unvoiced
consonants, the sounds [p] and [b]
(if they both occur) could be considered a single phoneme, and
consequently, the two pronunciations would have the same meaning.
Similarly, the English language does not distinguish phonemically
between aspirated and non-aspirated pronunciations of consonants, as many other languages like Korean and Hindi do: the unaspirated /p/ in spin [spɪn] and the aspirated /p/ in pin [pʰɪn] are considered to be merely different ways of pronouncing the same phoneme (such variants of a single phoneme are called allophones), whereas in Mandarin Chinese, the same difference in pronunciation distinguishes between the words [pʰá] 'crouch' and [pá] 'eight' (the accent above the á means that the vowel is pronounced with a high tone).
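The minimal-pair test described above can be sketched as a simple comparison of transcriptions. The lexicon below is a hypothetical four-word sample, and the transcriptions are assumed to be phonemic for English, so aspirated and unaspirated [p] are both written as the single phoneme "p".

```python
def is_minimal_pair(a, b):
    """True if two transcriptions have equal length and differ in exactly one segment."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(lexicon):
    """Return all pairs of words in `lexicon` that form a minimal pair."""
    words = sorted(lexicon)
    pairs = []
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            if is_minimal_pair(lexicon[first], lexicon[second]):
                pairs.append((first, second))
    return pairs

# Hypothetical sample lexicon: word -> tuple of phoneme symbols.
LEXICON = {
    "bat": ("b", "æ", "t"),
    "pat": ("p", "æ", "t"),
    "pin": ("p", "ɪ", "n"),
    "spin": ("s", "p", "ɪ", "n"),
}

found = minimal_pairs(LEXICON)
```

With this sample, only "bat" and "pat" qualify, because they differ in a single segment. Whether two sounds count as different segments depends on the language being described: a transcription for a language with contrastive aspiration would keep the aspirated and unaspirated stops apart as separate symbols.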
All spoken languages have phonemes of at least two different categories, vowels and consonants, that can be combined to form syllables.
As well as segments such as consonants and vowels, some languages also
use sound in other ways to convey meaning. Many languages, for example,
use stress, pitch, duration, and tone to distinguish meaning. Because these phenomena operate outside of the level of single segments, they are called suprasegmental. Some languages have only a few phonemes, for example, Rotokas and Pirahã, with 11 and 10 phonemes respectively, whereas languages like Taa may have as many as 141 phonemes. In sign languages, the equivalent to phonemes (formerly called cheremes)
are defined by the basic elements of gestures, such as hand shape,
orientation, location, and motion, which correspond to manners of
articulation in spoken language.
Writing systems represent language using visual symbols, which may or may not correspond to the sounds of spoken language. The Latin alphabet
(and those on which it is based or that have been derived from it) was
originally based on the representation of single sounds, so that words
were constructed from letters that generally denote a single consonant
or vowel in the structure of the word. In syllabic scripts, such as the Inuktitut syllabary, each sign represents a whole syllable. In logographic scripts, each sign represents an entire word, and will generally bear no relation to the sound of that word in spoken language.
Because all languages have a very large number of words, no
purely logographic scripts are known to exist. Written language
represents the way spoken sounds and words follow one after another by
arranging symbols according to a pattern that follows a certain
direction. The direction used in a writing system is entirely arbitrary
and established by convention. Some writing systems use the horizontal
axis (left to right as the Latin script or right to left as the Arabic script),
while others such as traditional Chinese writing use the vertical
dimension (from top to bottom). A few writing systems use opposite
directions for alternating lines, and others, such as the ancient Maya
script, can be written in either direction and rely on graphic cues to
show the reader the direction of reading.
In order to represent the sounds of the world's languages in writing, linguists have developed the International Phonetic Alphabet, designed to represent all of the discrete sounds that are known to contribute to meaning in human languages.
Grammar is the study of how meaningful elements called morphemes within a language can be combined into utterances. Morphemes can either be free or bound. If they are free to be moved around within an utterance, they are usually called words, and if they are bound to other words or morphemes, they are called affixes.
The way in which meaningful elements can be combined within a language
is governed by rules. The study of the rules for the internal structure
of words is called morphology. The rules of the internal structure of phrases and sentences are called syntax.
Grammar can be described as a system of categories and a set of rules
that determine how categories combine to form different aspects of
meaning.
Languages differ widely in whether a given meaning is encoded through
grammatical categories or through lexical units. However, several categories are so common
as to be nearly universal. Such universal categories include the
encoding of the grammatical relations between participants and predicates, the encoding of temporal and spatial relations on predicates, and a system of grammatical person governing reference to and distinction between speakers, addressees, and those about whom they are speaking.
Word classes
Languages organize their parts of speech
into classes according to their functions and positions relative to
other parts. All languages, for instance, make a basic distinction
between a group of words that prototypically denotes things and concepts
and a group of words that prototypically denotes actions and events.
Words of the first group, such as English "dog" and "song",
are usually called nouns. Words of the second, such as "think" and "sing", are called verbs. Another common category is the adjective:
words that describe properties or qualities of nouns, such as "red" or
"big". Word classes can be "open" if new words can continuously be added
to the class, or relatively "closed" if there is a fixed number of
words in a class. In English, the class of pronouns is closed, whereas
the class of adjectives is open, since an infinite number of adjectives
can be constructed from verbs (e.g. "saddened") or nouns (e.g. with the
-like suffix, as in "noun-like"). In other languages such as Korean, the situation is the opposite, and new pronouns can be constructed, whereas the number of adjectives is fixed.
Word classes also carry out differing functions in grammar. Prototypically, verbs are used to construct predicates, while nouns are used as arguments
of predicates. In a sentence such as "Sally runs", the predicate is
"runs", because it is the word that predicates a specific state about
its argument "Sally". Some verbs such as "curse" can take two arguments,
e.g. "Sally cursed John". A predicate that can only take a single
argument is called intransitive, while a predicate that can take two arguments is called transitive.
Many other word classes exist in different languages, such as conjunctions like "and" that serve to join two sentences, articles that introduce a noun, interjections such as "wow!", or ideophones
like "splash" that mimic the sound of some event. Some languages have
positionals that describe the spatial position of an event or entity.
Many languages have classifiers that identify countable nouns as belonging to a particular type or having a particular shape. For instance, in Japanese, the general noun classifier for humans is nin (人), and it is used for counting humans, whatever they are called:
san-nin no gakusei (三人の学生) lit. "3 human-classifier of student" — three students
For trees, it would be:
san-bon no ki (三本の木) lit. "3 classifier-for-long-objects of tree" — three trees
Morphology
In linguistics, the study of the internal structure of complex words and the processes by which words are formed is called morphology. In most languages, it is possible to construct complex words that are built of several morphemes.
For instance, the English word "unexpected" can be analyzed as being
composed of the three morphemes "un-", "expect" and "-ed".
Morphemes can be classified according to whether they are independent morphemes, so-called roots, or whether they can only co-occur attached to other morphemes. These bound morphemes or affixes can be classified according to their position in relation to the root: prefixes precede the root, suffixes follow the root, and infixes
are inserted in the middle of a root. Affixes serve to modify or
elaborate the meaning of the root. Some languages change the meaning of
words by changing the phonological structure of a word, for example, the
English word "run", which in the past tense is "ran". This process is
called ablaut. Furthermore, morphology distinguishes between the process of inflection, which modifies or elaborates on a word, and the process of derivation,
which creates a new word from an existing one. In English, the verb
"sing" has the inflectional forms "singing" and "sung", which are both
verbs, and the derivational form "singer", which is a noun derived from
the verb with the agentive suffix "-er".
Languages differ widely in how much they rely on morphological
processes of word formation. In some languages, for example, Chinese,
there are no morphological processes, and all grammatical information is
encoded syntactically by forming strings of single words. This type of
morpho-syntax is often called isolating,
or analytic, because there is almost a full correspondence between a
single word and a single aspect of meaning. Most languages have words
consisting of several morphemes, but they vary in the degree to which
morphemes are discrete units. In many languages, notably in most
Indo-European languages, single morphemes may have several distinct
meanings that cannot be analyzed into smaller segments. For example, in
Latin, the word bonus, or "good", consists of the root bon-, meaning "good", and the suffix -us, which indicates masculine gender, singular number, and nominative case. These languages are called fusional languages, because several meanings may be fused into a single morpheme. Agglutinative languages are the opposite of fusional languages:
they construct words by stringing morphemes together in chains, but
with each morpheme as a discrete semantic unit. An example of such a
language is Turkish, where, for example, the word evlerinizden, or "from your houses", consists of the morphemes ev-ler-iniz-den with the meanings house-plural-your-from. The languages that rely on morphology to the greatest extent are traditionally called polysynthetic languages. They may express the equivalent of an entire English sentence in a single word. For example, in Persian the single word nafahmidamesh means "I didn't understand it" and consists of the morphemes na-fahm-id-am-esh with the meanings "negation.understand.past.I.it". As a more complex example, the Yupik word tuntussuqatarniksaitengqiggtuq, which means "He had not yet said again that he was going to hunt reindeer", consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq with the meanings "reindeer-hunt-future-say-negation-again-third.person.singular.indicative"; except for tuntu ("reindeer"), none of these morphemes can appear in isolation.
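The agglutinative pattern described above, where each morpheme contributes exactly one meaning, can be sketched in a few lines of Python. The helper name `gloss` and the hyphen-separated input format are assumptions made for this illustration; the segmentation and glosses are the ones given in the text for Turkish evlerinizden.

```python
def gloss(segmented: str, glosses: str, sep: str = "-") -> list[tuple[str, str]]:
    """Pair each morpheme of a segmented word with its gloss (hypothetical helper)."""
    morphemes = segmented.split(sep)
    labels = glosses.split(sep)
    if len(morphemes) != len(labels):
        # In an agglutinative analysis every morpheme gets exactly one label.
        raise ValueError("morpheme and gloss counts differ")
    return list(zip(morphemes, labels))


# Segmentation and glosses as given in the text for "from your houses".
turkish = gloss("ev-ler-iniz-den", "house-plural-your-from")

# Joining the morphemes back together yields the surface word.
surface = "".join(morpheme for morpheme, _ in turkish)

for morpheme, meaning in turkish:
    print(f"{morpheme:>5}  {meaning}")
print(surface)
```

A fusional form such as Latin bon-us would not fit this one-to-one scheme, because the single suffix -us would need three labels (gender, number, and case) at once.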
Many languages use morphology to cross-reference words within a sentence. This is sometimes called agreement.
For example, in many Indo-European languages, adjectives must
cross-reference the noun they modify in terms of number, case, and
gender, so that the Latin adjective bonus, or "good", is
inflected to agree with a noun that is masculine gender, singular
number, and nominative case. In many polysynthetic languages, verbs
cross-reference their subjects and objects. In these types of languages,
a single verb may include information that would require an entire
sentence in English. For example, in the Basque phrase ikusi nauzu, or "you saw me", the past tense auxiliary verb n-au-zu (similar to English "do") agrees with both the subject (you), expressed by the -zu suffix, and the object (me), expressed by the n- prefix. The sentence could be glossed literally as "see me-did-you".
In addition to word classes, a sentence can be analyzed in terms of grammatical functions: in "the cat sat on the mat", "the cat" is the subject of the sentence, "on the mat" is a locative phrase, and "sat" is the core of the predicate.
Another way in which languages convey meaning is through the order of
words within a sentence. The grammatical rules for how to produce new
sentences from words that are already known are called syntax. The
syntactical rules of a language determine why a sentence in English such
as "I love you" is meaningful, but "*love you I" is not.
Syntactical rules determine how word order and sentence structure are
constrained, and how those constraints contribute to meaning.
For example, in English, the two sentences "the slaves were cursing the
master" and "the master was cursing the slaves" mean different things,
because the role of the grammatical subject is encoded by the noun being
in front of the verb, and the role of object is encoded by the noun
appearing after the verb. Conversely, in Latin, both Dominus servos vituperabat and Servos vituperabat dominus mean "the master was reprimanding the slaves", because servos, or "slaves", is in the accusative case, showing that they are the grammatical object of the sentence, and dominus, or "master", is in the nominative case, showing that he is the subject.
Latin uses morphology to express the distinction between subject
and object, whereas English uses word order. Another example of how
syntactic rules contribute to meaning is the rule of inverse word order in questions,
which exists in many languages. This rule explains why, when the English
sentence "John is talking to Lucy" is turned into a question, it
becomes "Who is John talking to?", and not "John is talking to who?".
The latter example may be used as a way of placing special emphasis
on "who", thereby slightly altering the meaning of the question. Syntax
also includes the rules for how complex sentences are structured by
grouping words together in units, called phrases,
that can occupy different places in a larger syntactic structure.
Sentences can be described as consisting of phrases connected in a tree
structure, connecting the phrases to each other at different levels.
The syntactic analysis of the English sentence "the cat sat on the mat"
illustrates this. The sentence is analyzed
as being constituted by a noun phrase, a verb, and a prepositional
phrase; the prepositional phrase is further divided into a preposition
and a noun phrase, and the noun phrases consist of an article and a
noun.
The reason sentences can be seen as being composed of phrases is
because each phrase would be moved around as a single element if
syntactic operations were carried out. For example, "the cat" is one
phrase, and "on the mat" is another, because they would be treated as
single units if a decision was made to emphasize the location by moving
forward the prepositional phrase: "[And] on the mat, the cat sat".
There are many different formalist and functionalist frameworks that
propose theories for describing syntactic structures, based on different
assumptions about what language is and how it should be described. Each
of them would analyze a sentence such as this in a different manner.
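The phrase-structure analysis of "the cat sat on the mat" described above can be sketched as a nested data structure. This is only one possible encoding: the tuple layout, the node labels (S, NP, V, PP, P, Art, N), and the helper names `leaves` and `phrase` are assumptions chosen for the illustration and do not represent any particular syntactic framework.

```python
# Each node is a tuple: (label, child, child, ...); a leaf is (label, word).
tree = (
    "S",
    ("NP", ("Art", "the"), ("N", "cat")),
    ("V", "sat"),
    ("PP", ("P", "on"), ("NP", ("Art", "the"), ("N", "mat"))),
)


def leaves(node):
    """Return the words under a node, in left-to-right order."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def phrase(node):
    """Render a node as a string of words."""
    return " ".join(leaves(node))


# Unpack the three top-level constituents of the sentence.
_, subject_np, verb, location_pp = tree

# Fronting the prepositional phrase moves it as a single unit.
fronted = " ".join(phrase(part) for part in (location_pp, subject_np, verb))

print(phrase(tree))   # the cat sat on the mat
print(fronted)        # on the mat the cat sat
```

Because each phrase is a single subtree, reordering the top-level constituents moves whole phrases at once, which is the behavior the text uses as evidence that phrases are units.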
Languages can be classified in relation to their grammatical types.
Languages that belong to different families nonetheless often have
features in common, and these shared features tend to correlate. For example, languages can be classified on the basis of their basic word order, the relative order of the verb and its constituents in a normal indicative sentence. In English, the basic order is SVO (subject–verb–object): "The snake(S) bit(V) the man(O)", whereas, for example, the corresponding sentence in the Australian language Gamilaraay would be d̪uyugu n̪ama d̪ayn yiːy (snake man bit), SOV.
Word order type is relevant as a typological parameter, because basic
word order type corresponds with other syntactic parameters, such as the
relative order of nouns and adjectives, or of the use of prepositions or postpositions. Such correlations are called implicational universals. For example, most (but not all) languages that are of the SOV type have postpositions rather than prepositions, and have adjectives before nouns.
All languages structure sentences into Subject, Verb, and Object,
but languages differ in the way they classify the relations between
actors and actions. English uses the nominative-accusative
alignment type: in English clauses, the subjects of both
intransitive sentences ("I run") and transitive sentences ("I love you")
are treated in the same way, shown here by the nominative pronoun I. Some languages, called ergative,
Gamilaraay among them, distinguish instead between Agents and Patients.
In ergative languages, the single participant in an intransitive
sentence, such as "I run", is treated the same as the patient in a
transitive sentence, giving the equivalent of "me run". Only in
transitive sentences would the equivalent of the pronoun "I" be used.
In this way the semantic roles can map onto the grammatical relations
in different ways, grouping an intransitive subject either with Agents
(accusative type) or Patients (ergative type) or even marking each of the
three roles differently, which is called the tripartite type.
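The three alignment types just described differ only in which roles share a marking, which a small lookup table can make explicit. In this sketch, S stands for the single participant of an intransitive sentence, A for the agent, and P for the patient of a transitive sentence. The role abbreviations, the case labels, and the function name `grouped_with_s` are assumptions introduced for the illustration rather than terms taken from the text.

```python
# Marking assigned to each role under each alignment type (labels are illustrative).
ALIGNMENTS = {
    "accusative": {"S": "nominative", "A": "nominative", "P": "accusative"},
    "ergative": {"S": "absolutive", "A": "ergative", "P": "absolutive"},
    "tripartite": {"S": "intransitive", "A": "ergative", "P": "accusative"},
}


def grouped_with_s(alignment: str) -> list[str]:
    """Return the transitive roles that are marked like the intransitive subject."""
    marking = ALIGNMENTS[alignment]
    return sorted(role for role in ("A", "P") if marking[role] == marking["S"])


for name in ALIGNMENTS:
    print(name, grouped_with_s(name))
```

Under this encoding the accusative type groups S with A, the ergative type groups S with P, and the tripartite type groups S with neither.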
The shared features of languages which belong to the same
typological class may have arisen completely independently. Their
co-occurrence might be due to universal laws governing the structure of
natural languages, "language universals", or they might be the result of
languages evolving convergent solutions to the recurring communicative
problems that humans use language to solve.
Social contexts of use and transmission
Wall of Love on Montmartre in Paris: "I love you" in 250 languages, by calligraphist Frédéric Baron and artist Claire Kito (2000)
While humans have the ability to learn any language, they only do so
if they grow up in an environment in which language exists and is used
by others. Language is therefore dependent on communities of speakers in which children learn language
from their elders and peers and themselves transmit language to their
own children. Languages are used by those who speak them to communicate and to solve a plethora of social tasks. Many aspects of language use can be seen to be adapted specifically to these purposes.[23]
Owing to the way in which language is transmitted between generations
and within communities, language perpetually changes, diversifying into
new languages or converging due to language contact. The process is similar to the process of evolution, where the process of descent with modification leads to the formation of a phylogenetic tree.
However, languages differ from biological organisms in that they
readily incorporate elements from other languages through the process of
diffusion, as speakers of different languages come into contact. Humans also frequently speak more than one language, acquiring their first language
or languages as children, or learning new languages as they grow up.
Because of the increased language contact in the globalizing world, many
small languages are becoming endangered
as their speakers shift to other languages that afford the possibility
to participate in larger and more influential speech communities.
When studying the way in which words and signs are used, it is often
the case that words have different meanings, depending on the social
context of use. An important example of this is the process called deixis,
which describes the way in which certain words refer to entities
through their relation to the specific point in time and space at which
the word is uttered. Such words include "I" (which
designates the person speaking), "now" (which designates the moment of
speaking), and "here" (which designates the position of speaking). Signs
also change their meanings over time, as the conventions governing
their usage gradually change. The study of how the meaning of linguistic
expressions changes depending on context is called pragmatics. Deixis
is an important part of the way that we use language to point out
entities in the world.
Pragmatics is concerned with the ways in which language use is
patterned and how these patterns contribute to meaning. For example, in
all languages, linguistic expressions can be used not just to transmit
information, but to perform actions. Certain actions are made only
through language, but nonetheless have tangible effects, e.g. the act of
"naming", which creates a new name for some entity, or the act of
"pronouncing someone man and wife", which creates a social contract of
marriage. These types of acts are called speech acts, although they can also be carried out through writing or hand signing.
The form of linguistic expression often does not correspond to
the meaning that it actually has in a social context. For example, if at
a dinner table a person asks, "Can you reach the salt?", that is, in
fact, not a question about the length of the arms of the one being
addressed, but a request to pass the salt across the table. This meaning
is implied by the context in which it is spoken; these kinds of effects
of meaning are called conversational implicatures.
These social rules for which ways of using language are considered
appropriate in certain situations and how utterances are to be
understood in relation to their context vary between communities, and
learning them is a large part of acquiring communicative competence in a language.
All healthy, normally developing
human beings learn to use language. Children acquire the language or
languages used around them: whichever languages they receive sufficient
exposure to during childhood. The development is essentially the same
for children acquiring sign or oral languages.
This learning process is referred to as first-language acquisition,
since unlike many other kinds of learning, it requires no direct
teaching or specialized study. In The Descent of Man, naturalist Charles Darwin called this process "an instinctive tendency to acquire an art".
First language acquisition proceeds in a fairly regular sequence,
though there is a wide degree of variation in the timing of particular
stages among normally developing infants. Studies published in 2013 have
indicated that fetuses are capable of language acquisition to some degree.
From birth, newborns respond more readily to human speech than to other
sounds. Around one month of age, babies appear to be able to
distinguish between different speech sounds. Around six months of age, a child will begin babbling, producing the speech sounds or handshapes of the languages used around them. Words appear around the age of 12 to 18 months; the average vocabulary of an eighteen-month-old child is around 50 words. A child's first utterances are holophrases
(literally "whole-sentences"), utterances that use just one word to
communicate some idea. Several months after a child begins producing
words, he or she will produce two-word utterances, and within a few more
months will begin to produce telegraphic speech, or short sentences that are less grammatically
complex than adult speech, but that do show regular syntactic
structure. From roughly the age of three to five years, a child's
ability to speak or sign is refined to the point that it resembles adult
language.
Acquisition of second and additional languages can come at any
age, through exposure in daily life or courses. Children learning a
second language are more likely to achieve native-like fluency than
adults, but in general, it is very rare for someone speaking a second
language to pass completely for a native speaker. An important
difference between first language acquisition and additional language
acquisition is that the process of additional language acquisition is
influenced by languages that the learner already knows.
Languages, understood as the particular set of speech norms of a
particular community, are also a part of the larger culture of the
community that speaks them. Languages differ not only in pronunciation,
vocabulary, and grammar, but also through having different "cultures of
speaking." Humans use language as a way of signalling identity with one
cultural group as well as difference from others. Even among speakers of
one language, several different ways of using the language exist, and
each is used to signal affiliation with particular subgroups within a
larger culture. Linguists and anthropologists, particularly sociolinguists, ethnolinguists, and linguistic anthropologists have specialized in studying how ways of speaking vary between speech communities.
Linguists use the term "varieties" to refer to the different ways of speaking a language. This term includes geographically or socioculturally defined dialects as well as the jargons or styles of subcultures.
Linguistic anthropologists and sociologists of language define
communicative style as the ways that language is used and understood
within a particular culture.
Because norms for language use are shared by members of a
specific group, communicative style also becomes a way of displaying and
constructing group identity. Linguistic differences may become salient
markers of divisions between social groups, for example, speaking a
language with a particular accent may imply membership of an ethnic
minority or social class, one's area of origin, or status as a second
language speaker. These kinds of differences are not part of the
linguistic system, but are an important part of how people use language
as a social tool for constructing groups.
However, many languages also have grammatical conventions that
signal the social position of the speaker in relation to others through
the use of registers that are related to social hierarchies or
divisions. In many languages, there are stylistic or even grammatical
differences between the ways men and women speak, between age groups, or
between social classes, just as some languages employ different words depending on who is listening. For example, in the Australian language Dyirbal, a married man must use a special set of words to refer to everyday items when speaking in the presence of his mother-in-law. Some cultures have elaborate systems of "social deixis", or systems of signalling social distance through linguistic means.
In English, social deixis is shown mostly through distinguishing
between addressing some people by first name and others by surname, and
in titles such as "Mrs.", "boy", "Doctor", or "Your Honor", but in other
languages, such systems may be highly complex and codified in the
entire grammar and vocabulary of the language. For instance, in
languages of Southeast Asia such as Thai, Burmese, and Javanese,
different words are used according to whether a speaker is addressing
someone of higher or lower rank than oneself in a ranking system with
animals and children ranking the lowest and gods and members of royalty
as the highest.
Throughout history a number of different ways of representing language in graphic media have been invented. These are called writing systems.
The use of writing
has made language even more useful to humans. It makes it possible to
store large amounts of information outside of the human body and
retrieve it again, and it allows communication across physical distances
and timespans that would otherwise be impossible. Many languages
conventionally employ different genres, styles, and registers in written
and spoken language, and in some communities, writing traditionally
takes place in an entirely different language than the one spoken. There
is some evidence that the use of writing also has effects on the
cognitive development of humans, perhaps because acquiring literacy
generally requires explicit and formal education.
The invention of the first writing systems is roughly contemporary with the beginning of the Bronze Age in the late 4th millennium BC. The Sumerian archaic cuneiform script and the Egyptian hieroglyphs
are generally considered to be the earliest writing systems, both
emerging out of their ancestral proto-literate symbol systems from 3400
to 3200 BC with the earliest coherent texts from about 2600 BC. It is
generally agreed that Sumerian writing was an independent invention;
however, it is debated whether Egyptian writing was developed completely
independently of Sumerian, or was a case of cultural diffusion. A similar debate exists for the Chinese script, which developed around 1200 BC. The pre-Columbian Mesoamerican writing systems (including among others Olmec and Maya scripts) are generally believed to have had independent origins.
The first page of the poem Beowulf, written in Old English
in the early medieval period (800–1100 AD). Although Old English is the
direct ancestor of modern English, it is unintelligible to contemporary
English speakers.
All languages change as speakers adopt or invent new ways of speaking
and pass them on to other members of their speech community. Language
change happens at all levels from the phonological level to the levels
of vocabulary, morphology, syntax, and discourse. Even though language
change is often initially evaluated negatively by speakers of the
language who often consider changes to be "decay" or a sign of slipping
norms of language usage, it is natural and inevitable.
Changes may affect specific sounds or the entire phonological system. Sound change can consist of the replacement of one speech sound or phonetic feature
by another, the complete loss of the affected sound, or even the
introduction of a new sound in a place where there had been none. Sound
changes can be conditioned, in which case a sound is changed only
if it occurs in the vicinity of certain other sounds. Sound change is
usually assumed to be regular, which means that it is expected to
apply mechanically whenever its structural conditions are met,
irrespective of any non-phonological factors. On the other hand, sound
changes can sometimes be sporadic, affecting only one particular word or a few words, without any seeming regularity. Sometimes a simple change triggers a chain shift in which the entire phonological system is affected. This happened in the Germanic languages when the sound change known as Grimm's law affected all the stop consonants in the system. The original consonant *bʰ became /b/ in the Germanic languages, the previous *b in turn became /p/, and the previous *p became /f/. The same process applied to all stop consonants and explains why Italic languages such as Latin have p in words like pater and pisces, whereas Germanic languages, like English, have father and fish.
Another example is the Great Vowel Shift
in English, which is the reason that the spelling of English vowels does
not correspond well to their current pronunciation. This is because the
vowel shift brought the already established orthography out of
synchronization with pronunciation. Another source of sound change is
the erosion of words as pronunciation gradually becomes increasingly
indistinct and shortens words, leaving out syllables or sounds. This
kind of change caused Latin mea domina to eventually become the French madame and American English ma'am.
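The chain shift described above for Grimm's law can be illustrated with a short sketch. Only the three labial correspondences named in the text are included, words are passed in as pre-segmented lists of sounds, and Latin pater is used as a stand-in for the older form; all of these are simplifying assumptions, and the function names are hypothetical. The sketch contrasts applying every rule at once with applying the rules one after another.

```python
# The three labial correspondences named in the text (old sound -> new sound).
GRIMM_LABIALS = {"bʰ": "b", "b": "p", "p": "f"}


def shift(segments):
    """Apply all correspondences at once, so each sound moves exactly one step."""
    return [GRIMM_LABIALS.get(sound, sound) for sound in segments]


def shift_sequentially(segments):
    """Apply the rules one after another; earlier outputs feed later rules."""
    result = list(segments)
    for old, new in GRIMM_LABIALS.items():
        result = [new if sound == old else sound for sound in result]
    return result


print(shift(["p", "a", "t", "e", "r"]))   # ['f', 'a', 't', 'e', 'r']
print(shift(["bʰ"]))                      # ['b']
print(shift_sequentially(["bʰ"]))         # ['f']
```

In the simultaneous version the three sounds stay distinct after the shift, as the text describes. Applying the same rules one after another in this order would instead collapse all three into /f/, which is one way to see why the change is described as affecting the whole system rather than as a series of unrelated replacements.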
Change also happens in the grammar of languages as discourse patterns such as idioms or particular constructions become grammaticalized.
This frequently happens when words or morphemes erode and the
grammatical system is unconsciously rearranged to compensate for the
lost element. For example, in some varieties of Caribbean Spanish the final /s/ has eroded away. Since Standard Spanish uses final /s/ in the morpheme marking the second person subject "you" in verbs, the Caribbean varieties now have to express the second person using the pronoun tú. This means that the sentence "what's your name" is ¿cómo te llamas? [ˈkomo te ˈjamas] in Standard Spanish, but [ˈkomo ˈtu te ˈjama] in Caribbean Spanish. The simple sound change has affected both morphology and syntax.
Another common cause of grammatical change is the gradual petrification
of idioms into new grammatical forms, for example, the way the English
"going to" construction lost its aspect of movement and in some
varieties of English has almost become a full-fledged future tense (e.g.
I'm gonna).
Language change may be motivated by "language internal" factors,
such as changes in pronunciation motivated by certain sounds being
difficult to distinguish aurally or to produce, or through patterns of
change that cause some rare types of constructions to drift towards more common types.
Other causes of language change are social, such as when certain
pronunciations become emblematic of membership in certain groups, such
as social classes, or with ideologies,
and therefore are adopted by those who wish to identify with those
groups or ideas. In this way, issues of identity and politics can have
profound effects on language structure.
One important source of language change is contact and resulting diffusion of linguistic traits between languages. Language contact occurs when speakers of two or more languages or varieties interact on a regular basis. Multilingualism is likely to have been the norm throughout human history and most people in the modern world are multilingual. Before the rise of the concept of the ethno-national state,
monolingualism was characteristic mainly of populations inhabiting
small islands. But with the ideology that made one people, one state,
and one language the most desirable political arrangement,
monolingualism started to spread throughout the world. Nonetheless,
there are only 250 countries in the world corresponding to some 6000
languages, which means that most countries are multilingual and most
languages therefore exist in close contact with other languages.
When speakers of different languages interact closely, it is
typical for their languages to influence each other. Through sustained
language contact over long periods, linguistic traits diffuse between
languages, and languages belonging to different families may converge to
become more similar. In areas where many languages are in close
contact, this may lead to the formation of language areas
in which unrelated languages share a number of linguistic features. A
number of such language areas have been documented, among them, the Balkan language area, the Mesoamerican language area, and the Ethiopian language area. Also, larger areas such as South Asia, Europe, and Southeast Asia have sometimes been considered language areas, because of widespread diffusion of specific areal features.
Language contact may also lead to a variety of other linguistic phenomena, including language convergence, borrowing, and relexification
(replacement of much of the native vocabulary with that of another
language). In situations of extreme and sustained language contact, it
may lead to the formation of new mixed languages that cannot be considered to belong to a single language family. One type of mixed language called pidgins
occurs when adult speakers of two different languages interact on a
regular basis, but in a situation where neither group learns to speak
the language of the other group fluently. In such a case, they will
often construct a communication form that has traits of both languages,
but which has a simplified grammatical and phonological structure. The
language comes to contain mostly the grammatical and phonological
categories that exist in both languages. Pidgin languages are defined by
not having any native speakers, but only being spoken by people who
have another language as their first language. But if a pidgin language
becomes the main language of a speech community, then eventually
children will grow up learning the pidgin as their first language. As
the generation of child learners grows up, the pidgin will often be seen
to change its structure and acquire a greater degree of complexity. This
type of language is generally called a creole language. An example of such mixed languages is Tok Pisin, an official language of Papua New Guinea, which originally arose as a pidgin based on English and Austronesian languages; others are Kreyòl ayisyen, the French-based creole language spoken in Haiti, and Michif, a mixed language of Canada, based on the Native American language Cree and French.
SIL Ethnologue
defines a "living language" as "one that has at least one speaker for
whom it is their first language". The exact number of known living
languages varies from 6,000 to 7,000, depending on the precision of
one's definition of "language", and in particular, on how one defines
the distinction between a "language" and a "dialect". As of 2016, Ethnologue cataloged 7,097 living human languages. Ethnologue establishes linguistic groups based on studies of mutual intelligibility, and therefore often includes more categories than more conservative classifications. For example, Danish, which most scholars consider a single language with several dialects, is classified by Ethnologue as two distinct languages (Danish and Jutish).
According to the Ethnologue, 389 languages (nearly 6%)
have more than a million speakers. These languages together account for
94% of the world's population, whereas 94% of the world's languages
account for the remaining 6% of the global population.
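The proportions above can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes that the 389 figure and the 2016 total of 7,097 living languages can be paired, although the two numbers may come from different Ethnologue editions, and the names used are illustrative.

```python
# Figures quoted in the text. Pairing them is an assumption: they may
# come from different Ethnologue editions.
LANGUAGES_TOTAL = 7097    # living languages cataloged as of 2016
LANGUAGES_OVER_1M = 389   # languages with more than a million speakers


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of languages with more than a million speakers.
large_share = share_percent(LANGUAGES_OVER_1M, LANGUAGES_TOTAL)

# Everything else: the many languages with smaller speaker populations.
small_count = LANGUAGES_TOTAL - LANGUAGES_OVER_1M
small_share = share_percent(small_count, LANGUAGES_TOTAL)

print(f"Languages with over a million speakers: {large_share:.1f}%")
print(f"Remaining languages: {small_count} ({small_share:.1f}%)")
```

Under that assumption the share comes out at roughly 5.5%, which the text rounds to "nearly 6%", leaving about 94% of languages to the remaining small share of the world's population.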
There is no clear distinction between a language and a dialect, notwithstanding a famous aphorism attributed to linguist Max Weinreich that "a language is a dialect with an army and navy".
For example, national boundaries frequently override linguistic
difference in determining whether two linguistic varieties are languages
or dialects. Hakka, Cantonese and Mandarin are, for example, often classified as "dialects" of Chinese, even though they are more different from each other than Swedish is from Norwegian. Before the Yugoslav civil war, Serbo-Croatian was generally considered a single language with two normative variants, but for sociopolitical reasons, Croatian and Serbian
are now often treated as separate languages and employ different
writing systems. In other words, the distinction may hinge on political
considerations as much as on cultural differences, distinctive writing systems, or degree of mutual intelligibility.
Principal language families of the world (and in some cases geographic groups of families). For greater detail, see Distribution of languages in the world.
The world's languages can be grouped into language families
consisting of languages that can be shown to have common ancestry.
Linguists recognize many hundreds of language families, although some of
them can possibly be grouped into larger units as more evidence becomes
available and in-depth studies are carried out. At present, there are
also dozens of language isolates: languages that cannot be shown to be related to any other languages in the world. Among them are Basque, spoken in Europe, Zuni of New Mexico, Purépecha of Mexico, Ainu of Japan, Burushaski of Pakistan, and many others.
The language family with the most speakers is Indo-European, whose languages are spoken by 46% of the world's population. This family includes major world languages like English, Spanish, French, German, Russian, and Hindustani (Hindi/Urdu). The Indo-European family achieved prevalence first during the Eurasian Migration Period (c. 400–800 AD), and subsequently through the European colonial expansion, which brought the Indo-European languages to a politically and often numerically dominant position in the Americas and much of Africa. The Sino-Tibetan languages are spoken by 20% of the world's population and include many of the languages of East Asia, including Hakka, Mandarin Chinese, Cantonese, and hundreds of smaller languages.
The areas of the world in which there is the greatest linguistic diversity, such as the Americas, Papua New Guinea, West Africa,
and South Asia, contain hundreds of small language families. These
areas together account for the majority of the world's languages, though
not the majority of speakers. In the Americas, some of the largest
language families include the Quechumaran, Arawak, and Tupi-Guarani families of South America, the Uto-Aztecan, Oto-Manguean, and Mayan of Mesoamerica, and the Na-Dene, Iroquoian, and Algonquian language families of North America. In Australia, most indigenous languages belong to the Pama-Nyungan family, whereas New Guinea is home to a large number of small families and isolates, as well as a number of Austronesian languages.
Map caption: together, the eight countries shown in red contain more than 50% of the world's languages. The areas shown in blue are the most linguistically diverse in the world, and the locations of most of the world's endangered languages.
Language endangerment occurs when a language is at risk of falling out of use as its speakers die out or shift to speaking another language. Language loss occurs when the language has no more native speakers, and becomes a dead language. If eventually no one speaks the language at all, it becomes an extinct language.
While languages have always gone extinct throughout human history, they
have been disappearing at an accelerated rate in the 20th and 21st
centuries due to the processes of globalization and neo-colonialism, where the economically powerful languages dominate other languages.
The more commonly spoken languages dominate the less commonly
spoken languages, so the less commonly spoken languages eventually
disappear from populations. Of the 6,000 to 7,000 languages spoken as of 2010, between 50% and 90% are expected to have become extinct by the year 2100. The top 20 languages,
those spoken by more than 50 million speakers each, are spoken by 50%
of the world's population, whereas many of the other languages are
spoken by small communities, most of them with fewer than 10,000
speakers.
The United Nations Educational, Scientific and Cultural Organization
(UNESCO) operates with five levels of language endangerment: "safe",
"vulnerable" (not spoken by children outside the home), "definitely
endangered" (not spoken by children), "severely endangered" (only spoken
by the oldest generations), and "critically endangered" (spoken by few
members of the oldest generation, often semi-speakers). Notwithstanding claims that the world would be better off if most people adopted a single common lingua franca, such as English or Esperanto,
there is a consensus that the loss of languages harms the cultural
diversity of the world. It is a common belief, going back to the
biblical narrative of the Tower of Babel in the Old Testament, that linguistic diversity causes political conflict,
but this is contradicted by the fact that many of the world's major
episodes of violence have taken place in situations with low linguistic
diversity, such as the Yugoslav Wars, the American Civil War, or the Rwandan genocide, whereas many of the most stable political units have been highly multilingual.
Many projects aim to prevent or slow this loss by revitalizing endangered languages and promoting education and literacy in minority languages. Across the world, many countries have enacted specific legislation to protect and stabilize the language of indigenous speech communities.
A minority of linguists have argued that language loss is a natural
process that should not be counteracted, and that documenting endangered
languages for posterity is sufficient.