Nation, I. S. P. (2001) - Learning Vocabulary in Another Language. Cambridge University Press.
Learning Vocabulary in Another Language
I. S. P. Nation
Victoria University of Wellington
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011–4211, USA
10 Stamford Road, Oakleigh, VIC 3166, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org
A catalogue record for this book is available from the British Library
Contents
Introduction
Learning goals
The four strands
Main themes
The audience for this book
1 The goals of vocabulary learning
How much vocabulary do learners need to know?
How many words are there in the language?
How many words do native speakers know?
How much vocabulary do you need to use another language?
High-frequency words
Specialised vocabulary
Low-frequency words
Testing vocabulary knowledge
2 Knowing a word
Learning burden
The receptive/productive distinction
The scope of the receptive/productive distinction
Experimental comparisons of receptive and productive vocabulary
Aspects of knowing a word
Levelt’s process model of language use
Spoken form
Written form
Word parts
Connecting form and meaning
Concept and referents
Associations
Grammatical functions
Collocations
Constraints on use
Item knowledge and system knowledge
3 Teaching and explaining vocabulary
Learning from teaching and learning activities
Vocabulary in classrooms
Repetition and learning
Communicating meaning
Helping learners comprehend and learn from definitions
Spending time on words
Rich instruction
Arguments against rich instruction
Providing rich instruction
Spoken form
Written form
Word parts
Strengthening the form–meaning connection
Concept and referents
Associations
Grammar
Collocation
Constraints on use
Vocabulary teaching procedures
Computer-assisted vocabulary learning
Using concordances
Research on CAVL
4 Vocabulary and listening and speaking
What vocabulary knowledge is needed for listening?
Providing vocabulary support for listening
Learning vocabulary from listening to stories
Learning vocabulary through negotiation
The vocabulary of speaking
Developing fluency with spoken vocabulary
Using teacher input to increase vocabulary knowledge
Using labelled diagrams
Using cooperative tasks to focus on vocabulary
How can a teacher design activities to help incidental vocabulary learning?
Designing and adapting activities
The goals of vocabulary learning
Tokens
One way is simply to count every word form in a spoken or written
text and if the same word form occurs more than once, then each
occurrence of it is counted. So the sentence ‘It is not easy to say it
correctly’ would contain eight words, even though two of them are the
same word form, it. Words which are counted in this way are called
‘tokens’, and sometimes ‘running words’. If we try to answer
questions like ‘How many words are there on a page or in a line?’ ‘How
long is this book?’ ‘How fast can you read?’ ‘How many words does
the average person speak per minute?’ then our unit of counting will
be the token.
Types
We can count the words in the sentence ‘It is not easy to say it
correctly’ another way. If we see the same word again, we do not count it
again. So the sentence of eight tokens consists of seven different words
or ‘types’. We count words in this way if we want to answer questions
like ‘How large was Shakespeare’s vocabulary?’ ‘How many words do
you need to know to read this book?’ ‘How many words does this
dictionary contain?’
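The difference between the two counts can be checked with a short Python sketch. The `tokenize` helper is our own illustration, not part of the original discussion, and it assumes that upper and lower case versions of a form (It, it) count as the same word form, as the example sentence implies.

```python
import re


def tokenize(text):
    """Return the lower-cased word forms in text, in order of occurrence."""
    # Assumption: a word form is a run of letters, optionally joined by an
    # internal apostrophe or hyphen (e.g. n't forms, hyphenated compounds).
    return re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())


sentence = "It is not easy to say it correctly"

tokens = tokenize(sentence)   # every occurrence is counted
types = set(tokens)           # each different word form is counted once

print(len(tokens))  # 8 tokens
print(len(types))   # 7 types
```

Counting tokens answers questions about the length of a text; counting types answers questions about how many different words it contains.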
Lemmas
A lemma consists of a headword and some of its inflected and reduced
(n’t) forms. Usually, all the items included under a lemma are the same
part of speech (Francis and Kučera, 1982: 461). The English
inflections consist of plural, third person singular present tense, past tense,
past participle, -ing, comparative, superlative and possessive (Bauer
and Nation, 1993). The Thorndike and Lorge (1944) frequency count
used lemmas as the basis for counting, and the more recent
computerised count on the Brown Corpus (Francis and Kučera, 1982) has
produced a lemmatised list. In the Brown count the comparative and
superlative forms are not included in the lemma, and uses of the same
form as different parts of speech (walk as a noun, walk as a verb) are
not in the same lemma. Variant spellings (favor, favour) are usually
included as part of the same lemma when they are the same part of
speech.
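Counting by lemmas can be sketched in the same way. The small lookup table below is hypothetical and hand-made for illustration. It assumes that each word form has already been tagged for part of speech, and it arbitrarily takes favour as the headword for the two spellings.

```python
from collections import Counter

# Hypothetical mini-lexicon: (word form, part of speech) -> headword.
HEADWORDS = {
    ("walk", "verb"): "walk",
    ("walks", "verb"): "walk",
    ("walked", "verb"): "walk",
    ("walking", "verb"): "walk",
    ("walk", "noun"): "walk",
    ("walks", "noun"): "walk",
    ("favor", "noun"): "favour",   # variant spelling, same part of speech
    ("favour", "noun"): "favour",
}


def lemma_of(form, pos):
    """Return the lemma as a (headword, part of speech) pair."""
    headword = HEADWORDS.get((form.lower(), pos), form.lower())
    return (headword, pos)


# Six tagged tokens (the tagging itself is assumed, not computed here).
tagged = [("walked", "verb"), ("walks", "verb"), ("walk", "noun"),
          ("walks", "noun"), ("favor", "noun"), ("favour", "noun")]

lemma_counts = Counter(lemma_of(form, pos) for form, pos in tagged)

print(len(tagged))        # 6 tokens
print(len(lemma_counts))  # 3 lemmas: walk (verb), walk (noun), favour (noun)
```

Because the part of speech is part of the key, walk as a noun and walk as a verb stay in separate lemmas, while the two spellings of favour fall together.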
Lying behind the use of lemmas as the unit of counting is the idea of
learning burden (Swenson and West, 1934). The learning burden of an
item is the amount of effort required to learn it. Once learners can use
the inflectional system, the learning burden of for example mends, if
Word families
Lemmas are a step in the right direction when trying to represent
learning burden in the counting of words. However, there are clearly
other affixes which are used systematically and which greatly reduce
the learning burden of derived words containing known base forms.
These include affixes like -ly, -ness and un-. A word family consists
of a headword, its inflected forms, and its closely related derived
forms.
The major problem in counting using word families as the unit is to
decide what should be included in a word family and what should not.
Learners’ knowledge of the prefixes and suffixes develops as they gain
more experience of the language. What might be a sensible word
family for one learner may be beyond another learner’s present level of
proficiency. This means that it is usually necessary to set up a scale of
word families, starting with the most elementary and transparent
members and moving on to less obvious possibilities.
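The idea of a scale of word families can be illustrated with a small Python sketch. The family for happy and the three levels below are invented for illustration only and do not reproduce any published scale: level 1 holds the headword and its inflections, level 2 adds the affixes -ly, -ness and un-, and level 3 adds forms that combine two of those affixes.

```python
# Hypothetical family for the headword "happy". Each member is tagged with
# the (illustrative) level at which it would be admitted to the family.
FAMILY = {
    "happy": 1, "happier": 1, "happiest": 1,
    "happily": 2, "happiness": 2, "unhappy": 2,
    "unhappily": 3, "unhappiness": 3,
}


def family_members(family, max_level):
    """Return the members admitted to the family up to max_level, sorted."""
    return sorted(word for word, level in family.items() if level <= max_level)


for level in (1, 2, 3):
    members = family_members(FAMILY, level)
    print(level, len(members), members)
```

The same headword thus yields a family of three, six or eight members depending on where the learner stands on the scale, which is why the choice of level changes any count based on word families.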
teenth century are often wildly incorrect. We will look at the reasons
for this later in this book.
Recent reliable studies (Goulden, Nation and Read, 1990;
Zechmeister, Chronis, Cull, D’Anna and Healy, 1995) suggest that
educated native speakers of English know around 20,000 word
families. These estimates are rather low because the counting unit is the
word family, which has several derived members, and because proper nouns
are not included in the count. A very rough rule of thumb would be
that, for each year of their early life, native speakers add on average
1,000 word families to their vocabulary. These goals are
manageable for non-native speakers of English, especially those learning
English as a second rather than foreign language, but they are way
beyond what most learners of English as another language can
realistically hope to achieve.
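Sustained-yield management ought to be long-term government policy in
indigenous forests zoned for production. The adoption of such a policy
would represent a breakthrough – the boundary between a pioneering,
extractive phase and an era in which the timber industry adjusted to
living with the forests in perpetuity. A forest sustained is a forest in
which harvesting and mortality combined do not exceed regeneration.
Naturally enough, faster-growing forests produce more timber, which is
why attention would tend to swing from podocarps to beech forests
regardless of the state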
of the podocarp resource. The colonists cannot be blamed for plunging in
without thought to whether the resource had limits. They brought from
Britain little experience or understanding of how to maintain forest
structure and a timber supply for all time. Under German management it
might have been different here. The Germans have practised the sustained
approach since the seventeenth century when they faced a timber shortage as
a result of a series of wars. In New Zealand in the latter part of the twentieth
century, an anticipated shortage of the most valuable native timber, rimu,
prompts a similar response – no more contraction of the indigenous forest
and a balancing of yield with increment in selected areas.
This is not to say the idea is being aired here for the first time. Over a
century ago the first Conservator of Forests proposed sustained harvesting.
He was cried down. There were far too many trees left to bother about it.
And yet in the pastoral context the dangers of overgrazing were appreciated
early in the piece. New Zealand geography students are taught to this day
how overgrazing causes the degradation of the soil and hillsides to slide
away, and that with them can go the viability of hill-country sheep and cattle
farming. That a forest could be overgrazed as easily was not widely accepted
until much later – so late, in fact, that the counter to it, sustained-yield
management, would be forced upon the industry and come as a shock to it.
It is a simple enough concept on paper: balance harvest with growth and
you have a natural renewable resource; forest products forever. Plus the
social and economic benefits of regular work and income, a regular timber
supply and relatively stable markets. Plus the environmental benefits that
accrue from minimising the impact on soil and water qualities and wildlife.
In practice, however, sustainability depends on how well the dynamics of
the forest are understood. And these vary from area to area according to
forest make-up, soil profile, altitude, climate and factors which forest science
may yet discover. Ecology is deep-felt.
We can distinguish four kinds of vocabulary in the text:
high-frequency words (unmarked in the text), academic words (in bold),
and technical and low-frequency words (in italics).
High-frequency words
In the example text, these words are not marked at all and include
function words: in, for, the, of, a, etc. Appendix 6 contains a complete
list of function words. The high-frequency words also include many
content words: government, forests, production, adoption, represent,
boundary. The classic list of high-frequency words is Michael West’s
(1953a) A General Service List of English Words which contains
around 2,000 word families. Almost 80% of the running words in the
text are high-frequency words.
Academic words
The text is from an academic textbook and contains many words that
are common in different kinds of academic texts: policy, phase,
adjusted, sustained. Typically these words make up about 9% of the
running words in the text. The best list of these is the Academic Word
List (Coxhead, 1998). Appendix 1 contains the 570 headwords of this
list. This small list of words is very important for anyone using English
for academic purposes (see chapter 6).
Technical words
The text contains some words that are very closely related to the topic
and subject area of the text. These words include indigenous,
regeneration, podocarp, beech, rimu (a New Zealand tree) and timber. These
words are reasonably common in this topic area but not so common
elsewhere. As soon as we see them we know what topic is being dealt
with. Technical words like these typically cover about 5% of the
running words in a text. They differ from subject area to subject area.
If we look at technical dictionaries, such as dictionaries of economics,
geography or electronics, we usually find about 1,000 entries in each
dictionary.
Low-frequency words
The fourth group is the low-frequency words. Here, this group
includes words like zoned, pioneering, perpetuity, aired and pastoral.
They make up over 5% of the words in an academic text. There are
thousands of them in the language, by far the biggest group of words.
They include all the words that are not high-frequency words, not
academic words and not technical words for a particular subject. They
consist of technical words for other subject areas, proper nouns,
words that almost got into the high-frequency list, and words that we
rarely meet in our use of the language.
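The coverage figures for the four kinds of vocabulary are percentages of running words. A minimal Python sketch of that calculation follows. The word lists are hypothetical miniatures containing only words that this section assigns to each group, and the ten tokens are picked from the example text for illustration, so the resulting percentages are not those of the full text.

```python
def coverage(tokens, word_lists, remainder="low-frequency"):
    """Return the percentage of tokens covered by each word list.

    Each token is credited to the first list that contains it; tokens found
    in no list are counted under `remainder`.
    """
    counts = {name: 0 for name in word_lists}
    counts[remainder] = 0
    for token in tokens:
        for name, words in word_lists.items():
            if token in words:
                counts[name] += 1
                break
        else:
            counts[remainder] += 1
    return {name: 100 * n / len(tokens) for name, n in counts.items()}


# Hypothetical miniature word lists (illustration only).
word_lists = {
    "high-frequency": {"the", "of", "a", "for", "in", "adoption", "forests"},
    "academic": {"policy"},
    "technical": {"indigenous"},
}

# Ten tokens drawn from the example text (not a continuous sentence).
tokens = ["the", "adoption", "of", "a", "policy", "for",
          "indigenous", "forests", "in", "zoned"]

result = coverage(tokens, word_lists)
print(result)
```

Treating the low-frequency words as whatever is left over mirrors the definition above: they are all the words that are not in the high-frequency, academic or technical lists.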
Let us now look at a longer text and a large collection of texts.
Sutarsyah, Nation and Kennedy (1994) looked at a single
economics textbook to see what vocabulary would be needed to read the text.
The textbook was 295,294 words long. Table 1.2 shows the results.
The academic word list used in the study was the University Word List
(Xue and Nation, 1984).
What should be clear from this example and from the text looked at
earlier is that a reasonably small number of words covers a lot of text.
High-frequency words
There is a small group of high-frequency words which are very
important because these words cover a very large proportion of the running
words in spoken and written texts and occur in all kinds of uses of the
language.
[Figure: the example text set beside labelled vocabulary levels. High-frequency vocabulary: 2000 words; 80% or more text coverage; a, equal, places, behaves, educate. Academic vocabulary. Technical vocabulary. Low-frequency vocabulary. The text shown in the figure follows.]

Sustained-yield management ought to be long-term government policy in
indigenous forests zoned for production. The adoption of such a policy
would represent a breakthrough – the boundary between a pioneering,
extractive phase and an era in which the timber industry adjusted to
living with the forests in perpetuity. A forest sustained is a forest in
which harvesting and mortality combined do not exceed regeneration.
Naturally enough, faster-growing forests produce more timber, which is
why attention would tend to swing from podocarps to beech forests
regardless of the state of the podocarp resource. The colonists cannot be
blamed for plunging in without thought to whether the resource had limits.
They brought from Britain little experience or understanding of how to
maintain forest structure and a timber supply for all time. Under German
management it might have been different here. The Germans have practised
the sustained approach since the seventeenth century when they faced a
timber shortage as a result of a series of wars. In New Zealand in the
latter part of the twentieth century, an anticipated shortage of the most
valuable native timber, rimu, prompts a similar response – no more
contraction of the indigenous forest and a balancing of yield with
increment in selected areas.

This is not to say the idea is being aired here for the first time. Over a
century ago the first Conservator of Forests proposed sustained harvesting.
He was cried down. There were far too many trees left to bother about it.
And yet in the pastoral context the dangers of overgrazing were appreciated
early in the piece. New Zealand geography students are taught to this day
how overgrazing causes the degradation of the soil and hillsides to slide
away, and that with them can go the viability of hill-country sheep and
cattle farming. That a forest could be overgrazed as easily was not widely
accepted until much later – so late, in fact, that the counter to it,
sustained-yield management, would be forced upon the industry and come as
a shock to it.
How large is this group of words? The usual way of deciding how
many words should be considered as high-frequency words is to look
at the text coverage provided by successive frequency-ranked groups
of words. The teacher or course designer then has to decide where the
coverage gained by spending teaching time on these words is no longer
worthwhile. Table 1.5 shows coverage figures for each successive
1,000 lemmas from the Brown Corpus – a collection of various
2,000-word texts of American English totalling just over one million tokens.
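The procedure of looking at the coverage of successive frequency-ranked groups can be sketched in Python. This is an illustration under simplifying assumptions: the function counts word forms rather than lemmas, the band size is a parameter (1,000 in the Brown Corpus figures, 2 in the toy run below), and the ten-token corpus is invented.

```python
from collections import Counter


def band_coverage(tokens, band_size):
    """Return (band number, % coverage, cumulative % coverage) for
    successive frequency-ranked bands of band_size words."""
    ranked = [count for _, count in Counter(tokens).most_common()]
    total = len(tokens)
    rows = []
    cumulative = 0
    for start in range(0, len(ranked), band_size):
        band_tokens = sum(ranked[start:start + band_size])
        cumulative += band_tokens
        rows.append((start // band_size + 1,
                     100 * band_tokens / total,
                     100 * cumulative / total))
    return rows


# Invented toy corpus: 10 tokens, 5 different word forms.
corpus = ["the"] * 4 + ["of"] * 2 + ["forest"] * 2 + ["timber", "rimu"]

rows = band_coverage(corpus, band_size=2)
for band, pct, cum in rows:
    print(band, pct, cum)
```

Even in this toy run the first band covers most of the text and each later band adds less, which is the pattern a teacher or course designer looks for when deciding where further teaching time stops being worthwhile.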
Usually the 2,000-word level has been set as the most suitable
limit for high-frequency words. Nation and Hwang (1995) present
Table 1.7. Text type and text coverage by the most frequent 2000
words of English and an academic word list in four different kinds of
texts
Specialised vocabulary
It is possible to make specialised vocabularies which provide good
coverage for certain kinds of texts. These are a way of extending the
high-frequency words for special purposes.
What special vocabularies are there? Special vocabularies are made
by systematically restricting the range of topics or language uses
investigated. It is thus possible to have special vocabularies for speaking,
for reading academic texts, for reading newspapers, for reading
children’s stories, or for letter writing. Technical vocabularies are also
specialised vocabularies. Some specialised vocabularies are made by
doing frequency counts using a specialised corpus, others are made by
experts in the field gathering what they consider to be relevant
vocabulary.
There is a very important specialised vocabulary for second
language learners intending to do academic study in English. This is the
Academic Word List (Coxhead, 1998; see appendix 1). It consists of
570 word families that are not in the most frequent 2,000 words of
English but which occur reasonably frequently over a very wide range
of academic texts; the list is not restricted to a specific discipline. That
means that the words are useful for learners studying humanities, law,
science or commerce. Academic vocabulary has sometimes been called
sub-technical vocabulary because it does not contain technical words
but rather formal vocabulary.
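The general logic of such a list (words outside the high-frequency list that occur reasonably often across a wide range of subject areas) can be sketched in Python. The thresholds, the miniature sub-corpora and the function name below are all hypothetical; they are not the criteria or data used to make the Academic Word List.

```python
from collections import Counter


def specialised_list(subcorpora, high_frequency, min_range, min_total):
    """Return words outside the high-frequency list that occur in at least
    min_range sub-corpora and at least min_total times overall."""
    totals = Counter()   # overall frequency of each candidate word
    ranges = Counter()   # number of sub-corpora each candidate occurs in
    for tokens in subcorpora.values():
        counts = Counter(t for t in tokens if t not in high_frequency)
        totals.update(counts)
        ranges.update(counts.keys())
    return sorted(word for word in totals
                  if ranges[word] >= min_range and totals[word] >= min_total)


# Invented miniature sub-corpora for four subject areas.
subcorpora = {
    "humanities": ["the", "policy", "poem"],
    "law": ["the", "policy", "court", "phase"],
    "science": ["the", "phase", "policy", "enzyme"],
    "commerce": ["policy", "the", "market", "phase"],
}
high_frequency = {"the", "market"}

academic = specialised_list(subcorpora, high_frequency,
                            min_range=3, min_total=3)
print(academic)  # ['phase', 'policy']
```

Words such as court or enzyme are frequent only within one subject area, so they fail the range requirement and belong to a technical vocabulary rather than to a general academic one.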
The importance of this vocabulary can be seen in the coverage it
provides for various kinds of texts (Table 1.7).
Adding the academic vocabulary from the UWL to the
high-frequency words changes the coverage of academic text from 78.1%
to 86.6%. Expressed another way, with a vocabulary of 2,000 words,
approximately one word in every five will be unknown. With a
vocabulary of 2,000 words plus the Academic Word List, approximately
Low-frequency words
There is a very large group of words that occur very infrequently and
cover only a small proportion of any text.
What kinds of words are they?
1. Some low-frequency words are words of moderate frequency that
did not manage to get into the high-frequency list. It is important
to remember that the boundary between high-frequency and
low-frequency vocabulary is an arbitrary one. Any of several thousand
low-frequency words could be candidates for inclusion within the
high-frequency list simply because their position on a rank
frequency list which takes account of range is dependent on the
nature of the corpus the list is based on. A different corpus would
lead to a different ranking particularly among words on the
boundary. This, however, should not be seen as a justification for
large amounts of teaching time being spent on low-frequency
words at the third or fourth thousand word level. Here are some
words that in the Brown Corpus fall just outside the
high-frequency boundary: curious, wing, arm (vb), gate,
approximately.
2. Many low-frequency words are proper names. Approximately
4% of the running words in the Brown Corpus are words like
Carl, Johnson and Ohio. In some texts, such as novels and
newspapers, proper nouns are like technical words – they are of
high frequency in particular texts but not in others, their meaning
is closely related to the message of the text, and they could not be