Word
A cross-linguistic typology
Edited by
R. M. W. Dixon
and
Alexandra Y. Aikhenvald
Research Centre for Linguistic Typology, La Trobe University
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press
The Edinburgh Building, Cambridge, United Kingdom
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521818995
© Cambridge University Press 2002
This book is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.
First published in print format 2003
ISBN-13 978-0-511-06149-3 eBook (NetLibrary)
ISBN-10 0-511-06149-8 eBook (NetLibrary)
ISBN-13 978-0-521-81899-5 hardback
ISBN-10 0-521-81899-0 hardback
Contents
List of contributors
Preface
List of abbreviations
1 Word: a typological framework
R. M. W. Dixon and Alexandra Y. Aikhenvald
  1 The tradition
  2 Doing without word
  3 What is a word?
  4 Confusions
  5 Some suggested criteria
  6 Phonological word
  7 Grammatical word
  8 Clitics
  9 Relationship between grammatical and phonological words
  10 Do all languages have words?
  11 The social status of words
  12 Summary
  Appendix Sample outline account of phonological word and grammatical word in Fijian
  References
Introduction
Phonological word
Clause structure and verbal clauses
Predicate structure
Grammatical word and predicate general discussion
The grammatical word in Jarawara
Summary of instances where the two kinds of word do not correspond
Appendix List of miscellaneous, tense-modal and mood suffixes
References
Introduction
Brief summary of typology
Phonological word
Criteria for grammatical word
Clitics
Relationship between phonological and grammatical word
Complex predicates
Conclusion
References
A thumbnail typology
The word for 'word' in Cup'ik
Grammatical word
Enclitics in grammar
Phonological word
Summary of grammatical-word/phonological-word mismatches
Conclusion
References
Typology
Grammatical words
Phonological words
Enclitics
Definitional problems
Phonological word
Grammatical word
Clitics
Relationship between grammatical and phonological word
Appendices
References
Introduction
Some important preliminaries
Various types of word and relevant criteria and tests
Summation regarding wordhood
Conclusion
References
Introduction
The word as established in Latin
How can general linguists help?
Clitics
Are units like words necessary?
References
Index of authors
Index of languages and language families
Index of subjects
Contributors
Alexandra Y. Aikhenvald
Research Centre for Linguistic
Typology
La Trobe University
Victoria, 3086
Australia
e-mail: a.aikhenvald@latrobe.edu.au
John Boyle
Dept of Linguistics
University of Chicago
Chicago, IL 60637
USA
e-mail: jpboyle@midway.uchicago.edu
R. M. W. Dixon
Research Centre for Linguistic
Typology
La Trobe University
Victoria, 3086
Australia
no e-mail
Randolph Graczyk
PO Box 29
Pryor, MT, 59066
USA
e-mail: rgraczyk@acl.com
Alice C. Harris
Department of Linguistics
Knut J. Olawsky
Research Centre for Linguistic
Typology
La Trobe University
Victoria, 3086
Australia
e-mail: olawsky@latrobe.edu.au
Robert Rankin
Dept of Linguistics
University of Kansas
Lawrence, KS 66045-2140
USA
e-mail: rankin@ukans.edu
Anthony C. Woodbury
Dept of Linguistics
Calhoun Hall 501
University of Texas
Austin, TX 78712
USA
e-mail: acw@mail.utexas.edu
Ulrike Zeshan
Research Centre for Linguistic
Typology
La Trobe University
Victoria, 3086
Australia
e-mail: u.zeshan@latrobe.edu.au
Preface
Abbreviations
1               first person
2               second person
3               third person
A               transitive subject function
ABL             ablative
ABS             absolutive
ACC             accusative
ACT             active
ACTR            actor, subject of active verb
ADJ             adjective
ADV             adverbial case
AG              agentive
ANIM            animate
ANTIPASS        antipassive
APPLIC          applicative
APPR            approximative
ART             article
ASSOC           associative
ATTEN           attenuative
AUX             auxiliary
C               consonant
CAUS            causative
CL              classifier
CLIT            clitic
COLL            collective
COMIT           comitative
COMPL           completive
CONC            concessive
CONT            continuous
CONT.GO         continuous action while in motion
CV              character vowel
DAT             dative
DEC             declarative
DEF             definite
DET             determiner
DIFFSUBJ        different subject
DIM             diminutive
DIR.OBJ         direct object
DU, du          dual
e               eyewitness
EMPH            emphatic
ERG             ergative
exc             exclusive
F, FEM, f       feminine
FOC             focus
FP              far past
FUT             future
GEN             genitive
HABIT           habitual
ImmPosIMP       immediate positive imperative
IMP             impersonal
IMPERF          imperfect
IMPFVE          imperfective
IMPV            imperative
INC, inc        inclusive
IND             indicative mood
INF             infinitive
INST            instrumentive
INSTR           instrumental
INT             intentional
INTENS          intensifier
INTER.VIS       visual interrogative
intr            intransitive
IP              immediate past
ITER            iterative
IV              intransitive verbaliser
LOC             locative
M, MASC, m      masculine
MD              modalis case
n               non-eyewitness
N               nasal
NAR             narrative case
NCL             noun class
NEG             negative
nf              non-feminine
NOM             nominative
NOMLSR          nominaliser
NONACC          nonaccomplished
NP              noun phrase
nsg             non-singular
NTR             neuter
O               transitive object function
OBJ             object
OBJ.FOC         object focus
Oc              O-construction marker
ORD             ordinal
P               possessor
PART            particle
PASSING         do while passing
PAST.PCPL       past participle
PASTHAB         past habitual
PAT             patient, subject of stative verb
PAUS            pausal
PEJ             pejorative
PERF            perfect
PERI            peripheral postposition
PERS            person
PERFVE          perfective
PL, pl          plural
PORT            portative
POSS            possessive
PP              past/passive participle
PRES            present tense
PRES.NON.VIS    present non-visual
PRO             pronominal
PRX             proximity marker
PST             past
PTCPL           participle
PURP            purposive
PW              phonological word
QUOT            quotative
RC              relative clause marker
REC.PAST        recent past
RECIP           reciprocal pronominal
REDUP           reduplication
REFL            reflexive
REL             relativiser
REM.P.INFR      remote past inferred
REM.P.REP       remote past reported
REM.P.VIS       remote past visual
RP              recent past
S               intransitive subject function
SAMESUBJ        same subject
SEMBL           semblative
SG, sg          singular
SUBJ            subject
SUBJUNC         subjunctive
SUPPO           supposedly
SUUS            reflexive-possessive pronominal
TOP.NON.A/S     topical non-subject
tr              transitive
TV              transitive verbaliser
V               vowel
VERT            vertitive, motion back towards point of origin
VL              vialis
VP              verb phrase
WK              weak
WKNED           weakened
In this book we ask how 'word' should be defined. What are the criteria for
'word'? Is 'word', as the term is generally understood, an appropriate unit to
recognise for every type of language?
This introductory chapter first looks at what scholars have said about 'word',
and then discusses the categories and distinctions which need to be examined. Chapter 2 suggests a number of typological parameters for the study
of clitics. Following chapters then provide detailed examination of the notion
of 'word' in a selection of spoken languages from Africa, North and South
America, Australia, the Caucasus and Greece, together with a discussion of
'words' in sign languages. The final chapter, by P. H. Matthews, asks what has
been learnt from these general and particular studies.
This introduction begins by surveying the criteria that have been put forward
for 'word', and suggests that one should sensibly keep apart phonological criteria, which define 'phonological word', and grammatical criteria, which define
'grammatical word'. In some languages the two types of word coincide and one
can then felicitously talk of a single unit 'word', which has a place both in the
hierarchy of phonological units and in the hierarchy of grammatical units. In
other languages phonological word and grammatical word generally coincide,
but do not always do so. We may have a grammatical word consisting of a whole
number of phonological words, or a phonological word consisting of a whole
number of grammatical words. Or there can be a more complex correspondence
between the two types of word with, say, a grammatical word consisting of all
of one and part of another phonological word.
§1 summarises the tradition, §2 discusses linguists who would do without the
word and §3 surveys opinions concerning 'what is a word'. In §4 a number of
confusions are discussed and then in §5 some suggested criteria are examined.
The heart of the chapter is in §§6-8: proposed definitions for phonological
word and for grammatical word (and the status of clitics), followed by (in §9)
examination of the relationship between the two types of word. In §10 we ask
whether all kinds of languages have words; in §11 there is brief discussion of the
varying social status of 'word' in different languages, and then §12 provides a
summary of the results of the introductory chapter. Finally, the appendix gives a
brief statement of the criteria for phonological word and for grammatical word
(and their relationship) in a sample language, Fijian.
1 The tradition
Many writers have assumed that 'word' is a, or the, basic unit of language.
Bolinger (1963: 113) comments: 'Why is it that the element of language which
the naive speaker feels that [they] know best is the one about which linguists say
the least? To the untutored person, speaking is putting words together, writing
is a matter of correct word-spelling and word-spacing, translating is getting
words to match words, meaning is a matter of word definitions, and linguistic
change is merely the addition or loss or corruption of words.' Bolinger himself
takes word as a 'prime', commenting that it is 'the source, not the result, of
phonemic contrasts'.
And, as Lyons (1968: 194) comments: 'The word is the unit par excellence
of traditional grammatical theory. It is the basis of the distinction which is frequently drawn between morphology and syntax and it is the principal unit of
lexicography (or dictionary-making).' Indeed, for the Greeks and Romans the
word was the basic unit for the statement of morphological patterns; they used
a 'word and paradigm' approach, setting out the various grammatical forms of
a given lexeme in corresponding rows and columns, with no attempt to segment into morphemes (Robins 1967: 25). (In fact Greek and Latin are fusional
languages where it is not an easy matter to segment words into morphemes,
without bringing in the impedimenta of underlying forms, morphophonological
rules, and the like.)
Much that has been written about the word is decidedly eurocentric. It has
sometimes been said that 'primitive languages' do not have words, an opinion
which Lyons (1968: 199) explicitly rejects, partly on the basis of Sapir's report
that uneducated speakers of American Indian languages can dictate 'word by
word'.
However, it appears that only some languages actually have a lexeme with
the meaning 'word'.1 Even in some familiar languages where this does occur it may be a recent development. For instance, in Old English the primary
meaning of word was (a) for referring to speech, as contrasted with act or
thought. There was a second sense, which may then just have been emerging:
(b) what occurs between spaces in written language. In the development to
Modern English (b) has become the major sense (the one used in this book),
with sense (a) still surviving mainly in fixed phrases, e.g. the spoken word,
1 Dixon (1977a: 88) states 'every (or almost every) language has a word for "word"'; this is
erroneous. Wierzbicka (1996, 1998) has 'word' as a universal semantic primitive, which is said
to be realised in every language; this is equally erroneous.
The idea of 'word' as a unit of language was developed for the familiar languages
of Europe which by-and-large have a synthetic structure. Indeed, as will be
shown below, some of the criteria for 'word' are only fully applicable for
languages of this type.
What about languages from extreme ends of the typological continuum,
those of an analytic or of a polysynthetic profile? Reviewing the first edition
of Nida's (1944) Morphology, Hockett (1944: 255) notes that Nida 'devotes a
chapter to the criteria by which words may be recognised. None of these criteria,
nor any combination of them, gives any fruitful results with Chinese . . . the real
implication is that there are no words in Chinese. The whole tradition of
words as worked out with western languages is useless in Chinese.' (However,
a quite different opinion is expressed by the leading Chinese linguist, Chao,
discussed in §10 and §11 below.)
Some of the polysynthetic languages of North America lack any unit that
looks like the sort of word we are used to from European languages. Gray
(1939: 146) presents his own definition of word as 'a complex of sounds which
in itself possesses a meaning fixed and accepted by convention'. (Note that this
would, in fact, also be satisfied by a prefix such as un- or a phrase such as
2 It is likely that all languages with an established (non-ideographic) orthographic tradition do have
a word for 'word'. Other languages tend to create such a term once they are exposed to writing.
The interesting question is how many languages with no written tradition have a lexeme which
corresponds to word in English, mot in French, etc.
We also find (perhaps as a further reflection of Malinowski's position) Potter's (1967: 78) statement: 'unlike a phoneme or a syllable, a word is not a linguistic unit at all. It is no more than a
conventional or arbitrary segment of utterances.'
We have noted one instance of 'word' in this paper, but this is used in an informal rather than in
an analytic sense. On page 166 Harris is discussing the English sentence I know John was in and
talks of pronouncing its intonation 'twice, once over the first two words and again over the last
three'.
3 What is a word?
Matthews commences the section 'What are words?' in the second edition of
his seminal textbook Morphology (1991: 208) with: 'there have been many
definitions of the word, and if any had been successful I would have given it
long ago, instead of dodging the issue until now'.
Matthews mentions that the ancient grammarians simply had word as the
smallest unit of syntax. But, he comments, 'to follow that line will only turn
our larger problem back to front. If words are to be defined by reference to
syntax, what in turn is syntax, and why are syntactic relations not contracted
by parts of words as well as whole words?'
Some of the definitions suggested for 'word' are horrifying in their complexity
and clearly infringe the principle that a definition should not be more difficult
to understand than the word it purports to define.5 There are useful surveys of
definitions of 'word' in Rosetti (1947), Weinreich (1954), Ullmann (1957) and
Krámský (1969).
Some definitions are simple and appealing. These include Sapir's (1921: 34)
one of the smallest, completely satisfying bits of isolated meaning into which
5 We can quote two rather extreme examples. Firstly, Longacre's (1964: 101) definition, which was
conceived within the formal framework of tagmemics: 'a class of syntagmemes of a comparatively
low hierarchical order, ranking below such syntagmemes as the phrase and the clause and above
such syntagmemes as the stem (as well as above roots which have no external structure and are
therefore not syntagmemes). It may be of greatly varied structure . . . Words tend to be rigidly
ordered linear sequences containing tagmemes which (aside from those manifested by stems) are
manifested by closed classes of morphemes unexpandable into morpheme sequences and giving
only stereotyped bits of information.'
Krámský devotes a whole monograph to discussing 'word'. He surveys past definitions and
then comes up with his own (1969: 67): 'the word is the smallest independent unit of language
referring to a certain extra-linguistic reality or to a relation of such realities and characterised by
certain formal features (acoustic, morphemic) either actually (as an independent component of
the context) or potentially (as a unit of the lexical plan)'.
4 Confusions
The word 'word' is used in many ways in everyday speech, and in much linguistic discourse. It is important to make certain fundamental distinctions:
(1) between a lexeme and its varying forms;
(2) between an orthographic word (something written between two spaces) and
other types of word;
(3) between a unit primarily defined on grammatical criteria and one primarily
defined on phonological criteria.
These are discussed, in turn, in §§4.1-3.
The (grammatical) word forms the interface between morphology and syntax. Morphology deals with the composition of words while syntax deals with
the combination of words. One could imagine slightly different 'words' being
required as ideal units for these two purposes. That is, there could be a 'morphological word' and a 'syntactic word' which would perhaps generally coincide but might not always do so. We are not aware of this sort of distinction
having been fully justified for any language;6 but it is certainly a possibility.
(In chapter 7, Rankin et al. put forward the idea that the term 'syntactic word'
could perhaps be used in Siouan languages for a type of word incorporating
a relative clause, the whole constituting one phonological word.)
4.1
Consider the following examples, from English and Latin, of the root or underlying form of a lexeme and its inflected forms, as used in a sentence.
     root or underlying form    inflected forms
(a)  look                       look, looks, looked, looking
(b)  lup- 'wolf'                lupus    nominative sg
                                lupo     dative/ablative sg
                                lupi     genitive sg, nominative pl
                                etc.
6 The possibility of this is mentioned by Di Sciullo and Williams (1987) without, however, the
formulation of any explicit cross-linguistic or language-specific criteria. This question is also
aired in Gak (1990). Dai (1998) establishes separate units 'syntactic word', 'phonological word',
and 'morphological word' in Chinese. He suggests that a compound is one syntactic word and also
one morphological word but that it may have different syntactic and morphological structures.
A number of other types of word have been suggested. For example, Packard (2000: 7-14)
lists: orthographic word, sociological word, lexical word, semantic word, phonological word,
morphological word, syntactic word, and psycholinguistic word.
The term 'word' is sometimes used in reference to the root or underlying form,
and sometimes in reference to the inflected forms. That is, we hear, on the one
hand, things like 'look, looks, looked and looking are forms of the same word',
and on the other hand things like 'the lexeme look is realised as word-forms
look, looks, looked and looking'.
Bally (1950: 287-9) is so concerned about this ambiguity of usage that he
recommends abandoning the label mot in French (and word in English)
and instead employing sémantème for the root or underlying form and
molécule syntaxique for inflected forms. Lyons (1968: 197) prefers a different
course. While recognising that in classical grammar 'word' was used to mean
sémantème he notes that modern usage tends to employ 'word' as a label for
molécule syntaxique and suggests standardising on this.
We have followed Lyons' suggestion, of using 'lexeme' as the label for root
or underlying form and '(grammatical) word' for inflected form of a lexeme.
Note that Lyons uses italics for words and capitals for lexemes; thus, the word
looked is the past tense form of the lexeme LOOK.
Lyons' convention is useful from another viewpoint, for dealing with lexemes
that involve two words. These include phrasal verbs in English such as MAKE
UP, as in I made the story up and I made it up. Note that the words of this
lexeme are mapped onto two non-contiguous syntactic slots: an inflected form
of make goes into the verb slot while up follows the object NP.7 That is, the
lexeme MAKE UP consists of two words, each of which has its own syntactic
behaviour. If we had decided on 'word' as the label for lexeme, there would then
be need for a separate notion of 'syntactic word'. We would have had to say that
the (lexical) word make up consists of two syntactic words, make and up. This is
avoided by describing MAKE UP as a lexeme that consists of two (grammatical)
words, an inflected form of make and the preposition up. (Similar remarks apply
to phenomena such as separable preverbs in German and Hungarian.)
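The distinction between a lexeme and the grammatical words that realise it can be pictured as a small data structure. The following Python sketch is only an illustration under assumptions of our own: the class names Lexeme and GrammaticalWord, the slot labels, the tiny PAST table and the helper functions are all invented for this example and are not part of Lyons' proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrammaticalWord:
    form: str  # inflected form, e.g. "made"
    slot: str  # syntactic slot it fills (labels are illustrative assumptions)


@dataclass(frozen=True)
class Lexeme:
    name: str     # written in capitals, following Lyons' convention
    parts: tuple  # citation forms of the component words


# Hypothetical mini-lexicon of past tense forms, for illustration only.
PAST = {"look": "looked", "make": "made"}


def realise_past(lexeme):
    """Return the grammatical words that realise the past tense of a lexeme."""
    head, *rest = lexeme.parts
    words = [GrammaticalWord(PAST[head], "verb")]
    words += [GrammaticalWord(p, "after object NP") for p in rest]
    return words


def sentence(subject, lexeme, obj):
    """Place the verb before the object NP and any further word after it."""
    verb, *others = realise_past(lexeme)
    return " ".join([subject, verb.form, obj] + [w.form for w in others])


LOOK = Lexeme("LOOK", ("look",))
MAKE_UP = Lexeme("MAKE UP", ("make", "up"))

print(sentence("I", MAKE_UP, "the story"))  # I made the story up
print(len(realise_past(LOOK)), len(realise_past(MAKE_UP)))  # 1 2
```

In this sketch one lexeme, MAKE UP, yields two grammatical words that occupy non-contiguous slots, so no separate notion of 'syntactic word' has to be introduced.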
4.2 Orthographic word
In many language communities a word is thought of as having (semantic, grammatical and phonological) unity and, in writing, words are conventionally separated by spaces. (In §9 below we investigate the writing convention when
phonological and grammatical criteria do not produce the same unit.)
Indeed, in his Phonemics, Pike (1947: 89) defines word as the smallest
unit arrived at for some particular language as the most convenient type of
7 The up can move to the left over an object that is a full NP but not over a pronoun: I made
up the story but not *I made up it. Note the distinction between a phrasal verb like make up and
one like pick on, where the on must precede the object NP, e.g. He picked on his brother or He
picked on him but not *He picked his brother on or *He picked him on. See Dixon (1982; 1991:
274-8).
example, knee was pronounced with an initial k when English was first written.
A language may undergo considerable changes, few of which get incorporated
into the orthography. French, for instance, has shifted from a mildly synthetic
structure to one bordering on the polysynthetic. A sentence such as je ne l'ai pas
vu 'I have not seen it' can be considered a single word, on both grammatical and
phonological criteria. But the language is (as a reflection of its history) written
disjunctively, with the consequence that speakers will say that the sentence
consists of five or six words (see Vendryes 1925: 87-8). This is one of the reasons
why linguists have found it harder to decide 'what is a word' for French than for
many other languages. (This point is further pursued by Matthews in chapter 11.)
4.3
Before the idea (followed here) that one should deal separately with grammatical word and phonological word and then examine the relationship between
the two units, there was confusion about exactly what a word is.
As Ullmann (1957: 46) points out, 'since the word is the central element of
the language system, it is natural for it to face both ways: not only is it the
chief subject matter of lexicology, but it is dependent on phonology for the
analysis of its sound-structure, and on syntax for the delimitation of its status in
more complex configurations'. But is 'word' primarily a grammatical unit, with
some phonological properties; or is it primarily a phonological unit, with some
grammatical properties; or is it equally a unit in grammar and in phonology?
Ideas have varied.
The majority opinion has been that 'word' is primarily a unit of grammar although, as Matthews (1991: 209) notes, 'the word tends to be a unit of phonology
as well as grammar. In Latin, for example, it was the unit within which accents
were determined.' Jespersen (1924: 92) states 'words are linguistic units, but
they are not phonetic units' and Bloomfield (1933: 181) agrees that 'the word
is not primarily a phonetic unit', while Meillet (1964: 136) maintains: 'le mot
n'admet pas, comme la syllabe, une définition phonétique; en effet la notion de
mot n'est pas phonétique, mais morphologique et syntaxique' [the word, unlike
the syllable, does not admit of a phonetic definition; indeed the notion of word
is not phonetic, but morphological and syntactic].
Lyons (1968: 200-1) puts it this way: 'we will continue to assume, with the
majority of linguists, that in all languages the morpheme is the minimum unit
of grammatical analysis. The question we have set ourselves therefore is this:
how shall we define a unit intermediate in rank between the morpheme and
the sentence and one which will correspond fairly closely with our intuitive
ideas of what is a word, these intuitive ideas being supported, in general, by
the conventions of the orthographic tradition?' He then adds (p. 204): 'in many
languages the word is phonologically marked in some way'.
Pike (1947: 90) makes a clear distinction between grammatical units, which
include morphemes, words, clitics, phrases and utterances, and phonological
In a short but classic discussion of the word, Bazell (1953: 67-8) states that
'criteria may be found which are either necessary, or sufficient, but not both'.
If criterion X is necessary but not sufficient for defining 'word' this implies
that all words show X but some other units show X as well. If criterion X is
sufficient but not necessary this implies that any unit showing X is a word but
there are also some words that do not show X.
Bazell then provides examples: 'the vowel-congruence [vowel harmony] of
alternating morphs is a sufficient but not necessary criterion of word-unity in
Turkish; the presence of at least one vowel is a necessary but not a sufficient criterion of word-status in English. The possibility of pause is a sufficient criterion,
in most languages, of word-division.'
Lyons (1968: 200) paraphrases Meillet: 'a word may be defined as the unit of
a particular meaning with a particular complex of sounds capable of a particular
grammatical employment'. He then points out that this may be a necessary but
is by no means a sufficient criterion: a phrase such as the new book or affixes
such as un- and -able (as in unacceptable) also have these properties.
In contrast, Bloomfield's well-known definition of word as 'a minimum free
form' is plainly sufficient but not necessary. As Matthews (1991: 210) points
out, 'Latin et "and" would normally be called a word, and so would English my
or the. But are these words that could occur on their own? They could do so
in a kind of citation (Did you mean et or aut? Et.) but so too could a part of
a word.' Matthews recalls having heard a dialogue: (A) 'Did you say revise or
devise?' (B) 'Re.'
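The difference between a necessary and a sufficient criterion, as applied above to the presence of a vowel and to the minimum free form, can be restated as two simple checks over an inventory of units. The following Python sketch is a toy illustration only: the five units, their hand-assigned labels and the function names are assumptions made for this example, not an analysis of English.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    form: str
    is_word: bool            # hand-assigned label (an assumption)
    has_vowel: bool          # contains at least one vowel
    minimum_free_form: bool  # can stand alone and has no smaller free part


# Hypothetical toy inventory; the labels are for illustration only.
UNITS = [
    Unit("book", True, True, True),
    Unit("the", True, True, False),           # a word, but not normally free
    Unit("un-", False, True, False),          # prefix
    Unit("-able", False, True, False),        # suffix
    Unit("the new book", False, True, False), # phrase: free, but not minimum
]


def is_necessary(criterion, units):
    """X is necessary if every word shows X."""
    return all(criterion(u) for u in units if u.is_word)


def is_sufficient(criterion, units):
    """X is sufficient if every unit that shows X is a word."""
    return all(u.is_word for u in units if criterion(u))


def classify(criterion, units):
    """Return the pair (necessary, sufficient) for a criterion."""
    return (is_necessary(criterion, units), is_sufficient(criterion, units))


print(classify(lambda u: u.has_vowel, UNITS))          # (True, False)
print(classify(lambda u: u.minimum_free_form, UNITS))  # (False, True)
```

On this toy inventory the vowel criterion comes out as necessary but not sufficient, and the minimum free form as sufficient but not necessary; with a different inventory or different labels the results could of course differ.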
In his grammar of spoken Mandarin, Chao (1968: 146-7) suggests that 'the
definition of a word as a minimum free form (free at both ends) has often been
felt to be too drastic, and weaker conditions have been proposed instead. In languages with clear and regular phonological marking, it is fairly simple to find
word boundaries without trying to find an isolated occurrence of the word as an
independent utterance. For example, words in Latin can in most cases be marked
off by the penultimate and antepenultimate stress rules. In the Wu dialects,
compound words are recognisable from their tone sandhi, which are different
within words from the tone sandhi between words . . . In Mandarin, stress and
tonal patterns can sometimes be used to mark off words, but potential pauses
are more generally available for this purpose.' Note that Chao is here adding
one or more phonological criteria to Bloomfield's essentially grammatical
criterion.
The possibility of pausing before and/or after has often been suggested as a
criterion for 'word'. In the quotation just given Chao appears to consider it as
necessary and sufficient for Mandarin, whereas Bazell quoted it as 'a sufficient
criterion, in most languages'; that is, if one can pause on either side of a unit
it must be a word, but there are some other words, in addition.
In a typical synthetic language a case could be made for potential pause
being a necessary but not a sufficient criterion. That is, pauses can be at the
boundaries of units which are both (a whole number of) phonological and (a
whole number of) grammatical words, and one always has the possibility of
pausing at such a boundary, but there may also occasionally be pauses in the
middle of a word (typically, at a morpheme boundary which is also a syllable
boundary), for example it's very un- <pause, perhaps including um> suitable.
(This is discussed further under (f) in §7.) It may be that in analytic languages,
such as Chinese, pauses can only occur at word boundaries, never in the middle
of a word. And it is certainly the case that the more polysynthetic a language
is (that is, the longer its words tend to be) the more likelihood there is of a
pause being made in the middle of a word, in addition to between words. This
applies particularly to languages which are polysynthetic and agglutinative, less
to those that are polysynthetic and fusional.
Where one may pause in natural speech is undoubtedly related to (but not
necessarily identical to) where people do pause when dictating. Firth (1957: 5)
suggests that one way of discovering the words of a language is by 'slow dictation, using any feeling for word-units the native may have'. Sapir (1921: 33-4)
is more definite, stating: 'no more convincing test could be desired than this,
that the naive Indian, quite unaccustomed to the concept of the written word, has
nevertheless no serious difficulty in dictating a text to a linguistic student word
by word'. However, Bloomfield (1933: 178) puts forward a contrary opinion:
'people who have not learned to read and write, have some difficulty when, by
any chance, they are called upon to make word divisions'.
An explanation for these differences of opinion may well be that the various scholars were dealing with different types of language. When working on
Dyirbal (a mildly synthetic and predominantly agglutinative language from
northern Australia) Dixon found that speakers did dictate phrase-by-phrase,
or more slowly word-by-word, or more slowly still syllable-by-syllable (never
morpheme-by-morpheme). He then worked on Jarawara, a polysynthetic but
basically agglutinative language from southern Amazonia. Here a verb form
might involve six or more morphemes making up twelve or more syllables.
When speakers dictate this language at a pace that the linguist can transcribe
they tend to break up long words into feet (disyllabic units) and to pause between
these. (A morpheme may span two feet, and a foot may span two morphemes.)
As mentioned before, the longer the average length of word in a language,
the more likelihood there is of pausing at some specifiable places within the
word.
Sapir concludes that the unit 'word' has psychological validity for speakers
of American Indian languages with whom he worked, presumably in a similar
manner to speakers of English and other European languages (see the quotation
from Bolinger at the beginning of §1). This is again a matter which may depend
on the typological profile of the language involved; we return to it in §11.
The ideal situation, of course, would be for there to be one or more criteria for
'word' that would apply in all languages and be both necessary and sufficient in
each. Vendryes (1925: 55-6) seeks such a universal criterion, just for 'phonetic
word'. He assesses 'accent' (or stress) as a possible candidate but finds that while
this is fine for some languages it is not adequate for all: 'in certain languages
the position of the accent is clearly decided by the word-ending; in others the
accent falls upon the final or penultimate syllable, and in others again upon
the beginning of the word. But these cases do not exhaust all the possibilities;
there are tongues, indeed, in which the variable accent gives no indication of the
word-ending. On the other hand, it may happen that there will be only one accent
in a group of several words; or, conversely, a single word may have two. Greek
and Sanskrit prove that Indo-European possessed what are called enclitics, short
words never used independently, but attached to the preceding word.' Note that
Vendryes is here confusing two different kinds of unit. Although he sets out
to discuss 'phonetic word', he then describes a clitic as a 'short word'. In fact
a clitic is something which may have the status of a grammatical word but
never that of a phonological word, being generally a stress-less element which
attaches to some full phonological word (which does bear stress)8 (see §8
below, and chapter 2). Leaving aside such instances it is the case that stress is
not a significant feature in some languages, and is not then available for deciding
'what is a word'. Vendryes appears to be correct in inferring that accent/stress
is not a universal criterion for 'phonological word'.
We have seen that many discussions of 'word' combine grammatical and
phonological criteria without any clear statement concerning the relative
statuses of these two kinds of criteria. The most sensible course of action is to
keep apart the two kinds of criteria and the units which they define.
6 Phonological word
It is clear that there is no single criterion which can serve to define a unit
'phonological word' in every language. Rather there is a range of types of
criteria such that every language that has a unit 'phonological word' (which is
probably every language in the world) utilises a selection of these.
We can offer the following definition:
A phonological word is a phonological unit larger than the syllable
(in some languages it may minimally be just one syllable) which has at
least one (and generally more than one) phonological defining property
chosen from the following areas:
(a) Segmental features: internal syllabic and segmental structure;
phonetic realisations in terms of this; word boundary phenomena;
pause phenomena.
(b) Prosodic features: stress (or accent) and/or tone assignment;
prosodic features such as nasalisation, retroflexion, vowel harmony.
(c) Phonological rules: some rules apply only within a phonological
word; others (external sandhi rules) apply specifically across a
phonological word boundary.
Note that there is likely to be a close interaction between these types of features.
For example, many phonological rules, under (c), operate in terms of stress
8 Quite a lot of the discussion of 'word' suffers from not recognising the unit 'clitic' and its
status with respect to grammatical and phonological criteria for 'word'.
pausal forms is never likely to constitute a necessary and sufficient criterion for
recognising a phonological word, but can be a useful concomitant feature.
(b) Prosodic features. In very many but not quite all languages, stress (or
accent) provides one criterion for phonological word. Many languages have
fixed stress on the first or last or penultimate or antepenultimate syllable (or
mora) of a phonological word. It should then be possible to ascertain the position
of word boundaries from the location of stress. (For example, Olawsky shows,
in §1.2 of chapter 8, that stress falls on the penultimate syllable in Dagbani;
see also the examples given in Bloomfield 1933: 182 and Trubetzkoy 1969:
277–8.) The placement of stress may be linked to the segmental properties of
phonemes; for example, in Latin stress falls on the penultimate syllable if it is
long and on the antepenultimate if the penultimate is short.
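The Latin rule just stated can be written out as a short procedure. The following is only an illustrative sketch in Python: the function name and the representation of a word as a list of syllable weights are hypothetical, and the treatment of words of one or two syllables (stress on the first syllable) is an assumption added for completeness, not something stated in the text.

```python
def latin_stress_index(long_syllables):
    """Return the 0-based index of the stressed syllable.

    long_syllables: one boolean per syllable, True if the syllable is long.
    """
    n = len(long_syllables)
    if n == 0:
        raise ValueError("a word needs at least one syllable")
    if n <= 2:
        # Assumption (not from the text): short words are stressed
        # on their first syllable.
        return 0
    # Penultimate if it is long, otherwise antepenultimate.
    return n - 2 if long_syllables[-2] else n - 3


# a-mi:-cus: long penultimate, so stress falls on the penultimate.
print(latin_stress_index([False, True, True]))   # 1
# do-mi-nus: short penultimate, so stress falls on the antepenultimate.
print(latin_stress_index([False, False, True]))  # 0
```

Because stress is computed from the end of the word, hearing a stressed syllable tells the listener that a word boundary lies two or three syllables further on, which is the sense in which fixed stress helps to locate boundaries.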
In languages with contrastive stress there will generally be just one syllable
with primary stress per word (see Weinreich (1954) on Yiddish, and Joseph
and Philippaki-Warburton (1987: 242–3) on Modern Greek). Although here
phonological word boundaries cannot be deduced from the position of stress,
one can tell from the number of stressed syllables in an utterance how many
phonological words it contains (and one can deduce that a word boundary must
lie somewhere between two stressed syllables).
However, in some languages stress placement may depend on a combination
of morphological and phonological factors. In such cases stress may not be a
useful criterion for phonological word.
A tonal system may relate to the syllable or to the phonological word; the
latter applies in Lhasa Tibetan (see Sprigg 1955) and in the Papuan language
Kewa (Franklin 1971, Franklin and Franklin 1978), for example.
A suprasegmental prosody such as nasalisation or retroflexion will have
a syntagmatic extent, and this may be a phonological word. For example,
Allen (1957) provides a prosodic account of aspiration in nominals for Hāṛautī
(Rajasthani) in terms of the unit 'word'. Among his conclusions is: 'a breathy
transition is never followed or preceded by another breathy transition within the
same word'. Robins (1957) describes vowel nasality in Sundanese as having
prosodic extent. A nasal consonant engenders nasalisation of a following vowel
and of all subsequent vowels if separated from it only by a glottal stop or h; this
continues until a word boundary is reached. (Robins points out that this applies
to all nominal words except for loans and onomatopoeics.)
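The Sundanese pattern, as summarised here, behaves like a left-to-right spreading procedure whose domain is the word. The sketch below is an illustration under stated assumptions: the segment inventories are simplified, the input strings are schematic segment sequences chosen for the illustration rather than data quoted from Robins, and nasalised vowels are marked with a combining tilde.

```python
NASALS = {"m", "n", "ɲ", "ŋ"}               # assumed nasal consonants
VOWELS = {"a", "i", "u", "e", "o", "ə"}     # assumed vowel inventory
TRANSPARENT = {"ʔ", "h"}                    # glottal stop and h let nasality through
TILDE = "\u0303"                            # combining tilde marks a nasalised vowel


def spread_nasality(word):
    """Nasalise vowels after a nasal consonant within one word.

    word: a list of segments making up a single word, so the end of the
    list stands for the word boundary at which spreading stops.
    """
    output = []
    spreading = False
    for segment in word:
        if segment in NASALS:
            spreading = True                 # a nasal starts (or restarts) the prosody
            output.append(segment)
        elif segment in VOWELS:
            output.append(segment + TILDE if spreading else segment)
        elif segment in TRANSPARENT:
            output.append(segment)           # spreading continues unchanged
        else:
            spreading = False                # any other consonant blocks it
            output.append(segment)
    return output


print("".join(spread_nasality(list("mahal"))))  # both vowels nasalised
print("".join(spread_nasality(list("natur"))))  # only the first vowel nasalised
```

Since each call handles exactly one word, nasality never carries over from one word to the next, which is what makes the prosody usable as a diagnostic for the word boundary.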
In Terena, an Arawak language, Bendor-Samuel (1966) describes how each
word has one of three prosodies: nasalisation, yodisation (involving fronting
and raising of all vowels, similar to vowel harmony) or neither nasalisation nor
yodisation.
There is in Sanskrit a prosody of retroflexion which extends until the end of
a word, under certain conditions. Allen (1951: 940) translates Pāṇini's rule as:
A root plus monosyllabic suffix (such as purposive -gu) forms one phonological
word. But a disyllabic suffix always commences a separate phonological word.
For example, gajarrA 'brown possum' plus privative suffix -gimbal
'without' gives gajarrA.gimbal, a single grammatical word that consists of two
phonological words (again using '.' for a phonological word boundary within a
grammatical word). To this can be added purposive suffix -gu, which is part of
the same phonological word as -gimbal. Rules (i) and (ii) then apply separately
to the two phonological words within this grammatical word.
(2)
Grammatical word
(3)  Ban               yibi    bulayi   bani-nyu
     determiner(fem)   woman   two      come-past
     'The two women came'
(4)  Ban               yibi    jarran   bani-nyu
     determiner(fem)   woman   two      come-past
     'The two women came'
Dyirbal is a language with remarkably free word order. In (3) the four forms
ban, yibi, bulayi and baninyu can be permuted and occur in any order (e.g. yibi
ban baninyu bulayi). However in (4) jarran must follow yibi; here we can only
permute ban, yibi-plus-jarran and baninyu. This shows that bulayi is a separate
grammatical word, the adjective 'two', while -jarran is a nominal suffix, with
dual meaning. (Further justification is provided at (7–10) below.)
9 Verbs of the form procura=lo=ei are still freely used in the Portuguese spoken in Portugal, but
in Brazil they are confined to the written register and to a formal spoken style which deliberately
reflects the conventions of writing (Prista 1966: 601).
Concerning criterion (b), it is in fact sometimes possible for affixes to occur
in alternative ordering within a word, but there must then be a difference in
meaning; that is, a change in (b) affects (c), the coherence and meaning of the
word. Matthews (1991: 213) provides a nifty pair of examples from English:
nation-al-is(e)-ation and sens(e)-ation-al-ise (these examples are further
discussed at (5–6) below). Dyirbal has a rich array of derivational affixes to nouns
including the dual suffix -jarran, as in (4), and -gabun 'another'. These can
occur in either order with, of course, a meaning difference. Thus yibi-jarran-gabun
is 'another two women' (where there have been a number of pairs of
women and here is another pair) and yibi-gabun-jarran is 'two other women'
(where there have been a number of women and here are two more). (See Dixon
1972: 232–3 where a further example is given.) Nedjalkov (1992) illustrates
alternative orderings of the affixes 'want to' and 'begin' ('begin to want' versus
'want to begin') in Evenki.
In their seminal account of Siouan (§2 of chapter 7), Rankin et al. show
that the order of elements in a word is not according to a fixed template, as
it is in most languages. The Siouan languages thus constitute something of an
exception to criterion (b).
Criterion (c) indicates that the speakers of a language think of a word as having
its own coherence and meaning. That is, they may talk about a word (but are
unlikely to talk about a morpheme). Confronted with a word like untruthfulness,
people may talk in various ways about true or truth or untruth or truthfulness
or untruthfulness, etc., but scarcely of -th or -ness (although they may possibly
talk about the suffix -ful, since it is homonymous with the word full which has
some semantic similarities, or about un-, since this has a clear meaning, of
negation). And it must be noted that, while the meaning of a word is related to
the meanings of its parts, it is often not exactly inferable from them. Blackbird
refers to a particular species of bird that is black, not to any black bird. The
noun action is a nominalisation from act but has a shifted meaning: not every
instance of acting could be described as an action (e.g. 'She acted in Hamlet'
or 'He acted the fool' would not normally be).
Zeshan provides an illuminating account of grammatical word in sign languages,
in §2 of chapter 6. Criterion (b), that the parts of a word should occur
in a fixed order, has to be modified. Since sign languages make use of several
articulators (two hands, facial gestures), it is possible to have simultaneous
components. (Zeshan draws an analogy to Semitic languages, which have
discontinuous roots consisting of three consonants into which grammatical
affixes in the form of two vowels are placed.) Zeshan shows that criteria (a)
and (c) do have straightforward application to sign languages. She also
describes (in her §3.1) how compounds involve temporal compression,
elimination of repetition, and various processes of assimilation.
Matthews (1991: 213) suggests a further criterion. Whereas syntactic processes
may be recursive (e.g. a relative clause within a relative clause, or just
saying something like very very very good) we find that:
(d) Morphological processes involved in the formation of words tend
to be non-recursive. That is, one element will not appear twice in
a word.
But, as Matthews points out, this only applies to some languages (Latin being an
example). In Turkish, for instance, a causative derivation can apply twice within
a given word, so that two instances of the causative suffix occur in sequence
(although with slightly different forms). In Dyirbal an intransitive verb (e.g.
nyinay- 'sit') can take the comitative derivational suffix -mal-, producing
transitive stem nyinay-mal- 'sit together with'. This can then be made intransitive
by adding the reflexive suffix -rriy- and then a second token of comitative -mal-
may be added to this, giving nyinay-ma-rri-mal- 'two (people) sit with (a
third)' (see Dixon (1972: 98, 246–7)). (And one can say things like re-rediscover
in English, although these are highly marked.)
In §2.2 of chapter 9, Harris examines certain derivational circumfixes in
Georgian which exhibit some of the characteristics of recursion, but concludes
that they are not truly recursive. However, Rankin et al. (chapter 7) show that
locative affixes are recursive in Siouan languages, and that the positioning of
locatives can disturb the placement of pronominal affixes.
We also find some instances of a single grammatical category being marked
twice in a word. In Yiddish, plural is generally marked just at the end of a
word, as in hant 'hand', diminutive hant-l, diminutive plural hent-lex. There
is, however, a class of nouns where plural is marked twice, by suffix -im to the
root and by the plural form, -lex, of the diminutive suffix. (Note that these are
hybrid forms, including the Hebrew marker -im together with plural diminutive
-lex of Germanic origin.) Thus we get (Bochner 1984: 414–15):
poyer 'peasant'       poyer-l, diminutive
poyer-im, plural      poyer-im-lex, diminutive plural
Aikhenvald (1999a) gives examples of both plural and gender being marked
twice, both within a noun and within a verb, in Tariana.
It is generally the case that a word is centred on a root or else on a combination of roots (a compound stem). Various derivational processes may be applied
to the root or compound stem, each in its turn forming a derived stem. Thus the
examples from Matthews quoted earlier, of -al and -ise applying either before
or after -ation, involve the following derivations:
(5)  noun root                               nation
     add -al, deriving an adjective stem     nation-al
     add -ise, deriving a verb stem          nation-al-ise
     add -ation, deriving a noun stem        nation-al-is-ation
(6)  verb root                               sense
     add -ation, deriving a noun stem        sens-ation
     add -al, deriving an adjective stem     sens-ation-al
     add -ise, deriving a verb stem          sens-ation-al-ise
Once all derivational processes have applied, the resulting stem takes the
inflection appropriate to its word class. Nationalisation is a derived noun and can take
the plural suffix -s; sensationalise is a derived verb and takes one of the
inflectional suffixes available for verbs in English, -s, -ed, -ing or zero.
In some languages there can be a variant type of grammatical word, with
no root at all (or perhaps with a zero root). Dixon provides examples of this
from Jarawara, in chapter 5. The 1sg pronoun prefix o- can attach to the
feminine declarative suffix -ke, to form o-ke, which is both one grammatical word
and one phonological word. And some verbal suffixes may be added to an
auxiliary root, -na-, but cause the auxiliary to drop if it also bears a prefix;
thus, underlying o-na-bisa '1sg-auxiliary-also' becomes o-bisa, one
(phonological and grammatical) word which consists just of prefix o- and suffix
-bisa.
In §6 we discussed boundary phenomena, characteristic features of the
beginning and end of phonological words in particular languages. Similar features
can be recognised for grammatical words. Van Wyk (1968: 554) mentions: 'in
Northern Sotho, for example, the negative morpheme ga- only appears on initial
boundaries of verbs and the relative morpheme on the final boundaries of verbs'.
Thus, the negative prefix ga- always marks the beginning of a grammatical word
in Northern Sotho. Similarly, in English past tense suffix -ed (with allomorphs
/-t/, /-d/ and /-id/) marks the end of a verb.¹⁰ These are language-particular
criteria which can be of great help to a linguist working on a previously undescribed
language.
Each language has its own morphological profile. In some cases all affixes are
optional but in others a certain type of affix is obligatory (an inflectional system).
Just in languages with a single inflectional system on each class of words,
there is a further criterion for grammatical word, concerning the distribution of
inflections:
(e) There will be just one inflectional affix per word.
10 One must of course be careful to distinguish this -ed from an ed which is the final part of an
unanalysable root. Compare bak(e)-ed and naked. The different statuses of the eds are brought
out in the phonological realisations: /beikt/ and /neikid/.
In Latin each word in an NP must show the appropriate inflection for number
and case. The same applies in Dyirbal, but just for case. Harking back to (3–4),
suppose that we have NPs:
(7)   ban     yibi           bulayi
(8)   ban     yibi-jarran
Now, when dative case -gu is added to these two NPs we get:
(9)   bagun   yibi-gu        bulayi-gu
(10)  bagun   yibi-jarran-gu
The dative form of the determiner ba-n is ba-gu-n, with the dative suffix -gu
coming between root ba- and feminine suffix -n. The point to note is that in
(9) noun yibi 'woman' and adjective bulayi 'two' are separate words and each
takes the dative suffix -gu. But in (10) yibi-jarran is one word and it takes a
single token of -gu, after the dual suffix -jarran.
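Criterion (e) can be restated as a simple count: in a language of this type the number of case suffixes in an NP equals the number of inflecting grammatical words. The following Python sketch illustrates that reasoning with the Dyirbal forms in (7–10). The function names, and the representation of an NP as a determiner root and gender suffix plus a list of word stems, are hypothetical conveniences for the illustration.

```python
DATIVE = "gu"


def dative_np(determiner_root, gender_suffix, words):
    """Build the dative form of an NP.

    The determiner takes -gu between its root and its gender suffix;
    every other grammatical word takes exactly one final -gu.
    """
    determiner = f"{determiner_root}{DATIVE}{gender_suffix}"
    return [determiner] + [f"{word}-{DATIVE}" for word in words]


def count_case_marked_words(forms):
    """Count the forms that end in the dative suffix."""
    return sum(1 for form in forms if form.endswith("-" + DATIVE))


np_9 = dative_np("ba", "n", ["yibi", "bulayi"])     # noun plus adjective
np_10 = dative_np("ba", "n", ["yibi-jarran"])       # noun with dual suffix

print(np_9, count_case_marked_words(np_9))
print(np_10, count_case_marked_words(np_10))
```

Two tokens of -gu after the determiner in (9) against one in (10) is exactly the evidence that bulayi is a separate grammatical word while -jarran is a suffix.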
In a language where inflections do not go onto every word of an NP (but only,
say, onto the head, or only onto the last word or the first word) this criterion
would have to be modified but could still be applicable. In a language such as
Turkish or Hungarian, where number and case are separate, obligatory suffixes,
the criterion would have to be modified in a further way but again could still
be applicable. However, it may not be applicable in languages which permit
double case.¹¹ The criterion may also apply with respect to inflections on verbs.
We can now look at two of the most quoted criteria for 'word', concerning the
placement of pauses and the ability of words to make up complete utterances.
Bloomfield (1933: 180) and Lyons (1968: 202) lay stress on the criterion of
uninterruptability:
(f) A speaker may pause between words but not within a word.
Bloomfield (1933: 180) exemplifies this with: 'one can say black – I should
say, bluish-black – birds, but one cannot similarly interrupt the compound
word blackbirds'.
This criterion should, however, be treated with caution. Firstly, it is at best
a tendency. In a synthetic language one certainly tends to pause more often
between words than within words but it is by no means unheard of to pause
between morphemes within a word; as mentioned in §5, one does hear things
like 'it's very un- <pause> suitable'. Secondly, this applies better to synthetic
than to polysynthetic languages: the longer the words of a language are, the
more likely there are to be pauses in the middle of them (as mentioned above,
this applies especially to languages which are polysynthetic and agglutinative).
11 Dench and Evans (1988) and Evans (1995) describe true instances of double case marking. Note
that some things that have been called 'double case' (by many of the contributors to Plank
(1995), for example) involve genitive (a marker of function within an NP) followed by a marker
of function within a clause; see Dixon (1998) and Aikhenvald (1999b) for discussion of this.
The third caveat is the most important. Pausing appears in most cases
(although perhaps not in all) to be related not to grammatical word but to
phonological word. In English, for instance, there are just a few examples of
two grammatical words making up one phonological word, e.g. don't, won't,
he'll. One would not pause between the grammatical words do- and -n't in the
middle of the phonological word don't (one could of course pause between the
do and not of do not, since these are distinct phonological words).
The places where expletives may be inserted, as a matter of emphasis, are
closely related to (but not necessarily identical to) the places where a speaker
may pause. Expletives are normally positioned at word boundaries (at positions
which are the boundary for grammatical word and also for phonological word).
But there are exceptions: for instance the sergeant-major's protest that 'I won't
have no more insu bloody bordination from you lot' or such things as 'Cinda
bloody rella' and 'fan fucking tastic'. McCarthy (1982) shows that in English
expletives may only be positioned immediately before a stressed syllable. What
was one unit now becomes two phonological words (and the expletive is a further
word). Each of these new phonological words is stressed on its first syllable;
this is in keeping with the fact that most phonological words in English are
stressed on the first syllable.
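The generalisation attributed to McCarthy, that a word-internal expletive must stand immediately before a stressed syllable, can be sketched as a small procedure. This is an illustration only: the function names are hypothetical, the syllable division and stress marking of the input are supplied by hand as assumptions, and the word-initial position is excluded because an expletive there would simply stand at an ordinary word boundary.

```python
def insertion_points(syllables):
    """Indices i such that an expletive may be placed before syllables[i].

    syllables: list of (text, stressed) pairs for a single word.
    """
    return [i for i, (_, stressed) in enumerate(syllables) if stressed and i > 0]


def insert_expletive(syllables, expletive="bloody"):
    """Return every form of the word with the expletive inserted word-internally."""
    results = []
    for i in insertion_points(syllables):
        before = "".join(text for text, _ in syllables[:i])
        after = "".join(text for text, _ in syllables[i:])
        results.append(f"{before} {expletive} {after}")
    return results


# Hand-marked stress (an assumption of the sketch): fan-TAS-tic.
fantastic = [("fan", False), ("tas", True), ("tic", False)]
print(insert_expletive(fantastic))  # ['fan bloody tastic']
```

The part of the word that follows the expletive always begins with a stressed syllable, which is consistent with the observation that the new phonological words are stressed on their first syllable.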
Associated with pause is the phenomenon of self-repair. If a speaker realises
that they have made a mistake in the middle of an utterance, they are likely to
pause. The mistake will have to be corrected and the utterance resumed. The
interesting question is how far (if at all) one has to go back, in this process of
repair. Woodbury shows how in Cup'ik (a highly polysynthetic language) if a
pause or speech error occurs in the middle of a phonological word, the speaker
will go all the way back to the beginning of the word and start again (see (57)
in §7.4 of chapter 3).
We can now turn to the criterion of isolatability: Sweet's 'ultimate or
indecomposable sentence' and Bloomfield's 'minimum free form':
(g) A word may constitute a complete utterance, all by itself.
When this criterion is examined it is seen to apply neither to grammatical word
nor to phonological word. Rather it applies to a combination of these: to a
unit which is both a grammatical word and a phonological word. Or to something
which is a grammatical word consisting of a whole number of phonological
words; or to something which is a phonological word consisting of a whole number
of grammatical words. In §5.4 of chapter 3, Woodbury states that every grammatical
word in Cup'ik may stand alone as a complete utterance, except for most
clitics (which are one grammatical word, but not a separate phonological word).
That is, a grammatical word which is just part of a phonological word may
not make up a complete utterance (e.g. -n't from English don't). Nor may a
phonological word which is part of a grammatical word (e.g. gimba:lgu from
Yidiɲ gaja:rr.gimba:lgu in (2) of §6).
Even then, criterion (g) has no more than limited applicability, to only
some words in some languages, depending on the conventions for discourse
organisation and on other factors; see §11 below.
Note also that, in certain speech situations, part of a word may make up
a complete utterance. Matthews' example of an utterance consisting just of
'Re' was mentioned in §5. And we have heard an airline clerk ask a passenger
whether they would like a smoking or non-smoking seat, the answer being just
'Non'.
In summary, (a–c) are the main criteria for defining a grammatical word,
with the caveats mentioned above. Criterion (d), non-recursiveness, and (e),
distribution of inflections, do apply well in certain languages. The principle of
uninterruptability, (f), is only a tendency, which may apply more to
phonological than to grammatical words, but can be a useful support for the other
criteria. And (g), isolatability, is again a tendency which can be of use when it
is realised that it only applies to a unit which consists of a whole number of
(one or more) grammatical words and also a whole number (one or more) of
phonological words.
8 Clitics
The term 'clitic' is often used to refer to something that is a grammatical word
but not a complete phonological word (for example, it does not take stress). A
clitic is attached to a host phonological word, as a sort of optional extra. There
are some items that can have the form either of a clitic or of a full phonological
word. For example, the in English is generally a proclitic [ðə=] but can, when
used contrastively, be accorded a full vowel which is stressed, [ðí:] (as in 'Is
that the man you saw yesterday?').
Clitics may sometimes form part of a host phonological word for purposes
of assignment of prosodic features (such as stress and vowel harmony) and for
the application of phonological rules. More often, they are simply added as
an extra, unstressed syllable to a fully articulated phonological word, after
all processes and rules have applied. Consider an example from Yidiɲ of verb
root warrgi- 'do all around', past tense -ɲu and the clitic with meaning 'now'
which has form =la after a vowel and =ala after a consonant. Recall, from (c)
in §6, rule (i), which states that if a phonological word has an odd number of
syllables then the penultimate vowel is lengthened. A further rule, (iii), omits
the final -u of past tense -ɲu from a word with an odd number of syllables. We
get the following derivation (Dixon 1977a: 237):
underlying form                               warrgiɲu     =(a)la
rule (i) applies to an odd-syllabled form     warrgi:ɲu    =(a)la
rule (iii) applies to an odd-syllabled form   warrgi:ɲ     =(a)la
the clitic attaches                           warrgi:ɲ=ala
If the clitic were attached to the underlying form warrgiɲu we would have
warrgiɲula, which has four syllables, and rules (i) and (iii) would then not
apply. But these rules do apply to warrgiɲu, showing that =(a)la is added to
the phonological word as the very last step in word-building, after all other rules.
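The ordering argument can be checked mechanically by applying the rules in the two possible orders. The Python sketch below is an illustration under stated assumptions: a word is represented as a hand-divided list of syllables, the past tense suffix is spelled -ɲu (the spelling assumed in this sketch), the penultimate syllable is taken to be open so that length can be marked by appending ':', and all function names are hypothetical.

```python
PAST = "ɲu"            # past tense suffix, in the spelling assumed for this sketch
VOWELS = set("aiu")    # assumed vowel inventory


def rule_i(syllables):
    """Rule (i): lengthen the penultimate vowel of an odd-syllabled word."""
    if len(syllables) >= 3 and len(syllables) % 2 == 1:
        # Simplification: the penultimate syllable is assumed to be open.
        return syllables[:-2] + [syllables[-2] + ":"] + syllables[-1:]
    return list(syllables)


def rule_iii(syllables):
    """Rule (iii): drop the final -u of the past suffix in an odd-syllabled word."""
    if len(syllables) >= 3 and len(syllables) % 2 == 1 and syllables[-1] == PAST:
        # The remaining consonant joins the preceding syllable.
        return syllables[:-2] + [syllables[-2] + PAST[0]]
    return list(syllables)


def attach_now_clitic(word):
    """Attach 'now': =la after a vowel, =ala after a consonant."""
    ends_in_vowel = word[-1] in VOWELS or word[-1] == ":"
    return word + ("=la" if ends_in_vowel else "=ala")


def clitic_last(syllables):
    """Rules (i) and (iii) apply first; the clitic is added afterwards."""
    return attach_now_clitic("".join(rule_iii(rule_i(syllables))))


def clitic_first(syllables):
    """Counterfactual order: the clitic syllable is present when the rules apply."""
    return "".join(rule_iii(rule_i(syllables + ["la"])))


stem = ["warr", "gi", "ɲu"]
print(clitic_last(stem))   # warrgi:ɲ=ala
print(clitic_first(stem))  # warrgiɲula
```

Only the first ordering yields the lengthened, truncated form with the post-consonantal allomorph =ala; the counterfactual ordering leaves an even-syllabled word to which neither rule can apply.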
One then effectively has two levels of phonological word: (i) that without
any clitics; and (ii) that with one or more clitics. Woodbury (in §7.1 of chapter 3)
states that, for Cup'ik, (ii) is the phonological word proper (which he labels
PW), while (i) is a subdomain of the phonological word (which he calls PW ).
Sometimes if two clitics occur in sequence they may come together to form
one phonological word. In Boumaa Fijian, for example, the preposition i=
'to' is generally a proclitic to a following noun and so is the common article
a=, as in i=vanua 'to land' (as opposed to 'to sea') and a=vanua 'the land'
(where '=' indicates a clitic boundary). Note that primary stress goes on the
syllable containing the second mora from the end of a word and secondary
stress on the syllable containing the fourth mora; here the clitics i= and a=
bear no stress, showing that they are attached to the phonological word vanua
after the stress rule has applied. However, when the preposition and article are
used together (the article then has allomorph na) they make up a phonological
word, which has penultimate stress, í=na vanua 'to the [place on] land' (see the
appendix to this chapter, and Dixon 1988a: 116, 29). Similar clitic-only words
are reported by Aikhenvald for Tariana (chapter 2) and by Woodbury for Cup'ik
(chapter 3).
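The Fijian stress rule quoted above counts moras from the end of the word, so it can be used to test whether a clitic lies inside the stress domain. The sketch below is illustrative only: the function names are hypothetical, and a word is represented simply as a list giving the number of moras in each syllable (va-nu-a is taken as three one-mora syllables, an assumption of the sketch).

```python
def syllable_with_mora(moras_per_syllable, k):
    """Index of the syllable containing the k-th mora from the end, or None."""
    counted = 0
    for index in range(len(moras_per_syllable) - 1, -1, -1):
        counted += moras_per_syllable[index]
        if counted >= k:
            return index
    return None  # the word has fewer than k moras


def stress_pattern(moras_per_syllable):
    """Return (primary, secondary) stressed syllable indices."""
    return (syllable_with_mora(moras_per_syllable, 2),
            syllable_with_mora(moras_per_syllable, 4))


vanua = [1, 1, 1]                      # va-nu-a
print(stress_pattern(vanua))           # (1, None): stress on 'nu' only

# Counterfactual: if proclitic i= counted as part of the stress domain,
# the fourth mora from the end would fall on the clitic itself.
print(stress_pattern([1] + vanua))     # (2, 0)
```

Since i=vanua is in fact pronounced with no stress on the clitic, the pattern matches the first computation, with the clitic added after stress has been assigned to vanua.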
Chapter 2, by Aikhenvald, commences with a comprehensive typology of
fifteen parameters in terms of which clitics vary: the kind of host to which they
attach, the direction of attachment, etc. The second part of her chapter describes
the rich set of proclitics and enclitics in Tariana, comparing their properties
with those of prefixes and suffixes. Most of the following chapters include
descriptions of the behaviour of clitics in individual languages. Jarawara, in
chapter 5, is unusual in not requiring a category of clitics for its grammatical
description.
In §3.2 of chapter 6, Zeshan points out that a set of clitics can be recognised
for various sign languages. For example, a deictic index can be cliticised to a
double-handed host sign; the left hand fully articulates the host sign while, part
way through the articulation, the right hand makes the index clitic.
Most scholars consider Modern Greek to have clitics. However, in chapter
10, Joseph adopts the theoretical stand that the recognition of clitics should be
avoided for this or for any other language. One should simply have words
(of a variety of types) and affixes (of a variety of types). Joseph examines the
properties of the 'little elements' that are normally classed as clitics in Modern
Greek, and suggests that some of them could be treated as a type of word
(prosodically weak or deficient words) and others as a type of affix. Matthews,
in the following chapter, comments on this position. (Note that Joseph is the
only contributor not to explicitly distinguish between phonological words and
grammatical words, and then to look for coincidences and mismatches between
them. A major type of mismatch, in many languages, concerns clitics, which
each make up one grammatical word but do not constitute a separate
phonological word.)
9 Relationship between grammatical and phonological words
Rather few linguists, in writing grammars of languages, have clearly
distinguished between phonological and grammatical words. Sometimes the unit
'word' is taken for granted, with no justification or criteria offered. Sometimes
criteria are offered but they may mix grammatical and phonological
characteristics with no clear discussion of whether these always define the same unit.
However, there are sufficient clear descriptions for us to be able to recognise
each of three simple types of relationship between the two kinds of word: (a)
the units coincide; (b) a phonological word may consist of several grammatical
words; and (c) a grammatical word may consist of several phonological words.
We discuss these first, before looking at more complex relationships, in (d).
(a) Phonological and grammatical word coincide. Newman (1967) clearly
distinguishes phonological and grammatical criteria in Yokuts, implying that these
converge on a single unit 'word'. A similar conclusion is explicitly stated by
Czaykowska-Higgins (1998) for Moses-Columbia Salish (see the discussion at
the end of this section).
A considerable search of grammars has found almost none which provide
explicit criteria for phonological word and for grammatical word and state that
these coincide. It may be that grammars tend only to mention instances where
the two units do not coincide; or that in those languages which have been
investigated from this point of view the two units never exactly coincide. More
work is needed on this.
(b) Phonological word consists of (usually) one or (sometimes) more than one
grammatical words. Many languages have clitics, which are grammatical words
that do not constitute a phonological word on their own but must be attached to
a phonological word primarily associated with some other grammatical word,
e.g. -n't as in English mustn't. In Dyirbal there is a clitic -ma (marking a clause as
a polar interrogative) which is a grammatical word that attaches, as an enclitic,
to the end of the first phonological word of the sentence. For example, the
interrogative version of (3) would be Ban=ma yibi bulayi baninyu 'Did the two
women come?'
Some of the chapters below provide examples of one phonological word
consisting of two grammatical words. In Tariana (chapter 2), Cup'ik (chapter 3),
Arrernte (chapter 4) and Dagbani (chapter 8) this involves clitics. In Jarawara
(chapter 5) the auxiliary verb -na- is the core of a grammatical word. It generally
takes one or more affixes and then makes up one phonological word (which
must have at least two moras). When it occurs without affixes, it encliticises to
the preceding non-inflecting verb, e.g. amo=na 'he/she sleeps' (and it is a full
constituent of this phonological verb, for purposes of stress assignment).
Nespor and Vogel (1986) provide useful discussion of how phonological word
and grammatical word boundaries do not coincide, in a number of languages.
However, in no case do they provide full criteria for phonological word and
grammatical word in a given language.¹²
(c) Grammatical word consists of (usually) one or (sometimes) more than one
phonological words. In Yidiɲ we may find one grammatical word consisting of
two phonological words; this applies both to nouns, illustrated in (2) above, and
to verbs. Foley (1991: 80–7) reports a similar situation in the Papuan language
Yimas.
There are a number of types of grammatical construction which typically
fall under this heading. A compound is by definition one grammatical word
but in many languages the components are separate phonological words. For
nominal compounds in Yimas, Foley (1991: 86) notes 'each of the nouns in
these compounds constitute a phonological word in themselves, as shown by the
individual primary stresses. Yet they form one grammatical word in that there
is only one inflection for number'. Similar remarks apply for compounds in
Fijian (Dixon 1988a: 22), in Jarawara (§2 of chapter 5) and in Georgian (§5
of chapter 9). Nespor and Vogel (1986: 120) state that in Turkish 'additional
evidence that the two members of a compound do not form a single phonological
word is provided by vowel harmony'.
These languages are different from English. Bloomfield's definition of word
as a 'minimum free form' appears to encounter difficulties with compounds
such as blackbird since black and bird are themselves minimum free forms.
He is able to argue that bird in blackbird is not the same as bird in black
bird since it does not bear major stress. This argument works for English, and
also for Dagbani (see chapter 8). It would not be applicable to Yimas, Fijian,
Jarawara or Turkish; for these languages a compound is one grammatical but
two phonological words.
12 There are also a number of errors and inaccuracies in their account. For instance, on page 34
they refer to 'Yidiɲ, a language spoken in Central Australia' and on page 134 to 'Yidiɲ, an
Australian language spoken in northeast Queensland'. The same language is referred to and in
fact the same data is presented on the two pages; but it is presented in a misleading manner.
Nespor and Vogel say that underlying gumari-daga-ɲu becomes guma:ridaga:ɲu after a rule
of penultimate lengthening has occurred. In fact this is an intermediate stage in derivation, not
an occurring form. The surface form is (after further rule application) guma:ridaga:ɲ (Dixon
1977a: 91; 1977b: 28).
Arrernte provides an interesting situation. Henderson (in chapter 4) states
that a compound is one grammatical word which consists of two phonological
words (since each part of the compound has its own stress). However, the
stress patterns on the two parts of a compound differ from those on the same
two words in syntactic association; he concludes that the parts of a compound
constitute distinct phonological words which are conjoined into a single higher
phonological word.
In some languages with verb serialisation, the verbs involved are effectively
compounded together (see Foley (1991: 84–5) on Yimas). This is another typical
instance where a grammatical word (the serialised verb compound) may consist
of several phonological words (the individual verbs involved).
The other typical example of a grammatical word consisting of two phonological words involves reduplication. A reduplicated form is one grammatical
word (if it were not it would be simply repetition) but in many languages the
reduplication boundary is also a phonological word boundary. We saw under
(a) in §6 how a sequence of o-plus-i forms a diphthong within a phonological
word in Fijian but in an inherent reduplication like ilo.ilo 'glass' each vowel
is pronounced as a separate syllable. (Stress rules support this analysis; see
Dixon 1988a: 24.) Similar remarks apply to Jarawara; see §2 in chapter 5. In
the Australian language Warrgamay a long vowel may only occur in the initial
syllable of a phonological word. The only grammatical words with two long
vowels are ji:ji: 'bird (generic)' and bi:lbi:l 'pee wee (bird species)', words
with inherent reduplication (Dixon 1981: 17). This shows that in Warrgamay,
as in many other languages, a reduplication boundary is also a phonological
word boundary within a grammatical word.
A division between phonological words within a grammatical word may have
other types of motivation in individual languages. For example, Dixon shows
(in §4.4 of chapter 5) that there are just a few verbal suffixes in Jarawara which
commence a new phonological word within a grammatical word if they are
preceded by more than a single mora within the grammatical word.
(d) More complex relationships between grammatical and phonological word.
We only know of two instances where one type of word does not consist of a
whole number of instances of the other type. The first is in Fijian and it involves
the derivational prefix i-, which is added to a verb and derives a noun, e.g. sele
'to cut, slice', i-sele 'knife'. The unusual feature is that i- coheres with a
preceding common article a to form one phonological word with it:
(11)  ARTICLE    DERIVED NOUN
      a          i-sele
      PHONOLOGICAL WORDS: [a + i-] [sele]
(The criteria for phonological word and for grammatical word in Fijian are
given in the Appendix.)
The grammatical words are a and i-sele, but the phonological words are ai
(pronounced as a diphthong, which only happens within a phonological word)
and sele. Thus the grammatical word i-sele consists of one full phonological
word (sele) and a part of another (i from ai) while the phonological word ai
consists of one full grammatical word (the article a) and a part of another (the
derivational prefix i- from the noun i-sele).
The early missionaries to Fiji found it hard to decide where to write the word
boundary in a phrase like (11). There are three possibilities:
(12)
i. ai sele
ii. a i sele
iii. a isele
Hazlewood (1850), in his grammar, opted for (i), Churchward (1941) criticised
this and preferred (ii). Then Milner (1956) went to the other extreme and used
(iii). In fact there is merit in each of these alternatives: (i) shows the phonological
word, (iii) the grammatical word, while (ii) simultaneously recognises both
kinds of word boundary. There are fuller details in the appendix (see also Dixon
1988a: 21–31; 1988b).
The second known example of one type of word consisting of other than a
whole number of instances of the other type of word concerns Arrernte, and is
described by Henderson in §4.3 of chapter 4. It relates to the VC(C) syllable
structure which Henderson posits for this language.
We can now return to the topic of orthographic word, briefly mentioned in
§4.2. The question is: if there is a difference between phonological word and
grammatical word, where do people prefer to insert a space: between grammatical words or between phonological words? In order to provide a fully informed
answer to this we would need an array of studies for individual languages, which
is not at present available. But some preliminary remarks may be offered.
In many cases people will place word boundaries around the larger unit. Thus,
if a phonological word involves two grammatical words they will write spaces
around the phonological word (for example, mustn't in English) and not between
the grammatical words within the phonological word. And if a grammatical
word consists of two phonological words they will write spaces before and
after the grammatical word and not between the two constituent phonological
words (this applies to reduplication and compounding in many languages).
But what of case (d), in Fijian, where there is no 'whole number of units'
inclusion between the two kinds of word? Well, most spontaneous written
material (and the Bible translation) in Fijian works in terms of alternative (i) in
(12). Similarly, when speakers dictate material or help the linguist transcribe
texts they say ai, pause, sele and stoutly maintain that ai is one word and
sele another. This shows that in this instance it is the phonological word which
determines word spaces (and that this is the unit which has psychological
validity; see §11 below).
We have discussed phonological words and grammatical words as if they were
quite separate units, and then investigated the types of relationship between
them. In fact, the two kinds of word are always closely intertwined. Each type
of morpheme in a language is likely to have its own accentual potentiality (for
example, some affixes may bear inherent stress while others lack this), so that
the way in which the components of a grammatical word are combined defines
its phonological status.
Phonological words of different compositions may show varying prosodic
properties. In Modern Greek, if a long phonological word consists of more
than one grammatical word it has an obligatory secondary stress, whereas the
inclusion of a secondary stress is always optional in a long phonological word
which consists of just one grammatical word (Joseph and Philippaki-Warburton
1987: 243).
Czaykowska-Higgins (1998) presents an illuminating discussion of words in
Moses-Columbia Salish, showing that although phonological word and morphological word coincide in extent, their internal structures in terms of phonological and grammatical bracketing differ. For example, reduplication is a
grammatical process of suffixation applying to a morphological root whereas,
in terms of phonological processes, the reduplicated portion forms an inherent
part of the phonological root.
For each language we can recognise a hierarchy of grammatical units; this
is, typically: morpheme, grammatical word, phrase, clause, sentence. There
must also be a hierarchy of phonological units; this is, typically: phoneme, foot
(in some languages), syllable, phonological word, intonation group, utterance.
(An alternative phonological hierarchy is suggested by Nespor and Vogel 1986
and repeated in Hall and Kleinhenz 1999: 9: syllable, foot, phonological word,
phonological phrase, intonational phrase, phonological utterance.) The way in
which the hierarchies relate varies from language to language. The place at
which the two hierarchies are most likely to converge concerns grammatical
word and phonological word; these may wholly coincide or else often coincide,
for a given language. (This is why it is appropriate to use the term 'word' for
units on both hierarchies.)
10 Do all languages have words?
One cannot assume that just because all the languages one knows have words
then so must all the languages in the world. However, it does seem likely that
every language will have both a phonological word, such as we defined in §6,
and a grammatical word, as defined in §7 (that is to say, we cannot imagine a
language which does not have these units). In some languages these units may
always coincide and in others they generally coincide (note that in languages
of types (b–d) from §9 it is only in a minority of cases that phonological word
and grammatical word do not coincide). It is not impossible that there would
be a language that lacks phonological words and/or grammatical words, but we
are not at present aware of one.
As mentioned before, most discussion of words has centred on the familiar
synthetic languages of Europe. How about languages of other types? Recall
Hockett's comment, quoted in §2: 'there are no words in Chinese. The
whole tradition of words as worked out with western languages is useless in
Chinese.' Well, the leading grammarian of Chinese, Chao, reaches a different
conclusion. He recognises a syntactic unit in Chinese which satisfies our
criteria for grammatical word: it has fixed internal structure but unlimited
versatility in syntactic constructions; in addition, one may pause at a word
boundary, etc. (Chao 1946; 1968: 136–93). See also the comments in §11.
At the other end of the scale we can certainly recognise a unit 'grammatical
word' in polysynthetic languages, on the criteria given in §7. That is, the parts
always occur together, in fixed order and have a conventionalised coherence
and meaning. What makes fieldwork on a polysynthetic language demanding is
that one has to quote a complete word (which can be dauntingly long) whenever
one wants to discuss its form or function or meaning; it is not acceptable to
quote just one part of a word.
There have been few in-depth studies of the unit 'word' in polysynthetic languages, but we can air a few preliminary impressions. Firstly, one is rather more
likely than in a synthetic language to find a grammatical word consisting of a
number of phonological words. And it may be more possible to pause at phonological word boundaries within a grammatical word. Further work is needed to
confirm (or disprove) these initial impressions.
11 The social status of words
Although it is likely that all languages have words (as we have characterised
'word' in this chapter), the social role of words differs widely.
In English and other European languages (with an established tradition of
writing) the word is the unit of the language about which people talk and argue.
A quite different kind of unit may fulfil this role in other languages. Chao
(1968: 136) explains that in Chinese a unit called tzyh (nowadays written zì)
is the sociological unit of the language, meaning by this 'that type of unit,
intermediate in size between a phoneme and a sentence, which the general,
nonlinguistic public is conscious of, talks about, has an everyday term for, and
is practically concerned with in various ways. It is the kind of thing which
a child learns to say, which a teacher teaches children to read and write in
school, which a writer is paid for so much per thousand, which a clerk in a
telegraph office counts and charges so much per, the kind of thing one makes
slips of the tongue on, and for the right or wrong use of which one is praised
or blamed. Thus it has all the social features of the common small change of
every day speech which one would call a "word" in English.' Chao (1946:
4) mentions that tzyh is translated as 'word' 'by most of those who speak in
English on Chinese', a footnote adding 'such as Sinologists, missionaries, and
Chinese students studying abroad'. But in fact tzyh is not a word on any of
the accepted definitions; it is a 'character'. As mentioned in §10, Chao provides
criteria for a syntactic unit in Chinese (called cí, see Packard 2000: 14–20)
which satisfies our criteria for grammatical word (it consists of one or more
tzyh) but states that it 'plays no role in the Chinaman of the street's conception
of the subunits of the Chinese language' (1968: 138).
That is, Chinese does have 'word' but this unit has no social status for the
language community. In much the same way that speakers of English and other
languages talk about 'words', speakers of Chinese talk about tzyh 'characters',
which roughly correspond to the grammatical morpheme and/or phonological
syllable. This social difference is undoubtedly related, at least in part, to the
different writing systems employed by the Chinese and the English.
In languages where people do talk about words, noun and verb may have different statuses. When doing fieldwork on a previously undescribed language one
may in some cases felicitously cite a noun and ask about its meaning and
use. But it is bad practice to do the same for a verb. One should always include
at least minimal information about core arguments, asking about 'She hit him'
rather than just about 'hit'. A noun generally names a type of object and may
be used just for this; but a verb describes an action or state which requires a
number of participants and these should be specified.
Languages vary. In Jarawara a noun will not normally be used alone, for
naming. When a new species of bird is encountered, a Jarawara would never
point it out by just saying its name, e.g. sasaha. They would add the copula
verb ama 'be' or the intransitive verb wata 'exist', often with a declarative
suffix, feminine -ke or masculine -ka, indicating the gender of the noun. That
is, they would point at the bird and say sasaha ama or sasaha wata or sasaha
ama-ke or sasaha wata-ke (this particular bird, the hoatzin bird, is of feminine
gender).
12 Summary
We have found that although many types of definition have been suggested for
'word', there has often been lack of a clear distinction between 'lexeme' and
'word form', and/or between phonological and grammatical criteria. We suggest that different sorts of criteria should be kept strictly apart: phonological
criteria define 'phonological word', which is a unit in the phonological hierarchy, while grammatical criteria define 'grammatical word', which is a unit in
35
36
verb-form ca-ta 'hate, consider bad'. When this is used without an affix the
vowel is lengthened, giving caa.
(ii) A sequence of ai, ei, oi, au, eu, ou or iu, within a phonological word, is
pronounced as a diphthong. Across a phonological word boundary, such a
vowel sequence is pronounced as two separate syllables.
(b) Stress rule. Primary stress goes on the syllable containing the second mora
from the end of a phonological word. Secondary stress goes on the syllables
containing the fourth and sixth moras from the end (we have no example of a
phonological word involving eight moras).
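The stress rule in (b) can be read as a small procedure over moras. The following Python sketch is illustrative only: the function name assign_stress and the encoding of a phonological word as a list of mora counts per syllable (1 for a short vowel, 2 for a long vowel or diphthong) are assumptions made here, not part of the original account.

```python
def assign_stress(mora_counts):
    """Return (primary, secondary) syllable indices for one phonological word.

    mora_counts lists the moras of each syllable, left to right
    (1 = short vowel, 2 = long vowel or diphthong). This encoding is an
    assumption of the sketch, not the notation of the source.
    """
    # Expand to one entry per mora, recording the syllable it belongs to.
    mora_to_syllable = []
    for index, count in enumerate(mora_counts):
        mora_to_syllable.extend([index] * count)
    total = len(mora_to_syllable)

    def syllable_of(k):
        # Syllable containing the k-th mora counted from the end, if any.
        return mora_to_syllable[total - k] if k <= total else None

    # Primary stress: syllable containing the second mora from the end.
    primary = syllable_of(2)
    # Secondary stress: syllables containing the fourth and sixth moras.
    secondary = sorted(
        s for s in (syllable_of(4), syllable_of(6)) if s is not None
    )
    return primary, secondary


# talanoa, treated as four one-mora syllables ta.la.no.a
print(assign_stress([1, 1, 1, 1]))  # prints (2, [0])
```

On this encoding the root talanoa receives primary stress on no and secondary stress on ta. Because the rule applies to each phonological word separately, a grammatical word made up of two phonological words would be passed to the function one phonological word at a time.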
Grammatical word
A grammatical word is centred on a root or a compound stem (combining two
roots) and may have prefixes and/or suffixes added to it. The components must
appear together, in fixed order, with the word having a conventionalised coherence and meaning (speakers will talk of the form and meaning of grammatical
words, called vosa).
Note that there are some examples of recursion within a grammatical word
(e.g. two occurrences of the prefix vaa-; see Dixon 1988a: 197–8). A grammatical word can generally be pronounced by itself.
Relation between grammatical and phonological words
(a) There are instances of a grammatical word consisting of two phonological
words:
(i) Compounds, e.g. sara.vanua (lit. look+at.place) 'tourist'.
(ii) In a productively or inherently reduplicated word (where at least two moras
are involved), the reduplication boundary within the grammatical word is
a phonological word boundary. For example butao 'to steal', buta.butao
'to steal constantly', where the penultimate stress rule operates independently within each phonological word. (And see the discussion of vowel
sequences under (a) in §6 above.)
(iii) Whereas one-mora affixes form one phonological word with the root to
which they are attached, multi-mora affixes constitute a separate phonological word within the one grammatical word, e.g. the root talanoa 'tell'
plus transitive suffix -ta'ina make up one grammatical word consisting of
two phonological words, tàlanoa.ta'ína 'tell (stories)'.
(b) There are instances of a phonological word consisting of two grammatical
words. For example, preposition i 'to, at' plus common article na form one
phonological word, ina.
If a single-mora grammatical word does not enter into a combination of this
type it becomes a clitic to an adjacent phonological word. That is, it does not
enter into the stress assignment pattern of the host word, but is added on, as an
extra unstressed syllable, after the stress rule has applied.
GRAMMATICAL WORDS     i    na    i-talanoa
PHONOLOGICAL WORDS    [i na i-] 1    [talanoa] 2
                      'in the story'
Here the phonological word inai consists of all of two grammatical words
(preposition i and article na) plus part of a third grammatical word (the prefix
i- from derived noun i-talanoa).
But note that if anything should intervene between the common article
(a or na) and a grammatical word beginning with the derivational prefix i-,
the i- becomes part of the same phonological word as the root to which it is
prefixed, i.e. grammatical and phonological words here coincide. For example:
(14)  GRAMMATICAL WORDS     a    ona    i-talanoa
      PHONOLOGICAL WORDS    [a=ona] 1    [i-talanoa] 2
Here the article a becomes a clitic to the following possessor ona 'his'. Note
that if this NP were preceded by the preposition i we would get i=na ona i-talanoa 'in this story', where i=na is one phonological word consisting of two
grammatical words (each a clitic) and ona and i-talanoa are each both one
grammatical word and one phonological word.
References
Aikhenvald, A. Y. 1996. Words, phrases, pauses and boundaries: evidence from South
American Indian languages, Studies in Language 20.487–517.
1998. Warekena, pp 225–439 of Handbook of Amazonian languages, Vol. 4, edited
by D. C. Derbyshire and G. K. Pullum. Berlin: Mouton de Gruyter.
1999a. Multiple marking of syntactic function and polysynthetic nouns in Tariana,
pp 235–48 of CLS 35, Part 2.
1999b. Review of Plank 1995, Studies in Language 23.447–54.
Forthcoming. Language contact in Amazonia. Oxford: Oxford University Press.
Sprigg, R. K. 1955. The tonal system of Tibetan (Lhasa dialect) and the nominal phrase,
Bulletin of the School of Oriental and African Studies 17.133–53.
Sweet, H. 1875/6. Words, logic and grammar, pp 470–83 of Transactions of the Philological Society for 1875/6.
Trubetzkoy, N. S. 1969. Principles of phonology, translated by C. A. M. Baltaxe. Berkeley and Los Angeles: University of California Press.
Ullmann, S. 1957. The principles of semantics. Oxford: Blackwell, and Glasgow:
Jackson.
van Wyk, E. B. 1967. Northern Sotho, Lingua 17.230–61.
1968. Notes on word autonomy, Lingua 21.543–57.
Vendryes, J. 1925. Language: a linguistic introduction to history, translated by P. Radin.
London: Routledge.
Waterson, N. 1956. Some aspects of the phonology of the nominal forms of the Turkish
word, Bulletin of the School of Oriental and African Studies 18.578–91.
Weinreich, U. 1954. Stress and word structure in Yiddish, pp 1–27 of The field of
Yiddish, 1, edited by U. Weinreich. New York: Linguistic Circle of New York.
Wells, R. S. 1947. Immediate constituents, Language 23.81–117.
Wierzbicka, A. 1996. Semantics: primes and universals. Oxford: Oxford University
Press.
1998. Anchoring linguistic typology in universal semantic primes, Linguistic Typology 2.141–94.
Wonderly, W. L. 1951. Zoque II: phonemics and morphophonemics, International
Journal of American Linguistics 17.105–23.
Žirmunskij, V. M. 1966. The word and its boundaries, Linguistics 27.65–91.
The term 'clitic' typically refers to a morphological element which does not
have the full set of properties of an independent (phonological) word, and
which 'forms a phonological unit with the word that precedes it or follows it'
(Matthews 1997: 56) for the purposes of accent or prominence assignment (see
Nevis, Joseph, Wanner and Zwicky 1994: xii–xx). Clitics also behave differently
from affixes. Sapir (1930: 70) remarked that 'enclisis is . . . neither true suffixation nor juxtaposition of independent elements. It has the external characteristics
of the former (including strict adherence to certain principles of order), the inner
feeling of the latter'.
The consensus appears to be that clitics are morphemes which are prosodically
deficient or unusual in certain ways. Criterial properties of clitics found in
the literature invariably include that they are loosely phonologically bound to
a word, or occur in second position in a clause (Klavans 1985: 117), or are
phonologically deficient.
This chapter has two distinct parts. In §1, I propose parameters which help
distinguish clitics from affixes, determine the nature of their similarity to other
morpheme types and define their independent properties in a given language.
These criteria suggest a scalar, or continuum-type, approach; that is, some
morphemes turn out to be more affix-like and others to be more word-like.
(In the Appendix, these parameters are compared to those which have been
proposed in the literature.)
Then, in §2, these parameters are applied to Tariana, an Arawak language
from Amazonia. Several subclasses of clitics, with different positions on a
clitic–affix continuum, are established, and arguments are provided in
favour of phonological words including clitics, and for clitic-only phonological words, as distinct from phonological words with no clitics. §3 is a brief
summary.
1
I am especially indebted to R. M. W. Dixon, Timothy J. Curnow, and Mauro Tosco, for insightful
discussion. Special thanks go to the members of the Brito family of Santa Rosa and Iauarete in
northwest Amazonia for teaching me Tariana, and to Pauline Laki, for revealing the beauty of
her native Manambu.
Both are termed 'clitic group' by Nespor and Vogel (1986: 149ff).
Alexandra Y. Aikhenvald
In this, and numerous other cases, the order of clitics in a clitic sequence is still the same. A
well-known exception is French il me=le=donne 'he gives it to me', and imperative Donne=le
moi 'Give it to me' where the order of arguments of the imperative is a mirror image of that of
a non-imperative verb; the imperative also requires a non-cliticised form of a personal pronoun
in the addressee function. The form of enclitics may not be the same as that of proclitics; this
is the case with pronominal enclitics in Portuguese (Vigário 1999).
(c) Type of host. Clitics can have a fixed position within a clause or an NP,
depending on purely phonological factors, or on grammatical properties of the
host, or they can be 'floating'.
(c1) Fixed position clitics can be classified according to two principles:
(i) phonological position regardless of the grammatical class of the host, and
(ii) position depending on the grammatical class of host.
(i) Clitics which are placed regardless of the grammatical class of their host
can be second-position clitics (in Wackernagel's position), as in Hittite. They
can be sentence-initial, e.g. Kwakwala determiners (Anderson 1992: 202),
or sentence-final, as in Wari (Chapacuran: Everett and Kern 1997: 355). In
Chamicuro (Arawak), enclitic definite articles which form part of the preceding
word (not part of the noun they modify: Parker 1999: 556) belong to this type.4
If clitics are positioned by phonological rules only, they cannot be category-specific and must lack any direct constituency with the host word, as stated
by Woodbury in §6.2 of chapter 3.
Phonological constraints on positioning a clitic may be complex. The emphatic clitic fa in Hausa (Zec and Inkelas 1990: 369–70) must occur immediately
after a phonological phrase. There are a few prosodic constraints on its positioning: it appears between a verb and a following object noun phrase only if
the noun phrase itself consists of more than one word; however, fa can occur
after any constituent which is intonationally emphasised.
(ii) Clitics whose position depends on the grammatical class of their host
can be:
• pre-head, e.g. pronominal clitics in Romance languages or in Macedonian
(Dimitrova-Vulchanova 1995: 75) attaching at the front of the finite verb; the
4
The placement of second-position clitics can be defined in two ways: either after the first
phonological word, as in Hittite or Kabyle, or after the first constituent. In Serbo-Croatian both
possibilities are attested; however, the second position is defined by a combination of phonological
and grammatical factors, since following the first content word is equivalent to following the
first phonological word (see discussion in Zec and Inkelas 1990: 367).
(c2) Floating clitics with no fixed position are attached to a particular constituent
under a special discourse or other condition. In Kannada (Sridhar 1990: 136,
257–8), the emphatic clitic =e: can occur with every type of constituent except
demonstrative adjectives. If it attaches to a noun phrase, it follows case marking
and postpositions, as in the following example (here and below, the clitics under
discussion are underlined).
(1)  gandhiy=e:     eddu                bandante
     Gandhi=emph    get.up+past.pcpl    come+past+thus
     'It was as though Gandhi himself got up and came'
5
Cf. clitic clusters in Macedonian (which are phonologically hosted by the preceding negation,
that is, form a phonological word with it) (Dimitrova-Vulchanova 1995: 75).
6
Some morphemes which are affix-like according to their prosodic and segmental properties can
show unusual syntactic behaviour. When an independent noun grammaticalises as a derivational
affix, it may still retain some of the syntactic properties of a free noun, that is, displaying a certain
mobility which may make it look similar to a clitic. Numerous Romance languages have a suffix
used to form adverbs from adjectives which comes from the accusative form of the Latin feminine
noun mens 'mind', -mentem, e.g. French -ment, Portuguese and Italian -mente. Synchronically,
in all these languages this suffix requires the feminine form of an adjective, e.g. French franche-ment, Portuguese franca-mente 'openly, frankly'. This suffix is productive in these languages.
Only in Portuguese and in Spanish does it display another unusual peculiarity which indicates
its connection with an independent word: it undergoes a process comparable to coreferential
deletion in a sequence of two adverbs, e.g. sabia- e prudente-mente (lit. wise- and cautious-ly)
rather than sabia-mente e prudente-mente 'wisely and cautiously'.
(2)  ka:rinoLagindal=e:             kaybi:sidaLu
     car+gen+inside+abl=emph        hand+wave+past+3f
     'She waved from inside the car itself'
With negated verbs, the emphatic clitic appears before the negation, attaching
to the preceding verb, as in ke:Lal=e: illa (listen+inf=emph neg) 'she didn't
listen'.
Floating clitics which mark speaker's attitude or focus are positioned close
to a constituent which is emphasised; see Sadock (1995: 267–8) for an example
from West Greenlandic. The interrogative clitic =ne in Latin displays floating
properties: it can attach to a word which is being questioned; if a clause is being
questioned, it attaches to the verb (Nespor and Vogel 1986: 161).
Floating clitics do not always express emphasis, focus or other discourse
categories. The clitic in Zoque which marks past tense may attach to members of
most lexical categories (Jan Terje Faarlund, p.c.). In Apurinã (Arawak: Facundes
2000), a number of floating clitics can attach to various grammatical classes,
depending, apparently, on which of them is focussed; these include a frustrative
marker, a predicative marker, two perfectives and an emphatic marker. Floating
verbal enclitics in Tariana go onto the predicate, unless another constituent is
in focus (see §2.4.2).
Languages can combine clitics of different kinds. Hittite has fixed second
position clitics and also emphatic floating clitics which go onto an emphasised
constituent. Warekena has sentence-initial proclitics, and also proclitics associated with a specific constituent (for instance, negation: Aikhenvald 1998).
See also discussion in chapters 3, 4, 8, 9 and 10 below.
The position of a clitic may correlate with a change of meaning. In Kannada,
the interrogative clitic =a is sentence-final if it marks a yes-no question; if a
particular constituent is questioned, it attaches after that constituent, and thus
behaves as a floating clitic.
(d) Relationship with phonological word. The lack of independent stress, and
incapability of forming a phonological word on its own, is typically considered
a definitional property of a clitic. As Zwicky (1985: 286) puts it, 'if an element
counts as belonging to a phonological word for the purposes of accent, tone or
length assignment, then it should be a clitic'. Clitics are described as underlyingly unstressed, as in European Portuguese where they do not even affect
the stress placement in a phonological word (Vigário 1999). Olawsky considers
this a definitional property of clitics in Dagbani (see table 2 in chapter 8).
Underlyingly unstressed clitics can acquire stress by phonological rules (and
see the discussion of accented enclitics in Siouan by Rankin et al. in §3.1 of
chapter 7). Clitics can take the stress of another clitic in a sequence in Modern
Greek (Joseph and Philippaki-Warburton 1987: 211–12, 243): when one or
at most two enclitics increase the distance between the stressed syllable and
the end of the phonological word, another stress is assigned to the penultimate
syllable of the whole unit; in example (3) a secondary (weaker) stress appears
on the clitic mù.
(3)  ose=mù=to
     give.impv.sg=me:gen.clit=it:acc.clit
     'Give it to me'
(e) Segmental and phonotactic properties of clitics. Clitics can differ from
affixes and from roots in their segmental structure and phonotactics. Like affixes,
they tend to be monosyllabic (as in Romance languages, or in Warekena).
Cliticised or accentless forms of disyllabic independent words in Bare are
monosyllabic. In Tariana (see below) clitics differ from affixes and from
independent roots in combinatorial possibilities of consonants.
Zeshan (in §3.2 of chapter 6) gives additional evidence for the loss of
syllabicity in clitics in sign languages. Here, a combination of a clitic and its
host behaves as one unit for the purpose of assignment of a suprasegmental;
this is analogous to the process of forming one phonological word in spoken
languages.
Clitics can differ from independent words. Tariana clitics are the only morphemes to form monosyllabic phonological words with a short vowel. In numerous Cushitic and Omotic languages of Ethiopia, clitics but not roots can
end in consonants (Mauro Tosco, p.c.).
(f) Phonological cohesion. Clitics may differ from affixes with respect to (i)
phonological processes at clitic–host boundary, (ii) processes on boundaries
between clitics, (iii) processes within clitics, and (iv) processes at the edges of
phonological words which include clitics.
It has been claimed that clitics are agglutinative, and subject to automatic
phonological rules only. This is indeed the case in a number of languages.
However, fusion across clitic-plus-clitic boundaries is found in the so-called
contractions in Romance and Germanic languages, e.g. French au or Portuguese
na or German zum from zu + dem. In Piedmontese, a Western Romance language of northwest Italy (Tosco forthcoming), cliticisation of pronouns results
in a variety of clusters which are not admissible word-internally, e.g. /dl/ in
cog-lo /kud=lu/ 'put him to bed', /t/ in specc-te /spe=te/ 'look at yourself
in the mirror', /dt/ in vard-te /vard=te/ 'look at yourself'. Geminated consonants in Piedmontese only occur on clitic boundaries, e.g. sete 'sit!' + -te '2sg'
gives [set:e] (Mauro Tosco, p.c.). In Manambu, word-initial h deletion accompanied by vowel fusion is specific for a clitic boundary: when the copula ha
gets encliticised to a noun in a copula complement function in rapid speech, h
disappears and vowel fusion takes place. Example (4) is a clause pronounced
slowly; (5) is the same clause pronounced quickly with h deletion and vowel
fusion:
(4)  n na-k          s       ha-l
     you:sgf-poss    name    copula-f
     'This is your name'
(5)  n na-k          sa=l
     you:sgf-poss    name=copula:f
     'This is your name'
(g) The relationship of clitics to pauses. Restrictions on how one can pause in
the middle of a word which includes clitics depend on what the clitics are. Clitics
which can be realised as full phonological words (simple clitics) often behave
(h) Combinations of clitics; and the status of words including clitics, and of
clitic-only words. A phonological word which consists just of clitics (a clitic-only word) may be similar to phonological words of other types; see the
example from Boumaa Fijian in §8 of chapter 1. But it can also be different
from phonological words of other sorts in its prosodic properties, and in the
ways in which it correlates with a grammatical word. There is ample evidence
in favour of clitic-only words. In Tariana, a sequence of a proclitic and an
enclitic forms a phonological word different from words of other types. In such
a word, the proclitic always bears the primary stress (that is, such a word always
has initial stress). And it cannot contain more than two enclitics; see §2.4.4.
In Warekena (Arawak: Aikhenvald 1998: 406) the sequence proclitic + enclitic(s) behaves similarly to clitics: it can form an independent phonological
word if focussed, as in (6),8 or it can be cliticised to the following verb, as
in (7).
(6)  ya=mia      yue=pia-ha    nima-ha        e-pi
     neg=perfve  for=neg-paus  3pl+with-paus  eat-obj.foc
     'He didn't have anything to eat with them (his children)'
(7)  ya=mia=ni-tse=pia-ha             daba   a=wa
     neg=perfve=3pl-know=neg-paus     where  go=nonacc
     'They did not know where to go'
8 Stress in Warekena is contrastive. Clitic groups are a subclass of words with stress on the first syllable. An independent phonological word consisting of a proclitic with a clitic can be used in a pausal form (marked with -hV: see discussion in Aikhenvald 1996).
demonstrates differences between words which contain clitics and those that
do not. (Also see Vigario 1999, for further evidence on the separate properties
of clitic-only words in European Portuguese, and the summary in Halpern
1998: 103).
A clitic group does not have to be coextensive with a syntactic constituent or with a grammatical word, which is to be expected of a prosodically defined constituent. Neither does it have to be coextensive with a phonological word of any other type. Woodbury, in §7.1 of chapter 3, demonstrates that a phonological word without clitics (PW ) can be interpreted as a subdomain of a phonological word including clitics (PW).
(i) Relative ordering in clitic strings. Clitics tend to attach to their host in an idiosyncratic order.9 In Hittite, the order of second position clitics is as follows: sentence connectives nu, ta quotative =wa(r)= dative/accusative plural third person nominative; accusative singular first, second person dative/accusative singular, third person dative singular reflexive local, emphatic particles (Friederich 1974).
Sentence-final clitics in Wari' (Chapacuran: Everett and Kern 1997: 355) fall into five position classes: (1) temporal particles; (2) the emphatic particles; (3) the referent particle quem; (4) the emphatic particle -ta; (5) the emphatic particles with restricted use (other than those in position 2).
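A position-class template of this kind can be pictured as a fixed ranking of slots. The sketch below is only an illustration under stated assumptions: the five slot labels paraphrase the classes just listed, quem and -ta are the only forms taken from the text, and the forms TEMP and EMPH, like the names SLOTS, order_clitics and is_well_ordered, are hypothetical placeholders.

```python
# Illustrative sketch of template (position-class) ordering of clitics.
# Slot labels paraphrase the five classes listed for sentence-final clitics;
# the helper names and the forms "TEMP" and "EMPH" are hypothetical.

SLOTS = [
    "temporal",             # (1) temporal particles
    "emphatic",             # (2) emphatic particles
    "referent",             # (3) referent particle quem
    "emphatic_ta",          # (4) emphatic particle -ta
    "emphatic_restricted",  # (5) emphatic particles with restricted use
]
RANK = {label: position for position, label in enumerate(SLOTS)}


def order_clitics(clitics):
    """Return (form, slot) pairs arranged by the position class of each slot."""
    return sorted(clitics, key=lambda pair: RANK[pair[1]])


def is_well_ordered(clitics):
    """True if the slots appear in non-decreasing template order."""
    ranks = [RANK[slot] for _, slot in clitics]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


cluster = [("-ta", "emphatic_ta"), ("TEMP", "temporal"), ("quem", "referent")]
ordered = order_clitics(cluster)
print(ordered)
```

A template of this kind records the order without explaining it, which is the sense in which such ordering can be called idiosyncratic.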
In Warekena, aspectual clitics (e.g. -mia 'perfective') in a clitic sequence are followed by relativiser -i, which is followed by the personal O/So enclitics. Figures 1 and 2 show clitic ordering in Tariana. And see §7 of chapter 7 on the rules for ordering clitics in Siouan languages.
The order of clitics can sometimes be explained semantically; for instance, indirect object clitics preceding direct object clitics, and first or second person clitics preceding third person clitics in Greek, could both be accounted for by a topicality scale (Haberland and Van der Auwera 1990). This principle could also account for the order of verbal clitics in Kabyle Berber (host, indirect O, direct O), which is the mirror image of the order of their independent counterparts (verb, direct O, indirect O).
Clitic ordering can also be accounted for by phonological weight; for instance, in Tagalog (Schachter and Otanes 1972), monosyllabic clitics precede disyllabic ones. In Tariana, heavy (two- or three-syllable long) clitics tend to attach to the first component of a serial verb.
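Ordering by phonological weight can be sketched as a stable sort on syllable count. This is a simplified illustration, not Schachter and Otanes' analysis: the Tagalog enclitics ko, na and naman are supplied here as examples, syllables are approximated by counting vowel letters, and the function names are hypothetical.

```python
# Illustrative sketch: lighter (monosyllabic) clitics precede heavier ones.
# Assumption: one vowel letter is counted as one syllable, which is only a
# rough approximation of the orthography.

VOWELS = set("aeiou")


def syllable_count(form):
    """Approximate the number of syllables by counting vowel letters."""
    return sum(1 for ch in form.lower() if ch in VOWELS)


def order_by_weight(clitics):
    """Arrange clitics so that forms with fewer syllables come first.

    sorted() is stable, so clitics of equal weight keep their input order.
    """
    return sorted(clitics, key=syllable_count)


cluster = ["naman", "ko", "na"]
print(order_by_weight(cluster))
```

Because the sort is stable, the sketch says nothing about the relative order of clitics of equal weight; any such ordering would need separate statements.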
But in a great many cases, explanations for clitic ordering are not readily available. This can be compared to idiosyncratic ordering of affixes in polysynthetic languages. In this respect, clitics show some similarities to affixes, but still form a class of their own.
9 This is also known as clitic clustering; see, for instance, examples in Dimitrova-Vulchanova (1995: 745).
(j) Position with respect to what can be defined as affixes. In most languages, clitics usually occur outside all affixes. However, enclitics may sometimes occur before suffixes, as in the Portuguese conditionals quoted in (a) above. In Albanian, pronominal clitics are proclitics to indicative forms and enclitics to the imperative; however, in the imperative they precede the plural inflection (Sadock 1991: 56). In Platense Spanish, plural inflection can occur after the clitic pronoun, as in tire=me-n=lo (throw=to.me:clitic-pl=it:clitic) 'throw this to me' (see Sadock 1991: 57).
If clitics have morphological categories of their own, these morphological markers can occur inside clitics (see Klavans 1979, for the discussion of examples). In some varieties of Brazilian Portuguese, the diminutive -zinho, which arose as the result of a reanalysis of an epenthetic z and the regular diminutive -inho (masculine), -inha (feminine), inflects for number and for gender, e.g. aquel-e-zinh-o (this-masc-dim-masc) 'that little one' (masculine), aquel-a-zinh-a (this-fem-dim-fem) 'this little one' (feminine), aquel-e-zinh-o-s (this-masc-dim-masc-pl) 'these little ones' (masculine). In West Greenlandic, the clefting demonstrative nasalises a final consonant of the word preceding it (unlike any suffix) (Sadock 1995: 264). And we will see in §2.4.1 that in Tariana some nominal clitics have partially suppletive number marking.
This special morphology can account for what only appears to be endoclisis, that is, derivational or other affixes intervening between clitics. This makes clitics appear similar to independent grammatical words. However, in most cases clitics have only a subset of the grammatical categories characteristic of full grammatical words, that is, they are both phonologically and grammatically unlike other words.
(k) The correlation of clitics with grammatical words. A clitic often constitutes a grammatical word; this is what defines simple clitics in Zwicky's terms (e.g. Slavic pronominal clitics, auxiliary clitics, question clitics, etc.; cf. Dimitrova-Vulchanova 1995). In Tiriyo, a Carib language from Brazil, all clitics are grammatical words (Meira 1999: 113). In Tariana, as we will see in §2, all the proclitics can be grammatical words, and most enclitics cannot be. And, as Matthews reminds us in §4 of chapter 11, the words in Greek originally called enclitics 'were, to repeat, words', or, alternatively, clitics 'were in certain cases roots'.
Clitics may have special morphology of their own (see (j)) and thus form independent grammatical words. This is the case for the Tariana nominal diminutive enclitic =tuki with a partially suppletive plural =tupe, or nominal past masculine =miki-i, feminine =miki-u, and plural -miki; these can be considered not only phonologically but also grammatically deficient, if compared to modifiers of other classes.
The mismatches between grammatical and phonological words (see §9 of chapter 1) are most often due to clitics which form a phonological word with their host, but can be considered a grammatical word in their own right. For instance, in Awa Pit (Barbacoan: Curnow 1997), postpositions and discourse clitics are enclitics to the last word of a noun phrase and/or of a clause, and are independent grammatical words; a phonological word containing clitics consists of more than one grammatical word. Similar mismatches are discussed for Cup'ik by Woodbury (see §8 of chapter 3), and by Henderson for Eastern/Central Arrernte (see §6.1 of chapter 4). Mismatches of this sort are by no means universal: Rankin et al. show (in §4 of chapter 7) that phonological words in Siouan languages are always coextensive with grammatical words, no matter whether they contain clitics or not.
A word including clitics, and a clitic-only word, may or may not be coextensive with a grammatical word (also see Nespor and Vogel 1986: 151–63). Of three kinds of words including clitics in Tariana, only one coincides with a grammatical word. Just one of the clitic-only words is also a grammatical word (see §2.4.4, and table 5 there).
(l) Syntactic scope of clitics. Clitics differ in their scope: a clitic marking negation or a polar question may have scope over an entire clause, while one marking emphasis or 'also' may have scope over a phrase or perhaps just over a word. The interesting scope effect of English n't when compared with its non-cliticised counterpart has been noted by Sadock (1995: 267): you can't stay means 'you must leave', and it does not mean 'you can [not stay]', that is 'you can leave'; whereas you cannot stay can have either meaning (at least for some native speakers). Fixed position clitics whose scope is a phrase are sometimes called phrasal affixes.
The scope of a clitic can be a whole clause or a sentence, e.g. the Kannada interrogative enclitic; or it can be a noun phrase, as in the case of cliticised adpositions. Fixed position clitics (see (c)) tend to have scope over the constituent to which they attach. As mentioned above, in Kannada, when the interrogative enclitic attaches to the end of the clause, it is used for questioning the whole clause; when it is used to question a particular noun phrase, it attaches to the end of this noun phrase.
However, this scope effect is not the exclusive property of a clitic. In Tariana, variability in placement of nominal suffixes results in scope change, e.g. [[nu-kapi-ma]-da] (1sg-hand-cl:side.of-round) 'one palm of my hand' and [[nu-kapi-da]-ma] (1sg-hand-round-cl:side.of) 'one side of my finger' (a similar example for Dyirbal is given in §7 of chapter 1). Only some verbal enclitics in position 19 (see figure 2) allow variability of ordering, (a) as a kind of afterthought (then there may be a pause) and (b) depending on what aspect of the activity is focussed on. In this respect Tariana clitics are somewhat similar to independent words in that change of order has to do with emphasis and focus.
(8)  di-sape=sin`a=sit`a=pit`a=[. . .]=nik`
     3sgnf-speak=rem.p.infr=perfve=repetitive=completive
     'He had completely (finished) speaking again' (one stresses the complete extent of the action as opposed to something different)
(9)  di-sape=sin`a=sit`a=nik`=[. . .]=pit`a
     3sgnf-speak=rem.p.infr=perfve=completive=repetitive
     'Again, he had completely (finished) speaking' (one stresses the repetition of the action as opposed to something different)
10 The auxiliary can cliticise onto the main verb (as in Malayalam), or onto the subject pronoun, as appears to be the case in a number of West African languages (Heine 1993: 76). This cliticisation may result in further evolution of clitics into affixes, and the development of new paradigms (e.g. tense paradigm in Hausa: Heine 1993: 76–7; and the development of an old conjugated auxiliary into a set of subject pronoun suffixes in Muskogean languages described in Haas 1977).
Adpositions are often cliticised. Case relations can be expressed with affixes, with adpositional clitics or with adpositions as independent phonological words. Affixes usually go on every word in a noun phrase, or they may appear at the rim of a noun phrase, or just on the head of a noun phrase, while clitics tend to appear just on the rim of a noun phrase (depending on how they are positioned; see (a) above).
Prosodic deficiency as a class-specific characteristic is typical for closed classes. However, there is no language with an open word class, every member of which can be prosodically deficient.11
It may appear that historically clitics represent an intermediary stage of a development path, from full words to full affixes. Historical and comparative studies show, however, that this is not the case; see §7 in chapter 7. We hypothesise that clitics with low selectivity (that is, those that can attach to any constituent, by phonological rules) will tend to be diachronically more stable than clitics which attach to some particular grammatical host. This requires further investigation which goes beyond the scope of this chapter.
Clitics often present orthographic problems. Established orthographies may fail to treat them in a consistent way. For instance, in Italian pronominal proclitics are written as separate words, but enclitics are written together with their host. Rankin et al. (chapter 7) demonstrate the difficulties which Ella Deloria, a linguist and a collaborator of Franz Boas, had in consistently writing enclitics in her native Dakota. Similarly, the Tariana tend to write some disyllabic clitics separately, and some together with their host; most monosyllabic clitics are not written separately.
We will now discuss properties of words and of clitics in Tariana.
2
2.1
11 A putative exception to this could be cliticised inflectional verb forms in Verb Second position, and cliticisation of the verb forms in certain environments (see Anderson 1993).
[Table 1: prefixes, suffixes, proclitics and enclitics compared for secondary stress and for their ability to form a phonological word and a grammatical word]
2.2
2.3
12 But note that 15 and 16 can be filled simultaneously, creating an instance of marking two syntactic functions: see Aikhenvald (1999b).
Prefix
1. Cross-referencing prefixes (A/Sa) (three persons in singular and in plural), or negative ma-, or relative ka-
2. ROOT
Suffixes
3. Thematic syllable
4. Causative -ita
5. Negative -(ka)de
6. Reciprocal -kaka
7. -ina 'almost, a little bit'
8. Topic-advancing -ni, or passive -kana, or purposive non-visual -hyu or visual -kau
9. Verbal classifiers
10. Benefactive -pena
11. Relativisers or nominalisers
Enclitics
Some categories (e.g. plural) can be marked recursively, that is, more than once, since enclitics in positions 6, 8 and 9 require an additional number marker, thus creating a situation comparable to endoclisis; see (j) above. In the following example, brackets show clitics which require a separate number marking: nha-ma-pe=[yan`a-pe]=[t`upe]=[m`ki] (they-cl:fem-pl=[pej-pl]=[dim:pl]=[nom.past:pl]) 'the little bad dead female ones'. These enclitics cannot occur as independent phonological words.
The structure of the verbal word is given in figure 2. A minimal verbal word consists of positions 1, 2 and 3. Position 1 has to be filled for transitive and intransitive active verbs. Other positions have to be filled if the corresponding meaning needs to be expressed.
Not all enclitics can co-occur; for instance, imperative does not co-occur with evidentiality and tense (full details are found in Aikhenvald 1999b and forthcoming). Variable ordering is allowed for position 19 (see examples (8–9) in §1), and this has a pragmatic effect. Suffixes never come in between enclitics, and categories are not marked recursively. Of all the verbal enclitics, only aktionsart enclitics can be used as independent phonological words when in contrastive focus; they are different from other enclitics in some other ways as well.
Tariana also has serial verbs and complex predicates (Aikhenvald 1999c) consisting of more than one grammatical and phonological word but occupying one predicate slot. Suffixes in positions 5–11 go onto the first grammatical word and characterise a serial verb construction or a complex predicate as a whole.
2.4 Clitics in Tariana
(10)  kay=na-ni    na-yha    nemhani=pidan`a
      thus=3pl-do  3pl-swim  3pl+walk=rem.p.rep
      'Thus (lit. having done thus), they drowned'
(11)  kay   du-ni    di-kesi-pe=nuk`u               du-kalite=pidan`a . . .
      thus  3sgf-do  3sgnf-friend-pl=top.non.a/s    3sgf-tell=rem.p.rep
      'After she acted this way (not any other way), she told his friends . . .'
[Table 2: prefixes (eleven) and proclitics (four) compared on parameters b. Selectivity, c. Type of host, d. Phonological word, e. Segmental properties, f. Phonological cohesion, g. Pauses, k. Grammatical word, l. Scope, m. Lexicalisation and o. Word class. Hosts listed: any sentence-initial word (ne 'then, there', kwe/kwa interrogative, kay 'thus'); numeral 'one', interrogative, verb (ne negative)]
ne, which is more selective than the others, if the proclitic is focussed and/or contrastive, as in (12). No more than two enclitics can attach to one proclitic; see §2.4.3.
(12)  ne=pidan`a       diha  ita-whya=n`e          dsa          d-nu
      then=rem.p.rep   the   canoe-cl:canoe=ag     3sgnf+go.up  3sgnf-come
      'Then the canoe came upstream'
The proclitic ne 'then, there' can attach to the nominal enclitic -nuku 'topical non-subject'; ne=nuk`u (then=top.non.a/s) is used if sequencing is topical, i.e. is what the story is about; see (13). It forms one phonological word and one grammatical word with the proclitic.
(13)  ne=nuk`u           nha   na-kuna             nheta=pidan`a
      then=top.non.a/s   they  3pl-take.with.hand  3pl+drag=rem.p.rep
      'And then they took (a piece of gold they found) with their hands'
Table 2 shows that proclitics are similar to prefixes in just two ways: in their segmental structure, and in the restriction against having a pause after them.
All four proclitics share the following properties:
• the capacity for forming an independent phonological word and a grammatical word;
• the lack of phonological boundary processes;
• their position with respect to affixes and enclitics.
They differ from prefixes with respect to other parameters.
2.4.2 Enclitics
Enclitics in Tariana are a largish, albeit heterogeneous, class. We first discuss nominal enclitics; then we consider predicate enclitics.
Nominal enclitics are highly selective fixed position clitics whose position depends on the grammatical class of their host (noun phrase). No nominal enclitic except for =tuki 'diminutive' can form an independent phonological and grammatical word. However, tuki has different properties as a nominal enclitic and as an independent word: it is in free variation with =tiki and distinguishes singular and plural only as an enclitic; when used as an adverb it can be repeated: tuki-tuki, tuki-tiki or tiki-tiki.
The morphemes =miki- 'nominal past', =yana 'pejorative' and =pasi 'augmentative' are good candidates to be grammatical words, since each of them requires number marking. The enclitic =miki- also requires gender marking. Each of these three clitics is a fusion of a number of grammatical elements, and satisfies criteria for a grammatical word. They could be grouped together with adjective modifiers since they show gender and number agreement. They are contrasted with nominal suffixes in table 3.
Predicate enclitics are contrasted with verbal suffixes in table 4. Tense-evidentiality enclitics (position 15) display low selectivity, while the aktionsart enclitics (position 17) are highly selective.13
Tense-evidentiality enclitics are floating. Their position does show some correlation with the grammatical class of their host. They can go onto any constituent in the clause, if it is in contrastive focus and preposed to the predicate. This is the case in (14), where the tense-evidentiality enclitic goes onto the first constituent. If no constituent is contrastive, they go onto the predicate.
(14)  maa-pei=sin`a             du-kalite
      good-cl:coll=rem.p.infr   3sgf-tell
      'She (mother) says good things (contrary to what a misbehaving girl might think)'
(15)  du-hwa=thep`           du-a     du-aphua=pidan`a
      3sgf-fall=into.water   3sgf-go  3sgf-dive=rem.p.rep
      'She (the girl transformed into a snake-woman) fell into water diving'
(16)  thep        di-uku         di-a
      into.water  3sgnf-go.down  3sgnf-go
      'Into water he went (contrary to all expectations)'
(17)  kan=nihk`a             na:?
      where=inter.vis.past   3pl+go
      'Where did they (the fish, or the girl) go?'
(18)  thep
      into.water
      'Into water'
13 There are several more classes of predicate enclitics which are not included here, for the sake of pedagogic simplicity; full details are provided in Aikhenvald (forthcoming).
[Table 3: nominal enclitics contrasted with nominal suffixes on parameters b. Selectivity, c. Type of host, d. Phonological word, e. Segmental properties, f. Phonological cohesion, g. Pauses, k. Grammatical word, l. Scope, m. Lexicalisation and o. Word class]
[Table 4: tense-evidentiality and aktionsart enclitics contrasted with verbal suffixes on parameters b. Selectivity, c. Type of host, d. Phonological word, e. Segmental properties, f. Phonological cohesion, g. Pauses, k. Grammatical word, l. Scope, m. Lexicalisation and o. Word class]
[Figure 3: continuum of enclitics between a suffix and a root: nominal enclitics; predicate enclitics (aktionsart enclitics, floating enclitics)]
verb, and thus are similar to suffixes. They can form a phonological word and a grammatical word of their own, if focussed. They can also form idiomatic combinations with the verb.
Tense-evidentiality enclitics are floating. (Their unmarked location is the predicate.) Their selectivity is low, and they can form a phonological word with proclitics. They cannot form a phonological or a grammatical word on their own.
These six classes of enclitics can be plotted on a continuum14 between a suffix and a root, with respect to the combination of their properties (see figure 3).
2.4.4 Words with and without clitics in Tariana
We have seen that in Tariana, words containing clitics, and clitic-only words, behave differently from phonological words without clitics.
First, Tariana has two phonological processes which are clitic-specific; see (f) in §1. Aspiration floating is indicative of a clitic boundary within a phonological word which contains enclitics. Regressive vowel assimilation is indicative of the presence of a clitic and of the end of a phonological word which contains enclitics.
Second, when proclitics and aktionsart enclitics appear as independent phonological words, they are unlike any other phonological words in that they are the only instances of phonological words with a short vowel.
Tariana provides evidence in favour of the existence of the following kinds
of words containing clitics:
(1) A clitic-containing word consisting of an enclitic and a root (with or without affixes); it coincides with a grammatical word, e.g. du-hwa=thep` (3sgf-fall=into.water) 'she fell into water' from (15).
14 An additional piece of evidence in favour of ne=nuk`u (then, there=top.non.a/s) as a grammatical word comes from the slightly different variety of Tariana spoken in the village of
Table 5. Type of word; structure; example; coincides with a grammatical word
1. Clitic-containing. Structure: a root (with or without affixes) and an enclitic. Example: du-hwa=thep` (3sgf-fall=into.water) 'she fell into water' in (15). Grammatical word: yes.
2. Clitic-containing. Structure: a floating enclitic and a word to which it is attached. Example: kan=nihk`a (where=inter.vis.past) in (17). Grammatical word: no.
3. Clitic-containing. Structure: a clause-initial proclitic and the following word to which it is procliticised. Example: kay=na-ni 'thus they did (having done thus)' in (10). Grammatical word: no.
I. Clitic only. Structure: proclitic ne 'then, there' and enclitic =nuk`u 'topical non-subject'. Example: ne=nuk`u in (13). Grammatical word: yes.
II. Clitic only. Example: ne=pidan`a (then=rem.p.rep) in (12). Grammatical word: no.
(II) A clitic-only word consisting of a proclitic and one or two floating enclitics, e.g. ne=pidan`a (then=rem.p.rep) in (12). It does not coincide with a grammatical word: ne is one grammatical word, and =pidan`a forms part of the predicate, which is a different grammatical word.
Table 5 summarises the properties of clitic-containing and clitic-only words in Tariana.
3
The two groups of enclitics (nominal and verbal) differ in how similar they are to suffixes and to independent roots. They fall into several subclasses, depending on whether they can occur as a grammatical and as a phonological word, and whether they have nominal grammatical categories of their own.
Predicate enclitics have the predicate as their preferred location. Fixed position aktionsart enclitics are similar to suffixes. Floating tense-evidentiality enclitics are unlike suffixes and unlike aktionsart enclitics in that they are capable of forming a clitic word together with a proclitic.
We have also demonstrated the existence of at least three different kinds of
clitic-containing words and of two kinds of clitic-only words in Tariana.
The following properties are criterial for distinguishing words with clitics from phonological words of other types:
• their relationship with stress (parameter d);
• their segmental properties (parameter e);
• their phonological cohesion, that is, phonological processes applying within them or on their boundaries (parameter f);
• their internal structure (whether they contain just clitics, or other elements as well) and their ability to combine with other clitics (parameter h);
• their ordering with respect to one another and to affixes (parameters i and j);
• and their correlation with grammatical words (parameter k).
All this, and especially the existence of clitic-specific phonological processes (see §2.4.4), provides a rationale for a typology of clitics.
An alternative approach would be to restrict oneself to recognising only words and affixes, as suggested by Joseph in chapter 10. But whether by doing this we gain in elegance and precision remains an open question. As Matthews points out in §4 of chapter 11, it may simply be that, in a particular language, a few forms which are either affix-like or word-like nevertheless do not have every property that affixes or words in general do have. Various parameters suggested above demonstrate 'how much might all clitics have in common' (in Matthews' words).
A continuum between words and affixes advocated in this chapter covers clitics of different kinds. It is designed to present a clearer view of the 'messy reality' (in Joseph's words) than a simplistic binary division of units into words and affixes.
Appendix Additional issues concerning clitics, and parameters suggested for distinguishing between clitics and affixes
A great amount of literature has appeared on various issues concerning clitics (see, for instance, Nevis et al. 1994, Anderson 1992, 1995, Sadock 1991, 1995, and Halpern 1998). This literature is not quite as vast as that on 'word'; however, I will not attempt a full survey. Here I assume that clitics have to be realised as morphemes; that is, I exclude any discussion of morphological processes which may be functionally reminiscent of clitics (see Anderson 1992, 1995: 223, for analysis of stress shift in Tongan, of mutation in Welsh, and of initial consonant mutation in some Algonquian languages; also see Ball and Müller 1992: 178–80, for a similar approach to Welsh).
Sadock (1995: 259) introduces clitics as items that appear to be positioned in syntax by ordinary principles, but at the same time show a non-syntactic fondness for attachment to a nearby word. He exemplifies a typical clitic with the English auxiliary clitic 'll, realised as a non-syllabic [l] after pronouns; syntactically, the linear position of the clitic is exactly the same as the functional equivalent free word will, but the enclitic must be suffixed to a particular morphological word type, a pronoun. However, this is not true of all clitics: as Zwicky (1977) pointed out, some so-called clitics are not positioned by syntactic rules, and some depend only phonologically on host words.
A cross-linguistic definition of clitics appears to be so problematic that Sadock (1995: 260) comes up with the following 'sociological' definition of clitics: 'A clitic is an element whose distribution linguists cannot comfortably consign to a single grammatical component', thus arguing that there is no natural class of clitics defined in terms of genuine grammatical properties.
The alternation between a bound morpheme and a free word in individual languages depends on various factors, one of them speech style or register (cf. I'll and I will in English); this is the basis for distinguishing between simple clitics and special clitics (Zwicky 1977, 1985; Anderson 1992: 200ff). A simple clitic is an element of some basic word class which appears in a normal syntactic position where its non-cliticised counterpart would appear, e.g. give 'em the plate; while a special clitic is an element whose position is determined by other rules, e.g. second position clitics.
Anderson (1992: 201) assumes that a simple clitic is merely a lexical item whose phonological form does not include assignment to a prosodic unit at the level of word (or some other appropriate unit that constitutes an essential domain of stress assignment).
What are loosely grouped under the notion of special clitics are more similar to affixes: for instance, the English possessive enclitic 's has exactly the same set of conditioned allomorphs as the plural inflection and third person singular present tense. Clitics appear to be less selective than affixes with respect to their host; there is often no close grammatical link between a clitic and its host, just a phonological association of convenience. For instance, the possessive 's in English attaches at the end of any NP no matter how long it is, and
[Table: properties of clitics grouped under II. Syntax, III. Semantics, IV. Phonology and V. Lexicon, with cross-references to parameters b, d, f, i, j, k, l, m, n and o]
properties are quoted, as Sadock (1991: 54) states, clear examples of clitics can be found in the world's languages that differ in respect to almost any one of the behaviours listed in I through V.
References
Aikhenvald, A. Y. 1995. Bare. Munich: Lincom Europa.
1996. 'Words, phrases, pauses and boundaries: evidence from South-American languages', Studies in Language 20.487–517.
1998. 'Warekena', pp 225–439 of Handbook of Amazonian languages, Vol. 4, edited by D. C. Derbyshire and G. K. Pullum. Berlin: Mouton de Gruyter.
1999a. 'The Arawak language family', pp 65–105 of The Amazonian languages, edited by R. M. W. Dixon and A. Y. Aikhenvald. Cambridge: Cambridge University Press.
1999b. 'Multiple marking of syntactic function and polysynthetic nouns in Tariana', pp 235–48 of CLS 35, Part 2.
1999c. 'Serial verb constructions and verb compounding: evidence from Tariana (North Arawak)', Studies in Language 23.479–508.
2000. 'Areal typology and grammaticalisation: the emergence of new verbal morphology in an obsolescent language', pp 1–37 of Comparative linguistics and grammaticalisation, edited by Spike Gildea. Amsterdam: John Benjamins.
forthcoming. A grammar of Tariana, from northwest Amazonia. Cambridge: Cambridge University Press.
Introduction1
The language
Cup'ik is the name of the variety of Central Alaskan Yup'ik that is spoken in Chevak, Alaska, 11 miles inland from the Bering Sea on southwest Alaska's Yukon-Kuskokwim Delta. In Chevak (pop. 800 in 1997) Cup'ik is spoken by practically all native people born before 1970 but by very few born after 1980. Correspondingly, English is spoken to some extent by most people born after 1940. Central Alaskan Yup'ik, encompassing several varieties, is indigenous to the entire Delta area from Unalakleet to Bristol Bay and has about ten thousand speakers. Together with Alutiiq to the south, Central Siberian Yupik on Saint Lawrence Island and in the Russian Far East, and Naukanski in the Russian Far East, it is a member of the Yupik branch of Eskimo. The other major branch is Inuit, spoken from North Alaska to Canada to Greenland in a range of varieties (Woodbury 1984).
1 I wish to thank Leo Moses, Mary Moses, Rebecca Kelly, John Pingayak, the late Joe Friday and many others in Chevak who have taught me what I know of Cup'ik over the years. I gratefully acknowledge support for my work in Chevak from the National Science Foundation (grants SBR 9511856, BNS 8618271 and BNS 8217785). Many thanks to Bob Dixon and Sasha Aikhenvald for their work in articulating the problem of the word and bringing together the group represented in this volume, many of whom have contributed comments that have helped me in the preparation of this paper.
Almost nothing, except specific morphological and lexical choices and
phonological variations, distinguishes Cup'ik from other Central Alaskan Yup'ik
varieties. There is also remarkably little difference in respect to most major
points raised in this paper among any Yupik language or Inuit variety, although
some minor but interesting differences can be noted.
All data are given in the standard Central Alaskan Yup'ik orthography. Symbols
have their IPA phonetic values except: vv = [f]; ll = [ɬ]; ss = [s]; g = [ɣ],
gg = [x]; r = [ʁ], rr = [χ]; c = [tʃ]; ng = [ŋ]; y = [j]; e = [ə]; and ' = gemination.
Also, voiced continuant symbols represent their voiceless counterparts in
clusters with other voiceless sounds, hence maligtellruanga 's/he followed me'
represents [malixtəɬχuaŋa].
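The orthographic conventions just listed can be sketched as a small transliteration routine. This is only an illustrative sketch, not part of the original chapter: the function names `tokenize` and `to_ipa` are hypothetical, the IPA values are the standard ones for the Central Alaskan Yup'ik orthography, the devoicing rule is applied only to g, r, v and l, and the gemination apostrophe is not handled.

```python
# Hypothetical sketch: orthography -> IPA for the conventions described above.
ORTHO_TO_IPA = {
    "vv": "f", "ll": "ɬ", "ss": "s", "gg": "x", "rr": "χ", "ng": "ŋ",
    "g": "ɣ", "r": "ʁ", "c": "tʃ", "y": "j", "e": "ə",
}
# Voiced continuants and their voiceless counterparts (assumed subset).
DEVOICE = {"ɣ": "x", "ʁ": "χ", "v": "f", "l": "ɬ"}
VOICELESS = {"p", "t", "k", "q", "tʃ", "f", "ɬ", "s", "x", "χ"}


def tokenize(word):
    """Split an orthographic word into phones, preferring digraphs."""
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in ORTHO_TO_IPA:
            phones.append(ORTHO_TO_IPA[pair])
            i += 2
        else:
            phones.append(ORTHO_TO_IPA.get(word[i], word[i]))
            i += 1
    return phones


def to_ipa(word):
    """Transliterate, devoicing continuants next to a voiceless sound."""
    phones = tokenize(word.lower())
    out = []
    for k, ph in enumerate(phones):
        neighbours = phones[max(k - 1, 0):k] + phones[k + 1:k + 2]
        if ph in DEVOICE and any(n in VOICELESS for n in neighbours):
            ph = DEVOICE[ph]
        out.append(ph)
    return "".join(out)


print(to_ipa("maligtellruanga"))
```

Here g is devoiced before t, and r is devoiced after ll, which is the cluster behaviour the paragraph describes for maligtellruanga.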
3 A thumbnail typology
Kitak   cali   apqerru  qanellren   cukaunak,
please  again  say      your.words  slowly
caperrnailngurnek  qaneryaranek  aturluten.
not.difficult      words         using
'Please repeat what you said slowly, using simple words.'
Grammatical word
(3)
Absolutive:        arnaq      'woman (S or definite O)'
Ergative:²         arna-m     'woman (A)' or 'woman's'
Ablative-modalis:  arna-meng  'woman (indefinite O)'; 'from the/a woman'
Locative:          arna-mi    'at the/a woman'
Vialis:            arna-kun   'via the/a woman'
(4)
Absolutive singular:  arnaq   'woman'
Absolutive dual:      arna-k  'two women'
Absolutive plural:    arna-t  'three or more women'
Besides the locative and vialis, there are several other oblique cases not shown
in (3). Nouns are also marked for the person and number of the possessor, if
any. This marking is present whether or not the possessor is overt:
(5)
(arna-m)      eni-i
woman-erg.sg  house-abs.sg+3sgP
'(the woman's)/her house'
(6)
(wii)   en-ka
my-erg  house-abs.sg+1sgP
'my house'
(7)
abs.sg (unpossessed)  qayaq        'kayak'
abs.sg+1sgP           qaya-qa      'my kayak'
abs.pl+3sgP           qaya-i       'his kayaks'
abs.du+3duP           qaya-gkek    'those two's two kayaks'
loc.sg+3reflexivesgP  qaya-mini    'in his own kayak'
vl.du+2sgP            qaya-gpekun  'via your two kayaks'
(8)
vl.du+2sgP
-g-pe-kun
-du-2sgP-vl
(9)
abs.pl+3sgP
-i
-abs.pl+3sgP
(10)
abs.du+3duP
-g-ke-k
-abs.du-abs.du-3duP
(11)
Indicative     tekit-uq     's/he arrived'
Interrogative  tekit-a      's/he arrived' (in WH question)
Optative       teki-lli     'may s/he arrive'
Participle     tekite-lria  '(then, surprisingly) s/he arrived'
Appositional   teki-lluni   'then (as in narrative) s/he arrived'; 's/he, arriving'
Consequential  tekic-an     'when s/he arrived'
Concessive     teki-ngraan  'even though s/he arrived'
(12)
(Arnaq)       qavar-tuq.
woman.abs.sg  sleep-ind.3sgS
'(The woman)/She is sleeping.'
(13)
(Wangkuta)  qavar-tukut
we-abs.pl   sleep-ind.1plS
'We are sleeping.'
(14)
(Arna-m)      (kaugpii-t)    tangrr-ai.
woman-erg.sg  walrus-abs.pl  see-ind.3sgA+3plO
'(The woman)/she saw (the walruses)/them.'
(15)
(Kaugpii-m)    (wii)    tangrr-aanga.
walrus-erg.sg  1sg.abs  see-ind.3sgA+1sgO
'(The walrus)/it saw me.'
5.1.3 Particles Particles are the third word category, defined by the
lack of any inflection. Syntactically, most of these words are either adverbs or
interjections:
(16)
keyianeng     'always'
unuk          'last night'
cali          'also; more'
tawa          'that's enough; now; then'
qa            'huh?' (forms yes-no question)
Kiiki!        'Hurry up!'
Uuminaqsaga!  'Darn!'
Enclitics are a subclass of particles. Like other particles they are uninflected;
however, they are phonologically dependent on another grammatical word.
They will be considered in §6, but for now let us take as the null hypothesis the
idea that enclitics are words just like any other particle.
5.2
Suppose we assume the following morphological rules defining the three grammatical word classes:
(17)
noun word      = noun base + noun inflection
verb word      = verb base + verb inflection
particle word  = particle base
These rules correspond to criterion (e) for the grammatical word in the introduction
('There will be just one inflectional affix per word') so long as we consider
the inflectional affix to be the formative-category bundle that makes up the complete
inflection as defined in §5.1. If these rules are correct, then for every noun
inflection and for every verb inflection, the unit preceding it should be definable,
respectively, as a noun base or a verb base. We now need criteria for this.
5.3
We will begin by observing the structure of bases and from there we will
use characteristics of that structure as our criterion for identifying the base
independently.
By (17), noun and verb bases should be identifiable by peeling off the inflection,
while particle bases should be identical with particle words. That is
what is shown in each of the example sets below. But these sets also show
that bases are derived recursively from more primitive bases via derivational
suffixation:
(18)
ivruci-t                        'waterboots' (abs.pl)
ivruci-li-uq                    'she is making waterboots' (ind.3sgS)
ivruci-li-sta                   'waterboot maker' (abs.sg)
ivruci-li-ste-ngqer-tut         'they have someone to make (them) waterboots' (ind.3plS)
ivruci-li-ste-ngqer-sugnait-uq  'they definitely don't have anyone to make them
                                waterboots' (ind.3sgS)
(19)
quuyurni-uq                       's/he is smiling' (ind.3sgS)
quuyurni-art-uq                   's/he is smiling quickly'
quuyurni-arte-llru-uq             's/he suddenly smiled quickly'
quuyurni-arte-llru-yaaq-uq        's/he suddenly smiled quickly, but in vain'
quuyurni-arte-llru-yaaqe-llini-uq 'evidently s/he suddenly smiled quickly, but in vain'
(20)
Nakleng!      'Poor thing!' (Particle)
Nakl-urluq!   'Dear poor thing!' (Particle)
In (18) the most elementary base is a noun base meaning 'waterboot'; in (19)
it is a verb base meaning 'to smile'; and in (20) it is a particle base meaning
'poor thing'.
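The peeling-off procedure implied by (17) can be sketched in Python. This is only an illustration under stated assumptions: the input is a word already segmented with hyphens, as in (18)-(20), and the names `Analysis` and `peel` are hypothetical rather than part of the original analysis. The sketch does not identify morphemes by itself; it only shows how a segmented word divides into an elementary base, a recursive string of derivational suffixes, and (for nouns and verbs) one final inflection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Analysis:
    root: str                   # the most elementary base
    suffixes: List[str]         # derivational suffixes, in order
    inflection: Optional[str]   # None for particles, which are uninflected

    @property
    def base(self) -> str:
        """The full derived base: root plus all derivational suffixes."""
        return "-".join([self.root, *self.suffixes])


def peel(segmented: str, word_class: str) -> Analysis:
    """Peel the inflection off a hyphen-segmented word, following (17)."""
    parts = segmented.strip("!?.").split("-")
    if word_class == "particle":
        # particle word = particle base (no inflection)
        return Analysis(parts[0], parts[1:], None)
    if word_class not in ("noun", "verb"):
        raise ValueError("word_class must be 'noun', 'verb' or 'particle'")
    if len(parts) < 2:
        raise ValueError("a noun or verb word needs a base and an inflection")
    # noun/verb word = base + inflection; the inflection is the last element
    return Analysis(parts[0], parts[1:-1], parts[-1])


word = peel("ivruci-li-ste-ngqer-sugnait-uq", "verb")
print(word.base, "+", word.inflection)
```

Applied to the last form in (18), the sketch separates the derived base ivruci-li-ste-ngqer-sugnait from the inflection -uq; applied to a particle such as Nakl-urluq!, it returns a base with no inflection.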
The derivational process can be summarised as in (21), where the suffixes
are designated 'postbases' following the practice of most Yupik-Inuit specialists:
(21)
(22)
*li-uq                 's/he made (something)'
*yugnait-uq            's/he definitely didn't'
*llini-uq Peter-aq     'Peter evidently did'
(23)
pi-li-uq               's/he made it/one'
pi-yugnait-uq          's/he definitely didn't'
pi-llini-uq Peter-aq   'Peter evidently did'
Every grammatical word as defined by (17) and (21) can stand alone as a
complete utterance (except most enclitics, see §6). This is especially true in
actual practice due to a stylistic preference for ellipsis in answers to questions:
(24)
Ca?           'What?' (noun, abs.sg)
Unuk?         'Last night?' (Particle)
Qai-ngatnun.  'On top of them.' (e.g., answering 'Where does this go?')
              (noun, terminalis.s+3plP)
Pingremi.     '(Yes,) even though he did.' (verb, concessive.3reflexive.sgS)
(25)
Ene-m         aki-an-et-ut.
house-erg.sg  opposite-loc.sg.3sgP-be-ind.3plS
'They are at the opposite side of the house.'
(26)
Ene-m         aki-ani               et-ut.
house-erg.sg  opposite-loc.sg.3sgP  be-ind.3plS
'They are at the opposite side of the house.'
In (25), the base aki 'opposite' is followed by a locative case inflection, giving
what should be a complete word according to (17); that word, however, is followed
by a postbase -et- 'be' and an indicative mood verb inflection, something
licensed by neither (17) nor (21), whether or not aki-an- 'at the opposite side' is
taken to be a word by itself. A further problem with (25) is that an ergative case
NP should not appear as an argument in an intransitive clause. By contrast, (26)
is well-behaved, dividing into three properly inflected words, with the ergative
NP as the possessor argument not of a verb but of an independent locative case
noun. In Cup'ik, (25) is the common construction, whereas (26) is somewhat
archaic. In most varieties of Yupik, (25) appears to the exclusion of (26).
Evidently, (25) is a rare exception to the claim that stems such as et- 'be' are
never grammaticalised as postbases, for this is indeed what has happened in the
case of this verb. It is reasonable to assume that this construction is licensed by
a minor rule of grammar, as follows:
(27)
Some support for this claim is provided by the ungrammaticality of (28) and
(29), which are permutations, respectively, of (25) and (26):
(28)
*Akianetut enem.
(29)
(29) shows that the entire locative NP enem akiani 'on the opposite side of the
house' is an inseparable unit in syntax; and (28) shows that that inseparable
unit is, apparently, imported wholesale into the morphology in (25), where it
seems to function as a base. Thus, (27) introduces a paradox where a phrase is
embedded into a word. (The phonology resolves this grammatically introduced
paradox in its own way: it treats as separate words the possessor and the
possessum-plus-postbase-plus-ending unit.)
There is independent evidence for this same process in another minor construction:
(30)
Maklagaa-m          citug-tur-tuq.
bearded.seal-rl.sg  nail-eat-ind.3sgS
'S/he is eating fermented bearded seal flipper.'
(31)
Maklagaa-m          citu-i.
bearded.seal-rl.sg  nail-abs.pl+3sgP
'fermented bearded seal flipper' (lit. 'bearded seal's, its nails')
Example (30) is like (25) in that it appears that the postbase -tur- 'eat' is in
construction with a possessed NP that would appear independently as (31); and
the order of the words in (30) cannot be reversed (cf. (28)). It is different from
(25), however, in three important respects: (a) the postbase -tur- 'eat' cannot
appear as an independent verb; (b) -tur- is added here to the noun base citug-
'nail', not the inflected noun citui 'its nails'; and (c) -tur- only appears to form
constructions of this kind when the NP in question is an idiom: note that the
literal meaning of (31) is merely 'bearded seal's nails', not 'fermented bearded
seal flipper' (a specific food preparation). Normally, -tur- only occurs with an
unpossessed noun stem. The (minor) rule for (30) can thus be given as:
(32)
INFL indicates that the whole NP head is uninflected (i.e., maklagaam citug-
rather than inflected, as in (31)). The important point in common between (27)
and (32) is that both license limited embedding of phrases within words.³
5.6 Summary
In this section we have given criteria for the grammatical word based on inflection;
on the notion of the base; and on the grammatical word as stand-alone in
utterances and phrases. I have also discussed minor, but problematic constructions
where phrases appear to be embedded into words as bases. It is worth
pointing out that our formulations in (17) and (21) ensure the two principal criteria
for the grammatical word proposed by Dixon and Aikhenvald in chapter
1: that elements within the word occur together (criterion (a)) and in a fixed
order (criterion (b)). (17), as noted already, also ensures that there will be only
one inflectional affix per word (criterion (e)), except in those limited cases
where phrases are embedded into words. Example (21), because of its recursion,
implies a contradiction of criterion (d), that in some languages at least,
morphological processes tend to be non-recursive. It has been noted that the
grammatical word, so defined, functions as an utterance (criterion (g)). As for
the remaining criteria, I found it difficult to say much about conventionalised
coherence and meaning (criterion (c)), a notion I found difficult to apply; and I
defer discussion of the placement of pauses (criterion (f)) to the discussion of
the phonological word in §7.
3
There are different, but related constructions in Cup'ik and all other Yupik-Inuit languages in
which the grammatical definition of word raised in this section is not violated at all, but where,
nevertheless, a base and postbase function as independent syntactic atoms or syntactic words.
For extensive discussion, see Sadock 1980, 1985, 1991 and Woodbury 1981, 1996.
In all, the above shows that the grammatical word is an extremely clear and
robust concept, clouded only by a (remarkably) limited degree of phrase-in-word
embedding. It is likely due, more than anything, to the pervasiveness of the
inflectional system and the virtual absence of compounding in the derivational
morphology.
6 Enclitics in grammar
6.1 Enclitic grammar
Following are some principal Cupik enclitics (there are about a dozen in all):
(33)
=am
Several enclitics can also occur as independent words; this simply means that
they have lexical entries both as enclitics and ordinary particles. =i occurs only
with a specific category of particle and is in that sense lexically hosted. Likewise,
some of the others are lexically hosted in the sense of forming particle idioms
in combination with certain particle words (never inflected nouns or verbs):
(34)
Otherwise, the enclitics have phrasal scope, occurring in phonological construction
with the first grammatical word of the phrase; such is the case for =llu
'and' in (35); =am, a marker of emphasis, in (36); and =gguq 'it is said' in both
examples:
(35)
Neqiviak=llu=gguq                  taun  imaicuunani
their.food.cache=and=it.is.said    that  it.was.never.empty
'And, it is said, that food cache of theirs was never empty.'
(36)
Tamatum=ggur=am                nalliini     tunucillget  makut,
of.that=it.is.said=emphasis    at.its.time  loons        these
qaraliinateng      taw. . .
lacked.coloration  then
'So it is said, at that time these loons lacked coloration then. . .'
(37)
Qalu=tuq       kan              tagullitgu!
dipnet=I.wish  that.down.there  may.they.bring.it.up (optative)
'I wish they'd bring up that dipnet down there!'
(38)
Maqiyuilnguq=gga                              taw   agaani,
he.who.never.took.firebaths (participle)=voilà  well  across.there
ukani      tawaam!
this.side  instead
'(It was) he (who) never took his firebaths across there, (but did it) on
this side instead!'
(39)
Aana=qa   anuq?
mom=huh?  she.went.out (indicative)
'Did mom go out?'
6.2
(as, e.g., are case, mood and person-and-number agreement). Thus, they would
have to be post-inflectional elements of some kind.
However, as just shown, enclitics grammatically pertain to phrases and not
words; and because they are generally hosted by the first word in the phrase,
whatever that may be, they need not have any direct grammatical constituency
with the host word. Thus a word-plus-enclitic unit is not a grammatical unit
independent of the whole phrase in which it occurs.
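The placement generalisation just stated, that phrasal enclitics are hosted by the first word of the phrase whatever that word may be, can be sketched as follows. This is a hypothetical illustration only: `attach_enclitics` is not a name from the chapter, the phrase is assumed to be given as a list of grammatical words, and no segmental adjustment at the enclitic boundary is modelled (so the =ggur= variant seen in (36) is not derived).

```python
def attach_enclitics(phrase_words, enclitics):
    """Attach phrasal enclitics, in order, to the first word of the phrase."""
    if not phrase_words:
        raise ValueError("a phrase needs at least one word to host enclitics")
    # The host is chosen by position alone, not by its grammatical category.
    host = phrase_words[0] + "".join("=" + e for e in enclitics)
    return " ".join([host, *phrase_words[1:]])


# The phrase of (35): the enclitics =llu and =gguq land on the first word.
print(attach_enclitics(["Neqiviak", "taun", "imaicuunani"], ["llu", "gguq"]))
```

Because the function never inspects the host, it reflects the point made above: the word-plus-enclitic string is a by-product of phrase-level placement rather than a grammatical unit in its own right.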
But what about the so-called particle idioms of (34)? They are phonological
words and they are lexical entries, so why not call them grammatical words?
I see no reason to do so as long as they show regular enclitic phonology. It
is sufficient to consider them as idioms made up of two grammatical entities
(word particle plus enclitic particle) bound as one phonological word. To be
sure, many bona fide particles have their etymological origin as particle idioms,
but in these cases, the phonology is reanalysed to be that of a single particle word
(e.g. tawaam 'however' is presumably from tawa 'then' plus =am; but tawa=am
should be pronounced [tawa:am] whereas what occurs is [taw:a:m], the expected
result when footing rules are applied to tawaam with no enclitic boundary).
7 Phonological word
7.1 Prosodic criteria
The main prosodic phenomenon of Cup'ik is a set of foot formation rules determining
stress and other prosodic features. These rules form quantity-sensitive
iambic feet from left to right within a domain which all analysts have taken
to be the word. On the analysis assumed here, the final syllable of that domain,
as well as certain medial CVC syllables, are left unfooted and hence also
unstressed (for more details see Woodbury 1987 and Hayes 1995: 239–360).
Phonetically, the foot-final syllable is stressed and, if open, lengthened. In the
examples in this section, the first line marks feet with parentheses while the
second line gives the morphological segmentation:
(40)
(pi.ssu:.)(tu.lli:.)(ni.lu:.)ni
pi-ssu-tu-llini-luni
thing-hunt-always-apparently-appositional.3reflexive.sgS
's/he is apparently always hunting something'
(41)
(ma.llu:.)(ssu.tu:.)(lli.ni:.)lu.ni
mallu-ssu-tu-llini-luni
beached.whale-hunt-always-apparently-appositional.3reflexive.sgS
's/he is apparently always hunting beached whales'
(42)
(ang.)(yar.kag.)ka
angyar-kag-ka
boat-future-abs.du+1sgP
'my two future boats'
(43)
(paq.)naq.(sa.qu:.)na.ku
paqnak-saqu-naku
check-neg-appositional.3sO
'Don't check it!'
All the items in (40–3) are inflected forms of postbase-derived bases, i.e.,
grammatical words as we have been defining them. Example (41), in comparison
with (40), shows that the same postbase-and-inflection sequence gets different
footing depending on what is upstream of it in the left-to-right foot assignment
process. Moreover, it is clear that the footing of a word is not affected by the
footing of the prior word, for example, the words in (40–1) keep their footing
when juxtaposed as a phrase, in either direction:
(44)
a. (pi.ssu:.)(tu.lli:.)(ni.lu:.)ni (ma.llu:.)(ssu.tu:.)(lli.ni:.)lu.ni
   's/he is apparently always hunting something, apparently always
   hunting beached whales'
b. (ma.llu:.)(ssu.tu:.)(lli.ni:.)lu.ni (pi.ssu:.)(tu.lli:.)(ni.lu:.)ni
   's/he is apparently always hunting beached whales, apparently
   always hunting something'
The entire system places certain conditions on the beginnings and ends of words
which can serve as diagnostics for both word-beginnings and word-endings.
For example, there is a rule that creates a unary foot from a word-initial (C)VC
syllable, as shown in (42–3), which in turn ensures that that syllable is stressed.
Thus no word can begin with an unstressed (C)VC syllable (although not every
stressed CVC syllable is word-initial, cf. (42)). As for word-endings, their
location is often deducible because of the already-mentioned ban on word-final
feet. Thus, no word can end with a stressed syllable. Furthermore, the presence of
two light unstressed syllables next to each other, as in (41) and (43), is always
the result of the failure of word-final iambic footing and is thus a sufficient
condition for the end of a word. There is also a rule which automatically creates
a unary foot from any (C)VV(C) syllable, except word-finally. Hence, any
syllable containing VV that is not stressed is necessarily word-final.
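The footing generalisations described in this section can be sketched procedurally. What follows is a deliberately simplified, hypothetical sketch and not the full analysis of Woodbury 1987 or Hayes 1995: the function name `foot` is invented, the input is assumed to be already syllabified with full stops, and the treatment of medial closed syllables (footed with a following closed syllable, skipped before an open one) is an assumption read off examples (42), (43) and (47).

```python
VOWELS = set("aeiu")


def is_closed(syl):
    return syl[-1] not in VOWELS


def is_long(syl):
    return sum(ch in VOWELS for ch in syl) >= 2


def foot(syllabified):
    """Assign left-to-right iambic feet to a syllabified domain."""
    syls = syllabified.split(".")
    last = len(syls) - 1
    pieces = []
    i = 0
    while i <= last:
        syl = syls[i]
        if i == last:
            pieces.append(syl)            # the final syllable is never footed
            i += 1
        elif is_long(syl) or (i == 0 and is_closed(syl)):
            pieces.append(f"({syl}.)")    # unary foot: VV, or initial (C)VC
            i += 1
        elif i + 1 < last and (not is_closed(syl) or is_closed(syls[i + 1])):
            head = syls[i + 1]
            if not is_closed(head) and not is_long(head):
                head += ":"               # open foot-final syllable lengthens
            pieces.append(f"({syl}.{head}.)")
            i += 2
        else:
            pieces.append(syl + ".")      # left unfooted, hence unstressed
            i += 1
    return "".join(pieces)


print(foot("pi.ssu.tu.lli.ni.lu.ni"))
print(foot("paq.naq.sa.qu.na.ku"))
```

Under these assumptions the sketch returns the footings given for (40)-(43). Because the only input is a syllable string, the same function can be run over a longer domain; it makes no attempt to model the separate behaviour of closed-open sequences at enclitic boundaries.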
Interestingly, the domain for all these processes, although definable entirely
on prosodic rather than grammatical terms, happens nevertheless to converge
exactly with the grammatical word as it was defined in (17) and (21).
However, this same system, with just a few modifications, extends to include
the phonological word plus enclitics. I will argue that the domain of the extended
system is in fact the phonological word proper (PW), while the domain
(45)
(pi.ssu:.)(tu.lli:.)(ni.lu:.)(ni.llu:.)gguq
(46)
(ma.llu:.)(ssu.tu:.)(lli.ni:.)(lu.ni:.)llu.gguq
These examples show that the ban on final footing is in fact a characteristic of
PW, and not PW⁻, since in each case the PW⁻ final syllable is now footed. It
also shows that left-to-right footing is a characteristic of PW. In other contexts,
PW shows some differences:
(47)
(ming.)qut.ka.mi
mingqut-ka-mi
needle-future-lc.sg
'at the future needle'
(48)
(Ming.)(qut.ka:.)mi? or (Ming.)(qut.kam.)mi
mingqut-ka=mi
needle-abs.sg+1sgP=what.about
'What about my needle?'
In this minimal pair (from Miyaoka 1985), (47) shows the PW⁻ pattern, in
which a closed–open syllable sequence, here qut.ka., resists iambic footing,
and hence there is no stress in the word beyond the initial syllable; cf. also
(43), where naq.sa. likewise resists footing and hence stress. Example (48),
in contrast to the otherwise-identical (47), contains an enclitic boundary and
hence shows the PW pattern: here the closed–open sequence qut.ka. is footed,
the syllable ka is stressed, and is further affected in one of two ways: either /a/
is lengthened or the syllable is closed by the gemination of the following /m/.
The pattern in (47–8) is general: a closed–open syllable sequence forms a foot,
if and only if it contains, or is adjacent to, an enclitic. Some enclitics, like =mi
in (48), permit either lengthening or gemination at the end of the open syllable;
some allow only lengthening; and some allow only gemination: cf. Jacobson
(1984: 619) and Miyaoka (1985) for the Yupik facts. In some sense, those allowing
only lengthening are most suffix-like, while those allowing only gemination
are uniquely enclitic-like. However, these subclasses seem to be partly determined
by the initial segment of the enclitic in question, but partly idiosyncratic;
moreover they do not correlate with sequential ordering among the clitics, ruling
out the possibility of intermediate levels between PW⁻ and PW.
7.2 Segmental criteria
(49)
qayar-kaq  'makings of a kayak'        (base-final uvular before postbase)
qayar-put  'our kayak'                 (base-final uvular before inflection)
qayaq=kiq  'the kayak, I wonder . . .' (word-final uvular before enclitic)
qayaq      'kayak'                     (absolute word-final uvular)
However, these same uvulars (and velars) are continuants when a following
enclitic is sonorant-initial; and they can be either continuants or stops when a
sonorant-initial word follows:
(50)
qayar=mi    'indeed, a/the kayak'
qayar mana  'kayak, this one'
qayaq mana  'kayak, this one'
In all, then, base-final uvulars have one behaviour when followed within the
grammatical word by postbases or inflectional endings; another behaviour when
followed by an enclitic; and a third behaviour when followed by nothing. (The
pre-enclitic behaviour is also an option when another word follows in the
phrase.)
A second relevant phenomenon is the treatment of /t/+/t/ clusters arising
heteromorphemically within words versus the treatment of those arising between
a word and an enclitic or another morphosyntactic word:
(51)
mingqutten   [mNquttn]
mingqut-ten
needle-abs.pl+2sg
'your needles'
(52)
enait=tuq   [n:attuq] or [n:atuq]
their.houses=I.wish
'I wish their houses . . .'
(53)
enait         taukut   [n:attaukut] or [n:ataukut]
their.houses  those
'those houses of theirs'
(54)
Words:      angyaq 'boat', ii 'eye', una 'this', ena 'house'
Enclitics:  =am 'but', =i (interjectional with demonstrative adverbs)
This offers independent phonological support for the left boundary of the grammatical word. Furthermore, because internal syllables are C-initial and contain
no more than two vowel moras, no grammatical word shows a surface sequence
of more than two vowel moras. But three-mora surface sequences can arise
when a vowel-initial enclitic combines with a grammatical word ending in two
vowel moras:
(55)
angyaa=am
yaa=i
This is another respect in which word plus enclitic units are phonologically
unique.
Like the footing rules, the realisation of base-final uvulars, the treatment of
/t/+/t/ clusters, and the distribution of syllable types independently refer to the
beginning or the end of the grammatical word, as defined in §5. But they also
refer to the superordinate domain of word plus enclitic.
7.3
       grammatical unit                          process or constraint
PW⁻    Grammatical word
PW     Grammatical word plus enclitics, if any
To conserve space, I have excluded a demonstration that tonological and segmental
word-to-word sandhi phenomena earlier analysed as categorical are, in fact, gradient and
non-structure-preserving. Argumentation is given along the same lines for related Nunivak
Cup'ig, in Woodbury 1999.
(56)
Sekulartek  anngamek            angutek  atkukek         agutakek        amavet
2.teachers  when.they.went.out  2.men    their.2.parkas  they.took.them  over.there
'When the two teachers went out, they took the two men's two parkas
over there.'
For each word except the penultimate one, there were tokens in which it
was followed by a pause (and in elicitation, a pause was accepted there as
well).
Furthermore, if a pause or speech error occurs in the midst of a PW, the
speaker will go all the way back to the beginning of the word and start again,
as shown in the following taped instances from natural speech:
(57)
Akulit Akuliitnun!                              'To the side!'
Qanemci Qanemci Qanemci Qanemcikqaqataraqa un   'I'm going to tell a story . . .'
kenig kenillermun kanavet                       'to the firepit down there'
anelrar anelreraraama                           'when I was inching out . . .'
Amissaa Amissaagngama                           'When I found the door (amik)'
ataucit ataucitun ayuqngameng                   'as one, they are all alike'
Conclusion
Introduction¹
1
I am grateful to Bob Dixon and Sasha Aikhenvald for their comments on an earlier version of
this chapter; to my Arandist colleagues, especially Gavan Breen, for discussion of these matters
over the years; and to the Arrernte speakers who have helped me, especially Veronica Dobson,
Margaret Heffernan, Therese Ryder and Margaret Mary Turner.
In citation of a single morpheme/allomorph in the text the '+' symbol indicates (bound) root
or suffix status. In morpheme-by-morpheme word glosses it indicates an affix boundary. In
glosses, '=' indicates a clitic boundary and '-' indicates the boundary between elements of a
compound or contiguous elements of a complex verb. '-' has the same function in the orthography
but also marks clitic boundaries.
have been made in the educational context to select a standard term exclusively
referring to a word-level unit, without long-term success, e.g. angkentye akweke,
literally 'small unit of language'.
There is recognition of a word unit in the taboo on mentioning the personal
names of recently deceased people and people with the same name as the
speaker. This is extended to words resembling those names, applying most
commonly to nominal lexemes, especially the open class items. Personal names,
in particular, are usually replaced with the substitute word kwementyaye. Other
nominals are often replaced by a descriptive compound or phrase, for example
ake-arrirlpe 'pointy head' as a substitute for the word for 'crested pigeon'.
With regard to orthographic word status, there is variation both across writers
and within the work of individual writers whereby disyllabic or larger clitics
and compound elements are written as separate words, hyphenated or as a single
word:
(1)
water=comit    kwatyakerte  kwatye-akerte  kwatye akerte
remember+pres  itelareme    itele-areme    itele areme
Phonological word
This section introduces the basic criteria for the phonological word in ECA.
More complex issues, including mismatch with grammatical word, will be discussed
in §§6–7. The phonological word in ECA can only be discussed in
John Henderson
                    labial  lamino-  apico-    apico-        lamino-       velar  uvular
                            dental   alveolar  postalveolar  alveopalatal
stop                p       th       t         rt            ty            k
nasal               m       nh       n         rn            ny            ng
pre-stopped nasals  pm      thn      tn        rtn           tny           kng
lateral                     lh       l         rl            ly
approximant                                    r             y                    h
tap/trill                            rr
number of            Plural                Dual   Reciprocal
preceding syllables  Common    Uncommon
odd                  +errirr   +err        +err   +err
even                 +irrer    +irr        +irr   +irr
>1                   +ewarr
Allomorphy of the Reciprocal verb suffix and part of the allomorphy of the
Dual and Plural verb suffixes depend on the number of syllables found between
the beginning of the phonological word and the suffix, as shown in table 2.
There are restrictions on which forms can occur with various classes of stems,
and there are also other non-prosodically conditioned allomorphs. Assuming
binary footing from the beginning of the word, the prosodic conditioning can
alternatively be stated in terms of the position of the first syllable of the suffix
within the foot structure: non-head syllable of foot, head syllable of foot, or not
in the first or head foot. Note that the first and third conditioning environments
overlap: a three-syllable³ stem can take both the +errirr and +ewarr Plural
allomorphs, as in (2e) below.
Allomorphy of the Plural markers in table 2 is illustrated in (2). The Uncommon
forms are in free variation with their Common counterparts but occur less
frequently, and mostly in the speech of older speakers. The pseudo-orthographic
forms in parentheses indicate the syllabification, showing in (a) and (c) how this
takes into account underlying initial /e/, even though this vowel is not realised
in surface forms. In what follows, where I refer to syllables this is at the level
of the VC analysis unless otherwise indicated.
(2)
a. 'poke'
   Present:          th+eme (eth.em)
   Common Plural:    th+errirr+eme (eth.err.irr.em)
   Uncommon Plural:  th+err+eme (eth.err.em)
b. 'grind'
   Present:          ath+eme (ath.em)
   Common Plural:    ath+errirr+eme (ath.err.irr.em)
   Uncommon Plural:  ath+err+eme (ath.err.em)
c. 'swallow'
   Present:          kwern+eme (ekw.ern.em)
   Common Plural:    kwern+ewarr+eme (ekw.ern.ew.arr.em)
   Uncommon Plural:  kwern+irr+eme (ekw.ern.irr.em)
d. 'insert'
   Present:          akwern+eme (akw.ern.em)
   Common Plural:    akwern+ewarr+eme (akw.ern.ew.arr.em)
   Uncommon Plural:  akwern+irr+eme (akw.ern.irr.em)
e. 'leave for later'
   Present:          alwarrern+eme (alw.arr.ern.em)
   Common Plural:    alwarrern+ewarr+eme (alw.arr.ern.ew.arr.em)
                     alwarrern+errirr+eme (alw.arr.ern.err.irr.em)
   Uncommon Plural:  alwarrern+err+eme (alw.arr.ern.err.em)
3
Though a five-syllable stem preceding number marking appears to be theoretically possible, none
has actually been recorded.
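The prosodic conditioning of the Plural allomorphs can be sketched as a lookup keyed on the number of VC syllables in the stem. This is a hypothetical sketch resting on two assumptions that should be checked against the published table 2: that its rows read 'odd', 'even' and '>1', and that the Common Plural column lists +errirr, +irrer and +ewarr for those rows while the Uncommon Plural column lists +err and +irr. The function name `plural_candidates` is invented, and the stem-class restrictions and non-prosodic allomorphs mentioned in the text are not modelled, so the result is a set of prosodically possible candidates, not a prediction of the attested form.

```python
def plural_candidates(vc_stem):
    """Return prosodically licensed Plural allomorphs for a VC-syllabified stem.

    vc_stem is the stem in the VC analysis with '.' between syllables,
    e.g. 'alw.arr.ern' (assumed input format).
    """
    n = len(vc_stem.split("."))
    odd = n % 2 == 1
    common = ["+errirr"] if odd else ["+irrer"]
    if n > 1:
        common.append("+ewarr")       # the '>1' environment overlaps the others
    uncommon = ["+err"] if odd else ["+irr"]
    return {"common": common, "uncommon": uncommon}


for stem in ("eth", "ekw.ern", "alw.arr.ern"):
    print(stem, plural_candidates(stem))
```

For the three-syllable stem the sketch returns both +errirr and +ewarr as Common candidates, which is the overlap noted for (2e). For two-syllable stems it also returns +irrer, a form that (2) does not illustrate.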
3.2 Final vowels
[EtnmkNREnk] [EtnmkNREnk]
[etn m kNR enk]
/itn     emern  akngerr   inek/
itne     merne  akngerre  in+eke
3pl:erg  food   big:acc   get+past
'They got a lot of vegetable food.'
kwatye
water
3.3
alheme
go+pres /alh.em/
The play language Rabbit Talk (Turner and Breen 1984, Breen 1990) involves
a number of processes which obscure the standard form of words. Two of these
4
Though there are individual and dialect differences with regard to frequency of /a/ reduction.
Further, in some cases, a word with initial /a/ may correspond to a word in another ECA dialect
with initial //. In surface terms, the word in the rst dialect will show [] alternation in
citation form while the cognate form in the other dialect is phonetically consonant-initial in
citation form.
are of interest here. In the case of polysyllabic words, the first syllable of a
word is transposed with the remainder of the word, as for example the /amp/
syllable in (7a). This process may thus split a morpheme, as it does with the verb
roots in (7a–b). In the case of monosyllabic words, a syllable /ey/ is prefixed, as
in (7d–e). Although these two processes are obviously formally distinct, they
clearly have a common outcome: the first section of the morphological word
becomes the last section of the prosodic word, which is consequently no longer
in a linear correspondence with the morphological word.
(7)
a. moan+pres
b. smell+pres
c. 'that (mid)'
d. 'man'
e. 'Let's go!'
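The two Rabbit Talk processes described above can be sketched as operations on a word's VC syllables. This is a hypothetical illustration: the function name `rabbit_talk` is invented, the input is assumed to be an underlying form already divided into VC syllables with full stops, and the outputs are simply what the two stated rules yield, not attested Rabbit Talk forms.

```python
def rabbit_talk(vc_word):
    """Apply the two stated processes to a VC-syllabified word.

    Polysyllabic words: the first syllable is transposed with the remainder.
    Monosyllabic words: a syllable 'ey' is prefixed.
    """
    syls = vc_word.split(".")
    if len(syls) == 1:
        return ".".join(["ey", *syls])
    return ".".join(syls[1:] + syls[:1])


print(rabbit_talk("alh.em"))   # polysyllabic: first syllable moves to the end
print(rabbit_talk("er"))       # monosyllabic: 'ey' is prefixed
```

In both branches the material that begins the input ends the output, which is the common outcome the text identifies for the two formally distinct processes.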
The domain of these processes of the Rabbit Talk play language appears to
be the phonological word. However, there are complexes of morphemes which
Rabbit Talk variably treats as a single domain or as multiple domains: nominal
plus disyllabic case clitic and complex verbs. Both are discussed further below.
The value of Rabbit Talk evidence is limited by the restricted data available. It is
mostly only known by some older speakers, particularly from the northeastern
part of the ECA area, and is no longer in common use. It was used, by people
of all ages, for secrecy sometimes but mostly for humorous effect, including to
downplay the imposition in requests for food.
3.5 Stress
Stress in ECA is not always clear and consistent, particularly but not exclusively
in casual and/or extended speech. A thorough analysis still remains to be done
but some preliminary statements can be made with a fair degree of validity, or
at least optimism. Each phonological word bears a primary stress. Some basic
rules can be stated in terms of (VC) syllable structure: in words of two or more
syllables, the second syllable bears primary stress, as in (8a–d); in words of four
or more syllables secondary stresses may occur on alternating syllables after
the primary stress, as in (8d). Secondary stresses are more likely in citation
forms. A final vowel is not stressed except in Imperative forms of verbs and
where it is the only surface vowel of the word, as in (8e). In words of the
underlying form VC(C) where V is /a/, /i/ or /u/, there is dialectal variation between
placing primary stress on the initial vowel or the predictable final vowel, as
in (8f).
(8)
a. merne        'food'         /emern/       [m]
b. atherrke     'green'        /atherrk/     [t5Rk]
c. ampetyele    'back(wards)'  /ampetyel/    [mbl]
d. atekertneme  cough+pres     /atekertnem/  [tkt$m] [tktm]
e. re           3sgnom/erg     /er/          []
f. ampe         'child'        /amp/         [amp] [amp]
Primary stress may be attracted to word-initial /a/, /i/ or /u/ if the following
vowel is []. This is more likely if the consonant(s) of the first syllable are
coronal, especially apicals. The language name Arrernte, for example, can be
stressed [R] or [R] (Wilkins 1989: 94–5). Stress may also be attracted to the
initial /a/, /i/ or /u/ of clitics and compound elements. The question
of whether this is primary or secondary stress is discussed in §5 and §6.
A significant problem in the analysis of stress is that it is difficult to distinguish
between secondary stress within a phonological word and the varying
degrees of prominence associated with the primary stresses of words in a phrase.
It is thus difficult to determine in some cases whether two morphemes constitute
two phonological words in sequence within a phonological phrase or a
single phonological word in which an affix, clitic or compound element bears
a secondary stress.
3.6 Summary
In this section we have seen that the basic phenomena which characterise the phonological word in ECA are prosodically conditioned verb suffix allomorphy, the processes of Rabbit Talk, position and degree of stress, and the realisation of vowels at word margins. The significance of these criteria varies: the allomorphy is relevant only to verbs, while stress can be difficult to determine.
4 Grammatical word
There is no simple definition of grammatical word in ECA. Nominal morphology is limited to compounding, and limited suffixation in the pronoun system inasmuch as it is analysable.⁵ NP case is otherwise marked by case enclitics. In terms of conventionalised coherence (see chapter 1), speakers usually speak of nominal plus clitic sequences as single words, though disyllabic clitics are sometimes spoken of as separate from the nominal they attach to. However, this may reflect phonological word rather than grammatical word.
⁵ Pronouns distinguish person, number, case and optionally kin category. Suffixes marking case and kin category can be distinguished in some forms; other forms are suppletive. Only a subset of cases is distinguished in the basic pronominal forms (Ergative, Nominative, Accusative, Dative and Possessive), with other cases marked by case enclitics attached to the Dative forms.
Verbs and adverbs can bear affixation, always suffixes. For verbs, two classes of morphological elements can be recognised apart from the root, non-obligatory and obligatory morphology, in that order. The non-obligatory class includes markers of aspect, subject number and motion associated with the verb stem action. These markers include simple suffixes, compounded verb roots and combinations of both, and fall into several categories of incompatible elements. The obligatory class marks tense, mood and/or dependent clause relation. Each verb must contain at least one element from this class. The order of morphological elements of both classes within the verb is largely fixed, though there are alternative sites for some markers and very limited multiple marking, such as the Plural in (9).
(9)
unth+ilirr+erl+t-ap+erl-iw+eme
look.for+plural+cont.go1 +plural-cont.go2 +plural+pres
They are walking around.
Verb words have conventionalised coherence and meaning: suffixes are not usually spoken of by ECA speakers. The most common citation form for verb lexemes is the present tense form.
All this suggests that a verb word can be defined on the basis of cohesiveness, fixed ordering and conventionalised coherence. However, certain non-verbal morphemes may intervene at specific points within verb structures. For example, in (10a) the verb morphology is interrupted by the particle akwele 'supposedly', which can alternatively occur after the entire uninterrupted verb, as in (10b), with apparently the same meaning (though there may be subtle pragmatic differences). The particle can also occur with NPs and in isolation. This raises the possibility that the verb in fact constitutes two grammatical words. This phenomenon is discussed in more detail in §7.
(10)
5 Clitics
All clitics in ECA are enclitics. Wilkins (1989: 347–59) categorises them into three groups on the basis of the categories of hosts to which they attach:
NPs are recursive, with case marking indicating relations at each level of embedding within an NP as well as the clause-level function of the overall phrase. The order of case clitics therefore follows from the structure of the NP, as illustrated in (12). Restrictions on the possible sequences result from the applicability of particular cases to particular levels. Core grammatical cases such as Ergative are applicable only to the highest level of the NP since they indicate clause-level relationships (except that a relative clause may contain an Ergative-marked argument for the same reasons). A majority of the semantic cases can mark relations within an NP, and therefore can be followed by higher-level case clitics. However, these principles do not mean that the highest-level case clitic is final in an NP: some NP-internal non-case clitics may follow, for example =areye 'Plural'.
(12)
Although clitics and postpositional particles are typically phrase-final, in complex verbs some can occur either after the entire verb or after a first or non-final element of the complex verb, usually with no apparent difference in meaning. For example, the relative clause marker in (11) could alternatively occur after the Past tense marker.
Non-case clitics occurring with an NP generally have scope only over the preceding phrase, but some may also have scope over the whole clause, for example, =ekamparre 'first' in (13). Where they follow verbs, many non-case clitics seem to be ambiguous between scope over the verb and scope over the whole clause. For some clitics, the scope depends on the specific function. In its relative clause marker function, =arle may follow the verb or the first constituent or both but marks the status of the whole clause. In its focus marking function, it has scope only over the preceding phrase.
(13)
The emphatic clitics =ay, =ew and =eyew, and the interrogative clitic =ey, always occur last in a clitic sequence (but may be followed by a post-positional particle to which further clitics may be attached). There is some evidence that polysyllabic clitics, such as =ekamparre 'first', should be analysed as clitic complexes but there is insufficient space to deal with this here.
Wilkins (1989: 347) observes that it is often not clear whether certain morphemes are to be analysed as clitics or particles. The problematic cases are items which are exclusively post-positional (or at least not clause-initial). Others, which are post-positional but can also occur clause-initially and/or in isolation, are more clearly particles. Though any of the problem particles/clitics may clearly occur within the same intonational phrase as the preceding element, it is not always clear whether it constitutes a separate phonological word. This difficulty applies especially with disyllabic and larger forms. Stress typically falls on the second syllable of a disyllabic particle/clitic but, as noted above, it is difficult to decide whether the degree of stress is to be interpreted as secondary stress within a single phonological word, for example arlwekere-arènye 'women's camp+assoc', or a lesser degree of prominence associated with a primary stress on a separate word within a phrase. In the former case, it is not possible to attribute the location of the secondary stress in forms like arlwekere-arènye to the alternating stress rule which otherwise accounts for secondary stresses within phonological words. A separate rule dealing just with disyllabic clitics is required.
However, there is some evidence that disyllabic and larger clitics may constitute separate phonological words. First, as discussed in §3 above, retroflex consonants are optionally prepalatalised following /a/ in word-initial position but only prepalatalised after /a/ in non-initial position if they are also immediately followed by a heterorganic consonant. The clitics =arteke 'Semblative' and =artaye 'what about?' can be pronounced with prepalatalisation, which suggests the initial /a/ in these forms is word-initial.
Second, Rabbit Talk suggests that disyllabic case clitics, at least, constitute a separate phonological word. In (14a–b), the two elements count as separate domains: the nominal as a monosyllabic word which therefore undergoes /ey/ prefixation while the clitic constitutes a separate word which, being disyllabic, undergoes transposition. There is some inconsistency in the evidence though. One case has been recorded where the nominal and clitic constitute a single domain for Rabbit Talk transposition, (14c), and therefore constitute a single word. Although I do not have any evidence of this type of variation in a single form, it seems likely.
(14)                                   Rabbit Talk
a. irlpe-akerte     'ear=comit'     /ey.irlp.ert.ak/
b. ingwe-arenye     'night=assoc'   /ey.ingw.eny.ar/
c. iwenhe-aperte    'what=assoc'    /enh.ap.ert.iw/
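The role of the phonological word as the domain of Rabbit Talk can be made concrete with a short sketch. It assumes only what the discussion of (14) states: a monosyllabic domain takes /ey/ prefixation and a longer domain undergoes transposition, modelled here as moving the first VC syllable to the end (an inference from the forms in (14), not a full statement of the Rabbit Talk rules). The function names and the representation of a domain as a list of VC syllables are hypothetical.

```python
from typing import List


def rabbit_talk(domain: List[str]) -> List[str]:
    """Apply Rabbit Talk to one domain, given as underlying VC(C) syllables."""
    if not domain:
        raise ValueError("a domain needs at least one syllable")
    if len(domain) == 1:
        # Monosyllabic domain: /ey/ prefixation.
        return ["ey"] + domain
    # Longer domain: transposition of the first syllable to the end.
    return domain[1:] + domain[:1]


def rabbit_talk_phrase(domains: List[List[str]]) -> str:
    """Apply Rabbit Talk to each domain in turn and join the syllables."""
    return ".".join(s for d in domains for s in rabbit_talk(d))


# (14a): host and clitic as two separate domains
print(rabbit_talk_phrase([["irlp"], ["ak", "ert"]]))       # ey.irlp.ert.ak
# (14c): host and clitic as a single domain
print(rabbit_talk_phrase([["iw", "enh", "ap", "ert"]]))    # enh.ap.ert.iw
```

Under this sketch the difference between (14a–b) and (14c) follows entirely from how the input is divided into domains, which is the point the Rabbit Talk evidence is being used to establish.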
Monosyllabic clitics are stressed consistent with the primary and alternating stress rules, and are therefore taken to form part of the same phonological word as the host. However, the monosyllabic clitic =arle 'Focus/RC' can be pronounced with prepalatalisation, again indicating that the initial /a/ is word-initial and therefore that =arle constitutes a separate phonological word. That this is inconsistent with the facts of stress suggests a distinction between prosodic word and phonological word. The available Rabbit Talk evidence shows different behaviour by case and non-case clitics. Monosyllabic case clitics behave as part of the host word, for example yanh=ele 'there=loc' becomes anheleye. However, the only monosyllabic non-case clitic recorded behaves in a way that suggests it is a kind of unincorporated appendix to the phonological word. The particle=clitic sequence kele=arle 'finished=Focus' is rendered as lekarle. The clitic does not count as part of the phonological word as far as transposition is concerned, yet it is not treated as a separate word in that /ey/ is not prefixed.
The only other criterion for phonological word that is applicable to clitics is final vowels. A final vowel cannot occur on the host, which indicates that host and clitic are within the one intonational phrase. Monosyllabic clitics of the underlying form /eC/, such as the Dative case marker /ek/, behave differently to full phonological words of the same form, such as in (8e), since a final vowel is not necessary on the clitic when in intonational-phrase-final position. The behaviour of clitics attached to a monosyllabic word is more complex. This is discussed further in §6.
If, as some of the evidence above suggests, at least disyllabic and larger clitics
constitute distinct prosodic/phonological words, this is clearly at odds with the
usual notion of a clitic. I propose that such clitics can constitute phonological
words but only within a recursive phonological word structure conjoining the
phonological words of the host and clitic.
With regard to conventionalised coherence, the sequence of word plus clitic is typically spoken of by ECA speakers as a single word; clitics are not typically spoken of as separate words. Monosyllabic case clitics are typically not written as separate from the word they attach to, though monosyllabic non-case clitics and disyllabic case clitics sometimes are.
6
a. re-comit
u re-akerte urakerte
b. close-iv+pres twe irreme itwrreme
6.2
Two phonological words can align with a single grammatical word in compound nominals and total reduplications.
In reduplications, both parts bear a stress on their second syllable, with the first being stronger. The position of the stress in the second element cannot be attributed to the alternating stress rule since it is independent of the number of syllables in the first element, as demonstrated by (16a). Initial /a/ before a retroflex consonant can be pre-palatalised, as in (16a), indicating that the second element is a separate phonological word. If the element is greater than three syllables, in addition to the stress on the second syllable of the second element there will also be secondary stress on the second syllable after that, as in (16b). Reduplications of monosyllabic bases typically form a single phonological word.
(16)
a. arlatyeye
arlatyeye arlatyeye
b. arrernelhetyeke
arrernelhetyeke-arrernelhetyeke
The only relevant evidence available from Rabbit Talk involves reduplication of disyllabic elements, in which case it is not possible to distinguish whether the domain of transposition is the entire reduplicated form or each element individually.
However, Wilkins (1984) points out that there is a difference in stress between the reduplication iperte-iperte 'rough, holey' and the phrase iperte iperte 'deep hole' (iperte 'hole' and iperte 'deep'). I propose that this behaviour of reduplicated forms be attributed to a recursive phonological word structure in the reduplicated forms, the same as that proposed for host plus disyllabic clitics in §5. Both elements constitute distinct phonological words which are conjoined into a single higher phonological word.
For compound nominals, the details of stress position and degree are as for reduplications, as shown in (17). If the first element is monosyllabic, the compound may consist of either one or two phonological words, as in (17c).
(17)
a. antekerre-ikngerre        'southeast'
b. ipenye-apetyeme           'bird species' (lit. 'stranger coming')
c. ake-tapmwe akertapmwe     'back of head' (lit. 'head-mound')
The limited evidence from Rabbit Talk shows that at least some nominal compounds can constitute a single domain: mwerre-akngerre 'nice' (mwerre 'good', akngerre 'much') becomes rrakngerrem.
6.3
Where a monosyllabic word with underlying initial /e/ precedes another grammatical word with initial /a/, /i/ or /u/ within an intonational phrase, the initial vowel of the second word can combine with the first grammatical word to form a phonological word, with that vowel receiving a primary stress, as shown in (18). Pause between the two prosodic words seems not to be possible. This kind of overlap is optional but very common. It is clearest when the monosyllabic word is in initial position in an intonational phrase and the following word has initial /i/ or /u/. It is less clear where the initial vowel of the second word is /a/ because in that context /a/ can be very similar in quality to the final vowel on a monosyllabic word.
(18)
[t5IkoRk]
[t5.IkoRk]
[thlpw ]
[kwere]pw [arrerneke]pw [the]pw [ikwere]pw [arrerneke]pw
the      ikwere   arrern+eke
1sg:erg  3sg:dat  place+past
'I put (something) on it.'
(19)
[Enekl5g]
[.Enekl5g]
[netyeke]PW [alheke]PW [re]PW [inetyeke]PW [alheke]PW
[r]PW
re       in+etyeke  alh+eke
3sg:nom  get+purp   go+past
'S/he went to get (it).'
This phenomenon can be seen as the result of four things: (i) in words of the underlying form /e(C)C/, the only underlying vowel cannot bear stress, (ii) every phonological word bears a primary stress, (iii) sequences of (non-contrastive) word-final vowel and word-initial vowel are strongly dispreferred if not actually prohibited within an intonational phrase and (iv) core constituents of a clause tend to fall within a single intonational phrase. A similar outcome results from the processes illustrated in (15) and (17c).
7 Complex predicates
The complex predicate constructions present a number of issues for the definition of phonological and grammatical word in ECA. The basic facts are that these structures appear on some grounds to constitute a single grammatical and phonological word. Wilkins (1989) has described the verb types discussed below in this way, except for the Transitive and Intransitive Verbalisers which he describes as ambiguous between derivational suffixes and free verbs (1989: 216). However, other evidence suggests that all the complex verb predicates discussed below involve multiple grammatical and/or phonological words. Recall the discussion on (10a–b) above.
Six of the complex predicate types are discussed here:
(i) Suffix+root compounds
(ii) Lexical Complex Predicates
(iii) Transitive Verbaliser (tv) complexes
(iv) Intransitive Verbaliser (iv) complexes
(v) Attenuative verbs
(vi) Initial Separation
Each of these involves a division of the predicate into two parts which occur in fixed order and within a single intonational phrase. More than one of the types above can occur in a single complex verb, as demonstrated in (35). Types (i)–(iv) by default involve two phonological words, and when the two elements are contiguous, these phonological words are conjoined under a single higher-level phonological word. Their alternative occurrence as a single phonological word can be attributed to the optional flattening of this structure to a single phonological word at a single level.
The types of evidence which establish the word structure of complex predicates are presented in §7.1. The six complex predicate types are then discussed in these terms in §7.2.
7.1
(20)
alakenhe  re       ampe   akweke     mpwe   ulh+etyenh+ele        irr+entye-akngerre.
thus      3sg:nom  child  small:nom  urine  excrete+fut+samesubj  iv+nomlsr
'Little kids behave that way when they need to have a leak.'
The scale above is also roughly implicational. For example, the only complex verb type which permits dependent clauses also permits items from the preceding types. The scale also gives a rough indication of the relative likelihood of occurrence in complex verb types where more than one type of intervening material is possible: even where a broader range of intervening material is possible, items from the particle/clitic end of the scale are more likely. Particles and clitics also appear to be more basic in another way: when any of the other types of intervening material actually occur, it is very likely that they will be preceded by a particle or clitic.
As might be expected, there tend to be fewer intervening morphemes, rather than more. This suggests a markedness principle favouring the least disruption to a complex verb. It also makes it difficult to precisely determine the range of intervening material permissible for a given complex verb.
In most cases, the intervening material can alternatively occur elsewhere without a difference in meaning. For clitics and particles, the intervening position is typically equivalent to following the entire complex verb, though Wilkins
mpwar+ety-alh+err+eme
do+prior.motion+go+dual+pres
two go and then do
All this suggests two possibilities. (i) The second part optionally gets a secondary stress, similar to what appears to happen with disyllabic case clitics discussed above, even though the second part of a complex verb need not commence with a disyllabic morpheme. (ii) The second part constitutes a distinct phonological word, or at least initiates one. The lesser degree of stress could be attributed either to the second part being in non-head position in a phonological phrase or to the separate phonological words of the two parts being united within a higher phonological word (which is optionally reduced to a single flat phonological word). It is not possible to decide this on the basis of stress alone, but the related phonological phenomena of prosodically conditioned allomorphy and the Rabbit Talk processes support the second alternative.
7.1.3 Prosodically conditioned suffix allomorphy
The number of syllables in the first part of a complex verb does not count in determining prosodically conditioned allomorphy in the second part, as in (22). Any intervening material is similarly not taken into account. This suggests that the second part is not part of the same phonological word as the first or any intervening material.
(22)
akwaketye-ak+errirr/*ewarr+eme
put.arm.around1-put.arm.around2+plural+pres
'more than two put arms around (someone)'
The allomorphy does not vary with any of the apparent variation in stress. Even where there is no secondary stress perceived in the second part, allomorphy in the second part is invariably conditioned only by the preceding content of the second part. The variation in stress is therefore a relatively superficial phenomenon.
There is a small number of cases where allomorphy can alternatively be determined by the entire preceding verb, as in (23), but in that alternative the two parts do not show any other evidence of separate word status and appear to be in the process of being lexicalised. These mostly involve monosyllabic first parts.
(23)
a. aheye+ewarr/*errirr+eme
breathe1 -breathe2 +plural+pres
b. aheyangk-angk+errirr/*ewarr+eme
breathe+plural+pres
more than two breathing
7.1.4 Rabbit Talk processes
For the complex verb types for which there is evidence, Rabbit Talk varies between treating a complex verb type as a single domain, as in (24), and the two parts as separate domains, as in (25–6), though there is no evidence of this type of variation within a specific complex verb.
                                        Rabbit Talk
(24)  apek+erle-an+eme                  kerlanemap
      smash+cont+pres
(25)  arrern+etye-alp+eme               rnetyarre malp
      place+prior.motion-return+pres
(26)  akeme⁷+lhe-il+eme                 melhak mil
      up+tv+pres
at+elp-atak+eme
atten(redup+elp)-demolish+pres
start to demolish (something)
(28)
a. akw+elpe-akwaketye-ak+eme
atten(redup+elp)-put.arm.around+pres
b. akwaketye-ak+elpe-ak+eme
put.arm.around1 -atten(redup+elp)-put.arm.around2 +pres
start to put (your) arm around (someone)
⁷ This is a bound morpheme which occurs only with the Transitive and Intransitive Verbalisers.
7.2
ar+etye=arle          akwele  alh+err+eme
see+prior.motion=foc  suppo   go+dual+pres
'two supposedly go and then see'
The Attenuative may precede the entire complex verb, as in (30), but cannot
occur before the second part, as shown in (31). This suggests that the second part
does not constitute a word in the sense that a simple verb does. If it constitutes a
separate phonological word as the discussion above suggests, then the evidence
of the Attenuative suggests that it does not constitute a separate grammatical
word.
(30)
mpwelpe-mpwar+etye-alh+eme
atten(redup+elp)-do+prior.motion-go+pres
start to {go and then do}
(31)
mpwar+ety-alhelpe-alh+eme
do+prior.motion-atten(redup+elp)+go+pres
apan+erle=arteke
re
ap+em+ele
feel+do.along1 =sembl 3sg:erg do.along2 +pres+samesubj
like going along continuously feeling (its way)
The existence of the full verb ap+ 'go' in Kaytetye suggests that this was once the case in ECA. It is now restricted to the compounding +erl-ap and combination with +ety 'hither'.
(33)
ikerrke  anthurre  re       anteme=arle  re       iw+elh+eke
stick1   intens    3sg:nom  now=foc      3sg:nom  stick2+refl+past
'He's got himself really stuck now.'
(34)
itele      ware  ampe   nhenhe    ar+eke
remember1  just  child  this:acc  remember2+past
'(S/he) just remembered this kid.'
a. ap+elpe-apat+elhe-il+eme
atten(redup+elp)-be.stunned+tv+pres
b. apat+elhe-il+elpe-il+eme
be.stunned+elh-atten(redup+elp)-il+pres
start to stun someone
(36)
mperlk+elhe    anthurre  renhe    il+eme
be.white+elhe  intens    3sg:acc  tv+pres
'make it go really white'
7.2.4 Intransitive Verbaliser
Like its transitive counterpart, the Intransitive Verbaliser combines with a single nominal word or phrase, adverbs and certain clause types. It is the freest of all complex verb types with regard to intervening material, permitting all types including dependent clauses, as in (20) where the Intransitive Verbaliser forms a complex predicate with alakenhe 'thus', meaning 'behave that way'.
7.2.5 Attenuative
As already noted, the partially reduplicative Attenuative marker precedes a verb word, as shown in (37). Pronouns and certain particles and clitics have been recorded as intervening material, as in (38). An unusual aspect of this is that the reduplicant is therefore separated from its source by other morphemes. There is no Rabbit Talk evidence available.
(37)
at+elp-at+errirr/*ewarr +eme
atten(redup+elp)-burst+plural+pres
start to burst
(38)
kwatye     uyelpe            aneme=arle  uyerr+erlenge
water:nom  atten(redup+elp)  then=foc    disappear+diffsubj
'the water started to go away then'
In verbs which are also otherwise complex, there are certain limitations with the Attenuative. Intervening material between the unit consisting of the Attenuative plus the first part, and the second part of the verb appears to be limited to particles and clitics and the third person singular pronoun, as in (39). Both the Intransitive and Transitive Verbalisers can form complex predicates with a single nominal or a larger NP. The Attenuative can apply to a nominal in that context but only to a single nominal (including a compound or a single nominal bearing a case clitic), not to an NP consisting of more than one nominal, as in (40).
(39)
ingwelpe-ingwe=arle
irr+eme
atten(redup+elp)-night=foc iv+pres
starting to get dark
(40)
(Cf. (41) and (42).) Separation is subject to the constraint that the first morpheme of the resulting second part is not an element of obligatory morphology or the suffix part of a suffix+root marker. Morphological structure is not otherwise relevant. In particular, separation does not appear to require that specific morphemes initiate the second part. It may even split morphemes, both roots, as in (43), and suffixes, as in (44–5). Note that because the first part consists of a root or stem without obligatory morphology, it is homophonous with a Ø-marked Imperative form.
Initial separation is distinct from other complex verbs because the evidence of prosodically conditioned allomorphy indicates that the unseparated verb constitutes a single prosodic word, as in (46). Initial separation does not appear to be possible where the allomorphy unambiguously takes into account the entire preceding stem, the +ewarr form of Plural, but it is possible with the +errirr form which simply requires that an odd number of syllables precedes it in the stem and which therefore does not indicate whether the first part is taken into account.
(41)
artnerr+enh+eke
crawl+passing+past
(He) crawled off.
(42)
(43)
ateke   akwele  tn-eme
cough1  suppo   cough2-pres
'(She's) supposedly coughing.'
(44)
unth+err=arle-irr+etyarte
walk.around+plural1=focus-plural2+pasthab
'They used to walk around.'
(45)
arrerlk+ew=arle-arr+eme
look.white+plural1=foc-plural2+pres
'They look white.'
(46)
apern+elh+ewarr/errirr+eme
paint+refl+plural+pres
'They are painting themselves.'
Initial separation is related to two other phenomena. First, all verbs which can undergo initial separation also permit reduplication of the first two syllables, though not simultaneously, as in (47). Second, in some verbs which permit disyllabic initial separation, the Attenuative can alternatively occur immediately before what would be the beginning of the second part under separation, as in (48). The range of verbs where this applies is not yet clear.
(47)
aperne-apern+elh+eme
iter(redup)-paint+refl+pres
'(She's) quickly painting (her)self.'
(48)
aperne-lh+elpe-lh+eme
paint-atten(redup+elp)-refl+pres
'(She's) starting to paint (her)self.'
7.3
Summary
In this section we have seen that while complex predicates may show some evidence that they are single grammatical and phonological words, there is a range
of evidence that they constitute more than one grammatical and phonological
word. The process of Initial Separation takes an underlyingly simple verb word
and renders it into two grammatical and phonological words. The other complex predicate types are underlyingly complex, involving two grammatical and
phonological words.
8 Conclusion
As in many languages, it is difficult to precisely define phonological and grammatical word in ECA. A range of criteria have been proposed for both in this paper. With regard to phonological word, the criteria give fairly consistent results though there are difficulties in the description of stress level and position and some conflicting evidence in the prepalatalisation of single consonants in monosyllabic clitics. There is no simple definition of grammatical word in ECA. Phonological and grammatical word coincide in most cases but there are significant exceptions. Some phenomena are attributed to the conjunction of phonological words under a single higher-level phonological word. Complex predicates are complex in phonological and grammatical word structure and in the relationship between them.
References
Breen, G. 1990. The syllable in Arrernte phonology, ms.
Breen, G. and Pensalfini, R. 1999. Arrernte: a language with no syllable onsets, Linguistic Inquiry 30 (1).1–16.
Dixon, R. M. W. 2001. The Australian linguistic area, pp. 64–104 of Areal diffusion and genetic inheritance: problems in comparative linguistics, edited by A. Y. Aikhenvald and R. M. W. Dixon. Oxford: Oxford University Press.
Introduction
The small Arawá family of southern Amazonia (quite distinct from Arawak) consists of five extant languages: Dení, Kulina, Sorowahá, Paumarí and Madi (see Dixon 1999). The Madi language consists of three closely related dialects: Jamamadí (with about 190 speakers), Banawá (about 80 speakers) and Jarawara (about 150 speakers, spread over eight jungle villages). The description of Jarawara given here is based on materials gathered during six field trips, during 1991–99.¹
Jarawara is a highly synthetic language, basically agglutinative but with developing fusion (particularly in the gender-marking forms of inalienably possessed nouns; see Dixon 1995). There is a closed class of about fourteen adjectives, which only function as modifiers within an NP or as copula
¹ My major debt is to the Jarawara people who have welcomed me as a temporary member of their community, worked at teaching me their language, and answered all of my questions: Okomobi, Mioto, Soki, Kamo, Botenawaa, Kakai, Wero and others. Alan Vogel is collaborating with me on a grammar of Jarawara and we have discussed many of the points in this paper. Example (26) comes from a Dyirbal text told by Chloe Grant. Example (29) is from a Fijian story told by Falavia Matavesi and explicated by Josefa Cokanacagi and Inoke Soqooviti.
The details of Jarawara have been slightly simplified below, for pedagogic purposes. None of the extra complications which have been left unstated (or else just referred to in a note) would affect the points being made.
: indicates a phonological word boundary within a grammatical word; + indicates a grammatical word boundary within a phonological word; [ . . . ] enclose a predicate or NP consisting of two or more words, except when a predicate makes up a complete clause.
R. M. W. Dixon
Phonological word
Jarawara has just four vowels (i, e, a and o) with contrastive length, and eleven consonants: bilabial b, ɸ (written as f) and m; apico-dental t and n; apico-alveolar s and r (with allophones [ɾ] and [l]), lamino-palatal stop ɟ (written as j, with allophone [y]), dorso-velar k and w, and a nasalised glottal fricative written as h. There is also a glottal stop, ʔ (written as '), which only appears at certain boundaries, particularly at a phonological word boundary within a grammatical word (see below). Syllable structure is (C)V.
The stress rule operates in terms of moras, with a short vowel counting as one mora and a long vowel as two moras. Now in the related Banawá dialect, stress goes on syllables including the first, third, etc. moras from the beginning of the word, ignoring a word-initial V in a word with three or more moras (see Buller, Buller and Everett 1993); the Jamamadí dialect has a similar rule. It is likely that this was the stress rule at an earlier stage of Jarawara. However, Jarawara now has penultimate assignment: stress goes on syllables including the second, fourth, etc. moras from the end of the word. It will be seen that in a word with an even number of moras the stress rule is the same in all dialects, e.g. hósi 'sweet potato'. However, in a word with an odd number of moras, stress goes on odd-numbered moras in Banawá and Jamamadí, e.g. kóbajá 'white-collared peccary', but on even-numbered moras in Jarawara, kobája.
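The two mora-counting stress rules can be compared with a minimal sketch. The representation of a word as a list of (syllable, mora count) pairs and the function names are assumptions made for illustration, and the sketch deliberately leaves out the Banawá refinement that ignores a word-initial V in words of three or more moras.

```python
from typing import List, Tuple

Syllable = Tuple[str, int]  # (syllable text, moras: 1 for short V, 2 for long V)


def stress_from_left(word: List[Syllable]) -> List[int]:
    """Banawá-style rule: stress syllables containing the 1st, 3rd, ... mora
    counted from the beginning of the word. Returns 0-based syllable indexes."""
    stressed: List[int] = []
    mora = 0
    for index, (_text, moras) in enumerate(word):
        for _step in range(moras):
            mora += 1
            if mora % 2 == 1 and index not in stressed:
                stressed.append(index)
    return stressed


def stress_from_right(word: List[Syllable]) -> List[int]:
    """Jarawara rule: stress syllables containing the 2nd, 4th, ... mora
    counted from the end of the word. Returns 0-based syllable indexes."""
    stressed: List[int] = []
    mora = 0
    for index in range(len(word) - 1, -1, -1):
        for _step in range(word[index][1]):
            mora += 1
            if mora % 2 == 0 and index not in stressed:
                stressed.append(index)
    return sorted(stressed)


hosi = [("ho", 1), ("si", 1)]
kobaja = [("ko", 1), ("ba", 1), ("ja", 1)]
print(stress_from_left(hosi), stress_from_right(hosi))      # same result
print(stress_from_left(kobaja), stress_from_right(kobaja))  # different results
```

With an even number of moras the two functions pick out the same syllables, and with an odd number they pick out complementary ones, which is the contrast described for hosi and kobaja.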
Jarawara has a rich set of ordered phonological rules (all applying just within verbs) for various types of assimilation, blending and elision, and for morphophoneme realisation. A number of rules relate to the position of a syllable in a phonological word, counting from the left; we will here exemplify with just one (retaining the rule number from Dixon and Vogel ms.). It will be seen from the list in the appendix that all tense–modal suffixes begin with a syllable -hV-. We then have:
Rule 8a: the initial -hV- is omitted from a tense–modal suffix when (a) it is an even-numbered mora within a word,² and (b) the preceding vowel is a.
For example, the feminine form of the immediate past eyewitness tense–modal suffix is -hara. With three-mora and two-mora inflecting verb roots we get:
(1)
rule 8a applies:
be nished-IPef
ahaba-hara
ahaba-ra
(2) eat-IPef
tafa-hara
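Rule 8a lends itself to a small worked sketch. The Python below is illustrative only: the function names are hypothetical, a mora is approximated as one vowel letter (which assumes long vowels are written doubled), and the caller is assumed to pass only the material that precedes the suffix within the same phonological word.

```python
VOWELS = set("aeio")


def count_moras(form):
    """Count moras as vowel letters (assumes long vowels are written doubled)."""
    return sum(ch in VOWELS for ch in form)


def apply_rule_8a(preceding, suffix):
    """Return the surface form of a tense-modal suffix after rule 8a.

    preceding: hyphen-separated material before the suffix, within the same
               phonological word (e.g. 'o-tafa').
    suffix:    underlying suffix beginning with -hV- (e.g. 'hara').
    """
    stem = preceding.replace("-", "")
    starts_with_hV = len(suffix) >= 2 and suffix[0] == "h" and suffix[1] in VOWELS
    if not starts_with_hV:
        return suffix
    position = count_moras(stem) + 1  # mora number of -hV-, counting from the left
    if position % 2 == 0 and stem.endswith("a"):
        return suffix[2:]             # both conditions met: -hV- is omitted
    return suffix


for stem in ("ahaba", "tafa"):
    print(stem, "+ hara ->", stem + "-" + apply_rule_8a(stem, "hara"))
```

With ahaba the -ha- falls in the fourth mora and is dropped; with tafa it falls in the third and is retained, as in (1) and (2).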
2
It is interesting to note that the syllable which is elided would be unstressed on the
counting-from-the-left stress rule which applies in Banawa and Jamamadí and is likely to
have applied at an earlier stage in Jarawara. I refer to this as the 'underlying stress cycle';
it conditions many phonological processes in Jarawara. However, as just stated, the actual
stress assignment in modern-day Jarawara is on a penultimate counting-from-the-right principle.
3
Clause structure and verbal clauses
The basic elements of clause structure are (note that all elements are optional
except for the predicate):
(1) Clause-initial peripheral elements (discourse markers, peripheral NPs, subordinate clauses, etc.).
(2) Core NPs: S in an intransitive, A and/or O in a transitive clause (there are
ordering preferences but no ordering constraints; nothing concerning the
functions of NPs can be inferred from their ordering).
(3) Predicate, including obligatory pronominal reference to core arguments.
(4) Clause-final peripheral elements (peripheral NPs, subordinate clauses, etc.).
Verbs are classified according to two independent parameters.
Transitivity. Leaving aside copulas, each verb is one of:
(a) intransitive, e.g. -tafa- 'eat'; haa:haa -na- 'laugh';
(b) transitive, e.g. -iti- 'take off, pick up, marry'; tama -na- 'hold in
the hand';
(c) ambitransitive of type S=O, e.g. -mato- 'tie (tr); be tied (intr)';
baka -na- 'break off (tr); be broken (intr)';
(d) ambitransitive of type S=A, e.g. -awa- 'see, feel'; kobo -na- 'meet
(tr); arrive (intr)'.
Inflecting/non-inflecting. Verbs divide into:
(i) inflecting verbs, which themselves accept prefixes and suffixes,
e.g. -tafa-;
(ii) non-inflecting verbs, which do not themselves take affixes but
must be followed by an auxiliary verb which does, e.g. kobo -na-. (This
auxiliary is called AUXa, to distinguish it from other kinds of auxiliary,
which will be discussed below.) There are two auxiliaries: about a dozen
non-inflecting verbs take -ha- while the remainder (several hundred)
take -na-.
Compare:
(3)
o-tafa-ra
1sgS-eat-IPef
'I've just eaten'
(4)
kobo o-na-hara
arrive 1sgS-auxa-IPef
'I've just arrived'
In (3) 1sg prefix o- and immediate past eyewitness feminine suffix -hara
(reduced to -ra by rule 8a, since the -ha- is the fourth mora of the word and
is preceded by a) attach to the inflecting verb root, -tafa-, whereas in (4) they
attach to the auxiliary -na- of the non-inflecting verb kobo.
A non-inflecting verb and its auxiliary constituent are separate grammatical
words and also separate phonological words. Consider:
(5)
níki ka-na-ke
squeeze applicative-auxa-decf
'(she) squeezes (it)'
(6)
hatísa ti-na-hi
sneeze 2sgS-auxa-ImmPosimpf
'you sneeze!'
(7)
underlying: jowaba na-haro
surface:    jowaba na-ro
walk.in.single.file auxa-RPef
'(they) walked in single file'
If each of (5) and (6) were one phonological word then stress would go on
the last mora of niki and on the last and first moras of hatisa, since these are
even-numbered moras counting from the right of the word; the actual stress
patterns are níki and hatísa, i.e. on the penultimate moras of the non-inflecting
verb roots, showing that these are distinct phonological words. If (7) were one
phonological word then the -ha- of -haro would be the fifth mora, counting
from the left, and rule 8a would not apply to omit it. But the -ha- is omitted,
showing that it must be in an even-numbered mora counting from the left within
its phonological word; it is in the second mora of na-haro, which must be a separate
phonological word from jowaba.
4
Predicate structure
This is the most complex part of the grammar of Jarawara. We can recognise
eleven types of elements. They are, in order:
a First pronominal slot; obligatory in all transitive clauses; marks O;
b Second pronominal slot; obligatory; marks S or A;
c Prefixes:
c1 First prefix position: one of 1sg S/A o-, 2sg S/A ti- (both transferred
from slot b); marker of Oc hi-, as in (13); or to- 'away', as in (8) and (32);
c2 Second prefix position: applicative ka-, as in (5) and (16);
c3 Third prefix position: causative na- (on verb), niha- (on auxiliary);
d Verb root, inflecting or non-inflecting (predicate head); obligatory;
1sg
2sg
3sg animate
3 inanimate
1nsg.inclusive
1nsg.exclusive
2nsg
3nsg animate
owa
tiwa
e-ra
ota-ra
te-ra
mee or me-ra
oti
ee
otaa
tee
mee
The attested combinations are: far past non-eyewitness followed by reported; future followed by
immediate past non-eyewitness; irrealis followed by far past non-eyewitness, by immediate past
non-eyewitness or by recent past eyewitness.
(8)
b          c1-d-f1b-f2b-g                         h-j
ee         to-ka-tima-mina-haba                   ee-ke
1nsg.incS  away-in.motion-upstream-morning-futf   1nsg.inc-decf
'(We'll sleep here tonight and) we'll travel upstream in the morning'
Slot i involves one of two secondary verbs, neither of which may take
prefixes. If pronominal prefix 1sg o- or 2sg ti- is included in slot h before
a secondary verb, then the pronominal form jumps over the secondary verb and
attaches to the mood suffix in slot j. Example (9) has the 1nsg.exc form, otaa, in
slots b and h; this is a separate word and retains its place before the secondary
verb ama, in slot i. In contrast, (10) has pronominal prefix 1sg o- in slots b
and h; this jumps over ama in slot i and attaches itself to the declarative mood
marker, -ke, from slot j. We now get a (grammatical and phonological) word
o-ke, consisting just of a pronominal prefix o- and a mood suffix -ke, with no
intervening root.
(9)
b          d-g           h         i-j
otaa       jana-hamaro   otaa      ama-ke       [wara jaa]
1nsg.excS  grow.up-FPef  1nsg.exc  extent-decf  lake peripheral postposition
'We grew up at the lake'
(10)
b/c1-d-g           i       h-j
o-jana-maro        ama     o-ke      [wara jaa]
1sgS-grow.up-FPef  extent  1sg-decf  lake peri
'I grew up at the lake'
4.1
Verbal reduplication
(11)
atiO      [ti-mita-hi]
language  2sgA-listen.to-ImmPosimpf
'you listen to the talking!'
(12)
atiO      [mi.mita          ti-na-hi]
language  redup.listen.to   2sgA-auxb-ImmPosimpf
'you listen a bit to the talking!'
Note that the reduplication auxiliary (auxb) is distinct from the auxiliary of a
non-inflecting verb (auxa). In fact, -na- as the auxiliary of a non-inflecting verb
will normally be omitted once its affixes have been transferred to a reduplication
auxiliary, although there are circumstances under which it is retained.
4.2
X is added to a preceding inflecting verb or to the auxiliary (auxa) of a non-inflecting verb
X must be added to a special preceding auxiliary (auxd)
(III) auxiliary-bound suffixes (seven)
Types of suffix
This is an auxiliary-taking suffix, -rima -na-hi/-rama -na-ho, plainly involving the immediate
positive imperative suffix -hi/-ho as final element.
mee tafa-kanikima na-ra-ke
3nsgS eat-scattered auxc-IPef-decf
'they (arrived and spread out) and each ate in a different house'
(IV) The auxiliary-taking and auxiliary-bound suffix -wi 'do continuously' combines the
unusual properties of -kanikima (with respect to what follows) and of
-wahare (with respect to what precedes). Consider
(16)
jaraA   [owa   haa:haa  ka-na        na-wi      na-re-ka]
branco  1sgO   laugh    applic-auxa  auxd-cont  auxc-IPem-decm
'the branco laughed at me for a considerable time'
The final segment of this suffix is most appropriately represented as a morphophoneme I, i.e.
-waharI. The I is realised as i in an odd-numbered and as e in an even-numbered mora counting
from the left of the phonological word. This provides a further example of a phonological rule
applying on the underlying stress cycle (see note 2), counting from the beginning of a
phonological word. Other suffixes ending in the morphophoneme include -hatI 'do all day', in (24);
-hitI 'do all along the way', in (32); and -kI 'coming' and -rI 'on a raised surface', mentioned in §6.
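The realisation of the morphophoneme I can be sketched in the same way as rule 8a. The snippet below is a minimal illustration under stated assumptions: the function name is hypothetical, the input is a single phonological word with hyphens marking morpheme boundaries, and each vowel letter (and each I) counts as one mora.

```python
VOWELS = set("aeio")


def realise_I(phonological_word):
    """Replace each morphophoneme 'I' by 'i' (odd mora) or 'e' (even mora).

    Moras are counted from the left of the phonological word; hyphens and
    consonants do not count.
    """
    out = []
    mora = 0
    for ch in phonological_word:
        if ch in VOWELS or ch == "I":
            mora += 1
            if ch == "I":
                ch = "i" if mora % 2 == 1 else "e"
        out.append(ch)
    return "".join(out)


for form in ("o-waharI", "o-na-hatI-hara", "to-na-ma-itI-haro"):
    print(form, "->", realise_I(form))
```

The three inputs correspond to the forms o-wahare and o-na-hate-hara in (25) and to-na-ma-iti-haro in (32), with the initial h of -hitI already removed in the last case.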
of -hare, by rule 8a) and declarative mood suffix (-ka) are attached. This suffix
is unusual with respect to what happens on its left and also with respect to what
happens on its right.
In addition, one of the auxiliary-bound suffixes and ten of the auxiliary-taking
suffixes either require or generally take reduplication of the lexical verb. (More
detailed information is in the appendix.)
It might be suggested that forms like -kanikima, -wahare and -wi should not
really be called suffixes at all. However, there seems to be no other suitable
label for them. There is some similarity with the two varieties of verbs. Normal
suffixes are like inflecting verbs in allowing affixes to both precede and follow
them. Whereas non-inflecting verbs allow no affixes before or after (but require
an auxiliary for affixes to be added to), auxiliary-taking suffixes allow no further
suffixes to follow (but require an auxiliary for them to be attached to) and
auxiliary-bound suffixes cannot have other suffixes immediately preceding them
(but must be added to their own auxiliary).6
It will be seen that there are four possible types of auxiliary within a predicate:
auxa, the auxiliary of a non-inflecting verb; auxb, the auxiliary associated
with reduplication; auxc, the auxiliary required to follow an auxiliary-taking
miscellaneous suffix, of type II, to which further suffixes are attached; and
auxd, the auxiliary to which an auxiliary-bound suffix, of type III, must be
attached. (The first three types of auxiliary can be either -na- or -ha-; for auxd
only -na- is attested.) A predicate can include two or even three instances of
-na-, as in (16).7
The auxiliary -ha- is never omitted, but the auxiliary -na- may be omitted
under certain grammatical conditions. One is the nature of the following suffix.
There are some suffixes which always omit an immediately preceding auxiliary
-na- (these include -tasa 'again' and -ra 'negative'). There are some which
always retain an immediately preceding auxiliary -na- (these include -mina
'in the morning, tomorrow'). And there are some that omit an immediately
preceding -na- when it carries a prefix, but retain it when there is no prefix. The
miscellaneous suffix -bisa 'also' is of this type, as illustrated in:
6
The diachronic origin of non-normal suffixes is an interesting topic for speculation. One
possibility is that some or all auxiliary-taking suffixes might at an earlier stage of the language have
been non-inflecting verbs, which were then grammaticalised as part of the predicate. To support
this hypothesis, we would expect to find lexical verbs in other Arawa languages that are cognate
with auxiliary-taking suffixes in Jarawara. I have not been able to uncover any. The hypothesis
thus remains speculative.
7
Note that we do not get all four types of auxiliary in one predicate since there is no suffix that
is both auxiliary-taking and auxiliary-bound and requires reduplication. (The one suffix that has
been identified as both auxiliary-taking and auxiliary-bound is not common, and I have not tried
to elicit it with a verb that is productively reduplicated.)
(17)
otaa kobo na-bisa
1nsg.excS arrive auxa-alsof
'we also arrived'
(18)
kobo o-bisa
arrive 1sgS-alsof
'I also arrived'
In (17) the subject pronoun, 1nsg.exc otaa, is a separate word and precedes
the non-inflecting verb root. In (18) the subject pronoun is a prefix, 1sg o-, and
attaches to the auxiliary -na-, causing this to drop when followed by -bisa. Thus
in (18) we have an auxiliary constituent consisting just of prefix and suffix,
without any overt auxiliary root. (This is similar to o-ke in (10), another kind
of word consisting of just prefix and suffix.)
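The three-way behaviour of suffixes with respect to a preceding auxiliary -na- can be summarised as a small lookup. The sketch below is illustrative only: the function and set names are hypothetical, only the suffixes named in the text are listed, and no phonological rules (such as rule 8a) are applied to the output.

```python
# Suffix classes as described in the text (only the named examples are listed).
ALWAYS_OMIT = {"tasa", "ra"}   # 'again', 'negative'
ALWAYS_RETAIN = {"mina"}       # 'in the morning, tomorrow'
OMIT_IF_PREFIXED = {"bisa"}    # 'also'


def auxiliary_constituent(prefix, suffix):
    """Build prefix + (na) + suffix, deciding whether auxiliary -na- surfaces.

    prefix: a pronominal prefix such as 'o' or 'ti', or '' when there is none.
    suffix: the suffix that immediately follows the auxiliary.
    """
    if suffix in ALWAYS_OMIT:
        keep_na = False
    elif suffix in ALWAYS_RETAIN:
        keep_na = True
    elif suffix in OMIT_IF_PREFIXED:
        keep_na = not prefix       # -na- drops only when it carries a prefix
    else:
        raise ValueError(f"suffix class not listed in this sketch: {suffix!r}")
    parts = [prefix, "na" if keep_na else "", suffix]
    return "-".join(part for part in parts if part)


print(auxiliary_constituent("", "bisa"))   # compare (17)
print(auxiliary_constituent("o", "bisa"))  # compare (18)
```

The two calls reproduce the contrast between na-bisa in (17) and o-bisa in (18).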
There are a number of subtypes within the affix sets. We will briefly comment
on just two.
4.3
(19)
o-sawi-bote o-na-habana o-ke
1sgS-join.in-soon 1sg-auxc-futf 1sg-decf
'I'll soon join in'
(20)
sawi-kabote o-na-habana o-ke
join.in-soon 1sgS-auxc-futf 1sg-decf
'I'll soon join in'
In (19) the 1sg prefix o- is retained on the verb -sawi- and repeated on the
auxiliary demanded by suffix -ibote; in (20) it only occurs on the auxiliary of
the suffix -kabote. In each sentence it also occurs in slot h.
4.4
Most normal suffixes form one phonological word with what precedes and
follows in the same grammatical word. But there are just a few normal suffixes8
which behave differently. If one of these is preceded by more than a single mora
in the grammatical word to which it belongs, then it begins a new phonological
word within that grammatical word (we again use ':' for a phonological word
boundary within a grammatical word). This is the third type of situation where
one grammatical word may consist of two phonological words, mentioned in §2.
Compare -tasa 'again' in (21), which has the property of beginning a new
phonological word, with -bisa 'also' in (22), which occurs in the same sixth
echelon slot as -tasa but continues the same phonological word.
(21)
[Okomobi ati]O  [o-mita:tasa-habone          o-ke]
name speech     1sgA-listen.to:again-intf    1sg-decf
'I'll listen again to what Okomobi says' (lit. 'to Okomobi's speech')
(22)
okati                 [kamina-bisa-ra-ke]
1sgposs+grandmother   speak-also-IPef-decf
'(My grandfather spoke and then) my grandmother also spoke'
Here the underlying verb form is kamina-bisa-hara-ke. The -ha- of IPef suffix
-hara is in the sixth mora and is omitted by rule 8a. If -bisa had commenced
a new phonological word the -ha- would have been in the third mora of
bisa-hara-ke and would not have been omitted.
We can also contrast (21), in which -tasa is preceded by three moras within
its grammatical word and commences a new phonological word, with (23), in
which -tasa is preceded by just one mora in its grammatical word and continues
an existing phonological word.
8
There are two suffixes (-tasa 'again' and -ikima 'two participants, a pair') which always start
a new phonological word under the conditions stated. And there is a third suffix, -mata- 'short
time', whose behaviour varies. It generally continues an existing phonological word but I have
a fair number of instances (from both texts and elicitation) where it behaves like -tasa- and
-ikima- in beginning a new phonological word if it is preceded by more than a single mora in the
grammatical word to which it belongs. This suffix may well be in the process of shifting its
morphological profile.
There are different conditions under which the suffix -hite/-hiti 'do all along the way' may
commence a new phonological word; details are in Dixon and Vogel (ms.).
(23)
oko-jiboteeO    [jori            o-tasa-ra         o-ke]
1sgposs-spouse  copulate.with    1sgA-again-IPef   1sg-decf
'I copulated with my spouse again'
The underlying form of the auxiliary constituent is o-na-tasa-hara
(1sgA-auxa-again-IPef). As mentioned above, the auxiliary -na- always drops
when immediately followed by -tasa, giving o-tasa-hara which is both one
grammatical word and one phonological word. Since the -ha- of -hara is the
fourth mora it is omitted, by rule 8a. Now if -tasa began a new phonological
word here, then -ha- would be in the third mora of tasa-hara and would not be
omitted. (It would in any case be impossible for tasa-hara within o-tasa-hara
to be a separate phonological word, since that would leave just o-; this only has
one mora and so could not by itself constitute a separate phonological word.)
4.5
The fifty-five or so miscellaneous suffixes fall naturally into six ordered sets
which can be called 'echelons'. There are a number of ordered slots within most
echelons. In outline (a full list is in the appendix):
- echelon 1, normal suffixes: sixteen in three ordered slots;
- echelon 2, normal suffixes: three in three ordered slots (one suffix may commence a new phonological word);
- echelon 3, auxiliary-taking, prefix-retaining: two (mutually incompatible);
- echelon 4, auxiliary-taking, prefix-poaching (one is also auxiliary-bound):
twenty, fourteen of which fall into five ordered slots, with six not having been
obtained in combination with another suffix from this echelon;
- echelon 5, auxiliary-bound: six, four in two ordered slots (one may commence
a new phonological word);
- echelon 6, normal suffixes: seven in five ordered slots (two may commence
a new phonological word).
There are two further miscellaneous suffixes which appear to have considerable
freedom both of positioning and of morphological type: (a) -waha can be a
normal suffix 'now, the next thing, then' while -waha -na- can be an
auxiliary-taking suffix 'second time'; (b) -tee can be a normal suffix 'habitual, customary'
or an auxiliary-bound suffix 'remembering something from the past'.
In most languages which have pronominal elements referring to core arguments,
these are only marked once within the predicate. Jarawara is unusual in that a
predicate typically includes two occurrences of a prefixal or non-prefixal subject
pronoun, in slots b and h (or two occurrences of an object pronoun in some
transitive O-constructions, in slots a and h); this is seen in (8–10), (19–21)
and (23).
(24)
o-ka-tima o-na-hate-hara o-ke
1sgS-in.motion-upstream 1sgS-auxd-all.day-IPef 1sg-decf
'I just went upstream all day'
(25)
o-tafa o-wahare o-na-hate-hara o-ke
1sgS-eat 1sgS-multiple 1sgS-auxd-all.day-IPef 1sg-decf
'I ate, in different houses, all day'
In every example given thus far, each space represents both a phonological word
boundary and a grammatical word boundary. That is, every orthographic word
is a grammatical word. Every orthographic word is also one phonological word,
except where there is a ':', indicating a phonological word boundary within a
grammatical word, as in (21).
Before turning to a discussion of grammatical word in Jarawara it will be
useful to examine, in a more general way, the distinction between predicate and
grammatical word.
5
Grammatical word and predicate: general discussion
A grammatical word is a unit bigger than the morpheme. But how much bigger?
As mentioned in §2 of the introduction, some early writers denied that
polysynthetic languages had any unit 'word', since the grammatical unit next up from
the morpheme seemed to them too big to justify the label 'word' (Milewski
refers to it as 'syntactic group').
It is now accepted that the size of a grammatical word varies with the morphological type of a language. In an analytic language a word will most often
consist of just one morpheme, occasionally of two or more. In a polysynthetic
language a verbal word may typically include from six to ten morphemes (and,
perhaps, always at least three).
In most languages the areas of greatest structural complexity include the verb
and/or the predicate9 (or 'verb complex' or 'verb phrase'). It is, in fact, important
to distinguish between predicate and verb, and not to muddle up their respective
structures.
Some languages have a fairly synthetic verb structure but a rather simple
predicate structure. In Dyirbal, for instance, a verb can consist of root plus five
or six suffixes, whereas the great majority of predicates consist of just one verb.
The most complex predicate involves two verbs (one of them generally having
an adverbal-type meaning) which agree in transitivity and in final inflection.
For example, both bura- 'see, look at' and ŋuyma- 'do properly to' take the
antipassive derivational suffix -lay- and the purposive inflection -gu in (26) (Dixon
1972: 386):
(26)
ŋadja  [bura-lay-gu            ŋuyma-lay-gu]                    [gayga-gu  ba-gu-n]
1sgS   look.at-antipass-purp   do.properly.to-antipass-purp     eye-dat    there-dat-f
'I really looked at her eyes'
(28)
Now, putting aside our ingrained ideas about 'word', based on the traditional
way of writing word spaces (which can be an historical relic, out of accord with
the structure of the present-day language, as in the case of French, mentioned
in §4.2 of chapter 1), why do we not regard each of (27) and (28) as a single
grammatical word?
There are several reasons why not. One is that it is possible to pause at the
word boundaries written in (27) and (28). A second is that an adverb can be
inserted into a predicate, prototypically after the first word (will probably have
been killing) but potentially after any word (for example, will have probably
been killing). And if each of (27) and (28) was taken to be one grammatical
word the roots would presumably be the lexical forms kill and filter; but then -ing
would be a suffix in some occurrences and some kind of prefix (or component
of a proclitic?) in others. It is clear that the only feasible treatment of English
is to take each orthographic word in (27) and (28) to be a grammatical word.
9
Here I use the term 'predicate' (or 'verb complex', or 'verb phrase') for a unit that does not include
any NPs (an O NP in the case of an accusative, or an A NP in the case of an ergative syntax) but
just the verb and its modifiers and bound pronominal attachments.
This is simply a language with rather simple verb structure and fairly complex
predicate structure.
Let us now look briefly at Fijian. As described in chapter 1, a grammatical
word is centred on a root (or compound stem) and may have prefixes
and/or suffixes added to it. Function items such as prepositions, articles,
tense-aspect markers and verb modifiers are also each regarded as a grammatical
word. Those grammatical words that have two moras are also distinct
phonological words. Those that have a single mora either attach to another
single-mora function item to jointly create a phonological word (e.g. o=na
'2sg.subject.pronoun=future.tense'), or function as proclitic to a following
phonological word.
Now Fijian is a typical Oceanic language with a fairly analytic verb but a
richly structured predicate. Most verbs consist just of a root, although there can
be one or even two prefixes, and either an incorporated noun or a transitive suffix.
However, it does have a complex predicate structure. A predicate (making up a
complete clause) along the lines of (29) is fairly typical (Dixon 1988: 310):
(29)
rau   saa     la'i    dabe  vata      sara     to'a
3duS  aspect  go.and  sit   together  at.once  temporarily
'the two of them went and right away sat together for a while'
The account of Fijian given here has been slightly simplified, by omitting mention of some
elements; this simplification does not affect the points being made. Full details are in Dixon
(1988).
predicate. It may come either immediately after the verb (that is, between dabe
and vata in (29)) or at the very end of the predicate, after to'a. Secondly, one
can pause at every orthographic boundary in (29).
Another point is that some of the predicate modifiers also occur as full
lexemes and then have grammatical word status. There is one example of this in
(29): to'a can be a verb meaning 'sit on one's heels, squat' and also a
post-verbal modifier 'done on an interim basis, temporarily'. (Others of this type
include ti'o, which means 'reside, remain, sit' as a lexical verb and 'happening
continuously now but not before or afterwards' as a modifier, and tuu, which
means 'stand' as a lexical verb and 'permanently', or 'happened over an extended
period and now finished', as a modifier.) Note that it could not be suggested that
in (29) to'a is somehow incorporated into the verb; Fijian does have a process
of incorporation, involving a noun that is attached directly to the verb root, e.g.
ta'i:wai 'fetch:water'.
The putative analysis of (29) as one grammatical word must be discarded. This
is a complex predicate consisting of seven words (each being both a phonological word and a grammatical word). What we have is a language with complex
predicate structure and rather simple verb structure.
In some languages the units 'grammatical word' and 'phonological word' always
coincide. In others there are some instances where the two units do not coincide,
but these are greatly outnumbered by instances where they are the same (and
note that this applies in analytic, synthetic and also polysynthetic languages).
Indeed, this is the basis on which the label 'word' is used in naming the two
units.
In some highly polysynthetic languages it is necessary to recognise as
grammatical word a long and complex unit (these languages tend to have a simple
predicate structure). Other languages can have a complex predicate structure,
which must be recognised as a quite different phenomenon. In summary, there
is no basis on which Fijian (and the many other Oceanic languages, with similar
profiles) should be added to the inventory of languages with polysynthetic verb
structure.
6
The grammatical word in Jarawara
word occur together, in fixed order, with the whole having a conventionalised
coherence and meaning.
As shown by o-bisa in (18), the auxiliary root -na- can be dropped in specified
grammatical circumstances, so that we get an auxiliary constituent which
actually lacks an auxiliary root and just consists of prefix plus suffix. In similar
fashion we can have a pronominal prefix plus a mood suffix, such as o-ke in
(10). Each of these is one grammatical word and one phonological word.
(b) Free pronouns (that is, every pronoun other than prefixes 1sg o- and 2sg ti-),
deictics, interrogatives, discourse markers, postpositions and interjections also
constitute grammatical words. Each of these has at least two moras and is also
a phonological word.
For instance, the all-purpose preposition (corresponding to 'to', 'from', 'at', 'on',
'in', 'with', etc. in English) has the form jaa in Jarawara and is here a distinct
grammatical word, as in (9–10). (A full analysis has not been undertaken of the
other dialects of Madi, but my preliminary impression is that the preposition
may here be =ja, with a single mora, functioning as enclitic to the preceding
word.)
Every noun and adjective (which most often occur without suffixes) and
every non-inflecting verb (which cannot take suffixes)11 has at least two moras;
each of them is thus both a grammatical word and a phonological word. Thus,
if one of these items is monosyllabic, its vowel must be long; for example 2nsg
pronoun tee (in table 1) and non-inflecting verb hoo -na- 'snore, (dog) growls'.
There are some inflecting verbs which do have monosyllabic form with a
short vowel. The most common verb of all is -ka- 'be in motion'. This verb
never occurs without a directional affix; at the least it must take one of prefix
to- 'away', suffix -ke/-ki 'coming' or suffix -ma 'back'. Thus a word including
-ka- always has at least two moras. (Note that to- can be replaced by a pronominal
suffix, but there are then still at least two moras in the word.)
A few other monosyllabic verbs have one allomorph with a short vowel and
another with a long vowel. For example 'stand' is -wa- if there is a prefix or if it
takes the miscellaneous suffix -re/-ri 'on a raised surface', and -waa- otherwise,
again ensuring that each verbal word has at least two moras. Evidence can be
provided that the underlying form here is -waa-, with the vowel being shortened
in certain circumstances. In contrast, the underlying form of the verb 'exist' is
-na-, with a short vowel. This almost always takes some affix(es) but it can be
used without any; when this happens the vowel is lengthened, giving naa, and
ensuring that this grammatical word is also a full phonological word.
11
Nothing in the grammar of Jarawara is simple. There is in fact a distributive suffix -ri, which
is unique in that it is added to the root (not the auxiliary) of a non-inflecting verb (it is only
attested with a handful of verbs), e.g. weje -na- 'carry', weje-ri -na- 'each carries their own'.
(30)
amo o-na-habone
sleep 1sgS-auxa-intentionf
'I'm going to sleep'
(31)
amo+na
sleep+auxaf
'she sleeps'
This is the only instance I know of one phonological word consisting of two
grammatical words.
6.1
The basis for sound linguistics lies in investigating a number of possible analyses
for a given set of data, and examining the pros and cons of each before deciding
on the most appropriate analysis.
In this spirit, let us consider a typical Jarawara predicate (here making up a
complete clause), in (32), and examine whether it could be regarded as a single
complex grammatical word.
(32)
a          b      d        c1-e-f1c-f5-g                   h     i-j
otara      mee    haa      to-na-ma-iti-haro12             mee   ama-ke
1nsg.excO  3nsgA  call.to  away-auxa-back-along.way-RPef   3nsg  extent-decf
'they were calling out to us all along the way back'
These predicate elements must occur in a fixed order. Why should we not
say that they make up one grammatical word? Note that, unlike in English
and Fijian, no adverbs can be inserted into a predicate in Jarawara (adverbal
constituents are peripheral NPs which must be placed in the first or last slot of
clause structure; see §3). Note that if (32) were analysed as one grammatical
word, then so should every other predicate be.
12
The initial -h- of -hite/-hiti 'all along the way' may, as here, be omitted when preceded by a
and in an even-numbered mora from the beginning of its phonological word.
If this were one grammatical word we should have three occurrences of the
auxiliary -na- as suffixes to the lexical verb root haa:haa (and two of them
would be sequential). The situation becomes even more complex when we
consider (25), also repeated here for convenience.
(25)
o-tafa o-wahare o-na-hate-hara o-ke
1sgS-eat 1sgS-multiple 1sgS-auxd-all.day-IPef 1sg-decf
'I ate, in different houses, all day'
If this were a single grammatical word, with -tafa- as head, we would have 1sg
o- as prefix once and as suffix three times.
Such an analysis would be unbearably complex and would lack explanatory
power. In contrast, the statement of grammatical word presented in §6 is
maximally clear: auxiliary -na- is always the basis for a distinct grammatical word;
1sg o- is always a prefix to an inflecting verb, or to an auxiliary, or to a mood
suffix as in (10).13
In the approach followed here, grammatical and phonological words in
Jarawara fall together almost all of the time. There are just four exceptions,
which will be recapitulated in §7.
We pointed out in §5 that both English and Fijian have a relatively simple verb
word but rather complex predicate structure, in contrast with polysynthetic
languages which tend to have a complex verb word but simple predicate structure.
It is important to distinguish between predicate and verb.
Jarawara has a fairly complex verb structure. There can be up to three prefixes
and we frequently get three miscellaneous suffixes plus one (or two) tense-modal
markers and a mood suffix; a verb with six morphemes is common. It
also has a fairly complex predicate structure, with pronominal words in slots a,
b and h, a secondary verb in slot i, and a number of miscellaneous suffixes that
require their own auxiliary (making up a separate grammatical/phonological
word), either to which they must be attached or to which following suffixes
must be attached.
7
Summary of instances where the two kinds of word do not correspond
As already stated, the units 'phonological word' and 'grammatical word'
correspond for the great majority of cases. We find only four types of disparity.
One grammatical word consisting of two phonological words:
(i) In a compound, e.g. bani:kasako, described in §2.
(ii) In reduplication, e.g. kete:ketebe and a ta: atabo, also described
in §2.
13
In considering and rejecting an analysis where all of, say, (30) is one grammatical word, I am not
tilting at a straw man. This approach is taken in all the publications of the Summer Institute of
Linguistics missionary linguists Robert and Barbara Campbell on the related dialect Jamamadí,
on which they have been working since 1962. That is, in (30) amo o-na-habone, they would
take the 1sg bound pronoun o to be a suffix to the (what is, for me, a non-inflecting) verb amo,
with the auxiliary root na a further suffix after this, i.e. amo-o-na-habone. See, for example,
B. Campbell (1986) and R. Campbell (1988).
Under this analysis it would not be possible to provide any explanatory statement of the
phonological rules; it appears that the Campbells have not considered the matter of phonological
rules.
Appendix List of miscellaneous, tense-modal and mood suffixes
[Table of miscellaneous suffixes by echelon and slot: layout not recoverable]
-(ha)ro/-(hi)ri
-(ha)maro/-(hi)mari
-(ha)ni/-(hi)no
-(he)te/-(hi)ta
-(he)mete/-(hi)mata
Modalities
A -(ha)ba(na)/-(hi)ba(na)
A -(ha)bone/-(hi)bona
A -(he)ne/-(hi)na
A -(he)mene/-(hi)mana
A -(ha)mone/-(hi)mona
-ija-hi/ja-ho
-rima -na-hi/-rama
-na-ho
-ri-ja-hi/-ra-ja-ho
Interrogative
A -ra/-ra
A -ini/-
** -ibana/-bana
content interrogative
polar interrogative
polar future interrogative
Others
** -rihi/-rihi
** -ibe(ja)/-ba(ja)
contrastive negator
immediate
**
?
**
-ikani/-kani
-inihi/-noho
-imakoni/-mako
counterfactual
climax
unusual, unexpected
negator
from slot g
-ni/-no
-bone/-bona
-ne/-na
-mone/-mona
References
Buller, E., Buller, B. and Everett, D. 1993. 'Stress placement, syllable structure and minimality in Banawá', International Journal of American Linguistics 59.280-93.
Campbell, B. 1986. 'Repetition in Jamamadí discourse', pp 171-85 of Sentence initial devices, edited by J. E. Grimes. Dallas: Summer Institute of Linguistics and University of Texas at Arlington.
Campbell, R. 1988. 'Avaliação dentro das citações na língua Jamamadí', Série Lingüística [Summer Institute of Linguistics, Brazil] 9(2).9-30.
Dixon, R. M. W. 1972. The Dyirbal language of North Queensland. Cambridge: Cambridge University Press.
  1988. A grammar of Boumaa Fijian. Chicago: University of Chicago Press.
  1995. 'Fusional development of gender marking in Jarawara possessed nouns', International Journal of American Linguistics 61.263-94.
  1999. 'Arawá', pp 293-306 of The Amazonian Languages, edited by R. M. W. Dixon and A. Y. Aikhenvald. Cambridge: Cambridge University Press.
  2000. 'A-constructions and O-constructions in Jarawara', International Journal of American Linguistics 66.22-56.
Dixon, R. M. W. and Vogel, A. R. 1996. 'Reduplication in Jarawara', Languages of the World 10.24-31.
  Ms. A grammar of Jarawara, from southern Amazonia.
The question whether all languages have words may look like a nonsense question to many people, the universal existence of words being regarded as a truism in itself. Even though it is widely acknowledged that finding a strictly satisfying definition of 'word' is as difficult as defining similarly universal terms such as 'sentence' or 'language', the existence of words in all languages is not usually questioned.
As with all putative language universals, probing the validity of the claim depends crucially on looking at languages that are as different as possible. If many otherwise very different languages share a certain feature, it is more likely that this feature is a true universal than if only similar languages are considered. The motivation for looking at the concept of 'word' in sign languages lies exactly here: for what could be more different than a sign language? As Anderson (1982: 91) puts it: 'Comparison of spoken and signed languages can be especially valuable because the parallels are so surprising at first, and seem so automatic and natural after we have worked with them. The challenge of finding these parallels produces important insights into the nature of human language in general. So we can often learn more by studying a sign language than by studying one more spoken language.' This is of course not to ignore that modality-related differences between signed and spoken language can be just as revealing as the parallels between the two. Sign languages are of great typological importance by virtue of their visual-gestural modality, which makes them stand out as a distinct language type in opposition to the entirety of spoken languages. Certainly, using the hands and body to produce a linguistic signal and the eyes to perceive it should have consequences that mark sign languages as different from languages that use the vocal tract for producing speech signals and the ears for perceiving them. Some possible modality-related differences at the phonological level have been discussed by Gee (1993) and Anderson (1993).1
1 Sign language research uses the terms 'phonology', 'phoneme' and so on, although their literal meaning obviously does not apply. The terms are used to refer to sublexical units in signs at an equivalent level of linguistic organisation as phonemes in spoken languages.
Ulrike Zeshan
This issue will be explored in more detail in §4 of this chapter. At this point, it is sufficient to say that the more universal a feature of language organisation is claimed to be, the more imperative it is to consider its validity with respect to sign languages. This is especially true in the light of the fact that claims about universals of human language have always been based on evidence from spoken languages alone. Sign language research is only just beginning to enter the stage of linguistic typology, and considering the word unit is certainly not the worst parameter to begin with on the way towards integrating the findings of sign language linguistics with spoken language typology.
It is quite striking that sign language linguists do not usually talk about 'words'. Instead, it is the 'sign' that takes the place of the word unit in spoken languages. The question is, of course, whether this is just a terminological convention or whether there is some reason for referring to units at an equivalent level of linguistic organisation as 'words' on the one hand but 'signs' on the other hand. As in most cases of linguistic meta-talk, this issue has, to the best of my knowledge, never been addressed explicitly. So in what way exactly does a sign language 'sign' compare to a spoken language 'word'? Are they completely equivalent, or are signs and words different in character, either essentially or by degree? This chapter is an initial contribution to addressing this issue.
The initial justification for saying that the word and the sign are situated at an equivalent level of linguistic organisation comes from the way sign language users evidently perceive the signs of their sign language. In fact, they talk about signs in very much the same way that spoken language users talk about words, and there can be no doubt that signs as a unit have psychological and cultural validity in deaf communities. A cluster of observations confirms this point.
First of all, it is very revealing to look at meta-linguistic vocabulary in sign languages, and there are some striking generalisations that appear across different sign languages. The central meta-linguistic term in all sign languages appears to be the sign glossed SIGN, which may refer to individual signs as well as the sign language and the signing modality in general. This sign is typically two-handed, with circular, alternating movements of the hands. A form found in a number of sign languages is the one represented in figure 1. By contrast, terms for 'word', 'sentence' and 'language' may arise via influence from the surrounding spoken language, may be used with reference to written language only, or may be lacking altogether.
A number of sign languages, including Indo-Pakistani and German Sign Language, have no word for 'language', in the sense of either French langue or langage. British Sign Language originally lacked signs meaning 'language' and 'culture' (Kyle et al. 1985). The present signs used to represent these meanings have come into existence via the influence of spoken English. Similarly, American Sign Language does have original signs for 'word' and 'sentence'.
Figure 1 SIGN
According to a common convention in sign language research, signs are represented by English
words in capital letters in this chapter. The word stands for the sign whose meaning comes closest
to the meaning of the English word. When the form of a sign is important, graphic representations
are used.
[Figure 2: DEVELOP (single opening movement of both hands) and DEVELOP-gradual (stepwise gradual opening of the hands)]
[Figure: 1sg-HELP-2sg and 2sg-HELP-1sg. Surviving description: closed hands with fingertips touching the thumb and oriented inwards move slightly towards the body repeatedly.]
tu-saaidu-nii
2sg:subj-help:imperf-1sg:obj
You help me.
(2)
[Figure 4: numeral incorporation. One month: flicking out one finger from the fist. Three months: flicking out three fingers from the fist. One year (Karachi dialect): arc movement with one extended finger. Three years (Karachi dialect): arc movement with three extended fingers.]
underlying root: k-t-b
derived forms: kataba 'he wrote', kitaab 'book', kaatib 'writer', maktab 'office', maktaba 'library', aktubu 'I write'
On the other hand, there seem to be more fundamental problems with respect to the applicability of criteria for the grammatical word as used in the other chapters of this volume. I will briefly discuss three criteria here: cohesiveness, order, and conventionalised coherence and meaning.
Grammatical elements in a sign, in the prototypical case a basic form and superimposed morphological derivations, always occur together in the sign unit. However, it is not clear whether this can be taken as evidence for a particular grammatical status of these elements. This has to do with the largely simultaneous nature of the sign. The fact that elements occur together in a sign is due to purely articulatory reasons as much as to a putative grammatical status of the unit. For instance, the numeral handshape morpheme in numeral incorporation (see §2, figure 4) is necessarily coexistent with the movement pattern that stands for the unit. It would be physically impossible to produce a sign or a movement pattern that lacks any handshape. Similarly, it would be impossible for a morphological derivation such as the 'gradual' aktionsart derivation (see §2, figure 2) to occur on its own, for example in a sequence where the basic form of a sign would occur first and the abstract movement derivation would occur separately in a sequence. Therefore, the criterion of cohesiveness is considerably weakened if taken as indicative of grammatical word status.
The criterion of order, with elements within a grammatical word always occurring in a fixed order, is even more difficult to apply to a typical sign. Again due to the sign's simultaneous nature, it is mostly impossible to argue for any order of grammatical elements within a sign, with the possible exception of directionality (cf. the sequential transcription in example (2) above; but even this interpretation can be disputed). To take the same examples as in the previous paragraph, it is impossible to argue that the numerical handshape for THREE and the sign for MONTH in a complex sign such as THREE-MONTHS (see §2, figure 4) occur in any order. They are coextensive over the whole duration of the sign. Similarly, in a complex sign such as DEVELOP-gradual (see §2, figure 2), there is no sequential order of the sign DEVELOP and the superimposed movement pattern that conveys the 'gradual' aktionsart derivation. It would therefore seem that the criterion of order is only marginally applicable to signed languages. Its applicability is confined to particular cases of complex signs, in particular compounds (see §3.1).
The remaining criterion, conventionalised coherence and meaning of a grammatical word, fully applies to sign languages and is particularly important in the comparatively rare cases of sequential combinations of grammatical elements, as the discussion of compounds and clitics in §3 demonstrates. In-between cases of semi-lexicalisation do occur (see Zeshan forthcoming), but they do not in principle challenge the validity of the criterion. I will discuss semi-lexicalisation in more detail in §4.
3.1
[Figure 5: two compound signs from Indo-Pakistani Sign Language with their individual signs.
First example: ten (hold up two open hands, palm facing outward); pass/success (twist wrist to bring extended thumb upwards); school leaving exam (twist wrist, changing from open hand to hand with extended thumb).
Second example: understand (touch temple with index finger); little (indicate small quantity with index finger and thumb); stupid (twist hand away from temple, changing shape of the hand).]
(b) Repetition of movement and internal movement are eliminated in the compound (Klima and Bellugi 1979, Lucas and Valli 1995 for American Sign Language).3
(c) There are various assimilation processes such as recessive handshape assimilation (Collins-Ahlgren 1990 for New Zealand Sign Language) and location assimilation (Glück and Pfau 1997 for German Sign Language; Lucas and Valli 1995 for American Sign Language).
(d) A passive hand serving as the place of articulation for one part of the compound is retained in the other part as well (Klima and Bellugi 1979, Lucas and Valli 1995 for American Sign Language; Glück and Pfau 1997 for German Sign Language).
(e) The meaning of the compound may not be predictable from the meaning of the two simple signs (Lucas and Valli 1995 for American Sign Language).
Obviously, the formational criteria mentioned above do not apply to all cases of compound formation, with (a) being the likely exception. Deletion of repeated movement and internal movement only applies if there was any such movement pattern in the original signs in the first place. Similarly, spreading of a passive hand, i.e. the hand that is used as the place of articulation on or at which the other hand articulates, does not apply to signs where only one hand is used anyway. Handshape assimilation or location assimilation does not apply to compounds where handshape or place of articulation is the same in both parts of the compound to begin with. Figure 5 shows two compound signs from Indo-Pakistani Sign Language together with the individual signs to illustrate some possible combinations of formational changes: temporal compression and assimilation of handedness (one- versus two-handed) in the first example,
3 Internal movement refers to movement within a stationary hand, such as finger wiggling and wrist bending.
Many known sign languages use pointing signs to establish locations in the sign
space for referents and to refer back to these locations in what is equivalent
to pronominal reference in spoken languages. The following mini-discourse
illustrates the principle:
Figure 6 SHOP and SHOP-THERE
(4)
The most common pointing sign, also called 'index point' or simply 'index', consists of an extended index finger pointing at a location in space (or at the signer and the addressee for first person and second person reference). The index point has many characteristics that are akin to pronouns in spoken languages, and so the index is indeed often called a 'pronoun' in the sign language literature.
One feature that the index has in common with pronouns in spoken languages
is that it tends to cliticise. Interestingly, there seems to be evidence for index
cliticisation in various unrelated sign languages, although the phenomena reported are often not described in these terms. I will review some of the available
evidence in this section.
Sandler (1999) describes two processes of index cliticisation, of which only the first one, coalescence, will be discussed here because it is more straightforward. In coalescence, a deictic index is encliticised to a two-handed host sign. The example in figure 6 shows the enclitic index point with the host sign SHOP. Both hands first start to articulate SHOP, then midway through the downward movement, the index clitic appears on the right hand while the left hand continues to finish the articulation of SHOP. The cliticised index loses its syllabicity, the whole host-clitic combination being monosyllabic, consisting of a single movement unit in the rhythm of the signed sentence. Thus the host-clitic combination forms a single phonological word. On the other hand, the index point is still a complete grammatical word, as indicated by the transcription of the combination as SHOP-THERE.
In addition to detailed formational analyses of the kind cited here, further evidence for index cliticisation can be found in the domain of grammatical rules as well. In Japanese Sign Language, questions are marked suprasegmentally by a particular facial expression, in the same way that questions may be marked by intonation, mostly rising intonation, in spoken languages. In polar questions, the facial expression typically includes raised eyebrows and a slight head nod or chin tuck on the last word in the clause (example (5), adapted from Morgan 2000). Syntactically, the word order in polar questions is often rearranged, so that the question ends with an index point, or an earlier index point is repeated in final position. In this case the rule for the assignment of the head nod is slightly different: if there is a clause-final index point, the head nod co-occurs not with the index point alone, but with both the index point and the preceding sign (examples (6-8), adapted from Morgan 2000).
(5)
--eyebrow raise
---nod
ask-2sg okay
Is it okay if I ask you (a question)?
(6)
(7)
---------eyebrow raise
-------------nod
book buy index-2sg
Did you buy the book?
(8)
---eyebrow raise
---------------nod
index-2sg sato index-2sg
Are you Mr/s Sato?
It seems obvious that for the purpose of head nod assignment, the index point
and the preceding sign count as a single phonological word. Although no details
of the precise formation of the index point are provided in the source and there
is thus no information about factors such as assimilation, shortening and so on,
it seems reasonable to interpret the data as evidence for encliticisation of the
index point to the preceding host sign.
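The head nod rule just described can be restated procedurally. The following Python sketch is only an illustration of the rule as worded above, not an analysis taken from Morgan (2000): the function name head_nod_span is hypothetical, and treating any gloss that begins with "index" as an index point is an assumption made for the sake of the example.

```python
def head_nod_span(signs):
    """Return the positions of the signs that carry the head nod.

    `signs` is a list of glosses for one polar question, in order.
    Assumption of this sketch: a gloss starting with "index" is an
    index point; every other gloss is an ordinary sign.
    """
    if not signs:
        return []
    last = len(signs) - 1
    # Clause-final index point: the nod covers the index point and the
    # preceding sign, which together behave like one phonological word.
    if signs[last].startswith("index") and last > 0:
        return [last - 1, last]
    # Otherwise the nod falls on the last sign of the clause only.
    return [last]


# Glosses of examples (5), (7) and (8) as given in the text.
print(head_nod_span(["ask-2sg", "okay"]))                    # [1]
print(head_nod_span(["book", "buy", "index-2sg"]))           # [1, 2]
print(head_nod_span(["index-2sg", "sato", "index-2sg"]))     # [1, 2]
```

On this restatement the nod never targets a clause-final index point on its own, which is the behaviour one would expect if the index point is an enclitic rather than an independent phonological word.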
A parallel case is the spread of mouth patterns in those sign languages where they play a significant role, such as, for example, German Sign Language. A mouth pattern is an imitation of the visible mouth movement that corresponds to a spoken language word, and it occurs simultaneously with manual signs (see Boyes Braem and Sutton-Spence 2000). Usually, each mouth pattern co-occurs with exactly one sign of corresponding meaning. However, sometimes a single mouth pattern may spread over more than one sign, similarly to the spread of the suprasegmental head nod in Japanese Sign Language. This indicates that the two signs are closely connected and can be taken as evidence for host-clitic status of a two-sign sequence. Compare these two utterances in German Sign Language (with mouth patterns in double quotes):
(9)
In (9a), each sign is accompanied by one mouth pattern of equivalent meaning, the usual pattern. In (9b), however, only the head word of the host-clitic combination has a mouth pattern, which spreads over the entire host-clitic combination. So for the purpose of mouth pattern assignment, the two signs seem to count as one phonological word. Sandler (2000) draws the same conclusion with respect to the combination SHOP-THERE, which receives a single Hebrew mouth pattern xanut 'shop'.
American Sign Language also has index points that behave quite similarly to the encliticised forms in Israeli Sign Language. These have been described as 'determiners' in Zimmer and Patschke (1990). Formationally, these signs are shortened, lacking a movement component of their own, and they often occur simultaneously with another sign. In addition, the American Sign Language index signs are peculiar in that the direction of the pointing is insignificant and arbitrary rather than operating along the lines of localisation and subsequent anaphoric pronominal reference that I have described above: 'In most cases, the determiners used with many different characters [i.e. characters in a story] point to the same location. In fact, the data indicate that signers tend to have a preferred location that they use consistently for their determiners . . . Also the determiners used with one character are not consistently directed toward one location' (Zimmer and Patschke 1990: 205). I will not address the issue of how appropriate the characterisation of these index points as determiners is here. In the context of the present discussion, it should only be noted that the index points have lost phonological and grammatical weight and might be regarded as clitics by (a) losing a movement component of their own, (b) co-occurring simultaneously with a host sign, and (c) losing a meaningful specification for location and orientation.
In summary, the following characteristics have been found to occur with documented cases or likely candidates of cliticised index points:
(a) phonological evidence: loss of syllabicity, loss of movement, loss of specification for location;
(b) syntactic evidence: clitic + host behaving as a single sign for the purpose of assignment of suprasegmentals (head movements, mouth patterns), clitic + host sign occurring simultaneously;
(c) functional evidence: cliticisation occurs with elements that function as deictics, pronouns and determiners.
A host-clitic combination represents one phonological word, but two grammatical words. The functional evidence is significant insofar as similar functional classes have been found to be prone to cliticisation in spoken languages,
The fact that it is possible to come up with formational, semantic and grammatical criteria for compounds and clitics that are comparable to criteria used in spoken languages means that there is much common ground between signed and spoken languages at the level of the word. However, although the word/sign unit can be determined in sign languages rather straightforwardly in most cases, and coherent arguments can be advanced for more complicated cases of complex words as well, this is not the whole story. Independently of identifying signs, it is important to consider some properties of signs that go beyond mere identification of sign boundaries. For example, how many morphemes does a sign typically or maximally consist of? What can be said about the internal structure of a sign? How much semantic information is transmitted in a sign, and how is this information structured? What are the effects of having two articulators (the two hands) in sign languages compared to a single articulatory tract in spoken languages? Issues such as these are addressed in this section, and we will see that signs do differ in important and very interesting ways from the words of spoken languages.
4.1 Simultaneous words
One difference between signing and speaking that is immediately evident even to a layman is the fact that people use two hands for signing, while speaking uses only a single articulatory tract. With signs that are one-handed, it is in principle possible to produce two words simultaneously, one with the right hand and one with the left hand. By contrast, speaking does not allow the simultaneous production of two words, so that, in the spoken language medium, there is nothing comparable to 'simultaneous words'.
The simultaneous production of two words does indeed occur in sign languages, although this phenomenon has not been widely documented yet. However, it is clear from the available evidence that there are very specific constraints on the use of simultaneous words. It is not at all the case that one may produce a different word on each hand at any time, so that signed communication would transmit information at a double rate. In fact, as we will see presently, the term 'simultaneous words' is somewhat misleading.
right: punjab sindh peshawar balochistan
left: one----------two---------------three--------------four
'There are four (provinces): the Punjab, Sindh, the Peshawar (region) and Balochistan.'
This pattern is quite common in enumerations. One hand signs the items in the
list, the other hand signs the numbers. The numeral signs are held during the
articulation of the next list item, as indicated by the lines after each numeral
sign. Note that at no time do both hands move at the same time, so that a pattern
such as in (11) is not allowed:
(11)
Although such a pattern would actually fit the term 'simultaneous words' best, it does not occur, presumably because the processing load on both signer and addressee would be too high.
The other type of two-hand sign synchronisation also has to do with discourse
organisation. After a two-handed sign, one hand remains in place while the other
hand articulates further signs, as in this example (based on Zeshan 2000b: 110;
the line again stands for the duration of the held sign):
(12)
In this example, 'picture' is expressed by the thumb and index finger of both hands presenting a square outline. As long as the signer is talking about the picture, the left hand remains in place, while the right hand goes on signing (see figure 7, with the sign LITTLE-BIT on the right hand, and the sign SQUARE on the left hand). A similar example is reported in Bergman and Wallin (forthcoming) for Swedish Sign Language. In each case, the held left hand indicates current discourse relevance of its referent. When the signer shifts to a new topic, the left hand disappears.
This intricate interplay of the two hands is a mechanism for which nothing comparable can be found in spoken languages. It is one of the fundamental modality-based differences between signed and spoken languages and a small, yet important part of what constitutes the linguistic type of signed languages. Therefore, it should not be surprising that it may be difficult to talk about the relationship between grammatical and phonological words in these signed constructions. The formulation of these concepts, based on linear sequential strings of elements, has simply not provided for cases such as these. To say, for instance, that the sign SQUARE in example (12) is one grammatical word that consists of two phonological 'half-words' (the two hands), and that one of these half-words can remain on its own and can by itself carry the full meaning in the absence of the other phonological half-word, does not make much sense. The parameters of description used in the other chapters of this volume are of limited use here.
4.2
4.2.1 Iconicity in signs
Rather early in the development of sign language linguistics, people started addressing issues related to the effects of the visual-gestural modality on language structure. Two recurrent themes appear in the literature that seem to be of fundamental importance: simultaneity and iconicity (see DeMatteo 1977, Mandel 1977, Armstrong 1983 for examples of earlier discussions of iconicity in sign). While simultaneity has been discussed in §2 and further exemplified in §4.1, this section deals with the effects of iconicity on the character of the word unit in sign languages. Of all topics discussed in this chapter, this issue is of the greatest typological significance and at the same time presents the greatest challenge to linguistic theory.
The iconicity of many signs is one of the first points noticed by people who encounter a sign language for the first time. Iconicity is a non-arbitrary relationship between a symbol and its referent. There are various types and degrees of iconicity, but these will not be discussed in detail here. A classification with respect to sign languages can be found in Mandel (1977). In this section, I will start with some preliminary considerations about the nature of iconicity in sign and then limit the discussion to aspects of iconicity that are relevant to our notion of the sign language word.
Ever since the Saussurean postulate of l'arbitraire du signe, the linguistic symbol has been conceived of as an arbitrary association between form and meaning. In fact, this is part of the idea of double articulation, itself thought to be a design feature of language. Double articulation involves two fundamental and distinct levels of linguistic organisation, whereby phonemes that are themselves meaningless make up the minimal meaningful units of language (morphemes), which in turn combine to create all the larger units of language (words, sentences). This organisation allows for a virtually infinite number of ever new utterances to be created on the basis of a very small inventory of phonemic units, a feature unique to human language. The evident and undeniable iconicity of many signs in sign languages represents a serious challenge to this concept. Unlike onomatopoetic and phonaesthetic words and sounds in spoken language, equivalent iconic characteristics in sign languages are not at all marginal, but represent a substantial part of the vocabulary. Boyes Braem (1986) estimated the percentage of iconic signs in Swiss-German Sign Language to be about one third of the total sign vocabulary. However, it seems that this percentage may be much higher in other sign languages. In Zeshan (2000a) I estimated that at least half of the vocabulary of Indo-Pakistani Sign Language, and maybe more than that, is iconic in some way.
In iconic signs, sublexical parts of the sign (chiefly the handshape, the movement and the place of articulation) are meaningful in that they stand for aspects of the meaning of the sign (cf. examples given in §4.2.4). This form-meaning relationship can be much more complex than the simple imitative character of an onomatopoetic word such as cuckoo. Both concrete and abstract meanings can be represented iconically in signed languages, the latter via metaphors (which are, by the way, often similar to metaphors expressed in spoken languages; note the examples of sign families below).
It is not at all necessary that signers should be aware of the iconicity all the time or even most of the time when they use an iconic sign. The signs that will be discussed in this section are conventional units of the language and are quite different from signs and sign combinations that are created on the fly in order to express a new concept. The latter possibility also exists and can be used with great productivity, but the kind of iconicity I am discussing here is entirely compatible with conventional words. It is not necessary either that all components of a sign should be iconic. There is no strict compositionality of meaning in the kind of iconicity discussed here, so it is entirely viable for signs to have partly iconic and partly arbitrary components. Iconicity of course does not mean either that a sign can only be used when the user understands its iconic basis. Use of the sign is completely independent of its iconicity most of the time. However, there are some situations where the latent iconic potential can suddenly surface. Many sign puns and instances of word play and creative use of signs in poetry are dependent on a sign's iconicity and are evidence for people's underlying awareness of the iconic basis of a sign. Moreover, deaf people will often explain the meaning of a sign in terms of its iconicity.
[Figure: CLEVER, THURSDAY (Delhi dialect) and PUNJAB (Karachi dialect). Surviving descriptions: right flat hand taps left fist twice on the back side of the fingers; fist makes contact with cheek while the wrist is twisted outwards.]
I will not discuss the appropriateness of the term 'classifier' at this point. For a detailed discussion, see Zeshan (forthcoming), where I have argued that the term 'classifier' is not really appropriate for all the constructions. However, this aspect is not immediately relevant to the discussion here.
[Figure 9: WRITE (holding a long thin object and moving it across a flat surface); FOLLOW (two upright persons moving behind each other)]
Three types of constructions have been associated with the term 'classifier', referring to either the geometrical shape of objects, or the movement and location of a referent, or the handling of an object. All three constructions are highly productive in the sign languages studied so far, and they are a major source of lexical enrichment. Originally productive multimorphemic constructions tend to lexicalise, and the sign gradually loses its semantic compositionality. Therefore, many of these signs have a semantic structure on two levels: on the one hand, the sign is a fully conventional lexical unit whose meaning is non-compositional; yet on the other hand, the original compositional meaning is still underlyingly present and may surface in particular situations, such as linguistic elicitation or poetic use of the language. Figure 9 shows some examples, one from each construction type, with both the original compositional and the lexicalised non-compositional meaning noted (examples are from IPSL, but identical or very similar forms can also be found in a number of other sign languages).
Since the lexicalisation process is gradual, it is not surprising that there are many cases of semi-lexicalisation where a sign is in-between a productive multimorphemic construction and a fully lexicalised sign whose original compositional meaning is not synchronically accessible to signers. The classification of such signs is a major problem for lexicologists: when should a sign be included in a dictionary as a lexical entry and when should it be handled by the grammatical part of linguistic description? A detailed discussion of this problem can be found in Johnston and Schembri (1999), with reference to Australian Sign Language.
Although these constructions present a challenge to the practically oriented
linguist, they do not in principle involve fundamental theoretical problems.
been morphemes in their history. Yet they are meaningful due to their iconicity. The existence of non-morphemic, yet meaningful sublexical units seriously
challenges the traditional concepts of phoneme and morpheme.
The situation in sign languages resembles the case of phonosymbolism in
spoken languages. Malkiel (1990) describes phonosymbolism and the theoretical questions that arise when one takes the phenomenon seriously: 'An
appeal to phonosymbolism simply means that the analyst endows the sound
at issue with the ability to convey, in conjunction with other elements, a certain message of its own, with being, for once, the carrier of a minor, if not
necessarily minimal, semantic content. This appeal presupposes the suspension of a very widely held, almost axiomatic assumption, namely that morphemes rather than phonemes are the smallest units of speech that are equipped
with this power to transmit ingredients of meaning' (Malkiel 1990: 158). In
cases of phonosymbolism, individual sounds or sound combinations convey
a certain 'sound-image' that goes with a particular semantic field, such as
the initial sounds in English splash and splatter, or combinations such as
helter-skelter, flim-flam, and the like. These sounds and patterns do not, however, behave as morphemes. Examples of phonosymbolism given in Malkiel
(1990) include: the vowels o and i, cross-linguistically associated with the
concepts of 'roundness' and 'smallness' respectively; onomatopoetic words
such as Russian xoxot 'outburst of laughter', German krächzen 'to caw, croak',
or French cliquetis 'clanking, clatter, jingle'; and English verbs ending in
consonant+l, such as wobble, wriggle, straddle, giggle, ogle, prattle etc.,
correlating with situations that are somehow 'non-neutral' as compared to
near synonyms such as laugh (giggle), talk (prattle), glance (ogle) etc.
Signs and their components are often of a phonosymbolic character in sign
languages.
To explore this issue in all detail, one would need to write a whole paper of its
own. Therefore, I will just illustrate the nature of phonosymbolic signs with
a few examples at this point. The easiest case of an iconically motivated sign
is an indexical sign, with the hand or fingers pointing at the referent. The Indo-Pakistani Sign Language sign for 'body' consists of the two index fingers, finger
tips towards the body, running downwards along the torso. Although the sign
components are clearly meaningful, with the location of the sign corresponding
to the referent and the hands in the prototypical pointing handshape, it seems
to make no sense to think of them as morphemes. Note that, by contrast with
the lexicalised classifier signs, there is no literal reading 'two parallel lines on
the torso' or 'fingers drawing lines on the torso' that would make much sense.
Rather, the pointing act itself is the iconic motivation for the sign.
[Figure 10: ENEMY (Karachi dialect); CRAZY (index finger describing circles near the head); OVERTIME (index finger describing circles above the wrist)]
The form of many signs in sign languages is motivated by metaphorical links
between the form and the meaning of the sign. Often there are a number of
signs sharing the same metaphorical basis. Signs that share an aspect of their
meaning and an aspect of their form are known as 'sign families' (Klima and
Bellugi 1979: 81). In Indo-Pakistani Sign Language and other sign languages,
a number of signs from the semantic field of cognition have a location at the
temple, the metaphorical seat of cognition. Signs that have to do with time
are often made at the wrist location. Yet these locations do not function as
morphemes in the way that, for instance, the beginning and ending locations
of directional predicates do. Another common metaphor is the equation of upwards movement with 'positive' and downwards movement with 'negative', a
metaphor occurring in a great many spoken languages as well. This is similar
to cross-linguistic phonosymbolism of the type i = 'smallness'. In individual sign languages, particular handshapes may also carry semantic content. In
Indo-Pakistani Sign Language, for example, a handshape with only the index
finger extended and crooked, originally based on pulling the trigger of a gun,
is associated with meanings involving some sort of violent conflict, such as
the signs for 'army', 'war' and 'enemy'. This is similar to the examples of
language-specific phonosymbolism mentioned above. Note that the term 'sign
families' indicates there is a meaningful connection between the members, yet
they are not described as being morphologically derived from each other, nor is
there any morphological process that would derive these signs from each other
or from any underlying form.
Finally, consider the form and meaning of the signs represented in figure 10,
all of which are partially motivated. ENEMY uses the crooked index finger
handshape symbolising violent conflict, CRAZY uses the temple location symbolising cognition, and OVERTIME uses the wrist location symbolising time.
All of these signs have circular movement patterns, yet it does not make much
sense to regard this movement pattern as morphemic in the same way that, for
example, movement patterns for aspect and aktionsart derivations are morphemic. Rather, the movement in ENEMY is (arguably) meaningless, the movement in CRAZY is based on another metaphor (the internal workings of the
brain), and the movement in OVERTIME has yet another motivation, symbolising duration. Note, once again, that literal readings based on a compositional
organisation of putative morphemes do not make much sense. While 'holding
a long thin object and moving it across a flat surface' is an acceptable literal
reading of 'write', a circumlocution such as 'repeated circles next to the head',
or even 'a time duration with reference to the head' are not viable morphemic
analyses of the sign CRAZY.
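The contrast just drawn can be made concrete with a small sketch. It assumes a toy representation in which each sign is a set of parameters (the names `handshape`, `location` and `movement`, and the dictionary layout, are illustrative assumptions rather than an established notation) and records, for each component of the three signs in figure 10, the motivation given in the text. Grouping by form shows that the same circular movement is paired with three different motivations, so it fails even the minimal requirement one would place on a morpheme, namely a constant meaning.

```python
from collections import defaultdict

# Toy records for the three signs discussed in the text (figure 10).
# Each parameter maps to (form, motivation); None marks a component
# that the text calls (arguably) meaningless.
SIGNS = {
    "ENEMY": {
        "handshape": ("crooked index finger", "violent conflict"),
        "movement": ("circular", None),
    },
    "CRAZY": {
        "location": ("temple", "cognition"),
        "movement": ("circular", "internal workings of the brain"),
    },
    "OVERTIME": {
        "location": ("wrist", "time"),
        "movement": ("circular", "duration"),
    },
}


def motivations_by_form(signs):
    """Collect, for every (parameter, form) pair, the set of motivations it carries."""
    table = defaultdict(set)
    for components in signs.values():
        for parameter, (form, motivation) in components.items():
            table[(parameter, form)].add(motivation)
    return dict(table)


def has_constant_meaning(motivations):
    """True if a form is paired with exactly one, non-empty motivation."""
    return len(motivations) == 1 and None not in motivations


table = motivations_by_form(SIGNS)
for key, motivations in sorted(table.items()):
    labels = sorted(m or "(none)" for m in motivations)
    print(key, labels, has_constant_meaning(motivations))
```

A constant meaning is only a necessary condition here: the temple and wrist locations come out as constant in this tiny sample, yet the text argues on other grounds that they do not function as morphemes either.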
For sign language iconicity, the analytical problems are the same as for
phonosymbolism, except that in spoken languages, the relatively rare occurrence of phonosymbolism makes it easier for the linguist to ignore it. Note
that it is entirely possible that there are languages that make much greater use
of phonosymbolism than the Standard Average European languages, but that
their true character has not been recognised because the descriptive apparatus
used by linguists is inadequate to deal with such phenomena. Results from sign
language research could throw new light on such situations. For the sheer number of iconically motivated signs makes it impossible to discount iconicity as
some obscure exception, as has been the (often unacknowledged and unjustified) tradition in spoken language linguistics. Moreover, the semantic content
carried by phonosymbolic elements in signs is not at all 'minor' or 'minimal',
but quite substantial.
Therefore, a reasonable conclusion from sign language data such as those
presented in this section is that total arbitrariness of the linguistic symbol is not
a necessary feature of human language. Rather, sign languages allow for a type
of linguistic symbol that is of a different semiotic status than the usual spoken
language word. The meaning-bearing structure of these words is different,
allowing for sublexical units that are non-morphemic, yet meaningful. Moreover, this type of word is not at all marginal in sign languages but represents
a substantial part of the vocabulary. This argument is of great typological and
theoretical importance, given the fact that the arbitrary nature of the linguistic
symbol is so deeply entrenched in contemporary linguistics as to be seemingly
self-evident. The very nature of the concepts of phoneme and morpheme
is necessarily challenged when sign languages are considered seriously and
described in their own terms.
Our discussion of data from various sign languages has shown that the concepts of phonological word and grammatical word can be meaningfully applied
to sign languages, although the definition of the grammatical word is weakened
due to the unusual simultaneous character of sign language morphology. While
the two units are almost always coextensive in the sign languages known so far,
there are instances of mismatches that parallel the mismatches found in spoken languages. Thus, one phonological word may consist of two grammatical
words in host-clitic combinations and one grammatical word may consist of two
phonological words in compounds, at least in some sign languages. However, it
seems that, from a language typological point of view, the relationship between
grammatical and phonological word is not the most interesting aspect of the
sign unit. The discussions of simultaneous words and the semiotics of signs
have shown how the signs of sign languages may go beyond the horizons of
what is known about spoken language words. The typological and theoretical
importance of sign languages is all the more evident in cases where linguistic
universals - or rather, what had been taken to be linguistic universals - suddenly
appear in a new light altogether.
References
Anderson, L. B. 1982. Universals of aspect and parts of speech: parallels between
signed and spoken languages, pp 91–114 of Tense-Aspect: between semantics
and pragmatics, edited by P. J. Hopper. Amsterdam: John Benjamins.
Anderson, S. R. 1993. Linguistic expression and its relation to modality, pp 273–290
of Coulter, 1993.
Armstrong, D. F. 1983. Iconicity, arbitrariness, and duality of patterning in signed
and spoken language: perspectives on language evolution, Sign Language Studies
3.5169.
Baker, C. and Padden, C. A. 1978. Focusing on the nonmanual components of American
Sign Language, pp 27–58 of Understanding language through sign language research. Perspectives in Neurolinguistics and Psycholinguistics, edited by P. Siple.
New York: Academic Press.
Bergman, B. and Wallin, L. Forthcoming. Noun and verb classifiers in Swedish Sign
Language. In Perspectives on classifier constructions in sign languages, edited by
K. Emmorey. Mahwah, N.J.: Lawrence Erlbaum.
Boyes Braem, P. 1986. Two aspects of psycholinguistic research: iconicity and temporal
structure, pp 65–74 of Signs of life: proceedings of the Second European Congress
on Sign Language Research, edited by B.Th. Tervoort, Publication of the Institute
for General Linguistics Amsterdam: University of Amsterdam 50.
Boyes Braem, P. and Sutton-Spence, R. 2000. Editors of The hand is the head of the
mouth: the mouth as articulator in sign languages. Hamburg: Signum.
Brentari, D. 1996. Sign language phonology: ASL, pp 615–39 of The handbook of
phonological theory, edited by J. A. Goldsmith. Cambridge, Mass.: Blackwell.
Coerts, J. A. 1992. Nonmanual grammatical markers; an analysis of interrogatives, negations and topicalisations in Sign Language of the Netherlands. PhD dissertation,
University of Amsterdam.
Collins-Ahlgren, M. 1990. Word formation processes in New Zealand Sign Language,
pp 279–312 of Theoretical issues in sign language research, Vol. 1, edited by S. D.
Fischer and P. Siple. Chicago: Chicago University Press.
Coulter, G. R. 1993. Editor of Phonetics and phonology, Vol. 3: Current issues in ASL
phonology. San Diego: Academic Press.
DeMatteo, A. 1977. Visual Imagery and visual analogues in American Sign Language,
pp 109–36 of Friedman, 1977.
Friedman, L. A. 1977. Editor of On the other hand: new perspectives on American Sign
Language. New York: Academic Press.
Gee, J. P. 1993. Reflections on the nature of ASL and the development of ASL linguistics: comments on Corina's article, pp 97–101 of Coulter, 1993.
Glück, S. and Pfau, R. 1997. Einige Aspekte der Morphologie und Morphosyntax in
Deutscher Gebärdensprache, Frankfurter Linguistische Forschungen 20.30–48.
Johnston, T. and Schembri, A. 1999. On defining lexeme in a signed language. Sign
Language and Linguistics 2 (2).115–85.
Klima, E. S. and Bellugi, U. 1979. The signs of language. Cambridge, Mass.: Harvard
University Press.
Kyle, J., Pullen, G., Allsop, L. and Wood, P. 1985. British Sign Language in the British
deaf community, pp 315–23 of SLR 83: Proceedings of the Third International
Symposium on Sign Language Research, Rome June 22–26, 1983, edited by
W. Stokoe and V. Volterra. Silver Spring, Md.: Linstok and Rome: Istituto di psicologia CNR.
Liddell, S. K. 1980. American Sign Language syntax. The Hague: Mouton.
Liddell, S. K. and Johnson, R. E. 1986. American Sign Language compounds: implications for the structure of the lexicon, pp 87–97 of Georgetown University Round
Table on Languages and linguistics 1985 Languages and linguistics: the interdependence of theory, data, and application, edited by D. Tannen and J. E. Alatis.
Washington, D.C.: Georgetown University Press.
Lucas, C. and Valli, C. 1995. Linguistics of American Sign Language: an introduction.
Washington, D.C.: Gallaudet University Press.
Malkiel, Y. 1990. Diachronic problems in phonosymbolism: edita and inedita, 1979–1988,
Vol. 1. Amsterdam: John Benjamins.
Mandel, M. 1977. Iconic devices in American Sign Language, pp 57–107 of Friedman,
1977.
Morgan, M. 2000. Negatives and interrogatives in Japanese Sign Language,
Questionnaire for the typological project on negatives and interrogatives in signed
and spoken languages, Ms., Melbourne, La Trobe University, Research Centre for
Linguistic Typology.
McNeill, D. 1992. Hand and mind. Chicago, Ill.: Chicago University Press.
Sandler, W. 1999. Cliticization and prosodic words in a sign language, pp 223–54
of Studies on the phonological word, edited by A. T. Hall and U. Kleinherz.
Amsterdam: John Benjamins.
2000. The medium and the message: prosodic interpretation of linguistic content in
Israeli Sign Language, Sign Language and Linguistics 2 (2).187–215.
Sir Syed Deaf Association 1989. Pakistan Sign Language, edited by Syed Iftikhar
Ahmed. Rawalpindi.
Tanzania Association of the Deaf (Chama cha Viziwi Tanzania, Chavita) 1993. The
Tanzania Sign Language dictionary (Kamusi ya Lugha ya Alama Tanzania). Dar
es Salaam.
UNAD (Uganda National Association of the Deaf) 1998. Manual of Ugandan signs.
Kampala.
Wrigley, O. et al. 1990. Editors of The Thai Sign Language dictionary, revised and
expanded edition. Bangkok: National Association of the Deaf in Thailand.
Zeshan, U. 2000a. Sign language in Indopakistan: a description of a signed language.
Amsterdam: John Benjamins.
2000b. Gebärdensprachen des indischen Subkontinents. Munich: LINCOM Europa.
Forthcoming. Classificatory constructions in Indo-Pakistani Sign Language: grammaticalization and lexicalisation processes, in Perspectives on classifier constructions in sign languages, edited by K. Emmorey. Mahwah, N.J.: Lawrence Erlbaum.
Zimmer, J. and Patschke, C. 1990. A class of determiners in ASL, pp 201–10 of
Sign language research: theoretical issues, edited by C. Lucas. Washington, D.C.:
Gallaudet University Press.
Typology
Incorporation
Compared with languages like Inuit or Iroquoian we would probably not want to
characterise Dakotan, or the rest of Mississippi Valley Siouan, as productively
1
Rankin is responsible for the organisation and most of the writing in this chapter. Boyle provided
discussion of Hidatsa and the interpretation of enclitics in linguistic theory. Graczyk provided
examples and discussion of the Crow language. Koontz provided discussion of some of the
Dhegiha and Dakotan data, commented on empirical versus formal analyses and did a portion of
the writing.
In (1), a Lakota verb (and sentence), a single phonological word with a single primary accent, incorporates a body part noun, hi- 'tooth', a locative-instrumentive
prefix, i-, that fits into a locative slot in the verb, an actor/agent pronominal, wa-,
one of a selection of about nine prefixes indicating particular instrumentals, in
this instance pa- 'by pushing', and the verb root itself. Enclitic to the verb
are the potential mode marker, =kta, and the male declarative marker, =yelo,
which, although perhaps best considered an enclitic, often bears accent. The
semantics of the verb are fairly transparent in this example, typical of syntactic
incorporation in Lakota.
(2)
In (2), the pronominal allomorph, b-, which must precede the instrumental
prefix yu- 'by hand/pulling', comes first, followed by the instrumental and an
incorporated noun, c hate 'heart', the verb stem and the negative enclitic =sni.4
Here the meaning 'make angry' is not equivalent to the sum of the meanings of
2
Writing for an audience of non-Siouanists, we have adopted the abbreviations and terms
instr 'instrumental' and inst 'instrumentive' or 'locative-instrumental' in this chapter to describe two different kinds of instrumental prefixes commonly found in Siouan languages. The
i- instrumentive used in example (1) is a general agreement prefix for nearly any noun instrument, and it falls into a morphotactic class with the locatives. The other instrumental prefixes
form a closed set of about nine with more or less specific meanings such as 'by pulling', 'by
pushing', 'by heat', 'by foot', 'by mouth', etc. (see table 1, columns 4 and 7). Siouanists all too
frequently tend to use the term 'instrumental' for both kinds, leaving the distinction to context or
example.
c hate 'heart' takes the form c hal- when fully incorporated because Dakotan only permits sonorants
in syllable codas of this kind. Thus, p, t, c , k → b/m, d/l/n, d/l/n, g/ respectively in codas. Voiced
stops function as sonorants in this context, with nasality conditioned by the preceding (oral versus
nasal) vowel. Truncation of incorporanda signals that the overall construct is a single grammatical
word.
De Reuse's translation gives 'bad' as the meaning for the stem, waxte. If this were the case,
the verb would not require the negative enclitic. Waxte must be related via fricative symbolism,
common in Siouan, to the verb waste 'good' rather than 'bad'.
the individual derivational morphemes and roots that make up the verb. Note
also that, comparing (1) and (2), the order of incorporated nouns, pronominals
and instrumentals may differ from verb to verb. This is typical of what de
Reuse (1994), in his extensive treatment of lexical and syntactic incorporation
in Siouan, calls lexical compounding in Lakota.
1.2
In using the agglutination-fusion scale we must distinguish between derivational and inflectional morphology. As Sapir mentioned, Siouan derivational
morphology is quite generally agglutinative, but inflectional morphology has
a definite fusional tinge. Consider the allomorphy of the agentive pronominal
prefix set in the Kansa language (other Dhegiha dialects are quite similar, and
Dakotan also shares reflexes of types (a-d)). Each example represents a conjugation class. Types (a), (b) and (e) are extremely productive; the other classes
have relatively few members each, but their members enjoy a high frequency
of occurrence. Third person has zero marking and shows the underlying form
of the verb.
             1sg            2sg            3sg
(a) carry    a-ki7          ya-ki7         ki7
(b) go       b-le           h-ne           ye
(c) come     p-hu           š-u            hu
(d) do       m-o            ž-o            o
(e) break    p-paxi         š-paxi         bax:
(f) see      t-to be        š-to be        do be
(g) make     p-pae          š-kae          gae
(h) want     k-ko -b-la     š-ko -h-na     go -ya
The first person singular pronominal affix, historically *wa-, shows a complex alternation pattern that is partly phonologically conditioned and partly
lexically conditioned. The first person singular allomorphs are a-/b-/m-/p-/t-/k-. Originally the conditioning factors were phonological and assimilatory
in character. The rather involved rule that has developed is virtually identical in every Dhegiha dialect; it generally involved obstruentisation of the
original *w-.
Second person pronominals show an even greater degree of fusion, often replacing the initial consonant of the verb stem. Allomorphs are ya-/š-/ž-/h-, from
*ya-. Again, prefix vowel syncope and obstruentisation, this time of y, account
for the palatal contoid allomorphs. Certain other prefixes, mostly inflectional,
also show partially fused allomorphs. Splitting the notion of agglutination from
1.3 Polysynthesis
Adopting Comrie's additional distinction between polysynthesis and incorporation, most Siouan languages fall somewhere between synthetic and polysynthetic on the polysynthesis scale but actively incorporate independent lexemes
only to a relatively reduced degree. For example, Dakota speakers do not usually
incorporate generic noun direct objects. De Reuse (1994) argues that incorporated nouns are pragmatically backgrounded and, in fact, are not arguments of
the verb. Siouan, with its optional, pragmatically marked incorporation, is not
like Eskimo or Iroquoian in which certain kinds of object incorporation are
obligatory (see chapter 3).
Crow and Hidatsa are typologically rather different from Dakotan, tending
much more strongly toward both polysynthesis and incorporation. Not only is
noun incorporation more common; incorporation of entire relative clauses is
possible. Example (3) is a single phonological verb containing a single accent.
Pauses within such constructs are not generally possible.
(3)
akdiiammalapashkuuassaaleewaachiinmook
ak-dii-ammalapaškuua-ss-aa-lee-waa-čiin-m-oo-k
rel-2obj-Billings-goal-port-go-1actr-look.for-one-mode-dec
'We'll look for someone who [will] take you to Billings.'
Grammatical words
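2.1
[Table 1: verb prefix sets (column layout partly lost; forms as they survive)]
Set   Forms               Glosses
1-2   a kwa-              'something'; 'we', 'us'
3     i, a, o, i-         'with', 'at/on', 'inside', 'toward'
4     pata rpo            'cut', 'heat', 'shoot'
5     adiO                'me', 'you', 'him/her'
6     adaO                'I', 'you', 's/he'
7     kikikkikkkikkik-    'to/for', 'back', 'self', 'each other', 'one's own'
8     badinadakabi-       'pushing', 'hand', 'foot', 'mouth', 'striking', 'pressure'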
2.1.1 Initial problems with prefix order It is not the case in Siouan
that derivation occupies the inner affix orders and inflection the outer ones.
Sets 1, 3, 4 and 8 are clearly derivational, while 2, 5, 6 and at least some of 7 are
inflectional. The locative and instrumental prefix sets (3, 4 and 8) give Siouan
languages some of their polysynthetic 'feel', while noun incorporation and
compounding provide the rest. Note that there is no fixed slot for incorporated
nouns in this schema. This issue is taken up in the next section.
The bound locatives, pronominals and instrumental affixes yield Quapaw
verb forms of the following sort (Rankin 1986) (other Mississippi Valley
languages are similar).
(4)
(5)
(6)
wa-ta r-i7ži
1du.pat-instr-fail
'we two fail(ed) in cooking (with heat)'5
(7)
wa-o -ta r-i7ži=we
1du.pat-loc-instr-fail=pl
'we fail(ed) in cooking (with heat) inside something'
2.1.2 Order of incorporanda This basic picture becomes more complicated when we consider noun-verb compounding and incorporation. According to most authors (e.g., Boas and Deloria 1941, Mithun 1984), noun
incorporation is relatively poorly developed in Siouan, amounting to little more
than transparent compounding of a noun root with a verb stem (see de Reuse
1994 for a more detailed view of Dakotan incorporation). Nevertheless, in numerous instances incorporation has a meaning distinct from the sum of its
semantic parts, i.e. it has become lexicalised. Much but not all such incorporation involves body parts: Lakota (Buechel 1970), e.g. with pha 'head', i-pha-hi
inst-head-lean = 'lean the head against', a-ph-o-mnamna loc-head-loc-shake-shake = 'shake the head about', i-pha-sloka inst-head-slide = 'pull off
over the head', pha-sla-ye-la head-bare-caus-dim = 'making bare the head';
with hi 'tooth', hi-i-pa-spu tooth-inst-instr-pluck = 'pick the teeth'; with si
'foot', si-yu-thipa foot-instr-cramp = 'have a foot cramp'; with c hate 'heart',
yu-chal-waxte-sni instr-heart-good-neg = 'make someone angry'; with thezi
'stomach', thezi-yu-thipa stomach-instr-cramp = 'have a stomach cramp';
with thahu 'neck', thahu-yu-thipa neck.bone-instr-cramp = 'have a cramp
in the neck'.
Examples from other Siouan languages include, in Kansa (Rankin 1987), with
nae 'heart', nae-laye heart-great = 'be brave'; nae-wahehe heart-tremble
= 'be cowardly'; na-i-o heart-inst-do = 'love someone very much'; with
ho 'voice', ho-xpe voice-weaken = 'cough'; in Biloxi (Dorsey and Swanton
1912), with yadi 'heart', {prn}-yadi-hi heart-think(?) = 'think of someone
constantly'; {prn}-yadi-niki heart-lack/be.none = 'be without any sense';
5
The instrumental prefix ta r- 'by extreme of temperature' requires subject pronominals from the
patient (stative) set.
Tutelo incorporation of nouns within the verb word seems more complete, since
the pronominal prefixes typically occur outside the incorporated noun.
Note that these examples of compounding reveal that the verb prefix order
we presented above was oversimplified. While it conveys an idea of the relative
locations of Siouan prefix classes, Siouan languages really do not lend themselves to description in terms of templatic morphology for reasons that should
now become clear. In Lakota 'lean the head against' and 'pull off over the head'
the incorporated noun, 'head', follows the locative prefix and precedes the verb
root. In 'pick the teeth' the noun precedes both the locative prefix and verb root,
while in 'shake the head about' one locative prefix precedes the incorporated
noun and another follows it. In 'to have a cramp' the affected body part noun
is inserted before the instrumental prefix, while in 'to make angry' the body
part, 'heart', follows the same instrumental. So the location of the incorporated
noun is rather variable, especially with regard to the locative and instrumental
prefix sets. Semantic role of the incorporated noun is not a determining factor.
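One way to see why a single template fails is to treat each glossed Lakota form cited above as a sequence of morpheme classes and ask whether any fixed left-to-right order of classes is compatible with all of them. The following is a minimal sketch under stated assumptions: the class labels (`inst`, `loc`, `instr`, `noun`, `root`) are taken from the glosses, a reduplicated root is counted once, the negative enclitic is left out, and the function name `precedence_conflicts` is hypothetical.

```python
from itertools import combinations

# Morpheme-class sequences read off the glosses of the Lakota forms cited above.
FORMS = {
    "lean the head against": ["inst", "noun", "root"],
    "pull off over the head": ["inst", "noun", "root"],
    "shake the head about": ["loc", "noun", "loc", "root"],
    "pick the teeth": ["noun", "inst", "instr", "root"],
    "have a foot cramp": ["noun", "instr", "root"],
    "make someone angry": ["instr", "noun", "root"],
}


def precedence_conflicts(forms):
    """Return class pairs attested in both orders, with the forms showing each order."""
    seen = {}  # (earlier, later) -> set of translations attesting that order
    for gloss, classes in forms.items():
        for i, j in combinations(range(len(classes)), 2):
            earlier, later = classes[i], classes[j]
            if earlier != later:
                seen.setdefault((earlier, later), set()).add(gloss)
    conflicts = {}
    for (a, b), glosses in seen.items():
        # Report each unordered pair once, keyed in alphabetical order.
        if a < b and (b, a) in seen:
            conflicts[(a, b)] = (sorted(glosses), sorted(seen[(b, a)]))
    return conflicts


conflicts = precedence_conflicts(FORMS)
for (a, b), (a_first, b_first) in sorted(conflicts.items()):
    print(f"{a} before {b}: {a_first}; {b} before {a}: {b_first}")
```

Every conflicting pair involves the incorporated noun, while the root is final in all six forms. A fixed slot for the noun relative to the locative and instrumental sets therefore cannot be stated for this sample.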
A more complicated case involves verbs like 'cough'. Historically, it incorporates ho 'voice', but comparing Mississippi Valley Siouan languages, we
find two distinct conjugation patterns. We would expect ho-{pronoun}-xpe,
and that is what we find in Kansa and Quapaw: 1sg ho-a-xpe, 2sg ho-ya-xpe 'you cough(ed)' (Kansa), 2sg ho-da-xpe (Quapaw). In Dakotan, however,
the verb is conjugated with the pronominals outside the incorporated noun:
1sg wa-ho-xpe, 2sg ya-ho-xpe.
It appears that Dakotan speakers have lost etymological awareness of the
source of hoxpe, which is now treated as a monomorphemic root, and they have
moved the pronominals accordingly. Position of the noun seems to depend on
when, historically, the original compounding was done and/or degree of opacity
of the result, in other words, degree of lexicalisation.
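The two conjugation patterns for 'cough' can be sketched as a choice of where the pronominal goes relative to the incorporated noun. The sketch uses the simplified forms exactly as cited above; the helper `conjugate` and its parameter names are hypothetical, and it illustrates the contrast only, not Siouan morphology in general.

```python
def conjugate(noun, root, pronominal, pronominal_outside):
    """Build a hyphenated form with the pronominal outside or inside the noun."""
    if pronominal_outside:
        # Stem treated as unanalysed: the pronominal precedes the old noun.
        parts = [pronominal, noun, root]
    else:
        # Compound still transparent: noun-{pronoun}-root.
        parts = [noun, pronominal, root]
    return "-".join(parts)


# Pronominal shapes as cited in the text for 'cough'.
KANSA = {"1sg": "a", "2sg": "ya"}
DAKOTAN = {"1sg": "wa", "2sg": "ya"}

kansa_forms = {
    person: conjugate("ho", "xpe", marker, pronominal_outside=False)
    for person, marker in KANSA.items()
}
dakotan_forms = {
    person: conjugate("ho", "xpe", marker, pronominal_outside=True)
    for person, marker in DAKOTAN.items()
}

print(kansa_forms)
print(dakotan_forms)
```

The only difference between the two languages in this sketch is the boolean flag, which stands in for the degree of lexicalisation described above.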
2.1.3 Order of locatives Locative/instrumentive prefixes also present
problems for any synchronic, invariant ordering proposals. They can double
up, a second locative appearing outside of an earlier one to derive a new lexical
verb stem. There are a great many examples of doubled or even tripled locative
prefixes. When locatives are layered there appears to be no fixed order for
them: the order is the one in which they were applied historically. Often one
and sometimes both of the prefixes have been bleached of any clear locative
meaning. The process and a variety of examples are discussed in detail in
Boas and Deloria (1941: 435): a--capa 'open mouth toward' < -capa 'open
the mouth'; i-a-kaska6 'imprison' < kaska 'tie fast'; a-o-nathaka 'lock in';
6
Boas and Deloria write an initial glottal stop with all vowel-initials and also write a glide, [y]
between the vowels. Both elements are predictable and have been omitted here for clarity. I have
also replaced Deloria's symbols for aspiration, s, z, c , x, and / with currently used symbols on
a one-to-one basis.
The Dakotan inclusive and first person plural pronominal has two allomorphs, [/]u k- preceding
vowels and [/]u- preceding consonants. An analogous rule exists in Dhegiha languages.
        Dakotan           Omaha            Quapaw
        (Buechel 1970)    (Dorsey 1890)    (Dorsey 1890–94)
1sg     ma-wa-ni          ma -b-D:         a-ma - b-di
2sg     ma-ya-ni          ma--n:           da-ma - t-ti
3sg     ma- -n:           ma - -D:         ma- -n:
1du     ma- u-ni          a- ma- -Di       a-ma - a-ni
These examples, plus the ones illustrating noun incorporation, further demonstrate that templatic morphology is not destined to work with Siouan languages.
The semantics of locatives is especially problematic as they variably lose their
locative meaning to one or another degree. At one time they were proclitics
or independent words as they tend strongly to attract accent, and one of them,
i- 'with, using', may appear as an independent word in Crow, suggesting recent
grammaticalisation in the Mississippi Valley.
Certainly all of the forms cited above are grammatical words. The grammatical morphemes, both derivational and inflectional, within them occur together,
not scattered throughout the clause. They also have the requisite conventionalised meanings. But the constituent grammatical morphemes do not occur in
a consistent, predictable order, i.e., they cannot be thought of as generated by a
simplistic set of quasi-syntactic rules.
2.1.6 Two kinds of incorporation We have been using the terms compounding and incorporation more or less interchangeably, because it is not always possible to distinguish them on syntactic or phonological grounds, but it is
useful to distinguish them semantically. Boas and Deloria (1941: 67), discussing
this problem, state 'Each compound has only one primary accent . . . Compounding always expresses that the compound is a unit concept. There are
however two degrees of such unity.'
In the first of Deloria's examples thaka 'be large' is a stative verb in an internally headed relative clause. 'Kettle' and 'large' are both independent phonological and grammatical words and both have primary accent. No compounding
or incorporation is involved.
(8)
ma-k/u
me-give
In the next example, thaka functions as a noun modifier, but it is not accented,
and Deloria, an educated native speaker of Lakota, writes a hyphen between
the elements of the compound. This is her first type of conceptual unity.
(9)
čhea-thaka    ki   he    ma-k/u
kettle-large  the  that  me-give
'Give me the big-kettle.'
hex-thaka     wohe/
kettle-large  (3sg)-cooked
'She cooked a big-kettle-ful' (i.e., she cooked for a feast).
Nouns are only truncated this way when their final vowel is of a certain sort (proto-Siouan *-e),
so with other noun-final vowels it may sometimes be difficult to tell the difference between
Deloria's two degrees of unity. The best indicator here is accentual if the first word only has one
syllable. If it is disyllabic so that accent falls on the first word whether or not it is a type 1 or type
2 incorporation, the best indicator of status would probably be semantic.
wine-isshiik
he is drinking wine
(12)
(13)
-chilakee
ak-bus
bus driver
In the second example the accent is on ice, and the incorporated noun is
part of an incorporated relative clause. These are all single phonological words.
It is possible that we might wish to consider distinguishing grammatical word
from syntactic word in such instances. If the latter term has any utility, surely
it would be in cases such as these.
2.2.1 Body part statives There is another type of incorporation
involving inalienably possessed body part nouns and stative verbs: daasicchi
'be happy' < daasa 'heart' + cchi 'be good', daasduupa
'be uncertain, undecided' < daasa 'heart' + duupa 'be two'. These are inflected with the possessive
prefixes: ba-lascchi 'I am happy', da-lasduupa 'you are uncertain', etc. With
these verbs, then, the inflection precedes the incorporated noun rather than
following it.
2.2.2 More templatic problems The notion of template tends to break
down in Crow, just as in other Siouan languages.
(1) There is variability in the ordering of incorporanda and pronominal prefixes. For example, in the word/sentence given above in (3) as an example of
polysynthesis, Crow speakers find an alternate order acceptable. The pronominal dii- 'you', the object of the verb in the subordinate clause, may either
precede or follow 'Billings' + goal:
(14)
(15)
The former represents the unmarked order, while the latter is pragmatically
marked and contrasts Billings with other possible destinations. Thus, in Crow,
the order of morphemes may be manipulated at the subword level in order to
shift focus.
(2) With the future auxiliaries, the agent pronominals are postposed. That
is, the order is pro stem pro aux except for bia 'want to, intend to, be going
to', where the order is pro stem aux pro: baa-lee-wia-waa-k (1actr-go-fut-1actr-dec) 'I am going to go'.
(3) Crow, like other Siouan languages, has a number of verbs with genuinely
infixed person markers. Note the distinct allomorphs with vowels, nasals and
clusters.
       'steal'      'enter'      'know'
1sg    at-b-aal     bm-m-aali    e-wa-hc&e
2sg    at-d-aali    bn-n-aali    e-la-hc&e
3sg    at- -aal     bil- -eeli   e- -hc&e
2.3
Thus far we have seen that Siouan verbs do not fit well into templatic models because the many prefixes are ordered according to sometimes-conflicting
principles. (1) Derivation is recursive in that locatives can be relatively freely
concatenated in any order. (2) Nouns can be incorporated at more than one
point among the prefixes. (3) Pronominals may be infixed within some verbs,
and, if they are prefixed, their order can be disturbed by positioning of locatives. Reduplicated verb roots, although not discussed here, can be regarded as
prefixes, as the process has grammatical functions, marking iteration in active
verbs, intensity in stative verbs. Finally, the patterns illustrated above are quite
typical of what is found across Siouan among fluent speakers. These are not
examples produced by confused semi-speakers of moribund languages.
2.3.1 Implications of diachrony Grammatical words in Siouan include the root and all prefixed material. We might want to ask ourselves, however, whether we wish to treat transparent compounds like Lakota s c hola
'be footless' (from si 'foot' and c hola 'be without') in exactly the same way
that we treat opaque compounds like si-chola 'be barefoot', both with the same
status as grammatical words. Diachronically they are clearly later and earlier
derivations from the same two roots. Such doublets, one a syntactic compound
and the other a lexical compound, are not uncommon though, so the problem
of their overall word status is a real one. Using the term 'syntactic word' at least
for clause-incorporating structures in Crow is one possibility.
The process of lexicalisation is slow, and it is not uncommon to find compounds near some mid-point along the diachronic lexicalisation continuum.
Such partially digested compounds may ultimately resist attempts to distinguish
subtypes of grammatical words on principled grounds. Hoxpe 'to cough' is such
a verb when looked at across Siouan. Some speakers treat it as a Gestalt and have
prefixed agent pronominals to the entire word; others still infix the pronominals,
treating ho 'voice' as a noun. Enclitics (section 4, below) provide additional examples
of grammaticalisation by degree.
2.4
Grammatical hierarchy
3
Phonological words
3.1
Accent
Ken Miner (see Hayes 1995: 346–65 for a good summary) has shown that Winnebago accent has
been shifted to the right, so that it is often a syllable farther from the beginning of the word than
in the other Mississippi Valley languages.
cluster, but clusters are common word-internally. Finally, a speaker may pause
between words, but not word-internally.
Ignoring the problem of exact boundaries, however, it is possible to give a
straightforward definition of 'word' in Crow and Hidatsa based on phonological properties: as in Dakota and Dhegiha, the word is the domain of a single
primary accent. With few exceptions, lexical morphemes in Siouan languages
are inherently accented, and in words composed of more than one morpheme,
there are rules that reduce all the accents to one.
3.2
Pauses
Longer verb forms with their prefixes and incorporanda are not normally interruptible. This seems to include enclitics in Crow, but Dakotan is a more difficult
call. We do not have first-hand information at this point that would inform us
about the ability to pause before enclitics that Deloria tended consistently to write
as separate words. These would mostly be the ones toward the bottom of the
enclitic chart, farther from the verb root.
Verbs, certainly, can form a complete utterance, including stative verbs and
nominal predicates. We take the position that there is no class of adjectives in
Siouan. Some adverbials and other words can be utterances.
3.3
While in everyday speech there is considerable congruity between phonological words and grammatical words, there are exceptions. Graczyk and Boyle,
linguists working with Crow and Hidatsa, find the phonological definition of
the word to be much more helpful than do those of us working farther south with
Dhegihan or Dakotan dialects. There, it is not possible to be categorical about
pauses and accent among the outer clitics. Sometimes these enclitics may be accented according to available sources (Boas and Deloria 1941, Trechter 1995).
If the phonological word is defined accentually, accented enclitics represent the
primary source of mismatch between phonological and grammatical words in
Dakotan.
Depending on how we treat the syntactic incorporations in Crow and Hidatsa,
phonological words in Siouan languages may be made up of one or more
grammatical words. In the more southerly languages, most grammatical words
are simple monomorphemic or polymorphemic sequences, but in verbs (and
trivially in possessed kin-terms) the grammatical words are inflected by prefixation. Thus phonological words may contain multiple inflections, one set
for each grammatical word. In addition, grammatical verb words in particular may manifest a relatively complex polymorphemic derivational structure
including compound verb forms, one or more incorporated nouns, or, in Crow
and Hidatsa, entire incorporated relative clauses. The specific rules for these
derivational structures must be rather complex. No Siouanist has been entirely
successful in producing a detailed account of such a system.
3.4
Enclitics
are frequently not stressed, so they do sound as if they are suffixed to the
verb.
It is a fact, however, that enclitics are often difficult to distinguish from
suffixes on principled grounds. For example, kta 'potential mode' clearly forms
a part of the phonological verb that precedes it; it causes/passes nasalisation
just as if there were no major boundary present. Dakotan kta 'potential mode'
is a (rare) nasalising morpheme. Thus we have examples such as: b-le [ble]
'I go', but, nasalised, b-le + kta [mn: kte] 'I will/would go'. There are
other sandhi rules that operate routinely between stems and enclitics that do not
operate across other major category boundaries. But in other ways, several of
these same traditional Siouan enclitics are hard to distinguish from independent
words. For example, those with more than a single syllable may bear accent,
something that most recognised clitics avoid (Trechter 1995).
4.1
Dakotan enclitics
Rood and Taylor's excellent sketch of Lakota (1996) has twelve enclitic positions. Arranging the orders vertically, beginning with the verb root and working
rightward, these Dakotan clitics are listed below. Some of them were written
as suffixes by Ella Deloria in Boas and Deloria (1941). Others were treated as
distinct words by Deloria and written with a space before them. A few are found
written both ways in the Dakota Grammar. We must assume that Deloria was
conveying her native intuitions about their status with her use of hyphens and
spaces.
Suffixes according to Deloria:
1. ha          continuative aspect
2. pi          plural (subject, animate object or both)
3. la          diminutive
4. ka          'somewhat'
Treated inconsistently by Deloria:
5. kta         potential mode (both word and suffix)
6. sni         negative mode (both word and suffix)
Treated as separate words by Deloria:
7. s/a         iterative aspect
8. yo          imperative, male speaking
   ye          imperative, female speaking
   yetho       familiar imperative, men
   nitho       familiar imperative, women
   or itho     familiar imperative, women
   ye          imperative, request, men/women
   na          imperative, request, women only
9. sec a       conjecture, 'probably'
   nachec a    conjecture, 'probably'
10. keya       quotative
    keya-pi    pl quotative
11. lax        emotional involvement of speaker, 'really'
    laxca      emotional involvement of speaker, 'really'
    laxcaka    emotional involvement of speaker, 'really'
12. More than twenty-seven additional particles marking interrogatives,
    suggestions, assertions, emphatic negation, narratives, etc. Disyllabic
    ones may bear accent.
4.2
To the comparative linguist there are some revealing items on this list. The
potential mode enclitic in Dakotan and Dhegihan languages is part of a pan-Siouan cognate set: Crow issi 'want to', Hidatsa hte 'desiderative', Mandan
-kt- 'future', Dakota kta 'potential', Winnebago ke 'intentive', Omaha tte,
Kansa tte, Osage hte, Quapaw tte 'potential mode', Biloxi te 'want', Tutelo
ta 'future' < Proto-Siouan *kte (Carter, Jones and Rankin, in preparation). From
the distribution of meanings across Siouan, it seems fairly clear that this mode
marker is descended historically from a verb meaning approximately 'want' (see
Boyle 2000). This is not surprising, nor does it bear directly on our synchronic
analysis of Dakotan kta or Dhegihan tte, but it serves as a warning that many of
the post-verbal particles will be found along a continuum of grammaticalised
and semi-grammaticalised verbs.
4.3
A little farther along, in position ten, keya and keya-pi 'quotative' are reflexes
of the common Siouan verb *k-e rhe 'say the preceding'.10 And, although it is
labelled an enclitic, it is still inflected for number with its own enclitic, -pi 'plural'. So an enclitic has its own enclitic, and -pi appears twice (2 and 10) in the
ordering. Deloria consistently writes these as separate, post-verbal words. It is
common for a quotative to be derived from a verb of saying, but might this still be
a verb synchronically, perhaps an auxiliary? It is no longer inflected for person,
aspect or mode, but it retains other verb-like characteristics and has its own internal structure, i.e., an incorporated, frozen deictic object and number marking.
In other Siouan languages there are indeed fully fledged auxiliary verbs
denoting various kinds of Aktionsart and modality (imperfective, perfective,
10
Most Siouan languages form quotative verbs and particles with *e-he say but in Dakotan this
verb has become contaminated by the phonetically similar verb iya to speak.
Noun enclitics
Zwicky (1985) disagrees with Matthews (1965) as to the nature of the Hidatsa
(and, by extension, other Siouan) mood markers. Matthews assumes that, since
they are syntactic in nature, and are readjusted through transformations to form
phonological words, they are clitics (although he does not use the term 'clitic'
overtly). He says that the main clause is assured by the presence of the Mood
constituent at the end, for only the clauses are followed by Mood (Matthews
1965: 38). By calling the mood markers constituents, Matthews stresses their
fundamental syntactic role. Further, Matthews states that constituents, including those that consist of a single morpheme [i.e., the mood markers], are the
important units in Hidatsa syntax. He contrasts this notion with a description
of prefixes (and by extension grammatical suffixes), stating that syntactically
each prefix is most closely associated with a specific constituent of the sentence . . . (Matthews 1965: 54). Affixes, whether inflectional or derivational,
are not constituents for Matthews, since they do not play a syntactic role. The
mood markers, on the other hand, do play a syntactic role; however, they cannot
stand alone as grammatical words (Boyle 2000). As a result, they attach phonologically to the preceding verb and hence are clitics, that is, syntactic elements that
form a phonological word with the preceding verb (Matthews 1965: 267–72)
rather than morphological words themselves. This was also Zwicky's (1985:
629) assessment of Matthews' view of the mood markers.
Zwicky argues instead that they are not clitics but affixes based on three
criteria (see Zwicky and Pullum 1983): (1) that they belong to a relatively
small closed system, one of whose members must always appear at the relative
place in the structure (Carstairs 1981: 4), (2) that clitics can exhibit a low
degree of selection with respect to their hosts, while affixes exhibit a high
degree of selection with respect to their stems, and (3) that morphophonological
idiosyncrasies are more characteristic of affixed words than of clitic groups
(Zwicky and Pullum 1983: 503–4).
Graczyk, however, looks upon several of these same phenomena, particularly morphophonological irregularities, as evidence that the mood markers are
enclitics in the closely related Crow. The bond that exists between the mood
markers and the verb is not as tight as the bond between a verb and an affix.
5
Definitional problems
Overall, one of the central problems here is the impossibility of trying to define
a system in terms of itself. Every logical discipline proceeds from undefined
terms or domains justified by external hand-waving. Geometry does not define
points, lines and surfaces, for example, and grammar can ultimately only define
sentences or grammatical words generatively or, less successfully, with non-grammatical clues. The pertinent domain (sentence, word, etc.) is supplied and
we merely try to account for the supplied examples. But, given a phonological
string, we always lack any really comprehensive set of criteria to characterise
words beyond the rules and generalisations about them. Apart from rules, we
have to depend on judgements such as 'the speaker felt that this example was
not a whole utterance or word', or 'the speaker claimed this was two words,
not one long one'. The speaker intuits this from some internal process that we
hope our statements approximate.
It is not circularity to assemble examples of words and then claim that a
grammar that generates them defines 'word'. It is approximation, if you will.
The process of selecting some data that strike us intuitively as words and then
seeing if we can account for things on that basis is certainly a case of iterative
hypothesis refinement, but it is not circular. The real issue is whether identifying
a target (words, affixes, enclitics, etc.) as a domain gets us anywhere. While
admitting to some deficiencies in the concepts of word, affix or clitic, very few
linguists have suggested the concepts should be rejected, even though there is no
real way to define a word or enclitic except by rule (a statement true of perhaps
most theoretical terms in linguistics). One may develop heuristic approaches to
identifying words, like pauses for breath or thought, or evidential particles or
glottal stops or certain allophones, but these will never be sufficient to define
words. One cannot define a verb word in Siouan as everything between a glottal
stop or aspirate and a final evidential. At best one uses such things as clues
during the period in which one has not developed enough of a grammar of
words or sentences to be able to recognise them by parsing.
Alternatively one may rely on the native speaker or another, earlier linguist,
like Dorsey (1890), Lowie (1930), Boas and Deloria (1941), Robinett (1955),
Rood and Taylor (1960), Buechel (1970), Koontz (1985), Zwicky (1985),
Graczyk (1991) or de Reuse (1994) and trust their judgement, but that, of
course, is informed by their own grammars, however obscure or deficient, and
not by their consciousness of where they pause to take a breath.
Ultimately, it is necessary to define grammatical words with a set of rules
or templates (these latter will not work well in Siouan). Sooner or later we
should be able to parse things in terms of them, or not, and then the case pro,
or con, is closed. In short, if the rules finally work and seem to correspond
to a reality in which speaker intuition and intra-grammatical considerations
hopefully dovetail, they justify themselves. And, of course, that is the crux
of the matter: does hypothesising that Siouan phonological words are made
up of one or more grammatical words plus particular clitics and affixes get us
anywhere? We maintain that it does. Using the concept of grammatical word we
can write a simpler grammar, with fewer exceptions than if we try to generate
or templatise whole phonological words.
5.1
Of course many of the kinds of problems discussed at length above remain. They
require judgement calls by the individual linguist or speaker. For example, it is
probably not the case that it will ever be possible to resolve all of the types and
tokens of complex grammatical words into just three classes of objects (words,
clitics and affixes) to anyone's satisfaction. The boundary between clitic and
affix is inherently unclear in Siouan. We are faced with a continuum of affixed,
incorporated, cliticised, partially or contextually cliticised, loosely juxtaposed
and auxiliary elements, as shown in figure 1.
Syntactic and lexical compounds (or syntactic and lexicalised incorporation)
are polar points on this scale, favoured implicitly by Boas and Deloria and
by de Reuse. Likewise, inflected auxiliaries and uninflected, or semi-inflected,
enclitics are at opposite ends of the continuum. But diachronic study shows
that the processes of lexicalisation and grammaticalisation are gradual and that
they affect both words and speakers one at a time. It is therefore inherently
unlikely that synchronic linguists, fond as they are of binary categories, will
ever be able to define a precise set of objective criteria that will enable them to
apportion all compounds or incorporations to one or the other category to either
their colleagues' or the speakers' satisfaction (much less both). In cases where
some post-verbal elements are still partially inflected as verbs, as they are in
the Dakotan quotative, it is safe to say that there will be plenty of instances in
which we cannot make a principled determination of their status. This follows
from the very nature of analogical linguistic change.
[Figure 1: a continuum along a time axis, ending in fossilised lexical(ised) compounds, enclitics and affixes]
People do talk about words in their native languages and in English when
speaking about them or attempting to write them. This is not a topic about
which most Siouanists are fully conversant, as the necessary experimental evidence
is not available, but there is a written tradition at least in Dakotan beginning
with the translation of the Bible into that language well over a century ago and
extending to the sophisticated native analyses of Ella Deloria, in the grammar
she co-authored with Franz Boas (1941) and in her text collections (1932). No
doubt both these works contain the speakers' insights and the grammatical
prejudices of the Bible-translators or of Boas.
Those of us involved with language teaching programmes note that the post-verbal elements we have described as enclitics tend to be written as separate
words by speakers writing their language for the first time (Kathy Shea, p. c.).
This dovetails fairly well with what Deloria seems to have favoured. The same
writing habits hold for the definite articles that are enclitic to noun phrases. There
does seem to be a certain lack of consistency to all of these efforts, however.
The languages themselves do not contain a large variety of expressions for
linguistic elements. In the nine-hundred-page Omaha and Ponca text collection
by James Owen Dorsey, one of his consultants (Dorsey 1890: 134.16) refers to
a sentence as e 'word(s)'. Another (194.11) refers to talk or instructions as
e=khe 'the (horizontal) words'. And in the traditional story of Two Faces
(209.5) the narrator says: Gise=ama e=the 'he forgot the (standing) words',
in reference to the confusion that falls over Lodge Boy's mind whenever he tries
to tell his father that his secret twin, Spring Boy, has been about. All in all, no
distinction is made among words, sentences, messages, etc. All are e, which
may be either a verb or a noun according to position and usage, and which
means simply 'speech' or 'to speak'.
6.1
Taboo
In Siouan this concept does not have the significance that it does in some
varieties of Austronesian or in Australian languages. One or two observations
can be made, however. Historically in Siouan the terms for 'black bear' and
'snake' appear to have been the subject of taboo, or at the very least, of large-scale replacement. Most Siouan tribes also practise the mother-in-law taboo;
historically, a male may not directly address his mother-in-law. Fletcher and
La Flesche (1911) report cases of Omaha language conversations with in-laws
carried out by addressing a child who is present. This might explain why the
terms for parents-in-law are 'grandfather' and 'grandmother'. Boyle reports that
in Hidatsa women and their fathers-in-law are similarly affected, but perhaps
to a lesser extent.
7
Conclusion
kki   ga ga     akkwa  e         wakkadagi=hna=bi=ama
and   children  both   to.speak  they.were.enchanted=habit.=pl=quotative
'and they say both the children were enchanted in speaking'
References
Boas, F. and Deloria, E. 1941. Dakota Grammar, Vol. 23, 2nd Memoir, National
Academy of Sciences. Washington: Government Printing Office.
Boyle, J. P. 2000. The Hidatsa causative and future markers: a parallel development?,
Paper presented at the Siouan and Caddoan Languages Conference, Anadarko,
Oklahoma.
Buechel, E. 1970. A dictionary of the Teton Dakota Sioux language, edited by P. Manhart.
Pine Ridge: Red Cloud Indian School.
Carstairs, A. 1981. Notes on affixes, clitics, and paradigms. Bloomington: Indiana
University Linguistics Club.
Carter, R. T., Jones, A. W. and Rankin, R. L., et al., eds. In preparation. Siouan comparative dictionary. Computer database.
Comrie, B. 1989. Language universals and linguistic typology. Oxford: Basil Blackwell.
Deloria, Ella. 1932. Dakota texts. Publications of the American Ethnological Society
14. New York: Stechert.
de Reuse, W. 1994. Noun incorporation in Lakota (Siouan), International Journal of
American Linguistics 60.199–260.
Dorsey, J. O. 1890. The Ȼegiha language, Contributions to North American Ethnography
6. Washington: Government Printing Office.
1890–94. Quapaw texts, dictionary and grammar notes, ms. collection of the National
Anthropological Archives. Smithsonian Institution, Washington, DC.
Dorsey, J. O. and Swanton, J. R. 1912. A dictionary of the Biloxi and Ofo languages: accompanied with thirty-one Biloxi texts and numerous Biloxi phrases,
Bureau of American Ethnology Bulletin 47. Washington: Government Printing
Office.
Fletcher, A. and La Flesche, F. 1911. The Omaha tribe, Bureau of American Ethnology
Annual Report 27. Washington: Government Printing Office.
Graczyk, R. 1991. Incorporation and cliticization in Crow morphosyntax. PhD dissertation, University of Chicago.
Hayes, B. 1995. Metrical stress theory. Chicago: University of Chicago Press.
Koontz, J. E. 1985. Omaha grammar notes, ms.
Lowie, R. 1930. Hidatsa texts with grammatical notes and phonograph transcriptions
by Zellig Harris and C. F. Voegelin, Prehistory research series, Vol. 1, no. 6.
Indianapolis: Indiana Historical Society.
Matthews, G. H. 1965. Hidatsa syntax. The Hague: Mouton.
Mithun, M. 1984. The evolution of noun incorporation, Language 60.847–94.
Oliverio, G. R. M. 1996. A grammar and dictionary of Tutelo. PhD dissertation,
University of Kansas.
Rankin, R. L. 1986. Quapaw–English lexicon, ms.
1987. Kansa–English lexicon, ms.
Robinett, F. M. 1955. Hidatsa I, II, III, International Journal of American Linguistics
21.1–7, 160–77, 210–16.
Rood, D. S. and Taylor, A. R. 1996. Lakhota, pp. 440–82 of Handbook of North
American Indians, Vol. 17, edited by I. Goddard. Washington: Smithsonian
Institution Press.
Sadock, J. M. 1991. Autolexical syntax: a theory of parallel grammatical representations.
Chicago: University of Chicago Press.
Dagbani is a Gur language spoken by approximately five hundred thousand people (Dagbamba/Dagombas) in Northern Ghana. With regard to morphological
typology, Dagbani can be described as agglutinative with some degree of fusion:
certain combinations of morphemes may be obscured by phonological and
morphophonological rules. Most affixes in the language occur as suffixes, with
a few prefixes to verbs. Compounding of verbal or nominal roots is common;
typically, the output will be a noun. Syntactically, Dagbani is strictly AVO, SV;
however, frontshifting of objects or other non-verbal elements is very common.
As far as the status of the word is concerned, the language displays a number
of interesting features. These include the structure of possible minimal words
(§1.5), the role of noun–adjective constructions which surface as morphological
compounds (§2.1), and adjectival derivations that illustrate the phenomenon
of bound words (§2.2). Dagbani also has a number of proclitic and enclitic
elements, which is interesting, since most languages that are predominantly
suffixing tend not to have proclitics (see Aikhenvald in chapter 2).
The following sections describe how Dagbani words can be characterised
with respect to their relationship to phonology, morphology and syntax/
semantics.
1
Phonological word
As pointed out in chapter 1, the distinction between grammatical and phonological word is often difficult to make. Among the criteria for the definition of a
phonological word, some of the following have been suggested, although there
are problematic cases:
• boundaries defined by stress or tone;
• pauses (e.g. in dictation);
• phonological rules specifically applying to a word.
These criteria are language-specific, i.e. a word in one language can be defined
by certain criteria, and in another language, by others. A phonological word
is usually composed of smaller prosodic units such as feet, syllables or moras
(which in turn are built from combinations of segments).
Knut J. Olawsky
Stress
As the transcription of tones is not relevant to the discussion in this chapter, all examples are
generally given in Dagbani orthography, which omits statement of tones.
Note that not all of these sounds are phonemes; therefore, they are represented with their phonetic
values. Specifically, [r] and [] are allophones of /d/ and /g/, respectively. The following is a
complete list of Dagbani phonemes: /p, t, k, b, d, , kp, b, m, n, m, , , f, s, v, z, l, , j/ and
/a, e, o, i, u, /.
(1) Penultimate stress
a. Morphologically simple words
ua               ['u.a]             'kapok tree'
kpukpaa          [kpu.'kpa.a]       'wing'
naanzua          [nan.zu.a]         'pepper'
b. Compounds
bin-piel-li      [bn.'pjl.li]       (thing-white-sg)      'shroud'
dulim-bil-a      [du.lm.'bl.a]      (urinate-well-sg)     'urinating hole'
ashibiti-bil-a   [a.ib.ti.'b.la]    (hospital-small-sg)   'clinic'
At first sight, some words seem to deviate from this regularity in that they display
either final or antepenultimate stress. A look at the examples in (2) reveals that
the difference is of a systematic nature: these nouns bear their stress on the
antepenultimate syllable in the singular.
(2)
However, they have something else in common: the singular forms of all these
words contain an epenthetic vowel [] in the penultimate syllable. This vowel is
transparent for stress assignment and is therefore disregarded. Evidence for the
underlying penultimate stress pattern on the words in (2) is found when we look
at the plural forms of these words. Since no epenthetic vowel is necessary in this
context, the stress pattern is regular. The other type of words that deviate from
the regular pattern are shown in (3). In contrast to other nouns, their singular
forms are stressed on the final syllable.
(3)
Note that the plural forms consistently follow the penultimate pattern, a fact
which demonstrates the underlying stress pattern for the singular as well. The
deviations from the regular stress pattern can be understood if the singular forms
of these examples are taken as shortened versions of the underlying form.
Although a word like kunduŋ /kun.dun/ 'hyena' is clearly disyllabic, its stress
pattern reflects an underlying sequence of three syllables, as is demonstrated by
its plural form kunduna /kun.du.na/.3 Further evidence for underlying invisible
3
Note that the underlying form of 'hyena' is /kundun/; there is, however, a velarisation rule that
applies to word-final /n/, which leads to the realisation as [kunduŋ] (cf. §1.4.1).
units in Dagbani can be found in tone assignment, where nasals and long vowels
function as tone-bearing units word-finally only (cf. §1.3).
Now supposing that a final empty syllable belongs to the structure of
words ending in a nasal consonant, a word like kunduŋ can be represented
as /kun.'dun.X/, where X represents the invisible prosodic unit, i.e. a syllable.
Stress has access to this structure and is correctly placed on /dun/. Therefore,
we can now claim for the above-mentioned nominal forms that a catalectic syllable in words ending in CVN or CVV provides the solution for the apparent
mismapping of stress and syllabic structure.
The examples discussed above illustrate the consistency of penultimate stress
assignment in Dagbani. In addition, it has been shown that we have to assume
a more complex prosodic structure for certain lexical items in that they involve
underlying units which have systematic effects on the prosodic realisation of
the word.
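The stress rule described in this section can be sketched as a small procedure. The following Python fragment is only an illustration under stated assumptions: the function names, the representation of a word as a list of syllable strings, and the way epenthetic syllables are flagged are all hypothetical and are not part of the analysis itself. It treats epenthetic syllables as invisible to stress, adds a catalectic syllable X after a word-final CVN or CVV syllable, and then places stress on the penultimate visible position.

```python
# Illustrative sketch only: names and data representation are assumptions.
NASALS = ("n", "m", "ŋ")
VOWELS = set("aeiouɛɔɪʊə")


def ends_in_cvn_or_cvv(syllable):
    """True if the syllable ends in a nasal or in a long (doubled) vowel."""
    if syllable.endswith(NASALS):
        return True
    return (
        len(syllable) >= 2
        and syllable[-1] in VOWELS
        and syllable[-1] == syllable[-2]
    )


def stressed_syllable(syllables, epenthetic=()):
    """Return the index of the stressed syllable.

    syllables  -- list of syllable strings, e.g. ["kun", "dun"]
    epenthetic -- indices of syllables whose vowel is epenthetic
                  (these are skipped when counting from the right)
    """
    # Syllables with an epenthetic vowel are transparent for stress.
    visible = [i for i in range(len(syllables)) if i not in epenthetic]
    if not visible:
        raise ValueError("no stressable syllable")
    # A final CVN or CVV syllable is followed by a catalectic syllable X.
    if ends_in_cvn_or_cvv(syllables[-1]):
        visible.append(None)
    if len(visible) == 1:
        return visible[0]
    # Penultimate stress over the visible positions.
    return visible[-2]
```

With the underlying syllabification /kun.dun/ the sketch returns the final overt syllable, and with /kun.du.na/ it returns the penult, matching the singular and plural of 'hyena'. A word with an epenthetic penult (for instance bi.hi.li, assuming the middle vowel is the epenthetic one) comes out with antepenultimate stress.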
1.3
Tone
smaller than the word. In most cases, they apply to the domain of the syllable.
However, one example which applies to the phonological word will be mentioned here: words that underlyingly end in the nasal consonant /n/ are realised
with [ŋ]. This velarisation rule applies exclusively at the end of a phonological word; kunduŋ 'hyena' is an example already mentioned in §1.2. This rule
applies to all nouns and verbs.
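As a minimal sketch of this word-level rule, the Python fragment below rewrites an underlying word-final /n/ as the velar nasal [ŋ] and leaves all other positions untouched. The function names are hypothetical, and the assumption that phonological words are separated by spaces in the input is made purely for illustration.

```python
# Illustrative sketch only: function names are assumptions, and phonological
# words are assumed to be whitespace-delimited in the input string.
def velarise_final_n(word):
    """Realise an underlying word-final /n/ as the velar nasal."""
    if word.endswith("n"):
        return word[:-1] + "ŋ"
    return word


def realise(utterance):
    """Apply velarisation at the right edge of every phonological word."""
    return " ".join(velarise_final_n(word) for word in utterance.split())
```

Because the rule only inspects the right edge of each word, the medial nasals of /kundun/ and the plural /kunduna/ are unaffected, which is what makes the rule usable as a cue to the word boundary.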
1.4.2 Vowel harmony The other phenomenon that plays a certain
(though peripheral) role in Dagbani phonology is cross height vowel harmony.
Its effects are weaker in Dagbani than they are in some of its neighbouring
languages; however, there is at least a tendency for root vowels to affect suffix
vowels. This is described below.
In general, [−ATR] (lax) and [+ATR] (tense) vowels are not distinctive
in Dagbani; they occur as allophones of each other, as two sets of [, , , -i]
versus [e, o, u, i].4 However, Kropp-Dakubu (1997) observes certain regularities
in the distribution of [+ATR] and [−ATR] vowels in Dagbani which indicate
a systematic occurrence of the two vowel sets with respect to the phonological
word. In particular, she makes the observation that certain vowels in the root
(nucleus) of a word are followed by a second set of vowels in the suffix (coda)
of a word. Though Kropp-Dakubu only lists examples from disyllabic nouns,
vowel harmony is also found in some words whose root is longer than one syllable. The following table illustrates the possible combinations of vowels in roots
and suffixes. Thus, the root vowels [a, , , , -i] are followed by a member of the
same set in the suffix so that the feature [−ATR] is found in both morphemes of
the word. Similarly, the [+ATR] vowels [o, u, i] in the nucleus trigger a vowel of
the same set in the coda. For [a], the contrast [+/−ATR] is neutralised.
(4)
suffix
[+ATR]
[i]
[u]
[o]
[a]
([e])
According to Olawsky (1999), tense and lax vowels are not different phonemes, as they are not
distinctive in any other context. An analysis involving two different sets of vowels that can both
occur in a root, however, would have to postulate that all eight vowels could be underlying,
and therefore could be phonemes. As a solution to this, it would be necessary to analyse the
segmental environment for each vowel in the root. Olawsky (1999) makes some observations
that can predict the occurrence of tense and lax vowels in a root at least tendentially.
Kropp-Dakubu uses [i-] for schwa or a schwa-like vowel. Here, it will be represented as [].
In contrast to the examples that display cross height vowel harmony, there are
words which do not feature this phenomenon. There are at least three contexts
that exclude vowel harmony.
(1) Many disyllabic roots arbitrarily contain two vowels of different harmony
sets, which may suggest that predictable vowel harmony in Dagbani is limited
to combinations of monosyllabic roots with a suffix.
(2) Another instance is the blocking of harmony when an epenthetic schwa is
inserted between the root and the suffix. Epenthetic schwa occurs with all root
types, regardless of their feature value for [ATR], and it blocks vowel harmony.
For instance, the root vowel of bihili /bih-li/ 'breast' is clearly [+ATR]; however,
the suffix vowel tends to be realised [−ATR] ( [bihli]), and the only visible
trigger is the neighbouring epenthetic schwa.
(3) In compounds, each constituent's vowels keep their own [ATR] values, as
is also confirmed by Kropp-Dakubu (1997). The word for 'hencoop', nosu
/no-so-u/ 'fowl-stall-s', is realised as [no.s. ], the vowel in the first part
('fowl') being realised [+ATR], and the second part ('stall') having [−ATR]
vowels. Thus, Dagbani compounds represent single grammatical words with
two vowel harmony units. One cannot say that these are phonological words,
as only one main stress is present.
The examples in (5) and (6) illustrate the distribution of [+/−ATR] vowels in
words with monosyllabic and polysyllabic roots respectively.
(5)
(6)
[i]  [a]  pipia    [pipia]    type of calabash-sg
[u]  [a]  nintua   [nintua]   ring-sg
[o]  [i]  pololi   [pololi]   frog-sg
[i]  [o]  dbino    [dobino]   date fruit-sg
When a root contains a long vowel (which is always realised [+ATR]), such as
in kpeeni 'important', the final /i/ is realised as [+ATR] [i] as well, whereas the
final /i/ in nli 'millstone' is pronounced as [−ATR] [i] because it is preceded
by the [−ATR] vowel []. As mentioned above, lexical exceptions are mostly
found with nouns that have disyllabic (or longer) roots; disharmony may be
found inside the root, whereas the suffix usually takes the [ATR] value of the
last vowel.
(7)
mud-N
kebab-sg
neck-sg
voice-sg
These examples, apart from the cases that are ruled out by epenthesis or complex lexical structure, lead to the following conclusion: while vowel harmony
functions within certain contexts in Dagbani, its occurrence is predictable only
for disyllabic nouns. In these cases, vowel harmony can be used to define a
phonological word. As far as longer nouns are concerned, some follow the rule,
whereas others show disharmonic behaviour.
1.5
In many languages, there is a requirement that the minimal length of a phonological word be bimoraic. This usually applies to lexical categories, whereas
grammatical markers or particles (which may be characterised as clitics in
some cases) may be shorter than this. Regarding the structure of most nouns
in Dagbani, one realises that Dagbani tends to have similar preferences: typical simplex nouns are at least disyllabic, including those surfacing as CVV or
CVN, as mentioned earlier.
Non-lexical categories in Dagbani tend rather to be short, i.e. they typically
have a CV structure, as those displayed in (8). This also includes a number of
words which might be considered adverbials.
(8)
On the other hand, Dagbani has a small number of nouns which despite their
status as lexical categories are clearly monomoraic. Some of these are listed
below:
(9)
Monomoraic nouns
ba /ba/ 'father-sg; ride'
ma /ma/ 'mother-sg'
zo /zo/ 'friend-sg; run'
za /za/ 'millet'
The monomoraic status of these words is confirmed by the fact that ba and ma
contrast with the same words with a long vowel: baa 'dog, swamp' and maa
(definite article). In addition to these nouns, there are many verbs which have
the structure CV. Verbs like di/ba/nyu 'eat/ride/drink' are very common.
Speakers have no difficulty in identifying these units as complete words. But
in spoken language, these verbal roots do not normally occur in isolation: in their
citation form, they are typically accompanied by the prefix /n-/. In sentences,
they occur in combination with a pronoun, and (often) with a suffix indicating
the perfective or imperfective form (as all verbs do, regardless of their length).
On the other hand, it must be assumed that these verbs, as well as the nouns
mentioned above, are stored as monomoraic lexical entries in the lexicon. This
leads to the conclusion that Dagbani phonological words tend to satisfy the
minimal word requirement.
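The bimoraic minimum discussed in this section can be illustrated with a rough mora count. The sketch below is not part of the original analysis: it assumes, for illustration only, that each orthographic vowel letter contributes one mora (so a long vowel written VV counts as two) and that a word-final nasal after a vowel contributes one more. The function names and the letter sets are hypothetical, and the form ban is used only as a CVN shape.

```python
# Illustrative sketch only; the counting conventions below are assumptions,
# not the author's formal analysis of Dagbani.
VOWELS = set("aeiou")   # assumed orthographic vowel letters
NASALS = set("nmŋ")     # assumed nasal letters that can close a word


def count_moras(word: str) -> int:
    """One mora per vowel letter, plus one for a word-final nasal coda."""
    moras = sum(1 for ch in word if ch in VOWELS)
    if len(word) > 1 and word[-1] in NASALS and word[-2] in VOWELS:
        moras += 1
    return moras


def satisfies_minimal_word(word: str) -> bool:
    """True if the form is at least bimoraic under the assumptions above."""
    return count_moras(word) >= 2


if __name__ == "__main__":
    for form in ["ba", "ma", "zo", "za", "baa", "maa", "ban"]:
        print(form, count_moras(form), satisfies_minimal_word(form))
```

Under these assumptions the nouns in (9) and CV verbs such as di and nyu come out as monomoraic, while baa and maa are bimoraic, which matches the contrast described above.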
2
Grammatical word
Note that this vowel is not suffixed to CV and CVN verbs for partly phonotactic reasons. /i/ can
neither follow a short vowel other than /i/ nor be preceded by //.
2.1
Compounding
(10) noun-adjective
a. binpilli    /bn-pjel-li/     (thing-white-sg)
b. waiu        /a-ze-u/         (snake-red-sg)
c. tikpilli    /ti-kpl-li/      (medicine-round-sg)
d. banpilli    /ban-pjel-li/    (skin-white-sg)
combination of both parts results in a new meaning for the whole. Note that the
example in (10a), for instance, is lexicalised, but it may well occur as a spontaneously produced construction of noun plus adjective, referring to a 'white
thing', and not necessarily to a 'shroud'. Such noun-adjective constructions are
particularly interesting since these ad hoc combinations of nouns and adjectives
in a nominal phrase containing an adjectival phrase are built in the very same
way as lexicalised noun-adjective compounds, which may be a good reason
to call these constructions 'compounds'. Accordingly, plural is only marked
on the morphological head at the right word boundary. As mentioned above,
adjectives have the same morphological structure as nouns, as they consist of
a singular and a plural form which are characterised as belonging to a specific
number class.7 When an adjective and a noun are combined in any ad hoc construction, the noun (being in initial position) loses its class suffix; what remains
is only the nominal root, followed by the adjective. The number specification
of the phrase is determined solely by the class suffix on the adjective. The
singular or plural suffix in this position is the inherent suffix of the adjective.
For instance, the root yil- of the class 3 noun yil-a 'horn-sg', as illustrated
in (11a), will be in first position of a construction with the (class 1) adjective
pil-li 'white-sg' as the second element. The resulting form yil-piel-li 'white
horn' has only one suffix, namely the class 1 suffix /-li/ of the adjective. Plural
formation ('white horns') is realised according to class 1 of the adjective as
well (yil-kar-a). While 'horn' in this example is clearly the semantic head, its
morphological features are neutralised and unambiguously determined by the
adjective, i.e. its suffix.
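The formation just described (the noun drops its class suffix and the adjective supplies the only number suffix) can be sketched as a small string operation. This is an illustration under stated assumptions, not the author's formalism: forms are given as hyphen-segmented strings exactly as printed in the text (which may lack special characters), the last hyphen-separated segment of the noun is assumed to be its class suffix, and the function names are hypothetical.

```python
# Illustrative sketch; segmentation and function names are assumptions.

def strip_class_suffix(noun: str) -> str:
    """Return the noun without its final segment, assumed to be the class suffix."""
    segments = noun.split("-")
    if len(segments) < 2:
        raise ValueError(f"no class suffix segmented in {noun!r}")
    return "-".join(segments[:-1])


def noun_adjective(noun: str, adjective: str) -> str:
    """Combine the bare noun root with a fully suffixed adjective form."""
    return f"{strip_class_suffix(noun)}-{adjective}"


if __name__ == "__main__":
    # yil-a 'horn-sg' plus the singular adjective form as it appears in yil-piel-li
    print(noun_adjective("yil-a", "piel-li"))
```

Because the adjective is passed in already suffixed, number is decided entirely by the adjective form chosen, which mirrors the point that the noun's own class features are neutralised.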
(11)
white-sg, pl
holy-sg, pl
The same applies to noun-adjective compounds which involve a type B adjective (cf. note 7), as in (11b): the noun is realised in its root form, whereas the
pseudo-adjective is unmarked in the singular. For plural formation, the default
plural marker -nima is attached to the adjective. This final suffix marks the
whole phrase as [+plural].
7
This applies to canonical adjectives (type A). Dagbani has another type of adjective (type B) that
is pluralised like loans and which differs from type A by various morphological and syntactic
features. A detailed discussion of these differences is in preparation (Olawsky and Ortmann
2002).
2.2
Adjectival derivations
While NPs of the structure noun-adjective can be regarded as compounds on the basis of the
above discussion, other elements that can be part of an NP, such as determiners or quantifiers,
do not form part of it, as they follow the noun-adjective construction and form independent
phonological and grammatical words, e.g. [noun-adjective] demonstrative do-titan-a 'these
big men'.
c. /-a/
no-o
no-nya, no-nyam-a
hen-sg/pl
female fowl-sg
na-u
na-nya, na-nyam-a female cow-sg/pl
cow-sg
/-sa-a/
na-u
na-saa, na-sa-hi
young cow-sg/pl
young cow-sg
Clitics
Table 1 Non-emphatic pronouns
                  pre-verbal   post-verbal
1sg               n, m         ma
2sg               a            a
3sg [+animate]    o            o
3sg [−animate]    di           li
1pl               ti           ti
2pl               yi           ya
3pl [+animate]    bi           ba
3pl [−animate]    di, a        li, a
3.1
Pronouns
Dagbani pronouns are good candidates for being considered clitics, as their
structure seems to accord with the factors mentioned above. Dagbani has
emphatic and non-emphatic pronouns, both of which distinguish person, number
and animacy (third person only). The non-emphatic pre-verbal and post-verbal
pronouns are shown in table 1. The shape and the distribution of the non-emphatic pronouns indicate that they should be regarded as clitics, as they
display the following properties:
• They usually do not stand alone, but occur together with other words.
• The phonological structure of those pronouns which contain schwa (these
are d, l, t and b, but <i> is used in the orthography) does not correspond
to a phonological word, since they can be characterised as consisting of a
consonant plus transitional vowel; that is, the schwa is epenthetic if we represent the third person singular pronoun as /d/, for instance. This makes
/d/, /l/, /t/ and /b/ non-syllabic, which means that these have to be attached to
a phonological host by some means.
• Apart from the fact that the non-syllabic pronouns cannot occur without a
host, none of the non-emphatic pronouns can bear stress.
• The fact that vowel harmony does not apply across a clitic boundary further
strengthens the view that they are not affixes (which also follows from syntactic evidence, as other non-prefixal elements can occur between pronouns
and verbs).
These arguments suffice to justify treating the non-emphatic pronouns discussed
in this work as proclitics.
It should be noted that the set of possessive pronouns (occurring in prenominal position) is identical to the pre-verbal pronouns. Another aspect worth
mentioning is that pre-verbal and post-verbal pronouns are closely related in that
they share the initial consonant. Whereas the unmarked (pre-verbal) pronouns
tend to consist of a consonant plus schwa, the post-verbal pronouns end in
the vowel [a], but have the initial consonant in common with their pre-verbal
counterparts.
In summary, as non-emphatic pronouns can be regarded as grammatical
words, but not as phonological words, they can be characterised as clitics.
3.2
Pre-verbal markers
One extreme would be to call these prefixes, as each of them can occur adjacent
to a verbal root; on the other hand, one might regard them as independent words,
since some of them are not necessarily unstressed or phonologically weak. In
the following, the status of each of these markers will be discussed.
3.2.1 Tense markers
Tense is expressed by elements which occur before the verb. More precisely, only future tense has such markers, ni (affirmative)
and ku (negative), whereas past or present (= non-future) does not need any
specific marking. A sentence without explicit tense marking may be past or
present, depending on the context.
(14)
There are several typical features of a phonological word that ni and ku lack:
(1) phonological weakness of n, whose nucleus is a (probably epenthetic)
schwa, and which is reduced to [n] after vowels (i.e. almost always);
(2) neither marker ever bears stress, so neither is a phonological word;
These markers precede the verb or any tense/aspect or negative markers. For
instance, in (16a), sa precedes the negative marker bi; in (16b), daa is followed
by the future marker ni, and in (16c) by the aspect marker yn.
(16)
None of the three elements is a complete grammatical word, as they do not occur
in isolation. But while there is good reason to assume that they are clitics, they
differ from the clitics described so far with respect to a few features.
(1) Unlike most pronominal clitics, sa and daa are phonologically more prominent (while di is realised with a schwa as syllabic nucleus). They both contain
full vowels and are never reduced phonologically.
(2) All three markers can potentially bear stress under certain conditions. This
is the case when the emphasiser /n-/ is prefixed to them. (/n-/ is usually attached
Post-verbal emphasisers
Two other elements which are candidates for being considered clitics are the
focus marker la and the emphasiser mi. In contrast to the examples discussed
above, both occur in post-verbal position. Since their function is rather complex,
9
Another marker related to negation is ku. However, there are reasons to classify ku, which clearly
combines information about negation and future tense, as a future marker first and foremost.
It appears in the same position as its affirmative counterpart ni, while bi occurs in a different
position and has a different status, as shown earlier in this section.
la is generally found after the verb. However, there are certain restrictions on
its distribution:
• it is never found when no word follows (or when a subordinate clause follows);
• it does not occur before non-emphatic pronouns (possibly because they cannot
be focussed).
Interestingly, separation of the focus marker la from the verb is possible, as
the examples in (18) show. Bawa (1978) mentions this type of construction
as 'disjunctive occurrence' of la. The difference in meaning as compared
to the unmarked position is hard to explain. Another aspect is that la also
functions as a definite article so that the la in saim la (18a) may be interpreted as 'the (porridge)' rather than as the same particle occurring in dila.
Speakers of Dagbani usually have difficulties assigning a precise meaning to
la in this position. However, it attaches to a phonological host, forming a stress
unit with it, which is ['samla] or ['dbala], respectively, in the examples
below.
(18)
Discontinuous use of la
a. Adam  di   saim      la.
   Adam  eat  porridge  (def.art?)
   'Adam has eaten (the?) porridge.'
b. o    daa  di          ba   la   bi-hi     ata.
   3sg  prx  give.birth  3pl  foc  child-pl  three
   'She gave birth to (them) three children.'
The other marker found in post-verbal position is mi. It can be best described as
an emphasiser, although its exact function is difficult to describe, as is the case
for la. In a way, it is related to la, as it also implies continuous meaning when
it co-occurs with the imperfective (cf. (19a, b)). It also occurs with perfective
forms, where its function is simply emphatic, rather than affecting the interpretation of aspect (cf. (19c)). The crucial difference between mi and la is that mi
occurs only with intransitive clauses.
(19)
mi in intransitive clause
a. o    daa  di-ri-mi.
   3sg  prx  eat-impfve/emph
   'He was eating.'
b. n    di   ku-ni-mi.
   1sg  prx  go.home-impfve/emph
   'I was going home.'
c. bi-hi     maa      di-la    saim.
   child-pl  def.art  eat-foc  porridge
   'The children have eaten porridge.'
   bi-hi     maa      di-mi.
   child-pl  def.art  eat-emph
   'The children have eaten.'
Like la, mi can also be separated from the verb in transitive sentences.
The examples in (19) show that mi cannot be followed by an object. Nevertheless, mi occurs in clearly transitive sentences; in this case it is separated from
the verb and inserted after the object phrase to indicate emphasis. Bawa (1978)
again characterises this occurrence of mi as a 'discontinuous form'. As with
la in this position, the analysis of this form is ambiguous, as there is
another particle mi, meaning 'also', which may occur in the same position
(20e).
(20)
Discontinuous occurrence of mi
a. o    nyu-ri        kom    mi.
   3sg  drink-impfve  water  emph
   'He is drinking water.'
b. o    puhi-ri       ma  mi.
   3sg  greet-impfve  me  emph
   'He is greeting me.'
c. o    bihi-ri       pump  mi.
   3sg  sleep-impfve  now   emph
   'He is sleeping now.'
d. o    bihi-ri-mi         pump.
   3sg  sleep-impfve/emph  now
   'He is sleeping now.'
e. o    puhi-ri       naa    mi.
   3sg  greet-impfve  chief  also
   'He is also greeting the chief (because it's the chief's turn).'
The difference between sentences (20a–d) and (20e) is unclear, as in the former,
the interpretation 'also' is obviously not given, whereas this translation seems
to be valid for (20e). Another aspect to consider is that (20a) and (20e) are
transitive sentences, but as mentioned above, mi typically occurs with intransitives. This raises the question whether mi in (20a) and (20e) is actually the
same as the emphatic marker. Note also that the position of mi is variable when
the complement involved is not a direct object, but an adverbial, as in (20c, d).
In any case (whether it is an emphasis marker or means 'also'), mi is not a
phonological word, as it always is an unstressed part of a stress unit.
Table 2
Items compared: words (general); affixes (general); future tense markers; negation marker; personal pronouns; postverbal emphasisers la, mi; proximality markers.
Features compared: segmentally salient; can bear stress; attached to a phonological host; variable word order; occur in isolation.
The discussion above raises the question of how la and mi should be characterised regarding their categorisation as word or clitic. What is certainly
remarkable is their ability to shift to phrase-final position under certain conditions. Phonologically, both lack the status of a word, since they do not bear
stress. Their varying position within the sentence further supports the view that
they are clitics. Since we have already observed different kinds of clitics in
Dagbani, the question will be to what category la and mi can be assigned. The
best way to answer this is to view them in a table that compares the various
features of all markers between word and affix that have been discussed in
this chapter. Note that one criterion is segmental saliency. By this I mean the
degree of phonological prominence, e.g. an epenthetic schwa is less salient than
a full vowel. Table 2 clearly indicates the gap between proximality markers
and other clitics (or affixes), as the former display certain features that show a
closer relationship to words than to affixes. I conclude that Dagbani has a variety of clitics, which may have different properties. Some are obviously closer
to affixes, whereas others share more features of words.
4
In many instances, the grammatical and the phonological word in Dagbani coincide. A morphologically complete grammatical word is always a phonological
word, with the exception of clitics, as was discussed in 4: if proclitics which
consist of a (non-nasal) consonant only are grammatical words, but are phonologically complete only when attached to a host (which involves epenthesis
of a vowel), we must speak of two grammatical words contained in one phonological word. Note that phonological and grammatical words also coincide in
compounds: since a compound may contain several roots, but only one suffix,
only the whole will be acceptable as a grammatical word. This coincides with
the fact that compounds have only one primary stress, namely on the penultimate syllable, while each of the roots involved would bear stress on its penult
if it stood in isolation. Even reduplications, which are grammatical words in
many languages, containing more than one phonological word, do not match
the criteria: many reduplicated words in Dagbani are inherently reduplicated
ideophones and not duplicates of one word that otherwise occurs in isolation.
Examples like /bjela-bjela/ 'slowly' are not decomposable into two meaningful
units; on the phonological side, the word has only one primary stress. Another
type of construction that also shows a mismatch between grammatical and
phonological word is represented by bound adjectives (cf. 2.2). On the one
hand these are phonological words in that they can bear stress; on the other, they
cannot occur in isolation, as would be expected from grammatical words. Their
status as a category between complex suffix and word makes them particularly
interesting.
Various examples cited in this chapter have shown that the concept of 'word'
is an important psychological reality in Dagbani. Even though this is not manifested by a term used specifically for 'word' in the vocabulary, Dagbani speakers
have a strong idea of what are words and what are not.
Appendices
I
Dagbani uses several lexical items in order to refer to 'word' (listed in (21)).
The contextual use of three of these is illustrated by the compounds in (22)
and (23).
(21)
(22)
argument, dissension, dispute
blabbermouth
sweet words
hunger, involuntary fasting
unity
(23)
II
An experiment conducted with ten Dagbani speakers demonstrates that morphological and specic phonological factors play a role in the well-formedness
of a Dagbani noun. In a rating test, the participants were asked to evaluate
each of 292 given pseudo-words with regard to acceptability. The results show
that a noun must have a sufx. In addition, word length plays a role in the
well-formedness of a Dagbani word, whereas other factors, such as vowel length
and vowel epenthesis are less relevant (cf. Olawsky 1998).
III
(24)
References
Bawa, A-B. 1978. Collected notes on Dagbani grammar, ms., Ajumako.
Dagbani Orthography Committee 2000. Dagbani sabbu zalisi Rules for spelling
Dagbani, as fixed by the Dagbani Orthography Committee at the 2nd Conference on Dagbani Orthography, 28–29th November 1997. Tamale: Ghana Institute
of Linguistics, Literacy and Bible Translation Press.
Kropp-Dakubu, M. E. 1997. Oti-Volta vowel harmony and Dagbani, pp 8–18 of Gur
Papers / Cahiers Gur 2: Actes du 1er Colloque international sur les langues gur du
3 au 7 mars 1997 à Ouagadougou, 1ère partie: Généralités et phonologie.
Hyman, L. 1993. Structure preservation and postlexical tonology in Dagbani,
pp 235–54 of Phonetics and phonology, Vol. 4: Studies in lexical phonology, edited
by S. Hargus and E. Kaisse. San Diego, Calif.: Academic Press.
Hyman, L. and Olawsky, K. forthcoming. Dagbani verb tonology, to appear in
Proceedings of the 31st Annual Conference on African Linguistics, Boston,
2–5 March 2000.
Olawsky, K. J. 1998. Psycholinguistic experiments on Dagbani novel nouns, Arbeiten des
Sonderforschungsbereichs 282, Theorie des Lexikons 108. Düsseldorf: Heinrich-Heine-Universität.
1999. Aspects of Dagbani grammar with special emphasis on phonology and
morphology. Muenchen: LINCOM Europa.
Olawsky, K. J. and Ortmann, A. 2002. Dagbani adjectives, ms., University of
California, Berkeley and Heinrich-Heine-Universität, Düsseldorf.
Zwicky, A. M. 1977. On clitics. Bloomington: Indiana University Linguistics Club.
Like other languages of the Caucasus, Georgian has a large number of consonants (twenty-eight) and a modest number of vowels (five). Georgian is famous for its consonant clusters, which may contain up to seven consonants,
e.g. mcvrtneli 'trainer', vprckvni 'I peel it'; clusters are not punctuated by
epenthetic vowels. Traditional work in Georgian has identified so-called harmonic consonant clusters, which are characterised by (a) shared laryngeal properties (roughly, all voiceless, all voiced or all ejectives), (b) a structure in which
each successive consonant is articulated further back in the mouth (e.g. pt, pk,
tk are harmonic clusters, but kp and tp, while allowed, are not harmonic) and
(c) a first segment that is a stop or affricate and a second that is a stop, affricate
or fricative; see Vogt (1958) and Macavariani (1965).1 Vogt (1958) and others
have argued that harmonic clusters in Georgian are actually complex segments,
not consonant clusters; if this is correct, it would push the number of consonants
in the language much higher. However, Chitoran (1994) presents acoustic data
1
Axvlediani, who may have been the first to analyse harmonic clusters, lists them (1951: 113) as
the following:
pk  tk  ck  c k
pk  tk  ck  c k
bg  dg  g   g
pq  tq  cq  c q
px  tx  cx  c x
b   d
Alice C. Harris
that show that harmonic clusters are true clusters of consonants, not complex
segments.
The morphology is predominantly agglutinative, but there is some fusional
morphology. A simple verb paradigm illustrates both the agglutination that
characterises most of the language and the fusion.
(1)
     singular   plural
1    v-xedav    v-xedav-t    'see' present
2    xedav      xedav-t
3    xedav-s    xedav-en
svil- child
ma brother
tav- head
xatav- paint
kvlev- research
While these examples are relatively simple, the morphemes of the verb combine
in complex ways to indicate tense-aspect-mood (TAM) categories.
2
2.1
Basic criteria
and this phenomenon is well known in other languages; it does not, however,
occur in the modern language. Although deviations from a single order are
known even in some neighbouring languages, such as Udi (Harris 2000), there
are no such deviations in Georgian. The parts of a word in Modern Georgian
always occur together, in a fixed order, with a conventionalised meaning.
2.2
Recursion
The Latin criterion of a single inflectional affix per word does not characterise
the word in Georgian in the same way. For example, within limits, verbs may
bear non-zero marking of both subjects and objects: g-elodeb-a 'she waits for
you',2 where g- marks the second person object, and -a the third person singular
subject. Nouns may bear both a case and a number marker (d-eb-s 'to the sisters',
where -eb marks plurality and -s marks the dative case) or two case markers
(with or without a number marker, d-eb-isa-s to something belonging to the
2
Georgian does not distinguish gender in pronouns or in verbal forms; third person singular
pronouns and the subjects of third person singular verb forms in this chapter are translated with
a feminine pronoun.
Pausability
Complete utterance
lens-i
lens-nom
(4)
ak here
(5)
samcuxarod unfortunately
Major parts of speech may be used with the enclitic form of the verb be to
form a complete statement, as long as the referents of the arguments have been
identied in discourse:
(6)
ekimi=a
doctor=is
'She's a doctor.'
(7)
c kviani=a
smart=is
'She's smart.'
Most clitics, including those that cliticise only optionally, cannot be used
alone (e.g. rom 'that', tu 'if', =ve 'again', =ode 'approximately', =ze 'on', =si
'in'). However, ara 'not' and forms of 'be' can be used alone. No non-clitic conjunctions of any kind can be used alone (e.g. da 'and', an 'or', tumsa 'although',
rodesac 'when').
2.6
Circumfixes
mo-m-svl-el-i
hither-ptcpl-move-ptcpl-nom
coming
(9)
Such examples are not limited to circumfixes, since the pre-verb is also outside
the agreement markers and so-called character vowels, yet is within the semantic
scope of all of these (see Ackermann and Webelhuth 1998 for discussion).
A similar kind of problem is presented by the bracketing paradox in (11).
Ordinal numbers are formed with the circumfix me-...-e. Cardinal numbers between twenty and forty, forty and sixty, etc. are formed with da 'and' as in (10)
and are widely considered compound single words. (The orthographic norm
requires that they be written without hyphens or spaces, though hyphens are
used here.)
(10)
oc-da-or-i
20-and-2-nom
twenty-two
Clearly the entire cardinal is within the semantic scope of the formant of the
ordinal, but it is not within its phonological or morphological scope, as shown
in (11).
(11)
oc-da-me-or-e
20-and-ORD-2-ORD
twenty-second
Note that a similar bracketing paradox is found both in the Modern English
translation of (11) and in the archaic 'four and twentieth' (Zechariah 1:7).
In the end we must conclude that a circumfix may encompass a unit that
may be smaller than a word; that is, its status is still to be established on other
grounds. Nevertheless, it is probable that units larger than a word cannot be
encompassed by a circumfix.
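The bracketing paradox in (10) and (11) can be made concrete with a short sketch: the ordinal circumfix wraps only the last numeral stem, even though the whole cardinal is in its semantic scope. The code assumes hyphen-segmented forms as printed in the examples, with a final nominative -i; the function name is hypothetical, and nothing is claimed about numerals not shown in the text (irregular ordinals, if any, are not handled).

```python
# Illustrative sketch of the pattern in (10)-(11); not a full account of
# Georgian numerals. Input is a hyphen-segmented cardinal ending in nominative -i.

def ordinal(cardinal: str) -> str:
    """Wrap the last numeral stem in me-...-e, leaving earlier parts untouched."""
    morphs = cardinal.split("-")
    if len(morphs) < 2 or morphs[-1] != "i":
        raise ValueError(f"expected a form ending in nominative -i: {cardinal!r}")
    stems = morphs[:-1]                 # e.g. ["oc", "da", "or"]
    outer, last_stem = stems[:-1], stems[-1]
    return "-".join(outer + ["me", last_stem, "e"])


if __name__ == "__main__":
    print(ordinal("oc-da-or-i"))   # the cardinal in (10)
    print(ordinal("otx-i"))
```

The part of the compound before the last stem (oc-da-) is passed through unchanged, which is exactly why the circumfix's morphological scope is narrower than its semantic scope.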
2.7
Conclusion
The most reliable criteria for identifying a grammatical word in Georgian are
the basic ones quoted above from chapter 1: cohesion, fixed order of elements
and conventionalised meaning. Other criteria can play a supporting role.
3
In Georgian the word may consist of a single syllable; this is true not only of conjunctions and particles, such as rom 'that' and xom (roughly) 'isn't it?', but also
of nouns, such as da 'sister' and ku 'turtle', and verbs, such as c ris 'she cuts
it' and var 'I am'. Alternatively a word may consist of many syllables, such as
the noun mo.na.di.re.e.bi.sa 'hunter (gen)', the adjective mra.val.mar.cvli.a.ni
'polysyllabic' or the verb ga.da.u.tar.gmni.ne.bi.na 'she made him translate it'.
For Georgian, segmental features are not helpful in distinguishing a word,
but a number of phonological processes are.
3.1
Stress
not entirely predictable (Tschenkeli 1958: lix–lxi; Vogt 1971: 15–16; Zenti
1963). Very frequently it is the first syllable that is (lightly) stressed, regardless
of the length of the word and regardless of whether or not the first syllable
is a prefix: deda 'mother', kalak-i 'city', sa-tval-e 'eye-glasses', ga-mo-a-cx-o
'heated', mo-nadir-e 'hunter'. It is not unusual for the antepenult to be stressed
(amxanag-i 'comrade, acquaintance'), and other syllables may be stressed
(sa-stumr-o 'hotel', me-otx-e 'fourth').
(12)
a. d-s         saxl-i
   sister-gen  house-nom
   '[her] sister's house'
b. d-s-svil-i
   sister-gen-child-nom
   'niece, nephew'
(13)
a. gvi-s     x l-i
   Givi-gen  fruit-nom
   'Givi's fruit'
b. zet-is-xil-i
   oil-gen-fruit-nom
   'olive'
In (12–13), the (a) examples are phrases consisting of a noun in the genitive,
followed by a noun in the nominative; each word has a primary stress. The
(b) examples are similar, consisting also of a noun in the genitive, followed
by a noun in the nominative, but the (b) examples are considered compounds
and have a single primary stress. Because some words that on morphological grounds are considered compounds have dual stress, there is sometimes a
mismatch between the phonological and the morphological word.
3.2
In Georgian a syllable may consist of a nucleus alone (a.ra 'no, not', ga.a.ke.ta
'she did it'), an onset with a nucleus (da 'sister'), a nucleus with a coda
(ik 'there') or an onset, nucleus and coda (mas 'her dat'). Either onsets or
codas or both may consist of consonant clusters (mcvrtne.li 'trainer', tbi.li.si
'Tbilisi', a.kebs 'she praises it', msxverpls 'victim dat'). Sonorant consonants
never serve as the nucleus of a syllable in Georgian.
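Since only vowels can be syllable nuclei, the number of syllables in a word equals its number of vowels, and a proposed syllabification is well formed only if each syllable contains exactly one vowel. The sketch below illustrates this with the five vowels named in the text (i, e, a, o, u); it works on the simplified transliteration used in the examples, and the function names are hypothetical.

```python
# Illustrative sketch; assumes the transliteration used in the examples,
# with "." marking syllable boundaries.
VOWELS = set("ieaou")


def count_syllables(word: str) -> int:
    """Number of syllables = number of vowels (sonorants are never nuclei)."""
    return sum(1 for ch in word.replace(".", "") if ch in VOWELS)


def is_well_formed(dotted: str) -> bool:
    """Check that every dot-separated syllable has exactly one vowel nucleus."""
    return all(
        sum(1 for ch in syllable if ch in VOWELS) == 1
        for syllable in dotted.split(".")
    )


if __name__ == "__main__":
    for form in ["a.ra", "ga.a.ke.ta", "mcvrtne.li", "tbi.li.si", "msxverpls"]:
        print(form, count_syllables(form), is_well_formed(form))
```

The check says nothing about where a cluster is divided between coda and onset; it only enforces the one-vowel-per-syllable requirement stated above.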
A basic principle of syllabification is that the first segment in a consonant
cluster is syllabified with the preceding vowel, subsequent consonants with the
as ka.cma; Sanie (1974: 21) indicates var.debs 'roses dat', while Axvlediani
(1938) gives both var.di and va.rdi 'rose nom'.
In general, enclitics are syllabified in the same way suffixes are, as illustrated
in (14).
(14)
Bush (1997) argues that consonant devoicing is a productive rule that applies in
word-initial position. For example, while mta 'mountain' may be phonetically
[m̥ta], sua-mta, the name of a monastery, cannot be [*sua-m̥ta].4 This devoicing
is more likely to occur at moderate and rapid rates of speech than at slower ones.
According to Bush, devoicing is not simply a matter of assimilation, as it applies
also in words such as mdgomareoba 'situation'; yet devoicing is more likely
when the potential target segment is followed by a voiceless consonant, as in
mta 'mountain'. Thus, presence of devoicing provides evidence of the initial
boundary of a phonological word.
According to Hewitt (1995: 21), consonant devoicing also occurs at the end
of words, with words such as rusul-ad 'Russian [language] adv' optionally
pronounced [rusulat]. Historically this has resulted in words such as egret 'thus'
from *eg-r-ed. In principle, this can provide evidence of the final boundary of
the word, but in practice both types of devoicing in Georgian are weak and
difficult to identify.
3
4
In (15), the last syllable of prckvnis rises, as in other Georgian yes-no questions,
and as indicated here by the double acute accent; in addition, because it is the
last word of a yes-no question and is monosyllabic, the vowel is lengthened
(indicated here by doubling the vowel, contrary to the orthographic norms).5
All vowels (i, e, a, o, u) are subject to ML. The -i in (15) is usually analysed as
a suffix; ML also applies if the vowel of the monosyllabic word is in the root
(e.g. da a? 'sister?') or in a prefix, as in (16).
(16)
a akvs?
!
[morphologically, a-kv-s]
cv-have-3sg
Does she have it?
However, if the last word of the yes-no question is a clitic, ML does not apply,
as I show below.
(17)
Though not noted by Bush, there are sentences in which a word meets the two criteria of being
nal in a yes-no question and being monosyllabic and yet the word does not get ML, as in (i).
(i)
cavid a !
seni da?
she. leave your sister
Did your sister leave?
Here the rising question intonation is on cavida; the end of the sentence may have either falling
tone or rising, but it cannot have ML. See also Kiziria (1991: 97).
3.3
Conclusions
Clitics
a. merab-i
ekim-i
a r
a ris
Merab-nom doctor-nom neg he.be
Merab is not a doctor.
b. merabi ekimi a r aris
(19)
(20)
a. puli
a ra m-akvs
money neg 1sg.obl-have
'I don't have any money'
b. puli ara makvs
c. ara makvs puli
When the same particle negates a single word or phrase, it may immediately
precede or follow the constituent negated. If the negative precedes, it does not
normally cliticise.
6
Either ar or ara may cliticise, and either may fail to cliticise; the first is not a clitic form of the
latter. The full form, ara, is used before present tense forms of 'be' (ara var 'I am not', ara xar
'you are not', etc.), including the reduced form of the third person singular, ara=a 'she is not',
but not with the unreduced forms of the third persons, ar aris 'she is not' (*ara aris). It is also
used with the present tense of 'have' (inanimate object): ara makvs 'I don't have it', ara akvs
'she doesn't have it', etc.
(21)
On the other hand, if the negative follows the word it negates, it does ordinarily
(en)cliticise, and the stress shifts to it.
(22)
Georgian also possesses two clitics that are positioned in an unusual way:
rom 'if, when, that, because, etc.' and tu 'if, whether'. Rom may be said to
be an all-purpose clausal conjunction, because it can mark almost any kind of
embedded clause. However, when it marks clauses that occur as the complement
of verbs such as 'say' or 'believe', it is simply initial in the clause and otherwise
has properties different from those discussed below. In its other functions, it
may occur first, but in speech more often occurs in second or later position.
In languages such as Serbo-Croatian, 'second position' may be understood as
following the first word or as following the first phrase. That is not the issue
here; rom can follow one or more sentence constituents:
(23)
Rom must, however, come before the verb; it can never follow the verb:
(24)
In each of its possible positions, rom is proclitic, and its host may be any type
of constituent; unlike ar(a), it does not attract stress:
(25)
The second =ve means 'all' and occurs only with numbers.
(27) a. or-i=ve
        two-nom=all 'both'
     b. sam-i=ve
        three-nom=all 'all three'
     c. otx-i=ve
        four-nom=all 'all four'
=ve does not bear word stress.7 Not only are the meanings of these two homophonous enclitics
and the hosts to which they attach different, their distributions in the word are also different.
The morpheme =ve in the meaning 'all' (illustrated in (27)) appears to be undergoing reanalysis
from an enclitic to a derivational suffix. Speakers in some instances accept both the order case-ve
and the order ve-case, as in (28) (see also Jorbenae, Kiobaie and Berie 1988:
1656); in other instances they accept only one or the other, as in (29–30). (In
the first variant of (28), I assume that ori functions as the stem, with -ve reanalysed as a suffix,
followed by a dative case suffix, which deletes before -si 'in'.)
(28) or-i-ve-s / or-sa=ve vxedav
     two-nom-all-dat two-dat=all I.see.it
     'I see both.'
(29) cqali semovida or-i-ve=si / *or-sa=ve=si
     water.nom it.came.in two-nom-all=in two-dat=all=in
     'The water came into both.'
(30) merabi or-i-ve-s=tana=a / *or-sa=ve=tana=a
     Merab two-nom-all-dat=with=is two-dat=all=with=is
     'Merab is with both.'
It is the order case-ve that was used in Old Georgian, while the order ve-case
seems to be preferred today. In the meaning 'again', on the other hand, as
illustrated in (31–2), =ve is certainly an enclitic; for example, it follows the
dative case marker in (31) and the postposition -ze 'on' in (32).
(31) de-s=ve / *de-ve-s
     day-dat=again
     'on the same day'
In yes-no questions, stress shifts to the last syllable of the last word; if this happens to be an
enclitic that otherwise does not bear stress, it does so in this situation. For example, 'in the same
place (remote)' as a statement or part of one is [ík=ve], not *[ik=vé]; as a yes-no question, or as
the last word of a question, however, it is [ik=ve̋], where the double acute accent indicates stress
and the special rising tone found in Georgian only in yes-no questions. This seems to be true also
of all other enclitics in Georgian.
(32) ima=ze=ve
     it.dat=on=again
     'on the same one'
=(a)c 'too, also' can be enclitic to any one of a variety of hosts: noun, pronoun,
deictic, adjective. (The vowel occurs after a consonant.)
(33) a. seni da=c
        your sister=too
     b. man=ac
        she.nar=too
     c. ak=ac
        here=too
(34) a. ik ar v-qopil-var
        there neg 1sg-be-1sg
        'I haven't been there (unintentional).'
     b. ik ar -qopil-xar
        there neg 2sg-be-2sg
        'You (sg) haven't been there (unintentional).'
The third person singular of the verb 'be' in the present tense, unlike any
other person–number combination or TAM category, has a special clitic form.
Reduction (with cliticisation) is optional, but failure to reduce gives emphasis
to the statement.
(35) a. merabi saxl=si aris
        Merab house=in he.be
        'Merab is in the house.' (slightly emphatic)
     b. merabi saxl=si=a
        house=in=a
        'Merab is in the house.'
The special clitic, or the suffix derived historically from it, is used in the formation of the TAM
categories mentioned above; (36) is parallel to (34) above.
(36) ak ar qopil-a
     here neg be-3sg
     'She hasn't been here.' (unintentional)
With the caveat stated in note 7, =a does not bear stress; we find, for example,
[saxl=si=a] from (35b) and [qopil=a] from (36).
There are three quotatives: metki 'I said', =tko 'you should say', =o 'she
said, they said, they say'. The second is literary and so infrequently used that
some consultants have even told me that it is not Georgian. The other two are
commonly used in conversation and in writing. The example below, where the
quotative co-occurs with a verb of speaking, shows that this is really a quotative,
not (synchronically) a verb of speaking.
(37) ara var, metki
     neg I.am quot
As the example above illustrates, metki 'I said' occurs at the end of the sentence;
its scope is commonly a clause, broadly understood, but it may be even greater.
The scope of =o is the same in principle, but in conversation it is often repeated
earlier in the clause.
(38) mepem mitxra: ara=o, tkven=o puli ar gcirdeba=o
     king he.said.me no=quot you=quot money neg you.need.it=quot
     'The king said to me: "No, you [emphatic] don't need money."'
There are additional clitics in Georgian, but I have covered here many of
those that are used most frequently, with the exception of postpositions.
To summarise, Georgian has both proclitics (e.g. ar(a)= neg) and enclitics
(e.g. =ve 'again', =var 'I am', =o quot). Some cliticise optionally (e.g. ara=
neg, =var 'I am'), and others obligatorily (e.g. =ve 'again', =o quot). Some
may attract stress (e.g. ar(a) neg); others never bear stress (e.g. rom= 'that'), or
do so only if they are sentence-final in yes-no questions (e.g. =ve 'again; all').
Some enclitics are hosted by the narrow constituents, usually single words,
that are in their scope (e.g. =ve 'again', =(a)c 'too'). Either proclitics (e.g.
negative and affirmative particles, subordinating conjunctions) or enclitics (e.g.
quotatives) may have scope over an entire clause. In this case the negative and
affirmative particles are positioned relative to the verb, the quotatives relative
to the end of the clause (or quoted material). Positioning of the subordinating
conjunctions, on the other hand, takes into account both the beginning of the
clause and the position of the verb.
Conclusion
The basic criteria stated by Dixon and Aikhenvald in chapter 1 (cohesion, fixed
order of elements and conventionalised meaning) are the most reliable ones for
identifying the morphological word in Georgian. The phonological word can be
identified as having a single primary stress. Syllabification and Monosyllabic
Lengthening treat a host with its enclitics as a phonological word.
In some instances there are mismatches between the criteria for the morphological
word and those for the phonological word. As noted in §3.1, some
compounds have two stresses, generally a characteristic of a two-word phrase
(e.g. tol-amxanagi 'comrade(s) of the same age', pur-remi 'female deer').
Yet the fact that the first component of each of these has no case marker indicates
that it is not an independent word, and consequently that the whole is a
compound.
References
Ackerman, F. and Webelhuth, G. 1998. A theory of predicates. Stanford: CSLI.
Axvlediani, G. 1938. Zogadi da kartuli enis ponetikis sakitxebi [Questions of general
and Georgian language phonetics]. Not seen.
1951. 'Dve sistemy garmoniceskix smycnyx v gruzinskom jazyke', pp 113–16
of Pamjati akademika Lva Vladimirovica Scerby (1880–1944): Sbornik statej.
Leningrad: Universitet.
Bush, R. 1997. 'Georgian syllable structure', Phonology at Santa Cruz 5.1–14.
1999. 'Georgian yes-no question intonation', Phonology at Santa Cruz 6.1–11.
Cherchi, M. 1994. 'Verbal tmesis in Georgian', Annali del Dipartimento di Studi del
Mondo Classico e del Mediterraneo Antico Sezione linguistica 16.33–115.
Chitoran, I. 1994. 'Acoustic investigation of Georgian harmonic clusters', Working
Papers of the Cornell Phonetics Laboratory 9.27–65.
Harris, A. C. 2000. 'Where in the word is the Udi clitic?' Language 76.593–616.
Hewitt, B. G. 1995. Georgian: a structural reference grammar. Amsterdam: Benjamins.
Jorbenae, B., Kobaie, M. and Berie, M. 1988. Kartul enis morpemebisa da modaluri
elementebis leksikoni [Dictionary of morphemes and modal elements of the
Georgian language]. Tbilisi: Mecniereba.
Kiziria, N. 1991. 'Intonacia da cinadadebata tipebi' [Intonation and types of intonation], Macne 3.96–101.
Macavariani, G. 1965. Saerto-kartveluri konsonanturi sistema [The Common
Kartvelian consonant system]. Tbilisi: Universiteti.
Robins, R. H. and Waterson, N. 1952. 'Notes on the phonetics of the Georgian word',
Bulletin of the School of Oriental and African Studies XIV, Pt. 1, 55–72.
Sanie, A. 1974. Kartuli enis gramatikis sapuvlebi [Fundamentals of the grammar
of the Georgian language]. Tbilisi: Universiteti.
Tschenkeli, K. 1958. Einführung in die georgische Sprache, Band 1. Zürich: Amirani.
1960–1974. Georgisch-Deutsches Wörterbuch. Zürich: Amirani-Verlag.
Uturgaie, T. 1976. Kartuli enis ponematuri struktura [Phonemic structure of the
Georgian language]. Tbilisi: Akademia.
10
Introduction
perì lexeos: lexis estì meros tou katà
about word+gen word-nom is-3sg part+nom art+gen concerning
suntaxin logou elakhiston
syntax/acc expression+gen least/nom
'On the word: a word is the minimal part of a syntactic construction'
The many participants at the International Workshop on the Status of 'Word' offered lively
discussion of the issues contained herein and comments that I benefitted greatly from. I would
like to thank Amalia Arvaniti and Giorgos Tserdanelis for their help with some of the data and Rich
Janda for healthy criticism of various notions presented here; also, the editors of this volume
provided useful comments on an earlier version. The usual disclaimers as to their complicity
hold.
Brian D. Joseph
2.1
2.2
Modern Greek actually has a fairly large number of potential candidates for
clitic status; while these are typically treated as if they were words (in some
sense) or clitics (whatever the term might mean), some (especially those
with grammatical functions) may instead be analysable as affixes (possibly
inflectional in nature). A full enumeration of these elements is given in (2).
(2) a. elements modifying the verb, clustering obligatorily before it (when
       they occur), marking:
         subjunctive mood (general irrealis):  na
         subjunctive mood (hortative):         as
         future (and some modality):           θa
         negation (indicative):                ðe(n)
         negation (subjunctive):               mi(n)
    b. elements marking argument structure of verb (object pronouns),
       occurring as the closest element to verb (i.e. inside of modal etc.
       modifiers above), positioned before finite verbs and after non-finite
       verbs (imperatives and participles); acc stands for direct object
       markers, gen for indirect object markers:
         person  sg.acc  sg.gen  pl.acc  pl.gen
         1       me      mu      mas     mas
         2       se      su      sas     sas
         3m      ton     tu      tus     tus
         3f      tin     tis     tis     tus
         3ntr    to      tu      ta      tus

         person  sg      pl
         3m      tos     ti
         3f      ti      tes
         3ntr    to      ta

         person  sg      pl
         1       mu      mas
         2       su      sas
         3m      tu      tus
         3f      tis     tus
         3ntr    tu      tus
(3) a. ðe θa ton pate s to spiti su
       neg fut him+3sg+acc go+2pl+pres to the house your
       'You won't take him to your house'
    b. as min tus ta pume ta nea mas
       subjunc neg them+gen them+ntr+acc say+1pl the news our
       'Let's not say our news to them'
    c. na su eγrafe
       subjunc you+gen write+3sg+pst+impfve
       'He should have written to you'
    d. pes to de
       say+impv.sg it+acc de
       'So say it already!'
    e. pun dos? Na tos!
       where.is he+wk+nom here.is he+wk+nom
       'Where is he? Here he is!'
    f. ksero γo
       know+1sg i+nom(wkned)
       'How should I know?'
In the sections that follow, after a brief typology of Modern Greek, the issue
of how to characterise some of these elements with regard to wordhood or
affixhood (or in-between status, if such is warranted) is addressed in the
context of a general consideration of tests and parameters in Greek that might
define the word for this (stage of the) language.
2.3
a. noun:
b. verb:
c. adjective:
exi o janis lisi
have+3sg the John+nom untied+perf
'John has untied'
Similarly, there are occasional bipartite verbs, consisting of, for instance, the
verb kano 'make' plus a nominal form, that describe a unitary activity/event
and are even paralleled in some cases by monolexemic verbs, as in (6):2
2
There are two other types of ostensible multiword units that deserve mention here, though they will
not be treated systematically. First, as John Henderson has reminded me, Greek has word-level
units, discussed first by Rivero 1992, that are composites of noun or adverb stems with verbs, e.g.
ksanavlepo 'I-again-see' (i.e. 'I see again') (cf. vlepo ksana 'I-see again'). These are best treated
simply as lexically derived compounds (Smirniotopoulos and Joseph 1998); they behave with
respect to inflection (e.g. with the grammatical 'little elements' or with person/number marking)
just like non-compound verbs. Second, there are nominal compounds that consist of two nominal
words (not stems), each of which is capable of inflection, e.g. xora-melos 'member-country' (e.g.
of an alliance) (lit. 'country member'), whose nominative plural form is xores-meli, with
both parts inflected. These are admittedly difficult to analyse as to their word-level status; see
Joseph and Philippaki-Warburton (1987: 227–8) for discussion, including the fact that some such
compounds show inflection only on the first member.
(6) a. ðen ton kano γusto
       not him+acc make+1sg taste
       'I don't care for him'
    b. ðen ton γustaro
       not him+acc like+1sg
       'I don't care for him'
We are now in a position to begin to investigate various criteria and tests that
might be brought to bear on the identification of 'word' in Modern Greek. It
is assumed here that for Greek, the construct 'grammatical word' is based on
'word' as listed in the lexicon, representing major syntactic categories: noun,
e.g. spiti 'house'; verb, e.g. lin- 'untie'; adjective, e.g. arosto- 'sick'; preposition,
e.g. apo 'from'. Admittedly, the bipartite verbs such as kano γusto in (6) above
meet some of the criteria for 'grammatical word' status discussed by Dixon
and Aikhenvald in chapter 1 (§7), including cohesiveness, in that the parts
always occur together, and fixed order. Even so, though, such combinations
typically involve two forms that occur independently in other contexts, and
the meaning of the combination is roughly compositional (e.g. kano 'make' +
γusto 'taste' = 'have a taste (i.e. a liking) for'), contrary to the criterion of
conventionalised meaning. Thus, except for such verbal units, which could after
all simply be treated as Verb + Object idioms, there is no reason to posit a level
of 'grammatical word' different from lexical word plus other independently
needed machinery (e.g. inflections) for Greek.
In such an approach, however, there are some representational issues that
need to be addressed. First, with regard to inflection, following Lyons 1968,
the lexical listing is stem (i.e. lexeme; see also the discussion by Dixon and
Aikhenvald in chapter 1 (§4.1)) and inflected forms (where they exist) are then
the grammatical words. A relevant question here is the representation of the
vexing 'little elements', many of which have grammatical function (as with the
markers in the verbal complex) and thus could be considered inflection, and
thus properly part of a grammatical word. Alternatively, one needs to consider
instead if they are separate grammatical words in their own right, with their own
lexical listings. Moreover, besides 'grammatical word' as delineated here, one
has to entertain the possibility that there may be a distinct level of 'phonological
word', though this depends to some extent on how all the 'little elements' with
grammatical values are analysed; if they are inflectional affixes, then much
p b f v t d s z ts dz
k g x r l m n
a e i o u
However, all of these sounds can occur utterance-initially, so there is no test for
wordhood based on possible initial segments. For final segments, the situation
holds more promise, as there are some restrictions, although loan words now
interfere with the probative value of such final segment restrictions. Still, a
generalisation can be formulated as in (8) regarding final segments for words
that have been in Greek for more than about a hundred years3 and not from the
archaising high-style katharevousa register:4
(8) Only -s, -n, and vowels are allowed word-finally in Modern Greek (for
    certain classes of words).
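The generalisation in (8) can be stated as a simple check on the last segment of a form. The following is only a minimal sketch, assuming a broad ASCII transcription in which each character is one segment; the function name and the choice of test forms (all taken from elsewhere in this chapter) are illustrative, not part of the analysis itself.

```python
# Sketch of generalisation (8): only -s, -n and vowels word-finally.
# Assumption: forms are given in a broad ASCII transcription, one
# character per segment, without accent marks.
VOWELS = set("aeiou")
ALLOWED_FINAL_CONSONANTS = {"s", "n"}


def obeys_final_segment_rule(form: str) -> bool:
    """Return True if the last segment is a vowel, -s or -n."""
    if not form:
        raise ValueError("empty form")
    last = form[-1].lower()
    return last in VOWELS or last in ALLOWED_FINAL_CONSONANTS


if __name__ == "__main__":
    # Older vocabulary cited in the chapter versus loans and interjections.
    for form in ["janis", "ton", "kano", "jot", "mov", "ax"]:
        print(form, obeys_final_segment_rule(form))
```

Such a check can at best flag forms that fall outside the older, non-katharevousa vocabulary; it says nothing by itself about whether a form that passes is a word.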
The extent to which recent loans and learned borrowings have altered that
generalisation is evident from the forms in (9) with other final segments:
(9)
The reason for this temporal parameter is that in the twentieth century, presumably owing to
different attitudes on the part of native speakers of Greek with regard to nativisation of loanwords,
literally hundreds of loans, especially from French in the first half of the century and later from
English, entered the language with only minimal phonological adaptation.
This distinction is necessary owing to the sociolinguistic situation prevailing throughout most of
the Post-Classical Greek period but with particular intensity in the nineteenth and twentieth centuries,
in which an archaising, high-style form, katharevousa, and a more colloquial, stylistically lower
form, demotic, competed in a classic diglossic situation (see Ferguson 1959). To some extent,
both this distinction and that concerning the age of a loan mentioned in note 3 reflect a linguist's
somewhat omniscient perspective, not necessarily a naive native speaker's.
Moreover, there is even a class of native Greek words (or word-like forms) with
a wider range of possible word-final sounds, namely interjections, onomatopes,
clippings and acronyms, as in (10), which, like borrowings, violate (8):
(10)
However, it is not entirely clear that all such forms in (10), especially the
onomatopes, should be treated as words.5 Thus the segmental level proves
inconclusive regarding a characterisation of word; accordingly, we turn to the
next level up in a phonological hierarchy, namely that of clusters.
3.2
Morphophonemics
One can of course legitimately ask whether forms such as mats-muts, ax, etc. are 'words'. While
they can stand as independent utterances and in some sense are minimal syntactic units, they
are functionally different from spiti, lin-. A functionally based characterisation of 'word' that
would exclude interjectional utterances might be problematic, since some (apparent) interjections
are quasi-grammatical, e.g. the one-word prohibitive utterance mi! 'Don't!', which shows a
synchronic connection (in a complicated way; see Joseph and Janda 1999, Joseph 2001c) to the
bound subjunctive negator mi(n) (see (3b)).
(11) /ton patera/   [tom batera]   'the father/acc'
     /tin piraksa/  [tim biraksa]  'her I-teased'
     /ðen pirazi/   [ðem birazi]   'not it-matters'
Optionally (again subject to a complex of factors), the nasal can be weak or even
absent, but also, for some speakers, sporadically, there is no voicing whatsoever
and sometimes just deletion of the nasal, e.g. [ti(n) piraksa] 'I-teased her'.7
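The canonical outcome illustrated above (final -n of a 'little element' assimilating in place to a following voiceless stop, which in turn is voiced) can be sketched as a small function. This is an illustrative simplification only: the function name, the `keep_nasal` switch and the velar nasal symbol for the k-case are assumptions made for the sketch, and the sociolinguistic variation described in the text (no voicing at all, or nasal deletion without voicing) is deliberately not modelled.

```python
# Sketch of nasal-induced voicing across a 'little element' + host boundary.
# Assumptions: broad transcription, one character per segment; the velar
# nasal symbol and the keep_nasal option are illustrative choices.
VOICED = {"p": "b", "t": "d", "k": "g"}
PLACE_NASAL = {"p": "m", "t": "n", "k": "ŋ"}


def nasal_sandhi(element: str, host: str, keep_nasal: bool = True) -> str:
    """Return the surface string for element + host.

    If the element ends in n and the host begins with p, t or k, the stop
    is voiced and the nasal assimilates in place (or is dropped when
    keep_nasal is False, the 'weak or absent nasal' variant).
    """
    if element.endswith("n") and host[:1] in VOICED:
        stop = host[0]
        nasal = PLACE_NASAL[stop] if keep_nasal else ""
        return f"{element[:-1]}{nasal} {VOICED[stop]}{host[1:]}"
    return f"{element} {host}"


if __name__ == "__main__":
    print(nasal_sandhi("ton", "patera"))                    # article + noun
    print(nasal_sandhi("ton", "patera", keep_nasal=False))  # nasalless variant
```

The sketch treats every n-final element alike; whether a given element actually behaves this way is exactly what is at issue in the discussion that follows.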
6
All of the hedging in these statements is necessary due to the complex sets of sociolinguistic
conditions attendant on the nasal + stop realisations; see Arvaniti and Joseph 2000 for some
discussion and literature. Given the variation, it is difficult to describe the status of [b d g] for
the Modern Greek speech community taken as a whole.
Such forms probably reflect the effects of hypercorrective pressures (see Kazazis 1992) or even
spelling pronunciations, but such explanations alone of their occurrence are not enough to warrant
discounting them.
Some linguists take the voicing in these combinations as evidence that a level
of 'phonological word' must be recognised, combining grammatical (lexical)
words into phrases in which certain phonological effects are located, and note
that the voicing effects, while similar to what is found word-internally, are not
identical (the [ti piraksa] outcome is not found in medial position); for such
linguists, the 'little elements' are clitics. Alternatively, if the 'little elements'
are affixes, one could point to the similarity of the boundary phenomena to
word-internal combinations with voiced stops, and treat the [ti piraksa] outcome
as part of the idiosyncrasy of affixal combinations (thus considering the
construct as a morphological word or perhaps morphosyntactic word, with the
affixes as the realisation ('spell-out') of various features, such as [+negation]
or [+3SG.f.dir.obj]).
Still, some voicing can be induced by what must be a word in any approach.
For instance, for some speakers (maybe only in fast speech), the complementiser
an 'if' can trigger voicing on a following stop, as in /an po/ 'if
I-say' → [am bo]. Such facts might tip the balance in favour of the
(grammatical-words-combining-into-a-)phonological-word approach and against the affixal/
morphological-word approach, although counterbalancing the possibility of
voicing here is the further fact that for some speakers as well, the usual outcome
of /an po/ is [am po], definitely not a word-internal type outcome. Moreover, in
any case, it can never have the nasalless realisation *[a bo], even for speakers
who usually do not have a nasal with a voiced stop word-internally. Therefore,
there is indeed some difference between combinations with articles, pronouns,
etc. and combinations with more clear-cut grammatical words (contrast (na)
ti(m) bo 'should I-tell her' with am po). This might well be taken by some as
evidence for an intermediate construct such as 'clitic' or simply as atypical
word- or atypical affix-behaviour.
Yet, there is another way to view all this, in the light of still further facts. The
genitive weak pronoun used for marking indirect objects is formally identical
with the genitive weak pronoun used for marking possession (cf. (2b, f)), but
they show different behaviour vis-à-vis nasal-induced voicing. In particular, the
object pronouns (which are affix-like in showing idiosyncrasies, high selectivity,
strict ordering, etc.; see Joseph 1988, 1989, 1990) are voiced post-verbally after
the imperative singular of kano 'do, make', the only context where a weak object
pronoun occurs after a nasal-final host in the standard language, as in (12):
(12) kan tu mja xari [ka(n) du . . . ]
     do+2sg+impv him+gen a-favour
     'Do a favour for him!'
This generalisation is admittedly not overly broad, but it does differentiate possessives from weak
object pronouns. If accurate, moreover, it allows for a determination of the status of the weak
nominatives (cf. Joseph 1994, forthcoming b), e.g. tos (cf. (2c, 3e)), since the t- of tos can, and
in fact must, be voiced in its post-nasal occurrence with the predicate pun 'where is/are?', i.e.
pun dos 'Where is he?' (not pun tos). It would thus not be a prosodically weak word (i.e. not
a clitic) and therefore is best treated as an affix (thus a new verbal inflection, for subject, in the
language, though with two and only two predicates).
A level of 'phonological phrase', however, is a different matter; the nasal-induced voicing and
regressive assimilation seen in am bo would reflect a phrase-level (post-lexical) set of processes.
That they duplicate in some way the word-internal morphophonemics with the addition of
(weak-pronouns-as-)inflectional affixes is a consequence of the analysis argued for here, but since the
word-internal and the phrasal phenomena are not point for point identical (recall *[a bo]), no
analysis can collapse the two environments.
is no idiosyncrasy, these elements are not affixal. However, their argument fails
in two ways. First, it is not the case that all affixes necessarily show idiosyncrasies;
the presence of idiosyncrasies may be probative but their absence
says nothing in and of itself. Second, it turns out that there in fact are various
irregularities in the morphophonology of the weak pronouns that Philippaki-Warburton
and Spyropoulos overlook.
For instance, in the combination of 2sg.gen su + any third person form
(necessarily accusative since two genitives cannot co-occur), the u may be
elided, thus: su to stelno 'to-you it I-send' may surface as sto stelno 'I send it to
you'. However, there is no general process of Standard Modern Greek that
elides (unaccented) -u- in such a context; there is a regular process eliminating
unaccented high vowels in northern dialects, but in the Standard language
(based on a southern dialect) there is deletion of unstressed high vowels only
in fast speech. Thus, one might imagine that some form of that process is
at work in su to stelno → sto stelno, but that cannot be the case. A deleted
u typically leaves a 'mark' on a preceding s in the form of rounding, e.g.
/sutarizma/ 'shooting' becomes [sʷtarizma]; importantly, though, this rounding
never happens in the reduced form of the indirect object marker su, i.e. [sto
stelno] but not *[sʷto stelno]. Thus the elidability of the -u- in combinations
like su to (stelno) is not attributable to a general property of Greek phonology
but rather is a feature of the particular combination of su with a third person
pronoun, i.e. it is a morphological irregularity associated with su, contrary to
Philippaki-Warburton and Spyropoulos' claim.
So also in the combination of any third person form with the markers na and
θa, the initial t- of the pronoun may (optionally, with considerable idiolectal
variation) be voiced to [d]; thus θa to stelno 'fut it I-send' can optionally
surface as θa do stelno 'I'll be sending it', even though intervocalic t in Greek is
not usually distinctively voiced and na and θa do not canonically end in -n (the
typical voicing element in Greek; see above in §3.3.1); θa did end in a nasal
in earlier stages of Greek but na never did and in any case there is no sign of a
nasal before a vowel (where it would be expected if there were one with these
forms canonically); the contrast of θa stelno 'I will be sending' but θa alazo 'I
will be changing' (not *θan alazo) with ðe stelno 'I do not send' but ðen alazo
'I do not change' (not *ðe alazo) is instructive in this regard. Thus the voicing
triggered by na and θa on third person weak pronouns is an idiosyncrasy of
these combinations, countering Philippaki-Warburton and Spyropoulos' claim
about the pronouns.10
Therefore, there is indeed morphophonological idiosyncrasy associated
with the weak pronouns. Moreover, Philippaki-Warburton and Spyropoulos
(1999: 65 n.5) themselves do recognise that there are ordering restrictions, as
well as combinatorial restrictions; for instance, first and second person combinations
cannot occur (i.e. there is no way of saying 'He is sending you to
me' using weak pronouns). All of these observations therefore point to an affixal
analysis, and thus are consistent with the general approach taken here and
with the conclusions in §3.3.1 above concerning nasal voicing and §3.5 below
concerning accent placement.
10
Moreover, the triggering of voicing here can also be considered an idiosyncrasy associated with
θa and with na, and thus constitutes evidence suggestive of affixal status for these elements. See
also note 13 below.
3.4
There is one further segmental phenomenon in Greek, referred to by Philippaki-Warburton
and Spyropoulos (1999: 54) as 'euphonic -e', that is worth considering here,
since they give it as an argument for taking some of the 'little elements',
specifically the weak pronouns, as words and not as affixes. This turns out to
be particularly interesting to consider, for a wider range of data indicates that
just the opposite interpretation is called for.
Philippaki-Warburton and Spyropoulos claim 'there is a strong preference for
open syllables in word-final position [see (10) above]. When a word terminates
in final -n, there is a tendency for a euphonic -e to be added after it in order to
obtain a word final open syllable, e.g. milun/milune "they speak"'. Observing
further that 'affixes . . . have no need for such a constraint nor do they show such
a tendency' and noting that 'clitic [i.e. weak] pronouns may appear with such
final euphonic -e, e.g. tone vlepo "him I-see" (acceptable also: ton vlepo)', they
offer these facts as an argument for word-level status for the weak pronouns.
However, there is an unsettling vagueness in the reference to a 'tendency'
(Philippaki-Warburton and Spyropoulos themselves admit that 'not all words
ending in -n will add a euphonic -e') as well as in their unfounded assertion
as to causality when they state that those that do 'are clearly [emphasis added]
motivated by this preference for word final open syllable'. More important,
though, their argument can be countered empirically.
First, there are indeed words ending in -n that never take euphonic -e,
e.g. beton 'cement' (never *betone), enðjaferon 'interesting/ntr.sg' (never
*enðjaferone), as well as grammatical elements which the authors themselves
want to call words that do not take -e, e.g. the indicative negator ðen 'not'.
Thus, it is not at all clear that euphonic -e is a useful indicator of wordhood.
Second, the real generalisation is not that words can take this -e but rather
that inflectional morphemes do, or rather can, since not all actually do. The
best cases of euphonic -e come with various verbal and nominal grammatical
endings, e.g. 3pl.pst -an, 3pl.pres -un and gen.pl -on (among some others).
Therefore, euphonic -e would provide an argument that accusative singular
weak pronouns ton/tin are inflectional morphemes instead of words. And this
generalisation would explain why beton and enðjaferon do not take the -e, since
the -n in those elements is part of the word-stem, and not part of an inflectional
marker; the underlying stem in 'interesting' is arguably enðjaferond-, given the
Suprasegmental issues
The basic facts about accent in Modern Greek are as follows: in general, there
is at most a single main stress accent in a grammatical (i.e. inflected) word,
underlyingly (in its lexical form), and it must fall on one of the last three syllables.
The feminine nouns in -a show all the possibilities: peripétia 'adventure'
versus ðimokratía 'democracy' versus omorfiá 'beauty'. When a clear inflectional
suffix is added to a stem, it can trigger a rightward accent shift in a stem
that has (lexical) antepenultimate accent, e.g.:
(13) ónoma 'name (nom/acc)'
     onóma-tos 'of a name (gen)'
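The three-syllable window and the accent shift in (13) can be made concrete with a small sketch. It assumes, purely for illustration, that a word is represented as a tuple of syllables plus the index of its stressed syllable, and that a suffix simply appends whole syllables; the class and function names are hypothetical, and the text says only that a suffix can trigger the shift, so the unconditional repair below is a simplification.

```python
# Sketch of the antepenultimate window and suffix-induced accent shift.
# Assumptions: a word is a tuple of syllables plus a stress index; the
# names Word, obeys_window and add_suffix are illustrative only.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Word:
    syllables: Tuple[str, ...]
    stress: int  # index (from the left) of the syllable with main stress

    def stress_from_end(self) -> int:
        # 1 = final syllable, 2 = penult, 3 = antepenult
        return len(self.syllables) - self.stress


def obeys_window(word: Word) -> bool:
    """Main stress must fall on one of the last three syllables."""
    return 1 <= word.stress_from_end() <= 3


def add_suffix(word: Word, suffix: Tuple[str, ...]) -> Word:
    """Append suffix syllables, moving stress rightward only as far as
    needed to keep it within the last three syllables."""
    syllables = word.syllables + suffix
    leftmost_allowed = max(len(syllables) - 3, 0)
    return Word(syllables, max(word.stress, leftmost_allowed))


if __name__ == "__main__":
    onoma = Word(("o", "no", "ma"), stress=0)   # lexical antepenultimate accent
    onomatos = add_suffix(onoma, ("tos",))      # genitive, as in (13)
    print(onoma, obeys_window(onoma))
    print(onomatos, obeys_window(onomatos))
```

On this representation the genitive form ends up with stress on its second syllable, matching the shift shown in (13), while stems with penultimate or final accent are left untouched by a one-syllable suffix.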
Such facts have traditionally been treated (e.g. in Joseph and Philippaki-Warburton
1987) as consistent with a principle that the accent in a grammatical
word can be no farther from the end of the word than the antepenultimate
syllable. When a pronoun (including the possessives) is added to the end of a
word with antepenultimate accent, however, it triggers the addition of an accent,
which becomes the primary accent, on the syllable before the pronoun, and a
11
For instance, the potential location of pauses (as discussed by Dixon and Aikhenvald in chapter 1
(§5)) says little, since pauses (or really the Greek equivalent of pausing, the filler sound [e] or the
protraction of a vowel) can occur within traditionally defined grammatical words, e.g. enðiaaaferonda
'interesting things/ntr.pl'. Moreover, a prosodic definition of minimal word yields an
'almost' argument for some types of words, but close is not good enough: statistically, by far,
most nouns, verbs and adjectives contain at least two syllables, but there is a non-negligible
number of monosyllabic forms, rendering this test unreliable. These include nouns, e.g. jos
'son/nom', jo 'son/acc' (and even more if loanwords are taken into account, e.g. jot 'yacht',
gel 'sex appeal', etc.); adjectives, e.g. ble 'blue', mov 'mauve', etc.; and verbs, e.g. imperatives
such as ðes 'see', pes 'say', bes 'enter', etc., all of which can occur by themselves as one-word
utterances, imperfectives with surface diphthongal nuclei, e.g. [paw] 'I-go', [paj] '(s)he goes',
[klej] '(s)he cries' (though these are almost certainly disyllabic underlyingly, and possibly so in
careful speech) and perfectives such as po 'I-say', ði '(s)he sees', etc., which, while not able to
occur by themselves as one-word utterances, and so always co-occur with some other element
(e.g. opote ði 'whenever he-sees'), nonetheless occur with forms that are clearly separate words
(e.g. opote 'whenever').
(14)
to ónoma 'the name' / to ònomá tu 'the name his' (i.e., 'his name')
kítakse! 'Look!' (impv.sg) / kìtaksé me 'Look-at me!'
Such facts have also traditionally been treated as induced by a ban on accent
farther from the end of a word than the antepenultima, with the reduction
triggered by a ban on more than one main stress in a word.
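
The two bans just mentioned can be made concrete in a short sketch. It is offered only as an illustration under stated assumptions, not as an analysis proposed in this chapter: words are given as lists of syllables in a rough ASCII transliteration, the added element is assumed to be monosyllabic, and the function names are hypothetical.

```python
# Illustrative sketch only: syllable lists are rough ASCII transliterations,
# and the function names are hypothetical, not taken from the chapter.

WINDOW = 3  # accent may fall no farther back than the antepenultimate syllable


def distance_from_end(syllables, stressed):
    """Position of the stressed syllable counted from the end (1 = final)."""
    return len(syllables) - stressed


def obeys_window(syllables, stressed):
    """True if the accent falls on one of the last three syllables."""
    return 1 <= distance_from_end(syllables, stressed) <= WINDOW


def add_weak_element(host, stressed, element):
    """Attach a monosyllabic weak pronoun or possessive to a host.

    Returns (syllables, primary, secondary): the index of the primary accent
    and of the reduced original accent (None if no accent was added).
    """
    combined = host + [element]
    if obeys_window(combined, stressed):
        return combined, stressed, None
    # The original accent would now be fourth from the end: add a new
    # primary accent on the syllable before the element, reduce the old one.
    return combined, len(host) - 1, stressed


print(add_weak_element(["o", "no", "ma"], 0, "tu"))
print(add_weak_element(["dhi", "mo", "kra", "ti", "a"], 3, "tu"))
```

On this sketch a host with antepenultimate accent receives an added accent, whereas a host accented on the penultimate is left alone, since its accent still falls within the three-syllable window once the pronoun is attached.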
For linguists inclined to treat pronouns as word-like entities of some sort
(e.g. clitics, with their own maximal projection in the syntax), these facts have
motivated a higher level construct such as 'prosodic word' (implicit, e.g., in the
accounts of Arvaniti 1991, 1992) or 'clitic group' (e.g. Nespor and Vogel 1986),
or perhaps simply 'phonological word', since the pronouns behave differently
from clear affixes (which shift accent) and from clear word combinations (which
have no accentual effect); recall also §3.3.1 regarding nasal-induced voicing
and how that can be used as a basis for defining 'phonological word', e.g. with
article + noun (and other) combinations.
Still, these accentual facts in and of themselves, despite their being consistent
with non-affixal status for the weak pronouns, are not conclusive evidence for
that categorisation, and in fact can just as well be taken as evidence for the
opposite classification. That is, there are several different idiosyncratic accent
requirements with affixes, e.g.:
(15)
This range of accentual effects associated with affixes means that the accent
addition with weak pronouns, if affixal, could simply be one such idiosyncratic
effect an affix can have.
The argumentation needs some further development, but the case can be
made. Admittedly, the possessive pronouns also provoke accent addition. Thus, if
they are clitics, or atypical, i.e. prosodically special, words, one could argue that
the weak pronouns should fall into the same category. Otherwise, the argument
would go, the grammar would have duplication through the multiple statements
needed for accent addition, in that some affixes would do it and so would clitics
(or some words, as the case may be).
However, what makes this an interesting case is that there are some differences between weak pronouns and possessives, for instance with regard to
nasal-induced voicing, as shown in §3.3.1 by the difference between ka(n) du
'do for him' (cf. (12)) and anθropon tu (*anθropo(n) du) 'of his men'. Thus
somehow these two elements need to be differentiated in the grammar: if accent
addition with the possessives and weak pronouns is consistent with their both
being words, the post-nasal voicing facts are consistent with their each being
a different kind of element. Of considerable importance here is the fact that
there are prosodically weak words, e.g. the attitudinal marker de (cf. (2e) and
(3d) above), with different accentual properties. In particular, de always leans
on the end of a host but never provokes accent addition (e.g. ðokímase 'try!'
(impv.sg) / ðokímase de 'try already!' / *ðokìmasé de); therefore accentually, de
and the possessives like tu have to be differentiated, so that even within the class
of words, accentually distinct behaviours must be stipulated. If one were to say
that possessives are true clitics, based on their accentual behaviour, then presumably weak pronouns belong in the same class, since they behave accentually
like the possessives. What then of the post-nasal voicing differences (cf. (12))?
Should the grammar recognise four (or even more) distinct morphosyntactic elements: word versus possessive-type clitic versus weak-pronoun-type
clitic versus affix?
A solution here is to follow the strict categorisation schema outlined in §2.1
and to recognise only affix and word as basic constructs, while at the same
time setting some tokens apart within those categories by way of recognising
different behaviours and realising that affixes can show various idiosyncrasies.
This approach may also mean that one should give up on trying to generalise
over accentual behaviour as a way of differentiating basic morphosyntactic
element types, though recognising differences within larger types.
Of interest here, but not, strictly speaking, relevant for Standard Modern
Greek, is the fact that some dialects (but not Standard Greek) have accent addition
with some disyllabic forms that ostensibly are affixes. For example, in northern
Greek one finds érxu-mi 'come/1sg' / erxu-masti 'come/1pl' (versus Standard
Greek érxome / erxomaste). While one could of course say that these endings
have been reanalysed as clitics, that would seem to be begging the question of
how to identify such entities in the first place.12
12 See Joseph 2001b for discussion of relevant dialectal facts concerning the status of 'word'. Not
only are the accent addition facts quite different in some regional dialects, but so are the accent
placement facts; Crimean Greek, for instance, allows words with the lone accent five syllables
from the end, as in timazanandini 'they were preparing' (Standard Greek etimazondan); see
Delopoulos 1977 and Newton 1972 for examples.
Thus, the upshot regarding accent and wordhood is that while it does admittedly provide a basis from which one might motivate an affix versus clitic
distinction, or a grammatical word versus phonological word distinction, it is
not a clean basis. The relevance of this observation for a general theoretical
framework for dealing with such elements is taken up in §4.
3.6
(16)
to ble jot    'the blue yacht+nom'
tu ble jot    'the blue yacht+gen'
ta ble jot    'the blue yacht+pl'
However, there are no uninflected verbs. Therefore, a finite set of verbal endings (marking person, number, tense, etc.) allows for verbs to be uniquely
identified, at least paradigmatically (i.e. in relation to other forms). Thus while
[po] could conceivably be a noun (cf. [jo] 'son+acc'), once it is linked with
[pis]/[pi]/[pume] etc. (2sg/3sg/1pl), it is clearly identifiable as a verb, as a
member (specifically 1sg) of the paradigm of the perfective forms of 'say'.
Moreover, if one ignores uninflected nouns and adjectives, a generalisation is
possible about the shape of a subclass of words, namely inflected words:
(17)
Still, however valid (17) may be, ending in -s or -n or a vowel is not in itself
an identifying mark of a word, since many clear inflectional and derivational
affixes end in -s or -n or a vowel (cf. 2sg -s as in pis, 1pl -me as in pume, etc.),
and some uninflected words do too (e.g. tote 'then', mexris 'up to', etc.). It is
not at all obvious therefore that morphological considerations offer a significant
generalisation that has any validity for determining or defining (or refining) the
notion of 'word' for Greek.
4
The most controversial and thus the most interesting aspects of the determination of wordhood in Greek hinge on the analysis of the various grammatical
elements presented above in (2) and (3). Some of those elements, in particular
the verbal elements θa, na and ðe(n) (from (2a)), the weak object pronouns
(from (2b)) and the third person weak subject pronouns (from (2c)), have been
examined here to varying degrees and have been argued to be affixes, and not
independent or even prosodically weak words, whereas others, in particular
the attitudinal marker de (from (2e)) and the possessives (from (2f)), give evidence of being prosodically weak or deficient words. Further arguments for the
affixhood or wordhood of these elements are possible, as is a determination
concerning the word or affix status of all of those not systematically treated
here, i.e. the weakened nominative pronouns (from (2d)), the definite article
(from (2g)), and the locative/dative marker (from (2h)).13 Still, the evidence discussed here concerning the weak pronouns especially gives a glimpse of what
can be done with a highly restrictive set of assumptions about wordhood. This
is so even if some of the resulting analyses, as with accent placement, are
a bit messy, so to speak, in that some stipulations are needed, e.g. as to accent
addition being one of several accentual effects associated with affixation rather
than a feature that falls out automatically from some other sets of facts.
It can be argued, though, that even with some messiness on the side of the
word-and-affix-only approach, it is not at all clear that there is anything to be
gained by adopting an analysis that is based on the multiplication of the number
of basic morphological entities that linguistic theory must recognise. That is,
there is a trade-off between, on the one hand, the neater, more constrained
system one has with the recognition of only word versus affix as basic morphological constructs and the few stipulations that are needed in such a system,
and, on the other hand, a system with a greater number of basic morphological
entities but perhaps fewer stipulative statements. It may not be an even trade-off,
though, since one can argue that the default assumption in all instances should
be to avoid multiplying basic units (since some stipulation is always needed)
until it is convincingly demonstrated that such an approach cannot work. That
is, recognising a third type of element, e.g. 'clitic', as a basic morphological
construct should always be at best a last resort, a highly marked and thus
costly analysis.14 I would argue that the Greek facts do not compel one
13 Regarding additional argumentation, see, for instance, Joseph (1988, 1994, forthcoming b) and
Nevis and Joseph 1993 for arguments for the various pronominal elements as affixes, Joseph 1990
concerning the indicative negator and Joseph 2001a on the future marker. Additional evidence
beyond these considerations, as well as arguments concerning many of the other elements, is
discussed in Joseph (forthcoming a). As for the elements not discussed here, my inclination is
to treat the locative/dative marker s(e) and the definite article as (bound, i.e. prosodically weak)
words, though the jury may still be out on them, and the weakened subject pronouns as nothing
more than phonologically conditioned variants of the strong forms (see Joseph forthcoming b).
14 This is essentially the view taken by Zwicky 1994, a position with which I agree wholeheartedly.
It is perhaps significant to note here that, as ubiquitous as clitic-like elements seem to be (so that
some linguists see them everywhere!), there are languages that do not have any such elements;
as Dixon argues in chapter 5, Jarawara is just such a language, with no grammatical elements
that might be called clitics.
to recognise such a marked construct; all relevant facts can be accounted for
under a system with just words and affixes, and the independently needed scale
of typicality within each construct.
It may well be, of course, that more distinctions among elements are needed,
and indeed, even the approach advocated here, with words and affixes, and
degrees of typicality for each, recognises that on the surface, there can be more
than two kinds of entities. Underlyingly, however, even an atypical element,
it is claimed, must be categorised as either an affix or a word, however much
it may deviate from other members of its category. Under this approach, this
merely reflects the messy reality, but the grammar has to make the difficult
decisions, so to speak, and give definitive classifications. That in effect is the
job of the grammar. The situation described here is thus analogous to what
is done routinely with regard to sounds: it is often the case that numerous and
physically diverse surface phones are categorised as the same at an abstract level
of analysis referred to as the phoneme; that is, the grammar imposes discrete
category membership on elements with superficially different properties. So
also with the classification of what are here, e.g., called typical and atypical
affixes as members of a single basic morphological type.
5
Conclusion
highly productive in Greek now and not just limited to originally Turkish words
(note, e.g., o taksi-dzís 'the taxi-driver'), the occurrence of an affix in such
novel contexts need not be taken to mean that the affix by itself was borrowed;
rather, it can be viewed as resulting from the extraction of the affix out of
particular instances of the affix attached to a borrowed word, i.e. as a perfectly
ordinary case of morphological segmentation and analogical spread based on
the analysis by speakers of word-level units in their language (whatever their
ultimate origin). Thus the distinction between words and affixes seems to play
a role in language contact.
Second, the notion of lexical diffusion (Wang 1969, among others) claims
that the vehicle for the spread of sound change is the word, or rather, the
lexical entry; morphemes could be intended here as something that fits the bill,
since sound changes are found in endings, prefixes and roots, as well as in
unanalysable units, but the claims of Wang and others have focussed on the
word. There is no relevant evidence from Greek here, but generally speaking,
it can be argued that there need not be a separate mechanism of change (i.e.
lexical diffusion, distinct from analogy and dialect borrowing) that has
to be recognised; that is, analogy and dialect borrowing together can give a
diffusionary effect in the realisation and spread of sound change, thus relegating
a putative process such as lexical diffusion to epiphenomenal status (see Joseph
2001a). However, if one believes in lexical diffusion, then such a process may
offer a useful handle on defining/determining 'word'.
Third, and finally, there are many sound changes which come to be restricted
to occurrence at word boundaries. While that in itself might be taken to suggest
an importance for 'word' as a theoretical construct, it may well be that such
conditioning on sound changes does not reflect the original state of affairs
with any given change. That is, following the Neogrammarian view that sound
changes are not in and of themselves conditioned by non-phonetic factors, and
given that word boundaries are not phonetic entities, any instances of a word-boundary-conditioned sound change would have to be the result of reanalysis
and generalisation of the change. One possible source for such reanalysis and
generalisation is from utterance-final (or utterance-initial) position to word-final/initial position; utterance edges are phonetically defined (e.g. by silence, by
the vocal folds standing at rest, etc.) and word edges of course can coincide with
utterance boundaries. Another source is an original syllabic basis for change,
since word onsets necessarily give possible syllable onsets and word-initial
position can be utterance-initial position where no resyllabification is possible.
Thus, even if a sound change might originate in such a way as to be oblivious
to word boundaries (per Neogrammarian principles), it seems that speakers
often impose a word-boundary basis onto the effects of a sound change, altering
(i.e. reanalysing) the original basis for the change; often also, linguists studying
the aftermath of a change only see the results of the reanalysis and generalisation
See Hock 1976 for a discussion of cases where formulations of sound changes in terms of word
boundaries are actually better analysed as based on syllable structure or utterance-finality. Posner
(1996: 290) states the conditions for e- prosthesis in Romance languages to have been based
on word boundaries (#sC- → #esC-) when in fact, as recognised by Lausberg (1956ff), the
original basis for this development was syllable-shape in connected speech (i.e. Satzphonetik
or sentence-sandhi), as seen in the standard Italian facts that Posner herself notes, namely that
prosthesis is limited to postconsonantal contexts (e.g. la scuola 'the school' versus in iscuola
'in school').
Newton, B. 1972. 'The dialect geography of Modern Greek passive inflections', Glotta
50(3–4).262–89.
Philippaki-Warburton, I. and Spyropoulos, V. 1999. 'On the boundaries of inflection
and syntax: Greek pronominal clitics and particles', Yearbook of Morphology 1998
45–72.
Posner, R. 1996. The Romance languages. Cambridge: Cambridge University Press.
Rivero, M-L. 1992. 'Adverb incorporation and the syntax of adverbs in Modern Greek',
Linguistics and Philosophy 15.289–331.
Robins, R. H. 1993. The Byzantine grammarians: their place in history, Trends in
Linguistics, Studies and Monographs 70. Berlin: Mouton de Gruyter.
Smirniotopoulos, J. and Joseph, B. 1998. 'Syntax versus the lexicon: incorporation and
compounding in Modern Greek', Journal of Linguistics 34.447–88.
Wang, W. 1969. 'Competing changes as a cause of residue', Language 45.9–25.
Zwicky, A. 1985. 'Clitics and particles', Language 61(2).283–305.
1994. 'What is a clitic?', pp xii–xx of Nevis et al., 1994.
Zwicky, A. and Pullum, G. 1983. 'Cliticization vs. inflection: English n't', Language
59(3).502–13.
11
Introduction
The problem of the word has worried general linguists for the best part of a
century. In investigating any language, one can hardly fail to make divisions
between units that are word-like by at least some of the relevant criteria. But
these units may be simple or both long and complex, and other criteria may
establish other units. It is therefore natural to ask if 'words' are universal,
or what properties might define them as such. Dixon and Aikhenvald set out
admirably in chapter 1 the terms in which these questions, among others, might
be discussed. I am not sure, however, that the way I have just posed them is of
much help.
Other chapters make clear not just that criteria conflict, but that different
linguists may resolve some kinds of conflict very differently. In chapter 10, for
instance, Joseph offers a solution for the so-called 'clitics' in Modern Greek that
is evidently resisted by most other specialists in the language. For students of
Romance his chapter may recall especially what Bally said, in different terms,
about French (1965: 287ff). Bally used the term 'word' only in inverted commas,
as did Martinet, also a French-speaking linguist, in his textbook (1960). But we
may equally be tempted to describe the sentence in (1) as one word, not, as the
orthography would have it, as six.
(1)
je-ne-l-ai-pas-vu
1sg+neg1+3sg+aux.1sg+neg2+see.pp
'I didn't see it'
is an enclitic that is said to throw back a high pitch onto the final syllable of
what would otherwise be ne:sos:
(2)
ne:sos          tis
island+nom.sg   some specific+nom.sg
'an island'
But enclitics were themselves words: tis, for example, was the nominative
singular in a paradigm like that of many nouns. In Modern Greek the forms
called 'clitics' pose a problem which the inherited orthography undoubtedly
disguises. We can explain objectively, in terms of facts that are undoubtedly
relevant, both where it lies and what the alternative solutions might be. The most
obvious truth is therefore that this problem exists. If it were resolved in one way
Greek would still seem to be 'flexional'; but a typologist who relied on this might
associate it misleadingly with languages for which the facts encourage no other
answer. In the alternative view, its inflectional type might tend more towards
that of Cup'ik, as described by Woodbury in chapter 3. But that statement too
cannot be swallowed without qualification. The true type is simply that of other
languages in which the nature of our difficulty is similar.
In taking this view I am inspired by an inaugural lecture by Bazell (1958), little
noticed and well worth rereading. Linguistics in the west has roots, however, in
the analysis of languages in which the word itself was not a problem. Of these
Latin, in particular, still influences our thinking: not least that of scholars who
write introductions to morphology, such as my own (1991). It may therefore
help if I begin with a brief look at its structure, both as seen in antiquity and in
the terms that Dixon and Aikhenvald propose.
2
Latin had inherited words for 'word' and 'name' related etymologically to those
of English. Nomen 'name' was then used by the grammarians for a noun
in general, following the development of the corresponding word in Greek.
Uerbum 'word' became their technical term for 'verb'. But, in ordinary usage,
to say something 'in one word' was still to say it in one uerbum; to express it
differently was to use other (plural) uerba.
For 'word' itself the grammarians had instead a term of art, whose form was
calqued directly on a similar term in Greek. The origins of ancient grammar are
in part obscured (see, for instance, Matthews 1994); but the Greek term (lexis)
was a nominalisation of the root for 'say' (leg-), adapted to this usage not more
than a century before the Romans took the technical concept over. Latin dictio
'way or act of speaking' was a similarly transparent nominalisation of the root
dic- and, as the term for 'word', was defined by the same formula that Joseph
P. H. Matthews
cites in Greek at the beginning of his chapter. An utterance (oratio) was defined
physically, as a stretch of vocal sound (uox) formed, in ancient accounts, by
air set in motion. As speech, it was articulated ultimately into spoken letters:
as many have remarked, the ancient term for 'letter' (litera) might as easily
be translated by 'speech sound' or 'phoneme'. The dictio 'word' was thus the
smallest unit from which such a stretch of speech was formed directly. In (3),
for example, the whole would have been a single utterance or sentence:
(3)
Romam         deleui
Rome+acc.sg   destroy+perf.1sg
'I destroyed Rome'
It was made up of two words, or as the grammarians also said, two parts of an
utterance. Each was in turn composed of syllables (ro, mam, de, le, ui), which
were themselves composed of letters with their phonetic value (r, long o, and
so on).
The grammarians defined no units but these: no other into which words
were divided, none intermediate between it and the sentence. For any word
the first task was accordingly to ask 'what part' or 'what part of an utterance'
(quae pars? or quae pars orationis?) it was. Romam, for example, belonged
to the 'part of speech', as we now call it, 'noun'. One would then ask what
were its accompanying or contingent properties (Greek parepomena, Latin
accidentia): thus Romam was, among other things, a proper noun, in the accusative case, and singular. The next part, deleui, was a verb and, among other
things, one which was active. So, in the light of this, these two parts could
stand in a transitive relation. This system was, of course, not that of specialists
only; but was literally beaten into educated citizens throughout the empire.
It was also suited to the language. There is no doubt, first, that 'parts' like
those in our example were both morphological and syntactic units. They were
grammatically complex and, in broad terms, it is possible to distinguish stems
from variable endings. For example, deleui could be divided into a verb stem
(de:le:-) and inflectional ending (-ui:). In terms of our criteria these were totally
cohesive: there was no construction in which stems and endings could be separated. Their order was again fixed: there was no form such as ui:-de:le:, and,
if there had been, it would have had an unrelated meaning. They undoubtedly
had a 'conventionalised coherence in meaning'. Our division between 'stems'
and 'endings' is not ancient: in English, both terms were first used in this sense in
the nineteenth century. The grammarians classed certain forms as 'compound'.
Suburbanus, for example, was composed of sub 'under' and urbanus 'of the
city+masc.nom.sg'; hence its meaning 'close to the city+masc.nom.sg'. They
included compounds in which component words were said to have been altered:
for example, in agricola 'farmer+nom.sg' we can distinguish forms related to
ones with the meanings 'field' and 'cultivate', but not identical to them. In our
eyes, the relation is between a stem agr(i)- and another stem col-. But semantic
units smaller than the word were not identified.
By contrast, most 'parts' were themselves quite mobile. Their order in (3) was
not subject to rule: it was also possible to say deleui Romam. 'Parts' that stood
in a close syntactic relation were not necessarily adjacent: (4), for example,
illustrates a pattern in which the copula verb (erat) splits a modifier (magno)
from its immediate head (periculo), which are in turn split by a governing
preposition (example from Adams 1994: 22):
(4)
tum    magno          erat              in   periculo        res
then   great+abl.sg   be+imperf.3sg     in   danger+abl.sg   thing+nom.sg
'Then matters were in great danger'
If the morpheme was not obvious to ancient linguists, neither was the phrase.
But the grammatical word, as seen by these criteria, would have stared them
in the face.
Virtually the same 'parts' were distinguished phonologically. We know from
ancient sources that an accent fell in positions determined, in part, by grammatical boundaries. In Rómam and deléui it would fall, as shown, on the penultimate syllable: so too, for example, in Márcum 'Marcus+acc.sg' or respóndit
'answer+perf.3sg'. This was so for any such form in which the penultimate
was heavy: that is, either open with a long vowel (ro:, le:) or closed (mar,
pon). If the penultimate was light it would fall on the syllable preceding: thus
uítulus 'calf+nom.sg'. This rule has been widely discussed (notably by Allen
1973), and the only problem, as we will remind ourselves in a moment, was
that the boundaries of grammatical and phonological words, as defined by our
criteria, did not coincide entirely. Most 'parts' ended either in a vowel or in a
very restricted range of consonants: m, s, r or t. This was, to be more precise,
a restriction on the phonology of inflectional endings. Other facts are not recoverable entirely. But we know, for example, that in verse a sequence such as
Roma et . . . 'Rome+nom.sg and', or indeed Romam et, counted as two syllables
and not three. For an interpretation of the evidence see Allen (1978: chapter 4);
but it is clear that the rule again refers to the same boundary.
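
The accent rule just described lends itself to a small worked sketch. What follows is an illustration under simplifying assumptions rather than a full account of Latin accentuation: forms must be supplied already divided into syllables, a colon marks a long vowel as in the transcriptions used here, diphthongs are ignored, and the function names are hypothetical.

```python
# Illustrative sketch only: input must already be divided into syllables,
# with ':' marking a long vowel. Diphthongs and other refinements are
# ignored; all names here are hypothetical.

VOWELS = set("aeiou")


def is_heavy(syllable):
    """Heavy = open with a long vowel, or closed (ending in a consonant)."""
    return ":" in syllable or syllable[-1] not in VOWELS


def accent_index(syllables):
    """Index of the accented syllable within one phonological unit."""
    n = len(syllables)
    if n <= 2:
        # Assumption: forms too short to have an antepenultimate are
        # accented on their first syllable.
        return 0
    if is_heavy(syllables[-2]):
        return n - 2  # a heavy penultimate carries the accent
    return n - 3      # otherwise the syllable preceding it


for form in (["ro:", "mam"], ["de:", "le:", "ui:"], ["res", "pon", "dit"],
             ["ui", "tu", "lus"], ["po", "pu", "lus", "que"]):
    print("-".join(form), "->", form[accent_index(form)])
```

The last form treats an enclitic as part of the same unit as its host: the penultimate is then closed and therefore heavy, so the accent falls there and not where it would fall in the host taken alone.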
We can therefore talk in ancient fashion of a single unit, implicitly grammatical and phonological. The exceptions seem to have been few, and are
not all of the same type.
The best known are three forms classed in antiquity as 'enclitic'. One was the
neutral interrogative particle, ne; the others one of the 'and' coordinators (que)
and one of the 'or' coordinators (ue). They were the only monosyllables that
ended in short e, and, in contrast, short vowels elsewhere in a monosyllable were
lengthened. They were also unaccented and in, for example, (5), the accent on
populus was where it would be if que was part of the same phonological unit:
populúsque, not pópulus que.
(5)
senatus         populus=que         romanus
senate+nom.sg   people+nom.sg=and   Roman+nom.sg
'the senate and people of Rome'
That much is clear, though for the rule in general and for certain other possible
clitics, our evidence is less secure (Allen 1978: chapter 5). At the same time
these were units in syntax, attaching to a variety of hosts and, in the case of
ne, at varying positions in the sentence. They were also felt as separate. Our
practice in writing spaces is a medieval innovation (charted by Saenger 1997).
But the ancient abbreviation for (5) was precisely 'S.P.Q.R.'.
Other exceptions involve, for example, the preposition cum 'with', which in
combination with a pronoun might be represented as in (6):
(6)
pax     uobis=cum
peace   you (pl)+nom.pl=with
'Peace be with you!'
There were also forms that dictionaries describe as either one word or two. The
form quousque 'how far?' could alternatively have the order usquequo, and
would thus fail one of our criteria. Quomodo (literally 'in what way?') would
fail another when interrupted by que: quo=que modo 'and in what way?'. But
the qualifications truly take up more space than they deserve. The word in Latin
never was a problem; and, although the language was originally described by
reference to a prior analysis of Greek, it does not seem likely that we would
ourselves have had much difficulty if we had come to it cold.
The problems lie in justifying any other kind of unit. For morphemes, in
particular, the difficulty is not simply that there was fusion. In many other
languages there is quite striking fusion; nevertheless the underlying forms of
roots and affixes can be established without disagreement. In one like Latin
that is just where we have difficulty. For the Roman grammarians, a noun
like puella:rum 'girl+gen.pl' was derived by adding the syllable rum to the
corresponding ablative singular: puella: 'girl+abl.sg' → puella:rum. This was
their standard technique, whose application to, for example, Italian is still worth
exploring (Matthews 1996). But if we look for morphemes we will find it hard
to say how such forms should be analysed. Is the genitive, for example, puella:-rum, with an underlying stem puella:-? This stem would then be shortened in
forms like puella-m 'girl+acc.sg'. Or is it puella-:rum, with -:rum lengthening
a stem whose underlying vowel is short? Is the ablative then puella-:? Or is there
perhaps an underlying form puella-e, with an -e found in some other ablative
singulars?
In contrast, it was very easy to establish paradigms. The word in ancient
grammar was not formally a lexeme: on this point it might be wise to amend
the remark by Lyons (1968: 197) cited by Dixon and Aikhenvald (chapter 1).
Since our conception of the word originates in the description of Ancient Greek
and Latin, it is not surprising that they now seem to approximate to an ideal case,
in which all our criteria agree. A language such as Georgian, as described by
Harris in chapter 9, is another in which the identity of words is not traditionally
a problem: its type is different from that of Latin mainly in that their internal
structure is agglutinative. But the pattern may be ideal only in that that is how
we have historically perceived it.
One conclusion that emerges clearly from this volume is that a grammatical
word does not also have to be a phonological word. There is no logical reason (Matthews 1991: 215) why it should: nevertheless, at that point, I implied
that, when such units coincide, it is in some way more significant than when,
for instance, morphemes happen also to be syllables. If the latter had coincided
in the classical languages, we might happily have inherited a single term for
both. The term 'syllable' referred in Greek to letters 'taken together'; and, in the
case that we might then see as ideal, they would be together both as sounds
and in meaning. The 'problem of the syllable' would arise in languages in
which 'grammatical syllables' are at variance with 'phonological syllables', in
which phonological syllables are clear but grammatical syllables not easily
identified, and so on.
Is our ideal of the word more valid? However confident our answer, it is a wise
precaution to insist, with Dixon and Aikhenvald in chapter 1, that grammatical
and phonological units should be distinguished. But two further properties of
words in Latin are worth underlining.
The first is that they did not, or they did not obviously, form syntactic
phrases. The properties of phrases are, in many languages, in part those of
grammatical words: thus in English, in 'the old house', the constituents occur in
a fixed order, and the whole is cohesive in that the unit can be interrupted only
by other constituents directly or indirectly subordinate to its head. Still less can
the, old and house be 'scattered', as Dixon and Aikhenvald put it, throughout the clause. These are among the criteria for, in general, a grammatical
unit, which do not themselves define what kind of unit it is. In this light, a
peculiarity of Latin is that they were met by just one, which had, in addition, almost every other feature, positive and negative, that we now perceive as
word-like. But in other cases both these properties may hold of smaller and of
larger units.
P. H. Matthews
(7)  l-e     ʃov-suri
     def+pl  bald+mouse
     'the bats'
Now we are used to saying that les is the plural of the definite article, just as, in
the case of Latin, illi was a nominative plural of a demonstrative pronoun. But illi
was part of a paradigm in which the categories, and many of the actual endings,
were identical to those of nouns. There was thus good reason for describing it
in that way. But there are no motives of that kind in French. The pattern would
be simply one in which two morphèmes or grammatical elements from closed
classes, [l] 'definite' and [e] 'plural', form one word: schematically g₁g₂. Two
lexemes, or sémantèmes, form another: l₁l₂.
Could there then be languages in which a pattern such as this was general?
Some words would consist of one or more grammatical elements: g₁ ... gₙ.
Others would be simple words or compounds formed from lexical elements:
l, l₁l₂, possibly [l₁l₂]l₃, and so on. I leave this as a question, since I have not heard
of such a language. But the example from French reminds us that establishing
roots is in part a matter of perception. There are many cases, in particular,
where words are formed from various affixes and what is classified as an
auxiliary. In most we have good reason for describing the auxiliary as a root
on which the whole is centred. Thus, in particular, the same affixes attach to
roots of main verbs. But in other cases the auxiliary could, in reality, be just
one of a sequence of grammatical morphemes.
What 'general theory' is then possible? I am not sure that I understand what
every linguist means by this term. But one kind of theory would propose
constraints on what a language can in principle be like: for example, it could
not be, even predominantly, of the type that I have just described. Now one
proposition which is certainly implicit is that any language must have units
into which an utterance may be organised, which, whatever semantic status
they may or may not have, are not always semantically simple. There would
accordingly be no language which did not have words or phrases, grammatical or
phonological, of any kind. One issue, then, is whether more specific constraints
can be justified.
If not, there are still ways in which general linguists can contribute. What has
often been called a 'theory' is, in reality, more like a descriptive linguist's tool
bag. Different types of language differ in the problems that they raise, and we
naturally find that forms and methods of description which work well for one
are of no help at all for others. An excellent instance is the demonstration by
Rankin and his colleagues (in chapter 7) that, in their words, Siouan languages
really do not lend themselves to description in terms of templatic morphology.
Others, of course, may. Dixon (in chapter 5) remarks in passing that in Jarawara,
as in many other South American languages, there is no useful role for the
distinction between inflectional and derivational processes. Now that distinction
is bound up with the traditional conception of a paradigm: derivation is a relation
between different lexemes, inflection between forms of the same lexeme. It is
not primarily, as some writers carelessly imply, between two kinds of affix.
When we have grasped that, it is clear that there will be cases where a language
cannot be described successfully unless we have this tool in hand, and others
where we will return it to our bag quite quickly.
This image is not new. But we need to be reminded that there are many
general concepts that should not be generalised beyond the point at which
their application is illuminating. In most languages the syllable, for instance,
seems unproblematic. It is transparently so in languages that are described as
'syllable-timed', like Spanish; and, in this volume, it is only in Harris's account of
Georgian (chapter 9), and Henderson's of Arrernte (chapter 4), that its character
is an issue. Many indeed claim that the unit is universal, and that its properties
are universally constrained.
'Universal' is another term, like 'theory', that is often obscure. But the
concept of the syllable is one that we can certainly pull from our tool bag,
and we all know cases, at least marginal, where its application is not easy.
Southern British English gardener is represented, in the latest revision of Jones's
pronouncing dictionary, as, alternatively, [ˈɡɑː.dᵊn.əʳ], with three syllables,
or [ˈɡɑːd.nəʳ], with two (Jones 1997). The raised schwa indicates a possible
syllabic nasal: thus [ˈɡɑː.dn̩.əʳ]. But where lies the difference between that and
a dissyllable? Partner, in contrast, is represented only as [ˈpɑːt.nəʳ]; but could
it never instead be trisyllabic [ˈpɑː.tn̩.əʳ]? These are questions which we might
be expected to answer if the syllable were posited as universal. But it is not
clear that there is any other reason why they should be asked. We know that
gardener is morphologically garden-er, and how garden itself can be realised.
We know too that partner is not likewise partn-er. Any difference in the ways
they can be realised is explained directly by that. If there is a schwa in garden-,
gardener is three syllables; otherwise, does it matter whether, in either form,
the [n] is called syllabic or not? I cannot pretend to any feel for Georgian, or for
other Caucasian languages in which the syllable has been seen as problematic.
But the notion that syllables are a universal often carries with it the specific
assumption that, in any language, they must be identified exhaustively. This
may, in some, create more difficulties than it solves.
The problem in Arrernte is that the syllable as identied by patterns of
prosodic morphology is in general not a syllable by phonetic criteria. The
patterns themselves are clearly demonstrated; and, just as other rules in other
languages count syllables, it would be hard to say that it is any other unit that
is being counted here. It merely does not have a CV structure of a kind that
syllables usually do have. Now we might well argue that, ideally, a metrical
syllable, or syllable that is in one way or another counted, ought to correspond
to a phonetic syllable. It at least seems more like an ideal than the case where
grammatical words are also phonological words. But the example is a warning
not to mistake an ideal for a universal. For phonetic reasons we do not expect
that syllables will have obligatorily a structure VC(C). But so what, once we
recognise that other criteria may apply?
The issue for us is whether the word has any other status. No criterion is
either necessary or sufficient, as Bazell, who is cited in chapter 1 by Dixon and
Aikhenvald, made clear long ago. But they are relevant insofar as, in particular
languages, they do tend to coincide. A form which is cohesive need not logically
consist of elements whose order is fixed. I have already cited an exception in
Latin; but, in Latin itself, it is precisely an exception. Nor do units which have
a conventionalised coherence and meaning logically have to be cohesive. A
familiar example is that of 'separable' verbs in German: infinitive ausbleiben,
literally 'out-remain'; finite bleibt (...) aus. But, although this is a pattern
systematic for verbs of that class, it too is exceptional within the language.
Otherwise forms meeting one criterion also meet the other: thus, with the same
initial element, a noun like Ausland, literally 'out-country'. We have taken care
to qualify words as 'grammatical' or 'phonological', and, in particular languages,
we do find regular discrepancies. Dixon (in chapter 5) has identified three
such in Jarawara: one, for example, holds for compounds. Another discrepancy
involving compounds is reported by Henderson for Arrernte. But, within these
languages, these are again exceptions to a general tendency in which such units,
to quote Dixon, 'almost always coincide'. It is in that sense that both implicitly
are 'words'; not, like the morpheme and the syllable, different units altogether.
Exceptions within languages are one thing; languages that would themselves
count as exceptions are another. The implication is again that of an ideal,
from which such languages depart. But can we take it, firstly, that a language
has ideally words of both kinds?
There seems no reason, in particular, why we should always find it helpful to
distinguish words in phonology. The criteria, as Dixon and Aikhenvald make
clear in chapter 1, are specific to particular languages. Different kinds of
evidence are relevant, and none need always be available. There are languages, for
instance, which have no restrictions on the vowels or consonants with which
word-like units may end. Some kinds, at least, may also be evidence for phrases:
thus, especially, that of accentuation. What then is the relevant unit in, for
instance, French? Examples (1) and (7) would normally have accents on their
final syllable: je ne l'ai pas vu; les chauves-souris. Neither can be partitioned
into smaller units that are phonologically word-like: on that at least all analysts
seem likely to agree. But what we actually call them will depend, in practice,
on combined criteria from both grammar and phonology. If (1), for example,
is a phrase in syntax we can agree that it is also a phrase in phonology. The
crucial evidence for that view of the grammar is that adverbs such as encore
can be inserted before vu: je ne l'ai pas encore vu 'I've not yet seen it'. If it
were instead a word grammatically we could agree that it was also a word
phonologically. Now in many languages there is a distinction in phonology
between words and phrases. We have no reason, however, to claim that it is
universal.
Can we even take it that, at either level, our criteria will not regularly conflict?
A specialist in Bantu languages might well feel that this situation was precisely
that faced when, in practice, different ways of writing them became established.
But let us assume, for argument's sake, that languages ideally have words both
in grammar and in phonology. Are both units then, ideally at least, exhaustive?
In the traditional account of Latin, all parts of the sentence are words, clitics
included. But, of the terms used in the preceding chapters, 'clitic' is the one that
leaves me most confused. For it seems that, in an alternative view, a sentence
may consist not wholly of words; but in part of words and also in part of
clitics, which are not words.
'Clitics'
Let me begin at least by keeping the term in inverted commas. For it may be
that no single kind of unit is referred to.
The words in Greek originally called 'enclitics' were, to repeat, words. Some
were morphologically simple, and to that extent were also like affixes. But
others, such as tis in (1), were inflected ('some specific+nom.sg'), and, in the
paradigm of the verb 'to be', some forms, such as esti 'is', were enclitic, while
others were not. In such words we could say, of course, that what was 'clitic'
was the root: ti(n)- (compare oblique forms such as genitive singular tinos), or,
in esti, a suppletive clitic alternant es-. But these too would be no more like an
affix than the root of any other lexeme. In Latin a unit such as =que 'and' was
again affix-like, as it was also root-like, in being morphologically simple.
But its status as a word has never been disputed. Affixes were elements specific
to word classes: thus a nominative singular morpheme, if that is what we want
to call it, was found only in nouns, pronouns, adjectives and participles. But the
enclitic in populus=que 'people+nom.sg=and' could be attached to words at
the appropriate point in any form syntactically coordinated with another. These
points may seem obvious. But, as Woodbury's criteria for Cup'ik (in chapter
3, §6.2) also briefly remind us, the varieties of host to which a 'clitic' can
be added may be crucial to the reasoning by which we call it 'clitic' in the
first place. Affixes too can have accentual peculiarities: thus, in Italian, the first
plural ending in (8) could similarly be said to move an accent to the syllable
before it: third singular manda-va, first plural manda-va-mo.
(8)  manda-va-mo
     send+imperf+1pl
     '(we) were sending'
The difference between Italian -mo and Latin =que is that the forms to
which =que was attached were not limited morphologically.
In the traditional account =que was, accordingly, a word; it was merely not
like other words in its phonology. Its status seems to have been stable for a
long time, since its cognates in both Greek (=te) and in Sanskrit (=ca) patterned
similarly. Nor, in passing, does it have a reflex in the Romance languages. Now
we could, in principle, reserve the term 'word' for the forms that were not
enclitic. Thus, in Latin, populus=que could be represented not exhaustively,
as two words, but as a word populus plus another element that is neither a
word nor part of a word. The definition of a 'clitic' would then be as a residue.
In our example from Ancient Greek, the word ne:sos 'island+nom.sg' could
similarly be said to combine, in its phonology, with a residual non-word tis.
In neither language may it matter greatly which way we decide to put it. But
for many linguists 'clitics' can be characterised more generally as, in the words
every property that affixes or words in general do have. But, with that proviso,
our inverted commas may perhaps be tentatively suppressed.
The term is thus traditionally applied when, as originally in Ancient Greek,
successive units are described as words in grammar but are not separate in
phonology. It is not usual when, in contrast, one grammatical word is treated as
two phonological words: thus again in Dixon's examples in Jarawara (chapter 5)
or in Henderson's in Arrernte (chapter 4). But there are many tricky cases where
we cannot expect consistency.
One, in particular, is where we might speak of two levels of inflection. (9),
for example, illustrates the usual analysis of Modern Greek:
(9)  ton              sinantise
     3sg+masc.acc.sg  meet+past.act.3sg
     'He/she met him'
The verb sinantise is assigned to a verb paradigm; ton to one which is in part like
those of nouns. Since the only accent is on sinantise the pronoun is then said
to be proclitic. But in calling ton a 'pronoun' one is clearly begging questions.
We could, for example, add an object noun phrase: ton sinantise ton patera
mu, literally '... the father my'. If ton is a pronoun this is then a case of
so-called clitic doubling, which in Greek would be syntactically optional. But
it is clearly not a pronoun in the sense that, for example, him in English is a
pronoun. It does not, like him, have the syntax of an object noun phrase. If
the so-called doubling were obligatory, and ton morphologically simple, an
unbiassed analyst might describe it without hesitation as an object prefix.
There is no need for me to repeat Joseph's arguments (chapter 10). It is
worth remarking, however, that a recent grammar of Modern Greek defines a
clitic pronoun as both structurally and accentually dependent on another word
(Holton, Mackridge and Philippaki-Warburton 1997: 506). One might equally
well say that, in English girls, -s is structurally as well as phonologically
dependent on girl. The problem in Modern Greek is that these forms are
indeed, in that way, affix-like. But at the same time they are still, in their
internal structure, word-like. The genitive singular mu, in ton patera mu, can be
analysed easily into a root m- 'first person' and a genitive singular ending -u.
For the ending compare, for example, the forms traditionally represented as
in (10):
(10) t-u         anθrop-u
     the+gen.sg  man+gen.sg
     'the man's'
The analysis of (either) ton, as can be seen in part from Joseph's paradigms,
is merely less straightforward. Are these words, then, which are in turn affixes
within larger words? Or must we argue that, since m-u is a grammatical word,
One putative universal is, again, that every language will have words or continuous phrases of some kind. It is therefore fair to ask why this is so; or, if we
are ultra-cautious, why we should expect this.
The answer usually implied is that sentences are more easily produced, and
understood, if organised into such packages. If we try to elaborate this, we can
easily lapse into waffle. But Zeshan's discussion of sign language (chapter 6)
may perhaps inspire us to reflect, by contrast, on the nature of the vocal medium.
For many linguists, languages are systems realised equally by either gestures
or speech; these include theorists, such as Lyons (1991), who command respect.
But it does seem possible that, in part, the systems are themselves structured
differently.
On the one hand Zeshan's compounds, for example, are like compounds
in such languages as English, and exist for the same reasons. They too can be
said to combine either words or what might equally be called lexemes. Her
clitics can again be seen as units that are word-like but whose realisation is
dependent on a host. But, on the other hand, it is not pedantic to query notions
of 'phonology' in sign language. Phonology is defined by its place in a system
of double articulation, in which languages are structured independently on
two levels. This does not merely facilitate a multiplicity of signs, as Martinet
originally remarked (1960); it also guarantees, or is one factor guaranteeing,
the redundancy that is so clearly necessary, in the spoken medium, if forms
are not to be misheard. This is familiar to phoneticians, and in grammar both
syntactic rules in general, and rules forming larger units, contribute to the same
end. In terms dating from the 1950s, these involve restrictions on the average
probabilities of transitions from one morpheme to another, and probabilities
that vary within and between sub-sequences (compare, in a much later
formulation, Harris 1991). We can thus expect packaging, though not always, as
the preceding chapters have shown, of the same kind. Now Zeshan makes clear
that the roles of simultaneity and sequencing are different in the medium of
hand gestures; so too that of iconicity. There is also no precise equivalent, at a
still more elementary level, of a speaker's need to breathe at appropriate places.
It then seems reasonable to ask what level of redundancy is necessary. Are the
factors that potentially interfere with signed communication similar to those in
spoken language, and of similar intensity?
Some similarities that link gestural to spoken language may, of course,
be due to influence from it. This does not mean that sign language is merely
derivative. That would not only be politically incorrect, but indeed wrong. It is
possible, however, that human sign language might not have some properties
that it does have if spoken language had not evolved, whether (to avoid fractious
speculation) earlier, or later, or in parallel.
References
Adams, J. N. 1994. Wackernagel's law and the placement of the copula esse in Classical
Latin. Cambridge: Cambridge Philological Society.
Allen, W. S. 1973. Accent and rhythm. Cambridge: Cambridge University Press.
Allen, W. S. 1978. Vox latina, 2nd edn. Cambridge: Cambridge University Press.
Bally, C. 1965. Linguistique générale et linguistique française, 4th edn. Berne: Francke.
Bazell, C. E. 1958. Linguistic typology. London: School of Oriental and African Studies.
Delbrück, B. 1901. Grundfragen der Sprachforschung. Strasbourg: Trübner.
Harris, Z. S. 1991. A theory of language and information. Oxford: Clarendon.
Holton, D., Mackridge, P. and Philippaki-Warburton, I. 1997. Greek. London: Routledge.
Jones, D. 1997. English pronouncing dictionary, 15th edn, by P. Roach and J. Hartman.
Cambridge: Cambridge University Press.
Lyons, J. 1968. Introduction to theoretical linguistics. Cambridge: Cambridge University
Press.
Lyons, J. 1991. Natural language and universal grammar. Cambridge: Cambridge University
Press.
Index of authors
Firth, J.R. 12, 39
Fletcher, A. 202, 203
Foley, W.A. 15, 28–29, 39
Franklin, J. 16, 39
Franklin, K.J. 16, 39
Friedrich, J. 52, 76
Gak, V.G. 6n6, 39
Garvin, P.L. 4, 39
Gee, J.P. 153, 178
Glück, S. 161–62, 178
Graczyk, R. 6, 20–21, 45, 47, 54, 57, 180–204,
273, 277
Gray, L.H. 3, 15, 39
Gregor, D.B. 18, 39
Guedes, M. 14, 39
Haas, M. 56n10, 76
Haberland, H. 52, 76
Haiman, J. 56, 76
Hale, K.L. 100, 124
Hall, T.A. 31, 39, 76
Halliday, M.A.K. 10, 39
Halpern, A. 44, 50n7, 52, 56, 72–73, 76,
90, 98
Harris, A. 21, 48, 56, 76, 227–42, 271, 273
Harris, Z. 4, 4n4, 39, 280
Hayes, B. 91, 98, 192n9, 195, 203
Hazlewood, D. 30, 39
Heine, B. 56, 56n10, 76
Henderson, J. 15, 18, 29–30, 46, 48, 54,
100–24, 247n2, 273, 275, 278
Hewitt, B.G. 234, 241
Hock, H. 263n15, 263
Hockett, C.F. 3, 32, 39
Holton, D. 278, 280
Hudson, J. 14, 39
Hyman, L. 208, 216, 226
Inkelas, S. 45, 45n4, 78
Jacobson, S.A. 81, 93, 99
Jakobsen, R. 102, 124
Janda, R. 250n5, 263
Jastrow, O. 17, 39
Jespersen, O. 9, 39
Johnson, R.E. 163, 173, 178
Johnson, S. 46
Johnston, T. 172, 178
Jones, A. W. 197, 203
Jones, D. 274, 280
Jorbenae, B. 238, 241
Joseph, B. 14, 16, 26–27, 31, 34, 39, 42,
47–48, 71, 76–77, 243–65, 266–68,
278
Kazazis, K. 251n7, 264
Kenesei, I. 17, 40
Kern, B. 45, 52, 76
Kiparsky, P. 96, 98
Kiziria, N. 235n5, 241
Klavans, J. 42, 44, 46, 53, 73–74, 77
Kleinhenz, U. 31, 39, 76
Klima, E.S. 161–62, 175, 178
Kobaie, M. 238, 241
Koerner, E.F.K. 243, 264
Koontz, J. 6, 20–21, 45, 47, 54, 57, 180–204,
273, 277
Krámský, I. 4–5, 5n5, 40
Kropp-Dakubu, M.E. 209–10, 209n5, 226
Kumari, T.C. 46, 76
Kyle, J. 154, 178
La Flesche, F. 202, 203
Lapointe, S. 79, 98
Lausberg, H. 263n15, 264
Lehiste, I. 15, 40
Liddell, S.K. 158, 163, 173, 178
Longacre, R.E. 5n5, 40
Lowie, R. 200, 203
Lucas, C. 161–62, 178
Lyons, J. 2, 7, 9–11, 23, 40, 248, 264, 270,
272, 279, 280
Macavariani, G. 227, 241
Mackridge, P. 278, 280
Malinowski, B. 4, 4n3, 40
Malkiel, Y. 174, 178
Mandel, M. 169, 178
Martinet, A. 266, 280
Matthews, G.H. 198–99, 203
Matthews, P.H. 5, 9, 11, 20–21, 25, 27, 40, 42,
53, 71, 77, 266–81
McCarthy, J.J. 24, 40
McIntosh, A. 10, 39
McNeill, D. 173, 178
Meillet, A. 9–11, 15, 40
Meira, S. 53, 77
Milewski, T. 4, 40, 140
Milner, G.B. 30, 40
Miner, K. 192n9
Mithun, M. 185, 203
Miyaoka, O. 93, 99
Morgan, M. 165, 178
Muller, N. 72, 76
Nedjalkov, I.V. 20, 40
Nespor, M. 17, 28, 28n12, 31, 40, 43n2, 47,
54, 77, 257, 264
Nevis, J.A. 42, 48, 71, 76, 244, 260n13, 264
Newman, S. 10, 27, 40
Sanie, A. 234, 241
Sapir, E. 2, 5, 12, 34, 41, 42, 77, 81, 98, 180,
182–83, 203
Schachter, P. 52, 77
Schembri, A. 172, 178
Shaul, D. 56, 77
Shea, K. 201
Smirniotopoulos, J. 247n2, 265
Smith, I. 46, 77
Sprigg, R.K. 16, 40
Žirmunskij, V.M. 5, 41
Zwicky, A.M. 42–43, 47, 53, 55–56, 72–74,
76, 78, 198–200, 203, 220, 226, 244, 253,
260n14, 264–65
Z enti, S. 232, 234, 234n3, 242
Celtic 18
Chamicuro 45
Chapacuran 45, 52
Chechen 14
Chinese 3, 6n6, 11, 32–34, 272
Crow 180, 180n1, 183, 188–99
Cup'ig, Nunivak 96n4
Cup'ik 24–28, 54–55, 79–99, 267, 276
Cushitic 49
Dagbani 16, 28–29, 44–47, 205–26, 277
Dakota 57, 180, 180n1, 183–89, 194, 197
Dakotan 180–89, 180n1, 181n3, 187n7,
193–98, 197n10, 201
Dení 125
Dhegiha 180n1, 182, 184–89, 193–95,
198
Dhegihan 180, 184–89, 194, 197–98
Djinang 46
Dutch 8
Dyirbal 4, 12, 19–23, 27, 55, 125n1, 141
English 2–12, 3n2, 4n4, 20–34, 43–44, 46, 48,
54–55, 72–74, 80, 100, 118, 141–47,
154–55, 158, 173–74, 190, 201, 224, 232,
249n3, 261, 267–68, 271–72, 274,
277–78
Aboriginal 100
Old 2
Eskimo 44, 74, 80, 183
Estonian 15
European languages 3–4, 12, 32, 181
European, Standard Average 176, 247
Evenki 20
Fijian 3–4, 14–15, 28–31, 35–37, 125n1,
142–47, 142n10
Boumaa 26, 51
Finnish 46
French 3, 3n2, 7, 9, 44n3, 46n6, 49, 50n7,
141, 154, 174, 249n3, 266, 272–73,
275
Manambu 48, 49
Mandan 193, 197
Mandarin 11
Mbyá 14
Moses-Columbia Salish 27, 31
Muskogean 56n10
Nahua 56
Nahuatl
Huasteca 56
Michoacan 56
North Pueblo 56
Tetelcingo 56
Naukanski 80
New Guinean languages 3
Ngiyambaa 44, 74
Oceanic 142–43
Omaha 180, 187–88, 197, 201–2
Omotic 49
Osage 180, 197
Papuan 15–16, 28, 48
Paumarí 125
Piedmontese 49–50, 55, 74
Pitjantjatjara 14
Ponca 180, 201
Portuguese 19, 19n9, 44, 44n3, 46n6, 49, 53,
74
Brazilian 19n9, 44, 53
European 19n9, 44, 47, 52
Quapaw 180, 184–88, 197–98
Rajasthani 16
Romance 45–49, 46n6, 55–56, 73, 263n15,
266, 276
Western 49
Russian 3, 46, 174
Sanskrit 12–13, 16–18, 276
Scottish Gaelic 14
Semitic 15, 20, 158
Serbo-Croatian 45n4, 50, 237
sign languages 18, 20, 26, 48–49, 55, 153–79,
279–80
American 154–55, 158–63, 166, 173
Australian 172
British 154
German 154, 161–62, 165
Indo-Pakistani 154–58, 162–63, 168–75
Israeli 18, 159, 164, 166
Japanese 164–65
Netherlands 158
New Zealand 162
Tagalog 52
Tariana 14, 21, 26, 28, 42–78, 279
Tepiman 56
Terena 16
Tibetan, Lhasa 16
Tiriyó 53
Tongan 72
Tswana 8
Tuareg Ahaggar 48
Tucano 44, 56
Tupí 14
Tupí-Guaraní 14
Turkic 17
Yagua 44, 46
Yiddish 16, 21
Yidiɲ 15, 17–18, 25–26, 28, 28n12, 44,
73
Yimas 15, 28–29
Yingkarta 14
Yokuts 10, 27
Yup'ik 80–81, 85, 87, 88n3, 93
Central Alaskan 79–80
Central Siberian 80
Udi 229
Urdu 155
Uto-Aztecan 56
Walmatjari 14
Warekena 14–15, 47–52, 51n11
Wari 45, 52
Warrgamay 29
Welsh 72
Western Desert language 14
Winnebago 192n9, 193, 195, 197
Wu 11
Xhosa 8
Zoque 15, 47
Zulu 8
Index of subjects