Journal of Experimental Psychology:
Human Perception and Performance
2004, Vol. 30, No. 2, 268–286
Copyright 2004 by the American Psychological Association
0096-1523/04/$12.00 DOI: 10.1037/0096-1523.30.2.268
The Perception of Tonal Structure Through the Differentiation and
Organization of Pitches
Nicholas A. Smith and Mark A. Schmuckler
University of Toronto at Scarborough
The roles of 2 psychological processes, differentiation and organization, were examined in the perception
of musical tonality. Differentiation distinguishes elements from one another and was varied in terms of
the distribution of pitch durations within tone sequences. Organization establishes relations between
differentiated elements and was varied in terms of either conformity with or deviation from a hierarchical
description of tonality. Multiple experiments demonstrated that the perception of tonality depended on a
minimal degree of differentiation in the distribution of the duration (but not frequency of occurrence) of pitches and only when pitch distributions were hierarchically organized. Moreover, the mere
differentiation of the tonic from nontonic pitches was not sufficient to induce tonal percepts. These results
are discussed in relation to tonal strength, musical expressiveness, and principles of auditory pattern
processing.
A fundamental aspect of the perception of visual scenes or
auditory sequences is the apprehension of their inherent structural
information (e.g., Garner, 1974; Gibson, 1979; Koffka, 1935;
Kubovy & Pomerantz, 1981; Lockhead & Pomerantz, 1991). Regardless of the specific information being processed, the basic
psychological questions are the same: What types of structure(s)
are contained in the scene or sequence? What psychological processes are used to apprehend such structure? What is the observer’s or listener’s sensitivity to this structure? Are different forms of
such structure equally accessible, or are some types more easily
apprehended? The goal of much of the research in perception and
cognition is to explore these basic issues; in the present article,
these questions are examined in the context of perceiving music.
Music provides an especially compelling arena for investigating
such questions. Along with speech, music represents the most
complex form of auditory information with which people interact;
thus, it is an ideal candidate for both the discovery of fundamental
principles of pattern structure and investigating the parameters of
listeners’ sensitivity to such information. Moreover, music psychology is in the fortunate position of having entire fields of study
devoted to the specification of information available within the
musical stimulus itself. For example, work in both musicology and
music theory has, over the course of many years, provided elaborate theoretical descriptions of important structural relations existing within music that may or may not be critical in understanding listeners’ percepts of such passages (see Schmuckler, 1997, or
Schmuckler & Boltz, 1994, for discussion of this point).
The Perception of Tonality and the Probe-Tone Technique
Within the tradition of Western tonal music, one theoretically
fundamental component of musical structure is the property of
tonality. In musical parlance, tonality, or tonal structure, refers to
the organization of the complete set of all 12 musical pitch classes
(called the chromatic set) around a single reference pitch. This
reference pitch is called the tonic, and the remaining pitch classes
are judged in relation to this referent. The tonic pitch denotes the
musical key of a piece of music (i.e., in the key of C, the pitch C
serves as the tonic). Tonality actually induces on the chromatic set
a hierarchical organization of importance; this hierarchy is shown
at the top of Figure 1. In this hierarchy, the tonic (Pitch Class 0)
appears at the top and is seen as the point of maximum stability.
Below the tonic are intermediate levels of stability, consisting of
the pitches that, along with the tonic, make up the tonic triad (Pitch
Classes 4 and 7) and the diatonic set (Pitch Classes 2, 5, 9, and 11).
Finally, at the bottom of the hierarchy are the remaining tones
(Pitch Classes 1, 3, 6, 8, and 10), called the nondiatonic set. These
tones are considered to be outside of the key and thus are seen as
the least important tones.
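As an illustrative aside, the four levels of this hierarchy can be written down as sets of pitch classes. The following Python sketch is not part of the original study; the names `TONAL_LEVELS`, `level_of`, and `transpose_level` are hypothetical and serve only to make the grouping described above concrete.

```python
# Pitch classes are numbered 0-11 relative to the tonic (0 = C in C major),
# following the grouping given in the text. All names here are hypothetical.
TONAL_LEVELS = {
    "tonic": {0},
    "tonic_triad": {4, 7},           # with the tonic, these form the tonic triad
    "diatonic": {2, 5, 9, 11},       # remaining in-key tones
    "nondiatonic": {1, 3, 6, 8, 10}, # tones outside the key
}


def level_of(pitch_class: int) -> str:
    """Return the hierarchy level of a pitch class given relative to the tonic."""
    pc = pitch_class % 12
    for name, members in TONAL_LEVELS.items():
        if pc in members:
            return name
    raise ValueError(f"unclassified pitch class: {pc}")


def transpose_level(level: str, tonic: int) -> set:
    """Absolute pitch classes (0 = C) of one level for a key whose tonic is `tonic`."""
    return {(pc + tonic) % 12 for pc in TONAL_LEVELS[level]}
```

For instance, `transpose_level("tonic_triad", 7)` gives the upper two triad tones in the key of G (pitch classes 11 and 2, that is, B and D).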
This theoretical hierarchy of importance has also been examined
from a psychological point of view. Krumhansl and colleagues, in
some now classic tests of this hierarchy (e.g., Krumhansl &
Kessler, 1982; Krumhansl & Shepard, 1979), provided evidence of
its psychological reality. To explore this question, Krumhansl and
Shepard (1979) developed the probe-tone procedure (see Krumhansl, 1990a, for a complete description of this method), in which
listeners provide a goodness-of-fit rating for each of the 12 pitch
classes with reference to a (typically preceding) musical context
that instantiates a specific tonality. Using this method, Krumhansl
and colleagues (Krumhansl & Kessler, 1982; Krumhansl & Shepard, 1979) found that listeners’ probe-tone ratings matched the
music-theoretic hierarchy just described. The bottom part of Figure 1 shows the averaged probe-tone ratings for the chromatic set
(found by Krumhansl & Kessler, 1982) in terms of pitch-class
numbering and with reference to a given tonality or key (that of C
major). These ratings have been called the tonal hierarchy (Krumhansl, 1990a), with the tonic receiving the highest rating and
functioning as a psychological reference point (e.g., Rosch, 1975;
see Krumhansl, 1979) by which the remaining pitch classes of the
chromatic set are judged.
Figure 1. Krumhansl and Kessler’s (1982) standardized key profile,
which demonstrates the hierarchical structure of the perceived stability of
pitches within tonal contexts. Individual pitch classes of the chromatic set
are numbered 0–11, with 0 being the tonic tone, and are presented with
reference to the tonality of C major.
Nicholas A. Smith and Mark A. Schmuckler, Division of Life Sciences,
University of Toronto at Scarborough, Toronto, Ontario, Canada.
The research reported in this article was supported by a Natural Sciences
and Engineering Research Council of Canada grant to Mark A. Schmuckler. Parts of this research were included in a master’s thesis by Nicholas A.
Smith, submitted to the University of Toronto, Toronto, Ontario, Canada.
We thank Bruce A. Schneider for comments on an earlier version of this
article and Lola L. Cuddy and Michael E. Lantz for discussion and
encouragement.
Correspondence concerning this article should be addressed to Nicholas
A. Smith, who is now at the Department of Psychology, McMaster University, 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada.
E-mail: nsmith@mcmaster.ca
Although subjective, the probe-tone technique has proven, over
the years, to be a robust way of assessing the relative stability or
goodness of fit of the different pitch classes of the chromatic set
with reference to the tonality of a musical context. These ratings
have been found to be strongly related to objective properties of
pitch relations, such as consonance values between pairs of musical tones. Krumhansl (1990a), for example, demonstrated that the
tonal hierarchy ratings matched quite well with six different measures of tonal consonance between tone pairs (pp. 50–62). At the
same time, these ratings also match strongly with statistical distributions of tone durations and/or the frequency of occurrence of
tones within actual musical contexts (Krumhansl, 1990a, pp. 62–76);
this latter finding is expanded on below.
Along with correlates of musical structure, the data arising from
the probe-tone method have also been found to be robust across a
wide array of musical contexts and to be related to a number of
psychological processes involved in music perception. For example, the probe-tone procedure, along with variants in which pairs of
tones or simultaneous tones (called chords) are used as the probes,
has been successfully used when the musical context consisted of
schematic key-defining passages (e.g., Bharucha & Krumhansl,
1983; Krumhansl, 1979; Krumhansl, Bharucha, & Castellano,
1982; Krumhansl, Bharucha, & Kessler, 1982; Krumhansl &
Kessler, 1982; Krumhansl & Shepard, 1979); realistic musical
materials, such as short melodies or more complex passages (e.g.,
Cuddy, 1993; Cuddy & Badertscher, 1987; Cuddy & Smith, 2000;
Schmuckler, 1989; Schmuckler & Tomovski, 1997, 2000; Smith &
Cuddy, 2003); 20th century musical materials representing either
extensions of or alternatives to traditional tonal music (e.g., Krumhansl, Sandell, & Sargent, 1987; Krumhansl & Schmuckler,
1986b); and non-Western musics, such as North Indian (e.g.,
Castellano, Bharucha, & Krumhansl, 1984) and Balinese (e.g.,
Kessler, Hansen, & Shepard, 1984) music. In sum, the probe-tone
technique provides a very general indicator of perceived tonal
structure across a variety of musical contexts.
Furthermore, many studies have demonstrated that the tonal
hierarchy, as assessed using the probe-tone technique, is related to
many aspects of the psychological processing of music. Studies on
recognition memory, for example, have found that the perceived
hierarchy of stability effectively predicts memory confusions.
Krumhansl (1979), for instance, found that within a tonal context,
psychologically unstable tones were more frequently confused
with stable tones than vice versa. Similarly, tonality affects memory for musical chords, with psychologically stable chords (quantified using the probe-chord method) being more easily confused
with one another (Bharucha & Krumhansl, 1983) and the probability of detecting a change in a chord being predictable from that
chord’s perceived stability (Krumhansl, Bharucha, & Castellano,
1983; see Justus & Bharucha, 2002, for a review). Finally, tonality
and perceived psychological stability affect the speed of processing of tones (Bharucha, 1987; Janata & Reisberg, 1988) and chords
(e.g., Bharucha & Stoeckig, 1986, 1987; Schmuckler & Boltz,
1994). Janata and Reisberg (1988), for example, found that reaction times for judgments of tonal membership were related to the
perceived psychological stability of the tones, and Schmuckler and
Boltz (1994) observed that expectancy judgments of chords were
predictable as a function of the chord’s role in the tonality of the
musical passage. Taken as a whole, these studies provide compelling evidence for the central impact of tonality on musical processing and for the viability of the probe-tone procedure as an
indicator of this perceived tonality.
Given the importance of tonality in musical perception, understanding the psychological processes involved in tonal perception
is a basic goal of music cognition research (Krumhansl, 1991,
2000). One approach to this issue has explored models of how
listeners might determine the tonality of a piece of Western music
(e.g., Brown, 1988; Brown & Butler, 1981; Holtzman, 1977;
Krumhansl & Schmuckler, 1986a; Leman, 1995; Longuet-Higgins
& Steedman, 1971; Shmulevich & Yli-Harja, 2000; Vos & Van
Geenen, 1996; Winograd, 1968). Of these models, the best known
are Brown and Butler’s (1981) intervallic rivalry theory and the
Krumhansl–Schmuckler key-finding algorithm (Krumhansl &
Schmuckler, 1986a; described in Krumhansl, 1990a). The former
model determines tonality through the identification of rare intervals (i.e., pairs of simultaneous or successive tones) contained in
the set of pitches making up a musical passage (see Browne, 1981;
Van Egmond & Butler, 1997), whereas the latter model operates
by comparing the frequency of occurrence or duration (but see also
Huron & Parncutt, 1993) of the component tones of a musical
passage with the tonal hierarchy values shown in Figure 1. The
efficacy of these models and their abilities to predict listeners’
judgments of musical key have been discussed extensively (Brown
& Butler, 1981; Butler, 1989, 1990; Cuddy, 1991; Krumhansl,
1990b; Schmuckler & Tomovski, 1997, 2000).
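The profile-matching idea behind the second model can be sketched in a few lines of Python. This is a simplified illustration in the spirit of the Krumhansl–Schmuckler algorithm, not the published implementation: it considers major keys only, and the function names are hypothetical. The profile values are the Krumhansl and Kessler (1982) major-key ratings as commonly reported; they are not listed in this article and are included here as an assumption.

```python
import math

# Assumed values: Krumhansl & Kessler (1982) major-key profile, pitch classes
# 0-11 relative to the tonic. Not given in the article text itself.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
            2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nonzero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_major_key(durations):
    """Correlate a 12-element duration profile (index 0 = C) with the major
    profile rotated to each possible tonic; return (tonic, correlation)."""
    scores = []
    for tonic in range(12):
        # Rotate the profile so that its tonic value lands on pitch class `tonic`.
        rotated = [KK_MAJOR[(pc - tonic) % 12] for pc in range(12)]
        scores.append(pearson(durations, rotated))
    best = max(range(12), key=lambda t: scores[t])
    return best, scores[best]
```

A passage whose total tone durations follow the profile rotated to G (tonic 7) is assigned to G major by this sketch; a fuller version would also compare against the minor-key profile.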
Two fundamental psychological processes can be thought to
underlie these models and, presumably, perceptions of tonality as
well. The first of these processes is called differentiation: the
process by which the individual elements or tones of a musical
passage are reliably distinguished from one another along some
relevant dimension (e.g., perceived importance, psychological stability). The second process, referred to as organization, is the
process by which relations between these differentiated elements,
as well as the nature of these relations themselves, are established.
For both of the key-finding models just discussed, and indeed for
the perception of tonal structure itself, listeners must differentiate
the elements of the musical passage, identifying important and/or
psychologically stable pitches, and organize this collection of
differentiated pitches into a comprehensive structure (e.g., the
tonal hierarchy). It is important to stress that the processes of
differentiation and organization are by no means limited to tonality
perception but are, rather, two fundamental aspects of theories of
perceptual organization and the perception of structure quite
generally.
Differentiation
Whereas discrimination refers to a sensitivity to differences
along some perceivable stimulus dimension (e.g., pitch, loudness,
or timbre in audition), differentiation is a higher order ability to
segregate the perceptual scene into elements on the basis of these
discriminable differences. Thus, differentiation not only depends
on basic sensory discrimination but goes beyond such concerns in
transforming the input into perceptual elements. One well-known
example of this process of differentiation occurs in auditory scene
analysis (Bregman, 1990), which describes how listeners segregate
the auditory scene into streams on the basis of frequency (Bregman
& Campbell, 1971; Van Noorden, 1975), timbre (McAdams &
Bregman, 1979), or discontinuity (Bregman & Dannenbring,
1973). Within this framework, differentiation involves the process
by which listeners segregate auditory events into different elements on the basis of some discriminable difference. These elements then combine to form more complex objects; the process
underlying this aspect, organization, is discussed below.
Assuming that pitch differentiation is fundamental to the perception of tonality, an obvious question arises as to how pitches
might be differentiated. One possible candidate is statistical in
nature, with differentiation occurring by noting the frequency
and/or duration of occurrence of the tones within a musical passage. Support for this possibility comes from statistical analyses of
music (discussed above) on the relation between psychological
stability and the frequency of occurrence and/or duration of pitches
within a tonal context.
Organization
Organization is the process of establishing relations between the
differentiated elements and, as such, presupposes and complements differentiation. Along with the issue of the specific type of
relations among the elements (e.g., more than, less than, similar
to), probably the most central issue for organization involves the
overall form of these relations. One organizational form is a
successive and/or contiguous ordering, such as arranging elements
linearly or logarithmically along a perceptual dimension. An example of such organization might be seen in unidimensional models of pitch (e.g., Stevens & Volkman, 1940; Stevens, Volkman, &
Newman, 1937; Ward, 1954; see Rasch & Plomp, 1999, for a
review). A second organizational form involves the cyclical nature
of a set of elements, with elements organized in repeating fashion.
Schmuckler (1999), for example, has demonstrated that the melodic contour (the pattern of relative rises and falls in pitch across
successive tones) can be characterized by examining its cyclical
nature, and Schmuckler and Gilden (1993) have shown that listeners can make use of fractal information in the discrimination of
auditory sequences varying in pitch and loudness.
Probably the most common form of organization involves hierarchical structure, with elements at a particular level being subordinate to or subsumed under elements at a higher, presumably
more psychologically salient level. Within research on auditory
perception, there have been multiple types of hierarchical structure
proposed, such as work on serial patterns (Jones, 1981; Martin,
1972; Restle, 1970; Simon & Kotovsky, 1963; Vitz & Todd,
1969), with hierarchical structure applied to multiple dimensions
of auditory stimuli, such as pitch, rhythm, and phrase structure
(Deutsch, 1982; Deutsch & Feroe, 1981; Jones, 1978; Lerdahl &
Jackendoff, 1983; Povel, 1981; Povel & Essens, 1985; Simon &
Sumner, 1968). Most important for the current purposes, tonality
has been quite thoroughly described in terms of its hierarchical
structure, with this structure playing a fundamental role in musical
processing, as described earlier.
Tonality, then, provides an ideal arena for studying differentiation and organization, not only because it relies so heavily on
these processes but also because it enables a systematic disentangling of these processes. In many domains, differentiation and
organization are difficult to study independently because differentiation of elements on the basis of some attribute naturally leads to
the organization of these elements. Such is the case with work on
auditory streaming (Bregman, 1990), discussed earlier, in which
element differentiation (say by frequency) leads to a particular
element organization (the formation of high and low tone frequency streams).
In contrast, tonality perception makes it possible to tease apart
these processes. For example, differentiation can be examined by
comparing how different kinds of duration or frequency of occurrence values for the various pitches might influence listeners’
abilities to perceive the psychological stability of these pitches.
Additionally, organization can be explored by having the variously
differentiated pitches occur in either a typical hierarchical organization or some atypical organization.
What experimental forms might explorations of differentiation
and organization take? For differentiation, one manipulation involves varying the durations (or frequencies of occurrence) of the
tones relative to one another; as a convenience, we talk in terms of
proportional durations (percentages of total sequence length) as
opposed to timed durations (e.g., durations in ms or s). For example, take a sequence in which three different tones (say C, F#, and
G, or Pitch Classes 0, 6, and 7; Figure 1) have respective durations
of 15.2%, 6.0%, and 12.4% of the total sequence length, with the
proportion of the remaining total duration of the melody being
divided among the remaining 9 pitch classes. This pattern of
durations can be changed to either increase or decrease the absolute differences in the proportional durations between tones while
at the same time maintaining the relative pattern of durations
across tones. Thus, a duration sequence of 11.4%, 7.2%, and
10.3% (for Tones C, F#, and G, respectively) decreases the duration differences of the original (e.g., in the original, the difference
in proportional duration between Tones C and F# was 9.2%, and
between C and G it was 2.8%, whereas in the new version these
differences are now 4.2% and 1.1%, respectively). In contrast, a
duration sequence of 35.1%, 2.2%, and 19.2% increases the proportional duration differences between the tones (e.g., the differences between C and F# and between C and G are now 32.9% and
15.9%, respectively). Despite the large absolute differences in
proportional durations, however, these sequences are equivalent in
a correlational sense, with the differences between elements
judged relative to the variance among all values. Accordingly, the
correlations between these patterns are, as might be expected, high
(rs ≥ .97).
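The arithmetic in this example can be checked with a short Python sketch. It uses only the three-tone excerpts quoted above (C, F#, and G), so the correlations it computes are illustrative and need not equal the values reported for the full 12-tone profiles. The names `PATTERNS`, `differences`, and `pearson` are hypothetical.

```python
import math
from itertools import combinations

# Proportional durations (% of sequence length) quoted in the text.
PATTERNS = {
    "original": {"C": 15.2, "F#": 6.0, "G": 12.4},
    "reduced": {"C": 11.4, "F#": 7.2, "G": 10.3},
    "exaggerated": {"C": 35.1, "F#": 2.2, "G": 19.2},
}


def differences(pattern):
    """Absolute differences (C minus F#, C minus G), rounded to one decimal."""
    return (round(pattern["C"] - pattern["F#"], 1),
            round(pattern["C"] - pattern["G"], 1))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations():
    """Correlation between every pair of three-tone patterns."""
    tones = ["C", "F#", "G"]
    return {
        (a, b): pearson([PATTERNS[a][t] for t in tones],
                        [PATTERNS[b][t] for t in tones])
        for a, b in combinations(PATTERNS, 2)
    }
```

The absolute differences change by roughly a factor of eight between the reduced and exaggerated patterns, whereas the rank order of the three tones, and hence the relative pattern, is unchanged.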
Manipulating the absolute degree of differences between elements while holding constant the relative patterning of the differences between the elements has a number of advantages. Most
obviously, it provides a means of manipulating the degree of
differentiation between elements. Thus, a comparison of listeners’
tonal percepts in response to sequences varying in their absolute
proportional duration differences (as in the three sequences just
described) assesses how strongly elements must be differentiated
to produce a perceptibly apprehensible structure (i.e., tonality).
Moreover, because all of the sequences contain the same relative
pattern of durations, this manipulation holds constant the organizational structure of the sequences.
In the current study, the organizational structure used was based
on Krumhansl and Kessler’s (1982) tonal hierarchy values, which
mirror the duration and frequency of occurrence values found in
Western music (as described earlier). To actually manipulate the
degree of differentiation, the tonal hierarchy values were raised to
exponents varying from 0 (producing a flat, undifferentiated profile) to 1.0 (reproducing the original Krumhansl and Kessler hierarchy) to 4.5 (producing a highly exaggerated profile); we refer to
this exponent as tonal magnitude. This power transformation preserves, to a high degree, the relative duration pattern while drastically altering the absolute differences between the elements.¹
Examples of profiles differing in tonal magnitude are shown in
Figure 2. If percepts of tonal structure systematically vary with
changing tonal magnitude, then clearly differentiation is based, in
large part, on absolute differences between the elements. In contrast, it might be that past some minimum level, variation in tonal
magnitude fails to influence the perception of tonal structure. If so,
then differentiation between elements seems primarily to be based
on relative differences.
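The power transformation can be sketched as follows. This is a minimal illustration, assuming the commonly reported Krumhansl and Kessler (1982) major-key values (not listed in this article); the function names are hypothetical. Under that assumption, an exponent of 1.0 appears to reproduce the 15.2%, 6.0%, and 12.4% proportions quoted earlier for C, F#, and G.

```python
import math

# Assumed values: Krumhansl & Kessler (1982) major-key profile (0 = tonic).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
            2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def duration_profile(tonal_magnitude):
    """Raise each profile value to the exponent and normalize so the
    proportions sum to 1. An exponent of 0 yields a flat profile."""
    powered = [v ** tonal_magnitude for v in KK_MAJOR]
    total = sum(powered)
    return [p / total for p in powered]


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for a flat profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_with_original(tonal_magnitude):
    """How well the transformed profile preserves the original relative pattern."""
    return pearson(KK_MAJOR, duration_profile(tonal_magnitude))
```

Because the transformation is monotonic, the rank order of the pitch classes is the same at every nonzero exponent; only the absolute spread between them changes.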
In the current study, organization was manipulated by presenting the differentiated pitches in two ways (compare the first two
panels of Figure 2). In one case, listeners heard sequences in which
the pitches were arranged hierarchically, mirroring the tonal hierarchy ratings (top panels of Figure 2). In the second case, the
sequences contained similarly differentiated pitches, but the hierarchical organization characterizing tonality was disrupted by randomizing the duration pattern (middle panels of Figure 2). Accordingly, these two situations manipulated the presence versus
absence of hierarchical organization among comparably differentiated pitches, with any differences in perceptions of tonality
between the conditions attesting to the importance of organization
independent of differentiation. (The importance of the bottom
panels of Figure 2, or the binary hierarchy, is explained in Experiment 3.)
Experiment 1: The Role of Duration Information
Experiment 1 provided an initial test of the role of differentiation and organization in listeners’ perceptions of tonality. Differentiation was manipulated via variations in tonal magnitude values, and organization was manipulated by presenting these pitches
in either a hierarchical or nonhierarchical fashion. Although these
manipulations are equally applicable to either the tone durations or
the frequencies of occurrence of tones, in this experiment, durational patterning was manipulated with frequency of occurrence
held constant. It is interesting to note that, although the perception
of tonality is influenced by both the duration (Coady, 1992) and
frequency of occurrence (Oram & Cuddy, 1995) of pitches, which
are in fact naturally correlated in music, these two parameters are
not equivalent in their ability to induce a tonal percept. Lantz and
Cuddy (1996, 1998) teased these parameters apart by systematically varying the sounded duration of different pitches while
holding their frequency of occurrence constant or varying frequency of occurrence while controlling total duration. Tonal stability ratings indicated that variations in tone duration, but not
frequency of occurrence, led to percepts of tonality in listeners.
Given this work, these initial tests focused on durational
patterning.
Method
Participants. Forty students (with normal hearing) at the University of
Toronto at Scarborough, Toronto, Ontario, Canada, participated in this
experiment in exchange for $7 (U.S.$10) or bonus credit in their introductory psychology course. These students were assigned to either the hierarchical (n = 20) or the nonhierarchical (n = 20) condition; the decision
to use a between-subjects design was based on practical time limitations.
All of these participants met a 3-year minimum musical training requirement. On average, the two groups of listeners did not differ significantly in
their years of musical training (Ms = 9.85 and 8.15, SDs = 4.13 and 3.17,
respectively), t(19) = 1.38.²
¹ Correlations between Krumhansl and Kessler’s (1982) tonal hierarchy
values and duration profiles with tonal magnitudes of 0.5 to 4.5 ranged
from .90 to 1.00 (M = .96, SD = .04).
Figure 2. Examples of duration profiles differing in their tonal magnitude and hierarchical structuring.
Hierarchical duration profiles are those based on Krumhansl and Kessler’s (1982) tonal hierarchy. Nonhierarchical profiles were created by randomizing these same duration profiles. Binary profiles differentiate the tonic
tone (Pitch Class 0) from the remaining tones.
Materials. The stimuli for this study consisted of a series of algorithmically composed pitch sequences, 10 s in length and all containing 24
tones. The pitches for each sequence were drawn from 1 of 10 adjacent
chromatic sets within the pitch range from C4 (262 Hz) to A5 (880 Hz),
with each pitch occurring twice. To produce the patterns of relative
durations, the values for the 12 chromatic pitches in the standardized major
key profile of Krumhansl and Kessler (1982) were raised to 1 of 10
exponents (0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5), thereby
producing the tonal magnitude manipulation described earlier. These transformed values were then expressed as a proportion of the sum of all 12
values, multiplied by 10,000 (the duration of the sequence in ms), and
divided by 2 (the number of times each pitch occurs in the sequence). The
resulting durations, shown in Appendix A, were then used to create
sequences by randomly permuting the order of all 24 tones. Because these
duration profiles were based on the Krumhansl and Kessler (1982) tonal
hierarchy, the sequences produced by this method constitute the hierarchically organized sequences of this study. To produce the nonhierarchical
sequences, the durations based on the tonal hierarchy values were randomly assigned to different pitches.
² Differentiation and organization are presumably general perceptual
processes and, hence, operative in all (musically trained and untrained)
listeners. Accordingly, the decision to use trained listeners was based on
the fact that, although both trained and untrained listeners can perceive
tonality (e.g., Cuddy & Badertscher, 1987; Hébert, Peretz, & Gagnon,
1995), musically trained listeners nevertheless show greater sensitivity to
more subtle aspects of the hierarchical structure of tonality (e.g., Cuddy &
Badertscher, 1987; Krumhansl & Shepard, 1979). As such, we felt that
using musically trained listeners would provide the strongest possible test
of sensitivity to tonality in the somewhat unusual sequences used in this study.
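The stimulus construction described under Materials can be sketched as follows. This is a hedged reconstruction, not the software used in the study: the Krumhansl and Kessler (1982) values are the commonly reported ones, and the use of MIDI note numbers (60 = C4), the treatment of the lowest pitch of the chromatic set as the tonic, and all function names are assumptions made for illustration.

```python
import random

# Assumed values: Krumhansl & Kessler (1982) major-key profile (0 = tonic).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
            2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def tone_durations_ms(tonal_magnitude, total_ms=10_000, repeats=2):
    """Per-tone duration (ms) for each pitch class: the powered profile as a
    proportion of its sum, times sequence length, divided by the number of
    times each pitch occurs."""
    powered = [v ** tonal_magnitude for v in KK_MAJOR]
    total = sum(powered)
    return [total_ms * p / total / repeats for p in powered]


def make_sequence(tonal_magnitude, lowest_midi=60, hierarchical=True, rng=None):
    """Return 24 (midi_note, duration_ms) tones in random order.

    Assumption: pitch class 0 (the tonic) is the lowest note of the set.
    """
    rng = rng or random.Random()
    durations = tone_durations_ms(tonal_magnitude)
    if not hierarchical:
        # Nonhierarchical condition: same durations, randomly reassigned to pitches.
        rng.shuffle(durations)
    tones = [(lowest_midi + pc, durations[pc]) for pc in range(12)] * 2
    rng.shuffle(tones)  # random permutation of all 24 tones
    return tones
```

At a tonal magnitude of 0 every tone lasts 10,000 / 24, or about 417 ms; at 1.0 each of the two tonic tones lasts roughly 760 ms under the assumed profile values.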
All sequences were played on a Yamaha DX7 synthesizer, set to an
electric piano timbre, which was connected to a Macintosh 8100 AV
computer via a MIDI interface. Audio output from the synthesizer was fed
into a Mackie 1202 mixer and was presented to listeners at a comfortable
listening level through two Boss MA-12 micro monitors.
Design and procedure. The study used the probe-tone method of
Krumhansl and Shepard (1979). Each trial consisted of a presentation of a
pitch sequence, followed by a 1-s silent interval and then a 2-s probe tone.
The probe tone was 1 of the 12 pitches of the chromatic scale and was
played with the same timbre and loudness, and in the same octave, as the
sequence. After each probe tone, listeners rated on a 7-point scale how well
they felt the probe tone fit into the context of the sequence they heard. It
was stressed that they were to judge how well the probe tone fit into the
sequence in general, not as a continuation of the sequence.
Each block of the experiment contained 14 trials, with the listener
hearing the same sequence on all trials within a given block. The first 2
trials in the block were practice, used to familiarize the listener with the
pitch sequence for that block. The following 12 trials contained the 12
probe tones, presented in a different random order for each listener. Each
listener completed 10 blocks of trials, corresponding to the 10 tonal
magnitude values of 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5. The
order of the tonal magnitudes across blocks was also randomized for each
listener. In the nonhierarchical condition, a different randomized version of
the key profile was used for each block and listener. To avoid carry-over
effects between blocks, the sequences were drawn from different chromatic
sets, suggesting different tonics, on successive blocks of trials. An entire
experimental session lasted approximately 45 min, after which each listener filled out a participant information form and was debriefed.
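The block and trial structure just described can be summarized in a short sketch. It is illustrative only: how the 2 practice probes were chosen and how chromatic sets were allotted to blocks are not specified in the text, so random practice probes and a random permutation of the 10 chromatic sets are assumptions, as are all names.

```python
import random

TONAL_MAGNITUDES = [0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]


def build_session(rng):
    """One listener's session: 10 blocks of 14 trials (2 practice + 12 probes)."""
    magnitudes = TONAL_MAGNITUDES[:]
    rng.shuffle(magnitudes)            # block order randomized per listener
    chromatic_sets = list(range(10))   # assumed: index of the set's lowest pitch
    rng.shuffle(chromatic_sets)        # a permutation, so successive blocks differ
    session = []
    for magnitude, chroma in zip(magnitudes, chromatic_sets):
        probes = list(range(12))
        rng.shuffle(probes)            # probe order randomized per listener
        practice = [rng.randrange(12) for _ in range(2)]  # assumed selection rule
        trials = ([{"probe": p, "practice": True} for p in practice]
                  + [{"probe": p, "practice": False} for p in probes])
        session.append({"tonal_magnitude": magnitude,
                        "chromatic_set": chroma,
                        "trials": trials})
    return session
```

Each trial comprises a 10-s sequence, a 1-s silence, and a 2-s probe, so the 140 trials take about 30 min before response time is added, which is consistent with the reported session length.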
Results and Discussion
Prior to analyses, listeners’ probe-tone ratings across the different blocks were transposed back to a common key; for convenience, these ratings are presented with reference to C major. For
the nonhierarchical condition, in which tone durations were randomized across the pitches, the probe-tone ratings were assigned to
scale degrees according to each pitch’s duration rather than its
name. Thus, the rating for the probe tone whose pitch had the
longest duration in the melody was assigned to the pitch C (Pitch
Class 0), the rating for the probe tone with the second longest
duration was assigned to the pitch G (Pitch Class 7), and so forth.
By this operation, the hierarchical structure that was destroyed in
the nonhierarchical condition was mimicked, enabling comparisons with the hierarchical condition.
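The reassignment of ratings by duration rank can be sketched as below. This is an assumption-laden illustration rather than the authors' analysis code: the target rank order is taken from the commonly reported Krumhansl and Kessler (1982) major-key values, and the function name is hypothetical.

```python
# Assumed values: Krumhansl & Kessler (1982) major-key profile (0 = tonic).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
            2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def realign_by_duration(durations, ratings):
    """Re-index probe-tone ratings by duration rank.

    durations[pc] and ratings[pc] describe the 12 pitches as heard. The rating
    of the longest pitch is stored at pitch class 0 (C), the second longest at
    pitch class 7 (G), and so on down the hierarchy.
    """
    # Pitch classes ordered from most to least stable: 0, 7, 4, 5, 9, 2, ...
    target_order = sorted(range(12), key=lambda pc: KK_MAJOR[pc], reverse=True)
    # Heard pitches ordered from longest to shortest duration. With tied
    # durations (e.g., tonal magnitude 0) this ordering is arbitrary.
    heard_order = sorted(range(12), key=lambda pc: durations[pc], reverse=True)
    realigned = [None] * 12
    for target_pc, heard_pc in zip(target_order, heard_order):
        realigned[target_pc] = ratings[heard_pc]
    return realigned
```

After this step, a listener whose ratings simply tracked tone duration in a randomized sequence would yield a realigned profile resembling the hierarchical one.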
Preliminary analyses investigated the degree of consistency in
probe-tone ratings at each level of tonal magnitude for hierarchical
and nonhierarchical conditions. Accordingly, each listener’s set of
probe-tone ratings (for each tonal magnitude level individually) was
correlated with the ratings of all other individual listeners, and the
average correlation within the resulting half matrix of all possible
intercorrelations was computed. For the hierarchical condition, the
mean intersubject correlation increased systematically with increasing
tonal magnitude, from −.09 at a tonal magnitude of 0 up to .41 and
.35 at tonal magnitudes of 4.0 and 4.5, respectively. Supporting this
increase, the mean intersubject correlations were positively correlated
with tonal magnitude, r(8) = .88, p < .01. In contrast, for the
nonhierarchical condition, the mean intersubject correlations at all
tonal magnitudes were quite low, varying between .02 and .09.
Additionally, there was no significant relation between tonal magnitude and the mean intersubject correlation, r(8) = .53. Generally,
these patterns suggest that increases in tonal magnitude systematically
add some common structure that is perceived by listeners, but only
when this structure is organized hierarchically.
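The consistency measure used here, the mean of the half matrix of intersubject correlations, can be sketched as follows. The function names and the toy data in the usage note are hypothetical; with 20 listeners per condition the half matrix holds 190 correlations.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; assumes neither rating profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_intersubject_correlation(ratings_by_listener):
    """Average correlation over all unordered pairs of listeners.

    ratings_by_listener: list of 12-element probe-tone rating profiles,
    one per listener, for a single tonal magnitude level.
    """
    pairs = list(combinations(ratings_by_listener, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

As a toy check, three listeners whose profiles are an ascending ramp, a rescaled copy of that ramp, and its reversal yield pairwise correlations of 1, −1, and −1, for a mean of −1/3.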
The next step in the analyses investigated the degree of differentiation in the probe-tone profiles as a function of tonal magnitude and organization type. Toward this end, ratings were analyzed
in a three-way analysis of variance (ANOVA), with the within-subject variables of tonal magnitude (0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0,
3.5, 4.0, 4.5) and probe tone (C, C#, D, D#, E, F, F#, G, G#, A, A#,
B) and the between-subjects variable of organization type (hierarchical vs. nonhierarchical).³ Of particular relevance were the significant main effects of probe tone, F(11, 418) = 13.03, MSE =
3.52, p < .01, and tonal magnitude, F(9, 342) = 3.44, MSE =
3.08, p < .01, as well as the significant Probe Tone × Organization Type, F(11, 418) = 7.08, MSE = 3.52, p < .01, Probe Tone ×
Tonal Magnitude, F(99, 3762) = 1.79, MSE = 2.31, p < .01, and
Probe Tone × Organization Type × Tonal Magnitude, F(99,
3762) = 1.32, MSE = 2.31, p < .05, interaction effects. No
significant effects were found for organization type, F(1, 38) =
0.63, MSE = 29.30, or the Tonal Magnitude × Organization Type
interaction, F(9, 342) = 1.35, MSE = 3.08.
These effects were further explored in a series of one-way
ANOVAs for each level of tonal magnitude and organization type
(see Table 1).⁴ An intriguing pattern was found, such that for the
hierarchical condition, the probe tones became more differentiated
with increasing tonal magnitude. For example, there were no
differences in ratings at tonal magnitudes of 0 and 0.5, but at 1.0
and above, these ratings became significantly differentiated. In
contrast, in the nonhierarchical condition, listeners failed to differentiate probe tones at any tonal magnitude.
The next step in the analyses involved determining the degree to
which the differentiated pitches were organized. Toward this end,
the mean probe-tone ratings for each tonal magnitude were corre3
One issue with repeated measures ANOVAs involves a concern over
violations of sphericity for one or more of the repeated measures variables,
with such a violation indicating that the assumptions of equal variances and
covariances are not justified, and thus the value of the F ratio is compromised. When such a situation occurs, the more conservative Greenhouse–Geisser correction should be applied to the repeated measures effects. In this
experiment, as well as the subsequent experiments, we examined our data
for sphericity violations, and when such violations were detected (as
occurred in all three experiments), the affected effects were reexamined
using the correction. In every case in which sphericity was violated, these
more conservative tests replicated the pattern of significances observed
with the more standard ANOVA procedure. Accordingly, to simplify our
explication of what are already complicated results, we do not present these
additional tests, and we report the uncorrected degrees of freedom.
4
Given the large number of analyses conducted here, one might argue
that it would be more appropriate to adopt a stricter level of statistical
significance (say the .01 level) than is typically used. Although we did
consider this, one concern with this procedure is that such a conservative
approach eliminates some effects in later experiments that are actually
problematic for (as opposed to supportive of) the arguments we make in
this article. Thus, even though removal of these results would strengthen
our arguments, in a spirit of conservatism, we chose to maintain the more
traditional significance values.
Table 1
Mean Probe-Tone Ratings for Each Pitch at Each Tonal Magnitude for Listeners in the Hierarchical and Nonhierarchical Conditions
of Experiment 1
Tonal magnitude
Pitch
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
4.90
3.95
4.15
4.35
4.90
4.95
4.30
5.70
3.85
4.00
3.35
3.15
5.20***
2.02
5.70
3.90
4.20
4.45
4.90
4.60
4.25
5.25
4.10
4.55
3.15
3.05
5.78***
2.04
5.25
3.10
3.50
3.60
4.25
4.55
3.80
4.95
3.60
4.25
3.10
3.50
4.57***
2.13
5.30
2.95
4.65
3.10
4.95
4.45
3.55
5.55
4.00
4.75
2.45
3.20
11.25***
1.82
5.40
3.55
4.50
3.10
5.50
5.35
3.35
5.60
3.40
5.15
2.80
3.50
10.37***
2.18
4.05
4.00
4.35
3.90
5.10
4.00
4.70
4.25
4.40
4.70
4.25
3.95
1.09
2.50
4.15
3.60
3.45
4.10
4.65
4.40
4.40
4.40
3.90
4.10
4.25
3.05
1.33
3.23
4.80
4.25
4.30
4.00
4.95
4.55
4.40
4.20
4.25
4.25
3.80
3.45
1.08
3.06
Hierarchical
C
C#/D♭
D
D#/E♭
E
F
F#/G♭
G
G#/A♭
A
A#/B♭
B
F(11, 209)a
MSE
4.85
4.45
4.40
4.35
4.40
4.60
4.75
5.20
4.95
4.35
3.70
4.80
1.71
1.73
4.60
4.35
4.75
4.15
4.35
4.25
4.15
4.35
4.40
5.00
4.75
4.25
0.73
1.99
5.10
3.90
4.15
4.40
5.40
4.15
4.10
4.90
5.35
5.05
3.15
4.40
4.81***
1.87
4.95
4.40
4.00
5.00
4.45
4.50
4.45
5.25
4.65
4.90
3.20
3.95
3.35***
1.87
5.45
4.20
4.80
4.10
4.60
4.75
4.15
6.05
4.65
4.05
3.65
3.85
5.34***
1.77
Nonhierarchical
C
C#/D♭
D
D#/E♭
E
F
F#/G♭
G
G#/A♭
A
A#/B♭
B
F(11, 209)a
MSE
4.20
4.45
3.80
5.05
4.35
4.65
4.20
4.20
4.35
4.85
4.45
3.90
0.96
2.69
4.00
3.90
4.25
4.20
4.65
4.25
4.55
4.55
3.80
4.10
4.70
3.80
0.81
2.60
3.55
4.10
4.30
4.70
4.00
4.30
4.30
4.40
4.40
4.65
4.30
4.80
0.84
2.69
4.30
4.25
4.10
4.65
4.45
4.55
4.35
4.40
3.90
4.55
4.15
4.35
0.35
2.59
4.95
4.05
4.20
4.10
3.95
3.80
3.65
5.10
4.60
4.50
4.65
4.40
1.37
2.95
4.60
3.65
4.25
4.00
4.10
3.45
4.70
4.95
3.70
4.05
4.10
3.40
1.42
3.38
3.45
3.00
4.50
4.35
3.95
4.30
4.00
4.15
4.35
3.70
3.75
4.05
1.04
3.57
a The results of a one-way analysis of variance for the main effect of pitch of the probe tone.
*** p < .001.
lated with the tonal hierarchy values of Krumhansl and Kessler
(1982). This analysis assessed the degree to which the tonal
structure underlying the sequences (be it hierarchical or nonhierarchical) was present in listeners’ ratings. Figure 3 presents these
correlations as a function of tonal magnitude for the two organizational conditions. For the hierarchical sequences, tonal structure
became increasingly evident as tonal magnitude increased, with tonal
magnitudes of 2.0 and above yielding significant correlations.
Moreover, these correlations were themselves significantly related
to tonal magnitude, r(8) = .88, p < .01. In contrast, for the
nonhierarchical condition, there was no systematic increase. Only
one of the correlations (for tonal magnitude of 3.5) was significant,
and there was no reliable relation between increasing tonal magnitude and perceived tonality, r(8) = .48. Thus, only for the
hierarchical condition, as tonal magnitude increased, listeners’
ratings more closely matched a standard description of tonal
structure.
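The organization measure used in this analysis can be sketched as a Pearson correlation between a 12-element rating profile and the Krumhansl and Kessler (1982) major-key values. The sketch also shows where a criterion near .57 for df = 10 comes from, assuming the standard two-tailed critical t of 2.228 at the .05 level; the function names are hypothetical, and the two test profiles are key profiles rather than listener data.

```python
from math import sqrt

# Krumhansl & Kessler (1982) major-key profile, indexed by pitch class (C = 0).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def critical_r(t_crit, df):
    """Convert a critical t value into the equivalent critical correlation."""
    return t_crit / sqrt(t_crit ** 2 + df)


# Assumption: two-tailed .05 critical t for df = 10, taken from a standard t table.
r_crit = critical_r(2.228, 10)

# A profile identical to the hierarchy, and the same profile rotated by a tritone
# (the F# major profile), as two extreme reference cases.
f_sharp_profile = KK_MAJOR[6:] + KK_MAJOR[:6]
r_same = pearson(KK_MAJOR, KK_MAJOR)
r_distant = pearson(f_sharp_profile, KK_MAJOR)
```

A set of mean ratings would count as reliably tonal in this analysis when its correlation with `KK_MAJOR` exceeds the criterion.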
In general, this experiment explored the importance of differentiation and organization in the perception of tonality. One of the
primary results of this experiment was that, without a sufficient
degree of distinction between tone durations, listeners did not
differentiate the tonal implications of the various probe tones and,
thus, failed to apprehend the implied tonality of the sequences.
Moreover, percepts of tonality increased systematically with increasing duration differences. Together, these findings imply that
it is not just the relative pattern of tone durations that is important
in differentiating the tonal implications of these elements, but
instead, there may be some absolute level of duration difference
necessary for element differentiation. Such a result has implications for musical key finding; this topic is addressed in the General
Discussion.
The perception of tonality was not wholly dependent on simple
element differentiation, however. Just as important for this percept
was the fact that the differentiated elements were organized in a
hierarchical fashion, mirroring the structure described by Krumhansl and colleagues (Krumhansl & Kessler, 1982; Krumhansl &
Shepard, 1979); when durations were organized in a nonhierarchical fashion, no tonal perception was evident. It is interesting to
note that, because we reorganized the actual ratings in the nonhierarchical condition such that the rating for the longest duration
tone was compared to the value of the highest rated element in the
tonal hierarchy (and thus the longest duration tone in the hierarchical condition), one important implication of this result is that
the production of the goodness-of-fit ratings does not occur simply
by giving the highest rating to the element heard for the longest
period of time. If this were the strategy employed by listeners, then
the correlations between the probe-tone ratings in the nonhierarchical condition and the tonal hierarchy values should have been
significant at some level of tonal magnitude. The fact that this did
not occur suggests that the relative duration pattern acted as a cue
to allow listeners to recognize a tonal hierarchy and, thus, to
apprehend the musically important relations inherent in the structure of this hierarchy (e.g., the structural importance of Pitch
Classes 0, 4, and 7; the relative unimportance of the pitch classes
of the nondiatonic set). This idea that the pattern induced a
particular analytical set on the perception of the individual tones is
intriguing and is also returned to in the General Discussion.
Figure 3. Correlations between mean probe-tone ratings and standard
tonal hierarchy ratings (Krumhansl & Kessler, 1982) for the hierarchical
and nonhierarchical conditions of Experiment 1 as a function of tonal
magnitude. The dashed line at r = .57 represents the p < .05 significance
level for df = 10.
One limitation to the current results is that the stimuli of this
experiment broke the natural correlation that exists between tone
duration and frequency of occurrence, using sequences in which
the elements were completely undifferentiated in terms of this
latter factor (i.e., every pitch occurred twice). Recognizing this
limitation raises the obvious question of whether differentiation of
elements will occur if the elements are distinguished by variation
in frequency of occurrence (with total duration held constant).
Exploring issues related to this question was the goal of Experiment 2.
Experiment 2: The Role of Frequency-of-Occurrence Information
Experiment 2 examined a number of questions related to the
importance of tone duration, frequency of tone occurrence, and the
relation between the two on listeners’ percepts of tonality. The
most straightforward issue involved whether changing tonal magnitude has the same impact on the percept of tonality when element
differentiation occurs via frequency of occurrence (holding duration constant) as it does when differentiation occurs via tone
duration (holding frequency of occurrence constant). Given the
results of Lantz and Cuddy (1996, 1998) described earlier, in
which percepts of tonality mirrored duration and not frequency of
occurrence information, it was anticipated that this situation would
fail to instantiate tonal percepts for both hierarchical and nonhierarchical organizations, irrespective of tonal magnitude.
An additional, and more subtle, issue involved examining the
impact on tonal percepts of breaking the natural relation between
frequency of occurrence and duration. One reason that listeners
may have failed to either differentiate or organize the nonhierarchical stimuli of Experiment 1 is that these stimuli violated this
typical relation, which may have been especially disruptive given
the fact that these stimuli were presented in a nonstandard organizational pattern (i.e., randomized tonal hierarchy relations). The
current experiment explored whether elements can be differentiated and organized (both hierarchically and nonhierarchically)
when duration and frequency of occurrence information retain
their natural correlation.
Method
Participants. Forty students at the University of Toronto at Scarborough participated in this experiment in exchange for $7 (U.S.$10) or bonus
credit in their introductory psychology course. They were assigned to one
of two conditions, hierarchical (n = 20) or nonhierarchical (n = 20). On
average, the two groups did not differ significantly in their years of musical
training (Ms = 7.8 and 8.8, SDs = 2.62 and 3.54, respectively), t(35) =
0.99, ns.
Materials. Pitch sequences were created that varied systematically in
terms of the frequency of occurrence and total duration of their constituent
pitches. All sequences were approximately 10 s in length and contained
between 98 and 100 tones, with the proportion of these tones assigned to
a given pitch determined by a tonal-magnitude calculation. The variation in
the exact number of notes arose because of a rounding error in moving
from decimal percentages representing tonal magnitude values to integer
values for frequency-of-occurrence values. In contrast to the previous
experiment, this experiment used a more limited range of tonal magnitude
values (0.5, 1.0, 1.5, 2.0, and 2.5). This reduced range was adopted to keep
the duration of the experiment to a reasonable length, as well as to avoid
the situation at higher tonal magnitudes in which the proportion of occurrences for a pitch would equal less than 1%–2% (i.e., less than one or two
occurrences). The tone durations were calculated in one of two ways. In the
uncontrolled total duration condition, all tones were sounded for 100 ms
regardless of their pitch, with the consequence that the total duration of
each pitch increased with its frequency of occurrence. In the controlled
total duration condition, the total duration for each pitch was set a priori to
833 ms, with the duration of each individual occurrence of that pitch
determined by dividing this total duration by its frequency of occurrence.
Thus, this condition not only broke the natural correlation between frequency of occurrence and total duration, it in fact reversed this relation.
The frequency-of-occurrence and duration values for uncontrolled and
controlled total duration conditions, as a function of tonal magnitude,
appear in Appendix B.⁵ As in Experiment 1, sequences were either hierarchical or nonhierarchical, with the relation between the frequency-of-occurrence and duration values and pitches either preserving or destroying
the hierarchical structure of tonality.
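The two duration schemes in the Materials can be illustrated with a short sketch. The 100-ms and 833-ms constants come from the text; the occurrence counts and function names are hypothetical, because the actual values are in Appendix B.

```python
FIXED_TONE_MS = 100      # uncontrolled condition: every tone lasts 100 ms
TOTAL_DURATION_MS = 833  # controlled condition: every pitch totals 833 ms


def uncontrolled_total_ms(count):
    """Total duration of a pitch when each occurrence lasts 100 ms."""
    return count * FIXED_TONE_MS


def controlled_tone_ms(count):
    """Duration of one occurrence when the pitch's total is fixed at 833 ms."""
    return TOTAL_DURATION_MS / count


# Hypothetical occurrence counts for a frequent, a moderate, and a rare pitch.
hypothetical_counts = {"frequent": 30, "moderate": 15, "rare": 2}
for label, count in hypothetical_counts.items():
    print(label, uncontrolled_total_ms(count), round(controlled_tone_ms(count), 1))

# Twelve pitches at 833 ms each give a sequence of roughly 10 s.
sequence_ms = 12 * TOTAL_DURATION_MS
```

In the controlled condition the most frequent pitch receives the shortest individual tones, which is the reversal of the natural relation noted in the text.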
Procedure. The procedure of this experiment was identical to that of
Experiment 1.
Results and Discussion
As in Experiment 1, probe-tone ratings were transposed to the
common key of C major, and the nonhierarchical condition was
organized hierarchically on the basis of tone duration (e.g., the
rating for the longest pitch was assigned to Pitch Class 0). Preliminary analyses investigated the degree of consistency in probe-tone ratings at each level of tonal magnitude for hierarchical and
nonhierarchical conditions by intercorrelating listeners’ ratings.
For listeners in the uncontrolled total duration hierarchical condition, mean intersubject correlations increased with increasing tonal
magnitude, with a range of −.01 at a tonal magnitude of 0.5 to .18
at a tonal magnitude of 2.5; these mean intersubject correlations
were themselves significantly correlated with tonal magnitude,
r(3) = .97, p < .01. However, the relation between tonal magnitude and mean intersubject correlation was not significant for
controlled total duration hierarchical sequences. In this case, correlations ranged from .01 to .04 at tonal magnitudes of 0.5 and 2.5,
respectively, with no significant relation between the two variables, r(3) = .79. For the nonhierarchical sequences, there was
similarly no relation between intersubject correlations and tonal
magnitude for either the controlled total duration, r(3) = .73, or the
uncontrolled total duration, r(3) = .58, conditions, with the mean
intersubject correlations ranging from .02 to .05 and .00 to .02 for
tonal magnitudes between 0.5 and 2.5, respectively. This pattern of
results suggests that only in the hierarchical, uncontrolled total
duration condition did a common perceptible structure emerge
with increasing tonal magnitude.
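The transposition to C major mentioned at the start of this analysis amounts to re-indexing each rating profile relative to its tonic. A minimal sketch follows, assuming ratings are stored by the pitch class of the probe tone; the function name and the example ratings are hypothetical.

```python
def transpose_to_c(ratings, tonic_pc):
    """Re-index ratings so that position 0 holds the rating for the tonic.

    ratings[pc] is the rating for the probe tone with pitch class pc (C = 0);
    tonic_pc is the pitch class of the tonic of the sequence that was heard.
    """
    return [ratings[(degree + tonic_pc) % 12] for degree in range(12)]


# Hypothetical ratings for a sequence whose tonic was D (pitch class 2).
ratings_in_d = [3.0, 2.5, 6.0, 2.0, 3.5, 2.2, 4.5, 4.0, 2.4, 5.2, 2.1, 3.8]
in_c = transpose_to_c(ratings_in_d, tonic_pc=2)
print(in_c[0], in_c[7])  # 6.0 5.2
```

After transposition, the tonic rating sits at Pitch Class 0 and the rating for the pitch a fifth above the tonic sits at Pitch Class 7, so profiles from different keys can be averaged.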
The next analytical step investigated the degree of differentiation in probe-tone ratings as a function of the different experimental variables. Toward this end, listeners’ ratings were submitted to
a four-way ANOVA with the within-subject variables of tonal
magnitude (0.5, 1.0, 1.5, 2.0, 2.5), probe tone (C, C#, D, D#, E, F,
F#, G, G#, A, A#, B), and duration (controlled vs. uncontrolled)
and the between-subjects variable of organization (hierarchical vs.
nonhierarchical). There were significant main effects of probe
tone, F(11, 418) = 2.36, MSE = 3.90, p < .01, and duration type,
F(1, 38) = 10.49, MSE = 8.97, p < .01; significant Probe Tone ×
Organization, F(11, 418) = 2.30, MSE = 3.90, p < .01, Probe
Tone × Duration Type, F(11, 418) = 3.06, MSE = 2.77, p < .01,
Probe Tone × Tonal Magnitude, F(44, 1672) = 2.04, MSE =
2.61, p < .01, and Duration Type × Tonal Magnitude, F(4, 152) =
5.27, MSE = 3.63, p < .01, interactions; and significant Probe
Tone × Duration Type × Organization, F(11, 418) = 2.91,
MSE = 2.77, p < .01, Probe Tone × Organization × Tonal
Magnitude, F(44, 1672) = 1.53, MSE = 2.61, p < .05, and Probe
Tone × Duration Type × Tonal Magnitude, F(44, 1672) = 1.72,
MSE = 2.55, p < .01, interactions. The main effects of organization type, F(1, 38) = 1.69, MSE = 36.42, and tonal magnitude,
F(4, 152) = 0.30, MSE = 4.27, were not significant; nor were the
Organization Type × Duration Type, F(1, 38) = 0.12, MSE =
8.97, Organization Type × Tonal Magnitude, F(4, 152) = 0.68,
MSE = 4.27, and Organization Type × Duration Type × Tonal
Magnitude, F(4, 152) = 0.83, MSE = 3.63, interaction effects. The
four-way Probe Tone × Duration Type × Organization × Tonal Magnitude interaction
effect was not significant, F(44, 1672) = 1.29, MSE = 2.55.
Although complex, these interactions suggest that ratings of the
probe tones varied systematically with changes in duration type,
organization, and tonal magnitude. Subsequent analyses attempted
to disentangle these effects.
Specifically, probe-tone ratings were analyzed in a series of
one-way ANOVAs at each level of tonal magnitude, organization
type, and duration type; these ratings and the results of these
ANOVAs appear in Table 2. In the uncontrolled total duration
hierarchical condition, the probe tones became increasingly differentiated with increasing tonal magnitude, with significant differences between probe-tone ratings at tonal magnitudes of 1.5 and
higher. In contrast, no differentiation was found at any level of
tonal magnitude in the controlled total duration hierarchical condition. Similarly, in the two nonhierarchical conditions (controlled
and uncontrolled total duration), the only significant effect observed was at a tonal magnitude of 2.0 for the uncontrolled total
durations; other than this isolated effect, ratings did not vary. Thus,
differentiation of probe tones was restricted to sequences containing hierarchically organized tones in which tone duration varied in
accordance with frequency of occurrence.
The next series of analyses investigated the degree of organization of the differentiated pitches. As in Experiment 1, the mean
probe-tone ratings for each tonal magnitude, organization, and
duration type were correlated with the Krumhansl and Kessler
(1982) tonal hierarchy values to assess the degree of tonal structure. Figure 4 presents these correlations as a function of tonal
magnitude for the two organization and duration types. In the
hierarchical conditions, uncontrolled total duration led to increasing tonal structure with greater tonal magnitude, with significant
correlations at tonal magnitudes of 1.5 and above. In contrast,
controlled total duration failed to produce any significant correlations. Similarly, in both nonhierarchical conditions, there was no
evidence of tonal structure at any tonal magnitude. Thus, just as
with differentiation, it was only when the sequences contained
hierarchically organized tones varying in total tone duration that
listeners apprehended any degree of tonality.
Along with assessing the role of frequency-of-occurrence information, the controlled total duration condition also disentangled
5
In the uncontrolled total duration condition in particular, the length of
some of these tones was quite short, raising questions about whether
listeners could actually perceive the pitch of these tones in the first place.
Work on pitch perception (e.g., Patterson, Peters, & Milroy, 1983; Robinson & Patterson, 1995) has found that for complex tones, 8 –10 cycles of
the waveform are required for stable pitch perception. On the basis of this
estimate, virtually all of the tones employed in this experiment should have
produced a recognizable pitch. The only possible concern here is that, at a
tonal magnitude of 2.5, if the tonic tone was either C4 (262 Hz) or C#4 (277
Hz), a 28-ms duration would result in 7.34 and 7.76 cycles (respectively),
which are just below the threshold for stable pitch perception. Although this is
potentially worrisome, because random tonics were chosen throughout this
experiment, our feeling is that the possibility of this being a significant
factor is, in all likelihood, small at best.
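The cycle counts quoted in this footnote follow from multiplying frequency by duration. The sketch below reproduces that arithmetic; the function names are hypothetical, and the 8-cycle criterion is the lower bound of the estimate cited above.

```python
def cycles(frequency_hz, duration_ms):
    """Number of waveform cycles completed in a tone of the given duration."""
    return frequency_hz * duration_ms / 1000.0


def min_duration_ms(frequency_hz, n_cycles=8):
    """Shortest duration that still contains n_cycles of the waveform."""
    return 1000.0 * n_cycles / frequency_hz


c4_cycles = cycles(262, 28)        # about 7.34 cycles for C4
c_sharp4_cycles = cycles(277, 28)  # about 7.76 cycles for C#4
c4_min_ms = min_duration_ms(262)   # about 30.5 ms needed for 8 cycles at C4
```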
Table 2
Mean Probe-Tone Ratings for Each Pitch at Each Tonal Magnitude for Listeners in the Hierarchical and Nonhierarchical Conditions
of Experiment 2 as a Function of Duration Condition
Tonal magnitude
Uncontrolled total duration (ms)
Pitch
0.5
1.0
1.5
Controlled total duration (ms)
2.0
2.5
0.5
1.0
1.5
2.0
2.5
3.70
4.00
3.95
4.40
4.30
4.45
4.05
3.55
4.15
4.90
4.45
4.00
0.99
2.73
3.80
4.15
4.15
3.70
4.10
4.10
4.00
3.80
4.25
4.60
4.20
4.45
0.42
3.29
4.20
3.75
3.25
4.45
3.90
4.40
4.95
4.75
4.55
4.45
4.85
3.95
1.96
2.54
4.20
4.00
3.65
4.40
3.55
4.90
4.30
4.70
4.85
4.60
5.25
4.10
1.69
3.10
4.40
4.95
4.05
3.95
4.20
4.90
4.25
5.15
5.10
4.40
4.25
3.85
1.67
2.49
4.10
4.85
4.15
4.05
4.50
4.35
4.85
4.05
4.70
4.60
4.50
3.55
1.27
2.37
4.20
4.45
4.90
4.70
3.85
4.60
4.50
4.45
3.90
4.40
4.05
5.20
1.25
2.54
4.60
4.65
4.15
4.95
4.45
4.35
4.65
4.65
3.90
4.30
3.80
4.45
0.83
2.66
4.90
4.60
4.70
4.40
4.70
3.75
4.10
4.25
3.90
4.35
4.30
4.20
0.91
2.54
4.95
4.50
4.95
4.75
4.80
5.30
5.00
4.90
4.55
4.95
4.85
4.40
0.54
2.27
Hierarchical
C
C#/D♭
D
D#/E♭
E
F
F#/G♭
G
G#/A♭
A
A#/B♭
B
F(11, 209)a
MSE
3.45
4.20
4.00
4.35
4.65
4.45
3.95
4.20
4.05
4.50
4.75
3.80
0.87
3.20
4.70
3.40
3.35
4.20
4.30
4.20
3.65
4.30
4.55
4.35
3.45
4.10
1.54
2.80
4.70
3.60
3.50
3.60
3.80
4.05
4.00
4.95
4.15
4.95
3.05
3.60
2.57**
2.82
4.90
2.70
4.05
2.80
4.00
3.95
3.65
5.45
3.20
4.35
3.25
3.50
4.71***
2.83
5.65
4.20
3.55
3.25
4.70
3.75
3.55
4.55
3.45
3.40
2.40
2.90
5.77***
2.66
Nonhierarchical
C
C#/D♭
D
D#/E♭
E
F
F#/G♭
G
G#/A♭
A
A#/B♭
B
F(11, 209)a
MSE
4.50
4.40
4.60
4.75
4.40
3.20
4.20
4.70
4.10
3.70
4.20
4.05
1.32
2.97
4.00
3.90
3.75
4.60
4.50
4.40
4.50
5.40
4.75
4.35
5.30
3.85
1.94
2.91
3.90
5.20
4.05
4.10
4.25
3.15
3.90
4.60
4.15
5.25
3.75
3.75
2.48
2.92
4.70
4.95
4.15
3.65
4.00
4.10
4.20
4.65
3.85
3.35
5.20
3.70
2.56*
2.46
4.60
4.70
4.15
4.00
3.80
4.50
3.50
4.30
3.40
3.35
3.50
4.95
2.46
2.51
a The results of a one-way analysis of variance for the main effect of pitch of the probe tone.
* p < .05. ** p < .01. *** p < .001.
the influences of the total duration of a tone and the duration of an
individual occurrence of that tone. In Experiment 1, the total pitch
duration was varied by manipulating the duration of each pitch’s
occurrence; because all pitches occurred equally often, total duration and individual pitch duration were perfectly correlated. In
contrast, in the current controlled total duration condition, individual durations were manipulated independently of their total duration. One consequence of this procedure is that the duration profile
of the individual pitches within the controlled total duration condition (assuming a hierarchical structure) exhibits a high degree of
tonal structure but of a very different key.
Take a controlled total duration profile that implies the key of C
major in terms of the frequency of occurrence of pitches. Hierarchically important pitches (i.e., C, E, and G; Pitch Classes 0, 4, and
7) occur frequently, but because total duration is controlled, these
tones have short individual durations. If one examines the pattern
of individual durations across pitches (see Appendix B), one finds
that these durations actually imply tonalities of F# major and C#
major (two maximally dissimilar keys from C major).⁶ This produces the very interesting situation in which two different aspects
of the same profile make competing predictions about which key
could be perceived by listeners. To examine the possibility that
listeners might be sensitive to the alternative tonal implications of
the individual durations, we correlated probe-tone ratings with
standardized key profiles for these alternative keys. No significant
correlations were found between listeners’ ratings and alternative
keys, suggesting that despite significant correlations between individual note durations and alternative keys, it was only the total
duration of pitches that drove the perception of tonal structure.
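The competing key implications described above can be illustrated with a small sketch. It assumes, purely for illustration, that the frequency of occurrence of each pitch is proportional to the Krumhansl and Kessler (1982) C major profile; the actual values are in Appendix B. Individual durations are then proportional to the reciprocal of frequency, and they are correlated with key profiles obtained by rotating the C major profile. All function and variable names are hypothetical.

```python
from math import sqrt

# Krumhansl & Kessler (1982) major-key profile, indexed by pitch class (C = 0).
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def key_profile(tonic_pc):
    """Rotate the C major profile so that its peak falls on tonic_pc."""
    return [KK_MAJOR[(pc - tonic_pc) % 12] for pc in range(12)]


# Assumption: relative frequency of occurrence follows the C major profile.
relative_frequency = KK_MAJOR
# With total duration fixed, each occurrence lasts (total / count), so the
# individual durations are proportional to 1 / frequency.
individual_durations = [1.0 / f for f in relative_frequency]

r_c_major = pearson(individual_durations, key_profile(0))
r_f_sharp_major = pearson(individual_durations, key_profile(6))
```

Under this assumption the individual durations correlate negatively with the C major profile and positively with the F# major profile, which is the kind of conflict between frequency and individual duration that the analysis examines.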
6 Correlations between the individual duration profiles, shown in Appendix B, and tonal hierarchy values ranged from .70 to .78 for F# major
and from .58 to .64 for C# major.
Figure 4. Correlations between mean probe-tone ratings and standard tonal hierarchy ratings (Krumhansl &
Kessler, 1982) for the hierarchical and nonhierarchical controlled and uncontrolled total duration conditions of
Experiment 2 as a function of tonal magnitude. The dashed line at r = .57 represents the p < .05 significance
level for df = 10.
Together, Experiments 1 and 2 demonstrate that for a tonal
organization to be perceived, the pitches of the chromatic scale
must be sufficiently differentiated in their total duration, and they
must be hierarchically organized, embodying the structure outlined
in Krumhansl’s classic work (Krumhansl, 1979, 1990a, 2000;
Krumhansl & Kessler, 1982; Krumhansl & Shepard, 1979) on the
perception of tonality. One potential limitation to this result is the
fact that it is unclear whether listeners’ failure to perceive tonal
structure in the nonhierarchical conditions was due to the absence
of information supporting a consistent tonal interpretation of the
sequence or the presence of information contradicting a tonal
interpretation. For example, if a sequence has C as the longest
pitch, a likely candidate for the tonality of the sequence is C major.
In the hierarchical condition, the second and third longest pitches,
G and E, support this interpretation. However, in the nonhierarchical condition, the second and third longest pitches might be C#
and B, whereas G and E have short durations. In this case, C major
is not supported by the short G and E pitches, and it is contradicted
by the long C# and B pitches. Recognition of this limitation thus
raises the question of which pitch relations are important. Must
listeners hear all of the pitches in their appropriate hierarchical
position to perceive the underlying tonal structure, or are some
pitch relations, such as the distinction between the tonic and
nontonic pitches, more important than others?
Experiment 3: Binary Hierarchical Information
Experiment 3 addressed the question of whether the failure of
tonal perception in the nonhierarchical condition was due to a lack
of supportive information or to the presence of contradictory
information by introducing a modified hierarchical structure. This
modification preserved the differentiation between the tonic (e.g.,
Pitch Class 0; Figure 1) and nontonic pitches (Pitch Classes 1–11;
Figure 1) but eliminated any differentiation among nontonic
pitches by equating their durations. In relation to this two-level
binary hierarchy (as opposed to the multilevel tonal hierarchy
shown in Figure 1, top), the real difference between the hierarchical and nonhierarchical structures in Experiments 1 and 2 is
whether the key implication of the longest pitch, the tonic, is
supported or contradicted by the remaining pitches.
If, on the one hand, listeners’ failure to perceive tonality in the
nonhierarchical conditions of Experiments 1 and 2 was due to
contradictory information, then eliminating all differentiation
among the nontonic pitches should reduce this influence and, thus,
allow listeners to perceive the key implied by the tonic. If, on the
other hand, it was the absence of evidence supporting the key
implied by the tonic that prevented listeners from perceiving
tonality, then this implied key will remain unperceived, because
the supportive information is equally absent in the binary
condition.
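The binary hierarchy introduced above can be sketched as a simple transformation of a full duration profile: the tonic keeps its duration, and every nontonic pitch receives the mean of the nontonic durations. The function name and the example durations are hypothetical.

```python
def binary_profile(durations, tonic_pc=0):
    """Keep the tonic duration and equate all nontonic durations to their mean."""
    nontonic = [d for pc, d in enumerate(durations) if pc != tonic_pc]
    nontonic_mean = sum(nontonic) / len(nontonic)
    return [d if pc == tonic_pc else nontonic_mean for pc, d in enumerate(durations)]


# Hypothetical hierarchical duration profile in ms, indexed by pitch class (C = 0).
hierarchical_ms = [1500, 300, 700, 350, 1000, 900, 400, 1200, 380, 750, 320, 500]
binary_ms = binary_profile(hierarchical_ms)
```

Averaging preserves the total duration of the sequence while removing any duration cue that could distinguish one nontonic pitch from another.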
Method
Twenty students at the University of Toronto at Scarborough participated in this experiment in exchange for $7 (U.S.$10) or bonus credit in
their introductory psychology course. They all met the 3-year minimum
musical training prerequisite, having an average 8.15 years of training
(SD = 2.83). The stimuli were identical to the sequences of Experiment 1,
with total tone duration manipulated as a function of tonal magnitude (from
0 to 4.5). In this case, though, changes in tonal magnitude were indicated
solely by differences in duration between the tonic and remaining pitches.
The duration of the 11 nontonic pitches was set to a single value, deter-
THE PERCEPTION OF TONAL STRUCTURE
279
differences in probe-tone ratings; however, for tonal magnitudes of
2.5 and over, there were significant differences in these ratings. To
determine whether these effects were due to the differentiation of
all the pitches or solely to differentiation of the tonic from nontonic pitches, a second series of ANOVAs was performed on the
nontonic pitches alone; the results of these analyses also appear in
Table 3. Although some tonal magnitudes (2.5 and 3.0) produced
significant differences, by and large there was no differentiation of
probe-tone ratings once the tonic was removed. As such, this result
suggests that listeners primarily differentiated the tonic from the
nontonic pitches only and did not differentiate between the remaining pitches themselves.
Finally, the degree of organization was assessed by correlating
the mean probe-tone ratings with the tonal hierarchy values,
r(10) ⫽ ⫺.02 to .72 (shown in Figure 5). Although there was not
as strong a linear increase with increases in tonal magnitude,
r(8) ⫽ .38, ns, it is notable that the only two significant correlations with the tonal hierarchy were for the highest tonal magnitudes, suggesting that increasing relative tone duration differences
drives perceptual organization for binary profiles, just as it did for
the fully differentiated profiles of Experiments 1 and 2.
Before much is made of this result, however, it is important to
evaluate the impact of the high tonic rating on the correlations with
the tonal hierarchy, to assess whether differentiation of the tonic
induced the complete pattern of stability relations of the tonal
hierarchy. To evaluate this possibility, these correlations were
calculated after removal of the tonic value. Overall, this analysis
produced lower correlations, ranging from ⫺.54 to .48, with none
of the correlations statistically significant. Moreover, the weak
mined by averaging the duration of the remaining values used in Experiment 1. Examples of binary profiles differing in their tonal magnitudes are
shown in the bottom panels of Figure 2, and the actual durations appear in
Appendix C. Other than the averaging of tone durations for nontonic
pitches, all aspects of Experiment 3 were identical to Experiment 1.
Results and Discussion
As in the previous experiments, consistency was examined by
intercorrelating listeners’ ratings and calculating the mean intersubject correlation at each level of tonal magnitude. These mean
intersubject correlations ranged from ⫺.02 to .13 and showed a
general increase with increasing tonal magnitude, r(8) ⫽ .73, p ⬍
.05. This increase suggests that some common structure was perceived by listeners at higher tonal magnitudes, although it is worth
noting that this consistency is lower than that for the hierarchical
condition of Experiment 1.
Next, the degree of differentiation of probe-tone ratings was
examined in a two-way ANOVA with the within-subject variables
of tonal magnitude (0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and 4.5)
and probe tone (C, C#, D, D#, E, F, F#, G, G#, A, A#, and B).
There was a significant effect of probe tone, F(11, 209) = 6.87,
MSE = 2.34, p < .01. The main effect of tonal magnitude, F(9,
171) = 1.78, MSE = 2.15, and the Tonal Magnitude × Probe Tone
interaction, F(99, 1881) = 1.17, MSE = 2.11, were not significant.
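The degrees of freedom in these F ratios can be checked with a little arithmetic. The sketch below assumes a fully within-subject design with 20 listeners; the listener count is not stated in this passage and is inferred here from the error terms, and `within_subject_df` is an illustrative helper name rather than anything used by the authors.

```python
def within_subject_df(n_subjects, levels_a, levels_b):
    """Effect and error df for a two-way, fully within-subject ANOVA."""
    s = n_subjects - 1          # subjects contribute n - 1 df
    a = levels_a - 1            # df for factor A
    b = levels_b - 1            # df for factor B
    return {
        "A": (a, a * s),            # error term is A x Subjects
        "B": (b, b * s),            # error term is B x Subjects
        "AxB": (a * b, a * b * s),  # error term is A x B x Subjects
    }


# Assumed: 20 listeners, 10 tonal magnitudes (A), 12 probe tones (B).
df = within_subject_df(n_subjects=20, levels_a=10, levels_b=12)
print(df)
```

With these inputs the three pairs agree with the values reported for tonal magnitude, probe tone, and their interaction.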
For comparison with the previous studies, a series of one-way
ANOVAs was performed, examining the probe-tone ratings at
each level of tonal magnitude. Table 3 presents the mean probe-tone ratings as a function of tonal magnitude, along with the results
of these analyses. At low levels of tonal magnitude, there were no
Table 3
Mean Probe-Tone Ratings for Experiment 3
                                        Tonal magnitude
Pitch         0       0.5     1.0     1.5     2.0     2.5     3.0     3.5     4.0     4.5
C             4.65    4.30    4.80    4.65    4.70    5.15    5.25    5.00    5.15    5.85
C#/D♭         4.65    4.00    4.45    3.80    5.10    4.75    5.05    4.75    4.35    4.35
D             4.65    3.65    4.05    3.30    4.20    4.15    3.90    3.95    4.60    4.00
D#/E♭         4.25    4.15    3.80    3.80    4.50    4.45    4.40    4.05    3.75    3.90
E             4.40    4.15    4.05    4.05    4.55    4.55    4.45    3.95    4.65    3.60
F             4.70    4.30    4.05    4.00    4.10    4.55    3.80    4.15    3.90    4.45
F#/G♭         4.95    4.25    4.40    4.45    3.85    4.40    4.00    3.95    4.00    3.60
G             4.50    4.00    4.60    4.40    3.85    4.40    3.20    4.15    4.20    3.85
G#/A♭         4.25    4.05    3.85    4.05    4.35    3.50    3.90    4.20    3.85    3.95
A             4.25    4.55    3.95    4.20    4.00    3.50    3.40    3.65    4.00    3.70
A#/B♭         4.35    3.70    4.35    4.25    4.10    3.75    3.85    4.50    3.50    3.70
B             4.35    4.80    4.35    4.05    4.20    4.40    4.20    5.00    3.80    4.25
Tonic and nontonic pitches
F(11, 209)ᵃ   0.53    0.97    0.86    0.95    1.34    2.58**  3.44**  1.90*   1.98*   3.63**
MSE           1.91    2.15    2.26    2.65    2.03    1.91    2.11    2.03    2.20    2.12
Nontonic pitches alone
F(10, 190)ᵃ   0.58    1.08    0.64    0.77    1.34    1.95*   2.47**  1.52    1.22    0.80
MSE           1.82    2.08    2.24    2.67    1.96    1.96    2.09    2.04    2.13    2.20
ᵃ The results of a one-way analysis of variance for the main effect of pitch of the probe tone.
* p < .05. ** p < .01.
Figure 5. Correlations between mean probe-tone ratings and standard tonal hierarchy ratings (Krumhansl & Kessler, 1982) for the binary
sequences of Experiment 3 as a function of tonal magnitude, shown both for all pitches (tonic and nontonic) and for the
nontonic pitches alone. The dashed lines at r = .57 and .60 represent the
p < .05 significance levels for df = 10 and 9, respectively.
linear trend with tonal magnitude was now eliminated, r(8) = −.15, ns.
Thus, although auditory sequences containing a differentiated tonic rendered the tonic at least recognizable as an important pitch, simple tonic identification did not induce a full-blown hierarchical organization on the complete set of pitches.
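The tonic-removal check can be illustrated from the published means. The sketch below is not the authors' analysis script: it takes the mean ratings for tonal magnitude 4.0 from Table 3 and the C major key profile of Krumhansl and Kessler (1982), computes the Pearson correlation with and without the tonic, and compares each value with the critical r implied by a two-tailed critical t. The function names and the hard-coded critical t values (2.228 for df = 10 and 2.262 for df = 9, taken from standard tables) are assumptions introduced here.

```python
import math

# Krumhansl & Kessler (1982) C major profile, pitch classes C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]

# Mean probe-tone ratings at tonal magnitude 4.0 (Table 3), C..B.
RATINGS_4_0 = [5.15, 4.35, 4.60, 3.75, 4.65, 3.90, 4.00, 4.20, 3.85, 4.00, 3.50, 3.80]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, df):
    """Smallest |r| reaching significance, given the critical t for df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


# All 12 pitches (df = 10), then the 11 nontonic pitches (df = 9).
r_all = pearson_r(RATINGS_4_0, KK_MAJOR)
r_nontonic = pearson_r(RATINGS_4_0[1:], KK_MAJOR[1:])

print(f"all pitches:   r = {r_all:.2f}, critical r = {critical_r(2.228, 10):.3f}")
print(f"nontonic only: r = {r_nontonic:.2f}, critical r = {critical_r(2.262, 9):.3f}")
```

With these inputs the full-profile correlation lands near the .72 upper bound reported for the complete set of pitches and clears the df = 10 criterion, whereas the nontonic-only value drops to roughly .48 and falls short of the df = 9 criterion.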
Listeners’ failure to recover the tonal hierarchy in these sequences shows that the binary hierarchy is not sufficient to induce
tonal perception. This suggests that the failures of tonality perception in the nonhierarchical conditions of Experiments 1 and 2 were
most likely due to the absence of information supporting a tonal
interpretation of the sequence and not to the presence of contradictory information. Given the fact that supporting information
appears to be critical, and that the binary sequences do not provide
enough of such support, the question immediately arises as to
exactly how much and what type of supporting hierarchical information is required to induce tonal perception. One possible way of
addressing this question might be to retain a two-level hierarchy
but add more discriminating information within these levels, with
the top level containing multiple structurally important tones (i.e.,
the pitch classes of the tonic triad, as shown in the first two levels
of Figure 1, top) and the bottom level containing the remaining
tones (as outlined in Figure 1, top). Or, alternatively, it would be
possible to include additional hierarchical levels without discriminating within each level. Thus, one could use a trinary hierarchy,
with the top level containing a long duration tonic, the next level
containing the remaining diatonic tones played with an intermediate duration, and the bottom level containing the nondiatonic tones
presented with a short duration. Eventually, of course, with the
addition of more hierarchical levels, the typical tone profile will
emerge; nevertheless, such systematic additions of structure would
provide insight into the nature of the information necessary to
invoke more abstract, internal representations of musical structure
in listeners.
Manipulating the number and content of the hierarchical levels
also allows for exploration of one of the more curious, and admittedly contradictory, findings of this experiment, which is that in
contrast to the previous studies, listeners in this experiment could
track the most prevalent tone in the auditory sequence and use this
information as the basis for their stability ratings. The obvious
difference between the present experiment and Experiments 1 and
2, of course, is that in the current experiment, listeners only had to
track one duration difference with a single tone as opposed to the
multiple differences with multiple tones in the earlier experiments.
In this case, use of randomized versions of the multileveled profiles (described above) would enable determination of the degree
to which listeners can use patterns of novel duration differences to
drive percepts of psychological stability. As an aside, Lantz (2002)
has recently demonstrated that listeners can, under some circumstances, use relative duration differences as a cue to differentiation
of pitches in up to a three- to four-level hierarchy with nontonal
sequences. On a more general level, the findings of Lantz (2002),
as well as those of this study, provide insight into how listeners
might actually internalize the hierarchical structure of tonality
during musical learning and development.
General Discussion
The present experiments examined two complementary processes of perceptual organization (differentiation and organization) within the context of perceiving tonality. Overall, it was
found that these two principles played a conjunctive role, with the
apprehension of tonality requiring both a sufficient degree of
differentiation among pitches, characterized by tonal magnitude,
and hierarchical organization of these differentiated pitches, characterized by adherence to a prototypical schema for musical tonality (Krumhansl, 1990a; Krumhansl & Kessler, 1982; Krumhansl & Shepard, 1979). These findings have a host of implications
for the understanding of how listeners come to apprehend tonality,
as well as aspects of auditory processing more generally. Here, we
explore some of these implications.
One of the first issues with which we can grapple is the relation
between our conception of tonal magnitude and the more standard
psychomusicological concept of tonal strength. Although we have
derived tonal magnitude as a power transformation of the standardized key profile, on reflection it is clear that tonal magnitude
is in many ways comparable to the more familiar concept of tonal
strength. Research on tonal strength has occupied a central place in
work on musical cognition, with such investigations extensively
studying, for example, the consequences of varying tonal strength
on the processing of and memory for musical passages (e.g.,
Croonen, 1994, 1995; Cuddy, Cohen, & Mewhort, 1981; Cuddy,
Cohen, & Miller, 1979; Cuddy & Lyons, 1981; Dowling, 1978,
1991). In addition, this work has identified characteristics that
make a passage tonally strong, with music heard as tonally strong
if it (a) is diatonic, in that it is composed primarily of pitches of the
diatonic set; (b) begins and ends on the tonic; and (c) exhibits
cadential structure (e.g., contains a sequence of chords built on
Pitch Class 0, followed by Pitch Class 7, and ending on Pitch Class
0). It is interesting to note that our manipulation of tonal magnitude
is linked to variation in many of these parameters. For example,
increases in tonal magnitude result in increased diatonicism, in that
diatonic and nondiatonic tones become more and less frequent, respectively. Similarly, the prescriptions that tonal music
begin and end on the tonic and exhibit cadential structure have the
effect that hierarchically important tones become more frequent,
thus producing duration distributions with greater tonal
magnitudes.
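To make the link between tonal magnitude and diatonicism concrete, here is a minimal sketch of the power transformation. It assumes that a duration profile is obtained by raising the Krumhansl and Kessler (1982) C major profile to the power of the tonal magnitude and rescaling so that the proportions sum to 1; the exact standardization used in the experiments is defined earlier in the article and may differ. `duration_profile` and `diatonic_share` are illustrative names.

```python
# Krumhansl & Kessler (1982) C major profile, pitch classes C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
DIATONIC = [0, 2, 4, 5, 7, 9, 11]  # pitch classes of the C major scale


def duration_profile(magnitude, base=KK_MAJOR):
    """Proportion of total duration per pitch class (assumed power transform)."""
    powered = [v ** magnitude for v in base]
    total = sum(powered)
    return [p / total for p in powered]


def diatonic_share(profile):
    """Fraction of total duration carried by the seven diatonic pitches."""
    return sum(profile[pc] for pc in DIATONIC)


for m in (0.0, 1.0, 2.0, 4.5):
    prof = duration_profile(m)
    print(f"magnitude {m:.1f}: tonic {prof[0]:.3f}, "
          f"diatonic share {diatonic_share(prof):.3f}")
```

At magnitude 0 every pitch receives 1/12 of the total duration, so the diatonic share is 7/12. Because every diatonic value in the profile exceeds every nondiatonic value, the share grows as the exponent increases.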
As conceptualized in the current article, tonal magnitude provides a viable means of assessing the tonal strength of musical
passages, one that goes beyond previous attempts to quantify this
aspect of musical structure. For example, one earlier method of
tonal strength assessment was described by Takeuchi (1994), who,
using the Krumhansl–Schmuckler key-finding algorithm (described earlier; Krumhansl & Schmuckler, 1986a), suggested that
the value of the highest correlation produced by the algorithm
(called the maximum key correlation, or MKC) could index the
tonal strength of a passage. In keeping with her hypothesis, Takeuchi found that the MKC was indeed predictive of perceived tonal
strength, with high-MKC melodies perceived by listeners as having greater tonal strength than low-MKC melodies.
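The maximum key correlation can be sketched in a few lines. This is an illustration of correlational pattern matching in the spirit of the Krumhansl-Schmuckler algorithm, not a reproduction of the authors' or Takeuchi's code: the 24 key profiles are rotations of the major and minor profiles from Krumhansl and Kessler (1982), and `key_profile`, `key_correlations`, and `max_key_correlation` are hypothetical helper names.

```python
import math

# Krumhansl & Kessler (1982) profiles for C major and C minor, C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KK_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def key_profile(base, tonic):
    """Rotate a C-based profile so that its first value falls on `tonic`."""
    return [base[(pc - tonic) % 12] for pc in range(12)]


def key_correlations(durations):
    """Correlate a 12-element duration vector (C..B) with all 24 key profiles."""
    scores = {}
    for tonic in range(12):
        scores[f"{NAMES[tonic]} major"] = pearson_r(durations, key_profile(KK_MAJOR, tonic))
        scores[f"{NAMES[tonic]} minor"] = pearson_r(durations, key_profile(KK_MINOR, tonic))
    return scores


def max_key_correlation(durations):
    """Return (key name, r) for the best-matching key, i.e., the MKC."""
    scores = key_correlations(durations)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical input: total duration (arbitrary units) per pitch class, C..B.
example = [2.0, 0.5, 3.0, 0.5, 1.5, 0.5, 2.5, 6.0, 0.5, 3.5, 0.5, 4.0]
print(max_key_correlation(example))
```

The returned name is the candidate key and the returned correlation is the quantity Takeuchi proposed as an index of tonal strength.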
Of course, one of the main points of the research reported in the
present article is that assessing tonal strength by correlational
pattern matching, as occurs with the key-finding algorithm, has
some important limitations. These limitations are illustrated by the
duration profiles in the top panels of Figure 2. According to the
key-finding algorithm and the MKC, sequences based on either
distribution should be essentially comparable in instantiating tonality (e.g., have comparable tonal strengths) because the two
distributions correlate to a similar degree with the tonal hierarchy.
Clearly, however, this is not the case, with these two duration
distributions giving rise to sequences differing dramatically in
their perceived tonality.
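This limitation can be illustrated numerically. The sketch assumes that duration profiles are power transforms of the Krumhansl and Kessler (1982) C major profile, normalized to proportions; that construction, the choice of magnitudes 0.5 and 3.0, and the helper names are assumptions of this example rather than details taken from Figure 2. Two profiles that differ greatly in how sharply they separate the pitches both correlate strongly with the tonal hierarchy, so the correlation alone says little about the degree of differentiation.

```python
import math

# Krumhansl & Kessler (1982) C major profile, pitch classes C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def duration_profile(magnitude, base=KK_MAJOR):
    """Proportion of total duration per pitch class (assumed power transform)."""
    powered = [v ** magnitude for v in base]
    total = sum(powered)
    return [p / total for p in powered]


def spread(profile):
    """Gap between the longest and shortest proportional duration."""
    return max(profile) - min(profile)


weak = duration_profile(0.5)    # weakly differentiated
strong = duration_profile(3.0)  # strongly differentiated
for label, prof in (("magnitude 0.5", weak), ("magnitude 3.0", strong)):
    print(f"{label}: r = {pearson_r(prof, KK_MAJOR):.3f}, spread = {spread(prof):.3f}")
```

Both correlations come out high, while the spread of proportional durations differs several-fold between the two profiles.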
As such, tonal magnitude estimates could provide a reasonable
conjunct measure of tonal strength, to be used with the key-finding
algorithm to produce a more complete description of the tonal
implications of a passage. Along these lines, the key-finding
algorithm could be used to determine the candidate key(s) for a
given musical passage, and a tonal magnitude estimate of the
passage could be calculated to provide a sense of how strongly
tonal structure is instantiated (note that tonal magnitude estimates
for a given passage do not vary with different keys, meaning that
this measure is not a key-finding measure per se).7 Thus, it is
conceivable that a musical passage might produce a high maximum key correlation with a particular tonality while containing a
relatively low tonal magnitude value; such a situation would
suggest that this passage, although unambiguous in its tonal structure, nevertheless induces this key only somewhat weakly. How
much to weight a judgment of key determination on the basis of
variation in tonal magnitude is an open question at the moment, but
it does suggest an interesting avenue for future work.
Of course, critical to the use of tonal magnitude in key finding
is some estimate of the minimum level of tonal magnitude necessary for a passage to induce a tonal percept. On the basis of the
results of the current study, it seems that duration differences
produced in accordance with the standardized key profile of Krumhansl and Kessler (1982; see Figure 1) having a tonal magnitude of
1.0 might in fact not be sufficient to produce reliable tonal percepts. Instead, Experiment 1 suggests that tonal magnitude values
of about 2.0 might be required, whereas Experiment 2 drops this
value to about 1.5. The obvious reason for the somewhat lower
estimate in the latter case is that, in Experiment 2, both duration
and frequency-of-occurrence values were hierarchically organized,
whereas in Experiment 1 only duration values were organized in
this fashion. It should be remembered, though, that all three
experiments used randomized, nonmetrical pitch sequences. Accordingly, embedding tonal magnitude variation into more realistic
melodies might lead to the perception of tonality at even lower
tonal magnitudes. Future work could, and should, examine this
issue more closely. (This comparison across experiments is intriguing because it suggests that the redundancy of information
from duration and frequency of occurrence in musical passages is
important in inducing tonal percepts, in that passages containing
such redundancy might not need to contain overly exaggerated
duration differences between tones. This, then, might partially
explain why highly chromatic musical passages [e.g., music making frequent use of nondiatonic pitches along with diatonic ones],
such as those found in Western classical works of the late 19th to
early 20th century, can nevertheless induce reasonably strong tonal
percepts.)
7 One way of assessing the tonal magnitude of a passage would be to
compare the duration profile of the passage in question with a range of
tonal magnitude profiles, say from 0 to 4.0 (in steps of 0.1), and look for the
tonal magnitude value producing the smallest squared mean difference
scores. If one were to then take a tonal magnitude estimate of 1.5 as the
ideal tonal magnitude value (based, for example, on the hierarchical
uncontrolled total duration condition of Experiment 2), a measure of tonal
strength could be derived by comparing the observed tonal magnitude of a
passage with this ideal value. Although speculative, such a procedure
suggests an intriguing avenue for future investigation.
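The procedure outlined in Footnote 7 might be sketched as follows. The sketch reads "smallest squared mean difference" as the smallest mean squared difference between proportional duration profiles, and it assumes that a tonal magnitude profile is the Krumhansl and Kessler (1982) C major profile raised to the given power and normalized. Both readings, the key being fixed at C, and the helper names are assumptions rather than details confirmed by the text.

```python
# Krumhansl & Kessler (1982) C major profile, pitch classes C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
CANDIDATES = [i / 10 for i in range(41)]  # 0.0, 0.1, ..., 4.0


def duration_profile(magnitude, base=KK_MAJOR):
    """Proportion of total duration per pitch class (assumed power transform)."""
    powered = [v ** magnitude for v in base]
    total = sum(powered)
    return [p / total for p in powered]


def normalize(durations):
    """Convert raw total durations into proportions of the overall duration."""
    total = sum(durations)
    return [d / total for d in durations]


def mean_squared_difference(a, b):
    """Mean of the squared element-wise differences between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def estimate_tonal_magnitude(durations):
    """Grid-search the candidate magnitude whose profile best fits `durations`."""
    observed = normalize(durations)
    return min(CANDIDATES,
               key=lambda m: mean_squared_difference(observed, duration_profile(m)))


def distance_from_ideal(durations, ideal=1.5):
    """Signed gap between the estimate and an assumed ideal magnitude of 1.5."""
    return estimate_tonal_magnitude(durations) - ideal


# Hypothetical passage: total duration in seconds per pitch class, C..B.
passage = [9.0, 0.5, 3.0, 0.5, 5.0, 4.0, 0.5, 7.0, 0.5, 3.0, 0.5, 1.5]
print(estimate_tonal_magnitude(passage), distance_from_ideal(passage))
```

For a passage in another key, the duration vector would first have to be rotated so that the candidate tonic (for instance, the one proposed by the key-finding algorithm) occupies the first position.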
The current results also have implications for musical key
finding in their use of random orderings of tones. Up to now, the
literature in musical cognition has been rather pessimistic as to the
efficacy of random sequences of tones in producing strong tonal
percepts. West and Fryer (1990), for example, found that after
hearing random orderings of the diatonic set, listeners were unable
to identify the tonic of the sequence, indicative of a failure of tonal
perception. The current findings suggest that West and Fryer’s
finding of lack of tonal perception might have been due to insufficient differentiation among the scale pitches, given that all tones
in their work were played with equal durations.
In fact, one of the most striking findings of the current study,
and one having broader implications for key finding, is that
random sequences of tones could, in any fashion, result in percepts
of tonality. Within musical lore, it is axiomatic that a controlled
serial ordering of tones (e.g., tones occurring in a particular order)
is critical for both the informal enjoyment and appreciation of
music and for formal analyses of musical structure. In fact, the
central role played by serial order information underlies one of
the most fundamental distinctions between the Krumhansl–
Schmuckler key-finding algorithm (Krumhansl & Schmuckler,
1986a) and Brown and Butler’s (1981) intervallic rivalry theory
key-finding model (described earlier). One of the most important
and potentially devastating criticisms leveled at the key-finding
algorithm by the intervallic rivalry theory has concerned the
former’s failure to incorporate serial order information. In this
context, the present experiments lend strong support to the key-finding algorithm by demonstrating that appropriate serial order
information is not a prerequisite for key finding, provided that the
chromatic set is sufficiently differentiated and organized.
Looked at more generally, the current experiments could be seen
as quite damaging to the intervallic rivalry theory. According to
that approach, key finding arises because of listeners’ identification of rare versus ubiquitous intervals that simply fall out of the
interval content of the diatonic set. Because, however, the current
experiments based their sequences on the chromatic (and not the
diatonic) set, there is no rare interval information available by
which key finding might occur. Nevertheless, listeners could identify the tonal structure of these sequences; such findings are, at the
least, problematic for a rare interval account.8
In a different vein, the current manipulations of tonal magnitude
also have implications regarding the relation between tonal instan-
tiation and the expressive performance of music. According to
some researchers (e.g., Sundberg, Askenfelt, & Frydén, 1983;
Sundberg, Frydén, & Askenfelt, 1983), one means for producing
musically expressive renditions of musical scores is the systematic
application of rules that (among other things) sharpen durational
contrasts by shortening short tones and lengthen the durations of
tones terminating melodic leaps. In keeping with this hypothesis,
perceptual experiments (e.g., Thompson, Sundberg, Friberg, &
Frydén, 1989) have shown that applying these rules transforms an
automated performance into one that is heard as much more
expressive. Because the note durations of musical scores mimic the
standardized key profiles, such rules have the effect of increasing
the tonal magnitude of a piece in its performed tone durations
relative to the notated durations in the musical score.9 One implication of these results is that total tone durations taken from a
musically expressive performance might exhibit a higher tonal
magnitude than a less expressive performance of the same piece;
ongoing work is examining this possibility. Such a finding would
be especially intriguing given the current results suggesting that
increased tonal magnitude facilitates the apprehension of tonal
structure. Accordingly, one consequence of performance expression might be the more effective communication of tonal structure
to listeners.
Finally, and much more generally, the role of tonal magnitude in
the perception of tonality may reflect very basic principles of
auditory pattern perception. For example, the finding that the tonal
stability of a pitch is related to its proportional duration is closely
related to the proportion-of-the-total-duration rule (Kidd, 1995;
Kidd & Watson, 1992), which proposes that the allocation of
attention to elements of a given frequency in an auditory pattern is
a function of its total duration proportional to those of other
frequencies. Kidd and Watson’s finding, among others, has been
incorporated into Lutfi’s (1993) more general component-relative-entropy model of auditory pattern analysis, which argues that the
discriminability of an element in a pattern is a function of how
much that element’s duration (or some other parameter) contributes to the overall variance in duration among all elements. It is
interesting to note that increasing tonal magnitude has precisely
the effect of increasing the tonic’s and other hierarchically important
pitches’ contributions to the variance in duration among all
pitches.
8 In truth, this criticism is valid for any analysis of real music by the
intervallic rivalry model, given the fact that composers obviously take
advantage of the entire chromatic set in their compositions. How the
intervallic rivalry theory addresses this problem is unclear. One possibility
is that the diatonic set is in some way abstracted out of the actual musical
surface. Of course, such a supposition leads to the question of how such an
abstraction procedure occurs, with the most obvious solution being the
abstraction out of those pitches that occur most frequently. This, of course,
leads directly to the Krumhansl–Schmuckler key-finding algorithm (Krumhansl & Schmuckler, 1986a).
9 Consistent with this idea are data regarding tone deviations in musically expressive performances (see Palmer, 1997, for a review). One
finding in this work is the occurrence of decreased tempo at phrase
endings. Because phrase endings contain a high proportion of hierarchically important notes, such changes in tempo would selectively increase
durations of tonally important pitches and thus increase the tonal magnitude of the music.
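The claim that increasing tonal magnitude raises the tonic's contribution to the variance in duration can be checked with simple arithmetic. The sketch below is not Lutfi's (1993) model; it only computes each pitch's share of the summed squared deviations in duration, assuming that duration profiles are the Krumhansl and Kessler (1982) C major profile raised to the tonal magnitude and normalized. `variance_shares` is an illustrative name.

```python
# Krumhansl & Kessler (1982) C major profile, pitch classes C..B.
KK_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]


def duration_profile(magnitude, base=KK_MAJOR):
    """Proportion of total duration per pitch class (assumed power transform)."""
    powered = [v ** magnitude for v in base]
    total = sum(powered)
    return [p / total for p in powered]


def variance_shares(profile):
    """Each element's share of the total squared deviation from the mean."""
    mean = sum(profile) / len(profile)
    squared = [(p - mean) ** 2 for p in profile]
    total = sum(squared)
    return [s / total for s in squared]


# Magnitude 0 is omitted: a flat profile has no variance to apportion.
for m in (0.5, 1.0, 3.0, 4.5):
    shares = variance_shares(duration_profile(m))
    print(f"magnitude {m:.1f}: tonic share of variance = {shares[0]:.3f}")
```

Under these assumptions the tonic's share rises with tonal magnitude over the range sampled here.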
In sum, the current experiments set out to investigate how two
fundamental processes—the differentiation of elements in an array
and the organization of these elements into a recognizable hierarchy—function in the perception of musical materials. Toward this
goal, music has provided an excellent arena for investigating these
basic processes, affording insights into the operation of both processes. Such insights have the potential to cut across individual
domains of study within a modality (e.g., music and speech in
audition) and even across modalities themselves (e.g., differentiation in audition vs. vision). Further work on the topic will, it is
hoped, shed more light on such basic aspects of perceptual
organization.
References
Bharucha, J. J. (1987). Music cognition and perceptual facilitation: A
connectionist framework. Music Perception, 5, 1–30.
Bharucha, J. J., & Krumhansl, C. L. (1983). The representation of harmonic structure in music: Hierarchies of stability as a function of
context. Cognition, 13, 63–102.
Bharucha, J. J., & Stoeckig, K. (1986). Reaction time and musical expectancy: Priming of chords. Journal of Experimental Psychology: Human
Perception and Performance, 12, 403–410.
Bharucha, J. J., & Stoeckig, K. (1987). Priming of chords: Spreading
activation or overlapping frequency spectra. Perception & Psychophysics, 41, 519–524.
Bregman, A. S. (1990). Auditory scene analysis. Cambridge, MA: MIT
Press.
Bregman, A. S., & Campbell, J. (1971). Primary auditory stream segregation and perception of order in rapid sequences of tones. Journal of
Experimental Psychology, 89, 244–249.
Bregman, A. S., & Dannenbring, G. (1973). The effect of continuity on
auditory stream segregation. Perception & Psychophysics, 13, 308–312.
Brown, H. (1988). The interplay of set content and temporal context in a
functional theory of tonality perception. Music Perception, 5, 219–250.
Brown, H., & Butler, D. (1981). Diatonic trichords as minimal tonal
cue-cells. In Theory Only, 5, 37–55.
Browne, R. (1981). Tonal implications of the diatonic set. In Theory Only,
5, 3–21.
Butler, D. (1989). Describing the perception of tonality in music: A critique
of the tonal hierarchy theory and a proposal for a theory of intervallic
rivalry. Music Perception, 6, 219–242.
Butler, D. (1990). Response to Carol Krumhansl. Music Perception, 7,
325–338.
Castellano, M. A., Bharucha, J. J., & Krumhansl, C. L. (1984). Tonal
hierarchies in the music of North India. Journal of Experimental Psychology: General, 113, 394–412.
Coady, L. (1992). Perception of tonality as a function of note duration in
novel melodic sequences. Unpublished honors thesis, Queen’s University, Kingston, Ontario, Canada.
Croonen, W. L. M. (1994). Effects of length, tonal structure, and contour
in the recognition of tone series. Perception & Psychophysics, 55,
623–632.
Croonen, W. L. M. (1995). Two ways of defining tonal strength and
implications for recognition of tone series. Music Perception, 13, 109–119.
Cuddy, L. L. (1991). Melodic patterns and tonal structure: Converging
evidence. Psychomusicology, 10, 107–126.
Cuddy, L. L. (1993). Melody comprehension and tonal structure. In T. J.
Tighe & W. J. Dowling (Eds.), Psychology and music: The understanding of melody and rhythm (pp. 19–38). Hillsdale, NJ: Erlbaum.
Cuddy, L. L., & Badertscher, B. (1987). Recovery of the tonal hierarchy:
Some comparisons across age and levels of musical experience. Perception & Psychophysics, 41, 609–620.
Cuddy, L. L., Cohen, A. J., & Mewhort, D. J. K. (1981). Perception of
structure in short melodic sequences. Journal of Experimental Psychology: Human Perception and Performance, 7, 869–883.
Cuddy, L. L., Cohen, A. J., & Miller, J. (1979). Melody recognition: The
experimental application of musical rules. Canadian Journal of Psychology, 33, 148–157.
Cuddy, L. L., & Lyons, H. I. (1981). Musical pattern recognition: A
comparison of listening to and studying tonal structures and tonal
ambiguities. Psychomusicology, 1, 15–33.
Cuddy, L. L., & Smith, N. A. (2000). Perception of tonal pitch space and
tonal tension. In D. Greer (Ed.), Musicology and sister disciplines (pp.
47–59). Oxford, England: Oxford University Press.
Deutsch, D. (1982). Internal representation of information in the form of
hierarchies. Perception & Psychophysics, 31, 596–598.
Deutsch, D., & Feroe, J. (1981). The internal representation of pitch
sequences in tonal music. Psychological Review, 88, 503–522.
Dowling, W. J. (1978). Scale and contour: Two components of a theory of
memory for melodies. Psychological Review, 76, 300–307.
Dowling, W. J. (1991). Tonal strength and melody recognition after long
and short delays. Perception & Psychophysics, 50, 305–313.
Garner, W. R. (1974). The processing of information and structure. New
York: Wiley.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston:
Houghton Mifflin.
Hébert, S., Peretz, I., & Gagnon, L. (1995). Perceiving the tonal ending of
tune excerpts: The roles of pre-existing representation and musical
expertise. Canadian Journal of Experimental Psychology, 49, 193–209.
Holtzman, S. R. (1977). A program for key determination. Interface, 6,
29–56.
Huron, D., & Parncutt, R. (1993). An improved model of tonality perception incorporating pitch salience and echoic memory. Psychomusicology, 12, 154–171.
Janata, P., & Reisberg, D. (1988). Response-time measures as a means of
exploring tonal hierarchies. Music Perception, 6, 161–172.
Jones, M. R. (1978). Auditory patterns: Studies in the perception of
structure. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of
perception, Vol. 8: Perceptual coding (pp. 255–288). New York: Academic Press.
Jones, M. R. (1981). A tutorial on some issues and methods in serial pattern
research. Perception & Psychophysics, 30, 492–504.
Justus, T. C., & Bharucha, J. J. (2002). Music perception and cognition. In
H. Pashler (Series Ed.) & S. Yantis (Vol. Ed.), Stevens’ handbook of
experimental psychology, Vol. 1: Sensation and perception (3rd ed., pp.
453–492). New York: Wiley.
Kessler, E. J., Hansen, C., & Shepard, R. N. (1984). Tonal schemata in the
perception of music in Bali and in the West. Music Perception, 2,
131–165.
Kidd, G. R. (1995). Proportional duration and proportional variance as
factors in auditory pattern discrimination. Journal of the Acoustical
Society of America, 97, 1335–1338.
Kidd, G. R., & Watson, C. S. (1992). The “proportion-of-the-total-duration
rule” for the discrimination of auditory patterns. Journal of the Acoustical Society of America, 92, 3109–3118.
Koffka, K. (1935). Principles of Gestalt psychology. New York: Harcourt
Brace.
Krumhansl, C. L. (1979). The psychological representation of musical
pitch in a tonal context. Cognitive Psychology, 11, 346–374.
Krumhansl, C. L. (1990a). Cognitive foundations of musical pitch. New
York: Oxford University Press.
Krumhansl, C. L. (1990b). Tonal hierarchies and rare intervals in music
cognition. Music Perception, 7, 309–324.
Krumhansl, C. L. (1991). Music psychology: Tonal structures in perception
and memory. Annual Review of Psychology, 42, 277–303.
Krumhansl, C. L. (2000). Tonality induction: A statistical approach applied
cross-culturally. Music Perception, 17, 461–479.
Krumhansl, C. L., Bharucha, J. J., & Castellano, M. A. (1982). Key
distance effects on perceived harmonic structure in music. Perception &
Psychophysics, 32, 96–108.
Krumhansl, C. L., Bharucha, J. J., & Kessler, E. J. (1982). Perceived
harmonic structure of chords in three related musical keys. Journal of
Experimental Psychology: Human Perception and Performance, 8, 24–36.
Krumhansl, C. L., & Kessler, E. J. (1982). Tracing the dynamic changes in
perceived tonal organization in a spatial representation of musical keys.
Psychological Review, 89, 334 –368.
Krumhansl, C. L., Sandell, G. J., & Sargeant, D. C. (1987). The perception
of tone hierarchies and mirror forms in twelve-tone serial music. Music
Perception, 5, 31–78.
Krumhansl, C. L., & Schmuckler, M. A. (1986a, August). Key-finding in
music: An algorithm based on pattern matching to tonal hierarchies.
Paper presented at the 19th Annual Meeting of the Society for Mathematical Psychology, Cambridge, MA.
Krumhansl, C. L., & Schmuckler, M. A. (1986b). The Petroushka chord:
A perceptual investigation. Music Perception, 4, 153–184.
Krumhansl, C. L., & Shepard, R. N. (1979). Quantification of the hierarchy
of tonal functions within a diatonic context. Journal of Experimental
Psychology: Human Perception and Performance, 5, 579–594.
Kubovy, M., & Pomerantz, J. R. (Eds.). (1981). Perceptual organization.
Hillsdale, NJ: Erlbaum.
Lantz, M. E. (2002). The role of duration and frequency of occurrence in
perceived pitch structure. Unpublished doctoral dissertation, Queen’s
University, Kingston, Ontario, Canada.
Lantz, M. E., & Cuddy, L. L. (1996, August). The effects of surface cues
in the perception of pitch structure: Frequency of occurrence and
duration. Paper presented at Fourth International Conference on Music
Perception and Cognition, Montreal, Quebec, Canada.
Lantz, M. E., & Cuddy, L. L. (1998). Total and relative duration as cues to
surface structure in music. Canadian Acoustics, 26(3), 56–57.
Leman, M. (1995). A model of retroactive tone-center perception. Music
Perception, 12, 430–471.
Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music.
Cambridge, MA: MIT Press.
Lockhead, G. R., & Pomerantz, J. R. (1991). The perception of structure.
Washington, DC: American Psychological Association.
Longuet-Higgins, H. C., & Steedman, M. J. (1971). On interpreting Bach.
Machine Intelligence, 6, 221–241.
Lutfi, R. A. (1993). A model of auditory pattern analysis based on
component-relative-entropy. Journal of the Acoustical Society of America, 94, 748–758.
Martin, J. (1972). Rhythmic (hierarchical) versus serial structure in speech
and other behavior. Psychological Review, 79, 487–509.
McAdams, S., & Bregman, A. S. (1979). Hearing musical streams. Computer Music Journal, 3(4), 26–43.
Oram, N., & Cuddy, L. L. (1995). Responsiveness of Western adults to
pitch-distributional information in melodic sequences. Psychological
Research/Psychologische Forschung, 57, 103–118.
Palmer, C. (1997). Music performance. Annual Review of Psychology, 48,
115–138.
Patterson, R. D., Peters, R. W., & Milroy, R. (1983). Threshold duration for
melodic pitch. In W. Klinke & W. M. Hartmann (Eds.), Hearing:
Physiological bases and psychophysics (pp. 321–325). Berlin, Germany:
Springer-Verlag.
Povel, D. (1981). Internal representations of simple temporal patterns.
Journal of Experimental Psychology: Human Perception and Performance, 7, 3–18.
Povel, D., & Essens, P. (1985). The perception of temporal patterns. Music
Perception, 2, 411–440.
Rasch, R., & Plomp, R. (1999). The perception of musical tones. In D.
Deutsch (Ed.), The psychology of music (2nd ed., pp. 89–112). San
Diego, CA: Academic Press.
Restle, F. (1970). Theory of serial pattern learning: Structural trees. Psychological Review, 77, 481–495.
Robinson, K., & Patterson, R. D. (1995). The duration required to identify
the instrument, the octave, or the pitch chroma of a musical note. Music
Perception, 13, 1–15.
Rosch, E. (1975). Cognitive reference points. Cognitive Psychology, 7,
532–547.
Schmuckler, M. A. (1989). Expectation in music: Investigation of melodic
and harmonic processes. Music Perception, 7, 109–150.
Schmuckler, M. A. (1997). Expectancy effects in memory for melodies.
Canadian Journal of Experimental Psychology, 51, 292–305.
Schmuckler, M. A. (1999). Testing models of melodic contour similarity.
Music Perception, 16, 295–326.
Schmuckler, M. A., & Boltz, M. G. (1994). Harmonic and rhythmic
influences on musical expectancy. Perception & Psychophysics, 56,
313–325.
Schmuckler, M. A., & Gilden, D. L. (1993). Auditory perception of fractal
contours. Journal of Experimental Psychology: Human Perception and
Performance, 19, 641–660.
Schmuckler, M. A., & Tomovski, R. (1997, November). Perceptual tests of
musical key-finding. Paper presented at the 38th Annual Meeting of the
Psychonomic Society, Dallas, TX.
Schmuckler, M. A., & Tomovski, R. (2000, November). Tonal hierarchies
and intervallic rivalries in musical key-finding. Paper presented at the
Society for Music Perception and Cognition, Toronto, Ontario, Canada.
Shmulevich, I., & Yli-Harja, O. (2000). Localized key-finding: Algorithms
and applications. Music Perception, 17, 531–544.
Simon, H. A., & Kotovsky, K. (1963). Human acquisition of concepts for
sequential patterns. Psychological Review, 70, 534–546.
Simon, H. A., & Sumner, R. K. (1968). Pattern in music. In B. Kleinmuntz
(Ed.), Formal representation of human judgement (pp. 219–250). New
York: Wiley.
Smith, N. A., & Cuddy, L. L. (2003). Perceptions of musical dimensions in
Beethoven’s Waldstein sonata: An application of tonal pitch space
theory. Musicae Scientiae, 7, 7–34.
Stevens, S. S., & Volkmann, J. (1940). The relation of pitch to frequency:
A revised scale. American Journal of Psychology, 53, 329–353.
Stevens, S. S., Volkmann, J., & Newman, E. B. (1937). A scale for the
measurement of the psychological magnitude of pitch. Journal of the
Acoustical Society of America, 8, 185–190.
Sundberg, J., Askenfelt, A., & Frydén, L. (1983). Musical performance: A
synthesis-by-rule approach. Computer Music Journal, 7(1), 37–43.
Sundberg, J., Frydén, L., & Askenfelt, A. (1983). What tells you the player
is musical? An analysis-by-synthesis study of musical performance. In J.
Sundberg (Ed.), Studies of musical performance (pp. 61–75). Stockholm: Royal Swedish Academy of Music.
Takeuchi, A. H. (1994). Maximum key-profile correlation (MKC) as a
measure of tonal structure in music. Perception & Psychophysics, 56,
335–346.
Thompson, W. F., Sundberg, J., Friberg, A., & Frydén, L. (1989). The use
of rules for expression in the performance of melodies. Psychology of
Music, 17, 63–82.
Van Egmond, R., & Butler, D. (1997). Diatonic connotations of pitch-class
sets. Music Perception, 15, 1–29.
Van Noorden, L. P. A. (1975). Temporal coherence in the perception of
tone sequences. Unpublished doctoral dissertation, Technische Hogeschool Eindhoven, Eindhoven, the Netherlands.
Vitz, P. C., & Todd, T. C. (1969). A coded element model of the
perceptual processing of sequential stimuli. Psychological Review,
76, 433–449.
Vos, P. G., & Van Geenen, E. W. (1996). A parallel-processing key-finding model. Music Perception, 14, 185–224.
Ward, W. D. (1954). Subjective musical pitch. Journal of the Acoustical
Society of America, 26, 369–380.
West, R. J., & Fryer, R. (1990). Ratings of the suitability of probe tones as tonics
after random orderings of the diatonic scale. Music Perception, 7, 253–258.
Winograd, T. (1968). Linguistics and the computer analysis of tonal
harmony. Journal of Music Theory, 12, 2–49.
Appendix A
Tone Durations (in Milliseconds) for Each Pitch at Each Tonal Magnitude in the
Hierarchical Condition of Experiment 1
                                  Tonal magnitude
Pitch           0    0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0    4.5
C             417    571    760    979  1,224  1,486  1,757  2,027  2,291  2,542
C#/D♭         417    339    267    204    151    109     76     52     35     23
D             417    423    416    397    368    330    289    247    207    170
D#/E♭         417    346    279    218    165    121     87     61     42     28
E             417    474    524    561    582    587    577    553    519    478
F             417    458    489    506    508    495    469    435    394    351
F#/G♭         417    360    302    245    193    147    110     80     57     40
G             417    516    621    724    818    898    959  1,001  1,022  1,026
G#/A♭         417    350    286    226    173    129     94     66     46     31
A             417    434    438    429    407    375    336    295    253    213
A#/B♭         417    343    274    212    159    116     82     57     39     26
B             417    385    345    299    252    206    164    127     97     72
Appendix B
Frequencies of Occurrence and Individual Tone Durations (in Milliseconds) for Each
Pitch at Each Tonal Magnitude in Experiment 2
                    Tonal magnitude
Pitch         0.5    1.0    1.5    2.0    2.5

Frequency of occurrence
C              11     15     20     24     30
C#/D♭           7      5      4      3      2
D               8      8      8      7      7
D#/E♭           7      6      4      3      2
E               9     10     11     12     12
F               9     10     10     10     10
F#/G♭           7      6      5      4      3
G              10     12     14     16     18
G#/A♭           7      6      5      3      3
A               9      9      9      8      7
A#/B♭           7      5      4      3      2
B               8      7      6      5      4
Total          99     99    100     98    100

Duration (controlled)
C              76     56     42     35     28
C#/D♭         119    167    208    278    417
D             104    104    104    119    119
D#/E♭         119    139    208    278    417
E              93     83     76     69     69
F              93     83     83     83     83
F#/G♭         119    139    167    208    278
G              83     69     60     52     46
G#/A♭         119    139    167    278    278
A              93     93     93    104    119
A#/B♭         119    167    208    278    417
B             104    119    139    167    208

Duration (uncontrolled)
C–B           100    100    100    100    100
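The Appendix B values can be cross-checked with a short script. The sketch below is an illustration added here, not part of the original article. It assumes that "Duration (controlled)" means each pitch's total sounding time (frequency of occurrence multiplied by individual tone duration) was held roughly equal across pitches. That reading is inferred from the tabled numbers, not stated in the appendix. The variable and function names are hypothetical.

```python
# Illustrative check of the Appendix B values (Experiment 2).
# Assumption (inferred from the table, not stated in the appendix): in the
# controlled-duration condition, frequency x tone duration is approximately
# constant for every pitch at every tonal magnitude.

MAGNITUDES = [0.5, 1.0, 1.5, 2.0, 2.5]
PITCHES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Rows follow PITCHES; columns follow MAGNITUDES.
FREQUENCY = [
    [11, 15, 20, 24, 30],
    [7, 5, 4, 3, 2],
    [8, 8, 8, 7, 7],
    [7, 6, 4, 3, 2],
    [9, 10, 11, 12, 12],
    [9, 10, 10, 10, 10],
    [7, 6, 5, 4, 3],
    [10, 12, 14, 16, 18],
    [7, 6, 5, 3, 3],
    [9, 9, 9, 8, 7],
    [7, 5, 4, 3, 2],
    [8, 7, 6, 5, 4],
]

CONTROLLED_MS = [
    [76, 56, 42, 35, 28],
    [119, 167, 208, 278, 417],
    [104, 104, 104, 119, 119],
    [119, 139, 208, 278, 417],
    [93, 83, 76, 69, 69],
    [93, 83, 83, 83, 83],
    [119, 139, 167, 208, 278],
    [83, 69, 60, 52, 46],
    [119, 139, 167, 278, 278],
    [93, 93, 93, 104, 119],
    [119, 167, 208, 278, 417],
    [104, 119, 139, 167, 208],
]


def total_sounding_time(frequency, duration_ms):
    """Return frequency x duration (ms) for every pitch and magnitude."""
    return [
        [f * d for f, d in zip(freq_row, dur_row)]
        for freq_row, dur_row in zip(frequency, duration_ms)
    ]


TOTALS = total_sounding_time(FREQUENCY, CONTROLLED_MS)
SMALLEST = min(min(row) for row in TOTALS)
LARGEST = max(max(row) for row in TOTALS)

# Column sums reproduce the "Total" row of the frequency table.
FREQUENCY_TOTALS = [sum(col) for col in zip(*FREQUENCY)]

if __name__ == "__main__":
    for pitch, row in zip(PITCHES, TOTALS):
        print(f"{pitch:<3}", row)
    print("range of totals (ms):", SMALLEST, "to", LARGEST)
    print("frequency totals:", FREQUENCY_TOTALS)
```

Under this reading, every product falls within a few milliseconds of one twelfth of 10,000 ms, which is consistent with rounding of the tabled durations to whole milliseconds.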
Appendix C
Tone Durations (in Milliseconds) for the Tonic and Nontonic Pitch at Each Tonal
Magnitude for the Binary Melodies in Experiment 3
                                  Tonal magnitude
Pitch           0    0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0    4.5
Tonic         417    571    760    979  1,224  1,489  1,757  2,027  2,291  2,542
Nontonic      417    403    386    366    343    319    295    270    246    223
Received July 9, 2002
Revision received March 18, 2003
Accepted August 14, 2003