Vocal tract resonances and the sound of the Australian didjeridu (yidaki) I. Experiment

2006, The Journal of the Acoustical Society of America

The didjeridu, or yidaki, is a simple tube about 1.5m long, played with the lips, as in a tuba, but mostly producing just a tonal, rhythmic drone sound. The acoustic impedance spectra of performers’ vocal tracts were measured while they played and compared with the radiated sound spectra. When the tongue is close to the hard palate, the vocal tract impedance has several maxima in the range 1–3kHz. These maxima, if sufficiently large, produce minima in the spectral envelope of the sound because the corresponding frequency components of acoustic current in the flow entering the instrument are small. In the ranges between the impedance maxima, the lower impedance of the tract allows relatively large acoustic current components that correspond to strong formants in the radiated sound. Broad, weak formants can also be observed when groups of even or odd harmonics coincide with bore resonances. Schlieren photographs of the jet entering the instrument and high speed video images of the pla...

Vocal tract resonances and the sound of the Australian didjeridu (yidaki) I. Experiment^a)

Alex Z. Tarnopolsky
School of Physics, University of New South Wales, Sydney NSW 2052, Australia

Neville H. Fletcher
School of Physics, University of New South Wales, Sydney NSW 2052, Australia and Research School of Physical Sciences and Engineering, Australian National University, Canberra 0200, Australia

Lloyd C. L. Hollenberg
School of Physics, University of Melbourne, Melbourne, Vic 3010, Australia

Benjamin D. Lange, John Smith, and Joe Wolfe^b)
School of Physics, University of New South Wales, Sydney NSW 2052, Australia

(Received 8 August 2005; accepted 8 November 2005)

The didjeridu, or yidaki, is a simple tube about 1.5 m long, played with the lips, as in a tuba, but mostly producing just a tonal, rhythmic drone sound. The acoustic impedance spectra of performers’ vocal tracts were measured while they played and compared with the radiated sound spectra. When the tongue is close to the hard palate, the vocal tract impedance has several maxima in the range 1–3 kHz. These maxima, if sufficiently large, produce minima in the spectral envelope of the sound because the corresponding frequency components of acoustic current in the flow entering the instrument are small. In the ranges between the impedance maxima, the lower impedance of the tract allows relatively large acoustic current components that correspond to strong formants in the radiated sound. Broad, weak formants can also be observed when groups of even or odd harmonics coincide with bore resonances. Schlieren photographs of the jet entering the instrument and high speed video images of the player’s lips show that the lips are closed for about half of each cycle, thus generating high levels of upper harmonics of the lip frequency. Examples of the spectra of “circular breathing” and combined playing and vocalization are shown. © 2006 Acoustical Society of America.
[DOI: 10.1121/1.2146089]
PACS number(s): 43.75.Fg, 43.75.Yy, 43.72.Ct [DD]
J. Acoust. Soc. Am. 119 (2), February 2006, Pages: 1194–1204

a) A Brief Communication reporting related studies has been published in Nature (Tarnopolsky et al., 2005).
b) Author to whom correspondence should be addressed. Electronic mail: j.wolfe@unsw.edu.au

I. INTRODUCTION

The word “didjeridu” (or “didgeridoo” in the popular literature) is an onomatopoeic Western name for a traditional instrument played in parts of Northern Australia and known to the Yolngu people of Arnhem Land as the yidaki. The Yolngu name “yidaki” will be used throughout this paper. It is unusual among wind instruments in that the pitch is only rarely varied: the interest in performance lies in spectacular, rhythmic variations in timbre, which are produced by the player’s vocal tract. It is played using “circular breathing” to produce an uninterrupted sound: the player, traditionally a man, fills his cheeks and uses this reservoir to continue to play while simultaneously inhaling quickly through the nose (and bypassing the mouth at the soft palate) to refill the lungs with air. The differences between timbres produced by playing using the mouth cavity alone, while inhaling, and those produced using the complete tract, while exhaling, are usually unavoidable and are incorporated into the rhythmic variation in timbre that is idiomatic for the instrument and that gives it its Western name. Different tongue positions have a strong effect on the sound spectrum. Sound files illustrating these effects are given at www.phys.unsw.edu.au/~jw/yidakididjeridu.html.

The yidaki is a member of the lip valve family. In this family, the playing frequency is usually close to that of one of the maxima in the impedance spectrum of the bore.
The effect of variations in the player’s vocal tract upon orchestral lip valve instruments is usually modest, because of their narrow bore and the shape of the mouthpiece, whereas in the yidaki it is the preeminent musical feature. The yidaki is therefore an ideal instrument in which to study the interaction among vocal tract, lips, and instrument.

Previous studies of the acoustics of the yidaki have considered the lip motion (Wiggins, 1988), the lip-bore interaction (Fletcher, 1983, 1996), numerical modeling of the lip motion (Hollenberg, 2000), the linear acoustics of the instrument (Amir and Alon, 2001; Amir, 2004), and the acoustics of the vocal tract of players miming playing (Fletcher et al., 2001). A very brief report covering work related to that reported here has been given previously (Tarnopolsky et al., 2005). That report used an instrument made in the traditional manner, and therefore of unknown geometry.

In what is now a standard model, Backus (1985) proposed that the acoustic impedance of the bore of a wind instrument $Z_{\mathrm{inst}}$ and that of the tract $Z_{\mathrm{tract}}$ act in series on the valve and on the air flow through it. An understanding of the interaction between the instrument and the vocal tract thus requires detailed knowledge of the acoustic impedance of the instrument, the acoustic impedance of the vocal tract, the vibratory behavior of the lip valve, the air jet entering the instrument, and their interaction. It is relatively easy to determine the impedance of the instrument, particularly if only the maxima are important. However, the impedance of the vocal tract is much harder to measure, particularly during playing. The vibrating lips generate a sound signal that is transmitted both into the instrument and into the mouth. Consequently, the sound level inside the mouth of the player is very high.
This makes it difficult to make measurements of the acoustic impedance of the vocal tract of someone playing the yidaki. In the past, we have made measurements of that impedance while players mimed playing (Fletcher et al., 2001). However, producing a given mouth configuration in the absence of audible feedback is difficult, and it is not clear that yidaki players are capable of miming reliably, perhaps particularly with regard to the aperture of the glottis, of which most people are not conscious.

In this study, we report the development of a system that allows the measurement of the acoustic impedance of the vocal tract, just inside the player’s lips, while he is playing the yidaki. We compare this with the spectrum of the sound produced. We also report the motion of the player’s lips, using high-speed photography. These sets of observations are used to test a simple model that explains how the acoustic impedance of the tract affects the spectral envelope of the sound produced. We also analyze other features of idiomatic playing: “circular breathing” and vocalization.

A. The yidaki (didjeridu)

Traditionally, the material for a yidaki is selected by tapping suitably sized tree trunks to find one whose interior has been eaten by termites to provide a suitable central bore. It is cut to a desired length, the bore is cleaned and sometimes shaped further, and a ring of beeswax is fitted to the smaller end to make a comfortable seal for the player’s lips. The outside is sometimes painted with traditional designs of cultural significance. The instrument is typically 1.2 to 1.5 m long (different cultural groups have different styles) and has an irregular bore, which is usually somewhat flared from about 30 to 50 mm at the blowing end to about 40 to 150 mm at the open end. Sealed at the lip end and open at the other, its lowest resonance is typically 50 to 80 Hz. Orchestral wind instruments usually have several bore resonances whose frequencies fall in harmonic ratios.
Because of its shape, this is not usually the case for the yidaki: it is usually neither a cylinder nor a nearly complete cone and its resonances form a “stretched” quasi-harmonic series (Fletcher, 1996). Consequently, harmonics of the note being played only sometimes coincide with a bore resonance of the instrument. Higher or “overblown” notes near the frequencies of the second or third bore resonances may be sounded briefly for contrast. This is not usual in traditional playing in Western Arnhem Land, though it is used in the East.

Because the bore is typically 30 to 50 mm in diameter at the smaller end, its characteristic impedance is lower by an order of magnitude than that of most of the members of the lip-driven musical instrument family such as the horn or trumpet. This and the rather rough walls of the bore imply that the magnitudes of the maxima in the yidaki’s impedance spectrum are rather lower than those of other members of the wind instrument family. Some consequences have been discussed in previous papers (Fletcher, 1996; Amir and Alon, 2001; Fletcher et al., 2001; Caussé et al., 2004).

Because the shape of the bore is largely determined by termites and the shape of the tree trunk, the variation among these instruments is great. The purpose of the current study is to investigate the principles of operation, rather than the effects of different instrumental geometries (which is the subject of another study). For that reason, and to facilitate reproduction of the results reported here, two model instruments were used. For acoustical measurements, we used a cylindrical PVC pipe. For optical measurements, we used a pipe made of plexiglass with a square cross section. Experienced players reported that both model instruments played moderately well.
Indeed, instruments with a constant cross section, usually made of PVC pipe, are occasionally used in nontraditional musical contexts, particularly when a given pitch is required in order to play with other instruments.

B. Vocal tract-instrument interaction

There are a number of reports on the effects of the vocal tract on the sound of orchestral wind instruments (Elliot and Bowsher, 1982; Clinch et al., 1982; Wolfe et al., 2003), but the effects in such instruments are modest in comparison with those in the yidaki. These orchestral instruments have a narrow constriction in the mouthpiece and a smooth bore, which is typically only several mm in the mouthpiece. These features give the instruments an impedance spectrum with a series of maxima whose values exceed considerably those of the vocal tract. Consequently, there is only modest coupling between the two resonators (the vocal tract and the bore of the instrument). While the effect of the tract on timbre of orchestral instruments is large enough to interest composers (e.g., Berio, 1966; Erikson, 1969), it is small compared to the striking effects of the vocal tract on the timbre of the yidaki.

In a previous paper (Fletcher et al., 2001) we reported sound spectra, vocal tract configurations, and the impedance spectra of players miming the playing of the yidaki. However, these were not measured simultaneously during playing, so we were then unable to make quantitative comparisons among them.

C. Measurement of the vocal tract impedance during performance

The magnitude of the vocal tract effect in the yidaki makes it an ideal instrument upon which to study vocal tract effects in general. For this purpose, we have adapted an impedance spectrometer described previously (Epps et al., 1997) to allow us to make impedance measurements using an impedance probe placed just inside the player’s lips, while he is playing. This situation requires several practical compromises.
The sound due to the playing has comparable levels in the mouth and in the instrument. As this is “noise” for the purposes of measurement of the impedance spectrum, the signal-to-noise ratio is low. There is the further complication of a humid environment, which means that water-resistant or disposable microphones must be used. On the other hand, this study is concerned with relating the spectral envelope of the sound produced to the overall features of the impedance spectrum. Consequently, high-precision calibrated microphones are not required.

II. MATERIALS AND METHODS

A. Yidakis

For acoustic measurements, a “model yidaki” was made of cylindrical PVC pipe, length 1210 mm and inner diameter 30 mm. (It is referred to as “the pipe” or “the instrument” below.) For the optical measurements of the lip motion and for the flow visualization, a pipe with square cross section was made of plexiglass with glass panels for the optical pathway. It is 1220 mm long and the internal width is 38 mm.

B. Players

One of the players, BL, traditional name Wilamara, is a member of the Mara people of Roper River in Northern Australia, where he learned to play yidaki in the traditional style. LH is an Australian of European cultural background who has been playing the yidaki for 8 years. AT is an Australian of European cultural background who learned to play yidaki for the purposes of this study. LH’s usual playing style has the instrument displaced laterally from the center of the lips. Neither he nor AT had trouble adapting to the presence of the impedance probe behind the lips. BL, who has played for the longest time and whose lip-instrument position is symmetrical, found the impedance probe disruptive, particularly for the high tongue position.

Players were asked to produce three different mouth configurations for recordings.
One is called a high tongue drone (hereafter “high tongue”): the player holds the tongue close to the hard palate so that there is a constriction in the air passage between the throat and the lips. This produces a strong formant between about 1.5 and 2 kHz, whose frequency and amplitude may be varied by the performer. This sound is very common in yidaki performance. In another configuration, hereafter called “low tongue,” the players were asked to play with the tongue low in the mouth and thus no lingual constriction. This configuration produces a sound without a strong formant and is used as a contrast to the high-tongue drone. In the third configuration, players inflated their cheeks and then expelled the air, while inhaling, as described above under “circular breathing.” In a different series, they were asked to vocalize at harmonic intervals above the note they were playing on the yidaki.

To measure the static mouth pressure during playing, they were also asked to play with a range of loudness levels while a small tube connected the mouth cavity to a water manometer.

C. Measurements of impedance spectra of the instrument

An impedance spectrometer described previously (Smith et al., 1997; Epps et al., 1997) was adapted for this study. Briefly, a waveform is synthesized from harmonic components, amplified, and input via a loudspeaker and impedance matching horn to a narrow high-impedance tube leading to the item under test. This approximates an ideal source of acoustic current. It is calibrated by connection to a reference impedance, which is an acoustically quasi-infinite cylindrical pipe whose impedance is assumed to be real, frequency independent, and equal to its calculated characteristic impedance. From the coefficients of the spectrum of the measured sound in this calibration stage, a new signal is synthesized to produce a measured spectrum with frequency components of equal amplitude.
This is used as the acoustic current source for subsequent measurements and the unknown impedance spectrum is calculated from the pressure components measured in measurement and calibration stages, taking into account the small, parallel admittance of the source.

The bore diameter of the yidaki is larger than that of the instruments we have studied previously (Wolfe et al., 2001) and consequently a lower impedance reference was required. The acoustically infinite cylindrical pipe used for calibration in this study had an internal diameter of 26.2 mm and a length of 194 m. Because the first curve in the pipe occurs at 40 m from the spectrometer and because any curves have a radius of 5 m or greater, the effects of reflections from these curves are expected to be negligible and this reference impedance should be purely resistive.

D. Measurements of impedance spectra in the vocal tract

The microphone recording the pressure inside the tract is exposed to high humidity and high steady pressure. For this reason, we used inexpensive electret microphones (Optimus 33-3013), which were replaced when necessary. It was necessary to attenuate the acoustic signal to avoid clipping or harmonic distortion in these microphones. To do this, we used the acoustic divider circuit shown in Fig. 1. At low frequencies, where attenuation is most important, the impedances of both pipes in the divider (including the radiation impedance associated with the open end of the tube) are essentially inertive, and the phase change along the length is small. At higher frequencies, there is a phase shift and a frequency-dependent gain. The microphone therefore records only a fraction of the pressure inside the mouth. This attenuation is inside the calibration loop for the impedance probe, so its frequency and phase response do not affect measurements. The source capillary has an inner diameter of 3.7 mm and the microphone tube an inner diameter of 1.5 mm. Both have a length of 35 mm.
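The two-stage calibration described in Sec. II C can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' software: it assumes a lumped model in which the source delivers the same acoustic current in both stages and has a parallel admittance `y_source`; the air density and speed of sound are assumed round values; and all function and variable names are hypothetical. Only the 26.2 mm reference diameter is taken from the text.

```python
import math

RHO_AIR = 1.2    # kg/m^3, assumed value for air density
C_SOUND = 343.0  # m/s, assumed value for the speed of sound


def characteristic_impedance(diameter_m, rho=RHO_AIR, c=C_SOUND):
    """Plane-wave characteristic impedance rho*c/S of a cylindrical pipe (Pa s m^-3)."""
    area = math.pi * (diameter_m / 2.0) ** 2
    return rho * c / area


def unknown_impedance(p_meas, p_cal, z_ref, y_source=0.0):
    """Recover a load impedance from pressures measured with the same source.

    Assumed model: the source current U divides between the load and the
    parallel source admittance, so p = U / (1/Z + Y_s) in each stage.
    Dividing the calibration pressure by the measurement pressure removes U.
    """
    y_total = (1.0 / z_ref + y_source) * (p_cal / p_meas)
    return 1.0 / (y_total - y_source)


# Reference impedance: the 26.2 mm calibration pipe, taken as purely resistive.
Z_REF = characteristic_impedance(0.0262)

if __name__ == "__main__":
    # Synthetic round trip with made-up values for one frequency component.
    u, y_s, z_true = 1e-6, 1e-8, complex(3e6, 1e6)
    p_cal = u / (1.0 / Z_REF + y_s)
    p_meas = u / (1.0 / z_true + y_s)
    print("reference impedance (Pa s m^-3):", Z_REF)
    print("recovered impedance:", unknown_impedance(p_meas, p_cal, Z_REF, y_s))
```

With these assumed constants the reference impedance comes out at roughly 0.76 MPa s m⁻³. In practice the function would be applied to each complex frequency component of the measured spectra.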
Measurements were made at a frequency spacing of 5.383 Hz from 0.2 to 3.0 kHz.

FIG. 1. The technique used to measure the impedance of the vocal tract during performance. The sketch shows the impedance probe inserted into a corner of the mouth. The schematic (top view, only approximately to scale) shows the geometry of the impedance probe and its location in the player’s mouth. The microphone capillary is hidden in the figure at the left.

FIG. 2. The geometry of the square yidaki, seen from above, as configured for high speed photography.

While the acoustic pressure acts upon a significant area of the lips and this determines their vibratory motion, the pressure of concern in the production of formants in the sound is that acting over the opening area of the lips. Measurements of acoustic impedance need to take this geometrical mismatch into consideration, since essentially there is an inertive correction, which may be either positive or negative, involved between this impedance and the plane-wave impedance normally measured in a pipe (Brass and Locke, 1997; Fletcher et al., 2005). The probe used for measurement of the vocal tract impedance, shown in Fig. 1, has a narrow outlet for the acoustic current. The impedance spectrometer is calibrated on a quasi-infinite tube of diameter 26.2 mm, which is comparable with the size of the vocal tract. Consequently, errors due to this effect are small.

These measurements required the impedance probe shown in Fig. 1 (8 mm wide and 5 mm high) to be placed in the mouth during playing. Impedance measurements start after the player gives a signal that he is happy with the tongue position and the sound produced. The player then continues to play on one breath, typically for about 10 s. During this time an impedance measurement is made.
A sample of the sound immediately following the impedance measurement is used to obtain samples of the radiated sound uncontaminated by that of the injected measurement signal.

E. Measurements of sound spectra

The output sound was recorded on digital audio tape at 44.1 kHz using an omnidirectional electret microphone, placed on the axis of the yidaki, at a distance of 125 mm from its end.

F. Measurements of the lip motion

The transparent yidaki with square cross section (mentioned above) had two mouthpiece configurations. For measurements of the lip motion, a round hole was cut in one side for the player’s lips. To improve the image quality, sections of plexiglass on the side and end were replaced with two glass panels, 100 mm long and equal in width to the yidaki, inserted in the optical path at the mouth end, as shown in Fig. 2. A mirror, mounted vertically on the end at 45°, allowed the single camera to record the plane and lateral image of the player’s lips simultaneously. Before each recording, the glass panels were heated with warm dry air to prevent water condensation. A video camera running at 1000 frames per second was used to record the images.

G. Flow visualization of the jet motion

Images of the air jet inside the yidaki during playing were achieved using schlieren imaging, a nondestructive optical flow visualization technique, which is described elsewhere (Tarnopolsky and Fletcher, 2004). This experiment also used the plexiglass yidaki. The schlieren technique depends upon refraction of light rays as they pass through regions with varying refractive index, usually provided by inhomogeneities in density. A curtain of higher density gas was produced by releasing carbon dioxide from a manifold on the outer side into the yidaki through a line of 11 holes of 1.5-mm diameter linking the manifold to the yidaki (see Fig. 3). During playing, the air jet passes through and is contaminated by the curtain of carbon dioxide and thus generates the necessary density gradient. A light source that produced a single pulse of duration of about 0.2 ms was triggered electronically at a selected phase of the lips’ opening (Tarnopolsky et al., 2000). Stroboscopy of a steady, sustained playing gesture was used to obtain images covering one period of the lip’s oscillation in time steps of 1 ms. The period of the lip oscillation was 14.3 ms for these experiments.

FIG. 3. The geometry of the square yidaki, as configured for flow visualization.

FIG. 4. A figure to illustrate the vocal tract impedance measurement while the subject is playing in the high tongue configuration. (a) shows the spectrum of the sound pressure level measured in the mouth due to both lip vibration and the injected acoustic current. (b) shows the radiated sound spectrum measured simultaneously with (a). (c) shows the radiated sound measured just after the impedance measurement, i.e., without the injected acoustic current. (d) shows the impedance of the vocal tract that was derived from (a). The sound pressure levels shown in (a), (b), and (c) are normalized relative to their largest frequency component. The frequencies of the harmonics of the sound and of the resonances of the pipe are also shown with vertical dashes. The odd numbered harmonics are represented by longer dashes.

III. RESULTS AND DISCUSSION

A. Impedance of the player’s vocal tract

Figure 4 shows how measurements of the acoustic impedance of the player’s tract were made during performance and then processed. Figure 4(a) shows the spectrum of the signal recorded inside the mouth during a typical example of the high tongue configuration. The periodic vibration of the lips at about 70 Hz generates an acoustic signal that interacts with the impedance of the tract to produce a series of harmonics, which are seen at frequencies below about 1.5 kHz. Inside the mouth, the higher resonances of the instrument (some of which lie close to odd harmonics of the lip motion) have only modest influence on the sound in the mouth. Hence, in Fig. 4(a), there is no systematic difference between odd and even harmonics. At frequencies above 1 kHz (the range of interest), the spectrum is increasingly dominated by the response of the vocal tract to the injected acoustic current. Here we see broad peaks or formants at about 1.5, 2.1, and 2.8 kHz. Because the injected current has been calibrated to have flow components with magnitude independent of frequency, these peaks correspond to maxima in the acoustic impedance spectrum of the vocal tract in this configuration.

Figure 4(b) shows the spectrum of the externally radiated sound produced by the yidaki, measured 125 mm from the end of the instrument. This shows the strong harmonics of the instrument’s sound. Odd harmonics dominate, because of the impedance matching effects of the transfer function of the closed, cylindrical pipe. The figure shows the frequencies of the harmonics of the sound and those of the impedance maxima in the cylindrical pipe (titled “resonances”), which are approximately at $n f_1$, where $n$ is an odd integer and $f_1$ is the frequency of the lowest resonance. (Because of frequency-dependent end effects, these frequencies are not exactly harmonic.) The effects of resonances on harmonics are discussed in more detail later. This spectrum is included to allow comparison of the harmonics measured simultaneously inside and outside the mouth. Above about 1.5 kHz, the spectrum has an increased broadband component. This is the (filtered) sound of the injected acoustic current. Some of the sound injected into the mouth is radiated through the opening lips and the yidaki. Some also leaks through the cladding of the current source directly into the radiation field.
Because of this unavoidable contamination of the radiated yidaki sound by the acoustic current used to measure the impedance, all sound spectra shown in subsequent figures were measured immediately following the impedance measurement, during the same, sustained playing gesture. This is also the case in Fig. 4(c).

Figure 4(d) shows the acoustic impedance of the vocal tract during playing. The acoustic impedance was derived from the signal recorded inside the mouth during the injected sound as described above. Frequencies below 200 Hz (the lowest frequency in the injected current) are omitted. To remove the very large signal produced by the vibrating lips, five points centered on each harmonic of the lip frequency up to the 16th have been removed and replaced with a linear interpolation. The resulting data have been smoothed by a linear average over a window of 53.8 Hz and are presented on a linear rather than a logarithmic scale. This process is used hereafter to show $Z_{\mathrm{tract}}$ when measured during playing.

Comparing Fig. 4(b) or 4(c) with Fig. 4(d) shows that the peaks in the vocal tract impedance occur at frequencies at which the spectral envelope of the radiated sound has minima. This is considered further, below.

When considering the spectral envelope of Fig. 4(b) or 4(c), one should remember that human hearing sensitivity declines rapidly below about 300 Hz. Consequently, despite their relatively large amplitude, the fundamental and lower harmonics are not very loud. Further, they vary little during playing. The formants, on the other hand, occur at frequencies in the range of maximum sensitivity of the ear, and they change in response to changes in mouth configuration. It is these formants and the variation in them that contribute most of the interest in yidaki performance.

B. Relationship between the output sound and the impedance of the player’s vocal tract

The playing frequency is close to but slightly above that of the first resonance of the pipe, in accordance with Fletcher’s (1993) analysis of an “outward swinging door” valve that opens under excess pressure on the upstream side and closes under excess pressure on the downstream side [notated (+, −)]. The signal radiated by this cylindrical yidaki has stronger odd harmonics, especially for low frequencies, where these harmonics fall close to the resonances of the instrument, which are indicated by vertical lines in Fig. 4(c). The broadband component of the radiated signal (largely due to the spectrometer signal leaking through the player’s lips into the yidaki) is visible, especially at high frequencies. This broadband spectrum has an envelope that resembles the inverse of the vocal tract impedance, which is discussed below. The signal in the mouth in Fig. 4(a), which is due to the interaction of the flow through the vibrating lips with the vocal tract, shows no strong difference between even and odd harmonics, because the resonances in the tract are much broader than the frequency differences between the harmonics of the lip vibration.

The vocal tract impedance shows broad peaks at approximately 1.5, 2.1, and 2.8 kHz. A weak peak below 500 Hz is often seen when players mime playing [data not shown, but see Fletcher et al. (2001)], but here it is not seen: it is possibly obscured by the strong signal from the vibrating lips.

Comparing the vocal tract impedance spectrum $Z_{\mathrm{tract}}$ in Fig. 4(d) with the radiated sound spectrum in Fig. 4(c) one notes that, when $Z_{\mathrm{tract}}$ is sufficiently large, the envelope of the radiated sound spectrum is low. This correlation was evident in many such spectra, both from cylindrical pipes and flared yidakis (data not shown). Figure 5 shows the results of independent measurements of the high tongue configuration for the three players described above.
The minima and maxima in the spectral envelope of the radiated sound that fell in the range 1.0 to 2.2 kHz were recorded, as were the extrema in the vocal tract impedance measured immediately previously. In Fig. 5, the frequency of each minimum in the spectral envelope of the radiated sound is plotted against that of the nearest maximum in the impedance spectrum (filled symbols) and the frequency of each maximum in the sound envelope is plotted against that of the nearest minimum in the impedance spectrum. The correlation is excellent (the slope is 0.93 and the correlation coefficient is 0.98). The impedance maxima correspond almost exactly to minima in the spectral envelope of the sound, while maxima in the sound spectrum occur on average at frequencies slightly above those of the minimum in the impedance.

FIG. 5. On this graph, each filled symbol plots the frequency of a minimum in the spectral envelope of the radiated sound against the frequency of the nearest maximum in the impedance spectrum of the vocal tract. Each open symbol plots the frequency of a maximum in the spectral envelope of the radiated sound against the frequency of the nearest minimum in the measured impedance spectrum. No clear impedance maxima were evident for player BL in the measured range. The dashed line is the line of equality.

Why does a peak in $Z_{\mathrm{tract}}$ reduce the level of the radiated sound? The connection is a little obscure and requires careful analysis. For this reason it will be discussed only briefly here, but is treated in detail in our companion theory paper (Fletcher et al., 2006). In most lip-valve instruments, the maxima in $Z_{\mathrm{inst}}$ are very much larger than those in $Z_{\mathrm{tract}}$. For the yidaki, in the frequency range of interest (1 to 3 kHz), this is not the case, for two or three reasons.
First, the yidaki has a larger cross section than does the mouth with the tongue raised and so it has a relatively small characteristic impedance. Second, in a traditional yidaki, wall losses due to roughness in the bore of a genuine instrument may also be important, though not for the PVC pipe used here. In Fig. 4(d), the peak in $Z_{\mathrm{tract}}$ at 1.5 kHz has a value of about 8 MPa s m⁻³ for this configuration when the tongue is raised. The impedance of the pipe has peaks in this frequency range of 3 to 10 MPa s m⁻³. Consequently, these broad peaks in $Z_{\mathrm{tract}}$ give rise to a minimum in the acoustic flow $U$ in the yidaki, at the lips. For any given value of the transfer function between the two ends of the yidaki, a small $U$ at the input yields a small acoustic pressure at the output. Of course, the transfer function of a pipe is a strong function of frequency and has maxima at approximately $f_{\mathrm{pipe}} = (2n + 1)\,c/(4L)$, where $c$ is the speed of sound, $L$ is the length of the pipe, and $n$ is an integer. (At these frequencies, the pipe is a good impedance transformer to match the relatively low radiation impedance.) But the resonances of the pipe are closely spaced in frequency compared to those of the tract, so that several harmonics of the played sound will fall within a formant produced by a resonance of the vocal tract. This also explains the shape of the broadband component of the radiated spectra, discussed above.

FIG. 6. A recording in the high-tongue configuration, but also illustrating the effects of resonance coincidence. The top graph is the sound spectrum measured inside the mouth, the middle is the sound spectrum outside the end of the yidaki at the same time, and the bottom is the impedance spectrum inside the mouth during playing. At low and high frequencies, the odd harmonics coincide with resonances of the pipe (vertical dashes). However, at around 1.5 kHz, even harmonics coincide with the resonances.

Figure 6 presents another example of a measurement with the high tongue configuration. We propose that the impedance maxima in $Z_{\mathrm{tract}}$ measured just inside the lips shown in Figs. 4(d) and 6(c) are due to resonances of the upper vocal tract (i.e., the airway between the lips and the glottis). Mukai (1989) reports that experienced wind players perform with the glottis almost closed. A nearly closed glottis produces relatively strong resonances at high frequencies because the coefficient of reflection is large. Consequently, at frequencies above several hundred Hz, the impedance seen at the lips is approximately that of an irregular tube closed at the glottis. The lungs, on the other hand, would produce a termination that is essentially resistive at high frequencies, and therefore an airway with open glottis exhibits rather weak resonances (data not shown). We propose that experienced yidaki players, like other wind players, also perform with the glottis nearly closed and that this is necessary to produce the relatively strong resonances that give rise to strong formants in the output sound, as discussed below. In a number of styles of yidaki playing, the glottis is used as an additional vibrating signal source, so players will presumably be used to keeping the glottis in a nearly closed configuration.

The resonances of the tract in playing are therefore somewhat analogous to those used to produce speech, the differences being that for speech the glottis is the vibrating signal source rather than the lips, and the lips are open. Vocal formants in the kHz range occur at frequencies which produce standing waves in the tract with pressure antinodes near the glottis and a flow antinode at the lip opening (Sundberg, 1977).
In other words, they occur when the vocal tract is a most effective impedance matcher between the high impedance at the glottis and the low impedance of the radiation field outside the mouth. The formants radiated by the yidaki also occur when there is an impedance minimum at the lips and, we hypothesize, a pressure antinode near the glottis when it is nearly closed. Of course, in the sustained vowels of speech, the lips are at least somewhat open, whereas in yidaki playing they are almost shut. In speech, the mouth opening affects primarily the first vocal formant (F1, which occurs below about 1 kHz for all vowels). It is the second formant (F2) that is of interest here, because it falls approximately in the range 1 to 2 kHz. The frequency of F2 in speech depends somewhat on mouth opening. Consequently, the strong yidaki formant could be expected to occur at frequencies comparable with, but not equal to, those of second speech formants for a similar tract configuration. Thus playing with a mouth configuration similar to that required to produce the vowel /i/, for example, will produce a yidaki sound whose formant frequency is similar to but not necessarily equal to that of the second formant in the vowel /i/. Further, to produce a yidaki formant, the mouth configuration must provide peaks in the tract impedance that have sufficiently high amplitude, so some vowel shapes with low tongue may not produce a clear formant, as is discussed below.

Figure 6 illustrates another effect that influences formants in the output sound. Again, the playing frequency is slightly above that of the first resonance, by about 3 Hz in this case. Consequently, at a frequency approaching 1 kHz, the odd-even difference in the sound spectrum disappears, because in this frequency range the harmonics fall almost midway between resonances. At around 1.5 kHz, on the other hand, it is the even harmonics that benefit from the resonances of the pipe.
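The odd-even crossover described above can be illustrated with a small numerical sketch. It assumes an idealized pipe whose resonances lie exactly at odd multiples of a first resonance, and it uses illustrative values that are not given for Fig. 6 in this section: a first resonance of 67 Hz and a playing frequency of 70 Hz (3 Hz higher). The function name is hypothetical, and the resonances of a real instrument are not exactly harmonic, so the frequencies printed are indicative only.

```python
def offset_from_resonance(freq, f1):
    """Distance (Hz) from freq to the nearest odd multiple of f1,
    i.e. to the nearest resonance (2n + 1) * f1 of an idealized pipe."""
    n = max(0, round((freq / f1 - 1) / 2))
    return abs(freq - (2 * n + 1) * f1)


F1 = 67.0      # Hz, ASSUMED first pipe resonance (illustrative only)
F_PLAY = 70.0  # Hz, ASSUMED playing frequency, 3 Hz above F1

# List each harmonic of the played note and how far it lies from a resonance.
for k in range(1, 46):
    harmonic = k * F_PLAY
    parity = "odd " if k % 2 else "even"
    offset = offset_from_resonance(harmonic, F1)
    print(f"{k:2d} {parity} {harmonic:6.0f} Hz  offset {offset:5.1f} Hz")
```

With these assumed values, the fundamental lies 3 Hz from a resonance, odd and even harmonics are about equally far from resonances (roughly 31 to 33 Hz) near 0.8 kHz, and the 22nd (even) harmonic at 1540 Hz lies within 1 Hz of a resonance, which is qualitatively the pattern described for Fig. 6.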
The range around 1.5 kHz is not far below the formant due to the vocal tract resonance. It might be possible, in principle, to observe broad and weak formants in the sound spectrum due only to this effect of an even or an odd harmonic happening to fall on a resonance of the yidaki: we call this the “harmonic coincidence” effect. For an experienced yidaki player, formants produced by this effect would be relatively small compared to those produced by the minima in the spectral envelope that coincide with peaks in Ztract. Consequently, in this figure, as in Fig. 4, the maxima in the impedance of the vocal tract coincide with minima in the spectral envelope of the radiated sound. However, in Fig. 6 (but not Fig. 4), the formant at about 1.6 kHz is somewhat assisted by the near coincidence of a harmonic of the lip motion (here an even harmonic) with a resonance of the instrument. As the player can readily make small adjustments to the playing frequency, it is possible that the playing frequency is sometimes adjusted to take advantage of harmonic coincidence, though this was not investigated here.

FIG. 7. The graphs are as for Figs. 4 and 6, but these data were taken during a note played with the low tongue configuration.

Figure 7 shows the impedance measured inside the mouth and the sound produced for a configuration in which the player held the tongue low in the mouth. The peaks in Ztract were not as large as for the high tongue configuration, and consequently there is none of the shaping of the spectral envelope of the sound that is produced with the high tongue configuration. The impedance of the vocal tract is so low over most of the frequency range that only the strong low harmonics of the lip motion produce components that are clearly visible above the turbulent noise present inside the mouth. There is no clear formant comparable to those near 1.7 kHz in Figs. 4 and 6.
The coincidence effect does increase the odd harmonics around 2.5 kHz. In this example, and in others measured in the low tongue configuration, the odd harmonics in the signal recorded in the mouth were stronger than the neighboring even harmonics. Whether this is due to the shape of the lip opening (discussed below) or to sound transmission from the bore of the yidaki into the relatively low impedance load in the mouth, or to another cause, we do not know. However, one of its consequences might be the absence of a clear coincidence effect around 1.5 kHz.

C. Motion of the lips and air jet during playing

Images of the playing lips were produced in two different ways. In one, a high speed camera operating at 1000 frames per second filmed the lips from both the front and the side using the arrangement shown in Fig. 2. The side-view pictures thus taken are not shown here. Instead we present in side view the schlieren images, which show both the air jet and the lip position. These were produced stroboscopically using the arrangement shown in Fig. 3. Images from these two series are shown in Fig. 8. The images on the left of each pair are sequential pictures from one cycle. The pictures on the right, however, were each obtained in different cycles from the stroboscopic series and were matched to those from the high speed series by matching the shape of the lips at opening. The high speed film (both front and side view) is on our web site (Music Acoustics, 2005).

FIG. 8. The images on the left show one cycle of vibration of the player AT’s lips, from the front, photographed with a high speed camera. The bar has length 10 mm. The side views (at right) were obtained stroboscopically by the schlieren flow visualization technique and are matched with the high speed images to show one complete cycle. The scale bars have length 10 mm.
The lip shapes shown here are qualitatively similar to the results reported by Wiggins (1988). Figure 9 shows data from sets of images such as those on the left in Fig. 8, and the side view images taken simultaneously via the mirror (pictures not shown). The data are averaged over ten cycles from a series made with the high tongue configuration. The maximum camera speed (1000 frames per second) limits the time resolution, and hence limits the maximum frequency in an experimental spectrum A(f) to 500 Hz. However, the shape of A(t) is used as an input in a numerical model presented in our companion theory paper (Fletcher et al., 2006).

Yoshikawa (1995) reported the motion of the lips of horn players. In the low end of the range, their lips moved mainly along a horizontal axis (i.e., parallel to the bore of the mouthpiece), with a smaller opening motion in the vertical direction. Like that of the horn players in the low register, the lip motion reported here thus conforms to the “outward swinging door model” used to explain the operation of lip-valve instruments (Fletcher and Rossing, 1998) and the yidaki (Hollenberg, 2000).

FIG. 9. Data from images such as those in Fig. 8, averaged over ten cycles. (a) shows how the average position of the upper lip varies throughout a complete cycle (bars show the standard error). (b) shows how the average area of the open space between the lips varies with time throughout a cycle for the high and low tongue configurations. (c) shows the relative amplitude of the harmonic components of the periodic variations in the lip area. The dashed line indicates the situation in which the harmonic components vary as $n^{-2}$.

The schlieren images in Fig. 8 show the motion of the jet emitted from the lips by mixing the jet with CO2 from the “curtain” introduced as shown in Fig. 3. The density of carbon dioxide is about 1.5 times that of air. However, the total volume of CO2 is small so there is little reason to suspect that this alters significantly the behavior of the yidaki in general. Indeed, the period of the sound pressure oscillation (14.3 ms) remained substantially unchanged when the CO2 was supplied to form the curtain. To reduce the effect of the CO2 jets on the air jet, the flow of CO2 was regulated so that the CO2 curtain became almost invisible at the level of the opening of the lips.

Some aspects of the motion of the jet are explained by the mouth opening. In Fig. 8(b), the lips have just opened and the jet is well-defined and narrow. In Figs. 8(b)–8(e) inclusive, the lips remain open and the jet grows broader. This is possibly the result of changing geometry of the jet separation from the lips as they open. It is interesting that the jet deviates noticeably downward from the axis of the yidaki [Figs. 8(e) and 8(f)]. The downward deviation may be caused by the changing geometry of the player’s lips as they open or it may be due to a momentum transfer between the jet and the descending CO2 stream. Once the lips close [Fig. 8(f)] the momentum of the jet keeps it moving and the disturbance of the “curtain” gradually disappears.

In a widely used Bernoulli-flow model for wind instruments, the relation between the flow of air into the instrument and the pressure difference between the mouth and the instrument is calculated assuming conservation of energy between the mouth and the jet, followed by dissipation of kinetic energy in the jet when it mixes with the air in the instrument. When the air jet emerges from the lips it experiences viscous drag from the surrounding air, which leads to turbulent mixing because of the high speed of the jet. This can be clearly seen in Fig. 8. The important thing for sound generation in the instrument, however, is simply the input volume flow at a particular frequency multiplied by the instrument impedance at that frequency.

D. Idiomatic effects: Circular breathing and vocalization

Figure 10 shows aspects of the sound produced for another important playing configuration: the inhalation phase necessary for “circular breathing,” during which the performer plays using the air in the inflated cheeks, with his mouth sealed from the lower vocal tract with the soft palate. This radical change in the vocal tract geometry makes a substantial change in the timbre and the necessity of frequent inhalation means that the change occurs often in continuous playing. Idiomatic playing makes a virtue of this necessity, so that the sound produced using the cheeks as reservoir is made an integral element of the rhythmic structure.

FIG. 10. Top: an oscillogram of a sound sample during which the player inhales through the nose three times, while continuing to play, an example of one of the common rhythms used in circular breathing. The amplitude falls during the inhalations and is largest during normal playing. Spectra of sounds during normal playing (a) and inhalation (b) are shown.

FIG. 11. The spectrum of a sound produced by playing a note with frequency f = 70 Hz and simultaneously vocalizing at a frequency g, where g = 3f/2 = 105 Hz.

Many of the rhythms used require simple alternation of inhalation and exhalation through the instrument. Typical mouth pressures vary from 0.8 to 2 kPa and flow rates from 0.1 to 0.3 l/s. The volume of air that can be expelled from the cheeks supports loud playing for only a fraction of a second, during which most players are unable to inspire deeply. Consequently, a simple alternation of inhalation and normal playing allows only shallow breathing. For this reason, players will sometimes play for several seconds using the air in the lungs, and then refill the tidal volume during a series of three or four alternations between brief inhalations (while playing with the cheek reservoir) and normal playing.
This may be used when preparing for or recovering from the use of a high flow rate, such as may be used to play one of the upper resonances. An example of a regular series of inhalations is shown in Fig. 10. The waveform shows a series of three exhalations (large amplitude), each followed by a brief inhalation (small amplitude). The amplitude decreases during each of the inhalations, which may be explained by the application of the Young-Laplace relation to the muscles of the cheeks: it is easier to apply a higher pressure with a given muscular tension when the cheeks are highly curved.

The finite mouth volume and the consequent brief period during inhalation (typically 0.2 s for this player) had the unfortunate consequence for this study that it proved impossible to make acceptable impedance measurements of the tract during this gesture. The spectra in Fig. 10 contrast the sound produced during one of the inhalations, during which the mouth alone was the resonator upstream from the lips, with that produced during the intervening exhalation.

A further series of different timbres can be obtained by vocalizing and playing at the same time. In this case, both the vocal folds and the lips act as pressure-controlled valves, operating at opposite ends of the vocal tract. Due to the strongly nonlinear interaction among lip motion, vocal fold motion, and the air flow, a range of heterodyne components is generated. Simply, if the vocal folds have opening area $S_1 = \sum_n a(n) \sin(n\omega_1 t)$ and the lips have opening area $S_2 = \sum_n b(n) \sin(n\omega_2 t)$, then the total flow, ignoring tract impedance effects, is proportional to $S_1 S_2$ and thus has components at all frequencies $n\omega_1 \pm m\omega_2$. An example is shown in Fig. 11. Player LH plays a steady note at frequency f = 70 Hz, while simultaneously vocalizing at a frequency g = 3f/2 = 105 Hz, a musical fifth above. The first several harmonics and heterodyne components are indicated in the figure.
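As a quick numerical check of this heterodyne picture, the following sketch enumerates the combination frequencies |nf ± mg| for the two frequencies quoted for Fig. 11. The function name and the truncation at third order are illustrative choices, not part of the original analysis. Amplitudes and tract impedance effects are ignored, so only the frequencies of the components are shown.

```python
from functools import reduce
from math import gcd


def heterodyne_components(f, g, max_order):
    """Return the sorted positive frequencies |n*f +/- m*g| (Hz)
    for integer orders 0 <= n, m <= max_order."""
    freqs = set()
    for n in range(max_order + 1):
        for m in range(max_order + 1):
            for value in (n * f + m * g, abs(n * f - m * g)):
                if value > 0:
                    freqs.add(value)
    return sorted(freqs)


F_LIPS = 70    # Hz, lip frequency f quoted in the text
G_VOICE = 105  # Hz, vocal fold frequency g = 3f/2 quoted in the text

components = heterodyne_components(F_LIPS, G_VOICE, max_order=3)
spacing = reduce(gcd, components)  # greatest common divisor of all components

print(components)
print("common spacing:", spacing, "Hz")
```

Every component in this list is an integer multiple of 35 Hz, which is the greatest common divisor of 70 Hz and 105 Hz.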
Because of the harmonic relation between the two frequencies, these components are equally spaced at frequencies f/2 = g/3 = 35 Hz. Consequently, the pitch is heard at one octave below the lip fundamental, as the sound files demonstrate (Music Acoustics, 2005).

IV. CONCLUSIONS

When the player’s tongue is raised close to the hard palate, the acoustical impedance spectrum of the vocal tract has maxima with values of 2–10 MPa s m⁻³ in the range 1 to 3 kHz, comparable with or larger than a typical impedance maximum of the yidaki in that frequency range. At the frequencies of these maxima, the spectral envelope of the radiated sound has minima, because the frequency components of acoustic current in the air jet entering the instrument are reduced. In the frequency bands lying between the impedance maxima, the lower impedance of the tract produces a stronger acoustic current through the lips, resulting in a characteristic, strong formant in the radiated sound. The frequencies of these formants correspond approximately to the second formant frequencies produced in speech with the same tract configuration. The large variation between impedance maxima and minima, the variation that produces strong formants in the radiated sound, is consistent with a nearly closed glottis, the configuration used by the experienced wind players studied by Mukai (1989, 1992). This suggests that learning to play with the glottis nearly closed may be important in developing technique on the yidaki.

Broad and weak formants can also be observed when groups of even or odd harmonics coincide with a bore resonance. The lips are open for about half of each cycle and operate approximately as “outward swinging doors,” as described by Yoshikawa (1995) for the low range of the horn. The motion of the jet is relatively simple and supports the approximation of one-dimensional motion made in several analyses (Fletcher, 1993; Hollenberg, 2000).
The present paper has been concerned entirely with experimental measurements. A theoretical analysis and justification of some of the conclusions will be presented in a companion paper (Fletcher et al., 2006).

ACKNOWLEDGMENTS

We thank John Tann for technical assistance and the Australian Research Council for support. BL worked on this project during a vacation scholarship.

Amir, N. (2004). “Some insights into the acoustics of the didjeridu,” Appl. Acoust. 65, 1181–1196.
Amir, N., and Alon, Y. (2001). “A study of the didjeridu: normal modes and playing frequencies,” Proc. International Symposium on Musical Acoustics, Perugia, edited by D. Bonsi, D. Gonzalez, and D. Stanzial, pp. 95–98.
Backus, J. (1985). “The effect of the player’s vocal tract on woodwind instrument tone,” J. Acoust. Soc. Am. 78, 17–20.
Berio, L. (1966). Sequenza V; Solo Trombone (Universal, New York).
Brass, D., and Locke, A. (1997). “The effect of the evanescent wave upon acoustic measurements in the human ear canal,” J. Acoust. Soc. Am. 101, 2164–2175.
Caussé, R., Goepp, B., and Sluchin, B. (2004). “An investigation on ‘tonal’ and ‘playability’ qualities of eight didgeridoos, perceived by players,” Proc. International Symposium on Musical Acoustics, Nara, Japan.
Clinch, P. G., Troup, G. J., and Harris, L. (1982). “The importance of vocal tract resonance in clarinet and saxophone performance: a preliminary account,” Acustica 50, 280–284.
Elliott, J. S., and Bowsher, J. M. (1982). “Regeneration in brass wind instruments,” J. Sound Vib. 83, 181–217.
Epps, J., Smith, J. R., and Wolfe, J. (1997). “A novel instrument to measure acoustic resonances of the vocal tract during speech,” Meas. Sci. Technol. 8, 1112–1121.
Erickson, R. (1969). General speech: for trombone solo (with theatrical effects) (Smith, Baltimore).
Fletcher, N., Hollenberg, L., Smith, J., and Wolfe, J. (2001). “The didjeridu and the vocal tract,” Proc. International Symposium on Musical Acoustics, Perugia, edited by D. Bonsi, D. Gonzalez, and D. Stanzial, pp. 87–90.
Fletcher, N. H. (1983). “Acoustics of the Australian didjeridu,” Australian Aboriginal Studies 1, 28–37.
Fletcher, N. H. (1993). “Autonomous vibration of simple pressure-controlled valves in gas flows,” J. Acoust. Soc. Am. 93, 2172–2180.
Fletcher, N. H. (1996). “The didjeridu (didgeridoo),” Acoust. Aust. 24, 11–15.
Fletcher, N. H., and Rossing, T. D. (1998). The Physics of Musical Instruments, 2nd ed. (Springer-Verlag, New York).
Fletcher, N. H., Smith, J., Tarnopolsky, A., and Wolfe, J. (2005). “Acoustic impedance measurements: correction for probe geometry mismatch,” J. Acoust. Soc. Am. 117, 2889–2895.
Fletcher, N. H., Hollenberg, L. C. L., Smith, J., Tarnopolsky, A. Z., and Wolfe, J. (2006). “Vocal tract resonances and the sound of the Australian didjeridu (yidaki) II. Theory,” J. Acoust. Soc. Am. 119, 1205–1213.
Hollenberg, L. (2000). “The didjeridu: Lip motion and low frequency harmonic generation,” Aust. J. Phys. 53, 835–850.
Mukai, M. S. (1992). “Laryngeal movement while playing wind instruments,” in Proc. International Symposium on Musical Acoustics, Tokyo, Japan, pp. 239–242.
Mukai, S. (1989). “Laryngeal movement during wind instrument play,” J. Otolaryngol. Jpn. 92, 260–270.
Music Acoustics (2005). http://www.phys.unsw.edu.au/~jw/yidakididjeridu.html
Smith, J. R., Henrich, N., and Wolfe, J. (1997). “The acoustic impedance of the Bœhm flute: standard and some non-standard fingerings,” Proc. Inst. Acoust. 19, 315–320.
Sundberg, J. (1977). “The acoustics of the singing voice,” Sci. Am. 236, 82–91.
Tarnopolsky, A., Fletcher, N., Hollenberg, L., Lange, B., Smith, J., and Wolfe, J. (2005). “The vocal tract and the sound of a didgeridoo,” Nature (London) 436, 39.
Tarnopolsky, A. Z., and Fletcher, N. H. (2004). “Schlieren flow visualisation technique: applications to musical instruments, especially the didjeridu,” Proc. 18th International Congress in Acoustics, Kyoto, Japan, 4–9 April.
Tarnopolsky, A. Z., Lai, J. C. S., and Fletcher, N. H. (2000). “Flow structures generated by pressure-controlled self-oscillating reed valves,” J. Sound Vib. 247(2), 213–226.
Wiggins, G. C. (1988). “The physics of the didgeridoo,” Phys. Bull. 39, 266–267.
Wolfe, J., Smith, J., Tann, J., and Fletcher, N. H. (2001). “Acoustic impedance of classical and modern flutes,” J. Sound Vib. 243, 127–144.
Wolfe, J., Tarnopolsky, A. Z., Fletcher, N. H., Hollenberg, L. C. L., and Smith, J. (2003). “Some effects of the player’s vocal tract and tongue on wind instrument sound,” Proc. Stockholm Music Acoustics Conference (SMAC 03), edited by R. Bresin, Stockholm, Sweden, pp. 307–310.
Yoshikawa, S. (1995). “Acoustical behavior of brass player’s lips,” J. Acoust. Soc. Am. 97, 1929–1939.
