Controller design and consonantal contrast coding using a
multi-finger tactual display a)
Ali Israr b)
Haptic Interface Research Laboratory, Purdue University, 465 Northwestern Avenue, West Lafayette,
Indiana 47907-2035
Peter H. Meckl
Ruth and Joel Spira Laboratory for Electromechanical Systems, 585 Purdue Mall, West Lafayette, Indiana
47907-2088
Charlotte M. Reed
Research Laboratory of Electronics, Massachusetts Institute of Technology, Room 36-751, 77 Massachusetts
Avenue, Cambridge, Massachusetts 02139
Hong Z. Tan
Haptic Interface Research Laboratory, Purdue University, 465 Northwestern Avenue, West Lafayette,
Indiana 47907-2035
(Received 8 August 2008; revised 24 March 2009; accepted 29 March 2009)
This paper presents the design and evaluation of a new controller for a multi-finger tactual display used in speech communication. A two-degree-of-freedom controller, consisting of a feedback controller and a prefilter, and its application in a consonant-contrast experiment are presented. The feedback controller provides a stable, fast, and robust response of the fingerpad interface, and the prefilter shapes the frequency response of the closed-loop system to match the human detection-threshold function. The controller is subsequently used in a speech communication system that extracts spectral features from recorded speech signals and presents them as vibrational-motional waveforms to three digits on a receiver's left hand. Performance on a consonantal contrast test suggests that participants are able to identify the tactual cues necessary for discriminating consonants in the initial position of consonant-vowel-consonant (CVC) segments. The average sensitivity indices for contrasting voicing, place, and manner features are 3.5, 2.7, and 3.4, respectively. The results show that consonantal features can be successfully transmitted by utilizing a broad range of the kinesthetic-cutaneous sensory system. The present study also demonstrates the validity of designing controllers that take into account not only the electromechanical properties of the hardware, but also the sensory characteristics of the human user.
© 2009 Acoustical Society of America. [DOI: 10.1121/1.3124771]
PACS number(s): 43.66.Wv, 43.66.Ts, 43.66.Gf, 43.60.Ek [ADP]
a) Part of this work concerning the controller design was presented at the 2004 ASME International Mechanical Engineering Congress and Exposition, Anaheim, CA, Nov. 13-19, 2004.
b) Author to whom correspondence should be addressed. Electronic mail: israr@rice.edu

J. Acoust. Soc. Am. 125 (6), June 2009, Pages: 3925-3935

I. INTRODUCTION

The motivation for this research is to utilize touch as a sensory substitute for hearing in speech communication for individuals with severe hearing impairments. That such a goal is attainable is demonstrated by users of the Tadoma method, who receive speech by placing a hand on the face of a speaker to monitor the facial movements and airflow variations associated with speech production. Previous research has documented the speech-reception performance of highly experienced deaf-blind users of the Tadoma method at the segmental, word, and sentence levels (Reed et al., 1985). An analysis of information-transfer (IT) rates for a variety of methods of human communication (Reed and Durlach, 1998) suggests that the communication rates achieved through Tadoma are roughly half of those achieved through normal auditory reception of spoken English. By comparison, the estimated communication rates for speech transmission through artificial tactile aids are substantially below those of the Tadoma method (Reed and Durlach, 1998). The limited success demonstrated thus far with artificial tactual communication systems may be due to a variety of factors, including (1) the homogeneous nature of displays that utilize single or multiple actuators to deliver only high-frequency cutaneous stimulation, and (2) the use of body sites with relatively sparse nerve innervation, such as the forearm, abdomen, or neck (Plant, 1989; Waldstein and Boothroyd, 1995; Weisenberger et al., 1989; Galvin et al., 1999; Summers et al., 2005). In contrast, Tadoma users have access to a rich set of stimulus attributes, including kinesthetic movements of the face and jaw, cutaneous vibrations at the neck, airflow at the lips, and muscle tensions in the face, jaw, and neck, all of which are received through the hands.
To more fully exploit the capabilities of the tactual sensory system that are engaged in the use of the Tadoma
method, an artificial device, the Tactuator, was developed to deliver kinesthetic (motion) as well as cutaneous (vibration) stimuli through the densely innervated fingertips of the left hand (Tan and Rabinowitz, 1996). Previous research has examined IT rates for multidimensional stimuli delivered through the Tactuator device (Tan et al., 1999, 2003). For example, in Tan et al., 2003, IT rates of up to 21.9 bits/s were achieved using multidimensional synthetic waveforms presented at a single contact site. These rates, which are among the highest reported to date for a touch-based display, are at the lower end of the range of IT rates obtained for auditory reception of speech (Reed and Durlach, 1998).
The present research was concerned with the utilization of the broad kinesthetic-to-cutaneous stimulation range (nearly 0-300 Hz) of the TactuatorII for the display of speech. In particular, this research was designed to extend the work of Yuan (2003), in which speech was encoded for display through the Tactuator device. Yuan (2003) examined the ability to discriminate the voicing cue in consonants using a two-channel speech-coding scheme in which the amplitude envelope of a low-frequency band of speech was used to modulate a 50-Hz waveform delivered to the thumb, and the amplitude envelope of a high-frequency band of speech was used to modulate a 250-Hz waveform at the index finger. Noise-masked normal-hearing participants achieved high levels of performance on the pairwise discrimination of consonants contrasting the feature of voicing through the tactual display alone. This coding scheme was also effective in providing a substantial benefit to lipreading in closed-set consonant identification tasks.
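The essence of this style of two-channel coding, a band envelope modulating a fixed tactual carrier, can be sketched in plain Python. The sample rate, envelope cutoff, and test signal below are illustrative assumptions, not values from Yuan (2003); in the actual scheme each carrier would be driven by a different band of the speech signal.

```python
import math

FS = 8000  # sample rate in Hz; an illustrative choice, not from the paper

def envelope(band, cutoff_hz=25.0):
    """Amplitude envelope: full-wave rectify, then one-pole low-pass filter."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / FS)
    out, y = [], 0.0
    for v in band:
        y = a * y + (1.0 - a) * abs(v)
        out.append(y)
    return out

def modulate(env, carrier_hz):
    """Impose the band envelope on a fixed-frequency tactual carrier."""
    return [e * math.sin(2.0 * math.pi * carrier_hz * n / FS)
            for n, e in enumerate(env)]

# Stand-in for a band of speech: a 400-Hz tone that switches on at 0.5 s.
band = [0.0 if n < FS // 2 else math.sin(2.0 * math.pi * 400.0 * n / FS)
        for n in range(FS)]

thumb_signal = modulate(envelope(band), 50.0)    # low-band envelope -> 50 Hz
index_signal = modulate(envelope(band), 250.0)   # high-band envelope -> 250 Hz
```

The carrier is silent while the band is silent and rises with the band's energy, which is the cue the voicing experiments relied on.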
Encouraged by the results of Yuan (2003), the present study investigated consonant discriminability for the features of place and manner of articulation in addition to voicing. A speech-to-touch coding scheme was developed to extract envelope information from three major spectral regions of the speech signal and present them as kinesthetic motional and cutaneous vibrational cues. The three spectral bands included a low-frequency region (intended to convey information about fundamental frequency), a mid-frequency region (intended to convey information about the first formant of speech), and a high-frequency region (intended to convey second-formant information). These bands were broadly consistent with the bands of modulated noise found necessary for speech recognition by Shannon et al. (1995), as well as with those used in previous studies on tactile aids (Weisenberger and Percy, 1995; Clements et al., 1988; Summers, 1992). Amplitude-envelope information from each of these spectral regions was encoded tactually through the use of mid- and high-frequency vibrations at one of the three contactor sites of the TactuatorII (thumb, middle finger, and index finger, respectively). The absolute amplitude of the vibrations at each finger provided information about the energy in the corresponding frequency band. The relative amplitudes of the two vibrations (modulated at 30 and 200 Hz) at each finger channel provided information about the energy spread in the corresponding frequency band. In addition to the tactile waveforms, the coding scheme monitored energy peaks within each band and presented this information as low-frequency motional cues: extending the finger for high-frequency contents and flexing the finger for low-frequency contents in the corresponding finger band. These more pronounced representations of formant and formant-transition cues were employed in an effort to improve the transmission of cues related to place of articulation, which have been poorly transmitted through previous tactile aids (Clements et al., 1988; Weisenberger et al., 1989; Waldstein and Boothroyd, 1995; Plant, 1989; Summers et al., 2005; Weisenberger and Percy, 1995; Galvin et al., 1999). Acoustical analyses of plosive and fricative consonants have shown that place of articulation is well correlated with the frequency values of the first two formants (spectral peaks in the speech spectrum due to the shape of the mouth), F1 and F2, and their transitions (Ali et al., 2001a, 2001b; Jongman et al., 2000). Therefore, motions indicating changes in F1 and F2 were used to encode information concerning place of articulation. Although the location of the energy peaks was presented as high-frequency vibrations, redundant presentations of the same information as quasi-static positions of the fingers were intended to reduce inter-channel effects that may arise, such as those due to masking. It is well known that masking reduces internal representations of proximal tactual stimuli (Craig and Evans, 1987; Evans, 1987; Tan et al., 2003), and redundant presentation of speech information can lead to improved perceptual performance (Yuan, 2003; Summers et al., 1994).
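The dual-carrier cue at each contactor can be sketched as follows. The 30- and 200-Hz carrier frequencies come from the text; the sample rate and the constant envelopes are arbitrary illustrations standing in for the two sub-band envelopes of one formant band.

```python
import math

FS = 8000  # illustrative sample rate in Hz

def finger_waveform(env_lo, env_hi):
    """Sum of two amplitude-modulated carriers at one contactor site.

    env_lo and env_hi are the envelopes of the lower and upper halves of a
    formant band: their absolute sizes code band energy, and their ratio
    codes where within the band the energy is concentrated.
    """
    return [lo * math.sin(2.0 * math.pi * 30.0 * n / FS)
            + hi * math.sin(2.0 * math.pi * 200.0 * n / FS)
            for n, (lo, hi) in enumerate(zip(env_lo, env_hi))]

# Energy concentrated near the bottom of the band: the 30-Hz carrier dominates.
wave = finger_waveform([1.0] * FS, [0.2] * FS)
```

Correlating the summed waveform against each carrier recovers the two envelope levels, which is why the relative carrier amplitudes can convey energy spread within the band.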
One challenge associated with the use of broadband signals with an electromechanical system such as the TactuatorII is that the system frequency response is not uniform across its operating range. Therefore, input signals are distorted spectrally before they are presented to a human user. To solve this problem, a closed-loop two-degree-of-freedom (2DOF) controller was developed to reshape the overall system response. Specifically, the controller compensated for both the frequency response of the TactuatorII and the frequency-dependent human detection-thresholds (HDTs) for tactual stimulation, so that when a broadband input signal is applied to the TactuatorII, the relative amplitude of spectral components in the input signal is preserved in terms of perceived intensity when the signal reaches the user's fingers. The 2DOF controller consists of a feedback controller and a prefilter. The feedback controller (referred to as the low-frequency kinesthetic or motion controller) counters the effects of low-frequency disturbances due to a user's finger loading the device, increases the closed-loop bandwidth, and reduces the high-frequency in-line noise. The prefilter (referred to as the broadband cutaneous or vibration controller) shapes the overall system frequency response so that two equal-amplitude spectral components at the reference input would be perceived as equally intense by the human user.
The remainder of this paper describes the controller design and implementation of the TactuatorII system (Sec. II) and the speech-to-touch coding scheme (Sec. III). An experimental study on the pairwise discrimination of consonants with two human observers is reported (Sec. IV) before the paper concludes with a general discussion (Sec. V).
II. CONTROLLER DESIGN

A. Apparatus

The TactuatorII consists of three independently-controlled channels interfaced with the fingerpads of the thumb, the index finger, and the middle finger, respectively [Fig. 1(a)]. The range of motion for each digit is about 25 mm. Each channel has a continuous frequency response from dc to 300 Hz, delivering stimuli from the kinesthetic range (i.e., low-frequency gross motion) to the cutaneous range (i.e., high-frequency vibration) as well as in the mid-frequency range. Across the frequency range of dc to 300 Hz, an amplitude of 0 dB sensation level (SL) (decibels above HDT) to at least 47 dB SL can be achieved at each frequency, thereby matching the dynamic range of tactual perception (Verrillo and Gescheider, 1992). Details of the TactuatorII can be found in Israr et al., 2006 (cf. Sec. II A, pp. 2790-2791) and Tan and Rabinowitz, 1996.
The frequency response of the motor assembly was obtained by measuring the input-output voltage ratio over the frequency range dc to 300 Hz. It was modeled by a second-order transfer function P(s) = 2875/(s^2 + 94s + 290) (see also Tan and Rabinowitz, 1996).

FIG. 1. (a) Three channels of the TactuatorII system and the three hand contact points rested lightly on the "fingerpad interface" rods (inset). (b) The block diagram representation of the 2DOF controller.

B. Controller design

The main design objective was to shape the frequency response of the TactuatorII so that, when driven with a broadband signal (up to 300 Hz), the relative intensities of different spectral components were preserved in terms of the relative sensation levels (SLs) delivered by the TactuatorII. In addition, the controller should be able to reduce the effects of the low-frequency finger-load disturbance and achieve fast and stable motion tracking. Because of the similarities among all three channels, controller design for one channel assembly of the TactuatorII is discussed in this paper.
Our approach was to have two main components in the controller: one for the low-frequency kinesthetic movements, and the other for the broadband high-frequency cutaneous region, as explained in Fig. 1(b). The high-frequency broadband reference position signal, r2(t), was first passed through a prefilter, F(s), and then added to the low-frequency motional reference position, r1(t). The combined signal was then compared to the measured position signal, y*(t), to form an error signal, e(t), as the input to the feedback controller, C(s). The output of the feedback controller, or the command signal, u(t), was used to drive the motor assembly, P(s), to achieve a position trajectory of y(t) at the point where the fingerpad rests. The effects of finger loading and sensor noise are represented by d0(t) and n(t), respectively.
Major steps in the design of the feedback controller and the prefilter are outlined below. More details can be found in Israr, 2007 (cf. Chap. 2).
1. Feedback controller for kinesthetic stimulus region
The feedback controller, or the motional (kinesthetic) controller, C(s), was designed using a lead-lag frequency loop-shaping technique that shaped the frequency response of the open-loop transfer function, L(s) = C(s)P(s), to lie within the constraints determined by the required closed-loop response, T(s) = C(s)P(s)/[1 + C(s)P(s)] (Maciejowski, 1989). It consists of an integrator for maintaining the 0 dB closed-loop gain, a pair of zeros for increasing the stability margin, and a high-frequency pole for suppressing in-line noise and for the proper structure (causality) of the controller C(s). The final design of the feedback (kinesthetic) controller is given by

C(s) = 12.264 (s^2 + 111s + 530) / [s(s + 260)].
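Using the plant model P(s) from Sec. II A together with this controller, the closed-loop behavior can be spot-checked numerically with the standard library alone; the probe frequencies below are illustrative choices, not values from the paper.

```python
import math

def P(s):
    """Second-order motor-assembly model from Sec. II A."""
    return 2875.0 / (s * s + 94.0 * s + 290.0)

def C(s):
    """Feedback (kinesthetic) controller from Sec. II B 1."""
    return 12.264 * (s * s + 111.0 * s + 530.0) / (s * (s + 260.0))

def gain_db(f_hz):
    """Closed-loop gain |T(j 2 pi f)| in dB, with T = CP / (1 + CP)."""
    s = 2j * math.pi * f_hz
    loop = C(s) * P(s)
    return 20.0 * math.log10(abs(loop / (1.0 + loop)))
```

Evaluating gain_db at 1, 30, and 300 Hz reproduces the qualitative picture reported for this design: close to 0 dB at low frequency, a roll-off of a couple of dB near the roughly 30-Hz closed-loop bandwidth, and strong attenuation at 300 Hz.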
Figure 2 shows the magnitude [panel (a)] and the phase [panel (b)] of the frequency response for the open-loop system (dashed-dotted curve) and the closed-loop system (solid curve). The stability gain and phase margins achieved with the controller C(s) are also shown in Fig. 2. A quantitative analysis of the system showed that the feedback controller was able to reject or reduce unwanted noise. The 60-Hz in-line noise was imperceptible to human users due to the rapidly falling slope of the closed-loop magnitude frequency response at 60 Hz. The finger load was rejected by keeping the closed-loop response close to the 0 dB line at low frequencies, and by selecting an appropriate bandwidth of about 30 Hz. In the loaded condition (where the fingerpad was lightly placed on the fingerpad interface), the average deviations of the closed-loop response from the unloaded condition (where the fingerpad interface was displaced with no finger load), measured at four intensity levels, were 0.34 dB at 1 Hz, 1.43 dB at 8 Hz, 0.64 dB at 40 Hz, 0.3 dB at 100 Hz, and 0.65 dB at 260 Hz.
FIG. 2. Comparison of the frequency response of the open-loop and closed-loop systems. (a) shows the magnitude response of the open-loop and closed-loop systems. A solid curve shows the input-output transfer function model of the closed-loop TactuatorII assembly. The responses of the continuous and discrete open-loop systems overlap. (b) shows the phase response of the open-loop and closed-loop systems. Also shown in the figure are the gain and phase margins, which are important criteria for system stability.

2. Prefilter controller for cutaneous stimulus region

For the design of the broadband controller component, i.e., the prefilter F(s), we first considered the typical HDT curve as a function of sinusoidal stimulus frequency (Bolanowski et al., 1988) (shown as the solid curve in Fig. 3). The inverse of this detection-threshold curve was regarded as the sensitivity curve or, equivalently, the "frequency response" of the human user. The perceived intensity of a signal, in dB SL, is roughly determined by the distance between the physical intensity of the signal and the detection threshold at the corresponding frequency (Verrillo and Gescheider, 1992). The effect of the human sensitivity curve on system performance is illustrated in Fig. 4. When a broadband cutaneous controller is used to compensate for the human sensitivity function, equal intensities in the input signal (shown as equal intensities in the reference signal, Fig. 4) spectrum will result in equally strong sensations when received by a human user. Therefore, the steady-state response of the overall closed-loop system, H(s) = F(s)T(s), should follow the target frequency function of the HDT curve in the frequency range dc to 300 Hz, i.e., H(s) = HDT(s).

FIG. 3. A typical HDT curve as a function of frequency adapted from Bolanowski et al. (1988) (solid curve) and the HDT curve obtained in the present study (dashed curve). Also shown are data points from three participants (S1: circles, S2: squares, S3: triangles) and the standard errors of their threshold levels. The dashed curve is a first-order approximation of the detection-threshold levels for the three participants along the frequency continuum.

FIG. 4. Graphical illustration of the objectives for the high-frequency cutaneous controller. When the frequency-response function of the mechanical system matches that of the HDT, the frequency function cancels the effects of the variable human sensitivity function and preserves the spectral components of the reference input signal.

It was anticipated that the HDT function measured with the fingerpad interface of the TactuatorII system would differ from that reported in Bolanowski et al., 1988, based on the known variation in tactual thresholds with experimental conditions such as contact site, direction of vibrations, use of an annulus surround to restrict penetration of vibrations, etc. (Verrillo and Gescheider, 1992; Brisben et al., 1999). Thus, the detection thresholds for three highly trained participants were estimated in a psychophysical experiment. Detection thresholds for a 1-s stimulus at nine test frequencies (1, 3, 7, 15, 31, 63, 127, 255, and 300 Hz) were determined with a three-interval forced-choice paradigm combined with a one-up three-down adaptive procedure (Leek, 2001). Thresholds obtained this way correspond to the 79.4-percentile point on the psychometric function. The results are shown in Fig. 3. Compared with the HDT curve determined by Bolanowski et al. (1988), which was measured at the thenar eminence, the thresholds newly measured on the index fingerpad followed the same general trend; however, our absolute-threshold measurements were somewhat higher than those of Bolanowski et al. (1988) at the lower frequencies and lower than theirs at the higher frequencies. These results were consistent with those found in other studies (Gescheider et al., 1978; Van Doren, 1990; Goble et al., 1996), and with those taken earlier with the Tactuator using a Proportional-Integral-Derivative (PID) controller (Tan and Rabinowitz, 1996; Yuan, 2003).
The TactuatorII-specific HDTs were subsequently incorporated into the parameters of the prefilter controller, F(s). A new HDT curve based on the measured data (dashed line in Fig. 3) was obtained and used as the required frequency function H(s) = F(s)T(s). The Laplace transform of the resulting prefilter is

F(s) = 0.51 (s^4 + 1797s^3 + 1.822×10^6 s^2 + 9.779×10^8 s + 1.955×10^11) / (s^4 + 1134s^3 + 4.313×10^6 s^2 + 1.337×10^9 s + 1.995×10^10).

FIG. 5. (a) Response of the TactuatorII to a ramp signal applied at the reference input, r1(t), shown as a solid line. The responses of the model (dashed line) and the actual mechanical assembly (dots) showed a fast response time and low overshoot. (b) A comparison of the measured sensor outputs (individual data points) and the predicted output levels (solid lines) at 0, 10, 20, 30, and 40 dB SL without the influence of human finger loading (unloaded condition, unfilled symbols) and with the influence of human finger loading (loaded condition, filled symbols).
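As a sanity check on the published prefilter coefficients, the gain of F(s) can be evaluated at the band edges. By these numbers the prefilter applies roughly 13 dB more gain near dc than at 300 Hz, consistent with detection thresholds being highest at the low-frequency end of the range.

```python
import math

def F(s):
    """Prefilter transfer function, with the coefficients as published."""
    num = s**4 + 1797*s**3 + 1.822e6*s**2 + 9.779e8*s + 1.955e11
    den = s**4 + 1134*s**3 + 4.313e6*s**2 + 1.337e9*s + 1.995e10
    return 0.51 * num / den

dc_gain = abs(F(0j))                    # low-frequency gain, about 5.0
hf_gain = abs(F(2j * math.pi * 300.0))  # gain at 300 Hz, close to 1
```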
C. Controller response analysis
The 2DOF controller was implemented on an SBC6711 standalone DSP card (Innovative Integration, Simi Valley, CA) with a 16-bit analog-to-digital converter (ADC) and a 16-bit digital-to-analog converter (DAC) at a sampling rate of 4 kHz. The 2DOF controller design was analyzed by measuring the closed-loop reference tracking and the overall closed-loop frequency response in unloaded and loaded conditions. In order to readily compare the sensor feedback signal in volts with the threshold levels, the controller input reference was scaled by a factor of 0.00397. This factor accommodated the magnitude level of the flat portion of the HDT function at lower frequencies (26 dB re 1 μm peak, or equivalently -34 dB re 1 mm peak in Fig. 3, i.e., 0.01995 mm) and the sensor gain of 0.19898 V/mm.
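The scaling factor follows directly from the two numbers quoted above, as a quick arithmetic check confirms:

```python
# Flat low-frequency portion of the HDT curve: 26 dB re 1 micrometer peak,
# which is the same displacement as -34 dB re 1 mm peak (a 60-dB unit change).
threshold_mm = (10 ** (26 / 20)) / 1000.0       # ~0.01995 mm peak displacement
sensor_gain = 0.19898                           # position-sensor gain, V/mm
reference_scale = threshold_mm * sensor_gain    # ~0.00397 V at 0 dB SL
```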
1. Motion tracking
Figure 5(a) shows the response of the TactuatorII system to ramp trajectories applied at the reference input r1(t) without the influence of human finger load. Shown are the responses of the model (dashed line) and the actual hardware assembly (dots). The slopes of the reference trajectories were 0.1, -0.06, -0.28, 0.04, and 0.016 V/ms, respectively. The output (position) response of the hardware assembly showed that the low-frequency kinesthetic controller maintained stability of the device and tracked the reference input with a short response time (about 10 ms) and a small overshoot.
2. Frequency response
Sinusoidal reference input signals of 2-s duration at various frequencies, r2(t), were applied to the TactuatorII system, and the position-sensor readings were recorded. The results for the unloaded (without finger load) and loaded (with finger placed lightly on the fingerpad interface) conditions are shown in Fig. 5(b). The bottom solid curve corresponds to the HDT curve measured with the TactuatorII (dashed line in Fig. 3), i.e., the 0 dB SL curve. The other four solid curves are at 10, 20, 30, and 40 dB SL, respectively. The open symbols show the measured outputs at the five SLs with no finger load, and the filled symbols show the loaded results with a finger resting lightly on the fingerpad interface. There was generally a close match between the measured data points (filled and unfilled symbols) and the expected output levels (solid curves). Deviations at a few frequencies (especially the highest frequencies) were likely due to signal noise and nonlinear finger-loading effects at such low signal levels. Therefore, the 2DOF controller was successful at compensating for the frequency response of the motor assembly and the HDT curve, and the feedback controller was effective at rejecting the low-frequency disturbances caused by the finger load.
The engineering performance measurements presented above indicate that the 2DOF controller met the original design objectives of accurate and fast motion tracking, disturbance rejection, and broadband response shaping. Most importantly, we demonstrated the achievement of the main design objective: preserving the relative intensities of input-signal spectral components in terms of dB SL.
III. SPEECH-TO-TOUCH CODING
Speech features were extracted off-line in MATLAB (The MathWorks, Inc., Natick, MA) from the digitized speech segments and converted into tactual signals presented through all three channels of the TactuatorII. Before this processing, the speech signal was passed through a pre-emphasis filter that amplified the energy above 1000 Hz at a typical rate of 6 dB/octave in order to compensate for the falling speech spectrum above 1000 Hz. Three major signal-processing schemes were used for the extraction of spectral features: (1) low-pass filtering, (2) band-pass filtering, and (3) the envelope extraction scheme of Grant et al. (1985). In this scheme, the band-limited signal is rectified and passed through a low-pass filter to extract its temporal envelope, which is then scaled and output on a carrier frequency, as shown in Fig. 6(a). The coding scheme incorporates both high-frequency tactile vibrations and low-frequency motional waveforms.

FIG. 6. (a) Block diagram illustration of the tactile coding scheme used in the formant bands. (b) Block diagram illustration of the motional coding scheme. Spectral features are extracted from three bands of the speech signal and presented as motional waveforms.
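A 6 dB/octave pre-emphasis of the kind described above can be realized as a one-tap high-pass filter; the coefficient below is a common illustrative choice, not a value given in the paper.

```python
import math

FS = 11025  # sampling rate of the converted speech files

def pre_emphasis(x, alpha=0.95):
    """One-tap high-pass y[n] = x[n] - alpha * x[n-1].

    A standard way to approximate a 6 dB/octave boost; alpha = 0.95 is an
    illustrative assumption.
    """
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def gain_at(f_hz, alpha=0.95):
    """Magnitude response of the pre-emphasis filter at f_hz."""
    w = 2.0 * math.pi * f_hz / FS
    return math.sqrt(1.0 + alpha * alpha - 2.0 * alpha * math.cos(w))
```

Doubling the frequency from 1000 to 2000 Hz roughly doubles the gain (about 5.6 dB here), approximating the stated 6 dB/octave slope over the region of interest.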
A. Speech material
The speech materials consisted of C1VC2 nonsense syllables spoken by two female speakers of American descent. Each speaker produced eight tokens of each of 16 English consonants (the plosives, fricatives, and affricates: /p, t, k, b, d, g, f, θ, s, ʃ, v, ð, z, ʒ, tʃ, dʒ/) in the initial consonant (C1) position with the medial vowel V = /ɑ/. The final consonant (C2) was randomly selected from a set of 21 consonants (/p, t, k, b, d, g, f, θ, s, ʃ, v, ð, z, ʒ, tʃ, dʒ, m, n, ŋ, l, r/). The syllables were converted into digital segments and stored as .mov (QuickTime Movie) files on the hard drive of a desktop computer (see details of the conversion in Yuan, 2003). The .mov files were then converted into .wav (waveform audio) files using CONVERTMOVIE 3.1 (MOVAVI, Novosibirsk, Russia), with the audio format set to a sampling rate of 11 025 Hz and 16-bit mono. The duration of the segments varied from 1.268 to 2.002 s, with a mean duration of 1.653 s.
B. Tactile coding scheme

The coding scheme extracted envelopes from three distinct frequency bands (the F0-, F1-, and F2-bands) of the speech spectrum and presented them as vibrations (mid- and high-frequency waveforms) through the three channels of the TactuatorII. Table I lists the numerical values for the frequency bands and the corresponding finger channels. Spectral energy from the fundamental-frequency (F0) region was presented at the thumb channel by passing the low-pass filtered speech signal directly through the 2DOF controller described in Sec. II. Information from the first-formant band (F1) was presented through the middle finger channel, and the second-formant band (F2) information through the index finger channel, using the processing units described in Fig. 6(a). The formant band-limited signal was processed through two band-pass filters, and the amplitude envelopes of these two bands were extracted and modulated with carrier frequencies of 30 and 200 Hz. The 30-Hz waveforms modulated the envelope of the lower-frequency band and the 200-Hz waveforms modulated the higher-frequency band. The two vibratory signals were added and passed through the fingerpad interface. Since the digitized speech segments were normalized to one, the vibrations were scaled to a maximum intensity of 40 dB SL.

TABLE I. Speech bands and the corresponding vibrations through the three channels.

TactuatorII channel | Speech band (Hz)    | Envelope bands (Hz)         | Carrier frequency (Hz)
Middle finger       | F1-band (300-1200)  | 300-650 / 650-1200          | 30 / 200
Index finger        | F2-band (1150-4000) | 1150-1750 / 1750-4000       | 30 / 200
Thumb               | F0-band (80-270)    | Low-pass filtered at 270 Hz | (none; presented directly)

Figure 7 illustrates the vibration cues associated with two CVC segments spoken by the two female speakers. The top two panels show the 30- and 200-Hz vibrations for the segment /ʃɑC2/ spoken by speaker 1, and the bottom two panels show the same for speaker 2. Note the similar waveforms at the two fingerpads associated with the same medial vowel /ɑ/. The vowel /ɑ/ has a high first formant and a low second formant. This is indicated by stronger 200-Hz than 30-Hz vibrations at the middle fingerpad (see the two left panels in Fig. 7) and significantly stronger 30-Hz vibrations at the index fingerpad (see the two right panels). Cues associated with similar initial consonants are difficult to judge because they are not resolved over such a short duration of time (either visually or through the tactual sensory system).

FIG. 7. Illustration of vibration waveforms extracted by using the speech-to-touch coding scheme. The figure shows vibration cues presented on the middle fingerpad (left panels) and on the index fingerpad (right panels). The cues are associated with multiple segments of /ʃɑC2/ spoken by two female speakers (Sp1 or Sp2).
divided into ten bands. Illustration of the motion cues associated with two segments of six initial consonants in CVC
format spoken by the two speakers is shown in Fig. 8.
C. Motional coding scheme
IV. PRELIMINARY STUDY OF CONSONANT
DISCRIMINATION
The coding scheme extracted frequency variations in the
F0-, F1-, and F2-bands using processing blocks shown in
Fig. 6共b兲 and presented them through three channels of the
TactuatorII. These motion cues indicated variations of spectral energy such as formant transition cues in the consonantvowel segments and the quasi-static positions of the fingerpad interface redundantly indicated the frequency locations
of energy peaks in the frequency band of each channel. As
illustrated in Fig. 6共b兲, the formant band-limited signal was
passed through contiguous band-pass filters in parallel and
the temporal envelope of each band was obtained. The envelopes were compared and the center frequency of the band
with the largest envelope value was noted at each sample
instant. The center frequency was linearly mapped to the
absolute reference position of the fingerpad interface that
ranged ⫾12.5 mm from the neutral zero position and was
low-pass filtered with a gain crossover at 8 Hz. Thus, the
finger extended for high-frequency contents and flexed for
the low-frequency contents in the finger band. As with the
tactile coding scheme, the features from the F0-, F1-, and
F2-bands were presented to the thumb, middle finger, and
index finger channels, respectively. The center frequencies
and bands of each band-pass filters are shown in Table II.
The frequency ranges covered by the middle finger and thumb channels were divided into eight bands, while the larger range encompassed by the index finger channel was divided into ten bands. An illustration of the motion cues associated with two segments of six initial consonants in CVC format spoken by the two speakers is shown in Fig. 8.
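The per-channel processing just described can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fourth-order Butterworth filters, the Hilbert-transform envelope, and the function name `motion_cue` are our assumptions, while the contiguous bands, the ±12.5 mm position range, and the 8-Hz low-pass crossover come from the text and Table II.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def motion_cue(x, fs, bands, max_disp_mm=12.5):
    """Map a formant-band speech signal to a fingerpad position trajectory.

    bands: list of (f_lo, f_hi) tuples, e.g. Table II's middle-finger bands.
    Returns a displacement trajectory in mm within [-max_disp_mm, +max_disp_mm].
    """
    centers = np.array([(lo + hi) / 2.0 for lo, hi in bands])
    # Contiguous band-pass filters in parallel; temporal envelope of each band.
    envs = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        y = sosfilt(sos, x)
        envs.append(np.abs(hilbert(y)))        # envelope extraction (assumed method)
    envs = np.array(envs)                      # shape: (n_bands, n_samples)
    # At each sample, note the center frequency of the band with the largest envelope.
    fc = centers[np.argmax(envs, axis=0)]
    # Linear map: lowest center -> -12.5 mm (flexion), highest -> +12.5 mm (extension).
    pos = (fc - centers.min()) / (centers.max() - centers.min()) * 2 * max_disp_mm - max_disp_mm
    # Smooth the position command with a low-pass filter (gain crossover near 8 Hz).
    sos_lp = butter(2, 8.0, btype="lowpass", fs=fs, output="sos")
    return sosfilt(sos_lp, pos)
```

With the middle-finger bands of Table II, a sustained high-frequency component (e.g., near 1100 Hz) drives the trajectory toward full extension, consistent with the mapping described above.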
J. Acoust. Soc. Am., Vol. 125, No. 6, June 2009
IV. PRELIMINARY STUDY OF CONSONANT DISCRIMINATION

A perception study was conducted on the pairwise discriminability of consonants that were processed for display through the three finger-interfaces of the TactuatorII system.
A. Methods
The ability to discriminate consonantal features was
tested for 20 pairs of initial consonants that contrasted in
voicing, place, and manner features. Each pair contrasted one
TABLE II. Frequency bands for motional cues.

Filter index   Middle finger (Hz)   Index finger (Hz)   Thumb (Hz)
1              300–400              1150–1300           80–100
2              400–500              1300–1500           100–120
3              500–600              1500–1700           120–140
4              600–700              1700–1900           140–160
5              700–800              1900–2100           170–200
6              800–900              2100–2300           200–220
7              900–1000             2300–2500           220–240
8              1000–1200            2500–3000           240–260
9              N/A                  3000–4000           N/A
10             N/A                  4000–5000           N/A
Israr et al.: Tactual consonant discrimination
3931
FIG. 8. Illustration of motion waveforms extracted using the speech-to-touch coding scheme. Each row shows the waveforms associated with two segments of the same initial consonant spoken by two female speakers (Sp1 or Sp2). The waveforms correspond to the formant location and formant transitions in the first-formant band (solid line, motion waveforms at the middle finger) and in the second-formant band (dashed line, motion waveforms at the index finger). Also shown are the locations of constriction during the production of the initial consonant.
of the three features (and had the same value for each of the other two features). The pairs used in the present study, along with their contrasting features, are shown in Table III. Of the 20 pairs, 5 contrasted in voicing, 8 contrasted in place, and 7 contrasted in manner. Two male participants (ages 30 and 22 years) took part in the experiments. S1, who is one of the authors, was highly experienced with the TactuatorII system, but S2 had not used the device
prior to the present study. Both S1 and S2 were research staff
members with previous experience in other types of haptic
experiments.
The tests were conducted using a two-interval two-alternative forced-choice paradigm (Macmillan and Creelman, 2004). On each trial, the participant was presented with two tactual stimuli associated with a specific pair of consonants. The order of the two consonants was randomized with
TABLE III. Contrasting consonant pairs, associated articulatory and phonetic features, and average evaluation scores in C3.

Pairs     Articulatory features                             Contrasting distinction   d′
/p-b/     Bilabial plosives                                 Voicing                   4.65
/k-g/     Velar plosives                                    Voicing                   2.90
/f-v/     Labiodental fricatives                            Voicing                   2.36
/s-z/     Alveolar fricatives                               Voicing                   3.78
/tʃ-dʒ/   Affricates                                        Voicing                   3.81
/p-t/     Unvoiced plosives, bilabial/alveolar              Place                     2.38
/t-k/     Unvoiced plosives, alveolar/velar                 Place                     1.66
/b-d/     Voiced plosives, bilabial/alveolar                Place                     3.80
/d-g/     Voiced plosives, alveolar/velar                   Place                     2.13
/f-s/     Unvoiced fricatives, labiodental/alveolar         Place                     1.46
/v-z/     Voiced fricatives, labiodental/alveolar           Place                     3.80
/θ-ʃ/     Unvoiced fricatives, dental/post-alveolar         Place                     2.66
/ð-ʒ/     Voiced fricatives, dental/post-alveolar           Place                     3.60
/p-f/     Unvoiced bilabial plosive/labiodental fricative   Manner                    3.12
/b-ð/     Voiced bilabial plosive/dental fricative          Manner                    3.47
/t-s/     Unvoiced alveolar plosive/fricative               Manner                    3.60
/d-ʒ/     Voiced alveolar plosive/post-alveolar fricative   Manner                    4.65
/d-dʒ/    Voiced alveolar plosive/affricate                 Manner                    2.37
/s-tʃ/    Unvoiced alveolar fricative/affricate             Manner                    3.00
/ʃ-tʃ/    Unvoiced post-alveolar fricative/affricate        Manner                    3.38
equal a priori probability in each trial. The participant was
instructed to press a button corresponding to the order of the
consonants presented. The duration of each stimulus interval
was 2 s with an inter-stimulus-interval of 500 ms. A 150-ms
auditory tone and a visual phrase indicating “stimulus 1” or
“stimulus 2” were presented 250 ms before the start of each
stimulus to mark the beginning of each stimulus interval.
Data were collected for each consonant pair under three different experimental conditions tested in a single session: a 20-trial initial run without any feedback (C1), up to four 20-trial runs with trial-by-trial correct-answer feedback (C2), and a 50-trial final run without feedback (C3). Condition C2 was terminated if a percent-correct score above 90% was obtained in a single run or when the participant had completed all four runs. Conditions C1 and C3 could be viewed as the initial and final assessments of the participants' performance, while C2 provided training as needed (although one could argue that S1 was already "trained" prior to C1). Half of the 256 total speech tokens were used in conditions C1 and C3 and the other half were used in condition C2. Thus, the two sounds associated with each discrimination test were represented by eight tokens apiece (four from each of the two speakers). Each consonant within a pair was presented once or twice to the participant before C1 to familiarize the participant with its tactual cues. The order in which consonant pairs were tested was randomized for each participant. Each participant was tested for no more than two 40–45 min sessions on a single day, and frequent rests were encouraged.
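The early-termination rule for condition C2 amounts to a simple loop; the sketch below is illustrative only, and `run_block` (returning the proportion correct for one 20-trial run) is a hypothetical callable, not part of the authors' software.

```python
def run_condition_c2(run_block, max_runs=4, criterion=0.90):
    """Condition C2: up to four 20-trial feedback runs, stopping early
    once a single run scores above 90% correct."""
    scores = []
    for _ in range(max_runs):
        pc = run_block()   # proportion correct for one 20-trial run
        scores.append(pc)
        if pc > criterion:
            break          # criterion met; end training early
    return scores
```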
For each experimental run, a 2 × 2 stimulus-response confusion matrix was obtained, from which the percentage-correct (PC) score, the sensitivity index d′, and the response bias β were calculated using signal-detection theory (Macmillan and Creelman, 2004). The sensitivity index was set to 4.65 (corresponding to a hit rate of 0.99 and a false-alarm rate of 0.01) when performance was perfect.
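The run-level indices can be computed as sketched below. Clipping the hit and false-alarm rates to [0.01, 0.99] reproduces the d′ = 4.65 cap for perfect performance; the criterion-style bias formula is our assumption about the statistic reported as β (see Macmillan and Creelman, 2004 for alternative definitions).

```python
from statistics import NormalDist

def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' and a bias statistic from a 2x2 confusion matrix.

    Rates are clipped to [0.01, 0.99] so that z() stays finite; with
    H = 0.99 and F = 0.01, d' caps at 4.65 as stated in the text.
    """
    z = NormalDist().inv_cdf
    H = min(max(hits / (hits + misses), 0.01), 0.99)
    F = min(max(false_alarms / (false_alarms + correct_rejections), 0.01), 0.99)
    d_prime = z(H) - z(F)
    bias = -(z(H) + z(F)) / 2.0   # criterion-style bias (assumed form of beta)
    return d_prime, bias
```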
During the experiment, the TactuatorII was placed to the left of the participant's torso. It was covered by a padded wooden box that served as an armrest for the participant's left forearm. The top of the box had an opening that allowed the participant to place the thumb, the index finger, and the middle finger on the "fingerpad interface" rods (see inset in Fig. 1). Earmuffs (Twin-Cup type, H10A, NRR 29, Peltor, Sweden) and pink noise (presented at roughly 80 dB SPL) were used to eliminate possible auditory cues.
B. Results
In general, performance indices increased as the participants gained more experience with the stimuli. Overall, the
average sensitivity index of all pairs increased from d⬘
= 2.66 共PC= 83%兲 in C1 to d⬘ = 3.13 共PC= 91%兲 in C3. A
pairwise two-sided t-test showed that sensitivity scores in C3
were significantly higher than in C1 关t共39兲 = 2.16, p
⬍ 0.05兴. The sensitivity indices averaged over the two participants for each contrasting consonant in condition C3 are
shown in Table III. For consonants contrasting in the voicing,
place, and manner of articulation features, the performance
levels of the two participants were similar in C3 : d⬘ = 3.5 for
S1 and d⬘ = 3.5 for S2 in voicing, d⬘ = 2.8 for S1 and d⬘
= 2.6 for S2 in place, and d⬘ = 3.2 and d⬘ = 3.6 for S1 and S2,
respectively, in manner distinction. The response bias across
the 20 consonant pairs ranged from  = −0.74 to  = 0.63 and
averaged  = 0.008, indicating that the participants generally
demonstrated little or no bias in their use of the two response
choices. Both participants performed perfectly in discriminating the two consonant pairs /p,b/ and /d,c/. For the remaining pairs, the participants’ relative performance levels
were mixed as one participant performed better than the
other with some pairs but not others. In all cases, d⬘ was
greater than 1.0, a typical criterion for discrimination threshold, indicating that the coding scheme succeeded in providing the cues needed for the discrimination of the consonant
pairs.
V. DISCUSSION
The coding scheme used in the present study was an extension of the scheme presented in Yuan (2003), where the envelope of the low-frequency speech band (<350 Hz) was modulated with a 50-Hz vibration at the thumb and the envelope of the high-frequency speech band (>3000 Hz) was modulated with a 250-Hz vibration at the index finger. That scheme was successful in pairwise discrimination of initial consonants that contrasted in voicing only: on average, a discriminability of roughly 90% and a d′ of 2.4 were obtained for eight voiced-unvoiced pairs in four participants. Our coding scheme presented the low-frequency speech band (fundamental-frequency band) directly at the thumb and the envelope of the high-frequency speech band (second-formant band) at the index fingerpad, consistent with the scheme presented in Yuan (2003). The results of the two studies show similar performance: an average discriminability of 94% and a d′ of 3.5 were obtained in the present study when contrasting
five voiced-unvoiced consonant pairs, indicating that the coding scheme used in Yuan (2003) was a subset of the scheme used in the present study. The performance level obtained in the present study also appears to compare favorably with the results reported in earlier studies of tactual displays, where discrimination scores were generally less than 75% (Plant, 1989; Clements et al., 1988; Galvin et al., 1999; Waldstein and Boothroyd, 1995; Weisenberger et al., 1989; Summers et al., 2005).
In addition to incorporating the amplitude information from low- and high-frequency speech bands, as in Yuan (2003), our coding scheme displays energy information from the mid-frequency speech band in the form of temporal envelopes, as well as low-frequency motion cues from the three speech bands to the corresponding fingerpads. To the best of our knowledge, this is the first time that low-frequency motion cues have been used to encode speech spectral information. These cues provide both frequency-location and frequency-transition information of formants to the receiver's fingerpads. The transition of formants is useful for the distinction of the place of articulation feature in consonants, as indicated in Ali et al. (2001a, 2001b) and Jongman et al. (2000). Although some studies have presented contradictory results arguing that formant transitions are not useful for distinguishing place of articulation in consonants [e.g., see a review by Jongman et al. (2000)], motion waveforms extracted from the speech-to-touch coding scheme in the present study (see Fig. 8) indicate distinctions in transitions as the place of constriction during the production of consonants varies from lips to velum. The cues associated with the transition of the second formant can be observed in Fig. 8 (dashed lines). The index finger flexes at the onset of the initial bilabial plosive /b/ and stays flexed for the medial vowel /ɑ/ (first row). The index finger extends at the onset of the initial velar plosive /g/ and flexes at the onset of the medial vowel /ɑ/ (second row). Similarly, for fricatives, the index finger stays at the neutral zero position at the onset of the initial labiodental consonant /v/ (third row) and slightly extends at the onset of the initial dental fricative /ð/ (fifth row). The index finger extends for a longer duration at the onset of the initial alveolar and post-alveolar fricatives /z/ and /ʒ/ before it flexes at the onset of the medial vowel /ɑ/ (fourth and sixth rows). Thus, as the place of articulation of a consonant moves from near the lips (bilabials) to near the velum, the index finger extends more for the latter initial consonants (corresponding to an increase in F2 associated with an effective shortening of the vocal tract for velar as opposed to labial constrictions). This may explain the better performance level achieved in the present study through the utilization of place of articulation cues.
Results of the pairwise consonantal discrimination experiments in the present study showed that both participants were able to discriminate all eight consonant pairs that differed in the place of articulation feature, with an average discriminability of 88% and a d′ of 2.7. The results of previous studies with tactual displays indicate poor transmission of place cues. For example, Clements et al. (1988) used a 12-by-12 pin tactual matrix display to present acoustic features as vibrations along the two dimensions of the display, similar to the spectrogram used for speech analysis. The pairwise discrimination performance of the manner of articulation and voicing features was satisfactory (71% for voicing and 80% for manner), but discriminability of the place of articulation distinction was poorer, i.e., 66%. Even with the multi-channel spectral display of the Queen's vocoder studied by Weisenberger et al. (1989), place of articulation was not discriminated as well as other features (65% for place compared to 75% for manner and 70% for voicing). In other studies, discriminability of place of articulation was at chance level (Waldstein and Boothroyd, 1995; Plant, 1989; Summers et al., 2005; Weisenberger and Percy, 1995; Galvin et al., 1999). Therefore, it appears that the present coding scheme was able to transmit the place of articulation feature more successfully than has been demonstrated previously.
Consonants contrasting in manner of articulation have been shown to be well discriminated with the tactile displays of previous studies, i.e., 80% in Clements et al. (1988), 75% in Weisenberger et al. (1989), ≤90% in Weisenberger and Percy (1995), 70% in Plant (1989), and <85% in Summers et al. (2005). In the present study, the discriminability of manner of articulation was always greater than 90%, except for the /s/-/tʃ/ contrast (88%) by S1 and the /d/-/dʒ/ contrast (84%) by S2. The manner of articulation distinction is associated with coarse spectral variations in speech, such as abrupt or smooth temporal variations (e.g., plosives vs fricatives) or the combination of both (as in affricates). The manner discrimination results obtained in the present study are comparable to the best performance obtained with previous tactile speech displays.
A major distinction between the present and previous studies is that the previous displays utilized either the tactile or the kinesthetic sensory system, but not both, to transmit acoustic and phonetic features associated with consonantal and vocalic contrasts (Bliss, 1962; Tan et al., 1997). The two sensory systems are perceptually independent (Bolanowski et al., 1988; Israr et al., 2006) and can be utilized simultaneously to improve the transmission of features associated with speech signals. Tan et al. (1999, 2003) formed a set of synthetic waveforms spanning the two sensory systems and demonstrated that relatively high rates of information could be transmitted through the tactual sense. In the present study, we utilized the entire kinesthetic-cutaneous sensory continuum in an effort to broaden the dynamic range of tactual perception, similar to the Tadoma method, and to improve speech transfer through the human somatosensory system. Our results demonstrate that with the new controller and the coding scheme reported here, which engage both the kinesthetic and cutaneous aspects of the somatosensory system, normal-hearing participants were able to discriminate consonantal features at a level that is similar to or higher than those reported by previous studies with other types of tactual speech-information displays.
ACKNOWLEDGMENTS
This research was supported by Research Grant No. R01-DC00126 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health.
1. The first Tactuator was developed at MIT (Tan and Rabinowitz, 1996). A second device, the TactuatorII, was subsequently developed at Purdue University with essentially the same hardware.
2. The unit "dB re 1 μm peak" is commonly used with HDTs. It is computed as 20 log10(A / 1.0), where A denotes the amplitude (in μm) of the sinusoidal signal that can be detected by a human participant at a specific frequency.
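The conversion in footnote 2 can be written directly; the function name below is ours, not from the paper.

```python
import math

def db_re_1um_peak(amplitude_um):
    """Convert a sinusoid's peak amplitude (in micrometers) to dB re 1 um peak,
    i.e., 20 * log10(A / 1.0) per footnote 2."""
    return 20.0 * math.log10(amplitude_um / 1.0)
```

For example, a 10-μm peak amplitude corresponds to 20 dB re 1 μm peak.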
Ali, A. M. A., Van der Spiegel, J., and Mueller, P. (2001a). "Acoustic-phonetic features for the automatic classification of fricatives," J. Acoust. Soc. Am. 109, 2217–2235.
Ali, A. M. A., Van der Spiegel, J., and Mueller, P. (2001b). "Acoustic-phonetic features for the automatic classification of stop consonants," IEEE Trans. Speech Audio Process. 9, 833–841.
Bliss, J. C. (1962). "Kinesthetic-tactile communications," IRE Trans. Inf. Theory 8, 92–99.
Bolanowski, S. J., Gescheider, G. A., Verrillo, R. T., and Checkosky, C. M. (1988). "Four channels mediate the mechanical aspects of touch," J. Acoust. Soc. Am. 84, 1680–1694.
Brisben, A. J., Hsiao, S. S., and Johnson, K. O. (1999). "Detection of vibration transmitted through an object grasped in the hand," J. Neurophysiol. 81, 1548–1558.
Clements, M. A., Braida, L. D., and Durlach, N. I. (1988). "Tactile communication of speech: Comparison of two computer-based displays," J. Rehabil. Res. Dev. 25, 25–44.
Craig, J. C., and Evans, P. M. (1987). "Vibrotactile masking and the persistence of tactual features," Percept. Psychophys. 42, 309–371.
Evans, P. M. (1987). "Vibrotactile masking: Temporal integration, persistence and strengths of representations," Percept. Psychophys. 42, 515–525.
Galvin, K. L., Mavrias, G., Moore, A., Cowan, R. S. C., Blamey, P. J., and Clark, G. M. (1999). "A comparison of Tactaid II+ and Tactaid 7 use by adults with a profound hearing impairment," Ear Hear. 20, 471–482.
Gescheider, G. A., Capraro, A. J., Frisina, R. D., Hamer, R. D., and Verrillo, R. T. (1978). "The effects of a surround on vibrotactile thresholds," Sens. Processes 2, 99–115.
Goble, A. K., Collins, A. A., and Cholewiak, R. W. (1996). "Vibrotactile threshold in young and old observers: The effects of spatial summation and the presence of a rigid surround," J. Acoust. Soc. Am. 99, 2256–2269.
Grant, K. W., Ardell, L. H., Kuhl, P. K., and Sparks, D. W. (1985). "The contribution of fundamental frequency, amplitude envelope, and voicing duration cues to speechreading in normal-hearing subjects," J. Acoust. Soc. Am. 77, 671–677.
Israr, A. (2007). "Tactual transmission of phonetic features," Ph.D. thesis, Purdue University, West Lafayette, IN.
Israr, A., Tan, H. Z., and Reed, C. M. (2006). "Frequency and amplitude discrimination along the kinesthetic-cutaneous continuum in the presence of masking stimuli," J. Acoust. Soc. Am. 120, 2789–2800.
Jongman, A., Wayland, R., and Wong, S. (2000). "Acoustic characteristics of English fricatives," J. Acoust. Soc. Am. 108, 1252–1263.
Leek, M. R. (2001). "Adaptive procedures in psychophysical research," Percept. Psychophys. 63, 1279–1292.
Maciejowski, J. M. (1989). Multivariable Feedback Design (Addison-Wesley, Reading, MA).
Macmillan, N. A., and Creelman, C. D. (2004). Detection Theory: A User's Guide (Lawrence Erlbaum Associates, New York).
Plant, G. (1989). "A comparison of five commercially available tactile aids," Aust. J. Audiol. 11, 11–19.
Reed, C. M., and Durlach, N. I. (1998). "Note on information transfer rates in human communication," Presence: Teleoperators & Virtual Environments 7, 509–518.
Reed, C. M., Rabinowitz, W. M., Durlach, N. I., Braida, L. D., Conway-Fithian, S., and Schultz, M. C. (1985). "Research on the Tadoma method of speech communication," J. Acoust. Soc. Am. 77, 247–257.
Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., and Ekelid, M. (1995). "Speech recognition with primarily temporal cues," Science 270, 303–304.
Summers, I. R. (1992). Tactile Aids for the Hearing Impaired (Whurr, London).
Summers, I. R., Dixon, P. R., Cooper, P. G., Gratton, D. A., Brown, B. H., and Stevens, J. C. (1994). "Vibrotactile and electrotactile perception of time-varying pulse trains," J. Acoust. Soc. Am. 95, 1548–1558.
Summers, I. R., Whybrow, J. J., Gratton, D. A., Milnes, P., Brown, B. H., and Stevens, J. C. (2005). "Tactile information transfer: A comparison of two stimulation sites," J. Acoust. Soc. Am. 118, 2527–2534.
Tan, H. Z., Durlach, N. I., Rabinowitz, W. M., Reed, C. M., and Santos, J. R. (1997). "Reception of Morse code through motional, vibrotactile and auditory stimulation," Percept. Psychophys. 59, 1004–1017.
Tan, H. Z., Durlach, N. I., Reed, C. M., and Rabinowitz, W. M. (1999). "Information transmission with a multifinger tactual display," Percept. Psychophys. 61, 993–1008.
Tan, H. Z., and Rabinowitz, W. M. (1996). "A new multi-finger tactual display," in Proceedings of the International Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, edited by K. Danai (American Society of Mechanical Engineers, New York), Vol. 58, pp. 515–522.
Tan, H. Z., Reed, C. M., Delhorne, L. A., Durlach, N. I., and Wan, N. (2003). "Temporal masking of multidimensional tactile stimuli," J. Acoust. Soc. Am. 114, 3295–3308.
Van Doren, C. L. (1990). "The effects of a surround on vibrotactile thresholds: Evidence for spatial and temporal independence in the non-Pacinian I (NPI) channel," J. Acoust. Soc. Am. 87, 2655–2661.
Verrillo, R. T., and Gescheider, G. A. (1992). "Perception via the sense of touch," in Tactile Aids for the Hearing Impaired, edited by I. R. Summers (Whurr, London), pp. 1–36.
Waldstein, R. S., and Boothroyd, A. (1995). "Comparison of two multichannel tactile devices as supplements to speechreading in a postlingually deafened adult," Ear Hear. 16, 198–208.
Weisenberger, J. M., Broadstone, S. M., and Saunders, F. A. (1989). "Evaluation of two multichannel tactile aids for the hearing impaired," J. Acoust. Soc. Am. 86, 1764–1775.
Weisenberger, J. M., and Percy, M. E. (1995). "The transmission of phoneme-level information by multichannel tactile speech perception aids," Ear Hear. 16, 392–406.
Yuan, H. (2003). "Tactual display of consonant voicing to supplement lipreading," Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA.