US3551588A - Vocoder filter system - Google Patents
- Publication number
- US3551588A
- Authority
- US
- United States
- Prior art keywords
- speech
- synthesizer
- vocoder
- circuit
- pulse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Definitions
- the present invention relates to circuitry and methods for periodically sharply varying the quality factor of resonant circuits so as to allow the stored energy of the circuit to be dissipated and has particular though not exclusive application in a speech synthesizer or in the synthesizer stage of a vocoder.
- Other types of vocoder include, for example, the correlation vocoder and the formant tracking vocoder.
- the synthesizer of many types of vocoder consists of a number of frequency channels in parallel, each channel containing a modulator and a band-pass filter.
- a signal whose pitch is that of the original speech and which contains sufficient harmonic content to cover the frequency range of interest is used to excite the channels.
- the above signal consists of a train of narrow pulses which in the case of a full channel vocoder are spaced at pitch rate and in the case of a voice-excited vocoder are spaced at a rate which is related to pitch rate but not necessarily equal to it.
- If the channel filters are sharply tuned (as is generally the case), they will ring after the application of each pulse for a time greater than the time between pulses. There will thus be energy remaining in the filters when the next pulse is applied, and depending on the relative phase of the damped oscillation it will interfere additively or otherwise with the response of the resonant circuit to the succeeding excitation signal.
- the present invention is suitable for use in the synthesizer of a channel vocoder which is excited by a series of pulses and in the channel portion of the synthesizer of a voice-excited vocoder.
- In vocoders which employ single tuned filters for the frequency channels in the synthesizer stage and wherein sample excitation pulses are applied to the filters once in every [...] propose therefore to employ a technique which we refer to as "clamping", which consists of allowing the filter to be of as high a Q during the major portion of each pitch period as is required and considerably reducing the Q at the end of each pitch period for a short time immediately before the application of the next excitation pulse, thus ensuring that there is little or no energy remaining in the filter when it is again excited.
- "clamping", which consists of allowing the filter to be of as high a Q during the major portion of each pitch period as is required and considerably reducing the Q at the end of each pitch period for a short time immediately before the application of the next excitation pulse, thus ensuring that there is little or no energy remaining in the filter when it is again excited.
- the invention provides a method of dissipating the energy stored in a resonant circuit in a vocoder synthesizer without giving rise to undesirable transients in the output.
- the duration of the clamping signal is as short as possible while being sufficiently long to allow substantially all of the stored energy to be dissipated and the signal is timed so as to end at the application of the succeeding excitation pulse.
- Various preferred embodiments of the invention include a number of variable bandwidth filters whose bandwidths are variable independently of one another or a number of filters the bandwidths of some or all of which are variable in concert.
- the resonant frequency f in cycles per second of a parallel LC circuit is given approximately by $f = \frac{1}{2\pi\sqrt{LC}}$, where L and C are the values in M.K.S. units of the inductance and the capacitance respectively.
- the range of frequencies around the resonant frequency over which the response of the resonant circuit does not fall more than 3 decibels below the value of the output at resonance is called the bandwidth and is denoted by $\delta f$.
- the bandwidth is proportional to the overall resistance of the circuit. If the resistance is arranged to be variable in response to a control signal, we obtain a filter whose bandwidth is externally controllable.
- the bandwidth of a single tuned LC filter is altered by changing the effective resistance in the circuit.
- a resonant circuit with positive damping is one in which the output decays with time, e.g. the output might be of the form $\exp(-kt)\,\sin(ct)$, where exp denotes the exponential function, c is a constant, k is a positive constant called the decay factor, and t is time measured from the moment of application of the excitation pulse.
- the circuit is said to be critically damped when the rate of energy decay is a maximum. In this case the energy is dissipated in the minimum possible time.
- Any real resonant circuit has inherent damping due to resistive losses in the circuit elements, particularly in the inductance. Generally it is required to minimize this damping in order to realize the maximum possible selectivity, but if an additional resistance is switched in parallel with a branch of a resonant circuit, for example, the result will be critical damping of the circuit if the additional resistance has been selected appropriately.
- a circuit arrangement which includes a bank of resonant circuits resonating each at a different frequency and having quality factor control means whereby the bandwidth of one of the resonant circuits may be varied.
- the invention resides in the method employed and in circuitry or apparatus which includes a synthesizer of the type described, and in a resonant circuit wherein critical damping is introduced by the method described.
- the critical damping resistance of the resonant circuit is introduced by being switched in electronically.
- a symmetrical transistor is in series with a resistor, a pulse of the appropriate duration being applied at the appropriate moment to the base of the transistor so as to switch into the resonant circuit the resistance of the transistor together with the resistance of the resistor.
- the polarities of the power supply to the transistor, of the resonant circuit, of the excitation pulses and of the pulse applied to the base of the transistor are not necessarily independent of one another.
- the bandwidths of the resonant circuits of the speech synthesizer are variable independently of one another, so that, for example, subjective tests can be made as to what settings give rise to most acceptable speech, enabling research into acceptableness and other characteristics of speech to be carried out.
- the term "speech synthesizer" includes, for example, the synthesizer stage of a vocoder, a reading machine for the blind, and a parametric artificial talker.
- FIG. 1 is a block diagram of a channel vocoder
- FIG. 2 is a circuit diagram of one form of filter for a vocoder synthesizer channel
- FIG. 3 is a circuit diagram which is equivalent to a part of FIG. 2;
- FIG. 4 is a circuit diagram which is equivalent to a different part of FIG. 2;
- FIG. 5 is a circuit diagram of a second form of filter for a vocoder synthesizer channel with clamping facilities
- FIG. 6 is a circuit diagram of a second form of filter for a vocoder synthesizer channel with bandwidth control
- FIG. 7 is a graph illustrating the variation in bandwidth in the circuits of FIGS. 2 and 4 for a range of applied quality factor control voltages
- FIG. 8 shows oscilloscope traces of output waveforms obtained for various control voltages in the circuit of FIGS. 2 and
- FIG. 9 is an oscilloscope trace showing a clamping signal and the output obtained when the clamping signal is applied;
- FIG. 10 is an oscilloscope trace of the output obtained with zero clamping signal.
- FIG. 11 is a block diagram of a parametric speech synthesizer.
- a symmetrical transistor is one which is constructed in such a way that its characteristics remain substantially unchanged when the emitter and collector connections are interchanged.
- a transistor is said to be saturated when the difference in potential between collector and emitter is less than that between base and emitter. When this occurs, the transistor undergoes bottoming, that is to say the output is determined by the characteristics of the transistor and not by the applied signal.
- a symmetrical transistor operating in the saturated mode has an effective emitter-collector resistance R given by [...], where $I_b$ is the base current.
- A is typically 40 per volt at 16 °C.
- R is about 10 ohms when $I_b$ is 100 µA.
- FIG. 1 illustrates a so-called channel vocoder having an analyzer stage from which signals are derived and carried by a number of transmission paths, one of which, 10, carries the fundamental frequency information of the speech, while the remainder convey the spectral information. These signals control the synthesizer stage which reconstructs the original speech.
- a speech signal is picked up by a microphone 11 and amplified by an amplifier 12.
- the output of the amplifier 12 is fed to a pitch detector 16 which detects whether the fundamental voice frequency is random or periodic. From 16 the pitch information goes to a frequency-to-voltage converter 17.
- the output of 17 is fed through a lowpass filter 18 and carried by the pitch channel 10 (which thus conveys the fundamental frequency or pitch information of the speech, if any) to a switch 19 and a pitch generator 20. If the pitch information is zero, a hiss generator 21 is switched on. If the pitch information exceeds a threshold value, "voicing" in the -300 c.p.s. range is provided.
- the output of the amplifier 12 goes also to a pre-emphasis equalizer 13 and thence to a bank of analyzer spectrum filters ASF whose outputs are rectified by the bank of rectifiers RECT and then applied to the bank of low-pass filters L1-L9.
- the signals carried by channels 1-9 thus convey the spectral information of the speech.
- the outputs of channels 1-9 are applied to the modulators M1-M9 where they modulate the pitch or hiss signal from 20 or 21 respectively.
- the modulated outputs of M1-M9 pass through a bank of synthesizer spectrum filters SSF to be combined at the de-emphasis equalizer 14, whence the reconstituted speech signal goes to an amplifier 15, being output in audible form at the headphones 22.
- the technique of switching in of critical damping can be utilized in some or all of the synthesizer spectrum filters SSF.
- the bandwidths of the analyzer spectrum filters ASF can be varied independently of one another or in concert.
- If the synthesizer spectrum filters are of the type shown in FIG. 2, then variation of bandwidth can be achieved in filters which can also be critically damped between successive excitation pulses.
- FIG. 2 illustrates one spectral filter and some associated circuitry from the filter bank SSF of the synthesizer stage of the vocoder of FIG. 1, the filter consisting of a resonant circuit.
- a series of excitation pulses spaced at voice fundamental rate are generated (by 20 of FIG. 1, from information provided by 16, 17 and 18) and applied to the synthesizer where they excite the tuned circuits whose outputs are combined to give an analogue response which is converted to audio form by an electromechanical transducer.
- the parallel resonant circuit consists of an inductor L1 and a capacitor C1 in the collector circuit of Q4. Excitation pulses are applied to the base of Q4. The signal output is taken to a high impedance (Q7 of FIG. 2, not shown in FIG. 3) to avoid loading the resonant circuit. Clamping is accomplished by placing in parallel with L1 and C1 a resistor R1 which is in series with a symmetrical transistor Q5. Q5 is normally biased to be nonconducting, but the application of a negative pulse of at least 10 volts amplitude turns it hard on and places the low resistance R1 across the resonant circuit, thus dissipating the energy stored therein. The clamping pulse is timed to be just long enough to allow all the energy to be dissipated and to end just before the application of the next excitation pulse.
- the capacitor C1 is taken to earth through a variable resistance P and a symmetrical transistor Q6.
- variable resistance enables the minimum bandwidth to be preset.
- the transistor is used as a variable resistance, and so must always be kept saturated. This means that only small signals can be handled and so the gain of the stage is kept low by making the emitter resistance of Q4 large, say kiloohms.
- a voltage applied for quality control delivers a current through the potential divider formed by R2 and R3 to the base of the transistor Q6.
- the bandwidth is in the range 60 c.p.s. to 500 c.p.s.
- the symmetrical PNP transistor Q6 is used as a variable resistance by keeping it saturated, in which case the collector current is independent of the base current.
- the ratio R3/R2 is approximately 10/1 in order to obtain in a convenient way the range of variation in transistor base current from Q6.
- FIGS. 5 and 6 illustrate as an alternative a series resonant form of resonant circuit for the synthesizer stage of a vocoder and having respectively clamping facilities and a control for smoothly varying the filter bandwidth.
- R10, R20, R30, L10, Q50 and Q60 correspond to R1, R2, R3, L1, Q5 and Q6 of FIGS. 2-4.
- the effective capacitance C10 of the circuit of FIGS. 5 and 6 is given by [...]. Application of a clamping pulse input to the base of Q50 in FIG. 5 results in critical damping being introduced across the resonant circuit for the duration of the pulse.
- the duration of the pulse is chosen to be such that all the energy stored in the circuit is dissipated in as short a time as possible in the interval between the application of successive excitation pulses and the timing is such that the clamping pulse ends before the application of the next excitation pulse.
- the inductance L10 is seen to be provided by the primary winding of a transformer. Variation in bandwidth is achieved by applying a negative control voltage through the potential divider formed by the resistance R20 and R30 to the base of a saturated symmetrical transistor Q60.
- In FIGS. 2-6 the symmetrical transistors have been PNP transistors.
- NPN transistors can be used throughout (though symmetrical NPN transistors are not readily commercially available at present), appropriate changes, e.g. in the polarity of the power supplies, being made in the circuits.
- the excitation pulse of FIGS. 2-6 and the clamping pulse of FIGS. 2, 3 and 5 are negative-going signals.
- the control voltage of FIGS. 2, 4 and 6 is a negative voltage. These require to be positive if the polarities are reversed.
- Field effect transistors can be used in place of symmetrical PNP transistors, with appropriate modifications in the circuits.
- FIG. 7 illustrates the control characteristic obtained by applying a quality control voltage as described with reference to FIGS. 2 and 4.
- the curve is seen to vary smoothly over the range I c.p.s. to 300 c.p.s. which is commonly of most utility in vocoder application.
- FIG. 8 illustrates output waveforms obtained by repetitive stimulation of the tuned circuit for various values of the control voltage (equivalently, for various bandwidths).
- clamping is achieved by placing the critical damping resistance in series with a symmetrical transistor across the inductor.
- the base of Q5 is biased so that normally it is turned off, but the application at the input labeled "clamping pulse" of a negative pulse (of 10 volts amplitude for the filter of FIGS. 1 and 2) turns it hard on and the oscillations of the tuned circuit die away rapidly.
- the time required for all the energy to be dissipated depends on the resonant frequency, being greater for the lower frequencies. For the lowest frequency channels, about 400 µsec. is required, but as this may represent about 20 percent of the time between excitation pulses some compromise will be needed.
- the clamping pulse should end at or just after the commencement of the excitation pulse.
- the upper trace of FIG. 9 illustrates the response of a single filter of resonant frequency 1,000 c.p.s. which is excited by a train of pulses whose repetition rate is constant at 250 p.p.s.
- the lower trace, which is to be interpreted as a relatively short negative pulse of amplitude 10 volts and not as a long positive pulse, illustrates the form of the clamping signal which is applied to the base of Q5 so as to prevent energy from being carried over between one excitation period and the next.
- FIG. 10 shows the response of the same filter to the same pulse train but with the clamping signal removed, and shows the reduced response due to destructive interference between the oscillations of successive pitch periods.
- FIG. 11 illustrates a synthesizer which produces artificial speech.
- a control unit 30 produces control signals, the control unit itself being governed by, for example, a digital computer with paper tape output, or by a voltage reader which reads frequency and spectral information presented as manually prepared patterns of electrically conducting ink on rolls of paper.
- a fundamental frequency signal F is applied to a pulse generator 31, the output of which is applied to a voice amplitude circuit 32 whose operation is controlled by an amplitude signal A and whose output is fed through a low-pass filter 33 and thence through resonant circuits 34 (F3), 35 (F2) and 36 (F1) to an integrator 37, and thence through a network which includes a variable resistance 38 to an amplifier A3, the output being an audible signal.
- a noise generator 39 feeds a first noise amplitude circuit 40 which is controlled by the parameter Al.
- the output of 40 passes through a high pass filter 41 and thence goes to the resonant circuits 34 (F3), 35 (F2) and 36 (F1), passing next to the integrator 37 and the variable resistance 38 and finally to the output amplifier A3.
- the output passes through a variable resistor 44 and is then output as an audible signal from the amplifier A3.
- Essentially 36, 35 and 34 mirror the first three formants of natural speech, i.e. they resonate at around the natural frequencies of the vocal tract. In the synthesis of an utterance, they are modified in accordance with the alteration in shape which the human vocal tract would undergo during speech production.
- Front fricatives include, for example, the English labiodental fricatives /f/ and /v/ and the alveolar fricatives /s/ and /z/.
- Back fricatives include the English palatal fricatives /ʃ/ and /ʒ/ and the glottal spirant /h/.
- the facility of dissipating energy in the interval between successive excitation pulses also has application in the filters 34 (F3), 35 (F2) and 36 (F1).
- a speech synthesizer which includes a plurality of resonant circuits, each having inherent damping, each resonating at different frequencies, excitation means associated with each of said resonant circuits for repeatedly applying momentary excitation pulses to the associated circuit thereby causing damped oscillations therein, and means for selectively controlling said excitation means to cause the amplitude of the oscillations in any one of the circuits to decay to an arbitrarily low value before the application of the next successive excitation pulse to that same circuit.
- each resonant circuit has a quality factor
- said excitation control means comprises means for lowering the quality factor of each of the resonant circuits for a portion of the time between the application of successive excitation pulses.
- a speech synthesizer as claimed in claim 2 including resistance means associated with each resonant circuit for damping the associated resonant circuit, and means for switching a resistance means into the associated resonant circuit to give critical damping in the resonant circuit for some discrete time after the application of one excitation pulse, and means for removing said resistance means from said resonant circuit before the application of the next succeeding excitation pulse.
- a speech synthesizer as claimed in claim 3 further including an inductor and transistor connected in parallel, and means for connecting the critical damping resistance between said inductance and the emitter of said transistor.
- a speech synthesizer as claimed in claim 1, and further including means for varying the bandwidths of the resonant circuits independently of one another.
- a speech synthesizer as claimed in claim 1, and further including an inductance in one of the resonant circuits adjacent one winding of a transformer.
- a speech synthesizer as claimed in claim 10 wherein the said variable resistance is a transistor.
- a speech synthesizer as claimed in claim 1 wherein said resonant circuit has associated therewith a variable resistance for connection into said resonant circuit to preset the bandwidth of the filter.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Networks Using Active Elements (AREA)
Description
United States Patent 3,551,588
Inventor: Donald Anthony Acott Roworth, London, England
Appl. No.: 649,518
Filed: June 28, 1967
Patented: Dec. 29, 1970
Assignee: International Standard Electric Corporation, New York, N.Y., a corporation of Delaware
Priority: Great Britain, Nos. 29684/66, 29685/66 and 29686/66
References Cited, UNITED STATES PATENTS:
2,819,341 1/1958 Barney 179/1(AS)
3,330,910 7/1967 Flanagan 179/1(AS)X
3,423,530 1/1969 Coulter 179/1(AS)
Primary Examiner: Kathleen H. Claffy
Assistant Examiner: Charles W. Jirauch
Attorneys: C. Cornell Remsen, Jr., Rayson P. Morris, Percy P. Lantzy, J. Warren Whitesel, Phillip A. Weiss and Delbert P. Warner
VOCODER FILTER SYSTEM
13 Claims, 11 Drawing Figs.
U.S. Cl.: 179/1, 179/15.55
Int. Cl.: G10l 1/00
Field of Search: 179/1(AS)
ABSTRACT: A vocoder or voice synthesizer includes a plurality of filters which are used to analyze human speech signals so that they may be synthesized. The original voice signal is separated by a plurality of high-Q filters. A variable resistor is coupled across these filters to selectively increase or decrease the quality factor or Q of the filters. Time-controlled clamping means are provided for alternately sampling speech via high-Q filters and then discharging the filter elements by lowering the Q. This way, the energy stored in the filters during one speech sample does not degrade the response to the next speech sample.
VOCODER FILTER SYSTEM
The present invention relates to circuitry and methods for periodically sharply varying the quality factor of resonant circuits so as to allow the stored energy of the circuit to be dissipated and has particular though not exclusive application in a speech synthesizer or in the synthesizer stage of a vocoder.
A vocoder is a speech transmission system comprising an analyzer which produces control signals derived from a speech input, a transmission line for the said control signals, and a synthesizer. The bandwidth required to transmit the control signals is less than that required to transmit the speech in analogue form. The control signals can be multiplexed, and are not subject to the same kinds of distortion as are analogue speech signals. In a voice-excited vocoder, for example, a base band covering a frequency range such as 80-600 c.p.s. is transmitted in analogue form and is added at the synthesizer to the output of the parallel frequency channels which cover the remaining part of the speech bandwidth. The base band is also subjected at the synthesizer to nonlinear distortion to provide the excitation signal for the stimulation of the frequency channels.
Other types of vocoder include, for example, the correlation vocoder and the formant tracking vocoder.
The synthesizer of many types of vocoder consists of a number of frequency channels in parallel, each channel containing a modulator and a band-pass filter. A signal whose pitch is that of the original speech and which contains sufficient harmonic content to cover the frequency range of interest is used to excite the channels. Commonly the above signal consists of a train of narrow pulses which in the case of a full channel vocoder are spaced at pitch rate and in the case of a voice-excited vocoder are spaced at a rate which is related to pitch rate but not necessarily equal to it. If the channel filters are sharply tuned (as is generally the case), they will ring after the application of each pulse for a time greater than the time between pulses. There will thus be energy remaining in the filters when the next pulse is applied, and depending on the relative phase of the damped oscillation it will interfere additively or otherwise with the response of the resonant circuit to the succeeding excitation signal.
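The scale of the ringing problem can be estimated with a short calculation. The sketch below is illustrative only: it assumes a single-tuned second-order filter whose output envelope decays as exp(-pi * bandwidth * t), and it uses a 60 c.p.s. bandwidth and a 250 pulses-per-second excitation rate as example values. The function names are hypothetical and do not come from the specification.

```python
import math


def ring_time(bandwidth_hz, level=0.01):
    """Seconds for the envelope to fall to `level` of its starting value.

    Assumes an envelope of exp(-pi * bandwidth_hz * t), i.e. a
    single-tuned second-order filter (an assumption of this sketch).
    """
    if bandwidth_hz <= 0 or not 0.0 < level < 1.0:
        raise ValueError("bandwidth must be positive and 0 < level < 1")
    return math.log(1.0 / level) / (math.pi * bandwidth_hz)


def carried_over(bandwidth_hz, pulse_rate):
    """Fraction of the envelope still present when the next pulse arrives."""
    return math.exp(-math.pi * bandwidth_hz / pulse_rate)


if __name__ == "__main__":
    period = 1.0 / 250.0
    print("pitch period (s):", period)
    print("time to ring down to 1 percent (s):", ring_time(60.0))
    print("fraction carried into next period:", carried_over(60.0, 250.0))
```

Under these assumptions the ring-down time is several pitch periods long, which is the situation the paragraph above describes.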
The present invention is suitable for use in the synthesizer of a channel vocoder which is excited by a series of pulses and in the channel portion of the synthesizer of a voice-excited vocoder.
In vocoders which employ single tuned filters for the frequency channels in the synthesizer stage and wherein sample excitation pulses are applied to the filters once in every [...] propose therefore to employ a technique which we refer to as "clamping", which consists of allowing the filter to be of as high a Q during the major portion of each pitch period as is required and considerably reducing the Q at the end of each pitch period for a short time immediately before the application of the next excitation pulse, thus ensuring that there is little or no energy remaining in the filter when it is again excited.
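The effect of clamping can be illustrated with a deliberately crude envelope model, sketched below. It is not the circuit of the invention: the filter is represented by a single complex phasor that decays at a low rate for most of each pitch period and at a much higher rate during the clamp interval. The clamped decay rate of 2*pi*f0 per second, the 400 microsecond clamp, and all names are assumptions made for this illustration.

```python
import cmath
import math


def residual_before_pulses(f0, bandwidth, pulse_rate,
                           clamp_time=0.0, clamp_decay=0.0, n_pulses=50):
    """Phasor magnitude remaining just before each excitation pulse.

    f0, bandwidth and pulse_rate are in cycles (or pulses) per second.
    clamp_time is the length of the low-Q interval at the end of each
    period; clamp_decay is the assumed envelope decay rate (1/s) in it.
    """
    period = 1.0 / pulse_rate
    if not 0.0 <= clamp_time < period:
        raise ValueError("clamp_time must lie within one pitch period")
    k_free = math.pi * bandwidth          # high-Q envelope decay rate
    omega = 2.0 * math.pi * f0
    step = (cmath.exp((-k_free + 1j * omega) * (period - clamp_time))
            * cmath.exp((-clamp_decay + 1j * omega) * clamp_time))
    z = 0j
    residuals = []
    for _ in range(n_pulses):
        z += 1.0                          # unit excitation pulse
        z *= step                         # evolve to just before next pulse
        residuals.append(abs(z))
    return residuals


if __name__ == "__main__":
    free = residual_before_pulses(1000.0, 60.0, 250.0)
    clamped = residual_before_pulses(1000.0, 60.0, 250.0,
                                     clamp_time=400e-6,
                                     clamp_decay=2.0 * math.pi * 1000.0)
    print("residual without clamping:", free[-1])
    print("residual with clamping:   ", clamped[-1])
```

In this model the unclamped filter carries a large fraction of its amplitude into the next period, whereas the clamped filter starts each period nearly empty, so each pulse produces essentially the same response.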
The invention provides a method of dissipating the energy stored in a resonant circuit in a vocoder synthesizer without giving rise to undesirable transients in the output.
The duration of the clamping signal is as short as possible while being sufficiently long to allow substantially all of the stored energy to be dissipated and the signal is timed so as to end at the application of the succeeding excitation pulse.
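The timing rule just stated can be written down as a small scheduling sketch. It assumes a constant pulse rate and a fixed clamp duration, which real speech would not give; the function names and the example figures (250 pulses per second, a 400 microsecond clamp) are illustrative assumptions.

```python
def clamp_window(next_pulse_time, clamp_duration):
    """(start, end) of a clamp interval that ends at the next excitation pulse."""
    if clamp_duration < 0:
        raise ValueError("clamp_duration must be non-negative")
    return next_pulse_time - clamp_duration, next_pulse_time


def clamp_schedule(pulse_rate, clamp_duration, n_periods):
    """Clamp windows for the first n_periods of a constant-rate pulse train."""
    period = 1.0 / pulse_rate
    if clamp_duration >= period:
        raise ValueError("clamp must be shorter than the pitch period")
    return [clamp_window((i + 1) * period, clamp_duration)
            for i in range(n_periods)]


def clamped_fraction(pulse_rate, clamp_duration):
    """Fraction of each pitch period during which the filter is clamped."""
    return clamp_duration * pulse_rate


if __name__ == "__main__":
    for start, end in clamp_schedule(250.0, 400e-6, 3):
        print("clamp from", start, "to", end)
    print("fraction of period clamped:", clamped_fraction(250.0, 400e-6))
```

The clamped fraction grows with pitch rate for a fixed clamp duration, which is why a compromise between complete discharge and useful ringing time may be needed at high pitch.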
Various preferred embodiments of the invention include a number of variable bandwidth filters whose bandwidths are variable independently of one another or a number of filters the bandwidths of some or all of which are variable in concert.
The resonant frequency f in cycles per second of a parallel LC circuit is given approximately by
$$f = \frac{1}{2\pi\sqrt{LC}} \qquad (1)$$
where L and C are the values in M.K.S. units of the inductance and the capacitance respectively.
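As a worked instance of the resonant-frequency relation, the sketch below evaluates it for one pair of component values and also inverts it to pick a capacitance for a wanted frequency. The 0.1 henry inductance and the 1,000 c.p.s. target are hypothetical values chosen for the example, not values from the specification.

```python
import math


def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency in cycles per second of an ideal LC circuit."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("L and C must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def capacitance_for(frequency_hz, inductance_h):
    """Capacitance in farads that tunes a given inductance to frequency_hz."""
    if frequency_hz <= 0 or inductance_h <= 0:
        raise ValueError("frequency and L must be positive")
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)


if __name__ == "__main__":
    c = capacitance_for(1000.0, 0.1)      # hypothetical 0.1 H inductor
    print("capacitance (F):", c)
    print("check, resonant frequency (c.p.s.):", resonant_frequency(0.1, c))
```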
The range of frequencies around the resonant frequency over which the response of the resonant circuit does not fall more than 3 decibels below the value of the output at resonance is called the bandwidth and is denoted by $\delta f$.
The quality factor Q of the circuit is defined by the formula
$$Q = \frac{f}{\delta f} \qquad (2)$$
where f is the resonant frequency. Q is thus a measure of resolution. Q may be shown to satisfy the relation [...] if C has been chosen to give the required f.
Thus the bandwidth is proportional to the overall resistance of the circuit. If the resistance is arranged to be variable in response to a control signal, we obtain a filter whose bandwidth is externally controllable.
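The proportionality between bandwidth and resistance can be made concrete under one assumed circuit model. The relation referred to in the text is not legible in this copy, so the sketch below assumes the textbook case of a loss resistance R in series with the inductance L, for which Q = 2*pi*f*L/R and hence bandwidth = R/(2*pi*L). That model, the component values, and the function names are assumptions for illustration only.

```python
import math


def quality_factor(resonant_hz, bandwidth_hz):
    """Q as the ratio of resonant frequency to 3 dB bandwidth."""
    if resonant_hz <= 0 or bandwidth_hz <= 0:
        raise ValueError("frequency and bandwidth must be positive")
    return resonant_hz / bandwidth_hz


def bandwidth_from_series_loss(resistance_ohm, inductance_h):
    """3 dB bandwidth for an assumed series-loss model: R / (2*pi*L)."""
    if resistance_ohm <= 0 or inductance_h <= 0:
        raise ValueError("R and L must be positive")
    return resistance_ohm / (2.0 * math.pi * inductance_h)


if __name__ == "__main__":
    # Hypothetical 0.1 H inductor; sweep the effective series resistance.
    for r in (37.7, 75.4, 314.2):
        bw = bandwidth_from_series_loss(r, 0.1)
        print("R =", r, "ohm, bandwidth =", bw, "c.p.s., Q at 1000 c.p.s. =",
              quality_factor(1000.0, bw))
```

Doubling the resistance in this model doubles the bandwidth and halves Q, which is the behaviour a control-voltage-driven variable resistance would exploit.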
For a mathematical analysis of resonant circuits, reference may be made, for example, to J. Millman and H. Taub's Pulse, Digital and Switching Waveforms: Devices and Circuits for Their Generation and Processing (McGraw-Hill 1965). Obviously the complete mathematical analysis of a series resonant circuit will differ from that of a parallel resonant circuit. However the techniques of clamping and of smoothly varying the bandwidth of a filter which are provided by the present invention have application both in series and in parallel tuned circuits.
In one version of the preferred embodiment of the invention, the bandwidth of a single tuned LC filter is altered by changing the effective resistance in the circuit.
A resonant circuit with positive damping is one in which the output decays with time, e.g. the output might be of the form $\exp(-kt)\,\sin(ct)$, where exp denotes the exponential function, c is a constant, k is a positive constant called the decay factor, and t is time measured from the moment of application of the excitation pulse.
The circuit is said to be critically damped when the rate of energy decay is a maximum. In this case the energy is dissipated in the minimum possible time.
Any real resonant circuit has inherent damping due to resistive losses in the circuit elements, particularly in the inductance. Generally it is required to minimize this damping in order to realize the maximum possible selectivity, but if an additional resistance is switched in parallel with a branch of a resonant circuit, for example, the result will be critical damping of the circuit if the additional resistance has been selected appropriately.
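For the idealized case of a lossless inductor and capacitor shunted by a single resistor, the resistance that gives critical damping follows from the standard second-order analysis: the decay rate 1/(2RC) equals the undamped angular frequency 1/√(LC) when R = ½√(L/C). The sketch below assumes this textbook model and ignores the resistance of the inductor and of the switching transistor; the names and component values are hypothetical.

```python
import math

def critical_shunt_resistance(L, C):
    """Shunt resistance giving critical damping of an ideal parallel LC circuit."""
    return 0.5 * math.sqrt(L / C)

def damping_ratio(L, C, R_shunt):
    """Damping ratio of an ideal parallel RLC circuit: decay rate / omega0."""
    alpha = 1.0 / (2.0 * R_shunt * C)      # decay rate, 1/s
    omega0 = 1.0 / math.sqrt(L * C)        # undamped angular frequency, rad/s
    return alpha / omega0

# Hypothetical values: 1 H and 25.33 nF resonate near 1,000 c.p.s.
L_H, C_F = 1.0, 25.33e-9
R_crit = critical_shunt_resistance(L_H, C_F)
print(f"critical shunt resistance = {R_crit:.0f} ohm")
print(f"damping ratio at R_crit   = {damping_ratio(L_H, C_F, R_crit):.3f}")
```

A shunt resistance below this value overdamps the circuit and one above it leaves the circuit oscillatory, so in this model the switched-in resistance has a single appropriate value for each filter.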
According to the invention there is provided a circuit arrangement which includes a bank of resonant circuits resonating each at a different frequency and having quality factor control means whereby the bandwidth of one of the resonant circuits may be varied.
The invention resides in the method employed and in circuitry or apparatus which includes a synthesizer of the type described, and in a resonant circuit wherein critical damping is introduced by the method described.
In one embodiment of the invention, the critical damping resistance of the resonant circuit is introduced by being switched in electronically.
In a particular embodiment of the above paragraph a symmetrical transistor is in series with a resistor, a pulse of the appropriate duration being applied at the appropriate moment to the base of the transistor so as to switch into the resonant circuit the resistance of the transistor together with the resistance of the resistor.
The polarities of the power supply to the transistor, of the resonant circuit, of the excitation pulses and of the pulse applied to the base of the transistor are not necessarily independent of one another.
In a preferred embodiment of the invention, the bandwidths of the resonant circuits of the speech synthesizer are variable independently of one another, so that, for example, subjective tests can be made as to what settings give rise to most acceptable speech, enabling research into acceptableness and other characteristics of speech to be carried out.
The term "speech synthesizer" includes, for example, the synthesizer stage of a vocoder, a reading machine for the blind, and a parametric artificial talker.
The above-mentioned and other features of the invention and the manner of attaining them will become more apparent and the invention itself will be best understood by reference to the following description of an embodiment of the invention taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a block diagram of a channel vocoder;
FIG. 2 is a circuit diagram of one form of filter for a vocoder synthesizer channel;
FIG. 3 is a circuit diagram which is equivalent to a part of FIG. 2;
FIG. 4 is a circuit diagram which is equivalent to a different part of FIG. 2;
FIG. 5 is a circuit diagram of a second form of filter for a vocoder synthesizer channel with clamping facilities;
FIG. 6 is a circuit diagram of a second form of filter for a vocoder synthesizer channel with bandwidth control;
FIG. 7 is a graph illustrating the variation in bandwidth in the circuits of FIGS. 2 and 4 for a range of applied quality factor control voltages;
FIG. 8 shows oscilloscope traces of output waveforms obtained for various control voltages in the circuit of FIGS. 2 and 4;
FIG. 9 is an oscilloscope trace showing a clamping signal and the output obtained when the clamping signal is applied;
FIG. 10 is an oscilloscope trace of the output obtained with zero clamping signal; and
FIG. 11 is a block diagram of a parametric speech synthesizer.
The definitions of some terms used in the following description will now be introduced.
A symmetrical transistor is one which is constructed in such a way that its characteristics remain substantially unchanged when the emitter and collector connections are interchanged.
A transistor is said to be saturated when the difference in potential between collector and emitter is less than that between base and emitter. When this occurs, the transistor undergoes bottoming, that is to say the output is determined by the characteristics of the transistor and not by the applied signal.
A symmetrical transistor operating in the saturated mode has an effective emitter-collector resistance R given by a relation in which $I_b$ is the base current, $\beta$ the common emitter current gain, and $\Lambda$ is typically 40 per volt at 16 °C.
For a typical transistor of the kind utilized as Q5 or Q6 in the circuits of FIGS. 2, 3, and 4, or as Q50 in FIG. 5 or as Q60 in FIG. 6, R is about 10 ohms when $I_b$ is 100 µA.
FIG. 1 illustrates a so-called channel vocoder having an analyzer stage from which signals are derived and carried by a number of transmission paths, one of which, 10, carries the fundamental frequency information of the speech, while the remainder convey the spectral information. These signals control the synthesizer stage which reconstructs the original speech. In more detail, a speech signal is picked up by a microphone 11 and amplified by an amplifier 12. The output of the amplifier 12 is fed to a pitch detector 16 which detects whether the fundamental voice frequency is random or periodic. From 16 the pitch information goes to a frequency-to-voltage converter 17. The output of 17 is fed through a low-pass filter 18 and carried by the pitch channel 10 (which thus conveys the fundamental frequency or pitch information of the speech, if any) to a switch 19 and a pitch generator 20. If the pitch information is zero, a hiss generator 21 is switched on. If the pitch information exceeds a threshold value, "voicing" in the -300 c.p.s. range is provided.
The output of the amplifier 12 goes also to a pre-emphasis equalizer 13 and thence to a bank of analyzer spectrum filters ASF whose outputs are rectified by the bank of rectifiers RECT and then applied to the bank of low-pass filters L1-L9. The signals carried by channels 1-9 thus convey the spectral information of the speech. The outputs of channels 1-9 are applied to the modulators M1-M9 where they modulate the pitch or hiss signal from 20 or 21 respectively. The modulated outputs of M1-M9 pass through a bank of synthesizer spectrum filters SSF to be combined at the de-emphasis equalizer 14, whence the reconstituted speech signal goes to an amplifier 15, being output in audible form at the headphones 22.
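The excitation and modulation steps of the synthesizer stage can be sketched in discrete time as follows. This is only an illustration under stated assumptions: the sampling rate, the function names and the use of uniform random noise for the hiss are hypothetical, and the synthesizer spectrum filters SSF and the equalizers are omitted.

```python
import random

def excitation(pitch_hz, n_samples, sample_rate=8000, seed=0):
    """Return a pulse train at pitch_hz, or noise ("hiss") when pitch_hz is zero."""
    if pitch_hz <= 0:
        rng = random.Random(seed)
        return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    period = round(sample_rate / pitch_hz)   # samples between excitation pulses
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def modulate(excitation_signal, channel_levels):
    """Scale the excitation by each channel's spectral level (one output per channel)."""
    return [[level * x for x in excitation_signal] for level in channel_levels]

voiced = excitation(250.0, 64)               # periodic pulses for voiced speech
hiss = excitation(0.0, 64)                   # noise when no pitch is detected
channels = modulate(voiced, [0.5] * 9)       # nine channels, as in FIG. 1
print(sum(voiced), len(hiss), len(channels))
```

In a fuller model each of the nine modulated signals would pass through its own resonant filter before the outputs are summed.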
The technique of switching in of critical damping can be utilized in some or all of the synthesizer spectrum filters SSF.
The bandwidths of the analyzer spectrum filters ASF can be varied independently of one another or in concert.
If the synthesizer spectrum filters are of the type shown in FIG. 2, then variation of bandwidth can be achieved in filters which can also be critically damped between successive excitation pulses.
FIG. 2 (and FIGS. 3 and 4, which are different portions of FIG. 2, corresponding respectively to the clamping facility and to the control for smoothly varying the bandwidth) illustrates one spectral filter and some associated circuitry from the filter bank SSF of the synthesizer stage of the vocoder of FIG. 1, the filter consisting of a resonant circuit. A series of excitation pulses spaced at voice fundamental rate are generated (by 20 of FIG. 1, from information provided by 16, 17 and 18) and applied to the synthesizer where they excite the tuned circuits whose outputs are combined to give an analogue response which is converted to audio form by an electromechanical transducer.
The parallel resonant circuit consists of an inductor L1 and a capacitor C1 in the collector circuit of Q4. Excitation pulses are applied to the base of Q4. The signal output is taken to a high impedance (Q7 of FIG. 2, not shown in FIG. 3) to avoid loading the resonant circuit. Clamping is accomplished by placing in parallel with L1 and C1 a resistor R1 which is in series with a symmetrical transistor Q5. Q5 is normally biased to be nonconducting, but the application of a negative pulse of at least 10 volts amplitude turns it hard on and places the low resistance R1 across the resonant circuit, thus dissipating the energy stored therein. The clamping pulse is timed to be just long enough to allow all the energy to be dissipated and to end just before the application of the next excitation pulse.
The capacitor C1 is taken to earth through a variable resistance P and a symmetrical transistor Q6.
The variable resistance enables the minimum bandwidth to be preset. The transistor is used as a variable resistance, and so must always be kept saturated. This means that only small signals can be handled and so the gain of the stage is kept low by making the emitter resistance of Q4 large, say kiloohms.
A voltage applied for quality control delivers a current through the potential divider formed by R2 and R3 to the base of the transistor Q6. For an applied negative voltage in the range 10 volts to 1.5 volts the bandwidth is in the range 60 c.p.s. to 500 c.p.s.
The symmetrical PNP transistor Q6 is used as a variable resistance by keeping it saturated, in which case the collector current is independent of the base current.
Then the bandwidth $\delta f$ for a given frequency f is given by equation (4), where R is the total resistance of the circuit, and so $\delta f = (R_L + R_T)/(2\pi L)$ (6) (since $R = R_L + R_T$, where $R_L$ is the resistance of the inductor and $R_T$ the effective resistance of the transistor), i.e. $\delta f = A + B/I_b$, where A and B are constants. Thus a parabolic relationship holds between $\delta f$ and $I_b$ over a certain range.
(The ratio R3/R2 is approximately 10/1 in order to obtain in a convenient way the range of variation in transistor base current for Q6.)
FIGS. 5 and 6 illustrate as an alternative a series resonant form of resonant circuit for the synthesizer stage of a vocoder and having respectively clamping facilities and a control for smoothly varying the filter bandwidth. R10, R20, R30, L10, Q50 and Q60 correspond to R1, R2, R3, L1, Q5 and Q6 of FIGS. 2-4. The effective capacitance C10 of the circuit of FIGS. 5 and 6 is given by
Application of a clamping pulse input to the base of Q50 in FIG. 5 results in critical damping being introduced across the resonant circuit for the duration of the pulse.
The duration of the pulse is chosen to be such that all the energy stored in the circuit is dissipated in as short a time as possible in the interval between the application of successive excitation pulses and the timing is such that the clamping pulse ends before the application of the next excitation pulse.
In FIG. 6 the inductance L10 is seen to be provided by the primary winding of a transformer. Variation in bandwidth is achieved by applying a negative control voltage through the potential divider formed by the resistance R20 and R30 to the base of a saturated symmetrical transistor Q60.
In FIGS. 2-6 the symmetrical transistors have been PNP transistors. NPN transistors can be used throughout (though symmetrical NPN transistors are not readily commercially available at present), appropriate changes, e.g. in the polarity of the power supplies, being made in the circuits. The excitation pulse of FIGS. 2-6 and the clamping pulse of FIGS. 2, 3 and 5 are negative-going signals. The control voltage of FIGS. 2, 4 and 6 is a negative voltage. These require to be positive if the polarities are reversed. Field effect transistors can be used in place of symmetrical PNP transistors, with appropriate modifications in the circuits.
FIG. 7 illustrates the control characteristic obtained by applying a quality control voltage as described with reference to FIGS. 2 and 4. The curve is seen to vary smoothly over the range I c.p.s. to 300 c.p.s. which is commonly of most utility in vocoder application.
FIG. 8 illustrates output waveforms obtained by repetitive stimulation of the tuned circuit for various values of the control voltage (equivalently, for various bandwidths).
Referring again to FIGS. 2, 3 and 5, "clamping" is achieved by placing the critical damping resistance in series with a symmetrical transistor across the inductor. The base of Q5 is biased so that normally it is turned off, but the application at the input labeled "clamping pulse" of a negative pulse (of 10 volts amplitude for the filter of FIGS. 1 and 2) turns it hard on and the oscillations of the tuned circuit die away rapidly. The time required for all the energy to be dissipated depends on the resonant frequency, being greater for the lower frequencies. For the lowest frequency channels, about 400 µsec. is required, but as this may represent about 20 percent of the time between excitation pulses some compromise will be needed. Normally the "clamping" pulse should end at or just after the commencement of the excitation pulse.
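The timing constraint can be made concrete with a small calculation. The sketch below places a clamp pulse of given duration so that it ends at the next excitation pulse and reports what fraction of the pitch period it occupies. The function name is hypothetical, and the pulse rate of 500 p.p.s. is an assumption chosen because a 400 µsec. clamp is then 20 percent of the period, matching the figures quoted above.

```python
def clamp_window(pulse_rate_pps, clamp_duration_s):
    """Return (start, end, fraction) for a clamp pulse ending at the next excitation pulse.

    Times are in seconds, measured from the current excitation pulse."""
    period = 1.0 / pulse_rate_pps
    if clamp_duration_s >= period:
        raise ValueError("clamp pulse would fill the whole pitch period")
    end = period                           # clamp ends as the next pulse arrives
    start = period - clamp_duration_s      # so it must begin this long after the pulse
    return start, end, clamp_duration_s / period

start, end, fraction = clamp_window(500.0, 400e-6)
print(f"clamp from {start * 1e6:.0f} to {end * 1e6:.0f} microseconds, "
      f"{fraction:.0%} of the period")
```

The higher the pitch, the larger the fraction of each period that the clamp consumes, which is the compromise referred to in the text.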
The upper trace of FIG. 9 illustrates the response of a single filter of resonant frequency 1,000 c.p.s. which is excited by a train of pulses whose repetition rate is constant at 250 p.p.s. The lower trace, which is to be interpreted as a relatively short negative pulse of amplitude 10 volts and not as a long positive pulse, illustrates the form of the clamping signal which is applied to the base of Q5 so as to prevent energy from being carried over between one excitation period and the next.
FIG. 10 shows the response of the same filter to the same pulse train but with the clamping signal removed, and shows the reduced response due to destructive interference between the oscillations of successive pitch periods.
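The carry-over of energy between pitch periods can be illustrated with a discrete-time stand-in for the tuned circuit. The sketch below is not the circuit of the patent: it assumes a two-pole digital resonator, a hypothetical sampling rate of 16,000 samples per second, a 60 c.p.s. bandwidth, and a clamp modelled crudely by forcing the resonator state to zero during the final part of each period.

```python
import math

def resonator_response(f_res, bandwidth, pulse_rate, n_periods,
                       clamp_fraction=0.0, sample_rate=16000):
    """Simulate a two-pole resonator driven by a pulse train.

    clamp_fraction is the final part of each pitch period during which the
    output is forced to zero (a crude stand-in for the clamp)."""
    r = math.exp(-math.pi * bandwidth / sample_rate)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * f_res / sample_rate)
    b2 = -r * r
    period = round(sample_rate / pulse_rate)
    clamp_start = period - round(clamp_fraction * period)
    y1 = y2 = 0.0
    out = []
    for n in range(n_periods * period):
        phase = n % period
        x = 1.0 if phase == 0 else 0.0     # one excitation pulse per period
        y = x + b1 * y1 + b2 * y2
        if phase >= clamp_start:
            y = 0.0                        # clamp active: stored energy is discarded
        out.append(y)
        y2, y1 = y1, y
    return out, period

free, period = resonator_response(1000.0, 60.0, 250.0, 4)
clamped, _ = resonator_response(1000.0, 60.0, 250.0, 4, clamp_fraction=0.2)
print("carry-over without clamp:", free[period] - free[0])
print("carry-over with clamp:   ", clamped[period] - clamped[0])
```

With the clamp, every pitch period starts from rest and the response is identical from one period to the next. Without it, the residue of earlier periods adds to or subtracts from the new response, depending on the phase relation between the resonant frequency and the pulse rate.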
FIG. 11 illustrates a synthesizer which produces artificial speech. A control unit 30 produces control signals, the control unit itself being governed by, for example, a digital computer with paper tape output, or by a voltage reader which reads frequency and spectral information presented as manually prepared patterns of electrically conducting ink on rolls of paper. For the production of vowels, a fundamental frequency signal F is applied to a pulse generator 31, the output of which is applied to a voice amplitude circuit 32 whose operation is controlled by an amplitude signal A and whose output is fed through a low-pass filter 33 and thence through resonant circuits 34 (F3), 35 (F2) and 36 (F1) to an integrator 37, and thence through a network which includes a variable resistance 38 to an amplifier A3, the output being an audible signal. For the production of voiced consonants and front fricatives, a noise generator 39 feeds a first noise amplitude circuit 40 which is controlled by the parameter A1. The output of 40 passes through a high pass filter 41 and thence goes to the resonant circuits 34 (F3), 35 (F2) and 36 (F1), passing next to the integrator 37 and the variable resistance 38 and finally to the output amplifier A3. For the production of other consonants, including, for example, the back fricatives, the output through a variable resistor 44 and is then output as an audible signal from the amplifier A3.
The above description of a parametric artificial talker is somewhat simplified.
Essentially 36, 35 and 34 mirror the first three formants of natural speech, i.e. they resonate at around the natural frequencies of the vocal tract. In the synthesis of an utterance, they are modified in accordance with the alteration in shape which the human vocal tract would undergo during speech production.
36-34 can be connected in parallel rather than in series if appropriate modifications are made to the rest of the circuitry of FIG. 11.
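A series chain of formant resonators of this kind is commonly modelled in discrete time with second-order recursive sections. The sketch below is such a digital analogue and not the analog circuitry of FIG. 11: the coefficient formulas are the usual ones for a two-pole resonator with unity gain at d.c., and the sampling rate, formant frequencies and bandwidths are hypothetical.

```python
import math

def resonator_coefficients(freq, bandwidth, sample_rate):
    """Coefficients (a, b, c) of y[n] = a*x[n] + b*y[n-1] + c*y[n-2], unity gain at d.c."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth * t)
    b = 2.0 * math.exp(-math.pi * bandwidth * t) * math.cos(2.0 * math.pi * freq * t)
    a = 1.0 - b - c
    return a, b, c

def cascade(signal, formants, sample_rate=10000):
    """Pass signal through one resonator per (frequency, bandwidth) pair, in series."""
    for freq, bandwidth in formants:
        a, b, c = resonator_coefficients(freq, bandwidth, sample_rate)
        y1 = y2 = 0.0
        filtered = []
        for x in signal:
            y = a * x + b * y1 + c * y2
            filtered.append(y)
            y2, y1 = y1, y
        signal = filtered
    return signal

# Hypothetical (frequency, bandwidth) settings in c.p.s., in the order F3, F2, F1.
formants = [(2500.0, 120.0), (1500.0, 90.0), (500.0, 60.0)]
pulses = [1.0 if n % 100 == 0 else 0.0 for n in range(1000)]   # 100 p.p.s. at 10 kHz
vowel = cascade(pulses, formants)
print(len(vowel), max(vowel))
```

Changing the three frequency settings from one frame to the next corresponds to the modification of the resonators during an utterance described above.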
The term "voice" as used in the preceding paragraph means the periodicity superimposed on certain speech sounds by the vibration of the vocal cords. The term "noise" means the turbulent waveform produced by the flow of air in the constricted vocal tract. "Front fricatives" include, for example, the English labiodental fricatives /f/ and /v/ and the alveolar fricatives /s/ and /z/. "Back fricatives" include the English palatal fricatives /ʃ/ and /ʒ/ and the glottal spirant /h/.
of a single phoneme may be desirable in order to reproduce adequately, for example, the English bilabial nasal /m/, the alveolar nasal /n/ and the velar nasal /ng/ of English "sing."
The facility of dissipating energy in the interval between successive excitation pulses also has application in the filters 34 (F3), 35 (F2) and 36 (F1).
It is to be understood that the foregoing description of specific examples of this invention is made by way of example only, and is not to be considered as a limitation on its scope.
I claim:
1. A speech synthesizer which includes a plurality of resonant circuits, each having inherent damping, each resonating at different frequencies, excitation means associated with each of said resonant circuits for repeatedly applying momentary excitation pulses to the associated circuit thereby causing damped oscillations therein, and means for selectively controlling said excitation means to cause the amplitude of the oscillations in any one of the circuits to decay to an arbitrarily low value before the application of the next successive excitation pulse to that same circuit.
2. A speech synthesizer as claimed in claim 1, wherein each resonant circuit has a quality factor, and said excitation control means comprises means for lowering the quality factor of each of the resonant circuits for a portion of the time between the application of successive excitation pulses.
3. A speech synthesizer as claimed in claim 2, including resistance means associated with each resonant circuit for damping the associated resonant circuit, and means for switching a resistance means into the associated resonant circuit to give critical damping in the resonant circuit for some discrete time after the application of one excitation pulse, and means for removing said resistance means from said resonant circuit before the application of the next succeeding excitation pulse.
4. A speech synthesizer as claimed in claim 1, wherein one of the resonant circuits includes an inductance in parallel with a capacitance.
5. A speech synthesizer as claimed in claim 1, wherein one of the resonant circuits includes an inductance in series with a capacitance.
6. A speech synthesizer as claimed in claim 3, further including an inductor and transistor connected in parallel, and means for connecting the critical damping resistance between said inductance and the emitter of said transistor.
7. A speech synthesizer as claimed in claim 6, wherein the transistor is symmetrical, and wherein there are means for applying a pulse of predetermined duration and timing to the base of the transistor to switch it on.
8. A speech synthesizer as claimed in claim 1, and further including means for varying the bandwidths of the resonant circuits independently of one another.
9. A speech synthesizer as claimed in claim 1, and further including an inductance in one of the resonant circuits adjacent one winding of a transformer.
10. A speech synthesizer as claimed in claim 9, further including a variable resistance in series with another winding of the transformer.
11. A speech synthesizer as claimed in claim 10, wherein the said variable resistance is a transistor.
12. A speech synthesizer as claimed in claim 11, wherein the transistor is a symmetrical PNP transistor which is kept saturated and an external control signal is applied as a potential to the base of the transistor.
13. A speech synthesizer as claimed in claim 1, wherein said resonant circuit has associated therewith a variable resistance for connection into said resonant circuit to preset the bandwidth of the filter.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2968666A GB1146186A (en) | 1966-07-01 | 1966-07-01 | Speech synthesizer |
GB2968466A GB1146184A (en) | 1966-07-01 | 1966-07-01 | Speech synthesizer |
GB2968566A GB1146185A (en) | 1966-07-01 | 1966-07-01 | Filter |
Publications (1)
Publication Number | Publication Date |
---|---|
US3551588A (en) | 1970-12-29 |
Family
ID=27258835
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US649518A (Expired - Lifetime; US3551588A (en)) | 1966-07-01 | 1967-06-28 | Vocoder filter system |
Country Status (3)
Country | Link |
---|---|
US (1) | US3551588A (en) |
BE (1) | BE717838A (en) |
FR (1) | FR1540969A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4099035A (en) * | 1976-07-20 | 1978-07-04 | Paul Yanick | Hearing aid with recruitment compensation |
US4292469A (en) * | 1979-06-13 | 1981-09-29 | Scott Instruments Company | Voice pitch detector and display |
US20080249128A1 (en) * | 2006-10-16 | 2008-10-09 | Gruenenthal Gmbh | Substituted Sulfonamide Compounds |
-
1967
- 1967-06-28 US US649518A patent/US3551588A/en not_active Expired - Lifetime
- 1967-06-30 FR FR112708A patent/FR1540969A/en not_active Expired
-
1968
- 1968-07-10 BE BE717838D patent/BE717838A/en unknown
Also Published As
Publication number | Publication date |
---|---|
BE717838A (en) | 1969-01-10 |
FR1540969A (en) | 1968-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Dudley | Remaking speech | |
US3651242A (en) | Octave jumper for musical instruments | |
US2243527A (en) | Production of artificial speech | |
US3102928A (en) | Vocoder excitation generator | |
US4708657A (en) | Audio-frequency converter apparatus, for treating subjects suffering from audio-phonatory and auditive-verbal disorders, and a method of using the apparatus | |
US4567806A (en) | Sound generator | |
US3551588A (en) | Vocoder filter system | |
US4038898A (en) | System for producing chorus effect | |
US2570701A (en) | Harmonic-selecting apparatus | |
US2458227A (en) | Device for artificially generating speech sounds by electrical means | |
GB1376093A (en) | Sampling modulation system for an electronic musical instrument | |
US2403985A (en) | Sound reproduction | |
US3573374A (en) | Formant vocoder utilizing resonator damping | |
Miller | Performance characteristics of an experimental harmonic identification pitch extraction (HIPEX) system | |
US2466306A (en) | Vibrato system for amplifiers | |
DE2209548C3 (en) | Electric speech synthesizer circuit | |
US2162875A (en) | Dynamic expansion circuit | |
EP0736970A1 (en) | Amplitude adjust circuit and method thereof | |
DE2430320B2 (en) | Musical sound processing device for an electric musical instrument | |
US2561349A (en) | Electrical musical instrument | |
DE2515524C3 (en) | Device for the electronic generation of sound signals | |
US3423530A (en) | Speech synthesizer having q multiplier | |
US4475430A (en) | Differential sampling circuit for improving signal to noise ratio in an electronic organ having multiplexed keying | |
US3499986A (en) | Speech synthesizer | |
GB1069129A (en) | Flight noise monitoring equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: STC PLC,ENGLAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL STANDARD ELECTRIC CORPORATION, A DE CORP.;REEL/FRAME:004761/0721 Effective date: 19870423 Owner name: STC PLC, 10 MALTRAVERS STREET, LONDON, WC2R 3HA, E Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNOR:INTERNATIONAL STANDARD ELECTRIC CORPORATION, A DE CORP.;REEL/FRAME:004761/0721 Effective date: 19870423 |