WO2011080855A1 - Speech signal restoration device and speech signal restoration method - Google Patents
Speech signal restoration device and speech signal restoration method
- Publication number
- WO2011080855A1 (PCT/JP2010/006264)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio signal
- signal
- distortion
- band
- audio
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 35
- 238000011156 evaluation Methods 0.000 claims abstract description 77
- 230000015572 biosynthetic process Effects 0.000 claims abstract description 51
- 238000003786 synthesis reaction Methods 0.000 claims abstract description 51
- 238000005070 sampling Methods 0.000 claims abstract description 25
- 238000006243 chemical reaction Methods 0.000 claims abstract description 19
- 230000005236 sound signal Effects 0.000 claims description 282
- 239000000284 extract Substances 0.000 abstract description 11
- 238000004458 analytical method Methods 0.000 description 28
- 238000012545 processing Methods 0.000 description 27
- 238000001228 spectrum Methods 0.000 description 17
- 230000001629 suppression Effects 0.000 description 17
- 238000004891 communication Methods 0.000 description 7
- 230000003595 spectral effect Effects 0.000 description 6
- 230000002194 synthesizing effect Effects 0.000 description 6
- 230000006837 decompression Effects 0.000 description 5
- 238000000605 extraction Methods 0.000 description 5
- 230000006835 compression Effects 0.000 description 4
- 238000007906 compression Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 238000013139 quantization Methods 0.000 description 4
- 230000005284 excitation Effects 0.000 description 3
- 230000007274 generation of a signal involved in cell-cell signaling Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 238000012937 correction Methods 0.000 description 2
- 230000006866 deterioration Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000012854 evaluation process Methods 0.000 description 2
- 230000005534 acoustic noise Effects 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000001028 reflection method Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
Definitions
- The present invention relates to an audio signal restoration apparatus and an audio signal restoration method for restoring a wideband audio signal from an audio signal whose frequency band is limited to a narrow band, and for restoring the audio signal of a degraded or missing band.
- The frequency band of audio signals sent over a telephone line is narrowly limited to, for example, 300 to 3400 Hz, so the sound quality of conventional telephone lines is not very good. Also, in digital voice communication such as with a cellular phone, the bandwidth is limited in the same way as on an analog line because of severe bit-rate restrictions, so the sound quality cannot be said to be good in this case either.
- To address this problem, Patent Documents 1 and 2 disclose methods of generating or restoring, in a pseudo manner, a wideband signal from a narrowband signal on the receiving side.
- In the frequency band expansion device of Patent Document 1, the autocorrelation coefficient of the narrowband audio signal is calculated to extract the fundamental period of the speech, and a wideband audio signal is obtained based on this fundamental period.
- In the wideband audio signal restoration device of Patent Document 2, a narrowband audio signal is encoded by an analysis-by-synthesis coding method, and a wideband audio signal is obtained by applying zero-padding processing (oversampling) to the sound source signal or the audio signal obtained as the final result of the encoding.
- Patent Document 1: Japanese Patent No. 3243174 (pages 3-5, FIG. 1)
- Patent Document 2: Japanese Patent No. 3230790 (pages 3-4, FIG. 1)
- In the frequency band expansion device disclosed in Patent Document 1, it is necessary to extract the fundamental period of the narrowband audio signal. Various methods for extracting the fundamental period of speech have been disclosed, but it is difficult to extract it accurately, and this becomes even more difficult in noisy environments.
- On the other hand, the wideband audio signal restoration apparatus disclosed in Patent Document 2 has the advantage that it does not need to extract the fundamental period of the audio signal.
- However, although the wideband sound source signal is generated by analyzing the narrowband signal, it is produced in a pseudo manner by zero-padding processing (oversampling), so aliasing distortion components are mixed in; it is therefore not optimal as a wideband audio signal (in particular for the high-frequency components), and the sound quality deteriorates.
- The present invention has been made to solve the above-described problems, and an object of the present invention is to provide an audio signal restoration device and an audio signal restoration method that restore an audio signal with high quality.
- An audio signal restoration device according to the present invention includes: a synthesis filter that generates a plurality of audio signals by combining a phoneme signal and a sound source signal; a distortion evaluation unit that evaluates, using a predetermined distortion measure, the waveform distortion between a comparison target signal, which has frequency components in at least a part of the frequency band of the audio signals generated by the synthesis filter, and each of the plurality of audio signals generated by the synthesis filter, and selects one of the plurality of audio signals based on the evaluation result; and a restored audio signal generation unit that generates a restored audio signal using the audio signal selected by the distortion evaluation unit.
- An audio signal restoration method according to the present invention includes: a synthesis filter step of generating a plurality of audio signals by combining a phoneme signal and a sound source signal; a distortion evaluation step of evaluating, using a predetermined distortion measure, the waveform distortion between a comparison target signal, which has frequency components in at least a part of the frequency band of the audio signals generated in the synthesis filter step, and each of the plurality of audio signals generated in the synthesis filter step, and selecting one of the plurality of audio signals based on the evaluation result; and a restored audio signal generation step of generating a restored audio signal using the audio signal selected in the distortion evaluation step.
- According to the present invention, a plurality of audio signals are generated by combining a phoneme signal and a sound source signal, the waveform distortion between each of them and a comparison target signal is evaluated using a predetermined distortion measure, and one of the audio signals is selected based on the evaluation result to generate a restored audio signal. It is therefore possible to provide an audio signal restoration device and an audio signal restoration method that restore, with high quality, a comparison target signal that lacks frequency components in an arbitrary frequency band because of band limitation or noise suppression.
- FIG. 8 is a graph schematically showing the distortion evaluation process of the distortion evaluation unit 107 of the audio signal restoration device.
- Embodiment 1.
- The first embodiment is used to improve the sound quality of systems incorporating voice communication, voice storage, or voice recognition, such as car navigation, mobile phone, and intercom voice communication systems, hands-free call systems, video conference systems, and monitoring systems, and to improve the recognition rate of voice recognition systems. An audio signal restoration device that generates a wideband audio signal from an audio signal whose frequency band has been limited to a narrow band for passing through a transmission line such as a telephone line will be described as an example.
- FIG. 1 shows the overall configuration of an audio signal restoration apparatus 100 according to the first embodiment.
- The audio signal restoration device 100 includes a sampling conversion unit 101, an audio signal generation unit 102, and a restored audio signal generation unit 110.
- The audio signal generation unit 102 includes a phoneme/sound source signal storage unit 105, which comprises a phoneme signal storage unit 108 and a sound source signal storage unit 109, a synthesis filter 106, and a distortion evaluation unit 107.
- The restored audio signal generation unit 110 includes a first band filter 103 and a band synthesis unit 104.
- FIG. 2 schematically shows an audio signal generated by the configuration of the first embodiment.
- FIG. 2A shows a narrowband audio signal (comparison target signal) input to the sampling converter 101.
- FIG. 2B shows an upsampled narrowband audio signal (sampled and converted comparison target signal) output from the sampling converter 101.
- FIG. 2C shows a wideband audio signal with the minimum distortion, selected by the distortion evaluation unit 107 from a plurality of wideband audio signals (audio signals) generated by the synthesis filter 106.
- FIG. 2D shows a signal obtained by extracting a low frequency component and a high frequency component from the wideband audio signal, which is an output of the first band filter 103.
- FIG. 2E shows a restored audio signal that is an output result of the audio signal restoration apparatus 100.
- each arrow in FIG. 2 represents the order of processing, the vertical axis of each graph indicates power, and the horizontal axis indicates frequency.
- Voice or music captured through a microphone is A/D (analog/digital) converted, sampled at a predetermined sampling frequency (for example, 8 kHz), divided into frames (for example, 10 ms), and input to the audio signal restoration device 100 of the first embodiment as a narrowband audio signal that is band-limited (for example, to 300 to 3400 Hz).
- The sampling conversion unit 101 up-samples the input narrowband audio signal to, for example, 16 kHz, removes the aliasing distortion components with a low-pass filter, and outputs the result as an upsampled narrowband audio signal.
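For illustration only, a minimal sketch of this sampling conversion step in Python, assuming SciPy is available; resample_poly performs the zero insertion and the anti-aliasing low-pass filtering in one call, and the 8 kHz/16 kHz rates and 10 ms frame length are simply the example values given above.

```python
import numpy as np
from scipy.signal import resample_poly

def upsample_narrowband(frame_8k: np.ndarray) -> np.ndarray:
    """Convert an 8 kHz narrowband frame to 16 kHz.

    resample_poly inserts zeros (factor 2) and applies a low-pass FIR
    filter, which removes the aliasing (imaging) components mentioned
    in the text.
    """
    return resample_poly(frame_8k, up=2, down=1)

# A 10 ms frame at 8 kHz (80 samples) becomes 160 samples at 16 kHz.
frame = np.random.randn(80)
upsampled = upsample_narrowband(frame)
assert len(upsampled) == 160
```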
- In the audio signal generation unit 102, the synthesis filter 106 generates a plurality of wideband audio signals using the phoneme signals stored in the phoneme signal storage unit 108 and the sound source signals stored in the sound source signal storage unit 109, and the distortion evaluation unit 107 calculates the waveform distortion of each against the upsampled narrowband audio signal based on a predetermined distortion measure and selects and outputs the wideband audio signal with the smallest distortion.
- The audio signal generation unit 102 may have the same configuration as the decoder of, for example, a CELP (Code-Excited Linear Prediction) coding scheme; in this case, phoneme codes are stored in the phoneme signal storage unit 108 and sound source codes are stored in the sound source signal storage unit 109.
- The phoneme signal storage unit 108 holds the power or gain of each phoneme signal in addition to the phoneme signal itself, and stores a large number of diverse phoneme signals so that the spectral shapes (spectrum patterns) of various wideband speech signals can be expressed; a phoneme signal is output to the synthesis filter 106 in accordance with an instruction from the distortion evaluation unit 107 described later.
- These phonological signals can be obtained from a wideband speech signal (for example, having a bandwidth of 50 to 7000 Hz) using a known method such as linear prediction analysis.
- The spectrum pattern may be expressed as the spectrum itself or in the form of acoustic parameters such as LSP (Line Spectrum Pair) parameters or cepstra; it suffices to convert it appropriately so that it can be applied as the filter coefficients of the synthesis filter 106.
- the obtained phoneme signal may be compressed by a known method such as scalar quantization or vector quantization in order to reduce the amount of memory.
- the sound source signal storage unit 109 has a configuration having both the power and gain of the sound source signal in addition to the sound source signal, and can express the sound source signal shapes (pulse trains) of various wideband audio signals in the same manner as the phoneme signal storage unit 108.
- a large amount and a wide variety of sound source signals are stored in storage means such as a memory, and the sound source signals are output to the synthesis filter 106 in accordance with an instruction from the distortion evaluation unit 107 described later.
- These sound source signals can be learned and obtained by the CELP method using a wide-band audio signal (for example, having a bandwidth of 50 to 7000 Hz) and the above-mentioned phoneme signal.
- The obtained excitation signal may be compressed by a known method such as scalar quantization or vector quantization to reduce the amount of memory, or the sound source signal may be expressed by a predetermined model such as the multipulse, ACELP (Algebraic Code Excited Linear Prediction), or VSELP (Vector Sum Excited Linear Prediction) method.
- The synthesis filter 106 may perform the synthesis after adjusting the power or gain of the phoneme signal and of the sound source signal. In this way, a plurality of wideband audio signals can be generated from a single phoneme signal and a single sound source signal, so the memory amount of the phoneme signal storage unit 108 and the sound source signal storage unit 109 can be reduced.
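As a rough sketch of this kind of synthesis, assuming the phoneme signal is held as linear prediction (LPC) coefficients, which the LSP/linear-prediction discussion above suggests is one possible representation; the gain values and the toy predictor below are illustrative placeholders, not values from the patent.

```python
import numpy as np
from scipy.signal import lfilter

def synthesize(lpc_coeffs: np.ndarray, excitation: np.ndarray, gain: float) -> np.ndarray:
    """Generate one candidate wideband signal by driving an all-pole
    synthesis filter 1/A(z) with a gain-adjusted excitation (sound source)."""
    a = np.concatenate(([1.0], lpc_coeffs))  # A(z) = 1 + a1*z^-1 + ... + ap*z^-p
    return lfilter([1.0], a, gain * excitation)

# One stored phoneme entry and one stored excitation entry yield several
# candidates simply by varying the gain, as noted in the text.
lpc = np.array([-0.9])                      # toy first-order predictor
pulse_train = np.zeros(160)
pulse_train[::40] = 1.0                     # toy periodic excitation
candidates = [synthesize(lpc, pulse_train, g) for g in (0.5, 1.0, 2.0)]
```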
- the distortion evaluation unit 107 evaluates the waveform distortion between the wideband audio signal output from the synthesis filter 106 and the upsampled narrowband audio signal output from the sampling conversion unit 101.
- the frequency band for evaluating distortion (predetermined frequency band) is limited only to the range of the narrowband audio signal, and is limited to 300 to 3400 Hz in this example.
- Specifically, both the wideband audio signal and the upsampled narrowband audio signal are passed through an FIR (Finite Impulse Response) filter having a band-pass characteristic of 300 to 3400 Hz, and the waveform distortion is then evaluated between the filtered signals.
- Here, s(n) and u(n) are the FIR-filtered wideband audio signal and the FIR-filtered upsampled narrowband audio signal, respectively, and N is the number of samples in the audio signal waveform (160 samples for a 10 ms frame at 16 kHz sampling). If the low-frequency region of 300 Hz or less is not to be restored, the wideband audio signal may instead be downsampled to the sampling frequency of the narrowband audio signal (8 kHz) without using the above FIR filter, and the distortion evaluation may be performed against the narrowband audio signal before upsampling.
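The referenced waveform-distortion equation (1) is not reproduced in this text; a simple measure consistent with the definitions of s(n), u(n), and N would be a sum of squared differences, sketched below as an assumption rather than the patent's exact formula.

```python
import numpy as np

def waveform_distortion(s: np.ndarray, u: np.ndarray) -> float:
    """Sum of squared differences between the band-pass filtered wideband
    candidate s(n) and the upsampled narrowband signal u(n), n = 0..N-1."""
    assert s.shape == u.shape
    return float(np.sum((s - u) ** 2))
```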
- Although the distortion evaluation unit 107 performs the above filtering with an FIR filter, an IIR (Infinite Impulse Response) filter, for example, may be used instead as long as the distortion evaluation can be performed appropriately.
- The distortion evaluation unit 107 may also perform the distortion evaluation on the frequency axis instead of the time axis. For example, after both the wideband audio signal and the upsampled narrowband audio signal are zero-padded and windowed, they can be converted to the spectral domain using a 256-point FFT (Fast Fourier Transform), and the sum of the differences in the power spectrum can be evaluated as the distortion, for example as in the following equation. In this case, unlike the evaluation on the time axis, filtering with a band-pass characteristic is not necessary.
- S(f) and U(f) are the power spectrum components of the wideband audio signal and of the upsampled narrowband audio signal, respectively, and FL and FH are the spectrum bin numbers corresponding to 300 Hz and 3400 Hz, respectively.
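Likewise, the power-spectrum distortion equation itself (referred to as equation (2) later in the text) is not reproduced here; the sketch below assumes a sum of squared differences over the bins FL..FH after zero-padding, windowing, and a 256-point FFT, as described above.

```python
import numpy as np

def spectral_distortion(s: np.ndarray, u: np.ndarray,
                        fl: int, fh: int, nfft: int = 256) -> float:
    """Distortion between the power spectra S(f) and U(f), evaluated only
    over bins fl..fh (e.g. the bins corresponding to 300-3400 Hz)."""
    win = np.hanning(len(s))
    S = np.abs(np.fft.rfft(s * win, nfft)) ** 2
    U = np.abs(np.fft.rfft(u * win, nfft)) ** 2
    return float(np.sum((S[fl:fh + 1] - U[fl:fh + 1]) ** 2))
```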
- The distortion evaluation unit 107 sequentially instructs the phoneme signal storage unit 108 and the sound source signal storage unit 109 to output a set of a spectrum pattern and a sound source signal, causes the synthesis filter 106 to generate a wideband audio signal, and calculates the distortion by the above equation (1) or (2). It then selects the wideband audio signal that minimizes the distortion and outputs it to the first band filter 103. Note that the distortion evaluation unit 107 can also calculate the distortion after applying the auditory weighting process commonly used in CELP speech coding to both the wideband speech signal and the upsampled narrowband speech signal.
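The selection itself amounts to a search over the stored phoneme/sound-source combinations; a hedged sketch follows, in which the entry lists and the synthesize/distortion callables are hypothetical stand-ins for the storage units and the operations described above.

```python
def select_minimum_distortion(phoneme_entries, excitation_entries,
                              u_reference, synthesize, distortion):
    """Synthesize a candidate for every stored (phoneme, excitation) pair
    and keep the one with the smallest distortion against the reference."""
    best_signal, best_dist = None, float("inf")
    for lpc in phoneme_entries:
        for exc in excitation_entries:
            candidate = synthesize(lpc, exc, 1.0)
            d = distortion(candidate, u_reference)
            if d < best_dist:
                best_signal, best_dist = candidate, d
    return best_signal, best_dist
```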
- the distortion evaluation unit 107 does not necessarily need to select a wideband audio signal that minimizes the distortion, and may select a wideband audio signal that has the second smallest distortion, for example.
- Alternatively, an allowable distortion range may be set, and once a wideband audio signal whose distortion falls within that range is found, the subsequent processing of the synthesis filter 106 and the distortion evaluation unit 107 may be skipped to reduce the amount of processing.
- The first band filter 103 extracts the frequency components outside the band of the narrowband audio signal from the wideband audio signal and outputs them to the band synthesizing unit 104; that is, in the first embodiment, the low-frequency component of 300 Hz or lower and the high-frequency component of 3400 Hz or higher are extracted. An FIR filter, an IIR filter, or the like may be used for this extraction. As a general characteristic of audio signals, the harmonic structure of the low-frequency region often also appears in the high-frequency region, and conversely, if a harmonic structure is observed in the high-frequency region, a similar harmonic structure often appears in the low-frequency region.
- Since the cross-correlation between the low-frequency band and the high-frequency band is thus strong, it is considered that an optimal restored audio signal can be constructed by obtaining the low-frequency and high-frequency components extracted by the first band filter 103 from the wideband audio signal generated so as to minimize the distortion against the narrowband audio signal.
- The band synthesizing unit 104 adds the low-frequency and high-frequency components of the wideband audio signal output from the first band filter 103 to the upsampled narrowband audio signal output from the sampling conversion unit 101, thereby restoring a wideband audio signal, and outputs it as the restored audio signal.
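To make the band operations concrete, here is a minimal sketch assuming an FIR band-stop design for the first band filter (the 129-tap length and the firwin design are illustrative choices, not specified by the patent); the band synthesis is then a simple sample-wise addition as described above.

```python
import numpy as np
from scipy.signal import firwin, lfilter

FS = 16000  # sampling frequency after upsampling (Hz)

# Band-stop FIR filter: passes <300 Hz and >3400 Hz, removes the band
# already covered by the narrowband signal.
bandstop = firwin(numtaps=129, cutoff=[300, 3400], fs=FS, pass_zero="bandstop")

def extract_missing_bands(wideband: np.ndarray) -> np.ndarray:
    """First band filter: keep only the low (<300 Hz) and high (>3400 Hz)
    components of the selected wideband candidate."""
    return lfilter(bandstop, [1.0], wideband)

def band_synthesis(upsampled_narrowband: np.ndarray, missing: np.ndarray) -> np.ndarray:
    """Band synthesis unit: add the extracted components to the upsampled
    narrowband signal to obtain the restored audio signal."""
    return upsampled_narrowband + missing
```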
- As described above, the audio signal restoration apparatus 100 according to the first embodiment, which converts a narrowband audio signal band-limited to a narrow band into a wideband audio signal including that narrow band, includes: the synthesis filter 106 that generates a plurality of wideband audio signals by combining phoneme signals and sound source signals; the distortion evaluation unit 107 that evaluates, using a predetermined distortion measure, the waveform distortion between the upsampled narrowband audio signal sampled and converted by the sampling conversion unit 101 and each of the plurality of wideband audio signals generated by the synthesis filter 106, and selects the wideband audio signal with the smallest distortion based on the evaluation result; the first band filter 103 that extracts frequency components outside the narrow band from the wideband audio signal selected by the distortion evaluation unit 107; and the band synthesizing unit 104 that combines the frequency components extracted by the first band filter 103 with the upsampled narrowband audio signal sampled and converted by the sampling conversion unit 101. In this way, the low-frequency and high-frequency components used for audio signal restoration are obtained from a wideband audio signal generated so as to minimize the distortion against the narrowband audio signal, so a high-quality wideband audio signal can be restored.
- In the first embodiment, it is not necessary to extract the fundamental period of the speech, so there is no quality deterioration caused by fundamental-period extraction errors. Therefore, a high-quality wideband audio signal can be restored even in a noisy environment where it is difficult to analyze the fundamental period of the speech.
- In addition, since the low-frequency and high-frequency components used for the audio signal restoration are obtained from a wideband audio signal generated so as to minimize the distortion against the narrowband audio signal, in principle the narrowband audio signal and the low-frequency component (and likewise the high-frequency component and the narrowband audio signal) connect smoothly, interpolation processing such as power correction at the time of band synthesis is not required, and a high-quality audio signal can be restored.
- Note that, when the distortion evaluation result in the distortion evaluation unit 107 is very small, the audio signal restoration apparatus 100 may omit the processing of the first band filter 103 and the band synthesis unit 104 and output the wideband audio signal output by the distortion evaluation unit 107 directly as the restored audio signal.
- In the first embodiment, the low-frequency and high-frequency components are restored for a narrowband audio signal from which both the low and high bands are missing, but the present invention is not limited to this; needless to say, a narrowband audio signal lacking at least one of the low, middle, and high frequency bands can also be restored. That is, the audio signal restoration device 100 can restore a signal over the same frequency band as the wideband audio signal, as long as the narrowband audio signal has at least a part of the frequency band of the wideband audio signal generated by the synthesis filter 106.
- Embodiment 2.
- As a modification of the first embodiment, the analysis result of the narrowband audio signal can also be used as auxiliary information for generating the wideband audio signal. FIG. 3 shows the overall configuration of the audio signal restoration device 100 according to the second embodiment, in which a speech analysis unit 111 is newly added to the audio signal restoration device 100 shown in FIG. 1. For the other components, the parts corresponding to those in FIG. 1 are given the same reference numerals, and detailed description thereof is omitted.
- The speech analysis unit 111 analyzes the acoustic characteristics of the input narrowband speech signal by a known method such as linear prediction analysis, extracts the phoneme signal and the sound source signal of the narrowband speech signal, and outputs them to the phoneme signal storage unit 108 and the sound source signal storage unit 109, respectively.
- As the phoneme signal, for example, an LSP parameter with good interpolation characteristics is desirable, but other parameters may be used.
- As for the sound source signal, the speech analysis unit 111 can include an inverse filter whose filter coefficients are, for example, the phoneme signal obtained as the analysis result, and use the residual signal obtained by passing the narrowband speech signal through this inverse filter as the sound source signal.
- The phoneme/sound source signal storage unit 105 uses the phoneme signal and the sound source signal of the narrowband speech signal input from the speech analysis unit 111 as auxiliary information for the phoneme signal storage unit 108 and the sound source signal storage unit 109. As an example of using this auxiliary information, the 300 to 3400 Hz portion can be removed from a phoneme signal of a wideband speech signal and the phoneme signal of the narrowband speech signal can be applied to the removed portion.
- Alternatively, the phoneme signal storage unit 108 can make a preliminary selection by, for example, performing a distortion evaluation between the spectrum of the phoneme signal of the narrowband speech signal and those of the wideband speech signals, and outputting to the synthesis filter 106 only the wideband phoneme signals with little distortion. By performing this preliminary selection of the phoneme signal, the number of processing operations of the synthesis filter 106 and the distortion evaluation unit 107 can be reduced.
- Similarly, in the sound source signal storage unit 109, the sound source signal of the narrowband audio signal can be added to the sound source signals of the wideband audio signals or used as information for preliminary selection, as in the phoneme signal storage unit 108. By adding the sound source signal of the narrowband audio signal, a sound source signal of the wideband audio signal that better approximates the narrowband audio signal can be obtained, and by performing preliminary selection of the sound source signal, the number of processing operations of the synthesis filter 106 and the distortion evaluation unit 107 can be reduced.
- As described above, the audio signal restoration device 100 according to the second embodiment includes the speech analysis unit 111, which performs acoustic analysis of the narrowband audio signal band-limited to a narrow band and generates auxiliary information, and the synthesis filter 106 uses the auxiliary information generated by the speech analysis unit 111 when combining the phoneme signals having wideband frequency components and the sound source signals stored in the phoneme/sound source signal storage unit 105 to generate a plurality of audio signals. Therefore, by using the analysis result of the narrowband audio signal as auxiliary information, a wideband audio signal that better approximates the narrowband audio signal can be obtained, and a higher-quality wideband audio signal can be restored. In addition, when generating the wideband audio signal, the phoneme signal and the sound source signal can be preliminarily selected using the analysis result of the narrowband audio signal as auxiliary information, so the amount of processing can be reduced while maintaining high quality.
- In the above description, the processing of the speech analysis unit 111 is performed before the signal is input to the sampling conversion unit 101, but it may instead be performed after the processing of the sampling conversion unit 101; in that case, the speech analysis is performed on the upsampled narrowband speech signal.
- Further, the speech analysis unit 111 may perform, for example, a frequency analysis of the speech signal and the noise signal contained in the input narrowband speech signal, and generate auxiliary information designating a frequency band in which the ratio of the speech signal spectrum power to the noise signal spectrum power (the signal-to-noise ratio, hereinafter the SN ratio) is high.
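A rough sketch of such a per-band SN-ratio check follows; the band grid, the power spectra, and the 0 dB threshold are assumptions for illustration, since the text only states that bands with a high SN ratio are designated.

```python
import numpy as np

def high_snr_bands(speech_psd: np.ndarray, noise_psd: np.ndarray,
                   band_edges, freqs: np.ndarray, threshold_db: float = 0.0):
    """Return the (low_hz, high_hz) bands whose average SN ratio exceeds
    threshold_db; speech_psd and noise_psd are power spectra on `freqs`."""
    selected = []
    for lo, hi in band_edges:
        idx = (freqs >= lo) & (freqs < hi)
        snr_db = 10 * np.log10(np.sum(speech_psd[idx]) / np.sum(noise_psd[idx]))
        if snr_db > threshold_db:
            selected.append((lo, hi))
    return selected
```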
- In this case, the sampling conversion unit 101 performs sampling conversion on the frequency components of the frequency band (predetermined frequency band) specified by the auxiliary information in the narrowband audio signal, and the distortion evaluation unit 107 performs the distortion evaluation between the upsampled narrowband audio signal and the plurality of wideband audio signals only over the frequency components of the frequency band specified by the auxiliary information. The first band filter 103 then extracts frequency components outside the frequency band specified by the auxiliary information from the wideband audio signal selected by the distortion evaluation unit 107, and the band synthesizing unit 104 combines them with the upsampled narrowband audio signal. In this way, the distortion evaluation unit 107 evaluates distortion only in the frequency band specified by the auxiliary information rather than over the entire frequency band of the narrowband audio signal, so the amount of processing can be reduced.
- Embodiment 3.
- In the above embodiments, the audio signal restoration device 100 for generating a wideband audio signal from an audio signal whose frequency band is limited to a narrow band has been described. In the third embodiment, this audio signal restoration device 100 is modified and applied to configure an audio signal restoration apparatus 200 for restoring the audio signal of a frequency band that has been degraded or lost due to noise suppression processing, audio compression processing, or the like.
- FIG. 4 shows the overall configuration of the audio signal restoration apparatus 200 according to the third embodiment, in which a noise suppression unit 201 and a second band filter 202 are newly added to the audio signal restoration apparatus 100 shown in FIG. 1. For the other components, the parts corresponding to those in FIG. 1 are given the same reference numerals, and detailed description thereof is omitted.
- In the following description, the frequency band of the input noise-mixed voice signal is assumed to be 0 to 4000 Hz, and the mixed noise is assumed to be automobile running noise concentrated in the 0 to 500 Hz band. The phoneme/sound source signal storage unit 105, the synthesis filter 106, and the distortion evaluation unit 107 in the audio signal generation unit 102, as well as the first band filter 103 and the second band filter 202, operate on the 0 to 4000 Hz frequency band and hold phoneme signals and sound source signals corresponding to it. Needless to say, these conditions do not necessarily apply to an actual system.
- FIG. 5 schematically shows an audio signal generated by the configuration of the third embodiment.
- FIG. 5A shows a noise-suppressed speech signal (comparison target signal) output from the noise suppression unit 201.
- FIG. 5B shows a wideband audio signal that is selected by the distortion evaluation unit 107 from a plurality of wideband audio signals (audio signals) generated by the synthesis filter 106 and has the minimum distortion with the noise-suppressed audio signal.
- FIG. 5C shows a signal obtained by extracting a low frequency component from the wideband audio signal, which is an output of the first band filter 103.
- FIG. 5D shows a high-frequency component of the noise-suppressed speech signal output from the second band filter 202.
- FIG. 5E shows a restored audio signal that is an output result of the audio signal restoration apparatus 200.
- each arrow in FIG. 5 represents the order of processing, the vertical axis of each graph indicates power, and the horizontal axis indicates frequency.
- the noise suppression unit 201 inputs a noise-mixed speech signal mixed with noise, and outputs the noise-suppressed speech signal to the distortion evaluation unit 107 and the second band filter 202.
- The noise suppression unit 201 also outputs a band information signal designating the division frequency that separates the low band of 0 to 500 Hz from the high band of 500 to 4000 Hz, for use in the distortion evaluation by the subsequent distortion evaluation unit 107 and by the first band filter 103.
- the band information signal is fixed at 500 Hz in the third embodiment.
- Alternatively, depending on the state of the input noise-mixed voice signal, a frequency analysis of the voice signal and the noise signal may be performed, and the frequency at which the noise signal spectrum power exceeds the voice signal spectrum power (the frequency at which the SN ratio on the spectrum crosses 0 dB) may be used as the band information signal. Since this frequency changes from moment to moment according to the input noise-mixed speech signal and the state of the noise, it may be updated, for example, every 10 ms frame.
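A minimal sketch of determining such a division frequency from per-bin power spectra; how the speech and noise spectra are estimated is left to the noise suppressor, and the fallback to the fixed 500 Hz split is an assumption for the case where the noise never dominates.

```python
import numpy as np

def division_frequency(speech_psd: np.ndarray, noise_psd: np.ndarray,
                       freqs: np.ndarray, default_hz: float = 500.0) -> float:
    """Band information signal: the frequency at which the per-bin SN ratio
    crosses 0 dB, i.e. where the low-frequency noise stops dominating."""
    noise_dominant = noise_psd >= speech_psd
    if not noise_dominant[0]:
        return default_hz            # no noise-dominated low band
    if noise_dominant.all():
        return float(freqs[-1])      # noise dominates everywhere
    return float(freqs[np.argmin(noise_dominant)])  # first bin where speech wins
```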
- As the noise suppression method, for example, a method that combines spectral subtraction and spectral amplitude suppression (for example, Japanese Patent No. 3454190) can be used.
- As in the first embodiment, the synthesis filter 106 generates a plurality of wideband speech signals using the phoneme signals stored in the phoneme signal storage unit 108 and the sound source signals stored in the sound source signal storage unit 109, and the distortion evaluation unit 107 evaluates the waveform distortion of each against the noise-suppressed speech signal based on a predetermined distortion measure, and selects and outputs a wideband speech signal whose waveform distortion meets a given condition.
- However, the distortion evaluation unit 107 limits the frequency band (predetermined frequency band) over which the distortion is evaluated to the range above the frequency specified by the band information signal; in this example, the band is limited to 500 to 4000 Hz. To evaluate the waveform distortion in this range, for example, the same technique as in the first embodiment can be adopted.
- The distortion evaluation unit 107 sequentially instructs the phoneme signal storage unit 108 and the sound source signal storage unit 109 to output sets of spectrum patterns and sound source signals so that the synthesis filter 106 generates a plurality of wideband audio signals, and it selects the wideband audio signal with the smallest distortion and outputs it to the first band filter 103.
- The first band filter 103 extracts, from the wideband audio signal selected by the distortion evaluation unit 107, the low-frequency component at or below the division frequency indicated by the band information signal, and outputs it to the band synthesis unit 104. An FIR filter, an IIR filter, or the like may be used, as in the first embodiment.
- As described in the first embodiment, the harmonic structure of the low-frequency region often also appears in the high-frequency region, and conversely, if a harmonic structure is observed in the high-frequency region, a similar harmonic structure often appears in the low-frequency region. It is therefore considered that an optimum restored audio signal can be constructed by obtaining the low-band component extracted by the first band filter 103 from a wideband audio signal generated so as to minimize the distortion against the noise-suppressed speech signal.
- the second band filter 202 performs the reverse operation of the first band filter 103 described above. That is, a high frequency component equal to or higher than the low frequency / wide frequency division frequency indicated by the band information signal is extracted from the noise-suppressed voice signal and output to the band synthesizing unit 104.
- an FIR filter, an IIR filter, or the like may be used as in the first band filter 103.
- The band synthesizing unit 104 adds the low-frequency component of the wideband audio signal output from the first band filter 103 to the high-frequency component of the noise-suppressed audio signal output from the second band filter 202, thereby restoring the audio signal, and outputs it as the restored audio signal.
- As described above, the audio signal restoration apparatus 200 according to the third embodiment restores a noise-suppressed audio signal, obtained by applying the noise suppression processing of the noise suppression unit 201 to the noise-mixed audio signal and degraded or missing in some frequency band, and generates a restored audio signal. It includes: the synthesis filter 106 that generates a plurality of wideband speech signals by combining the phoneme signals and sound source signals stored in the phoneme/sound source signal storage unit 105; the distortion evaluation unit 107 that evaluates, using a predetermined distortion scale, the waveform distortion between the noise-suppressed speech signal and each of the plurality of wideband audio signals generated by the synthesis filter 106, and selects the wideband audio signal with the smallest distortion based on the evaluation result; the first band filter 103 that extracts the frequency components of the degraded or missing frequency band from the selected wideband audio signal; the second band filter 202 that extracts from the noise-suppressed audio signal the frequency components other than the degraded or missing frequency band; and the band synthesizing unit 104 that combines these frequency components.
- Also in the third embodiment, since it is not necessary to extract the fundamental period of the speech and there is no quality degradation due to fundamental-period extraction errors, a high-quality audio signal can be restored even in a noisy environment where it is difficult to analyze the fundamental period of the speech.
- In addition, since the low-frequency component used for the audio signal restoration is obtained from an audio signal generated so as to minimize the distortion against the noise-suppressed signal, in principle the high-frequency component of the noise-suppressed signal and the generated low-frequency component connect smoothly, interpolation processing such as power correction at the time of band synthesis is unnecessary, and a high-quality audio signal can be restored.
- Note that, when the distortion evaluation result in the distortion evaluation unit 107 is very small, the audio signal restoration apparatus 200 may omit the processing of the first band filter 103, the second band filter 202, and the band synthesis unit 104, and output the wideband audio signal output by the distortion evaluation unit 107 directly as the restored audio signal.
- In the third embodiment, the low-frequency component is restored for a noise-suppressed signal whose low band is degraded or lost, but the present invention is not limited to this. It may also be configured to restore the frequency components of the corresponding bands for a noise-suppressed speech signal in which the middle band, the high band, or both are degraded or missing; for example, a configuration that restores frequency components in an intermediate band of 800 to 1000 Hz may be adopted.
- the intermediate band is deteriorated or lost, for example, a case where local band noise such as wind noise (wind noise) generated when an automobile travels at a high speed is mixed in an audio signal can be considered.
- That is, as long as the noise-suppressed voice signal has at least a part of the frequency band of the wideband voice signal generated by the synthesis filter 106, the frequency components of the remaining frequency bands of the noise-suppressed speech signal can be restored.
- Embodiment 4.
- As a modification of the third embodiment, it is also possible, as in the second embodiment, to use the analysis result of the noise-suppressed audio signal as auxiliary information for generating the wideband audio signal. Specifically, a speech analysis unit 111 as shown in FIG. 3 is added to the speech signal restoration apparatus 200 according to the third embodiment; the speech analysis unit 111 analyzes the acoustic characteristics of the noise-suppressed speech signal input from the noise suppression unit 201, extracts the phoneme signal and the sound source signal of the noise-suppressed speech signal, and outputs them to the phoneme signal storage unit 108 and the sound source signal storage unit 109, respectively.
- As described above, the speech signal restoration apparatus 200 according to the fourth embodiment includes the speech analysis unit 111, which performs acoustic analysis of the noise-suppressed speech signal and generates auxiliary information, and the synthesis filter 106 uses the auxiliary information generated by the speech analysis unit 111 to generate a plurality of wideband audio signals by combining the phoneme signals and sound source signals stored in the phoneme/sound source signal storage unit 105. Therefore, by using the analysis result of the noise-suppressed speech signal as auxiliary information, a wideband speech signal that better approximates the noise-suppressed speech signal can be obtained, and a higher-quality speech signal can be restored. In addition, when generating the wideband audio signal, the phoneme signal and the sound source signal can be preliminarily selected using the analysis result of the noise-suppressed audio signal as auxiliary information, so the amount of processing can be reduced while maintaining high quality.
- Embodiment 5.
- In the third embodiment, the audio signal is divided into a low band and a high band based on the band information signal, and only the distortion in the high band is evaluated in the distortion evaluation process. However, it is also possible, for example, to include some of the low-frequency components in the distortion evaluation after weighting them, or to perform the distortion evaluation with weighting according to the frequency characteristics of the noise signal. Since the audio signal restoration device according to the fifth embodiment has the same configuration as the audio signal restoration device 200 shown in FIG. 4, it is described below with reference to FIG. 4.
- FIG. 6 is an example of a weighting coefficient used for distortion evaluation by the distortion evaluation unit 107.
- FIG. 6A shows a case where some low-frequency components are also evaluated, and FIG. 6B shows a case where the inverse of the frequency characteristic of the noise signal is used as the weighting factor.
- the vertical axis of each graph in FIG. 6 indicates the amplitude and distortion evaluation weight value, and the horizontal axis indicates the frequency.
- As a method of reflecting the weighting factor in the distortion evaluation of the distortion evaluation unit 107, for example, the weighting factor can be convolved into the filter coefficients, or the power spectrum components can be multiplied by the weighting factor.
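For the power-spectrum variant, a sketch of folding a frequency-dependent weight into the distortion; the toy inverse-noise weight below only illustrates the shape suggested by FIG. 6(b) and is not taken from the patent.

```python
import numpy as np

def weighted_spectral_distortion(S: np.ndarray, U: np.ndarray, w: np.ndarray) -> float:
    """Distortion between power-spectrum components S(f) and U(f) with a
    per-bin weight w(f), e.g. emphasising bands of high SN ratio."""
    return float(np.sum(w * (S - U) ** 2))

# Example weight: inverse of a low-frequency-heavy noise characteristic,
# normalised to a maximum of 1.
noise_psd = np.linspace(4.0, 1.0, 129)
w = 1.0 / noise_psd
w /= w.max()
```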
- As for the characteristics of the first band filter 103 and the second band filter 202, the low band and the high band may be separated as in the third embodiment, or the filter characteristics may follow the frequency characteristic of the weighting coefficient shown in FIG. 6.
- The reason for also evaluating the low band as shown in FIG. 6(a) is that, although the low-frequency components have been noise-suppressed, the audio components there are not entirely lost, so including them improves the quality of the generated wideband audio signal.
- As shown in FIG. 6(b), by performing the distortion evaluation with the inverse of the frequency characteristic of the noise, frequency ranges having a relatively high SN ratio can be weighted more heavily, which improves the quality of the generated wideband audio signal.
- As described above, in the fifth embodiment, the distortion evaluation unit 107 evaluates the waveform distortion using a distortion scale weighted on the frequency axis. By weighting some of the low-frequency components or weighting according to the noise characteristics in the distortion evaluation, the quality of the generated audio signal can be improved and a higher-quality audio signal can be restored.
- In the fifth embodiment, the weighting of the distortion evaluation is applied to the restoration of the noise-suppressed voice signal, but it can similarly be applied to the restoration of a wideband audio signal from a narrowband voice signal by the voice signal restoration apparatus 100 according to the first and second embodiments.
- In Embodiments 1 to 5 described above, telephone audio is described as an example of a narrowband audio signal, but the present invention is not limited to telephone audio; it can also be applied, for example, to high-frequency regeneration processing for an audio signal whose high band has been cut by an encoding technique such as MP3 (MPEG Audio Layer-3).
- the frequency band of the wideband audio signal is not limited to 50 to 7000 Hz, and can be implemented in a wider band such as 50 to 16000 Hz.
- In the restored audio signal generation unit 110 shown in the first to fifth embodiments, a specific frequency band is cut out from an audio signal by the band filter and combined with the other audio signal in the band synthesizing unit to generate the restored audio signal; alternatively, the restored audio signal may be generated by a weighted addition of the two audio signals input to the restored audio signal generation unit 110.
- FIG. 7 shows an example in which the restored audio signal generation unit 110 of this configuration is applied to the audio signal restoration apparatus 100 according to the first embodiment, and FIG. 8 schematically shows the restored audio signal. Each arrow in FIG. 8 represents the order of processing, the vertical axis of each graph indicates power, and the horizontal axis indicates frequency.
- the restored audio signal generation unit 110 newly includes two weight adjustment units 301 and 302.
- The weight adjustment unit 301 adjusts the weight (gain) of the wideband audio signal output from the distortion evaluation unit 107 to, for example, 0.2 (broken line in FIG. 8(a)), and the weight adjustment unit 302 adjusts the weight (gain) of the upsampled audio signal output from the sampling conversion unit 101 to, for example, 0.8 (broken line in FIG. 8(b)); both audio signals are then added by the band synthesis unit 104 (FIG. 8(c)) to generate the restored audio signal (FIG. 8(d)).
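A sketch of this weighted-addition form of restored-signal generation; the 0.2/0.8 gains are the example values given above, and frequency-dependent weights would replace the scalar gains.

```python
import numpy as np

def weighted_restoration(wideband: np.ndarray, upsampled_narrowband: np.ndarray,
                         g_wide: float = 0.2, g_narrow: float = 0.8) -> np.ndarray:
    """Restored signal as a weighted sum of the selected wideband candidate
    and the upsampled narrowband (comparison target) signal."""
    return g_wide * wideband + g_narrow * upsampled_narrowband
```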
- Although not illustrated, the configuration of FIG. 7 may also be applied to the audio signal restoration device 200.
- the weight adjusters 301 and 302 may use weights as necessary, such as using weights having frequency characteristics that increase as the frequency increases, in addition to using constant weights in the frequency direction.
- Further, a configuration including both the weight adjustment unit 301 and the first band filter 103 may be used: the first band filter 103 may extract the same frequency band as the narrowband audio signal from the wideband audio signal after it has been weight-adjusted by the weight adjustment unit 301, or the first band filter 103 may first extract that frequency band from the wideband audio signal and the weight adjustment unit 301 may then adjust the weight. Similarly, a configuration including both the weight adjustment unit 301 and the second band filter 202 may be used.
- In any of these configurations, the audio signal restoration device generates the restored audio signal from the comparison target signal and the wideband audio signal selected from the plurality of wideband audio signals synthesized from the phoneme signals and the sound source signals.
- When the audio signal restoration devices 100 and 200 are implemented on a computer, a program describing the processing of the sampling conversion unit 101, the audio signal generation unit 102, the restored audio signal generation unit 110, the speech analysis unit 111, and the noise suppression unit 201 may be stored in the memory of the computer, and the CPU of the computer may execute the program stored in the memory.
- The audio signal restoration device and audio signal restoration method according to the present invention generate a plurality of audio signals by combining a phoneme signal and a sound source signal, evaluate the waveform distortion of each against a comparison target signal using a predetermined distortion measure, and select one of the audio signals based on the evaluation result to generate a restored audio signal. They are therefore suitable for use in restoring a wideband audio signal from an audio signal whose frequency band is limited to a narrow band, and in restoring the audio signal of a degraded or missing band.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Telephone Function (AREA)
Abstract
Description
For this reason, the sound quality problems of conventional analog telephone-line communication and digital voice communication remain unresolved.
Claims (8)
- 1. A speech signal restoration device comprising: a synthesis filter that generates a plurality of speech signals by combining a phoneme signal and a sound source signal; a distortion evaluation unit that evaluates, using a predetermined distortion measure, the waveform distortion between a comparison target signal having frequency components in at least a part of the frequency band of the speech signals generated by the synthesis filter and each of the plurality of speech signals generated by the synthesis filter, and selects one of the plurality of speech signals based on the evaluation result; and a restored speech signal generation unit that generates a restored speech signal using the speech signal selected by the distortion evaluation unit.
- 2. The speech signal restoration device according to claim 1, wherein the restored speech signal generation unit includes a band synthesis unit that combines the comparison target signal with the speech signal selected by the distortion evaluation unit.
- 3. The speech signal restoration device according to claim 1, wherein the distortion evaluation unit evaluates the waveform distortion of frequency components in a predetermined frequency band between the comparison target signal and each of the plurality of speech signals generated by the synthesis filter.
- 4. The speech signal restoration device according to claim 3, further comprising a sampling conversion unit that performs sampling conversion on the comparison target signal so that it corresponds to the predetermined frequency band, wherein the distortion evaluation unit evaluates the waveform distortion of the frequency components in the predetermined frequency band between the comparison target signal sampling-converted by the sampling conversion unit and each of the plurality of speech signals generated by the synthesis filter.
- 5. A speech signal restoration method comprising: a synthesis filter step of generating a plurality of speech signals by combining a phoneme signal and a sound source signal; a distortion evaluation step of evaluating, using a predetermined distortion measure, the waveform distortion between a comparison target signal having frequency components in at least a part of the frequency band of the speech signals generated in the synthesis filter step and each of the plurality of speech signals generated in the synthesis filter step, and selecting one of the plurality of speech signals based on the evaluation result; and a restored speech signal generation step of generating a restored speech signal using the speech signal selected in the distortion evaluation step.
- 6. The speech signal restoration method according to claim 5, wherein the restored speech signal generation step includes a band synthesis step of combining the comparison target signal with the speech signal selected in the distortion evaluation step.
- 7. The speech signal restoration method according to claim 5, wherein the distortion evaluation step evaluates the waveform distortion of frequency components in a predetermined frequency band between the comparison target signal and each of the plurality of speech signals generated in the synthesis filter step.
- 8. The speech signal restoration method according to claim 7, further comprising a sampling conversion step of performing sampling conversion on the comparison target signal so that it corresponds to the predetermined frequency band, wherein the distortion evaluation step evaluates the waveform distortion of the frequency components in the predetermined frequency band between the comparison target signal sampling-converted in the sampling conversion step and each of the plurality of speech signals generated in the synthesis filter step.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011547245A JP5535241B2 (ja) | 2009-12-28 | 2010-10-22 | 音声信号復元装置および音声信号復元方法 |
DE112010005020.1T DE112010005020B4 (de) | 2009-12-28 | 2010-10-22 | Sprachsignal-Wiederherstellungsvorrichtung und Sprachsignal-Wiederherstellungsverfahren |
US13/503,497 US8706497B2 (en) | 2009-12-28 | 2010-10-22 | Speech signal restoration device and speech signal restoration method |
CN201080055064.1A CN102652336B (zh) | 2009-12-28 | 2010-10-22 | 声音信号复原装置以及声音信号复原方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-297147 | 2009-12-28 | ||
JP2009297147 | 2009-12-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011080855A1 true WO2011080855A1 (ja) | 2011-07-07 |
Family
ID=44226287
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/006264 WO2011080855A1 (ja) | 2009-12-28 | 2010-10-22 | 音声信号復元装置および音声信号復元方法 |
Country Status (5)
Country | Link |
---|---|
US (1) | US8706497B2 (ja) |
JP (1) | JP5535241B2 (ja) |
CN (1) | CN102652336B (ja) |
DE (1) | DE112010005020B4 (ja) |
WO (1) | WO2011080855A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012073295A (ja) * | 2010-09-27 | 2012-04-12 | Fujitsu Ltd | 音声帯域拡張装置および音声帯域拡張方法 |
CN103827967A (zh) * | 2011-12-27 | 2014-05-28 | 三菱电机株式会社 | 语音信号复原装置以及语音信号复原方法 |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9838784B2 (en) | 2009-12-02 | 2017-12-05 | Knowles Electronics, Llc | Directional audio capture |
US8798290B1 (en) | 2010-04-21 | 2014-08-05 | Audience, Inc. | Systems and methods for adaptive signal equalization |
KR102060208B1 (ko) * | 2011-07-29 | 2019-12-27 | 디티에스 엘엘씨 | 적응적 음성 명료도 처리기 |
JP6169849B2 (ja) * | 2013-01-15 | 2017-07-26 | 本田技研工業株式会社 | 音響処理装置 |
US9711156B2 (en) * | 2013-02-08 | 2017-07-18 | Qualcomm Incorporated | Systems and methods of performing filtering for gain determination |
US9304010B2 (en) * | 2013-02-28 | 2016-04-05 | Nokia Technologies Oy | Methods, apparatuses, and computer program products for providing broadband audio signals associated with navigation instructions |
US9536540B2 (en) | 2013-07-19 | 2017-01-03 | Knowles Electronics, Llc | Speech signal separation and synthesis based on auditory scene analysis and speech modeling |
US9721584B2 (en) * | 2014-07-14 | 2017-08-01 | Intel IP Corporation | Wind noise reduction for audio reception |
US9978388B2 (en) | 2014-09-12 | 2018-05-22 | Knowles Electronics, Llc | Systems and methods for restoration of speech components |
US10347273B2 (en) * | 2014-12-10 | 2019-07-09 | Nec Corporation | Speech processing apparatus, speech processing method, and recording medium |
DE112016000545B4 (de) | 2015-01-30 | 2019-08-22 | Knowles Electronics, Llc | Kontextabhängiges schalten von mikrofonen |
US9820042B1 (en) | 2016-05-02 | 2017-11-14 | Knowles Electronics, Llc | Stereo separation and directional suppression with omni-directional microphones |
WO2018062021A1 (ja) * | 2016-09-27 | 2018-04-05 | パナソニックIpマネジメント株式会社 | 音声信号処理装置、音声信号処理方法、および制御プログラム |
KR102648122B1 (ko) * | 2017-10-25 | 2024-03-19 | 삼성전자주식회사 | 전자 장치 및 그 제어 방법 |
DE102018206335A1 (de) | 2018-04-25 | 2019-10-31 | Audi Ag | Haupteinheit für ein Infotainmentsystem eines Fahrzeugs |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08248997A (ja) * | 1995-03-13 | 1996-09-27 | Matsushita Electric Ind Co Ltd | 音声帯域拡大装置 |
JPH10124089A (ja) * | 1996-10-24 | 1998-05-15 | Sony Corp | 音声信号処理装置及び方法、並びに、音声帯域幅拡張装置及び方法 |
WO2003019533A1 (fr) * | 2001-08-24 | 2003-03-06 | Kabushiki Kaisha Kenwood | Dispositif et procede d'interpolation adaptive de composantes de frequence d'un signal |
JP2007072264A (ja) * | 2005-09-08 | 2007-03-22 | Nippon Telegr & Teleph Corp <Ntt> | 音声量子化方法、音声量子化装置、プログラム |
Family Cites Families (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3099047B2 (ja) | 1990-02-02 | 2000-10-16 | 株式会社 ボッシュ オートモーティブ システム | ブラシレスモータの制御装置 |
JPH03243174A (ja) | 1990-02-16 | 1991-10-30 | Toyota Autom Loom Works Ltd | アクチュエータ |
JP3563772B2 (ja) * | 1994-06-16 | 2004-09-08 | キヤノン株式会社 | 音声合成方法及び装置並びに音声合成制御方法及び装置 |
JP3230790B2 (ja) | 1994-09-02 | 2001-11-19 | 日本電信電話株式会社 | 広帯域音声信号復元方法 |
JP3189598B2 (ja) * | 1994-10-28 | 2001-07-16 | 松下電器産業株式会社 | 信号合成方法および信号合成装置 |
EP0732687B2 (en) * | 1995-03-13 | 2005-10-12 | Matsushita Electric Industrial Co., Ltd. | Apparatus for expanding speech bandwidth |
US6240384B1 (en) * | 1995-12-04 | 2001-05-29 | Kabushiki Kaisha Toshiba | Speech synthesis method |
JP3243174B2 (ja) | 1996-03-21 | 2002-01-07 | 株式会社日立国際電気 | 狭帯域音声信号の周波数帯域拡張回路 |
US6081781A (en) * | 1996-09-11 | 2000-06-27 | Nippon Telegragh And Telephone Corporation | Method and apparatus for speech synthesis and program recorded medium |
JPH10124098A (ja) * | 1996-10-23 | 1998-05-15 | Kokusai Electric Co Ltd | 音声処理装置 |
JP3454190B2 (ja) | 1999-06-09 | 2003-10-06 | 三菱電機株式会社 | 雑音抑圧装置および方法 |
US6587846B1 (en) * | 1999-10-01 | 2003-07-01 | Lamuth John E. | Inductive inference affective language analyzer simulating artificial intelligence |
JP4296714B2 (ja) * | 2000-10-11 | 2009-07-15 | ソニー株式会社 | ロボット制御装置およびロボット制御方法、記録媒体、並びにプログラム |
US7251601B2 (en) * | 2001-03-26 | 2007-07-31 | Kabushiki Kaisha Toshiba | Speech synthesis method and speech synthesizer |
EP1345207B1 (en) * | 2002-03-15 | 2006-10-11 | Sony Corporation | Method and apparatus for speech synthesis program, recording medium, method and apparatus for generating constraint information and robot apparatus |
DE10252070B4 (de) * | 2002-11-08 | 2010-07-15 | Palm, Inc. (n.d.Ges. d. Staates Delaware), Sunnyvale | Kommunikationsendgerät mit parametrierter Bandbreitenerweiterung und Verfahren zur Bandbreitenerweiterung dafür |
KR100463655B1 (ko) * | 2002-11-15 | 2004-12-29 | 삼성전자주식회사 | 부가 정보 제공 기능이 있는 텍스트/음성 변환장치 및 방법 |
WO2004097792A1 (ja) * | 2003-04-28 | 2004-11-11 | Fujitsu Limited | 音声合成システム |
JP4661074B2 (ja) * | 2004-04-07 | 2011-03-30 | ソニー株式会社 | 情報処理システム、情報処理方法、並びにロボット装置 |
WO2006070768A1 (ja) * | 2004-12-27 | 2006-07-06 | P Softhouse Co., Ltd. | オーディオ波形処理装置、方式およびプログラム |
FR2898443A1 (fr) * | 2006-03-13 | 2007-09-14 | France Telecom | Procede de codage d'un signal audio source, dispositif de codage, procede et dispositif de decodage, signal, produits programme d'ordinateur correspondants |
EP1892703B1 (en) | 2006-08-22 | 2009-10-21 | Harman Becker Automotive Systems GmbH | Method and system for providing an acoustic signal with extended bandwidth |
JP2008185805A (ja) * | 2007-01-30 | 2008-08-14 | Internatl Business Mach Corp <Ibm> | 高品質の合成音声を生成する技術 |
JP4966048B2 (ja) * | 2007-02-20 | 2012-07-04 | 株式会社東芝 | 声質変換装置及び音声合成装置 |
JP2009109805A (ja) * | 2007-10-31 | 2009-05-21 | Toshiba Corp | 音声処理装置及びその方法 |
-
2010
- 2010-10-22 WO PCT/JP2010/006264 patent/WO2011080855A1/ja active Application Filing
- 2010-10-22 DE DE112010005020.1T patent/DE112010005020B4/de not_active Expired - Fee Related
- 2010-10-22 JP JP2011547245A patent/JP5535241B2/ja active Active
- 2010-10-22 US US13/503,497 patent/US8706497B2/en not_active Expired - Fee Related
- 2010-10-22 CN CN201080055064.1A patent/CN102652336B/zh not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JP5535241B2 (ja) | 2014-07-02 |
DE112010005020T5 (de) | 2012-10-18 |
JPWO2011080855A1 (ja) | 2013-05-09 |
CN102652336A (zh) | 2012-08-29 |
CN102652336B (zh) | 2015-02-18 |
US20120209611A1 (en) | 2012-08-16 |
DE112010005020B4 (de) | 2018-12-13 |
US8706497B2 (en) | 2014-04-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5535241B2 (ja) | 音声信号復元装置および音声信号復元方法 | |
JP3881943B2 (ja) | 音響符号化装置及び音響符号化方法 | |
US8209188B2 (en) | Scalable coding/decoding apparatus and method based on quantization precision in bands | |
US7035797B2 (en) | Data-driven filtering of cepstral time trajectories for robust speech recognition | |
CA2399706C (en) | Background noise reduction in sinusoidal based speech coding systems | |
JP3881946B2 (ja) | 音響符号化装置及び音響符号化方法 | |
EP2416315B1 (en) | Noise suppression device | |
JP5127754B2 (ja) | 信号処理装置 | |
JP2009530685A (ja) | Mdct係数を使用する音声後処理 | |
JP4302978B2 (ja) | 音声コーデックにおける擬似高帯域信号の推定システム | |
EP1970900A1 (en) | Method and apparatus for providing a codebook for bandwidth extension of an acoustic signal | |
KR20070000995A (ko) | 고조파 신호의 주파수 확장 방법 및 시스템 | |
JPH10124088A (ja) | 音声帯域幅拡張装置及び方法 | |
CN101976566A (zh) | 语音增强方法及应用该方法的装置 | |
JP4622164B2 (ja) | 音響信号符号化方法及び装置 | |
US9390718B2 (en) | Audio signal restoration device and audio signal restoration method | |
JP6073456B2 (ja) | 音声強調装置 | |
JP2000305599A (ja) | 音声合成装置及び方法、電話装置並びにプログラム提供媒体 | |
JP2009223210A (ja) | 信号帯域拡張装置および信号帯域拡張方法 | |
EP1686564A1 (en) | Bandwidth extension of bandlimited acoustic signals | |
EP1619666B1 (en) | Speech decoder, speech decoding method, program, recording medium | |
JP4287840B2 (ja) | 符号化装置 | |
JP2004078232A (ja) | 広帯域音声復元装置及び広帯域音声復元方法及び音声伝送システム及び音声伝送方法 | |
JP2004046238A (ja) | 広帯域音声復元装置及び広帯域音声復元方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201080055064.1 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10840718 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2011547245 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13503497 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 112010005020 Country of ref document: DE Ref document number: 1120100050201 Country of ref document: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 10840718 Country of ref document: EP Kind code of ref document: A1 |