JP5446745B2

JP5446745B2 - Sound signal processing method and sound signal processing apparatus

Info

Publication number: JP5446745B2
Application number: JP2009253963A
Authority: JP
Inventors: 義照土永; 孝志牧内
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-11-05
Filing date: 2009-11-05
Publication date: 2014-03-19
Anticipated expiration: 2029-11-05
Also published as: JP2011099967A

Description

The present invention relates to noise suppression processing for sound signals, and more particularly to noise suppression processing of a sound signal according to the magnitude of the noise.

A noise suppression device can discriminate between sections of an input sound signal that contain speech and sections that contain only background noise (VAD determination), detect the background noise power in the background noise sections, and suppress, in the sound sections, the portion of power corresponding to the detected background noise power.

A microphone array uses an array of at least two microphones and processes the sound signals obtained by receiving and converting sound. In this way it can limit the sound reception range to the direction of the source of a desired target sound, or control the directivity, and thereby perform noise suppression or target sound enhancement.

In a certain known voice signal transmission device, the call voice from a speaking position M and the noise from a noise source are input to the microphones through respective paths and then to respective delay units. The delay amounts are set so that the voice signals output from the delay units have equal delays, and an adder sums the delayed voice signals and noise and outputs the result. Because the noise has little time correlation, the amplitude of this white noise is not tripled, whereas the amplitude of the voice signal is tripled, which improves the intelligibility of the voice.

In a certain known conference hearing aid, a sound source direction detecting means detects the direction of the sound source using the difference in arrival time of the sound waves input to the microphones at both ends of a microphone array. The delay amount of each variable delay element is adjusted so that the outputs of the microphones are in phase, which increases the directivity, and each time the input sound source changes, an AGC means automatically adjusts the gain of an amplifying means. Background noise is thereby suppressed, and only the target voice can be heard at an appropriate volume without distortion.

A certain known sound source separation system includes: two microphones arranged side by side along the arrival direction of a target sound; target-sound-dominant signal generating means that performs linear combination processing for target sound enhancement on the received signals to generate a signal in which the target sound is dominant; target-sound-inferior signal generating means that performs linear combination processing for target sound suppression on the received signals of the microphones to generate a signal in which the target sound is inferior; and separating means that separates the target sound from interfering sound using the spectrum of the target-sound-dominant signal and the spectrum of the target-sound-inferior signal. The target sound and interfering sound arriving from any direction can thereby be separated accurately, and the apparatus can be made smaller.

JP 07-38984 A; JP 09-140000 A; JP 2006-197552 A

In the noise suppression device described above, the accuracy of the VAD determination tends to drop when the noise power is so large, relative to the speech, that the speech component is buried in the noise component. In the microphone array described above, on the other hand, the individual microphones may differ in their characteristics, both at the time of shipment and through aging. For example, because microphone sensitivity varies, the phase difference between the microphones may not come out as theory predicts.

The inventors have recognized that applying different suppression methods according to the magnitude of the noise in the input sound signal makes it possible to achieve higher voice quality with the noise suppressed.

An object of embodiments of the present invention is to achieve higher voice quality with noise suppressed.

According to one aspect of an embodiment of the present invention, a sound signal processing method in an information processing apparatus includes: determining, in a certain time period, a speech section and a noise section of a first input sound signal among first and second input sound signals; determining the magnitude of the power of the first input sound signal in the noise section; determining whether that magnitude is greater than a first threshold; when the magnitude is determined not to be greater than the first threshold, suppressing, by a first suppression unit, noise in the speech section and the noise section of the first input sound signal based on the determined magnitude of the power in the noise section; when the magnitude is determined not to be greater than the first threshold, obtaining, in the speech section, the phase difference between the first input sound signal and the second input sound signal, and obtaining the error between the theoretical phase difference between the first and second input sound signals and the obtained phase difference; and when the magnitude is determined to be greater than the first threshold, correcting the phase difference between the first input sound signal and the second input sound signal according to the obtained error, and suppressing, by a second suppression unit, noise of the first input sound signal according to the corrected phase difference.
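The threshold comparison in this aspect can be illustrated with a minimal Python sketch. The function name `select_mode` and the two mode constants are hypothetical names introduced here for illustration; the text only specifies that the first suppression unit is used when the noise-section power is not greater than the first threshold and the second suppression unit when it is greater.

```python
# Hypothetical labels for the two suppression paths described in the text.
NOISE_POWER_MODE = "noise_power_suppression"          # first suppression unit
NOISE_DIRECTION_MODE = "noise_direction_suppression"  # second suppression unit


def select_mode(noise_power: float, first_threshold: float) -> str:
    """Choose the suppression path from the noise-section power.

    Power greater than the threshold selects the second (phase-based)
    suppressor; power equal to or below it selects the first.
    """
    if noise_power > first_threshold:
        return NOISE_DIRECTION_MODE
    return NOISE_POWER_MODE
```

Note that a power exactly equal to the threshold falls on the "not greater" side, so it selects the first suppression unit.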

According to embodiments of the present invention, the voice quality of a sound signal after noise suppression can be improved.

FIG. 1 shows an example of a schematic configuration of a voice information device according to an embodiment of the present invention.
FIG. 2 shows an example of a schematic configuration of a digital signal processor.
FIG. 3 shows an example of the power spectrum of a speech section and the power spectrum of a noise section.
FIG. 4 shows an example of the spatial arrangement of the two microphones with respect to a target sound source.
FIG. 5 shows an example of the suppression angle range of the microphones in FIG. 4.
FIGS. 6 and 7 show examples of the phase difference and the error phase difference of the input signals at a certain frequency, as a function of the angular direction of the sound received from a sound source.
FIGS. 8A to 8C show examples of flowcharts for noise suppression executed by the two noise suppressors and the control unit.

The objects and advantages of the invention are realized and attained by means of the elements and combinations particularly pointed out in the claims.

The foregoing general description and the following detailed description are exemplary and explanatory only and are not intended to limit the invention.

Non-limiting embodiments of the present invention will be described with reference to the drawings. In the drawings, similar components are given the same reference numerals.

FIG. 1 shows an example of a schematic configuration of a voice information device 10 according to an embodiment of the present invention.

The voice information device 10 includes a microphone array device 100 including at least two microphones MIC1 and MIC2, a digital signal processor (DSP) 200, and a utilization application unit 400. The voice information device 10 may be an information device such as an in-vehicle device or car navigation device having a voice recognition function, a hands-free telephone, or a mobile phone.

The microphone array device 100 includes microphones MIC1 and MIC2 serving as sound receiving units or sound signal input units, amplifiers (AMP) 122 and 124, low-pass filters (LPF) 142 and 144, and analog-to-digital converters (A/D) 162 and 164. The digital signal processor (DSP) 200 is coupled to a memory 202 including, for example, a RAM.

The analog input signals ina1 and ina2, converted from sound waves by the microphones MIC1 and MIC2, are supplied to the amplifiers (AMP) 122 and 124, respectively, and amplified. The amplified analog sound signals INa1 and INa2 output from the amplifiers 122 and 124 are coupled to the inputs of the low-pass filters 142 and 144, respectively, which have a cutoff frequency fc (for example, 3.9 kHz), and are low-pass filtered. Instead of the low-pass filters, band-pass filters with a pass band of, for example, 0.4 to 3.9 kHz may be used.

The filtered analog signals INp1 and INp2 output from the low-pass filters 142 and 144 are coupled to the inputs of the analog-to-digital converters 162 and 164, respectively, which have a sampling frequency fs (for example, 8 kHz) (fs > 2fc), and are converted into digital input signals. The time-domain digital input signals IN1(t) and IN2(t) from the analog-to-digital converters 162 and 164 are coupled to input terminals it1 and it2, respectively, which serve as the sound signal input units of the digital signal processor (DSP) 200.

FIG. 2 shows an example of a schematic configuration of the digital signal processor 200.

The digital signal processor 200 includes input buffer memories 212 and 214 coupled to the input terminals it1 and it2, respectively, a noise suppressor (suppression unit) (NS) 220 for the single microphone MIC1, a control unit or mode switching unit 230, and a noise suppressor (suppression unit) (NS) 260 for the two microphones MIC1 and MIC2. The input buffer memories 212 and 214 are coupled to the microphone array device 100 (microphones MIC1, MIC2), receive its digital input signals IN1(t) and IN2(t), and buffer them for noise suppression processing. The input buffer memories 212 and 214 may be memory areas of the memory 202.

The noise suppressor 220 is coupled to the input buffer memory 212, which buffers the digital input signal IN1(t), and to the control unit 230, the noise suppressor 260, and an output-side switch SW_O. The noise suppressor 220 includes a voice activity detection unit (VAD) 222, a power determination unit (or power detection and estimation unit) 224, an input-side switch SW_I1, and a suppression unit 226 that suppresses noise power. Using a known VAD (Voice Activity Detection) method, the voice activity detection unit (VAD) 222 identifies whether each time period of, for example, 20 ms in the input sound signal is a speech section or a non-speech section (noise section).

The noise suppressor 260 is coupled to the input buffer memories 212 and 214, which buffer the digital input signals IN1(t) and IN2(t), respectively, and is also coupled to the control signal CTL of the control unit 230 and to the output-side switch SW_O. The noise suppressor 260 includes a phase difference determination unit 262, an error phase difference determination unit 264, an input-side switch SW_I2, a phase difference correction unit 266, a direction determination unit 268, and a suppression unit 270 that suppresses noise power arriving from the noise angle directions.

The power determination unit 224 of the noise suppressor 220 detects the magnitude of the average noise power of the digital input signal IN1(t) in the non-speech sections identified by the voice activity detection unit 222, and supplies the detected magnitude to the control unit 230. The quantity representing the magnitude of the power is not limited to power; the average amplitude may be used instead.
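As a rough illustration of this measurement, the following Python sketch averages the per-frame power over the frames flagged as non-speech. The function names and the frame and flag representation (lists of samples plus a parallel list of booleans) are assumptions made for the example and are not taken from the patent.

```python
def mean_power(frame):
    """Mean squared sample value of one frame."""
    return sum(x * x for x in frame) / len(frame)


def mean_amplitude(frame):
    """Mean absolute sample value, the alternative measure mentioned in the text."""
    return sum(abs(x) for x in frame) / len(frame)


def average_noise_power(frames, is_speech):
    """Average the frame power over non-speech frames only.

    frames    : list of sample lists (for example, 20 ms each)
    is_speech : parallel list of booleans from a VAD
    Returns None when no non-speech frame is available.
    """
    noise = [mean_power(f) for f, s in zip(frames, is_speech) if not s]
    if not noise:
        return None
    return sum(noise) / len(noise)
```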

The control unit 230 generates a control signal CTL for switching between a noise power suppression mode and a noise direction suppression mode, for each time section (speech or non-speech) and according to the magnitude of the noise (N) power in the non-speech sections. The control unit 230 supplies the control signal CTL to the power determination unit (or power detection and estimation unit) 224, the phase difference determination unit 262, the error phase difference determination unit 264, the input-side switches SW_I1 and SW_I2, and the output-side switch SW_O. The control unit 230 may further supply the control signal CTL to the suppression units 226 and 270, the phase difference correction unit 266, and the direction determination unit 268.

In the noise suppressor 220, the voice activity detection unit (VAD) 222 converts the time-domain digital input signal IN1(t) into a frequency-domain digital input signal, or complex spectrum, IN1(f), for example by a Fourier transform. Based on the characteristics of the resulting power distribution, the voice activity detection unit 222 distinguishes speech sections from non-speech sections (noise sections, silent sections), and supplies the identification information for the speech and non-speech sections to the power determination unit 224, the control unit 230, the phase difference determination unit 262, and the error phase difference determination unit 264.

FIG. 3 shows an example of the power spectrum of a speech section and the power spectrum of a noise section. The power spectrum of a speech section is unevenly distributed and has relatively high regularity (low entropy). The power spectrum of a noise section is roughly uniform over the whole frequency range and has relatively low regularity (high entropy). This difference in distribution is used to distinguish speech sections from non-speech sections. In addition, the sections may be identified by obtaining, for example, the pitch (harmonics) characteristics or the formant distribution characteristics peculiar to speech.
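One common way to quantify the "flat versus peaky" distinction described here is the normalized spectral entropy. The sketch below is an illustration under that assumption; the patent refers only to a known VAD method and does not give a formula, and the function names and the decision threshold of 0.8 are hypothetical.

```python
import math


def spectral_entropy(power_spectrum):
    """Normalized entropy of a power spectrum, in the range 0..1.

    Values near 1 indicate a flat (noise-like) spectrum; values near 0
    indicate power concentrated in a few bins.
    """
    n = len(power_spectrum)
    total = sum(power_spectrum)
    if n < 2 or total <= 0:
        return 0.0
    h = 0.0
    for p in power_spectrum:
        if p > 0:
            q = p / total
            h -= q * math.log2(q)
    return h / math.log2(n)


def is_noise_like(power_spectrum, threshold=0.8):
    """Hypothetical decision rule: high entropy is treated as a noise section."""
    return spectral_entropy(power_spectrum) > threshold
```

A practical detector would usually combine such a measure with the pitch or formant cues mentioned in the text rather than rely on entropy alone.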

In the noise power suppression mode, the power determination unit 224 estimates the noise power distribution from the detected distribution of noise power over the pass band, and supplies the estimated noise power distribution to the suppression unit 226. The power spectrum of the noise section in FIG. 3 may be taken as the estimated distribution of the noise power components over frequency. In the noise power suppression mode, the suppression unit 226 subtracts the estimated noise power components from the input signal IN1(t), thereby suppressing the noise components, and supplies the resulting noise-removed voice signal INns(t) to the switch SW_O.
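The subtraction step can be sketched per frequency bin as follows. This is a simplified illustration: the clamping of negative results to a floor value is an assumption added here (a plain subtraction can produce negative power), and the function name is hypothetical.

```python
def subtract_noise_power(signal_power, noise_power, floor=0.0):
    """Subtract the estimated noise power from the signal power, bin by bin.

    signal_power : power spectrum of the current frame
    noise_power  : estimated noise power spectrum (same length)
    floor        : assumed lower bound applied after subtraction
    """
    return [max(s - n, floor) for s, n in zip(signal_power, noise_power)]
```

For example, `subtract_noise_power([5.0, 1.0, 3.0], [1.0, 2.0, 3.0])` leaves power only in the first bin, where the signal exceeds the noise estimate.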

FIG. 4 shows an example of the spatial arrangement of the two microphones MIC1 and MIC2 with respect to a target sound source SS.
FIG. 5 shows an example of the suppression angle range +α to π to (2π−α) of the microphones MIC1 and MIC2 in FIG. 4.

In FIG. 4, in general, an array of a plurality of microphones MIC1, MIC2, ... is arranged on a straight line with the microphones separated from one another by a known distance d. Here, as a typical example, two adjacent microphones MIC1 and MIC2 are assumed to be arranged on a straight line at a distance d from each other. The distance d between adjacent microphones need only satisfy the sampling theorem.

In FIG. 4, the target sound source SS lies on the straight line connecting the microphones MIC1 and MIC2, directly in front of the microphone MIC1 on its left side, and the direction of the target sound source SS is taken as the sound reception direction, or target direction, of the microphone array MIC1, MIC2. Typically, the sound source SS to be received is the speaker's mouth, and the sound reception direction is the direction of the speaker's mouth. A sound reception angle range (−α to 0 to +α) is provided around the sound reception angle direction (0). The range outside the sound reception angle range, (+α to π to (2π−α)), may be used as the noise suppression angle range.

The distance d between the microphones MIC1 and MIC2 is preferably set to satisfy the condition d < c / fs, where c is the speed of sound and fs is the sampling frequency, so that the sampling theorem is satisfied. The input sound signal received and processed by the microphone array MIC1, MIC2 depends on the incident angle θ (= −α to 0 to +α) of the sound wave with respect to the straight line passing through the microphone array MIC1, MIC2, and does not depend on the radial incident direction (0 to 2π) in the plane perpendicular to that line.
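As a worked example of the spacing condition d < c / fs, the short sketch below evaluates the upper bound for the 8 kHz sampling frequency mentioned earlier. The speed of sound of 340 m/s is an assumed value (the text does not state one), and the function name is hypothetical.

```python
def max_mic_spacing(sound_speed_m_s, sampling_rate_hz):
    """Upper bound on the microphone spacing d, from d < c / fs (in metres)."""
    return sound_speed_m_s / sampling_rate_hz


# Assumed speed of sound: 340 m/s. With fs = 8 kHz the bound is 0.0425 m,
# so the two microphones would be placed less than about 4.25 cm apart.
limit_m = max_mic_spacing(340.0, 8000.0)
```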

The sound of the target sound source SS is detected at the right microphone MIC2 with a delay of τ = d/c relative to the left microphone MIC1. Sound from a source in the direction of angle θ on the left front side is detected at the right microphone MIC2 with a delay of τ = d·sin(π/2−θ)/c relative to the left microphone MIC1 (−π/2 ≤ θ ≤ +π/2). Noise N in the suppression angle range, on the other hand, is detected at the left microphone MIC1 with a delay of τ = d/c relative to the right microphone MIC2. Noise from a source in the direction of angle θ on the right rear side is detected at the left microphone MIC1 with a delay of τ = d·sin(θ−π/2)/c relative to the right microphone MIC2 (+π/2 ≤ θ ≤ +π).
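The two delay formulas can be evaluated directly. The following Python sketch is a minimal illustration; the function names and the default speed of sound of 340 m/s are assumptions made for the example.

```python
import math


def front_delay(d, theta, c=340.0):
    """Delay of MIC2 relative to MIC1 for a source at angle theta on the
    front side (-pi/2 <= theta <= +pi/2): tau = d * sin(pi/2 - theta) / c."""
    return d * math.sin(math.pi / 2 - theta) / c


def rear_delay(d, theta, c=340.0):
    """Delay of MIC1 relative to MIC2 for a source at angle theta on the
    rear side (+pi/2 <= theta <= +pi): tau = d * sin(theta - pi/2) / c."""
    return d * math.sin(theta - math.pi / 2) / c
```

At θ = 0 the front-side delay reduces to d/c, and at θ = π the rear-side delay is also d/c, matching the two limiting cases stated in the text. At θ = π/2 both formulas give zero delay.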

In the noise suppressor 260, the phase difference determination unit 262 obtains the phase difference PD between the time-domain digital input signals IN1(t) and IN2(t). The phase difference PD may be obtained, for each frequency f, from the time difference between the temporal changes in power or amplitude of the two digital input signals IN1(t) and IN2(t). Alternatively, the phase difference PD may be obtained, for each frequency f, from the phases of the two digital input signals IN1(t) and IN2(t) in the frequency domain.

To obtain the phase difference PD, the time-domain digital input signals IN1(t) and IN2(t) may be converted into frequency-domain digital input signals, or complex spectra, IN1(f) and IN2(f), for example by a Fourier transform.

To process the input signals in the frequency domain, the time-domain digital input signals IN1(t) and IN2(t) in the input buffer memories 212 and 214 are each supplied to a fast Fourier transformer (FFT) of the digital signal processor 200. In a known manner, the fast Fourier transformer multiplies each signal section of the digital input signals IN1(t) and IN2(t) by an overlapping window function and applies a Fourier transform or orthogonal transform to the product, generating the frequency-domain complex spectra IN1(f) and IN2(f). Here, $\mathrm{IN1}(f) = A_1 e^{j(2\pi f t + \phi_1(f))}$ and $\mathrm{IN2}(f) = A_2 e^{j(2\pi f t + \phi_2(f))}$, where $f$ is the frequency, $A_1$ and $A_2$ are amplitudes, $j$ is the imaginary unit, and $\phi_1(f)$ and $\phi_2(f)$ are delay phases that are functions of the frequency $f$. As the overlapping window function, for example, a Hamming window, a Hanning window, a Blackman window, a 3-sigma Gaussian window, or a triangular window can be used.
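The windowing and transform step can be sketched in pure Python as below. Two simplifications are assumed for readability: a direct O(N²) discrete Fourier transform stands in for the fast Fourier transformer, and only a single frame is processed, so the overlap between successive frames is not shown. The function names are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_spectrum(frame):
    """Multiply one frame by a Hamming window and return the complex
    spectrum for bins 0 .. n//2 (a naive DFT, used here instead of an FFT)."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, hamming(n))]
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]
```

Applying `windowed_spectrum` to the buffered frames of IN1(t) and IN2(t) yields complex values that play the role of IN1(f) and IN2(f) in the text.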

The phase difference determination unit 262 obtains the phase difference PD(f) (in radians) of the phase spectrum components, which indicates the sound source direction for each frequency f (0 < f < fs/2), between the two adjacent microphones MIC1 and MIC2 separated by the distance d, using the following equation.

$$
\mathrm{PD}(f) = \tan^{-1}\!\left(\frac{\mathrm{IN1}(f)}{\mathrm{IN2}(f)}\right)
= \tan^{-1}\!\left(\frac{J\{\mathrm{IN1}(f)/\mathrm{IN2}(f)\}}{R\{\mathrm{IN1}(f)/\mathrm{IN2}(f)\}}\right)
$$

Here, it is approximated that the speech or noise at a specific frequency f comes from only one sound source. J{x} represents the imaginary component of the complex number x, and R{x} represents the real component of the complex number x.
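The per-frequency phase difference can be computed from the two complex spectrum values as in the following sketch. It uses `math.atan2` on the imaginary and real parts of the ratio rather than a plain arctangent of their quotient, which is a choice made here to keep the correct quadrant and to avoid division by a zero real part. The function name and the handling of a zero denominator are assumptions.

```python
import math


def phase_difference(in1_f, in2_f):
    """Phase difference PD(f) in radians between two complex spectrum values.

    Computes the angle of IN1(f) / IN2(f) from its imaginary and real parts.
    Returns 0.0 when IN2(f) is zero (an assumed convention for empty bins).
    """
    if in2_f == 0:
        return 0.0
    ratio = in1_f / in2_f
    return math.atan2(ratio.imag, ratio.real)
```

For two values with phases 0.9 rad and 0.4 rad, the result is 0.5 rad regardless of their amplitudes, since the amplitude ratio cancels in the angle.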

Expressed in terms of the delay phases (φ1(f), φ2(f)) of the digital input signals IN1(t) and IN2(t), this phase difference PD(f) becomes:

$$
\begin{aligned}
\mathrm{PD}(f)
&= \tan^{-1}\!\left(\frac{J\{A_1 e^{j(2\pi f t + \phi_1(f))} / A_2 e^{j(2\pi f t + \phi_2(f))}\}}{R\{A_1 e^{j(2\pi f t + \phi_1(f))} / A_2 e^{j(2\pi f t + \phi_2(f))}\}}\right) \\
&= \tan^{-1}\!\left(\frac{J\{(A_1/A_2)\, e^{j(\phi_1(f) - \phi_2(f))}\}}{R\{(A_1/A_2)\, e^{j(\phi_1(f) - \phi_2(f))}\}}\right) \\
&= \tan^{-1}\!\left(\frac{J\{e^{j(\phi_1(f) - \phi_2(f))}\}}{R\{e^{j(\phi_1(f) - \phi_2(f))}\}}\right) \\
&= \tan^{-1}\!\left(\frac{\sin(\phi_1(f) - \phi_2(f))}{\cos(\phi_1(f) - \phi_2(f))}\right) \\
&= \tan^{-1}\!\bigl(\tan(\phi_1(f) - \phi_2(f))\bigr) \\
&= \phi_1(f) - \phi_2(f)
\end{aligned}
$$

FIGS. 6 and 7 show examples of the phase difference PD and the error phase difference ΔPD of the input signals at a certain frequency f, as a function of the angular direction θ of the sound received from a sound source. In FIGS. 6 and 7, the theoretical phase difference is shown by a solid line and the measured phase difference by a broken line.

In FIGS. 6 and 7, for sound source angular directions θ = 0 to +π/2 and +3π/2 to 2π, the phase difference PD has a positive value, indicating a leading phase. For sound source angular directions θ = +π/2 to +3π/2, the phase difference PD has a negative value, indicating a lagging phase. At sound source angular directions θ = +π/2 and +3π/2, the phase difference PD is 0, indicating that the signals are in phase.

In FIG. 6, because of variations in the sensitivity of the microphones MIC1 and MIC2, the measured phase difference PD between the input signals IN1(t) and IN2(t) may be lower than the desired phase difference threshold Dth corresponding to the angle boundary α of the suppression angle range. In this case, the input signal IN1(t) whose phase difference PD is lower than the threshold Dth is undesirably suppressed in its entirety.

In FIG. 7, because of variations in the sensitivity of the microphones MIC1 and MIC2, the measured phase difference PD between the input signals IN1(t) and IN2(t) may be higher than the phase difference threshold Dth corresponding to the angle boundary α of the suppression angle range. In this case, the suppression angle range becomes undesirably narrow; that is, for the input signal IN1(t) whose phase difference PD is higher than the threshold Dth, the suppression angle range is narrow and noise suppression is insufficient.

In a speech section determined by the voice activity detection unit 222, on the other hand, the sound is expected to arrive from the direction of the target sound source SS, so the phase difference PD should take its theoretical maximum positive value. The difference between this measured phase difference PD and the theoretical phase difference PD can therefore be regarded as the error phase difference ΔPD.

The phase difference determination unit 262 supplies, to the error phase difference determination unit 264 and the phase difference correction unit 266, the value of the phase difference PD between the two adjacent input signals IN1(t) and IN2(t) for each frequency f, or the phase difference PD at the frequency of maximum power, or the phase difference PD at the upper limit frequency (for example, 3.8 kHz).

In the noise power suppression mode, the error phase difference determination unit 264 obtains, in the speech sections determined by the voice activity detection unit 222, the difference between the theoretical phase difference PD and the measured phase difference PD from the phase difference determination unit 262 as the error phase difference ΔPD. The error phase difference determination unit 264 holds the error phase difference ΔPD obtained in the noise power suppression mode and supplies it to the phase difference correction unit 266 in the subsequent noise direction suppression mode. In the noise direction suppression mode, the phase difference correction unit 266 corrects or compensates the phase difference PD from the phase difference determination unit 262 by the error phase difference ΔPD from the error phase difference determination unit 264, generating a corrected phase difference PDc.
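The calibrate-then-correct sequence can be sketched as follows. Several points are assumptions made for illustration: the sign convention (error = theoretical minus measured, correction = measured plus error) is not spelled out in the text; the theoretical phase difference for the target direction is taken as 2πf·d/c, which follows from the delay τ = d/c stated earlier; and all function names and the default speed of sound are hypothetical.

```python
import math


def theoretical_pd(freq_hz, d, c=340.0):
    """Phase difference at frequency f for a source on the array axis,
    derived from the delay tau = d / c as 2 * pi * f * tau."""
    return 2 * math.pi * freq_hz * d / c


def error_phase_difference(theoretical, measured):
    """Error phase difference (assumed sign: theoretical minus measured),
    obtained during speech sections in the noise power suppression mode."""
    return theoretical - measured


def correct_phase_difference(measured, error):
    """Corrected phase difference PDc, applied in the noise direction
    suppression mode using the stored error."""
    return measured + error
```

With this convention, a measured value in a later frame is shifted by the same offset that was observed when the source direction was known, which is the intent of holding ΔPD across the mode switch.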

In the noise direction suppression mode, the direction determination unit 268 evaluates the corrected phase difference PDc against a threshold Dth. It determines a time section of the input signal whose phase difference PDc is equal to or less than the desired threshold Dth, which corresponds to the angle boundary α of the suppression angle range, to be a noise section. It determines a time section of the input signal whose phase difference PDc is greater than the threshold Dth to be a speech section. In the noise direction suppression mode, the direction determination unit 268 supplies identification information of the speech section or noise section to the suppression unit 270. In the noise direction suppression mode, the suppression unit 270 suppresses the input signal IN1(t) or IN1(f) in a noise section by attenuating it, and passes the input signal IN1(t) or IN1(f) in a speech section. The suppression unit 270 then supplies the noise-suppressed sound signal INns(t) to the switch SW_O (T1).

The noise suppressors 220 and 260 may perform this noise suppression processing in the frequency domain, frequency by frequency. In that case, the suppression units 226 and 270 of the noise suppressors 220 and 260 further convert the processed frequency-domain digital input signal INns(f) back into a time-domain digital sound signal INns(t), for example by an inverse Fourier transform, and supply the noise-suppressed digital sound signal INns(t) to the switch SW_O.

The output digital sound signal INns(t) is used, for example, for speech recognition or for a mobile phone call. The digital sound signal INns(t) is supplied to a subsequent application 400. There it is, for example, converted by a digital-to-analog converter 404 and low-pass filtered by a low-pass filter 406 to generate an analog signal, or stored in a memory 414 and used for speech recognition by a speech recognition unit 416. The speech recognition unit 416 may be a processor implemented as hardware, or a processor implemented as software that operates according to a program stored in the memory 414, which includes, for example, a ROM and a RAM.

The digital signal processor 200 may be a signal processing circuit implemented as hardware, or a signal processing circuit implemented as software that operates according to a program stored in a memory 202, which includes, for example, a ROM and a RAM.

FIGS. 8A to 8C show an example of a flowchart for noise suppression executed by the two noise suppressors 220 and 260 and the control unit 230.

Referring to FIG. 8A, in step 802 the speech section detection unit 222 receives from the input buffer memory 212 the digital input signal IN1(t) of the current time section, taken from the microphone MIC1.

In step 804, the speech section detection unit 222 determines, by a VAD (Voice Activity Detection) method, whether the digital input signal IN1(t) of that time section is a speech section or a non-speech (noise) section. The speech section detection unit 222 supplies identification information of the speech section and/or non-speech section (noise section) for that time section of the digital input signal IN1(t) to the control unit 230 and the power determination unit 224.
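The source does not say which VAD method step 804 uses. As a minimal sketch only, the following assumes a simple energy-threshold VAD; the names `frame_power`, `classify_frames` and `vad_threshold` are hypothetical and do not come from the patent.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame (time section) of samples."""
    return sum(s * s for s in frame) / len(frame)


def classify_frames(frames, vad_threshold):
    """Label each frame 'V' (speech) or 'N' (non-speech / noise).

    Assumption: a frame counts as speech when its mean power exceeds
    vad_threshold. Practical VADs use more robust features.
    """
    return ["V" if frame_power(f) > vad_threshold else "N" for f in frames]


# Two illustrative frames of IN1(t): a quiet one and a loud one.
frames = [[0.01, -0.02, 0.01, 0.0], [0.5, -0.6, 0.4, -0.5]]
labels = classify_frames(frames, vad_threshold=0.01)
```

The resulting labels play the role of the identification information that unit 222 passes to the control unit 230 and the power determination unit 224.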

In step 806, the power determination unit 224 retrieves the digital input signal IN1(t) from the input buffer memory 212 and detects the magnitude of the noise power (or amplitude) of the digital input signal IN1(t) in the determined noise section. The power determination unit 224 supplies the detected magnitude of the noise power to the control unit 230.

In step 808, the control unit 230 determines whether the magnitude of the noise power is greater than a threshold Pth for noise suppression. The magnitude of the noise power may be, for example, the average value of the power or amplitude. The threshold Pth may be set to a value sufficiently smaller than the desired speech power, or to a value higher than the expected typical background noise. If the magnitude of the noise power is determined to be greater than the threshold Pth, the procedure proceeds to step 832 of FIG. 8C. If it is determined to be equal to or less than the threshold Pth, the procedure proceeds to step 812 of FIG. 8B.

When the magnitude of the noise power of the input signal IN1(t) in the noise section is not greater than the threshold Pth, the speech quality of the input signal can be improved by suppressing the input signal in the noise section and by subtracting the noise power component from the input signal power that contains the speech power component. When the magnitude of the noise power of the input signal IN1(t) in the noise section is greater than the threshold Pth, or is not sufficiently smaller than the speech component, it is difficult to distinguish speech from noise by power. In that case, subtracting the noise power component from the input signal power that contains the speech power component may leave too little speech power and degrade the speech quality of the input signal. It is therefore preferable to suppress noise based on the phase difference between the two input signals, that is, on the sound source direction.
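The decision of steps 806 and 808 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names, the `'N'`/`'V'` labels and the mode strings are hypothetical, and the noise power is taken as the average of the per-frame mean powers, which is one of the options the text allows.

```python
def mean_noise_power(frames, labels):
    """Average power over the frames labelled 'N' (noise sections).

    Returns None when no noise frame is available.
    """
    noise = [f for f, lab in zip(frames, labels) if lab == "N"]
    if not noise:
        return None
    powers = [sum(s * s for s in f) / len(f) for f in noise]
    return sum(powers) / len(powers)


def select_mode(noise_power, p_th):
    """Step 808: choose the suppression mode from the noise power.

    'direction' stands for the noise direction suppression mode (FIG. 8C),
    'power' for the noise power suppression mode (FIG. 8B).
    """
    return "direction" if noise_power > p_th else "power"


frames = [[0.1, -0.1], [1.0, -1.0], [0.3, 0.3]]
labels = ["N", "V", "N"]
noise_power = mean_noise_power(frames, labels)
mode = select_mode(noise_power, p_th=0.02)
```

A noise power exactly equal to Pth selects the power mode, matching the "equal to or less than" branch of step 808.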

Referring to FIG. 8B, in step 812 the control unit 230 sets the noise suppressor 220 to the noise power suppression mode for at least the current noise section, or for a time period that includes the current noise section and the subsequent speech section. To do so, the control unit 230 sets the switch SW_I1 of the noise suppressor 220 to the on state (T1) and sets the switch SW_I2 of the noise suppressor 260 to the off state (T1).

In step 814, the power determination unit 224 estimates the noise power from the noise power detected in the determined noise section and supplies the estimated noise power to the suppression unit 226. The estimated noise power may be, for example, the average value of the detected noise power.

In step 816, the suppression unit 226 suppresses the noise power (component) in the noise section by attenuating the power of the digital input signal IN1(t) to zero (0). Alternatively, the power of the input signal IN1(t) may be attenuated by a certain ratio, for example to 1/10, which reduces the risk of erasing speech that is buried in the noise. The suppression unit 226 also suppresses the noise component in the speech section by subtracting the estimated noise power (component) at each frequency from the digital input signal IN1(t).
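Step 816 can be illustrated with a small sketch that works on a per-frequency power spectrum. The function name, the `'N'`/`'V'` labels and the `noise_gain` parameter are hypothetical, and clamping the subtraction result at zero is an added assumption, since the source does not say how negative differences are handled.

```python
def suppress_power_mode(power_spectrum, noise_power_spectrum, label,
                        noise_gain=0.0):
    """Noise power suppression for one time section.

    power_spectrum:       input power at each frequency
    noise_power_spectrum: estimated noise power at each frequency
    label:                'N' for a noise section, 'V' for a speech section
    noise_gain:           power ratio kept in a noise section
                          (0.0 = attenuate to zero, 0.1 = attenuate to 1/10)
    """
    if label == "N":
        # Noise section: attenuate the whole section.
        return [p * noise_gain for p in power_spectrum]
    # Speech section: subtract the estimated noise power per frequency.
    # Clamping at zero is an assumption, not stated in the source.
    return [max(p - n, 0.0)
            for p, n in zip(power_spectrum, noise_power_spectrum)]


speech_out = suppress_power_mode([4.0, 1.0, 0.5], [1.0, 1.0, 1.0], "V")
noise_out = suppress_power_mode([4.0, 1.0, 0.5], [1.0, 1.0, 1.0], "N",
                                noise_gain=0.1)
```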

In step 818, the phase difference determination unit 262 receives the identification information of the speech section (V) and the non-speech section (N) from the speech section detection unit 222 and determines whether a speech section (V) has been detected. If a speech section has been detected, the procedure proceeds to step 820. If no speech section has been detected, the procedure proceeds to step 824.

In step 820, the phase difference determination unit 262 retrieves the digital input signals IN1(t) and IN2(t) of the speech section from the input buffer memories 212 and 214 and obtains the phase difference between the digital input signals IN1(t) and IN2(t) in the speech section. In a speech section, the microphones MIC1 and MIC2 are assumed to pick up sound from the target sound source SS in the direction θ = 0.

In step 822, the error phase difference determination unit 264 compares the determined phase difference PD between the digital input signals IN1(t) and IN2(t) with the corresponding theoretical phase difference PD for sound from the target sound source direction, and obtains the error (difference) between them. For example, the error phase difference ΔPD may be obtained by subtracting the theoretical phase difference PD from the determined phase difference PD. The error phase difference determination unit 264 supplies the determined error to the phase difference correction unit 266 as the error phase difference ΔPD. This error phase difference ΔPD can be used later in step 836.
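Steps 820 and 822 might be sketched as below for a single frequency. The sketch assumes the two input signals are already available as complex spectral values at that frequency and that the phase difference is taken as the phase of IN1(f) relative to IN2(f). This sign convention and the function names are assumptions, and the theoretical phase difference is passed in as a parameter rather than derived.

```python
import cmath


def phase_difference(x1, x2):
    """Phase of x1 relative to x2, in radians, within (-pi, pi].

    x1, x2: complex spectral values of IN1 and IN2 at the same frequency.
    The sign convention (x1 relative to x2) is an assumption.
    """
    return cmath.phase(x1 * x2.conjugate())


def error_phase_difference(measured_pd, theoretical_pd):
    """Step 822: delta_PD = measured PD - theoretical PD."""
    return measured_pd - theoretical_pd


# Illustrative values: IN1 leads IN2 by 0.7 rad at this frequency,
# while the (assumed) theoretical value for the target direction is 0.6 rad.
x1 = cmath.rect(1.0, 0.9)
x2 = cmath.rect(1.0, 0.2)
pd = phase_difference(x1, x2)
delta_pd = error_phase_difference(pd, theoretical_pd=0.6)
```

The value `delta_pd` corresponds to what unit 264 holds for later use in step 836.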

In step 824, the control unit 230 connects the output-side switch SW_O to the output of the suppression unit 226 of the noise suppressor 220 and outputs the noise-suppressed output sound signal INns(t) produced by the suppression unit 226 in step 816.

In step 852, the speech section detection unit 222 determines whether the input buffer memory 212 holds a digital input signal IN1(t) for a time section that is currently to be processed. If such a digital input signal IN1(t) is present, the procedure returns to step 814. If no such digital input signal is present, the procedure exits the routine of FIGS. 8A to 8C. The flowchart of FIGS. 8A to 8C is executed again for the digital input signal of a new time section.

Steps 814 to 824 and 852 of FIG. 8B may be repeated until the determination of step 808 of FIG. 8A is next made and step 832 of FIG. 8C is subsequently executed.

In this way, when the magnitude of the noise power of the digital input signal in the noise section is not greater than the threshold Pth, the noise suppressor 220 suppresses noise based on the noise power.

Referring to FIG. 8C, in step 832 the control unit 230 sets the noise suppressor 260 to the noise direction suppression mode for at least the current noise section, or for a time period that includes the current noise section and the subsequent speech section. To do so, the control unit 230 sets the switch SW_I2 of the noise suppressor 260 to the on state (T2) and sets the switch SW_I1 of the noise suppressor 220 to the off state (T2).

In step 834, the phase difference determination unit 262 retrieves the digital input signals IN1(t) and IN2(t) from the input buffer memories 212 and 214 and obtains the phase difference PD between the digital input signals IN1(t) and IN2(t) in the time section, which may be a noise section or a speech section.

In step 836, the phase difference correction unit 266 corrects or compensates the phase difference PD obtained by the phase difference determination unit 262, using the error phase difference ΔPD received from the error phase difference determination unit 264 in step 822, and generates the corrected phase difference PDc (FIGS. 6 and 7). To do so, it may subtract from the obtained phase difference PD either the error phase difference ΔPD or a certain ratio α of it (for example, α = 1 or 0.8), that is, PDc = PD − ΔPD × α. Using a ratio smaller than 1 limits the effect of any error in the correction itself.
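The correction of step 836 is a single subtraction and can be written as follows. The function name and the parameter name `ratio` are hypothetical; `ratio` stands for the ratio the text denotes by α in this step, renamed here to avoid confusion with the angle boundary α.

```python
def correct_phase_difference(pd, delta_pd, ratio=1.0):
    """Step 836: PDc = PD - delta_PD * ratio.

    pd:       measured phase difference in the current time section
    delta_pd: error phase difference held from the noise power mode
    ratio:    fraction of the error to remove (e.g. 1.0 or 0.8)
    """
    return pd - delta_pd * ratio


pd_c_full = correct_phase_difference(0.75, 0.1)          # ratio = 1
pd_c_partial = correct_phase_difference(0.75, 0.1, 0.8)  # ratio = 0.8
```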

In step 838, the direction determination unit 268 determines, based on the corrected phase difference PDc, whether the direction of the sound source in the time section lies spatially within the sound reception angle range (−α to 0 to +α), or equivalently whether the corrected phase difference PDc exceeds the threshold Dth. Each time section here may be one speech section or noise section, or a shorter time section. When the phase difference PDc exceeds the threshold Dth, the direction determination unit 268 determines that the direction of the sound source is within the sound reception angle range (−α to 0 to +α). When the phase difference PDc does not exceed the threshold Dth, it determines that the direction of the sound source is within the suppression angle range (+α to π to (2π − α)).

In steps 840 to 844, the suppression unit 270 passes the digital input signal IN1(t) when the direction of the sound source is within the sound reception angle range (−α to 0 to +α), that is, when the signal has a phase difference corresponding to that range. When the direction of the sound source is within the suppression angle range (+α to π to (2π − α)), that is, when the phase difference PDc corresponds to that range, the suppression unit 270 attenuates the power of the digital input signal IN1(t) to zero (0). In this way, the power of the digital input signal IN1(t) is suppressed in time sections in which the direction of the sound source is within the suppression angle range (+α to π to (2π − α)).

In step 840, the suppression unit 270 determines whether the determined direction of the sound source is within the sound reception angle range (−α to 0 to +α), or whether the phase difference PDc corresponds to the suppression angle range (+α to π to (2π − α)). If the direction is determined to be within the sound reception angle range (−α to 0 to +α), the procedure proceeds to step 842. If it is determined not to be within the sound reception angle range, the procedure proceeds to step 844.

In step 842, the suppression unit 270 passes the digital input signal IN1(t). The procedure then proceeds to step 850.

In step 844, the suppression unit 270 suppresses the digital input signal IN1(t) by attenuating it to zero (0). Alternatively, the power of the input signal IN1(t) may be attenuated by a certain ratio, for example to 1/10, which reduces the risk of erasing speech that is buried in the noise. The procedure then proceeds to step 850.
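Steps 838 to 844 amount to a threshold test on PDc followed by pass-through or attenuation. The sketch below is illustrative only; the function names and the `amplitude_gain` parameter are hypothetical. Because the gain is applied to samples, it is an amplitude gain, so attenuating the power to 1/10 as in the text would correspond to an amplitude gain of about 0.316 (the square root of 0.1).

```python
def in_reception_range(pd_c, d_th):
    """Step 838/840: the source is in the reception range when PDc > Dth."""
    return pd_c > d_th


def suppress_direction_mode(frame, pd_c, d_th, amplitude_gain=0.0):
    """Steps 842/844: pass the frame, or attenuate it.

    frame:          samples of IN1(t) in the current time section
    pd_c:           corrected phase difference for that section
    d_th:           threshold corresponding to the angle boundary
    amplitude_gain: gain applied in the suppression range
                    (0.0 = attenuate to zero)
    """
    if in_reception_range(pd_c, d_th):
        return list(frame)                      # step 842: pass through
    return [s * amplitude_gain for s in frame]  # step 844: suppress


passed = suppress_direction_mode([0.5, -0.5], pd_c=0.7, d_th=0.5)
suppressed = suppress_direction_mode([0.5, -0.5], pd_c=0.3, d_th=0.5)
```

A section whose PDc equals Dth is suppressed, consistent with "does not exceed the threshold" in step 838.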

In step 850, the control unit 230 connects the output-side switch SW_O to the suppression unit 270 of the noise suppressor 260 and outputs the noise-suppressed output sound signal INns(t) from the suppression unit 270. The procedure then proceeds to step 852.

Step 852 is the same as in FIG. 8B. If it is determined that there is a digital input signal IN1(t) for a time section that is currently to be processed, the procedure returns to step 834 of FIG. 8C. The flowchart of FIGS. 8A to 8C is executed again for the digital input signal of a new time section.

Steps 834 to 850 and 852 of FIG. 8C may be repeated until the determination of step 808 of FIG. 8A is next made and step 812 of FIG. 8B is subsequently executed.

Thus, when the magnitude of the noise power of the digital input signal in the noise section is greater than the threshold Pth, the noise suppressor 260 suppresses noise based on the phase difference between the two digital input signals, that is, on the sound source direction.

In this way, the noise-suppressed output sound signal INns(t) from the suppression unit 226 or the suppression unit 270 is output and supplied to the application unit 400.

All examples and conditional language given here are intended to help the reader understand the invention and the concepts contributed by the inventors to furthering the art, and are to be construed without limitation to the specifically recited examples and conditions. The organization of such examples in the specification does not relate to showing the superiority or inferiority of the invention. Although embodiments of the invention have been described in detail, various changes, substitutions and alterations can be made to them without departing from the spirit and scope of the invention.

200 Digital signal processor
220 Noise suppressor for one microphone
222 Speech section detection unit (VAD)
224 Power determination unit
226 Suppression unit
230 Control unit
260 Noise suppressor for two microphones
262 Phase difference determination unit
264 Error phase difference determination unit
266 Phase difference correction unit
268 Direction determination unit
270 Suppression unit
SW_I1, SW_I2 Input-side switches
SW_O Output-side switch

Claims

A sound signal processing method in an information processing apparatus, the method comprising:
determining a speech section and a noise section of a first input sound signal, of first and second input sound signals, in a certain time period;
determining the magnitude of the power of the first input sound signal in the noise section;
determining whether the magnitude of the power of the first input sound signal in the noise section is greater than a first threshold;
when the magnitude of the power of the first input sound signal in the noise section is determined not to be greater than the first threshold, suppressing, by a first suppression unit, noise in the speech section and the noise section of the first input sound signal based on the determined magnitude of the power in the noise section;
when the magnitude of the power of the first input sound signal in the noise section is determined not to be greater than the first threshold, obtaining a phase difference between the first input sound signal and the second input sound signal in the speech section, and obtaining an error between a theoretical phase difference between the first input sound signal and the second input sound signal and the obtained phase difference; and
when the magnitude of the power of the first input sound signal in the noise section is determined to be greater than the first threshold, correcting the phase difference between the first input sound signal and the second input sound signal according to the obtained error, and suppressing, by a second suppression unit, noise of the first input sound signal according to the corrected phase difference.

The sound signal processing method according to claim 1, wherein the step of suppressing noise of the first input sound signal according to the corrected phase difference includes:
determining whether the phase difference between the first input sound signal and the second input sound signal is greater than a second threshold; and
supplying the first input sound signal to an output unit when the phase difference is determined to be greater than the second threshold, and suppressing, by the second suppression unit, noise of the first input sound signal when the phase difference is not greater than the second threshold.

A sound signal processing apparatus comprising:
first and second sound signal input units that receive first and second input sound signals, respectively;
a section determination unit that determines a speech section and a noise section of the first input sound signal received from the first sound signal input unit in a certain time period;
a power determination unit that determines the magnitude of the power of the first input sound signal in the noise section;
a control unit that determines whether the magnitude of the power of the first input sound signal in the noise section is greater than a first threshold;
a first suppression unit that, when the magnitude of the power of the first input sound signal in the noise section is determined not to be greater than the first threshold, suppresses noise in the speech section and the noise section of the first input sound signal based on the determined magnitude of the power in the noise section;
an error phase difference determination unit that, when the magnitude of the power of the first input sound signal in the noise section is determined not to be greater than the first threshold, obtains a phase difference between the first input sound signal and the second input sound signal in the speech section, and obtains an error between a theoretical phase difference between the first input sound signal and the second input sound signal and the obtained phase difference; and
a second suppression unit that, when the magnitude of the power of the first input sound signal in the noise section is determined to be greater than the first threshold, corrects the phase difference between the first input sound signal and the second input sound signal according to the obtained error, and suppresses noise of the first input sound signal according to the corrected phase difference.

The sound signal processing apparatus according to claim 3, wherein the second suppression unit:
determines whether the phase difference between the first input sound signal and the second input sound signal is greater than a second threshold; and
supplies the first input sound signal to an output unit when the phase difference is determined to be greater than the second threshold, and suppresses noise of the first input sound signal when the phase difference is not greater than the second threshold.