WO2016190460A1 - Method and device for three-dimensional (3D) sound reproduction - Google Patents
Method and device for three-dimensional (3D) sound reproduction
- Publication number
- WO2016190460A1 (PCT/KR2015/005253)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- speakers
- group
- signal
- speaker
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
- H04S5/02—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation of the pseudo four-channel type, e.g. in which rear channel signals are derived from two-channel stereo signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- the present invention relates to a method and an apparatus for reproducing stereo sound, and more particularly, to a method and an apparatus for positioning a virtual sound source at a predetermined position using a plurality of speakers.
- Stereo sound is a technique that arranges a plurality of speakers at different positions on a horizontal plane and outputs the same or different sound signals from each speaker so that the listener feels a sense of space.
- the sweet spot of a home theater configuration may be limited to its center, and reflected-sound techniques using a sound bar may be affected by the characteristics of the room. Accordingly, there is a need for a three-dimensional audio rendering method that uses a plurality of speakers, is not affected by the characteristics of the room, and is not restricted to a fixed sweet spot position.
- An apparatus and method for reproducing stereo sound for providing a stereoscopic sense and a spatial sense to a listener may be provided.
- the present invention also provides a computer-readable recording medium having recorded thereon a program for executing the method on a computer.
- the technical problem to be achieved by the present embodiment is not limited to the technical problem as described above, and other technical problems may be inferred from the following embodiments.
- FIG. 1 illustrates a stereoscopic sound reproduction environment of a listener according to an embodiment.
- FIG. 2 illustrates a stereoscopic sound reproducing apparatus according to an embodiment.
- FIG. 3 illustrates a stereoscopic sound reproducing apparatus using a wave field synthesis rendering method.
- FIG. 4A shows a stereoscopic sound reproducing apparatus that renders using a minimum error summing method.
- FIG. 4B is a view showing a virtual sound source and arbitrary points within the sweet spot in the stereoscopic sound reproduction environment of FIG. 1.
- FIG. 5 shows a 3D sound reproducing apparatus that performs rendering for high-altitude reproduction.
- FIG. 6A illustrates a stereoscopic sound reproducing apparatus for tracking a listener's head position according to an embodiment.
- FIG. 6B illustrates a change in a sweet spot of a stereoscopic sound reproduction environment according to an embodiment.
- FIG. 7 is a flowchart of a method of reproducing stereoscopic sound, according to an exemplary embodiment.
- FIG. 8 shows a flowchart of a further embodiment of the method for the stereophonic reproduction apparatus to reproduce stereoscopic sound.
- a stereoscopic sound reproducing method includes grouping a plurality of speakers into a group, receiving a sound signal, positioning one or more virtual sound sources of the sound signal at a predetermined position by using the grouped speakers, and reproducing the virtual sound source through the plurality of speakers.
- the grouping of the plurality of speakers into a group may include placing a speaker constituting one home theater system and a separate loudspeaker not constituting the home theater system in the one group.
- the home theater system may be a loudspeaker array in which a plurality of loudspeakers are linearly connected.
- Grouping the plurality of speakers may include connecting the plurality of physically separated speakers through a wireless or wired network.
- the positioning of the virtual sound source at a predetermined position may include positioning the virtual sound image at a predetermined position by a sound field synthesis rendering method.
- the positioning of the virtual sound source at the predetermined position may include determining a sound pressure signal for each speaker included in the group that minimizes the difference between a first sound pressure generated in the sweet spot by the speakers included in the group and a second sound pressure generated in the sweet spot by the virtual sound source at the predetermined position, and modulating the received sound signal based on the determined sound pressure signal for each speaker.
- the determining of the sound pressure signal for each speaker included in the group may include determining an impulse response to be applied to each speaker included in the group, and the modulating of the received sound signal may include convolving the determined impulse response with the sound signal input for each speaker included in the group.
- the positioning of the virtual sound image at a predetermined position may include passing the received sound signal through a filter corresponding to a predetermined altitude, replicating the filtered sound signal to generate a plurality of sound signals, and performing at least one of amplification, attenuation, and delay on each of the replicated acoustic signals based on at least one of a gain value and a delay value corresponding to each of the speakers to which the replicated acoustic signals are to be output.
- the method may further include tracking the position of the listener's head in real time, and the positioning of the virtual sound source at the predetermined position may include changing the gain and phase delay values of at least one of the speakers included in the group based on the tracked position of the listener's head.
- an apparatus for reproducing stereo sound includes a grouping unit for grouping a plurality of speakers into a group, a receiving unit for receiving a sound signal, a rendering unit for positioning one or more virtual sound sources of the sound signal at a predetermined position by using the grouped speakers, and a reproducing unit for reproducing the virtual sound source through the plurality of speakers.
- the grouping unit may place a speaker constituting one home theater system and a separate loudspeaker not constituting the home theater system in the one group.
- the home theater system may be a loudspeaker array in which a plurality of loudspeakers are linearly connected.
- the grouping unit may connect the plurality of physically separated speakers through a wireless or wired network.
- the rendering unit may position the virtual sound image at a predetermined position from the received sound signal by a sound field synthesis rendering method.
- the rendering unit may determine a sound pressure signal for each speaker included in the group that minimizes the difference between the first sound pressure generated in the sweet spot by the speakers included in the group and the second sound pressure generated in the sweet spot by the virtual sound source existing at the predetermined position, and may modulate the received sound signal based on the determined sound pressure signal for each speaker.
- the rendering unit may determine an impulse response to be applied to each speaker included in the group, and may convolve the determined impulse response with the acoustic signal input for each speaker included in the group.
- the rendering unit may include a filtering unit which passes the input sound signal through a filter corresponding to a predetermined altitude, a copying unit which generates a plurality of sound signals by copying the filtered sound signal, and an amplifying unit configured to perform at least one of amplification, attenuation, and delay on each of the copied sound signals based on at least one of a gain value and a delay value corresponding to each of the speakers to which the copied sound signals are to be output.
- the apparatus may further include a listener tracking unit configured to track the position of the listener's head in real time, and the rendering unit may change the gain and phase delay values of at least one of the speakers included in the group based on the tracked position of the listener's head.
- a computer readable recording medium having recorded thereon a program for executing the stereo sound reproduction method on a computer may be provided.
- the term "unit" or "part" as used herein refers to a hardware component or circuit, such as an FPGA or an ASIC.
- FIG. 1 illustrates a stereoscopic sound reproduction environment of a listener according to an embodiment.
- the stereoscopic reproduction environment 100 is an example of an environment in which the listener 110 listens to stereoscopic sound through the stereoscopic sound reproducing apparatus 200, which will be described later.
- the stereoscopic playback environment 100 is an environment for the playback of audio content, alone or together with other content such as video, and may mean any open, partially closed, or completely closed area, such as a room that can be realized in a home, a cinema, a theater, an auditorium, a studio, a game room, or the like.
- the listener 110 may enjoy multimedia content through a multimedia player 140 such as a television or an audio system.
- the listener 110 of the stereoscopic sound reproduction environment 100 listens to the sound of the content played on the television through the plurality of speakers 145, 160, and 165.
- the television 140 may have a built-in speaker, but the stereoscopic sound reproduction environment 100 may include a separate home theater system.
- a separate sound bar 145 may be present directly below the television 140.
- the sound bar 145 may be a speaker array module including a plurality of loudspeakers.
- the sound bar 145 may virtually reproduce a multi-channel audio signal in the stereo sound reproduction environment 100 by using a three-dimensional sound field processing technique such as panning, wave field synthesis, beamforming, focused source, or head related transfer function (HRTF) processing.
- although FIG. 1 shows the sound bar 145 as a single horizontal linear array positioned at the bottom of the television 140, the sound bar 145 may also be configured as a dual horizontal linear array installed above and below the television 140 to provide a sense of elevation, as a dual vertical linear array positioned to the left and right of the television 140, or as a window-type array surrounding the television 140.
- the sound bar 145 may be installed in a form surrounding the listener 110 or positioned in front of and behind the listener 110.
- the stereo sound reproduction environment 100 may include a home theater speaker (not shown) other than the sound bar 145, and need not necessarily include a home theater speaker such as the sound bar 145.
- the listener 110 may include a speaker that constitutes one home theater system and a separate loudspeaker that does not constitute the home theater system in one group to enjoy stereoscopic sound through a plurality of speakers included in the group.
- the listener 110 may combine the separate loudspeakers 160 and 165 physically separated from the sound bar 145 to enjoy the content played on the television 140.
- the listener 110 may combine the loudspeakers (not shown) built in the television 140 with separate, physically separated loudspeakers 160 and 165 to enjoy the content played on the television 140.
- the listener 110 may add separate loudspeakers 160 and 165 to the existing TV-embedded speaker or sound bar 145 to group them into one group and enjoy stereoscopic sound.
- in the stereoscopic reproduction environment 100, stereoscopic sound may be reproduced by grouping the built-in speaker of the television 140, the sound bar 145, the left loudspeaker 160, and the right loudspeaker 165 into one group 180. Although only the left loudspeaker 160 and the right loudspeaker 165 are illustrated in FIG. 1, the listener 110 may adaptively configure the number and locations of the loudspeakers according to the size of the listening space or the style of the content to be enjoyed.
- the stereoscopic reproduction environment 100 may further include a left rear loudspeaker (not shown) and a right rear loudspeaker (not shown).
- the television 140 or a separate display device may display the list of speakers constituting the group 180 to the listener 110, and the listener 110 may add a speaker to or remove a speaker from the group.
- the stereoscopic reproduction environment 100 may include a sweet spot 120 which is a spatial range in which optimal stereoscopic sounds can be enjoyed.
- in the stereoscopic sound reproduction environment 100, the position of the virtual ears of the listener 110 may be set so that optimal stereoscopic sound is output at the ear position and in the adjacent sweet spot 120.
- the stereoscopic reproduction environment 100 may perform rendering in which the virtual sound source is positioned at a desired position using the speakers 145, 160, and 165 in the group 180, and the listener 110 feels as if sound is heard not from the actual speaker positions but from the position of the virtual sound source.
- FIG. 2 illustrates a stereoscopic sound reproducing apparatus according to an embodiment.
- the stereoscopic sound reproducing apparatus 200 performs 3D audio rendering on the input audio signals to place a virtual sound source at a predetermined position in the stereoscopic sound reproducing environment 100 described above with reference to FIG. 1, so that the listener 110 can feel a sense of space and three-dimensionality.
- the stereoscopic sound reproducing apparatus 200 may include a receiver 210, a controller (not shown), and a reproducer 240.
- the controller may include a renderer 220 and a grouper 230.
- the controller includes at least one processor, such as a central processing unit (CPU), an application processor (AP), an application-specific integrated circuit (ASIC), an embedded processor, a microprocessor, hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), or a combination thereof.
- the receiver 210 may receive an input audio signal (i.e., an acoustic signal) from a device such as a digital versatile disc (DVD) player, a Blu-ray disc (BD) player, or an MP3 player.
- the input audio signal may be a multi-channel audio signal such as a stereo (2-channel), 5.1-channel, 7.1-channel, 10.2-channel, or 22.2-channel signal.
- the input audio signal may be an object-based audio signal in which a plurality of mono input signals and real-time positions of objects are transmitted in the form of metadata.
- the object-based audio signal refers to a form in which the position of each audio object arranged in three-dimensional space is compressed into metadata along with sound.
- the input audio signal may be a hybrid input audio signal in which a channel audio signal and an object-based audio signal are mixed.
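- as an illustration of the object-based format described above, the following minimal Python sketch shows one way a mono object signal and its time-stamped position metadata could be held together; the class and field names are illustrative assumptions, not structures defined by the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class AudioObject:
    """One object of an object-based audio signal: a mono signal plus
    time-stamped 3D positions carried as metadata (illustrative only)."""
    samples: np.ndarray                                   # mono PCM samples
    positions: List[Tuple[float, float, float, float]]    # (time_s, x, y, z)

obj = AudioObject(
    samples=np.zeros(48000),                                  # one second at 48 kHz
    positions=[(0.0, 1.0, 2.0, 0.5), (0.5, 0.8, 2.0, 0.7)])   # the object moves over time
```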
- the grouping unit 230 may group at least two speakers existing in the 3D sound reproducing environment 100 into one group.
- the grouping unit 230 may group the television built-in speaker and the separate loudspeaker into one group.
- the grouping unit 230 may group a television built-in speaker, one or more sound bars, and one or more loudspeakers into one group.
- the grouping unit 230 may group the existing home theater speaker and the one or more loudspeakers purchased separately by the listener 110 into one group. Speakers in a group may be physically separated from each other.
- the listener 110 may select speakers to be grouped, and may determine speakers to be added based on the size and characteristics of the space where the listener 110 is located or the nature of the content to be enjoyed.
- the grouping unit 230 may group a plurality of physically separated speakers into a group through various communication paths.
- the communication path may represent various networks and network topologies.
- the communication path may include wireless communication, wired communication, optical communication, ultrasonic communication, or a combination thereof. Satellite communication, mobile communication, Bluetooth, Infrared Data Association standard (IrDA), wireless fidelity (WiFi), and worldwide interoperability for microwave access (WiMAX) are examples of wireless communication that can be included in the communication path. Ethernet, digital subscriber line (DSL), fiber to the home (FTTH), and plain old telephone service (POTS) are examples of wired communication that can be included in the communication path.
- the communication path may include a personal area network (PAN), a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), or a combination thereof.
- the grouper 230 may store positions and gains of the speakers existing in the group, and transmit the positions of the speakers to the renderer 220.
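- a minimal sketch of the kind of state the grouping unit 230 might keep for one group (per-speaker positions and gains handed to the renderer) follows; the class names and fields are assumptions made for illustration, not an API defined by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class GroupedSpeaker:
    name: str                                 # e.g. "soundbar", "rear-left"
    position: Tuple[float, float, float]      # metres, in a room coordinate frame
    gain: float = 1.0

@dataclass
class SpeakerGroup:
    """Speakers grouped over a wired/wireless network, with per-speaker state."""
    speakers: List[GroupedSpeaker] = field(default_factory=list)

    def add(self, speaker: GroupedSpeaker) -> None:
        self.speakers.append(speaker)         # listener adds a speaker to the group

    def remove(self, name: str) -> None:
        self.speakers = [s for s in self.speakers if s.name != name]

    def positions(self) -> List[Tuple[float, float, float]]:
        return [s.position for s in self.speakers]   # handed to the rendering unit

group = SpeakerGroup()
group.add(GroupedSpeaker("soundbar", (0.0, 0.0, 0.0)))
group.add(GroupedSpeaker("left", (-1.5, 0.5, 0.0)))
group.add(GroupedSpeaker("right", (1.5, 0.5, 0.0)))
```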
- the renderer 220 may perform 3D audio rendering for positioning the virtual sound source at a predetermined position with the input audio signals.
- the renderer 220 may generate at least one speaker signal corresponding to the audio signal by processing the input audio signal using a wave field synthesis rendering algorithm.
- the rendering unit 220 may process the input audio signal by using a head related transfer function rendering, beam-forming rendering, or focused source rendering algorithm to generate at least one speaker signal corresponding to the audio signal.
- the rendering unit 220 may calculate an impulse response for each speaker based on the minimum error summation, or perform rendering to reproduce the sense of altitude. A detailed process of performing the 3D audio rendering by the rendering unit 220 will be described later.
- the reproduction unit 240 may reproduce the virtual sound source rendered by the rendering unit 220 through the multichannel speaker.
- the playback unit 240 may include speakers existing in the group 180.
- FIG. 3 illustrates a stereoscopic sound reproducing apparatus using a wave field synthesis rendering method.
- FIG. 3 illustrates an embodiment of the rendering unit 220 of the stereoscopic sound reproducing apparatus 200. Although descriptions are omitted below, the contents described above with respect to the stereoscopic sound reproducing apparatus 200 of FIG. 2 also apply to the stereoscopic sound reproducing apparatus 200 according to the embodiment of FIG. 3.
- the renderer 220 may include an audio signal analyzer 310 and a sound field synthesis renderer 320.
- the rendering unit 220 may determine a gain and phase delay value for each speaker suitable for the position of the sound image according to the propagation characteristics of the sound image to reproduce the near field focused sound source.
- the rendering unit 220 may use the property that the magnitude of the sound pressure decreases as 1/r with the distance r between the listener 110 and the sound source, and may change the gains of the speakers in the group so that their outputs produce the same sound pressure at the near-field sound image position to be localized.
- the rendering unit 220 may also apply per-speaker delays, in consideration of the propagation delay of the sound field between the virtual sound source and each actual speaker, so that the outputs of all the speakers in the group converge at the desired near-field position at the same time.
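- the per-speaker gain and delay computation described above can be sketched as follows; this is a simplified illustration assuming free-field 1/r attenuation and a speed of sound of 343 m/s, not a full wave field synthesis driving-function derivation from the patent.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def focused_source_weights(speaker_pos, focus_pos, ref_distance=1.0):
    """Gain and delay per speaker so that all outputs converge at the
    near-field focus position with equal sound pressure (1/r law)."""
    focus = np.asarray(focus_pos, dtype=float)
    dists = np.array([np.linalg.norm(np.asarray(p, dtype=float) - focus)
                      for p in speaker_pos])
    gains = ref_distance / dists           # compensate 1/r attenuation at the focus
    delays = (dists.max() - dists) / C     # nearer speakers wait for the farthest one
    return gains, delays

gains, delays = focused_source_weights(
    speaker_pos=[(-1.5, 0.0, 0.0), (0.0, 0.3, 0.0), (1.5, 0.0, 0.0)],
    focus_pos=(0.2, -1.0, 0.0))
```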
- the audio signal analyzer 310 receives, as inputs, the speaker information of the group, sound source position information (for example, information about a position such as the angle of the virtual sound source with respect to the listening position), and a multichannel audio signal (the sound source signal to be localized).
- the speaker information of the group may include information about the sound bar (for example, information about the arrangement, such as the positions and spacing of the loudspeaker array), position information of the speakers in the group, and the spacing between the speakers.
- the audio signal analyzer 310 may determine the number of channels of the audio signal by analyzing the sound source format of the received multichannel audio signal, and may extract each channel sound source signal for each identified channel from the received multichannel audio signal.
- the sound field synthesis rendering unit 320 renders the multi-channel audio signal by the sound field synthesis method according to the number of audio channels identified by the audio signal analyzer 310. That is, the sound field synthesis rendering unit 320 orients the virtual sound source to a desired position in accordance with the identified number of audio channels.
- the number of virtual sound images may vary depending on the number of audio channels checked by the audio signal analyzer 310. For example, when the sound source of the multi-channel audio signal is two channels, the sound field synthesis rendering unit 320 may render a virtual sound source in a front left direction and a front right direction, that is, in both directions, in a sound field synthesis method.
- the sound field synthesis rendering unit 320 may render virtual sound sources in a total of five directions, such as the front left, front right, center, rear left, and rear right directions, in a sound field synthesis method, based on the positions of the speakers in the group.
- the sound field synthesis rendering unit 320 may change the phase delay value and the gain value of each speaker in the group according to the number and position of the speakers in the group.
- the stereoscopic sound reproducing apparatus 200 may change the phase delay value and the gain value of the speakers in the group in real time. For example, if the listener 110 moves to the left side, the gain value or delay value of the speakers in the group may be changed to a value optimized for the position of the listener 110 who has moved to the left side.
- FIG. 4A shows a stereoscopic sound reproducing apparatus that renders using a minimum error summing method.
- FIG. 4A illustrates an embodiment of the rendering unit 220 of the stereoscopic sound reproducing apparatus 200. Although descriptions are omitted below, the contents described above with respect to the stereoscopic sound reproducing apparatus 200 of FIG. 2 also apply to the stereoscopic sound reproducing apparatus 200 according to the embodiment of FIG. 4A.
- the renderer 220 may include an audio signal analyzer 310 and a minimum error adder 420. Since the audio signal analyzer 310 is as described above with reference to FIG. 3, description thereof will be omitted.
- FIG. 4B is a view showing the virtual sound source 460 and arbitrary points 470 and 480 within the sweet spot 120, added to the stereo sound reproduction environment 100 of FIG. 1.
- the minimum error summing unit 420 applies a method that allows the listener 110 to enjoy optimal stereoscopic sound within the set sweet spot 120.
- the minimum error adder 420 may set a sound pressure pTarget within the sweet spot 120 due to the virtual sound source 460.
- the virtual sound source 460 refers to a sound source assumed to exist at the position where the sound signal is to be localized.
- the minimum error summing unit 420 sets the actual sound pressure pReproduce generated in the sweet spot 120 by the speakers 145, 160, and 165 in the group, and then determines the sound pressure signal of each speaker 145, 160, and 165 that minimizes the difference (J) between the two sound pressures pReproduce and pTarget.
- the minimum error adder 420 may modulate the sound signal received by the receiver 210 based on the determined sound pressure signal.
- the arrows shown in solid lines in FIG. 4B represent pReproduce for an arbitrary point in the sweet spot 120, and the arrows shown in dashed lines represent pTarget.
- the minimum error adder 420 may calculate pTarget and pReproduce of arbitrary points 470 and 480 within the sweet spot 120.
- the minimum error summing unit 420 may calculate J by performing the integration over the entire sweet spot 120, so that pTarget and pReproduce are evaluated at all points in the sweet spot 120.
- in Equation 1, v represents the size of the sweet spot 120, r is the distance between a specific point 470 or 480 within the sweet spot 120 and the position of an actual speaker 145, 160, or 165 or the position of the virtual sound source 460, t represents time, and w is a weighting function that can be arbitrarily set according to r.
- in Equation 2, i denotes the index of each speaker in the group, and N denotes the total number of speakers in the space.
- the ki value may mean a filter coefficient (i.e., an impulse response) to be applied to each speaker, and thus the ki values may be determined by minimizing J'. That is, the minimum error summing unit 420 may determine a filter (i.e., an impulse response) for each speaker in the group.
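- the cost functions referred to above as Equations 1 and 2 are not reproduced in this text; a plausible least-squares form consistent with the terms just defined is sketched below, where k_i is the filter (impulse response) of speaker i, x is the input sound signal, r_i is the distance from speaker i to the point r, c is the speed of sound, and the 1/r_i amplitude decay and r_i/c propagation delay of a free-field monopole are assumptions made for illustration.

```latex
J = \int_{v}\int_{t} w(r)\,\bigl|\,p_{\mathrm{Target}}(r,t) - p_{\mathrm{Reproduce}}(r,t)\bigr|^{2}\,dt\,dv
\qquad \text{(cf. Equation 1)}

J' = \int_{v}\int_{t} w(r)\,\Bigl|\,p_{\mathrm{Target}}(r,t)
     - \sum_{i=1}^{N} \frac{1}{r_{i}}\,(k_{i} * x)\Bigl(t - \frac{r_{i}}{c}\Bigr)\Bigr|^{2}\,dt\,dv
\qquad \text{(cf. Equation 2)}
```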
- the minimum error summing unit 420 determines, from Equation 2, the sound pressure signal that each speaker 145, 160, and 165 should radiate in the form of an impulse response for each speaker, and may convolve the determined impulse response with the acoustic signal received for each speaker 145, 160, and 165. Alternatively, the minimum error summing unit 420 may modulate the input sound signal for each speaker 145, 160, and 165 by estimating gain and phase values from the filter values determined for each speaker.
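- a minimal frequency-domain sketch of this pressure-matching idea follows: the field of the virtual sound source is matched in the least-squares sense at sample points of the sweet spot, giving one complex weight (gain and phase) per speaker at each frequency. The free-field monopole model, the regularization term, and all function names are illustrative assumptions, not the patent's exact formulation.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def greens(src, pts, f):
    """Free-field monopole transfer function from a point source to control points."""
    r = np.linalg.norm(pts - np.asarray(src, dtype=float), axis=1)
    k = 2.0 * np.pi * f / C
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

def speaker_weights(speaker_pos, virtual_src, spot_pts, f, reg=1e-3):
    """Per-speaker complex weights minimizing |p_target - G q|^2 + reg*|q|^2 at frequency f."""
    p_target = greens(virtual_src, spot_pts, f)                         # field of the virtual source
    G = np.column_stack([greens(s, spot_pts, f) for s in speaker_pos])  # speakers -> spot points
    A = G.conj().T @ G + reg * np.eye(G.shape[1])                       # regularized normal equations
    return np.linalg.solve(A, G.conj().T @ p_target)                    # |q_i| = gain, angle(q_i) = phase

# example: three grouped speakers, a virtual source behind the sweet spot, 50 spot samples
rng = np.random.default_rng(0)
spot = rng.uniform(-0.3, 0.3, size=(50, 3)) + np.array([0.0, -2.0, 0.0])
q = speaker_weights([(-1.0, 0.0, 0.0), (0.0, 0.2, 0.0), (1.0, 0.0, 0.0)],
                    (0.0, -4.0, 0.5), spot, f=1000.0)
```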
- considering that the speakers located at the sides of the listener 110 have little effect on localizing the virtual sound source, the minimum error summing unit 420 may also determine only the impulse responses of the speaker array 145 located in front of the listener 110.
- the minimum error summing unit 420 may set the sweet spot 120 to be large, and then calculate the per-speaker filters for the speakers 145, 160, and 165 by minimizing the error of the sound field (or sound pressure) delivered to the sweet spot 120. Since the sweet spot 120 is large enough, the listener 110 can enjoy stereoscopic sound of a predetermined level or higher regardless of movement. If there are two or more listeners, the minimum error summing unit 420 may set the sweet spot 120 large enough to include the two or more listeners.
- alternatively, the minimum error summing unit 420 may set the sweet spot 120 to be small, and calculate the filter for each speaker 145, 160, and 165 by minimizing the error of the sound field (or sound pressure) delivered to the sweet spot 120.
- since the sweet spot 120 is very small, even if the listener 110 moves a little, the listener 110 may leave the sweet spot 120, making it difficult to enjoy stereoscopic sound.
- on the other hand, an optimized impulse response can be calculated for each speaker 145, 160, and 165, so that the listener can enjoy high-quality stereoscopic sound within the determined sweet spot 120. If there are two or more listeners, the minimum error summing unit 420 may set a plurality of sweet spots 120 to provide optimal stereoscopic sound to each listener.
- if the sweet spot 120 is moved along the movement path of the listener 110, the size of the sweet spot 120 may be set small while optimal stereoscopic reproduction remains possible regardless of the movement of the listener 110.
- FIG. 5 shows a 3D sound reproducing apparatus that performs rendering for high-altitude reproduction.
- FIG. 5 illustrates an embodiment of the rendering unit 220 of the stereoscopic sound reproducing apparatus 200. Although descriptions are omitted below, the contents described above with respect to the stereoscopic sound reproducing apparatus 200 of FIG. 2 also apply to the stereoscopic sound reproducing apparatus 200 according to the embodiment of FIG. 5.
- the renderer 220 may include a filtering unit 520, a copying unit 530, and an amplifying unit 540.
- the filtering unit 520 passes the sound signal through a predetermined filter corresponding to a predetermined altitude.
- the filtering unit 520 may pass the sound signal through a head related transfer function (HRTF) filter corresponding to a predetermined altitude.
- an HRTF contains the path information from the spatial position of a sound source to both ears of the listener 110, that is, the frequency transfer characteristics.
- stereoscopic sound can be perceived not only through simple path differences, such as the inter-aural level difference (ILD) and the inter-aural time difference (ITD) between the two ears, but also through the way complicated path characteristics, such as diffraction at the surface of the head and reflection by the pinna, change according to the direction of sound arrival. Since the HRTF has unique characteristics for each direction in space, it can be used to generate stereoscopic sound.
- the filtering unit 520 uses an HRTF filter to model sound generated at a higher altitude than actual speakers by using speakers arranged on a horizontal plane. Equation 3 below is an example of an HRTF filter used by the filtering unit 520.
- HRTF2 is the HRTF indicating the path information from the position of the virtual sound source to the ear of the listener 110, and HRTF1 is the HRTF indicating the path information from the position of the actual speaker to the ear of the listener 110. Since the sound signal is output through the actual speaker, in order for the sound signal to be perceived as if it were output from the virtual speaker, the HRTF2 corresponding to the predetermined altitude is divided by the HRTF1 corresponding to the horizontal plane (or the height of the actual speaker).
- the optimal HRTF filter corresponding to a given altitude differs from person to person, like a fingerprint. It is therefore desirable to calculate and apply an HRTF for each listener 110 individually, but this is not practical.
- instead, HRTFs may be calculated for some listeners 110 within a group of listeners 110 having similar characteristics (e.g., physical characteristics such as age and height, or preferences such as a preferred frequency band or preferred music).
- a representative value (e.g., the average) may then be determined as the HRTF to be applied to all listeners 110 in the group.
- an example of the result of filtering the acoustic signal using the HRTF filter defined in Equation 3 is shown in Equation 4 below.
- in Equation 4, Y1(f) is the frequency-domain representation of the acoustic signal heard by the listener 110 from the real speaker, and Y2(f) is the frequency-domain representation of the acoustic signal heard by the listener 110 from the virtual speaker.
- the filtering unit 520 may filter only some of the plurality of channel signals included in the sound signal.
- the sound signal may include sound signals corresponding to a plurality of channels.
- seven channel signals are defined for convenience of description.
- the channel signal to be described later is merely an example, and the sound signal may include a channel signal indicating a sound signal generated in a direction other than the seven directions described below.
- the center channel signal represents an acoustic signal generated at the center of the front face and is output to the center speaker.
- the right front channel signal represents an acoustic signal generated on the right side of the front and is output to the right front speaker.
- the left front channel signal represents an acoustic signal generated on the left side of the front and is output to the left front speaker.
- the right rear channel signal represents an acoustic signal generated on the right side of the rear side and is output to the right rear speaker.
- the left rear channel signal represents an acoustic signal generated on the left side of the rear side and is output to the left rear speaker.
- the right top channel signal represents an acoustic signal generated from the upper right side, and is output to the right top speaker.
- the left top channel signal represents an acoustic signal generated from the upper left and is output to the left top speaker.
- the filtering unit 520 filters the right top channel signal and the left top channel signal. Thereafter, the filtered right top channel signal and left top channel signal are used to model a virtual sound source generated at a desired altitude.
- the filtering unit 520 filters the right front channel signal and the left front channel signal. Thereafter, the filtered right front channel signal and left front channel signal are used to model a virtual sound source generated at a desired altitude.
- when the sound signal does not include top channel signals, other channel signals may be upmixed to generate a right top channel signal and a left top channel signal, and the generated right top channel signal and left top channel signal may then be filtered.
- the copying unit 530 replicates the filtered channel signal into a plurality of copies.
- the copying unit 530 replicates the filtered channel signal as many times as the number of speakers in the group that are to output it. For example, when the filtered sound signal is to be output as a right top channel signal, a left top channel signal, a right rear channel signal, and a left rear channel signal, the copying unit 530 replicates the filtered channel signal into four copies.
- the number of copies made by the copying unit 530 may vary depending on the embodiment. However, it may be desirable for the copying unit 530 to make at least two copies so that the filtered channel signal is output at least as the right rear channel signal and the left rear channel signal.
- the speakers on which the right top channel signal and the left top channel signal are to be reproduced may be arranged on a horizontal plane. For example, such a speaker may be attached directly above the front speaker that reproduces the right front channel signal.
- the amplifier 540 amplifies (or attenuates) the filtered sound signal according to a predetermined gain value.
- the gain value is set differently according to the type of the filtered channel signal and the speaker to which the filtered channel signal is to be output.
- the right top channel signal to be output to the right top speaker is amplified according to the first gain value
- the right top channel signal to be output to the left top speaker is amplified according to the second gain value.
- the first gain value may be greater than the second gain value.
- the left top channel signal to be output to the right top speaker is amplified according to the second gain value
- the left top channel signal to be output to the left top speaker is amplified according to the first gain value so that corresponding channel signals are output from the left and right speakers.
- the 3D sound reproducing apparatus 200 may output the same sound signal from the speakers in the group with different gain values.
- the virtual sound source can be easily positioned at an altitude higher than that of the actual speaker, or the virtual sound source can be positioned at a specific altitude independent of the altitude of the actual speaker.
- the operation of the replica unit 530 and the amplifier 540 may vary according to the number of channel signals included in the input sound signal and the number of speakers in the group.
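- the filter, replicate, and amplify/delay chain of the filtering unit 520, the copying unit 530, and the amplifying unit 540 can be sketched as follows; the elevation filter coefficients, gain values, and delay values are placeholder numbers chosen only to make the example runnable, not values from the patent.

```python
import numpy as np

def render_elevation(channel, h_elev, speaker_gains, speaker_delays_s, fs=48000):
    """Pass one channel through an altitude filter, replicate it, and apply a
    per-speaker gain and delay (one output signal per speaker in the group)."""
    y = np.convolve(channel, h_elev)                             # filtering unit 520
    outputs = []
    for gain, delay_s in zip(speaker_gains, speaker_delays_s):   # copying unit 530
        d = int(round(delay_s * fs))
        outputs.append(gain * np.concatenate([np.zeros(d), y]))  # amplifying unit 540
    return outputs

top_channel = np.random.randn(48000)          # stand-in for a right/left top channel signal
h = np.array([1.0, 0.5, 0.25])                # stand-in for an HRTF2/HRTF1 impulse response
outs = render_elevation(top_channel, h,
                        speaker_gains=[1.0, 0.7, 0.7, 1.0],
                        speaker_delays_s=[0.0, 0.001, 0.001, 0.0])
```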
- although the methods by which the stereo sound reproducing apparatus 200 positions the virtual sound source at a predetermined position have been described separately with reference to FIGS. 3 to 5, it is obvious that one stereo sound reproducing apparatus 200 may use all of these methods together or alternatively.
- the method of positioning the virtual sound source at a predetermined position by the stereo sound reproducing apparatus 200 is not limited to the above-described examples, and the stereo sound reproducing apparatus 200 may use any other method, based on the positions and number of speakers in the group, to position the virtual sound source at the predetermined position.
- FIG. 6A illustrates a stereoscopic sound reproducing apparatus for tracking a listener's head position according to an embodiment.
- the stereoscopic sound reproducing apparatus 200 may further include a communication unit (not shown).
- the communication unit (not shown) may include one or more hardware components that allow communication between the 3D sound reproducing apparatus 200 and the peripheral device.
- the communication unit (not shown) may include short range communication or mobile communication.
- Short-range wireless communication includes Bluetooth communication, Bluetooth Low Energy (BLE) communication, near field communication (NFC), WLAN (Wi-Fi) communication, Zigbee communication, Infrared Data Association (IrDA) communication, Wi-Fi Direct (WFD) communication, ultra wideband (UWB) communication, Ant+ communication, and the like, but is not limited thereto.
- the mobile communication may transmit / receive a radio signal with at least one of a base station, an external terminal, and a server on a mobile communication network.
- the wireless signal may include various types of signals for transmission and reception of an audio signal, an image signal, or a text/multimedia message.
- the communicator may include a listener tracker 610.
- FIG. 6A is a block diagram illustrating another example of the 3D sound reproducing apparatus 200 according to an exemplary embodiment. Therefore, even if omitted below, the above description of the stereoscopic sound reproducing apparatus 200 of FIG. 2 also applies to the stereoscopic sound reproducing apparatus 200 according to the exemplary embodiment of FIG. 6A.
- the listener tracker 610 may track a position at which the listener 110 moves.
- the sweet spot 120 which is a position where the optimal stereoscopic sound can be enjoyed, is typically determined manually based on the positions of the speakers 145, 160, and 165.
- for example, virtual lines may be drawn inward at 60 degrees from both ends of the line connecting the left and right speakers, and the sweet spot 120 may be determined based on the point where the two virtual lines meet.
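- a small sketch of this geometric rule follows: it computes the apex of the equilateral triangle formed by the two 60-degree lines. The 2D coordinate convention and which side of the speaker line the listener sits on are assumptions of the example.

```python
import numpy as np

def sweet_spot_center(left_spk, right_spk):
    """Point where lines drawn inward at 60 degrees from both speakers meet."""
    L = np.asarray(left_spk, dtype=float)
    R = np.asarray(right_spk, dtype=float)
    mid = (L + R) / 2.0
    d = np.linalg.norm(R - L)                     # speaker spacing
    u = (R - L) / d
    n = np.array([-u[1], u[0]])                   # 90-degree rotation toward the listener side
    return mid + n * (d * np.sqrt(3.0) / 2.0)     # (d/2) * tan(60 degrees) from the midpoint

print(sweet_spot_center((-1.0, 0.0), (1.0, 0.0)))  # -> approximately [0.  1.732]
```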
- the pre-echo phenomenon means that, when the position of the listener 110 is shifted from the center to the left, the influence of the left speaker, whose sound arrives with a relatively large gain and ahead of the other speakers, becomes dominant, so that the perceived position of the sound image deviates from the focused position and is heard as if it were at the position of the left speaker.
- the sweet spot 120 may move according to the head position of the listener 110 without being in a fixed position.
- the stereoscopic sound reproducing apparatus 200 may update the sweet spot 120 in real time or at regular time intervals according to the position of the head of the listener acquired from the listener tracking unit 610.
- the listener tracker 610 may acquire the head position information of the listener 110 in real time.
- the listener tracking unit 610 may acquire the head position information of the listener 110 based on a mobile phone possessed by the listener 110, a motion recognition sensor, or a position sensor attached to a remote controller.
- the listener tracking unit 610 may also acquire the head position information of the listener 110 using an image processing algorithm such as object tracking, an accessory worn by the listener 110, or a wearable glass such as Google Glass. The method by which the listener tracking unit 610 tracks the head position of the listener 110 is not limited to the above-described examples, and any other method may be used.
- the listener tracker 610 may obtain head position information of the plurality of listeners 110 in real time.
- the stereoscopic sound reproducing apparatus 200 may set one sweet spot 120 including the plurality of listeners 110 based on the obtained head position information of the plurality of listeners 110, or may set a plurality of sweet spots 120 based on the position of each listener.
- FIG. 6B illustrates a change in a sweet spot of a stereo sound listening environment according to an embodiment.
- the 3D sound reproducing apparatus 200 may reset the sweet spot 120 based on the position of the moved listener 110.
- the renderer 220 may change the gain and delay values of the speakers in the group to suit the moved sweet spot 120.
- when the rendering unit 220 positions the virtual sound source using the WFS method described above with reference to FIG. 3, the pre-echo phenomenon can be reduced by changing the gain value of each speaker 145, 160, and 165 in the group in real time.
- the sweet spot 120 may be set near the tracked head position of the listener 110, and an optimized impulse response may be calculated for each speaker 145, 160, and 165.
- when the rendering unit 220 positions the virtual sound source at a predetermined altitude using the altitude reproduction method described above with reference to FIG. 5, the position of the elevation angle can be kept constant by changing the gain value and the phase delay value applied to each speaker 145, 160, and 165 in the group.
- the listener tracking unit 610 may track head positions of the plurality of listeners 110 and set a plurality of sweet spots based on head positions of the listeners.
- the renderer 220 may change gain and delay values of the speakers in the group based on the positions and sizes of the plurality of sweet spots.
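- a small sketch of how the renderer might refresh per-speaker gains and delays each time the tracker reports a new head position is shown below; the 1/r gain rule and arrival-time alignment are illustrative assumptions, and in practice the per-speaker impulse responses or sweet-spot samples would be recomputed as described above.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def retarget_group(head_pos, speaker_pos):
    """Recompute gain and delay per speaker so the focused region follows the listener."""
    head = np.asarray(head_pos, dtype=float)
    d = np.array([np.linalg.norm(np.asarray(s, dtype=float) - head) for s in speaker_pos])
    gains = d.min() / d                  # compensate 1/r attenuation toward the new position
    delays = (d.max() - d) / C           # align arrival times at the new sweet spot
    return gains, delays

speakers = [(-1.5, 0.0), (0.0, 0.3), (1.5, 0.0)]
for head in [(0.0, -2.0), (-0.25, -2.0), (-0.5, -2.0)]:   # tracked head positions over time
    gains, delays = retarget_group(head, speakers)
```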
- FIGS. 7 and 8 are diagrams for describing a stereoscopic sound reproducing method performed by the stereoscopic sound reproducing apparatus 200 shown in FIGS. 1 to 6. Therefore, even if omitted below, the above description of the stereoscopic sound reproducing apparatus 200 of FIGS. 1 to 6 may also be applied to the stereoscopic sound reproducing method according to the exemplary embodiments of FIGS. 7 and 8.
- FIG. 7 is a flowchart of a method of reproducing stereoscopic sound, according to an exemplary embodiment.
- the 3D sound reproducing apparatus 200 may group the plurality of speakers.
- the 3D sound reproducing apparatus 200 may group at least two physically separated speakers into one group.
- the 3D sound reproducing apparatus 200 may group a television built-in speaker and a separate loudspeaker into one group.
- the 3D sound reproducing apparatus 200 may group a built-in television speaker, one or more soundbars, and one or more loudspeakers into one group.
- the 3D sound reproducing apparatus 200 may group existing home theater speakers and one or more loudspeakers separately purchased by the listener into one group. Speakers in a group may be physically separated from each other. The listener can select the speakers to be grouped, and can decide which speakers to add based on the size and characteristics of the space where the listener is located or the nature of the content to be enjoyed.
- the 3D sound reproducing apparatus 200 may group a plurality of physically separated speakers into a group through various communication paths.
- the communication path may represent various networks and network topologies.
- the communication path may include wireless communication, wired communication, optical, ultrasound, or a combination thereof.
- the 3D sound reproducing apparatus 200 may receive an audio signal.
- the 3D sound reproducing apparatus 200 may receive an input audio signal from a device such as a DVD, BD, or MP3 player.
- the input audio signal may be a multi-channel audio signal such as a stereo (2-channel), 5.1-channel, 7.1-channel, 10.2-channel, or 22.2-channel signal.
- the input audio signal may be an object-based audio signal in which a plurality of mono input signals and the real-time positions of objects are transmitted in the form of metadata.
- the object-based audio signal refers to a form in which the position of each audio object arranged in three-dimensional space is compressed into metadata along with sound.
- the input audio signal may be a hybrid input audio signal in which a channel audio signal and an object-based audio signal are mixed.
- the 3D sound reproducing apparatus 200 may perform 3D audio rendering for positioning the virtual sound source at a predetermined position.
- the 3D sound reproducing apparatus 200 may generate at least one speaker signal corresponding to the audio signal by processing the input audio signal using a wave field synthesis rendering algorithm.
- the stereo sound reproducing apparatus 200 may process the input audio signal using a head related transfer function rendering, beam-forming rendering, or focused source rendering algorithm to generate at least one speaker signal corresponding to the audio signal.
- the 3D sound reproducing apparatus 200 may calculate an impulse response for each speaker based on the minimum error summation, or perform rendering to reproduce the sense of altitude.
- the 3D sound reproducing apparatus 200 may reproduce the rendered virtual sound source through the multi-channel speaker.
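- an end-to-end sketch of the flow of FIG. 7 (group the speakers, receive a sound signal, render per-speaker signals for a virtual source, hand them to playback) is given below; the toy 1/r weighting used for the rendering step is only a placeholder for the rendering methods described above, not the patent's rendering rule.

```python
import numpy as np

def reproduce_stereoscopic_sound(audio, speaker_positions, virtual_src_pos):
    """Group -> receive -> render -> reproduce, with a placeholder rendering rule."""
    group = list(speaker_positions)                      # grouping step
    x = np.asarray(audio, dtype=float)                   # received sound signal
    src = np.asarray(virtual_src_pos, dtype=float)
    per_speaker = []
    for pos in group:                                    # rendering step (toy 1/r weighting)
        r = np.linalg.norm(np.asarray(pos, dtype=float) - src)
        per_speaker.append(x / max(r, 1e-6))
    return per_speaker                                   # handed to the reproducing unit

outs = reproduce_stereoscopic_sound(np.random.randn(4800),
                                    speaker_positions=[(-1.5, 0.0), (1.5, 0.0)],
                                    virtual_src_pos=(0.0, -2.0))
```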
- FIG. 8 shows a flowchart of a further embodiment of the method for the stereophonic reproduction apparatus to reproduce stereoscopic sound.
- Steps 710, 720, and 740 are the same as those described above with reference to FIG. 7.
- the 3D sound reproducing apparatus 200 may track the position of the head of the listener. If the stereo sound reproducing apparatus 200 tracks the position where the head of the listener moves in real time, the sweet spot may move according to the position of the listener without being in a fixed position.
- the stereoscopic sound reproducing apparatus 200 may update the sweet spot in real time or periodically according to the acquired head position of the listener.
- the 3D sound reproducing apparatus 200 may acquire the head position information of the listener in real time. For example, the 3D sound reproducing apparatus 200 may acquire the head position information of the listener based on a mobile phone, a motion recognition sensor, or a position sensor attached to the remote controller. Alternatively, the 3D sound reproducing apparatus 200 may acquire the head position information of the listener using an image processing algorithm such as object tracking or an accessory worn by the listener or a wearable glass such as Google Glass. It is apparent that the method for the stereo reproducing apparatus 200 to track the position of the head of the listener is not limited to the example described above, and any other method may be used.
- the 3D sound reproducing apparatus 200 may position the virtual sound source at a predetermined position based on the head position information of the listener.
- the 3D sound reproducing apparatus 200 may change the gain value and the phase delay value of at least one of the speakers in the group using the WFS method based on the moved listener head position.
- the 3D sound reproducing apparatus 200 may reset the sweet spot near the head position of the tracked listener and recalculate the impulse response of at least one of the speakers in the group by using the aforementioned minimum error calculation method.
- when the aforementioned altitude reproduction method is used, the stereo sound reproducing apparatus 200 may change the gain value and the phase delay value to be applied to at least one of the speakers in the group so that the position of the elevation angle is kept constant.
- the stereoscopic sound reproducing method may be embodied as computer readable codes on a computer readable recording medium.
- the computer-readable recording medium includes all kinds of recording devices in which data that can be read by a computer system is stored. Examples of computer-readable recording media include ROM, RAM, CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices, and also include media implemented in the form of carrier waves, such as transmission over the Internet.
- the computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
- the methods, processes, devices, products, and/or systems according to the present invention are simple, cost-effective, uncomplicated, highly versatile, and accurate, and can be readily implemented for efficient and economical manufacturing, application, and utilization.
- Another important aspect of the present invention is that it is in line with current trends that call for cost reduction, system simplification and increased performance. Useful aspects found in such embodiments of the present invention may consequently increase the level of current technology.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
According to an embodiment, a method for three-dimensional (3D) sound reproduction may comprise the steps of: grouping a plurality of speakers into a group; receiving an input of an audio signal; using the plurality of grouped speakers to localize one or more virtual sound sources of the audio signal at a predetermined position; and reproducing the one or more virtual sound sources through the plurality of speakers.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2015/005253 WO2016190460A1 (fr) | 2015-05-26 | 2015-05-26 | Procédé et dispositif pour une lecture de son tridimensionnel (3d) |
KR1020177029777A KR102357293B1 (ko) | 2015-05-26 | 2015-05-26 | 입체 음향 재생 방법 및 장치 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2015/005253 WO2016190460A1 (fr) | 2015-05-26 | 2015-05-26 | Procédé et dispositif pour une lecture de son tridimensionnel (3d) |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016190460A1 true WO2016190460A1 (fr) | 2016-12-01 |
Family
ID=57393245
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2015/005253 WO2016190460A1 (fr) | 2015-05-26 | 2015-05-26 | Procédé et dispositif pour une lecture de son tridimensionnel (3d) |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102357293B1 (fr) |
WO (1) | WO2016190460A1 (fr) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102023400B1 (ko) * | 2018-07-02 | 2019-09-23 | 주식회사 이엠텍 | 웨어러블 음향 변환 장치 |
KR102737006B1 (ko) * | 2019-03-08 | 2024-12-02 | 엘지전자 주식회사 | 음향 객체 추종을 위한 방법 및 이를 위한 장치 |
WO2021049704A1 (fr) * | 2019-09-10 | 2021-03-18 | 주식회사 신안정보통신 | Appareil et système de reproduction sonore de type réseau horizontal utilisant une technique de synthèse d'ondes planes |
KR20230105188A (ko) * | 2022-01-03 | 2023-07-11 | 삼성전자주식회사 | 무선 전송을 이용해서 7 채널의 오디오를 확장한 장치 및 방법 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20090054802A (ko) * | 2007-11-27 | 2009-06-01 | 한국전자통신연구원 | 음장 합성을 이용한 입체 음장 재생 장치 및 그 방법 |
KR20100062773A (ko) * | 2008-12-02 | 2010-06-10 | 한국전자통신연구원 | 오디오 컨텐츠 재생 장치 |
KR20140017682A (ko) * | 2011-07-01 | 2014-02-11 | 돌비 레버러토리즈 라이쎈싱 코오포레이션 | 적응형 오디오 신호 생성, 코딩 및 렌더링을 위한 시스템 및 방법 |
KR20140025268A (ko) * | 2012-08-21 | 2014-03-04 | 한국전자통신연구원 | 사운드 바를 이용한 음장 재현 시스템 및 방법 |
KR20140093578A (ko) * | 2013-01-15 | 2014-07-28 | 한국전자통신연구원 | 사운드 바를 위한 오디오 신호 처리 장치 및 방법 |
-
2015
- 2015-05-26 WO PCT/KR2015/005253 patent/WO2016190460A1/fr active Application Filing
- 2015-05-26 KR KR1020177029777A patent/KR102357293B1/ko active IP Right Grant
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111343556A (zh) * | 2020-03-11 | 2020-06-26 | 费迪曼逊多媒体科技(上海)有限公司 | 一种传统扩声与全息扩声和电子声罩结合的音响系统及其使用方法 |
CN117156348A (zh) * | 2023-06-30 | 2023-12-01 | 惠州中哲尚蓝柏科技有限公司 | 一种用于家庭影院的立体声组合音箱及其控制方法 |
CN117156348B (zh) * | 2023-06-30 | 2024-02-09 | 惠州中哲尚蓝柏科技有限公司 | 一种用于家庭影院的立体声组合音箱及其控制方法 |
Also Published As
Publication number | Publication date |
---|---|
KR102357293B1 (ko) | 2022-01-28 |
KR20180012744A (ko) | 2018-02-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5459790A (en) | Personal sound system with virtually positioned lateral speakers | |
US6144747A (en) | Head mounted surround sound system | |
US5661812A (en) | Head mounted surround sound system | |
US5841879A (en) | Virtually positioned head mounted surround sound system | |
US5272757A (en) | Multi-dimensional reproduction system | |
WO2016190460A1 (fr) | Procédé et dispositif pour une lecture de son tridimensionnel (3d) | |
US7333622B2 (en) | Dynamic binaural sound capture and reproduction | |
Valimaki et al. | Assisted listening using a headset: Enhancing audio perception in real, augmented, and virtual environments | |
JP4584416B2 (ja) | 位置調節が可能な仮想音像を利用したスピーカ再生用多チャンネルオーディオ再生装置及びその方法 | |
WO2018182274A1 (fr) | Procédé et dispositif de traitement de signal audio | |
US20080056517A1 (en) | Dynamic binaural sound capture and reproduction in focued or frontal applications | |
KR100878457B1 (ko) | 음상정위 장치 | |
WO2018056780A1 (fr) | Procédé et appareil de traitement de signal audio binaural | |
WO2015147532A2 (fr) | Procédé de rendu de signal sonore, appareil et support d'enregistrement lisible par ordinateur | |
US20070009120A1 (en) | Dynamic binaural sound capture and reproduction in focused or frontal applications | |
WO2011139090A2 (fr) | Procédé et appareil de reproduction de son stéréophonique | |
US20230247384A1 (en) | Information processing device, output control method, and program | |
WO2015147619A1 (fr) | Procédé et appareil pour restituer un signal acoustique, et support lisible par ordinateur | |
WO2013103256A1 (fr) | Procédé et dispositif de localisation d'un signal audio multicanal | |
US20220345845A1 (en) | Method, Systems and Apparatus for Hybrid Near/Far Virtualization for Enhanced Consumer Surround Sound | |
JP2003032776A (ja) | 再生システム | |
JP2018110366A (ja) | 3dサウンド映像音響機器 | |
WO2016182184A1 (fr) | Dispositif et procédé de restitution sonore tridimensionnelle | |
US10440495B2 (en) | Virtual localization of sound | |
Glasgal | 360° localization via 4.x RACE processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15893412 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20177029777 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 15893412 Country of ref document: EP Kind code of ref document: A1 |