WO2023122227A1 - Audio control system - Google Patents
- Publication number
- WO2023122227A1 (PCT/US2022/053737)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- user
- frequency
- indication
- frequencies
- Prior art date
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/12—Audiometering
- A61B5/121—Audiometering evaluating hearing capacity
- A61B5/125—Audiometering evaluating hearing capacity objective methods
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
Definitions
- Audio systems can determine a hearing characteristic of a user. For example, an audiologist can determine a user response according to an audiology exam. An audio device can process audio for the user based on the hearing characteristics of the user. However, determining the hearing characteristics can employ specialized equipment, personnel, or extensive time.
- a controller can cause a speaker to present a frequency sweep to a user, and receive an indication of the user’s receipt thereof.
- the controller can determine a hearing characteristic of the user according to the received indication.
- the controller can receive an indication of the identity of the audio output device.
- the controller can receive an indication of a type of audio content.
- the controller can receive an indication of audio content including human speech, music, or spatial content (e.g., virtual reality or video game content).
- the controller can employ a digital signal processing (DSP) circuit to process audio content to match the hearing characteristic of a user.
- a method includes: generating, by one or more processors, an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receiving, by the one or more processors, a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modifying, by the one or more processors, an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receiving, by the one or more processors, a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; and generating, by the one or more processors, an audiogram specific to the user according to the first indication of the user response and the second indication of the user response.
- a system includes: a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, wherein the audio signal sweeps across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receipt of the first indication of the user response, modify an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; and generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; receive a device identifier associated with the speaker; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and the device identifier; and output, via the speaker, the audio output to the user.
- a headphone set includes a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modify an amplitude of a second portion associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and a frequency response of the headphone set; and output, via the speaker, the audio output to the user.
- FIG. 1 depicts a block diagram of an example of a data processing system.
- FIG. 2 depicts an example of a frequency response plot for a speaker.
- FIG. 3 depicts an example method of determining an audiogram.
- FIG. 4 depicts an example of an audiogram-threshold chart.
- FIG. 5 depicts an example of an equalizer profile.
- FIG. 6 depicts an example of a correction adjustment interface of a user interface.
- FIG. 7 depicts a top view of an environment including a user in relation to a speaker and an environment boundary.
- FIG. 8 depicts a block diagram of an example system using supervised learning.
- FIG. 9 depicts a block diagram of a simplified neural network model.
- FIG. 10 depicts an example method of generating an audiogram.
- FIG. 11 depicts an example block diagram of an example computer system.
DETAILED DESCRIPTION
- Systems and methods described herein can provide improved detection or application of an indication of a hearing characteristic of a user (e.g., an audiogram).
- an audio output device e.g., a speaker
- the audio signal can sweep across a plurality of frequencies.
- the audio signal can sequentially sweep between a minimum and a maximum frequency.
- a controller can select frequencies to present to a user according to a nonlinear function, such as a logarithmic function (e.g., frequency spacing approaching the maximum frequency may be greater than the frequency spacing approaching the minimum frequency).
- the audio signal can include a frequency sweep from 100 Hz to 16 kHz.
- the frequency sweep can be continuous or include discrete frequencies (e.g., logarithmically spaced such that the individual frequencies are evenly spaced as depicted by a logarithmic depiction thereof, such as an audiogram).
- the audio signal can include a tone at each frequency (e.g., a 300 ms tone).
- the audio signal can include dwell time at a duty cycle (e.g., 50%) with the tone.
- the audio signal can alternate between a 300 ms tone and a 300 ms dwell time during which no audio is output, ascending or descending between various frequencies.
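- The logarithmic spacing and tone/dwell alternation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `log_spaced_frequencies` and `tone_schedule`, and the choice of eight discrete frequencies, are assumptions made for the example. The 100 Hz to 16 kHz range and the 300 ms tone and dwell times come from the text.

```python
import math


def log_spaced_frequencies(f_min_hz: float, f_max_hz: float, count: int) -> list[float]:
    """Return `count` frequencies evenly spaced on a logarithmic axis."""
    if count < 2:
        raise ValueError("count must be at least 2")
    # A constant ratio between neighbours gives even spacing on a log plot.
    ratio = math.exp(math.log(f_max_hz / f_min_hz) / (count - 1))
    return [f_min_hz * ratio ** i for i in range(count)]


def tone_schedule(frequencies, tone_ms=300, dwell_ms=300):
    """Return (start_ms, end_ms, frequency_hz) tuples for each tone.

    The gap between one tone's end and the next tone's start is the
    silent dwell time (equal tone and dwell gives a 50% duty cycle).
    """
    schedule = []
    t = 0
    for f in frequencies:
        schedule.append((t, t + tone_ms, f))
        t += tone_ms + dwell_ms
    return schedule


# Hypothetical sweep: eight discrete tones, ascending in frequency.
freqs = log_spaced_frequencies(100.0, 16_000.0, 8)
schedule = tone_schedule(freqs)
```

- In this sketch the spacing in hertz grows toward the maximum frequency, which matches the nonlinear selection described above; a descending sweep would simply reverse `freqs`.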
- a control interface such as a key of a keyboard, a user interface element of a device, a control of a headset, or the like can receive an indication of an audibility of the audio signal from the user.
- the controller can cause the audio signal to progress through the various frequencies of the frequency sweep, and receive an indication of audibility of the audio content.
- the control interface can detect an actuation thereof, indicating that the user hears the audio (or does not hear the audio).
- the speed of the adjustment between frequencies may be adjustable, such as in response to a user command, a control interface type or latency, etc.
- the controller can cause an additional frequency sweep in a frequency range or volume range of interest.
- a second frequency sweep can repeat or further granulize areas of interest (e.g., transitions between audibility and inaudibility) with respect to frequencies or amplitudes.
- amplitude adjustments can be about 5 dB in a first frequency sweep and about 2.5 dB in a second frequency sweep.
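- One way to picture the finer second sweep is as a search inside the bracket that the coarse sweep leaves between the last inaudible level and the first audible level. The sketch below assumes a hypothetical callback `hears(level_db)` that returns True when the user indicates audibility; the function name `refine_threshold` and the ascending search order are likewise assumptions for illustration.

```python
from typing import Callable


def refine_threshold(
    hears: Callable[[float], bool],
    last_inaudible_db: float,
    first_audible_db: float,
    fine_step_db: float = 2.5,
) -> float:
    """Probe levels inside the coarse bracket using a finer step.

    Returns the lowest probed level reported audible, or the coarse
    result `first_audible_db` if no finer level was audible.
    """
    level = last_inaudible_db + fine_step_db
    while level < first_audible_db:
        if hears(level):
            return level
        level += fine_step_db
    return first_audible_db


# Simulated listener (hypothetical): hears anything at or above 22 dB.
refined = refine_threshold(lambda level: level >= 22.0, 20.0, 25.0)
```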
- the controller can cause the audio signal to be modified according to a frequency response of a speaker (e.g., via a digital signal processing circuit).
- a digital signal processing (DSP) circuit can modify the audio signal (e.g., as adjusted or generated) based on the frequency response of the speaker (e.g., increase an amplitude at 100 Hz by 10 dB, relative to the amplitude at 2 kHz).
- the controller can generate an audiogram indicating the response of audibility, along with any modifications for the speaker.
- the controller can cause audio content to be modified based on the audiogram.
- the controller can employ the DSP circuit to modify audio content including human speech, music, video game or other virtual reality (VR) or augmented reality (AR) content, or the like.
- the controller can cause the audio to be amplified at a same or different frequency as an indication of hearing loss or other audiogram content. Such amplification may enhance the intelligibility of speech, the subjective experience of a user, or spatial information of the audio content.
- FIG. 1 depicts an example of a data processing system 100.
- the data processing system 100 can include, interface with, or otherwise utilize at least one controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112.
- the controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112 can each include or interface with at least one processing unit or other logic device, such as a programmable logic array, engine, or module, configured to communicate with the data repository 120 or database.
- the controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112 can be separate components, a single component, or part of the data processing system 100.
- the data processing system 100 can include hardware elements, such as one or more processors, logic devices, or circuits.
- the data processing system 100 can include one or more components or structures of functionality of computing devices depicted in FIG. 11.
- the data repository 120 can include one or more local or distributed databases, and can include a database management system.
- the data repository 120 can include computer data storage or memory and can store one or more of an audiogram 122, device characteristics 124, user preferences 126, or audio content 128.
- the audiogram 122 can include an indication of a hearing characteristic of audio content for a user.
- the audiogram can include an indication of a hearing characteristic of a user across various frequencies, or a transfer function for the user, such as a head related transfer function (HRTF).
- the device characteristics 124 can include an identity or transfer function of a device.
- an identity can include a manufacturer, model number, unique identifier (e.g., serial number or user-supplied identifier), or the like.
- a transfer function can include a frequency response, sound pressure level, or distortion for various sound pressure levels, frequencies, or the like.
- the user preferences 126 can include an expressed preference with regard to audio content, an intelligibility of audio content, an indication of a spatial audio (e.g., an indication of a user’s ability to distinguish a directionality of an element of audio content 128), or the like.
- the audio content 128 can include audio files, streams, or characteristics thereof. For example, audio files such as music or other audio content (e.g., podcasts, audio tracks for audio-visual content such as video content) or the like can be stored by or accessible to the data repository 120.
- Stream information such as audio content for a videogame, virtual reality, augmented reality, mixed reality, or other content can be derived therefrom, including frequency content, content type, or devices associated therewith.
- the data processing system 100 can include, interface with, or otherwise utilize at least one controller 102.
- the controller 102 can include or interface with one or more processors and memory.
- the processor can be implemented as a specific purpose processor, an application specific integrated circuit (ASIC), one or more field programmable gate arrays (FPGAs), a group of processing components, or other suitable electronic processing components.
- the processors and memory can be implemented using one or more devices, such as devices in a client-server implementation.
- the memory can include one or more devices (e.g., random access memory (RAM), read-only memory (ROM), flash memory, hard disk storage) for storing data and computer code for completing and facilitating the various user or client processes, layers, and modules.
- the memory can be or include volatile memory or non-volatile memory and may include database components, object code components, script components, or any other type of information structure for supporting the various activities and information structures of the inventive concepts disclosed herein.
- the memory can be communicably connected to the processor and include computer code or instruction modules for executing one or more processes described herein.
- the memory can include various circuits, software engines, and/or modules that cause the processor to execute the systems and methods described herein, such as to cause the communication or processing of audio signals.
- the controller 102 can include or be coupled with communications electronics.
- the communications electronics can conduct wired and/or wireless communications.
- the communications electronics can include one or more wired (e.g., Ethernet, PCIe, or AXI) or wireless transceivers (e.g., a Wi-Fi transceiver, a Bluetooth transceiver, a NFC transceiver, or a cellular transceiver).
- the controller 102 may be in network communication or otherwise communicatively coupled with the DSP circuit 104, speaker 110, or other components of the data processing system 100.
- the controller 102 can cause one or more operations disclosed, such as by employing another element of the data processing system. For example, operations disclosed by other elements of the data processing system may be initiated, scheduled, or otherwise controlled by the controller 102.
- the data processing system 100 can include, interface with, or otherwise utilize at least one DSP circuit 104.
- the DSP circuit 104 can receive audio content 128, user preferences 126, or audiograms 122.
- the DSP circuit can process audio content 128 based on the audiogram 122, the user preferences 126, or the audio content 128 (e.g., audio content type).
- the DSP circuit 104 can receive an audiogram 122 from the audiography circuit 106, and process the audio content 128 based on the audiogram 122.
- the DSP circuit can normalize the audio content 128 to reduce a difference between the audiogram 122 of a user and a target audiogram 122 (e.g., by increasing a sound pressure at a frequency for which the audiogram 122 for the user indicates diminished hearing relative to the target audiogram 122).
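- The normalization toward a target audiogram can be illustrated as a per-frequency gain computation. The sketch below is not the disclosed DSP processing; it assumes audiograms are stored as mappings from frequency (Hz) to hearing threshold (dB), where a larger threshold means diminished hearing. The function name `normalization_gains_db` and the 20 dB gain cap are hypothetical choices for the example.

```python
def normalization_gains_db(user_db, target_db, max_gain_db=20.0):
    """Return a per-frequency boost (dB) that narrows the gap to the target.

    user_db / target_db: dict mapping frequency in Hz to threshold in dB.
    Frequencies where the user already meets the target get 0 dB.
    """
    gains = {}
    for freq_hz, user_threshold in user_db.items():
        # Positive deficit: the user needs more level than the target listener.
        deficit = user_threshold - target_db.get(freq_hz, 0.0)
        # Assumed safety cap so the boost stays bounded.
        gains[freq_hz] = min(max(deficit, 0.0), max_gain_db)
    return gains


user = {1000: 10.0, 4000: 25.0, 8000: 70.0}
target = {1000: 10.0, 4000: 10.0, 8000: 10.0}
gains = normalization_gains_db(user, target)
```

- Here the 8 kHz deficit of 60 dB is limited to the assumed 20 dB cap, while 1 kHz content is left unchanged.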
- the DSP circuit 104 can be configured to adjust a different frequency responsive to a determination that an intelligibility, directionality, or subjective impression of audio content 128 would be enhanced thereby.
- human speech can be processed to amplify a harmonic of a frequency at which an audiogram 122 indicated diminished hearing of a user.
- 7 kHz content can be amplified responsive to diminished hearing at 14 kHz, which may increase an intelligibility of human speech or other harmonic content.
- the DSP circuit 104 can be configured to adjust audio content 128 based on a spatial position of the audio content. For example, the DSP circuit 104 can receive an audiogram indicative of hearing loss at 4 kHz in a left ear.
- the DSP circuit 104 can receive an audio content type, such as from the audio content recognition circuit 112.
- the DSP circuit 104 can process the audio content 128 based on the audio type for a same or different ear.
- the DSP circuit 104 can amplify a right channel of the audio responsive to non-spatial content, such as human speech from an audio output device (e.g., a single channel speaker 110).
- the DSP circuit 104 can frequency shift content according to spatial content. For example, a 4 kHz tone can be modified to a 5 kHz tone such that spatial information can be perceived by the user.
- Such embodiments can include video games, virtual reality, mixed reality, or the like where spatial information may be prioritized over tonal information, such as to locate a prompt, character, or item, or navigational aid.
- for audio content such as music, the 4 kHz content can be amplified according to the audiogram 122 or user preferences 126.
- the data processing system 100 can include, interface with, or otherwise utilize at least one audiography circuit 106.
- the audiography circuit 106 can generate, adjust, or otherwise interface with an audiogram 122 for a user (e.g., specific to the user). For example, the audiography circuit 106 can present an indication to a user (e.g., via the user interface 108 or the speaker 110) to indicate an audible portion of a test pattern as received by the user.
- the audiography circuit 106 can provide one or more prompts to a user interface 108 and receive a response to said prompts, indicative of a receipt of audio content 128 from the user.
- the audiography circuit 106 can normalize a frequency response of a speaker 110, such as by conveying a device identifier or attribute to the DSP circuit 104 for processing of audio content 128 conveyed to the speaker 110.
- the data processing system 100 can include, interface with, or otherwise utilize at least one user interface 108.
- the user interface 108 can include audio, visual, or control interfaces.
- the user interface 108 can include or interface with a control, such as a mechanical switch, touchscreen, capacitive switch, or the like.
- the control can be a keyboard, dedicated control of a speaker 110 (e.g., a button or microphone intrinsic to a headphone set), or may be remote from the audio output device.
- the control can be a control of a graphical user interface (GUI).
- the user interface 108 can present the GUI by a device in network communication with the controller 102 or the speaker 110.
- the GUI can be a GUI of an application on a mobile or other device.
- the GUI can present various prompts or information (e.g., to a user) as depicted throughout the present disclosure.
- prompts can be presented via the user interface 108, or the speaker 110.
- various prompts or information may be alternated between the user interface 108 or the speaker 110, unless stated otherwise.
- an audio output to determine a hearing of a user is presented by the speaker 110.
- the data processing system 100 can include, interface with, or otherwise utilize at least one speaker 110.
- the speaker 110 can include a wired or wireless device such as a hearing aid, headphones (e.g., in-ear headphones, over-the-ear headphones), stand-alone speaker, etc.
- the speaker 110 can include various transducers such as for a left and right ear, or a frequency range (e.g., subwoofer, mid-range, tweeter, etc.).
- the speaker 110 can include a communications port communicatively coupled with the controller 102 to exchange information therewith.
- the speaker 110 can provide identity information of the speaker 110, device characteristics 124, or audio content information to the controller 102.
- the speaker 110 can receive audio content 128 from the controller 102.
- the speaker 110 can include or interface with a transducer, a digital to analogue converter (DAC), an amplifier or other components (e.g., Codecs, wireless transceivers, etc.).
- the speaker identity can be defined based on a subset of the components of the speaker 110. For example, a headphone manufacturer model number can identify an audio output device, comprising a DAC, amplifier, or other components.
- the data processing system 100 can include, interface with, or otherwise utilize at least one audio content recognition circuit 112.
- the audio content recognition circuit 112 can determine a type of audio content 128.
- the audio content recognition circuit 112 can receive a tag associated with content, receive a user indication of a content type, or infer a content type based on a history of user entries, a time, location, device, or the like.
- the audio content recognition circuit 112 can determine that audio content 128 contains spatial information based on an association with a VR headset.
- the tag or other indication of user content can indicate a genre of music, audiovisual content, or other content types.
- audio content 128 can include more than one type.
- a video game can include a music track, spatial content (e.g., events, items, or the like), or speech content (e.g., character dialogue).
- the audio content recognition circuit 112 can determine one or more applicable content types. For example, the audio content recognition circuit 112 can select a primary content type which may vary over time, may receive separate audio streams of different audio types, or may cause the DSP circuit 104 to disaggregate content types, and thereafter process the disaggregated content types separately.
- the audio content recognition circuit can receive an indication of audio content from a user, via the user interface 108.
- the audio content recognition circuit 112 can include, interface with or otherwise employ machine learning models (e.g., supervised learning). For example, the audio content recognition circuit 112 can train a model, or receive a trained model. The audio content recognition circuit 112 can employ the trained model to determine an audio content type, or a probability of a match to one or more audio content types. Such models are further discussed with regard to, for example, FIGs. 8 and 9.
- the audio content recognition circuit 112 can cause the user interface 108 to present a selection of audio content types to a user based on an output of a trained model and receive audio content type information therefrom. The model may be trained based on a user response.
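- Presenting a selection of content types based on a model output could look like the sketch below. It assumes the trained model yields a mapping from content type to probability; the function name `top_content_types`, the limit of three options, and the 0.1 probability floor are hypothetical and not taken from the disclosure.

```python
def top_content_types(probabilities, k=3, min_probability=0.1):
    """Return up to `k` content types, most probable first.

    Types below `min_probability` are dropped so the user is only
    offered plausible options.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [name for name, p in ranked[:k] if p >= min_probability]


# Hypothetical model output for a clip of audio content.
model_output = {"speech": 0.6, "music": 0.3, "spatial": 0.05, "other": 0.05}
options = top_content_types(model_output)
```

- The user's selection among `options` could then serve as a label for further training, consistent with the model being trained based on a user response.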
- FIG. 2 depicts an example of a frequency response plot 200 for a speaker 110.
- the frequency response plot 200 can be associated with a device identifier such that the controller 102 can receive the frequency response plot 200 from the speaker 110 or receive an identifier of the speaker 110 (e.g., via the user interface 108).
- the frequency response can be detected by a microphone.
- the frequency response plot 200 includes an unadjusted frequency response 202 indicative of a frequency response intrinsic to the speaker 110.
- the unadjusted frequency response 202 can indicate a default setting of the speaker.
- various unadjusted frequency responses 202 can be associated with a device such as according to a setting, position, or the like.
- An aggregated unadjusted frequency response 202 can combine (e.g., average) multiple unadjusted frequency response 202 curves.
- the DSP circuit 104 can process audio content 128 to generate an adjusted frequency response curve 204.
- the DSP circuit 104 can adjust a magnitude of an amplitude corresponding to frequency content of the audio content 128 such that the adjusted frequency response curve 204 is “flattened” (e.g., a deviation of a magnitude of response between various frequencies is reduced).
- the center magnitude of the adjusted frequency response curve 204 can be normalized (e.g., can bound a same offset, such as 0 dB).
- various speakers 110 can be harmonized such that an indication of audibility from a user of a first speaker 110 can be indicative of an audiogram 122 of the user with various speakers (e.g., the speaker 110 can be de-conflated).
- the DSP circuit 104 can subdivide a portion of the frequency curve into frequency bands, and adjust the magnitude of each band to reduce a variation therebetween (e.g., can implement an audio equalizer).
- the user interface 108 can present a graphical representation for the audio equalizer.
- the graphical representation can include controls to adjust one or more frequency bands.
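- The band-wise flattening described for FIG. 2 can be sketched as an equalizer that computes one correction per band. This is an illustrative simplification: the measured response is assumed to be a mapping from frequency (Hz) to magnitude (dB), band levels are taken as the plain average of the dB values in each band, and the names `band_means_db` and `flattening_gains_db` and the sample values are hypothetical.

```python
from statistics import mean


def band_means_db(response_db, band_edges_hz):
    """Average the measured magnitude (dB) inside each frequency band."""
    means = []
    for low, high in zip(band_edges_hz, band_edges_hz[1:]):
        points = [mag for freq, mag in response_db.items() if low <= freq < high]
        if not points:
            raise ValueError(f"no measurements between {low} and {high} Hz")
        means.append(mean(points))
    return means


def flattening_gains_db(response_db, band_edges_hz, target_db=0.0):
    """Return one gain per band that moves the band mean to `target_db`."""
    return [target_db - m for m in band_means_db(response_db, band_edges_hz)]


# Hypothetical unadjusted response: weak bass, slight mid peak, rolled-off treble.
measured = {100: -10.0, 200: -8.0, 1000: 0.0, 2000: 2.0, 8000: -4.0, 12000: -6.0}
edges = [50, 500, 5000, 16000]
eq_gains = flattening_gains_db(measured, edges)
```

- Applying `eq_gains` brings every band mean to the 0 dB target, which reduces the deviation between bands; the controls of a graphical equalizer could expose the same per-band values for manual adjustment.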
- FIG. 3 depicts an example method 300 of determining an audiogram.
- the data processing system 100 presents a test output at operation 302.
- the user interface 108 receives an indication of user receipt of the test output at operation 304.
- the speaker 110 presents an nth (e.g., first or subsequent) output at a first volume, and varies the volume until receiving an indication of user acknowledgment.
- the user interface 108 receives an indication of audibility.
- the controller 102 causes a modification of the amplitude of the signal.
- the controller 102 determines whether the output pattern is complete at operation 312. Responsive to determining a non-completion of the output pattern at operation 312, the controller adjusts a frequency of the output at operation 314. Responsive to determining a completion of the output pattern, the controller generates an audiogram at operation 316.
- the data processing system 100 presents a test output.
- the test output can be presented via the speaker 110, or the user interface 108.
- the test output can include a tone, an audible command or instructions, or the like.
- the test output can include a visual presentation such as a blinking LED, instructions of the test (e.g., presented via the user interface), or the like.
- a visual presentation can instruct a user to acknowledge receipt of an audio or visual presentation (e.g., to interact with a control interface, such as a spacebar on a keyboard).
- the test output can include an audible tone at a predefined duty cycle (e.g., 50%).
- the controller 102 receives an indication of user receipt of the test output.
- the controller 102 can receive an indication via a user interface control (e.g., a mechanical key, microphone, touch-sensitive key, or the like).
- Such an indication can verify an operation of the data processing system 100 (e.g., a speaker 110 thereof), and verify a status of a user.
- a failure of a user to acknowledge a test output may be indicative of an operational state of a speaker or a control interface, or a user having profound hearing loss.
- the method 300 can proceed to operation 306.
- the method 300 can maintain the test output.
- the controller 102 can include a timeout function. For example, upon a failure to receive a response to the test output in a predefined amount of time, the controller can cause a cessation of the method or otherwise may halt the test output.
- the controller 102 can cause the speaker 110 to produce a first output.
- the first output can include an ascending or descending volume sweeping across various frequency bands.
- the controller 102 can cause the speaker 110 to produce a second output.
- the second output can include an ascending or descending volume sweeping across various frequency bands.
- the controller 102 can cause the speaker 110 to produce an nth output.
- Each transition between outputs can be intermediated by a progression to operation 308.
- the user interface 108 can provide a prompt to actuate a control interface responsive to a detection or non-detection of the output incident to the output.
- the user interface 108 can prompt a user to indicate a first perception of an audio signal increasing in volume, or a last perception of an audio signal decreasing in volume.
- the user interface 108 receives an indication of receipt of the output provided at operation 312.
- the user interface 108 can receive the indication from a control interface intrinsic to the speaker 110, or a separate user interface 108.
- the user interface can receive an indication to repeat a test pattern (e.g., responsive to an inadvertent response, or the like).
- the controller 102 can cause a modification of the amplitude of the signal.
- the controller 102 can cause the output to increase or decrease in volume.
- the controller 102 can cause the output to increase or decrease according to a pre-defined power or sound pressure level (e.g., 5 dB steps).
- an additional instance of an output can be presented to the user (e.g., to confirm or interrogate a level of perception).
- an additional instance of the first output can be presented to the user.
- the second instance can include a same or different magnitude adjustment as the first instance (e.g., may be presented at 2.5 dB steps).
- the second instance may omit one or more magnitudes (e.g., may provide a narrower band of magnitudes, which may be centered about the detected threshold of the first instance).
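- As a minimal sketch of the second instance described above, assuming a hypothetical `heard(freq_hz, level_db)` callback that stands in for the user's acknowledgment, a narrower band of magnitudes centered on the coarse threshold can be re-tested at 2.5 dB steps. The function name, band half-width, and simulated listener are illustrative assumptions, not details from the disclosure.

```python
def refine_threshold(freq_hz, coarse_db, heard, fine_step_db=2.5, half_width_db=5.0):
    """Re-test a narrow band of levels centered on the coarse threshold.

    `heard` is a hypothetical callback returning True when the user
    acknowledges the output. Returns the lowest acknowledged level, or
    None if no level in the band was acknowledged.
    """
    level = coarse_db - half_width_db
    while level <= coarse_db + half_width_db:
        if heard(freq_hz, level):
            return level
        level += fine_step_db
    return None

# Simulated listener (assumption): true threshold of 12 dB at every frequency.
simulated = lambda freq_hz, level_db: level_db >= 12.0
refined = refine_threshold(1000.0, coarse_db=15.0, heard=simulated)
```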
- the controller can determine if the test pattern is complete.
- the test pattern can be predefined; the controller can compare a number of increments to a predetermined number.
- the test pattern can be adaptive. For example, incident to a failure to detect audio of one or more frequency bands, or a threshold of detection thereof, some frequency bands, steps, or the like may be omitted. For example, the controller can determine that an output at 16.8 kHz can be omitted for a user who indicates no perception of audio at 12 kHz, 13.2 kHz, 14.4 kHz, and 15.5 kHz.
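- One possible form of such an adaptive rule is sketched below. The rule itself (omit a candidate frequency when the four nearest lower frequencies tested all produced no perception) is an assumption chosen to match the 16.8 kHz example; the disclosure does not fix a specific rule.

```python
def can_omit(candidate_hz, results, misses_required=4):
    """Decide whether a candidate frequency can be skipped.

    `results` maps each tested frequency (Hz) to True if the user
    perceived it. The candidate is omitted only when the nearest
    `misses_required` lower frequencies were all not perceived.
    """
    lower = sorted(f for f in results if f < candidate_hz)
    recent = lower[-misses_required:]
    return len(recent) == misses_required and not any(results[f] for f in recent)

tested = {12000.0: False, 13200.0: False, 14400.0: False, 15500.0: False}
skip_16k8 = can_omit(16800.0, tested)
```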
- the controller can adjust a frequency band for an output.
- the frequency band can be determined according to a predefined test pattern or dynamically selected.
- the controller can cause the frequency band to increase or decrease logarithmically from about 100 Hz to about 16 kHz for a total of 150 frequency steps.
- additional, fewer, or different (e.g., linear) frequency steps can be selected.
- the frequency steps can ascend monotonically, descend monotonically, be randomly or pseudo-randomly distributed, or the like.
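- A logarithmic progression of 150 steps from about 100 Hz to about 16 kHz can be sketched as follows. The function name and the choice to include both endpoints are assumptions made for illustration.

```python
def log_frequency_steps(f_start=100.0, f_stop=16000.0, steps=150):
    """Return `steps` frequencies evenly spaced on a log scale, endpoints included."""
    if steps < 2:
        raise ValueError("need at least two steps")
    # Constant ratio between neighboring frequencies.
    ratio = (f_stop / f_start) ** (1.0 / (steps - 1))
    return [f_start * ratio ** i for i in range(steps)]

ascending = log_frequency_steps()
descending = list(reversed(ascending))  # monotonically descending variant
```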
- the controller 102 can generate an audiogram 122 for the user.
- the audiogram 122 can be saved by the data processing system, or conveyed via the user interface 108. According to some embodiments, the audiogram 122 or a transfer function derived therefrom can be conveyed to the speaker 110, or the DSP circuit 104 (e.g., to process audio content 128 based on the audiogram).
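- The loop of method 300 can be sketched under simplifying assumptions: an ascending volume at each frequency, a fixed 5 dB step, and a hypothetical `heard` callback in place of the user interface 108. The operation numbers in the comments follow the description above; everything else (names, level range, simulated listener) is illustrative.

```python
def run_threshold_test(frequencies_hz, heard, start_db=0.0, step_db=5.0, max_db=90.0):
    """Return {frequency: lowest acknowledged level in dB, or None}."""
    audiogram = {}
    for freq in frequencies_hz:          # operation 314: move to the next frequency
        level = start_db
        threshold = None
        while level <= max_db:           # operation 306: present the output
            if heard(freq, level):       # operation 308: indication of audibility
                threshold = level
                break
            level += step_db             # operation 310: modify the amplitude
        audiogram[freq] = threshold
    return audiogram                     # operation 316: audiogram data points

# Simulated listener (assumption) with known true thresholds in dB.
true_thresholds = {250.0: 12.0, 1000.0: 7.0, 4000.0: 33.0}
listener = lambda freq, level: level >= true_thresholds[freq]
measured = run_threshold_test(sorted(true_thresholds), listener)
```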
- FIG. 4 depicts an example of an audiogram-threshold chart 400.
- the user interface 108 (of FIG. 1) can present an audiogram-threshold chart 400.
- the controller 102 can compare an audiogram 122 to a threshold to generate an equalization function absent a graphical depiction thereof, or process other data presented herein as a chart without generating a user viewable chart.
- the audiogram can be generated, for example, responsive to execution of the method 300 of FIG. 3.
- the audiogram-threshold chart 400 includes a frequency axis 402 (or function) relative to an audibility axis 404 (or function).
- the audibility axis 404 can depict a sound pressure, power, or other indication of volume corresponding to an audibility of the user.
- a complete audiogram 122 of the user is provided, for reference.
- the audiogram-threshold chart 400 depicts one or more thresholds 406, which may define normal hearing or a threshold for hearing loss (e.g., mild hearing loss, moderate hearing loss, or the like).
- the audiography circuit 106 presents the audio signal to the user.
- the user may indicate a receipt of the audio signal (e.g., via a control interface). Responsive to the receipt of the user, the audiography circuit 106 can cause the audio signal to be modified to change an amplitude thereof. For example, the audiography circuit 106 can cause the audio signal to be decreased a predefined amount (e.g., 12 dB, 5 dB, or 2.5 dB).
- the user may continue to audibly perceive the audio signal, and the data processing system 100 can receive an indication of said perception. Responsive to such a receipt, the data processing system 100 may further modify (e.g., reduce) the amplitude.
- the user may not perceive the audio (and may indicate non-receipt / non-perception of the audio signal via the control interface). For example, the user may indicate non-perception of the audio signal by selecting a different button or interface element, by de-selecting the control interface, etc.
- the audiography circuit 106 may increase a volume or amplitude of a subsequent frequency presented to the user responsive to receiving the indication of non-perception of the audio signal.
- the audiography circuit 106 can continue to present each of a fourth 414, fifth 416, sixth 418, seventh 420, eighth 422, ninth 424, tenth 426 frequency, and so on such that an amplitude of the various frequencies can bracket, bound, or otherwise define the hearing of the user so as to generate an audiogram 122.
- the audiography circuit 106 can generate the audiogram 122 by applying a smoothing function to the data points at each of the frequencies 408-426.
- the audiography circuit 106 can generate further modifications to the audiogram 122 based on a device identity, such as is described with regard to FIG. 2, or a head related transfer function (HRTF) as is described with regard to FIG. 7.
- the controller 102 can determine one or more deviations of a determined audiogram 122 relative to the threshold. For example, the controller 102 can determine elevated hearing or hearing loss for one or more frequencies based on the relative position of the audiogram 122 and the threshold 406.
- a reduction of a spacing between frequencies may increase a performance of the audiography circuit 106 (e.g., may determine a more accurate audiogram 122).
- a frequency sweep may refer to a sequence of discrete frequencies, or a continuous or near-continuous sweep between a first frequency and a second frequency, such that an amplitude is determined for each frequency of a band of frequencies.
- further data may be sampled at same or different frequencies along the frequency spectrum.
- the controller 102 can cause the audio signal to be presented to the user at the same frequencies, and a further predefined step, smaller than the previous step may be employed (e.g., 5 dB for the first step and 2.5 dB for the second step).
- the user interface 108 can receive a desired time to generate a profile, or a number of frequency steps to test, such that a user may select increased data to generate a more accurate audiogram, or decreased data to generate an audiogram faster.
- the audiography circuit 106 can generate a separate audiogram-threshold chart 400 for each ear.
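- The smoothing of data points and the comparison against a threshold 406 can be sketched as below. A centered moving average is only one possible smoothing function (an assumption; the disclosure does not name one), and the sign convention (positive deviation means the user needed a louder signal than the threshold) is likewise assumed.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def deviations(audiogram_db, threshold_db):
    """Signed difference between the audiogram and the threshold at each frequency."""
    return [a - t for a, t in zip(audiogram_db, threshold_db)]

raw_points_db = [10.0, 10.0, 40.0, 40.0, 40.0]   # illustrative data, not measured
smoothed_db = moving_average(raw_points_db)
deviation_db = deviations(smoothed_db, [25.0] * len(smoothed_db))
```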
- FIG. 5 depicts an example of an equalizer profile 500.
- the controller 102 can generate the equalizer profile based on the audiogram-threshold chart 400 of FIG. 4.
- various channels of the equalizer profile can provide an adjustment to various frequency bands.
- a fifth frequency band adjustment 510 can increase a magnitude of audio output to mitigate a decrease of audible perception of a user (e.g., a deviation).
- One or more frequency bands can be associated with no adjustment, or another adjustment (e.g., a larger or smaller offset from a 0 dB center magnitude).
- each of a first frequency band adjustment 502, third frequency band adjustment 506, fourth frequency band adjustment 508, sixth frequency band adjustment 512, seventh frequency band adjustment 514, and eighth frequency band adjustment 516 can have a lesser magnitude than the fifth frequency band adjustment 510.
- a second frequency band adjustment 504 can be omitted (e.g., can be indicated as 0 dB).
- the magnitude of the frequency band adjustments can be based on an average, maximum, minimum, or other deviation between an audiogram 122 and a threshold 406.
- an audiogram 122 can be generated for two ears, and an adjustment can be made between the two ears.
- a deviation can be associated with a left ear, and the DSP circuit 104 can insert left channel audio content within a frequency of or proximate to the deviation in a right channel of the speaker.
- a DSP circuit 104 can amplify a frequency band based on a magnitude of a deviation of the frequency band from a threshold.
- an amplification can be based on a degree of hearing loss.
- the controller 102 can cause the DSP circuit 104 to amplify a frequency band.
- a first threshold (e.g., mild hearing loss, such as a deviation of less than 20 dB).
- the controller 102 can cause the DSP circuit 104 to insert audio content between channels (e.g., channels corresponding to a left ear and a right ear).
- the DSP circuit 104 can amplify a different frequency than a detected deviation. In some embodiments, the DSP circuit 104 can amplify a harmonic of a detected deviation. For example, the DSP circuit 104 can amplify a harmonic of a detected deviation which may increase an intelligibility of human speech. For example, the DSP circuit 104 can amplify a frequency band at about 8 kHz in response to a detected deviation at 4 kHz. The DSP circuit 104 can encode 4 kHz information at 8 kHz to increase an intelligibility thereof. In some embodiments, the DSP circuit 104 can process audio content 128 to increase intelligibility of directional information.
- the controller 102 can receive a content type from the audio content recognition circuit 112, such as a virtual reality or video game content type. Responsive to the content type, the controller 102 can determine that directional information is relevant to the audio content. Responsive to the content type, the controller 102 can cause a DSP circuit 104 to process the audio. For example, the DSP circuit 104 can frequency shift the audio content 128 to avoid a frequency associated with a deviation. For example, responsive to an audiogram indicating hearing loss at 4 kHz, the DSP circuit 104 can frequency shift 4 kHz information to 5 kHz, which may retain directional information.
- amplitude modifications may not correspond linearly to audio content or user hearing.
- the DSP circuit 104 may increase a component of audio content less than, equal to, or greater than a corresponding deviation of an audiogram 122 for the user.
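- A band-by-band equalizer profile can be derived from deviations roughly as follows. This is a sketch under assumptions: deviations are keyed by frequency in Hz, the band edges and the 20 dB gain cap are invented for illustration, and negative deviations are clamped to 0 dB (no adjustment). The `reducer` argument mirrors the choice of average, maximum, or other statistic described above.

```python
from statistics import mean

def equalizer_profile(deviation_by_freq, band_edges_hz, reducer=mean, max_gain_db=20.0):
    """Return one gain (dB) per band, reduced from the deviations inside that band."""
    gains = []
    for lo, hi in zip(band_edges_hz, band_edges_hz[1:]):
        in_band = [d for f, d in deviation_by_freq.items() if lo <= f < hi]
        gain = reducer(in_band) if in_band else 0.0   # empty band: 0 dB
        gains.append(min(max(gain, 0.0), max_gain_db))
    return gains

# Illustrative deviations (dB) between an audiogram and a threshold.
example_deviation = {250.0: 0.0, 500.0: 5.0, 2000.0: 10.0, 3000.0: 20.0, 6000.0: 40.0}
edges = [100.0, 1000.0, 4000.0, 16000.0]
average_profile = equalizer_profile(example_deviation, edges)
maximum_profile = equalizer_profile(example_deviation, edges, reducer=max)
```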
- FIG. 6 depicts an example of a correction adjustment interface 600 of a user interface 108.
- the correction adjustment interface 600 can adjust a magnitude of an adjustment employed by the DSP circuit 104.
- a correction adjustment interface 600 can depict a modification amplitude, a preservation of spatial information, or the like.
- the correction adjustment interface 600 can include a selection bar 602, ring, toggle, or the like.
- a selection marker 604 may indicate a selected position on the selection bar 602. The selection marker can separate a first portion of the selection bar 606 indicating a lesser correction, and a second portion of the selection bar 608 indicating a greater correction.
- the user interface 108 can receive an adjustment of the selection marker (e.g., by sliding the selection marker 604 along the selection bar 602) to define a magnitude of modification of audio content.
- the user may select various points along the selection bar 602.
- the DSP circuit 104 can generate an audio signal corresponding to the selected degree of correction.
- the user can select a subjective preference of modification for music, a maximum intelligibility for speech, or the like.
- the correction adjustment interface 600 can include various selection bars 602 corresponding to various content types or speakers.
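- The effect of the selection marker 604 can be sketched by mapping its position to a scale factor applied to each band adjustment. The 0.0 to 1.0 position range, the linear scaling, and the per-content-type positions are assumptions for illustration only.

```python
def apply_correction_strength(band_gains_db, position):
    """Scale each band gain by the marker position (0.0 = lesser, 1.0 = greater)."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    return [gain * position for gain in band_gains_db]

# Hypothetical marker positions, one selection bar per content type.
positions = {"music": 0.5, "speech": 1.0}
full_profile_db = [2.5, 15.0, 20.0]
music_profile_db = apply_correction_strength(full_profile_db, positions["music"])
speech_profile_db = apply_correction_strength(full_profile_db, positions["speech"])
```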
- FIG. 7 depicts a top view of an environment 700 including a user 702 in relation to a speaker 110 and an environment boundary 704 (e.g., a wall).
- the data processing system 100 can define a direct audio path 706 between the user 702 and the speaker 110.
- a direct audio path can be defined corresponding to a relationship between each transceiver of a speaker and each ear of a user 702.
- the data processing system 100 can define one or more indirect audio paths 708.
- an indirect audio path 708 can include various reflections between an environment 700 or a user 702.
- a head-related transfer function (HRTF) can define a relationship between an audio source and an eardrum of a user 702, including the user’s ears, shoulders, etc.
- the HRTF may be a generic HRTF or a custom HRTF based on user attributes (e.g., height, sex, ear geometry, etc.).
- One or more locations of a speaker 110 can be defined virtually, such that a speaker location for spatial audio differs from a physical speaker location.
- a physical speaker location can be over the ears (e.g., headphones), whereas a virtual speaker location 710 may be otherwise disposed.
- one or more virtual speaker locations 710 may correspond to a band on a stage, a navigational icon in an AR navigation application, a character in a VR game, etc.
- Such HRTFs can be employed with any of the techniques described herein.
- the DSP circuit 104 can convolve the HRTF for each ear against audiogram-corrected audio content 128 for each ear to generate various virtual speaker locations of processed audio.
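- The per-ear convolution can be sketched in the time domain, where the HRTF is represented by its impulse response (often called an HRIR). The direct-form convolution below is a plain illustration; the short impulse responses and sample values are toy assumptions, not measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(corrected_left, corrected_right, hrir_left, hrir_right):
    """Convolve each ear's audiogram-corrected samples with that ear's impulse response."""
    return convolve(corrected_left, hrir_left), convolve(corrected_right, hrir_right)

# Toy data (assumption): two samples per ear, two-tap impulse responses.
left_out, right_out = render_binaural([1.0, 2.0], [1.0, 2.0], [1.0, 0.5], [0.5, 0.0])
```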
- Supervised learning is a method of training a machine learning model given input-output pairs.
- An input-output pair is an input with an associated known output (e.g., an expected output).
- Machine learning models 804 may be trained on known input-output pairs such that the machine learning model 804 can learn how to predict known outputs given known inputs. Once the machine learning model 804 has learned how to predict known input-output pairs, the machine learning model 804 can operate on unknown inputs to predict an output.
- the machine learning model 804 may be trained based on general data and/or granular data (e.g., data based on a specific user 832) such that the machine learning model 804 may be trained specific to a particular user 832, a particular class of user, or a particular ear, a particular content type, etc.
- Training inputs 802 and actual outputs 810 may be provided to the machine learning model 804.
- Training inputs 802 may include frequency composition of audio content, any tags or metadata associated with the audio content, a location of a user, a time of access, or device employed (e.g., a home theatre system, VR goggles, ear buds, etc.), historical use, similar users (e.g., according to an age, interest, genre of preferred music, etc.) and the like.
- the inputs 802 and actual outputs 810 may be received from any of the data repositories.
- a data repository may contain a list of audio content selected by the user, a history of use of the user, a device identifier, a user preference, feedback received via the user interface 108, or the like.
- the data repository may also contain data associated with related users.
- the machine learning model 804 may be trained to predict audio content type based on the training inputs 802 and actual outputs 810 used to train the machine learning model 804.
- the data processing system 100 may include one or more machine learning models 804.
- a first machine learning model 804 may be trained to predict data such as an audio content type.
- the first machine learning model 804 may use the training inputs 802 of a device identifier, metadata associated with audio content, or frequency content of audio content to predict outputs 806 such as by applying the current state of the first machine learning model 804 to the training inputs 802.
- the comparator 808 may compare the predicted outputs 806 to actual outputs 810 which may include adjustments to the predicted outputs 806 via the user interface 108 to determine an amount of error or differences.
- the predicted audio content type may be compared to the actual user selected or adjusted audio content type (e.g., actual output 810).
- a second machine learning model 804 may be trained to make one or more recommendations to the user 832 based on the predicted output from the first machine learning model 804.
- the second machine learning model 804 may use the training inputs 802 (e.g., frequency composition of audio content, any tags or metadata associated with the audio content, a location of a user, a time of access, or device employed (e.g., a home theatre system, VR goggles, ear buds, etc.), historical use, similar users (e.g., according to an age, interest, genre of preferred music, etc.) and the like) to predict outputs 806 of an audio content type by applying the current state of the second machine learning model 804 to the training inputs 802.
- the comparator 808 may compare the predicted outputs 806 to actual outputs 810 (e.g., amplification or spatial information preservation) to determine an amount of error or differences.
- the actual outputs 810 may be determined based on historic data of recommendations made to the user 832.
- a user may select an audio signal comprising pop music.
- the machine learning model 804 may determine a genre of the music. Based on the genre, the machine learning model can implement, propose, or convey (e.g., via the user interface) one or more suggestions to the user, such as to confirm the genre type, or select a modification of the audio content based on the audiogram 122 of the user.
- a single machine learning model 804 may be trained to make one or more recommendations to the user 832 based on current user 832 data received from enterprise resources 828. That is, a single machine learning model may be trained using the training inputs of a plurality of various users to predict outputs 806 such as genres of music, special information, human speech, or the like by applying the current state of the machine learning model 804 to the training inputs 802.
- the comparator 808 may compare the predicted outputs 806 to actual outputs 810 to determine an amount of error or differences.
- the actual outputs 810 may be determined based on historic data associated with the recommendation to the user 832.
- the error (represented by error signal 812) determined by the comparator 808 may be used to adjust the weights in the machine learning model 804 such that the machine learning model 804 changes (or learns) over time.
- the machine learning model 804 may be trained using a backpropagation algorithm, for instance.
- the backpropagation algorithm operates by propagating the error signal 812.
- the error signal 812 may be calculated each iteration (e.g., each pair of training inputs 802 and associated actual outputs 810), batch and/or epoch, and propagated through the algorithmic weights in the machine learning model 804 such that the algorithmic weights adapt based on the amount of error.
- the error is minimized using a loss function.
- loss functions may include the square error function, the root mean square error function, and/or the cross entropy error function.
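- The three loss functions named above can be written out as follows. These are the standard textbook definitions; the function names and the `eps` guard against a zero probability are illustrative choices.

```python
import math

def squared_error(predicted, actual):
    """Sum of squared differences between predicted and actual outputs."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

def root_mean_square_error(predicted, actual):
    """Square root of the mean squared difference."""
    return math.sqrt(squared_error(predicted, actual) / len(predicted))

def cross_entropy(predicted_probs, actual_one_hot, eps=1e-12):
    """Cross entropy between a predicted distribution and a one-hot actual output."""
    return -sum(a * math.log(max(p, eps)) for p, a in zip(predicted_probs, actual_one_hot))
```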
- the weighting coefficients of the machine learning model 804 may be tuned to reduce the amount of error, thereby minimizing the differences between (or otherwise converging) the predicted output 806 and the actual output 810.
- the machine learning model 804 may be trained until the error determined at the comparator 808 is within a certain threshold (or a threshold number of batches, epochs, or iterations have been reached).
- the trained machine learning model 804 and associated weighting coefficients may subsequently be stored in memory 816 or other data repository (e.g., a database) such that the machine learning model 804 may be employed on unknown data (e.g., not training inputs 802).
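- The training loop described above (predict, compare, adjust weights, stop at an error threshold or an epoch limit) can be sketched on a deliberately tiny model. A single linear unit trained by per-sample gradient descent on a squared error stands in for machine learning model 804; it is not the model of the disclosure, and the learning rate, threshold, and data are assumptions.

```python
def train_linear(pairs, learning_rate=0.05, error_threshold=1e-6, max_epochs=10000):
    """Fit predicted = weight * x + bias to (input, actual output) pairs."""
    weight, bias = 0.0, 0.0
    epochs_run = 0
    while epochs_run < max_epochs:
        epoch_error = 0.0
        for x, actual in pairs:
            predicted = weight * x + bias      # predicted output
            error = predicted - actual         # comparator: error signal
            epoch_error += error ** 2
            # Gradient of the squared error with respect to each parameter.
            weight -= learning_rate * 2.0 * error * x
            bias -= learning_rate * 2.0 * error
        epochs_run += 1
        if epoch_error < error_threshold:      # stop once the error is within the threshold
            break
    return weight, bias, epochs_run

# Illustrative input-output pairs generated from actual = 2 * x + 1.
weight, bias, epochs_run = train_linear([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```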
- the machine learning model 804 may be employed during a testing (or inference) phase.
- the machine learning model 804 may ingest unknown data to predict future data (e.g., a modification to increase the intelligibility of speech, a modification for a new genre of music, or special content associated with a new device, and the like).
- the neural network model 900 may include a stack of distinct layers (vertically oriented) that transform a variable number of inputs 902 being ingested by an input layer 904, into an output 906 at the output layer 908.
- the neural network model 900 may include a number of hidden layers 910 between the input layer 904 and output layer 908. Each hidden layer has a respective number of nodes (912, 914 and 916).
- the first hidden layer 910-1 has nodes 912.
- the second hidden layer 910-2 has nodes 914.
- the nodes 912 and 914 perform a particular computation and are interconnected to the nodes of adjacent layers (e.g., nodes 912 in the first hidden layer 910-1 are connected to nodes 914 in a second hidden layer 910-2, and nodes 914 in the second hidden layer 910-2 are connected to nodes 916 in the output layer 908).
- Each of the nodes (912, 914 and 916) sums the values from adjacent nodes and applies an activation function, allowing the neural network model 900 to detect nonlinear patterns in the inputs 902.
- The nodes (912, 914 and 916) are interconnected by weights 920-1, 920-2, 920-3, 920-4, 920-5, 920-6 (collectively referred to as weights 920). Weights 920 are tuned during training to adjust the strength of the node. The adjustment of the strength of the node facilitates the neural network’s ability to predict an accurate output 906.
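- A forward pass through such a stack of layers can be sketched as follows. The tanh activation, the bias terms, and the example weights are assumptions added for illustration; the description above specifies only a weighted sum followed by an activation function.

```python
import math

def layer_forward(inputs, weights, biases):
    """weights[j][i] connects input i to node j; each node sums, then applies tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def network_forward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Two inputs, one hidden layer of three nodes, one output node (toy weights).
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.0]),
]
output = network_forward([0.2, -0.4], layers)
```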
- the output 906 may be one or more numbers.
- output 906 may be a vector of real numbers subsequently classified by any classifier.
- the real numbers may be input into a softmax classifier.
- a softmax classifier uses a softmax function, or a normalized exponential function, to transform an input of real numbers into a normalized probability distribution over predicted output classes.
- the softmax classifier may indicate the probability of the output being in class A, B, C, etc.
- the softmax classifier may be employed because of the classifier’s ability to classify various classes.
- Other classifiers may be used to make other classifications.
- the sigmoid function makes binary determinations about the classification of one class (i.e., the output may be classified using label A or the output may not be classified using label A).
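- The softmax and sigmoid functions mentioned above have standard definitions, sketched here. Subtracting the maximum before exponentiating is a common numerical-stability choice and does not change the result; the example scores are arbitrary.

```python
import math

def softmax(values):
    """Normalized exponential: maps real numbers to a probability distribution."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(value):
    """Maps a real number to (0, 1) for a binary determination about one label."""
    return 1.0 / (1.0 + math.exp(-value))

class_probabilities = softmax([2.0, 1.0, 0.1])   # e.g., classes A, B, C
is_label_a = sigmoid(0.8) > 0.5                  # binary determination for label A
```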
- FIG. 10 depicts an example method 1000 of generating an audiogram 122.
- the data processing system 100 generates an audio signal sweeping across a plurality of frequencies at operation 1002.
- the data processing system 100 receives a first indication of a user response to a first portion of the audio signal associated with a first frequency at 1004.
- the data processing system 100 modifies an amplitude of a second portion of the audio signal associated with a second frequency.
- the data processing system 100 receives a second indication of a user response to a second portion of the audio signal associated with the second frequency.
- the data processing system 100 generates an audiogram 122 at operation 1010, according to the first indication of the user response and the second indication of the user response.
- the data processing system 100 generates an audio signal sweeping across a plurality of frequencies.
- the plurality of frequencies can be or include frequencies within human hearing range (e.g., 20 Hz to 20 kHz, 100 Hz to 16 kHz, or the like).
- the audio signal can include one or more files, streams, sources, or the like.
- the data processing system 100 can generate the audio signal for output by an audio device (e.g., a speaker 110) to a user.
- the data processing system 100 can generate the audio signal according to a format, amplitude, and the like such that the audio signal can cause a signal which is audible to a user, or an inaudibility of which may be used to determine a level of hearing loss of the user.
- the signal may sweep according to a non-linear sequence, such as a logarithmic sequence.
- the frequencies may be defined according to a nonlinear series including a first frequency through a fifth frequency (e.g., a first frequency, second frequency, third frequency, fourth frequency, and fifth frequency).
- the data processing system 100 may receive an identifier of the audio device or speaker 110.
- the data processing system 100 may receive the identifier (e.g., device characteristics 124 described above) from the speaker 110 responsive to the speaker being communicably coupled to the data processing system 100.
- the data processing system 100 may generate the audio signal according to the identifier of the audio device or speaker 110.
- the data processing system 100 may generate the audio signal by normalizing a swept audio signal for the particular speaker 110, using the device characteristics 124 for the speaker 110 including the identifier of the speaker 110.
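- One way such a normalization could work is sketched below, assuming (hypothetically) that the device characteristics 124 include the speaker's frequency response in dB relative to a reference. The disclosure does not specify the contents of the device characteristics, so the data layout and values are illustrative.

```python
def normalize_for_speaker(sweep_levels_db, speaker_response_db):
    """Offset each drive level so the acoustic output matches the intended level.

    `speaker_response_db` is a hypothetical per-frequency response relative to
    a reference (negative where the speaker is quieter). Frequencies without
    an entry are left unchanged.
    """
    return {freq: level - speaker_response_db.get(freq, 0.0)
            for freq, level in sweep_levels_db.items()}

intended_db = {100.0: 40.0, 1000.0: 40.0, 10000.0: 40.0}
response_db = {100.0: -6.0, 1000.0: 0.0, 10000.0: 3.0}   # illustrative values
drive_db = normalize_for_speaker(intended_db, response_db)
```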
- the data processing system 100 receives a first indication of a user response to a first portion of the audio signal.
- the first indication of the user response may correspond to or be similar to a first instance of an indication of user receipt, as described above with reference to the method 300 of FIG. 3, at operation 308.
- the data processing system 100 can receive the first indication of the user response via a control interface of the audio output device, or another device (e.g., a key of a keyboard or another control interface disclosed according to the systems and methods disclosed herein).
- the first indication may be or include an indication that designates an amplitude (e.g., and a frequency) for which the audio signal is inaudible to the user.
- the data processing system 100 can receive the first indication of the user response when the user provides the first indication responsive to the audio signal no longer being audible at the first portion.
- the user may provide the first indication by releasing an interface element responsive to inaudibility of the audio signal.
- the user may depress, select, or otherwise hold down the interface element while the audio signal is audible to the user, and release the interface element when the audio signal is no longer audible (e.g., is inaudible).
- the data processing system 100 may receive the indication responsive to the user releasing the interface element.
- the first portion of the audio signal may correspond to or be associated with a first frequency (e.g., of the plurality of frequencies across which the audio signal is swept) at which the audio signal is inaudible at the particular amplitude.
- the data processing system 100 can correlate the first portion of the audio signal to the frequency (e.g., the first frequency) and amplitude corresponding thereto (e.g., an amplitude of the audio signal at the first frequency).
- the data processing system 100 can correlate the first portion of the audio signal to the frequency band and amplitude, as is further discussed with reference to the output of the method 300 of FIG. 3 (e.g., at operation 306 or 310).
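- Correlating a release of the interface element with a portion of the swept signal can be sketched as a lookup into the sweep schedule. The schedule layout (start time, frequency, amplitude) and the example values are assumptions for illustration.

```python
from bisect import bisect_right

def portion_at_release(schedule, release_time_s):
    """Return the (frequency_hz, amplitude_db) being presented at the release time.

    `schedule` is a list of (start_time_s, frequency_hz, amplitude_db) tuples
    sorted by start time.
    """
    starts = [start for start, _, _ in schedule]
    index = bisect_right(starts, release_time_s) - 1
    if index < 0:
        raise ValueError("release precedes the start of the sweep")
    _, frequency_hz, amplitude_db = schedule[index]
    return frequency_hz, amplitude_db

# Illustrative sweep: a new frequency every second at a decreasing level.
schedule = [(0.0, 100.0, 40.0), (1.0, 110.0, 35.0), (2.0, 121.0, 30.0)]
first_indication = portion_at_release(schedule, release_time_s=1.4)
```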
- the data processing system 100 modifies the amplitude of a second portion of the audio signal associated with a second frequency.
- the data processing system 100 can modify the amplitude of the second portion of the audio signal upon, responsive to, or according to receiving the first indication (e.g., at operation 1004).
- the second frequency can be a higher or lower frequency of the plurality of frequencies (e.g., as compared to the first frequency) according to a direction of the frequency sweep.
- the modification of the amplitude of the frequency can correspond to the modification of operation 310 of the method 300 of FIG. 3.
- the modification can be according to a predetermined step size (e.g., 5 dB, 2.5 dB, or the like).
- the data processing system 100 may modify the audio signal by adjusting the amplitude by various amounts.
- the data processing system 100 may adjust the amplitude of the audio signal by a reduced step size for subsequent frequency sweeps (relative to a larger step size for a previous sweep), which may increase a resolution of an audiogram 122.
- the data processing system 100 may adjust the amplitude of the audio signal by a second amount (e.g., according to a second step size) which is less than the first amount (according to a first step size), subsequent to adjusting the amplitude of the audio signal by the first amount (e.g., at the first time instance).
- the data processing system may adjust the amplitude of the audio signal by a lesser degree than previous iterations, to provide a more granular or higher resolution audiogram 122.
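- The response-driven amplitude modification across a sweep can be sketched as a simple tracking loop: the level drops by one step after a perceived portion and rises by one step after a portion that was not perceived. The `heard` callback, the starting level, and the flat simulated threshold are assumptions; a later sweep can reuse the same function with a smaller `step_db`.

```python
def tracking_sweep(frequencies_hz, heard, start_db=40.0, step_db=5.0):
    """Return a trace of (frequency_hz, level_db, perceived) for one sweep."""
    level = start_db
    trace = []
    for freq in frequencies_hz:
        perceived = heard(freq, level)           # indication of the user response
        trace.append((freq, level, perceived))
        # Modify the amplitude used for the next frequency of the sweep.
        level += -step_db if perceived else step_db
    return trace

# Simulated listener (assumption): threshold of 30 dB at every frequency.
listener = lambda freq, level: level >= 30.0
coarse_trace = tracking_sweep([100.0, 200.0, 300.0, 400.0], listener)
fine_trace = tracking_sweep([100.0, 200.0], listener, start_db=32.5, step_db=2.5)
```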
- the data processing system 100 receives a second indication of a user response to the second portion of the audio signal. Operation 1008 may be similar to operation 1004 described above.
- the data processing system 100 can correlate the second portion with the second frequency, as discussed with regard to operation 1004 (e.g., by associating an amplitude with a frequency of the audio signal).
- the data processing system 100 can correlate the second portion of the audio signal to a frequency and amplitude corresponding thereto.
- the data processing system 100 can correlate the second portion of the audio signal to the frequency band and amplitude as is further discussed with reference to the output of the method 300 of FIG. 3 (e.g., at operation 310).
- the data processing system 100 generates an audiogram 122 according to the first indication of operation 1004, and the second indication of operation 1008.
- the first and second indication can indicate a threshold of hearing at respective frequency and amplitude levels.
- the first indication can indicate normal hearing at 100 Hz.
- the second indication can indicate hearing loss at 10 kHz.
- the data processing system 100 can process various further indications (e.g., third, fourth, and so on) to generate an audiogram 122 at various further frequencies.
- the data processing system 100 can model a discrete or continuous function corresponding to an audiogram 122 of the user, as is further described according to the disclosure of operation 316 of the method 300 of FIG. 3, and throughout this disclosure (e.g., with regard to the audiography circuit 106 of FIG. 1).
- the data processing system 100 may generate a profile according to the audiogram 122.
- the profile may be similar to the equalizer profile 500 described above with reference to FIG. 5.
- the data processing system 100 may present the profile to the user, and the user may provide various modification(s) to the profile.
- the data processing system 100 may receive various modifications to the profile at one or more frequencies.
- the data processing system 100 may update the profile according to the modification.
- the data processing system 100 may apply the profile and/or the audiogram corresponding to the profile to modify audio content from various audio sources.
- the data processing system 100 may receive audio content from an audio source.
- the audio content may be or include music, songs, speech, etc.
- the data processing system 100 may modify the audio content to generate an audio output, according to the audiogram.
- the data processing system 100 may modify (such as amplify) the audio content at various frequencies according to the audiogram (e.g., by applying the audiogram to the audio content to generate the audio output).
- the data processing system 100 may modify the audio content according to a classification for the audio content.
- the data processing system 100 may include a classifier to classify audio content locally, or the data processing system 100 may identify or determine the classification provided by the audio source (e.g., the audio source may embed the classification as metadata, such as genre of music). The data processing system 100 may use the classification and the audiogram to modify the audio content. For example, depending on the classification and the audiogram, the data processing system 100 may apply different amplifications at different frequencies (e.g., as described above).
- FIG. 11 depicts an example block diagram of an example computer system 1100.
- the computer system or computing device 1100 can include or be used to implement a data processing system or its components.
- the computing system 1100 includes at least one bus 1105 or other communication component for communicating information and at least one processor 1110 or processing circuit coupled to the bus 1105 for processing information.
- the computing system 1100 can also include one or more processors 1110 or processing circuits coupled to the bus for processing information.
- the computing system 1100 also includes at least one main memory 1115, such as a random access memory (RAM) or other dynamic storage device, coupled to the bus 1105 for storing information, and instructions to be executed by the processor 1110.
- the main memory 1115 can be used for storing information during execution of instructions by the processor 1110.
- the computing system 1100 may further include at least one read only memory (ROM) 1120 or other static storage device coupled to the bus 1105 for storing static information and instructions for the processor 1110.
- a storage device 1125 such as a solid state device, magnetic disk or optical disk, can be coupled to the bus 1105 to persistently store information and instructions.
- the computing system 1100 may be coupled via the bus 1105 to a display 1135, such as a liquid crystal display, or active matrix display, for displaying information to a user.
- An input device 1130 such as a keyboard or voice interface may be coupled to the bus 1105 for communicating information and commands to the processor 1110.
- the input device 1130 can include a touch screen display 1135.
- the input device 1130 can also include a cursor control, such as a mouse, a trackball, or cursor direction keys, button on a headset, or the like for communicating direction information and command selections to the processor 1110 and for controlling cursor movement on the display 1135.
- the processes, systems and methods described herein can be implemented by the computing system 1100 in response to the processor 1110 executing an arrangement of instructions contained in main memory 1115. Such instructions can be read into main memory 1115 from another computer-readable medium, such as the storage device 1125. Execution of the arrangement of instructions contained in main memory 1115 causes the computing system 1100 to perform the illustrative processes described herein. One or more processors in a multiprocessing arrangement may also be employed to execute the instructions contained in main memory 1115. Hard-wired circuitry can be used in place of or in combination with software instructions together with the systems and methods described herein. Systems and methods described herein are not limited to any specific combination of hardware circuitry and software.
- All or part of the processes described herein and their various modifications can be implemented, at least in part, via a computer program product, i.e., a computer program tangibly embodied in one or more tangible, physical hardware storage devices that are computer and/or machine-readable storage devices for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple computers.
- a computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a network.
- processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
- a processor will receive instructions and data from a read-only storage area or a random access storage area or both.
- Elements of a computer include one or more processors for executing instructions and one or more storage area devices for storing instructions and data.
- a computer will also include, or be operatively coupled to receive data from, or transfer data to, or both, one or more machine-readable storage media, such as mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
- Computer program products are stored in a tangible form on non-transitory computer readable media and non-transitory physical hardware storage devices that are suitable for embodying computer program instructions and data.
- These include all forms of non-volatile storage, including by way of example, semiconductor storage area devices, e.g., EPROM, EEPROM, and flash storage area devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks and volatile computer memory, e.g., RAM such as static and dynamic RAM, as well as erasable memory, e.g., flash memory and other non-transitory devices.
- Coupled means the joining of two members directly or indirectly to one another. Such joining may be stationary (e.g., permanent or fixed) or moveable (e.g., removable or releasable). Such joining may be achieved with the two members coupled directly to each other, with the two members coupled to each other using a separate intervening member and any additional intermediate members coupled with one another, or with the two members coupled to each other using an intervening member that is integrally formed as a single unitary body with one of the two members.
- If “coupled” or variations thereof are modified by an additional term (e.g., directly coupled), the generic definition of “coupled” provided above is modified by the plain language meaning of the additional term (e.g., “directly coupled” means the joining of two members without any separate intervening member), resulting in a narrower definition than the generic definition of “coupled” provided above.
- Such coupling may be mechanical, electrical, or fluidic.
- the present disclosure contemplates methods, systems and program products on any machine-readable media for accomplishing various operations.
- the embodiments of the present disclosure may be implemented using existing computer processors, or by a special purpose computer processor for an appropriate system, incorporated for this or another purpose, or by a hardwired system.
- Embodiments within the scope of the present disclosure include program products including machine-readable media for carrying or having machine-executable instructions or data structures stored thereon.
- Such machine-readable media can be any available media that can be accessed by a general purpose or special purpose computer or other machine with a processor.
- machine-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code in the form of machine-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer or other machine with a processor. Combinations of the above are also included within the scope of machine-readable media.
- Machine-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing machines to perform a certain function or group of functions.
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biophysics (AREA)
- Pathology (AREA)
- Biomedical Technology (AREA)
- Heart & Thoracic Surgery (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- Veterinary Medicine (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Multimedia (AREA)
- Otolaryngology (AREA)
- Acoustics & Sound (AREA)
- Evolutionary Computation (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physiology (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Audiography and systems associated therewith are provided. An audio signal is generated for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies. A first indication is received of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies. Upon receiving the first indication of the user response, an amplitude of a second portion associated with a second frequency of the plurality of frequencies is modified. A second indication of a user response to the second portion of the audio signal is received, the second portion associated with the second frequency of the plurality of frequencies. An audiogram is generated according to the first indication of the user response and the second indication of the user response.
Description
AUDIO CONTROL SYSTEM
CROSS-REFERENCE TO RELATED PATENT APPLICATIONS
[0001] The present application claims the benefit of priority to U.S. Provisional Application No. 63/265,856, filed December 22, 2021, the disclosure of which is incorporated herein by reference in its entirety.
BACKGROUND
[0002] Audio systems can determine a hearing characteristic of a user. For example, an audiologist can determine a user response according to an audiology exam. An audio device can process audio for the user based on the hearing characteristics of the user. However, determining the hearing characteristics can employ specialized equipment, personnel, or extensive time.
SUMMARY
[0003] Systems and methods in accordance with the present disclosure can provide improved detection of hearing characteristics of a user. For example, a controller can cause a speaker to present a frequency sweep to a user, and receive an indication of the user’s receipt thereof. The controller can determine a hearing characteristic of the user according to the received indication. The controller can receive an indication of the identity of the audio output device. The controller can receive an indication of a type of audio content. For example, the controller can receive an indication of audio content including human speech, music, or spatial content (e.g., virtual reality or video game content). The controller can employ a digital signal processing (DSP) circuit to process audio content to match the hearing characteristic of a user.
[0004] According to one embodiment, a method includes: generating, by one or more processors, an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receiving, by the one or more processors, a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modifying, by the one or more processors, an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receiving, by the one or more processors, a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; and generating, by the one or more processors, an audiogram specific to the user according to the first indication of the user response and the second indication of the user response.
[0005] According to another embodiment, a system includes: a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, wherein the audio signal sweeps across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receipt of the first indication of the user response, modify an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; receive a device identifier associated with the speaker; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and the device identifier; and output, via the speaker, the audio output to the user.
[0006] According to yet another embodiment, a headphone set includes a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modify an amplitude of a second portion associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and a frequency response of the headphone set; and output, via the speaker, the audio output to the user.
[0007] These and other aspects and implementations are discussed in detail below. The foregoing information and the following detailed description include illustrative examples of various aspects and implementations, and provide an overview or framework for understanding the nature and character of the claimed aspects and implementations. The drawings provide illustration and a further understanding of the various aspects and implementations, and are incorporated in and constitute a part of this specification.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The accompanying drawings are not intended to be drawn to scale. Like reference numbers and designations in the various drawings indicate like elements. For purposes of clarity, not every component can be labeled in every drawing. In the drawings:
[0009] FIG. 1 depicts a block diagram of an example of a data processing system.
[0010] FIG. 2 depicts an example of a frequency response plot for a speaker.
[0011] FIG. 3 depicts an example method of determining an audiogram.
[0012] FIG. 4 depicts an example of an audiogram-threshold chart.
[0013] FIG. 5 depicts an example of an equalizer profile.
[0014] FIG. 6 depicts an example of a correction adjustment interface of a user interface.
[0015] FIG. 7 depicts a top view of an environment including a user in relation to a speaker and an environment boundary.
[0016] FIG. 8 depicts a block diagram of an example system using supervised learning.
[0017] FIG. 9 depicts a block diagram of a simplified neural network model.
[0018] FIG. 10 depicts an example method of generating an audiogram.
[0019] FIG. 11 depicts an example block diagram of an example computer system.
DETAILED DESCRIPTION
[0020] The present disclosure provides for many different embodiments. While certain embodiments are described below and shown in the drawings, the present disclosure provides only some examples of the principles described herein and is not intended to limit the invention to the embodiments illustrated and described.
[0021] Systems and methods described herein can provide improved detection or application of an indication of hearing for a user (e.g., an audiogram). For example, an audio output device (e.g., a speaker) can present an audio signal. The audio signal can sweep across a plurality of frequencies. For example, the audio signal can sequentially sweep between a minimum and a maximum frequency. A controller can select frequencies to present to a user according to a nonlinear function, such as a logarithmic function (e.g., frequency spacing approaching the maximum frequency may be greater than the frequency spacing approaching the minimum frequency). For example, the audio signal can include a frequency sweep from 100 Hz to 16 kHz. The frequency sweep can be continuous or include discrete frequencies (e.g., logarithmically spaced such that the individual frequencies are evenly spaced when plotted on a logarithmic axis, such as that of an audiogram). For example, the audio signal can include a tone at each frequency (e.g., a 300 ms tone). According to some embodiments, the audio signal can include dwell time at a duty cycle (e.g., 50%) with the tone. For example, the audio signal can alternate between a 300 ms tone and a 300 ms dwell time during which no audio is output, ascending or descending between various frequencies.
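The logarithmic spacing and the tone/dwell alternation described above can be illustrated with a minimal Python sketch. The function names `log_spaced_frequencies` and `sweep_schedule`, and the choice of eight test frequencies, are hypothetical and not taken from the disclosure; the 100 Hz to 16 kHz range and the 300 ms tone and dwell durations follow the examples given.

```python
def log_spaced_frequencies(f_min=100.0, f_max=16000.0, count=8):
    """Return `count` frequencies evenly spaced on a logarithmic axis.

    The count of 8 is an illustrative assumption, not a value from the text.
    """
    if count < 2:
        raise ValueError("count must be at least 2")
    # Constant ratio between neighbours gives even spacing on a log axis.
    ratio = (f_max / f_min) ** (1.0 / (count - 1))
    return [f_min * ratio ** i for i in range(count)]


def sweep_schedule(frequencies, tone_ms=300, dwell_ms=300):
    """Return (start_ms, frequency_hz, tone_ms) for each tone in the sweep.

    Each tone is followed by a silent dwell, so equal tone and dwell
    durations correspond to a 50% duty cycle.
    """
    period_ms = tone_ms + dwell_ms
    return [(i * period_ms, f, tone_ms) for i, f in enumerate(frequencies)]


if __name__ == "__main__":
    for start_ms, freq_hz, dur_ms in sweep_schedule(log_spaced_frequencies()):
        print(f"t={start_ms:5d} ms  {freq_hz:8.1f} Hz for {dur_ms} ms")
```

Because each frequency is a fixed multiple of the previous one, the absolute spacing in hertz grows toward the maximum frequency, consistent with the nonlinear selection described above.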
[0022] A control interface, such as a key of a keyboard, a user interface element of a device, a control of a headset, or the like can receive an indication of an audibility of the audio signal from the user. For example, the controller can cause the audio signal to progress through the various frequencies of the frequency sweep, and receive an indication of audibility of the audio content. For example, the control interface can detect an actuation thereof, indicating that the user hears the audio (or does not hear the audio). According to some embodiments, the speed of the adjustment between frequencies may be adjustable, such as in response to a user command, a control interface type or latency, etc.
[0023] According to some embodiments, the controller can cause an additional frequency sweep in a frequency range or volume range of interest. For example, a second frequency sweep can repeat or further refine areas of interest (e.g., transitions between audibility and inaudibility) with respect to frequencies or amplitudes. For example, amplitude adjustments can be about 5 dB in a first frequency sweep and about 2.5 dB in a second frequency sweep. According to some embodiments, the controller can cause the audio signal to be modified according to a frequency response of a speaker (e.g., via a digital signal processing circuit). For example, if a speaker is known to have a frequency response at 100 Hz which is 10 dB less than its frequency response at 2 kHz, a digital signal processing (DSP) circuit can modify the audio signal (e.g., as adjusted or generated) based on the frequency response (e.g., increase an amplitude at 100 Hz by 10 dB, relative to the amplitude at 2 kHz). The controller can generate an audiogram indicating the response of audibility, along with any modifications for the speaker.
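One way to picture the coarse and fine amplitude steps is the following Python sketch of an ascending search at a single frequency. It is an illustration under stated assumptions rather than the disclosed procedure: `find_threshold` is a hypothetical name, the `is_audible` callback stands in for the user's response via the control interface, and the 0 dB to 80 dB search range is assumed. Only the 5 dB and 2.5 dB step sizes come from the example above.

```python
def find_threshold(is_audible, start_db=0.0, max_db=80.0,
                   coarse_db=5.0, fine_db=2.5):
    """Estimate the lowest audible level at one frequency.

    `is_audible(level_db)` is a stand-in for the user's response.
    Returns None if the user never responds within the assumed range.
    """
    # First pass: raise the level in coarse steps until the user responds.
    level = start_db
    while level <= max_db and not is_audible(level):
        level += coarse_db
    if level > max_db:
        return None

    # Second pass: re-test the last coarse interval with the finer step.
    fine = max(start_db, level - coarse_db) + fine_db
    while fine < level:
        if is_audible(fine):
            return fine
        fine += fine_db
    return level


if __name__ == "__main__":
    # Simulated listener whose true threshold is 31 dB.
    print(find_threshold(lambda level_db: level_db >= 31.0))
```

With the simulated listener, the coarse pass stops at 35 dB and the fine pass narrows the estimate to 32.5 dB, showing how the second sweep resolves the transition between inaudibility and audibility more finely.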
[0024] The controller can cause audio content to be modified based on the audiogram. For example, the controller can employ the DSP circuit to modify audio content including human speech, music, video game or other virtual reality (VR) or augmented reality (AR) content, or the like. For example, the controller can cause the audio to be amplified at a same or different frequency as an indication of hearing loss or other audiogram content. Such amplification may enhance the intelligibility of speech, the subjective experience of a user, or spatial information of the audio content.
[0025] FIG. 1 depicts an example of a data processing system 100. The data processing system 100 can include, interface with, or otherwise utilize at least one controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112. The controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112 can each include or interface with at least one processing unit or other logic device such as a programmable logic array engine or module configured to communicate with the data repository 120 or database. The controller 102, digital signal processing (DSP) circuit 104, audiography circuit 106, user interface 108, speaker 110, or audio content recognition circuit 112 can be separate components, a single component, or part of the data processing system 100. The data processing system 100 can include hardware elements, such as one or more processors, logic devices, or circuits. For example, the data processing system 100 can include one or more components or structures of functionality of computing devices depicted in FIG. 11.
[0026] The data repository 120 can include one or more local or distributed databases, and can include a database management system. The data repository 120 can include computer data storage or memory and can store one or more of an audiogram 122, device characteristics 124, user preferences 126, or audio content 128. The audiogram 122 can include an indication of a hearing characteristic of audio content for a user. For example, the audiogram can include an indication of a hearing characteristic of a user across various frequencies, or a transfer function for the user, such as a head related transfer function (HRTF). The device characteristics 124 can include an identity or transfer function of a device. For example, an identity can include a manufacturer, model number, unique identifier (e.g., serial number or user-supplied identifier), or the like. A transfer function can include a frequency response, sound pressure level, or distortion for various sound pressure levels, frequencies, or the like. The user preferences 126 can include an expressed preference with regard to audio content, an intelligibility of audio content, an indication of a spatial audio (e.g., an indication of a user’s ability to distinguish a directionality of an element of audio content 128), or the like. The audio content 128 can include audio files, streams, or characteristics thereof. For example, audio files such as music or other audio content (e.g., podcasts, audio tracks for audio-visual content such as video content) or the like can be stored by or accessible to the data repository 120. Stream information such as audio content for a videogame, virtual reality, augmented reality, mixed reality, or other content can be derived therefrom, including frequency content, content type, or devices associated therewith.
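As a rough illustration of how the stored items might be organized, the following Python sketch models the audiogram 122, device characteristics 124, and user preferences 126 as simple records. All class and field names are hypothetical; the disclosure does not prescribe a schema, and a real data repository could use any database layout.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Audiogram:
    # Hearing threshold in dB, keyed by frequency in Hz (assumed layout).
    thresholds_db: Dict[float, float] = field(default_factory=dict)


@dataclass
class DeviceCharacteristics:
    # Identity of the device.
    manufacturer: str
    model_number: str
    # Frequency response in dB, keyed by frequency in Hz (assumed layout).
    frequency_response_db: Dict[float, float] = field(default_factory=dict)


@dataclass
class UserRecord:
    audiogram: Audiogram
    device: DeviceCharacteristics
    # Free-form preferences, e.g. {"content": "speech"} (illustrative only).
    preferences: Dict[str, str] = field(default_factory=dict)


if __name__ == "__main__":
    record = UserRecord(
        audiogram=Audiogram({1000.0: 10.0, 4000.0: 40.0}),
        device=DeviceCharacteristics("ExampleCo", "X-1", {100.0: -10.0, 2000.0: 0.0}),
    )
    print(record.audiogram.thresholds_db[4000.0])
```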
[0027] The data processing system 100 can include, interface with, or otherwise utilize at least one controller 102. The controller 102 can include or interface with one or more processors and memory. The processor can be implemented as a specific purpose processor, an application specific integrated circuit (ASIC), one or more field programmable gate arrays (FPGAs), a group of processing components, or other suitable electronic processing components. The processors and memory can be implemented using one or more devices, such as devices in a client-server implementation. The memory can include one or more devices (e.g., random access memory (RAM), read-only memory (ROM), flash memory, hard disk storage) for storing data and computer code for completing and facilitating the various user or client processes, layers, and modules. The memory can be or include volatile memory or non-volatile memory and may include database components, object code components, script components, or any other type of information structure for supporting the various activities and information structures of the inventive concepts disclosed herein. The memory can be communicably connected to the processor and include computer code or instruction modules for executing one or more processes described herein. The memory can include various circuits, software engines, and/or modules that cause the processor to execute the systems and methods described herein, such as to cause the communication or processing of audio signals.
[0028] The controller 102 can include or be coupled with communications electronics. The communications electronics can conduct wired and/or wireless communications. For example, the communications electronics can include one or more wired (e.g., Ethernet, PCIe, or AXI) or wireless transceivers (e.g., a Wi-Fi transceiver, a Bluetooth transceiver, a NFC transceiver, or a cellular transceiver). The controller 102 may be in network communication or otherwise communicatively coupled with the DSP circuit 104, speaker 110, or other components of the data processing system 100. The controller 102 can cause one or more operations disclosed, such as by employing another element of the data processing system. For example, operations disclosed by other elements of the data processing system may be initiated, scheduled, or otherwise controlled by the controller 102.
[0029] The data processing system 100 can include, interface with, or otherwise utilize at least one DSP circuit 104. For example, the DSP circuit 104 can receive audio content 128, user preferences 126, or audiograms 122. The DSP circuit can process audio content 128 based on the audiogram 122, the user preferences 126, or the audio content 128 (e.g., audio content type). The DSP circuit 104 can receive an audiogram 122 from the audiography circuit 106, and process the audio content 128 based on the audiogram 122. For example, the DSP circuit can normalize the audio content 128 to reduce a difference between the audiogram 122 of a user and a target audiogram 122 (e.g., by increasing a sound pressure at a frequency for which the audiogram 122 for the user indicates diminished hearing relative to the target audiogram 122).
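The normalization toward a target audiogram can be sketched as a per-frequency gain equal to the amount by which the user's threshold exceeds the target. The Python below is a simplified illustration under stated assumptions: the function names, the 20 dB gain cap, and the linear interpolation on a logarithmic frequency axis are all hypothetical choices, not details from the disclosure.

```python
import math


def correction_gains_db(user_db, target_db, max_gain_db=20.0):
    """Gain per test frequency that reduces the gap to the target audiogram.

    Both arguments map frequency (Hz) to hearing threshold (dB). Gains are
    clamped to [0, max_gain_db]; the cap is an assumed safety limit.
    """
    gains = {}
    for freq_hz, threshold in user_db.items():
        deficit = threshold - target_db.get(freq_hz, 0.0)
        gains[freq_hz] = min(max(deficit, 0.0), max_gain_db)
    return gains


def gain_at(gains, freq_hz):
    """Interpolate the gain at any frequency, linearly on a log-frequency axis."""
    points = sorted(gains.items())
    if freq_hz <= points[0][0]:
        return points[0][1]
    if freq_hz >= points[-1][0]:
        return points[-1][1]
    for (f0, g0), (f1, g1) in zip(points, points[1:]):
        if f0 <= freq_hz <= f1:
            t = math.log(freq_hz / f0) / math.log(f1 / f0)
            return g0 + t * (g1 - g0)


if __name__ == "__main__":
    gains = correction_gains_db({1000.0: 10.0, 4000.0: 40.0},
                                {1000.0: 10.0, 4000.0: 10.0})
    print(gains, gain_at(gains, 2000.0))
```

Interpolating between tested frequencies is one way to obtain a continuous correction curve from a discrete set of responses; other models could be substituted.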
[0030] In some embodiments, the DSP circuit 104 can be configured to adjust a different frequency responsive to a determination that an intelligibility, directionality, or subjective impression of audio content 128 would be enhanced thereby. For instance, human speech can be processed to amplify a harmonic of a frequency at which an audiogram 122 indicated diminished hearing of a user. For example, 7 kHz content can be amplified responsive to diminished hearing at 14 kHz, which may increase an intelligibility of human speech or other harmonic content. The DSP circuit 104 can be configured to adjust audio content 128 based on a spatial position of the audio content. For example, the DSP circuit 104 can receive an audiogram indicative of hearing loss at 4 kHz in a left ear. The DSP circuit 104 can receive an audio content type, such as from the audio content recognition circuit 112. The DSP circuit 104 can process the audio content 128 based on the audio type for a same or different ear. For example, the DSP circuit 104 can amplify a right channel of the audio responsive to non-spatial content, such as human speech from an audio output device (e.g., a single channel speaker 110). The DSP circuit 104 can frequency shift content according to spatial content. For example, a 4 kHz tone can be modified to a 5 kHz tone such that spatial information can be perceived by the user. Such embodiments can include video games, virtual reality, mixed reality, or the like where spatial information may be prioritized over tonal information, such as to locate a prompt, character, or item, or navigational aid. For audio content such as music, the 4 kHz content can be amplified according to the audiogram 122 or user preferences 126.
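The content-dependent choices in this paragraph can be summarized as a small decision function. The Python sketch below is illustrative only: `plan_adjustment` and its string labels are hypothetical, the halving rule for speech is inferred from the 14 kHz to 7 kHz example, and the 1.25 shift ratio for spatial content is inferred from the 4 kHz to 5 kHz example rather than stated as a general rule.

```python
def plan_adjustment(content_type, loss_hz):
    """Return (action, frequency_hz) for a frequency with diminished hearing.

    content_type: "speech", "spatial", or anything else (treated like music).
    The ratios below are assumptions inferred from the examples in the text.
    """
    if content_type == "speech":
        # Amplify related content at half the affected frequency.
        return ("amplify", loss_hz / 2.0)
    if content_type == "spatial":
        # Shift content to a nearby frequency so its direction stays perceptible.
        return ("shift", loss_hz * 1.25)
    # Music and other content: amplify at the affected frequency itself.
    return ("amplify", loss_hz)


if __name__ == "__main__":
    print(plan_adjustment("speech", 14000.0))
    print(plan_adjustment("spatial", 4000.0))
    print(plan_adjustment("music", 4000.0))
```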
[0031] The data processing system 100 can include, interface with, or otherwise utilize at least one audiography circuit 106. The audiography circuit 106 can generate, adjust, or otherwise interface with an audiogram 122 for a user (e.g., specific to the user). For example, the audiography circuit 106 can present an indication to a user (e.g., via the user interface 108 or the speaker 110) to indicate an audible portion of a test pattern as received by the user. The audiography circuit 106 can provide one or more prompts to a user interface 108 and receive a response to said prompts, indicative of a receipt of audio content 128 from the user. The audiography circuit 106 can normalize a frequency response of a speaker 110, such as by conveying a device identifier or attribute to the DSP circuit 104 for processing of audio content 128 conveyed to the speaker 110.
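Normalizing a speaker's frequency response can be illustrated by computing, for each frequency, the gain needed to bring the response up (or down) to the level at a reference frequency. In the Python sketch below, `compensation_db` and `db_to_amplitude` are hypothetical names, and the response table is assumed to be looked up from a device identifier; the sample values mirror the example of a response at 100 Hz that is 10 dB below the response at 2 kHz.

```python
def compensation_db(response_db, reference_hz):
    """Gain (dB) per frequency that flattens the response to the reference level.

    `response_db` maps frequency (Hz) to the speaker's measured response (dB).
    """
    reference_level = response_db[reference_hz]
    return {freq_hz: reference_level - level
            for freq_hz, level in response_db.items()}


def db_to_amplitude(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # Assumed response table for a speaker identified by its device identifier.
    gains = compensation_db({100.0: -10.0, 2000.0: 0.0}, reference_hz=2000.0)
    for freq_hz, gain in sorted(gains.items()):
        print(f"{freq_hz:7.1f} Hz: {gain:+.1f} dB (x{db_to_amplitude(gain):.3f})")
```

For the sample table, the sketch yields a 10 dB boost at 100 Hz, which corresponds to multiplying the signal amplitude at that frequency by roughly 3.16.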
[0032] The data processing system 100 can include, interface with, or otherwise utilize at least one user interface 108. The user interface 108 can include audio, visual, or control interfaces. For example, the user interface 108 can include or interface with a control, such as a mechanical switch, touchscreen, capacitive switch, or the like. In some embodiments, the control can be a keyboard, dedicated control of a speaker 110 (e.g., a button or microphone intrinsic to a headphone set), or may be remote from the audio output device. For example, the control can be a control of a graphical user interface (GUI). The user interface 108 can present the GUI by a device in network communication with the controller 102 or the speaker 110. For example, the GUI can be a GUI of an application on a mobile or other device. The GUI can present various prompts or information (e.g., to a user) as depicted throughout the present disclosure. According to various embodiments, prompts can be presented via the user interface 108, or the speaker 110. Indeed, various prompts or information may be alternated between the user interface 108 or the speaker 110, unless stated otherwise. For example, an audio output to determine a hearing of a user is presented by the speaker 110.
[0033] The data processing system 100 can include, interface with, or otherwise utilize at least one speaker 110. For example, the speaker 110 can include a wired or wireless device such as a hearing aid, headphones (e.g., in-ear headphones, over-the-ear headphones), stand-alone speaker, etc. The speaker 110 can include various transducers such as for a left and right ear, or a frequency range (e.g., subwoofer, mid-range, tweeter, etc.). According to some embodiments, the speaker 110 can include a communications port communicatively coupled with the controller 102 to exchange information therewith. For example, the speaker 110 can provide identity information of the speaker 110, device characteristics 124, or audio content information to the controller 102. According to some embodiments, the speaker 110 can receive audio content 128 from the controller 102.
[0034] The speaker 110 can include or interface with a transducer, a digital to analogue converter (DAC), an amplifier or other components (e.g., Codecs, wireless transceivers, etc.). In some embodiments, the speaker identity can be defined based on a subset of the components of the speaker 110. For example, a headphone manufacturer model number can identify an audio output device, comprising a DAC, amplifier, or other components.
[0035] The data processing system 100 can include, interface with, or otherwise utilize at least one audio content recognition circuit 112. The audio content recognition circuit 112 can determine a type of audio content 128. The audio content recognition circuit 112 can receive a tag associated with content, receive a user indication of a content type, or infer a content type based on a history of user entries, a time, location, device, or the like. For example, the audio content recognition circuit 112 can determine that audio content 128 contains spatial information based on an association with a VR headset. The tag or other indication of user content can indicate a genre of music, audiovisual content, or other content types. In some embodiments, audio content 128 can include more than one type. For example, a video game can include a music track, spatial content (e.g., events, items, or the like), or speech content (e.g., character dialogue). The audio content recognition circuit 112 can determine one or more applicable content types. For example, the audio content recognition circuit 112 can select a primary content type which may vary over time, may receive separate audio streams of different audio types, or may cause the DSP circuit 104 to disaggregate content types, and thereafter process the disaggregated content types separately. In some embodiments, the audio content recognition circuit 112 can receive an indication of audio content from a user, via the user interface 108.
[0036] In some embodiments, the audio content recognition circuit 112 can include, interface with or otherwise employ machine learning models (e.g., supervised learning). For example, the audio content recognition circuit 112 can train a model, or receive a trained model. The audio content recognition circuit 112 can employ the trained model to determine an audio content type, or a probability of a match to one or more audio content types. Such models are further discussed with regard to, for example, FIGs. 8 and 9. In some embodiments, the audio content recognition circuit 112 can cause the user interface 108 to present a selection of audio content types to a user based on an output of a trained model and receive audio content type information therefrom. The model may be trained based on a user response.
[0037] FIG. 2 depicts an example of a frequency response plot 200 for a speaker 110. The frequency response plot 200 can be associated with a device identifier such that the controller 102 can receive the frequency response plot 200 from the speaker 110 or receive an identifier of the speaker 110 (e.g., via the user interface 108). In some embodiments, the frequency response can be detected by a microphone. The frequency response plot 200 includes an unadjusted frequency response 202 indicative of a frequency response intrinsic to the speaker 110. For example, the unadjusted frequency response 202 can indicate a default setting of the speaker. In some embodiments, various unadjusted frequency responses 202 can be associated with a device, such as according to a setting, position, or the like. An aggregated unadjusted frequency response 202 can combine (e.g., average) multiple unadjusted frequency response 202 curves.
[0038] The DSP circuit 104 can process audio content 128 to generate an adjusted frequency response curve 204. For example, the DSP circuit 104 can adjust a magnitude of an amplitude corresponding to frequency content of the audio content 128 such that the adjusted frequency response curve 204 is “flattened” (e.g., a deviation of a magnitude of response between various frequencies is reduced). Further, the center magnitude of the adjusted frequency response curve 204 can be normalized (e.g., can bound a same offset, such as 0 dB). Thus, various speakers 110 can be harmonized such that an indication of audibility from a user of a first speaker 110 can be indicative of an audiogram 122 of the user with various speakers (e.g., the speaker 110 can be de-conflated). According to some embodiments, the DSP circuit 104 can subdivide a portion of the frequency curve into frequency bands, and adjust the magnitude of each band to reduce a variation therebetween (e.g., can implement an audio equalizer). In some embodiments, the user interface 108 can present a graphical representation for the audio equalizer. In some embodiments, the graphical representation can include controls to adjust one or more frequency bands.
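The band-wise flattening described above can be illustrated with a minimal sketch. It assumes the unadjusted response is available as one dB value per frequency band and that the normalization target is 0 dB; the function names and sample values are hypothetical and are not taken from the disclosure.

```python
def flattening_gains_db(response_db, target_db=0.0):
    """Per-band gain (dB) that moves each measured band level to target_db."""
    return [target_db - level for level in response_db]


def apply_gains_db(response_db, gains_db):
    """Add each band's gain to the measured level (both in dB)."""
    return [level + gain for level, gain in zip(response_db, gains_db)]


# Hypothetical unadjusted response of a speaker, one value per band (dB).
unadjusted = [3.0, 1.5, 0.0, -2.0, -4.5]
gains = flattening_gains_db(unadjusted)
adjusted = apply_gains_db(unadjusted, gains)
```

In this sketch every band ends at the same level, which corresponds to the fully "flattened" case; a practical equalizer might apply only part of each gain.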
[0039] FIG. 3 depicts an example method 300 of determining an audiogram. In brief summary, the data processing system 100 presents a test output at operation 302. The user interface 108 receives an indication of user receipt of the test output at operation 304. Upon a receipt of the user indication of operation 304, at operation 306, the speaker 110 presents an nth (e.g., first or subsequent) output at a first volume, and varies the volume until receiving an indication of user acknowledgment. At operation 308, the user interface 108 receives an indication of audibility. At operation 310, the controller 102 causes a modification of the amplitude of the signal. The controller 102 determines whether the output pattern is complete at operation 312. Responsive to determining a non-completion of the output pattern at operation 312, the controller adjusts a frequency of the output at operation 314. Responsive to determining a completion of the output pattern, the controller generates an audiogram at operation 316.
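The control flow of method 300 can be sketched as a loop over frequencies with an inner loop over volume levels. This is an illustrative sketch only: the `heard` callback stands in for the user indication received via the user interface, and the level range, step size, and simulated listener are assumptions rather than values from the disclosure.

```python
def run_test_pattern(frequencies_hz, heard, start_db=0.0, stop_db=80.0, step_db=5.0):
    """Return {frequency: first acknowledged level in dB, or None if never acknowledged}."""
    audiogram = {}
    for frequency in frequencies_hz:          # operation 314: select the next frequency
        level = start_db
        # operations 306/308: present the output and check for acknowledgment
        while level <= stop_db and not heard(frequency, level):
            level += step_db                  # operation 310: modify the amplitude
        audiogram[frequency] = level if level <= stop_db else None
    return audiogram                          # operation 316: the collected thresholds


# Hypothetical simulated listener: hears low tones at 20 dB, high tones at 40 dB.
def simulated_listener(frequency, level_db):
    return level_db >= (20.0 if frequency < 2000 else 40.0)


result = run_test_pattern([500, 4000], simulated_listener)
```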
[0040] In further detail, at operation 302, the data processing system 100 presents a test output. The test output can be presented via the speaker 110, or the user interface 108. For example, the test output can include a tone, an audible command or instructions, or the like. The test output can include a visual presentation such as a blinking LED, instructions of the test (e.g., presented via the user interface), or the like. For example, a visual presentation can instruct a user to acknowledge receipt of an audio or visual presentation (e.g., to interact with a control interface, such as a spacebar on a keyboard). For example, the test output can include an audible tone at a predefined duty cycle (e.g., 50%).
[0041] At operation 304, the controller 102 receives an indication of user receipt of the test output. For example, the controller 102 can receive an indication via a user interface control (e.g., a mechanical key, microphone, touch-sensitive key, or the like). Such an indication can verify an operation of the data processing system 100 (e.g., a speaker 110 thereof), and verify a status of a user. For example, a failure of a user to acknowledge a test output may be indicative of an operational state of a speaker or a control interface, or a user having profound hearing loss. Upon receiving a response to the test output, the method 300 can proceed to operation 306. Upon a failure to receive a response to the test output, the method 300 can maintain the test output. In some embodiments, the controller 102 can include a timeout function. For example, upon a failure to receive a response to the test output in a predefined amount of time, the controller can cause a cessation of the method or otherwise may halt the test output.
[0042] At operation 306, and at a first instance, the controller 102 can cause the speaker 110 to produce a first output. For example, the first output can include an ascending or descending volume sweeping across various frequency bands. Subsequent to producing the first output, the controller 102 can cause the speaker 110 to produce a second output. For example, the second output can include an ascending or descending volume sweeping across various frequency bands. Subsequent to producing the second output, the controller 102 can cause the speaker 110 to produce an nth output. Each transition between outputs can be intermediated by a progression to operation 308. In some embodiments, the user interface 108 can provide a prompt to actuate a control interface responsive to a detection or non-detection of the output incident to the output. For example, the user interface 108 can prompt a user to indicate a first perception of an audio signal increasing in volume, or a last perception of an audio signal decreasing in volume.
[0043] At operation 308, the user interface 108 receives an indication of receipt of the output provided at operation 306. For example, the user interface 108 can receive the indication from a control interface intrinsic to the speaker 110, or a separate user interface 108. According to some embodiments, the user interface can receive an indication to repeat a test pattern (e.g., responsive to an inadvertent response, or the like).
[0044] At operation 310, the controller 102 can cause a modification of the amplitude of the signal. The controller 102 can cause the output to increase or decrease in volume. For example, the controller 102 can cause the output to increase or decrease according to a pre-defined power or sound pressure level (e.g., 5 dB steps). In some embodiments, an additional instance of an output can be presented to the user (e.g., to confirm or interrogate a level of perception). For example, an additional instance of the first output can be presented to the user. The second instance can include a same or different magnitude adjustment as the first instance (e.g., may be presented at 2.5 dB steps). According to some embodiments, the second instance may omit one or more magnitudes (e.g., may provide a narrower band of magnitudes, which may be centered about the detected threshold of the first instance).
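A coarse pass followed by a finer, narrower second instance can be sketched as follows. The 5 dB and 2.5 dB steps come from the examples above; the ascending search direction, the level range, and the `heard` callback (a stand-in for the user indication) are assumptions made for illustration.

```python
def ascending_threshold(heard, start_db, stop_db, step_db):
    """Raise the level in step_db increments; return the first level heard, or None."""
    level = start_db
    while level <= stop_db:
        if heard(level):
            return level
        level += step_db
    return None


def two_pass_threshold(heard, start_db=0.0, stop_db=80.0, coarse_step=5.0, fine_step=2.5):
    """Coarse pass, then a finer pass over a narrower band centered on the coarse result."""
    coarse = ascending_threshold(heard, start_db, stop_db, coarse_step)
    if coarse is None:
        return None
    fine = ascending_threshold(heard, coarse - coarse_step, coarse + coarse_step, fine_step)
    return fine if fine is not None else coarse


# Hypothetical listener whose true threshold is 32 dB.
estimate = two_pass_threshold(lambda level_db: level_db >= 32.0)
```

Here the coarse pass stops at 35 dB and the second instance narrows the estimate to 32.5 dB.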
[0045] At operation 312, the controller can determine if the test pattern is complete. In some embodiments, the test pattern can be predefined; the controller can compare a number of increments to a predetermined number. In some embodiments, the test pattern can be adaptive. For example, incident to a failure to detect audio of one or more frequency bands, or a threshold of detection thereof, some frequency bands, steps, or the like may be omitted. For example, the controller can determine that an output at 16.8 kHz can be omitted for a user who indicates no perception of audio at 12 kHz, 13.2 kHz, 14.4 kHz, and 15.5 kHz.
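One possible form of the adaptive omission is sketched here: a frequency is skipped when the nearest lower tested frequencies were all inaudible. The run length of four mirrors the example above, but the rule itself and the function name are hypothetical.

```python
def should_skip(frequency_hz, results, run_length=4):
    """results: (frequency_hz, heard) pairs for frequencies that were already tested.

    Skip when the run_length nearest lower frequencies were all reported inaudible.
    """
    lower = sorted(pair for pair in results if pair[0] < frequency_hz)
    tail = lower[-run_length:]
    return len(tail) == run_length and not any(heard for _, heard in tail)


# Values from the example: no perception at 12, 13.2, 14.4, and 15.5 kHz.
tested = [(12000, False), (13200, False), (14400, False), (15500, False)]
skip_16_8_khz = should_skip(16800, tested)
```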
[0046] In further detail, at operation 314, the controller can adjust a frequency band for an output. For example, the frequency band can be determined according to a predefined test pattern or dynamically selected. For example, the controller can cause the frequency band to increase or decrease logarithmically from about 100 Hz to about 16 kHz for a total of 150 frequency steps. Such an embodiment is not intended to be limiting; indeed, according to various embodiments, additional, fewer, or different (e.g., linear) frequency steps can be selected. In some embodiments, the frequency steps can ascend monotonically, descend monotonically, be randomly or pseudo-randomly distributed, or the like. Upon completion of the outputs, at operation 316, the controller 102 can generate an audiogram 122 for the user. The audiogram 122 can be saved by the data processing system, or conveyed via the user interface 108. According to some embodiments, the audiogram 122 or a transfer function derived therefrom can be conveyed to the speaker 110, or the DSP circuit 104 (e.g., to process audio content 128 based on the audiogram).
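A logarithmic progression of frequency steps can be computed with a constant ratio between neighboring steps. The sketch assumes that both endpoints (100 Hz and 16 kHz) are included in the 150 steps, which the disclosure does not specify; the function name is hypothetical.

```python
def log_frequency_steps(start_hz=100.0, stop_hz=16000.0, count=150):
    """Return count frequencies spaced by a constant ratio (count must be at least 2)."""
    ratio = (stop_hz / start_hz) ** (1.0 / (count - 1))
    return [start_hz * ratio ** index for index in range(count)]


steps = log_frequency_steps()
descending_steps = list(reversed(steps))  # a monotonically descending variant
```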
[0047] FIG. 4 depicts an example of an audiogram-threshold chart 400. According to various embodiments, the user interface 108 (of FIG. 1) can present an audiogram-threshold chart 400. However, such a graphical representation is not intended to be limiting. Indeed, in some embodiments, the controller 102 can compare an audiogram 122 to a threshold to generate an equalization function absent a graphical depiction thereof, or process other data presented herein as a chart without generating a user viewable chart. The audiogram can be generated, for example, responsive to execution of the method 300 of FIG. 3. The audiogram-threshold chart 400 includes a frequency axis 402 (or function) relative to an audibility axis 404 (or function). The audibility axis 404 can depict a sound pressure, power, or other indication of volume corresponding to an audibility of the user. A complete audiogram 122 of the user is provided, for reference. The audiogram-threshold chart 400 depicts one or more thresholds 406, which may define normal hearing or a threshold for hearing loss (e.g., mild hearing loss, moderate hearing loss, or the like).
[0048] At a first frequency 408, the audiography circuit 106 presents the audio signal to the user. The user may indicate a receipt of the audio signal (e.g., via a control interface). Responsive to the user's indication of receipt, the audiography circuit 106 can cause the audio signal to be modified to change an amplitude thereof. For example, the audiography circuit 106 can cause the audio signal to be decreased a predefined amount (e.g., 12 dB, 5 dB, or 2.5 dB). At a second frequency 410, the user may continue to audibly perceive the audio signal, and the data processing system 100 can receive an indication of said perception. Responsive to such a receipt, the data processing system 100 may further modify (e.g., reduce) the amplitude. At a third frequency 412, the user may not perceive the audio (and may indicate non-receipt / non-perception of the audio signal via the control interface). For example, the user may indicate non-perception of the audio signal by selecting a different button or interface element, by de-selecting the control interface, etc. The audiography circuit 106 may increase a volume or amplitude of a subsequent frequency presented to the user responsive to receiving the indication of non-perception of the audio signal. The audiography circuit 106 can continue to present each of a fourth 414, fifth 416, sixth 418, seventh 420, eighth 422, ninth 424, tenth 426 frequency, and so on such that an amplitude of the various frequencies can bracket, bound, or otherwise define the hearing of the user so as to generate an audiogram 122. For example, the audiography circuit 106 can generate the audiogram 122 by applying a smoothing function to the data points at each of the frequencies 408-426. In some embodiments, the audiography circuit 106 can generate further modifications to the audiogram 122 based on a device identity, such as is described with regard to FIG. 2, or a head related transfer function (HRTF) as is described with regard to FIG. 7. The controller 102 can determine one or more deviations of a determined audiogram 122 relative to the threshold. For example, the controller 102 can determine elevated hearing or hearing loss for one or more frequencies based on the relative position of the audiogram 122 and the threshold 406.
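The disclosure does not name a particular smoothing function. As one illustrative possibility, a centered moving average over the per-frequency data points can be sketched as follows; the window size and the sample levels are assumptions.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends of the sequence."""
    half = window // 2
    smoothed = []
    for index in range(len(values)):
        low = max(0, index - half)
        high = min(len(values), index + half + 1)
        smoothed.append(sum(values[low:high]) / (high - low))
    return smoothed


# Hypothetical bracketing levels (dB) recorded at ten successive frequencies.
points_db = [30.0, 25.0, 20.0, 25.0, 20.0, 25.0, 30.0, 35.0, 30.0, 35.0]
audiogram_db = moving_average(points_db)
```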
[0049] In some instances and implementations, a reduction of a spacing between frequencies (e.g., an increase in a number of sampled frequencies) may increase a performance of the audiography circuit 106 (e.g., may determine a more accurate audiogram 122). According to some embodiments, a frequency sweep may refer to a sequence of discrete frequencies, or a continual/continuous/near-continuous sweep between a first and a second frequency, such that an amplitude is determined for each frequency of a band of frequencies. In some embodiments, further data may be sampled at same or different frequencies along the frequency spectrum. For example, the controller 102 can cause the audio signal to be presented to the user at the same frequencies, and a further predefined step, smaller than the previous step, may be employed (e.g., 5 dB for the first step and 2.5 dB for the second step). According to some embodiments, the user interface 108 can receive a desired time to generate a profile, or a number of frequency steps to test, such that a user may select increased data to generate a more accurate audiogram, or decreased data to generate an audiogram faster. In some embodiments, the audiography circuit 106 can generate a separate audiogram-threshold chart 400 for each ear.
[0050] FIG. 5 depicts an example of an equalizer profile 500. For example, the controller 102 can generate the equalizer profile based on the audiogram-threshold chart 400 of FIG. 4. As depicted, various channels of the equalizer profile can provide an adjustment to various frequency bands. For example, a fifth frequency band adjustment 510 can increase a magnitude of audio output to mitigate a decrease of audible perception of a user (e.g., a deviation). One or more frequency bands can be associated with no adjustment, or another adjustment (e.g., a larger or smaller offset from a 0 dB center magnitude). For example, each of a first frequency band adjustment 502, third frequency band adjustment 506, fourth frequency band adjustment 508, sixth frequency band adjustment 512, seventh frequency band adjustment 514, and eighth frequency band adjustment 516 can have a lesser magnitude than the fifth frequency band adjustment 510. A second frequency band adjustment 504 can be omitted (e.g., can be indicated as 0 dB). The magnitude of the frequency band adjustments can be based on an average, maximum, minimum, or other deviation between an audiogram 122 and a threshold 406.
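Deriving per-band adjustments from the deviation between an audiogram and a threshold can be sketched as follows. The sketch assumes the audiogram stores the quietest audible level per frequency in dB, so that values above the threshold indicate reduced hearing; the band edges, sample values, and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def band_adjustments_db(frequencies_hz, audiogram_db, threshold_db, band_edges_hz, reduce=max):
    """One adjustment per band: reduce() over the positive deviations inside the band."""
    adjustments = []
    for low, high in zip(band_edges_hz, band_edges_hz[1:]):
        deviations = [
            max(0.0, level - threshold_db)
            for frequency, level in zip(frequencies_hz, audiogram_db)
            if low <= frequency < high
        ]
        adjustments.append(reduce(deviations) if deviations else 0.0)
    return adjustments


frequencies = [250.0, 500.0, 1000.0, 2000.0, 4000.0]
audiogram = [20.0, 20.0, 25.0, 40.0, 55.0]
edges = [200.0, 800.0, 3000.0, 8000.0]
by_maximum = band_adjustments_db(frequencies, audiogram, 25.0, edges)
by_average = band_adjustments_db(frequencies, audiogram, 25.0, edges, reduce=mean)
```

Passing `max`, `min`, or `mean` as `reduce` corresponds to the maximum, minimum, or average deviation options mentioned above.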
[0051] In some embodiments, an audiogram 122 can be generated for two ears, and an adjustment can be made between the two ears. For example, a deviation can be associated with a left ear, and the DSP circuit 104 can insert left channel audio content within a frequency of or proximate to the deviation in a right channel of the speaker. In some embodiments, a DSP circuit 104 can amplify a frequency band based on a magnitude of a deviation of the frequency band from a threshold. In some embodiments, an amplification can be based on a degree of hearing loss. For example, for a first threshold (e.g., mild hearing loss such as a deviation of less than 20 dB), the controller 102 can cause the DSP circuit 104 to amplify a frequency band. For a second threshold (e.g., moderate hearing loss such as a deviation of between 20 dB and 40 dB), the controller 102 can cause the DSP circuit 104 to insert audio content between channels (e.g., channels corresponding to a left ear and a right ear).
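The threshold-dependent choice between amplification and cross-channel insertion can be sketched as a simple selector. The 20 dB and 40 dB boundaries come from the example above; the string labels, and the treatment of deviations outside those ranges, are assumptions because the disclosure does not state them.

```python
def choose_strategy(deviation_db):
    """Map a per-band deviation (dB) to a processing strategy label."""
    if deviation_db <= 0.0:
        return "none"            # no deviation from the threshold
    if deviation_db < 20.0:
        return "amplify"         # first threshold: mild hearing loss
    if deviation_db <= 40.0:
        return "cross_channel"   # second threshold: insert content in the other channel
    return "unspecified"         # above 40 dB is not addressed in this example


strategies = [choose_strategy(value) for value in (0.0, 12.0, 20.0, 35.0, 50.0)]
```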
[0052] In some embodiments, the DSP circuit 104 can amplify a different frequency than a detected deviation. In some embodiments, the DSP circuit 104 can amplify a harmonic of a detected deviation. For example, the DSP circuit 104 can amplify a harmonic of a detected deviation which may increase an intelligibility of human speech. For example, the DSP circuit 104 can amplify a frequency band at about 8 kHz in response to a detected deviation at 4 kHz. The DSP circuit 104 can encode 4 kHz information at 8 kHz to increase an intelligibility thereof. In some embodiments, the DSP circuit 104 can process audio content 128 to increase intelligibility of directional information. For example, the controller 102 can receive a content type from the audio content recognition circuit 112, such as a virtual reality or video game content type. Responsive to the content type, the controller 102 can determine that directional information is relevant to the audio content. Responsive to the content type, the controller 102 can cause a DSP circuit 104 to process the audio. For example, the DSP circuit 104 can frequency shift the audio content 128 to avoid a frequency associated with a deviation. For example, responsive to an audiogram indicating hearing loss at 4 kHz, the DSP circuit 104 can frequency shift 4 kHz information to 5 kHz, which may retain directional information. According to some embodiments, amplitude modifications may not correspond linearly to audio content or user hearing. For example, the DSP circuit 104 may increase a component of audio content less than, equal to, or greater than a corresponding deviation of an audiogram 122 for the user.
[0053] FIG. 6 depicts an example of a correction adjustment interface 600 of a user interface 108. The correction adjustment interface 600 can adjust a magnitude of an adjustment employed by the DSP circuit 104. For example, a correction adjustment interface 600 can depict a modification amplitude, a preservation of spatial information, or the like. The correction adjustment interface 600 can include a selection bar 602, ring, toggle, or the like. A selection marker 604 may indicate a selected position on the selection bar 602. The selection marker can separate a first portion of the selection bar 606 indicating a lesser correction, and a second portion of the selection bar 608 indicating a greater correction. The user interface 108 can receive an adjustment of the selection marker (e.g., by sliding the selection marker 604 along the selection bar 602) to define a magnitude of modification of audio content. For example, the user may select various points along the selection bar 602. Responsive to the selection, the DSP circuit 104 can generate an audio signal corresponding to the selected degree of correction. For example, the user can select a subjective preference of modification for music, a maximum intelligibility for speech, or the like. In some embodiments, the correction adjustment interface 600 can include various selection bars 602 corresponding to various content types or speakers.
[0054] FIG. 7 depicts a top view of an environment 700 including a user 702 in relation to a speaker 110 and an environment boundary 704 (e.g., a wall). The data processing system 100 can define a direct audio path 706 between the user 702 and the speaker 110. For example, a direct audio path can be defined corresponding to a relationship between each transducer of a speaker and each ear of a user 702. The data processing system 100 can define one or more indirect audio paths 708. For example, an indirect audio path 708 can include various reflections between an environment 700 or a user 702. For example, a head-related transfer function (HRTF) can define a relationship between an audio source and an eardrum of a user 702, including the user’s ears, shoulders, etc. The HRTF may be a generic HRTF or a custom HRTF based on user attributes (e.g., height, sex, ear geometry, etc.). One or more locations of a speaker 110 can be defined virtually, such that a speaker location for spatial audio differs from a physical speaker location. For example, a physical speaker location can be over the ears (e.g., headphones), whereas a virtual speaker location 710 may be otherwise disposed. For example, one or more virtual speaker locations 710 may correspond to a band on a stage, a navigational icon in an AR navigation application, a character in a VR game, etc. Such HRTFs can be employed with any of the techniques described herein. For example, the DSP circuit 104 can convolve the HRTF for each ear against audiogram-corrected audio content 128 for each ear to generate various virtual speaker locations of processed audio.
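The per-ear convolution can be sketched in the time domain. The sketch assumes each HRTF is available as a short head-related impulse response (a list of samples), which is an assumption of this example; the tiny signals and impulse responses are hypothetical, and a practical implementation would likely use FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            output[i + j] += sample * tap
    return output


def render_binaural(left_audio, right_audio, hrir_left, hrir_right):
    """Convolve each ear's audiogram-corrected audio with that ear's impulse response."""
    return convolve(left_audio, hrir_left), convolve(right_audio, hrir_right)


# Hypothetical corrected audio and impulse responses, a few samples each.
left_out, right_out = render_binaural([1.0, 0.0], [0.0, 1.0], [0.5], [0.25, 0.25])
```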
[0055] Referring to FIG. 8, a block diagram of an example system using supervised learning is shown. Supervised learning is a method of training a machine learning model given input-output pairs. An input-output pair is an input with an associated known output (e.g., an expected output).
[0056] Machine learning models 804 may be trained on known input-output pairs such that the machine learning model 804 can learn how to predict known outputs given known inputs. Once the machine learning model 804 has learned how to predict known input-output pairs, the machine learning model 804 can operate on unknown inputs to predict an output.
[0057] The machine learning model 804 may be trained based on general data and/or granular data (e.g., data based on a specific user 832) such that the machine learning model 804 may be trained specific to a particular user 832, a particular class of user, or a particular ear, a particular content type, etc.
[0058] Training inputs 802 and actual outputs 810 may be provided to the machine learning model 804. Training inputs 802 may include frequency composition of audio content, any tags or metadata associated with the audio content, a location of a user, a time of access, or device employed (e.g., a home theatre system, VR goggles, ear buds, etc.), historical use, similar users (e.g., according to an age, interest, genre of preferred music, etc.) and the like.
[0059] The inputs 802 and actual outputs 810 may be received from any of the data repositories. For example, a data repository may contain a list of audio content selected by the user, a history of use of the user, a device identifier, a user preference, feedback received via the user interface 108, or the like. The data repository may also contain data associated with related users. Thus, the machine learning model 804 may be trained to predict audio content type based on the training inputs 802 and actual outputs 810 used to train the machine learning model 804.
[0060] The data processing system 100 may include one or more machine learning models 804. In an embodiment, a first machine learning model 804 may be trained to predict data such as an audio content type. For example, the first machine learning model 804 may use the training inputs 802 of a device identifier, metadata associated with audio content, or frequency content of audio content to predict outputs 806 such as by applying the current state of the first machine learning model 804 to the training inputs 802. The comparator 808 may compare the predicted outputs 806 to actual outputs 810 which may include adjustments to the predicted outputs 806 via the user interface 108 to determine an amount of error or differences. For example, the predicted audio content type may be compared to the actual user selected or adjusted audio content type (e.g., actual output 810).
[0061] In other embodiments, a second machine learning model 804 may be trained to make one or more recommendations to the user 832 based on the predicted output from the first machine learning model 804. For example, the second machine learning model 804 may use the training inputs 802 of frequency composition of audio content, any tags or metadata associated with the audio content, a location of a user, a time of access, or device employed (e.g., a home theatre system, VR goggles, ear buds, etc.), historical use, similar users (e.g., according to an age, interest, genre of preferred music, etc.) and the like to predict outputs 806 of an audio content type by applying the current state of the second machine learning model 804 to the training inputs 802. The comparator 808 may compare the predicted outputs 806 to actual outputs 810 (e.g., amplification or spatial information preservation) to determine an amount of error or differences.
[0062] The actual outputs 810 may be determined based on historic data of recommendations made to the user 832. In an illustrative non-limiting example, a user may select an audio signal comprising pop music. Based on the user's listening history, the frequency content of the music, a device associated with the music (e.g., bookshelf speakers), or the like, the machine learning model 804 may determine a genre of the music. Based on the genre, the machine learning model can implement, propose, or convey (e.g., via the user interface) one or more suggestions to the user, such as to confirm the genre type, or select a modification of the audio content based on the audiogram 122 of the user.
[0063] In some embodiments, a single machine learning model 804 may be trained to make one or more recommendations to the user 832 based on current user 832 data received from enterprise resources 828. That is, a single machine learning model may be trained using the training inputs of a plurality of various users to predict outputs 806 such as genres of music, spatial information, human speech, or the like by applying the current state of the machine learning model 804 to the training inputs 802. The comparator 808 may compare the predicted outputs 806 to actual outputs 810 to determine an amount of error or differences. The actual outputs 810 may be determined based on historic data associated with the recommendation to the user 832.
[0064] During training, the error (represented by error signal 812) determined by the comparator 808 may be used to adjust the weights in the machine learning model 804 such that the machine learning model 804 changes (or learns) over time. The machine learning model 804 may be trained using a backpropagation algorithm, for instance. The backpropagation algorithm operates by propagating the error signal 812. The error signal 812 may be calculated each iteration (e.g., each pair of training inputs 802 and associated actual outputs 810), batch and/or epoch, and propagated through the algorithmic weights in the machine learning model 804 such that the algorithmic weights adapt based on the amount of error. The error is minimized using a loss function. Non-limiting examples of loss functions may include the square error function, the root mean square error function, and/or the cross entropy error function.
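The error-driven weight update can be illustrated with a deliberately small stand-in: a one-weight linear model trained per input-output pair on a square error loss. This is not the model of the disclosure; the data, learning rate, and epoch count are assumptions, and a multi-layer network would propagate the same kind of error signal backward through each layer.

```python
def train(pairs, learning_rate=0.1, epochs=200):
    """Fit predicted = weight * x + bias by gradient descent on 0.5 * error**2."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, actual in pairs:
            predicted = weight * x + bias
            error = predicted - actual           # comparator: predicted vs. actual output
            weight -= learning_rate * error * x  # gradient of the loss w.r.t. weight
            bias -= learning_rate * error        # gradient of the loss w.r.t. bias
    return weight, bias


# Hypothetical input-output pairs drawn from actual = 2 * x + 1.
weight, bias = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```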
[0065] The weighting coefficients of the machine learning model 804 may be tuned to reduce the amount of error, thereby minimizing the differences between (or otherwise converging) the predicted output 806 and the actual output 810. The machine learning model 804 may be trained until the error determined at the comparator 808 is within a certain threshold (or a threshold number of batches, epochs, or iterations have been reached). The trained machine learning model 804 and associated weighting coefficients may subsequently be stored in memory 816 or other data repository (e.g., a database) such that the machine learning model 804 may be employed on unknown data (e.g., not training inputs 802). Once trained and validated, the machine learning model 804 may be employed during a testing (or inference) phase. During testing, the machine learning model 804 may ingest unknown data to predict future data (e.g., a modification to increase the intelligibility of speech, a modification for a new genre of music, or spatial content associated with a new device, and the like).
[0066] Referring to FIG. 9, a block diagram of a simplified neural network model 900 is shown. The neural network model 900 may include a stack of distinct layers (vertically oriented) that transform a variable number of inputs 902 being ingested by an input layer 904, into an output 906 at the output layer 908.
[0067] The neural network model 900 may include a number of hidden layers 910 between the input layer 904 and output layer 908. Each hidden layer has a respective number of nodes (912, 914 and 916). In the neural network model 900, the first hidden layer 910-1 has nodes 912, and the second hidden layer 910-2 has nodes 914. The nodes 912 and 914 perform a particular computation and are interconnected to the nodes of adjacent layers (e.g., nodes 912 in the first hidden layer 910-1 are connected to nodes 914 in a second hidden layer 910-2, and nodes 914 in the second hidden layer 910-2 are connected to nodes 916 in the output layer 908). Each of the nodes (912, 914 and 916) sums the values from adjacent nodes and applies an activation function, allowing the neural network model 900 to detect nonlinear patterns in the inputs 902. The nodes (912, 914 and 916) are interconnected by weights 920-1, 920-2, 920-3, 920-4, 920-5, 920-6 (collectively referred to as weights 920). Weights 920 are tuned during training to adjust the strength of the node. The adjustment of the strength of the node facilitates the neural network’s ability to predict an accurate output 906.
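The per-node computation (a weighted sum followed by an activation function) can be sketched for a single fully connected layer. The ReLU activation, the layer sizes, and the weight values are assumptions chosen for illustration; the disclosure does not specify them.

```python
def relu(value):
    return max(0.0, value)


def identity(value):
    return value


def dense(inputs, weights, biases, activation):
    """Each node sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


# Hypothetical two-node hidden layer followed by a one-node output layer.
hidden = dense([1.0, 2.0], [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu)
output = dense(hidden, [[2.0, 1.0]], [0.5], identity)
```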
[0068] In some embodiments, the output 906 may be one or more numbers. For example, output 906 may be a vector of real numbers subsequently classified by any classifier. In one example, the real numbers may be input into a softmax classifier. A softmax classifier uses a softmax function, or a normalized exponential function, to transform an input of real numbers into a normalized probability distribution over predicted output classes. For example, the softmax
classifier may indicate the probability of the output being in class A, B, C, etc. As such, the softmax classifier may be employed because of the classifier’s ability to classify various classes. Other classifiers may be used to make other classifications. For example, a sigmoid function makes binary determinations about the classification of one class (i.e., the output may be classified using label A or the output may not be classified using label A).
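By way of non-limiting illustration, the softmax and sigmoid functions referred to above may be sketched as follows. The input scores and the 0.5 decision threshold are assumptions chosen for the example.

```python
import math


def softmax(scores):
    """Normalized exponential: maps real-valued scores to probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow and does not change the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Maps one real-valued score to a value between 0 and 1 for a binary decision."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical scores for classes A, B, and C.
probs = softmax([2.0, 1.0, 0.1])

# Binary determination for a single label, using an assumed 0.5 threshold.
is_class_a = sigmoid(0.0) >= 0.5
```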
[0069] FIG. 10 depicts an example method 1000 of generating an audiogram 122. In brief summary, the data processing system 100 generates an audio signal sweeping across a plurality of frequencies at operation 1002. The data processing system 100 receives a first indication of a user response to a first portion of the audio signal associated with a first frequency at 1004. At operation 1006, the data processing system 100 modifies an amplitude of a second portion of the audio signal associated with a second frequency. At operation 1008, the data processing system 100 receives a second indication of a user response to a second portion of the audio signal associated with the second frequency. The data processing system 100 generates an audiogram 122 at operation 1010, according to the first indication of the user response and the second indication of the user response.
[0070] In further detail, at operation 1002, and in some embodiments, the data processing system 100 generates an audio signal sweeping across a plurality of frequencies. The plurality of frequencies can be or include frequencies within human hearing range (e.g., 20 Hz to 20 kHz, 100 Hz to 16 kHz, or the like). The audio signal can include one or more files, streams, sources, or the like. The data processing system 100 can generate the audio signal for output by an audio device (e.g., a speaker 110) to a user. For example, the data processing system 100 can generate the audio signal according to a format, amplitude, and the like such that the audio signal can cause a signal which is audible to a user, or an inaudibility of which may be used to determine a level of hearing loss of the user. The signal may sweep according to a non-linear sequence, such as a logarithmic sequence. For example, the frequencies may be defined according to a nonlinear series including a first frequency through a fifth frequency (e.g., a first frequency, second frequency, third frequency, fourth frequency, and fifth frequency).
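By way of non-limiting illustration, a logarithmic series of five sweep frequencies may be computed as sketched below. The function name `log_sweep_frequencies` and the choice of a constant ratio between adjacent frequencies are assumptions for the example; the 100 Hz and 16 kHz endpoints are one of the example ranges given above.

```python
def log_sweep_frequencies(f_start_hz, f_stop_hz, count):
    """Return `count` frequencies from f_start_hz to f_stop_hz with a constant ratio."""
    if count < 2:
        raise ValueError("count must be at least 2")
    if f_start_hz <= 0 or f_stop_hz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = (f_stop_hz / f_start_hz) ** (1.0 / (count - 1))
    return [f_start_hz * ratio ** i for i in range(count)]


# First through fifth frequencies over an example range of 100 Hz to 16 kHz.
freqs = log_sweep_frequencies(100.0, 16000.0, 5)
```

Because each frequency is a fixed multiple of the previous one, the series is evenly spaced on a logarithmic frequency axis rather than a linear one.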
[0071] In some embodiments, the data processing system 100 may receive an identifier of the audio device or speaker 110. For example, the data processing system 100 may receive the
identifier (e.g., device characteristics 124 described above) from the speaker 110 responsive to the speaker being communicably coupled to the data processing system 100. The data processing system 100 may generate the audio signal according to the identifier of the audio device or speaker 110. For example, the data processing system 100 may generate the audio signal by normalizing a swept audio signal for the particular speaker 110, using the device characteristics 124 for the speaker 110 including the identifier of the speaker 110.
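By way of non-limiting illustration, normalizing the swept audio signal for a particular speaker may be sketched as follows. The lookup table `DEVICE_RESPONSE_DB`, the identifier `"example-headphone"`, the response values, and the function `normalized_levels` are hypothetical; the disclosure does not specify how the device characteristics 124 are stored or applied.

```python
# Hypothetical table: device identifier -> speaker response in dB
# (relative to a flat response) at each test frequency in Hz.
DEVICE_RESPONSE_DB = {
    "example-headphone": {250: 2.0, 1000: 0.0, 4000: -3.0},
}


def normalized_levels(device_id, target_level_db, frequencies_hz):
    """Return the drive level per frequency that offsets the speaker's response.

    Frequencies or devices missing from the table are treated as flat (0 dB).
    """
    response = DEVICE_RESPONSE_DB.get(device_id, {})
    return {f: target_level_db - response.get(f, 0.0) for f in frequencies_hz}


levels = normalized_levels("example-headphone", 40.0, [250, 1000, 4000])
```

In this sketch, a frequency at which the speaker is louder than flat is driven at a lower level, and a frequency at which it is quieter is driven at a higher level, so that the presented level is approximately the same across the sweep.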
[0072] At operation 1004, and in some embodiments, the data processing system 100 receives a first indication of a user response to a first portion of the audio signal. The first indication of the user response may correspond to or be similar to a first instance of an indication of user receipt, as described above with reference to the method 300 of FIG. 3, at operation 308. For example, the data processing system 100 can receive the first indication of the user response via a control interface of the audio output device, or another device (e.g., a key of a keyboard or another control interface disclosed according to the systems and methods disclosed herein). The first indication may be or include an indication that designates an amplitude (e.g., and a frequency) for which the audio signal is inaudible to the user. For example, the data processing system 100 can receive the first indication of the user response when the user provides the first indication responsive to the audio signal no longer being audible at the first portion. As one example, the user may provide the first indication by releasing an interface element responsive to inaudibility of the audio signal. For example, the user may depress, select, or otherwise hold down the interface element while the audio signal is audible to the user, and release the interface element when the audio signal is no longer audible (e.g., is inaudible). The data processing system 100 may receive the indication responsive to the user releasing the interface element.
[0073] The first portion of the audio signal (e.g., at which the first indication is received) may correspond to or be associated with a first frequency (e.g., of the plurality of frequencies across which the audio signal is swept) at which the audio signal is inaudible at the particular amplitude. The data processing system 100 can correlate the first portion of the audio signal to the frequency (e.g., the first frequency) and amplitude corresponding thereto (e.g., an amplitude of the audio signal at the first frequency). For example, the data processing system 100 can correlate the first portion of the audio signal to the frequency band and amplitude, as is
further discussed with reference to the output of the method 300 of FIG. 3 (e.g., at operation 306 or 310).
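By way of non-limiting illustration, correlating the time at which the user releases the interface element with a frequency and amplitude may be sketched as follows. The sketch assumes a logarithmic sweep of known duration; the function names `frequency_at` and `record_release`, the sweep endpoints, and the 10 second duration are hypothetical.

```python
def frequency_at(t_s, f_start_hz, f_stop_hz, duration_s):
    """Instantaneous frequency of a logarithmic sweep at time t_s (seconds)."""
    if not 0.0 <= t_s <= duration_s:
        raise ValueError("time is outside the sweep")
    return f_start_hz * (f_stop_hz / f_start_hz) ** (t_s / duration_s)


def record_release(t_release_s, level_db, f_start_hz, f_stop_hz, duration_s):
    """Pair the frequency playing at the release time with the current amplitude."""
    return {
        "frequency_hz": frequency_at(t_release_s, f_start_hz, f_stop_hz, duration_s),
        "level_db": level_db,
    }


# Hypothetical sweep from 100 Hz to 10 kHz over 10 s, released at 5 s while at 30 dB.
indication = record_release(5.0, 30.0, 100.0, 10000.0, 10.0)
```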
[0074] At operation 1006, and in some embodiments, the data processing system 100 modifies the amplitude of a second portion of the audio signal associated with a second frequency. The data processing system 100 can modify the amplitude of the second portion of the audio signal upon, responsive to, or according to receiving the first indication (e.g., at operation 1004). The second frequency can be a higher or lower frequency of the plurality of frequencies (e.g., as compared to the first frequency) according to a direction of the frequency sweep. The modification of the amplitude of the frequency can correspond to the modification of operation 310 of the method 300 of FIG. 3. For example, the modification can be according to a predetermined step size (e.g., 5 dB, 2.5 dB, or the like). In other words, the data processing system 100 may modify the audio signal by adjusting the amplitude by various amounts. According to some embodiments, the data processing system 100 may adjust the step size or amplitude of the audio signal by a reduced amount for subsequent frequency sweeps (relative to a larger step size or amplitude for a previous frequency sweep), which may increase a resolution of an audiogram 122, relative to larger step sizes. For example, the data processing system 100 may adjust the amplitude of the audio signal by a second amount (e.g., according to a second step size) which is less than the first amount (according to a first step size), subsequent to adjusting the amplitude of the audio signal by the first amount (e.g., at the first time instance). In other words, at a subsequent iteration of the method, the data processing system may adjust the amplitude of the audio signal by a lesser degree than previous iterations, to provide a more granular or higher resolution audiogram 122.
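By way of non-limiting illustration, adjusting the amplitude by a first step size and then by a smaller second step size may be sketched as follows. The simulated listener, the 60 dB starting level, the floor level, and the function names are assumptions for the example; only the 5 dB and 2.5 dB step sizes are taken from the examples above.

```python
def find_inaudible_level(is_audible, start_db, step_db, floor_db=-10.0):
    """Lower the level by step_db until the listener no longer reports hearing it.

    Returns the first level reported inaudible, or the level reached
    at or below floor_db if the listener hears every level tried.
    """
    level = start_db
    while level > floor_db and is_audible(level):
        level -= step_db
    return level


def refine_threshold(is_audible, start_db, step_sizes_db=(5.0, 2.5)):
    """Repeat the search with successively smaller steps for a finer estimate."""
    if not step_sizes_db:
        raise ValueError("at least one step size is required")
    level = start_db
    inaudible = start_db
    for step in step_sizes_db:
        inaudible = find_inaudible_level(is_audible, level, step)
        # Begin the next, finer pass one coarse step above the inaudible level.
        level = inaudible + step
    return inaudible


def simulated_listener(level_db):
    """Hypothetical listener who hears levels of 33 dB and above."""
    return level_db >= 33.0


coarse = find_inaudible_level(simulated_listener, 60.0, 5.0)
refined = refine_threshold(simulated_listener, 60.0)
```

In this sketch the 5 dB pass places the inaudible level at 30 dB, and the 2.5 dB pass moves the estimate to 32.5 dB, which is closer to the simulated threshold.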
[0075] At operation 1008, and in some embodiments, the data processing system 100 receives a second indication of a user response to the second portion of the audio signal. Operation 1008 may be similar to operation 1004 described above. The data processing system 100 can correlate the second portion with the second frequency, as discussed with regard to operation 1004 (e.g., by associating an amplitude with a frequency of the audio signal). The data processing system 100 can correlate the second portion of the audio signal to a frequency and amplitude corresponding thereto. For example, the data processing system 100 can correlate the second
portion of the audio signal to the frequency band and amplitude as is further discussed with reference to the output of the method 300 of FIG. 3 (e.g., at operation 310).
[0076] At operation 1010, the data processing system 100 generates an audiogram 122 according to the first indication of operation 1004, and the second indication of operation 1008. For example, the first and second indication can indicate a threshold of hearing at respective frequency and amplitude levels. For example, the first indication can indicate normal hearing at 100 Hz, and the second indication can indicate hearing loss at 10 kHz. Indeed, the data processing system 100 can process various further indications (e.g., third, fourth, and so on) to generate an audiogram 122 at various further frequencies. For example, the data processing system 100 can model a discrete or continuous function corresponding to an audiogram 122 of the user, as is further described according to the disclosure of operation 316 of the method 300 of FIG. 3, and throughout this disclosure (e.g., with regard to the audiography circuit 106 of FIG. 1).
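By way of non-limiting illustration, modeling a continuous function from discrete audiogram points may be sketched as follows. Linear interpolation on a logarithmic frequency axis is an assumption chosen for the example and is not stated to be the method of operation 316; the threshold values are hypothetical, and the measured frequencies are assumed to be distinct.

```python
import math


def interpolate_audiogram(points, frequency_hz):
    """Estimate the hearing threshold (dB) at frequency_hz.

    points: iterable of (frequency_hz, threshold_db) pairs with distinct frequencies.
    Values outside the measured range are held at the nearest measured threshold.
    """
    pts = sorted(points)
    if frequency_hz <= pts[0][0]:
        return pts[0][1]
    if frequency_hz >= pts[-1][0]:
        return pts[-1][1]
    for (f_lo, t_lo), (f_hi, t_hi) in zip(pts, pts[1:]):
        if f_lo <= frequency_hz <= f_hi:
            span = math.log(f_hi) - math.log(f_lo)
            frac = (math.log(frequency_hz) - math.log(f_lo)) / span
            return t_lo + frac * (t_hi - t_lo)
    raise ValueError("frequency not covered by the audiogram points")


# Hypothetical thresholds: near-normal hearing at 100 Hz, elevated threshold at 10 kHz.
audiogram_points = [(100.0, 10.0), (1000.0, 20.0), (10000.0, 60.0)]
midpoint_db = interpolate_audiogram(audiogram_points, math.sqrt(100.0 * 1000.0))
```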
[0077] In some embodiments, following execution of an audiogram 122, the data processing system 100 may generate a profile according to the audiogram 122. The profile may be similar to the equalizer profile 500 described above with reference to FIG. 5. The data processing system 100 may present the profile to the user, and the user may provide various modification(s) to the profile. For example, the data processing system 100 may receive various modifications to the profile at one or more frequencies. The data processing system 100 may update the profile according to the modification.
[0078] In some embodiments, the data processing system 100 may apply the profile, and/or the audiogram corresponding to the profile, to modify audio content from various audio sources. For example, the data processing system 100 may receive audio content from an audio source. The audio content may be or include music, songs, speech, etc. The data processing system 100 may modify the audio content to generate an audio output, according to the audiogram. For example, the data processing system 100 may modify (such as amplify) the audio content at various frequencies according to the audiogram (e.g., by applying the audiogram to the audio content to generate the audio output). In some embodiments, the data processing system 100 may modify the audio content according to a classification for the audio content. For
example, the data processing system 100 may include a classifier to classify audio content locally, or the data processing system 100 may identify or determine the classification provided by the audio source (e.g., the audio source may embed the classification as metadata, such as genre of music). The data processing system 100 may use the classification and the audiogram to modify the audio content. For example, depending on the classification and the audiogram, the data processing system 100 may apply different amplifications at different frequencies (e.g., as described above).
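By way of non-limiting illustration, selecting different amplifications at different frequencies according to the audiogram and the classification may be sketched as follows. The scaling table `CLASS_SCALE`, the 20 dB reference for normal hearing, the factor of one half, the 30 dB gain limit, and the function names are all hypothetical values chosen for the example and are not specified by the disclosure.

```python
# Hypothetical per-classification scaling of the applied gain.
CLASS_SCALE = {"speech": 1.0, "music": 0.5}


def band_gains_db(audiogram_db, classification, normal_db=20.0, max_gain_db=30.0):
    """Return a gain in dB per frequency from thresholds and a content class.

    audiogram_db: mapping of frequency (Hz) to hearing threshold (dB).
    Thresholds at or below normal_db receive no gain in this sketch.
    """
    scale = CLASS_SCALE.get(classification, 0.75)  # assumed default for other classes
    gains = {}
    for frequency_hz, threshold_db in audiogram_db.items():
        loss_db = max(0.0, threshold_db - normal_db)
        gains[frequency_hz] = min(max_gain_db, 0.5 * loss_db * scale)
    return gains


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


user_audiogram = {250: 15.0, 1000: 30.0, 4000: 60.0}
speech_gains = band_gains_db(user_audiogram, "speech")
music_gains = band_gains_db(user_audiogram, "music")
```

In this sketch the same audiogram yields a larger boost at 4 kHz for speech than for music, which is one way the classification could change the amplification applied at a given frequency.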
[0079] FIG. 11 depicts an example block diagram of an example computer system 1100. The computer system or computing device 1100 can include or be used to implement a data processing system or its components. The computing system 1100 includes at least one bus 1105 or other communication component for communicating information and at least one processor 1110 or processing circuit coupled to the bus 1105 for processing information. The computing system 1100 can also include one or more processors 1110 or processing circuits coupled to the bus for processing information. The computing system 1100 also includes at least one main memory 1115, such as a random access memory (RAM) or other dynamic storage device, coupled to the bus 1105 for storing information, and instructions to be executed by the processor 1110. The main memory 1115 can be used for storing information during execution of instructions by the processor 1110. The computing system 1100 may further include at least one read only memory (ROM) 1120 or other static storage device coupled to the bus 1105 for storing static information and instructions for the processor 1110. A storage device 1125, such as a solid state device, magnetic disk or optical disk, can be coupled to the bus 1105 to persistently store information and instructions.
[0080] The computing system 1100 may be coupled via the bus 1105 to a display 1135, such as a liquid crystal display, or active matrix display, for displaying information to a user. An input device 1130, such as a keyboard or voice interface, may be coupled to the bus 1105 for communicating information and commands to the processor 1110. The input device 1130 can include a touch screen display 1135. The input device 1130 can also include a cursor control, such as a mouse, a trackball, cursor direction keys, a button on a headset, or the like for communicating direction information and command selections to the processor 1110 and for controlling cursor movement on the display 1135.
[0081] The processes, systems and methods described herein can be implemented by the computing system 1100 in response to the processor 1110 executing an arrangement of instructions contained in main memory 1115. Such instructions can be read into main memory 1115 from another computer-readable medium, such as the storage device 1125. Execution of the arrangement of instructions contained in main memory 1115 causes the computing system 1100 to perform the illustrative processes described herein. One or more processors in a multiprocessing arrangement may also be employed to execute the instructions contained in main memory 1115. Hard-wired circuitry can be used in place of or in combination with software instructions together with the systems and methods described herein. Systems and methods described herein are not limited to any specific combination of hardware circuitry and software.
[0082] Although an example computing system has been described in FIG. 11, the subject matter including the operations described in this specification can be implemented in other types of digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
[0083] Having now described some illustrative implementations, it is apparent that the foregoing is illustrative and not limiting, having been presented by way of example. In particular, although many of the examples presented herein involve specific combinations of method acts or system elements, those acts and those elements can be combined in other ways to accomplish the same objectives. Acts, elements and features discussed in connection with one implementation are not intended to be excluded from a similar role in other implementations or embodiments.
[0084] All or part of the processes described herein and their various modifications (hereinafter referred to as “the processes”) can be implemented, at least in part, via a computer program product, i.e., a computer program tangibly embodied in one or more tangible, physical hardware storage devices that are computer and/or machine-readable storage devices for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple computers. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other
unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a network.
[0085] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only storage area or a random access storage area or both. Elements of a computer (including a server) include one or more processors for executing instructions and one or more storage area devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from, or transfer data to, or both, one or more machine-readable storage media, such as mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
[0086] Computer program products are stored in a tangible form on non-transitory computer readable media and non-transitory physical hardware storage devices that are suitable for embodying computer program instructions and data. These include all forms of non-volatile storage, including by way of example, semiconductor storage area devices, e.g., EPROM, EEPROM, and flash storage area devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks and volatile computer memory, e.g., RAM such as static and dynamic RAM, as well as erasable memory, e.g., flash memory and other non-transitory devices.
[0087] The construction and arrangement of the systems and methods as shown in the various embodiments are illustrative only. Although only a few embodiments have been described in detail in this disclosure, many modifications are possible (e.g., variations in sizes, dimensions, structures, shapes and proportions of the various elements, values of parameters, mounting arrangements, use of materials, colors, orientations, etc.). For example, the position of elements may be reversed or otherwise varied and the nature or number of discrete elements or positions may be altered or varied. Accordingly, all such modifications are intended to be included within the scope of the present disclosure. The order or sequence of any process or method steps may be varied or re-sequenced. Other substitutions, modifications, changes, and omissions may be made
in the design, operating conditions and arrangement of embodiments without departing from the scope of the present disclosure.
[0088] As utilized herein, the terms “approximately,” “about,” “substantially”, and similar terms are intended to include any given ranges or numbers +/- 10%. These terms should be interpreted as indicating that insubstantial or inconsequential modifications or alterations of the subject matter described and claimed are considered to be within the scope of the disclosure as recited in the appended claims.
[0089] It should be noted that the term “exemplary” and variations thereof, as used herein to describe various embodiments, are intended to indicate that such embodiments are possible examples, representations, or illustrations of possible embodiments (and such terms are not intended to connote that such embodiments are necessarily extraordinary or superlative examples).
[0090] The term “coupled” and variations thereof, as used herein, means the joining of two members directly or indirectly to one another. Such joining may be stationary (e.g., permanent or fixed) or moveable (e.g., removable or releasable). Such joining may be achieved with the two members coupled directly to each other, with the two members coupled to each other using a separate intervening member and any additional intermediate members coupled with one another, or with the two members coupled to each other using an intervening member that is integrally formed as a single unitary body with one of the two members. If “coupled” or variations thereof are modified by an additional term (e.g., directly coupled), the generic definition of “coupled” provided above is modified by the plain language meaning of the additional term (e.g., “directly coupled” means the joining of two members without any separate intervening member), resulting in a narrower definition than the generic definition of “coupled” provided above. Such coupling may be mechanical, electrical, or fluidic.
[0091] The term “or,” as used herein, is used in its inclusive sense (and not in its exclusive sense) so that when used to connect a list of elements, the term “or” means one, some, or all of the elements in the list. Conjunctive language such as the phrase “at least one of X, Y, and Z,” unless specifically stated otherwise, is understood to convey that an element may be either X, Y, Z; X and Y; X and Z; Y and Z; or X, Y, and Z (i.e., any combination of X, Y, and Z). Thus, such
conjunctive language is not generally intended to imply that certain embodiments require at least one of X, at least one of Y, and at least one of Z to each be present, unless otherwise indicated.
[0092] References herein to the positions of elements (e.g., “top,” “bottom,” “above,” “below”) are merely used to describe the orientation of various elements in the FIGURES. It should be noted that the orientation of various elements may differ according to other exemplary embodiments, and that such variations are intended to be encompassed by the present disclosure.
[0093] The present disclosure contemplates methods, systems and program products on any machine-readable media for accomplishing various operations. The embodiments of the present disclosure may be implemented using existing computer processors, or by a special purpose computer processor for an appropriate system, incorporated for this or another purpose, or by a hardwired system. Embodiments within the scope of the present disclosure include program products including machine-readable media for carrying or having machine-executable instructions or data structures stored thereon. Such machine-readable media can be any available media that can be accessed by a general purpose or special purpose computer or other machine with a processor. By way of example, such machine-readable media can comprise RAM, ROM, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code in the form of machine-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer or other machine with a processor. Combinations of the above are also included within the scope of machine-readable media. Machine-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing machines to perform a certain function or group of functions.
[0094] Although the figures show a specific order of method steps, the order of the steps may differ from what is depicted. Also two or more steps may be performed concurrently or with partial concurrence. Such variation will depend on the software and hardware systems chosen and on designer choice. All such variations are within the scope of the disclosure. Likewise, software implementations could be accomplished with standard programming techniques with
rule based logic and other logic to accomplish the various connection steps, processing steps, comparison steps and decision steps.
Claims
1. A method comprising: generating, by one or more processors, an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receiving, by the one or more processors, a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modifying, by the one or more processors, an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receiving, by the one or more processors, a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; and generating, by the one or more processors, an audiogram specific to the user according to the first indication of the user response and the second indication of the user response.
2. The method of claim 1, wherein modifying the amplitude of the second portion comprises adjusting the amplitude of the audio signal by a first amount.
3. The method of claim 2, wherein modifying the amplitude of the audio signal comprises adjusting the amplitude of the audio signal by a second amount which is less than the first amount, subsequent to adjusting the amplitude of the audio signal by the first amount.
4. The method of claim 1, wherein each frequency of the plurality of frequencies is defined by a non-linear series comprising the first frequency, the second frequency, a third frequency, a fourth frequency, and a fifth frequency.
5. The method of claim 1, wherein the first indication designates an amplitude for which the audio signal is inaudible to the user.
6. The method of claim 5, wherein the first indication is received upon a release of an interface element responsive to inaudibility of the audio signal.
7. The method of claim 5, further comprising: receiving an identifier associated with the audio device; and wherein the audio signal is generated according to the identifier associated with the audio device.
8. The method of claim 1, further comprising: receiving audio content from an audio source; and modifying the audio content to generate an audio output, according to the audiogram.
9. The method of claim 8, further comprising: determining a classification for the audio content, wherein an amplitude of the audio content is modified at one or more frequencies, according to the classification for the audio content.
10. The method of claim 1, comprising: generating a profile according to the audiogram; receiving a modification of the profile at one or more frequencies based on a user input; and updating the profile according to the modification.
11. A system comprising: a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, wherein the audio signal sweeps across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies;
upon receipt of the first indication of the user response, modify an amplitude of a second portion of the audio signal associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies; and generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; receive a device identifier associated with the speaker; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and the device identifier; and output, via the speaker, the audio output to the user.
12. The system of claim 11, wherein, to modify the audio input, the one or more processors are configured to increase an amplitude of the audio input at a frequency corresponding to hearing loss, as indicated by the audiogram.
13. The system of claim 11, wherein the one or more processors are configured to: determine a classification of the audio input, the classification comprising at least one of spatial audio, speech audio, music audio, genre of music, or gaming audio, wherein the one or more processors modify the audio input according to the audiogram and the classification of the audio input.
14. The system of claim 11, comprising: one or more headphones, the one or more headphones comprising the speaker and the one or more processors.
15. The system of claim 11, wherein, to modify the audio input, the one or more processors are configured to increase an amplitude of the audio input at a fourth frequency, different from a
third frequency corresponding to hearing impairment as indicated by the audiogram, responsive to a determination that the audio input includes spatial information.
16. The system of claim 11, wherein, to modify the audio input, the one or more processors are configured to increase an amplitude of the audio input at a harmonic of a third frequency corresponding to hearing impairment as indicated by the audiogram, responsive to a determination that the audio input includes human speech.
17. The system of claim 11, wherein the audio input is modified based on a frequency response of the speaker.
18. The system of claim 11, further comprising an interface element configured to receive an indication of the audibility of the audio output from the user, to generate the audiogram.
19. The system of claim 11, wherein the one or more processors are further configured to: receive an indication of an amplification preference from the user; and cause a device to adjust an amplitude of the audio output according to the amplification preference from the user.
20. A headphone set comprising: a speaker; and one or more processors configured to: generate an audio signal for output by an audio device to a user, the audio signal sweeping across a plurality of frequencies; receive a first indication of a user response to a first portion of the audio signal, the first portion associated with a first frequency of the plurality of frequencies; upon receiving the first indication of the user response, modify an amplitude of a second portion associated with a second frequency of the plurality of frequencies; receive a second indication of a user response to the second portion of the audio signal, the second portion associated with the second frequency of the plurality of frequencies;
generate an audiogram specific to the user according to the first indication of the user response and the second indication of the user response; receive an audio input from an audio source; modify the audio input at one or more frequencies to generate an audio output, according to the audiogram and a frequency response of the headphone set; and output, via the speaker, the audio output to the user.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163265856P | 2021-12-22 | 2021-12-22 | |
US63/265,856 | 2021-12-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023122227A1 (en) | 2023-06-29 |
Family
ID=86903599
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2022/053737 WO2023122227A1 (en) | 2021-12-22 | 2022-12-21 | Audio control system |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023122227A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070092089A1 (en) * | 2003-05-28 | 2007-04-26 | Dolby Laboratories Licensing Corporation | Method, apparatus and computer program for calculating and adjusting the perceived loudness of an audio signal |
US20140309549A1 (en) * | 2013-02-11 | 2014-10-16 | Symphonic Audio Technologies Corp. | Methods for testing hearing |
US20150078556A1 (en) * | 2012-04-13 | 2015-03-19 | Nokia Corporation | Method, Apparatus and Computer Program for Generating an Spatial Audio Output Based on an Spatial Audio Input |
US20180220243A1 (en) * | 2015-10-05 | 2018-08-02 | Widex A/S | Hearing aid system and a method of operating a hearing aid system |
US20190364354A1 (en) * | 2018-05-22 | 2019-11-28 | Staton Techiya LLC | Hearing sensitivity acquisition methods and devices |
Similar Documents
Publication | Title |
---|---|
US11888456B2 (en) | Methods and systems for automatically equalizing audio output based on room position |
US20200081683A1 (en) | Methods and apparatus for dynamic volume adjustment via audio classification |
US11809775B2 (en) | Conversation assistance audio device personalization |
EP3301675B1 (en) | Parameter prediction device and parameter prediction method for acoustic signal processing |
US10896020B2 (en) | System for processing service requests relating to unsatisfactory performance of hearing devices, and components of such system |
US20150271608A1 (en) | Crowd sourced recommendations for hearing assistance devices |
US11096005B2 (en) | Sound reproduction |
US12061840B2 (en) | Methods and apparatus for dynamic volume adjustment via audio classification |
US11438710B2 (en) | Contextual guidance for hearing aid |
EP3484183B1 (en) | Location classification for intelligent personal assistant |
US11601757B2 (en) | Audio input prioritization |
US12041424B2 (en) | Real-time adaptation of audio playback |
EP2163124B1 (en) | Fully learning classification system and method for hearing aids |
WO2023122227A1 (en) | Audio control system |
CN111918174B (en) | Method and device for balancing volume gain, electronic device and vehicle |
US20240112661A1 (en) | Environmentally Adaptive Masking Sound |
US9055362B2 (en) | Methods, apparatus and systems for individualizing audio, music and speech adaptively, intelligently and interactively |
US12022271B2 (en) | Dynamics processing across devices with differing playback capabilities |
JP7252893B2 (en) | Sound management method and system |
US20220345833A1 (en) | Machine learning based hearing assistance system |
US20230260526A1 (en) | Method and electronic device for personalized audio enhancement |
US20240373187A1 (en) | Audio parameter optimizing method and computing apparatus related to audio parameters |
US20240163621A1 (en) | Hearing aid listening test presets |
KR102727090B1 (en) | Location classification for intelligent personal assistant |
WO2024105468A1 (en) | Hearing aid listening test presets |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22912459; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |