CN109036430A - Voice control terminal - Google Patents
- Publication number
- CN109036430A (application CN201811150055.5A)
- Authority
- CN
- China
- Prior art keywords
- voice control
- cloud
- control box
- voice
- semantic analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Telephonic Communication Services (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present invention discloses a voice control terminal comprising a voice control box, a cloud, a projection screen, a microphone, and a camera. The voice control box sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding. The voice control box converts the text into a corresponding control instruction and sends it to a control center over a network, a serial port, or another transmission form. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction. A loudspeaker is provided in the voice control box, which supports full-duplex voice interaction and a barge-in function. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud, in cooperation with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
Description
Technical field
The present invention relates to the field of intelligent voice control technology, and in particular to a voice control terminal.
Background
With the development of science and technology, devices are becoming increasingly intelligent, and voice control terminals are becoming increasingly common. Existing voice control terminals all have a user identification function, but most identify users by voiceprint and some by gestures, so the function is rather limited. Their cloud deployment is likewise limited to a single mode, so they process audio information inefficiently, and they have no voice instruction backtracking function. Solving these problems is therefore particularly important.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a voice control terminal. The voice control box sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding. The voice control box converts the text into a corresponding control instruction and sends it to a control center over a network or a serial port. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud, in cooperation with the local speech recognition/semantic analysis engine deployed on the voice control box.
To solve the above problems, the present invention provides a voice control terminal comprising:
Voice control box: sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control center over a network or a serial port; a local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud being deployed on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data, content feedback, guidance, and configuration;
Microphone: collects audio data; the microphone is connected to the voice control box via Bluetooth or a network interface;
Camera: collects the user's face data; the camera is connected to the voice control box via USB or a network interface.
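The path from recognized text to a transmitted control instruction can be sketched as follows. This is only an illustration under stated assumptions: the patent does not define a command table, an instruction format, or a wire format, so `COMMAND_TABLE`, `ControlInstruction`, the JSON payload, and the serial framing below are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical phrase-to-instruction table; the patent does not define one.
COMMAND_TABLE = {
    "turn on the lights": ("light", "on"),
    "turn off the lights": ("light", "off"),
}


@dataclass(frozen=True)
class ControlInstruction:
    device: str
    action: str


def text_to_instruction(text: str) -> Optional[ControlInstruction]:
    """Map recognized text to a control instruction, or None if unknown."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    entry = COMMAND_TABLE.get(key)
    return ControlInstruction(*entry) if entry else None


def encode_for_network(instr: ControlInstruction) -> bytes:
    """Serialize as UTF-8 JSON for a network transport (assumed format)."""
    body = {"device": instr.device, "action": instr.action}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def encode_for_serial(instr: ControlInstruction) -> bytes:
    """Frame as STX, payload, ETX, one-byte checksum (assumed framing)."""
    payload = f"{instr.device}:{instr.action}".encode("ascii")
    checksum = sum(payload) % 256
    return b"\x02" + payload + b"\x03" + bytes([checksum])
```

Keeping the mapping step separate from the two encoders mirrors the description, in which the same control instruction may leave the voice control box over either a network or a serial port.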
A further improvement is that the voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction.
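One way to picture this gating step is the sketch below. The patent does not say how the three recognition results are combined, so the policy here is an assumption: every recognizer that returns a match must name the same user, and that user must be on a hypothetical authorized list. `BiometricResult`, `resolve_identity`, and `should_send` are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class BiometricResult:
    """User ID claimed by each recognizer; None means no match."""
    face: Optional[str]
    fingerprint: Optional[str]
    voiceprint: Optional[str]


def resolve_identity(result: BiometricResult, authorized: Set[str]) -> Optional[str]:
    """Return the user ID if all matching recognizers agree on one authorized user."""
    claims = {
        c for c in (result.face, result.fingerprint, result.voiceprint)
        if c is not None
    }
    if len(claims) != 1:
        return None  # no recognizer matched, or the recognizers disagree
    (user,) = claims
    return user if user in authorized else None


def should_send(result: BiometricResult, authorized: Set[str]) -> bool:
    """Decide whether the voice instruction may be sent (assumed policy)."""
    return resolve_identity(result, authorized) is not None
```

A stricter deployment could instead require all three recognizers to match; only the `len(claims)` check would need to change.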
A further improvement is that a loudspeaker is provided in the voice control box; the loudspeaker may be built in or external.
A further improvement is that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.
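The patent does not name the array algorithm. As a familiar stand-in, the sketch below shows delay-and-sum beamforming with integer sample delays: each channel is shifted so the target sound lines up, and the channels are averaged so that noise that differs between microphones is attenuated. The function name and the assumption that the per-microphone delays are already known are illustrative only.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each channel by its integer sample delay, then average.

    delays[i] is the number of samples by which the target sound arrives
    later at microphone i than at the reference microphone (must be >= 0).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    if not channels:
        return []
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Only samples available on every aligned channel can be combined.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        return []
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]
```

For example, with a target `[1, 2, 3, 4]` that reaches the second microphone one sample late, and opposite-sign noise of 0.5 on the first target sample of each channel, `delay_and_sum([[1.5, 2, 3, 4, 0], [0, 0.5, 2, 3, 4]], [0, 1])` recovers the clean target.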
A further improvement is that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.
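A backtracking module of this kind could be as simple as an append-only log that ties each issued instruction to a user identity. The sketch below assumes exactly that; the record fields and the `BacktrackLog` class are hypothetical, since the patent does not describe the module's data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstructionRecord:
    seq: int          # position in the log, starting at 1
    user_id: str      # identity established before the instruction was sent
    instruction: str  # the control instruction that was issued


class BacktrackLog:
    """Append-only record of who issued which instruction."""

    def __init__(self) -> None:
        self._records: List[InstructionRecord] = []

    def record(self, user_id: str, instruction: str) -> InstructionRecord:
        rec = InstructionRecord(len(self._records) + 1, user_id, instruction)
        self._records.append(rec)
        return rec

    def issued_by(self, user_id: str) -> List[InstructionRecord]:
        """All instructions a given user has issued, in order."""
        return [r for r in self._records if r.user_id == user_id]

    def who_issued(self, instruction: str) -> List[str]:
        """User IDs, in order, that issued the given instruction."""
        return [r.user_id for r in self._records if r.instruction == instruction]
```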
A further improvement is that the semantic understanding includes understanding of standard semantics and understanding of extended semantics.
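The patent does not define the two kinds of understanding. One possible reading, shown here purely as an assumption, is that standard semantics covers canonical command phrases while extended semantics covers looser paraphrases that resolve to the same intent. The tables and intent names below are invented for illustration.

```python
from typing import Optional, Tuple

# Hypothetical tables: canonical phrases versus paraphrases of the same intent.
STANDARD = {"turn on the light": "light.on"}
EXTENDED = {
    "it is too dark in here": "light.on",
    "lights please": "light.on",
}


def understand(text: str) -> Optional[Tuple[str, str]]:
    """Return (intent, kind), where kind is 'standard' or 'extended'."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    if key in STANDARD:
        return STANDARD[key], "standard"
    if key in EXTENDED:
        return EXTENDED[key], "extended"
    return None
```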
The beneficial effects of the present invention are as follows. Audio data is collected by the microphone, which is connected to the voice control terminal by USB, Bluetooth, or another wired or wireless means. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through speech analysis, semantic understanding, and the like. The voice control box converts the text into a corresponding control instruction and sends it to the integrator's control center over a network, a serial port, or another transmission form. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction. It also has a backtracking function, which makes it easy to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and a barge-in function. The microphone uses a microphone array noise reduction algorithm and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud, in cooperation with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
Brief description of the drawings
Fig. 1 is a system connection diagram of the present invention.
Fig. 2 is a schematic diagram of the cloud deployment of the present invention.
Fig. 3 is a system framework diagram of the local speech recognition engine of the present invention.
Specific embodiment
To deepen understanding of the present invention, it is further described below with reference to an embodiment. The embodiment serves only to explain the invention and does not limit its scope of protection.
As shown in Figs. 1, 2, and 3, this embodiment provides a voice control terminal comprising:
Voice control box: sends collected audio data through the AIUI service to the speech recognition engine in the cloud, which converts the speech into the required text through semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control center over a network or a serial port; a local speech recognition engine is provided in the voice control box;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud being deployed on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data;
Microphone: collects audio data; the microphone is connected to the voice control box via Bluetooth or a MIC interface;
Camera: collects the user's face data; the camera is connected to the voice control box via USB or a network interface.
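How the private cloud, public cloud, and local engines cooperate is not spelled out in the text. One plausible arrangement, given here only as an assumption, is an ordered fallback: try each engine in a configured order and use the first one that answers. The `Engine` type, the `EngineUnavailable` exception, and the ordering are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Engine = Callable[[bytes], str]  # audio in, recognized text out


class EngineUnavailable(Exception):
    """Raised by an engine that cannot be reached or cannot decode the audio."""


def recognize(audio: bytes,
              engines: Sequence[Tuple[str, Engine]]) -> Tuple[str, str]:
    """Try engines in order; return (engine_name, text) from the first success."""
    errors: List[str] = []
    for name, engine in engines:
        try:
            return name, engine(audio)
        except EngineUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this engine failed
    raise EngineUnavailable("; ".join(errors) or "no engines configured")
```

With the list ordered as private cloud, public cloud, local engine, the voice control box would still recognize speech when neither cloud is reachable.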
The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction. A loudspeaker is provided in the voice control box. The microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup. A backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it. The local speech recognition engine is used for speech recognition and semantic analysis. The semantic understanding includes understanding of standard semantics and understanding of extended semantics.
In the present invention, audio data is collected by the microphone, which is connected to the voice control terminal by USB, Bluetooth, or another wired or wireless means. The voice control box sends the collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through speech analysis, semantic understanding, and the like. The voice control box converts the text into a corresponding control instruction and sends it to the integrator's control center over a network, a serial port, or another transmission form. The voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction. It also has a backtracking function, which makes it easy to trace issued instructions. The voice control box has a built-in loudspeaker and supports full-duplex voice interaction and a barge-in function. The microphone uses a microphone array noise reduction algorithm and supports near-field and far-field sound pickup. Speech recognition and semantic analysis are performed by the cloud speech recognition/semantic analysis engine deployed in the cloud, in cooperation with the local speech recognition/semantic analysis engine deployed on the voice control box, which greatly reduces manual labor and improves working efficiency.
Claims (6)
1. A voice control terminal, characterized by comprising:
Voice control box: sends collected audio data through the AIUI service to the cloud speech recognition/semantic analysis engine, which converts the speech into the required text through semantic understanding; the voice control box converts the text into a corresponding control instruction and sends it to a control center over a network or a serial port; a local speech recognition/semantic analysis engine is also provided in the voice control box for speech recognition and semantic analysis;
Cloud: receives and recognizes the audio data transmitted by the voice control box; the cloud comprises a private cloud and a public cloud, the public cloud being deployed on an open network and the private cloud being deployed on a local area network; a cloud speech recognition/semantic analysis engine is provided in the cloud and is distributed across the private cloud and the public cloud;
Projection screen: displays the processed audio data, content feedback, guidance, and configuration;
Microphone: collects audio data; the microphone is connected to the voice control box via Bluetooth or a network interface;
Camera: collects the user's face data; the camera is connected to the voice control box via USB or a network interface.
2. The voice control terminal according to claim 1, characterized in that the voice control box identifies the person issuing the control voice through face recognition, fingerprint authentication, and voiceprint recognition, and decides from the identification result whether to send the voice instruction.
3. The voice control terminal according to claim 1, characterized in that a loudspeaker is provided in the voice control box.
4. The voice control terminal according to claim 1, characterized in that the microphone uses an array algorithm for noise reduction and supports near-field and far-field sound pickup.
5. The voice control terminal according to claim 1, characterized in that a backtracking module is provided in the voice control box for tracing an issued instruction back to the identity of the user who issued it.
6. The voice control terminal according to claim 1, characterized in that the semantic understanding includes understanding of standard semantics and understanding of extended semantics.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811150055.5A CN109036430A (en) | 2018-09-29 | 2018-09-29 | Voice control terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811150055.5A CN109036430A (en) | 2018-09-29 | 2018-09-29 | Voice control terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109036430A (en) | 2018-12-18 |
Family
ID=64615139
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811150055.5A Pending CN109036430A (en) | Voice control terminal | 2018-09-29 | 2018-09-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109036430A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657091A (en) * | 2019-01-02 | 2019-04-19 | 百度在线网络技术(北京)有限公司 | State rendering method, device, equipment and the storage medium of interactive voice equipment |
CN110691301A (en) * | 2019-09-25 | 2020-01-14 | 晶晨半导体(深圳)有限公司 | Method for testing delay time between far-field voice equipment and external loudspeaker |
CN111785277A (en) * | 2020-06-29 | 2020-10-16 | 北京捷通华声科技股份有限公司 | Speech recognition method, speech recognition device, computer-readable storage medium and processor |
CN112151062A (en) * | 2020-09-27 | 2020-12-29 | 广州德初科技有限公司 | Virtual sound insulation communication method based on cloud storage |
CN113223518A (en) * | 2021-04-16 | 2021-08-06 | 讯飞智联科技(江苏)有限公司 | Human-computer interaction method of edge computing gateway based on AI (Artificial Intelligence) voice analysis |
CN114553922A (en) * | 2022-02-07 | 2022-05-27 | 中煤信息技术(北京)有限公司 | Voice-controlled coal mine comprehensive automation system and method |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103247291A (en) * | 2013-05-07 | 2013-08-14 | 华为终端有限公司 | Updating method, device, and system of voice recognition device |
CN103839549A (en) * | 2012-11-22 | 2014-06-04 | 腾讯科技(深圳)有限公司 | Voice instruction control method and system |
CN104318924A (en) * | 2014-11-12 | 2015-01-28 | 沈阳美行科技有限公司 | Method for realizing voice recognition function |
CN105045122A (en) * | 2015-06-24 | 2015-11-11 | 张子兴 | Intelligent household natural interaction system based on audios and videos |
CN105202721A (en) * | 2015-07-31 | 2015-12-30 | 广东美的制冷设备有限公司 | Air conditioner and control method thereof |
US20170302450A1 (en) * | 2015-05-05 | 2017-10-19 | ShoCard, Inc. | Identity Management Service Using A Blockchain Providing Certifying Transactions Between Devices |
CN107682536A (en) * | 2017-09-25 | 2018-02-09 | 努比亚技术有限公司 | A kind of sound control method, terminal and computer-readable recording medium |
CN108470533A (en) * | 2018-03-30 | 2018-08-31 | 南京七奇智能科技有限公司 | Enhanced smart interactive advertisement system based on visual human and device |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657091A (en) * | 2019-01-02 | 2019-04-19 | 百度在线网络技术(北京)有限公司 | State rendering method, device, equipment and the storage medium of interactive voice equipment |
US11205431B2 (en) | 2019-01-02 | 2021-12-21 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, apparatus and device for presenting state of voice interaction device, and storage medium |
CN110691301A (en) * | 2019-09-25 | 2020-01-14 | 晶晨半导体(深圳)有限公司 | Method for testing delay time between far-field voice equipment and external loudspeaker |
CN111785277A (en) * | 2020-06-29 | 2020-10-16 | 北京捷通华声科技股份有限公司 | Speech recognition method, speech recognition device, computer-readable storage medium and processor |
CN112151062A (en) * | 2020-09-27 | 2020-12-29 | 广州德初科技有限公司 | Virtual sound insulation communication method based on cloud storage |
CN112151062B (en) * | 2020-09-27 | 2021-12-24 | 梅州国威电子有限公司 | Sound insulation communication method |
CN113223518A (en) * | 2021-04-16 | 2021-08-06 | 讯飞智联科技(江苏)有限公司 | Human-computer interaction method of edge computing gateway based on AI (Artificial Intelligence) voice analysis |
CN113223518B (en) * | 2021-04-16 | 2024-03-22 | 讯飞智联科技(江苏)有限公司 | Human-computer interaction method of edge computing gateway based on AI voice analysis |
CN114553922A (en) * | 2022-02-07 | 2022-05-27 | 中煤信息技术(北京)有限公司 | Voice-controlled coal mine comprehensive automation system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109036430A (en) | Voice control terminal | |
CN109309804A (en) | A kind of intelligent meeting system | |
US7707035B2 (en) | Autonomous integrated headset and sound processing system for tactical applications | |
CN105512113B (en) | AC system speech translation system and interpretation method | |
CN108520743A (en) | Sound control method, smart machine and the computer-readable medium of smart machine | |
CN202542604U (en) | Speech recognition man-machine interaction device for elevator | |
WO2017128775A1 (en) | Voice control system, voice processing method and terminal device | |
CN105957514A (en) | Portable deaf-mute communication equipment | |
CN108415904B (en) | A dual-channel real-time translation method | |
CN105976814A (en) | Headset control method and device | |
CN109545216A (en) | A kind of audio recognition method and speech recognition system | |
CN109905797A (en) | A kind of intelligence simultaneous interpretation bluetooth headset | |
CN208985692U (en) | Voice control terminal | |
CN108877799A (en) | Voice control device and method | |
CN207010925U (en) | A kind of Headphone device for carrying voice and waking up identification | |
CN107942700A (en) | A kind of appliance control system, method and computer-readable recording medium | |
CN108766426B (en) | Intelligent voice interaction command system for naval vessel | |
CN106131349A (en) | A kind of have the mobile phone of automatic translation function, bluetooth earphone assembly | |
CN109300478A (en) | An auxiliary dialogue device for the hearing impaired | |
CN109168110A (en) | External hanging type speech packet | |
CN209571226U (en) | A kind of speech recognition equipment and system | |
CN205376116U (en) | Automatic dolly remote control unit that guides of wireless directional speech control | |
CN208538474U (en) | Speech recognition system | |
CN208094741U (en) | A kind of intelligent microphone based on speech recognition technology | |
CN112399020A (en) | Intelligent voice customer service system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20181218 |