
CN108682414A - Sound control method, voice system, equipment and storage medium - Google Patents

Info

Publication number
CN108682414A
Authority
CN
China
Prior art keywords
vocal print
voice
mentioned
feature
phrase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810361415.XA
Other languages
Chinese (zh)
Inventor
黎发敢
丁翔
童辉
黄海骅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Kei Chi Intelligent Technology Co Ltd
Original Assignee
Shenzhen Kei Chi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Kei Chi Intelligent Technology Co Ltd
Priority to CN201810361415.XA
Publication of CN108682414A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a voice control method, a voice system, a device and a storage medium. The method includes the following steps: extracting a feature voice segment from voice information, and extracting the voiceprint of the feature voice segment; matching a corresponding voice instruction from a preset feature instruction library according to the feature voice and the voiceprint; matching a target device from a preset device list according to the feature voice and the voiceprint; and controlling the target device according to the voice instruction. The beneficial effect of the voice control method, voice system, device and storage medium of the invention is that, by matching voice instructions according to the feature phrase and the voiceprint in order to control the target device, devices can be controlled in a personalized way for different people, which improves the user experience.

Description

Voice control method, voice system, device and storage medium
Technical field
The present invention relates to the technical field of smart homes, and in particular to a voice control method, a voice system, a device and a storage medium.
Background
Voice control means that a smart device analyzes collected voice information and performs the corresponding control according to the analysis result. As technology develops, more and more devices are controlled by voice. At present, this is done mainly by processing the speech and converting it into text, then, after computation in the cloud or locally, performing semantic recognition to generate an instruction that controls the various devices. However, because this approach performs only semantic recognition on the voice, it cannot recognize users individually according to their usage habits, and it has no automatic learning capability for improving the efficiency of intelligent control.
Summary of the invention
The main object of the present invention is to provide a voice control method, a voice system, a device and a storage medium capable of personalized device control, so as to improve the user experience.
The present invention provides a voice control method, including the following steps:
Extracting a feature voice segment from voice information, and extracting the voiceprint of the feature voice segment;
Matching a corresponding voice instruction from a preset feature instruction library according to the feature voice and the voiceprint;
Matching a target device from a preset device list according to the feature voice and the voiceprint;
Controlling the target device according to the voice instruction.
Further, the step of extracting the feature voice segment from the voice information and the voiceprint of the feature voice segment includes:
Determining whether the voice information contains a specific phrase;
If so, intercepting the voice segment corresponding to the specific phrase to form the feature voice segment, and extracting the voiceprint of the feature voice segment.
Further, before the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the feature voiceprint, the method further includes:
Obtaining the position of the voice sound source;
Matching the subordinate device nearest to the sound source position.
Further, the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the voiceprint includes:
Determining whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint;
If so, matching the corresponding voice instruction from the preset feature instruction library according to the feature phrase in the feature voice and the feature voiceprint;
If not, matching the corresponding voice instruction from the preset feature instruction library according to the feature phrase in the feature voice and the subordinate device, and processing the voiceprint to generate a voiceprint model that is stored in the feature voiceprint library.
Further, the step of determining whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint includes:
Retrieving the corresponding voiceprint models from the feature voiceprint library according to the feature phrase, and generating the feature voiceprints by modeling;
Comparing the voiceprint with each of the feature voiceprints, and determining whether there is a feature voiceprint whose similarity to the voiceprint reaches a specified ratio.
Further, after the step of intercepting the voice segment corresponding to the specific phrase to form the feature voice segment and extracting the voiceprint of the feature voice segment, the method further includes:
Recording the number of occurrences of each specific phrase;
Sorting the corresponding devices in the preset device list according to the number of occurrences of each specific phrase, and using that order as the matching order when matching the target device.
Further, before the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the feature voiceprint, the method further includes establishing a feature voiceprint library, and the step of establishing the feature voiceprint library includes:
Obtaining the voiceprint of each user's specified phrase or word as a voiceprint template;
Assigning a level to each user's voiceprint template;
Setting the device list and the permission list corresponding to voiceprint templates of different levels.
The present invention also proposes a voice system, including:
An extraction unit, for extracting a feature voice segment from voice information and the voiceprint of the feature voice segment;
An instruction matching unit, for matching a corresponding voice instruction from a preset feature instruction library according to the feature voice and the voiceprint;
A target device matching unit, for matching a target device from a preset device list according to the feature voice and the voiceprint;
A control unit, for controlling the target device according to the voice instruction.
The present invention also proposes a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of the embodiments when executing the program.
The present invention also proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of the embodiments.
Compared with the prior art, the present invention has the following advantages: controlling the target device by matching voice instructions according to the feature phrase and the voiceprint makes it possible to control devices in a personalized way for different people, which improves the user experience; controlling the target device according to the voice sound source position and the feature phrase extends the range of voice control; processing the extracted voiceprint to generate a voiceprint model, and modeling from that voiceprint model, gives voiceprint recognition an automatic deep learning process that improves the accuracy of voice control; and the configured permission list further improves the accuracy of voice control.
Brief description of the drawings
Fig. 1 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 2 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 3 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 4 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 5 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 6 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 7 is a flow diagram of the voice control method of one embodiment of the invention;
Fig. 8 is a structural diagram of the voice system of one embodiment of the invention;
Fig. 9 is a structural diagram of a computer device of one embodiment of the invention.
1, extraction unit; 2, instruction matching unit; 3, target device matching unit; 4, control unit; 5, computer device; 6, external device; 7, processing unit; 8, bus; 9, network adapter; 10, input/output (I/O) interface; 11, display; 12, system memory; 13, random access memory (RAM); 14, cache memory; 15, storage system; 16, program/utility; 17, program module.
The realization of the object, the functional features and the advantages of the present invention are further described below in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, the voice control method of one embodiment of the invention includes the following steps:
S1, extracting a feature voice segment from voice information, and extracting the voiceprint of the feature voice segment;
S2, matching a corresponding voice instruction from a preset feature instruction library according to the feature voice and the voiceprint;
S3, matching a target device from a preset device list according to the feature voice and the voiceprint;
S4, controlling the target device according to the voice instruction.
In step S1, the feature voice segment and its voiceprint are extracted from the voice information. The voiceprint is the sound wave spectrum carrying the voice information; it is a sound characteristic that can uniquely identify a person or thing, so the voiceprint serves to identify individuals.
In step S2, the corresponding voice instruction is matched from the preset feature instruction library according to the feature voice and the voiceprint. The feature voice is the voice information containing a specific phrase. A specific phrase generally either contains both an effective phrase and a feature phrase or contains only a feature phrase. An effective phrase is generally a phrase that gives the voice instruction a single target, such as a device noun, "all" or "all devices". A feature phrase is generally a phrase that contains an effective action, such as "turn on", "turn off", "open" or "close".
In step S3, the target device is matched from the preset device list according to the feature voice and the voiceprint. The preset device list stores information on default devices and on newly added devices.
In step S4, the target device is controlled according to the voice instruction. The target device controlled by the voice instruction can be a single device or multiple devices.
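Steps S2 to S4 can be sketched as two table lookups followed by dispatch. This is a minimal illustration only: the `Utterance` type, the speaker label standing in for a matched voiceprint, the library contents and the instruction names are all assumptions, not part of the patent text.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical result of step S1 (extraction)."""
    feature_phrase: str    # action, e.g. "turn on"
    effective_phrase: str  # target, e.g. "lamp" ("" when absent)
    speaker: str           # label standing in for the matched voiceprint


# Hypothetical preset feature instruction library, keyed by (speaker, action).
INSTRUCTION_LIBRARY = {
    ("alice", "turn on"): "POWER_ON",
    ("alice", "turn off"): "POWER_OFF",
}

# Hypothetical preset device list, keyed by (speaker, target).
DEVICE_LIST = {
    ("alice", "lamp"): ["bedroom_lamp"],
    ("alice", "all"): ["bedroom_lamp", "air_conditioner"],
}


def control(utterance):
    """Return the (device, instruction) pairs that would be dispatched."""
    key = (utterance.speaker, utterance.feature_phrase)
    instruction = INSTRUCTION_LIBRARY.get(key)                      # S2
    targets = DEVICE_LIST.get(
        (utterance.speaker, utterance.effective_phrase), [])        # S3
    if instruction is None:
        return []
    return [(device, instruction) for device in targets]           # S4
```

Because both lookups are keyed by the speaker, the same spoken phrase can resolve to different instructions or devices for different users, which is the personalization the method aims at.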
In this embodiment, before the step of extracting the feature voice segment from the voice information and the voiceprint of the feature voice segment, the method further includes:
A1, obtaining voice information;
A2, applying filtering, noise reduction and gain adjustment to the voice information.
In step A1, voice information is obtained. Voice information generally refers to the information carried by sound.
In step A2, the voice information undergoes filtering, noise reduction and gain adjustment. This processing cleans the voice information and reduces the influence of background noise on it, yielding clean voice information and thereby improving the accuracy of voice control.
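The patent does not say which filter, noise treatment or gain rule step A2 uses. The sketch below swaps in three deliberately simple stand-ins (a moving-average filter, an amplitude noise gate and peak normalization) purely to show the order of operations; the function names, window size and threshold are assumptions.

```python
def moving_average(samples, window=3):
    """Simple low-pass filter: average each sample with its neighbours."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def noise_gate(samples, threshold=0.02):
    """Zero out samples whose amplitude is below the assumed noise floor."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize_gain(samples, target_peak=1.0):
    """Scale the signal so its largest amplitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s * target_peak / peak for s in samples]


def preprocess(samples):
    """Step A2 as a chain: filter, then gate noise, then adjust gain."""
    return normalize_gain(noise_gate(moving_average(samples)))
```

A real system would more likely use a band-pass filter tuned to speech and a spectral noise suppressor; the chain structure is the point here, not the individual stages.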
Referring to Fig. 2, in this embodiment, the step of extracting the feature voice segment from the voice information and the voiceprint of the feature voice segment includes:
S5, determining whether the voice information contains a specific phrase;
S6, if so, intercepting the voice segment corresponding to the specific phrase to form the feature voice segment, and extracting the voiceprint of the feature voice segment.
In step S5, it is determined whether the voice information contains a specific phrase. A specific phrase generally either contains both an effective phrase and a feature phrase or contains only a feature phrase. An effective phrase is generally a phrase that gives the voice instruction a single target, such as a device noun, "all" or "all devices". A feature phrase is generally a phrase that contains an effective action, such as "turn on", "turn off", "open" or "close". Specifically, a specific phrase can be "turn on the lamp", "turn on the first row of lamps", "turn on", "turn on all", "turn off the air conditioner", and so on. Correspondingly, the effective phrases are "lamp", "first row of lamps", "all" and "air conditioner" (the phrase "turn on" alone has no effective phrase), and the feature phrases are "turn on", "turn on", "turn on", "turn on" and "turn off".
In step S6, if so, the voice segment corresponding to the specific phrase is intercepted to form the feature voice segment, and the voiceprint of the feature voice segment is extracted. The extracted voiceprint is used for matching against the feature voiceprints.
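The split of a specific phrase into a feature phrase (action) and an optional effective phrase (target) can be illustrated on text. This sketch assumes the speech has already been transcribed and uses a two-entry action vocabulary; both are assumptions made for the example, since the patent works on voice segments and does not list a vocabulary.

```python
# Hypothetical action vocabulary; a real system would have a larger one.
FEATURE_PHRASES = ("turn on", "turn off")


def split_specific_phrase(text):
    """Return (feature_phrase, effective_phrase), or None if the text
    does not start with a known feature phrase (step S5 answers "no")."""
    normalized = " ".join(text.lower().split())
    for action in FEATURE_PHRASES:
        if normalized == action:
            return action, ""            # feature phrase only
        if normalized.startswith(action + " "):
            target = normalized[len(action) + 1:]
            if target.startswith("the "):
                target = target[4:]      # drop the article
            return action, target
    return None
```

For example, `split_specific_phrase("Turn on the lamp")` yields the action "turn on" and the target "lamp", while a bare "turn on" yields an empty target, matching the two forms of specific phrase described in step S5.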
Referring to Fig. 3, in this embodiment, before the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the feature voiceprint, the method further includes:
S7, obtaining the position of the voice sound source;
S8, matching the subordinate device nearest to the sound source position.
In step S7, the voice sound source position is obtained, usually from information such as the strength of the voice and differences in the time at which the voice is received.
In step S8, the subordinate device nearest to the sound source position is matched. The subordinate device is a single device.
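Step S8 reduces to a nearest-neighbour lookup once a sound source position is available. The sketch below assumes positions are already expressed as coordinates in a shared frame (how step S7 derives them is not specified in detail) and uses Euclidean distance; the device names and coordinates are hypothetical.

```python
import math


def nearest_device(source_position, device_positions):
    """Return the name of the device closest to the sound source,
    or None when no devices are registered.

    device_positions maps device name -> coordinate tuple.
    """
    if not device_positions:
        return None
    return min(
        device_positions,
        key=lambda name: math.dist(source_position, device_positions[name]),
    )
```

With devices at `{"lamp": (0.0, 0.0), "air_conditioner": (4.0, 3.0)}`, a speaker located at `(3.5, 3.0)` would be matched to the air conditioner.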
Referring to Fig. 4, in this embodiment, the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the voiceprint includes:
S9, determining whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint;
S10, if so, matching the corresponding voice instruction from the preset feature instruction library according to the feature phrase in the feature voice and the feature voiceprint;
S11, if not, matching the corresponding voice instruction from the preset feature instruction library according to the feature phrase in the feature voice and the subordinate device, and processing the voiceprint to generate a voiceprint model that is stored in the feature voiceprint library.
In step S9, it is determined whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint. A feature voiceprint is generally a voiceprint model, and it can match all voiceprints of the corresponding user.
In step S10, if so, the corresponding voice instruction is matched from the preset feature instruction library according to the feature phrase in the feature voice and the feature voiceprint. Specifically, the user's identity is determined from the feature voiceprint and the user's control intention from the feature phrase, so the corresponding voice instruction can be matched from the preset feature instruction library according to each user's usage habits.
In step S11, if not, the corresponding voice instruction is matched from the preset feature instruction library according to the feature phrase in the feature voice and the subordinate device, and the voiceprint is processed to generate a voiceprint model that is stored in the feature voiceprint library. The voiceprint model generated by processing the voiceprint is used to build the feature voiceprint, and the subordinate device is the device nearest to the sound source position. Thus, even if the user has not yet established a feature voiceprint and the user's identity cannot be determined, the corresponding voice instruction can still be matched from the preset feature instruction library using a feature voice that contains only the feature phrase together with the subordinate device.
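The S9 to S11 branch can be sketched as a known-speaker lookup with a fallback to the nearest device. Everything here is illustrative: the `matcher` callback stands in for the voiceprint comparison, the raw voiceprint is stored directly in place of a generated voiceprint model, and the generated user names and instruction keys are hypothetical.

```python
def match_instruction(voiceprint, feature_phrase, subordinate_device,
                      voiceprint_library, instruction_library, matcher):
    """Return the matched voice instruction, or None if none is defined.

    voiceprint_library: dict of user -> stored voiceprint model.
    instruction_library: dict keyed by (user or device, feature phrase).
    matcher(voiceprint, model) -> bool is an assumed comparison function.
    """
    # S9: look for a stored feature voiceprint that matches.
    for user, model in voiceprint_library.items():
        if matcher(voiceprint, model):
            # S10: known speaker, use that user's own instruction.
            return instruction_library.get((user, feature_phrase))

    # S11: unknown speaker. Enrol the voiceprint (simplified: stored as-is)
    # and fall back to the device nearest to the sound source.
    new_user = f"user_{len(voiceprint_library) + 1}"
    voiceprint_library[new_user] = voiceprint
    return instruction_library.get((subordinate_device, feature_phrase))
```

The side effect in the fallback branch is what lets the system recognize the same speaker on a later request, so an unknown user is only treated as unknown once.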
Referring to Fig. 5, in this embodiment, the step of determining whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint includes:
S12, retrieving the corresponding voiceprint models from the feature voiceprint library according to the feature phrase, and generating the feature voiceprints by modeling;
S13, comparing the voiceprint with each of the feature voiceprints, and determining whether there is a feature voiceprint whose similarity to the voiceprint reaches a specified ratio.
In step S12, the corresponding voiceprint models are retrieved from the feature voiceprint library according to the feature phrase, and the feature voiceprints are generated by modeling. A feature voiceprint is generally used to determine the user's identity.
In step S13, the voiceprint is compared with each of the feature voiceprints, and it is determined whether there is a feature voiceprint whose similarity to the voiceprint reaches the specified ratio. The specified ratio is the minimum similarity at which the voiceprint is considered to match a feature voiceprint.
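The comparison in step S13 can be illustrated with a thresholded similarity search. The patent does not name a similarity measure, so this sketch assumes voiceprints are fixed-length numeric vectors and uses cosine similarity; the 0.8 default for the specified ratio is likewise an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_matching_voiceprint(voiceprint, feature_voiceprints,
                             specified_ratio=0.8):
    """Return the user whose feature voiceprint is most similar to the
    input, provided the similarity reaches specified_ratio; else None."""
    best_user, best_score = None, specified_ratio
    for user, reference in feature_voiceprints.items():
        score = cosine_similarity(voiceprint, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user
```

Returning the best match above the threshold, rather than the first one, avoids misidentifying a speaker when two stored voiceprints both pass the minimum.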
Referring to Fig. 6, in this embodiment, after the step of intercepting the voice segment corresponding to the specific phrase to form the feature voice segment and extracting the voiceprint of the feature voice segment, the method further includes:
S14, recording the number of occurrences of each specific phrase;
S15, sorting the corresponding devices in the preset device list according to the number of occurrences of each specific phrase, and using that order as the matching order when matching the target device.
In step S14, the number of occurrences of each specific phrase is recorded. Each occurrence of a specific phrase is recorded as one use of the corresponding device by the corresponding user. The specific phrase that occurs most often reflects the user's most frequent usage habit, that is, the device the user uses most. Recording the number of occurrences of each specific phrase therefore records the user's usage habits.
In step S15, the corresponding devices in the preset device list are sorted according to the number of occurrences of each specific phrase, and that order is used as the matching order when matching the target device. The device ranked first in the corresponding device list is the device the user uses most often. The most frequently used device can therefore be controlled according to the feature phrase in the user's voice information and the voiceprint.
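Steps S14 and S15 amount to counting and sorting. In this sketch the phrase log, the mapping from specific phrases to devices and the device names are hypothetical inputs; the patent does not describe how a phrase is associated with its device.

```python
from collections import Counter


def rank_devices(device_list, phrase_log, phrase_to_device):
    """Order device_list by how often each device's phrases occurred.

    phrase_log: specific phrases in the order they were heard (S14).
    phrase_to_device: assumed mapping from specific phrase to device.
    """
    counts = Counter(
        phrase_to_device[phrase]
        for phrase in phrase_log
        if phrase in phrase_to_device
    )
    # S15: most-used first. sorted() is stable, so devices with equal
    # counts keep their original relative order.
    return sorted(device_list, key=lambda device: -counts[device])
```

Keeping one log per user, as step A3 of this embodiment suggests, would give each user an individual matching order.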
In this embodiment, after the step of recording the number of occurrences of each specific phrase, the method further includes:
A3, sending the record of the specific phrase to the corresponding user's record storage location according to the voiceprint carried by the specific phrase.
In step A3, the record of the specific phrase is sent to the corresponding user's record storage location according to the voiceprint carried by the specific phrase. This stores each user's usage records by category, which makes later lookup easier.
Referring to Fig. 7, in this embodiment, before the step of matching the corresponding voice instruction from the preset feature instruction library according to the feature voice and the feature voiceprint, the method further includes establishing a feature voiceprint library, and the step of establishing the feature voiceprint library includes:
S16, obtaining the voiceprint of each user's specified phrase or word as a voiceprint template;
S17, assigning a level to each user's voiceprint template;
S18, setting the device list and the permission list corresponding to voiceprint templates of different levels.
In step S16, the voiceprint of each user's specified phrase or word is obtained as a voiceprint template. The specified phrase or word of each user generally refers to a specified phrase or word that points to a corresponding device.
In step S17, a level is assigned to each user's voiceprint template. Assigning levels to the voiceprint templates amounts to dividing the users into levels.
In step S18, the device list and the permission list corresponding to voiceprint templates of different levels are set. Different users correspond to different device lists, and a user's right to use a device is determined from the permission list.
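One way to picture the library built in steps S16 to S18 is a per-user template carrying a level, with device and permission lists attached to each level. The data layout, the level numbers, the device names and the idea that a permission list holds allowed actions are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class VoiceprintTemplate:
    user: str
    template: tuple  # stand-in for the stored voiceprint template (S16)
    level: int       # level assigned in S17


# S18: hypothetical device list and permission list for each level.
LEVEL_DEVICES = {
    1: ["lamp", "air_conditioner", "door_lock"],
    2: ["lamp", "air_conditioner"],
}
LEVEL_PERMISSIONS = {
    1: {"turn on", "turn off"},
    2: {"turn on"},
}


def is_allowed(entry, device, action):
    """Check whether the user's level grants this action on this device."""
    devices = LEVEL_DEVICES.get(entry.level, [])
    actions = LEVEL_PERMISSIONS.get(entry.level, set())
    return device in devices and action in actions
```

Attaching lists to levels rather than to individual users means a new user only needs a level to inherit a complete set of devices and permissions.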
In one embodiment, a voice control method includes the following steps:
A1, obtaining voice information;
A2, applying filtering, noise reduction and gain adjustment to the voice information;
S5, determining whether the voice information contains a specific phrase;
S6, if so, intercepting the voice segment corresponding to the specific phrase to form the feature voice segment, and extracting the voiceprint of the feature voice segment;
S14, recording the number of occurrences of each specific phrase;
A3, sending the record of the specific phrase to the corresponding user's record storage location according to the voiceprint carried by the specific phrase;
S15, sorting the corresponding devices in the preset device list according to the number of occurrences of each specific phrase, and using that order as the matching order when matching the target device;
S7, obtaining the position of the voice sound source;
S8, matching the subordinate device nearest to the sound source position;
S16, obtaining the voiceprint of each user's specified phrase or word as a voiceprint template;
S17, assigning a level to each user's voiceprint template;
S18, setting the device list and the permission list corresponding to voiceprint templates of different levels;
S12, retrieving the corresponding voiceprint models from the feature voiceprint library according to the feature phrase, and generating the feature voiceprints by modeling;
S13, comparing the voiceprint with each of the feature voiceprints, and determining whether there is a feature voiceprint whose similarity to the voiceprint reaches a specified ratio;
S10, if so, matching the corresponding voice instruction from the preset feature instruction library according to the feature phrase in the feature voice and the feature voiceprint;
S3, matching a target device from the preset device list according to the feature voice and the voiceprint;
S4, controlling the target device according to the voice instruction.
With reference to figure, the present invention also proposes a kind of voice system, including:
Extraction unit 1, the vocal print for extracting characteristic voice section and features described above voice segments in voice messaging;
Instructions match unit 2, for being matched from preset feature instruction database according to features described above voice and above-mentioned vocal print Go out corresponding phonetic order;
Target device matching unit 3, for according to features described above voice and above-mentioned vocal print from preset list of devices Allot target device;
Control unit 4, for according to voice command control target device.
The extraction unit 1 is configured to extract the feature speech segment from the voice information and the voiceprint of the feature speech segment. The voiceprint is the sound-wave spectrum that carries the voice information. It is a sound characteristic that can uniquely identify a person or thing; that is, the voiceprint serves to identify individuals.
The instruction matching unit 2 is configured to match the corresponding voice instruction from the preset feature instruction database according to the feature speech and the voiceprint. The feature speech contains the voice information of a specific phrase. A specific phrase generally includes both an effective phrase and a feature phrase, or only a feature phrase. An effective phrase generally refers to a phrase that gives the voice instruction a single target, such as a device name, "all", or "all devices". A feature phrase generally refers to a phrase that contains an effective action, such as "open", "close", "turn on", or "turn off".
The target device matching unit 3 is configured to match the target device from the preset device list according to the feature speech and the voiceprint. The preset device list stores default device information and newly added device information.
The control unit 4 is configured to control the target device according to the voice instruction. The target device controlled by the voice instruction may be a single device or multiple devices.
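The four units above can be read as a small pipeline. The following Python sketch is only an illustration under stated assumptions: the table contents, the `FeatureSpeech` type, the `usual_device` mapping, and the use of a text transcript and a string in place of real audio and a real voiceprint are all hypothetical and do not come from the patent. For brevity the toy instruction matcher ignores the voiceprint.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical contents of the preset feature instruction database and device list.
INSTRUCTION_DB = {"turn on": "ON", "turn off": "OFF"}
DEVICE_LIST = {"light": "living-room light", "air conditioner": "bedroom air conditioner"}


@dataclass
class FeatureSpeech:
    effective_phrase: Optional[str]  # the target named in the phrase, if any
    feature_phrase: str              # the action
    voiceprint: str                  # stand-in for a real voiceprint


def extract(transcript: str, voiceprint: str) -> Optional[FeatureSpeech]:
    """Extraction unit: locate the specific phrase and attach the voiceprint."""
    feature = next((p for p in INSTRUCTION_DB if p in transcript), None)
    if feature is None:
        return None
    effective = next((p for p in DEVICE_LIST if p in transcript), None)
    return FeatureSpeech(effective, feature, voiceprint)


def match_instruction(speech: FeatureSpeech) -> str:
    """Instruction matching unit (the voiceprint is ignored in this toy version)."""
    return INSTRUCTION_DB[speech.feature_phrase]


def match_device(speech: FeatureSpeech, usual_device: dict) -> Optional[str]:
    """Target device matching unit: a named device wins, else the speaker's usual device."""
    if speech.effective_phrase is not None:
        return DEVICE_LIST[speech.effective_phrase]
    return usual_device.get(speech.voiceprint)


def control(transcript: str, voiceprint: str, usual_device: dict) -> Optional[str]:
    """Control unit: run the matching stages and return the command that would be issued."""
    speech = extract(transcript, voiceprint)
    if speech is None:
        return None
    device = match_device(speech, usual_device)
    if device is None:
        return None
    return f"{match_instruction(speech)} -> {device}"
```

For example, `control("turn off", "vp-bob", {"vp-bob": "bedroom air conditioner"})` resolves a phrase that contains only a feature phrase by falling back on the device associated with the speaker.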
This embodiment further includes a voice information acquisition unit and a voice information optimization unit.
The voice information acquisition unit is configured to acquire voice information.
The voice information optimization unit is configured to apply filtering, noise reduction, and gain adjustment to the voice information. This processing purifies the voice information and reduces the influence of background noise, so that clean voice information is obtained and the accuracy of voice control improves.
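The patent does not say which filtering, noise-reduction, or gain-adjustment methods are used. As a minimal sketch, assuming a moving-average filter, a simple noise gate, and peak normalization (all three are stand-ins chosen for illustration, as are the function names and default values), the optimization stage could look like this:

```python
def moving_average(samples, window=3):
    """Simple low-pass filter: average each sample with up to window-1 predecessors."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def noise_gate(samples, threshold=0.05):
    """Crude noise reduction: zero out samples whose magnitude is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize_gain(samples, target_peak=0.9):
    """Gain adjustment: scale the signal so that its peak magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent input, nothing to scale
    scale = target_peak / peak
    return [s * scale for s in samples]


def optimize(samples):
    """Apply filtering, then noise reduction, then gain adjustment."""
    return normalize_gain(noise_gate(moving_average(samples)))
```

A real system would more likely use band-pass filtering and spectral noise suppression; the point here is only the order of the three stages.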
This embodiment further includes a specific phrase judging unit and an extraction subunit.
The specific phrase judging unit is configured to judge whether the voice information contains a specific phrase. A specific phrase takes one of two forms: it contains both an effective phrase and a feature phrase, or it contains only a feature phrase. An effective phrase generally refers to a phrase that gives the voice instruction a single target, such as a device name, "all", or "all devices". A feature phrase generally refers to a phrase that contains an effective action, such as "open", "close", "turn on", or "turn off". For example, a specific phrase may be "turn on the light", "turn on the lights in the first row", "turn on", "turn on all", or "turn off the air conditioner". The corresponding effective phrases are "light", "lights in the first row", "all", and "air conditioner", and the feature phrases are "turn on", "turn on", "turn on", "turn on", and "turn off".
The extraction subunit is configured to intercept the speech segment corresponding to the specific phrase to form the feature speech segment, and to extract the voiceprint of that segment. The extracted voiceprint is used for matching against the feature voiceprint.
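To make the split between effective phrase and feature phrase concrete, here is a small Python sketch that works on a text transcript. It assumes speech has already been converted to text and uses plain substring matching; the phrase tables and the function name `find_specific_phrase` are hypothetical.

```python
# Hypothetical phrase tables. Longer effective phrases come first so that
# "lights in the first row" is preferred over the shorter "light".
FEATURE_PHRASES = ("turn on", "turn off", "open", "close")
EFFECTIVE_PHRASES = ("lights in the first row", "light", "air conditioner", "all")


def find_specific_phrase(text):
    """Return (effective_phrase, feature_phrase), or None if no feature phrase is present.

    The effective phrase is None when the utterance contains only a feature phrase.
    """
    text = text.lower()
    feature = next((p for p in FEATURE_PHRASES if p in text), None)
    if feature is None:
        return None  # without an action there is no specific phrase
    effective = next((p for p in EFFECTIVE_PHRASES if p in text), None)
    return effective, feature
```

Substring matching is deliberately naive (it would, for instance, find "all" inside "install"); a real implementation would match on recognized words or phonemes.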
This embodiment further includes a sound source position acquisition unit and a subordinate device matching unit.
The sound source position acquisition unit is configured to obtain the position of the voice sound source, which is usually derived from information such as the strength of the voice and the differences in the times at which the voice is received.
The subordinate device matching unit is configured to match the subordinate device nearest to the sound source position, where the subordinate device is a single device.
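Once a sound source position is available, choosing the subordinate device reduces to a nearest-neighbour lookup. The sketch below assumes the position has already been estimated as 2-D coordinates and that each device has a known position; the coordinate representation and the function name are assumptions, and the estimation from loudness and arrival-time differences is not shown.

```python
import math


def nearest_device(source_xy, device_positions):
    """Return the name of the single device closest to the sound source.

    device_positions maps a device name to its (x, y) position.
    """
    if not device_positions:
        return None
    return min(device_positions, key=lambda name: math.dist(source_xy, device_positions[name]))
```

With devices at `{"hall light": (0.0, 0.0), "desk lamp": (4.0, 3.0)}`, a speaker at `(3.5, 2.0)` is matched to the desk lamp.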
This embodiment further includes a voiceprint judging unit, a first instruction matching subunit, a second instruction matching subunit, and a voiceprint model generation unit.
The voiceprint judging unit is configured to judge whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint. The feature voiceprint is generally a voiceprint model, and it can match all voiceprints of the corresponding user.
The first instruction matching subunit is configured to match the corresponding voice instruction from the preset feature instruction database according to the feature phrase in the feature speech and the feature voiceprint. Specifically, the user's identity is determined from the feature voiceprint and the user's control intention is determined from the feature phrase, so the corresponding voice instruction can be matched from the preset feature instruction database according to the usage habits of different users.
The second instruction matching subunit is configured to match the corresponding voice instruction from the preset feature instruction database according to the feature phrase in the feature speech and the subordinate device, and to process the voiceprint into a voiceprint model and store it in the feature voiceprint library. The voiceprint model generated from the voiceprint is used to build the feature voiceprint, and the subordinate device is the device nearest to the sound source position. Thus, even if the user has not yet established a feature voiceprint and the user's identity cannot be determined, the corresponding voice instruction can still be matched from the preset feature instruction database using feature speech that contains only a feature phrase, together with the subordinate device.
The voiceprint model generation unit is configured to process the user's voiceprint to generate a voiceprint model, which is used to establish the feature voiceprint.
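The two instruction matching subunits amount to a branch on whether the speaker is known. The following Python sketch illustrates that branch under simplifying assumptions: voiceprints are plain strings looked up by equality rather than by similarity, and the instruction database is keyed by tuples. The key layout, parameter names, and sample data are hypothetical.

```python
def match_instruction(feature_phrase, voiceprint, library, stored_models,
                      instruction_db, subordinate_device):
    """Pick a voice instruction for a known user, else for the nearest device.

    library:        maps a known voiceprint to a user name
    stored_models:  list collecting voiceprints of speakers not yet enrolled
    instruction_db: maps ("user", user, phrase) or ("device", device, phrase)
                    to an instruction
    """
    user = library.get(voiceprint)
    if user is not None:
        # First subunit: identity known, use the user's own habits.
        return instruction_db["user", user, feature_phrase]
    # Second subunit: identity unknown, keep the voiceprint for later
    # modelling and fall back on the device nearest to the speaker.
    stored_models.append(voiceprint)
    return instruction_db["device", subordinate_device, feature_phrase]
```

Each unknown speaker's voiceprint is retained, so repeated use gradually supplies the material from which a feature voiceprint can be built.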
This embodiment further includes a voiceprint modeling unit and a voiceprint comparison unit.
The voiceprint modeling unit is configured to retrieve the corresponding voiceprint models from the feature voiceprint library according to the feature phrase, and to model them to generate the feature voiceprint. The feature voiceprint is generally used to determine user identity.
The voiceprint comparison unit is configured to compare the voiceprint with each feature voiceprint, and to judge whether any feature voiceprint reaches a specified ratio of similarity with the voiceprint. The specified ratio is the minimum similarity at which the voiceprint is considered to match a feature voiceprint.
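The comparison against a specified ratio can be sketched as a thresholded similarity search. The patent does not name a similarity measure, so the example below assumes voiceprints are fixed-length numeric vectors compared by cosine similarity; the vector representation, the 0.8 default, and the function names are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(voiceprint, feature_voiceprints, specified_ratio=0.8):
    """Return the user whose feature voiceprint is most similar, or None if
    no similarity reaches the specified ratio."""
    best_user, best_score = None, specified_ratio
    for user, feature_voiceprint in feature_voiceprints.items():
        score = cosine_similarity(voiceprint, feature_voiceprint)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user
```

Returning `None` below the threshold corresponds to the case in which no matching feature voiceprint exists in the library.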
This embodiment further includes a recording unit and a device sorting unit.
The recording unit is configured to record the number of occurrences of each specific phrase. Each occurrence of a specific phrase is recorded as one use of the corresponding device by the corresponding user. The specific phrase that occurs most often reflects the user's most frequent usage habit, that is, the device the user uses most. The occurrence counts are recorded in order to capture the user's usage habits.
The device sorting unit is configured to sort the corresponding devices in the preset device list according to the number of occurrences of each specific phrase, and to set this order as the matching order used when matching the target device. The device ranked first in the list is the one the user uses most. The most frequently used device can therefore be controlled on the basis of the feature phrase in the user's voice information and the voiceprint.
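Counting phrase occurrences and reordering the device list is straightforward to sketch. In the Python example below, the mapping from specific phrase to device and the sample data are hypothetical; ties keep their original list order because Python's `sorted` is stable.

```python
from collections import Counter


def sort_devices(device_list, phrase_log, phrase_to_device):
    """Order devices by how often their specific phrases were spoken, most used first.

    phrase_log:       specific phrases in the order they were recorded
    phrase_to_device: maps a specific phrase to the device it controls
    """
    counts = Counter(phrase_to_device[p] for p in phrase_log if p in phrase_to_device)
    # A Counter returns 0 for devices that were never used.
    return sorted(device_list, key=lambda device: -counts[device])
```

The first element of the result is the device that would be tried first when a later utterance contains only a feature phrase.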
This embodiment further includes a record storage unit.
The record storage unit is configured to send the record of each specific phrase to the record storage location of the corresponding user, according to the voiceprint carried by that specific phrase. Storing usage records by user makes later lookup easier.
This embodiment further includes a voiceprint template establishing unit, a level setting unit, and a list setting unit.
The voiceprint template establishing unit is configured to obtain the voiceprint of each user's specified phrase or word as a voiceprint template. A user's specified phrase or word generally refers to a specified phrase or word that points to the corresponding device.
The level setting unit is configured to assign a level to each user's voiceprint template. Assigning levels to the voiceprint templates amounts to classifying the users by level.
The list setting unit is configured to set the device list and the permission list corresponding to voiceprint templates of different levels. Different users correspond to different device lists, and a user's permission to use a device is determined according to the permission list.
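The level, device list, and permission list can be pictured as three lookup tables. The sketch below is a hypothetical illustration: the user names, level numbers, devices, and actions are invented, and the patent does not specify how levels map to permissions.

```python
# Hypothetical tables: level per user, then device list and permission list per level.
LEVEL_OF_USER = {"parent": 2, "child": 1}
DEVICES_BY_LEVEL = {1: ["light"], 2: ["light", "oven"]}
PERMISSIONS_BY_LEVEL = {
    1: {"turn on", "turn off"},
    2: {"turn on", "turn off", "set temperature"},
}


def is_allowed(user, device, action):
    """Check that the user's level covers both the device and the requested action."""
    level = LEVEL_OF_USER.get(user)
    if level is None:
        return False  # unknown users have no permissions in this sketch
    return device in DEVICES_BY_LEVEL[level] and action in PERMISSIONS_BY_LEVEL[level]
```

In this arrangement a lower-level user simply never sees devices outside the device list for that level, so a request such as `is_allowed("child", "oven", "turn on")` is rejected.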
With reference to Fig. 9, an embodiment of the present invention also provides a computer device. The computer device 5 takes the form of a general-purpose computing device, and its components may include, but are not limited to: one or more processors or processing units 7, a system memory 12, and a bus 8 connecting the different system components (including the system memory 12 and the processing unit 7).
The bus 8 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 5 typically includes a variety of computer-system-readable media. These may be any available media that the computer device 5 can access, including volatile and non-volatile media, and removable and non-removable media.
The system memory 12 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 13 and/or cache memory 14. The computer device 5 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 15 may be used to read from and write to non-removable, non-volatile magnetic media (commonly called a "hard disk drive"). Although not shown in Fig. 9, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM, or other optical media), may also be provided. In these cases, each drive may be connected to the bus 8 through one or more data media interfaces. The memory may include at least one program product having a set of (for example, at least one) program modules 17 configured to perform the functions of the embodiments of the present invention.
A program/utility 16 having a set of (at least one) program modules 17 may be stored in the memory. Such program modules 17 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 17 generally perform the functions and/or methods of the embodiments described in the present invention.
The computer device 5 may also communicate with one or more external devices 6 (such as a keyboard, a pointing device, a display 11, or a camera), with one or more devices that enable a user to interact with the computer device 5, and/or with any device (such as a network card or modem) that enables the computer device 5 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 10. The computer device 5 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 9. As shown, the network adapter 9 communicates with the other modules of the computer device 5 through the bus 8. It should be understood that, although not shown in Fig. 9, other hardware and/or software modules may be used in conjunction with the computer device 5, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 7 runs the programs stored in the system memory 12 to execute various functional applications and data processing, for example to implement the voice control method provided by the embodiments of the present invention.
That is, when executing the program, the processing unit 7 implements: extracting the feature speech segment from the voice information and the voiceprint of the feature speech segment; matching the corresponding voice instruction from the preset feature instruction database according to the feature speech and the voiceprint; matching the target device from the preset device list according to the feature speech and the voiceprint; and controlling the target device according to the voice instruction.
An embodiment of the present invention also proposes a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the voice control method provided by all embodiments of the present application.
That is, when executed by the processor, the program implements: extracting the feature speech segment from the voice information and the voiceprint of the feature speech segment; matching the corresponding voice instruction from the preset feature instruction database according to the feature speech and the voiceprint; matching the target device from the preset device list according to the feature speech and the voiceprint; and controlling the target device according to the voice instruction.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM) 13, read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A voice control method, characterized by comprising the following steps:
extracting a feature speech segment from voice information and the voiceprint of the feature speech segment;
matching a corresponding voice instruction from a preset feature instruction database according to the feature speech and the voiceprint;
matching a target device from a preset device list according to the feature speech and the voiceprint;
controlling the target device according to the voice instruction.
2. The voice control method according to claim 1, characterized in that the step of extracting the feature speech segment from the voice information and the voiceprint of the feature speech segment comprises:
judging whether the voice information contains a specific phrase;
if so, intercepting the speech segment corresponding to the specific phrase to form the feature speech segment, and extracting the voiceprint of the feature speech segment.
3. The voice control method according to claim 2, characterized in that, before the step of matching the corresponding voice instruction from the preset feature instruction database according to the feature speech and the voiceprint, the method further comprises:
obtaining a voice sound source position;
matching the subordinate device nearest to the sound source position.
4. The voice control method according to claim 3, characterized in that the step of matching the corresponding voice instruction from the preset feature instruction database according to the feature speech and the voiceprint comprises:
judging whether a feature voiceprint library contains a feature voiceprint that matches the voiceprint;
if so, matching the corresponding voice instruction from the preset feature instruction database according to the feature phrase in the feature speech and the feature voiceprint;
if not, matching the corresponding voice instruction from the preset feature instruction database according to the feature phrase in the feature speech and the subordinate device, and processing the voiceprint to generate a voiceprint model and storing it in the feature voiceprint library.
5. The voice control method according to claim 4, characterized in that the step of judging whether the feature voiceprint library contains a feature voiceprint that matches the voiceprint comprises:
retrieving the corresponding voiceprint models from the feature voiceprint library according to the feature phrase, and modeling them to generate the feature voiceprint;
comparing the voiceprint with each feature voiceprint, and judging whether there is a feature voiceprint whose similarity with the voiceprint reaches a specified ratio.
6. The voice control method according to claim 2, characterized in that, after the step of intercepting the speech segment corresponding to the feature phrase to form the feature speech segment and extracting the voiceprint of the feature speech segment, the method further comprises:
recording the number of occurrences of each specific phrase;
sorting the corresponding devices in the preset device list according to the number of occurrences of each specific phrase, and setting that order as the matching order used when matching the target device.
7. The voice control method according to claim 1, characterized in that, before the step of matching the corresponding voice instruction from the preset feature instruction database according to the feature speech and the feature voiceprint, the method further comprises:
establishing a feature voiceprint library, wherein the step of establishing the feature voiceprint library comprises:
obtaining the voiceprint of each user's specified phrase or word as a voiceprint template;
assigning a level to the voiceprint template of each user;
setting the device list and the permission list corresponding to voiceprint templates of different levels.
8. A voice system, characterized by comprising:
an extraction unit, configured to extract a feature speech segment from voice information and the voiceprint of the feature speech segment;
an instruction matching unit, configured to match a corresponding voice instruction from a preset feature instruction database according to the feature speech and the voiceprint;
a target device matching unit, configured to match a target device from a preset device list according to the feature speech and the voiceprint;
a control unit, configured to control the target device according to the voice instruction.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any one of claims 1 to 7 when executing the program.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program implements the method according to any one of claims 1 to 7 when executed by a processor.
CN201810361415.XA 2018-04-20 2018-04-20 Sound control method, voice system, equipment and storage medium Pending CN108682414A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810361415.XA CN108682414A (en) 2018-04-20 2018-04-20 Sound control method, voice system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810361415.XA CN108682414A (en) 2018-04-20 2018-04-20 Sound control method, voice system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN108682414A true CN108682414A (en) 2018-10-19

Family

ID=63801530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810361415.XA Pending CN108682414A (en) 2018-04-20 2018-04-20 Sound control method, voice system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108682414A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030195751A1 (en) * 2002-04-10 2003-10-16 Mitsubishi Electric Research Laboratories, Inc. Distributed automatic speech recognition with persistent user parameters
CN103730120A (en) * 2013-12-27 2014-04-16 深圳市亚略特生物识别科技有限公司 Voice control method and system for electronic device
CN105242556A (en) * 2015-10-28 2016-01-13 小米科技有限责任公司 A speech control method and device of intelligent devices, a control device and the intelligent device
CN105444332A (en) * 2014-08-19 2016-03-30 青岛海尔智能家电科技有限公司 Equipment voice control method and device
US20170124311A1 (en) * 2015-03-20 2017-05-04 Baidu Online Network Technology (Beijing) Co., Ltd. Voiceprint login method and apparatus based on artificial intelligence
CN107452386A (en) * 2017-08-16 2017-12-08 联想(北京)有限公司 A kind of voice data processing method and system
CN107767875A (en) * 2017-10-17 2018-03-06 深圳市沃特沃德股份有限公司 Sound control method, device and terminal device

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109119078A (en) * 2018-10-26 2019-01-01 北京石头世纪科技有限公司 Automatic robot's control method, device, automatic robot and medium
CN109389978B (en) * 2018-11-05 2020-11-03 珠海格力电器股份有限公司 Voice recognition method and device
CN109389978A (en) * 2018-11-05 2019-02-26 珠海格力电器股份有限公司 Voice recognition method and device
CN110139146A (en) * 2019-04-03 2019-08-16 深圳康佳电子科技有限公司 Speech recognition anti-interference method, device and storage medium based on Application on Voiceprint Recognition
CN110232916A (en) * 2019-05-10 2019-09-13 平安科技(深圳)有限公司 Method of speech processing, device, computer equipment and storage medium
CN110188171A (en) * 2019-05-30 2019-08-30 上海联影医疗科技有限公司 A kind of voice search method, device, electronic equipment and storage medium
CN110310657A (en) * 2019-07-10 2019-10-08 北京猎户星空科技有限公司 A kind of audio data processing method and device
CN110310657B (en) * 2019-07-10 2022-02-08 北京猎户星空科技有限公司 Audio data processing method and device
CN110570120A (en) * 2019-09-06 2019-12-13 Oppo(重庆)智能科技有限公司 ERP intelligent ordering method, device, system and storage medium
CN110543129A (en) * 2019-09-30 2019-12-06 深圳市酷开网络科技有限公司 intelligent electric appliance control method, intelligent electric appliance control system and storage medium
CN111261163A (en) * 2020-03-27 2020-06-09 四川虹美智能科技有限公司 Voice control method and system and intelligent air conditioner
CN112863011A (en) * 2020-12-31 2021-05-28 金茂智慧科技(广州)有限公司 Processing method, device, medium and terminal equipment for preventing parking space occupation
CN112802482A (en) * 2021-04-15 2021-05-14 北京远鉴信息技术有限公司 Voiceprint serial-parallel identification method, individual soldier system and storage medium
CN112802482B (en) * 2021-04-15 2021-07-23 北京远鉴信息技术有限公司 Voiceprint serial-parallel identification method, individual soldier system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20230120