CN110113319A - Identity identifying method, device, computer equipment and storage medium - Google Patents
- Publication number
- CN110113319A (application CN201910304302.0A)
- Authority
- CN
- China
- Prior art keywords
- lip
- recognition
- user
- network model
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0861—Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0876—Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Signal Processing (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Computer Networks & Wireless Communication (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Geometry (AREA)
- Power Engineering (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Collating Specific Patterns (AREA)
Abstract
This application relates to the field of artificial intelligence and is applied to the financial industry. It concerns an identity authentication method and apparatus, a computer device, and a storage medium. The method in one embodiment includes: obtaining an authentication request from a user to be authenticated, the request carrying the user identifier of the user to be authenticated and an authentication video of that user; performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence; performing feature extraction on the lip image sequence to obtain a feature vector of the sequence, and using that feature vector as lip reading feature authentication information; looking up preset user registration information according to the user identifier to obtain the lip reading feature registration information corresponding to that identifier; and, when the lip reading feature authentication information matches the lip reading feature registration information, obtaining an authentication result indicating that the user to be authenticated has passed authentication. In this way, identity authentication can be performed by combining a user identifier with lip reading recognition, which can effectively improve the security of identity authentication.
Description
Technical field
This application relates to the field of artificial intelligence, and in particular to an identity authentication method and apparatus, a computer device, and a storage medium.
Background technique
With the development of science and technology, people's lives have changed dramatically: mobile Internet devices have become a necessity of daily life, and personal identity information plays an important role in Internet technology.
Traditional identity authentication methods, such as confirming identity by a password, passphrase, or certificate, suffer from verification information that is easily stolen or impersonated. In other words, traditional identity authentication methods offer low security.
Summary of the invention
Based on this, in view of the above technical problems, it is necessary to provide an identity authentication method and apparatus, a computer device, and a storage medium that can improve the security of identity authentication.
An identity authentication method, the method comprising:
obtaining an authentication request from a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
performing feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, and using the feature vector as lip reading feature authentication information;
looking up preset user registration information according to the user identifier to obtain lip reading feature registration information corresponding to the user identifier; and
when the lip reading feature authentication information matches the lip reading feature registration information, obtaining an authentication result indicating that the user to be authenticated has passed authentication.
In one embodiment, performing feature extraction on the lip image sequence to obtain the feature vector of the lip image sequence comprises:
dividing the video block corresponding to the lip image sequence into multiple sub-video blocks;
passing each sub-video block through a first independent subspace analysis to extract feature vectors, obtaining a first feature vector set;
performing dimensionality reduction on the feature vectors in the first feature vector set, and feeding the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and
obtaining the feature vector of the lip image sequence from the first feature vector set and the second feature vector set.
In one embodiment, before performing feature extraction on the lip image sequence, the method further comprises:
obtaining a video sample and a recurrent neural network model, the video sample including lip reading images and lip reading recognition results corresponding to the lip reading images; and
training the recurrent neural network model using the lip reading images as the input of the model and the lip reading recognition results as the output of the model, to obtain a trained recurrent neural network model.
In this embodiment, performing feature extraction on the lip image sequence comprises:
performing feature extraction on the lip image sequence through the trained recurrent neural network model.
In one embodiment, before training the recurrent neural network model with the lip reading images as its input and the lip reading recognition results as its output, the method further comprises:
performing image processing on the video sample to obtain syllable adhesion data.
Training the recurrent neural network model then comprises:
training the model using both the lip reading images and the syllable adhesion data as its input and the lip reading recognition results as its output, to obtain a trained recurrent neural network model.
In one embodiment, performing image processing on the video sample to obtain the syllable adhesion data comprises:
converting the video sample into a mouth picture sequence;
performing mouth shape recognition on the mouth picture sequence to obtain a mouth shape sequence; and
converting the mouth shape sequence into a syllable sequence, and obtaining the syllable adhesion data based on a preset semantic base and the syllable sequence.
In one embodiment, obtaining the authentication result when the lip reading feature authentication information matches the lip reading feature registration information comprises:
obtaining the similarity between the lip reading feature authentication information and the lip reading feature registration information; and
when the similarity between the lip reading feature authentication information and the lip reading feature registration information is greater than a preset threshold, obtaining an authentication result indicating that the user to be authenticated has passed authentication.
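The threshold comparison above can be sketched in a few lines. Cosine similarity and the 0.9 threshold are illustrative assumptions; the patent requires only that some similarity measure exceed a preset threshold.

```python
import numpy as np

def passes_authentication(auth_vec, reg_vec, threshold=0.9):
    """Compare lip reading feature vectors by cosine similarity; the user
    passes authentication only when the similarity between the authentication
    information and the registration information exceeds the preset threshold."""
    a = np.asarray(auth_vec, dtype=float)
    r = np.asarray(reg_vec, dtype=float)
    sim = a @ r / (np.linalg.norm(a) * np.linalg.norm(r))
    return sim > threshold
```

Identical vectors give similarity 1.0 and pass; orthogonal vectors give similarity 0.0 and fail.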
An identity authentication apparatus, the apparatus comprising:
an authentication request obtaining module, configured to obtain an authentication request from a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
an image sequence extraction module, configured to perform face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
a feature vector obtaining module, configured to perform feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, and to use the feature vector as lip reading feature authentication information;
a registration information obtaining module, configured to look up preset user registration information according to the user identifier and obtain lip reading feature registration information corresponding to the user identifier; and
an authentication module, configured to obtain an authentication result indicating that the user to be authenticated has passed authentication when the lip reading feature authentication information matches the lip reading feature registration information.
A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, performs the following steps:
obtaining an authentication request from a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
performing feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, and using the feature vector as lip reading feature authentication information;
looking up preset user registration information according to the user identifier to obtain lip reading feature registration information corresponding to the user identifier; and
when the lip reading feature authentication information matches the lip reading feature registration information, obtaining an authentication result indicating that the user to be authenticated has passed authentication.
A computer readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
obtaining an authentication request from a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
performing feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, and using the feature vector as lip reading feature authentication information;
looking up preset user registration information according to the user identifier to obtain lip reading feature registration information corresponding to the user identifier; and
when the lip reading feature authentication information matches the lip reading feature registration information, obtaining an authentication result indicating that the user to be authenticated has passed authentication.
With the above identity authentication method, apparatus, computer device, and storage medium, an authentication request from a user to be authenticated is obtained, the request carrying the user identifier and authentication video of the user; face detection and facial feature point detection are performed on the authentication video to obtain a lip image sequence; feature extraction on the lip image sequence yields a feature vector, which serves as lip reading feature authentication information; preset user registration information is looked up according to the user identifier to obtain the corresponding lip reading feature registration information; and when the authentication information matches the registration information, an authentication result indicating that the user has passed authentication is obtained. Identity authentication is thus performed by combining a user identifier with lip reading recognition, which can effectively improve the security of identity authentication. Moreover, by first obtaining a lip image sequence from the authentication video and then extracting the feature vector from that sequence, the accuracy of the obtained lip reading features can be improved, further increasing the security of identity authentication.
Brief description of the drawings
Fig. 1 is a flow diagram of the identity authentication method in one embodiment;
Fig. 2 is a flow diagram of the feature vector generation step for the lip image sequence in one embodiment;
Fig. 3 is a flow diagram of the identity authentication method in another embodiment;
Fig. 4 is a flow diagram of the syllable adhesion data generation step in one embodiment;
Fig. 5 is a flow diagram of the user authentication process in one embodiment;
Fig. 6 is a structural block diagram of the identity authentication apparatus in one embodiment;
Fig. 7 is an internal structure diagram of the computer device in one embodiment.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of this application clearer, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the application, not to limit it.
In one embodiment, as shown in Fig. 1, an identity authentication method is provided, comprising the following steps:
Step 102: obtain an authentication request from a user to be authenticated, the authentication request carrying the user identifier and authentication video of the user to be authenticated.
An authentication request is a request, made by a user to be authenticated, for identity authentication by a certification authority. A user identifier is a mark that distinguishes different users; each user to be authenticated corresponds to a unique identifier, for example the numbers A, B, C, and so on can distinguish different users to be authenticated. An authentication video is the voice and image information collected on site, in real time, while the user is being authenticated, for example a video of the user reading a passphrase aloud.
Step 104: perform face detection and facial feature point detection on the authentication video to obtain a lip image sequence.
Face detection means searching a given image to determine whether it contains a face and, if so, returning the position, size, and pose of that face. Face detection can be realized by feature-based methods, template-based methods, and methods based on statistical theory. Specifically, feature-based methods perform face detection through rules derived from prior knowledge of faces, using edge and shape features, texture features, color features, and so on. Template-based methods perform face detection by computing the correlation between a face template and the image under test, for example separately computing the correlation between features such as the eyes, nose, and mouth in the image and a face template, and using the magnitude of the correlation to decide whether a face is present. Methods based on statistical theory use statistical analysis and machine learning to find the features of face and non-face samples, and build a classifier from the found features to realize face detection. Facial feature points in the detected face image can be detected by cascaded regression, and the lip region is extracted according to the positions of the feature points. Cascaded regression fits a complex non-linear relationship with a series of weak regressors, learning a regression function that maps directly to the detection result.
Through face detection and facial feature point detection, the lip region is extracted from the video frames, locating the lips of the face. For example, the two mouth-corner key points can be used to compute the translation and rotation factors relative to a standard mouth. Using the inter-eye distance as a reference, the lips of different people are transformed to the same scale by a scale parameter. From the translation and rotation factors and the scale parameter, an aligned lip image sequence is obtained. Lip alignment eliminates inconsistencies in lip position, angle, and scale across frames. Performing lip reading recognition on the aligned lip image sequence not only captures the relative change information of the speaking mouth, but also normalizes the differing mouths of different speakers, reducing the influence on lip reading recognition of inconsistent lip position, angle, and scale.
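The alignment step above (translation from the mouth center, rotation from the mouth-corner axis, scale from the eye spacing) can be sketched as a similarity transform. The canonical constant `std_eye_dist` is an illustrative assumption, not a value given by the patent.

```python
import numpy as np

def align_lip_points(points, left_corner, right_corner, eye_dist, std_eye_dist=4.0):
    """Map lip key points into a canonical frame: translate the mouth center
    to the origin, rotate so the mouth-corner axis is horizontal, and rescale
    so the inter-eye distance matches a standard value."""
    points = np.asarray(points, dtype=float)
    lc = np.asarray(left_corner, dtype=float)
    rc = np.asarray(right_corner, dtype=float)
    center = (lc + rc) / 2.0                 # translation factor
    v = rc - lc
    angle = np.arctan2(v[1], v[0])           # rotation factor
    c, s = np.cos(-angle), np.sin(-angle)
    R = np.array([[c, -s], [s, c]])
    scale = std_eye_dist / eye_dist          # scale parameter from eye spacing
    return (points - center) @ R.T * scale
```

Applying this per frame yields a lip image sequence whose position, angle, and scale are consistent across frames and speakers.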
Step 106: perform feature extraction on the lip image sequence to obtain its feature vector, and use the feature vector as lip reading feature authentication information.
A lip image sequence captures the continuous mouth shape changes of a speaker while speaking. Lip reading recognition is a technique that identifies speech content directly from images of a person speaking: faces are continuously recognized in the authentication video to determine who is speaking, the continuous mouth shape change features of that person are extracted, the continuously changing features are fed into a lip reading recognition model, the pronunciation corresponding to the speaker's mouth shapes is recognized, and the lip reading recognition result is then obtained from the recognized pronunciation. Lip reading feature authentication information is the lip reading recognition result obtained by parsing the authentication video while authenticating the user.
Step 108: look up preset user registration information according to the user identifier, and obtain the lip reading feature registration information corresponding to the user identifier.
At registration time, the user identifier and a registration video are collected, and the identifier is stored in association with the video. A registration video is a video of the user reading aloud a custom or preset passphrase. The registration video is parsed to obtain the corresponding lip reading feature registration information, and the user identifier is stored in association with that information as the user registration information. Lip reading feature registration information is the lip reading recognition result obtained by parsing the registration video at registration time.
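The associative storage and lookup described above reduce to a keyed store. The structure and function names below are illustrative assumptions, not the patent's implementation.

```python
# Registration-time storage and authentication-time lookup of user registration info.
user_registry = {}

def register(user_id, lip_feature_registration_info):
    # At registration, the user identifier is stored in association with the
    # lip reading feature registration information parsed from the registration video.
    user_registry[user_id] = lip_feature_registration_info

def lookup_registration(user_id):
    # At authentication, the preset registration information is looked up by
    # user identifier; None means no such user has registered.
    return user_registry.get(user_id)
```

In practice this store would be a database table keyed by user identifier; a dict suffices to show the association.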
Step 110: when the lip reading feature authentication information matches the lip reading feature registration information, obtain an authentication result indicating that the user to be authenticated has passed authentication.
For example, user A records a registration video at registration time; the collected registration video is parsed and the lip reading recognition result is 201810, so the lip reading feature registration information is 201810. When user A undergoes identity authentication, an authentication video is collected and parsed, and the resulting lip reading recognition result is 201810, so the lip reading feature authentication information is 201810. A's lip reading feature authentication information matches the lip reading feature registration information, so user A obtains an authentication result of having passed authentication.
With the above identity authentication method, an authentication request is obtained from the user to be authenticated, carrying that user's identifier and authentication video; face detection and facial feature point detection are performed on the authentication video to obtain a lip image sequence; feature extraction on the lip image sequence yields a feature vector, which serves as lip reading feature authentication information; preset user registration information is looked up by user identifier to obtain the corresponding lip reading feature registration information; and when the authentication information matches the registration information, the user obtains an authentication result of having passed authentication. Identity authentication by user identifier combined with lip reading recognition can effectively improve the security of identity authentication. Moreover, obtaining a lip image sequence from the authentication video and then extracting the feature vector from that sequence improves the accuracy of lip reading recognition, further increasing the security of identity authentication.
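Steps 102 through 110 can be sketched end to end, with each stage injected as a callable. All function and field names here are illustrative assumptions.

```python
def authenticate(request, extract_lip_sequence, extract_features, registry, matches):
    """End-to-end flow of steps 102-110 with pluggable stages."""
    user_id = request["user_id"]           # step 102: user identifier
    video = request["video"]               # step 102: authentication video
    lip_seq = extract_lip_sequence(video)  # step 104: face + feature point detection
    auth_info = extract_features(lip_seq)  # step 106: lip reading feature auth info
    reg_info = registry.get(user_id)       # step 108: look up registration info
    # step 110: pass only when authentication info matches registration info
    return reg_info is not None and matches(auth_info, reg_info)
```

With identity stubs for the extraction stages and equality matching, a request whose video parses to the registered passphrase passes, and an unknown user fails.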
In one embodiment, as shown in Fig. 2, performing feature extraction on the lip image sequence to obtain its feature vector comprises: Step 202, dividing the video block corresponding to the lip image sequence into multiple sub-video blocks; Step 204, passing each sub-video block through a first independent subspace analysis (ISA) to extract feature vectors, obtaining a first feature vector set; Step 206, performing dimensionality reduction on the feature vectors in the first feature vector set and feeding the reduced vectors into a second independent subspace analysis, obtaining a second feature vector set; Step 208, obtaining the feature vector of the lip image sequence from the first and second feature vector sets. From the lip region image sequence, the feature vector of the images is obtained as the lip reading feature information. Since a lip reading video contains both image and motion information, a feature vector is extracted from a sub-image sequence formed by multiple consecutive frames. Independent subspace analysis is an unsupervised feature learning method that can learn phase-invariant features from images. The first ISA layer extracts features of moving edges in the video sequence; the data it extracts is reduced in dimensionality by principal component analysis, which speeds up computation; the second ISA layer operates on this lower-dimensional input and extracts more abstract features. Concretely, feature extraction can be realized by a stacked convolutional ISA network, built by stacking independent subspace analysis and principal component analysis layer by layer. When computing video features, the pixels in a small video block are first flattened into a vector and fed into the first ISA layer; then the first-layer outputs of adjacent video blocks in a larger region are concatenated and, after PCA dimensionality reduction, fed into the second ISA layer; and so on. Finally, the outputs of every ISA layer are concatenated into one vector as the feature vector of the video block. To obtain the feature vector, the video block formed by multiple consecutive lip reading images is divided into multiple small video blocks, each small block is fed into the stacked convolutional ISA network to extract feature vectors, and finally the feature vectors of the small blocks, together with the PCA-processed feature vectors, are concatenated into the final feature vector.
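The data flow of the stacked pipeline above can be sketched in NumPy. The ISA layers here are stood in for by fixed random projections with squared responses (learning real ISA filters is out of scope), so this shows only the split / layer 1 / PCA / layer 2 / concatenate structure, not ISA training.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_reduce(X, k):
    # Principal component analysis: project rows of X onto the top-k
    # eigenvectors of the covariance matrix to reduce dimensionality.
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ vecs[:, np.argsort(vals)[::-1][:k]]

def video_block_feature(block, sub_shape=(4, 8, 8), k=16):
    """block: (T, H, W) lip image sub-sequence. Split into sub-video blocks,
    pass each through layer-1 'ISA' (a random projection stand-in), PCA-reduce
    the layer-1 outputs, pass through layer-2, and concatenate both layers'
    outputs into the final feature vector."""
    t, h, w = sub_shape
    subs = [block[i:i + t, j:j + h, l:l + w].ravel()
            for i in range(0, block.shape[0] - t + 1, t)
            for j in range(0, block.shape[1] - h + 1, h)
            for l in range(0, block.shape[2] - w + 1, w)]
    X = np.stack(subs)                   # one row per sub-video block
    W1 = rng.standard_normal((X.shape[1], 32))
    layer1 = np.square(X @ W1)           # ISA-style squared filter responses
    reduced = pca_reduce(layer1, k)      # PCA before the second layer
    W2 = rng.standard_normal((k, 8))
    layer2 = np.square(reduced @ W2)
    return np.concatenate([layer1.ravel(), layer2.ravel()])
```

For an 8×16×16 block split into 4×8×8 sub-blocks, this yields 8 sub-blocks and a final vector concatenating both layers' outputs.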
In one embodiment, as shown in Fig. 3, before feature extraction on the lip image sequence, the method further comprises: Step 302, obtaining a video sample and a recurrent neural network model, the video sample including lip reading images and the lip reading recognition results corresponding to them; Step 304, training the recurrent neural network model with the lip reading images as input and the lip reading recognition results as output, obtaining a trained recurrent neural network model. Performing feature extraction on the lip image sequence then comprises: Step 306, performing feature extraction on the lip image sequence through the trained recurrent neural network model. An authentication video is composed of consecutive frames, i.e. it is sequence data, and sequence data has temporal correlation: earlier frames have a significant effect on later frames, and later frames can likewise affect earlier ones. When processing the current information, a recurrent neural network takes into account the information that came before; it can carry all known information prior to the current step, the recurrence meaning that earlier information is continually used to help interpret later information. Lip reading recognition is realized through the trained recurrent neural network model, which captures how the lip shape changes over time in order to predict characters. For example, images of a speaker reading the Arabic numeral 0 aloud are collected as lip reading images, and the corresponding lip reading recognition result is the numeral 0. The images of the speaker reading 0 are used as the input of the recurrent neural network, and the recognition result 0 as its standard output; the recurrent neural network model is trained by comparing its actual output with the standard output and adjusting the training weights, until the difference between the actual output and the standard output meets a preset condition, yielding the trained recurrent neural network model.
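The training scheme just described — feed the sequence in, compare actual output with the standard output, adjust the weights until a preset condition is met — can be illustrated with a tiny scalar recurrent cell trained by backpropagation through time. This is a toy stand-in, not the patent's model: scalars replace lip images.

```python
import numpy as np

class TinyRNN:
    """Minimal Elman-style recurrent cell with scalar weights."""
    def __init__(self, seed=0):
        r = np.random.default_rng(seed)
        self.wx, self.wh, self.wo = r.normal(0, 0.5, 3)

    def forward(self, xs):
        self.hs = [0.0]
        for x in xs:  # recurrence: each step carries information from earlier steps
            self.hs.append(np.tanh(self.wx * x + self.wh * self.hs[-1]))
        return self.wo * self.hs[-1]

    def step(self, xs, target, lr):
        """One sample: compute actual output, compare with the standard output
        (target), and adjust the weights by BPTT. Returns the squared error."""
        y = self.forward(xs)
        dy = 2.0 * (y - target)
        dwo, dh = dy * self.hs[-1], dy * self.wo
        dwx = dwh = 0.0
        for t in range(len(xs), 0, -1):   # backpropagation through time
            da = dh * (1.0 - self.hs[t] ** 2)
            dwx += da * xs[t - 1]
            dwh += da * self.hs[t - 1]
            dh = da * self.wh
        self.wx -= lr * dwx
        self.wh -= lr * dwh
        self.wo -= lr * dwo
        return (y - target) ** 2
```

Repeating `step` over labelled sequences drives the actual output toward the standard output; training stops when the loss meets the preset condition.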
In one embodiment, before taking the lip-reading images as the input of the recurrent neural network model and the lip-reading recognition results as its output and training the model to obtain the trained recurrent neural network model, the method further includes: performing image processing on the video samples to obtain syllable adhesion data. Training the recurrent neural network model then comprises taking both the lip-reading images and the syllable adhesion data as the input of the model, with the lip-reading recognition results as its output, to obtain the trained recurrent neural network model. Adding training on syllable adhesion at fast speaking rates makes lip-reading recognition more accurate. For example, more video training data of fast speech can be added so that the recurrent neural network model learns the lip-shape changes that occur at a fast speaking rate, and the video frame rate can be increased to reduce the difficulty of training the model.
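The frame-rate reinforcement mentioned above can be illustrated with a minimal sketch, under the assumption that each frame has already been reduced to a feature vector: linearly interpolating a midpoint between consecutive frames doubles the effective frame rate, so fast-speech mouth movements change less between adjacent inputs. The frame contents here are invented stand-ins.

```python
def upsample_2x(frames):
    """Insert the linear midpoint between every pair of consecutive frames."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append([(x + y) / 2.0 for x, y in zip(a, b)])  # interpolated frame
    out.append(frames[-1])
    return out

# Three sparse frames of a fast utterance become five denser frames.
fast = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
dense = upsample_2x(fast)
print(len(dense), dense[1])
```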
In one embodiment, as shown in Figure 4, performing image processing on the video samples to obtain syllable adhesion data includes: step 402, converting the video samples into mouth-shape picture sequences; step 404, performing mouth-shape recognition on the mouth-shape picture sequences to obtain mouth-shape sequences; and step 406, converting the mouth-shape sequences into syllable sequences and obtaining the syllable adhesion data based on a preset semantic base and the syllable sequences. The video samples are converted into mouth-shape picture sequences by image processing; the pictures are grayscaled and filtered, then binarized, so that feature vectors can be extracted. Mouth-shape templates are determined empirically from mouth shapes in everyday speech; feature vectors extracted from the picture sequences are matched against these templates to obtain the mouth-shape sequences. The semantic base can be a mouth-shape template library built with Chinese pinyin as its basic unit, containing the mouth-shape pictures and multi-dimensional vector parameters for the pronunciation of every pinyin letter. The syllable sequences are converted into words by efficient combination of the syllables, and the semantic constraints of the semantic base, that is, the grammar and syntactic relations of the natural language and the dependencies between the syllables of Chinese characters, guide this combination. Learning the associations between preceding and following syllables reduces cases of misrecognition as isolated single syllables.
In one embodiment, as shown in Figure 5, when the lip-reading feature authentication information matches the lip-reading feature registration information, obtaining the authentication result that the user to be authenticated has passed authentication includes: step 502, obtaining the similarity between the lip-reading feature authentication information and the lip-reading feature registration information; and step 504, when that similarity is greater than a preset threshold, obtaining the authentication result that the user to be authenticated has passed authentication. The authentication information and the registration information are each converted into feature vectors, and their similarity can be computed with the Euclidean distance or the cosine similarity. The Euclidean distance is the actual distance between two points in m-dimensional space, or the natural length of a vector; in two- and three-dimensional space it is simply the real distance between the two points. The cosine similarity assesses the similarity of two vectors by computing the cosine of the angle between them.
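The two measures named above can be sketched directly. The feature vectors and the threshold value below are illustrative assumptions, not values fixed by this description.

```python
import math

def euclidean_distance(a, b):
    # actual distance between two points in m-dimensional space
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # cosine of the angle between the two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

auth = [0.9, 0.1, 0.4]   # lip-reading feature authentication information
reg = [0.8, 0.2, 0.5]    # lip-reading feature registration information

sim = cosine_similarity(auth, reg)
THRESHOLD = 0.95         # illustrative preset threshold
print(f"cosine={sim:.4f}, euclidean={euclidean_distance(auth, reg):.4f}",
      "authenticated" if sim > THRESHOLD else "rejected")
```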
It should be understood that although the steps in the flowcharts of Figures 1-5 are displayed in the order indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figures 1-5 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; nor must these sub-steps or stages be performed sequentially, as they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Figure 6, an identity authentication apparatus is provided, comprising: an authentication request obtaining module 602, an image sequence extraction module 604, a feature vector obtaining module 606, a registration information obtaining module 608 and an authentication module 610. The authentication request obtaining module obtains the authentication request of a user to be authenticated, the request carrying the user identifier and the authentication video of that user. The image sequence extraction module performs face detection and facial feature point detection on the authentication video to obtain a lip image sequence. The feature vector obtaining module performs feature extraction on the lip image sequence to obtain its feature vector, which serves as the lip-reading feature authentication information. The registration information obtaining module looks up preset user registration information according to the user identifier to obtain the corresponding lip-reading feature registration information. The authentication module obtains, when the lip-reading feature authentication information matches the lip-reading feature registration information, the authentication result that the user to be authenticated has passed authentication.
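How the five modules above cooperate can be illustrated with a minimal sketch. Every name here is a stand-in invented for the example, the feature extractor is a trivial placeholder for face detection plus lip feature extraction, and matching is reduced to a per-component tolerance; none of this is the patented implementation itself.

```python
def extract_lip_features(video):
    # stands in for: face detection -> facial feature points -> lip image
    # sequence -> feature vector (the image sequence + feature vector modules)
    return [round(sum(frame) / len(frame), 3) for frame in video]

registry = {"user-42": [0.5, 1.0]}  # user identifier -> registration info

def authenticate(request):
    user_id, video = request["user_id"], request["video"]
    auth_info = extract_lip_features(video)   # feature vector obtaining module
    reg_info = registry.get(user_id)          # registration info obtaining module
    matched = reg_info is not None and all(   # authentication module
        abs(a - b) < 0.1 for a, b in zip(auth_info, reg_info))
    return {"user_id": user_id, "authenticated": matched}

result = authenticate({"user_id": "user-42",
                       "video": [[0.4, 0.6], [0.9, 1.1]]})
print(result)
```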
In one embodiment, the feature vector obtaining module includes: a video block division unit, for dividing the video block corresponding to the lip image sequence into multiple sub-video blocks; a vector extraction unit, for performing feature vector extraction on each sub-video block through a first independent subspace analysis to obtain a first feature vector set; a dimensionality reduction unit, for performing dimensionality reduction on the feature vectors in the first feature vector set and inputting the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and an output unit, for obtaining the feature vector of the lip image sequence according to the first feature vector set and the second feature vector set.
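The data flow through those four units can be traced with a hedged sketch. True independent subspace analysis requires learned filters; each stage below is replaced by a toy projection so that only the pipeline shape (sub-blocks, first feature set, dimensionality reduction, second feature set, final vector) is demonstrated.

```python
def split_into_subblocks(video_block, size):
    # video block division unit
    return [video_block[i:i + size] for i in range(0, len(video_block), size)]

def first_stage(subblock):
    # vector extraction unit: stand-in for the first independent subspace analysis
    return [sum(frame) for frame in subblock]

def reduce_dim(vec, k):
    # dimensionality reduction unit: crude reduction to k pooled sums
    step = max(1, len(vec) // k)
    return [sum(vec[i:i + step]) for i in range(0, len(vec), step)][:k]

def second_stage(vec):
    # stand-in for the second independent subspace analysis
    return [v / (1 + abs(v)) for v in vec]

video_block = [[1, 2], [3, 4], [5, 6], [7, 8]]   # 4 frames, 2 pixels each
first_set = [first_stage(b) for b in split_into_subblocks(video_block, 2)]
second_set = [second_stage(reduce_dim(v, 1)) for v in first_set]
# output unit: combine both feature sets into the final feature vector
feature_vector = ([x for v in first_set for x in v] +
                  [x for v in second_set for x in v])
print(first_set, second_set, feature_vector)
```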
In one embodiment, the apparatus further includes, before the feature vector obtaining module, a model obtaining module for obtaining video samples and a recurrent neural network model, the video samples including lip-reading images and lip-reading recognition results corresponding to the lip-reading images; and a model training module for training the recurrent neural network model with the lip-reading images as its input and the lip-reading recognition results as its output, obtaining the trained recurrent neural network model. The feature vector obtaining module is further configured to perform feature extraction on the lip image sequence through the trained recurrent neural network model.
In one embodiment, the apparatus further includes, before the model training module, a syllable adhesion obtaining module for performing image processing on the video samples to obtain syllable adhesion data. The model training module is then configured to train the recurrent neural network model with the lip-reading images and the syllable adhesion data as its input and the lip-reading recognition results as its output, obtaining the trained recurrent neural network model.
In one embodiment, the syllable adhesion obtaining module includes: a first conversion unit, for converting the video samples into mouth-shape picture sequences; a mouth-shape recognition unit, for performing mouth-shape recognition on the picture sequences to obtain mouth-shape sequences; and a second conversion unit, for converting the mouth-shape sequences into syllable sequences and obtaining syllable adhesion data based on a preset semantic base and the syllable sequences.
In one embodiment, the authentication module includes a similarity obtaining unit, for obtaining the similarity between the lip-reading feature authentication information and the lip-reading feature registration information; and a similarity comparison unit, for obtaining the authentication result that the user to be authenticated has passed authentication when that similarity is greater than a preset threshold.
For the specific limitations of the identity authentication apparatus, refer to the limitations of the identity authentication method above; they are not repeated here. Each module of the above identity authentication apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, a processor in a computer device, or they may be stored in software form in a memory of the computer device, so that the processor can invoke them to perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure may be as shown in Figure 7. The computer device includes a processor, a memory, a network interface, a display screen and an input apparatus connected by a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device communicates with external terminals through a network connection. The computer program, when executed by the processor, implements an identity authentication method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input apparatus may be a touch layer covering the display screen, a key, trackball or touchpad arranged on the housing of the computer device, or an external keyboard, touchpad or mouse.
Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of the part of the structure relevant to the solution of this application, and does not limit the computer device to which this solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. When executing the computer program, the processor performs the steps of: obtaining the authentication request of a user to be authenticated, the request carrying the user identifier and the authentication video of that user; performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence; performing feature extraction on the lip image sequence to obtain its feature vector, which serves as the lip-reading feature authentication information; looking up preset user registration information according to the user identifier to obtain the corresponding lip-reading feature registration information; and, when the lip-reading feature authentication information matches the lip-reading feature registration information, obtaining the authentication result that the user to be authenticated has passed authentication.
In one embodiment, when executing the computer program, the processor further performs the steps of: dividing the video block corresponding to the lip image sequence into multiple sub-video blocks; performing feature vector extraction on each sub-video block through a first independent subspace analysis to obtain a first feature vector set; performing dimensionality reduction on the feature vectors in the first feature vector set and inputting the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and obtaining the feature vector of the lip image sequence according to the first feature vector set and the second feature vector set.
In one embodiment, when executing the computer program, the processor further performs the steps of: obtaining video samples and a recurrent neural network model, the video samples including lip-reading images and corresponding lip-reading recognition results; training the recurrent neural network model with the lip-reading images as its input and the lip-reading recognition results as its output, to obtain the trained recurrent neural network model; and performing feature extraction on the lip image sequence through the trained recurrent neural network model.
In one embodiment, when executing the computer program, the processor further performs the steps of: performing image processing on the video samples to obtain syllable adhesion data; and training the recurrent neural network model with the lip-reading images and the syllable adhesion data as its input and the lip-reading recognition results as its output, obtaining the trained recurrent neural network model.
In one embodiment, when executing the computer program, the processor further performs the steps of: converting the video samples into mouth-shape picture sequences; performing mouth-shape recognition on the picture sequences to obtain mouth-shape sequences; and converting the mouth-shape sequences into syllable sequences and obtaining syllable adhesion data based on a preset semantic base and the syllable sequences.
In one embodiment, when executing the computer program, the processor further performs the steps of: obtaining the similarity between the lip-reading feature authentication information and the lip-reading feature registration information; and, when that similarity is greater than a preset threshold, obtaining the authentication result that the user to be authenticated has passed authentication.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program performs the steps of: obtaining the authentication request of a user to be authenticated, the request carrying the user identifier and the authentication video of that user; performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence; performing feature extraction on the lip image sequence to obtain its feature vector, which serves as the lip-reading feature authentication information; looking up preset user registration information according to the user identifier to obtain the corresponding lip-reading feature registration information; and, when the lip-reading feature authentication information matches the lip-reading feature registration information, obtaining the authentication result that the user to be authenticated has passed authentication.
In one embodiment, when executed by the processor, the computer program further performs the steps of: dividing the video block corresponding to the lip image sequence into multiple sub-video blocks; performing feature vector extraction on each sub-video block through a first independent subspace analysis to obtain a first feature vector set; performing dimensionality reduction on the feature vectors in the first feature vector set and inputting the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and obtaining the feature vector of the lip image sequence according to the first feature vector set and the second feature vector set.
In one embodiment, when executed by the processor, the computer program further performs the steps of: obtaining video samples and a recurrent neural network model, the video samples including lip-reading images and corresponding lip-reading recognition results; training the recurrent neural network model with the lip-reading images as its input and the lip-reading recognition results as its output, to obtain the trained recurrent neural network model; and performing feature extraction on the lip image sequence through the trained recurrent neural network model.
In one embodiment, when executed by the processor, the computer program further performs the steps of: performing image processing on the video samples to obtain syllable adhesion data; and training the recurrent neural network model with the lip-reading images and the syllable adhesion data as its input and the lip-reading recognition results as its output, obtaining the trained recurrent neural network model.
In one embodiment, when executed by the processor, the computer program further performs the steps of: converting the video samples into mouth-shape picture sequences; performing mouth-shape recognition on the picture sequences to obtain mouth-shape sequences; and converting the mouth-shape sequences into syllable sequences and obtaining syllable adhesion data based on a preset semantic base and the syllable sequences.
In one embodiment, when executed by the processor, the computer program further performs the steps of: obtaining the similarity between the lip-reading feature authentication information and the lip-reading feature registration information; and, when that similarity is greater than a preset threshold, obtaining the authentication result that the user to be authenticated has passed authentication.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of each of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of those technical features are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of this application, and although their description is specific and detailed, they should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.
Claims (10)
1. An identity authentication method, the method comprising:
obtaining an authentication request of a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
performing face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
performing feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, the feature vector serving as lip-reading feature authentication information;
looking up preset user registration information according to the user identifier to obtain lip-reading feature registration information corresponding to the user identifier; and
when the lip-reading feature authentication information matches the lip-reading feature registration information, obtaining an authentication result that the user to be authenticated has passed authentication.
2. The method according to claim 1, wherein performing feature extraction on the lip image sequence to obtain the feature vector of the lip image sequence comprises:
dividing a video block corresponding to the lip image sequence into a plurality of sub-video blocks; performing feature vector extraction on each sub-video block through a first independent subspace analysis to obtain a first feature vector set;
performing dimensionality reduction on the feature vectors in the first feature vector set and inputting the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and
obtaining the feature vector of the lip image sequence according to the first feature vector set and the second feature vector set.
3. The method according to claim 1, wherein before performing feature extraction on the lip image sequence, the method further comprises:
obtaining video samples and a recurrent neural network model, the video samples comprising lip-reading images and lip-reading recognition results corresponding to the lip-reading images; and
training the recurrent neural network model with the lip-reading images as the input of the recurrent neural network model and the lip-reading recognition results as the output of the recurrent neural network model, to obtain a trained recurrent neural network model;
and wherein performing feature extraction on the lip image sequence comprises:
performing feature extraction on the lip image sequence through the trained recurrent neural network model.
4. The method according to claim 3, wherein before training the recurrent neural network model with the lip-reading images as its input and the lip-reading recognition results as its output to obtain the trained recurrent neural network model, the method further comprises:
performing image processing on the video samples to obtain syllable adhesion data;
and wherein training the recurrent neural network model with the lip-reading images as its input and the lip-reading recognition results as its output to obtain the trained recurrent neural network model comprises:
training the recurrent neural network model with the lip-reading images and the syllable adhesion data as the input of the recurrent neural network model and the lip-reading recognition results as the output of the recurrent neural network model, to obtain the trained recurrent neural network model.
5. The method according to claim 4, wherein performing image processing on the video samples to obtain the syllable adhesion data comprises:
converting the video samples into mouth-shape picture sequences;
performing mouth-shape recognition on the mouth-shape picture sequences to obtain mouth-shape sequences; and
converting the mouth-shape sequences into syllable sequences, and obtaining the syllable adhesion data based on a preset semantic base and the syllable sequences.
6. The method according to claim 1, wherein when the lip-reading feature authentication information matches the lip-reading feature registration information, obtaining the authentication result that the user to be authenticated has passed authentication comprises:
obtaining a similarity between the lip-reading feature authentication information and the lip-reading feature registration information; and
when the similarity between the lip-reading feature authentication information and the lip-reading feature registration information is greater than a preset threshold, obtaining the authentication result that the user to be authenticated has passed authentication.
7. An identity authentication apparatus, wherein the apparatus comprises:
an authentication request obtaining module, configured to obtain an authentication request of a user to be authenticated, the authentication request carrying a user identifier of the user to be authenticated and an authentication video of the user to be authenticated;
an image sequence extraction module, configured to perform face detection and facial feature point detection on the authentication video to obtain a lip image sequence;
a feature vector obtaining module, configured to perform feature extraction on the lip image sequence to obtain a feature vector of the lip image sequence, the feature vector serving as lip-reading feature authentication information;
a registration information obtaining module, configured to look up preset user registration information according to the user identifier to obtain lip-reading feature registration information corresponding to the user identifier; and
an authentication module, configured to obtain, when the lip-reading feature authentication information matches the lip-reading feature registration information, an authentication result that the user to be authenticated has passed authentication.
8. The apparatus according to claim 7, wherein the feature vector obtaining module comprises:
a video block division unit, configured to divide a video block corresponding to the lip image sequence into a plurality of sub-video blocks;
a vector extraction unit, configured to perform feature vector extraction on each sub-video block through a first independent subspace analysis to obtain a first feature vector set;
a dimensionality reduction unit, configured to perform dimensionality reduction on the feature vectors in the first feature vector set and input the reduced feature vectors into a second independent subspace analysis to obtain a second feature vector set; and
an output unit, configured to obtain the feature vector of the lip image sequence according to the first feature vector set and the second feature vector set.
9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910304302.0A CN110113319A (en) | 2019-04-16 | 2019-04-16 | Identity identifying method, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110113319A true CN110113319A (en) | 2019-08-09 |
Family
ID=67485500
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910304302.0A Pending CN110113319A (en) | 2019-04-16 | 2019-04-16 | Identity identifying method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110113319A (en) |
Worldwide Applications (1)
Filing Date | Country | Application Number | Status |
---|---|---|---|
2019-04-16 | CN | CN201910304302.0A (CN110113319A) | Pending |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016201679A1 (en) * | 2015-06-18 | 2016-12-22 | 华为技术有限公司 | Feature extraction method, lip-reading classification method, device and apparatus |
CN107404381A (en) * | 2016-05-19 | 2017-11-28 | 阿里巴巴集团控股有限公司 | Identity authentication method and device |
CN106778496A (en) * | 2016-11-22 | 2017-05-31 | 重庆中科云丛科技有限公司 | Liveness detection method and device |
CN108491808A (en) * | 2018-03-28 | 2018-09-04 | 百度在线网络技术(北京)有限公司 | Method and device for obtaining information |
CN109409195A (en) * | 2018-08-30 | 2019-03-01 | 华侨大学 | Neural-network-based lip-reading recognition method and system |
Non-Patent Citations (1)
Title |
---|
Yang Longsheng et al.: "Lip-reading Recognition for Reliable Identity Authentication", Video Engineering (《电视技术》) * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110955874A (en) * | 2019-10-12 | 2020-04-03 | 深圳壹账通智能科技有限公司 | Identity authentication method, identity authentication device, computer equipment and storage medium |
CN111242029A (en) * | 2020-01-13 | 2020-06-05 | 湖南世优电气股份有限公司 | Device control method, device, computer device and storage medium |
CN111507165A (en) * | 2020-01-21 | 2020-08-07 | 北京明略软件系统有限公司 | Face recognition method and device, electronic equipment and computer readable storage medium |
CN111523405A (en) * | 2020-04-08 | 2020-08-11 | 绍兴埃瓦科技有限公司 | Face recognition method and system and electronic equipment |
CN111666820A (en) * | 2020-05-11 | 2020-09-15 | 北京中广上洋科技股份有限公司 | Speaking state recognition method and device, storage medium and terminal |
CN111666820B (en) * | 2020-05-11 | 2023-06-20 | 北京中广上洋科技股份有限公司 | Speech state recognition method and device, storage medium and terminal |
CN112055013A (en) * | 2020-09-01 | 2020-12-08 | 好活(昆山)网络科技有限公司 | Automatic authentication method, device, equipment and storage medium |
CN112132095A (en) * | 2020-09-30 | 2020-12-25 | Oppo广东移动通信有限公司 | Dangerous state identification method and device, electronic equipment and storage medium |
CN112132095B (en) * | 2020-09-30 | 2024-02-09 | Oppo广东移动通信有限公司 | Dangerous state identification method and device, electronic equipment and storage medium |
CN112417497A (en) * | 2020-11-11 | 2021-02-26 | 北京邮电大学 | Privacy protection method and device, electronic equipment and storage medium |
CN112417497B (en) * | 2020-11-11 | 2023-04-25 | 北京邮电大学 | Privacy protection method, device, electronic equipment and storage medium |
CN113314125A (en) * | 2021-05-28 | 2021-08-27 | 深圳市展拓电子技术有限公司 | Voiceprint identification method, system and memory for monitoring room interphone |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110113319A (en) | 2019-08-09 | Identity identifying method, device, computer equipment and storage medium |
Yang et al. | Preventing deepfake attacks on speaker authentication by dynamic lip movement analysis | |
EP3477519B1 (en) | Identity authentication method, terminal device, and computer-readable storage medium | |
Gao et al. | Discriminative multiple canonical correlation analysis for information fusion | |
CN104361276B (en) | Multi-modal biometric identity authentication method and system | |
US6219639B1 (en) | Method and apparatus for recognizing identity of individuals employing synchronized biometrics | |
US20210327431A1 (en) | 'liveness' detection system | |
Tolosana et al. | Preprocessing and feature selection for improved sensor interoperability in online biometric signature verification | |
Revett | Behavioral biometrics: a remote access approach | |
WO2016172872A1 (en) | Method and device for verifying real human face, and computer program product | |
CN106778450B (en) | Face recognition method and device | |
CN111079791A (en) | Face recognition method, face recognition device and computer-readable storage medium | |
JP7412496B2 (en) | Living body (liveness) detection verification method, living body detection verification system, recording medium, and training method for living body detection verification system | |
Mandalapu et al. | Audio-visual biometric recognition and presentation attack detection: A comprehensive survey | |
Saeed | New directions in behavioral biometrics | |
Cheng et al. | Visual speaker authentication with random prompt texts by a dual-task CNN framework | |
CN115511704B (en) | Virtual customer service generation method and device, electronic equipment and storage medium | |
Fernández | New methodology for the integration of biometric features in speaker recognition systems applied to security environments | |
CN116883900A (en) | Video authenticity identification method and system based on multidimensional biological characteristics | |
Uzan et al. | I know that voice: Identifying the voice actor behind the voice | |
Baró et al. | Biometric tools for learner identity in e-assessment | |
Carneiro et al. | Whose emotion matters? Speaking activity localisation without prior knowledge | |
Chen et al. | An Identity Authentication Method Based on Multi-modal Feature Fusion | |
Czyz et al. | Scalability analysis of audio-visual person identity verification | |
US12045333B1 (en) | Method and a device for user verification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190809 |