
WO2016165614A1 - Method for expression recognition in instant video and electronic equipment - Google Patents

Method for expression recognition in instant video and electronic equipment

Info

Publication number
WO2016165614A1
WO2016165614A1 (PCT/CN2016/079115)
Authority
WO
WIPO (PCT)
Prior art keywords
feature point
point coordinate
face
feature
instant video
Prior art date
Application number
PCT/CN2016/079115
Other languages
French (fr)
Chinese (zh)
Inventor
武俊敏
Original Assignee
美国掌赢信息科技有限公司
武俊敏
Priority date
Filing date
Publication date
Application filed by 美国掌赢信息科技有限公司, 武俊敏
Publication of WO2016165614A1 publication Critical patent/WO2016165614A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Definitions

  • the present invention relates to the field of video, and in particular, to an expression recognition method and an electronic device in an instant video.
  • With the popularity of instant video applications on mobile terminals, more and more users interact with others through instant video applications. An expression recognition method for instant video is therefore needed to satisfy users' personalized needs when they interact with others through an instant video application, and to improve the user experience in such interactive scenarios.
  • The prior art provides an expression recognition method that acquires the current frame picture to be recognized from a pre-recorded video, identifies the facial expression in that frame picture, and repeats these steps on the other frame pictures, thereby identifying the facial expressions in the video's frame pictures.
  • However, this method cannot recognize the facial expressions in an instant video in real time, and because it occupies a large amount of the device's processing and storage resources, it places high demands on the device. It therefore cannot be applied to mobile terminals such as smart phones and tablets, cannot meet users' diverse needs, and degrades the user experience.
  • To meet users' diverse needs and improve the user experience, the embodiments of the present invention provide an expression recognition method for instant video and an electronic device. The technical solution is as follows:
  • In a first aspect, an expression recognition method in an instant video is provided, the method comprising: acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face; identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
  • The feature vector includes feature point coordinates and texture feature point coordinates under a standard pose matrix, and the texture feature points are used to uniquely determine the feature points.
  • Acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame includes: acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix; and generating, according to those coordinates, the feature vector corresponding to the at least one feature point.
  • Acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix includes: acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and normalizing the at least one feature point to obtain the coordinates under the standard pose matrix.
  • Normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix includes: acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point; and rotating the current pose matrix into the standard pose matrix and acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
  • Identifying the feature vector corresponding to the at least one feature point includes: inputting the feature vector into a preset expression model library for calculation and obtaining a calculation result in at least one preset expression model, the calculation result representing the recognition result.
  • Determining, according to the recognition result, that the current expression is one of the plurality of pre-stored expressions includes: if the recognition result is within a preset range, determining that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
  • In a second aspect, an electronic device is provided, the electronic device comprising:
  • An acquiring module configured to acquire a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe a current expression of the face;
  • An identification module configured to identify a feature vector corresponding to the at least one feature point, and generate a recognition result
  • a determining module configured to determine, according to the recognition result, the current expression as one of a plurality of pre-stored expressions.
  • the acquiring module is further configured to acquire the at least one feature point coordinate and the at least one texture feature point coordinate in the standard pose matrix;
  • the identification module is further configured to generate a feature vector corresponding to the at least one feature point according to the at least one feature point coordinate and the at least one texture feature point coordinate in the standard pose matrix.
  • the acquiring module is further configured to acquire the at least one feature point coordinate of the face in the instant video frame and the at least one texture feature point coordinate;
  • the device further includes a processing module, configured to perform normalization processing on the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate in the standard pose matrix.
  • the acquiring module is further configured to acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point;
  • the processing module is further configured to rotate the current pose matrix into a standard pose matrix, and acquire the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
  • the device further includes:
  • a calculation module configured to input a feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtain the recognition result.
  • the determining module is specifically configured to:
  • the recognition result is within the preset range, it is determined that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
  • In a third aspect, an electronic device is provided, including a video input module, a video output module, a sending module, a receiving module, a memory, and a processor connected to the video input module, the video output module, the sending module, the receiving module, and the memory, wherein the memory stores a set of program code and the processor is configured to invoke the program code stored in the memory to perform the following operations:
  • acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face; identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
  • the processor is further configured to invoke program code stored in the memory, and perform the following operations:
  • acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix, and generating, according to those coordinates, the feature vector corresponding to the at least one feature point.
  • The processor is further configured to invoke the program code stored in the memory and perform the following operations: acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, and normalizing the at least one feature point to obtain the coordinates under the standard pose matrix.
  • The processor is further configured to invoke the program code stored in the memory and perform the following operations: acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point, and rotating the current pose matrix into a standard pose matrix to acquire the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
  • The processor is further configured to invoke the program code stored in the memory and perform the following operation: inputting the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtaining the recognition result.
  • The processor is further configured to invoke the program code stored in the memory and perform the following operation: if the recognition result is within a preset range, determining that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
  • An embodiment of the present invention provides an expression recognition method for instant video and an electronic device. The method includes: acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face; identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
  • By acquiring feature points that describe the current expression of the face in the instant video, the feature vector obtained from those feature points represents the current expression more accurately. Recognizing the expression from this feature vector simplifies the face recognition algorithm for instant video, so the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
  • FIG. 1 is a schematic diagram of an interaction system according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an interaction system according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an interaction system according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of an expression recognition method in an instant video according to an embodiment of the present invention;
  • FIG. 5 is a flowchart of an expression recognition method in an instant video according to an embodiment of the present invention;
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • An embodiment of the present invention provides an expression recognition method for instant video, applied to an interactive system including at least two mobile terminals and a server. A mobile terminal can run an instant video program, and a user can interact with others by running the instant video program on the mobile terminal.
  • the mobile terminal may be a smart phone, a tablet computer, or another mobile terminal.
  • the specific mobile terminal is not limited in the embodiment of the present invention.
  • The mobile terminal includes at least a video input module and a video display module; the video input module may include a camera and the video display module may include a display screen. The instant video program can implement instant video input by controlling the mobile terminal's video input module, and can display instant video by controlling the video display module.
  • This interactive system is shown in FIG. 1, in which the mobile terminal 1 is the instant video sender, the mobile terminal 2 is the instant video receiver, and the instant video sent by the mobile terminal 1 is forwarded to the mobile terminal 2 via the server;
  • the user of the mobile terminal 1 and the user of the mobile terminal 2 can interact through the interactive system.
  • The execution body of the method provided by this embodiment of the present invention may be any one of the mobile terminal 1, the mobile terminal 2, and the server. If the execution body is the mobile terminal 1, then after receiving the instant video input through its own video input module, the mobile terminal 1 performs expression recognition on the face in the instant video, forwards the recognition result to the mobile terminal 2 via the server, and/or outputs the recognition result through its own display screen. If the execution body is the server, then after the mobile terminal 1 and/or the mobile terminal 2 input the instant video through their own video input modules, the instant video is sent to the server, the server recognizes the facial expression in the instant video, and the recognition result is sent to the mobile terminal 1 and/or the mobile terminal 2. If the execution body is the mobile terminal 2, then the mobile terminal 1, after inputting the instant video through its own video input module, sends the instant video to the server, the server sends the instant video to the mobile terminal 2, and the mobile terminal 2 performs expression recognition on the face in the instant video, forwards the recognition result to the mobile terminal 1 via the server, and/or outputs the recognition result through its own display screen.
  • the specific implementation body of the method in the interaction system is not limited in the embodiment of the present invention.
  • The method provided by this embodiment of the present invention can also be applied to an interactive system including only the mobile terminal 1 and the mobile terminal 2. This interactive system is shown in FIG. 2; the mobile terminals in it are the same as those in the interactive system shown in FIG. 1, and details are not described here again.
  • The execution body of the method may be either the mobile terminal 1 or the mobile terminal 2. If the execution body is the mobile terminal 1, then after inputting the instant video through its own video input module, the mobile terminal 1 performs expression recognition on the face in the instant video, transmits the recognition result to the mobile terminal 2, and/or outputs the recognition result through its own display screen. If the execution body is the mobile terminal 2, then the mobile terminal 1, after inputting the instant video through its own video input module, sends the instant video to the mobile terminal 2, and the mobile terminal 2 performs expression recognition on the face in the instant video, transmits the recognition result to the mobile terminal 1, and/or outputs the recognition result through its own display screen.
  • the specific implementation body of the method in the interaction system is not limited in the embodiment of the present invention.
  • The method provided by this embodiment of the present invention can also be applied to an interactive system including only the mobile terminal 1 and its user. This interactive system is shown in FIG. 3. The mobile terminal 1 includes at least a video input module and a video display module; the video input module may include a camera and the video display module may include a display screen. At least one instant video program can run on the mobile terminal, controlling the mobile terminal's video input module and video display module to carry out instant video. The mobile terminal receives the instant video input by the user, performs facial expression recognition on the instant video, and outputs the recognition result through its own display screen.
  • the mobile terminal in the embodiment of the present invention may be one or multiple, and the specific mobile terminal is not limited in the embodiment of the present invention.
  • embodiment of the present invention may further include other application scenarios, and the specific application scenario is not limited in the embodiment of the present invention.
  • An embodiment of the present invention provides an expression recognition method in an instant video. As shown in FIG. 4, the method includes:
  • the feature vector includes feature point coordinates and texture feature point coordinates under a standard pose matrix, and the texture feature points are used to uniquely determine feature points.
  • acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame includes:
  • the process of obtaining at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix may be:
  • the at least one feature point is normalized, and at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix are obtained.
  • the process of normalizing at least one feature point and acquiring at least one texture feature point coordinate of at least one feature point under the standard pose matrix may be:
  • Rotating the current pose matrix into a standard pose matrix and acquiring at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix.
  • the feature vector corresponding to the at least one feature point is input into the preset expression model library for calculation, and the recognition result is obtained.
  • the recognition result is within the preset range, it is determined that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
  • The embodiments of the present invention provide an expression recognition method for instant video and an electronic device. By acquiring feature points that describe the current expression of a face in the instant video, the feature vector obtained from those feature points represents the current expression more accurately; recognizing the expression from this feature vector simplifies the face recognition algorithm for instant video, so the method can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
  • An embodiment of the present invention provides an expression recognition method in an instant video. Referring to FIG. 5, the method flow includes:
  • the at least one feature point is used to describe a current expression of a face in an instant video.
  • the at least one feature point is used to describe the outline of the face detail, and the face detail includes at least the eyes, the mouth, the eyebrows, and the nose.
  • the manner in which the face feature points are obtained is not limited in the embodiment of the present invention.
  • The feature parameter may include the coordinates of the feature point within the region that includes at least the face, and may further include the scale and direction of the vector indicated by the feature point within that region.
  • a texture feature point is acquired near each feature point, and the texture feature point is used to uniquely determine the feature point, and the texture feature point does not change with changes in light, angle, and the like.
  • Feature points and texture feature points can be extracted from the face by a preset extraction model or extraction algorithm, or by other means; the specific extraction model, extraction algorithm, and extraction method are not limited in this embodiment of the present invention.
  • Because a texture feature point describes the region where its feature point is located, the texture feature point can be used to uniquely determine the feature point. The face detail is thus determined from both the feature point and the texture feature point, which guarantees that a feature point in the instant video stays at the same position as the actual feature point, ensures the recognition quality of the image details, and thereby improves the reliability of the expression recognition, as sketched below.
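  • A minimal illustrative sketch of acquiring texture feature points near each feature point, in Python with NumPy; the landmark input, the helper name sample_texture_points, and the four-point sampling pattern are assumptions for illustration, since the patent does not name a specific extraction model or algorithm:

```python
import numpy as np

def sample_texture_points(image, feature_points, offsets=None):
    """For each facial feature point, sample nearby texture feature points.

    `image` is a 2-D grayscale array; `feature_points` is an (N, 2) array of
    (x, y) landmark coordinates from any face-alignment detector (the patent
    does not prescribe one). The sampling offsets are illustrative.
    """
    if offsets is None:
        # Four texture points around each landmark; spacing is an assumption.
        offsets = np.array([[-3, 0], [3, 0], [0, -3], [0, 3]])
    h, w = image.shape
    texture_points = []
    for x, y in feature_points:
        for dx, dy in offsets:
            # Clamp to the image bounds so border landmarks stay valid.
            tx = int(np.clip(x + dx, 0, w - 1))
            ty = int(np.clip(y + dy, 0, h - 1))
            texture_points.append((tx, ty))
    return np.array(texture_points)
```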
  • The pose matrix is used to indicate the scale and direction of the vector indicated by the three-dimensional coordinates of the feature point and of the texture feature points corresponding to the feature point.
  • The normalization process may be as follows: because the scale and direction used here are those in two-dimensional coordinates, a preset conversion algorithm converts the coordinates, scales, and directions corresponding to the at least one feature point and to the texture feature points of each feature point into two-dimensional coordinates, yielding the coordinates, scales, and directions of the at least one feature point and of each feature point's texture feature points in two dimensions; the specific algorithm and conversion mode are not limited in this embodiment of the present invention.
  • step c can also be implemented in the following manner:
  • the current pose matrix is used to indicate the scale and direction of the vector indicated by the feature point;
  • The embodiment of the present invention does not limit the specific manner of rotating the current pose matrix into a standard pose matrix.
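  • Because the rotation method is left open, the following Python sketch shows one conventional choice, a Kabsch (orthogonal Procrustes) alignment of the current 3-D feature points to a standard frontal template; this is an assumed stand-in, not the claimed implementation:

```python
import numpy as np

def rotate_to_standard_pose(points, standard_points):
    """Rotate current 3-D feature/texture points into the standard pose.

    `points` and `standard_points` are (N, 3) arrays of corresponding
    coordinates; the standard template plays the role of the standard pose
    matrix. Uses the Kabsch algorithm (one conventional choice).
    """
    # Center both point sets so only a rotation remains to be estimated.
    p = points - points.mean(axis=0)
    q = standard_points - standard_points.mean(axis=0)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (rotation @ p.T).T  # coordinates under the standard pose
```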
  • Steps 502 to 503 normalize the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix; other methods may also be adopted, and the specific manner is not limited in this embodiment of the present invention.
  • This embodiment of the present invention normalizes the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video, so that the acquired pose matrix is unaffected by, for example, illumination and viewing-angle changes. Compared with traditional expression recognition, expression recognition in the instant video is thus invariant to pose and scale changes, making the recognition more accurate.
  • Steps 501 to 503 are the process of acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix; the process may also be implemented in other manners, which are not limited in this embodiment of the present invention.
  • Because the at least one feature point and the at least one texture feature point are acquired under the standard pose matrix, the influence of external factors such as illumination and angle on the face in the instant video is excluded, so the acquired feature points and texture feature points are more comparable, making expression recognition in the instant video more accurate.
  • The pose matrix indicates the direction and scale of the feature points, so the at least one feature point coordinate and the at least one texture feature point coordinate corresponding to the at least one feature point may be acquired according to the standard pose matrix.
  • This embodiment of the present invention does not limit the manner in which the feature vector corresponding to the at least one feature point is generated from the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
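  • Since the generation manner is not limited, one simple assumed construction is to concatenate the normalized coordinates into the feature vector x; the flattening order below is illustrative:

```python
import numpy as np

def build_feature_vector(feature_coords, texture_coords):
    """Concatenate feature point and texture feature point coordinates.

    Both inputs are coordinate arrays under the standard pose matrix; the
    result is the feature vector x fed to the expression models.
    """
    return np.concatenate([feature_coords.ravel(), texture_coords.ravel()])
```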
  • Steps 501 to 504 are the process of acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame; the process may also be implemented in other manners, which are not limited in this embodiment of the present invention.
  • the feature vector is input into a preset expression model corresponding to each expression for calculation.
  • The preset expression model can be a regression equation in which A is the regression coefficient, x is the feature vector, and y is the recognition result.
  • The result y is calculated from the feature vector in the preset expression model corresponding to each expression, giving the recognition result in at least one preset expression model.
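  • A hedged sketch of this scoring step: each pre-stored expression model is reduced to a coefficient vector A, and y = 1/(1 + e^(-A·x)) is assumed because the training step below refers to a logistic regression equation; the exact formula is not reproduced in the text, and the 0.5 threshold standing in for the "preset range" is likewise an assumption:

```python
import numpy as np

def recognize(x, expression_models, threshold=0.5):
    """Score feature vector x against each pre-stored expression model.

    `expression_models` maps an expression name to its regression
    coefficient vector A. Returns (expression, score) when the best score
    falls within the preset range, else None.
    """
    best_name, best_y = None, -np.inf
    for name, A in expression_models.items():
        y = 1.0 / (1.0 + np.exp(-np.dot(A, x)))  # recognition result y
        if y > best_y:
            best_name, best_y = name, y
    if best_y >= threshold:  # stand-in for the "preset range" check
        return best_name, best_y
    return None
```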
  • This step identifies the feature vector corresponding to the at least one feature point and generates the recognition result; the process may also be implemented in other manners, which are not limited in this embodiment of the present invention.
  • the current expression is determined to be one of a plurality of pre-stored expressions according to the y value included in the recognition result of the feature vector in the preset expression model corresponding to each expression.
  • Step 506 is the process of determining, according to the recognition result, that the current expression is one of the plurality of pre-stored expressions; the process may also be implemented in other manners, which are not limited in this embodiment of the present invention.
  • the method process further includes:
  • During the instant video, the number n of frames used for recognizing the expression is determined, the scores of each acquired expression over those n frames are summed, and the expression with the highest summed score is taken as the recognized expression over the n frames, where n is an integer greater than or equal to 2.
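  • A small sketch of this multi-frame vote, assuming each frame has already produced a per-expression score dictionary (the data layout is an assumption):

```python
from collections import defaultdict

def recognize_over_frames(frame_scores):
    """Combine per-frame results over n >= 2 instant video frames.

    `frame_scores` is a list of {expression: score} dicts, one per frame;
    the expression with the highest summed score across the n frames is
    taken as the recognized expression.
    """
    totals = defaultdict(float)
    for scores in frame_scores:
        for expression, score in scores.items():
            totals[expression] += score
    return max(totals, key=totals.get)
```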
  • Because the facial expression in an instant video changes constantly, at least one recognition result is generated by recognizing the facial expression in two or more instant video frames, and the facial expression in the instant video is then determined from these recognition results. Compared with generating a recognition result from a single frame and determining the expression from it alone, this makes the recognition result more accurate, further improves the reliability of expression recognition, and improves the user experience.
  • the method process further includes:
  • The model of each expression is trained separately: the preset expression to be established is taken as the positive sample, the other preset expressions are used as negative samples, and the logistic regression equation indicated in step 505 is used for training. The process may be: the expression to be trained is taken as the positive sample, for which the output result y is 1, and the other expressions are used as negative samples, for which the output result y is 0.
  • The process of acquiring the parameter A in the logistic regression equation can be: all of the users' instant expressions acquired in the instant video are input into a preset optimization formula to generate the parameter A, where J(A) represents the cost of the parameter A, y_i is the value predicted by the prediction function, and y_i' is the true value.
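  • A sketch of this one-vs-rest training step, reading the preset optimization formula as a squared-error cost J(A) = sum_i (y_i - y_i')^2 minimized by gradient descent; both the cost form and the optimizer are assumptions, since the text names only the terms of the formula:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_expression_model(X, labels, lr=0.1, epochs=200):
    """Train one expression model one-vs-rest.

    Rows of X are feature vectors; `labels` holds y' = 1 for samples of the
    expression being trained (positive) and y' = 0 for the other expressions
    (negative). Minimizes J(A) = sum_i (y_i - y_i')^2 by gradient descent,
    an assumed reading of the optimization formula.
    """
    A = np.zeros(X.shape[1])
    for _ in range(epochs):
        y = sigmoid(X @ A)                             # predicted values y_i
        residual = y - labels                          # y_i - y_i'
        grad = X.T @ (2.0 * residual * y * (1.0 - y))  # dJ/dA by chain rule
        A -= lr * grad / len(labels)
    return A
```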
  • Step 508 does not need to be performed every time steps 501 to 506 are performed.
  • The embodiments of the present invention provide an expression recognition method for instant video and an electronic device. The feature vector obtained from the feature points represents the current expression of the face more accurately, and obtaining the recognition result from this feature vector simplifies the face recognition algorithm for instant video, so the method can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
  • Because a texture feature point describes the region where its feature point is located, the texture feature point can be used to uniquely determine the feature point, so the face detail is determined from both the feature point and the texture feature point. This guarantees that a feature point in the instant video stays at the same position as the actual feature point, ensures the recognition quality of the image details, reduces distortion in image processing, and increases the reliability of both image processing and expression recognition.
  • The acquired pose matrix is unaffected by, for example, illumination and viewing-angle changes, so expression recognition in the instant video is invariant to pose and scale changes, making it more accurate. Because the at least one feature point and the at least one texture feature point are acquired under the standard pose matrix, the influence of external factors such as illumination and angle on the face in the instant video is excluded, so the acquired feature points and texture feature points are more comparable, making expression recognition in the instant video more accurate.
  • The computational complexity is reduced, so the face is recognized faster during the instant video, the occupation of system processes, processing resources, and storage resources is reduced, and the operating efficiency of the processor is improved.
  • An embodiment of the present invention provides an electronic device 6.
  • the electronic device 6 includes:
  • the obtaining module 61 is configured to acquire a feature vector corresponding to at least one feature point of the face in an instant video frame, where the feature points are used to describe the current expression of the face;
  • the identification module 62 is configured to identify a feature vector corresponding to the at least one feature point, and generate a recognition result
  • the determining module 63 is configured to determine, according to the recognition result, that the current expression is one of a plurality of expressions stored in advance.
  • the obtaining module 61 is further configured to acquire at least one feature point coordinate and at least one texture feature point coordinate in the standard pose matrix;
  • the identification module 62 is further configured to generate a feature vector corresponding to the at least one feature point according to the at least one feature point coordinate and the at least one texture feature point coordinate in the standard pose matrix.
  • the obtaining module 61 is further configured to: acquire at least one feature point coordinate of the face in the instant video frame and at least one texture feature point coordinate;
  • the device further includes a processing module, configured to normalize the at least one feature point to obtain at least one feature point coordinate and at least one texture feature point coordinate in the standard pose matrix.
  • the obtaining module 61 is further configured to acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point;
  • the processing module is further configured to rotate the current pose matrix into a standard pose matrix, and acquire at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix.
  • the electronic device 6 further includes:
  • the calculation module is configured to input the feature vector corresponding to the at least one feature point into the preset expression model library for calculation, and obtain the recognition result.
  • the determining module 63 is specifically configured to:
  • the recognition result is within the preset range, it is determined that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
  • An embodiment of the present invention provides an electronic device that acquires, in an instant video, feature points describing the current expression of a face, so that the feature vector obtained from the feature points represents the current expression more accurately. The recognition result is then obtained by identifying the feature vector, which simplifies the face recognition algorithm for instant video, enables the method provided by this embodiment of the present invention to run on a mobile terminal, meets users' diverse needs, and improves the user experience.
  • The electronic device 7 includes a video input module 71, a video output module 72, a sending module 73, a receiving module 74, a memory 75, and a processor 76 connected to the video input module 71, the video output module 72, the sending module 73, the receiving module 74, and the memory 75, wherein the memory 75 stores a set of program code and the processor 76 is configured to call the program code stored in the memory 75 to perform the following operations: acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face; identifying the feature vector to generate a recognition result; and determining, according to the recognition result, that the current expression is one of a plurality of expressions stored in advance.
  • The processor 76 is further configured to call the program code stored in the memory 75 and perform the following operations: acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix, and generating, according to those coordinates, the feature vector corresponding to the at least one feature point.
  • The processor 76 is further configured to call the program code stored in the memory 75 and perform the following operations: acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, and normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
  • the processor 76 is configured to call the program code stored in the memory 75, and perform the following operations:
  • Rotating the current pose matrix into a standard pose matrix and acquiring at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix.
  • the processor 76 is configured to call the program code stored in the memory 75, and perform the following operations:
  • the feature vector corresponding to the at least one feature point is input into the preset expression model library for calculation, and the recognition result is obtained.
  • the processor 76 is configured to call the program code stored in the memory 75, and perform the following operations:
  • the recognition result is within the preset range, it is determined that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
  • An embodiment of the present invention provides an electronic device that acquires feature points describing the current expression of a face in an instant video, so that the feature vector obtained from the feature points represents the current expression of the face more accurately. Identifying the feature vector and obtaining the recognition result from it simplifies the face recognition algorithm for instant video, so the method provided by this embodiment of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
  • The electronic device provided by the foregoing embodiments is illustrated only by the division of the above functional modules; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the electronic device may be divided into different functional modules to complete all or part of the functions described above. In addition, the electronic device of the foregoing embodiments belongs to the same concept as the method embodiments; its specific implementation process is described in detail in the method embodiments and is not repeated here.
  • The storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

A method for expression recognition in an instant video belongs to the field of videos. The method comprises: acquiring a characteristic vector corresponding to at least one characteristic point of a human face in an instant video frame, the characteristic point being used to describe a current expression of the human face (401); recognizing the characteristic vector corresponding to the at least one characteristic point to generate a recognition result (402); and determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions (403). In the method, human facial expressions in an instant video are recognized according to characteristic vectors, so that the diversified demands of users are met and the user experience is improved.

Description

Expression recognition method and electronic device in instant video

Technical Field

The present invention relates to the field of video, and in particular to an expression recognition method and an electronic device for instant video.

Background Art

With the popularity of instant video applications on mobile terminals, more and more users interact with others through instant video applications. An expression recognition method for instant video is therefore needed to satisfy users' personalized needs when they interact with others through an instant video application, and to improve the user experience in such interactive scenarios.

The prior art provides an expression recognition method that acquires the current frame picture to be recognized from a pre-recorded video, identifies the facial expression in that frame picture, and repeats these steps on the other frame pictures, thereby identifying the facial expressions in the video's frame pictures.

However, this method cannot recognize the facial expressions in an instant video in real time, and because it occupies a large amount of the device's processing and storage resources, it places high demands on the device. It therefore cannot be applied to mobile terminals such as smart phones and tablets, cannot meet users' diverse needs, and degrades the user experience.

Summary of the Invention

To meet users' diverse needs and improve the user experience, the embodiments of the present invention provide an expression recognition method for instant video and an electronic device. The technical solution is as follows:
In a first aspect, an expression recognition method in an instant video is provided, the method comprising:

acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face;

identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and

determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.

With reference to the first aspect, in a first possible implementation, the feature vector includes feature point coordinates and texture feature point coordinates under a standard pose matrix, the texture feature points are used to uniquely determine the feature points, and acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame includes: acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix; and generating, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.

With reference to the first possible implementation of the first aspect, in a second possible implementation, acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix includes: acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the second possible implementation of the first aspect, in a third possible implementation, normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix includes: acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point; and rotating the current pose matrix into a standard pose matrix and acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the third possible implementation of the first aspect, in a fourth possible implementation, identifying the feature vector corresponding to the at least one feature point includes: inputting the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtaining a calculation result in at least one preset expression model, the calculation result being used to represent the recognition result.

With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, determining, according to the recognition result, that the current expression is one of the plurality of pre-stored expressions includes: if the recognition result is within a preset range, determining that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
In a second aspect, an electronic device is provided, the electronic device comprising:

an acquiring module, configured to acquire a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face;

an identification module, configured to identify the feature vector corresponding to the at least one feature point and generate a recognition result; and

a determining module, configured to determine, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.

With reference to the second aspect, in a first possible implementation, the acquiring module is further configured to acquire the at least one feature point coordinate and at least one texture feature point coordinate under a standard pose matrix; and the identification module is further configured to generate the feature vector corresponding to the at least one feature point according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the first possible implementation of the second aspect, in a second possible implementation, the acquiring module is further configured to acquire the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and the device further includes a processing module, configured to normalize the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the second possible implementation of the second aspect, in a third possible implementation, the acquiring module is further configured to acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point; and the processing module is further configured to rotate the current pose matrix into a standard pose matrix and acquire the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the first or second possible implementation of the second aspect, in a fourth possible implementation, the device further includes a calculation module, configured to input the feature vector corresponding to the at least one feature point into a preset expression model library for calculation and obtain the recognition result.

With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation, the determining module is specifically configured to: if the recognition result is within a preset range, determine that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
In a third aspect, an electronic device is provided, including a video input module, a video output module, a sending module, a receiving module, a memory, and a processor connected to the video input module, the video output module, the sending module, the receiving module, and the memory, wherein the memory stores a set of program code and the processor is configured to invoke the program code stored in the memory to perform the following operations:

acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face;

identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and

determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.

With reference to the third aspect, in a first possible implementation, the processor is further configured to invoke the program code stored in the memory and perform the following operations: acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix; and generating, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.

With reference to the first possible implementation of the third aspect, in a second possible implementation, the processor is further configured to invoke the program code stored in the memory and perform the following operations: acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the second possible implementation of the third aspect, in a third possible implementation, the processor is further configured to invoke the program code stored in the memory and perform the following operations: acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point; and rotating the current pose matrix into a standard pose matrix and acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.

With reference to the third possible implementation of the third aspect, in a fourth possible implementation, the processor is further configured to invoke the program code stored in the memory and perform the following operation: inputting the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtaining the recognition result.

With reference to the fourth possible implementation of the third aspect, in a fifth possible implementation, the processor is further configured to invoke the program code stored in the memory and perform the following operation: if the recognition result is within a preset range, determining that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
The embodiments of the present invention provide an expression recognition method for instant video and an electronic device, including: acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face; identifying the feature vector corresponding to the at least one feature point to generate a recognition result; and determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions. By acquiring feature points that describe the current expression of the face in the instant video, the feature vector obtained from those feature points represents the current expression more accurately; recognizing the expression from this feature vector simplifies the face recognition algorithm for instant video, so the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an interaction system according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an interaction system according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an interaction system according to an embodiment of the present invention;

FIG. 4 is a flowchart of an expression recognition method in an instant video according to an embodiment of the present invention;

FIG. 5 is a flowchart of an expression recognition method in an instant video according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
具体实施方式detailed description
为使本发明的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. Some embodiments of the invention, rather than all of the embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.
An embodiment of the present invention provides an expression recognition method in instant video. The method is applied to an interaction system including at least two mobile terminals and a server. A mobile terminal can run an instant video program, and a user interacts with others by running the instant video program on the mobile terminal. The mobile terminal may be a smart phone, a tablet computer, or another mobile terminal; the embodiment of the present invention does not limit the specific mobile terminal. The mobile terminal includes at least a video input module and a video display module; the video input module may include a camera, and the video display module may include a display screen. The instant video program can implement instant video input by controlling the video input module of the mobile terminal, and can implement instant video display by controlling the video display module.
The interaction system may be as shown in FIG. 1. In this system, mobile terminal 1 is the instant video sender and mobile terminal 2 is the instant video receiver; the instant video sent by mobile terminal 1 is forwarded to mobile terminal 2 via the server, and the user of mobile terminal 1 and the user of mobile terminal 2 interact through the interaction system.
In particular, the execution body of the method provided by the embodiment of the present invention, namely the electronic device, may be any one of mobile terminal 1, mobile terminal 2, and the server. If the execution body is mobile terminal 1, then after receiving the instant video input through its own video input module, mobile terminal 1 performs expression recognition on the face in the instant video, forwards the recognition result to mobile terminal 2 via the server, and/or outputs the recognition result through its own display screen. If the execution body is the server, then after mobile terminal 1 and/or mobile terminal 2 input instant video through their own video input modules, the instant video is sent to the server, the server recognizes the facial expression in the instant video, and the recognition result is then sent to mobile terminal 1 and/or mobile terminal 2. If the execution body is mobile terminal 2, then after mobile terminal 1 inputs instant video through its own video input module, it sends the instant video to the server, the server sends the instant video to mobile terminal 2, and mobile terminal 2 performs expression recognition on the face in the instant video, forwards the recognition result to mobile terminal 1 via the server, and/or outputs the recognition result through its own display screen. The embodiment of the present invention does not limit the specific execution body of the method in the interaction system.
In addition, the method provided by the embodiment of the present invention may also be applied to an interaction system including only mobile terminal 1 and mobile terminal 2, as shown in FIG. 2; the mobile terminals in the interaction system shown in FIG. 2 are the same as those in the interaction system shown in FIG. 1, and details are not repeated here.
In particular, the execution body of the method provided by the embodiment of the present invention, namely the electronic device, may be either of mobile terminal 1 and mobile terminal 2. If the execution body is mobile terminal 1, then after inputting instant video through its own video input module, mobile terminal 1 performs expression recognition on the face in the instant video, then sends the recognition result to mobile terminal 2 and/or outputs the recognition result through its own display screen. If the execution body is mobile terminal 2, then after inputting instant video through its own video input module, mobile terminal 1 sends the instant video to mobile terminal 2; mobile terminal 2 performs expression recognition on the face in the instant video, then sends the recognition result to mobile terminal 1 and/or outputs the recognition result through its own display screen. The embodiment of the present invention does not limit the specific execution body of the method in the interaction system.
In addition, the method provided by the embodiment of the present invention may also be applied to an interaction system including only mobile terminal 1 and a user, as shown in FIG. 3. Mobile terminal 1 includes at least a video input module and a video display module; the video input module may include a camera, and the video display module may include a display screen. At least one instant video program can run on the mobile terminal, and the instant video program controls the video input module and the video display module of the mobile terminal to carry out instant video. Specifically, the mobile terminal receives the instant video input by the user, performs facial expression recognition on the instant video, and outputs the recognition result through its own display screen.
It should be noted that there may be one or more mobile terminals in the embodiments of the present invention; the embodiments of the present invention do not limit the specific mobile terminal.
In addition, the embodiments of the present invention may also cover other application scenarios; the embodiments of the present invention do not limit the specific application scenario.
Embodiment 1
An embodiment of the present invention provides an expression recognition method in instant video. As shown in FIG. 4, the method includes the following steps:
401. Acquire a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face.
The feature vector includes feature point coordinates and texture feature point coordinates under a standard pose matrix, where a texture feature point is used to uniquely determine a feature point.
Specifically, acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame includes:
acquiring at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix.
It should be noted that the process of acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix may be:
acquiring at least one feature point coordinate and at least one texture feature point coordinate of the face in the instant video frame; and
normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
It should be noted that the process of normalizing the at least one feature point and obtaining the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix may be:
acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, a current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame; and
rotating the current pose matrix into the standard pose matrix, and obtaining the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
After the at least one feature point is normalized and the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix are obtained, the following step is performed:
generating, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
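As an illustration only, the following Python sketch shows one plausible way to assemble such a feature vector: the normalized feature point coordinates and texture feature point coordinates are concatenated into a single vector. The function name and array layout are assumptions for the sketch, not part of the disclosed method.

```python
import numpy as np

def build_feature_vector(feature_pts, texture_pts):
    """Concatenate normalized coordinates into one feature vector.

    feature_pts: (N, 2) feature point coordinates under the standard
                 pose matrix (hypothetical layout).
    texture_pts: (M, 2) texture feature point coordinates under the
                 same standard pose matrix.
    Returns a 1-D vector describing the current expression.
    """
    feature_pts = np.asarray(feature_pts, dtype=np.float64)
    texture_pts = np.asarray(texture_pts, dtype=np.float64)
    # Any fixed, consistent ordering of the (x, y) pairs works, as long
    # as the same ordering is used at training and recognition time.
    return np.concatenate([feature_pts.ravel(), texture_pts.ravel()])
```

Because both coordinate sets are expressed under the same standard pose matrix, vectors built this way remain comparable across frames, users, and devices.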
402. Recognize the feature vector corresponding to the at least one feature point, and generate a recognition result.
Specifically, the feature vector corresponding to the at least one feature point is input into a preset expression model library for calculation, and the recognition result is obtained.
403. Determine, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
Specifically, if the recognition result is within a preset range, it is determined that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
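To make the flow of steps 401 to 403 concrete, a per-frame skeleton might look as follows; the callables `detect`, `featurize`, and `classify` are hypothetical placeholders for the operations detailed in Embodiment 2, not names defined by this disclosure.

```python
def recognize_frame(frame, detect, featurize, classify):
    """Illustrative per-frame pipeline for steps 401-403."""
    feature_pts, texture_pts = detect(frame)  # locate feature and texture points
    x = featurize(feature_pts, texture_pts)   # step 401: build the feature vector
    return classify(x)                        # steps 402-403: recognize and decide
```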
Embodiments of the present invention provide an expression recognition method and an electronic device for instant video. By acquiring, in the instant video, feature points that describe the current expression of the face, the feature vectors obtained from those feature points represent the current expression of the face more accurately. By then recognizing the feature vectors and obtaining the recognition result from them, the complexity of the algorithm for recognizing faces in instant video is reduced, so that the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
Embodiment 2
An embodiment of the present invention provides an expression recognition method in instant video. Referring to FIG. 5, the method includes the following steps:
501. Acquire at least one feature point coordinate and at least one texture feature point coordinate of a face in an instant video frame.
Specifically, the at least one feature point is used to describe the current expression of the face in the instant video.
Since the expression of a face is determined by its facial details, the at least one feature point is used to describe the contours of those details, which include at least the eyes, mouth, eyebrows, and nose. The embodiment of the present invention does not limit the specific manner of acquiring the facial feature points.
According to the acquired feature points of the face, feature parameters describing each feature point are obtained. A feature parameter may include the coordinates of the feature point in a vector covering at least the face, and may also include the scale and direction of the vector indicated by the feature point within at least the face region.
According to the acquired feature point parameters, the coordinates of the feature point in the vector covering at least the face are obtained.
A texture feature point is acquired near each feature point; the texture feature point is used to uniquely determine the feature point, and it does not change with variations in lighting, angle, and the like.
It should be noted that the feature points and texture feature points may be extracted from the face by a preset extraction model or extraction algorithm, or in other ways; the embodiment of the present invention does not limit the specific extraction model, extraction algorithm, or extraction manner.
Since a texture feature point describes the region where its feature point is located, it can be used to uniquely determine that feature point, so that facial details are determined from both feature points and texture feature points. This keeps a feature point in the instant video at the same position as the actual feature point, ensures the recognition quality of the image details, and thereby improves the reliability of expression recognition.
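The embodiment does not prescribe a particular texture descriptor, but the property it relies on can be sketched as follows: a small patch sampled around a feature point and normalized for brightness and contrast is relatively stable under lighting changes, so it can serve to re-identify the same feature point across frames. The patch size and grayscale input are illustrative assumptions.

```python
import numpy as np

def texture_descriptor(gray, x, y, half=4):
    """Sample a (2*half+1) x (2*half+1) patch around (x, y) and normalize it.

    Mean subtraction and scaling to unit norm make the patch insensitive
    to global brightness and contrast changes. Assumes (x, y) lies far
    enough from the image border for the slice to be complete.
    """
    patch = gray[y - half:y + half + 1, x - half:x + half + 1].astype(np.float64)
    patch -= patch.mean()
    norm = np.linalg.norm(patch)
    return patch.ravel() / norm if norm > 0 else patch.ravel()
```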
502. Acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, a current pose matrix corresponding to the at least one feature point and the at least one texture feature point.
Specifically, the pose matrix is used to indicate the scale and direction of the vector indicated by the three-dimensional coordinates of a feature point and of the texture feature point corresponding to that feature point.
The process may be as follows:
a. Normalize the at least one feature point and the at least one texture feature point to obtain the current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame. The normalization process may be:
b. Obtain the three-dimensional coordinates, scale, and direction corresponding to the at least one feature point and to the texture feature point corresponding to each feature point.
Since the coordinates of the feature points acquired in the instant video picture, and of the texture feature points corresponding to them, are two-dimensional, the corresponding scale and direction are also two-dimensional. The coordinates, scale, and direction of the at least one feature point and of the texture feature point corresponding to each feature point can therefore be converted from two dimensions to three dimensions according to a preset conversion algorithm; the embodiment of the present invention does not limit the specific algorithm or conversion manner.
c. Generate, according to all the feature points describing the same facial detail and the scales and directions of the texture feature points corresponding to those feature points, the current pose matrix corresponding to those feature points and texture feature points, where the current pose matrix is used to indicate the scale and direction of the vector indicated by all of those feature points.
Optionally, step c may also be implemented as follows:
generate, according to one feature point and the scale and direction of the texture feature point corresponding to it, the current pose matrix corresponding to that feature point and its texture feature point;
the current pose matrix is used to indicate the scale and direction of the vector indicated by that feature point; and
continue the above step for the next feature point until the pose matrices corresponding to all the feature points are generated.
Compared with processing all the feature points that describe the same detail at once, processing each feature point individually reduces distortion during image processing and increases the reliability of image processing.
503. Rotate the current pose matrix into a standard pose matrix to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
The embodiment of the present invention does not limit the specific manner of rotating the current pose matrix into the standard pose matrix.
It should be noted that steps 502 and 503 constitute the process of normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix; this process may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
By normalizing the acquired at least one feature point coordinate and at least one texture feature point coordinate of the face in the instant video, the embodiment of the present invention makes the acquired pose matrix insensitive to, for example, illumination changes and viewing-angle changes. Compared with traditional expression recognition, expression recognition in instant video therefore does not vary with changes in pose and scale, and is more accurate.
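Since the embodiment leaves the rotation method open, the following sketch shows one common realization: estimate a similarity transform (a Procrustes-style alignment) that maps the current point configuration onto a standard (reference) configuration, and apply it to the feature and texture point coordinates. The reference shape and the 2-D formulation are assumptions for illustration.

```python
import numpy as np

def align_to_standard(points, standard_points):
    """Map `points` onto `standard_points` with a similarity transform.

    points, standard_points: (N, 2) arrays of corresponding coordinates.
    Returns the input coordinates expressed "under the standard pose",
    i.e. with rotation, scale, and translation normalized away.
    """
    p = np.asarray(points, dtype=np.float64)
    q = np.asarray(standard_points, dtype=np.float64)
    p_c, q_c = p - p.mean(axis=0), q - q.mean(axis=0)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    r = u @ vt
    if np.linalg.det(r) < 0:      # forbid reflections
        u[:, -1] *= -1
        r = u @ vt
    s = np.trace((p_c @ r).T @ q_c) / np.trace(p_c.T @ p_c)  # optimal scale
    return s * (p_c @ r) + q.mean(axis=0)
```

After this alignment, two faces making the same expression at different distances and head rotations yield nearly identical coordinates, which is what makes the subsequent feature vectors comparable.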
It should be noted that steps 501 to 503 constitute the process of acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix; this process may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
Because the acquired at least one feature point and at least one texture feature point are obtained under the standard pose matrix, the influence of external factors such as lighting and angle on the face in the instant video is removed, the acquired feature points and texture feature points are more comparable, and the expressions recognized in the instant video are more accurate.
504. Generate, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
Since the pose matrix indicates the direction and scale of the feature points, the at least one feature point coordinate corresponding to the standard pose matrix, and the at least one texture feature point coordinate corresponding to the at least one feature point, can be obtained according to the standard pose matrix.
The embodiment of the present invention does not limit the specific manner of generating the feature vector corresponding to the at least one feature point according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
It should be noted that steps 501 to 504 constitute the process of acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame; this process may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
505. Input the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtain a recognition result.
Specifically, the feature vector is input into the preset expression model corresponding to each expression for calculation.
The preset expression model may be a regression equation, for example the logistic function:

    y = 1 / (1 + e^(-A·x))

where A is the regression coefficient, x is the feature vector, and y is the recognition result, with y ∈ (0, 1).
A result value y is calculated from the feature vector in the preset expression model corresponding to each expression, thereby obtaining the recognition result under at least one preset expression model.
It should be noted that this step implements the process of recognizing the feature vector corresponding to the at least one feature point and generating the recognition result; the process may also be implemented in other ways, and the embodiment of the present invention does not limit the specific manner.
By using the result of the logistic regression equation to recognize facial expressions in instant video, the computational complexity is reduced, so that faces are recognized more quickly during instant video, the occupation of system processes, processing resources, and storage resources is reduced, and the operating efficiency of the processor is improved.
506. If the recognition result is within a preset range, determine that the expression corresponding to the feature vector is one of a plurality of pre-stored expressions.
The current expression is determined to be one of the plurality of pre-stored expressions according to the y values included in the recognition results of the feature vector under the preset expression models corresponding to the respective expressions.
Specifically, if the difference between y and 1 is within the preset range, this indicates that the facial expression in the instant video is the expression indicated by that preset expression model;
if the difference between y and 0 is within the preset range, this indicates that the facial expression in the instant video is not the expression indicated by that preset expression model.
It should be noted that step 506 implements the process of determining, according to the recognition result, that the current expression is one of the plurality of pre-stored expressions; besides the above manner, this process may also be implemented in other ways, and the embodiment of the present invention does not limit the specific process.
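A hedged sketch of steps 505 and 506 together: each pre-stored expression has its own coefficient vector A, the logistic function scores the feature vector against every model, and a score whose difference from 1 is within the preset range selects the expression. The model container, labels, and threshold value are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recognize(feature_vec, models, margin=0.5):
    """Score a feature vector against every per-expression model.

    models: dict mapping an expression label to its coefficient
            vector A (one logistic model per pre-stored expression).
    Returns (label, scores); label is None when no score is close
    enough to 1, i.e. no pre-stored expression matches.
    """
    x = np.asarray(feature_vec, dtype=np.float64)
    scores = {label: sigmoid(A @ x) for label, A in models.items()}
    best = max(scores, key=scores.get)
    return (best if 1.0 - scores[best] <= margin else None), scores
```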
Optionally, in addition to the above process, after step 506 the method flow further includes:
507. Perform smoothing on the instant video.
Specifically, the number of frames n over which expressions are recognized during instant video is determined, and the total score of each expression acquired within the n frames is computed; the expression with the highest total score is the recognized expression for those n frames,
where n is an integer greater than or equal to 2.
Since the facial expression in instant video changes continuously, recognizing the facial expressions in two or more instant video frames to generate at least one recognition result, and then determining the facial expression in the instant video frames from that at least one recognition result, yields a more accurate result than generating a recognition result from a single frame and determining the facial expression from it. This further improves the reliability of expression recognition and the user experience.
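Under the assumption that each frame yields a per-expression score mapping like the one above, the smoothing of step 507 reduces to accumulating scores over a window of n frames and taking the maximum:

```python
from collections import defaultdict

def smooth_expression(frame_scores):
    """Pick the expression for a window of n recognized frames (n >= 2).

    frame_scores: list of per-frame dicts mapping an expression label
    to its score. The expression with the highest summed score across
    the window is taken as the recognized expression for these frames.
    """
    totals = defaultdict(float)
    for scores in frame_scores:
        for label, score in scores.items():
            totals[label] += score
    return max(totals, key=totals.get)
```

Accumulating over several frames suppresses single-frame misclassifications caused by blinks, motion blur, or momentary occlusion.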
Optionally, before step 501, the method flow further includes:
508. Establish an expression model corresponding to each expression.
Specifically, a model is trained for each expression separately: the preset expression to be established is taken as the positive sample, the other preset expressions are taken as negative samples, and training is carried out using the logistic regression equation indicated in step 505. The process may be:
taking the expression to be trained as the positive sample and the other expressions as negative samples, and setting the output y = 1 when the input is a positive sample and y = 0 when the input is a negative sample,
where the parameter A in the logistic regression equation may be obtained as follows:
the instant expressions of all users acquired in instant video are input into a preset optimization formula to generate the parameter A, and the preset optimization formula may be:
    A = argmin_A J(A),  J(A) = Σ_i ℓ(y_i, y_i′)
where J(A) denotes the objective used to determine the parameter A, y_i is the value predicted by the prediction function, and y_i′ is the corresponding true value.
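The exact form of J(A) is rendered as an image in the original filing and is not recoverable here; for a logistic regression model the usual choice is the cross-entropy loss minimized by gradient descent, which the sketch below assumes, although another loss (e.g. squared error between y_i and y_i′) would fit the surrounding text equally well. The one-vs-rest setup of step 508 labels the target expression's samples 1 and all other expressions 0.

```python
import numpy as np

def train_expression_model(X, y, lr=0.1, epochs=500):
    """Fit the coefficient vector A of one expression's logistic model.

    X: (n_samples, n_features) feature vectors.
    y: 0/1 labels, 1 for the expression being trained (positive
       samples), 0 for all other expressions (negative samples).
    Cross-entropy plus gradient descent is an *assumed* realization
    of the preset optimization formula.
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    A = np.zeros(X.shape[1])
    for _ in range(epochs):
        pred = 1.0 / (1.0 + np.exp(-(X @ A)))  # predicted y_i per sample
        grad = X.T @ (pred - y) / len(y)       # gradient of mean cross-entropy
        A -= lr * grad
    return A
```

Training each pre-stored expression this way yields the preset expression model library queried in step 505.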
It should be noted that when the method described in steps 501 to 506 is performed, expression recognition can be implemented with pre-established expression models, so step 508 does not need to be performed every time steps 501 to 506 are executed.
Embodiments of the present invention provide an expression recognition method and an electronic device for instant video. By acquiring, in the instant video, feature points that describe the current expression of the face, the feature vectors obtained from those feature points represent the current expression more accurately; by then recognizing the feature vectors and obtaining the recognition result from them, the complexity of the algorithm for recognizing faces in instant video is reduced, so that the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience. In addition, since a texture feature point describes the region where its feature point is located, it can be used to uniquely determine that feature point, so that facial details are determined from both feature points and texture feature points; this keeps feature points in the instant video at the same positions as the actual feature points, ensures the recognition quality of the image details, and improves the reliability of expression recognition. In addition, processing each feature point individually, rather than all the feature points describing the same detail at once, reduces distortion during image processing and increases its reliability. In addition, normalizing the acquired feature point coordinates and texture feature point coordinates makes the acquired pose matrix insensitive to, for example, illumination changes and viewing-angle changes, so that, compared with traditional expression recognition, expression recognition in instant video does not vary with pose and scale and is more accurate. In addition, because the feature points and texture feature points are obtained under the standard pose matrix, the influence of external factors such as lighting and angle on the face in the instant video is removed, making them more comparable and the recognized expressions more accurate. Finally, using the calculation result of the logistic regression equation to recognize facial expressions in instant video reduces computational complexity, so that faces are recognized more quickly during instant video, the occupation of system processes, processing resources, and storage resources is reduced, and the operating efficiency of the processor is improved.
Embodiment 3
An embodiment of the present invention provides an electronic device 6. Referring to FIG. 6, the electronic device 6 includes:
an acquisition module 61, configured to acquire a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face;
a recognition module 62, configured to recognize the feature vector corresponding to the at least one feature point and generate a recognition result; and
a determination module 63, configured to determine, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
Optionally,
the acquisition module 61 is further configured to acquire at least one feature point coordinate and at least one texture feature point coordinate under a standard pose matrix; and
the recognition module 62 is further configured to generate, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
Optionally,
the acquisition module 61 is further configured to acquire the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and
the device further includes a processing module, configured to normalize the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
Optionally,
the acquisition module 61 is further configured to acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame; and
the processing module is further configured to rotate the current pose matrix into the standard pose matrix and obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
Optionally, the electronic device 6 further includes:
a calculation module, configured to input the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtain the recognition result.
Optionally, the determination module 63 is specifically configured to:
if the recognition result is within a preset range, determine that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
An embodiment of the present invention provides an electronic device. By acquiring, in the instant video, feature points that describe the current expression of the face, the electronic device makes the feature vectors obtained from those feature points represent the current expression more accurately; by then recognizing the feature vectors and obtaining the recognition result from them, the complexity of the algorithm for recognizing faces in instant video is reduced, so that the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
Embodiment 4
An embodiment of the present invention provides an electronic device 7. Referring to FIG. 7, the electronic device 7 includes a video input module 71, a video output module 72, a sending module 73, a receiving module 74, a memory 75, and a processor 76 connected to the video input module 71, the video output module 72, the sending module 73, the receiving module 74, and the memory 75, where the memory 75 stores a set of program code, and the processor 76 is configured to call the program code stored in the memory 75 to perform the following operations:
acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, where the feature point is used to describe the current expression of the face;
recognizing the feature vector corresponding to the at least one feature point to generate a recognition result; and
determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
Optionally, the processor 76 is configured to call the program code stored in the memory 75 to perform the following operations:
acquiring at least one feature point coordinate and at least one texture feature point coordinate under a standard pose matrix; and
generating, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
Optionally, the processor 76 is configured to call the program code stored in the memory 75 to perform the following operations:
acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and
normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
Optionally, the processor 76 is configured to call the program code stored in the memory 75 to perform the following operations:
acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame; and
rotating the current pose matrix into the standard pose matrix, and obtaining the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
Optionally, the processor 76 is configured to call the program code stored in the memory 75 to perform the following operation:
inputting the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtaining the recognition result.
Optionally, the processor 76 is configured to call the program code stored in the memory 75 to perform the following operation:
if the recognition result is within a preset range, determining that the expression corresponding to the feature vector is one of the plurality of pre-stored expressions.
An embodiment of the present invention provides an electronic device. By acquiring, in the instant video, feature points that describe the current expression of the face, the electronic device makes the feature vectors obtained from those feature points represent the current expression more accurately; by then recognizing the feature vectors and obtaining the recognition result from them, the complexity of the algorithm for recognizing faces in instant video is reduced, so that the method provided by the embodiments of the present invention can run on a mobile terminal, meeting users' diverse needs and improving the user experience.
It should be noted that when the electronic device provided by the above embodiments performs the expression recognition method in instant video, the division into the above functional modules is used only as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the electronic device may be divided into different functional modules to complete all or part of the functions described above. In addition, the electronic device provided by the above embodiments and the method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, and details are not repeated here.
A person of ordinary skill in the art can understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. An expression recognition method in instant video, wherein the method comprises:
acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, wherein the feature point is used to describe the current expression of the face;
recognizing the feature vector corresponding to the at least one feature point to generate a recognition result; and
determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
2. The method according to claim 1, wherein the feature vector comprises feature point coordinates and texture feature point coordinates under a standard pose matrix, the texture feature points being used to uniquely determine the feature points.
3. The method according to claim 2, wherein acquiring the feature vector corresponding to the at least one feature point of the face in the instant video frame comprises:
acquiring the at least one feature point coordinate and at least one texture feature point coordinate under the standard pose matrix; and
generating, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
4. The method according to claim 3, wherein acquiring the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix comprises:
acquiring the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and
normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
5. The method according to claim 4, wherein normalizing the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix comprises:
acquiring, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame; and
rotating the current pose matrix into a standard pose matrix, and obtaining the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
6. The method according to claim 1, wherein recognizing the feature vector corresponding to the at least one feature point comprises:
inputting the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtaining the recognition result.
7. An electronic device, wherein the electronic device comprises:
an acquisition module, configured to acquire a feature vector corresponding to at least one feature point of a face in an instant video frame, wherein the feature point is used to describe the current expression of the face;
a recognition module, configured to recognize the feature vector corresponding to the at least one feature point and generate a recognition result; and
a determination module, configured to determine, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.
8. The device according to claim 7, wherein:
the acquisition module is further configured to acquire the at least one feature point coordinate and at least one texture feature point coordinate under a standard pose matrix; and
the recognition module is further configured to generate, according to the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix, the feature vector corresponding to the at least one feature point.
9. The device according to claim 7, wherein:
the acquisition module is further configured to acquire the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame; and
the device further comprises a processing module, configured to normalize the at least one feature point to obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
10. The device according to claim 9, wherein:
the acquisition module is further configured to acquire, according to the at least one feature point coordinate and the at least one texture feature point coordinate of the face in the instant video frame, the current pose matrix corresponding to the at least one feature point and the at least one texture feature point of the face in the instant video frame; and
the processing module is further configured to rotate the current pose matrix into a standard pose matrix and obtain the at least one feature point coordinate and the at least one texture feature point coordinate under the standard pose matrix.
11. The device according to claim 7, wherein the device further comprises:
a calculation module, configured to input the feature vector corresponding to the at least one feature point into a preset expression model library for calculation, and obtain the recognition result.
12. An electronic device, comprising a video input module, a video output module, a sending module, a receiving module, a memory, and a processor connected to the video input module, the video output module, the sending module, the receiving module, and the memory, wherein the memory stores a set of program code, and the processor is configured to call the program code stored in the memory to perform the following operations:
acquiring a feature vector corresponding to at least one feature point of a face in an instant video frame, wherein the feature point is used to describe the current expression of the face;
recognizing the feature vector corresponding to the at least one feature point to generate a recognition result; and
determining, according to the recognition result, that the current expression is one of a plurality of pre-stored expressions.