CN107133361A - Gesture identification method, device and terminal device - Google Patents
- Publication number
- CN107133361A, CN201710398580.8A
- Authority
- CN
- China
- Prior art keywords
- gesture video
- gesture
- preset
- video
- recognized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/732—Query formulation
- G06F16/7335—Graphical querying, e.g. query-by-region, query-by-sketch, query-by-trajectory, GUIs for designating a person/face/object as a query predicate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Image Analysis (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present disclosure relates to a gesture recognition method, device and terminal device. The method includes: acquiring a gesture video to be recognized; and determining, according to the similarity between the gesture video to be recognized and the preset gesture videos in a preset database, the gesture video set to which the gesture video to be recognized belongs. The preset database includes at least one type of gesture video set, and each type of gesture video set includes at least one preset gesture video. Compared with the prior art, the embodiments of the present disclosure thus provide another implementation of gesture recognition.
Description
Technical Field
The present disclosure relates to the technical field of electronic devices, and in particular, to a gesture recognition method and apparatus, and a terminal device.
Background
With the increasing demand of users for convenience in use of electronic products, hands-free operation or gesture recognition will become a key factor for distinguishing high-end electronic products from other similar electronic products.
In the prior art, a video of a gesture to be recognized is shot through an infrared camera, and a movement track of a hand skeleton joint point is determined according to the position of the hand skeleton joint point in each frame of gesture image in the video of the gesture to be recognized, so that the gesture to be recognized is determined according to the movement track.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a gesture recognition method, device and terminal device.
According to a first aspect of the embodiments of the present disclosure, there is provided a gesture recognition method, including:
acquiring a gesture video to be recognized;
determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects: acquiring a gesture video to be recognized; further, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, determining a gesture video set to which the gesture video to be recognized belongs, so as to determine that the gesture to be recognized is a preset gesture in the preset gesture video included in the gesture video set to which the gesture video to be recognized belongs. It can be seen that, in contrast to the prior art, another implementation of gesture recognition is provided in the embodiments of the present disclosure.
In one possible design, the determining, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, a gesture video set to which the gesture video to be recognized belongs includes:
performing an acquisition operation, the acquisition operation comprising: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
when the support vector machine is larger than 0, determining that the gesture video to be recognized belongs to the first gesture video set;
when the support vector machine is not larger than 0, taking any one of the other types of gesture video sets in the preset database as a new first gesture video set, and returning to execute the acquisition operation, so as to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; this is repeated until the new support vector machine is larger than 0, at which point the gesture video to be recognized is determined to belong to the new first gesture video set.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects: an implementation is provided for determining the gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and the preset gesture videos in the preset database. Compared with the prior art, the gesture video set to which the gesture video to be recognized belongs can be determined accurately, so that the gesture to be recognized in the gesture video to be recognized is accurately determined.
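The acquisition operation and the "try the next set" loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `find_gesture_set` and `toy_decision` are hypothetical, the preset database is assumed to be a dictionary from set name to preset videos, the label values +1 and -1 are one possible choice of first and second preset values, and the toy decision function (videos reduced to single numbers) merely stands in for the support vector machine.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def find_gesture_set(
    query,
    database: Dict[str, Sequence],
    decision: Callable[[object, List, List[int]], float],
) -> Optional[str]:
    """Treat each gesture video set in turn as the 'first' set.

    Videos of the first set get label +1, all other videos get label -1
    (assumed preset values). The first set whose decision value is > 0 wins.
    """
    for first_set in database:
        videos, labels = [], []
        for name, preset_videos in database.items():
            for video in preset_videos:
                videos.append(video)
                labels.append(1 if name == first_set else -1)
        if decision(query, videos, labels) > 0:
            return first_set
    return None  # no set produced a positive decision value


def toy_decision(query: float, videos: List[float], labels: List[int]) -> float:
    """Hypothetical stand-in: label-weighted sum of exp(-|query - video|)."""
    return sum(y * math.exp(-abs(query - v)) for v, y in zip(videos, labels))


if __name__ == "__main__":
    db = {"power_on": [0.0, 0.1], "power_off": [5.0, 5.2]}
    print(find_gesture_set(5.1, db, toy_decision))
```

Because the sets are tried one after another, the result can depend on the order of the sets when more than one of them would yield a positive value; the source does not say how such ties are handled.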
In one possible design, the obtaining a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video includes:
determining the label factors of the first type of gesture videos and the second type of gesture videos; the label factor of the first type of gesture video is equal to a first preset value belonging to a positive number, the label factor of the second type of gesture video is equal to a second preset value belonging to a negative number, and the absolute values of the first preset value and the second preset value are the same;
and acquiring a support vector machine of the gesture video to be recognized according to the label factors of the first type of gesture video, the label factors of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
In one possible design, the obtaining a support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video includes:
determining a gesture video matrix $X$ to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the m-th column of the gesture video matrix $X$ to be recognized contains the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, and m is an integer greater than or equal to 1;
determining a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains the feature sequence corresponding to the m-th frame of preset gesture image of the i-th preset gesture video;
determining a support vector machine $f(X)$ of the gesture video to be recognized according to the formula

$$f(X) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i\, y_i\, \kappa(Y_i, X) + b\right)$$

wherein sign() represents the sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$, and $b$ represents a first preset constant; wherein if the i-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the i-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects: an implementation is provided for obtaining the support vector machine of the gesture video to be recognized according to the label factors of the first type of gesture videos, the label factors of the second type of gesture videos and the similarity between the gesture video to be recognized and each preset gesture video, so that whether the gesture video to be recognized belongs to the first gesture video set can then be judged.
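As a rough numerical illustration of this design, the sketch below evaluates a decision value from the quantities the text names: the first weighting coefficients, the label factors, the similarities and a constant. It assumes the standard kernel support vector machine form (sign of a weighted sum plus a constant); the function name `svm_decision` and the example numbers are hypothetical, and how the weighting coefficients are obtained (training) is not covered.

```python
import numpy as np


def svm_decision(alphas, labels, similarities, b=0.0):
    """Return sign(sum_i alpha_i * y_i * kappa_i + b) as -1, 0 or 1.

    alphas       -- first weighting coefficients, one per preset gesture video
    labels       -- label factors y_i (e.g. +1 for the first type, -1 for the second)
    similarities -- kappa(Y_i, X) for each preset gesture video
    b            -- preset constant
    """
    alphas = np.asarray(alphas, dtype=float)
    labels = np.asarray(labels, dtype=float)
    similarities = np.asarray(similarities, dtype=float)
    if not (alphas.shape == labels.shape == similarities.shape):
        raise ValueError("alphas, labels and similarities must have the same shape")
    score = float(np.sum(alphas * labels * similarities) + b)
    return int(np.sign(score))


if __name__ == "__main__":
    # The query is far more similar to the +1 video than to the -1 video.
    print(svm_decision([0.5, 0.5], [1, -1], [0.9, 0.1]))
```

A result greater than 0 corresponds to the case in which the gesture video to be recognized is assigned to the first gesture video set.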
In one possible design, the method further includes:
determining the similarity $\kappa(Y_i, X)$ between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$ according to the formula

$$\kappa(Y_i, X) = \sum_{l} \sum_{k=1}^{K} \mu_{lk}\, \kappa\left(Y_i^{lk}, X_{lk}\right)$$

where the outer sum runs over the layers of a temporal pyramid comprising L layers. $K$ represents the number of video matrices included in layer l of the temporal pyramid, with $K = 2^l$, where l denotes the l-th layer and is a natural number greater than or equal to 0; the gesture video matrix $X$ to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are divided into the video matrices of each layer of the temporal pyramid according to the same division rule; k denotes the k-th video matrix of each layer; $X_{lk}$ represents the k-th video matrix of the l-th layer of the temporal pyramid corresponding to the gesture video matrix $X$ to be recognized; $Y_i^{lk}$ represents the k-th video matrix of the l-th layer of the temporal pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y_i^{lk}, X_{lk})$ represents the similarity between $Y_i^{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
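The temporal pyramid can be illustrated with a short sketch. It assumes that the "same division rule" is a split of the frames (matrix columns) into $2^l$ contiguous, nearly equal blocks at layer l, that the layers are numbered 0 to L-1, and that the per-block similarities are combined as a weighted sum; none of these details is confirmed by the source. The names `pyramid_blocks` and `pyramid_similarity` are hypothetical, and the per-block similarity is passed in as a function.

```python
import numpy as np


def pyramid_blocks(video, num_layers):
    """Split a (features x frames) matrix into a temporal pyramid.

    Layer l holds 2**l contiguous column blocks (assumed division rule).
    """
    video = np.asarray(video, dtype=float)
    return [np.array_split(video, 2 ** l, axis=1) for l in range(num_layers)]


def pyramid_similarity(X, Y, num_layers, block_similarity, weights=None):
    """Weighted sum of block-wise similarities over all layers and blocks.

    weights[l][k] plays the role of the second weighting coefficient;
    when omitted, every block gets weight 1.0 (an assumption).
    """
    total = 0.0
    x_layers = pyramid_blocks(X, num_layers)
    y_layers = pyramid_blocks(Y, num_layers)
    for l, (x_blocks, y_blocks) in enumerate(zip(x_layers, y_layers)):
        for k, (xb, yb) in enumerate(zip(x_blocks, y_blocks)):
            mu = 1.0 if weights is None else weights[l][k]
            total += mu * block_similarity(yb, xb)
    return total


if __name__ == "__main__":
    X = np.arange(8.0).reshape(2, 4)  # 2 features, 4 frames
    layers = pyramid_blocks(X, 3)
    print([len(blocks) for blocks in layers])
```

With three layers the pyramid holds 1, 2 and 4 blocks. A video with fewer frames than $2^l$ would produce empty blocks under this split, which a real implementation would need to handle.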
In one possible design, the method further includes:
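determining the similarity $\kappa(Y_i^{lk}, X_{lk})$ between the video matrix $Y_i^{lk}$ of the preset gesture video matrix $Y_i$ and the corresponding video matrix $X_{lk}$ according to a formula of exponential form, in which exp() represents the exponential function, $d(Y_i^{lk}, X_{lk})$ represents the distance between $Y_i^{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.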
In one possible design, the method further includes:
determining the distance $d(Y_i^{lk}, X_{lk})$ between the video matrix $Y_i^{lk}$ of the preset gesture video matrix $Y_i$ and the corresponding video matrix $X_{lk}$ according to a formula based on a Euclidean distance function, a first preset sparse affine sequence and a second preset sparse affine sequence.
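The two designs above (an exponential similarity built on a Euclidean-type distance) can be sketched together. Everything specific here is an assumption made for illustration: the distance is taken as the Euclidean norm of the difference between two affine combinations of frame columns, the "sparse affine sequences" are read as coefficient vectors that sum to 1, and the similarity is taken as exp(-gamma * distance). The names `affine_distance` and `block_similarity` are hypothetical, and the source does not confirm this exact form.

```python
import numpy as np


def affine_distance(Y_block, X_block, a, b):
    """Euclidean distance between two affine combinations of frame columns.

    a, b -- assumed 'sparse affine sequences': coefficient vectors summing to 1,
            one entry per column of Y_block and X_block respectively.
    """
    Y_block = np.asarray(Y_block, dtype=float)
    X_block = np.asarray(X_block, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if not (np.isclose(a.sum(), 1.0) and np.isclose(b.sum(), 1.0)):
        raise ValueError("affine coefficients must sum to 1")
    return float(np.linalg.norm(Y_block @ a - X_block @ b))


def block_similarity(Y_block, X_block, a, b, gamma=1.0):
    """Assumed exponential similarity: exp(-gamma * distance)."""
    return float(np.exp(-gamma * affine_distance(Y_block, X_block, a, b)))


if __name__ == "__main__":
    Y = np.array([[0.0, 2.0], [0.0, 0.0]])
    X = np.array([[1.0, 1.0], [0.0, 0.0]])
    # The midpoint of Y's two frames coincides with X's first frame.
    print(block_similarity(Y, X, a=[0.5, 0.5], b=[1.0, 0.0]))
```

Under these assumptions the similarity equals 1 when the two combinations coincide and decays toward 0 as the distance grows, with gamma controlling the rate of decay.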
According to a second aspect of the embodiments of the present disclosure, there is provided a gesture recognition apparatus including:
the acquisition module is configured to acquire a gesture video to be recognized;
the first determining module is configured to determine a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
In one possible design, the first determining module includes:
an acquisition submodule configured to perform an acquisition operation, the acquisition operation including: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
a first determining submodule configured to determine that the gesture video to be recognized belongs to the first gesture video set when the support vector machine is greater than 0;
and the second determining submodule is configured to, when the support vector machine is not greater than 0, take any one of the other types of gesture video sets in the preset database as a new first gesture video set, and return to the acquiring submodule to execute the acquisition operation, so as to acquire a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to acquire a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; this is repeated until the new support vector machine is greater than 0, at which point the gesture video to be recognized is determined to belong to the new first gesture video set.
In one possible design, the obtaining sub-module includes:
a determining unit configured to determine a tag factor of the first type of gesture video and a tag factor of the second type of gesture video; the label factor of the first type of gesture video is equal to a first preset value belonging to a positive number, the label factor of the second type of gesture video is equal to a second preset value belonging to a negative number, and the absolute values of the first preset value and the second preset value are the same;
the acquisition unit is configured to acquire a support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
In one possible design, the obtaining unit is specifically configured to:
determining a gesture video matrix $X$ to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the m-th column of the gesture video matrix $X$ to be recognized contains the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, and m is an integer greater than or equal to 1;
determining a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains the feature sequence corresponding to the m-th frame of preset gesture image of the i-th preset gesture video;
determining a support vector machine $f(X)$ of the gesture video to be recognized according to the formula

$$f(X) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i\, y_i\, \kappa(Y_i, X) + b\right)$$

wherein sign() represents the sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$, and $b$ represents a first preset constant; wherein if the i-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the i-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
In one possible design, the apparatus further includes:
a second determination module configured to determine the similarity $\kappa(Y_i, X)$ between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$ according to the formula

$$\kappa(Y_i, X) = \sum_{l} \sum_{k=1}^{K} \mu_{lk}\, \kappa\left(Y_i^{lk}, X_{lk}\right)$$

where the outer sum runs over the layers of a temporal pyramid comprising L layers. $K$ represents the number of video matrices included in layer l of the temporal pyramid, with $K = 2^l$, where l denotes the l-th layer and is a natural number greater than or equal to 0; the gesture video matrix $X$ to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are divided into the video matrices of each layer of the temporal pyramid according to the same division rule; k denotes the k-th video matrix of each layer; $X_{lk}$ represents the k-th video matrix of the l-th layer of the temporal pyramid corresponding to the gesture video matrix $X$ to be recognized; $Y_i^{lk}$ represents the k-th video matrix of the l-th layer of the temporal pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y_i^{lk}, X_{lk})$ represents the similarity between $Y_i^{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
In one possible design, the apparatus further includes:
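a third determination module configured to determine the similarity $\kappa(Y_i^{lk}, X_{lk})$ between the video matrix $Y_i^{lk}$ of the preset gesture video matrix $Y_i$ and the corresponding video matrix $X_{lk}$ according to a formula of exponential form, in which exp() represents the exponential function, $d(Y_i^{lk}, X_{lk})$ represents the distance between $Y_i^{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.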
In one possible design, the apparatus further includes:
a fourth determination module configured to determine the distance $d(Y_i^{lk}, X_{lk})$ between the video matrix $Y_i^{lk}$ of the preset gesture video matrix $Y_i$ and the corresponding video matrix $X_{lk}$ according to a formula based on a Euclidean distance function, a first preset sparse affine sequence and a second preset sparse affine sequence.
According to a third aspect of the embodiments of the present disclosure, there is provided a terminal device, including: a processor and a memory for storing processor-executable instructions;
the processor is configured to:
acquiring a gesture video to be recognized;
determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects: a gesture recognition method, a gesture recognition device and a terminal device are provided, in which a gesture video to be recognized is acquired; further, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, the gesture video set to which the gesture video to be recognized belongs is determined, so as to determine that the gesture to be recognized is the preset gesture in the preset gesture videos included in that gesture video set. It can be seen that, in contrast to the prior art, another implementation of gesture recognition is provided in the embodiments of the present disclosure.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1A is a flow diagram illustrating a method of gesture recognition in accordance with an exemplary embodiment;
FIG. 1B is a schematic diagram illustrating image segmentation according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a method of gesture recognition in accordance with another exemplary embodiment;
FIG. 3 is a flow diagram illustrating a method of gesture recognition in accordance with another exemplary embodiment;
FIG. 4 is a block diagram illustrating a first embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 5 is a block diagram illustrating a second embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 6 is a block diagram illustrating a third embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 7 is a block diagram illustrating a fourth embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 8 is a block diagram illustrating a fifth embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 9 is a block diagram illustrating a sixth embodiment of a gesture recognition apparatus in accordance with an illustrative embodiment;
FIG. 10 is a block diagram illustrating a terminal device according to an example embodiment;
FIG. 11 is a block diagram illustrating a terminal device 1200 according to an example embodiment.
With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
First, words related to the present disclosure will be explained:
the terminal devices to which the present disclosure relates may include, but are not limited to, a smart phone, a tablet computer, an electronic reader, a personal digital assistant, a smart television, smart glasses, or another terminal having an image capturing function, which is not limited in the embodiments of the present disclosure.
The Histogram of Oriented Gradients (HOG) feature to which the present disclosure relates is a feature descriptor used for object detection in computer vision and image processing. HOG features are constructed by computing histograms of the gradient directions within local areas of an image.
Similar to the HOG feature, the Histogram of Optical Flow (HOF) feature to which the present disclosure relates is obtained by performing weighted statistics on optical flow directions to produce a histogram of optical flow direction information. Because the size of the target changes over time, the dimensionality of the corresponding optical flow feature descriptor also changes; in addition, the calculation of optical flow is sensitive to background noise, scale change and motion direction. A feature based on optical flow is therefore needed that can represent time-domain action information while being insensitive to scale and motion direction, and the HOF was proposed to meet this requirement.
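Both descriptors reduce to a magnitude-weighted histogram of directions, which the following sketch shows for a single cell. It is a simplified illustration and not the feature extraction used in the disclosure: the function names, the bin counts, and the choice of unsigned directions (0 to 180 degrees) for HOG and signed directions (0 to 360 degrees) for HOF are assumptions, and block normalization and the computation of the optical flow itself are omitted.

```python
import numpy as np


def direction_histogram(dx, dy, num_bins=9, signed=False):
    """Magnitude-weighted histogram of vector directions, L2-normalized."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    magnitude = np.hypot(dx, dy)
    span = 360.0 if signed else 180.0
    angle = np.degrees(np.arctan2(dy, dx)) % span
    hist, _ = np.histogram(angle, bins=num_bins, range=(0.0, span), weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def hog_cell(cell, num_bins=9):
    """HOG-style histogram for one image cell, using image gradients."""
    gy, gx = np.gradient(np.asarray(cell, dtype=float))  # axis 0 = rows (y)
    return direction_histogram(gx, gy, num_bins=num_bins, signed=False)


def hof_cell(flow_x, flow_y, num_bins=8):
    """HOF-style histogram for one cell of a precomputed optical flow field."""
    return direction_histogram(flow_x, flow_y, num_bins=num_bins, signed=True)


if __name__ == "__main__":
    ramp = np.tile(np.arange(8.0), (8, 1))  # brightness increases left to right
    print(int(np.argmax(hog_cell(ramp))))
```

Concatenating such histograms over the cells of one frame would give one possible per-frame feature sequence, that is, one column of a gesture video matrix.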
Next, an application scenario of the embodiment of the present disclosure is introduced:
with the increasing demand of users for convenience in use of electronic products, hands-free operation or gesture recognition will become a key factor for distinguishing high-end electronic products from other similar electronic products. Therefore, research on gesture recognition technology is a very important research direction.
In the prior art, a video of a gesture to be recognized is shot through an infrared camera, and a movement track of a hand skeleton joint point is determined according to the position of the hand skeleton joint point in each frame of gesture image in the video of the gesture to be recognized, so that the gesture to be recognized is determined according to the movement track.
Another implementation manner of gesture recognition is provided in the embodiments of the present disclosure, and specific implementation manners are as follows:
the following describes a gesture recognition method, a gesture recognition device, and a terminal device according to embodiments of the present disclosure in detail with reference to the accompanying drawings.
Fig. 1A is a flow diagram illustrating a gesture recognition method according to an exemplary embodiment, and fig. 1B is a schematic diagram illustrating image segmentation according to an exemplary embodiment. The execution subject of this embodiment may be a gesture recognition apparatus in the terminal device, and the apparatus may be implemented by software and/or hardware. As shown in fig. 1A, the scheme of the present embodiment may include the following steps:
in step S101, a gesture video to be recognized is acquired.
In this step, the gesture recognition apparatus obtains the gesture video to be recognized through an image acquisition unit. Optionally, the gesture video to be recognized includes at least one frame of gesture image to be recognized, and each frame of gesture image to be recognized includes the gesture to be recognized. Optionally, the image acquisition unit may be either a color camera or an infrared camera, or may be another unit having an image capturing function, which is not limited in the embodiment of the present disclosure.
Optionally, the implementation manner of obtaining the gesture video to be recognized by the image acquisition unit at least includes the following:
the first realizable way: the gesture recognition device acquires an original video (including at least one frame of original color image) through a color camera, and segments a hand image and a background image in each frame of original color image in the original video by adopting an image segmentation method based on skin color detection to obtain the gesture video to be recognized, which includes at least one frame of gesture image to be recognized (only including the hand image). For example, as shown in fig. 1B, an image segmentation method based on skin color detection is adopted to segment a hand image and a background image in an original color image of a certain frame, so as to obtain a gesture image to be recognized, which only includes the hand image. Optionally, a specific implementation manner of the image segmentation method based on skin color detection in the embodiment of the present disclosure may refer to an image segmentation method based on skin color detection in the prior art, which is not limited in the embodiment of the present disclosure.
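The disclosure leaves the skin color detection method to the prior art. One common approach, shown here only as an assumed example, thresholds the chrominance channels after converting RGB to YCbCr. The conversion coefficients are the full-range BT.601 ones; the threshold ranges (Cb in 77 to 127, Cr in 133 to 173) are widely quoted illustrative values, not values taken from the source, and the function names are hypothetical.

```python
import numpy as np


def skin_mask(rgb):
    """Boolean mask of skin-like pixels for an (H, W, 3) RGB image (0 to 255)."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Full-range BT.601 chrominance channels.
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Assumed illustrative thresholds; real systems tune these per camera.
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)


def segment_hand(rgb):
    """Keep skin-like pixels and set every other pixel to zero."""
    rgb = np.asarray(rgb)
    return np.where(skin_mask(rgb)[..., None], rgb, 0)


if __name__ == "__main__":
    frame = np.array([[[200, 140, 120], [0, 0, 255]]], dtype=np.uint8)
    print(skin_mask(frame))
```

Applying `segment_hand` to every frame of the original video would yield frames containing only the skin-colored region. Any skin-colored background or face would also pass the mask, so a practical system would need further filtering.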
The second realizable way: the gesture recognition device acquires an original video (including at least one frame of original depth image) through an infrared camera, and divides a hand image and a background image in each frame of original depth image in the original video by adopting an infrared image division method based on the infrared camera to obtain the gesture video to be recognized, which includes at least one frame of gesture image to be recognized (only including the hand image). Optionally, a specific implementation manner of the infrared image segmentation method based on the infrared camera in the embodiment of the present disclosure may refer to an infrared image segmentation method in the prior art, which is not limited in the embodiment of the present disclosure.
Of course, the implementation manner of obtaining the gesture video to be recognized through the image acquisition unit may also include other implementation manners, which is not limited in the embodiment of the present disclosure.
In step S102, a gesture video set to which the gesture video to be recognized belongs is determined according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database.
In the embodiment of the present disclosure, a preset database is preset in the gesture recognition device, and optionally, the preset database includes: at least one type of gesture video set (such as a power-on gesture video set, a power-off gesture video set, a channel changing gesture video set and the like); wherein each type of gesture video set comprises: at least one preset gesture video (for example, the power-on gesture video set comprises at least one preset power-on gesture video, and the power-off gesture video set comprises at least one preset power-off gesture video, etc.).
In this step, the gesture recognition device determines the gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and each preset gesture video in the preset database. Optionally, the gesture recognition device first determines, according to that similarity, whether the gesture video to be recognized belongs to a first gesture video set in the preset database (the first gesture video set is any type of gesture video set in the preset database, for example, a power-on gesture video set). If the gesture video to be recognized is determined to belong to the first gesture video set, the process ends. If not, the device continues to judge whether the gesture video to be recognized belongs to a second gesture video set in the preset database (the second gesture video set is any type of gesture video set in the preset database other than the first gesture video set, for example, a shutdown gesture video set). If the gesture video to be recognized is determined to belong to the second gesture video set, the process ends. If not, the device continues to judge whether the gesture video to be recognized belongs to a third gesture video set in the preset database (the third gesture video set is any type of gesture video set in the preset database other than the first and second gesture video sets, for example, a channel changing gesture video set), and so on, until the gesture video set to which the gesture video to be recognized belongs is determined.
Optionally, once the gesture recognition device has determined the gesture video set (e.g., the second gesture video set) to which the gesture video to be recognized belongs, it determines that the gesture to be recognized in that video is the preset gesture of the preset gesture videos included in that gesture video set. The device can then determine the target operation corresponding to the gesture to be recognized according to the gesture to be recognized and preset mapping information (which includes the correspondence between at least one preset gesture and a target operation). For example, when the gesture to be recognized is determined to be a preset power-on gesture, the target operation corresponding to the gesture to be recognized is a power-on operation.
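The preset mapping information can be pictured as a simple lookup table. The sketch below is only an illustration; the keys, the operation strings and the function name `target_operation` are hypothetical and do not come from the disclosure.

```python
# Hypothetical preset mapping information: gesture video set -> target operation.
PRESET_MAPPING = {
    "power_on": "power on",
    "power_off": "power off",
    "channel_change": "change channel",
}


def target_operation(gesture_set):
    """Return the target operation for a recognized gesture set, or None if unmapped."""
    return PRESET_MAPPING.get(gesture_set)
```

For example, `target_operation("power_on")` looks up the operation associated with the power-on gesture video set.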
In the embodiment, a gesture video to be recognized is obtained; further, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, determining a gesture video set to which the gesture video to be recognized belongs, so as to determine that the gesture to be recognized is a preset gesture in the preset gesture video included in the gesture video set to which the gesture video to be recognized belongs. It can be seen that, in contrast to the prior art, another implementation of gesture recognition is provided in the embodiments of the present disclosure.
FIG. 2 is a flow chart illustrating a method of gesture recognition according to another exemplary embodiment. On the basis of the above embodiment, as shown in fig. 2, step S102 includes:
In step S102A, an acquisition operation is performed, the acquisition operation including: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
The first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos.
In this step, in order to determine whether the gesture video to be recognized belongs to the first gesture video set (e.g., a power-on gesture video set) in the preset database, the gesture recognition apparatus first divides the gesture video set in the preset database into a first type of gesture video (e.g., a power-on gesture video) and a second type of gesture video (e.g., a non-power-on gesture video) according to the type of the first gesture video set (e.g., the power-on gesture video set). For example, the gesture recognition device divides all preset gesture videos in the preset database into a power-on gesture video and a non-power-on gesture video.
Further, the gesture recognition device acquires a Support Vector Machine (SVM) of the gesture video to be recognized according to the first type of gesture video (for example, a power-on type gesture video), the second type of gesture video (for example, a non-power-on type gesture video), and the similarity between the gesture video to be recognized and each preset gesture video. Optionally, the gesture recognition device determines the label factor of the first type of gesture video (e.g., power-on type gesture video) and the label factor of the second type of gesture video (e.g., non-power-on type gesture video); the label factor of the first type of gesture video is equal to a first preset value (for example, 1) belonging to a positive number, the label factor of the second type of gesture video is equal to a second preset value (for example, -1) belonging to a negative number, and the absolute values of the first preset value and the second preset value are the same; further, a support vector machine of the gesture video to be recognized is obtained according to the label factor of the first type of gesture video, the label factor of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
Of course, the gesture recognition device may also obtain a support vector machine of the gesture video to be recognized in other manners according to the first type of gesture video, the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video, which is not limited in the embodiment of the present disclosure.
When the support vector machine (that is, its output value for the gesture video to be recognized) is greater than 0, step S102B is executed; when the support vector machine is not greater than 0, it is determined that the gesture video to be recognized does not belong to the first gesture video set, and step S102C is executed.
In step S102B, it is determined that the gesture video to be recognized belongs to the first gesture video set.
In step S102C, taking any one of the other types of gesture video sets in the preset database as a new first gesture video set, returning to execute the obtaining operation to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and obtaining a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video, and a similarity between the gesture video to be recognized and each of the preset gesture videos, until the new support vector machine is greater than 0, determining that the gesture video to be recognized belongs to the new first gesture video set.
In this step, the gesture recognition device takes any one of the other types of gesture video sets in the preset database (for example, the second gesture video set) as a new first gesture video set, in order to judge whether the gesture video to be recognized belongs to the new first gesture video set. The device then returns to the obtaining operation: according to the type of the new first gesture video set (for example, the second gesture video set, which is a shutdown gesture video set), it divides the gesture video sets in the preset database into a new first type of gesture video (for example, shutdown-type gesture videos) and a new second type of gesture video (for example, non-shutdown-type gesture videos), and obtains a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video. When the new support vector machine is greater than 0, it is determined that the gesture video to be recognized belongs to the new first gesture video set (for example, the second gesture video set). When the new support vector machine is not greater than 0, the device takes any one of the remaining types of gesture video sets (for example, the third gesture video set) as the new first gesture video set and returns to the obtaining operation again: according to the type of the new first gesture video set (for example, the third gesture video set, which is a channel changing gesture video set), it divides the gesture video sets in the preset database into a new first type of gesture video (for example, channel-changing-type gesture videos) and a new second type of gesture video (for example, non-channel-changing-type gesture videos), and obtains a new support vector machine of the gesture video to be recognized in the same way. These steps are repeated until the new support vector machine is greater than 0, at which point it is determined that the gesture video to be recognized belongs to the new first gesture video set.
Optionally, for the manner of obtaining the new first type of gesture video and the new second type of gesture video according to the new first gesture video set, and of obtaining the new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video, reference may be made to the relevant parts of step S102A in the embodiment of the present disclosure; details are not repeated here.
In the embodiment of the present disclosure, an obtaining operation is performed, the obtaining operation including: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video. When the support vector machine is greater than 0, it is determined that the gesture video to be recognized belongs to the first gesture video set. When the support vector machine is not greater than 0, any one of the other types of gesture video sets in the preset database is taken as a new first gesture video set, and the obtaining operation is executed again to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video, until the new support vector machine is greater than 0 and it is determined that the gesture video to be recognized belongs to the new first gesture video set. In this way, the gesture video set to which the gesture video to be recognized belongs is accurately determined, and therefore the gesture to be recognized in the gesture video to be recognized is accurately determined.
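The one-set-at-a-time procedure of steps S102A to S102C can be sketched as a loop. In this sketch the database layout (a dict from set name to a list of preset videos), the callable `decision_value`, and the `None` result when no set yields a positive value are assumptions made for illustration; the disclosure does not specify them.

```python
def label_factors(database, first_set, positive=1, negative=-1):
    """Label each preset video: `positive` if it is in `first_set`, else `negative`."""
    return [positive if name == first_set else negative
            for name, videos in database.items() for _ in videos]


def find_gesture_set(database, decision_value):
    """Try each gesture video set in turn as the 'first gesture video set'.

    `decision_value(first_set, labels)` is a hypothetical callable returning the
    support vector machine output for the video to be recognized.
    """
    for first_set in database:
        labels = label_factors(database, first_set)
        if decision_value(first_set, labels) > 0:
            return first_set  # step S102B: the video belongs to this set
    return None  # assumption: no set matched


database = {"power_on": ["a", "b"], "power_off": ["c"]}
result = find_gesture_set(
    database, lambda name, labels: 1 if name == "power_off" else -1)
```

Here the label factors are recomputed for every candidate set, mirroring the way the first and second types of gesture video are redefined each time the obtaining operation is repeated.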
FIG. 3 is a flow chart illustrating a method of gesture recognition according to another exemplary embodiment. On the basis of the foregoing embodiment, as shown in fig. 3, the obtaining a support vector machine of the gesture video to be recognized according to the tag factor of the first type of gesture video, the tag factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video includes:
In step S301, a gesture video matrix X to be recognized corresponding to the gesture video to be recognized is determined according to a feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized.
In this step, the gesture recognition device determines the gesture video matrix X to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized. The m-th column of the gesture video matrix X to be recognized contains the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, where m is an integer greater than or equal to 1 (that is, the maximum value of m is the total number of frames of gesture images to be recognized included in the gesture video to be recognized). Optionally, the feature sequence may be one of, or a combination of, the following: a HOG (histogram of oriented gradients) feature sequence and a HOF (histogram of optical flow) feature sequence; of course, the feature sequence may also include other sequences, which is not limited in the embodiments of the present disclosure.
Assuming that the feature sequence includes an HOF feature sequence, optionally, the gesture recognition apparatus determines the HOF feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized by extracting an optical flow (optical flow) of each frame of gesture image to be recognized (including only a hand image) in the gesture video to be recognized and according to the optical flow of each frame of gesture image to be recognized in the gesture video to be recognized. Optionally, an implementation process of determining the HOF feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized according to the optical flow of each frame of gesture image to be recognized in the gesture video to be recognized may refer to an implementation process of determining the HOF feature of an image according to the optical flow of an image in the prior art, which is not limited in the embodiment of the present disclosure. Of course, the HOF feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized may also be determined in other ways, which is not limited in the embodiment of the present disclosure.
Assuming that the feature sequence includes a HOG feature sequence, optionally, the gesture recognition apparatus determines the HOG feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized by extracting three primary colors (RGB) of each frame of gesture image to be recognized (including only hand images) in the gesture video to be recognized and according to the RGB of each frame of gesture image to be recognized in the gesture video to be recognized. Optionally, an implementation process of determining the HOG feature sequence of each frame of the gesture image to be recognized in the gesture video to be recognized according to RGB of each frame of the gesture image to be recognized in the gesture video to be recognized may refer to an implementation process of determining the HOG feature of an image according to RGB of an image in the prior art, which is not limited in the embodiment of the present disclosure. Of course, the HOG feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized may also be determined in other ways, which is not limited in the embodiment of the present disclosure.
Assuming that the feature sequence includes an HOG feature sequence and an HOF feature sequence, the above-mentioned portion of "determining the HOG feature sequence of each frame of the gesture image to be recognized in the gesture video to be recognized" and the portion of "determining the HOF feature sequence of each frame of the gesture image to be recognized in the gesture video to be recognized" are combined, and details are not repeated here.
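To make the layout of the gesture video matrix concrete, the sketch below joins the HOG and HOF values of each frame and places each frame's feature sequence in one column. The helper names `frame_feature` and `video_matrix` and the plain list-of-rows matrix representation are assumptions for illustration; the HOG and HOF values are taken as already computed.

```python
def frame_feature(hog, hof):
    """Concatenate the HOG and HOF feature values of a single frame."""
    return list(hog) + list(hof)


def video_matrix(frame_features):
    """Build a matrix (list of rows) whose m-th column is the m-th frame's features."""
    if not frame_features:
        raise ValueError("at least one frame is required")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frames must have the same feature length")
    return [[f[r] for f in frame_features] for r in range(dim)]


# Two frames, each with two HOG values and one HOF value.
frames = [frame_feature([1, 2], [3]), frame_feature([4, 5], [6])]
X = video_matrix(frames)
```

With two frames of three feature values each, the result has three rows and two columns, one column per frame.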
In step S302, according to a feature sequence of each frame of preset gesture image in an i-th preset gesture video in the preset database, a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video is determined.
In this step, the gesture recognition device determines the preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database, where i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains the feature sequence corresponding to the m-th frame of preset gesture image in the i-th preset gesture video (that is, the maximum value of m is the total number of frames of preset gesture images included in the i-th preset gesture video). Optionally, the feature sequence may be one of, or a combination of, the following: a HOG feature sequence and a HOF feature sequence; of course, the feature sequence may also include other sequences, which is not limited in the embodiments of the present disclosure.
Optionally, an implementation manner of determining the feature sequence of each frame of the preset gesture image in the ith preset gesture video may refer to the relevant part of "determining the feature sequence of each frame of the gesture image to be recognized in the gesture video to be recognized", and details are not repeated here.
In step S303, a support vector machine f(X) of the gesture video to be recognized is determined according to the formula $f(X) = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$.
In this step, the gesture recognition device determines the support vector machine of the gesture video to be recognized by taking the similarity between the gesture video to be recognized and each preset gesture video as the kernel function of the support vector machine. Optionally, the gesture recognition device determines the support vector machine f(X) of the gesture video to be recognized according to the formula

$$f(X) = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$$

where sign() represents the sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$, and b represents a first preset constant. If the i-th preset gesture video belongs to the first type of gesture video (for example, the power-on type of gesture video), $y_i$ is equal to the first preset value; if the i-th preset gesture video belongs to the second type of gesture video (for example, the non-power-on type of gesture video), $y_i$ is equal to the second preset value.
Optionally, the support vector machine f(X) of the gesture video to be recognized can also be determined by other formulas that are equivalent to or variants of the above formula, which is not limited in the embodiment of the present disclosure.
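As a minimal sketch of step S303, the function below evaluates a decision of the form sign(sum over i of alpha_i * y_i * kappa(Y_i, X), plus b) from the symbols defined above. That exact form, the function name `svm_decision`, and the example numbers are assumptions; in particular, how the weighting coefficients and the constant b are obtained is not covered here.

```python
def svm_decision(alphas, labels, kernel_values, b):
    """Return the sign (+1, 0 or -1) of sum_i alpha_i * y_i * kappa_i + b.

    `kernel_values[i]` stands for the similarity kappa(Y_i, X) between the
    i-th preset gesture video and the video to be recognized.
    """
    if not (len(alphas) == len(labels) == len(kernel_values)):
        raise ValueError("alphas, labels and kernel_values must have equal length")
    score = sum(a * y * k for a, y, k in zip(alphas, labels, kernel_values)) + b
    return (score > 0) - (score < 0)


# Two preset videos: the first labeled +1, the second labeled -1.
belongs = svm_decision([0.5, 0.5], [1, -1], [0.9, 0.2], 0.0)
not_belongs = svm_decision([0.5, 0.5], [1, -1], [0.1, 0.8], 0.0)
```

A result greater than 0 corresponds to the branch in which the gesture video to be recognized is assigned to the first gesture video set.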
Optionally, in this embodiment of the present disclosure, for an achievable manner of obtaining a new support vector machine of the gesture video to be recognized according to the tag factor of the new first type of gesture video, the tag factor of the new second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video, reference may be made to the achievable manner described above of obtaining the support vector machine of the gesture video to be recognized according to the tag factor of the first type of gesture video, the tag factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video; details are not repeated here.
Optionally, in this disclosure, the sequence numbers of the steps do not limit their order of execution, and the execution order of the steps may be adjusted appropriately, which is not limited in this disclosure.
The embodiment of the disclosure provides an implementation manner of obtaining the support vector machine of the gesture video to be recognized according to the tag factor of the first type of gesture video, the tag factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video, so that whether the gesture video to be recognized belongs to the first gesture video set can be further judged.
Further, on the basis of the above embodiments, in the embodiment of the present disclosure, an implementable manner of determining the similarity between the gesture video to be recognized and any preset gesture video in the preset database (for example, the ith preset gesture video in the preset database) is explained:
In the embodiment of the disclosure, the gesture recognition device divides the gesture video matrix X to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database into a time pyramid comprising L layers according to the same division rule, where the l-th layer comprises $K = 2^l$ video matrices, l is a natural number greater than or equal to 0 that denotes the layer, and k denotes the k-th video matrix of a layer. For example, layer 0 ($l = 0$) of the time pyramid corresponding to the gesture video matrix X to be recognized includes the complete gesture video matrix X to be recognized of the gesture video to be recognized, and layer 0 ($l = 0$) of the time pyramid corresponding to the preset gesture video matrix $Y_i$ (the preset gesture video matrix corresponding to the i-th preset gesture video in the preset database) includes the complete preset gesture video matrix $Y_i$ of the i-th preset gesture video. Layer 1 ($l = 1$) of the time pyramid corresponding to the gesture video matrix X to be recognized includes the two video matrices respectively corresponding to the two sub-gesture videos obtained by dividing the gesture video to be recognized into two, and layer 1 ($l = 1$) of the time pyramid corresponding to the preset gesture video matrix $Y_i$ includes the two video matrices respectively corresponding to the two sub-preset gesture videos obtained by dividing the i-th preset gesture video into two; and so on.
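The time pyramid can be sketched as repeated splitting of the frame sequence, with layer l holding 2 to the power l segments. The equal-length contiguous split used below is only one possible division rule and is an assumption, as is the name `temporal_pyramid`; the video is represented as a list of per-frame columns.

```python
def temporal_pyramid(frames, num_layers):
    """Split a list of per-frame columns into `num_layers` pyramid layers.

    Layer l (starting at 0) holds 2**l contiguous segments of roughly equal
    length; layer 0 is the whole video.
    """
    n = len(frames)
    if n < 2 ** (num_layers - 1):
        raise ValueError("too few frames for the requested number of layers")
    pyramid = []
    for layer in range(num_layers):
        count = 2 ** layer
        # Integer boundaries so that the segments cover every frame exactly once.
        bounds = [(j * n) // count for j in range(count + 1)]
        pyramid.append([frames[bounds[j]:bounds[j + 1]] for j in range(count)])
    return pyramid


pyramid = temporal_pyramid(list(range(8)), 3)
```

Applying the same function to the video to be recognized and to every preset video keeps the division rule identical on both sides, as the text requires.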
In this step, the gesture recognition device determines the similarity $\kappa(Y_i, X)$ between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$ according to the formula

$$\kappa(Y_i, X) = \sum_{l=0}^{L-1} \sum_{k=1}^{2^l} \mu_{lk}\, \kappa\left(Y_i^{lk}, X^{lk}\right)$$

where $X^{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the gesture video matrix X to be recognized, $Y_i^{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the preset gesture video matrix $Y_i$, $\kappa(Y_i^{lk}, X^{lk})$ represents the similarity between $Y_i^{lk}$ and $X^{lk}$, and $\mu_{lk}$ represents a second weighting coefficient. Optionally, $\mu_{lk} = 1/(2^{L}-1)$; of course, $\mu_{lk}$ may also be equal to other values, which is not limited in the embodiments of the present disclosure. Optionally, the gesture recognition device can also determine the similarity $\kappa(Y_i, X)$ according to other formulas that are equivalent to or variants of the above formula, which is not limited in the embodiments of the present disclosure.
Optionally, the gesture recognition device determines the similarity $\kappa(Y_i^{lk}, X^{lk})$ between $Y_i^{lk}$ and $X^{lk}$ according to the formula

$$\kappa\left(Y_i^{lk}, X^{lk}\right) = \exp\left(-\gamma\, D\left(Y_i^{lk}, X^{lk}\right)\right)$$

where exp() represents the exponential function, $D(Y_i^{lk}, X^{lk})$ represents the distance between $Y_i^{lk}$ and $X^{lk}$, and $\gamma$ represents a second preset constant. Optionally, the gesture recognition device can also determine the similarity between $Y_i^{lk}$ and $X^{lk}$ according to other formulas that are equivalent to or variants of the above formula, which is not limited in the embodiments of the present disclosure.
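The way the per-segment similarities combine into one video-level similarity can be sketched as follows. The sketch assumes that each segment similarity has the exponential form exp(-gamma * D) and that a single weight is shared by all segments; `pyramid_kernel`, `distance_fn`, `gamma` and `weight` are hypothetical names, and the distance function is passed in rather than fixed.

```python
import math


def pyramid_kernel(pyramid_y, pyramid_x, distance_fn, gamma, weight):
    """Weighted sum of exp(-gamma * distance) over matching pyramid segments.

    `pyramid_y` and `pyramid_x` are lists of layers, each layer a list of
    segments, built with the same division rule.
    """
    total = 0.0
    for layer_y, layer_x in zip(pyramid_y, pyramid_x):
        if len(layer_y) != len(layer_x):
            raise ValueError("pyramids must have the same shape")
        for seg_y, seg_x in zip(layer_y, layer_x):
            total += weight * math.exp(-gamma * distance_fn(seg_y, seg_x))
    return total


# Two-layer pyramids (1 + 2 = 3 segments) compared with a toy distance.
pyr = [[[0, 1, 2, 3]], [[0, 1], [2, 3]]]
same = pyramid_kernel(pyr, pyr, lambda a, b: 0.0, gamma=1.0, weight=1 / 3)
far = pyramid_kernel(pyr, pyr, lambda a, b: 50.0, gamma=1.0, weight=1 / 3)
```

With zero distance on every segment each term contributes its full weight, while large distances drive the similarity toward 0.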
Optionally, the gesture recognition device determines the distance $D(Y_i^{lk}, X^{lk})$ between $Y_i^{lk}$ and $X^{lk}$ based on the sparse affine package $(\hat{\beta}_i, \hat{\beta})$. Optionally, the gesture recognition device determines the distance $D(Y_i^{lk}, X^{lk})$ between $Y_i^{lk}$ and $X^{lk}$ according to the formula

$$D\left(Y_i^{lk}, X^{lk}\right) = \left\| Y_i^{lk}\hat{\beta}_i - X^{lk}\hat{\beta} \right\|_2$$

where $\|\cdot\|_2$ represents the Euclidean distance function, $\hat{\beta}_i$ represents a first preset sparse affine sequence, and $\hat{\beta}$ represents a second preset sparse affine sequence. Optionally, the gesture recognition device can also determine the distance between $Y_i^{lk}$ and $X^{lk}$ according to other formulas that are equivalent to or variants of the above formula, which is not limited in the embodiments of the present disclosure.
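Given two coefficient sequences, the distance reduces to the Euclidean norm of the difference between two combined points, one formed from the columns of each video matrix. The sketch below assumes that form; `matvec` and `hull_distance` are hypothetical helper names, the matrices are plain lists of rows, and the coefficients are taken as already solved.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a coefficient vector."""
    return [sum(m * c for m, c in zip(row, vector)) for row in matrix]


def hull_distance(Y, beta_y, X, beta_x):
    """Euclidean distance between the combined points Y @ beta_y and X @ beta_x."""
    point_y = matvec(Y, beta_y)
    point_x = matvec(X, beta_x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point_y, point_x)))


# Columns are frames; each coefficient vector sums to 1 (an affine combination).
Y = [[0, 2], [0, 0]]
X = [[1, 1], [3, 5]]
d = hull_distance(Y, [0.5, 0.5], X, [0.5, 0.5])
```

In the example the two combined points are (1, 0) and (1, 4), so the computed distance is 4.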
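Optionally, the following explains a realizable manner of determining the sparse affine package $(\hat{\beta}_i, \hat{\beta})$ in the embodiments of the disclosure.
Given a sample gesture video W and a sample to-be-recognized gesture video Z, the sparse affine package $(\hat{\beta}_i, \hat{\beta})$ can be determined as follows:

$$(\hat{\beta}_i, \hat{\beta}) = \arg\min_{\beta_i,\, \beta} \left\| W\beta_i - Z\beta \right\|_2^2 + \lambda\left( \|\beta_i\|_1 + \|\beta\|_1 \right)$$

$$\sum_{n=1}^{p} \beta_i^{n} = 1$$

$$\sum_{n=1}^{q} \beta^{n} = 1$$

where $\beta_i^{n}$ represents the n-th preset sparse affine coefficient of $\beta_i$, p represents the number of preset sparse affine coefficients included in $\beta_i$ (optionally, the same as the number of columns of W), $\beta^{n}$ represents the n-th preset sparse affine coefficient of $\beta$, q represents the number of preset sparse affine coefficients included in $\beta$ (optionally, the same as the number of columns of Z), arg() represents the parameter-solving function (optionally, it returns the $\beta_i$ and $\beta$ at which the objective reaches its minimum value), min() represents the minimum function, $\|\cdot\|_1$ represents the absolute value function (the sum of the absolute values of the coefficients), and $\lambda$ represents a third preset constant (e.g., 0.1, 0.01, etc.). The three formulas above are solved jointly: solving for $\beta_i$ yields $\hat{\beta}_i$, and solving for $\beta$ yields $\hat{\beta}$.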
Of course, in the embodiment of the present disclosure, the sparse affine package $(\hat{\beta}_i, \hat{\beta})$ may also be determined in other manners, which is not limited in the embodiments of the present disclosure.
In the embodiments of the disclosure, the distance $D(Y_i^{lk}, X^{lk})$ between $Y_i^{lk}$ (the k-th video matrix of the l-th layer of the time pyramid corresponding to the preset gesture video matrix $Y_i$) and $X^{lk}$ (the k-th video matrix of the l-th layer of the time pyramid corresponding to the gesture video matrix X to be recognized) is determined based on the sparse affine package. Further, the similarity $\kappa(Y_i^{lk}, X^{lk})$ between $Y_i^{lk}$ and $X^{lk}$ is determined according to $D(Y_i^{lk}, X^{lk})$. Further, the similarity $\kappa(Y_i, X)$ between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$ is determined according to $\kappa(Y_i^{lk}, X^{lk})$. Further, the support vector machine f(X) of the gesture video to be recognized is determined according to the tag factor of the i-th preset gesture video (for example, the tag factor of the first type of gesture video or the tag factor of the second type of gesture video) and $\kappa(Y_i, X)$, so that whether the gesture video to be recognized belongs to the first gesture video set can be further judged according to the support vector machine f(X). Compared with the prior art, in this embodiment the similarity between the gesture video to be recognized and the i-th preset gesture video, determined based on the sparse affine package, is used as the kernel function of the support vector machine, so that the accuracy of gesture recognition is high.
Fig. 4 is a block diagram illustrating a first embodiment of a gesture recognition apparatus according to an exemplary embodiment. As shown in fig. 4, the gesture recognition apparatus 40 includes:
an obtaining module 401 configured to obtain a gesture video to be recognized;
a first determining module 402, configured to determine a gesture video set to which the gesture video to be recognized belongs according to similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
In the gesture recognition apparatus provided by the embodiment of the present disclosure, the obtaining module 401 obtains a gesture video to be recognized; further, the first determining module 402 determines a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, so as to determine that the gesture to be recognized is a preset gesture in a preset gesture video included in the gesture video set to which the gesture video to be recognized belongs. It can be seen that, in contrast to the prior art, another implementation of gesture recognition is provided in the embodiments of the present disclosure.
On the basis of the embodiment shown in fig. 4, fig. 5 is a block diagram of a second embodiment of a gesture recognition apparatus according to an exemplary embodiment. Referring to fig. 5, the first determining module 402 includes:
an acquisition submodule 402A configured to perform an acquisition operation, the acquisition operation including: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
a first determining submodule 402B configured to determine that the gesture video to be recognized belongs to the first gesture video set when the support vector machine is greater than 0;
a second determining submodule 402C configured to, when the support vector machine is not greater than 0, take any one of the other types of gesture video sets in the preset database as a new first gesture video set, and return to the acquisition submodule 402A to perform the acquisition operation, so as to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video, until the new support vector machine is greater than 0, whereupon it is determined that the gesture video to be recognized belongs to the new first gesture video set.
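The iteration performed by submodules 402A to 402C can be sketched as a one-versus-rest loop. This is a minimal illustration, not the patented implementation: `find_gesture_set` and `decision_value` are hypothetical names, the decision values are made up, and returning `None` when no set yields a value greater than 0 is an assumption, since the text does not describe that case.

```python
from typing import Callable, Hashable, Iterable, Optional


def find_gesture_set(
    set_ids: Iterable[Hashable],
    decision_value: Callable[[Hashable], float],
) -> Optional[Hashable]:
    """Return the first gesture video set whose decision value is > 0.

    `decision_value(first_set)` is a hypothetical callable standing in for
    the support vector machine obtained when `first_set` is treated as the
    first type and every other set as the second type.
    """
    for first_set in set_ids:
        if decision_value(first_set) > 0:
            return first_set
    # Assumption: the text does not say what happens if no set matches.
    return None


# Toy usage with made-up decision values for three gesture video sets.
toy_values = {"wave": -1.0, "swipe": 1.0, "pinch": -1.0}
result = find_gesture_set(toy_values, toy_values.__getitem__)
```

Each candidate set is tried in turn, which mirrors the "take any other set as the new first gesture video set and repeat" wording.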
On the basis of the embodiment shown in fig. 5, fig. 6 is a block diagram of a third embodiment of a gesture recognition apparatus according to an exemplary embodiment. Referring to fig. 6, the acquisition submodule 402A includes:
a determining unit 402a1 configured to determine a label factor of the first type of gesture video and a label factor of the second type of gesture video; the label factor of the first type of gesture video is equal to a first preset value, which is a positive number, the label factor of the second type of gesture video is equal to a second preset value, which is a negative number, and the absolute values of the first preset value and the second preset value are the same;
an obtaining unit 402a2 configured to obtain a support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video.
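The label factor assignment described for the determining unit can be sketched as follows. The function name `label_factors`, the dictionary layout, and the default value 1.0 are assumptions for illustration; the text only requires one positive and one negative value of equal magnitude.

```python
def label_factors(video_sets, first_set, preset_value=1.0):
    """Pair every preset gesture video with its label factor.

    Videos in `first_set` (the first type) get +preset_value; videos in all
    other sets (the second type) get -preset_value.
    """
    if preset_value <= 0:
        raise ValueError("preset_value must be positive")
    labels = []
    for set_id, videos in video_sets.items():
        y = preset_value if set_id == first_set else -preset_value
        labels.extend((video, y) for video in videos)
    return labels


# Toy database: two gesture video sets with hypothetical video names.
toy_sets = {"wave": ["w1", "w2"], "swipe": ["s1"]}
toy_labels = label_factors(toy_sets, "wave")
```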
Optionally, the obtaining unit 402a2 is specifically configured to:
determining a gesture video matrix X to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the m-th column of the gesture video matrix X to be recognized contains: the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, where m is an integer greater than or equal to 1;
determining a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains: the feature sequence corresponding to the m-th frame of preset gesture image of the i-th preset gesture video;
determining a support vector machine f(X) of the gesture video to be recognized according to the formula $f(X) = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$; wherein sign() represents a sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$, and b represents a first preset constant; if the i-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the i-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
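The support vector machine value can be illustrated with a short sketch, assuming the decision rule has the usual kernel form f(X) = sign(sum over i of alpha_i * y_i * kappa(Y_i, X) + b). The function name `svm_decision` and all numeric values below are hypothetical; in practice the weighting coefficients and the constant b would come from training, which the text does not detail here.

```python
def svm_decision(alphas, labels, kernel_values, b):
    """Sign of the weighted sum of similarities plus a constant.

    alphas        -- first weighting coefficients alpha_i
    labels        -- label factors y_i (for example +1 or -1)
    kernel_values -- similarities kappa(Y_i, X), one per preset video
    b             -- first preset constant
    Returns 1, 0 or -1.
    """
    score = sum(a * y * k for a, y, k in zip(alphas, labels, kernel_values)) + b
    return (score > 0) - (score < 0)


# Toy values: the query is more similar to the positively labelled video.
value_close_to_first_type = svm_decision([0.5, 0.5], [1, -1], [0.9, 0.2], 0.0)
# Toy values: the query is more similar to the negatively labelled video.
value_close_to_second_type = svm_decision([0.5, 0.5], [1, -1], [0.2, 0.9], 0.0)
```

A result greater than 0 corresponds to the case in which the gesture video to be recognized is assigned to the first gesture video set.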
On the basis of the embodiment shown in fig. 6, fig. 7 is a block diagram of a fourth embodiment of a gesture recognition apparatus according to an exemplary embodiment. Referring to fig. 7, the gesture recognition apparatus 40 further includes:
a second determination module 403 configured to determine, according to the formula $\kappa(Y_i, X) = \sum_{l}\sum_{k=1}^{K} \mu_{lk}\, \kappa(Y^i_{lk}, X_{lk})$, the similarity $\kappa(Y_i, X)$ between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$;
wherein the gesture video matrix X to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are each divided, according to the same division rule, into a time pyramid comprising L layers; K represents the number of video matrices included in each layer of the time pyramid, $K = 2^l$; l represents the l-th layer and is a natural number greater than or equal to 0; k represents the k-th video matrix of each layer; $X_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the gesture video matrix X to be recognized; $Y^i_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y^i_{lk}, X_{lk})$ represents the similarity between $Y^i_{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
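The time pyramid similarity can be sketched as below. Several points are assumptions rather than statements from the text: a video matrix is represented as a list of columns (one feature sequence per frame), the "same division rule" is taken to be a split into contiguous, near-equal runs of frames, layers are indexed 0 to L-1, and `sub_kernel` and `weight` are hypothetical callables standing in for the per-segment similarity and the second weighting coefficient.

```python
def split_columns(columns, parts):
    """Split a list of columns into `parts` contiguous, near-equal chunks."""
    n = len(columns)
    bounds = [(j * n) // parts for j in range(parts + 1)]
    return [columns[bounds[j]:bounds[j + 1]] for j in range(parts)]


def pyramid_kernel(x_cols, y_cols, num_layers, sub_kernel, weight):
    """Weighted sum of segment similarities over all pyramid layers.

    Layer l holds 2**l segments; segment k of X is compared only with
    segment k of Y on the same layer (k is 0-based in this sketch).
    """
    total = 0.0
    for l in range(num_layers):
        parts = 2 ** l
        x_parts = split_columns(x_cols, parts)
        y_parts = split_columns(y_cols, parts)
        for k in range(parts):
            total += weight(l, k) * sub_kernel(y_parts[k], x_parts[k])
    return total


# Toy usage: 4 frames, 3 layers, constant similarity and unit weights.
toy_cols = [[0.0], [1.0], [2.0], [3.0]]
toy_total = pyramid_kernel(
    toy_cols, toy_cols, 3, lambda y, x: 1.0, lambda l, k: 1.0
)
```

With fewer than 2**(L-1) frames some segments would be empty, so a real `sub_kernel` would need to handle or forbid that case.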
On the basis of the embodiment shown in fig. 7, fig. 8 is a block diagram of a fifth embodiment of a gesture recognition apparatus according to an exemplary embodiment. Referring to fig. 8, the gesture recognition apparatus 40 further includes:
a third determination module 404 configured to determine, according to the formula [formula not preserved], the similarity $\kappa(Y^i_{lk}, X_{lk})$ between $Y^i_{lk}$ and $X_{lk}$; wherein exp() represents an exponential function, the distance term of the formula represents the distance between $Y^i_{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.
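One plausible reading of this similarity is a radial-basis-style mapping from distance to similarity. The sketch below assumes the form exp(-gamma * d); the exact expression is not preserved in this text, and a squared-distance variant exp(-gamma * d**2) would be equally consistent with the listed symbols. The function name `rbf_similarity` is hypothetical.

```python
import math


def rbf_similarity(distance, gamma):
    """Map a non-negative distance to a similarity in (0, 1].

    Assumed form: exp(-gamma * distance), with gamma playing the role of
    the second preset constant.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-gamma * distance)


# Identical segments (distance 0) give the maximum similarity.
same = rbf_similarity(0.0, 0.5)
# Larger distances give smaller similarities.
far = rbf_similarity(2.0, 0.5)
```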
On the basis of the embodiment shown in fig. 8, fig. 9 is a block diagram of a sixth embodiment of a gesture recognition apparatus according to an exemplary embodiment. Referring to fig. 9, the gesture recognition apparatus 40 further includes:
a fourth determination module 405 configured to determine, according to the formula [formula not preserved], the distance between $Y^i_{lk}$ and $X_{lk}$; wherein the symbols of the formula respectively represent a Euclidean distance function, a first preset sparse affine sequence, and a second preset sparse affine sequence.
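The listed ingredients (a Euclidean distance function and two preset sparse affine sequences) suggest a distance between one point from each video matrix, where each point is a combination of that matrix's columns. The sketch below assumes exactly that reading: the coefficient sequences `a` and `b` are given in advance, sum to 1, and are mostly zero. How the sequences are chosen is not stated in this text, and the names `combine` and `affine_point_distance` are hypothetical.

```python
import math


def combine(columns, coeffs):
    """Return the weighted combination of columns with the given coefficients."""
    dim = len(columns[0])
    return [
        sum(c * col[r] for col, c in zip(columns, coeffs)) for r in range(dim)
    ]


def affine_point_distance(y_cols, a, x_cols, b):
    """Euclidean distance between the points Y*a and X*b.

    `a` and `b` stand in for the first and second preset sparse affine
    sequences (assumed here to be coefficient vectors that sum to 1).
    """
    return math.dist(combine(y_cols, a), combine(x_cols, b))


# Toy segments with two frames each and 2-dimensional features.
toy_y = [[0.0, 0.0], [2.0, 0.0]]
toy_x = [[1.0, 3.0], [1.0, 5.0]]
toy_distance = affine_point_distance(toy_y, [0.5, 0.5], toy_x, [1.0, 0.0])
```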
The gesture recognition apparatus provided by any one of the above embodiments may be used to execute the technical solution of any one of the gesture recognition method embodiments of the present disclosure, and the implementation principles and technical effects are similar: a gesture video to be recognized is acquired; then, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, the gesture video set to which the gesture video to be recognized belongs is determined, so as to determine that the gesture to be recognized is the preset gesture in the preset gesture videos included in that gesture video set. It can be seen that, in contrast to the prior art, another implementation of gesture recognition is provided in the embodiments of the present disclosure.
The internal functional modules and the structural schematic of the gesture recognition apparatus are described above, and the execution subject of the gesture recognition apparatus should be a terminal device, and fig. 10 is a block diagram of a terminal device according to an exemplary embodiment. Referring to fig. 10, the terminal device includes: a processor and a memory for storing processor-executable instructions;
the processor is configured to:
acquiring a gesture video to be recognized;
determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
In the above embodiments of the terminal device, it should be understood that the Processor may be a Central Processing Unit (CPU), other general-purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, and the aforementioned memory may be a read-only memory (ROM), a Random Access Memory (RAM), a flash memory, a hard disk, or a solid state disk. The steps of a method disclosed in connection with the embodiments of the present disclosure may be embodied directly in a hardware processor, or in a combination of hardware and software modules.
Fig. 11 is a block diagram illustrating a terminal device 1200 according to an example embodiment. Referring to fig. 11, terminal device 1200 may include one or more of the following components: processing component 1202, memory 1204, power component 1206, multimedia component 1208, audio component 1210, input/output (I/O) interface 1212, sensor component 1214, and communications component 1216.
The processing component 1202 generally controls the overall operation of the terminal device 1200, such as operations associated with display, data communication, multimedia operations, and recording operations. The processing component 1202 may include one or more processors 1220 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1202 can include one or more modules that facilitate interaction between the processing component 1202 and other components. For example, the processing component 1202 can include a multimedia module to facilitate interaction between the multimedia component 1208 and the processing component 1202.
The memory 1204 is configured to store various types of data to support operation at the terminal device 1200. Examples of such data include instructions for any application or method operating on terminal device 1200, various types of data, messages, pictures, videos, and so forth. The memory 1204 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 1206 provides power to the various components of the terminal device 1200. The power component 1206 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device 1200.
The multimedia component 1208 includes a screen providing an output interface between the terminal device 1200 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The audio component 1210 is configured to output and/or input audio signals. For example, the audio component 1210 includes a microphone (MIC) configured to receive an external audio signal when the terminal device 1200 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may further be stored in the memory 1204 or transmitted via the communication component 1216. In some embodiments, the audio component 1210 further includes a speaker for outputting audio signals.
The I/O interface 1212 provides an interface between the processing component 1202 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc.
The sensor component 1214 includes one or more sensors for providing state assessments of various aspects of the terminal device 1200. For example, the sensor component 1214 may detect the open/closed state of the terminal device 1200 and the relative positioning of components, such as the display and keypad of the terminal device 1200. The sensor component 1214 may also detect a change in position of the terminal device 1200 or of a component of the terminal device 1200, the presence or absence of user contact with the terminal device 1200, the orientation or acceleration/deceleration of the terminal device 1200, and a change in temperature of the terminal device 1200. The sensor component 1214 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor component 1214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
Communications component 1216 is configured to facilitate communications between terminal device 1200 and other devices in a wired or wireless manner. The terminal device 1200 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1216 receives the broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 1216 also includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal device 1200 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as memory 1204 comprising instructions, executable by processor 1220 of terminal device 1200 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processing component of a terminal device 1200, enable the terminal device 1200 to perform a gesture recognition method, the method comprising:
acquiring a gesture video to be recognized;
determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
Optionally, the determining, according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database, a gesture video set to which the gesture video to be recognized belongs includes:
performing an acquisition operation, the acquisition operation comprising: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
when the support vector machine is larger than 0, determining that the gesture video to be recognized belongs to the first gesture video set;
when the support vector machine is not greater than 0, taking any one of the other types of gesture video sets in the preset database as a new first gesture video set, and returning to perform the acquisition operation, so as to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video, until the new support vector machine is greater than 0, whereupon it is determined that the gesture video to be recognized belongs to the new first gesture video set.
Optionally, the obtaining a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video includes:
determining the label factors of the first type of gesture videos and the second type of gesture videos; the label factor of the first type of gesture video is equal to a first preset value belonging to a positive number, the label factor of the second type of gesture video is equal to a second preset value belonging to a negative number, and the absolute values of the first preset value and the second preset value are the same;
and acquiring a support vector machine of the gesture video to be recognized according to the label factors of the first type of gesture video, the label factors of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
Optionally, the obtaining a support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video includes:
determining a gesture video matrix X to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the m-th column of the gesture video matrix X to be recognized contains: the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, where m is an integer greater than or equal to 1;
determining a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains: the feature sequence corresponding to the m-th frame of preset gesture image of the i-th preset gesture video;
determining a support vector machine f(X) of the gesture video to be recognized according to the formula $f(X) = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$; wherein sign() represents a sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$, and b represents a first preset constant; if the i-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the i-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
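The construction of a gesture video matrix from per-frame feature sequences can be sketched as follows. The list-of-columns representation and the names `build_video_matrix` and `as_rows` are assumptions for illustration; how the feature sequence of a frame is extracted is outside this sketch, so the toy frames below simply carry made-up feature values.

```python
def build_video_matrix(frame_features):
    """Return the video matrix as a list of columns, one per frame.

    Column m (0-based here) is the feature sequence of frame m. All frames
    must share the same feature length.
    """
    columns = [list(features) for features in frame_features]
    if not columns:
        raise ValueError("a gesture video needs at least one frame")
    dim = len(columns[0])
    if any(len(col) != dim for col in columns):
        raise ValueError("all feature sequences must have the same length")
    return columns


def as_rows(columns):
    """Transpose the column list into conventional row-major form."""
    return [list(row) for row in zip(*columns)]


# Toy video with three frames and two features per frame.
toy_matrix = build_video_matrix([(1, 2), (3, 4), (5, 6)])
toy_rows = as_rows(toy_matrix)
```

Keeping frames as columns makes it straightforward to cut a video into contiguous runs of frames later on.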
Optionally, the method further comprises:
according to the formula $\kappa(Y_i, X) = \sum_{l}\sum_{k=1}^{K} \mu_{lk}\, \kappa(Y^i_{lk}, X_{lk})$, determining the similarity $\kappa(Y_i, X)$ between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$;
wherein the gesture video matrix X to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are each divided, according to the same division rule, into a time pyramid comprising L layers; K represents the number of video matrices included in each layer of the time pyramid, $K = 2^l$; l represents the l-th layer and is a natural number greater than or equal to 0; k represents the k-th video matrix of each layer; $X_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the gesture video matrix X to be recognized; $Y^i_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y^i_{lk}, X_{lk})$ represents the similarity between $Y^i_{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
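As a quick count of the work involved, assume the layers are indexed l = 0, 1, ..., L-1 (an assumption; the text states only that l is a natural number greater than or equal to 0 and that the pyramid has L layers). Since layer l contains 2^l video matrices, the number of segment pairs compared for one pair of videos is then:

```latex
\sum_{l=0}^{L-1} 2^{l} = 2^{L} - 1,
\qquad \text{for example, } L = 3:\quad 1 + 2 + 4 = 7 .
```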
Optionally, the method further comprises:
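according to the formula [formula not preserved], determining the similarity $\kappa(Y^i_{lk}, X_{lk})$ between $Y^i_{lk}$ and $X_{lk}$; wherein exp() represents an exponential function, the distance term of the formula represents the distance between $Y^i_{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.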
Optionally, the method further comprises:
according to the formula [formula not preserved], determining the distance between $Y^i_{lk}$ and $X_{lk}$; wherein the symbols of the formula respectively represent a Euclidean distance function, a first preset sparse affine sequence, and a second preset sparse affine sequence.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (15)
1. A gesture recognition method, comprising:
acquiring a gesture video to be recognized;
determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
2. The method according to claim 1, wherein the determining a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database comprises:
performing an acquisition operation, the acquisition operation comprising: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
when the support vector machine is larger than 0, determining that the gesture video to be recognized belongs to the first gesture video set;
when the support vector machine is not greater than 0, taking any one of the other types of gesture video sets in the preset database as a new first gesture video set, and returning to perform the acquisition operation, so as to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video, until the new support vector machine is greater than 0, whereupon it is determined that the gesture video to be recognized belongs to the new first gesture video set.
3. The method according to claim 2, wherein the obtaining a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video comprises:
determining the label factors of the first type of gesture videos and the second type of gesture videos; the label factor of the first type of gesture video is equal to a first preset value belonging to a positive number, the label factor of the second type of gesture video is equal to a second preset value belonging to a negative number, and the absolute values of the first preset value and the second preset value are the same;
and acquiring a support vector machine of the gesture video to be recognized according to the label factors of the first type of gesture video, the label factors of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
4. The method according to claim 3, wherein the obtaining a support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video, and the similarity between the gesture video to be recognized and each preset gesture video comprises:
determining a gesture video matrix X to be recognized corresponding to the gesture video to be recognized according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the m-th column of the gesture video matrix X to be recognized contains: the feature sequence corresponding to the m-th frame of gesture image to be recognized in the gesture video to be recognized, where m is an integer greater than or equal to 1;
determining a preset gesture video matrix $Y_i$ corresponding to the i-th preset gesture video according to the feature sequence of each frame of preset gesture image in the i-th preset gesture video in the preset database; wherein i is an integer greater than or equal to 1 and less than or equal to N, N is the number of preset gesture videos included in the preset database, and the m-th column of the preset gesture video matrix $Y_i$ contains: the feature sequence corresponding to the m-th frame of preset gesture image of the i-th preset gesture video;
determining a support vector machine f(X) of the gesture video to be recognized according to the formula $f(X) = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$; wherein sign() represents a sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the i-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$, and b represents a first preset constant; if the i-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the i-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
5. The method of claim 4, further comprising:
according to the formula $\kappa(Y_i, X) = \sum_{l}\sum_{k=1}^{K} \mu_{lk}\, \kappa(Y^i_{lk}, X_{lk})$, determining the similarity $\kappa(Y_i, X)$ between the gesture video matrix X to be recognized and the preset gesture video matrix $Y_i$;
wherein the gesture video matrix X to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are each divided, according to the same division rule, into a time pyramid comprising L layers; K represents the number of video matrices included in each layer of the time pyramid, $K = 2^l$; l represents the l-th layer and is a natural number greater than or equal to 0; k represents the k-th video matrix of each layer; $X_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the gesture video matrix X to be recognized; $Y^i_{lk}$ represents the k-th video matrix of the l-th layer of the time pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y^i_{lk}, X_{lk})$ represents the similarity between $Y^i_{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
6. The method of claim 5, further comprising:
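according to the formula [formula not preserved], determining the similarity $\kappa(Y^i_{lk}, X_{lk})$ between $Y^i_{lk}$ and $X_{lk}$; wherein exp() represents an exponential function, the distance term of the formula represents the distance between $Y^i_{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.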
7. The method of claim 6, further comprising:
according to the formula [formula not preserved], determining the distance between $Y^i_{lk}$ and $X_{lk}$; wherein the symbols of the formula respectively represent a Euclidean distance function, a first preset sparse affine sequence, and a second preset sparse affine sequence.
8. A gesture recognition apparatus, comprising:
an acquisition module configured to acquire a gesture video to be recognized;
a first determining module configured to determine a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and a preset gesture video in a preset database; the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
9. The apparatus of claim 8, wherein the first determining module comprises:
an acquisition submodule configured to perform an acquisition operation, the acquisition operation including: according to the type of a first gesture video set in the preset database, dividing the gesture video set in the preset database into a first type of gesture video and a second type of gesture video; and acquiring a support vector machine of the gesture video to be recognized according to the first type of gesture video, the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video; the first gesture video set is any type of gesture video set in the preset database; the preset gesture videos included in the first gesture video set belong to the first type of gesture videos, and the preset gesture videos included in other gesture video sets except the first gesture video set in the preset database belong to the second type of gesture videos;
a first determining submodule configured to determine that the gesture video to be recognized belongs to the first gesture video set when the support vector machine is greater than 0;
a second determining submodule configured to, when the support vector machine is not greater than 0, take any one of the other types of gesture video sets in the preset database as a new first gesture video set, and return to the acquisition submodule to perform the acquisition operation, so as to obtain a new first type of gesture video and a new second type of gesture video according to the new first gesture video set, and to obtain a new support vector machine of the gesture video to be recognized according to the new first type of gesture video, the new second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video, until the new support vector machine is greater than 0, whereupon it is determined that the gesture video to be recognized belongs to the new first gesture video set.
10. The apparatus of claim 9, wherein the acquisition submodule comprises:
a determining unit configured to determine a label factor of the first type of gesture video and a label factor of the second type of gesture video; wherein the label factor of the first type of gesture video is equal to a first preset value that is a positive number, the label factor of the second type of gesture video is equal to a second preset value that is a negative number, and the absolute values of the first preset value and the second preset value are the same;
an acquisition unit configured to acquire the support vector machine of the gesture video to be recognized according to the label factor of the first type of gesture video, the label factor of the second type of gesture video and the similarity between the gesture video to be recognized and each preset gesture video.
11. The apparatus of claim 10, wherein the acquisition unit is specifically configured to:
determine a gesture video matrix $X$ to be recognized, corresponding to the gesture video to be recognized, according to the feature sequence of each frame of gesture image to be recognized in the gesture video to be recognized; wherein the $m$-th column of the gesture video matrix $X$ to be recognized contains the feature sequence of the $m$-th frame of gesture image to be recognized in the gesture video to be recognized, and $m$ is an integer greater than or equal to 1;
determine a preset gesture video matrix $Y_i$ corresponding to the $i$-th preset gesture video according to the feature sequence of each frame of preset gesture image in the $i$-th preset gesture video in the preset database; wherein $i$ is an integer greater than or equal to 1 and less than or equal to $N$, $N$ is the number of preset gesture videos included in the preset database, and the $m$-th column of the preset gesture video matrix $Y_i$ contains the feature sequence of the $m$-th frame of preset gesture image of the $i$-th preset gesture video;
determine the support vector machine $f(X)$ of the gesture video to be recognized according to the formula $f(X) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i \kappa(Y_i, X) + b\right)$, wherein $\operatorname{sign}(\cdot)$ represents a sign function, $\alpha_i$ represents a first weighting coefficient, $y_i$ represents the label factor of the $i$-th preset gesture video, $\kappa(Y_i, X)$ represents the similarity between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$, and $b$ represents a first preset constant; wherein if the $i$-th preset gesture video belongs to the first type of gesture video, $y_i$ is equal to the first preset value, and if the $i$-th preset gesture video belongs to the second type of gesture video, $y_i$ is equal to the second preset value.
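Claims 10 and 11 together describe a kernel SVM decision: label factors of equal magnitude and opposite sign, weighted by coefficients and by the similarity of each preset gesture video to the video being recognized. The sketch below assumes the usual kernel-SVM form $f(X) = \operatorname{sign}(\sum_i \alpha_i y_i \kappa(Y_i, X) + b)$. The function names, the default magnitude of 1.0 and the toy numbers are illustrative assumptions, not values from the patent.

```python
from typing import List, Sequence


def label_factors(in_first_type: Sequence[bool], magnitude: float = 1.0) -> List[float]:
    """Label factor per preset video: +magnitude for the first type, -magnitude otherwise."""
    return [magnitude if flag else -magnitude for flag in in_first_type]


def svm_decision(
    alphas: Sequence[float],
    labels: Sequence[float],
    similarities: Sequence[float],
    bias: float,
) -> int:
    """Sign of sum_i alpha_i * y_i * kappa(Y_i, X) + b, returned as -1, 0 or 1."""
    score = sum(a * y * k for a, y, k in zip(alphas, labels, similarities)) + bias
    return (score > 0) - (score < 0)


# Toy usage: three preset videos, only the first belongs to the first type.
labels = label_factors([True, False, False])
# Hypothetical similarities kappa(Y_i, X) between each preset video and the input.
decision = svm_decision([0.5, 0.5, 0.5], labels, [0.9, 0.1, 0.2], bias=0.0)
```

Here the input is most similar to the positively labelled preset video, so the decision is positive and the input would be assigned to the first gesture video set.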
12. The apparatus of claim 11, further comprising:
a second determination module configured to determine the similarity $\kappa(Y_i, X)$ between the gesture video matrix $X$ to be recognized and the preset gesture video matrix $Y_i$ according to the formula $\kappa(Y_i, X) = \sum_{l=0}^{L-1} \sum_{k=1}^{K} \mu_{lk}\, \kappa(Y^{i}_{lk}, X_{lk})$;
wherein $K$ represents the number of video matrices included in each layer of a temporal pyramid comprising $L$ layers; the gesture video matrix $X$ to be recognized and the preset gesture video matrix corresponding to each preset gesture video in the preset database are divided, according to the same division rule, into the video matrices of each layer of the temporal pyramid; $K = 2^{l}$; $l$ denotes the $l$-th layer and is a natural number greater than or equal to 0; $k$ denotes the $k$-th video matrix of each layer; $X_{lk}$ represents the $k$-th video matrix of the $l$-th layer of the temporal pyramid corresponding to the gesture video matrix $X$ to be recognized; $Y^{i}_{lk}$ represents the $k$-th video matrix of the $l$-th layer of the temporal pyramid corresponding to the preset gesture video matrix $Y_i$; $\kappa(Y^{i}_{lk}, X_{lk})$ represents the similarity between $Y^{i}_{lk}$ and $X_{lk}$; and $\mu_{lk}$ represents a second weighting coefficient.
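The temporal pyramid of claim 12 can be sketched as follows: at layer $l$ each video is cut into $2^l$ segments, matching segments of the two videos are compared, and the per-segment similarities are combined with weights. The claim only requires that both videos use the same division rule, so the contiguous near-equal split along the time axis used here is an assumption, as are the names `split_level`, `pyramid_kernel` and the uniform default weight.

```python
from typing import Callable, List, Sequence

Frames = Sequence[Sequence[float]]  # one feature sequence per frame (a matrix column)


def split_level(frames: Frames, level: int) -> List[Frames]:
    """Split frames into 2**level contiguous, near-equal temporal segments."""
    parts = 2 ** level
    n = len(frames)
    return [frames[(k * n) // parts:((k + 1) * n) // parts] for k in range(parts)]


def pyramid_kernel(
    y: Frames,
    x: Frames,
    levels: int,
    base_kernel: Callable[[Frames, Frames], float],
    weight: Callable[[int, int], float] = lambda l, k: 1.0,
) -> float:
    """Weighted sum over layers l and segments k of base_kernel(Y_lk, X_lk)."""
    total = 0.0
    for l in range(levels):
        segments = zip(split_level(y, l), split_level(x, l))
        for k, (y_lk, x_lk) in enumerate(segments, start=1):
            total += weight(l, k) * base_kernel(y_lk, x_lk)
    return total


# Toy usage: 8 frames with 1-dimensional features and a constant base kernel.
frames = [[float(t)] for t in range(8)]
segment_count = pyramid_kernel(frames, frames, 3, lambda a, b: 1.0)
```

With a constant base kernel and unit weights the result simply counts the segments, 1 + 2 + 4 for three layers, which is a quick way to check the partitioning.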
13. The apparatus of claim 12, further comprising:
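a third determination module configured to determine the similarity $\kappa(Y^{i}_{lk}, X_{lk})$ between the video matrices $Y^{i}_{lk}$ and $X_{lk}$ according to a formula [formula missing in source], wherein $\exp(\cdot)$ represents an exponential function, $d(Y^{i}_{lk}, X_{lk})$ represents the distance between $Y^{i}_{lk}$ and $X_{lk}$, and $\gamma$ represents a second preset constant.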
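The formula of claim 13 did not survive in this text; only its ingredients are listed (an exponential function, a distance and a constant $\gamma$). A common kernel built from exactly those ingredients is $\exp(-\gamma\, d)$, and the sketch below assumes that form. The sign and placement of $\gamma$ are assumptions, and `exp_similarity` is a hypothetical name.

```python
import math


def exp_similarity(distance: float, gamma: float = 1.0) -> float:
    """Map a non-negative distance to a similarity in (0, 1] via exp(-gamma * distance).

    Assumed form: the patent's exact expression is not available in this text.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return math.exp(-gamma * distance)


# Identical segments (distance 0) get similarity 1; similarity decays with distance.
near = exp_similarity(0.0)
far = exp_similarity(2.0, gamma=0.5)
```

Under this assumption a larger $\gamma$ makes the similarity fall off faster as the distance grows.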
14. The apparatus of claim 13, further comprising:
a fourth determination module configured to determine the distance $d(Y^{i}_{lk}, X_{lk})$ between the video matrices $Y^{i}_{lk}$ and $X_{lk}$ according to a formula [formula missing in source], wherein the terms of the formula represent, respectively, a Euclidean distance function, a first preset sparse affine sequence and a second preset sparse affine sequence.
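The formula of claim 14 is also missing from this text. One plausible reading, offered here only as an assumption, is that each sparse affine sequence is a vector of per-frame coefficients, that each video matrix is combined with its coefficients into a single point, and that the distance is the Euclidean distance between the two points, $\lVert Y^{i}_{lk} a - X_{lk} b \rVert_2$. The names `affine_point` and `affine_distance` are hypothetical, and how the coefficients are preset is not covered.

```python
import math
from typing import List, Sequence

Columns = Sequence[Sequence[float]]  # one feature sequence per frame (a matrix column)


def affine_point(columns: Columns, coeffs: Sequence[float]) -> List[float]:
    """Combine frame feature columns into one point: sum_j coeffs[j] * columns[j]."""
    if len(columns) != len(coeffs):
        raise ValueError("one coefficient per column is required")
    dim = len(columns[0])
    return [sum(c * col[d] for c, col in zip(coeffs, columns)) for d in range(dim)]


def affine_distance(
    y_cols: Columns, a: Sequence[float], x_cols: Columns, b: Sequence[float]
) -> float:
    """Euclidean distance between the points Y a and X b (assumed reading of claim 14)."""
    p = affine_point(y_cols, a)
    q = affine_point(x_cols, b)
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Toy usage: two 2-frame segments with 2-dimensional features.
y_segment = [[0.0, 0.0], [2.0, 0.0]]
x_segment = [[1.0, 3.0], [1.0, 5.0]]
dist = affine_distance(y_segment, [0.5, 0.5], x_segment, [0.5, 0.5])
```

In the toy run the two combined points are (1, 0) and (1, 4), so the distance is 4.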
15. A terminal device, comprising: a processor and a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquire a gesture video to be recognized;
determine a gesture video set to which the gesture video to be recognized belongs according to the similarity between the gesture video to be recognized and the preset gesture videos in a preset database; wherein the preset database comprises at least one type of gesture video set, and each type of gesture video set comprises at least one preset gesture video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710398580.8A CN107133361B (en) | 2017-05-31 | 2017-05-31 | Gesture recognition method and device and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107133361A (en) | 2017-09-05 |
CN107133361B (en) | 2020-02-07 |
Family
ID=59734033
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710398580.8A Active CN107133361B (en) | 2017-05-31 | 2017-05-31 | Gesture recognition method and device and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107133361B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102855488A (en) * | 2011-06-30 | 2013-01-02 | 北京三星通信技术研究有限公司 | Three-dimensional gesture recognition method and system |
CN103092332A (en) * | 2011-11-08 | 2013-05-08 | 苏州中茵泰格科技有限公司 | Digital image interactive method and system of television |
CN103745228A (en) * | 2013-12-31 | 2014-04-23 | 清华大学 | Dynamic gesture identification method on basis of Frechet distance |
CN104299004A (en) * | 2014-10-23 | 2015-01-21 | 浙江大学 | Hand gesture recognition method based on multi-feature fusion and fingertip detecting |
US20150138078A1 (en) * | 2013-11-18 | 2015-05-21 | Eyal Krupka | Hand pose recognition using boosted look up tables |
US20150177842A1 (en) * | 2013-12-23 | 2015-06-25 | Yuliya Rudenko | 3D Gesture Based User Authorization and Device Control Methods |
Non-Patent Citations (5)
Title |
---|
ESHED OHN-BAR ET AL: "Joint Angles Similarities and HOG2 for Action Recognition", 《THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 * |
NASSER H. DARDAS ET AL: "Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques", 《IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT》 * |
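REN YU ET AL: "Gesture Recognition Based on HOG Features and SVM", 《Bulletin of Science and Technology》 * |
SUN YU: "Research on Gesture Tracking and Recognition Algorithms Based on Computer Vision", 《China Master's Theses Full-text Database, Information Science and Technology》 * |
YUAN FANG: "Static Gesture Recognition Based on HOG Features and Support Vector Machine", 《China Master's Theses Full-text Database, Information Science and Technology》 * |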
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108022543A (en) * | 2017-11-27 | 2018-05-11 | 深圳中科呼图电子商务有限公司 | A kind of advertisement autonomous demenstration method, system and advertisement machine and application |
CN108268835A (en) * | 2017-12-28 | 2018-07-10 | 努比亚技术有限公司 | sign language interpretation method, mobile terminal and computer readable storage medium |
CN108596079A (en) * | 2018-04-20 | 2018-09-28 | 歌尔科技有限公司 | Gesture identification method, device and electronic equipment |
CN108596079B (en) * | 2018-04-20 | 2021-06-15 | 歌尔光学科技有限公司 | Gesture recognition method and device and electronic equipment |
CN109284689A (en) * | 2018-08-27 | 2019-01-29 | 苏州浪潮智能软件有限公司 | A method of In vivo detection is carried out using gesture identification |
Also Published As
Publication number | Publication date |
---|---|
CN107133361B (en) | 2020-02-07 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11532180B2 (en) | Image processing method and device and storage medium | |
CN106651955B (en) | Method and device for positioning target object in picture | |
JP7110412B2 (en) | LIFE DETECTION METHOD AND DEVICE, ELECTRONIC DEVICE, AND STORAGE MEDIUM | |
US10007841B2 (en) | Human face recognition method, apparatus and terminal | |
US10452890B2 (en) | Fingerprint template input method, device and medium | |
US12008167B2 (en) | Action recognition method and device for target object, and electronic apparatus | |
US9959484B2 (en) | Method and apparatus for generating image filter | |
US20210224592A1 (en) | Method and device for training image recognition model, and storage medium | |
CN106648063B (en) | Gesture recognition method and device | |
CN106557759B (en) | Signpost information acquisition method and device | |
TWI757668B (en) | Network optimization method and device, image processing method and device, storage medium | |
WO2020048392A1 (en) | Application virus detection method, apparatus, computer device, and storage medium | |
WO2020114236A1 (en) | Keypoint detection method and apparatus, electronic device, and storage medium | |
CN107133361B (en) | Gesture recognition method and device and terminal equipment | |
CN107463903B (en) | Face key point positioning method and device | |
CN105354560A (en) | Fingerprint identification method and device | |
RU2632578C2 (en) | Method and device of characteristic extraction | |
CN113486830A (en) | Image processing method and device, electronic equipment and storage medium | |
US20160350584A1 (en) | Method and apparatus for providing contact card | |
CN104077597A (en) | Image classifying method and device | |
CN107977636B (en) | Face detection method and device, terminal and storage medium | |
CN106372663B (en) | Construct the method and device of disaggregated model | |
US20170147896A1 (en) | Method, device, and storage medium for feature extraction | |
CN107729886B (en) | Method and device for processing face image | |
TWI770531B (en) | Face recognition method, electronic device and storage medium thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |