CN111340090B - Image feature comparison method and device, equipment and computer readable storage medium - Google Patents

Image feature comparison method and device, equipment and computer readable storage medium

Info

Publication number
CN111340090B
CN111340090B
Authority
CN
China
Prior art keywords
video
image
weights
image features
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010107969.4A
Other languages
Chinese (zh)
Other versions
CN111340090A (en)
Inventor
李玺
吴昊潜
田健
李斌
吴飞
董霖
叶新江
方毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Merit Interactive Co Ltd
Original Assignee
Zhejiang University ZJU
Merit Interactive Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU, Merit Interactive Co Ltd filed Critical Zhejiang University ZJU
Priority to CN202010107969.4A priority Critical patent/CN111340090B/en
Publication of CN111340090A publication Critical patent/CN111340090A/en
Application granted granted Critical
Publication of CN111340090B publication Critical patent/CN111340090B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G06V40/25 Recognition of walking or running movements, e.g. gait recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Social Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to an image feature comparison method, a gait recognition method, and corresponding devices. The method comprises the following steps: calculating the weight of each image feature by using a weight calculation module obtained by joint training with the image feature extraction neural network, and weighting each image feature to obtain the distance between the compared videos. The method can improve the precision of gait recognition results.

Description

Image feature comparison method and device, equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular to an image feature comparison method, a gait recognition method, and corresponding devices.
Background
In the technical field of video gait recognition and other image processing, image feature comparison is required, namely, the similarity between two images/videos to be compared is determined according to the extracted image features.
Taking the field of video gait recognition as an example, the distance (i.e. similarity) between two videos is calculated through image feature comparison between the videos, and then whether targets (usually pedestrians) in the two videos are the same targets is judged.
In the existing implementation, the Euclidean distances of the same image features between videos are calculated, and the sum of these Euclidean distances is used as the distance between the two videos. However, which image features are robust differs with the image variables. For example, if a pedestrian in one video wears a thick coat and a pedestrian in another video wears a thin coat, the robustness of the image features corresponding to the coat region is poor; extracting and comparing image features according to the foregoing implementation therefore results in lower precision of the gait recognition result.
In order to avoid the problem caused by extracting image features with poor robustness, another existing implementation extracts different image features for different image variable conditions. For example, if pedestrians in one video wear thick coats and pedestrians in the other video wear thin coats, the image features corresponding to the coat region are not extracted. Although this implementation can avoid extracting image features with poor robustness to a certain extent, different image features need to be extracted for the various image variable conditions, and the image feature extraction neural network needs to be trained separately for each condition.
Disclosure of Invention
In order to solve the above technical problems, an image feature comparison method, a gait recognition method, and corresponding devices are provided.
In a first aspect, an embodiment of the present application provides an image feature comparison method, including:
utilizing the image feature extraction neural network to respectively acquire a plurality of image features of the first video and the second video;
respectively determining the distance of the same image characteristic between the first video and the second video;
calculating the weight of the same image feature between the first video and the second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video;
and weighting the plurality of image features by using the weights of the plurality of image features, and determining the distance between the first video and the second video according to the weighted result, wherein the distance between the first video and the second video reflects the similarity between the target image of the first video and the target image of the second video.
In a second aspect, embodiments of the present application provide a computer device comprising a processor and a memory;
the memory is used for storing a program for executing the method of the first aspect;
the processor is configured to execute a program stored in the memory.
In a third aspect, embodiments of the present application provide a gait recognition method, the method including:
the method comprises the steps of utilizing an image feature extraction neural network to obtain a plurality of image features of a first video, and obtaining a plurality of image features of a plurality of second videos respectively;
determining the distance of the same image characteristic between the first video and each second video respectively;
calculating the weight of the same image feature between the first video and each second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video;
for each second video, weighting the plurality of image features by using the respective weights of the plurality of image features, and determining the distance between the first video and the corresponding second video according to the weighting result, wherein the distance between the first video and the second video reflects the gait similarity between the target image of the first video and the target image of the second video;
and selecting a target corresponding to at least one second video as a gait recognition result of the first video according to the distance between the first video and each second video.
In a fourth aspect, embodiments of the present application provide a computer device comprising a processor and a memory;
the memory is used for storing a program for executing the method of the third aspect;
the processor is configured to execute a program stored in the memory.
The technical scheme provided by the embodiment of the application has at least the following technical effects or advantages:
The weight of each image feature is calculated by a weight calculation module obtained by joint training with the image feature extraction neural network, and each image feature is weighted to obtain the distance between the compared videos. Because the weights are calculated according to the target image variables, the same image features can be extracted for different target image variables while the importance of each image feature under each variable is still taken into account; weighting the image features with the calculated weights strengthens the influence of the robust, important features, so the accuracy of gait recognition is improved when the method is applied to gait recognition.
Drawings
FIG. 1 is a flowchart of an image feature comparison method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a gait recognition method according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Some of the processes described in the specification, the claims, and the above figures include a number of operations occurring in a particular order; it should be understood, however, that these processes may include more or fewer operations, and that these operations may be performed sequentially or in parallel.
The embodiment of the invention provides an image feature comparison method which, as shown in fig. 1, comprises the following operations:
and 101, respectively acquiring a plurality of image features of the first video and the second video by utilizing an image feature extraction neural network.
The image features specifically refer to feature vectors of target images in the video.
In the embodiment of the invention, the first video and the second video can be original videos acquired by a camera, or person contour videos obtained after processing. In a person contour video, every frame is a contour image of the person.
The method provided by the embodiment of the invention is applicable to the training stage of the image feature extraction neural network, in which case the first video and the second video are sample videos for training the network. The method is also applicable to an application stage after training of the image feature extraction neural network is completed, such as a gait recognition process, in which case the first video is a video to be processed (e.g., to be recognized), and the second video can be a video to be processed or a video with a known recognition result (e.g., a gait video of a specific person).
Step 102, determining distances of the same image features between the first video and the second video respectively.
By way of example and not limitation, the Euclidean distance of the same image feature is determined. Of course, in practical applications, other known distance measures may be used instead of the Euclidean distance, which is not limited in this embodiment of the invention.
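As a minimal sketch of this step, assuming each video is represented by M image features flattened into D-dimensional vectors (the function name and array layout are illustrative, not taken from the patent):

```python
import numpy as np

def per_feature_distances(feats_a: np.ndarray, feats_b: np.ndarray) -> np.ndarray:
    """Euclidean distance of each image feature between two videos.

    feats_a, feats_b: shape (M, D), i.e. M image features per video,
    each flattened to a D-dimensional vector.
    Returns shape (M,): one distance per image feature.
    """
    return np.linalg.norm(feats_a - feats_b, axis=1)
```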
Step 103, calculating the weight of the same image feature between the first video and the second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video.
In the embodiment of the invention, the target image in a video refers to the image (which may be, but is not limited to, a contour image) of a pedestrian in the video. The target image variable includes at least one of: the shooting angle of the target image, and the coat type of the target image.
In the embodiment of the invention, the image variable of the video is also called as the condition of the video.
The shooting angle may be an exact value or a value interval. If the shooting angle is an exact value, a plurality of shooting angles, for example 0°, 30°, 60°, …, may be predetermined in practical applications; the actual shooting angle of the target image is determined and then matched to the closest predetermined shooting angle. For example, if the actual shooting angle of the target image is 35°, the shooting angle used as the target image variable is 30°.
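A hypothetical helper for the angle-matching example above (the set of predetermined angles is an assumption):

```python
# Assumed set of predetermined shooting angles: 0°, 30°, ..., 180°.
PREDETERMINED_ANGLES = range(0, 181, 30)

def snap_angle(actual_angle: float) -> int:
    """Match a measured shooting angle to the closest predetermined angle."""
    return min(PREDETERMINED_ANGLES, key=lambda a: abs(a - actual_angle))

print(snap_angle(35))  # -> 30, as in the example above
```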
The coat types can be determined according to the scene requirements of the practical application. By way of example, and not by way of limitation, the coat types include: thin coats, thick coats, long coats, short coats, etc.
Step 104, weighting the plurality of image features by using the weights of the plurality of image features, and determining the distance between the first video and the second video according to the weighting result, wherein the distance between the first video and the second video reflects the similarity between the target image of the first video and the target image of the second video.
By way of example and not limitation, the sum of the weighted results is specifically taken as the distance between the first video and the second video.
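Continuing the sketch from step 102 (illustrative, not the patent's code), the weighted sum reduces the M per-feature distances and M weights to a single video-to-video distance:

```python
import numpy as np

def video_distance(distances: np.ndarray, weights: np.ndarray) -> float:
    """Distance between two videos (step 104): the sum of per-feature
    Euclidean distances, each scaled by its weight from the weight
    calculation module. Both inputs have shape (M,)."""
    return float(np.dot(weights, distances))
```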
In practical application, according to the application scene of the image feature comparison result, determining the specific content of the image feature comparison. For example, in a gait recognition scenario, the gait similarity of the target images between videos is compared.
According to the method provided by the embodiment of the invention, the weight of each image feature is calculated by a weight calculation module obtained by joint training with the image feature extraction neural network, and each image feature is weighted to obtain the distance between the compared videos. Because the weights are calculated according to the target image variables, the same image features can be extracted for different target image variables while the importance of each image feature under each variable is still taken into account; weighting the image features with the calculated weights strengthens the influence of the robust, important features, so the accuracy of gait recognition is improved when the method is applied to gait recognition.
The implementation manner of each step of the image feature comparison method provided by the embodiment of the invention is further described below.
There are various implementations of the above step 101.

In the training stage of the image feature extraction neural network, one implementation is to input the first video and the second video into the image feature extraction neural network, respectively, and obtain the plurality of image features of each video directly from the network's output. In another implementation, the first video and the second video are input into the image feature extraction neural network to obtain feature maps of the two videos; the feature map of each video is then divided according to a preset rule, and a pooling operation is carried out on each sub-feature map obtained by the division, yielding the image features corresponding to each sub-feature map of the first and second videos.

In the application stage of the image feature extraction neural network, one implementation is to input the first video into the network to obtain its plurality of image features, and to read the predetermined plurality of image features of the second video from a predetermined storage location. In another implementation, the first video is input into the network to obtain its feature map, the feature map is divided according to a preset rule, a pooling operation is carried out on each sub-feature map obtained by the division to obtain the image features of the first video, and the predetermined plurality of image features of the second video are read from a predetermined storage location. The image features of the second video may be determined in the same manner as those of the first video.
It can be seen that, in either application scenario, the extraction of image features is realized by the image feature extraction neural network. For example, the first video is input into the image feature extraction neural network to obtain a feature map of size N×C×H×W, where N is the number of frames of the first video, C is the number of channels for a single frame, H is the height of a single frame image, and W is the width of a single frame image. The feature map of the first video is divided along the horizontal direction (the H dimension) to obtain M sub-feature maps, and a pooling operation is performed on each sub-feature map to obtain M image features of the first video, each single image feature having size N×C×1. The second video is processed in the same way: its feature map of size N×C×H×W is divided along the horizontal direction into M sub-feature maps, and pooling each sub-feature map yields M image features of the second video, each of size N×C×1.
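As an illustration of this division-and-pooling path (a sketch under stated assumptions: PyTorch, max pooling, and the example shapes below; the patent permits average pooling as well):

```python
import torch

def split_and_pool(feature_map: torch.Tensor, m: int) -> torch.Tensor:
    """Split an N x C x H x W feature map into M sub-feature maps along
    the horizontal (H) dimension, then pool each sub-feature map.

    Returns a tensor of shape M x N x C: M image features, the spatial
    dimensions of each strip collapsed by max pooling.
    """
    strips = feature_map.chunk(m, dim=2)            # M sub-feature maps
    pooled = [s.amax(dim=(2, 3)) for s in strips]   # max-pool each H' x W strip
    return torch.stack(pooled)                      # M x N x C

# e.g. a 30-frame video, 256 channels, 64 x 44 maps, M = 16 strips:
feats = split_and_pool(torch.randn(30, 256, 64, 44), m=16)  # -> 16 x 30 x 256
```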
In a particular embodiment, the image feature extraction neural network includes 6 convolutional layers and 2 pooling layers.
The embodiment of the invention does not limit the pooling operation performed on the sub-feature maps; for example, max pooling or average pooling may be employed.
In the embodiment of the invention, the feature map may be divided equally into M sub-feature maps, or divided according to a predetermined division rule; the embodiment of the invention is not limited in this respect.
It should be noted that, in practical applications, different feature map division manners may also be used for different image feature conditions, which is not limited by the embodiment of the present invention.
Step 102 may be implemented in various ways. For example, the Euclidean distance of the same image feature is determined; the Euclidean distance of the same image feature between videos may be calculated by any existing method for calculating Euclidean distances.
There are various implementations of step 103. A preferred implementation is to acquire a gait energy diagram of the first video and a gait energy diagram of the second video, input both gait energy diagrams into the weight calculation module, and obtain from the module the weights of the same image features between the first video and the second video. The gait energy diagram can be obtained by any existing method, which the embodiment of the invention does not limit. In another implementation, the first video and the second video themselves are input into the weight calculation module, and the weights of the same image features between the two videos are obtained from its output. In yet another implementation, the plurality of image features of the aforementioned first video and the plurality of image features of the second video are input into the weight calculation module, and the weights of the same image features between the two videos are obtained from its output.
For example, if there are M image features, the weight calculation module outputs M weights, each corresponding to one image feature.
The weight calculation module can be implemented in various ways, and the embodiment of the invention is not limited in this respect. By way of example, but not limitation, a mapping from image-variable comparison results to weights is established in advance through training; the image variables of the target images in the two videos (or their gait energy diagrams, or image features) are compared, for example by comparing the shooting angles and/or coat types, and the set of weights corresponding to the comparison result is looked up in the mapping. One way to compare the image variables of the target images in the two videos (or gait energy diagrams, or image features) is to first identify the image variables of the target image in each video (for example, identify the shooting angle and the coat type) and then compare them. Existing image processing techniques can be used to recognize the shooting angle and the coat type.
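A heavily hedged sketch of the lookup-table variant just described; the keys, the number of image features (M = 4), and every weight value below are illustrative assumptions rather than values from the patent, though the relative magnitudes follow the weight relationships stated later (head/foot regions up-weighted across differing angles, coat region down-weighted across differing coat types):

```python
# Assumed comparison-result -> weights mapping; the M = 4 image features
# are ordered (head, coat, legs, feet) purely for illustration.
WEIGHT_TABLE = {
    ("same_angle", "same_coat"): [1.0, 1.0, 1.0, 1.0],
    ("same_angle", "diff_coat"): [1.0, 0.3, 1.0, 1.0],  # down-weight the coat region
    ("diff_angle", "same_coat"): [1.5, 0.8, 0.8, 1.5],  # up-weight head/foot regions
    ("diff_angle", "diff_coat"): [1.5, 0.3, 0.8, 1.5],
}

def lookup_weights(angle_a: int, angle_b: int, coat_a: str, coat_b: str) -> list:
    """Compare the two videos' image variables, then look up the weights."""
    key = ("same_angle" if angle_a == angle_b else "diff_angle",
           "same_coat" if coat_a == coat_b else "diff_coat")
    return WEIGHT_TABLE[key]
```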
In the embodiment of the invention, the joint training mode of the weight calculation module and the image feature extraction neural network is as follows:
determining a first loss function by using a first distance between sample videos, training the image feature extraction neural network by using the first loss function and the sample videos, wherein the first distance between the sample videos is calculated by using the distance of the same image feature between the sample videos;
determining a second loss function using a second distance between the sample videos, and training the weight calculation module with the second loss function and the sample videos, or training both the weight calculation module and the image feature extraction neural network with the second loss function and the sample videos; the second distance between the sample videos is calculated by weighting the distances of the image features with the weights of the same image features between the sample videos.
This joint training approach yields accurate weight results.
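The patent does not give explicit loss formulas for this joint training, so the following PyTorch-style sketch is speculative: the contrastive form of the two losses, the margin value, and the module interfaces are all assumptions, made only to show how the first (unweighted) and second (weighted) distances drive the two objectives:

```python
import torch

def joint_training_step(extractor, weight_module, optimizer, video_a, video_b, same_target):
    """One joint training step over a pair of sample videos (assumed losses)."""
    feats_a = extractor(video_a)             # M x D image features
    feats_b = extractor(video_b)
    d = (feats_a - feats_b).norm(dim=1)      # M per-feature Euclidean distances

    d1 = d.sum()                             # first distance: unweighted sum
    w = weight_module(video_a, video_b)      # M weights for the same image features
    d2 = (w * d).sum()                       # second distance: weighted sum

    # Assumed contrastive form with an arbitrary margin of 1.0: same-target
    # pairs are pulled together, different-target pairs pushed apart.
    def contrastive(dist):
        return dist if same_target else torch.clamp(1.0 - dist, min=0.0)

    loss = contrastive(d1) + contrastive(d2)  # first loss + second loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```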
It should be noted that the implementation between the above steps may be combined arbitrarily, resulting in new embodiments.
On the basis of any of the above method embodiments, the target image variable includes a shooting angle of a target image, when the shooting angles of the target images of the first video and the second video are consistent, a weight of an image feature corresponding to a shoulder image area between the first video and the second video is greater than a weight of other image features, and when the shooting angles of the target images of the first video and the second video are inconsistent, a weight of an image feature corresponding to a head and foot image area between the first video and the second video is greater than a weight of other image features.
On the basis of any of the above method embodiments, the target image variable includes a coat type of the target image, and when the coat types of the target images of the first video and the second video are different, the weights of the image features corresponding to the coat image area between the first video and the second video are smaller than the weights of the other image features.
Based on the same inventive concept, the embodiment of the invention also provides a computer device, which comprises a processor and a memory; the memory is used for storing a program for executing the method according to any of the embodiments; the processor is configured to execute a program stored in the memory.
The embodiment of the invention also provides a gait recognition method, as shown in fig. 2, comprising the following steps:
step 201, obtaining a plurality of image features of a first video by using an image feature extraction neural network, and obtaining a plurality of image features of a plurality of second videos respectively.
In this embodiment, the first video is a video to be identified, and the plurality of second videos are videos (such as sample videos) with known identification results.
For the manner of acquiring the plurality of image features of the first video, refer to the above method embodiment; details are not repeated here. The plurality of image features of each second video are predetermined, and in this step the predetermined image features of the second videos are read.
Step 202, determining the distance of the same image feature between the first video and each second video respectively.
Suppose there are X second videos (V_1, V_2, …, V_X). Determining the distances of the same image features between the first video and each second video yields X groups of distances; each group of distances corresponds to one second video and contains M distances, one for each image feature.
And 203, calculating the weight of the same image feature between the first video and each second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video.
Suppose there are X second videos (V_1, V_2, …, V_X) and M image features. The first video and one second video are input into the weight calculation module as a group each time, giving X groups of output results; each group of output results contains M weights, one for each image feature.
The specific implementation manner may refer to the description of the above method embodiments, and will not be repeated here.
Step 204, for each second video, weighting the plurality of image features by using the weights of the plurality of image features, and determining the distance between the first video and the corresponding second video according to the weighting result, wherein the distance between the first video and the second video reflects the gait similarity between the target image of the first video and the target image of the second video.
The distance between the first video and each second video is calculated separately; for the manner of calculating each such distance, refer to the description of the above method embodiment, which is not repeated here.
Step 205, selecting a target corresponding to at least one second video as a gait recognition result of the first video according to the distance between the first video and each second video.
Preferably, the target corresponding to the second video closest to the first video is used as the gait recognition result of the first video.
Here, the target refers to the person corresponding to the target image.
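Putting steps 202 through 205 together, a compact sketch (the gallery layout and the weight-module interface below are assumptions):

```python
import numpy as np

def recognize_gait(query_feats, gallery, weight_fn):
    """Return the target of the second video closest to the first video.

    query_feats: (M, D) image features of the first (query) video.
    gallery: list of (target_id, feats) pairs for the X second videos.
    weight_fn(a, b): stand-in for the weight calculation module; returns M weights.
    """
    best_id, best_dist = None, float("inf")
    for target_id, feats in gallery:
        d = np.linalg.norm(query_feats - feats, axis=1)  # step 202: M distances
        w = weight_fn(query_feats, feats)                # step 203: M weights
        dist = float(np.dot(w, d))                       # step 204: weighted sum
        if dist < best_dist:
            best_id, best_dist = target_id, dist
    return best_id                                       # step 205: closest target
```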
According to the method provided by the embodiment of the invention, the weight of each image feature is calculated by a weight calculation module obtained by joint training with the image feature extraction neural network, and each image feature is weighted to obtain the distance between the compared videos. Because the weights are calculated according to the target image variables, the same image features can be extracted for different target image variables while the importance of each image feature under each variable is still taken into account; weighting the image features with the calculated weights strengthens the influence of the robust, important features, so the accuracy and precision of gait recognition are improved when the method is applied to gait recognition.
Wherein the calculating, by using the weight calculating module, the weight of the same image feature between the first video and each second video includes:
respectively acquiring a gait energy diagram of the first video and a gait energy diagram of each second video;
and inputting the gait energy diagram of the first video and the gait energy diagram of the single second video into the weight calculation module, and acquiring the weights of the same image features between the first video and the single second video output by the weight calculation module.
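For reference, a gait energy diagram (gait energy image) is conventionally computed as the pixel-wise average of a video's aligned binary silhouette frames; a minimal sketch of that conventional computation (the patent itself leaves the acquisition method open):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Pixel-wise mean of a video's aligned binary silhouette frames
    (each an H x W array of 0/1 values), giving one H x W map in [0, 1]."""
    return np.stack(silhouettes).mean(axis=0)
```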
Based on the same inventive concept, an embodiment of the present invention provides a computer device including a processor and a memory;
the memory is used for storing a program for executing the method of any gait recognition method embodiment;
the processor is configured to execute a program stored in the memory.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
The embodiments described above are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.

Claims (9)

1. An image feature comparison method, the method comprising:
utilizing the image feature extraction neural network to respectively acquire a plurality of image features of the first video and the second video;
respectively determining the distance of the same image characteristic between the first video and the second video;
calculating the weight of the same image feature between the first video and the second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video;
weighting the plurality of image features by using the weights of the plurality of image features, and determining the distance between the first video and the second video according to the weighted result, wherein the distance between the first video and the second video reflects the similarity between the target image of the first video and the target image of the second video;
the target image variables comprise a shooting angle of the target image and a coat type of the target image; when the shooting angles of the target images of the first video and the second video are consistent, the weights of the image features corresponding to the shoulder image area between the first video and the second video are greater than the weights of the other image features; when the shooting angles of the target images of the first video and the second video are inconsistent, the weights of the image features corresponding to the head and foot image areas between the first video and the second video are greater than the weights of the other image features; and when the coat types of the target images of the first video and the second video are different, the weights of the image features corresponding to the coat image area between the first video and the second video are smaller than the weights of the other image features.
2. The method of claim 1, wherein calculating weights for the same image feature between the first video and the second video using a weight calculation module comprises:
respectively acquiring a gait energy diagram of the first video and a gait energy diagram of the second video;
and inputting the gait energy diagram of the first video and the gait energy diagram of the second video into the weight calculation module, and acquiring the weights of the same image features between the first video and the second video output by the weight calculation module.
3. The method of claim 1, wherein the joint training of the weight calculation module and the image feature extraction neural network is as follows:
determining a first loss function by using a first distance between sample videos, training the image feature extraction neural network by using the first loss function and the sample videos, wherein the first distance between the sample videos is calculated by using the distance of the same image feature between the sample videos;
determining a second loss function using a second distance between the sample videos; training the weight calculation module with the second loss function and the sample video, or training the weight calculation module and the image feature extraction neural network with the second loss function and the sample video; the second distance between the sample videos is calculated by weighting the distance between the image features by the weight of the same image features between the sample videos.
4. The method of claim 1, wherein the acquiring the plurality of image features of the first video and the second video, respectively, using the image feature extraction neural network comprises:
inputting the first video into the image feature extraction neural network to obtain a feature map of the first video output by the image feature extraction neural network, wherein the size of the feature map of the first video is N×C×H×W, N is the number of frames of the first video, C is the number of channels of a single frame of the first video, H is the height of a single frame image in the first video, and W is the width of a single frame image in the first video; dividing the feature map of the first video along the horizontal direction to obtain M sub-feature maps; and performing a pooling operation on each sub-feature map of the first video to obtain M image features of the first video, wherein the size of a single image feature of the first video is N×C×1;
inputting the second video into the image feature extraction neural network to obtain a feature map of the second video output by the image feature extraction neural network, wherein the size of the feature map of the second video is N×C×H×W, N is the number of frames of the second video, C is the number of channels of a single frame of the second video, H is the height of a single frame image in the second video, and W is the width of a single frame image in the second video; dividing the feature map of the second video along the horizontal direction to obtain M sub-feature maps; and performing a pooling operation on each sub-feature map of the second video to obtain M image features of the second video, wherein the size of a single image feature of the second video is N×C×1.
5. The method according to any one of claims 1 to 4, wherein:
the distance between the first video and the second video is the sum of weighted results;
and/or,
the distance of the same image characteristic between the first video and the second video is Euclidean distance;
and/or,
the distance between the first video and the second video reflects a gait similarity between the target image of the first video and the target image of the second video.
6. A computer device comprising a processor and a memory;
the memory is used for storing a program for executing the method of any one of claims 1 to 5;
the processor is configured to execute a program stored in the memory.
7. A gait recognition method, the method comprising:
the method comprises the steps of utilizing an image feature extraction neural network to obtain a plurality of image features of a first video, and obtaining a plurality of image features of a plurality of second videos respectively;
determining the distance of the same image characteristic between the first video and each second video respectively;
calculating the weight of the same image feature between the first video and each second video by using a weight calculation module, wherein the weight calculation module is obtained by training in combination with the image feature extraction neural network, and the weight calculation module is used for calculating the weight of the image feature according to the target image variables of the first video and the second video;
for each second video, weighting the plurality of image features by using the respective weights of the plurality of image features, and determining the distance between the first video and the corresponding second video according to the weighting result, wherein the distance between the first video and the second video reflects the gait similarity between the target image of the first video and the target image of the second video;
selecting a target corresponding to at least one second video as a gait recognition result of the first video according to the distance between the first video and each second video;
the target image variables comprise a shooting angle of the target image and a coat type of the target image; when the shooting angles of the target images of the first video and the second video are consistent, the weights of the image features corresponding to the shoulder image area between the first video and the second video are greater than the weights of the other image features; when the shooting angles of the target images of the first video and the second video are inconsistent, the weights of the image features corresponding to the head and foot image areas between the first video and the second video are greater than the weights of the other image features; and when the coat types of the target images of the first video and the second video are different, the weights of the image features corresponding to the coat image area between the first video and the second video are smaller than the weights of the other image features.
8. The method of claim 7, wherein calculating weights for the same image feature between the first video and each second video using a weight calculation module comprises:
respectively acquiring a gait energy diagram of the first video and a gait energy diagram of each second video;
and inputting the gait energy diagram of the first video and the gait energy diagram of the single second video into the weight calculation module, and acquiring the weights of the same image features between the first video and the single second video output by the weight calculation module.
9. A computer device comprising a processor and a memory;
the memory is used for storing a program for executing the method of claim 7 or 8;
the processor is configured to execute a program stored in the memory.
CN202010107969.4A 2020-02-21 2020-02-21 Image feature comparison method and device, equipment and computer readable storage medium Active CN111340090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107969.4A CN111340090B (en) 2020-02-21 2020-02-21 Image feature comparison method and device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010107969.4A CN111340090B (en) 2020-02-21 2020-02-21 Image feature comparison method and device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111340090A CN111340090A (en) 2020-06-26
CN111340090B true CN111340090B (en) 2023-08-01

Family

ID=71185381

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107969.4A Active CN111340090B (en) 2020-02-21 2020-02-21 Image feature comparison method and device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111340090B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017101434A1 (en) * 2015-12-16 2017-06-22 深圳大学 Human body target re-identification method and system among multiple cameras
WO2018121690A1 (en) * 2016-12-29 2018-07-05 北京市商汤科技开发有限公司 Object attribute detection method and device, neural network training method and device, and regional detection method and device
CN110008815A (en) * 2019-01-25 2019-07-12 平安科技(深圳)有限公司 The generation method and device of recognition of face Fusion Model
WO2019196626A1 (en) * 2018-04-12 2019-10-17 腾讯科技(深圳)有限公司 Media processing method and related apparatus
CN110807401A (en) * 2019-10-29 2020-02-18 腾讯科技(深圳)有限公司 User identity identification and multi-user card punching method, device, storage medium and equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102122354B (en) * 2011-03-15 2013-03-20 上海交通大学 Adaptive characteristic block selection-based gait identification method
WO2016065534A1 (en) * 2014-10-28 2016-05-06 中国科学院自动化研究所 Deep learning-based gait recognition method
KR102288280B1 (en) * 2014-11-05 2021-08-10 삼성전자주식회사 Device and method to generate image using image learning model
CN108230291B (en) * 2017-03-30 2020-09-29 北京市商汤科技开发有限公司 Object recognition system training method, object recognition method, device and electronic equipment
CN108875767A (en) * 2017-12-07 2018-11-23 北京旷视科技有限公司 Method, apparatus, system and the computer storage medium of image recognition
CN110032976B (en) * 2019-04-16 2021-02-05 中国人民解放军国防科技大学 Mask processing based novel gait energy map acquisition and identity recognition method
CN110569725B (en) * 2019-08-05 2022-05-20 华中科技大学 Gait recognition system and method for deep learning based on self-attention mechanism

Also Published As

Publication number Publication date
CN111340090A (en) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 310012 Room 418, West District, Building A, 525 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province

Applicant after: Daily interactive Co.,Ltd.

Applicant after: ZHEJIANG University

Address before: 310012 Room 418, West District, Building A, 525 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province

Applicant before: ZHEJIANG MEIRI INTERDYNAMIC NETWORK TECHNOLOGY Co.,Ltd.

Applicant before: ZHEJIANG University

GR01 Patent grant
GR01 Patent grant