
CN113095336A - Method for training key point detection model and method for detecting key points of target object - Google Patents


Info

Publication number
CN113095336A
Authority
CN
China
Prior art keywords
training
key points
detection model
training samples
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110439103.8A
Other languages
Chinese (zh)
Other versions
CN113095336B (en
Inventor
宫延河
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110439103.8A
Publication of CN113095336A
Application granted
Publication of CN113095336B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for training a key point detection model and a method for detecting key points of a target object, which are applied to the technical field of electronics, and in particular to the technical fields of augmented reality and deep learning. The specific implementation scheme of the method for training the key point detection model is as follows: obtaining training samples including a target object, wherein the training samples at least include a first type of training sample without a label; based on the training samples, obtaining predicted position information of a plurality of key points of the target object by using a key point detection model; and training the key point detection model based on a predetermined loss function for the training samples and the predicted position information, wherein the predetermined loss function for the first type of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points.

Description

Method for training key point detection model and method for detecting key points of target object
Technical Field
The present disclosure relates to the field of electronic technologies, in particular to the fields of augmented reality and deep learning, and more particularly to a method for training a keypoint detection model, a method and an apparatus for detecting keypoints of a target object, an electronic device, and a storage medium.
Background
Current keypoint detection methods are trained on labeled (supervised) data. Improving the detection accuracy usually requires a large amount of labeled data, which brings a high labeling cost to the training.
Disclosure of Invention
The present disclosure provides a method for training a key point detection model, a method and an apparatus for detecting key points of a target object, an electronic device, and a storage medium, which reduce the training cost while ensuring detection accuracy.
According to an aspect of the present disclosure, there is provided a method for training a keypoint detection model, the method including: obtaining training samples including a target object, wherein the training samples at least include a first type of training sample without a label; based on the training samples, obtaining predicted position information of a plurality of key points of the target object by using a key point detection model; and training the key point detection model based on a predetermined loss function for the training samples and the predicted position information, wherein the predetermined loss function for the first type of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points.
According to another aspect of the present disclosure, there is provided a method of detecting a target object keypoint, comprising: acquiring an image to be processed including a target object; and obtaining the position information of the key points of the target object in the image to be processed by adopting a key point detection model. The key point detection model is obtained by training by adopting the training method of the key point detection model.
According to another aspect of the present disclosure, there is provided a training apparatus for a keypoint detection model, the apparatus including: a sample acquisition module configured to acquire training samples including a target object, wherein the training samples at least include a first type of training sample without a label; a prediction information obtaining module configured to obtain predicted position information of a plurality of key points of the target object by using a key point detection model based on the training samples; and a model training module configured to train the key point detection model based on a predetermined loss function for the training samples and the predicted position information, wherein the predetermined loss function for the first type of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points.
According to another aspect of the present disclosure, there is provided an apparatus for detecting a key point of a target object, the apparatus including: the image acquisition module is used for acquiring an image to be processed comprising a target object; and the position information determining module is used for acquiring the position information of the key points of the target object in the image to be processed by adopting the key point detection model. The key point detection model is obtained by training by adopting the training device of the key point detection model.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of training a keypoint detection model and/or a method of detecting keypoints of a target object provided by the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method of training a keypoint detection model and/or a method of detecting keypoints of a target object provided by the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method of training a keypoint detection model and/or the method of detecting keypoints of a target object provided by the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an application scenario of a method for training a keypoint detection model and a method for detecting keypoints of a target object according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a method of training a keypoint detection model according to an embodiment of the disclosure;
FIG. 3 is a flow chart of a method of training a keypoint detection model according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a principle of determining a value of a predetermined loss function for a first class of training samples according to an embodiment of the present disclosure;
FIG. 5 is a flow chart of a method of detecting key points of a target object according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of a training apparatus for a keypoint detection model according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an apparatus for detecting key points of a target object according to an embodiment of the present disclosure; and
FIG. 8 is a block diagram of an electronic device for implementing a method for training a keypoint detection model and/or a method for detecting keypoints of a target object according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The disclosure provides a method for training a key point detection model, which comprises a sample acquisition stage, a prediction information acquisition stage and a model training stage. In the sample acquisition stage, training samples including the target object are acquired, wherein the training samples at least include a first type of training sample without a label. In the prediction information acquisition stage, predicted position information of a plurality of key points of the target object is obtained by using a key point detection model based on the training samples. In the model training stage, the keypoint detection model is trained based on a predetermined loss function for the training samples and the predicted position information, wherein the predetermined loss function for the first type of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points.
An application scenario of the method and apparatus provided by the present disclosure will be described below with reference to fig. 1.
Fig. 1 is a schematic view of an application scenario of a method for training a keypoint detection model and a method for detecting keypoints of a target object according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 includes a terminal device 110, which may be any electronic device with processing functionality, including but not limited to a smartphone, a tablet, a laptop, a desktop computer, a server, and so on.
The terminal device 110 may, for example, detect a target object from the input image 120, and label key points of the detected target object, thereby implementing identification of the target object. For example, the terminal device 110 may label the key points of the target object by using a key point detection model, and obtain the position information 130 of the key points of the target object.
Illustratively, the keypoint detection model may be, for example, a model constructed based on a regression method or on a Gaussian heatmap method. The regression method directly outputs the coordinates of the key points of the target object, and is suitable for detecting points with obvious texture features. The Gaussian heatmap method encodes the position information of a point as the peak of a Gaussian-smoothed map, and is suitable for fitting points of a non-rigid body.
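The Gaussian heatmap encoding described above can be illustrated with a minimal sketch. The function names `encode_heatmap` and `decode_heatmap`, the grid size, and the `sigma` value are hypothetical choices made for illustration; the disclosure does not specify them.

```python
import math

def encode_heatmap(x, y, width, height, sigma=1.5):
    """Encode keypoint (x, y) as a height x width map whose Gaussian peak sits at the keypoint."""
    return [
        [math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2 * sigma ** 2))
         for col in range(width)]
        for row in range(height)
    ]

def decode_heatmap(heatmap):
    """Recover the keypoint as the (x, y) grid location of the largest heatmap value."""
    best_value, best_xy = -1.0, (0, 0)
    for row, values in enumerate(heatmap):
        for col, value in enumerate(values):
            if value > best_value:
                best_value, best_xy = value, (col, row)
    return best_xy

# Hypothetical example: a keypoint at column 5, row 3 of an 8 x 6 grid.
heatmap = encode_heatmap(x=5, y=3, width=8, height=6)
peak = decode_heatmap(heatmap)
```

A regression-style model would instead output the pair `(x, y)` directly, without the intermediate map.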
According to an embodiment of the present disclosure, as shown in fig. 1, the application scenario 100 may further include a server 140, for example. Terminal device 110 may be communicatively coupled to server 140 via a network, which may include wired or wireless communication links.
For example, the server 140 may be configured to train the key point detection model, and send the trained key point detection model 150 to the terminal device 110 in response to the model acquisition request sent by the terminal device 110, so as to facilitate the terminal device 110 to detect the target object for the input image.
Illustratively, the server may be, for example, a server that provides various services, such as a background management server that provides support for applications running on the terminal device 110. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
According to an embodiment of the present disclosure, as shown in fig. 1, the application scenario 100 may further include a database 160, and the database 160 may maintain, for example, a massive amount of images including images with tags and images without tags. The server 140 may access the database 160, for example, randomly extract partial images from a large number of images in the database, and train the keypoint detection model using the extracted images as training samples.
In an embodiment, the terminal device 110 and the server 140 may be, for example, the same device, and the same device includes a first processing module for performing target object detection on an image and a second processing module for training a keypoint detection model. The first processing module and the second processing module can communicate with each other through a network protocol.
It should be noted that the method for training the keypoint detection model and the method for detecting the keypoints of the target object provided by the present disclosure may be executed by different ones of the server 140 and the terminal device 110, or may be executed by the same one of the server 140 and the terminal device 110. Accordingly, the training apparatus for the keypoint detection model and the apparatus for detecting the keypoints of the target object provided by the present disclosure may be disposed in different ones of the server 140 and the terminal device 110, or may be disposed in the same one of the server 140 and the terminal device 110.
It should be understood that the number and types of terminal devices, servers, keypoint detection models and databases in fig. 1 are merely illustrative. There may be any number and type of terminal devices, servers, keypoint detection models, and databases, as desired for implementation.
FIG. 2 is a flow chart of a method of training a keypoint detection model according to an embodiment of the disclosure.
As shown in fig. 2, the method 200 for training the keypoint detection model of the embodiment may include operations S210 to S230.
In operation S210, training samples including a target object are obtained, the training samples including at least a first type of training sample without a label.
According to an embodiment of the present disclosure, each training sample may be, for example, an image including a target object. The target object may be any entity, such as an animal, a plant, a daily necessity, a garment, a shoe, or a hat.
Illustratively, the training samples may be obtained, for example, by randomly drawing images from a database, where the images stored in the database may include images with labels and images without labels. In one embodiment, the images with labels and the images without labels may be stored in different storage partitions of the database, for example. This embodiment may obtain a predetermined number of images from each storage partition to obtain the training samples. The label of a labeled image indicates position information of a plurality of key points of the target object in the image, and the position information can be obtained by pre-calibration.
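The sampling step described above can be sketched as follows. This is only an illustration under stated assumptions: the two partitions are modeled as in-memory lists, and the function name `draw_training_samples`, the file names, and the dictionary layout are hypothetical rather than taken from the disclosure.

```python
import random

def draw_training_samples(labeled_partition, unlabeled_partition,
                          num_labeled, num_unlabeled, seed=None):
    """Randomly draw a predetermined number of images from each partition."""
    rng = random.Random(seed)
    labeled = rng.sample(labeled_partition, num_labeled)
    unlabeled = rng.sample(unlabeled_partition, num_unlabeled)
    # Second-type samples carry keypoint labels; first-type samples have none.
    samples = [{"image": image, "label": label} for image, label in labeled]
    samples += [{"image": image, "label": None} for image in unlabeled]
    rng.shuffle(samples)
    return samples

# Hypothetical partitions: (image, keypoint list) pairs and bare image names.
labeled_partition = [(f"labeled_{i}.jpg", [(0.0, 0.0), (1.0, 2.0)]) for i in range(10)]
unlabeled_partition = [f"unlabeled_{i}.jpg" for i in range(50)]
samples = draw_training_samples(labeled_partition, unlabeled_partition, 2, 6, seed=0)
```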
For example, when the keypoint detection model is a pre-trained model with accuracy not meeting the predetermined requirement, the image may be acquired from a storage partition storing only images without labels, and the acquired images without labels may be used as a first class of training samples to perform unsupervised training on the keypoint detection model.
In operation S220, predicted position information of a plurality of key points of the target object is obtained using the key point detection model based on the training samples.
According to an embodiment of the disclosure, the keypoint detection model may be constructed based on a lightweight front-end MVC (Model-View-Controller) framework, for example, based on a Backbone framework. It is to be understood that the architecture of the keypoint detection model is merely an example to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto.
For example, the keypoint detection model may be a neural network model constructed based on a regression method or a Gaussian heatmap method, which processes an input image and outputs the position coordinates of the keypoints in the input image. Operation S220 may use the training sample as the input of the key point detection model, which processes the sample and outputs the predicted coordinate values of the key points of the target object in the training sample. It is to be understood that the type of keypoint detection model and the algorithm employed are merely examples to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto.
According to an embodiment of the present disclosure, in the case where there are a plurality of training samples, the training samples may be input to the keypoint detection model in batches, so that the model is trained batch by batch.
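A minimal sketch of feeding samples in batches is given below. The helper name `iter_batches` and the batch size are hypothetical; the disclosure does not specify how batches are formed.

```python
def iter_batches(samples, batch_size):
    """Yield consecutive batches of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Ten placeholder samples split into batches of four; the last batch is smaller.
batches = list(iter_batches(list(range(10)), 4))
```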
It will be appreciated that the number of the plurality of keypoints depends on the type of target object. For the same target object, the number of the plurality of key points can be set according to actual requirements. The number of the plurality of key points may be set by an initial configuration parameter of the key point detection model, which is not limited in this disclosure.
In operation S230, the keypoint detection model is trained based on the predicted position information and a predetermined loss function for the training samples.
According to an embodiment of the disclosure, for the first class of training samples, the predetermined loss function may be constructed based on the predicted position information of adjacent key points among the plurality of key points. When the training sample is a first-class training sample, operation S230 may determine the value of the predetermined loss function according to the predicted position information of the plurality of key points, and then train the keypoint detection model based on the value of the predetermined loss function.
For example, the predicted position information of the plurality of key points may be substituted for the variables in the predetermined loss function to obtain the value of the predetermined loss function. An algorithm such as gradient descent may then be used to determine the values of the parameters in the key point detection model at which the predetermined loss function takes its minimum value. Assigning these parameter values to the key point detection model optimizes the model. It is to be understood that the algorithm for minimizing the predetermined loss function is only an example to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto. For example, a back propagation algorithm may also be employed to minimize the predetermined loss function, and an automatic differentiation system such as PyTorch may be used for this purpose.
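The minimization step can be illustrated with a toy sketch. The setup is an assumption made purely for illustration: the "model" has a single parameter `p`, the position of a middle keypoint between two fixed keypoints at 0.0 and 2.0 on a line, and the loss is the squared difference of the two segment lengths. A central-difference gradient stands in for back propagation; a real model would rely on an automatic differentiation system instead.

```python
def loss(p):
    """Squared difference between the lengths of the two adjacent segments."""
    length_1 = abs(p - 0.0)   # segment from the first keypoint to the middle one
    length_2 = abs(2.0 - p)   # segment from the middle keypoint to the last one
    return (length_1 - length_2) ** 2

def numerical_gradient(f, p, eps=1e-6):
    """Central-difference estimate of df/dp."""
    return (f(p + eps) - f(p - eps)) / (2 * eps)

p = 0.2                # hypothetical initial parameter value
learning_rate = 0.05   # hypothetical step size
for _ in range(100):
    p -= learning_rate * numerical_gradient(loss, p)
```

Under these assumptions the parameter moves toward the midpoint, where the two segment lengths are equal and the loss reaches its minimum.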
For example, when determining the value of the predetermined loss function for the first class of training samples, the distance between the predicted positions of two adjacent key points may be calculated first. If the distance is greater than a predetermined distance, the predicted positions of the two key points are used as values of variables in the predetermined loss function. Alternatively, the included angle between the connecting line of the two adjacent key points and a straight line parallel to a predetermined axis may be calculated first, and if the included angle is greater than a predetermined included angle, the predicted positions of the two key points are used as values of variables in the predetermined loss function.
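The two selection rules above can be sketched as follows. The function names, the use of the X-axis as the predetermined axis, the ordering of keypoints as an open chain, and the threshold values are all assumptions made for illustration.

```python
import math

def adjacent_pairs(points):
    """Pair each keypoint with the next one in the given order (open chain)."""
    return list(zip(points, points[1:]))

def pairs_exceeding_distance(points, max_distance):
    """Keep adjacent pairs whose predicted positions are farther apart than max_distance."""
    return [(a, b) for a, b in adjacent_pairs(points) if math.dist(a, b) > max_distance]

def pairs_exceeding_angle(points, max_angle_deg):
    """Keep adjacent pairs whose connecting line makes an angle larger than
    max_angle_deg with a line parallel to the X-axis."""
    selected = []
    for a, b in adjacent_pairs(points):
        angle = abs(math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])))
        angle = min(angle, 180.0 - angle)  # angle between two lines lies in [0, 90]
        if angle > max_angle_deg:
            selected.append((a, b))
    return selected

# Hypothetical predicted keypoints: the middle segment is long and vertical.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0), (2.0, 5.0)]
far_pairs = pairs_exceeding_distance(points, max_distance=2.0)
steep_pairs = pairs_exceeding_angle(points, max_angle_deg=45.0)
```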
By constructing the predetermined loss function from the position information of adjacent key points, without relying on position information indicated by labels, the embodiment of the disclosure can use unlabeled training samples when training the key point detection model, thereby realizing unsupervised training of the model. Compared with the related-art scheme in which the model is trained in a supervised manner with a large number of labeled training samples, the method can at least partially reduce the labeling cost while the trained model still meets the accuracy requirement. The method of the embodiment is particularly suitable for scenarios in which many training samples are needed because the target object has many degrees of freedom and diverse feature textures.
FIG. 3 is a flow chart of a method of training a keypoint detection model according to another embodiment of the present disclosure.
As shown in fig. 3, the method 300 for training the keypoint detection model of this embodiment may include operations S310 to S340.
In operation S310, training samples including a target object are obtained, the training samples including a first type of training sample without a label and a second type of training sample with a label.
According to an embodiment of the present disclosure, the label of the second type of training sample indicates position information of a keypoint of the target object comprised by the second type of training sample. The second class of training samples may be obtained, for example, by labeling key points of the target object in the image. The implementation method of operation S310 is similar to the method for obtaining training samples described above, and is not described herein again.
In operation S320, a predetermined loss function for the training samples is determined according to the type of the training samples.
According to an embodiment of the present disclosure, the types of the training samples may include a first type and a second type: a training sample of the first type is a first class training sample, and a training sample of the second type is a second class training sample. The embodiment may determine the type of a training sample based on whether the training sample has a label. If the label is absent, the type is the first type, and if the label is present, the type is the second type.
For example, in the case that the training samples are training samples of the first type, the predetermined loss function for the training samples may be a function constructed based on the predicted location information of the neighboring keypoints in the plurality of keypoints as described above. The predetermined loss function for the first type of training samples will be described in detail later and will not be described in detail here.
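Selecting the loss function by sample type can be sketched as below. In line with operation S310, a sample without a label is treated as the first type and a sample with a label as the second type. The dictionary layout and the names `sample_type` and `select_loss` are hypothetical.

```python
def sample_type(sample):
    """Return 'first' for an unlabeled sample and 'second' for a labeled one."""
    return "first" if sample.get("label") is None else "second"

def select_loss(sample, unsupervised_loss, supervised_loss):
    """Pick the predetermined loss function that matches the sample type."""
    if sample_type(sample) == "first":
        return unsupervised_loss
    return supervised_loss

# Placeholder loss names stand in for real loss functions.
unlabeled = {"image": "a.jpg", "label": None}
labeled = {"image": "b.jpg", "label": [(0.0, 0.0), (1.0, 1.0)]}
chosen_for_unlabeled = select_loss(unlabeled, "adjacent_keypoint_loss", "mse_loss")
chosen_for_labeled = select_loss(labeled, "adjacent_keypoint_loss", "mse_loss")
```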
Illustratively, in the case where the training samples are of the second class, the predetermined loss function for the training samples includes any one of: mean absolute error, mean square error loss, smoothed squared absolute error. It is to be understood that the above-mentioned types of predetermined loss functions for the second class of training samples are only examples to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto.
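The supervised losses named above can be sketched on flattened coordinate lists. The third function assumes that the "smoothed" error refers to a smooth L1 (Huber-style) loss; that reading, the `beta` threshold, and all function names are assumptions, not statements from the disclosure.

```python
def mean_absolute_error(pred, target):
    """Average absolute coordinate error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mean_squared_error(pred, target):
    """Average squared coordinate error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def smooth_l1(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones (assumed interpretation)."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)

# Hypothetical predicted and labeled coordinates.
pred = [0.0, 2.0]
target = [0.5, 0.0]
mae = mean_absolute_error(pred, target)
mse = mean_squared_error(pred, target)
sl1 = smooth_l1(pred, target)
```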
In operation S330, predicted position information of a plurality of key points of the target object is obtained using the key point detection model based on the training samples. According to an embodiment of the present disclosure, the implementation of operation S330 is similar to the method for obtaining the predicted location information described above, and is not described herein again.
In operation S340, the keypoint detection model is trained based on the predicted location information and a predetermined loss function for the training samples.
According to the embodiment of the present disclosure, when the training sample is the first type of training sample, the implementation method of operation S340 is similar to the method for training the keypoint detection model described above, and is not described herein again.
According to an embodiment of the present disclosure, when the training sample is a second class training sample, operation S340 may train the keypoint detection model based on the predetermined loss function for the training sample, the predicted position information of the plurality of keypoints, and the position information indicated by the label. For example, the value of the predetermined loss function may be determined according to the difference between the predicted position information of the plurality of key points and the position information indicated by the label. An algorithm such as back propagation is then used to obtain the values of the parameters in the key point detection model at which the predetermined loss function takes its minimum value. Replacing the current parameter values of the key point detection model with these values yields the optimized key point detection model, thereby realizing the training of the model.
According to the embodiment of the disclosure, the key point detection model is trained by integrating the labeled training sample and the unlabeled training sample, so that the precision of the trained key point detection model can be further improved.
According to the embodiment of the disclosure, after the first type of training samples and the second type of training samples are obtained, the second type of training samples may first be used, for example, to pre-train the keypoint detection model, so that the values of the parameters in the keypoint detection model fall within a reasonable range. After the pre-training is finished, the key point detection model is trained using a mixture of the first type and the second type of training samples, or using the first type of training samples alone. That is, a supervised mode is adopted for pre-training, and then a mixed supervised and unsupervised mode, or a purely unsupervised mode, is adopted for fine training. In this way, the training efficiency of the model can be improved.
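The two-stage procedure above can be sketched as a simple schedule builder. The function name `build_schedule`, the epoch counts, and the stage names are hypothetical; the disclosure does not state how many passes each stage uses.

```python
def build_schedule(labeled, unlabeled, pretrain_epochs, finetune_epochs, mix=True):
    """Return (stage, samples) entries: supervised pre-training on labeled samples,
    then fine training on mixed samples or on unlabeled samples only."""
    schedule = [("pretrain", list(labeled)) for _ in range(pretrain_epochs)]
    fine_samples = list(labeled) + list(unlabeled) if mix else list(unlabeled)
    schedule += [("finetune", fine_samples) for _ in range(finetune_epochs)]
    return schedule

# Hypothetical sample identifiers.
mixed_schedule = build_schedule(["a"], ["u1", "u2"], pretrain_epochs=2, finetune_epochs=3)
unsupervised_schedule = build_schedule(["a"], ["u1", "u2"], 1, 1, mix=False)
```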
Fig. 4 is a schematic diagram of a principle of determining a value of a predetermined loss function for a first class of training samples according to an embodiment of the present disclosure.
According to the embodiment of the disclosure, in the case that the training sample is the first type of training sample, the operation of determining the value of the predetermined loss function may, for example, first determine a connection line between any two adjacent key points in the plurality of key points, as a target connection line. After the target connecting line is obtained, the difference of two target connecting lines between any three adjacent key points is determined based on the predicted position information of any three adjacent key points in the plurality of key points, and the value of the predetermined loss function is determined according to the difference.
For example, the sum of all differences determined from the plurality of key points may be taken as the value of the predetermined loss function, or the average of all differences determined from the plurality of key points may be taken as the value of the predetermined loss function.
For example, as shown in fig. 4, when the target object is a shoe 400, the number of the plurality of key points may be 18, for example, and the data output by the key point detection model may be coordinate values of the 18 key points, which may be relative to the coordinate system shown in fig. 4, for example. The coordinate system takes the toe position of the shoe 400 as the origin O of coordinates, the shoe width direction as the X-axis, and the shoe length direction as the Y-axis. After the coordinate values of the 18 key points are obtained, a target connecting line can be determined from each pair of adjacent coordinate values, so that 18 target connecting lines (e.g., the dotted lines between adjacent key points in fig. 4) can be obtained.
For example, after the target connecting lines are obtained, the length of each target connecting line may be determined from the coordinate values of its two end points. The difference between the lengths of two adjacent target connecting lines can then be determined and used as the difference between the two target connecting lines joining any three adjacent key points. For example, as shown in fig. 4, for three adjacent key points 401, 402, and 403, connecting each pair of adjacent key points gives two target connecting lines with lengths $L_1$ and $L_2$, and the difference between their lengths is $L_1 - L_2$. In an embodiment, the value of the difference may be taken as $|L_1 - L_2|$, so that positive and negative differences do not cancel and make the value of the predetermined loss function inaccurate. This embodiment may use the sum of all the difference values so determined as the value of the predetermined loss function.
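The length-based variant can be sketched in a few lines. The sketch assumes the key points form a closed contour, so that the last point connects back to the first (consistent with 18 key points giving 18 connecting lines), and it uses the absolute value of each difference; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def segment_lengths(points: Sequence[Point]) -> List[float]:
    """Length of the line from each key point to the next, wrapping around."""
    n = len(points)
    return [math.dist(points[i], points[(i + 1) % n]) for i in range(n)]


def length_difference_loss(points: Sequence[Point]) -> float:
    """Sum of absolute length differences between adjacent connecting lines."""
    lengths = segment_lengths(points)
    # lengths[i - 1] is the line (p_{i-1}, p_i); lengths[i] is the line (p_i, p_{i+1}).
    # Python's negative indexing handles the wrap-around for i = 0.
    return sum(abs(lengths[i - 1] - lengths[i]) for i in range(len(lengths)))
```

For a unit square all four connecting lines have equal length, so the loss is zero; for a 2-by-1 rectangle each of the four corners contributes a difference of 1.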
For example, after the target connecting lines are obtained, the included angle between the two target connecting lines joining any three adjacent key points may be determined from the coordinate values of the end points of the target connecting lines, and this included angle value is taken as the difference between the two target connecting lines. In an embodiment, the angle between each target connecting line and the X-axis may instead be determined from the slope k of that line relative to the X-axis in fig. 4, the angle being, for example, $\tan^{-1} k$; the difference between the two angles that the two target connecting lines make with the X-axis is then taken as the difference between the two target connecting lines. For example, for the three adjacent key points 401, 402, and 403 in fig. 4, the included angle between the two target connecting lines is λ. This embodiment may take the sum of all included angle values so determined as the value of the predetermined loss function.
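The angle-based variant can be sketched similarly. Two choices here are assumptions rather than part of the text: `math.atan2` replaces the arctangent of the slope so that vertical connecting lines do not cause a division by zero, and the angle difference is wrapped into the range from -π to π before its absolute value is taken. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rotation_angle(p: Point, q: Point) -> float:
    """Angle of the line from p to q relative to the X-axis, in radians."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def included_angle(p_prev: Point, p: Point, p_next: Point) -> float:
    """Absolute difference between the rotation angles of the two lines meeting at p."""
    delta = rotation_angle(p_prev, p) - rotation_angle(p, p_next)
    # Wrap the difference so that, e.g., 3*pi/2 is treated as -pi/2.
    delta = math.atan2(math.sin(delta), math.cos(delta))
    return abs(delta)


def angle_loss(points: Sequence[Point]) -> float:
    """Sum of included angles over all key points of a closed contour."""
    n = len(points)
    return sum(
        included_angle(points[i - 1], points[i], points[(i + 1) % n])
        for i in range(n)
    )
```

Three collinear key points give an included angle of zero, while each corner of a square contributes π/2.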
For example, after the target connecting lines are obtained, the lengths of the two target connecting lines joining any three adjacent key points and the included angle between them, each determined as described above, may together be used as the difference between the two target connecting lines. These lengths and the included angle value are substituted into the following formula of the predetermined loss function to calculate its value.
Figure BDA0003033804050000101
where $n$ is the number of key points; $p_i$ is the $i$-th key point, $p_{i-1}$ is the $(i-1)$-th key point, and $p_{i+1}$ is the $(i+1)$-th key point among the plurality of key points; $D(p_{i-1}, p_i)$ is the length of the target connecting line between the $(i-1)$-th and $i$-th key points; $D(p_i, p_{i+1})$ is the length of the target connecting line between the $i$-th and $(i+1)$-th key points; $V(p_{i-1}, p_i)$ is the rotation angle of the target connecting line between the $(i-1)$-th and $i$-th key points relative to a predetermined axis; $V(p_i, p_{i+1})$ is the rotation angle of the target connecting line between the $i$-th and $(i+1)$-th key points relative to the predetermined axis; and $V(p_{i-1}, p_i) - V(p_i, p_{i+1})$ is the included angle value between the two target connecting lines. When $i = 1$, $i-1$ is taken to be $n$, and when $i = n$, $i+1$ is taken to be 1. The predetermined axis may be any coordinate axis in a coordinate system constructed based on the target object, or may be any axis set in advance.
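The formula itself is given only as a figure, so its exact form cannot be read from the text. One form consistent with the symbol definitions above is sketched below purely as an illustration. The symbol $\mathcal{L}$, the absolute values, and the equal, unweighted sum of the length term and the angle term are all assumptions, and the formula in the figure may differ (for example, in weighting or in the use of squares).

```latex
% Illustrative assumption only: not taken from the formula figure.
\mathcal{L} \;=\; \sum_{i=1}^{n} \Bigl(
    \bigl| D(p_{i-1}, p_i) - D(p_i, p_{i+1}) \bigr|
  + \bigl| V(p_{i-1}, p_i) - V(p_i, p_{i+1}) \bigr|
\Bigr),
\qquad p_0 \equiv p_n, \quad p_{n+1} \equiv p_1 .
```

The cyclic convention $p_0 \equiv p_n$, $p_{n+1} \equiv p_1$ restates the index rule given in the text for $i = 1$ and $i = n$.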
In summary, by setting the predetermined loss function for the first class of training samples, this embodiment enables unsupervised training of the key point detection model, and can at least partially reduce the cost of labeling while the trained model still meets the precision requirement. The method of this embodiment is particularly suitable for scenarios in which many training samples are needed because the target object (such as a shoe) has a high degree of freedom and diverse texture features.
Furthermore, because the value of the loss function is determined from the difference between the two target connecting lines joining three adjacent key points, the relative position information between key points is fully taken into account. This helps the model learn the features of the target object more accurately and thus helps improve the accuracy of the trained key point detection model.
Based on the above training method of the key point detection model, the present disclosure also provides a method for detecting key points of a target object. This method will be described in detail below with reference to fig. 5.
Fig. 5 is a flowchart of a method of detecting a target object keypoint according to an embodiment of the present disclosure.
As shown in fig. 5, the method 500 of detecting a target object keypoint of the embodiment may include operations S510 to S520.
In operation S510, a to-be-processed image including a target object is acquired.
According to an embodiment of the present disclosure, the target object is similar to the target object described above and is not described again here. The image to be processed may be, for example, an image captured in real time, or an image captured in advance and then cached. In a virtual try-on scenario, the image to be processed may be, for example, a pre-captured image of a garment or of a shoe.
In operation S520, position information of a keypoint of a target object in an image to be processed is obtained using a keypoint detection model. The keypoint detection model may be obtained by training using the aforementioned training method for the keypoint detection model, for example.
It is understood that the operation S520 is similar to the method for obtaining the predicted position information of the plurality of key points of the target object in the training sample by using the key point detection model described above, except that the key point detection model in this embodiment is a model that is trained in advance and has a precision meeting a condition.
With the method for detecting key points of a target object according to this embodiment, the key point detection model trained by the training method described above can accurately detect key points of target objects that have a high degree of freedom and diverse texture features, which helps improve user experience.
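Operations S510 and S520 can be sketched as a thin wrapper around a trained model. The sketch assumes that the model is a callable returning a flat sequence of coordinate values (x1, y1, x2, y2, ...); this output layout and the name `detect_keypoints` are hypothetical, since the text says only that the model outputs the coordinate values of the key points.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def detect_keypoints(
    image: object,
    model: Callable[[object], Sequence[float]],
) -> List[Point]:
    """Run the model on one image and group its output into (x, y) pairs."""
    flat = list(model(image))  # S520: the image is the input of the model
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinate values")
    return [(float(flat[i]), float(flat[i + 1])) for i in range(0, len(flat), 2)]
```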
Based on the training method of the key point detection model, the disclosure also provides a training device of the key point detection model. The apparatus will be described in detail below with reference to fig. 6.
Fig. 6 is a block diagram of a structure of a training apparatus for a keypoint detection model according to an embodiment of the present disclosure.
As shown in fig. 6, the training apparatus 600 for a keypoint detection model of this embodiment may include a sample acquisition module 610, a prediction information obtaining module 620, and a model training module 630.
The sample acquisition module 610 is configured to acquire training samples including a target object, where the training samples include at least a first type of training sample without a label. In an embodiment, the sample acquisition module 610 may be configured to perform, for example, operation S210 described above, which is not described again here.
The prediction information obtaining module 620 is configured to obtain predicted position information of a plurality of key points of the target object by using a key point detection model based on the training samples. In an embodiment, the prediction information obtaining module 620 may be configured to perform, for example, operation S220 described above, which is not described again here.
The model training module 630 is configured to train the keypoint detection model based on the predicted position information and the predetermined loss function for the training samples. The predetermined loss function for the first class of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points. In an embodiment, the model training module 630 may be configured to perform, for example, operation S230 described above, which is not described again here.
According to an embodiment of the present disclosure, the model training module 630 may include, for example, a value determination submodule and a training submodule. The value determination submodule is configured to determine the value of the predetermined loss function based on the predicted position information of the plurality of key points. The training submodule is configured to train the key point detection model according to the value of the predetermined loss function.
According to an embodiment of the present disclosure, the value determination submodule may include, for example, a connection line determination unit, a difference determination unit, and a value determination unit. The connecting line determining unit is used for determining a connecting line between any two adjacent key points in the plurality of key points as a target connecting line under the condition that the training sample is the first type of training sample. The difference determining unit is used for determining the difference of two target connecting lines between any three adjacent key points based on the predicted position information of any three adjacent key points in the plurality of key points. The value determining unit is used for determining the value of the predetermined loss function according to the difference.
According to an embodiment of the present disclosure, the difference determined by the difference determination unit comprises at least one of: the difference of the lengths of the two target connecting lines and the included angle value between the two target connecting lines.
According to an embodiment of the present disclosure, the predetermined loss function for the first class of training samples is expressed by the following formula:
Figure BDA0003033804050000121
where $n$ is the number of key points; $p_i$ is the $i$-th key point, $p_{i-1}$ is the $(i-1)$-th key point, and $p_{i+1}$ is the $(i+1)$-th key point among the plurality of key points; $D(p_{i-1}, p_i)$ is the length of the target connecting line between the $(i-1)$-th and $i$-th key points; $D(p_i, p_{i+1})$ is the length of the target connecting line between the $i$-th and $(i+1)$-th key points; $V(p_{i-1}, p_i)$ is the rotation angle of the target connecting line between the $(i-1)$-th and $i$-th key points relative to a predetermined axis; and $V(p_i, p_{i+1})$ is the rotation angle of the target connecting line between the $i$-th and $(i+1)$-th key points relative to the predetermined axis. When $i = 1$, $i-1$ is taken to be $n$, and when $i = n$, $i+1$ is taken to be 1.
According to an embodiment of the present disclosure, the training samples further comprise a second class of training samples having labels indicating location information of a plurality of key points in the target object. The training apparatus 600 for the keypoint detection model may further include a loss function determining module, configured to determine a predetermined loss function for the training samples according to the types of the training samples.
According to an embodiment of the present disclosure, the model training module is specifically configured to: in the case where the training samples are second-class training samples, the keypoint detection model is trained based on a predetermined loss function for the training samples, predicted position information of a plurality of keypoints, and position information indicated by the labels.
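The choice of loss function by sample type can be sketched as a small dispatch function. The sketch assumes that a first-type (unlabeled) sample is represented by a label of `None`; that convention, the function name `sample_loss`, and passing the two loss functions in as arguments are illustrative choices, not details taken from the text.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float]


def sample_loss(
    predicted: Sequence[Point],
    label: Optional[Sequence[Point]],
    supervised_loss: Callable[[Sequence[Point], Sequence[Point]], float],
    unsupervised_loss: Callable[[Sequence[Point]], float],
) -> float:
    """Pick the loss according to whether the sample carries a label."""
    if label is None:
        # First-type sample: the loss depends on the predicted positions only.
        return unsupervised_loss(predicted)
    # Second-type sample: the loss compares predictions with labeled positions.
    return supervised_loss(predicted, label)
```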
According to an embodiment of the present disclosure, in a case where the training samples are training samples of a second class, the predetermined loss function for the training samples includes any one of: mean absolute error, mean square error loss, smoothed squared absolute error.
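The three supervised losses can be sketched over lists of (x, y) key points. One interpretation here is an assumption: "smoothed squared absolute error" is read as a smooth L1 (Huber-style) loss, quadratic for small errors and linear for large ones, with a hypothetical threshold parameter `beta`. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _coordinate_errors(pred: Sequence[Point], target: Sequence[Point]) -> List[float]:
    """Per-coordinate differences between predicted and labeled key points."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of key points")
    return [a - b for p, t in zip(pred, target) for a, b in zip(p, t)]


def mean_absolute_error(pred: Sequence[Point], target: Sequence[Point]) -> float:
    errs = _coordinate_errors(pred, target)
    return sum(abs(e) for e in errs) / len(errs)


def mean_squared_error(pred: Sequence[Point], target: Sequence[Point]) -> float:
    errs = _coordinate_errors(pred, target)
    return sum(e * e for e in errs) / len(errs)


def smooth_l1_error(pred: Sequence[Point], target: Sequence[Point], beta: float = 1.0) -> float:
    """Assumed reading of the smoothed loss: quadratic below beta, linear above."""
    errs = _coordinate_errors(pred, target)
    total = 0.0
    for e in errs:
        a = abs(e)
        total += 0.5 * a * a / beta if a < beta else a - 0.5 * beta
    return total / len(errs)
```

The mean squared error penalizes large deviations most strongly, the mean absolute error is less sensitive to outliers, and the smooth variant behaves like the former near zero and like the latter for large errors.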
Based on the method for detecting the key points of the target object, the disclosure also provides a device for detecting the key points of the target object. The apparatus will be described in detail below with reference to fig. 7.
Fig. 7 is a block diagram of a structure of an apparatus for detecting a key point of a target object according to an embodiment of the present disclosure.
As shown in fig. 7, the apparatus 700 for detecting key points of a target object of this embodiment may include an image acquisition module 710 and a location information determination module 720.
The image acquisition module 710 is configured to acquire an image to be processed that includes a target object. In an embodiment, the image acquisition module 710 may be configured to perform operation S510 described above, which is not described again here.
The location information determination module 720 is configured to obtain position information of the key points of the target object in the image to be processed by using the key point detection model. The key point detection model may be obtained by training with the training apparatus for the key point detection model described above. In an embodiment, the location information determination module 720 may be configured to perform operation S520 described above, which is not described again here.
It should be noted that, in the technical solution of the present disclosure, the acquisition, storage, and use of any personal information of the users involved comply with the relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement the method of training the keypoint detection model and/or the method of detecting keypoints of a target object of embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 performs the various methods and processes described above, such as the method of training a keypoint detection model and/or the method of detecting keypoints of a target object. For example, in some embodiments, the method of training the keypoint detection model and/or the method of detecting keypoints of the target object may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method of training the keypoint detection model and/or the method of detecting keypoints of the target object described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured by any other suitable means (e.g., by means of firmware) to perform the method of training a keypoint detection model and/or the method of detecting keypoints of a target object.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and addresses the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (21)

1. A method for training a keypoint detection model, comprising:
obtaining training samples including a target object, wherein the training samples at least include a first type of training sample without a label;
based on the training sample, obtaining predicted position information of a plurality of key points of the target object by adopting a key point detection model; and
training the keypoint detection model based on a predetermined loss function for the training samples and the predicted position information,
wherein the predetermined loss function for the first class of training samples is constructed based on the predicted position information of adjacent key points among the plurality of key points.
2. The method of claim 1, wherein training the keypoint detection model comprises:
determining values of the predetermined loss function based on the predicted position information of the plurality of key points; and
training the key point detection model according to the value of the predetermined loss function.
3. The method of claim 2, wherein, in the case that the training samples are of a first type, determining the value of the predetermined loss function comprises:
determining a connecting line between any two adjacent key points in the plurality of key points as a target connecting line;
determining the difference of two target connecting lines between any three adjacent key points based on the predicted position information of the any three adjacent key points in the plurality of key points; and
determining the value of the predetermined loss function according to the difference.
4. The method of claim 3, wherein determining a difference of two target links between the any three keypoints comprises at least one of:
determining the difference between the lengths of the two target connecting lines; and
determining the included angle value between the two target connecting lines.
5. The method of claim 4, wherein the predetermined loss function for the first class of training samples is expressed by the following equation:
Figure FDA0003033804040000021
wherein $n$ is the number of the plurality of key points; $p_i$ is the $i$-th key point, $p_{i-1}$ is the $(i-1)$-th key point, and $p_{i+1}$ is the $(i+1)$-th key point of said plurality of key points; $D(p_{i-1}, p_i)$ is the length of the target connecting line between the $(i-1)$-th and $i$-th key points; $D(p_i, p_{i+1})$ is the length of the target connecting line between the $i$-th and $(i+1)$-th key points; $V(p_{i-1}, p_i)$ is the rotation angle of the target connecting line between the $(i-1)$-th and $i$-th key points relative to a predetermined axis; and $V(p_i, p_{i+1})$ is the rotation angle of the target connecting line between the $i$-th and $(i+1)$-th key points relative to the predetermined axis; and wherein, when $i = 1$, $i-1$ is taken to be $n$, and when $i = n$, $i+1$ is taken to be 1.
6. The method according to any one of claims 1-5, wherein the training samples further comprise a second class of training samples having labels indicating location information for a plurality of key points in the target object; the method further comprises the following steps:
determining a predetermined loss function for the training samples according to the type of the training samples.
7. The method of claim 6, wherein training the keypoint detection model if the training samples are of the second class comprises:
training the keypoint detection model based on a predetermined loss function for the training samples, predicted position information for the plurality of keypoints, and position information indicated by the labels.
8. The method according to claim 6 or 7, wherein in case the training samples are of the second class, the predetermined loss function for the training samples comprises any one of: mean absolute error, mean square error loss, smoothed squared absolute error.
9. A method of detecting key points of a target object, comprising:
acquiring an image to be processed including a target object; and
obtaining the position information of the key points of the target object in the image to be processed by adopting a key point detection model,
wherein the key point detection model is obtained by training using the method of any one of claims 1-8.
10. A training apparatus for a keypoint detection model, comprising:
a sample acquisition module, configured to acquire training samples including a target object, wherein the training samples include at least a first type of training sample without a label;
a prediction information obtaining module, configured to obtain predicted position information of a plurality of key points of the target object by using a key point detection model based on the training samples; and
a model training module to train the keypoint detection model based on a predetermined loss function for the training samples and the predicted location information,
wherein the predetermined loss function for the first class of training samples is constructed based on predicted location information of neighboring keypoints of the plurality of keypoints.
11. The apparatus of claim 10, wherein the model training module comprises:
a value determination submodule, configured to determine a value of the predetermined loss function based on predicted location information of the plurality of key points; and
a training submodule, configured to train the key point detection model according to the value of the predetermined loss function.
12. The apparatus of claim 11, wherein the value determination submodule comprises:
a connecting line determining unit, configured to determine, when the training sample is a first-class training sample, a connecting line between any two adjacent key points in the plurality of key points, as a target connecting line;
a difference determining unit, configured to determine a difference between two target connection lines between any three adjacent key points in the plurality of key points based on predicted position information of the any three adjacent key points; and
a value determination unit, configured to determine the value of the predetermined loss function according to the difference.
13. The apparatus of claim 12, wherein the difference determined by the difference determination unit comprises at least one of:
a difference between the lengths of the two target connecting lines; and
an included angle value between the two target connecting lines.
14. The apparatus of claim 13, wherein the predetermined loss function for the first class of training samples is expressed by the following equation:
Figure FDA0003033804040000031
wherein $n$ is the number of the plurality of key points; $p_i$ is the $i$-th key point, $p_{i-1}$ is the $(i-1)$-th key point, and $p_{i+1}$ is the $(i+1)$-th key point of said plurality of key points; $D(p_{i-1}, p_i)$ is the length of the target connecting line between the $(i-1)$-th and $i$-th key points; $D(p_i, p_{i+1})$ is the length of the target connecting line between the $i$-th and $(i+1)$-th key points; $V(p_{i-1}, p_i)$ is the rotation angle of the target connecting line between the $(i-1)$-th and $i$-th key points relative to a predetermined axis; and $V(p_i, p_{i+1})$ is the rotation angle of the target connecting line between the $i$-th and $(i+1)$-th key points relative to the predetermined axis; and wherein, when $i = 1$, $i-1$ is taken to be $n$, and when $i = n$, $i+1$ is taken to be 1.
15. The apparatus of any of claims 10-14, wherein the training samples further comprise a second class of training samples having labels indicating location information for a plurality of key points in the target object; the device further comprises:
a loss function determination module for determining a predetermined loss function for the training samples according to the type of the training samples.
16. The apparatus of claim 15, wherein the model training module is specifically configured to:
training the keypoint detection model based on a predetermined loss function for the training samples, the predicted position information of the plurality of keypoints, and the position information indicated by the label, if the training samples are the second class of training samples.
17. The apparatus according to claim 15 or 16, wherein in case the training samples are of the second class, the predetermined loss function for the training samples comprises any one of: mean absolute error, mean square error loss, smoothed squared absolute error.
18. An apparatus for detecting key points of a target object, comprising:
an image acquisition module for acquiring an image to be processed that includes a target object; and
a position information determining module for obtaining position information of the key points of the target object by taking the image to be processed as the input of a key point detection model,
wherein the key point detection model is trained using the apparatus of any one of claims 10-17.
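The two-module structure of claim 18 can be sketched as a small class: one method acquires the image and another feeds it to a trained model and returns the key point positions. Everything named here (`KeypointDetector`, `corner_model`, the nested-list image type) is hypothetical; `corner_model` is a stand-in that simply returns the image corners so the sketch runs without a trained network.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]          # rows of pixel values (assumed)
Keypoints = List[Tuple[float, float]]      # (x, y) positions


class KeypointDetector:
    """Wraps a trained key point detection model behind two steps."""

    def __init__(self, model: Callable[[Image], Keypoints]) -> None:
        self.model = model

    def acquire_image(self, source: Callable[[], Image]) -> Image:
        """Image acquisition step: obtain the image to be processed."""
        return source()

    def determine_positions(self, image: Image) -> Keypoints:
        """Position step: use the image as the model input."""
        return self.model(image)


def corner_model(image: Image) -> Keypoints:
    """Stand-in for a trained model: returns the four image corners."""
    height, width = len(image), len(image[0])
    return [(0.0, 0.0), (width - 1.0, 0.0),
            (width - 1.0, height - 1.0), (0.0, height - 1.0)]


# Usage: a blank image 3 pixels wide and 2 pixels high.
detector = KeypointDetector(corner_model)
image = detector.acquire_image(lambda: [[0.0] * 3 for _ in range(2)])
print(detector.determine_positions(image))
```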
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-9.
21. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 9.
CN202110439103.8A 2021-04-22 2021-04-22 Method for training key point detection model and method for detecting key points of target object Active CN113095336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110439103.8A CN113095336B (en) 2021-04-22 2021-04-22 Method for training key point detection model and method for detecting key points of target object

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110439103.8A CN113095336B (en) 2021-04-22 2021-04-22 Method for training key point detection model and method for detecting key points of target object

Publications (2)

Publication Number Publication Date
CN113095336A (en) 2021-07-09
CN113095336B (en) 2022-03-11

Family

ID=76679583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110439103.8A Active CN113095336B (en) 2021-04-22 2021-04-22 Method for training key point detection model and method for detecting key points of target object

Country Status (1)

Country Link
CN (1) CN113095336B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113741459A (en) * 2021-09-03 2021-12-03 阿波罗智能技术(北京)有限公司 Method for determining training sample and training method and device for automatic driving model
CN114550207A (en) * 2022-01-17 2022-05-27 北京新氧科技有限公司 Method and device for detecting key points of neck and method and device for training detection model
CN114596637A (en) * 2022-03-23 2022-06-07 北京百度网讯科技有限公司 Image sample data enhancement training method and device and electronic equipment
CN114663671A (en) * 2022-02-21 2022-06-24 佳都科技集团股份有限公司 Target detection method, device, equipment and storage medium
CN115147680A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Pre-training method, device and equipment of target detection model
CN115578451A (en) * 2022-09-30 2023-01-06 北京百度网讯科技有限公司 Image processing method, and training method and device of image processing model
CN115661577A (en) * 2022-11-01 2023-01-31 吉咖智能机器人有限公司 Method, apparatus, and computer-readable storage medium for object detection
CN116663650A (en) * 2023-06-06 2023-08-29 北京百度网讯科技有限公司 Training method of deep learning model, target object detection method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9275273B2 (en) * 2011-06-02 2016-03-01 Kriegman-Belhumeur Vision Technologies, Llc Method and system for localizing parts of an object in an image for computer vision applications
CN109614867A (en) * 2018-11-09 2019-04-12 北京市商汤科技开发有限公司 Human body critical point detection method and apparatus, electronic equipment, computer storage medium
CN111832611A (en) * 2020-06-03 2020-10-27 北京百度网讯科技有限公司 Training method, device and equipment of animal recognition model and storage medium
CN111931591A (en) * 2020-07-15 2020-11-13 北京百度网讯科技有限公司 Method and device for constructing key point learning model, electronic equipment and readable storage medium
CN112633221A (en) * 2020-12-30 2021-04-09 深圳市捷顺科技实业股份有限公司 Face direction detection method and related device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CLARA FERNANDEZ-LABRADOR ET AL.: "Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets", 《ARXIV.ORG》 *
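胡江颢 (HU Jianghao) et al.: "Real-time human body key point detection algorithm based on a lightweight network", 《计算机工程》 (Computer Engineering) *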

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113741459A (en) * 2021-09-03 2021-12-03 阿波罗智能技术(北京)有限公司 Method for determining training sample and training method and device for automatic driving model
CN114550207A (en) * 2022-01-17 2022-05-27 北京新氧科技有限公司 Method and device for detecting key points of neck and method and device for training detection model
CN114550207B (en) * 2022-01-17 2023-01-17 北京新氧科技有限公司 Method and device for detecting key points of neck and method and device for training detection model
CN114663671A (en) * 2022-02-21 2022-06-24 佳都科技集团股份有限公司 Target detection method, device, equipment and storage medium
CN114596637A (en) * 2022-03-23 2022-06-07 北京百度网讯科技有限公司 Image sample data enhancement training method and device and electronic equipment
CN114596637B (en) * 2022-03-23 2024-02-06 北京百度网讯科技有限公司 Image sample data enhancement training method and device and electronic equipment
CN115147680A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Pre-training method, device and equipment of target detection model
CN115147680B (en) * 2022-06-30 2023-08-25 北京百度网讯科技有限公司 Pre-training method, device and equipment for target detection model
CN115578451A (en) * 2022-09-30 2023-01-06 北京百度网讯科技有限公司 Image processing method, and training method and device of image processing model
CN115578451B (en) * 2022-09-30 2024-01-23 北京百度网讯科技有限公司 Image processing method, training method and device of image processing model
CN115661577A (en) * 2022-11-01 2023-01-31 吉咖智能机器人有限公司 Method, apparatus, and computer-readable storage medium for object detection
CN115661577B (en) * 2022-11-01 2024-04-16 吉咖智能机器人有限公司 Method, apparatus and computer readable storage medium for object detection
CN116663650A (en) * 2023-06-06 2023-08-29 北京百度网讯科技有限公司 Training method of deep learning model, target object detection method and device
CN116663650B (en) * 2023-06-06 2023-12-19 北京百度网讯科技有限公司 Training method of deep learning model, target object detection method and device

Also Published As

Publication number Publication date
CN113095336B (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN113095336B (en) Method for training key point detection model and method for detecting key points of target object
US11810319B2 (en) Image detection method, device, storage medium and computer program product
CN113971751A (en) Training feature extraction model, and method and device for detecting similar images
CN113065614B (en) Training method of classification model and method for classifying target object
CN113222942A (en) Training method of multi-label classification model and method for predicting labels
CN113657289B (en) Training method and device of threshold estimation model and electronic equipment
CN113379813A (en) Training method and device of depth estimation model, electronic equipment and storage medium
CN113627361B (en) Training method and device for face recognition model and computer program product
CN113436100A (en) Method, apparatus, device, medium and product for repairing video
CN113420682A (en) Target detection method and device in vehicle-road cooperation and road side equipment
CN113591566A (en) Training method and device of image recognition model, electronic equipment and storage medium
CN114120414B (en) Image processing method, image processing apparatus, electronic device, and medium
CN113205041A (en) Structured information extraction method, device, equipment and storage medium
CN115719436A (en) Model training method, target detection method, device, equipment and storage medium
CN113362314B (en) Medical image recognition method, recognition model training method and device
CN113705362A (en) Training method and device of image detection model, electronic equipment and storage medium
CN113344862A (en) Defect detection method, defect detection device, electronic equipment and storage medium
CN112528995A (en) Method for training target detection model, target detection method and device
CN112967315A (en) Target tracking method and device and electronic equipment
CN114511743B (en) Detection model training, target detection method, device, equipment, medium and product
CN113591709B (en) Motion recognition method, apparatus, device, medium, and product
CN115311469A (en) Image labeling method, training method, image processing method and electronic equipment
CN114119990A (en) Method, apparatus and computer program product for image feature point matching
CN116524165B (en) Migration method, migration device, migration equipment and migration storage medium for three-dimensional expression model
CN113219505A (en) Method, device and equipment for acquiring GPS coordinates for vehicle-road cooperative tunnel scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210709

Assignee: Beijing Intellectual Property Management Co.,Ltd.

Assignor: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

Contract record no.: X2023110000096

Denomination of invention: Training methods for key point detection models and methods for detecting key points of target objects

Granted publication date: 20220311

License type: Common License

Record date: 20230821