WO2018121737A1 - Key point prediction, network training and image processing method and apparatus, and electronic device - Google Patents
- Publication number
- WO2018121737A1 (PCT/CN2017/119877)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- key point
- prediction
- information
- neural network
- convolutional neural
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Definitions
- the embodiments of the present invention relate to the field of artificial intelligence technologies, and in particular to a key point prediction method, a network training method, an image processing method and apparatus, and an electronic device.
- key point prediction for a general object refers to predicting the key points of a general object (such as a human body, a vehicle, an animal, a plant, or a piece of furniture) in a natural scene, for example a person's head, hand, and torso positions, or a vehicle's front window, tire, chassis, and trunk positions.
- Key points for general purpose objects can be used to enhance the effects of applications such as general object detection and scene segmentation.
- the embodiment of the present application provides a key point prediction, network training, and image processing technical solution.
- a key point prediction method, including: detecting an image by using a first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained using sample images containing key point annotation information of general objects; and using the first convolutional neural network to predict key points of the general object in the image according to the feature information, obtaining a key point prediction result of the general object in the image, where the key point prediction result includes key point position prediction information and key point presence prediction information.
- the first convolutional neural network includes a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer being respectively connected to the feature extraction layer, where: the feature extraction layer is used to extract feature information of the image; the first key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1×1×2N, and the convolution kernel of the second key point prediction convolution layer is 1×1×N, where N is the total number of key points to be predicted.
- the first convolutional neural network comprises: a full convolutional neural network.
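Not part of the patent text: a minimal numpy sketch of the two-head structure described above, assuming the shared feature extraction layer has already produced a C-channel feature map; the shapes (C, H, W) and the random weights are illustrative only. A 1×1 convolution reduces to a matrix product over the channel axis, which is all the sketch needs.

```python
import numpy as np

def conv1x1(feat, weight):
    # feat: (C_in, H, W); weight: (C_out, C_in) -> output (C_out, H, W)
    return np.tensordot(weight, feat, axes=([1], [0]))

rng = np.random.default_rng(0)
N = 5                        # total number of key points to be predicted
C, H, W = 16, 8, 8           # illustrative feature-map size from the extraction layer
feat = rng.standard_normal((C, H, W))

w_pos = rng.standard_normal((2 * N, C))    # 1x1x2N head: an (x, y) pair per key point
w_exist = rng.standard_normal((N, C))      # 1x1xN head: one presence score per key point

pos_pred = conv1x1(feat, w_pos)        # key point position prediction information, (2N, H, W)
exist_pred = conv1x1(feat, w_exist)    # key point presence prediction information, (N, H, W)
```

Both heads read the same shared features, which is why the patent connects each prediction convolution layer directly to the feature extraction layer.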
- the training of the first convolutional neural network includes: acquiring the sample image, where the key point annotation information includes key point position annotation information and key point presence annotation information; training the first convolutional neural network using the sample image to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general object in the sample image; supervising the key point position prediction information and the key point presence prediction information according to an objective function, and determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and, if satisfied, completing the training of the first convolutional neural network.
- the training of the first convolutional neural network further includes: if the set condition is not satisfied, adjusting the parameters of the first convolutional neural network according to the obtained key point position prediction information and key point presence prediction information until the iterative loss rate satisfies the set condition.
- supervising the key point position prediction information and the key point presence prediction information according to an objective function includes: supervising the key point position prediction information according to a regression objective function, and supervising the key point presence prediction information according to a classification objective function.
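The two-part supervision above can be sketched as follows. The patent only names a regression objective for positions and a classification objective for presence; the specific choices here (smooth-L1 regression, sigmoid cross-entropy classification, and masking position loss for absent key points) are assumptions made for illustration.

```python
import numpy as np

def smooth_l1(pred, target):
    # assumed regression objective for key point positions
    d = np.abs(pred - target)
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)

def bce_with_logits(logit, label):
    # assumed classification objective for key point presence,
    # written in the numerically stable form
    return np.maximum(logit, 0) - logit * label + np.log1p(np.exp(-np.abs(logit)))

pos_pred = np.array([0.2, 0.8, 0.5, 0.5])   # predicted (x, y) for 2 key points
pos_gt   = np.array([0.25, 0.75, 0.0, 0.0])
exist_logit = np.array([3.0, -2.0])          # raw presence scores
exist_gt    = np.array([1.0, 0.0])           # key point 2 is absent in this sample

mask = np.repeat(exist_gt, 2)                # supervise positions only where the key point exists
loss = (smooth_l1(pos_pred, pos_gt) * mask).sum() + bce_with_logits(exist_logit, exist_gt).sum()
```

Masking the position term for absent key points is a common design choice when one head regresses coordinates and the other classifies presence, since there is no meaningful position target for a key point that does not exist.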
- a key point prediction network training method, including: acquiring a sample image containing key point annotation information of a general object, where the key point annotation information includes key point position annotation information and key point presence annotation information; training the first convolutional neural network using the sample image to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general object in the sample image; and supervising the key point position prediction information and the key point presence prediction information according to an objective function, determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition, and, if satisfied, completing the training of the first convolutional neural network.
- the method further includes: if the set condition is not satisfied, adjusting the parameters of the first convolutional neural network according to the obtained key point position prediction information and key point presence prediction information until the iterative loss rate satisfies the set condition.
- supervising the key point position prediction information and the key point presence prediction information according to an objective function includes: supervising the key point position prediction information according to a regression objective function, and supervising the key point presence prediction information according to a classification objective function.
- the first convolutional neural network includes a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer being respectively connected to the feature extraction layer, where: the feature extraction layer is used to extract feature information of the sample image; the first key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1×1×2N, and the convolution kernel of the second key point prediction convolution layer is 1×1×N, where N is the total number of key points to be predicted.
- the first convolutional neural network comprises: a full convolutional neural network.
- an image processing method, including: detecting an image using the key point prediction method according to any of the above embodiments, or detecting an image using the first convolutional neural network trained by the key point prediction network training method according to any of the above embodiments, to obtain a key point prediction result of the general object in the image, where the key point prediction result includes key point position prediction information and key point presence prediction information; and processing the image according to the key point prediction result of the general object.
- the processing the image according to a key point prediction result of the universal object comprises: determining a location of a general object in the image according to a key point prediction result of the universal object.
- the processing the image according to the key point prediction result of the universal object comprises: extracting an object feature of the universal object in the image according to the key point prediction result of the universal object.
- the processing the image according to a key point prediction result of the universal object comprises: estimating a posture of a general object in the image according to a key point prediction result of the universal object.
- the processing the image according to the key point prediction result of the universal object comprises: tracking a general object in the image according to a key point prediction result of the universal object.
- the processing the image according to the key point prediction result of the universal object comprises: identifying a general object in the image according to a key point prediction result of the universal object.
- the processing the image according to the key point prediction result of the universal object comprises: rendering a general object in the image according to a key point prediction result of the universal object.
- a key point prediction apparatus, including: a detection module configured to detect an image by using a first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained using sample images containing key point annotation information of general objects; and a prediction module configured to use the first convolutional neural network to predict key points of the general object in the image according to the feature information, obtaining a key point prediction result of the general object in the image, where the key point prediction result includes key point position prediction information and key point presence prediction information.
- the first convolutional neural network includes a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer being respectively connected to the feature extraction layer, where: the feature extraction layer is used to extract feature information of the image; the first key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1×1×2N, and the convolution kernel of the second key point prediction convolution layer is 1×1×N, where N is the total number of key points to be predicted.
- the first convolutional neural network comprises: a full convolutional neural network.
- the device further includes a training module configured to train the first convolutional neural network. The training module includes: an acquisition submodule configured to acquire the sample image, where the key point annotation information includes key point position annotation information and key point presence annotation information; a training submodule configured to train the first convolutional neural network using the sample image to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general object in the sample image; a supervising submodule configured to supervise the key point position prediction information and the key point presence prediction information according to an objective function; a determining submodule configured to determine whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and an execution submodule configured to complete the training of the first convolutional neural network if the iterative loss rate satisfies the set condition.
- the execution submodule is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and key point presence prediction information obtained by the training submodule, until the iterative loss rate satisfies the set condition.
- the supervising sub-module is configured to supervise the key point position prediction information according to a regression objective function, and supervise the key point existence prediction information according to the classification objective function.
- a key point prediction network training apparatus, including: an acquisition module configured to acquire a sample image containing key point annotation information of a general object, where the key point annotation information includes key point position annotation information and key point presence annotation information; a training module configured to train the first convolutional neural network using the sample image to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general object in the sample image; a supervising module configured to supervise the key point position prediction information and the key point presence prediction information according to an objective function; a determining module configured to determine whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and an execution module configured to complete training of the first convolutional neural network if the iterative loss rate satisfies the set condition.
- the execution module is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and key point presence prediction information obtained by the training module, until the iterative loss rate satisfies the set condition.
- the monitoring module is configured to supervise the key point position prediction information according to a regression objective function, and supervise the key point existence prediction information according to the classification objective function.
- the first convolutional neural network includes a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer being respectively connected to the feature extraction layer, where: the feature extraction layer is used to extract feature information of the sample image; the first key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is used to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1×1×2N, and the convolution kernel of the second key point prediction convolution layer is 1×1×N, where N is the total number of key points to be predicted.
- the first convolutional neural network comprises: a full convolutional neural network.
- an image processing apparatus, including: a detection module configured to detect an image using the key point prediction apparatus according to any of the above embodiments of the present application, or using the first convolutional neural network trained by the key point prediction network training apparatus according to any of the above embodiments, to obtain a key point prediction result of the general object in the image, where the key point prediction result includes key point position prediction information and key point presence prediction information; and a processing module configured to process the image according to the key point prediction result of the general object.
- the processing module includes: a location determining submodule, configured to determine a location of the universal object in the image according to a keypoint prediction result of the universal object.
- the processing module includes: a feature extraction sub-module, configured to extract an object feature of the universal object in the image according to the key point prediction result of the universal object.
- the processing module includes: a posture estimation submodule, configured to estimate a posture of the universal object in the image according to a key point prediction result of the universal object.
- the processing module includes: an object tracking sub-module, configured to track a general object in the image according to a key point prediction result of the universal object.
- the processing module includes: an object recognition submodule, configured to identify a general object in the image according to a key point prediction result of the universal object.
- the processing module includes: an object rendering sub-module, configured to render a general object in the image according to a key point prediction result of the universal object.
- an electronic device, including a processor and a memory; the memory is configured to store at least one executable instruction that causes the processor to perform the operations corresponding to the object key point prediction method according to any of the above embodiments of the present application, or the operations corresponding to the object key point prediction network training method according to any of the above embodiments, or the operations corresponding to the image processing method according to any of the above embodiments.
- another electronic device including:
- the processor and the key point prediction apparatus according to any of the above embodiments of the present application; when the processor runs the key point prediction apparatus, the units in the key point prediction apparatus according to any of the above embodiments are run; or
- the processor and the key point prediction network training apparatus according to any of the above embodiments of the present application; when the processor runs the key point prediction network training apparatus, the units in the key point prediction network training apparatus according to any of the above embodiments are run; or
- the processor and the image processing apparatus according to any of the above embodiments of the present application; when the processor runs the image processing apparatus, the units in the image processing apparatus according to any of the above embodiments are run.
- a computer program including computer readable code; when the computer readable code is run on a device, a processor in the device executes instructions for implementing the steps of the object key point prediction method according to any of the above embodiments of the present application; or
- the processor in the device executes instructions for implementing the steps in the object keypoint prediction network training method as described in any of the above embodiments of the present application; or
- the processor in the device executes instructions for implementing the steps in the image processing method as described in any of the above-described embodiments of the present application.
- a computer readable storage medium for storing computer readable instructions that, when executed, implement the operations of the steps of the object key point prediction method according to any of the above embodiments of the present application, or the operations of the steps of the object key point prediction network training method according to any of the above embodiments, or the operations of the steps of the image processing method according to any of the above embodiments.
- the first convolutional neural network is trained using sample images containing key point annotation information of general objects, and the trained first convolutional neural network is used to predict the key points of the general object in the image.
- a general object can be understood as a common object in a natural scene, such as a human body, a vehicle, an animal, a plant, or a piece of furniture; its key points include, for example, a person's head, hand, and torso positions, or a vehicle's front window, tire, and chassis positions.
- the first convolutional neural network expands the range of object classes for which key points can be predicted.
- the key point position prediction information and the key point presence prediction information of the general object can be directly obtained. The key point position prediction information is the position information of the key point to be predicted in the image, and the key point presence prediction information indicates whether the key point to be predicted exists in the image. When the position information of a key point to be predicted is obtained in the image and the key point is determined to exist in the image, the key point can be predicted; combining the key point position prediction information and the key point presence prediction information of the general object determines the key points of the general object in the image.
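To make the combination of the two outputs concrete, here is a hypothetical decoding step (not specified in the patent): the raw presence scores are squashed through a sigmoid, and a position is reported only for key points whose presence probability clears a threshold. The 0.5 threshold and the flat (x, y) layout of the position vector are assumptions.

```python
import numpy as np

def decode_keypoints(pos_pred, exist_logit, threshold=0.5):
    # pos_pred: flat array of (x, y) pairs, length 2N; exist_logit: length N
    prob = 1.0 / (1.0 + np.exp(-exist_logit))   # sigmoid presence probability
    keypoints = {}
    for i, p in enumerate(prob):
        if p >= threshold:                       # keep only key points judged to exist
            keypoints[i] = (pos_pred[2 * i], pos_pred[2 * i + 1])
    return keypoints

pos = np.array([0.3, 0.6, 0.1, 0.9])
logits = np.array([2.0, -3.0])
result = decode_keypoints(pos, logits)   # only key point 0 survives the threshold
```

This mirrors the passage above: a key point is output only when both its position prediction and its presence prediction agree that it is there.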
- FIG. 1 is a flowchart of a key point prediction method according to an embodiment of the present application.
- FIG. 2 is a flowchart of a key point prediction method according to another embodiment of the present application.
- FIG. 3 is a flowchart of training a first convolutional neural network in a key point prediction method according to another embodiment of the present application
- FIG. 4 is a schematic diagram of a training principle of a first convolutional neural network according to an embodiment of the present application
- FIG. 5 is a flowchart of an image processing method according to an embodiment of the present application.
- FIG. 6 is a structural block diagram of a key point prediction apparatus according to an embodiment of the present application.
- FIG. 7 is a structural block diagram of a key point prediction apparatus according to another embodiment of the present application.
- FIG. 8 is a structural block diagram of a key point prediction network training apparatus according to an embodiment of the present application.
- FIG. 9 is a block diagram showing the structure of an image processing apparatus according to an embodiment of the present application.
- FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
- Embodiments of the present application can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general purpose or special purpose computing system environments or configurations.
- Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, small computer systems, mainframe computer systems, and distributed cloud computing environments including any of the above.
- Electronic devices such as terminal devices, computer systems, servers, etc., can be described in the general context of computer system executable instructions (such as program modules) being executed by a computer system.
- program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types.
- the computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network.
- program modules may be located on a local or remote computing system storage medium including storage devices.
- this embodiment takes predicting the key points of a general object in an image as an example scenario, and takes a mobile terminal or a PC as the executor of the key point prediction method, to describe the key point prediction method of this embodiment. Other application scenarios, and other devices with data collection, processing, and transmission functions, can also implement the key point prediction solution provided by the embodiments of the present application; the applicable scenarios are not limited.
- a universal object refers to any object existing in a natural scene, such as an object such as a human body, a vehicle, an animal or a plant, or a piece of furniture.
- the key point of a general object is a local position that most objects in the object category to which the general object belongs share and that can be used to distinguish the object type, for example a person's head, hand, and torso positions, or a vehicle's front window, tire, chassis, and trunk positions.
- the key point prediction method of this embodiment includes:
- Step S100: detect an image by using a first convolutional neural network to obtain feature information of the image.
- the first convolutional neural network is a convolutional neural network trained using a sample image containing key point annotation information of a general object, and the first convolutional neural network is used to predict key point information of the general object in the image.
- the convolutional neural network in various embodiments of the present application is a neural network including convolution processing capability, and may include a convolutional layer, or a convolutional layer and a non-convolutional layer.
- the image may be derived from an image capture device and consist of frame-by-frame video images, or it may be a single image; it may also be derived from other devices. The image includes a static image or a video frame image.
- the image can be input to the first convolutional neural network to obtain feature information of the image.
- the feature information includes feature information of the general object.
- the step S100 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the detection module 600 executed by the processor.
- Step S102: use the first convolutional neural network to predict key points of the general object in the image according to the feature information, obtaining a key point prediction result of the general object in the image.
- the first convolutional neural network may include, for example but not limited to: an input layer, a feature extraction layer, and a keypoint prediction convolution layer.
- the input layer is used to input the image
- the feature extraction layer is used to extract the feature information of the image
- the key point prediction convolution layer is used to convolve the feature information to obtain the key point prediction result
- the key point prediction result includes key point position prediction information and key point existence prediction information.
- the step S102 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a prediction module 602 executed by the processor.
- the key point prediction result of the general object can be predicted from the image by using the trained first convolutional neural network. Because the first convolutional neural network in this embodiment is trained using sample images containing key point annotation information of general objects, it can predict the key points of objects of multiple categories, which expands the applicable range of the convolutional neural network for object key point prediction.
- FIG. 2 shows a flow chart of the steps of a key point prediction method according to another embodiment of the present application.
- the present embodiment still uses a mobile terminal or a PC as an example to describe the key point prediction method provided in this embodiment.
- Other devices and scenarios may be implemented by referring to this embodiment.
- This embodiment emphasizes the differences from the above-mentioned embodiments; for the common parts, reference may be made to the descriptions in the foregoing embodiments, and details are not repeated herein.
- the key point prediction method of this embodiment includes:
- Step S200 training the first convolutional neural network.
- This step S200 may include the following sub-steps:
- Sub-step S300 acquiring a sample image containing key point annotation information of the general object.
- the sample image containing key point labeling information of the general object may be a video image derived from an image capturing device, composed of frame-by-frame images, or it may be a single still image; it may also be derived from other devices and then labeled in the sample image.
- the key point labeling information includes the key point position labeling information and the key point existence labeling information.
- the key point of the general object and the key point position of the general object may be marked in the sample image.
- the present embodiment does not limit the source and the acquisition path of the sample image of the key point labeling information of the general object.
- the sub-step S300 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by an acquisition module 800 executed by the processor.
- Sub-step S302 training the first convolutional neural network using the sample image, obtaining key point position prediction information and key point existence prediction information of the general object of the first convolutional neural network for the sample image.
- the key point position prediction information can be understood as position information of a key point of the general object in the sample image, for example, coordinate point information or pixel point information.
- the key point existence prediction information can be understood as the existence information of a key point of the general object in the sample image, for example, whether a certain key point of the general object exists in the sample image. This embodiment does not restrict the content of the key point position prediction information or the key point existence prediction information.
- the first convolutional neural network may include: an input layer, a feature extraction layer, and a keypoint prediction convolution layer.
- the key point prediction convolution layer may include a first key point prediction convolution layer and a second key point prediction convolution layer, and the first key point prediction convolution layer and the second key point prediction convolution layer are respectively connected to the feature extraction layer.
- the input layer is used to input the sample image
- the feature extraction layer is used to extract the feature information of the sample image.
- the first key point prediction convolution layer is used to convolve the feature information to obtain the key point position prediction information
- the second key point prediction convolution layer is used to convolve the feature information to obtain the key point existence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1*1*2N
- the convolution kernel of the second key point prediction convolution layer is 1*1*N
- N is the total number of key points to be predicted
- N is an integer greater than or equal to 1.
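To make the head shapes concrete, here is a minimal numpy sketch of the two parallel 1*1 prediction convolution layers described above. The function names, feature-map shape, and random weights are illustrative assumptions, not part of the embodiment; only the channel counts (2N for position, N for existence) come from the text.

```python
import numpy as np

def conv1x1(features, weights):
    """Apply a 1x1 convolution: a per-pixel linear map over channels.

    features: (C, H, W) feature map from the feature extraction layer.
    weights:  (K, C) kernel, equivalent to a 1*1*K convolution kernel.
    Returns a (K, H, W) output map.
    """
    C, H, W = features.shape
    return (weights @ features.reshape(C, H * W)).reshape(-1, H, W)

N = 5                      # total number of key points to be predicted
C, H, W = 16, 4, 4         # illustrative feature-map shape (assumption)
rng = np.random.default_rng(0)
features = rng.standard_normal((C, H, W))

# First head: 1*1*2N kernel -> 2N channels (x and y for each key point).
w_position = rng.standard_normal((2 * N, C))
# Second head: 1*1*N kernel -> N channels (existence score per key point).
w_existence = rng.standard_normal((N, C))

position_map = conv1x1(features, w_position)    # shape (2N, H, W)
existence_map = conv1x1(features, w_existence)  # shape (N, H, W)
print(position_map.shape, existence_map.shape)  # (10, 4, 4) (5, 4, 4)
```

Because both heads read the same feature map, they can run in parallel, which matches the parallel relationship between the two prediction convolution layers described later in this embodiment.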
- Training the first convolutional neural network means training the parameters of its input layer, feature extraction layer, first key point prediction convolution layer and second key point prediction convolution layer, and then constructing the first convolutional neural network according to the trained parameters.
- the first convolutional neural network can be trained using sample images containing key point annotation information of a general object, so that the trained first convolutional neural network is more accurate. When selecting sample images, samples of various cases can be chosen: the images may include sample images labeled with key point annotation information of the general object, and may also include sample images not labeled with key point annotation information of the general object.
- the first convolutional neural network in this embodiment may include a full convolutional neural network, that is, a neural network composed entirely of convolutional layers.
- the first convolutional neural network in each embodiment of the present application may be a convolutional neural network of any structure. This embodiment is only used as an example for description. In practical applications, the first convolutional neural network is not limited thereto. For example, it can also be other two-class or multi-class convolutional neural networks.
- the sub-step S302 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a training module 802 executed by the processor.
- Sub-step S304 supervising the key point position prediction information and the key point existence prediction information according to the objective function.
- the key point position prediction information and the key point existence prediction information are supervised simultaneously according to objective functions: the key point position prediction information is supervised according to a regression objective function, such as a smooth L1 objective function or a Euclidean objective function, and the key point existence prediction information is supervised according to a classification objective function, such as a softmax objective function, a cross-entropy objective function, or a hinge objective function.
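As an illustration of this supervision scheme, the following sketch implements a smooth L1 regression objective and a softmax classification objective in plain numpy. The exact formulas, thresholds, and example values are illustrative assumptions, not the embodiment's definitive objective functions.

```python
import numpy as np

def smooth_l1(pred, target):
    """Smooth L1 (Huber-like) regression objective, averaged over elements."""
    d = np.abs(pred - target)
    per_elem = np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
    return per_elem.mean()

def softmax_cross_entropy(logits, labels):
    """Softmax classification objective for key point existence.

    logits: (N, 2) existence scores per key point (absent/present).
    labels: (N,) ground-truth existence labels in {0, 1}.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Illustrative values: 3 key points with (x, y) position predictions.
pred_xy = np.array([0.2, 0.4, 0.9, 0.1, 0.5, 0.5])
true_xy = np.array([0.0, 0.5, 1.0, 0.0, 0.5, 0.5])
logits  = np.array([[0.1, 2.0], [1.5, -0.5], [0.0, 0.0]])
labels  = np.array([1, 0, 1])

# The two objectives supervise the two heads simultaneously.
loss = smooth_l1(pred_xy, true_xy) + softmax_cross_entropy(logits, labels)
print(round(float(loss), 4))
```

Summing the two terms is one common way to supervise both tasks at once; the embodiment itself only requires that both objectives be applied, not a specific combination.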
- the sub-step S304 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a supervisory module 804 executed by the processor.
- Sub-step S306 determining whether the iterative loss rate of the first convolutional neural network satisfies the set condition.
- If yes, sub-step S308 is performed; if not, sub-step S310 is performed.
- the set condition may be that the iteration loss rate remains unchanged during a predetermined number of training iterations of the first convolutional neural network, or that the change of the iterative loss rate stays within a certain range; this embodiment does not limit the content of the set condition.
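One possible reading of this set condition can be sketched as a simple plateau check on the recent loss history. The function name and the `patience` and `tolerance` parameters below are hypothetical, not specified by the embodiment.

```python
def loss_rate_converged(loss_history, patience=5, tolerance=1e-4):
    """Check the set condition: the iteration loss rate stays (nearly)
    unchanged over the last `patience` training iterations.

    loss_history: list of per-iteration loss values, most recent last.
    """
    if len(loss_history) < patience:
        return False
    recent = loss_history[-patience:]
    return max(recent) - min(recent) <= tolerance

# A loss curve that flattens out: training would stop once it plateaus.
history = [0.9, 0.5, 0.30, 0.20, 0.1500, 0.15002, 0.15001, 0.15003, 0.15002]
print(loss_rate_converged(history))  # True
```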
- the sub-step S306 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a decision module 806 executed by the processor.
- Sub-step S308 completing the training of the first convolutional neural network.
- the sub-step S308 may be performed by a processor invoking a corresponding instruction stored in a memory or by an execution module 808 executed by the processor.
- Sub-step S310 adjusting parameters of the first convolutional neural network according to the obtained key point position prediction information and key point presence prediction information until the iterative loss rate satisfies the set condition.
- If the iterative loss rate of the first convolutional neural network does not satisfy the set condition, the obtained key point position prediction information and key point existence prediction information do not correspond to the key point position labeling information and key point existence labeling information in the sample image; that is, the parameters of the currently trained first convolutional neural network are not yet accurate enough and need to be adjusted accordingly.
- this embodiment does not restrict the adjustment process of the parameters of the first convolutional neural network. When the iterative loss rate of the parameter-adjusted first convolutional neural network satisfies the set condition, it is determined that the training of the first convolutional neural network is completed.
- the sub-step S310 may be performed by a processor invoking a corresponding instruction stored in a memory or by an execution module 808 executed by the processor.
- the step S200 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a keypoint predictive network training device operated by the processor.
- Step S202 Detecting an image by using a first convolutional neural network to obtain feature information of the image.
- the step S202 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the detection module 700 being executed by the processor.
- Step S204 Using a first convolutional neural network to predict a key point of the general object of the image according to the feature information, and obtaining a key point prediction result of the universal object of the image.
- the step S204 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a prediction module 702 being executed by the processor.
- FIG. 4 shows a schematic diagram of the training principle of the first convolutional neural network according to an embodiment of the present application. Since a full convolutional neural network runs faster than a non-full convolutional neural network, in this embodiment the first convolutional neural network uses a full convolutional neural network; however, this embodiment does not limit the first convolutional neural network to being full convolutional or non-full convolutional. Taking a full convolutional neural network as an example, the sample image is input to the full convolutional neural network, and the feature information of the sample image is obtained from its feature extraction layer.
- the first key point prediction convolution layer is used to convolve the feature information to obtain the key point position prediction information. Even if a key point does not exist, the first key point prediction convolution layer will still randomly predict key point position information for the non-existent key point.
- the convolution operation is performed on the feature information by the second key point prediction convolution layer to obtain the key point existence prediction information.
- the smooth L1 objective function is used to supervise the regression task of training the key point position prediction information, and the softmax objective function is used to supervise the classification task of training the key point existence prediction information. The key point position prediction information and the key point existence prediction information together predict the key points of the general object in the sample image.
- the key point prediction result of the general object can be predicted from the image by using the trained first convolutional neural network. Because the first convolutional neural network in this embodiment is trained using sample images containing key point annotation information of general objects, it can predict the key points of objects of multiple categories, which expands the applicable range of the convolutional neural network for object key point prediction.
- the key point prediction result in this embodiment may include key point position prediction information and key point existence prediction information, where the key point position prediction information is the position information of a key point to be predicted in the image, and the key point existence prediction information is information on whether the key point to be predicted exists in the image.
- the key points of the general object in the image can be comprehensively judged from the key point position prediction information and the key point existence prediction information of the object.
- the key point prediction result in this embodiment includes not only the key point position prediction information but also the key point existence prediction information; adding the prediction of whether a key point exists improves the accuracy of the key point prediction.
- the first convolutional neural network in this embodiment may include a first key point prediction convolution layer and a second key point prediction convolution layer, which are respectively connected to the feature extraction layer. After the feature extraction layer extracts the feature information of the image, the first key point prediction convolution layer and the second key point prediction convolution layer may convolve the feature information in parallel; the two layers are in a parallel relationship, that is, the key point position prediction information and the key point existence prediction information are predicted simultaneously.
- the key point position prediction information includes [x1, y1, x2, y2, ..., xN, yN], where x1, x2, ..., xN represent the abscissa information of the key points in the sample image, and y1, y2, ..., yN represent the ordinate information of the key points in the sample image.
- the key point existence prediction information includes [s1, s2, ..., sN], where s1, s2, ..., sN represent the existence information of the key point in the sample image.
- the first key point prediction convolution layer in this embodiment is used to convolve the feature information to obtain the key point position prediction information. Since the key point position prediction information includes both abscissa information and ordinate information, the convolution kernel of the first key point prediction convolution layer is 1*1*2N.
- the second key point prediction convolution layer is used to convolve the feature information to obtain the key point existence prediction information. Since each key point either exists or does not exist, the convolution kernel of the second key point prediction convolution layer is 1*1*N.
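Combining the outputs of the two heads into final key points might look like the following sketch. The existence scores are assumed here to already be probabilities, and the decoding threshold and function name are illustrative assumptions, not specified by the embodiment.

```python
import numpy as np

def decode_keypoints(position, existence_scores, threshold=0.5):
    """Combine the two head outputs into final key points.

    position: flat array [x1, y1, ..., xN, yN] from the first head.
    existence_scores: array [s1, ..., sN] from the second head
        (assumed here to already be probabilities of existence).
    Returns a list with an (x, y) tuple per existing key point and
    None per absent key point.
    """
    xy = np.asarray(position, dtype=float).reshape(-1, 2)
    keypoints = []
    for (x, y), s in zip(xy, existence_scores):
        keypoints.append((float(x), float(y)) if s >= threshold else None)
    return keypoints

# N = 3 key points: the second one is predicted not to exist, so its
# (randomly predicted) position is discarded.
position = [10.0, 20.0, 55.0, 60.0, 5.0, 5.0]
scores = [0.9, 0.2, 0.7]
print(decode_keypoints(position, scores))
# [(10.0, 20.0), None, (5.0, 5.0)]
```

This shows why the existence head matters: without it, the randomly predicted position of a non-existent key point (55.0, 60.0 above) would be reported as a real key point.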
- the first convolutional neural network in this embodiment may be a full convolutional neural network. Since a full convolutional neural network runs faster than a non-full convolutional neural network, predicting key points with the first convolutional neural network is faster than predicting key points with a non-full convolutional neural network.
- the image processing method of this embodiment may be performed by any device having data acquisition, processing, and transmission functions, including but not limited to a mobile terminal, a PC, and the like.
- the image processing method of this embodiment includes:
- Step S500 Perform key point prediction on the image to obtain a key point prediction result of the general object in the image.
- the key point prediction on the image may be performed by using the first convolutional neural network trained in the above embodiments, or by using the key point prediction method in the above embodiments.
- the step S500 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the detection module 900 being executed by the processor.
- Step S502 Processing the image according to the key point prediction result of the general object.
- step S502 may be performed by a processor invoking a corresponding instruction stored in a memory or by a processing module 902 being executed by the processor.
- the image may be subjected to various processing according to the key point prediction result of the general object, such as: determining the position of the general object in the image; extracting the object features of the general object in the image; estimating the pose of the general object in the image; tracking the general object in the image; identifying the general object in the image; or rendering the general object in the image, and so on.
- This embodiment is described only by taking determining the position of the general object in the image according to the key point prediction result as an example; other ways of processing the image according to the key point prediction result of the general object can be performed with reference to commonly used processing methods.
- the embodiment does not limit the technical means for processing an image according to a key point prediction result of a general object.
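As a sketch of the position-determination example above, one common approach (an assumption, not specified by the embodiment) is to take the bounding box of the key points predicted to exist:

```python
def object_position(keypoints):
    """Estimate the general object's position in the image as the bounding
    box of its existing key points (None marks an absent key point).

    Returns (x_min, y_min, x_max, y_max), or None if no key point exists.
    """
    present = [p for p in keypoints if p is not None]
    if not present:
        return None
    xs = [x for x, _ in present]
    ys = [y for _, y in present]
    return (min(xs), min(ys), max(xs), max(ys))

# Key points as decoded from the two prediction heads (illustrative values).
kps = [(10.0, 20.0), None, (5.0, 5.0), (30.0, 12.0)]
print(object_position(kps))  # (5.0, 5.0, 30.0, 20.0)
```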
- for example, after predicting the key point position prediction information and the key point existence prediction information of a general object such as a cat, the position, orientation, posture and the like of the cat can be determined based on the above key point information.
- any of the above methods provided by the embodiments of the present application may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like.
- any of the foregoing methods provided by the embodiments of the present application may be executed by a processor; for example, the processor performs any of the foregoing methods by invoking a corresponding instruction stored in the memory. This will not be repeated below.
- the foregoing program may be stored in a computer readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes a medium that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.
- the key point prediction apparatus includes: a detection module 600, configured to detect an image by using a first convolutional neural network and obtain feature information of the image, where the first convolutional neural network is a convolutional neural network obtained by training using a sample image containing key point annotation information of a general object; and a prediction module 602, configured to use the first convolutional neural network to predict key points of the general object of the image according to the feature information and obtain a key point prediction result of the general object of the image, where the key point prediction result includes key point position prediction information and key point existence prediction information.
- the key point prediction result of the general object can be predicted from the image by using the trained first convolutional neural network. Because the first convolutional neural network in this embodiment is trained using sample images containing key point annotation information of general objects, it can predict the key points of objects of multiple categories, which expands the applicable range of the convolutional neural network for object key point prediction.
- the key point prediction apparatus includes: a detection module 700, configured to detect an image by using a first convolutional neural network and obtain feature information of the image, where the first convolutional neural network is a convolutional neural network obtained by training using a sample image containing key point annotation information of a general object; and a prediction module 702, configured to use the first convolutional neural network to predict key points of the general object of the image according to the feature information and obtain a key point prediction result of the general object of the image, where the key point prediction result includes key point position prediction information and key point existence prediction information.
- the first convolutional neural network may include a feature extraction layer, a first key point prediction convolution layer and a second key point prediction convolution layer, where the first key point prediction convolution layer and the second key point prediction convolution layer are respectively connected to the feature extraction layer. The feature extraction layer is used to extract feature information of the image; the first key point prediction convolution layer is used to convolve the feature information to obtain key point position prediction information; and the second key point prediction convolution layer is used to convolve the feature information to obtain key point existence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1*1*2N
- the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
- the first convolutional neural network comprises a full convolutional neural network.
- the key point prediction apparatus further includes: a training module 704, configured to train the first convolutional neural network
- the training module 704 includes: an acquisition sub-module 7040, configured to acquire a sample image containing key point labeling information of a general object, where the key point labeling information includes key point position labeling information and key point existence labeling information;
- a training sub-module 7042, configured to train the first convolutional neural network using the sample image to obtain key point position prediction information and key point existence prediction information of the general object of the first convolutional neural network for the sample image;
- the supervision sub-module 7044 is configured to supervise the key point position prediction information and the key point existence prediction information according to the objective function;
- a determining sub-module 7046, configured to determine whether the iterative loss rate of the first convolutional neural network satisfies the set condition; and an executing sub-module 7048, configured to complete the training of the first convolutional neural network if the iterative loss rate of the first convolutional neural network satisfies the set condition.
- the executing sub-module 7048 is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and the key point existence prediction information obtained by the training sub-module 7042, until the iterative loss rate satisfies the set condition.
- the supervising sub-module 7044 is configured to supervise the key point position prediction information according to the regression objective function and, at the same time, supervise the key point existence prediction information according to the classification objective function.
- the key point prediction result of the general object can be predicted from the image by using the trained first convolutional neural network. Because the first convolutional neural network in this embodiment is trained using sample images containing key point annotation information of general objects, it can predict the key points of objects of multiple categories, which expands the applicable range of the convolutional neural network for object key point prediction.
- the key point prediction result in this embodiment includes key point position prediction information and key point existence prediction information, where the key point position prediction information is the position information of the key point to be predicted in the image, and the key point existence prediction information is information on whether the key point to be predicted exists in the image.
- the key points of the general object in the image can be comprehensively judged from the key point position prediction information and the key point existence prediction information.
- the key point prediction result in this embodiment includes not only the key point position prediction information but also the key point existence prediction information, which increases the prediction of whether the key point exists, and improves the accuracy of the key point prediction.
- the first convolutional neural network in this embodiment includes a first key point prediction convolution layer and a second key point prediction convolution layer, which are respectively connected to the feature extraction layer. After the feature extraction layer extracts the feature information of the image, the first key point prediction convolution layer and the second key point prediction convolution layer can convolve the feature information in parallel; the two layers are in a parallel relationship, that is, the key point position prediction information and the key point existence prediction information are predicted simultaneously.
- the key point position prediction information includes [x1, y1, x2, y2, ..., xN, yN], where x1, x2, ..., xN represent the abscissa information of the key points in the sample image, and y1, y2, ..., yN represent the ordinate information of the key points in the sample image.
- the key point existence prediction information includes [s1, s2, ..., sN], where s1, s2, ..., sN represent the existence information of the key points in the sample image. This embodiment improves the efficiency of the first convolutional neural network in predicting key points.
- the first key point prediction convolution layer in this embodiment is used to convolve the feature information to obtain the key point position prediction information. Since the key point position prediction information includes both abscissa information and ordinate information, the convolution kernel of the first key point prediction convolution layer is 1*1*2N.
- the second key point prediction convolution layer is used to convolve the feature information to obtain the key point existence prediction information. Since each key point either exists or does not exist, the convolution kernel of the second key point prediction convolution layer is 1*1*N.
- the first convolutional neural network in this embodiment may be a full convolutional neural network. Since a full convolutional neural network runs faster than a non-full convolutional neural network, predicting key points with the first convolutional neural network is faster than predicting key points with a non-full convolutional neural network.
- FIG. 8 a block diagram of a key point prediction network training apparatus according to an embodiment of the present application is shown.
- the key point prediction network training apparatus includes: an obtaining module 800, configured to acquire a sample image of key point labeling information of a general object, wherein the key point labeling information includes key point position labeling information and key point presence labeling information.
- a training module 802, configured to train the first convolutional neural network using the sample image and obtain key point position prediction information and key point existence prediction information of the general object of the first convolutional neural network for the sample image; and a supervision module 804, configured to supervise the key point position prediction information and the key point existence prediction information according to the objective function;
- a determining module 806, configured to determine whether the iterative loss rate of the first convolutional neural network satisfies the set condition; and an executing module 808, configured to complete the training of the first convolutional neural network if the iterative loss rate of the first convolutional neural network satisfies the set condition.
- the executing module 808 is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and the key point existence prediction information obtained by the training module 802, until the iterative loss rate satisfies the set condition.
- the supervision module 804 is configured to supervise the key point position prediction information according to the regression objective function, and to supervise the key point existence prediction information according to the classification objective function.
- the first convolutional neural network may include a feature extraction layer, a first key point prediction convolution layer and a second key point prediction convolution layer, where the first key point prediction convolution layer and the second key point prediction convolution layer are respectively connected to the feature extraction layer. The feature extraction layer is used to extract feature information of the sample image; the first key point prediction convolution layer is used to convolve the feature information to obtain key point position prediction information; and the second key point prediction convolution layer is used to convolve the feature information to obtain key point existence prediction information.
- the convolution kernel of the first key point prediction convolution layer is 1*1*2N, and the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
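Because both prediction layers use 1*1 kernels, each is simply a per-pixel linear map over the feature channels: the position head produces 2N channels (an x and a y offset per key point) and the presence head produces N channels (one score per key point). A shape-level sketch with random stand-in weights (the real feature extractor and trained weights are not reproduced here):

```python
import numpy as np

N = 5                      # example number of key points to predict
H, W, C = 8, 8, 64         # spatial size and channel count of the feature map

rng = np.random.default_rng(0)
features = rng.standard_normal((H, W, C))  # stand-in for feature extraction output

# A 1*1 convolution collapses to a (C_in, C_out) matrix applied at every pixel.
w_pos = rng.standard_normal((C, 2 * N))    # first head: 2N position channels
w_exist = rng.standard_normal((C, N))      # second head: N presence channels

pos_map = features @ w_pos       # (H, W, 2N) key point position prediction map
exist_map = features @ w_exist   # (H, W, N) key point presence prediction map
```

The point of the sketch is only the channel arithmetic: 2N output channels encode coordinates, N output channels encode presence, matching the 1*1*2N and 1*1*N kernels described above.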
- the first convolutional neural network comprises a full convolutional neural network.
- the key point prediction network training apparatus of the present embodiment is used to implement the corresponding key point prediction network training method in the foregoing various embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.
- the image processing apparatus includes: a detection module 900, configured to detect an image using the key point prediction apparatus according to any of the above embodiments of the present application, or using the first convolutional neural network trained by the key point prediction network training apparatus according to any of the above embodiments of the present application, to obtain a key point prediction result for a general object in the image, where the key point prediction result includes key point position prediction information and key point presence prediction information; and a processing module 902, configured to process the image according to the key point prediction result of the general object.
- the processing module 902 includes: a location determining sub-module 9020, configured to determine the location of the general object in the image according to the key point prediction result of the general object.
- the processing module 902 includes: a feature extraction sub-module 9021, configured to extract object features of the general object in the image according to the key point prediction result of the general object.
- the processing module 902 includes: a posture estimation sub-module 9022, configured to estimate the posture of the general object in the image according to the key point prediction result of the general object.
- the processing module 902 includes: an object tracking sub-module 9023, configured to track the general object in the image according to the key point prediction result of the general object.
- the processing module 902 includes: an object recognition sub-module 9024, configured to identify the general object in the image according to the key point prediction result of the general object.
- the processing module 902 includes: an object rendering sub-module 9025, configured to render the general object in the image according to the key point prediction result of the general object.
- the image processing apparatus of the present embodiment is used to implement the corresponding image processing method in the foregoing various embodiments, and has the beneficial effects of the corresponding method embodiments, and details are not described herein again.
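One plausible realization of the location determining sub-module described above is to take the object's location as the bounding box of the key points whose presence prediction passes a threshold; the embodiments do not fix a specific rule, so both the thresholding and the box construction below are illustrative assumptions:

```python
def object_location(keypoints, presence, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) over the key points whose presence
    prediction meets the threshold, or None if no key point is present."""
    pts = [(x, y) for (x, y), p in zip(keypoints, presence) if p >= threshold]
    if not pts:
        return None
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))

box = object_location(
    keypoints=[(10, 20), (30, 5), (200, 300)],
    presence=[0.9, 0.8, 0.1],   # third key point predicted absent
)
# box covers only the two present key points: (10, 5, 30, 20)
```

Using the presence predictions to filter the position predictions is what distinguishes this design from a pure regression head: absent key points do not distort the located region.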
- an embodiment of the present application further provides an electronic device, including: a processor and a memory;
- the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform an operation corresponding to the object key point prediction method according to any one of the foregoing embodiments of the present application; or
- the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform an operation corresponding to the object keypoint prediction network training method described in any one of the foregoing embodiments of the present application; or
- the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform an operation corresponding to the image processing method described in any of the above embodiments of the present application.
- the embodiment of the present application further provides another electronic device, including:
- a processor and the key point prediction network training apparatus according to any of the above embodiments of the present application; when the processor runs the key point prediction network training apparatus, the units in the key point prediction network training apparatus according to any of the above embodiments of the present application are executed; or
- a processor and the image processing apparatus according to any of the above embodiments of the present application; when the processor runs the image processing apparatus, the units in the image processing apparatus according to any of the above embodiments of the present application are executed.
- the embodiment of the present application further provides an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, a server, and the like.
- a schematic structural diagram of an electronic device 1000 suitable for implementing a terminal device or a server of an embodiment of the present application is shown.
- the electronic device 1000 includes one or more processors and communication components.
- the processors include, for example, one or more central processing units (CPUs) 1001 and/or one or more graphics processing units (GPUs) 1013; a processor performs various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 1002 or loaded from the storage portion 1008 into a random access memory (RAM) 1003.
- the communication component includes a communication component 1012 and/or a communication interface 1009. The communication component 1012 may include, but is not limited to, a network card, such as an IB (InfiniBand) network card; the communication interface 1009 includes the interface of a network interface card such as a LAN card or a modem, and performs communication processing via a network such as the Internet.
- the processor can communicate with the read-only memory 1002 and/or the random access memory 1003 to execute the executable instructions, connect to the communication component 1012 via the communication bus 1004, and communicate with other target devices via the communication component 1012, thereby completing operations corresponding to any of the key point prediction methods provided by the embodiments of the present application, for example: detecting an image with the first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained with sample images containing key point annotation information of general objects; and predicting key points of a general object in the image with the first convolutional neural network according to the feature information, to obtain a key point prediction result for the general object in the image, the key point prediction result including key point position prediction information and key point presence prediction information.
- or, for example: acquiring sample images containing key point annotation information of general objects, where the key point annotation information includes key point position annotation information and key point presence annotation information; training the first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general objects in the sample images; supervising the key point position prediction information and the key point presence prediction information according to an objective function, and determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and if satisfied, completing the training of the first convolutional neural network.
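The train/supervise/check-condition cycle described above can be sketched as a loop. The scalar "network" and hand-written gradient step below are deliberately tiny stand-ins for the convolutional network and its optimizer; only the control flow (predict, compute loss, stop when the set condition holds, otherwise adjust parameters) mirrors the description:

```python
def train(weight, samples, labels, threshold=1e-4, lr=0.1, max_iters=1000):
    # Skeleton of the described procedure, with mean-squared error as a
    # stand-in objective and a fixed loss threshold as the "set condition".
    for _ in range(max_iters):
        preds = [weight * x for x in samples]                     # forward pass
        loss = sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(samples)
        if loss < threshold:                                      # set condition met
            return weight, loss                                   # training complete
        grad = sum(2 * (p - y) * x
                   for p, y, x in zip(preds, labels, samples)) / len(samples)
        weight -= lr * grad                                       # adjust parameters
    return weight, loss

# Toy data where the optimal weight is 2.0:
w, final_loss = train(weight=0.0, samples=[1.0, 2.0, 3.0], labels=[2.0, 4.0, 6.0])
```

The early-exit branch corresponds to "if satisfied, completing the training"; the gradient update corresponds to "adjusting the parameters ... until the iterative loss rate satisfies the set condition".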
- or, for example: detecting an image with the key point prediction method according to any of the above embodiments, or detecting the image with a first convolutional neural network trained by the key point prediction network training method according to any of the above embodiments.
- the RAM 1003 can store various programs and data required for the operation of the device.
- the CPU 1001 or the GPU 1013, the ROM 1002, and the RAM 1003 are connected to each other through the communication bus 1004.
- ROM 1002 is an optional module.
- the RAM 1003 stores executable instructions, or writes executable instructions to the ROM 1002 at runtime, the executable instructions causing the processor to perform operations corresponding to the above-described communication methods.
- An input/output (I/O) interface 1005 is also coupled to communication bus 1004.
- the communication component 1012 can be integrated, or can be configured with multiple sub-modules (e.g., multiple IB network cards) linked on the communication bus.
- the following components are connected to the I/O interface 1005: an input portion 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 1008 including a hard disk and the like; and a communication interface 1009 including a network interface card such as a LAN card or a modem.
- a drive 1010 is also coupled to the I/O interface 1005 as needed.
- a removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read therefrom can be installed into the storage portion 1008 as needed.
- FIG. 10 shows only an optional implementation. The number and types of the components in FIG. 10 may be selected, deleted, added, or replaced according to actual needs; different functional components may be implemented separately or in an integrated manner: for example, the GPU and the CPU may be set separately, or the GPU may be integrated on the CPU, and the communication component may be separate or integrated on the CPU or the GPU, and so on. These alternative embodiments all fall within the scope of the present application.
- embodiments of the present application may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for executing the method illustrated in the flowchart, the program code including instructions for correspondingly executing the method steps provided by the embodiments of the present application.
- the program code may include instructions corresponding to the following steps provided in the embodiments of the present application: detecting an image with the first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained with sample images containing key point annotation information of general objects; and predicting key points of a general object in the image with the first convolutional neural network according to the feature information, to obtain a key point prediction result for the general object in the image, the key point prediction result including key point position prediction information and key point presence prediction information.
- the program code may include instructions corresponding to the following steps provided in the embodiments of the present application: acquiring sample images containing key point annotation information of general objects, where the key point annotation information includes key point position annotation information and key point presence annotation information; training the first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for the general objects in the sample images; supervising the key point position prediction information and the key point presence prediction information according to an objective function, and determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and if so, completing the training of the first convolutional neural network.
- the program code may include instructions corresponding to the following steps provided in the embodiments of the present application: detecting an image with the key point prediction method according to any of the above embodiments, or detecting the image with a first convolutional neural network trained by the key point prediction network training method according to any of the above embodiments, to obtain a key point prediction result for a general object in the image, the key point prediction result including key point position prediction information and key point presence prediction information.
- the computer program can be downloaded and installed from the network via a communication component, and/or installed from the removable medium 1011.
- the embodiments of the present application further provide a computer program including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the object key point prediction method according to any of the embodiments of the present application; or the processor executes instructions for implementing the steps of the object key point prediction network training method according to any of the embodiments of the present application; or the processor executes instructions for implementing the steps of the image processing method according to any of the embodiments of the present application.
- the embodiments of the present application further provide a computer-readable storage medium configured to store computer-readable instructions which, when executed, implement the operations of the steps of the object key point prediction method according to any of the embodiments of the present application.
- the methods, apparatuses, and devices of the present application may be implemented in many ways, for example, by software, hardware, firmware, or any combination of software, hardware, and firmware.
- the above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the embodiments of the present application are not limited to the order of the above optional description unless otherwise specified.
- the present application may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the embodiments of the present application.
- the present application also covers a recording medium storing a program for executing the method according to an embodiment of the present application.
Abstract
Description
Claims (44)
- A key point prediction method, characterized by comprising: detecting an image with a first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained with sample images containing key point annotation information of general objects; and predicting key points of a general object in the image with the first convolutional neural network according to the feature information, to obtain a key point prediction result for the general object in the image, the key point prediction result comprising key point position prediction information and key point presence prediction information.
- The method according to claim 1, wherein the first convolutional neural network comprises: a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer each being connected to the feature extraction layer; wherein the feature extraction layer is configured to extract the feature information of the image; the first key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- The method according to claim 2, wherein the convolution kernel of the first key point prediction convolution layer is 1*1*2N and the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
- The method according to any one of claims 1-3, wherein the first convolutional neural network comprises a fully convolutional neural network.
- The method according to any one of claims 1-4, wherein the training of the first convolutional neural network comprises: acquiring the sample images, the key point annotation information comprising key point position annotation information and key point presence annotation information; training the first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for general objects in the sample images; supervising the key point position prediction information and the key point presence prediction information according to an objective function, and determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and if satisfied, completing the training of the first convolutional neural network.
- The method according to claim 5, wherein the training of the first convolutional neural network further comprises: if the set condition is not satisfied, adjusting the parameters of the first convolutional neural network according to the obtained key point position prediction information and key point presence prediction information until the iterative loss rate satisfies the set condition.
- The method according to claim 5, wherein supervising the key point position prediction information and the key point presence prediction information according to the objective function comprises: supervising the key point position prediction information according to a regression objective function, and supervising the key point presence prediction information according to a classification objective function.
- A key point prediction network training method, characterized by comprising: acquiring sample images containing key point annotation information of general objects, wherein the key point annotation information comprises key point position annotation information and key point presence annotation information; training a first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for general objects in the sample images; supervising the key point position prediction information and the key point presence prediction information according to an objective function, and determining whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and if satisfied, completing the training of the first convolutional neural network.
- The method according to claim 8, further comprising: if the set condition is not satisfied, adjusting the parameters of the first convolutional neural network according to the obtained key point position prediction information and key point presence prediction information until the iterative loss rate satisfies the set condition.
- The method according to claim 8, wherein supervising the key point position prediction information and the key point presence prediction information according to the objective function comprises: supervising the key point position prediction information according to a regression objective function, and supervising the key point presence prediction information according to a classification objective function.
- The method according to any one of claims 8-10, wherein the first convolutional neural network comprises: a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer each being connected to the feature extraction layer; wherein the feature extraction layer is configured to extract feature information of the sample images; the first key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- The method according to claim 11, wherein the convolution kernel of the first key point prediction convolution layer is 1*1*2N and the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
- The method according to any one of claims 8-12, wherein the first convolutional neural network comprises a fully convolutional neural network.
- An image processing method, characterized by comprising: detecting an image with the method according to any one of claims 1-7, or detecting an image with a first convolutional neural network trained by the method according to any one of claims 8-13, to obtain a key point prediction result for a general object in the image, the key point prediction result comprising key point position prediction information and key point presence prediction information; and processing the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: determining the location of the general object in the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: extracting object features of the general object in the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: estimating the posture of the general object in the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: tracking the general object in the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: identifying the general object in the image according to the key point prediction result of the general object.
- The method according to claim 14, wherein processing the image according to the key point prediction result of the general object comprises: rendering the general object in the image according to the key point prediction result of the general object.
- A key point prediction apparatus, characterized by comprising: a detection module, configured to detect an image with a first convolutional neural network to obtain feature information of the image, the first convolutional neural network being a convolutional neural network trained with sample images containing key point annotation information of general objects; and a prediction module, configured to predict key points of a general object in the image with the first convolutional neural network according to the feature information, to obtain a key point prediction result for the general object in the image, the key point prediction result comprising key point position prediction information and key point presence prediction information.
- The apparatus according to claim 21, wherein the first convolutional neural network comprises: a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer each being connected to the feature extraction layer; wherein the feature extraction layer is configured to extract the feature information of the image; the first key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- The apparatus according to claim 22, wherein the convolution kernel of the first key point prediction convolution layer is 1*1*2N and the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
- The apparatus according to any one of claims 21-23, wherein the first convolutional neural network comprises a fully convolutional neural network.
- The apparatus according to any one of claims 21-24, further comprising: a training module, configured to train the first convolutional neural network; the training module comprising: an acquisition sub-module, configured to acquire the sample images, the key point annotation information comprising key point position annotation information and key point presence annotation information; a training sub-module, configured to train the first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for general objects in the sample images; a supervision sub-module, configured to supervise the key point position prediction information and the key point presence prediction information according to an objective function; a judging sub-module, configured to determine whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and an executing sub-module, configured to complete the training of the first convolutional neural network if the iterative loss rate of the first convolutional neural network satisfies the set condition.
- The apparatus according to claim 25, wherein the executing sub-module is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and the key point presence prediction information obtained by the training sub-module, until the iterative loss rate satisfies the set condition.
- The apparatus according to claim 25, wherein the supervision sub-module is configured to supervise the key point position prediction information according to a regression objective function, and to supervise the key point presence prediction information according to a classification objective function.
- A key point prediction network training apparatus, characterized by comprising: an acquisition module, configured to acquire sample images containing key point annotation information of general objects, wherein the key point annotation information comprises key point position annotation information and key point presence annotation information; a training module, configured to train a first convolutional neural network with the sample images to obtain key point position prediction information and key point presence prediction information of the first convolutional neural network for general objects in the sample images; a supervision module, configured to supervise the key point position prediction information and the key point presence prediction information according to an objective function; a judging module, configured to determine whether the iterative loss rate of the first convolutional neural network satisfies a set condition; and an executing module, configured to complete the training of the first convolutional neural network if the iterative loss rate of the first convolutional neural network satisfies the set condition.
- The apparatus according to claim 28, wherein the executing module is further configured to: if the iterative loss rate of the first convolutional neural network does not satisfy the set condition, adjust the parameters of the first convolutional neural network according to the key point position prediction information and the key point presence prediction information obtained by the training module, until the iterative loss rate satisfies the set condition.
- The apparatus according to claim 28, wherein the supervision module is configured to supervise the key point position prediction information according to a regression objective function, and to supervise the key point presence prediction information according to a classification objective function.
- The apparatus according to any one of claims 28-30, wherein the first convolutional neural network comprises: a feature extraction layer, a first key point prediction convolution layer, and a second key point prediction convolution layer, the first key point prediction convolution layer and the second key point prediction convolution layer each being connected to the feature extraction layer; wherein the feature extraction layer is configured to extract feature information of the sample images; the first key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point position prediction information; and the second key point prediction convolution layer is configured to perform a convolution operation on the feature information to obtain the key point presence prediction information.
- The apparatus according to claim 31, wherein the convolution kernel of the first key point prediction convolution layer is 1*1*2N and the convolution kernel of the second key point prediction convolution layer is 1*1*N, where N is the total number of key points to be predicted.
- The apparatus according to any one of claims 28-32, wherein the first convolutional neural network comprises a fully convolutional neural network.
- An image processing apparatus, characterized by comprising: a detection module, configured to detect an image with the apparatus according to any one of claims 21-27, or with a first convolutional neural network trained by the apparatus according to any one of claims 28-33, to obtain a key point prediction result for a general object in the image, the key point prediction result comprising key point position prediction information and key point presence prediction information; and a processing module, configured to process the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: a location determining sub-module, configured to determine the location of the general object in the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: a feature extraction sub-module, configured to extract object features of the general object in the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: a posture estimation sub-module, configured to estimate the posture of the general object in the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: an object tracking sub-module, configured to track the general object in the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: an object recognition sub-module, configured to identify the general object in the image according to the key point prediction result of the general object.
- The apparatus according to claim 34, wherein the processing module comprises: an object rendering sub-module, configured to render the general object in the image according to the key point prediction result of the general object.
- An electronic device, characterized by comprising: a processor and a memory; the memory being configured to store at least one executable instruction that causes the processor to perform operations corresponding to the object key point prediction method according to any one of claims 1-7; or the memory being configured to store at least one executable instruction that causes the processor to perform operations corresponding to the object key point prediction network training method according to any one of claims 8-13; or the memory being configured to store at least one executable instruction that causes the processor to perform operations corresponding to the image processing method according to any one of claims 14-20.
- An electronic device, characterized by comprising: a processor and the key point prediction apparatus according to any one of claims 21-27, wherein when the processor runs the key point prediction apparatus, the units in the key point prediction apparatus according to any one of claims 21-27 are executed; or a processor and the key point prediction network training apparatus according to any one of claims 28-33, wherein when the processor runs the key point prediction network training apparatus, the units in the key point prediction network training apparatus according to any one of claims 28-33 are executed; or a processor and the image processing apparatus according to any one of claims 34-40, wherein when the processor runs the image processing apparatus, the units in the image processing apparatus according to any one of claims 34-40 are executed.
- A computer program comprising computer-readable code, characterized in that when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the object key point prediction method according to any one of claims 1-7; or, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the object key point prediction network training method according to any one of claims 8-13; or, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the image processing method according to any one of claims 14-20.
- A computer-readable storage medium configured to store computer-readable instructions, characterized in that, when executed, the instructions implement the operations of the steps of the object key point prediction method according to any one of claims 1-7, or the operations of the steps of the object key point prediction network training method according to any one of claims 8-13, or the operations of the steps of the image processing method according to any one of claims 14-20.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611261431.9A CN108229489B (zh) | 2016-12-30 | 2016-12-30 | 关键点预测、网络训练、图像处理方法、装置及电子设备 |
CN201611261431.9 | 2016-12-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018121737A1 true WO2018121737A1 (zh) | 2018-07-05 |
Family
ID=62657284
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/119877 WO2018121737A1 (zh) | 2016-12-30 | 2017-12-29 | 关键点预测、网络训练及图像处理方法和装置、电子设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108229489B (zh) |
WO (1) | WO2018121737A1 (zh) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109448007B (zh) * | 2018-11-02 | 2020-10-09 | 北京迈格威科技有限公司 | 图像处理方法、图像处理装置及存储介质 |
CN109410253B (zh) * | 2018-11-06 | 2019-11-26 | 北京字节跳动网络技术有限公司 | 用于生成信息的方法、装置、电子设备和计算机可读介质 |
CN109697446B (zh) * | 2018-12-04 | 2021-12-07 | 北京字节跳动网络技术有限公司 | 图像关键点提取方法、装置、可读存储介质及电子设备 |
CN109670591B (zh) * | 2018-12-14 | 2022-09-27 | 深圳市商汤科技有限公司 | 一种神经网络的训练方法及图像匹配方法、装置 |
CN109657615B (zh) * | 2018-12-19 | 2021-11-02 | 腾讯科技(深圳)有限公司 | 一种目标检测的训练方法、装置及终端设备 |
CN109522910B (zh) * | 2018-12-25 | 2020-12-11 | 浙江商汤科技开发有限公司 | 关键点检测方法及装置、电子设备和存储介质 |
CN110478911A (zh) * | 2019-08-13 | 2019-11-22 | 苏州钛智智能科技有限公司 | 基于机器学习的智能游戏车无人驾驶方法及智能车、设备 |
CN110533006B (zh) * | 2019-09-11 | 2022-03-25 | 北京小米智能科技有限公司 | 一种目标跟踪方法、装置及介质 |
CN110909655A (zh) * | 2019-11-18 | 2020-03-24 | 上海眼控科技股份有限公司 | 一种识别视频事件的方法及设备 |
CN111179247A (zh) * | 2019-12-27 | 2020-05-19 | 上海商汤智能科技有限公司 | 三维目标检测方法及其模型的训练方法及相关装置、设备 |
CN111481208B (zh) * | 2020-04-01 | 2023-05-12 | 中南大学湘雅医院 | 一种应用于关节康复的辅助系统、方法及存储介质 |
CN111523422B (zh) * | 2020-04-15 | 2023-10-10 | 北京华捷艾米科技有限公司 | 一种关键点检测模型训练方法、关键点检测方法和装置 |
CN116721412B (zh) * | 2023-04-17 | 2024-05-03 | 之江实验室 | 一种自下而上的基于结构性先验的豆荚关键点检测方法和系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104156715A (zh) * | 2014-09-01 | 2014-11-19 | 杭州朗和科技有限公司 | 一种终端设备、信息采集方法及装置 |
US20150347822A1 (en) * | 2014-05-29 | 2015-12-03 | Beijing Kuangshi Technology Co., Ltd. | Facial Landmark Localization Using Coarse-to-Fine Cascaded Neural Networks |
CN105354565A (zh) * | 2015-12-23 | 2016-02-24 | 北京市商汤科技开发有限公司 | 基于全卷积网络人脸五官定位与判别的方法及系统 |
CN105760836A (zh) * | 2016-02-17 | 2016-07-13 | 厦门美图之家科技有限公司 | 基于深度学习的多角度人脸对齐方法、系统及拍摄终端 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105718879A (zh) * | 2016-01-19 | 2016-06-29 | 华南理工大学 | 基于深度卷积神经网络的自由场景第一视角手指关键点检测方法 |
- 2016-12-30: priority application CN201611261431.9A filed in China (published as CN108229489B, status: Active)
- 2017-12-29: international application PCT/CN2017/119877 filed (WO2018121737A1, Application Filing)
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109472289A (zh) * | 2018-10-09 | 2019-03-15 | 北京陌上花科技有限公司 | 关键点检测方法和设备 |
CN109614867A (zh) * | 2018-11-09 | 2019-04-12 | 北京市商汤科技开发有限公司 | 人体关键点检测方法和装置、电子设备、计算机存储介质 |
CN111191492A (zh) * | 2018-11-15 | 2020-05-22 | 北京三星通信技术研究有限公司 | 信息估计、模型检索和模型对准方法和装置 |
CN111340043A (zh) * | 2018-12-19 | 2020-06-26 | 北京京东尚科信息技术有限公司 | 关键点检测方法、系统、设备及存储介质 |
CN111353325A (zh) * | 2018-12-20 | 2020-06-30 | 北京京东尚科信息技术有限公司 | 关键点检测模型训练方法及装置 |
CN111507334B (zh) * | 2019-01-30 | 2024-03-12 | 中国科学院宁波材料技术与工程研究所 | 一种基于关键点的实例分割方法 |
CN111507334A (zh) * | 2019-01-30 | 2020-08-07 | 中国科学院宁波材料技术与工程研究所 | 一种基于关键点的实例分割方法 |
CN110309706A (zh) * | 2019-05-06 | 2019-10-08 | 深圳市华付信息技术有限公司 | 人脸关键点检测方法、装置、计算机设备及存储介质 |
CN111950723B (zh) * | 2019-05-16 | 2024-05-21 | 武汉Tcl集团工业研究院有限公司 | 神经网络模型训练方法、图像处理方法、装置及终端设备 |
CN111950723A (zh) * | 2019-05-16 | 2020-11-17 | 武汉Tcl集团工业研究院有限公司 | 神经网络模型训练方法、图像处理方法、装置及终端设备 |
CN112115745A (zh) * | 2019-06-21 | 2020-12-22 | 杭州海康威视数字技术股份有限公司 | 一种商品漏扫码行为识别方法、装置及系统 |
CN112150533A (zh) * | 2019-06-28 | 2020-12-29 | 顺丰科技有限公司 | 物体体积计算方法、装置、设备、及存储介质 |
CN110910449A (zh) * | 2019-12-03 | 2020-03-24 | 清华大学 | 识别物体三维位置的方法和系统 |
CN110910449B (zh) * | 2019-12-03 | 2023-10-13 | 清华大学 | 识别物体三维位置的方法和系统 |
CN111814588B (zh) * | 2020-06-18 | 2023-08-01 | 浙江大华技术股份有限公司 | 行为检测方法以及相关设备、装置 |
CN111814588A (zh) * | 2020-06-18 | 2020-10-23 | 浙江大华技术股份有限公司 | 行为检测方法以及相关设备、装置 |
CN111783986A (zh) * | 2020-07-02 | 2020-10-16 | 清华大学 | 网络训练方法及装置、姿态预测方法及装置 |
CN111862189A (zh) * | 2020-07-07 | 2020-10-30 | 北京海益同展信息科技有限公司 | 体尺信息确定方法、装置、电子设备和计算机可读介质 |
CN111862189B (zh) * | 2020-07-07 | 2023-12-05 | 京东科技信息技术有限公司 | 体尺信息确定方法、装置、电子设备和计算机可读介质 |
CN111967949A (zh) * | 2020-09-22 | 2020-11-20 | 武汉博晟安全技术股份有限公司 | 基于Leaky-Conv & Cross安全课程推荐引擎排序算法 |
CN112287855A (zh) * | 2020-11-02 | 2021-01-29 | 东软睿驰汽车技术(沈阳)有限公司 | 基于多任务神经网络的驾驶行为检测方法和装置 |
CN112287855B (zh) * | 2020-11-02 | 2024-05-10 | 东软睿驰汽车技术(沈阳)有限公司 | 基于多任务神经网络的驾驶行为检测方法和装置 |
CN112348035A (zh) * | 2020-11-11 | 2021-02-09 | 东软睿驰汽车技术(沈阳)有限公司 | 车辆关键点检测方法、装置及电子设备 |
CN112348035B (zh) * | 2020-11-11 | 2024-05-24 | 东软睿驰汽车技术(沈阳)有限公司 | 车辆关键点检测方法、装置及电子设备 |
CN113449718A (zh) * | 2021-06-30 | 2021-09-28 | 平安科技(深圳)有限公司 | 关键点定位模型的训练方法、装置和计算机设备 |
Also Published As
Publication number | Publication date |
---|---|
CN108229489B (zh) | 2020-08-11 |
CN108229489A (zh) | 2018-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018121737A1 (zh) | 关键点预测、网络训练及图像处理方法和装置、电子设备 | |
CN108229478B (zh) | 图像语义分割及训练方法和装置、电子设备、存储介质和程序 | |
Sindagi et al. | Prior-based domain adaptive object detection for hazy and rainy conditions | |
CN108460338B (zh) | 人体姿态估计方法和装置、电子设备、存储介质、程序 | |
CN108960036B (zh) | 三维人体姿态预测方法、装置、介质及设备 | |
US10936911B2 (en) | Logo detection | |
US10817714B2 (en) | Method and apparatus for predicting walking behaviors, data processing apparatus, and electronic device | |
CN106557778B (zh) | 通用物体检测方法和装置、数据处理装置和终端设备 | |
WO2018121690A1 (zh) | 对象属性检测、神经网络训练、区域检测方法和装置 | |
WO2018054329A1 (zh) | 物体检测方法和装置、电子设备、计算机程序和存储介质 | |
WO2019091464A1 (zh) | 目标检测方法和装置、训练方法、电子设备和介质 | |
WO2019205604A1 (zh) | 图像处理方法、训练方法、装置、设备、介质及程序 | |
WO2018202089A1 (zh) | 关键点检测方法、装置、存储介质及电子设备 | |
WO2018019126A1 (zh) | 视频类别识别方法和装置、数据处理装置和电子设备 | |
CN108229591B (zh) | 神经网络自适应训练方法和装置、设备、程序和存储介质 | |
CN108229353B (zh) | 人体图像的分类方法和装置、电子设备、存储介质、程序 | |
CN108229418B (zh) | 人体关键点检测方法和装置、电子设备、存储介质和程序 | |
WO2019128979A1 (zh) | 关键帧调度方法和装置、电子设备、程序和介质 | |
CN108491872B (zh) | 目标再识别方法和装置、电子设备、程序和存储介质 | |
CN108229494B (zh) | 网络训练方法、处理方法、装置、存储介质和电子设备 | |
CN118284905A (zh) | 用于3d场景的可泛化语义分割的神经语义场 | |
CN108154153B (zh) | 场景分析方法和系统、电子设备 | |
KR101700030B1 (ko) | 사전 정보를 이용한 영상 물체 탐색 방법 및 이를 수행하는 장치 | |
CN112766284A (zh) | 图像识别方法和装置、存储介质和电子设备 | |
CN109325512A (zh) | 图像分类方法及装置、电子设备、计算机程序及存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17888172; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 17888172; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 17.12.2019) |