CN108280481A - Joint object classification and 3D pose estimation method based on a residual network - Google Patents
Joint object classification and 3D pose estimation method based on a residual network
- Publication number
- CN108280481A (application CN201810077747.5A)
- Authority
- CN
- China
- Prior art keywords
- network
- loss function
- classification
- posture
- pose
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The present invention proposes a joint object classification and 3D pose estimation method based on a residual network. Its main contents are joint object classification and 3D pose estimation, the loss function, and training. The process is as follows. First, the fourth stage of ResNet-50 is used as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network, so that a single residual-network architecture performs object classification and 3D pose estimation jointly. Next, a new mathematical representation and a new loss function are proposed for 3D pose: the sum of a pose loss function and a classification loss function characterizes the loss between the ground-truth pose, the ground-truth class label, and the proposed network output. Finally, the network is trained on the Pascal3D+ database. By constructing a new loss function on a residual-network architecture, the present invention achieves the goal of joint object classification and 3D pose estimation and reduces the time consumed by the algorithm.
Description
Technical field
The present invention relates to the fields of object classification and pose estimation, and in particular to a joint object classification and 3D pose estimation method based on a residual network.
Background technology
Scene perception is a key problem in computer vision and an important part of modern vision challenges. One way to understand a scene image is to describe the objects inside the scene, which involves object classification and pose estimation. Object classification compares a measured target with training samples of known targets one by one and answers same or different (true or false); pose estimation estimates attributes such as the shape and motion of different targets. Object classification and pose estimation have wide applications in many fields: face recognition, pedestrian detection, pedestrian tracking, and intelligent video analysis in security; traffic-scene object recognition, vehicle counting, wrong-way-driving detection, and license-plate detection and recognition in transportation; and content-based image retrieval and automatic photo-album clustering on the Internet. It can be said that object classification and pose estimation touch every aspect of people's daily lives. With the success of emerging technologies such as deep learning in image classification and 2D object detection, much current work uses convolutional neural networks to handle object classification and pose estimation. However, these works all use the output of a 2D object detection system as the input of the 3D pose estimation system. In effect, existing methods run as a pipeline that classifies the target object, detects its position, and then estimates its 3D pose in sequence, which leads to a series of problems such as excessive time consumption.
The present invention proposes a joint object classification and 3D pose estimation method based on a residual network. First, the fourth stage of ResNet-50 is used as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network, so that object classification and 3D pose estimation are performed jointly within one residual-network architecture. Then, a new mathematical representation and a new loss function are proposed for 3D pose: the sum of a pose loss function and a classification loss function characterizes the loss between the ground-truth pose, the ground-truth class label, and the proposed network output. Finally, the network is trained on the Pascal3D+ database. The invention achieves the goal of joint object classification and 3D pose estimation and reduces the time consumed by the algorithm.
Summary of the invention
To address problems such as excessive time consumption, the present invention provides a joint object classification and 3D pose estimation method. First, the fourth stage of ResNet-50 is used as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network, so that classification and 3D pose estimation are performed jointly within one residual-network architecture. Then, a new mathematical representation and a new loss function are proposed for 3D pose: the sum of a pose loss function and a classification loss function characterizes the loss between the ground-truth pose, the ground-truth class label, and the proposed network output. Finally, the network is trained on the Pascal3D+ database.
Specifically, the main contents of the invention include:
(1) joint object classification and 3D pose estimation;
(2) the loss function;
(3) training.
Wherein, the joint object classification and 3D pose estimation can be applied even when the object class label is unknown, and uses the residual network ResNet-50 as the feature network.
Further, for the classification, the features produced by the feature network serve as its input for estimating the object class label.
Further, using the residual network ResNet-50 as the feature network means that the fourth stage of ResNet-50 serves as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network.
Wherein, for the loss function: when the object class label is unknown, the present invention constructs a new mathematical representation and a new loss function for 3D pose. First, the sum of a pose loss function and a classification loss function characterizes the loss between the ground-truth pose R*, the ground-truth class label c*, and the proposed network output (R, c), i.e.:
L((R*, c*), (R, c)) = L_pose(R*, R) + L_category(c*, c)
wherein the classification loss L_category uses the standard categorical cross-entropy loss function, and the pose loss L_pose depends on the representation chosen for the rotation matrix R.
Further, the rotation matrix R uses the axis-angle representation, i.e. R = expm(θ[v]×), where v is the rotation axis and [v]× denotes the skew-symmetric matrix generated by v = [v1, v2, v3]^T, i.e.:
[v]× = [[0, −v3, v2], [v3, 0, −v1], [−v2, v1, 0]]
and θ is the rotation angle; restricting θ ∈ [0, π) yields a one-to-one correspondence between the rotation matrix R and the axis-angle vector y = θv.
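The axis-angle map above can be checked numerically: for 3D rotations, the matrix exponential expm(θ[v]×) has the closed-form Rodrigues expansion R = I + sin θ [v]× + (1 − cos θ)[v]×² (a standard identity, not stated explicitly in the patent). A minimal NumPy sketch:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]x generated by v = [v1, v2, v3]^T."""
    v1, v2, v3 = v
    return np.array([[0.0, -v3,  v2],
                     [ v3, 0.0, -v1],
                     [-v2,  v1, 0.0]])

def axis_angle_to_R(y):
    """Rotation matrix R = expm(theta * [v]x) via the Rodrigues formula,
    where y = theta * v, theta = |y| in [0, pi), v a unit axis."""
    theta = np.linalg.norm(y)
    if theta < 1e-12:
        return np.eye(3)
    v = y / theta
    K = skew(v)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# A 90-degree rotation about the z-axis maps the x-axis to the y-axis.
R = axis_angle_to_R(np.array([0.0, 0.0, np.pi / 2]))
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 6))  # [0. 1. 0.]
```

Because θ is kept in [0, π), each rotation in this range corresponds to exactly one axis-angle vector y, which is what makes the per-category pose networks able to regress y directly.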
Further, through the correspondence between rotation matrices and axis-angle vectors, the pose loss between two axis-angle vectors y1 and y2 is L_pose(y1, y2) = ‖y1 − y2‖;
this serves as the loss function over the space in which the rotation matrices live.
Further, for the axis-angle vectors, let y_i be the output of the i-th pose network. When the object class is known, the pose output can be selected according to the correct class label, i.e. y = y_{c*};
when the ground-truth class label is unknown, a weighted loss function or a top loss function can be used to estimate the pose output.
Further, for the weighted loss function and the top loss function: assuming the output of the classification network is a probability vector over classes, the estimated pose is y_wgt = Σ_i y_i p(c = i), with loss function L_pose(y*, y_wgt);
and if the predicted class label is instead taken to be the single label with maximum probability, the estimated pose is y_top = y_{argmax_i p(c = i)}, with loss function L_pose(y*, y_top).
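The two label-unknown strategies can be sketched as follows (a minimal sketch; taking the Euclidean norm as the pose distance is an assumption, since the patent's equation images are not reproduced in the text):

```python
import numpy as np

def pose_loss(y1, y2):
    # Distance between two axis-angle vectors (Euclidean norm assumed).
    return np.linalg.norm(y1 - y2)

def weighted_pose(ys, p):
    # y_wgt = sum_i y_i * p(c = i): pose outputs averaged by class probability.
    return (p[:, None] * ys).sum(axis=0)

def top_pose(ys, p):
    # y_top = y_{argmax_i p(c = i)}: pose output of the most probable class.
    return ys[np.argmax(p)]

ys = np.array([[0.1, 0.0, 0.0],   # pose output of class 0
               [0.0, 0.4, 0.0]])  # pose output of class 1
p = np.array([0.25, 0.75])        # classifier output (probability vector)
y_true = np.array([0.0, 0.4, 0.0])

print(pose_loss(y_true, weighted_pose(ys, p)))  # weighted-loss variant
print(pose_loss(y_true, top_pose(ys, p)))       # top-loss variant: 0.0
```

The weighted variant blends all pose hypotheses by the classifier's confidence, while the top variant commits to the single most probable class, trading robustness for sharpness.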
Wherein, the training proceeds in the following steps:
Step 1: fix the feature network, with weights obtained by classification pre-training on ImageNet images;
Step 2: train the classification network and the category-specific pose networks, each independently of the other networks;
Step 3: use the weights obtained in the two steps above to initialize the whole network, then optimize the whole network with the new loss function at a lower learning rate, accomplishing the task of joint object classification and pose estimation.
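The joint objective optimized in Step 3 can be sketched as the sum of the two loss terms (a minimal sketch: the cross-entropy and the Euclidean axis-angle distance follow the description above, while any relative weighting between the two terms is an assumption left open by the patent):

```python
import numpy as np

def total_loss(p, ys, c_true, y_true):
    # L = L_category + L_pose: standard cross-entropy on the class
    # probabilities plus the axis-angle distance of the pose branch
    # selected by the ground-truth class label.
    l_cat = -np.log(p[c_true] + 1e-12)
    l_pose = np.linalg.norm(ys[c_true] - y_true)
    return l_cat + l_pose

p = np.array([0.1, 0.7, 0.2])            # classifier output for one image
ys = np.array([[0.0, 0.0, 0.0],
               [0.2, 0.0, 0.0],
               [0.0, 0.5, 0.0]])         # pose branch outputs
loss = total_loss(p, ys, c_true=1, y_true=np.array([0.2, 0.0, 0.1]))
print(round(float(loss), 4))  # 0.4567
```

Because both terms are differentiable in the network outputs, a single backpropagation pass through this sum trains the classifier and the pose branches jointly, which is the source of the time savings claimed above.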
Description of the drawings
Fig. 1 is the system flow chart of the joint object classification and 3D pose estimation method based on a residual network of the present invention.
Fig. 2 is the network architecture diagram of the joint object classification and 3D pose estimation method based on a residual network of the present invention.
Specific embodiments
It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is the system flow chart of the joint object classification and 3D pose estimation method based on a residual network of the present invention. The method mainly comprises joint object classification and 3D pose estimation, the loss function, and training.
The joint object classification and 3D pose estimation can be applied even when the object class label is unknown, and uses the residual network ResNet-50 as the feature network.
The loss function characterizes the loss between the ground-truth pose R*, the ground-truth class label c*, and the proposed network output (R, c) as the sum of a pose loss function and a classification loss function, i.e.:
L((R*, c*), (R, c)) = L_pose(R*, R) + L_category(c*, c)
where the classification loss L_category uses the standard categorical cross-entropy loss function, and the pose loss L_pose depends on the representation of the rotation matrix R.
The rotation matrix R uses the axis-angle representation, i.e. R = expm(θ[v]×), where v is the rotation axis and [v]× denotes the skew-symmetric matrix generated by v = [v1, v2, v3]^T, i.e.:
[v]× = [[0, −v3, v2], [v3, 0, −v1], [−v2, v1, 0]]
and θ is the rotation angle; restricting θ ∈ [0, π) yields a one-to-one correspondence between the rotation matrix R and the axis-angle vector y = θv.
Through this correspondence, the pose loss between two axis-angle vectors y1 and y2 is L_pose(y1, y2) = ‖y1 − y2‖, which serves as the loss function over the space in which the rotation matrices live.
Let y_i be the output of the i-th pose network. When the object class is known, the pose output can be selected according to the correct class label, i.e. y = y_{c*}. When the ground-truth class label is unknown, a weighted loss function or a top loss function can be used to estimate the pose output.
For the weighted loss function and the top loss function: assuming the output of the classification network is a probability vector over classes, the estimated pose is y_wgt = Σ_i y_i p(c = i), with loss function L_pose(y*, y_wgt); and if the predicted class label is instead taken to be the single label with maximum probability, the estimated pose is y_top = y_{argmax_i p(c = i)}, with loss function L_pose(y*, y_top).
The network is trained in the following steps:
Step 1: fix the feature network, with weights obtained by classification pre-training on ImageNet images;
Step 2: train the classification network and the category-specific pose networks, each independently of the other networks;
Step 3: use the weights obtained in the two steps above to initialize the whole network, then optimize the whole network with the new loss function at a lower learning rate, accomplishing the task of joint object classification and pose estimation.
Fig. 2 is the network architecture diagram of the joint object classification and 3D pose estimation method based on a residual network of the present invention. The features produced by the feature network serve as input for estimating the object class label. The fourth stage of ResNet-50 serves as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network.
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be realized in other specific forms without departing from its spirit or scope. Moreover, those skilled in the art may make various modifications and variations to the present invention without departing from its spirit and scope, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and variations falling within the scope of the invention.
Claims (10)
1. A joint object classification and 3D pose estimation method based on a residual network, characterized in that it mainly comprises joint object classification and 3D pose estimation (1); a loss function (2); and training (3).
2. The joint object classification and 3D pose estimation (1) according to claim 1, characterized in that it can be applied even when the object class label is unknown, and uses the residual network ResNet-50 as the feature network.
3. The classification according to claim 2, characterized in that the features produced by the feature network serve as its input for estimating the object class label.
4. The use of the residual network ResNet-50 as the feature network according to claim 2, characterized in that the fourth stage of ResNet-50 serves as the feature network, the fifth stage of ResNet-50 as the classification network, and a three-layer pose network as the pose network.
5. The loss function (2) according to claim 1, characterized in that the sum of a pose loss function and a classification loss function characterizes the loss between the ground-truth pose R*, the ground-truth class label c*, and the proposed network output (R, c), i.e.:
L((R*, c*), (R, c)) = L_pose(R*, R) + L_category(c*, c)
wherein the classification loss L_category uses the standard categorical cross-entropy loss function, and the pose loss L_pose depends on the representation of the rotation matrix R.
6. The rotation matrix R according to claim 5, characterized in that R uses the axis-angle representation, i.e. R = expm(θ[v]×), where v is the rotation axis and [v]× denotes the skew-symmetric matrix generated by v = [v1, v2, v3]^T, i.e.:
[v]× = [[0, −v3, v2], [v3, 0, −v1], [−v2, v1, 0]]
and θ is the rotation angle; restricting θ ∈ [0, π) yields a one-to-one correspondence between the rotation matrix R and the axis-angle vector y = θv.
7. The correspondence between rotation matrices and axis-angle vectors according to claim 6, characterized in that the pose loss between two axis-angle vectors y1 and y2 is L_pose(y1, y2) = ‖y1 − y2‖;
this serves as the loss function over the space in which the rotation matrices live.
8. The axis-angle vector according to claim 7, characterized in that, letting y_i be the output of the i-th pose network, when the object class is known the pose output can be selected according to the correct class label, i.e. y = y_{c*};
when the ground-truth class label is unknown, a weighted loss function or a top loss function can be used to estimate the pose output.
9. The weighted loss function and top loss function according to claim 8, characterized in that, assuming the output of the classification network is a probability vector, the estimated pose is y_wgt = Σ_i y_i p(c = i), with loss function L_pose(y*, y_wgt);
and if the predicted class label is instead taken to be the single label with maximum probability, the estimated pose is y_top = y_{argmax_i p(c = i)}, with loss function L_pose(y*, y_top).
10. The training (3) according to claim 1, characterized in that the network is trained in the following steps:
Step 1: fix the feature network, with weights obtained by classification pre-training on ImageNet images;
Step 2: train the classification network and the category-specific pose networks, each independently of the other networks;
Step 3: use the weights obtained in the two steps above to initialize the whole network, then optimize the whole network with the new loss function at a lower learning rate, accomplishing the task of joint object classification and pose estimation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810077747.5A CN108280481A (en) | 2018-01-26 | 2018-01-26 | Joint object classification and 3D pose estimation method based on a residual network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810077747.5A CN108280481A (en) | 2018-01-26 | 2018-01-26 | Joint object classification and 3D pose estimation method based on a residual network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108280481A true CN108280481A (en) | 2018-07-13 |
Family
ID=62805107
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810077747.5A Withdrawn CN108280481A (en) | 2018-01-26 | 2018-01-26 | A kind of joint objective classification and 3 d pose method of estimation based on residual error network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108280481A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063607A (en) * | 2018-07-17 | 2018-12-21 | 北京迈格威科技有限公司 | The method and device that loss function for identifying again determines |
CN110263638A (en) * | 2019-05-16 | 2019-09-20 | 山东大学 | A kind of video classification methods based on significant information |
CN110428464A (en) * | 2019-06-24 | 2019-11-08 | 浙江大学 | Multi-class out-of-order workpiece robot based on deep learning grabs position and orientation estimation method |
CN110929242A (en) * | 2019-11-20 | 2020-03-27 | 上海交通大学 | Method and system for carrying out attitude-independent continuous user authentication based on wireless signals |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103177443A (en) * | 2013-03-07 | 2013-06-26 | 中国电子科技集团公司第十四研究所 | SAR (synthetic aperture radar) target attitude angle estimation method based on randomized hough transformations |
CN103345744A (en) * | 2013-06-19 | 2013-10-09 | 北京航空航天大学 | Human body target part automatic analytic method based on multiple images |
- 2018
- 2018-01-26: CN application CN201810077747.5A filed; published as CN108280481A (en); not active, withdrawn
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103177443A (en) * | 2013-03-07 | 2013-06-26 | 中国电子科技集团公司第十四研究所 | SAR (synthetic aperture radar) target attitude angle estimation method based on randomized hough transformations |
CN103345744A (en) * | 2013-06-19 | 2013-10-09 | 北京航空航天大学 | Human body target part automatic analytic method based on multiple images |
Non-Patent Citations (1)
Title |
---|
Siddharth Mahendran et al., "Joint Object Category and 3D Pose Estimation from 2D Images", arXiv.org *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063607A (en) * | 2018-07-17 | 2018-12-21 | 北京迈格威科技有限公司 | The method and device that loss function for identifying again determines |
CN109063607B (en) * | 2018-07-17 | 2022-11-25 | 北京迈格威科技有限公司 | Method and device for determining loss function for re-identification |
CN110263638A (en) * | 2019-05-16 | 2019-09-20 | 山东大学 | A kind of video classification methods based on significant information |
CN110428464A (en) * | 2019-06-24 | 2019-11-08 | 浙江大学 | Multi-class out-of-order workpiece robot based on deep learning grabs position and orientation estimation method |
CN110428464B (en) * | 2019-06-24 | 2022-01-04 | 浙江大学 | Multi-class out-of-order workpiece robot grabbing pose estimation method based on deep learning |
CN110929242A (en) * | 2019-11-20 | 2020-03-27 | 上海交通大学 | Method and system for carrying out attitude-independent continuous user authentication based on wireless signals |
CN110929242B (en) * | 2019-11-20 | 2020-07-10 | 上海交通大学 | Method and system for carrying out attitude-independent continuous user authentication based on wireless signals |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110414432B (en) | Training method of object recognition model, object recognition method and corresponding device | |
CN110569886B (en) | Image classification method for bidirectional channel attention element learning | |
CN108647583B (en) | Face recognition algorithm training method based on multi-target learning | |
Sun et al. | Deep learning face representation by joint identification-verification | |
JP4999101B2 (en) | How to combine boost classifiers for efficient multi-class object detection | |
CN108280481A (en) | Joint object classification and 3D pose estimation method based on a residual network | |
CN110580460A (en) | Pedestrian re-identification method based on combined identification and verification of pedestrian identity and attribute characteristics | |
Cao et al. | Transfer learning for pedestrian detection | |
CN111310583A (en) | Vehicle abnormal behavior identification method based on improved long-term and short-term memory network | |
CN111709311A (en) | Pedestrian re-identification method based on multi-scale convolution feature fusion | |
CN108492298B (en) | Multispectral image change detection method based on generation countermeasure network | |
CN106682696A (en) | Multi-example detection network based on refining of online example classifier and training method thereof | |
Gruhl et al. | A building block for awareness in technical systems: Online novelty detection and reaction with an application in intrusion detection | |
CN109800643A (en) | A kind of personal identification method of living body faces multi-angle | |
CN111079847A (en) | Remote sensing image automatic labeling method based on deep learning | |
CN111209799B (en) | Pedestrian searching method based on partial shared network and cosine interval loss function | |
WO2006105541A2 (en) | Object identification between non-overlapping cameras without direct feature matching | |
CN111582178B (en) | Vehicle weight recognition method and system based on multi-azimuth information and multi-branch neural network | |
CN107491729B (en) | Handwritten digit recognition method based on cosine similarity activated convolutional neural network | |
KR20180038169A (en) | Safety classification method of the city image using deep learning-based data feature | |
JP4221430B2 (en) | Classifier and method thereof | |
WO2021079451A1 (en) | Learning device, learning method, inference device, inference method, and recording medium | |
CN106709442A (en) | Human face recognition method | |
CN113936301B (en) | Target re-identification method based on center point prediction loss function | |
CN115630361A (en) | Attention distillation-based federal learning backdoor defense method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20180713 |