CN109753853A

CN109753853A - Method for simultaneously performing pedestrian detection and pedestrian re-identification

Info

Publication number: CN109753853A
Application number: CN201711076330.9A
Authority: CN
Inventors: 单鼎一; 刘惟锦; 张晓林
Original assignee: China Changfeng Science Technology Industry Group Corp
Current assignee: China Changfeng Science Technology Industry Group Corp
Priority date: 2017-11-06
Filing date: 2017-11-06
Publication date: 2019-05-14

Abstract

The invention provides a method for simultaneously performing pedestrian detection and pedestrian re-identification. Video frames are extracted from preset regions under cameras at different angles, and pedestrian position boxes and the related label information are manually annotated to form the training data. The first 5 convolutional layers of the VGG16 convolutional neural network structure are used as the base network; a local pedestrian candidate network (PPN) is added to generate candidate pedestrian box positions; ROI-pooling is performed according to the PPN output; and three fully connected layers perform feature fusion. The output of the last fully connected layer is used as the feature representation to build a feature dictionary (feature retrieval library), and the deep learning features of the pedestrian regions determined by the detection model are matched by similarity against all pedestrian features in the feature retrieval library. When the similarity of two features meets the preset requirement, the pedestrian in the test picture and the person with the highest similarity in the picture library are determined to be the same person.

Description

Method for simultaneously performing pedestrian detection and pedestrian re-identification

Technical field

The invention belongs to the field of pattern recognition, and more particularly relates to a deep learning method for pedestrian detection and pedestrian re-identification.

Background technique

Pedestrian re-identification is the technology of automatically matching the same pedestrian across pictures taken from different viewpoints by non-overlapping cameras. It focuses on identifying a specific target pedestrian across cameras that share no common field of view. Because of limitations such as video resolution and partial occlusion, it is difficult to find the same target directly from specific cues such as the face. Re-identification therefore relies on information such as the texture of different local regions of the pedestrian, and completes the match in a suitable feature space under a suitable metric. The task first requires pedestrian detection to select high-probability pedestrian boxes; feature extraction and similarity matching are then performed on the candidate boxes, and the search target is finally locked.

Deep learning has become the favored approach for video and image processing. Supported by massive data and high-performance computers, it substantially outperforms traditional machine learning algorithms on tasks such as object recognition, target detection and tracking, and image segmentation. In practice, however, combining a high-accuracy detection algorithm with an efficient tracking algorithm often fails to produce a result better than the sum of its parts. The reason is that the target boxes produced by the detection algorithm are offset in position from the pictures in pedestrian re-identification training sets: detection results are computed by an algorithm in natural scenes, whereas re-identification training pictures are mostly obtained by manual cropping, which causes problems such as a mismatch (asymmetry) between the two kinds of data.

Summary of the invention

In view of the above shortcomings of the prior art, the invention proposes a deep-learning-based method that performs pedestrian detection and pedestrian re-identification simultaneously. Its purpose is to connect the detection and re-identification tasks more seamlessly, overcome problems such as the data asymmetry of the intermediate transition stage, and improve the accuracy of pedestrian re-identification. In addition, features shared by detection and re-identification are designed, which greatly improves time efficiency and meets the real-time requirements of the algorithm.

The technical solution of the invention is as follows:

A method for simultaneously performing pedestrian detection and pedestrian re-identification, characterized in that: pedestrian candidate box prediction, pedestrian detection box regression, and multi-pedestrian classification are learned jointly within a single deep learning network structure; the main network structure is a VGG16 network + a PPN region generation network + fully connected identification layers, and mainly comprises a large number of convolutional layers, pooling layers, and fully connected layers; the method specifically comprises the following steps:

(1) Data set construction: video frames are extracted from preset regions under cameras at different angles, and pedestrian position boxes and the related label information are manually annotated to form the training data; alternatively, similar data is obtained through other channels to build the data set;

(2) Recognition training: the first 5 convolutional layers of the VGG16 convolutional neural network structure are used as the base network; a local pedestrian candidate network (PPN) is added to generate candidate pedestrian box positions; ROI-pooling is performed according to the PPN output; three fully connected layers perform feature fusion; finally, the pedestrian detection box offset error regression function and the multi-pedestrian target classification loss function are used simultaneously to adjust the model parameters;

(3) Feature calculation: the output of the last fully connected layer is used as the feature representation, and this feature is used to build a feature dictionary (feature retrieval library); in the test phase, the deep learning features of the pedestrian regions determined by the detection model are matched by similarity against all pedestrian features in the feature retrieval library;

(4) Similarity confirmation: when the similarity of two features meets the preset requirement, the pedestrian in the test picture and the person with the highest similarity in the picture library are determined to be the same person.

The deep-learning-based method proposed by the invention for simultaneously performing pedestrian detection and pedestrian re-identification uses a convolutional neural network to learn features automatically from large-scale data through multiple layers of nonlinear mapping. In addition, through model fusion and joint learning of the objective functions, the pedestrian detection and pedestrian re-identification tasks share features and weight parameters. This improves efficiency while improving classification and recognition ability, so the re-identification task is completed better. Specifically, the invention has the following advantages:

1. The invention proposes an end-to-end deep learning framework that combines pedestrian retrieval with pedestrian re-identification.

2. The training process is closer to real natural scenes, is robust to interference, and is better suited to re-identification tasks in natural scenes.

3. The detection task and the identification task share weights and features, achieving a seamless connection.

4. The invention can handle the pedestrian detection and pedestrian re-identification problems of a monitoring system efficiently and in real time.

Description of the drawings

Fig. 1 is the structure diagram of the convolutional neural network of the invention;

Fig. 2 is the supervised training flowchart for deep learning pedestrian detection and pedestrian re-identification of the invention;

Fig. 3 is the run-time test flowchart for pedestrian detection and pedestrian re-identification of the invention.

Specific embodiment

Fig. 1 shows the structure of the core deep convolutional neural network of the invention.

The main network structure used by the invention is a VGG16 network + a PPN region generation network + fully connected identification layers, and mainly comprises a large number of convolutional layers, pooling layers, and fully connected layers. The model uses the convolutional network to learn implicit discriminative features from the picture, which avoids the interference introduced by hand-designed traditional features. The regression objective attached to the PPN pedestrian candidate region network helps generate high-likelihood pedestrian boxes. The ROI-pooling layer solves the problem that the feature maps extracted for different pedestrian boxes differ in size; its function is similar to a resize. The pedestrian position regression objective at the end of the network further corrects the pedestrian box positions, and the output of the third fully connected layer, which corresponds to the pedestrian classification objective, serves as the final feature representation.
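The resize-like behavior of ROI-pooling can be illustrated with a minimal sketch in plain Python. It assumes a single-channel feature map stored as a list of rows and a box given in inclusive feature-map cell coordinates; the function name `roi_max_pool`, the 2 x 2 output size, and the bin-boundary rule are illustrative assumptions, not details taken from the patent.

```python
def roi_max_pool(feature_map, box, out_h=2, out_w=2):
    """Max-pool the cells inside box into a fixed out_h x out_w grid.

    feature_map: list of rows (single channel).
    box: (x0, y0, x1, y1), inclusive cell coordinates (assumed convention).
    """
    x0, y0, x1, y1 = box
    roi_h = y1 - y0 + 1
    roi_w = x1 - x0 + 1
    pooled = []
    for i in range(out_h):
        # Bin i spans rows [floor(i*h/out_h), ceil((i+1)*h/out_h)), so it is never empty.
        r_start = y0 + (i * roi_h) // out_h
        r_end = y0 + ((i + 1) * roi_h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            c_start = x0 + (j * roi_w) // out_w
            c_end = x0 + ((j + 1) * roi_w + out_w - 1) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


# Two boxes of different sizes both yield a 2 x 2 output.
fm = [[r * 10 + c for c in range(6)] for r in range(5)]
small = roi_max_pool(fm, (1, 1, 4, 3))
large = roi_max_pool(fm, (0, 0, 5, 4))
```

Whatever the size of the input box, the pooled output has the same shape, which is what lets the fully connected layers that follow accept every candidate pedestrian box.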

The technical solution adopted by the invention is to jointly learn pedestrian candidate box prediction, pedestrian detection box regression, and multi-pedestrian classification within a single deep learning network structure. It specifically comprises the following steps:

Step 1: Data set construction: video frames are extracted from preset regions under cameras at different angles, and pedestrian position boxes and the related label information are manually annotated to form the training data; alternatively, similar data is obtained through other channels to build the data set.

Step 2: Recognition training: the invention uses the first 5 convolutional layers of the VGG16 convolutional neural network structure as the base network and adds a local pedestrian candidate network (PPN) to generate candidate pedestrian box positions. ROI (region of interest) pooling is performed according to the PPN output, and three fully connected layers perform feature fusion. Finally, the pedestrian detection box offset error regression function and the multi-pedestrian target classification loss function are used simultaneously to adjust the model parameters. Given a large amount of training data, and supported by error backpropagation and gradient descent optimization, the whole deep convolutional neural network can be fine-tuned quickly. In the invention, the detection algorithm and re-identification share one feature representation, which reduces model complexity and improves time efficiency, and joint learning strengthens the discriminative power of the features.

Step 3: Feature calculation: the invention uses the output of the last fully connected layer as the feature representation, and this feature can be used to build a feature dictionary (feature retrieval library). In the test phase, the deep learning features of the pedestrian regions determined by the detection model are matched by similarity against all pedestrian features in the feature retrieval library.

Step 4: Similarity confirmation: when the similarity of two features meets the preset requirement, the pedestrian in the test picture and the person with the highest similarity in the picture library are determined to be the same person.
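The joint objective in Step 2 can be sketched for a single candidate box as the sum of an identity classification loss and a box offset regression loss. The patent does not name the exact loss functions, so the softmax cross-entropy, the smooth L1 penalty, and the `box_weight` balancing factor below are assumptions chosen for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """Classification loss over pedestrian identities (assumed form)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def smooth_l1(pred, target):
    """Box offset regression loss (assumed form), summed over coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def joint_loss(id_logits, id_label, box_pred, box_target, box_weight=1.0):
    """Sum of both terms, so one backward pass updates the shared layers."""
    return (softmax_cross_entropy(id_logits, id_label)
            + box_weight * smooth_l1(box_pred, box_target))


loss = joint_loss([0.0, 0.0], 0, [0.5, 2.0], [0.0, 0.0])
```

Because both terms are computed from the same fully connected features, minimizing their sum is one way to make detection and identification share weights as the step describes.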

The implementation of the invention is further described below through an embodiment, with reference to the drawings.

Fig. 2 shows the supervised training flowchart for deep learning pedestrian detection and pedestrian re-identification of the invention, and illustrates how supervised training of the network is carried out:

S201 Data set construction: pedestrian video is captured in natural scenes with interference, under cameras at different angles and in different regions (no cooperation from the pedestrians is needed, and frames may be subsampled). Pedestrian position boxes and label information are manually annotated in the video frames. To save labor, a high-quality tracking algorithm can assist the labeling, followed by manual verification. An accurately annotated, high-resolution database is an important part of any algorithmic task.

S202 Train the PPN pedestrian candidate network module: PPN is a fully convolutional network that can be trained end to end for the task of generating detection proposal boxes. Forward propagation uses the VGG16 convolutional neural network as the base network. For the 9 candidate boxes generated at each position of the feature map, the intersection-over-union (IoU) between the box position and the ground-truth annotation determines whether the candidate box contains a pedestrian, and the errors of the binary classification and of the box position regression are backpropagated for learning.

S203 and S204 Normalize the pedestrian feature maps: after PPN training is complete, the related parameters are fixed and a forward pass through the network produces the feature map of the full image. According to the pedestrian boxes predicted by the PPN, the features of each individual pedestrian are cropped from the full-image feature map, and ROI-pooling normalizes the feature map size.

S205 and S206: the pedestrian features are used to train the three fully connected layers with the position regression loss and the pedestrian classification loss. At a given learning rate, gradient descent computes the weight updates from the partial derivatives obtained by the chain rule, fine-tuning the deep convolutional neural network until the model converges and stabilizes.

S207 Extract the pedestrian representation features: the output of the last fully connected layer for each pedestrian box is taken as its feature representation.
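The candidate box labeling used when training the PPN in S202 can be sketched as follows. The patent states only that 9 candidate boxes are generated per feature-map position and that IoU with the ground-truth annotation decides whether a box contains a pedestrian; the three scales, three aspect ratios, and the 0.7 / 0.3 thresholds below are hypothetical values chosen for illustration.

```python
import math


def make_anchors(cx, cy, scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return 3 x 3 = 9 boxes (x0, y0, x1, y1) centered at (cx, cy).

    ratio is height / width; each box has area scale ** 2 (assumed design).
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """1 = pedestrian, 0 = background, -1 = ignored during training."""
    labels = []
    for a in anchors:
        best = max((iou(a, g) for g in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best <= neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)
    return labels
```

The binary labels feed the two-class classification branch, while the boxes labeled 1 also supply targets for the box position regression.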

Fig. 3 shows the run-time test flowchart for pedestrian detection and pedestrian re-identification of the invention. The detailed process is as follows:

S301 Input the picture of the pedestrian to be searched for: the query picture is fed through the network in a forward pass with the PPN part disabled, and the activations of the last fully connected layer for that pedestrian are extracted directly as the representation feature and saved.

S302 and S303: the camera captures data in real time, and each frame is fed into the network. Based on the box proposals generated by the PPN, the feature map of each pedestrian is obtained, and ROI-pooling normalizes the feature map size.

S304: the features of each pedestrian continue through the forward pass, and the output of the last fully connected layer is extracted as the corresponding feature representation.

S305 Compute similarity: the similarity between the feature of the searched person from S301 and the features of the detected pedestrians is computed. Various measures can be used, such as cosine distance or Euclidean distance; the smaller the distance, the more similar the two features and the more likely they belong to the same pedestrian target.

S306 Determine the result: when the similarity reaches the preset value, the two can be judged to be the same person.
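Steps S305 and S306 can be sketched in plain Python as follows. The cosine and Euclidean distances are the two measures named in S305; the function `find_match`, its `max_distance` parameter, and the assumption that feature vectors are non-zero lists of floats are illustrative and not taken from the patent.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def find_match(query, detections, max_distance, distance=cosine_distance):
    """Return (index, distance) of the closest detected pedestrian,
    or None when even the closest one exceeds max_distance."""
    best = None
    for i, feat in enumerate(detections):
        d = distance(query, feat)
        if best is None or d < best[1]:
            best = (i, d)
    if best is not None and best[1] <= max_distance:
        return best
    return None


query = [1.0, 0.0]                      # feature saved in S301
detected = [[0.0, 1.0], [1.0, 0.1]]     # features extracted in S304
match = find_match(query, detected, max_distance=0.1)
```

Here the preset similarity requirement of S306 is expressed as an upper bound on distance, which is equivalent to a lower bound on similarity; a suitable bound would have to be chosen empirically for the deployed feature space.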

Claims

1. A method for simultaneously performing pedestrian detection and pedestrian re-identification, characterized in that: pedestrian candidate box prediction, pedestrian detection box regression, and multi-pedestrian classification are learned jointly within a single deep learning network structure; the main network structure is a VGG16 network + a PPN region generation network + fully connected identification layers, and mainly comprises a large number of convolutional layers, pooling layers, and fully connected layers; the method specifically comprises the following steps:

(1) Data set construction: video frames are extracted from preset regions under cameras at different angles, and pedestrian position boxes and the related label information are manually annotated to form the training data; alternatively, similar data is obtained through other channels to build the data set;

(2) Recognition training: the first 5 convolutional layers of the VGG16 convolutional neural network structure are used as the base network; a local pedestrian candidate network (PPN) is added to generate candidate pedestrian box positions; ROI-pooling is performed according to the PPN output; three fully connected layers perform feature fusion; finally, the pedestrian detection box offset error regression function and the multi-pedestrian target classification loss function are used simultaneously to adjust the model parameters;

(3) Feature calculation: the output of the last fully connected layer is used as the feature representation, and this feature is used to build a feature dictionary (feature retrieval library); the deep learning features of the pedestrian regions determined by the detection model are matched by similarity against all pedestrian features in the feature retrieval library;

(4) Similarity confirmation: when the similarity of two features meets the preset requirement, the pedestrian in the test picture and the person with the highest similarity in the picture library are determined to be the same person.