CN114612936A - Unsupervised abnormal behavior detection method based on background suppression - Google Patents
Unsupervised abnormal behavior detection method based on background suppression
- Publication number
- CN114612936A (application number CN202210252961.6A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- layer
- convolution
- activation function
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention provides an unsupervised abnormal behavior detection method based on background suppression, which comprises the following steps: (1) acquiring a training sample set and a test sample set; (2) constructing an unsupervised abnormal behavior detection network model; (3) carrying out iterative training on the unsupervised abnormal behavior detection network model H; (4) defining the anomaly score function score of the trained unsupervised abnormal behavior detection network model H*; (5) acquiring the abnormal behavior detection result. The unsupervised abnormal behavior detection network model constructed by the invention overcomes the defect of the prior art, which considers neither the influence of video-frame background features on algorithm perception nor the influence of training-set labeling accuracy on supervised learning, and improves the recognition accuracy of the abnormal behavior detection method.
Description
Technical Field
The invention belongs to the technical field of computer vision, and relates to an abnormal behavior detection method, in particular to an unsupervised road monitoring video abnormal behavior detection method based on background suppression.
Background
Road monitoring is the most convenient and direct way to observe the behavior of passersby. As traffic accidents caused by passersby using sidewalks in violation of traffic regulations increase, an urgent need has arisen to detect abnormal passerby behavior.
In recent years, with the rapid development of deep learning and open data sets, intelligent monitoring equipment has developed correspondingly. Abnormal behavior detection is the most widely applied function of current intelligent monitoring equipment in daily life and provides reliable safety guarantees for people's daily work and life. However, when detecting passersby, current intelligent monitoring equipment with built-in detection algorithms is easily influenced by factors such as ambient light, background targets and background-similar features. In addition, if a supervised abnormal behavior detection algorithm is adopted, the labeling accuracy of the manually annotated data set also affects the algorithm. These factors introduce unavoidable interference, reduce the accuracy of abnormal behavior detection and weaken the robustness of the algorithm. Therefore, detection accuracy and algorithm robustness are important indexes for evaluating the performance of an abnormal behavior detection algorithm.
The patent document "Abnormal behavior detection method based on deep learning" (application No. CN202110611720.1; publication No. CN113361370A), filed by Nanjing Tech University, discloses a deep-learning-based abnormal behavior detection method. The method first obtains RGB images of the actual scene with a camera, then detects pedestrians in the current video frame with the YOLOv5 algorithm and outputs the position, confidence and category of each detection box; a constructed appearance feature network performs cascade matching on targets in adjacent frames to obtain matched tracks; finally, a Kalman prediction method deletes, creates and tracks track results to obtain the final tracks, which are matched against the next frame, and the cycle repeats. The method has two disadvantages: first, it does not consider the influence of video-frame background features on algorithm perception, so background interference degrades the accuracy of the abnormal behavior detection algorithm; second, the YOLOv5 algorithm it adopts is a supervised algorithm, so the labeling accuracy of pedestrians in the manually annotated data set also affects the accuracy of the detection algorithm during training.
The patent document "A violent abnormal behavior detection method based on deep learning" (application No. CN202110224967.8; publication No. CN113191182A), filed by Harbin University of Science and Technology, proposes a violent abnormal behavior detection method. The method first frames the videos in a data set to obtain video frames, then stacks several consecutive frames into a cube, extracts three-dimensional features from the cube with a three-dimensional convolutional neural network, fuses the features, and uses the YOLO algorithm to judge whether the extracted features contain prohibited articles such as knives, guns and sticks. The method has two disadvantages: first, it does not fully consider the interference of similar background features in real-life scenes with the foreground information; second, the YOLO algorithm it adopts is a supervised algorithm, so the labeling accuracy of pedestrians in the manually annotated data set also affects the accuracy of the detection algorithm during training.
Disclosure of Invention
The invention aims to provide an unsupervised abnormal behavior detection method based on background suppression that addresses the above defects of the prior art, solving the technical problem of low detection accuracy caused by neglecting the background information of the video to be detected and by manually dividing the data set.
In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:
(1) acquiring a training sample set and a testing sample set:
(1a) Randomly select M sidewalk monitoring videos and decompose them to obtain a set of M frame sequences S_v = {S_v^1, S_v^2, …, S_v^M}, where S_v^m = {v_1, v_2, …, v_{K_m}} denotes the m-th frame sequence containing K_m frame images, v_k denotes the k-th frame image, M ≥ 200, K_m ≥ 100;
(1b) From each frame sequence S_v^m contained in the frame sequence set S_v, screen out the N_m frame images containing only pedestrian-walking events to form a normal behavior frame sequence, and let the normal behavior frame sequences of all M frame sequences form the training sample set B_train; then let the remaining P_m frame images of S_v^m form an abnormal behavior frame sequence, and let all abnormal behavior frame sequences form the test sample set B_test, where N_m ≥ P_m, P_m = K_m - N_m;
(2) Constructing an unsupervised abnormal behavior detection network model H:
(2a) constructing an unsupervised abnormal behavior detection network model H of a background suppression module, a prediction module and a background suppression constraint module which are connected in sequence, wherein the output end of the background suppression module is also connected with a context memory module; wherein:
the prediction module comprises a space encoder, a convolution long-term and short-term memory module and a decoder which are sequentially connected, wherein the space encoder adopts a feature extraction network comprising a plurality of two-dimensional convolution layers and a plurality of activation function layers; the convolution long-term and short-term memory module adopts a memory convolution neural network comprising a plurality of two-dimensional convolution layers, a plurality of tensor decomposition layers and a plurality of activation function layers; the decoder adopts a transposed convolutional neural network comprising a plurality of two-dimensional transposed convolutional layers and a plurality of activation function layers;
the context memory module comprises a motion matching encoder and a memory module which are connected in sequence, wherein the motion matching encoder adopts a three-dimensional convolutional neural network comprising a plurality of three-dimensional convolutional layers, a plurality of activation function layers, a plurality of three-dimensional maximum pooling layers and 1 three-dimensional average pooling layer;
the output end of the memory module in the context memory module is connected with the input end of the decoder in the prediction module;
(2b) Define the background suppression loss function L_BGS of the background suppression constraint module, the background constraint loss function L_restrain, the least square error L_2 and the least absolute deviation L_1:

L_BGS = ‖Binary(Î_n) - Binary(I_n)‖_1

L_restrain = L_BGS + L_2 + L_1

where ‖·‖_1 denotes the 1-norm, Binary(·) denotes binarization, Î_n denotes the prediction result of I_n, and I_n denotes the n-th frame image of the normal behavior frame sequence;
(3) carrying out iterative training on the unsupervised abnormal behavior detection network model H:
(3a) Let the iteration number be t and the maximum iteration number be T, T ≥ 80; denote the feature extraction network parameters at the t-th iteration by θ_G1_t, the memory convolutional neural network parameters by θ_G2_t, the transposed convolutional neural network parameters by θ_G3_t and the three-dimensional convolutional neural network parameters by θ_G4_t; initialize t = 1;
(3b) Take the training sample set B_train as the input of the unsupervised abnormal behavior detection network model H to obtain the prediction result of each frame sequence at the t-th iteration:
(3b1) The background suppression module suppresses the background information of each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train, obtaining M background-suppressed frame sequences;
(3b2) The spatial encoder in the prediction module performs feature extraction on each frame image of the c-th background-suppressed frame sequence, and the convolutional long short-term memory module decomposes the feature tensor composed of all extracted features to obtain and store the feature information of that sequence, c ∈ [2, M-1];
(3b3) The context memory module performs feature extraction on each frame image of the M-1 normal behavior frame sequences other than the c-th sequence; the features of all frame images preceding the c-th sequence constitute the above information and are stored, and at the same time the features of all subsequent frame images constitute the context information and are stored;
(3b4) The decoder in the prediction module decodes the feature information obtained in step (3b2) together with the above information and the context information obtained in step (3b3), obtaining the prediction result of the c-th frame sequence at the t-th iteration;
(3c) The background suppression constraint module binarizes the prediction result and the normal behavior frame images of the normal behavior frame sequence, obtaining the binary image of the prediction result at iteration t and the binary image of the n-th normal behavior frame image;
(3d) Use the background suppression loss function L_BGS to calculate the background suppression loss value L_BGS of H_t from the two binary images, and use the background constraint loss function L_restrain to calculate the background constraint loss value L_restrain of H_t from L_BGS, L_2 and L_1;
(3e) Use the back-propagation method to calculate the network parameter gradients of H_t from L_restrain, then update the network parameters θ_G1_t, θ_G2_t, θ_G3_t and θ_G4_t by the stochastic gradient descent method using the network parameter gradients of H_t, obtaining this iteration's unsupervised abnormal behavior detection network model H_t;
(3f) Judge whether t ≥ T holds; if so, obtain the trained unsupervised abnormal behavior detection network model H*; otherwise, let t = t + 1 and H = H_t, and return to step (3b);
(4) acquiring an abnormal behavior detection result:
(4a) Take the c-th abnormal behavior frame sequence of the test sample set B_test as the input of the trained unsupervised abnormal behavior detection network model H* and forward-propagate it to obtain its predicted frame images;
(4b) Use the anomaly score function score to compute the anomaly score F from the predicted frame image and the corresponding test frame image, and judge whether F and the preset anomaly score detection threshold I satisfy F ≥ I; if so, abnormal behavior exists in the sequence; otherwise, no abnormal behavior exists.
compared with the prior art, the invention has the following advantages:
First, the constructed abnormal behavior detection network model contains a background suppression module and a background suppression constraint module. During model training and detection, the influence of background target feature information on foreground anomaly detection is taken into account: the model first weakens static background information by means of the background suppression module, then suppresses dynamic background information by means of the background suppression constraint module, and finally strengthens the information of the foreground target. This avoids the false detections caused in the prior art by considering only foreground information while neglecting background information, and effectively improves detection accuracy.
Second, the prediction module of the constructed abnormal behavior detection network model connects a spatial encoder, a convolutional long short-term memory module and a decoder in sequence, realizing unsupervised abnormal behavior detection by means of the spatial encoder and decoder. This removes the dependence of supervised learning on the accuracy of manually labeled data sets, giving the invention strong robustness across different data sets.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
Fig. 2 is a schematic structural diagram of an abnormal behavior detection network model constructed by the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
Referring to fig. 1, the present invention includes the steps of:
step 1) obtaining a training sample set and a testing sample set:
(1a) Randomly select M sidewalk monitoring videos and decompose them to obtain a set of M frame sequences S_v = {S_v^1, S_v^2, …, S_v^M}, where S_v^m = {v_1, v_2, …, v_{K_m}} denotes the m-th frame sequence containing K_m frame images, v_k denotes the k-th frame image, M ≥ 200, K_m ≥ 100;
In this example, experiments show that when M = 200, training is fast and the detection effect of the model is good.
(1b) From each frame sequence S_v^m contained in the frame sequence set S_v, screen out the N_m frame images containing only pedestrian-walking events to form a normal behavior frame sequence, and let the normal behavior frame sequences of all M frame sequences form the training sample set B_train; then let the remaining P_m frame images of S_v^m form an abnormal behavior frame sequence, and let all abnormal behavior frame sequences form the test sample set B_test, where N_m ≥ P_m, P_m = K_m - N_m;
In this example, the walking of pedestrians appearing in the sidewalk monitoring videos is defined as normal behavior, while riding a bicycle or a skateboard is defined as abnormal behavior.
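As a minimal illustration of steps (1a) and (1b), the following Python sketch decomposes a monitoring video into a frame sequence with OpenCV and splits it into normal and abnormal subsets; the per-frame normality flags (is_normal) are an assumed input derived from dataset annotations, not part of the patented method.

```python
import cv2

def decompose_video(path):
    # Step (1a): decompose one sidewalk monitoring video into its frame sequence.
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()
    return frames

def split_normal_abnormal(frames, is_normal):
    # Step (1b): frames containing only pedestrian walking go to the training
    # set; the remaining frames (riding, skateboarding, ...) go to the test set.
    # is_normal is a per-frame flag assumed to come from dataset annotations.
    train = [f for f, flag in zip(frames, is_normal) if flag]
    test = [f for f, flag in zip(frames, is_normal) if not flag]
    return train, test
```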
Step 2), constructing an unsupervised abnormal behavior detection network model H:
(2a) constructing an unsupervised abnormal behavior detection network model H of a background suppression module, a prediction module and a background suppression constraint module which are connected in sequence, wherein the output end of the background suppression module is also connected with a context memory module; the prediction module comprises a space encoder, a convolution long-term and short-term memory module and a decoder which are sequentially connected, wherein the space encoder adopts a feature extraction network comprising a plurality of two-dimensional convolution layers and a plurality of activation function layers; the convolution long-term and short-term memory module adopts a memory convolution neural network comprising a plurality of two-dimensional convolution layers, a plurality of tensor decomposition layers and a plurality of activation function layers; the decoder adopts a transposed convolutional neural network comprising a plurality of two-dimensional transposed convolutional layers and a plurality of activation function layers; the context memory module comprises a motion matching encoder and a memory module which are connected in sequence, and the output end of the memory module is connected with the input end of a decoder in the video prediction module; the motion matching encoder adopts a three-dimensional convolution neural network comprising a plurality of three-dimensional convolution layers, a plurality of activation function layers, a plurality of three-dimensional maximum pooling layers and a three-dimensional average pooling layer;
The spatial encoder contains 4 two-dimensional convolution layers and 4 activation function layers, with the specific structure: first two-dimensional convolution layer → first activation function layer → second two-dimensional convolution layer → second activation function layer → third two-dimensional convolution layer → third activation function layer → fourth two-dimensional convolution layer → fourth activation function layer; the first two-dimensional convolution layer has 1 input channel, 64 output channels and stride 2; the second has 64 input channels, 64 output channels and stride 1; the third has 64 input channels, 128 output channels and stride 2; the fourth has 128 input channels, 128 output channels and stride 1; the convolution kernels of the 4 two-dimensional convolution layers are all of size 3×3; all 4 activation function layers adopt the ELU function;
Because each frame sequence in this example is obtained by decomposing a video, the frame image feature information within a sequence is strongly correlated. Compared with the prior art, which extracts frame image features with only an ordinary convolutional neural network, this example uses the spatial encoder to extract features from each frame image, so that the extracted feature information retains strong correlation and achieves a better decoding effect in the decoder.
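A minimal PyTorch sketch of the spatial encoder exactly as specified above; padding=1 is an assumption, since the text gives kernel sizes and strides but not padding.

```python
import torch.nn as nn

class SpatialEncoder(nn.Sequential):
    # Four 3x3 2-D convolutions with ELU activations, channels 1->64->64->128->128,
    # strides 2, 1, 2, 1, per the specification above.
    def __init__(self):
        super().__init__(
            nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1), nn.ELU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ELU(),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1), nn.ELU(),
        )
```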
The convolutional long short-term memory module contains 2 two-dimensional convolution layers, 2 tensor decomposition layers and 3 activation function layers, with the specific structure: first two-dimensional convolution layer → second two-dimensional convolution layer → first tensor decomposition layer → second tensor decomposition layer → first activation function layer → second activation function layer → third activation function layer; the first and second two-dimensional convolution layers are identical, each with 128 input channels and 128 output channels; all 3 activation function layers adopt the sigmoid function;
The decoder contains 4 two-dimensional transposed convolution layers and 3 activation function layers, with the specific structure: first two-dimensional transposed convolution layer → first activation function layer → second two-dimensional transposed convolution layer → second activation function layer → third two-dimensional transposed convolution layer → third activation function layer → fourth two-dimensional transposed convolution layer; the first two-dimensional transposed convolution layer has 256 input channels, 128 output channels and stride 1; the second has 128 input channels, 64 output channels and stride 2; the third has 64 input channels, 64 output channels and stride 1; the fourth has 64 input channels, 1 output channel and stride 1; the convolution kernels of the 4 transposed convolution layers are all of size 3×3, and all 3 activation function layers adopt the ELU function;
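A matching PyTorch sketch of the decoder; padding and output_padding are assumptions, and the 256-channel input reflects concatenating the 128-channel feature information with the 128-channel memory-module output, as implied by step (3b4).

```python
import torch.nn as nn

class Decoder(nn.Sequential):
    # Four 3x3 2-D transposed convolutions, channels 256->128->64->64->1,
    # strides 1, 2, 1, 1, with ELU after the first three, per the specification.
    def __init__(self):
        super().__init__(
            nn.ConvTranspose2d(256, 128, kernel_size=3, stride=1, padding=1), nn.ELU(),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1,
                               output_padding=1), nn.ELU(),
            nn.ConvTranspose2d(64, 64, kernel_size=3, stride=1, padding=1), nn.ELU(),
            nn.ConvTranspose2d(64, 1, kernel_size=3, stride=1, padding=1),
        )
```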
The motion matching encoder contains 6 three-dimensional convolution layers, 6 activation function layers, 4 three-dimensional maximum pooling layers and 1 three-dimensional average pooling layer, with the specific structure: first three-dimensional convolution layer → first activation function layer → first three-dimensional maximum pooling layer → second three-dimensional convolution layer → second activation function layer → second three-dimensional maximum pooling layer → third three-dimensional convolution layer → third activation function layer → fourth three-dimensional convolution layer → fourth activation function layer → third three-dimensional maximum pooling layer → fifth three-dimensional convolution layer → fifth activation function layer → sixth three-dimensional convolution layer → sixth activation function layer → fourth three-dimensional maximum pooling layer → three-dimensional average pooling layer; the first three-dimensional convolution layer has 1 input channel and 64 output channels; the second has 64 input channels and 128 output channels; the third has 128 input channels and 256 output channels; the fourth has 256 input channels and 256 output channels; the fifth has 256 input channels and 512 output channels; the sixth has 512 input channels and 512 output channels; all strides are 1; the convolution kernels of the 6 three-dimensional convolution layers are all of size 3×3×3; the first three-dimensional maximum pooling layer has pooling kernel size 1×2×2 and stride 1×2×2; the second, third and fourth three-dimensional maximum pooling layers all have pooling kernel size 2×2×2 and stride 2×2×2; the three-dimensional average pooling layer has kernel size 1×2×2; all 6 activation function layers adopt the ReLU function;
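A PyTorch sketch of the motion matching encoder with the layer order and channel counts given above; padding=1 and the reconstructed three-dimensional kernel and pooling sizes are assumptions.

```python
import torch.nn as nn

class MotionMatchingEncoder(nn.Sequential):
    # Six 3x3x3 3-D convolutions (1->64->128->256->256->512->512, stride 1) with
    # ReLU, four 3-D max-pooling layers and one 3-D average pooling layer.
    def __init__(self):
        super().__init__(
            nn.Conv3d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2)),
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2),
            nn.Conv3d(128, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2),
            nn.Conv3d(256, 512, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(512, 512, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2),
            nn.AvgPool3d(kernel_size=(1, 2, 2)),
        )
```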
(2b) Define the background suppression loss function L_BGS of the background suppression constraint module, the background constraint loss function L_restrain, the least square error L_2 and the least absolute deviation L_1:

L_BGS = ‖Binary(Î_n) - Binary(I_n)‖_1

L_restrain = L_BGS + L_2 + L_1

where ‖·‖_1 denotes the 1-norm, Binary(·) denotes binarization, Î_n denotes the prediction result of I_n, and I_n denotes the n-th frame image of the normal behavior frame sequence;
In this example, if the background constraint loss function L_restrain used only the least square error L_2 and the background suppression loss function L_BGS to calculate the loss of the unsupervised abnormal behavior detection network model, the pixel-wise similarity between the prediction result and the normal behavior frame image could be guaranteed, but the prediction would tend to blur. To alleviate this, the least absolute deviation L_1 is also added into the background constraint loss function L_restrain when calculating the loss of the unsupervised abnormal behavior detection network model.
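A PyTorch sketch of the combined loss L_restrain = L_BGS + L_2 + L_1; the mean reduction and the hard binarization (which a practical implementation would soften to keep the loss differentiable) are assumptions.

```python
import torch
import torch.nn.functional as F

def background_constraint_loss(pred, target):
    # Binarization as in step (3c): every non-zero pixel value becomes 1.
    pred_bin = (pred != 0).float()
    target_bin = (target != 0).float()
    l_bgs = F.l1_loss(pred_bin, target_bin)  # background suppression loss L_BGS
    l2 = F.mse_loss(pred, target)            # least square error L_2
    l1 = F.l1_loss(pred, target)             # least absolute deviation L_1
    return l_bgs + l2 + l1                   # L_restrain = L_BGS + L_2 + L_1
```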
Step 3) carrying out iterative training on the unsupervised abnormal behavior detection network model H:
(3a) Let the iteration number be t and the maximum iteration number be T, T ≥ 80; denote the feature extraction network parameters at the t-th iteration by θ_G1_t, the memory convolutional neural network parameters by θ_G2_t, the transposed convolutional neural network parameters by θ_G3_t and the three-dimensional convolutional neural network parameters by θ_G4_t; initialize t = 1;
In this example, when the maximum iteration number is T = 100, the trained unsupervised abnormal behavior detection network model achieves the best detection effect;
(3b) Take the training sample set B_train as the input of the unsupervised abnormal behavior detection network model H to obtain the prediction result of each frame sequence at the t-th iteration:
(3b1) The background suppression module suppresses the background information of each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train, and all background-suppressed frame images constitute a frame image sequence; the implementation steps are as follows:
The background suppression module adjusts the illumination of each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train by gamma correction, applies Gaussian filtering to the gamma-corrected frame image to remove its noise points, and then performs Laplacian sharpening on the Gaussian-filtered frame image to suppress the background information, obtaining a background-suppressed frame image.
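A minimal OpenCV sketch of this three-stage suppression on a grayscale frame; the gamma value and kernel sizes are assumptions, since the text does not specify them.

```python
import cv2
import numpy as np

def suppress_background(frame_gray, gamma=0.8):
    # Gamma correction to adjust illumination (gamma value assumed).
    lut = np.array([((i / 255.0) ** gamma) * 255 for i in range(256)],
                   dtype=np.uint8)
    corrected = cv2.LUT(frame_gray, lut)
    # Gaussian filtering to remove noise points (kernel size assumed).
    blurred = cv2.GaussianBlur(corrected, (5, 5), 0)
    # Laplacian sharpening: subtracting the Laplacian strengthens edges,
    # suppressing smooth background regions relative to the foreground.
    lap = cv2.Laplacian(blurred, cv2.CV_16S, ksize=3)
    return cv2.convertScaleAbs(blurred.astype(np.int16) - lap)
```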
(3b2) The spatial encoder in the prediction module performs feature extraction on each frame image of the c-th background-suppressed frame sequence, and the convolutional long short-term memory module decomposes the feature tensor composed of all extracted features to obtain and store the feature information of that sequence, c ∈ [2, M-1]; the process is as follows:
The spatial encoder performs feature extraction on each frame image of the frame sequence through the convolution layers and activation function layers of the feature extraction network and stacks the features to obtain a feature tensor; the convolutional long short-term memory module then decomposes this tensor using its convolution layers, tensor decomposition layers and activation function layers to obtain the feature information.
(3b3) The context memory module performs feature extraction on each frame image of the M-1 normal behavior frame sequences other than the c-th sequence; the features of all frame images preceding the c-th sequence constitute the above information and are stored, and at the same time the features of all subsequent frame images constitute the context information and are stored; the process is as follows:
Each frame image in every frame sequence other than the c-th sequence undergoes feature extraction by means of the three-dimensional convolutional neural network, and the extracted features are encoded; the encoded features of all frame sequences preceding the c-th sequence are stored as the above information, and the encoded features of all subsequent frame sequences are stored as the context information.
(3b4) The decoder in the prediction module decodes the feature information obtained in step (3b2) together with the above information and the context information obtained in step (3b3), obtaining the prediction result of the c-th frame sequence at the t-th iteration; the process is as follows:
The decoder transposes and decodes the tensor formed by the above information, the context information and the feature information of the frame sequence by means of the transposed convolutional neural network, obtaining the prediction result of the frame sequence at the t-th iteration.
The decoder of the prediction module in this example simultaneously decodes the feature information of the frame sequence extracted by the spatial encoder and the feature information extracted from the other frame sequences by the motion matching encoder, so that the prediction results are more diverse and the model is more intelligent.
(3c) The background suppression constraint module binarizes the prediction result and the normal behavior frame images of the normal behavior frame sequence, obtaining the binary image of the prediction result at iteration t and the binary image of the n-th normal behavior frame image. The background suppression constraint module performs the binarization by changing all non-zero pixel values of the frame images to 1.
Because both foreground and background objects move continuously in the video and pixel values change continuously, the pixel values of a region change whenever a moving object passes through it; this pixel-value fluctuation is extracted as a potential feature during feature extraction, which causes false detections.
In this example, the binarization changes all non-zero pixel values of the normal behavior frame image and of the prediction result to 1; the difference frame of the two then eliminates the non-zero pixel values left in regions traversed by moving targets, thereby suppressing dynamic background information and improving detection accuracy.
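A NumPy sketch of the binarization and difference-frame operation described above.

```python
import numpy as np

def binary_difference(pred_frame, gt_frame):
    # Change every non-zero pixel value to 1 so that pixel-value fluctuation
    # caused by moving targets no longer registers as a potential feature.
    pred_bin = (pred_frame != 0).astype(np.int16)
    gt_bin = (gt_frame != 0).astype(np.int16)
    # The difference frame keeps only regions where prediction and ground
    # truth disagree, suppressing dynamic background information.
    return np.abs(pred_bin - gt_bin).astype(np.uint8)
```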
(3d) Use the background suppression loss function L_BGS to calculate the background suppression loss value L_BGS of H_t from the two binary images, and use the background constraint loss function L_restrain to calculate the background constraint loss value L_restrain of H_t from L_BGS, L_2 and L_1;
(3e) Use the back-propagation method to calculate the network parameter gradients of H_t from L_restrain, then update the network parameters θ_G1_t, θ_G2_t, θ_G3_t and θ_G4_t by the stochastic gradient descent method using the network parameter gradients of H_t, obtaining this iteration's unsupervised abnormal behavior detection network model H_t;
The stochastic gradient descent algorithm updates the feature extraction network parameters θ_G1_t, the memory convolutional neural network parameters θ_G2_t, the transposed convolutional neural network parameters θ_G3_t and the three-dimensional convolutional neural network parameters θ_G4_t of H_t using the network parameter gradients of H_t; the update formulas are:

g_t = ∇_θ f_t(θ_{t-1})

m_t = β_1·m_{t-1} + (1 - β_1)·g_t

v_t = β_2·v_{t-1} + (1 - β_2)·g_t²

m̂_t = m_t/(1 - β_1^t), v̂_t = v_t/(1 - β_2^t)

θ_t = θ_{t-1} - α·m̂_t/(√v̂_t + ε)

where g_t is the gradient at iteration number t; θ′_G1_t, θ′_G2_t, θ′_G3_t and θ′_G4_t are respectively the updated feature extraction network parameters, memory convolutional neural network parameters, transposed convolutional neural network parameters and three-dimensional convolutional neural network parameters; {f_ti(θ) | i = 1,2,3,4} is the objective function of parameter θ_Gi_t; β_1 and β_2 are the exponential decay rates of the first and second moments respectively; {m_ti | i = 1,2,3,4} are the first-moment estimates of the network parameter gradients of H_t; {v_ti | i = 1,2,3,4} are the second-moment estimates of the network parameter gradients of H_t; {m̂_ti} are the corrections of {m_ti}; β_i^t is the t-th power of β_i; {v̂_ti} are the corrections of {v_ti}; {α_i | i = 1,2,3,4} are the learning rates; and {ε_i | i = 1,2,3,4} are constants added to maintain numerical stability.
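The update formulas above correspond to the Adam variant of stochastic gradient descent. A NumPy sketch of one update step follows; the hyperparameter defaults are typical values and an assumption here.

```python
import numpy as np

def adam_step(theta, g, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g          # first-moment estimate m_t
    v = beta2 * v + (1 - beta2) * g ** 2     # second-moment estimate v_t
    m_hat = m / (1 - beta1 ** t)             # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)             # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```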
(3f) Judge whether t ≥ T holds; if so, obtain the trained unsupervised abnormal behavior detection network model H*; otherwise, let t = t + 1 and H = H_t, and return to step (3b);
step 4), obtaining an abnormal behavior detection result:
(4a) Take the c-th abnormal behavior frame sequence of the test sample set B_test as the input of the trained unsupervised abnormal behavior detection network model H* and forward-propagate it to obtain its predicted frame images;
(4b) Use the anomaly score function score to compute the anomaly score F from the predicted frame image and the corresponding test frame image, and judge whether F and the preset anomaly score detection threshold I satisfy F ≥ I; if so, abnormal behavior exists in the sequence; otherwise, no abnormal behavior exists.
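A sketch of frame-level scoring and thresholding. The exact score formula is not reproduced on this page, so a PSNR-based score (common in prediction-based anomaly detection) and its normalization are assumptions.

```python
import numpy as np

def anomaly_score(pred, gt, threshold):
    # PSNR between prediction and ground truth: a low PSNR means a poor
    # prediction and therefore likely abnormal behavior.
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    psnr = 10 * np.log10(255.0 ** 2 / (mse + 1e-12))
    score = 1.0 - psnr / 50.0  # hypothetical normalization to roughly [0, 1]
    return score, score >= threshold  # F and the F >= I decision
```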
the effect of the present invention will be further explained with reference to the following experiments:
1. Experimental conditions:
The hardware platform of the experiment of the invention is: two NVIDIA GeForce GTX 2080Ti GPUs.
The software platform of the experiment of the invention is: Ubuntu 16 operating system, PyTorch 1.7 framework, Python 3.8.
The data set used for the experiment was the ShanghaiTech data set, which had a total of 437 videos, each with different lighting conditions and camera angles.
2. Analysis of experimental contents and results thereof:
(1) Evaluation index
The main evaluation index in the field of video monitoring abnormal behavior detection is the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The ROC curve takes the false positive rate as the abscissa and the true positive rate as the ordinate: the false positive rate is the proportion of negative samples predicted as positive, and the true positive rate is the proportion of positive samples predicted as positive. The closer the ROC curve is to the upper left corner, the larger the AUC value and the better the performance of the algorithm model. For the abnormal behavior detection task, AUC values are calculated from image-level anomaly scores.
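A minimal scikit-learn sketch of the frame-level AUC computation, with illustrative labels and scores.

```python
from sklearn.metrics import roc_auc_score

labels = [0, 0, 1, 1, 0, 1]                    # 1 = abnormal frame, 0 = normal
scores = [0.10, 0.22, 0.81, 0.64, 0.15, 0.77]  # per-frame anomaly scores
print(roc_auc_score(labels, scores))           # frame-level AUC
```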
(2) Experimental results and analysis
This experiment verifies the advantage of the proposed method over existing abnormal behavior detection methods in detection accuracy. Each abnormal behavior detection method is trained and tested on the ShanghaiTech data set, and the evaluation index AUC on the data set is obtained.
Table 1. Experimental results of different algorithms on the ShanghaiTech dataset

Method | AUC
---|---
Conv-AE | 60.9%
StackedRNN | 68%
Liu et al. | 72.8%
VEC | 74.8%
HF2-VED | 76.2%
The invention | 76.5%
As can be seen from the experimental results of table 1, the present invention has higher accuracy compared to the prior art.
In conclusion, compared with the prior art, the method achieves higher detection accuracy for abnormal behavior and has important practical significance. While the invention has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.
Claims (4)
1. An unsupervised abnormal behavior detection method based on background suppression is characterized by comprising the following steps:
(1) acquiring a training sample set and a testing sample set:
(1a) Randomly select M sidewalk monitoring videos and decompose them to obtain a set of M frame sequences S_v = {S_v^1, S_v^2, …, S_v^M}, where S_v^m = {v_1, v_2, …, v_{K_m}} denotes the m-th frame sequence containing K_m frame images, v_k denotes the k-th frame image, M ≥ 200, K_m ≥ 100;
(1b) From each frame sequence S_v^m contained in the frame sequence set S_v, screen out the N_m frame images containing only pedestrian-walking events to form a normal behavior frame sequence, and let the normal behavior frame sequences of all M frame sequences form the training sample set B_train; then let the remaining P_m frame images of S_v^m form an abnormal behavior frame sequence, and let all abnormal behavior frame sequences form the test sample set B_test, where N_m ≥ P_m, P_m = K_m - N_m;
(2) Constructing an unsupervised abnormal behavior detection network model H:
(2a) constructing an unsupervised abnormal behavior detection network model H of a background suppression module, a prediction module and a background suppression constraint module which are connected in sequence, wherein the output end of the background suppression module is also connected with a context memory module; wherein:
the prediction module comprises a space encoder, a convolution long-term and short-term memory module and a decoder which are sequentially connected, wherein the space encoder adopts a feature extraction network comprising a plurality of two-dimensional convolution layers and a plurality of activation function layers; the convolution long-term and short-term memory module adopts a memory convolution neural network comprising a plurality of two-dimensional convolution layers, a plurality of tensor decomposition layers and a plurality of activation function layers; the decoder adopts a transposed convolutional neural network comprising a plurality of two-dimensional transposed convolutional layers and a plurality of activation function layers;
the context memory module comprises a motion matching encoder and a memory module which are connected in sequence, wherein the motion matching encoder adopts a three-dimensional convolutional neural network comprising a plurality of three-dimensional convolutional layers, a plurality of activation function layers, a plurality of three-dimensional maximum pooling layers and 1 three-dimensional average pooling layer;
the output end of the memory module in the context memory module is connected with the input end of the decoder in the prediction module;
(2b) Define the background suppression loss function L_BGS of the background suppression constraint module, the background constraint loss function L_restrain, the least square error L_2 and the least absolute deviation L_1:

L_BGS = ‖Binary(Î_n) - Binary(I_n)‖_1

L_restrain = L_BGS + L_2 + L_1

where ‖·‖_1 denotes the 1-norm, Binary(·) denotes binarization, Î_n denotes the prediction result of I_n, and I_n denotes the n-th frame image of the normal behavior frame sequence;
(3) carrying out iterative training on the unsupervised abnormal behavior detection network model H:
(3a) Let the iteration number be t and the maximum iteration number be T, T ≥ 80; denote the feature extraction network parameters at the t-th iteration by θ_G1_t, the memory convolutional neural network parameters by θ_G2_t, the transposed convolutional neural network parameters by θ_G3_t and the three-dimensional convolutional neural network parameters by θ_G4_t; initialize t = 1;
(3b) Take the training sample set B_train as the input of the unsupervised abnormal behavior detection network model H to obtain the prediction result of each frame sequence at the t-th iteration:
(3b1) The background suppression module suppresses the background information of each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train, and all background-suppressed frame images constitute a frame image sequence;
(3b2) The spatial encoder in the prediction module performs feature extraction on each frame image of the c-th background-suppressed frame sequence, and the convolutional long short-term memory module decomposes the feature tensor composed of all extracted features to obtain and store the feature information of that sequence, c ∈ [2, M-1];
(3b3) The context memory module performs feature extraction on each frame image of the M-1 normal behavior frame sequences other than the c-th sequence; the features of all frame images preceding the c-th sequence constitute the above information and are stored, and at the same time the features of all subsequent frame images constitute the context information and are stored;
(3b4) The decoder in the prediction module decodes the feature information obtained in step (3b2) together with the above information and the context information obtained in step (3b3), obtaining the prediction result of the c-th frame sequence at the t-th iteration;
(3c) The background suppression constraint module binarizes the prediction result and the normal behavior frame images of the normal behavior frame sequence, obtaining the binary image of the prediction result at iteration t and the binary image of the n-th normal behavior frame image;
(3d) Use the background suppression loss function L_BGS to calculate the background suppression loss value L_BGS of H_t from the two binary images, and use the background constraint loss function L_restrain to calculate the background constraint loss value L_restrain of H_t from L_BGS, L_2 and L_1;
(3e) Use the back-propagation method to calculate the network parameter gradients of H_t from L_restrain, then update the network parameters θ_G1_t, θ_G2_t, θ_G3_t and θ_G4_t by the stochastic gradient descent method using the network parameter gradients of H_t, obtaining this iteration's unsupervised abnormal behavior detection network model H_t;
(3f) Judge whether t ≥ T holds; if so, obtain the trained unsupervised abnormal behavior detection network model H*; otherwise, let t = t + 1 and H = H_t, and return to step (3b);
(4) acquiring an abnormal behavior detection result:
(4a) Take the c-th abnormal behavior frame sequence of the test sample set B_test as the input of the trained unsupervised abnormal behavior detection network model H* and forward-propagate it to obtain its predicted frame images;
(4b) Use the anomaly score function score to compute the anomaly score F from the predicted frame image and the corresponding test frame image, and judge whether F and the preset anomaly score detection threshold I satisfy F ≥ I; if so, abnormal behavior exists in the sequence; otherwise, no abnormal behavior exists.
2. The background suppression-based unsupervised abnormal behavior detection method according to claim 1, wherein in the unsupervised abnormal behavior detection network model H of step (2a):
The spatial encoder contains 4 two-dimensional convolution layers and 4 activation function layers, with the specific structure: first two-dimensional convolution layer → first activation function layer → second two-dimensional convolution layer → second activation function layer → third two-dimensional convolution layer → third activation function layer → fourth two-dimensional convolution layer → fourth activation function layer; the first two-dimensional convolution layer has 1 input channel, 64 output channels and stride 2; the second has 64 input channels, 64 output channels and stride 1; the third has 64 input channels, 128 output channels and stride 2; the fourth has 128 input channels, 128 output channels and stride 1; the convolution kernels of the 4 two-dimensional convolution layers are all of size 3×3; all 4 activation function layers adopt the ELU function;
The convolutional long short-term memory module contains 2 two-dimensional convolution layers, 2 tensor decomposition layers and 3 activation function layers, with the specific structure: first two-dimensional convolution layer → second two-dimensional convolution layer → first tensor decomposition layer → second tensor decomposition layer → first activation function layer → second activation function layer → third activation function layer; the first and second two-dimensional convolution layers are identical, each with 128 input channels and 128 output channels; all 3 activation function layers adopt the sigmoid function;
The decoder contains 4 two-dimensional transposed convolution layers and 3 activation function layers, with the specific structure: first two-dimensional transposed convolution layer → first activation function layer → second two-dimensional transposed convolution layer → second activation function layer → third two-dimensional transposed convolution layer → third activation function layer → fourth two-dimensional transposed convolution layer; the first two-dimensional transposed convolution layer has 256 input channels, 128 output channels and stride 1; the second has 128 input channels, 64 output channels and stride 2; the third has 64 input channels, 64 output channels and stride 1; the fourth has 64 input channels, 1 output channel and stride 1; the convolution kernels of the 4 transposed convolution layers are all of size 3×3, and all 3 activation function layers adopt the ELU function;
The motion matching encoder contains 6 three-dimensional convolution layers, 6 activation function layers, 4 three-dimensional maximum pooling layers and 1 three-dimensional average pooling layer, with the specific structure: first three-dimensional convolution layer → first activation function layer → first three-dimensional maximum pooling layer → second three-dimensional convolution layer → second activation function layer → second three-dimensional maximum pooling layer → third three-dimensional convolution layer → third activation function layer → fourth three-dimensional convolution layer → fourth activation function layer → third three-dimensional maximum pooling layer → fifth three-dimensional convolution layer → fifth activation function layer → sixth three-dimensional convolution layer → sixth activation function layer → fourth three-dimensional maximum pooling layer → three-dimensional average pooling layer; the first three-dimensional convolution layer has 1 input channel and 64 output channels; the second has 64 input channels and 128 output channels; the third has 128 input channels and 256 output channels; the fourth has 256 input channels and 256 output channels; the fifth has 256 input channels and 512 output channels; the sixth has 512 input channels and 512 output channels; all strides are 1; the convolution kernels of the 6 three-dimensional convolution layers are all of size 3×3×3; the first three-dimensional maximum pooling layer has pooling kernel size 1×2×2 and stride 1×2×2; the second, third and fourth three-dimensional maximum pooling layers all have pooling kernel size 2×2×2 and stride 2×2×2; the three-dimensional average pooling layer has kernel size 1×2×2; all 6 activation function layers adopt the ReLU function.
3. The background suppression-based unsupervised abnormal behavior detection method according to claim 1, wherein the background suppression module in step (3b1) suppresses the background information of each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train through the following steps:
The background suppression module performs gamma correction on each normal behavior frame image in each normal behavior frame sequence of the training sample set B_train, performs Gaussian filtering on the gamma-corrected frame image, and performs Laplacian sharpening on the Gaussian-filtered frame image, obtaining a background-suppressed frame image.
4. The background suppression-based unsupervised abnormal behavior detection method according to claim 1, characterized in that step (3e) updates the network parameters θ_G1_t, θ_G2_t, θ_G3_t and θ_G4_t by the stochastic gradient descent method using the network parameter gradients of H_t; the update formulas are:

g_t = ∇_θ f_t(θ_{t-1})

m_t = β_1·m_{t-1} + (1 - β_1)·g_t

v_t = β_2·v_{t-1} + (1 - β_2)·g_t²

m̂_t = m_t/(1 - β_1^t), v̂_t = v_t/(1 - β_2^t)

θ_t = θ_{t-1} - α·m̂_t/(√v̂_t + ε)

wherein: g_t is the gradient at iteration number t; θ′_G1_t, θ′_G2_t, θ′_G3_t and θ′_G4_t are respectively the updated feature extraction network parameters, memory convolutional neural network parameters, transposed convolutional neural network parameters and three-dimensional convolutional neural network parameters; {f_ti(θ) | i = 1,2,3,4} is the objective function of parameter θ_Gi_t; β_1 and β_2 are the exponential decay rates of the first and second moments respectively; {m_ti | i = 1,2,3,4} are the first-moment estimates of the network parameter gradients of H_t; {v_ti | i = 1,2,3,4} are the second-moment estimates of the network parameter gradients of H_t; {m̂_ti} are the corrections of {m_ti}; β_i^t is the t-th power of β_i; {v̂_ti} are the corrections of {v_ti}; {α_i | i = 1,2,3,4} are the learning rates; and {ε_i | i = 1,2,3,4} are constants added to maintain numerical stability.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210252961.6A CN114612936B (en) | 2022-03-15 | 2022-03-15 | Non-supervision abnormal behavior detection method based on background suppression |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114612936A (en) | 2022-06-10
CN114612936B CN114612936B (en) | 2024-08-23 |
Family
ID=81862820
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210252961.6A Active CN114612936B (en) | 2022-03-15 | 2022-03-15 | Non-supervision abnormal behavior detection method based on background suppression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114612936B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170103264A1 (en) * | 2014-06-24 | 2017-04-13 | Sportlogiq Inc. | System and Method for Visual Event Description and Event Analysis |
CN111832516A (en) * | 2020-07-22 | 2020-10-27 | 西安电子科技大学 | Video behavior identification method based on unsupervised video representation learning |
CN113032778A (en) * | 2021-03-02 | 2021-06-25 | 四川大学 | Semi-supervised network abnormal behavior detection method based on behavior feature coding |
Non-Patent Citations (2)
Title |
---|
ANITHA RAMCHANDRAN et al.: "Unsupervised deep learning system for local anomaly event detection in crowded scenes", Multimedia Tools and Applications, 12 May 2019 *
LI Ding: "Research and application of unsupervised abnormal event detection algorithms for surveillance video" (面向监控视频的无监督异常事件检测算法研究与应用), Wanfang Data, 6 July 2023 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024055948A1 (en) * | 2022-09-14 | 2024-03-21 | 北京数慧时空信息技术有限公司 | Improved unsupervised remote-sensing image abnormality detection method |
Also Published As
Publication number | Publication date |
---|---|
CN114612936B (en) | 2024-08-23 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |