Disclosure of Invention
The invention aims to provide an automatic labeling method, and a related device, for moving targets in panoramic video, to solve the problems of the prior art: detection precision is low, false alarms are frequent, and coordinate mapping is performed by calibrating only four points that serve as a single projection plane, which easily introduces large errors.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, the present invention provides an automatic labeling method for moving targets in panoramic video, including:
extracting a frame of image from the panoramic video, and extracting image feature points;
finding, on a map, the map coordinate information corresponding to the image feature points;
completing the mapping of the map coordinate information to the image, and calculating a homography matrix;
performing neural-network-based target detection on the panoramic video image to obtain target coordinates and category information;
and combining the received ADS-B information of the moving target and carrying out data fusion to complete the labeling of the moving target.
Optionally, extracting a frame of image from the panoramic video and extracting image feature points includes:
the panoramic video uses 4K cameras to produce a stitched high-resolution panoramic image, and feature points are extracted from the image by finding distinct points or marking lines on the ground.
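As a minimal illustrative sketch of this step (not the patented implementation), the following code extracts corner-like candidate points from one stitched frame with OpenCV; the file name and detector parameters are assumptions.

```python
# Illustrative sketch only: pick corner-like candidate feature points from one
# stitched panoramic frame. "panorama.jpg" and the detector parameters are
# assumptions, not values from the patent.
import cv2

frame = cv2.imread("panorama.jpg")          # one frame of the stitched 4K panorama
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Shi-Tomasi corners as candidates; an operator would then keep only the
# distinct ground points or marking lines visible on the apron/runway.
corners = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)
if corners is not None:
    for u, v in corners.reshape(-1, 2):
        print(f"candidate image point (u, v) = ({u:.0f}, {v:.0f})")
```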
Optionally, finding on the map the map coordinate information corresponding to the image feature points includes:
finding, on the map, the longitude and latitude coordinates corresponding to those points or marking lines, and calculating the homography matrix between the video image and the map from 4 or more pairs of marked points.
Optionally, completing the mapping of the map coordinate information to the image and calculating the homography matrix includes:
the map is regarded as one projection plane and the video image as the other; the projection relationship between the two planes is described by a 3×3 homography matrix H;
a pair of corresponding points on the map and the video image satisfies the coordinate transformation

X' = HX    (1)

writing the map coordinates (x, y) and the image coordinates (u, v) as three-dimensional homogeneous column vectors X = (x, y, 1)^T and X' = (u, v, 1)^T, equation (1) is written as

$$\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \begin{pmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & h_9 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (2)$$

where λ is a non-zero scale factor that does not change the projective transformation relationship, and h_1 … h_9 are the 9 parameters of the matrix H.

Eliminating λ in equation (2) yields two linear equations in the elements of H:

-h_1 x - h_2 y - h_3 + (h_7 x + h_8 y + h_9) u = 0    (3)
-h_4 x - h_5 y - h_6 + (h_7 x + h_8 y + h_9) v = 0    (4)

Writing equations (3) and (4) in matrix form gives

A_i h = 0    (5)

wherein:

$$A_i = \begin{pmatrix} -x & -y & -1 & 0 & 0 & 0 & ux & uy & u \\ 0 & 0 & 0 & -x & -y & -1 & vx & vy & v \end{pmatrix}, \qquad h = (h_1\ h_2\ \cdots\ h_9)^T$$

Since H is defined only up to scale, the projection matrix is determined by solving for the 8 independent parameters of h, which can be calculated from 4 or more groups of matching points. After the homography matrix is obtained, each time a group of longitude and latitude coordinates is input, the corresponding image coordinates are calculated through equation (1).
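For illustration, a minimal sketch of equations (1)-(5) using OpenCV follows: H is estimated from four assumed map/image point pairs, and a new longitude/latitude is then mapped into the image. All coordinate values are made-up placeholders.

```python
# Hedged sketch: estimate the homography H from >= 4 marked
# (longitude, latitude) -> (u, v) pairs, then map a new map point into the
# image. All point values below are made-up placeholders.
import numpy as np
import cv2

map_pts = np.array([[116.5901, 40.0801], [116.5912, 40.0803],
                    [116.5915, 40.0792], [116.5903, 40.0790]])  # (lon, lat)
img_pts = np.array([[512.0, 830.0], [1540.0, 842.0],
                    [1610.0, 1180.0], [480.0, 1165.0]])         # (u, v) pixels

H, _ = cv2.findHomography(map_pts, img_pts)   # solves A_i h = 0 internally

lonlat = np.array([[[116.5907, 40.0796]]])    # one incoming map coordinate
uv = cv2.perspectiveTransform(lonlat, H)      # X' = HX, then de-homogenize
print("image coordinates (u, v):", uv.ravel())
```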
Optionally, performing neural-network-based target detection on the panoramic video image to obtain target coordinates and category information includes:
the detection flow is divided into two parts, training and detection of the neural network:
the video image data set is preprocessed and fed into the network model for training, finally obtaining the data-trained network model; if a detected picture contains a target, the target coordinates and category information are output.
Optionally, the detection process includes:
preprocessing an originally acquired panoramic video image and inputting it into the data-trained network model to obtain the detection result;
the training uses the YOLOv5 algorithm.
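Since the text names YOLOv5, a brief sketch of the detection step using YOLOv5's published torch.hub loading interface follows; the weights file airport_best.pt stands in for a model trained on airport scenes and is an assumption.

```python
# Sketch of the detection step. "airport_best.pt" and "panorama.jpg" are
# assumed placeholders for the trained weights and an input frame.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="airport_best.pt")
results = model("panorama.jpg")               # inference on one panoramic frame

# Each detection row: x1, y1, x2, y2, confidence, class id (aircraft, vehicle, ...)
for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
    u, v = (x1 + x2) / 2, (y1 + y2) / 2       # target image coordinates (box center)
    print(f"class={int(cls)} conf={conf:.2f} center=({u:.0f}, {v:.0f})")
```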
Optionally, combining the received ADS-B information of the moving target and carrying out data fusion to complete aircraft labeling includes:
after the ADS-B information of the moving target is received, the coordinates of its drop point on the image are calculated through the homography matrix; the target of interest detected by deep learning is then registered against this drop-point position, establishing a one-to-one correspondence between the moving target in the image and the ADS-B information and thereby completing target registration.
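A hedged sketch of this fusion step: the ADS-B position is projected through H and the label is attached to the nearest detection. The function name, detection format, and pixel gate are illustrative assumptions, not the patented implementation.

```python
# Hedged sketch of ADS-B / detection fusion. The helper name, detection
# format, and the 80-pixel gate are assumptions for illustration.
import numpy as np
import cv2

def label_target(H, adsb, detections, max_px=80.0):
    """adsb: dict with 'lon', 'lat', 'flight'; detections: list of (u, v, cls)."""
    if not detections:
        return None, None
    pt = np.array([[[adsb["lon"], adsb["lat"]]]], dtype=np.float64)
    u0, v0 = cv2.perspectiveTransform(pt, H).ravel()      # predicted drop point
    dists = [np.hypot(u - u0, v - v0) for u, v, _ in detections]
    i = int(np.argmin(dists))
    if dists[i] < max_px:                                 # gate against false matches
        return i, adsb["flight"]                          # detection i receives this label
    return None, None                                     # no plausible match
```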
In a second aspect, the present invention provides an automatic labeling system for moving targets in panoramic video, comprising:
the data acquisition module, used for extracting a frame of image from the panoramic video and extracting image feature points, and for finding, on the map, the map coordinate information corresponding to the image feature points;
the mapping module, used for completing the mapping of the map coordinate information to the image and calculating the homography matrix;
the detection module, used for performing neural-network-based target detection on the panoramic video image to obtain target coordinates and category information;
and the fusion module, used for carrying out data fusion in combination with the received ADS-B information of the moving target to complete the labeling of the moving target.
In a third aspect, the present invention provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above automatic labeling method for moving targets in panoramic video when executing the computer program.
In a fourth aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the above automatic labeling method for moving targets in panoramic video.
Compared with the prior art, the invention has the following technical effects:
the invention provides a new method for realizing automatic listing of an airplane in a video by fusing the video and ADS-B data. Firstly, image tracking is carried out on a video to obtain aircraft image coordinates, then 4 or more points and lines are respectively selected on an airport map and the video image to correspondingly calculate a homography matrix between two projection planes, the map coordinates are mapped into the image coordinates, and finally, the image tracking data and ADS-B monitoring data are fused to realize automatic aircraft listing in the video.
The invention adopts deep learning in place of traditional target detection algorithms to accurately locate targets of interest, offering high speed, high detection precision, and convenient deployment.
The invention adopts a multi-region idea instead of calibrating the whole image as a single plane, where the dispersion of the calibration data causes large errors and deviations in the labeled position. After multi-region marking, the received ADS-B information is used to judge which region the target falls in, and the coordinates mapped to the image are then calculated through the homography matrix built from the nearest 4 marked points. This reduces the error introduced by homography mapping; in theory, the more regions that are marked, the more accurate the finally calculated mapping positions.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Referring to fig. 1 to 5, aiming at the problem that airport video surveillance can only provide aircraft image information and cannot extract label information such as flight numbers, the invention provides a new method for automatic aircraft labeling in video by fusing video and ADS-B data. First, image tracking is performed on the video to obtain the aircraft image coordinates; then 4 or more corresponding points and lines are selected on the airport map and on the video image to calculate the homography matrix between the two projection planes, mapping map coordinates into image coordinates; finally, the image tracking data and the ADS-B surveillance data are fused to realize automatic aircraft labeling in the video.
Example 1:
The automatic labeling method for moving targets in panoramic video is characterized by comprising the following steps:
extracting a frame of image from the panoramic video, and extracting image feature points;
finding, on a map, the map coordinate information corresponding to the image feature points;
completing the mapping of the map coordinate information to the image, and calculating a homography matrix;
performing neural-network-based target detection on the panoramic video image to obtain target coordinates and category information;
and combining the received ADS-B information of the moving target and carrying out data fusion to complete the labeling of the moving target.
First, image tracking is performed on the video to obtain the aircraft image coordinates; then 4 or more corresponding points and lines are selected on the airport map and on the video image to calculate the homography matrix between the two projection planes, mapping map coordinates into image coordinates; finally, the image tracking data and the ADS-B surveillance data are fused to realize automatic aircraft labeling in the video.
Example 2:
The specific scheme is as follows:
The first step: extract a frame of image from the panoramic video (normally, the panoramic video uses 4K cameras to produce a stitched high-resolution panorama), extract feature points on the image by finding distinct points or marking lines on the airport ground, find the corresponding points or lines on the airport map to obtain accurate map coordinate information (longitude and latitude), and calculate the homography matrix between the video image and the airport map from 4 or more pairs of marked points. To improve labeling accuracy, at least 8 points are typically marked on the runway to obtain three areas (as shown in fig. 2), with no three of the four points in each area lying on one line.
The second step: calculate the homography matrix to complete the mapping from longitude and latitude coordinates to image coordinates.
If the airport map is regarded as one projection plane and the video image as the other, the projection relationship between these two planes can be described by a 3×3 homography matrix H, as shown in fig. 3:
a pair of corresponding points on the airport map and the video image satisfies the coordinate transformation

X' = HX    (1)

writing the map coordinates (x, y) and the image coordinates (u, v) as three-dimensional homogeneous column vectors X = (x, y, 1)^T and X' = (u, v, 1)^T, equation (1) can be written as

$$\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \begin{pmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & h_9 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (2)$$

Only the ratios of the matrix elements are significant for the homography matrix H: among the 9 elements of H there are 8 independent ratios, so H has 8 degrees of freedom, and multiplying H by the non-zero scale factor λ in this equation does not change the projection relationship.

Eliminating λ in equation (2) yields two linear equations in the elements of H:

-h_1 x - h_2 y - h_3 + (h_7 x + h_8 y + h_9) u = 0    (3)
-h_4 x - h_5 y - h_6 + (h_7 x + h_8 y + h_9) v = 0    (4)

Writing equations (3) and (4) in matrix form gives

A_i h = 0    (5)

wherein:

$$A_i = \begin{pmatrix} -x & -y & -1 & 0 & 0 & 0 & ux & uy & u \\ 0 & 0 & 0 & -x & -y & -1 & vx & vy & v \end{pmatrix}, \qquad h = (h_1\ h_2\ \cdots\ h_9)^T$$

The projection matrix can therefore be determined by solving for the 8 independent parameters of h, and can be calculated from 4 or more groups of matching points, provided that no three of the points lie on one line. After the homography matrix is obtained, each time a group of longitude and latitude coordinates is input, the image coordinates can be calculated through equation (1).
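As a worked illustration of solving equation (5) directly, the sketch below stacks two rows of A_i per point pair and takes the right singular vector belonging to the smallest singular value as h (the standard direct linear transform); this is a sketch under those assumptions, not the patented implementation.

```python
# Illustrative DLT solution of A h = 0: two rows per correspondence
# (equations (3) and (4)), smallest-singular-value vector as h.
import numpy as np

def solve_homography(map_pts, img_pts):
    rows = []
    for (x, y), (u, v) in zip(map_pts, img_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])   # equation (3)
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])   # equation (4)
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    return Vt[-1].reshape(3, 3)            # h up to scale, reshaped to H

def map_point(H, lon, lat):
    w = H @ np.array([lon, lat, 1.0])      # equation (1): X' = HX
    return w[0] / w[2], w[1] / w[2]        # de-homogenize to image (u, v)
```

Up to the arbitrary scale factor, the result agrees with cv2.findHomography on the same point pairs.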
In the actual on-site implementation, the airport runway is divided into 3 areas. Each time ADS-B information of an aircraft is received, the area in which it falls is judged first, the mapping matrix calculated from the marked points of that area is used, and the image coordinates of the point are finally obtained through equation (1), which improves the mapping accuracy.
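A hedged sketch of this per-region lookup follows; the region data structure and the point-in-polygon test via matplotlib are assumptions made for illustration.

```python
# Hedged sketch of the multi-region mapping: each runway region keeps its own
# homography; an incoming ADS-B position is tested against the region polygons
# (in map coordinates) and mapped with the local matrix.
import numpy as np
from matplotlib.path import Path

def map_adsb(regions, lon, lat):
    """regions: list of dicts {"poly": Path of map-coordinate corners, "H": 3x3}."""
    for region in regions:
        if region["poly"].contains_point((lon, lat)):
            w = region["H"] @ np.array([lon, lat, 1.0])
            return w[0] / w[2], w[1] / w[2]   # image (u, v) via the local H
    return None                               # outside all calibrated areas
```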
The third step: target detection of aircraft, vehicles, etc. based on deep learning.
The detection system flow is divided into two parts, training and detection of the neural network. Fig. 4 shows the overall flow of the algorithm.
The upper part is the training process of the network: the data set is preprocessed and fed into the network model for training, finally obtaining the data-trained network model. The lower part is video detection: if a detected picture contains a target, the target coordinates and category information are output.
Aircraft take-off and landing videos collected on the airport site serve as the raw material; they are segmented, screened, and annotated, then trained through a deep learning network (YOLOv5 is selected) to obtain a detection model, which is deployed on site to analyze the real-time video and feed the results back to the front end. The detection result is shown in fig. 5.
The fourth step: data fusion, completing aircraft labeling.
After the ADS-B information of a moving target is received, the area of the calibration map in which it falls is first judged, the drop-point coordinates of the moving target on the image are calculated through the corresponding homography matrix, and the target of interest detected by deep learning is registered against this drop-point position, thereby achieving a one-to-one correspondence between the moving target in the image and the ADS-B information and completing target registration.
In still another embodiment of the present invention, an automatic labeling system for moving targets in panoramic video is provided, which can be used to implement the above automatic labeling method for moving targets in panoramic video. Specifically, the system includes:
the data acquisition module, used for extracting a frame of image from the panoramic video and extracting image feature points, and for finding, on the map, the map coordinate information corresponding to the image feature points;
the mapping module, used for completing the mapping of the map coordinate information to the image and calculating the homography matrix;
the detection module, used for performing neural-network-based target detection on the panoramic video image to obtain target coordinates and category information;
and the fusion module, used for carrying out data fusion in combination with the received ADS-B information of the moving target to complete the labeling of the moving target.
The division of the modules in the embodiments of the present invention is schematic and represents only one logical function division; other division manners are possible in actual implementation. In addition, the functional modules in the embodiments of the present invention may be integrated in one processor, may exist separately and physically, or two or more modules may be integrated in one module. The integrated modules may be implemented in hardware or as software functional modules.
In yet another embodiment of the present invention, a computer device is provided that includes a processor and a memory for storing a computer program including program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computational and control core of the terminal, adapted to load and execute one or more instructions within a computer storage medium to implement the corresponding method flow or functions. The processor provided by the embodiment of the invention can be used to run the above automatic labeling method for moving targets in panoramic video.
In yet another embodiment of the present invention, a storage medium, specifically a computer-readable storage medium (Memory), is provided as the memory device in a computer device for storing programs and data. It is understood that the computer-readable storage medium herein may include both a built-in storage medium in the computer device and an extended storage medium supported by the computer device. The computer-readable storage medium provides a storage space storing the operating system of the terminal. One or more instructions, which may be one or more computer programs (including program code), are also stored in the storage space and are adapted to be loaded and executed by the processor. The computer-readable storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory. The one or more instructions stored in the computer-readable storage medium may be loaded and executed by the processor to implement the corresponding steps of the above automatic labeling method for moving targets in panoramic video.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.