
CN109118564A - Three-dimensional point cloud labeling method and device based on fused voxels - Google Patents

Three-dimensional point cloud labeling method and device based on fused voxels

Info

Publication number
CN109118564A
CN109118564A
Authority
CN
China
Prior art keywords
voxel
point cloud
dimensional
characteristic
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810861715.4A
Other languages
Chinese (zh)
Other versions
CN109118564B (en)
Inventor
马燕新
鲁敏
涂兵
郭裕兰
雷印杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
Hunan Visualtouring Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Visualtouring Information Technology Co Ltd
Priority to CN201810861715.4A
Publication of CN109118564A
Application granted
Publication of CN109118564B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the present invention provides a three-dimensional point cloud labeling method and device based on fused voxels. The method includes: performing voxelization on a three-dimensional point cloud data set and extracting voxel features within the voxels based on the processing result to form a first voxel feature matrix; taking the first voxel feature matrix as the input of a three-dimensional convolutional neural network to compute multi-scale voxel features, and concatenating and fusing the multi-scale features to obtain a second voxel feature matrix; expanding the voxel features in the second voxel feature matrix to every point in the three-dimensional point cloud data set by a feature interpolation algorithm to obtain a point cloud feature matrix; and inputting the point cloud feature matrix into a multilayer perceptron to label the attributes of the three-dimensional point cloud. The invention realizes finer point-by-point category recognition and further improves point cloud labeling performance.

Description

Three-dimensional point cloud marking method and device based on fused voxels
Technical Field
The invention relates to the technical field of data marking, in particular to a three-dimensional point cloud marking method and device based on fused voxels.
Background
With the widespread application of three-dimensional sensors (such as LiDAR, Microsoft Kinect, ASUS x) in various fields in recent years (mobile robots, autonomous driving, remote sensing interpretation, virtual reality, augmented reality, battlefield situation awareness and the like), the amount of three-dimensional data is growing remarkably. Three-dimensional point cloud labeling is an important means of processing such data: it identifies the category attribute of each point in the observed scene point cloud and assigns each point a unique category label, such as building, road or car. However, conventional semantic labeling based on voxel convolutional neural networks requires that all points in a voxel share the same semantic label and requires regularized data input, and the obtained result is only a coarse, voxel-level labeling.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for labeling a three-dimensional point cloud based on a fused voxel, so as to solve the above problems.
In one aspect, a preferred embodiment of the present invention provides a method for labeling a three-dimensional point cloud based on fused voxels, where the method includes:
performing voxelization processing on the three-dimensional point cloud data set, and performing voxel characteristic extraction in voxels based on a processing result to form a first voxel characteristic matrix;
taking the first voxel characteristic matrix as the input of a three-dimensional convolution neural network to calculate and obtain multi-scale characteristics of voxels, and performing characteristic series fusion on the multi-scale characteristics to obtain a second voxel characteristic matrix;
expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set based on a characteristic interpolation algorithm to obtain a point cloud characteristic matrix;
and inputting the point cloud characteristic matrix into a multilayer perceptron to realize attribute marking of the three-dimensional point cloud.
In another aspect, a preferred embodiment of the present invention further provides a three-dimensional point cloud labeling apparatus based on fused voxels, including:
the voxel processing and characteristic extraction module is used for carrying out voxel processing on the three-dimensional point cloud data set and carrying out voxel characteristic extraction in voxels based on a processing result to form a first voxel characteristic matrix;
the multi-scale voxel characteristic calculation module is used for calculating the multi-scale characteristic of the voxel by taking the first voxel characteristic matrix as the input of the three-dimensional convolutional neural network, and performing characteristic series fusion on the multi-scale characteristic to obtain a second voxel characteristic matrix;
the characteristic expansion module is used for expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set based on a characteristic interpolation algorithm to obtain a point cloud characteristic matrix;
and the point cloud marking module is used for inputting the point cloud characteristic matrix into the multilayer perceptron to mark the attribute of the three-dimensional point cloud.
Compared with the prior art, the three-dimensional point cloud labeling method and device based on fused voxels provided by the embodiments of the invention construct a multi-scale space on a regularized voxel model using a voxel convolutional neural network to extract multi-scale voxel features, and then expand the voxel features to point features by feature interpolation, thereby realizing finer point-by-point classification recognition and point cloud labeling.
In addition, the invention also utilizes CRF-RNN to optimize the marking result of the marked point cloud data so as to improve the marking precision.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic view of an application scenario of a three-dimensional point cloud labeling apparatus based on a fused voxel according to an embodiment of the present invention.
Fig. 2 is a schematic flow chart of a three-dimensional point cloud labeling method based on fused voxels according to an embodiment of the present invention.
Fig. 3 is a sub-flowchart of step S11 shown in fig. 2.
Fig. 4 is a schematic diagram of a frame structure of a three-dimensional point cloud labeling method based on a voxel convolutional neural network according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a local and global feature fusion module according to an embodiment of the present invention.
Fig. 6 is a schematic view of a voxel feature extraction network structure provided in an embodiment of the present invention.
Fig. 7 is a sub-flowchart of step S12 shown in fig. 2.
Fig. 8 is a schematic diagram of a three-dimensional convolution operation according to an embodiment of the present invention.
Fig. 9 is a two-dimensional schematic diagram of convolution and deconvolution provided by the embodiment of the present invention.
Fig. 10 is a schematic diagram of a multi-scale feature extraction structure based on 3D CNN according to an embodiment of the present invention.
Fig. 11 is a schematic view of point cloud feature interpolation based on voxel features according to an embodiment of the present invention.
Fig. 12 is a schematic diagram of a fully connected CRF connection according to an embodiment of the present invention.
Fig. 13 is a schematic diagram of an implementation structure of CRF based on RNN according to an embodiment of the present invention.
Fig. 14 is a schematic structural diagram of a three-dimensional point cloud labeling network integrating a voxel convolution neural network and CRF backend optimization provided by the embodiment of the present invention.
Fig. 15 is a schematic diagram of a labeling result on the S3DIS data set based on XYZ coordinate information according to an embodiment of the present invention.
Fig. 16 is a schematic diagram of a labeling result on the vKITTI dataset based on XYZ coordinate information according to an embodiment of the present invention.
Fig. 17 is a class confusion matrix for Point-VoxelNet on the two datasets according to an embodiment of the present invention.
Fig. 18 is a class confusion matrix for PVCRF on the two datasets provided by an embodiment of the present invention.
Fig. 19 is a schematic block structure diagram of a three-dimensional point cloud labeling apparatus based on fused voxels according to an embodiment of the present invention.
Icon: 10-an electronic terminal; 100-a three-dimensional point cloud marking device based on fused voxels; 110-voxel processing and feature extraction module; 111-voxel division unit; 112-point cloud classification unit; 113-a point cloud sampling unit; 120-a multi-scale voxel feature calculation module; 130-feature extension module; 140-point cloud marking module; 200-a memory; 300-a memory controller; 400-processor.
Detailed Description
The inventor's research finds that global context information of a scene point cloud is generally modeled by exploiting the representation capability of graphical models; a common approach is to combine a classifier with a Conditional Random Field (CRF) to estimate the semantic label of each data point. However, the classifier recognition stage and the CRF optimization stage usually operate independently as separate modules without interaction, which limits the information exchange between them.
Among them, for the classifier, a three-dimensional voxel convolution neural network is a better choice. The three-dimensional voxel convolution neural network is expanded from a two-dimensional convolution neural network, has good performance in a three-dimensional target classification and identification task, and has the advantages of clear network structure, easy acceleration and the like compared with a point cloud-based deep neural network. However, the voxel neural network requires regularized data input and its labeling results are also rough labels at the voxel level.
In order to solve the above problems in the prior art, embodiments of the present invention provide a three-dimensional point cloud labeling method and apparatus based on fused voxels, where a multi-scale space is constructed on a regularized voxel model based on a voxel convolution neural network to extract multi-scale voxel features, and then the voxel features are expanded to point features by means of feature interpolation, so as to implement finer classification and identification point by point, and further improve the labeling performance. In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Fig. 1 is a schematic view of an application scenario of a three-dimensional point cloud labeling apparatus 100 based on fused voxels according to an embodiment of the present invention. The electronic terminal 10 includes the three-dimensional point cloud labeling apparatus 100 based on fused voxels, a memory 200, a storage controller 300 and a processor 400. The electronic terminal 10 may be, but is not limited to, an electronic device having a processing function, such as a computer or a Mobile Internet Device (MID), or may also be a server or the like.
Optionally, the memory 200, the storage controller 300 and the processor 400 are electrically connected to one another, directly or indirectly, to realize data transmission or interaction. For example, these components may be electrically connected through one or more communication buses or signal lines. The fused-voxel-based three-dimensional point cloud labeling apparatus 100 includes at least one software functional module that may be stored in the memory 200 in the form of software or firmware or solidified in the operating system of the electronic terminal 10. The processor 400 accesses the memory 200 under the control of the storage controller 300 to execute the executable modules stored in the memory 200, such as the software functional modules and computer programs included in the fused-voxel-based three-dimensional point cloud labeling apparatus 100.
It will be appreciated that the configuration shown in FIG. 1 is merely illustrative and that the electronic terminal 10 may include more or fewer components than shown in FIG. 1 or may have a different configuration than shown in FIG. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Further, referring to fig. 2, an embodiment of the present invention further provides a three-dimensional point cloud labeling method based on a fused voxel, which is described below with reference to fig. 2.
Step S11, carrying out voxelization processing on the three-dimensional point cloud data set, and carrying out voxel characteristic extraction in voxels based on the processing result to form a first voxel characteristic matrix;
step S12, the first voxel characteristic matrix is used as the input of the three-dimensional convolution neural network to calculate and obtain the multi-scale characteristics of the voxels, and the multi-scale characteristics are subjected to characteristic series fusion to obtain a second voxel characteristic matrix;
step S13, expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set based on a characteristic interpolation algorithm to obtain a point cloud characteristic matrix;
and step S14, inputting the point cloud feature matrix into a multilayer perceptron to realize attribute marking of the three-dimensional point cloud.
In the embodiment, the three-dimensional point cloud is subjected to voxelization processing, the point cloud in the voxels is subjected to feature extraction, a voxel model with the voxel features as elements is input into a three-dimensional convolution neural network to perform multi-scale feature extraction and fusion, and the voxel features are expanded to the point cloud features by using a feature interpolation algorithm, so that the three-dimensional point cloud is marked, and the point cloud marking precision can be effectively improved.
In detail, referring to fig. 3, the process of performing the voxelization process on the point cloud in step S11 can be implemented by the following steps S111 to S113:
step S111, dividing a point cloud coordinate space into a plurality of voxels according to a preset voxel size;
step S112, classifying each point in the three-dimensional point cloud data set into a corresponding voxel according to the grid parameter of the voxel;
step S113, sampling points in each classified voxel so that the number of points in the voxel reaches a first preset value.
In this embodiment, a point cloud voxelization model is introduced in steps S111 to S113 to voxelize the point cloud. Specifically, as shown in fig. 4, point cloud voxelization divides the point cloud coordinate space into a plurality of voxels according to a given voxel size. Assuming that the extents of the input point cloud along the three coordinate axes X, Y and Z are W, H and E respectively, and the size of each voxel is λ_W, λ_H, λ_E, then the size of the voxelized model is W' = W/λ_W, H' = H/λ_H, E' = E/λ_E. In this embodiment, for the convenience of the subsequent convolution operations, W', H' and E' may be taken as integers and powers of 2.
After the point cloud coordinate space is divided into a voxel grid in step S111, each point in the point cloud can be classified according to the grid parameters of each voxel, so that every point is assigned to a voxel. However, because of measurement errors, distance, occlusion and other factors during acquisition, the collected point clouds are often uneven: the point cloud is dense in some regions and sparse in others. In addition, point cloud acquisition amounts to sampling the target surface, so the interior of a target is empty and contains no point cloud data. As a result, after voxelization the point cloud distribution inside each voxel is uneven, as shown in fig. 4, where the voxel at the lower left corner contains no points and the voxel at the upper right corner contains only a few points. Therefore, to facilitate subsequent uniform voxel feature extraction, the same number of points, a first preset value T (determined according to the point cloud resolution and the storage capacity), needs to be sampled from each voxel after point cloud partitioning.
It should be noted that, during sampling, if the number of points contained in a voxel is greater than the first preset value, that number of points is randomly sampled from the current voxel; if the number of points in the voxel is smaller than the first preset value, one or more points are randomly selected from the current voxel and copied so that the number of points reaches the first preset value. For example, assuming the first preset value is T, T points are randomly sampled from voxels containing more than T points, and if a voxel contains fewer than T points, a corresponding number of points are randomly duplicated to obtain a set of T points. After point cloud partitioning and sampling, at most W' × H' × E' voxel sets each containing T points are obtained, and feature learning is then performed on the point cloud data inside the voxels to obtain an effective feature expression of each voxel containing points.
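As an illustration of steps S111 to S113, the following Python sketch partitions a point cloud into a voxel grid and samples a fixed number of points per voxel; the voxel sizes, the preset value T and all variable names are illustrative assumptions rather than values fixed by this embodiment.

import numpy as np

def voxelize_and_sample(points, voxel_size=(0.5, 0.25, 0.2), T=32, seed=0):
    """points: (N, 3) array of XYZ coordinates.
    Returns a dict mapping voxel grid index -> (T, 3) array of sampled points."""
    rng = np.random.default_rng(seed)
    mins = points.min(axis=0)
    idx = np.floor((points - mins) / np.asarray(voxel_size)).astype(np.int64)

    voxels = {}
    for key in map(tuple, np.unique(idx, axis=0)):
        pts = points[np.all(idx == key, axis=1)]
        if len(pts) >= T:                      # too many points: random sampling
            sel = rng.choice(len(pts), T, replace=False)
        else:                                  # too few points: duplicate randomly
            sel = np.concatenate([np.arange(len(pts)),
                                  rng.choice(len(pts), T - len(pts), replace=True)])
        voxels[key] = pts[sel]
    return voxels

cloud = np.random.rand(1000, 3) * [50.0, 16.0, 8.0]   # toy scene
grid = voxelize_and_sample(cloud)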
Further, the step of performing point cloud feature extraction in the voxel based on the processing result in step S11 to form a first voxel feature matrix includes: calculating the center coordinate of the point cloud in each voxel, and performing center normalization processing on the point cloud data in the voxels based on the center coordinate to obtain an initial data matrix; inputting the initial data matrix into an LGAB module to realize point-by-point local feature description, and performing point-by-point pooling operation on a local feature set in a voxel by adopting maximum pooling to obtain a global feature of the voxel and using the global feature as a first voxel feature matrix.
Specifically, in the embodiment of the present invention, local and global feature fusion modules (LGABs), as shown in fig. 5, are stacked to build a feature learning network for voxel feature extraction. As shown in fig. 6, suppose V_x is a non-empty voxel containing T points, i.e. V_x = {p_i, i = 1, 2, 3, …, T}. Center normalization is performed before the point cloud data (initial data matrix) is input into the LGAB module: the center coordinates (c_x, c_y, c_z) of the point cloud within the voxel are first calculated, and the point cloud data is center-normalized using (c_x, c_y, c_z) to obtain the final initial data matrix of the input data. That is, the input to the voxel feature extraction module is an initial data matrix of dimension T × 6.
Further, point-by-point local feature descriptions can be obtained using the stacked LGAB modules, and the global feature of a voxel is obtained by applying a Max-Pooling (MP) operation to the point-wise feature set within the voxel. Fig. 6 gives an example of feature extraction for a non-empty voxel; to reduce the number of parameters, the remaining non-empty voxels may share the same network parameters during feature extraction. In practice, since the LGAB module fuses well both the common information and the difference information of local point cloud neighborhoods, cascading several LGAB modules can effectively extract the point cloud information inside a voxel in this embodiment.
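The per-voxel feature extraction described above can be sketched in Python as follows; a plain shared point-wise MLP stands in for the LGAB module (whose internal structure is not detailed in this passage), and the assumption that the T × 6 initial matrix consists of the point coordinates concatenated with their offsets from the voxel center is an assumption of this sketch, not taken from the patent.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def voxel_feature(points_in_voxel, weights):
    """points_in_voxel: (T, 3); weights: list of (in, out) matrices for the shared MLP."""
    center = points_in_voxel.mean(axis=0)                      # (c_x, c_y, c_z)
    x = np.concatenate([points_in_voxel,
                        points_in_voxel - center], axis=1)      # assumed T x 6 initial matrix
    for W in weights:                                           # shared point-wise MLP
        x = relu(x @ W)
    return x.max(axis=0)                                        # max-pooling -> voxel feature

rng = np.random.default_rng(0)
mlp = [rng.standard_normal((6, 32)) * 0.1, rng.standard_normal((32, 64)) * 0.1]
feat = voxel_feature(rng.random((32, 3)), mlp)                  # 64-dimensional voxel feature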
Further, as shown in fig. 7, the process of using the first voxel feature matrix as an input of the three-dimensional convolutional neural network to calculate the multi-scale features of the voxels in step S12 may be implemented by the following steps.
Step S120, converting the voxel characteristic matrix into a 4-dimensional tensor, and respectively inputting the 4-dimensional tensor into three-dimensional convolution neural networks with convolution kernels of different sizes to calculate and obtain voxel characteristics under different scales;
step S121, respectively inputting the voxel characteristics under different scales into three-dimensional deconvolution neural networks with convolution kernels of different sizes to obtain a plurality of voxel characteristics of different scales, wherein the convolution kernel size of each three-dimensional convolution neural network during convolution is the same as the convolution kernel size of each three-dimensional deconvolution neural network during deconvolution.
In detail, spatial geometric information is important information of a three-dimensional target, and an effective feature description of the target can be extracted by processing the three-dimensional data directly. Drawing on the great success of two-dimensional convolutional neural networks in image processing, the invention extends the two-dimensional convolutional neural network to a three-dimensional convolutional neural network. When three-dimensional data is processed by a three-dimensional convolutional neural network, the data must first be regularized, i.e. voxelized, and the two-dimensional convolution and pooling operations must likewise be extended to the three-dimensional voxelized data. The three-dimensional convolution formula is as follows:
in the formula (1), the reaction mixture is,for the input three-dimensional voxel data,is a three-dimensional convolution kernel template,for which a response is output. Similar to the two-dimensional convolution operation, the three-dimensional convolution operation simply extends the two-dimensional convolution kernel to the three-dimensional convolution kernel, including three dimensions of length, width and height, and accordingly, as shown in fig. 8, the local perception is also converted from a local neighborhood on the two-dimensional plane to a local neighborhood in the three-dimensional space. In practice, due to three-dimensional volumeThe product operation reduces the spatial size of the three-dimensional data, i.e. the spatial size of the three-dimensional feature map is smaller than the size of the input voxel data. However, for three-dimensional data labeling, it is necessary to acquire feature information of each data point, that is, to acquire the feature of each voxel when performing voxel feature extraction, and therefore, it is necessary to reversely map the feature map after convolution operation back to the original input voxel. To solve this problem, in this embodiment, a deconvolution operation (deconvolution) is used to process the obtained three-dimensional voxel feature map to obtain a feature map with the same size as the input voxel data. The core of the deconvolution operation is still the convolution operation, and only the edge 0-complementing operation is performed on the input feature data before the convolution operation so as to ensure the size requirement of the output feature map. For example, a two-dimensional example of convolution and deconvolution operations is shown in fig. 9, where blue marker data is the input data for both operations, gray marker data is the convolution kernel (both are the same size), and green marker data is the response output for both. It will be appreciated that the input to the deconvolution layer is the output of the convolution layer, which has the same dimensions as the input to the convolution layer.
Further, after all non-empty voxels are processed by the above operation, a series of D-dimensional voxel features is obtained. Since each voxel feature corresponds uniquely to a voxel coordinate in the three-dimensional space, the acquired features can be expressed as a 4-dimensional tensor of size W' × H' × E' × D (for empty voxels, a D-dimensional zero vector is used as the feature description). After conversion to this 4-dimensional tensor representation, a three-dimensional convolutional neural network (3D CNN) can be adopted for further feature optimization. Considering that feature extraction at a fixed scale (convolution kernel size) is not sufficient to fully express the local neighborhood structure of a voxel, the invention adopts multi-scale feature extraction and fusion to capture richer local neighborhood information.
As shown in fig. 10, the specific three-dimensional convolutional neural network structure, i.e. the 3D CNN-based multi-scale feature extraction, mainly comprises a three-dimensional convolution operation (Conv3D), a three-dimensional deconvolution operation (DeConv3D) and a feature concatenation operation (Concat). For the input W' × H' × E' × D dimensional features, three different convolution kernels are used to perform convolution and deconvolution, denoted Conv3D(f_in; f_out; ker; st; pad) and DeConv3D(f_in; f_out; ker; st; pad), where f_in and f_out denote the dimensions of the input and output feature matrices, and ker, st and pad denote the convolution kernel size, the stride of the kernel during convolution and the amount of data padding during expansion, all of which are three-dimensional vectors. To obtain features at different scales, the three convolution kernels may be, but are not limited to, (1; 2; 2), (2; 1; 2) and (2; 2; 1); the same kernel is used in the convolution and deconvolution operations, and the corresponding edge zero-padding is performed in the deconvolution. In addition, the convolution and deconvolution layers consist of convolution and deconvolution operations, and each convolution operation is followed by a Batch Normalization layer (BN) and a ReLU activation.
The multi-scale feature extraction based on the 3D CNN adopts different convolution kernels to extract and fuse voxel features along three mutually perpendicular directions (namely X, Y, Z directions) of a three-dimensional space, so that the learned features can contain more local structure information, and the point cloud can be expressed more completely.
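A hedged tf.keras sketch of the multi-scale structure of fig. 10 follows: three parallel Conv3D/DeConv3D branches with the kernel sizes (1; 2; 2), (2; 1; 2) and (2; 2; 1), each followed by BN and ReLU, whose outputs are concatenated. The filter counts, strides, padding and the choice to concatenate only the three branch outputs are assumptions, since the patent text does not fix them here.

import tensorflow as tf
from tensorflow.keras import layers

def branch(x, kernel, filters=64):
    # convolution and deconvolution share the same kernel size, as described above
    y = layers.Conv3D(filters, kernel, strides=kernel, padding="valid")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv3DTranspose(filters, kernel, strides=kernel, padding="valid")(y)
    y = layers.BatchNormalization()(y)
    return layers.ReLU()(y)

inp = layers.Input(shape=(256, 64, 16, 64))            # W' x H' x E' x D (S3DIS sizes as an example)
branches = [branch(inp, k) for k in [(1, 2, 2), (2, 1, 2), (2, 2, 1)]]
out = layers.Concatenate(axis=-1)(branches)            # fused second voxel feature matrix
model = tf.keras.Model(inp, out)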
Further, in step S13, since a feature description of every point is required for labeling the three-dimensional point cloud while the aforementioned three-dimensional convolutional neural network only yields a feature description per voxel, the invention obtains the feature description of each point in the input point cloud by interpolating the voxel features. As shown in fig. 11, for a given target point p, the nearest preset number (e.g. 8) of neighborhood voxels in the voxel space formed by the second voxel feature matrix are found, with corresponding feature descriptions indexed by j = 1, 2, …, 8, and the feature of the target point p is then:
in the formula (2), the reaction mixture is,representing the center point c in the voxel according to the target point p and the jth neighborhoodjThe euclidean distance between them to obtain a weight parameter,and expressing the voxel characteristics of the jth neighborhood voxel, and repeatedly executing the process on each point in the three-dimensional point cloud, namely expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set to obtain the point cloud characteristic matrix.
Further, in step S14, the point cloud feature matrix obtained in step S13 is input into a multilayer perceptron (MLP) to realize the category recognition of each point, i.e. the three-dimensional point cloud labeling; see fig. 4 again for the specific network structure.
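A minimal NumPy sketch of the per-point classification in step S14 follows; the layer widths and the class count (13, as in S3DIS) are assumptions.

import numpy as np

def mlp_classify(point_feats, weights, biases):
    """point_feats: (N, D); weights/biases define the shared per-point MLP."""
    x = point_feats
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)                     # hidden layers with ReLU
    logits = x @ weights[-1] + biases[-1]                  # class scores
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)              # softmax over classes
    return probs.argmax(axis=1), probs                     # per-point labels and probabilities

rng = np.random.default_rng(0)
Ws = [rng.standard_normal((64, 32)) * 0.1, rng.standard_normal((32, 13)) * 0.1]
bs = [np.zeros(32), np.zeros(13)]
labels, probs = mlp_classify(rng.random((1000, 64)), Ws, bs)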
According to the actual requirement, in order to further optimize the point cloud marking result obtained by the convolutional neural network and improve the marking precision, the method further comprises a step S15.
And step S15, inputting the three-dimensional point cloud with the attribute marking completed into a CRF-RNN network for point cloud attribute marking optimization.
Specifically, in this embodiment, the FC-CRF implemented based on the CNN basic operation is embedded in the point cloud labeling network based on the three-dimensional convolutional neural network, so as to implement an end-to-end three-dimensional point cloud fine labeling network that integrates a coarse label and a back-end optimization, and further improve the accuracy of point cloud labeling, especially the smoothness of a target boundary and a contour.
In this embodiment, CRF modeling oriented to three-dimensional point cloud labeling is performed first, then the CRF approximation based on CNN operations is realized, and finally the CRF optimization is fused into the three-dimensional point cloud labeling.
Traditional semantic labeling is modeled as point-by-point classification and recognition; some methods use local features for classification and others use deep neural networks. However, point-by-point classification usually produces obvious, unacceptable labeling errors, for example some points inside a target may be recognized as other categories, because point-by-point classification does not consider the adjacency relations between points and only uses small-scale neighborhood information around the point to be labeled. If the target structure information can be modeled in advance (for example, all targets are continuous, and adjacent points with similar features are labeled as the same class) and the labeling result is optimized and constrained by this model, obvious errors can be effectively eliminated and a high-precision labeling result obtained. Conditional Random Fields (CRFs) are an efficient method for modeling the continuity of an object and its neighborhood structure, and have been widely used in two-dimensional image labeling. A conditional random field is a model that computes the conditional probability distribution of one set of output random variables given another set of random variables; its main characteristic is the assumption that the output variables form a Markov Random Field (MRF).
In detail, the CRF is a discriminative probabilistic undirected graph model; it can model global context information and cross features in data and handles segmentation and labeling of sequential data well. Consider a set of random variables X = {X_1, X_2, …, X_N} and P = {P_1, P_2, …, P_N}, where X_i ∈ L = {l_1, l_2, …, l_M}. For three-dimensional point cloud labeling, P is the input point cloud containing N points, P_j is the observation vector of the j-th point, X is the semantic labeling result of the input point cloud, and X_i is the semantic label of the i-th point, one of the M semantic labels. The corresponding CRF model can then be represented by a Gibbs probability distribution, namely:
in the formula (3), G is a probability undirected graph constructed on a random variable set X,Oare groups in figure G wherein,οeach pair of nodes in the set is adjacent,OGthen the set of all groups in G, Z (P) is a normalization function, Λ (x)οP) is an energy function on the radical, also known as a potential function.
For any labeling result x ∈ L^N, the overall potential function is:
solving based on the maximum posterior probability algorithm to obtain the optimal marking result:
as can be seen from the solving process of the optimal solution, the maximization of the posterior probability of the marking result is the minimization of the overall potential function. It is easy to see that the conditional random field firstly realizes the modeling of local context information through a group potential function, and then transfers the context information by utilizing a graph structure, thereby realizing the modeling of large-range context information.
For the fully-connected CRF model, each node in graph G is linked to all remaining nodes, as shown in fig. 12, and the corresponding cliques are those containing a single node or a pair of nodes; the overall potential function of x can therefore be expressed as:
where, for convenience of description, the conditioning on P in the conditional posterior probability is omitted in formula (6), i.e. Λ(x) = Λ(x | P); the first term is the unary potential function and the second a pairwise binary potential function, with i, j = 1, 2, …, N. The unary potential function gives the cost of labeling the i-th node in graph G as x_i; it is usually defined by the probability output of some discriminative classifier, so the estimate often contains noise and the segmentation result is often discontinuous at target edges. The pairwise binary potential function gives the cost of simultaneously labeling the i-th and j-th observation points as x_i and x_j; its effect is to keep the labels of adjacent observation points consistent and to reduce inconsistency, which improves the smoothness of the labeling result.
The idea of gaussian weighting is used to define a pairwise binary potential function as:
in the formula (7), the function ψ (x)i,xj) Calculating a function for consistency between different labels, w(m)As a weight value, the weight value,for a smooth filter function based on Gaussian kernels, there is a total of MGA Gaussian kernel function, fi,fjFeature vectors representing observation points i, j, respectively, and havingEach gaussian kernel functionBy a symmetrical, positive definite matrix ΛmTo be defined. To this end, the method is oriented to three-dimensional point cloud marksAnd (4) after the modeling of the full-connection CRF is completed, the next step is how to solve to obtain the optimal marking result.
The fully-connected-CRF-based optimization of three-dimensional point cloud labeling is the process of maximizing the posterior probability Φ(X) given the input point cloud data. Solving the exact posterior probability is difficult and computationally enormous; the mean-field approximation converts the posterior probability Φ(X) into a product of independent marginal probabilities, i.e. Φ(X) ≈ Θ(X) = Π_i Θ_i(X_i). Combining formulas (3) to (7), the marginal probability Θ_i of an arbitrary labeling result x_i is:
an iterative inference algorithm for CRF can be constructed based on equation (8), as shown in algorithm 5.1. The convergence of the iterative algorithm is mainly measured through the difference between the estimated Q and P, the convergence of the iterative algorithm can be obtained through evaluation, the estimation error is very small when the iteration number is 10, and the algorithm is proved to have good convergence.
Algorithm 1: CRF iterative inference algorithm based on the mean-field approximation
1. Initialization: initialize the marginal distributions of all nodes
2. while not converged do
3.   Information transfer (message passing): compute all Gaussian-filtered results
4.   Weighted filtering
5.   Compatibility (consistency) detection
6.   Add the unary potential function
7.   Normalization
8. end while
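A hedged NumPy sketch of a single iteration of Algorithm 1 is given below. It uses brute-force dense Gaussian filtering instead of the permutohedral lattice, so it is only practical for small point counts, and the kernel width, filter weight and Potts compatibility matrix are illustrative assumptions.

import numpy as np

def softmax(u, axis=-1):
    e = np.exp(u - u.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_step(U, feats, w, theta, compat):
    """U: (N, M) unary scores; feats: (N, F) point features;
    w: filter weight; theta: Gaussian bandwidth; compat: (M, M) compatibility matrix."""
    Q = softmax(U)                                            # 1. initialization (softmax of unary)
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * theta ** 2))                        # dense Gaussian kernel on features
    np.fill_diagonal(K, 0.0)                                  # exclude self-connections
    Q_tilde = K @ Q                                           # 3. message passing (Gaussian filtering)
    Q_tilde = w * Q_tilde                                     # 4. weighted filtering
    Q_hat = Q_tilde @ compat                                  # 5. compatibility transform
    return softmax(U - Q_hat)                                 # 6.+7. add unary and normalize

N, M = 200, 4
rng = np.random.default_rng(0)
U = rng.standard_normal((N, M))
feats = rng.random((N, 3))
compat = 1.0 - np.eye(M)                                      # assumed Potts compatibility
Q = mean_field_step(U, feats, w=1.0, theta=0.3, compat=compat)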
The following describes how to use the relevant operations in a CNN to implement the above CRF iterative inference algorithm. The biggest problem in reconstructing the algorithm from CNN operations is whether back-propagation of errors can be realized, i.e. whether the BP algorithm can be adopted for parameter learning and training.
(1) Initialization operations
The initialization operation in algorithm 1 is:
where Z_i = Σ_l exp(U_i(l)) sums over all label values. It can be seen that this operation is equivalent to applying a Softmax-based activation to the scores U_i(l) of all possible labels at each scene point. The Softmax function is a commonly used activation function in CNN networks; it contains no parameters and its error derivatives can be propagated backwards, so this step can also be learned and trained using the Back-Propagation (BP) algorithm.
(2) Information transfer
As shown in Algorithm 1, information transfer (message passing) in the CRF is performed by smoothing Θ_j with M_G Gaussian filters. The kernel functions of the Gaussian filters are obtained from the characteristics of the point cloud, such as the coordinates or the color and intensity of the points, and express the relationships between scene points. In the fully-connected CRF model, each filter has to cover all points in the point cloud, and the data volume and computation are so large that it cannot be implemented directly. Fast Gaussian convolution is therefore realized by the permutohedral-lattice method (full-degree-of-freedom polyhedral lattice), whose computational cost is O(N), N being the number of filtered points, giving higher speed and a better filtering effect than conventional Gaussian convolution. This fast Gaussian convolution comprises four stages: lattice construction, splat mapping, slice mapping and blur.
In backward propagation, the input (error derivative) of this layer is passed through the M_G Gaussian filters in the reverse direction. In the lattice-based Gaussian convolution, this backward pass can be realized by reversing the order of the filters in the blur stage while keeping the lattice construction, splat mapping and slice mapping the same as in the forward pass. The computation of this implementation is still O(N), so the computational load is significantly reduced and the efficiency improved.
(3) Weighted filtering
The next computation is, for each semantic label l, a weighted sum of the aforementioned M_G filter outputs. In point cloud labeling the semantic labels are independent of each other, so the weighted filtering can be implemented by M_G convolution kernels of size 1 × 1, where the input of the convolution is the M_G filtered feature matrices (one per channel) and the output is a feature matrix with l channels. In backward propagation, because the input and output of this operation are known and the convolution kernels are independent of each other, the error derivatives with respect to the kernel parameters and the input data can be computed, so the kernel parameters can be learned with the BP algorithm.
(4) Consistency detection
In compatibility (consistency) detection, a Potts model is used to compute the compatibility of the output results of the different labels from the previous step. The compatibility computation mainly compares whether the labels of two similar observation points are the same: the consistency term is 0 when the semantic labels of the two points are the same, and a penalty term σ is introduced when they are different, computed as follows:
Instead of a fixed penalty term σ, the method adopts penalty values learned from data, because the relevance between different labels differs, so different labels at adjacent points influence the overall labeling result differently. The compatibility detection can therefore also be regarded as a convolution layer whose numbers of input and output channels are both M (the number of labels), whose convolution kernel size is 1 × 1, and whose learned neuron connection weights are the values of the transfer function. Since this step is implemented with the basic convolution operation, it is also back-propagable.
(5) Adding a unary potential function
The unary potential function is combined element-by-element with the output of the compatibility detection step to obtain the complete potential function result. The step of adding the unary potential function contains no parameters, so the error at the output can simply be copied to the input to realize back propagation.
(6) Normalization
Similar to the initialization procedure, the normalization step can also be implemented by an activation operation based on the Softmax function, and the backward propagation thereof is consistent with the backward propagation based on the Softmax function in the CNN. So far, the basic operation in the CNN network has been adopted to implement each step of a single iteration in the algorithm 1, and the solution algorithm of multiple iterations can be implemented by stacking the above steps.
Based on the above description, in this embodiment the CRF model is first approximately modeled with the mean-field approximation, and each step of the mean-field method is then implemented equivalently with basic CNN operations, i.e. a single-iteration mean-field approximation is realized. For the iterative mean-field method, the relevant steps only need to be stacked, i.e. iterative mean-field approximation can be realized with a recurrent CNN structure (RNN), as shown in fig. 13. The input point cloud data is P, the point-wise unary potential function is U = U_i(l), the marginal probability obtained in the previous iteration is H_1, the marginal probability obtained in the current iteration is H_2, and a single mean-field approximation is denoted f_Ω(U, P, H_1), where Ω is its parameter set (including all parameters of the weighted filtering and compatibility detection) and Ω = {w^(m), ψ(l, l')}. H_1 is initialized at the beginning of the iteration as the output of the softmax function with U as input, and in the following iterations equals the output of H_2:
wherein T' is the number of iterations.
After H_1 is obtained, H_2 is estimated based on the mean-field approximation algorithm as:
H_2(t') = f_Ω(U, P, H_1(t')), 0 < t' ≤ T'
For the output Y, the estimation result is output only at the last iteration, i.e. Y = H_2(T').
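The recurrence can be sketched in Python as follows; the single mean-field step is passed in as a function (for example the dense sketch above), and the dummy step used in the usage line is only a placeholder.

import numpy as np

def softmax(u):
    e = np.exp(u - u.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def crf_rnn(U, P, mean_field_step, T=5):
    """U: (N, M) unary potentials; P: point cloud observations; T: number of iterations T'."""
    H1 = softmax(U)                       # initialization at the start of iteration
    for _ in range(T):                    # 0 < t' <= T'
        H2 = mean_field_step(U, P, H1)    # H2(t') = f_Omega(U, P, H1(t'))
        H1 = H2                           # next iteration starts from the previous output
    return H2                             # Y = H2(T')

U = np.random.randn(100, 4)
P = np.random.rand(100, 3)
Y = crf_rnn(U, P, lambda U, P, H: softmax(U + 0.1 * H))   # dummy mean-field step for illustration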
Based on the above analysis, it can be known that the error derivative of the whole network structure (denoted as CRF-RNN) with respect to the parameter is calculable, so that it can be solved by the standard BP algorithm, and thus can be embedded into other neural networks for learning and training.
Further, a three-dimensional point cloud labeling network fusing the three-dimensional voxel convolutional neural network and CRF back-end optimization (Point-VoxelNet + CRF-RNN, PVCRF) is constructed on this basis; its specific structure is shown in fig. 14. For an input point cloud, voxelization is first performed according to the scene size and a fixed number of points is randomly selected in each voxel for subsequent feature extraction. Simple voxel features are then obtained by feature extraction within the voxels based on the LGAB module; after all non-empty voxel features are obtained (the features of empty voxels are zero-padded), multi-scale voxel feature extraction and fusion are performed by the three-dimensional convolutional neural network (Conv3D, DeConv3D). The multi-scale voxel features are then expanded to all points by interpolation to obtain point-by-point features, the point features are fed into a multilayer perceptron to obtain a preliminary labeling result, and finally back-end optimization is performed by the CRF-RNN structure. This network structure realizes the information interaction between the classification-recognition stage and the CRF optimization stage in point cloud labeling and has an obvious effect in improving point cloud labeling accuracy.
Based on the description of the three-dimensional point cloud marking method based on the fused voxels, the inventor also verifies the performance of the three-dimensional point cloud marking method based on the fused voxels, and evaluates the performance of the point cloud marking by evaluation indexes such as an Intersection over Union (IoU) and an Overall Accuracy (OA).
(1) Network implementation and parameter setting
In the point cloud data rasterization and sampling phase, different data sets need to be processed differently.
S3 DIS: for the S3DIS dataset, the maximum size range of the scene along the Z, Y, X axis is E-8 m, H-16 m, and W-50 m, respectively. To cover the entire scene, the size of the entire grid is 8 × 16 × 50, and the size of each voxel is λE=0.5m、λH=0.25m、λWThe voxel model size is E ' 16, H ' 64, W ' 256, with the extra voxels empty. And selecting 32 points in each voxel.
vKITTI: for vKITTI data sets, the maximum size ranges of the scene along the Z, Y, X axis are respectively E-33 m, H-193 m and W-148 m. Here the size of each voxel is set to λE=2m、λH=1.6m、λWThe voxel model size is 1.2m, E ' 16, H ' 128, W ' 128. Similarly, 32 points are selected for each voxel.
For the CRF-RNN network, in order to prevent overfitting, gradient vanishing and the like, the number of iterations is set to T' = 5 in the training phase and T' = 10 in the testing phase. The size of the Gaussian filter is consistent with the size of the point cloud data.
The method adopts a two-step training strategy: the first step trains Point-VoxelNet independently, and the second step fine-tunes the network PVCRF that combines Point-VoxelNet and CRF-RNN. The Point-VoxelNet network is trained with the Adam optimization algorithm with a momentum value of 0.9, an initial learning rate of 0.001 and a training batch size of 16. The PVCRF network is trained with the Adam optimization algorithm with a momentum value of 0.6, an initial learning rate of 0.0001 and a training batch size of 16. During training, an early-stopping strategy is also adopted to obtain the optimal network parameters: the maximum number of training epochs is 100, and training is stopped if the network parameters are not updated for 10 consecutive epochs. In the testing stage, the algorithm is verified with 6-fold cross-validation, and the grouping of the training and test data is shown in table 1.
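A hedged tf.keras sketch of this training configuration is given below; the model and data objects are placeholders, the loss is an assumption, and Keras' Adam exposes beta_1 rather than a single momentum value, so beta_1 is used as the closest analogue. The early-stopping callback monitors validation loss with patience 10, which approximates the rule described above.

import numpy as np
import tensorflow as tf

def compile_and_train(model, x, y, lr=0.001, beta_1=0.9):
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr, beta_1=beta_1),
                  loss="sparse_categorical_crossentropy",   # assumed per-point label loss
                  metrics=["accuracy"])
    stop = tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
    return model.fit(x, y, validation_split=0.1, epochs=100,
                     batch_size=16, callbacks=[stop])

# Two-step strategy (model and data objects are placeholders):
# 1) compile_and_train(point_voxelnet, x_train, y_train, lr=0.001,  beta_1=0.9)
# 2) compile_and_train(pvcrf,          x_train, y_train, lr=0.0001, beta_1=0.6)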
The three-dimensional voxel neural network structure and the CRF-RNN network structure are implemented with the TensorFlow deep learning framework based on the Python language. The experimental hardware environment is: Intel Core i7-6700K CPU, 48 GB memory, GTX 1080Ti graphics card (supporting CUDA 8.0 and cuDNN 5.1).
(2) Analysis of quantitative results
Based on the two datasets, the application effect of the deep neural network models (Point-VoxelNet and PVCRF) in three-dimensional point cloud labeling is compared and analyzed against the current better labeling algorithms PointNet, MS+CU(2), SEGCloud and 3DContextNet. Only the rectangular coordinate information of the point cloud, i.e. the XYZ coordinates, is used for processing.
TABLE 1
TABLE 2
Table 1 and table 2 show the statistical labeling results of the different network models on the two datasets. The PVCRF model provided in this embodiment obtains superior average IoU on the S3DIS and vKITTI datasets, 51.8% and 39.1% respectively, and also obtains good overall labeling accuracy OA, 81.2% and 82.6% respectively, indicating that the PVCRF model can better obtain a complete feature expression of the point cloud. This proves that multi-scale feature extraction based on the three-dimensional voxel space plays a role comparable to multi-scale feature extraction based on Euclidean space: both can provide detailed information of the point cloud scene for three-dimensional point cloud labeling.
Comparing PVCRF with SEGCloud: SEGCloud directly adopts a binary voxel model for voxel feature learning, whereas PVCRF adopts a rasterized point cloud voxel model and uses the points inside each voxel for voxel feature learning; both add back-end optimization based on a fully-connected CRF model. In addition, PVCRF contains the multi-scale feature extraction module based on the three-dimensional voxel space, so PVCRF obtains a higher average IoU, which fully demonstrates that the multi-scale feature extraction module can extract feature descriptions with stronger characterization capability by processing the point cloud data inside the voxels.
Comparing the Point-VoxelNet and PVCRF models, the PVCRF model performs better in both average IoU and overall accuracy OA, showing that the fully-connected CRF model can model context information over a larger range and has a stronger ability to characterize the adjacency relations of the point cloud.
Across the different datasets, taking PointNet as the reference: since the point clouds in the S3DIS dataset are concentrated and generally dense, the network model PVCRF provided in this embodiment achieves a large performance improvement, while on the vKITTI dataset, where the point cloud distribution is generally sparse, the performance improvement is smaller.
The labeling results obtained by the different algorithms on the S3DIS and vKITTI datasets, together with the ground-truth labels, are shown in fig. 15 and fig. 16. From left to right: the input color point cloud, the ground-truth labeling, and the labeling results of PointNet, Point-VoxelNet and PVCRF. From the results in fig. 15 and fig. 16, the results obtained with PVCRF are better than those of PointNet and Point-VoxelNet and are closer to the ground truth, proving the effectiveness of the multi-scale feature learning and the CRF back-end optimization in PVCRF. Comparing PointNet and Point-VoxelNet, the labeling result obtained with Point-VoxelNet is superior, mainly because the introduction of multi-scale feature learning in Point-VoxelNet improves the feature learning capability of the network model.
In both indoor and outdoor scenes, the Point-VoxelNet and PVCRF models still have difficulty separating targets that strongly overlap or are closely connected, such as the wall board (board) and window in fig. 15 and the road and terrain in fig. 16. Because multi-scale feature extraction is considered in both the Point-VoxelNet and PVCRF models, larger-sized targets in the scene can be labeled well, such as the table in fig. 15 and the building in fig. 16.
Comparing Point-VoxelNet with PVCRF, the fully-connected CRF in PVCRF can extract large-scale context information from the scene point cloud, so the details in the labeling result are more prominent, as shown by the segmentation of the chair and the sofa in the first row of fig. 15 and of the road and the terrain in fig. 16.
(4) Class confusion matrix analysis
Fig. 17 and fig. 18 respectively show the class confusion matrices obtained by the Point-VoxelNet and PVCRF network models on the S3DIS and vKITTI data sets, where the value in each matrix cell is a class labeling accuracy and the cell color also encodes that accuracy. Comparing the two results, it can be found that on the indoor data set S3DIS the segmentation accuracies of Point-VoxelNet and PVCRF are comparable across the various targets, and the introduction of the CRF model mainly reduces the confusion between table and floor. On the outdoor data set vKITTI, the introduction of the CRF model mainly reduces the confusion between building and van.
As can be seen from fig. 17(a), on the S3DIS data set the recognition accuracy of the Point-VoxelNet network model for the ceiling, floor, door, column, beam, window, bookcase, board and chair classes is above 52%; the classes with low labeling accuracy include sofa, wall and clutter, with accuracies between 35% and 46%; and the worst of the 13 target classes is table, with an accuracy of 22%. The distribution of per-class accuracy obtained by PVCRF on the S3DIS data set is similar to that of the Point-VoxelNet network model, but the accuracy is generally improved, with the accuracy of table rising to 30%, as shown in fig. 18(a).
Similar comparison results are found on the vKITTI data set, as shown in fig. 17(b) and fig. 18(b). This comparison demonstrates that the CRF model accurately understands and models the target details in the scene. Comparing the results on the two data sets in fig. 17 and fig. 18, the labeling accuracy on the vKITTI data set is generally lower, mainly because the vKITTI point clouds are sparser and less uniform than the S3DIS point clouds, so the neighborhood structure of each point is less distinct, information extraction is insufficient, and high-accuracy point cloud labeling is harder to achieve.
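A class confusion matrix of this kind, whose diagonal gives the per-class accuracy, can be computed as in the sketch below; the row normalization is an assumption made here for illustration, since the figures only state that the cell values are class labeling accuracies.

```python
import numpy as np

def class_confusion(pred, gt, n_classes):
    # pred, gt: integer class labels per point (same length)
    cm = np.zeros((n_classes, n_classes), dtype=np.float64)
    np.add.at(cm, (gt, pred), 1.0)                 # count (ground truth, prediction) pairs
    row_sums = cm.sum(axis=1, keepdims=True)
    # entry (i, j): fraction of ground-truth class i labeled as class j
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)
```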
(5) Computational time statistical analysis
The computational efficiency of the different algorithms was analyzed experimentally on the S3DIS data set. The S3DIS data set contains 272 independent point clouds; each point cloud is labeled with the different algorithms and the average test time is recorded (only the time consumed by the neural network computation is counted, not the data preparation stage). The statistical results are shown in table 3. As can be seen from table 3, the computation time of PointNet is about 1.8 s. The PVCRF network model provided by this embodiment, which fuses the three-dimensional voxel convolutional neural network with CRF back-end optimization, takes the longest time, 4.52 s, because the introduction of the CRF significantly increases the amount of computation.
TABLE 3
Finally, further optimization is performed by adopting an FC-CRF implemented on the basis of a convolutional neural network, so as to obtain a fine labeling result.
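The mean-field inference behind such a fully-connected CRF refinement can be sketched as follows; this is a simplified dense formulation with an explicit pairwise affinity matrix and an assumed Potts-style compatibility, written for illustration only, and is not the CRF-RNN implementation used in the embodiment.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def meanfield_refine(unary_logits, affinity, n_iters=5, compat=1.0):
    # unary_logits: (N, C) per-point class scores from the network
    # affinity:     (N, N) pairwise similarity, e.g. a Gaussian kernel on xyz/color
    affinity = affinity.copy()
    np.fill_diagonal(affinity, 0.0)               # a point sends no message to itself
    q = softmax(unary_logits)                     # initial per-point label marginals
    for _ in range(n_iters):
        msg = affinity @ q                        # aggregate neighbors' current beliefs
        q = softmax(unary_logits + compat * msg)  # agreement with similar points is rewarded
    return q
```

For large point clouds a dense affinity matrix is impractical, which is why efficient filtering-based message passing is normally used in fully-connected CRF implementations.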
Further, referring to fig. 19, the fused-voxel-based three-dimensional point cloud labeling apparatus 100 provided in the embodiment of the present invention includes a voxel processing and feature extraction module 110, a multi-scale voxel feature calculation module 120, a feature expansion module 130, and a point cloud marking module 140.
A voxel processing and feature extraction module 110, configured to perform voxelization processing on the three-dimensional point cloud data set, and perform voxel feature extraction in voxels based on a processing result to form a first voxel feature matrix; in this embodiment, the step S11 may be executed by the voxel processing and feature extraction module 110, that is, for the detailed description of the voxel processing and feature extraction module 110, reference may be made to the step S11, which is not repeated herein. Optionally, as shown in fig. 19, in the present embodiment, the voxel processing and feature extraction module 110 includes a voxel dividing unit 111, a point cloud classifying unit 112, and a point cloud sampling unit 113.
The voxel dividing unit 111 is used for dividing the point cloud coordinate space into a plurality of voxels according to a preset voxel size; in this embodiment, the step S111 may be executed by the voxel dividing unit 111, that is, for a detailed description of the voxel dividing unit 111, reference may be made to the step S111, which is not described herein again.
A point cloud classifying unit 112, configured to classify each point in the three-dimensional point cloud data set into a corresponding voxel according to a grid parameter of the voxel; in this embodiment, the step S112 may be performed by the point cloud classifying unit 112, that is, the step S112 may be referred to for the detailed description of the point cloud classifying unit 112, and the detailed description is not repeated herein.
And the point cloud sampling unit 113 is configured to sample points in each classified voxel so that the number of points in the voxel reaches a first preset value. In this embodiment, the step S113 may be performed by the point cloud sampling unit 113, that is, the step S113 may be referred to for the detailed description of the point cloud sampling unit 113, and the detailed description is not repeated herein.
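Taken together, the units 111 to 113 implement the voxelization step. A minimal sketch is given below, assuming a fixed voxel size and a fixed per-voxel point count; voxel_size and points_per_voxel stand in for the preset voxel size and the first preset value, and all names are illustrative rather than taken from the embodiment.

```python
import numpy as np

def voxelize(points, voxel_size, points_per_voxel):
    # points: (N, 3+) array whose first three columns are xyz coordinates
    xyz = points[:, :3]
    grid = np.floor((xyz - xyz.min(axis=0)) / voxel_size).astype(np.int64)
    voxels = {}
    for i, key in enumerate(map(tuple, grid)):     # assign each point to its voxel
        voxels.setdefault(key, []).append(i)
    sampled = {}
    for key, idx in voxels.items():
        idx = np.asarray(idx)
        # down-sample dense voxels; duplicate points in sparse ones
        replace = len(idx) < points_per_voxel
        sampled[key] = np.random.choice(idx, points_per_voxel, replace=replace)
    return sampled   # voxel grid coordinate -> fixed-size array of point indices
```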
A multi-scale voxel characteristic calculation module 120, configured to calculate a multi-scale characteristic of a voxel by using the first voxel characteristic matrix as an input of a three-dimensional convolutional neural network, and perform characteristic tandem fusion on the multi-scale characteristic to obtain a second voxel characteristic matrix; in this embodiment, the step S12 may be executed by the multi-scale voxel characteristic calculating module 120, that is, the step S12 may be referred to for the detailed description of the multi-scale voxel characteristic calculating module 120, and the detailed description is not repeated here.
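A possible realization of the multi-scale branch in module 120 is sketched below using PyTorch; the channel widths and the (3, 5, 7) kernel sizes are illustrative assumptions. Following the method, each branch uses the same kernel size for its convolution and deconvolution, so every branch restores the original grid resolution (assumed even here) before the branch outputs are concatenated in tandem.

```python
import torch
import torch.nn as nn

class MultiScaleVoxelFeatures(nn.Module):
    # Parallel 3D conv/deconv branches with different kernel sizes; the branch
    # outputs are fused in tandem (concatenated) along the channel axis.
    def __init__(self, in_ch=64, out_ch=32, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv3d(in_ch, out_ch, k, stride=2, padding=k // 2),
                nn.ReLU(inplace=True),
                nn.ConvTranspose3d(out_ch, out_ch, k, stride=2,
                                   padding=k // 2, output_padding=1),
            )
            for k in kernel_sizes
        ])

    def forward(self, voxel_feats):
        # voxel_feats: (B, C, D, H, W) dense voxel feature grid with even D, H, W
        return torch.cat([branch(voxel_feats) for branch in self.branches], dim=1)
```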
A feature expansion module 130, configured to expand the voxel features in the second voxel feature matrix to each point in the three-dimensional point cloud data set based on a feature interpolation algorithm to obtain a point cloud feature matrix; in this embodiment, the step S13 can be executed by the feature extension module 130, that is, the detailed description about the feature extension module 130 can refer to the step S13, and the detailed description is not repeated here.
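A sketch of this interpolation step is shown below; the normalized inverse-distance weighting is an assumption made here, since the method only states that the weight parameter is obtained from the Euclidean distance between the target point and the center of each neighborhood voxel.

```python
import numpy as np

def interpolate_point_features(points, voxel_centers, voxel_feats, k=8, eps=1e-8):
    # points:        (N, 3) xyz coordinates of the original point cloud
    # voxel_centers: (M, 3) center coordinates of the non-empty voxels
    # voxel_feats:   (M, F) second voxel feature matrix, one row per voxel
    point_feats = np.zeros((points.shape[0], voxel_feats.shape[1]))
    for i, p in enumerate(points):
        d = np.linalg.norm(voxel_centers - p, axis=1)   # distances to all voxel centers
        nn_idx = np.argsort(d)[:k]                      # k nearest neighborhood voxels
        w = 1.0 / (d[nn_idx] + eps)                     # assumed inverse-distance weights
        point_feats[i] = (w[:, None] * voxel_feats[nn_idx]).sum(axis=0) / w.sum()
    return point_feats                                  # point cloud feature matrix
```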
And the point cloud marking module 140 is used for inputting the point cloud feature matrix into the multilayer perceptron to mark the attributes of the three-dimensional point cloud. In this embodiment, the step S14 can be executed by the point cloud marking module 140, that is, the detailed description about the point cloud marking module 140 can refer to the step S14, and the detailed description is not repeated here.
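The labeling head of module 140 can be sketched as a shared per-point multilayer perceptron; the layer widths below are illustrative assumptions, and the 13 output classes simply match the S3DIS setting discussed above.

```python
import torch
import torch.nn as nn

class PointLabelHead(nn.Module):
    # Shared per-point MLP mapping the fused point feature to class scores.
    def __init__(self, in_dim=256, n_classes=13):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 64), nn.ReLU(inplace=True),
            nn.Linear(64, n_classes),
        )

    def forward(self, point_feats):
        # point_feats: (N, in_dim) point cloud feature matrix -> (N, n_classes) logits
        return self.mlp(point_feats)
```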
In summary, the fused-voxel-based three-dimensional point cloud labeling method and apparatus provided by the embodiments of the present invention build a multi-scale space on a regularized voxel model based on a voxel convolutional neural network to extract multi-scale voxel features, and then expand the voxel features to point features by feature interpolation, thereby realizing relatively fine point-by-point classification and point cloud labeling.
In the description of the present invention, the terms "disposed" and "connected" should be interpreted broadly; for example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; and it may be a direct connection, an indirect connection through an intermediate medium, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances. In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus and method embodiments described above are merely illustrative; the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function.
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A fused voxel based three-dimensional point cloud labeling method, the method comprising:
performing voxelization processing on the three-dimensional point cloud data set, and performing voxel characteristic extraction in voxels based on a processing result to form a first voxel characteristic matrix;
taking the first voxel characteristic matrix as the input of a three-dimensional convolution neural network to calculate and obtain multi-scale characteristics of voxels, and performing characteristic series fusion on the multi-scale characteristics to obtain a second voxel characteristic matrix;
expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set based on a characteristic interpolation algorithm to obtain a point cloud characteristic matrix;
and inputting the point cloud characteristic matrix into a multilayer perceptron to realize attribute marking of the three-dimensional point cloud.
2. The fused voxel based three-dimensional point cloud labeling method of claim 1, wherein the step of voxelizing the point cloud comprises:
dividing a point cloud coordinate space into a plurality of voxels according to a preset voxel size;
classifying each point in the three-dimensional point cloud data set into a corresponding voxel according to the grid parameter of the voxel;
and sampling the points in each classified voxel to enable the number of the points in the voxel to reach a first preset value.
3. The fused voxel based three-dimensional point cloud labeling method of claim 2, wherein the step of sampling points in each classified voxel comprises:
if the number of points contained in the voxel is larger than a first preset value, randomly sampling the first preset value points from the current voxel to enable the number of the points in the voxel to reach the first preset value;
if the number of points contained in the voxel is smaller than a first preset value, one or more points are randomly selected from the current voxel to be copied so that the number of points in the voxel reaches the first preset value.
4. The fused voxel based three-dimensional point cloud labeling method of claim 1, wherein the step of performing point cloud feature extraction within a voxel based on the processing result to form a first voxel feature matrix comprises:
calculating the center coordinate of the point cloud in each voxel, and performing center normalization processing on the point cloud data in the voxels based on the center coordinate to obtain an initial data matrix;
inputting the initial data matrix into an LGAB module to realize point-by-point local feature description, and performing point-by-point pooling operation on a local feature set in a voxel by adopting maximum pooling to obtain a global feature of the voxel and using the global feature as a first voxel feature matrix.
5. The fused voxel based three-dimensional point cloud labeling method of claim 1, wherein the step of using the first voxel feature matrix as an input to a three-dimensional convolutional neural network to compute the multi-scale features of the voxels comprises:
converting the voxel characteristic matrix into a 4-dimensional tensor, and respectively inputting the 4-dimensional tensor into three-dimensional convolution neural networks with convolution kernels of different sizes to calculate and obtain voxel characteristics under different scales;
and respectively inputting the voxel characteristics under different scales into three-dimensional deconvolution neural networks with convolution kernels of different sizes to obtain a plurality of voxel characteristics of different scales, wherein the convolution kernel size of each three-dimensional convolution neural network during convolution is the same as the convolution kernel size of each three-dimensional deconvolution neural network during deconvolution.
6. The fused voxel based three-dimensional point cloud labeling method of claim 1, wherein the step of extending the voxel features in the second voxel feature matrix into points in the three-dimensional point cloud dataset based on a feature interpolation algorithm comprises:
aiming at each point in the three-dimensional point cloud data set, respectively taking the point as a target point, and searching a preset number of voxels closest to the target point in a voxel space formed by the second voxel characteristic matrix to be used as neighborhood voxels;
and performing expansion calculation on the voxel characteristics of the neighborhood voxels to obtain the point cloud characteristics of the target point.
7. The fused voxel based three-dimensional point cloud labeling method of claim 6, wherein the point cloud feature F_p of the target point p is F_p = Σ_j w_j · F_{c_j}, wherein w_j represents a weight parameter obtained according to the Euclidean distance between the target point p and the center point c_j of the j-th neighborhood voxel, and F_{c_j} represents the voxel feature of the j-th neighborhood voxel.
8. The fused voxel based three-dimensional point cloud labeling method of claim 1, further comprising:
inputting the three-dimensional point cloud with the attribute marking into a CRF-RNN network to optimize the point cloud attribute marking.
9. A fused voxel based three-dimensional point cloud labeling apparatus, the apparatus comprising:
the voxel processing and characteristic extraction module is used for carrying out voxel processing on the three-dimensional point cloud data set and carrying out voxel characteristic extraction in voxels based on a processing result to form a first voxel characteristic matrix;
the multi-scale voxel characteristic calculation module is used for calculating the multi-scale characteristic of the voxel by taking the first voxel characteristic matrix as the input of the three-dimensional convolutional neural network, and performing characteristic series fusion on the multi-scale characteristic to obtain a second voxel characteristic matrix;
the characteristic expansion module is used for expanding the voxel characteristics in the second voxel characteristic matrix to each point in the three-dimensional point cloud data set based on a characteristic interpolation algorithm to obtain a point cloud characteristic matrix;
and the point cloud marking module is used for inputting the point cloud characteristic matrix into the multilayer perceptron to mark the attribute of the three-dimensional point cloud.
10. The fused voxel based three-dimensional point cloud labeling apparatus of claim 9, wherein the voxel processing and feature extraction module comprises:
the voxel dividing unit is used for dividing the point cloud coordinate space into a plurality of voxels according to a preset voxel size;
the point cloud classifying unit is used for classifying each point in the three-dimensional point cloud data set into a corresponding voxel according to the grid parameter of the voxel;
and the point cloud sampling unit is used for sampling points in each classified voxel so as to enable the number of the points in the voxel to reach a first preset value.
CN201810861715.4A 2018-08-01 2018-08-01 Three-dimensional point cloud marking method and device based on fusion voxels Active CN109118564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810861715.4A CN109118564B (en) 2018-08-01 2018-08-01 Three-dimensional point cloud marking method and device based on fusion voxels

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810861715.4A CN109118564B (en) 2018-08-01 2018-08-01 Three-dimensional point cloud marking method and device based on fusion voxels

Publications (2)

Publication Number Publication Date
CN109118564A true CN109118564A (en) 2019-01-01
CN109118564B CN109118564B (en) 2023-09-19

Family

ID=64863798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810861715.4A Active CN109118564B (en) 2018-08-01 2018-08-01 Three-dimensional point cloud marking method and device based on fusion voxels

Country Status (1)

Country Link
CN (1) CN109118564B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104700451A (en) * 2015-03-14 2015-06-10 西安电子科技大学 Point cloud registering method based on iterative closest point algorithm
CN105005995A (en) * 2015-07-29 2015-10-28 武汉大学 Method used for calculating three-dimensional point cloud model bone
CN106600622A (en) * 2016-12-06 2017-04-26 西安电子科技大学 Point cloud data partitioning method based on hyper voxels
CN107067470A (en) * 2017-04-05 2017-08-18 东北大学 Portable three-dimensional reconstruction of temperature field system based on thermal infrared imager and depth camera
CN108154192A (en) * 2018-01-12 2018-06-12 西安电子科技大学 High Resolution SAR terrain classification method based on multiple dimensioned convolution and Fusion Features

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LU JUN: ""Point Cloud Registration Algorithm Based on NDT with Variable Size Voxel"", 《PROCEEDINGS OF THE 34TH CHINESE CONTROL CONFERENCE》 *
袁华: ""基于体素化网格下采样的点云简化算法研究"", 《数字视频》 *

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816714A (en) * 2019-01-15 2019-05-28 西北大学 A kind of point cloud object type recognition methods based on Three dimensional convolution neural network
CN109829399B (en) * 2019-01-18 2022-07-05 武汉大学 Vehicle-mounted road scene point cloud automatic classification method based on deep learning
CN109829399A (en) * 2019-01-18 2019-05-31 武汉大学 A kind of vehicle mounted road scene point cloud automatic classification method based on deep learning
WO2020168648A1 (en) * 2019-02-18 2020-08-27 平安科技(深圳)有限公司 Image segmentation method and device, and computer-readable storage medium
CN110060288A (en) * 2019-03-15 2019-07-26 华为技术有限公司 Generation method, device and the storage medium of point cloud characteristic pattern
CN110097077A (en) * 2019-03-26 2019-08-06 深圳市速腾聚创科技有限公司 Point cloud data classification method, device, computer equipment and storage medium
CN110046561A (en) * 2019-03-29 2019-07-23 贵阳铝镁设计研究院有限公司 A kind of electrolytic aluminium anode carbon block surface steel seal recognition methods based on 3D vision
CN110059608A (en) * 2019-04-11 2019-07-26 腾讯科技(深圳)有限公司 A kind of object detecting method, device, electronic equipment and storage medium
US11915501B2 (en) 2019-04-11 2024-02-27 Tencent Technology (Shenzhen) Company Limited Object detection method and apparatus, electronic device, and storage medium
CN112115744B (en) * 2019-06-20 2024-05-21 北京京东叁佰陆拾度电子商务有限公司 Point cloud data processing method and device, computer storage medium and electronic equipment
CN112115744A (en) * 2019-06-20 2020-12-22 北京京东叁佰陆拾度电子商务有限公司 Point cloud data processing method and device, computer storage medium and electronic equipment
CN114401666A (en) * 2019-07-15 2022-04-26 普罗马顿控股有限责任公司 Object detection and instance segmentation of 3D point clouds based on deep learning
WO2021027710A1 (en) * 2019-08-12 2021-02-18 阿里巴巴集团控股有限公司 Method, device, and equipment for object detection
CN111027581A (en) * 2019-08-23 2020-04-17 中国地质大学(武汉) 3D target detection method and system based on learnable codes
US11967873B2 (en) 2019-09-23 2024-04-23 Canoo Technologies Inc. Fractional slot electric motors with coil elements having rectangular cross-sections
CN110675488B (en) * 2019-09-24 2023-02-28 电子科技大学 Method for constructing modeling system of creative three-dimensional voxel model based on deep learning
CN110675488A (en) * 2019-09-24 2020-01-10 电子科技大学 Construction method of creative three-dimensional voxel model modeling system based on deep learning
CN112154448A (en) * 2019-09-29 2020-12-29 深圳市大疆创新科技有限公司 Target detection method and device and movable platform
CN111104641A (en) * 2019-12-10 2020-05-05 重庆大学 Method for identifying crystal grains by computer in three-dimensional space
CN111144242A (en) * 2019-12-13 2020-05-12 中国科学院深圳先进技术研究院 Three-dimensional target detection method and device and terminal
CN111144242B (en) * 2019-12-13 2023-09-29 中国科学院深圳先进技术研究院 Three-dimensional target detection method, device and terminal
CN111223101A (en) * 2020-01-17 2020-06-02 湖南视比特机器人有限公司 Point cloud processing method, point cloud processing system, and storage medium
CN111223101B (en) * 2020-01-17 2023-08-11 湖南视比特机器人有限公司 Point cloud processing method, point cloud processing system and storage medium
CN111260775A (en) * 2020-01-23 2020-06-09 清华大学 Three-dimensional reconstruction method and device based on multi-scale perception of shielding information
CN111340935A (en) * 2020-01-23 2020-06-26 北京市商汤科技开发有限公司 Point cloud data processing method, intelligent driving method, related device and electronic equipment
CN111260775B (en) * 2020-01-23 2022-05-20 清华大学 Three-dimensional reconstruction method and device based on multi-scale perception of occlusion information
WO2021164066A1 (en) * 2020-02-18 2021-08-26 中国电子科技集团公司第二十八研究所 Convolutional neural network-based target group distribution mode determination method and device
CN111414802B (en) * 2020-02-21 2023-06-27 青岛海洋科技中心 Protein data characteristic extraction method
CN111414802A (en) * 2020-02-21 2020-07-14 青岛海洋科学与技术国家实验室发展中心 Protein data feature extraction method
CN111368733B (en) * 2020-03-04 2022-12-06 电子科技大学 Three-dimensional hand posture estimation method based on label distribution learning, storage medium and terminal
CN111368733A (en) * 2020-03-04 2020-07-03 电子科技大学 Three-dimensional hand posture estimation method based on label distribution learning, storage medium and terminal
CN113763465A (en) * 2020-06-02 2021-12-07 中移(成都)信息通信科技有限公司 Garbage determination system, model training method, determination method and determination device
CN112085066B (en) * 2020-08-13 2022-08-26 南京邮电大学 Voxelized three-dimensional point cloud scene classification method based on graph convolution neural network
CN112085066A (en) * 2020-08-13 2020-12-15 南京邮电大学 Voxelized three-dimensional point cloud scene classification method based on graph convolution neural network
CN111951368A (en) * 2020-08-31 2020-11-17 广州大学 Point cloud, voxel and multi-view fusion deep learning method
CN111951368B (en) * 2020-08-31 2023-08-08 广州大学 Deep learning method for point cloud, voxel and multi-view fusion
CN112184867A (en) * 2020-09-23 2021-01-05 中国第一汽车股份有限公司 Point cloud feature extraction method, device, equipment and storage medium
CN112257605A (en) * 2020-10-23 2021-01-22 中国科学院自动化研究所 Three-dimensional target detection method, system and device based on self-labeling training sample
CN113759338A (en) * 2020-11-09 2021-12-07 北京京东乾石科技有限公司 Target detection method and device, electronic equipment and storage medium
CN113759338B (en) * 2020-11-09 2024-04-16 北京京东乾石科技有限公司 Target detection method and device, electronic equipment and storage medium
CN112541535B (en) * 2020-12-09 2024-01-05 中国科学院深圳先进技术研究院 Three-dimensional point cloud classification method based on complementary multi-branch deep learning
CN112541535A (en) * 2020-12-09 2021-03-23 中国科学院深圳先进技术研究院 Three-dimensional point cloud classification method based on complementary multi-branch deep learning
CN112884723A (en) * 2021-02-02 2021-06-01 贵州电网有限责任公司 Insulator string detection method in three-dimensional laser point cloud data
CN112884723B (en) * 2021-02-02 2022-08-12 贵州电网有限责任公司 Insulator string detection method in three-dimensional laser point cloud data
WO2022193335A1 (en) * 2021-03-15 2022-09-22 深圳大学 Point cloud data processing method and apparatus, and computer device and storage medium
US11875424B2 (en) 2021-03-15 2024-01-16 Shenzhen University Point cloud data processing method and device, computer device, and storage medium
CN113033655A (en) * 2021-03-23 2021-06-25 广州虎牙科技有限公司 3D voxel model classification method and device, electronic equipment and storage medium
CN113378854A (en) * 2021-06-11 2021-09-10 武汉大学 Point cloud target detection method integrating original point cloud and voxel division
CN113487713A (en) * 2021-06-16 2021-10-08 中南民族大学 Point cloud feature extraction method and device and electronic equipment
CN113706710B (en) * 2021-08-11 2024-03-08 武汉大学 Virtual point multi-source point cloud fusion method and system based on FPFH characteristic difference
CN113706710A (en) * 2021-08-11 2021-11-26 武汉大学 Virtual point multi-source point cloud fusion method and system based on FPFH (field programmable gate flash) feature difference
CN113780446A (en) * 2021-09-16 2021-12-10 广州大学 Lightweight voxel deep learning method capable of being heavily parameterized
CN116259029B (en) * 2023-05-15 2023-08-15 小米汽车科技有限公司 Target detection method and device and vehicle
CN116259029A (en) * 2023-05-15 2023-06-13 小米汽车科技有限公司 Target detection method and device and vehicle
CN116310149B (en) * 2023-05-18 2023-07-21 深圳优立全息科技有限公司 Voxel merging method, device, equipment and storage medium
CN116310149A (en) * 2023-05-18 2023-06-23 深圳优立全息科技有限公司 Voxel merging method, device, equipment and storage medium
CN117496086A (en) * 2023-10-12 2024-02-02 南京林业大学 Indoor geometric reconstruction method for semantic perception
CN117496086B (en) * 2023-10-12 2024-05-07 南京林业大学 Indoor geometric reconstruction method for semantic perception

Also Published As

Publication number Publication date
CN109118564B (en) 2023-09-19

Similar Documents

Publication Publication Date Title
CN109118564A (en) A kind of three-dimensional point cloud labeling method and device based on fusion voxel
CN108229479B (en) Training method and device of semantic segmentation model, electronic equipment and storage medium
CN110135267B (en) Large-scene SAR image fine target detection method
CN109829399B (en) Vehicle-mounted road scene point cloud automatic classification method based on deep learning
CN111489358A (en) Three-dimensional point cloud semantic segmentation method based on deep learning
CN112633350B (en) Multi-scale point cloud classification implementation method based on graph convolution
CN112488210A (en) Three-dimensional point cloud automatic classification method based on graph convolution neural network
CN114724120B (en) Vehicle target detection method and system based on radar vision semantic segmentation adaptive fusion
CN110675423A (en) Unmanned aerial vehicle tracking method based on twin neural network and attention model
CN110309842B (en) Object detection method and device based on convolutional neural network
CN111695494A (en) Three-dimensional point cloud data classification method based on multi-view convolution pooling
WO2023193401A1 (en) Point cloud detection model training method and apparatus, electronic device, and storage medium
CN113743417B (en) Semantic segmentation method and semantic segmentation device
WO2019146057A1 (en) Learning device, system for generating captured image classification device, device for generating captured image classification device, learning method, and program
CN114495089B (en) Three-dimensional target detection method based on multi-scale heterogeneous characteristic self-adaptive fusion
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN116612288B (en) Multi-scale lightweight real-time semantic segmentation method and system
CN108154511B (en) SAR image segmentation method based on submodule dictionary learning
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
CN113850189A (en) Embedded twin network real-time tracking method applied to maneuvering platform
CN115393631A (en) Hyperspectral image classification method based on Bayesian layer graph convolution neural network
CN115995042A (en) Video SAR moving target detection method and device
Deng et al. See-csom: Sharp-edged and efficient continuous semantic occupancy mapping for mobile robots
CN111860679A (en) Vehicle detection method based on YOLO v3 improved algorithm
CN114693923A (en) Three-dimensional point cloud semantic segmentation method based on context and attention

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20230817

Address after: Room 402, 4th Floor, Old Book Office Building, Shandong Vocational College of Science and Technology, No. 1567 Xinhua Road, Kuiwen District, Weifang City, Shandong Province, 261035

Applicant after: Shandong Jiayin Information Technology Co.,Ltd.

Address before: 3 / F, building 19, phase I, Changsha Zhongdian Software Park, 39 Jianshan Road, Changsha, Hunan 410000

Applicant before: HUNAN VISUALTOURING INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231218

Address after: No.109, Deya Road, Kaifu District, Changsha, Hunan 410000

Patentee after: National University of Defense Technology

Address before: Room 402, 4th Floor, Old Book Office Building, Shandong Vocational College of Science and Technology, No. 1567 Xinhua Road, Kuiwen District, Weifang City, Shandong Province, 261035

Patentee before: Shandong Jiayin Information Technology Co.,Ltd.