CN112967296B - Point cloud dynamic region graph convolution method, classification method and segmentation method - Google Patents
Point cloud dynamic region graph convolution method, classification method and segmentation method
- Publication number
- CN112967296B CN112967296B CN202110261653.5A CN202110261653A CN112967296B CN 112967296 B CN112967296 B CN 112967296B CN 202110261653 A CN202110261653 A CN 202110261653A CN 112967296 B CN112967296 B CN 112967296B
- Authority
- CN
- China
- Prior art keywords
- point cloud
- information
- point
- convolution
- dynamic area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention discloses a point cloud dynamic region graph convolution method, together with a point cloud dynamic region graph classification method and a point cloud dynamic region graph segmentation method that use the convolution method. The invention adopts a new form of convolution operation for point clouds: based on a constructed point cloud graph structure, it aggregates point feature information from multiple neighborhoods of different sizes through a nonlinear method, so that each neuron can adaptively select its region size. Compared with existing technical solutions such as PointNet, which analyze each point in isolation, the invention constructs several different local neighborhood graph structures, lets each neuron adaptively select a suitable neighborhood receptive field size, and then uses the connection between each point and its neighboring points to perform a convolution-like operation that extracts local features. In this way the surrounding neighborhood information is combined more thoroughly and local geometric information is extracted more effectively, which ultimately improves the accuracy of point cloud classification and segmentation.
Description
Technical Field
The invention belongs to the technical field of computer vision, and in particular relates to a point cloud dynamic region graph convolution method, a classification method and a segmentation method.
Background Art
Point cloud data contains rich semantic information and is characterized by high density and high precision. However, because point cloud data is irregular and unordered, semantic analysis based on point clouds remains a difficult challenge. Some early methods relied on hand-crafted features with complex rules to solve such problems. In recent years, with the rapid growth of deep learning and machine learning, deep learning methods have also been introduced for the analysis and processing of point cloud data. Deep networks expect regularly shaped input, whereas point cloud data is fundamentally irregular and its spatial distribution is independent of the ordering of the points, so a common way to process point clouds with deep learning models is to first convert the raw point cloud into grids, voxels, trees or other regular data structures. Some advanced deep learning networks, such as PointNet and PointNet++, are designed specifically to handle the irregularity of point clouds: they process raw point cloud data directly, without first converting it into a regular form. However, neither PointNet nor PointNet++ supports convolution operations, and neither can effectively extract local geometric information.
Much current work focuses on processing point cloud data with convolution operations. 2D CNNs have been extended directly to the 3D domain by treating 3D space as a volumetric grid and operating on it with 3D convolutions. Although 3D convolutions have achieved good results on point cloud classification and segmentation tasks, their heavy storage requirements and high computational cost mean that they still suffer from insufficient accuracy on large-scale datasets and in large scenes.
In summary, improving the accuracy of point cloud classification and segmentation has become an urgent problem for those skilled in the art.
Summary of the Invention
In view of the above deficiencies in the prior art, the problem actually solved by the present invention is to improve the accuracy of point cloud classification and segmentation.
To solve the above technical problem, the present invention adopts the following technical solution:
A point cloud dynamic region graph convolution method, comprising:
S1. Acquire 3D point cloud data X = {α_1, α_2, α_3, …, α_i, …, α_n}, where α_i denotes the data of the i-th point, n denotes the number of points in the point cloud, α_i = {x_i, y_i, z_i}, and x_i, y_i and z_i are the three-dimensional coordinates of α_i;
S2. Perform two independent k-nearest-neighbor operations, with different k values, on the 3D point cloud data X to obtain two local feature maps y and z;
S3. Fuse the two local feature maps y and z to obtain the fused information T, T = Sum(y, z);
S4. Apply a pooling operation to the fused information T to obtain the feature communication information s_1, s_1 = MAX(T);
S5. Use a fully connected layer to compactly reduce the dimensionality of s_1, obtaining the compact feature s_2, s_2 = FC(s_1);
S6. Use an attention mechanism to adaptively select the branch dimension information of the different regions from the compact feature, and normalize the weights with softmax to obtain the normalized weights a_1 = e^{FC_1(s_2)} / (e^{FC_1(s_2)} + e^{FC_2(s_2)}) and a_2 = e^{FC_2(s_2)} / (e^{FC_1(s_2)} + e^{FC_2(s_2)}), where FC_1() and FC_2() denote the fully connected layer operations corresponding to y and z, respectively;
S7. Multiply the normalized weights by the corresponding local feature maps and sum the results to obtain the feature map U, U = Sum(a_1*y, a_2*z).
Preferably, the k values of the two independent k-nearest-neighbor operations in step S2 are 15 and 25, respectively.
A point cloud dynamic region graph classification method, which performs convolution operations using the above point cloud dynamic region graph convolution method and takes the feature map U as the feature obtained by each convolution operation.
A point cloud dynamic region graph segmentation method, which performs convolution operations using the above point cloud dynamic region graph convolution method and takes the feature map U as the feature obtained by each convolution operation.
In summary, compared with the prior art, the present invention has the following technical effects:
The present invention adopts a new form of convolution operation for point clouds: based on a constructed point cloud graph structure, it aggregates point feature information from multiple neighborhoods of different sizes through a nonlinear method, so that each neuron can adaptively select its region size. Compared with existing technical solutions such as PointNet, which analyze each point in isolation, the present invention constructs several different local neighborhood graph structures, lets each neuron adaptively select a suitable neighborhood receptive field size, and then uses the connection between each point and its neighboring points to perform a convolution-like operation that extracts local features. In this way the surrounding neighborhood information is combined more thoroughly and local geometric information is extracted more effectively, which ultimately improves the accuracy of point cloud classification and segmentation.
Brief Description of the Drawings
Fig. 1 is a flow chart of the point cloud dynamic region graph convolution method disclosed by the present invention;
Fig. 2 is the k-nearest-neighbor graph of a local point cloud space.
Detailed Description of the Embodiments
The present invention will be further described in detail below in conjunction with the accompanying drawings.
As shown in Fig. 1, the present invention discloses a point cloud dynamic region graph convolution method, comprising:
S1. Acquire 3D point cloud data X = {α_1, α_2, α_3, …, α_i, …, α_n}, where α_i denotes the data of the i-th point, n denotes the number of points in the point cloud, α_i = {x_i, y_i, z_i}, and x_i, y_i and z_i are the three-dimensional coordinates of α_i;
S2. Perform two independent k-nearest-neighbor operations, with different k values, on the 3D point cloud data X to obtain two local feature maps y and z;
Fig. 2 shows the k-nearest-neighbor graph of a local point cloud space. Let α_j1, α_j2, …, α_jk be the k nearest neighbors of α_i, and let e_ij be the edge feature, defined as e_ij = h_θ(α_i, α_j), where θ is a trainable parameter and h_θ(α_i, α_j): R^C × R^C → R^{C′} is a nonlinear function whose output lives in the aggregated feature space R^{C′}. The output of the graph convolution at the i-th point can then be expressed as α′_i = max_{j:(i,j)∈E} h_θ(α_i, α_j). Analogously to convolution on 2D images, α_i is regarded as the central pixel of the convolution region and the α_j form the surrounding pixel block. The edge function designed for the directed edge e_ij formed by α_i and α_j in the graph structure is defined as h_θ(α_i, α_j) = h_θ(α_i, α_i − α_j). Such a structure combines global shape information with local neighborhood information and is implemented with an MLP. The max function is selected as the aggregation function.
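For illustration, the following is a minimal PyTorch sketch of the k-nearest-neighbor graph construction, the edge function h_θ(α_i, α_i − α_j) realized as a shared MLP, and the max aggregation described above. PyTorch, the (batch, channels, points) tensor layout and the helper names (knn_indices, edge_features, EdgeConvBranch) are illustrative assumptions, not part of the patent disclosure.

```python
import torch
import torch.nn as nn


def knn_indices(x, k):
    """x: (B, C, N) point features; returns (B, N, k) indices of the k nearest neighbors."""
    inner = -2 * torch.matmul(x.transpose(2, 1), x)          # (B, N, N)
    xx = torch.sum(x ** 2, dim=1, keepdim=True)              # (B, 1, N)
    neg_dist = -xx - inner - xx.transpose(2, 1)              # negative squared distance
    return neg_dist.topk(k=k, dim=-1)[1]                     # (B, N, k)


def edge_features(x, k):
    """Builds the [alpha_i, alpha_i - alpha_j] edge features for each point and its k neighbors."""
    B, C, N = x.shape
    idx = knn_indices(x, k)                                  # (B, N, k)
    idx = idx + torch.arange(B, device=x.device).view(-1, 1, 1) * N
    pts = x.transpose(2, 1).reshape(B * N, C)                # (B*N, C)
    neigh = pts[idx.reshape(-1)].view(B, N, k, C)            # neighbor features alpha_j
    center = pts.view(B, N, 1, C).expand(-1, -1, k, -1)      # center features alpha_i
    feat = torch.cat((center, center - neigh), dim=3)        # (B, N, k, 2C)
    return feat.permute(0, 3, 1, 2).contiguous()             # (B, 2C, N, k)


class EdgeConvBranch(nn.Module):
    """One k-neighborhood branch: h_theta as a shared MLP, then max aggregation over the k neighbors."""

    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):                                    # x: (B, C, N)
        e = self.mlp(edge_features(x, self.k))               # (B, C', N, k) edge features e_ij
        return e.max(dim=-1)[0]                              # max aggregation -> (B, C', N)
```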
S3. Fuse the two local feature maps y and z to obtain the fused information T, T = Sum(y, z);
S4. Apply a pooling operation to the fused information T to obtain the feature communication information s_1, s_1 = MAX(T);
S5. Use a fully connected layer to compactly reduce the dimensionality of s_1, obtaining the compact feature s_2, s_2 = FC(s_1);
Steps S3 to S5 integrate and encode the information coming from the multiple branches and pass it on to the next step, which allows the neurons to adaptively adjust the size of the k-neighborhood. Finally, a fully connected network compactly reduces the feature dimensionality, which enables precise adaptive region selection while also reducing the feature size and improving computational efficiency.
S6. Use an attention mechanism to adaptively select the branch dimension information of the different regions from the compact feature, and normalize the weights with softmax to obtain the normalized weights a_1 = e^{FC_1(s_2)} / (e^{FC_1(s_2)} + e^{FC_2(s_2)}) and a_2 = e^{FC_2(s_2)} / (e^{FC_1(s_2)} + e^{FC_2(s_2)}), where FC_1() and FC_2() denote the fully connected layer operations corresponding to y and z, respectively;
S7. Multiply the normalized weights by the corresponding local feature maps and sum the results to obtain the feature map U, U = Sum(a_1*y, a_2*z).
Here a_1 ∈ 1×C′, a_2 ∈ 1×C′, y ∈ n×C′ and z ∈ n×C′, where C′ denotes the number of feature channels.
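Steps S3 to S7 can likewise be sketched in code. This is a minimal sketch under the assumption of a (batch, C′, points) layout and a per-channel softmax over the two branch weights; the reduction ratio of the compact layer and the class name RegionSelect are illustrative choices, not part of the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RegionSelect(nn.Module):
    """Fuses two neighborhood branches y and z (steps S3-S7): element-wise sum, max pooling over
    points, compact FC, per-branch FCs with softmax weights, and the weighted sum U."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        self.fc_compact = nn.Linear(channels, hidden)        # S5: compact dimensionality reduction
        self.fc_y = nn.Linear(hidden, channels)              # FC_1, corresponding to branch y
        self.fc_z = nn.Linear(hidden, channels)              # FC_2, corresponding to branch z

    def forward(self, y, z):                                 # y, z: (B, C', N)
        t = y + z                                            # S3: T = Sum(y, z)
        s1 = t.max(dim=-1)[0]                                # S4: s_1 = MAX(T) -> (B, C')
        s2 = F.relu(self.fc_compact(s1))                     # S5: s_2 = FC(s_1)
        logits = torch.stack([self.fc_y(s2), self.fc_z(s2)], dim=1)   # (B, 2, C')
        a = F.softmax(logits, dim=1)                         # S6: softmax-normalized weights
        a1, a2 = a[:, 0].unsqueeze(-1), a[:, 1].unsqueeze(-1)          # (B, C', 1) each
        return a1 * y + a2 * z                               # S7: U = Sum(a_1*y, a_2*z)
```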
In a specific implementation, the k values of the two independent k-nearest-neighbor operations in step S2 are 15 and 25, respectively.
The average class accuracy and overall accuracy obtained with different numbers of k-nearest-neighbor operations and different k values are shown in Table 2. Accordingly, in the present invention, the optimal number of k-nearest-neighbor operations is 2, with k values of 15 and 25.
Table 2
The present invention adopts a new form of convolution operation for point clouds: based on a constructed point cloud graph structure, it aggregates point feature information from multiple neighborhoods of different sizes through a nonlinear method, so that each neuron can adaptively select its region size. Compared with existing technical solutions such as PointNet, which analyze each point in isolation, the present invention constructs several different local neighborhood graph structures, lets each neuron adaptively select a suitable neighborhood receptive field size, and then uses the connection between each point and its neighboring points to perform a convolution-like operation that extracts local features. In this way the surrounding neighborhood information is combined more thoroughly and local geometric information is extracted more effectively, which ultimately improves the accuracy of point cloud classification and segmentation.
The present invention also discloses a point cloud dynamic region graph classification method, which performs convolution operations using the above point cloud dynamic region graph convolution method and takes the feature map U as the feature obtained by each convolution operation.
To verify the effect of the point cloud dynamic region graph classification method disclosed by the present invention, the classification task is evaluated on the ModelNet40 dataset. This dataset contains 12,311 mesh CAD models from 40 categories, of which 9,843 models are used for training and 2,468 models are used for testing. The present invention follows the experimental settings of models such as DGCNN: for each model, 1,024 points are uniformly sampled from the mesh faces, and only the three-dimensional coordinates of the sampled points are used as the network input.
Four DRG modules are used to extract local geometric features, and the features computed by each DRG module are fed to the next module for further computation. For each DRG module, two branches with different k-neighborhoods, k = 15 and k = 25, are selected. The features obtained by the DRG modules are then concatenated to obtain a 64 + 64 + 128 + 256 = 512-dimensional per-point feature. Global max pooling and global average pooling are then used to obtain global features. Finally, two fully connected layers (512, 256) are used for classification.
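Combining the two sketches above, one plausible assembly of the classification network described in this embodiment is shown below. Only the four DRG modules, the two k values (15, 25), the channel widths 64/64/128/256, the 512-dimensional concatenation, the two pooling operations and the fully connected layers (512, 256) are taken from the text; the class names, the dropout layers and the width of the pooled global feature are assumptions.

```python
import torch
import torch.nn as nn

# Reuses EdgeConvBranch and RegionSelect from the sketches above.


class DRGConv(nn.Module):
    """One DRG module: two EdgeConv branches with k = 15 and k = 25, fused by RegionSelect."""

    def __init__(self, in_ch, out_ch, ks=(15, 25)):
        super().__init__()
        self.b1 = EdgeConvBranch(in_ch, out_ch, ks[0])
        self.b2 = EdgeConvBranch(in_ch, out_ch, ks[1])
        self.select = RegionSelect(out_ch)

    def forward(self, x):
        return self.select(self.b1(x), self.b2(x))


class DRGClassifier(nn.Module):
    """Four DRG modules (64, 64, 128, 256), concatenation to 512 channels per point,
    global max + average pooling, and fully connected layers (512, 256) for 40 classes."""

    def __init__(self, num_classes=40):
        super().__init__()
        self.drg1 = DRGConv(3, 64)
        self.drg2 = DRGConv(64, 64)
        self.drg3 = DRGConv(64, 128)
        self.drg4 = DRGConv(128, 256)
        self.head = nn.Sequential(
            nn.Linear(2 * 512, 512), nn.BatchNorm1d(512), nn.LeakyReLU(0.2), nn.Dropout(0.5),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.LeakyReLU(0.2), nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):                                    # x: (B, 3, N) raw coordinates
        f1 = self.drg1(x)
        f2 = self.drg2(f1)
        f3 = self.drg3(f2)
        f4 = self.drg4(f3)
        f = torch.cat([f1, f2, f3, f4], dim=1)               # (B, 512, N) per-point features
        g = torch.cat([f.max(dim=-1)[0], f.mean(dim=-1)], dim=1)  # global max + average pooling
        return self.head(g)                                  # (B, num_classes) logits
```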
All layers include LeakyReLU and batch normalization. The experiments also compare different numbers of k-neighborhoods to select the optimal setting, and the model is evaluated on the test dataset. An SGD optimizer with a learning rate of 0.1 is used, and the learning rate is decayed to 0.001. The batch size is 24 for the training data and 16 for the test data. The experimental results are shown in Table 1.
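A hedged sketch of the stated optimization setup follows. Only the SGD optimizer, the 0.1 to 0.001 learning-rate range and the batch sizes of 24 (training) and 16 (testing) come from the text; the momentum, weight decay, scheduler type, epoch count and the placeholder data are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = DRGClassifier()                                      # classifier from the sketch above
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
# Cosine annealing down to 0.001 is one plausible realization of the stated decay.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=250, eta_min=0.001)

# Placeholder tensors standing in for the 1024-point ModelNet40 samples.
train_set = TensorDataset(torch.randn(96, 3, 1024), torch.randint(0, 40, (96,)))
test_set = TensorDataset(torch.randn(32, 3, 1024), torch.randint(0, 40, (32,)))
train_loader = DataLoader(train_set, batch_size=24, shuffle=True)    # training batch size 24
test_loader = DataLoader(test_set, batch_size=16, shuffle=False)     # test batch size 16

for points, labels in train_loader:                          # one illustrative epoch
    optimizer.zero_grad()
    loss = criterion(model(points), labels)
    loss.backward()
    optimizer.step()
scheduler.step()
```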
Table 1
The present invention also discloses a point cloud dynamic region graph segmentation method, which performs convolution operations using the above point cloud dynamic region graph convolution method and takes the feature map U as the feature obtained by each convolution operation.
To verify the effect of the point cloud dynamic region graph segmentation method disclosed by the present invention, the part segmentation task is performed on the ShapeNet dataset. This task classifies each point in a point cloud into one of several part category labels of the object. The dataset contains 16,881 3D shapes from 16 object categories, with a total of 50 annotated parts, and 2,048 points are sampled from each training sample, again following the experimental protocol of models such as DGCNN. The outputs of three DRGConv modules are concatenated into the features of the 2,048 points and then transformed by an MLP (256, 256, 128). The batch size, activation function, learning rate and other settings are the same as in the classification network.
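A sketch of a part-segmentation network along these lines is given below. Only the three DRGConv modules, the per-point concatenation of their outputs, the shared MLP (256, 256, 128) and the 50 part labels come from the text; the per-module channel widths and the omission of a global feature branch are simplifying assumptions.

```python
import torch
import torch.nn as nn

# Reuses DRGConv from the classification sketch above.


class DRGSegmenter(nn.Module):
    """Three DRGConv modules, per-point concatenation of their outputs, a shared MLP
    (256, 256, 128), and a per-point classifier over 50 part labels."""

    def __init__(self, num_parts=50, widths=(64, 64, 128)):
        super().__init__()
        self.drg1 = DRGConv(3, widths[0])
        self.drg2 = DRGConv(widths[0], widths[1])
        self.drg3 = DRGConv(widths[1], widths[2])
        self.mlp = nn.Sequential(                            # shared per-point MLP (256, 256, 128)
            nn.Conv1d(sum(widths), 256, 1), nn.BatchNorm1d(256), nn.LeakyReLU(0.2),
            nn.Conv1d(256, 256, 1), nn.BatchNorm1d(256), nn.LeakyReLU(0.2),
            nn.Conv1d(256, 128, 1), nn.BatchNorm1d(128), nn.LeakyReLU(0.2),
        )
        self.classifier = nn.Conv1d(128, num_parts, 1)

    def forward(self, x):                                    # x: (B, 3, 2048)
        f1 = self.drg1(x)
        f2 = self.drg2(f1)
        f3 = self.drg3(f2)
        f = torch.cat([f1, f2, f3], dim=1)                   # per-point concatenation
        return self.classifier(self.mlp(f))                  # (B, 50, 2048) per-point logits
```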
The same evaluation method as PointNet is adopted: the IoU of a shape is computed by averaging the IoUs of the different parts occurring in that shape, and the IoU of a category is obtained by averaging the IoUs of all shapes belonging to that category. Finally, the mean IoU (mIoU) is computed by averaging the IoUs of all test shapes. The method is compared with PointNet, PointNet++, PointCNN, DGCNN and Kd-Net. The experimental results are shown in Table 3.
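The IoU protocol described above can be written out as a small sketch; the array layout and the convention of scoring a part that is absent from both prediction and ground truth as IoU 1 (as in the public PointNet part-segmentation code) are assumptions.

```python
import numpy as np


def shape_iou(pred, target, part_ids):
    """Mean IoU over the parts of this shape's category.
    pred, target: (N,) integer part labels of one shape; part_ids: the parts of its category."""
    ious = []
    for p in part_ids:
        inter = np.sum((pred == p) & (target == p))
        union = np.sum((pred == p) | (target == p))
        ious.append(1.0 if union == 0 else inter / union)    # part absent from both: count as 1
    return float(np.mean(ious))


def mean_ious(preds, targets, categories, parts_per_category):
    """Instance mIoU over all test shapes plus per-category IoU (mean over that category's shapes)."""
    per_shape = [shape_iou(p, t, parts_per_category[c])
                 for p, t, c in zip(preds, targets, categories)]
    per_category = {c: float(np.mean([iou for iou, cc in zip(per_shape, categories) if cc == c]))
                    for c in set(categories)}
    return float(np.mean(per_shape)), per_category
```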
Table 3
The above are only preferred embodiments of the present invention. It should be pointed out that those skilled in the art can make various modifications and improvements without departing from the present technical solution, and such modified and improved technical solutions shall likewise be regarded as falling within the scope of protection claimed by this application.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110261653.5A CN112967296B (en) | 2021-03-10 | 2021-03-10 | Point cloud dynamic region graph convolution method, classification method and segmentation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110261653.5A CN112967296B (en) | 2021-03-10 | 2021-03-10 | Point cloud dynamic region graph convolution method, classification method and segmentation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112967296A CN112967296A (en) | 2021-06-15 |
CN112967296B true CN112967296B (en) | 2022-11-15 |
Family
ID=76277614
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110261653.5A Active CN112967296B (en) | 2021-03-10 | 2021-03-10 | Point cloud dynamic region graph convolution method, classification method and segmentation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112967296B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113516663B (en) * | 2021-06-30 | 2022-09-27 | 同济大学 | Point cloud semantic segmentation method and device, electronic equipment and storage medium |
CN113628217A (en) * | 2021-08-12 | 2021-11-09 | 江南大学 | Three-dimensional point cloud segmentation method based on image convolution and integrating direction and distance |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063753A (en) * | 2018-07-18 | 2018-12-21 | 北方民族大学 | A kind of three-dimensional point cloud model classification method based on convolutional neural networks |
CN110188802A (en) * | 2019-05-13 | 2019-08-30 | 南京邮电大学 | SSD target detection algorithm based on multi-layer feature map fusion |
CN110197223A (en) * | 2019-05-29 | 2019-09-03 | 北方民族大学 | Point cloud data classification method based on deep learning |
CN111027559A (en) * | 2019-10-31 | 2020-04-17 | 湖南大学 | A Point Cloud Semantic Segmentation Method Based on Dilated Point Convolution Spatial Pyramid Pooling |
CN111242208A (en) * | 2020-01-08 | 2020-06-05 | 深圳大学 | A point cloud classification method, segmentation method and related equipment |
CN111476226A (en) * | 2020-02-29 | 2020-07-31 | 新华三大数据技术有限公司 | Text positioning method and device and model training method |
CN111666836A (en) * | 2020-05-22 | 2020-09-15 | 北京工业大学 | High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network |
CN111723814A (en) * | 2020-06-05 | 2020-09-29 | 中国科学院自动化研究所 | Weakly supervised image semantic segmentation method, system and device based on cross-image association |
CN111753698A (en) * | 2020-06-17 | 2020-10-09 | 东南大学 | A multimodal three-dimensional point cloud segmentation system and method |
CN112036447A (en) * | 2020-08-11 | 2020-12-04 | 复旦大学 | Zero-sample target detection system and learnable semantic and fixed semantic fusion method |
CN112184548A (en) * | 2020-09-07 | 2021-01-05 | 中国科学院深圳先进技术研究院 | Image super-resolution method, device, device and storage medium |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10013507B2 (en) * | 2013-07-01 | 2018-07-03 | Here Global B.V. | Learning synthetic models for roof style classification using point clouds |
CN106682233B (en) * | 2017-01-16 | 2020-03-10 | 华侨大学 | Hash image retrieval method based on deep learning and local feature fusion |
CN107451616A (en) * | 2017-08-01 | 2017-12-08 | 西安电子科技大学 | Multi-spectral remote sensing image terrain classification method based on the semi-supervised transfer learning of depth |
CN108198244B (en) * | 2017-12-20 | 2020-11-10 | 中国农业大学 | Apple leaf point cloud simplification method and device |
CN109035329A (en) * | 2018-08-03 | 2018-12-18 | 厦门大学 | Camera Attitude estimation optimization method based on depth characteristic |
CN109410321B (en) * | 2018-10-17 | 2022-09-20 | 大连理工大学 | Three-dimensional reconstruction method based on convolutional neural network |
CN110081890B (en) * | 2019-05-24 | 2023-02-03 | 长安大学 | Dynamic k nearest neighbor map matching method combined with deep network |
CN110443842B (en) * | 2019-07-24 | 2022-02-15 | 大连理工大学 | Depth map prediction method based on visual angle fusion |
CN111583263B (en) * | 2020-04-30 | 2022-09-23 | 北京工业大学 | A point cloud segmentation method based on joint dynamic graph convolution |
CN111915619A (en) * | 2020-06-05 | 2020-11-10 | 华南理工大学 | A fully convolutional network semantic segmentation method with dual feature extraction and fusion |
CN112149725B (en) * | 2020-09-18 | 2023-08-22 | 南京信息工程大学 | Fourier transform-based spectrum domain map convolution 3D point cloud classification method |
CN112329771B (en) * | 2020-11-02 | 2024-05-14 | 元准智能科技(苏州)有限公司 | Deep learning-based building material sample identification method |
- 2021-03-10: CN CN202110261653.5A patent/CN112967296B/en active Active
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063753A (en) * | 2018-07-18 | 2018-12-21 | 北方民族大学 | A kind of three-dimensional point cloud model classification method based on convolutional neural networks |
CN110188802A (en) * | 2019-05-13 | 2019-08-30 | 南京邮电大学 | SSD target detection algorithm based on multi-layer feature map fusion |
CN110197223A (en) * | 2019-05-29 | 2019-09-03 | 北方民族大学 | Point cloud data classification method based on deep learning |
CN111027559A (en) * | 2019-10-31 | 2020-04-17 | 湖南大学 | A Point Cloud Semantic Segmentation Method Based on Dilated Point Convolution Spatial Pyramid Pooling |
CN111242208A (en) * | 2020-01-08 | 2020-06-05 | 深圳大学 | A point cloud classification method, segmentation method and related equipment |
CN111476226A (en) * | 2020-02-29 | 2020-07-31 | 新华三大数据技术有限公司 | Text positioning method and device and model training method |
CN111666836A (en) * | 2020-05-22 | 2020-09-15 | 北京工业大学 | High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network |
CN111723814A (en) * | 2020-06-05 | 2020-09-29 | 中国科学院自动化研究所 | Weakly supervised image semantic segmentation method, system and device based on cross-image association |
CN111753698A (en) * | 2020-06-17 | 2020-10-09 | 东南大学 | A multimodal three-dimensional point cloud segmentation system and method |
CN112036447A (en) * | 2020-08-11 | 2020-12-04 | 复旦大学 | Zero-sample target detection system and learnable semantic and fixed semantic fusion method |
CN112184548A (en) * | 2020-09-07 | 2021-01-05 | 中国科学院深圳先进技术研究院 | Image super-resolution method, device, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112967296A (en) | 2021-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022160771A1 (en) | Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model | |
CN110322453B (en) | 3D point cloud semantic segmentation method based on position attention and auxiliary network | |
CN111242208B (en) | A point cloud classification method, segmentation method and related equipment | |
CN111861906B (en) | Pavement crack image virtual augmentation model establishment and image virtual augmentation method | |
CN114119979A (en) | A Fine-Grained Image Classification Method Based on Segmentation Mask and Self-Attention Neural Network | |
CN106228185A (en) | A kind of general image classifying and identifying system based on neutral net and method | |
CN113705641A (en) | Hyperspectral image classification method based on rich context network | |
CN104102919A (en) | Image classification method capable of effectively preventing convolutional neural network from being overfit | |
CN112967296B (en) | Point cloud dynamic region graph convolution method, classification method and segmentation method | |
CN107784305A (en) | Facilities vegetable disease recognition method and device based on convolutional neural networks | |
CN107133640A (en) | Image classification method based on topography's block description and Fei Sheer vectors | |
CN115294563A (en) | 3D point cloud analysis method and device based on Transformer and capable of enhancing local semantic learning ability | |
CN108182316B (en) | Electromagnetic simulation method based on artificial intelligence and electromagnetic brain thereof | |
CN110647977B (en) | An optimization method of Tiny-YOLO network for on-board ship target detection | |
CN108846474A (en) | The satellite cloud picture cloud amount calculation method of convolutional neural networks is intensively connected based on multidimensional | |
CN116434064A (en) | A method for extracting soybean planting areas from remote sensing images based on SK-UNet deep learning network | |
Jin et al. | Defect identification of adhesive structure based on DCGAN and YOLOv5 | |
CN112487938A (en) | Method for realizing garbage classification by utilizing deep learning algorithm | |
CN115512226A (en) | LiDAR point cloud filtering method integrated with multi-scale CNN of attention mechanism | |
CN117671666A (en) | A target recognition method based on adaptive graph convolutional neural network | |
CN117496260A (en) | Pollen image classification method based on convolutional neural network and multi-scale cavity attention fusion | |
CN114742997B (en) | Image segmentation-oriented full convolution neural network density peak pruning method | |
CN108537266A (en) | A kind of cloth textured fault sorting technique of depth convolutional network | |
CN113592013B (en) | A 3D point cloud classification method based on graph attention network | |
CN110544249A (en) | A Convolutional Neural Network Quality Discrimination Method for Visual Inspection of Chassis Assembly at Any Angle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 20230628
Address after: No. 1811, 18th Floor, Building 19, Section 1201, Lushan Avenue, Wan'an Street, Tianfu New District, Chengdu, Sichuan, China (Sichuan) Pilot Free Trade Zone, 610213, China
Patentee after: Sichuan Jiulai Technology Co.,Ltd.
Address before: No. 69 Hongguang Avenue, Lijiatuo, Banan District, Chongqing 400054
Patentee before: Chongqing University of Technology