
CN111489358B - Three-dimensional point cloud semantic segmentation method based on deep learning - Google Patents

Three-dimensional point cloud semantic segmentation method based on deep learning

Info

Publication number
CN111489358B
Authority
CN
China
Prior art keywords
point cloud
features
semantic segmentation
feature
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010190589.1A
Other languages
Chinese (zh)
Other versions
CN111489358A (en)
Inventor
孙志刚
江湧
邓世恒
肖力
王卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202010190589.1A priority Critical patent/CN111489358B/en
Publication of CN111489358A publication Critical patent/CN111489358A/en
Application granted granted Critical
Publication of CN111489358B publication Critical patent/CN111489358B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G06T 2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a three-dimensional point cloud semantic segmentation method based on deep learning, belonging to the field of three-dimensional point clouds and pattern recognition. The method comprises the following steps: training a semantic segmentation neural network model with a three-dimensional point cloud training set, where the labels are the ground-truth semantic categories and the model comprises a feature extraction network and a semantic segmentation network; the feature extraction network extracts global and local features of the three-dimensional point cloud; the semantic segmentation network fuses the global and local features of the point cloud and outputs a feature map giving the probability that each point belongs to each semantic category; and inputting the point cloud to be segmented into the trained semantic segmentation neural network model to obtain the point cloud segmentation result. The method uses a local feature extraction module to extract local features of the point cloud at multiple scales, a channel attention boosting module to boost the attention given to important feature channels and suppress unimportant ones, and a weighted multi-class loss function to optimize training, thereby improving the accuracy of the semantic segmentation method.

Description

Three-dimensional point cloud semantic segmentation method based on deep learning
Technical Field
The invention belongs to the field of three-dimensional point cloud and pattern recognition, and particularly relates to a three-dimensional point cloud semantic segmentation method based on deep learning.
Background
Semantic segmentation of three-dimensional point clouds is the basis of semantic understanding and analysis of three-dimensional scenes, and is a research hotspot in fields such as navigation and positioning, pattern recognition and autonomous driving. Semantic segmentation algorithms for three-dimensional point clouds fall mainly into traditional feature-extraction algorithms and deep learning algorithms.
Three-dimensional point cloud segmentation algorithms based on traditional feature extraction perform semantic segmentation by extracting features such as boundary gradients, normal vectors, surface ratios and texture from the point cloud and then clustering and classifying them.
Three-dimensional point cloud segmentation algorithms based on deep learning are divided into voxel CNN, multi-view CNN and point cloud CNN algorithms. The voxel CNN algorithm needs to convert the point cloud into a 3D grid and then apply three-dimensional convolution analogous to the two-dimensional case; because an extra dimension is added, its time and space complexity is too high. The multi-view CNN algorithm maps the three-dimensional point cloud into images from multiple viewpoints, segments the images with an image semantic segmentation algorithm and then fuses the results back into the three-dimensional point cloud; this ignores the spatial structure of the point cloud and is difficult to extend to tasks such as scene understanding. The point cloud CNN algorithm takes the point cloud directly as input without conversion and enables end-to-end training and learning. It mainly obtains a semantic segmentation model by training on labelled three-dimensional point cloud scenes and then uses that model to complete semantic segmentation; this approach has relatively high accuracy and generality and is a hotspot of current research, but for more accurate semantic understanding the accuracy of semantic segmentation still has room for improvement.
Therefore, the accuracy of existing three-dimensional point cloud semantic segmentation algorithms still has room for improvement.
Disclosure of Invention
The invention provides a three-dimensional point cloud semantic segmentation method based on deep learning, aiming to solve the problem of low accuracy of existing three-dimensional point cloud semantic segmentation algorithms and to improve their accuracy.
In order to achieve the above object, according to a first aspect of the present invention, there is provided a deep learning-based three-dimensional point cloud semantic segmentation method, including the following steps:
S1, training a semantic segmentation neural network model by using a three-dimensional point cloud training set, wherein each training sample is a three-dimensional point cloud, the labels are the ground-truth semantic categories, and the semantic segmentation neural network model comprises: a feature extraction network and a semantic segmentation network; the feature extraction network is used for extracting global features and local features of the three-dimensional point cloud; the semantic segmentation network is used for fusing the global features and local features of the point cloud, and the output feature map gives the probability that each point in the point cloud belongs to each semantic category;
and S2, inputting the point cloud to be segmented into the trained semantic segmentation neural network model to obtain the point cloud segmentation result.
Preferably, the feature extraction network takes the point cloud as input and extracts features sequentially through an MLP(64), a first local feature extraction module, a first channel attention boosting module, a second local feature extraction module, a second channel attention boosting module, a connection structure, an MLP(1024) and a max pooling layer;
the local feature extraction module is used for extracting multi-scale local features of the point cloud;
the channel attention boosting module is used for extracting global features and point features of the point cloud, focusing attention on feature channels that carry a large amount of information and suppressing unimportant channel features;
the connection structure connects the local features output by the two local feature extraction modules with the point features output by the second channel attention boosting module, and the final features are obtained through the MLP(1024) and max pooling;
here, the number in parentheses of MLP () represents the number of convolution kernels.
Preferably, the local feature extraction module includes four parallel branches, each extracting local neighborhood features at one scale. Each branch comprises a KNN local neighborhood search module, two 1×1 multilayer perceptrons (MLPs) and a Softmax function, and the local feature extraction module concatenates the features of the four parallel branches to obtain the multi-scale local features.
Preferably, the specific structure of the channel attention boosting module is as follows:
a 1×1 MLP(C) for extracting C-dimensional features;
in the first branch, the C-dimensional features pass through a global average pooling layer, a fully connected layer and a Sigmoid function to obtain the weight of each feature channel, and the weights are multiplied with the C-dimensional features to obtain the channel-attention-boosted point features;
in the second branch, the C-dimensional features are max-pooled to obtain a C-dimensional global feature, the global feature is copied to restore the size of the original C-dimensional features, and it is multiplied by the weights obtained in the first branch to obtain the channel-attention-boosted global feature;
here, the number in parentheses of MLP () represents the number of convolution kernels.
Preferably, the semantic segmentation sub-network of the semantic segmentation network obtains the probability of each point in the point cloud belonging to each semantic category through a connection structure, MLP(512), MLP(256), MLP(128) and MLP(c), and takes the category with the highest probability as the label of the point, thereby realizing semantic segmentation and obtaining the segmentation result, wherein the number in the parentheses of MLP() represents the number of convolution kernels.
Preferably, in the training stage, the current network weight parameters are calculated and updated, and the weighted multi-class loss function is:
Loss = -(1/N) · Σ_{x∈Ω} w_{l(x)} · log( f_{l(x)}(x) )
f_{l(x)}(x) = exp( a_{l(x)}(x) ) / Σ_{k=1}^{K} exp( a_k(x) )
w_{l(x)} = N / ( K · N_{l(x)} )
where Loss is the loss function, f_{l(x)}(x) is the Softmax function, a_{l(x)}(x) is the value of the segmentation network output feature map at the position of point x for its semantic category l(x), K is the number of semantic categories, w_{l(x)} is the weight of points belonging to semantic category l(x), N is the total number of points in the point set Ω input into the network model, and N_{l(x)} is the number of points of category l(x).
To achieve the above object, according to a second aspect of the present invention, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method according to the first aspect.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) The invention uses a local feature extraction module to extract local features of the point cloud. Local neighborhoods at different scales carry different useful information, so a parallel multi-scale network structure is used to extract multi-scale features and fully fuse them together. A channel attention boosting module boosts the attention given to important feature channels and suppresses unimportant ones, and a weighted multi-class loss function optimizes training so that different classes have different learning weights. This reduces overfitting of the network to the unbalanced class distribution during training: classes with fewer points are given larger learning weights and classes with more points are given smaller learning weights, which raises the attention paid to classes with a small share of points; the segmentation effect is improved by continuously reducing the Loss value during training.
(2) The invention trains directly on the point cloud as input and uses the neural network to extract global features of the input point cloud, addressing the unordered nature of point clouds. Finally, the global features and point features are concatenated and the classification probability of each point is obtained through MLP layers, thereby realizing semantic segmentation of the three-dimensional point cloud.
(3) The invention extracts and fuses point cloud features through multi-layer convolution operations and uses batch normalization and activation functions to optimize the network, improving the training effect and allowing the accuracy, loss and other training statistics to be monitored in real time.
(4) The invention preprocesses the S3DIS dataset, including segmenting and sampling the rooms in the dataset so that each point has 9 dimensions of information; the expanded dimensional information helps improve the segmentation accuracy of network training.
Drawings
FIG. 1 is a flowchart of an embodiment of the present invention for implementing a deep learning-based semantic segmentation method for three-dimensional point clouds;
FIG. 2 is a schematic structural diagram of a three-dimensional semantic segmented deep neural network provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a local feature extraction module in a deep neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of KNN neighborhood search in the local feature extraction module according to the embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a channel attention boosting module in a deep neural network according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a connection structure of a convolutional layer, a batch normalization layer, and an activation layer in a deep neural network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in fig. 1, the invention provides a deep learning-based three-dimensional point cloud semantic segmentation method, which includes:
S1, training a semantic segmentation neural network model by using a three-dimensional point cloud training set, wherein each training sample is a point cloud containing actual three-dimensional coordinates, RGB colors and normalized coordinate information, and the labels are the ground-truth semantic categories.
The indoor three-dimensional point cloud dataset is preprocessed by segmentation, sampling and position normalization, and a training set and a validation set are established.
The indoor three-dimensional point cloud dataset, the Stanford Large-Scale 3D Indoor Spaces dataset (S3DIS), comprises three-dimensional point clouds of 271 rooms divided into six areas, Area 1 to Area 6; Area 1 to Area 5 are used as training areas and Area 6 as the validation area. Each room is sampled and position-normalized: the room is divided into a number of 1 m × 1 m cuboid blocks, and when the actual number of points N in a block is greater than 4096, points are sampled randomly, while when N is less than or equal to 4096, points are duplicated randomly, ensuring that each block contains exactly 4096 points. Each point in a block is then position-normalized relative to the corresponding room:
X_norm = X / X_room_max
Y_norm = Y / Y_room_max
Z_norm = Z / Z_room_max
where (X, Y, Z) are the coordinates of a point, X_room_max, Y_room_max and Z_room_max are respectively the maximum values of the room's points along the X, Y and Z axes, and (X_norm, Y_norm, Z_norm) are the coordinates of the normalized point.
Further, each point in the dataset has 9 dimensions: the actual three-dimensional coordinates, the RGB color values and the normalized coordinates, i.e., [X, Y, Z, R, G, B, X_norm, Y_norm, Z_norm].
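As a rough illustration only (not part of the patent text), the block sampling and position normalization described above can be sketched in NumPy as follows; the function names and the per-room maximum `room_max` are hypothetical.

```python
import numpy as np

def sample_block(points, num_points=4096):
    """Randomly sample (or randomly duplicate) points so every block has exactly num_points."""
    n = points.shape[0]
    if n >= num_points:
        idx = np.random.choice(n, num_points, replace=False)
    else:
        # too few points: keep all of them and randomly duplicate the remainder
        idx = np.concatenate([np.arange(n),
                              np.random.choice(n, num_points - n, replace=True)])
    return points[idx]

def make_9d_features(xyz, rgb, room_max):
    """Assemble the 9-dimensional per-point feature [X, Y, Z, R, G, B, X_norm, Y_norm, Z_norm]."""
    xyz_norm = xyz / room_max            # position normalization relative to the room extents
    return np.concatenate([xyz, rgb, xyz_norm], axis=1)

# Hypothetical usage for one 1 m x 1 m block of a room:
# block = sample_block(np.concatenate([xyz, rgb], axis=1))          # (4096, 6)
# feats = make_9d_features(block[:, :3], block[:, 3:6], room_max)   # (4096, 9)
```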
Further, the semantic categories of the point cloud in the S3DIS dataset fall into 13 classes: ceiling, floor, wall, beam, column, window, door, table, chair, sofa, bookcase, board and clutter (others).
As shown in fig. 2, the semantic segmentation neural network model is composed of a feature extraction sub-network and a semantic segmentation sub-network. The network model of the invention adds a multi-scale local feature extraction module to extract multi-scale local features of the point cloud, a channel attention boosting module to boost the attention given to important feature channels, and a weighted multi-class loss function to assist training, thereby improving the effect of three-dimensional semantic segmentation.
Feature extraction subnetwork
The feature extraction sub-network mainly comprises: two local feature extraction modules, two channel attention boosting modules, a connection structure, two multilayer perceptrons (a 1×9 MLP(64) and a 1×1 MLP(1024)), and a max pooling layer. The feature extraction sub-network takes the point cloud directly as input and extracts features sequentially through the MLP(64), the first local feature extraction module, the first channel attention boosting module, the second local feature extraction module, the second channel attention boosting module, the connection structure, the MLP(1024) and the max pooling layer. N in the network represents the number of input points and is taken as 4096, so the input to the network is 4096 × 9, i.e., 4096 points, each with 9 dimensions.
The local feature extraction module is used for extracting multi-scale local features of the point cloud. As shown in fig. 3, the local feature extraction module includes four parallel branches, each extracting local neighborhood features at one scale; each branch includes a KNN local neighborhood search, two 1×1 multilayer perceptrons (MLPs) and a Softmax function, and the module concatenates the features of the four parallel branches to obtain the multi-scale local features. KNN(K) denotes a KNN neighborhood search, the number in parentheses being the number of points in each point's neighborhood; the local neighborhood search is illustrated in fig. 4, where a point's feature together with the edges to each point feature in its local neighborhood constitutes a local neighborhood feature. The first branch extracts point features through a 1×1 MLP(O), with output feature dimension O. The second branch searches a K-neighborhood of the point cloud through KNN(K), first extracts features through a 1×1 MLP(O), then extracts the weight of each edge in the local features through a 1×1 MLP(1) and a Softmax function, and performs matrix multiplication of the MLP(O) output with the weights to obtain the local features at that neighborhood scale, with output feature dimension O. The third branch searches a 2K-neighborhood of the point cloud through KNN(2K), with subsequent processing as in the second branch and output feature dimension O. The fourth branch searches a 3K-neighborhood of the point cloud through KNN(3K), again with subsequent processing as in the second branch and output feature dimension O. Finally, the output features of the four branches are concatenated to obtain the multi-scale local feature of dimension 4 × O. Two local feature extraction modules are used in the feature extraction sub-network: O is 64 in the first and 128 in the second, and K is 16 in both, so the local features extracted by the modules carry information at four scales, namely 1, 16, 32 and 48.
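For illustration, a PyTorch-style sketch of the multi-scale local feature extraction module described above is given below. The patent does not name a framework; the class and function names, the use of nn.Linear in place of 1×1 convolutions (equivalent when applied per point), the ReLU activations, the edge definition, and the assumption that KNN is computed on the XYZ coordinates are all assumptions rather than the patent's own implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def knn_indices(xyz, k):
    """Indices of the k nearest neighbours of every point (B, N, k), by Euclidean distance."""
    dist = torch.cdist(xyz, xyz)                         # (B, N, N) pairwise distances
    return dist.topk(k, dim=-1, largest=False).indices

def gather_neighbors(feats, idx):
    """Gather neighbour features: feats (B, N, C), idx (B, N, k) -> (B, N, k, C)."""
    B, N, _ = feats.shape
    k = idx.shape[-1]
    batch = torch.arange(B, device=feats.device).view(B, 1, 1).expand(B, N, k)
    return feats[batch, idx]

class LocalBranch(nn.Module):
    """One KNN branch: per-edge MLP(O), per-edge weight MLP(1) + Softmax, weighted aggregation."""
    def __init__(self, in_dim, out_dim, k):
        super().__init__()
        self.k = k
        self.feat_mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.weight_mlp = nn.Linear(out_dim, 1)

    def forward(self, feats, xyz):
        idx = knn_indices(xyz, self.k)
        neigh = gather_neighbors(feats, idx)             # (B, N, k, in_dim)
        # edge features are taken here as the neighbour features themselves;
        # the patent's exact edge definition may differ
        edge = self.feat_mlp(neigh)                      # (B, N, k, O)
        w = F.softmax(self.weight_mlp(edge), dim=2)      # per-edge weights summing to 1 over k
        return (w * edge).sum(dim=2)                     # weighted aggregation -> (B, N, O)

class MultiScaleLocalFeature(nn.Module):
    """Four parallel branches (the point itself, K, 2K and 3K neighbourhoods), concatenated to 4*O."""
    def __init__(self, in_dim, out_dim, k=16):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
        self.branches = nn.ModuleList([LocalBranch(in_dim, out_dim, k * s) for s in (1, 2, 3)])

    def forward(self, feats, xyz):
        outs = [self.point_mlp(feats)] + [b(feats, xyz) for b in self.branches]
        return torch.cat(outs, dim=-1)                   # (B, N, 4*O)
```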
As shown in fig. 5, the channel attention boosting module is used to extract high-dimensional features of the point cloud and focus attention on the feature channels carrying a large amount of information while suppressing unimportant channel features. The module contains two branches and outputs two features: a global feature and a point feature. First, C-dimensional features are extracted through a 1×1 MLP(C). Then, from bottom to top, in the first branch the C-dimensional features pass through a global average pooling layer, a fully connected layer FC and a Sigmoid function to obtain the weight of each feature channel, and the weights are multiplied with the C-dimensional features to obtain the channel-attention-boosted point features. In the second branch, the C-dimensional features are max-pooled to obtain a C-dimensional global feature, which is copied to restore the size of the original C-dimensional features and then multiplied by the weights obtained in the first branch to obtain the channel-attention-boosted global feature. Two channel attention boosting modules are used in the feature extraction sub-network: the first takes C = 64 and outputs 64-dimensional global and point features; the second takes C = 128 and outputs 128-dimensional global and point features.
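As a further illustrative sketch (again not the patent's actual code), the channel attention boosting module might be written as follows; the class name and the use of nn.Linear for the 1×1 MLP(C) and the fully connected layer are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttentionBoost(nn.Module):
    """Two-branch channel attention: boosted per-point features and a boosted global feature."""
    def __init__(self, in_dim, c):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, c), nn.ReLU())   # stands in for the 1x1 MLP(C)
        self.fc = nn.Linear(c, c)                                   # fully connected layer FC
        self.sigmoid = nn.Sigmoid()

    def forward(self, feats):                         # feats: (B, N, in_dim)
        x = self.mlp(feats)                           # (B, N, C)
        # branch 1: global average pooling -> FC -> Sigmoid gives one weight per channel
        w = self.sigmoid(self.fc(x.mean(dim=1)))      # (B, C)
        point_feat = x * w.unsqueeze(1)               # channel-boosted point features (B, N, C)
        # branch 2: max pooling gives a global feature, copied to every point and re-weighted
        g = x.max(dim=1, keepdim=True).values         # (B, 1, C)
        global_feat = (g * w.unsqueeze(1)).expand(-1, x.size(1), -1)
        return point_feat, global_feat
```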
The connection structure of the feature extraction sub-network concatenates the local features output by the two local feature extraction modules with the point features output by the second channel attention boosting module, and the final global feature is then obtained through a 1×1 MLP(1024) and a max pooling layer.
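A minimal sketch of this connection step, assuming the dimensions stated above (4 × 64 = 256 and 4 × 128 = 512 for the two multi-scale local features and 128 for the point features, i.e. an 896-dimensional concatenation) and using a hypothetical mlp_1024 as a stand-in for the 1×1 MLP(1024):

```python
import torch
import torch.nn as nn

mlp_1024 = nn.Sequential(nn.Linear(896, 1024), nn.ReLU())    # assumed stand-in for the 1x1 MLP(1024)

def final_global_feature(local1, local2, point_feat):
    """Concatenate per-point features, lift to 1024 dimensions, and max-pool over the points."""
    fused = torch.cat([local1, local2, point_feat], dim=-1)  # (B, N, 256 + 512 + 128) = (B, N, 896)
    lifted = mlp_1024(fused)                                  # (B, N, 1024)
    return lifted.max(dim=1).values                           # (B, 1024) global feature
```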
Concatenating only the point features and the global features does not fully exploit the local features of the point cloud, and the local structural information between points helps improve the accuracy of semantic segmentation.
Semantic segmentation sub-network
The semantic segmentation sub-network mainly comprises a connection structure and four 1×1 multilayer perceptrons, MLP(512), MLP(256), MLP(128) and MLP(c), where c is the number of semantic categories, 13. The four 1×1 MLPs represent convolution layers with 1×1 convolution kernels, and the number in the parentheses of MLP() is the number of convolution kernels, which is also the dimension of the output features. N in the network represents the number of input points, 4096. The semantic segmentation sub-network obtains the segmentation result through the connection structure, MLP(512), MLP(256), MLP(128) and MLP(c).
The connection structure of the semantic segmentation sub-network concatenates the final global feature and the 64-dimensional and 128-dimensional global features from the two channel attention boosting modules with point features from several levels (the 896-dimensional features output by the connection structure in the feature extraction sub-network, the 1024-dimensional features output by the MLP(1024), and the 1024-dimensional global feature output by max pooling), and then passes the result sequentially through 1×1 MLP(512), 1×1 MLP(256), 1×1 MLP(128) and 1×1 MLP(c), where c is the number of semantic categories, to obtain the score of each point in the point cloud for each semantic category; the category with the largest score is taken as the label of the point, thereby realizing semantic segmentation.
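The per-point classifier can be sketched as a stack of shared 1×1 convolutions, as below; this is an assumed PyTorch rendering of the MLP(512)-MLP(256)-MLP(128)-MLP(c) chain rather than the patent's own code, and in_dim stands for whatever dimension the concatenated per-point features have.

```python
import torch
import torch.nn as nn

class SegmentationHead(nn.Module):
    """Per-point classifier: shared 1x1 convolutions with 512, 256, 128 and c output channels."""
    def __init__(self, in_dim, num_classes=13):
        super().__init__()
        layers, dims = [], [in_dim, 512, 256, 128]
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Conv1d(d_in, d_out, 1), nn.BatchNorm1d(d_out), nn.ReLU()]
        layers += [nn.Conv1d(128, num_classes, 1)]           # MLP(c): raw per-class scores
        self.head = nn.Sequential(*layers)

    def forward(self, per_point_feats):                      # (B, in_dim, N), channels first
        return self.head(per_point_feats)                    # (B, c, N)

# Hypothetical usage: labels = head(per_point_feats).argmax(dim=1) takes the highest-scoring
# category as each point's label, as described above.
```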
Further, as shown in fig. 6, the convolution layers and deconvolution layers in the semantic segmentation model are followed by batch normalization after convolution or deconvolution, and then by a ReLU activation function. The batch normalization layer counters gradient explosion or vanishing during back-propagation and alleviates overfitting, and the activation layer adds nonlinearity to the neural network model so that it can approximate arbitrary functions.
Loss values are calculated using the weighted multi-class loss function based on the ground-truth categories to obtain the prediction error, back-propagation is performed with the prediction error, the current network weight parameters are calculated and updated, and the network weights are updated repeatedly with the training set to obtain the final network weight parameters, thereby yielding the trained three-dimensional point cloud semantic segmentation deep neural network model.
The weighted multi-class loss function used in training the three-dimensional point cloud semantic segmentation deep neural network model is as follows:
Loss = -(1/N) · Σ_{x∈Ω} w_{l(x)} · log( f_{l(x)}(x) )
f_{l(x)}(x) = exp( a_{l(x)}(x) ) / Σ_{k=1}^{K} exp( a_k(x) )
w_{l(x)} = N / ( K · N_{l(x)} )
Here, Loss is a weighted Softmax cross-entropy loss function that measures the difference between the predicted values and the ground truth; the smaller the cross-entropy, the better the model's predictions. f_{l(x)}(x) is the Softmax function, and a_{l(x)}(x) is the value of the segmentation network output feature map at the position of point x for its class l(x); K is the number of semantic categories, and the Softmax function makes the multi-class prediction probabilities at each position in the feature map sum to 1. w_{l(x)} is the weight of points belonging to semantic category l(x), N is the total number of points in the dataset, and N_{l(x)} is the number of points of category l(x). The weight w_l gives categories with relatively few points a large learning weight and categories with relatively many points a small learning weight. The segmentation effect is improved by continuously reducing the value of Loss during training.
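A hedged PyTorch sketch of this loss follows. The general form (a weighted Softmax cross-entropy) comes from the description above, but the exact weight formula is not spelled out here, so an inverse-class-frequency weight w_l = N / (K · N_l) is assumed, and F.cross_entropy normalizes by the sum of the weights rather than by a plain 1/N.

```python
import torch
import torch.nn.functional as F

def weighted_cross_entropy(scores, labels, class_counts):
    """Weighted Softmax cross-entropy; classes with fewer points receive larger weights.

    scores: (B, c, N) raw per-point class scores, labels: (B, N) ground-truth categories,
    class_counts: (c,) number of training points of each class.
    """
    n_total = class_counts.sum().float()
    k = class_counts.numel()                                     # number of semantic categories
    weights = n_total / (k * class_counts.clamp(min=1).float())  # assumed inverse-frequency weights
    return F.cross_entropy(scores, labels, weight=weights)
```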
The training set consists of Area 1 to Area 5 of the preprocessed S3DIS dataset. The training parameters are configured as follows: the number of points N in the input point cloud is 4096, with nine channels comprising the XYZ coordinates, RGB colors and position-normalized coordinates. An Adam optimizer is used during model training, with an initial learning rate of 0.001, momentum of 0.9, batch size of 24, decay rate of 0.5, decay step of 300000, and a maximum of 50 iterations. Loss, IoU, recall and other metrics are monitored during training; after each iteration the IoU is compared with the historical maximum IoU, and if it is larger the current model is saved and the historical maximum is updated, so that the model saved at the end of training is the one with the highest IoU.
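For reference, an assumed PyTorch equivalent of the stated optimizer configuration (Adam, initial learning rate 0.001, decay rate 0.5 every 300000 steps) is sketched below; the helper name is illustrative, and the momentum of 0.9 mentioned above is not set explicitly since it matches Adam's default first-moment coefficient. Checkpointing by best IoU would then amount to evaluating on Area 6 after each iteration and saving the model whenever the IoU improves, as described above.

```python
import torch
import torch.nn as nn

def make_optimizer(model: nn.Module):
    """Optimizer and learning-rate schedule matching the stated hyperparameters (assumed mapping)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=300000, gamma=0.5)
    return optimizer, scheduler
```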
And S2, inputting the point cloud to be segmented into the trained three-dimensional point cloud semantic segmentation neural network model to obtain the point cloud segmentation result.
When the trained three-dimensional point cloud semantic segmentation model is used for semantic segmentation, the semantic category of each point in the point cloud can be obtained effectively, improving the accuracy of the semantic segmentation method. Trained on Area 1 to Area 5 of the S3DIS dataset and validated on Area 6, the method reaches an accuracy of 90.14% and an mIoU of 72.83%.
In practical application, the method can more accurately perform semantic segmentation of the three-dimensional point cloud, can realize higher precision compared with the conventional method, and is suitable for complex scenes.
It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention, and is not intended to limit the invention to the particular forms disclosed, since various modifications, substitutions and improvements within the spirit and scope of the invention are possible and within the scope of the appended claims.

Claims (6)

1. A three-dimensional point cloud semantic segmentation method based on deep learning is characterized by comprising the following steps:
S1, training a semantic segmentation neural network model by using a three-dimensional point cloud training set, wherein each training sample is a three-dimensional point cloud, the labels are the ground-truth semantic categories, and the semantic segmentation neural network model comprises: a feature extraction network and a semantic segmentation network;
the feature extraction network is used for extracting global features and local features of the three-dimensional point cloud;
the semantic segmentation network is used for fusing the global features and the local features of the point cloud, and the output feature map corresponds to the probability that each point in the point cloud belongs to each semantic category;
S2, inputting the point cloud to be segmented into the trained semantic segmentation neural network model to obtain the point cloud segmentation result;
the feature extraction network takes the point cloud as input and extracts features sequentially through a1 × 9MLP (64), a first local feature extraction module, a first channel attention promotion module, a second local feature extraction module, a second channel attention promotion module, a connection structure, a1 × 1MLP (1024) and a maximum pooling layer; wherein,
the local feature extraction module is used for extracting multi-scale local features of the point cloud;
the channel attention boosting module is used for extracting global features and point features of the point cloud, focusing attention on feature channels carrying a large amount of information and suppressing unimportant channel features;
the connection structure is used for connecting the local features output by the two local feature extraction modules with the point features output by the second channel attention boosting module, and obtaining the final features through a 1×1 MLP(1024) and the maximum pooling layer;
the number in parentheses of MLP () represents the number of convolution kernels.
2. The method of semantic segmentation of three-dimensional point clouds according to claim 1, wherein the local feature extraction module includes four parallel branches, each branch extracting a scale of local neighborhood features,
a first branch for extracting point features through a 1×1 MLP(O), the output feature dimension being O;
a second branch for searching a K-neighborhood of the point cloud through KNN(K), extracting features through a 1×1 MLP(O), extracting the weight of each edge in the local features through a 1×1 MLP(1) and a Softmax function, and performing matrix multiplication of the MLP(O) output with the weights to obtain the local features at that neighborhood scale, the output feature dimension being O;
a third branch for searching a 2K-neighborhood of the point cloud through KNN(2K), the subsequent processing being the same as the second branch with output feature dimension O;
a fourth branch for searching a 3K-neighborhood of the point cloud through KNN(3K), the subsequent processing being the same as the second branch with output feature dimension O;
and finally, the output features of the four branches are connected to obtain the multi-scale local features, with output feature dimension 4 × O.
3. The three-dimensional point cloud semantic segmentation method according to claim 1, wherein the channel attention boosting module has a specific structure as follows:
a 1×1 MLP(C) for extracting C-dimensional features;
the first branch is used for passing the C-dimensional features through a global average pooling layer, a fully connected layer and a Sigmoid function to obtain the weight of each feature channel, and multiplying the weights with the C-dimensional features to obtain the channel-attention-boosted point features;
the second branch is used for max-pooling the C-dimensional features to obtain a C-dimensional global feature, copying the global feature to restore the size of the original C-dimensional features, and multiplying it by the weights obtained in the first branch to obtain the channel-attention-boosted global feature;
here, the number in parentheses of MLP () represents the number of convolution kernels.
4. The three-dimensional point cloud semantic segmentation method according to any one of claims 1 to 3, wherein the semantic segmentation sub-network of the semantic segmentation network obtains the probability of each point in the point cloud belonging to each semantic category through a connection structure, 1×1 MLP(512), 1×1 MLP(256), 1×1 MLP(128) and 1×1 MLP(c), and the category with the highest probability is used as the label of the point, thereby realizing semantic segmentation and obtaining the segmentation result, wherein c represents the number of semantic categories.
5. The method for semantic segmentation of three-dimensional point cloud according to any one of claims 1 to 3, wherein in the training stage, the current network weight parameters are calculated and updated, and the weighted multi-class loss function is:
Loss = -(1/N) · Σ_{x∈Ω} w_{l(x)} · log( f_{l(x)}(x) )
f_{l(x)}(x) = exp( a_{l(x)}(x) ) / Σ_{k=1}^{K} exp( a_k(x) )
w_{l(x)} = N / ( K · N_{l(x)} )
where Loss is the loss function, f_{l(x)}(x) is the Softmax function, a_{l(x)}(x) is the value of the segmentation network output feature map at the position of point x for its semantic category l(x), K is the number of semantic categories, w_{l(x)} is the weight of points belonging to semantic category l(x), N is the total number of points in the point set Ω input into the network model, and N_{l(x)} is the number of points of category l(x).
6. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the deep learning based three-dimensional point cloud semantic segmentation method according to any one of claims 1-5.
CN202010190589.1A 2020-03-18 2020-03-18 Three-dimensional point cloud semantic segmentation method based on deep learning Active CN111489358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010190589.1A CN111489358B (en) 2020-03-18 2020-03-18 Three-dimensional point cloud semantic segmentation method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010190589.1A CN111489358B (en) 2020-03-18 2020-03-18 Three-dimensional point cloud semantic segmentation method based on deep learning

Publications (2)

Publication Number Publication Date
CN111489358A CN111489358A (en) 2020-08-04
CN111489358B true CN111489358B (en) 2022-06-14

Family

ID=71810781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010190589.1A Active CN111489358B (en) 2020-03-18 2020-03-18 Three-dimensional point cloud semantic segmentation method based on deep learning

Country Status (1)

Country Link
CN (1) CN111489358B (en)

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931790A (en) * 2020-08-10 2020-11-13 武汉慧通智云信息技术有限公司 Laser point cloud extraction method and device
CN111882558A (en) * 2020-08-11 2020-11-03 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium
CN112017196B (en) * 2020-08-27 2022-02-22 重庆邮电大学 Three-dimensional tooth model mesh segmentation method based on local attention mechanism
CN111950658B (en) * 2020-08-28 2024-02-09 南京大学 Deep learning-based LiDAR point cloud and optical image priori coupling classification method
CN112070760B (en) * 2020-09-17 2022-11-08 安徽大学 Bone mass detection method based on convolutional neural network
CN112200248B (en) * 2020-10-13 2023-05-12 北京理工大学 Point cloud semantic segmentation method, system and storage medium based on DBSCAN clustering under urban road environment
CN112233124B (en) * 2020-10-14 2022-05-17 华东交通大学 Point cloud semantic segmentation method and system based on countermeasure learning and multi-modal learning
CN112257597B (en) * 2020-10-22 2024-03-15 中国人民解放军战略支援部队信息工程大学 Semantic segmentation method for point cloud data
CN112330699B (en) * 2020-11-14 2022-09-16 重庆邮电大学 Three-dimensional point cloud segmentation method based on overlapping region alignment
CN112418235B (en) * 2020-11-20 2023-05-30 中南大学 Point cloud semantic segmentation method based on expansion nearest neighbor feature enhancement
EP4009236A1 (en) * 2020-12-02 2022-06-08 Aptiv Technologies Limited Method for determining a semantic segmentation of an environment of a vehicle
CN112541535B (en) * 2020-12-09 2024-01-05 中国科学院深圳先进技术研究院 Three-dimensional point cloud classification method based on complementary multi-branch deep learning
CN112541908B (en) * 2020-12-18 2023-08-29 广东工业大学 Casting flash recognition method based on machine vision and storage medium
CN112560965B (en) * 2020-12-18 2024-04-05 中国科学院深圳先进技术研究院 Image semantic segmentation method, storage medium and computer device
CN112560865B (en) * 2020-12-23 2022-08-12 清华大学 Semantic segmentation method for point cloud under outdoor large scene
CN113160414B (en) * 2021-01-25 2024-06-07 北京豆牛网络科技有限公司 Automatic goods allowance recognition method, device, electronic equipment and computer readable medium
CN112836734A (en) * 2021-01-27 2021-05-25 深圳市华汉伟业科技有限公司 Heterogeneous data fusion method and device and storage medium
CN112907602B (en) * 2021-01-28 2022-07-19 中北大学 Three-dimensional scene point cloud segmentation method based on improved K-nearest neighbor algorithm
CN112819080B (en) * 2021-02-05 2022-09-02 四川大学 High-precision universal three-dimensional point cloud identification method
CN112837420B (en) * 2021-03-09 2024-01-09 西北大学 Shape complement method and system for terracotta soldiers and horses point cloud based on multi-scale and folding structure
CN113011430B (en) * 2021-03-23 2023-01-20 中国科学院自动化研究所 Large-scale point cloud semantic segmentation method and system
CN113850811B (en) * 2021-03-25 2024-05-28 北京大学 Three-dimensional point cloud instance segmentation method based on multi-scale clustering and mask scoring
CN113012177A (en) * 2021-04-02 2021-06-22 上海交通大学 Three-dimensional point cloud segmentation method based on geometric feature extraction and edge perception coding
CN113421267B (en) * 2021-05-07 2024-04-12 江苏大学 Point cloud semantic and instance joint segmentation method and system based on improved PointConv
CN113345101B (en) * 2021-05-20 2023-07-25 北京百度网讯科技有限公司 Three-dimensional point cloud labeling method, device, equipment and storage medium
CN113159232A (en) * 2021-05-21 2021-07-23 西南大学 Three-dimensional target classification and segmentation method
CN113177555B (en) * 2021-05-21 2022-11-04 西南大学 Target processing method and device based on cross-level, cross-scale and cross-attention mechanism
CN113554654B (en) * 2021-06-07 2024-03-22 之江实验室 Point cloud feature extraction system and classification segmentation method based on graph neural network
CN113554653B (en) * 2021-06-07 2024-10-29 之江实验室 Semantic segmentation method based on mutual information calibration point cloud data long tail distribution
CN113435461B (en) * 2021-06-11 2023-07-14 深圳市规划和自然资源数据管理中心(深圳市空间地理信息中心) Point cloud local feature extraction method, device, equipment and storage medium
CN113361538B (en) * 2021-06-22 2022-09-02 中国科学技术大学 Point cloud classification and segmentation method and system based on self-adaptive selection neighborhood
CN113516663B (en) * 2021-06-30 2022-09-27 同济大学 Point cloud semantic segmentation method and device, electronic equipment and storage medium
CN113657387B (en) * 2021-07-07 2023-10-13 复旦大学 Semi-supervised three-dimensional point cloud semantic segmentation method based on neural network
CN113538474B (en) * 2021-07-12 2023-08-22 大连民族大学 3D point cloud segmentation target detection system based on edge feature fusion
CN113449744A (en) * 2021-07-15 2021-09-28 东南大学 Three-dimensional point cloud semantic segmentation method based on depth feature expression
CN113744186B (en) * 2021-07-26 2024-09-24 南开大学 Method for detecting surface defects of workpiece by fusing projection point set segmentation network
CN113486988B (en) * 2021-08-04 2022-02-15 广东工业大学 Point cloud completion device and method based on adaptive self-attention transformation network
CN113627440A (en) * 2021-08-14 2021-11-09 张冉 Large-scale point cloud semantic segmentation method based on lightweight neural network
CN113781432B (en) * 2021-09-10 2023-11-21 浙江大学 Laser scanning automatic laying on-line detection method and device based on deep learning
CN113807233B (en) * 2021-09-14 2023-04-07 电子科技大学 Point cloud feature extraction method, classification method and segmentation method based on high-order term reference surface learning
CN113870272B (en) * 2021-10-12 2024-10-08 中国联合网络通信集团有限公司 Point cloud segmentation method and device and computer readable storage medium
CN114241226A (en) * 2021-12-07 2022-03-25 电子科技大学 Three-dimensional point cloud semantic segmentation method based on multi-neighborhood characteristics of hybrid model
CN114419372B (en) * 2022-01-13 2024-11-01 南京邮电大学 Multi-scale point cloud classification method and system
CN114529757B (en) * 2022-01-21 2023-04-18 四川大学 Cross-modal single-sample three-dimensional point cloud segmentation method
CN114359562B (en) * 2022-03-20 2022-06-17 宁波博登智能科技有限公司 Automatic semantic segmentation and labeling system and method for four-dimensional point cloud
CN114419570B (en) * 2022-03-28 2023-04-07 苏州浪潮智能科技有限公司 Point cloud data identification method and device, electronic equipment and storage medium
CN114419381B (en) * 2022-04-01 2022-06-24 城云科技(中国)有限公司 Semantic segmentation method and road ponding detection method and device applying same
CN114792372B (en) * 2022-06-22 2022-11-04 广东工业大学 Three-dimensional point cloud semantic segmentation method and system based on multi-head two-stage attention
CN115170585B (en) * 2022-07-12 2024-06-14 上海人工智能创新中心 Three-dimensional point cloud semantic segmentation method
CN115294562B (en) * 2022-07-19 2023-05-09 广西大学 Intelligent sensing method for operation environment of plant protection robot
CN115311274B (en) * 2022-10-11 2022-12-23 四川路桥华东建设有限责任公司 Weld joint detection method and system based on spatial transformation self-attention module
CN115471513B (en) * 2022-11-01 2023-03-31 小米汽车科技有限公司 Point cloud segmentation method and device
CN115578393B (en) * 2022-12-09 2023-03-10 腾讯科技(深圳)有限公司 Key point detection method, key point training method, key point detection device, key point training device, key point detection equipment, key point detection medium and key point detection medium
CN116129207B (en) * 2023-04-18 2023-08-04 江西师范大学 Image data processing method for attention of multi-scale channel
CN116246039B (en) * 2023-05-12 2023-07-14 中国空气动力研究与发展中心计算空气动力研究所 Three-dimensional flow field grid classification segmentation method based on deep learning
CN116824188B (en) * 2023-06-05 2024-04-09 腾晖科技建筑智能(深圳)有限公司 Hanging object type identification method and system based on multi-neural network integrated learning
CN116524197B (en) * 2023-06-30 2023-09-29 厦门微亚智能科技股份有限公司 Point cloud segmentation method, device and equipment combining edge points and depth network
CN117708560A (en) * 2023-11-27 2024-03-15 云南电网有限责任公司昭通供电局 Multi-information PointNet++ fusion method for constructing DEM (digital elevation model) based on airborne laser radar data
CN118154929B (en) * 2024-01-03 2024-09-03 华中科技大学 Rock-soil particle three-dimensional point cloud semantic classification method
CN118351332B (en) * 2024-06-18 2024-08-20 山东财经大学 Automatic driving vehicle three-dimensional feature extraction method and system based on point cloud data

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020036742A1 (en) * 2018-08-17 2020-02-20 Nec Laboratories America, Inc. Dense three-dimensional correspondence estimation with multi-level metric learning and hierarchical matching
CN110197223A (en) * 2019-05-29 2019-09-03 北方民族大学 Point cloud data classification method based on deep learning
CN110660062A (en) * 2019-08-31 2020-01-07 南京理工大学 Point cloud instance segmentation method and system based on PointNet
CN110827398A (en) * 2019-11-04 2020-02-21 北京建筑大学 Indoor three-dimensional point cloud automatic semantic segmentation algorithm based on deep neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A multi-scale fully convolutional network for semantic labeling of 3D point clouds; Mohammed Yousefhussien et al.; ISPRS Journal; 2018-05-16; pp. 191-204 *

Also Published As

Publication number Publication date
CN111489358A (en) 2020-08-04

Similar Documents

Publication Publication Date Title
CN111489358B (en) Three-dimensional point cloud semantic segmentation method based on deep learning
CN110930454B (en) Six-degree-of-freedom pose estimation algorithm based on boundary box outer key point positioning
CN109919108B (en) Remote sensing image rapid target detection method based on deep hash auxiliary network
WO2022252274A1 (en) Point cloud segmentation and virtual environment generation method and apparatus based on pointnet network
CN110532920B (en) Face recognition method for small-quantity data set based on FaceNet method
CN111242208A (en) Point cloud classification method, point cloud segmentation method and related equipment
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN112907602B (en) Three-dimensional scene point cloud segmentation method based on improved K-nearest neighbor algorithm
CN111583263A (en) Point cloud segmentation method based on joint dynamic graph convolution
CN111625667A (en) Three-dimensional model cross-domain retrieval method and system based on complex background image
CN111612008A (en) Image segmentation method based on convolution network
CN113435461B (en) Point cloud local feature extraction method, device, equipment and storage medium
CN110263855B (en) Method for classifying images by utilizing common-basis capsule projection
CN110532409B (en) Image retrieval method based on heterogeneous bilinear attention network
CN115222998B (en) Image classification method
CN112364747B (en) Target detection method under limited sample
CN113989340A (en) Point cloud registration method based on distribution
CN111899203A (en) Real image generation method based on label graph under unsupervised training and storage medium
CN114187506B (en) Remote sensing image scene classification method of viewpoint-aware dynamic routing capsule network
CN115049833A (en) Point cloud component segmentation method based on local feature enhancement and similarity measurement
CN114066844A (en) Pneumonia X-ray image analysis model and method based on attention superposition and feature fusion
CN117671666A (en) Target identification method based on self-adaptive graph convolution neural network
CN117173595A (en) Unmanned aerial vehicle aerial image target detection method based on improved YOLOv7
CN116597267A (en) Image recognition method, device, computer equipment and storage medium
Li et al. Few-shot meta-learning on point cloud for semantic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant