
CN111027559A - Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling - Google Patents

Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling

Info

Publication number
CN111027559A
CN111027559A (application CN201911048539.3A)
Authority
CN
China
Prior art keywords
point
point cloud
convolution
expansion
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911048539.3A
Other languages
Chinese (zh)
Inventor
余洪山
何勇
邹艳梅
杨振耕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute Of Hunan University
Hunan University
Original Assignee
Shenzhen Research Institute Of Hunan University
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Research Institute Of Hunan University, Hunan University filed Critical Shenzhen Research Institute Of Hunan University
Priority to CN201911048539.3A priority Critical patent/CN111027559A/en
Publication of CN111027559A publication Critical patent/CN111027559A/en
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling. The method first obtains the center points of point cloud subsets with a farthest point sampling algorithm and determines the subset ranges with the KNN algorithm; it then extracts the features of each point cloud subset with expansion point convolution spatial pyramid pooling, which enlarges the receptive field of the point convolution and enriches the feature extraction of multi-scale targets in the scene; next, a simple and effective decoding module performs the feature decoding, improving the segmentation accuracy of sparse point clouds; finally, a fully connected layer classifies the label of each point. The method has outstanding advantages such as high segmentation accuracy and adaptability to a variety of scenes.

Description

Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling
Technical Field
The invention belongs to the field of computer vision and relates to a 3D semantic segmentation method based on expansion point (i.e. dilated point) convolution spatial pyramid pooling.
Background
Point cloud semantic segmentation is one of the main research difficulties and hot spots in 3D scene analysis, and how to acquire the local features, global features and scene context information of a point cloud efficiently and quickly is a problem urgently awaiting a solution. Point cloud semantic segmentation classifies the scene point by point from the acquired point cloud features, thereby achieving scene analysis. However, scene point clouds are unordered, sparse and uneven in density, and scene targets have multi-scale characteristics, all of which seriously hamper the acquisition of point cloud features. At present, the schemes for point cloud semantic segmentation mainly comprise multi-view schemes, voxelization schemes, and schemes that process the point cloud directly. Multi-view schemes project the point cloud scene into images from different views and feed them into a traditional 2D convolutional neural network; voxelization schemes divide the point cloud into 3D grids and extract features with a 3D convolutional neural network. Both schemes convert the irregular point cloud into regular data, which avoids the limitations of point clouds to a certain extent, but they lose part of the geometric information of the scene point cloud, introduce quantization errors, and their segmentation accuracy depends on the performance of the traditional convolutional neural network. Schemes that process the point cloud directly have received more and more attention because they retain the point cloud information to the greatest extent. To obtain context information at different scales, point cloud semantic segmentation networks usually adopt multi-scale grouping of the point cloud, which easily incurs a higher computation cost. In addition, the ability of existing point cloud semantic segmentation networks to acquire local features and context information still needs improvement.
How to improve the acquisition of local features and context information while reducing the network computation cost is the main technical problem urgently to be solved in this field.
Disclosure of Invention
To address these problems, the invention provides a point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling.
A point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling comprises the following steps:
step 1: based on the ScanNet dataset point cloud, obtain the center points of point cloud subsets with a farthest point sampling algorithm;
the input point cloud of the network is P = {p_1, p_2, p_3, …, p_n}; an iterative farthest point sampling algorithm selects a subset P_sub-i = {p_i1, p_i2, p_i3, …, p_im} from the input point cloud such that each p_ij is farthest from the other points in the subset;
step 2: determine the range of each point cloud subset with a nearest neighbor algorithm, based on the center points obtained in step 1;
the inputs are a P × (D + C) matrix and a P1 × D matrix, and the output is a P1 × K1 × (D + C) matrix, where P is the number of input points, P1 is the number of sampled center points, D is the dimension of each point's coordinate information, C is the dimension of the point feature information, and K1 is the number of neighborhood points per center point;
the KNN algorithm finds the K1 neighborhood points closest to each center point, which are sorted and numbered by their distance to the center point. The K1 neighborhood points together with the center point form the center point neighborhood, also referred to as the local neighborhood.
And step 3: pyramid pooling extraction of local neighborhood features F using improved expansion point convolution space1And obtaining P1An image extraction point;
input form is P1×K1Matrix information of x (D + C), output form P1Matrix information of x (D + C "). Wherein, C' is a point characteristic dimension obtained by pyramid pooling of the improved expansion point convolution space in a local neighborhood abstraction;
step 4: sample and group the P1 abstract point clouds obtained in step 3.
the input is a P1 × (D + C′) matrix and the output is P2 × K2 × (D + C′), where P2 is the number of center points of the second down-sampling and K2 is the number of neighborhood points per center point in the second sampling; based on the P1 abstract point clouds, steps 1 and 2 are repeated to obtain P2 down-sampled center points and K2 neighborhood points per center point, yielding P2 local neighborhoods;
step 5: extract the local neighborhood features F2 with PointNet;
the input is a P2 × K2 × (D + C′) matrix and the output is P2 × (D + C″), where C″ is the point feature dimension abstracted by PointNet in each local neighborhood;
step 6: decode the P2 abstract point clouds containing the F2 features to obtain P1 abstract point clouds containing the F3 features;
the input is a P2 × (D + C″) matrix and the output is P1 × (D + C‴), where C‴ is the feature dimension of the decoded point cloud;
step 7: decode the P1 abstract point clouds containing the F3 features to obtain P abstract point clouds containing the F4 features;
the input is a P1 × (D + C‴) matrix and the output is P × (D + C″″), where C″″ is the feature dimension of the decoded point cloud;
step 8: obtain the label of each point with a fully connected layer.
the input is P × (D + C″″) and the output is P × k, where k is the number of scene point cloud categories.
Further, the extraction of point cloud local neighborhood features by improved expansion point convolution spatial pyramid pooling comprises the following four steps: 1) improve the conventional expansion point convolution; 2) extract local neighborhood features with improved expansion point convolution channels of different expansion rates; 3) fuse the features of all channels; 4) reduce the feature dimensionality;
The specific extraction process is as follows:
1) replace the point convolution kernel function (MLP) to improve the expansion point convolution;
the continuous convolution of the conventional expansion point convolution is defined as:

$$\mathrm{Conv}(H,G)(p_i)=\int_{p_j\in\mathcal{N}(p_i)}G(p_j-p_i)\,H(p_j)\,dp_j$$

where H is a continuous feature function that assigns a feature value to p_j, and G is a continuous kernel function that maps the distance from p_j to p_i to a kernel weight. Using Monte Carlo integration, the continuous convolution is converted into:

$$\mathrm{Conv}(H,G)(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}G(p_j-p_i)\,H(p_j)$$

where d is the expansion rate of the expansion point convolution and N_d(p_i) is the dilated neighborhood of p_i. The continuous kernel function G(·) is replaced with a multi-layer perceptron:

$$\mathrm{Conv}(H,\mathrm{MLP})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{MLP}(p;\theta)\,H(p_j)$$

where p is the relative position of the neighborhood point with respect to the center point (the Euclidean distance is used) and θ is the set of MLP parameters.
To obtain more local neighborhood features, the local neighborhood point features are abstracted to a higher dimension for richer information, so the improved expansion point convolution replaces the kernel function G(·) with PointNet:

$$\mathrm{Conv}(H,\mathrm{PN})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{PN}(p;\theta')\,H(p_j)$$

where PN is a PointNet network and θ′ is the set of PN parameters;
2) extract the local neighborhood features with each improved expansion point convolution;
the input is a P1 × K1 × (D + C) matrix and the output is P1 × (D + Ci), where Ci is the point feature dimension abstracted in the local neighborhood by the i-th improved expansion point convolution;
the spatial pyramid pooling has i channels, each carrying an improved expansion point convolution with expansion rate d1, d2, …, di respectively; this yields i groups of local neighborhood context information at different scales, P1 × (D + Ci);
3) fuse the feature information extracted by each improved expansion point convolution;
the inputs are the P1 × (D + Ci) matrices and the output is Σ P1 × (D + Ci), where i is the number of spatial pyramid pooling channels, which is also the number of improved expansion point convolutions; the context information of different scales is fused by concatenation;
4) reduce the dimensionality of the feature information;
the input is Σ P1 × (D + Ci) and the output is P1 × (D + C′).
As the number of channels grows, the number of local neighborhood features increases by a factor of i, raising the computation cost of the encoding at the back end of the encoding layer. A 1 × 1 convolution is therefore applied to the fused feature information to reduce the feature dimensionality.
Compared with traditional point convolution, the improved expansion point convolution enlarges the receptive field and obtains more scene context information without increasing the convolution computation cost. In addition, spatial pyramid pooling effectively encodes multi-scale scene content. The invention therefore combines the advantages of the improved expansion point convolution and spatial pyramid pooling, keeping the convolution computation cost in check while efficiently extracting local neighborhood features and context information.
Further, PointNet is selected as the feature extractor for local neighborhood point clouds; the PointNet feature extractor works as follows:
Given a set of unordered local neighborhood points {p_l1, p_l2, …, p_ln}, a set function f can be defined that maps the point set to a vector:

$$f(\{p_{l1},\dots,p_{ln}\})=\gamma\Big(\max_{i=1,\dots,n}h(p_{li})\Big)$$

where γ and h are typically MLPs.
PointNet [1] is often used as a point cloud feature extractor; it guarantees permutation invariance over unordered point clouds, can abstract low-dimensional point features into rich high-dimensional semantic features, and improves segmentation accuracy.
Further, the decoding process in step 6 is as follows:
1) interpolation and up-sampling: up-sample the P2 abstract point clouds with an interpolation algorithm to obtain P1 abstract point clouds whose point cloud features are F2; this step inputs a P2 × (D + C″) matrix and outputs P1 × (D + C″);
2) skip-link feature fusion: fuse the features F1 of the P1 abstract point clouds from step 3 with the features F2 of the up-sampled P1 abstract point clouds by a skip link; this step inputs P1 × (D + C″) and P1 × (D + C′) and outputs P1 × (D + C″ + C′);
3) decoding: decode the point cloud from the fused features with a unit PointNet;
this step inputs P1 × (D + C″ + C′) and outputs P1 × (D + C‴).
Further, the decoding process in step 7 is as follows:
1) interpolation and up-sampling: input a P1 × (D + C‴) matrix, output P × (D + C‴);
2) skip-link feature fusion: input P × (D + C‴) and P × (D + C), output P × (D + C‴ + C);
3) decoding: input P × (D + C‴ + C), output P × (D + C″″).
Advantageous effects
The point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling first obtains the center points of point cloud subsets with a farthest point sampling algorithm and determines the subset ranges with the KNN algorithm; it then extracts the features of each point cloud subset with expansion point convolution spatial pyramid pooling, enlarging the receptive field of the point convolution and enriching the feature extraction of multi-scale targets in the scene; next, a simple and effective decoding module performs the feature decoding, improving the segmentation accuracy of sparse point clouds; finally, a fully connected layer classifies the label of each point. The method has outstanding advantages such as high segmentation accuracy and adaptability to a variety of scenes.
The invention achieves semantic segmentation of irregular, sparse and unevenly dense point clouds, with the advantages of high segmentation accuracy, low computation cost and adaptability to many scenes, and effectively addresses the low efficiency and high computation cost of acquiring local features and context information in indoor and outdoor scene semantic segmentation.
Compared with existing point cloud semantic segmentation networks, the invention has the following advantages:
1) combining expansion point convolution with PointNet, an improved expansion point convolution is proposed, which improves the acquisition of local features and context information of the point cloud;
2) inspired by spatial pyramid pooling, the invention proposes improved expansion point convolution spatial pyramid pooling, which effectively encodes multi-scale context information, enriches the point cloud features and improves scene semantic segmentation accuracy;
3) the invention proposes an improved encoding layer fused with expansion point convolution spatial pyramid pooling; placing the pyramid pooling at the front end of the encoding layer avoids losing high-dimensional point cloud features and benefits the segmentation of small scene targets;
4) the invention proposes a simple and effective decoding layer, which up-samples the high-dimensional point cloud features of the encoding layer and fuses them with the low-dimensional point cloud features, enriching scene detail information and improving segmentation accuracy.
Drawings
FIG. 1 is a block diagram of the overall network of the present invention;
FIG. 2 is a diagram of a conventional point convolution, an extended point convolution and an improved extended point convolution;
FIG. 3 is an expanded point convolution spatial pyramid pooling;
FIG. 4 is the PointNet feature extractor;
FIG. 5 is a point cloud semantic segmentation network framework.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings.
The point cloud data used by the invention may come from the common indoor scene dataset ScanNet, the common outdoor scene dataset Semantic3D, and the like. The ScanNet dataset provides various indoor scenes such as offices, apartments and bedrooms; its point cloud data are acquired by an RGB-D camera and contain the coordinate information, color information and alpha channel of each point, P = (x, y, z, r, g, b, α). The Semantic3D dataset provides many types of outdoor scenes such as farms, open fields and castles; its point cloud data are collected by a static terrestrial laser scanner and contain the coordinate information, laser reflection intensity and color information of each point, P = (x, y, z, intensity, r, g, b). As an application example, test results based on the ScanNet public dataset are given.
Fig. 1 shows the flowchart of the invention. The point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling comprises the following steps:
step 1: based on the ScanNet dataset point cloud, obtain the center points of point cloud subsets with a farthest point sampling algorithm;
the input point cloud of the network is P = {p_1, p_2, p_3, …, p_n}; an iterative farthest point sampling algorithm selects a subset P_sub-i = {p_i1, p_i2, p_i3, …, p_im} from the input point cloud such that each p_ij is farthest from the other points in the subset. Given the same number of center points, the farthest point sampling algorithm covers the whole input point cloud better than random point sampling.
Step 2: determining the range of the point cloud subset by using a nearest neighbor algorithm (KNN) based on the point cloud subset center point obtained in the step 1;
the input form of this step is P × (D + C) and P1Matrix information of x D, output form P1×K1X (D + C) matrix information. Wherein P is the number of the input point clouds, P1The number of the central points is sampled, D is D-dimensional coordinate information of each point, C is C-dimensional point characteristic information, K1The number of the neighborhood points of the central point.
Searching out K by KNN algorithm1And each neighborhood point closest to the central point is sorted and numbered according to the distance from each neighborhood point to the central point. K1The individual domain points and the center point constitute a center point neighborhood, also referred to as a local neighborhood.
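A sketch of this grouping step under the same assumptions, reusing `pts` and `centers` from the sketch above; `knn_group` is an illustrative name:

```python
def knn_group(points, feats, center_idx, k):
    """For each center, gather its k nearest points (sorted by distance)
    together with their features: returns a (P1, K1, D + C) array."""
    ctrs = points[center_idx]                                  # (P1, D)
    d = np.linalg.norm(ctrs[:, None, :] - points[None, :, :], axis=-1)
    nbr_idx = np.argsort(d, axis=1)[:, :k]                     # sorted, numbered
    return np.concatenate([points[nbr_idx], feats[nbr_idx]], axis=-1)

# Example: group a C=3 color feature around the 1024 centers with K1=32.
colors = np.random.rand(8192, 3).astype(np.float32)
grouped = knn_group(pts, colors, centers, 32)                  # (1024, 32, 6)
```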
And step 3: pyramid pooling extraction of local neighborhood features F using improved expansion point convolution space1And obtaining P1An image extraction point;
the input form of the step is P1×K1Matrix information of x (D + C), output form P1Matrix information of x (D + C "). Wherein, C' is a point characteristic dimension abstracted in a local neighborhood by the pyramid pooling of the improved expansion point convolution space.
Compared with the traditional point convolution, the improved expansion point convolution enlarges the receptive field and obtains more scene context information without increasing the convolution calculation cost. In addition, spatial pyramid pooling can effectively encode scene multi-scale content information. Therefore, the method combines the advantages of the improved expansion point convolution and the spatial pyramid pooling, ensures the convolution calculation cost, and efficiently extracts the local neighborhood characteristics and the context information. The pyramid pooling extraction of the local neighborhood characteristics of the point cloud by the improved expansion point convolution space mainly comprises the following three steps: 1) improving the conventional dilation point convolution; 2) respectively extracting local field features from improved expansion point convolution channels with different expansion rates; 3) merging and adding the characteristics of all channels; 4) and (5) reducing the dimension of the feature. The specific extraction process is as follows:
1) replace the point convolution kernel function (MLP) to improve the expansion point convolution;
as shown in fig. 2, the continuous convolution of the conventional expansion point convolution is defined as:

$$\mathrm{Conv}(H,G)(p_i)=\int_{p_j\in\mathcal{N}(p_i)}G(p_j-p_i)\,H(p_j)\,dp_j$$

where H is a continuous feature function that assigns a feature value to p_j, and G is a continuous kernel function that maps the distance from p_j to p_i to a kernel weight. In most practical applications the feature function H is not completely known, so with Monte Carlo integration the continuous convolution can be approximately converted into:

$$\mathrm{Conv}(H,G)(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}G(p_j-p_i)\,H(p_j)$$

where d is the expansion rate of the expansion point convolution and N_d(p_i) is the dilated neighborhood of p_i. Here the continuous kernel function G(·) is replaced with a multi-layer perceptron (MLP):

$$\mathrm{Conv}(H,\mathrm{MLP})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{MLP}(p;\theta)\,H(p_j)$$

where p is the relative position of the neighborhood point with respect to the center point (the Euclidean distance is used) and θ is the set of MLP parameters.
To obtain more local neighborhood features, the local neighborhood point features are abstracted to a higher dimension for richer information, so the improved expansion point convolution replaces the kernel function G(·) with PointNet (the PointNet feature extractor is described in detail in step 5):

$$\mathrm{Conv}(H,\mathrm{PN})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{PN}(p;\theta')\,H(p_j)$$

where PN is a PointNet network and θ′ is the set of PN parameters.
2) extract the local neighborhood features with each improved expansion point convolution;
the input of this step is a P1 × K1 × (D + C) matrix and the output is P1 × (D + Ci), where Ci is the point feature dimension abstracted in the local neighborhood by the i-th improved expansion point convolution.
As shown in fig. 3, the spatial pyramid pooling has i channels, each carrying an improved expansion point convolution with expansion rate d1, d2, …, di respectively; this yields i groups of local neighborhood context information at different scales, P1 × (D + Ci).
3) fuse the feature information extracted by each improved expansion point convolution;
the inputs of this step are the P1 × (D + Ci) matrices and the output is Σ P1 × (D + Ci), where i is the number of spatial pyramid pooling channels, which is also the number of improved expansion point convolutions. The context information of different scales is fused by concatenation.
4) reduce the dimensionality of the feature information.
The input of this step is Σ P1 × (D + Ci) and the output is P1 × (D + C′). As the number of channels grows, the number of local neighborhood features increases by a factor of i, raising the computation cost of the encoding at the back end of the encoding layer. A 1 × 1 convolution is therefore applied to the fused feature information to reduce the feature dimensionality.
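The following PyTorch sketch puts substeps 1) to 4) together. It is a hedged illustration rather than the patent's implementation: the class names, layer widths and expansion rates are assumptions, and the PointNet kernel PN(θ′) is approximated here by a shared per-point MLP over the neighbor offsets.

```python
import torch
import torch.nn as nn

class DilatedPointConv(nn.Module):
    """One improved expansion point convolution channel: of the supplied
    neighbors (sorted by distance), only every d-th is kept, enlarging the
    receptive field without extra kernel evaluations (substep 1)."""
    def __init__(self, in_ch, out_ch, d):
        super().__init__()
        self.d = d
        # Kernel over relative offsets; stands in for the PN(theta') block.
        self.kernel = nn.Sequential(nn.Linear(3, 32), nn.ReLU(),
                                    nn.Linear(32, in_ch))
        self.lift = nn.Linear(in_ch, out_ch)

    def forward(self, rel_pos, nbr_feats):
        # rel_pos: (P1, K*d, 3) offsets p_j - p_i; nbr_feats: (P1, K*d, C)
        rel_pos, nbr_feats = rel_pos[:, ::self.d], nbr_feats[:, ::self.d]
        w = self.kernel(rel_pos)                   # kernel weights G(p_j - p_i)
        return self.lift((w * nbr_feats).mean(1))  # Monte Carlo mean over K

class PointSPP(nn.Module):
    """Spatial pyramid pooling: i parallel channels with expansion rates
    d1..di (substep 2), concatenation (substep 3), 1x1 reduction (substep 4).
    All branches share one neighbor list sized for the largest rate."""
    def __init__(self, in_ch, branch_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            DilatedPointConv(in_ch, branch_ch, d) for d in rates)
        # A Linear layer acts as the 1x1 convolution on per-point features.
        self.reduce = nn.Linear(branch_ch * len(rates), out_ch)

    def forward(self, rel_pos, nbr_feats):
        fused = torch.cat([b(rel_pos, nbr_feats) for b in self.branches], -1)
        return self.reduce(fused)                  # (P1, C')

# Example: P1=1024 centers, 128 searched neighbors, C=6 input features.
spp = PointSPP(in_ch=6, branch_ch=64, out_ch=128)
f1 = spp(torch.randn(1024, 128, 3), torch.randn(1024, 128, 6))  # (1024, 128)
```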
step 4: sample and group the P1 abstract point clouds obtained in step 3.
the input of this step is a P1 × (D + C′) matrix and the output is P2 × K2 × (D + C′), where P2 is the number of center points of the second down-sampling and K2 is the number of neighborhood points per center point in the second sampling. Based on the P1 abstract point clouds, steps 1 and 2 are repeated to obtain P2 down-sampled center points and K2 neighborhood points per center point, yielding P2 local neighborhoods.
step 5: extract the local neighborhood features F2 with PointNet;
the input of this step is a P2 × K2 × (D + C′) matrix and the output is P2 × (D + C″), where C″ is the point feature dimension abstracted by PointNet in each local neighborhood.
PointNet [1] is often used as a point cloud feature extractor; it guarantees permutation invariance over unordered point clouds, can abstract low-dimensional point features into rich high-dimensional semantic features, and improves segmentation accuracy. The invention selects PointNet as the feature extractor for local neighborhood point clouds; as shown in fig. 4, the PointNet feature extractor works as follows.
Given a set of unordered local neighborhood points {p_l1, p_l2, …, p_ln}, a set function f can be defined that maps the point set to a vector:

$$f(\{p_{l1},\dots,p_{ln}\})=\gamma\Big(\max_{i=1,\dots,n}h(p_{li})\Big)$$

where γ and h are typically MLPs.
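A minimal sketch of this set function, continuing the PyTorch imports above; γ and h are small MLPs whose widths are illustrative assumptions:

```python
class MiniPointNet(nn.Module):
    """f(X) = gamma(max_i h(x_i)): per-point MLP h, symmetric max pool,
    then MLP gamma, so the output is invariant to point order."""
    def __init__(self, in_ch, out_ch, hidden=64):
        super().__init__()
        self.h = nn.Sequential(nn.Linear(in_ch, hidden), nn.ReLU(),
                               nn.Linear(hidden, hidden))
        self.gamma = nn.Sequential(nn.ReLU(), nn.Linear(hidden, out_ch))

    def forward(self, x):               # x: (P2, K2, D + C')
        return self.gamma(self.h(x).max(dim=1).values)   # (P2, C'')

# Example: abstract 256 neighborhoods of 32 points with 131-dim features.
pn = MiniPointNet(in_ch=131, out_ch=256)
f2 = pn(torch.randn(256, 32, 131))      # (256, 256)
```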
step 6: decode the P2 abstract point clouds containing the F2 features to obtain P1 abstract point clouds containing the F3 features;
the input of this step is a P2 × (D + C″) matrix and the output is P1 × (D + C‴), where C‴ is the feature dimension of the decoded point cloud. The decoding layer mainly comprises three steps: 1) interpolation and up-sampling: up-sample the P2 abstract point clouds with an interpolation algorithm to obtain P1 abstract point clouds whose point cloud features are F2; this step inputs a P2 × (D + C″) matrix and outputs P1 × (D + C″). 2) skip-link feature fusion: fuse the features F1 of the P1 abstract point clouds from step 3 with the features F2 of the up-sampled P1 abstract point clouds by a skip link; this step inputs P1 × (D + C″) and P1 × (D + C′) and outputs P1 × (D + C″ + C′). 3) decoding: decode the point cloud from the fused features with a unit PointNet [1]; this step inputs P1 × (D + C″ + C′) and outputs P1 × (D + C‴).
step 7: decode the P1 abstract point clouds containing the F3 features to obtain P abstract point clouds containing the F4 features;
the input of this step is a P1 × (D + C‴) matrix and the output is P × (D + C″″), where C″″ is the feature dimension of the decoded point cloud. The decoding process follows step 6, with the data flow: 1) interpolation and up-sampling: input a P1 × (D + C‴) matrix, output P × (D + C‴). 2) skip-link feature fusion: input P × (D + C‴) and P × (D + C), output P × (D + C‴ + C). 3) decoding: input P × (D + C‴ + C), output P × (D + C″″).
step 8: obtain the label of each point with a fully connected layer.
this step inputs P × (D + C″″) and outputs P × k, where k is the number of scene point cloud categories.
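A sketch of this classification head, continuing the PyTorch snippets above; the sizes are illustrative assumptions (k = 21 corresponds to the ScanNet benchmark's 20 classes plus unannotated points).

```python
P, feat_dim, k = 8192, 131, 21           # assumed sizes: P x (D + C'''') -> P x k
decoded = torch.randn(P, feat_dim)       # stand-in for the step-7 output
classifier = nn.Linear(feat_dim, k)      # fully connected layer
labels = classifier(decoded).argmax(-1)  # per-point semantic label, shape (P,)
```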
[1] Qi C. R., Su H., Mo K., et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. IEEE Conference on Computer Vision and Pattern Recognition, 2017.

Claims (5)

1. A point cloud semantic segmentation method based on expansion point convolution spatial pyramid pooling, characterized by comprising the following steps:
step 1: based on the ScanNet dataset point cloud, obtain the center points of point cloud subsets with a farthest point sampling algorithm;
the input point cloud of the network is P = {p_1, p_2, p_3, …, p_n}; an iterative farthest point sampling algorithm selects a subset P_sub-i = {p_i1, p_i2, p_i3, …, p_im} from the input point cloud such that each p_ij is farthest from the other points in the subset;
step 2: determine the range of each point cloud subset with a nearest neighbor algorithm, based on the center points obtained in step 1;
the inputs are a P × (D + C) matrix and a P1 × D matrix, and the output is a P1 × K1 × (D + C) matrix, where P is the number of input points, P1 is the number of sampled center points, D is the dimension of each point's coordinate information, C is the dimension of the point feature information, and K1 is the number of neighborhood points per center point;
step 3: extract the local neighborhood features F1 by improved expansion point convolution spatial pyramid pooling, obtaining P1 abstract points;
the input is a P1 × K1 × (D + C) matrix and the output is a P1 × (D + C′) matrix, where C′ is the point feature dimension abstracted in each local neighborhood by the improved expansion point convolution spatial pyramid pooling;
step 4: sample and group the P1 abstract point clouds obtained in step 3;
the input is a P1 × (D + C′) matrix and the output is P2 × K2 × (D + C′), where P2 is the number of center points of the second down-sampling and K2 is the number of neighborhood points per center point in the second sampling; based on the P1 abstract point clouds, steps 1 and 2 are repeated to obtain P2 down-sampled center points and K2 neighborhood points per center point, yielding P2 local neighborhoods;
step 5: extract the local neighborhood features F2 with PointNet;
the input is a P2 × K2 × (D + C′) matrix and the output is P2 × (D + C″), where C″ is the point feature dimension abstracted by PointNet in each local neighborhood;
step 6: decode the P2 abstract point clouds containing the F2 features to obtain P1 abstract point clouds containing the F3 features;
the input is a P2 × (D + C″) matrix and the output is P1 × (D + C‴), where C‴ is the feature dimension of the decoded point cloud;
step 7: decode the P1 abstract point clouds containing the F3 features to obtain P abstract point clouds containing the F4 features;
the input is a P1 × (D + C‴) matrix and the output is P × (D + C″″), where C″″ is the feature dimension of the decoded point cloud;
step 8: obtain the label of each point with a fully connected layer;
the input is P × (D + C″″) and the output is P × k, where k is the number of scene point cloud categories.
2. The method of claim 1, wherein the extraction of point cloud local neighborhood features by improved expansion point convolution spatial pyramid pooling comprises the following four steps: 1) improve the conventional expansion point convolution; 2) extract local neighborhood features with improved expansion point convolution channels of different expansion rates; 3) fuse the features of all channels; 4) reduce the feature dimensionality;
the specific extraction process is as follows:
1) replace the point convolution kernel function (MLP) to improve the expansion point convolution;
the continuous convolution of the conventional expansion point convolution is defined as:

$$\mathrm{Conv}(H,G)(p_i)=\int_{p_j\in\mathcal{N}(p_i)}G(p_j-p_i)\,H(p_j)\,dp_j$$

where H is a continuous feature function that assigns a feature value to p_j, and G is a continuous kernel function that maps the distance from p_j to p_i to a kernel weight. Using Monte Carlo integration, the continuous convolution is converted into:

$$\mathrm{Conv}(H,G)(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}G(p_j-p_i)\,H(p_j)$$

where d is the expansion rate of the expansion point convolution and N_d(p_i) is the dilated neighborhood of p_i. The continuous kernel function G(·) is replaced with a multi-layer perceptron:

$$\mathrm{Conv}(H,\mathrm{MLP})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{MLP}(p;\theta)\,H(p_j)$$

where p is the relative position of the neighborhood point with respect to the center point (the Euclidean distance is used) and θ is the set of MLP parameters.
To obtain more local neighborhood features, the local neighborhood point features are abstracted to a higher dimension for richer information, so the improved expansion point convolution replaces the kernel function G(·) with PointNet:

$$\mathrm{Conv}(H,\mathrm{PN})(p_i)\approx\frac{1}{K}\sum_{p_j\in\mathcal{N}_d(p_i)}\mathrm{PN}(p;\theta')\,H(p_j)$$

where PN is a PointNet network and θ′ is the set of PN parameters;
2) extract the local neighborhood features with each improved expansion point convolution;
the input is a P1 × K1 × (D + C) matrix and the output is P1 × (D + Ci), where Ci is the point feature dimension abstracted in the local neighborhood by the i-th improved expansion point convolution;
the spatial pyramid pooling has i channels, each carrying an improved expansion point convolution with expansion rate d1, d2, …, di respectively; this yields i groups of local neighborhood context information at different scales, P1 × (D + Ci);
3) fuse the feature information extracted by each improved expansion point convolution;
the inputs are the P1 × (D + Ci) matrices and the output is Σ P1 × (D + Ci), where i is the number of spatial pyramid pooling channels, which is also the number of improved expansion point convolutions; the context information of different scales is fused by concatenation;
4) reduce the dimensionality of the feature information;
the input is Σ P1 × (D + Ci) and the output is P1 × (D + C′).
3. The method of claim 2, wherein PointNet is selected as the feature extractor for local neighborhood point clouds, and the PointNet feature extractor works as follows:
given a set of unordered local neighborhood points {p_l1, p_l2, …, p_ln}, a set function f can be defined that maps the point set to a vector:

$$f(\{p_{l1},\dots,p_{ln}\})=\gamma\Big(\max_{i=1,\dots,n}h(p_{li})\Big)$$

where γ and h are typically MLPs.
4. The method of claim 1, wherein the decoding process in step 6 is as follows:
1) interpolation and up-sampling: up-sample the P2 abstract point clouds with an interpolation algorithm to obtain P1 abstract point clouds whose point cloud features are F2; this step inputs a P2 × (D + C″) matrix and outputs P1 × (D + C″);
2) skip-link feature fusion: fuse the features F1 of the P1 abstract point clouds from step 3 with the features F2 of the up-sampled P1 abstract point clouds by a skip link; this step inputs P1 × (D + C″) and P1 × (D + C′) and outputs P1 × (D + C″ + C′);
3) decoding: decode the point cloud from the fused features with a unit PointNet;
this step inputs P1 × (D + C″ + C′) and outputs P1 × (D + C‴).
5. The method of claim 4, wherein the decoding process in step 7 is as follows:
1) interpolation and up-sampling: input a P1 × (D + C‴) matrix, output P × (D + C‴);
2) skip-link feature fusion: input P × (D + C‴) and P × (D + C), output P × (D + C‴ + C);
3) decoding: input P × (D + C‴ + C), output P × (D + C″″).
CN201911048539.3A 2019-10-31 2019-10-31 Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling Pending CN111027559A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911048539.3A CN111027559A (en) 2019-10-31 2019-10-31 Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911048539.3A CN111027559A (en) 2019-10-31 2019-10-31 Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling

Publications (1)

Publication Number Publication Date
CN111027559A true CN111027559A (en) 2020-04-17

Family

ID=70200756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911048539.3A Pending CN111027559A (en) 2019-10-31 2019-10-31 Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling

Country Status (1)

Country Link
CN (1) CN111027559A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147302A1 (en) * 2017-11-10 2019-05-16 Nvidia Corp. Bilateral convolution layer network for processing point clouds
CN108345831A (en) * 2017-12-28 2018-07-31 新智数字科技有限公司 The method, apparatus and electronic equipment of Road image segmentation based on point cloud data
CN108319957A (en) * 2018-02-09 2018-07-24 深圳市唯特视科技有限公司 A kind of large-scale point cloud semantic segmentation method based on overtrick figure
CN109410307A (en) * 2018-10-16 2019-03-01 大连理工大学 A kind of scene point cloud semantic segmentation method
US10408939B1 (en) * 2019-01-31 2019-09-10 StradVision, Inc. Learning method and learning device for integrating image acquired by camera and point-cloud map acquired by radar or LiDAR corresponding to image at each of convolution stages in neural network and testing method and testing device using the same
CN110223348A (en) * 2019-02-25 2019-09-10 湖南大学 Robot scene adaptive bit orientation estimation method based on RGB-D camera

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HONGSHAN YU et al.: "Methods and datasets on semantic segmentation: A review", NEUROCOMPUTING *
YUAN WANG et al.: "PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud", ARXIV *
张祥甫 et al.: "基于深度学习的语义分割问题研究综述" (A survey of semantic segmentation based on deep learning), 激光与光电子学进展 (Laser & Optoelectronics Progress) *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860138A (en) * 2020-06-09 2020-10-30 中南民族大学 Three-dimensional point cloud semantic segmentation method and system based on full-fusion network
CN111860138B (en) * 2020-06-09 2024-03-01 中南民族大学 Three-dimensional point cloud semantic segmentation method and system based on full fusion network
CN112149725A (en) * 2020-09-18 2020-12-29 南京信息工程大学 Spectral domain graph convolution 3D point cloud classification method based on Fourier transform
CN112149725B (en) * 2020-09-18 2023-08-22 南京信息工程大学 Fourier transform-based spectrum domain map convolution 3D point cloud classification method
CN112418235A (en) * 2020-11-20 2021-02-26 中南大学 Point cloud semantic segmentation method based on expansion nearest neighbor feature enhancement
CN112560965B (en) * 2020-12-18 2024-04-05 中国科学院深圳先进技术研究院 Image semantic segmentation method, storage medium and computer device
CN112560965A (en) * 2020-12-18 2021-03-26 中国科学院深圳先进技术研究院 Image semantic segmentation method, storage medium and computer device
CN112819833A (en) * 2021-02-05 2021-05-18 四川大学 Large scene point cloud semantic segmentation method
CN112819833B (en) * 2021-02-05 2022-07-12 四川大学 Large scene point cloud semantic segmentation method
CN112967296A (en) * 2021-03-10 2021-06-15 重庆理工大学 Point cloud dynamic region graph convolution method, classification method and segmentation method
CN112967296B (en) * 2021-03-10 2022-11-15 重庆理工大学 Point cloud dynamic region graph convolution method, classification method and segmentation method
CN113392841A (en) * 2021-06-03 2021-09-14 电子科技大学 Three-dimensional point cloud semantic segmentation method based on multi-feature information enhanced coding
CN113378112A (en) * 2021-06-18 2021-09-10 浙江工业大学 Point cloud completion method and device based on anisotropic convolution
CN113486963A (en) * 2021-07-12 2021-10-08 厦门大学 Density self-adaptive point cloud end-to-end sampling method
CN113486963B (en) * 2021-07-12 2023-07-07 厦门大学 Point cloud end-to-end sampling method with self-adaptive density
CN114693932A (en) * 2022-04-06 2022-07-01 南京航空航天大学 Large aircraft large component point cloud semantic segmentation method
CN115496910B (en) * 2022-11-07 2023-04-07 中国测绘科学研究院 Point cloud semantic segmentation method based on full-connected graph coding and double-expansion residual error
CN115496910A (en) * 2022-11-07 2022-12-20 中国测绘科学研究院 Point cloud semantic segmentation method based on full-connected graph coding and double-expansion residual error

Similar Documents

Publication Publication Date Title
CN111027559A (en) Point cloud semantic segmentation method based on expansion point convolution space pyramid pooling
Qiu et al. Semantic segmentation for real point cloud scenes via bilateral augmentation and adaptive fusion
Li et al. Deep learning for remote sensing image classification: A survey
CN104036012B (en) Dictionary learning, vision bag of words feature extracting method and searching system
Lin et al. Local and global encoder network for semantic segmentation of Airborne laser scanning point clouds
CN113240683B (en) Attention mechanism-based lightweight semantic segmentation model construction method
CN114792372A (en) Three-dimensional point cloud semantic segmentation method and system based on multi-head two-stage attention
CN113870286B (en) Foreground segmentation method based on multi-level feature and mask fusion
CN111652273A (en) Deep learning-based RGB-D image classification method
CN114299285A (en) Three-dimensional point cloud semi-automatic labeling method and system, electronic equipment and storage medium
CN115272696A (en) Point cloud semantic segmentation method based on self-adaptive convolution and local geometric information
CN116129118A (en) Urban scene laser LiDAR point cloud semantic segmentation method based on graph convolution
CN116028663A (en) Three-dimensional data engine platform
Zheng et al. Person re-identification in the 3D space
Hazer et al. Deep learning based point cloud processing techniques
Han et al. A Large-Scale Network Construction and Lightweighting Method for Point Cloud Semantic Segmentation
Bashmal et al. Language Integration in Remote Sensing: Tasks, datasets, and future directions
CN117765258A (en) Large-scale point cloud semantic segmentation method based on density self-adaption and attention mechanism
CN111597367B (en) Three-dimensional model retrieval method based on view and hash algorithm
CN116503746A (en) Infrared small target detection method based on multilayer nested non-full-mapping U-shaped network
Wang et al. Hierarchical Kernel Interaction Network for Remote Sensing Object Counting
Tan et al. 3D detection transformer: Set prediction of objects using point clouds
CN115497085A (en) Point cloud completion method and system based on multi-resolution dual-feature folding
Zhu et al. Gradient-based graph attention for scene text image super-resolution
Huang et al. Remote sensing data detection based on multiscale fusion and attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (Application publication date: 20200417)