CN118351320B - Instance segmentation method based on three-dimensional point cloud
- Publication number: CN118351320B (application CN202410780784.8A)
- Authority: CN (China)
- Prior art keywords: point cloud, mask, instance, prediction, matching
- Prior art date
- Legal status: Active (assumed; not a legal conclusion)
Classifications
- G06V 10/26 — Segmentation of patterns in the image field; clustering-based techniques; detection of occlusion
- G06N 3/0464 — Convolutional networks [CNN, ConvNet]
- G06N 3/048 — Activation functions
- G06N 3/0499 — Feedforward networks
- G06T 7/10 — Segmentation; edge detection
- G06T 7/73 — Determining position or orientation of objects or cameras using feature-based methods
- G06V 10/28 — Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
- G06V 10/44 — Local feature extraction by analysis of parts of the pattern
- G06V 10/764 — Recognition or understanding using classification, e.g. of video objects
- G06V 10/806 — Fusion of extracted features
- G06V 10/82 — Recognition or understanding using neural networks
- G06T 2207/10028 — Range image; depth image; 3D point clouds
- G06T 2207/20081 — Training; learning
- G06T 2207/20084 — Artificial neural networks [ANN]
- Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an instance segmentation method based on three-dimensional point cloud, which belongs to the technical field of image processing and computer vision and comprises the following steps: acquiring and preprocessing point cloud data to obtain a point cloud training data set; performing feature extraction on the point cloud data to obtain point-level data features and the comprehensive feature of each superpoint; performing classification and aggregation by position encoding and binary density clustering to obtain an initialized position encoding, initialized position query vectors and high-density point query vectors; performing vector fusion based on a cross-attention mechanism to obtain instance fusion features; predicting the center position and boundary position of each instance, generating a segmentation mask by convolution, and converting the segmentation mask into a prediction mask; and performing matching and matching-degree evaluation between the prediction masks and the real masks according to a bipartite matching method and a graph model, and carrying out instance segmentation of the three-dimensional point cloud. The method addresses the limitations of existing three-dimensional point cloud instance segmentation techniques in handling closely adjacent objects and in real-time scene understanding.
Description
Technical Field
The invention belongs to the technical field of image processing and computer vision, and in particular relates to an instance segmentation method for 3D point clouds that integrates query vectors and density clustering.
Background
Three-dimensional instance segmentation is a critical task in the field of computer vision, involving the accurate segmentation and identification of individual objects from three-dimensional point cloud data. Accurate three-dimensional instance segmentation not only improves the environment perception capability of systems in fields such as autonomous driving, robot navigation and virtual reality, but also contributes to safety and richer functionality.
Three-dimensional point cloud data is typically collected by lidar, structured-light scanners or stereo vision systems and is unstructured and highly complex. Processing such data requires an understanding of its particular spatial structure and dense point distribution. Traditional three-dimensional scene understanding methods tend to be limited to predefined categories and supervised learning techniques, which rely on large amounts of annotated data. Unlike conventional two-dimensional images, point cloud data is unevenly distributed in space and is often affected by occlusion and noise, so its three-dimensional spatial structure and topological relations must be understood effectively during processing.
In recent years, the introduction of deep learning has brought revolutionary progress to three-dimensional scene understanding: more complex and abstract feature representations can be learned, significantly improving segmentation accuracy and robustness. However, existing methods often depend on large amounts of labeled data and computing resources, which limits the flexibility and scalability of instance segmentation, and they struggle to effectively handle tightly overlapping instances, multiple categories, and real-time segmentation in dynamic environments.
Disclosure of Invention
Aiming at the above defects in the prior art, the instance segmentation method based on three-dimensional point cloud provided by the invention combines binary density clustering with query vectors to overcome the limitations of existing three-dimensional point cloud instance segmentation techniques in handling closely adjacent objects and in real-time scene understanding.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention provides an example segmentation method based on three-dimensional point cloud, which comprises the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
s2, extracting characteristics of point cloud data in the point cloud training data set to obtain point cloud level data characteristics and comprehensive characteristics of all super points;
S3, classifying and aggregating according to the basic data characteristics of the point cloud by using a coding and second class aggregation method to obtain an initialization position coding, an initialization position query vector and a high-density point query vector;
S4, vector fusion is carried out on the basis of a cross attention mechanism according to the comprehensive characteristics of each super point, the initialized position code, the initialized position query vector and the high-density point query vector to obtain example fusion characteristics;
S5, based on the example fusion characteristics, predicting the center position and the boundary position of the example, convoluting to generate a segmentation mask, and converting the segmentation mask of each prediction example into a prediction mask;
And S6, according to the bipartite matching method and the graph model, matching and matching degree evaluation are carried out on the prediction mask and the real mask, and instance segmentation of the three-dimensional point cloud is carried out.
The beneficial effects of the invention are as follows: in the instance segmentation method based on three-dimensional point cloud, feature extraction is performed on the point cloud data to obtain point-level data features and the comprehensive feature of each superpoint; extraction of high-density point query vectors is realized through binary density clustering, which improves the spatial discrimination of the query vectors, accurately represents the key feature points of dense regions, and enhances the understanding and processing of complex three-dimensional scenes. Based on the comprehensive feature of each superpoint, the initialized position encoding, the initialized position query vectors and the high-density point query vectors, vector fusion is performed through a cross-attention mechanism to obtain instance fusion features, improving the accuracy of the generated prediction masks. Through matching between prediction masks and real masks and optimization of the matching evaluation, the invention significantly improves the processing efficiency for unstructured three-dimensional point cloud data, and offers clear advantages for instance segmentation of three-dimensional point cloud data and for handling multiple spatially adjacent instances.
Further, the step S1 includes the following steps:
s11, acquiring point cloud data in a plurality of scenes, and matching corresponding labels with the point cloud data;
S12, carrying out standardization processing and data enhancement processing on the point cloud data after the label is matched;
S13, generating a point cloud data set based on the point cloud data subjected to the standardization processing and the data enhancement processing;
S14, voxelizing the point cloud data of size H×W×3 in the point cloud data set, so that the point cloud scene is voxelized to obtain a point cloud training data set, where H represents the height of the point cloud data and W represents the width of the point cloud data.
The beneficial effects of adopting the further scheme are as follows: through the standardization and data enhancement of the point cloud data, the richness of the point cloud data is effectively improved across different viewing angles and sizes.
Further, the step S2 includes the following steps:
S21, performing feature conversion on point cloud data in a point cloud training data set by using an input convolution layer to obtain initial point cloud data features;
S22, performing multi-scale feature extraction on the initial point cloud data features by using a pre-training sparse 3D U-Net model to obtain multi-scale point cloud data features;
s23, utilizing a linear layer to adjust feature dimensions of the multi-scale point cloud data features to obtain normalized point cloud data features;
s24, reconstructing a mapping relation between the normalized point cloud data characteristics and point cloud data in the point cloud training data set by using a mapping table to obtain point cloud level data characteristics;
s25, pooling the normalized point cloud data characteristics by using the identification of the super points to obtain pooling characteristics of each super point;
s26, according to the pooling type adopted by the pooling characteristics of each super point, the characteristics in the same super point are aggregated, and the comprehensive characteristics of each super point are obtained.
The beneficial effects of adopting the further scheme are as follows: through the encoder-decoder architecture and skip connections of the 3D U-Net model, feature extraction is effectively realized on multiple scales, ensuring that global-to-local features are effectively captured and utilized; the mapping relation is reconstructed using the mapping table, ensuring the spatial consistency of the features; and the comprehensive feature of each superpoint is obtained through pooling and aggregation, which reduces the data volume while retaining the necessary information and facilitates the processing of large-scale point cloud data.
Further, the step S3 includes the following steps:
S31, encoding the spatial position information of the point-level data features to obtain an initialized position encoding;
S32, generating initialized position query vectors from the initialized position encoding;
S33, for each point in the point cloud, counting the number of neighborhood points within a preset radius of that point according to the point-level data features, and taking this number as the local density of the point;
s34, setting a density classification threshold;
s35, taking points, among the point clouds, of which the local density is greater than a density classification threshold value as high-density points;
s36, gathering all the high-density points to obtain a high-density point set;
And S37, extracting the characteristics in the high-density point set to obtain a high-density point query vector.
The beneficial effects of adopting the further scheme are as follows: the invention obtains the initialized position encoding through the encoding process and correspondingly generates the initialized position query vectors; binary clustering is realized by setting the density classification threshold, the high-density points in the point cloud are obtained and aggregated into a high-density point set, and the high-density point query vectors are obtained through feature extraction, which effectively improves instance recognition efficiency and instance segmentation precision and is particularly suitable for dense or complex three-dimensional scenes.
Further, the step S4 includes the following steps:
S41, calculating the similarity between each initialized position query vector and the key and value vectors using a self-attention mechanism, and normalizing the similarities into first weights through a softmax function;
S42, matching the first weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new initialized position query vectors;
S43, calculating the similarity between each high-density query vector and the key and value vectors using a self-attention mechanism, and normalizing the similarities into second weights through a softmax function;
S44, matching the second weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new high-density query vectors;
S45, taking the initialized position encoding and the comprehensive features of the superpoints as the key vectors and value vectors to be fused, and taking the new initialized position query vectors and new high-density query vectors as the query vectors to be fused;
S46, fusing the query vectors to be fused with the key vectors and value vectors to be fused based on a cross-attention mechanism to obtain instance fusion features.
The beneficial effects of adopting the further scheme are as follows: the invention adopts a self-attention mechanism to construct new initialized position query vectors and new high-density query vectors from their similarities with the key and value vectors, and then adopts a cross-attention mechanism to combine the initialized position encoding with the comprehensive features of the superpoints, realizing vector fusion; this builds a new vector query scheme and provides a basis for improving the accuracy of the generated prediction masks.
Further, the calculation expression of the cross-attention mechanism in S46 is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V,
wherein Attention() denotes the attention mechanism function, Q denotes the query vectors to be fused, K denotes the key vectors to be fused, V denotes the value vectors to be fused, softmax() denotes the softmax function, K^T denotes the transpose of the key vectors to be fused, and d_k denotes the dimension of the key vectors to be fused.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method of a cross attention mechanism, which can fuse the comprehensive characteristics of each super point with a new initialized position query vector and a new high-density query vector which are processed by the self attention mechanism, thereby realizing effective integration of information of different feature spaces and improving the analysis capability.
Further, the step S5 includes the following steps:
S51, based on the instance fusion features, constructing a feature mapping associated with the instance center to obtain the center position and boundary position of the predicted instance;
s52, based on the central position and the boundary position of the predicted instance, obtaining a segmentation mask of each predicted instance by carrying out convolution operation on the instance fusion characteristics;
s53, converting the segmentation mask of each prediction instance into a prediction mask;
the computational expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x)),
where M_i(x) represents the prediction mask of the i-th instance, sigmoid() represents the sigmoid function, and f(x) represents the segmentation mask of each predicted instance.
The beneficial effects of adopting the further scheme are as follows: the invention provides a method for carrying out mask segmentation and mask prediction based on an instance fusion feature, which obtains a prediction mask and provides a basis for executing an instance segmentation task.
Further, the calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
Center_i = σ(MLP(Q_position)),
Boundary_i = MLP_center(Q_ct, Q_p,t-1),
wherein Center_i represents the center position of the i-th predicted instance, σ represents the activation function, MLP() represents the multi-layer perceptron, Q_position represents the query vector corresponding to the center position of the instance, i denotes the i-th instance, Boundary_i represents the boundary position of the i-th predicted instance, MLP_center() represents the multi-layer perceptron used for instance center prediction, Q_ct represents the query vector corresponding to the boundary position of the current instance, and Q_p,t-1 represents the query vector corresponding to the center position of the previous instance.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method for the center position and the boundary position of a predicted instance, which can provide a basis for accurately performing mask segmentation and obtaining a predicted mask by accurately calculating the center position and the boundary position of the predicted instance.
Further, the step S6 includes the steps of:
s61, constructing a graph model according to a binary matching method, wherein nodes in the graph model are a prediction mask and a real mask respectively, and the weight of edges in the graph model is the similarity between the prediction mask and the real mask;
S62, calculating the matching degree between each prediction mask and each real mask through a mask intersection-over-union (IoU) model;
The calculation expression of the mask intersection-over-union model is as follows:
IOU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|,
wherein IOU(M_pred, M_gt) represents the matching degree between the prediction mask and the real mask, M_pred represents the prediction mask, M_gt represents the real mask, |·| represents the number of elements in a set, ∩ represents the intersection operation, and ∪ represents the union operation;
s63, setting a matching degree threshold, and taking the prediction mask and the real mask as a matching mask pair when the matching degree between the prediction mask and the real mask is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching between the prediction masks and the real masks;
the computational expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IOU(M_pred^i, M_gt^j) × X_ij,
wherein maximize denotes maximization, IOU(M_pred^i, M_gt^j) denotes the matching degree between the i-th prediction mask and the j-th real mask, × denotes multiplication, M_pred^i denotes the i-th prediction mask, M_gt^j denotes the j-th real mask, and X_ij denotes the mask-pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair and 0 when they are not selected as a matching pair;
S65, evaluating the matching degree between the optimized prediction mask and the real mask by using the matching loss function to obtain a matching degree evaluation result;
And S66, optimizing prediction masks and prediction of instance centers based on the matching degree evaluation result, and executing instance segmentation of the three-dimensional point cloud.
The beneficial effects of adopting the further scheme are as follows: the invention provides a method for matching the prediction masks and real masks and for optimizing and evaluating the matching result; performing the instance segmentation of the three-dimensional point cloud based on this optimization process and the matching-degree evaluation result greatly improves the instance segmentation efficiency as well as the accuracy and completeness of the segmentation result.
Further, the calculation expression of the matching loss function in S65 is as follows:
L = λ_cls·L_cls + λ_mask·L_mask + λ_dice·L_dice + λ_center·L_center,
L_cls = CE(pred_class, true_class),
L_mask = BCE(pred_mask, true_mask),
L_dice = 1 − 2·|pred_mask ∩ true_mask| / (|pred_mask| + |true_mask|),
L_center = ‖pred_center − true_center‖_1,
wherein L represents the matching loss function, λ_cls represents the classification loss weight coefficient, L_cls represents the classification loss function, λ_mask represents the mask binary loss weight coefficient, L_mask represents the mask binary loss function, λ_dice represents the intersection-over-union (Dice) loss weight coefficient, L_dice represents the intersection-over-union (Dice) loss function, λ_center represents the center regression loss weight coefficient, L_center represents the center regression loss function, CE() represents the cross-entropy loss function, BCE() represents the binary cross-entropy loss function, pred_class represents the class of a predicted instance, true_class represents the class of a real instance, pred_mask represents the prediction mask, true_mask represents the real mask, pred_center and true_center represent the predicted and real instance center points, and ‖·‖ denotes a norm operation.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method for the matching loss function that comprehensively evaluates the matching between the prediction mask and the real mask in terms of the predicted class, the pixel-level matching precision between prediction mask and real mask, the overlap area, and the error between center points; while preserving classification accuracy, it can optimize both the matching accuracy between prediction masks and real masks and the prediction of instance centers, thereby improving overall performance on the three-dimensional point cloud instance segmentation task.
Other advantages of the present invention will be described in more detail in the following embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart illustrating steps of an example segmentation method based on a three-dimensional point cloud according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
As shown in fig. 1, in one embodiment of the present invention, the present invention provides an example segmentation method based on a three-dimensional point cloud, including the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
The step S1 comprises the following steps:
s11, acquiring point cloud data in a plurality of scenes, and matching corresponding labels with the point cloud data;
S12, carrying out standardization processing and data enhancement processing on the point cloud data after the label is matched;
In this embodiment, the standardization processing applies scale normalization or centering to the point cloud data, and data enhancement is performed by random rotation or scaling, expanding the point cloud data to different viewing angles and sizes, as sketched below;
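A minimal sketch of this kind of preprocessing, assuming the point cloud is an N×3 NumPy array; the function names and the rotation/scale ranges are illustrative choices, not values taken from the patent:

```python
import numpy as np

def normalize_points(xyz: np.ndarray) -> np.ndarray:
    """Center the cloud at the origin and scale it into a unit sphere."""
    xyz = xyz - xyz.mean(axis=0, keepdims=True)          # centering
    scale = np.max(np.linalg.norm(xyz, axis=1)) + 1e-8   # scale normalization
    return xyz / scale

def augment_points(xyz: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Random rotation about the vertical axis plus random isotropic scaling."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    scale = rng.uniform(0.8, 1.2)                         # illustrative range
    return (xyz @ rot.T) * scale
```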
S13, generating a point cloud data set based on the point cloud data subjected to the standardization processing and the data enhancement processing;
In this embodiment, the information in the point cloud data set includes a scene ID, a voxel coordinate, a mapping of point cloud data to a voxel, a mapping of a voxel to point cloud data, a shape of a discrete voxel space, a feature of the point cloud data, a super point identifier, a batch offset, an instance tag, and a point cloud floating point coordinate;
The scene ID is used for uniquely identifying a scene; the voxel coordinates are used for representing coordinates of the point cloud data in a discrete voxel space; the mapping from the point cloud data to the voxels is used for mapping the points in the point cloud data to the corresponding voxels; the mapping from the voxels to the point cloud data is used for mapping the points in the voxels to the corresponding point cloud data; the shape of the discrete voxel space is used to represent the size of a voxel grid; the characteristics of the point cloud data comprise the characteristics of the position, the color, the normal vector and the like of the point; the super point mark is used for representing advanced features for improving the point cloud processing performance; the batch offset is used for identifying data boundaries of different scenes in a batch processing process; the instance tag is used for representing an instance to which each point in the point cloud data belongs; the point cloud floating point coordinates are used for representing floating point number coordinate information of the point cloud data.
S14, voxelizing the point cloud data of size H×W×3 in the point cloud data set, where the point cloud scene is voxelized using Open3D to obtain a point cloud training data set; H represents the height of the point cloud data and W represents the width of the point cloud data.
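A possible voxelization sketch, assuming the H×W×3 cloud has been flattened to an N×3 array; Open3D is used only to build the discrete scene grid, while the point-to-voxel bookkeeping is done with NumPy. The exact layout of the mapping tables stored in the data set is not spelled out in the patent, so the returned maps here are illustrative:

```python
import numpy as np
import open3d as o3d

def voxelize_scene(xyz: np.ndarray, voxel_size: float = 0.02):
    """Quantize an N x 3 point cloud into a discrete voxel grid."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(xyz)
    voxel_grid = o3d.geometry.VoxelGrid.create_from_point_cloud(pcd, voxel_size)

    # Integer voxel coordinates of every point.
    coords = np.floor((xyz - xyz.min(axis=0)) / voxel_size).astype(np.int64)
    # Unique voxels, one representative point per voxel (voxel -> point),
    # and for every point the index of its voxel (point -> voxel).
    voxel_coords, v2p_first, p2v = np.unique(
        coords, axis=0, return_index=True, return_inverse=True)
    return voxel_grid, voxel_coords, p2v, v2p_first
```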
S2, extracting characteristics of point cloud data in the point cloud training data set to obtain point cloud level data characteristics and comprehensive characteristics of all super points;
the step S2 comprises the following steps:
S21, performing feature conversion on point cloud data in a point cloud training data set by using an input convolution layer to obtain initial point cloud data features; the input convolution layer performs preliminary feature conversion on the original point cloud data, and prepares for multi-scale analysis and feature extraction of the deep network;
s22, performing multi-scale feature extraction on the initial point cloud data features by using a pre-training sparse 3D U-Net model to obtain multi-scale point cloud data features; the pre-training sparse 3D U-Net model is connected with the jump feature through the encoder-decoder architecture, so that feature extraction of initial point cloud data features on multiple scales is effectively realized, the feature from the whole world to the local can be effectively captured and utilized, and the method is applicable to complex 3D structures and objects of various scales.
S23, utilizing a linear layer to adjust feature dimensions of the multi-scale point cloud data features to obtain normalized point cloud data features; in this embodiment, the linear layer adopts a Normalization function and Relu activation functions for activating the Normalization result, so that feature dimensions of the normalized point cloud data features are matched with feature mapping and pooling operations.
S24, reconstructing a mapping relation between the normalized point cloud data characteristics and point cloud data in the point cloud training data set by using a mapping table to obtain point cloud level data characteristics; the remapping adopts a mapping table v2p_map to correspond the point cloud data characteristics subjected to multi-scale characteristic extraction and normalization processing with the point cloud data in the point cloud training data set, is very important for processing irregular point cloud data, and can ensure consistency of characteristic space. The remapped point cloud scale data features retain the resolution of the point cloud level, retain the detail information of the point cloud data, and can be directly used for tasks such as query vector generation and instance segmentation with high precision requirements.
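A tiny sketch of the remapping step; it assumes v2p_map is a length-N index tensor giving, for each point, the row of the voxel/U-Net feature table it belongs to (the direction of the patent's mapping table is not spelled out, so this indexing convention is an assumption):

```python
import torch

def remap_to_points(voxel_feats: torch.Tensor, v2p_map: torch.Tensor) -> torch.Tensor:
    """Broadcast (V, C) voxel-level features back to (N, C) point-level features."""
    return voxel_feats[v2p_map]  # gather: point i receives the features of its voxel
```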
S25, pooling the normalized point cloud data characteristics by using the identification of the super points to obtain pooling characteristics of each super point;
S26, according to the pooling type adopted by the pooling characteristics of each super point, the characteristics in the same super point are aggregated, and the comprehensive characteristics of each super point are obtained. In this embodiment, features in the same super point are aggregated according to mean pooling or maximum pooling, so as to obtain a comprehensive feature of each super point, thereby reducing data volume while retaining necessary information, and facilitating processing of large-scale point cloud data.
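A sketch of the superpoint pooling described in S25-S26, assuming point-level features as an (N, C) tensor and superpoint identifiers as consecutive integers in [0, S); the mean branch uses index_add_, the max branch index_reduce_ (available in recent PyTorch versions):

```python
import torch

def superpoint_pool(point_feats: torch.Tensor,
                    superpoint_ids: torch.Tensor,
                    mode: str = "mean") -> torch.Tensor:
    """Aggregate (N, C) point-level features into one feature per superpoint."""
    num_sp = int(superpoint_ids.max().item()) + 1
    channels = point_feats.size(1)
    if mode == "mean":
        pooled = point_feats.new_zeros(num_sp, channels)
        pooled.index_add_(0, superpoint_ids, point_feats)            # sum per superpoint
        counts = torch.bincount(superpoint_ids, minlength=num_sp).clamp(min=1)
        return pooled / counts.unsqueeze(1).to(pooled.dtype)         # mean pooling
    # max pooling: superpoints with no member point stay at -inf
    pooled = point_feats.new_full((num_sp, channels), float("-inf"))
    return pooled.index_reduce_(0, superpoint_ids, point_feats, reduce="amax")
```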
S3, classifying and aggregating according to the basic data characteristics of the point cloud by using a coding and second class aggregation method to obtain an initialization position coding, an initialization position query vector and a high-density point query vector;
the step S3 comprises the following steps:
S31, encoding the spatial position information of the point-level data features to obtain an initialized position encoding;
S32, generating initialized position query vectors from the initialized position encoding;
S33, for each point in the point cloud, counting the number of neighborhood points within a preset radius of that point according to the point-level data features, and taking this number as the local density of the point;
s34, setting a density classification threshold;
s35, taking points, among the point clouds, of which the local density is greater than a density classification threshold value as high-density points;
s36, gathering all the high-density points to obtain a high-density point set;
And S37, extracting the features in the high-density point set to obtain high-density point query vectors. In this embodiment, the high-density point set contains key structural information and provides important anchor points for instance identification and scene segmentation; the high-density point query vectors can be used to guide instance recognition and instance segmentation in the backbone architecture, improving instance recognition efficiency and instance segmentation accuracy, and are particularly suitable for dense and complex three-dimensional scenes where the boundaries between instances are not obvious.
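A sketch of the binary density split from S33-S36 using a k-d tree radius query; the radius and threshold are hyperparameters to be chosen per scene, not values given in the patent:

```python
import numpy as np
from scipy.spatial import cKDTree

def high_density_points(xyz: np.ndarray, radius: float, density_thr: int):
    """Count neighbours within `radius` of every point (local density) and
    keep the points whose density exceeds the classification threshold."""
    tree = cKDTree(xyz)
    neighbours = tree.query_ball_point(xyz, r=radius)
    local_density = np.array([len(n) for n in neighbours])   # includes the point itself
    high_mask = local_density > density_thr                  # binary split
    return xyz[high_mask], high_mask, local_density
```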
S4, vector fusion is carried out on the basis of a cross attention mechanism according to the comprehensive characteristics of each super point, the initialized position code, the initialized position query vector and the high-density point query vector to obtain example fusion characteristics;
The step S4 comprises the following steps:
S41, calculating the similarity between each initialized position query vector and the key vector and the value vector by adopting a self-attention mechanism, and normalizing the similarity into a first weight through a softmax function; in the embodiment, similarity between each initialization position query vector and key vector and value vector is calculated by adopting dot product operation;
S42, matching the first weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new initialized position query vectors;
s43, calculating the similarity between each high-density query vector and the key vector and the value vector by adopting a self-attention mechanism, and normalizing the similarity into a second weight through a softmax function;
S44, matching the second weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new high-density query vectors;
S45, taking the comprehensive characteristics of the initialization position codes and the super points as key vectors to be fused and value vectors to be fused, and taking a new initialization position query vector and a new high-density query vector as query vectors to be fused;
S46, fusing the query vector to be fused, the key vector to be fused and the value vector to be fused based on a cross attention mechanism to obtain an instance fusion characteristic;
The computational expression of the cross-attention mechanism is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V,
wherein Attention() denotes the attention mechanism function, Q denotes the query vectors to be fused, K denotes the key vectors to be fused, V denotes the value vectors to be fused, softmax() denotes the softmax function, K^T denotes the transpose of the key vectors to be fused, and d_k denotes the dimension of the key vectors to be fused; scaling the dot product by the key dimension prevents gradient problems caused by excessively large dot products, and the softmax function normalizes the dot-product results so that the output weight distribution is reasonable. The instance fusion features can be used for instance identification and segmentation, accurate localization of the boundary of each instance, and prediction of the instance center and the corresponding class label.
S5, predicting the center position and boundary position of each instance based on the instance fusion features, generating a segmentation mask by convolution, and converting the segmentation mask of each predicted instance into a prediction mask;
The step S5 comprises the following steps:
S51, based on the instance fusion features, constructing a feature mapping associated with the instance center to obtain the center position and boundary position of the predicted instance;
the calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
Center_i = σ(MLP(Q_position)),
Boundary_i = MLP_center(Q_ct, Q_p,t-1),
wherein Center_i represents the center position of the i-th predicted instance, σ represents the activation function, MLP() represents the multi-layer perceptron, Q_position represents the query vector corresponding to the center position of the instance, i denotes the i-th instance, Boundary_i represents the boundary position of the i-th predicted instance, MLP_center() represents the multi-layer perceptron used for instance center prediction, Q_ct represents the query vector corresponding to the boundary position of the current instance, and Q_p,t-1 represents the query vector corresponding to the center position of the previous instance; in this embodiment, the activation function is the sigmoid activation function;
S52, based on the center position and boundary position of the predicted instance, obtaining the segmentation mask of each predicted instance by performing a convolution operation on the instance fusion features; the segmentation mask is a binary image in which a pixel value of 1 indicates the predicted instance and a pixel value of 0 indicates the background or other instances;
S53, converting the segmentation mask of each prediction instance into a prediction mask; in this embodiment, the prediction mask represents a possibility that each pixel belongs to a certain instance, and the segmentation mask may be converted into the prediction mask through a sigmoid function or a softmax function;
the computational expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x)),
where M_i(x) represents the prediction mask of the i-th instance, sigmoid() represents the sigmoid function, and f(x) represents the segmentation mask of each predicted instance.
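A hedged sketch of a prediction head consistent with the formulas above: an MLP predicts a (normalized) instance center from each fused query, and the mask logits f(x) are obtained by projecting the query and correlating it with the point-level features before the sigmoid. The module name, layer sizes and the dot-product mask decoder are assumptions for illustration, not the patent's exact architecture:

```python
import torch
import torch.nn as nn

class MaskHead(nn.Module):
    """Predict an instance center and a per-point prediction mask from fused query features."""

    def __init__(self, d_model: int, point_feat_dim: int):
        super().__init__()
        self.center_mlp = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, 3))
        # Projects a query so it can be correlated with the point-level features.
        self.mask_proj = nn.Linear(d_model, point_feat_dim)

    def forward(self, query: torch.Tensor, point_feats: torch.Tensor):
        # query:       (I, d_model) one fused feature per predicted instance
        # point_feats: (N, point_feat_dim) point-level data features
        center = torch.sigmoid(self.center_mlp(query))          # Center_i = σ(MLP(Q)), (I, 3)
        mask_logits = self.mask_proj(query) @ point_feats.t()   # f(x), shape (I, N)
        pred_mask = torch.sigmoid(mask_logits)                  # M_i(x) = sigmoid(f(x))
        return center, pred_mask
```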
And S6, according to the bipartite matching method and the graph model, matching and matching degree evaluation are carried out on the prediction mask and the real mask, and instance segmentation of the three-dimensional point cloud is carried out. In the instance segmentation task, particularly when there are multiple overlapping instances in a scene, it is necessary to match the generated prediction masks accurately.
The step S6 comprises the following steps:
S61, constructing a graph model according to a binary matching method, wherein nodes in the graph model are a prediction mask and a real mask respectively, and the weight of edges in the graph model is the similarity between the prediction mask and the real mask; the similarity of the prediction mask and the real mask can reflect the cross overlap area ratio between different instances in the scene;
S62, calculating the matching degree between each prediction mask and each real mask through a mask intersection-over-union (IoU) model;
The calculation expression of the mask intersection-over-union model is as follows:
IOU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|,
wherein IOU(M_pred, M_gt) represents the matching degree between the prediction mask and the real mask, M_pred represents the prediction mask, M_gt represents the real mask, |·| represents the number of elements in a set, ∩ represents the intersection operation, and ∪ represents the union operation;
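A sketch of the mask IoU (matching degree) computed for all prediction/ground-truth pairs at once; the binarization threshold of 0.5 is an illustrative assumption:

```python
import torch

def mask_iou(pred_mask: torch.Tensor, gt_mask: torch.Tensor, thr: float = 0.5) -> torch.Tensor:
    """IoU between predicted and ground-truth masks.

    pred_mask: (P, N) predicted per-point probabilities.
    gt_mask:   (G, N) binary ground-truth masks.
    Returns a (P, G) matrix of matching degrees.
    """
    p = (pred_mask > thr).float()
    g = gt_mask.float()
    inter = p @ g.t()                                  # |M_pred ∩ M_gt|
    union = p.sum(1, keepdim=True) + g.sum(1) - inter  # |M_pred ∪ M_gt|
    return inter / union.clamp(min=1e-6)
```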
s63, setting a matching degree threshold, and taking the prediction mask and the real mask as a matching mask pair when the matching degree between the prediction mask and the real mask is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching between the prediction masks and the real masks;
The computational expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IOU(M_pred^i, M_gt^j) × X_ij,
wherein maximize denotes maximization, IOU(M_pred^i, M_gt^j) denotes the matching degree between the i-th prediction mask and the j-th real mask, × denotes multiplication, M_pred^i denotes the i-th prediction mask, M_gt^j denotes the j-th real mask, and X_ij denotes the mask-pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair and 0 when they are not selected as a matching pair;
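A sketch of the globally optimal assignment using SciPy's Hungarian implementation; negating the IoU matrix turns the maximization into the minimization the routine expects, and the 0.5 threshold is illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(iou_matrix: np.ndarray, iou_thr: float = 0.5):
    """Globally optimal one-to-one assignment that maximizes total IoU.

    iou_matrix: (P, G) matching degrees between prediction and real masks.
    Returns index pairs (pred_idx, gt_idx) whose IoU exceeds the threshold.
    """
    # linear_sum_assignment minimizes cost, so negate the IoU to maximize it.
    pred_idx, gt_idx = linear_sum_assignment(-iou_matrix)
    keep = iou_matrix[pred_idx, gt_idx] > iou_thr
    return pred_idx[keep], gt_idx[keep]
```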
S65, evaluating the matching degree between the optimized prediction mask and the real mask by using the matching loss function to obtain a matching degree evaluation result;
the calculation expression of the matching loss function in S65 is as follows:
L = λ_cls·L_cls + λ_mask·L_mask + λ_dice·L_dice + λ_center·L_center,
L_cls = CE(pred_class, true_class),
L_mask = BCE(pred_mask, true_mask),
L_dice = 1 − 2·|pred_mask ∩ true_mask| / (|pred_mask| + |true_mask|),
L_center = ‖pred_center − true_center‖_1,
wherein L represents the matching loss function, λ_cls represents the classification loss weight coefficient, L_cls represents the classification loss function, λ_mask represents the mask binary loss weight coefficient, L_mask represents the mask binary loss function, λ_dice represents the intersection-over-union (Dice) loss weight coefficient, L_dice represents the intersection-over-union (Dice) loss function, λ_center represents the center regression loss weight coefficient, L_center represents the center regression loss function, CE() represents the cross-entropy loss function, BCE() represents the binary cross-entropy loss function, pred_class represents the class of a predicted instance, true_class represents the class of a real instance, pred_mask represents the prediction mask, true_mask represents the real mask, pred_center and true_center represent the predicted and real instance center points, and ‖·‖ denotes a norm operation. The classification loss function evaluates the class agreement between the prediction and the ground truth; in this embodiment it is computed with the cross-entropy loss. The mask binary loss function evaluates the pixel-level matching precision between the prediction mask and the real mask; in this embodiment it is computed with the binary cross-entropy loss. The intersection-over-union loss function optimizes the overlap between the prediction mask and the real mask, improving their overlapping area. The center regression loss function optimizes the error between the predicted instance center point and the real instance center point; in this embodiment the least absolute deviation is used to measure the distance between them.
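A hedged sketch of the combined matching loss with unit weights as placeholders for λ_cls, λ_mask, λ_dice and λ_center; the tensor shapes (class logits per matched query, per-point mask probabilities, 3-D centers) are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def matching_loss(pred_logits, true_class, pred_mask, true_mask,
                  pred_center, true_center,
                  w_cls=1.0, w_mask=1.0, w_dice=1.0, w_center=1.0):
    """Weighted sum of classification, mask BCE, Dice and center-regression terms."""
    l_cls = F.cross_entropy(pred_logits, true_class)            # L_cls
    l_mask = F.binary_cross_entropy(pred_mask, true_mask)       # L_mask (probabilities vs. 0/1 targets)
    inter = (pred_mask * true_mask).sum(-1)
    l_dice = 1.0 - (2.0 * inter /
                    (pred_mask.sum(-1) + true_mask.sum(-1) + 1e-6)).mean()  # L_dice
    l_center = F.l1_loss(pred_center, true_center)              # L_center (least absolute deviation)
    return w_cls * l_cls + w_mask * l_mask + w_dice * l_dice + w_center * l_center
```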
And S66, optimizing prediction masks and prediction of instance centers based on the matching degree evaluation result, and executing instance segmentation of the three-dimensional point cloud.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention.
Claims (5)
1. An instance segmentation method based on three-dimensional point cloud is characterized by comprising the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
s2, extracting characteristics of point cloud data in the point cloud training data set to obtain point cloud level data characteristics and comprehensive characteristics of all super points;
the step S2 comprises the following steps:
S21, performing feature conversion on point cloud data in a point cloud training data set by using an input convolution layer to obtain initial point cloud data features;
S22, performing multi-scale feature extraction on the initial point cloud data features by using a pre-training sparse 3D U-Net model to obtain multi-scale point cloud data features;
s23, utilizing a linear layer to adjust feature dimensions of the multi-scale point cloud data features to obtain normalized point cloud data features;
s24, reconstructing a mapping relation between the normalized point cloud data characteristics and point cloud data in the point cloud training data set by using a mapping table to obtain point cloud level data characteristics;
s25, pooling the normalized point cloud data characteristics by using the identification of the super points to obtain pooling characteristics of each super point;
s26, according to the pooling type adopted by the pooling characteristics of each super point, the characteristics in the same super point are aggregated to obtain the comprehensive characteristics of each super point;
S3, performing classification and aggregation on the point-level data features by position encoding and binary density clustering to obtain an initialized position encoding, initialized position query vectors and high-density point query vectors;
the step S3 comprises the following steps:
S31, encoding the spatial position information of the point-level data features to obtain an initialized position encoding;
S32, generating initialized position query vectors from the initialized position encoding;
S33, for each point in the point cloud, counting the number of neighborhood points within a preset radius of that point according to the point-level data features, and taking this number as the local density of the point;
s34, setting a density classification threshold;
s35, taking points, among the point clouds, of which the local density is greater than a density classification threshold value as high-density points;
s36, gathering all the high-density points to obtain a high-density point set;
S37, extracting features in the high-density point set to obtain a high-density point query vector;
S4, vector fusion is carried out on the basis of a cross attention mechanism according to the comprehensive characteristics of each super point, the initialized position code, the initialized position query vector and the high-density point query vector to obtain example fusion characteristics;
The step S4 comprises the following steps:
s41, calculating the similarity between each initialized position query vector and the key vector and the value vector by adopting a self-attention mechanism, and normalizing the similarity into a first weight through a softmax function;
S42, matching the first weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new initialized position query vectors;
s43, calculating the similarity between each high-density query vector and the key vector and the value vector by adopting a self-attention mechanism, and normalizing the similarity into a second weight through a softmax function;
S44, matching the second weights with the corresponding point-level feature data and performing the corresponding weighted summation to obtain new high-density query vectors;
S45, taking the comprehensive characteristics of the initialization position codes and the super points as key vectors to be fused and value vectors to be fused, and taking a new initialization position query vector and a new high-density query vector as query vectors to be fused;
S46, fusing the query vector to be fused, the key vector to be fused and the value vector to be fused based on a cross attention mechanism to obtain an instance fusion characteristic;
S5, predicting the center position and boundary position of each instance based on the instance fusion features, generating a segmentation mask by convolution, and converting the segmentation mask of each predicted instance into a prediction mask;
The step S5 comprises the following steps:
S51, based on the instance fusion features, constructing a feature mapping associated with the instance center to obtain the center position and boundary position of the predicted instance;
the calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
Center_i = σ(MLP(Q_position)),
Boundary_i = MLP_center(Q_ct, Q_p,t-1),
wherein Center_i represents the center position of the i-th predicted instance, σ represents the activation function, MLP() represents the multi-layer perceptron, Q_position represents the query vector corresponding to the center position of the instance, i denotes the i-th instance, Boundary_i represents the boundary position of the i-th predicted instance, MLP_center() represents the multi-layer perceptron used for instance center prediction, Q_ct represents the query vector corresponding to the boundary position of the current instance, and Q_p,t-1 represents the query vector corresponding to the center position of the previous instance;
s52, based on the central position and the boundary position of the predicted instance, obtaining a segmentation mask of each predicted instance by carrying out convolution operation on the instance fusion characteristics;
s53, converting the segmentation mask of each prediction instance into a prediction mask;
the computational expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x)),
where M_i(x) represents the prediction mask of the i-th instance, sigmoid() represents the sigmoid function, and f(x) represents the segmentation mask of each predicted instance;
And S6, according to the bipartite matching method and the graph model, matching and matching degree evaluation are carried out on the prediction mask and the real mask, and instance segmentation of the three-dimensional point cloud is carried out.
2. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein the S1 comprises the steps of:
S11, acquiring point cloud data in a plurality of scenes, and matching corresponding labels with the point cloud data;
S12, carrying out standardization processing and data enhancement processing on the point cloud data after the label is matched;
S13, generating a point cloud data set based on the point cloud data subjected to the standardization processing and the data enhancement processing;
S14, voxelizing the point cloud data of size H × W × 3 in the point cloud data set so as to voxelize the point cloud scene and obtain a point cloud training data set, wherein H represents the height of the point cloud data and W represents the width of the point cloud data.
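As an illustration of the voxelization in step S14, the sketch below assigns the flattened H × W × 3 points to voxel cells and averages each occupied cell; the voxel size and the averaging strategy are assumptions for illustration, not values taken from the patent.

```python
# Minimal voxelization sketch (assumed voxel size and averaging scheme).
import numpy as np

def voxelize(points, voxel_size=0.05):
    """points: (N, 3) xyz coordinates; returns one averaged point per occupied voxel."""
    coords = np.floor(points / voxel_size).astype(np.int64)   # integer voxel indices
    _, inverse, counts = np.unique(coords, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)
    voxel_means = np.zeros((counts.size, 3))
    np.add.at(voxel_means, inverse, points)                   # sum the points falling into each voxel
    return voxel_means / counts[:, None]                      # average -> one representative point per voxel

points = np.random.rand(100_000, 3) * 10.0                    # stand-in for the flattened H*W x 3 point cloud
training_points = voxelize(points)
```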
3. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein the computation expression of the cross-attention mechanism in S46 is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k)V,
wherein Attention() represents the attention mechanism function, Q represents the query vector to be fused, K represents the key vector to be fused, V represents the value vector to be fused, softmax() represents the softmax function, K^T represents the transpose of the key vector to be fused, and d_k represents the dimension of the key vector to be fused.
4. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein the step S6 comprises the steps of:
S61, constructing a graph model according to the bipartite matching method, wherein the nodes in the graph model are the prediction masks and the real masks respectively, and the weights of the edges in the graph model are the similarities between the prediction masks and the real masks;
S62, calculating the matching degree between the prediction mask and the real mask through a mask intersection-over-union (IoU) model (an illustrative sketch of steps S62 to S64 appears after step S66 below);
the calculation expression of the mask intersection-over-union model is as follows:
IOU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|,
wherein IOU(M_pred, M_gt) represents the matching degree between the prediction mask and the real mask, M_pred represents the prediction mask, M_gt represents the real mask, |·| represents the absolute value (set size), ∩ represents the intersection operation, and ∪ represents the union operation;
S63, setting a matching degree threshold, and taking the prediction mask and the real mask as a matching mask pair when the matching degree between the prediction mask and the real mask is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching degree between the prediction mask and the real mask;
the computational expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IOU(M_pred^i, M_gt^j) × X_ij,
wherein maximize denotes maximization, IOU(M_pred^i, M_gt^j) represents the matching degree between the i-th prediction mask and the j-th real mask, × represents multiplication, M_pred^i represents the i-th prediction mask, M_gt^j represents the j-th real mask, and X_ij represents the mask pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair and the value 0 when they are not selected as a matching pair;
S65, evaluating the matching degree between the optimized prediction mask and the real mask by using the matching loss function to obtain a matching degree evaluation result;
S66, optimizing the prediction masks and the prediction of instance centers based on the matching degree evaluation result, and executing the instance segmentation of the three-dimensional point cloud.
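The sketch below illustrates steps S62 to S64 under assumed mask shapes and an assumed threshold of 0.5: a pairwise mask IoU matrix is computed, a globally optimal assignment is obtained with the Hungarian algorithm (here via scipy's linear_sum_assignment), and assigned pairs whose matching degree is below the threshold are discarded. Applying the threshold after the assignment is a simplification of the order recited in the claim.

```python
# Mask IoU matrix + Hungarian matching sketch (mask shapes and threshold are assumptions).
import numpy as np
from scipy.optimize import linear_sum_assignment

def mask_iou(pred_masks, gt_masks):
    """pred_masks: (P, N), gt_masks: (G, N) boolean per-point masks -> (P, G) IoU matrix."""
    inter = pred_masks.astype(np.float64) @ gt_masks.T.astype(np.float64)         # |M_pred ∩ M_gt|
    union = pred_masks.sum(1, keepdims=True) + gt_masks.sum(1)[None, :] - inter   # |M_pred ∪ M_gt|
    return inter / np.maximum(union, 1e-6)

def match(pred_masks, gt_masks, iou_threshold=0.5):
    iou = mask_iou(pred_masks, gt_masks)
    rows, cols = linear_sum_assignment(iou, maximize=True)   # maximize the total matching degree
    keep = iou[rows, cols] > iou_threshold                   # matching-degree threshold (step S63)
    return list(zip(rows[keep], cols[keep])), iou

pred = np.random.rand(20, 4096) > 0.5
gt = np.random.rand(15, 4096) > 0.5
pairs, iou_matrix = match(pred, gt)
```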
5. The three-dimensional point cloud-based instance segmentation method according to claim 4, wherein the calculation expression of the matching loss function in S65 is as follows:
L = λ_cls · L_cls + λ_mask · L_mask + λ_dice · L_dice + λ_center · L_center,
L_cls = CE(pred_class, true_class),
L_mask = CE(pred_mask, true_mask),
L_dice = 1 − 2|pred_mask ∩ true_mask| / (|pred_mask| + |true_mask|),
L_center = ‖Center_pred − Center_gt‖,
wherein L represents the matching loss function, λ_cls represents the classification loss weight coefficient, L_cls represents the classification loss function, λ_mask represents the mask binary loss weight coefficient, L_mask represents the mask binary loss function, λ_dice represents the Dice loss weight coefficient, L_dice represents the Dice loss function, λ_center represents the center regression loss weight coefficient, L_center represents the center regression loss function, CE() represents the cross-entropy loss function, pred_class represents the class of a predicted instance, true_class represents the class of a real instance, pred_mask represents the prediction mask, true_mask represents the real mask, Center_pred and Center_gt represent the predicted and real instance center positions respectively, and ‖·‖ is a norm operation.
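A hedged sketch of a weighted matching loss in the spirit of claim 5, assuming cross-entropy for the classification term, binary cross-entropy for the mask term, a Dice term, and an L2 center regression term; the exact loss definitions, the λ weights, and all tensor shapes are assumptions for illustration.

```python
# Illustrative weighted matching loss (assumed loss forms and default weights of 1.0).
import torch
import torch.nn.functional as F

def matching_loss(pred_logits, true_class, pred_mask, true_mask, pred_center, true_center,
                  l_cls=1.0, l_mask=1.0, l_dice=1.0, l_center=1.0):
    loss_cls = F.cross_entropy(pred_logits, true_class)                     # L_cls = CE(pred_class, true_class)
    loss_mask = F.binary_cross_entropy_with_logits(pred_mask, true_mask)    # binary mask loss
    prob = torch.sigmoid(pred_mask)
    inter = (prob * true_mask).sum(-1)
    loss_dice = 1 - (2 * inter + 1) / (prob.sum(-1) + true_mask.sum(-1) + 1) # Dice term
    loss_center = torch.norm(pred_center - true_center, dim=-1)             # center regression (norm)
    return (l_cls * loss_cls + l_mask * loss_mask
            + l_dice * loss_dice.mean() + l_center * loss_center.mean())

loss = matching_loss(torch.randn(20, 18), torch.randint(0, 18, (20,)),
                     torch.randn(20, 4096), (torch.rand(20, 4096) > 0.5).float(),
                     torch.randn(20, 3), torch.randn(20, 3))
```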