
CN118351320B - Instance segmentation method based on three-dimensional point cloud - Google Patents

Instance segmentation method based on three-dimensional point cloud

Info

Publication number
CN118351320B
Authority
CN
China
Prior art keywords
point cloud
mask
instance
prediction
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410780784.8A
Other languages
Chinese (zh)
Other versions
CN118351320A (en)
Inventor
潘磊
董华征
栾五洋
王艾
李俊辉
黄洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Civil Aviation Flight University of China
Inspur Intelligent IoT Technology Co Ltd
Original Assignee
Civil Aviation Flight University of China
Inspur Intelligent IoT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Civil Aviation Flight University of China and Inspur Intelligent IoT Technology Co Ltd
Priority to CN202410780784.8A
Publication of CN118351320A
Application granted
Publication of CN118351320B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an instance segmentation method based on a three-dimensional point cloud, belonging to the technical field of image processing and computer vision and comprising the following steps: acquiring and preprocessing point cloud data to obtain a point cloud training data set; extracting features from the point cloud data to obtain point-cloud-level data features and the comprehensive feature of each superpoint; performing classification and aggregation with an encoding and binary clustering aggregation method to obtain an initialization position code, an initialization position query vector, and a high-density point query vector; performing vector fusion based on a cross-attention mechanism to obtain instance fusion features; predicting the center position and boundary position of each instance, generating a segmentation mask by convolution, and converting the segmentation mask into a prediction mask; and matching the prediction masks with the real masks and evaluating the matching degree according to a bipartite matching method and a graph model, thereby performing instance segmentation of the three-dimensional point cloud. The method addresses the limitations of existing three-dimensional point cloud instance segmentation techniques in handling closely adjacent objects and in real-time scene understanding.

Description

Instance segmentation method based on three-dimensional point cloud
Technical Field
The invention belongs to the technical field of image processing and computer vision, and particularly relates to an instance segmentation method that integrates query vectors and density clustering for 3D instance segmentation.
Background
Three-dimensional instance segmentation is a critical task in the field of computer vision, involving the accurate segmentation and identification of individual objects from three-dimensional point cloud data. Accurate three-dimensional instance segmentation not only improves the environment perception capability of systems in fields such as autonomous driving, robot navigation, and virtual reality, but also helps to achieve safer and richer functionality.
Three-dimensional point cloud data is typically collected by lidar, structured-light scanners, or stereo vision systems, and is unstructured and highly complex. Processing such data requires an understanding of its unique spatial structure and dense point distribution. Traditional three-dimensional scene understanding methods tend to be limited to predefined categories and supervised learning techniques that rely on large amounts of annotated data. Unlike traditional two-dimensional images, point cloud data is unevenly distributed in space and is often affected by occlusion and noise, so its three-dimensional spatial structure and topological relationships must be effectively understood during processing.
In recent years, the introduction of deep learning has brought revolutionary progress to three-dimensional scene understanding: more complex and abstract feature representations can be learned, significantly improving segmentation accuracy and robustness. However, existing methods often depend on large amounts of annotated data and computing resources, which limits the flexibility and scalability of instance segmentation, and they struggle to effectively handle tight overlap between instances, multiple categories, and real-time segmentation in dynamic environments.
Disclosure of Invention
To address the above shortcomings of the prior art, the instance segmentation method based on a three-dimensional point cloud provided by the invention combines binary clustering with query vectors to overcome the limitations of existing three-dimensional point cloud instance segmentation techniques in handling closely adjacent objects and in real-time scene understanding.
To achieve the above object, the invention adopts the following technical solution:
The invention provides an instance segmentation method based on a three-dimensional point cloud, comprising the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
S2, extracting features from the point cloud data in the point cloud training data set to obtain point-cloud-level data features and the comprehensive feature of each superpoint;
S3, performing classification and aggregation on the point-cloud-level data features with an encoding and binary clustering aggregation method to obtain an initialization position code, an initialization position query vector, and a high-density point query vector;
S4, performing vector fusion based on a cross-attention mechanism according to the comprehensive feature of each superpoint, the initialization position code, the initialization position query vector, and the high-density point query vector to obtain instance fusion features;
S5, predicting the center position and boundary position of each instance based on the instance fusion features, generating a segmentation mask by convolution, and converting the segmentation mask of each predicted instance into a prediction mask;
S6, matching the prediction masks with the real masks and evaluating the matching degree according to a bipartite matching method and a graph model, and performing instance segmentation of the three-dimensional point cloud.
The beneficial effects of the invention are as follows: in the instance segmentation method based on a three-dimensional point cloud, feature extraction is performed on the point cloud data to obtain point-cloud-level data features and the comprehensive feature of each superpoint; the high-density point query vectors are extracted through binary clustering, which improves the spatial recognition ability of the query vectors, accurately represents the key feature points of dense regions, and enhances the ability to understand and process complex three-dimensional scenes. Based on the comprehensive feature of each superpoint, the initialization position code, the initialization position query vector, and the high-density point query vector, vector fusion is performed through a cross-attention mechanism to obtain instance fusion features, which improves the accuracy of prediction mask generation. Through the matching between the prediction masks and the real masks and the optimization of the matching evaluation, the invention significantly improves the processing efficiency of unstructured three-dimensional point cloud data and has clear advantages in instance segmentation of three-dimensional point cloud data, especially when handling multiple instances that are closely connected in space.
Further, the step S1 includes the following steps:
S11, acquiring point cloud data of a plurality of scenes and matching the point cloud data with corresponding labels;
S12, performing standardization and data augmentation on the labeled point cloud data;
S13, generating a point cloud data set from the standardized and augmented point cloud data;
S14, voxelizing the point cloud data of size H × W × 3 in the point cloud data set and voxelizing the point cloud scene to obtain the point cloud training data set, where H denotes the height of the point cloud data and W denotes the width of the point cloud data.
The beneficial effects of adopting the further scheme are as follows: by standardizing and augmenting the point cloud data, the invention effectively enriches the data with different viewing angles and scales.
Further, the step S2 includes the following steps:
S21, performing feature conversion on the point cloud data in the point cloud training data set with an input convolution layer to obtain initial point cloud data features;
S22, performing multi-scale feature extraction on the initial point cloud data features with a pre-trained sparse 3D U-Net model to obtain multi-scale point cloud data features;
S23, adjusting the feature dimensions of the multi-scale point cloud data features with a linear layer to obtain normalized point cloud data features;
S24, reconstructing the mapping relation between the normalized point cloud data features and the point cloud data in the point cloud training data set with a mapping table to obtain the point-cloud-level data features;
S25, pooling the normalized point cloud data features according to the superpoint identifiers to obtain the pooled feature of each superpoint;
S26, aggregating the features within the same superpoint according to the pooling type adopted for the pooled feature of each superpoint to obtain the comprehensive feature of each superpoint.
The beneficial effects of adopting the further scheme are as follows: through the encoder-decoder architecture and skip connections of the 3D U-Net model, the invention effectively extracts features at multiple scales, ensuring that features from global to local are captured and used; reconstructing the mapping relation with the mapping table ensures the spatial consistency of the features; and obtaining the comprehensive feature of each superpoint through pooling and aggregation reduces the data volume while retaining the necessary information, which facilitates the processing of large-scale point cloud data.
Further, the step S3 includes the following steps:
S31, encoding the spatial position information of the point-cloud-level data features to obtain the initialization position code;
S32, generating the initialization position query vector from the initialization position code;
S33, for each point in the point cloud, counting the number of neighboring points within a preset radius of the point according to the point-cloud-level data features, and taking this count as the local density of the point;
S34, setting a density classification threshold;
S35, taking the points in the point cloud whose local density is greater than the density classification threshold as high-density points;
S36, aggregating all the high-density points to obtain a high-density point set;
S37, extracting the features of the high-density point set to obtain the high-density point query vector.
The beneficial effects of adopting the further scheme are as follows: the invention obtains the initialization position code through the encoding process and correspondingly generates the initialization position query vector; binary clustering is realized by setting the density classification threshold, the high-density points in the point cloud are identified and aggregated into a high-density point set, and the high-density point query vector is obtained through feature extraction, which effectively improves instance recognition efficiency and instance segmentation precision and is particularly suitable for dense or complex three-dimensional scenes.
Further, the step S4 includes the following steps:
S41, calculating the similarity between each initialization position query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a first weight through a softmax function;
S42, matching the first weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new initialization position query vector;
S43, calculating the similarity between each high-density point query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a second weight through a softmax function;
S44, matching the second weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new high-density point query vector;
S45, taking the initialization position code and the comprehensive features of the superpoints as the key vector to be fused and the value vector to be fused, and taking the new initialization position query vector and the new high-density point query vector as the query vector to be fused;
S46, fusing the query vector to be fused with the key vector to be fused and the value vector to be fused based on a cross-attention mechanism to obtain the instance fusion features.
The beneficial effects of adopting the further scheme are as follows: the invention adopts a self-attention mechanism to construct a new initialization position query vector and a new high-density point query vector according to the similarity between the initialization position query vector and the key and value vectors and the similarity between the high-density point query vector and the key and value vectors, and then adopts a cross-attention mechanism that combines the initialization position code and the comprehensive feature of each superpoint to realize vector fusion, constructing a new vector query mode and providing a basis for improving the accuracy of prediction mask generation.
Further, the calculation expression of the cross-attention mechanism in S46 is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V
where Attention() denotes the attention mechanism function, Q denotes the query vector to be fused, K denotes the key vector to be fused, V denotes the value vector to be fused, softmax() denotes the softmax function, K^T denotes the transpose of the key vector to be fused, and d_k denotes the dimension of the key vector to be fused.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method for the cross-attention mechanism that fuses the comprehensive feature of each superpoint with the new initialization position query vector and the new high-density point query vector produced by the self-attention mechanism, thereby effectively integrating information from different feature spaces and improving the analysis capability.
Further, the step S5 includes the following steps:
S51, constructing a feature map associated with the instance centers based on the instance fusion features to obtain the center position and boundary position of each predicted instance;
S52, obtaining the segmentation mask of each predicted instance by performing a convolution operation on the instance fusion features based on the center position and boundary position of the predicted instance;
S53, converting the segmentation mask of each predicted instance into a prediction mask;
the calculation expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x))
where M_i(x) denotes the prediction mask of the i-th instance, sigmoid() denotes the sigmoid function, and f(x) denotes the segmentation mask of each predicted instance.
The beneficial effects of adopting the further scheme are as follows: the invention provides a method for mask segmentation and mask prediction based on the instance fusion features, which obtains the prediction masks and provides a basis for performing the instance segmentation task.
Further, the calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
where Center_i denotes the center position of the i-th predicted instance, σ denotes the activation function, MLP() denotes the multi-layer perceptron, Q_position denotes the query vector corresponding to the center position of the instance, i denotes the i-th instance, Bound_i denotes the boundary position of the i-th predicted instance, MLP_center() denotes the multi-layer perceptron for instance center prediction, Q_ct denotes the query vector corresponding to the boundary position of the current instance, and Q_pt-1 denotes the query vector corresponding to the center position of the previous instance.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method for the center position and the boundary position of a predicted instance; by accurately calculating the center position and the boundary position, it provides a basis for accurately performing mask segmentation and obtaining the prediction mask.
Further, the step S6 includes the steps of:
S61, constructing a graph model according to the bipartite matching method, where the nodes of the graph model are the prediction masks and the real masks, and the weight of each edge is the similarity between a prediction mask and a real mask;
S62, calculating the matching degree between each prediction mask and each real mask through a mask intersection-over-union (IoU) model;
the calculation expression of the mask IoU model is as follows:
IoU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|
where IoU(M_pred, M_gt) denotes the matching degree between the prediction mask and the real mask, M_pred denotes the prediction mask, M_gt denotes the real mask, | · | denotes the cardinality operation, ∩ denotes the intersection operation, and ∪ denotes the union operation;
S63, setting a matching degree threshold, and taking a prediction mask and a real mask as a matching mask pair when the matching degree between them is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching degree between the prediction masks and the real masks;
the calculation expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IoU(M_pred^i, M_gt^j) × X_ij
where maximize denotes maximization, IoU(M_pred^i, M_gt^j) denotes the matching degree between the i-th prediction mask and the j-th real mask, × denotes multiplication, M_pred^i denotes the i-th prediction mask, M_gt^j denotes the j-th real mask, and X_ij denotes the mask-pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair and 0 when they are not selected as a matching pair;
S65, evaluating the matching degree between the optimized prediction masks and the real masks with the matching loss function to obtain a matching degree evaluation result;
S66, optimizing the prediction masks and the prediction of the instance centers based on the matching degree evaluation result, and performing instance segmentation of the three-dimensional point cloud.
The beneficial effects of adopting the further scheme are as follows: the invention provides a method for matching the prediction masks with the real masks and for optimizing and evaluating the matching result; performing instance segmentation of the three-dimensional point cloud based on the optimization process and the matching degree evaluation result greatly improves the segmentation efficiency as well as the accuracy and completeness of the segmentation result.
Further, the calculation expression of the matching loss function in S65 is as follows:
L = λ_cls · L_cls + λ_mask · L_mask + λ_dice · L_dice + λ_center · L_center
where L denotes the matching loss function, λ_cls denotes the classification loss weight coefficient, L_cls denotes the classification loss function, λ_mask denotes the mask binary loss weight coefficient, L_mask denotes the mask binary loss function, λ_dice denotes the intersection-over-union (Dice) loss weight coefficient, L_dice denotes the intersection-over-union (Dice) loss function, λ_center denotes the center regression loss weight coefficient, L_center denotes the center regression loss function, CE() denotes the cross-entropy loss function, pred_class denotes the class of a predicted instance, true_class denotes the class of a real instance, pred_mask denotes the prediction mask, true_mask denotes the real mask, and ‖·‖ denotes the norm operation.
The beneficial effects of adopting the further scheme are as follows: the invention provides a calculation method for the matching loss function, which comprehensively evaluates the match between the prediction masks and the real masks in terms of the predicted class, the pixel-level matching precision, the overlap area, and the error between center points; while ensuring classification accuracy, it optimizes both the matching between the prediction masks and the real masks and the prediction of the instance centers, thereby improving the overall performance on the three-dimensional point cloud instance segmentation task.
Further advantages of the invention are described in more detail in the following embodiments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be regarded as limiting the scope; other related drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of the steps of an instance segmentation method based on a three-dimensional point cloud according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the figures may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments is not intended to limit the scope of the claimed invention but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of the present invention.
As shown in Fig. 1, in one embodiment, the present invention provides an instance segmentation method based on a three-dimensional point cloud, comprising the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
The step S1 comprises the following steps:
S11, acquiring point cloud data of a plurality of scenes and matching the point cloud data with corresponding labels;
S12, performing standardization and data augmentation on the labeled point cloud data;
In this embodiment, the point cloud data is standardized by randomly applying scale normalization or centering, data augmentation is performed by random rotation or scaling, and the point cloud data is thereby expanded with different viewing angles and sizes;
S13, generating a point cloud data set from the standardized and augmented point cloud data;
In this embodiment, the information in the point cloud data set includes the scene ID, voxel coordinates, the mapping from point cloud data to voxels, the mapping from voxels to point cloud data, the shape of the discrete voxel space, the features of the point cloud data, the superpoint identifiers, the batch offsets, the instance labels, and the floating-point coordinates of the point cloud;
The scene ID uniquely identifies a scene; the voxel coordinates represent the coordinates of the point cloud data in the discrete voxel space; the mapping from point cloud data to voxels maps the points in the point cloud data to the corresponding voxels; the mapping from voxels to point cloud data maps the points in the voxels to the corresponding point cloud data; the shape of the discrete voxel space represents the size of the voxel grid; the features of the point cloud data include the position, color, normal vector, and other attributes of each point; the superpoint identifiers represent high-level features used to improve point cloud processing performance; the batch offsets identify the data boundaries of different scenes during batch processing; the instance labels indicate the instance to which each point in the point cloud data belongs; and the floating-point coordinates represent the floating-point coordinate information of the point cloud data.
S14, voxelizing the point cloud data of size H × W × 3 in the point cloud data set and voxelizing the point cloud scene with Open3D to obtain the point cloud training data set, where H denotes the height of the point cloud data and W denotes the width of the point cloud data.
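For illustration only, the following Python sketch shows one possible realization of the preprocessing in S11-S14, using Open3D for voxelization as mentioned above; the centering, the single random rotation, and the voxel size are illustrative choices rather than values fixed by the invention.

```python
# Minimal sketch of S12-S14: standardization, one augmentation step, and voxelization.
import numpy as np
import open3d as o3d

def preprocess_point_cloud(points: np.ndarray, voxel_size: float = 0.02) -> o3d.geometry.VoxelGrid:
    """points: (N, 3) float array of raw point coordinates."""
    # Scale normalization / centering (S12): shift to the centroid and scale into a unit sphere.
    points = points - points.mean(axis=0)
    points = points / (np.max(np.linalg.norm(points, axis=1)) + 1e-9)

    # Random rotation about the vertical axis as a simple data-augmentation step (S12).
    theta = np.random.uniform(0, 2 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    points = points @ rot.T

    # Voxelization of the point cloud scene (S14).
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    return o3d.geometry.VoxelGrid.create_from_point_cloud(pcd, voxel_size=voxel_size)
```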
S2, extracting features from the point cloud data in the point cloud training data set to obtain point-cloud-level data features and the comprehensive feature of each superpoint;
the step S2 comprises the following steps:
S21, performing feature conversion on the point cloud data in the point cloud training data set with an input convolution layer to obtain initial point cloud data features; the input convolution layer performs a preliminary feature conversion of the raw point cloud data and prepares it for the multi-scale analysis and feature extraction of the deep network;
S22, performing multi-scale feature extraction on the initial point cloud data features with a pre-trained sparse 3D U-Net model to obtain multi-scale point cloud data features; through its encoder-decoder architecture and skip connections, the pre-trained sparse 3D U-Net model effectively extracts features from the initial point cloud data features at multiple scales, ensures that features from global to local can be captured and used, and is applicable to complex 3D structures and objects of various scales.
S23, adjusting the feature dimensions of the multi-scale point cloud data features with a linear layer to obtain normalized point cloud data features; in this embodiment, the linear layer applies a normalization function followed by a ReLU activation on the normalization result, so that the feature dimensions of the normalized point cloud data features match the subsequent feature mapping and pooling operations.
S24, reconstructing the mapping relation between the normalized point cloud data features and the point cloud data in the point cloud training data set with a mapping table to obtain the point-cloud-level data features; the remapping uses a mapping table v2p_map to associate the point cloud data features after multi-scale feature extraction and normalization with the point cloud data in the point cloud training data set, which is essential for processing irregular point cloud data and guarantees the consistency of the feature space. The remapped point-cloud-level data features retain the resolution of the point cloud level and the detail information of the point cloud data, and can be used directly for tasks with high precision requirements such as query vector generation and instance segmentation.
S25, pooling the normalized point cloud data features according to the superpoint identifiers to obtain the pooled feature of each superpoint;
S26, aggregating the features within the same superpoint according to the pooling type adopted for the pooled feature of each superpoint to obtain the comprehensive feature of each superpoint. In this embodiment, the features within the same superpoint are aggregated by mean pooling or max pooling to obtain the comprehensive feature of each superpoint, which reduces the data volume while retaining the necessary information and facilitates the processing of large-scale point cloud data.
S3, performing classification and aggregation on the point-cloud-level data features with an encoding and binary clustering aggregation method to obtain an initialization position code, an initialization position query vector, and a high-density point query vector;
the step S3 comprises the following steps:
S31, encoding the spatial position information of the point-cloud-level data features to obtain the initialization position code;
S32, generating the initialization position query vector from the initialization position code;
S33, for each point in the point cloud, counting the number of neighboring points within a preset radius of the point according to the point-cloud-level data features, and taking this count as the local density of the point;
S34, setting a density classification threshold;
S35, taking the points in the point cloud whose local density is greater than the density classification threshold as high-density points;
S36, aggregating all the high-density points to obtain a high-density point set;
S37, extracting the features of the high-density point set to obtain the high-density point query vector. In this embodiment, the high-density point set contains key structural information and is an important focus for instance recognition and scene segmentation; the high-density point query vector can be used to guide instance recognition and instance segmentation in the backbone architecture, which improves instance recognition efficiency and instance segmentation accuracy, and is particularly suitable for dense, complex three-dimensional scenes where the boundaries between instances are not obvious.
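As one way to realize the binary clustering of S33-S37, the sketch below counts neighbours with a k-d tree; scipy's cKDTree is an assumed implementation choice, and the radius and density threshold are illustrative values rather than ones specified by the invention.

```python
# Local density = number of neighbours within a preset radius; points above the
# threshold form the high-density set used to build the high-density point query vectors.
import numpy as np
from scipy.spatial import cKDTree

def high_density_points(coords: np.ndarray,
                        feats: np.ndarray,
                        radius: float = 0.1,
                        density_threshold: int = 16):
    """coords: (N, 3) point coordinates; feats: (N, C) point-cloud-level features."""
    tree = cKDTree(coords)
    # Neighbour count of each point inside the preset radius (the point itself is included).
    local_density = np.array([len(idx) for idx in tree.query_ball_point(coords, r=radius)])
    high_mask = local_density > density_threshold        # binary clustering (S34-S35)
    return coords[high_mask], feats[high_mask]           # high-density set and its features (S36-S37)
```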
S4, performing vector fusion based on a cross-attention mechanism according to the comprehensive feature of each superpoint, the initialization position code, the initialization position query vector, and the high-density point query vector to obtain instance fusion features;
The step S4 comprises the following steps:
S41, calculating the similarity between each initialization position query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a first weight through a softmax function; in this embodiment, the similarity between each initialization position query vector and the key and value vectors is calculated by dot-product operations;
S42, matching the first weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new initialization position query vector;
S43, calculating the similarity between each high-density point query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a second weight through a softmax function;
S44, matching the second weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new high-density point query vector;
S45, taking the initialization position code and the comprehensive features of the superpoints as the key vector to be fused and the value vector to be fused, and taking the new initialization position query vector and the new high-density point query vector as the query vector to be fused;
S46, fusing the query vector to be fused with the key vector to be fused and the value vector to be fused based on a cross-attention mechanism to obtain the instance fusion features;
The calculation expression of the cross-attention mechanism is as follows:
Attention(Q, K, V) = softmax(QK^T / √d_k) · V
where Attention() denotes the attention mechanism function, Q denotes the query vector to be fused, K denotes the key vector to be fused, V denotes the value vector to be fused, softmax() denotes the softmax function, K^T denotes the transpose of the key vector to be fused, and d_k denotes the dimension of the key vector to be fused. Scaling the dot products by the key dimension prevents gradient problems caused by excessively large dot products. The softmax function normalizes the dot-product results so that the output weight distribution is reasonable. The instance fusion features can be used for instance recognition and segmentation, accurate localization of the boundary of each instance, and prediction of the instance center and the corresponding class label.
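The scaled dot-product cross-attention of S46 can be written compactly as below; the sketch assumes the refined query vectors are stacked row-wise into Q and that the position codes and superpoint comprehensive features supply K and V.

```python
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import torch
import torch.nn.functional as F

def cross_attention(Q: torch.Tensor, K: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Q: (nq, d_k); K: (nk, d_k); V: (nk, d_v) -> instance fusion features (nq, d_v)."""
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5   # scaled dot products
    weights = F.softmax(scores, dim=-1)             # normalized attention weights
    return weights @ V                              # fused features
```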
S5, predicting the center position and boundary position of each instance based on the instance fusion features, generating a segmentation mask by convolution, and converting the segmentation mask of each predicted instance into a prediction mask;
The step S5 comprises the following steps:
S51, constructing a feature map associated with the instance centers based on the instance fusion features to obtain the center position and boundary position of each predicted instance;
The calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
where Center_i denotes the center position of the i-th predicted instance, σ denotes the activation function, MLP() denotes the multi-layer perceptron, Q_position denotes the query vector corresponding to the center position of the instance, i denotes the i-th instance, Bound_i denotes the boundary position of the i-th predicted instance, MLP_center() denotes the multi-layer perceptron for instance center prediction, Q_ct denotes the query vector corresponding to the boundary position of the current instance, and Q_pt-1 denotes the query vector corresponding to the center position of the previous instance; in this embodiment, the activation function is the sigmoid activation function;
S52, obtaining the segmentation mask of each predicted instance by performing a convolution operation on the instance fusion features based on the center position and boundary position of the predicted instance; the segmentation mask is a binary image in which a pixel value of 1 indicates the predicted instance and a pixel value of 0 indicates the background or other instances;
S53, converting the segmentation mask of each predicted instance into a prediction mask; in this embodiment, the prediction mask represents the probability that each pixel belongs to a given instance, and the segmentation mask can be converted into the prediction mask through a sigmoid or softmax function;
The calculation expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x))
where M_i(x) denotes the prediction mask of the i-th instance, sigmoid() denotes the sigmoid function, and f(x) denotes the segmentation mask of each predicted instance.
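As an illustration of S52-S53, the sketch below treats the per-instance segmentation logits f(x) as the correlation between an instance fusion feature and the point-level features (a 1x1 convolution is equivalent to this dot product) and applies the sigmoid to obtain M_i(x); the dot-product realization is an assumption, not a requirement of the invention.

```python
# Convert per-instance segmentation logits into prediction masks M_i(x) = sigmoid(f(x)).
import torch

def prediction_masks(instance_feats: torch.Tensor, point_feats: torch.Tensor) -> torch.Tensor:
    """instance_feats: (I, C) fusion feature per predicted instance; point_feats: (N, C)."""
    logits = instance_feats @ point_feats.t()   # f(x): (I, N) segmentation logits
    return torch.sigmoid(logits)                # M_i(x): probability each point belongs to instance i
```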
S6, matching the prediction masks with the real masks and evaluating the matching degree according to a bipartite matching method and a graph model, and performing instance segmentation of the three-dimensional point cloud. In the instance segmentation task, particularly when a scene contains multiple overlapping instances, the generated prediction masks must be matched accurately.
The step S6 comprises the following steps:
S61, constructing a graph model according to the bipartite matching method, where the nodes of the graph model are the prediction masks and the real masks, and the weight of each edge is the similarity between a prediction mask and a real mask; the similarity between a prediction mask and a real mask reflects the ratio of the cross-overlap area between different instances in the scene;
S62, calculating the matching degree between each prediction mask and each real mask through a mask intersection-over-union (IoU) model;
The calculation expression of the mask IoU model is as follows:
IoU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|
where IoU(M_pred, M_gt) denotes the matching degree between the prediction mask and the real mask, M_pred denotes the prediction mask, M_gt denotes the real mask, | · | denotes the cardinality operation, ∩ denotes the intersection operation, and ∪ denotes the union operation;
S63, setting a matching degree threshold, and taking a prediction mask and a real mask as a matching mask pair when the matching degree between them is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching degree between the prediction masks and the real masks;
The calculation expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IoU(M_pred^i, M_gt^j) × X_ij
where maximize denotes maximization, IoU(M_pred^i, M_gt^j) denotes the matching degree between the i-th prediction mask and the j-th real mask, × denotes multiplication, M_pred^i denotes the i-th prediction mask, M_gt^j denotes the j-th real mask, and X_ij denotes the mask-pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair and 0 when they are not selected as a matching pair;
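A sketch of S61-S64 under the assumption that masks are boolean point-membership arrays: the IoU matrix serves as the edge weights of the bipartite graph, and scipy's linear_sum_assignment (an implementation of the Hungarian algorithm) computes the globally optimal pairing; the 0.5 IoU threshold is illustrative.

```python
# Bipartite matching of prediction masks to real masks by maximizing total IoU.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_masks(pred_masks: np.ndarray, gt_masks: np.ndarray, iou_threshold: float = 0.5):
    """pred_masks: (P, N) boolean; gt_masks: (G, N) boolean point-membership masks."""
    pred_f = pred_masks.astype(np.float64)
    gt_f = gt_masks.astype(np.float64)
    inter = pred_f @ gt_f.T                                              # |M_pred ∩ M_gt|
    union = pred_f.sum(1, keepdims=True) + gt_f.sum(1, keepdims=True).T - inter
    iou = inter / np.clip(union, 1e-6, None)                             # (P, G) matching degrees
    rows, cols = linear_sum_assignment(-iou)                             # Hungarian algorithm (maximize IoU)
    return [(i, j, iou[i, j]) for i, j in zip(rows, cols) if iou[i, j] > iou_threshold]
```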
S65, evaluating the matching degree between the optimized prediction masks and the real masks with the matching loss function to obtain a matching degree evaluation result;
The calculation expression of the matching loss function in S65 is as follows:
L = λ_cls · L_cls + λ_mask · L_mask + λ_dice · L_dice + λ_center · L_center
where L denotes the matching loss function, λ_cls denotes the classification loss weight coefficient, L_cls denotes the classification loss function, λ_mask denotes the mask binary loss weight coefficient, L_mask denotes the mask binary loss function, λ_dice denotes the intersection-over-union (Dice) loss weight coefficient, L_dice denotes the intersection-over-union (Dice) loss function, λ_center denotes the center regression loss weight coefficient, L_center denotes the center regression loss function, CE() denotes the cross-entropy loss function, pred_class denotes the class of a predicted instance, true_class denotes the class of a real instance, pred_mask denotes the prediction mask, true_mask denotes the real mask, and ‖·‖ denotes the norm operation; the classification loss function evaluates the agreement between the predicted class and the real class, and in this embodiment it is calculated with the cross-entropy loss function; the mask binary loss function evaluates the pixel-level matching precision between the prediction mask and the real mask, and in this embodiment it is calculated with the binary cross-entropy loss function; the intersection-over-union loss function optimizes the IoU of the prediction mask and the real mask and increases their overlap area; the center regression loss function optimizes the error between the predicted instance center and the real instance center, and in this embodiment the least absolute deviation (L1) function is used to measure the distance between the predicted and real instance centers.
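Assuming the combined loss is the weighted sum L = λ_cls·L_cls + λ_mask·L_mask + λ_dice·L_dice + λ_center·L_center given above, with the cross-entropy, binary cross-entropy, Dice, and L1 terms described in this embodiment, a minimal sketch is shown below; the unit weights are placeholders.

```python
# Weighted matching loss combining classification, mask, Dice, and center regression terms.
import torch
import torch.nn.functional as F

def matching_loss(pred_logits, true_class, pred_mask, true_mask,
                  pred_center, true_center,
                  w_cls=1.0, w_mask=1.0, w_dice=1.0, w_center=1.0):
    """pred_logits: (I, num_classes); true_class: (I,) long; pred_mask/true_mask: (I, N) floats in [0, 1]."""
    l_cls = F.cross_entropy(pred_logits, true_class)                     # classification loss
    l_mask = F.binary_cross_entropy(pred_mask, true_mask)                # pixel/point-level mask loss
    inter = (pred_mask * true_mask).sum()
    l_dice = 1 - 2 * inter / (pred_mask.sum() + true_mask.sum() + 1e-6)  # Dice (overlap) loss
    l_center = F.l1_loss(pred_center, true_center)                       # center regression loss
    return w_cls * l_cls + w_mask * l_mask + w_dice * l_dice + w_center * l_center
```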
S66, optimizing the prediction masks and the prediction of the instance centers based on the matching degree evaluation result, and performing instance segmentation of the three-dimensional point cloud.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto; any variations or substitutions readily conceivable by a person skilled in the art shall fall within the scope of the present invention.

Claims (5)

1. An instance segmentation method based on a three-dimensional point cloud, characterized by comprising the following steps:
S1, acquiring and preprocessing point cloud data to obtain a point cloud training data set;
S2, extracting features from the point cloud data in the point cloud training data set to obtain point-cloud-level data features and the comprehensive feature of each superpoint;
the step S2 comprises the following steps:
S21, performing feature conversion on the point cloud data in the point cloud training data set with an input convolution layer to obtain initial point cloud data features;
S22, performing multi-scale feature extraction on the initial point cloud data features with a pre-trained sparse 3D U-Net model to obtain multi-scale point cloud data features;
S23, adjusting the feature dimensions of the multi-scale point cloud data features with a linear layer to obtain normalized point cloud data features;
S24, reconstructing the mapping relation between the normalized point cloud data features and the point cloud data in the point cloud training data set with a mapping table to obtain the point-cloud-level data features;
S25, pooling the normalized point cloud data features according to the superpoint identifiers to obtain the pooled feature of each superpoint;
S26, aggregating the features within the same superpoint according to the pooling type adopted for the pooled feature of each superpoint to obtain the comprehensive feature of each superpoint;
S3, performing classification and aggregation on the point-cloud-level data features with an encoding and binary clustering aggregation method to obtain an initialization position code, an initialization position query vector, and a high-density point query vector;
the step S3 comprises the following steps:
S31, encoding the spatial position information of the point-cloud-level data features to obtain the initialization position code;
S32, generating the initialization position query vector from the initialization position code;
S33, for each point in the point cloud, counting the number of neighboring points within a preset radius of the point according to the point-cloud-level data features, and taking this count as the local density of the point;
S34, setting a density classification threshold;
S35, taking the points in the point cloud whose local density is greater than the density classification threshold as high-density points;
S36, aggregating all the high-density points to obtain a high-density point set;
S37, extracting the features of the high-density point set to obtain the high-density point query vector;
S4, performing vector fusion based on a cross-attention mechanism according to the comprehensive feature of each superpoint, the initialization position code, the initialization position query vector, and the high-density point query vector to obtain instance fusion features;
The step S4 comprises the following steps:
S41, calculating the similarity between each initialization position query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a first weight through a softmax function;
S42, matching the first weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new initialization position query vector;
S43, calculating the similarity between each high-density point query vector and the key and value vectors with a self-attention mechanism, and normalizing the similarity into a second weight through a softmax function;
S44, matching the second weight with the corresponding point-cloud-level data features and performing the corresponding weighted summation to obtain a new high-density point query vector;
S45, taking the initialization position code and the comprehensive features of the superpoints as the key vector to be fused and the value vector to be fused, and taking the new initialization position query vector and the new high-density point query vector as the query vector to be fused;
S46, fusing the query vector to be fused with the key vector to be fused and the value vector to be fused based on a cross-attention mechanism to obtain the instance fusion features;
S5, predicting the center position and boundary position of each instance based on the instance fusion features, generating a segmentation mask by convolution, and converting the segmentation mask of each predicted instance into a prediction mask;
The step S5 comprises the following steps:
S51, constructing a feature map associated with the instance centers based on the instance fusion features to obtain the center position and boundary position of each predicted instance;
The calculation expressions of the center position and the boundary position of the predicted instance in S51 are as follows:
where Center_i denotes the center position of the i-th predicted instance, σ denotes the activation function, MLP() denotes the multi-layer perceptron, Q_position denotes the query vector corresponding to the center position of the instance, i denotes the i-th instance, Bound_i denotes the boundary position of the i-th predicted instance, MLP_center() denotes the multi-layer perceptron for instance center prediction, Q_ct denotes the query vector corresponding to the boundary position of the current instance, and Q_pt-1 denotes the query vector corresponding to the center position of the previous instance;
S52, obtaining the segmentation mask of each predicted instance by performing a convolution operation on the instance fusion features based on the center position and boundary position of the predicted instance;
S53, converting the segmentation mask of each predicted instance into a prediction mask;
the calculation expression of the prediction mask is as follows:
M_i(x) = sigmoid(f(x))
where M_i(x) denotes the prediction mask of the i-th instance, sigmoid() denotes the sigmoid function, and f(x) denotes the segmentation mask of each predicted instance;
S6, matching the prediction masks with the real masks and evaluating the matching degree according to a bipartite matching method and a graph model, and performing instance segmentation of the three-dimensional point cloud.
2. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein S1 comprises the following steps:
S11, acquiring point cloud data of a plurality of scenes and matching the point cloud data with corresponding labels;
S12, performing standardization and data augmentation on the labeled point cloud data;
S13, generating a point cloud data set from the standardized and augmented point cloud data;
S14, voxelizing the point cloud data of size H × W × 3 in the point cloud data set and voxelizing the point cloud scene to obtain the point cloud training data set, where H denotes the height of the point cloud data and W denotes the width of the point cloud data.
3. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein the calculation expression of the cross-attention mechanism in S46 is as follows:
Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V
wherein Attention() represents the attention mechanism function, Q represents the query vector to be fused, K represents the key vector to be fused, V represents the value vector to be fused, softmax() represents the softmax function, K^T represents the transpose of the key vector to be fused, and d_k represents the dimension of the key vector to be fused.
4. The three-dimensional point cloud-based instance segmentation method according to claim 1, wherein the step S6 comprises the steps of:
S61, constructing a graph model according to the bipartite matching method, wherein the nodes in the graph model are the prediction masks and the real masks respectively, and the weight of each edge in the graph model is the similarity between a prediction mask and a real mask;
S62, calculating the matching degree between a prediction mask and a real mask through a mask intersection-over-union (IoU) model;
the calculation expression of the mask intersection-over-union model is as follows:
IOU(M_pred, M_gt) = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|
wherein IOU(M_pred, M_gt) represents the matching degree between the prediction mask and the real mask, M_pred represents the prediction mask, M_gt represents the real mask, |·| represents the absolute value (set cardinality) operation, ∩ represents the intersection operation, and ∪ represents the union operation;
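A minimal PyTorch sketch of the mask IoU in S62, assuming boolean point masks and an added epsilon (an assumption) to avoid division by zero.

```python
import torch

def mask_iou(pred_mask, gt_mask, eps=1e-6):
    """Sketch of S62: IoU between a binary prediction mask and a real mask,
    IOU = |M_pred ∩ M_gt| / |M_pred ∪ M_gt|. Both masks are boolean tensors
    over the points (assumed layout)."""
    intersection = (pred_mask & gt_mask).sum().float()
    union = (pred_mask | gt_mask).sum().float()
    return intersection / (union + eps)
```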
S63, setting a matching degree threshold, and taking a prediction mask and a real mask as a matching mask pair when the matching degree between the prediction mask and the real mask is greater than the matching degree threshold;
S64, constructing a global optimal matching model according to the Hungarian algorithm, and optimizing the matching degree between the prediction masks and the real masks;
the calculation expression of the global optimal matching model is as follows:
maximize Σ_i Σ_j IOU(M_pred^i, M_gt^j) × X_ij
wherein maximize denotes the maximization, IOU(M_pred^i, M_gt^j) represents the matching degree between the i-th prediction mask and the j-th real mask, × represents multiplication, M_pred^i represents the i-th prediction mask, M_gt^j represents the j-th real mask, and X_ij represents the mask pairing indicator function, which takes the value 1 when the i-th prediction mask and the j-th real mask form a matching mask pair, and takes the value 0 when the i-th prediction mask and the j-th real mask are not selected as a matching pair;
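A minimal sketch of S63-S64 using SciPy's Hungarian solver (linear_sum_assignment); negating the IoU matrix turns the maximization into the cost minimization the solver expects, and the threshold value is an assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_masks(iou_matrix, iou_threshold=0.5):
    """Hypothetical sketch of S63-S64: Hungarian matching that maximizes the total
    IoU between prediction masks (rows) and real masks (columns).
    iou_matrix: [num_pred, num_gt] array of pairwise IoUs."""
    rows, cols = linear_sum_assignment(-iou_matrix)   # negate: the solver minimizes cost
    pairs = [(i, j) for i, j in zip(rows, cols) if iou_matrix[i, j] > iou_threshold]
    return pairs  # list of (prediction index, real mask index) matching mask pairs
```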
S65, evaluating the matching degree between the optimized prediction mask and the real mask by using the matching loss function to obtain a matching degree evaluation result;
S66, optimizing the prediction masks and the instance center predictions based on the matching degree evaluation result, and executing the instance segmentation of the three-dimensional point cloud.
5. The three-dimensional point cloud-based instance segmentation method according to claim 4, wherein the calculation expression of the matching loss function in S65 is as follows:
L = λ_cls·L_cls + λ_mask·L_mask + λ_dice·L_dice + λ_center·L_center
wherein L represents the matching loss function, λ_cls represents the classification loss weight coefficient, L_cls represents the classification loss function, λ_mask represents the mask binary loss weight coefficient, L_mask represents the mask binary loss function, λ_dice represents the Dice (intersection-over-union) loss weight coefficient, L_dice represents the Dice (intersection-over-union) loss function, λ_center represents the center regression loss weight coefficient, L_center represents the center regression loss function, CE() represents the cross-entropy loss function, pred_class represents the class of a predicted instance, true_class represents the class of a real instance, pred_mask represents a prediction mask, true_mask represents a real mask, and ‖·‖ represents a norm operation.
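A hypothetical PyTorch sketch of this matching loss; the definitions of the individual terms (binary cross-entropy for the mask term, a Dice term, an L2 center term) and the default weights are assumptions, since only the weighted combination is given above.

```python
import torch
import torch.nn.functional as F

def matching_loss(pred_logits, true_class, pred_mask, true_mask,
                  pred_center, true_center,
                  w_cls=1.0, w_mask=1.0, w_dice=1.0, w_center=1.0):
    """Hypothetical sketch of the matching loss in claim 5:
    L = λ_cls·L_cls + λ_mask·L_mask + λ_dice·L_dice + λ_center·L_center."""
    l_cls = F.cross_entropy(pred_logits, true_class)                      # CE(pred_class, true_class)
    l_mask = F.binary_cross_entropy(pred_mask, true_mask)                 # mask binary loss (assumed BCE)
    inter = (pred_mask * true_mask).sum()
    l_dice = 1 - 2 * inter / (pred_mask.sum() + true_mask.sum() + 1e-6)   # Dice-style overlap loss (assumed)
    l_center = torch.norm(pred_center - true_center, dim=-1).mean()       # center regression loss (assumed L2)
    return w_cls * l_cls + w_mask * l_mask + w_dice * l_dice + w_center * l_center
```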