Search Results (326)

Search Parameters:
Keywords = 3D geometry reconstruction

27 pages, 28012 KiB  
Article
A Model Development Approach Based on Point Cloud Reconstruction and Mapping Texture Enhancement
by Boyang You and Barmak Honarvar Shakibaei Asli
Big Data Cogn. Comput. 2024, 8(11), 164; https://doi.org/10.3390/bdcc8110164 - 20 Nov 2024
Viewed by 115
Abstract
To address the challenge of rapid geometric model development in the digital twin industry, this paper presents a comprehensive pipeline for constructing 3D models from images using monocular vision imaging principles. Firstly, a structure-from-motion (SFM) algorithm generates a 3D point cloud from photographs. The feature detection methods scale-invariant feature transform (SIFT), speeded-up robust features (SURF), and KAZE are compared across six datasets, with SIFT proving the most effective (matching rate higher than 0.12). Using K-nearest-neighbor matching and random sample consensus (RANSAC), refined feature point matching and 3D spatial representation are achieved via epipolar geometry. Then, the Poisson surface reconstruction algorithm converts the point cloud into a mesh model. Additionally, texture images are enhanced by leveraging a visual geometry group (VGG) network-based deep learning approach. Content images from a dataset provide geometric contours via higher-level VGG layers, while textures from style images are extracted using the lower-level layers. These are fused to create texture-transferred images, and the image quality assessment (IQA) metrics SSIM and PSNR are used to evaluate the texture-enhanced images. Finally, texture mapping integrates the enhanced textures with the mesh model, improving the scene representation. The method presented in this paper surpassed a LiDAR-based reconstruction approach by 20% in terms of point cloud density and number of model facets, while the hardware cost was only 1% of that associated with LiDAR.
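The matching stage summarized above (SIFT keypoints, k-nearest-neighbor matching with a ratio test, and RANSAC filtering under the epipolar constraint) can be sketched with standard OpenCV calls. The snippet below is a minimal illustration with assumed thresholds and placeholder image paths, not the authors' implementation.

```python
# Minimal sketch of SIFT + k-NN ratio matching + RANSAC epipolar filtering.
# Not the paper's code; the ratio and RANSAC thresholds are assumed defaults.
import cv2
import numpy as np

def match_pair(img1_path, img2_path, ratio=0.7):
    img1 = cv2.imread(img1_path, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(img2_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # k-nearest-neighbor matching with Lowe's ratio test
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < ratio * n.distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # RANSAC on the epipolar constraint: keep only inliers of the fundamental matrix
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    inliers = mask.ravel() == 1
    return pts1[inliers], pts2[inliers], F
```

The surviving inlier correspondences would then feed triangulation and, eventually, Poisson surface reconstruction.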
Figures:
Figure 1: Samples from Dataset 1 (Source: https://github.com/Abhishek-Aditya-bs/MultiView-3D-Reconstruction/tree/main/Datasets, accessed on 18 November 2024) and samples from Dataset 2.
Figure 2: Demonstration of Dataset 3.
Figure 3: Diagram of SFM algorithm.
Figure 4: Camera imaging model.
Figure 5: Coplanarity condition of photogrammetry.
Figure 6: Process of surface reconstruction.
Figure 7: Demonstration of isosurface.
Figure 8: Demonstration of VGG network.
Figure 9: Demonstration of Gram matrix.
Figure 10: Style transformation architecture.
Figure 11: Texture mapping process.
Figure 12: Demonstration of the three kinds of feature descriptors used on Dataset 1 and Dataset 2.
Figure 13: Matching rate fitting of three kinds of image descriptors.
Figure 14: SIFT point matching for CNC1 object under different thresholds.
Figure 15: SIFT point matching for Fountain object under different thresholds.
Figure 16: Matching result of Dataset 2 using RANSAC method.
Figure 17: Triangulation presentation of feature points obtained from objects in Dataset 1.
Figure 18: Triangulation presentation of feature points obtained from objects in Dataset 2.
Figure 19: Point cloud data of objects in Dataset 1.
Figure 20: Point cloud data of objects in Dataset 2.
Figure 21: Normal vector presentation of the point set obtained from objects in Dataset 1.
Figure 22: Normal vectors of the point set obtained from objects in Dataset 2.
Figure 23: Poisson surface reconstruction results of objects in Dataset 1.
Figure 24: Poisson surface reconstruction results of objects in Dataset 2.
Figure 25: Style transfer result of Statue object.
Figure 26: Style transfer result of Fountain object.
Figure 27: Style transfer result of Castle object.
Figure 28: Style transfer result of CNC1 object.
Figure 29: Style transfer result of CNC2 object.
Figure 30: Style transfer result of Robot object.
Figure 31: Training loss in style transfer for CNC1 object.
Figure 32: IQA assessment for CNC1 images after style transfer.
Figure 33: Results of texture mapping for Dataset 1.
Figure 34: Results of texture mapping for Dataset 2.
Figure A1: Results of camera calibration.
18 pages, 2990 KiB  
Article
A GGCM-E Based Semantic Filter and Its Application in VSLAM Systems
by Yuanjie Li, Chunyan Shao and Jiaming Wang
Electronics 2024, 13(22), 4487; https://doi.org/10.3390/electronics13224487 - 15 Nov 2024
Viewed by 265
Abstract
Image matching-based visual simultaneous localization and mapping (vSLAM) extracts low-level pixel features to reconstruct camera trajectories and maps through the epipolar geometry method. However, it fails to achieve correct trajectories and mapping when there are low-quality feature correspondences in several challenging environments. Although the RANSAC-based framework can enable better results, it is computationally inefficient and unstable in the presence of a large number of outliers. In our previous work, a Faster R-CNN learning-based semantic filter was proposed to exploit the semantic information of inliers and remove low-quality correspondences, helping vSLAM localize accurately. However, that semantic filter generalizes poorly to low-level and dense texture-rich scenes, leaving the semantic filter-based vSLAM unstable and its geometry estimation poor. In this paper, a GGCM-E-based semantic filter using YOLOv8 is proposed to address these problems. Firstly, semantic patches of images are collected from the KITTI dataset, the TUM dataset provided by the Technical University of Munich, and real outdoor scenes. Secondly, the semantic patches are classified by our proposed GGCM-E descriptors to obtain the YOLOv8 neural network training dataset. Finally, several semantic filters for filtering low-level and dense texture-rich scenes are generated and combined into the ORB-SLAM3 system. Extensive experiments show that the semantic filter can detect and classify the semantic levels of different scenes effectively, filtering low-level semantic scenes to improve the quality of correspondences and thus achieving accurate and robust trajectory reconstruction and mapping. On the challenging autonomous driving benchmark and in real environments, the vSLAM system with the GGCM-E-based semantic filter demonstrates its superiority in reducing the 3D position error, with the absolute trajectory error reduced by up to approximately 17.44%, showing its promise and good generalization.
(This article belongs to the Special Issue Application of Artificial Intelligence in Robotics)
Figures:
Figure 1: ORB-SLAM3 framework with the proposed semantic filter module.
Figure 2: Framework of the proposed semantic filter approach.
Figure 3: Computation of GGCM-E features.
Figure 4: Semantic filtering on the KITTI frame.
Figure 5: Semantic filtering on our captured outdoor frame.
Figure 6: The trajectory of KITTI07 with respect to the ground truth using the GGCM-E semantic filter.
Figure 7: Comparison of trajectories between the proposed method and ground truth in the KITTI dataset.
Figure 8: Comparison of APEs with respect to ground truth of ORB-SLAM3 and the semantic filter.
Figure 9: Dense texture-rich sequences in the TUM dataset (DTR sequences).
Figure 10: Comparison of camera trajectories in DTR sequences.
Figure 11: Comparison of the trajectory with respect to the ground truth of DynaSLAM and GGCM-E+DynaSLAM on KITTI00 sequences.
Figure 12: Comparison of the APEs of semantic filter-based Structure-SLAM, LDSO and DynaSLAM on KITTI07 sequences.
13 pages, 8320 KiB  
Technical Note
Unmanned Aerial Vehicle-Neural Radiance Field (UAV-NeRF): Learning Multiview Drone Three-Dimensional Reconstruction with Neural Radiance Field
by Li Li, Yongsheng Zhang, Zhipeng Jiang, Ziquan Wang, Lei Zhang and Han Gao
Remote Sens. 2024, 16(22), 4168; https://doi.org/10.3390/rs16224168 - 8 Nov 2024
Viewed by 368
Abstract
In traditional 3D reconstruction using UAV images, only radiance information, which is treated as a geometric constraint, is used in feature matching, allowing for the restoration of the scene's structure. After introducing radiance supervision, NeRF can adjust the geometry in the fixed-ray direction, resulting in a smaller search space and higher robustness. Considering the lack of NeRF construction methods for aerial scenarios, we propose a new NeRF point sampling method that is derived from a UAV imaging model, is compatible with a global geographic coordinate system, and is suited to UAV views. We also found that NeRF is optimized entirely on the basis of radiance while ignoring the direct geometry constraint. Therefore, we designed a radiance correction strategy that considers the incidence angle. Our method can complete point sampling in a UAV imaging scene and simultaneously perform digital surface model construction and ground radiance information recovery. When tested on self-acquired datasets, the NeRF variant proposed in this paper achieved better reconstruction accuracy than the original NeRF-based methods. It also reached a level of precision comparable to that of traditional photogrammetry methods, and it is capable of outputting a surface albedo that includes shadow information.
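For readers unfamiliar with the NeRF side of the comparison, the generic volume-rendering step that turns per-sample densities along a ray into a pixel color and an expected depth (the quantity a DSM ultimately needs) is sketched below. This is the standard NeRF formulation, not the paper's UAV-specific point sampling or its incidence-angle radiance correction.

```python
# Generic NeRF-style compositing along one ray (not the paper's UAV sampler).
# sigma: per-sample densities, rgb: per-sample colors, t: sample depths along the ray.
import numpy as np

def composite_ray(sigma, rgb, t):
    delta = np.diff(t, append=t[-1] + 1e10)          # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)             # opacity of each segment
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1] + 1e-10)))  # transmittance
    weights = alpha * trans
    color = (weights[:, None] * rgb).sum(axis=0)     # rendered pixel color
    depth = (weights * t).sum()                      # expected depth -> DSM-style output
    return color, depth, weights

# Toy usage with stratified samples along a single ray
t = np.linspace(2.0, 6.0, 64)
sigma = np.exp(-((t - 4.0) ** 2) / 0.05)             # a density bump near t = 4
rgb = np.tile([0.5, 0.4, 0.3], (64, 1))
color, depth, _ = composite_ray(sigma, rgb, t)
```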
Graphical abstract
Figures:
Figure 1: The main motivation for our proposed method. We analyzed NeRF's 3D reconstruction workflow (c) from a photogrammetric perspective (b) and found that the latter uses reprojection errors to geometrically adjust the ray direction, while the former can adjust the transmittance with the help of radiance, thereby narrowing the search space to a single ray. However, NeRF methods designed specifically for drone imaging scenarios are rare. In addition, NeRF does not consider the influence of geometric structure changes on the radiance (a); as such, we designed a new geographical NeRF point sampling method for UAVs and introduced the photogrammetric incidence-angle model to optimize NeRF radiance features, thus completing end-to-end 3D reconstruction and radiance acquisition (d).
Figure 2: Main workflow of the multitask UAV-NeRF. We used traditional photogrammetry methods to sample the NeRF points, and then used a geometric imaging model to perform radiation correction and decoding. "MLP" denotes the multilayer perceptrons used to decode the different radiation information.
Figure 3: The study location, along with the general collection pattern and flight lines for the drone imagery.
Figure 4: Intuitive performance comparison of different methods. These experiments were conducted on the DengFeng and XinMi areas and demonstrate the improvement in 3D surface construction achieved using our proposed method. From (a) to (d): (a) the original imagery captured by the drone, (b) the corresponding ground DSM truth obtained from LiDAR, (c) the DSM predicted by the method proposed in this paper, and (d) the DSM obtained using the CC method. Another representation of the results, with additional details, is provided in Table 1.
Figure 5: The UAV image, albedo, shadow scalar s, and transient scalar β.
31 pages, 1871 KiB  
Article
3D Reconstruction of Geometries for Urban Areas Supported by Computer Vision or Procedural Generations
by Hanli Liu, Carlos J. Hellín, Abdelhamid Tayebi, Carlos Delgado and Josefa Gómez
Mathematics 2024, 12(21), 3331; https://doi.org/10.3390/math12213331 - 23 Oct 2024
Viewed by 601
Abstract
This work presents a numerical mesh generation method for 3D urban scenes whose output can easily be converted into any 3D format, unlike most implementations, whose applicability is limited to specific environments. The building models have shaped roofs and faces with static colors, and the buildings are combined with a ground grid. The building generation uses geographic positions and shape names, which can be extracted from OpenStreetMap. Additional steps, such as a computer vision method, can optionally be integrated into the generation to improve the quality of the model, although this is highly time-consuming; its function is to classify unknown roof shapes from satellite images of adequate resolution. The generation can also use custom geographic information, an aspect that was tested using information created by procedural processes. The method was validated on many realistic scenarios with multiple building entities, comparing the results obtained with and without computer vision. Rendering of the generated models was tested under the Graphics Library Transmission Format (glTF) and the Unity Engine. In future work, a polygon-covering algorithm needs to be completed to process the building footprints more effectively, and a solution is required for the missing height values in OpenStreetMap.
(This article belongs to the Special Issue Object Detection: Algorithms, Computations and Practices)
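As a rough illustration of the mesh-generation idea (independent of the paper's roof-shape handling), a flat-roofed building can be obtained by extruding a footprint polygon to a given height. The sketch below assumes a convex footprint and uses a simple fan triangulation for the roof; real footprints would need a proper polygon triangulation step, which is exactly the polygon-covering work the abstract leaves for the future.

```python
# Simplified sketch: extrude a convex 2D footprint into a flat-roofed prism mesh.
# Not the paper's generator; real footprints need proper polygon triangulation.
import numpy as np

def extrude_footprint(footprint_xy, base_z, height):
    footprint_xy = np.asarray(footprint_xy, dtype=float)
    n = len(footprint_xy)
    bottom = np.column_stack([footprint_xy, np.full(n, base_z)])
    top = np.column_stack([footprint_xy, np.full(n, base_z + height)])
    vertices = np.vstack([bottom, top])          # indices 0..n-1 bottom, n..2n-1 top
    faces = []
    for i in range(n):                           # side walls: two triangles per edge
        j = (i + 1) % n
        faces += [(i, j, n + j), (i, n + j, n + i)]
    for i in range(1, n - 1):                    # fan-triangulated flat roof (convex only)
        faces.append((n, n + i, n + i + 1))
    return vertices, np.array(faces)

verts, tris = extrude_footprint([(0, 0), (10, 0), (10, 6), (0, 6)], base_z=0.0, height=12.0)
```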
Figures:
Figure 1: Tiles to consider to create the ground grid model.
Figure 2: Empty ground tiles (white points for tiles which have no value).
Figure 3: Tile triangles and equation of diagonals: how to divide each tile with a valid value to infer which triangle a point belongs to.
Figure 4: Element labeling in the triangle: to obtain the exact elevation inside a tile triangle.
Figure 5: Theoretical side view of a building model: the ground elevations are from points in the building footprint area.
Figure 6: Ways to adapt footprint coordinates to the rectangular basis.
Figure 7: Example of a noisy side that yields an incorrect direction for the alternative coordinate.
Figure 8: Example step-by-step walk-through of the outer sides; each color represents a different direction.
Figure 9: Alternative coordinate overview: definition of the parameters to obtain those coordinates.
Figure 10: Example representing a body face, with the XY position of the roof segments obtained from the interior segments of the virtual rectangle roof.
Figure 11: Representation of 2D coordinates of roof faces: real roof faces obtained by 2D intersections of the footprint with the virtual rectangle roof.
Figure 12: Structure of the generated 3D definition.
Figure 13: Confusion matrices for neural network validation.
Figure 14: Tile concatenation and cropping for a building to obtain an image for roof classification.
Figure 15: Component diagram of the implementation.
Figure 16: Flowchart to generate a real scene.
Figure 17: Examples of single buildings by roof shape.
Figure 18: Examples of complex buildings.
Figure 19: Example of the ground model only, with tiles divided by diagonals.
Figure 20: Real scene example: Alcala de Henares.
Figure 21: Real scene example: Munich.
Figure 22: Examples of procedural generation.
Figure 23: Example of Unity rendering.
Figure 24: Multiple buildings in a row as a single OSM entity.
Figure 25: Example of modeling by polygon-covering.
Figure A1: Shape of gabled roof.
Figure A2: Shape of hipped roof.
Figure A3: Shape of pyramidal roof.
Figure A4: Shape of skillion roof.
Figure A5: Shape of half-hipped roof.
Figure A6: Shape of gambrel roof.
Figure A7: Shape of mansard roof.
16 pages, 1608 KiB  
Article
Control-Oriented Free-Boundary Equilibrium Solver for Tokamaks
by Xiao Song, Brian Leard, Zibo Wang, Sai Tej Paruchuri, Tariq Rafiq and Eugenio Schuster
Plasma 2024, 7(4), 842-857; https://doi.org/10.3390/plasma7040045 - 23 Oct 2024
Viewed by 479
Abstract
A free-boundary equilibrium solver for an axisymmetric tokamak geometry was developed based on the finite difference method and Picard iteration in a rectangular computational area. The solver can run either in forward mode, where external coil currents are prescribed and the iteration proceeds until a converged magnetic flux function ψ(R,Z) map is achieved, or in inverse mode, where the desired plasma boundary, with or without an X-point, is prescribed to determine the required coil currents. The equilibrium solutions are made consistent with prescribed plasma parameters, such as the total plasma current, poloidal beta, or safety factor at a specified flux surface. To verify the mathematical correctness and accuracy of the solver, its solution was compared with that of an analytic fixed-boundary equilibrium solver based on the EAST geometry. Additionally, the proposed solver was benchmarked against another numerical solver based on the finite-element and Newton-iteration methods on a triangular mesh. Finally, the proposed solver was compared with equilibrium reconstruction results from DIII-D experiments.
(This article belongs to the Special Issue New Insights into Plasma Theory, Modeling and Predictive Simulations)
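For reference, the equilibrium problem such a solver iterates on is the Grad–Shafranov equation; the form below is the standard axisymmetric statement, written here from textbook definitions rather than transcribed from the paper.

```latex
% Standard Grad--Shafranov equation used by free-boundary equilibrium codes
% (textbook form, not copied from the paper).
\Delta^{*}\psi \;\equiv\;
R\,\frac{\partial}{\partial R}\!\left(\frac{1}{R}\,\frac{\partial \psi}{\partial R}\right)
+ \frac{\partial^{2}\psi}{\partial Z^{2}}
= -\mu_{0} R\, J_{\phi}(R,\psi),
\qquad
J_{\phi}(R,\psi) \;=\; R\,\frac{dp}{d\psi} \;+\; \frac{1}{\mu_{0} R}\, f\,\frac{df}{d\psi}.
```

A Picard iteration alternates between solving this elliptic problem for ψ with J_φ held fixed and re-evaluating J_φ (rescaled to match the prescribed total plasma current) from the updated flux map, repeating until ψ stops changing.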
Figures:
Figure 1: Initial guess of J^0_{φj,i} [A/m^2] with I_p = 250 kA.
Figure 2: Illustration of a rectangular grid, where the red 'x's denote the boundary of the mesh while the black dots represent the core region.
Figure 3: Block diagram of the Picard iteration in the COTES code. The diagram provides a representation of the iterative procedure and the conditions for updating the flux functions and current densities until convergence is achieved.
Figure 4: Iso-ψ_N ∈ (0:0.1:1) contour comparisons between the inverse mode of COTES (yellow, plain) and the analytic (black, dashed) fixed-boundary equilibrium solver with an EAST geometry.
Figure 5: (Left): rectangular grid in COTES. (Right): triangular grid with the finite element method in the FEEQS.M code.
Figure 6: (Left): iso-ψ_N ∈ (0:0.1:1) contour plot comparisons between the inverse mode of COTES (yellow, plain) and FEEQS.M (black, dashed) solutions. (Right): plasma internal profiles of p, f, p′, ff′, q, and J_{φ,pl} in COTES (plain) and FEEQS.M (dashed).
Figure 7: (Left): iso-ψ_N ∈ (0:0.1:1) contour plot comparisons between the inverse mode of COTES (yellow, plain) and FEEQS.M (black, dashed) solutions. (Right): plasma internal profiles of p, f, p′, ff′, q, and j_{φ,pl} in COTES (plain) and FEEQS.M (dashed).
Figure 8: The plasma boundary between EFIT (cyan, dashed) and the forward mode of COTES (red, plain) with three vertical instability approaches.
Figure 9: (Left): profiles of p and f between EFIT (dashed) and COTES (plain) simulations. (Right): profile of q between EFIT (dashed) and COTES (plain) simulations.
16 pages, 9232 KiB  
Article
DSM Reconstruction from Uncalibrated Multi-View Satellite Stereo Images by RPC Estimation and Integration
by Dong-Uk Seo and Soon-Yong Park
Remote Sens. 2024, 16(20), 3863; https://doi.org/10.3390/rs16203863 - 17 Oct 2024
Viewed by 534
Abstract
In this paper, we propose a 3D Digital Surface Model (DSM) reconstruction method for uncalibrated Multi-view Satellite Stereo (MVSS) images, where Rational Polynomial Coefficient (RPC) sensor parameters are not available. While recent investigations have introduced several techniques to reconstruct high-precision and high-density DSMs from MVSS images, they inherently depend on the use of geo-corrected RPC sensor parameters. However, RPC parameters from satellite sensors can be erroneous due to inaccurate sensor data. In addition, owing to the increasing availability of data on the internet, uncalibrated satellite images without RPC parameters can be easily obtained. This study proposes a novel method to reconstruct a 3D DSM from uncalibrated MVSS images by estimating and integrating RPC parameters. To do this, we first employ a structure-from-motion (SfM) and 3D homography-based geo-referencing method to reconstruct an initial DSM. Second, we sample 3D points from the initial DSM as references and reproject them to the 2D image space to determine 3D–2D correspondences. Using the correspondences, we directly calculate all RPC parameters. To overcome memory limitations when processing large satellite images, we also propose an RPC integration method. The image space is partitioned into multiple tiles, and RPC estimation is performed independently in each tile. Then, all tiles' RPCs are integrated into a final RPC that represents the geometry of the whole image space. Finally, the integrated RPC is used to run a true MVSS pipeline to obtain the 3D DSM. The experimental results show that the proposed method achieves a Mean Absolute Error (MAE) of 1.455 m in height map reconstruction on multi-view satellite benchmark datasets. We also show that the proposed method can be used to reconstruct a geo-referenced 3D DSM from uncalibrated and freely available Google Earth imagery.
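The RPC estimation step can be posed as a linear least-squares problem once 3D–2D correspondences are available: each image coordinate is modeled as a ratio of cubic polynomials in normalized ground coordinates, and fixing the denominator's constant term to 1 makes the system linear. The sketch below shows that generic formulation for one image coordinate; the monomial ordering and the normalization are illustrative assumptions, not the paper's exact parameterization.

```python
# Generic linear least-squares fit of one RPC ratio r ~ Num(P,L,H) / Den(P,L,H)
# with cubic polynomials in normalized (lat, lon, height); illustrative only.
import numpy as np
from itertools import product

def cubic_terms(P, L, H):
    # all monomials P^a * L^b * H^c with a + b + c <= 3 (20 terms, constant first)
    return np.array([P**a * L**b * H**c
                     for a, b, c in product(range(4), repeat=3) if a + b + c <= 3])

def fit_rpc_ratio(ground_norm, image_coord_norm):
    A, b = [], []
    for (P, L, H), r in zip(ground_norm, image_coord_norm):
        m = cubic_terms(P, L, H)
        # r * (1 + d . m[1:]) = n . m   ->   n . m - r * (d . m[1:]) = r
        A.append(np.concatenate([m, -r * m[1:]]))   # 20 numerator + 19 denominator coeffs
        b.append(r)
    coeffs, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    numerator, denominator = coeffs[:20], np.concatenate([[1.0], coeffs[20:]])
    return numerator, denominator
```

The paper's tile-wise variant would run such a fit per tile and then blend the per-tile RPCs by distance weighting, as described in the figure captions below.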
Figures:
Figure 1: Pipeline of the proposed method. (Reference: GEMVS [14], 3D-to-2D correspondence search [17], MS2P [25].)
Figure 2: 3D-to-2D projection process for finding correspondence.
Figure 3: 2D-to-3D correspondence search process. For example, an image space is divided into 3 × 3 tiles. Red-color points are uniform samples in image I_{C_i}. A sampled point is reprojected to geo-referencing space by each tile's inverse RPCs to obtain P_{T_j}. Then, all P_{T_j} are weighted averaged by the distance of the point to each tile center to obtain the final correspondence P_{S_k}.
Figure 4: A simplified flow diagram of MS2P. The baseline algorithm for MVS is EnSoft3D [29], and it is modified to use the estimated RPC parameters.
Figure 5: Results of the proposed method on GE imagery. The area of the first row is Sigiriya, Sri Lanka (7°57′22″N 80°45′32″E), and the second row is Sydney, Australia (33°51′25″S 151°12′42″E).
Figure 6: Comparison of the height map results from uncalibrated satellite images using the pin-hole camera model with GEMVS [17] and the RPC model with MS2P [25].
Figure 7: Comparison of the DSM reconstruction of two camera models: COLMAP and GEMVS with the pin-hole model, and MS2P with the estimated RPC model.
Figure 8: MAE and RMSE error of the height map compared with the GT DSM from the DFC19 dataset.
Figure 9: Error analysis of the OMA_284 tile. From the left: reconstructed DSM, GT model, and error map.
22 pages, 7672 KiB  
Article
ALS-Based, Automated, Single-Tree 3D Reconstruction and Parameter Extraction Modeling
by Hong Wang, Dan Li, Jiaqi Duan and Peng Sun
Forests 2024, 15(10), 1776; https://doi.org/10.3390/f15101776 - 9 Oct 2024
Viewed by 811
Abstract
The 3D reconstruction of point cloud trees and the acquisition of stand factors are key to supporting forestry regulation and urban planning. However, the two are usually independent modules in existing studies. In this work, we extended the AdTree method for 3D modeling of trees by adding a quantitative analysis capability to acquire stand factors. We used unmanned aerial LiDAR (ALS) data as the raw data for this study. After denoising the data and segmenting the single trees, we obtained the single-tree samples needed for this study and produced our own single-tree sample dataset. The scanned tree point cloud was reconstructed in three dimensions in terms of geometry and topology, and important stand parameters used in forestry were extracted. This added quantification of model parameters significantly improves the utility of the original point cloud tree reconstruction algorithm and increases its capacity for quantitative analysis. The tree parameters obtained by the improved model were validated on 82 camphor pine trees sampled from the Northeast Forestry University forest. In a controlled comparison against the same field-measured parameters, the root mean square errors (RMSEs) and coefficients of determination (R²s) were 4.1 cm and 0.63 for diameter at breast height (DBH), 0.61 m and 0.74 for crown width (CW), 0.55 m and 0.85 for tree height (TH), and 1.02 m and 0.88 for crown base height (CBH). The canopy volume extracted based on the alpha shape is closest to the original point cloud and best estimated when alpha = 0.3.
(This article belongs to the Special Issue Forest Parameter Detection and Modeling Using Remote Sensing Data)
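As a small illustration of the DBH step (extract the stem slice around 1.3 m above ground and fit a circular cross-section), the sketch below runs a least-squares circle fit on the slice's XY coordinates with SciPy's Levenberg–Marquardt option. The paper fits a full 3D cylinder, so this 2D circle fit is a simplified stand-in, and the slice width is an assumption.

```python
# Simplified DBH sketch: slice the stem near 1.3 m and least-squares fit a circle
# to its XY footprint (the paper fits a 3D cylinder; this 2D fit is a stand-in).
import numpy as np
from scipy.optimize import least_squares

def estimate_dbh(points, slice_center=1.3, slice_half_width=0.05):
    # points: (N, 3) single-tree point cloud; heights measured above the lowest point
    z = points[:, 2] - points[:, 2].min()
    xy = points[np.abs(z - slice_center) < slice_half_width][:, :2]

    def residuals(params):
        cx, cy, r = params
        return np.hypot(xy[:, 0] - cx, xy[:, 1] - cy) - r

    x0 = np.array([xy[:, 0].mean(), xy[:, 1].mean(),
                   np.std(xy - xy.mean(axis=0))])        # crude center/radius guess
    fit = least_squares(residuals, x0, method="lm")      # Levenberg-Marquardt
    return 2.0 * fit.x[2]                                # diameter at breast height
```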
Figures:
Figure 1: Overview map of the study area.
Figure 2: (a) Side view of the original point clouds with noise. (b) Point clouds after denoising. (c) Point clouds with ground points removed. (d) The results of the segmentation algorithm. The colors of the point clouds in subfigures (a–c) are the "Scalar field" pattern in CloudCompare.
Figure 3: Overlapping point cloud trees and point cloud trees with low completeness.
Figure 4: Diagram of the reconstruction process of the camphor pine skeleton based on AdTree. (a) Initial input point cloud. (b) Delaunay triangular profile. (c) Lightweight tree skeleton. (d) Final reconstructed 3D tree skeleton model. (e) The degree of alignment with the original point cloud after fitting.
Figure 5: Overall flow chart of the experiment.
Figure 6: Parameters and objectives of the AdTree-based cylindrical fitting problem.
Figure 7: Estimation of DBH based on the Levenberg–Marquardt cylindrical fitting algorithm. (a) Extract the main stem point cloud within a vertical range around 1.3 m above the ground. (b) Side view of the extracted point cloud. (c) Top view of the extracted point cloud. The colors of the point clouds in subfigure (a) are the "Scalar field" pattern in CloudCompare.
Figure 8: The 2D schematic of the Welzl algorithm.
Figure 9: Top view of camphor pine canopy elevation. The colors of the point clouds are the "Scalar field" pattern in CloudCompare.
Figure 10: Level of detail in the final canopy corresponding to different alpha values.
Figure 11: Scatterplots of tree height (TH) and crown base height (CBH).
Figure 12: Scatterplots of diameter at breast height (DBH) and crown width (CW).
Figure 13: Line plots of crown volume (CV) for different models and parameters.
Figure 14: AdTree reconstructions of point clouds with low completeness are also less realistic.
14 pages, 1281 KiB  
Article
A Flexible Hierarchical Framework for Implicit 3D Characterization of Bionic Devices
by Yunhong Lu, Xiangnan Li and Mingliang Li
Biomimetics 2024, 9(10), 590; https://doi.org/10.3390/biomimetics9100590 - 29 Sep 2024
Viewed by 541
Abstract
In practical applications, integrating three-dimensional models of bionic devices with simulation systems can predict their behavior and performance under various operating conditions, providing a basis for subsequent engineering optimization and improvements. This study proposes a framework for characterizing three-dimensional models of objects, focusing on extracting 3D structures and generating high-quality 3D models. The core concept involves obtaining the density output of the model from multiple images to enable adaptive boundary surface detection. The framework employs a hierarchical octree structure to partition the 3D space based on surface and geometric complexity. This approach includes recursive encoding and decoding of the octree structure and surface geometry, ultimately leading to the reconstruction of the 3D model. The framework has been validated through a series of experiments, yielding positive results.
(This article belongs to the Special Issue Biomimetic Aspects of Human–Computer Interactions)
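A bare-bones version of the adaptive space partition (subdivide a cell only where the local geometry is complex enough, here proxied simply by point count and depth) is sketched below; the actual framework attaches learned encoders, decoders, and an implicit surface to each cell, which this sketch does not attempt.

```python
# Bare-bones adaptive octree partition over a point cloud (complexity proxied by
# point count); illustrative only -- the paper couples each cell to neural codecs.
import numpy as np

def build_octree(points, center, half_size, depth=0, max_depth=6, min_points=64):
    node = {"center": center, "half_size": half_size, "children": []}
    if depth >= max_depth or len(points) <= min_points:
        node["points"] = points                       # leaf: keep the local geometry
        return node
    octant = (points >= center).astype(int)           # 3 bits -> child index 0..7
    codes = octant[:, 0] * 4 + octant[:, 1] * 2 + octant[:, 2]
    for code in range(8):
        child_pts = points[codes == code]
        if len(child_pts) == 0:
            continue
        offset = np.array([(code >> 2) & 1, (code >> 1) & 1, code & 1]) * 2 - 1
        child_center = center + 0.5 * half_size * offset
        node["children"].append(
            build_octree(child_pts, child_center, 0.5 * half_size,
                         depth + 1, max_depth, min_points))
    return node

pts = np.random.rand(5000, 3)
tree = build_octree(pts, center=np.array([0.5, 0.5, 0.5]), half_size=0.5)
```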
Figures:
Figure 1: A diagram of a hierarchical octree neural network in two dimensions. In this paper, a recursive encoder–decoder network is proposed, which is trained using several GAN methods. Here, the geometry of the octree is encoded using the voxel 3DCNN and recursively aggregated using the hierarchical structure and geometric features of the local encoder ε_i. The decoding function is implemented by a local decoder D_i hierarchy with a mirror structure relative to the encoder. The structural and geometric information of the input model is decoded recursively, and the local geometric surfaces are recovered with the input of an implicit decoder embedded in each octree.
Figure 2: The structure of the encoder E_k and the decoder D_k. E_k collects the structure (a_{c_j}, b_{c_j}) and geometric characteristics (g_{c_j}) of the child octrees into their parent octree k, where c_j is in C_k, utilizing an MLP, a maximum set operation, and a second MLP. Two MLPs and classifiers decode the geometric features g_k of the parent space into geometric features g_{c_j} and two attributes α_{c_j}, β_{c_j} of the child space. Two metrics are employed to determine the probability of surface occupation and the need for substructure subdivision.
Figure 3: The depth generation model, composed of n combined network layers of linear and implicit networks; the implicit network core uses the sin function as the calculation method. The multi-layer perceptron takes the position information x and the noise information z processed by the mapping network layer as input, and the final output is the density. (a) Overall architecture of the network model. (b) Specific structure of the FiLM SIREN unit.
Figure 4: Shape reconstruction comparison of (a) LIG [5], (b) OccNet [9], (c) IM-Net [4], (d) OctField [6], and (e) the work of this paper.
Figure 5: The result of modeling the detail part of the aircraft model.
Figure 6: Shape generation. The image shows the results generated by randomly sampling latent codes in the latent space.
Figure 7: Shape interpolation. The figure shows the results of two types of interpolation: table and chair. (a) The source shape and (f) the target shape. (b–e) are the intermediate results of interpolation.
25 pages, 17785 KiB  
Article
Compressing and Recovering Short-Range MEMS-Based LiDAR Point Clouds Based on Adaptive Clustered Compressive Sensing and Application to 3D Rock Fragment Surface Point Clouds
by Lin Li, Huajun Wang and Sen Wang
Sensors 2024, 24(17), 5695; https://doi.org/10.3390/s24175695 - 1 Sep 2024
Viewed by 4118
Abstract
Short-range MEMS-based (Micro Electro-Mechanical System) LiDAR provides precise point cloud datasets for rock fragment surfaces. However, there is more vibrational noise in MEMS-based LiDAR signals, which cannot guarantee that the reconstructed point cloud data are not distorted at a high compression ratio. Many studies have illustrated that wavelet-based clustered compressive sensing can improve reconstruction precision. The k-means clustering algorithm can be conveniently employed to obtain clusters; however, estimating a meaningful k value (i.e., the number of clusters) is challenging. An excessive quantity of clusters is not necessary for dense point clouds, as this leads to elevated consumption of memory and CPU resources. For sparser point clouds, fewer clusters lead to more distortions, while excessive clusters lead to more voids in reconstructed point clouds. This study proposes a local clustering method to determine a number of clusters closer to the actual number based on GMM (Gaussian Mixture Model) observation distances and density peaks. Experimental results illustrate that the estimated number of clusters is closer to the actual number for four datasets from the KEEL public repository. In point cloud compression and recovery experiments, our proposed approach compresses and recovers the Bunny and Armadillo datasets in the Stanford 3D repository; the experimental results illustrate that our proposed approach improves the reconstructed point clouds' geometry and curvature similarity. Furthermore, the geometric similarity increases to above 0.9 for our complete rock fragment surface datasets after selecting a better wavelet basis for each dimension of the MEMS-based LiDAR signals. In both experiments, the sparsity of the signals was 0.8 and the sampling ratio was 0.4. Finally, a rock outcrop point cloud experiment is used to verify that the proposed approach is applicable to large-scale research objects. All of our experiments illustrate that the proposed adaptive clustered compressive sensing approach can better reconstruct MEMS-based LiDAR point clouds at a lower sampling ratio.
(This article belongs to the Special Issue Short-Range Optical 3D Scanning and 3D Data Processing)
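The compressive sensing recovery at the heart of this pipeline can be demonstrated on a toy sparse signal with a Gaussian measurement matrix and orthogonal matching pursuit. The paper works on wavelet-domain-sparse, clustered LiDAR coordinate signals, so the snippet below (synthetic data, scikit-learn's OMP) only illustrates the measurement and recovery step, not the clustering or wavelet-basis selection.

```python
# Toy compressive sensing demo: sample a sparse signal with a Gaussian matrix
# and recover it with orthogonal matching pursuit (synthetic data, not LiDAR).
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, m, k = 256, 102, 20                      # length, measurements (~0.4 ratio), sparsity
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian measurement matrix
y = Phi @ x                                      # compressed measurements

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
x_hat = omp.fit(Phi, y).coef_
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```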
Figures:
Figure 1: Flow chart of the proposed method.
Figure 2: Three types of images taken using the Intel L515 LiDAR camera: (a) high-definition image, (b) depth image, and (c) infrared image.
Figure 3: Partial point clouds from various shooting angles.
Figure 4: Corresponding points in two different images.
Figure 5: Partial point cloud states before and after registration: (a) original partial point clouds, (b) coarsely registered partial point clouds, and (c) finely registered partial point clouds.
Figure 6: Complete point clouds of rock fragment surfaces across three scales: (a) 32 cm × 24 cm × 25 cm, (b) 25 cm × 21 cm × 20 cm, and (c) 10 cm × 8 cm × 10 cm.
Figure 7: Clustering motivation diagram based on observation distances.
Figure 8: Different GMMs for different observers.
Figure 9: The decision graph for the third component of the optimal GMM in the KEEL Wine-White dataset.
Figure 10: Distributions of the three dimensions in a point cloud of rock fragment surfaces: (a) example distribution of the X-dimension, (b) example distribution of the Y-dimension, and (c) example distribution of the Z-dimension.
Figure 11: Waveforms of the three wavelets: (a) waveform of the coif1 wavelet, (b) waveform of the db2 wavelet, and (c) waveform of the bior1.1 wavelet.
Figure 12: Comparison of coherence values generated by various measurement matrices and wavelet bases.
Figure 13: PDF of the optimal GMM.
Figure 14: Decision graphs corresponding to each Gaussian component.
Figure 15: Sorted dissimilarity sequences.
Figure 16: Various dimensional signal distributions of the Bunny and Armadillo point clouds: (a) the X-, Y-, and Z-dimensional signal distributions of the Bunny dataset and (b) the X-, Y-, and Z-dimensional signal distributions of the Armadillo dataset.
Figure 17: Original and DWT-based reconstructed point cloud data: (a) original Bunny data, (b) DWT-based reconstructed Bunny data, (c) original Armadillo data, and (d) DWT-based reconstructed Armadillo data.
Figure 18: Reconstructed Bunny point clouds: (a) DWT-based reconstructed point cloud shape, (b) point cloud reconstructed using non-clustered compressive sensing, (c) point cloud reconstructed using our proposed approach, (d) point cloud reconstructed using GMM CCS, (e) point cloud reconstructed using Sil-based CCS, (f) point cloud reconstructed using CH-based CCS, (g) point cloud reconstructed using DB-based CCS.
Figure 19: Reconstructed Armadillo point clouds: (a) DWT-based reconstructed point cloud shape, (b) point cloud reconstructed using non-clustered compressive sensing, (c) point cloud reconstructed using our proposed approach, (d) point cloud reconstructed using GMM CCS, (e) point cloud reconstructed using Sil-based CCS, (f) point cloud reconstructed using CH-based CCS, (g) point cloud reconstructed using DB-based CCS.
Figure 20: Comparison of RMSE results of different wavelets for each dimension: (a) X, (b) Y, and (c) Z.
Figure 21: Comparative diagrams of the point cloud data shown in Figure 6a reconstructed using various compressive sensing approaches: (a) original data, (b) DWT-based data, (c) non-clustered compressive sensing, (d) our proposed CCS, (e) GMM-based CCS, (f) CH-based CCS, (g) DB-based CCS.
Figure 22: Comparative diagrams of the point cloud data shown in Figure 6b reconstructed using various compressive sensing approaches: (a) original data, (b) DWT-based data, (c) non-clustered compressive sensing, (d) our proposed CCS, (e) GMM-based CCS, (f) Sil-based CCS, (g) CH-based CCS, (h) DB-based CCS.
Figure 23: Comparative diagrams of the point cloud data shown in Figure 6c reconstructed using various compressive sensing approaches: (a) original data, (b) DWT-based data, (c) non-clustered compressive sensing, (d) our proposed CCS, (e) GMM-based CCS, (f) Sil-based CCS, (g) CH-based CCS, (h) DB-based CCS.
Figure 24: The rock outcrop point cloud data.
Figure 25: Mechanical LiDAR signal shapes of the outcrop point cloud data: (a) shape of the X-dimensional signal, (b) shape of the Y-dimensional signal, and (c) shape of the Z-dimensional signal.
23 pages, 76553 KiB  
Article
3DRecNet: A 3D Reconstruction Network with Dual Attention and Human-Inspired Memory
by Muhammad Awais Shoukat, Allah Bux Sargano, Lihua You and Zulfiqar Habib
Electronics 2024, 13(17), 3391; https://doi.org/10.3390/electronics13173391 - 26 Aug 2024
Viewed by 666
Abstract
Humans inherently perceive 3D scenes using prior knowledge and visual perception, but 3D reconstruction in computer graphics is challenging due to complex object geometries, noisy backgrounds, and occlusions, leading to high time and space complexity. To address these challenges, this study introduces 3DRecNet, a compact 3D reconstruction architecture optimized for both efficiency and accuracy through five key modules. The first module, the Human-Inspired Memory Network (HIMNet), is designed for initial point cloud estimation, assisting in identifying and localizing objects in occluded and complex regions while preserving critical spatial information. Next, separate image and 3D encoders perform feature extraction from the input images and initial point clouds. These features are combined using a dual attention-based feature fusion module, which emphasizes features from the image branch over those from the 3D encoding branch. This approach ensures independence from proposals at inference time and filters out irrelevant information, leading to more accurate and detailed reconstructions. Finally, a decoder branch transforms the fused features into a 3D representation. The integration of attention-based fusion with the memory network in 3DRecNet significantly enhances the overall reconstruction process. Experimental results on benchmark datasets such as ShapeNet, ObjectNet3D, and Pix3D demonstrate that 3DRecNet outperforms existing methods.
(This article belongs to the Special Issue New Trends in Computer Vision and Image Processing)
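One way to read the "dual attention-based feature fusion" described above is as two cross-attention passes whose outputs are mixed with a learned weight biased toward the image branch. The module below is a speculative PyTorch sketch along those lines; the layer sizes, gating scheme, and pooling are all assumptions, not the paper's architecture.

```python
# Speculative sketch of dual cross-attention fusion between an image branch and a
# 3D (point) branch, biased toward image features; not the paper's architecture.
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.img_queries_pts = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pts_queries_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Parameter(torch.tensor(0.7))   # learned bias toward the image branch

    def forward(self, img_feats, pts_feats):
        # img_feats: (B, N_img, dim) image-encoder tokens; pts_feats: (B, N_pts, dim)
        img_ctx, _ = self.img_queries_pts(img_feats, pts_feats, pts_feats)
        pts_ctx, _ = self.pts_queries_img(pts_feats, img_feats, img_feats)
        img_out = (img_feats + img_ctx).mean(dim=1)   # pooled, attention-refined features
        pts_out = (pts_feats + pts_ctx).mean(dim=1)
        g = torch.sigmoid(self.gate)                  # keep the mixing weight in (0, 1)
        return g * img_out + (1.0 - g) * pts_out      # fused descriptor for the decoder

fusion = DualAttentionFusion()
fused = fusion(torch.randn(2, 196, 256), torch.randn(2, 1024, 256))
```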
Figures:
Figure 1: 2D image along with corresponding different views of the 3D model.
Figure 2: Different distributions of datasets are illustrated. From left to right: Images 1–3, sourced from ObjectNet3D, depict occluded objects with complex backgrounds. Images 4–6, from ShapeNet, show CAD models with a plain background. Images 7–9, from Pix3D, feature real objects.
Figure 3: The proposed method, 3DRecNet, reconstructs the 3D shape of an object from a single image. It learns the geometry of the object present in the input image by embedding the attention-based fusion technique on the image encoder and 3D encoder features. Different modules of 3DRecNet are highlighted with dark colors and dotted boundaries.
Figure 4: The Human-Inspired Memory Network (HIMNet) utilizes high-level feature extraction to search for structurally similar proposals, refining results through dual filtration based on object category and structure similarity. The geometries of the proposals are used as an initial guess in the learning-based network.
Figure 5: The steps involved in HIMNet for the filtration of proposals and generation of the initial guess for training the end-to-end 3DRecNet architecture.
Figure 6: Real-world images alongside their corresponding estimated 3D point clouds, illustrating the model's robustness in handling diverse challenges, including occlusions, environmental complexities, visual noise, and intricate backgrounds.
Figure 7: Sequential steps for estimating a point cloud from input images: feature extraction, selection, KNN-based proposal retrieval, category filtering, structural similarity filtration, and generation of the initial point cloud, which is further refined by the shape correction and alignment network (i.e., 3DRecNet) for final 3D point cloud estimation.
Figure 8: Optimized sequence for faster point cloud estimation from input images. It includes an optional initial guess and subsequent refinement via the shape correction and alignment network for final 3D point cloud estimation.
Figure 9: Experimental results showing, from left to right: the input image, the model output without an initial guess, with an initial guess, and finally with enhanced test-time performance and the dual attention mechanism.
Figure 10: Sample design architectures: (a,b) without memory integration and (c) with memory integration, along with their loss function behavior during training. Without memory, the loss converges very slowly, requiring a large number of iterations, while with memory it decreases continuously. In case (b), the loss initially decreases exponentially due to the constant 3D sphere used as an initial guess; however, it fails to decrease in later epochs, resulting in vanishing gradients and making the model suitable only for synthetic datasets. (a) Single-stage architecture without initial guess (memory). (b) Two-stage architecture without initial guess (memory). (c) Two-stage architecture with memory integration (initial guess from the human-inspired memory network, HIMNet).
Figure 11: Model performance on inter- and intra-class variations with incorrect initial guesses across two datasets: (a) ObjectNet3D with intra-class variations, (b) ObjectNet3D with inter-class variations, and (c) ShapeNet with inter-class variations. The figures demonstrate the model's ability to correct wrong estimations in all cases.
Figure 12: Training loss on the ShapeNet dataset over different epochs demonstrates model convergence, as indicated by the trend line. The loss is presented in four panels due to high deviation at the initial epochs and minimal reduction in the later epochs: (a) epochs 1–100, (b) epochs 100–200, (c) epochs 200–2000, and (d) epochs 2000–10,000.
Figure 13: Training loss on the ObjectNet3D dataset over different epochs demonstrates model convergence, as indicated by the trend line, presented in the same four epoch ranges as Figure 12.
Figure 14: Images in the leftmost and rightmost columns represent correct samples, while the intermediate column displays incorrect samples. The top row shows results without the attention-based mechanism, and the bottom row demonstrates how the attention-based mechanism enables the network to produce accurate shapes, even in the presence of incorrect samples.
Figure 15: The different hyperparameters tuned to improve performance: (a) learning rate of 0.0001, (b) batch size of 16, (c) Stochastic Gradient Descent optimizer, and (d) freezing 100 layers for transfer learning.
Figure 16: Evaluating model performance on diverse datasets with varying backgrounds.
Figure 17: Visualization results on the ShapeNet dataset. From left to right: input 2D images, 3D-LMNet [32], 3D-CDRNet [26], Proposed (Ours), and Ground Truth.
18 pages, 9438 KiB  
Article
High-Throughput and Accurate 3D Scanning of Cattle Using Time-of-Flight Sensors and Deep Learning
by Gbenga Omotara, Seyed Mohamad Ali Tousi, Jared Decker, Derek Brake and G. N. DeSouza
Sensors 2024, 24(16), 5275; https://doi.org/10.3390/s24165275 - 14 Aug 2024
Abstract
We introduce a high-throughput 3D scanning system designed to accurately measure cattle phenotypes. This scanner employs an array of depth sensors, i.e., time-of-flight (ToF) sensors, each controlled by dedicated embedded devices. The sensors generate high-fidelity 3D point clouds, which are automatically stitched using a point cloud segmentation approach through deep learning. The deep learner combines raw RGB and depth data to identify correspondences between the multiple 3D point clouds, thus creating a single and accurate mesh that reconstructs the cattle geometry on the fly. In order to evaluate the performance of our system, we implemented a two-fold validation process. Initially, we quantitatively tested the scanner for its ability to determine accurate volume and surface area measurements in a controlled environment featuring known objects. Next, we explored the impact and need for multi-device synchronization when scanning moving targets (cattle). Finally, we performed qualitative and quantitative measurements on cattle. The experimental results demonstrate that the proposed system is capable of producing high-quality meshes of untamed cattle with accurate volume and surface area measurements for livestock studies. Full article
(This article belongs to the Section Physical Sensors)
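The pipeline summarized above stitches the per-sensor point clouds by chaining pairwise registrations, with the Colored ICP algorithm aligning each view into the frame of Camera 1 (see Figure 5 below). The following is a minimal sketch of that chaining with Open3D; the file names, voxel size, correspondence distance, and identity initialization are illustrative assumptions rather than the authors' configuration.

```python
# Minimal sketch: chain pairwise Colored ICP registrations so that every
# sensor view ends up in the coordinate frame of camera 1 (the "world" frame).
# All file names and numeric parameters here are assumptions for illustration.
import copy
import numpy as np
import open3d as o3d

def load_view(path, voxel=0.01):
    pcd = o3d.io.read_point_cloud(path)        # back-projected RGBD fragment
    pcd = pcd.voxel_down_sample(voxel)         # uniform density helps ICP
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
    return pcd

def colored_icp(source, target, max_dist=0.03):
    """Return the 4x4 transform mapping `source` into `target`'s frame."""
    result = o3d.pipelines.registration.registration_colored_icp(
        source, target, max_dist, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationForColoredICP(),
        o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration=50))
    return result.transformation

views = [load_view(f"view_{i}.ply") for i in range(1, 7)]  # hypothetical files

# Accumulate pairwise transforms: frame j -> frame j-1 -> ... -> frame 1.
H_to_world = [np.eye(4)]
for j in range(1, len(views)):
    H_pair = colored_icp(views[j], views[j - 1])
    H_to_world.append(H_to_world[j - 1] @ H_pair)

merged = o3d.geometry.PointCloud()
for pcd, H in zip(views, H_to_world):
    aligned = copy.deepcopy(pcd)
    aligned.transform(H)
    merged += aligned
```

In practice, Colored ICP needs colors and normals on both clouds and a reasonable initial alignment; a fixed camera rig would normally seed the registration with known extrinsics rather than the identity used in this sketch.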
Show Figures
Figure 1: (a) A schematic representation of the scanning system. (b) Real-life view of the camera frame and the system components.
Figure 2: Overview of the software pipeline: acquisition of RGBD data; segmentation and filtering to eliminate background pixels and noise in both depth and RGB space; backprojection of the filtered data into 3D space; stitching into a unified 3D model; mesh construction over the 3D point cloud; and finally measurement of the traits of interest, volume and surface area.
Figure 3: Schematic layout of Server–Client: the Client sends a capture request to 10 Server programs; each Server program performs the image acquisition and transmits the captured data to a storage device.
Figure 4: Mask R-CNN architecture [9]: Mask R-CNN builds upon two existing Faster R-CNN heads as detailed in [10,11]. The left and right panels illustrate the heads for the ResNet C4 and FPN backbones, respectively, with an added mask branch. Spatial resolution and channels are indicated by the numbers, while arrows represent conv, deconv, or FC layers, inferred from the context (conv layers maintain spatial dimensions, whereas deconv layers increase them). All conv layers are 3 × 3, except for the output conv, which is 1 × 1. Deconv layers are 2 × 2 with a stride of 2, and ReLU [12] is used in hidden layers. On the left, 'res5' refers to the fifth stage of ResNet, modified so that the first conv layer operates on a 7 × 7 RoI with a stride of 1 (instead of 14 × 14 with a stride of 2 as in [10]). On the right, '×4' indicates a stack of four consecutive conv layers.
Figure 5: Multi-view point cloud registration: (a) given N = 6 point clouds, a simple pairwise registration of point cloud fragments of the scanned cattle is performed. (b) The Colored ICP algorithm solves for the coordinate transformation from camera frame j to camera frame i (denoted iHj). Each view is aligned into the coordinate frame of its adjacent camera; the coordinate frame of Camera 1 (V1) is fixed as the world frame and all views are aligned to it. (c) This results in a well-aligned point cloud of the scanned cattle.
Figure 6: Comparison of 3D point cloud capture quality with and without synchronization, using a large box with known dimensions. The left image shows the result without synchronization (0 μs), capturing a total of 17,098 points; the right image shows the same box captured with synchronization (160 μs) and otherwise identical settings, yielding 38,631 points and illustrating the significant improvement in data acquisition quality. (a) Large box, 0 μs delay, n = 17,098. (b) Large box, 160 μs delay, n = 38,631.
Figure 7: Results of scanning a cylindrical object in multiple orientations, highlighting the scanner's accuracy across diverse poses. The horizontal axis displays the predicted volumes and surface areas obtained in each test. Since the same object was used throughout, the ground-truth volume and surface area remain constant; the close alignment of the predicted values with these constant ground truths illustrates the system's reliability in varying orientations. (a) Surface area calculation results. (b) Volume calculation results.
Figure 8: Regression analysis of predicted versus known surface area and volume for multiple static objects (a cylinder and small, medium, and large boxes), all placed in the same pose across 10 consecutive scans. The high R² values of 0.997 for surface area and 0.999 for volume demonstrate the scanner's accuracy and consistency across object dimensions and shapes under controlled conditions. (a) Surface area calculation results. (b) Volume calculation results.
Figure 9: Performance of the scanner under direct sunlight, using a standard box to simulate outdoor livestock scanning conditions. The graphs show the mean and standard deviation of volume and surface area measurements across 10 consecutive scans, illustrating the slight impact of sunlight on the scanner's infrared sensors and hence on measurement accuracy. (a) Surface area results from data collected in sunlight. (b) Volume results from data collected in sunlight.
Figure 10: Segmentation of cattle using combined RGB and depth models via Mask R-CNN: an RGBD image of cattle is segmented using both RGB and depth data, and the results from each model are integrated by a voting arbitrator, producing a well-defined segmentation in both modalities.
Figure 11: Poisson-reconstructed meshes of cattle from which the surface area and volume estimates are computed.
19 pages, 8886 KiB  
Article
High-Precision Calibration Method and Error Analysis of Infrared Binocular Target Ranging Systems
by Changwen Zeng, Rongke Wei, Mingjian Gu, Nejie Zhang and Zuoxiao Dai
Electronics 2024, 13(16), 3188; https://doi.org/10.3390/electronics13163188 - 12 Aug 2024
Abstract
Infrared binocular cameras, leveraging their distinct thermal imaging capabilities, are well-suited for visual measurement and 3D reconstruction in challenging environments. The precision of camera calibration is essential for exploiting the full potential of these infrared cameras. To overcome the limitations of traditional calibration techniques, a novel method for calibrating infrared binocular cameras is introduced. By creating a virtual target plane that closely mimics the geometry of the real target plane, the method refines the feature point coordinates, leading to enhanced precision in infrared camera calibration. The virtual target plane is obtained by inverse projecting the centers of the imaging ellipses, which are estimated with sub-pixel edge detection, into three-dimensional space, and then optimized using the RANSAC least squares method. Subsequently, the imaging ellipses are inversely projected onto the virtual target plane, where their centers are identified. The corresponding world coordinates of the feature points are then refined through a linear optimization process. These coordinates are reprojected onto the imaging plane, yielding optimized pixel feature points. The calibration procedure is performed iteratively to determine the final set of calibration parameters. The method has been validated through experiments, demonstrating an average reprojection error of less than 0.02 pixels and a significant 24.5% improvement in calibration accuracy over traditional methods. Furthermore, a comprehensive analysis has been conducted to identify the primary sources of calibration error. Ultimately, the system achieves an error rate of less than 5% in infrared stereo ranging within a 55-m range. Full article
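The virtual-target-plane step described above amounts to fitting a plane to the back-projected ellipse centers with RANSAC and then refitting by least squares on the inliers. A minimal sketch of that generic step follows; the `centers_3d` array, threshold, and iteration count are hypothetical, and this is not the authors' exact optimization.

```python
# Minimal sketch: RANSAC + least-squares plane fit, as a stand-in for the
# "virtual target plane" optimization described in the abstract. The input
# `centers_3d` (N x 3 back-projected ellipse centers) and all thresholds are
# hypothetical.
import numpy as np

def fit_plane_lstsq(points):
    """Least-squares plane through `points`; returns (unit normal, centroid)."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    return vt[-1], centroid

def ransac_plane(points, n_iters=1000, inlier_thresh=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    best_mask = None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal, centroid = fit_plane_lstsq(sample)
        distances = np.abs((points - centroid) @ normal)
        mask = distances < inlier_thresh
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask = mask
    # Final least-squares refit on the consensus set.
    normal, centroid = fit_plane_lstsq(points[best_mask])
    return normal, centroid, best_mask

# Usage with hypothetical data (coordinates in metres):
# normal, centroid, inliers = ransac_plane(centers_3d, inlier_thresh=5e-4)
```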
Show Figures
Figure 1: Infrared binocular camera imaging model.
Figure 2: The process of feature point coordinate optimization.
Figure 3: Projection of the circular plane.
Figure 4: Physical photograph of the infrared binocular camera system.
Figure 5: Calibration target image.
Figure 6: Calibration images for the left and right cameras.
Figure 7: Results of elliptical fitting to the edges of circular images.
Figure 8: Optimization of the inverse projection plane. (a) Results of inverse projection of ellipse centers in imaging; (b) RANSAC least-squares optimized plane.
Figure 9: Feature image points before and after optimization in the (a) left camera and (b) right camera.
Figure 10: Comparison of images before and after stereoscopic correction: (a) before; (b) after.
Figure 11: Relationship between projection distortion error and θ when the center of the circular plane is on the optical axis.
Figure 12: Relationship between projection distortion error and θ when the center of the circular plane deviates from the optical axis.
Figure 13: Relationship between projection distortion error and d when the center of the circular plane deviates from the optical axis.
Figure 14: Correlation between the absolute value of the relative ranging error and the actual distance of the target.
15 pages, 1727 KiB  
Review
Leveraging 3D Atrial Geometry for the Evaluation of Atrial Fibrillation: A Comprehensive Review
by Alexander J. Sharp, Timothy R. Betts and Abhirup Banerjee
J. Clin. Med. 2024, 13(15), 4442; https://doi.org/10.3390/jcm13154442 - 29 Jul 2024
Abstract
Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia associated with significant morbidity and mortality. Managing risk of stroke and AF burden are pillars of AF management. Atrial geometry has long been recognized as a useful measure in achieving these goals. However, traditional diagnostic approaches often overlook the complex spatial dynamics of the atria. This review explores the emerging role of three-dimensional (3D) atrial geometry in the evaluation and management of AF. Advancements in imaging technologies and computational modeling have enabled detailed reconstructions of atrial anatomy, providing insights into the pathophysiology of AF that were previously unattainable. We examine current methodologies for interpreting 3D atrial data, including qualitative, basic quantitative, global quantitative, and statistical shape modeling approaches. We discuss their integration into clinical practice, highlighting potential benefits such as personalized treatment strategies, improved outcome prediction, and informed treatment approaches. Additionally, we discuss the challenges and limitations associated with current approaches, including technical constraints and variable interpretations, and propose future directions for research and clinical applications. This comprehensive review underscores the transformative potential of leveraging 3D atrial geometry in the evaluation and management of AF, advocating for its broader adoption in clinical practice. Full article
(This article belongs to the Section Cardiovascular Medicine)
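Of the approaches surveyed here, statistical shape modeling is the most algorithmic: corresponding landmarks are placed on every atrium, and principal component analysis over the aligned landmark sets yields a mean shape plus a small number of variation modes (as illustrated in Figure 3 below). The sketch that follows shows this generic point-distribution-model construction; the landmark array and file name are hypothetical, and no specific tool from the reviewed literature is implied.

```python
# Minimal sketch of a linear point-distribution SSM: PCA over pre-aligned
# landmark sets gives a mean left-atrial shape and its principal modes of
# variation. The input array and retained-mode choices are hypothetical.
import numpy as np

def build_ssm(shapes):
    """shapes: (n_subjects, k_landmarks, 3) array of aligned correspondence points."""
    n, k, _ = shapes.shape
    X = shapes.reshape(n, k * 3)
    mean = X.mean(axis=0)
    # PCA via SVD of the centred data matrix; rows of vt are the shape modes.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variance = s**2 / (n - 1)
    return mean, vt, variance

def synthesize(mean, modes, variance, coeffs):
    """Build a shape from standardized coefficients, one per retained mode."""
    b = np.asarray(coeffs, dtype=float)
    m = len(b)
    flat = mean + (b * np.sqrt(variance[:m])) @ modes[:m]
    return flat.reshape(-1, 3)

# Usage with a hypothetical dataset of aligned LA landmarks:
# la_shapes = np.load("aligned_la_landmarks.npy")       # (n, k, 3)
# mean, modes, var = build_ssm(la_shapes)
# shape_plus_2sd = synthesize(mean, modes, var, [2.0])  # +2 SD along mode 1
```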
Show Figures
Figure 1: Overview of the different qualitative and basic quantitative metrics based on three-dimensional atrial geometry, used for the assessment of atrial fibrillation. LA: left atrium; LAA: left atrial appendage; LSPV: left superior pulmonary vein; RSPV: right superior pulmonary vein; SSM: statistical shape modeling.
Figure 2: Overview of the asymmetry index and left atrial sphericity measurement. LA: left atrium; LAA: left atrial appendage; LIPV: left inferior pulmonary vein; LSPV: left superior pulmonary vein; SSM: statistical shape modeling.
Figure 3: Correspondence points (white 'X' symbols) in a point distribution model, and the average shape and shape variability in a linear SSM. The first PC captures the roundedness of edges and the second PC captures the ratio of width to length. LA: left atrium; PC: principal component; PCA: principal component analysis; SSM: statistical shape modeling.
Figure 4: Deep learning-based SSM capturing population-wide LA geometries and metadata to produce virtual populations relevant to specified sub-populations, personalized prediction of procedural success, and identification of key risk factors. LA: left atrium; SSM: statistical shape modeling.
17 pages, 5053 KiB  
Article
Comparison of Left Ventricular Function Derived from Subject-Specific Inverse Finite Element Modeling Based on 3D ECHO and Magnetic Resonance Images
by Lei Fan, Jenny S. Choy, Chenghan Cai, Shawn D. Teague, Julius Guccione, Lik Chuan Lee and Ghassan S. Kassab
Bioengineering 2024, 11(7), 735; https://doi.org/10.3390/bioengineering11070735 - 20 Jul 2024
Abstract
Three-dimensional echocardiography (3D ECHO) and magnetic resonance (MR) imaging are frequently used in patients and animals to evaluate heart functions. Inverse finite element (FE) modeling is increasingly applied to MR images to quantify left ventricular (LV) function and estimate myocardial contractility and other cardiac biomarkers. It remains unclear, however, whether myocardial contractility derived from the inverse FE model based on 3D ECHO images is comparable to that derived from MR images. To address this issue, we developed a subject-specific inverse FE model based on 3D ECHO and MR images acquired from seven healthy swine models to investigate if there are differences in myocardial contractility and LV geometrical features derived using these two imaging modalities. We showed that end-systolic and end-diastolic volumes derived from 3D ECHO images are comparable to those derived from MR images (R² = 0.805 and 0.969, respectively). As a result, ejection fractions from 3D ECHO and MR images are linearly correlated (R² = 0.977), with the limit of agreement (LOA) ranging from −17.95% to 45.89%. Using inverse FE modeling to fit pressure and volume waveforms in subject-specific LV geometry reconstructed from 3D ECHO and MR images, we found that myocardial contractility derived from these two imaging modalities is linearly correlated, with an R² value of 0.989, a gradient of 0.895, and LOA ranging from −6.11% to 36.66%. This finding supports using 3D ECHO images in image-based inverse FE modeling to estimate myocardial contractility. Full article
(This article belongs to the Special Issue Computational Models in Cardiovascular System)
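The agreement statistics quoted above (R², regression gradient, and Bland–Altman limits of agreement) are standard and easy to reproduce; a minimal sketch follows. The paired arrays are hypothetical placeholders for ECHO- and MRI-derived values, and the paper's Bland–Altman figures use percentage differences, which would replace the raw difference below.

```python
# Minimal sketch of the two agreement analyses reported in the abstract:
# ordinary least-squares regression with R^2, and Bland-Altman bias with
# 95% limits of agreement (bias +/- 1.96 SD). Inputs are hypothetical.
import numpy as np

def linear_agreement(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - residuals.var() / y.var()
    return slope, intercept, r2

def bland_altman(x, y):
    diff = y - x               # the paper's figures use a percentage difference
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Hypothetical paired contractility estimates from the two modalities:
# tmax_mri  = np.array([...])
# tmax_echo = np.array([...])
# slope, intercept, r2 = linear_agreement(tmax_mri, tmax_echo)
# bias, loa_low, loa_high = bland_altman(tmax_mri, tmax_echo)
```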
Show Figures
Figure 1: Schematic of the sequential phases in the model parameter estimation process. (A) Unloading; (B) passive phase; (C) active phase. Points a and c denote the end-diastolic point; point b denotes the LV volume at zero pressure.
Figure 2: Segmented LV endocardial surfaces in (A) MR images and (B) 3D ECHO images. (C) Comparison of regional LV wall thickness derived from 3D ECHO and MR images; correlations of (D) LV EDV and ESV, (E) SV, and (F) EF derived from 3D ECHO and MR images.
Figure 3: Bland–Altman analyses of the percentage difference in (A) EDV, (B) ESV, (C) SV, and (D) EF based on 3D ECHO and MR images.
Figure 4: Comparison of (A) model-predicted EDPVRs and Klotz curves; (B) model-predicted and measured LV pressure waveforms; (C) model-predicted and measured LV PV loops; (D) relative errors between model predictions and measurements of LV pressures for all cases. Blue and orange solid lines denote the model-predicted PV loops based on 3D ECHO and MRI, respectively; blue and orange dots denote the experimentally measured PV loops based on 3D ECHO and MRI, respectively.
Figure 5: (A) Comparison of T_max waveforms derived from 3D ECHO and MR images. (B) Correlation of peak T_max derived from 3D ECHO and MR images. (C) Bland–Altman analysis of peak T_max. (D) Correlation of model-predicted time to reach peak T_max derived from 3D ECHO and MR images. (E) Bland–Altman analysis of time to peak T_max.
Figure A1: Comparison of (A) model-predicted EDPVRs and Klotz curves; (B) model-predicted and measured LV pressure waveforms; (C) model-predicted and measured LV PV loops; (D) relative errors between model predictions and measurements of LV pressures for all cases. Blue and orange solid lines denote the model-predicted PV loops based on 3D ECHO and MRI, respectively; blue and orange dots denote the experimentally measured PV loops based on 3D ECHO and MRI, respectively.
Figure A2: (A) Comparison of T_max waveforms derived from 3D ECHO and MR images. (B) Correlation of peak T_max derived from 3D ECHO and MR images. (C) Bland–Altman analysis of peak T_max.
24 pages, 11966 KiB  
Article
Evaluation of Denoising and Voxelization Algorithms on 3D Point Clouds
by Sara Gonizzi Barsanti, Marco Raoul Marini, Saverio Giulio Malatesta and Adriana Rossi
Remote Sens. 2024, 16(14), 2632; https://doi.org/10.3390/rs16142632 - 18 Jul 2024
Abstract
Proper documentation is fundamental to providing structural health monitoring, damage identification and failure assessment for Cultural Heritage (CH). Three-dimensional models from photogrammetric and laser scanning surveys usually provide 3D point clouds that can be converted into meshes. The point clouds usually contain noise data due to different causes: non-cooperative material or surfaces, bad lighting, complex geometry and low accuracy of the instruments utilized. Point cloud denoising has become one of the hot topics of 3D geometric data processing, removing these noise data to recover the ground-truth point cloud and adding smoothing to the ideal surface. These cleaned point clouds can be converted in volumes with different algorithms, suitable for different uses, mainly for structural analysis. This paper aimed to analyse the geometric accuracy of algorithms available for the conversion of 3D point clouds into volumetric models that can be used for structural analyses through the FEA process. The process is evaluated, highlighting problems and difficulties that lie in poor reconstruction results of volumes from denoised point clouds due to the geometric complexity of the objects. Full article
(This article belongs to the Special Issue New Perspectives on 3D Point Cloud II)
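As a concrete illustration of the kind of pipeline evaluated in this paper, the sketch below runs statistical outlier removal, voxelization, and Poisson surface reconstruction with Open3D, one of the libraries discussed. The input file and every parameter value are assumptions for illustration, not the settings used in the study.

```python
# Minimal sketch of a denoise -> voxelize -> mesh pipeline in Open3D.
# The input file and all parameter values are illustrative assumptions.
import open3d as o3d

pcd = o3d.io.read_point_cloud("pillar.ply")   # hypothetical survey point cloud

# 1. Denoise: drop points whose mean neighbour distance deviates strongly
#    from the global average (statistical outlier removal).
clean, kept_idx = pcd.remove_statistical_outlier(nb_neighbors=30, std_ratio=2.0)

# 2. Voxelize: occupancy grid at a fixed resolution (5 mm here, assuming metres).
voxels = o3d.geometry.VoxelGrid.create_from_point_cloud(clean, voxel_size=0.005)

# 3. Mesh: Poisson surface reconstruction needs consistently oriented normals.
clean.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=0.02, max_nn=30))
clean.orient_normals_consistent_tangent_plane(30)
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    clean, depth=9)

print(f"kept {len(kept_idx)}/{len(pcd.points)} points, "
      f"{len(voxels.get_voxels())} voxels, {len(mesh.triangles)} triangles")
```

Whether the resulting mesh is watertight enough for FEA still depends on the geometry; low-density Poisson regions are commonly trimmed using the returned densities before any volumetric conversion.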
Show Figures
Figure 1: The test objects of this paper: (a) Solimene's factory; (b) statue of Moses; (c) portion of the façade of Solimene's factory; (d) car suspension; (e) medieval pillar; (f) scorpionide (copy of a Roman throwing machine).
Figure 2: Retopologised models of (a) Moses's statue; (b) Solimene's factory; (c) portion of the wall of Solimene's factory; (d) car suspension.
Figure 3: Comparison of raw and denoised data: (a) Solimene factory; (b) Moses's statue; (c) car suspension; (d) medieval pillar; (e) scorpionide.
Figure 4: Topological analysis of the Moses meshes: not denoised (a); denoised (b); retopologised, not denoised (c); retopologised, denoised (d). Red dots indicate the errors and holes in the models.
Figure 5: Schema of the functions of Open3D (taken from https://www.open3d.org/docs/release/introduction.html, accessed on 15 February 2024).
Figure 6: Comparison of the high-resolution models of the statue of Moses with (a) the volume in Blender and (b) the volume in Meshmixer.
Figure 7: Comparison of the high-resolution models of Solimene's façade with (a) the volume in Blender and (b) the volume in Meshmixer.
Figure 8: Comparison of the high-resolution models of the portion of Solimene's wall with (a) the volume in Blender and (b) the volume in Meshmixer.
Figure 9: Comparison of the high-resolution models of the suspension with (a) the volume in Blender and (b) the volume in Meshmixer.
Figure 10: Comparison of the high-resolution models of the pillar with (a) the error while creating the volume in Blender and (b) the volume in Meshmixer.
Figure 11: The voxel grid generated from the denoised point cloud: (a) Moses, (b) pillar, (c) Solimene, (d) portion of Solimene's wall, (e) scorpionide, (f) suspension.
Figure 12: The voxel grid generated from the not-denoised point cloud: (a) pillar, (b) Solimene, (c) scorpionide, (d) suspension, (e) portion of Solimene's wall.
Figure 13: Comparison of the high-resolution 3D point clouds with the voxel grids generated from them. Left column: original, not-denoised point clouds; right column: denoised point clouds. (a,b) Solimene, (c,d) pillar, (e,f) portion of Solimene's wall, (g,h) scorpionide, (i,j) suspension, (k) Moses.
Figure 14: The voxel models of (a) pillar, (b) Solimene, (c) scorpionide, (d) suspension, (e) portion of Solimene's wall.
Figure 15: The meshes reconstructed from the voxel grid with MeshLab (left) and with Open3D (right) for (a,b) Moses, (c,d) pillar, (e,f) Solimene, (g,h) portion of Solimene's wall, (i,k) scorpionide, (j,l) suspension.