
Topic Editors

Prof. Dr. Junxing Zheng
School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
Dr. Peng Cao
College of Architecture and Civil Engineering, Beijing University of Technology, Beijing 100124, China

3D Computer Vision and Smart Building and City, 2nd Volume

Abstract submission deadline: closed (31 October 2024)
Manuscript submission deadline: closed (31 December 2024)
Viewed by 49991

Topic Information

Dear Colleagues,

This Topic is a continuation of the previous successful Topic, "3D Computer Vision and Smart Building and City" (https://www.mdpi.com/topics/3D_BIM). Three-dimensional computer vision is an interdisciplinary subject involving computer vision, computer graphics, artificial intelligence and other fields. Its main contents include 3D perception, 3D understanding and 3D modeling. In recent years, 3D computer vision technology has developed rapidly and has been widely applied in unmanned aerial vehicles, robots, autonomous driving, AR, VR and other fields. Smart buildings and cities use various information technologies and innovative concepts to connect their systems and services, so as to improve the efficiency of resource utilization, optimize management and services and improve quality of life. Smart buildings and cities can involve frontier techniques such as 3D computer vision for building information models, digital twins, city information models and simultaneous localization and mapping (SLAM) robots. The application of 3D computer vision in smart buildings and cities is a valuable research direction, but it still faces many major challenges. This Topic focuses on the theory and technology of 3D computer vision in smart buildings and cities. We welcome papers that provide innovative technologies, theories or case studies in the relevant field.

Prof. Dr. Junxing Zheng
Dr. Peng Cao
Topic Editors

Keywords

  • smart buildings and cities
  • 3D computer vision
  • SLAM
  • building information model
  • city information model
  • robots

Participating Journals

Journal Name | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Buildings | 3.1 | 3.4 | 2011 | 15.3 Days | CHF 2600
Drones | 4.4 | 5.6 | 2017 | 19.2 Days | CHF 2600
Energies | 3.0 | 6.2 | 2008 | 16.8 Days | CHF 2600
Sensors | 3.4 | 7.3 | 2001 | 18.6 Days | CHF 2600
Sustainability | 3.3 | 6.8 | 2009 | 19.7 Days | CHF 2400
ISPRS International Journal of Geo-Information | 2.8 | 6.9 | 2012 | 35.8 Days | CHF 1900

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to take advantage of the following benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (34 papers)

15 pages, 5853 KiB  
Article
Multi-View Three-Dimensional Reconstruction Based on Feature Enhancement and Weight Optimization Network
by Guobiao Yao, Ziheng Wang, Guozhong Wei, Fengqi Zhu, Qingqing Fu, Qian Yu and Min Wei
ISPRS Int. J. Geo-Inf. 2025, 14(2), 43; https://doi.org/10.3390/ijgi14020043 - 24 Jan 2025
Viewed by 507
Abstract
Aiming to address the issue that existing multi-view stereo reconstruction methods have insufficient adaptability to the repetitive and weak textures in multi-view images, this paper proposes a three-dimensional (3D) reconstruction algorithm based on Feature Enhancement and Weight Optimization MVSNet (Abbreviated as FEWO-MVSNet). To obtain accurate and detailed global and local features, we first develop an adaptive feature enhancement approach to obtain multi-scale information from the images. Second, we introduce an attention mechanism and a spatial feature capture module to enable high-sensitivity detection for weak texture features. Third, based on the 3D convolutional neural network, the fine depth map for multi-view images can be predicted and the complete 3D model is subsequently reconstructed. Last, we evaluated the proposed FEWO-MVSNet through training and testing on the DTU, BlendedMVS, and Tanks and Temples datasets. The results demonstrate significant superiorities of our method for 3D reconstruction from multi-view images, with our method ranking first in accuracy and second in completeness when compared to the existing representative methods. Full article
Figures: the proposed FEWO-MVSNet reconstruction network; the improved feature extraction network integrating the FPN with the ASFF module; the adaptive feature-weight allocation mechanism; reconstruction comparisons against MVSNet, CasMVSNet and TransMVSNet in weak/repeated-texture scenes; dense point cloud comparisons between TransMVSNet and FEWO-MVSNet; and scene reconstructions on the Tanks and Temples (Intermediate) dataset.
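As a concrete illustration of the depth-regression step that MVSNet-style pipelines such as this one rely on, the sketch below converts a regularized cost volume into a probability volume and takes the expected depth per pixel (soft argmin). It is a generic illustration, not the FEWO-MVSNet implementation; the array shapes and depth range are assumptions.

```python
import numpy as np

def soft_argmin_depth(cost_volume: np.ndarray, depth_values: np.ndarray) -> np.ndarray:
    """Regress a depth map from a cost volume (D, H, W) over candidate depths (D,)."""
    # Lower cost -> higher probability; numerically stable softmax over the depth axis.
    neg = -cost_volume
    prob = np.exp(neg - neg.max(axis=0, keepdims=True))
    prob /= prob.sum(axis=0, keepdims=True)
    # Expected (soft argmin) depth per pixel.
    return np.tensordot(depth_values, prob, axes=1)  # shape (H, W)

# Toy usage: 64 depth hypotheses over a 4x4 patch (values are illustrative only).
costs = np.random.rand(64, 4, 4)
depths = np.linspace(425.0, 935.0, 64)
depth_map = soft_argmin_depth(costs, depths)
```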
21 pages, 7111 KiB  
Article
Construction of 3D Indoor Topological Models Based on Improved Face Sorting
by Qun Sun, Xinwu Zhan and Pu Tang
ISPRS Int. J. Geo-Inf. 2025, 14(1), 27; https://doi.org/10.3390/ijgi14010027 - 13 Jan 2025
Viewed by 551
Abstract
Indoor location-based services and applications need to obtain information about the indoor spatial layouts and topological relationships of indoor spaces. The 3D city modeling data standard CityGML describes the indoor geometric and semantic information of buildings, but the surfaces composing a volume are discrete, leading to invalid volumes. Moreover, the topological adjacency relationships of adjacent indoor spaces have not yet been described, which makes it difficult to realize effective queries and analyses for indoor applications. In this paper, we present a 3D topological data model for indoor spaces that adopts five topological primitives, namely, node, edge, loop, face, and solid, to describe the topological relationships of indoor spaces. Then, by improving the existing face-sorting method according to vector products in 3D space, a method for constructing 3D topological relationships for indoor spaces is proposed, which successively constructs the topological hierarchical combination of volume and the topological adjacency relationships of adjacent volumes. The experimental results show that by using the improved face-sorting method proposed in this work, the relative positions of faces are directly determined to sort the faces set, which avoids relatively cumbersome calculations and improves the efficiency of constructing 3D topological relationships for indoor spaces. Full article
Figures: construction of 3D building models by footprint extrusion and by face sorting around common edges; the 3D indoor topological data model; solids and their directions, adjacent solids sharing a common face, segmentation of faces and interruption of edges; the geometric meaning of the inner and outer products of vectors; sorting face sets based on vector products; the construction process for volumetric objects, including faces with holes and the initial searching faces for adjacent solids; a CityGML-based 3D indoor dataset (visualized in FZK Viewer V6.5.1) and the constructed indoor spaces of an office building; and queries of basic information and topological adjacency relationships for indoor spaces.
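The face-sorting idea above, ordering the faces that share a common edge by their relative positions using vector products, can be illustrated with a small sketch. It uses a simplified convention of my own (each face is represented by one vertex off the shared edge) and is not the paper's implementation.

```python
import numpy as np

def sort_faces_about_edge(p0, p1, off_edge_points):
    """Order faces sharing edge p0-p1 by signed rotation angle about the edge.

    off_edge_points: one representative vertex per face, not lying on the edge.
    Returns face indices sorted counter-clockwise about the edge direction.
    """
    e = np.asarray(p1, float) - np.asarray(p0, float)
    e /= np.linalg.norm(e)
    angles, ref = [], None
    for q in off_edge_points:
        v = np.asarray(q, float) - np.asarray(p0, float)
        v -= np.dot(v, e) * e              # project into the plane normal to the edge
        if ref is None:
            ref = v / np.linalg.norm(v)    # the first face defines the zero angle
        # Signed angle from the outer (cross) and inner (dot) products.
        angle = np.arctan2(np.dot(np.cross(ref, v), e), np.dot(ref, v))
        angles.append(angle % (2 * np.pi))
    return np.argsort(angles)

# Example: four faces hanging off the edge from (0, 0, 0) to (0, 0, 1).
order = sort_faces_about_edge((0, 0, 0), (0, 0, 1),
                              [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (1, 1, 0)])
```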
21 pages, 6523 KiB  
Article
The Ontological Multiplicity of Digital Heritage Models: A Case Study of Yunyan Temple, Sichuan Province, China
by Jie Tan, Xin Guo and Haijing Huang
Buildings 2025, 15(2), 178; https://doi.org/10.3390/buildings15020178 - 9 Jan 2025
Viewed by 510
Abstract
This paper investigates the ontological multiplicity of digital heritage objects within the context of a digital twin project focused on Yunyan Temple, Sichuan Province, China—a site threatened by natural disasters. The project employs laser scanning and photogrammetry to generate high-resolution 3D models at varying levels of detail. The study analyzes how these digital objects support diverse analytical tasks ranging from geomorphological analysis to structural assessments and spatial sequence analysis. We present a novel four-layer data integration and service platform architecture designed to manage the complex data relationships arising from this ontological multiplicity. This includes a temporal database to support iterative refinements of conservation strategies based on ongoing monitoring. The findings highlight the dynamic role of digital objects in knowledge production and offer practical implications for database design, data management, and the development of adaptive conservation strategies for cultural heritage. Full article
Figures: the location and cultural heritage of Yunyan Temple and Douchuan Mountain; the digital twin framework and methodological approach; photogrammetry and laser-scanning data acquisition, including scanning of the Feitian Sutra Cabinet and the control point layouts; the dynamic process of new knowledge production and the new workflow with database categorization; photogrammetric and point cloud models at multiple scales (Stone Lion, Douchuan Mountain, Yunyan Temple, Chaoran Pavilion, Zang Hall, Feitian Sutra Cabinet); topographic heat maps and the structural deformation analysis of Nanyue Hall in CloudCompare; spatial sequence analysis of the temple axis (D/H ratio and sky view factor); three-level semantic segmentation of the point cloud; insertion of the photogrammetric model into the SuperMap platform; the temporal database construction process; and the four-layer architecture of the digital twin platform.
21 pages, 11620 KiB  
Article
Performance Evaluation and Optimization of 3D Gaussian Splatting in Indoor Scene Generation and Rendering
by Xinjian Fang, Yingdan Zhang, Hao Tan, Chao Liu and Xu Yang
ISPRS Int. J. Geo-Inf. 2025, 14(1), 21; https://doi.org/10.3390/ijgi14010021 - 7 Jan 2025
Viewed by 1303
Abstract
This study addresses the prevalent challenges of inefficiency and suboptimal quality in indoor 3D scene generation and rendering by proposing a parameter-tuning strategy for 3D Gaussian Splatting (3DGS). Through a systematic quantitative analysis of various performance indicators under differing resolution conditions, threshold settings for the average magnitude of spatial position gradients, and adjustments to the scaling learning rate, the optimal parameter configuration for the 3DGS model, specifically tailored for indoor modeling scenarios, is determined. Firstly, utilizing a self-collected dataset, a comprehensive comparison was conducted among COLLISION-MAPping (abbreviated as COLMAP (V3.7), open-source software based on Structure from Motion and Multi-View Stereo (SFM-MVS)), Context Capture (V10.2) (abbreviated as CC, software utilizing oblique photography algorithms), Neural Radiance Fields (NeRF), and the currently renowned 3DGS algorithm. The key dimensions of focus included the number of images, rendering time, and overall rendering effectiveness. Subsequently, based on this comparison, rigorous qualitative and quantitative evaluations were further conducted on the overall performance and detail processing capabilities of the 3DGS algorithm. Finally, to meet the specific requirements of indoor scene modeling and rendering, targeted parameter tuning was performed on the algorithm. The results demonstrate significant performance improvements in the optimized 3DGS algorithm: the PSNR metric increases by 4.3%, and the SSIM metric improves by 0.2%. The experimental results prove that the improved 3DGS algorithm exhibits superior expressive power and persuasiveness in indoor scene rendering. Full article
Figures: the Multi-Resolution Hash Encoding scheme and an overview of 3DGS; the study area and data collection scheme; model construction and rendering comparisons among COLMAP, NeRF (Instant-NGP), 3DGS and CC, including reconstructions with 170, 110 and 66 images and detail comparisons for chairs and windows; and rendering comparisons across image resolutions (0.3 k–2 k), spatial-position-gradient thresholds (0.0001–0.0004), scaling learning rates (0.004–0.008) and hyperparameter settings (0.0005–0.1), together with ceiling-area results before and after optimization and training results after 7 k and 30 k iterations.
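The PSNR and SSIM figures reported above can be computed for a rendered view against its reference photograph roughly as follows. scikit-image (0.19 or later) is assumed for SSIM, and the arrays and file names are placeholders rather than the authors' evaluation script.

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(reference: np.ndarray, rendered: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference.astype(np.float64) - rendered.astype(np.float64)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# ref and out are HxWx3 uint8 arrays holding the original photo and the rendered view.
# psnr_db = psnr(ref, out)
# ssim = structural_similarity(ref, out, channel_axis=-1, data_range=255)  # needs skimage >= 0.19
```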
35 pages, 7483 KiB  
Article
Space Efficiency of Transit-Oriented Station Areas: A Case Study from a Complex Adaptive System Perspective
by Jinwen Fan, Zhenwu Shi, Jie Liu and Jinru Wang
ISPRS Int. J. Geo-Inf. 2025, 14(1), 20; https://doi.org/10.3390/ijgi14010020 - 6 Jan 2025
Viewed by 686
Abstract
Transit-oriented development (TOD) has been widely adopted in urban planning to alleviate traffic congestion, urban sprawl, and other problems. The TOD metro station area, as a dynamic and open spatial system, presents typical complex features. To improve urban planning by understanding the complex features of metro station areas, this study proposes a comprehensive evaluation method based on complex adaptive system (CAS) theory to assess space efficiency, combining the COWA (continuous ordered weighted averaging) operator with the cloud model to represent that efficiency. The evaluation factors include external relevance, internal coordination, and environmental adaptation. This study uses Museum Station of Harbin Railway Transportation as the case study, and the results show that the space efficiency of Harbin’s TOD metro station areas is lacking in internal coordination and environmental adaptation. The proposed evaluation method not only identifies space inefficiencies in urban rail transit station areas but also provides valuable insights for informed decision-making and future urban development initiatives. Full article
Figures: CAS theory and its characteristics; the CAS framework of the TOD system; internal and external interactions; the standard cloud model and the reverse cloud generator; maps and satellite imagery of Museum Station; land use compactness and the functional mix around the station; the comprehensive and dimension evaluation clouds; and station access conditions with and without escalators.
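The cloud model referred to in the abstract represents a qualitative grade by three numerical characteristics (expectation Ex, entropy En, hyper-entropy He) and a set of random "cloud drops". A minimal sketch of the standard forward normal cloud generator is shown below; the grade values are illustrative and this is not the authors' evaluation code.

```python
import numpy as np

def forward_normal_cloud(ex, en, he, n=2000, seed=0):
    """Generate n cloud drops (x, membership) for characteristics (Ex, En, He)."""
    rng = np.random.default_rng(seed)
    en_i = np.abs(rng.normal(en, he, n))           # per-drop entropy
    x = rng.normal(ex, en_i)                       # drop positions
    mu = np.exp(-(x - ex) ** 2 / (2 * en_i ** 2))  # certainty degree of each drop
    return x, mu

# Example: a "moderate" space-efficiency grade centred at 0.6 on a 0-1 scale (illustrative).
drops, membership = forward_normal_cloud(ex=0.6, en=0.05, he=0.01)
```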
36 pages, 25347 KiB  
Article
Construction of a Real-Scene 3D Digital Campus Using a Multi-Source Data Fusion: A Case Study of Lanzhou Jiaotong University
by Rui Gao, Guanghui Yan, Yingzhi Wang, Tianfeng Yan, Ruiting Niu and Chunyang Tang
ISPRS Int. J. Geo-Inf. 2025, 14(1), 19; https://doi.org/10.3390/ijgi14010019 - 3 Jan 2025
Viewed by 1038
Abstract
Real-scene 3D digital campuses are essential for improving the accuracy and effectiveness of spatial data representation, facilitating informed decision-making for university administrators, optimizing resource management, and enriching user engagement for students and faculty. However, current approaches to constructing these digital environments face several challenges. They often rely on costly commercial platforms, struggle with integrating heterogeneous datasets, and require complex workflows to achieve both high precision and comprehensive campus coverage. This paper addresses these issues by proposing a systematic multi-source data fusion approach that employs open-source technologies to generate a real-scene 3D digital campus. A case study of Lanzhou Jiaotong University is presented to demonstrate the feasibility of this approach. Firstly, oblique photography based on unmanned aerial vehicles (UAVs) is used to capture large-scale, high-resolution images of the campus area, which are then processed using open-source software to generate an initial 3D model. Afterward, a high-resolution model of the campus buildings is then created by integrating the UAV data, while 3D Digital Elevation Model (DEM) and OpenStreetMap (OSM) building data provide a 3D overview of the surrounding campus area, resulting in a comprehensive 3D model for a real-scene digital campus. Finally, the 3D model is visualized on the web using Cesium, which enables functionalities such as real-time data loading, perspective switching, and spatial data querying. Results indicate that the proposed approach can effectively get rid of reliance on expensive proprietary systems, while rapidly and accurately reconstructing a real-scene digital campus. This framework not only streamlines data harmonization but also offers an open-source, practical, cost-effective solution for real-scene 3D digital campus construction, promoting further research and applications in twin city, Virtual Reality (VR), and Geographic Information Systems (GIS). Full article
Figures: challenges in integrating satellite imagery, DSM and oblique photography data layers; the Lanzhou Jiaotong University study area and UAV route planning for oblique photography; the overall workflow based on open-source tools; coordinate transformation and the camera view/clip-plane relationship (view coordinates and NDC); the Cesium-based 3D real-scene digital campus system, including stitching and spatial alignment of oblique photography 3D Tiles, multi-source data integration, location acquisition based on LGIRA, positional correction of the BIM model, dynamic display of construction stages and animated weather effects; and ground control point (GCP) selection and their links to positions in the oblique photography imagery.
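A recurring step in assembling such a Cesium scene is moving locally georeferenced models into the Earth-fixed frame. The sketch below shows the standard WGS84 geodetic-to-ECEF conversion as an illustration of that kind of coordinate transformation; the sample coordinates are approximate and the snippet is not taken from the paper's workflow.

```python
import numpy as np

WGS84_A = 6378137.0                 # semi-major axis (m)
WGS84_F = 1.0 / 298.257223563       # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert WGS84 latitude/longitude (degrees) and height (m) to ECEF XYZ (m)."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    n = WGS84_A / np.sqrt(1.0 - WGS84_E2 * np.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * np.cos(lat) * np.cos(lon)
    y = (n + h) * np.cos(lat) * np.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * np.sin(lat)
    return np.array([x, y, z])

# Approximate location in Lanzhou (illustrative coordinates only).
xyz = geodetic_to_ecef(36.10, 103.73, 1520.0)
```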
20 pages, 4856 KiB  
Article
Enhancing the Ground Truth Disparity by MAP Estimation for Developing a Neural-Net Based Stereoscopic Camera
by Hanbit Gil, Sehyun Ryu and Sungmin Woo
Sensors 2024, 24(23), 7761; https://doi.org/10.3390/s24237761 - 4 Dec 2024
Viewed by 854
Abstract
This paper presents a novel method to enhance ground truth disparity maps generated by Semi-Global Matching (SGM) using Maximum a Posteriori (MAP) estimation. SGM, while not producing visually appealing outputs like neural networks, offers high disparity accuracy in valid regions and avoids the generalization issues often encountered with neural network-based disparity estimation. However, SGM struggles with occlusions and textureless areas, leading to invalid disparity values. Our approach, though relatively simple, mitigates these issues by interpolating invalid pixels using surrounding disparity information and Bayesian inference, improving both the visual quality of disparity maps and their usability for training neural network-based commercial depth-sensing devices. Experimental results validate that our enhanced disparity maps preserve SGM’s accuracy in valid regions while improving the overall performance of neural networks on both synthetic and real-world datasets. This method provides a robust framework for advanced stereoscopic camera systems, particularly in autonomous applications. Full article
Figures: the proposed framework for enhancing SGM disparity maps; example stereo images and SGM disparity maps with invalid regions; the prior, likelihood and posterior distributions of an invalid pixel; patch preprocessing and candidate matching; disparity map comparisons on the synthetic Driving dataset and real-world indoor scenes against interpolation, PDE and inpainting baselines (ShCNN, GMCNN, MADF, Chen); the CNN-, ResNet- and Vision Transformer-based disparity estimation models and their results when trained with the original versus the proposed ground truth (including PSMNet); and the sensitivity of error and invalid-pixel ratio to the patch size and prior window size.
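The MAP idea in the abstract, combining a prior built from surrounding valid disparities with a matching likelihood and taking the disparity that maximizes the posterior, can be sketched for a single invalid pixel as follows. This is a schematic illustration under simplified assumptions (Gaussian prior, cost-based likelihood), not the paper's exact formulation.

```python
import numpy as np

def map_disparity(candidates, neighbor_disps, matching_costs, beta=0.1):
    """Pick the MAP disparity for one invalid pixel.

    candidates:     (K,) disparity hypotheses
    neighbor_disps: valid disparities around the pixel (defines the prior)
    matching_costs: (K,) photometric matching cost per hypothesis (defines the likelihood)
    """
    mu, sigma = np.mean(neighbor_disps), np.std(neighbor_disps) + 1e-6
    prior = np.exp(-(candidates - mu) ** 2 / (2 * sigma ** 2))
    likelihood = np.exp(-beta * np.asarray(matching_costs))
    posterior = prior * likelihood
    return candidates[np.argmax(posterior)]

# Example: 64 hypotheses, neighbours around 23 px, random matching costs.
cands = np.arange(64, dtype=float)
d_hat = map_disparity(cands, [22, 23, 24, 23], np.random.rand(64))
```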
15 pages, 7711 KiB  
Article
Development of Automated 3D LiDAR System for Dimensional Quality Inspection of Prefabricated Concrete Elements
by Shuangping Li, Bin Zhang, Junxing Zheng, Dong Wang and Zuqiang Liu
Sensors 2024, 24(23), 7486; https://doi.org/10.3390/s24237486 - 24 Nov 2024
Viewed by 1176
Abstract
The dimensional quality inspection of prefabricated concrete (PC) elements is crucial for ensuring overall assembly quality and enhancing on-site construction efficiency. However, current practices remain heavily reliant on manual inspection, which results in high operator dependency and low efficiency. Existing Light Detection and Ranging (LiDAR)-based methods also require skilled professionals for scanning and subsequent point cloud processing, thereby presenting technical challenges. This study developed a 3D LiDAR system for the automatic identification and measurement of the dimensional quality of PC elements. The system consists of (1) a hardware system integrated with camera and LiDAR components to acquire 3D point cloud data and (2) a user-friendly graphical user interface (GUI) software system incorporating a series of algorithms for automated point cloud processing using PyQt5. Field experiments comparing the system’s measurements with manual measurements on prefabricated bridge columns demonstrated that the system’s average measurement error was approximately 5 mm. The developed system can provide a quick, accurate, and automated inspection tool for dimensional quality assessment of PC elements, thereby enhancing on-site construction efficiency. Full article
Figures: the research framework; the developed device and its internal structure; the workflow and software system of the column end-face automatic inspection system; the field experiment on prefabricated bridge columns; point cloud visualization of frontal and side scans; point cloud processing and assessment of the PC column; rebar clustering results under different parameters (ε = 50/100, MinPoints = 5/10); and the absolute differences in embedded rebar spacing and length.
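The rebar identification step can be illustrated with a density-based clustering pass over the end-face point cloud; the ε/MinPoints parameters reported in the figures suggest a DBSCAN-style algorithm, so the sketch below uses scikit-learn's DBSCAN. The parameter values and the centroid-spacing step are my own illustration, not the system's code.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_rebars(points_mm: np.ndarray, eps: float = 50.0, min_points: int = 5):
    """Group end-face rebar points (N, 3) into individual rebars and return centroids."""
    labels = DBSCAN(eps=eps, min_samples=min_points).fit_predict(points_mm)
    centroids = [points_mm[labels == k].mean(axis=0)
                 for k in sorted(set(labels)) if k != -1]   # -1 marks noise points
    return labels, np.array(centroids)

# Spacing check between two neighbouring rebar centroids (mm):
# labels, c = cluster_rebars(rebar_points)
# spacing = np.linalg.norm(c[0] - c[1])
```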
17 pages, 7037 KiB  
Article
Experimental Study on the Bending Mechanical Properties of Socket-Type Concrete Pipe Joints
by Xu Liang, Jian Xu, Xuesong Song, Zhongyao Ren and Li Shi
Buildings 2024, 14(11), 3655; https://doi.org/10.3390/buildings14113655 - 17 Nov 2024
Viewed by 585
Abstract
In modern infrastructure construction, the socket joint of concrete pipelines is a critical component in ensuring the overall stability and safety of the pipeline system. This study conducted monotonic and cyclic bending loading tests on DN300 concrete pipeline socket joints to thoroughly analyse their bending mechanical properties. The experimental results indicated that during monotonic loading, the relationship between the joint angle and bending moment exhibited nonlinear growth, with the stress state of the socket joint transitioning from the initial contact between the rubber ring and the socket to the eventual contact between the spigot and socket concrete. During the cyclic loading phase, the accumulated joint angle, secant stiffness, and bending stiffness of the pipeline interface significantly increased within the first 1 to 7 cycles and stabilised between the 8th and 40th cycles. After 40 cycles of loading, the bending stiffness of the joint reached 1.5 kN·m2, while the stiffness of the pipeline was approximately 8500 times that of the joint. Additionally, a finite element model for the monotonic loading of the concrete pipeline socket joint was established, and the simulation results showed good agreement with the experimental data, providing a reliable basis for further simulation and analysis of the joint’s mechanical performance under higher loads. This study fills the gap in research on the mechanical properties of concrete pipeline socket joints, particularly under bending loads, and offers valuable references for related engineering applications. Full article
Figures: the flexural loading test setup, pipe fitting, interfacial dimensions and assembly of the socket interface; the bending-deformation calculation diagram; cyclic loading time histories at 10.5 kN and 17.5 kN; the monotonic loading process with load–displacement and moment–rotation curves; rubber ring twist and slippage; cumulative rotation angle, hysteresis curves, secant stiffness and flexural stiffness under cyclic loading; and the 3D finite element model of the interface with its simulated moment–rotation curve and the displacement and stress distributions under a bending moment of 8 kN·m.
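The secant stiffness reported for each hysteresis loop is the slope of the chord joining the extreme points of that cycle's moment–rotation curve. A small sketch of that computation, written as my own illustration of the definition rather than the authors' processing script:

```python
import numpy as np

def secant_stiffness(moment_kNm: np.ndarray, rotation_rad: np.ndarray) -> float:
    """Chord slope of one moment-rotation cycle, in kN*m per rad."""
    i_max, i_min = np.argmax(rotation_rad), np.argmin(rotation_rad)
    return (moment_kNm[i_max] - moment_kNm[i_min]) / (rotation_rad[i_max] - rotation_rad[i_min])

# Example cycle: rotation sweeps +/-0.01 rad while the moment sweeps +/-6 kN*m.
theta = np.array([0.0, 0.01, 0.0, -0.01, 0.0])
m = np.array([0.0, 6.0, 0.0, -6.0, 0.0])
k_sec = secant_stiffness(m, theta)   # 600 kN*m/rad for this toy cycle
```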
25 pages, 2849 KiB  
Article
Enhanced Hybrid U-Net Framework for Sophisticated Building Automation Extraction Utilizing Decay Matrix
by Ting Wang, Zhuyi Gong, Anqi Tang, Qian Zhang and Yun Ge
Buildings 2024, 14(11), 3353; https://doi.org/10.3390/buildings14113353 - 23 Oct 2024
Viewed by 872
Abstract
Automatically extracting buildings from remote sensing imagery using deep learning techniques has become essential for various real-world applications. However, mainstream methods often encounter difficulties in accurately extracting and reconstructing fine-grained features due to the heterogeneity and scale variations in building appearances. To address these challenges, we propose LDFormer, an advanced building segmentation model based on linear decay. LDFormer introduces a multi-scale detail fusion bridge (MDFB), which dynamically integrates shallow features to enhance the representation of local details and capture fine-grained local features effectively. To improve global feature extraction, the model incorporates linear decay self-attention (LDSA) and depthwise large separable kernel multi-layer perceptron (DWLSK-MLP) optimizations in the decoder. Specifically, LDSA employs a linear decay matrix within the self-attention mechanism to address long-distance dependency issues, while DWLSK-MLP utilizes step-wise convolutions to achieve a large receptive field. The proposed method has been evaluated on the Massachusetts, Inria, and WHU building datasets, achieving IoU scores of 76.10%, 82.87%, and 91.86%, respectively. LDFormer demonstrates superior performance compared to existing state-of-the-art methods in building segmentation tasks, showcasing its significant potential for building automation extraction. Full article
Figures: challenges in remote sensing images (variation in building size, shape, texture and color; shadows); an overview of LDFormer; the LDBlock structure compared with the Swin-Transformer block; the multi-scale detail fusion bridge (MDFB); the LDSA strategy; DW-MLP, MS-MLP and the proposed DWLSK-MLP; qualitative comparisons and large-image inference on the Massachusetts dataset and qualitative comparisons on the WHU and Inria test sets; ablation of the number of heads and window size on the Inria dataset; and a model complexity comparison on the Inria dataset.
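One plausible reading of "linear decay self-attention" is a standard scaled dot-product attention whose score matrix is penalized in proportion to the token distance |i − j|. The sketch below implements that reading in plain NumPy; it is an illustration of the general idea, not LDFormer's actual LDSA layer.

```python
import numpy as np

def linear_decay_attention(q, k, v, decay=0.1):
    """Self-attention with a linearly decaying positional penalty.

    q, k, v: (N, d) token matrices; decay: penalty per unit of token distance.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(n)
    scores -= decay * np.abs(idx[:, None] - idx[None, :])   # linear decay matrix
    scores -= scores.max(axis=-1, keepdims=True)             # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

tokens = np.random.rand(16, 32)
out = linear_decay_attention(tokens, tokens, tokens)
```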
26 pages, 11601 KiB  
Article
Raspberry Pi-Based IoT System for Grouting Void Detection in Tunnel Construction
by Weibin Luo, Junxing Zheng, Yu Miao and Lin Gao
Buildings 2024, 14(11), 3349; https://doi.org/10.3390/buildings14113349 - 23 Oct 2024
Viewed by 2583
Abstract
This paper presents an IoT-based solution for detecting grouting voids in tunnel construction using the Raspberry Pi microcomputer. Voids between the primary and secondary tunnel linings can compromise structural integrity, and traditional methods like GPR lack continuous feedback. The proposed system uses embedded electrical wires in the secondary lining to measure conductivity, with disruptions indicating unfilled voids. The Raspberry Pi monitors this in real time, uploading data to a cloud platform for engineer access via smartphone. Field tests were conducted in a full-scale, 600 m long tunnel to evaluate the system’s effectiveness. The tests demonstrated the system’s accuracy in detecting voids in various tunnel geometries, including straight sections, curves, and intersections. Using only the proposed void detection system, the largest void detected post-grouting was 1.8 cm, which is within acceptable limits and does not compromise the tunnel’s structural integrity or safety. The system proved to be a cost-effective and scalable solution for real-time monitoring during the grouting process, eliminating the need for continuous manual inspections. This study highlights the potential of IoT-based solutions in smart construction, providing a reliable and practical method for improving tunnel safety and operational efficiency during grouting operations. Full article
Figures: the architecture and setup of the proposed Raspberry Pi-based grouting void detection system; the electrical wires embedded within the secondary lining; the Raspberry Pi platform and its GPIO interface; the system's main components; the void detection algorithm; the IoT architecture; and the field test setup.
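The sensing principle described above, a detection wire whose broken continuity flags a possible void, maps naturally onto a Raspberry Pi GPIO input with a pull-up resistor: the wire ties the pin to ground, so a HIGH reading means the circuit is open. The sketch below is a minimal illustration of that polling loop; the pin number, polling interval and cloud endpoint are assumptions, not the paper's configuration.

```python
import time
import requests
import RPi.GPIO as GPIO

SENSE_PIN = 17                                       # BCM pin in series with the wire (assumed)
CLOUD_URL = "https://example.com/api/void-status"    # placeholder cloud endpoint

GPIO.setmode(GPIO.BCM)
GPIO.setup(SENSE_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

try:
    while True:
        circuit_open = GPIO.input(SENSE_PIN) == GPIO.HIGH   # open circuit -> possible void
        payload = {"timestamp": time.time(), "void_suspected": bool(circuit_open)}
        try:
            requests.post(CLOUD_URL, json=payload, timeout=5)
        except requests.RequestException:
            pass                                             # keep sampling if the uplink drops
        time.sleep(10)                                       # poll every 10 s during grouting
finally:
    GPIO.cleanup()
```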
16 pages, 9902 KiB  
Article
Research on a Photovoltaic Panel Dust Detection Algorithm Based on 3D Data Generation
by Chengzhi Xie, Qifen Li, Yongwen Yang, Liting Zhang and Xiaojing Liu
Energies 2024, 17(20), 5222; https://doi.org/10.3390/en17205222 - 20 Oct 2024
Viewed by 1094
Abstract
With the rapid advancements in AI technology, UAV-based inspection has become a mainstream method for intelligent maintenance of PV power stations. To address limitations in accuracy and data acquisition, this paper presents a defect detection algorithm for PV panels based on an enhanced YOLOv8 model. The PV panel dust dataset is manually extended using 3D modeling technology, which significantly improves the model’s ability to generalize and detect fine dust particles in complex environments. SENetV2 is introduced to improve the model’s perception of dust features in cluttered backgrounds. AKConv replaces traditional convolution in the neck network, allowing for more flexible and accurate feature extraction through arbitrary kernel parameters and sampling shapes. Additionally, a DySample dynamic upsampler accelerates processing by 8.73%, improving the frame rate from 87.58 FPS to 95.23 FPS while maintaining efficiency. Experimental results show that the 3D image expansion method contributes to a 4.6% increase in detection accuracy, an 8.4% improvement in recall, a 5.7% increase in mAP@50, and a 15.1% improvement in mAP@50-95 compared to the original YOLOv8. The expanded dataset and enhanced model demonstrate the effectiveness and practicality of the proposed approach. Full article
Figures: the overall experimental flow; the YOLOv8 model structure and the improved network; the SENet and SENetV2 modules; the DySample dynamic upsampling structure; the AKConv structure; 3D modeling of PV panels in Blender with dust randomization script settings and surface dust at different particle sizes; the ResNet50 network with low- and high-quality samples and photographs of real samples; partial experimental data after 3D image expansion; inference results; and the effect of different modules on mAP@50 and recall.
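Fine-tuning a stock YOLOv8 detector on such a Blender-augmented dust dataset follows the usual Ultralytics workflow shown below; the dataset YAML and image paths are placeholders, and the paper's architectural changes (SENetV2, AKConv, DySample) are not reproduced here.

```python
from ultralytics import YOLO

# Start from a pretrained YOLOv8 nano checkpoint and fine-tune on the dust dataset.
model = YOLO("yolov8n.pt")
model.train(data="pv_dust.yaml", epochs=100, imgsz=640, batch=16)

# Run inference on a UAV image of a PV panel and keep confident detections.
results = model.predict("panel_001.jpg", conf=0.25)
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)
```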
21 pages, 6078 KiB  
Article
Multi-Feature-Filtering-Based Road Curb Extraction from Unordered Point Clouds
by Hong Lang, Yuan Peng, Zheng Zou, Shengxue Zhu, Yichuan Peng and Hao Du
Sensors 2024, 24(20), 6544; https://doi.org/10.3390/s24206544 - 10 Oct 2024
Viewed by 1209
Abstract
Road curb extraction is a critical component of road environment perception, being essential for calculating road geometry parameters and ensuring the safe navigation of autonomous vehicles. The existing research primarily focuses on extracting curbs from ordered point clouds, which are constrained by their structure of point cloud organization, making it difficult to apply them to unordered point cloud data and making them susceptible to interference from obstacles. To overcome these limitations, a multi-feature-filtering-based method for curb extraction from unordered point clouds is proposed. This method integrates several techniques, including the grid height difference, normal vectors, clustering, an alpha-shape algorithm based on point cloud density, and the MSAC (M-Estimate Sample Consensus) algorithm for multi-frame fitting. The multi-frame fitting approach addresses the limitations of traditional single-frame methods by fitting the curb contour every five frames, ensuring more accurate contour extraction while preserving local curb features. Based on our self-developed dataset and the Toronto dataset, these methods are integrated to create a robust filter capable of accurately identifying curbs in various complex scenarios. Optimal threshold values were determined through sensitivity analysis and applied to enhance curb extraction performance under diverse conditions. Experimental results demonstrate that the proposed method accurately and comprehensively extracts curb points in different road environments, proving its effectiveness and robustness. Specifically, the average curb segmentation precision, recall, and F1 score values across scenarios A, B (intersections), C (straight road), and scenarios D and E (curved roads and ghosting) are 0.9365, 0.782, and 0.8523, respectively. Full article
Show Figures
Figure 1. Algorithm framework.
Figure 2. MLS system for field data acquisition.
Figure 3. Various road scenarios. The road curbs within the red boxes A–E are the scenes for the comparison experiments.
Figure 4. Extraction results for road scenarios A and B: (a) after grid height difference processing; (b) after normal vector extraction; (c) after clustering and the variable-radius alpha-shape algorithm; (d) after the multi-frame fitting MSAC algorithm. The colors indicate varying heights of the curb points.
Figure 5. Extraction results for road scenario C, processed as in Figure 4 (a–d).
Figure 6. Extraction results for road scenarios D and E, processed as in Figure 4 (a–d).
Figure 7. Comparison of road curb extraction using single-frame and multi-frame fitting for road scenarios A and B, C, and D and E.
Figure 8. Comparison using the Toronto dataset: (a) our results; (b) Mi's results.
19 pages, 13819 KiB  
Article
An Algorithm for Simplifying 3D Building Models with Consideration for Detailed Features and Topological Structure
by Zhenglin Li, Zhanjie Zhao, Wujun Gao and Li Jiao
ISPRS Int. J. Geo-Inf. 2024, 13(10), 356; https://doi.org/10.3390/ijgi13100356 - 8 Oct 2024
Viewed by 1181
Abstract
To tackle problems such as the destruction of topological structures and the loss of detailed features in the simplification of 3D building models, we propose a 3D building model simplification algorithm that considers detailed features and topological structures. Based on the edge collapse algorithm, the method defines the region formed by the first-order neighboring triangles of the endpoints of the edge to be collapsed as the simplification unit. It incorporates the centroid displacement of the simplification unit, its significance level, and the approximate curvature of the edge as influencing factors for the collapse cost, in order to control the edge collapse sequence and preserve model details. Additionally, considering the unique properties of 3D building models, boundary edge detection and face overlay detection are added as constraints to maintain the model's topological structure. The experimental results show that the algorithm is superior to the classic QEM algorithm in terms of preserving the topological structure and detailed features of the model. Compared to the QEM algorithm and the other two comparison algorithms selected in this paper, the simplified model produced by this algorithm exhibits a reduction in Hausdorff distance, mean error, and mean square error to varying degrees. Moreover, the advantages of this algorithm become more pronounced as the simplification rate increases. The research findings can be applied to the simplification of 3D building models. Full article
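The collapse cost described above can be pictured as a classic quadric error modulated by detail-sensitive factors. The sketch below is a hypothetical weighting, assuming the per-edge quantities have already been computed; it is not the paper's exact formula, and the weights are illustrative.

```python
def collapse_cost(quadric_error, centroid_shift, saliency, edge_curvature,
                  w_shift=1.0, w_sal=1.0, w_curv=1.0):
    """Quadric (QEM) error of an edge contraction scaled by detail-sensitive factors.

    quadric_error  : classic QEM error of collapsing the edge.
    centroid_shift : displacement of the simplification unit's centroid.
    saliency       : significance level of the simplification unit.
    edge_curvature : approximate curvature of the edge to be collapsed.
    """
    detail_factor = 1.0 + w_shift * centroid_shift + w_sal * saliency + w_curv * edge_curvature
    return quadric_error * detail_factor
```

In an edge-collapse loop, edges are popped from a priority queue ordered by this cost and re-inserted with updated costs after each contraction, so flat, low-saliency regions are simplified first; the boundary-edge and face-overlap checks then act as hard constraints that veto individual collapses.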
Show Figures
Figure 1. Edge collapse.
Figure 2. Centroid displacement.
Figure 3. Schematic diagram of the calculation of simplification unit saliency.
Figure 4. Boundary point.
Figure 5. The role of boundary edge constraints.
Figure 6. Common neighborhood vertices.
Figure 7. The role of surface superposition detection.
Figure 8. Process of the algorithm.
Figure 9. Original model.
Figures 10–13. Simplification results of each algorithm for the house model at simplification rates of 20%, 50%, 80%, and 95% (11,578, 7237, 2895, and 724 faces): (a) QEM algorithm; (b) algorithm from Reference [18]; (c) algorithm from Reference [21]; (d) algorithm in this paper.
Figures 14–17. Simplification results of each algorithm for the pagoda model at simplification rates of 20%, 50%, 80%, and 95% (433,448, 270,905, 108,362, and 27,091 faces): (a)–(d) as above.
Figure 18. Simplified results without considering centroid displacement.
Figure 19. Simplified results without considering significance.
Figure 20. Simplified results without considering edge approximate curvature.
Figure 21. Simplified results of this algorithm.
18 pages, 3496 KiB  
Article
Analysis of Guidance Signage Systems from a Complex Network Theory Perspective: A Case Study in Subway Stations
by Fei Peng, Zhe Zhang and Qingyan Ding
ISPRS Int. J. Geo-Inf. 2024, 13(10), 342; https://doi.org/10.3390/ijgi13100342 - 25 Sep 2024
Viewed by 857
Abstract
Guidance signage systems (GSSs) play a major role in pedestrian navigation in public buildings, and a vulnerable GSS can cause wayfinding difficulties for pedestrians. In order to investigate the robustness of GSSs, a complex network-based GSS robustness analysis framework is proposed in this paper. First, a method that transforms a GSS into a guidance service network (GSN) is proposed by analyzing the relationships among various signs, and signage node metrics are proposed to evaluate the importance of signage nodes. Second, two network performance metrics, namely the level of visibility and the guidance efficiency, are proposed to evaluate the robustness of the GSN under various disruption modes, and the most important signage node metrics are determined. Finally, a multi-objective optimization model is established to find the optimal weights of these metrics, and a comprehensive evaluation method is proposed to locate the critical signage nodes that should receive increased maintenance effort. A case study was conducted in a subway station, and the GSS was successfully transformed into a GSN. The analysis results show that the GSN has scale-free characteristics, and recommendations for GSS design are proposed on the basis of the robustness analysis. Signage nodes with high betweenness centrality play a greater role in the GSN than signage nodes with high degree centrality. The proposed critical signage node evaluation method can be used to efficiently identify the signage nodes whose failure has the greatest effect on GSN performance. Full article
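The robustness analysis described above can be prototyped with a general-purpose graph library. The following sketch, which assumes the networkx package and uses global efficiency as a stand-in for the paper's guidance efficiency metric, ranks signage nodes by betweenness centrality and tracks how efficiency degrades as the top-ranked nodes fail.

```python
import networkx as nx

def efficiency_under_targeted_failure(gsn: nx.Graph, frac_removed=0.2):
    """Remove the signage nodes with the highest betweenness centrality one by one
    and record how the network's global efficiency degrades."""
    g = gsn.copy()
    ranked = sorted(nx.betweenness_centrality(g).items(),
                    key=lambda kv: kv[1], reverse=True)
    n_remove = int(frac_removed * g.number_of_nodes())
    efficiencies = [nx.global_efficiency(g)]
    for node, _ in ranked[:n_remove]:
        g.remove_node(node)
        efficiencies.append(nx.global_efficiency(g))
    return efficiencies

# Toy GSN in which sign "S3" bridges two corridors, so its failure hurts the most.
gsn = nx.Graph([("S1", "S2"), ("S2", "S3"), ("S3", "S4"), ("S4", "S5"), ("S3", "S6")])
print(efficiency_under_targeted_failure(gsn, frac_removed=0.4))
```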
Show Figures
Figure 1. Missing information (to train and direction arrow) due to light tube failure.
Figure 2. Methodological framework.
Figure 3. Interaction relationship between signs.
Figure 4. Influence factors of the relationship between any two signage nodes.
Figure 5. VCA of signage node k.
Figure 6. The occlusion effects of obstacles.
Figure 7. GSN for the GSS of the Suyuan subway station.
Figure 8. Degree distributions.
Figure 9. Locations of the ten highest-ranked signs in each case.
Figure 10. Robustness of the GSN under failure conditions.
Figure 11. Dividing one long signboard into two short signboards.
Figure 12. Guidance efficiency under short and long signboard scenarios.
Figure 13. Weights of degree and betweenness centrality under various numbers of removed signage nodes.
17 pages, 10236 KiB  
Article
Research on a 3D Point Cloud Map Learning Algorithm Based on Point Normal Constraints
by Zhao Fang, Youyu Liu, Lijin Xu, Mahamudul Hasan Shahed and Liping Shi
Sensors 2024, 24(19), 6185; https://doi.org/10.3390/s24196185 - 24 Sep 2024
Viewed by 1042
Abstract
Laser point clouds are commonly affected by Gaussian and Laplace noise, resulting in decreased accuracy in subsequent surface reconstruction and visualization processes. However, existing point cloud denoising algorithms often overlook the local consistency and density of the point cloud normal vectors. A feature map learning algorithm that integrates point normal constraints, Dirichlet energy, and a coupled orthogonality bias term is proposed. Specifically, the Dirichlet energy is employed to penalize the difference between neighboring normal vectors and is combined with a coupled orthogonality bias term to enhance the orthogonality between the normal vectors and the subsurface, thereby enhancing the accuracy and robustness of the learned denoising feature maps. Additionally, to mitigate the effect of mixed noise, a point cloud density function is introduced to rapidly capture local feature correlations. In experiments on the public Anchor dataset, the proposed method reduces the average mean square error (MSE) by 0.005 and 0.054 compared to the MRPCA and NLD algorithms, respectively. Moreover, it improves the average signal-to-noise ratio (SNR) by 0.13 dB and 2.14 dB compared to MRPCA and AWLOP, respectively. The proposed algorithm enhances computational efficiency by 27% compared to the RSLDM method. It not only removes mixed noise but also preserves the local geometric features of the point cloud, further improving computational efficiency. Full article
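The Dirichlet energy term mentioned above penalizes differences between neighboring normals. A minimal sketch of such a discrete energy over a k-nearest-neighbor graph, assuming NumPy/SciPy arrays of points and unit normals, is given below; the paper's full objective additionally couples this with the orthogonality bias and density terms.

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_dirichlet_energy(points, normals, k=8):
    """Sum of squared differences between each point's unit normal and the normals
    of its k nearest neighbours: a discrete Dirichlet energy that penalises a
    locally inconsistent normal field."""
    tree = cKDTree(points)
    _, nbrs = tree.query(points, k=k + 1)             # first neighbour is the point itself
    diffs = normals[nbrs[:, 1:]] - normals[:, None, :]
    return float(np.sum(diffs ** 2))
```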
Show Figures
Figure 1. A point cloud model with Gaussian noise and Laplacian noise.
Figure 2. Local consistency constraint of point cloud normal vectors.
Figure 3. Point cloud model: (a) Anchor ground truth; (b) Gargoyle ground truth.
Figure 4. Point cloud model at different noise strengths: (a) noise (σ = 0.02); (b) noise (σ = 0.04).
Figures 5–12. Noise reduction effects of PNCFGL, APSS, NLD, and MRPCA on the Anchor and Gargoyle models at noise strengths σ = 0.02 and σ = 0.04.
Figure 13. Laser radar scanner.
Figure 14. Scanning objects: (a) pipe fitting; (b) pipe circle; (c) rear support; (d) front support.
Figure 15. Comparison of objects (a–d) before and after PNCFGL noise reduction.
25 pages, 10814 KiB  
Article
Three-Dimensional Web-Based Client Presentation of Integrated BIM and GIS for Smart Cities
by Abdullah Varlık and İsmail Dursun
Buildings 2024, 14(9), 3021; https://doi.org/10.3390/buildings14093021 - 23 Sep 2024
Cited by 3 | Viewed by 1913
Abstract
Smart cities use technological solutions to reduce the drawbacks of urban living. The importance of BIM and GIS integration has increased with the popularity of the smart city and 3D city concepts in recent years. In addition to 3D city models, Building Information Modeling (BIM) is an essential element of smart cities. The 3D city model web client in this study displays three-dimensional (3D) city models created using photogrammetric techniques, BIM, and campus infrastructure projects. The comparison and integration of the aforementioned systems were evaluated. The goal of the submitted work is a web-based 3D client framework and implementation for combined BIM and 3D city models. The Web is a very challenging platform for 3D data presentation. The open-source Cesium engine, based on HTML5 and WebGL, and the virtualcityMAP application built on the Cesium infrastructure were used in this study. Full article
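The paper performs the IFC-to-CityGML transformation with an FME workbench (see Figure 14 below). Purely as an illustration of the semantic side of that mapping, the following sketch uses the open-source ifcopenshell library to count the IFC classes that commonly map to CityGML building elements; the file name is a placeholder and this is not the authors' workflow.

```python
import ifcopenshell  # open-source IFC toolkit (pip install ifcopenshell)

def summarize_ifc(path):
    """Count the IFC classes that commonly map to CityGML building elements."""
    model = ifcopenshell.open(path)
    for ifc_class in ("IfcBuilding", "IfcBuildingStorey", "IfcWall", "IfcSlab", "IfcRoof"):
        print(f"{ifc_class}: {len(model.by_type(ifc_class))} element(s)")

summarize_ifc("model.ifc")  # placeholder file name
```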
Show Figures
Figure 1. LoD representation as defined by CityGML 2.0 and CityGML 3.0 [16,17].
Figure 2. Relationship between the LoD and the degree of representativeness [22].
Figure 3. Snapshot of a building modeled in IFC (right side) and CityGML (left side) [26].
Figure 4. Research route of the full level of detail (LoD) specification for 3D building models. IFC, Industry Foundation Classes; ILoD, indoor LoD; OLoD, outdoor LoD [18].
Figure 5. Semantic mapping of IFC and CityGML classes (yellow outlines indicate IFC classes and green outlines indicate CityGML classes; classes in boxes without black outlines do not carry geometric information) [39].
Figure 6. Working area.
Figure 7. BIM model generated through CAD-to-BIM conversion.
Figure 8. (a) SBIF UAV data; (b) campus orthophoto.
Figure 9. SBIF LiDAR point cloud.
Figure 10. Scan-to-BIM model.
Figure 11. Integration of CityGML and IFC. Note: the "*" UML notation represents the cardinality of the relationships among CityGML classes (the number of possible occurrences); an intermediate model is shown inside the red box.
Figure 12. 3D city model/BIM integration model.
Figure 13. Georeferencing.
Figure 14. IFC-to-CityGML transformation FME workbench [70].
Figure 15. The CityGML structure used, encoded in GML.
Figure 16. Main data.
Figure 17. Supplementary data.
19 pages, 1468 KiB  
Article
Research on Technological Innovation Capability of Yancheng Prefabricated Construction Industry Based on Patent Information Analysis
by Renyan Lu, Feiting Shi and Houchao Sun
Buildings 2024, 14(9), 2968; https://doi.org/10.3390/buildings14092968 - 19 Sep 2024
Viewed by 1033
Abstract
In order to improve the innovation capability of Yancheng's prefabricated construction industry, the Dawei Innojoy patent database was used to retrieve patent literature data on prefabricated construction technology for Yancheng and other major cities in the Yangtze River Delta region from 2012 to 2022, and the prefabricated construction patents in Yancheng were analyzed in terms of application trends, application type composition, applicants, technical fields, and patent legal status. At the same time, an evaluation system for prefabricated building technology innovation capability was constructed, and the factor analysis method was used to compare the prefabricated building technology patent indicators of Yancheng with those of major cities in the Yangtze River Delta region. The results show that Yancheng has a small number of patent applications, a small proportion of invention patents, a low patent authorization rate, and a low patent conversion rate, and that its industry-university-research chain needs to be opened up. Among the cities in the Yangtze River Delta, Yancheng's comprehensive innovation capability in prefabricated building technology is medium to low: it lags behind in the scale and quality of technological innovation but ranks at the forefront in technological innovation operations. Based on this, the article puts forward countermeasures and suggestions for Yancheng's prefabricated building technology patent applications at the macro, meso, and micro levels, in order to achieve efficient innovation and promote the high-quality development of Yancheng's new building industrialization. Full article
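The factor-analysis comparison described above can be sketched with scikit-learn. The indicator matrix below is entirely made up for illustration (rows are cities; columns are indicators such as application count, invention-patent share, grant rate, and conversion rate), and averaging the factor scores is only a naive stand-in for the paper's weighted composite score.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

# Rows: cities; columns: application count, invention-patent share,
# grant rate, conversion rate. All figures are invented for illustration.
X = np.array([
    [420, 0.31, 0.55, 0.08],
    [980, 0.45, 0.62, 0.15],
    [150, 0.22, 0.48, 0.05],
    [760, 0.40, 0.60, 0.12],
])

Xz = StandardScaler().fit_transform(X)
scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(Xz)
composite = scores.mean(axis=1)      # naive composite score per city
print(composite.round(3))
```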
Show Figures
Figure 1. Number of patent applications in Yancheng City's prefabricated construction technology industry from 2012 to 2022.
Figure 2. Composition of patent application types in Yancheng City's prefabricated construction technology industry.
Figure 3. The main applicants for Yancheng's prefabricated construction technology industry patents.
Figure 4. Legal status of patent applications for Yancheng's prefabricated construction technology industry.
Figure 5. Comparison of the total number and effective number of patent applications in the prefabricated construction technology industry in major cities in the Yangtze River Delta.
Figure 6. Comparison of the proportion of valid authorizations for patent applications in the prefabricated construction technology industry in major cities in the Yangtze River Delta.
18 pages, 4137 KiB  
Article
A Minimal Solution Estimating the Position of Cameras with Unknown Focal Length with IMU Assistance
by Kang Yan, Zhenbao Yu, Chengfang Song, Hongping Zhang and Dezhong Chen
Drones 2024, 8(9), 423; https://doi.org/10.3390/drones8090423 - 24 Aug 2024
Viewed by 949
Abstract
Drones are typically built with integrated cameras and inertial measurement units (IMUs). It is crucial to achieve drone attitude control through relative pose estimation using cameras. IMU drift can be ignored over short periods. Based on this premise, four methods are proposed in this paper for estimating the relative pose and focal length in various application scenarios: for scenarios where the camera's focal length varies between adjacent moments and is unknown, the relative pose and focal length can be computed from four-point correspondences; for planar motion scenarios where the camera's focal length varies between adjacent moments and is unknown, the relative pose and focal length can be determined from three-point correspondences; for planar motion where the camera's focal length is equal between adjacent moments and is unknown, the relative pose and focal length can be calculated from two-point correspondences; finally, for scenarios where multiple cameras are employed for image acquisition but only one is calibrated, a method is proposed for estimating the pose and focal length of the uncalibrated cameras. The numerical stability and performance of these methods are compared and analyzed under various noise conditions using simulated datasets. We also assessed the performance of these methods on real datasets captured by a drone in various scenes. The experimental results demonstrate that the methods proposed in this paper achieve superior accuracy and stability compared with classical methods. Full article
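A common way to exploit the IMU in such solvers (not necessarily the authors' exact parameterization) is to use the measured roll and pitch to align both camera frames with gravity, so that the unknown rotation reduces to a single yaw angle θ:

$$
\mathbf{R} \;=\; \mathbf{R}_{\mathrm{imu},2}^{\top}\,\mathbf{R}_{y}(\theta)\,\mathbf{R}_{\mathrm{imu},1},
\qquad
\mathbf{R}_{y}(\theta) =
\begin{bmatrix}
\cos\theta & 0 & \sin\theta \\
0 & 1 & 0 \\
-\sin\theta & 0 & \cos\theta
\end{bmatrix}.
$$

With only θ, the translation, and the focal length left unknown, the small point counts quoted in the abstract (four, three, or two correspondences, depending on the motion and focal-length assumptions) become plausible.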
Show Figures
Figure 1. Epipolar geometry: O1 and O2 are the camera centers; P is the target feature point; p1 and p2 are the pixel coordinates of the feature point; e1 and e2 are the epipoles, where the line connecting O1 and O2 intersects the image planes; O1, O2, and P form the epipolar plane; l1 and l2 are the epipolar lines, where the epipolar plane intersects the image planes.
Figure 2. Focal length error probability density for 10,000 randomly generated problem instances.
Figure 3. Translation matrix error probability density for 10,000 randomly generated problem instances.
Figure 4. Error variation curve of focal length f with different scale errors in pixel coordinates.
Figure 5. Error variation curve of translation vector t with different scale errors in pixel coordinates.
Figure 6. Error variation curves of the eight methods when different levels of noise are introduced into the three IMU rotation angles: (a)–(c) median focal length error after introducing pitch, yaw, and roll angle rotation errors; (d)–(f) median translation vector error after introducing pitch, yaw, and roll angle rotation errors.
Figure 7. Images captured by the drone: (a) outdoor landscapes; (b) urban buildings; (c) road vehicles.
Figure 8. Schematic of feature point extraction using the SIFT algorithm.
Figure 9. Cumulative distribution functions of the estimated camera focal length and translation vector errors across the three scenarios in Figure 7.
Figure 10. Three-dimensional trajectory plot of real data.
Figure 11. Two-dimensional trajectory plot of real data.
23 pages, 63398 KiB  
Article
Automatic Generation of Standard Nursing Unit Floor Plan in General Hospital Based on Stable Diffusion
by Zhuo Han and Yongquan Chen
Buildings 2024, 14(9), 2601; https://doi.org/10.3390/buildings14092601 - 23 Aug 2024
Cited by 1 | Viewed by 1187
Abstract
This study focuses on the automatic generation of architectural floor plans for standard nursing units in general hospitals based on Stable Diffusion. It aims to assist architects in efficiently generating a variety of preliminary plan preview schemes and to enhance the efficiency of the pre-planning stage of medical buildings. The workflow includes dataset processing, model training, and model testing and generation, and it enables the generation of well-organized, clear, and readable functional block floor plans with strong generalization capabilities from the input boundary of the nursing unit's floor plan. Quantitative analysis demonstrated that 82% of the generated samples met the evaluation criteria for standard nursing units. Additionally, a comparative experiment was conducted using the same dataset to train a deep learning model based on Generative Adversarial Networks (GANs). The conclusion describes the strengths and limitations of the methodology and points out directions for improvement in future studies. Full article
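For readers who want a concrete picture of the image-to-image + ControlNet generation step (Figure 13 below), the following sketch uses the Hugging Face diffusers library. The checkpoint names, the LoRA path, and all hyperparameter values are placeholders; this does not reproduce the authors' trained model.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline
from PIL import Image

# Placeholder checkpoints: a line-conditioned ControlNet and a base SD 1.5 model,
# plus hypothetical LoRA weights fine-tuned on nursing-unit floor plans.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-mlsd",
                                             torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights("path/to/nursing_unit_lora")        # hypothetical LoRA path

boundary = Image.open("unit_boundary.png").convert("RGB")  # plan boundary drawing
plan = pipe(
    prompt="standard nursing unit floor plan, functional colour blocks",
    image=boundary,                      # image-to-image initial image
    control_image=boundary,              # ControlNet conditioning image
    strength=0.75,                       # denoising strength (illustrative)
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0,
).images[0]
plan.save("generated_plan.png")
```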
Show Figures
Figure 1. (a) The basic architecture of the SD model: Latent Diffusion Model [8]; (b) LoRA [9].
Figure 2. Methodological framework of the experiment.
Figure 3. Main- and sub-corridor style nursing unit floor plan dataset (portion).
Figure 4. Stable Diffusion loss.
Figure 5. Testing and generation framework.
Figure 6. Sampling steps and denoising strength.
Figure 7. Other hyperparameters unchanged; the seed is changed.
Figure 8. Other hyperparameters unchanged; the input image is added to the main corridor and the seed is changed.
Figure 9. ControlNet preprocessors.
Figure 10. ControlNet preprocessors with the input image added to the main corridor.
Figure 11. ControlNet Weight and Guidance Start.
Figure 12. Preprocessor and Guidance End.
Figure 13. Image-to-Image + ControlNet, changing the seed.
Figure 14. Parameter-controlled boundary generation.
Figure 15. Area feature distribution.
20 pages, 8685 KiB  
Article
Numerical Simulation and Field Monitoring of Blasting Vibration for Tunnel In-Situ Expansion by a Non-Cut Blast Scheme
by Zhenchang Guan, Lifu Xie, Dong Chen and Jingkang Shi
Sensors 2024, 24(14), 4546; https://doi.org/10.3390/s24144546 - 13 Jul 2024
Cited by 2 | Viewed by 1563
Abstract
In-situ tunnel expansion projects have become increasingly common due to the growing demand for transportation. The traditional blast scheme requires a large quantity of explosives, and its vibration effect is hard to control. In order to reduce explosive consumption and the vibration effect, an optimized non-cut blast scheme was proposed and applied to the in-situ expansion of the Gushan Tunnel. Refined numerical simulation was adopted to compare the traditional and optimized blast schemes. The vibration attenuation within the interlaid rock mass and the vibration effect on the adjacent tunnel were studied and compared. The simulation results were validated by field monitoring of the vibration effect on the adjacent tunnel. Both the simulation and the monitoring results showed that the vibration velocity on the adjacent tunnel's back side was much smaller than its counterpart on the blast side, i.e., the presence of the cavity reduced the blasting vibration effect significantly. The optimized non-cut blast scheme, which effectively utilizes the existing free surface, can reduce explosive consumption and the vibration effect significantly, and may be preferred for in-situ tunnel expansion projects. Full article
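The monitored velocity-time histories are typically reduced to a peak particle velocity and a dominant frequency (see Figures 13, 15, and 22 below). A minimal NumPy sketch of that reduction, applied here to a synthetic decaying oscillation rather than real monitoring data, is:

```python
import numpy as np

def ppv_and_dominant_frequency(velocity, dt):
    """Peak particle velocity and dominant frequency of one velocity component
    sampled at interval dt (seconds)."""
    ppv = np.max(np.abs(velocity))
    spectrum = np.abs(np.fft.rfft(velocity))
    freqs = np.fft.rfftfreq(len(velocity), d=dt)
    dominant = freqs[np.argmax(spectrum[1:]) + 1]     # skip the DC component
    return ppv, dominant

# Synthetic record: a decaying 60 Hz oscillation sampled at 1 kHz.
t = np.arange(0.0, 0.5, 0.001)
v = 2.0 * np.exp(-8.0 * t) * np.sin(2.0 * np.pi * 60.0 * t)
print(ppv_and_dominant_frequency(v, 0.001))
```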
Show Figures
Figure 1. Engineering practices of tunnel reconstruction or expansion.
Figure 2. Typical cross-section of the Gushan tunnel before and after in-situ expansion (unit: m).
Figure 3. Excavation sequence for the in-situ expansion of the north tunnel. The dashed areas represent the lining profile after in-situ expansion and the "+" areas represent the unexcavated rock mass.
Figure 4. Traditional blast scheme for the top part of the north tunnel. Numbers represent detonator sequences, plus signs represent unexcavated rock mass, and circles represent blast holes.
Figure 5. Non-cut blast scheme for the top part of the north tunnel.
Figure 6. Loading boundary of the equivalent blasting load.
Figure 7. Numerical model for the Gushan tunnel.
Figure 8. Equivalent blasting loads for every detonator sequence in the traditional blast scheme.
Figure 9. Implementation of the equivalent blasting load for the traditional blast scheme: (a) detonator sequence 8; (b) detonator sequence 14.
Figure 10. Equivalent blasting load for every detonator sequence in the non-cut blast scheme.
Figure 11. Implementation of the equivalent blasting load for the non-cut blast scheme: (a) detonator sequence 8; (b) detonator sequence 14.
Figure 12. Arrangement of numerical monitoring points (unit: m).
Figure 13. Velocity-time histories and frequency spectra of the M6 monitoring point for the traditional blast scheme (X- and Z-directions).
Figure 14. Maximum velocities on the adjacent tunnel for the traditional blast scheme: (a) X-direction; (b) Z-direction (unit: cm/s).
Figure 15. Velocity-time histories and frequency spectra of the M6 monitoring point for the non-cut blast scheme (X- and Z-directions).
Figure 16. Maximum velocities on the adjacent tunnel for the non-cut blast scheme: (a) X-direction; (b) Z-direction (unit: cm/s).
Figure 17. Maximum velocities within the interlaid rock mass for the traditional blast scheme: (a) X-direction; (b) Z-direction.
Figure 18. Maximum velocities within the interlaid rock mass for the non-cut blast scheme: (a) X-direction; (b) Z-direction.
Figure 19. The upper part of the NK18+110 section before and after expansion.
Figure 20. Field monitoring for blasting vibration.
Figure 21. Arrangement of blasting vibration meters.
Figure 22. Velocity-time histories and frequency spectra of the M6 and M7 monitoring points recorded by field monitoring and compared with the numerical simulation results (X- and Z-directions).
21 pages, 3782 KiB  
Article
Globally Optimal Relative Pose and Scale Estimation from Only Image Correspondences with Known Vertical Direction
by Zhenbao Yu, Shirong Ye, Changwei Liu, Ronghe Jin, Pengfei Xia and Kang Yan
ISPRS Int. J. Geo-Inf. 2024, 13(7), 246; https://doi.org/10.3390/ijgi13070246 - 9 Jul 2024
Viewed by 1054
Abstract
Installing multi-camera systems and inertial measurement units (IMUs) in self-driving cars, micro aerial vehicles, and robots is becoming increasingly common. An IMU provides the vertical direction, allowing coordinate frames to be aligned in a common direction. The degrees of freedom (DOFs) of the rotation matrix are reduced from 3 to 1. In this paper, we propose a globally optimal solver to calculate the relative poses and scale of generalized cameras with a known vertical direction. First, the cost function is established to minimize algebraic error in the least-squares sense. Then, the cost function is transformed into two polynomials with only two unknowns. Finally, the eigenvalue method is used to solve the relative rotation angle. The performance of the proposed method is verified on both simulated and KITTI datasets. Experiments show that our method is more accurate than the existing state-of-the-art solver in estimating the relative pose and scale. Compared to the best method among the comparison methods, the method proposed in this paper reduces the rotation matrix error, translation vector error, and scale error by 53%, 67%, and 90%, respectively. Full article
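The errors quoted above are the usual rotation, translation-direction, and scale metrics (the figure captions below denote them ε_R, ε_t,dir, and ε_s). A sketch of common definitions, which may differ in detail from the paper's, is:

```python
import numpy as np

def pose_scale_errors(R_est, R_gt, t_est, t_gt, s_est, s_gt):
    """Rotation angle error (deg), translation direction error (deg),
    and relative scale error."""
    cos_r = np.clip((np.trace(R_gt.T @ R_est) - 1.0) / 2.0, -1.0, 1.0)
    eps_R = np.degrees(np.arccos(cos_r))
    cos_t = np.clip(np.dot(t_est, t_gt) /
                    (np.linalg.norm(t_est) * np.linalg.norm(t_gt)), -1.0, 1.0)
    eps_t_dir = np.degrees(np.arccos(cos_t))
    eps_s = abs(s_est - s_gt) / abs(s_gt)
    return eps_R, eps_t_dir, eps_s
```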
Show Figures
Figure 1. The rotation matrix, translation vector, and scale are R, t, and s, respectively.
Figure 2. The rotation matrix and translation vector of the i-th camera in frame k are R_ki and t_ki; those of the j-th camera in frame k+1 are R_k'j and t_k'j. The rotation matrix, translation vector, and scale between the aligned frames k and k+1 are R_y, t̃, and s.
Figure 3. Algorithm flow chart.
Figure 4. Effect of the number of feature points on the accuracy of rotation, translation, and scale estimation: (a) rotation error (degrees); (b) translation error (degrees); (c) translation error; (d) scale error.
Figures 5–8. Estimation errors of the rotation matrix, translation vector, and scale under random, planar, sideways, and forward motion. In each figure, the columns correspond to added image noise, pitch angle noise, and roll angle noise, and the rows show ε_R, ε_t, ε_t,dir, and ε_s.
Figure 9. Test image pair from the KITTI dataset with feature detection.
29 pages, 28612 KiB  
Article
Synergistic Landscape Design Strategies to Renew Thermal Environment: A Case Study of a Cfa-Climate Urban Community in Central Komatsu City, Japan
by Jing Xiao, Takaya Yuizono and Ruixuan Li
Sustainability 2024, 16(13), 5582; https://doi.org/10.3390/su16135582 - 29 Jun 2024
Cited by 1 | Viewed by 1476
Abstract
An effective community landscape design consistently impacts thermally comfortable outdoor conditions and climate adaptation. Therefore, constructing sustainable communities requires a resilience assessment of existing built environments for optimal design mechanisms, especially the renewal of thermally resilient communities in densely populated cities. However, the current community only involves green space design and lacks synergistic landscape design for renewing the central community. The main contribution of this study is that it reveals a three-level optimization method to validate the Synergistic Landscape Design Strategies (SLDS) (i.e., planting, green building envelope, water body, and urban trees) for renewing urban communities. A typical Japanese community in central Komatsu City was selected to illustrate the simulation-based design strategies. The microclimate model ENVI-met reproduces communities involving 38 case implementations to evaluate the physiologically equivalent temperature (PET) and microclimate condition as a measure of the thermal environments in humid subtropical climates. The simulation results indicated that the single-family buildings and real estate flats were adapted to the summer thermal mitigation strategy of water bodies and green roofs (W). In small-scale and large-scale models, the mean PET was lowered by 1.4–5.0 °C (0.9–2.3 °C), and the cooling effect reduced mean air temperature by 0.4–2.3 °C (0.5–0.8 °C) and improved humidification by 3.7–15.2% (3.7–5.3%). The successful SLDS provides precise alternatives for realizing Sustainable Development Goals (SDGs) in the renewal of urban communities. Full article
Show Figures
Graphical abstract
Figure 1. The methodology for microclimate simulation sets an optimization mechanism (from a to d) in the Synergistic Landscape Design Strategies (SLDS) to renew the thermal environment in urban communities.
Figure 2. Local sample communities: the single-family building community (HC), the real estate flat community (AC), and the mixed cluster community (BC).
Figure 3. (a) Köppen-Geiger climate classification of Cfa in Japan (black square area); (b) surface temperature change in Japan from 2019 to 2021 for August warming; (c) maximum air temperature (Ta) in August from 2019 to 2021; (d) annual mean air temperature (Ta) in Komatsu City from 2019 to 2021.
Figure 4. Japanese community in two building forms (A and H types) and wall material settings in ENVI-met.
Figure 5. (a) ArcGIS analysis of Komatsu City using urban surface categories; (b) vegetation cover distribution; (c) urban heat island (UHI) effect.
Figure 6. Axonometric diagrams for all design cases using the Synergistic Landscape Design Strategies (SLDS) in the three sample communities (HC, AC, and BC areas).
Figure 7. Planting design cases for the small-scale models (HC and AC areas) with the positions of receptors in ENVI-met.
Figure 8. Synergistic landscape design cases for the large-scale model (BC area) in ENVI-met.
Figure 9. Validation of the linear fit between monitored and modeled air temperature (Ta) and relative humidity (RH) in the local sample communities (HC, AC, and BC areas) at small and large scales.
Figure 10. Simulation results of the microclimate variations in the sample communities at small and large scales.
Figure 11. ENVI-met simulation results for the physiologically equivalent temperature (PET) distribution and the mitigation time at a pedestrian height of 1.8 m.
Figure 12. Distribution maps of the PET thermal index at 14:00 simulated with planting design strategies (L1, L2, L3, and L4) in the two small-scale communities.
Figure 13. Distribution maps of the PET thermal index at 14:00 under green building envelope (GBE) design strategies (R, F, and W) renewed in the HC and AC areas based on planting design (L1–4).
Figure 14. Distribution maps of the PET thermal index at 14:00 under the urban tree effect (W-Ga-f) based on the water body and green roof (W) design of the BC area.
14 pages, 3735 KiB  
Article
Learning Effective Geometry Representation from Videos for Self-Supervised Monocular Depth Estimation
by Hailiang Zhao, Yongyi Kong, Chonghao Zhang, Haoji Zhang and Jiansen Zhao
ISPRS Int. J. Geo-Inf. 2024, 13(6), 193; https://doi.org/10.3390/ijgi13060193 - 11 Jun 2024
Viewed by 1493
Abstract
Recent studies on self-supervised monocular depth estimation have achieved promising results, which are mainly based on the joint optimization of depth and pose estimation via high-level photometric loss. However, how to learn the latent and beneficial task-specific geometry representation from videos is still far from being explored. To tackle this issue, we propose two novel schemes to learn more effective representation from monocular videos: (i) an Inter-task Attention Model (IAM) to learn the geometric correlation representation between the depth and pose learning networks to make structure and motion information mutually beneficial; (ii) a Spatial-Temporal Memory Module (STMM) to exploit long-range geometric context representation among consecutive frames both spatially and temporally. Systematic ablation studies are conducted to demonstrate the effectiveness of each component. Evaluations on KITTI show that our method outperforms current state-of-the-art techniques. Full article
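The "high-level photometric loss" mentioned above is typically an SSIM-plus-L1 comparison between the target frame and a source frame warped with the predicted depth and pose. The PyTorch sketch below shows that common formulation (as popularized by Monodepth-style pipelines); it is not the exact loss used in the paper.

```python
import torch
import torch.nn.functional as F

def ssim_term(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM dissimilarity over 3x3 neighbourhoods, in [0, 1]."""
    mu_x, mu_y = F.avg_pool2d(x, 3, 1, 1), F.avg_pool2d(y, 3, 1, 1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return torch.clamp((1 - num / den) / 2, 0, 1)

def photometric_loss(target, warped, alpha=0.85):
    """Photometric loss between the target frame and a source frame warped
    with the predicted depth and pose (both tensors of shape N x C x H x W)."""
    l1 = (target - warped).abs()
    return (alpha * ssim_term(target, warped) + (1 - alpha) * l1).mean()
```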
Show Figures
Figure 1. Comparison of the learning process of the general pipeline (a) and our method (b) for self-supervised monocular depth estimation. Unlike the general pipeline, which learns the depth feature F_D and the pose feature F_P separately using a 2D photometric loss L, a memory mechanism M exploits long-range context from videos for depth feature learning, and an inter-task attention mechanism A leverages depth information to help pose feature learning, which in turn benefits depth feature learning via gradient back-propagation.
Figure 2. Illustration of our network framework (a) and the architecture of the IAM (b) and the STMM (c). The network takes three consecutive frames as input to learn long-range geometric correlation representation by introducing the STMM after the encoder. The pose network is split into two branches to predict rotation R and translation t separately; the IAM is applied after the second convolution layer of both branches.
Figure 3. Qualitative results on the KITTI test set. Our method produces more accurate depth maps for low-texture regions, moving vehicles, delicate structures, and object boundaries.
Figure 4. Visual results on the Cityscapes dataset using models trained on KITTI without any refinement. Compared with the methods in [2], our method generates higher-quality depth maps and captures moving and slim objects better (differences highlighted with dashed circles).
Figure 5. Visualization of learned attention maps in the IAM, which places distinct emphasis on different regions for the two branches to improve their estimation.
Figure 6. Visual comparison of the visual odometry trajectories. Full trajectories are plotted using the Evo visualization tool [51].
16 pages, 7679 KiB  
Article
A 3D Parameterized BIM-Modeling Method for Complex Engineering Structures in Building Construction Projects
by Lijun Yang, Xuexiang Gao, Song Chen, Qianyao Li and Shuo Bai
Buildings 2024, 14(6), 1752; https://doi.org/10.3390/buildings14061752 - 11 Jun 2024
Cited by 1 | Viewed by 1785
Abstract
The structural components of large-scale public construction projects are more complex than those of ordinary residential buildings, with irregular and diverse components, as well as a large number of repetitive structural elements, which increase the difficulty of BIM-modeling operations. Additionally, there is a significant amount of inherent parameter information in the construction process, which puts forward higher requirements for the application and management capabilities of BIM technology. However, the current BIM software still has deficiencies in the parameterization of complex and irregular structural components, fine modeling, and project management information. To address these issues, this paper takes Grasshopper as the core parametric tool and Revit as the carrier of component attribute information. It investigates the parametric modeling logic of Grasshopper and combines the concepts of parameterization, modularization, standardization, and engineering practicality to create a series of parametric programs for complex structural components in building projects. This approach mainly addresses intricate challenges pertaining to the parametric structural shapes (including batch processing) and parametric structural attributes (including the batch processing of diverse attribute parameters), thereby ensuring the efficiency in BIM modeling throughout the design and construction phases of complex building projects. Full article
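As a language-agnostic illustration of the parametric-modeling idea (shown here in plain Python rather than inside Grasshopper), the sketch below derives the full cross-section of a simple cantilever retaining wall from a handful of driving parameters and sweeps one of them in a batch, mirroring the batch-processing workflow described above. The parameter names and the simplified section shape are hypothetical, not the paper's component definitions.

```python
# Minimal sketch of parametric section generation: a few input parameters drive
# the whole geometry, so variants can be regenerated in batches.
from dataclasses import dataclass

@dataclass
class RetainingWallParams:
    footing_width: float = 3.0    # m
    footing_height: float = 0.6   # m
    stem_height: float = 3.5      # m
    stem_top_width: float = 0.3   # m
    stem_bottom_width: float = 0.5
    toe_length: float = 1.0       # distance from footing edge to stem front face

def wall_section(p: RetainingWallParams):
    """Return the closed polygon (list of (x, y) vertices) of the wall cross-section."""
    x0 = p.toe_length
    x1 = x0 + p.stem_bottom_width
    taper = p.stem_bottom_width - p.stem_top_width
    top_y = p.footing_height + p.stem_height
    return [
        (0.0, 0.0),                          # footing toe, bottom
        (p.footing_width, 0.0),              # footing heel, bottom
        (p.footing_width, p.footing_height), # footing heel, top
        (x1, p.footing_height),              # stem back face, base
        (x1 - taper, top_y),                 # stem back face, top (tapered)
        (x0, top_y),                         # stem front face, top
        (x0, p.footing_height),              # stem front face, base
        (0.0, p.footing_height),             # footing toe, top
    ]

# Batch processing: regenerate many wall variants by sweeping one parameter.
for h in (3.0, 3.5, 4.0):
    print(h, wall_section(RetainingWallParams(stem_height=h))[:3])
```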
Figure 1: BIM parameterized digital graphic representation.
Figure 2: Vector representation in Grasshopper.
Figure 3: Differences in different types of data structures and operations.
Figure 4: The three matching modes: (a) Matching with Longest List; (b) Matching with Shortest List; and (c) Cross-List Data Matching.
Figure 5: The process of forming points, lines, and surfaces.
Figure 6: Diversified movement methods of components: (a) object translation along direction; and (b) object rotation around axis.
Figure 7: Type and section of retaining wall: (a) Type a; (b) Type b; (c) Type c; (d) Type d; (e) Type e; (f) Type f.
Figure 8: The parameterization process of GH for retaining walls.
Figure 9: Implementation process of structural positioning.
Figure 10: Parameter settings and model parameterization creation.
Figure 11: Parameterized variable display of staircase structure.
Figure 12: Parameterized node connection display of staircase structure.
Figure 13: Draw parameterized staircase structure based on projection lines.
19 pages, 13136 KiB  
Article
DSOMF: A Dynamic Environment Simultaneous Localization and Mapping Technique Based on Machine Learning
by Shengzhe Yue, Zhengjie Wang and Xiaoning Zhang
Sensors 2024, 24(10), 3063; https://doi.org/10.3390/s24103063 - 11 May 2024
Viewed by 1255
Abstract
To address the challenges of reduced localization accuracy and incomplete map construction exhibited by classical semantic simultaneous localization and mapping (SLAM) algorithms in dynamic environments, this study introduces a dynamic scene SLAM technique that builds upon direct sparse odometry (DSO) and incorporates instance segmentation and video completion algorithms. While prioritizing the algorithm’s real-time performance, we leverage the rapid matching capabilities of DSO to link identical dynamic objects in consecutive frames. This association is achieved by merging semantic and geometric data, thereby enhancing the matching accuracy during image tracking through the inclusion of semantic probability. Furthermore, we incorporate a loop closure module based on video inpainting algorithms into our mapping thread. This allows our algorithm to rely on the completed static background for loop closure detection, further enhancing its localization accuracy. The efficacy of this approach is validated using the TUM and KITTI public datasets and unmanned platform experiments. Experimental results show that, in various dynamic scenes, our method achieves an improvement exceeding 85% in terms of localization accuracy compared with the DSO system. Full article
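A core step in such dynamic-scene pipelines is to keep only static-background points for tracking once potentially dynamic objects have been segmented. The sketch below illustrates that filtering step in NumPy with a toy mask; the dilation margin and mask layout are illustrative assumptions rather than the paper's actual settings.

```python
# Minimal sketch: remove candidate feature points that fall inside (a slightly
# enlarged version of) an instance-segmentation mask of potentially dynamic
# objects, so that only static-background points feed the tracker.
import numpy as np

def filter_static_points(points_xy, dynamic_mask, margin=5):
    """points_xy: (N, 2) integer pixel coords (x, y); dynamic_mask: (H, W) bool."""
    h, w = dynamic_mask.shape
    keep = []
    for x, y in points_xy:
        # A point counts as dynamic if any pixel in its neighbourhood is masked.
        y0, y1 = max(0, y - margin), min(h, y + margin + 1)
        x0, x1 = max(0, x - margin), min(w, x + margin + 1)
        keep.append(not dynamic_mask[y0:y1, x0:x1].any())
    return points_xy[np.array(keep)]

# Toy example: a 100x100 frame with one "dynamic" blob and random candidate points.
mask = np.zeros((100, 100), dtype=bool)
mask[40:60, 40:60] = True                       # e.g. a segmented moving object
pts = np.random.randint(0, 100, size=(50, 2))   # candidate (x, y) points
print(len(filter_static_points(pts, mask)), "static points kept of", len(pts))
```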
Figure 1: Algorithm framework.
Figure 2: Image processing workflow (the blue box in the diagram represents the tracking thread, while the orange box represents the mapping thread).
Figure 3: Dynamic object instance segmentation results. (a) Original image; (b) mask image.
Figure 4: Optical flow tracking of regional centroids.
Figure 5: Delineation of dynamic and static regions.
Figure 6: Comparative analysis of mapping outcomes before and after dynamic object elimination.
Figure 7: FGVC optical flow completion process.
Figure 8: Comparison of absolute trajectory error for the camera on the TUM dataset.
Figure 9: Video completion process in the KITTI-04 dataset.
Figure 10: Map construction effect of DSOMF in the KITTI-04 dataset.
Figure 11: Unmanned flight platform SLAM algorithm test system.
Figure 12: Top view of the fixed-wing aircraft.
Figure 13: Loop closure detection experiment. (a) The loop closure detection module runs; (b) the loop closure detection module does not run.
Figure 14: SLAM algorithm test system for the unmanned ground platform.
Figure 15: Comparison of outdoor dynamic environment trajectories (in the real-life scenario, the outlined boxes represent the trajectories of dynamic objects; route one denotes the path of vehicles, while route two signifies pedestrian pathways).
Figure 16: Environment image. (a) Pedestrian environment image; (b) electric vehicle environment image.
15 pages, 2894 KiB  
Article
Phase Error Reduction for a Structured-Light 3D System Based on a Texture-Modulated Reprojection Method
by Chenbo Shi, Zheng Qin, Xiaowei Hu, Changsheng Zhu, Yuanzheng Mo, Zelong Li, Shaojia Yan, Yue Yu, Xiangteng Zang and Chun Zhang
Sensors 2024, 24(7), 2075; https://doi.org/10.3390/s24072075 - 24 Mar 2024
Viewed by 1424
Abstract
Fringe projection profilometry (FPP), with benefits such as high precision and a large depth of field, is a popular 3D optical measurement method widely used in precision reconstruction scenarios. However, the pixel brightness at reflective edges does not satisfy the conditions of the ideal pixel-wise phase-shifting model due to the influence of scene texture and system defocus, resulting in severe phase errors. To address this problem, we theoretically analyze the non-pixel-wise phase propagation model for texture edges and propose a reprojection strategy based on scene texture modulation. The strategy first obtains the reprojection weight mask by projecting typical FPP patterns and calculating the scene texture reflection ratio, then reprojects stripe patterns modulated by the weight mask to eliminate texture edge effects, and finally fuses coarse and refined phase maps to generate an accurate phase map. We validated the proposed method on various texture scenes, including a smooth plane, depth surface, and curved surface. Experimental results show that the root mean square error (RMSE) of the phase at the texture edge decreased by 53.32%, proving the effectiveness of the reprojection strategy in eliminating depth errors at texture edges. Full article
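For context, the pixel-wise phase-shifting model that the paper identifies as breaking down at texture edges can be written in a few lines: with N equally shifted fringe images I_n = A + B·cos(φ + 2πn/N), the wrapped phase follows from sine- and cosine-weighted sums of the images. The NumPy sketch below reproduces that textbook relation on synthetic fringes; it is background for the standard model, not the proposed reprojection strategy.

```python
# Minimal sketch of N-step phase-shifting profilometry:
# I_n = A + B * cos(phi + 2*pi*n/N); the wrapped phase follows from the
# sine/cosine-weighted sums (the sign depends on the shift convention).
import numpy as np

def wrapped_phase(images):
    """images: (N, H, W) array of fringe images with equal phase shifts 2*pi/N."""
    n = images.shape[0]
    shifts = 2 * np.pi * np.arange(n) / n
    num = np.tensordot(np.sin(shifts), images, axes=1)
    den = np.tensordot(np.cos(shifts), images, axes=1)
    return -np.arctan2(num, den)  # wrapped phase in (-pi, pi]

# Synthetic check: recover a known phase ramp from 4-step fringes.
h, w, n_steps = 64, 64, 4
phi_true = np.linspace(-np.pi + 0.1, np.pi - 0.1, w)[None, :].repeat(h, axis=0)
imgs = np.stack([128 + 100 * np.cos(phi_true + 2 * np.pi * k / n_steps)
                 for k in range(n_steps)])
print(np.max(np.abs(wrapped_phase(imgs) - phi_true)))  # error near machine precision
```

On ideal, noise-free fringes this per-pixel relation is exact, which is why the depth errors the paper targets only appear where texture edges and defocus mix intensities across neighboring pixels.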
Figure 1: Measurement effect of the traditional FPP method. (a) Smooth texture plane; (b) traditional FPP method measurement results.
Figure 2: The process of capturing the intensity change in the stripe image by the camera.
Figure 3: The relationship between camera defocus and phase in the scene. (a) The camera captures scene intensity; (b) the two-dimensional Gaussian distribution; (c) the phase value of (a).
Figure 4: Computational framework of our proposed method.
Figure 5: Modulation mask. (a) Maximum light modulation pattern; (b) scene image after adding the mask; (c) gradient absolute value image; (d) absolute gradient value and phase comparison of line drawing positions.
Figure 6: Phase error analysis of the simulated modulated scene. (a) Simulation of original and modulated measurement scene pictures; (b) original phase error and modulated phase error; (c) comparison of the phase error.
Figure 7: Structured-light 3D reconstruction system platform.
Figure 8: Actual measurement objects. (a) Smooth scene with only texture edges; (b) scenes affected by both depth edges and texture edges; (c) smooth surfaces affected by only texture edges; (d) scenes with different depths of field.
Figure 9: Comparison of measurement results for different depth differences. (a) Original scene image; (b) original depth map; (c) comparison of local ROI regions; (d) modulated scene image; (e) fusion depth map; (f) comparison of the original depth curve (red), fusion depth curve (blue), and ground truth (black).
Figure 10: Comparison of measurement scenes only modulated by texture. (a) Original measurement scene; (b) original depth map; (c) fusion depth map; (d) depth comparison between position A and position B; (e) depth comparison between position C and position D; (f) depth comparison between position E and position F.
Figure 11: Comparison of measured effects on scenes modulated by depth and texture. (a) Original scene image; (b) original depth map; (c) comparison of local ROI regions; (d) modulated scene image; (e) fusion depth map; (f) comparison of the original depth curve, modulated depth curve, and true depth curve.
Figure 12: Comparison of measurements of different texture widths. (a) Original measurement scene; (b) modulated scene image; (c) original depth map; (d) fusion depth map; (e) ROI of the original measurement scene; (f–l) original depth (red), fusion depth (blue), and actual value (black) comparison at positions A–G in (e).
Figure 13: Measurement experiments under different depths of field. (a) The original scene image; (b) the original depth map; (c) the original depth map compared with the depth curve of position A in the fusion depth map; (d) the modulated scene image; (e) the fusion depth map; (f) the original depth map compared with the depth curve of position B in the fused depth map.
Figure 14: Comparison of the modulation effects of different light intensities. (a) The reconstruction effect of the traditional method; (b–d) the reconstruction results when the modulated light intensity is 220, 90, and 50.
21 pages, 25891 KiB  
Article
An Improved TransMVSNet Algorithm for Three-Dimensional Reconstruction in the Unmanned Aerial Vehicle Remote Sensing Domain
by Jiawei Teng, Haijiang Sun, Peixun Liu and Shan Jiang
Sensors 2024, 24(7), 2064; https://doi.org/10.3390/s24072064 - 23 Mar 2024
Viewed by 1320
Abstract
It is important to achieve the 3D reconstruction of UAV remote sensing images in deep learning-based multi-view stereo (MVS) vision. The lack of obvious texture features and detailed edges in UAV remote sensing images leads to inaccurate feature point matching or depth estimation. To address this problem, this study improves the TransMVSNet algorithm in the field of 3D reconstruction by optimizing its feature extraction network and its cost volume depth prediction network. The improvement is mainly achieved by extracting features with the Asymptotic Feature Pyramid Network (AFPN) and assigning weights to different levels of features through the ASFF module to increase the importance of key levels, and also by using a UNet-structured network combined with an attention mechanism to predict the depth information, which also extracts the key area information. It aims to improve the performance and accuracy of the TransMVSNet algorithm’s 3D reconstruction of UAV remote sensing images. In this work, we have performed comparative experiments and quantitative evaluation with other algorithms on the DTU dataset as well as on a large UAV remote sensing image dataset. After a large number of experimental studies, it is shown that our improved TransMVSNet algorithm has better performance and robustness, providing a valuable reference for research and application in the field of 3D reconstruction of UAV remote sensing images. Full article
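The weight-assignment idea behind ASFF-style fusion can be illustrated compactly: every pyramid level is resampled to a common resolution, a small convolution predicts one weight map per level, and a softmax over levels mixes them pixel by pixel. The PyTorch sketch below is a generic module in that spirit, with hypothetical channel counts; it is not the exact ASFF block used in the improved TransMVSNet.

```python
# Generic sketch of adaptively weighted fusion of multi-level features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusion(nn.Module):
    def __init__(self, channels=32, num_levels=3):
        super().__init__()
        # One 1x1 conv per level predicts that level's (unnormalized) weight map.
        self.weight_preds = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_levels)
        )

    def forward(self, feats):
        """feats: list of (B, C, H_i, W_i) tensors; output matches feats[0] size."""
        size = feats[0].shape[-2:]
        resized = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                   for f in feats]
        logits = torch.cat([p(f) for p, f in zip(self.weight_preds, resized)], dim=1)
        weights = torch.softmax(logits, dim=1)   # (B, L, H, W), sums to 1 over levels
        stacked = torch.stack(resized, dim=1)    # (B, L, C, H, W)
        return (weights.unsqueeze(2) * stacked).sum(dim=1)

# Toy usage with three pyramid levels of decreasing resolution.
fuse = AdaptiveFusion(channels=32)
levels = [torch.rand(1, 32, 64, 80), torch.rand(1, 32, 32, 40), torch.rand(1, 32, 16, 20)]
print(fuse(levels).shape)  # torch.Size([1, 32, 64, 80])
```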
Figure 1: The architecture of the proposed Asymptotic Feature Pyramid Network (AFPN). AFPN is initiated by fusing two neighboring low-level features and progressively incorporating high-level features into the fusion process.
Figure 2: Cost volume regularization network: (a) the overall network, (b) the UBA layer, and (c) the CCA module in the UBA layer.
Figure 3: Fully connected (FC) network structure.
Figure 4: Comparison of depth prediction results for Scan1, where (a–d) are the results of our algorithm and (e–h) are the results of the original algorithm.
Figure 5: Comparison of depth prediction results for Scan4, where (a–d) are the results of our algorithm and (e–h) are the results of the original algorithm.
Figure 6: Comparison of depth prediction results for Scan9, where (a–d) are the results of our algorithm and (e–h) are the results of the original algorithm.
Figure 7: Comparison of depth prediction results for Scan10, where (a–d) are the results of our algorithm and (e–h) are the results of the original algorithm.
Figure 8: (a–h) are overhead drone images of buildings from the drone mapping dataset Pix4D. Scene 1 is an unfinished building and Scene 2 is a residential home in Chicago, IL, USA.
Figure 9: Depth map of the first scene in Figure 8: (a–d) are depth prediction images of our improved algorithm; (e–h) are depth prediction images of the original algorithm.
Figure 10: Depth map of the second scene in Figure 8: (a–d) are depth prediction images of our improved algorithm; (e–h) are depth prediction images of the original algorithm.
Figure 11: The 3D reconstruction results of the improved algorithm: (a–d) for Scene 1 and (e–h) for Scene 2 in Figure 8.
Figure 12: Self-constructed wrap-around robotic arm overhead shooting scene dataset: (a–d) are Scene 1, which mainly includes schools and stadiums; (e–h) are Scene 2, which mainly includes residential areas.
Figure 13: (a–p) contain the comparison of the depth maps of the two self-built scenario datasets: Scene 1 is the result of Figure 13, and Scene 2 is the result of Figure 14.
Figure 14: (a–h) are the three-dimensional reconstruction modeling diagram of the two scenes in Figure 12.
Figure 15: (a–h) show an iconic site in Lausanne, showcasing the city’s beauty and history. Lausanne is the capital of the canton of Vaud, Switzerland; the images are from the unmanned remote sensing dataset Pix4D.
Figure 16: Depth prediction image of the scene in Figure 15, comparing our improved algorithm with TransMVSNet, where (a–d) are the results of our algorithm and (e–h) are the results of TransMVSNet.
Figure 17: (a–d) are the three-dimensional reconstructed model view of the scene in Figure 15.
Figure 18: (a–e) show a coastal mountain village in the capital of the canton of Vaud, district of Lausanne, Switzerland, photographed on Pix4D.
Figure 19: Depth map of the scene in Figure 18: (a–d) are depth prediction images of our improved algorithm; (e–h) are depth prediction images of the original algorithm.
Figure 20: (a–d) are the three-dimensional reconstruction model view of the scene in Figure 18.
Figure 21: Comparison with state-of-the-art deep learning-based MVS methods on the DTU dataset (lower is better).
Figure 22: Quantitative experiments: (a–d) are the first to quantify the number of scans in three planes, (e–h) are the second to quantify the number of scans in three planes, and (i–l) are the third to quantify the number of scans in three planes.
31 pages, 8391 KiB  
Article
Model-Based 3D Gaze Estimation Using a TOF Camera
by Kuanxin Shen, Yingshun Li, Zhannan Guo, Jintao Gao and Yingjian Wu
Sensors 2024, 24(4), 1070; https://doi.org/10.3390/s24041070 - 6 Feb 2024
Viewed by 2631
Abstract
Among the numerous gaze-estimation methods currently available, appearance-based methods predominantly use RGB images as input and employ convolutional neural networks (CNNs) to detect facial images to regressively obtain gaze angles or gaze points. Model-based methods require high-resolution images to obtain a clear eyeball geometric model. These methods face significant challenges in outdoor environments and practical application scenarios. This paper proposes a model-based gaze-estimation algorithm using a low-resolution 3D TOF camera. This study uses infrared images instead of RGB images as input to overcome the impact of varying illumination intensity in the environment on gaze estimation. We utilized a trained YOLOv8 neural network model to detect eye landmarks in captured facial images. Combined with the depth map from a time-of-flight (TOF) camera, we calculated the 3D coordinates of the canthus points of a single eye of the subject. Based on this, we fitted a 3D geometric model of the eyeball to determine the subject’s gaze angle. Experimental validation showed that our method achieved a root mean square error of 6.03° and 4.83° in the horizontal and vertical directions, respectively, for the detection of the subject’s gaze angle. We also tested the proposed method in a real car driving environment, achieving stable driver gaze detection at various locations inside the car, such as the dashboard, driver mirror, and the in-vehicle screen. Full article
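Fitting a spherical eyeball model to a few annotated 3D scleral points, as described above, reduces to a small linear least-squares problem: writing |p − c|² = r² for each point gives a linear system in the center c and the combined term r² − |c|². The NumPy sketch below demonstrates that fit on synthetic points; the point count, noise level, and millimeter values are illustrative assumptions, not the paper's calibration data.

```python
# Minimal sketch: fit a sphere (center c, radius r) to 3D points p_i by rewriting
# |p - c|^2 = r^2 as the linear equation  2 p·c + (r^2 - |c|^2) = |p|^2.
import numpy as np

def fit_sphere(points):
    """points: (N, 3) array, N >= 4. Returns (center (3,), radius)."""
    A = np.hstack([2 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, k = sol[:3], sol[3]        # k = r^2 - |c|^2
    radius = np.sqrt(k + center @ center)
    return center, radius

# Synthetic check: 8 noisy points on a sphere of radius 12 mm centered at (5, -3, 40).
rng = np.random.default_rng(0)
true_c, true_r = np.array([5.0, -3.0, 40.0]), 12.0
dirs = rng.normal(size=(8, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
pts = true_c + true_r * dirs + rng.normal(scale=0.05, size=(8, 3))
print(fit_sphere(pts))  # center close to (5, -3, 40), radius close to 12
```

Because the sphere equation becomes linear in the unknowns after this rewriting, a handful of scleral points is enough to recover the eyeball center without any iterative optimization.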
Figure 1: The overall process of the proposed model-based 3D gaze-estimation method using a TOF camera. The green arrow represents the subject’s gaze direction.
Figure 2: The partial effectiveness of data augmentation.
Figure 3: The eye region and landmark detection model trained on the IRGD dataset using YOLOv8 shows the detection effect on the subject’s gaze image (a). The landmark detection model outputs 7 target points for a single-eye image of the subject (b): 1—Left eye corner point; 2—First upper eyelid point; 3—Second upper eyelid point; 4—Right eye corner point; 5—First lower eyelid point; 6—Second lower eyelid point; 7—Pupil point.
Figure 4: The subject maintained a head pose angle of 0° in both the horizontal and vertical directions and performed a series of coherent lizard movements. The green arrow indicates the ground-truth gaze direction, while the red arrow represents the final gaze direction obtained using the eyeball center calculation method proposed in [30]. As the subject’s gaze angle gradually increased, the deviation between the gaze angle calculated by this eyeball center localization method and the ground-truth gaze angle began to increase. Table 1 shows the results of our calculations.
Figure 5: Eight marked points are manually annotated on the image of the subject’s single eye. These points are randomly distributed on the sclera of the eye, not the cornea. We use these eight 3D coordinate points to fit the eyeball model and solve for the 3D coordinates of the eyeball center and the radius of the eyeball.
Figure 6: Eye detail images taken by the TOF camera at a distance of 200 mm–500 mm from the subject. The experiment is divided into two scenarios: the subject not wearing myopia glasses (top) and wearing glasses (bottom). The occlusion of glasses reduces some of the clarity and contrast of the subject’s eyes, but it is much less than the impact of a longer distance. When the distance between the subject and the TOF camera exceeds 300 mm, the only observable details in the eye area are the corners of the eyes and the pupil points.
Figure 7: Creating a standard plane with multiple gaze points using a level’s laser line (a) and fixing a TOF camera on the plane (b).
Figure 8: Sample pictures of the IRGD dataset proposed in this paper. We recorded gaze data at five different distances from the participant to the TOF camera, ranging from 200 mm to 600 mm. The TOF camera simultaneously collected IR images and depth images of the participant gazing at the gaze points on the standard plane. All participants performed natural eye movements and coherent head movements.
Figure 9: The absolute values of the average head pose angles of the participants at 35 gaze points in the IRGD dataset. The maximum absolute angle of the participants’ head pose in the horizontal direction (yaw) is approximately 50°, while in the vertical direction (pitch), it is approximately 30°.
Figure 10: Independent modeling and solution of eyeball center coordinates in the horizontal (a) and vertical (b) gaze directions of subjects.
Figure 11: Variation trends of the aspect ratio of eye appearance with vertical gaze angle in male (a) and female (b) participants. In male participants, the aspect ratio of eye appearance is less than 0.3 when the eyeball is looking down, while in female participants, the aspect ratio of eye appearance is less than 0.4 when the eyeball is looking down.
Figure 12: The drawback of the inability to extract pupil point depth values from the depth image of the TOF camera. For gaze images at certain special angles, the pupil point can be observed on its IR image (a), but due to the absorption of infrared light by the pupil, a ‘black hole’ appears at the position of the pupil point on the corresponding depth image (b) of the IR image.
Figure 13: Schematic diagram of the calibration process for individual-specific eyeball parameters of the subject.
Figure 14: Calibration results of eyeball parameters for three subjects. We obtained the optimal eyeball structure parameters (R_1, d_1) and (R_2, d_2) for 3 subjects through 10 calibrations, each involving gazing at 20 gaze points. At the same time, we calculated the mean absolute deviation between the gaze angles in the horizontal direction (blue) and vertical direction (orange) computed from this set of parameters and the ground-truth angles.
Figure 15: Experiment results on calculating the average pupil depth information and corresponding ground-truth values in horizontal and vertical gaze directions for the male group (a) and female group (b).
Figure 16: Results of the subject’s gaze detection. Column (a) presents the original gaze images of the subject, column (b) shows the results of eye landmark detection based on YOLOv8, and column (c) visualizes the subject’s gaze direction. The green arrow indicates the gaze direction detected by our model.
Figure 17: Gaze angle detection results of male and female subject groups using the gaze-estimation method proposed in this study. Specifically, (a) represents the horizontal gaze results of the male group, (b) shows the vertical gaze results of the male group, (c) illustrates the horizontal gaze results of the female group, and (d) presents the vertical gaze results of the female subjects.
Figure 18: Comparative accuracy results of our proposed gaze-estimation model with other state-of-the-art models on infrared gaze test images.
Figure 19: Detection results of the driver’s partial gaze points in the interior of a Toyota business SUV. Green arrows indicate the driver’s gaze direction detected by our gaze-estimation model.
Figure 20: Mean absolute error between the detected driver’s gaze angles and ground-truth angles at various gaze points inside the car.
Figure 21: Detection effect of existing state-of-the-art gaze-estimation methods on the IRGD dataset proposed in this study, with arrows and lines indicating the predicted gaze direction of the subject by each model.
22 pages, 7968 KiB  
Article
Ship-Fire Net: An Improved YOLOv8 Algorithm for Ship Fire Detection
by Ziyang Zhang, Lingye Tan and Robert Lee Kong Tiong
Sensors 2024, 24(3), 727; https://doi.org/10.3390/s24030727 - 23 Jan 2024
Cited by 13 | Viewed by 3201
Abstract
Ship fires may result in significant damage to a ship’s structure and large economic losses. Hence, the prompt identification of fires is essential in order to enable prompt reactions and effective mitigation strategies. However, conventional detection systems exhibit limited efficacy and accuracy in detecting targets, which has been mostly attributed to limitations imposed by distance constraints and the motion of ships. Although the development of deep learning algorithms provides a potential solution, the computational complexity of ship fire detection algorithms poses significant challenges. To solve this, this paper proposes a lightweight ship fire detection algorithm based on YOLOv8n. Initially, a dataset, including more than 4000 unduplicated images and their labels, is established before training. In order to ensure the performance of the algorithm, both fires inside ship rooms and fires on board are considered. Then, after tests, YOLOv8n is selected as the model with the best performance and fastest speed from among several advanced object detection algorithms. GhostNetV2-C2F is then inserted into the backbone of the algorithm for long-range attention with inexpensive operations. In addition, spatial and channel reconstruction convolution (SCConv) is used to reduce redundant features with significantly lower complexity and computational costs for real-time ship fire detection. For the neck part, omni-dimensional dynamic convolution is used for the multi-dimensional attention mechanism, which also lowers the number of parameters. After these improvements, a lighter and more accurate YOLOv8n algorithm, called Ship-Fire Net, was proposed. The proposed method exceeds 0.93 in both precision and recall for fire and smoke detection on ships. In addition, the mAP@0.5 reaches about 0.9. Despite the improvement in accuracy, Ship-Fire Net also has fewer parameters and lower FLOPs compared to the original, which accelerates its detection speed. The FPS of Ship-Fire Net also reaches 286, which is helpful for real-time ship fire monitoring. Full article
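The "inexpensive operation" idea behind Ghost-style blocks is to produce part of the output channels with an ordinary convolution and derive the rest with cheap depthwise convolutions. The PyTorch sketch below shows a generic block of that kind with assumed channel counts and kernel sizes; it is a simplified stand-in, not the GhostNetV2-C2F module used in Ship-Fire Net.

```python
# Generic sketch of a Ghost-style convolution block: a primary conv produces
# half of the output channels, a cheap depthwise conv produces the other half.
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=1, cheap_kernel=3):
        super().__init__()
        primary_ch = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary_ch, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(primary_ch), nn.ReLU(inplace=True),
        )
        # Depthwise ("cheap") convolution over the primary features.
        self.cheap = nn.Sequential(
            nn.Conv2d(primary_ch, out_ch - primary_ch, cheap_kernel,
                      padding=cheap_kernel // 2, groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

# Toy usage: 16 -> 32 channels on a 64x64 feature map.
block = GhostConv(16, 32)
print(block(torch.rand(1, 16, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```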
Figure 1: Structure of the YOLOv8n network.
Figure 2: The architecture of the C2F-GhostNetV2 block.
Figure 3: The architecture of the GhostNetV2 bottleneck and DFC attention.
Figure 4: The architecture of SCConv integrated with a SRU and a CRU.
Figure 5: The architecture of the Spatial Reconstruction Unit (SRU).
Figure 6: The architecture of the channel reconstruction unit (CRU).
Figure 7: Schematic of an omni-dimensional dynamic convolution.
Figure 8: The architecture of the proposed model (Ship-Fire Net).
Figure 9: The pre-process before labeling using Visual Similarity Duplicate Image Finder.
Figure 10: Example of the ship fire and smoke datasets (outside).
Figure 11: Example of the ship fire and smoke datasets (inside).
Figure 12: Visualization results of the analysis of the dataset. (a) Distribution of object centroid locations; (b) distribution of object sizes.
Figure 13: Precision–epoch and recall–epoch curve.
Figure 14: Results of Ship-Fire Net and YOLOv8n for outside images.
Figure 15: Results of Ship-Fire Net and YOLOv8n for inside images.
18 pages, 4863 KiB  
Article
Research on Pedestrian Crossing Decision Models and Predictions Based on Machine Learning
by Jun Cai, Mengjia Wang and Yishuang Wu
Sensors 2024, 24(1), 258; https://doi.org/10.3390/s24010258 - 1 Jan 2024
Cited by 4 | Viewed by 3492
Abstract
Systematically and comprehensively enhancing road traffic safety using artificial intelligence (AI) is of paramount importance, and it is gradually becoming a crucial framework in smart cities. Within this context of heightened attention, we propose to utilize machine learning (ML) to optimize and ameliorate pedestrian crossing predictions in intelligent transportation systems, where the crossing process is vital to pedestrian crossing behavior. Compared with traditional analytical models, the application of OpenCV image recognition and machine learning methods can analyze the mechanisms of pedestrian crossing behaviors with greater accuracy, thereby more precisely judging and simulating pedestrian violations in crossing. Authentic pedestrian crossing behavior data were extracted from signalized intersection scenarios in Chinese cities, and several machine learning models, including decision trees, multilayer perceptrons, Bayesian algorithms, and support vector machines, were trained and tested. In comparing the various models, the results indicate that the support vector machine (SVM) model exhibited optimal accuracy in predicting pedestrian crossing probabilities and speeds, and it can be applied in pedestrian crossing prediction and traffic simulation systems in intelligent transportation. Full article
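The modelling step itself, training a probabilistic SVM on tabular crossing features and querying the crossing probability for a new situation, can be sketched with scikit-learn as below. The synthetic features (vehicle gap, vehicle speed, age group, waiting time) and labels are illustrative assumptions, not the variables or data collected in this study.

```python
# Generic sketch: an SVM with probability outputs predicting whether a pedestrian
# crosses, given simple tabular features; all data here are synthetic stand-ins.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 500
gap = rng.uniform(5, 80, n)          # distance to approaching vehicle (m)
speed = rng.uniform(10, 60, n)       # vehicle speed (km/h)
age_group = rng.integers(0, 3, n)    # 0 = child, 1 = middle-aged, 2 = elderly
wait = rng.uniform(0, 60, n)         # waiting time at the curb (s)
X = np.column_stack([gap, speed, age_group, wait])

# Synthetic label: crossing is more likely with large gaps, slow traffic, long waits.
logit = 0.08 * gap - 0.06 * speed + 0.03 * wait - 0.5 * (age_group == 2)
y = (logit + rng.normal(0, 1, n) > 1.0).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
model.fit(X, y)
print(model.predict_proba([[30.0, 40.0, 2, 15.0]]))  # [P(not cross), P(cross)]
```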
Figure 1: Camera angles at four data collection sites: (a) Shandong Road–Songjiang Road; (b) Hongyun Road–Zhelin Street; (c) Zhangqian Road–Hongjin Road; and (d) Huadong Road–Qianshan Road.
Figure 2: Installation process of cameras for data collection.
Figure 3: Image recognition interface.
Figure 4: Vehicle speed and distance statistics. (a) Statistics of the elderly; (b) statistics of middle-aged people; (c) statistics of children.
Figure 5: Pedestrian crossing prediction methods and procedures.
Figure 6: Structure diagram of decision tree.
Figure 7: The structure of multi-layer perceptron.
Figure 8: ROC curves for each machine learning model. (a) Decision tree; (b) SVM; (c) MLP; and (d) Naïve Bayes.
Figure 9: SHAP analysis conducted on the crossing probability prediction model based on the SVM.
Figure 10: Probability model of pedestrians’ crossing behaviors. (a) Crossing probability model for the elderly; (b) crossing probability model for middle-aged adult pedestrians; (c) crossing probability model for children.
Figure 11: SHAP analysis based on the support vector regression (SVR) model.
Figure 12: Crossing speed model of the pedestrians. (a) Crossing speeds of elderly individuals; (b) crossing speeds of middle-aged individuals; (c) crossing speeds of children.
19 pages, 5724 KiB  
Article
Image-Enhanced U-Net: Optimizing Defect Detection in Window Frames for Construction Quality Inspection
by Jorge Vasquez, Tomotake Furuhata and Kenji Shimada
Buildings 2024, 14(1), 3; https://doi.org/10.3390/buildings14010003 - 19 Dec 2023
Cited by 3 | Viewed by 2052
Abstract
Ensuring the structural integrity of window frames and detecting subtle defects, such as dents and scratches, is crucial for maintaining product quality. Traditional machine vision systems face challenges in defect identification, especially with reflective materials and varied environments. Modern machine and deep learning (DL) systems hold promise for post-installation inspections but face limitations due to data scarcity and environmental variability. Our study introduces an innovative approach to enhance DL-based defect detection, even with limited data. We present a comprehensive window frame defect detection framework incorporating optimized image enhancement, data augmentation, and a core U-Net model. We constructed five datasets using cell phones and the Spot Robot for autonomous inspection, evaluating our approach across various scenarios and lighting conditions in real-world window frame inspections. Our results demonstrate significant performance improvements over the standard U-Net model, with a notable 7.43% increase in the F1 score and 15.1% in IoU. Our approach enhances defect detection capabilities, even in challenging real-world conditions. To enhance the generalizability of this study, it would be advantageous to apply its methodology across a broader range of diverse construction sites. Full article
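One of the image-enhancement steps named above, CLAHE, is typically applied to the lightness channel only so that local contrast around subtle defects increases while color is preserved. The OpenCV sketch below shows that generic preprocessing step; the clip limit and tile size are common defaults, not the values tuned in the paper.

```python
# Generic sketch of CLAHE-based contrast enhancement on the L channel of a BGR image.
import cv2
import numpy as np

def enhance_contrast(bgr, clip_limit=2.0, tile_grid=(8, 8)):
    """Equalize local contrast on the lightness channel of an 8-bit BGR image."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    return cv2.cvtColor(cv2.merge([clahe.apply(l), a, b]), cv2.COLOR_LAB2BGR)

# Toy usage on a synthetic low-contrast frame (replace with a real image, e.g. cv2.imread(...)).
frame = np.clip(110 + 10 * np.random.randn(256, 256, 3), 0, 255).astype(np.uint8)
enhanced = enhance_contrast(frame)
print(frame.std(), enhanced.std())  # local equalization raises the overall contrast
```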
Figure 1: The framework of the window frame defect detection system (WFDD). The input comprises RGB images captured by the Spot Robot. The data augmentation module employs geometric operations and applies different image enhancement techniques. The preprocessing module is then employed to enhance the performance of the defect detection model. Within the detection module, defects are identified among all detected window frames, with the output showcasing U-Net-generated segmentation blobs.
Figure 2: Example from Cellphone Dataset.
Figure 3: Samples of Construction Site Dataset.
Figure 4: Example from Lab-1 Dataset.
Figure 5: Example from Lab-2 Dataset.
Figure 6: Samples of Demo Site Dataset.
Figure 7: Example of labeling.
Figure 8: Comparative sample using the shadow removal technique.
Figure 9: Comparative sample using the color neutralization technique.
Figure 10: Comparative sample using the contrast enhancement technique.
Figure 11: Comparative sample using the intensity neutralization technique.
Figure 12: Comparative sample using the CLAHE technique.
16 pages, 5787 KiB  
Article
The Spatio-Temporal Patterns of Regional Development in Shandong Province of China from 2012 to 2021 Based on Nighttime Light Remote Sensing
by Hongli Zhang, Quanzhou Yu, Yujie Liu, Jie Jiang, Junjie Chen and Ruyun Liu
Sensors 2023, 23(21), 8728; https://doi.org/10.3390/s23218728 - 26 Oct 2023
Cited by 4 | Viewed by 2532
Abstract
Shandong Province is a major coastal economic province in eastern China, so it is of great significance to clarify the temporal and spatial patterns of its regional development in recent years to support regional high-quality development. Nighttime light remote sensing data can reveal the spatio-temporal patterns of social and economic activities at a fine pixel scale. Based on monthly nighttime light remote sensing data and social statistics, the nighttime light patterns of three geographical regions, of different cities, and of different counties in Shandong Province over the last 10 years were studied at three spatial scales using the methods of trend analysis, stability analysis and correlation analysis. The results show that: (1) The nighttime light pattern was generally consistent with the spatial pattern of construction land. The nighttime light intensity of most urban built-up areas showed an increasing trend, while the old urban areas of Qingdao and Yantai showed a weakening trend. (2) At the geographical unit scale, the total nighttime light in south-central Shandong was significantly higher than that in eastern and northwest Shandong, while the nighttime light growth rate in northwest Shandong was the highest. At the urban scale, Liaocheng had the highest nighttime light growth rate. At the county scale, the nighttime light growth rate of counties with a better economy was lower, while that of counties with a less developed economy was higher. (3) The nighttime light growth was significantly correlated with Gross Domestic Product (GDP) and population growth, indicating that regional economic development and population growth were the main causes of nighttime light change. Full article
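The pixel-scale trend analysis described above amounts to fitting a linear slope through each pixel's monthly time series and then relating aggregated light totals to socio-economic indicators. The NumPy sketch below illustrates both steps on a synthetic data cube; the array sizes and the synthetic GDP series are stand-ins, not the nighttime light composites or statistics used in the study.

```python
# Generic sketch of pixel-wise trend analysis on a monthly nighttime-light stack,
# followed by a correlation between a regional light series and an indicator.
import numpy as np

t, h, w = 115, 50, 50                       # ~10 years of monthly composites (synthetic)
time = np.arange(t)
cube = np.random.rand(t, h, w) * 5
cube += 0.02 * time[:, None, None]          # inject a weak brightening trend

# Slope per pixel via least squares on the flattened stack.
A = np.column_stack([time, np.ones(t)])
coef, *_ = np.linalg.lstsq(A, cube.reshape(t, -1), rcond=None)
slope_map = coef[0].reshape(h, w)           # nighttime-light change rate per month
print(slope_map.mean())                     # close to the injected 0.02

# Correlation between the regional total-light series and a synthetic GDP series.
total_light = cube.sum(axis=(1, 2))
gdp = 100 + 0.8 * total_light + np.random.randn(t) * 5
print(np.corrcoef(total_light, gdp)[0, 1])
```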
Figure 1: Data processing flow chart.
Figure 2: Land cover in Shandong province in 2020.
Figure 3: Spatial pattern of mean Nighttime Light in Shandong province from April 2012 to October 2021.
Figure 4: Spatio-temporal changes in Nighttime Light in Shandong province from April 2012 to October 2021.
Figure 5: Key areas of Nighttime Light change in Shandong province from April 2012 to October 2021.
Figure 6: Stability pattern of Nighttime Light in Shandong province from April 2012 to October 2021.
17 pages, 45348 KiB  
Article
Enhanced 3D Pose Estimation in Multi-Person, Multi-View Scenarios through Unsupervised Domain Adaptation with Dropout Discriminator
by Junli Deng, Haoyuan Yao and Ping Shi
Sensors 2023, 23(20), 8406; https://doi.org/10.3390/s23208406 - 12 Oct 2023
Cited by 1 | Viewed by 1719
Abstract
Data-driven pose estimation methods often assume equal distributions between training and test data. However, in reality, this assumption does not always hold true, leading to significant performance degradation due to distribution mismatches. In this study, our objective is to enhance the cross-domain robustness of multi-view, multi-person 3D pose estimation. We tackle the domain shift challenge through three key approaches: (1) A domain adaptation component is introduced to improve estimation accuracy for specific target domains. (2) By incorporating a dropout mechanism, we train a more reliable model tailored to the target domain. (3) Transferable Parameter Learning is employed to retain crucial parameters for learning domain-invariant data. The foundation for these approaches lies in the H-divergence theory and the lottery ticket hypothesis, which are realized through adversarial training by learning domain classifiers. Our proposed methodology is evaluated using three datasets: Panoptic, Shelf, and Campus, allowing us to assess its efficacy in addressing domain shifts in multi-view, multi-person pose estimation. Both qualitative and quantitative experiments demonstrate that our algorithm performs well in two different domain shift scenarios. Full article
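The adversarial training described above is commonly realized with a gradient reversal layer, and the dropout mechanism here corresponds to randomly omitting some of the domain discriminators in each batch. The PyTorch sketch below combines these two ingredients in a generic form; the feature dimension, discriminator heads, and drop probability are illustrative assumptions, not the paper's configuration.

```python
# Generic sketch: gradient reversal for domain-adversarial training, plus several
# domain discriminators, each randomly dropped for a given batch.
import random
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

feature_dim, num_discriminators, drop_prob = 128, 3, 0.3
discriminators = nn.ModuleList(
    nn.Sequential(nn.Linear(feature_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(num_discriminators)
)
bce = nn.BCEWithLogitsLoss()

features = torch.randn(8, feature_dim, requires_grad=True)  # pooled features (toy)
domain_label = torch.ones(8, 1)                             # 1 = source domain

# Only a random subset of discriminators provides adversarial feedback this batch.
active = [d for d in discriminators if random.random() > drop_prob] or [discriminators[0]]
loss = sum(bce(d(grad_reverse(features)), domain_label) for d in active) / len(active)
loss.backward()
print(features.grad.norm())  # gradients reach the features with reversed sign
```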
Figure 1: Depiction of various datasets utilized for multi-view, multi-person 3D pose estimation. Image examples are sourced from Panoptic [9], Campus [10], and Shelf [10], respectively. While all datasets feature scenes with clean backgrounds, they differ in aspects such as clothing, resolution, lighting, body size, and more. These visual disparities among the datasets complicate the task of applying pose estimation models across different domains.
Figure 2: An overview of our Domain Adaptive VoxelPose model. An adversarial training method is used to train the domain classifier. The selection of certain discriminators is determined by a probability δ_k. The network performs a robust positive update for the transferable parameters and performs a negative update for the untransferable parameters.
Figure 3: The original adversarial framework (a) is extended to incorporate multiple adversaries. In this enhancement, certain discriminators are probabilistically omitted (b), resulting in only a random subset of feedback (depicted by the arrows) being utilized by the feature extractor at the end of each batch.
Figure 4: Estimated 3D poses and their corresponding images in an outdoor environment (Campus Dataset). Different colors represent different people detected. The penultimate column is the output result of the original VoxelPose, which has misestimated the person. The last column shows the estimated 3D poses by our algorithm.
Figure 5: Cross-domain qualitative comparison between our method and other state-of-the-art multi-view multi-person 3D pose estimation algorithms. The evaluated methods were trained on the Panoptic dataset and validated on the Campus dataset. Different colors represent different people detected, with red indicating the ground truth.
Figure 6: Estimated 3D poses and their corresponding images in an indoor social interaction environment (Shelf Dataset). The penultimate column is the output result of the original VoxelPose, which has misestimated a person. The last column shows the estimated 3D poses by our algorithm.
Figure 7: Cross-domain qualitative comparison between our method and other state-of-the-art multi-view multi-person 3D pose estimation algorithms on the Shelf dataset. The evaluated methods were trained on the Panoptic dataset and validated on the Shelf dataset.
Figure 8: An illustration of the Average Percentage of Correct Parts (PCP3D) on the Campus and Shelf datasets, with the dropout rate (d) plotted on the horizontal axis and PCP3D on the vertical axis. The methods are distinguished by color: the red line for the DA baseline method, the yellow line for the dropout DA method, and the blue line for our proposed full method with TransPar.
Figure 9: The Average Percentage of Correct Parts (PCP3D), based on wider ratios of transferable parameters, on the Campus and Shelf datasets.