
Topic Editors

Prof. Dr. Junxing Zheng
School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
Dr. Peng Cao
Associate Professor, Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100084, China

3D Computer Vision and Smart Building and City, 2nd Volume

Abstract submission deadline: closed (31 October 2024)
Manuscript submission deadline: 31 December 2024
Viewed by 32002

Topic Information

Dear Colleagues,

This Topic is a continuation of the previous successful Topic, "3D Computer Vision and Smart Building and City" (https://www.mdpi.com/topics/3D_BIM). Three-dimensional computer vision is an interdisciplinary subject involving computer vision, computer graphics, artificial intelligence and other fields. Its main contents include 3D perception, 3D understanding and 3D modeling. In recent years, 3D computer vision technology has developed rapidly and is now widely used in unmanned aerial vehicles, robots, autonomous driving, AR, VR and other fields. Smart buildings and cities use various information technologies and innovative concepts to connect various systems and services so as to improve the efficiency of resource utilization, optimize management and services, and improve quality of life. Smart buildings and cities can involve frontier techniques such as 3D computer vision for building information models, digital twins, city information models, and simultaneous localization and mapping (SLAM) robots. The application of 3D computer vision in smart buildings and cities is a valuable research direction, but it still faces many major challenges. This Topic focuses on the theory and technology of 3D computer vision in smart buildings and cities. We welcome papers that provide innovative technologies, theories or case studies in the relevant field.

Prof. Dr. Junxing Zheng
Dr. Peng Cao
Topic Editors

Keywords

  • smart buildings and cities
  • 3D computer vision
  • SLAM
  • building information model
  • city information model
  • robots

Participating Journals

Journal Name                                    Impact Factor   CiteScore   Launched Year   First Decision (median)   APC
Buildings                                       3.1             3.4         2011            17.2 Days                 CHF 2600
Drones                                          4.4             5.6         2017            21.7 Days                 CHF 2600
Energies                                        3.0             6.2         2008            17.5 Days                 CHF 2600
Sensors                                         3.4             7.3         2001            16.8 Days                 CHF 2600
Sustainability                                  3.3             6.8         2009            20 Days                   CHF 2400
ISPRS International Journal of Geo-Information  2.8             6.9         2012            36.2 Days                 CHF 1700

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to enjoy the benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea with a time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (26 papers)

17 pages, 7037 KiB  
Article
Experimental Study on the Bending Mechanical Properties of Socket-Type Concrete Pipe Joints
by Xu Liang, Jian Xu, Xuesong Song, Zhongyao Ren and Li Shi
Buildings 2024, 14(11), 3655; https://doi.org/10.3390/buildings14113655 - 17 Nov 2024
Viewed by 228
Abstract
In modern infrastructure construction, the socket joint of concrete pipelines is a critical component in ensuring the overall stability and safety of the pipeline system. This study conducted monotonic and cyclic bending loading tests on DN300 concrete pipeline socket joints to thoroughly analyse their bending mechanical properties. The experimental results indicated that during monotonic loading, the relationship between the joint angle and bending moment exhibited nonlinear growth, with the stress state of the socket joint transitioning from the initial contact between the rubber ring and the socket to the eventual contact between the spigot and socket concrete. During the cyclic loading phase, the accumulated joint angle, secant stiffness, and bending stiffness of the pipeline interface significantly increased within the first 1 to 7 cycles and stabilised between the 8th and 40th cycles. After 40 cycles of loading, the bending stiffness of the joint reached 1.5 kN·m2, while the stiffness of the pipeline was approximately 8500 times that of the joint. Additionally, a finite element model for the monotonic loading of the concrete pipeline socket joint was established, and the simulation results showed good agreement with the experimental data, providing a reliable basis for further simulation and analysis of the joint’s mechanical performance under higher loads. This study fills the gap in research on the mechanical properties of concrete pipeline socket joints, particularly under bending loads, and offers valuable references for related engineering applications. Full article
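The cyclic results above report secant and bending stiffness per cycle. Below is a minimal sketch, not the authors' code, of how the secant stiffness of one moment–rotation hysteresis loop can be computed as the slope of the chord joining the loop's extreme points; the loop data here are synthetic and the amplitudes are hypothetical.

```python
import numpy as np

def secant_stiffness(moment, rotation):
    """Secant stiffness of one hysteresis cycle: slope of the chord joining
    the extreme (rotation, moment) points of the loop (units: kN·m/rad)."""
    i_max, i_min = np.argmax(rotation), np.argmin(rotation)
    return (moment[i_max] - moment[i_min]) / (rotation[i_max] - rotation[i_min])

# Illustrative loop with hypothetical amplitudes (0.01 rad, 10.5 kN·m).
phase = np.linspace(0.0, 2.0 * np.pi, 200)
theta = 0.01 * np.sin(phase)
M = 10.5 * np.sin(phase - 0.3)   # phase lag mimics hysteresis
print(f"secant stiffness ≈ {secant_stiffness(M, theta):.1f} kN·m/rad")
```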
Show Figures

Figure 1. Flexural loading test of full-sized concrete pipeline–socket interface: (a) side view; (b) top view.
Figure 2. (a) Physical drawing of test pipe fitting; (b) pipe interfacial dimensions (mm).
Figure 3. Assembly diagram of the pipeline–socket interface.
Figure 4. Calculation diagram of pipeline interfacial bending deformation.
Figure 5. Cyclic loading time-history curves: (a) Test 2 with a cyclic load amplitude of 10.5 kN; (b) Test 3 with a cyclic load amplitude of 17.5 kN.
Figure 6. Monotonic bending loading test process for pipeline joints: (a) before loading; (b) during loading; (c) after loading.
Figure 7. Load (jack's output)–displacement curves of the concrete pipeline–socket interface under monotonic loading.
Figure 8. Rubber ring deformation during bending loading of concrete pipeline socket joints: (a) twist; (b) slippage.
Figure 9. Moment–rotation angle curves of the concrete pipeline–socket interface under monotonic loading.
Figure 10. Cumulative rotation angles and numbers of cycles for the concrete pipeline–socket interface under cyclic loading. Note: the letters (O, A, B, C) in the figure are used to differentiate the segments of the curve.
Figure 11. Deformation of the concrete pipeline–socket interface under cyclic loading.
Figure 12. Bending moment–rotation angle curves for cyclic loading tests on the concrete pipeline–socket interface: (a) Test 2 (peak cyclic load of 10.5 kN); (b) Test 3 (peak cyclic load of 17.5 kN).
Figure 13. Secant stiffness of the cyclic hysteresis curve. Note: the black line represents the load–angle diagram in the cyclic experiment, while the red line indicates the chord of the curve.
Figure 14. Variation curves of secant stiffness at the concrete pipeline–socket interface with increasing number of cycles.
Figure 15. Curves of flexural stiffness of the concrete pipeline–socket interface with increasing number of cycles.
Figure 16. Three-dimensional finite element model grid diagram of the concrete pipeline–socket interface.
Figure 17. Bending moment–rotation angle curves of the concrete pipeline–socket interface under monotonic loading and corresponding numerical simulation.
Figure 18. Displacement cloud map of the concrete pipeline–socket interface under a bending moment of 8 kN·m.
Figure 19. Stress distribution cloud map of the concrete pipeline–socket interface under a bending moment of 8 kN·m.
25 pages, 2849 KiB  
Article
Enhanced Hybrid U-Net Framework for Sophisticated Building Automation Extraction Utilizing Decay Matrix
by Ting Wang, Zhuyi Gong, Anqi Tang, Qian Zhang and Yun Ge
Buildings 2024, 14(11), 3353; https://doi.org/10.3390/buildings14113353 - 23 Oct 2024
Viewed by 538
Abstract
Automatically extracting buildings from remote sensing imagery using deep learning techniques has become essential for various real-world applications. However, mainstream methods often encounter difficulties in accurately extracting and reconstructing fine-grained features due to the heterogeneity and scale variations in building appearances. To address these challenges, we propose LDFormer, an advanced building segmentation model based on linear decay. LDFormer introduces a multi-scale detail fusion bridge (MDFB), which dynamically integrates shallow features to enhance the representation of local details and capture fine-grained local features effectively. To improve global feature extraction, the model incorporates linear decay self-attention (LDSA) and depthwise large separable kernel multi-layer perceptron (DWLSK-MLP) optimizations in the decoder. Specifically, LDSA employs a linear decay matrix within the self-attention mechanism to address long-distance dependency issues, while DWLSK-MLP utilizes step-wise convolutions to achieve a large receptive field. The proposed method has been evaluated on the Massachusetts, Inria, and WHU building datasets, achieving IoU scores of 76.10%, 82.87%, and 91.86%, respectively. LDFormer demonstrates superior performance compared to existing state-of-the-art methods in building segmentation tasks, showcasing its significant potential for building automation extraction. Full article
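LDSA, as described above, inserts a linear decay matrix into self-attention to damp long-range interactions. The following is a minimal numpy sketch of that general idea (a distance-proportional penalty subtracted from the attention logits) for a 1-D token sequence; the decay coefficient and the exact formulation in LDFormer may differ.

```python
import numpy as np

def linear_decay_attention(Q, K, V, alpha=0.1):
    """Scaled dot-product attention with a penalty that grows linearly
    with the index distance between query and key tokens."""
    n, d = Q.shape
    logits = Q @ K.T / np.sqrt(d)
    dist = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    logits = logits - alpha * dist          # linear decay of far-away interactions
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 32))           # 16 tokens, 32-dim features
print(linear_decay_attention(x, x, x).shape)  # (16, 32)
```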
Show Figures

Figure 1. Some challenges in remote sensing images. (a) Buildings vary in size, shape, texture, and color. (b) Shadows obscure the buildings in remote sensing images.
Figure 2. An overview of the LDFormer model.
Figure 3. The structure of LDBlock. (a) Block in Swin-Transformer. (b) LDBlock in LDFormer.
Figure 4. The structure of the Multi-scale Detail Fusion Bridge (MDFB).
Figure 5. Illustration of the LDSA strategy. Different colors represent different weight values for the data; the closer to the token center, the greater the data weight value.
Figure 6. (a) DW-MLP. (b) MS-MLP. (c) Our DWLSK-MLP.
Figure 7. Qualitative comparison under Massachusetts test sets. Red boxes highlight the differences in order to facilitate model comparison.
Figure 8. Visualization of large image inference on the Massachusetts dataset.
Figure 9. Qualitative comparison under WHU (left) and Inria (right) test sets.
Figure 10. Ablation analysis of the impact of the number of model heads and window size using the Inria building dataset.
Figure 11. Model complexity comparison of LDFormer on the Inria dataset.
26 pages, 11601 KiB  
Article
Raspberry Pi-Based IoT System for Grouting Void Detection in Tunnel Construction
by Weibin Luo, Junxing Zheng, Yu Miao and Lin Gao
Buildings 2024, 14(11), 3349; https://doi.org/10.3390/buildings14113349 - 23 Oct 2024
Viewed by 1607
Abstract
This paper presents an IoT-based solution for detecting grouting voids in tunnel construction using the Raspberry Pi microcomputer. Voids between the primary and secondary tunnel linings can compromise structural integrity, and traditional methods like GPR lack continuous feedback. The proposed system uses embedded electrical wires in the secondary lining to measure conductivity, with disruptions indicating unfilled voids. The Raspberry Pi monitors this in real time, uploading data to a cloud platform for engineer access via smartphone. Field tests were conducted in a full-scale, 600 m long tunnel to evaluate the system’s effectiveness. The tests demonstrated the system’s accuracy in detecting voids in various tunnel geometries, including straight sections, curves, and intersections. Using only the proposed void detection system, the largest void detected post-grouting was 1.8 cm, which is within acceptable limits and does not compromise the tunnel’s structural integrity or safety. The system proved to be a cost-effective and scalable solution for real-time monitoring during the grouting process, eliminating the need for continuous manual inspections. This study highlights the potential of IoT-based solutions in smart construction, providing a reliable and practical method for improving tunnel safety and operational efficiency during grouting operations. Full article
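A minimal sketch of the monitoring loop described above: reading the embedded-wire circuit through a Raspberry Pi GPIO pin and pushing the state to a cloud endpoint. The pin number, endpoint URL, payload format, and the mapping of an open circuit to a suspected void are assumptions for illustration, not details taken from the paper.

```python
import time
import requests
import RPi.GPIO as GPIO

WIRE_PIN = 17                                   # hypothetical GPIO pin wired to the sensing circuit
CLOUD_URL = "https://example.com/api/void"      # hypothetical cloud endpoint

GPIO.setmode(GPIO.BCM)
GPIO.setup(WIRE_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

try:
    while True:
        # Assumed wiring: continuity on the embedded wire pulls the pin low,
        # so an open circuit (pin high) suggests an unfilled grouting void.
        void_suspected = GPIO.input(WIRE_PIN) == GPIO.HIGH
        requests.post(CLOUD_URL, json={"ts": time.time(), "void": void_suspected}, timeout=5)
        time.sleep(10)                          # polling interval in seconds
finally:
    GPIO.cleanup()
```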
Show Figures

Figure 1. The architecture of the proposed IoT-based grouting void detection system using Raspberry Pi.
Figure 2. Electrical wires embedded within the secondary lining structure.
Figure 3. Setup of the proposed Raspberry Pi-based grouting void detection IoT system.
Figure 4. The Raspberry Pi platform.
Figure 5. GPIO interface of the Raspberry Pi platform.
Figure 6. The main components of the IoT-based grouting void detection system using Raspberry Pi.
Figure 7. The grouting void detection algorithm.
Figure 8. The IoT architecture of the proposed system.
Figure 9. Field test setup.
17 pages, 9902 KiB  
Article
Research on a Photovoltaic Panel Dust Detection Algorithm Based on 3D Data Generation
by Chengzhi Xie, Qifen Li, Yongwen Yang, Liting Zhang and Xiaojing Liu
Energies 2024, 17(20), 5222; https://doi.org/10.3390/en17205222 - 20 Oct 2024
Viewed by 593
Abstract
With the rapid advancements in AI technology, UAV-based inspection has become a mainstream method for intelligent maintenance of PV power stations. To address limitations in accuracy and data acquisition, this paper presents a defect detection algorithm for PV panels based on an enhanced YOLOv8 model. The PV panel dust dataset is manually extended using 3D modeling technology, which significantly improves the model’s ability to generalize and detect fine dust particles in complex environments. SENetV2 is introduced to improve the model’s perception of dust features in cluttered backgrounds. AKConv replaces traditional convolution in the neck network, allowing for more flexible and accurate feature extraction through arbitrary kernel parameters and sampling shapes. Additionally, a DySample dynamic upsampler accelerates processing by 8.73%, improving the frame rate from 87.58 FPS to 95.23 FPS while maintaining efficiency. Experimental results show that the 3D image expansion method contributes to a 4.6% increase in detection accuracy, an 8.4% improvement in recall, a 5.7% increase in mAP@50, and a 15.1% improvement in mAP@50-95 compared to the original YOLOv8. The expanded dataset and enhanced model demonstrate the effectiveness and practicality of the proposed approach. Full article
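A minimal sketch of fine-tuning and running a YOLOv8 detector with the public Ultralytics API, in the spirit of the pipeline above. The dataset YAML, weights file, and thresholds are placeholders, and the architectural modifications the paper introduces (SENetV2, AKConv, DySample) are not included here.

```python
from ultralytics import YOLO

# Start from pretrained YOLOv8 weights and fine-tune on a (hypothetical) dust dataset.
model = YOLO("yolov8n.pt")
model.train(data="pv_dust.yaml", epochs=100, imgsz=640)

# Inference on drone imagery of PV panels.
results = model.predict(source="pv_panels/", conf=0.25)
for r in results:
    print(r.boxes.xyxy, r.boxes.conf)   # detected dust regions and confidences
```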
Show Figures

Figure 1. Overall flow chart of the experiment.
Figure 2. Structure of the YOLOv8 model.
Figure 3. Improvement of the YOLOv8 network structure diagram.
Figure 4. SENet module.
Figure 5. SENetV2 module.
Figure 6. DySample dynamic upsampling structure.
Figure 7. AKConv structure.
Figure 8. Three-dimensional modeling of PV panels in Blender.
Figure 9. Dust randomization override script settings.
Figure 10. Surface dust of PV panels at different particle sizes.
Figure 11. ResNet50 network.
Figure 12. Low-quality samples (first row) and high-quality samples (second row).
Figure 13. Photographs of real samples.
Figure 14. Partial experimental data results after 3D image expansion.
Figure 15. Display of inference results.
Figure 16. Effect of different modules on mAP@50 and R.
21 pages, 6078 KiB  
Article
Multi-Feature-Filtering-Based Road Curb Extraction from Unordered Point Clouds
by Hong Lang, Yuan Peng, Zheng Zou, Shengxue Zhu, Yichuan Peng and Hao Du
Sensors 2024, 24(20), 6544; https://doi.org/10.3390/s24206544 - 10 Oct 2024
Viewed by 849
Abstract
Road curb extraction is a critical component of road environment perception, being essential for calculating road geometry parameters and ensuring the safe navigation of autonomous vehicles. The existing research primarily focuses on extracting curbs from ordered point clouds, which are constrained by their structure of point cloud organization, making it difficult to apply them to unordered point cloud data and making them susceptible to interference from obstacles. To overcome these limitations, a multi-feature-filtering-based method for curb extraction from unordered point clouds is proposed. This method integrates several techniques, including the grid height difference, normal vectors, clustering, an alpha-shape algorithm based on point cloud density, and the MSAC (M-Estimate Sample Consensus) algorithm for multi-frame fitting. The multi-frame fitting approach addresses the limitations of traditional single-frame methods by fitting the curb contour every five frames, ensuring more accurate contour extraction while preserving local curb features. Based on our self-developed dataset and the Toronto dataset, these methods are integrated to create a robust filter capable of accurately identifying curbs in various complex scenarios. Optimal threshold values were determined through sensitivity analysis and applied to enhance curb extraction performance under diverse conditions. Experimental results demonstrate that the proposed method accurately and comprehensively extracts curb points in different road environments, proving its effectiveness and robustness. Specifically, the average curb segmentation precision, recall, and F1 score values across scenarios A, B (intersections), C (straight road), and scenarios D and E (curved roads and ghosting) are 0.9365, 0.782, and 0.8523, respectively. Full article
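A minimal numpy sketch of the first filtering stage named above: binning an unordered point cloud into a horizontal grid and keeping cells whose height range falls inside a curb-like band. The cell size and thresholds are illustrative, not the paper's calibrated values.

```python
import numpy as np

def curb_candidate_cells(points, cell=0.2, dz_min=0.05, dz_max=0.30):
    """points: (N, 3) unordered cloud. Returns indices of points lying in grid
    cells whose max-min height difference is within a curb-like range."""
    ij = np.floor(points[:, :2] / cell).astype(np.int64)
    keys = ij[:, 0] * 100003 + ij[:, 1]            # simple hash of the 2-D cell index
    order = np.argsort(keys)
    keys_sorted, pts_sorted = keys[order], points[order]
    keep = np.zeros(len(points), dtype=bool)
    start = 0
    for end in np.append(np.nonzero(np.diff(keys_sorted))[0] + 1, len(keys_sorted)):
        dz = pts_sorted[start:end, 2].max() - pts_sorted[start:end, 2].min()
        if dz_min <= dz <= dz_max:
            keep[order[start:end]] = True          # flag all points of the candidate cell
        start = end
    return np.nonzero(keep)[0]
```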
Show Figures

Figure 1. Algorithm framework.
Figure 2. MLS system for field data acquisition.
Figure 3. Various road scenarios. The road curbs within the red boxes A–E are the scenes for the comparison experiments.
Figure 4. Extraction results for road scenarios A and B: (a) results after processing with grid height difference; (b) result after normal vector extraction; (c) result after using clustering and the variable-radius alpha-shape algorithm; (d) result after the multi-frame fitting MSAC algorithm. The different colors in the figure indicate varying heights of the curb points.
Figure 5. Extraction results for road scenario C: (a) result after processing with grid height difference; (b) result after normal vector extraction; (c) result after using clustering and the variable-radius alpha-shape algorithm; (d) result after using the multi-frame fitting MSAC algorithm.
Figure 6. Extraction results for road scenarios D and E: (a) result after processing with grid height difference; (b) result after normal vector extraction; (c) result after using clustering and the variable-radius alpha-shape algorithm; (d) result after using the multi-frame fitting MSAC algorithm.
Figure 7. Comparison of road curb extraction using single-frame and multi-frame fitting: (a) single-frame fitting for road scenarios A and B; (b) multi-frame fitting for road scenarios A and B; (c) single-frame fitting for road scenario C; (d) multi-frame fitting for road scenario C; (e) single-frame fitting for road scenarios D and E; (f) multi-frame fitting for road scenarios D and E.
Figure 8. Comparison using the Toronto dataset: (a) our results using the Toronto dataset; (b) Mi's results using the Toronto dataset.
19 pages, 13819 KiB  
Article
An Algorithm for Simplifying 3D Building Models with Consideration for Detailed Features and Topological Structure
by Zhenglin Li, Zhanjie Zhao, Wujun Gao and Li Jiao
ISPRS Int. J. Geo-Inf. 2024, 13(10), 356; https://doi.org/10.3390/ijgi13100356 - 8 Oct 2024
Viewed by 680
Abstract
To tackle problems such as the destruction of topological structures and the loss of detailed features in the simplification of 3D building models, we propose a 3D building model simplification algorithm that considers detailed features and topological structures. Based on the edge collapse algorithm, the method defines the region formed by the first-order neighboring triangles of the endpoints of the edge to be collapsed as the simplification unit. It incorporates the centroid displacement of the simplification unit, significance level, and approximate curvature of the edge as influencing factors for the collapse cost to control the edge collapse sequence and preserve model details. Additionally, considering the unique properties of 3D building models, boundary edge detection and face overlay are added as constraints to maintain the model’s topological structure. The experimental results show that the algorithm is superior to the classic QEM algorithm in terms of preserving the topological structure and detailed features of the model. Compared to the QEM algorithm and the other two comparison algorithms selected in this paper, the simplified model resulting from this algorithm exhibits a reduction in Hausdorff distance, mean error, and mean square error to varying degrees. Moreover, the advantages of this algorithm become more pronounced as the simplification rate increases. The research findings can be applied to the simplification of 3D building models. Full article
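A minimal sketch of an edge-collapse cost that mixes geometric terms like those named above (centroid displacement of the simplification unit, edge significance via its length, and an approximate curvature from the endpoint normals). The weighting and exact definitions are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def collapse_cost(v1, v2, ring_pts, n1, n2, w=(1.0, 1.0, 1.0)):
    """Heuristic cost of collapsing edge (v1, v2).
    ring_pts: vertices of the first-order neighbourhood (simplification unit).
    n1, n2:   unit normals at the two endpoints (approximate curvature cue)."""
    target = 0.5 * (v1 + v2)                              # collapse to the edge midpoint
    centroid_before = ring_pts.mean(axis=0)
    centroid_after = np.vstack([ring_pts, target]).mean(axis=0)
    centroid_shift = np.linalg.norm(centroid_after - centroid_before)
    edge_len = np.linalg.norm(v2 - v1)                    # longer edges are more "significant"
    curvature = 1.0 - float(np.clip(n1 @ n2, -1.0, 1.0))  # 0 for flat, 2 for folded
    return w[0] * centroid_shift + w[1] * edge_len + w[2] * curvature
```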
Show Figures

Figure 1. Edge collapse.
Figure 2. Centroid displacement.
Figure 3. Schematic diagram of the calculation of simplification unit saliency.
Figure 4. Boundary point.
Figure 5. The role of boundary edge constraints.
Figure 6. Common neighborhood vertices.
Figure 7. The role of surface superposition detection.
Figure 8. Process of the algorithm.
Figure 9. Original model.
Figure 10. Simplification results of each algorithm at a simplification rate of 20% for the house model: (a) QEM algorithm; (b) the algorithm from Reference [18]; (c) the algorithm from Reference [21]; (d) the algorithm in this paper (11,578 faces each).
Figure 11. Simplification results of each algorithm at a simplification rate of 50% for the house model, panels (a–d) as in Figure 10 (7237 faces each).
Figure 12. Simplification results of each algorithm at a simplification rate of 80% for the house model, panels (a–d) as in Figure 10 (2895 faces each).
Figure 13. Simplification results of each algorithm at a simplification rate of 95% for the house model, panels (a–d) as in Figure 10 (724 faces each).
Figure 14. Simplification results of each algorithm at a simplification rate of 20% for the pagoda model, panels (a–d) as in Figure 10 (433,448 faces each).
Figure 15. Simplification results of each algorithm at a simplification rate of 50% for the pagoda model, panels (a–d) as in Figure 10 (270,905 faces each).
Figure 16. Simplification results of each algorithm at a simplification rate of 80% for the pagoda model, panels (a–d) as in Figure 10 (108,362 faces each).
Figure 17. Simplification results of each algorithm at a simplification rate of 95% for the pagoda model, panels (a–d) as in Figure 10 (27,091 faces each).
Figure 18. Simplified results without considering centroid displacement.
Figure 19. Simplified results without considering significance.
Figure 20. Simplified results without considering edge approximate curvature.
Figure 21. Simplified results of this algorithm.
18 pages, 3496 KiB  
Article
Analysis of Guidance Signage Systems from a Complex Network Theory Perspective: A Case Study in Subway Stations
by Fei Peng, Zhe Zhang and Qingyan Ding
ISPRS Int. J. Geo-Inf. 2024, 13(10), 342; https://doi.org/10.3390/ijgi13100342 - 25 Sep 2024
Viewed by 543
Abstract
Guidance signage systems (GSSs) play a large role in pedestrian navigation for public buildings. A vulnerable GSS can cause wayfinding troubles for pedestrians. In order to investigate the robustness of GSSs, a complex network-based GSS robustness analysis framework is proposed in this paper. First, a method that can transform a GSS into a guidance service network (GSN) is proposed by analyzing the relationships among various signs, and signage node metrics are proposed to evaluate the importance of signage nodes. Second, two network performance metrics, namely, the level of visibility and guidance efficiency, are proposed to evaluate the robustness of the GSN under various disruption modes, and the most important signage node metrics are determined. Finally, a multi-objective optimization model is established to find the optimal weights of these metrics, and a comprehensive evaluation method is proposed to position the critical signage nodes that should receive increased maintenance efforts. A case study was conducted in a subway station and the GSS was transformed into a GSN successfully. The analysis results show that the GSN has scale-free characteristics, and recommendations for GSS design are proposed on the basis of robustness analysis. The signage nodes with high betweenness centrality play a greater role in the GSN than the signage nodes with high degree centrality. The proposed critical signage node evaluation method can be used to efficiently identify the signage nodes for which failure has the greatest effects on GSN performance. Full article
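A minimal networkx sketch of the kind of analysis described above: ranking signage nodes by degree and betweenness centrality and tracking a network-level metric as the highest-ranked nodes are removed. The graph and the performance metric here (global efficiency on a scale-free toy graph) are stand-ins for the paper's GSN and guidance-efficiency measure.

```python
import networkx as nx

# Toy stand-in for a guidance service network (GSN); the paper builds its GSN from sign relationships.
G = nx.barabasi_albert_graph(60, 2, seed=1)

degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)

def efficiency(graph):
    """Simple network performance proxy: global efficiency of the graph."""
    return nx.global_efficiency(graph)

# Remove signage nodes in order of betweenness and watch performance degrade.
H = G.copy()
for node, _ in sorted(betweenness.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    H.remove_node(node)
    print(f"removed node {node}: efficiency = {efficiency(H):.3f}")
```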
Show Figures

Figure 1. Missing information (to train and direction arrow) due to light tube failure.
Figure 2. Methodological framework.
Figure 3. Interaction relationship between signs.
Figure 4. Influence factors of the relationship between any two signage nodes.
Figure 5. VCA of signage node k.
Figure 6. The occlusion effects of obstacles.
Figure 7. GSN for the GSS of the Suyuan subway station.
Figure 8. Degree distributions.
Figure 9. Locations of the ten highest-ranked signs in each case.
Figure 10. Robustness of the GSN under failure conditions.
Figure 11. Dividing one long signboard into two short signboards.
Figure 12. Guidance efficiency under short and long signboard scenarios.
Figure 13. Weights of degree and betweenness centrality under various numbers of removed signage nodes.
17 pages, 10236 KiB  
Article
Research on a 3D Point Cloud Map Learning Algorithm Based on Point Normal Constraints
by Zhao Fang, Youyu Liu, Lijin Xu, Mahamudul Hasan Shahed and Liping Shi
Sensors 2024, 24(19), 6185; https://doi.org/10.3390/s24196185 - 24 Sep 2024
Viewed by 654
Abstract
Laser point clouds are commonly affected by Gaussian and Laplace noise, resulting in decreased accuracy in subsequent surface reconstruction and visualization processes. However, existing point cloud denoising algorithms often overlook the local consistency and density of the point cloud normal vector. A feature map learning algorithm which integrates point normal constraints, Dirichlet energy, and coupled orthogonality bias terms is proposed. Specifically, the Dirichlet energy is employed to penalize the difference between neighboring normal vectors and combined with a coupled orthogonality bias term to enhance the orthogonality between the normal vectors and the subsurface, thereby enhancing the accuracy and robustness of the learned denoising of the feature maps. Additionally, to mitigate the effect of mixed noise, a point cloud density function is introduced to rapidly capture local feature correlations. In experimental findings on the Anchor public dataset, the proposed method reduces the average mean square error (MSE) by 0.005 and 0.054 compared to the MRPCA and NLD algorithms, respectively. Moreover, it improves the average signal-to-noise ratio (SNR) by 0.13 dB and 2.14 dB compared to MRPCA and AWLOP, respectively. The proposed algorithm enhances computational efficiency by 27% compared to the RSLDM method. It not only removes mixed noise but also preserves the local geometric features of the point cloud, further improving computational efficiency. Full article
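A minimal sketch of the Dirichlet-energy idea used above: penalizing differences between neighbouring point normals over a k-nearest-neighbour graph. The full method (coupled orthogonality bias term, density weighting) is not reproduced here, and the data below are random placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_dirichlet_energy(points, normals, k=8):
    """Sum of squared differences between each point's normal and the normals
    of its k nearest neighbours (a discrete Dirichlet energy on the k-NN graph)."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)      # first neighbour is the point itself
    diffs = normals[:, None, :] - normals[idx[:, 1:]]
    return float(np.sum(diffs ** 2))

rng = np.random.default_rng(0)
pts = rng.standard_normal((500, 3))
nrm = rng.standard_normal((500, 3))
nrm /= np.linalg.norm(nrm, axis=1, keepdims=True)
print(normal_dirichlet_energy(pts, nrm))
```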
Show Figures

Figure 1. A point cloud model with Gaussian noise and Laplacian noise.
Figure 2. Local consistency constraint of point cloud normal vectors.
Figure 3. Point cloud model: (a) Anchor ground truth; (b) Gargoyle ground truth.
Figure 4. Point cloud model at different noise strengths: (a) noise (σ = 0.02); (b) noise (σ = 0.04).
Figure 5. Noise reduction effects of PNCFGL at different strengths in the Anchor model: (a) PNCFGL (σ = 0.02); (b) PNCFGL (σ = 0.04).
Figure 6. Noise reduction effects of PNCFGL at different strengths in the Gargoyle model: (a) PNCFGL (σ = 0.02); (b) PNCFGL (σ = 0.04).
Figure 7. Noise reduction effects of APSS at different noise strengths in the Anchor model: (a) APSS (σ = 0.02); (b) APSS (σ = 0.04).
Figure 8. Noise reduction effects of APSS at different noise intensities in the Gargoyle model: (a) APSS (σ = 0.02); (b) APSS (σ = 0.04).
Figure 9. Noise reduction effects of NLD at different strengths in the Anchor model: (a) NLD (σ = 0.02); (b) NLD (σ = 0.04).
Figure 10. Noise reduction effects of NLD at different strengths in the Gargoyle model: (a) NLD (σ = 0.02); (b) NLD (σ = 0.04).
Figure 11. Noise removal effects of MRPCA at different strengths in the Anchor model: (a) MRPCA (σ = 0.02); (b) MRPCA (σ = 0.04).
Figure 12. De-noising effects of MRPCA at different strengths in the Gargoyle model: (a) MRPCA (σ = 0.02); (b) MRPCA (σ = 0.04).
Figure 13. Laser radar scanner.
Figure 14. Scanning objects: (a) pipe fitting; (b) pipe circle; (c) rear support; (d) front support.
Figure 15. Comparison of objects (a–d) before and after PNCFGL noise reduction.
25 pages, 10814 KiB  
Article
Three-Dimensional Web-Based Client Presentation of Integrated BIM and GIS for Smart Cities
by Abdullah Varlık and İsmail Dursun
Buildings 2024, 14(9), 3021; https://doi.org/10.3390/buildings14093021 - 23 Sep 2024
Viewed by 1201
Abstract
Smart cities use technological solutions to reduce the drawbacks of urban living. The importance of BIM and GIS integration has increased with the popularity of smart city and 3D city concepts in recent years. In addition to 3D city models, Building Information Modeling (BIM) is an essential element of smart cities. The 3D city model web client in this study displays three-dimensional (3D) city models created using photogrammetric techniques, BIM, and campus infrastructure projects. The comparison and integration of the aforementioned systems were evaluated. A web-based 3D client framework and implementation for combined BIM and 3D city models are the goals of the submitted work. The Web is a very challenging platform for 3D data presentation. The Cesium engine, which is open source and based on HTML5 and WebGL, and the virtualcityMAP application built on the Cesium infrastructure were used in this study. Full article
Show Figures

Figure 1. LoD representation as defined by CityGML 2.0 and CityGML 3.0 [16,17].
Figure 2. Relationship between the LoD and the degree of representativeness [22].
Figure 3. Snapshot of a building modeled in IFC (right side) and CityGML (left side) [26].
Figure 4. Research route of the full level of detail (LoD) specification for 3D building models. IFC, industry foundation classes; ILoD, indoor LoD; OLoD, outdoor LoD [18].
Figure 5. Semantic mapping of IFC and CityGML classes (yellow outline indicates IFC classes and green outline indicates CityGML classes; classes in boxes without black outlines do not carry geometric information) [39].
Figure 6. Working area.
Figure 7. BIM model generated through CAD-to-BIM conversion.
Figure 8. (a) SBIF UAV data; (b) campus orthophoto.
Figure 9. SBIF LIDAR point cloud.
Figure 10. Scan-to-BIM model.
Figure 11. Integration of CityGML and IFC. Note: "*" is the UML notation used to represent the cardinal relationship among CityGML classes (the number of occurrences or possibilities); an intermediate model is shown inside the red box.
Figure 12. 3D city model/BIM integration model.
Figure 13. Georeferencing.
Figure 14. IFC to CityGML transformation FME workbench [70].
Figure 15. Used CityGML structure encoded in GML.
Figure 16. Main data.
Figure 17. Supplementary data.
19 pages, 1468 KiB  
Article
Research on Technological Innovation Capability of Yancheng Prefabricated Construction Industry Based on Patent Information Analysis
by Renyan Lu, Feiting Shi and Houchao Sun
Buildings 2024, 14(9), 2968; https://doi.org/10.3390/buildings14092968 - 19 Sep 2024
Viewed by 656
Abstract
In order to improve the innovation capabilities of Yancheng's prefabricated construction industry, the Dawei Innojoy patent database was used to retrieve prefabricated construction technology patent literature data for Yancheng and other major cities in the Yangtze River Delta region from 2012 to 2022. The prefabricated construction patents in Yancheng were analyzed in terms of application trends, application type composition, applicants, technical fields, and patent legal status. At the same time, an evaluation system for prefabricated building technology innovation capability was constructed, and the factor analysis method was used to compare the prefabricated building technology patent indicators of Yancheng and major cities in the Yangtze River Delta region. The results show that Yancheng has a small number of patent applications, a small proportion of invention patents, a low patent authorization rate, and a low patent conversion rate, and the industry-university-research chain needs to be opened up. Among the cities in the Yangtze River Delta, Yancheng's comprehensive innovation ability in prefabricated building technology is medium to low: it lags behind in terms of the scale and quality of technological innovation but ranks at the forefront in technological innovation operations. Based on this, the article puts forward countermeasures and suggestions for Yancheng's prefabricated building technology patent applications at three levels (macro, meso, and micro) in order to achieve efficient innovation and promote the high-quality development of Yancheng's new building industrialization. Full article
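A minimal sketch of scoring cities on patent indicators with factor analysis, as the comparison above does. The indicator matrix here is random placeholder data, not figures from the paper, and the composite score is a simple illustrative aggregation.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Placeholder: 8 cities x 4 standardized patent indicators
# (e.g., applications, invention share, grant rate, conversion rate).
X = rng.standard_normal((8, 4))

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(X)            # factor scores per city

# Illustrative composite technological-innovation score: sum of factor scores.
print(scores.sum(axis=1))
```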
Show Figures

Figure 1. Number of patent applications in Yancheng City's prefabricated construction technology industry from 2012 to 2022.
Figure 2. Composition of patent application types in Yancheng City's prefabricated construction technology industry.
Figure 3. The main applicants for Yancheng's prefabricated construction technology industry patents.
Figure 4. Legal status of patent applications for the Yancheng prefabricated construction technology industry.
Figure 5. Comparison of the total number and effective number of patent applications in the prefabricated construction technology industry in major cities in the Yangtze River Delta.
Figure 6. Comparison of the proportion of valid authorizations for patent applications in the prefabricated construction technology industry in major cities in the Yangtze River Delta.
20 pages, 4137 KiB  
Article
A Minimal Solution Estimating the Position of Cameras with Unknown Focal Length with IMU Assistance
by Kang Yan, Zhenbao Yu, Chengfang Song, Hongping Zhang and Dezhong Chen
Drones 2024, 8(9), 423; https://doi.org/10.3390/drones8090423 - 24 Aug 2024
Viewed by 651
Abstract
Drones are typically built with integrated cameras and inertial measurement units (IMUs). It is crucial to achieve drone attitude control through relative pose estimation using cameras. IMU drift can be ignored over short periods. Based on this premise, in this paper, four methods are proposed for estimating relative pose and focal length across various application scenarios: for scenarios where the camera’s focal length varies between adjacent moments and is unknown, the relative pose and focal length can be computed from four-point correspondences; for planar motion scenarios where the camera’s focal length varies between adjacent moments and is unknown, the relative pose and focal length can be determined from three-point correspondences; for instances of planar motion where the camera’s focal length is equal between adjacent moments and is unknown, the relative pose and focal length can be calculated from two-point correspondences; finally, for scenarios where multiple cameras are employed for image acquisition but only one is calibrated, a method proposed for estimating the pose and focal length of uncalibrated cameras can be used. The numerical stability and performance of these methods are compared and analyzed under various noise conditions using simulated datasets. We also assessed the performance of these methods on real datasets captured by a drone in various scenes. The experimental results demonstrate that the method proposed in this paper achieves superior accuracy and stability to classical methods. Full article
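The solvers above exploit IMU-provided attitude information. As background, here is a minimal sketch of the simplest related case, not the paper's 4-, 3- or 2-point solvers: when the full relative rotation R is known from the IMU and the image rays are calibrated, the translation direction follows linearly from the epipolar constraint x2^T [t]_x R x1 = 0.

```python
import numpy as np

def translation_from_known_rotation(x1, x2, R):
    """x1, x2: (N, 3) unit bearing vectors in two views; R: known relative rotation.
    Solves x2_i^T [t]_x (R x1_i) = 0 for the translation direction t (up to scale and sign)."""
    y = (R @ x1.T).T                    # rotate first-view rays into the second frame
    A = np.cross(y, x2)                 # each row gives one linear equation A_i . t = 0
    _, _, Vt = np.linalg.svd(A)
    t = Vt[-1]
    return t / np.linalg.norm(t)

# Synthetic check with an identity rotation and a known translation direction.
rng = np.random.default_rng(0)
R = np.eye(3)
t_true = np.array([1.0, 0.2, 0.0]); t_true /= np.linalg.norm(t_true)
P = rng.standard_normal((30, 3)) + np.array([0.0, 0.0, 6.0])     # 3-D points in front of view 1
x1 = P / np.linalg.norm(P, axis=1, keepdims=True)
P2 = (R @ P.T).T - t_true                                        # same points in the second frame
x2 = P2 / np.linalg.norm(P2, axis=1, keepdims=True)
print(translation_from_known_rotation(x1, x2, R))                # ≈ ±t_true
```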
Show Figures

Figure 1. O1 and O2 represent the camera centers; P denotes the target feature point; p1 and p2 are the pixel coordinates of the feature points; e1 and e2 are the epipoles, the points where the line connecting O1 and O2 intersects the image planes; O1, O2, and P form the epipolar plane; l1 and l2 are the epipolar lines, where the epipolar plane intersects the image planes.
Figure 2. Focal length error probability density for 10,000 randomly generated problem instances.
Figure 3. Translation matrix error probability density for 10,000 randomly generated problem instances.
Figure 4. Error variation curve of focal length f with different scale errors in pixel coordinates.
Figure 5. Error variation curve of translation vector t with different scale errors in pixel coordinates.
Figure 6. Error variation curves of eight methods when introducing different levels of noise into the three rotation angles with the IMU: (a) median focal length error after introducing pitch angle rotation errors; (b) median focal length error after introducing yaw angle rotation errors; (c) median focal length error after introducing roll angle rotation errors; (d) median translation vector error after introducing pitch angle rotation errors; (e) median translation vector error after introducing yaw angle rotation errors; (f) median translation vector error after introducing roll angle rotation errors.
Figure 7. Images captured by the drone: (a) outdoor landscapes; (b) urban buildings; (c) road vehicles.
Figure 8. Schematic of feature point extraction using the SIFT algorithm.
Figure 9. Cumulative distribution functions of the estimated errors in camera focal length and translation vector across three scenarios: (a) focal length error, outdoor landscapes; (b) translation vector error, outdoor landscapes; (c) focal length error, urban buildings; (d) translation vector error, urban buildings; (e) focal length error, road vehicles; (f) translation vector error, road vehicles.
Figure 10. Three-dimensional trajectory plot of real data.
Figure 11. Two-dimensional trajectory plot of real data.
23 pages, 63398 KiB  
Article
Automatic Generation of Standard Nursing Unit Floor Plan in General Hospital Based on Stable Diffusion
by Zhuo Han and Yongquan Chen
Buildings 2024, 14(9), 2601; https://doi.org/10.3390/buildings14092601 - 23 Aug 2024
Viewed by 682
Abstract
This study focuses on the automatic generation of architectural floor plans for standard nursing units in general hospitals based on Stable Diffusion. It aims to assist architects in efficiently generating a variety of preliminary plan preview schemes and to enhance the efficiency of the pre-planning stage of medical buildings. The workflow includes dataset processing, model training, and model testing and generation. By inputting the boundary of a nursing unit's floor plan, it enables the generation of well-organized, clear, and readable functional block floor plans with strong generalization capabilities. Quantitative analysis demonstrated that 82% of the generated samples met the evaluation criteria for standard nursing units. Additionally, a comparative experiment was conducted using the same dataset to train a deep learning model based on Generative Adversarial Networks (GANs). The conclusion describes the strengths and limitations of the methodology, pointing out directions for improvement for future studies. Full article
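A minimal sketch of boundary-conditioned image-to-image generation with the Hugging Face diffusers library, in the spirit of the pipeline above. The checkpoint name, prompt, file names and hyperparameters are placeholders, and the fine-tuned LoRA weights and ControlNet components the paper relies on are not shown.

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

# Placeholder base checkpoint; the paper uses its own fine-tuned Stable Diffusion + LoRA weights.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

# The nursing-unit plan boundary image serves as the init image.
boundary = Image.open("nursing_unit_boundary.png").convert("RGB")
result = pipe(prompt="standard nursing unit floor plan, functional color blocks",
              image=boundary, strength=0.75, guidance_scale=7.5).images[0]
result.save("generated_plan.png")
```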
Show Figures

Figure 1. (a) The basic architecture of the SD model: Latent Diffusion Model [8]; (b) LoRA [9].
Figure 2. Methodological framework of the experiment.
Figure 3. Main and sub-corridor style nursing unit floor plan dataset (portion).
Figure 4. Stable Diffusion loss.
Figure 5. Testing and generation framework.
Figure 6. Sampling steps and denoising strength.
Figure 7. Other hyperparameters remain unchanged; the seed is changed.
Figure 8. Other hyperparameters remain unchanged; the input image is added to the main corridor and the seed is changed.
Figure 9. ControlNet preprocessors.
Figure 10. ControlNet preprocessors with the input image added to the main corridor.
Figure 11. ControlNet Weight, Guidance Start.
Figure 12. Preprocessor and Guidance End.
Figure 13. Image-to-Image + ControlNet, changing the seed.
Figure 14. Parameter-controlled boundary generation.
Figure 15. Area feature distribution.
20 pages, 8685 KiB  
Article
Numerical Simulation and Field Monitoring of Blasting Vibration for Tunnel In-Situ Expansion by a Non-Cut Blast Scheme
by Zhenchang Guan, Lifu Xie, Dong Chen and Jingkang Shi
Sensors 2024, 24(14), 4546; https://doi.org/10.3390/s24144546 - 13 Jul 2024
Viewed by 1192
Abstract
There have been ever more in-situ tunnel expansion projects due to the growing demand for transportation. The traditional blast scheme requires a large quantity of explosive, and the vibration effect is hard to control. In order to reduce explosive consumption and the vibration effect, an optimized non-cut blast scheme was proposed and applied to the in-situ expansion of the Gushan Tunnel. Refined numerical simulation was adopted to compare the traditional and optimized blast schemes. The vibration attenuation within the interlaid rock mass and the vibration effect on the adjacent tunnel were studied and compared. The simulation results were validated by field monitoring of the vibration effect on the adjacent tunnel. Both the simulation and the monitoring results showed that the vibration velocity on the adjacent tunnel's back side was much smaller than its counterpart on the blast side, i.e., the presence of the cavity reduced the blasting vibration effect significantly. The optimized non-cut blast scheme, which effectively utilized the existing free surface, could reduce explosive consumption and the vibration effect significantly, and may be preferred for in-situ tunnel expansion projects. Full article
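A minimal numpy sketch of two quantities typically reported in blast-vibration monitoring like that above: peak particle velocity and the dominant frequency of a velocity time history, assuming uniform sampling. The sampling rate and signal below are synthetic placeholders, not monitored data from the paper.

```python
import numpy as np

def ppv_and_dominant_frequency(v, fs):
    """v: velocity time history (cm/s), fs: sampling rate (Hz).
    Returns peak particle velocity and the frequency with the largest spectral amplitude."""
    spectrum = np.abs(np.fft.rfft(v))
    freqs = np.fft.rfftfreq(len(v), d=1.0 / fs)
    return np.max(np.abs(v)), freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin

fs = 2000.0                                             # hypothetical sampling rate
t = np.arange(0, 0.5, 1.0 / fs)
v = 1.2 * np.sin(2 * np.pi * 60 * t) * np.exp(-8 * t)   # synthetic decaying 60 Hz pulse
ppv, f_dom = ppv_and_dominant_frequency(v, fs)
print(f"PPV = {ppv:.2f} cm/s, dominant frequency ≈ {f_dom:.1f} Hz")
```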
Show Figures

Figure 1. The engineering practices of tunnel reconstruction or expansion.
Figure 2. The typical cross-section of the Gushan tunnel before and after in-situ expansion (unit: m).
Figure 3. The excavation sequence for the in-situ expansion of the north tunnel. The dashed areas represent the lining profile after in-situ expansion and the "+" areas represent the unexcavated rock mass.
Figure 4. Traditional blast scheme for the top part of the north tunnel. Numbers represent detonator sequences, the plus sign represents unexcavated rock mass, and circles represent blast holes.
Figure 5. Non-cut blast scheme for the top part of the north tunnel.
Figure 6. Loading boundary of equivalent blasting load.
Figure 7. Numerical model for the Gushan tunnel.
Figure 8. Equivalent blasting loads for every detonator sequence in the traditional blast scheme.
Figure 9. The implementation of the equivalent blasting load for the traditional blast scheme: (a) detonator sequence 8; (b) detonator sequence 14.
Figure 10. Equivalent blasting load for every detonator sequence in the non-cut blast scheme.
Figure 11. The implementation of the equivalent blasting load for the non-cut blast scheme: (a) detonator sequence 8; (b) detonator sequence 14.
Figure 12. Arrangement of numerical monitoring points (unit: m).
Figure 13. The velocity–time histories and frequency spectra of the M6 monitoring point for the traditional blast scheme: (a) time history in the X-direction; (b) frequency spectrum in the X-direction; (c) time history in the Z-direction; (d) frequency spectrum in the Z-direction.
Figure 14. The maximum velocities on the adjacent tunnel for the traditional blast scheme: (a) X-direction; (b) Z-direction. Units: cm/s.
Figure 15. The velocity–time histories and frequency spectra of the M6 monitoring point for the non-cut blast scheme: (a) time history in the X-direction; (b) frequency spectrum in the X-direction; (c) time history in the Z-direction; (d) frequency spectrum in the Z-direction.
Figure 16. The maximum velocities on the adjacent tunnel for the non-cut blast scheme: (a) X-direction; (b) Z-direction. Units: cm/s.
Figure 17. The maximum velocities within the interlaid rock mass for the traditional blast scheme: (a) X-direction; (b) Z-direction.
Figure 18. The maximum velocities within the interlaid rock mass for the non-cut blast scheme: (a) X-direction; (b) Z-direction.
Figure 19. The upper part of the NK18+110 section before and after expansion.
Figure 20. Field monitoring for blasting vibration.
Figure 21. Arrangement of blasting vibration meters.
Figure 22. The velocity–time histories and frequency spectra recorded by field monitoring and compared with numerical simulation results: (a) time history of M6 in the X-direction; (b) frequency spectrum of M6 in the X-direction; (c) time history of M6 in the Z-direction; (d) frequency spectrum of M6 in the Z-direction; (e) time history of M7 in the X-direction; (f) frequency spectrum of M7 in the X-direction; (g) time history of M7 in the Z-direction; (h) frequency spectrum of M7 in the Z-direction.
21 pages, 3782 KiB  
Article
Globally Optimal Relative Pose and Scale Estimation from Only Image Correspondences with Known Vertical Direction
by Zhenbao Yu, Shirong Ye, Changwei Liu, Ronghe Jin, Pengfei Xia and Kang Yan
ISPRS Int. J. Geo-Inf. 2024, 13(7), 246; https://doi.org/10.3390/ijgi13070246 - 9 Jul 2024
Viewed by 783
Abstract
Installing multi-camera systems and inertial measurement units (IMUs) in self-driving cars, micro aerial vehicles, and robots is becoming increasingly common. An IMU provides the vertical direction, allowing coordinate frames to be aligned in a common direction and reducing the degrees of freedom (DOFs) of the rotation matrix from three to one. In this paper, we propose a globally optimal solver to calculate the relative poses and scale of generalized cameras with a known vertical direction. First, a cost function is established to minimize the algebraic error in the least-squares sense. Then, the cost function is transformed into two polynomials with only two unknowns. Finally, the eigenvalue method is used to solve the relative rotation angle. The performance of the proposed method is verified on both simulated data and the KITTI dataset. Experiments show that our method is more accurate than existing state-of-the-art solvers in estimating the relative pose and scale. Compared to the best of the comparison methods, the proposed method reduces the rotation matrix error, translation vector error, and scale error by 53%, 67%, and 90%, respectively.
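As a rough illustration of the known-vertical-direction idea (axis conventions and function names are assumptions, not the paper's formulation), the sketch below aligns a camera frame with the measured roll and pitch so that only a single rotation angle about the vertical axis remains to be estimated.

```python
# Sketch (axis convention and names are assumptions): use the IMU roll and pitch to
# rotate a camera frame so that its y-axis is vertical. After both frames are aligned
# this way, the unknown relative rotation is a single angle about the vertical axis.
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def vertical_alignment(roll, pitch):
    """Rotation that removes the measured roll and pitch (IMU convention assumed)."""
    return rot_x(pitch) @ rot_z(roll)

def R_y(theta):
    """The remaining 1-DOF relative rotation about the vertical (y) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# A bearing vector f in a raw camera frame becomes vertical_alignment(roll, pitch) @ f
# in the aligned frame; the full relative rotation is then recovered by combining the
# two alignment rotations with R_y(theta).
```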
Figures:
Figure 1: The rotation matrix, translation vector, and scale are R, t, and s, respectively.
Full article ">Figure 2
<p>The rotation matrix and translation vector of the <math display="inline"><semantics> <mi>i</mi> </semantics></math>-th camera in the <math display="inline"><semantics> <mi>k</mi> </semantics></math> frame are <math display="inline"><semantics> <mrow> <msub> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> <mrow> <mi>k</mi> <mi>i</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mrow> <mi>k</mi> <mi>i</mi> </mrow> </msub> </mrow> </semantics></math>. The rotation matrix and translation vector of the <math display="inline"><semantics> <mi>j</mi> </semantics></math>-th camera in the <math display="inline"><semantics> <mrow> <mi>k</mi> <mo>+</mo> <mn>1</mn> </mrow> </semantics></math> frame are <math display="inline"><semantics> <mrow> <msub> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> <mrow> <msup> <mi>k</mi> <mo>′</mo> </msup> <mi>j</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mrow> <msup> <mi>k</mi> <mo>′</mo> </msup> <mi>j</mi> </mrow> </msub> </mrow> </semantics></math>. The rotation matrix, translation vector, and scale vector between aligned <math display="inline"><semantics> <mi>k</mi> </semantics></math> and <math display="inline"><semantics> <mrow> <mi>k</mi> <mo>+</mo> <mn>1</mn> </mrow> </semantics></math> are <math display="inline"><semantics> <mrow> <msub> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> <mi>y</mi> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mstyle mathvariant="bold" mathsize="normal"> <mover accent="true"> <mi>t</mi> <mo>˜</mo> </mover> </mstyle> </semantics></math>, and <math display="inline"><semantics> <mi>s</mi> </semantics></math>.</p>
Full article ">Figure 3
<p>Algorithm flow chart.</p>
Full article ">Figure 4
<p>Effect of the number of feature points on the accuracy of rotation, translation, and scale estimation by the method proposed in this paper with different feature points. (<b>a</b>) Rotation error (degree); (<b>b</b>) translation error (degree); (<b>c</b>) translation error; (<b>d</b>) scale error.</p>
Full article ">Figure 5
<p>Estimating errors in the rotation matrix, translation vector, and scale information under random motion. The first column shows the calculation results of adding image noise. The second column shows the calculation results of adding pitch angle noise. The third column shows the calculation results of adding roll angle noise. The first, second, third and fourth rows represent the values of <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mrow> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mo>,</mo> <mi>dir</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mi>s</mi> </msub> </mrow> </semantics></math> respectively.</p>
Full article ">Figure 6
<p>Estimating errors in the rotation matrix, translation vector, and scale information under planar motion. The first column shows the calculation results of adding image noise. The second column shows the calculation results of adding pitch angle noise. The third column shows the calculation results of adding roll angle noise. The first, second, third and fourth rows represent the values of <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mrow> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mo>,</mo> <mi>dir</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mi>s</mi> </msub> </mrow> </semantics></math> respectively.</p>
Full article ">Figure 7
<p>Estimating errors in the rotation matrix, translation vector, and scale information under sideways motion. The first column shows the calculation results of adding image noise. The second column shows the calculation results of adding pitch angle noise. The third column shows the calculation results of adding roll angle noise. The first, second, third and fourth rows represent the values of <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mrow> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mo>,</mo> <mi>dir</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mi>s</mi> </msub> </mrow> </semantics></math> respectively.</p>
Full article ">Figure 8
<p>Estimating errors in the rotation matrix, translation vector, and scale information under forward motion. The first column shows the calculation results of adding image noise. The second column shows the calculation results of adding pitch angle noise. The third column shows the calculation results of adding roll angle noise. The first, second, third and fourth rows represent the values of <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>R</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mrow> <mstyle mathvariant="bold" mathsize="normal"> <mi>t</mi> </mstyle> <mo>,</mo> <mi>dir</mi> </mrow> </msub> </mrow> </semantics></math> and <math display="inline"><semantics> <mrow> <msub> <mi>ε</mi> <mi>s</mi> </msub> </mrow> </semantics></math> respectively.</p>
Full article ">Figure 9
<p>Test image pair from KITTI dataset with feature detection.</p>
Full article ">
29 pages, 28612 KiB  
Article
Synergistic Landscape Design Strategies to Renew Thermal Environment: A Case Study of a Cfa-Climate Urban Community in Central Komatsu City, Japan
by Jing Xiao, Takaya Yuizono and Ruixuan Li
Sustainability 2024, 16(13), 5582; https://doi.org/10.3390/su16135582 - 29 Jun 2024
Viewed by 1069
Abstract
An effective community landscape design consistently impacts thermally comfortable outdoor conditions and climate adaptation. Constructing sustainable communities therefore requires a resilience assessment of existing built environments to identify optimal design mechanisms, especially for the renewal of thermally resilient communities in densely populated cities. However, current community renewal typically involves only green space design and lacks a synergistic landscape design for the central community. The main contribution of this study is a three-level optimization method that validates Synergistic Landscape Design Strategies (SLDS) (i.e., planting, green building envelopes, water bodies, and urban trees) for renewing urban communities. A typical Japanese community in central Komatsu City was selected to illustrate the simulation-based design strategies. The microclimate model ENVI-met was used to reproduce the communities in 38 case implementations and to evaluate the physiologically equivalent temperature (PET) and microclimate conditions as measures of the thermal environment in a humid subtropical climate. The simulation results indicated that the single-family buildings and real estate flats were best served by the summer thermal mitigation strategy combining water bodies and green roofs (W). In the small-scale (large-scale) models, the mean PET was lowered by 1.4–5.0 °C (0.9–2.3 °C), and the cooling effect reduced the mean air temperature by 0.4–2.3 °C (0.5–0.8 °C) and improved humidification by 3.7–15.2% (3.7–5.3%). The successful SLDS provides precise alternatives for realizing the Sustainable Development Goals (SDGs) in the renewal of urban communities.
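For readers post-processing ENVI-met receptor output, a minimal sketch of how the reported indicators (changes in mean PET, air temperature, and relative humidity between a baseline and a design case) could be tabulated is shown below; the CSV layout and column names are assumptions, not the authors' workflow.

```python
# Sketch (not the authors' pipeline): compare a baseline ENVI-met run with a design
# case at the receptor points and report the mean changes in PET, air temperature,
# and relative humidity. File and column names are assumptions.
import pandas as pd

def cooling_indicators(baseline_csv: str, design_csv: str) -> pd.Series:
    base = pd.read_csv(baseline_csv)       # columns assumed: receptor, PET, Ta, RH
    design = pd.read_csv(design_csv)
    merged = base.merge(design, on="receptor", suffixes=("_base", "_design"))
    return pd.Series({
        "delta_PET_degC": (merged["PET_design"] - merged["PET_base"]).mean(),
        "delta_Ta_degC": (merged["Ta_design"] - merged["Ta_base"]).mean(),
        "delta_RH_percent": (merged["RH_design"] - merged["RH_base"]).mean(),
    })
```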
Figures:
Graphical abstract
Figure 1: The methodology for microclimate simulation, setting up an optimization mechanism (from a to d) in the Synergistic Landscape Design Strategies (SLDS) to renew the thermal environment in urban communities.
Figure 2: Local sample communities, including the single-family building community (HC), the real estate flat community (AC), and the mixed cluster community (BC).
Figure 3: (a) Köppen-Geiger climate classification of Cfa in Japan (black square area); (b) surface temperature change in Japan from 2019 to 2021 for August warming; (c) maximum air temperature (Ta) in August from 2019 to 2021; and (d) annual mean air temperature (Ta) in Komatsu City from 2019 to 2021.
Figure 4: The Japanese community in two building forms (A and H types) and the wall material settings in ENVI-met.
Figure 5: (a) ArcGIS analysis of Komatsu City using urban surface categories, (b) vegetation cover distribution, and (c) urban heat island (UHI) effect.
Figure 6: Axonometric diagrams for all design cases using the Synergistic Landscape Design Strategies (SLDS) in the three sample communities (HC, AC, and BC areas).
Figure 7: Planting design cases for the small-scale models (HC and AC areas) with the positions of the receptors in ENVI-met.
Figure 8: Synergistic landscape design cases for the large-scale model (BC area) in ENVI-met.
Figure 9: Validation by linear fit of monitored and modeled air temperature (Ta) and relative humidity (RH) in the local sample communities (HC, AC, and BC areas) at small and large scales.
Figure 10: Simulation results of the microclimate variations in the sample communities at small and large scales.
Figure 11: ENVI-met simulation results for the physiologically equivalent temperature (PET) distribution and the mitigation time at a pedestrian height of 1.8 m.
Figure 12: Distribution maps of the PET thermal index at 14:00 simulated with the planting design strategies (L1, L2, L3, and L4) in the two small-scale communities.
Figure 13: Distribution maps of the PET thermal index at 14:00 under the green building envelope (GBE) design strategies ("R, F, and W") renewed in the HC and AC areas based on the planting design (L1–4).
Figure 14: Distribution maps of the PET thermal index at 14:00 under the urban tree effect (W-Ga-f) based on the water body and green roof (W) design of the BC area.
14 pages, 3735 KiB  
Article
Learning Effective Geometry Representation from Videos for Self-Supervised Monocular Depth Estimation
by Hailiang Zhao, Yongyi Kong, Chonghao Zhang, Haoji Zhang and Jiansen Zhao
ISPRS Int. J. Geo-Inf. 2024, 13(6), 193; https://doi.org/10.3390/ijgi13060193 - 11 Jun 2024
Viewed by 1061
Abstract
Recent studies on self-supervised monocular depth estimation have achieved promising results, mainly based on the joint optimization of depth and pose estimation via a high-level photometric loss. However, how to learn latent and beneficial task-specific geometry representations from videos remains largely unexplored. To tackle this issue, we propose two novel schemes for learning more effective representations from monocular videos: (i) an Inter-task Attention Model (IAM) that learns the geometric correlation representation between the depth and pose learning networks, making structure and motion information mutually beneficial; and (ii) a Spatial-Temporal Memory Module (STMM) that exploits long-range geometric context representation among consecutive frames both spatially and temporally. Systematic ablation studies are conducted to demonstrate the effectiveness of each component. Evaluations on KITTI show that our method outperforms current state-of-the-art techniques.
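A minimal sketch of the photometric supervision that such pipelines share is shown below: the source frame is warped into the target view using the predicted depth and relative pose, and the photometric difference is penalized. Tensor shapes and names are assumptions; the paper's IAM and STMM modules are not reproduced here.

```python
# Sketch of the generic self-supervised photometric loss (an assumption for context,
# not the authors' implementation): warp `source` into the target view with predicted
# depth and relative pose, then penalize the L1 difference.
import torch
import torch.nn.functional as F

def backproject(depth, K_inv):
    """Lift every target pixel to a 3D point using the predicted depth (B,1,H,W)."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float()        # (3,H,W)
    pix = pix.view(3, -1).unsqueeze(0).to(depth.device)                    # (1,3,HW)
    return depth.view(b, 1, -1) * (K_inv @ pix)                            # (B,3,HW)

def photometric_loss(target, source, depth, T, K, K_inv):
    """Warp `source` into the target view and return the mean L1 photometric error."""
    b, _, h, w = target.shape
    points = T[:, :3, :3] @ backproject(depth, K_inv) + T[:, :3, 3:4]      # rigid motion
    proj = K @ points                                                      # (B,3,HW)
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-7)
    grid = torch.stack([2 * uv[:, 0] / (w - 1) - 1,                        # to [-1, 1]
                        2 * uv[:, 1] / (h - 1) - 1], dim=-1).view(b, h, w, 2)
    warped = F.grid_sample(source, grid, padding_mode="border", align_corners=True)
    return (target - warped).abs().mean()
```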
Figures:
Figure 1: Comparison of the learning process of the general pipeline (a) and our method (b) for self-supervised monocular depth estimation. Unlike the general pipeline, which learns the depth feature F_D and the pose feature F_P separately using a 2D photometric loss L, we propose a new scheme for learning better representations from videos. A memory mechanism M is devised to exploit the long-range context from videos for depth feature learning, and an inter-task attention mechanism A is devised to leverage depth information to help pose feature learning, which in turn benefits depth feature learning via gradient back-propagation.
Figure 2: Illustration of our network framework (a) and the architectures of the IAM (b) and the STMM (c). The network takes three consecutive frames as input to learn the long-range geometric correlation representation by introducing the STMM after the encoder. The pose network is split into two branches that predict rotation R and translation t separately. The IAM is applied after the second convolution layer of both the R and t branches, learning valuable geometry information that helps the two branches leverage the inter-task correlation representation.
Figure 3: Qualitative results on the KITTI test set. Our method produces more accurate depth maps for low-texture regions, moving vehicles, delicate structures, and object boundaries.
Figure 4: Visual results evaluated on the Cityscapes dataset, using models trained on KITTI without any refinement. Compared with the methods in [2], our method generates higher-quality depth maps and captures moving and slim objects better; the differences are highlighted with dashed circles.
Figure 5: Visualization of the learned attention maps in the IAM, indicating that the IAM places distinct emphasis on different regions for the two branches to improve their estimation.
Figure 6: Visual comparison of the visual odometry trajectories. Full trajectories are plotted using the Evo visualization tool [51].
16 pages, 7679 KiB  
Article
A 3D Parameterized BIM-Modeling Method for Complex Engineering Structures in Building Construction Projects
by Lijun Yang, Xuexiang Gao, Song Chen, Qianyao Li and Shuo Bai
Buildings 2024, 14(6), 1752; https://doi.org/10.3390/buildings14061752 - 11 Jun 2024
Viewed by 1220
Abstract
The structural components of large-scale public construction projects are more complex than those of ordinary residential buildings, with irregular and diverse components as well as a large number of repetitive structural elements, which increases the difficulty of BIM-modeling operations. Additionally, a significant amount of parameter information is inherent to the construction process, which places higher demands on the application and management capabilities of BIM technology. However, current BIM software still has deficiencies in the parameterization of complex and irregular structural components, fine modeling, and project management information. To address these issues, this paper takes Grasshopper as the core parametric tool and Revit as the carrier of component attribute information. It investigates the parametric modeling logic of Grasshopper and combines the concepts of parameterization, modularization, standardization, and engineering practicality to create a series of parametric programs for complex structural components in building projects. This approach mainly addresses intricate challenges pertaining to parametric structural shapes (including batch processing) and parametric structural attributes (including the batch processing of diverse attribute parameters), thereby ensuring efficient BIM modeling throughout the design and construction phases of complex building projects.
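The parametric idea can be illustrated outside Grasshopper with a small sketch: a section outline is generated from a parameter table, so a batch of components differs only in its parameters. The simplified cantilever wall layout below is an assumption for illustration, not one of the paper's retaining wall types a–f.

```python
# Sketch of parametric section generation and batch processing. The section layout
# (a simplified cantilever retaining wall) and the parameter names are assumptions.
from dataclasses import dataclass

@dataclass
class RetainingWallParams:
    stem_height: float      # m
    stem_top_width: float   # m
    stem_base_width: float  # m
    footing_width: float    # m
    footing_depth: float    # m

def wall_section(p: RetainingWallParams):
    """Return the section outline as a list of (x, z) vertices, counter-clockwise."""
    top = p.footing_depth + p.stem_height
    return [
        (0.0, 0.0),                         # footing bottom-left
        (p.footing_width, 0.0),             # footing bottom-right
        (p.footing_width, p.footing_depth), # footing top-right
        (p.stem_base_width, p.footing_depth),
        (p.stem_top_width, top),            # stem tapers toward the top
        (0.0, top),
    ]

# Batch generation: one parameter row per wall segment along the alignment.
batch = [RetainingWallParams(4.0, 0.3, 0.6, 2.5, 0.5),
         RetainingWallParams(3.5, 0.3, 0.55, 2.3, 0.5)]
sections = [wall_section(p) for p in batch]
```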
Figures:
Figure 1: BIM parameterized digital graphic representation.
Figure 2: Vector representation in Grasshopper.
Figure 3: Differences between data structure types and their operations.
Figure 4: The three matching modes: (a) Matching with Longest List; (b) Matching with Shortest List; and (c) Cross-List Data Matching.
Figure 5: The process of forming points, lines, and surfaces.
Figure 6: Diversified movement methods of components: (a) object translation along a direction; and (b) object rotation around an axis.
Figure 7: Types and sections of retaining walls: (a) Type a; (b) Type b; (c) Type c; (d) Type d; (e) Type e; (f) Type f.
Figure 8: The Grasshopper (GH) parameterization process for retaining walls.
Figure 9: Implementation process of structural positioning.
Figure 10: Parameter settings and parameterized model creation.
Figure 11: Parameterized variable display of the staircase structure.
Figure 12: Parameterized node connection display of the staircase structure.
Figure 13: Drawing the parameterized staircase structure based on projection lines.
20 pages, 13136 KiB  
Article
DSOMF: A Dynamic Environment Simultaneous Localization and Mapping Technique Based on Machine Learning
by Shengzhe Yue, Zhengjie Wang and Xiaoning Zhang
Sensors 2024, 24(10), 3063; https://doi.org/10.3390/s24103063 - 11 May 2024
Viewed by 985
Abstract
To address the reduced localization accuracy and incomplete map construction exhibited by classical semantic simultaneous localization and mapping (SLAM) algorithms in dynamic environments, this study introduces a dynamic-scene SLAM technique that builds upon direct sparse odometry (DSO) and incorporates instance segmentation and video completion algorithms. While prioritizing the algorithm's real-time performance, we leverage the rapid matching capabilities of DSO to associate identical dynamic objects in consecutive frames. This association is achieved by merging semantic and geometric data, thereby enhancing the matching accuracy during image tracking through the inclusion of semantic probability. Furthermore, we incorporate a loop closure module based on video inpainting algorithms into our mapping thread, which allows the algorithm to rely on the completed static background for loop closure detection and further enhances localization accuracy. The efficacy of this approach is validated using the TUM and KITTI public datasets and an unmanned platform experiment. Experimental results show that, in various dynamic scenes, our method achieves an improvement exceeding 85% in localization accuracy compared with the DSO system.
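A sketch of one ingredient of such systems, assumed here for illustration rather than taken from DSOMF itself, is shown below: instance masks of potentially dynamic classes are dilated and used to restrict point selection to the static background before tracking.

```python
# Sketch: build a "static" mask from instance-segmentation output so that a direct
# odometry front end only samples tracking points from the static background.
# The class list, dilation radius, and input format are illustrative assumptions.
import numpy as np
import cv2

DYNAMIC_CLASSES = {"person", "car", "bicycle"}   # illustrative choice

def static_point_mask(instance_masks, class_names, image_shape, dilate_px=5):
    """Return a boolean mask that is True where tracking points may be selected."""
    blocked = np.zeros(image_shape[:2], dtype=np.uint8)
    for mask, name in zip(instance_masks, class_names):
        if name in DYNAMIC_CLASSES:
            blocked |= mask.astype(np.uint8)
    kernel = np.ones((2 * dilate_px + 1, 2 * dilate_px + 1), np.uint8)
    blocked = cv2.dilate(blocked, kernel)        # grow masks to cover object boundaries
    return blocked == 0
```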
Figures:
Figure 1: Algorithm framework.
Figure 2: Image processing workflow (the blue box represents the tracking thread, while the orange box represents the mapping thread).
Figure 3: Dynamic object instance segmentation results. (a) Original image; (b) mask image.
Figure 4: Optical flow tracking of regional centroids.
Figure 5: Delineation of dynamic and static regions.
Figure 6: Comparative analysis of mapping outcomes before and after dynamic object elimination.
Figure 7: FGVC optical flow completion process.
Figure 8: Comparison of the camera's absolute trajectory error on the TUM dataset.
Figure 9: Video completion process on the KITTI-04 dataset.
Figure 10: Map construction by DSOMF on the KITTI-04 dataset.
Figure 11: SLAM algorithm test system on the unmanned flight platform.
Figure 12: Top view of the fixed-wing aircraft.
Figure 13: Loop closure detection experiment. (a) With the loop closure detection module running; (b) without the loop closure detection module running.
Figure 14: SLAM algorithm test system on the unmanned ground platform.
Figure 15: Comparison of outdoor dynamic environment trajectories (in the real-life scenario, the outlined boxes represent the trajectories of dynamic objects; route one denotes the path of vehicles, while route two denotes pedestrian pathways).
Figure 16: Environment images. (a) Pedestrian environment image; (b) electric vehicle environment image.
15 pages, 2894 KiB  
Article
Phase Error Reduction for a Structured-Light 3D System Based on a Texture-Modulated Reprojection Method
by Chenbo Shi, Zheng Qin, Xiaowei Hu, Changsheng Zhu, Yuanzheng Mo, Zelong Li, Shaojia Yan, Yue Yu, Xiangteng Zang and Chun Zhang
Sensors 2024, 24(7), 2075; https://doi.org/10.3390/s24072075 - 24 Mar 2024
Viewed by 1183
Abstract
Fringe projection profilometry (FPP), with benefits such as high precision and a large depth of field, is a popular 3D optical measurement method widely used in precision reconstruction scenarios. However, the pixel brightness at reflective edges does not satisfy the conditions of the ideal pixel-wise phase-shifting model due to the influence of scene texture and system defocus, resulting in severe phase errors. To address this problem, we theoretically analyze the non-pixel-wise phase propagation model for texture edges and propose a reprojection strategy based on scene texture modulation. The strategy first obtains a reprojection weight mask by projecting typical FPP patterns and calculating the scene texture reflection ratio, then reprojects stripe patterns modulated by the weight mask to eliminate texture edge effects, and finally fuses the coarse and refined phase maps to generate an accurate phase map. We validated the proposed method on various textured scenes, including a smooth plane, a depth surface, and a curved surface. Experimental results show that the root mean square error (RMSE) of the phase at texture edges decreased by 53.32%, proving the effectiveness of the reprojection strategy in eliminating depth errors at texture edges.
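For context, the sketch below shows the standard four-step phase-shifting relation that FPP systems build on, together with a simple texture reflection ratio that could drive a reprojection weight mask; the exact mask construction and fusion used in the paper are not reproduced, and the function names are assumptions.

```python
# Sketch of standard FPP building blocks (not the paper's algorithm): the wrapped phase
# of four fringe images with 90-degree shifts, a per-pixel reflectivity estimate, and a
# weight-modulated reprojection pattern. Names and normalization are assumptions.
import numpy as np

def wrapped_phase(I1, I2, I3, I4):
    """Wrapped phase of four fringe images with 90-degree phase shifts."""
    return np.arctan2(I4 - I2, I1 - I3)

def reflection_ratio(I_white, I_black, eps=1e-6):
    """Per-pixel reflectivity estimate from all-white and all-black projections."""
    return np.clip((I_white - I_black) / (I_white.max() - I_black.min() + eps), 0.0, 1.0)

def modulated_pattern(pattern, ratio):
    """Attenuate the projected pattern where the scene reflects strongly."""
    weight = 1.0 - ratio          # highly reflective regions receive dimmer stripes
    return np.clip(pattern * weight, 0, 255).astype(np.uint8)
```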
Figures:
Figure 1: Measurement effect of the traditional FPP method. (a) Smooth textured plane; (b) traditional FPP measurement results.
Figure 2: The process of capturing the intensity change in the stripe image by the camera.
Figure 3: The relationship between camera defocus and phase in the scene. (a) The scene intensity captured by the camera; (b) the two-dimensional Gaussian distribution; (c) the phase values of (a).
Figure 4: Computational framework of the proposed method.
Figure 5: Modulation mask. (a) Maximum light modulation pattern; (b) scene image after adding the mask; (c) gradient absolute value image; (d) comparison of absolute gradient value and phase at the marked line positions.
Figure 6: Phase error analysis of the simulated modulated scene. (a) Simulated original and modulated measurement scenes; (b) original phase error and modulated phase error; (c) comparison of the phase errors.
Figure 7: Structured-light 3D reconstruction system platform.
Figure 8: Actual measurement objects. (a) Smooth scene with only texture edges; (b) scene affected by both depth edges and texture edges; (c) smooth surface affected by only texture edges; (d) scenes with different depths of field.
Figure 9: Comparison of measurement results for different depth differences. (a) Original scene image; (b) original depth map; (c) comparison of local ROI regions; (d) modulated scene image; (e) fusion depth map; (f) comparison of the original depth curve (red), fusion depth curve (blue), and ground truth (black).
Figure 10: Comparison of measurement scenes modulated only by texture. (a) Original measurement scene; (b) original depth map; (c) fusion depth map; (d) depth comparison between positions A and B; (e) depth comparison between positions C and D; (f) depth comparison between positions E and F.
Figure 11: Comparison of measurement effects on scenes modulated by depth and texture. (a) Original scene image; (b) original depth map; (c) comparison of local ROI regions; (d) modulated scene image; (e) fusion depth map; (f) comparison of the original depth curve, modulated depth curve, and true depth curve.
Figure 12: Comparison of measurements of different texture widths. (a) Original measurement scene; (b) modulated scene image; (c) original depth map; (d) fusion depth map; (e) ROI of the original measurement scene; (f–l) comparison of original depth (red), fusion depth (blue), and actual value (black) at positions A–G in (e).
Figure 13: Measurement experiments under different depths of field. (a) Original scene image; (b) original depth map; (c) original depth map compared with the depth curve at position A in the fusion depth map; (d) modulated scene image; (e) fusion depth map; (f) original depth map compared with the depth curve at position B in the fusion depth map.
Figure 14: Comparison of the modulation effects of different light intensities. (a) Reconstruction effect of the traditional method; (b–d) reconstruction results when the modulated light intensity is 220, 90, and 50.
21 pages, 25891 KiB  
Article
An Improved TransMVSNet Algorithm for Three-Dimensional Reconstruction in the Unmanned Aerial Vehicle Remote Sensing Domain
by Jiawei Teng, Haijiang Sun, Peixun Liu and Shan Jiang
Sensors 2024, 24(7), 2064; https://doi.org/10.3390/s24072064 - 23 Mar 2024
Viewed by 1110
Abstract
Achieving 3D reconstruction of UAV remote sensing images is important in deep learning-based multi-view stereo (MVS) vision. The lack of obvious texture features and detailed edges in UAV remote sensing images leads to inaccurate feature point matching or depth estimation. To address this problem, this study improves the TransMVSNet algorithm for 3D reconstruction by optimizing its feature extraction network and its cost volume-based depth prediction network. The improvement is mainly achieved by extracting features with the asymptotic feature pyramid network (AFPN) and assigning weights to different levels of features through the ASFF module to increase the importance of key levels, and by using a UNet-structured network combined with an attention mechanism to predict the depth information while extracting key area information. The aim is to improve the performance and accuracy of the TransMVSNet algorithm's 3D reconstruction of UAV remote sensing images. In this work, we performed comparative experiments and quantitative evaluations against other algorithms on the DTU dataset as well as on a large UAV remote sensing image dataset. Extensive experiments show that our improved TransMVSNet algorithm has better performance and robustness, providing a valuable reference for research and application in the field of 3D reconstruction of UAV remote sensing images.
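The level-weighting idea can be sketched as follows (an illustrative assumption, not the paper's exact ASFF module): features from several pyramid levels are resized to a common resolution and fused with per-pixel softmax weights so that the most informative level dominates at each location.

```python
# Sketch of adaptive multi-level feature fusion with learned per-pixel softmax weights.
# Channel counts, level handling, and module name are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusion(nn.Module):
    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        self.weight_conv = nn.Conv2d(channels * num_levels, num_levels, kernel_size=1)

    def forward(self, features):
        """features: list of (B, C, Hi, Wi) tensors from different pyramid levels."""
        h, w = features[0].shape[-2:]
        resized = [F.interpolate(f, size=(h, w), mode="bilinear", align_corners=False)
                   for f in features]
        weights = torch.softmax(self.weight_conv(torch.cat(resized, dim=1)), dim=1)
        # weighted sum over levels, one weight map per level
        return sum(weights[:, i:i + 1] * resized[i] for i in range(len(resized)))
```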
Figures:
Figure 1: The architecture of the proposed asymptotic feature pyramid network (AFPN). AFPN starts by fusing two neighboring low-level features and progressively incorporates high-level features into the fusion process.
Figure 2: Cost volume regularization network: (a) the overall network, (b) the UBA layer, and (c) the CCA module in the UBA layer.
Figure 3: Fully connected (FC) network structure.
Figure 4: Comparison of depth prediction results for Scan1, where (a–d) are the results of our algorithm and (e–h) are the results of the original algorithm.
Figure 5: Comparison of depth prediction results for Scan4 (same layout as Figure 4).
Figure 6: Comparison of depth prediction results for Scan9 (same layout as Figure 4).
Figure 7: Comparison of depth prediction results for Scan10 (same layout as Figure 4).
Figure 8: (a–h) Overhead drone images of buildings from the drone mapping dataset Pix4D. Scene 1 is an unfinished building and Scene 2 is a residential home in Chicago, IL, USA.
Figure 9: Depth maps of the first scene in Figure 8: (a–d) depth predictions of our improved algorithm; (e–h) depth predictions of the original algorithm.
Figure 10: Depth maps of the second scene in Figure 8: (a–d) depth predictions of our improved algorithm; (e–h) depth predictions of the original algorithm.
Figure 11: The 3D reconstruction results of the improved algorithm: (a–d) Scene 1 and (e–h) Scene 2 in Figure 8.
Figure 12: Self-constructed wrap-around robotic-arm overhead shooting scene dataset: (a–d) Scene 1, which mainly includes schools and stadiums; (e–h) Scene 2, which mainly includes residential areas.
Figure 13: (a–p) Comparison of the depth maps for the two self-built scene datasets in Figure 12.
Figure 14: (a–h) Three-dimensional reconstruction models of the two scenes in Figure 12.
Figure 15: (a–h) An iconic site in Lausanne, the capital of the canton of Vaud, Switzerland, showcasing the city's beauty and history, from the unmanned remote sensing dataset Pix4D.
Figure 16: Depth prediction images of the scene in Figure 15, comparing our improved algorithm with TransMVSNet, where (a–d) are the results of our algorithm and (e–h) are the results of TransMVSNet.
Figure 17: (a–d) Three-dimensional reconstructed model views of the scene in Figure 15.
Figure 18: (a–e) A coastal mountain village in the canton of Vaud, district of Lausanne, Switzerland, photographed for Pix4D.
Figure 19: Depth maps of the scene in Figure 18: (a–d) depth predictions of our improved algorithm; (e–h) depth predictions of the original algorithm.
Figure 20: (a–d) Three-dimensional reconstruction model views of the scene in Figure 18.
Figure 21: Comparison with state-of-the-art deep learning-based MVS methods on the DTU dataset (lower is better).
Figure 22: Quantitative experiments: (a–d) the first quantified scan in three planes; (e–h) the second; (i–l) the third.
32 pages, 8391 KiB  
Article
Model-Based 3D Gaze Estimation Using a TOF Camera
by Kuanxin Shen, Yingshun Li, Zhannan Guo, Jintao Gao and Yingjian Wu
Sensors 2024, 24(4), 1070; https://doi.org/10.3390/s24041070 - 6 Feb 2024
Viewed by 1925
Abstract
Among the numerous gaze-estimation methods currently available, appearance-based methods predominantly use RGB images as input and employ convolutional neural networks (CNNs) to detect facial images and regress gaze angles or gaze points, while model-based methods require high-resolution images to obtain a clear eyeball geometric model. Both face significant challenges in outdoor environments and practical application scenarios. This paper proposes a model-based gaze-estimation algorithm using a low-resolution 3D TOF camera. The study uses infrared images instead of RGB images as input to overcome the impact of varying illumination intensity on gaze estimation. We utilized a trained YOLOv8 neural network model to detect eye landmarks in captured facial images. Combined with the depth map from the time-of-flight (TOF) camera, we calculated the 3D coordinates of the canthus points of a single eye of the subject and, based on these, fitted a 3D geometric model of the eyeball to determine the subject's gaze angle. Experimental validation showed that our method achieved root mean square errors of 6.03° and 4.83° in the horizontal and vertical directions, respectively, in detecting the subject's gaze angle. We also tested the proposed method in a real car-driving environment, achieving stable driver gaze detection at various locations inside the car, such as the dashboard, driver mirror, and in-vehicle screen.
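Two of the geometric steps described above can be sketched directly (pinhole intrinsics and variable names are assumptions, not the paper's calibration): back-projecting a detected landmark to 3D using the TOF depth map, and converting a gaze vector from the eyeball center to the pupil into yaw and pitch angles.

```python
# Sketch of two geometric steps (assumed pinhole model and axis convention): lift a
# detected landmark pixel to 3D with the TOF depth, then convert a gaze vector to angles.
import numpy as np

def pixel_to_3d(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth (mm) into camera coordinates (mm)."""
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return np.array([x, y, depth_mm])

def gaze_angles(eyeball_center, pupil_center):
    """Horizontal (yaw) and vertical (pitch) gaze angles in degrees."""
    g = pupil_center - eyeball_center
    g = g / np.linalg.norm(g)
    yaw = np.degrees(np.arctan2(g[0], g[2]))
    pitch = np.degrees(np.arcsin(-g[1]))     # image y points down in this convention
    return yaw, pitch
```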
Figures:
Figure 1: The overall process of the proposed model-based 3D gaze estimation using a TOF camera. The green arrow represents the subject's gaze direction.
Figure 2: Partial examples of the data augmentation.
Figure 3: The eye region and landmark detection model trained on the IRGD dataset using YOLOv8, showing the detection effect on the subject's gaze image (a). The landmark detection model outputs seven target points for a single-eye image of the subject (b): 1: left eye corner point; 2: first upper eyelid point; 3: second upper eyelid point; 4: right eye corner point; 5: first lower eyelid point; 6: second lower eyelid point; 7: pupil point.
Figure 4: The subject maintained a head pose angle of 0° in both the horizontal and vertical directions and performed a series of coherent lizard movements. The green arrow indicates the ground-truth gaze direction, while the red arrow represents the gaze direction obtained using the eyeball center calculation method proposed in [30]. As the subject's gaze angle gradually increased, the deviation between the gaze angle calculated by this eyeball center localization method and the ground-truth gaze angle increased. Table 1 shows the results of our calculations.
Figure 5: Eight marked points manually annotated on the image of the subject's single eye. These points are randomly distributed on the sclera of the eye, not the cornea. These eight 3D coordinate points are used to fit the eyeball model and solve for the 3D coordinates of the eyeball center and the radius of the eyeball.
Figure 6: Eye detail images taken by the TOF camera at a distance of 200–500 mm from the subject, for the subject not wearing myopia glasses (top) and wearing glasses (bottom). The occlusion of glasses reduces some of the clarity and contrast of the subject's eyes, but far less than a longer distance does. When the distance between the subject and the TOF camera exceeds 300 mm, the only observable details in the eye area are the corners of the eyes and the pupil points.
Figure 7: Creating a standard plane with multiple gaze points using a level's laser line (a) and fixing the TOF camera on the plane (b).
Figure 8: Sample pictures of the IRGD dataset proposed in this paper. Gaze data were recorded at five distances from the participant to the TOF camera, ranging from 200 mm to 600 mm. The TOF camera simultaneously collected IR images and depth images of the participant gazing at the gaze points on the standard plane. All participants performed natural eye movements and coherent head movements.
Figure 9: The absolute values of the participants' average head pose angles at the 35 gaze points in the IRGD dataset. The maximum absolute head pose angle is approximately 50° in the horizontal direction (yaw) and approximately 30° in the vertical direction (pitch).
Figure 10: Independent modeling and solution of the eyeball center coordinates in the horizontal (a) and vertical (b) gaze directions.
Figure 11: Variation trends of the aspect ratio of the eye appearance with vertical gaze angle in male (a) and female (b) participants. In male participants, the aspect ratio is less than 0.3 when the eyeball is looking down, while in female participants it is less than 0.4.
Figure 12: The limitation that pupil point depth values cannot be extracted from the depth image of the TOF camera. For gaze images at certain special angles, the pupil point can be observed in the IR image (a), but because the pupil absorbs infrared light, a "black hole" appears at the pupil position in the corresponding depth image (b).
Figure 13: Schematic diagram of the calibration process for the subject's individual eyeball parameters.
Figure 14: Calibration results of the eyeball parameters for three subjects. The optimal eyeball structure parameters (R1, d1) and (R2, d2) were obtained for the three subjects through 10 calibrations, each involving gazing at 20 gaze points. The mean absolute deviations between the gaze angles computed from this set of parameters and the ground-truth angles are shown for the horizontal (blue) and vertical (orange) directions.
Figure 15: Results of calculating the average pupil depth information and the corresponding ground-truth values in the horizontal and vertical gaze directions for the male group (a) and female group (b).
Figure 16: Results of the subject's gaze detection. Column (a) presents the original gaze images of the subject, column (b) shows the results of eye landmark detection based on YOLOv8, and column (c) visualizes the subject's gaze direction. The green arrow indicates the gaze direction detected by our model.
Figure 17: Gaze angle detection results of the male and female subject groups using the proposed gaze-estimation method. (a) Horizontal gaze results of the male group; (b) vertical gaze results of the male group; (c) horizontal gaze results of the female group; (d) vertical gaze results of the female group.
Figure 18: Comparative accuracy of the proposed gaze-estimation model and other state-of-the-art models on infrared gaze test images.
Figure 19: Detection results for some of the driver's gaze points in the interior of a Toyota business SUV. Green arrows indicate the driver's gaze direction detected by our gaze-estimation model.
Figure 20: Mean absolute error between the detected driver's gaze angles and the ground-truth angles at various gaze points inside the car.
Figure 21: Detection effect of existing state-of-the-art gaze-estimation methods on the IRGD dataset proposed in this study, with arrows and lines indicating the gaze direction of the subject predicted by each model.
22 pages, 7968 KiB  
Article
Ship-Fire Net: An Improved YOLOv8 Algorithm for Ship Fire Detection
by Ziyang Zhang, Lingye Tan and Robert Lee Kong Tiong
Sensors 2024, 24(3), 727; https://doi.org/10.3390/s24030727 - 23 Jan 2024
Cited by 7 | Viewed by 2709
Abstract
A ship fire may result in significant structural damage and large economic loss, so the prompt identification of fires is essential for timely reactions and effective mitigation strategies. However, conventional detection systems exhibit limited efficacy and accuracy in detecting targets, mostly due to distance constraints and the motion of ships. Although deep learning algorithms provide a potential solution, the computational complexity of ship fire detection algorithms poses significant challenges. To solve this, this paper proposes a lightweight ship fire detection algorithm based on YOLOv8n. Initially, a dataset including more than 4000 unduplicated images and their labels was established before training; to ensure performance, both fires inside ship rooms and fires on board are considered. After testing, YOLOv8n was selected as the model with the best performance and fastest speed among several advanced object detection algorithms. A GhostNetV2-C2F block is then inserted into the backbone of the algorithm for long-range attention with inexpensive operations. In addition, spatial and channel reconstruction convolution (SCConv) is used to reduce redundant features with significantly lower complexity and computational cost for real-time ship fire detection, and omni-dimensional dynamic convolution is used in the neck for a multi-dimensional attention mechanism, which also lowers the parameter count. After these improvements, a lighter and more accurate YOLOv8n algorithm, called Ship-Fire Net, is proposed. The proposed method exceeds 0.93 in both precision and recall for fire and smoke detection on ships, and its mAP@0.5 reaches about 0.9. Despite the improvement in accuracy, Ship-Fire Net also has fewer parameters and lower FLOPs than the original, which accelerates detection; its FPS reaches 286, which is helpful for real-time ship fire monitoring.
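For orientation, the sketch below shows how a YOLOv8n baseline of this kind is typically fine-tuned and run with the Ultralytics API; the dataset YAML, weights, and hyperparameters are assumptions, and the published Ship-Fire Net modifications (GhostNetV2-C2F, SCConv, omni-dimensional dynamic convolution) are not included.

```python
# Sketch of a YOLOv8n baseline workflow with the Ultralytics API. The dataset YAML,
# image name, and training settings are assumptions; this is not the Ship-Fire Net model.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                                    # COCO-pretrained starting point
model.train(data="ship_fire.yaml", epochs=100, imgsz=640)     # fire/smoke classes in the YAML

results = model("deck_camera_frame.jpg")                      # inference on a single frame
for box in results[0].boxes:
    print(int(box.cls), float(box.conf), box.xyxy.tolist())   # class id, confidence, bbox
```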
Figures:
Figure 1: Structure of the YOLOv8n network.
Figure 2: The architecture of the C2F-GhostNetV2 block.
Figure 3: The architecture of the GhostNetV2 bottleneck and DFC attention.
Figure 4: The architecture of SCConv, integrating an SRU and a CRU.
Figure 5: The architecture of the spatial reconstruction unit (SRU).
Figure 6: The architecture of the channel reconstruction unit (CRU).
Figure 7: Schematic of an omni-dimensional dynamic convolution.
Figure 8: The architecture of the proposed model (Ship-Fire Net).
Figure 9: Pre-processing before labeling using Visual Similarity Duplicate Image Finder.
Figure 10: Examples from the ship fire and smoke dataset (outside).
Figure 11: Examples from the ship fire and smoke dataset (inside).
Figure 12: Visualization of the dataset analysis. (a) Distribution of object centroid locations; (b) distribution of object sizes.
Figure 13: Precision–epoch and recall–epoch curves.
Figure 14: Results of Ship-Fire Net and YOLOv8n on outside images.
Figure 15: Results of Ship-Fire Net and YOLOv8n on inside images.
18 pages, 4863 KiB  
Article
Research on Pedestrian Crossing Decision Models and Predictions Based on Machine Learning
by Jun Cai, Mengjia Wang and Yishuang Wu
Sensors 2024, 24(1), 258; https://doi.org/10.3390/s24010258 - 1 Jan 2024
Cited by 2 | Viewed by 2741
Abstract
Systematically and comprehensively enhancing road traffic safety using artificial intelligence (AI) is of paramount importance and is gradually becoming a crucial framework in smart cities. Within this context, we propose to utilize machine learning (ML) to optimize and improve pedestrian crossing predictions in intelligent transportation systems, where the crossing process is central to pedestrian crossing behavior. Compared with traditional analytical models, the application of OpenCV image recognition and machine learning methods can analyze the mechanisms of pedestrian crossing behavior with greater accuracy, thereby more precisely judging and simulating pedestrian crossing violations. Authentic pedestrian crossing behavior data were extracted from signalized intersection scenarios in Chinese cities, and several machine learning models, including decision trees, multilayer perceptrons, Bayesian algorithms, and support vector machines, were trained and tested. Comparing the various models, the results indicate that the support vector machine (SVM) model exhibited the best accuracy in predicting pedestrian crossing probabilities and speeds, and it can be applied to pedestrian crossing prediction and traffic simulation systems in intelligent transportation.
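A minimal sketch of the kind of SVM classifier compared in the paper is shown below; the feature columns, file name, and hyperparameters are assumptions, not the authors' dataset or settings.

```python
# Sketch: train an SVM to predict whether a pedestrian starts crossing from simple
# kinematic/demographic features. Feature layout and file name are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

# assumed columns: vehicle speed (km/h), gap distance (m), age group, waiting time (s), label
data = np.loadtxt("crossing_features.csv", delimiter=",", skiprows=1)
X, y = data[:, :-1], data[:, -1].astype(int)   # last column assumed: crossed (1) or not (0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))
clf.fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
print("ROC-AUC:", roc_auc_score(y_te, proba))
```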
Figures:
Figure 1. Camera angles at four data collection sites: (a) Shandong Road–Songjiang Road; (b) Hongyun Road–Zhelin Street; (c) Zhangqian Road–Hongjin Road; (d) Huadong Road–Qianshan Road.
Figure 2. Installation process of cameras for data collection.
Figure 3. Image recognition interface.
Figure 4. Vehicle speed and distance statistics: (a) statistics of the elderly; (b) statistics of middle-aged people; (c) statistics of children.
Figure 5. Pedestrian crossing prediction methods and procedures.
Figure 6. Structure diagram of the decision tree.
Figure 7. The structure of the multi-layer perceptron.
Figure 8. ROC curves for each machine learning model: (a) decision tree; (b) SVM; (c) MLP; (d) naïve Bayes.
Figure 9. SHAP analysis conducted on the crossing probability prediction model based on the SVM.
Figure 10. Probability model of pedestrians' crossing behaviors: (a) crossing probability model for the elderly; (b) crossing probability model for middle-aged adult pedestrians; (c) crossing probability model for children.
Figure 11. SHAP analysis based on the support vector regression (SVR) model.
Figure 12. Crossing speed model of the pedestrians: (a) crossing speeds of elderly individuals; (b) crossing speeds of middle-aged individuals; (c) crossing speeds of children.
19 pages, 5724 KiB  
Article
Image-Enhanced U-Net: Optimizing Defect Detection in Window Frames for Construction Quality Inspection
by Jorge Vasquez, Tomotake Furuhata and Kenji Shimada
Buildings 2024, 14(1), 3; https://doi.org/10.3390/buildings14010003 - 19 Dec 2023
Viewed by 1735
Abstract
Ensuring the structural integrity of window frames and detecting subtle defects, such as dents and scratches, is crucial for maintaining product quality. Traditional machine vision systems struggle with defect identification, especially on reflective materials and in varied environments. Modern machine learning and deep learning (DL) systems hold promise for post-installation inspections but are limited by data scarcity and environmental variability. Our study introduces an approach that enhances DL-based defect detection even with limited data. We present a comprehensive window frame defect detection framework incorporating optimized image enhancement, data augmentation, and a core U-Net model. We constructed five datasets using cell phones and the Spot Robot for autonomous inspection and evaluated our approach across various scenarios and lighting conditions in real-world window frame inspections. Our results show significant improvements over the standard U-Net model, with a 7.43% increase in the F1 score and a 15.1% increase in IoU. The approach enhances defect detection capabilities even in challenging real-world conditions; applying the methodology across a broader range of construction sites would further strengthen its generalizability.
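One of the image-enhancement techniques the framework evaluates is CLAHE (see Figure 12 below). The sketch below shows a generic OpenCV CLAHE preprocessing step as a hedged illustration; the file names and parameter values are placeholders, not the authors' actual settings or pipeline.

```python
import cv2

def clahe_enhance(bgr_image, clip_limit=2.0, tile_grid=(8, 8)):
    # Apply CLAHE to the lightness channel only, so colour is preserved.
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    l_eq = clahe.apply(l)
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)

frame = cv2.imread("window_frame.jpg")        # placeholder input path
enhanced = clahe_enhance(frame)
cv2.imwrite("window_frame_clahe.jpg", enhanced)
```

Such a step would sit in the preprocessing module of the framework, before the U-Net segmentation stage described in the abstract.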
Figures:
Figure 1. The framework of the window frame defect detection system (WFDD). The input comprises RGB images captured by the Spot Robot. The data augmentation module employs geometric operations and applies different image enhancement techniques. The preprocessing module is then employed to enhance the performance of the defect detection model. Within the detection module, defects are identified among all detected window frames, with the output showcasing U-Net-generated segmentation blobs.
Figure 2. Example from the Cellphone Dataset.
Figure 3. Samples of the Construction Site Dataset.
Figure 4. Example from the Lab-1 Dataset.
Figure 5. Example from the Lab-2 Dataset.
Figure 6. Samples of the Demo Site Dataset.
Figure 7. Example of labeling.
Figure 8. Comparative sample using the shadow removal technique.
Figure 9. Comparative sample using the color neutralization technique.
Figure 10. Comparative sample using the contrast enhancement technique.
Figure 11. Comparative sample using the intensity neutralization technique.
Figure 12. Comparative sample using the CLAHE technique.
16 pages, 5787 KiB  
Article
The Spatio-Temporal Patterns of Regional Development in Shandong Province of China from 2012 to 2021 Based on Nighttime Light Remote Sensing
by Hongli Zhang, Quanzhou Yu, Yujie Liu, Jie Jiang, Junjie Chen and Ruyun Liu
Sensors 2023, 23(21), 8728; https://doi.org/10.3390/s23218728 - 26 Oct 2023
Cited by 4 | Viewed by 2251
Abstract
Shandong is a major coastal economic province in eastern China, and clarifying its recent spatio-temporal patterns of regional development is of great significance for supporting high-quality regional development. Nighttime light remote sensing data can reveal the spatio-temporal patterns of social and economic activities at a fine pixel scale. Based on monthly nighttime light remote sensing data and social statistics, we analyzed nighttime light patterns at three spatial scales, covering the province's three geographical regions as well as its cities and counties over the last 10 years, using trend analysis, stability analysis, and correlation analysis. The results show that: (1) The nighttime light pattern was generally consistent with the spatial pattern of construction land; the nighttime light intensity of most urban built-up areas increased, while the old urban areas of Qingdao and Yantai weakened. (2) At the geographical unit scale, the total nighttime light in south-central Shandong was significantly higher than that in eastern and northwest Shandong, while northwest Shandong had the highest nighttime light growth rate. At the urban scale, Liaocheng had the highest nighttime light growth rate. At the county scale, counties with stronger economies had lower nighttime light growth rates, while economically lagging counties had higher growth rates. (3) Nighttime light growth was significantly correlated with Gross Domestic Product (GDP) and population growth, indicating that regional economic development and population growth were the main drivers of nighttime light change.
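A common way to implement the trend analysis mentioned above is a per-pixel linear regression of radiance against time, giving a slope map of brightening and dimming areas. The sketch below is an illustrative NumPy version run on random data; the array shapes and the least-squares slope formulation are assumptions for the example, not the authors' exact processing chain.

```python
import numpy as np

def nightlight_trend(stack):
    """stack: array of shape (T, H, W) holding monthly radiance composites.
    Returns the per-pixel least-squares slope over time (radiance per month)."""
    t = np.arange(stack.shape[0], dtype=float)
    t_mean = t.mean()
    x = stack.reshape(stack.shape[0], -1)            # flatten to (T, H*W)
    x_mean = x.mean(axis=0)
    slope = ((t - t_mean) @ (x - x_mean)) / ((t - t_mean) ** 2).sum()
    return slope.reshape(stack.shape[1:])            # positive = brightening

# Random stand-in for roughly April 2012 to October 2021 (about 115 months).
stack = np.random.rand(115, 200, 300).astype(np.float32)
trend = nightlight_trend(stack)
print(trend.shape, float(trend.mean()))
```

Stability could then be summarized, for instance, by the coefficient of variation of each pixel's time series, and correlation analysis by relating city- or county-level light totals to GDP and population statistics.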
Figures:
Figure 1. Data processing flow chart.
Figure 2. Land cover in Shandong Province in 2020.
Figure 3. Spatial pattern of mean nighttime light in Shandong Province from April 2012 to October 2021.
Figure 4. Spatio-temporal changes in nighttime light in Shandong Province from April 2012 to October 2021.
Figure 5. Key areas of nighttime light change in Shandong Province from April 2012 to October 2021.
Figure 6. Stability pattern of nighttime light in Shandong Province from April 2012 to October 2021.
17 pages, 45348 KiB  
Article
Enhanced 3D Pose Estimation in Multi-Person, Multi-View Scenarios through Unsupervised Domain Adaptation with Dropout Discriminator
by Junli Deng, Haoyuan Yao and Ping Shi
Sensors 2023, 23(20), 8406; https://doi.org/10.3390/s23208406 - 12 Oct 2023
Cited by 1 | Viewed by 1389
Abstract
Data-driven pose estimation methods often assume equal distributions between training and test data. However, in reality, this assumption does not always hold true, leading to significant performance degradation due to distribution mismatches. In this study, our objective is to enhance the cross-domain robustness of multi-view, multi-person 3D pose estimation. We tackle the domain shift challenge through three key approaches: (1) A domain adaptation component is introduced to improve estimation accuracy for specific target domains. (2) By incorporating a dropout mechanism, we train a more reliable model tailored to the target domain. (3) Transferable Parameter Learning is employed to retain crucial parameters for learning domain-invariant data. The foundation for these approaches lies in the H-divergence theory and the lottery ticket hypothesis, which are realized through adversarial training by learning domain classifiers. Our proposed methodology is evaluated using three datasets: Panoptic, Shelf, and Campus, allowing us to assess its efficacy in addressing domain shifts in multi-view, multi-person pose estimation. Both qualitative and quantitative experiments demonstrate that our algorithm performs well in two different domain shift scenarios.
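Adversarial training of a domain classifier, as described above, is commonly implemented with a gradient reversal layer. The sketch below illustrates that general idea in PyTorch, together with randomly skipping a discriminator to loosely mimic the dropout mechanism; it is a toy illustration under stated assumptions, and the dimensions, probabilities, and class names are not the authors' Domain Adaptive VoxelPose implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lam in the
    backward pass, so the feature extractor learns domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class DomainClassifier(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 2))   # source vs. target domain

    def forward(self, feats, lam=1.0):
        return self.net(GradReverse.apply(feats, lam))

# One toy adversarial step; the discriminator is dropped with some probability.
feats = torch.randn(8, 256, requires_grad=True)       # placeholder pose features
domain_labels = torch.randint(0, 2, (8,))
clf = DomainClassifier()
if torch.rand(1).item() > 0.3:                         # keep this adversary w.p. 0.7
    loss = nn.functional.cross_entropy(clf(feats), domain_labels)
    loss.backward()                                    # reversed gradient reaches feats
```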
Figures:
Figure 1. Depiction of various datasets utilized for multi-view, multi-person 3D pose estimation. Image examples are sourced from Panoptic [9], Campus [10], and Shelf [10], respectively. While all datasets feature scenes with clean backgrounds, they differ in aspects such as clothing, resolution, lighting, body size, and more. These visual disparities among the datasets complicate the task of applying pose estimation models across different domains.
Figure 2. An overview of our Domain Adaptive VoxelPose model. An adversarial training method is used to train the domain classifier. The selection of certain discriminators is determined by a probability δ_k. The network performs a robust positive update for the transferable parameters and a negative update for the untransferable parameters.
Figure 3. The original adversarial framework (a) is extended to incorporate multiple adversaries. In this enhancement, certain discriminators are probabilistically omitted (b), so that only a random subset of feedback (depicted by the arrows) is used by the feature extractor at the end of each batch.
Figure 4. Estimated 3D poses and their corresponding images in an outdoor environment (Campus dataset). Different colors represent different detected people. The penultimate column is the output of the original VoxelPose, which misestimated a person; the last column shows the 3D poses estimated by our algorithm.
Figure 5. Cross-domain qualitative comparison between our method and other state-of-the-art multi-view, multi-person 3D pose estimation algorithms. The evaluated methods were trained on the Panoptic dataset and validated on the Campus dataset. Different colors represent different detected people, with red indicating the ground truth.
Figure 6. Estimated 3D poses and their corresponding images in an indoor social interaction environment (Shelf dataset). The penultimate column is the output of the original VoxelPose, which misestimated a person; the last column shows the 3D poses estimated by our algorithm.
Figure 7. Cross-domain qualitative comparison between our method and other state-of-the-art multi-view, multi-person 3D pose estimation algorithms on the Shelf dataset. The evaluated methods were trained on the Panoptic dataset and validated on the Shelf dataset.
Figure 8. The Average Percentage of Correct Parts (PCP3D) on the Campus and Shelf datasets, with the dropout rate (d) on the horizontal axis and PCP3D on the vertical axis. The methods are distinguished by color: the red line is the DA baseline method, the yellow line the dropout DA method, and the blue line our proposed full method with TransPar.
Figure 9. The Average Percentage of Correct Parts (PCP3D) for wider ratios of transferable parameters on the Campus and Shelf datasets.