Search Results (4,875)

Search Parameters:
Keywords = supervised learning

14 pages, 1739 KiB  
Article
Older Adult Fall Risk Prediction with Deep Learning and Timed Up and Go (TUG) Test Data
by Josu Maiora, Chloe Rezola-Pardo, Guillermo García, Begoña Sanz and Manuel Graña
Bioengineering 2024, 11(10), 1000; https://doi.org/10.3390/bioengineering11101000 - 5 Oct 2024
Abstract
Falls are a major health hazard for older adults; therefore, in the context of an aging population, predicting the risk of a patient suffering falls in the near future is of great importance for health care systems. Currently, the standard prospective fall risk assessment instrument relies on a set of clinical and functional mobility assessment tools, one of them being the Timed Up and Go (TUG) test. Recently, wearable inertial measurement units (IMUs) have been proposed to capture motion data that allow for the building of estimates of fall risk. The hypothesis of this study is that the data gathered from IMU readings while the patient performs the TUG test can be used to build a predictive model that provides an estimate of the probability of suffering a fall in the near future, i.e., assessing prospective fall risk. This study applies deep learning convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to build such predictive models from features extracted from IMU data acquired during TUG test realizations. Data were obtained from a cohort of 106 older adults wearing wireless IMU sensors sampled at 100 Hz while performing the TUG test. The dependent variable is a binary variable that is true if the patient suffered a fall in the six-month follow-up period; it was used as the output variable for the supervised training and validation of the deep learning architectures and competing machine learning approaches. A hold-out validation process using 75 subjects for training and 31 subjects for testing was repeated one hundred times to obtain robust estimates of model performance. At each repetition, 5-fold cross-validation was carried out to select the best model over the training subset. The best results were achieved by a bidirectional long short-term memory (BLSTM) network, obtaining an accuracy of 0.83 and an AUC of 0.73 with good sensitivity and specificity values. Full article
Figure 1: The process of the realization of the TUG test decomposed into six phases.
Figure 2: An instance of the readings of the G-walk during a TUG test realization shown as raw data plots: (a) triaxial accelerometer, and (b) triaxial gyroscope.
Figure 3: Number of samples per recorded IMU sequence during the realization of TUG tests, sorted in ascending order.
Figure 4: Box plot of each phase duration in the TUG test. The median, upper and lower quartiles, and maximum and minimum values are shown.
Figure 5: Univariate chi-square test importance ranking of the TUG test phase input variables used by conventional machine learning classifiers.
Figure 6: ROC curve with point-wise confidence bounds for an instance of the 5-fold cross-validation of the BLSTM architecture. The dashed line represents the chance ROC.
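As a rough illustration of the evaluation protocol this abstract describes (one hundred random 75/31 hold-out splits over 106 subjects), here is a minimal pure-Python sketch. The majority-class "model" and the fall labels are placeholders, not the study's BLSTM or data; the study additionally runs an inner 5-fold cross-validation on each training subset for model selection.

```python
import random
from statistics import mean

def majority_class(labels):
    # Placeholder "model": always predicts the most frequent training class.
    # (The study trains a BLSTM here and selects it via inner 5-fold CV.)
    return max(set(labels), key=labels.count)

def repeated_holdout(y, n_train=75, repeats=100, seed=0):
    # 100 random 75/31 hold-out splits, as in the paper's validation protocol.
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        pred = majority_class([y[i] for i in train])
        scores.append(mean(1.0 if y[i] == pred else 0.0 for i in test))
    return mean(scores)

# Hypothetical binary fall labels for a 106-subject cohort (30 fallers).
y = [1] * 30 + [0] * 76
print(round(repeated_holdout(y), 3))
```

Repeating the split and averaging gives a more robust performance estimate than a single hold-out, which is the point of the protocol.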
30 pages, 585 KiB  
Article
Decoding Urban Intelligence: Clustering and Feature Importance in Smart Cities
by Enrico Barbierato and Alice Gatti
Future Internet 2024, 16(10), 362; https://doi.org/10.3390/fi16100362 - 5 Oct 2024
Abstract
The rapid urbanization trend underscores the need for effective management of city resources and services, making the concept of smart cities increasingly important. This study leverages the IMD Smart City Index (SCI) dataset to analyze and rank smart cities worldwide. Our research has a dual objective: first, we apply a set of unsupervised learning models to cluster cities based on their smartness indices; second, we employ supervised learning models such as random forest, support vector machines (SVMs), and others to determine the importance of the various features that contribute to a city’s smartness. Our findings reveal that smart living was the most critical factor, with an importance of 0.259014; smart mobility and smart environment also played significant roles, with importances of 0.170147 and 0.163159, respectively, in determining a city’s smartness. While the clustering provides insights into the similarities and groupings among cities, the feature importance analysis elucidates the critical factors that drive these classifications. The integration of these two approaches demonstrates that understanding the similarities between smart cities is of limited utility without a clear comprehension of the importance of the underlying features. This holistic approach provides a comprehensive understanding of what makes a city ’smart’ and offers a robust framework for policymakers to enhance urban living standards. Full article
(This article belongs to the Special Issue Machine Learning for Blockchain and IoT Systems in Smart City)
Figure 1: Percentage of urban population between 1950 and 2050.
Figure 2: Visual representation of the work’s structure.
Figure 3: k-means clustering for k = 5.
Figure 4: Dendrogram of smart city indices.
Figure 5: Gaussian mixture model (GMM) clustering of smart city indices.
Figure 6: Self-organizing map (SOM) clustering of smart city indices with convex hulls.
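To make the clustering half of this study concrete, here is a minimal one-dimensional Lloyd's k-means sketch in pure Python. The "smartness scores" are hypothetical illustrative values, not the SCI data, and a real analysis would cluster on the full multivariate index vectors.

```python
import random

def kmeans_1d(xs, k, iters=100, seed=0):
    # Lloyd's algorithm on scalar "smartness indices" (illustrative only).
    rng = random.Random(seed)
    centers = rng.sample(xs, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in xs:
            j = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[j].append(x)
        # Recompute centers as cluster means; keep old center if empty.
        new = [sum(c) / len(c) if c else centers[j]
               for j, c in enumerate(clusters)]
        if new == centers:  # converged
            break
        centers = new
    return sorted(centers), clusters

# Hypothetical composite smartness scores for nine cities.
scores = [0.91, 0.88, 0.72, 0.70, 0.69, 0.41, 0.39, 0.86, 0.44]
centers, clusters = kmeans_1d(scores, k=3)
print(centers)
```

Pairing cluster membership with a supervised feature-importance model (as the paper does with random forests) then explains *why* cities land in the groups they do.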
13 pages, 363 KiB  
Article
Semi-Supervised Learning with Close-Form Label Propagation Using a Bipartite Graph
by Zhongxing Peng, Gengzhong Zheng and Wei Huang
Symmetry 2024, 16(10), 1312; https://doi.org/10.3390/sym16101312 - 4 Oct 2024
Abstract
In this paper, we introduce an efficient and effective algorithm for Graph-based Semi-Supervised Learning (GSSL). Unlike other GSSL methods, our proposed algorithm achieves efficiency by constructing a bipartite graph, which connects a small number of representative points to a large volume of raw data while capturing their underlying manifold structures. This bipartite graph, with a sparse, symmetric, anti-diagonal affinity matrix, serves as a low-rank approximation of the original graph. Consequently, our algorithm accelerates both the graph construction and label propagation steps. In particular, on the one hand, our algorithm computes the label propagation in closed form, reducing its computational complexity from cubic to approximately linear with respect to the number of data points; on the other hand, it calculates the soft label matrix for unlabeled data using a closed-form solution, gaining additional acceleration. Comprehensive experiments performed on six real-world datasets demonstrate the efficiency and effectiveness of our algorithm in comparison to five state-of-the-art algorithms. Full article
(This article belongs to the Section Computer)
Figure 1: Bipartite graph construction in the GSSL problem. (a) The original fully connected graph comprises 8 data points (circles) and 28 edges (lines). (b) The bipartite graph connects two distinct sets: one set includes three representative points (squares) labeled ‘A’, ‘B’, and ‘C’, while the other contains the 8 data points. Each representative point is associated with several data points of the same color. (c) Label propagation can be performed across the 3 representative points and along the 3 connecting edges.
Figure 2: Comparisons between the proposed method and baseline methods on the MNIST dataset, with the quantity of labeled data ranging from 20 to 4500. Left: accuracy. Right: running time.
Figure 3: The accuracy of AGR and the proposed SSLCFBG for varying numbers of representative points on the USPS dataset, with the number of labeled data set at 20.
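The core idea, propagating labels through a small set of representative (anchor) points rather than a full n-by-n graph, can be sketched as a toy two-step averaging over a bipartite affinity matrix. This is an illustration of the structure only, not the paper's exact closed-form solution; the affinities `Z` and the labels below are made up.

```python
def propagate_bipartite(Z, y_labeled, labeled_idx, classes=2):
    # Z[i][a]: affinity between data point i and representative (anchor) a.
    n, m = len(Z), len(Z[0])
    # Step 1: each anchor aggregates the labels of its labeled neighbours.
    anchor = [[0.0] * classes for _ in range(m)]
    for i, c in zip(labeled_idx, y_labeled):
        for a in range(m):
            anchor[a][c] += Z[i][a]
    for a in range(m):  # normalize to label distributions
        s = sum(anchor[a]) or 1.0
        anchor[a] = [v / s for v in anchor[a]]
    # Step 2: each point's soft label is the affinity-weighted anchor mix.
    soft = []
    for i in range(n):
        s = sum(Z[i]) or 1.0
        soft.append([sum(Z[i][a] * anchor[a][c] for a in range(m)) / s
                     for c in range(classes)])
    return soft

# Toy graph: 4 points, 2 anchors; points 0 and 3 are labeled (classes 0, 1).
Z = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
soft = propagate_bipartite(Z, y_labeled=[0, 1], labeled_idx=[0, 3])
print([max(range(2), key=row.__getitem__) for row in soft])  # → [0, 0, 1, 1]
```

Because everything routes through m anchors with m much smaller than n, the cost scales with n·m rather than n², which is the source of the speed-up the abstract claims.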
17 pages, 2472 KiB  
Article
Prediction of Energy Efficiency for Residential Buildings Using Supervised Machine Learning Algorithms
by Tahir Mahmood and Muhammad Asif
Energies 2024, 17(19), 4965; https://doi.org/10.3390/en17194965 - 4 Oct 2024
Abstract
In the era of digitalization, the wide availability of data and innovations in machine learning algorithms provide new potential to improve the prediction of energy efficiency in buildings. Building sector research in the Kingdom of Saudi Arabia (KSA) lacks actual/measured data-based studies, as the existing studies are predominantly modeling-based. The results of simulation-based studies can deviate from the actual energy performance of buildings due to several factors. A clearer understanding of building energy performance can be better established through actual data-based analysis. This study aims to predict the energy efficiency of residential buildings in the KSA using supervised machine learning algorithms. It analyzes residential energy trends through data collected from an energy audit of 200 homes. It predicts energy efficiency using five supervised machine learning algorithms: ridge regression, least absolute shrinkage and selection operator (LASSO) regression, a least angle regression (LARS) model, a LASSO-LARS model, and an elastic net regression (ENR) model. It also explores the most significant explanatory energy efficiency variables. The results reveal that the ENR model outperforms the other models in predicting energy consumption. This study offers a new and promising avenue for the research community and other building sector stakeholders, especially regulators and policymakers. Full article
(This article belongs to the Special Issue Climate Change and Sustainable Energy Transition)
Figure 1: Energy-saving potential in buildings.
Figure 2: A paradigm of the processes used to implement predictive modeling.
Figure 3: Radar plots of actual values (green line) and predicted values (dotted red line) from (a) ridge regression, (b) LASSO regression, (c) the LARS model, (d) the LASSO-LARS model, and (e) the ENR model.
Figure 4: A radar chart based on the coefficients of the ENR model; zero is marked by the green line and the regression coefficients by the dotted red line.
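Elastic net regression, the best performer here, combines the L1 penalty of LASSO with the L2 penalty of ridge. A minimal pure-Python sketch by (sub)gradient descent follows; the feature names and values are hypothetical stand-ins for audit data, and a production fit would use a coordinate-descent solver such as scikit-learn's `ElasticNet`.

```python
def elastic_net_gd(X, y, alpha=0.01, l1_ratio=0.5, lr=0.02, epochs=3000):
    # Minimizes MSE + alpha * (l1_ratio * |w|_1 + (1 - l1_ratio)/2 * |w|_2^2)
    # by plain (sub)gradient descent -- a sketch, not a production solver.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        preds = [b + sum(wj * xj for wj, xj in zip(w, row)) for row in X]
        err = [p - yi for p, yi in zip(preds, y)]
        b -= lr * 2 * sum(err) / n
        for j in range(d):
            grad = 2 * sum(e * row[j] for e, row in zip(err, X)) / n
            sign = (w[j] > 0) - (w[j] < 0)          # subgradient of |w_j|
            grad += alpha * (l1_ratio * sign + (1 - l1_ratio) * w[j])
            w[j] -= lr * grad
    return w, b

# Hypothetical audit features per home: (floor area, AC hours), both scaled.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0]]
y = [3.0, 3.0, 6.0, 6.0]   # made-up consumption values (here y = x1 + x2)
w, b = elastic_net_gd(X, y)
```

The L1 term drives uninformative coefficients toward zero (variable selection), while the L2 term stabilizes correlated features, which is why ENR often wins on small tabular datasets like a 200-home audit.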
19 pages, 3076 KiB  
Article
Three-Stage Recursive Learning Technique for Face Mask Detection on Imbalanced Datasets
by Chi-Yi Tsai, Wei-Hsuan Shih and Humaira Nisar
Mathematics 2024, 12(19), 3104; https://doi.org/10.3390/math12193104 - 4 Oct 2024
Abstract
In response to the COVID-19 pandemic, governments worldwide have implemented mandatory face mask regulations in crowded public spaces, making the development of automatic face mask detection systems critical. To achieve robust face mask detection performance, a high-quality and comprehensive face mask dataset is required. However, due to the difficulty of obtaining face samples with masks in the real world, public face mask datasets are often imbalanced, leading to the data imbalance problem in model training and negatively impacting detection performance. To address this problem, this paper proposes a novel recursive model-training technique designed to improve detection accuracy on imbalanced datasets. The proposed method recursively splits and merges the dataset based on the attribute characteristics of the different classes, enabling more balanced and effective model training. Our approach demonstrates that the carefully designed splitting and merging of datasets can significantly enhance model-training performance. The method was evaluated using two imbalanced datasets. The experimental results show that the proposed recursive learning technique achieves a percentage increase (PI) of 84.5% in mean average precision (mAP@0.5) on the Kaggle dataset and of 186.3% on the Eden dataset compared to traditional supervised learning. Additionally, when combined with existing oversampling techniques, the PI on the Kaggle dataset further increases to 88.9%, highlighting the potential of the proposed method for improving detection accuracy on highly imbalanced datasets. Full article
(This article belongs to the Special Issue Advances in Algorithm Design and Machine Learning)
Figure 1: Three conditions of face mask-wearing: (a) correct mask-wearing, (b) no mask-wearing, and (c) incorrect mask-wearing.
Figure 2: Comparison of (a) traditional supervised learning and (b) the proposed recursive learning method, which incorporates dataset manipulation into the model-training process to train the model recursively.
Figure 3: Concept of the proposed dataset split-and-merge processing for recursive learning.
Figure 4: Illustration of the proposed three-stage recursive learning method combined with dataset split-and-merge processing.
Figure 5: Flowchart of the proposed recursive learning method.
Figure 6: Illustration of the distances C and D between the ground-truth bounding box A and the predicted bounding box B.
Figure 7: Experimental results of (a) supervised learning and (b) the proposed Over-R-S1S2S3 learning method on the Kaggle test set, along with (c,d) the corresponding zoomed-in results.
Figure 8: Experimental results of (a) supervised learning and (b) the proposed Over-R-S1S2S3 learning method on the Kaggle test set, along with (c,d) the corresponding zoomed-in results.
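Two small building blocks from this abstract are easy to make concrete: the percentage increase (PI) metric and the naive random-oversampling baseline the method is combined with. The mAP values and class labels below are hypothetical; the paper reports only the PI figures, not the raw scores.

```python
import random

def percentage_increase(baseline, improved):
    # PI metric used to compare mAP@0.5 of recursive vs. traditional training.
    return (improved - baseline) / baseline * 100.0

def oversample(samples, labels, seed=0):
    # Naive random oversampling: duplicate minority-class samples until all
    # classes match the majority-class count (a stand-in for the oversampling
    # techniques the paper combines with recursive learning).
    rng = random.Random(seed)
    by_class = {}
    for s, c in zip(samples, labels):
        by_class.setdefault(c, []).append(s)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for c, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced += [(s, c) for s in items + extra]
    return balanced

# Hypothetical: 6 "correct mask" vs. 2 "incorrect mask" samples.
data = oversample(["a", "b", "c", "d", "e", "f", "g", "h"],
                  [0, 0, 0, 0, 0, 0, 1, 1])
print(round(percentage_increase(0.40, 0.738), 1))  # → 84.5
```

Oversampling only duplicates what is already there; the split-and-merge recursion instead reshapes which class combinations each training stage sees, which is why the two are complementary.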
21 pages, 4783 KiB  
Article
CECL-Net: Contrastive Learning and Edge-Reconstruction-Driven Complementary Learning Network for Image Forgery Localization
by Gaoyuan Dai, Kai Chen, Linjie Huang, Longru Chen, Dongping An, Zhe Wang and Kai Wang
Electronics 2024, 13(19), 3919; https://doi.org/10.3390/electronics13193919 - 3 Oct 2024
Abstract
While most current image forgery localization (IFL) deep learning models focus primarily on the foreground of tampered images, they often neglect the essential complementary background semantic information. This oversight tends to create significant gaps in these models’ ability to thoroughly interpret and understand a tampered image, thereby limiting their effectiveness in extracting critical tampering traces. Given the above, this paper presents a novel contrastive learning and edge-reconstruction-driven complementary learning network (CECL-Net) for image forgery localization. CECL-Net enhances the understanding of tampered images by employing a complementary learning strategy that leverages foreground and background features, where a unique edge extractor (EE) generates precise edge artifacts, and edge-guided feature reconstruction (EGFR) utilizes the edge artifacts to reconstruct a fully complementary set of foreground and background features. To carry out the complementary learning process more efficiently, we also introduce a pixel-wise contrastive supervision (PCS) method that attracts consistent regions in features while repelling different regions. Moreover, we propose a dense fusion (DF) strategy that utilizes multi-scale and mutual attention mechanisms to extract more discriminative features and improve the representational power of CECL-Net. Experiments conducted on two benchmark datasets, one Artificial Intelligence (AI)-manipulated dataset and two real challenge datasets, indicate that our CECL-Net outperforms seven state-of-the-art models on three evaluation metrics. Full article
(This article belongs to the Special Issue Image Processing Based on Convolution Neural Network)
Figure 1: Three challenging images for image forgery localization (IFL). From left to right: (a) input images, (b) CECL-Net without background supervision, (c) CECL-Net, and (d) ground truths.
Figure 2: Overview of CECL-Net: Pyramid Vision Transformer (PVT) for feature extraction (Section 3.2); dense fusion (DF) for multi-scale feature fusion (Section 3.3); edge extractor (EE) for edge-artifact extraction (Section 3.4); edge-guided feature reconstruction (EGFR) for foreground and background feature reconstruction (Section 3.5).
Figure 3: Illustration of dense fusion (DF). DF consists of two components (Section 3.3): a multi-scale feature extractor (MFE) and the dense interaction of multi-layer features. The initial stage concentrates on multi-scale feature extraction, while the second stage involves the dense interaction and fusion of multi-layer features.
Figure 4: Illustration of the edge extractor (EE), which uses a multi-scale strategy to obtain accurate edge artifacts (Section 3.4).
Figure 5: Illustration of edge-guided feature reconstruction (EGFR), which uses the edge artifacts extracted by the EE to guide the reconstruction of complementary foreground and background features (Section 3.5).
Figure 6: Illustration of pixel-wise contrastive supervision (PCS), which enhances the contrast between the tampered and real regions while maintaining their uniform distributions (Section 3.5).
Figure 7: Qualitative results of different IFL methods under five challenging scenes selected from different datasets (Section 4.2(2)).
Figure 8: Qualitative results of different schemes; removing any module from CECL-Net causes performance degradation (Section 4.2(3)).
Figure 9: A visualization of the effectiveness of different edge-extraction strategies. In_MVSS stands for the edge-extraction method in MVSS-Net, and In_ET refers to the edge-extraction method in ET-Net.
Figure 10: A visualization of the effectiveness of different contrastive supervision strategies. In_CFL stands for the contrastive supervision method in CFL-Net.
Figure 11: Robustness analysis of the proposed method and the seven other SOTA methods on the NIST dataset: (a) resilience against JPEG compression, (b) resistance to Gaussian filtering, (c) performance under image scaling, and (d) sensitivity to image sharpening.
Figure 12: Failure cases. Data source: IMD.
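The pixel-wise contrastive idea, attracting feature pairs from the same region (tampered or real) and repelling pairs from different regions, can be illustrated with a toy margin-based energy in pure Python. This is a sketch of the general principle only, not the paper's PCS loss; the features and mask below are made up.

```python
import math

def contrastive_energy(feats, mask, margin=1.0):
    # Toy pixel-wise contrastive objective: pull same-region feature pairs
    # together (squared distance) and push different-region pairs apart up
    # to a margin. Lower energy = better-separated tampered/real regions.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    attract = repel = 0.0
    na = nr = 0
    n = len(feats)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(feats[i], feats[j])
            if mask[i] == mask[j]:
                attract += d ** 2
                na += 1
            else:
                repel += max(0.0, margin - d) ** 2
                nr += 1
    return attract / max(na, 1) + repel / max(nr, 1)

# 4 "pixels" with 2-D features; mask: 1 = tampered, 0 = real.
feats = [[0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
mask = [0, 0, 1, 1]
print(round(contrastive_energy(feats, mask), 3))
```

Features that cluster by region score near zero, while features that mix the two regions score high, which is the training signal PCS contributes on top of the segmentation loss.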
15 pages, 473 KiB  
Article
Applying Multi-CLASS Support Vector Machines: One-vs.-One vs. One-vs.-All on the UWF-ZeekDataFall22 Dataset
by Rocio Krebs, Sikha S. Bagui, Dustin Mink and Subhash C. Bagui
Electronics 2024, 13(19), 3916; https://doi.org/10.3390/electronics13193916 - 3 Oct 2024
Abstract
This study investigates the technical challenges of applying Support Vector Machines (SVM) for multi-class classification in network intrusion detection using the UWF-ZeekDataFall22 dataset, which is labeled based on the MITRE ATT&CK framework. A key challenge lies in handling imbalanced classes and complex attack patterns, which are inherent in intrusion detection data. This work highlights the difficulties in implementing SVMs for multi-class classification, particularly with One-vs.-One (OvO) and One-vs.-All (OvA) methods, including scalability issues due to the large volume of network traffic logs and the tendency of SVMs to be sensitive to noisy data and class imbalances. SMOTE was used to address class imbalances, while preprocessing techniques were applied to improve feature selection and reduce noise in the data. The unique structure of network traffic data, with overlapping patterns between attack vectors, posed significant challenges in achieving accurate classification. Our model reached an accuracy of over 90% with OvO and over 80% with OvA, demonstrating that despite these challenges, multi-class SVMs can be effectively applied to complex intrusion detection tasks when combined with appropriate balancing and preprocessing techniques. Full article
(This article belongs to the Special Issue Machine Learning and Cybersecurity—Trends and Future Challenges)
Figure 1: Distribution of label_tactic.
Figure 2: Final distribution of label_tactic after balancing, reduced to the most critical classes for intrusion detection.
Figure 3: Confusion matrix for OvO.
Figure 4: Confusion matrix for OvA.
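The OvO/OvA trade-off the abstract discusses comes down to how many binary SVMs are trained and how their outputs are combined. The sketch below shows the classifier counts and OvO majority voting; the four tactic labels and the pairwise winners are hypothetical, not the UWF-ZeekDataFall22 label set.

```python
def n_binary_classifiers(k):
    # OvO trains one SVM per class pair; OvA trains one per class.
    return {"OvO": k * (k - 1) // 2, "OvA": k}

def ovo_vote(pairwise_winners):
    # pairwise_winners: {(class_a, class_b): winner} from each binary SVM.
    # The final prediction is the class winning the most pairwise duels.
    votes = {}
    for winner in pairwise_winners.values():
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical 4-class example (labels stand in for MITRE ATT&CK tactics).
tactics = ["recon", "exec", "persist", "exfil"]
print(n_binary_classifiers(len(tactics)))  # {'OvO': 6, 'OvA': 4}

winners = {("recon", "exec"): "recon", ("recon", "persist"): "recon",
           ("recon", "exfil"): "recon", ("exec", "persist"): "exec",
           ("exec", "exfil"): "exfil", ("persist", "exfil"): "persist"}
print(ovo_vote(winners))  # recon
```

OvO's quadratic classifier count is why it scales poorly with many classes, but each binary problem sees only two classes' data, which can help under the class imbalance this study wrestles with.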
15 pages, 1401 KiB  
Article
Enhancing Anomaly Detection in Maritime Operational IoT Time Series Data with Synthetic Outliers
by Hyunjoo Kim and Inwhee Joe
Electronics 2024, 13(19), 3912; https://doi.org/10.3390/electronics13193912 - 3 Oct 2024
Abstract
Detecting anomalies in engine and machinery data during ship operations is crucial for maintaining the safety and efficiency of the vessel. We conducted experiments using device data from the maritime industry, consisting of time series records from IoT (Internet of Things) datasets such as cylinder and exhaust gas temperatures, coolant temperatures, and cylinder pressures collected from various sensors on the ship’s equipment. We propose data enrichment and validation techniques by generating synthetic outliers through data degradation and data augmentation with a Transformer backbone, utilizing the maritime operational data. We extract a portion of the input data and replace it with synthetic outliers. The created anomaly data are then used to train the model via a self-supervised learning approach. Synthetic outliers are generated using methods such as the arithmetic mean, geometric mean, median, local scale, global scale, and magnitude warping. With our methodology, we achieved a 17.23% improvement in F1 performance compared to existing state-of-the-art methods across five publicly available datasets and actual maritime operational data collected from the industry. Full article
(This article belongs to the Special Issue Empowering IoT with AI: AIoT for Smart and Autonomous Systems)
Figure 1: Different types of anomalies in time series data: (red) global anomaly, (green) contextual anomaly, (blue) seasonal anomaly, (purple) trend anomaly, (orange) shapelet anomaly, and (black) original data.
Figure 2: Anomaly detection framework for maritime operational data. Outliers are generated by the synthetic outlier generation module and used to train the Transformer backbone model, which produces an F1 score. The best model is determined through voting and then used for anomaly detection.
Figure 3: (a) Important engine-related data from ship equipment; (b) the model selection method used to choose the best model for anomaly detection.
Figure 4: Synthetic outlier types for anomaly detection using the arithmetic mean and geometric mean.
Figure 5: Synthetic outlier types for anomaly detection using the median and local scaling.
Figure 6: Synthetic outlier types for anomaly detection using global scaling and magnitude warping.
Figure 7: F1 score compared with SOTA. The blue bar represents AnomalyBERT, the orange bar CARLA, and the green bar the F1 score of the proposed method.
Figure 8: F1 score for external interval replacement. The blue bar represents the Flip method, the orange bar the arithmetic mean, the green bar the geometric mean, the red bar the median, the purple bar the global scale, the brown bar the local scale, and the pink bar the magnitude warping method.
Figure 9: Performance of different global scales for different datasets: WADI (blue), SWaT (orange), MSL (green), SMAP (red), SMD (purple), and IMOD (brown).
Figure 10: Loss with global scales for the IMOD dataset.
Figure 11: Anomaly detection results for IMOD. The red points denote the test dataset. The top part shows the original data for Cylinder7 Pmax; the bottom part displays the anomaly scores for the corresponding indices.
Figure 12: Anomaly detection results for IMOD. The red points are the labels for the test dataset. The top part shows the original data for the bearing temperature, cylinder exhaust gas outlet temperature, cylinder Pmax, engine load, and power; the bottom part displays the anomaly scores for the corresponding indices.
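The self-supervised recipe here, extracting a portion of the series and replacing it with a synthetic outlier, is straightforward to sketch for two of the named degradations (global scaling and arithmetic mean). The "cylinder temperature" values are hypothetical; the injected interval's positions become free anomaly labels for training.

```python
def inject_global_scale(series, start, length, scale=3.0):
    # Replace an interval with globally scaled values to create a synthetic
    # outlier segment (one of several degradations the paper lists).
    out = list(series)
    for i in range(start, start + length):
        out[i] = series[i] * scale
    return out

def inject_arithmetic_mean(series, start, length):
    # Replace an interval with the series' arithmetic mean (flattening it).
    m = sum(series) / len(series)
    out = list(series)
    for i in range(start, start + length):
        out[i] = m
    return out

# Toy "cylinder temperature" signal (hypothetical values).
signal = [70.0, 71.0, 70.5, 69.8, 70.2, 70.9, 70.4, 70.1]
anom = inject_global_scale(signal, start=3, length=2)
# Self-supervised labels: 1 where the signal was degraded, 0 elsewhere.
labels = [1 if s != a else 0 for s, a in zip(signal, anom)]
print(labels)  # → [0, 0, 0, 1, 1, 0, 0, 0]
```

Training a detector to flag these injected intervals teaches it what "abnormal" looks like without any hand-labeled anomalies, which is the core of the paper's data-enrichment approach.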
19 pages, 1874 KiB  
Article
An AI-Based Approach for Developing a Recommendation System for Underground Mining Methods Pre-Selection
by Elsa Pansilvania Andre Manjate, Natsuo Okada, Yoko Ohtomo, Tsuyoshi Adachi, Bernardo Miguel Bene, Takahiko Arima and Youhei Kawamura
Mining 2024, 4(4), 747-765; https://doi.org/10.3390/mining4040042 - 2 Oct 2024
Abstract
Selecting the most appropriate mining method to recover mineral resources is a critical decision-making task in mining project development. This study introduces an artificial intelligence-based mining methods recommendation system (AI-MMRS) for the pre-selection of underground mining methods. The study integrates and evaluates the capability of two approaches for mining methods selection (MMS): the memory-based collaborative filtering (CF) approach, aided by the UBC-MMS system, to predict the top-3 relevant mining methods, and supervised machine learning (ML) classification algorithms to enhance the effectiveness and novelty of the AI-MMRS, addressing the limitations of the CF approach. The results reveal that the memory-based CF approach achieves an accuracy ranging from 81.8% to 87.9%. Among the classification algorithms, the artificial neural network (ANN) and k-nearest neighbors (KNN) classifiers perform best, with accuracy levels of 66.7% and 63.6%, respectively. These findings demonstrate the effectiveness and viability of both approaches in MMS, while acknowledging their limitations and the need for continuous training and optimization. The proposed AI-MMRS for the pre-selection stage, supplemented by the direct involvement of mining professionals in the later stages of MMS, has the potential to significantly aid MMS decision-making, providing data-driven and experience-based recommendations that follow the ongoing evolution of mining practices. Full article
(This article belongs to the Topic Mining Innovation)
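The memory-based CF approach described in the abstract — a user–item ratings matrix with missing entries, similarity-weighted prediction, and top-N recommendation — can be sketched in a few lines. The rating matrix, the choice of cosine similarity, and the neighborhood size below are illustrative assumptions, not the AI-MMRS implementation.

```python
import math

def cosine(u, v):
    # similarity over co-rated items only; None marks a missing rating
    common = [i for i in range(len(u)) if u[i] is not None and v[i] is not None]
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(u[i] ** 2 for i in common))
           * math.sqrt(sum(v[i] ** 2 for i in common)))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    # weighted average of the k most similar users who rated `item`
    sims = sorted(((cosine(ratings[user], ratings[v]), v)
                   for v in range(len(ratings))
                   if v != user and ratings[v][item] is not None),
                  reverse=True)[:k]
    num = sum(s * ratings[v][item] for s, v in sims)
    den = sum(abs(s) for s, _ in sims)
    return num / den if den else 0.0

def top_n(ratings, user, n=3):
    # rank this "user's" (deposit's) unrated candidate mining methods
    unrated = [i for i, r in enumerate(ratings[user]) if r is None]
    return sorted(unrated, key=lambda i: predict(ratings, user, i), reverse=True)[:n]

# rows: deposits ("users"); columns: candidate mining methods ("items");
# ratings 1-5, None = the "?" unknown entry in the paper's Figure 2
R = [
    [5, None, None, 1, None],
    [4, 3, 4, 1, None],
    [1, 1, None, 5, 4],
    [None, 3, 4, 2, 3],
]
print(top_n(R, 0, n=3))
```

Replacing cosine similarity with Pearson correlation, or the weighted average with a mean-centered one, only changes `cosine` and `predict`; the top-N ranking logic stays the same.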
Show Figures

Figure 1: Methodology for developing the AI-MMRS (DMS: document management software: LogicalDOC Business version 8.7.3; ML: machine learning, CF: collaborative filtering, NMF: nonnegative matrix factorization) [10,11].
Figure 2: Collaborative filtering recommendation system framework based on the user–item interaction dataset X composed of u-users and i-items with ratings ranging from 1 to 5, “?” unknown or missing rating.
Figure 3: Showing the data pre-processing: transformation of the input dataset for experiments to evaluate the proposed memory-based collaborative filtering approach.
Figure 4: Workflow of the proposed methodology for practical experiments: the memory-based collaborative filtering approach for predicting and recommending top-N mining methods.
Figure 5: Performance of the proposed model in predicting primary and top-3 most relevant mining methods in terms of GAR and F1-score.
Figure 6: Confusion matrix of the artificial neural network (ANN) model showing the per-class Recall or True Positive Rate (TPR) and the True Negative Rate (TNR).
39 pages, 9734 KiB  
Review
A Survey of Robot Intelligence with Large Language Models
by Hyeongyo Jeong, Haechan Lee, Changwon Kim and Sungtae Shin
Appl. Sci. 2024, 14(19), 8868; https://doi.org/10.3390/app14198868 - 2 Oct 2024
Viewed by 549
Abstract
Since the emergence of ChatGPT, research on large language models (LLMs) has actively progressed across various fields. LLMs, pre-trained on vast text datasets, have exhibited exceptional abilities in understanding natural language and planning tasks. These abilities of LLMs are promising in robotics. In general, traditional supervised learning-based robot intelligence systems have a significant lack of adaptability to dynamically changing environments. However, LLMs help a robot intelligence system to improve its generalization ability in dynamic and complex real-world environments. Indeed, findings from ongoing robotics studies indicate that LLMs can significantly improve robots’ behavior planning and execution capabilities. Additionally, vision-language models (VLMs), trained on extensive visual and linguistic data for the vision question answering (VQA) problem, excel at integrating computer vision with natural language processing. VLMs can comprehend visual contexts and execute actions through natural language. They also provide descriptions of scenes in natural language. Several studies have explored the enhancement of robot intelligence using multimodal data, including object recognition and description by VLMs, along with the execution of language-driven commands integrated with visual information. This review paper thoroughly investigates how foundation models such as LLMs and VLMs have been employed to boost robot intelligence. For clarity, the research areas are categorized into five topics: reward design in reinforcement learning, low-level control, high-level planning, manipulation, and scene understanding. This review also summarizes studies that show how foundation models, such as the Eureka model for automating reward function design in reinforcement learning, RT-2 for integrating visual data, language, and robot actions in vision-language-action models, and AutoRT for generating feasible tasks and executing robot behavior policies via LLMs, have improved robot intelligence. Full article
Show Figures

Figure 1: Five categories for robot intelligence with large language models in this study.
Figure 2: Attention patterns in three mainstream architectures: Causal Decoder (left), Prefix Decoder (middle), and Encoder–Decoder (right). The blue, green, yellow, and grey rounded rectangles represent attention between prefix tokens, attention between prefix and target tokens, attention between target tokens, and masked attention [5].
Figure 3: An overview of four strategies for parameter-efficient fine-tuning: (a) Adapter Tuning, (b) Prefix Tuning, (c) Prompt Tuning, and (d) Low-Rank Adaptation [5].
Figure 4: Eureka leverages LLM to generate reward functions for robotic tasks and surpasses expert-designed functions through iterative improvements [11].
Figure 5: DrEureka leverages LLM to design reward functions and solves the sim-to-real problem through its Reward-Aware Physics Priors mechanism and domain randomization [134].
Figure 6: Pre-trained LLMs can act as general sequence modelers, and their abilities were assessed in sequence transformation, completion, and improvement [148].
Figure 7: After encoding visual features, they are mapped using visual tokens and text queries. A plan is then created with the LLaMA model and turned into task commands. The visual tokens are queried and converted into low-level control commands to perform the task [150].
Figure 8: Inner Monologue integrates various feedback sources into the language model to enable robots to carry out instructions: (a) mobile manipulation and (b,c) tabletop manipulation, in both simulated and real-world environments [153].
Figure 9: LLM-Planner is a system that creates high-level plans based on natural language commands, sets subgoals to determine actions, and continuously updates the plan to reflect environmental changes [155].
Figure 10: ProgPrompt is a system that uses Python programming structures to provide environmental information and actions, enhancing the success rate of robot task planning through an error recovery feedback mechanism and environmental state feedback [156].
Figure 11: SM integrates various types of knowledge by using multiple pre-trained models and provides meaningful results even in complex computer vision tasks such as image captioning, context inference, and activity prediction [158].
Figure 12: Based on language instructions and RGB-D data, the LLM interacts with the VLM to generate 3D affordance and constraint maps and design robot trajectories without additional training [165].
Figure 13: LM-Nav uses three pre-trained models: (a) VNM builds a topological graph from observations, (b) LLM converts instructions into landmarks, (c) VLM matches landmarks to images, (d) a graph search algorithm then finds the best robot trajectory, and (e) the robot executes the planned path [173].
13 pages, 3447 KiB  
Article
Assessing Land-Cover Changes in the Natural Park ‘Fragas do Eume’ over the Last 25 Years: Insights from Remote Sensing and Machine Learning
by Paula Díaz-García and Adrián Regos
Land 2024, 13(10), 1601; https://doi.org/10.3390/land13101601 - 1 Oct 2024
Viewed by 516
Abstract
The ‘Fragas do Eume’ Natural Park includes one of the best-preserved Atlantic forests in Europe. These forests are part of the Natura 2000 Network. This scientific study focuses on analysing land-cover changes in the ‘Fragas do Eume’ Natural Park (NW Spain) over a 25-year period, from 1997 to 2022, using machine learning techniques for the classification of satellite images. Several image processing operations were carried out to correct radiometry, followed by supervised classification techniques with previously defined training areas. Five multispectral indices were used to improve classification accuracy, and their correlation was evaluated. Land-cover changes were analysed, with special attention to the transitions between eucalyptus plantations and native deciduous forests. A significant increase in eucalyptus plantations (48.2%) (Eucalyptus globulus Labill.) was observed, while native deciduous forests experienced a decrease in their extent (17.6%). This transformation of the landscape affected not only these two habitats, but also cropland and scrubland areas, both of which increased. Our results suggest that the lack of effective conservation policies and the economic interest of fast-growing tree plantations could explain the loss of native deciduous forests. The results highlight the need to implement pro-active and sustainable management measures to protect these natural forest ecosystems in the ‘Fragas do Eume’ Natural Park. Full article
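The workflow described above — training areas plus supervised classification, with spectral indices appended to improve accuracy — can be sketched with a minimal nearest-centroid classifier. The paper does not list its five indices here, so NDVI is used purely as an example, and the band reflectances and class names below are hypothetical.

```python
import math

def ndvi(nir, red):
    # Normalized Difference Vegetation Index, a common multispectral index
    return (nir - red) / (nir + red) if (nir + red) else 0.0

def features(pixel):
    # pixel = (red, nir); append NDVI as an extra feature, mirroring how
    # the study adds spectral indices to improve classification accuracy
    red, nir = pixel
    return (red, nir, ndvi(nir, red))

def train_centroids(training_areas):
    # training_areas: {class_name: [pixel, ...]} -> per-class mean feature vector
    centroids = {}
    for cls, pixels in training_areas.items():
        feats = [features(p) for p in pixels]
        centroids[cls] = tuple(sum(f[i] for f in feats) / len(feats) for i in range(3))
    return centroids

def classify(pixel, centroids):
    # assign the class whose centroid is nearest in feature space
    f = features(pixel)
    return min(centroids, key=lambda c: math.dist(f, centroids[c]))

# hypothetical (red, nir) reflectances sampled from training areas
train = {
    "deciduous_forest": [(0.05, 0.60), (0.06, 0.55)],
    "eucalyptus":       [(0.04, 0.40), (0.05, 0.42)],
}
cents = train_centroids(train)
print(classify((0.05, 0.58), cents))
```

A production pipeline would swap the nearest-centroid step for a machine-learning classifier trained on the same (bands + indices) feature vectors; the feature-construction pattern is the part this sketch illustrates.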
Show Figures

Figure 1: Map showing the location of ‘Fragas do Eume’. The orange area indicates the extent of the Natural Park.
Figure 2: Correlation diagram between multispectral indices. Top right: March 1997, top left: August 1997, bottom right: March 2022, and bottom left: July 2022.
Figure 3: Boxplot of the overall accuracy data of the algorithms for the summer maps (Summer), summer and winter maps (Summer + Winter), and summer, winter, and indices maps (Summer + Winter + Index) in the years 1997 and 2022. The line dividing the box represents the median, the box represents 50% of the data, the lines represent 25% of the data, and the points are outliers.
Figure 4: This Sankey diagram illustrates the transitions between different land use types over time. The categories include croplands (yellow), deciduous forest (red), evergreen forest (green), shrubland (light green), and water (blue). The flow lines between the source (left) and target (right) nodes represent the changes in land use, measured in hectares, highlighting the dynamic nature of land cover transitions within the study area. The thickness of each line corresponds to the magnitude of change, facilitating the visualisation of how land use categories have evolved.
Figure 5: Bar plot of land data by land type and year.
Figure 6: Maps of land cover habitat classes for 1997 (top) and 2022 (bottom) in the ‘Fragas do Eume’ Natural Park. Coordinates are in the UTM coordinate system, WGS84 Zone 29N.
17 pages, 9390 KiB  
Article
Applicability of Relatively Low-Cost Multispectral Uncrewed Aerial Systems for Surface Characterization of the Cryosphere
by Colby F. Rand and Alia L. Khan
Remote Sens. 2024, 16(19), 3662; https://doi.org/10.3390/rs16193662 - 1 Oct 2024
Viewed by 366
Abstract
This paper investigates the ability of a relatively low-cost, commercially available uncrewed aerial vehicle (UAV), the DJI Mavic 3 Multispectral, to perform cryospheric research. The performance of this UAV, where applicable, is compared to a similar but higher-cost system, the DJI Matrice 350, equipped with a Micasense RedEdge-MX Multispectral dual-camera system. The Mavic 3 Multispectral was tested at three field sites: the Lemon Creek Glacier, Juneau Icefield, AK; the Easton Glacier, Mt. Baker, WA; and Bagley Basin, Mt. Baker, WA. This UAV proved capable of mapping the spatial distribution of red snow algae on the surface of the Lemon Creek Glacier using both spectral indices and a random forest supervised classification method. The UAV was able to assess the timing of snowmelt and changes in suncup morphology on snow-covered areas within the Bagley Basin. Finally, the UAV was able to classify glacier surface features using a random forest algorithm with an overall accuracy of 68%. The major advantages of this UAV are its low weight, which allows it to be easily transported into the field, its low cost compared to other alternatives, and its ease of use. One limitation is the lack of a blue multispectral band, which would have allowed glacial ice and snow features to be classified more easily. Full article
(This article belongs to the Special Issue Remote Sensing of the Cryosphere II)
Show Figures

Figure 1: (a) SkySat image of the Lemon Creek Glacier (© Planet Labs). The locations of the ground control points used during this drone survey are shown by the black crosses. (b) The Lemon Creek glacier is located in the southernmost extent of the Juneau Icefield. Glaciated areas are shown in white (RGI Consortium [26]). (c) The Juneau Icefield is located on the border of southeast Alaska, USA and British Columbia, Canada, as shown by the red star. (Map projection: WGS 1984 UTM Zone 8N).
Figure 2: (a) Topographic reference map of Mt. Baker, Washington, USA, showing glaciated areas in white (RGI Consortium [26]). (b) SkySat image of the Easton glacier on the southern slope of Mt. Baker (© Planet Labs). The outline of the glacier is shown by the black polygon (RGI Consortium [26]). (c) SkySat image of Bagley Basin, an alpine basin to the northeast of Mt. Baker, adjacent to the Mt. Baker Ski Area (© Planet Labs). (Map projection: WGS 1984 UTM Zone 10N).
Figure 3: Drone platforms and sensors. (a) DJI Mavic 3 Multispectral (Mavic 3M), (b) DJI Matrice 350 RTK (Matrice 350), (c) Micasense RedEdge-MX Multispectral Dual-Camera System (Micasense) and downwelling light sensor (DLS).
Figure 4: Orthomosaic images of the Lemon Creek Glacier, Juneau Icefield, Alaska, derived from images captured by the RGB bands (Band 5: Red-668, Band 3: Green-560, and Band 2: Blue-475) of the Micasense RedEdge-MX dual camera system (left) and the RGB camera on a DJI Mavic 3 Multispectral (right) collected from 21 August through 23 August 2023. The black rectangle toward the southern end of the Lemon Creek glacier depicts the high density algae area used for analysis in Figure 5. (Map projection: WGS 1984 UTM Zone 8N).
Figure 5: High density algae area in the southern region of the Lemon Creek Glacier, captured by the DJI Mavic 3 Multispectral (left column) and the Micasense RedEdge-MX camera system (right column). (a) Mavic 3M multispectral composite orthomosaic, symbolized by Red: Band 2, Green: Band 1, and Blue: Band 1. Since there is no blue multispectral band on the Mavic 3M, the green band was used for both the green and blue symbology. (b) Micasense multispectral composite orthomosaic, symbolized by Red: Band 5, Green: Band 3, and Blue: Band 2. (c) ORG index applied to Mavic 3M orthomosaic. (d) ORG index applied to Micasense orthomosaic. (e) Random forest classification of Mavic 3M orthomosaic. (f) Random forest classification of the Micasense orthomosaic. Algae sample locations are shown by the black and white circles. (Map projection: WGS 1984 UTM Zone 8N).
Figure 6: RGB orthomosaic of Bagley Basin, captured by the Mavic 3M on 8 June 2023. The inset map shows meltwater-filled suncups on the surface of Upper Bagley Lake. (Map projection: WGS 1984 UTM Zone 10N).
Figure 7: RGB orthomosaics and DEMs of a snow surface to the west of Upper Bagley Lake, within Bagley Basin, derived from DJI Mavic 3M imagery captured on (a) 8 June 2023, (b) 26 June 2023, (c) 5 July 2023, (d) 10 July 2023, (e) 13 June 2023, and (f) 17 July 2023. The red circle denotes an individual suncup whose diameter was measured to assess changes to suncup morphology.
Figure 8: Digital elevation models of Easton glacier, captured by Mavic 3M mapping surveys on 20 July, 1 August, 13 August, and 8 September 2023. (Map projection: WGS 1984 UTM Zone 10N).
Figure 9: RGB orthomosaic and random forest supervised classification results of the 13 August 2023 Mavic 3M flight of the Easton glacier.
16 pages, 1482 KiB  
Article
SecureVision: Advanced Cybersecurity Deepfake Detection with Big Data Analytics
by Naresh Kumar and Ankit Kundu
Sensors 2024, 24(19), 6300; https://doi.org/10.3390/s24196300 - 29 Sep 2024
Viewed by 612
Abstract
SecureVision is an advanced and trustworthy deepfake detection system created to tackle the growing threat of ‘deepfake’ movies that tamper with media, undermine public trust, and jeopardize cybersecurity. We present a novel approach that combines big data analytics with state-of-the-art deep learning algorithms to detect altered information in both audio and visual domains. One of SecureVision’s primary innovations is the use of multi-modal analysis, which improves detection capabilities by concurrently analyzing many media forms and strengthening resistance against advanced deepfake techniques. The system’s efficacy is further enhanced by its capacity to manage large datasets and integrate self-supervised learning, which guarantees its flexibility in the ever-changing field of digital deception. In the end, this study helps to protect digital integrity by providing a proactive, scalable, and efficient defense against the ubiquitous threat of deepfakes, thereby establishing a new benchmark for privacy and security measures in the digital era. Full article
(This article belongs to the Special Issue Cybersecurity Attack and Defense in Wireless Sensors Networks)
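The multi-modal analysis described above — scoring the audio and visual streams separately and combining the results — can be sketched as a weighted late fusion. The scores, weights, and threshold below are placeholder values, not SecureVision's actual detector outputs or fusion rule.

```python
def fuse_scores(scores, weights):
    # scores: {modality: probability that the clip is fake}, each in [0, 1]
    # weights: relative trust in each modality's detector
    total_w = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_w

def verdict(scores, weights, threshold=0.5):
    # flag the clip when the fused fake-probability crosses the threshold
    return "fake" if fuse_scores(scores, weights) >= threshold else "real"

# hypothetical detector outputs for one clip
clip = {"audio": 0.82, "image": 0.64}
w = {"audio": 1.0, "image": 1.5}
print(verdict(clip, w))
```

Late fusion keeps the per-modality detectors independent, so one can be retrained or replaced without touching the other; early (feature-level) fusion trades that modularity for potentially richer cross-modal cues.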
Show Figures

Figure 1: System architecture for audio deepfake detection.
Figure 2: System architecture for image deepfake detection.
Figure 3: Graphs of training vs. validation based on accuracy and loss.
Figure 4: Confusion matrix of audio detection.
Figure 5: Confusion matrix for image detection.
27 pages, 6999 KiB  
Article
Improved Road Extraction Models through Semi-Supervised Learning with ACCT
by Hao Yu, Shihong Du, Zhenshan Tan, Xiuyuan Zhang and Zhijiang Li
ISPRS Int. J. Geo-Inf. 2024, 13(10), 347; https://doi.org/10.3390/ijgi13100347 - 29 Sep 2024
Viewed by 346
Abstract
Improving the performance and reducing the training cost of road extraction models in the absence of samples is important for updating road maps. Despite the success of recent road extraction models on standard datasets, they often fail to perform when applied to new datasets or real-world scenarios where labeled samples are not available. In this paper, our focus diverges from the typical quest to pinpoint the optimal road extraction model or evaluate generalization prowess across models. Instead, we propose a method called Asymmetric Consistent Co-Training (ACCT) to train existing road extraction models faster and make them perform better in new scenarios lacking samples. ACCT uses two models with different structures and a supervision module to enhance accuracy through mutual learning. Labeled and unlabeled images are processed by both models to generate road maps from different perspectives. The supervision module ensures consistency between predictions by computing losses based on labeling status. ACCT iteratively adjusts parameters using unlabeled data, improving generalization. Empirical evaluations show that ACCT improves IoU by 2.79% to 10.26% using only 1/8 of the labeled data compared to fully supervised methods. It also reduces parameters by over 49% compared to state-of-the-art semi-supervised methods while maintaining similar accuracy. These results highlight the potential of leveraging large amounts of unlabeled data to enhance road extraction models as data acquisition technology advances. Full article
(This article belongs to the Special Issue Advances in AI-Driven Geospatial Analysis and Data Generation)
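ACCT's supervision module enforces consistency by letting each model learn from the other's predictions on unlabeled images. A minimal sketch of such a cross pseudo-supervision loss (in the spirit of CPS, which the paper compares against) follows; the toy per-pixel probabilities stand in for real network outputs, and the stop-gradient on the pseudo-labels is implicit here.

```python
import math

def hard_label(probs):
    # argmax over class probabilities -> one pseudo-label per pixel
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def cross_entropy(probs, labels):
    # mean per-pixel CE of one model's soft output vs. the other's pseudo-labels
    eps = 1e-9
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

def cps_loss(probs_a, probs_b):
    # each model is supervised by the other's (stop-gradient) hard labels
    return (cross_entropy(probs_a, hard_label(probs_b))
            + cross_entropy(probs_b, hard_label(probs_a)))

# toy per-pixel [P(background), P(road)] outputs from two models on an unlabeled tile
pa = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
pb = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
print(cps_loss(pa, pb))
```

On labeled tiles both models would additionally be supervised with the true mask; the asymmetry in ACCT comes from using two structurally different networks rather than two copies of one, so the pseudo-labels bring genuinely different perspectives.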
Show Figures

Figure 1: Challenges in achieving complete and accurate road labeling, illustrated with examples from the Massachusetts Roads Dataset and CHN6-CUG Dataset. (b) is the label of the real image (a) and (d) is the label of the real image (c). Red circles mark areas that are not fully labelled, orange circles mark areas that are difficult to be accurately labeled due to building occlusion, and yellow circles mark areas that are difficult to be accurately labeled due to tree occlusion.
Figure 2: Illustrating the architectures for (a) common consistency learning methods, (b) mean teacher, (c) PseudoSeg, and (d) CPS. Solid arrows indicate forward operation and dashed arrows indicate loss supervision. ‘//’ on solid arrows means stop-gradient. ‘X’ means the sample, ‘X1’ and ‘X2’ mean the sample after image augmentation, ‘P1’ and ‘P2’ mean the prediction result, ‘Y1’ and ‘Y2’ mean the generated pseudo label, ‘YW’ means the true label.
Figure 3: Overall framework of ACCT. Solid arrows indicate forward operation and dashed arrows indicate loss supervision. ‘//’ on solid arrows means stop-gradient. ‘X’ means the sample. ‘Fp’ and ‘Fv’ mean two road extraction models with different structure. ‘Pp’ and ‘Pv’ mean the segmentation confidence maps, ‘Y’ means the true label, ‘Yp’ and ‘Yv’ mean the label maps.
Figure 4: Illustration of sample batch composition and model processing using r′ = 1/4:3/4 as an example.
Figure 5: Pseudo-labels generated by different models during CPS training in CHN6-CUG Dataset. (a) The original satellite image; (b) ground truth marked by experts; (c) the pseudo-label generated by RoadNet [35]; and (d) the pseudo-label generated by U-Net [36].
Figure 6: Comparison of segmentation performance between our method and the baseline on the CHN6-CUG Dataset when employing U-Net as the model to be trained. (a) Performance of different methods under 1/16, 1/8, 1/4, and 1/2 partition protocols; (b) an enlarged view of a portion of (a).
Figure 7: As Figure 6, but employing D-LinkNet as the model to be trained.
Figure 8: As Figure 6, but on the Massachusetts Roads Dataset with U-Net.
Figure 9: As Figure 6, but on the Massachusetts Roads Dataset with D-LinkNet.
Figure 10: As Figure 6, but on the Ottawa Roads Dataset with U-Net.
Figure 11: As Figure 6, but on the Ottawa Roads Dataset with D-LinkNet.
Figure 12: Example qualitative results from the CHN6-CUG Dataset, with differences highlighted in the circles. (a) Original satellite images; (b) ground truth; (c) results of our method under 1/4 partition protocols; (d) results of CPS under 1/4 partition protocols; (e) result of fully supervised training using all labeled data; (f) results of our method under 1/16 partition protocols; (g) result of fully supervised training under 1/16 partition protocols.
Figure 13: As Figure 12, but for the Massachusetts Roads Dataset.
Figure 14: Comparison of the segmentation performance of our method with the baseline on different datasets. (a) Performance of different methods on the CHN6-CUG Dataset when using RoadNet as the model to be trained. (b) Performance of different methods on the CHN6-CUG Dataset when using CGNet as the model to be trained.
Figure 15: Convergence curve of our method compared to baseline on the CHN6-CUG Dataset.
Figure 16: Early trends in model training: an example from the first 100 epochs, with differences highlighted in the circles. (a) ACCT method using RoadNet to assist U-Net training. (b) An approach using two U-Nets of the same structure to supervise each other. (c) A method using two RoadNets to supervise each other. (d) Original satellite images; (e) ground truth.
Figure 17: As Figure 16, but with CGNet assisting RUW-Net training, compared against two RUW-Nets and two CGNets supervising each other.
18 pages, 792 KiB  
Article
SemConvTree: Semantic Convolutional Quadtrees for Multi-Scale Event Detection in Smart City
by Mikhail Andeevich Kovalchuk, Anastasiia Filatova, Aleksei Korneev, Mariia Koreneva, Denis Nasonov, Aleksandr Voskresenskii and Alexander Boukhanovsky
Smart Cities 2024, 7(5), 2763-2780; https://doi.org/10.3390/smartcities7050107 - 28 Sep 2024
Viewed by 428
Abstract
The digital world is increasingly permeating our reality, creating a significant reflection of the processes and activities occurring in smart cities. Such activities include well-known urban events, celebrations, and those with a very local character. These widespread events have a significant influence on shaping the spirit and atmosphere of urban environments. This work presents SemConvTree, an enhanced semantic version of the ConvTree algorithm. It incorporates the semantic component of data through semi-supervised learning of a topic modeling ensemble, which consists of improved models: BERTopic, TSB-ARTM, and SBert-Zero-Shot. We also present an improved event search algorithm based on both statistical evaluations and semantic analysis of posts. This algorithm allows the discovery mechanism to be tuned to entities of a specified granularity (such as a particular topic). Experimental studies were conducted within the area of New York City. They showed an improvement in the detection of posts devoted to events (about 40% higher F1-score) due to the accurate handling of events of different scales. These results suggest the long-term potential for creating a semantic platform for the analysis and monitoring of urban events in the future. Full article
(This article belongs to the Section Smart Data)
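ConvTree-style methods build on a quadtree: a region is recursively subdivided once it holds more points than a capacity threshold, so areas dense in geotagged posts get finer cells while sparse areas stay coarse. A minimal sketch, with illustrative coordinates and capacity:

```python
class QuadTree:
    """Recursively subdivide a bounding box once it exceeds `capacity` points."""

    def __init__(self, x, y, w, h, capacity=4):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.capacity = capacity
        self.points = []
        self.children = None  # four QuadTree quadrants after a split

    def insert(self, px, py):
        # reject points outside this node's half-open bounding box
        if not (self.x <= px < self.x + self.w and self.y <= py < self.y + self.h):
            return False
        if self.children is None:
            self.points.append((px, py))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(c.insert(px, py) for c in self.children)

    def _split(self):
        hw, hh = self.w / 2, self.h / 2
        self.children = [
            QuadTree(self.x,      self.y,      hw, hh, self.capacity),
            QuadTree(self.x + hw, self.y,      hw, hh, self.capacity),
            QuadTree(self.x,      self.y + hh, hw, hh, self.capacity),
            QuadTree(self.x + hw, self.y + hh, hw, hh, self.capacity),
        ]
        for p in self.points:  # push stored points down into the new quadrants
            any(c.insert(*p) for c in self.children)
        self.points = []

    def leaves(self):
        if self.children is None:
            return [self]
        return [leaf for c in self.children for leaf in c.leaves()]

# hypothetical geotagged posts inside a 100x100 city grid
tree = QuadTree(0, 0, 100, 100, capacity=2)
for pt in [(10, 10), (12, 11), (13, 9), (80, 80), (81, 82)]:
    tree.insert(*pt)
dense = [leaf for leaf in tree.leaves() if len(leaf.points) >= 2]
print(len(dense))
```

In the paper's setting, candidate event cells like these leaves would then be filtered by the statistical and semantic scoring stages; the tree itself only supplies the multi-scale spatial partition.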
Show Figures

Figure 1: Scheme of events detection pipeline of the ConvTree algorithm.
Figure 2: Pipeline of the event detection system SemConvTree.
Figure 3: Scheme of the post-ranking module.