Search Results (7,172)

Search Parameters:
Keywords = classification problem

20 pages, 1803 KiB  
Article
MVSAPNet: A Multivariate Data-Driven Method for Detecting Disc Cutter Wear States in Composite Strata Shield Tunneling
by Yewei Xiong, Xinwen Gao and Dahua Ye
Sensors 2025, 25(6), 1650; https://doi.org/10.3390/s25061650 - 7 Mar 2025
Abstract
Disc cutters are essential for shield tunnel construction, and monitoring their wear is vital for safety and efficiency. Due to their position in the soil silo, it is more challenging to observe the wear of disc cutters directly, making accurate and efficient detection a technical challenge. However, existing methods that treat the problem as a classification task often overlook the issue of data imbalance. To solve these problems, this paper proposes an end-to-end detection method for disc cutter wear state called the Multivariate Selective Attention Prototype Network (MVSAPNet). The method introduces an attention prototype network for variable selection, which selects important features from many input parameters using a specialized variable selection network. To address the problem of imbalance in the wear data, a prototype network is used to learn the centers of the normal and wear state classes, and the detection of the wear state is achieved by detecting high-dimensional features and comparing their distances to the class centers. The method performs better on the data collected from the Ma Wan Cross-Sea Tunnel project in Shenzhen, China, with an accuracy of 0.9187 and an F1 score of 0.8978, yielding higher values than the experimental results of other classification models.
Figures
Figure 1. The overall framework of the disc cutter wear detection.
Figure 2. The architecture of data preprocessing.
Figure 3. The architecture of MVSAPNet.
Figure 4. Outlier removal results for shield disc cutter speed. (a) The yellow curve represents the reconstruction bias of the LSTM-ED model, and the blue curve represents selected threshold value 1.74, which was calculated by 3σ criterion. (b) The blue curve represents the data before outlier removal; the red curve represents the data after outlier removal.
Figure 5. Denoising results of cutterhead speed using VMD-WT.
Figure 6. Effect of different preprocessing steps results with proposed model.
Figure 7. Impact of removing part of the network structure alone on detection performance.
Figure 8. Visualization of mean weights of v_t in VSNs.
Figure 9. Visualization results of model sample features and class prototype feature using the t-SNE method, red and black stars for normal and wear state class prototype features, blue dots and orange dots for normal and wear state sample features. (a) Training set. (b) Test set.
Figure 10. Visualization results of the distances between the sample vectors and the class prototype matrix on the test set of tunnel in Shenzhen, China, with the blue line being the distance between the samples and the normal state, the red line being the distance between the samples and the worn state, the green line being the worn distance and the normal distance, and the purple color being the class labels, with 0 = normal state and 1 = worn state.
Figure 11. Visualization results of the distances in Fuzhou, China. Other is equal to Figure 10.
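The MVSAPNet abstract describes detecting the wear state by comparing feature distances to learned class centers. A minimal Python sketch of that nearest-prototype idea follows. It assumes the prototypes are plain per-class means of already-extracted feature vectors and that the distance is Euclidean; the paper's attention-based variable selection and learned embedding are not reproduced, and the toy data and function names are hypothetical.

```python
import math


def class_prototypes(features, labels):
    """Return {label: mean feature vector} for each class (assumed prototype)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(prototypes, vec):
    """Assign vec to the class whose prototype is closest (Euclidean distance)."""
    return min(prototypes, key=lambda lab: math.dist(prototypes[lab], vec))


# Hypothetical 2-D features: label 0 = normal state, label 1 = worn state.
train_x = [[0.1, 0.2], [0.0, 0.3], [0.9, 1.1], [1.0, 0.8]]
train_y = [0, 0, 1, 1]
protos = class_prototypes(train_x, train_y)
print(predict(protos, [0.2, 0.1]))
print(predict(protos, [0.8, 1.0]))
```

Because each class is summarized by a single center, a minority class still gets one prototype of its own, which is one plausible reason a prototype approach is attractive for imbalanced wear data.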
23 pages, 69279 KiB  
Article
A Novel Equivariant Self-Supervised Vector Network for Three-Dimensional Point Clouds
by Kedi Shen, Jieyu Zhao and Min Xie
Algorithms 2025, 18(3), 152; https://doi.org/10.3390/a18030152 - 7 Mar 2025
Abstract
For networks that process 3D data, estimating the orientation and position of 3D objects is a challenging task. This is because the traditional networks are not robust to the rotation of the data, and their internal workings are largely opaque and uninterpretable. To solve this problem, a novel equivariant self-supervised vector network for point clouds is proposed. The network can learn the rotation direction information of the 3D target and estimate the rotational pose change of the target, and the interpretability of the equivariant network is studied using information theory. The utilization of vector neurons within the network lifts the scalar data to vector representations, enabling the network to learn the pose information inherent in the 3D target. The network can perform complex rotation-equivariant tasks after pre-training, and it shows impressive performance in complex tasks like category-level pose change estimation and rotation-equivariant reconstruction. We demonstrate through experiments that our network can accurately detect the orientation and pose change of point clouds and visualize the latent features. Moreover, it performs well in invariant tasks such as classification and category-level segmentation.
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
Figures
Figure 1. Equivariant self-supervised vector network’s overall architecture. The input point cloud is divided into several point cloud patches, after which they are randomly masked, and then the point embed operation is carried out through the equivariant layer and token embed layer. Then, the obtained point embedding with vector information is fed into the autoencoder for pre-training. When the pre-training is finished, the decoder module will be abandoned and replaced by different fine-tuning heads. After connecting various fine-tuning heads to the front network, the network can be applied to different downstream tasks.
Figure 2. A diagram of the framework of the VN-Transformer. Its structure is similar to the standard Transformer framework. The left is the flow of tokens in the whole autoencoder framework, and the right is the internal structure of each block and the calculation process of VN-Attention.
Figure 3. Rendering of the reconstruction effect of our network on ShapeNet. The network will reconstruct the visible point clouds under different rotation states. During training, the network only learns the non-rotated points, but, in the test, the network can also reconstruct the rotated point clouds under z and SO(3) conditions well. The training in the figure uses a 60% mask rate. Meanwhile, in order to better present the results, we render the point clouds in this figure and the later results presentation figure to some extent.
Figure 4. Rendering of the reconstruction effect of our network on human point clouds. The human point clouds are trained in the same way as ShapeNet. These human point clouds were obtained by sampling on the mesh model of the HumanBody dataset. Since the initial pose in the original HumanBody dataset is confusing (i.e., the pose of the point cloud below the mesh is the origin pose), we manually adjusted the mesh to show its rough pose and appearance. The training in the figure uses a 60% mask rate.
Figure 5. Demonstration of the latent features of the network. The top of each row is the state of the input point cloud, and the bottom is the state of its corresponding latent feature, which can be seen to rotate with the rotation of the point cloud. The origin column is the point cloud without rotation, and the rest are the point clouds rotated by different angles, for example, 90°z means that the point cloud rotates 90 degrees around the z-axis.
Figure 6. Gradual rotation of point clouds and their corresponding output results. This figure shows how the network outputs different poses for different angles of the same point cloud around the same axis (z-axis). The top row is the input point cloud P after applying the rotation matrix S, and the bottom row is the predicted pose generated by the network based on the rotated point cloud P_r, which is generated from the point cloud P after applying the rotation matrix S. The predicted pose is expressed as the rotation matrix S′. After reapplying S′ to the origin point cloud P, a new rotation point cloud P′_r can be obtained, and whether the predicted pose is consistent can be judged by comparing P_r with P′_r.
Figure 7. Randomly rotated point cloud and their corresponding output results in 3D space. This figure shows the comparison between the pose output by the network and the origin point cloud for different point clouds rotated randomly under SO(3).
Figure 8. Segmentation of rotated point clouds. This figure illustrates the comparison of segmentation results obtained from our network’s output for different point cloud rotation states, namely, Z and SO(3), in contrast to the original point cloud.
20 pages, 2207 KiB  
Article
A Novel TLS-Based Fingerprinting Approach That Combines Feature Expansion and Similarity Mapping
by Amanda Thomson, Leandros Maglaras and Naghmeh Moradpoor
Future Internet 2025, 17(3), 120; https://doi.org/10.3390/fi17030120 - 7 Mar 2025
Viewed by 93
Abstract
Malicious domains are part of the landscape of the internet but are becoming more prevalent and more dangerous both to companies and to individuals. They can be hosted on various technologies and serve an array of content, including malware, command and control and complex phishing sites that are designed to deceive and expose. Tracking, blocking and detecting such domains is complex, and very often it involves complex allowlist or denylist management or SIEM integration with open-source TLS fingerprinting techniques. Many fingerprinting techniques, such as JARM and JA3, are used by threat hunters to determine domain classification, but with the increase in TLS similarity, particularly in CDNs, they are becoming less useful. The aim of this paper was to adapt and evolve open-source TLS fingerprinting techniques with increased features to enhance granularity and to produce a similarity-mapping system that would enable the tracking and detection of previously unknown malicious domains. This was achieved by enriching TLS fingerprints with HTTP header data and producing a fine-grain similarity visualisation that represented high-dimensional data using MinHash and Locality-Sensitive Hashing. Influence was taken from the chemistry domain, where the problem of high-dimensional similarity in chemical fingerprints is often encountered. An enriched fingerprint was produced, which was then visualised across three separate datasets. The results were analysed and evaluated, with 67 previously unknown malicious domains being detected based on their similarity to known malicious domains and nothing else. The similarity-mapping technique produced demonstrates definite promise in the arena of early detection of malware and phishing domains.
Figures
Figure 1. A flow diagram of the end-to-end fingerprint processing pipeline.
Figure 2. The raw fingerprint produced from the active scan.
Figure 3. A screenshot of the HEAD request being made, as seen within Wireshark. The HTTP Protocol is highlighted in green.
Figure 4. A screenshot of a typical set of HTTP headers received in response to a HEAD request during the header enrichment process. The HEAD request is seen in red, the HTTP response is blue.
Figure 5. Graph displaying TLS features enriched with HTTP header data. The resulting feature matrix M ∈ {0,1}^(n×d) has dimensions n = 16,254 (fingerprints) and d = 2124 (features), representing the complete binary feature space of the TLS and HTTP characteristics. Known good domains are coloured green, known bad domains, red and unknown domains, orange.
Figure 6. The Mixed Host dataset displays a diverse number of distance metrics and a broader distribution of similarity scores across the sample space. Each line represents a different domain, with a range of colors to aid in differentiation.
Figure 7. The Cloudflare CDN dataset displays less diversity in similarity. All k-nearest neighbours maintain distances below 0.30. This shows closer similarity between domains. Each line represents a different domain, with a range of colors to aid in differentiation.
Figure 8. A typical domain with strong indicators of malicious intent. The domain was sourced from the unknown category and registered within 30 days of the scan taking place. At the time of evaluation, 12 security vendors had flagged the domain as malicious, including Sophos, Fortinet, ESET and Bitdefender.
Figure 9. An example of a domain on the threshold for further investigation. The domain has three vendors confirmed as malicious (BitDefender, CRDF and G-Data) but a further suspicious flag from vendor Trustwave. The left-hand shows the heuristic scan performed by URLQuery, indicating that ClearFake malicious JavaScript library was detected.
Figure 10. The LSH forest of dataset A visualised using Fearun. Known bad domains are colored red, known good are colored blue and unknown domains are colored orange.
Figure 11. The LSH forest of dataset B (Cloudflare CDN domains) visualised using Fearun. The TLS fingerprints have been enriched with HTTP header data. Known bad domains are colored red, known good are colored blue and unknown domains are colored orange.
Figure 12. The LSH forest of dataset B (Cloudflare CDN domains) visualised using Fearun. The TLS fingerprints are not enriched and contain only TLS features. Known bad domains are colored red, known good are colored blue and unknown domains are colored orange.
Figure 13. The LSH visualisation of dataset C, known malicious domains. Clear similarity patterns can be seen forming by capability. Go Phish domains are seen in yellow, Cert Pl orange, Metasploit pink, Tactical RRM purple and Burp Collaborator Blue.
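The TLS fingerprinting abstract relies on MinHash and Locality-Sensitive Hashing to compare high-dimensional binary fingerprints. The Python sketch below illustrates the general technique only: MinHash signatures built from seeded hashes, a Jaccard estimate from signature agreement, and simple banded LSH to propose candidate pairs. The paper reports using an LSH forest, so the banding scheme here is a swapped-in simpler variant, and the domain names and feature strings are hypothetical.

```python
import hashlib


def minhash_signature(features, num_perm=64):
    """One minimum hash value per seeded hash function (features must be non-empty)."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{feat}".encode(), digest_size=8).digest(),
                "big",
            )
            for feat in features
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidates(signatures, bands=16):
    """Return pairs of names whose signatures match on every row of at least one band."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = {}
    for name, sig in signatures.items():
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets.setdefault(key, set()).add(name)
    pairs = set()
    for names in buckets.values():
        ordered = sorted(names)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs


# Hypothetical enriched fingerprints: TLS features plus HTTP header features.
fingerprints = {
    "a.example": {"tls:1.3", "cipher:aes128-gcm", "hdr:server=nginx", "hdr:x-frame-options"},
    "b.example": {"tls:1.3", "cipher:aes128-gcm", "hdr:server=nginx", "hdr:x-frame-options"},
    "c.example": {"tls:1.2", "cipher:chacha20", "hdr:server=apache"},
}
signatures = {name: minhash_signature(feats) for name, feats in fingerprints.items()}
print(estimated_jaccard(signatures["a.example"], signatures["b.example"]))
print(lsh_candidates(signatures))
```

Banding trades recall for speed: more bands with fewer rows each propose more candidate pairs, while fewer bands only surface near-identical fingerprints.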
19 pages, 6174 KiB  
Article
Sub-Pixel Displacement Measurement with Swin Transformer: A Three-Level Classification Approach
by Yongxing Lin, Xiaoyan Xu and Zhixin Tie
Appl. Sci. 2025, 15(5), 2868; https://doi.org/10.3390/app15052868 - 6 Mar 2025
Viewed by 97
Abstract
In order to avoid the dependence of traditional sub-pixel displacement methods on interpolation method calculation, image gradient calculation, initial value estimation and iterative calculation, a Swin Transformer-based sub-pixel displacement measurement method (ST-SDM) is proposed, and a square dataset expansion method is also proposed to rapidly expand the training dataset. The ST-SDM computes sub-pixel displacement values of different scales through three-level classification tasks, and solves the problem of positive and negative displacement with the rotation relative tag value method. The accuracy of the ST-SDM is verified by simulation experiments, and its robustness is verified by real rigid body experiments. The experimental results show that the ST-SDM model has higher accuracy and higher efficiency than the comparison algorithm.
Figures
Figure 1. Sub-pixel search correlation principle of DSCM. (a) Speckle image before deformation; (b) speckle image after deformation.
Figure 2. The structure of the Swin Transformer.
Figure 3. Architecture of the proposed ST-SDM. “Ref” means reference images, “F-ST” means the first-level classification, “S-ST” means the second-level classification, “T-ST” means the third-level classification, and “Tar”, “Tar2”, and “Tar3” represent the input target images for the three levels of classification, respectively.
Figure 4. Rotation-relative labeled value method.
Figure 5. Simulated speckle image.
Figure 6. The variation in accuracy of the first level classification task.
Figure 7. The variation in accuracy of the second level classification task.
Figure 8. The variation in accuracy of the third level classification task.
Figure 9. The AVME and relative error in the u direction of the two models.
Figure 10. Experimental system.
Figure 11. Real speckle image.
Figure 12. The AVME and relative error of two models each with a 0.1 mm shift.
25 pages, 7248 KiB  
Article
CEEMDAN-IHO-SVM: A Machine Learning Research Model for Valve Leak Diagnosis
by Ruixue Wang and Ning Zhao
Algorithms 2025, 18(3), 148; https://doi.org/10.3390/a18030148 - 5 Mar 2025
Viewed by 121
Abstract
Due to the complex operating environment of valves, when a fault occurs inside a valve, the vibration signal generated by the fault is easily affected by the environmental noise, making the extraction of fault features difficult. To address this problem, this paper proposes a feature extraction method based on the combination of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and Fuzzy Entropy (FN). Due to the slow convergence speed and the tendency to fall into local optimal solutions of the Hippopotamus Optimization Algorithm (HO), an improved Hippopotamus Optimization (IHO) algorithm-optimized Support Vector Machine (SVM) model for valve leakage diagnosis is introduced to further enhance the accuracy of valve leakage diagnosis. The improved Hippopotamus Optimization algorithm initializes the hippopotamus population with Tent chaotic mapping, designs an adaptive weight factor, and incorporates adaptive variation perturbation. Moreover, the performance of IHO was proven to be optimal compared to HO, Particle Swarm Optimization (PSO), Grey Wolf Optimization (GWO), Whale Optimization Algorithm (WOA), and Sparrow Search Algorithm (SSA) by calculating twelve test functions. Subsequently, the IHO-SVM classification model was established and applied to valve leakage diagnosis. The prediction effects of the seven models, IHO-SVM, HO-SVM, PSO-SVM, GWO-SVM, WOA-SVM, SSA-SVM, and SVM, were compared and analyzed with actual data. As a result, the comparison indicated that IHO-SVM has desirable robustness and generalization, which successfully improves the classification efficiency and the recognition rate in fault diagnosis.
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
Figures
Figure 1. The flowchart of CEEMDAN.
Figure 2. Value of w.
Figure 3. IHO’s flowchart.
Figure 4. Convergence curves of the four algorithms.
Figure 5. Experimental platform device.
Figure 6. Overall frame diagram.
Figure 7. The iteration curve of diameter 200 mm-0.2 MPa.
Figure 8. The iteration curve of diameter 200 mm-0.8 MPa.
Figure 9. Comparison of classification accuracy of four models on 10 separate occasions.
Figure 10. Comparison of classification accuracy of four models on 10 separate occasions.
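The CEEMDAN-IHO-SVM abstract states that the improved optimizer initializes its population with Tent chaotic mapping. A minimal Python sketch of that initialization step follows. The skewed tent map with breakpoint a = 0.7, the seed value x0, and the function names are assumptions for illustration; the abstract does not give the paper's exact map parameters, and the adaptive weight factor and variation perturbation are not shown.

```python
def tent_sequence(n, x0, a=0.7):
    """Iterate a skewed tent map n times; every value lies in [0, 1]."""
    values, x = [], x0
    for _ in range(n):
        # Rising branch below the breakpoint a, falling branch above it.
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        values.append(x)
    return values


def init_population(pop_size, dim, lower, upper, x0=0.37):
    """Scale one chaotic sequence into pop_size candidate vectors within [lower, upper]."""
    chaos = tent_sequence(pop_size * dim, x0)
    return [
        [lower + chaos[i * dim + j] * (upper - lower) for j in range(dim)]
        for i in range(pop_size)
    ]


# Hypothetical search space: 3 parameters, each bounded by [-10, 10].
population = init_population(pop_size=5, dim=3, lower=-10.0, upper=10.0)
for individual in population:
    print(individual)
```

Compared with uniform random sampling, a chaotic sequence is deterministic for a given x0, which makes runs reproducible while still spreading the initial candidates across the search range.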
19 pages, 3746 KiB  
Article
The Impact of the Human Factor on Communication During a Collision Situation in Maritime Navigation
by Leszek Misztal and Paulina Hatlas-Sowinska
Appl. Sci. 2025, 15(5), 2797; https://doi.org/10.3390/app15052797 - 5 Mar 2025
Viewed by 175
Abstract
In this paper, the authors draw attention to the significant impact of the human factor during collision situations in maritime navigation. The problems in the communication process between navigators are so excessive that the authors propose automatic communication. This is an alternative method to the current one. The presented system comprehensively performs communication tasks during a sea voyage. To reach the mentioned goal, AI methods of natural language processing and additional properties of metaontology (ontology supplemented with objective functions) are applied. Dedicated to maritime transport applications, the model for translating a natural language into an ontology consists of multiple steps and uses AI methods of classification for the recognition of a message from the ship’s bridge. The reverse model is also multi-stage and uses a created rule-based knowledge base to create natural-language sentences built on the basis of the ontology. Validation of the model’s accuracy results was conducted through accuracy assessment coefficients for information classification, commonly used in science. Receiver operating characteristic (ROC) curves represent the results in the datasets. The presented solution of the designed architecture of the system as well as algorithms developed in the software prototype confirmed the correctness of the assumptions in the described study. The authors demonstrated that it is feasible to successfully apply metaontology and machine learning methods in the proposed prototype software for ship-to-ship communication.
(This article belongs to the Section Marine Science and Engineering)
Show Figures

Figure 1

Figure 1
<p>Algorithm of the communication process.</p>
Full article ">Figure 2
<p>A mathematical tree presenting a fragment of the navigation and communication ontologies.</p>
Full article ">Figure 3
<p>New conversation model.</p>
Full article ">Figure 4
<p>Model of transforming natural language into ontology.</p>
Full article ">Figure 5
<p>ROC curve for the given test scenarios.</p>
Full article ">Figure 6
<p>Answers regarding the use of semi-automatic communication.</p>
Full article ">Figure 7
<p>Voice communication replies (via FM).</p>
Full article ">Figure 8
<p>Answers regarding VHF communication.</p>
Full article ">Figure 9
<p>Automatic communication talking time replies.</p>
Full article ">Figure 10
<p>Answers regarding the modernization of the communication system.</p>
Full article ">Figure 11
<p>Answers regarding safety level enhancement for semiautomatic communication.</p>
Full article ">
18 pages, 35678 KiB  
Article
Novelty Recognition: Fish Species Classification via Open-Set Recognition
by Manuel Córdova, Ricardo da Silva Torres, Aloysius van Helmond and Gert Kootstra
Sensors 2025, 25(5), 1570; https://doi.org/10.3390/s25051570 - 4 Mar 2025
Viewed by 135
Abstract
To support the sustainable use of marine resources, regulations have been proposed to reduce fish discards focusing on the registration of all listed species. To comply with such regulations, computer vision methods have been developed. Nevertheless, current approaches are constrained by their closed-set nature, where they are designed only to recognize fish species that were present during training. In the real world, however, samples of unknown fish species may appear in different fishing regions or seasons, requiring fish classification to be treated as an open-set problem. This work focuses on the assessment of open-set recognition to automate the registration process of fish. The state-of-the-art Multiple Gaussian Prototype Learning (MGPL) was compared with the simple yet powerful Open-Set Nearest Neighbor (OSNN) and the Probability of Inclusion Support Vector Machine (PISVM). For the experiments, the Fish Detection and Weight Estimation dataset, containing images of 2216 fish instances from nine species, was used. Experimental results demonstrated that OSNN and PISVM outperformed MGPL in both recognizing known and unknown species. OSNN achieved the best results when classifying samples as either one of the known species or as an unknown species with an F1-macro of 0.79±0.05 and an AUROC score of 0.92±0.01 surpassing PISVM by 0.05 and 0.03, respectively.
Figures
Figure 1. Open-set setup. Unseen species (?) may appear in the future and need to be recognized as unknown.
Figure 2. Images from FDWE [5] dataset.
Figure 3. Example of the open-set partitioning process at species level.
Figure 4. Closed-set results.
Figure 5. Confusion matrices in a closed-set setup.
Figure 6. Open-set results.
Figure 7. Confusion matrices of OSNN in an open-set setup.
Figure 8. Incorrect predictions on partition 3.
Figure 9. Incorrect predictions on partition 4.
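The fish classification abstract treats recognition as an open-set problem and reports the best results for Open-Set Nearest Neighbor (OSNN). The Python sketch below shows one common formulation of such a classifier, a nearest-neighbour distance-ratio test, as an illustration rather than the authors' implementation. The threshold of 0.8, the Euclidean distance, the 2-D features, and the species labels are all assumptions.

```python
import math


def osnn_predict(train_x, train_y, query, threshold=0.8):
    """Return the nearest neighbour's label, or "unknown" if the ratio test fails.

    Assumes the training set contains at least two distinct classes.
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest_vec, nearest_label = ranked[0]
    # Closest training sample that belongs to a different class.
    other_vec = next(vec for vec, lab in ranked if lab != nearest_label)
    d_nearest = math.dist(nearest_vec, query)
    d_other = math.dist(other_vec, query)
    ratio = d_nearest / d_other if d_other > 0 else 1.0
    # A ratio near 1 means the query is about equally far from two classes,
    # so it is rejected as unknown instead of being forced into a known class.
    return nearest_label if ratio <= threshold else "unknown"


# Hypothetical feature vectors for two known species.
train_x = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]]
train_y = ["cod", "cod", "plaice", "plaice"]
print(osnn_predict(train_x, train_y, [0.05, 0.0]))
print(osnn_predict(train_x, train_y, [2.5, 2.5]))
```

The threshold controls the trade-off named in the abstract: a lower value rejects more samples as unknown, while a higher value keeps more samples in the known species at the cost of missing novel ones.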
26 pages, 330 KiB  
Article
Construction of Countably Infinite Programs That Evade Malware/Non-Malware Classification for Any Given Formal System
by Vasiliki Liagkou, Panagiotis E. Nastou, Paul Spirakis and Yannis C. Stamatiou
Cryptography 2025, 9(1), 16; https://doi.org/10.3390/cryptography9010016 - 4 Mar 2025
Viewed by 117
Abstract
The formal study of computer malware was initiated in the seminal work of Fred Cohen in the mid-80s, who applied elements of Computation Theory in the investigation of the theoretical limits of using the Turing Machine formal model of computation in detecting viruses. Cohen gave a simple but realistic formal definition of the characteristic actions of a computer virus as a Turing Machine that replicates itself and proved that detecting this behaviour, in general, is an undecidable problem. In this paper, we complement Cohen’s approach by providing a simple generalization of his definition of a computer virus so as to model any type of malware behaviour and showing that the malware/non-malware classification problem is, again, undecidable. Most importantly, beyond Cohen’s work, our work provides a generic theoretical framework for studying anti-malware applications and identifying, at an early stage, before their deployment, several of their inherent vulnerabilities which may lead to the construction of zero-day exploits and malware strains with stealth properties. To this end, we show that for any given formal system, which can be seen as an anti-malware formal model, there are infinitely many, effectively constructible programs for which no proof can be produced by the formal system that they are either malware or non-malware programs. Moreover, infinitely many of these programs are, indeed, malware programs which evade the detection powers of the given formal system.
35 pages, 5528 KiB  
Review
Vehicle to Grid: Technology, Charging Station, Power Transmission, Communication Standards, Techno-Economic Analysis, Challenges, and Recommendations
by Parag Biswas, Abdur Rashid, A. K. M. Ahasan Habib, Md Mahmud, S. M. A. Motakabber, Sagar Hossain, Md. Rokonuzzaman, Altaf Hossain Molla, Zambri Harun, Md Munir Hayet Khan, Wan-Hee Cheng and Thomas M. T. Lei
World Electr. Veh. J. 2025, 16(3), 142; https://doi.org/10.3390/wevj16030142 - 3 Mar 2025
Viewed by 378
Abstract
Electric vehicles (EVs) must be used as the primary mode of transportation as part of the gradual transition to more environmentally friendly clean energy technology and cleaner power sources. Vehicle-to-grid (V2G) technology has the potential to improve electricity demand, control load variability, and improve the sustainability of smart grids. The operation and principles of V2G and its varieties, the present classifications and types of EVs sold on the market, applicable policies for V2G and business strategy, implementation challenges, and current problem-solving techniques have not been thoroughly examined. This paper exposes the research gap in the V2G area and more accurately portrays the present difficulties and future potential in V2G deployment globally. The investigation starts by discussing the advantages of the V2G system and the necessary regulations and commercial representations implemented in the last decade, followed by a description of the V2G technology, charging communication standards, issues related to V2G and EV batteries, and potential solutions. A few major issues were brought to light by this investigation, including the lack of a transparent business model for V2G, the absence of stakeholder involvement and government subsidies, the excessive strain that V2G places on EV batteries, the lack of adequate bidirectional charging and standards, the introduction of harmonic voltage and current into the grid, and the potential for unethical and unscheduled V2G practices. The results of recent studies and publications from international organizations were altered to offer potential answers to these research constraints and, in some cases, to highlight the need for further investigation. V2G holds enormous potential, but the plan first needs a lot of financing, teamwork, and technological development.
(This article belongs to the Special Issue Electric Vehicles and Smart Grid Interaction)
Figures:
Figure 1. Schematic diagram of EV components.
Figure 2. Architectural types of EVs.
Figure 3. Internal configuration of different EV designs.
Figure 4. Schematic for EV charging system.
Figure 5. Vehicle-to-grid power transmission framework.
Figure 6. V2G functioning.
Figure 7. EV worldwide sales; STEPS scenario 2022–2030 [102].
Figure 8. Global electric car stock, 2015–2021. Adapted with permission from Ref. [103].
Figure 9. V2G or G2V integration with actors and stakeholders.
Figure 10. EV user and grid business interface.
24 pages, 5117 KiB  
Article
Estimation of Aboveground Biomass of Picea schrenkiana Forests Considering Vertical Zonality and Stand Age
by Guohui Zhang, Donghua Chen, Hu Li, Minmin Pei, Qihang Zhen, Jian Zheng, Haiping Zhao, Yingmei Hu and Jingwei Fan
Forests 2025, 16(3), 445; https://doi.org/10.3390/f16030445 - 1 Mar 2025
Viewed by 198
Abstract
The aboveground biomass (AGB) of forests reflects the productivity and carbon-storage capacity of the forest ecosystem. Although AGB estimation techniques have become increasingly sophisticated, the relationships between AGB, spatial distribution, and growth stages still require further exploration. In this study, the Picea schrenkiana (Picea schrenkiana var. tianschanica) forest area in the Kashi River Basin of the Ili River Valley in the western Tianshan Mountains was selected as the research area. Based on forest resources inventory data, Gaofen-1 (GF-1), Gaofen-6 (GF-6), Gaofen-3 (GF-3) Polarimetric Synthetic Aperture Radar (PolSAR), and DEM data, we classified the Picea schrenkiana forests in the study area into three cases: the Whole Forest without vertical zonation and stand age, Vertical Zonality Classification without considering stand age, and Stand-Age Classification without considering vertical zonality. Then, for each case, we used eXtreme Gradient Boosting (XGBoost), Back Propagation Neural Network (BPNN), and Residual Networks (ResNet), respectively, to estimate the AGB of forests in the study area. The results show that: (1) The integration of multi-source remote-sensing data and the ResNet can effectively improve the remote-sensing estimation accuracy of the AGB of Picea schrenkiana. (2) Furthermore, classification by vertical zonality and stand ages can reduce the problems of low-value overestimation and high-value underestimation to a certain extent.
(This article belongs to the Special Issue Modeling Aboveground Forest Biomass: New Developments)
Figures:
Figure 1. Geographic location map of the study area. (a) Position of the study area within China. (b) Position of the study area within Xinjiang Province. (c) Elevation map of the study area, along with the forest sample locations.
Figure 2. Topographic feature map of the study area. (a) Altitude map of the study area. (b) Vertical zonality division map of the study area. (c) Slope division map of the study area. (d) Aspect division map of the study area.
Figure 3. Classification map of stand ages of Picea schrenkiana forests.
Figure 4. The framework of the Back Propagation Neural Network model.
Figure 5. The framework of the Residual Network model. k represents the convolution kernel, s represents the stride, and p represents the padding.
Figure 6. The accuracy of each model using multi-source data in the case of vertical zonality.
Figure 7. The accuracy of each model using multi-source data in the case of stand ages.
Figure 8. Accuracy of each model using multi-source data in the case of whole forest.
Figure 9. AGB estimation results under three different modeling methods. (a) Output map of estimated AGB for the whole forest. (b) Output map of estimated AGB based on the classification of vertical zonality. (c) Output map of estimated AGB based on the classification of stand ages.
Figure 10. The comparison of AGB estimation results between this study and Yang’s study. (a) The results of this experimental study. (b) The results of Yang’s study.
17 pages, 1206 KiB  
Article
A Smoothing Newton Method for Real-Time Pricing in Smart Grids Based on User Risk Classification
by Linsen Song and Gaoli Sheng
Mathematics 2025, 13(5), 822; https://doi.org/10.3390/math13050822 - 28 Feb 2025
Viewed by 244
Abstract
Real-time pricing is an ideal pricing mechanism for regulating the balance of power supply and demand in smart grids. Considering the differences in electricity consumption risks among different types of users, a social welfare maximization model with user risk classification is proposed in this paper. Also, a smoothing Newton method is investigated for solving the proposed model. Firstly, the convexity of the model is discussed, which implies that the local optimum of the model is also the global optimum. Then, by transforming the proposed model into a smooth equation system based on the Karush–Kuhn–Tucker (KKT) conditions, we devise a smoothing Newton algorithm integrated with Powell–Wolfe line search criteria. The nonsingularity of the corresponding function’s Jacobian matrix is obtained to ensure the stability of the proposed algorithm. Finally, we compare the proposed model with the unclassified risk model, the proposed algorithm with the distributed algorithm, and real-time pricing with time-of-use pricing and fixed pricing. The numerical results demonstrate the effectiveness of the model and the algorithm.
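The abstract does not say which smoothing function is used to turn the KKT conditions into a smooth equation system. As a hedged illustration, the sketch below uses one textbook choice, a smoothed Fischer-Burmeister function; the name `smoothed_fb` and the sample values are assumptions for illustration only, not the authors' formulation.

```python
import math


def smoothed_fb(a, b, mu):
    """Smoothed Fischer-Burmeister function (an assumed, textbook choice).

    For mu > 0 it is differentiable everywhere and equals zero exactly when
    a > 0, b > 0 and a * b == mu. As mu -> 0 this recovers the complementarity
    condition a >= 0, b >= 0, a * b == 0 that appears in KKT systems.
    """
    return a + b - math.sqrt(a * a + b * b + 2.0 * mu)


# With mu = 1 the pair (2, 0.5) satisfies a * b = mu, so the function vanishes.
print(smoothed_fb(2.0, 0.5, 1.0))

# With mu = 0 a complementary pair (0, 3) gives zero, while (1, 1) does not.
print(smoothed_fb(0.0, 3.0, 0.0))
print(smoothed_fb(1.0, 1.0, 0.0))
```

A smoothing Newton method of this kind typically applies Newton steps to the smoothed system while driving `mu` toward zero; the line search and update rules are specific to the paper and are not reproduced here.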
Figures:
Figure 1. The flow chart of Algorithm 1.
Figure 2. Comparison of price between the risk-classified model and unclassified model under different scales of users based on the smoothing Newton algorithm.
Figure 3. Comparison of the social welfare between risk-classified model and unclassified model under different scales of users based on the smoothing Newton algorithm.
Figure 4. Comparison of price and the social welfare between the smoothing Newton algorithm and the distributed algorithm.
Figure 5. Comparison of price and the social welfare between the RTP, TOU, and FP strategies.
16 pages, 2179 KiB  
Article
MNv3-MFAE: A Lightweight Network for Video Action Recognition
by Jie Liu, Wenyue Liu and Ke Han
Electronics 2025, 14(5), 981; https://doi.org/10.3390/electronics14050981 - 28 Feb 2025
Viewed by 211
Abstract
Video action recognition aims to achieve the automatic classification of human behaviors by analyzing the actions in videos, with its core lying in accurately capturing the spatial detail features of images and the temporal dynamic features among video frames. In response to the problems of limited action recognition accuracy in videos containing complex temporal dynamics and large network model parameters, this paper proposes an innovative multi-feature fusion information modeling method. This paper designs a plug-and-play multi-feature action extraction (MFAE) module. The module adopts a multi-branch parallel processing strategy and integrates the functions of modeling and extracting temporal features, spatial features, and motion features to ensure the efficient modeling of the spatio-temporal information, inter-frame differences, and temporal dependencies of video actions. Meanwhile, the network employs a lightweight channel attention module (TiedSE), which reduces the complexity of the network model and decreases the number of network parameters. Finally, the effectiveness of the model is demonstrated on the Jester dataset, SomethingV2 dataset, and UCF101 dataset, achieving accuracies of 94.01%, 66.19%, and 96.74% with only 1.45 M parameters, significantly fewer than existing algorithms. The proposed method balances accuracy and computational efficiency in video action recognition, overcoming the shortcomings of traditional algorithms in temporal modeling and demonstrating its effectiveness in the task of video action recognition.
Figures:
Figure 1. Diagram of the improved MobileNetV3 network processing a video action sequence.
Figure 2. Structure of the multi-feature action extraction module.
Figure 3. Comprehensive structural diagram of the spatial-temporal extraction module for redundancy elimination: (a) STE module; (b) STEV module.
Figure 4. SCRU module.
Figure 5. Structural diagram of the LTA module for long-distance time series: (a) TA module; (b) LTA module.
Figure 6. TiedSE module.
Figure 7. Schematic diagram of the convolution process: (a) regular convolution; (b) group convolution; (c) bound-block convolution.
Figure 8. Heat map of video action recognition on the Jester dataset: (a) turning hand clockwise; (b) thumb up; (c) turning hand counterclockwise. Heat map of video action recognition on the UCF101 dataset: (d) walking with dog; (e) basketball; (f) bench press.
Figure 9. Confusion matrix for 27-class gesture classification on the Jester dataset.
17 pages, 72606 KiB  
Article
Classification of Large Scale Hyperspectral Remote Sensing Images Based on LS3EU-Net++
by Hengqian Zhao, Zhengpu Lu, Shasha Sun, Pan Wang, Tianyu Jia, Yu Xie and Fei Xu
Remote Sens. 2025, 17(5), 872; https://doi.org/10.3390/rs17050872 - 28 Feb 2025
Viewed by 155
Abstract
To address the limitation that existing hyperspectral classification methods were mainly oriented to small-scale images, this paper proposed a new large-scale hyperspectral remote sensing image classification method, LS3EU-Net++ (Lightweight Encoder and Integrated Spatial Spectral Squeeze and Excitation U-Net++). The method optimized the U-Net++ architecture by introducing a lightweight encoder and combining the Spatial Spectral Squeeze and Excitation (S3E) Attention Module, which maintained the powerful feature extraction capability while significantly reducing the training cost. In addition, the model employed a composite loss function combining focal loss and Jaccard loss, which could focus more on difficult samples, thus improving pixel-level accuracy and classification results. To solve the sample imbalance problem in hyperspectral images, this paper also proposed a data enhancement strategy based on “copy–paste”, which effectively increased the diversity of the training dataset. Experiments on large-scale satellite hyperspectral remote sensing images from the Zhuhai-1 satellite demonstrated that LS3EU-Net++ exhibited superiority over the U-Net++ benchmark. Specifically, the overall accuracy (OA) was improved by 5.35%, and the mean Intersection over Union (mIoU) by 12.4%. These findings suggested that the proposed method provided a robust solution for large-scale hyperspectral image classification, effectively balancing accuracy and computational efficiency.
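The composite loss mentioned in this abstract, focal loss plus Jaccard loss, can be sketched for the simplest case. The following is a minimal illustration and not the authors' implementation: it is restricted to binary per-pixel probabilities, and the mixing weight `alpha` and focusing parameter `gamma` are assumed values that the abstract does not specify.

```python
import math


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; larger gamma down-weights easy pixels more."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p          # probability given to the true class
        p_t = min(max(p_t, eps), 1.0 - eps)     # clamp to keep log() finite
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def jaccard_loss(probs, targets, eps=1e-7):
    """Soft Jaccard (IoU) loss: 1 - intersection / union."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    union = sum(probs) + sum(targets) - intersection
    return 1.0 - (intersection + eps) / (union + eps)


def composite_loss(probs, targets, alpha=0.5, gamma=2.0):
    """Weighted sum of the two terms; alpha is an assumed mixing weight."""
    return (alpha * focal_loss(probs, targets, gamma)
            + (1.0 - alpha) * jaccard_loss(probs, targets))


# Confident, correct predictions score lower than hesitant ones.
print(composite_loss([0.9, 0.1], [1, 0]))
print(composite_loss([0.6, 0.4], [1, 0]))
```

The focal term concentrates the gradient on hard pixels, while the Jaccard term rewards region overlap directly, which is one common motivation for pairing them in segmentation-style classification.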
(This article belongs to the Topic Hyperspectral Imaging and Signal Processing)
Figures:
Figure 1. Schematic diagram of the proposed method.
Figure 2. Percentage of samples per class in a large-scale dataset.
Figure 3. Data-enhanced samples and their corresponding original samples: (a) is the original sample of (d), (b) is the original sample of (e), and (c) is the original sample of (f). The red boxes show the portion of the bare soil that changed after the data enhancement was performed.
Figure 4. Rate of change of percentage of samples from each class after data augmentation.
Figure 5. Schematic diagram of a common residual CNN (a) and light-weighted MobileNetV2 (b).
Figure 6. Schematic diagram of the S3E model.
Figure 7. True color display of one of the sub-images and its corresponding ground truth of LHSI-A.
Figure 8. Results of the test set experiment, where green is vegetation, red is buildings, blue is water, yellow is bare soil, and black is background: (f–j) correspond to the labeled true-color display plots for (a–e), respectively, and (k–o) correspond to the predicted plots for (a–e), respectively.
Figure 9. True color display (a), ground truth map (b), and the predicted map of LS3EU-Net++ on the LHSI-B dataset (c), where green is vegetation, red is buildings, blue is water, and yellow is bare soil.
19 pages, 7206 KiB  
Article
Optimizing Model Performance and Interpretability: Application to Biological Data Classification
by Zhenyu Huang, Xuechen Mu, Yangkun Cao, Qiufen Chen, Siyu Qiao, Bocheng Shi, Gangyi Xiao, Yan Wang and Ying Xu
Genes 2025, 16(3), 297; https://doi.org/10.3390/genes16030297 - 28 Feb 2025
Viewed by 260
Abstract
This study introduces a novel framework that simultaneously addresses the challenges of performance accuracy and result interpretability in transcriptomic-data-based classification. Background/objectives: In biological data classification, it is challenging to achieve both high performance accuracy and interpretability at the same time. This study presents a framework to address both challenges in transcriptomic-data-based classification. The goal is to select features, models, and a meta-voting classifier that optimizes both classification performance and interpretability. Methods: The framework consists of a four-step feature selection process: (1) the identification of metabolic pathways whose enzyme-gene expressions discriminate samples with different labels, aiding interpretability; (2) the selection of pathways whose expression variance is largely captured by the first principal component of the gene expression matrix; (3) the selection of minimal sets of genes, whose collective discerning power covers 95% of the pathway-based discerning power; and (4) the introduction of adversarial samples to identify and filter genes sensitive to such samples. Additionally, adversarial samples are used to select the optimal classification model, and a meta-voting classifier is constructed based on the optimized model results. Results: The framework applied to two cancer classification problems showed that in the binary classification, the prediction performance was comparable to the full-gene model, with F1-score differences between −5% and 5%. In the ternary classification, the performance was significantly better, with F1-score differences ranging from −2% to 12%, while also maintaining excellent interpretability of the selected feature genes. Conclusions: This framework effectively integrates feature selection, adversarial sample handling, and model optimization, offering a valuable tool for a wide range of biological data classification problems. Its ability to balance performance accuracy and high interpretability makes it highly applicable in the field of computational biology.
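The meta-voting step described in this abstract can be pictured as a weighted vote over the primary classifiers. The sketch below is an assumption-laden illustration rather than the authors' method: `weighted_vote`, the model names, and the weights are all hypothetical, and the paper's actual weighting scheme is not given in the abstract.

```python
def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: mapping from model name to its predicted label
    weights:     mapping from model name to a non-negative score,
                 e.g. one derived from a model-selection stage
    """
    totals = {}
    for model, label in predictions.items():
        totals[label] = totals.get(label, 0.0) + weights[model]
    return max(totals, key=totals.get)


# Hypothetical example: one strong model outvotes two weaker ones.
predictions = {"svm": "tumor", "random_forest": "normal", "knn": "normal"}
weights = {"svm": 0.9, "random_forest": 0.3, "knn": 0.4}
print(weighted_vote(predictions, weights))
```

Weighting by a per-model score means a classifier that proved more reliable during model selection has more influence than a simple majority vote would give it.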
(This article belongs to the Section Bioinformatics)
Figures:
Figure 1. Construction of a feature selection and model selection framework. (A) This step includes four stages: (1) Identification of differentially expressed genes. (2) Identification of representative metabolic pathways in samples with distinct labels. (3) Identification of key genes within the representative metabolic pathways. (4) Identification of genes that are insensitive to adversarial samples. (B) Evaluation of model suitability for data based on accuracy, classification stability, and robustness. Subsequently, computation of the weights of primary classifiers based on scores derived from model selection, followed by the construction of a meta-classifier.
Figure 2. Feature evaluation. (A) Statistical overview of DEGs and enzyme genes in the binary and ternary datasets. (B) Occurrence statistics of the enriched pathways across sample groups with distinct labels in the binary and three-class datasets. (C) Gene counts in pathways meeting thresholding criteria in the binary and ternary datasets. Kolmogorov–Smirnov (KS) test for γ distribution is provided. (D) Distribution of importance scores over adversarial samples for the final selected features in the binary and ternary datasets. (E) Relationships between the final selected feature genes and cellular functions in the binary classification dataset. (F) Relationships between the final selected feature genes and cellular functions in the ternary classification dataset. Note: “2C” represents the binary classification dataset, and “3C” represents the ternary classification dataset.
Figure 3. Comparison of prediction performance for five classifiers and the stacking voting classifier using all genes vs. selected feature genes in two problems. (A) Prediction performance in the binary classification using the full gene set (60,499 genes). (B) Prediction performance in the binary classification using our 25 enzyme genes. (C) Prediction performance in the ternary problem using the full gene set (60,499 genes). (D) Prediction performance in the ternary problem using our 23 enzyme genes. Note: “2C” represents the binary classification dataset, and “3C” represents the ternary classification dataset.
Figure 4. Model selection. (A) $\mathrm{F}'_{1}$ score evaluation of six models in the binary problem. The square box represents the variance of the $\mathrm{F}'_{1}$ score over 100 iterations. (B) $\mathrm{F}'_{1}$ score evaluation of six models in the ternary problem. (C) Classification stability and robustness of the six models. (D) Comprehensive evaluation of $\mathrm{F}'_{1}$ score, CS, and CR metrics of the six models in the binary problem. (E) Comprehensive evaluation of $\mathrm{F}'_{1}$ score, CS, and CR metrics of the six models in the ternary problem.
Figure 5. F1-score and accuracy statistics of the stacking voting classifier in the test set by our framework. (A) F1-score statistics of our classifier in the binary problem. (B) Accuracy statistics of our classifier in the binary problem. (C) F1-score statistics of our classifier in the ternary problem. (D) Accuracy statistics of our classifier in the ternary problem.
Figure 6. Comparison of prediction performance between our model, RFE, interpretable white-box models, and SHAP. (A) Difference in F1 scores between features selected by our framework and the top features selected by RFE in the binary classification. Positive values indicate higher F1 scores achieved by our framework. (B) Difference in F1 scores between features selected by our framework and the top features selected by RFE in the ternary classification. Positive values indicate higher F1 scores achieved by our framework. (C) F1-score comparison between features selected by our framework and the top features selected by white-box models (EBM and RuleFit) in both binary and ternary classification problems. (D) Difference in F1 scores between features selected by our framework and the top features selected by SHAP in the binary and ternary classification problems. Positive values indicate higher F1 scores achieved by our framework.
Figure 7. Overview of our algorithmic framework.
28 pages, 1129 KiB  
Article
Mass Generation of Programming Learning Problems from Public Code Repositories
by Oleg Sychev and Dmitry Shashkov
Big Data Cogn. Comput. 2025, 9(3), 57; https://doi.org/10.3390/bdcc9030057 - 28 Feb 2025
Viewed by 234
Abstract
We present an automatic approach for generating learning problems for teaching introductory programming in different programming languages. The current implementation allows input and output in the three most popular programming languages for teaching introductory programming courses: C++, Java, and Python. The generator stores learning problems using the “meaning tree”, a language-independent representation of a syntax tree. During this study, we generated a bank of 1,428,899 learning problems focused on the order of expression evaluation. They were generated in about 16 h. For further use, the learning problems were classified by the concepts used, possible domain-rule violations, and required skills; they covered a wide range of difficulties and topics. The problems were validated by automatically solving them in an intelligent tutoring system that recorded the actual skills used and violations made. The generated problems were favorably assessed by 10 experts: teachers and teaching assistants in introductory programming courses. They noted that the problems are ready for use without further manual improvement and that the classification system is flexible enough to receive problems with desirable properties. The proposed approach combines the advantages of different state-of-the-art methods. It combines the diversity of learning problems generated by restricted randomization and large language models with full correctness and a natural look of template-based problems, which makes it a good fit for large-scale learning problem generation.
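To make the "order of expression evaluation" task concrete, the sketch below lists the binary operators of a simple arithmetic expression in the order Python applies them. It uses Python's standard `ast` module as a stand-in; it is not the paper's "meaning tree", and `evaluation_order` and `OP_SYMBOLS` are hypothetical names. Short-circuit and comparison operators are deliberately left out.

```python
import ast

# Symbols for the arithmetic operators this sketch understands.
OP_SYMBOLS = {
    ast.Add: "+",
    ast.Sub: "-",
    ast.Mult: "*",
    ast.Div: "/",
    ast.Mod: "%",
    ast.Pow: "**",
}


def evaluation_order(expression):
    """Return the binary operators of `expression` in evaluation order."""
    tree = ast.parse(expression, mode="eval")
    order = []

    def visit(node):
        if isinstance(node, ast.BinOp):
            # Both operands are evaluated (left first) before the operator applies.
            visit(node.left)
            visit(node.right)
            order.append(OP_SYMBOLS.get(type(node.op), type(node.op).__name__))
        elif isinstance(node, ast.UnaryOp):
            visit(node.operand)

    visit(tree.body)
    return order


print(evaluation_order("a + b * c - d"))
print(evaluation_order("(a + b) * c"))
```

A post-order walk of the syntax tree yields the order in which operators are applied, which is the kind of answer a learner would be asked to reproduce by clicking the operators in sequence.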
(This article belongs to the Special Issue Application of Semantic Technologies in Intelligent Environment)
Figures:
Figure 1. The concept of meaning tree.
Figure 2. Example of Meaning Tree conversion process.
Figure 3. Example of transformation of a Python expression with compound comparison using Directed Acyclic Graph.
Figure 4. Component Diagram of the Learning Problem Generator.
Figure 5. Example of solving a problem with Java/C++ expression in the ITS. The underlined symbols are expression operators that the learner must press in the order of their evaluations. The numbers above are the positions of the expression operators, and the numbers below prefixed with the # sign are the order of evaluation of the operators that have already been pressed by the learner.
Figure 6. Example of solving a problem with Python expression in the ITS. The underlines and numbers play the same role as in the previous figure.
Figure 7. Distribution of learning problems per number of concepts.
Figure 8. Distribution of learning problems per violations of subject-domain rules.
Figure 9. Distribution of learning problems per number of required skills.
Figure 10. Overview of problem-selection settings.