Entropy and Information Theory in Machine Learning: Theoretical Insights and Applications

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Signal and Data Analysis".

Deadline for manuscript submissions: 31 December 2024 | Viewed by 9487

Special Issue Editors

Guest Editor
School of Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, CA 95053, USA
Interests: deep learning; machine learning; adaptive filters; signal processing; applications

Guest Editor
Department of Electrical Engineering, Indian Institute of Technology, Gandhinagar, India
Interests: active noise control; adaptive signal processing; assistive listening devices; psychoacoustics

Guest Editor
Department of Information Engineering, Electronics and Telecommunications (DIET), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
Interests: deep learning; adaptive filters; machine learning; audio signal processing

Special Issue Information

Dear Colleagues,

This Special Issue of Entropy, titled “Entropy and Information Theory in Machine Learning: Theoretical Insights and Applications”, is the second volume following our first Special Issue, “Adaptive signal processing and Machine Learning Using Entropy and Information Theory”.

Adaptive signal processing, machine learning and deep learning, which rely on the paradigm of learning from data, have become indispensable tools for extracting information, making decisions and interacting with our environment. Information extraction is a critical step in this pipeline. Many of the algorithms deployed for it are based on the popular mean square error (MSE) criterion and leverage the information contained in the data. The more accurately useful information is extracted from the data, the more precise and efficient the learning and signal processing will be.

It is well known that information theoretic learning (ITL)-based cost measures can provide better nonlinear models in a range of problems, from system identification and regression to classification. ITL was initially applied to such supervised learning applications. ITL-based cost measures also perform better in supervised learning when the error distribution is non-Gaussian.

Entropy and information theory have always been useful tools for dealing with information and for quantifying the amount of information contained in a random variable. Information theory relies on the basic intuition that learning that an unlikely event has occurred is more informative than learning that a likely event has occurred. Entropy measures the average amount of information in an event drawn from a distribution. For this reason, entropy and information theory have been widely used in adaptive signal processing and machine learning to improve performance by designing and optimizing effective, specific models that fit the data, even in noisy and adverse conditions.
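The intuition above can be made concrete with a minimal Python sketch. It is illustrative only and not part of the Special Issue text: the function names `self_information` and `shannon_entropy` are assumptions chosen here, and logarithms are taken base 2 so that values are in bits.

```python
import math
from typing import Sequence


def self_information(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy, in bits: the expected self-information."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Zero-probability outcomes contribute nothing to the sum.
    return sum(p * self_information(p) for p in probs if p > 0.0)


# Hypothetical probabilities, chosen only for illustration.
rare = self_information(0.01)    # an unlikely event
common = self_information(0.99)  # a likely event
fair = shannon_entropy([0.5, 0.5])
biased = shannon_entropy([0.9, 0.1])

print(f"rare event: {rare:.3f} bits, common event: {common:.3f} bits")
print(f"fair coin: {fair:.3f} bits, biased coin: {biased:.3f} bits")
```

Under these definitions the unlikely event carries more information than the likely one, and the biased coin has lower entropy than the fair coin because its outcome is more predictable.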

The presence of strong disturbances in the error signal can severely deteriorate the convergence behavior of adaptive filters and, in some cases, cause the learning algorithms to diverge. Information theoretic learning (ITL) approaches have recently emerged as an effective solution to handle such scenarios.

Widely adopted measures include mutual information, cross-entropy, the minimum error entropy (MEE) criterion, the maximum correntropy criterion (MCC) and the Kullback–Leibler divergence, among others. Moreover, a wide class of tasks in adaptive signal processing, machine learning and deep learning takes advantage of entropy and information theory, including exploratory data analysis, feature and model selection, sampling and subset extraction, optimizing learning algorithms, clustering sensitivity analysis, representation learning, and data generation.
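To illustrate why correntropy-type measures are considered robust to strong disturbances, the following Python sketch compares the MSE with a sample estimate of correntropy on the same error signal, with and without a single impulsive outlier. It is a sketch under stated assumptions: the Gaussian kernel is used without its normalizing constant, the kernel width `sigma = 1.0` is arbitrary, and the error values are hypothetical.

```python
import math
from typing import Sequence


def mean_squared_error(errors: Sequence[float]) -> float:
    """Average of squared errors; grows without bound with outliers."""
    return sum(e * e for e in errors) / len(errors)


def correntropy(errors: Sequence[float], sigma: float = 1.0) -> float:
    """Sample correntropy of the error with an unnormalized Gaussian kernel.

    Each sample contributes a value in (0, 1], so a single outlier can
    change the estimate by at most 1 / len(errors). The MCC maximizes
    this quantity.
    """
    return sum(math.exp(-(e * e) / (2.0 * sigma * sigma)) for e in errors) / len(errors)


# Hypothetical error signals: the second has one impulsive disturbance.
clean = [0.1, -0.2, 0.05, 0.15, -0.1]
impulsive = clean[:-1] + [50.0]

mse_clean = mean_squared_error(clean)
mse_impulsive = mean_squared_error(impulsive)
corr_clean = correntropy(clean)
corr_impulsive = correntropy(impulsive)

print(f"MSE:         clean={mse_clean:.4f}, impulsive={mse_impulsive:.4f}")
print(f"Correntropy: clean={corr_clean:.4f}, impulsive={corr_impulsive:.4f}")
```

The outlier inflates the MSE by several orders of magnitude, whereas the correntropy estimate drops by less than `1 / len(errors)`. This bounded influence is one common explanation for the robustness of MCC-based adaptive filters.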

This Special Issue aims to provide an avenue for the publication of recent developments in entropy- and information-theoretic measures used in machine learning. We solicit papers expounding on theoretical insights as well as the latest applications of these techniques for solving various problems.

Prof. Dr. Tokunbo Ogunfunmi
Dr. Nithin V. George
Dr. Danilo Comminiello
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • adaptive signal processing and adaptive filters
  • machine listening and deep learning
  • information theoretic learning
  • generalized maximum correntropy criterion (GMCC)
  • maximum correntropy criterion (MCC) and cyclic correntropy
  • nonlinear adaptive filters
  • robust signal processing and robust learning
  • impulsive noise
  • model selection and feature extraction
  • Bayesian learning and representation learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)

Research

14 pages, 18423 KiB  
Article
The Application of Tsallis Entropy Based Self-Adaptive Algorithm for Multi-Threshold Image Segmentation
by Kailong Zhang, Mingyue He, Lijie Dong and Congjie Ou
Entropy 2024, 26(9), 777; https://doi.org/10.3390/e26090777 - 10 Sep 2024
Viewed by 780
Abstract
Tsallis entropy has been widely used in image thresholding because of its non-extensive properties. The non-extensive parameter q contained in this entropy plays an important role in various adaptive algorithms and has been successfully applied in bi-level image thresholding. In this paper, the relationships between parameter q and pixels’ long-range correlations have been further studied within multi-threshold image segmentation. It is found that the pixels’ correlations are remarkable and stable for images generated by a known physical principle, such as infrared images, medical CT images, and color satellite remote sensing images. The corresponding non-extensive parameter q can be evaluated by using the self-adaptive Tsallis entropy algorithm. The results of this algorithm are compared with those of the Shannon entropy algorithm and the original Tsallis entropy algorithm in terms of quantitative image quality evaluation metrics PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity). Furthermore, we observed that for image series with the same background, the q values determined by the adaptive algorithm are consistently kept in a narrow range. Therefore, similar or identical scenes during imaging would produce similar strength of long-range correlations, which provides potential applications for unsupervised image processing.
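As background for the non-extensive parameter q mentioned in this abstract, the following Python sketch evaluates the standard Tsallis entropy, S_q = (1 - Σ p_i^q) / (q - 1), and checks its pseudo-additivity for two independent distributions. It shows only the textbook definition, not the self-adaptive algorithm of the paper; the function name, the distributions and the value q = 0.8 are assumptions made for illustration.

```python
import math
from typing import Sequence


def tsallis_entropy(probs: Sequence[float], q: float) -> float:
    """Tsallis entropy S_q with the constant k set to 1 (natural units)."""
    if abs(q - 1.0) < 1e-12:
        # The limit q -> 1 recovers the Shannon entropy in nats.
        return -sum(p * math.log(p) for p in probs if p > 0.0)
    return (1.0 - sum(p ** q for p in probs if p > 0.0)) / (q - 1.0)


# Hypothetical distributions of two independent systems A and B.
pa = [0.7, 0.2, 0.1]
pb = [0.6, 0.4]
q = 0.8

# Joint distribution of the independent pair (A, B).
joint = [a * b for a in pa for b in pb]

s_a = tsallis_entropy(pa, q)
s_b = tsallis_entropy(pb, q)
s_joint = tsallis_entropy(joint, q)

# Non-extensivity: the joint entropy differs from s_a + s_b by a
# cross term weighted by (1 - q).
pseudo_additive = s_a + s_b + (1.0 - q) * s_a * s_b

shannon_nats = tsallis_entropy(pa, 1.0)
near_one = tsallis_entropy(pa, 1.0001)

print(f"S_q(A,B) = {s_joint:.6f}, pseudo-additive sum = {pseudo_additive:.6f}")
print(f"Shannon = {shannon_nats:.6f}, Tsallis at q=1.0001 = {near_one:.6f}")
```

For q = 1 the cross term vanishes and the entropy becomes additive, which is the Shannon case the abstract uses as a baseline.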
Figures
Figure 1. Example images from the BSDS0500 image dataset.
Figure 2. (a,b) are images in the image set INFRAIMGS1. (c) are images in the image set INFRAIMGS2. (d) are images in the image set INFRAIMGS3. (e) are images in the image set INFRAIMGS4. (f) are images in the image set CTIMGS.
Figure 3. Example satellite images of Yellowstone and Padma. In the first row, from left to right, they are Yellowstone 1993, 1997, 2002, 2009, and 2017. In the second row, from left to right, they are Padma 1992, 1996, 2004, 2014, and 2016.
Figure 4. The four-level segmentation results of the typical images from eight datasets by using three different algorithms.
Figure 5. Sorted fitness values for INFRAIMGS2 from Figure 4, based on 100 independent runs by different algorithms with different number of thresholds.

9 pages, 1800 KiB  
Article
Graph Adaptive Attention Network with Cross-Entropy
by Zhao Chen
Entropy 2024, 26(7), 576; https://doi.org/10.3390/e26070576 - 4 Jul 2024
Viewed by 853
Abstract
Non-Euclidean data, such as social networks and citation relationships between documents, have node and structural information. The Graph Convolutional Network (GCN) can automatically learn node features and association information between nodes. The core ideology of the Graph Convolutional Network is to aggregate node information by using edge information, thereby generating a new node feature. In updating node features, there are two core influencing factors. One is the number of neighboring nodes of the central node; the other is the contribution of the neighboring nodes to the central node. Due to the previous GCN methods not simultaneously considering the numbers and different contributions of neighboring nodes to the central node, we design the adaptive attention mechanism (AAM). To further enhance the representational capability of the model, we utilize Multi-Head Graph Convolution (MHGC). Finally, we adopt the cross-entropy (CE) loss function to describe the difference between the predicted results of node categories and the ground truth (GT). Combined with backpropagation, this ultimately achieves accurate node classification. Based on the AAM, MHGC, and CE, we contrive the novel Graph Adaptive Attention Network (GAAN). The experiments show that classification accuracy achieves outstanding performances on Cora, Citeseer, and Pubmed datasets.
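The cross-entropy loss named in this abstract can be sketched for a single node as follows. This is a generic softmax cross-entropy in plain Python, not the GAAN implementation; the function names, the three-class setup and the logit values are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability assigned to the ground-truth class."""
    return -math.log(softmax(logits)[target])


# Hypothetical class scores for one node in a three-class problem.
logits = [5.0, 0.0, 0.0]

loss_correct = cross_entropy(logits, target=0)  # prediction matches GT
loss_wrong = cross_entropy(logits, target=1)    # prediction contradicts GT
loss_uniform = cross_entropy([0.0, 0.0, 0.0], target=2)

print(f"correct: {loss_correct:.4f}, wrong: {loss_wrong:.4f}, uniform: {loss_uniform:.4f}")
```

The loss is small when the predicted distribution concentrates on the ground-truth class and large otherwise; an uninformative uniform prediction over three classes costs ln 3. In training, this value would typically be averaged over the labeled nodes before backpropagation.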
Figures
Figure 1. The structure of GCN.
Figure 2. The structure of GAAN. The different color arrows represent Multi-head Graph Convolution (MHGC) and the different color circles represent separate categories of nodes.

15 pages, 531 KiB  
Article
Adaptive Joint Carrier and DOA Estimations of FHSS Signals Based on Knowledge-Enhanced Compressed Measurements and Deep Learning
by Yinghai Jiang and Feng Liu
Entropy 2024, 26(7), 544; https://doi.org/10.3390/e26070544 - 26 Jun 2024
Viewed by 1127
Abstract
As one of the most widely used spread spectrum techniques, the frequency-hopping spread spectrum (FHSS) has been widely adopted in both civilian and military secure communications. In this technique, the carrier frequency of the signal hops pseudo-randomly over a large range, compared to the baseband. To capture an FHSS signal, conventional non-cooperative receivers without knowledge of the carrier have to operate at a high sampling rate covering the entire FHSS hopping range, according to the Nyquist sampling theorem. In this paper, we propose an adaptive compressed method for joint carrier and direction of arrival (DOA) estimations of FHSS signals, enabling subsequent non-cooperative processing. The compressed measurement kernels (i.e., non-zero entries in the sensing matrix) have been adaptively designed based on the posterior knowledge of the signal and task-specific information optimization. Moreover, a deep neural network has been designed to ensure the efficiency of the measurement kernel design process. Finally, the signal carrier and DOA are estimated based on the measurement data. Through simulations, the performance of the adaptively designed measurement kernels is proved to be improved over the random measurement kernels. In addition, the proposed method is shown to outperform the compressed methods in the literature.
Figures
Figure 1. The proposed framework for carrier and DOA estimations of the FHSS signal based on adaptive compressed measurements.
Figure 2. The proposed DNN structure to conduct the adaptive measurement kernel design.
Figure 3. RMSE comparison of the estimated carriers.
Figure 4. RMSE comparison of the estimated DOAs.
Figure 5. Sampled comparisons between the carriers and DOAs in ground truth and those estimated using the CaSCADE algorithm.
Figure 6. Sampled comparisons between the carriers and DOAs in ground truth and those estimated using the MWC/MUSIC-based method.
Figure 7. Sampled comparisons between the carriers and DOAs in ground truth and those estimated using the proposed framework with random measurement kernels.
Figure 8. Sampled comparisons between the carriers and DOAs in ground truth and those estimated using the proposed adaptive method.

14 pages, 7993 KiB  
Article
An Improved Toeplitz Approximation Method for Coherent DOA Estimation in Impulsive Noise Environments
by Jiang’an Dai, Tianshuang Qiu, Shengyang Luan, Quan Tian and Jiacheng Zhang
Entropy 2023, 25(6), 960; https://doi.org/10.3390/e25060960 - 20 Jun 2023
Cited by 2 | Viewed by 1628
Abstract
Direction of arrival (DOA) estimation is an important research topic in array signal processing and widely applied in practical engineering. However, when signal sources are highly correlated or coherent, conventional subspace-based DOA estimation algorithms will perform poorly due to the rank deficiency in the received data covariance matrix. Moreover, conventional DOA estimation algorithms are usually developed under Gaussian-distributed background noise, which will deteriorate significantly in impulsive noise environments. In this paper, a novel method is presented to estimate the DOA of coherent signals in impulsive noise environments. A novel correntropy-based generalized covariance (CEGC) operator is defined and proof of boundedness is given to ensure the effectiveness of the proposed method in impulsive noise environments. Furthermore, an improved Toeplitz approximation method combined CEGC operator is proposed to estimate the DOA of coherent sources. Compared to other existing algorithms, the proposed method can avoid array aperture loss and perform more effectively, even in cases of intense impulsive noise and low snapshot numbers. Finally, comprehensive Monte-Carlo simulations are performed to verify the superiority of the proposed method under various impulsive noise conditions.
Figures
Figure 1. Spatial spectrograms comparison.
Figure 2. Experimental results vs. GSNRs.
Figure 3. Experimental results vs. characteristic exponent α.
Figure 4. Experimental results vs. number of snapshots.

17 pages, 3905 KiB  
Article
A Novel ECG Signal Denoising Algorithm Based on Sparrow Search Algorithm for Optimal Variational Modal Decomposition
by Jiandong Mao, Zhiyuan Li, Shun Li and Juan Li
Entropy 2023, 25(5), 775; https://doi.org/10.3390/e25050775 - 10 May 2023
Cited by 2 | Viewed by 2126
Abstract
ECG signal processing is an important basis for the prevention and diagnosis of cardiovascular diseases; however, the signal is susceptible to noise interference mixed with equipment, environmental influences, and transmission processes. In this paper, an efficient denoising method based on the variational modal decomposition (VMD) algorithm combined with and optimized by the sparrow search algorithm (SSA) and singular value decomposition (SVD) algorithm, named VMD–SSA–SVD, is proposed for the first time and applied to the noise reduction of ECG signals. SSA is used to find the optimal combination of parameters of VMD [K,α], VMD–SSA decomposes the signal to obtain finite modal components, and the components containing baseline drift are eliminated by the mean value criterion. Then, the effective modalities are obtained in the remaining components using the mutual relation number method, and each effective modal is processed by SVD noise reduction and reconstructed separately to finally obtain a clean ECG signal. In order to verify the effectiveness, the methods proposed are compared and analyzed with wavelet packet decomposition, empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) algorithm. The results show that the noise reduction effect of the VMD–SSA–SVD algorithm proposed is the most significant, and that it can suppress the noise and remove the baseline drift interference at the same time, and effectively retain the morphological characteristics of the ECG signals.
Figures
Figure 1. Waveform of standard heartbeat in one cycle.
Figure 2. Waveform plots of three types of noise. (a) Baseline drift, (b) electrode activity interference, and (c) myoelectric interference.
Figure 3. Fitness value curves of different optimization algorithms.
Figure 4. Parameter optimization and fitness iteration process of VMD–SSA–SVD. (a) Fitness value. (b) Convergence of decomposition layers. (c) Convergence of the quadratic penalty factor.
Figure 5. Noise reduction effect of 103 simulated signals containing noise. (a) The original 103 signal, (b) the simulated 103 signal containing noise, (c) effect of noise reduction by EMD, (d) effect of noise reduction by EEMD, (e) effect of noise reduction by CEEMDAN, (f) effect of noise reduction by wavelet threshold, (g) effect of noise reduction by WOA–VMD, and (h) effect of noise reduction by VMD–SSA–SVD.
Figure 6. Noise reduction effect of 103 signal with real base drift. (a) The 103 signal with real base drift, (b) original 103 signal, (c) real base drift, (d) effect of noise reduction by EMD, (e) effect of noise reduction by EEMD, (f) effect of noise reduction by wavelet threshold, (g) effect of noise reduction by CEEMDAN, and (h) effect of noise reduction by VMD–SSA–SVD.
Figure 7. Noise reduction effect of 105 signal with real base drift. (a) The 105 signal with real base drift, (b) original 105 signal, (c) real base drift, (d) effect of noise reduction by EMD, (e) effect of noise reduction by EEMD, (f) effect of noise reduction by wavelet threshold, (g) effect of noise reduction by CEEMDAN, and (h) effect of noise reduction by VMD–SSA–SVD.
Figure 8. Noise reduction effect of actual 212 ECG signal. (a) Original 212 signal, (b) effect of noise reduction by EMD, (c) effect of noise reduction by EEMD, (d) effect of noise reduction by wavelet threshold, (e) effect of noise reduction by CEEMDAN, and (f) effect of noise reduction by VMD–SSA–SVD.
Figure 9. Noise reduction effect of actual 109 ECG signal. (a) Original 109 signal, (b) effect of noise reduction by EMD, (c) effect of noise reduction by EEMD, (d) effect of noise reduction by wavelet threshold, (e) effect of noise reduction by CEEMDAN, and (f) effect of noise reduction by VMD–SSA–SVD.

17 pages, 810 KiB  
Article
Knowledge-Enhanced Compressed Measurements for Detection of Frequency-Hopping Spread Spectrum Signals Based on Task-Specific Information and Deep Neural Networks
by Feng Liu and Yinghai Jiang
Entropy 2023, 25(1), 11; https://doi.org/10.3390/e25010011 - 21 Dec 2022
Cited by 2 | Viewed by 1722
Abstract
The frequency-hopping spread spectrum (FHSS) technique is widely used in secure communications. In this technique, the signal carrier frequency hops over a large band. The conventional non-compressed receiver must sample the signal at high rates to catch the entire frequency-hopping range, which is unfeasible for wide frequency-hopping ranges. In this paper, we propose an efficient adaptive compressed method to measure and detect the FHSS signals non-cooperatively. In contrast to the literature, the FHSS signal-detection method proposed in this paper is achieved directly with compressed sampling rates. The measurement kernels (the non-zero coefficients in the measurement matrix) are designed adaptively, using continuously updated knowledge from the compressed measurement. More importantly, in contrast to the iterative optimizations of the measurement matrices in the literature, the deep neural networks are trained once using task-specific information optimization and repeatedly implemented for measurement kernel design, enabling efficient adaptive detection of the FHSS signals. Simulations show that the proposed method provides stably low missing detection rates, compared to the compressed detection with random measurement kernels and the recently proposed method. Meanwhile, the measurement design in the proposed method is shown to provide improved efficiency, compared to the commonly used recursive method.
Figures
Figure 1. The adaptive FHSS compressed measurement and detection framework.
Figure 2. The architecture of the deep neural networks used in the adaptive design of the measurement kernels.
Figure 3. The Proposed Procedure of the Adaptive Compressed Measurement and Detection of the FHSS Signals.
Figure 4. FHSS Signal-Detection Performance at CR = 10.
Figure 5. FHSS Signal-Detection Performance at CR = 20.
Figure 6. Evolution of the Averaged Power Spectrum Value at the True Subbands at CR = 20 and SNR = −10 dB.