1 Introduction

Hardware Trojans represent a growing class of malicious modifications that critically undermine the integrity, reliability, security, and trust of integrated circuits (ICs). These hidden threats can be stealthily introduced during the design or manufacturing process, with the potential to compromise sensitive data, disable key functionalities, or even bring down entire systems that rely on these compromised ICs. The covert nature of these malicious insertions makes them particularly dangerous, as they can silently sabotage both consumer electronics and critical infrastructure.

Hardware Trojan attacks raise significant concerns for the security, assurance, and trust of electronic devices, ranging from consumer electronics to national infrastructure. Detecting hardware Trojans (HTs) is becoming increasingly challenging because of their hidden nature and the complexity of current electronic systems. Conventional detection approaches, such as physical inspection and functional testing, are limited in scale and in efficacy against increasingly sophisticated HT designs. The decreasing size of components makes detection even more difficult, enabling HTs to evade standard approaches [1,2,3].

In response to these issues, power side-channel signal analysis has been proposed as a promising approach for HT detection. This technique, which analyzes the power consumption patterns of ICs during operation, can reveal the telltale signatures of HTs without requiring direct physical access to the internal workings of the ICs. The instantaneous power consumption of an IC can be modeled as:

$$\begin{aligned} P(t) = V(t) \cdot I(t) \end{aligned}$$
(1)

where:

  • \( P(t) \) is the power consumption at time \( t \),

  • \( V(t) \) is the voltage at time \( t \),

  • \( I(t) \) is the current at time \( t \).
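As a minimal sketch of Eq. (1), the instantaneous power of a sampled trace can be computed as the element-wise product of voltage and current samples. The function name `instantaneous_power` and the sample values are hypothetical and are used only for illustration; they are not taken from the dataset.

```python
def instantaneous_power(voltage, current):
    """Return P(t) = V(t) * I(t) for equally sampled voltage and current traces."""
    if len(voltage) != len(current):
        raise ValueError("voltage and current traces must have the same length")
    return [v * i for v, i in zip(voltage, current)]


# Hypothetical samples: a 1.2 V supply with small ripple (volts) and current (amperes).
voltage = [1.20, 1.19, 1.21, 1.20]
current = [0.010, 0.012, 0.011, 0.015]

# Convert from watts to milliwatts, the unit used on the y-axis of Fig. 2.
power_mw = [p * 1000.0 for p in instantaneous_power(voltage, current)]
```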

Fig. 1 IC Design Flow and Integration Process

This work utilizes a large dataset known as "Power and Electromagnetic Side-Channel Signals of Hardware Trojan Benchmarks," which includes a wide range of power signals captured under various Trojan conditions. By applying machine learning methods to this dataset, our research aims to increase the accuracy and reliability of HT detection. This comparative investigation addresses a gap in current cybersecurity defenses by comparing the performance of several machine learning algorithms and determining their robustness in identifying the hidden presence of HTs [4, 5].

Through this analysis, the study identifies which of these machine learning approaches are most effective, providing insight for hardware security and strengthening defenses against these insidious threats.

The use of golden ICs, which are trusted and known to be free of hardware Trojans, is a common practice in side-channel analysis for HT detection. In this study, however, our approach uses advanced machine learning techniques that analyze power signals without relying on a direct comparison to a golden model. Our method emphasizes detecting deviations in the side-channel data that could indicate the presence of HTs, making it potentially independent of a golden reference [6,7,8].

In addition, our method focuses on detecting anomalies and deviations in side-channel data patterns that indicate the presence of HTs, potentially reducing the dependence on a golden reference. Although traditional classification algorithms such as Random Forest and SVM may require labeled data, feature extraction across the time, frequency, and wavelet domains allows us to detect subtleties in the data itself, making detection possible even without direct comparison to a golden chip. Furthermore, by using machine learning models that generalize across different hardware instances, our approach improves the flexibility and scalability of Trojan detection.

By using advanced machine learning techniques, the need for a golden model may be bypassed because the detection is based on recognizing patterns and discrepancies in the side-channel data itself, rather than comparing it directly to a known reference. This approach can offer more flexibility and scalability, especially when a golden IC is not readily available or practical to obtain.

The detection of hardware Trojans has traditionally relied on methods such as physical inspection and functional testing. However, as electronic systems have grown in complexity and HT designs have become more sophisticated, these conventional approaches have increasingly proven inadequate. As such, there is a pressing need for more effective detection methods that can reliably identify HTs even in the face of complex and subtle attack vectors.

In response to these challenges, power side-channel signal analysis has emerged as a promising approach to HT detection. This technique leverages the power consumption patterns of ICs during operation to identify potential HTs without requiring direct physical access to the ICs’ internal workings. However, while several studies have demonstrated the potential of power side-channel analysis, there remains a significant gap in the literature regarding the application of advanced machine learning techniques to improve detection accuracy and reliability.

This study aims to fill this gap by applying advanced machine learning algorithms to the analysis of power side-channel signals. Using a comprehensive feature extraction process and rigorous model validation, this research aims to improve the identification of HTs in ICs, offering a more robust and scalable solution than traditional methods. The results not only contribute to the field of hardware security by advancing HT detection methodologies but also provide insight into the broader application of machine learning in cybersecurity.

Figure 1 shows the IC design flow and integration process. The process starts with the Designer formulating the specification, which is followed by Architectural Design. The design then progresses through several stages: C/C++ and SystemC modeling, RTL design, and Verilog/VHDL description. Logic synthesis transforms the design into a Gate-level Netlist. This netlist is used in Physical Design, which results in the GDSII Layout. The layout is subsequently fabricated into a Bare Die, which undergoes Testing and Packaging to become a Chip.

The last phase entails System Integration, which involves providing the product to consumers. The integration of third-party intellectual property (3PIP) components occurs at several phases to improve the design. Figure 2 illustrates the Power Side Channel Time-Series Signal for the AES-2000 Benchmark. This diagram depicts the power consumption trends in three distinct scenarios: Trojan Disabled, Trojan Enabled, and Trojan Triggered. The x-axis indicates the sample number, while the y-axis indicates the power measurement in milliwatts (mW). The signals illustrate the impact of a hardware Trojan on the power consumption of the AES-2000 encryption process.

1.1 Problem statement

Hardware Trojans (HTs) present a critical threat to the integrity of integrated circuits (ICs), as these malicious modifications can lead to catastrophic failures or security breaches. Traditional detection methods, such as physical inspections and functional testing, are often inadequate due to the increasing complexity and sophistication of HT designs. The hidden nature of HTs, combined with the vast complexity of modern electronic systems, makes detection a challenging task. This study addresses the need for more effective detection methods by utilizing power side-channel signal analysis in conjunction with advanced machine learning techniques to accurately identify HTs in ICs.

1.2 Objectives of the research

  • Develop advanced detection techniques: To improve the identification of hardware Trojans in integrated circuits using advanced machine learning algorithms applied to power side-channel signals.

  • Feature extraction and model validation: To perform comprehensive feature extraction and model validation to enhance the accuracy and reliability of HT detection.

  • Comparative analysis of ML algorithms: To compare the performance of various machine learning algorithms, such as Support Vector Machines, neural networks, and decision trees, in detecting HTs.

  • Enhance cybersecurity: To provide improved techniques for detecting minor anomalies associated with HTs, thereby enhancing the security of electronic systems.

  • Practical implementation: To demonstrate the practical significance of the proposed methods across a range of applications, from consumer electronics to national infrastructure.

1.3 Motivations

The motivations behind this research stem from the critical need to secure integrated circuits against the growing threat of hardware Trojans. With increasing reliance on electronic systems in both consumer and industrial applications, the potential impact of undetected HTs is substantial, posing risks to personal data, financial systems, and national security. Traditional detection methods are no longer sufficient due to their limitations in scalability and effectiveness against sophisticated HT designs. By leveraging power side-channel signal analysis and machine learning, this research aims to fill a significant gap in current cybersecurity defenses, offering more robust and scalable solutions to detect these insidious threats.

1.4 Contributions

This research makes several contributions to the field of cybersecurity. First, it applies advanced machine learning algorithms to power side-channel signals for the detection of hardware Trojans, offering a non-intrusive and effective detection method. Second, it makes comprehensive use of the dataset "Power and Electromagnetic Side-Channel Signals of Hardware Trojan Benchmarks" to improve the understanding and identification of HTs under various conditions. Third, by comparing the performance of multiple machine learning algorithms, it identifies the most effective approaches for HT detection and provides insight into their strengths and limitations. Fourth, it demonstrates the practical application of the proposed methods across different scenarios, highlighting their potential to secure electronic systems in diverse environments. Finally, it outlines areas for further investigation, including advanced feature extraction, hybrid models, adversarial robustness, and interdisciplinary collaboration, paving the way for continued advances in HT detection and electronic system security.

Traditional machine learning methods face several challenges when applied to HT detection. These methods often struggle with issues like high computational costs, overfitting, scalability limitations, noise sensitivity, and difficulty in handling complex, nonlinear data distributions typical of side-channel analysis.

Various ML methods have been developed to address these challenges, but they face specific limitations when applied to HT detection in side-channel analysis. Techniques such as Support Vector Machines (SVMs) and Random Forests have been shown to reduce overfitting and improve scalability in many applications [9], but their performance on side-channel data is constrained by the complexity and noise inherent in such signals. Boosting algorithms like AdaBoost and Gradient Boosting have been effective in mitigating noise sensitivity; however, in the context of HT detection, they still require extensive preprocessing, such as wavelet-based denoising, to handle the particular challenges of power side-channel signals. Furthermore, deep learning models, which excel at handling nonlinear data distributions [10], require significant computational resources, limiting their practical application without optimizations like transfer learning [11]. While these methods provide potential solutions, our approach integrates feature extraction across the time, frequency, and wavelet domains, as well as hybrid models tailored to HT detection, with the aim of improving performance and addressing gaps in existing ML-based methods.

In contrast, the novelty of our approach lies in its use of hybrid machine learning models, such as Random Forests and Deep Neural Networks, which are specifically designed to handle the nonlinear, high-dimensional data typical of side-channel analysis in Hardware Trojan (HT) detection. Our method employs advanced feature extraction across time, frequency, and wavelet domains, allowing us to capture subtle variations in power consumption and detect HTs, even when their impact on performance is minimal. Unlike traditional methods, which often rely on physical inspections, functional testing, or golden chip comparisons, our approach enhances robustness against noise and environmental variations through data augmentation and advanced preprocessing techniques. This eliminates the need for a golden reference, improving scalability and generalization, and providing a more comprehensive, resilient solution for detecting subtle HTs in integrated circuits. Overall, our approach addresses the limitations of conventional ML techniques and offers a scalable, real-world applicable solution.

2 Research background

This section evaluates advances in hardware Trojan (HT) detection approaches. It first contextualizes the paradigm shift from conventional detection approaches to advanced strategies, highlighting the relevance of power side-channel signal analysis. A methodical review of the current literature follows, critically examining the techniques and conclusions of significant research that has used power side-channel signals for the detection of HTs. The discussion then moves to the integration of machine learning in cybersecurity, specifically in HT detection. This involves an analytical review of how contemporary machine learning paradigms have been adapted to detect and analyze the subtleties of HTs, underlining their efficacy and potential limitations in the evolving cybersecurity landscape.

This section provides an extensive overview of advances in HT detection methodologies, following the transition from traditional approaches to current advanced techniques. It begins by placing the discussion within the context of a major paradigm shift that has seen conventional detection methods eclipsed by more advanced techniques, with a particular emphasis on the exploitation of power side-channel signal analysis. This analytical approach has emerged as an essential means of identifying the often subtle traces that HTs place on a system’s power consumption patterns, giving a non-intrusive yet effective detection vector.

A thorough assessment of the current literature is done, presenting a critical evaluation of the techniques, results, and implications of crucial investigations that have exploited power side-channel signals to detect HTs. This assessment not only emphasizes the diversity and innovation within the discipline, but also identifies the strengths and limits of current techniques. It emphasizes the intricacy of detecting hardware Trojans, which are purposely meant to be elusive, and the way power side-channel analysis has provided a dependable approach to revealing them based on their energy usage footprints.

The discussion then turns to the integration of machine learning (ML) approaches in cybersecurity, with a special focus on HT detection. This part evaluates how contemporary machine learning algorithms have been used to identify the subtle features of HTs, marking an important improvement in detection capabilities. The versatility of ML models, which can learn from complex and high-dimensional data, makes them an effective tool in the defense against HTs. The paper includes a review of various ML paradigms, including supervised, unsupervised, and reinforcement learning, and their application in spotting HTs amid the noise of normal system activity.

Furthermore, this paper elaborates on the challenges and opportunities presented by integrating ML in HT detection. It evaluates the efficacy of ML techniques across multiple situations, from chip design and production to post-deployment surveillance, and discusses the potential limitations, such as the need for substantial training data and the risk of adversarial attacks designed to evade detection. The section also discusses the expanding landscape of cybersecurity risks and the continual arms race between HT developers and detection technologies. Finally, the paper suggests topics for additional research, highlighting the need for ML models that are more robust, adaptable, and interpretable to meet the increasing complexity of HTs. A multidisciplinary strategy is needed to combine knowledge from computer science, electrical engineering, and cybersecurity so as to create solutions that can anticipate and counteract new threats (Table 1).

Table 1 Comparison of Detection Methods for Various Applications

The purpose of this extended Literature Review part is to offer an in-depth understanding of the latest developments in HT detection, emphasizing the crucial significance of power side-channel signal analysis and the encouraging integration of machine learning. This study establishes a foundation for future research that will enhance the field and help protect the digital infrastructure from the hidden threat of hardware Trojans.

Fig. 2 Power side-channel time-series signal for AES-2000 benchmark

2.1 Traditional detection methods

In the past, the detection of HTs has mainly been based on functional tests and physical inspections. Functional testing involves applying a predetermined set of input vectors to the integrated circuit (IC) and analyzing the outputs to identify any deviations from the expected behavior. Physical inspection involves microscopic examination and reverse engineering to detect unauthorized alterations. However, these approaches are labor intensive and sometimes fail to detect well-concealed HTs.

2.2 Power side-channel analysis

Power side-channel analysis has become a viable option, exploiting the fact that electronic operations consume power in distinct patterns. By measuring and analyzing these patterns, it is feasible to infer the presence of HTs. Several studies have demonstrated the efficacy of power side-channel analysis in detecting HTs, particularly in cryptographic devices and integrated circuits used in critical infrastructure. The discrete Fourier transform (DFT), which the Fast Fourier Transform (FFT) algorithm computes efficiently, converts a time-domain signal to its frequency-domain representation:

$$\begin{aligned} X(k) = \sum _{n=0}^{N-1} x(n) \cdot e^{-j(2\pi /N)kn} \end{aligned}$$
(2)

where:

  • \( X(k) \) is the frequency-domain representation,

  • \( x(n) \) is the time-domain signal,

  • \( N \) is the number of samples,

  • \( k \) is the frequency bin.
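To make Eq. (2) concrete, the following sketch evaluates the sum directly for a short synthetic trace. It is an illustration only: the function name `dft` and the test signal are assumptions, and the direct evaluation costs O(N²) operations, whereas an FFT routine returns the same values in O(N log N).

```python
import cmath
import math


def dft(x):
    """Directly evaluate X(k) = sum_{n=0}^{N-1} x(n) * exp(-j * (2*pi/N) * k * n)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]


# Hypothetical trace: a DC offset of 1.0 plus a cosine completing 2 cycles in N = 8 samples.
N = 8
trace = [1.0 + math.cos(2 * math.pi * 2 * n / N) for n in range(N)]

# Magnitude spectrum: energy appears in bin 0 (DC) and in bins 2 and N - 2 (the cosine).
magnitudes = [abs(X_k) for X_k in dft(trace)]
```

For a real-valued trace the spectrum is symmetric, so only bins 0 to N/2 carry independent information when spectral features are computed.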

2.3 Machine learning in HT detection

The introduction of machine learning (ML) techniques into cybersecurity has brought considerable advances in HT detection. ML models, capable of learning from complicated and high-dimensional data, provide an effective defense against HTs. This paper contains a review of various ML paradigms and their use in identifying HTs amid normal system activity.

2.4 Challenges and opportunities

Integrating ML in HT detection brings both challenges and opportunities. One important challenge is the need for extensive training data to ensure that models generalize well across multiple circumstances. Additionally, adversarial attacks pose a risk, as they can be crafted to avoid detection by exploiting model vulnerabilities. However, the potential for ML to improve HT detection is considerable. ML techniques can be used across multiple environments, from chip design and production to post-deployment surveillance, enhancing detection accuracy and lowering false positives.

2.5 Future research directions

Future research should focus on creating more robust, adaptive, and interpretable ML models capable of keeping pace with the rising complexity of HTs. An interdisciplinary approach, incorporating insights from electrical engineering, computer science, and cybersecurity, is essential for generating novel solutions. Areas of exploration include:

  • Advanced feature extraction techniques to capture more subtle HT signatures.

  • Hybrid models that combine multiple ML algorithms to leverage their strengths.

  • Techniques to enhance model robustness against adversarial attacks.

  • Expanding datasets to include more varied and nuanced HT scenarios.

This review of the literature sets up the basis for ongoing study, underlining the essential function of power side-channel signal analysis and the potential of machine learning. Following the limitations and gaps identified in existing HT detection methods, particularly in handling the complexities of modern hardware Trojans, this study is motivated by the need to enhance detection accuracy and robustness. By applying advanced machine learning techniques to diverse power side-channel datasets, our research aims to address these challenges and provide more reliable and scalable solutions. This study not only contributes to the ongoing development of cybersecurity measures but also sets the stage for future advances in HT detection technologies.

3 Experimental setup

The experimental setup for this study was meticulously designed to evaluate the effectiveness of various machine learning models in detecting hardware Trojans (HTs) through power and electromagnetic (EM) side-channel signals. This setup utilizes an automated FPGA testbed, combining several core components and sophisticated data collection methods to ensure robust and reliable results.

3.1 Core components

The experimental setup includes a Sakura-G FPGA board, a Tektronix TDS2022C oscilloscope, and a control computer. The FPGA board was programmed with TrustHub benchmarks, such as AES cryptography cores and RS232 UART circuits, to simulate various HT scenarios. The oscilloscope measures power consumption during circuit operations, synchronized with trigger signals from the FPGA to capture data at precise operational moments. Input vectors for the AES circuit consisted of a secret key and plaintext, while the RS232 circuit used arbitrary inputs. To ensure comprehensive coverage, data was collected under three HT states (Trojan enabled, Trojan disabled, and Trojan triggered), using different FPGA boards to account for process variations.

Data collection was conducted under carefully controlled environmental conditions to enhance the robustness and generalizability of the findings. The computer automates the setup of the testbed, managing the FPGA programming, applying the test vector, tuning the oscilloscope, and data transfer. In addition, a temperature control system comprising a sensor, heater, and controller was used to maintain specific environmental conditions during data collection. In some setups, additional components such as a spectrum analyzer and EM probe were used to gather electromagnetic data, which is critical for comprehensive side-channel analysis. This method ensured that data was collected under a wide range of operational conditions, providing a robust dataset for subsequent analysis.

3.2 Data description

The dataset comprises power side-channel signals gathered under different conditions that cover a diverse range of hardware Trojan (HT) scenarios. These signals were sourced from specific benchmarks meant to represent different HT activations across several operating modes, including Trojan enabled, Trojan disabled, and Trojan triggered. The dataset’s variation across Trojan conditions and chip temperatures provides a robust framework for HT identification. This variation is essential for designing systems that are resilient to the nuances introduced by different types of Trojans under varied conditions [27]. Use of a diverse benchmark dataset for HT detection offers key advantages over traditional methods that rely on actual IC measurements. It allows for broader coverage of HT scenarios, better generalization, scalability, and flexibility in simulation. This approach is more resource-efficient and ensures consistency in experiments, making the detection method more robust and applicable to a wide range of HTs compared to methods that depend solely on real IC data.

Manufacturing process variations between different batches of chips can affect the detection of HTs, so it is important to account for them. In our approach, we mitigate these variations by using a diverse benchmark dataset that includes power side-channel signals collected from multiple chips with different process variations, operating conditions, and environmental factors. This diversity allows our machine learning models to generalize across different batches of chips. In addition, we employ data augmentation techniques to simulate manufacturing variations and environmental noise, making the models more robust to process differences between batches. This approach enhances the reliability of HT detection by allowing our models to adapt to the inherent variability in IC manufacturing, thereby improving their scalability and real-world applicability.
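The text does not specify which augmentations were applied, so the following is only a sketch of one plausible scheme: a random per-trace gain stands in for chip-to-chip process variation, and additive Gaussian noise stands in for environmental and measurement noise. The function `augment_trace` and the parameters `gain_spread` and `noise_std` are hypothetical names and values, not settings from this study.

```python
import random


def augment_trace(trace, gain_spread=0.02, noise_std=0.01, rng=None):
    """Return a perturbed copy of a power trace.

    gain_spread: maximum relative gain change (assumed proxy for process variation).
    noise_std:   standard deviation of additive Gaussian noise (assumed noise model).
    """
    rng = rng or random.Random()
    gain = 1.0 + rng.uniform(-gain_spread, gain_spread)
    return [gain * v + rng.gauss(0.0, noise_std) for v in trace]


# Hypothetical trace in mW; a seeded generator keeps the augmentation reproducible.
rng = random.Random(0)
trace = [10.0, 10.5, 11.0, 10.2]
augmented = [augment_trace(trace, rng=rng) for _ in range(3)]
```

Augmented copies should inherit the label of the trace they were derived from and should be generated from training data only, so that the evaluation set still reflects measured signals.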

The following subsections describe how the power side-channel signals were collected, the nature of the benchmarks used, and the value of the dataset’s properties for HT detection.

3.2.1 Dataset composition and collection

The dataset was selected from an extensive set of benchmarks designed to cover a wide range of HT scenarios. The benchmarks cover the following Trojan conditions:

  • Trojan Enabled: HTs are present within the circuit but remain inactive, without actively altering the circuit’s usual operation. This state provides a unique problem as the HT’s impact on the power side-channel signals may be minor, requiring advanced detection techniques to determine its hidden presence.

  • Trojan Disabled: Represents the baseline condition where no HTs are present within the circuit. Data gathered under this circumstance acts as the control group, giving a benchmark for normal circuit performance against which deviations indicative of HT activity can be evaluated.

  • Trojan Triggered: HTs are active, potentially affecting the circuit’s performance or leaking important information. This circumstance is crucial for understanding the direct influence of HTs on power consumption patterns, providing clear indications for identification.

3.2.2 Signal collection under varied conditions

The signals were recorded under varied operational and environmental conditions to ensure the dataset’s accuracy and robustness. This includes varying the input stimuli to the circuits, switching operational modes, and adjusting chip temperatures. Such variety is crucial for modeling real-world scenarios where HTs might be engineered to activate or remain dormant under specified conditions, thereby ensuring the detection system’s effectiveness across a wide range of potential threat scenarios.

3.2.3 Benchmark design and HT activation scenarios

The benchmarks from which the dataset was built are designed to represent several types of HT activation, such as conditional, time-based, or event-triggered activations. These scenarios ensure the dataset contains a wide variety of conceivable HT behaviors, from those that change the logical output of a circuit to those that might simply affect non-functional aspects like power consumption or electromagnetic emissions.

3.2.4 Importance of dataset features

Fig. 3 An Overview of Machine Learning Model for IC Hardware Trojan Detection

The dataset’s features, particularly its variability under different Trojan conditions and chip temperatures, are instrumental for several reasons:

  • Detection Resilience: The diversity of the dataset ensures that the detection algorithm is not tailored only to specific conditions or types of HTs. This robustness is critical for deploying the system in real-world applications where conditions may change significantly.

  • Sensitivity and Specificity: The variety in the dataset aids in training machine learning models to distinguish subtle changes in signal patterns. This is critical for identifying latent Trojans and reducing false positives, where natural fluctuations in circuit behavior could be wrongly identified as Trojan activity.

  • Adaptability: By covering a wide range of circumstances and HT situations, the dataset prepares the detection system to respond to unforeseen threats. As new types of HTs emerge, the system’s underlying knowledge, obtained from this broad dataset, provides a platform for extending its detection capabilities.

The dataset’s composition, developed from specialized benchmarks meant to model a wide range of HT activations and collected under diverse conditions, plays a crucial part in creating an effective and resilient HT detection system. This comprehensive methodology helps the detection system recognize HTs across varied operational states and environmental conditions, addressing the subtle problems posed by the increasing range of hardware security risks.

3.3 Feature extraction

The feature extraction procedure plays an important role in transforming raw side-channel signals into a format that machine learning algorithms can process efficiently.

A detailed explanation of the feature extraction process is crucial for understanding how raw side-channel data is transformed into a format suitable for machine learning algorithms. This process forms the backbone of effective model training, as it directly influences input data quality. In our study, we employ statistical, temporal, spectral, and wavelet features to capture the subtle signals indicative of hardware Trojans. By elaborating on these features, we provide deeper insights into how our method detects the small yet significant variations caused by Trojans, ensuring a robust foundation for training our detection models.

Fig. 4 Comprehensive Feature Selection for Time, Frequency, and Wavelet Domains

Figure 4 shows the comprehensive feature selection for the time, frequency, and wavelet domains. The figure groups the features used in the analysis into three primary domains: Time Domain Features, Frequency Domain Features, and Wavelet Domain Features. Each domain comprises its own metrics and statistical measures, such as skewness, kurtosis, energy, and entropy, which are crucial for thorough signal analysis and processing.

The feature extraction procedure involves several steps:

3.3.1 Statistical features

We compute essential statistical measures such as total, mean, median, standard deviation, skewness, and kurtosis of the signals. These properties provide insights into the distribution and dispersion of the signal levels, which may change dramatically in the presence of a hardware Trojan. Some common statistical features that can be extracted from the power side-channel signals include:

The mean (or average) is the sum of all data points \(x_i\) divided by the number of data points N.

$$\begin{aligned} \mu = \frac{1}{N} \sum _{i=1}^{N} x_i \end{aligned}$$
(3)

The standard deviation measures the dispersion of data points from the mean. It is the square root of the average of the squared differences between each data point \(x_i\) and the mean \(\mu \).

$$\begin{aligned} \sigma = \sqrt{\frac{1}{N} \sum _{i=1}^{N} (x_i - \mu )^2} \end{aligned}$$
(4)

Skewness quantifies the asymmetry of the data distribution around the mean. It is the average of the cubed standardized deviations of each data point from the mean.

$$\begin{aligned} \text {Skewness} = \frac{1}{N} \sum _{i=1}^{N} \left( \frac{x_i - \mu }{\sigma } \right) ^3 \end{aligned}$$
(5)

Kurtosis measures the "tailedness" of the data distribution. It is the average of the fourth power of the standardized deviations of each data point from the mean, minus 3. Because the kurtosis of a normal distribution is 3, this adjustment yields the excess kurtosis, which is 0 for a normal distribution.

$$\begin{aligned} \text {Kurtosis} = \frac{1}{N} \sum _{i=1}^{N} \left( \frac{x_i - \mu }{\sigma } \right) ^4 - 3 \end{aligned}$$
(6)
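As a minimal sketch, assuming NumPy and a one-dimensional sampled trace, the statistics of Eqs. (3)-(6) can be computed as follows. The function name `statistical_features` and the convention of returning zero skewness and kurtosis for a constant trace are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np


def statistical_features(x):
    """Compute the features of Eqs. (3)-(6), plus total and median, for a 1-D trace."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()                              # Eq. (3)
    sigma = np.sqrt(np.mean((x - mu) ** 2))    # Eq. (4), population form (1/N)
    if sigma == 0.0:
        # Standardized moments are undefined for a constant trace;
        # returning 0.0 here is an assumed convention.
        skewness, kurtosis = 0.0, 0.0
    else:
        z = (x - mu) / sigma                   # standardized deviations
        skewness = float(np.mean(z ** 3))      # Eq. (5)
        kurtosis = float(np.mean(z ** 4) - 3)  # Eq. (6), excess kurtosis
    return {
        "total": float(x.sum()),
        "mean": float(mu),
        "median": float(np.median(x)),
        "std": float(sigma),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

Note that Eq. (4) uses the population form with a factor of 1/N, so the sketch avoids the sample-corrected estimators that some libraries apply by default.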

3.3.2 Temporal features

Features such as mean absolute change, average variation rate, and energy measurements capture the dynamic properties of the signal over time. For instance, a Trojan might alter the temporal stability of the signal or produce irregular variations in the power consumption or EM emissions.
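The temporal features named above are not defined formally in the text, so the following is a hedged sketch under assumed definitions: mean absolute change as the average magnitude of sample-to-sample differences, average variation rate as the mean difference divided by an assumed sampling interval `dt`, and energy as the sum of squared samples.

```python
import numpy as np


def temporal_features(x, dt=1.0):
    """Temporal descriptors of a 1-D trace (definitions are assumptions)."""
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        raise ValueError("at least two samples are needed to form differences")
    dx = np.diff(x)  # sample-to-sample changes
    return {
        "mean_abs_change": float(np.mean(np.abs(dx))),
        "avg_variation_rate": float(np.mean(dx) / dt),
        "energy": float(np.sum(x ** 2)),
    }
```

For example, `temporal_features([0, 1, 3, 2])` yields a mean absolute change of 4/3, an average variation rate of 2/3, and an energy of 14.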

3.3.3 Spectral features

Through Fast Fourier Transform (FFT) and power spectral density analysis, we study the frequency components of the signals. Trojans can introduce new frequencies or modify the energy distribution across frequencies, making these features especially useful for detection.
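A minimal sketch of FFT-based spectral features, assuming NumPy and a uniformly sampled trace with sampling rate `fs`, is given below. The specific descriptors (dominant frequency, spectral centroid, total power) and the plain periodogram estimate are illustrative choices; the text does not specify which spectral descriptors or PSD estimator were used.

```python
import numpy as np


def spectral_features(x, fs=1.0):
    """Spectral descriptors from a simple periodogram of a 1-D trace."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the DC component
    spectrum = np.fft.rfft(x)                     # one-sided FFT of a real signal
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)   # frequency of each bin
    # Periodogram estimate; the one-sided doubling factor is omitted
    # because only relative power across bins matters here.
    psd = np.abs(spectrum) ** 2 / (fs * x.size)
    total = psd.sum()
    if total == 0.0:
        # Flat trace: no spectral content to describe.
        return {"dominant_freq": 0.0, "spectral_centroid": 0.0, "total_power": 0.0}
    return {
        "dominant_freq": float(freqs[np.argmax(psd)]),
        "spectral_centroid": float(np.sum(freqs * psd) / total),
        "total_power": float(total),
    }
```

A Trojan that introduces a new switching frequency would be expected to shift the dominant frequency or the centroid, while a redistribution of energy would mainly affect the centroid.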

3.3.4 Wavelet transform features

Wavelet analysis decomposes a signal into distinct frequency bands, capturing both frequency and location information. This is critical for finding minor, localized variations in the signal that might be indicative of Trojan activity. The continuous wavelet transform of a signal \( x(t) \) is given by:

$$\begin{aligned} W(a, b) = \frac{1}{\sqrt{a}} \int _{-\infty }^{\infty } x(t) \psi ^* \left( \frac{t-b}{a} \right) dt \end{aligned}$$
(7)

where:

  • \( W(a, b) \) are the wavelet coefficients,

  • \( a \) is the scaling parameter,

  • \( b \) is the translation parameter,

  • \( \psi (t) \) is the mother wavelet,

  • \( \psi ^*(t) \) is the complex conjugate of the mother wavelet.
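Eq. (7) can be approximated on sampled data by replacing the integral with a sum over samples. The sketch below assumes NumPy, positive scales, translations \( b \) placed on the sample grid, and a Mexican-hat (Ricker) mother wavelet; the choice of wavelet is an assumption, since the text does not state which mother wavelet was used.

```python
import numpy as np


def ricker(u):
    """Mexican-hat mother wavelet; it is real-valued, so psi* equals psi."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)


def cwt(x, scales, dt=1.0):
    """Discretized Eq. (7): returns W with shape (len(scales), len(x))."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size) * dt
    W = np.empty((len(scales), x.size))
    for i, a in enumerate(scales):
        # u[k, n] = (t_n - b_k) / a, with translation b_k = t_k
        u = (t[None, :] - t[:, None]) / a
        # Riemann sum over n approximates the integral over t
        W[i] = (ricker(u) @ x) * dt / np.sqrt(a)
    return W
```

Per-scale energies such as `np.sum(cwt(x, scales) ** 2, axis=1)` could then serve as wavelet-domain features. The dense matrix `u` costs memory quadratic in the trace length, so long traces would call for a convolution-based implementation instead.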

3.3.5 Entropy and energy

Entropy quantifies the randomness or unpredictability of the signal, which may increase in the presence of Trojans. Energy features measure the total energy of the signal and can indicate substantial deviations produced by Trojan operations.
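As an illustration, the sketch below computes the Shannon entropy of the amplitude histogram and the total signal energy with NumPy. The text does not specify which entropy definition was used, so the histogram-based estimate and the bin count `bins` are assumptions.

```python
import numpy as np


def shannon_entropy(x, bins=32):
    """Shannon entropy (in bits) of the amplitude histogram of a 1-D trace."""
    x = np.asarray(x, dtype=float)
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()   # empirical bin probabilities
    return float(-np.sum(p * np.log2(p)))


def signal_energy(x):
    """Total energy: sum of squared samples."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))
```

A constant trace falls into a single bin and has zero entropy, whereas a trace spread evenly over all bins reaches the maximum of `log2(bins)` bits.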

By combining these distinct sets of data, we create an extensive, multidimensional feature space that captures an in-depth understanding of the side-channel signals under different conditions.

To address the susceptibility of side-channel signals to environmental noise and process variations, our approach goes beyond the basic application of the Fast Fourier Transform (FFT) by incorporating a suite of pre-processing and feature extraction techniques. These include adaptive filtering, wavelet-based denoising, and signal normalization, which are designed to enhance the signal-to-noise ratio and mitigate the impact of environmental factors. Additionally, we extract features across multiple domains (time, frequency, and wavelet), so that if noise or variations affect one domain, the features captured in the others can compensate, thereby improving the reliability of hardware Trojan detection.

Moreover, our machine learning models, such as Random Forest and Deep Neural Networks, are selected for their resilience to noisy and varied data. These models are trained using data augmentation techniques that simulate different environmental conditions and process variations, making them robust in real-world scenarios. By employing ensemble learning and cross-validation across diverse hardware instances, we further ensure that our models generalize well across different devices and conditions. Empirical validation in various scenarios demonstrates that our approach effectively handles the challenges posed by environmental noise and process variations, providing a reliable solution for hardware Trojan detection (Tables 2, 3).

Table 2 AES-T400 Performance Metrics for Machine Learning Models at 25\(^\circ \)C
Table 3 AES-T500 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

3.4 Machine learning model training with class distinctions

The extracted set of features forms the basis for training the machine learning models [13]. Given the three classes (Trojan enabled, Trojan disabled, and Trojan triggered), our approach uses a multiclass classification strategy:

  • Trojan Enabled: This class represents the situation where the Trojan is present in the circuit but is not active. The signals might resemble those of normal operation, making detection challenging.

  • Trojan Disabled: Signals in this class originate from circuits without any Trojans, acting as a baseline for regular operation. Features from this class should ideally resemble the normal operation of a secure, uncompromised system.

  • Trojan Triggered: This class is important because it covers situations in which the Trojan is active, potentially affecting the circuit’s functionality or leaking information. The feature values for this class are expected to show significant abnormalities compared to the other two classes.

For effective classification, we apply algorithms such as Support Vector Machines (SVM), Random Forests, and Gradient Boosting, known for their robustness in addressing complex, multi-class problems. These models are trained using a labeled dataset including samples from all three classes, with a focus on the subtle differences that separate Trojan enabled from Trojan triggered situations [16].

3.5 Evaluation of class-specific detection performance

Model evaluation uses class-specific measures in addition to overall accuracy, precision, recall, and F1 score. Specifically, we evaluate the following (Tables 4, 5, 6).

  • True Positive Rate (TPR) for each class, indicating the model’s capacity to accurately recognize each type of Trojan condition.

  • False Positive Rate (FPR), which is important in decreasing false alarms, especially distinguishing between Trojan enabled (but not triggered) and normal situations.

  • Confusion Matrix Analysis, to visually and quantitatively review model performance across the three classes, focusing on any biases or shortcomings in classification.
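The class-specific rates listed above follow directly from a multiclass confusion matrix by treating each class one-versus-rest. The sketch below assumes NumPy and a matrix whose rows are true classes and whose columns are predicted classes; the function name is illustrative.

```python
import numpy as np


def per_class_rates(cm):
    """Return (TPR, FPR) arrays, one entry per class, from a confusion matrix.

    cm[i, j] is the number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # correctly classified samples per class
    fn = cm.sum(axis=1) - tp         # class samples assigned elsewhere
    fp = cm.sum(axis=0) - tp         # other samples assigned to the class
    tn = cm.sum() - tp - fn - fp     # everything else
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr
```

For a hypothetical three-class matrix `[[50, 8, 2], [6, 52, 2], [1, 3, 56]]`, the first class has a TPR of 50/60 and an FPR of 7/120. A class with no true samples would produce a division by zero, so such cases need separate handling.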

Table 4 AES-T600 Performance Metrics for Machine Learning Models at 25\(^\circ \)C
Table 5 AES-T700 Performance Metrics for Machine Learning Models at 25\(^\circ \)C
Table 6 AES-T800 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

Through careful testing and validation, including cross-validation approaches and unseen data testing, we aim to build a model that is not only accurate but also trustworthy and practical for real-world applications where the distinction between Trojan states is essential [18].

In the experimental part of the study, the parameters of the machine learning models were set using a systematic approach. Initially, the parameters were selected based on insights from existing literature and prior research that had proven effective in similar contexts. Fine-tuning was then conducted using techniques such as grid search and cross-validation, allowing for precise optimization of the model parameters to achieve optimal performance in hardware Trojan detection. This approach was crucial for adapting the models to the specific characteristics of the dataset and the detection task, thereby enhancing their accuracy and reliability.

In our experiments, we fine-tuned the parameters of each machine learning model for optimal performance. For K-Nearest Neighbors (KNN), we used 5 neighbors with Euclidean distance. Logistic Regression was configured with L2 regularization and a regularization strength (C) of 1.0. The Random Forest model used 100 trees with a maximum depth of 10, while Gradient Boosting had 100 estimators, a learning rate of 0.05, and a maximum depth of 3. AdaBoost was set with 50 estimators and a learning rate of 0.1. The Support Vector Machine (SVM) used an RBF kernel with C=1.0 and gamma=0.001. Lastly, the Deep Neural Network (DNN) consisted of 4 hidden layers with ReLU activation, trained for 100 epochs with a batch size of 28 using the Adam optimizer.
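For reference, the reported settings can be written as scikit-learn estimator configurations. This is a sketch under the assumption that scikit-learn is used; the text does not name the library. The hidden-layer widths of the neural network are not reported, so the value `(64, 64, 64, 64)` is a placeholder, and `MLPClassifier` stands in for whatever deep learning framework was actually employed.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
    # L2 regularization is scikit-learn's default penalty.
    "Logistic Regression": LogisticRegression(C=1.0),
    "Random Forest": RandomForestClassifier(n_estimators=100, max_depth=10),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.05, max_depth=3
    ),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, learning_rate=0.1),
    "SVM": SVC(kernel="rbf", C=1.0, gamma=0.001),
    "DNN": MLPClassifier(
        hidden_layer_sizes=(64, 64, 64, 64),  # 4 hidden layers; widths assumed
        activation="relu",
        solver="adam",
        batch_size=28,
        max_iter=100,  # number of epochs for the Adam solver
    ),
}
```

Each estimator can then be fitted with `models[name].fit(X_train, y_train)`, where `X_train` holds the extracted feature vectors and `y_train` the three class labels.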

4 Machine learning techniques

In machine learning, algorithm performance is critical, particularly in high-stakes applications such as the identification of hardware Trojans (HTs) using power side-channel analysis. This section describes the range of algorithms used in our study, highlighting their respective strengths and limitations when applied to HT detection [28]. We introduce multiple machine learning algorithms in order to perform a comprehensive comparative analysis and identify which methods are most effective in detecting HTs. This approach allows for a detailed examination of each algorithm’s strengths, weaknesses, and suitability for different types of HT scenarios. The goal is to determine the most robust and accurate models that can be further refined for HT detection [29]. Figure 4 shows a schematic overview of a machine learning model for IC hardware Trojan detection, illustrating the process from unprocessed data through pre-processing, application of machine learning techniques, and the implementation of a selected framework.

4.1 Random forest classifier

The Random Forest Classifier was the most effective algorithm, with its ensemble approach achieving 97% across precision, recall, and F1 score. This strong performance is attributable to the classifier’s capacity to identify complex patterns and variations in the data that indicate HT presence. Random Forest’s ensemble method combines numerous decision trees to reduce overfitting and improve generalization. Each tree in the forest is trained on a random subset of the data, and the final prediction is determined by aggregating the predictions of all trees. This method is particularly good at capturing the complex, non-linear relationships within the power side-channel signals that indicate HT activity [9, 19].

4.2 Deep neural networks

Deep Neural Networks (DNNs) also demonstrated commendable performance, obtaining a 90% score in precision, recall, and F1 score. The layered, nonlinear processing of a neural network is well-suited to capture the complicated dependencies within the side-channel signals. DNNs consist of many layers of interconnected neurons, where each layer extracts increasingly abstract properties from the input data. This hierarchical feature extraction helps DNNs to learn complex representations and effectively identify small anomalies suggestive of HT existence. However, training DNNs requires huge computational resources and large volumes of labeled data to minimize overfitting and ensure generalization (Figs. 5, 6) [20].

Fig. 5
figure 5

Confusion matrix for Deep Neural Network

Fig. 6
figure 6

Confusion matrix for Random Forest Classifier

4.3 AdaBoost with logistic regression

In contrast, AdaBoost and Logistic Regression showed clear limitations, obtaining precision and recall rates of roughly 42% and 43%, respectively. AdaBoost, which stands for Adaptive Boosting, combines weak classifiers to build a powerful classifier by focusing on misclassified examples in subsequent iterations. Despite its efficiency in other applications, AdaBoost struggled with the high-dimensional and intricate nature of the HT identification problem. Logistic regression, a linear model, had similar issues, suggesting a mismatch between the linearity of these models and the intricate, nonlinear interactions within the data [21, 22].

4.4 Support vector machine and K-nearest neighbors

The Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) algorithms produced moderate performance, with precision and recall rates of around 50%. SVM is a powerful classifier that seeks the hyperplane that best separates the classes in the feature space. It is highly effective in high-dimensional settings but requires appropriate kernel selection and parameter tuning to capture the decision boundaries properly. KNN, a nonparametric technique, classifies instances based on the majority class of their nearest neighbors in the feature space. While simple and intuitive, KNN is sensitive to the choice of the number of neighbors (k) and can be computationally expensive for large datasets (Figs. 7, 8) [11, 23, 24].
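The KNN decision rule described above can be sketched in a few lines of NumPy: compute the Euclidean distance from each query point to every training point, take the k closest, and return the majority label. The function name and the tie-breaking rule (the smallest label wins) are assumptions made for illustration.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k=5):
    """Classify each query row by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest points
    preds = []
    for idx in nearest:
        labels, counts = np.unique(y_train[idx], return_counts=True)
        preds.append(labels[np.argmax(counts)])  # ties go to the smallest label
    return np.array(preds)
```

The full distance matrix makes the cost grow with the product of query and training set sizes, which is the computational burden noted above for large datasets.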

Fig. 7
figure 7

AES-T1000 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

Fig. 8
figure 8

AES-T1100 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

4.5 Gradient boosting classifier

The Gradient Boosting Classifier’s results imply that, despite its usual success in numerous classification tasks, additional feature engineering and parameter adjustments are necessary to adapt the algorithm to the specifics of HT detection. Gradient Boosting forms an ensemble of weak learners, often decision trees, in a sequential fashion, where each tree corrects the errors of the previous ones. While powerful, this method can be susceptible to overfitting if not properly regularized and requires careful tuning of hyperparameters such as the learning rate and the number of trees [25].

4.6 Discussion and future directions

Our comparative analysis highlights the significance of algorithm choice in machine learning for HT detection. The resilience of Random Forest and the efficiency of deep learning approaches suggest that these models are particularly well-suited for identifying hidden and overt symptoms of HTs in integrated circuits. Future studies should investigate further optimizing these models and explore hybrid techniques that use the capabilities of different algorithms. Potential directions include:

  • Hyperparameter Optimization: Using techniques such as grid search and Bayesian optimization to fine-tune model parameters for improved performance [30].

  • Ensemble Methods: Combining several models to build more robust and accurate classifiers.

  • Feature Engineering: Developing better feature extraction approaches to capture more complex properties of HTs.

  • Adversarial Training: Enhancing model resilience against adversarial attacks designed to evade detection.

  • Transfer Learning: Leveraging pre-trained models on similar tasks to increase performance on HT detection with minimal data.

The choice of machine learning algorithms significantly determines the performance of HT detection systems. Our findings highlight the potential of Random Forest and deep learning models, paving the path for further study to optimize and innovate in this essential field of cybersecurity (Figs. 9, 10).

Fig. 9
figure 9

AES-T1300 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

Fig. 10
figure 10

AES-T1400 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

5 Results and discussion

Our extensive evaluation of machine learning techniques for detecting Hardware Trojans (HTs) in integrated circuits has yielded significant insights. The analysis was carried out on a dataset derived from ’Power and electromagnetic side channel signals of hardware Trojan benchmarks’, focusing on an AES circuit infected with various Trojans under different conditions. This dataset enabled us to explore the effectiveness of several machine learning models across different scenarios. This section presents the results of the comparative analysis, highlighting which algorithms performed best under specific conditions and why. The discussion focuses on the accuracy, precision, recall, and other relevant metrics to assess the effectiveness of each algorithm in detecting HTs. The insights gained from this analysis help to understand the practical implications of using these algorithms in real-world HT detection scenarios.

Although traditional methods combining preprocessing and ML classifiers can enhance HT detection, our approach offers key advantages by integrating advanced feature extraction across time, frequency, and wavelet domains, capturing subtle variations that might be missed by standard techniques. Furthermore, our method is highly robust to noise and variability in different manufacturing processes, making it more adaptable to real-world conditions. By leveraging a hybrid model approach, including Random Forest and Deep Neural Networks, we achieve higher accuracy without relying on a golden-chip reference, allowing for scalable and effective HT detection (Fig. 11).

Fig. 11
figure 11

AES-T2000 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

5.1 Machine learning model performance

The performance of the machine learning models varied considerably, highlighting the importance of model selection based on the specific characteristics of the data and the detection task. Notably, the Random Forest Classifier demonstrated exceptional efficacy across all test conditions, achieving a near-perfect accuracy rate of up to 99% in several instances. This model’s ability to handle the complex, non-linear relationships inherent in power side-channel data is particularly advantageous for HT detection.

The classification problem can be formulated as finding a function \( f \) that maps the feature vector \( \textbf{x} \) to a label \( y \):

$$\begin{aligned} f(\textbf{x}) = y \end{aligned}$$
(8)

where:

  • \( \textbf{x} = [x_1, x_2, \ldots , x_n] \) is the feature vector,

  • \( y \) is the class label (e.g., Trojan enabled, Trojan disabled, Trojan triggered).

Deep Neural Networks (DNNs) also showed promising results, with high accuracy, precision, recall, and F1 scores, particularly in scenarios with more straightforward Trojan activations. These results underscore the potential of deep learning in identifying subtle anomalies indicative of HT presence, especially when the data encompasses diverse Trojan conditions and activation states.

Conversely, simpler models like Logistic Regression and AdaBoost generally exhibited lower performance, underscoring the challenges posed by the sophisticated and stealthy nature of HTs. The high-dimensional and complex feature space of power side-channel signals likely contributes to the difficulty these models face in capturing the nuanced differences between benign and malicious circuit behavior [26].

5.2 Insights and implications

The analysis reveals several key insights. First, the substantial variation in model performance across different Trojan scenarios suggests that no single model is universally superior for all types of HT detection. This variability underscores the need for a tailored approach, selecting and optimizing machine learning models based on the specific characteristics of the circuit and the nature of the suspected Trojan threat (Figs. 12, 13).

Fig. 12
figure 12

AES-T1600 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

Fig. 13
figure 13

AES-T1800 Performance Metrics for Machine Learning Models at 25\(^\circ \)C

Second, the superior performance of Random Forest Classifiers and DNNs indicates that models capable of capturing complex patterns and interactions within the data are more suited to the task of HT detection. This finding aligns with the inherently complex and clandestine nature of HTs, which require sophisticated analytical techniques to uncover.

Finally, the results highlight the critical role of feature selection and data preparation in enhancing model performance. The significant impact of feature selection on the effectiveness of machine learning models emphasizes the importance of understanding the underlying physical processes and the characteristics of HTs that manifest in power side-channel signals.

5.3 Comparison with previous work

Our findings contribute to the growing body of research on HT detection using machine learning. When compared to previous works, our study demonstrates improvements in detection accuracy, especially with Random Forest and DNNs. These advancements not only underscore the potential of machine learning in addressing cybersecurity threats but also highlight the progress being made in developing more resilient and reliable electronic systems (Fig. 14).

Fig. 14
figure 14

Comparison with previous work

The Random Forest Classifier and DNNs outperformed other models, showing significant improvements over traditional methods and previous machine learning approaches. This indicates a trend towards more sophisticated models that can effectively handle the complexities of HT detection.

5.4 Future work

While our study provides a comprehensive evaluation of various machine learning models for HT detection, there are several avenues for future research. First, further optimization of the models through hyperparameter tuning and advanced feature engineering could yield even better performance. Second, exploring hybrid approaches that combine the strengths of multiple algorithms might provide more robust detection systems. Additionally, expanding the dataset to include a wider variety of HT scenarios and real-world conditions will help in developing models that are more generalizable.

Future research should also focus on addressing the challenges posed by adversarial attacks, which can be designed to evade detection. Developing models that are resilient to such attacks will be crucial for practical deployment in security-critical applications.

Our study highlights the importance of selecting and optimizing machine learning models for HT detection. The superior performance of Random Forest and DNNs demonstrates their potential in this domain, paving the way for further advancements in securing integrated circuits against these insidious threats. The ongoing evolution of machine learning techniques promises continued improvements in the detection and mitigation of HTs, contributing to the overall security and reliability of electronic systems.

We also identified potential limitations of our proposed methods. One significant challenge is the reliance on large, labeled datasets, as acquiring such data can be resource-intensive and time-consuming. This dependency may limit the scalability and applicability of our approach in scenarios where labeled data is scarce. Additionally, our machine learning models, like many in the field, are susceptible to adversarial attacks, where even slight manipulations of the input data could lead to incorrect classifications. Addressing these vulnerabilities is crucial for enhancing the robustness and reliability of our models. By acknowledging these limitations, we aim to provide a balanced view of our work and to identify areas where future research can improve the resilience and effectiveness of hardware Trojan detection methods.

6 Conclusion and future work

This study has presented a comprehensive analysis of advanced machine learning techniques for detecting Hardware Trojans (HTs) through power side-channel signal analysis. Our findings demonstrate the potential of these techniques, particularly Random Forest Classifiers and Deep Neural Networks (DNNs), in identifying the subtle indicators of HTs with high accuracy and reliability.

The results highlight the importance of model selection and optimization in effectively addressing the complex and clandestine nature of HTs. Random Forest and DNNs emerged as particularly effective models, showcasing their capability to capture intricate patterns and interactions within the data, which are crucial for accurate HT detection.

These findings also bear on broader cybersecurity practice: they contribute to the development of more robust and adaptive machine learning models and thereby strengthen the detection of hardware Trojans. We further propose exploring hybrid approaches that combine the strengths of multiple algorithms, offering a clear path forward in the ongoing effort to combat increasingly sophisticated hardware threats. These research directions reflect the evolving nature of cybersecurity challenges and the need for continued innovation in the field.

6.1 Future directions

Future work will focus on further refining these models through more sophisticated feature engineering and exploring the integration of ensemble methods to leverage the strengths of multiple models. This could involve developing advanced feature extraction techniques that better capture the subtle characteristics of HTs, as well as employing ensemble approaches that combine the predictive power of various machine learning algorithms.

Additionally, expanding the dataset to include more varied and nuanced Trojan scenarios will be critical in enhancing the robustness and generalizability of the detection models. Including a wider range of HT types, operational conditions, and environmental factors will help in training models that are more resilient to diverse and evolving threats.

6.2 Adversarial robustness

Another important direction for future research is improving the robustness of HT detection models against adversarial attacks. Developing techniques to make models more resilient to attempts at evading detection will be crucial for their practical deployment in security-critical applications. This could involve adversarial training, where models are exposed to adversarial examples during the training process to enhance their ability to recognize and resist such attacks.

6.3 Hybrid approaches

Exploring hybrid approaches that integrate multiple machine learning models could provide more robust and accurate detection systems. Combining the strengths of different algorithms might mitigate the weaknesses of individual models, leading to better overall performance. For example, integrating decision tree-based methods like Random Forests with deep learning techniques could leverage the interpretability of the former and the powerful feature extraction capabilities of the latter.

6.4 Transfer learning

Transfer learning presents another promising avenue for improving HT detection. By leveraging pre-trained models on similar tasks, transfer learning can enhance model performance, especially in scenarios with limited data. This approach could reduce the amount of labeled data required for training while maintaining high detection accuracy.

6.5 Interdisciplinary collaboration

Lastly, fostering interdisciplinary collaboration between fields such as electrical engineering, computer science, and cybersecurity will be essential for developing innovative solutions to HT detection. Combining insights from these domains can lead to more comprehensive and effective approaches to securing integrated circuits against HTs.

By advancing our understanding and capabilities in detecting Hardware Trojans, we move closer to securing integrated circuits against these insidious threats. This not only reinforces the integrity and trustworthiness of electronic systems but also contributes to the broader effort of enhancing cybersecurity across a multitude of applications. Continued research and innovation in this area promise to significantly bolster the defenses of critical electronic infrastructure against the evolving landscape of hardware security threats.