Article

oneM2M-Enabled Prediction of High Particulate Matter Data Based on Multi-Dense Layer BiLSTM Model

Aji Teguh Prihatno, Ida Bagus Krishna Yoga Utama and Yeong Min Jang *

Department of Electronics Engineering, Kookmin University, Seoul 02707, Korea; aji.teguh@gmail.com (A.T.P.); idabaguskrishnayogautama@gmail.com (I.B.K.Y.U.)
* Correspondence: yjang@kookmin.ac.kr; Tel.: +82-02-910-5068

Abstract: High particulate matter (PM) concentrations in semiconductor factory cleanrooms have become a significant concern because they can damage electronic devices during the manufacturing process. PM concentrations can be predicted from historical data before they rise further, which supports factory management in regulating the air quality in the cleanroom. In this paper, a Multi-Dense Layer BiLSTM model is proposed to predict PM2.5 concentrations in the indoor environment of the cleanroom. To obtain reliable, valid, and interoperable data, the datasets containing temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 were retrieved in a standardized manner via oneM2M-defined representational state transfer (REST) application programming interfaces (APIs), using software platforms compliant with the Internet of Things (IoT) standard. Based on the proposed model, an algorithm was built that provides short-term PM2.5 concentration predictions (one hour, two hours, and three hours ahead). The proposed model outperformed the RNN, LSTM, CNN-LSTM, and Single-Dense Layer BiLSTM models in terms of MSE, MAE, and MAPE values. The model created in this study could predict high PM2.5 concentration levels more accurately, thus providing vital support for operation and maintenance in the semiconductor industry.

Keywords: oneM2M; particulate matter (PM); PM2.5; Multi-Dense Layer BiLSTM; cleanroom

Citation: Prihatno, A.T.; Utama, I.B.K.Y.; Jang, Y.M. oneM2M-Enabled Prediction of High Particulate Matter Data Based on Multi-Dense Layer BiLSTM Model. Appl. Sci. 2022, 12, 2260. https://doi.org/10.3390/app12042260

Academic Editor: João M. F. Rodrigues
Received: 31 December 2021
Accepted: 16 February 2022
Published: 21 February 2022

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The semiconductor industry is one of the world's most rapidly growing and evolving industries. The global semiconductor market is estimated to be worth 333 billion dollars. In addition, the industry has a considerable impact on the national market economy, accounting for 10–15% of the Republic of Korea's total exports. Market competition is intensifying due to the recent growth of available electronic gadgets, such as mobile phones and tablet PCs (personal computers) [1].

Semiconductor fabrication requires a variety of complex chemical components [2], generating various chemicals and by-products that are almost impossible to remove completely from the inside of the equipment. Powders and airborne PM are by-products of the chemical reactions of the metal precursors used as process materials during regular operation. When they are released into the workplace during maintenance of process equipment and scrubbers (which are used to remove some particulates and gases from industrial exhaust streams), they can severely damage the electronic circuits [3].

The yield of the semiconductor industry is defined as the percentage of functional integrated circuit (IC) devices at the end of the fabrication process. In general, there are two types of yield losses in IC manufacturing: systematic and random yield loss. Deviations in the device and material characteristics cause systematic yield loss.
Contamination issues and process-induced particles are frequently linked to random defect yield loss [4]. Examples of the contaminations and mechanisms responsible for electronic chip failures in a semiconductor include particulate matter contamination, from organic or inorganic particles created by the environment or by tools, and process-induced defects, such as scratches, fractures, overlay flaws, and stress [5]. As a result, in industrial hygiene, monitoring, determining, and predicting the powder by-products and airborne PM in the cleanroom of a semiconductor factory are essential to avoid economic losses. This study aims to predict the concentrations of airborne PM in semiconductor manufacturing facilities based on three-day historical data gathered using oneM2M technology.

Over the years, several approaches have been developed to predict and manage PM. Chang-Hoi et al. [6] utilized an RNN incorporated with CMAQ (Community Multiscale Air Quality) to forecast PM2.5. Ting Tsai et al. [7] employed the RNN model to predict PM2.5 concentrations, but the resulting errors, such as RMSE and MAE, were high. Park et al. [8] used the long short-term memory (LSTM) and artificial neural network (ANN) models together to forecast PM, obtaining a higher F1 score than the individual LSTM, ANN, and random forest (RF) models. Huang et al. [9] forecasted PM2.5 in a smart city environment using a deep neural network (APNet) based on CNN-LSTM. Li et al. [10] combined CNN and LSTM (named CNN-LSTM) to predict PM2.5 concentrations. To improve forecasting accuracy, the CNN-LSTM used a convolutional neural network for feature extraction and a recurrent neural network for time series data processing. Seong et al.
[11] predicted PM2.5 concentrations at 186 stations using two layers of convolutional long short-term memory two-dimensional (CONVLSTM2D) and batch normalization. Castelli et al. [12] forecasted the air quality index (AQI) containing O3, CO, SO2, NO2, and PM2.5 based on support vector regression, but the accuracy for PM2.5 still had to be enhanced. Zhang et al. [13] constructed a model to forecast PM2.5 using a combination of auto-encoder and BiLSTM neural networks. However, the results lacked metric comparison, mentioning only the RMSE and the correlation coefficient.

In our work, PM2.5 prediction is again used as the case study, so that the AI methods are compared on the same target. Building on the authors' previous research [3], we demonstrate that the Multi-Dense Layer BiLSTM model can successfully predict the time series data and even outperform several existing prediction strategies. The following are our specific contributions:

- We used a hardware architecture based on oneM2M technology to achieve IoT system compatibility in the semiconductor factory cleanroom;
- We showed that our Multi-Dense Layer BiLSTM model can accurately forecast PM2.5 from multi-size PM concentration datasets (PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10);
- We created a system with a small number of parameters, making it computationally efficient, potent, and stable;
- Our findings revealed that the Multi-Dense Layer BiLSTM approach yields the lowest error when compared to the RNN, LSTM, CNN-LSTM, and Single-Dense Layer BiLSTM methods.

The paper is structured as follows. The overview of the system is provided in Section 2, and the methodology is described in Section 3. The experimental setup for PM prediction is highlighted in Section 4. The results of the experiment are evaluated and elaborated in Section 5. Finally, in Section 6, the paper is concluded, and ideas for further research are discussed.

2. System Overview

To establish reliable and valid datasets, the authors collected the sensor data via oneM2M standard technology. This method needs a communication interface to support the cyber-physical system (CPS).

2.1. Communication Interface

In this paper, the authors used the RS485 Modbus RTU protocol as the communication interface, over an RJ-11 6P4C cable. An RS485-to-USB converter was used to convert the sensor data from the RS485 Modbus RTU protocol to USB so that the computer could read and process it. The RS485 data rate can reach 35 Mbit/s over a 10 m connection and 100 kbit/s over a 1200 m line [14]. The RS485 Modbus RTU protocol has a number of benefits, including reliable communication, interoperability across devices from different manufacturers, and ease of installation and configuration, making it ideal for edge computing [15].

The general architecture used in this study contained hardware, an IoT platform, and an artificial intelligence (AI) platform, as shown in Figure 1. The industrial-grade sensor collected temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 during a three-day period from 24 September to 26 September 2021. Mobius, an open IoT platform, acts as the gateway, server, and database. This open IoT platform works on the oneM2M technology standard, which is elaborated in the next section. Jupyter Notebook, based on Python programming, was used as the AI platform. Jupyter Notebook contains a set of open standards for collaborative computing. These open standards can be used by third-party developers to construct bespoke applications with embedded interactive computing based on HTML and CSS on cloud computing. With its modular design, Jupyter Notebook spans visualization, multimedia, and more. In addition to running the code, it saves the code and output as well as markdown notes in an editable document called a notebook.
When users save a page in their browser, it is transferred to their notebook server, which saves it on disk as a JSON file with the .ipynb extension [16].

Figure 1. General architecture of this study.

2.2. oneM2M Technology Standard

Connected devices have been around for a long time, but they took off after the phrase "Internet of Things" (IoT) was established. As IoT devices began to proliferate, a standard was needed to satisfy new IoT requirements without rewriting components that already had tried and verified specifications. The oneM2M-based platform was built with these concepts in mind to facilitate IoT device and application interoperability and economies of scale. Furthermore, oneM2M's standard interoperability testing activities are important aspects of a robust standard [17].

The oneM2M standards allow IoT applications to discover and interact with any IoT device. IoT solutions can currently communicate across various silos. This is well suited to distributed and collaborative solutions in domains like smart buildings, smart cities, and smart manufacturing. Furthermore, oneM2M standards were created with the goal of reducing fragmentation, increasing reusability, and lowering costs through scalability [18]. The oneM2M initiative has been working on IoT standards to address fragmentation in the IoT landscape. It focuses on service layer interoperability rather than protocol stacks within the network or internet layers, and hence provides optimal technical standards for building a common horizontal IoT service platform across several domain sectors [19].

The authors used IoT based on oneM2M platforms in the cleanroom of a semiconductor smart factory environment to obtain reliable, valid, and interoperable data [20]. With this technology, a common service capability layer is available across the end-to-end platform.
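As a rough illustration of the standardized retrieval described above, the following minimal sketch builds a oneM2M RETRIEVE request for the latest content instance of a sensor container and parses a JSON response. It does not send anything over the network. The host, port, CSE base name, AE name, container name, originator ID, and the sample response body are all hypothetical placeholders rather than values from this study; only the `X-M2M-Origin` and `X-M2M-RI` headers, the `la` (latest) virtual resource, and the `m2m:cin` / `con` JSON keys come from the oneM2M specifications.

```python
import json
import urllib.request


def build_latest_cin_request(base_url, cse_base, ae_name, container, origin, request_id):
    """Build (but do not send) a oneM2M RETRIEVE request for the
    latest contentInstance stored under a container."""
    # "la" is the oneM2M virtual resource that resolves to the latest contentInstance.
    url = f"{base_url}/{cse_base}/{ae_name}/{container}/la"
    headers = {
        "Accept": "application/json",
        "X-M2M-Origin": origin,    # originator (AE) identifier
        "X-M2M-RI": request_id,    # request identifier chosen by the client
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_cin(body):
    """Return (creation_time, content) from a JSON-serialized contentInstance."""
    cin = json.loads(body)["m2m:cin"]
    return cin.get("ct"), cin.get("con")


# Hypothetical endpoint and resource names, for illustration only.
req = build_latest_cin_request(
    "http://localhost:7579", "Mobius", "cleanroom_ae", "pm25",
    origin="S_cleanroom", request_id="req-0001",
)

# Hypothetical response body in the shape of a oneM2M contentInstance.
sample_body = '{"m2m:cin": {"rn": "cin-1", "ty": 4, "ct": "20210924T000000", "con": "12.5"}}'
created, content = parse_cin(sample_body)
print(req.get_method(), req.full_url)
print(created, content)
```

In a real deployment the request would be sent with `urllib.request.urlopen(req)`, and the structure of `con` would depend on how the sensor application encodes its readings.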
In the oneM2M standard, message queuing telemetry transport (MQTT) plays a vital role in collecting and sending sensor data. MQTT is a lightweight application layer protocol for IoT devices. It is a publish/subscribe protocol in which the sender delivers information to clients via an intermediary server known as a broker. Each published message has a single topic, which clients use when subscribing through the broker. Because the MQTT protocol standard defines a sole broker, that broker is a single point of failure (PoF); multiple brokers can be introduced into a system to increase availability. The IoT platform containing the MQTT broker is depicted in Figure 2.

Figure 2. IoT platform design and architecture with MQTT. (Diagram elements: device/sensor, MQTT, IoT platform with IoT server, MQTT broker, and database, RESTful API, user application.)

In addition, M2M technology facilitates work by enabling real-time responses on complicated provider networks, such as those found in factories. Real-time control and command with crucial technologies add functions and advantages to supply chain optimization and automation. As a result, use cases should be evaluated via the standard oneM2M technology with real-time command and control [21]. Furthermore, this technology must be incorporated into the current protocol standards. Additionally, oneM2M complies with the international M2M and IoT standards with the goal of creating a single M2M service layer, as shown in Figure 3. It would enable the integration of a wide range of hardware, software, and countless devices from around the world into a system that combines M2M-related fields of business into a serviceable system, including telematics, smart transportation, health care, utilities, industrial automation, and smart home applications.

Figure 3. The oneM2M architecture model.

The oneM2M model is a decentralized design that is relatively easy to modify, as shown in Figure 3.
It is constructed by connecting nodes with diverse capabilities. The device component of the IoT, or any logical hardware and software service, can be defined as an application entity (AE) in this architecture. The oneM2M service core, the IoT gateway, and the AE application service are all managed by the infrastructure node (IN), which is typically set up on a cloud system platform or server. The IN is in charge of the middle-layer region, which contains several middle nodes (MNs) that serve IoT service layers and AE application services. In most cases, MNs are created in the IoT gateway. Application service nodes (ASNs) provide lightweight common service layers and AE application services used in a remote M2M-based IoT system. In a small or limited IoT device system, application dedicated nodes (ADNs) are used to offer sensor monitoring and information return [22].

To visualize the data collected from the sensor, the authors used a oneM2M browser application. This application shows how the sensor data (temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10) are stored in the MySQL database. As shown in Figure 4, the green blocks titled cin represent updated sensor data received every second.

Figure 4. Real-time sensor data acquisition visualization using oneM2M browser application.

3. Methodology

3.1. Multi-Dense Layer BiLSTM

The BiLSTM is a variant of the general LSTM [23]. By processing the incoming data sequences from two directions with two independent LSTMs, we utilized the advantages of both prior and future contexts. The LSTM takes a variable-length sequence $x = x_1, x_2, \ldots, x_n$ as its general input, where $x_i \in \mathbb{R}^d$ and $d$ denotes the number of features at each time index $i$. The LSTM preserves its internal hidden state $h$ at each time index, resulting in a hidden sequence of $h_1, h_2, \ldots, h_n$.
At time index $t$, the hidden vector $h_t$ is modified as follows:

$$i_t = \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right) \tag{1}$$

$$f_t = \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right) \tag{2}$$

$$c_t = f_t \otimes c_{t-1} + i_t \otimes \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \tag{3}$$

$$o_t = \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right) \tag{4}$$

$$h_t = o_t \otimes \tanh\left(c_t\right) \tag{5}$$

where $c$, $\sigma$, and $\otimes$ denote the cell vector, the sigmoid function, and element-wise multiplication; $i$, $f$, and $o$ denote the input, forget, and output gates, respectively.

Figure 5 depicts the proposed Multi-Dense Layer BiLSTM model for predicting PM2.5 concentrations. The algorithm takes the PM2.5 concentration data from the raw data, which also contain temperature, humidity, PM0.3, PM0.5, PM1, PM5, and PM10 concentrations. Their values are then normalized into a range of 0 to 1. The processed dataset is sent into the model for training, and the learned model is then utilized to forecast PM2.5 levels.

Figure 5. Proposed architecture of Multi-Dense Layer BiLSTM model.

The BiLSTM layer is made up of two LSTM layers: a forward layer and a backward layer. The forward layer $\overrightarrow{l}$ reads the input in ascending order, i.e., t = 1, 2, 3, ..., T. The backward layer $\overleftarrow{l}$, on the other hand, reads the input in descending order, i.e., t = T, ..., 3, 2, 1. As a result, $\overrightarrow{l}$ and $\overleftarrow{l}$ can be combined to generate the output $y_t$. Because they use the same backpropagation through time (BPTT) training mechanism as LSTM networks, BiLSTMs are computationally inexpensive [24]. The backward LSTM layer output sequence $\overleftarrow{l}$ is calculated in the same way as the forward LSTM layer output sequence $\overrightarrow{l}$, but using the reversed inputs from time t-2 to t-n. These output sequences are then passed into the function that combines them into an output vector $y_t$. The final output of a BiLSTM layer can be represented as a vector, $Y = [y_{t-n}, \ldots, y_{t-1}]$, where the last element, $y_{t-1}$, is the estimated PM2.5 concentration for the following iteration, similar to the LSTM layer [25].

All the constructed LSTM networks in this study make use of the bidirectional feature.
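To make the gate updates in Equations (1)–(5) and the forward/backward reading order concrete, here is a minimal NumPy sketch. It assumes the standard LSTM formulation without peephole connections. The weight names (`W_xi`, `W_hi`, and so on), the random initialization, the hidden size, and the choice to concatenate the two directions are all illustrative assumptions, not details taken from the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_lstm_params(n_features, n_hidden, rng):
    """Small random weights for the input gate (i), forget gate (f),
    cell candidate (c), and output gate (o)."""
    params = {}
    for gate in ("i", "f", "c", "o"):
        params["W_x" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_features))
        params["W_h" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        params["b_" + gate] = np.zeros(n_hidden)
    return params


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; '*' is the element-wise product."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])  # input gate, Eq. (1)
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["b_f"])  # forget gate, Eq. (2)
    c_t = f_t * c_prev + i_t * np.tanh(
        p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])            # cell state, Eq. (3)
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["b_o"])  # output gate, Eq. (4)
    h_t = o_t * np.tanh(c_t)                                        # hidden state, Eq. (5)
    return h_t, c_t


def run_lstm(xs, p, n_hidden):
    """Run the cell over a (T, d) sequence and return all hidden states, shape (T, n_hidden)."""
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    hidden_states = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
        hidden_states.append(h)
    return np.stack(hidden_states)


def run_bilstm(xs, p_fwd, p_bwd, n_hidden):
    """Forward pass reads t = 1..T, backward pass reads t = T..1.
    The backward outputs are flipped back so both halves share one time axis."""
    fwd = run_lstm(xs, p_fwd, n_hidden)
    bwd = run_lstm(xs[::-1], p_bwd, n_hidden)[::-1]
    return np.concatenate([fwd, bwd], axis=1)  # concatenation is an assumed merge rule


rng = np.random.default_rng(0)
T, n_features, n_hidden = 5, 8, 4   # 8 features: temperature, humidity, six PM sizes
xs = rng.normal(size=(T, n_features))
p_fwd = init_lstm_params(n_features, n_hidden, rng)
p_bwd = init_lstm_params(n_features, n_hidden, rng)
out = run_bilstm(xs, p_fwd, p_bwd, n_hidden)
print(out.shape)
```

Concatenation is only one way to merge the two directions; a weighted sum of the forward and backward states is another common choice.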
The mathematical equations constituting the BiLSTM model are as follows:

$$\overrightarrow{l}_t = \tanh\left(W_{x\overrightarrow{l}}\, x_t + W_{\overrightarrow{l}\overrightarrow{l}}\, \overrightarrow{l}_{t-1} + b_{\overrightarrow{l}}\right) \tag{6}$$

$$\overleftarrow{l}_t = \tanh\left(W_{x\overleftarrow{l}}\, x_t + W_{\overleftarrow{l}\overleftarrow{l}}\, \overleftarrow{l}_{t+1} + b_{\overleftarrow{l}}\right) \tag{7}$$

$$y_t = W_{\overrightarrow{l}y}\, \overrightarrow{l}_t + W_{\overleftarrow{l}y}\, \overleftarrow{l}_t + b_y \tag{8}$$

After the BiLSTM layer has processed the data, its output is sent to a multi-dense layer with a linear activation function to give continuous-valued predictions. A dense layer is a fully connected layer, i.e., every neuron in the preceding layer is connected to every neuron in the dense layer [26]. It is the most commonly used layer in artificial neural networks [27]. The proposed architecture uses two dense layers, with two units in the first dense layer and one unit in the second. All units in the dense layers use the sigmoid activation function. In this study, adding more layers to the dense section is expected to increase the network's robustness [28].

3.2. Sigmoid Activation Function

The sigmoid is a non-linear activation function frequently employed in feedforward neural networks. It is a bounded, differentiable real function, defined for real input values, with a positive derivative everywhere and a certain amount of smoothness. The sigmoid function is defined by:

$$f(x) = \frac{1}{1 + \exp(-x)} \tag{9}$$

The sigmoid function is found in the output layers of deep learning (DL) architectures, and it is used to predict probability-based outputs. It has been successfully employed in binary classification problems, logistic regression tasks, and other neural network fields. The key advantages of sigmoid functions are that they are simple to learn and that they are commonly utilized in external networks [29]. The sigmoid activation function was selected for this paper since it is well suited to tasks that need a continuous-valued output, such as PM2.5 concentration [30].

4. Experimental Setup

The proposed Multi-Dense Layer BiLSTM model is utilized to predict PM2.5 concentrations and can be implemented in the cleanroom of the semiconductor factory. The TensorFlow Keras library was used to implement the proposed system design.

4.1. Dataset and Preprocessing

The dataset contains 259,200 data points from 24 September to 26 September 2021. Temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 are the eight variables listed in Table 1. In this experiment, the algorithm includes linear interpolation, which is employed to fill in any missing values. Linear interpolation produces the best results for all percentages of missing values when compared to other methods, such as the mean method [31]. The authors found no missing values in the raw data, which indicates an advantage of using the oneM2M system to gather the data [32]. We utilize Equation (10) to normalize the data before inputting it into the proposed method [33]:

$$x_{normalized} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{10}$$

where $x_{min}$ denotes the minimum of the data, $x_{max}$ is the maximum of the data, and $x$ is the original data.

It is critical to create supervised time series data. For each prediction horizon (1 h, 2 h, and 3 h ahead), an input matrix of past observations $x$ and an output matrix of the corresponding future observations are constructed.

Table 1. Variables contained in the dataset.

| Categories | Input Variables | Unit |
|---|---|---|
| Temperature | TEMP | °C |
| Humidity | HUMID | %RH |
| Air pollutant variables | PM0.3 | μg/m3 |
| Air pollutant variables | PM0.5 | μg/m3 |
| Air pollutant variables | PM1 | μg/m3 |
| Air pollutant variables | PM2.5 | μg/m3 |
| Air pollutant variables | PM5 | μg/m3 |
| Air pollutant variables | PM10 | μg/m3 |

4.2. Hyperparameters Setting

In this study, the authors evaluated the sequence learning models for short- and long-term predictions; experiments were conducted for different time scales, namely one hour, two hours, and three hours ahead. Table 2 lists the hyperparameter settings for our proposed model, Multi-Dense Layer BiLSTM, alongside those of the compared models. The model's hyperparameters were determined to achieve the best results. To ensure the consistency of the results, the authors used three epoch values: 20, 35, and 50. A split of 80% for training data and 20% for testing data was chosen because this is empirically the best partition into training and testing sets [34].

Table 2. List of hyperparameter values for the Multi-Dense Layer BiLSTM method and the compared models.

| Hyperparameter | RNN | LSTM | CNN-LSTM | Single-Dense Layer BiLSTM | Multi-Dense Layer BiLSTM |
|---|---|---|---|---|---|
| Model nodes | 2 RNN nodes | 64 LSTM nodes | 128 LSTM nodes | 64 BiLSTM nodes | 64 BiLSTM nodes |
| Epoch | 20/35/50 | 20/35/50 | 20/35/50 | 20/35/50 | 20/35/50 |
| Test data (% dataset) | 20% | 20% | 20% | 20% | 20% |
| Optimizer | ADAM | SGD | ADAM | ADAM | ADAM |
| Activation | Linear | Linear | ReLU | Linear | Sigmoid |
| Learning rate | 0.01 | 0.01 | 0.001 | 0.001 | 0.001 |
| Dense layer | N/A | N/A | 3 | 1 | 2 |

[Batch size and interpolate method are 16 and linear for both BiLSTM models; for the RNN, LSTM, and CNN-LSTM columns the extracted values are 16/N/A, 16/linear, and 64/linear, with the column assignment not preserved. Train data (% dataset) values as extracted: 64%, 64%, 80%, 80%, 64. Validation data (% dataset) values as extracted: 16%, 16%, N/A, N/A, 16%.]

4.3. Performance Criteria

We predicted PM2.5 concentrations for the three-day dataset. The experiments used three metrics to assess the performance of the proposed Multi-Dense Layer BiLSTM model: mean square error (MSE) (11), mean absolute error (MAE) (12), and mean absolute percentage error (MAPE) (13):

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \tag{11}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left|y_i - \hat{y}_i\right| \tag{12}$$

$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{13}$$

where $y_i$ refers to the real value, $\hat{y}_i$ refers to the forecasted value, and $n$ is the sample size. Higher forecasting accuracy is associated with lower MSE, MAE, and MAPE values [35].

5. Result and Discussion

The proposed Multi-Dense Layer BiLSTM model was used to predict PM2.5 concentrations one hour, two hours, and three hours ahead of the three-day observation, totaling 259,200 s. To test the validity of predictions and support preventive maintenance over a longer timeframe, we set the forecast horizon to 1 h, 2 h, and 3 h. The proposed Multi-Dense Layer BiLSTM retains low MSE, MAE, and MAPE levels at varied sampling rates, suggesting that forecasting accuracy can be maintained. Table 3 shows that when the number of epochs in the RNN, LSTM, CNN-LSTM, and Single-Dense Layer BiLSTM algorithms increased, so did the MSE, MAE, and MAPE values, indicating that these approaches were overfitting. The MSE, MAE, and MAPE values of our model, on the other hand, dropped as the prediction time length was increased. These findings indicate that, compared to the other models, our proposed model predicts PM2.5 concentrations 1 h, 2 h, and 3 h ahead with the highest precision.

Table 3. The best results from all compared models for PM2.5 prediction with 20, 35, and 50 epochs.

| Prediction Model | Time Length | MSE (20 epoch) | MSE (35 epoch) | MSE (50 epoch) | MAE (20 epoch) | MAE (35 epoch) | MAE (50 epoch) | MAPE (20 epoch) | MAPE (35 epoch) | MAPE (50 epoch) |
|---|---|---|---|---|---|---|---|---|---|---|
| RNN | 1h | 0.1072 | 0.1141 | 0.1001 | 0.2415 | 0.2371 | 0.2199 | 29.8106 | 31.0264 | 27.7334 |
| RNN | 2h | 0.1012 | 0.1138 | 0.1165 | 0.1778 | 0.1986 | 0.2097 | 23.89 | 27.4217 | 29.8114 |
| RNN | 3h | 0.0833 | 0.0908 | 0.072 | 0.2501 | 0.2337 | 0.2132 | 54.1121 | 56.2735 | 52.1554 |
| LSTM | 1h | 0.0058 | 0.0048 | 0.0045 | 0.0626 | 0.0548 | 0.055 | 11.4597 | 8.8899 | 9.1937 |
| LSTM | 2h | 0.0187 | 0.0771 | 0.0217 | 0.125 | 0.1985 | 0.1325 | 25.617 | 68.8899 | 27.6512 |
| LSTM | 3h | 0.0619 | 0.0724 | 0.066 | 0.1703 | 0.2023 | 0.1765 | 60.5662 | 67.6347 | 63.2913 |
| CNN-LSTM | 1h | 5.838 | 3.907 | 3.69 | 2.305 | 1.706 | 1.676 | 6.061 | 3.613 | 4.332 |
| CNN-LSTM | 2h | 3.871 | 3.992 | 3.871 | 1.659 | 1.739 | 1.659 | 4.864 | 4.948 | 4.864 |
| CNN-LSTM | 3h | 8.434 | 8.434 | 8.434 | 2.598 | 2.598 | 2.598 | 7.64 | 7.64 | 7.64 |
| Single-Dense Layer BiLSTM | 1h | 0.0016 | 0.0017 | 0.0016 | 0.0029 | 0.0034 | 0.3266 | 0.3385 | 0.3434 | 0.3047 |
| Single-Dense Layer BiLSTM | 2h | 0.0015 | 0.0027 | 0.0014 | 0.318 | 0.0042 | 0.3105 | 0.3046 | 0.4193 | 0.2814 |
| Single-Dense Layer BiLSTM | 3h | 0.0053 | 0.0047 | 0.0063 | 0.0067 | 0.0064 | 0.0073 | 0.673 | 0.6433 | 0.7322 |
| Multi-Dense Layer BiLSTM | 1h | 0.0009 | 0.0014 | 0.0011 | 0.0023 | 0.0028 | 0.0027 | 0.2258 | 0.2849 | 0.2701 |
| Multi-Dense Layer BiLSTM | 2h | 0.0014 | 0.0008 | 0.0009 | 0.0027 | 0.002 | 0.0021 | 0.2713 | 0.1991 | 0.2058 |
| Multi-Dense Layer BiLSTM | 3h | 0.001 | 0.0018 | 0.0006 | 0.0022 | 0.0035 | 0.0019 | 0.223 | 0.3507 | 0.1873 |

Figure 6a–f displays the results of predicting PM2.5 concentration one hour ahead using the Multi-Dense Layer BiLSTM algorithm. Across the experiments with the three epoch settings, the best agreement between train and validation loss appears before the 10th epoch. The blue line in Figure 6a,c,e, plotted on a log scale, indicates train loss. The orange line, reflecting validation loss, follows the same trend, resulting in the proposed model having the lowest MSE, MAE, and MAPE in 1 hour ahead prediction, as shown in Table 3. In Figure 6b,d,f, the test data and prediction data match very closely. These outputs demonstrate that the proposed model has the best fit for predicting PM2.5 concentrations.
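For readers who want to reproduce the kind of comparison reported in Table 3, the three error metrics can be sketched in plain Python as below. This is a minimal illustration using the standard definitions of MSE, MAE, and MAPE; the sample values are made up, and reporting MAPE as a percentage is an assumption, since the source does not state the scaling it used.

```python
def mse(y_true, y_pred):
    """Mean square error: average of squared differences."""
    n = len(y_true)
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.
    Undefined when any true value is zero, so that case is rejected."""
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    n = len(y_true)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / n


# Made-up sample values, for illustration only.
y_true = [10.0, 20.0, 40.0]
y_pred = [12.0, 18.0, 40.0]

print(round(mse(y_true, y_pred), 4))   # (4 + 4 + 0) / 3
print(round(mae(y_true, y_pred), 4))   # (2 + 2 + 0) / 3
print(round(mape(y_true, y_pred), 4))  # 100 * (0.2 + 0.1 + 0) / 3
```

One practical caveat: after min-max scaling to the range 0 to 1, the smallest observation maps to exactly zero, so MAPE computed on scaled values needs either an inverse transform first or special handling of zero targets. The source does not say which approach was taken.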
Compared to the other four models, the predictions of the proposed model were the closest to the actual values.

Figure 6. Result of loss function using the proposed model for 1 hour ahead prediction (a,b) 20 epochs. (c,d) 35 epochs. (e,f) 50 epochs.

The loss function patterns of the proposed model for estimating PM2.5 concentration two hours ahead are shown in Appendix A, Figure A1a–f. In Figure A1e in particular, the blue and orange lines are closest around the 7th epoch and then diverge up to the 50th epoch, while following broadly the same trend. The test data and prediction data practically merge in Figure A1b,d,f. These findings suggest that the proposed model is the most accurate in predicting PM2.5 concentrations 2 h ahead, as shown in Table 3.

As in the one-hour and two-hour experiments, Figure A2a–f shows that the three hours ahead prediction by the proposed model attained the best fit. All models were run for 20, 35, and 50 epochs; the proposed model, whose predictions are represented by the orange line, came closest to the actual value, which is represented by the blue line. Notably, in Figure A2e, the validation loss spikes at the 15th epoch and then follows the same trend as the train loss until the 50th epoch. When the number of epochs is increased from 20 to 35, and then to 50, the MSE, MAE, and MAPE values of the proposed model tend to decrease to the lowest error compared to the other methods. These results demonstrate that Multi-Dense Layer BiLSTM fits better than the other four models.

The authors took a sample to combine the fitting trends from all models for one hour ahead prediction, with each model run for 20 epochs; these are shown in Figure 7.
In the zoomed view of Figure 7, the authors took sample lines from all compared models from 25 to 35 seconds, showing that the proposed model, represented by the orange line, is closest to the real data, represented by the blue line. The proposed model and the Single-Dense Layer BiLSTM, represented by the green line, track each other closely. This result suggests that the proposed model, which uses multi-dense layers, can improve network stability [28].

From Figure 7, we can also see that the CNN-LSTM model, represented by the red line, is furthest from the real data. This result is consistent with the values in Table 3. This is understandable given that CNN-LSTM is essentially slower due to its operations and necessitates a lengthy process [36].

Figure 7. Comparison of all models to predict PM2.5 concentrations for one hour ahead.

6. Conclusions

PM2.5 concentrations can have a significant impact on product quality in a semiconductor plant. Therefore, a robust framework is required to monitor, analyze, and predict air quality with additional visualization services. It is imperative to develop an accurate prediction method so that personnel working at semiconductor manufacturing sites are aware of the prospective air quality in the cleanrooms.

In this study, we built a PM monitoring infrastructure using an industrial-grade sensor to meet the quality and compatibility standards of the industrial sector. In particular, open-source software platforms compliant with oneM2M standard technology were used to provide a standardized approach to accessing the obtained PM datasets, allowing us to construct globally applicable and access-independent PM apps using oneM2M-defined REST APIs.
Furthermore, for one hour, two hours, and three hours forecasts, the proposed technique, Multi-Dense Layer BiLSTM, was shown to have the lowest errors in terms of MSE, MAE, and MAPE compared to the RNN, LSTM, CNN-LSTM, and Single-Dense Layer BiLSTM models. The findings can also be used by policymakers in semiconductor factories to control the air quality of the cleanroom using HVAC based on this proposed model of PM2.5 prediction. Despite the excellent results, there are a few difficulties that we would like to address in future work, such as the large memory and computing time required by our model in the case of big datasets.

To further improve the prediction, cumulative airborne characteristics of the cleanroom must be analyzed. Further steps include constructing integrated predictive, preventive, and prescriptive maintenance based on the PM prediction with suitable control function services in cleanroom semiconductor manufacturing.

Author Contributions: Conceptualization, A.T.P.; methodology, A.T.P.; software, A.T.P. and I.B.K.Y.U.; validation, A.T.P. and I.B.K.Y.U.; formal analysis, A.T.P.; investigation, A.T.P.; resources, A.T.P.; data curation, A.T.P.; writing (original draft preparation), A.T.P.; writing (review and editing), A.T.P.; visualization, A.T.P. and I.B.K.Y.U.; supervision, Y.M.J. All authors have read and agreed to the published version of the manuscript.

Funding: This research was financially supported by the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project ID:P0011880).

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to further research on processing.
Conflicts of Interest: The authors declare no conflict of interest.

Appendix A

Figure A1. Result of loss function using the proposed model for 2 h ahead prediction: (a,b) 20 epochs; (c,d) 35 epochs; (e,f) 50 epochs.

Figure A2. The accuracy of prediction using the proposed model for 3 h ahead prediction: (a,b) 20 epochs; (c,d) 35 epochs; (e,f) 50 epochs.

Abbreviations

ADAM         Adaptive Momentum Estimation
ADN          Application Dedicated Nodes
AE           Application Entity
ANN          Artificial Neural Network
API          Application Programming Interface
ASN          Application Service Node
BiLSTM       Bidirectional Long Short‐Term Memory
BPTT         Backpropagation Through Time
CMAQ         Community Multiscale Air Quality
CNN‐LSTM     Convolutional Neural Network–Long Short‐Term Memory
CONVLSTM2D   Convolutional Long Short‐Term Memory Two‐Dimensional
CPS          Cyber‐Physical System
DL           Deep Learning
IN           Infrastructure Node
HVAC         Heating Ventilation and Air Conditioning
IoT          Internet of Things
JSON         JavaScript Object Notation
LSTM         Long Short‐Term Memory
M2M          Machine to machine
MAE          Mean Absolute Error
MAPE         Mean Absolute Percentage Error
ML           Machine Learning
MN           Middle Node
MQTT         Message Queue Telemetry Transport
MSE          Mean Square Error
PM           Particulate Matter
PM0.3        Particulate Matter of 0.3 μm
PM0.5        Particulate Matter of 0.5 μm
PM1.0        Particulate Matter of 1.0 μm
PM2.5        Particulate Matter of 2.5 μm
PM5          Particulate Matter of 5 μm
PM10         Particulate Matter of 10 μm
RDBMS        Relational Database Management System
REST         Representational State Transfer
RNN          Recurrent Neural Network
RTU          Remote Terminal Unit

References

1. Park, S.H.; Kim, S.; Baek, J.G. Kernel‐Density‐Based Particle Defect Management for Semiconductor Manufacturing Facilities. Appl. Sci. (Switz.) 2018, 8, 224. https://doi.org/10.3390/app8020224.
2. Choi, K.‐M. Airborne PM2.5 Characteristics in Semiconductor Manufacturing Facilities. AIMS Environ.
Sci. 2018, 5, 216–228. https://doi.org/10.3934/environsci.2018.3.216.
3. Prihatno, A.T.; Nurcahyanto, H.; Ahmed, M.F.; Rahman, M.H.; Alam, M.M.; Jang, Y.M. Forecasting PM2.5 Concentration Using a Single‐Dense Layer BiLSTM Method. Electron. (Switz.) 2021, 10, 1808. https://doi.org/10.3390/electronics10151808.
4. Wali, F.; Knotter, D.M.; Kuper, F.G. Impact of Nano Particles on Semiconductor Manufacturing. In Proceedings of the 2008 IEEE International Conference on Multi Topi, Karachi, Pakistan, 23–24 December 2008; pp. 97–99.
5. The International Technology Roadmap for Semiconductors. The International Technology Roadmap for Semiconductors 2.0. In Proceedings of the 2015 Edition Yield Enhancement; IEEE, Chip Design, Solid State Technology: Virtual Conferences, 2015; p. 3.
6. Chang‐Hoi, H.; Park, I.; Oh, H.R.; Gim, H.J.; Hur, S.K.; Kim, J.; Choi, D.R. Development of a PM2.5 Prediction Model Using a Recurrent Neural Network Algorithm for the Seoul Metropolitan Area, Republic of Korea. Atmos. Environ. 2021, 245, 118021. https://doi.org/10.1016/j.atmosenv.2020.118021.
7. Tsai, Y.T.; Zeng, Y.R.; Chang, Y.S. Air Pollution Forecasting using RNN with LSTM. In Proceedings of the 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech); IEEE: Athens, Greece, 2018; pp. 1068–1073.
8. Park, J.; Chang, S. A Particulate Matter Concentration Prediction Model Based on Long Short‐Term Memory and an Artificial Neural Network. Int. J. Environ. Res. Public Health 2021, 18, 6801. https://doi.org/10.3390/ijerph18136801.
9. Huang, C.J.; Kuo, P.H. A Deep CNN‐LSTM Model for Particulate Matter (PM2.5) Forecasting in Smart Cities. Sens. (Switz.) 2018, 18, 2220. https://doi.org/10.3390/s18072220.
10. Li, T.; Hua, M.; Wu, X.U. A Hybrid CNN‐LSTM Model for Forecasting. 3 IEEE Access 2020, 26933–26940.
11. Seong, N.
Deep Spatiotemporal Attention Network for Fine Particle Matter 2.5 Concentration Prediction with Causality Analysis. IEEE Access 2021, 9, 73230–73239. https://doi.org/10.1109/ACCESS.2021.3080828.
12. Castelli, M.; Clemente, F.M.; Popovič, A.; Silva, S.; Vanneschi, L. A Machine Learning Approach to Predict Air Quality in California. Complexity 2020, 2020. https://doi.org/10.1155/2020/8049504.
13. Zhang, B.; Zhang, H.; Zhao, G.; Lian, J. Constructing a PM2.5 Concentration Prediction Model by Combining Auto‐Encoder with Bi‐LSTM Neural Networks. Environ. Model. Softw. 2020, 124, 104600. https://doi.org/10.1016/j.envsoft.2019.104600.
14. Wu, J.; Tian, K.; Dong, Q.; Sun, L.; Zhang, L.; Liu, X. A Low Voltage Low Power Adaptive Transceiver for Twisted‐Pair Cable Communication. IEEE Trans. Nucl. Sci. 2015, 62, 3140–3147. https://doi.org/10.1109/TNS.2015.2480596.
15. Seneca. The Advantages of ModBUS RTU Protocol. Available online: https://blog.seneca.it/en/the-advantages-of-modbus-rtu-protocol/ (accessed on 31 December 2021).
16. Prihatno, A.T. Artificial Intelligence Platform Based for Smart Factory. In Proceedings of the Korea Artificial Intelligence Conference, South Korea, 16–18 December; KAIC, Korean Artificial Intelligence Conference: Online Conference, 2020; pp. 1–2.
17. Ken Figueredo. IEEE Communications Standards Magazine Volume: 4, Issues: 2. June 2020, pp. 10–11.
18. oneM2M Partners. Benefits of OneM2M. Available online: https://www.onem2m.org/using-onem2m/what-is-onem2m (accessed on 12 November 2021).
19. Yun, J.; Woo, J. IoT‐Enabled Particulate Matter Monitoring and Forecasting Method Based on Cluster Analysis. IEEE Internet Things J. 2021, 8, 7380–7393. https://doi.org/10.1109/JIOT.2020.3038862.
20. Prihatno, A.T.; Nurcahyanto, H.; Jang, Y.M. Smart Factory Based on IoT Platform. KIC Summer Conf. 2020, 2–4. https://doi.org/10.3390/MACHINES6020023.Thalesgroup.
21. Zhao, R.; Wang, L.; Zhang, X.; Zhang, Y.; Wang, L.; Peng, H.
A OneM2M‐Compliant Stacked Middleware Promoting IoT Research and Development. IEEE Access 2018, 6, 63546–63559. https://doi.org/10.1109/ACCESS.2018.2876197.
22. Xu, S.S.D.; Chen, C.H.; Chang, T.C. Design of OneM2M‐Based Fog Computing Architecture. IEEE Internet Things J. 2019, 6, 9464–9474. https://doi.org/10.1109/JIOT.2019.2929118.
23. Shabanian, S.; Arpit, D.; Trischler, A.; Bengio, Y. Variational Bi‐LSTMs. arXiv 2017, arXiv:1711.05717.
24. Shah, S.R. Bin; Chadha, G.S.; Schwung, A.; Ding, S.X. A Sequence‐to‐Sequence Approach for Remaining Useful Lifetime Estimation Using Attention‐Augmented Bidirectional LSTM. Intell. Syst. Appl. 2021, 10–11, 200049. https://doi.org/10.1016/j.iswa.2021.200049.
25. Li, Y.H.; Harfiya, L.N.; Purwandari, K.; Lin, Y. Der. Real‐Time Cuffless Continuous Blood Pressure Estimation Using Deep Learning Model. Sens. (Switz.) 2020, 20, 5606. https://doi.org/10.3390/s20195606.
26. Rampurawala, M. Classification with TensorFlow and Dense Neural Networks. Available online: https://heartbeat.fritz.ai/classification-with-tensorflow-and-dense-neural-networks-8299327a818a (accessed on 1 June 2021).
27. Verma, Y. A Complete Understanding of Dense Layers in Neural Networks. Available online: https://analyticsindiamag.com/a-complete-understanding-of-dense-layers-in-neural-networks/ (accessed on 3 December 2021).
28. Islam, M.N.; Sulaiman, N.; Farid, F. Al; Uddin, J.; Alyami, S.A.; Rashid, M.; Majeed, A.P.P.A.; Moni, M.A. Diagnosis of Hearing Deficiency Using EEG Based AEP Signals: CWT and Improved‐VGG16 Pipeline. PeerJ Comput. Sci. 2021, 7, 1–28. https://doi.org/10.7717/peerj-cs.638.
29. Nwankpa, C.; Ijomah, W.; Gachagan, A.; Marshall, S. Activation Functions: Comparison of Trends in Practice and Research for Deep Learning. arXiv 2018, arXiv:1811.03378.
30. Narayan, S. The Generalized Sigmoid Activation Function: Competitive Supervised Learning. Inf. Sci. 1997, 99, 69–82.
https://doi.org/10.1016/S0020-0255(96)00200-9.
31. Noor, N.M.; Al Bakri Abdullah, M.M.; Yahaya, A.S.; Ramli, N.A. Comparison of Linear Interpolation Method and Mean Method to Replace the Missing Values in Environmental Data Set. Mater. Sci. Forum 2015, 803, 278–281. https://doi.org/10.4028/www.scientific.net/MSF.803.278.
32. Alaya, B.; Medjiah, S.; Monteil, T.; Drira, K.; Khalil, D. Towards Semantic Data Interoperability in OneM2M Standard. IEEE Commun. Mag. Inst. Electr. Electron. Eng. 2015, 53, 35–41.
33. Gao, X.; Li, W. A Graph‐Based LSTM Model for PM2.5 Forecasting. Atmos. Pollut. Res. 2021, 12, 101150. https://doi.org/10.1016/j.apr.2021.101150.
34. Gholamy, A.; Kreinovich, V.; Kosheleva, O. Why 70/30 or 80/20 Relation Between Training and Testing Sets: A Pedagogical Explanation. Dep. Tech. Reports 2018, 2, 1–6.
35. Vijaysinh Lendave. A Guide to Different Evaluation Metrics for Time Series Forecasting Models. Available online: https://analyticsindiamag.com/a-guide-to-different-evaluation-metrics-for-time-series-forecasting-models/ (accessed on 31 December 2021).
36. Sandeep Bhuiya. Disadvantages of CNN Models. Available online: https://iq.opengenus.org/disadvantages-of-cnn/ (accessed on 31 December 2021).