Article
oneM2M‐Enabled Prediction of High Particulate Matter Data
Based on Multi‐Dense Layer BiLSTM Model
Aji Teguh Prihatno, Ida Bagus Krishna Yoga Utama and Yeong Min Jang *
Department of Electronics Engineering, Kookmin University, Seoul 02707, Korea;
aji.teguh@gmail.com (A.T.P.); idabaguskrishnayogautama@gmail.com (I.B.K.Y.U.)
* Correspondence: yjang@kookmin.ac.kr; Tel.: +82‐02‐910‐5068
Abstract: High particulate matter (PM) concentrations in the cleanroom semiconductor factory have
become a significant concern as they can damage electronic devices during the manufacturing
process. PM can be predicted before becoming more concentrated based on its historical data to
support factory management in regulating the air quality in the cleanroom. In this paper, a Multi‐
Dense Layer BiLSTM model is proposed to predict PM2.5 concentrations in the indoor environment
of the cleanroom. To obtain reliable, valid, and interoperable data, the datasets containing
temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 were retrieved in a
standardized manner via oneM2M‐defined representational state transfer application
programming interfaces by employing software platforms compliant with the Internet of Things
(IoT) standard. Based on the proposed model, an algorithm was built providing short‐term PM2.5
concentration predictions (one hour ahead, two hours ahead, and three hours ahead). The proposed
model outperformed the RNN, LSTM, CNN‐LSTM, and Single‐Dense Layer BiLSTM models in
terms of MSE, MAE, and MAPE values. The model created in this study could predict high PM2.5
concentration levels more accurately, thus providing vital support for operation and maintenance
for the semiconductor industry.
Keywords: oneM2M; particulate matter (PM); PM2.5; Multi‐Dense Layer BiLSTM; cleanroom
Citation: Prihatno, A.T.; Utama, I.B.K.Y.; Jang, Y.M. oneM2M‐Enabled Prediction of High Particulate Matter Data Based on Multi‐Dense Layer BiLSTM Model. Appl. Sci. 2022, 12, 2260. https://doi.org/10.3390/app12042260
Academic Editor: João M. F. Rodrigues
Received: 31 December 2021
Accepted: 16 February 2022
Published: 21 February 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
The semiconductor industry is one of the world's most rapidly growing and evolving
industries. The global semiconductor market is estimated to be worth 333 billion dollars.
In addition, the industry has a considerable impact on the national economy,
accounting for 10–15% of the Republic of Korea's total exports. Market competition is
intensifying due to the recent growth of available electronic gadgets, such
as mobile phones and tablet PCs (personal computers) [1]. Semiconductor fabrication
requires a variety of complex chemical components [2] and generates various chemicals and
by‐products that are almost impossible to remove completely from the inside of the equipment.
Powders and airborne PM are by‐products of the chemical reactions of the metal precursors
used as process materials during regular operation. They are released into the workplace
during maintenance of the process equipment and of the scrubbers (which are used to remove
some particulates and gases from industrial exhaust streams) and can severely damage the
electronic circuits [3].
The yield of the semiconductor industry is defined as the percentage of functional
integrated circuit (IC) devices at the end of the fabrication process. In general, there are
two types of yield losses in IC manufacturing: systematic and random yield loss.
Deviations in the device and material characteristics cause systematic yield loss.
Contamination issues and process‐induced particles are frequently linked to random
defect yield loss [4]. The following are a few instances of contaminations and mechanisms
responsible for electronic chip failures in semiconductor manufacturing: particulate matter
contamination, from organic or inorganic particles created by the environment or by tools,
and process‐related defects, such as scratches, fractures, overlay flaws, and
stress [5]. As a result, from an industrial hygiene perspective, monitoring, determining, and
predicting the powder by‐products and airborne PM in the cleanroom of a semiconductor factory
are essential to avoid economic losses. This study aims to predict the concentrations of
airborne PM in semiconductor manufacturing facilities based on three‐day historical
data gathered using oneM2M technology.
Over the years, several approaches have been developed to predict and manage PM.
Chang‐Hoi et al. [6] utilized an RNN combined with CMAQ (Community Multiscale Air
Quality) to forecast PM2.5. Ting Tsai et al. [7] employed an RNN model to predict PM2.5
concentrations, but the resulting errors, such as RMSE and MAE, were high. Park et al. [8]
used the long short‐term memory (LSTM) and artificial neural network (ANN) models together to
forecast PM, achieving a higher F‐1 score than the individual LSTM, ANN, and
random forest (RF) models. Huang et al. [9] forecasted PM2.5 in a smart city environment
using a deep neural network (APNet) based on CNN‐LSTM. Li et al. [10] combined CNN
and LSTM (named CNN‐LSTM) to predict PM2.5 concentrations. To improve
forecasting accuracy, the CNN‐LSTM used a convolutional neural network for feature
extraction and a recurrent neural network for time series data processing.
Seong et al. [11] predicted PM2.5 concentrations at 186 stations using two layers of
convolutional long short‐term memory two‐dimensional (CONVLSTM2D) and batch
normalization. Castelli et al. [12] forecasted the air quality index (AQI) containing O3,
CO, SO2, NO2, and PM2.5 based on support vector regression, but the accuracy for PM2.5
still had to be enhanced. Zhang et al. [13] constructed a model to forecast PM2.5 using
a combination of auto‐encoder and BiLSTM neural networks. However, the evaluation was
limited because it reported only the RMSE and the correlation coefficient.
In our work, PM2.5 prediction is used as a case study to demonstrate the Multi‐
Dense Layer BiLSTM model and to compare several AI methods on the same prediction
target. As an advancement of the previous research [3], we show that the model can
successfully predict the time series data and even outperform several existing prediction
strategies.
The following are our specific contributions:
• We used the hardware architecture, based on oneM2M technology, to achieve IoT
  system compatibility in the semiconductor factory cleanroom;
• We showed that our Multi‐Dense Layer BiLSTM model can accurately forecast PM2.5
  from multi‐size PM concentration datasets (PM0.3, PM0.5, PM1, PM2.5, PM5, and
  PM10);
• We created a system with a small number of parameters, making it computationally
  efficient, potent, and stable;
• Our findings revealed that the Multi‐Dense Layer BiLSTM approach yields the
  lowest error when compared to the RNN, LSTM, CNN‐LSTM, and Single‐Dense
  Layer BiLSTM methods.
The paper is structured as follows. Section 2 provides an overview of the system, and
Section 3 describes the methodology. The experimental setup for PM prediction is presented
in Section 4. The results of the experiment are evaluated and discussed in Section 5. Finally,
Section 6 concludes the paper and discusses ideas for further research.
2. System Overview
To establish reliable and valid datasets, the authors collected the sensor data via
oneM2M standard technology. This method needs a communication interface to support
the cyber‐physical system (CPS).
2.1. Communication Interface
In this paper, the authors used the RS485 Modbus RTU protocol as the communication
interface, over an RJ‐11 6P4C cable. An RS485‐to‐USB converter was used to convert the
sensor data from the RS485 Modbus RTU protocol to USB so that the computer could read and
process it. The RS485 data rate can reach 35 Mbit/s over a 10 m
connection and 100 kbit/s over a 1200 m line [14]. The RS485 Modbus RTU protocol has a
number of benefits, including reliable communication, interoperability across devices
from different manufacturers, and ease of installation and configuration, making it ideal
for edge computing [15].
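To make the Modbus RTU framing concrete, the following minimal sketch builds a "Read Holding Registers" request and appends the CRC‐16 checksum used by Modbus RTU. The slave address, register address, and register count are hypothetical, since the sensor's register map is not given in this paper.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_read_request(slave_id: int, start_register: int, count: int) -> bytes:
    """Build a 'Read Holding Registers' (function code 0x03) request frame."""
    # Header fields are big-endian: slave id, function code, address, quantity.
    body = struct.pack(">BBHH", slave_id, 0x03, start_register, count)
    # The CRC is appended low byte first, as Modbus RTU requires.
    return body + struct.pack("<H", crc16_modbus(body))


# Hypothetical values: slave 1, one register starting at address 0.
frame = build_read_request(slave_id=1, start_register=0, count=1)
print(frame.hex(" "))  # 01 03 00 00 00 01 84 0a
```

A receiver can validate a frame by recomputing the CRC over the whole frame, checksum included; the result is zero for an undamaged frame.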
The general architecture used in this study contains hardware, an IoT platform, and
an artificial intelligence (AI) platform, as shown in Figure 1. The industrial‐grade sensor
collected temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 during a
three‐day period from 24 September to 26 September 2021. Mobius, an open IoT
platform, acts as the gateway, server, and database. This open IoT platform works on the
oneM2M technology standard, which is elaborated in Section 2.2.
Jupyter Notebook, based on Python programming, was used as the AI platform.
Jupyter Notebook contains a set of open standards for collaborative computing. These
open standards can be used by third‐party developers to construct bespoke applications
with embedded interactive computing based on HTML and CSS on cloud computing.
Its modular design extends to visualization, multimedia, and more.
In addition to running the code, it saves the code and output as well as markdown
notes in an editable document called a notebook. When users save a page in their browser,
it is transferred to their notebook server, which saves it on disk as a JSON file with the
.ipynb extension [16].
Figure 1. General architecture of this study.
2.2. oneM2M Technology Standard
Connected devices have been around for a long time, but they took off after the
phrase "Internet of Things" (IoT) was established. As IoT devices began to proliferate, a
standard was needed to satisfy new IoT requirements without rewriting components whose
specifications had already been tried and verified. The oneM2M‐based platform was built
with these concepts in mind to facilitate IoT device and application interoperability and
economies of scale. Furthermore, oneM2M's standard interoperability testing activities
are important aspects of a robust standard [17].
The oneM2M standards support IoT applications to discover and interact with any
IoT devices. IoT solutions can currently communicate across various silos. This is perfect
for distributed and collaborative solutions in domains like smart buildings, smart cities,
and smart manufacturing. Furthermore, oneM2M standards were created with the goal
of reducing fragmentation, increasing reusability, and lowering costs through scalability
[18]. The oneM2M initiative has been working on IoT standards to address fragmentation
in the IoT landscape. It focuses on service layer interoperability rather than protocol stacks
within the network or internet layers, and hence provides optimal technical standards for
building a common horizontal IoT service platform across several domain sectors [19].
The authors used an IoT system based on oneM2M platforms in the cleanroom of a
semiconductor smart factory environment to obtain reliable, valid, and interoperable
data [20]. With this technology, we can have a common service capability layer across the
end‐to‐end platform.
In the oneM2M standard, message queuing telemetry transport (MQTT) plays a vital
role in collecting and sending sensor data. MQTT is a lightweight application layer protocol
for IoT devices. It is a publish/subscribe protocol in which the sender delivers
information to clients via an intermediary server known as a broker. Each published
message has a single topic, and clients subscribe to topics at the broker to receive the
messages. A sole broker, as defined in the MQTT protocol standard, acts as a single point of
failure (PoF), so multiple brokers can be introduced into a system to increase availability.
The IoT platform containing the MQTT broker is depicted in Figure 2.
[Figure 2 labels: IoT Platform, IoT Server, MQTT Broker, Database, Device/Sensor, User Application, MQTT, RESTful API]
Figure 2. IoT platform design and architecture with MQTT.
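The publish/subscribe pattern described above can be illustrated with a toy in‐memory broker. This is only a sketch of the routing idea under simplifying assumptions: it is not the MQTT wire protocol, it has no wildcards or quality‐of‐service levels, and the topic names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class InMemoryBroker:
    """Toy broker: routes each published message to the subscribers of its topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every subscriber of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = InMemoryBroker()
received = []
broker.subscribe("cleanroom/pm25", lambda topic, payload: received.append((topic, payload)))

broker.publish("cleanroom/pm25", "12.4")  # delivered to one subscriber
broker.publish("cleanroom/temp", "21.7")  # no subscriber, so nothing is delivered
print(received)  # [('cleanroom/pm25', '12.4')]
```

Because publishers and subscribers only know the broker and the topic name, sensors and applications can be added or replaced independently, which is also why the broker itself becomes the single point of failure.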
In addition, M2M technology facilitates work by enabling real‐time responses on
complicated provider networks, such as those found in factories. Real‐time command and
control with crucial technologies adds functions and advantages to supply chain
optimization and automation. As a result, use cases should be evaluated via the standard
oneM2M technology with real‐time command and control [21]. Furthermore, this technology must
be incorporated into the current protocol standards. Additionally, oneM2M complies with
the international M2M and IoT standards with the goal of creating a single M2M service
layer, as shown in Figure 3. It would enable the integration of a wide range of hardware,
software, and countless devices from around the world into a serviceable system combining
M2M‐related fields of business, including telematics, smart transportation, health care,
utilities, industrial automation, and smart home applications.
Figure 3. The oneM2M architecture model.
The oneM2M model is a decentralized design that is relatively easy to modify, as
shown in Figure 3. It is constructed from connected nodes with diverse capabilities. The
device component of the IoT, or any logical hardware and software service, can be defined
as an application entity (AE) in this architecture.
The oneM2M service core, the IoT gateway, and the AE application service are all
managed by the infrastructure node (IN), which is typically set up on a cloud platform
or server. The IN is in charge of the middle layer region, which contains several middle
nodes (MNs) that serve IoT service layers and AE application services. In most cases, MNs
are created in the IoT gateway. Application service nodes (ASNs), which host lightweight
common service layers and AE application services, are utilized in remote M2M‐based IoT
systems. In small or constrained IoT device systems, application dedicated nodes (ADNs)
are used to offer sensor monitoring and information return [22].
For visualization of the data collected from the sensor, the authors used a oneM2M
browser application. This application shows how the sensor data (temperature, humidity,
PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10) are stored in the MySQL database. As
can be seen in Figure 4, the green blocks titled cin represent updated sensor data received
every second.
Figure 4. Real‐time sensor data acquisition visualization using oneM2M browser application.
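As an illustration of how a cin (contentInstance) resource such as those in Figure 4 could be read through the oneM2M REST interface, the sketch below builds an HTTP request for the latest instance of a container and parses a JSON response. The host, port, resource names, originator ID, and payload layout are all assumptions made for this example; only the `la` virtual resource and the `X-M2M-Origin` and `X-M2M-RI` headers come from the oneM2M HTTP binding.

```python
import json
import urllib.request


def build_latest_cin_request(base_url: str, ae: str, container: str,
                             origin: str, request_id: str) -> urllib.request.Request:
    """Build (but do not send) a oneM2M RETRIEVE for the newest contentInstance."""
    # "la" is the oneM2M virtual resource that resolves to the latest instance.
    url = f"{base_url}/{ae}/{container}/la"
    headers = {
        "Accept": "application/json",
        "X-M2M-Origin": origin,   # originator identifier
        "X-M2M-RI": request_id,   # request identifier
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_cin_content(body: str):
    """Extract the 'con' attribute from a JSON-serialized contentInstance."""
    return json.loads(body)["m2m:cin"]["con"]


# Hypothetical host, resource names, and payload layout.
request = build_latest_cin_request(
    "http://localhost:7579/Mobius", "cleanroom_sensor", "pm25", "S_reader", "req-0001")
print(request.get_method(), request.full_url)

sample_body = '{"m2m:cin": {"rn": "cin-1", "con": {"pm25": 12.4}}}'
print(parse_cin_content(sample_body))  # {'pm25': 12.4}
```

Passing the request object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a server.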
3. Methodology
3.1. Multi‐Dense Layer BiLSTM
The BiLSTM is a variant of the general LSTM [23]. By processing the incoming data
sequences from two directions with two independent LSTMs, we utilized the advantages
of both prior and future contexts. The LSTM takes a variable‐length sequence
$x = (x_1, x_2, \ldots, x_n)$ as its general input, where $x_i \in \mathbb{R}^d$ and d denotes
the number of features at each time index i. The LSTM preserves its internal hidden state h
at each time index, resulting in a hidden sequence $h_1, h_2, \ldots, h_n$. At time index t,
the hidden vector $h_t$ is updated as follows:
$$i_t = \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right) \quad (1)$$
$$f_t = \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right) \quad (2)$$
$$c_t = f_t \otimes c_{t-1} + i_t \otimes \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \quad (3)$$
$$o_t = \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right) \quad (4)$$
$$h_t = o_t \otimes \tanh\left(c_t\right) \quad (5)$$
where c, σ, and ⊗ denote the cell vector, the sigmoid function, and element‐wise
multiplication; i, f, and o refer to the input, forget, and output gates, respectively; and
W and b denote the weight matrices and bias vectors.
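The update in Equations (1)–(5) can be traced with a small plain‐Python sketch of a single LSTM step. The parameter names (`W_xi`, `W_hi`, `b_i`, and so on) and the toy dimensions are chosen for this illustration only; a practical implementation would use a deep learning library instead.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x: Matrix, x: Vector, W_h: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W_x x + W_h h + b, one output row at a time."""
    return [
        sum(w * v for w, v in zip(row_x, x)) + sum(w * v for w, v in zip(row_h, h)) + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new hidden state and cell state."""
    def gate(name: str, activation) -> Vector:
        pre = affine(p["W_x" + name], x, p["W_h" + name], h_prev, p["b_" + name])
        return [activation(z) for z in pre]

    i = gate("i", sigmoid)        # Equation (1), input gate
    f = gate("f", sigmoid)        # Equation (2), forget gate
    g = gate("c", math.tanh)      # candidate term inside Equation (3)
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    o = gate("o", sigmoid)        # Equation (4), output gate
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]  # Equation (5)
    return h, c


# Toy setup: 2 input features, 1 hidden unit, all parameters zero.
hidden, features = 1, 2
zeros = {}
for name in "ifco":
    zeros["W_x" + name] = [[0.0] * features for _ in range(hidden)]
    zeros["W_h" + name] = [[0.0] * hidden for _ in range(hidden)]
    zeros["b_" + name] = [0.0] * hidden

h, c = lstm_step([0.3, 0.7], [0.0], [1.0], zeros)
print(c, h)  # with zero parameters every gate equals 0.5, so c == [0.5]
```

With all parameters at zero, each gate evaluates to σ(0) = 0.5 and the candidate term to tanh(0) = 0, so the cell state is simply halved, which makes the roles of the forget and input gates easy to check by hand.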
Figure 5 depicts the proposed Multi‐Dense Layer BiLSTM model for predicting
PM2.5 concentrations. The algorithm takes the PM2.5 concentration data from the raw
data, which also contain temperature, humidity, PM0.3, PM0.5, PM1, PM5, and PM10
concentrations. Their values are then normalized into the range of 0 to 1. The processed
dataset is fed into the model for training, and the trained model is then used to forecast
PM2.5 levels.
Figure 5. Proposed architecture of Multi‐Dense Layer BiLSTM model.
The BiLSTM layer is made up of two LSTM layers: a forward layer and a backward
layer. The forward layer $\overrightarrow{l}$ reads the input in ascending order, i.e.,
t = 1, 2, 3, ..., T. The backward layer $\overleftarrow{l}$, on the other hand, reads the
input in descending order, i.e., t = T, ..., 3, 2, 1. As a result, $\overrightarrow{l}$ and
$\overleftarrow{l}$ can be combined to generate the output $y_t$. Because
they use the same backpropagation through time (BPTT) training mechanism as LSTM
networks, BiLSTMs are computationally inexpensive [24].
The backward LSTM layer output sequence $\overleftarrow{l}$ is calculated using reversed
inputs from time t‐2 to t‐n, in the same way as the forward LSTM layer output sequence
$\overrightarrow{l}$. These output sequences are then passed into a function that combines
them into an output vector $y_t$. The final output of a BiLSTM layer can be represented as a
vector, $Y = [y_{t-n}, \ldots, y_{t-1}]$, where the last element, $y_{t-1}$, is the estimated
PM2.5 concentration for the following iteration, similar to the LSTM layer [25].
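All the constructed LSTM networks in this study make use of the bidirectional feature.
The mathematical equations constituting the BiLSTM model are as follows:
$$\overrightarrow{l}_t = \tanh\left(W_{x\overrightarrow{l}}\, x_t + W_{\overrightarrow{l}\overrightarrow{l}}\, \overrightarrow{l}_{t-1} + b_{\overrightarrow{l}}\right) \quad (6)$$
$$\overleftarrow{l}_t = \tanh\left(W_{x\overleftarrow{l}}\, x_t + W_{\overleftarrow{l}\overleftarrow{l}}\, \overleftarrow{l}_{t+1} + b_{\overleftarrow{l}}\right) \quad (7)$$
$$y_t = W_{\overrightarrow{l}y}\, \overrightarrow{l}_t + W_{\overleftarrow{l}y}\, \overleftarrow{l}_t + b_y \quad (8)$$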
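The forward and backward passes of Equations (6)–(8) can be sketched with scalar states as follows. This is a simplified illustration, not the trained model: each direction uses a plain tanh recurrence as written in the equations, and all weight values are hypothetical.

```python
import math
from typing import List, Tuple

Params = Tuple[float, float, float]  # (input weight, recurrent weight, bias)


def run_direction(xs: List[float], params: Params) -> List[float]:
    """Apply l_t = tanh(w_x * x_t + w_l * l_prev + b) along the given order."""
    w_x, w_l, b = params
    states, state = [], 0.0
    for x in xs:
        state = math.tanh(w_x * x + w_l * state + b)
        states.append(state)
    return states


def bidirectional_pass(xs: List[float], forward: Params, backward: Params,
                       w_fy: float, w_by: float, b_y: float) -> List[float]:
    """Combine a forward pass (t = 1..T) and a backward pass (t = T..1)."""
    fwd = run_direction(xs, forward)                    # Equation (6)
    bwd = run_direction(list(reversed(xs)), backward)   # Equation (7)
    bwd.reverse()                                       # realign with time order
    return [w_fy * f + w_by * b + b_y for f, b in zip(fwd, bwd)]  # Equation (8)


# Hypothetical weights, shared by both directions for this demonstration.
shared = (0.8, 0.5, 0.1)
series = [0.1, 0.5, 0.9]
ys = bidirectional_pass(series, shared, shared, w_fy=0.6, w_by=0.6, b_y=0.0)
ys_reversed_input = bidirectional_pass(series[::-1], shared, shared, 0.6, 0.6, 0.0)
print(ys)
print(ys_reversed_input)  # same values as ys, in reverse order
```

When both directions share the same weights, reversing the input simply reverses the output, which shows that each output element depends on the samples before and after it, not only on the past.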
After the BiLSTM layer has processed the data, it is sent to a multi‐dense layer with
a linear activation function to give continuous value predictions. The dense layer is a
fully connected layer, i.e., all neurons in one layer are connected to those in the
following layer [26].
There are two dense layers used in this proposed architecture. In a neural network, a
dense layer is one that is tightly coupled to the layer before it. That is, every neuron in the
dense layer is connected to every neuron in the preceding layer. This is the most often used
layer in artificial neural networks [27]. The authors used two units in the first
dense layer and one unit in the second dense layer. All units in the dense layers use
the sigmoid activation function. In this study, adding more layers to the dense section
is expected to increase the network's robustness [28].
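The size of the described network can be estimated with a short calculation. The sketch below counts trainable parameters for 64 BiLSTM nodes followed by dense layers of two units and one unit. It assumes eight input features and that the forward and backward outputs are concatenated before the dense layers; neither detail is stated explicitly in this paper.

```python
def lstm_params(units: int, input_dim: int) -> int:
    """Trainable parameters of one LSTM layer: three gates plus the cell candidate,
    each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * input_dim + units * units + units)


def dense_params(units: int, input_dim: int) -> int:
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return units * input_dim + units


n_features = 8      # assumption: all eight dataset variables are used as inputs
bilstm_units = 64   # BiLSTM nodes per direction

bilstm = 2 * lstm_params(bilstm_units, n_features)  # forward + backward LSTM
dense_1 = dense_params(2, 2 * bilstm_units)         # assumes concatenated outputs
dense_2 = dense_params(1, 2)

total = bilstm + dense_1 + dense_2
print(bilstm, dense_1, dense_2, total)  # 37376 258 3 37637
```

Under these assumptions the two dense layers add only 261 parameters on top of the BiLSTM layer, so the second dense layer has a negligible effect on model size.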
3.2. Sigmoid Activation Function
The sigmoid is a non‐linear activation function frequently employed in feedforward
neural networks. It is a bounded, differentiable real function with positive derivatives
everywhere and a certain amount of smoothness, defined for real input values. The
sigmoid function is given by the relationship:
$$f(x) = \frac{1}{1 + \exp(-x)} \quad (9)$$
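The statement that the sigmoid has a positive derivative everywhere can be checked directly. Starting from $f(x) = 1/(1 + e^{-x})$, differentiation gives:

```latex
\[
f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
      = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
      = f(x)\,\bigl(1 - f(x)\bigr) > 0
      \quad \text{for all } x \in \mathbb{R},
\]
```

since $0 < f(x) < 1$. The derivative reaches its maximum of 1/4 at x = 0 and approaches zero for large |x|, which is the bounded, smooth behavior described above.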
The sigmoid function is found in the output layers of deep learning (DL) architectures,
and it is used to predict probability‐based outputs. It has been successfully employed
in binary classification problems, logistic regression modeling tasks, and other
neural network domains. The key advantages of sigmoid functions are that they are simple
to understand and that they are commonly utilized in shallow networks [29].
The sigmoid activation function was selected for this paper since it is well suited to
tasks that need a continuous‐valued output, such as PM2.5 concentration [30].
4. Experimental Setup
The proposed Multi‐Dense Layer BiLSTM model is used to predict PM2.5 concentrations
and can be implemented in the cleanroom of the semiconductor factory. The
TensorFlow Keras library was used to implement the proposed system design.
4.1. Dataset and Preprocessing
The dataset contains 259,200 data points from 24 September to 26 September 2021.
Temperature, humidity, PM0.3, PM0.5, PM1, PM2.5, PM5, and PM10 are the eight
variables listed in Table 1. In this experiment, the algorithm includes linear interpolation,
which is employed to fill in any missing values. Linear interpolation
produces the best results for all percentages of missing values when compared to other
methods, such as the mean method [31]. The authors found no missing data in the raw data,
which indicates an advantage of using the oneM2M system to gather the data [32].
We utilize Equation (10) to normalize the data before inputting it into the proposed method
[33]:
$$x_{normalized} = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (10)$$
where $x_{min}$ denotes the minimum value of the data, $x_{max}$ is the maximum value, and
x is the original data. It is critical to create supervised time series data. The input
matrices and output matrices are shown below in their various configurations.
For 1 h ahead prediction: [Input matrix and output matrix; entries lost in extraction]
For 2 h ahead prediction: [Input matrix and output matrix; entries lost in extraction]
For 3 h ahead prediction: [Input matrix and output matrix; entries lost in extraction]
Table 1. Variables contained in the dataset.
Categories                Input Variables   Unit
Temperature               TEMP              °C
Humidity                  HUMID             %RH
Air pollutant variables   PM0.3             μg/m3
Air pollutant variables   PM0.5             μg/m3
Air pollutant variables   PM1               μg/m3
Air pollutant variables   PM2.5             μg/m3
Air pollutant variables   PM5               μg/m3
Air pollutant variables   PM10              μg/m3
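The preprocessing steps of this section (linear interpolation of missing values, min–max normalization as in Equation (10), and conversion to supervised input/output pairs) can be sketched as follows. The function names, the window length, and the toy series are hypothetical. With one sample per second, a 1 h horizon would correspond to 3600 steps; the example uses much smaller numbers so that it can be checked by hand.

```python
from typing import List, Optional, Sequence, Tuple


def interpolate_linear(values: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps linearly between the nearest known neighbours."""
    known = [k for k, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = []
    for k, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((j for j in known if j < k), default=None)
        right = min((j for j in known if j > k), default=None)
        if left is None:        # leading gap: copy the first known value
            filled.append(float(values[right]))
        elif right is None:     # trailing gap: copy the last known value
            filled.append(float(values[left]))
        else:
            weight = (k - left) / (right - left)
            filled.append(values[left] + weight * (values[right] - values[left]))
    return filled


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values into [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_supervised(series: Sequence[float], window: int,
                    horizon: int) -> Tuple[List[List[float]], List[float]]:
    """Pair each window of past samples with the value `horizon` steps after it."""
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


raw = [2.0, None, 6.0, 4.0, 8.0, 10.0]      # hypothetical readings with one gap
scaled = min_max_normalize(interpolate_linear(raw))
X, y = make_supervised(scaled, window=2, horizon=2)
print(scaled)  # [0.0, 0.25, 0.5, 0.25, 0.75, 1.0]
print(X, y)    # [[0.0, 0.25], [0.25, 0.5], [0.5, 0.25]] [0.25, 0.75, 1.0]
```

In practice the minimum and maximum used for scaling would usually be taken from the training portion only, so that no information from the test period leaks into training; the sketch skips this for brevity.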
4.2. Hyperparameters Setting
In this study, the authors evaluated the sequence learning models for short‐ and long‐
term predictions; experiments were conducted for different time scales: one hour,
two hours, and three hours ahead. Table 2 lists the hyperparameter settings for our
proposed model, Multi‐Dense Layer BiLSTM, and for the compared models. The model's
hyperparameters were chosen to achieve the best results. To check the consistency of the
results, the authors used three epoch values: 20, 35, and 50. The data were split into 80%
for training and 20% for testing because this is empirically the best partition into
training and testing sets [34].
Table 2. The list of hyperparameter values for the Multi‐Dense Layer BiLSTM method and the compared models.
Hyperparameter                RNN           LSTM            CNN‐LSTM         Single‐Dense Layer BiLSTM   Multi‐Dense Layer BiLSTM
Model nodes                   2 RNN nodes   64 LSTM nodes   128 LSTM nodes   64 BiLSTM nodes             64 BiLSTM nodes
Epoch                         20/35/50      20/35/50        20/35/50         20/35/50                    20/35/50
Batch size                    16            64              16               16                          16
Interpolate method            linear        linear          N/A              linear                      linear
Train data (% dataset)        64%           64%             64%              80%                         80%
Validation data (% dataset)   16%           16%             16%              N/A                         N/A
Test data (% dataset)         20%           20%             20%              20%                         20%
Optimizer                     ADAM          SGD             ADAM             ADAM                        ADAM
Activation                    Linear        Linear          ReLU             Linear                      Sigmoid
Learning rate                 0.01          0.01            0.001            0.001                       0.001
Dense layer                   N/A           N/A             3                1                           2
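The partition percentages in Table 2 can be related to sample counts with a short calculation. The sketch below assumes a chronological split in which the test set is taken from the end of the series and the validation set from the end of the remaining data; the paper states only the percentages, so the ordering is an assumption.

```python
from typing import Tuple


def chronological_split(n_samples: int, test_fraction: float = 0.2,
                        validation_fraction: float = 0.2) -> Tuple[int, int, int]:
    """Return (train, validation, test) sizes for an ordered time series.

    validation_fraction is applied to the data that remains after the test
    set has been removed, so 0.2 of 80% gives 16% of the whole dataset.
    """
    n_test = round(n_samples * test_fraction)
    n_fit = n_samples - n_test
    n_val = round(n_fit * validation_fraction)
    return n_fit - n_val, n_val, n_test


print(chronological_split(259_200))                           # (165888, 41472, 51840)
print(chronological_split(259_200, validation_fraction=0.0))  # (207360, 0, 51840)
```

The first call reproduces the 64%/16%/20% partition and the second the 80%/20% partition, which shows that the two settings differ only in whether part of the training data is held out for validation.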
4.3. Performance Criteria
We predicted PM2.5 concentrations for the three‐day dataset. The experiments
used three metrics to assess the efficacy of the proposed Multi‐Dense Layer BiLSTM model:
mean square error (MSE) (11), mean absolute error (MAE) (12), and mean absolute
percentage error (MAPE) (13):
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \quad (11)$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (12)$$
$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \quad (13)$$
where $y_i$ refers to the real value, $\hat{y}_i$ refers to the forecasted value, and n
expresses the sample size. Higher forecasting accuracy is associated with lower MSE, MAE,
and MAPE values [35].
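The three error metrics can be computed with a few lines of plain Python. The sketch below follows the standard definitions of MSE, MAE, and MAPE (with MAPE expressed in percent); the sample values are hypothetical and chosen so the results can be verified by hand.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [1.0, 2.0, 4.0]       # hypothetical measured values
predicted = [1.0, 1.0, 2.0]    # hypothetical forecasts

print(round(mse(actual, predicted), 4))   # 1.6667
print(round(mae(actual, predicted), 4))   # 1.0
print(round(mape(actual, predicted), 4))  # 33.3333
```

MAPE divides by the actual value, so it is undefined when a measurement is exactly zero; this matters for normalized data, where the minimum of the series maps to zero.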
5. Result and Discussion
The proposed Multi‐Dense Layer BiLSTM model was used to predict PM2.5 concentrations
one hour, two hours, and three hours ahead on the three‐day observation, totaling 259,200
s. To test the validity of predictions and support preventive maintenance over a longer
timeframe, we set the forecast horizon to 1 h, 2 h, and 3 h.
The proposed Multi‐Dense Layer BiLSTM retains low MSE, MAE, and MAPE levels at
varied sampling rates, which supports its forecasting accuracy. Table 3 shows that
when the number of epochs in the RNN, LSTM, CNN‐LSTM, and Single‐Dense Layer BiLSTM
algorithms increased, so did the MSE, MAE, and MAPE values, indicating that these
approaches were overfitting. The MSE, MAE, and MAPE values of our model, on the
other hand, dropped as the prediction time length increased. These findings indicate
that, compared to the other models, our proposed model achieves the highest precision when
predicting PM2.5 concentrations 1 h, 2 h, and 3 h ahead.
Table 3. The best results from all compared models for PM2.5 prediction with 20, 35, and 50 epochs.
                            Prediction   MSE                             MAE                             MAPE
Model                       Time Length  20 epoch  35 epoch  50 epoch    20 epoch  35 epoch  50 epoch    20 epoch  35 epoch  50 epoch
RNN                         1h           0.1072    0.1141    0.1001      0.2415    0.2371    0.2199      29.8106   31.0264   27.7334
RNN                         2h           0.1012    0.1138    0.1165      0.1778    0.1986    0.2097      23.89     27.4217   29.8114
RNN                         3h           0.0833    0.0908    0.072       0.2501    0.2337    0.2132      54.1121   56.2735   52.1554
LSTM                        1h           0.0058    0.0048    0.0045      0.0626    0.0548    0.055       11.4597   8.8899    9.1937
LSTM                        2h           0.0187    0.0771    0.0217      0.125     0.1985    0.1325      25.617    68.8899   27.6512
LSTM                        3h           0.0619    0.0724    0.066       0.1703    0.2023    0.1765      60.5662   67.6347   63.2913
CNN‐LSTM                    1h           5.838     3.907     3.69        2.305     1.706     1.676       6.061     3.613     4.332
CNN‐LSTM                    2h           3.871     3.992     3.871       1.659     1.739     1.659       4.864     4.948     4.864
CNN‐LSTM                    3h           8.434     8.434     8.434       2.598     2.598     2.598       7.64      7.64      7.64
Single‐Dense Layer BiLSTM   1h           0.0016    0.0017    0.0016      0.0029    0.0034    0.3266      0.3385    0.3434    0.3047
Single‐Dense Layer BiLSTM   2h           0.0015    0.0027    0.0014      0.318     0.0042    0.3105      0.3046    0.4193    0.2814
Single‐Dense Layer BiLSTM   3h           0.0053    0.0047    0.0063      0.0067    0.0064    0.0073      0.673     0.6433    0.7322
Multi‐Dense Layer BiLSTM    1h           0.0009    0.0014    0.0011      0.0023    0.0028    0.0027      0.2258    0.2849    0.2701
Multi‐Dense Layer BiLSTM    2h           0.0014    0.0008    0.0009      0.0027    0.002     0.0021      0.2713    0.1991    0.2058
Multi‐Dense Layer BiLSTM    3h           0.001     0.0018    0.0006      0.0022    0.0035    0.0019      0.223     0.3507    0.1873
Figure 6a–f displays the results of predicting PM2.5 concentration one hour ahead using
the Multi‐Dense Layer BiLSTM algorithm. Across the three epoch settings, the best
agreement between training and validation loss appears before the 10th epoch. The blue line
in Figure 6a,c,e, plotted on a log scale, indicates the training loss. The orange line,
which reflects the validation loss, follows the same trend, resulting in the proposed model
having the lowest MSE, MAE, and MAPE in 1 h ahead prediction, as reported in Table 3. In
Figure 6b,d,f, the test data and the prediction data are closely aligned. These outputs
demonstrate that the proposed model has the best fit for predicting PM2.5 concentrations.
Compared to the other four models, the predictions of the proposed model were the closest
to the actual values.
Figure 6. Result of loss function using the proposed model for 1 h ahead prediction: (a,b) 20 epochs; (c,d) 35 epochs; (e,f) 50 epochs.
The loss function patterns of the proposed model when estimating PM2.5 concentration for
the following two hours are shown in Appendix A, Figure A1a–f. In Figure A1e in particular,
the blue and orange lines come close together at the 7th epoch and then move further apart
until the 50th epoch, while still following broadly the same trend. The test data and
prediction data have practically merged in Figure A1b,d,f. These findings suggest that the
proposed model is the most accurate in predicting PM2.5 concentrations 2 h ahead, as
reported in Table 3.
Similar to the previous experiments for one hour and two hours ahead, Figure A2a–f
depicts the fitting patterns of the three hours ahead prediction, for which the proposed
model also attained the best fit. All models were run for 20, 35, and 50 epochs; the
predictions of our proposed model, represented by the orange line, came closest
to the actual values, represented by the blue line. Notably, in Figure A2e, the
validation loss shows a spike at the 15th epoch and then continues to follow the same trend
as the training loss until the 50th epoch.
When the number of epochs is increased from 20 to 35, and then to 50, the MSE,
MAE, and MAPE values of the proposed model tend to decrease to the lowest error
compared to the other methods. These results demonstrate that Multi‐Dense Layer BiLSTM
achieves a better fit than the other four models. The authors took a sample to combine
the fitting trends from all models for one hour ahead prediction, with each model run for
20 epochs, as shown in Figure 7.
In the zoomed view of Figure 7, the authors took sample lines from all compared models
from 25 to 35 seconds, indicating that the proposed model, represented by the orange line,
is closest to the real data, represented by the blue line. The proposed model
and the Single‐Dense Layer BiLSTM, represented by the green line, lie close together. This
result shows that the proposed model, which uses multi‐dense layers, can improve
network stability [28].
From Figure 7, we can also see that the CNN‐LSTM model, represented by the red
line, is furthest from the real data. This result is consistent with the values reported
in Table 3. This is understandable given that CNN‐LSTM is essentially slower due
to its operations and necessitates a lengthy process [36].
Figure 7. Comparison of all models to predict PM2.5 concentrations for one hour ahead.
6. Conclusions
PM2.5 concentrations can have a significant impact on the product quality of a
semiconductor plant. Therefore, a robust framework is required to monitor, analyze, and
predict air quality with additional visualization services. It is imperative to develop an
accurate prediction method to ensure awareness regarding the prospective air quality in the
cleanrooms among the personnel working at semiconductor manufacturing sites.
In this study, we built PM monitoring infrastructure using an industrial‐grade sensor
to meet the quality and compatibility standards of the industrial sector. In particular,
open‐source software platforms compliant with oneM2M standard technology were used
to provide a standardized approach to access the obtained PM datasets, allowing us to
construct globally applicable and access‐independent PM apps using oneM2M‐defined
REST APIs.
Furthermore, for one hour, two hours, and three hours forecasts, the proposed
technique, Multi‐Dense Layer BiLSTM, was shown to have the lowest errors in terms of MSE,
MAE, and MAPE as compared to the RNN, LSTM, CNN‐LSTM, and Single‐Dense Layer
BiLSTM models. The findings can also be used by policymakers in semiconductor
factories to control the air quality of the cleanroom using HVAC based on this proposed
model of PM2.5 prediction.
Despite the excellent results, there are a few difficulties that we would like to address
in future work, such as the large memory and computing time required by our model in
the case of big datasets.
To further improve the prediction, cumulative airborne characteristics of the
cleanroom must be analyzed. Further steps include constructing integrated predictive,
preventive, and prescriptive maintenance based on the PM prediction with suitable control
function services in cleanroom semiconductor manufacturing.
Author Contributions: Conceptualization, A.T.P.; methodology, A.T.P.; software, A.T.P. and
I.B.K.Y.U.; validation, A.T.P. and I.B.K.Y.U.; formal analysis, A.T.P.; investigation, A.T.P.; resources,
A.T.P.; data curation, A.T.P.; writing—original draft preparation, A.T.P.; writing—review and edit‐
ing, A.T.P.; visualization, A.T.P. and I.B.K.Y.U.; supervision, Y.M.J. All authors have read and
agreed to the published version of the manuscript.
Funding: This research was financially supported by the Ministry of Trade, Industry and Energy
(MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International
Cooperative R&D program (Project ID: P0011880).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author. The data are not publicly available due to further research on processing.
Conflicts of Interest: The authors declare no conflict of interest.
Appl. Sci. 2022, 12, 2260
Appendix A
[Figure A1: six panels, (a) to (f)]
Figure A1. Loss‐function results of the proposed model for 2 h ahead prediction: (a,b) 20 epochs;
(c,d) 35 epochs; (e,f) 50 epochs.
[Figure A2: six panels, (a) to (f)]
Figure A2. Prediction accuracy of the proposed model for 3 h ahead prediction: (a,b) 20 epochs;
(c,d) 35 epochs; (e,f) 50 epochs.
Abbreviations
ADAM          Adaptive Moment Estimation
ADN           Application Dedicated Node
AE            Application Entity
ANN           Artificial Neural Network
API           Application Programming Interface
ASN           Application Service Node
BiLSTM        Bidirectional Long Short‐Term Memory
BPTT          Backpropagation Through Time
CMAQ          Community Multiscale Air Quality
CNN‐LSTM      Convolutional Neural Network‐Long Short‐Term Memory
CONVLSTM2D    Convolutional Long Short‐Term Memory Two‐Dimensional
CPS           Cyber‐Physical System
DL            Deep Learning
IN            Infrastructure Node
HVAC          Heating, Ventilation, and Air Conditioning
IoT           Internet of Things
JSON          JavaScript Object Notation
LSTM          Long Short‐Term Memory
M2M           Machine to Machine
MAE           Mean Absolute Error
MAPE          Mean Absolute Percentage Error
ML            Machine Learning
MN            Middle Node
MQTT          Message Queue Telemetry Transport
MSE           Mean Square Error
PM            Particulate Matter
PM0.3         Particulate Matter of 0.3 μm
PM0.5         Particulate Matter of 0.5 μm
PM1.0         Particulate Matter of 1.0 μm
PM2.5         Particulate Matter of 2.5 μm
PM5           Particulate Matter of 5 μm
PM10          Particulate Matter of 10 μm
RDBMS         Relational Database Management System
REST          Representational State Transfer
RNN           Recurrent Neural Network
RTU           Remote Terminal Unit
References
1.
2.
3.
4.
Park, S.H.; Kim, S.; Baek, J.G. Kernel‐Density‐Based Particle Defect Management for Semiconductor Manufacturing Facilities.
Appl. Sci. (Switz. ) 2018, 8, 224. https://doi.org/10.3390/app8020224.
Choi, K.‐M. Airborne PM2.5 Characteristics in Semiconductor Manufacturing Facilities. AIMS Environ. Sci. 2018, 5, 216–228.
https://doi.org/10.3934/environsci.2018.3.216.
Prihatno, A.T.; Nurcahyanto, H.; Ahmed, M.F.; Rahman, M.H.; Alam, M.M.; Jang, Y.M. Forecasting Pm2.5 Concentration Using
a Single‐Dense Layer Bilstm Method. Electron. (Switz. ) 2021, 10, 1808. https://doi.org/10.3390/electronics10151808.
Wali, F.; Knotter, D.M.; Kuper, F.G. Impact OfNano Particles on Semiconductor Manufacturing. In Proceedings of the 2008 IEEE
International Conference on Multi Topi, Karachi, Pakistan, 23–24 December 2008; pp. 97–99.
5.
The International Technology Roadmap for Semiconductors. The International Technology Roadmap for
Semiconductors 2.0. In Proceedings of the 2015 Edition Yield Enhancement; IEEE, Chip Design, Solid State Technology:
Virtual Conferences, 2015; p. 3.
6.
Chang‐Hoi, H.; Park, I.; Oh, H.R.; Gim, H.J.; Hur, S.K.; Kim, J.; Choi, D.R. Development of a PM2.5 Prediction Model Using a
Recurrent Neural Network Algorithm for the Seoul Metropolitan Area, Republic of Korea. Atmos. Environ. 2021, 245, 118021.
https://doi.org/10.1016/j.atmosenv.2020.118021.
7.
Tsai, Y.T.; Zeng, Y.R.; Chang, Y.S. Air Pollution Forecasting using RNN with LSTM. In Proceedings of the 2018
IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence
and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology
Congress(DASC/PiCom/DataCom/CyberSciTech); IEEE: Athens, Greece, 2018; pp. 1068–1073.
8.
Park, J.; Chang, S. A Particulate Matter Concentration Prediction Model Based on Long Short‐Term Memory and an Artificial
Neural Network. Int. J. Environ. Res. Public Health 2021, 18, 6801. https://doi.org/10.3390/ijerph18136801.
Huang, C.J.; Kuo, P.H. A Deep Cnn‐Lstm Model for Particulate Matter (Pm2.5) Forecasting in Smart Cities. Sens. (Switz. ) 2018,
18, 2220. https://doi.org/10.3390/s18072220.
Li, T.; Hua, M.; Wu, X.U. A Hybrid CNN‐LSTM Model for Forecasting.
3 IEEE Access 2020, 26933–26940.
Seong, N. Deep Spatiotemporal Attention Network for Fine Particle Matter 2.5 Concentration Prediction with Causality
Analysis. IEEE Access 2021, 9, 73230–73239. https://doi.org/10.1109/ACCESS.2021.3080828.
Castelli, M.; Clemente, F.M.; Popovič, A.; Silva, S.; Vanneschi, L. A Machine Learning Approach to Predict Air Quality in
California. Complexity 2020, 2020. https://doi.org/10.1155/2020/8049504.
Zhang, B.; Zhang, H.; Zhao, G.; Lian, J. Constructing a PM2.5 Concentration Prediction Model by Combining Auto‐Encoder
with Bi‐LSTM Neural Networks. Environ. Model. Softw. 2020, 124, 104600. https://doi.org/10.1016/j.envsoft.2019.104600.
Wu, J.; Tian, K.; Dong, Q.; Sun, L.; Zhang, L.; Liu, X. A Low Voltage Low Power Adaptive Transceiver for Twisted‐Pair Cable
Communication. IEEE Trans. Nucl. Sci. 2015, 62, 3140–3147. https://doi.org/10.1109/TNS.2015.2480596.
Seneca The Advantages of ModBUS RTU Protocol Available online: https://blog.seneca.it/en/the‐advantages‐of‐modbus‐rtu‐
protocol/ (accessed on 31 December 2021).
9.
10.
11.
12.
13.
14.
15.
Prihatno, A.T. Artificial Intelligence Platform Based for Smart Factory. In Proceedings of the Korea Artificial
Intelligence Conference, South Korea. 16 ‐ 18 December; KAIC, Korean Artificial Intelligence Conference: Online
Conference, 2020; pp. 1–2.
17. Ken Figueredo. IEEE Communications Standards Magazine Volume: 4, Issues: 2. June 2020, pp. 10–11.
16.
18.
19.
20.
21.
22.
oneM2M Partners Benefits of OneM2M Available online: https://www.onem2m.org/using‐onem2m/what‐is‐onem2m (accessed
on 12 November 2021).
Yun, J.; Woo, J. IoT‐Enabled Particulate Matter Monitoring and Forecasting Method Based on Cluster Analysis. IEEE Internet
Things J. 2021, 8, 7380–7393. https://doi.org/10.1109/JIOT.2020.3038862.
Prihatno, A.T.; Nurcahyanto, H.; Jang, Y.M. Smart Factory Based on IoT Platform. KIC Summer Conf. 2020, 2–4.
https://doi.org/10.3390/MACHINES6020023.Thalesgroup.
Zhao, R.; Wang, L.; Zhang, X.; Zhang, Y.; Wang, L.; Peng, H. A OneM2M‐Compliant Stacked Middleware Promoting IoT
Research and Development. IEEE Access 2018, 6, 63546–63559. https://doi.org/10.1109/ACCESS.2018.2876197.
Xu, S.S.D.; Chen, C.H.; Chang, T.C. Design of OneM2M‐Based Fog Computing Architecture. IEEE Internet Things J. 2019, 6,
9464–9474. https://doi.org/10.1109/JIOT.2019.2929118.
Appl. Sci. 2022, 12, 2260
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
17 of 17
Shabanian, S.; Arpit, D.; Trischler, A.; Bengio, Y. Variational Bi‐LSTMs. arXiv 2017, arXiv:1711.05717.
Shah, S.R. Bin; Chadha, G.S.; Schwung, A.; Ding, S.X. A Sequence‐to‐Sequence Approach for Remaining Useful Lifetime
Estimation Using Attention‐Augmented
Bidirectional
LSTM. Intell. Syst.
Appl. 2021,
10–11, 200049.
https://doi.org/10.1016/j.iswa.2021.200049.
Li, Y.H.; Harfiya, L.N.; Purwandari, K.; Lin, Y. Der Real‐Time Cuffless Continuous Blood Pressure Estimation Using Deep
Learning Model. Sens. (Switz. ) 2020, 20, 5606. https://doi.org/10.3390/s20195606.
Rampurawala,
M.
Classification
with
TensorFlow
and
Dense
Neural
Networks
Available
online:
https://heartbeat.fritz.ai/classification‐with‐tensorflow‐and‐dense‐neural‐networks‐8299327a818a (accessed on 1 June 2021).
Verma, Y. A Complete Understanding of Dense Layers in Neural Networks Available online: https://analyticsindiamag.com/a‐
complete‐understanding‐of‐dense‐layers‐in‐neural‐networks/]. (accessed on 3 December 2021).
Islam, M.N.; Sulaiman, N.; Farid, F. Al; Uddin, J.; Alyami, S.A.; Rashid, M.; Majeed, A.P.P.A.; Moni, M.A. Diagnosis of Hearing
Deficiency Using EEG Based AEP Signals: CWT and Improved‐VGG16 Pipeline. PeerJ Comput. Sci. 2021, 7, 1–28.
https://doi.org/10.7717/peerj‐cs.638.
Nwankpa, C.; Ijomah, W.; Gachagan, A.; Marshall, S. Activation Functions: Comparison of Trends in Practice and Research for
Deep Learning. arXiv 2018, arXiv:1811.03378.
Narayan, S. The Generalized Sigmoid Activation Function: Competitive Supervised Learning. Inf. Sci. 1997, 99, 69–82.
https://doi.org/10.1016/S0020‐0255(96)00200‐9.
Noor, N.M.; Al Bakri Abdullah, M.M.; Yahaya, A.S.; Ramli, N.A. Comparison of Linear Interpolation Method and Mean Method
to Replace the Missing Values in Environmental Data Set. Mater. Sci. Forum 2015, 803, 278–281.
https://doi.org/10.4028/www.scientific.net/MSF.803.278.
Alaya, B.; Medjiah, S.; Monteil, T.; Drira, K.; Khalil, D. Towards Semantic Data Interoper‐Ability in OneM2M Standard. IEEE
Commun. Mag. Inst. Electr. Electron. Eng. 2015, 53, 35–41.
Gao, X.; Li, W. A Graph‐Based LSTM Model for PM2.5 Forecasting. Atmos. Pollut. Res. 2021, 12, 101150.
https://doi.org/10.1016/j.apr.2021.101150.
Gholamy, A.; Kreinovich, V.; Kosheleva, O. Why 70/30 or 80/20 Relation Between Training and Testing Sets : A
Pedagogical Explanation. Dep. Tech. Reports 2018, 2, 1–6.
VIJAYSINH LENDAVE A Guide to Different Evaluation Metrics for Time Series Forecasting Models Available online:
https://analyticsindiamag.com/a‐guide‐to‐different‐evaluation‐metrics‐for‐time‐series‐forecasting‐models/ (accessed on 31
December 2021).
36. Sandeep Bhuiya Disadvantages of CNN Models Available online: https://iq.opengenus.org/disadvantages‐of‐cnn/ (accessed on
31 December 2021).