Predictive Models Building

Alexandria Engineering Journal 79 (2023) 480–501

journal homepage: www.elsevier.com/locate/aej
Original Article
An AI-driven model for predicting and optimizing energy-efficient building envelopes
Luong Duc Long
Faculty of Civil Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, Viet Nam
Vietnam National University Ho Chi Minh City, Linh Trung Ward, Thu Duc District, Ho Chi Minh City, Viet Nam
ARTICLE INFO

Keywords: Building energy efficiency; Building envelope; Optimization model; Machine learning algorithms; AI optimization algorithms

ABSTRACT

Unlike many previous studies that often focus on optimizing energy efficiency for buildings when detailed design drawings are available, this paper introduces a newly integrated model for energy-efficient building envelope design in the early stages (when detailed design drawings are not yet available). The newly developed model includes three main components: a simulation model, a predictive model, and an optimization model. The simulation model simulates the building's energy performance, considering different values for various envelope parameters. The predictive model employs machine learning algorithms, including RF, ANN, DNN, SVM, GENLIN, and GB (of which GB has been identified as the most suitable algorithm), achieving a very high R² (0.994) in assessing energy consumption. The optimization model, which uses AI optimization algorithms (such as NSGA II, DSE, and MOPSO), integrates the machine learning predictive model into the evaluation function during the evolutionary process, efficiently searching for Pareto-optimal building envelope solutions. Results show simultaneous savings in cost and energy, with savings of 7.52 % in cost and 8.48 % in energy, or 21.17 % in cost and 0.4 % in energy, for a case study in Vietnam. This model establishes a foundation by providing design solutions for stakeholders to assess, and can incorporate additional objectives at later stages.
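The two savings pairs reported in the abstract can be read as two points on a Pareto front: neither is better than the other on both objectives. The following is a minimal sketch of that non-dominance check, not the paper's optimization code. The function names and the third design point are hypothetical and introduced only for illustration; only the two (cost saving, energy saving) pairs come from the abstract.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every objective and
    strictly better on at least one (larger saving is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# (cost saving %, energy saving %) as reported in the abstract.
solution_a = (7.52, 8.48)
solution_b = (21.17, 0.4)
# Hypothetical third design, invented here to show a dominated point.
hypothetical_c = (5.0, 0.3)

front = pareto_front([solution_a, solution_b, hypothetical_c])
print(front)
```

In this sketch the two reported solutions survive together because each wins on one objective, while the invented third point is dropped because solution_a beats it on both.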
E-mail address: luongduclong@hcmut.edu.vn.
https://doi.org/10.1016/j.aej.2023.08.041
Received 10 June 2023; Received in revised form 19 July 2023; Accepted 13 August 2023
Available online 19 August 2023
1110-0168/© 2023 THE AUTHOR. Published by Elsevier BV on behalf of Faculty of Engineering, Alexandria University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Urbanization has led to higher energy use in buildings worldwide, contributing to CO2 emissions and global warming [1]. Construction contributes nearly 30 % to global energy consumption [2]. Specifically, commercial and residential buildings in the US consume about 40 % of the industry's total energy [3,4]. With urbanization growth, there is an increased demand for energy in buildings. In Vietnam, the construction sector consumes around 30 % of total energy and contributes to 35 % of national CO2 emissions [5]. Despite high energy consumption, there is potential for energy savings in buildings. Prioritizing efficient energy use in building design can reduce energy consumption, costs, and CO2 emissions [6].

Many factors affect energy savings in buildings, including building characteristics, weather, service systems, and occupant behavior [7,8]. Particularly, the building envelope is crucial as it impacts how a building responds to external conditions [9]. Thus, optimizing surface and envelope designs during the design process is key to future energy consumption reduction, especially given the current energy crisis and rising energy costs globally [10–16].

The rise of Building Energy Modeling (BEM) tools has significantly changed how architects and constructors create energy-efficient building designs [17]. These tools allow the simulation and analysis of energy usage in buildings, leading to better-informed design decisions [18]. However, challenges remain, including the discrepancy between the optimized solutions and the initial 3D model, especially when it comes to aesthetic and architectural considerations.

The typical design process for building projects involves architects/engineers completing design scenarios based on their subjective experiences. Next, they receive performance feedback through building simulations from specialized engineers/experts and then modify the scenarios based on the received feedback. However, this repetitive process often results in low optimization efficiency and makes it challenging to realize an optimal design scenario [19].

Due to the challenges encountered during the early design stage, including limited information, uncertainty, a wide range of potential design solutions, intricate parameter interactions, rapid design changes, and the consideration of multiple performance criteria, evaluating design alternatives often requires the use of various commercial simulation programs. The integration of these energy simulation programs
with an optimization model can pose significant challenges [20,21]. As a result, there have been optimization studies conducted for the early stages of a project [19–22].

In 2015, Negendahl et al. proposed an optimization method using a multi-objective optimization algorithm (such as the SPEA2 algorithm) combined with quasi-steady-state (QSS) methods to determine the stable state of a system for energy and indoor environment evaluations, employing Radiance for daylight simulations [23]. The model demonstrated its ability to support the optimization of building energy consumption during the early design stages. However, the specialized Termite plugin for Grasshopper would be difficult to integrate with architectural software such as Revit BIM. Additionally, relying solely on the SPEA2 evolutionary algorithm may not provide the best solution for multiple cases. Later, in 2016, Østergard et al. developed a decision support model that encompasses the following components: knowledge database, baseline model, sampling, simulation runs, statistical analysis, and visualization [20]. However, the model did not provide a Pareto-optimal solution set for multi-objective optimization and proved challenging to use due to the requirement for statistical analysis.

Zhang et al. (2019) developed a parametric energy optimization process using Rhino and Grasshopper software to establish the relationship between design parameters and energy performance [19]. According to their study, implementing the parametric energy optimization method during the early design stage of residential projects is anticipated to reduce energy consumption by 10 %–20 %. However, the model employed multiple specialized and complex simulation programs, which may pose challenges for users. Additionally, the optimization algorithm did not account for a multi-objective Pareto set.

From the above studies, it can be observed that the optimization process tends to be difficult when building energy performance simulation requires a lot of time (e.g., when many objective functions and variables are considered) [24]. To ease the integration of energy simulation programs with the optimization model, which can pose significant challenges as mentioned in previous studies [25], a good approach is to use a predictive model as a link between the simulation program and the optimization model, aiming to create a robust integrated model for forecasting and optimization processes. Therefore, the conceptual model that adopts this approach will consist of three main components: a simulation program, a predictive model, and an optimization model. Following this approach, Naihua Yue (2021) innovatively combined the Nondominated Sorting Genetic Algorithm-II (NSGA-II) with the Multilayer Perceptron Artificial Neural Network (MLPANN) metamodel, which was trained using simulation results from EnergyPlus and Eppy [25]. The optimization results of the study cases indicated that reductions were achieved not only in the normalized objectives but also in the sub-objectives. However, using only a single combination of algorithms, such as NSGA-II and MLPANN, may not yield optimal results for different cases. Therefore, the idea is to incorporate a range of machine learning forecasting algorithms in the predictive model, and a variety of evolutionary algorithms, to leverage the unique advantages of each algorithm considered in this study.

The motivation for this research is as follows. When designing energy-saving solutions related to the building envelope for construction projects, a majority of previous studies have delved extensively into detailed design (where the design documents are complete and thorough), concurrently implementing energy studies using specialized energy simulation software on information models with a high level of detail, hence providing potentially highly accurate outcomes compared to the subsequent reality. However, this approach often encounters significant difficulties when various stakeholders interact to modify the envelope according to diverse opinions to meet the requirements of different disciplines (such as architecture, aesthetics, energy, costs, etc.), due to the considerable time, costs, and effort invested in modifying the information model. Consequently, identifying an appropriate initial solution for the building envelope that not only conserves energy but also accommodates various criteria from architects and engineers at various stages of consideration during the design process is essential in forming and executing effective building construction projects [21,22]. From this perspective, an appropriate approach is to consider the selection of energy-saving solutions for the building envelope early in the preliminary design stage, enabling the involved parties to collaboratively select the envelope solutions.

The problem statements addressed in this study can be described at both micro and macro levels. At the macro level, the study aims to address the challenges faced during the design process of energy-efficient building envelopes. Traditionally, most previous energy studies have been conducted at later stages of the design process when the design documents are complete and detailed. However, such approaches often lead to difficulties when various stakeholders need to modify the envelope design to meet different requirements, including architectural aesthetics, energy efficiency, cost-effectiveness, and other disciplines. The process of modifying the detailed design models can be time-consuming, costly, and inefficient. At the micro level, the study aims to address the challenges faced in the integration of energy simulation programs with the optimization model, which can pose significant challenges as mentioned in previous studies (e.g., uncertain variables for modeling, multiple changes due to project stakeholders, time-consuming model re-runs, complexity for users during the model utilization phase, inability to evaluate/forecast energy without sufficient detailed conditions, and difficulties in integrating specialized energy simulation software with AI-based optimization programming…) [20,21,25]. Hence, this study acknowledges the importance of selecting optimal building envelope solutions (by creating a predictive model for building envelope solutions that is fast, accurate, and easily integrated with energy simulation programs and the optimization model) that fulfill diverse criteria from different stakeholders at the initial stages. This enables collaborative decision-making among stakeholders, thereby preventing unfavorable alterations and ensuring the effectiveness of building construction projects.

To address this problem statement, it is necessary to develop an integrated model capable of integrating energy simulation programs with the optimization model to predict and optimize design choices for the building envelope during the early design stage. Such a model will consist of three components: a simulation model, a predictive model, and an optimization model. The simulation model will simulate the energy performance of the building, considering different values for various building envelope parameters. The predictive model will need to rapidly and accurately evaluate different initial choices based on the state of initial design documents, even without complete detailed designs. Furthermore, this predictive model will need to be integrated into a modern optimization model to generate an optimal set of envelope design solutions with different objectives.

Therefore, the objective of this research is to develop a newly integrated model that includes three interconnected components (an energy analysis simulation model for multiple stakeholders such as architects and engineers, a predictive model using modern machine learning algorithms, and an optimization model using new AI algorithms) to predict and optimize energy consumption for various building envelope design options in the early stages of a project. This model aims to assist the involved parties in selecting an appropriate set of energy-efficient and cost-effective solutions. Subsequently, stakeholders will consider architectural aesthetics, sustainability, and other objectives to choose the most well-balanced solution, thereby minimizing the need for changes during the later stages of project implementation.

The contribution of this research is to create a powerful tool that provides high-speed and accurate forecasting of the energy consumption level of a building in the early design phase, without the need for detailed designs. Furthermore, the study has integrated this tool into an optimization model that concurrently employs three modern
evolutionary algorithms to generate a set of multi-objective optimal solutions (for objectives such as building envelope costs and energy consumption levels). This model will aid relevant stakeholders such as investors, architects, MEP engineers, and structural engineers in making informed decisions when integrating solutions with other criteria like architecture, aesthetics, and safety levels in the broader context of a construction project. In detail, the current study explores the capabilities of various machine learning algorithms such as ANN, DNN, SVM, GENLIN, and GB with appropriately tuned parameters for building energy prediction, and integrates the predictive capabilities of these algorithms with typical evolutionary optimization algorithms (NSGA II, MDE, MOPSO) for optimizing building envelope design while considering multiple objectives such as energy consumption and associated costs.

The rest of the paper is organized as follows. The second section presents the Research Overview, which includes two main aspects: building energy consumption forecasting and building energy consumption optimization. The third section outlines the Research Methodology, which consists of three parts: energy analysis simulation in DesignBuilder, the sub-model (SM1) for predicting energy consumption, and the optimization model with energy efficiency objectives. Results and Discussion are presented in the following section, providing an analysis of the findings. Finally, the last section concludes the current study, summarizing the key points and offering several concluding remarks.

2. Research overview

2.1. Building energy consumption forecasting

Designing energy-efficient buildings necessitates predictive models for energy consumption. These models guide energy policies and strategy decisions but pose challenges due to the complexity and nonlinearity of dependent variables, such as building characteristics and user behaviors [26]. Recent advances in artificial intelligence (AI) have led to models that can analyze past data and adapt to environmental factors, successfully capturing complex nonlinear relationships in historical data and resulting in accurate estimations of building energy performance [27,28].

The Artificial Neural Network (ANN) is a widely used AI technique in predicting building energy consumption. Wong et al. [29] developed an ANN model for daily electricity usage prediction in office buildings that demonstrated high predictive performance, while Hamzaçebi [30] proposed an ANN model that showed superior accuracy in predicting Turkey's net electricity consumption compared to traditional methods. These studies highlight ANN's ability to identify complex nonlinear relationships between variables, although ANN struggles to adapt to varying building components or systems.

In addition to ANN, Support Vector Machine (SVM) has also been validated as one of the most powerful data mining techniques [31]. In 2005, Dong et al. [32] used SVM to predict energy consumption in buildings. The data analysis, based on average monthly electricity consumption collected by utilities, showed good forecasting effectiveness with a small percentage error of about 4.00 %. Zhong et al. [33] developed a new SVR model with high accuracy and generalization capability to predict energy consumption in buildings. Overall, SVR has the advantage of effectively addressing nonlinear problems with high accuracy. However, the SVR method also poses challenges in determining the parameters that optimize the SVM model.

With the advancement of powerful computer configurations, there has been a significant increase in the use of deep learning for predicting building energy consumption. Mocanu et al. [34] applied deep learning to predict the electricity consumption of individual households within buildings, by developing a deep learning model based on relevant data. Meanwhile, Li et al. [35] demonstrated the high predictive efficacy of the "deep extreme learning" method through the evaluation of energy use scenarios in buildings. To work with large datasets of energy consumption and related variables, Berriel et al. [36] presented a DNN model for predicting monthly energy consumption. These findings demonstrate that utilizing DNN for learning building features can greatly enhance the accuracy of energy consumption predictions.

In addition to ANN, SVM, and DNN, several other successful algorithms have been deployed to predict energy consumption with high accuracy. Traditional Linear Regression (LR) models have been widely used to estimate energy performance in buildings. Hygh et al. [37] forecasted and evaluated energy performance in the early design stages using traditional multivariate regression. More recently, with the help of unique computer sampling, Tsanas and Xifara [38] demonstrated that the Random Forest (RF) technique can accurately predict heating and cooling loads in residential buildings with low mean absolute errors. Moreover, powerful boosting-based machine learning techniques such as Gradient Boosting have been used for both prediction and classification problems [39]. João Sauer et al. [40] developed an eXtreme Gradient Boosting (XGBoost) model with appropriately determined hyperparameters to predict heating and cooling loads in residential buildings. The results, including RMSE and MAE, demonstrated that the XGBoost model achieved high accuracy. This indicates that the XGBoost algorithm is a highly promising tool, offering effectiveness, stability, and reliability for energy forecasting in buildings. Recently, Alshboul et al. developed a machine learning-based model that utilizes gene expression programming (GEP) algorithms for predicting shear strength [41]. The study explored the application of GEP algorithms to enhance the efficiency of determining shear strength in slender reinforced concrete beams without stirrups (SRCB-WS). This method overcomes the main limitation of using Artificial Neural Networks (ANN), which is the absence of a closed-form solution for estimating shear strength. Unlike ANN models that only provide solution algorithms, the GEP model offers the advantage of capturing the intricate relationships among critical variables, thereby improving prediction accuracy. However, the implementation of the GEP model requires more complex procedures, making it challenging to apply in different fields.

The aforementioned studies have demonstrated the suitability and practicality of artificial intelligence models, run on personal computers, in estimating the energy consumption of buildings. However, the most appropriate artificial intelligence technique may vary depending on the data structure, specific conditions, and unique context of each project. Therefore, this paper will implement a variety of AI techniques to forecast the energy consumption of buildings related to the building envelope. Subsequently, these algorithms will be compared to select the most suitable AI algorithm based on accuracy, computational efficiency, and execution time for use in the proposed forecasting and optimization model.

2.2. Building energy consumption optimization

Energy optimization in construction projects involves implementing a myriad of strategies spanning efficient design, smart systems and equipment selection, operation and maintenance practices, occupant behavior adaptation, and renewable energy integration. Modern energy-efficient design principles are integral to effective optimization, considering factors like building orientation, window-to-wall ratios, envelope insulation materials, efficient HVAC systems, and IoT-connected smart control systems.

Studies have used energy simulation methods with software like DesignBuilder and EnergyPlus, selecting optimal scenarios from generated combinations. Ferrara et al. [42] applied dynamic energy simulation software for a residential building's energy-optimized design. However, these methods require lengthy computations and detailed information models with high-level parameters [43].

With AI evolution, another approach uses evolutionary optimization algorithms. In a previous study, Tuhus-Dubrow et al. [44] combined Genetic Algorithms (GA) with EnergyPlus to determine optimal parameters for building envelopes in residential structures and
demonstrated the superiority of the method when optimizing more than ten parameters. Ascione et al. utilized GA to achieve a well-balanced optimization in terms of energy performance, environmental impact, and economic aspects in building design [45].

Using multi-objective evolutionary optimization techniques, Azari et al. (2016) investigated options for optimizing the building envelope with objectives related to energy usage and life cycle environmental impacts in a Seattle office building [46]. Hosamo et al. developed a computer program utilizing the NSGA II optimization algorithm to optimize various elements in the building, such as walls, roofs, floors, and HVAC systems, aiming for energy-efficient utilization [47]. Grygierek et al. (2018) presented an optimization model using the Non-dominated Sorting Genetic Algorithm II (NSGA-II) coupled with the EnergyPlus building performance simulation program to optimize design parameters [48]. Yang et al. (2017) [49] proposed a multi-objective optimal model (MOPBEM) that aims to minimize envelope construction cost, minimize envelope energy performance, and maximize the window opening rate. Similarly, Wang et al. [43] proposed an optimization model based on a quantum genetic algorithm to optimize office building envelope options, including walls, windows, glass curtain walls, and the number of windows. The objective of their model is to minimize construction costs while meeting the desired energy conservation requirements.

In addition to the GA algorithm, the PSO algorithm has been employed to optimize building envelope designs. Raponean et al. focused on enhancing energy efficiency in office buildings by utilizing the Particle Swarm Optimization (PSO) algorithm to optimize window variables [10]. Ferrara et al. proposed the Energy Demand and Supply Simultaneous Optimization (EDeSSOpt) method, which is based on the PSO algorithm, to optimize the design of a single-family house in Italy [50].

Furthermore, numerous studies have explored the effectiveness of various optimization algorithms, in addition to GA and PSO, for building envelope optimization. Yao et al. developed a multi-objective optimization model by integrating Grasshopper, EnergyPlus, Daysim, and Octopus. They applied the SPEA-II algorithm (improved strength Pareto evolutionary algorithm) to generate optimal solution sets for optimizing the building envelope of rural residences in cold climate zones in China [51]. He and Zhang et al. used the improved epsilon-constraint method to optimize the design of the building envelope for public buildings, considering the trade-off between energy consumption and investment costs. Their research incorporated architectural and social factors, generating multiple effective design scenarios that reduced energy consumption and investment costs [52].

Recently, in 2023, Elsheikh et al. developed a multi-objective genetic algorithm model for Egypt's major climates: Mediterranean, semi-arid, and arid regions. The model considers design variables such as wall type, roof type, window-to-wall ratio (WWR), building orientation, HVAC system settings, and operation schedule [53]. The model directly conducts EnergyPlus simulations, followed by the application of the GA optimization code. The model shows promising results in providing optimal design solutions to minimize energy consumption, life cycle cost (LCC), and thermal discomfort hours for building envelopes in these climates. However, it requires a detailed BIM model and extensive computational effort, making it less suitable for initial project stages with frequent adjustments to the BIM model. Additionally, relying solely on the NSGA-II algorithm may not capture all potential Pareto solutions, as no single algorithm is universally suitable.

These studies highlight the use of evolutionary algorithms, such as PSO and GA, as well as other intelligent algorithms, to optimize various aspects of building design. These optimization methods can consider multiple objectives and constraints, leading to energy-efficient and cost-effective solutions. The findings from these studies provide valuable information and recommendations for achieving sustainable building designs.

It is important to note that the aforementioned studies, which aim to optimize energy consumption in buildings, often require detailed building parameters and involve heavy computational workloads. This is because these studies rely on performing energy analysis simulations multiple times using specialized simulation software like WBES, TRNSYS, EnergyPlus, and similar tools. These models have long computation times and require detailed input of building parameters, which can be inconvenient during the early stages of the design process. Additionally, energy objective calculations in these models often require numerous repeated energy simulation processes, posing challenges in handling variable parameters and integrating diverse objectives into the optimization process. Moreover, previous research suggests that finding a single algorithm capable of efficiently achieving superior optimization for all cases of the envelope optimization problem is highly challenging.

Due to the difficulties in the early design stage, such as the large design space for potential solutions, complex interactions among parameters, and the need to consider multiple performance criteria, the evaluation of design alternatives often involves using different commercial simulation programs. Connecting these simulation programs with an optimization model can be challenging [21]. As a result, optimization studies have been conducted that attempt to address the early stages of a project.

In addition to the optimization studies for the building envelope in the early stages mentioned in the Introduction, in 2020 Zahra et al. employed a comprehensive approach that combines parametric modeling, building performance simulation, and a genetic algorithm for multi-objective optimization. The main objectives of this approach include solar radiation, usable space within the building, and the shape coefficient. To calculate solar radiation, the researchers utilized Ladybug, a Grasshopper plugin [22]. This method, leveraging parametric modeling, successfully identified Pareto-optimal points and incorporated these findings during the early stages of building schematic design. However, the use of fixed building parameters, such as fixed exterior walls (with a U value of 0.45), fixed exterior roof and interior floor materials (with a U value of 1.449), and glazing type (with a U value of 0.67), can restrict the extent of design modifications during the early design phase of a project. Consequently, making changes to these values during the detailed design stage may require significant time and effort for re-optimization. Moreover, in this study, the dimensions and Window-to-Wall Ratio (WWR) in four directions were considered as variables. Furthermore, there is a lack of innovative machine learning algorithms that can accelerate predictions and simplify the task of building energy analysis for non-experts.

Recently, in 2022, Elbeltagi et al. [21] introduced an optimization model for sustainable building design in the early stages. The model proposes an integrated optimization approach that includes parametric energy simulation, artificial neural networks, and genetic algorithms. The proposed optimization model considers a single objective function to optimize the design, specifically focusing on minimizing energy consumption. The research results provided a promising solution for reducing energy consumption in residential buildings during the early design stage. However, this study only considers a single objective and does not address multiple objectives that are essential for the early design stage, such as cost-related objectives concerning energy targets. Additionally, the study solely utilizes artificial neural networks (ANN) without exploring other advanced machine learning algorithms that could enhance the predictive accuracy. Moreover, the use of the classical genetic algorithm (GA) with the VB language may not explore the optimal solutions comprehensively.

Based on the aforementioned observations and perspectives, the proposed research in this paper introduces a novel approach to overcome the mentioned challenges. This approach integrates an Artificial Intelligence (AI)-based optimization model that utilizes various machine learning algorithms for evaluating energy consumption. Additionally, various AI optimization algorithms are employed to search for the optimal solutions. The objective of the proposed model is to optimize
various design factors, such as window types, window-to-wall ratios, roof materials, and wall thicknesses/types, to address envelope design concerns in building design, starting from the early design stage of the project.

3. Research methodology

3.1. Energy analysis simulation in DesignBuilder

Simulation models can be built directly in DesignBuilder software or, during the design process, can take advantage of information models built by the architecture, structural, and MEP disciplines. Geolocation parameters are assigned and weather stations are selected to provide data for analysis. The energy model is then exported with GBxml cloud and imported into DesignBuilder software. The designer selects the parameters to be calculated, assigns the input parameter values, and runs the simulation to get the results.
The purpose of these simulations in DesignBuilder is to analyze the energy consumption per square meter, E (kWh/m2/Year), for different values of the envelope elements of a building. These simulations aim to assess how different design choices impact the energy efficiency of the building, providing insights into the optimal configurations for minimizing energy consumption.

3.2. The sub-model (SM1) for predicting energy consumption

The research methodology encompasses data collection, machine learning techniques, model training and testing, evaluation metrics, comparative analysis, and result interpretation, enabling the identification of the most accurate and efficient model for predicting preliminary energy consumption in buildings, such as:

3.2.1. Data collection
The research methodology employed in this study involves collecting data from simulations conducted on the energy simulation software, DesignBuilder. The building's information model (BIM) was created within the software, incorporating various envelope parameters such as wall type, glass type, window-to-wall ratio, and orientation. These parameters were systematically varied to generate a diverse dataset for training and testing the machine learning models. The simulated data included information on the building's geometry, materials, and weather conditions.

3.2.2. Machine learning algorithms
In this research, a range of algorithms from the field of computational intelligence were employed to predict the preliminary energy consumption of a building. The selected algorithms included SVM, ANN, Generalized Linear Regression (GENLIN), DNN, Random Forest (RF), and Gradient Boosting (GB). The predictive models corresponding to the different algorithms were implemented as computer programs written in Python with the appropriate libraries and frameworks.
Although there are many machine learning algorithms used for forecasting, this study selected specific machine learning algorithms (GB, RF, ANN, DNN, SVM, GENLIN) for energy prediction in the early design phase, for several main reasons:
- GBoost (GB) [54] is a gradient tree boosting algorithm that belongs to the ensemble learning group. This technique constructs multiple weak models with low complexity and combines their results using various methods to achieve more accurate final results. In other words, the sequence of models improves prediction results by compensating for the loss of the previous model. GB is well-known for its ability to handle large datasets and provide accurate predictions. GB can help create a powerful and stable energy prediction model while minimizing overfitting.
- Random Forest (RF) uses a set of independent decision trees to generate predictions. RF performs well with heterogeneous and complexly varying data, increasing the reliability of energy forecasting.
- Artificial Neural Network (ANN) is a network of interconnected nodes inspired by the structure of biological neural systems. ANN has the ability to learn and synthesize non-linear information, capturing complex relationships within energy data. This enhances the accuracy and reliability of the forecasts.
- Deep Neural Network (DNN) is a powerful variant of ANN with multiple hidden layers. DNN has the capability to learn more complex models and handle higher complexity data. Using DNN in energy forecasting can provide higher accuracy and reliability.
- Support Vector Machine (SVM) is a popular supervised learning algorithm widely used for classification and prediction tasks. SVM can handle non-linear patterns and produce accurate predictions for energy data.
- Generalized Linear Model (GENLIN) is a generalization of the linear model. With flexibility and the ability to estimate non-homogeneous distributions, GENLIN can increase the reliability of energy prediction.

As each algorithm has its own advantages, the choice of specific algorithms depends on the characteristics of the energy data and the goals of the forecasting application. Therefore, the model will incorporate multiple algorithms to select the best approach in order to enhance the credibility and effectiveness of the energy forecasting process.

3.2.3. Generalized linear model (GENLIN)
GENLIN is a statistical technique that uses historical case data for regression analysis [55]. The generalized linear model establishes a correlation between the independent variables ($X_i$) and the dependent variable ($Y$) as shown in Equation (1):

$$Y = \sum_{i=1}^{n} X_i \cdot a_i + b \quad (1)$$

where b represents.

3.2.4. Artificial neural networks (ANNs)
ANNs are advanced tools designed to simulate the neural system of humans, aiming to improve prediction accuracy. This approach has the capability to analyze historical data and generate predictions accordingly. MLPRegressor is one of the popular models in the ANN model class. MLPRegressor is a feedforward neural network with multiple hidden layers and one output layer. Each layer in MLPRegressor is connected to the next layer through weighted connections. The output of an MLPRegressor with 1 input layer, 2 hidden layers, and 1 output layer is calculated using Equation (2):

$$y = \varphi\left(V \cdot \varphi\left(W_2 \cdot \varphi\left(X \cdot W_1 + b_1\right) + b_2\right) + c\right) \quad (2)$$

where: $X$ represents the input data. $W_1$ denotes the weights of the first hidden layer. $b_1$ represents the biases of the first hidden layer. $W_2$ signifies the weights of the second hidden layer. $b_2$ corresponds to the biases of the second hidden layer. $V$ symbolizes the weights of the output layer. $c$ stands for the biases of the output layer. $\varphi$ represents the activation function used in the network [56].

3.2.5. Support vector machine (SVM)
SVM is a supervised machine learning technique. It proves to be beneficial for tackling multivariate regression and classification problems. Its regression variant, SVR, is commonly used for prediction problems, with the general model represented in Equation (3):

$$y = \sum_{i=1}^{n} \left(\alpha_i \cdot K(x_i, x)\right) + b \quad (3)$$
where: $y$ is the predicted value, $x$ is the input vector, $x_i$ is the training data vector, $\alpha_i$ is the Lagrange coefficient, $K(x_i, x)$ is the kernel function that computes the similarity between $x_i$ and $x$, and $b$ is the bias value. The key advantage of the kernel trick is that it allows SVMs to implicitly work with higher-dimensional feature spaces while operating in the original input space [57].

3.2.6. Deep neural networks (DNNs)
DNNs represent a more advanced iteration of ANNs, distinguished by their increased depth, which refers to having a greater number of hidden layers positioned between the input and output layers. DNNs provide significant benefits in capturing diverse features across multiple hierarchical levels, enabling them to learn from complex datasets. The interconnected layers of nodes and neurons in DNNs employ activation functions to establish intricate non-linear connections between input and output variables. This connection can be expressed mathematically as follows:

$$Y_{kn} = f_k\left(W_{n,m,k} \cdot X_m + b_{i,k}\right) \quad (4)$$

where $f_k()$ represents the different activation functions for layer $k$. $k$ signifies the layer number, $n$ stands for the number of neurons, $m$ indicates the weight of each transferring neuron, and $i$ denotes the number of bias nodes.

3.2.7. Gradient boosting (GB)
GB is a powerful machine learning technique based on the concept of boosting, widely used in both prediction and classification problems. The GB model works by building a sequence of weak models (often decision trees), with each model trying to correct the errors of the previous one [39]. Gradient Boosting operates through a series of updates, with each step trying to reduce the loss function by moving in the direction opposite to the gradient of the loss function (hence the name Gradient Boosting). The basic formula for Gradient Boosting can be written as:

$$F_t(x) = F_{t-1}(x) + \lambda \cdot h_t(x) \quad (5)$$

where $h_t(x)$ is the weak model trained at stage $t$ to predict the residuals between the current prediction and the actual value, and $\lambda$ is the learning rate.

3.3. Training and testing for the predictive model

The data obtained from the DesignBuilder simulations was split into training and testing sets. The training set was utilized to train the machine learning models by providing them with input features (envelope parameters) and their corresponding target variable (energy consumption). Through the training process, the models acquired knowledge of the underlying patterns and relationships within the data.
Following the training phase, the different models were evaluated using the testing dataset. Their performance was assessed using various evaluation metrics, including the mean absolute percentage error (MAPE), the mean square error (MSE), the root mean square error (RMSE), and the coefficient of determination ($R^2$), as shown in Eqs. (6)–(9), where $AE_i$, $PE_i$, and $\bar{E}$ represent the actual/observed energy, predicted energy, and mean energy, respectively.

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \frac{|AE_i - PE_i|}{AE_i} \quad (6)$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (AE_i - PE_i)^2 \quad (7)$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (AE_i - PE_i)^2} \quad (8)$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (AE_i - PE_i)^2}{\sum_{i=1}^{n} (AE_i - \bar{E})^2} \quad (9)$$

3.4. Comparative analysis & result interpretation

After evaluating the individual machine learning models, a comparative analysis was conducted to determine the best-performing model for predicting the preliminary energy consumption of a building. The performance of each model was assessed, and the reasons behind the superior performance of the selected model were discussed.
By evaluating the energy consumption for different values of these variables, the sub-model (SM1) will be used to determine the energy objective function in the optimization process within the main model. The sub-model plays a crucial role in quantifying the energy performance of different design options and establishing the objective function that guides the optimization process. It enables the main model to effectively search for the optimal combination of design variables that minimizes energy consumption while considering other relevant constraints and objectives.

3.5. Optimization model with energy efficiency objectives

3.5.1. Optimization model formation
The proposed method utilizes the decision variables, objective functions, and constraints described below to capture the problem's characteristics and relationships. The purpose is to formulate a robust model capable of identifying optimal solutions for the given optimization problem.

3.5.1.1. Decision variables. Let Sw represent a pre-defined set of choices for wall types, consisting of n1 elements, denoted as $S_w = \{(\text{Option wall}_1, UvW_1, C_{w1}), \cdots, (\text{Option wall}_{n1}, UvW_{n1}, C_{wn1})\}$, where Option wall represents the available options for wall types, UvW represents the U value of the wall, and Cw represents the unit price for the wall (USD/m2).
Similarly, let Sr denote a pre-defined set of choices for roof types, comprising n2 elements, represented as $S_r = \{(\text{Option roof}_1, UvR_1, C_{r1}), \cdots, (\text{Option roof}_{n2}, UvR_{n2}, C_{rn2})\}$, where Option roof represents the available options for roof types, UvR represents the U value of the roof, and Cr represents the unit price for the roof (USD/m2).
Moreover, let Sg denote a pre-defined set of choices for glass types, comprising n3 elements, indicated as $S_g = \{(\text{Option glass}_1, SHGC_1, C_{g1}), \cdots, (\text{Option glass}_{n3}, SHGC_{n3}, C_{gn3})\}$, where Option glass represents the available options for glass types, SHGC represents the Solar Heat Gain Coefficient value of the glass, and Cg represents the unit price for the glass (USD/m2).
Finally, let Swwr represent a pre-defined set comprising n4 elements, defining the window-to-wall ratio for the wall, denoted as $S_{wwr} = \{(\text{Option wwr}_1, WWR_1), \cdots, (\text{Option wwr}_{n4}, WWR_{n4})\}$, where Option wwr represents the available options for the window-to-wall ratio, and WWR represents the corresponding value for the window-to-wall ratio.
The proposed optimization model considers decision variables such as the Option wall chosen from the set Sw, the Option roof selected from the set Sr, the Option glass chosen from the set Sg, and the Option wwr selected from the set Swwr.

3.5.1.2. Objective functions. Energy Consumption Level (ECL): The ECL is employed as a measurement tool to assess energy consumption in buildings. The first objective (EnerObj) aims to minimize the Energy Consumption Level (ECL) as defined by Eq. (10).

$$\text{Objective function 1} = \text{Minimize}(\text{Energy}) = \text{Minimize}(ECL) \quad (10)$$

Where ECL represents the total energy consumption level over a specified period (typically a year) divided by the gross floor area of the
building (or conditioned space). The calculation of ECL is performed using the output from the predictive model, specifically the sub-model (SM1).
Envelope-Related Cost: The building envelope, comprising windows, walls, and roofs, plays a pivotal role in regulating heat transfer and solar radiation, thereby significantly impacting the energy performance of buildings. Calculating the cost of the building envelope becomes an important consideration in optimizing energy performance. Therefore, the second objective is to minimize the cost of the building envelope, as determined by Eq. (11) [43]:

$$\text{Objective function 2} = \text{Minimize}(\text{Cost}) = \text{Minimize}\left(A \cdot C_g + B \cdot C_w + C \cdot C_r\right) \quad (11)$$

Where A, B, and C represent the window area (m2), wall area (m2), and roof area (m2) respectively. Cg, Cw, and Cr are the unit prices (USD/m2) of Glass, Wall, and Roof, respectively.

3.5.1.3. Constraints.

(1) The unit prices (USD/m2) for each material type such as Cg, Cw, and Cr will be determined by the selected Option wall, Option roof, Option glass, and Option wwr.
(2) The values of UvW, UvR, SHGC, and WWR depend on the selected Option wall, Option roof, Option glass, and Option wwr.
(3) The values of UvW, UvR, SHGC, and WWR lie within the range of [0.60–3.03, 0.29–1.923, 0.17–0.9, 20 %–80 %] respectively (these ranges are referenced from [11] and the ANSI/ASHRAE/IES Standard: Energy Standard for Buildings Except Low-Rise Residential Buildings, as well as the LEED standard).

By formulating the model with the above decision variables, objective functions, and constraints, the proposed method aims to create an effective framework for finding optimal solutions to the given optimization problem.

3.5.2. Proposal framework for optimization model for building envelope
The proposed research methodology involves the following components: i) Constructing the building model using an energy simulation model (BIM-Design Builder) to simulate energy performance. ii) Utilizing the energy prediction algorithm based on six machine learning algorithms such as Random Forest, ANN, DNN, GB, GENLIN, and SVM, trained on the dataset provided by BIM-Design Builder. iii) Employing three AI optimization algorithms such as the Non-dominated Sorting Genetic Algorithm (NSGAII), NSDE, and MOPSO to solve the optimization problem.
The reasons for choosing the Non-dominated Sorting Genetic Algorithm II (NSGAII), Differential Evolution Strategy (DES), and Multi-Objective Particle Swarm Optimization (MOPSO) algorithms in this research are their capabilities in optimizing building energy with cost and energy objectives. Specifically:

- NSGAII: This algorithm is chosen for its ability to handle multiple simultaneous optimization objectives - in this case, energy and cost. NSGAII maintains diversity in solutions and keeps a set of well-optimized Pareto solutions, which aligns with our goal of finding multiple energy and cost-efficient solutions rather than a single optimal solution.
- Differential Evolution Strategy (DES): DES is used in this research because of its robust handling of optimization problems in continuous spaces, especially when considering variables such as building costs and energy consumption. DES's ability to handle noisy, nonlinear, and multimodal functions is particularly suitable for energy and cost optimization, where the relationships between variables can be complex and not easily predictable.
- Multi-Objective Particle Swarm Optimization (MOPSO): MOPSO is a variant of the population-based optimization algorithm PSO, tailored to address multi-objective optimization problems. The simplicity of implementation and computational efficiency of PSO, along with fewer adjustable parameters, make it an attractive choice for optimizing both energy and cost. Additionally, the ability to handle conflicting objectives such as energy and cost further enhances the usefulness of MOPSO in this research.

These algorithm choices, therefore, enhance the search capability of the proposed model while evaluating cost and energy objectives, particularly where energy objectives are assessed using an integrated machine learning forecasting model. This is due to their processing capabilities with complex multi-objective optimization tasks, as well as their support in energy prediction based on integrated machine learning models.
The research methodology for predicting and optimizing the energy performance of buildings is illustrated in Fig. 1. The proposed method uses Python to implement the computer program. The energy objective is evaluated by using the best AI machine learning model in the sub-model (SM1) by Eq. (10). The cost objective is calculated by Eq. (11). In the main model, the AI optimization algorithm starts with an initial population and then proceeds to the next generations through the process of selection, crossover, and mutation. The optimal Pareto solutions are the best solutions among the set of solutions that cannot be simultaneously improved in terms of both cost and energy.
Fig. 2 illustrates the research process flow to provide a clear summary of the involved approaches. In this figure, Stage 1 represents the phase of generating and processing energy simulation data for the building. Stage 2 involves creating and comparing energy prediction models to identify the best-performing energy prediction model. Stage 3 entails collecting data for the cost and energy optimization problem, setting up the objective function, decision variables, constraints, and executing the population evolution process using AI optimization algorithms such as NSGA-II, DSE, and MOPSO. After performing the evolutionary process for each algorithm, the results are consolidated to form the final Pareto set.
NSGA-II (Nondominated Sorting Genetic Algorithm II) is a multi-objective optimization algorithm developed based on the principles of the Genetic Algorithm (GA). NSGA-II is used to solve multi-objective optimization problems. The algorithm represents a population of individuals as state vectors. The search process occurs through generations, where the population is evolved to find the best solutions. NSGA-II employs the nondominated sorting technique to determine the goodness of individuals by comparing and sorting them based on objective criteria. The best individuals and the non-dominated individuals are maintained in the population and continue to participate in the evolution process in subsequent generations.
The implementation steps of NSGA-II include:

(1) Initialize the population of individuals.
(2) Evaluate the fitness of individuals using energy and cost objectives.
(3) Perform nondominated sorting to determine the non-dominated individuals.
(4) Select the best individuals and non-dominated individuals to maintain in the population.
(5) Apply crossover and mutation operations to create new offspring individuals.
(6) Repeat steps 2 to 5 until the termination condition is met.
(7) Return the set of optimal solutions found in the final population.

MOPSO (Multi-Objective Particle Swarm Optimization) is a multi-objective optimization algorithm based on the principles of Particle Swarm Optimization (PSO). The main objective of MOPSO is to find
optimal solutions for problems with multiple objectives that need to be simultaneously optimized. The MOPSO algorithm utilizes a population of particles moving in the search space to discover potential solutions. The search process is performed by updating the positions and velocities of particles based on the information of individual particles and the entire population.

Fig. 1. Proposal Framework for Research Methodology.

The implementation steps of MOPSO include:

(1) Initialize the population of particles.
(2) Evaluate the fitness of particles using energy and cost objectives.
(3) Update the positions and velocities of particles based on the particle's and the population's best positions.
(4) Check and update the best particles (Pareto front) and non-dominated individuals.
(5) Repeat steps 2 to 4 until the termination condition is met.
(6) Return the set of optimal solutions found in the final population.

NSDE (Non-dominated Sorting Differential Evolution) is a multi-objective optimization algorithm based on Differential Evolution (DE). NSDE is used to solve multi-objective optimization problems, where multiple objective functions need to be simultaneously optimized. The NSDE algorithm employs a population of individuals to search for the best solutions. This method relies on sorting individuals into fronts based on the non-domination relationship among individuals in the population. The search process is carried out by performing replacement operations, transformations, and selection of individuals in the population to find the best solutions.
The implementation steps of NSDE include:

(1) Initialize the population of individuals.
(2) Evaluate the fitness of individuals using energy and cost objectives.
(3) Perform nondominated sorting to determine the non-dominated fronts.
(4) Select the best individuals from the non-dominated fronts to maintain in the population.
(5) Perform variation and selection operations on individuals in the population to find the best solutions.
(6) Repeat steps 2 to 5 until the termination condition is met.
(7) Return the set of optimal solutions found in the final population.

It's important to highlight that the main difference between NSGA-II and NSDE in these pseudocode examples is in the particular operations performed on the population (or swarm) during the main iterative steps of the algorithms. NSGA-II creates new offspring from the existing population using a mix of crossover and mutation operations. However, NSDE uses variation and selection operations. Typically, this variation in NSDE involves mutation and crossover, but it can also include unique operations specific to Differential Evolution algorithms, such as differential mutation.

3.5.2.1. Termination criteria. The optimization process concludes when specific termination conditions are met, such as reaching the maximum allowed number of generations. In the proposed algorithm, the termination criterion is defined as reaching the specified maximum number of generations. Termination results in a collection of optimal solutions known as the Pareto front. Project planners assess the advantages and disadvantages of each potential solution to identify the optimal choice.

Fig. 2. The research process flow.

3.5.2.2. Ensemble of multiple AI optimization algorithms. The optimization process using NSGA-II, MOPSO, and NSDE algorithms generates separate Pareto front sets. These sets are then merged using an ensemble function to identify the final common Pareto front. The ensemble function takes advantage of the unique strengths of each AI algorithm in exploring the solution space. This approach is a notable strength of the proposed method compared to previous techniques that rely on a single algorithm during the solution search process.
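The merging of several Pareto fronts into one common front can be illustrated with a short sketch. It assumes that each solution is reduced to a (cost, energy) pair with both objectives minimized; the function names and the sample points are hypothetical and are not taken from the authors' program or results.

```python
from typing import Iterable, List, Tuple

Solution = Tuple[float, float]  # (cost, energy), both to be minimized


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def merge_pareto_fronts(fronts: Iterable[List[Solution]]) -> List[Solution]:
    """Pool all fronts, drop duplicates, and keep only non-dominated points."""
    pooled = sorted({s for front in fronts for s in front})
    return [s for s in pooled if not any(dominates(o, s) for o in pooled)]


# Hypothetical (cost in USD, energy in kWh/m2/Year) points, not results from the paper.
nsga2_front = [(100.0, 50.0), (120.0, 45.0)]
mopso_front = [(110.0, 44.0), (130.0, 43.0)]
nsde_front = [(100.0, 50.0), (90.0, 60.0)]

final_front = merge_pareto_fronts([nsga2_front, mopso_front, nsde_front])
print(final_front)
```

In this toy input the point (120.0, 45.0) found by one algorithm is dominated by (110.0, 44.0) found by another, so it is dropped from the common front, which is the kind of cross-algorithm filtering the ensemble step relies on.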
4. Results and discussion
The proposed research method was applied to a case study: a typical
office workspace with an area of 144 m2 and a floor-to-ceiling height of
3.8 m. In the initial design phase of this project, the office was
conceptualized using the BEM - Design Builder information model for
stakeholders to consider the office envelope options (see Fig. 3). The
initial design of this office building was developed based on the
experiences of architects, incorporating relevant parameters related to the
building envelope, as follows: the Coefficient of Performance (COP) of
the air conditioning system is set to 4, the building orientation (BO) is set
to 90, the lighting power density (LPD) is set to 10, and the cooling set
temperature (CST) is set to 26. The thermal transmittance coefficient of
walls (UvW) is 2.14 (W/m2.K), the thermal transmittance coefficient of
the roof (UvR) is 0.93 (W/m2.K), the thermal absorption coefficient of
glass (SHGC) is 0.56, and the Window-to-wall ratio (WWR) is 0.7. The
weather data used in the study is obtained from the Tan Son Hoa station
in Ho Chi Minh City, Vietnam (see Fig. 4).
4.1. Simulation of building energy consumption using DesignBuilder

Fig. 3. The simulation model of energy consumption for the sample building.
The energy information model within the DesignBuilder software is implemented to analyze the energy consumption level per square meter, represented as E (kWh/m2/Year), for various values of UvW, UvR, SHGC, WWR, COP, BO, LPD, and CST. These simulations aim to assess how different design choices impact the energy efficiency of the building, providing insights into the optimal configurations for minimizing energy consumption.

Fig. 4. Weather data at Tan Son Hoa Station, HCMC.

The energy information model can be built directly in DesignBuilder software or, during the design process, can take advantage of information models built by the architecture, structural, and MEP disciplines. Geolocation parameters are assigned and weather stations are selected to provide data for analysis. The energy model is then exported with GBxml cloud and imported into DesignBuilder software. The designer selects the parameters to be calculated, assigns the input parameter values, and runs the simulation to get the results.

4.2. Define design variables to DesignBuilder

Referring to the available studies [47,58–60], the ANSI/ASHRAE/IES Standard (Energy Standard for Buildings Except Low-Rise Residential Buildings), and the LEED standard, the authors have determined the ranges of the parameters used in the DesignBuilder simulation model for this case study, as shown in Table 1 below:
Where: (1) The U-Value is a parameter of the wall that indicates the heat transfer capability through the wall, and represents the rate of heat transfer per unit area per temperature difference between the indoor and outdoor environments of the building envelope. The standard SI/Metric unit is W/m2 K. (2) The Solar Heat Gain Coefficient (SHGC) is the sum of the direct solar transmittance and the secondary heat transfer factor of the glazing towards the inside. (3) The coefficient of performance (COP) is the ratio of useful heating or cooling provided to the work (energy) required.

Fig. 5. Kernel functions of SVM.

After constructing the building information model using the DesignBuilder software and initializing the values of design parameters as presented in Table 1 above, the energy simulation function of DesignBuilder is utilized to execute the case study project. The results obtained from the energy analysis simulation in DesignBuilder provide data on the energy consumption per square meter, represented as E (kWh/m2/Year), for various values of UvW, UvR, SHGC, WWR, COP, BO, LPD, and CST. These simulation results are compiled into a dataset
consisting of 2013 samples, as shown in Table 2. Subsequently, this dataset is used for training and validating machine learning forecasting models in the sub_model in the section below.

Table 1
Values of design parameters used in the model for predicting building energy consumption.
Parameter | Symbol | Unit | Value
The energy efficiency coefficient of the air conditioning system | COP | | 2.6–7.0
Building orientation | BO | Degree (°) | 0–360
Lighting power density | LPD | W/m2 | 7–13
Window-to-wall ratio | WWR | % | 20–80
Thermal transmittance coefficient of walls | UvW (U value wall) | W/(m2.K) | 0.606–3.030
The thermal absorption coefficient of glass | SHGC (SHGC Glass) | | 0.17–0.9
Thermal transmittance coefficient of the roof | UvR (U value Roof) | W/(m2.K) | 0.290–1.923
Cooling setpoint temperature | CST | Temperature (°C) | 24–28

Fig. 6. Activation functions of ANN, DNN.

Dataset: The study utilized the DesignBuilder software (version 6.17.007) to simulate and model various combinations of parameters, resulting in a total of 2013 data samples (as shown in Table 2). The energy simulations were conducted on a laptop computer with an Intel (R) Core(TM) i7-8550U CPU @ 1.80 GHz 1.99 GHz.

Table 2
Simulation results of energy consumption for the dataset in DesignBuilder software.
No | COP | BO (Degree) | LPD (W/m2) | WWR (%) | UvW (W/m2.K) | SHGC | UvR (W/m2.K) | CST (°C) | E (kWh/m2/Year)
1 | 6.3 | 0 | 7 | 26 | 0.741 | 0.78 | 1.923 | 24 | 72.4
2 | 7 | 0 | 11.5 | 80 | 1.683 | 0.7 | 0.806 | 24 | 43.8
… | … | … | … | … | … | … | … | … | …
1714 | 2.6 | 355 | 11.5 | 28 | 1.414 | 0.33 | 0.376 | 28 | 74.0
1715 | 4.8 | 355 | 10.5 | 78 | 2.087 | 0.29 | 0.376 | 28 | 42.2
1716 | 5.7 | 355 | 12.5 | 38 | 2.357 | 0.17 | 0.548 | 28 | 35.7
… | … | … | … | … | … | … | … | … | …
2013 | 7 | 10 | 10 | 62 | 0.741 | 0.58 | 0.806 | 28 | 26.8

Data preprocessing: Preprocessing the data before applying machine learning models such as RF, GB, ANN, DNN, and SVM is crucial because inappropriate data (missing data, unnormalized data) can significantly affect the performance of the prediction models. In this study, the authors used a Python function to check for missing data (as the data was generated from the DesignBuilder energy simulation model, the results of the data check showed no missing values). Then, the data was normalized using the Z-Score method according to Eq. (12). The reason for choosing the Z-Score method was to normalize the data to a standard normal distribution with a mean of 0 and a standard deviation of 1, eliminating the impact of outliers and noise, thereby enhancing data stability during analysis. The preprocessed data was then used as the input dataset for implementing the machine learning algorithms. The prediction models all shared the same dataset, which consisted of 2013 samples, with 70 % of the data used for training and 30 % used for testing.

$$X_{scaled} = \frac{X - X_{mean}}{X_{std}} \quad (12)$$

Where:
• $X$ is the original value of the variable.
• $X_{scaled}$ is the normalized value of the variable.
• $X_{mean}$ is the mean value of the original dataset.
• $X_{std}$ is the standard deviation of the original dataset.
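As an illustration of Eq. (12), the following minimal sketch standardizes one input column using only the Python standard library. The text does not state whether the population or the sample standard deviation was used; the population form (division by n) is assumed here, and the function name `z_score` is hypothetical. The sample values are the LPD entries of the rows shown in Table 2.

```python
import math
from typing import List


def z_score(values: List[float]) -> List[float]:
    """Apply Eq. (12): subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# LPD (W/m2) values from the rows of Table 2 that are printed in the paper.
lpd = [7.0, 11.5, 11.5, 10.5, 12.5, 10.0]
lpd_scaled = z_score(lpd)
print([round(v, 3) for v in lpd_scaled])
```

After scaling, the column has a mean of 0 and a standard deviation of 1, so inputs with very different units (degrees, W/m2, percentages) enter the learning algorithms on a comparable scale.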
4.2.1. The implementation of the submodel (SM1)

In this sub-model (SM1), energy consumption prediction models
corresponding to different parameters related to UvW, UvR, SHGC,
WWR, COP, BO , LPD, and CST are constructed using Python computer
programs. The models are trained and tested using the dataset provided
in Section 4.1 above. The dataset consists of 2013 samples, with 70 %
used for training and 30 % used for testing.
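The 70/30 split and the evaluation metrics of Eqs. (6)–(9) can be sketched as follows. This is an illustrative outline rather than the authors' code: the helper names, the shuffling seed, and the predicted values are assumptions, while the four actual values are energy results listed in Table 2.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def split_70_30(rows: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Shuffle reproducibly, then keep 70 % for training and 30 % for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def regression_metrics(actual: List[float], predicted: List[float]) -> Dict[str, float]:
    """Compute MAPE, MSE, RMSE and R2 as defined in Eqs. (6)-(9)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    mse = squared_error / n
    return {
        "MAPE": sum(abs(a - p) / a for a, p in zip(actual, predicted)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - squared_error / total_variation,
    }


# Row indices stand in for the 2013 simulated samples.
train_idx, test_idx = split_70_30(range(2013))

# Actual E values from Table 2; the predictions are hypothetical.
actual_e = [72.4, 43.8, 74.0, 42.2]
predicted_e = [70.0, 45.0, 74.0, 40.0]
scores = regression_metrics(actual_e, predicted_e)
print(len(train_idx), len(test_idx), scores)
```

Note that Eq. (6) defines MAPE as a fraction, so the sketch does not multiply by 100.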
Tuning hyperparameters of machine learning models, including
ANN, DNN, SVM, GB, and RF, is of utmost importance to achieve
optimal performance and generalization. These models rely on different
hyperparameters, which are parameters that are not learned from the
data but set before the learning process. The selection and optimization
of these hyperparameters have a significant impact on the performance
and effectiveness of the models. In this study, we performed careful
selection and optimization of these hyperparameters using the GA
algorithm with the Deap library in Python. This approach aimed to
improve the accuracy of the models, prevent overfitting, and enhance
their ability to recognize complex patterns in the data. Table 3 presents
the ranges of hyperparameter choices for the machine learning models
used to define the search space in the parameter optimization process.
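The idea of a genetic search over a hyperparameter space can be sketched in plain Python. This is a simplified stand-in and not the DEAP-based program used in the study: the three GB ranges are taken from Table 3, the crossover probability (0.5), mutation probability (0.2), and 10 generations mirror the values the text reports for the DNN search, and `placeholder_error` is a hypothetical fitness function that a real run would replace with the validation error of a trained model.

```python
import random

# Ranges for three GB hyperparameters, as listed in Table 3.
SEARCH_SPACE = {
    "learning_rate": (0.01, 0.30),
    "n_estimators": (100, 1000),
    "max_depth": (3, 10),
}
INTEGER_KEYS = {"n_estimators", "max_depth"}


def sample_gene(key, rng):
    """Draw one value for a hyperparameter from its range."""
    low, high = SEARCH_SPACE[key]
    return rng.randint(low, high) if key in INTEGER_KEYS else rng.uniform(low, high)


def random_individual(rng):
    return {key: sample_gene(key, rng) for key in SEARCH_SPACE}


def placeholder_error(ind):
    """Hypothetical fitness; a real run would train a model and return its error."""
    return ((ind["learning_rate"] - 0.1) ** 2
            + ((ind["n_estimators"] - 500) / 1000) ** 2
            + ((ind["max_depth"] - 6) / 10) ** 2)


def tournament(population, rng, k=3):
    """Return the best of k randomly chosen individuals."""
    return min(rng.sample(population, k), key=placeholder_error)


def ga_search(pop_size=20, ngen=10, cxpb=0.5, mutpb=0.2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = min(population, key=placeholder_error)
    for _ in range(ngen):
        offspring = []
        for _ in range(pop_size):
            child = dict(tournament(population, rng))
            if rng.random() < cxpb:
                # Uniform crossover: take each gene from either parent.
                mate = tournament(population, rng)
                child = {k: (child[k] if rng.random() < 0.5 else mate[k])
                         for k in SEARCH_SPACE}
            if rng.random() < mutpb:
                # Mutation: resample one randomly chosen gene.
                key = rng.choice(list(SEARCH_SPACE))
                child[key] = sample_gene(key, rng)
            offspring.append(child)
        population = offspring
        best = min(population + [best], key=placeholder_error)
    return best


best_config = ga_search()
print(best_config, placeholder_error(best_config))
```

Because the best individual seen so far is carried along, the returned configuration is never worse than the best member of the initial random population.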
The predictive models in the sub_model (SM1) are constructed using
Python software, encompassing Support Vector Machine (SVM),
Artificial Neural Network (ANN), Generalized Linear Regression (GENLIN),
Deep Learning Neural Network (DLNN), Random Forest (RF), and
Gradient Boosting (GB). The parameters of these models are optimized Fig. 7. A typical segment of Python code for the DNN-based model.
through the implementation of the Genetic Algorithm (GA) using the
Python package deap.py.
DNN: To build the DNN forecasting model, the authors developed
Table 3 the eaSimple evolutionary algorithm within the Python package Deap to
Description of hyperparameter choices for algorithms. find the most suitable configuration of the DNN network. The parame
Machine learning Options for hyperparameters ters used were pop, log = algorithms.eaSimple(pop, toolbox, cxpb = 0.5,
model mutpb = 0.2, ngen = 10, stats = stats, halloffame = hof, verbose = True)
GB Num_leaves = [from 20 to 60], Min_data_in_leaf = [from 20 to (see Fig. 7). It is observed that the best DNN model structure consists of
100], Learning_rate = [from 0.01 to 0.30], N_estimators = the following layers: Layer 1: 126 neurons with a linear activation func
[from 100 to 1000], Max_depth = [from 3 to 10], Subsample =
tion, Layer 2: 122 neurons with a relu activation function, Layer 3: 52
[from 0.5 to 1.0], Colsample_bytree = [from 0.5 to 1.0]
RF N_estimators = [from 100 to 2000]; With 8 input features, neurons with a linear activation function, Layer 4: 104 neurons with an elu
max_features =“auto”;”sqrt” as ≈ 2.83; or “log2″ as 3; activation function, and Layer 5: 104 neurons with an elu activation
Max_features as [from 1 to 8]. function.
SVM C (Penalty Parameter): [0.1, 1, 10, 100,1000], Kernel function It is worth noting that although the DNN forecasting model has the
= [ ’linear’, ’poly’, ’rbf’ and ’sigmoid’], Gamma: [0.1, 1, 10,
capability to learn complex nonlinear representations of the data, and
100]. The Fig. 6 illustrates various Kernel functions of SVM
ANN (1 Layer) The hidden layer size (10:200).Activation function (Sigmoid, can learn and understand complex and abstract features from the input
1 data, DNN has a complex structure with multiple hidden layers.
Tanh, ReLu). Where Sigmoid f(x) = ; Tanh f(x) =
x − x
{ 1 + e− x Therefore, training and using the model require significant computa
e − e 0for x < 0
ex + e− x
; ReLu f(x) =
xforx ≥ 0
tional resources and a lengthy training time (the execution time in the
ANN (2 Layer) The 1st hidden layer size = [from 10 to 300], the 2nd hidden case study was 91.9 min).
layer size = [from 10 to 300], the activation function for the GB: The predictive model using Gradient Boosting (GB) is executed
entire network = [Sigmoid, Tanh, ReLu] through a program written by the author in Python. In this context, the
DNN (3 Layers) The 1st hidden layer size = [from 10 to 200], the activation
GB method is implemented using the Light Gradient Boosting Machine
function for the 1st hidden layer = [Sigmoid, Tanh, ReLu], the
2nd hidden layer size = [from 10 to 200], the activation (LightGBM) method for regression. The GB model’s parameters are
function for 2nd hidden layer = [Sigmoid, Tanh, ReLu], the 3rd optimized through a genetic algorithm (employing the eaSimple func
hidden layer size = [from 10 to 200], the activation function tion from Python’s deap library). Fig. 8 shows a typical Python code
for 3rd hidden layer = [Sigmoid, Tanh, ReLu]. The Fig. 5 snippet for the Gradient Boosting (GB) forecasting model with GA(Ge
represents the activation functions of DNN (Deep Neural
netic Algorithm). The results from the Python program reveal the best
Network).
491
Table 4
The comparison of the models.
Model R2 MAPE RMSE MAE Execution Time
(%) (mins)
SVM 0.835 7.18 6.50 3.26 13.5

GENLIN 0.856 9.17 6.09 3.748 0.1
GB 0.994 1.04 1.19 0.50 1.53
RF 0.976 1.64 2.46 0.9 0.08
DNN (*) 0.985 2.66 1.96 1.11 91.94
ANN (2 hiden 0.975 2.76 2.52 1.18 3.8
layers)
ANN (1 hiden 0.962 2.96 2.81 1.24 2.6
layers)
Fig. 9. Comparison between Predicted and True Values (Using SVM).
Fig. 8. A typical segment of Python code for the GB-based model.
parameters found by the GA to be “num_leaves: 57.504, min_data_in_leaf:

19.89, learning_rate: 0.359, n_estimators: 959.75, max_depth: 2.738, sub
sample: 0.537, colsample_bytree: 0.549”.
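For readers who want to reproduce the kind of figures reported in Table 4, the sketch below computes R2, MAPE, RMSE, and MAE, assuming the standard textbook definitions of these metrics. The function name and the sample values are illustrative assumptions and are not taken from the study's dataset or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAPE (%), RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE assumes no true value is zero
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAPE": mape, "RMSE": rmse, "MAE": mae}


# Illustrative energy values in kWh/m2/year (made up for this example)
y_true = [50.0, 60.0, 70.0, 80.0]
y_pred = [52.0, 59.0, 69.0, 82.0]
metrics = regression_metrics(y_true, y_pred)
```

Lower MAPE, RMSE, and MAE and a higher R2 indicate a closer match between predicted and simulated energy consumption.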
ANN: In the ANN model, a Python program is written to optimize the parameters of MLPRegressor (one of the popular models in the ANN model class). MLPRegressor is a feedforward neural network with one or more hidden layers and one output layer, in which each layer is connected to the next through weighted connections. The program uses the "eaSimple" algorithm (in Python) to obtain the MLPRegressor parameters. The resulting configuration has 2 hidden layers, with 95 neurons in the first hidden layer and 91 neurons in the second. The activation function used is 'relu', and the model is trained using the 'adam' solver (the training command is: model = MLPRegressor(hidden_layer_sizes=hidden_layer_sizes, activation='relu', solver='adam', max_iter=500)).
ANN (1 hidden layer): In this ANN model, a Python program is written to optimize the parameters of the network. The program uses the "eaSimple" algorithm (in Python) to obtain the parameters. The resulting configuration has a single hidden layer of 105 neurons. The activation function used is 'relu', and the model is trained using the 'adam' solver.
SVM: For SVR, the best parameters are determined using the GridSearchCV search function in Python. The search gives the best parameters for SVM as follows: 'C' = 100, 'epsilon' = 1, 'gamma' = 0.1, and a 'linear' kernel function.
Fig. 10. Comparison samples between Predicted and True Values (Using GB).
4.3. Comparison of the predictive models
Table 4 displays the main evaluation metrics used to assess the AI-driven prediction methods employed in this study. These metrics include R2, MAPE, RMSE, and MAE. Lower values of MAPE, RMSE, and MAE indicate a higher degree of reliability in the model's predictions. Figs. 9–13 offer a visual comparison between forecasted and test values of 30 random samples for the various machine learning models and algorithms. Several key insights can be drawn from these outcomes:
- Accuracy: GB outperforms the other models in terms of accuracy, as shown by its highest R2 score and lowest MAPE, RMSE, and MAE values. This makes it an excellent choice for applications where precision is of utmost importance.
- Computational Efficiency: GENLIN stands out for its low computational time, making it suitable for projects that require a quick turnaround, provided its level of accuracy is acceptable.
- Robustness: GB demonstrates high robustness, as indicated by its consistently strong performance across multiple evaluation metrics. This suggests that it can handle variations in the dataset and deliver reliable results.
- Interpretability: SVM and GENLIN models may offer better interpretability due to their simpler algorithms and transparent decision-making processes.
- Execution Time: RF has the shortest execution time, making it suitable for time-sensitive applications. However, it is important to consider its slightly lower accuracy compared to GB.
The machine learning energy prediction models in this study have demonstrated high levels of effectiveness when comparing the forecasted values with the values from the original test set. Specifically, the GB model achieved very high accuracy with R2 = 0.994, MAPE (%) = 1.04, RMSE = 1.19, MAE = 0.5. The DNN model also showed good performance with R2 = 0.985, MAPE (%) = 2.66, RMSE = 1.96, MAE = 1.11. Compared with previous studies that used GB for energy-related forecasting tasks (e.g., Sauer et al. [40] used a GB model to forecast cooling load and achieved R2 = 0.989), the effectiveness of the GB method in energy prediction is consistently high. This result also demonstrates the ability of the machine learning models to capture the complex relationships between building energy consumption and geometric variables of the building (e.g., BO, WWR) as well as related physical characteristics of the building (such as UvW, UvR, SHGC, COP, LPD, and CST). Once again, this result reinforces the strength of machine learning algorithms in estimating building energy based on early-stage design information.
Fig. 11. Comparison between Predicted and True Values (Using GENLIN).
Fig. 12. Comparison between Predicted and True Values (Using DNN).
Fig. 13. Comparison between Predicted and True Values (Using ANN with GA).
4.4. The best-performing model for predicting energy consumption
After evaluating the individual machine learning models, a comparative analysis was conducted to determine the best-performing model for predicting the preliminary energy consumption of a building. Based on the results presented in Table 4, the Gradient Boosting (GB) algorithm demonstrates superior performance compared to the other models in terms of MAPE, RMSE, MAE, and R2. Therefore, the GB algorithm is considered the most suitable choice for the sub-model (SM1) in predicting energy consumption, which will serve as the energy objective in the subsequent main optimization program.
The contribution of the sub-model (SM1) to the main model (MM1) is to evaluate the energy consumption level E (kWh/m2/year), which is the energy objective in the main model, for different values of UvW, UvR, SHGC, and WWR of the selected Option wall, Option roof, Option glass, and Option wwr from the sets Sw, Sr, Sg, and Swwr. The sub-model thus assesses how different design choices affect the building's energy efficiency, providing insights into the configurations that minimize energy consumption.
4.4.1. The implementation of the optimization model (the main model MM1)
The authors have identified relevant data on different options for glass, wall, roof, and window-to-wall ratio, and their corresponding costs. These data are presented in Tables 5–8, which reference the data found in [43]. For this case study, Tables 5–8 are used to describe the sets Sw, Sr, Sg, and Swwr, where n1 = 18, n2 = 19, n3 = 58, and n4 = 61. The initial values for the fixed requirements related to the electrical and mechanical system and building orientation, as determined by the user, are as follows: the Coefficient of Performance (COP) of the air conditioning system is set to 4, the building orientation (BO) is set to 90, the lighting power density (LPD) is set to 10, and the cooling set temperature (CST) is set to 26.
Table 5
Data for wall.
Option    UvW    Cw
1    2.14    60.75
2    3.82    199.35
3    3.3    91.35
…    …    …
17    2.98    90
18    0.93    109.35
Table 6
Data for roof.
Option    UvR    Cr
1    0.93    120.6
2    0.83    107.1
3    0.3    183.6
…    …    …
18    0.99    168.75
19    1.61    112.5
Table 7
Data for glass.
Option    SHGC    Cg
1    0.82    36.75
2    0.8    78.45
3    0.78    58.8
…    …    …
57    0.32    137.25
58    0.25    161.7
Table 8
Data for WWR.
Option    WWR
1    0.2
2    0.21
3    0.22
…    …
60    0.79
61    0.8
The Python program runs the NSGA-II (Non-dominated Sorting Genetic Algorithm II), NSDE (Non-dominated Sorting Differential Evolution), and MOPSO (Multi-Objective Particle Swarm Optimization) algorithms in sequence to search for Pareto optimal solutions.
Fig. 14. A typical Python code snippet for the NSGA-II algorithm.
Objective function 1, which represents energy consumption per square meter, is calculated using the best machine learning model, Gradient Boosting (GB), in the sub-model. The inputs to the prediction sub-model are the values of WWR (Window-to-Wall Ratio), UvW (U-value of walls), SHGC (Solar Heat Gain Coefficient), and UvR (U-value of the roof), which are determined by the currently selected options (Option wall, Option roof, Option glass, and Option wwr), together with the fixed parameters initially set by the user (COP as 4, BO as 90, LPD as 10, and CST as 26).
Objective function 2, which represents the cost objective of the optimization problem, is defined by Equation (11). In this equation, Cw, Cg, and Cr denote the unit prices of the current options for wall, glass, and roof, respectively.
The Pareto optimal solutions are those for which neither objective can be improved without worsening the other. Each solution is represented by a 4-dimensional vector corresponding to the 4 variables: Option wall, Option roof, Option glass, and Option wwr.
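The notion of Pareto optimality used here can be made concrete with a short dominance filter. The sketch below assumes both objectives (cost and energy) are minimized; the function names and the candidate values are made up for illustration and do not come from the authors' program.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# (cost, energy) pairs; illustrative values only
candidates = [
    (100.0, 60.0),
    (120.0, 55.0),
    (130.0, 58.0),  # dominated by (120.0, 55.0)
    (110.0, 61.0),  # dominated by (100.0, 60.0)
    (150.0, 54.0),
]
front = pareto_front(candidates)
```

The three remaining points trade cost against energy: moving from one to another lowers one objective only by raising the other.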
4.4.2. The optimization model using the NSGA-II-based algorithm
In the main model (MM1), the NSGA-II-based algorithm uses the following primary parameters: a crossover probability of 0.9 (pc = 0.9) and a mutation constant of 0.1 (pm = 0.1). The model with the NSGA-II algorithm is written and implemented in Python within the Anaconda environment. Fig. 14 illustrates a typical Python code snippet for the NSGA-II algorithm.
From the results of running the model, it is observed that each generation of the NSGA-II algorithm produces a new set of solutions and hence a new set of Pareto fronts. Consequently, when the algorithm is executed for 10 generations, a total of 10 sets of fronts are obtained, with each set corresponding to a specific generation (Fig. 15). When visualizing these fronts, a different color is assigned to the set of fronts of each generation, which makes it possible to track the algorithm's progress from generation to generation. Notably, the advancement of the best solutions can be observed as the generations progress. If the algorithm operates effectively, front 1 gradually moves towards the lower-left corner of the graph (assuming both objectives are minimized). This movement signifies improvements in both objectives.
Fig. 15. The solutions from the NSGAII-based algorithm for 10 generations of the first run.
Fig. 16 presents the Pareto points following the execution of NSGA-II with Pop Size = 100 and Gen = 50 after 30 runs. One of the most significant considerations in Pareto solutions is the solution closest to the origin. The program normalizes the objectives using Min-Max normalization, calculates the Euclidean distance from the origin to each point, and returns the index of the point closest to the origin. On this basis, the program identifies the Pareto optimal point (Cost as 13245.55; Energy as 57.96721) nearest to the origin (0,0), which serves as a reference for selection. Table 9 provides the Pareto solution points obtained from running the NSGAII-based algorithm 30 times, with each run evolved for a maximum of 50 generations. The computation time for executing the optimization model using the NSGA-II algorithm is 16.9 min on a laptop computer with an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz 1.99 GHz.
Fig. 16. Pareto points for the NSGAII-based algorithm after 30 runs.
4.4.3. The optimization model using the NSDE-based algorithm
In the main model (MM1), the NSDE algorithm relies on the following primary parameters: the mutation constant F is set to 0.9, and the crossover probability CR is set to 0.5. Fig. 17 displays a Python code snippet from the optimization model based on the NSDE algorithm. Fig. 18 and Fig. 19 display the Pareto points when using the NSDE-based algorithm. Table 10 provides the Pareto solution points obtained from running the NSDE-based algorithm 30 times, with each run evolved for a maximum of 50 generations. The computation time for executing the optimization model using the NSDE algorithm is 15.2 min on a laptop computer with an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz 1.99 GHz.
4.4.4. The optimization model using the MOPSO-based algorithm
In the main model (MM1), the MOPSO-based algorithm is initialized with the following parameters: the inertia weight w is set to 0.7, the two learning factors c1 and c2 are both assigned a value of 2, and the mutation rate is set to 0.1. Fig. 20 displays a Python code snippet from the optimization model based on the MOPSO algorithm. Fig. 21 displays the Pareto points when using the MOPSO-based algorithm. Table 11 provides the Pareto solution points obtained from running the MOPSO-based algorithm 30 times, with each run evolved for a maximum of 50 generations. The computation time for executing the optimization model using the MOPSO algorithm is 12.8 min on a laptop computer with an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz 1.99 GHz.
4.4.5. An ensemble model from 3 models: NSGA, MOPSO, MDSE
It is noted that algorithms can perform better in certain specific cases, while in other cases their performance may be suboptimal. To simplify the process, we employ three algorithms and combine their results to generate a Pareto solution set. It is important to highlight that these solutions are based solely on two quantitative criteria: energy consumption and building envelope-related costs. Using this solution set, project managers can compile a list of potential solutions to be integrated with qualitative criteria for the building envelope (such as wall aesthetics, glass aesthetics, architectural focal points, etc.) for further analysis and selection of solutions that meet both qualitative and quantitative criteria. Fig. 22 displays the Pareto optimal points from the ensemble model of the NSGA-II, NSDE, and MOPSO algorithms. A synthesis function in the main model pools the Pareto points from each algorithm (those based on NSGAII, NSDE, and MOPSO), which are then sorted to find the Pareto points common to all three algorithms. The aggregated results are presented in Table 12 below.
It is also noted that the main model provides these results based on the initial parameter values for the fixed requirements related to the electrical and mechanical system and building orientation, as pre-determined by users. These values include a Coefficient of Performance (COP) of 4, a building orientation (BO) of 90 degrees, a lighting power density (LPD) of 10 W/m2, and a cooling set temperature (CST) of 26 degrees Celsius. Based on the data in Table 12, some observations can be made regarding the physical factors and costs of the options as follows:
4.5. Physical factors
UvW is the heat transfer coefficient of the wall, with the highest value of 2.14 indicating the highest rate of heat transfer through the wall. This could cause more heat loss and require more energy to heat or cool the building. Therefore, the Energy value for this selected option would likely be higher, indicating less energy efficiency.
UvR is the heat transfer coefficient of the roof, with the lowest value of 0.3 indicating the lowest rate of heat transfer through the roof. This could help keep the temperature inside the building more stable, especially in the summer when the outside temperature is high. As a result, the energy efficiency for this selected option would likely be better, but this roofing material often incurs a higher cost.
All options have an SHGC value of 0.82, which is the highest value in the available set of options for glass Sg. This indicates that the heat absorption capacity of the glass (SHGC) is the same across all options. A higher SHGC leads to greater solar heat gain, which can warm a building in colder climates but may result in additional cooling requirements in warmer climates.
WWR varies from 0.2 to 0.8, showing that the proportion of glass to the surface area of the building varies between options. An option with a higher WWR will allow more natural light into the building.
4.6. Cost factors
Costs vary from 11226.38 to 16619.04. This is a crucial factor when deciding on the building as it can impact the project's budget. The option with the lowest cost (11226.38) is not necessarily the best choice as it may not meet all the requirements for quality and energy efficiency.
Table 9
Pareto optimal solutions from the NSGAII-based algorithm for 30 runs.
Option UvW    Option UvR    Option SHGC    Option WWR    UvW    UvR    SHGC    WWR    Cost    Energy    Color
1    8    1    61    2.14    0.97    0.82    0.8    11226.38    62.6307    Darkviolet
8    3    1    61    0.71    0.3    0.82    0.8    16028.06    57.05788    Darkviolet
8    8    1    1    0.71    0.97    0.82    0.2    13147.06    58.76141    Darkviolet
8    16    1    1    0.71    0.74    0.82    0.2    13245.55    57.96721    Darkviolet
18    3    1    61    0.93    0.3    0.82    0.8    16619.04    56.91771    Darkviolet
18    16    1    1    0.93    0.74    0.82    0.2    13393.3    57.82705    Darkviolet
1    3    1    61    2.14    0.3    0.82    0.8    13073.18    61.06063    Darkviolet
1    16    1    61    2.14    0.74    0.82    0.8    11251.01    61.86496    Darkviolet
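The closest-to-origin selection described for the NSGA-II results can be sketched with the Cost and Energy columns of Table 9. This is a minimal illustration, not the authors' program: the function names are assumptions, and the normalization is taken over the eight listed points only.

```python
import math

# (cost, energy) pairs copied from Table 9
pareto_points = [
    (11226.38, 62.6307),
    (16028.06, 57.05788),
    (13147.06, 58.76141),
    (13245.55, 57.96721),
    (16619.04, 56.91771),
    (13393.3, 57.82705),
    (13073.18, 61.06063),
    (11251.01, 61.86496),
]


def min_max(values):
    """Min-Max normalization to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def closest_to_origin(points):
    """Index of the point with the smallest Euclidean distance to (0, 0)
    after normalizing each objective separately."""
    costs = min_max([cost for cost, _ in points])
    energies = min_max([energy for _, energy in points])
    distances = [math.hypot(c, e) for c, e in zip(costs, energies)]
    return min(range(len(points)), key=distances.__getitem__)


best_index = closest_to_origin(pareto_points)
best_point = pareto_points[best_index]
```

With these eight points the selected solution is (13245.55, 57.96721), the same point the text reports as nearest to the origin. Normalizing first matters because cost and energy differ by several orders of magnitude, so an unnormalized distance would be driven almost entirely by cost.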
import numpy as np

# table_1wall, table_2roof, table_3glass, table_4wwr and evaluate() are
# defined elsewhere in the program; this listing is an excerpt.
def basic_nsde(population, generations,
               mutation_factor=0.5, crossover_rate=0.9):
    pareto_fronts = []
    # The number of options for each variable (wall, roof, glass, WWR)
    num_options = [table_1wall.shape[0], table_2roof.shape[0],
                   table_3glass.shape[0], table_4wwr.shape[0]]

    for generation in range(generations):
        # Mutation and recombination (crossover)
        trial_population = population.copy()
        for i in range(population.shape[0]):
            # Select three different random individuals, excluding i
            indices = np.delete(np.arange(population.shape[0]), i)
            random_indices = np.random.choice(indices, 3, replace=False)
            a, b, c = population[random_indices]

            # Mutation
            mutant = a + mutation_factor * (b - c)
            for j in range(len(mutant)):
                # Clip to the valid option range and round to the nearest integer
                mutant[j] = np.round(np.clip(mutant[j], 0, num_options[j] - 1))

            # Recombination (crossover)
            crossover = np.random.rand(len(mutant)) < crossover_rate
            trial_population[i] = np.where(crossover, mutant, population[i])

        # Combine offspring and parent population
        combined_population = np.vstack((population, trial_population))

        # Evaluate the combined population
        objectives = evaluate(combined_population)
Fig. 17. A typical Python code snippet for the NSDE-based algorithm.
Fig. 19. Pareto optimal points for the NSDE-based algorithm after 30 runs.
4.7. Energy efficiency
Energy efficiency (as represented by the Energy value) also varies significantly between options, from 56.91771 to 62.6307. While the option with the best energy efficiency (56.91771) may be beneficial for energy saving, it may also impose some limitations in terms of aesthetics or user convenience.
In general, the choice of option depends on the specific priorities of the user: if energy performance is the top goal, they may choose the option with the best Energy; if budget is a major issue, they may choose the option with the lowest cost. The Pareto solutions obtained from Table 12 provide valuable insights into the interplay between various factors, energy consumption, and the cost of the building envelope, with each algorithm contributing to a diverse range of optimal solutions.
4.8. Comparison with the initial design envelope solution
Based on the Pareto solution set provided by the proposed model in Table 12, the architectural and engineering professionals in the early design phase selected two solutions: solution A (which balances cost and energy with equal weighting for both objectives and has the closest distance to the point (0,0)) and solution B (which has the lowest building envelope cost). Solution A has Cost and Energy values of 13245.55 and 57.96721, respectively, along with the options for walls, roof, glass, and window-to-wall ratio shown in Table 13 below. Solution B has Cost and Energy values of 11226.38 and 62.6307. It can be observed that for the selected building, changing the type of the enveloping wall plays a crucial role in the design when aiming for an energy-efficient building (refer to the column "Option UvW" in Table 12 and Table 13).
Fig. 18. Pareto optimal points from the NSDE-based algorithm of the first run.
The traditional approach only uses BIM with choices based on the architects' experience, individually testing each option and selecting the best results. The initial results of this experience-based approach were energy consumption and envelope-related costs of 62.883 kWh/m2/year and $14,241, respectively. The comparison results demonstrate significant cost and energy savings compared to the initial experience-based approach (see Table 13). Specifically, the comparison shows simultaneous savings in cost and energy, with a reduction of 7.52 % in cost and 8.48 % in energy, or alternatively 21.17 % in cost and 0.4 % in energy for the case study.
Since these results are only from the initial design stage (early stage) of the project, the selected solutions (Solution A and B) will undergo further evaluation during the detailed design phase, taking into consideration additional criteria such as aesthetics, window visibility, durability of the enclosure, and more. Various methods such as Choosing by Advantages (CBA), developed by Arroyo in research published in the Energy and Buildings journal in 2016 [61], or the Analytic Hierarchy Process (AHP) may be employed to evaluate design alternatives with multiple criteria.
Table 10
Pareto optimal solutions from the NSDE-based algorithm for 30 runs.
Option UvW    Option UvR    Option SHGC    Option WWR    UvW    UvR    SHGC    WWR    Cost    Energy    Color
18    16    1    1    0.93    0.74    0.82    0.2    13393.3    57.82705    Blue
1    3    1    61    2.14    0.3    0.82    0.8    13073.18    61.06063    Blue
1    8    1    61    2.14    0.97    0.82    0.8    11226.38    62.6307    Blue
1    16    1    61    2.14    0.74    0.82    0.8    11251.01    61.86496    Blue
18    3    1    61    0.93    0.3    0.82    0.8    16619.04    56.91771    Blue
8    3    1    61    0.71    0.3    0.82    0.8    16028.06    57.05788    Blue
8    8    1    1    0.71    0.97    0.82    0.2    13147.06    58.76141    Blue
8    16    1    1    0.71    0.74    0.82    0.2    13245.55    57.96721    Blue
4.9. Discussion on the proposed method
In the development of the proposed model, there are three main components: the simulation model (DesignBuilder), the predictive model, and the optimization model, which are integrated in a novel way.
The simulation model (DesignBuilder) is utilized to generate the dataset, specifically during the initial design phase, with high capability in modeling the energy performance of buildings.
The predictive model takes into account the key design features in the early stage (prior to detailed design) to predict the energy performance of buildings in tropical climate regions. It utilizes a variety of popular techniques (such as ANN, DNN, SVM, GENLIN, RF, GB, etc.), each with its own characteristics. The machine learning models acquired knowledge of the underlying patterns and relationships in the data during the training process, thereby enabling accurate predictions of energy consumption. In particular, the GB algorithm, which had its hyperparameters optimized using a genetic algorithm, was identified as the most suitable technique (compared to the RF, SVM, ANN, and DNN algorithms after parameter optimization), achieving a very high R2 score of 0.994.
The optimization model is employed to optimize the building envelope solutions. Multi-objective AI optimization algorithms search for near-optimal solutions by combining exploitation and exploration of the search space. Exploitation is the process of focusing on the best-performing region in the current search space. Exploration, on the other hand, is the process of searching for new regions that may contain better solutions. Each algorithm has its own advantages, and no algorithm has been able to outperform the others in all optimization search problems. The optimization model incorporates multiple artificial evolutionary algorithms and integrates the machine learning predictive model into the evaluation function. This enables an efficient search for a set of building envelope solutions that are Pareto-optimal. As observed in the generated Pareto front results in Tables 9–12, the optimization algorithms NSGA-II, DSE, and MOPSO complement each other to find the nearest-to-optimal Pareto solutions. With the availability of new artificial intelligence algorithms that can efficiently search for Pareto-optimal sets in certain problems, integrating a new algorithm can further enhance the comprehensive search for Pareto-optimal multi-objective solutions.
4.10. Discussion with previous studies
Comparing the results of the proposed method with previous studies is challenging due to the use of different specialized energy simulation software, variations in building characteristics, and regional climate differences. However, to provide a comprehensive overview, the results of the proposed model can be relatively compared to the findings in the recent study by Elbeltagi et al. (2022) [21], as shown in Table 14.
The comparison is inherently relative due to variations in climate across different buildings. However, it still demonstrates a consistent trend of energy savings when using the integrated optimization model (combined with forecasting models) compared to traditional methods. Furthermore, since no single AI algorithm is universally optimal for all cases, employing multiple algorithms for forecasting models can yield better results, especially during the early design stage when detailed drawings are not yet available. The proposed method, which utilizes a range of algorithms, shows high effectiveness, with the GB algorithm achieving an R2 value of 0.994, offering advantages over previous research in the predictive model. Additionally, the use of complementary AI optimization algorithms can provide a comprehensive set of Pareto optimal solutions for multiple objectives, thereby contributing to improved support for architects and engineers in decision-making processes. Based on the comparison in Table 14, it can be observed that the proposed method is consistent with the approach of previous research, but it offers enhanced flexibility and robustness.
4.11. Discussion of valuable capabilities and benefits
The proposed model offers valuable capabilities and benefits for professionals in the field, including architects and engineers. It can be used in the following ways:
- Fast and Accurate Energy Consumption Prediction: With its machine learning algorithms, the model provides architects and engineers with rapid and accurate predictions of annual energy consumption. By utilizing data generated from physics-based simulations, professionals can quickly and accurately predict the energy performance of different design alternatives and make informed decisions to improve energy efficiency.
- Early Design Stage Optimization: The model enables architects and engineers to optimize design decisions during the early design stage. By considering multiple criteria, such as energy consumption and cost, the model facilitates the selection of building envelope options that meet the project's goals and requirements.
- Evaluation of Building Envelope Options: The model allows professionals to evaluate various building envelope options comprehensively. By analyzing the predicted energy consumption and construction cost and considering additional criteria, such as aesthetics and durability (often considered at the next stage), architects and engineers can assess the trade-offs between different design
choices and select the most suitable building envelope for the project.
- Autonomous Operation and Speed: The model operates autonomously and provides fast energy performance predictions. This allows professionals to efficiently explore and evaluate numerous design alternatives within a short timeframe, enabling them to make … minimize future costs and modifications. The model's capabilities contribute to improved sustainable building design practices, allowing professionals to create buildings that are environmentally friendly and energy-efficient.
Fig. 20. A typical Python code snippet for the MOPSO-based algorithm.
Fig. 21. Pareto optimal points for the MOPSO-based algorithm after 30 runs.
5. Conclusion
Integrating energy simulation programs with an optimization model poses significant challenges, as mentioned in previous studies (e.g., uncertain variables for energy modeling, multiple changes due to project stakeholders, time-consuming model re-runs, complexity for users during the model utilization phase, inability to evaluate or forecast energy without sufficiently detailed conditions, and difficulties in integrating specialized energy simulation software with AI-based optimization programming). To overcome these challenges, the newly developed model in this study includes three main components: a simulation model, a predictive model, and an optimization model. Specifically, the novel prediction model (sub-model) is developed using data generated from physics-based simulations by DesignBuilder of an initial office design. This model, applying machine learning algorithms (RF, ANN, DNN, SVM, GENLIN, GB), is capable of forecasting annual energy consumption with high speed and accuracy. Such capabilities facilitate the evaluation of building envelope options at the early design stage. In addition, the main model employs AI optimization algorithms to obtain optimal solutions in building envelope designs, taking into account energy consumption and cost. The proposed model offers several advantages, including autonomous operation, accelerated energy performance prediction, and the handling of diverse building envelope options during the early design stage. The newly developed model will thus support the design process by assisting in quickly generating integrated solutions, shortening the synthesis, analysis, and evaluation cycles, and supporting interaction and selection of the most suitable design alternatives.
In the field of AI and machine learning, the research contributes to existing knowledge by successfully developing an integrated model. This model includes a sub-model for energy prediction in the early building design stage, which exhibits high speed and accuracy. It incorporates various modern AI algorithms such as ANN, DNN, SVM, RF, and GB, with the tuning of hyperparameters performed using the genetic algorithm (GA). Subsequently, this energy prediction model is integrated into a multi-objective optimization model, combining popular AI algorithms like NSGA-II, DSE, and MOPSO, to generate a near-optimal Par
timely decisions and streamline the design process. eto front with considerations for energy and cost criteria.
The proposed model will be highly beneficial for engineers in general
By utilizing the proposed model, architects and engineers can and architects in particular during the early design stage. The model,
enhance their design decisions, optimize energy efficiency, and accompanied by the process of selection and consensus among multiple
Table 11
Pareto optimal solutions from the MOPSO-based algorithm for 30 runs.
Option UvW  Option UvR  Option SHGC  Option WWR  UvW   UvR   SHGC  WWR   Cost (USD)  Energy (kWh/m2)  Color
18          8           1            1           0.93  0.97  0.82  0.2   13294.8     58.62125         Green
1           16          1            61          2.14  0.74  0.82  0.8   11251.01    61.86496         Green
8           8           1            4           0.71  0.97  0.82  0.23  13198.77    58.76141         Green
18          3           1            61          0.93  0.3   0.82  0.8   16619.04    56.91771         Green
8           3           1            61          0.71  0.3   0.82  0.8   16028.06    57.05788         Green
8           16          1            15          0.71  0.74  0.82  0.34  13469.63    57.96721         Green
18          16          1            7           0.93  0.74  0.82  0.26  13533.65    57.82705         Green
1           18          1            61          2.14  0.99  0.82  0.8   11226.38    62.6307          Green
project stakeholders, enables fast and accurate prediction of energy consumption and related costs associated with building envelopes. It provides a set of Pareto-optimal solutions for stakeholders to consider, approve, and proceed with the next design steps. This process facilitates the consideration of energy goals at an early stage of building design, allowing for the transfer of selected options to the subsequent design phase with minimal changes, thereby saving time and cost for modifications. This research contributes to the promotion of sustainable and energy-efficient construction practices, which are crucial in addressing global environmental challenges.
The proposed model has some limitations. It has not considered factors such as applicability to different geographical regions with various climates, diverse building structures, realistic user behavior within the building, and renewable energy sources. These factors should be taken into account during the early design stage. Additionally, the model has not fully explored all emerging machine learning energy forecasting algorithms, and it has not comprehensively considered the latest evolutionary optimization algorithms.
In future research, the authors intend to propose an integrated model incorporating Building Energy Modeling (BEM) and new AI algorithms for prediction and optimization. This integrated model aims to predict and optimize various types of building envelopes, taking into account energy performance, cost, and various sustainability objectives, particularly those related to environmental and social expert assessments.
The model can be expanded to consider additional aspects related to cost and financial benefits of the building when considering energy savings, along with the use of renewable energy sources as building

Fig. 22. Pareto optimal points from the ensemble of NSGA-II, NSDE, and MOPSO algorithms.
Table 12
Pareto optimal solutions from the ensemble of NSGA-II, NSDE, and MOPSO.
Option UvW  Option UvR  Option SHGC  Option WWR  UvW   UvR   SHGC  WWR   Cost (USD)  Energy (kWh/m2)  Color
1           8           1            61          2.14  0.97  0.82  0.8   11226.38    62.6307          Darkviolet
8           3           1            61          0.71  0.3   0.82  0.8   16028.06    57.05788         Darkviolet
8           16          1            1           0.71  0.74  0.82  0.2   13245.55    57.96721         Darkviolet
18          3           1            61          0.93  0.3   0.82  0.8   16619       56.9177          Darkviolet
18          16          1            1           0.93  0.74  0.82  0.2   13393.3     57.82705         Darkviolet
1           3           1            61          2.14  0.3   0.82  0.8   13073.18    61.06063         Darkviolet
1           16          1            61          2.14  0.74  0.82  0.8   11251.01    61.86496         Green
18          3           1            61          0.93  0.3   0.82  0.8   16619       56.9177          Green
1           18          1            61          2.14  0.99  0.82  0.8   11226.38    62.6307          Green
18          16          1            1           0.93  0.74  0.82  0.2   13393.3     57.82705         Blue
8           8           1            1           0.71  0.97  0.82  0.2   13147.06    58.76141         Blue
Note: The colors 'dark-violet', 'blue', and 'green' represent Pareto points from algorithms based on NSGA-II, NSDE, and MOPSO, respectively.
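The ensemble front in Table 12 can be read as the non-dominated subset of the points proposed by the individual algorithms. A minimal Python sketch of such a merge step follows, assuming both objectives (cost and energy) are minimized. The function names `dominates` and `pareto_front` are illustrative and are not taken from the paper's code, and the per-algorithm candidate lists are small hypothetical groupings of (cost, energy) pairs copied from Tables 11 and 12.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (cost in USD, energy in kWh/m2), both minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and better in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: Iterable[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing cost."""
    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


# Hypothetical candidate sets built from rows of Tables 11 and 12.
mopso = [(13294.8, 58.62125), (13198.77, 58.76141), (11251.01, 61.86496)]
nsga2 = [(13245.55, 57.96721), (13073.18, 61.06063)]
nsde = [(13147.06, 58.76141), (13393.3, 57.82705)]

# Merge all candidates and keep only the non-dominated ones.
front = pareto_front(mopso + nsga2 + nsde)
for cost, energy in front:
    print(f"{cost:10.2f} USD  {energy:9.5f} kWh/m2")
```

In this small example the points (13294.8, 58.62125) and (13198.77, 58.76141) are removed because another candidate is at least as good in both objectives; the remaining five points form the merged front.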
Table 13
Comparison results between the selected building envelope solutions and the initial solution.
Solution                                          Option UvW  Option UvR  Option SHGC  Option WWR  UvW   UvR   SHGC  WWR   Cost (USD)  Energy (kWh/m2)
Solution A (closest distance to the point (0,0))  8           16          1            1           0.71  0.74  0.82  0.2   13245.55    57.96721
Solution B (Best Cost)                            1           8           1            61          2.14  0.97  0.82  0.8   11226.38    62.6307
Initial building envelope solution                1           1           10           51          2.14  0.93  0.56  0.7   14241.1     62.8839
Savings (with A): 995.546 USD (7.52 %) in cost; 4.91669 kWh/m2 (8.48 %) in energy.
Savings (with B): 3014.72 USD (21.17 %) in cost; 0.52 kWh/m2 (0.4 %) in energy.
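The selection of Solutions A and B and the savings arithmetic in Table 13 can be illustrated with a short Python sketch. It is illustrative only: it assumes that "closest distance to the point (0,0)" is measured after min-max scaling each objective over the Table 12 front (the scaling is an assumption, not stated in the table), and it reports each percentage against a caller-chosen reference value, because the percentage depends on the denominator used. The helper names `closest_to_origin` and `saving` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (cost in USD, energy in kWh/m2)

# Ensemble Pareto points (cost, energy) taken from Table 12.
front: List[Point] = [
    (11226.38, 62.6307), (11251.01, 61.86496), (13073.18, 61.06063),
    (13147.06, 58.76141), (13245.55, 57.96721), (13393.3, 57.82705),
    (16028.06, 57.05788), (16619.0, 56.9177),
]
initial: Point = (14241.1, 62.8839)  # initial envelope solution, Table 13


def closest_to_origin(points: List[Point]) -> Point:
    """Pick the point nearest (0, 0) after min-max scaling each objective.

    The scaling is an assumption; it keeps cost from swamping energy.
    """
    lo = [min(p[i] for p in points) for i in range(2)]
    hi = [max(p[i] for p in points) for i in range(2)]

    def distance(p: Point) -> float:
        return math.hypot(*((p[i] - lo[i]) / (hi[i] - lo[i]) for i in range(2)))

    return min(points, key=distance)


def saving(initial_value: float, new_value: float, reference: float) -> Tuple[float, float]:
    """Return (absolute saving, saving as a percentage of `reference`)."""
    diff = initial_value - new_value
    return diff, 100.0 * diff / reference


solution_a = closest_to_origin(front)
solution_b = min(front)  # lowest cost; tuples compare on cost first

cost_abs, cost_pct = saving(initial[0], solution_a[0], reference=initial[0])
print(solution_a, solution_b, round(cost_abs, 2), round(cost_pct, 2))
```

Under these assumptions the sketch selects (13245.55, 57.96721) as Solution A and (11226.38, 62.6307) as Solution B. The 7.52 % and 8.48 % entries for Solution A correspond to using the Solution A values as the reference; using the initial values as the reference gives about 6.99 % and 7.82 % for the same absolute savings.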
Table 14
Comparison between the proposed method and a previous study.

Method: Elbeltagi et al. (2022) [21]
Energy simulation programs: Grasshopper and EnergyPlus
Predictive model: Artificial neural networks (ANN)
Optimization model: The classical genetic algorithm (GA) with a single objective (energy)
Results for the case study: A specific case study on a building in New Cairo, Egypt showcased energy consumption reductions of up to 25 %.
Qualitative characteristics: (1) The study was able to evaluate various design alternatives. (2) It successfully addressed the interoperability problem by integrating simulation and optimization tools. (3) It predicted building energy consumption using the ANN algorithm. (4) It proposed optimization parameters for residential buildings during the early design stage. (5) It provided optimal solutions for a single objective.

Method: The proposed method
Energy simulation programs: DesignBuilder (BIM)
Predictive model: Ensemble of machine learning algorithms (ANN, DNN, GENLIN, SVM, RF, GB)
Optimization model: Ensemble of AI algorithms (NSGA-II, DSE, MOPSO)
Results for the case study: A specific case study on a building in Ho Chi Minh City, Vietnam demonstrated savings of 7.52 % in cost and 8.48 % in energy, or alternatively, 21.17 % in cost and 0.4 % in energy.
Qualitative characteristics: (1) The study was able to evaluate various design alternatives. (2) It successfully addressed the interoperability issue by integrating energy simulation and optimization tools. (3) It predicted building energy consumption using the GB algorithm (considered the best algorithm among ANN, DNN, GENLIN, SVM, RF, and GB) with high speed and accuracy for different scenarios and conditions. (4) It proposed optimization parameters for residential buildings during the early design stage. (5) It provided Pareto-optimal solutions for multiple objectives. (6) It offers flexibility by utilizing multiple AI algorithms to adapt to different scenarios, such as diverse climatic regions.
envelopes, such as solar energy panels for roofs or solar panels integrated into façades. It is particularly important to integrate these aspects with smart energy trading to optimize the life cycle cost of the building (e.g., blockchain-based smart energy trading [62]).
In the technical aspect of researching optimal energy solutions for buildings, further expansion can be achieved by integrating smart devices and Internet of Things (IoT) technology to enhance energy efficiency (e.g., integrating high-performance smart energy routing protocols for IoT using wireless sensor devices [63]). Future research can also explore the optimization of energy efficiency for IoT in Unmanned Aerial Vehicle (UAV) networks (e.g., reinforcement-learning-based optimization of energy efficiency in UAV networks for IoT [64]), where UAVs can be equipped with sensors and data collection devices to gather detailed information about the building. By utilizing data and analysis from UAVs and the IoT system, a smart building can adapt its operations to maximize energy efficiency. These integrations are crucial for achieving a comprehensive energy-efficient smart building.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number DS2022-20-02.

References
[1] J.-S. Chou, N.-T. Ngo, Smart grid data analytics framework for increasing energy savings in residential buildings, Autom. Constr. (2016).
[2] Z. Wang, R.S. Srinivasan, A review of artificial intelligence based building energy use prediction: contrasting the capabilities of single and ensemble prediction models, Renew. Sustain. Energy Rev. 75 (2017) 796–808.
[3] Z. Wang, S. Srinivasan Ravi, J. Shi, Artificial intelligent models for improved prediction of residential space heating, J. Energy Eng. 142 (4) (2016), 04016006.
[4] Z. Wang, Y. Wang, R.S. Srinivasan, A novel ensemble learning approach to support building energy use prediction, Energ. Build. 159 (2018) 109–122.
[5] E. o. Denmark, VietNam energy outlook report, VietNam, 2017.
[6] K. Amasyali, N.M. El-Gohary, A review of data-driven building energy consumption prediction studies, Renew. Sustain. Energy Rev. 81 (2018) 1192–1205.
[7] S. Chen, et al., A review of internal and external influencing factors on energy efficiency design of buildings, Energ. Build. 216 (2020), 109944.
[8] H.N. Rafsanjani, S.M. ASCE, Factors influencing the energy consumption of residential buildings: a review, in: Construction Research Congress 2016, 2016, pp. 1133–1142.
[9] G. Farenyuk, The determination of the thermal reliability criterion for building envelope structures, Tehnički glasnik 13 (2019) 129–133.
[10] G. Rapone, O. Saro, Optimisation of curtain wall façades for office buildings by means of PSO algorithm, Energ. Build. 45 (2012) 189–196.
[11] M.T. Kahsay, G.T. Bitsuamlak, F. Tariku, Thermal zoning and window optimization framework for high-rise buildings, Appl. Energy 292 (2021), 116894.
[12] L. Junghans, N. Darde, Hybrid single objective genetic algorithm coupled with the simulated annealing optimization method for building optimization, Energ. Build. 86 (2015) 651–662.
[13] W. Yu, B. Li, H. Jia, M. Zhang, D. Wang, Application of multi-objective genetic algorithm to optimize energy efficiency and thermal comfort in building design, Energ. Build. 88 (2015) 135–143.
[14] J. Xu, J.-H. Kim, H. Hong, J. Koo, A systematic approach for energy efficient building design factors optimization, Energ. Build. 89 (2015) 87–96.
[15] S.N. Murray, B.P. Walsh, D. Kelliher, D.T.J. O'Sullivan, Multi-variable optimization of thermal energy efficiency retrofitting of buildings using static modelling and genetic algorithms – a case study, Build. Environ. 75 (2014) 98–107.
[16] Y. Ding, X. Wei, Q. Wang, Optimization approach of passive cool skin technology application for the Building's exterior walls, J. Clean. Prod. 256 (2020), 120751.
[17] D.-K. Bui, T.N. Nguyen, A. Ghazlan, N.-T. Ngo, T.D. Ngo, Enhancing building energy efficiency by adaptive façade: a computational optimization approach, Appl. Energy 265 (2020), 114797.
[18] X. Shi, Z. Tian, W. Chen, B. Si, X. Jin, A review on building energy efficient design optimization ROM the perspective of architects, Renew. Sustain. Energy Rev. 65 (Suppl. C) (2016) 872–884.
[19] J. Zhang, N. Liu, S. Wang, A parametric approach for performance optimization of residential building design in Beijing, Build. Simul. 13 (2) (2020) 223–235.
[20] T. Østergård, R. Jensen, S. Maagaard, Building simulations supporting decision making in early design – a review, Renew. Sustain. Energy Rev. 61 (2016) 187–201.
[21] E. Elbeltagi, H. Wefki, R. Khallaf, Sustainable building optimization model for early-stage design, Buildings 13(1), doi: 10.3390/buildings13010074.
[22] Z. Jalali, E. Noorzai, S. Heidari, Design and optimization of form and facade of an office building using the genetic algorithm, Sci. Technol. Built Environ. 26 (2) (2020) 128–140.
[23] K. Negendahl, T.R. Nielsen, Building energy optimization in the early design stages: a simplified method, Energ. Build. 105 (2015) 88–99.
[24] M. Manni, A. Nicolini, Multi-objective optimization models to design a responsive built environment: a synthetic review, Energies 15(2), doi: 10.3390/en15020486.
[25] N. Yue, I. Li, A. Morandi, Y. Zhao, A metamodel-based multi-objective optimization method to balance thermal comfort and energy efficiency in a campus gymnasium, Energ. Build. 253 (2021), 111513.
[26] S.-H. Liao, P.-H. Chu, P.-Y. Hsiao, Data mining techniques and applications – a decade review from 2000 to 2011, Expert Syst. Appl. 39 (12) (2012) 11303–11311.
[27] H.-X. Zhao, F. Magoulès, A review on the prediction of building energy consumption, Renew. Sustain. Energy Rev. 16 (6) (2012) 3586–3592.
[28] M.A. Mat Daut, M.Y. Hassan, H. Abdullah, H.A. Rahman, M.P. Abdullah, F. Hussin, Building electrical energy consumption forecasting analysis using conventional and artificial intelligence methods: a review, Renew. Sustain. Energy Rev. 70 (2017) 1108–1118.
[29] S.L. Wong, K.K.W. Wan, T.N.T. Lam, Artificial neural networks for energy analysis of office buildings with daylighting, Appl. Energy 87 (2) (2010) 551–557.
[30] C. Hamzaçebi, Forecasting of Turkey's net electricity energy consumption on sectoral bases, Energy Policy 35 (3) (2007) 2009–2016.
[31] V.K. Xindong Wu, The Top Ten Algorithms in Data Mining, Taylor & Francis Group, New York, 2009.
[32] B. Dong, C. Cao, S.E. Lee, Applying support vector machines to predict building energy consumption in tropical region, Energ. Build. 37 (5) (2005) 545–553.
[33] H. Zhong, J. Wang, H. Jia, Y. Mu, S. Lv, Vector field-based support vector regression for building energy consumption prediction, Appl. Energy 242 (2019) 403–414.
[34] E. Mocanu, P.H. Nguyen, M. Gibescu, W.L. Kling, Deep learning for estimating building energy consumption, Sustain. Energy Grids Netw. 6 (2016) 91–99.
[35] C. Li, Z. Ding, D. Zhao, J. Yi, G. Zhang, Building energy consumption prediction: an extreme deep learning approach, Energies 10 (10) (2017) 1525.
[36] R.F. Berriel, A.T. Lopes, A. Rodrigues, F.M. Varejão, T. Oliveira-Santos, Monthly energy consumption forecast: a deep learning approach, in: International Joint Conference on Neural Networks (IJCNN), 2017, pp. 4283–4290.
[37] J.S. Hygh, J.F. DeCarolis, D.B. Hill, S. Ranji Ranjithan, Multivariate regression as an energy assessment tool in early building design, Build. Environ. 57 (2012) 165–175.
[38] A. Tsanas, A. Xifara, Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools, Energ. Build. 49 (2012) 560–567.
[39] O. Sagi, L. Rokach, Approximating XGBoost with an interpretable decision tree, Inf. Sci. 572 (2021) 522–542.
[40] J. Sauer, V.C. Mariani, L. dos Santos Coelho, M.H.D.M. Ribeiro, M. Rampazzo, Extreme gradient boosting model based on improved Jaya optimizer applied to forecasting energy consumption in residential buildings, Evol. Syst. 13 (4) (2022) 577–588.
[41] O. Alshboul, G. Almasabha, A. Shehadeh, R.E. Mamlook, A.S. Almuflih, N. Almakayeel, Machine learning-based model for predicting the shear strength of slender reinforced concrete beams without stirrups, Buildings 12(8), doi: 10.3390/buildings12081166.
[42] M. Ferrara, A. Rolfo, F. Prunotto, E. Fabrizio, EDeSSOpt – energy demand and supply simultaneous optimization for cost-optimized design: application to a multi-family building, Appl. Energy 236 (2019) 1231–1248.
[43] Y. Wang, C. Wei, Design optimization of office building envelope based on quantum genetic algorithm for energy conservation, J. Build. Eng. 35 (2021), 102048.
[44] D. Tuhus-Dubrow, M. Krarti, Genetic-algorithm based approach to optimize building envelope design for residential buildings, Build. Environ. 45 (7) (2010) 1574–1581.
[45] F. Ascione, N. Bianco, R.F. De Masi, G.M. Mauro, G.P. Vanoli, Resilience of robust cost-optimal energy retrofit of buildings to global warming: a multi-stage, multi-objective approach, Energ. Build. 153 (2017) 150–167.
[46] R. Azari, S. Garshasbi, P. Amini, H. Rashed-Ali, Y. Mohammadi, Multi-objective optimization of building envelope design for life cycle environmental performance, Energ. Build. 126 (2016) 524–534.
[47] H.H. Hosamo, M.S. Tingstveit, H.K. Nielsen, P.R. Svennevig, K. Svidt, Multiobjective optimization of building energy consumption and thermal comfort based on integrated BIM framework with machine learning-NSGA II, Energ. Build. 277 (2022), 112479.
[48] K. Grygierek, J. Ferdyn-Grygierek, Multi-objective optimization of the envelope of building with natural ventilation, Energies 11 (6) (2018) 1383.
[49] M.-D. Yang, M.-D. Lin, Y.-H. Lin, K.-T. Tsai, Multiobjective optimization design of green building envelope material using a non-dominated sorting genetic algorithm, Appl. Therm. Eng. 111 (2017) 1255–1264.
[50] M. Ferrara, F. Prunotto, A. Rolfo, E. Fabrizio, Energy demand and supply simultaneous optimization to design a nearly zero-energy house, Appl. Sci. 9(11), doi: 10.3390/app9112261.
[51] S. Yao, Z. Jiang, J. Yuan, Z. Wang, L. Huang, Multi-objective optimization of transparent building envelope of rural residences in cold climate zone, China, Case Stud. Therm. Eng. 34 (2022), 102052.
[52] L. He, L. Zhang, A bi-objective optimization of energy consumption and investment cost for public building envelope design based on the ε-constraint method, Energ. Build. 266 (2022), 112133.
[53] A. Elsheikh, I. Motawa, E. Diab, Multi-objective genetic algorithm optimization model for energy efficiency of residential building envelope under different climatic conditions in Egypt, Int. J. Constr. Manag. 23 (7) (2023) 1244–1253.
[54] T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, in: Presented at the Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, USA, 2016, Available: doi: 10.1145/2939672.2939785.
[55] J.A. Nelder, R.W.M. Wedderburn, Generalized linear models, J. R. Stat. Soc. Ser. A (Gen.) 135 (3) (1972) 370–384.
[56] D.K. Chaturvedi, Soft Computing-Techniques and its Applications in Electrical Engineering (Studies in Computational Intelligence, no. 1860-949X), Springer Berlin, Heidelberg, 2010, pp. XXII, 612.
[57] S. Raschka, V. Mirjalili, Python Machine Learning-Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, Packt Publishing Ltd. (2019).
[58] P. Ihm, M. Krarti, Design optimization of energy efficient residential buildings in Tunisia, Build. Environ. 58 (2012) 81–90.
[59] M. Krarti, P. Ihm, Evaluation of net-zero energy residential buildings in the MENA region, Sustain. Cities Soc. 22 (2016) 116–125.
[60] A.A.R. Khaled Bataineh, Design optimization of energy efficient residential buildings in Mediterranean region, J. Sustain. Dev. Energy Water Environ. Syst. 10 (2) (2022) 1090385.
[61] P. Arroyo, C. Mourgues, F. Flager, M.G. Correa, A new method for applying choosing by advantages (CBA) multicriteria decision to a large number of design alternatives, Energ. Build. 167 (2018) 30–37.
[62] N.R. Pradhan, et al., A blockchain based lightweight peer-to-peer energy trading framework for secured high throughput micro-transactions, Sci. Rep. 12 (1) (2022), 14523.
[63] R. Dogra, S. Rani, Kavita, J. Shafi, S. Kim, M.F. Ijaz, ESEERP: enhanced smart energy efficient routing protocol for internet of things in wireless sensor nodes, Sensors 22(16), doi: 10.3390/s22166109.
[64] D. Deng, et al., Reinforcement-learning-based optimization on energy efficiency in UAV networks for IoT, IEEE Internet Things J. 10 (3) (2023) 2767–2775.