

Predicting construction labor productivity using lower upper decomposition radial base function neural network

2020, Engineering Reports

Received: 1 May 2019   Revised: 17 December 2019   Accepted: 18 December 2019   DOI: 10.1002/eng2.12107

RESEARCH ARTICLE

Predicting construction labor productivity using lower upper decomposition radial base function neural network

Sasan Golnaraghi (1), Osama Moselhi (1), Sabah Alkass (1), Zahra Zangenehmadar (2)

(1) Department of Building, Civil and Environmental Engineering, Concordia University, Montreal, Quebec
(2) Department of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Ontario

Correspondence: Zahra Zangenehmadar, Department of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5. Email: z_zange@encs.concordia.ca

ABSTRACT
Construction labor productivity is affected by many factors such as scope changes, weather conditions, managerial policies, and operational variables. Labor productivity is critical in project development. Its modeling, however, can be a very complex task because it requires consideration of the factors stated above. In this article, a novel methodology is proposed for quantifying the impact of multiple factors on productivity. The data used in the present study were prepared using data processing techniques and subsequently used to develop a predictive model for labor productivity based on a radial basis function neural network. The model focuses on labor productivity in formwork installation, using data gathered from two high-rise buildings in the downtown area of Montreal, Canada. The predictive capability of the developed model is then compared with that of other techniques, including the adaptive neuro-fuzzy inference system, artificial neural network, radial basis function (RBF), and generalized regression neural network. The results show that LU-RBF predicts productivity more accurately and can thus be used by members of project teams to validate estimated productivity based on available data. The advantages and limitations of the proposed model are discussed in this article.

KEYWORDS: construction project, labor productivity, neural network, radial base function

1 INTRODUCTION

Researchers and industry practitioners agree that changes are an integral part of construction projects and that the cumulative impact of changes should not be overlooked, as it can be detrimental to project success. The cumulative impact of changes on construction labor productivity is difficult to identify and measure. Although measured mile analysis is a well-known and widely accepted method for quantifying the cumulative impact of changes on labor productivity, it is not applicable in many cases: very often an unimpacted period of similar work simply does not exist, or the unimpacted period does not represent substantially similar activities. Nowadays, productivity is modeled using artificial intelligence (AI) algorithms. Technically, AI mostly refers to machine learning (ML) algorithms and heuristic approaches.
The two work in fundamentally different ways: ML algorithms are built to learn from raw input data, whereas heuristic approaches act greedily to satisfy an objective function or reduce predefined cost function(s).1 In other words, ML algorithms are data-driven methods in the sense that they use statistical input data, whereas heuristic methods rely on approximation without considering the trend of the data. For a mixed and abnormal dataset, none of the existing modeling techniques works well enough to yield a robust model capable of reliable generalization. Thus, there is a need for a new ML-based model that can fit the behavior of any given dataset.

The literature indicates that NN prediction generally outperforms regression, as an NN can leverage subtle changes in data and structural components. Owing to advancements in NN architecture over the last two decades, especially after the introduction of deep neural architectures capable of handling data of any type and complexity, this technique was chosen for model development. Choosing a proper network architecture with specific settings depends entirely on the input type and on whether the task is classification or regression.2 Given the dataset at hand and the regression task, the radial basis function (RBF) network was deemed a suitable NN type for model development. The radial basis function neural network (RBFNN) is a powerful method proven to succeed in many areas of engineering owing to the capabilities of its algorithm.2 These capabilities include flexibility to adapt to the data distribution through different kinds of kernels, fast training and runtime, and good generalization. RBF is a type of neural network that differs from a regular NN in that it represents the given input data as a statistical distribution (mainly Gaussian) and carries out training with the parameters of the chosen distribution. RBFNN requires less computational time (in both training and testing) than other techniques because of its simple topological structure.3 This structural simplicity can be considered its main advantage over the artificial neural network (ANN). The number of neurons in each hidden layer, as well as the number of hidden layers, plays a major role in model accuracy; increasing the number of neurons and hidden layers, however, increases both model complexity and computational time.

ANN has a variety of applications in construction management. Lu et al4 estimated construction labor productivity using real historical data from local construction companies; they applied a probability inference neural network and compared it with a feedforward back-propagation neural network model. Ok and Sinha5 applied ANN to estimate the daily productivity of earthmoving equipment. Ezeldin and Sharara6 described an ANN model for productivity prediction in formwork, steel fixing, and concrete-pouring activities.
Lau et al7 applied RBFNNs to estimate productivity rates of tunnel construction using data obtained from a project in Hong Kong; they used RBF-based time series analysis to estimate the production rate of the following cycle. Oral and Oral8 used self-organizing maps to analyze the relationship between construction crew productivity and a variety of factors, and predicted productivity in given situations for ready-mixed concrete, formwork, and placement of rebars; the data were collected from a construction site in Turkey. They concluded that ANN can predict productivity better than regression methods because of the complexity of the process involved. Muqeem et al9 predicted production rates for the installation of beam formwork using ANN. Mohammed and Tofan10 predicted the productivity of ceramic wall construction using data from general contractor companies; they applied ANN because the analysis required complex mapping of environmental and management factors to productivity. AL-Zwainy et al11 developed a model for estimating construction labor productivity in marble finishing works using a multilayer perceptron trained with a back-propagation algorithm. Moselhi and Khan12 ranked labor productivity-influencing parameters in construction using fuzzy subtractive clustering, neural network modeling, and stepwise variable selection; they observed that parameters such as work type, floor level, and temperature have a larger effect on productivity. Heravi and Eslamdoost13 considered 15 important factors related to labor motivation, supervision sufficiency, and competency and proposed a model for estimating the labor productivity rate. More recently, Aswed14 applied ANN to estimate bricklayer productivity, modeling 13 productivity-influencing factors. El-Gohary15 developed an engineering approach using ANN with hyperbolic tangent transfer functions to model the labor productivity of carpentry and reinforcing-steel fixing crafts for different types of concrete foundations; that study showed adequate convergence with reasonable generalization capability and more reliable results than both traditional approaches and existing approaches in the literature.

This research seeks to quantify the impact of different factors on productivity using an RBFNN. The network was developed in three steps: initial network processing, improving network parameters, and fine-tuning the network. The trained model is obtained after performing these steps. The performance of the developed model is compared with that of others, and conclusions are drawn.

2 DEVELOPED MODEL

A schematic block diagram depicting the work performed in this research is shown in Figure 1. Each step is explained in detail in the subsequent sections.

FIGURE 1 Schematic LU-RBFNN diagram for predicting labor productivity. RBFNN, radial basis function neural network

2.1 Zoning

By technical definition, zoning is the process of dividing input data into several overlapping or nonoverlapping segments.16 This approach is mainly used to extract deeper features from the dataset used in the model. The chief motivation for applying zoning is to preserve local data dependencies.16 When data vectors in a dataset are convoluted, their distribution carries a great deal of information; the data share many similarities, many of which are not recognizable by human visual examination.
These similarities are known as convoluted similarities. Locality refers to this kind of characteristic and varies from segment to segment in the dataset. In mostly numeric datasets, adjacent data vectors are similar to each other, and this similarity gradually vanishes moving away from a segment. With this definition of locality, zoning preserves the locality property. For example, consider a situation where the productivity values of two different data points (two rows) are very close to each other; in this case, the two rows are likely to have similar distributions for each feature. It should be noted that zoning is often mistaken for clustering. Clustering is the process of grouping different samples into several categories such that the exact category of each sample is unknown. There is also a slight difference between classification and clustering: in classification, the categories of the samples are specified. Zoning is neither classification nor clustering, but simple grid partitioning.

A zoning operation can be either one dimensional (1D) or two dimensional (2D). For the dataset used in this study, both 1D and 2D zoning were applied with the RBFNN. The results show that 2D zoning predicts productivity better, as it covers more features of the given dataset than 1D zoning. Specifically, reshaping 1D data into a 2D representation (2D zoning) increases the chance of extracting more informative features, because in a 2D representation data patches are binned together both vertically and horizontally. The width and height of a zone can differ, but zones are normally square with even side lengths. Model development started with zone sizes of 32 × 32 and 16 × 16; the results were fairly competitive, revealing that doubling the zone size does not significantly affect the performance of the proposed RBFNN model. The larger zone size was ultimately chosen as it reduces computational complexity. Zoning is performed by moving the mask associated with the zone along the 2D dataset. Considering a 32 × 32 mask that slides along the 2D dataset, the mask is first positioned on the top-left entry of the dataset, covering 32 rows and 32 columns. The mask then moves across the rows and columns, and the numbers falling under the mask at each position are picked up. This procedure continues until the bottom-right of the dataset is reached.
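The zoning step can be illustrated with a short sketch. The snippet below is a minimal illustration of 2D grid partitioning with a sliding mask, not the authors' implementation; the zone size, stride, and placeholder data are assumptions chosen to mirror the 32 × 32 example above.

```python
import numpy as np

def zone_2d(data: np.ndarray, size: int = 32, stride: int = 32):
    """Split a 2D array into square zones by sliding a size x size mask.

    A stride equal to the mask size gives nonoverlapping zones; a smaller
    stride gives overlapping zones. Edge zones that do not fit are skipped
    here for simplicity (the paper does not specify edge handling).
    """
    zones = []
    rows, cols = data.shape
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            zones.append(data[top:top + size, left:left + size])
    return zones

# Example: a hypothetical 2D representation of the productivity dataset.
dataset_2d = np.random.rand(128, 128)          # placeholder data, not the real dataset
nonoverlapping = zone_2d(dataset_2d, size=32, stride=32)
overlapping = zone_2d(dataset_2d, size=32, stride=16)
print(len(nonoverlapping), len(overlapping))   # 16 zones vs 49 zones
```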
2.2 Expectation maximization

Expectation maximization (EM) is a method that uses successive guesses to approximate the maximum likelihood function. The EM algorithm is used to obtain maximum likelihood estimates of model parameters when the data are partial, have missing points, or involve overlooked latent variables. Each EM iteration consists of an E-step and an M-step, as follows.17

In the E-step, the missing data are estimated given the dataset and the current estimate of the model parameters:18

x^(m) = arg max_{x ∈ X(y)} p(x | y, 𝜃^(m)),   (1)

in which x is the complete data, x^(m) is the mth estimate of x, y is the given observation, 𝜃^(m) is the mth estimate of 𝜃, X(y) is the support of x conditioned on y (the closure of the set of x where p(x | y, 𝜃) > 0), and arg is the argument function.

In the M-step, the likelihood function is maximized under the hypothesis that the missing data are known; the missing data quantified in the previous step are used in place of the actual missing data:18

𝜃^(m+1) = arg max_{𝜃 ∈ Ω} p(x^(m) | 𝜃),   (2)

where p(x | 𝜃) is the density of x given 𝜃, also written as p(Y = y | 𝜃); 𝜃 ∈ Ω is the parameter(s) to estimate; and Ω is the parameter space. In this research, EM was used to leverage low-frequency values, which have a larger share in balancing the data distribution, in the proposed model. It is important to keep the data distribution around its mean, as this is likely to guarantee the stability of the generated model.

2.3 Normalization

Normalization is used to map input data into bounded intervals (eg, [0, 1]) with respect to a distribution (usually the normal distribution). Normalization can ensure stable convergence and prevent biases in neural network models. The process of normalizing data in the developed model followed a Gaussian distribution with specific values of mean and variance. One common normalization method is to apply a normal distribution with mean zero and variance one; this approach worked well on the dataset used in this study. Herein, however, a normal-like distribution with a threshold on its parameters (mean and variance) was applied. These values were computed by applying EM to the zones and by attempting to minimize the following functions19:

min_𝜇 ‖f(x + 𝜇) p(x + 𝜇) − f(x) p(x)‖,   (3)

min_𝜎 ‖x − 𝜎‖ + 𝜎(1 − p)x.   (4)

Equation 3 minimizes the weighted difference between the approximated function f times its occurrence probability for a small bias value 𝜇. By minimizing this function, the stability criterion for each zone is computed so as to increase the sensitivity of this value to the locality axiom. Equation 4 minimizes the valid value of perturbation; in other words, the zone is forced to abide by a specific tolerance. This is useful because it provides a robust variance around 𝜇.

2.4 Resampling of dataset

One practical way to improve the resiliency of an NN model against unseen data is to resample the dataset.20 It is important to mention that resampling is not directly part of the trained model; it is categorized as a preprocessing step before training. Resampling can be categorized as up-sampling and down-sampling.20 Up-sampling increases the sampling rate by an integer factor; for example, up-sampling by a factor of 2 means filling in the missing value between two consecutive data points using interpolation, as shown in Figure 2. As such, up-sampling is a dataset augmentation process based on the locality principle.

FIGURE 2 Up-sampling by a factor of 2

As can be observed from Figure 2, the data dimensions are augmented by a coefficient, normally an integer. This coefficient is the scale of the augmentation; if the coefficient is 2, the data size is doubled and the empty generated positions are filled with average local values. Up-sampling can be performed in different forms, such as bilinear or bicubic. In this research, up-sampling was used because the dataset is quite small and needs to be augmented. It was implemented based on the distribution obtained in the normalization step; in other words, the more condensed the data, the more the samples are extended.
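As a concrete illustration of up-sampling by a factor of 2, the sketch below linearly interpolates a midpoint between consecutive samples. It is a generic example of the idea in Figure 2, not the authors' implementation; the sample values are made up.

```python
import numpy as np

def upsample_by_2(samples: np.ndarray) -> np.ndarray:
    """Up-sample a 1D series by a factor of 2 using linear interpolation.

    Each missing value between two consecutive points is filled with their
    average, which is what linear interpolation gives for the midpoint.
    """
    n = len(samples)
    original_idx = np.arange(n)                 # 0, 1, 2, ...
    upsampled_idx = np.arange(0, n - 0.5, 0.5)  # 0, 0.5, 1, 1.5, ...
    return np.interp(upsampled_idx, original_idx, samples)

productivity = np.array([1.2, 1.6, 1.4, 2.0])   # made-up productivity values
print(upsample_by_2(productivity))              # [1.2 1.4 1.6 1.5 1.4 1.7 2. ]
```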
Before moving to step five, it should be noted that steps two to four are iterative, as shown in Figure 1. The termination criteria are: the mean in step three should be zero and the variance in step three should be less than one; step two should be stable, that is, show no change over time; and step four should not affect steps two and three.20

2.5 RBF-I (pretrain)

As mentioned before, RBF is a neural network-based regression method. RBF maps the input data into a |x − 𝜇|/𝜎 space, where x is the given input and 𝜇 and 𝜎 are the mean and SD of the inputs. After the preceding steps are completed, the processed data are fed into the RBFNN.

FIGURE 3 RBF components of the developed model. RBF, radial basis function

FIGURE 4 ReLU activation function. ReLU, rectified linear unit

The output is not a single value but vectors of vectors, an operation called tensoring. Operation tensoring refers to an n-dimensional low-rank matrix derived from the RBF maxterm.20 The reason for applying RBF here is to pretrain the model, as the pretraining stage has been shown to give flexibility to the final model. The RBF pretraining stage is called RBF-I in this research; its input is the processed dataset and its output is the tensor vectors associated with that input. The RBF architecture used in this research is shown in Figure 3. The input variables are the enhanced data generated in the preprocessing phase. The covariance mapper contains a precomputed weight matrix associated with that input; it linearly multiplies the covariance matrix by the obtained parameters and passes the result to the cumulative module to compute the final output. In other words, the covariance mapper acts as a regularizer for the RBF-I parameters: it reduces the discrepancies between parameters and aligns them with meaningful behavior.21 The accumulator is a preactivation function applied before computing the final output:

Accumulator = ∏_i f_i C_i + 𝛿_i,   (5)

where C_i denotes the covariance values, f_i the input variables, and 𝛿_i a constant less than one; the proper value for our dataset was set to 𝛿_i = 0.41. The maxterm is the output layer, a linear combiner that maps the nonlinearity into a new space. The maxterm is a rectified linear unit (ReLU), shown in Figure 4, which serves as the activation function of the output layer:

ReLU(p) = max(0, p),   (6)

where p is the density distribution of the pooler. In Figure 4, the values along the x and y axes denote the predicted result(s) of the model before and after ReLU, respectively. The output of this step is used to extract distinguishable features of the dataset using two techniques, namely gradient descent (GD) algorithms and LU-decomposition. RBF-I outputs do not always fire; when they do, they are passed to the feature extraction block. All generated RBF-I outputs go through the LU-decomposition step, and because LU-decomposition plays a crucial role in the developed algorithm, the feature extraction output is passed to this block as well.

The main reason for implementing stacked RBFs is to improve the robustness of the proposed model despite its small dataset. In the big picture, the first RBF was created mainly to extract more local features from the raw data. The main motivation for the second RBF was to make the model robust against adversarial perturbations and force it to learn from the representational (feature) space. The last RBF was created to act as the final regressor for the represented features; it handles the regression process and produces the final results. This does not mean that the third RBF is the most important, but rather that its regression quality depends on the performance of the first two.
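For readers unfamiliar with RBF networks, the sketch below shows a bare-bones Gaussian RBF layer followed by a linear combiner and a ReLU output, roughly in the spirit of Figure 3 and Equations 5 and 6. It is a generic illustration, not the authors' stacked LU-RBF architecture; the centers, widths, and weights are made up.

```python
import numpy as np

def gaussian_rbf_layer(x, centers, sigma):
    """Gaussian RBF activations: exp(-||x - c||^2 / (2 sigma^2)) for each center c."""
    dists = np.linalg.norm(x[None, :] - centers, axis=1)
    return np.exp(-dists**2 / (2.0 * sigma**2))

def rbf_forward(x, centers, sigma, weights, bias):
    """Hidden Gaussian RBF layer, then a linear combiner, then ReLU(p) = max(0, p)."""
    hidden = gaussian_rbf_layer(x, centers, sigma)
    pre_activation = hidden @ weights + bias
    return np.maximum(0.0, pre_activation)

rng = np.random.default_rng(0)
x = rng.random(9)                 # one sample with nine features (eg, the Table 1 factors)
centers = rng.random((5, 9))      # five hypothetical RBF centers
weights = rng.random(5)
print(rbf_forward(x, centers, sigma=0.5, weights=weights, bias=0.1))
```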
2.6 GD algorithm for dataset feature extraction

To provide distinguishable RBF features, certain detectors need to be extracted from the given dataset. The detectors are the gradient directions for each row of the dataset; thus, a GD algorithm is applied to each row to indicate the direction of changes. The primary idea behind GD here is to update the RBF weights. GD is an optimization procedure that can be used with many ML algorithms; herein, it adjusts the RBFNN weights to minimize the cost function. Based on propagating the error backward, Equations 7 and 8 are used to compute GD:

W_i = W_j + ∇‖W_i − W_j‖²,   (7)

∇‖W_i − W_j‖² = ‖y_predicted − y_actual‖,   (8)

where W_i is the weight of the RBF hidden layer, y_predicted stands for the value generated by the RBF, and y_actual is the actual value in the given dataset. Using GD for training along with the Gaussian RBF resulted in a robust RBF model. In addition, the GD algorithm guarantees fast learning and reasonable generalization.22

2.7 LU-decomposition algorithm for dataset feature extraction

In pattern analysis, there is no one-size-fits-all feature extraction technique that can be applied to all possible cases and datasets; feature extraction techniques should be selected or developed based on the nature of the dataset. In this research, matrix LU-decomposition was selected to obtain more distinguishable features from the given dataset. LU-decomposition is mathematically very powerful for extracting features. LU-decomposition is a lower and upper triangular decomposition, which factorizes a given dataset into matrices L and U as given in Equation 9:

A = L · U  →  a_ii = Σ_{j=1}^{n} l_ij × u_ji,   (9)

where A is the square matrix of the given input, L is the lower triangular matrix, and U is the upper triangular matrix; l_ij, u_ij, and a_ij are the entries of L, U, and A, respectively. In other words, L has only zeros above the diagonal and U has only zeros below the diagonal (Figure 5).

FIGURE 5 LU decomposition

In this research, LU-decomposition was applied after passing through the first RBF module; the LU module input is the output model of the first RBF. This transformation is needed to make sure that all possible dependencies in the dataset are found. In addition to the LU-decomposition output model, the outputs of the feature extraction module and of the resampling step are needed. Feature extraction at this stage refers to extracting ranks from the RBF model; the matrix rank is, in general, the number of vectors needed to reconstruct the given input with high accuracy, and computing this rank is done by the feature extraction parts. The reason for adding this module is to remove redundant data. Finally, all three types of LU module inputs are used to extract the lower and upper triangular matrices; the output is the two matrices shown above. It should be noted that the diagonal elements of the developed L and U matrices are the features used in the developed model.23
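The following sketch shows one plausible reading of the LU feature extraction step: factorize a square matrix and keep the diagonals of L and U as a feature vector. It uses SciPy's general LU routine (which adds a pivoting permutation not discussed in the paper) and made-up data, so it is an assumption-laden illustration rather than the authors' code.

```python
import numpy as np
from scipy.linalg import lu

def lu_diagonal_features(a: np.ndarray) -> np.ndarray:
    """Factorize A (square) as P L U and return the diagonals of L and U as features.

    scipy.linalg.lu performs partial pivoting, so the permutation matrix P is a
    by-product not mentioned in the paper; the diagonal of L is all ones by
    construction, so most of the information ends up in diag(U).
    """
    p, l, u = lu(a)
    return np.concatenate([np.diag(l), np.diag(u)])

a = np.random.rand(4, 4)          # placeholder square input, eg, one 2D zone
features = lu_diagonal_features(a)
print(features.shape)             # (8,) -> 4 diagonal entries from L and 4 from U
```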
2.8 Patching steps

The feature vectors extracted from RBF-I are fed into a block known as patching. The reason for patching is subordering the features: subordering means ordering the feature vectors by their magnitude without considering their location. Subordering places more stress on features with higher magnitudes and orientations. Herein, the tangent (magnitude) refers to the feature vector orientation, which is implicitly affected by importing the magnitude of the feature vectors. The extracted features can be interpreted as representations of different clusters; these clusters should be as far apart as possible to ensure the discriminability of different inputs. The output of this clustering process is passed to different blocks, namely the proper kernel and RBF-II. In other words, patching is similar to clustering each data patch around a center, and the final outcome is several centers surrounded by patches. This process does not change the values of the data patches or their distribution, but it empirically helps the second RBF learn faster.

2.9 Radial basis function-II

The functionality of the second RBF has been outlined above. This module receives five inputs: the LU output, the processed input data, the resampling module output, the patching output, and the feature extraction module output. The second RBF handles the transition from the represented space (LU-decomposition output) and the enhanced data to resampling and patching. This guarantees the robustness of the model and its responsiveness to small perturbations around the input. As explained earlier, an RBF maps its inputs to at least one output value; in RBF-II, the five inputs mentioned above are mapped to a more robust feature set that represents them.

2.10 Finding the best distribution

This step is similar to the previous steps in that the distribution of the given dataset is used; it is employed to fine-tune the RBF-II parameters. Fine-tuning multiplies the singular values of the LU-decomposition by the RBF-II derived parameters using Equation 10:

RBF-II derived parameters = (S_i × P_i) + (s_i × p_i) + Bias_i,   (10)

where S_i is the singular value of the samples in LU, P_i the parameters of RBF-II, s_i the singular value of the subclass in LU, p_i the parameters of the subclass in RBF-II, and Bias_i the bias value of class i. The output of this step is used in the proper kernel selection. Finding the best data distribution is important because the RBFs work with the data distribution rather than the raw input; finding a better distribution guarantees a more accurate model.

2.11 Proper kernel

Different kernel functions, and different parameters of the same kernel, may affect the results of NN models. Clearly, the power of an RBF depends entirely on its kernel. Mathematically, the kernels are expressed as24:

Kernels = Σ_n Φ(x) a_n 𝜑(x(t | x(n))) + b,   (11)

where Φ(x) is the assumed initial RBF function, a_n is a Taylor series that approximates the continuity of the gradients, 𝜑(x(t | x(n))) is the kernel function that maps Φ(x) into a space of distinguishable features, and b is the bias. To calculate the proper kernel, a posteriori distribution can be used, implementing the Bayes theorem to predict and maximize the probability of occurrence of a sample. Alternatively, the K-nearest neighbor (K-NN) technique, a very simple algorithm used for clustering kernels and applicable to various problems, can be used. In conventional NN algorithms, a norm distance is often used; a K-NN algorithm relies on computing the inner products between mapped samples. The key point of the K-NN algorithm is choosing an appropriate kernel function and its parameters. In this research, a trial and error approach was applied to determine the kernel function and its parameters using K-NN; as a result, a circular and cylindrical 500-NN was utilized for the developed model. It should be noted that there are no rigorous theories to guide the selection of the best kernel function and its parameters, which is an existing limitation of most kernel-based algorithms.
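Because the kernel and its parameters are selected by trial and error, a simple way to illustrate that idea is a small grid search over the width of a Gaussian (RBF) kernel with cross-validation. The snippet below uses scikit-learn's kernel ridge regression as a stand-in regressor; it is only a generic illustration of kernel and parameter selection, not the authors' K-NN-based procedure, and the data are synthetic.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.random((200, 9))                                   # synthetic stand-in for the nine factors
y = X @ rng.random(9) + 0.1 * rng.standard_normal(200)     # synthetic productivity values

# Trial and error over the kernel width (gamma) and regularization strength (alpha).
search = GridSearchCV(
    KernelRidge(kernel="rbf"),
    param_grid={"gamma": [0.1, 0.5, 1.0, 2.0], "alpha": [0.01, 0.1, 1.0]},
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```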
2.12 Radial basis function-III

As stated above, the third RBF is the final regressor. This module receives the resampled data and the fine-tuned kernels. Even if it were eliminated, the final results would differ only slightly; in general, this final step is needed when there is clear uncertainty in the data. The first and second RBFs serve as preparation blocks for RBF-III, which produces the final model. The final model is then used to predict labor productivity.

2.13 Gaussian process

The Gaussian process (GP) was used mainly to drive the training sample rates toward an optimal value. GP here is the process of applying a Gaussian distribution and finding proper values of 𝜇 (mean) and 𝜎 (square root of variance). The output of the resampling process is used as the GP input. The reason for finding the distribution of the raw data is that noisy data contain both high and low variations; if the dataset can be described by at least a series of its distributions, the most useful information can be extracted. GP was selected because of its distinguishing property of inherently resisting noise, including speckle and rough distortion. When small changes in the model are allowed but its general behavior must not change, the probability mass function needs to remain stable, and the GP handles this.25 This function is the bare essential of a balanced system.

2.14 Unmapped SBS

The unmapped single base substitutions (SBS) block receives three inputs: the resampling output, the third RBF, and the kernel module. This module retransforms the mapped represented space back to Euclidean space by finding another MLP model from the resampling model and the represented one; from this step on, visualization becomes possible. In other words, before reaching the unmapped SBS block, all values live in a non-Euclidean space that cannot readily be interpreted; this block translates these parameters to Euclidean space, which makes meaningful interpretation possible.

2.15 Organizer and wrapper

There is no condition on the organizer-wrapper loop shown in Figure 1. If the GP output activates the organizer, the wrapper stacks up all the organized weights; otherwise, the organizer sorts the shared weights. As only one of the two blocks operates in each time step, the loop sets the timing counter and there is no back-and-forth operation. For the organizer block to be activated, the following condition needs to be met:

magnitude(Π_{i=1}^{#samples} P(𝜃_i | Counter_i)) > (1 − Sqr(Π_{i=1}^{#samples} P(𝜃_i | Counter_i))) − 𝜆 Π_{i=1}^{#samples} P(𝜃_i t_i | 𝜃_{i=1..n}) · P(𝜃_{i=1..n} | 𝜃_i t_i),

where P is the probability of a sample, 𝜃 is the parameters, 𝜆 is a scaler, Counter_i is the counter for sample i, and Sqr is the square root. In other words, the GP works as a switch that turns the organizer block on or off. If the conditioned probability of all the parameters with respect to the prior information is higher than the shared-parameter probability, the organizer assumes there is no need to stack weights, and the obtained parameters are sorted by their probabilities.
The wrapper is the second-to-last step of the model before the final regression model. The inputs of this module are the organizer block, which is a weight matrix, and the output model of the GP block. To make the model more distributed, a back-and-forth procedure from organizer to wrapper can be applied; this is a finite iteration process of at most 10 passes.

3 DEVELOPED LU-RBFNN SOFTWARE

The LU-RBFNN processes were used to build standalone software in MATLAB 2017a, shown in Figure 6. The interface has different sections, which are discussed in detail below.

1. Clicking the "Load Data" button opens a new window that lets the user provide the address of the input file. This module is flexible, accepting different input formats, from text files to XML, Excel files, and so on.

2. The "Load Enhancing Parameters" button reads a file from a physical address, local or remote, given the proper IP. The values in this file depend entirely on the input; if the input file changes, the enhancing parameters file must be updated to the correct values as well. Computing the enhancing parameters is not straightforward. The input variables in the given dataset are noisy, meaning that the histogram distribution is not consistent with a normal distribution. This is assessed by regressing the histogram distribution of the given input against a Gaussian distribution with mean zero and an SD of about 0.5; based on the regressed curve, an adjuster matrix is found that clips the given noisy input file into one that is as normal as possible.

FIGURE 6 LU-RBFNN graphical user interface. RBFNN, radial basis function neural network

FIGURE 7 Sample of the distributions of different kernels used in LU-RBFNN. RBFNN, radial basis function neural network

3. The "Run RBF" button executes the RBF module to predict a value. Herein, a second-order nonlinear kernel is used for the RBF. Technically, the kernel is a mapped RBF space, as it receives the distribution of the data as input instead of the noisy dataset; in fact, the strength of the RBF is that it works on a mapped dataset. The RBF can be run in two different modes, continuous and discrete. The main motivation for embedding these two modes was to give the RBF maximum flexibility, as it can work with both continuous and discrete data types. This option is provided by a drop-down menu located next to the "Run RBF" button. The distribution result depicts the distributions of the different kernels used in the LU-RBFNN. Overall, there are five distributions: empirical, generalized Pareto, generalized extreme value, location scale, and exponential. As stated earlier, these kernels are helpful for extracting information from the dataset. As shown in the option pane, the exponential distribution fitted to the dataset as a nonlinear kernel has the widest confidence interval and is fairly convex; this convexity helps the RBF to be as smooth as possible and ultimately produces a robust model, as shown in Figure 7.

4. "Load GP Parameters" is located at the top right of the application. Clicking it prompts the user to input the address of a text file named "Gaussian process parameters." This file has six items for computing the GP, which is computed from the given input dataset.
5. "Load Linear Discriminative Hermite" is an address-oriented button for importing a text file named "Disc_hermite" with four adjusting parameters for the Hermite function. The details of this mathematical theorem are beyond the scope of this section and are explained in the Appendix.

6. "Load Nonlinear Discriminative Hermite" enables the user to import a file that contains relative values for biasing the nonlinear version of the Hermite function. The advantage of this module is that it provides the peripheral sublinear functions to approximate the overall kernel. The file associated with this module is called "Nonlinear Discriminative Parameters.txt."

7. "Upward Index Matching and Slippability" is a process used to define the controlling margins of the model, thus helping the model to be slippable. Generally, the size of this module is twice the number of columns of a given input, but not all of them are required for the RBF process; therefore, the size of the distortion map can be minimized by running an upward method. For the current dataset, this module only needs a 3 × 3 matrix called "Slippability criteria."

8. The "Load Covariance Organizer" module is similar to the modules above in that it reads from a specific address. Its function is to bias the covariance of the RBF model. The RBF model is a weight matrix; organizing this matrix by its covariance improves the result. Note that the loaded file is input dependent: if the input changes, this file should be updated. This module is optional for running the RBF.

FIGURE 8 Sample of output during training and testing

9. "Run Organizer" runs a specific RBF by changing the odd columns of the RBF kernel. This module uses a fifth-order nonlinear kernel with regularization, that is, extra parameters are added to gain better control over the given input.

10. "Track Builder" creates an object file from the given model for pretraining, as an active way of optimizing computational complexity in terms of storage and execution. Clicking this button runs the RBF with a pretrained weight matrix. This matrix is normalized to [0, 0.6], a range determined by trial and error. The kernel of this builder is the same as that of "Run Organizer" explained in the previous item.

11. In addition to the nonlinear kernel, "Run Associative" uses a rule-based scheme to enhance learning performance. This scheme rests on a hypothesis implemented through algebraic facts based on the LU-decomposition.

12. The "Flattener" module flattens the mapped vectors of the L matrix explained above. Running this module automatically applies the wrapper module as well; the difference between the two is that the flattener smooths the L basis matrices. The basis matrices are the tensor products of the columns of matrix U and the rows of matrix L.

13. "Run Wrapper" is the mapping process from Cartesian space to the L space of the LU-decomposition. The motivation for using the matrix L is that it is an informative matrix, both for extracting the structural features used as a feature vector and for use in RBF training. This module has no kernel and is based solely on the mapped L space.

14. "Softmax" is a function aimed at producing a relative prediction instead of an exact one; it is a vector of probabilities over all the trained categories.
The kernel associated with this module is the approximate radius, whose value is the mean of the Softmax probability vector.

15. "Report Graphs" produces several graphs, namely the root-mean-squared error (RMSE), mean absolute error (MAE), R-squared, and mean-squared error (MSE) for the training and testing phases, as shown in Figure 8.

16. The "Live Network Demo" module opens another application, a graphical user interface that lets the user set the layers of the RBF with different sorts of input and various types of output, for regression or classification.

17. After training the RBF model, it should be tested on unseen input. This input can be provided via the "Custom Data" button located next to the "Test Probability" button.

4 DATA COLLECTION

Over an 18-month observation period, Khan26 performed field observations and collected data from two high-rise buildings located in downtown Montreal, Canada. The first building was the Concordia University Engineering and Visual Arts (EV) building, a 17-floor integrated educational complex. The EV building is a reinforced concrete, mainly flat-slab structure with several typical levels and a surface area of 68 000 m²; the project was constructed over 3 years. The second building, a 16-floor flat-slab apartment building also located in downtown Montreal, has a similar structural system. Over the 18 months, 221 data points were collected from both projects for formwork activity. The two datasets were combined because the work types were the same, and their combination formed a larger, more comprehensive dataset for model development. This dataset has also been used in Golnaraghi et al.27

Data on nine factors, classified into three major categories, were available for the task, as shown in Table 1. The dataset was randomly divided into 80% for training and 20% for testing the model. Khan26 stated that these factors were selected because they can cause short-term, day-to-day variations in productivity and because they are the factors most commonly encountered on construction sites.26 Short-term influence means that these factors take different values every day. As can be observed from Table 1, three variables in the dataset are qualitative and had to be represented numerically to be included in the developed models. Precipitation is encoded with four numerical values, 0 to 3, assigned to no precipitation, light rain, snow, and rain, respectively. For the purposes of analysis, slabs, walls, and columns are the types of work, coded 1 to 3, respectively. Furthermore, two techniques, built-in-place and flying forms, were used at both sites and were coded as 1 and 2, respectively. Table 2 shows the descriptive statistics of the collected data, which provide a summary of the dataset.

TABLE 1 Labor productivity factors26

Weather         Crew                Project
Temperature     Gang size           Work type
Humidity        Labor percentage    Floor level
Wind speed      Work method
Precipitation

TABLE 2 Collected data descriptive statistics

Variable           Mean    SE Mean   StDev   Min.    Median   Max.
Temperature        4.08    0.81      12.03   −26     3        25
Humidity           66.34   1.05      15.67   18      67       97
Precipitation      0.28    0.04      0.6     0       0        3
Wind speed         15.42   0.57      8.46    3       14       43
Gang size          16.03   0.34      5.07    8       18       24
Labor percentage   35.49   0.26      3.79    29      36       47
Work type          1.43    0.03      0.51    1       1        3
Floor level        11.38   0.25      3.75    1       12       17
Work method        1.44    0.03      0.5     1       1        2
Productivity       1.57    0.02      0.35    0.82    1.51     2.53
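A small sketch of this preprocessing, encoding the three qualitative variables as described above and drawing a random 80/20 train/test split, is shown below. The column names and example rows are hypothetical; only the coding scheme and the split ratio come from the text.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical raw records; only the coding scheme and the 80/20 split follow the paper.
records = pd.DataFrame({
    "temperature": [4, -2], "humidity": [66, 71], "wind_speed": [15, 10],
    "gang_size": [16, 18], "labor_percentage": [35, 36], "floor_level": [11, 3],
    "precipitation": ["light rain", "none"], "work_type": ["slab", "wall"],
    "work_method": ["built-in-place", "flying forms"],
    "productivity": [1.57, 1.42],
})

codes = {
    "precipitation": {"none": 0, "light rain": 1, "snow": 2, "rain": 3},
    "work_type": {"slab": 1, "wall": 2, "column": 3},
    "work_method": {"built-in-place": 1, "flying forms": 2},
}
for col, mapping in codes.items():
    records[col] = records[col].map(mapping)

X = records.drop(columns="productivity")
y = records["productivity"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```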
5 RESULTS AND DISCUSSION

The result table summarizes detailed information about the different RBF runs. These runs include a simple RBF that uses a second-order kernel function, an RBF that utilizes LU-decomposition, an RBF with LU-decomposition and the GP, and finally the RBF with LU-decomposition powered by the GP with EM. This table also includes statistical measurements such as R-squared, MSE, RMSE, and MAE for the test data. In addition, the parameters associated with the kernel used are listed in the last two rows of Table 3.

TABLE 3 Result table sample

                   Simple RBF           RBF by LU                    LU-RBF by GP        LU-GP-RBF by EM
Parameters         Generalized Pareto   Generalized extreme value    T location scale    Exponential
R-squared          0.8470               0.9079                       0.9158              0.9201
MSE                0.0299               0.0280                       0.0261              0.0200
RMSE               0.1731               0.1674                       0.1617              0.1415
MAE                0.0397               0.0422                       0.0391              0.0418
Parameter names    K, sigma, theta      K, sigma, mu                 Mu, sigma, nu       Mu
Parameter values   1.8292               2.2353                       9.2724              150.4623

Abbreviations: EM, expectation maximization; GP, Gaussian process; MAE, mean absolute error; MSE, mean-squared error; RBF, radial basis function; RMSE, root-mean-squared error.

FIGURE 9 R2 comparison of developed AI-based models. AI, artificial intelligence

FIGURE 10 MSE comparison of AI-based models. AI, artificial intelligence; MSE, mean-squared error

It can be observed that LU-RBF has the lowest MAE, MSE, and RMSE in comparison with the other runs; therefore, this variant was selected as the developed model. After selecting the LU-RBF, the different RBF runs were compared with existing models, including RBF, the generalized regression neural network (GRNN), ANN, and the adaptive neuro-fuzzy inference system (ANFIS). Figures 9 and 10 show the comparisons of the AI-based models developed and discussed in this article. As can be observed from Figure 9, the LU-RBFNN shows superior performance in both the training and testing phases. Another substantial remark is that the error of the developed model is very similar in the training and testing phases, demonstrating that the developed model generalizes well. The model avoids both underfitting the training data, which corresponds to high statistical bias, and overfitting it, which corresponds to high statistical variance. Although ANFIS shows considerable generalization, it has lower accuracy than the LU-RBFNN. Thus, it can be concluded that the LU-RBF model performs better in predicting labor productivity for the given dataset. The best model was selected based on the statistical performance indicators for the testing phase, which is LU-RBF; the selected model achieved the values summarized in Table 4.

TABLE 4 Statistical performance indicators of the selected LU-RBF model

Performance indicator   Value
R2 test                 0.9201
R2 train                0.9601
R2                      0.9902
RMSE                    0.1415
MSE                     0.0419

Abbreviations: MSE, mean-squared error; RMSE, root-mean-squared error.
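The statistical indicators reported above (R-squared, MSE, RMSE, and MAE) can be computed as in the short sketch below. This is a generic illustration of the metrics, not the authors' evaluation script; the predicted and actual values are made up.

```python
import numpy as np

def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """R-squared, MSE, RMSE, and MAE for a set of test predictions."""
    residuals = y_true - y_pred
    mse = np.mean(residuals**2)
    ss_total = np.sum((y_true - y_true.mean())**2)
    return {
        "R2": 1.0 - np.sum(residuals**2) / ss_total,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(residuals)),
    }

y_actual = np.array([1.57, 1.42, 2.01, 0.95])      # made-up productivity values
y_predicted = np.array([1.60, 1.38, 1.90, 1.05])
print(regression_metrics(y_actual, y_predicted))
```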
6 CONCLUSIONS

In this article, construction labor productivity modeling for formwork installation is developed using an RBFNN, improved by applying four data processing techniques for enhancing the given dataset: zoning, EM, normalization, and resampling of the raw data. The dataset for the two high-rise buildings was adopted from Khan's26 study of productivity modeling. RBFNN responds well to patterns not used in training, compared with other neural network techniques. RBFNN can also improve the permanency of the developed model, as it shows resilient noise tolerance for the given data; this is an advantage over ANN, ANFIS, and GRNN, since the presence of noise in a dataset is a common problem that may lead to a less reliable model. RBFNN also requires less computational time (in both training and testing) than other techniques because of its simple topological structure. The developed model improves the prediction accuracy of labor productivity and shows reliable generalization performance, as its training and testing errors are close to each other. The developed model benefits from LU-decomposition for extracting important features of the dataset, which is very useful for generalization. This model could be used to validate calculated productivity and the assumptions used in that process, which is beneficial for all project parties. The developed model can assist in providing a better understanding and more realistic expectations of construction labor productivity in planning and scheduling, as well as in quantifying loss of productivity.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest regarding the publication of this article.

ORCID
Zahra Zangenehmadar https://orcid.org/0000-0002-9508-4440

REFERENCES
1. Robert C. Machine Learning, a Probabilistic Perspective. Taylor and Francis; 2014;27(2):62-63. doi:10.1080/09332480.2014.914768
2. Maren AJ, Harston CT, Pap RM. Handbook of Neural Computing Applications. San Diego, CA: Academic Press; 2014.
3. Jain T, Singh SN, Srivastava SC. Fast static available transfer capability determination using radial basis function neural network. Appl Soft Comput. 2011;11(2):2756-2764.
4. Lu M, AbouRizk S, Hermann U. Estimating labor productivity using probability inference neural network. J Comput Civil Eng. 2000;14(4):241-248.
5. Ok S, Sinha S. Construction equipment productivity estimation using artificial neural network model. Constr Manag Econ. 2006;24(10):1029-1044.
6. Ezeldin A, Sharara L. Neural networks for estimating the productivity of concreting activities. J Constr Eng Manage. 2006;132(6):650-656.
7. Lau SC, Lu M, Ariaratnam ST. Applying radial basis function neural networks to estimate next-cycle production rates in tunneling construction. Tunneling Underground Space Technol. 2010;25(4):357-365.
8. Oral EL, Oral M. Predicting construction crew productivity by using self organizing maps. Autom Constr. 2010;19:791-797.
9. Muqeem S, Idrus M, Khamidi F. Construction labor production rates modeling using artificial neural network. J Inf Technol Constr. 2011;16:713-725.
10. Mohammed S, Tofan A. Neural networks for estimating the ceramic productivity of walls. J Eng. 2011;17(2):200-217.
11. AL-Zwainy FMS, Rasheed HA, Ibraheem HF. Development of the construction productivity estimation model using artificial neural network for finishing works for floors with marble. ARPN J Eng Appl Sci. 2012;7(6):714-722.
12. Moselhi O, Khan Z. Significance ranking of parameters impacting construction labor productivity. Constr Innovation. 2012;12(3):272-296.
13. Heravi G, Eslamdoost E. Applying artificial neural networks for measuring and predicting construction-labor productivity. J Constr Eng Manage. 2015;141(10):04015032.
14. Aswed G. Productivity estimation model for bricklayer in construction projects using neural network. Al-Qadisiyah J Eng Sci. 2016;9(2):183-199.
15. El-Gohary KM, Aziz RF, Abdel-Khalek HA. Engineering approach using ANN to improve and predict construction labor productivity under different influences. J Constr Eng Manage. 2017;143(8):04017045.
16. Harris PM, Ventura SJ. The integration of geographic data with remotely sensed imagery to improve classification in an urban area. Photogramm Eng Remote Sens. 1995;61(8):993-998.
17. Gupta MR, Chen Y. Theory and Use of the EM Algorithm. Vol 4. Boston, MA: Foundations and Trends in Signal Processing; 2011:223-296.
18. Langari R, Wang L, Yen J. Radial basis function networks, regression weights, and the expectation-maximization algorithm. IEEE Trans Syst Man Cybern A Syst Hum. 1997;27(5):613-623.
19. Roweis S, Ghahramani Z. Learning nonlinear dynamical systems using the expectation-maximization algorithm. In: Kalman Filtering and Neural Networks. London: Wiley Online Library; 2001;6:175-220.
20. Oppenheim AV. MIT OpenCourseWare - lecture 5: sampling rate conversion. 2006. https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-341-discrete-time-signal-processing-fall-2005/lecture-notes/
21. Frasinski LJ, Codling K, Hatherly PA. Covariance mapping: a correlation method applied to multiphoton multiple ionization. Science. 1989;246(4933):1029-1031.
22. Karayiannis NB. Gradient descent learning of radial basis neural networks. Paper presented at: International Conference on Neural Networks; Vol. 3; 1997:1815-1820.
23. Lu Y, Michaels JE. Feature extraction and sensor fusion for ultrasonic structural health monitoring under changing environmental conditions. IEEE Sensors J. 2009;9(11):1462-1471.
24. Scholkopf B, Sung KK, Burges CJ, et al. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans Signal Process. 1997;45(11):2758-2765.
25. Koponen I. Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process. Phys Rev E. 1995;52(1):1197-1199.
26. Khan ZU. Modeling and Parameter Ranking of Construction Labor Productivity [doctoral dissertation]. Concordia University; 2005. https://spectrum.library.concordia.ca
27. Golnaraghi S, Zangenehmadar Z, Moselhi O, Alkass S. Application of artificial neural network(s) in predicting formwork labour productivity. Adv Civil Eng. 2019;2019:5972620.

How to cite this article: Golnaraghi S, Moselhi O, Alkass S, Zangenehmadar Z. Predicting construction labor productivity using lower upper decomposition radial base function neural network. Engineering Reports. 2020;2:e12107. https://doi.org/10.1002/eng2.12107