Abstract
Clarifying the causal relationships among process parameters is a core issue in data-driven production performance analysis and product quality optimization. The difficulty lies in accurately measuring associations in complex manufacturing systems and in distinguishing direct associations from indirect ones. In this work, a nonparametric-copula-entropy and network deconvolution method is proposed for causal discovery in complex manufacturing systems. First, based on copula theory and kernel density estimation, the nonparametric-copula-entropy is introduced to improve the accuracy of association measurement between parameters, and its superiority is verified by comparison with other association measurement methods. Then, the global association matrix is constructed from the nonparametric-copula-entropy, and network deconvolution is employed to extract the direct association information from the global association matrix. The proposed method is tested on an open gene expression dataset. Finally, as an experimental application, the proposed method is used to carry out a causal analysis of a diesel engine production line. The results show that the proposed method reveals the causal relationships between process parameters and quality parameters in the diesel engine production line well, which provides theoretical guidance and an implementation approach for the optimal control of complex manufacturing systems.
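The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: marginals are handled by a rank transform rather than a kernel distribution estimate, the copula density is estimated with a Gaussian product kernel after a probit transform, the bandwidth follows an n^(-1/6) rule of thumb, entropies use the natural logarithm, and the deconvolution step uses the commonly quoted eigenvalue form with an optional rescaling parameter `beta`. All function names are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def pseudo_observations(data):
    """Rank-transform each column to (0, 1); a stand-in for u_i = F_i(x_i)."""
    n = data.shape[0]
    ranks = np.argsort(np.argsort(data, axis=0), axis=0) + 1
    return ranks / (n + 1.0)


def copula_entropy_pair(x, y, bandwidth=None):
    """Kernel estimate of the copula entropy of (x, y), in nats."""
    u = pseudo_observations(np.column_stack([x, y]))
    # Probit transform (assumption): keeps kernel mass away from the
    # boundaries of the unit square, where a plain KDE is biased.
    z = np.vectorize(NormalDist().inv_cdf)(u)
    n = z.shape[0]
    h = bandwidth if bandwidth is not None else n ** (-1.0 / 6.0)
    diff = (z[:, None, :] - z[None, :, :]) / h        # shape (n, n, 2)
    k = np.exp(-0.5 * diff**2) / (np.sqrt(2.0 * np.pi) * h)
    f_joint = (k[:, :, 0] * k[:, :, 1]).mean(axis=1)  # product-kernel joint density
    f_marg = k.mean(axis=1)                           # shape (n, 2), marginal densities
    # The ratio joint / product of marginals is the copula density at each sample.
    log_c = np.log(f_joint) - np.log(f_marg).sum(axis=1)
    return -log_c.mean()


def association_matrix(data):
    """Pairwise association values (negative copula entropy, clipped at 0)."""
    n_vars = data.shape[1]
    g = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            value = -copula_entropy_pair(data[:, i], data[:, j])
            g[i, j] = g[j, i] = max(value, 0.0)
    return g


def network_deconvolution(g_obs, beta=0.9, rescale=True):
    """Map eigenvalues lam -> lam / (m + lam) to damp indirect paths."""
    g = 0.5 * (g_obs + g_obs.T)                       # enforce symmetry
    lam, vec = np.linalg.eigh(g)
    m = 1.0
    if rescale:
        # Choose m so that every deconvolved eigenvalue has magnitude <= beta.
        lam_pos = max(lam.max(), 0.0)
        lam_neg = max(-lam.min(), 0.0)
        m = max(lam_pos * (1.0 - beta) / beta, lam_neg * (1.0 + beta) / beta)
        if m <= 0.0:
            return np.zeros_like(g)
    lam_dir = lam / (m + lam)
    return (vec * lam_dir) @ vec.T
```

A typical call would be `g = association_matrix(samples)` followed by `g_dir = network_deconvolution(g)`, where `samples` is an array with one column per parameter. With `rescale=False` the function reduces to G_dir = G(I + G)^(-1), which is only well defined when every eigenvalue of G is greater than -1.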
Abbreviations
- a_1(x_i), a_2(x_i): Parameters of the multivariate kernel function K(·)
- B: Base of the logarithmic functions
- c(u_1, u_2, …, u_N): Probability density function corresponding to the copula function
- C(u_1, u_2, …, u_N): Copula function
- f(x_1, x_2, …, x_N): Joint probability density function of the N random variables X_1, X_2, …, X_N
- f_1(x_1), f_2(x_2), …, f_N(x_N): Marginal probability density functions of the N random variables X_1, X_2, …, X_N
- F(x_1, x_2, …, x_N): Joint distribution function of the N random variables X_1, X_2, …, X_N
- F_1(x_1), F_2(x_2), …, F_N(x_N): Marginal distribution functions of the N random variables X_1, X_2, …, X_N
- G(·): Distribution function of the univariate kernel function
- G: Global association matrix
- G_dir: Direct association matrix
- h: Bandwidth of the univariate kernel density estimation
- h_i: Bandwidth of the multivariate kernel density estimation for the ith random variable
- I: Identity matrix
- k(·): Univariate kernel function
- K(·): Multivariate kernel function
- MI(X_1, X_2, …, X_N): Mutual information of the multi-dimensional random variables X_1, X_2, …, X_N
- N: Number of random variables
- n: Rotational speed
- p(x_i): Subfunctions of the multivariate kernel function K(·)
- P: Power
- T: Torque
- u_i: Independent variable of the copula function, u_i = F_i(x_i)
- U_i: The ith random variable, subject to a uniform distribution
- U: Vector form of [u_1, u_2, …, u_N]
- x_i: Observed values of the ith random variable
- X_i: The ith random variable
- AMV: Association measurement values
- AUC: Area under the ROC curve
- CE: Copula entropy
- FN: Number of false negatives
- FP: Number of false positives
- FPR: False positive rate
- KDE: Kernel density estimation
- MES: Manufacturing execution system
- NCE: Nonparametric copula entropy
- ND: Network deconvolution
- OLE: Object linking and embedding
- PDF: Probability density function
- QAS: Quality assurance system
- RMSE: Root mean square error
- ROC: Receiver operating characteristic
- TN: Number of true negatives
- TP: Number of true positives
- TPR: True positive rate
- WS: Work station
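For orientation, the standard textbook relations that tie several of the listed symbols together can be written as follows. These are the usual definitions from copula theory, information theory, network deconvolution, and ROC analysis; the article's own equations are not reproduced on this page, so the exact form and the symbol H_c for copula entropy are assumptions.

```latex
% Sklar's theorem: the joint distribution expressed through the copula
F(x_1, x_2, \ldots, x_N) = C\bigl(F_1(x_1), F_2(x_2), \ldots, F_N(x_N)\bigr)
                         = C(u_1, u_2, \ldots, u_N), \qquad u_i = F_i(x_i)

% Copula density as the ratio of the joint PDF to the product of the marginals
c(u_1, u_2, \ldots, u_N)
  = \frac{\partial^N C(u_1, u_2, \ldots, u_N)}{\partial u_1 \, \partial u_2 \cdots \partial u_N}
  = \frac{f(x_1, x_2, \ldots, x_N)}{\prod_{i=1}^{N} f_i(x_i)}

% Copula entropy (H_c is an assumed symbol) and its link to mutual information
H_c(\mathbf{U}) = -\int_{[0,1]^N} c(\mathbf{U}) \, \log_B c(\mathbf{U}) \, \mathrm{d}\mathbf{U},
\qquad
MI(X_1, X_2, \ldots, X_N) = -H_c(\mathbf{U})

% Closed form commonly used for network deconvolution
\mathbf{G}_{dir} = \mathbf{G} \, (\mathbf{I} + \mathbf{G})^{-1}

% Rates used to draw the ROC curve
TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}
```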
Acknowledgements
This project is supported by the Shanghai Aerospace Science and Technology Innovation Fund (No. SAST2016048) and the National Natural Science Foundation (Grant Nos. 51435009 and 51775348).
Cite this article
Sun, Y., Qin, W. & Zhuang, Z. Nonparametric-copula-entropy and network deconvolution method for causal discovery in complex manufacturing systems. J Intell Manuf 33, 1699–1713 (2022). https://doi.org/10.1007/s10845-021-01751-w
DOI: https://doi.org/10.1007/s10845-021-01751-w