Gaussian kernel with correlated variables for incomplete data

Abstract

Missing components in incomplete instances prevent a kernel-based model from using the partially observed components of those instances to compute kernels, including the Gaussian kernels that are used extensively in machine learning models and applications. Existing methods for handling incomplete data with Gaussian kernels, however, assume independence among variables. In this study, we propose a new method, the expected Gaussian kernel with correlated variables, that estimates the Gaussian kernel from incomplete data by accounting for the correlation among variables. In the proposed method, the squared distance between two instance vectors is modeled as the sum of correlated squared unit-dimensional distances between the instances, and the Gaussian kernel with missing values is obtained by estimating the expected Gaussian kernel function under the probability distribution of the squared distance between the vectors. The proposed method is evaluated on synthetic data and on real-life data from benchmarks and from a multi-pattern photolithographic process for wafer fabrication in semiconductor manufacturing. The experimental results show that the proposed method improves the estimation of Gaussian kernels for incomplete data with correlated variables.

References

Alvarez, M. A., Rosasco, L., & Lawrence, N. D. (2012). Kernels for vector-valued functions: A review. Foundations and Trends® in Machine Learning, 4(3), 195–266.
Andridge, R. R., & Little, R. J. (2010). A review of hot deck imputation for survey non-response. International Statistical Review, 78(1), 40–64.
Bae, J., & Park, J. (2020). Count-based change point detection via multi-output log-Gaussian Cox processes. IISE Transactions, 52(9), 998–1013.
Belanche, L. A., Kobayashi, V., & Aluja, T. (2014). Handling missing values in kernel methods with application to microbiology data. Neurocomputing, 141, 110–116.
Cai, J. F., Candès, E. J., & Shen, Z. (2010). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.
Choi, J., Son, Y., & Jeong, M. K. (2021). Restricted relevance vector machine for missing data and application to virtual metrology. IEEE Transactions on Automation Science and Engineering, 19(4), 3172–3183.
Cotton, C. (1991). Functional description of the generalized edit and imputation system. Statistics Canada, Business Survey Methods Division, 59, 447–461.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273–297.
Covo, S., & Elalouf, A. (2014). A novel single-gamma approximation to the sum of independent gamma variables, and a generalization to infinitely divisible distributions. Electronic Journal of Statistics, 8(1), 894–926.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (methodological), 39(1), 1–38.
Di Maio, F., Tsui, K. L., & Zio, E. (2012). Combining relevance vector machines and exponential regression for bearing residual life estimation. Mechanical Systems and Signal Processing, 31, 405–427.
Eirola, E., Doquire, G., Verleysen, M., & Lendasse, A. (2013). Distance estimation in numerical data sets with missing values. Information Sciences, 240, 115–128.
Eirola, E., Lendasse, A., Vandewalle, V., & Biernacki, C. (2014). Mixture of Gaussians for distance estimation with missing data. Neurocomputing, 131, 32–42.
Feng, Y., Wen, M., Zhang, J., Ji, F., & Ning, G. X. (2016). Sum of arbitrarily correlated Gamma random variables with unequal parameters and its application in wireless communications. In 2016 international conference on computing, networking and communications (ICNC) (pp. 1–5).
Gazzola, G., Choi, J., Kwak, D. S., Kim, B., Kim, D. M., Tong, S. H., & Jeong, M. K. (2018). Integrated variable importance assessment in multi-stage processes. IEEE Transactions on Semiconductor Manufacturing, 31(3), 343–355.
He, S., Xiao, L., Wang, Y., Liu, X., Yang, C., Lu, J., Gui, W., & Sun, Y. (2017). A novel fault diagnosis method based on optimal relevance vector machine. Neurocomputing, 267, 651–663.
Hofmann, T., Schölkopf, B., & Smola, A. J. (2008). Kernel methods in machine learning. The Annals of Statistics, 36(3), 1171–1220.
Huang, K., Wen, H., Yang, C., Gui, W., & Hu, S. (2021). Outlier detection for process monitoring in industrial cyber-physical systems. IEEE Transactions on Automation Science and Engineering.
Hwang, S., Jeong, M. K., & Yum, B. J. (2014). Robust relevance vector machine with variational inference for improving virtual metrology accuracy. IEEE Transactions on Semiconductor Manufacturing, 27(1), 83–94.
Jia, S., Ma, B., Guo, W., & Li, Z. S. (2021). A sample entropy based prognostics method for lithium-ion batteries using relevance vector machine. Journal of Manufacturing Systems, 61, 773–781.
Johnson, N. L., Kotz, S., & Balakrishnan, N. (1970). Continuous univariate distributions. Houghton Mifflin.
Jurado, S., Nebot, À., Mugica, F., & Mihaylov, M. (2017). Fuzzy inductive reasoning forecasting strategies able to cope with missing data: A smart grid application. Applied Soft Computing, 51, 225–238.
Kim, B., Jeong, Y. S., & Jeong, M. K. (2021). New multivariate kernel density estimator for uncertain data classification. Annals of Operations Research, 303(1), 413–431.
Kim, J. K., & Fuller, W. (2004). Fractional hot deck imputation. Biometrika, 91(3), 559–578.
Lichman, M. (2013). UCI Machine Learning Repository. University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml
Lin, T. H. (2010). A comparison of multiple imputation with EM algorithm and MCMC method for quality of life missing data. Quality & Quantity, 44, 277–287.
Little, R. J. (1992). Regression with missing X’s: A review. Journal of the American Statistical Association, 87(420), 1227–1237.
Little, R. J., & Rubin, D. B. (2020). Statistical analysis with missing data. John Wiley and Sons.
Meng, X. L., & Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika, 80(2), 267–278.
Mesquita, D. P., Gomes, J. P., Corona, F., Junior, A. H. S., & Nobre, J. S. (2019). Gaussian kernels for incomplete data. Applied Soft Computing, 77, 356–365.
Mesquita, D. P., Gomes, J. P., Junior, A. H. S., & Nobre, J. S. (2017). Euclidean distance estimation in incomplete datasets. Neurocomputing, 248, 11–18.
Nakagami, M. (1960). The m-distribution—A general formula of intensity distribution of rapid fading. In Statistical methods in radio wave propagation (pp. 3–36).
Nebot-Troyano, G., & Belanche-Muñoz, L. A. (2009). A kernel extension to handle missing data. In Research and development in intelligent systems XXVI: incorporating applications and innovations in intelligent systems XVII (pp. 165–178). Springer.
Nguyen, T. T., & Tsoy, Y. (2017). A kernel PLS based classification method with missing data handling. Statistical Papers, 58(1), 211–225.
Pelckmans, K., De Brabanter, J., Suykens, J. A., & De Moor, B. (2005). Handling missing values in support vector machine classifiers. Neural Networks, 18(5), 684–692.
Piccialli, V., & Sciandrone, M. (2022). Nonlinear optimization and support vector machines. Annals of Operations Research, 314(1), 15–47.
Genton, M. G. (Ed.). (2004). Skew-elliptical distributions and their applications: A journey beyond normality. CRC Press.
Roberts, C., & Geisser, S. (1966). A necessary and sufficient condition for the square of a random variable to be gamma. Biometrika, 53(1/2), 275–278.
Rubin, D. B. (1978). Multiple imputations in sample surveys-a phenomenological Bayesian approach to nonresponse. In Proceedings of the survey research methods section of the American Statistical Association (Vol. 1, pp. 20–34). American Statistical Association.
Sande, I. G. (1983). Hot-deck imputation procedures. Incomplete Data in Sample Surveys, 3, 339–349.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. CRC Press.
Schölkopf, B., Smola, A. J., & Bach, F. (2002). Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press.
Sexton, J., & Swensen, A. R. (2000). ECM algorithms that converge at the rate of EM. Biometrika, 87(3), 651–662.
Shahzad, U., Sengupta, T., Rao, A., & Cui, L. (2023). Forecasting carbon emissions future prices using the machine learning methods. Annals of Operations Research. https://doi.org/10.1007/s10479-023-05188-7
Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge University Press.
Smola, A. J., Schölkopf, B., & Müller, K. R. (1998). The connection between regularization operators and support vector kernels. Neural Networks, 11(4), 637–649.
Smola, A. J., Vishwanathan, S. V. N., & Hofmann, T. (2005). Kernel methods for missing variables. In Proceedings of the 10th international workshop on artificial intelligence and statistics (pp. 325–332).
Son, Y., Byun, H., & Lee, J. (2016). Nonparametric machine learning models for predicting the credit default swaps: An empirical study. Expert Systems with Applications, 58, 210–220.
Sun, C. Y., Yin, Y. Z., Kang, H. B., & Ma, H. J. (2022). A quality-related fault detection method based on the dynamic data-driven algorithm for industrial systems. IEEE Transactions on Automation Science and Engineering, 19(4), 3942–3952.
Tipping, M. E. (2001). Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1, 211–244.
Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D., & Altman, R. B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics, 17(6), 520–525.
Van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45, 1–67.
Van Hulse, J., & Khoshgoftaar, T. M. (2014). Incomplete-case nearest neighbor imputation in software measurement data. Information Sciences, 259, 596–610.
Von Hippel, P. T. (2009). 8. How to impute interactions, squares, and other transformed variables. Sociological Methodology, 39(1), 265–291.
Wang, Y., & Fu, L. (2023). Study on regional tourism performance evaluation based on the fuzzy analytic hierarchy process and radial basis function neural network. Annals of Operations Research. https://doi.org/10.1007/s10479-023-05224-6
Wei, C., Chen, J., Song, Z., & Chen, C. I. (2018). Development of self-learning kernel regression models for virtual sensors on nonlinear processes. IEEE Transactions on Automation Science and Engineering, 16(1), 286–297.
Zhang, K., Song, Z., & Guan, Y. L. (2004). Simulation of Nakagami fading channels with arbitrary cross-correlation and fading parameters. IEEE Transactions on Wireless Communications, 3(5), 1463–1468.
Zhong, Y., Ma, A., Soon Ong, Y., Zhu, Z., & Zhang, L. (2018). Computational intelligence in optical remote sensing image processing. Applied Soft Computing, 64, 75–93.

Funding

This research was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (MSIT) of Korea (No. RS-2023-00208412).

Author information

Authors and Affiliations

Department of Management Information Systems, West Virginia University, Morgantown, WV, USA
Jeongsub Choi
Department of Industrial and Systems Engineering, Dongguk University–Seoul, Seoul, South Korea
Youngdoo Son
Department of Industrial and Systems Engineering, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
Myong K. Jeong

Corresponding author

Correspondence to Youngdoo Son.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix A: Proof of Proposition 1

For the squared distance between two real vectors ${\mathbf{x}}_{i}$ and ${\mathbf{x}}_{j}$, a Gamma variable $\zeta_{ij}$ is approximated from the sum of correlated Gamma variables $\gamma_{ijp} \sim Gamma\left( {k_{ijp} ,\theta_{ijp} } \right)$ for $p = 1, \ldots, D$, based on the approximation in Feng et al. (2016). The shape parameter $k_{ijp}$, which is estimated using $E\left[ {\gamma_{ijp} } \right]$ in (19) and $Var\left( {\gamma_{ijp} } \right)$ in (20) from the moments of the missing components in the original space, satisfies the condition $k_{ijp} \ge \frac{1}{2}$ if $\sigma_{pp,i} + \sigma_{pp,j} > 0$:

$$ \begin{array}{ll} k_{ijp} = \frac{{E\left[ {\gamma_{ijp} } \right]^{2} }}{{Var\left( {\gamma_{ijp} } \right)}} \ge \frac{1}{2} \\ \Leftrightarrow \frac{{\left\{ {\left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)^{2} + \sigma_{pp,i} + \sigma_{pp,j} } \right\}^{2} }}{{2\left( {\sigma_{pp,i} + \sigma_{pp,j} } \right)\left\{ {\sigma_{pp,i} + \sigma_{pp,j} + 2\left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)^{2} } \right\}}} \ge \frac{1}{2} \\ \Leftrightarrow \left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)^{4} \ge 0. \end{array} $$

(A.1)

Similarly, the scale parameter $\theta_{ijp}$ satisfies the condition $\theta_{ijp} > 0$ if $\sigma_{pp,i} + \sigma_{pp,j} > 0$, because every factor in the numerator and the denominator of the ratio below is then positive:

$$ \begin{aligned} & \theta_{ijp} = \frac{{Var\left( {\gamma_{ijp} } \right)}}{{E\left[ {\gamma_{ijp} } \right]}} > 0 \\ & \Leftrightarrow \frac{{2\left( {\sigma_{pp,i} + \sigma_{pp,j} } \right)\left\{ {\sigma_{pp,i} + \sigma_{pp,j} + 2\left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)^{2} } \right\}}}{{\left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)^{2} + \sigma_{pp,i} + \sigma_{pp,j} }} > 0 \\ \end{aligned} $$

(A.2)
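The two conditions in (A.1) and (A.2) can be checked numerically with a minimal Python sketch. It assumes only the mean and variance expressions that appear in the numerators and denominators of (A.1) and (A.2); the function name `gamma_params` and its argument names are illustrative and do not come from the article.

```python
def gamma_params(mean_diff, var_sum):
    """Moment-matched Gamma(k, theta) for one squared unit-dimensional distance.

    mean_diff: x~_ip - x~_jp, the difference of the (estimated) means.
    var_sum:   sigma_pp,i + sigma_pp,j, which must be strictly positive.
    Names are hypothetical; the formulas follow (A.1) and (A.2).
    """
    if var_sum <= 0:
        raise ValueError("var_sum must be > 0 for the Gamma parameters to exist")
    d2 = mean_diff ** 2
    mean = d2 + var_sum                          # E[gamma_ijp]
    var = 2.0 * var_sum * (var_sum + 2.0 * d2)   # Var(gamma_ijp)
    k = mean ** 2 / var                          # shape, as in (A.1)
    theta = var / mean                           # scale, as in (A.2)
    return k, theta


if __name__ == "__main__":
    # The shape stays at or above 1/2 and the scale stays positive
    # for every tested mean difference and positive variance sum.
    for d in (-2.0, 0.0, 0.7, 5.0):
        for s in (1e-3, 0.5, 4.0):
            k, theta = gamma_params(d, s)
            print(f"d={d:5.2f} s={s:6.3f} k={k:.4f} theta={theta:.4f}")
```

The lower bound $k_{ijp} = \frac{1}{2}$ is reached when the two means coincide, which matches the last line of (A.1).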

Appendix B: Covariance between squared unit-dimensional distances

Under the assumption of independence between two instances ${\mathbf{x}}_{i}$ and ${\mathbf{x}}_{j}$, the covariance between $\gamma_{ijp}$ and $\gamma_{ijq}$ in (20) can be rewritten as

$$ \begin{aligned} & Cov\left( {\gamma_{ijp} ,\gamma_{ijq} } \right) \\ &\quad = E\left[ {X_{ip}^{2} X_{iq}^{2} } \right] - 2E\left[ {X_{ip}^{2} X_{iq} } \right]E\left[ {X_{jq} } \right] - 2E\left[ {X_{ip} X_{iq}^{2} } \right]E\left[ {X_{jp} } \right] + 4E\left[ {X_{ip} X_{iq} } \right]E\left[ {X_{jp} X_{jq} } \right] \\ & \quad \quad - 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} X_{jq}^{2} } \right] - 2E\left[ {X_{iq} } \right]E\left[ {X_{jp}^{2} X_{jq} } \right] + E\left[ {X_{jp}^{2} X_{jq}^{2} } \right] + 2E\left[ {X_{ip}^{2} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jq} } \right] \\ &\quad \quad - E\left[ {X_{ip}^{2} } \right]E\left[ {X_{iq}^{2} } \right] + 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{iq}^{2} } \right] - 4E\left[ {X_{ip} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{jq} } \right] \\ &\quad \quad + 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{jq}^{2} } \right] + 2E\left[ {X_{jp}^{2} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jq} } \right] - E\left[ {X_{jp}^{2} } \right]E\left[ {X_{jq}^{2} } \right] \\ \end{aligned} $$

(B.1)

where the two terms in (20) are

$$ \begin{aligned} & E\left[ {\left( {X_{ip} - X_{jp} } \right)^{2} \left( {X_{iq} - X_{jq} } \right)^{2} } \right] \\ &\quad = E\left[ {X_{ip}^{2} X_{iq}^{2} } \right] - 2E\left[ {X_{ip}^{2} X_{iq} } \right]E\left[ {X_{jq} } \right] + E\left[ {X_{ip}^{2} } \right]E\left[ {X_{jq}^{2} } \right] - 2E\left[ {X_{ip} X_{iq}^{2} } \right]E\left[ {X_{jp} } \right] \\ & \quad \quad + 4E\left[ {X_{ip} X_{iq} } \right]E\left[ {X_{jp} X_{jq} } \right] - 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} X_{jq}^{2} } \right] + E\left[ {X_{iq}^{2} } \right]E\left[ {X_{jp}^{2} } \right] \\ &\quad \quad - 2E\left[ {X_{iq} } \right]E\left[ {X_{jp}^{2} X_{jq} } \right]+ E\left[ {X_{jp}^{2} X_{jq}^{2} } \right] \\ \end{aligned} $$

and

$$ \begin{aligned} & E\left[ {\left( {X_{ip} - X_{jp} } \right)^{2} } \right]E\left[ {\left( {X_{iq} - X_{jq} } \right)^{2} } \right] \\ &\quad = E\left[ {X_{ip}^{2} } \right]E\left[ {X_{iq}^{2} } \right] - 2E\left[ {X_{ip}^{2} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jq} } \right] + E\left[ {X_{ip}^{2} } \right]E\left[ {X_{jq}^{2} } \right] - 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{iq}^{2} } \right] \\ &\quad \quad + 4E\left[ {X_{ip} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{jq} } \right] - 2E\left[ {X_{ip} } \right]E\left[ {X_{jp} } \right]E\left[ {X_{jq}^{2} } \right] + E\left[ {X_{iq}^{2} } \right]E\left[ {X_{jp}^{2} } \right] \\ &\quad \quad - 2E\left[ {X_{jp}^{2} } \right]E\left[ {X_{iq} } \right]E\left[ {X_{jq} } \right] + E\left[ {X_{jp}^{2} } \right]E\left[ {X_{jq}^{2} } \right]. \\ \end{aligned} $$

To compute the high-order moments of the $i$-th instance in (B.1), let ${\mathbf{x}}_{i(pq)} = \left[ {X_{ip} ,X_{iq} } \right]^{{\text{T}}}$, a subset of the variables in ${\mathbf{x}}_{i}$, follow a bivariate normal distribution with mean ${\tilde{\mathbf{x}}}_{{i\left( {pq} \right)}} = \left[ {\tilde{x}_{ip} , \tilde{x}_{iq} } \right]^{{\text{T}}}$ and covariance matrix ${\tilde{\mathbf{S}}}_{{i\left( {pq} \right)}} = \left[ {\begin{array}{*{20}c} {\sigma_{pp,i} } & {\sigma_{pq,i} } \\ {\sigma_{pq,i} } & {\sigma_{qq,i} } \\ \end{array} } \right]$. Let $M\left( {\mathbf{t}} \right)$ be the moment generating function of ${\mathbf{x}}_{{i\left( {pq} \right)}}$ with argument ${\mathbf{t}} = \left[ {t_{p} ,t_{q} } \right]^{{\text{T}}}$:

$$ M\left( {\mathbf{t}} \right) = \exp \left\{ {{\mathbf{t}}^{{\text{T}}} {\tilde{\mathbf{x}}}_{{i\left( {pq} \right)}} + \frac{1}{2}{\mathbf{t}}^{{\text{T}}} {\tilde{\mathbf{S}}}_{{i\left( {pq} \right)}} {\mathbf{t}}} \right\}. $$

(B.2)

High-order raw cross moments of ${\mathbf{x}}_{{i\left( {pq} \right)}}$ are given by

$$ E\left[ {X_{ip}^{{k_{1} }} X_{iq}^{{k_{2} }} {|}{\tilde{\mathbf{x}}}_{{i\left( {pq} \right)}} ,{\tilde{\mathbf{S}}}_{{i\left( {pq} \right)}} } \right] = \left. {\frac{{\partial^{{k_{1} + k_{2} }} M\left( {\mathbf{t}} \right)}}{{\partial t_{p}^{{k_{1} }} \partial t_{q}^{{k_{2} }} }}} \right|_{{{\mathbf{t}} = {\mathbf{0}}}} $$

(B.3)

and, accordingly, we have

$$ E\left[ {X_{ip}^{2} } \right] = \tilde{x}_{ip}^{2} + \sigma_{pp,i} $$

(B.4)

$$ E\left[ {X_{ip} X_{iq} } \right] = \sigma_{pq,i} + \tilde{x}_{ip} \tilde{x}_{iq} $$

(B.5)

$$ E\left[ {X_{ip}^{2} X_{iq} } \right] = \sigma_{pp,i} \tilde{x}_{iq} + 2\sigma_{pq,i} \tilde{x}_{ip} + \tilde{x}_{ip}^{2} \tilde{x}_{iq} $$

(B.6)

$$ E\left[ {X_{ip}^{2} X_{iq}^{2} } \right] = \sigma_{pp,i} \sigma_{qq,i} + \sigma_{pp,i} \tilde{x}_{iq}^{2} + \sigma_{qq,i} \tilde{x}_{ip}^{2} + 2\sigma_{pq,i}^{2} + 4\sigma_{pq,i} \tilde{x}_{ip} \tilde{x}_{iq} + \tilde{x}_{ip}^{2} \tilde{x}_{iq}^{2} . $$

(B.7)
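As a worked instance of (B.3), the cross moment in (B.5) can be recovered by differentiating (B.2) once in each argument. The shorthand $m_{p}\left( {\mathbf{t}} \right)$ and $m_{q}\left( {\mathbf{t}} \right)$ is introduced only for this illustration and is not part of the original notation:

$$ \begin{aligned} & m_{p}\left( {\mathbf{t}} \right) = \tilde{x}_{ip} + \sigma_{pp,i} t_{p} + \sigma_{pq,i} t_{q} , \quad m_{q}\left( {\mathbf{t}} \right) = \tilde{x}_{iq} + \sigma_{pq,i} t_{p} + \sigma_{qq,i} t_{q} , \\ & \frac{{\partial M\left( {\mathbf{t}} \right)}}{{\partial t_{p} }} = m_{p}\left( {\mathbf{t}} \right)M\left( {\mathbf{t}} \right), \\ & \frac{{\partial^{2} M\left( {\mathbf{t}} \right)}}{{\partial t_{p} \partial t_{q} }} = \left\{ {\sigma_{pq,i} + m_{p}\left( {\mathbf{t}} \right)m_{q}\left( {\mathbf{t}} \right)} \right\}M\left( {\mathbf{t}} \right), \\ & \left. {\frac{{\partial^{2} M\left( {\mathbf{t}} \right)}}{{\partial t_{p} \partial t_{q} }}} \right|_{{{\mathbf{t}} = {\mathbf{0}}}} = \sigma_{pq,i} + \tilde{x}_{ip} \tilde{x}_{iq} . \end{aligned} $$

The last line uses $M\left( {\mathbf{0}} \right) = 1$, $m_{p}\left( {\mathbf{0}} \right) = \tilde{x}_{ip}$, and $m_{q}\left( {\mathbf{0}} \right) = \tilde{x}_{iq}$. Differentiating twice in $t_{p}$ instead gives $\left\{ {\sigma_{pp,i} + m_{p}\left( {\mathbf{t}} \right)^{2} } \right\}M\left( {\mathbf{t}} \right)$, which reduces to (B.4) at ${\mathbf{t}} = {\mathbf{0}}$.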

From (B.4) to (B.7), the covariance in (B.1) becomes

$$ Cov\left( {\gamma_{ijp} ,\gamma_{ijq} } \right) = 2\left( {\sigma_{pq,i} + \sigma_{pq,j} } \right)\left\{ {\sigma_{pq,i} + \sigma_{pq,j} + 2\left( {\tilde{x}_{ip} - \tilde{x}_{jp} } \right)\left( {\tilde{x}_{iq} - \tilde{x}_{jq} } \right)} \right\}. $$

(B.8)
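The reduction to (B.8) can be checked numerically. The Python sketch below plugs the moments (B.4) to (B.7) into the two expanded expectations whose difference gives the covariance in (B.1), and compares the result with the closed form (B.8). All function names and the `(s_pp, s_qq, s_pq)` tuple layout are assumptions made for illustration; only the formulas are taken from this appendix.

```python
def raw_moments(mean, cov):
    """Raw moments of a bivariate normal per (B.4)-(B.7).

    mean: (m_p, m_q); cov: (s_pp, s_qq, s_pq). Layout is hypothetical.
    """
    m_p, m_q = mean
    s_pp, s_qq, s_pq = cov
    return {
        "p": m_p,
        "q": m_q,
        "pp": m_p ** 2 + s_pp,                                  # (B.4)
        "qq": m_q ** 2 + s_qq,                                  # (B.4) for q
        "pq": s_pq + m_p * m_q,                                 # (B.5)
        "ppq": s_pp * m_q + 2 * s_pq * m_p + m_p ** 2 * m_q,    # (B.6)
        "pqq": s_qq * m_p + 2 * s_pq * m_q + m_p * m_q ** 2,    # (B.6), p and q swapped
        "ppqq": (s_pp * s_qq + s_pp * m_q ** 2 + s_qq * m_p ** 2
                 + 2 * s_pq ** 2 + 4 * s_pq * m_p * m_q
                 + m_p ** 2 * m_q ** 2),                        # (B.7)
    }


def cov_expanded(mean_i, cov_i, mean_j, cov_j):
    """Covariance as E[product] minus product of expectations, term by term."""
    a, b = raw_moments(mean_i, cov_i), raw_moments(mean_j, cov_j)
    # E[(X_ip - X_jp)^2 (X_iq - X_jq)^2] with instances i and j independent
    first = (a["ppqq"] - 2 * a["ppq"] * b["q"] + a["pp"] * b["qq"]
             - 2 * a["pqq"] * b["p"] + 4 * a["pq"] * b["pq"]
             - 2 * a["p"] * b["pqq"] + a["qq"] * b["pp"]
             - 2 * a["q"] * b["ppq"] + b["ppqq"])
    # E[(X_ip - X_jp)^2] * E[(X_iq - X_jq)^2]
    second = ((a["pp"] - 2 * a["p"] * b["p"] + b["pp"])
              * (a["qq"] - 2 * a["q"] * b["q"] + b["qq"]))
    return first - second


def cov_closed_form(mean_i, cov_i, mean_j, cov_j):
    """Closed form (B.8)."""
    s_pq = cov_i[2] + cov_j[2]
    d_p = mean_i[0] - mean_j[0]
    d_q = mean_i[1] - mean_j[1]
    return 2 * s_pq * (s_pq + 2 * d_p * d_q)


if __name__ == "__main__":
    args = ((0.3, -1.2), (2.0, 0.5, -0.4), (1.1, 0.4), (0.7, 1.5, 0.6))
    print(cov_expanded(*args), cov_closed_form(*args))
```

The two printed values should agree up to floating-point rounding. The closed form needs only the off-diagonal covariances and the mean differences, so it avoids evaluating the fourth-order moments at all.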

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Choi, J., Son, Y. & Jeong, M.K. Gaussian kernel with correlated variables for incomplete data. Ann Oper Res 341, 223–244 (2024). https://doi.org/10.1007/s10479-023-05656-0

Received: 15 May 2023
Accepted: 13 October 2023
Published: 28 November 2023
Issue Date: October 2024
DOI: https://doi.org/10.1007/s10479-023-05656-0
