Abstract
Data mining techniques have already proven useful for classifying wine fermentations as normal or problematic. They are therefore a good option for winemakers, who currently lack the tools to identify early signs of undesirable fermentation behavior and so cannot take mitigating actions in time. In this study we assessed how much the performance of a K-means clustering procedure for classifying fermentations is affected by the number of principal components (PCs) retained when principal component analysis (PCA) is first applied to reduce the dimensionality of the available data. Three PCs were enough to preserve the overall information of a dataset containing only reliable measurements. In this case, 40% of the problematic fermentations were detected. In contrast, with a more complete dataset that also contained unreliable measurements, the classification varied with the number of PCs retained. Here, 33% of the problematic fermentations were detected.
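The procedure summarized above (PCA for dimensionality reduction, then K-means clustering, repeated for different numbers of PCs) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic, and the number of clusters (k = 3), the standardization step, the majority-vote detection rule, and all function names are assumptions made only for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardize X, then project it onto its first n_components PCs (via SVD).

    Returns the scores and the fraction of total variance they retain.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Xs @ Vt[:n_components].T, explained[:n_components].sum()

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def detection_rate(labels, is_problematic):
    """Assumed rule: a problematic sample counts as detected when it falls
    in a cluster whose members are mostly problematic."""
    detected = 0
    for j in np.unique(labels):
        members = labels == j
        if is_problematic[members].mean() > 0.5:
            detected += np.sum(is_problematic & members)
    return detected / is_problematic.sum()

# Synthetic stand-in for fermentation measurements (NOT the study's data).
rng = np.random.default_rng(42)
n_normal, n_problem, n_features = 60, 20, 10
X = np.vstack([
    rng.normal(0.0, 1.0, (n_normal, n_features)),
    rng.normal(1.5, 1.0, (n_problem, n_features)),
])
is_problematic = np.r_[np.zeros(n_normal, dtype=bool), np.ones(n_problem, dtype=bool)]

# Repeat the clustering for several numbers of retained PCs.
results = {}
for n_pcs in (1, 2, 3, 5, 8):
    scores, retained = pca_scores(X, n_pcs)
    labels = kmeans(scores, k=3)
    results[n_pcs] = (retained, detection_rate(labels, is_problematic))
    print(f"PCs={n_pcs}  variance retained={retained:.2f}  "
          f"detection rate={results[n_pcs][1]:.2f}")
```

Comparing the detection rate across values of `n_pcs` mirrors the kind of sensitivity check the abstract describes. The figures this sketch prints depend entirely on the synthetic data and bear no relation to the 40% and 33% reported in the study.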
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Urtubia U., A., Pérez-Correa, J.R. (2009). Study of Principal Components on Classification of Problematic Wine Fermentations. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2009. Lecture Notes in Computer Science, vol 5633. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03067-3_4
DOI: https://doi.org/10.1007/978-3-642-03067-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03066-6
Online ISBN: 978-3-642-03067-3
eBook Packages: Computer Science, Computer Science (R0)