Abstract
Email systems play an important and irreplaceable role in the digital world because of their convenience and efficiency and the rapid growth of the World Wide Web (WWW). However, most email users today suffer from large amounts of irrelevant and noisy email every day. Algorithms that can remove both noisy features and irrelevant emails are therefore highly desirable. In this paper, we propose a novel Supervised Semi-definite Embedding (SSDE) algorithm that reduces the dimensionality of email data to discard noisy features, and that visualizes the emails in a supervised manner so that irrelevant ones can be identified intuitively. Experiments on emails received by several volunteers over a period of time, and on several benchmark datasets, show that the proposed SSDE algorithm achieves comparable performance.
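The abstract does not give the SSDE formulation, so the following is only a minimal sketch of one generic step shared by semidefinite embedding methods, not the authors' algorithm. In such methods a positive semidefinite Gram matrix is typically learned by semidefinite programming, and low-dimensional coordinates are then read off from its leading eigenvectors. The function name `embed_from_gram` and the toy Gram matrix are assumptions made for illustration; the supervised part of SSDE and the semidefinite program itself are not shown.

```python
import numpy as np


def embed_from_gram(K, dim=2):
    """Recover `dim`-dimensional coordinates from a centered PSD Gram matrix.

    Hypothetical helper for illustration; K stands in for the matrix a
    semidefinite program would return.
    """
    K = (K + K.T) / 2.0                      # enforce symmetry against round-off
    eigvals, eigvecs = np.linalg.eigh(K)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dim]  # indices of the `dim` largest
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard tiny negative values
    # Scale each eigenvector by the square root of its eigenvalue.
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy stand-in for a learned Gram matrix: inner products of six centered 2-D points.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
X -= X.mean(axis=0)
K = X @ X.T

Y = embed_from_gram(K, dim=2)  # one 2-D point per row, ready for a scatter plot
```

Because the toy matrix has rank two, the two-dimensional embedding reproduces every pairwise inner product in `K`; with a real learned matrix, the discarded eigenvalues indicate how much structure the low-dimensional view leaves out.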
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, N., Bai, F., Yan, J., Zhang, B., Chen, Z., Ma, WY. (2005). Supervised Semi-definite Embedding for Email Data Cleaning and Visualization. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_93
DOI: https://doi.org/10.1007/978-3-540-31849-1_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science, Computer Science (R0)