
A Compression-Based Dissimilarity Measure for Multi-task Clustering

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6804)

Abstract

Virtually all existing multi-task learning methods for string data require either domain-specific knowledge to extract feature representations or careful tuning of many input parameters. In this work, we propose a feature-free and parameter-light multi-task clustering algorithm for string data. To transfer knowledge between different domains, a novel dictionary-based compression dissimilarity measure is proposed. Experimental results with extensive comparisons demonstrate the generality and effectiveness of our proposal.
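The paper's specific dictionary-based measure is not reproduced on this page, but the family of compression-based dissimilarity measures it belongs to can be sketched with the Normalized Compression Distance (NCD): two strings are deemed similar if compressing their concatenation costs little more than compressing either string alone. The sketch below uses zlib as a stand-in compressor for illustration only; it is not the authors' dictionary-based measure.

```python
import zlib

def clen(data: bytes) -> int:
    """Compressed length of data under zlib at maximum compression level."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for strings sharing
    much structure, closer to 1 for unrelated strings."""
    cx, cy = clen(x), clen(y)
    cxy = clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Two copies of the same repetitive text vs. an unrelated byte pattern.
a = b"the quick brown fox jumps over the lazy dog" * 20
b_ = b"the quick brown fox jumps over the lazy dog" * 20
c_ = bytes(range(256)) * 4

print(ncd(a, b_))  # small: concatenation compresses almost as well as one copy
print(ncd(a, c_))  # larger: the two strings share no structure
```

Note that NCD is feature-free in the same sense the abstract claims: no tokenization, feature extraction, or domain knowledge is needed, only a generic compressor.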




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thach, N.H., Shao, H., Tong, B., Suzuki, E. (2011). A Compression-Based Dissimilarity Measure for Multi-task Clustering. In: Kryszkiewicz, M., Rybinski, H., Skowron, A., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2011. Lecture Notes in Computer Science, vol. 6804. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21916-0_14

  • DOI: https://doi.org/10.1007/978-3-642-21916-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21915-3

  • Online ISBN: 978-3-642-21916-0
