DOI: 10.1145/1281192.1281281

Model-shared subspace boosting for multi-label classification

Published: 12 August 2007

Abstract

Typical approaches to the multi-label classification problem require learning an independent classifier for every label from all the examples and features. This can become a computational bottleneck for sizeable datasets with a large label space. In this paper, we propose an efficient and effective multi-label learning algorithm called model-shared subspace boosting (MSSBoost) to reduce the information redundancy in the learning process. The algorithm automatically finds, shares, and combines a number of base models across multiple labels, where each model is learned from a random feature subspace and bootstrap data samples. The decision functions for all labels are jointly estimated, so a small number of shared subspace models can support the entire label space. Our experimental results on both synthetic data and real multimedia collections demonstrate that the proposed algorithm achieves better classification performance than the non-ensemble baseline classifiers, with a significant speedup in the learning and prediction processes. It can also use a smaller number of base models to achieve the same classification performance as its non-model-shared counterpart.
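The model-sharing idea in the abstract can be illustrated with a minimal sketch. This is not the MSSBoost procedure from the paper: the boosting-style joint estimation is replaced here by a simple agreement-based weighting, and every name in it (`train_stump`, `build_shared_pool`, `fit_label_weights`, `predict`) is hypothetical. It only shows how base models trained on random feature subspaces and bootstrap samples might be pooled once and then reused by every label.

```python
import random

def train_stump(X, y, features):
    """Return the best threshold rule on one of `features` (labels are +1/-1)."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                errors = sum((sign if row[f] > t else -sign) != yi
                             for row, yi in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, sign)
    _, f, t, sign = best
    return lambda row: sign if row[f] > t else -sign

def build_shared_pool(X, Y, n_models, subspace_size, rng):
    """Train base models, each on a random feature subspace and a bootstrap sample."""
    n, d, n_labels = len(X), len(X[0]), len(Y[0])
    pool = []
    for _ in range(n_models):
        label = rng.randrange(n_labels)                   # label this model is fit to
        features = rng.sample(range(d), subspace_size)    # random feature subspace
        idx = [rng.randrange(n) for _ in range(n)]        # sample with replacement
        pool.append(train_stump([X[i] for i in idx],
                                [Y[i][label] for i in idx], features))
    return pool

def fit_label_weights(pool, X, Y):
    """weights[l][m]: mean agreement (in [-1, 1]) of model m with label l.

    Assumption: a stand-in for the paper's joint estimation, not its method.
    """
    n, n_labels = len(X), len(Y[0])
    outputs = [[h(row) for row in X] for h in pool]
    return [[sum(o * Y[i][l] for i, o in enumerate(out)) / n for out in outputs]
            for l in range(n_labels)]

def predict(pool, weights, row):
    """Evaluate each shared model once, then reuse its vote for every label."""
    votes = [h(row) for h in pool]
    return [1 if sum(w * v for w, v in zip(ws, votes)) >= 0 else -1
            for ws in weights]

# Synthetic data (an assumption for illustration): two related labels, six features.
rng = random.Random(0)
X = [[rng.random() for _ in range(6)] for _ in range(200)]
Y = [[1 if x[0] > 0.5 else -1, 1 if x[0] + x[1] > 1.0 else -1] for x in X]

pool = build_shared_pool(X, Y, n_models=8, subspace_size=3, rng=rng)
weights = fit_label_weights(pool, X, Y)
print(predict(pool, weights, X[0]))
```

In this sketch the prediction cost is dominated by evaluating the shared pool once per example, independent of the number of labels, which is the kind of saving the abstract attributes to model sharing.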

Published In

KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2007
1080 pages
ISBN:9781595936097
DOI:10.1145/1281192

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multi-label classification
  2. random subspace methods

Qualifiers

  • Article

Conference

KDD07

Acceptance Rates

KDD '07 Paper Acceptance Rate 111 of 573 submissions, 19%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
