research-article

A dual coordinate descent method for large-scale linear SVM

Authors:

S. Sathiya Keerthi,

S. SundararajanAuthors Info & Claims

ICML '08: Proceedings of the 25th international conference on Machine learning

Pages 408 - 415

https://doi.org/10.1145/1390156.1390208

Published: 05 July 2008 Publication History

Abstract

In many applications, data appear with a huge number of instances as well as features. Linear Support Vector Machines (SVM) is one of the most popular tools to deal with such large-scale sparse data. This paper presents a novel dual coordinate descent method for linear SVM with L1-and L2-loss functions. The proposed method is simple and reaches an ε-accurate solution in O(log(1/ε)) iterations. Experiments indicate that our method is much faster than state of the art solvers such as Pegasos, TRON, SVM^perf, and a recent primal coordinate descent implementation.

References

[1]

Bordes, A., Bottou, L., Gallinari, P., & Weston, J. (2007). Solving multiclass support vector machines with LaRank. ICML.

Digital Library

[2]

Boser, B. E., Guyon, I., & Vapnik, V. (1992). A training algorithm for optimal margin classifiers. COLT.

Digital Library

[3]

Bottou, L. (2007). Stochastic gradient descent examples. http://leon.bottou.org/projects/sgd.

[4]

Chang, C.-C., & Lin, C.-J. (2001). LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Digital Library

[5]

Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2007). Coordinate descent method for large-scale L2-loss linear SVM (Technical Report). http://www.csie.ntu.edu.tw/~cjlin/papers/cdl2.pdf.

[6]

Collins, M., Globerson, A., Koo, T., Carreras, X., & Bartlett, P. (2008). Exponentiated gradient algorithms for conditional random fields and max-margin markov networks. JMLR. To appear.

Digital Library

[7]

Crammer, K., & Singer, Y. (2003). Ultraconservative online algorithms for multiclass problems. JMLR, 3, 951--991.

Digital Library

[8]

Friess, T.-T., Cristianini, N., & Campbell, C. (1998). The kernel adatron algorithm: a fast and simple learning procedure for support vector machines. ICML.

Digital Library

[9]

Joachims, T. (1998). Making large-scale SVM learning practical. Advances in Kernel Methods -Support Vector Learning. Cambridge, MA: MIT Press.

Digital Library

[10]

Joachims, T. (2006). Training linear SVMs in linear time. ACM KDD.

Digital Library

[11]

Kao, W.-C., Chung, K.-M., Sun, C.-L., & Lin, C.-J. (2004). Decomposition methods for linear support vector machines. Neural Comput., 16, 1689--1704.

Digital Library

[12]

Keerthi, S. S., & DeCoste, D. (2005). A modified finite Newton method for fast solution of large scale linear SVMs. JMLR, 6, 341--361.

Digital Library

[13]

Keerthi, S. S., Shevade, S. K., Bhattacharyya, C., & Murthy, K. R. K. (2001). Improvements to Platt's SMO algorithm for SVM classifier design. Neural Comput., 13, 637--649.

Digital Library

[14]

Langford, J., Li, L., & Strehl, A. (2007). Vowpal Wabbit. http://hunch.net/~vw.

[15]

Lin, C.-J., Weng, R. C., & Keerthi, S. S. (2008). Trust region Newton method for large-scale logistic regression. JMLR, 9, 623--646.

Digital Library

[16]

Luo, Z.-Q., & Tseng, P. (1992). On the convergence of coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl., 72, 7--35.

Digital Library

[17]

Mangasarian, O. L., & Musicant, D. R. (1999). Successive overrelaxation for support vector machines. IEEE Trans. Neural Networks, 10, 1032--1037.

Digital Library

[18]

Osuna, E., Freund, R., & Girosi, F. (1997). Training support vector machines: An application to face detection. CVPR.

Digital Library

[19]

Platt, J. C. (1998). Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods -Support Vector Learning. Cambridge, MA: MIT Press.

Digital Library

[20]

Shalev-Shwartz, S., Singer, Y., & Srebro, N. (2007). Pegasos: primal estimated sub-gradient solver for SVM. ICML.

Digital Library

[21]

Smola, A. J., Vishwanathan, S. V. N., & Le, Q. (2008). Bundle methods for machine learning. NIPS.

[22]

Zhang, T. (2004). Solving large scale linear prediction problems using stochastic gradient descent algorithms. ICML.

Digital Library

Cited By

Görgülü ŞKoç İKırımça NKaraköse MÖzgen M(2024)Yapay Zekâ Kullanımıyla Peron Ayırıcı Kapı Sisteminin Sağlığını İzleme ve Kestirimci BakımAkıllı Ulaşım Sistemleri ve Uygulamaları Dergisi10.51513/jitsa.13119857:1(56-70)Online publication date: 25-Mar-2024
https://doi.org/10.51513/jitsa.1311985
Cui ZWang GLiu MAi BQuek TTellambura C(2024)Wavy Signals and Striped Constellations for Backscatter Communications: Origins and SolutionsIEEE Transactions on Wireless Communications10.1109/TWC.2024.339633823:10(12815-12829)Online publication date: Oct-2024
https://doi.org/10.1109/TWC.2024.3396338
Li CLi YShao Y(2024)Large-Scale Structured Output Classification via Multiple Structured Support Vector Machine by SplittingIEEE Transactions on Emerging Topics in Computational Intelligence10.1109/TETCI.2024.33603398:2(2112-2124)Online publication date: Apr-2024
https://doi.org/10.1109/TETCI.2024.3360339
Show More Cited By

Index Terms

A dual coordinate descent method for large-scale linear SVM
1. Computing methodologies
2. Mathematics of computing
  1. Mathematical analysis
    1. Numerical analysis
      1. Computations on matrices

Recommendations

Stochastic gradient descent for large-scale linear nonparallel SVM
WI '17: Proceedings of the International Conference on Web Intelligence

In recent years, nonparallel support vector machine (NPSVM) is proposed as a nonparallel hyperplane classifier with superior performance than standard SVM and existing nonparallel classifiers such as the twin support vector machine (TWSVM). With the ...
Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines

Linear support vector machines (SVM) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language processing. In this paper, we propose a novel ...
Large-scale linear nonparallel support vector machine solver

Twin support vector machines (TWSVMs), as the representative nonparallel hyperplane classifiers, have shown the effectiveness over standard SVMs from some aspects. However, they still have some serious defects restricting their further study and real ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences

ICML '08: Proceedings of the 25th international conference on Machine learning

July 2008

1310 pages

ISBN:9781605582054

DOI:10.1145/1390156

General Chair:
William Cohen
Carnegie Mellon University
,
Program Chairs:
Andrew McCallum
University of Massachusetts Amherst
,
Sam Roweis
University of Toronto and Google

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Pascal
University of Helsinki
Xerox
Federation of Finnish Learned Societies
Google Inc.
NSF
Machine Learning Journal/Springer
Microsoft Research: Microsoft Research
Intel: Intel
Yahoo!
Helsinki Institute for Information Technology
IBM: IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 July 2008

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Qualifiers

Research-article

Conference

ICML '08

Sponsor:

Microsoft Research
Intel
IBM

ICML '08: The 25th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

July 5 - 9, 2008

Helsinki, Finland

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

547
Total Citations
View Citations
2,841
Total Downloads

Downloads (Last 12 months)141
Downloads (Last 6 weeks)23

Reflects downloads up to 22 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Görgülü ŞKoç İKırımça NKaraköse MÖzgen M(2024)Yapay Zekâ Kullanımıyla Peron Ayırıcı Kapı Sisteminin Sağlığını İzleme ve Kestirimci BakımAkıllı Ulaşım Sistemleri ve Uygulamaları Dergisi10.51513/jitsa.13119857:1(56-70)Online publication date: 25-Mar-2024
https://doi.org/10.51513/jitsa.1311985
Cui ZWang GLiu MAi BQuek TTellambura C(2024)Wavy Signals and Striped Constellations for Backscatter Communications: Origins and SolutionsIEEE Transactions on Wireless Communications10.1109/TWC.2024.339633823:10(12815-12829)Online publication date: Oct-2024
https://doi.org/10.1109/TWC.2024.3396338
Li CLi YShao Y(2024)Large-Scale Structured Output Classification via Multiple Structured Support Vector Machine by SplittingIEEE Transactions on Emerging Topics in Computational Intelligence10.1109/TETCI.2024.33603398:2(2112-2124)Online publication date: Apr-2024
https://doi.org/10.1109/TETCI.2024.3360339
Cho MChikkam MXu WLai L(2024)Tree Network Design for Faster Distributed Machine Learning Process with Distributed Dual Coordinate AscentICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)10.1109/ICASSP48485.2024.10447341(6080-6084)Online publication date: 14-Apr-2024
https://doi.org/10.1109/ICASSP48485.2024.10447341
Alge OPickard JZhang WCheng SDerksen HOmenn GGryak JVanEpps JNajarian K(2024)Continuous sepsis trajectory prediction using tensor-reduced physiological signalsScientific Reports10.1038/s41598-024-68901-x14:1Online publication date: 6-Aug-2024
https://doi.org/10.1038/s41598-024-68901-x
Ebrahim MAlsmirat MAl-Ayyoub M(2024)Advanced disk herniation computer aided diagnosis systemScientific Reports10.1038/s41598-024-58283-514:1Online publication date: 5-Apr-2024
https://doi.org/10.1038/s41598-024-58283-5
Yang ZChen WZhang HXu YShi LZhao J(2024)A safe screening rule with bi-level optimization of ν support vector machinePattern Recognition10.1016/j.patcog.2024.110644155:COnline publication date: 1-Nov-2024
https://dl.acm.org/doi/10.1016/j.patcog.2024.110644
Li CHuang LShao YGuo TMao Y(2024)Feature selection by Universum embeddingPattern Recognition10.1016/j.patcog.2024.110514153(110514)Online publication date: Sep-2024
https://doi.org/10.1016/j.patcog.2024.110514
Liu LLi SZhang XDai ZZhu Y(2024)Polycentric intuitionistic fuzzy weighted least squares twin SVMsNeurocomputing10.1016/j.neucom.2024.128475609(128475)Online publication date: Dec-2024
https://doi.org/10.1016/j.neucom.2024.128475
Ye YXu ZZhang JChen WShao Y(2024)Twin support vector quantile regressionExpert Systems with Applications10.1016/j.eswa.2023.121239237(121239)Online publication date: Mar-2024
https://doi.org/10.1016/j.eswa.2023.121239
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents