DOI: 10.1145/1143844.1143966

Accelerated training of conditional random fields with stochastic gradient methods

Published: 25 June 2006

Abstract

We apply Stochastic Meta-Descent (SMD), a stochastic gradient optimization method with gain vector adaptation, to the training of Conditional Random Fields (CRFs). On several large data sets, the resulting optimizer converges to the same quality of solution over an order of magnitude faster than limited-memory BFGS, the leading method reported to date. We report results for both exact and inexact inference techniques.
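The abstract does not spell out the update rule, so the following is only a minimal sketch of one SMD step in the form commonly given for the method: each parameter keeps its own gain, and an auxiliary vector v tracks how past gain choices have moved the parameters. The function name smd_step, the argument names, and the default values of mu (meta-gain) and lam (decay factor) are assumptions made for illustration, and the toy objective f(theta) = theta^2 stands in for a CRF log-loss.

```python
def smd_step(theta, eta, v, g, hv, mu=0.1, lam=0.99):
    """One SMD update with per-parameter gains (illustrative sketch).

    theta : current parameters
    eta   : current gain (step size) for each parameter
    v     : auxiliary vector tracking the effect of past gains on theta
    g     : stochastic gradient evaluated at theta
    hv    : Hessian-vector product H v evaluated at theta
    mu    : meta-gain (assumed default)
    lam   : decay factor in [0, 1] (assumed default)
    """
    # Gain adaptation: a gain grows when the new gradient agrees with the
    # accumulated descent direction (g_i * v_i < 0) and is cut by at most
    # half in a single step otherwise.
    new_eta = [e * max(0.5, 1.0 - mu * gi * vi)
               for e, gi, vi in zip(eta, g, v)]
    # Ordinary stochastic gradient step, scaled element-wise by the gains.
    new_theta = [t - e * gi for t, e, gi in zip(theta, new_eta, g)]
    # Update of the auxiliary vector using the Hessian-vector product.
    new_v = [lam * vi - e * (gi + lam * hvi)
             for vi, e, gi, hvi in zip(v, new_eta, g, hv)]
    return new_theta, new_eta, new_v


# Toy usage on f(theta) = theta^2, where g = 2 * theta and H = 2.
theta, eta, v = [1.0], [0.1], [-0.5]
g = [2.0 * theta[0]]
hv = [2.0 * v[0]]
theta, eta, v = smd_step(theta, eta, v, g, hv, mu=0.1, lam=0.9)
print(theta, eta, v)
```

In a CRF setting, g would be the gradient of the regularized negative log-likelihood on a small batch of training sequences, and hv the matching Hessian-vector product, which can be computed without ever forming the Hessian.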



Published In

ICML '06: Proceedings of the 23rd international conference on Machine learning
June 2006
1154 pages
ISBN:1595933832
DOI:10.1145/1143844

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

ICML '06 Paper Acceptance Rate 140 of 548 submissions, 26%;
Overall Acceptance Rate 140 of 548 submissions, 26%
