Using SVM to Extract Acronyms from Text

Jun Xu¹ & Yalou Huang¹

Abstract

This paper addresses the problem of extracting acronyms and their expansions from text. We propose an approach based on support vector machines (SVM). First, all likely acronyms are identified using heuristic rules. Second, expansion candidates are generated from the text surrounding each acronym. Last, an SVM model is employed to select the genuine expansions. Analysis shows that the proposed approach offers savings over conventional rule-based approaches. Experimental results show that our approach outperforms the baseline method of using rules. We also show that the trained SVM model is generic and can adapt to other domains easily.
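The first two steps of the pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the regular expression used as the acronym heuristic, the window of `2 * len(acronym)` words, and the three features are hypothetical choices, since the preview does not give the paper's actual rules or feature set. The third step is only hinted at: the feature vectors returned by `features` are the kind of input an SVM classifier from any standard library could be trained on to accept or reject a candidate.

```python
import re

# Assumed heuristic (not from the paper): a likely acronym is a token of
# 2 to 10 characters that starts with an uppercase letter and otherwise
# contains only uppercase letters and digits.
ACRONYM_RE = re.compile(r"\b[A-Z][A-Z0-9]{1,9}\b")
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def find_acronyms(text):
    """Step 1: return (acronym, start, end) for every likely acronym."""
    return [(m.group(), m.start(), m.end()) for m in ACRONYM_RE.finditer(text)]


def expansion_candidates(text, start, end, acronym, window=None):
    """Step 2: enumerate contiguous word spans on either side of the acronym."""
    if window is None:
        window = 2 * len(acronym)  # assumed window size, in words
    left = TOKEN_RE.findall(text[:start])[-window:]
    right = TOKEN_RE.findall(text[end:])[:window]
    candidates = []
    for words in (left, right):
        for i in range(len(words)):
            for j in range(i + 1, len(words) + 1):
                candidates.append(" ".join(words[i:j]))
    return candidates


def features(acronym, candidate):
    """Hypothetical feature vector for one (acronym, candidate) pair."""
    words = candidate.split()
    initials = "".join(w[0].upper() for w in words)
    return [
        1.0 if initials == acronym.upper() else 0.0,          # initials spell the acronym
        len(words) / len(acronym),                             # words per acronym letter
        1.0 if words[0][0].upper() == acronym[0].upper() else 0.0,  # first letters agree
    ]


text = "We propose a support vector machines (SVM) based approach."
for acronym, start, end in find_acronyms(text):
    for candidate in expansion_candidates(text, start, end, acronym):
        vector = features(acronym, candidate)  # would be passed to the SVM in step 3
```

Enumerating every contiguous span keeps candidate generation simple and leaves all of the selection work to the classifier, which is the division of labour the abstract describes.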

Author information

Authors and Affiliations

College of Software, Nankai University, No. 94 Weijin Road, Tianjin, 300071, China
Jun Xu & Yalou Huang

Corresponding author

Correspondence to Jun Xu.

About this article

Cite this article

Xu, J., Huang, Y. Using SVM to Extract Acronyms from Text. Soft Comput 11, 369–373 (2007). https://doi.org/10.1007/s00500-006-0091-5

Published: 20 April 2006
Issue Date: February 2007
DOI: https://doi.org/10.1007/s00500-006-0091-5
