Abstract
We present an approach to designing an automatic text summarizer that generates a summary by extracting sentence segments. First, sentences are broken into segments at special cue markers. Each segment is then represented by a set of predefined features (e.g., the location of the segment and the number of title words it contains). Supervised learning algorithms are then used to train the summarizer to extract important sentence segments based on these feature vectors. Experimental results indicate that the proposed approach compares favorably with other approaches, including the MS Word summarizer.
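The first two steps described in the abstract (segmenting at cue markers, then building a feature vector per segment) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the cue-marker list, the function names `split_segments` and `segment_features`, and the exact feature definitions are assumptions made here, since the abstract does not specify them.

```python
import re

# Hypothetical cue markers; the abstract does not list the ones actually used.
CUE_MARKERS = ("because", "but", "although", "however", "while")

# Split at whitespace (optionally preceded by a comma or semicolon)
# that comes immediately before a cue marker.
_CUE_RE = re.compile(
    r"[,;]?\s+(?=(?:%s)\b)" % "|".join(CUE_MARKERS),
    re.IGNORECASE,
)


def split_segments(sentence):
    """Break one sentence into segments at cue markers."""
    parts = _CUE_RE.split(sentence.strip())
    return [p.strip() for p in parts if p.strip()]


def segment_features(segment, index, total, title_words):
    """Build a small feature dict for one segment.

    Only the two features named in the abstract are modeled, and how
    they are computed here is an assumption:
      - position: relative location of the segment (0.0 first, 1.0 last)
      - title_words: number of words in the segment that occur in the title
    """
    words = re.findall(r"[a-z0-9']+", segment.lower())
    return {
        "position": index / max(total - 1, 1),
        "title_words": sum(1 for w in words if w in title_words),
    }


sentence = "The system works well, although it is slow because the parser is naive."
title_words = {"parser", "system"}

segments = split_segments(sentence)
features = [
    segment_features(seg, i, len(segments), title_words)
    for i, seg in enumerate(segments)
]
```

Feature vectors of this kind, paired with labels marking which segments belong in a reference summary, would then serve as training data for a supervised learner.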
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chuang, W.T., Yang, J. (2000). Text Summarization by Sentence Segment Extraction Using Machine Learning Algorithms. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_52
DOI: https://doi.org/10.1007/3-540-45571-X_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive