Abstract
This study proposes a keyword extraction method that also captures keywords appearing only locally in a time series. Useful keyword extraction methods, such as TF-IDF and support vector machines, are available for text mining. However, when keywords are extracted from a document set ordered as a time series, local keywords are often not extracted. We propose extracting local keywords by separating the document set, which we call the document separation approach. The approach splits a document set into multiple sets according to the time series, extracts keywords from each set, and integrates the results. Using 1812 newspaper articles, we experimentally demonstrate that the document separation approach can extract local feature keywords.
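The split, extract, and integrate steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring function, the split granularity, or the integration rule, so the summed TF-IDF score with idf = ln(N/df) + 1, the monthly period labels, the top-k cutoff, and the order-preserving union are all assumptions, and the toy articles and every function name are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_top_k(docs, k):
    """Return the k terms with the highest summed TF-IDF in one document set.

    Assumption: idf = ln(N / df) + 1; the paper's exact weighting is not given.
    """
    n_docs = len(docs)
    tf = Counter()  # total occurrences of each term in the set
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    scores = {t: tf[t] * (math.log(n_docs / df[t]) + 1.0) for t in tf}
    # Sort by descending score, then alphabetically to break ties.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def split_by_period(dated_docs):
    """Group (period, tokens) pairs into one token-list collection per period."""
    groups = defaultdict(list)
    for period, tokens in dated_docs:
        groups[period].append(tokens)
    return dict(groups)


def separated_keywords(dated_docs, k):
    """Extract top-k keywords per period and merge them (order-preserving union)."""
    groups = split_by_period(dated_docs)
    merged = []
    for period in sorted(groups):
        for term in tfidf_top_k(groups[period], k):
            if term not in merged:
                merged.append(term)
    return merged


# Hypothetical toy corpus: (period label, tokenized article).
dated_docs = [
    ("2012-01", ["election", "election", "economy"]),
    ("2012-01", ["election", "economy", "tax"]),
    ("2012-01", ["election", "vote", "economy"]),
    ("2012-02", ["election", "economy", "earthquake"]),
    ("2012-02", ["earthquake", "earthquake", "rescue"]),
    ("2012-02", ["election", "economy", "budget"]),
]

# Baseline: treat all articles as a single document set.
global_kw = tfidf_top_k([tokens for _, tokens in dated_docs], k=1)
# Document separation: one set per period, then integrate.
separated_kw = separated_keywords(dated_docs, k=1)

print(global_kw)     # ['election']
print(separated_kw)  # ['election', 'earthquake']
```

In this toy data, "earthquake" occurs only in the second period, so it ranks below "election" over the whole set but ranks first within its own period. Note that the integrated list can contain more terms than the single-set baseline for the same k, so the comparison illustrates the mechanism rather than measuring it.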
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saga, R., Tsuji, H. (2013). Improved Keyword Extraction by Separation into Multiple Document Sets According to Time Series. In: Stephanidis, C. (eds) HCI International 2013 - Posters’ Extended Abstracts. HCI 2013. Communications in Computer and Information Science, vol 374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39476-8_91
DOI: https://doi.org/10.1007/978-3-642-39476-8_91
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39475-1
Online ISBN: 978-3-642-39476-8
eBook Packages: Computer Science, Computer Science (R0)