DOI: 10.1109/FUZZ-IEEE.2016.7737741
research-article

Linguistic summarization using a weighted N-gram language model based on the similarity of time-series data

Published: 01 July 2016

Abstract

This paper describes a method for verbalizing trends in time-series data. Taking the Nikkei Stock Average as an example, we develop a method that generates natural language sentences describing how the stock price moves in the market. The method has three steps. First, we cluster all the time series, including the newly observed series to be verbalized (the target), by spectral clustering with Dynamic Time Warping (DTW) distance as the similarity metric. Second, we build a bi-gram language model for the target as a weighted combination of the bi-gram language models of the other time series in the same cluster, where each weight is determined by the similarity between the target and that series. Third, we generate the linguistic summary of the target by using dynamic programming to find the most likely word sequence under the weighted bi-gram model. In experiments with various numbers of clusters for spectral clustering, we confirmed that the method generates natural language sentences that properly describe the stock price trends.
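
The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the spectral clustering step has already been done, so `neighbours` holds only series from the target's cluster. It also assumes a weight of the form exp(-DTW distance), which the abstract does not specify, and it decodes a sentence of a fixed, caller-chosen length. All function names, the toy series, and the toy sentences are hypothetical.

```python
import math
from collections import defaultdict

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance with |x - y| cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def similarity_weights(target, neighbours):
    """Map DTW distances to weights that sum to 1 (assumed form: exp(-distance))."""
    raw = [math.exp(-dtw_distance(target, s)) for s in neighbours]
    total = sum(raw)
    return [r / total for r in raw]

def bigram_model(sentences):
    """Maximum-likelihood bigram probabilities P(w2 | w1) with <s> and </s> markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            counts[w1][w2] += 1
    return {w1: {w2: c / sum(nxt.values()) for w2, c in nxt.items()}
            for w1, nxt in counts.items()}

def mix_models(models, weights):
    """Weighted sum of bigram models: P(w2 | w1) = sum_k weight_k * P_k(w2 | w1)."""
    mixed = defaultdict(lambda: defaultdict(float))
    for model, w in zip(models, weights):
        for w1, nxt in model.items():
            for w2, p in nxt.items():
                mixed[w1][w2] += w * p
    return mixed

def best_sentence(model, length):
    """Viterbi-style DP: most probable sequence of exactly `length` words, then </s>."""
    # best[w] = (log-probability, words) of the best partial sentence ending in w
    best = {"<s>": (0.0, [])}
    for _ in range(length):
        step = {}
        for w1, (lp, words) in best.items():
            for w2, p in model.get(w1, {}).items():
                if w2 == "</s>" or p <= 0.0:
                    continue
                cand = (lp + math.log(p), words + [w2])
                if w2 not in step or cand[0] > step[w2][0]:
                    step[w2] = cand
        best = step
    # Close each candidate with the end-of-sentence transition and keep the best one.
    finals = [(lp + math.log(model[w]["</s>"]), words)
              for w, (lp, words) in best.items()
              if model.get(w, {}).get("</s>", 0.0) > 0.0]
    return " ".join(max(finals)[1]) if finals else ""

# Toy data (hypothetical): a rising target, one rising and one falling neighbour.
target = [1.0, 2.0, 3.0, 4.0]
neighbours = [[1.0, 2.0, 3.0, 5.0], [4.0, 3.0, 2.0, 1.0]]
corpora = [
    ["the price rose steadily", "the price rose steadily", "the price rose sharply"],
    ["the price fell steadily"],
]

weights = similarity_weights(target, neighbours)
mixed = mix_models([bigram_model(c) for c in corpora], weights)
sentence = best_sentence(mixed, 4)
print(sentence)  # the price rose steadily
```

Because the rising neighbour is far closer to the target under DTW (distance 1 versus 8), its bi-gram model dominates the mixture, and the decoder prefers "rose" over "fell". A fixed sentence length avoids the bias toward short sentences that an unnormalized product of probabilities would otherwise introduce.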


Cited By

  • (2021) Generating Accurate Caption Units for Figure Captioning. Proceedings of the Web Conference 2021, pp. 2792-2804. DOI: 10.1145/3442381.3449923. Online publication date: 19-Apr-2021

Published In

2016 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
2526 pages

Publisher

IEEE Press
