Abstract
The discovery of useful patterns embodied in a time series is of fundamental relevance in many real applications. Repetitive structures and common types of segments can also provide very useful information about patterns in financial time series. In this paper, we introduce a time series segmentation and characterization methodology that combines a hybrid genetic algorithm with a clustering technique to automatically group common patterns in financial time series and to address the problem of identifying stock market price trends. The hybrid genetic algorithm includes a local search method aimed at improving the quality of the final solution. The local search is based on maximizing a likelihood ratio, assuming normality for the series and for the subseries into which the original series is segmented. To this end, we select two stock market index time series: the IBEX35 Spanish index (closing prices) and a weighted average time series of the IBEX35 (Spanish), BEL20 (Belgian), CAC40 (French) and DAX (German) indexes. These are processed to obtain segments that are mapped into a five-dimensional space composed of five statistical measures, with the purpose of grouping them according to their statistical properties. Experimental results show that it is possible to discover homogeneous patterns in both time series.
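The likelihood-ratio idea behind the local search can be illustrated with a minimal sketch. The abstract does not give the exact statistic, so the code below assumes a generic Gaussian test (one normal segment versus two independent normal segments, each with its own mean and variance); the function names, the `min_len` parameter and the toy series are hypothetical and are not taken from the paper.

```python
import math


def gaussian_loglik(values):
    """Maximised Gaussian log-likelihood of a sample (MLE mean and variance)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # guard against log(0) on constant segments
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def best_split(values, min_len=2):
    """Return (index, log-likelihood ratio) of the best single cut point.

    The ratio compares "two independent normal segments" against
    "one normal segment"; a larger value is stronger evidence for a cut.
    """
    whole = gaussian_loglik(values)
    best_t, best_llr = None, float("-inf")
    for t in range(min_len, len(values) - min_len + 1):
        llr = gaussian_loglik(values[:t]) + gaussian_loglik(values[t:]) - whole
        if llr > best_llr:
            best_t, best_llr = t, llr
    return best_t, best_llr


# Toy series (assumed data): a level shift after the fifth observation.
series = [10.0, 10.2, 9.9, 10.1, 10.0, 15.1, 14.9, 15.0, 15.2, 14.8]
cut, llr = best_split(series)
```

In a hybrid scheme, a refinement of this kind would typically be applied to cut points proposed by the genetic algorithm, moving each one to a nearby position with a higher ratio; how the paper schedules that step is not stated in the abstract.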
Notes
Technical analysis is the science of recording, usually in graphic form, the actual history of trading (price changes, volume of transactions, etc.) in a certain stock and then deducing from that pictured history the probable future trend [34]. Consequently, technical indicators are numerical values calculated on the basis of past prices, volumes, and other market statistics and used to forecast future price movements.
Trend analysis studies also include the well-known Elliott Wave Principle, Dow Theory and related vocabulary, such as primary or secondary trends.
Note that the first and last points of the chromosome are always considered cut points.
Other statistical and temporal characteristics of each segment were tested, but the experimental results were better with these five metrics.
We consider a linear trend because the segments are short.
Stock market prices are affected by many factors and events, some of which influence prices directly and others indirectly (internal developments, world events...). Both the stock price of a company and the market in general may be affected by world events such as wars and civil unrest, natural disasters and terrorism, as well as by business fundamentals, company events, human psychology, the economy in general, inflation and world news.
There are 24 well-known financial patterns [2], and these are used to verify the segmentation and clustering results.
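The notes on cut points and segment metrics can be made concrete with a short sketch. It assumes a binary chromosome in which a 1 marks a cut point, and it computes only two illustrative metrics per segment (variance and least-squares slope) as a stand-in for the five measures, which are not named in this section. All function names, the shared-endpoint convention and the toy data are hypothetical.

```python
import statistics


def decode_cut_points(chromosome):
    """Turn a binary chromosome into sorted cut-point indices.

    The first and last positions are always treated as cut points.
    """
    cuts = [i for i, gene in enumerate(chromosome) if gene == 1]
    last = len(chromosome) - 1
    return sorted(set(cuts) | {0, last})


def lstsq_slope(values):
    """Least-squares slope of values against their index (linear trend)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def segment_features(series, cuts):
    """Map each segment to (variance, slope), a 2-D stand-in for the 5-D space."""
    features = []
    for start, end in zip(cuts, cuts[1:]):
        seg = series[start:end + 1]  # assumption: adjacent segments share the cut point
        features.append((statistics.pvariance(seg), lstsq_slope(seg)))
    return features


# Toy data (assumed): a rising stretch followed by a flat one.
series = [1, 2, 3, 4, 4, 4, 4]
chromosome = [0, 0, 0, 1, 0, 0, 0]
cuts = decode_cut_points(chromosome)
features = segment_features(series, cuts)
```

Each feature tuple is one point in the space that a clustering step would then group; with the full set of five measures the tuples would simply have five components.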
References
Ding Y, Yang X, Kavs A, Li J (2010) A novel piecewise linear segmentation for time series. In: Computer and Automation Engineering (ICCAE), 2010 the 2nd international conference on, vol 4 (Feb 2010). pp 52–55
Chung FL, Fu TC, Ng V, Luk RW (2004) An evolutionary approach to pattern-based time series segmentation. IEEE Trans Evolut Comput 8(5):471–489
Esling P, Agon C (2012) Time-series data mining. ACM Comput Surv 45(1):12:1–12:34
Fu TC (2011) A review on time series data mining. Eng Appl Artif Intell 24(1):164–181
Lin W, Orgun M, Williams G (2002) An overview of temporal data mining. In: Proceedings of the 1st Australian data mining workshop (ADM02). ACT, University of Technology, Canberra, pp 83–90
Abonyi J, Feil B, Nemeth S, Arva P (2005) Modified Gath–Geva clustering for fuzzy segmentation of multivariate time-series. Fuzzy Sets Syst 149(1):39–56 (Fuzzy Sets in Knowledge Discovery)
Berndt DJ, Clifford J (1996) Advances in knowledge discovery and data mining. American Association for Artificial Intelligence, Menlo Park
Chan KP, Fu A (1999) Efficient time series matching by wavelets. In: Data Engineering, 1999. Proceedings of the 15th international conference on (Mar 1999). pp 126–133
Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. SIGMOD Rec 23(2):419–429
Keogh E, Chakrabarti K, Pazzani M, Mehrotra S (2001) Locally adaptive dimensionality reduction for indexing large time series databases. SIGMOD Rec 30(2):151–162
Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371
Aach J, Church GM (2001) Aligning gene expression time series with time warping algorithms. Bioinformatics 17(6):495–508
Zeiler A, Faltermeier R, Tomé A, Puntonet C, Brawanski A, Lang E (2013) Weighted sliding empirical mode decomposition for online analysis of biomedical time series. Neural Process Lett 37(1):21–32
Lin T, Kaminski N, Bar-Joseph Z (2008) Alignment and classification of time series gene expression in clinical studies. Bioinformatics 24(13):i147–i155
Himberg J, Korpiaho K, Mannila H, Tikanmaki J, Toivonen H (2001) Time series segmentation for context recognition in mobile devices. In: Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on. pp 203–210
Chung F, Fu T, Luk R, Ng V (2002) Evolutionary time series segmentation for stock data mining. In: Data mining, 2002. ICDM 2002. Proceedings of the 2002 IEEE international conference on. pp 83–90
Han M, Xu M (2015) Predicting multivariate time series using subspace echo state network. Neural Process Lett 41(2):201–209
Modenesi A, Braga A (2009) Analysis of time series novelty detection strategies for synthetic and real data. Neural Process Lett 30(1):1–17
Keogh EJ, Chu S, Hart D, Pazzani M (2004) Segmenting time series: a survey and novel approach. In: Last M, Kandel A, Bunke H (eds) Data mining in time series databases, volume 57 of series in machine perception and artificial intelligence. World Scientific Publishing Company, Singapore, pp 1–22
Keogh E, Chu S, Hart D, Pazzani M (2001) An online algorithm for segmenting time series. In: Data mining, 2001. ICDM 2001, Proceedings of the IEEE international conference on. pp 289–296
Prandoni P, Goodwin M, Vetterli M (1997) Optimal time segmentation for signal modeling and compression. In: Acoustics, speech, and signal processing, 1997. ICASSP-97., 1997 IEEE international conference on vol 3. IEEE, pp 2029–2032
Bennett KD (1996) Determination of the number of zones in a biostratigraphical sequence. New Phytol 132(1):155–170
Kehagias A, Nidelkou E, Petridis V (2005) A dynamic programming segmentation procedure for hydrological and environmental time series. Stoch Environ Res Risk Assess 20(1–2):77–94
Nikolaou A, Gutiérrez PA, Durán A, Dicaire I, Fernández-Navarro F, Hervás-Martínez C (2015) Detection of early warning signals in paleoclimate data using a genetic time series segmentation algorithm. Clim Dyn 44(7–8):1919–1933
Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and application. Prentice-Hall Inc, Upper Saddle River
Brodsky BE, Darkhovsky BS (2000) Non-parametric statistical diagnosis: problems and methods volume 509 of mathematics and its applications. Kluwer Academic Publishers, Dordrecht
Gustafsson F (2000) Adaptive filtering and change detection, vol 1. Wiley, New York
Vert JP, Bleakley K (2010) Fast detection of multiple change-points shared by many signals using group lars. In: Advances in Neural Information Processing Systems, pp 2343–2351
Lung-Yut-Fong A, Lévy-Leduc C, Cappé O (2011) Robust retrospective multiple change-point estimation for multivariate data. In: 2011 IEEE statistical signal processing workshop (SSP). IEEE, pp 405–408
Wang H, Tang M, Park Y, Priebe CE (2014) Locality statistics for anomaly detection in time series of graphs. IEEE Trans Signal Process 62(3):703–717
Harlé F, Chatelain F, Gouy-Pailler C, Achard S (2016) Bayesian model for multiple change-points detection in multivariate time series. IEEE Trans Signal Process 64(16):4351–4362
Pratt KB (2001) Locating patterns in discrete time-series. PhD Thesis, University of South Florida, Department of Computer Science and Engineering
Gonzalez L, Powell JG, Shi J, Wilson A (2005) Two centuries of bull and bear market cycles. Int Rev Econ Finance 14(4):469–486
Edwards RD, Magee J, Bassetti W (2013) Technical analysis of stock trends, 10th edn. CRC Press, Boca Raton
Pagan AR, Sossounov KA (2003) A simple framework for analysing bull and bear markets. J Appl Econom 18(1):23–46
Levy M, Levy H, Solomon S (1994) A microscopic model of the stock market: cycles, booms, and crashes. Econ Lett 45(1):103–111
Liao TW (2005) Clustering of time series data—a survey. Pattern Recognit 38(11):1857–1874
Kavitha V, Punithavalli M (2010) Clustering time series data stream-a literature survey. arXiv preprint arXiv:1005.4270
Rani S, Sikka G (2012) Recent techniques of clustering of time series data: a survey. Int J Comput Appl 52(15):1–9
Aghabozorgi S, Shirkhorshidi AS, Wah TY (2015) Time-series clustering-a decade review. Inf Syst 53:16–38
Iorio C, Frasso G, D’Ambrosio A, Siciliano R (2016) Parsimonious time series clustering using p-splines. Expert Syst Appl 52:26–38
Maharaj EA (2000) Cluster of time series. J Classif 17(2):297–314
Tseng VS, Chen CH, Huang PC, Hong TP (2009) Cluster-based genetic segmentation of time series with DWT. Pattern Recognit Lett 30(13):1190–1197
Durán-Rosal AM, de la Paz-Marín M, Gutiérrez PA, Hervás-Martínez C (2015) Applying a hybrid algorithm to the segmentation of the Spanish stock market index time series. Springer International Publishing, Cham, pp 69–79
Chundi P, Subramaniam M, Vasireddy DK (2009) An approach for temporal analysis of email data based on segmentation. Data Knowl Eng 68(11):1253–1270 (Including Special Section: Conference on Privacy in Statistical Databases (PSD 2008)—Six selected and extended papers on Database Privacy)
Fuchs E, Gruber T, Nitschke J, Sick B (2010) Online segmentation of time series based on polynomial least-squares approximations. IEEE Trans Pattern Anal Mach Intell 32(12):2232–2245
Houck CR, Joines JA, Kay MG, Wilson JR (1997) Empirical investigation of the benefits of partial lamarckianism. Evol Comput 5(1):31–60
Arthur D, Vassilvitskii S (2007) K-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM symposium on discrete algorithms. SODA ’07, Philadelphia, PA, USA, Society for Industrial and Applied Mathematics, pp 1027–1035
Xu R, Wunsch D (2008) Clustering. IEEE press series on computational intelligence. Wiley, Hoboken
Gurrutxaga I, Albisua I, Arbelaitz O, Martín JI, Muguerza J, Pérez JM, Perona I (2010) Sep/cop: an efficient method to find the best partition in hierarchical clustering based on a new cluster validity index. Pattern Recognit 43(10):3364–3373
Arbelaitz O, Gurrutxaga I, Muguerza J, Pérez JM, Perona I (2013) An extensive comparative study of cluster validity indices. Pattern Recognit 46(1):243–256
Cheong SA, Fornia RP, Lee GHT, Kok JL, Yim WS, Xu DY, Zhang Y (2012) The Japanese economy in crises: a time series segmentation study. Econ Open-Access Open-Assess E J 6(2012–5):1–81
Bernaola-Galván P, Román-Roldán R, Oliver JL (1996) Compositional segmentation and long-range fractal correlations in DNA sequences. Phys Rev E 53:5181–5189
Sato AH (2013) A comprehensive analysis of time series segmentation on Japanese stock prices. Proced Comput Sci 24(0):307–314 (17th Asia Pacific Symposium on Intelligent and Evolutionary Systems, IES2013)
Zhuang E, Small M, Feng G (2014) Time series analysis of the developed financial markets’ integration using visibility graphs. Phys A Stat Mech Appl 410:483–495
Degiannakis S, Floros C (2013) Modeling CAC40 volatility using ultra-high frequency data. Res Int Bus Finance 28:68–81
Canova F (1999) Does detrending matter for the determination of the reference cycle and the selection of turning points? Econ J 109(452):126–150
Acknowledgements
This work has been partially subsidized by the TIN2014-54583-C2-1-R and the TIN2015-70308-REDT projects of the Spanish Ministerial Commission of Science and Technology (MINECO, Spain), FEDER funds (EU) and the P11-TIC-7508 project of the “Junta de Andalucía” (Spain). Antonio M. Durán-Rosal’s research has been subsidized by the FPU Predoctoral Program (Spanish Ministry of Education and Science), grant reference FPU14/03039.
Cite this article
Durán-Rosal, A.M., de la Paz-Marín, M., Gutiérrez, P.A. et al. Identifying Market Behaviours Using European Stock Index Time Series by a Hybrid Segmentation Algorithm. Neural Process Lett 46, 767–790 (2017). https://doi.org/10.1007/s11063-017-9592-8
DOI: https://doi.org/10.1007/s11063-017-9592-8