Abstract
Green computing is the study and practice of using computer resources efficiently. Its main goal is algorithmic efficiency: designing algorithms that are resource-efficient, accurate, and energy-efficient. Algorithmic efficiency is particularly important when handling time-series data. One of the main tasks in handling time-series data is finding subsequences that are similar to a given query sequence. State-of-the-art subsequence matching methods produce many false alarms because they filter points by comparing only one query window with its corresponding data window. In this paper, we propose a subsequence matching method for green computing called Efficient Duality-based Subsequence Matching (E-Dual Match). E-Dual Match considers all possible query windows when determining candidates. As a result, E-Dual Match reduces false alarms and improves performance compared to Dual Match while meeting the main requirements of green computing: it uses limited computer resources efficiently, and it is accurate and energy-efficient. Experimental results show that E-Dual Match reduces the number of candidates by up to 4.90 times and improves subsequence matching time by up to 2.35 times compared to Dual Match. We also show that E-Dual Match reduces the number of data page accesses by up to 3.04 times compared to Dual Match.
Notes
Ah-Yeon Jin (Sookmyung Women’s University) helped to implement the construction of add-index.
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2012003797).
Cite this article
Ihm, SY., Nasridinov, A., Lee, JH. et al. Efficient duality-based subsequent matching on time-series data in green computing. J Supercomput 69, 1039–1053 (2014). https://doi.org/10.1007/s11227-013-1028-2
DOI: https://doi.org/10.1007/s11227-013-1028-2