Stream mining on univariate uncertain data

Ying-Ho Liu¹


Abstract

In this paper, we propose mining frequent patterns from univariate uncertain data streams, in which each attribute in a transaction has a quantitative interval and a probability density function describing how likely each value in the interval is to appear. Many data streams comprise flows of univariate uncertain data, for example, the records of atmospheric pollution sensors and network monitoring records. We propose two algorithms to address this issue: ExactU2Stream and ApproxiU2Stream. ExactU2Stream incrementally stores the incoming transactions and delays the mining process until it is requested. ApproxiU2Stream mines the transactions immediately when they arrive and stores the derived frequent patterns. ExactU2Stream returns more accurate results than ApproxiU2Stream, but it also requires more response time. Both algorithms use the sliding window scheme, which decomposes the continuous data stream into discrete, overlapping chunks. The proposed algorithms outperform the compared methods in terms of runtime and memory usage. We have applied the two algorithms to data streams recording the air quality in Taiwan; the derived frequent patterns show not only the common air quality in Taiwan but also the extremely bad air quality when a sandstorm affects Taiwan.
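To make the data model and the sliding window scheme concrete, the following is a minimal Python sketch, not the paper's ExactU2Stream or ApproxiU2Stream algorithm. It assumes a uniform probability density function over each interval (the abstract allows a general one), a window that advances by one transaction at a time, and an expected-support measure computed as a sum of probabilities. The names `UncertainValue`, `sliding_windows`, `expected_support`, and the attribute `pm10` are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """One attribute reading: an interval [low, high] with a pdf over it.

    Assumption: the pdf is uniform; the abstract allows a general pdf.
    """
    low: float
    high: float

    def prob_in(self, a: float, b: float) -> float:
        """Probability that the true value lies in [a, b]."""
        if self.high == self.low:  # degenerate interval: a point value
            return 1.0 if a <= self.low <= b else 0.0
        overlap = min(self.high, b) - max(self.low, a)
        return max(0.0, overlap) / (self.high - self.low)


def sliding_windows(stream, size):
    """Yield overlapping windows holding the `size` most recent transactions."""
    window = deque(maxlen=size)  # the oldest transaction is dropped automatically
    for transaction in stream:
        window.append(transaction)
        if len(window) == size:
            yield list(window)


def expected_support(window, attribute, a, b):
    """Sum over the window of P(attribute value in [a, b]) (assumed measure)."""
    return sum(t[attribute].prob_in(a, b) for t in window if attribute in t)


# Hypothetical stream of four sensor readings for one attribute.
stream = [
    {"pm10": UncertainValue(40.0, 60.0)},
    {"pm10": UncertainValue(50.0, 70.0)},
    {"pm10": UncertainValue(80.0, 120.0)},
    {"pm10": UncertainValue(55.0, 65.0)},
]
windows = list(sliding_windows(stream, size=3))
supports = [expected_support(w, "pm10", 50.0, 60.0) for w in windows]
```

With a window size of 3, the four readings produce two overlapping windows that share two transactions. A pattern such as "pm10 in [50, 60]" could then be reported as frequent in a window when its expected support reaches a user-chosen threshold; that threshold test is likewise an assumption of this sketch.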

Author information

Authors and Affiliations

National Dong Hwa University, Hualien, Taiwan, Republic of China
Ying-Ho Liu

Corresponding author

Correspondence to Ying-Ho Liu.

About this article

Cite this article

Liu, YH. Stream mining on univariate uncertain data. Appl Intell 39, 315–344 (2013). https://doi.org/10.1007/s10489-012-0415-3

Published: 06 February 2013
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10489-012-0415-3
