research-article

Mining Rank Data

Authors:

Sascha Henzgen,

Eyke HüllermeierAuthors Info & Claims

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 13, Issue 6

Article No.: 59, Pages 1 - 36

https://doi.org/10.1145/3363572

Published: 11 November 2019 Publication History

Abstract

The problem of frequent pattern mining has been studied quite extensively for various types of data, including sets, sequences, and graphs. Somewhat surprisingly, another important type of data, namely rank data, has received very little attention in data mining so far. In this article, we therefore address the problem of mining rank data, that is, data in the form of rankings (total orders) of an underlying set of items. More specifically, two types of patterns are considered, namely frequent rankings and dependencies between such rankings in the form of association rules. Algorithms for mining frequent rankings and frequent closed rankings are proposed and tested experimentally, using both synthetic and real data.

References

[1]

C. C. Aggrawal and J. Han. 2014. Frequent Pattern Mining. Springer.

[2]

R. Agrawal and R. Srikant. 1994. Fast algorithms for mining association rules. In Proceedings on the 20th International Conference on Very Large Data Bases (VLDB’94). San Francisco, CA, 487--499.

[3]

R. Agrawal and R. Srikant. 1995. Mining sequential patterns. In Proceedings of the 11th International Conference on Data Engineering (ICDE’95). Washington, DC, 3--14. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=380415

Digital Library

[4]

J. Ayresand, J. Flannick, J. Gehrke, and T. Yiu. 2002. Sequential PAttern mining using a bitmap representation. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02). New York, NY, 429--435.

[5]

T. Calders, B. Goethals, and S. Jaroszewicz. 2006. Mining rank-correlated sets of numerical attributes. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06). New York, NY, 96--105.

[6]

W. Cheng, E. Hüllermeier, W. Waegeman, and V. Welker. 2012. Label ranking with partial abstention based on thresholded probabilistic models. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS’12). Lake Tahoe, Nevada.

[7]

Vachik Dave, Baichuan Zhang, Pin-Yu Chen, and Mohammad Al Hasan. 2016. Trust from the past: Bayesian personalized ranking based link prediction in knowledge graphs. In arXiv:1804.08774.

[8]

B. A. Davey and H. A. Priestley. 2002. Introduction to Lattices and Order. Cambridge University Press.

[9]

C. Rebelo de Sa, C. Soares, A. Mario, J. Paulo, and A. J. Costa. 2011. Mining association rules for label ranking. In Proceedingds of Advances in Knowledge Discovery and Data Mining (PAKDD’11). 432--443.

[10]

P. Dragone, S. Teso, and A. Passerini. 2018. Constructive preference elicitation over hybrid combinatorial spaces. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. 2943--2950.

[11]

M. EL-Sayed, E. A. Rundensteiner, and C. Ruiz. 2003. FS-Miner: An efficient and Incremental System to Mine Contiguous Frequent Sequences. Retrieved from http://digitalcommons.wpi.edu/computerscience-pubs/132/.

[12]

P. Fournier-Viger, A. Gomariz, M. Campos, and R. Thomas. 2014. Fast vertical mining of sequential patterns using co-occurrence information. In Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’14). Springer, Tainan, Taiwan, 40--52.

[13]

P. Fournier-Viger, A. Gomariz, T. Gueniche, A. Soltani, C. Wu, and V. S. Tseng. 2014. SPMF: A Java open-source pattern mining library. Journal of Machine Learning Research 15, 1 (2014), 3389--3393. Retrieved from http://www.philippe-fournier-viger.com/spmf/.

Digital Library

[14]

J. Fürnkranz and E. Hüllermeier (Eds.). 2011. Preference Learning. Springer.

[15]

A. Gomariz, M. Campos, R. Marín, and B. Goethals. 2013. ClaSP: An efficient algorithm for mining frequent closed sequences. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13). 50--61.

[16]

J. Han, H. Cheng, D. Xin, and X. Yan. 2007. Frequent pattern mining: Current status and future directions. Data Mining and Knowledge Discovery 15, 1 (2007), 55--86.

[17]

J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M. Hsu. 2000. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’00). New York, NY, 355--359.

[18]

J. Han, J. Pei, and Y. Yin. 2000. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD’00). 1--12.

Digital Library

[19]

S. Henzgen and E. Hüllermeier. 2014. Mining rank data. In Proceedings of the 17th International Conference on Discovery Science. Bled, Slovenia, 123--134.

[20]

K. Y. Huang, C. H. Chang, J. H. Tung, and C. T. Ho. 2006. COBRA: Closed sequential pattern mining using bi-phase reduction approach. In Proceedings of the 8th International Conference in Data Warehousing and Knowledge Discovery (DaWaK’06). 280--291.

[21]

N. R. Mabroukeh and C. I. Ezeife. 2010. A taxonomy of sequential pattern mining algorithms. ACM Computing Surveys 43, 1 (2010), Article 3, 3:1--3:41 pages.

[22]

F. Masseglia, F. Cathala, and P. Poncelet. 1998. The PSP approach for mining sequential patterns. In Proceedings of the 2nd European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD’98). London, UK, 176--184. Retrieved from http://dl.acm.org/citation.cfm?id=645802.669334.

[23]

F. Masseglia, P. Poncelet, and R. Cicchetti. 2000. An efficient algorithm for web usage mining. Networking and Information Systems Journal 2, 5/6 (2000), 571--604.

[24]

N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. 1999. Efficient mining of association rules using closed itemset lattices. Information Systems 24, 1 (1999), 25--46.

Digital Library

[25]

J. Pei, J. Han, and R. Mao. 2000. CLOSET: An efficient algorithm for mining frequent closed itemsets. In Proceedings of the 4th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD’00). Dallas, Texas, 21--30.

[26]

J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th International Conference on Data Engineering (ICDE’01). Heidelberg, Germany, 215--224.

Digital Library

[27]

J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. H. Pinto, Q. Chen, U. Dayal, and M. Hsu. 2004. Mining sequential patterns by pattern-growth: The PrefixSpan approach. IEEE Transactions on Knowledge and Data Engineering 16, 11 (2004), 1424--1440.

Digital Library

[28]

J. Pei, J. Han, B. Mortazavi-Asl, and H. Zhu. 2000. Mining access patterns efficiently from web logs. In Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications (PADKK’00). London, UK, 396--407. Retrieved from http://dl.acm.org/citation.cfm?id=646418.693333.

[29]

V. P. Raju and G. P. S. Varma. 2015. Mining closed sequential patterns in large sequence databases. International Journal of Database Management Systems 7, 1 (2015), 29.

[30]

R. Srikant and R. Agrawal. 1996. Mining sequential patterns: Generalizations and performance improvements. In Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’96). London, UK, 3--17.

[31]

E. Suzuki. 2006. Data mining methods for discovering interesting exceptions from an unsupervised table. Journal of Universal Computer Science 12, 6 (2006), 627--653.

[32]

F. Thabtah. 2007. A review of associative classification mining. Knowledge Engineering Review 22, 1 (2007), 37--65.

Digital Library

[33]

T. Le Van, M. Van Leeuwen, S. Nijssen, A. C. Fierro, K. Marchal, and L. De Raedt. 2014. Ranked tiling. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD’14). Nancy, France, 98--113.

[34]

J. Wang, J. Han, and C. Li. 2007. Frequent closed sequence mining without candidate maintenance. IEEE Transactions on Knowledge and Data Engineering 19, 8 (2007), 1042--1056.

Digital Library

[35]

X. Yan, J. Han, and R. Afshar. 2003. CloSpan: Mining closed sequential patterns in large datasets. In Proceedings of the 3rd SIAM International Conference on Data Mining (SDM’03). San Francisco, CA, 166--177.

[36]

M. J. Zaki. 2000. Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering 12, 3 (2000), 372--390.

Digital Library

[37]

M. J. Zaki. 2001. SPADE: An efficient algorithm for mining frequent sequences. Machine Learning 42, 1--2 (2001), 31--60.

[38]

M. J. Zaki and C. J. Hsiao. 2002. CHARM: An efficient algorithm for closed itemset mining. In Proceedings on the 2nd SIAM International Conference on Data Mining (SDM’02). Arlington, Virginia, 457--473.

[39]

B. Zhang, S. Choudhury, M. Al Hasan, X. Ning, K. Agarwal, S. Purohit, and P. P. Cabrera. 2016. Trust from the past: Bayesian personalized ranking based link prediction in knowledge graphs. In Proceedings on the 3rd SDM Workshop on Mining Networks and Graphs (SDM’16). Miami, Florida.

Index Terms

Mining Rank Data
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Association rules

Recommendations

Mining fuzzy specific rare itemsets for education data

Association rule mining is an important data analysis method for the discovery of associations within data. There have been many studies focused on finding fuzzy association rules from transaction databases. Unfortunately, in the real world, one may ...
Mining association rules using fuzzy inference on web data
AWIC'03: Proceedings of the 1st international Atlantic web intelligence conference on Advances in web intelligence

The association rules model is one of most widely used models in data mining. An association rule is an implication of the form X → Y, where X and Y are a set of items that satisfy two constraints, given by the user, called minimum support (minsup) and ...
Enhanced sequence identification technique for protein sequence database mining with hybrid frequent pattern mining algorithm

Sequential pattern mining is the task of identifying the patterns present in a certain number of data instances. This paper proposes a novel Enhanced Sequence Identification ESI approach to effectively find the frequent patterns from the huge dataset. ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Knowledge Discovery from Data

ACM Transactions on Knowledge Discovery from Data Volume 13, Issue 6

December 2019

282 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3366748

Editors:
Charu Aggarwal
IBM T. J. Watson Research, USA
,
Xindong Wu
Minginglamp Academy of Sciences, China

Issue’s Table of Contents

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2019

Accepted: 01 August 2019

Revised: 01 May 2019

Received: 01 November 2017

Published in TKDD Volume 13, Issue 6

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
300
Total Downloads

Downloads (Last 12 months)3
Downloads (Last 6 weeks)0

Reflects downloads up to 08 Feb 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Figures

Tables

Media

View Issue’s Table of Contents