Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Mining Rank Data

Published: 11 November 2019 Publication History

Abstract

The problem of frequent pattern mining has been studied quite extensively for various types of data, including sets, sequences, and graphs. Somewhat surprisingly, another important type of data, namely rank data, has received very little attention in data mining so far. In this article, we therefore address the problem of mining rank data, that is, data in the form of rankings (total orders) of an underlying set of items. More specifically, two types of patterns are considered, namely frequent rankings and dependencies between such rankings in the form of association rules. Algorithms for mining frequent rankings and frequent closed rankings are proposed and tested experimentally, using both synthetic and real data.

References

[1]
C. C. Aggrawal and J. Han. 2014. Frequent Pattern Mining. Springer.
[2]
R. Agrawal and R. Srikant. 1994. Fast algorithms for mining association rules. In Proceedings on the 20th International Conference on Very Large Data Bases (VLDB’94). San Francisco, CA, 487--499.
[3]
R. Agrawal and R. Srikant. 1995. Mining sequential patterns. In Proceedings of the 11th International Conference on Data Engineering (ICDE’95). Washington, DC, 3--14. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=380415
[4]
J. Ayresand, J. Flannick, J. Gehrke, and T. Yiu. 2002. Sequential PAttern mining using a bitmap representation. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02). New York, NY, 429--435.
[5]
T. Calders, B. Goethals, and S. Jaroszewicz. 2006. Mining rank-correlated sets of numerical attributes. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06). New York, NY, 96--105.
[6]
W. Cheng, E. Hüllermeier, W. Waegeman, and V. Welker. 2012. Label ranking with partial abstention based on thresholded probabilistic models. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS’12). Lake Tahoe, Nevada.
[7]
Vachik Dave, Baichuan Zhang, Pin-Yu Chen, and Mohammad Al Hasan. 2016. Trust from the past: Bayesian personalized ranking based link prediction in knowledge graphs. In arXiv:1804.08774.
[8]
B. A. Davey and H. A. Priestley. 2002. Introduction to Lattices and Order. Cambridge University Press.
[9]
C. Rebelo de Sa, C. Soares, A. Mario, J. Paulo, and A. J. Costa. 2011. Mining association rules for label ranking. In Proceedingds of Advances in Knowledge Discovery and Data Mining (PAKDD’11). 432--443.
[10]
P. Dragone, S. Teso, and A. Passerini. 2018. Constructive preference elicitation over hybrid combinatorial spaces. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. 2943--2950.
[11]
M. EL-Sayed, E. A. Rundensteiner, and C. Ruiz. 2003. FS-Miner: An efficient and Incremental System to Mine Contiguous Frequent Sequences. Retrieved from http://digitalcommons.wpi.edu/computerscience-pubs/132/.
[12]
P. Fournier-Viger, A. Gomariz, M. Campos, and R. Thomas. 2014. Fast vertical mining of sequential patterns using co-occurrence information. In Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’14). Springer, Tainan, Taiwan, 40--52.
[13]
P. Fournier-Viger, A. Gomariz, T. Gueniche, A. Soltani, C. Wu, and V. S. Tseng. 2014. SPMF: A Java open-source pattern mining library. Journal of Machine Learning Research 15, 1 (2014), 3389--3393. Retrieved from http://www.philippe-fournier-viger.com/spmf/.
[14]
J. Fürnkranz and E. Hüllermeier (Eds.). 2011. Preference Learning. Springer.
[15]
A. Gomariz, M. Campos, R. Marín, and B. Goethals. 2013. ClaSP: An efficient algorithm for mining frequent closed sequences. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13). 50--61.
[16]
J. Han, H. Cheng, D. Xin, and X. Yan. 2007. Frequent pattern mining: Current status and future directions. Data Mining and Knowledge Discovery 15, 1 (2007), 55--86.
[17]
J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M. Hsu. 2000. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’00). New York, NY, 355--359.
[18]
J. Han, J. Pei, and Y. Yin. 2000. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD’00). 1--12.
[19]
S. Henzgen and E. Hüllermeier. 2014. Mining rank data. In Proceedings of the 17th International Conference on Discovery Science. Bled, Slovenia, 123--134.
[20]
K. Y. Huang, C. H. Chang, J. H. Tung, and C. T. Ho. 2006. COBRA: Closed sequential pattern mining using bi-phase reduction approach. In Proceedings of the 8th International Conference in Data Warehousing and Knowledge Discovery (DaWaK’06). 280--291.
[21]
N. R. Mabroukeh and C. I. Ezeife. 2010. A taxonomy of sequential pattern mining algorithms. ACM Computing Surveys 43, 1 (2010), Article 3, 3:1--3:41 pages.
[22]
F. Masseglia, F. Cathala, and P. Poncelet. 1998. The PSP approach for mining sequential patterns. In Proceedings of the 2nd European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD’98). London, UK, 176--184. Retrieved from http://dl.acm.org/citation.cfm?id=645802.669334.
[23]
F. Masseglia, P. Poncelet, and R. Cicchetti. 2000. An efficient algorithm for web usage mining. Networking and Information Systems Journal 2, 5/6 (2000), 571--604.
[24]
N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. 1999. Efficient mining of association rules using closed itemset lattices. Information Systems 24, 1 (1999), 25--46.
[25]
J. Pei, J. Han, and R. Mao. 2000. CLOSET: An efficient algorithm for mining frequent closed itemsets. In Proceedings of the 4th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD’00). Dallas, Texas, 21--30.
[26]
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th International Conference on Data Engineering (ICDE’01). Heidelberg, Germany, 215--224.
[27]
J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. H. Pinto, Q. Chen, U. Dayal, and M. Hsu. 2004. Mining sequential patterns by pattern-growth: The PrefixSpan approach. IEEE Transactions on Knowledge and Data Engineering 16, 11 (2004), 1424--1440.
[28]
J. Pei, J. Han, B. Mortazavi-Asl, and H. Zhu. 2000. Mining access patterns efficiently from web logs. In Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications (PADKK’00). London, UK, 396--407. Retrieved from http://dl.acm.org/citation.cfm?id=646418.693333.
[29]
V. P. Raju and G. P. S. Varma. 2015. Mining closed sequential patterns in large sequence databases. International Journal of Database Management Systems 7, 1 (2015), 29.
[30]
R. Srikant and R. Agrawal. 1996. Mining sequential patterns: Generalizations and performance improvements. In Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’96). London, UK, 3--17.
[31]
E. Suzuki. 2006. Data mining methods for discovering interesting exceptions from an unsupervised table. Journal of Universal Computer Science 12, 6 (2006), 627--653.
[32]
F. Thabtah. 2007. A review of associative classification mining. Knowledge Engineering Review 22, 1 (2007), 37--65.
[33]
T. Le Van, M. Van Leeuwen, S. Nijssen, A. C. Fierro, K. Marchal, and L. De Raedt. 2014. Ranked tiling. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD’14). Nancy, France, 98--113.
[34]
J. Wang, J. Han, and C. Li. 2007. Frequent closed sequence mining without candidate maintenance. IEEE Transactions on Knowledge and Data Engineering 19, 8 (2007), 1042--1056.
[35]
X. Yan, J. Han, and R. Afshar. 2003. CloSpan: Mining closed sequential patterns in large datasets. In Proceedings of the 3rd SIAM International Conference on Data Mining (SDM’03). San Francisco, CA, 166--177.
[36]
M. J. Zaki. 2000. Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering 12, 3 (2000), 372--390.
[37]
M. J. Zaki. 2001. SPADE: An efficient algorithm for mining frequent sequences. Machine Learning 42, 1--2 (2001), 31--60.
[38]
M. J. Zaki and C. J. Hsiao. 2002. CHARM: An efficient algorithm for closed itemset mining. In Proceedings on the 2nd SIAM International Conference on Data Mining (SDM’02). Arlington, Virginia, 457--473.
[39]
B. Zhang, S. Choudhury, M. Al Hasan, X. Ning, K. Agarwal, S. Purohit, and P. P. Cabrera. 2016. Trust from the past: Bayesian personalized ranking based link prediction in knowledge graphs. In Proceedings on the 3rd SDM Workshop on Mining Networks and Graphs (SDM’16). Miami, Florida.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Knowledge Discovery from Data
ACM Transactions on Knowledge Discovery from Data  Volume 13, Issue 6
December 2019
282 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3366748
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2019
Accepted: 01 August 2019
Revised: 01 May 2019
Received: 01 November 2017
Published in TKDD Volume 13, Issue 6

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Data mining
  2. association rules
  3. frequent pattern mining
  4. rank data

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 300
    Total Downloads
  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 08 Feb 2025

Other Metrics

Citations

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media