Incremental interactive mining of constrained association rules from biological annotation data with nominal features

Authors:

Hassan Najadat,

William Perrizo,

Willy Valdivia

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

Pages 123 - 127

https://doi.org/10.1145/1066677.1066710

Published: 13 March 2005

Abstract

Genomic and proteomic experiments are producing raw data at a rapid pace, while our understanding of that data still lags far behind; the need to analyze such biological data has therefore become prominent in the current post-genomic era. In this paper we analyze annotated genome data by applying association rule mining, a central data-mining technique, with the aim of discovering rules that yield deeper insight into this type of data. We propose a new technique that uses domain knowledge, expressed as queries, to efficiently mine only the subset of associations that are of interest to the researcher, in an incremental and interactive mode.
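To make the idea of query-constrained association rule mining concrete, the following is a minimal brute-force sketch in Python. It is not the technique proposed in the paper, whose details are not given in the abstract; it only illustrates the general notion of restricting rule generation to itemsets that contain items named in a user query. The annotation values, the function names `support` and `constrained_rules`, and the thresholds are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def constrained_rules(transactions, required, min_support, min_confidence, max_size=3):
    """Enumerate rules A -> B over itemsets that include every `required` item.

    Brute-force illustration only: the query (`required`) prunes itemsets
    before any rule is generated, so uninteresting rules are never built.
    """
    required = frozenset(required)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            if not required <= itemset:
                continue  # the query constraint is not satisfied
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue  # infrequent itemset
            # Split the frequent itemset into antecedent and consequent.
            for k in range(1, size):
                for antecedent in combinations(sorted(itemset), k):
                    a = frozenset(antecedent)
                    confidence = s / support(a, transactions)
                    if confidence >= min_confidence:
                        rules.append((a, itemset - a, s, confidence))
    return rules


# Hypothetical nominal annotations, one set of attribute=value items per gene.
transactions = [
    frozenset({"class=kinase", "loc=nucleus", "pheno=lethal"}),
    frozenset({"class=kinase", "loc=nucleus"}),
    frozenset({"class=kinase", "loc=cytoplasm"}),
    frozenset({"class=transporter", "loc=membrane"}),
]

# Query: only rules that involve nuclear localization are of interest.
rules = constrained_rules(transactions, {"loc=nucleus"}, min_support=0.5, min_confidence=0.6)
for a, b, s, c in rules:
    print(sorted(a), "->", sorted(b), f"support={s:.2f}", f"confidence={c:.2f}")
```

An interactive session in this spirit would rerun `constrained_rules` with a refined query; reusing support counts from earlier queries, which an incremental method would presumably do, is deliberately left out of this sketch.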

Cited By

Aydın T, Güvenir H (2009). Modeling interestingness of streaming association rules as a benefit-maximizing classification problem. Knowledge-Based Systems 22:1 (85-99). Online publication date: 1-Jan-2009.
https://dl.acm.org/doi/10.1016/j.knosys.2008.07.003
Chodos D, Zaiane O (2008). ARC-UI. Proceedings of the 2008 12th International Conference Information Visualisation (296-301). Online publication date: 9-Jul-2008.
https://dl.acm.org/doi/10.1109/IV.2008.35

Index Terms

Incremental interactive mining of constrained association rules from biological annotation data with nominal features
1. Applied computing
  1. Life and medical sciences

Published In

SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

March 2005

1814 pages

ISBN:1581139640

DOI:10.1145/1066677

Conference Chair: Hisham M. Haddad, Kennesaw State University

Editor: Lorie M. Liebrock, New Mexico Institute of Mining and Technology, Socorro, NM

Program Chairs: Andrea Omicini, Alma Mater Studiorum, Università di Bologna, Italy; Roger L. Wainwright, University of Tulsa, OK

Copyright © 2005 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SAC05: The 2005 ACM Symposium on Applied Computing

Sponsor: SIGAPP

March 13 - 17, 2005

Santa Fe, New Mexico

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Bibliometrics

Total Citations: 2
Total Downloads: 629
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 13 Nov 2024
