DOI: 10.1145/1066677.1066710
Article

Incremental interactive mining of constrained association rules from biological annotation data with nominal features

Published: 13 March 2005

Abstract

Data from genomic and proteomic experiments are accumulating rapidly, producing huge amounts of raw data; consequently, the need to analyze such biological data, the understanding of which still lags far behind, has become prominent in the current post-genomic era. In this paper we analyze annotated genome data by applying a central data-mining technique, association rule mining, with the aim of discovering rules that yield deeper insight into this type of data. We propose a new technique that uses domain knowledge in the form of queries to efficiently mine, in an incremental and interactive mode, only the subset of associations that are of interest to the researcher.
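
To make the idea of query-constrained rule mining concrete, the following is a minimal sketch and not the technique proposed in the paper: it enumerates itemsets by brute force rather than using the paper's data structures, and it does not model the incremental or interactive aspects. The gene records, the feature names, and the function names `support` and `constrained_rules` are all hypothetical. The "query" is assumed to be a single item that every reported rule must mention.

```python
from itertools import combinations

# Hypothetical annotation records: each gene is a set of nominal
# feature=value items (illustrative data only, not from the paper).
GENES = [
    {"loc=nucleus", "func=transcription", "class=zinc-finger"},
    {"loc=nucleus", "func=transcription"},
    {"loc=cytoplasm", "func=metabolism"},
    {"loc=nucleus", "func=transcription", "class=zinc-finger"},
    {"loc=cytoplasm", "func=transport"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= record for record in records) / len(records)


def constrained_rules(records, required, min_support, min_confidence):
    """Return rules (antecedent, consequent, support, confidence).

    Only itemsets containing the item `required` (the assumed query
    constraint) are considered, so associations outside the user's
    interest are never scored.
    """
    items = sorted(set().union(*records))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            if required not in itemset:
                continue  # constraint: skip itemsets outside the query
            s = support(itemset, records)
            if s == 0 or s < min_support:
                continue
            # Split the frequent itemset into antecedent -> consequent.
            for k in range(1, size):
                for ante in combinations(sorted(itemset), k):
                    antecedent = frozenset(ante)
                    confidence = s / support(antecedent, records)
                    if confidence >= min_confidence:
                        rules.append(
                            (antecedent, itemset - antecedent, s, confidence)
                        )
    return rules


for ante, cons, s, c in constrained_rules(GENES, "func=transcription", 0.5, 0.9):
    print(sorted(ante), "->", sorted(cons), f"support={s:.2f} confidence={c:.2f}")
```

Checking the constraint before any support is computed is the point of the sketch: itemsets that do not match the query are discarded without scanning the records.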

Cited By

  • (2009) Modeling interestingness of streaming association rules as a benefit-maximizing classification problem. Knowledge-Based Systems 22:1 (85-99). DOI: 10.1016/j.knosys.2008.07.003. Online publication date: 1-Jan-2009
  • (2008) ARC-UI. Proceedings of the 2008 12th International Conference Information Visualisation (296-301). DOI: 10.1109/IV.2008.35. Online publication date: 9-Jul-2008

    Published In

    SAC '05: Proceedings of the 2005 ACM symposium on Applied computing
    March 2005
    1814 pages
    ISBN:1581139640
    DOI:10.1145/1066677

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. P-trees
    2. association rule mining
    3. bioinformatics
    4. incremental
    5. interactive
    6. yeast genome

    Qualifiers

    • Article

    Conference

    SAC05: The 2005 ACM Symposium on Applied Computing
    March 13 - 17, 2005
    Santa Fe, New Mexico

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
