Abstract
Inductive queries are queries to an inductive database that generate a set of patterns in a data mining context. Inductive querying poses new challenges to database and data mining technology. We study conjunctive inductive queries, which are queries that can be written as a conjunction of a monotonic and an anti-monotonic subquery. We introduce the conjunctive inductive query optimization problem, which is concerned with minimizing the cost of computing the answer set to a conjunctive query. In the optimization problem, it is assumed that there are costs c a and c m associated to evaluating a pattern w.r.t. a monotonic and an anti-monotonic subquery respectively. The aim is then to minimize the total cost needed to compute all solutions to the query. Secondly, we present an algorithm that aims at optimizing conjunctive inductive queries in the context of the pattern domain of strings and evaluate it on a challenging data set in computational biology.
An early version of this paper was presented at the 2nd ECML/PKDD Workshop on Knowledge Discovery with Inductive Querying, Dubrovnik, 2003.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. VLDB (1994)
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc. SIGMOD, pp. 207–216 (1993)
Bayardo, R.: Efficiently mining long patterns from databases. In: Proc. SIGMOD (1998)
Bucila, C., Gehrke, J., Kifer, D., White, W.: DualMiner: A dual pruning algorithm for itemsets with constraints. In: Proc. of SIGKDD (2002)
De Raedt, L., Kramer, S.: The levelwise version space algorithm and its application to molecular fragment finding. In: Proc. IJCAI (2001)
De Raedt, L., Jäger, M., Lee, S.D., Mannila, H.: A Theory of Inductive Query Answering. In: Proceedings IEEE ICDM (2002)
De Raedt, L.: A perspective on inductive databases. SIGKDD Explorations 4 (2) (2002)
Fischer, J.: Version Spaces im Contraint-Based Data Mining. Diplomarbeit. Albert-Ludwigs-University Freiburg (2003)
Goethals, B., Van den Bussche, J.: On supporting interactive association rule mining. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds.) DaWaK 2000. LNCS, vol. 1874, p. 307. Springer, Heidelberg (2000)
Gunopulos, D., Mannila, H., Saluja, S.: Discovering All Most Specific Sentences by Randomized Algorithms. In: Afrati, F.N., Kolaitis, P.G. (eds.) ICDT 1997. LNCS, vol. 1186, Springer, Heidelberg (1996)
Gunopulos, D., Khardon, R., Mannila, H., Toivonen, H.: Data mining, Hypergraph Transversals, and Machine Learning. In: Proceedings PODS (1997)
Han, J., Fu, Y., Koperski, K., Wang, W., Zaiane, O.: DMQL: A Data Mining Query Language for Relational Databases. In: Proc. SIGMOD 1996 Workshop on Research Issues on Data Mining and Knowledge Discovery (1996)
Han, J., Lakshmanan, L.V.S., Ng, R.T.: Constraint-Based, Multidimensional Data Mining. Computer 32(8), 46–50 (1999)
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proc. SIGMOD (2000)
Hirsh, H.: Generalizing Version Spaces. Machine Learning 17(1), 5–46 (1994)
Hirsh, H.: Theoretical underpinnings of versionspaces. In: Proc. IJCAI (1991)
Inokuchi, A., Washio, T., Motoda, H.: Complete Mining of Frequent Patterns from Graphs: Mining Graph Data. Machine Learning 50(3), 321–354 (2003)
Kramer, S., De Raedt, L., Helma, C.: Molecular Feature Mining in HIV Data. In: Proc. SIGKDD (2001)
Mannila, H., Toivonen, H.: Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery 1 (1997)
Mitchell, T.: Generalization as Search. Artificial Intelligence 18(2), 203–226 (1980)
Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
Ng, R.T., Lakshmanan, L.V.S., Han, J., Pang, A.: Exploratory mining and pruning optimizations of constrained associations rules. In: Proc. SIGMOD (1998)
NIC Genetic Sequence Database, Available at http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fischer, J., De Raedt, L. (2004). Towards Optimizing Conjunctive Inductive Queries. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science(), vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_74
Download citation
DOI: https://doi.org/10.1007/978-3-540-24775-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive