Towards Optimizing Conjunctive Inductive Queries

Johannes Fischer¹⁹ &
Luc De Raedt¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3056))

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

3092 Accesses

Abstract

Inductive queries are queries to an inductive database that generate a set of patterns in a data mining context. Inductive querying poses new challenges to database and data mining technology. We study conjunctive inductive queries, which are queries that can be written as a conjunction of a monotonic and an anti-monotonic subquery. We introduce the conjunctive inductive query optimization problem, which is concerned with minimizing the cost of computing the answer set to a conjunctive query. In the optimization problem, it is assumed that there are costs c _a and c _m associated to evaluating a pattern w.r.t. a monotonic and an anti-monotonic subquery respectively. The aim is then to minimize the total cost needed to compute all solutions to the query. Secondly, we present an algorithm that aims at optimizing conjunctive inductive queries in the context of the pattern domain of strings and evaluate it on a challenging data set in computational biology.

An early version of this paper was presented at the 2nd ECML/PKDD Workshop on Knowledge Discovery with Inductive Querying, Dubrovnik, 2003.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Best-effort inductive logic programming via fine-grained cost-based hypothesis generation

Article 22 May 2018

Semantic Acyclicity for Conjunctive Queries: Approximations and Constraints

Identification

References

Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. VLDB (1994)
Google Scholar
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc. SIGMOD, pp. 207–216 (1993)
Google Scholar
Bayardo, R.: Efficiently mining long patterns from databases. In: Proc. SIGMOD (1998)
Google Scholar
Bucila, C., Gehrke, J., Kifer, D., White, W.: DualMiner: A dual pruning algorithm for itemsets with constraints. In: Proc. of SIGKDD (2002)
Google Scholar
De Raedt, L., Kramer, S.: The levelwise version space algorithm and its application to molecular fragment finding. In: Proc. IJCAI (2001)
Google Scholar
De Raedt, L., Jäger, M., Lee, S.D., Mannila, H.: A Theory of Inductive Query Answering. In: Proceedings IEEE ICDM (2002)
Google Scholar
De Raedt, L.: A perspective on inductive databases. SIGKDD Explorations 4 (2) (2002)
Google Scholar
Fischer, J.: Version Spaces im Contraint-Based Data Mining. Diplomarbeit. Albert-Ludwigs-University Freiburg (2003)
Google Scholar
Goethals, B., Van den Bussche, J.: On supporting interactive association rule mining. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds.) DaWaK 2000. LNCS, vol. 1874, p. 307. Springer, Heidelberg (2000)
Chapter Google Scholar
Gunopulos, D., Mannila, H., Saluja, S.: Discovering All Most Specific Sentences by Randomized Algorithms. In: Afrati, F.N., Kolaitis, P.G. (eds.) ICDT 1997. LNCS, vol. 1186, Springer, Heidelberg (1996)
Google Scholar
Gunopulos, D., Khardon, R., Mannila, H., Toivonen, H.: Data mining, Hypergraph Transversals, and Machine Learning. In: Proceedings PODS (1997)
Google Scholar
Han, J., Fu, Y., Koperski, K., Wang, W., Zaiane, O.: DMQL: A Data Mining Query Language for Relational Databases. In: Proc. SIGMOD 1996 Workshop on Research Issues on Data Mining and Knowledge Discovery (1996)
Google Scholar
Han, J., Lakshmanan, L.V.S., Ng, R.T.: Constraint-Based, Multidimensional Data Mining. Computer 32(8), 46–50 (1999)
Article Google Scholar
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proc. SIGMOD (2000)
Google Scholar
Hirsh, H.: Generalizing Version Spaces. Machine Learning 17(1), 5–46 (1994)
MATH Google Scholar
Hirsh, H.: Theoretical underpinnings of versionspaces. In: Proc. IJCAI (1991)
Google Scholar
Inokuchi, A., Washio, T., Motoda, H.: Complete Mining of Frequent Patterns from Graphs: Mining Graph Data. Machine Learning 50(3), 321–354 (2003)
Article MATH Google Scholar
Kramer, S., De Raedt, L., Helma, C.: Molecular Feature Mining in HIV Data. In: Proc. SIGKDD (2001)
Google Scholar
Mannila, H., Toivonen, H.: Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery 1 (1997)
Google Scholar
Mitchell, T.: Generalization as Search. Artificial Intelligence 18(2), 203–226 (1980)
Article Google Scholar
Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
MATH Google Scholar
Ng, R.T., Lakshmanan, L.V.S., Han, J., Pang, A.: Exploratory mining and pruning optimizations of constrained associations rules. In: Proc. SIGMOD (1998)
Google Scholar
NIC Genetic Sequence Database, Available at http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Article MathSciNet MATH Google Scholar
Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)
Google Scholar

Download references

Author information

Authors and Affiliations

Institut für Informatik, Albert-Ludwigs-University Freiburg, Georges Koehler Allee 79, D-79110, Freiburg, Germany
Johannes Fischer & Luc De Raedt

Authors

Johannes Fischer
View author publications
You can also search for this author in PubMed Google Scholar
Luc De Raedt
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

School of Engineering and Information Technology, Deakin University, VIC 3125, Australia
Honghua Dai
University of Illinois at Urbana-Champaign, 61801, Urbana, IL, USA
Ramakrishnan Srikant
Faculty of Engineering and Information Technology, Centre for Quantum Computation and Intelligent Systems, and Australian ACS National Committee for Artificial Intelligence, University of Technology, Sydney, Australia
Chengqi Zhang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Fischer, J., De Raedt, L. (2004). Towards Optimizing Conjunctive Inductive Queries. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science(), vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_74

Download citation

DOI: https://doi.org/10.1007/978-3-540-24775-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics