Logical scalability and efficiency of relational learning algorithms

Jose Picado ORCID: orcid.org/0000-0001-8265-1509¹,
Arash Termehchy¹,
Alan Fern¹ &
…
Parisa Ataei¹

439 Accesses
Explore all metrics

Abstract

Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. The same database may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. We introduce the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de) composition schema transformations. We show theoretically and empirically that current relational learning algorithms are generally not schema independent. We propose Castor, a relational learning algorithm that achieves schema independence.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases: The Logical Level. Addison-Wesley, Boston (1994)
Google Scholar
Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J., Silberschatz, A.: Learning and verifying quantified boolean queries by example. In: PODS, pp. 49–60 (2013)
Angluin, D., Frazier, M., Pitt, L.: Learning conjunctions of Horn clauses. Mach. Learn. 9(2–3), 147–164 (1992)
MATH Google Scholar
Arias, M., Khardon, R., Maloberti, J.: Learning Horn expressions with LOGAN-H. J. Mach. Learn. Res. 8, 549–587 (2007)
MATH Google Scholar
Atzeni, P., Ausiello, G., Batini, C., Moscarini, M.: Inclusion and Equivalent Between Relational Database Schemata. TCS, Mumbai (1982)
MATH Google Scholar
Cate, B.T., Dalmau, V., Kolaitis, P.G.: Learning schema mappings. TODS 38(4), 28:1–28:31 (2013)
Article MathSciNet MATH Google Scholar
Chen, Y., Goldberg, S., Wang, D.Z., Johri, S.S.: Ontological pathfinding: mining first-order knowledge from large knowledge bases. In: SIGMOD, pp. 835–846 (2016)
Costa, V.S., Srinivasan, A., Camacho, R., Blockeel, H., Demoen, B., Janssens, G., Struyf, J., Vandecasteele, H., Laer, W.V.: Query transformations for improving the efficiency of ILP systems. J. Mach. Learn. Res. 4, 465–491 (2003)
MathSciNet MATH Google Scholar
Davis, J., Burnside, E.S., de Castro Dutra, I., Page, D., Ramakrishnan, R., Costa, V.S., Shavlik, J.W.: View learning for statistical relational learning: with an application to mammography. In: IJCAI, pp. 677–683 (2005)
De Raedt, L.: Logical and Relational Learning, 1st edn. Springer, Berlin (2010)
MATH Google Scholar
Fagin, R.: Inverting schema mappings. TODS 32(4), 25 (2007)
Article MATH Google Scholar
Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds.) ICDT, pp. 207–224. Springer, Berlin, Heidelberg (2003)
Google Scholar
Fan, W., Bohannon, P.: Information preserving xml schema embedding. TODS 33(1), 1–44 (2008)
Article Google Scholar
Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.: Fast rule mining in ontological knowledge bases with AMIE+. VLDB J. 24(6), 707–730 (2015)
Article Google Scholar
Getoor, L., Machanavajjhala, A.: Entity resolution in big data. In: KDD, p. 1527 (2013)
Getoor, L., Taskar, B.: Introduction to Statistical Relational Learning. MIT Press, Cambridge (2007)
Book MATH Google Scholar
Hull, R.: Relative information capacity of simple relational database schemata. In: PODS, pp. 97–109 (1984)
Khardon, R.: Learning function-free Horn expressions. Mach. Learn. 37(3), 241–275 (1999)
Article MATH Google Scholar
Kuželka, O., Železný, F.: A restarted strategy for efficient subsumption testing. Fundam. Inf. 89(1), 95–109 (2009)
MATH Google Scholar
Li, H., Chan, C.-Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8(13), 2158–2163 (2015)
Google Scholar
Muggleton, S.: Inverse Entailment and Progol. New Gener. Comput. Special Issue Inductive Log. Program. 13, 245–286 (1995)
Article Google Scholar
Muggleton, S., Feng, C.: Efficient induction of logic programs. In: ALT. Springer/Ohmsha, pp. 368–381 (1990)
Muggleton, S., Santos, J.C.A., Tamaddoni-Nezhad, A.: Progolem: a system based on relative minimal generalisation. In: De Raedt, L. (ed.) ILP, vol. 5989. Springer, Berlin, Heidelberg (2009)
Google Scholar
Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. arXiv:1508.03846 (2015)
Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. In: SIGMOD, pp. 929–944 (2017)
Quinlan, J.R.: Learning logical definitions from relations. Mach. Learn. 5, 239–266 (1990)
Google Scholar
Reddy, C., Tadepalli, P.: Learning Horn definitions: theory and an application to planning. New Gener. Comput. 17, 77–98 (1998)
Article Google Scholar
Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)
Article Google Scholar
Srinivasan, A.: The Aleph manual (2007). http://www.cs.ox.ac.uk/activities/machinelearning/Aleph/aleph. Accessed 15 Oct 2018
Termehchy, A., Winslett, M., Chodpathumwan, Y.: How schema independent are schema free query interfaces? In: ICDE, pp. 645–660 (2011)
Zeng, Q., Patel, J.M., Page, D.: QuickFOIL: scalable inductive logic programming. PVLDB 8(3), 197–208 (2014)
Google Scholar

Download references

Author information

Authors and Affiliations

School of EECS, Oregon State University, Corvallis, OR, USA
Jose Picado, Arash Termehchy, Alan Fern & Parisa Ataei

Authors

Jose Picado
View author publications
You can also search for this author in PubMed Google Scholar
Arash Termehchy
View author publications
You can also search for this author in PubMed Google Scholar
Alan Fern
View author publications
You can also search for this author in PubMed Google Scholar
Parisa Ataei
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jose Picado.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Picado, J., Termehchy, A., Fern, A. et al. Logical scalability and efficiency of relational learning algorithms. The VLDB Journal 28, 147–171 (2019). https://doi.org/10.1007/s00778-018-0523-8

Download citation

Received: 21 November 2017
Revised: 30 July 2018
Accepted: 20 September 2018
Published: 03 November 2018
Issue Date: 11 April 2019
DOI: https://doi.org/10.1007/s00778-018-0523-8

Logical scalability and efficiency of relational learning algorithms

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Learning Models over Relational Data: A Brief Tutorial

FACTORBASE: multi-relational structure learning with SQL all the way

Toward New Evaluation Metrics for Relational Learning

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Logical scalability and efficiency of relational learning algorithms

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Learning Models over Relational Data: A Brief Tutorial

FACTORBASE: multi-relational structure learning with SQL all the way

Toward New Evaluation Metrics for Relational Learning

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation