Intelligent SPARQL Endpoints: Optimizing Execution Performance by Automatic Query Relaxation and Queue Scheduling

Ana I. Torre-Bastida, Esther Villar-Rodriguez, Miren Nekane Bilbao & Javier Del Ser

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10048)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing


Abstract

The Web of Data is widely regarded as one of the major global repositories of interconnected, structured data, and the linked datasets it comprises keep growing sharply. In this context, the SPARQL Protocol and RDF Query Language is commonly used to retrieve and manage stored data by means of SPARQL endpoints, query processing services specifically designed to provide access to these databases. However, owing to the large amount of data handled by such endpoints and their structural complexity, these services usually suffer from severe performance issues, including unacceptable processing times. This work aims to overcome this inefficiency by designing a distributed parallel system architecture that improves the performance of SPARQL endpoints by incorporating two functionalities: (1) a queuing system to avoid bottlenecks during the execution of SPARQL queries; and (2) an intelligent relaxation of the queries submitted to the endpoint whenever the relaxation itself, and the consequently lowered complexity of the query, benefit the overall performance of the system. To this end, the system relies on a two-fold optimization criterion: the minimization of the query running time, as predicted by a supervised learning model; and the maximization of the quality of the query results, as quantified by a similarity measure. These two conflicting optimization criteria are efficiently balanced by two bi-objective heuristic algorithms executed sequentially over groups of SPARQL queries. The approach is validated on a prototype and in several experiments that demonstrate the applicability of the proposed scheme.
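To make the trade-off between the two criteria concrete, the following is a minimal sketch of Pareto dominance over relaxed variants of one query. It is not the authors' algorithm: the `Candidate` type, the candidate descriptions, and all runtime and similarity values are hypothetical, and the predicted runtime is simply assumed to come from some trained regression model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical relaxed variant of a SPARQL query (illustrative only)."""
    query: str
    predicted_runtime: float  # seconds; assumed output of a learned predictor
    similarity: float         # in [0, 1]; 1.0 means identical to the original


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both criteria and strictly better on one."""
    no_worse = (a.predicted_runtime <= b.predicted_runtime
                and a.similarity >= b.similarity)
    better = (a.predicted_runtime < b.predicted_runtime
              or a.similarity > b.similarity)
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values: lower runtime is better, higher similarity is better.
candidates = [
    Candidate("original", 12.0, 1.00),
    Candidate("drop one triple pattern", 4.0, 0.80),
    Candidate("generalise a class", 6.0, 0.70),  # slower and less similar
    Candidate("drop two triple patterns", 1.5, 0.55),
]
front = pareto_front(candidates)
print([c.query for c in front])
```

Every candidate left in `front` represents a different compromise between speed and result quality; a bi-objective heuristic such as the one the abstract mentions would search for this set rather than enumerate it exhaustively.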


Notes

1. In fact, there has recently been controversy over the originality of the naïve Harmony Search (HS) algorithm, given its close resemblance to the more traditional $(\mu + 1)$ Evolution Strategies. That said, the algorithmic descriptions in this paper use the HS terminology without taking a position on this controversy.


Author information

Authors and Affiliations

TECNALIA. OPTIMA Unit, 48160, Derio, Spain
Ana I. Torre-Bastida, Esther Villar-Rodriguez & Javier Del Ser
University of the Basque Country UPV/EHU, 48013, Bilbao, Spain
Miren Nekane Bilbao & Javier Del Ser
Basque Center for Applied Mathematics (BCAM), 48009, Bilbao, Spain
Javier Del Ser


Corresponding author

Correspondence to Ana I. Torre-Bastida.

Editor information

Editors and Affiliations

University Carlos III of Madrid, Leganes, Spain
Jesus Carretero
Carlos III University of Madrid, Leganes, Madrid, Spain
Javier Garcia-Blas
The University of Waikato, Hamilton, New Zealand
Ryan K.L. Ko
IBM Zurich Research Laboratory, Rüschlikon, Switzerland
Peter Mueller
Hiroshima University, Higashi-Hiroshima, Japan
Koji Nakano


About this paper

Cite this paper

Torre-Bastida, A.I., Villar-Rodriguez, E., Bilbao, M.N., Del Ser, J. (2016). Intelligent SPARQL Endpoints: Optimizing Execution Performance by Automatic Query Relaxation and Queue Scheduling. In: Carretero, J., Garcia-Blas, J., Ko, R., Mueller, P., Nakano, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science, vol 10048. Springer, Cham. https://doi.org/10.1007/978-3-319-49583-5_1


DOI: https://doi.org/10.1007/978-3-319-49583-5_1
Published: 25 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49582-8
Online ISBN: 978-3-319-49583-5
eBook Packages: Computer Science, Computer Science (R0)
