Abstract
This work considers the problem of optimal query processing in heterogeneous and distributed database systems. A global query submitted at a local site is decomposed into a number of queries that are processed at remote sites. The partial results returned by these queries are integrated at the local site. The paper addresses the problem of optimally scheduling the queries so as to minimize the time spent integrating the partial results into the final answer. A global data model defined in this work provides a unified view of the heterogeneous data structures located at the remote sites, and a system of operations is defined to express complex data integration procedures. This work shows that transforming an entirely simultaneous query processing strategy into a hybrid (simultaneous/sequential) strategy may in some cases lead to significantly faster data integration. We show how to detect such cases, what conditions must be satisfied to transform the schedules, and how to transform the schedules into more efficient ones.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Getta, J.R., Handoko (2015). On Transformation of Query Scheduling Strategies in Distributed and Heterogeneous Database Systems. In: Nguyen, N., Trawiński, B., Kosala, R. (eds) Intelligent Information and Database Systems. ACIIDS 2015. Lecture Notes in Computer Science, vol 9011. Springer, Cham. https://doi.org/10.1007/978-3-319-15702-3_14
DOI: https://doi.org/10.1007/978-3-319-15702-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15701-6
Online ISBN: 978-3-319-15702-3
eBook Packages: Computer Science (R0)