Research article
DOI: 10.1145/2645791.2645804

Distributed evaluation of XPath queries over large integrated XML data

Published: 02 October 2014 Publication History

Abstract

XML is a widespread, text-based format for exchanging information on the Web and representing metadata. Because the volume of XML information is increasing rapidly, efficiently querying large repositories of XML data is a significant challenge for system designers and data analysts who need to support operational actions and decision-making. In this paper we propose a technique for integrating large amounts of XML data and use the Map-Reduce framework to query the integrated data efficiently. Each XML document obtained from the sources is transformed to fit into a predefined, virtual XML structure. Although the transformed documents are not physically integrated, the user can pose queries over a single XML structure. To achieve this, we propose a single-step Map-Reduce algorithm that takes advantage of the virtual structure and efficiently computes the answer to a given XPath query in a distributed manner.
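
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of a single-round map/reduce evaluation, not the authors' method. It assumes (hypothetically) that every source document has already been transformed to share one virtual structure, here a `<library>` root with `<book>` children, and it simulates the map and reduce phases in-process with Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset. The names `DOCUMENTS`, `map_phase`, `reduce_phase`, and `run_single_round` are invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical source documents; each is assumed to already conform to the
# same virtual structure (a <library> root with <book> children).
DOCUMENTS = {
    "source_a": "<library><book><title>Web Data Management</title>"
                "<year>2011</year></book></library>",
    "source_b": "<library><book><title>Database Systems</title>"
                "<year>2008</year></book>"
                "<book><title>Old Notes</title></book></library>",
}


def map_phase(doc_id, xml_text, query):
    """Evaluate the query locally on one document; emit (query, (doc_id, text))."""
    root = ET.fromstring(xml_text)
    for node in root.findall(query):
        yield query, (doc_id, node.text)


def reduce_phase(query, values):
    """Merge the partial answers from all documents into one sorted list."""
    return query, sorted(values)


def run_single_round(documents, query):
    """Simulate one map/reduce round: map per document, group by key, reduce."""
    groups = defaultdict(list)
    for doc_id, xml_text in documents.items():
        for key, value in map_phase(doc_id, xml_text, query):
            groups[key].append(value)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Titles of books that have a <year> child, across all sources.
result = run_single_round(DOCUMENTS, "./book[year]/title")
print(result)
```

Because each document is queried independently in the map step, the documents never need to be merged physically; only the per-document answers are combined in the reduce step. A real deployment would run these phases on a cluster framework rather than in a single process.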

Cited By

  • (2019) Efficient Storage and Parallel Query of Massive XML Data in Hadoop. Emerging Technologies and Applications in Data Processing and Management, pages 242-262. DOI: 10.4018/978-1-5225-8446-9.ch012. Online publication date: 2019.
  • (2017) Efficient Processing of Distributed Twig Queries Based on Node Distribution. Journal of Computer Science and Technology, 32(1):78-92. DOI: 10.1007/s11390-017-1707-1. Online publication date: 11-Jan-2017.

Published In

PCI '14: Proceedings of the 18th Panhellenic Conference on Informatics
October 2014
355 pages
ISBN: 9781450328975
DOI: 10.1145/2645791
  • General Chairs: Katsikas Sokratis, Hatzopoulos Michael, Apostolopoulos Theodoros, Anagnostopoulos Dimosthenis
  • Program Chairs: Carayiannis Elias, Varvarigou Theodora, Nikolaidou Mara
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Greek Com Soc: Greek Computer Society
  • Univ. of Piraeus: University of Piraeus
  • National and Kapodistrian University of Athens: National and Kapodistrian University of Athens
  • Athens U of Econ & Business: Athens University of Economics and Business

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big data
  2. MapReduce
  3. XML integration
  4. XPath

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PCI '14

Acceptance Rates

PCI '14 Paper Acceptance Rate 51 of 102 submissions, 50%;
Overall Acceptance Rate 190 of 390 submissions, 49%

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 14 Dec 2024
