With standardization efforts for an XML query language drawing to a close, researchers and users increasingly focus their attention on the database technology that has to deliver on the new challenges that the sheer number of XML documents produced by applications poses to data management: validation, performance evaluation and optimization of XML query processors are the upcoming issues. Following a long tradition in database research, the XML Store Benchmark Project provides a framework to assess an XML database's ability to cope with a broad spectrum of different queries, typically posed in real-world application scenarios. The benchmark is intended to help both implementors and users to compare XML databases independently of their own specific application scenario. To this end, the benchmark offers a set of queries, each of which is intended to challenge a particular primitive of the query processor or storage engine. The overall workload we propose consists of a scalable document database and a concise yet comprehensive set of queries, which covers the major aspects of query processing. The queries' challenges range from stressing the textual character of the document to data analysis queries, but also include typical ad-hoc queries. We complement our research with results obtained from running the benchmark on our XML database platform. They are intended to give a first baseline, illustrating the state of the art.