Abstract
Peer-to-peer (P2P) systems are gaining increasing popularity as a scalable means to share data among a large number of autonomous nodes. In this paper, we consider the case in which the nodes in a P2P system store XML documents. We propose a fully decentralized approach to the problem of routing path queries among the nodes of a P2P system based on maintaining specialized data structures, called filters that efficiently summarize the content, i.e., the documents, of one or more node. Our proposed filters, called multi-level Bloom filters, are based on extending Bloom filters so that they maintain information about the structure of the documents. In addition, we advocate building a hierarchical organization of nodes by clustering together nodes with similar content. Similarity between nodes is related to the similarity between the corresponding filters. We also present an efficient method for update propagation. Our experimental results show that multi-level Bloom filters outperform the classical Bloom filters in routing path queries. Furthermore, the content-based hierarchical grouping of nodes increases recall, that is, the number of documents that are retrieved.
Work supported in part by the IST programme of the European Commission FET under the IST-2001-32645 DBGlobe project, IST-2001-32645
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Milojicic, D.S., Kalogeraki, V., Lukose, R., Nagaraja, K., Pruyne, J., Richard, B., Rollins, S., Xu, Z.: Peer-to-Peer Computing, HP Laboratories Palo Alto, HPL-2002-57
Gribble, S.D., Brewer, E.A., Hellerstein, J.M., Culler, D.: Scalable Distributed Data Structures for Internet Service Construction. In: OSDI 2000 (2000)
Bloom, B.: Space/Time Trade-offs in Hash Coding with Allowable Errors. CACM 13(7) (July 1970)
Fan, L., Cao, P., Almeida, J., Broder, A.: Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. In: SIGCOMM 1998 (1998)
Hodes, T.D., Czerwinski, S.E., Zhao, B.Y., Joseph, A.D., Katz, R.H.: Architecture for Secure Wide-Area Service Discovery. In: Mobicom 1999 (1999)
The MD5 Message-Digest Algorithm. RFC1321
The Niagara generator, http://www.cs.wisc.edu/niagara
Napster, http://www.napster.com/
Goldman, R., Widom, J.: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. In: VLDB 1997 (1997)
Cooper, B.F., Sample, N., Franklin, M.J., Hjaltason, G.R., Shadmon, M.: Fast Index for Semistructured Data. In: VLDB 2001 (2001)
Polyzotis, N., Garofalakis, M.: Structure and Value Synopses for XML Data Graphs. In: VLDB 2002 (2002)
Park, S., Kim, H.J.: A New Query Processing Technique for XML Based on Signature. In: DASFAA 2001 (2001)
Faloutsos, C., Christodoulakis, S.: Signature Files: An Access Method for Documents and Its Analytical Performance Evaluation. ACM TOIS 2(4) (October 1984)
Mohan, A., Kalogeraki, V.: Speculative Routing and Update Propagation: A Kundali Centric Approach. In: ICC 2003 (2003)
Crespo, A., Garcia-Molina, H.: Semantic Overlay Networks for P2P Systems (submitted for publication)
Nejdl, W., Wolpers, M., Siberski, W., Schmitz, C., Schlosser, M., Brunkhorst, I., Loser, A.: Super-Peer-Based Routing and Clustering Strategies for RDF-Based Peer-To-Peer Networks. In: WWW 2003 (2003)
Koloniari, G., Pitoura, E.: Bloom-Based Filters for Hierarchical Data. In: WDAS 2003 (2003)
Koloniari, G., Petrakis, Y., Pitoura, E.: Content-Based Overlay Networks of XML Peers Based on Multi-Level Bloom Filters. In: VLDB International Workshop on Databases, Information Systems and Peer-to-Peer Computing (2003)
Crespo, A., Garcia-Molina, H.: Routing Indices for Peer-to-Peer Systems. In: ICDCS 2002 (2002)
Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., Balakrishnan, H.: Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. In: SIGCOMM 2001 (2001)
Ratnasamy, S., Francis, P., Handley, M., Karp, R., Shenker, S.: A Scalable Content-Addressable Network. In: SIGCOMM 2001 (2001)
Knowbuddy’s Gnutella FAQ, http://www.rixsoft.com/Knowbuddy/gnutellafaq.html
Pitoura, E., Abiteboul, S., Pfoser, D., Samaras, G., Vazirgiannis, M.: DBGlobe: a Service-oriented P2P System for Global Computing. SIGMOD Record 32(3) (2003)
Koloniari, G., Pitoura, E.: Content-Based Routing of Path Queries in Peer-to-Peer Systems (extended version). Computer Science Dept, Univ. of Ioannina, TR-2003-12
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koloniari, G., Pitoura, E. (2004). Content-Based Routing of Path Queries in Peer-to-Peer Systems. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_4
Download citation
DOI: https://doi.org/10.1007/978-3-540-24741-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21200-3
Online ISBN: 978-3-540-24741-8
eBook Packages: Springer Book Archive