DOI: 10.1007/978-3-642-04930-9_40
Article

Scalable Distributed Reasoning Using MapReduce

Published: 06 November 2009

Abstract

We address the problem of scalable distributed reasoning and propose a technique, based on MapReduce, for materialising the closure of an RDF graph. We have implemented our approach on top of Hadoop and deployed it on a compute cluster of up to 64 commodity machines. We show that a naive implementation on top of MapReduce is straightforward but performs badly, and we present several non-trivial optimisations. Our algorithm is scalable and allows us to compute the RDFS closure of 865M triples from the Web (producing 30B triples) in less than two hours, faster than any other published approach.
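To make the idea of materialising a closure with MapReduce concrete, the following is a minimal single-process sketch of the naive style of implementation the abstract mentions. It is not the authors' optimised algorithm and does not use Hadoop: it simulates one map/shuffle/reduce job in plain Python and applies only one RDFS rule (if `C rdfs:subClassOf D` and `x rdf:type C`, then `x rdf:type D`), repeating the job until no new triples appear. The function names (`map_rdfs9`, `reduce_rdfs9`, `run_job`, `closure`) and the `ex:` example data are assumptions made purely for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def map_rdfs9(triple):
    """Map step: key each relevant triple by the class it should join on."""
    s, p, o = triple
    if p == SUBCLASS:
        yield s, ("super", o)      # key is the subclass, value its superclass
    elif p == RDF_TYPE:
        yield o, ("instance", s)   # key is the class, value its instance


def reduce_rdfs9(key, values):
    """Reduce step: join instances of a class with that class's superclasses."""
    supers = {v for tag, v in values if tag == "super"}
    instances = {v for tag, v in values if tag == "instance"}
    for inst in instances:
        for sup in supers:
            yield (inst, RDF_TYPE, sup)


def run_job(triples):
    """Simulate one MapReduce job: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for triple in triples:
        for key, value in map_rdfs9(triple):
            groups[key].append(value)
    derived = set()
    for key, values in groups.items():
        derived.update(reduce_rdfs9(key, values))
    return derived


def closure(triples):
    """Re-run the job until a fixpoint is reached (no new triples derived)."""
    closed = set(triples)
    while True:
        new = run_job(closed) - closed
        if not new:
            return closed
        closed |= new


if __name__ == "__main__":
    data = {
        ("ex:A", SUBCLASS, "ex:B"),
        ("ex:B", SUBCLASS, "ex:C"),
        ("ex:x", RDF_TYPE, "ex:A"),
    }
    for t in sorted(closure(data)):
        print(t)
```

Even in this toy form, the fixpoint loop re-reads every triple on each pass and can re-derive triples it already has, which hints at why a naive approach might perform badly at the scale of hundreds of millions of triples.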

Published In

ISWC '09: Proceedings of the 8th International Semantic Web Conference
November 2009
1004 pages
ISBN: 9783642049293
Editors: Abraham Bernstein, David R. Karger, Tom Heath, Lee Feigenbaum, Diana Maynard, Enrico Motta, Krishnaprasad Thirunarayan

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article

Cited By

  • (2020) Scalable SPARQL querying of large RDF graphs. Proceedings of the VLDB Endowment 4:11 (1123-1134). DOI: 10.14778/3402707.3402747. Online publication date: 3-Jun-2020
  • (2019) Enhancing the scalability of expressive stream reasoning via input-driven parallelization. Semantic Web 10:3 (457-474). DOI: 10.3233/SW-180330. Online publication date: 1-Jan-2019
  • (2019) Streaming saturation for large RDF graphs with dynamic schema information. Proceedings of the 17th ACM SIGPLAN International Symposium on Database Programming Languages (42-52). DOI: 10.1145/3315507.3330201. Online publication date: 23-Jun-2019
  • (2019) Semantic query transformations for increased parallelization in distributed knowledge graph query processing. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-14). DOI: 10.1145/3295500.3356212. Online publication date: 17-Nov-2019
  • (2019) Data Mining Algorithms Parallelization in Logic Programming Framework for Execution in Cluster. Internet of Things, Smart Spaces, and Next Generation Networks and Systems (91-103). DOI: 10.1007/978-3-030-30859-9_8. Online publication date: 26-Aug-2019
  • (2019) BTC-2019: The 2019 Billion Triple Challenge Dataset. The Semantic Web – ISWC 2019 (163-180). DOI: 10.1007/978-3-030-30796-7_11. Online publication date: 26-Oct-2019
  • (2018) An Incremental Reasoning Algorithm for Large Scale Knowledge Graph. Knowledge Science, Engineering and Management (503-513). DOI: 10.1007/978-3-319-99365-2_45. Online publication date: 17-Aug-2018
  • (2017) Tree Structure for Expressive MapReduce Framework. Proceedings of the 2017 International Conference on Software and e-Business (33-37). DOI: 10.1145/3178212.3178236. Online publication date: 28-Dec-2017
  • (2017) Distributed Semantic Analytics Using the SANSA Stack. The Semantic Web – ISWC 2017 (147-155). DOI: 10.1007/978-3-319-68204-4_15. Online publication date: 21-Oct-2017
  • (2015) SemQuaRE - An extension of the SQuaRE quality model for the evaluation of semantic technologies. Computer Standards & Interfaces 38:C (101-112). DOI: 10.1016/j.csi.2014.09.001. Online publication date: 1-Feb-2015
