research-article

Annotating database schemas to help enterprise search

Published: 01 August 2015

Abstract

In large enterprises, data discovery is a common problem faced by users who need to find relevant information in relational databases. In this scenario, schema annotation is a useful tool to enrich a database schema with descriptive keywords. In this paper, we demonstrate Barcelos, a system that automatically annotates corporate databases. Unlike existing annotation approaches that use Web-oriented knowledge bases, Barcelos mines enterprise spreadsheets to find candidate annotations. Our experimental evaluation shows that Barcelos produces high-quality annotations; the top-5 annotations have an average precision of 87%.


Published In

Proceedings of the VLDB Endowment, Volume 8, Issue 12
Proceedings of the 41st International Conference on Very Large Data Bases, Kohala Coast, Hawaii
August 2015
728 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Cited By

  • (2023) Demystifying Artificial Intelligence for Data Preparation. Companion of the 2023 International Conference on Management of Data, pp. 13-20. doi:10.1145/3555041.3589406. Online publication date: 4-Jun-2023
  • (2019) First International Workshop on Professional Search. ACM SIGIR Forum 52:2, pp. 153-162. doi:10.1145/3308774.3308799. Online publication date: 17-Jan-2019
  • (2019) Auto-EM: End-to-end Fuzzy Entity-Matching using Pre-trained Deep Models and Transfer Learning. The World Wide Web Conference, pp. 2413-2424. doi:10.1145/3308558.3313578. Online publication date: 13-May-2019
  • (2018) Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code. Proceedings of the 2018 International Conference on Management of Data, pp. 35-50. doi:10.1145/3183713.3196888. Online publication date: 27-May-2018
  • (2017) Discovering Enterprise Concepts Using Spreadsheet Tables. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1873-1882. doi:10.1145/3097983.3098102. Online publication date: 13-Aug-2017
  • (2017) Synthesizing Mapping Relationships Using Table Corpus. Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1117-1132. doi:10.1145/3035918.3064010. Online publication date: 9-May-2017
