DOI: 10.1145/3340531.3417426
short-paper

Semantic Search over Structured Data

Published: 19 October 2020

Abstract

Searching through large structured data is a crucial task in enterprise as well as online search. Existing keyword-based search methods were mostly designed for web document search, and support a limited variety of query types suitable for structured data. Their main underlying drawback is the lack of semantic understanding of the data and users' natural language queries, resulting in a fundamental disconnect. In this paper, we bridge that gap through effective semantic type discovery and indexing of structured data. We demonstrate S3D (Semantic Search over Structured Data), a novel system that supports queries such as finding related tables, rows or columns, amongst others, on a large scale of structured data from multiple heterogeneous sources.
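As a rough illustration of how semantic type discovery and indexing can support a "find related tables" query, the sketch below tags each column with a semantic type and indexes columns by type. It is a toy under stated assumptions, not the S3D implementation: the lexicon `TYPE_LEXICON`, the function names, and the sample tables are all hypothetical, and a real system would discover types from the data rather than from a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical toy lexicon mapping cell values to semantic types (an assumption;
# the abstract does not say how S3D discovers types).
TYPE_LEXICON = {
    "paris": "city", "london": "city", "tokyo": "city",
    "france": "country", "japan": "country", "uk": "country",
}

def infer_column_type(values):
    """Return the most frequent semantic type among a column's values, or None."""
    counts = defaultdict(int)
    for v in values:
        t = TYPE_LEXICON.get(str(v).strip().lower())
        if t is not None:
            counts[t] += 1
    if not counts:
        return None
    return max(counts, key=counts.get)

def build_index(tables):
    """Map each semantic type to the set of (table, column) pairs carrying it."""
    index = defaultdict(set)
    for table_name, columns in tables.items():
        for column_name, values in columns.items():
            t = infer_column_type(values)
            if t is not None:
                index[t].add((table_name, column_name))
    return index

def related_tables(index, query_table):
    """Rank other tables by how many semantic types they share with query_table."""
    scores = defaultdict(int)
    for cols in index.values():
        tables_with_type = {name for name, _ in cols}
        if query_table in tables_with_type:
            for other in tables_with_type - {query_table}:
                scores[other] += 1
    # Highest overlap first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Made-up sample data from three "sources".
tables = {
    "offices": {"location": ["Paris", "London"], "headcount": [120, 80]},
    "capitals": {"capital": ["Tokyo", "Paris"], "nation": ["Japan", "France"]},
    "gdp": {"country": ["UK", "France"], "gdp_usd_bn": [3100, 2900]},
}
index = build_index(tables)
print(related_tables(index, "capitals"))
```

Matching on types rather than on column names or shared keywords is what lets "offices.location" and "capitals.capital" be linked here even though their headers differ, which is the kind of disconnect the abstract attributes to keyword-based search.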

Supplementary Material

MP4 File (3340531.3417426.mp4)
Video for demo titled "Semantic Search over Structured Data".
Authors: Sainyam Galhotra and Udayan Khurana

Information

Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN: 9781450368599
DOI: 10.1145/3340531

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data lake
  2. dataset search
  3. semantic search

Qualifiers

  • Short-paper

Conference

CIKM '20

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
