DOI: 10.1145/3437963.3441694
demonstration

AnaSearch: Extract, Retrieve and Visualize Structured Results from Unstructured Text for Analytical Queries

Published: 08 March 2021

Abstract

Modern search engines retrieve results mainly through keyword matching, and thus fail to answer analytical queries such as "apps with more than 1 billion monthly active users" or "population growth of the US from 2015 to 2019", which require numerical reasoning or aggregating results from multiple web pages. Such analytical queries are very common in data analysis, where the expected results are structured tables or charts. In most cases, these structured results are not directly available or accessible; they are scattered across various text sources. In this work, we build AnaSearch, a search system that supports analytical queries and returns structured results that can be visualized as tables or charts. We automatically collect and build structured quantitative data from unstructured text on the web. With AnaSearch, data analysts can easily derive insights for decision making using keyword or natural language queries. Specifically, we build AnaSearch on COVID-19 news data, which makes it easy to compare its results with manually collected structured data.
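
The idea of turning text into queryable quantitative records can be illustrated with a minimal sketch. This is not AnaSearch's actual pipeline, which the abstract does not describe in detail. It assumes a single hypothetical sentence template ("X reported N unit in YEAR"), and the sample sentences, the `QuantityFact` record, and the `extract_facts` and `query` helpers are all invented for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class QuantityFact:
    """One structured record recovered from a sentence (hypothetical schema)."""
    entity: str
    value: float
    unit: str
    year: int


# Assumed sentence template: "<Entity> reported <number> <unit> in <year>".
# Real news text would need far more robust extraction than one regex.
_PATTERN = re.compile(
    r"(?P<entity>[A-Z]\w*) reported (?P<value>[\d,]+) (?P<unit>\w+) in (?P<year>\d{4})"
)


def extract_facts(text):
    """Scan free text and return every quantity fact matching the template."""
    facts = []
    for m in _PATTERN.finditer(text):
        facts.append(
            QuantityFact(
                entity=m.group("entity"),
                value=float(m.group("value").replace(",", "")),
                unit=m.group("unit"),
                year=int(m.group("year")),
            )
        )
    return facts


def query(facts, unit, min_value):
    """Answer 'entities with more than min_value <unit>', largest first."""
    rows = [f for f in facts if f.unit == unit and f.value > min_value]
    return sorted(rows, key=lambda f: f.value, reverse=True)


# Invented sample text with placeholder entity names, not real statistics.
NEWS = (
    "Alpha reported 1,200 cases in 2020. "
    "Beta reported 300 cases in 2020. "
    "Gamma reported 4,500 cases in 2020."
)

FACTS = extract_facts(NEWS)
TOP = query(FACTS, unit="cases", min_value=1000)

for fact in TOP:
    print(f"{fact.entity}\t{fact.value:.0f} {fact.unit}\t{fact.year}")
```

The sorted rows returned by `query` are already in a tabular shape, so they could be passed to a table or chart renderer, which is the kind of structured output the abstract describes.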



      Published In

      WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
      March 2021
      1192 pages
      ISBN:9781450382977
      DOI:10.1145/3437963
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 08 March 2021

      Author Tags

      1. data visualization
      2. information retrieval
      3. quantitative information
      4. structured data

      Qualifiers

      • Demonstration

      Conference

      WSDM '21

      Acceptance Rates

      Overall Acceptance Rate 498 of 2,863 submissions, 17%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 33
      • Downloads (last 6 weeks): 2
      Reflects downloads up to 25 Nov 2024

      Citations

      Cited By

      • (2024) Human-centered NLP Fact-checking: Co-Designing with Fact-checkers using Matchmaking for AI. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2 (1-44). DOI: 10.1145/3686962. Online publication date: 8-Nov-2024
      • (2024) QuantPlorer: Exploration of Quantities in Text. Advances in Information Retrieval (171-176). DOI: 10.1007/978-3-031-56069-9_13. Online publication date: 23-Mar-2024
      • (2023) SciHarvester: Searching Scientific Documents for Numerical Values. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (3135-3139). DOI: 10.1145/3539618.3591808. Online publication date: 19-Jul-2023
      • (2023) The state of human-centered NLP technology for fact-checking. Information Processing & Management 60:2 (103219). DOI: 10.1016/j.ipm.2022.103219. Online publication date: Mar-2023
      • (2023) QURG: Question Rewriting Guided Context-Dependent Text-to-SQL Semantic Parsing. PRICAI 2023: Trends in Artificial Intelligence (275-286). DOI: 10.1007/978-981-99-7022-3_24. Online publication date: 10-Nov-2023
      • (2022) QFinder. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (3272-3277). DOI: 10.1145/3477495.3531672. Online publication date: 6-Jul-2022
