Modern natural language interfaces to databases: composing statistical parsing with semantic tractability

Authors:

Ana-Maria Popescu,

Alexander Yates

COLING '04: Proceedings of the 20th international conference on Computational Linguistics

Pages 141 - es

https://doi.org/10.3115/1220355.1220376

Published: 23 August 2004

Abstract

Natural Language Interfaces to Databases (NLIs) can benefit from the advances in statistical parsing over the last fifteen years or so. However, statistical parsers require training on a massive, labeled corpus, and manually creating such a corpus for each database is prohibitively expensive. To address this quandary, this paper reports on the PRECISE NLI, which uses a statistical parser as a "plug in". The paper shows how a strong semantic model coupled with "light re-training" enables PRECISE to overcome parser errors, and correctly map from parsed questions to the corresponding SQL queries. We discuss the issues in using statistical parsers to build database-independent NLIs, and report on experimental results with the benchmark ATIS data set where PRECISE achieves 94% accuracy.

Published In

COLING '04: Proceedings of the 20th international conference on Computational Linguistics, August 2004, 1411 pages

Publisher

Association for Computational Linguistics, United States
