- Article, September 2024
Statement Graphs: Unifying the Graph Data Model Landscape
Abstract: Graph database users today face a choice between two technology stacks. The Resource Description Framework (RDF), on one side, is a data model that was originally developed by the W3C to exchange interconnected data on the Web. On the other side, ...
- abstract, June 2024
Second Workshop on Simplicity in Management of Data (SiMoD)
SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data, Pages 647–648, https://doi.org/10.1145/3626246.3655012
At a first glance, data systems today are complex, with various components, tuning knobs and delicate design decisions. However, at their core - as researchers and practitioners have been observing at least anecdotally - are simple ideas that work well ...
- research-article, June 2024
Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine
SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data, Pages 5–17, https://doi.org/10.1145/3626246.3653368
Apache Arrow DataFusion is a fast, embeddable, and extensible query engine written in Rust that uses Apache Arrow as its memory model. In this paper we describe the technologies on which it is built, and how it fits in long-term database implementation ...
- research-article, March 2024, Just Accepted
Insights into Natural Language Database Query Errors: From Attention Misalignment to User Handling Strategies
ACM Transactions on Interactive Intelligent Systems (TIIS), Just Accepted, https://doi.org/10.1145/3650114
Querying structured databases with natural language (NL2SQL) has remained a difficult problem for years. Recently, the advancement of machine learning (ML), natural language processing (NLP), and large language models (LLM) have led to significant ...
- abstract, June 2023
Workshop on Simplicity in Management of Data (SiMoD)
SIGMOD '23: Companion of the 2023 International Conference on Management of Data, Pages 301–302, https://doi.org/10.1145/3555041.3590817
At a first glance, database systems today are complex, with various components, tuning knobs and delicate design decisions, leading to hundreds of thousands lines of code (if not more). However, at their core - as researchers and practitioners have been ...
- research-article, June 2023
NF-Log: Revisiting Log Writes in Relational Database for Efficient Persistent Memory Utilization
SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, Pages 305–312, https://doi.org/10.1145/3555776.3577733
Non-volatile memory (NVM) is a promising storage technology that combines not only high performance and byte-addressability (like DRAM) but also durability (like SSD). However, as existing relational database management systems (RDBMS) are originally ...
- research-article, March 2023
An Empirical Study of Model Errors and User Error Discovery and Repair Strategies in Natural Language Database Queries
IUI '23: Proceedings of the 28th International Conference on Intelligent User Interfaces, Pages 633–649, https://doi.org/10.1145/3581641.3584067
Recent advances in machine learning (ML) and natural language processing (NLP) have led to significant improvement in natural language interfaces for structured databases (NL2SQL). Despite the great strides, the overall accuracy of NL2SQL models is ...
- survey, December 2022
Energy-Efficient Database Systems: A Systematic Survey
ACM Computing Surveys (CSUR), Volume 55, Issue 6, Article No.: 111, Pages 1–53, https://doi.org/10.1145/3538225
Constructing energy-efficient database systems to reduce economic costs and environmental impact has been studied for 10 years. With the emergence of the big data age, along with the data-centric and data-intensive computing trend, the great amount of ...
- research-article, July 2022
DBSnap-Eval: Identifying Database Query Construction Patterns
ITiCSE '22: Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education Vol. 1, Pages 131–137, https://doi.org/10.1145/3502718.3524822
Learning to construct database queries can be a challenging task because students need to learn the specific query language syntax as well as properly understand the effect of each query operator and how multiple operators interact in a query. While ...
- research-article, June 2022
Efficient Evaluation of Arbitrarily-Framed Holistic SQL Aggregates and Window Functions
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 1243–1256, https://doi.org/10.1145/3514221.3526184
Window functions became part of the SQL standard in SQL:2003 and are widely used for data analytics: Percentiles, rankings, moving averages, running sums and local maxima are all expressed as window functions in SQL. Yet, the features offered by SQL's ...
- research-article, June 2022
CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 1655–1669, https://doi.org/10.1145/3514221.3526130
In modern data management systems, directly performing operations on compressed data has been proven to be a big success facing big data problems. These systems have demonstrated significant compression benefits and performance improvement for data ...
- abstract, June 2022
Lineage Resource Manager
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 2530–2532, https://doi.org/10.1145/3514221.3520252
Main memory columnar database systems such as HyPer [1], Hana [4], MemSQL [9] have rapidly grown in use for both OLTP and OLAP applications in enterprise software [5]. Lineage capture in such in-memory columnar database systems incurs memory and ...
- research-article, June 2022
Tastes Great! Less Filling! High Performance and Accurate Training Data Collection for Self-Driving Database Management Systems
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, Pages 617–630, https://doi.org/10.1145/3514221.3517845
A self-driving database management system (DBMS) aims to configure, deploy, and optimize almost all aspects of itself automatically without human intervention or guidance. Achieving this high level of automation relies on machine learning (ML) models ...
- research-article, October 2021
Basil: Breaking up BFT with ACID (transactions)
SOSP '21: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, Pages 1–17, https://doi.org/10.1145/3477132.3483552
This paper presents Basil, the first transactional, leaderless Byzantine Fault Tolerant key-value store. Basil leverages ACID transactions to scalably implement the abstraction of a trusted shared log in the presence of Byzantine actors. Unlike ...
- invited-talk, October 2021
A PACTful Agenda for Cloud Programming Research: (Invited Talk)
DBPL '21: The 18th International Symposium on Database Programming Languages, Page 1, https://doi.org/10.1145/3475726.3476450
We have witnessed two decades of cloud computing research. Yet, programming the cloud remains a tedious task for both the application and cloud infrastructure developers: application developers need to consider various cloud deployment aspects as they ...
- research-article, June 2021
Building Advanced SQL Analytics From Low-Level Plan Operators
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, Pages 1001–1013, https://doi.org/10.1145/3448016.3457288
Analytical queries virtually always involve aggregation and statistics. SQL offers a wide range of functionalities to summarize data such as associative aggregates, distinct aggregates, ordered-set aggregates, grouping sets, and window functions. In ...
- research-article, June 2021
Self-Tuning Query Scheduling for Analytical Workloads
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, Pages 1879–1891, https://doi.org/10.1145/3448016.3457260
Most database systems delegate scheduling decisions to the operating system. While such an approach simplifies the overall database design, it also entails problems. Adaptive resource allocation becomes hard in the face of concurrent queries. ...
- research-article, June 2021
Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, Pages 2195–2207, https://doi.org/10.1145/3448016.3452819
The design of the buffer manager in database management systems (DBMSs) is influenced by the performance characteristics of volatile memory (i.e., DRAM) and non-volatile storage (e.g., SSD). The key design assumptions have been that the data must be ...
- abstract, June 2021
Model-Parallel Model Selection for Deep Learning Systems
SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data, Pages 2929–2931, https://doi.org/10.1145/3448016.3450571
As deep learning becomes more expensive, both in terms of time and compute, inefficiencies in machine learning training prevent practical usage of state-of-the-art models for most users. The newest model architectures are simply too large to be fit onto ...
- research-article, April 2021
DIY: Assessing the Correctness of Natural Language to SQL Systems
IUI '21: Proceedings of the 26th International Conference on Intelligent User Interfaces, Pages 597–607, https://doi.org/10.1145/3397481.3450667
Designing natural language interfaces for querying databases remains an important goal pursued by researchers in natural language processing, databases, and HCI. These systems receive natural language as input, translate it into a formal database query, ...