Integrity constraints are semantic conditions that a database should satisfy in order to be an appropriate model of external reality. In practice, and for many reasons, a database may not satisfy those integrity constraints, and for that reason it is said to be inconsistent. However, and most likely, a large portion of the database is still semantically correct, in a sense that has to be made precise. After having provided a formal characterization of consistent data in an inconsistent database, the natural problem emerges of extracting that semantically correct data, as query answers. The consistent data in an inconsistent database is usually characterized as the data that persists across all the database instances that are consistent and minimally differ from the inconsistent instance. Those are the so-called repairs of the database. In particular, the consistent answers to a query posed to the inconsistent database are those answers that can be simultaneously obtained from all the database repairs. As expected, the notion of repair requires an adequate notion of distance that allows for the comparison of databases with respect to how much they differ from the inconsistent instance. On this basis, the minimality condition on repairs can be properly formulated. In this monograph we present and discuss these fundamental concepts, different repair semantics, algorithms for computing consistent answers to queries, and also complexity-theoretic results related to the computation of repairs and doing consistent query answering. Table of Contents: Introduction / The Notions of Repair and Consistent Answer / Tractable CQA and Query Rewriting / Logically Specifying Repairs / Decision Problems in CQA: Complexity and Algorithms / Repairs and Data Cleaning
Cited By
- Yan M, Wang Y, Wang Y, Miao X and Li J (2024). GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models, Proceedings of the ACM on Management of Data, 2:6, (1-29), Online publication date: 18-Dec-2024.
- Amezian El Khalfioui A and Wijsen J (2024). Computing Range Consistent Answers to Aggregation Queries via Rewriting, Proceedings of the ACM on Management of Data, 2:5, (1-19), Online publication date: 4-Nov-2024.
- Fan W, Han Z, Ren W, Wang D, Wang Y, Xie M and Yan M (2023). Splitting Tuples of Mismatched Entities, Proceedings of the ACM on Management of Data, 1:4, (1-29), Online publication date: 8-Dec-2023.
- Abriola S, Cifuentes S, Martinez M, Pardal N and Pin E (2023). An epistemic approach to model uncertainty in data-graphs, International Journal of Approximate Reasoning, 160:C, Online publication date: 1-Sep-2023.
- David V, Fournier-S'niehotta R and Travers N NeoMaPy Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, (7123-7126)
- Bienvenu M, Cima G and Gutiérrez-Basulto V REPLACE Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, (3132-3139)
- Chai C, Liu J, Tang N, Fan J, Miao D, Wang J, Luo Y and Li G (2023). GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data, Proceedings of the ACM on Management of Data, 1:2, (1-27), Online publication date: 13-Jun-2023.
- Fan W, Fu W, Jin R, Liu M, Lu P and Tian C (2023). Making It Tractable to Catch Duplicates and Conflicts in Graphs, Proceedings of the ACM on Management of Data, 1:1, (1-28), Online publication date: 26-May-2023.
- Ilyas I and Rekatsinas T (2022). Machine Learning and Data Cleaning: Which Serves the Other?, Journal of Data and Information Quality, 14:3, (1-11), Online publication date: 30-Sep-2022.
- Bienvenu M, Cima G and Gutiérrez-Basulto V LACE: A Logical Approach to Collective Entity Resolution Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (379-391)
- Console M, Guagliardo P and Libkin L (2022). Propositional and predicate logics of incomplete information, Artificial Intelligence, 302:C, Online publication date: 1-Jan-2022.
- Pena E, de Almeida E and Naumann F (2021). Fast detection of denial constraint violations, Proceedings of the VLDB Endowment, 15:4, (859-871), Online publication date: 1-Dec-2021.
- Arenas M, Barceló P and Monet M (2021). The Complexity of Counting Problems Over Incomplete Databases, ACM Transactions on Computational Logic, 22:4, (1-52), Online publication date: 31-Oct-2021.
- Perez H and Wijsen J Generalized Weighted Repairs Flexible Query Answering Systems, (67-81)
- Grant J, Martinez M, Molinaro C and Parisi F (2021). Dimensional Inconsistency Measures and Postulates in Spatio-Temporal Databases, Journal of Artificial Intelligence Research, 71, (733-780), Online publication date: 10-Sep-2021.
- Feng S, Glavic B, Huber A and Kennedy O Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds Proceedings of the 2021 International Conference on Management of Data, (528-540)
- Dixit A and Kolaitis P CAvSAT: Answering Aggregation Queries over Inconsistent Databases via SAT Solving Proceedings of the 2021 International Conference on Management of Data, (2701-2705)
- Fan W, Tian C, Wang Y and Yin Q (2021). Parallel discrepancy detection and incremental detection, Proceedings of the VLDB Endowment, 14:8, (1351-1364), Online publication date: 1-Apr-2021.
- Koutris P and Wijsen J (2020). Consistent Query Answering for Primary Keys in Datalog, Theory of Computing Systems, 65:1, (122-178), Online publication date: 1-Jan-2021.
- Karlaš B, Li P, Wu R, Gürel N, Chu X, Wu W and Zhang C (2020). Nearest neighbor classifiers over incomplete information, Proceedings of the VLDB Endowment, 14:3, (255-267), Online publication date: 1-Nov-2020.
- Salimi B, Howe B and Suciu D (2020). Database Repair Meets Algorithmic Fairness, ACM SIGMOD Record, 49:1, (34-41), Online publication date: 4-Sep-2020.
- Issa O, Bonifati A and Toumani F (2020). Evaluating top-k queries with inconsistency degrees, Proceedings of the VLDB Endowment, 13:12, (2146-2158), Online publication date: 1-Aug-2020.
- Lembo D, Rosati R and Savo D Revisiting controlled query evaluation in description logics Proceedings of the 28th International Joint Conference on Artificial Intelligence, (1786-1792)
- Gheerbrant A and Sirangelo C Best answers over incomplete data Proceedings of the 28th International Joint Conference on Artificial Intelligence, (1704-1710)
- Salimi B, Rodriguez L, Howe B and Suciu D Interventional Fairness Proceedings of the 2019 International Conference on Management of Data, (793-810)
- Feng S, Huber A, Glavic B and Kennedy O Uncertainty Annotated Databases - A Lightweight Approach for Approximating Certain Answers Proceedings of the 2019 International Conference on Management of Data, (1313-1330)
- Dixit A CAvSAT Proceedings of the 2019 International Conference on Management of Data, (1823-1825)
- Bertossi L Database Repairs and Consistent Query Answering Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (48-58)
- Fiorentino N, Molinaro C and Trubitsyna I Optimizing the Computation of Approximate Certain Query Answers over Incomplete Databases Flexible Query Answering Systems, (48-60)
- Bienvenu M, Bourgaux C and Goasdoué F (2019). Computing and explaining query answers over inconsistent DL-lite knowledge bases, Journal of Artificial Intelligence Research, 64:1, (563-644), Online publication date: 1-Jan-2019.
- Bienvenu M Inconsistency-tolerant ontology-based data access revisited Proceedings of the 27th International Joint Conference on Artificial Intelligence, (1721-1729)
- Greco S, Molinaro C and Trubitsyna I Algorithms for Computing Approximate Certain Answers over Incomplete Databases Proceedings of the 22nd International Database Engineering & Applications Symposium, (1-4)
- Libkin L Certain Answers Meet Zero-One Laws Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (195-207)
- Koutris P and Wijsen J Consistent Query Answering for Primary Keys and Conjunctive Queries with Negated Atoms Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (209-224)
- Calautti M, Libkin L and Pieris A An Operational Approach to Consistent Query Answering Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (239-251)
- Fiorentino N, Greco S, Molinaro C and Trubitsyna I ACID Proceedings of the 2018 International Conference on Management of Data, (1685-1688)
- Berti-Équille L, Harmouch H, Naumann F, Novelli N and Thirumuruganathan S (2018). Discovery of genuine functional dependencies from relational data with missing values, Proceedings of the VLDB Endowment, 11:8, (880-892), Online publication date: 1-Apr-2018.
- Bertossi L and Milani M (2018). Ontological Multidimensional Data Models and Contextual Data Quality, Journal of Data and Information Quality, 9:3, (1-36), Online publication date: 15-Mar-2018.
- Console M, Guagliardo P and Libkin L On querying incomplete information in databases under bag semantics Proceedings of the 26th International Joint Conference on Artificial Intelligence, (993-999)
- Bertossi L and Salimi B (2017). From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back, Theory of Computing Systems, 61:1, (191-232), Online publication date: 1-Jul-2017.
- Koutris P and Wijsen J (2017). Consistent Query Answering for Self-Join-Free Conjunctive Queries Under Primary Key Constraints, ACM Transactions on Database Systems, 42:2, (1-45), Online publication date: 1-Jun-2017.
- Abiteboul S, Arenas M, Barceló P, Bienvenu M, Calvanese D, David C, Hull R, Hüllermeier E, Kimelfeld B, Libkin L, Martens W, Milo T, Murlak F, Neven F, Ortiz M, Schwentick T, Stoyanovich J, Su J, Suciu D, Vianu V and Yi K (2017). Research Directions for Principles of Data Management (Abridged), ACM SIGMOD Record, 45:4, (5-17), Online publication date: 11-May-2017.
- Livshits E and Kimelfeld B Counting and Enumerating (Preferred) Database Repairs Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (289-301)
- De Bona G and Hunter A (2017). Localising iceberg inconsistencies, Artificial Intelligence, 246:C, (118-151), Online publication date: 1-May-2017.
- Geerts F, Pijcke F and Wijsen J (2017). First-order under-approximations of consistent query answers, International Journal of Approximate Reasoning, 83:C, (337-355), Online publication date: 1-Apr-2017.
- Madraky A, Othman Z and Hamdan A (2016). Hair-oriented data model for spatio-temporal data representation, Expert Systems with Applications: An International Journal, 59:C, (119-144), Online publication date: 15-Oct-2016.
- Eiter T, Fink M and Stepanova D (2016). Data repair of inconsistent nonmonotonic description logic programs, Artificial Intelligence, 239:C, (7-53), Online publication date: 1-Oct-2016.
- Arieli O and Zamansky A (2016). A graded approach to database repair by context-aware distance semantics, Fuzzy Sets and Systems, 298:C, (4-21), Online publication date: 1-Sep-2016.
- Bourhis P, Puppis G, Riveros C and Staworko S (2016). Bounded Repairability for Regular Tree Languages, ACM Transactions on Database Systems, 41:3, (1-45), Online publication date: 8-Aug-2016.
- Pokorný J Conceptual and Database Modelling of Graph Databases Proceedings of the 20th International Database Engineering & Applications Symposium, (370-377)
- Benferhat S, Bouraoui Z, Croitoru M, Papini O and Tabia K Non-objection inference for inconsistency-tolerant query answering Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (3684-3690)
- Bienvenu M, Bourgaux C and Goasdoué F Query-driven repairing of inconsistent DL-Lite knowledge bases Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (957-964)
- Koutris P and Wijsen J (2016). Consistent Query Answering for Primary Keys, ACM SIGMOD Record, 45:1, (15-22), Online publication date: 2-Jun-2016.
- Fan W (2015). Data Quality, ACM SIGMOD Record, 44:3, (7-18), Online publication date: 3-Dec-2015.
- Ilyas I and Chu X (2015). Trends in Cleaning Relational Data, Foundations and Trends in Databases, 5:4, (281-393), Online publication date: 1-Oct-2015.
- Libkin L How to define certain answers Proceedings of the 24th International Conference on Artificial Intelligence, (4282-4288)
- Pfandler A and Sallinger E Distance-bounded consistent query answering Proceedings of the 24th International Conference on Artificial Intelligence, (2262-2269)
- Benferhat S, Bouraoui Z and Tabia K How to select one preferred assertional-based repair from Inconsistent and prioritized DL-lite knowledge bases? Proceedings of the 24th International Conference on Artificial Intelligence, (1450-1456)
- Salimi B and Bertossi L Query-answer causality in databases Proceedings of the UAI 2015 Conference on Advances in Causal Inference - Volume 1504, (76-85)
- Debosschere M and Geerts F Cell-based causality for data repairs Proceedings of the 7th USENIX Conference on Theory and Practice of Provenance, (14-14)
- Köhler H, Link S and Zhou X (2015). Possible and certain SQL keys, Proceedings of the VLDB Endowment, 8:11, (1118-1129), Online publication date: 1-Jul-2015.
- Köhler H, Link S and Zhou X (2018). Possible and certain SQL keys, Proceedings of the VLDB Endowment, 8:11, (1118-1129), Online publication date: 1-Jul-2015.
- Koutris P and Wijsen J The Data Complexity of Consistent Query Answering for Self-Join-Free Conjunctive Queries Under Primary Key Constraints Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (17-29)
- Fagin R, Kimelfeld B and Kolaitis P Dichotomies in the Complexity of Preferred Repairs Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (3-15)
- Greco S, Spezzano F and Trubitsyna I (2015). Checking Chase Termination: Cyclicity Analysis and Rewriting Techniques, IEEE Transactions on Knowledge and Data Engineering, 27:3, (621-635), Online publication date: 1-Mar-2015.
- Ludwig M and Peñaloza R Error-Tolerant Reasoning in the Description Logic $\mathcal{E{\kern-.1em}L}$ Proceedings of the 14th European Conference on Logics in Artificial Intelligence - Volume 8761, (107-121)
- Bertossi L and Gardezi J Tractable vs. Intractable Cases of Query Answering under Matching Dependencies Proceedings of the 8th International Conference on Scalable Uncertainty Management - Volume 8720, (51-65)
- Parisi F and Grant J Repairs and Consistent Answers for Inconsistent Probabilistic Spatio-Temporal Databases Proceedings of the 8th International Conference on Scalable Uncertainty Management - Volume 8720, (265-279)
- Fan W, Geerts F, Tang N and Yu W (2014). Conflict resolution with data currency and consistency, Journal of Data and Information Quality, 5:1-2, (1-37), Online publication date: 4-Sep-2014.
- Libkin L Certain answers as objects and knowledge Proceedings of the Fourteenth International Conference on Principles of Knowledge Representation and Reasoning, (328-337)
- Libkin L Incomplete data Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, (1-13)
- Wijsen J A Survey of the Data Complexity of Consistent Query Answering under Key Constraints Proceedings of the 8th International Symposium on Foundations of Information and Knowledge Systems - Volume 8367, (62-78)
- Greco S, Pijcke F and Wijsen J (2014). Certain query answering in partially consistent databases, Proceedings of the VLDB Endowment, 7:5, (353-364), Online publication date: 1-Jan-2014.
- Ariyan S and Bertossi L A multidimensional data model with subcategories for flexibly capturing summarizability Proceedings of the 25th International Conference on Scientific and Statistical Database Management, (1-12)
- Wijsen J Charting the tractability frontier of certain conjunctive query answering Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems, (189-200)
- Gardezi J and Bertossi L Tractable cases of clean query answering under entity resolution via matching dependencies Proceedings of the 6th international conference on Scalable Uncertainty Management, (180-193)
- Decan A, Pijcke F and Wijsen J Certain conjunctive query answering in SQL Proceedings of the 6th international conference on Scalable Uncertainty Management, (154-167)
- Gardezi J and Bertossi L Query rewriting using datalog for duplicate resolution Proceedings of the Second international conference on Datalog in Academia and Industry, (86-98)
- Bahmani Z, Bertossi L, Kolahi S and Lakshmanan L Declarative entity resolution via matching dependencies and answer set programs Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning, (380-390)
- Yaghmaie M, Bertossi L and Ariyan S Repair-oriented relational schemas for multidimensional databases Proceedings of the 15th International Conference on Extending Database Technology, (408-419)
- Pivert O and Prade H Detecting suspect answers in the presence of inconsistent information Proceedings of the 7th international conference on Foundations of Information and Knowledge Systems, (278-297)
Index Terms
- Database Repairing and Consistent Query Answering
Recommendations
Condensed Representation of Database Repairs for Consistent Query Answering
ICDT '03: Proceedings of the 9th International Conference on Database TheoryRepairing a database means bringing the database in accordance with a given set of integrity constraints by applying modifications that are as small as possible. In the seminal work of Arenas et al. on query answering in the presence of inconsistency, ...
On the tractability and intractability of consistent conjunctive query answering
PhD '11: Proceedings of the 2011 Joint EDBT/ICDT Ph.D. WorkshopThe consistent query answering framework has received considerable attention since it was first introduced as an alternative to coping with inconsistent databases. The framework was defined based on two notions: repairs and consistent query answers. ...
An Operational Approach to Consistent Query Answering
PODS '18: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database SystemsConsistent query answering (CQA) aims to find meaningful answers to queries when databases are inconsistent, i.e., do not conform to their specifications. Such answers must be certainly true in all repairs, which are consistent databases whose ...