AJAX: an extensible data cleaning tool

Published: 16 May 2000

Abstract

Clustering groups together matching pairs with a high similarity value by applying a given grouping criterion (e.g., transitive closure). Finally, merging collapses each individual cluster into a tuple of the resulting data source. AJAX provides @@@@ for specifying data cleaning programs, which consists of SQL statements enriched with a set of specific primitives to express these transformations.
AJAX also @@@@. It allows the user to interact with an executing data cleaning program to handle exceptional cases and to inspect intermediate results. Finally, AJAX provides @@@@ @@@@ that permits users to determine the source and processing of data for debugging purposes.
We will present the AJAX system applied to two real-world problems: the consolidation of a telecommunication database, and the conversion of a dirty database of bibliographic references into a set of clean, normalized, and redundancy-free relational tables that preserve the same data.
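
The grouping of matching pairs by transitive closure and the collapsing of each cluster into a single tuple, as described in the abstract, can be illustrated with a minimal Python sketch. This is not AJAX's language or implementation: the function names, the similarity threshold, the sample records, and the "keep the longest value per field" merge policy are all assumptions made for illustration only.

```python
from collections import defaultdict


def cluster_by_transitive_closure(record_ids, scored_pairs, threshold):
    """Group records connected, directly or indirectly, by pairs whose
    similarity is at least `threshold` (hypothetical parameter)."""
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # union the two clusters

    clusters = defaultdict(list)
    for r in record_ids:
        clusters[find(r)].append(r)
    return sorted(sorted(c) for c in clusters.values())


def merge_cluster(cluster, records):
    """Collapse one cluster into a single tuple. Assumed policy:
    keep the longest string value seen for each field."""
    fields = records[cluster[0]].keys()
    return {f: max((records[r][f] for r in cluster), key=len) for f in fields}


# Hypothetical input: records keyed by id, and scored matching pairs.
records = {
    1: {"name": "J. Smith", "city": "Paris"},
    2: {"name": "John Smith", "city": ""},
    3: {"name": "John Smith", "city": "Paris"},
    4: {"name": "A. Jones", "city": "Lyon"},
}
scored_pairs = [(1, 2, 0.90), (2, 3, 0.95), (1, 4, 0.20)]

clusters = cluster_by_transitive_closure(records, scored_pairs, threshold=0.8)
merged = [merge_cluster(c, records) for c in clusters]
print(clusters)
print(merged)
```

Records 1 and 3 are never compared directly here, yet they land in the same cluster because each matches record 2; that is the effect of using transitive closure as the grouping criterion. Other criteria or merge policies would slot into the same two-step shape.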

Published In

ACM SIGMOD Record  Volume 29, Issue 2
June 2000
609 pages
ISSN:0163-5808
DOI:10.1145/335191

SIGMOD '00: Proceedings of the 2000 ACM SIGMOD international conference on Management of data
May 2000
604 pages
ISBN:1581132174
DOI:10.1145/342009

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 May 2000
Published in SIGMOD Volume 29, Issue 2

Qualifiers

  • Article
