DOI: 10.1109/ICDM.2008.40

Filling in the Blanks - Krimp Minimisation for Missing Data

Published: 15 December 2008 Publication History

Abstract

Many data sets are incomplete. For correct analysis of such data, one can either use algorithms that are designed to handle missing data or use imputation. Imputation has the benefit that it allows for any type of data analysis. Obviously, this can only lead to proper conclusions if the provided data completion is both highly accurate and maintains all statistics of the original data. In this paper, we present three data completion methods that are built on the MDL-based Krimp algorithm. Here, we also follow the MDL principle: the completed database that can be compressed best is the best completion, because it adheres best to the patterns in the data. By using local patterns, as opposed to a global model, Krimp captures the structure of the data in detail. Experiments show that, in terms of both accuracy and expected differences of any marginal, these methods provide better data reconstructions than the state of the art, Structural EM.
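
The MDL idea in the abstract (prefer the completion that compresses best) can be illustrated with a toy sketch. This is not Krimp and not one of the paper's three methods: as a stand-in for a Krimp code table, it assumes a Shannon-optimal code over whole rows and ignores the cost of the model itself, and it searches all completions exhaustively, which is only feasible for tiny examples. The names `encoded_length`, `best_completion`, `rows`, and `domains` are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def encoded_length(rows):
    """Bits needed to encode all rows with a Shannon-optimal code over whole rows.

    Assumption: this replaces Krimp's pattern-based code table with a much
    cruder model, and the cost of describing the model is left out.
    """
    counts = Counter(rows)
    n = len(rows)
    return sum(-c * math.log2(c / n) for c in counts.values())


def best_completion(rows, domains):
    """Try every way of filling the None cells and keep the cheapest database."""
    # Positions (row index, column index) of all missing values.
    holes = [(i, j) for i, row in enumerate(rows)
             for j, value in enumerate(row) if value is None]
    best, best_bits = None, math.inf
    # Exhaustive search: one candidate per combination of fill-in values.
    for values in product(*(domains[j] for _, j in holes)):
        candidate = [list(row) for row in rows]
        for (i, j), value in zip(holes, values):
            candidate[i][j] = value
        candidate = [tuple(row) for row in candidate]
        bits = encoded_length(candidate)
        if bits < best_bits:
            best, best_bits = candidate, bits
    return best, best_bits


# Toy database: two categorical attributes, one missing value in the last row.
rows = [("a", "x"), ("a", "x"), ("a", "x"),
        ("b", "y"), ("b", "y"),
        ("a", None)]
domains = [["a", "b"], ["x", "y"]]

completed, bits = best_completion(rows, domains)
print(completed[-1], round(bits, 2))
```

In this toy database, filling the gap with "x" makes the last row repeat an existing frequent row, so the completed database needs fewer bits than the alternative that introduces a new, rare row.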

Published In

ICDM '08: Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
December 2008
1145 pages
ISBN:9780769535029

Publisher

IEEE Computer Society

United States

Author Tags

  1. Krimp
  2. MDL
  3. imputation
  4. local patterns
  5. missing data estimation

Cited By

  • (2025) Missing value replacement in strings and applications. Data Mining and Knowledge Discovery 39:2. DOI: 10.1007/s10618-024-01074-3. Online publication date: 22-Jan-2025
  • (2014) PeGS: Perturbed Gibbs Samplers that Generate Privacy-Compliant Synthetic Data. Transactions on Data Privacy 7:3 (253-282). DOI: 10.5555/2870614.2870617. Online publication date: 1-Dec-2014
  • (2014) MDL4BMF. ACM Transactions on Knowledge Discovery from Data 8:4 (1-31). DOI: 10.1145/2601437. Online publication date: 7-Oct-2014
  • (2013) Effects of data set features on the performances of classification algorithms. Expert Systems with Applications: An International Journal 40:5 (1847-1857). DOI: 10.1016/j.eswa.2012.09.017. Online publication date: 1-Apr-2013
  • (2012) The long and the short of it. Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining (462-470). DOI: 10.1145/2339530.2339606. Online publication date: 12-Aug-2012
  • (2012) Queries for data analysis. Proceedings of the 11th international conference on Advances in Intelligent Data Analysis (7-22). DOI: 10.1007/978-3-642-34156-4_3. Online publication date: 25-Oct-2012
  • (2011) Model order selection for boolean matrix factorization. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (51-59). DOI: 10.1145/2020408.2020424. Online publication date: 21-Aug-2011
  • (2010) Making pattern mining useful. ACM SIGKDD Explorations Newsletter 12:1 (75-76). DOI: 10.1145/1882471.1882483. Online publication date: 9-Nov-2010