research-article

Development and evaluation of a continuous-time Markov chain model for detecting and handling data currency declines

Published: 01 November 2017

Abstract

Data currency declines, caused by recorded data values becoming outdated, can damage the usability and accountability of data resources. Detecting and updating outdated values may improve data currency and reduce the associated damage, but such efforts may be costly and cannot always be justified. This study models currency decline scenarios using a continuous-time Markov chain stochastic process with a finite number of states, each reflecting a valid data value. The model considers state transition probabilities, transition time distributions, and the tradeoff between the damage associated with outdated data and the cost of reacquisition. The proposed formulation permits the currency level to be estimated without having to rely on a baseline for comparison, as well as the prediction of future currency declines, assessment of their accumulated damage, and optimization of the timing of cost-effective data auditing and reacquisition. The study introduces a comprehensive evaluation of the proposed model, using a large real-world dataset relating to the handling of insurance claims over multiple time periods. The evaluation results highlight the applicability of the model, and its potential contribution to proactive data quality management and cost-effective handling of currency declines.

Highlights

  • Development of a Continuous-Time Markov Chain model for data quality management
  • Estimating data currency levels, and predicting future declines
  • Recommending if and when to reacquire data, considering cost-benefit tradeoffs
  • Evaluation in the context of insurance claim handling, using real-world data
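The modeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the generator matrix `Q`, the constant `damage_rate`, and the function names are all hypothetical, and the matrix exponential is computed by uniformization simply to keep the example dependency-free. Currency is read here as the probability that a value recorded in a given state is still in that state after some age, and accumulated damage as a constant damage rate integrated over the probability of being outdated.

```python
import math


def transition_matrix(Q, t, tol=1e-12, max_terms=10_000):
    """Approximate P(t) = exp(Q t) for a finite CTMC by uniformization.

    Q is a generator matrix (rows sum to zero, off-diagonals >= 0).
    """
    n = len(Q)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    lam = max(-Q[i][i] for i in range(n))  # uniformization rate
    if lam == 0.0 or t == 0.0:
        return identity
    # Embedded discrete-time chain: U = I + Q / lam
    U = [[identity[i][j] + Q[i][j] / lam for j in range(n)] for i in range(n)]
    P = [[0.0] * n for _ in range(n)]
    term = identity                 # holds U^k
    weight = math.exp(-lam * t)     # Poisson(k; lam * t), starting at k = 0
    mass = 0.0
    for k in range(max_terms):
        for i in range(n):
            for j in range(n):
                P[i][j] += weight * term[i][j]
        mass += weight
        if mass >= 1.0 - tol:       # remaining Poisson tail is negligible
            break
        weight *= lam * t / (k + 1)
        term = [[sum(term[i][m] * U[m][j] for m in range(n))
                 for j in range(n)] for i in range(n)]
    return P


def currency(Q, state, age):
    """Probability that a value recorded in `state` is still in it at `age`."""
    return transition_matrix(Q, age)[state][state]


def expected_damage(Q, state, horizon, damage_rate, steps=200):
    """Trapezoid-rule integral of damage_rate * P(outdated) over [0, horizon].

    Assumes (hypothetically) that damage accrues at a constant rate while
    the recorded value is outdated.
    """
    h = horizon / steps
    values = [damage_rate * (1.0 - currency(Q, state, k * h))
              for k in range(steps + 1)]
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])


# Hypothetical three-state example; rates are per unit of time.
Q_example = [[-0.30, 0.20, 0.10],
             [0.05, -0.15, 0.10],
             [0.00, 0.00, 0.00]]   # state 2 is absorbing
print(currency(Q_example, 0, 2.0))
print(expected_damage(Q_example, 0, 2.0, damage_rate=10.0))
```

Under these assumptions, comparing `expected_damage` for a candidate horizon against a reacquisition cost gives a rough sense of the cost-benefit tradeoff the abstract refers to; the paper's actual optimization may differ.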

Cited By

  • (2021) Protecting the Moving User’s Locations by Combining Differential Privacy and k-Anonymity under Temporal Correlations in Wireless Networks. Wireless Communications & Mobile Computing, 2021. DOI: 10.1155/2021/6691975. Online publication date: 1-Jan-2021
  • (2021) Managing Data Quality of the Data Warehouse: A Chance-Constrained Programming Approach. Information Systems Frontiers, 23:2 (375-389). DOI: 10.1007/s10796-019-09963-5. Online publication date: 1-Apr-2021

Information & Contributors

Information

Published In

Decision Support Systems  Volume 103, Issue C
November 2017
107 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 01 November 2017

Author Tags

  1. Continuous-time Markov chain
  2. Data currency
  3. Data quality management

Qualifiers

  • Research-article
