Abstract
The quality of data in DBpedia depends on the underlying information provided in Wikipedia’s infoboxes. Different language editions can provide different information about a given subject, both in the set of attributes and in the values of those attributes. Our research question is which language edition provides the correct value for each attribute, so that data fusion can be carried out. Initial experiments showed that the quality of attributes is correlated with the overall quality of the Wikipedia article providing them. Wikipedia offers functionality to assign a quality class to an article, but unfortunately the majority of articles have not been graded by the community, or their grades are not reliable. In this paper we analyse the features and models that can be used to evaluate the quality of articles, providing a foundation for the relative quality assessment of infobox attributes, with the aim of improving the quality of DBpedia.
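The idea of quality-guided data fusion can be illustrated with a minimal sketch. It is not the authors' method: it simply assumes that each language edition's article already has a numeric quality score, and picks the attribute value from the highest-scoring edition that provides one. The function name `fuse_attribute`, the score scale, and the sample values are all hypothetical.

```python
from typing import Dict, Optional


def fuse_attribute(
    values_by_lang: Dict[str, Optional[str]],
    quality_by_lang: Dict[str, float],
) -> Optional[str]:
    """Return the attribute value from the best-scored language edition.

    values_by_lang: hypothetical infobox values keyed by language code;
        None means the edition does not provide the attribute.
    quality_by_lang: hypothetical article quality scores (higher is better);
        editions without a score are treated as 0.0.
    """
    candidates = [
        (quality_by_lang.get(lang, 0.0), lang, value)
        for lang, value in values_by_lang.items()
        if value is not None
    ]
    if not candidates:
        return None
    # Tuples compare by score first; the language code breaks ties.
    _, _, best_value = max(candidates)
    return best_value


# Illustrative data only: "de" has the best article but no value,
# so the value from "pl" (next best score) is selected.
values = {"en": "38,000,000", "pl": "38,500,000", "de": None}
quality = {"en": 0.6, "pl": 0.9, "de": 0.95}
fused = fuse_attribute(values, quality)
```

Selecting a single winning edition is the simplest possible fusion rule; a weighted vote across editions would be another option under the same assumption that article quality is a usable proxy for attribute quality.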
Notes
- 1. Except those edited by multi-lingual editors and resulting from translation.
- 2.
- 3.
- 4. Alpha version available at http://wikirank.net.
- 5. This is obvious, as with a reduced number of classes we avoid misclassification within the combined classes.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Węcel, K., Lewoniewski, W. (2015). Modelling the Quality of Attributes in Wikipedia Infoboxes. In: Abramowicz, W. (eds) Business Information Systems Workshops. BIS 2015. Lecture Notes in Business Information Processing, vol 228. Springer, Cham. https://doi.org/10.1007/978-3-319-26762-3_27
DOI: https://doi.org/10.1007/978-3-319-26762-3_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26761-6
Online ISBN: 978-3-319-26762-3
eBook Packages: Computer Science, Computer Science (R0)