DOI: 10.1145/2462932.2462959
Research article

In search of the ur-Wikipedia: universality, similarity, and translation in the Wikipedia inter-language link network

Published: 27 August 2012

Abstract

Wikipedia has become one of the primary encyclopaedic information repositories on the World Wide Web. It started in 2001 with a single edition in the English language and has since expanded to more than 20 million articles in 283 languages. Criss-crossing between the Wikipedias is an inter-language link network, connecting the articles of one edition of Wikipedia to another. We describe characteristics of articles covered by nearly all Wikipedias and those covered by only a single language edition, we use the network to understand how we can judge the similarity between Wikipedias based on concept coverage, and we investigate the flow of translation between a selection of the larger Wikipedias. Our findings indicate that the relationships between Wikipedia editions follow Tobler's first law of geography: similarity decreases with increasing distance. The number of articles in a Wikipedia edition is found to be the strongest predictor of similarity, while language similarity also appears to have an influence. The English Wikipedia edition is by far the primary source of translations. We discuss the impact of these results for Wikipedia as well as user-generated content communities in general.
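
To make the idea of similarity based on concept coverage concrete, here is a minimal sketch. It assumes that each edition can be reduced to a set of language-independent concept identifiers (one per cluster of inter-language-linked articles) and it uses the Jaccard index as the overlap measure. Both are illustrative assumptions: the abstract does not state which measure the authors use, and the edition codes, concept IDs, and function names below are hypothetical toy values.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of concepts covered by both editions among those covered by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: edition code -> set of concept IDs it covers.
coverage = {
    "en": {"c1", "c2", "c3", "c4"},
    "de": {"c1", "c2", "c3"},
    "nl": {"c1", "c2"},
}


def pairwise_similarity(coverage: dict) -> dict:
    """Jaccard similarity for every unordered pair of editions."""
    return {
        (x, y): jaccard(coverage[x], coverage[y])
        for x, y in combinations(sorted(coverage), 2)
    }


similarity = pairwise_similarity(coverage)
for pair, score in similarity.items():
    print(pair, round(score, 3))
```

A symmetric measure such as this one tends to score a small edition as dissimilar to a large one even when the large edition covers everything in the small one, which is one way edition size could come to dominate similarity.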



Published In

WikiSym '12: Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration
August 2012
295 pages
ISBN: 9781450316057
DOI: 10.1145/2462932
General Chair: Cliff Lampe
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Tobler's law
  2. Wikipedia
  3. first law of geography
  4. multilingual

Acceptance Rates

WikiSym '12 Paper Acceptance Rate: 21 of 37 submissions, 57%
Overall Acceptance Rate: 69 of 145 submissions, 48%


