DOI: 10.1145/2207676.2208553
research-article

Omnipedia: bridging the wikipedia language gap

Published: 05 May 2012

Abstract

We present Omnipedia, a system that allows Wikipedia readers to gain insight from up to 25 language editions of Wikipedia simultaneously. Omnipedia highlights the similarities and differences that exist among Wikipedia language editions, and makes salient information that is unique to each language as well as that which is shared more widely. We detail solutions to numerous front-end and algorithmic challenges inherent to providing users with a multilingual Wikipedia experience. These include visualizing content in a language-neutral way and aligning data in the face of diverse information organization strategies. We present a study of Omnipedia that characterizes how people interact with information using a multilingual lens. We found that users actively sought information exclusive to unfamiliar language editions and strategically compared how language editions defined concepts. Finally, we briefly discuss how Omnipedia generalizes to other domains facing language barriers.

Supplementary Material

MOV File (paperfile282-3.mov)
Supplemental video for “Omnipedia: bridging the wikipedia language gap”


    Published In

    CHI '12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
    May 2012
    3276 pages
    ISBN:9781450310154
    DOI:10.1145/2207676

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. hyperlingual
    2. language barrier
    3. multilingual
    4. text mining
    5. user-generated content
    6. wikipedia

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Cited By

    • (2024) Chapter 6. Exploring the evolution of Wikipedia articles through Contropedia. Investigating Wikipedia, pp. 156-177. DOI: 10.1075/scl.121.06lan. Online publication date: 25-Oct-2024
    • (2023) Introducing an “invisible enemy”: A case study of knowledge construction regarding microplastics in Japanese Wikipedia. New Media & Society 26:10, pp. 6159-6180. DOI: 10.1177/14614448221149747. Online publication date: 22-Jan-2023
    • (2023) Detecting Cross-Lingual Information Gaps in Wikipedia. Companion Proceedings of the ACM Web Conference 2023, pp. 581-585. DOI: 10.1145/3543873.3587539. Online publication date: 30-Apr-2023
    • (2023) Between news and history: identifying networked topics of collective attention on Wikipedia. Journal of Computational Social Science 6:2, pp. 845-875. DOI: 10.1007/s42001-023-00215-w. Online publication date: 8-Jul-2023
    • (2022) Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal. Sustainability 14:8, article 4705. DOI: 10.3390/su14084705. Online publication date: 14-Apr-2022
    • (2022) Using natural language generation to bootstrap missing Wikipedia articles: A human-centric perspective. Semantic Web 13:2, pp. 163-194. DOI: 10.3233/SW-210431. Online publication date: 3-Feb-2022
    • (2022) The Model Card Authoring Toolkit: Toward Community-centered, Deliberation-driven AI Design. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 440-451. DOI: 10.1145/3531146.3533110. Online publication date: 21-Jun-2022
    • (2022) User Access Models to Event-Centric Information. Companion Proceedings of the Web Conference 2022, pp. 329-333. DOI: 10.1145/3487553.3524193. Online publication date: 25-Apr-2022
    • (2022) Designing diagrams for Wikipedia. Information Design Journal, pp. 65-79. DOI: 10.1075/idj.23.1.08mau. Online publication date: 5-Jul-2022
    • (2022) Information asymmetry in Wikipedia across different languages. Journal of the Association for Information Science and Technology 73:3, pp. 347-361. DOI: 10.1002/asi.24553. Online publication date: 7-Feb-2022
