research-article

Estimating Relative User Expertise for Content Quality Prediction on Reddit

Authors:

Mark James Carman,

Sze-Meng Jojo Wong

HT '17: Proceedings of the 28th ACM Conference on Hypertext and Social Media

Pages 55 - 64

https://doi.org/10.1145/3078714.3078720

Published: 04 July 2017

Abstract

Reddit, as a social curation site, relies on its users to curate content from the World Wide Web (WWW) for consumption by other users. Content on the site is enriched through user comments, discussions and extensions. This additional content varies in quality, however, ranging from meaningful information to misleading content, depending on the reliability, expertise and intention of its authors. Reddit relies on the Wisdom of the Crowd (WotC) of its community, together with selected moderators, to manage its content. We argue that this approach suffers from a cold-start problem in collecting user votes and is at risk of user bias, particularly a group-think mentality. In addition, managing the large collection of content on Reddit is expensive. In our study, we explore the estimation of relative user expertise through various content-agnostic approaches. We show that it is possible to infer information quality on Reddit from the expertise of the authors. This prediction of content quality could lead to improved organisation (re-ranking) of Reddit content for user consumption and future information retrieval.
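The abstract does not say which content-agnostic estimators were used, so the following is only an illustrative sketch of one possible approach, not the authors' method. It assumes that comments in the same thread can be treated as pairwise contests (the author of the higher-scored comment "wins") and fits a Bradley-Terry paired-comparison model to those contests. The function names, the `prior` smoothing term and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_wins(threads):
    """Count pairwise 'wins' between authors.

    Assumption (not from the paper): within one thread, the author of the
    higher-scored comment beats the author of the lower-scored one.
    Ties and self-comparisons are skipped.
    """
    wins = defaultdict(int)
    for comments in threads:
        for (a, score_a), (b, score_b) in combinations(comments, 2):
            if a == b or score_a == score_b:
                continue
            winner, loser = (a, b) if score_a > score_b else (b, a)
            wins[(winner, loser)] += 1
    return wins


def bradley_terry(wins, iters=100, prior=0.1):
    """Estimate relative strengths with a minorise-maximise style update.

    `prior` adds a fractional win and loss against a virtual opponent of
    strength 1.0, which keeps authors with no wins away from zero.
    """
    total_wins = defaultdict(float)
    games = defaultdict(float)      # comparisons per unordered pair
    opponents = defaultdict(set)
    for (a, b), count in wins.items():
        if a == b:
            continue
        total_wins[a] += count
        games[frozenset((a, b))] += count
        opponents[a].add(b)
        opponents[b].add(a)

    strength = {u: 1.0 for u in opponents}
    for _ in range(iters):
        updated = {}
        for u in strength:
            denom = 2 * prior / (strength[u] + 1.0)
            for v in opponents[u]:
                denom += games[frozenset((u, v))] / (strength[u] + strength[v])
            updated[u] = (total_wins[u] + prior) / denom
        strength = updated
    return strength


# Toy data: each thread is a list of (author, comment score) pairs.
threads = [
    [("alice", 10), ("bob", 3), ("carol", 1)],
    [("alice", 8), ("bob", 5)],
    [("bob", 4), ("carol", 2)],
    [("carol", 6), ("bob", 2)],
    [("bob", 9), ("alice", 1)],
]
strength = bradley_terry(pairwise_wins(threads))
ranking = sorted(strength, key=strength.get, reverse=True)
```

Under these assumptions, new comments could be re-ranked by their authors' estimated strengths before any votes arrive, which is one way to sidestep the cold-start problem described above.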

Index Terms

Information systems

Published In

HT '17: Proceedings of the 28th ACM Conference on Hypertext and Social Media

July 2017

336 pages

ISBN: 9781450347082

DOI: 10.1145/3078714

General Chairs: Peter Dolog (Aalborg University, Denmark), Peter Vojtas (Charles University, Prague, Czech Republic)

Program Chairs: Francesco Bonchi (ISI Foundation, Turin, Italy), Denis Helic (Graz University of Technology, Austria)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

In-Cooperation

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

HT'17

Sponsor:

SIGWEB

HT'17: 28th Conference on Hypertext and Social Media

July 4 - 7, 2017

Prague, Czech Republic

Acceptance Rates

HT '17 Paper Acceptance Rate 19 of 69 submissions, 28%;

Overall Acceptance Rate 378 of 1,158 submissions, 33%
