Weak inter-rater reliability in heuristic evaluation of video games

Published: 07 May 2011

Abstract

Heuristic evaluation promises to be a low-cost usability evaluation method, but it is fraught with problems of subjective interpretation and a proliferation of competing and contradictory heuristic lists. This is particularly true in games research, where no rigorous comparative validation has yet been published. To validate the available heuristics, a user test of a commercial game is conducted with 6 participants, identifying 88 usability issues, against which 3 evaluators rate 146 heuristics for relevance. Inter-rater reliability is weak, with a Krippendorff's Alpha of 0.343, refuting validation of any of the available heuristics. This weak reliability stems from the high complexity of video games: evaluators interpret different but equally reasonable causes and solutions for the same issues, and hence rate the heuristics with wide variance.
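
To make the headline statistic concrete: Krippendorff's Alpha is defined as α = 1 − D_o/D_e, the ratio of observed to expected disagreement, where α = 1 means perfect agreement and values near 0 mean agreement no better than chance. The sketch below shows how an agreement figure such as the reported 0.343 is computed. The ratings matrix is hypothetical and the computation assumes the third-party `krippendorff` Python package; neither comes from the paper.

```python
# Minimal sketch of the reliability computation described in the abstract:
# several evaluators rate heuristics for relevance, and agreement is measured
# with Krippendorff's Alpha. Ratings below are made up for illustration.
# Requires the third-party package: pip install krippendorff numpy
import numpy as np
import krippendorff

# reliability_data has shape (n_raters, n_units): one row per evaluator,
# one column per heuristic being rated. np.nan marks a missing rating.
ratings = np.array([
    [1,      2, 3, 3, 2, 1, 4, 1, 2, np.nan],  # evaluator A
    [1,      2, 3, 3, 2, 2, 4, 1, 2, 5],       # evaluator B
    [np.nan, 3, 3, 3, 2, 3, 4, 2, 2, 5],       # evaluator C
])

# Relevance ratings are ordered categories, so the ordinal-level variant of
# alpha (alpha = 1 - D_o / D_e) is the appropriate choice here.
alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.3f}")  # near 0 = weak agreement
```

In the study's setup the real matrix would simply be 3 evaluators by 146 heuristics; everything else in the computation stays the same.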



    Published In

    CHI EA '11: CHI '11 Extended Abstracts on Human Factors in Computing Systems
    May 2011
    2554 pages
ISBN: 9781450302685
DOI: 10.1145/1979742


    Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. heuristic evaluation
    2. usability
    3. user experience
    4. video game


