Weak inter-rater reliability in heuristic evaluation of video games

Published: 07 May 2011

Abstract

Heuristic evaluation promises to be a low-cost usability evaluation method, but it is fraught with problems of subjective interpretation and a proliferation of competing and contradictory heuristic lists. This is particularly true in games research, where no rigorous comparative validation has yet been published. To validate the available heuristics, a user test of a commercial game is conducted with 6 participants, identifying 88 usability issues, against which 3 evaluators rate 146 heuristics for relevance. Inter-rater reliability is weak, with a Krippendorff's Alpha of 0.343, refuting validation of any of the available heuristics. This weak reliability stems from the high complexity of video games: evaluators interpret different but equally reasonable causes and solutions for the same issues, and hence rate the heuristics with wide variance.
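
To make the headline statistic concrete: Krippendorff's Alpha is defined as α = 1 − D_o/D_e, the ratio of observed to expected disagreement, where α = 1 means perfect agreement and values near 0 mean agreement no better than chance. The sketch below shows how an agreement figure such as the reported 0.343 is computed. The ratings matrix is hypothetical and the computation assumes the third-party `krippendorff` Python package; neither comes from the paper.

```python
# Minimal sketch of the reliability computation described in the abstract:
# several evaluators rate heuristics for relevance, and agreement is measured
# with Krippendorff's Alpha. Ratings below are made up for illustration.
# Requires the third-party package: pip install krippendorff numpy
import numpy as np
import krippendorff

# reliability_data has shape (n_raters, n_units): one row per evaluator,
# one column per heuristic being rated. np.nan marks a missing rating.
ratings = np.array([
    [1,      2, 3, 3, 2, 1, 4, 1, 2, np.nan],  # evaluator A
    [1,      2, 3, 3, 2, 2, 4, 1, 2, 5],       # evaluator B
    [np.nan, 3, 3, 3, 2, 3, 4, 2, 2, 5],       # evaluator C
])

# Relevance ratings are ordered categories, so the ordinal-level variant of
# alpha (alpha = 1 - D_o / D_e) is the appropriate choice here.
alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.3f}")  # near 0 = weak agreement
```

In the study's setup the real matrix would simply be 3 evaluators by 146 heuristics; everything else in the computation stays the same.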



    Published In

    CHI EA '11: CHI '11 Extended Abstracts on Human Factors in Computing Systems
    May 2011
    2554 pages
ISBN: 9781450302685
DOI: 10.1145/1979742


    Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. heuristic evaluation
    2. usability
    3. user experience
    4. video game


