Not All the Same: Understanding and Informing Similarity Estimation in Tile-Based Video Games

S Berns, V Volz, L Tokarchuk, S Snodgrass… - Proceedings of the CHI …, 2024 - dl.acm.org
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024
Similarity estimation is essential for many game AI applications, from the procedural generation of distinct assets to automated exploration with game-playing agents. While similarity metrics often substitute for human evaluation, their alignment with human judgement is unclear. Consequently, the results of applying them can fall short of human expectations, leading, for example, to unappreciated content or unbelievable agent behaviour. We address this gap through a multi-factorial study of two tile-based games in two representations, where participants (N=456) judged the similarity of level triplets. Based on this data, we construct domain-specific perceptual spaces that encode similarity-relevant attributes. We compare 12 metrics to these spaces and evaluate their approximation quality through several quantitative lenses. Moreover, we conduct a qualitative labelling study to identify the features underlying human similarity judgement in this popular genre. Our findings inform the selection of existing metrics and highlight requirements for the design of new similarity metrics that benefit game development and research.
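To make the idea of checking a metric against triplet judgements concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not describe the study's triplet protocol or name the 12 metrics, so the odd-one-out response format, the Hamming distance metric, the toy levels, and all function names below are assumptions rather than the authors' method.

```python
from itertools import combinations


def hamming_distance(level_a, level_b):
    """Count tile positions at which two equally sized levels differ.

    Assumed metric for illustration; levels are lists of rows of tile IDs.
    """
    return sum(
        tile_a != tile_b
        for row_a, row_b in zip(level_a, level_b)
        for tile_a, tile_b in zip(row_a, row_b)
    )


def predicted_odd_one_out(triplet, distance):
    """Return the index (0, 1 or 2) of the level outside the closest pair."""
    closest_pair = min(
        combinations(range(3), 2),
        key=lambda pair: distance(triplet[pair[0]], triplet[pair[1]]),
    )
    return ({0, 1, 2} - set(closest_pair)).pop()


def triplet_agreement(triplets, human_choices, distance):
    """Fraction of triplets where the metric picks the same odd one out
    as the (hypothetical) human response."""
    hits = sum(
        predicted_odd_one_out(triplet, distance) == choice
        for triplet, choice in zip(triplets, human_choices)
    )
    return hits / len(triplets)


# Toy 2x3 levels with made-up tile IDs (not data from the study).
level_a = [[0, 0, 1], [1, 1, 1]]
level_b = [[0, 0, 1], [1, 1, 0]]
level_c = [[2, 2, 0], [0, 0, 0]]

triplets = [(level_a, level_b, level_c), (level_a, level_c, level_b)]
human_choices = [2, 0]  # hypothetical odd-one-out index per triplet

agreement = triplet_agreement(triplets, human_choices, hamming_distance)
```

In this toy setup the metric agrees with the first hypothetical response and disagrees with the second, so `agreement` is 0.5. Swapping in a different `distance` function allows several candidate metrics to be compared on the same judgements.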