Artful Path to Healing: Using Machine Learning for Visual Art Recommendation to Prevent and Reduce Post-Intensive Care Syndrome (PICS)

Authors: Bereket A. Yilma, Chan Mi Kim, Gerald C. Cupchik, Luis A. Leiva

CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

Article No.: 447, Pages 1 - 19

https://doi.org/10.1145/3613904.3642636

Published: 11 May 2024

Abstract

Staying in the intensive care unit (ICU) is often traumatic, leading to post-intensive care syndrome (PICS), which encompasses physical, psychological, and cognitive impairments. Currently, there are limited interventions available for PICS. Studies indicate that exposure to visual art may help address the psychological aspects of PICS and be more effective if it is personalized. We develop Machine Learning-based Visual Art Recommendation Systems (VA RecSys) to enable personalized therapeutic visual art experiences for post-ICU patients. We investigate four state-of-the-art VA RecSys engines, evaluating the relevance of their recommendations for therapeutic purposes compared to expert-curated recommendations. We conduct an expert pilot test and a large-scale user study (n=150) to assess the appropriateness and effectiveness of these recommendations. Our results suggest all recommendations enhance temporal affective states. Visual and multimodal VA RecSys engines compare favourably with expert-curated recommendations, indicating their potential to support the delivery of personalized art therapy for PICS prevention and treatment.

Figure 1:

1 Introduction

Patients in the intensive care unit (ICU) generally undergo stressful and traumatic experiences stemming from critical illness, medical procedures, pain, and a hostile environment [20]. Even after ICU discharge, these patients are vulnerable and at a high risk of readmission to the hospital and the ICU [70, 80]. Post-ICU patients often suffer from post-intensive care syndrome (PICS), which refers to “new or worsening impairments in physical, cognitive, or mental health arising after critical illness and persisting beyond acute care hospitalization” [55]. PICS is quite common, affecting up to 75% of patients discharged from the ICU [14, 56, 60, 67], and it is associated with reduced quality of life and difficulties in reintegrating into society [19, 24]. Psychological aspects of PICS include depression, anxiety disorders, and post-traumatic stress disorder (PTSD) [33]. While there is growing interest in their prevention and treatment, only limited interventions are available, such as ICU follow-up clinics [12, 33] and ICU diaries [6, 33], and there is a need for more diverse and effective approaches.

Visual art has been widely utilized to promote psychological well-being in clinical environments. While art therapy is commonly understood as a form involving creative activities, in this paper, we use ’art therapy’ as an umbrella term where art serves as a medium for therapeutic benefits [13, 23]. This includes various forms, such as engaging with existing artwork to stimulate emotions and self-reflections. Art therapy, for example, has been employed as a method to address various forms of psychological disorders, including depression [3, 23], anxiety [23], and PTSD [73]. This approach leverages the unique characteristics of visual art, such as diverse styles that activate interpretation and imagination, and the ability to stimulate the expression of memories and specific emotions [73], which have demonstrated effectiveness in addressing these psychological disorders [3, 23, 73]. In the context of hospitals in general, as well as critical care settings such as the ICU, the use of visual art as a positive distraction has also demonstrated its effectiveness in reducing stress, anxiety, and pain perception [54, 76]. This body of evidence showcases the potential of visual art as an intervention for addressing the psychological aspects of PICS.

Previous studies have suggested the importance of personalization in providing positive distractions to enhance effectiveness [71] while minimizing potential side effects [57]. This indicates that for visual art to serve as a positive distraction, it is crucial that it resonates with the patient’s emotional needs, emphasizing the significance of selecting appropriate art for each patient. Furthermore, to achieve a prolonged effect of positive distraction, a continuous supply of personalized art is necessary, which entails a large number of artworks.

In light of these challenges, a personalised methodology for selecting paintings in art therapy is not merely advantageous but essential. The intersection of personalised medicine and art therapy holds the potential to reshape the landscape of PICS treatment. In particular, recent advances in Machine Learning-based Visual Art Recommendation Systems (VA RecSys) open a novel avenue for tackling the challenges of artwork selection in a personalised and nuanced manner, for use in art therapy for post-ICU patients and beyond. By integrating these systems, we can bridge the gap between the vast universe of artworks and the unique emotional needs of each patient, thereby supporting experts in selecting curated artworks that resonate with the patient’s distinct cognitive and emotional requirements.

In this paper, we set out to explore the potential benefits of integrating Machine Learning (ML) based VA RecSys within the framework of PICS treatment using art therapy. To the best of our knowledge, there are no prior works leveraging VA RecSys in a therapeutic context. Therefore, we formulate the following research question: Can VA RecSys algorithms support the psychological well-being of post-ICU patients through personalized art therapy?

In pursuit of this investigation, we propose approaches to integrate state-of-the-art VA RecSys engines within the context of PICS prevention and reduction, evaluating their potential efficacy and relevance. We explore four VA RecSys engines that have shown superiority in uncovering complex semantic relationships of artwork and have been successfully applied to personalised recommendation tasks [90, 91]. In particular, we trained three unimodal engines on image and textual data of artworks, and one multimodal engine that fuses both image and text. For our image-based approach we use the popular Residual Neural Network (ResNet) [28]; for our text-based approach we adopt both Latent Dirichlet Allocation (LDA) [7] and Bidirectional Encoder Representations from Transformers (BERT) [18]; and for our fusion approach we use Bootstrapping Language-Image Pre-training (BLIP) [43]. The learned representations are used to derive personalised artwork recommendations for therapy that are intended to match patients’ emotional needs.

In sum, this paper makes the following contributions:

• We develop and study four advanced VA RecSys engines using different backbone architectures (ResNet, LDA, BERT, and BLIP) to support PICS prevention and reduction through guided art therapy.

• We conduct a usability test with 4 healthcare experts to assess the appropriateness of VA RecSys engines and a large-scale study with 150 post-ICU patients to assess the efficacy of the proposed VA RecSys engines, as compared to expert-curated recommendations.

• We contextualise our findings and provide guidance about potential strategies to integrate ML-based VA RecSys in a personalised PICS intervention and beyond.

2 Related Work

2.1 Approaches to prevent and reduce PICS

There are several interventions focused on preventing PICS. The ABCDEF bundle [64] is a commonly used strategy for preventing PICS which provides practical ways to promote patients’ state of being awake, autonomy, and rehabilitation. The bundle has demonstrated its effectiveness in improving the likelihood of survival, mitigating delirium, and reducing physical restraints in ICU patients [64].

Environment management is another prevention strategy, which addresses environmental factors that are hostile to ICU patients and are associated with delirium and sleep deprivation, leading to PICS [33]. Environment management includes the reduction of negative stimuli, such as excessive background noise and lighting, which has been shown to improve sleep quality [16, 46] and reduce delirium among ICU patients [81]. The provision of positive stimuli is another form of environment management. Exposure to a nature view, for example, has been shown to shorten the length of hospital stay and reduce complications among patients [78]. In addition, music intervention reduces anxiety and stress-related measures in ICU patients [41].

The ICU diary is another widely practiced intervention. It helps patients understand what happened during their ICU stay, including periods of sedation, and is effective in reducing anxiety, depression, and PTSD symptoms [33]. Nevertheless, implementing all the aforementioned interventions requires systemic changes. Furthermore, due to their non-pharmacological nature, these prevention strategies often confront constraints related to increased workload for healthcare professionals [38, 66].

Next to preventing PICS, it is also important to recognize early symptoms and provide interventions to avoid further development into, for example, acute stress disorder (ASD) and post-traumatic stress disorder (PTSD). ICU follow-up care is one of the most practiced interventions that address the psychological aspects of PICS [33, 66]. ICU follow-up care aims to support patients, with guidance, through their transition period from hospital to home. This approach has gained significant traction in both Europe and North America in recent years and has been found to mitigate symptoms of PTSD stemming from the ICU stay [35]. However, current ICU follow-up care lacks a standardized structure, resulting in variations in interventions between hospitals, which makes it difficult to track the effectiveness of these interventions [35]. On the other hand, this variability paves the way for the advancement and implementation of individualized interventions, which often adopt technology. These interventions include guided meditation in natural settings using VR, which could reduce pain and stress [62], and gamified cognitive behavioral therapies for reducing depression [45]. Furthermore, there exist various strategies to apply technology for ICU delirium prevention [36], many of which can be readily adapted to a post-intensive care environment. As such, there is the potential to enhance the rehabilitation of ICU patients through ICU follow-up care with a personalized methodology enabled by technology.

2.2 Therapeutic visual experience and personalisation

Since the study by Ulrich [78], which demonstrated the impact of having access to a nature view on enhancing the health outcomes of ICU patients, exposure to nature, and particularly its visual experiences, has gained interest as a nonpharmacological intervention [26]. Interventions are introduced wherein visual nature experiences play a role as a positive distraction [83, 85], generating positive feelings and maintaining attention for patients without inducing stress.

Next to the exposure to nature, visual art is another popularly used positive distraction in hospital settings. A growing body of evidence suggested the impact of appropriate art on the health outcomes of patients [27]. Research also found that visual art with nature content is effective in reducing stress, anxiety, and perceived pain in critically ill patients [54, 77]. Importantly, the characteristics of visual art, including subject matter and style, have been found to strongly influence its impact on patients [26, 77]. Representative pictures dominated by nature, such as landscapes with trees and water, have been shown to be more effective in reducing anxiety and pain than other types, such as art without nature content [79]. Notably, the study showed that abstract art with rectilinear and straight-edge forms brings strong negative reactions [77].

The therapeutic effects of visual nature experiences are explained by evolutionary theory [2] and the Biophilia hypothesis [87]. These theories [2, 87] propose that over millions of years of evolution, humans have become genetically predisposed to respond positively to natural settings that promoted well-being and survival for early humans. Negative effects of certain visual arts, on the other hand, can be explained by the Emotional Congruence theory [77], which suggests that our perception of stimuli is influenced by our emotional states. This implies that abstract art, being open to interpretation, can lead patients in highly stressful situations to project negative emotions onto their interpretation. Consequently, they end up experiencing adverse visual experiences, as was demonstrated in Ulrich’s study [77], where immediate removal of such pictures became necessary. These instances underscore that while the use of art in a therapeutic setting is compelling, the application of art requires careful customization and personalization to ensure effectiveness. Recognizing these needs, Hathorn [27] emphasized the importance of considering individualized needs when selecting art for healing.

Art has been utilized outside of hospital settings, such as in the treatment of PTSD. Art therapy in these contexts takes various forms, ranging from exposure to creative techniques, with debatable effects [69]. In this study, we focus on the aspects of visual arts as a positive distraction. Taking this approach, we aim to improve temporary stress and evoke positive emotions among post-ICU patients, both during their time in the post-ICU clinic and throughout the rehabilitation stage. To enhance the positive impact, we incorporate techniques from narrative therapy [48] which encourages patients to explore deeper into the meanings of the visual arts and engage in prolonged exposure to the therapeutic elements of the art. We will further elaborate our approach in the method section.

2.3 Visual art recommendation systems

VA RecSys represent an emerging field at the intersection of technology, art, and user preferences. These systems harness data-driven methodologies, particularly those powered by machine learning algorithms, to facilitate personalised art recommendations, enabling users to discover artworks that resonate with their personal aesthetic inclinations. Visual art, and paintings specifically, is both high-dimensional and semantically complex, eliciting a diverse and subjective set of emotional and cognitive reflections in users [88]. Paintings reflect the complex and intricate interplay of concepts ranging from individual ideas and beliefs to concepts with cultural and historical significance for a society [59].

With the proliferation of artworks in recent years, coupled with an increasing demand for personalization, VA RecSys has gained traction across diverse domains, spanning from online platforms to cultural heritage institutions aiming to enhance visitors’ experience and engagement. Studies like [21, 30] have shown a number of advantages and highlighted the significant role of VA RecSys. As the study of Falk et al. [22] discusses, the main motivation of museum visitors is to have fun, experience art, learn new things, feel inspired, and interact with others. Thus, when using VA RecSys-powered digital museum guides, visitors expect not only to be exposed to artwork that matches their interests but also to learn more and have access to more information [30]. Achieving this fundamentally requires VA RecSys to uncover complex semantics and abstract relationships between artworks to derive recommendations. The earliest works, such as Kuflik et al. [40] and Deladiennee et al. [15], proposed graph-based semantic VA RecSys that rely on an ontological formalisation of knowledge.

In recent years, with advancements in Artificial Intelligence (AI), and particularly Neural Networks (NN), demonstrating enormous success in capturing latent semantic structures and relationships from data, the RecSys domain started to adopt representation learning techniques [29]. Following this, a number of VA RecSys works emerged that learn latent representations from images of artworks [51, 52, 53], from textual descriptions of paintings [89, 92], and from their combinations [90, 91], demonstrating the power of NNs to derive meaningful and personalised recommendations. However, the integration of VA RecSys in healthcare, and specifically within the context of PICS rehabilitation, remains an unexplored domain. This paper aims to bridge this gap by investigating the potential of VA RecSys in aiding the art therapy process for PICS patients. By comparing VA RecSys-generated recommendations with expert-curated ones, this study seeks to enrich the current understanding of the role of VA RecSys in personalised healthcare interventions.

3 Background: Learning Latent Representations of Visual Art

Representation learning is a powerful computational concept that involves automatically uncovering the underlying structure within complex data [5]. It is a process where an algorithm learns to convert raw data inputs into more compact, meaningful, and feature-rich representations. These representations capture essential patterns, relationships, and characteristics within the data, enabling more effective analysis, understanding, and utilization [93].

In the context of VA RecSys, representation learning plays a pivotal role in converting intricate visual elements into condensed yet informative forms. This process involves training algorithms on text and/or image modalities, often based on neural networks, to recognise and extract not only distinctive features present in paintings, such as colours, shapes, and textures but also complex concepts embodied within artworks such as the emotional and cognitive reflections they trigger which are not always observable to the naked eye [90]. The learned representations encode these features in a way that captures the essence of the artwork’s semantics as well as visual identity. This goes beyond mere pixel values, as the algorithms internalise the higher-level characteristics that make each piece of art unique [91].

Based on the input data source, there are two notable paths in the representation learning literature: unimodal and multimodal approaches [11]. As discussed in section 2.3, we draw inspiration from the recent successes of VA RecSys employing NN-based representation learning, where the resulting personalised recommendations capture hidden semantics and benefit users in many ways, such as learning, discovery, enhanced engagement, and a better interaction experience. The key idea of representation learning in this setting is that the textual and visual modalities of paintings are used to learn an embedding space in which similar items are represented close to each other, as explained in the following subsections. Figures 2 and 3 summarise the painting representation learning approaches we propose and study for PICS rehabilitation therapy.

3.1 Unimodal VA representation learning

This approach extracts and encodes inherent features of paintings from a single type of data modality (i.e., image or textual description).

3.1.1 Image-based VA representation learning.

Today, image feature extraction techniques predominantly rely on pre-trained Convolutional Neural Network (CNN) architectures, such as AlexNet [39], GoogLeNet [74], and VGG [72]. An exemplar of this trend is the winner of the 2015 ImageNet challenge, ResNet, introduced by He et al. [28]. ResNet pioneered the integration of residual layers to facilitate the training of very deep CNNs, setting the record with architectures comprising over 100 layers. A prominent version of this architecture is ResNet-50, featuring 50 layers, trained extensively on a vast repository of images from the ImageNet database.¹ Consequently, ResNet-50 has assimilated intricate feature representations across a diverse spectrum of images and has showcased its potential as a superior visual feature extractor compared to other pre-trained models [4, 32, 42].

To extract latent visual features (image embeddings) from paintings, we employed the ResNet-50 model pre-trained on the ImageNet dataset. By passing each painting image through the network, we derived a convolutional feature map, resulting in a feature vector representation. Upon completing the extraction process for all m images in the dataset, we produce a matrix $\mathbf{A} \in \mathbb{R}^{m \times m}$, where each entry a(i, j) is the cosine similarity between the embeddings of images i and j. Cosine similarity is an effective metric for finding item similarities in embedding spaces and is commonly used in data mining and information retrieval [58, 91]. This matrix encapsulates the latent visual distribution across all images, serving as a foundation for calculating similarities among paintings for the VA RecSys task discussed in section 4.
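As an illustration of how such a pairwise similarity matrix can be built, the following is a minimal sketch in plain Python. The function name `cosine_similarity_matrix` and the three toy vectors are assumptions made for this example; in practice the inputs would be the feature vectors produced by the embedding model.

```python
import math

def cosine_similarity_matrix(embeddings):
    """Return the m x m matrix A with A[i][j] = cos(e_i, e_j)."""
    norms = [math.sqrt(sum(x * x for x in e)) for e in embeddings]
    m = len(embeddings)
    A = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            dot = sum(x * y for x, y in zip(embeddings[i], embeddings[j]))
            denom = norms[i] * norms[j]
            # Guard against all-zero vectors, whose cosine is undefined.
            sim = dot / denom if denom > 0 else 0.0
            A[i][j] = A[j][i] = sim  # cosine similarity is symmetric
    return A

# Toy 3-dimensional vectors standing in for real image embeddings.
toy_embeddings = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
A = cosine_similarity_matrix(toy_embeddings)
```

The diagonal of A is 1 for every non-zero vector, and orthogonal embeddings (here the first and third) receive a similarity of 0.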

3.1.2 Text-based VA representation learning.

Learning latent representations of paintings from their textual descriptions was proven a powerful technique to uncover hidden semantic concepts that are embodied across artwork [84, 89, 90]. In this work, we adopt two of the popular text-based representation learning approaches that demonstrated success in VA RecSys tasks namely, Latent Dirichlet Allocation (LDA) and Bidirectional Encoder Representations from Transformers (BERT).

LDA. Our first text-based VA RecSys approach is LDA, an unsupervised generative probabilistic model proposed by Blei et al. [7]. LDA attempts to model a collection of observations as a composite of distinct categories, or topics. In this context, each observation corresponds to a document, and the features are the presence, occurrence, or count of words, while the categories constitute the underlying topics. Notably, the specifics of the topics are not predefined; only the number of topics is chosen beforehand. These topics are learned as probability distributions over the words within each document.

The procedure for constructing an LDA model within the VA RecSys framework is as follows. We begin by curating a collection of documents, each containing textual information about an individual painting. Subsequently, a desired number of topics, denoted as k, is determined, and each word w within the document collection is assigned to a topic. This assignment is guided by θ_d ∼ Dir(α), where θ_d signifies the topic distribution for a document d, α is the parameter of the Dirichlet prior on the per-document topic distributions, and Dir(α) denotes a Dirichlet distribution spanning the k topics, indexed by i ∈ {1, …, k}. The learning phase involves computing the conditional probabilities P(t|d) (representing the likelihood of topic t given document d) and P(w|t) (indicating the likelihood of word w given topic t). A comprehensive discourse on LDA topic modeling is presented in [7] and [34]. Upon completing the training of the LDA model over the entire textual dataset of m documents, one per painting, a matrix $\mathbf{A} \in \mathbb{R}^{m \times m}$ is generated. Each entry a(i, j) within this matrix corresponds to the cosine similarity between the embeddings of documents i and j. This matrix encapsulates the latent distribution of topics across all documents, which is utilised for calculating semantic similarities among paintings to derive recommendations.
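For reference, the generative view behind these quantities can be written compactly as below. This is a sketch in commonly used LDA notation rather than the exact formulation of this paper: θ_d denotes the topic distribution of document d, and the symbols z_{d,n} (topic assigned to the n-th word of document d) and φ_t (word distribution of topic t) are assumptions introduced here for illustration.

\begin{align*}
\theta_d &\sim \mathrm{Dir}(\alpha), \qquad d = 1, \dots, m, \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), \\
w_{d,n} \mid z_{d,n} = t &\sim \mathrm{Categorical}(\phi_t), \\
P(w \mid d) &= \sum_{t=1}^{k} P(w \mid t)\, P(t \mid d) = \sum_{t=1}^{k} \phi_{t,w}\, \theta_{d,t}.
\end{align*}

Under this reading, the k-dimensional vector θ_d is a natural choice for the document embedding that enters the cosine similarity computation.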

BERT. Similarly, for the second approach with BERT, we start by curating documents for each painting. Then the feature learning process goes through three distinct phases. Firstly, we transform each painting document into an embedding representation by leveraging the pre-trained SBERT large language model.² This transformation maps sentences and paragraphs into a 384-dimensional dense vector space [68]. Secondly, we employ the uniform manifold approximation and projection (UMAP) algorithm [50] to reduce the dimensionality of these embeddings. UMAP, a dimension reduction technique, facilitates the transformation of multi-dimensional data points into a two-dimensional space. This step enhances efficiency while preserving the original embeddings’ overarching structure. Thirdly, we leverage the HDBSCAN algorithm [9], a soft-clustering technique, to semantically cluster the reduced embeddings. HDBSCAN avoids the misallocation of unrelated documents to clusters, thus enhancing the quality of clustering outcomes.

From these clusters, we extract latent topic representations using a custom class-based term frequency-inverse document frequency (c-TF-IDF) algorithm. This algorithm generates importance scores for words within a topic cluster. The essence of c-TF-IDF lies in its capacity to provide topic descriptions by identifying the most important words within a cluster. Words with high c-TF-IDF scores are selected for each topic, thereby creating topic-word distributions for every document cluster. A more detailed discussion of our topic modeling strategy with BERT can be found in the work of Grootendorst et al. [25]. Similar to the LDA approach, upon completing the training of the BERT model across the entire textual dataset of size m, we produce a matrix $\mathbf{A} \in \mathbb{R}^{m \times m}$. Each entry a(i, j) within this matrix is the cosine similarity between the embeddings of documents i and j. As with the LDA approach, this similarity matrix captures the latent distribution of topics throughout all documents. Thus, it can be utilised to compute similarities of paintings for a recommendation task.
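To make the c-TF-IDF idea concrete, here is a minimal sketch in plain Python. It assumes the weighting used in the topic modeling approach of [25], in which the frequency of a word within a cluster is multiplied by log(1 + A / f), where A is the average number of words per cluster and f is the frequency of the word across all clusters. The function names and the toy clusters are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def c_tf_idf(clusters):
    """clusters: dict mapping cluster id -> list of tokenised documents.
    Returns dict mapping cluster id -> {word: c-TF-IDF score}."""
    # Treat every cluster as a single concatenated "class document".
    class_counts = {
        c: Counter(tok for doc in docs for tok in doc)
        for c, docs in clusters.items()
    }
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    avg_words = sum(total_counts.values()) / len(class_counts)
    scores = {}
    for c, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[c] = {
            # Normalised in-class frequency times the class-based IDF term.
            w: (tf / n_words) * math.log(1 + avg_words / total_counts[w])
            for w, tf in counts.items()
        }
    return scores

def top_words(scores, cluster_id, n=3):
    """Return the n highest-scoring words of a cluster (ties broken alphabetically)."""
    ranked = sorted(scores[cluster_id].items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]

# Made-up token lists standing in for pre-processed painting documents.
toy_clusters = {
    0: [["river", "tree", "landscape"], ["tree", "sky", "landscape"]],
    1: [["saint", "virgin", "child"], ["saint", "angel", "landscape"]],
}
toy_scores = c_tf_idf(toy_clusters)
```

In this toy example, "landscape" occurs in both clusters and is therefore down-weighted relative to words that are specific to one cluster.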

Figure 2:

3.2 Multimodal VA representation learning

This approach combines information from multiple data sources, like images and associated textual descriptions to create a unified representation space [65]. This joint embedding enables the exploration of the interconnectedness between the inherent attributes of each modality. The latent features extracted from images and textual descriptions are mapped into the same embedding space, ensuring that semantically similar images and corresponding textual descriptions are brought closer together [44]. This synergy between textual narratives and visual aesthetics enhances the potential for various applications in interpreting artworks. Among the different approaches in the literature, we use Bootstrapping Language-Image Pre-training (BLIP) [43], which has demonstrated superior performance in various downstream tasks, including VA RecSys.

BLIP is a technique that trains neural networks by combining language and image data. It trains a model to predict either an image or text given the other, in order to improve the model’s understanding of multimodal relationships. BLIP uses a unified encoder-decoder model that can operate in three modes. The first mode, the unimodal encoder, encodes image and text separately. The second mode, the image-grounded text encoder, uses cross-attention to inject visual information into the text encoder. The third mode, the image-grounded text decoder, replaces bi-directional self-attention layers with causal self-attention layers. During pre-training BLIP optimizes three objectives: Image-Text Contrastive Loss (ITC), Image-Text Matching Loss (ITM), and Language Modeling Loss (LM). ITC aligns the visual and text transformers by encouraging similar representations for positive image-text pairs and dissimilar representations for negative pairs. ITM classifies whether image-text pairs are positive or negative. LM generates textual descriptions based on images.

For our VA representation learning task, we utilized the pre-trained BLIP model as a multimodal feature extractor. First, we extract multimodal features and use the ITM head to compute ITM scores for each painting, generating probability-matching scores for each image-text pair. Then, we compute a matrix $\mathbf{A} \in \mathbb{R}^{m \times m}$ where each entry A_ij is the probability matching score between the joint painting embeddings, which can be used to compute similarities for VA RecSys tasks. See Figure 3 for an illustration of our multimodal approach to learning latent semantic representations of paintings with BLIP.

Figure 3:

4 Method: Personalised Visual Art Recommendation for PICS Therapy

To enhance the potential therapeutic benefits of a visual art experience, we designed a personalized guided art therapy. To tap into participants’ latent needs and preferences, the process starts by inviting them to choose their preferred painting from a set of sample paintings. Based on their selection, participants are subsequently presented with a set of three paintings, carefully chosen to align with their preferred sample painting. This inclusion of multiple paintings is intended to extend the duration of exposure to the paintings. Additionally, for each of these selected paintings, we provide accompanying text guidance to facilitate active engagement and thoughtful reflection among participants. In this study, we explore two different approaches to personalised VA recommendation for PICS art therapy: expert-based and VA RecSys-based recommendation.

4.1 Expert recommendations

This established approach involves the curation of recommendations by experienced clinicians. Their expertise allows them to choose artworks that elicit desired emotional and psychological responses, in line with PICS rehabilitation therapy objectives. The recommendation procedure is primarily guided by the patients’ preference among a list of alternative paintings presented to them, in order to identify the paintings that resonate most profoundly with their journey towards recovery. Following this, clinicians select a set of recommended paintings that evoke similar emotions and moods and align with the therapeutic goals of the recovery journey.

4.2 VA RecSys recommendations

We consider the unimodal and multimodal representation learning techniques discussed in section 3, which can learn features from the textual and visual modalities of paintings. In particular, we study four models: LDA and BERT to learn text-based representations, ResNet for image-based representations, and BLIP for the fusion of the two modalities.

Let P = {p₁, p₂, …, p_m} be a set of painting images and $\mathcal{P} = \lbrace \boldsymbol{p}_1, \boldsymbol{p}_2, \dots, \boldsymbol{p}_m \rbrace$ be the associated embeddings of each painting according to LDA, BERT, ResNet or BLIP. Once the dataset embeddings (latent feature vectors) are learned using any one of these models, we compute the similarity matrix A for all the paintings. Next, the preference of a patient user u is modelled by computing a ranking score for the paintings in the dataset according to their therapeutic relevance (i.e. similarity to the painting p_j that the user indicated would support their recovery). Thus, the predicted score S^u(p_i) that the user would give to each painting p_i in the collection P is calculated as:

\begin{equation} S^u(p_i) = d(\boldsymbol{p}_i, \boldsymbol{p}_j) \tag{1} \end{equation}

where $d(\boldsymbol{p}_i, \boldsymbol{p}_j)$ is the cosine similarity between the embeddings of paintings p_i and p_j, read from the computed similarity matrix. Once the scoring procedure is complete, the paintings are sorted by score and the r most similar paintings constitute a ranked recommendation list. Figure 4 summarises our VA RecSys-based recommendation pipeline.
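The scoring and ranking step can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function name `recommend` and the toy similarity matrix are hypothetical, and excluding the selected painting from its own recommendation list is an assumption. The default r = 3 mirrors the set of three paintings presented to participants.

```python
def recommend(similarity, selected, r=3):
    """Return the indices of the r paintings most similar to `selected`.

    similarity: m x m matrix where similarity[i][j] scores paintings i and j.
    selected:   index j of the painting the user chose.
    """
    # Score every other painting by its similarity to the selected one.
    # Assumption: the selected painting itself is not recommended back.
    scores = [
        (similarity[i][selected], i)
        for i in range(len(similarity))
        if i != selected
    ]
    # Sort by descending score; ties are broken by ascending index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:r]]

# Toy symmetric similarity matrix for four paintings (made-up values).
toy_A = [
    [1.0, 0.8, 0.1, 0.5],
    [0.8, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.9],
    [0.5, 0.2, 0.9, 1.0],
]
top_for_first = recommend(toy_A, selected=0, r=2)
top_for_third = recommend(toy_A, selected=2)
```

Because the function only reads one column of the matrix, the same code applies regardless of which of the four engines produced the similarity scores.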

Figure 4:

5 Materials

5.1 Sample paintings for preference elicitation

To elicit user preferences, we offer sample paintings, allowing users to provide their input by choosing one of them. These sample paintings were derived from a pre-study approved by the Ethics Review Panel of the University of Twente, conducted with a total of 186 former patients, including 10 former ICU patients.

In the pre-study, an initial selection of 18 nature paintings associated with positive emotions, such as relaxing, cheerful, and awe-inspiring, was made from WikiArt³ by an academic expert in affective design. These paintings varied in style, including different levels of abstraction. Subsequently, this set of paintings was narrowed down to 6, each of which was found to strongly evoke one or multiple positive emotions. This selection process involved evaluation by three experts in affective psychology, environmental psychology, and affective and healthcare design, each with over 15 years of experience in their respective fields.

Using these final six paintings, we conducted an online survey asking participants to choose the painting that would best support their recovery. The results of the pre-study showed that Painting 2 was chosen most often (n=63), followed by Painting 1 (n=32) and Painting 3 (n=32). These three paintings are shown in Figure 4, left column (labelled as ‘sample paintings’), from top to bottom. In the present study, we use these three most frequently selected paintings as sample paintings for preference elicitation to derive personalised recommendations. These paintings encompass diverse styles and are believed to convey emotions associated with calmness, restoration, and cheerfulness, respectively, providing a comprehensive representation of the healing experience.

5.2 Dataset for generating recommended paintings

We used a dataset combining the nature paintings from the pre-study with a collection of 2,368 paintings from the National Gallery, London,⁴ provided through the CrossCult Knowledge Base.⁵ Every painting image in the dataset is accompanied by a complementary set of text-based metadata. This configuration makes the dataset well-suited for evaluating the proposed feature learning methodologies. A representative data point is illustrated in Figure 5.

Figure 5:

For acquiring textual features via LDA and BERT models, we performed pre-processing of the painting metadata. This encompassed concatenating text fields, excluding punctuation symbols and stop-words, transforming to lowercase, and applying lemmatization. Conversely, for the acquisition of visual features through the ResNet model, we utilized the authentic painting images.⁶ This procedure involved extracting convolutional feature maps using the pre-trained ResNet-50 model. For the multimodal feature learning with BLIP, both pre-processed text and image data sources were jointly utilized, as elaborated in Section 3.
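The text pre-processing steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions: the stop-word list is a tiny placeholder, the metadata record and its field names are hypothetical, and lemmatization is left out because it requires an external linguistic resource.

```python
import string

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "with", "by", "is"}

def preprocess(metadata, fields):
    """Concatenate text fields, lowercase, strip punctuation, drop stop-words."""
    text = " ".join(str(metadata.get(f, "")) for f in fields)
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Lemmatization is omitted here; it would be applied to each token at this point.
    return [tok for tok in text.split() if tok not in STOP_WORDS]

# Hypothetical metadata record, not taken from the dataset.
record = {
    "title": "Landscape with a River",
    "description": "The field is painted in thick, swirling strokes.",
}
tokens = preprocess(record, ["title", "description"])
```

The resulting token list is the kind of input that a topic model or a sentence encoder would then turn into a text-based embedding.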

5.3 Ensuring a safe and sensitive deployment: Expert evaluation of VA RecSys engines

Before delving into our comprehensive study involving end-users, particularly PICS patients, we conducted a pilot test with experts from diverse relevant domains, as a proactive measure to ensure their safety and well-being. The primary objective was to preempt any potential harm or inadvertent elicitation of negative emotions that might arise from the recommendations generated by the VA RecSys engines. The core purpose of this usability test was to ascertain the suitability of VA RecSys engine-generated recommendations for deployment without requiring human intervention.

5.3.1 Apparatus.

We created an overview of sample paintings and their respective recommendations generated by each of the engines. The recommendation engines were anonymized. Participants were provided one set of recommendations from each VA RecSys engine at a time.

5.3.2 Participants.

Four experts well-versed in diverse domains were recruited, including ICU nursing, healthcare design research, and affective design research.

5.3.3 Design.

The experts were exposed to all recommendation pipelines (within-subjects design) and were asked to assess the appropriateness of each of the top-3 recommended paintings from all pipelines for all samples.

5.3.4 Procedure.

Participants were invited to a one-to-one interview to assess the recommendations of the VA RecSys engines. There, they were informed about the purpose of the study. Participants were then provided with the individual recommended paintings for each sample. For each recommended painting, they were asked to respond to the question “To what extent do the recommended painting align with the original painting in terms of the overall experience they evoke? This experience can be visual, semantic, or a combination of both.” They responded on a 1-5 scale (1: not at all, 5: very much), as reported in Table 1. Furthermore, they were asked to give their expert opinion on the appropriateness of the recommended paintings for the purpose of therapy.

5.3.5 Results.

By leveraging the expertise of recruited professionals, we meticulously assessed the engines to identify those that align harmoniously with therapeutic goals and emotional well-being. The insights garnered from this test served as a foundation for selecting engines that could be employed with a higher degree of confidence, thereby fostering an environment of safety and sensitivity in our subsequent studies involving PICS patients.

From the experts’ evaluation, we observed that neither of the text-based VA RecSys engines was suitable for deployment in a real-world application, as their recommendations were found to be dark and to evoke negative emotions. Figure 6 shows examples of the comments provided by the experts on these recommendations. As can be seen, some of the recommended images were too dark, which could elicit fear, or contained images of violence that could trigger traumatic experiences. Following this pilot, we decided to proceed with only three recommendation methods for our user study: Expert, Visual and Multimodal. Figure 7 shows examples of the top-2 recommendations from each of the expert-validated approaches.

Table 1:

Expert	Painting	Fusion BLIP	Image ResNet	Text BERT	Text LDA
Expert 1 (Affective Design, ICU research, +10)	Painting 1	2	3	1	1
	Painting 2	3	2	3	1
	Painting 3	2	3	2	1
Expert 2 (Affective Design, +10)	Painting 1	3	2	2	1
	Painting 2	3	2	1	2
	Painting 3	2	4	2	1
Expert 3 (Affective Design, +10)	Painting 1	2	4	1	1
	Painting 2	1	1	2	4
	Painting 3	2	3	1	1
Expert 4 (ICU Nurse, +10)	Painting 1	1	4	2	2
	Painting 2	4	2	3	3
	Painting 3	2	3	2	1
Total		27	33	22	19

Table 1: In our pilot test, the experts were shown 3 paintings from each of the VA RecSys engines for all sample paintings.
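As a quick arithmetic check, the totals in the last row of Table 1 can be recomputed from the individual ratings. The snippet below simply copies the ratings from the table in row order; the per-engine mean is an additional summary that is not reported in the table itself.

```python
# Ratings copied from Table 1, in row order (Expert 1..4, Painting 1..3).
ratings = {
    "Fusion BLIP":  [2, 3, 2, 3, 3, 2, 2, 1, 2, 1, 4, 2],
    "Image ResNet": [3, 2, 3, 2, 2, 4, 4, 1, 3, 4, 2, 3],
    "Text BERT":    [1, 3, 2, 2, 1, 2, 1, 2, 1, 2, 3, 2],
    "Text LDA":     [1, 1, 1, 1, 2, 1, 1, 4, 1, 2, 3, 1],
}

# Column totals, as in the "Total" row of the table.
totals = {engine: sum(scores) for engine, scores in ratings.items()}

# Mean rating per engine on the 1-5 scale (derived here, not part of the table).
means = {engine: sum(scores) / len(scores) for engine, scores in ratings.items()}
```

The recomputed totals agree with the table (27, 33, 22 and 19), and both text-based engines have a mean rating below 2 on the 1-5 scale.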

Figure 6:

Figure 7:

6 Evaluation: User Study

The main goal of our evaluation was to understand users’ perception of the quality of our studied VA recommendation strategies for PICS rehabilitation therapy, and ultimately to assess their efficacy in supporting the healing journey of PICS survivors. We conducted a large-scale user study, described in the following subsections, that was approved by the Ethics Review Panel of the University of Twente.

6.1 Apparatus

We designed an online guided therapy survey using Google Forms⁷ that first elicited the preferences of participants by providing them with a set of paintings to choose from that resonates with their healing journey. Then participants were taken through a guided therapy session by using three paintings that were recommended based on the elicited preference choices.

Figure 8:

6.2 Participants

We recruited a representative pool of N = 150 participants via the Prolific crowdsourcing platform.⁸ We identified people who had past conditions that may potentially lead to PICS, such as COVID-19 patients who were treated in a hospital, cancer survivors, or people who had undergone surgery for other reasons. For this study, we focused on post-COVID patients, as the likelihood of having developed PICS recently is higher. The main screening criterion was “I have been officially diagnosed with COVID-19 (tested by a licensed medical professional), and was treated in a hospital.” Furthermore, we enforced other eligibility criteria: being fluent in English, having a minimum approval rate of 100% in previous crowdsourcing studies on the platform, and having been active on the platform in the last 90 days.

Our recruited participants (74 female, 76 male) were aged 32.7 years on average (SD=10.9) and could complete the study only once. Most of them lived in the UK (40 participants), South Africa (35), or the USA (28). Most participants had been in a general ward or a medical/surgical unit (67), in an ICU (44), or in an Emergency Room (37). For most participants, the duration of their hospital stay was less than a week (71) or between 1 and 2 weeks (50). Most participants declared that they suffered from anxiety (130) and/or depression (127), indicating the presence of psychological components related to PICS symptoms. The study took a median time of 23 min to complete and participants were paid an equivalent hourly wage of $12/h. We also administered the Patient Health Questionnaire-4 (PHQ-4) [47] to collect signs of psychological symptoms related to PICS: anxiety and depression. Most participants (86%) exhibited at least one symptom, with the majority (70%) showing symptoms of both conditions.

6.3 Design

Following the expert evaluation of our developed VA RecSys engines discussed in section 5.3, we deployed two engines together with the expert recommendations: visual (ResNet-based VA RecSys) and multimodal (BLIP-based VA RecSys). Each participant was only exposed to the recommendations generated by one of the three groups (between-subjects design) and then went through guided art therapy. Each group comprised 50 participants.

6.4 Procedure

We assessed baseline and post-test affective states using two different measures: the Pick-A-Mood tool [17] for moods and the short version of the Positive and Negative Affect Schedule (PANAS) scale [86] for emotions. To reduce the cognitive load on participants and to be more efficient, we selected 10 items that are relevant to PICS (for negative emotions) and patient well-being (for positive emotions). In particular, we considered 5 positive and 5 negative items (instead of 10 positive and 10 negative items) together with a neutral item to assess the affective state of participants before and after guided art therapy.

Guided art therapy involves asking questions that encourage participants to engage with the paintings, such as “Imagine yourself entering the painting and exploring it. How did you feel while spending time in this painting?” Participants were prompted to reflect on their experience and describe it in three to four sentences.

Participants rated the provided painting recommendations on a 5-point Likert scale. Our dependent variables are widely accepted proxies of recommendation quality [63]:

Accuracy:

The paintings match my personal preferences and interests.

Diversity:

The paintings are diverse.

Novelty:

I discovered paintings I did not know before.

Serendipity:

I found surprisingly interesting paintings.

We also collected two dependent variables that inform to what extent the recommended paintings contributed to a sense of immersion and engagement:

Immersion:

How much do the recommended paintings contribute to your sense of immersion, making you feel deeply involved or absorbed in the artwork?

Engagement:

To what extent do the recommended paintings contribute to your feeling of engagement, capturing your attention and generating a sense of involvement or interest?

6.5 Results

6.5.1 Comparison of recommended groups: Expert, Visual, and Multimodal.

We investigated whether there were differences between the three recommendation groups (i.e. Expert, Visual, and Multimodal). We used a linear mixed-effects (LME) model where each dependent variable is explained by each recommendation group. Participants are considered random effects. An LME model is appropriate here because the dependent variables are discrete and have a natural order.

We fitted the LME models (one model per dependent variable) and computed the estimated marginal means for specified factors. We then ran pairwise comparisons (also known as contrasts) with Bonferroni-Holm correction to guard against multiple comparisons.
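To make the multiple-comparison step concrete, the following sketch applies the Holm step-down adjustment to a set of raw p-values. It is a generic illustration of the procedure, not our analysis script; the function name and the example p-values for the three pairwise contrasts are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce that adjusted p-values never decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three pairwise contrasts
# (Expert vs Visual, Expert vs Multimodal, Visual vs Multimodal).
raw = [0.04, 0.01, 0.30]
adj = holm_adjust(raw)
```

With three contrasts, the smallest raw p-value is multiplied by 3, the next by 2, and the largest by 1, which is less conservative than multiplying every p-value by 3 as in the plain Bonferroni correction.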

Analysis of recommendation quality measures. Figure 9 shows the distributions of user ratings for the user-centric dependent variables of recommendation quality. Differences between groups were not statistically significant in any case, with small to moderate effect sizes. The largest effect sizes were observed when comparing Expert and Multimodal recommendations in terms of diversity (r = 0.177) and serendipity (r = 0.119).

Figure 9:

Analysis of changes in mood. Figure 10 shows the mood changes before and after therapy for the three recommendation groups considered in our study. In all three groups, a mood enhancement effect of guided art therapy was observed. Compared with the baseline, where 44.6% of participants were in a negative mood, after guided art therapy the majority of participants reported being in a positive mood (70.5%), with only a minority remaining in a negative (13.6%) or neutral (15.8%) mood.

Figure 10:

Figure 11 shows the change in scores after therapy, aggregated according to the ten items of the PANAS scale. Differences between groups were not statistically significant in any case, with small to moderate effect sizes. According to an item-independent analysis, the largest effect sizes were observed for the ‘afraid’ item, when comparing Expert recommendations against Visual (r = 0.181) and Multimodal (r = 0.175) recommendations, followed by the ‘scared’ item when comparing Expert and Visual recommendations (r = 0.122). Upon further examination, users in the Visual group did not change their scores for the ‘afraid’ item (the median difference is 0). This was also the case for users in both the Visual and Multimodal groups with regard to the ‘scared’ item.

Figure 11:

6.5.2 Analysis of user reflections.

We conducted sentiment analysis on the reflections of participants to gain deeper insights into their experience. We used the pre-trained transformer-based sentiment analysis model bert-large-uncased-sst2 from the Hugging Face Transformers library.⁹ This model is a fine-tuned version of bert-large-uncased that was trained on the Stanford Sentiment Treebank v2 (SST2),¹⁰ part of the General Language Understanding Evaluation (GLUE) benchmark.¹¹ The underlying model is well-suited to a wide range of NLP tasks due to its large size and general language understanding capabilities, and it has performed strongly in sentiment analysis tasks. The results of our sentiment analysis, illustrated in Figure 12, indicate overwhelmingly positive sentiments expressed by participants in response to their interaction with the recommended paintings. A subtle trend nevertheless emerged: in the Expert group, 4% of sentences conveyed negative reflections, while sentences from the Visual group exhibited a slightly lower negativity rate of 2%. Sentences from the Multimodal group contained no negative sentiments at all, echoing the pattern observed in the mood change, PANAS, and recommendation quality measures. Figure 13 shows some sample sentences from the positive and negative reflections of participants per group on a 2D projection map of all reflection sentences, obtained with the non-linear projection algorithm t-SNE [82].

Figure 12:

Figure 13:

Table 2:

Themes	Theme Description	Example Quotes
Hope and Purpose	Drawing one’s attention toward more positive prospects, reminding them of hope and purposefulness.	“The style of the painting makes it more dynamic and the colors also make it more appealing.... This color palette really brightens my mood and puts me in a better place mentally.... I felt happy and hopeful about the future. I was in a very good mood.” P07
Rejuvenation	Supporting one to feel recharged through a sense of being carefree, calm, and relaxed.	“In this painting, I felt calm, relaxed, and refreshed.... I felt my spirit being re-energized as I cast away my concerns and worries.” P21
Engagement	Supporting one to immerse into the visual art by triggering one’s attention and interest.	“... the shadows and the light are done perfectly to give me that feeling of being there, actually seeing myself walking down the road on my way to the dam. This is a place I know in my mind, thus making the painting very engaging to me.” P35
Safety	Promoting a sense of safety through elements that signal a safe and familiar environment.	“It (the scenery in the painting) is very comforting and close to home. The subject matter is also important because it represents a situation where I’m in someone’s company.... I felt safe and at ease because I’m in good company and surrounded by nature.” P04
Sensory Pleasure	Providing pleasure through rich sensory stimulation that either directly comes from visual art or is derived from memory through visual triggers.	“In this painting, I found myself standing on the edge of a serene, sun-kissed meadow, surrounded by vibrant wildflowers swaying gently in the breeze. The colors were so vivid that I could practically feel the warmth of the sun on my skin and the softness of the grass beneath my feet.... I began to meditate, allowing the beauty of the painting to fill my senses.” P42
Relevance	Presenting subject matter relevant to one’s specific situation that can stimulate memories or constructive reflections.	“The scenery and style helped me remember previous times I have been in this situation. It made me think of all the good times and look forward to more good times. It was a very positive way to pass the time and think about what I need to be focusing on in life.” P18
Personal Preference	Increasing pleasure with visual stimulation that meets one’s preference.	“... since I love to visit places with a lot of forest, where there is not much noise and if there is noise, it is of nature, this painting helped my experience a lot.” P74

Table 2: Themes and Example Quotes from Some Participants

While the sentiment analysis offered valuable insights, it is important to acknowledge that our pre-trained sentiment analysis model, based on the general-purpose language model BERT, may not capture all sentiment subtleties, particularly those related to healing. To account for this, we further conducted a qualitative evaluation through reflexive thematic analysis (RTA) [8] to identify healing elements in the recommended artworks. The participant quotes were reviewed and labeled through open coding. These codes captured information related to the elements that contribute to patients’ affective states. The results of open coding were re-analyzed to identify important and general concepts. These concepts were interpreted and categorized into a total of seven themes based on how they contribute to eliciting positive emotions and moods: hope, rejuvenation, engagement, safety, sensory pleasure, relevance, and personal preference (see Table 2 for the list of themes and example quotes). The identified themes show different elements that contribute to the healing of individuals, each leading to a unique path to healing. We observed that these paths involve symbolic associations, such as hope and purpose, for some individuals, while for others they involve aesthetic values, such as sensory pleasure or personal preference. Therefore, these themes can be used as criteria to select healing paintings tailored to the needs of individuals: one who is drained might seek rejuvenation, while one who feels insecure might seek safety or familiarity. These findings underscore the importance of personalizing healing art for therapeutic purposes. Most identified themes, such as engagement, safety, sensory pleasure, familiarity, and personal preference, echo the relaxing visual nature elements found in other studies [2, 37, 87]. Hope and purpose are often emphasized as core elements in one’s coping and healing process [1].
This analysis also revealed the significance of including these elements in art for it to function as a healing mediator. We observed that the absence of such elements led to rather negative affective experiences in all three groups (Expert, Visual, and Multimodal). An absence of engagement, for instance, contributed to boredom: “This one just doesn’t draw me in, (…) I would find it very boring and not part of any healing experience” (Participant 54 from Visual group). Likewise, an absence of hope led to feeling tense: “This image doesn’t really inspire hope. Instead, it brings pessimistic thoughts. I hope for a more uplifting and positive atmosphere during my time here” (Participant 122 from Multimodal group). An absence of safety led to feeling gloomy: “This painting evokes the feeling of being lost. I felt like there was nowhere to hide so I just had to face whatever feeling I had” (Participant 25 from Expert group).

7 Discussion

Based on our findings, we can give a positive answer to the research question posed at the beginning of this paper. That is, VA RecSys algorithms can indeed support the rehabilitation of post-ICU patients using art therapy. This has important implications on several fronts, as we discuss below.

7.1 Personalised visual art as PICS intervention

We have explored the potential of art therapy as a PICS intervention and have tested VA RecSys as a means to personalize visual art for this goal. Overall, we found that this comparatively new approach to using art, which combines narrative techniques with personalized recommendations, allowed participants to engage with various healing elements in the artworks. Additionally, our findings show that personalized guided art therapy is effective in temporarily alleviating negative emotions and enhancing positive emotions as well as enhancing mood states. This suggests that by increasing its duration and dosage, it has the potential to address the psychological aspects of PICS as an intervention, which could potentially result in more lasting effects in enhancing the affective state of patients. Importantly, we utilized nature-based artwork in this study. While the use of nature-based artwork to support former ICU patients is a novel approach, previous studies have demonstrated the therapeutic effects of nature-based visuals in various forms, ranging from a real nature view [78] to static as well as dynamic versions of virtual nature [31, 49, 83]. The results of our study contribute to the ongoing research efforts in applying nature-based visuals for therapeutic purposes, demonstrating their potential to support the healing process of patients and enhance their psychological well-being. Furthermore, in line with a recent study [37] that has highlighted the influence of personal characteristics on the impact of visual nature experiences, our study suggests the potential for a higher level of personalization with the support of RecSys.

Finally, we should mention that the process of guided art therapy in this study involved an expert solely during the preparation phase; the therapy session itself ran without expert involvement. This suggests the potential for developing guided art therapy as an intervention for remote and self-administered use, which could help alleviate the primary constraint of current PICS interventions, namely their high demand on healthcare professionals.

7.2 Crossing boundaries: VA RecSys - From entertainment to therapy

VA RecSys engines originally emerged as a means to enhance user experience in the entertainment field. Particularly, recent approaches boosted by machine learning techniques have undoubtedly demonstrated their potential in supporting users such as museum visitors and art enthusiasts to discover art pieces that are tailored to their personal preferences and interests. Furthermore, their ability to uncover complex semantics embodied within visual art made them powerful tools to support learning and discovery by exposing users to novel content. While the art entertainment industry has benefited from these advancements, our study sheds light on the remarkable potential of VA RecSys to transcend the space of entertainment, emerging as therapeutic tools within the healthcare domain. In the field of art entertainment, users seek diversion, enjoyment or relaxation while service providers strive to not only enhance user engagement and satisfaction but also drive up revenue. VA RecSys has long been at the forefront, seamlessly aligning these dual objectives.

In stark contrast to entertainment, therapy serves a deeper purpose; it is a journey of healing and self-discovery. The use of art in therapy has been proven to provide individuals with a unique medium to express complex emotions, confront traumatic experiences, and embark on the path to recovery. Art therapy, in particular, has emerged as a powerful tool in the hands of trained professionals to address a wide range of psychological and emotional challenges. The key to effective therapy lies in personalization and relevance. Patients seek a therapeutic experience that resonates with their individual needs and experiences. Thus, a careful selection of paintings tailored to the individual patient speaks to their unique circumstances, fostering self-reflection and healing. By introducing VA RecSys to the domain of therapy, we have showcased its remarkable potential to assist professionals not just in curating personalized artworks from a vast selection but also in delivering precise, tailored treatment to patients.

The intent here is not mere engagement, as in the domain of entertainment, but rather the transformation of the individual’s affective state and well-being. Thus, the adoption of VA RecSys algorithms from entertainment to the context of therapy requires rigorous quality control before they are deployed in a patient-facing system. As our pilot test in Section 5.3 showed, not all top-performing VA RecSys engines in the entertainment domain were found to be appropriate for the purpose of therapy. In particular, recommendations from our text-based engines, BERT and LDA, tended to contain paintings featuring content that evokes negative emotions, with potentially harmful consequences. Therefore, we underscore the importance of acknowledging the risks involved and taking the necessary precautions when adopting these algorithms. In contrast, our image-based and fusion-based engines produced paintings that were deemed appropriate by experts, and our results also indicate that they were even perceived to support healing better than expert-curated recommendations. These promising results may indicate the potential of a Human-in-the-Loop approach wherein experts fine-tune the recommendations generated by VA RecSys engines. While experts play a crucial role in ensuring the quality of paintings, VA RecSys could significantly reduce their workload (e.g., sifting through thousands of individual paintings from a database), which is reportedly a concern [10, 61], thereby enhancing the potential for scaling up guided art therapy to bring benefits to more patients. This is an exciting opportunity for Human-AI collaboration in future work.

7.3 Looking ahead: Potential of VA RecSys in healthcare beyond PICS intervention

Our exploration of VA RecSys in the context of PICS is but a glimpse into the vast potential of this innovation within healthcare. Particularly, in light of our promising results observed in PICS treatment, one natural extension of this approach is to implement VA RecSys-assisted visual art into the ICUs (see Figure 14-a). This could support PICS prevention and the well-being of patients by providing essential emotional support (e.g., reducing fear and anxiety) through recommending personalized art.

The adaptability of VA RecSys-assisted art therapy holds promise in areas far beyond the boundaries of intensive care where the use of visual content as a positive distraction is already active. For instance, the use of projection creates a more relaxing experience in an MRI room where patients can get easily worried and feel discomfort (see Figure 14-b). In the context of residential care, as another example, a virtual window or digital frames are used to support cognitive activation and recovery (see Figure 14-c). The implications of our findings extend the potential of VA RecSys engines in the intersection of AI and healthcare. The role of VA RecSys engines in facilitating the use of visual art as a positive distraction is merely the tip of the iceberg, hinting at a future where technology enables more holistic and personalized care, amplifying our capacity to heal and connect on a profound level.

Figure 14:

8 Limitations and Future Work

While our study highlights the potential of VA RecSys within a therapeutic context, we acknowledge certain limitations and chart out promising directions for future research. Firstly, we have observed significant disparities when using VA RecSys in therapeutic context compared to its conventional application in entertainment. While our study has shed light on these distinctions, it has also highlighted risks associated with therapeutic use, especially when leveraging text-based models. This may partially be attributed to the quality of the text data source.

Data quality, underpinning VA RecSys recommendations, plays a pivotal role in its effectiveness. We have employed artist-curated descriptions of 2,368 paintings from the National Gallery dataset, but it is evident that these descriptions may not fully encapsulate the intricate affective attributes of the artworks. Thus, improving these models for therapeutic purposes can potentially be achieved by curating richer, more comprehensive affective descriptions. Although this entails substantial content curation efforts and a thorough evaluation, we believe it is a worthwhile endeavour. For future work it would also be beneficial to implement tree-based indexing data structures to scale up more efficiently for larger datasets.

Another limitation is that our current preference elicitation method relies on users selecting a single preferred painting, which may oversimplify their preferences. An area ripe for improvement involves allowing users to rate all the sample paintings that provide comprehensive representations of affective states (i.e., calmness, restoration, and cheerfulness), thereby capturing the extent to which they resonate with each affective dimension. By using these ratings as weights and projecting them into the embedding space, we can refine recommendation accuracy and granularity. Thus, the development and evaluation of VA RecSys that combine the curation of high-quality data with such preference weighting mechanisms holds potential to improve the current approach in therapeutic settings. Additionally, by deriving more personalised content recommendations that uncover deeper semantics of artworks, this may also extend current VA RecSys approaches, mostly limited to entertainment [89, 90], to benefit other areas such as education, blended learning, and the discovery of artistic concepts.
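One possible reading of this preference weighting idea is sketched below: the user's ratings of the sample paintings are normalised into weights, and a single query vector is formed as the weighted average of the sample embeddings, which then replaces the single selected painting in the similarity scoring. This is an assumption about how such a mechanism could work, not an evaluated method; all names, ratings and vectors are hypothetical.

```python
import math

def weighted_query(sample_embeddings, ratings):
    """Combine sample-painting embeddings into one query vector, weighted by ratings."""
    total = sum(ratings.values())
    dim = len(next(iter(sample_embeddings.values())))
    query = [0.0] * dim
    for pid, vec in sample_embeddings.items():
        weight = ratings[pid] / total  # normalise so that the weights sum to 1
        for k in range(dim):
            query[k] += weight * vec[k]
    return query

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical 2-dimensional embeddings of the three sample paintings.
samples = {"calm": [1.0, 0.0], "restorative": [0.0, 1.0], "cheerful": [1.0, 1.0]}
# Hypothetical ratings on a 1-5 scale given by one user.
user_ratings = {"calm": 5, "restorative": 1, "cheerful": 2}

q = weighted_query(samples, user_ratings)

# Score two hypothetical candidate paintings against the weighted query.
candidates = {"x": [1.0, 0.1], "y": [0.1, 1.0]}
best = max(candidates, key=lambda pid: cosine(candidates[pid], q))
```

Because this user rates the "calm" sample highest, the query vector leans towards that sample and candidate "x" is ranked above candidate "y".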

As hinted in the previous section, one particularly promising avenue is the exploration of collaboration between humans and AI systems in the context of therapy. Here, experts can fine-tune VA RecSys-generated recommendations to align them precisely with individual patient needs. Investigating the dynamics of such collaborative efforts and developing tools to facilitate expert interventions could significantly enhance the therapeutic value of VA RecSys. This exciting direction opens doors to more targeted and personalized therapy experiences, bridging the gap between technology and human expertise. Furthermore, while we have gained valuable insights with the current sample (i.e., former patients with psychological symptoms of PICS), validation with patients exhibiting PICS symptoms is warranted. Finally, one key challenge lies in comprehending the reasoning behind the VA RecSys recommendations, which remains a critical aspect in determining model performance in different contexts. The explanation of machine learning models has been a longstanding challenge in the field of AI. Nevertheless, recent strides have been made in the realm of explainable AI, with emerging techniques and methodologies. Leveraging these approaches to provide more transparent and interpretable explanations for VA RecSys recommendations holds substantial promise. This advancement can facilitate a Human-in-the-Loop approach, empowering experts to refine and enhance therapy efforts with greater precision.

9 Conclusion

We have studied Machine Learning-based VA RecSys approaches to enable personalized therapeutic visual art experiences for post-ICU patients. We have evaluated the relevance of their recommendations for therapeutic purposes compared to expert-curated recommendations. Our results suggest that Visual and Multimodal VA RecSys engines compare favourably with expert-curated recommendations, indicating great potential to support the delivery of personalised and targeted art therapy for PICS prevention and treatment. Overall, our study marks a significant step towards integrating VA RecSys in the context of therapy. Looking ahead, this work points to the potential for further advances in AI-assisted therapy and recommendation systems. The implications of our findings extend to patient-centred care, early intervention, and health promotion.

Acknowledgments

We thank Thomas Falck, MSc (Philips) and Dr. Esther van der Heide (Philips) for their advice on PICS research, and Prof. Dr. Geke Ludden (University of Twente) and Dr. Thomas van Rompay (University of Twente) for their advice on developing guided art therapy used in this study. We extend our thanks to the participants of our study for sharing their valuable experiences, and to the anonymous reviewers for their constructive comments. This work was supported by the Horizon 2020 FET program of the European Union through the ERA-NET Cofund funding grant CHIST-ERA-20-BCI-001 and the European Innovation Council Pathfinder program (SYMBIOTIK project), and the Top Technology Twente Connecting Industry program (TKI Topsector HTSM), which is partially funded by Philips.

Footnotes

https://www.image-net.org

In our implementation, we employed the all-MiniLM-L6-v2 version to optimize performance, though alternative versions can also yield suitable painting embeddings.

https://www.wikiart.org

https://www.nationalgallery.org.uk/

https://www.crosscult.lu/

All paintings are available under a Creative Commons (CC) license.

https://www.google.com/forms/about/

https://www.prolific.co/

https://huggingface.co/models

https://nlp.stanford.edu/sentiment

https://gluebenchmark.com

Source: www.philips.nl

Supplemental Material

MP4 File - Video Preview

MP4 File - Video Presentation

References

[1] Jon G Allen. 2008. Coping with trauma: Hope through understanding. American Psychiatric Pub.
