Personalizing Peloton: Combining Rankers and Filters To Balance Engagement and Business Goals

S Banerjee, A Bhadury, N Talukder, S Thammana
Proceedings of the 15th ACM Conference on Recommender Systems, 2021 - dl.acm.org
Peloton is at the forefront of the at-home fitness market, with two business pillars: (a) a line of connected fitness equipment, and (b) a subscription-based service that offers access to a rich catalog of high-quality fitness classes. As of May 2021, Peloton had over 5.4 million members, who had taken more than 170 million workouts. Peloton classes cover a diversity of instructors, languages, fitness disciplines, durations, intensities and muscle groups. On the other side, each user has their own specific fitness goals, time available to work out, fitness equipment and level of skill or strength. This diversity of content and individuality of user needs creates the need for a recommender system capable of personalizing the Peloton experience.
Most recommendation engines optimize for user lifetime value or time of engagement. However, the usage habits of Peloton users differ markedly from those seen in other industry recommendation problems. Users arrive on the platform with a clear intent to work out, which allows our recommendation algorithms to look beyond classic short-term metrics, such as click-through rates and session lengths optimized for exploration. Instead, fitness content recommendations at Peloton also help solve longer-term problems such as:
It helps balance engagement and business goals. A classic example of this is the introduction of a new instructor. Existing users have already developed affinities to certain instructors, so how can we ensure that enough of them see and try some classes from the new instructor for that instructor to build a following of their own?
It helps guide users to explore the vast library of content. Peloton users quickly develop set routines with our fitness content, with high repeat plays of the same instructor/duration/class type. Recommendations serve as a mechanism to encourage them to try something outside this comfort zone, which both increases the breadth of a user's engagement with the platform and leads to a more holistic workout routine.
We achieve these two goals by using a combination of rankers and filters. Ranking models order the universe of content for each user according to their preferences. Filters take a slice of this ordered content to generate a shelf of content with a reason for suggesting it. Explainability is heavily linked to business goals, while ranking is linked to engagement goals. For instance, ranking and filtering can work in tandem to populate a shelf that helps promote a new music partnership, e.g. "Workouts Featuring The Beatles", where we highlight classes that contain music by The Beatles (filter), ordered by the user's class preferences (ranker). With rankers and filters, we empower other teams to curate personalized experiences. Creating a shelf reduces to picking a ranker and a filter and giving the shelf a title, after which it can be displayed to users.
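The ranker-plus-filter composition described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Peloton's implementation: the names `FitnessClass`, `Shelf`, `make_score_ranker` and `features_artist`, as well as the idea of a ranker backed by a precomputed per-user score table, are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class FitnessClass:
    """Hypothetical catalog entry; the field names are illustrative only."""
    class_id: str
    instructor: str
    artists: FrozenSet[str]


# A ranker orders the whole catalog for one user; a filter keeps or drops a class.
Ranker = Callable[[str, List[FitnessClass]], List[FitnessClass]]
Filter = Callable[[FitnessClass], bool]


@dataclass
class Shelf:
    """A shelf is a title plus one ranker and one filter."""
    title: str
    ranker: Ranker
    keep: Filter
    size: int = 10

    def build(self, user_id: str, catalog: List[FitnessClass]) -> List[FitnessClass]:
        # Rank the full catalog for this user, then take the slice the filter allows.
        ranked = self.ranker(user_id, catalog)
        return [c for c in ranked if self.keep(c)][: self.size]


def make_score_ranker(scores: Dict[str, Dict[str, float]]) -> Ranker:
    """Assumed ranker: sorts by a precomputed user -> class_id -> score table."""
    def rank(user_id: str, catalog: List[FitnessClass]) -> List[FitnessClass]:
        user_scores = scores.get(user_id, {})
        # Classes without a score default to 0.0; sorted() is stable for ties.
        return sorted(
            catalog,
            key=lambda c: user_scores.get(c.class_id, 0.0),
            reverse=True,
        )
    return rank


def features_artist(artist: str) -> Filter:
    """Assumed filter: keeps classes whose playlist includes the given artist."""
    return lambda c: artist in c.artists
```

Under these assumptions, a curator would create the example shelf with `Shelf("Workouts Featuring The Beatles", make_score_ranker(scores), features_artist("The Beatles"))` and call `build(user_id, catalog)` for each user. Keeping the ranker and the filter as independent callables is what lets any ranker be paired with any filter.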
Further, we have context-based models that order the shelves for each user depending on both their preferences and context, such as platform and time. For ranking our various filters, we take a multi-armed bandit approach, due to the need to handle cold starts on users and balance exploration (keep putting new shelves in front of the user) with exploitation (keep showing them shelves they already interact with). To start with, we implemented a simple Thompson Sampling-based bandit algorithm that accumulates rewards (for shelves interacted with) and penalties (for shelves ignored). These continually adapt the shelf ordering for a user, making the experience more personalized over time. We are able to perform all calculations offline in batch, using Spark, and cache the Thompson Sampling …
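As a rough illustration of how rewards and penalties can drive shelf ordering, here is a minimal per-user Thompson Sampling sketch. It assumes a Beta-Bernoulli model with a uniform Beta(1, 1) prior per shelf, which the abstract does not specify; the class name `ShelfBandit` and its methods are hypothetical, and the batch Spark computation and caching mentioned above are not modeled.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class ShelfBandit:
    """Per-user Thompson Sampling over shelves (assumed Beta-Bernoulli model)."""
    # shelf title -> [alpha, beta]; alpha counts rewards, beta counts penalties.
    params: Dict[str, List[float]] = field(default_factory=dict)

    def _get(self, shelf: str) -> List[float]:
        # Unseen shelves start from a uniform Beta(1, 1) prior, which gives
        # new shelves (and new users) a real chance of being shown.
        return self.params.setdefault(shelf, [1.0, 1.0])

    def update(self, shelf: str, interacted: bool) -> None:
        """Record a reward (shelf interacted with) or a penalty (shelf ignored)."""
        p = self._get(shelf)
        if interacted:
            p[0] += 1.0
        else:
            p[1] += 1.0

    def order(self, shelves: Iterable[str],
              rng: Optional[random.Random] = None) -> List[str]:
        """Draw one sample per shelf from its posterior and sort by the draws."""
        rng = rng or random.Random()
        shelf_list = list(shelves)
        draws = {s: rng.betavariate(*self._get(s)) for s in shelf_list}
        return sorted(shelf_list, key=lambda s: draws[s], reverse=True)
```

Because the ordering comes from random posterior draws, a shelf with little feedback has a wide posterior and will sometimes be sampled near the top (exploration), while a shelf with many recorded interactions is ranked high consistently (exploitation).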