Codes for the paper “Our Model Achieves Excellent Performance on MovieLens: What Does it Mean?”

Our paper focuses on the MovieLens dataset, one of the standard benchmark datasets in recommender systems research. Our study explores a different perspective: to what extent do we understand the mechanisms that generate a dataset's user-item interactions?

MovieLens1518 Dataset

To minimize the potential impact of different versions, we extract user interaction data from the last four years (2015-2018) in the MovieLens-25M dataset.
In the curated dataset, we retain only ratings from users whose entire rating history falls within this 4-year period.
In addition, we remove less active users, i.e., those with fewer than 35 interactions with MovieLens, from the collected data.
The dataset contains about 4.2 million user-item interactions (see Table below).

| #Users | #Items | Avg. #Ratings per User | Avg. #Ratings per Item | #Interactions |
|--------|--------|------------------------|------------------------|---------------|
| 24,812 | 36,378 | 170.3                  | 116.2                  | 4.2M          |
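
The curation described above can be reproduced with a short pandas script. The sketch below is only an illustration under stated assumptions (the standard `ml-25m/ratings.csv` layout with `userId`, `movieId`, `rating`, and `timestamp` columns); it is not the exact preprocessing code of this repository.

```python
import pandas as pd

# Load MovieLens-25M ratings (userId, movieId, rating, timestamp in seconds).
ratings = pd.read_csv("ml-25m/ratings.csv")
ratings["datetime"] = pd.to_datetime(ratings["timestamp"], unit="s")

# Keep only users whose entire rating history falls within 2015-2018.
start, end = pd.Timestamp("2015-01-01"), pd.Timestamp("2019-01-01")
user_span = ratings.groupby("userId")["datetime"].agg(["min", "max"])
eligible_users = user_span[(user_span["min"] >= start) & (user_span["max"] < end)].index
curated = ratings[ratings["userId"].isin(eligible_users)]

# Drop less active users with fewer than 35 interactions.
counts = curated["userId"].value_counts()
curated = curated[curated["userId"].isin(counts[counts >= 35].index)]
```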

MovieLens's internal recommender considers the popularity score of movies over the past year. We follow the same time scale: the evaluation time window is set to one year, and independent comparative experiments are conducted year by year, from 2015 to 2018.
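
Continuing the sketch above (variable names are assumptions, not the repository's actual code), the yearly evaluation windows can be obtained by partitioning the curated data by calendar year:

```python
# One independent comparative experiment per calendar year, 2015-2018.
yearly_data = {year: df for year, df in curated.groupby(curated["datetime"].dt.year)}
```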

Baselines and Evaluation Metrics

We select seven widely used baselines from four categories: (1) memory-based methods: MostPop and ItemKNN; (2) latent factor method: PureSVD; (3) non-sampling deep learning method: Multi-VAE; and (4) sequence-aware deep learning methods: SASRec, TiSASRec, and Caser.

Experiments

Our experiments are divided into two main parts. All of the experiments follow the leave-last-one-out scheme.
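
For reference, below is a minimal sketch of a leave-last-one-out split, assuming the curated pandas DataFrame from the sketches above; it is an illustrative assumption, not necessarily the repository's exact splitting code.

```python
def leave_last_one_out(interactions):
    """For each user, send the chronologically last item to the test set,
    the second-to-last to the validation set, and the rest to training."""
    interactions = interactions.sort_values(["userId", "timestamp"])
    position = interactions.groupby("userId").cumcount(ascending=False)
    test = interactions[position == 0]
    valid = interactions[position == 1]
    train = interactions[position >= 2]
    return train, valid, test
```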

Impact of Interaction Context at Different Stages

We conduct ablation experiments that remove, from each user's training instances, either the first 15 ratings, 15 randomly sampled ratings, or the last 15 ratings. For the random-removal setting, we repeat the experiment three times with different seeds and average the recommendation performance to reduce random error.
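
A possible implementation of the three removal modes, assuming the pandas training DataFrame from the sketches above; the function name `remove_ratings` and its parameters are illustrative, not the repository's actual API:

```python
def remove_ratings(train, mode, k=15, seed=0):
    """Drop k ratings per user from the training set: the earliest ('first'),
    a random sample ('random'), or the latest ('last')."""
    train = train.sort_values(["userId", "timestamp"])
    grouped = train.groupby("userId", group_keys=False)
    if mode == "first":
        return grouped.apply(lambda g: g.iloc[k:])
    if mode == "last":
        return grouped.apply(lambda g: g.iloc[:-k])
    if mode == "random":
        return grouped.apply(
            lambda g: g.drop(g.sample(n=min(k, len(g)), random_state=seed).index))
    raise ValueError(f"unknown mode: {mode}")

# The random setting would be run with several seeds and the results averaged.
```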

Impact of Interaction Sequence

Our final experiment alters the original interaction order by shuffling the data. We keep the validation and test sets unchanged, create pseudo-sequences by shuffling the order of each user's interaction sequence in the training set, and observe how the performance of the sequential recommendation algorithms changes.
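
A minimal sketch of the shuffling step, assuming each user's training interactions are stored as a chronologically ordered list of item ids in a dict; the function name is hypothetical:

```python
import numpy as np

def shuffle_training_sequences(train_sequences, seed=0):
    """Permute the item order within every user's training sequence to form
    pseudo-sequences; validation and test sets are left untouched."""
    rng = np.random.default_rng(seed)
    return {user: list(rng.permutation(items)) for user, items in train_sequences.items()}
```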

Acknowledgements

Our code builds on the following repositories, adapted for our customized experiments, which also helps ensure the reproducibility and reliability of our results.
