Our paper focuses on the MovieLens dataset, one of the most widely used benchmark datasets in the field of recommender systems. Our study explores a different perspective: to what extent do we understand the mechanisms that generate a dataset's user-item interactions?
To minimize the potential impact of different dataset versions, we extract user interaction data from the last four years (2015-2018) of the MovieLens-25M dataset.
In the curated dataset, we retain only ratings from users whose entire rating history falls within this four-year period.
In addition, we remove less active users, i.e., those with fewer than 35 interactions on MovieLens, from the collected data.
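The curation steps above can be summarized in a few lines. The sketch below is illustrative only, assuming pandas and the standard `ratings.csv` file shipped with MovieLens-25M; the file path and variable names are ours and not part of any released code.

```python
# Illustrative sketch only: pandas-based curation of MovieLens-25M,
# following the steps described above.
import pandas as pd

ratings = pd.read_csv("ml-25m/ratings.csv")  # columns: userId, movieId, rating, timestamp
ratings["datetime"] = pd.to_datetime(ratings["timestamp"], unit="s")

# Keep only users whose entire rating history falls within 2015-2018.
start, end = pd.Timestamp("2015-01-01"), pd.Timestamp("2019-01-01")
span = ratings.groupby("userId")["datetime"].agg(["min", "max"])
eligible = span[(span["min"] >= start) & (span["max"] < end)].index
curated = ratings[ratings["userId"].isin(eligible)]

# Remove less active users with fewer than 35 interactions.
counts = curated["userId"].value_counts()
curated = curated[curated["userId"].isin(counts[counts >= 35].index)]
```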
The dataset contains about 4.2 million user-item interactions (see Table below).
| #Users | #Items | Avg. #Ratings per User | Avg. #Ratings per Item | #Interactions |
|---|---|---|---|---|
| 24,812 | 36,378 | 170.3 | 116.2 | 4.2M |
MovieLens' internal recommender considers the popularity score of movies over the past year. We follow the same time scale, set the evaluation window to one year, and conduct independent comparative experiments year by year from 2015 to 2018.
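For illustration, the yearly evaluation windows could be carved out as follows; this continues the hypothetical `curated` DataFrame from the sketch above and is not necessarily the exact splitting code used in our experiments.

```python
# Illustrative year-by-year partitioning of the curated interactions.
yearly = {
    year: curated[curated["datetime"].dt.year == year]
    for year in range(2015, 2019)
}
for year, interactions in yearly.items():
    print(year, len(interactions), "interactions")
```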
We select seven widely used baselines from four categories: (1) memory-based methods: MostPop and ItemKNN; (2) a latent factor method: PureSVD; (3) a non-sampling deep learning method: Multi-VAE; and (4) sequence-aware deep learning methods: SASRec, TiSASRec, and Caser.
Our experiments are divided into two main parts. All experiments follow the leave-last-one-out scheme, sketched below.
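A minimal sketch of the leave-last-one-out split is given below; the function name and example sequence are hypothetical.

```python
# Minimal sketch of the leave-last-one-out protocol: for each user, the last
# interaction is held out for testing, the second-to-last for validation,
# and everything before is used for training.
def leave_last_one_out(user_items):
    """user_items: item ids of one user, sorted by interaction time."""
    train, valid, test = user_items[:-2], user_items[-2], user_items[-1]
    return train, valid, test

# Example with a hypothetical interaction sequence.
train, valid, test = leave_last_one_out([10, 42, 7, 91, 3])
print(train, valid, test)  # [10, 42, 7] 91 3
```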
We conduct an ablation experiment with the removal of the first
Our final experiment changes the order of the original sequences through data shuffling. We keep the validation and test sets unchanged, construct new pseudo-sequences by shuffling the order of user interaction sequences in the training set, and observe the resulting performance changes of the sequence-aware recommendation algorithms.
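A minimal sketch of how such pseudo-sequences can be constructed is shown below, assuming the leave-last-one-out split above; only the training portion of each user's sequence is shuffled, while the held-out validation and test items keep their positions. The function name and seed handling are ours.

```python
# Minimal sketch of pseudo-sequence construction: shuffle only the training
# portion of a user's sequence; validation and test items stay at the end.
import random

def shuffle_training_sequence(user_items, seed=0):
    train, valid, test = user_items[:-2], user_items[-2], user_items[-1]
    rng = random.Random(seed)
    rng.shuffle(train)  # destroy the original temporal order in training only
    return train + [valid, test]
```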
We build on the following repositories, adapting their code for our customized experiments; this also ensures the reproducibility and reliability of our results.