Recommender Systems for Last.fm

Recommender systems with collaborative filtering created with Apache Mahout framework. The system uses a Music Recommendation dataset for research purposes as input, but you can train it and predict recommendations with any other dataset. This project explores the calibration and accuracy of user-based and item-based models.

Data

The original dataset contains <user, timestamp, artist, song> tuples collected from Last.fm API, using the user.getRecentTracks() method. This dataset represents the whole listening habits (till May, 5th 2009) for nearly 1,000 users.

The pre-processed dataset below contains <user, artist, rating> tuples. The rating field was calculated by normalizing the number of times a user listened to a specific artist’s songs in Last.fm.

Table format: u.data.csv

user id	artist id	rating
1	100001	5.0
3	101943	4.6
6	100906	4.3
11	101722	3.6
15	107070	3.9

You can see the original dataset here

Model Tuning

The training results of the user-based collaborative filtering model are shown below:

User-Based 1 - Model tuning

User-Based 2 - Model tuning

The training results of the item-based collaborative filtering model are shown below:

Item-Based 1 - Model tuning

Technologies and Techniques

Java (JDK 1.7)
Eclipse IDE
Apache Mahout
Maven dependencies

Program Execution Rules

The project has an executable in the ‘jar’ folder. The JAR name is: RS_CF_LastFm-v1.jar and you must send as input parameters:

User ID for which recommendations are to be computed
Filepath to the input data
Filepath to the output file
Type of collaborative filtering: USER or ITEM
Similarity metric: COSINE, PEARSON or JACCARD
Number of neighbors for the KNN algorithm (only applies with user-based filtering). Default value: 51
Desired number of recommendations. Default value: 10

Execution examples:

    java -jar RS_CF_LastFm-v1.jar 1 ../data/in/u.data.csv ../data/out/output.txt USER COSINE 101 20
    java -jar RS_CF_LastFm-v1.jar 10 ../data/in/u.data.csv ../data/out/output.txt ITEM PEARSON 0 10
    java -jar RS_CF_LastFm-v1.jar 10 ../data/in/u.data.csv ../data/out/output.txt ITEM JACCARD

The .JAR program must be run with Java 7 or higher.

Program Output

Once trained the model, the system can make recommendations (on demand) for users, as follows:

user id	artist id	rating
1	130710	4.366509
1	114674	3.0061495
1	143895	2.9370918
1	103116	2.8950827
1	104052	2.7250140
1	135747	2.6153402
1	135743	2.5869453
1	102936	2.5726979
1	113273	2.5512722
1	114145	2.5447776

Conclusions

Apache Mahout is an excellent fast development framework for Recommender Systems projects. It has a wide variety of Machine Learning algorithms to make predictions (recommendations) and an extensive list of similarity functions.
In static or semi-static scenarios, recommendation systems with items-based collaborative filtering offer better results than those users-based , since it is easier to calculate the similarity between items than between users.
As a general rule, both in the user-based and in the item-based models, the predictive results improved when the models were exposed to more data (75/25 or 80/20), since the Machine Learning algorithm used has more information from which to learn.
The construction of a model items-based takes more time than the construction of a user-based model. However, once the model is built, it makes predictions more quickly and above all, more accurately than the user-based one.
When a new user is created, the user-based recommender model will have a cold start for that user, until the user performs enough interactions to be able to look like someone else. Analogously, it occurs for the item-based model when a new item is created.

Contributing and Feedback

Any kind of feedback/criticism would be greatly appreciated (algorithm design, documentation, improvement ideas, spelling mistakes, etc…).

Authors

Created by Andrés Segura Tinoco and Francisco Ariza Benavides
Created on May 29, 2019

License

This project is licensed under the terms of the MIT license.

Acknowledgements:

Thanks to Last.fm for providing the access to this data via their web services.

A Recommender Systems for Last.fm

Recommendation system with collaborative filtering created with Apache Mahout