
Matrix factorization is a class of collaborative filtering algorithms used in recommender systems. Matrix factorization algorithms work by decomposing the user-item interaction matrix into the product of two lower-dimensionality rectangular matrices. This family of methods became widely known during the Netflix Prize challenge due to its effectiveness, as reported by Simon Funk in the 2006 blog post in which he shared his findings with the research community.

Techniques
The idea behind matrix factorization is to represent users and items in a lower-dimensional latent space. Since the initial work by Funk in 2006, a multitude of matrix factorization approaches has been proposed for recommender systems. Some of the most widely used are listed in the following sections.

Funk SVD
The original algorithm proposed by Simon Funk in his blog post factorized the user-item rating matrix as the product of two lower-dimensional matrices: the first has a row for each user, while the second has a column for each item. The row or column associated with a specific user or item is referred to as its latent factors. Note that, despite its name, no singular value decomposition is applied in this matrix factorization algorithm. The predicted ratings are computed as $$\tilde{R}=H W$$, where $$\tilde{R} \in \mathbb{R}^{\text{users} \times \text{items}}$$, $$H \in \mathbb{R}^{\text{users} \times \text{latent factors}}$$ and $$W \in \mathbb{R}^{\text{latent factors} \times \text{items}}$$.

Specifically, the predicted rating user u will give to item i is computed as:

$$\tilde{r}_{ui} = \sum_{f=0}^{n factors} H_{u,f}W_{f,i}$$

It is possible to tune the expressive power of the model by changing the number of latent factors. It has been demonstrated that a matrix factorization with a single latent factor is equivalent to a most-popular (top-popular) recommender, i.e. one that recommends the items with the most interactions without any personalization. Increasing the number of latent factors improves personalization, and therefore recommendation quality, until the number of factors becomes too high, at which point the model starts to overfit and recommendation quality decreases. A common strategy to avoid overfitting is to add regularization terms to the objective function. FunkSVD was developed as a rating prediction method and therefore uses explicit numerical ratings as user-item interactions.

All things considered, FunkSVD minimizes the following objective function:

$$\underset{H, W}{\operatorname{arg\,min}}\, \|R - \tilde{R}\|_{\rm F} + \alpha\|H\| + \beta\|W\|$$

Where $$\|.\|_{\rm F}$$ is the Frobenius norm; the other norms may differ depending on what works best for the specific recommendation problem.
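As an illustration, this objective can be minimized with stochastic gradient descent over the observed ratings, which is how Funk originally trained the model. The following is a minimal numpy sketch, not the original implementation: the matrix sizes, learning rate, regularization weight, and epoch count are illustrative, and squared-L2 regularization is assumed for both factor matrices.

```python
import numpy as np

def funk_svd(R, n_factors=2, n_epochs=1000, lr=0.02, reg=0.02, seed=0):
    """Factorize the rating matrix R (0 marks a missing rating) into
    H (users x factors) and W (factors x items) via SGD on observed entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    H = rng.normal(scale=0.1, size=(n_users, n_factors))
    W = rng.normal(scale=0.1, size=(n_factors, n_items))
    users, items = R.nonzero()                 # indices of observed ratings
    for _ in range(n_epochs):
        for u, i in zip(users, items):
            err = R[u, i] - H[u] @ W[:, i]     # prediction error on (u, i)
            h_u = H[u].copy()                  # use pre-update value for both steps
            H[u]    += lr * (err * W[:, i] - reg * h_u)
            W[:, i] += lr * (err * h_u      - reg * W[:, i])
    return H, W

R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [1., 1., 5.]])
H, W = funk_svd(R)
R_tilde = H @ W   # predicted ratings, including the previously missing entries
```

Note that the gradient steps only touch the row of H and the column of W involved in the current rating, so each update is cheap even for large matrices.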

SVD++
While FunkSVD was able to provide very good recommendation quality, its ability to use only explicit numerical ratings as user-item interactions was a limitation. Modern-day recommender systems should exploit all available interactions, both explicit (e.g. numerical ratings) and implicit (e.g. likes, purchases, skips, bookmarks). To this end, SVD++ was designed to take implicit interactions into account as well. Compared to FunkSVD, SVD++ also takes into account user and item biases.

The predicted rating user u will give to item i is computed as:

$$\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{n factors} H_{u,f}W_{f,i}$$
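A minimal numpy sketch of this prediction rule (global mean plus item and user biases plus the factor dot product), with illustrative names and randomly initialized parameters; the additional implicit-feedback terms of the full SVD++ model are omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 5, 2

mu = 3.5                                       # global average rating
b_user = rng.normal(scale=0.1, size=n_users)   # per-user bias b_u
b_item = rng.normal(scale=0.1, size=n_items)   # per-item bias b_i
H = rng.normal(scale=0.1, size=(n_users, n_factors))
W = rng.normal(scale=0.1, size=(n_factors, n_items))

def predict(u, i):
    """mu + b_i + b_u + sum_f H[u, f] * W[f, i]."""
    return mu + b_item[i] + b_user[u] + H[u] @ W[:, i]

r_hat = predict(0, 1)   # predicted rating of user 0 for item 1
```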

SVD++ has, however, some disadvantages. The main drawback is that it is not model-based: if a new user is added, the algorithm is incapable of modeling them unless the whole model is retrained. Even though some interactions may have been gathered for that user, their latent factors are not available. This is an example of the cold-start problem: the algorithm cannot deal efficiently with new users or items, so specific strategies must be put in place.

It is possible to modify SVD++ so that it becomes a model-based algorithm, making it easy to handle new items and new users and thus solving this main drawback of SVD++.

Since the problem with SVD++ is that the latent factors of new users are not available, it is necessary to represent them in a different way. Because a user's latent factors represent the preference of that user for the corresponding item latent factors, they can be estimated from the user's past interactions: knowing which items a user interacted with, and therefore their latent factors, allows the user's own factors to be estimated. Note that this does not entirely solve the cold-start problem, since reliable interactions are still needed for new users, but at least there is no need to recompute the whole model every time. It has been demonstrated that this formulation is almost equivalent to a SLIM model, which is an item-item model-based recommender.

$$\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{n factors} \biggl( \sum_{j=0}^{n items} r_{uj} W_{j,f} \biggr) W_{f,i}$$

With this formulation, the equivalent item-item recommender would be $$\tilde{R} = R S = R W^{\rm T} W$$. Therefore the similarity matrix is symmetric.
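Both properties can be checked numerically. A small sketch with random data, assuming W holds the item latent factors as defined above: the inner sum $$\sum_j r_{uj} W_{j,f}$$ corresponds to estimating the user factors as $$R W^{\rm T}$$, and multiplying by W again gives the same predictions as $$R S$$ with $$S = W^{\rm T} W$$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 6, 2

R = rng.integers(0, 2, size=(n_users, n_items)).astype(float)  # implicit interactions
W = rng.normal(size=(n_factors, n_items))                      # item latent factors

S = W.T @ W            # item-item similarity matrix
R_tilde = R @ S        # item-item formulation of the predictions

H_est = R @ W.T        # user factors estimated from past interactions
assert np.allclose(H_est @ W, R_tilde)   # same predictions, two readings
assert np.allclose(S, S.T)               # S is symmetric by construction
```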

Asymmetric SVD
Asymmetric SVD aims at combining the advantages of SVD++ while being a model-based algorithm, and is therefore able to consider new users with a few ratings without needing to retrain the whole model. As opposed to the model-based SVD++ above, here the user latent factor matrix H is replaced by Q, which learns the user's preferences as a function of their ratings.

The predicted rating user u will give to item i is computed as:

$$\tilde{r}_{ui} = \mu + b_i + b_u + \sum_{f=0}^{n factors} \sum_{j=0}^{n items} r_{uj} Q_{j,f}W_{f,i}$$

With this formulation, the equivalent item-item recommender would be $$\tilde{R} = R S = R Q W$$ (with Q indexed as $$Q_{j,f}$$, i.e. items by factors, so that $$S_{j,i} = \sum_{f} Q_{j,f} W_{f,i}$$). Since the matrices Q and W are different, the similarity matrix is asymmetric, hence the name of the model.
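Again a small numerical sketch with random data; following the indexing of the prediction formula above, Q is taken as an items-by-factors matrix so that $$S_{j,i} = \sum_f Q_{j,f} W_{f,i}$$, and the bias terms are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 6, 2

R = rng.integers(0, 2, size=(n_users, n_items)).astype(float)  # interactions
Q = rng.normal(size=(n_items, n_factors))  # Q_{j,f}: factors on the rating side
W = rng.normal(size=(n_factors, n_items))  # W_{f,i}: factors on the target side

S = Q @ W          # item-item similarity; generally NOT symmetric since Q != W.T
R_tilde = R @ S    # predictions in the equivalent item-item form

assert not np.allclose(S, S.T)
```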

Hybrid MF
In recent years, many other matrix factorization models have been developed to exploit the ever-increasing amount and variety of available interaction data and use cases. Hybrid matrix factorization algorithms are capable of merging explicit and implicit interactions, or both content and collaborative data.