User:Bloodysnowrocker/YahooMusic


 * Yahoo! Music Recommendations: Modeling Music Ratings with Temporal Dynamics and Item Taxonomy
 * Overview:
 * 1) Problem to solve: music recommendation on Yahoo! Music.
 * 2) Data set: a million users, 600K music items, 250 million ratings, with fine-resolution timestamps attached to the ratings.
 * 3) Model: matrix factorization, combined with temporal analysis of user rating behavior and item popularity trends. Sessions of user behavior can be used to improve prediction accuracy.
 * 4) Novelty: a four-level taxonomy, so that information can be shared among items in the temporal analysis of user behavior.
 * Collaborative Filtering Approach
 * 1) Uses implicit or explicit user feedback to infer relationships between users, between items, etc.
 * 2) The CF approach suffers from the cold-start problem, where an entity has very few ratings. Item attributes can be used to compensate in such cases.
 * Paper's Approach
 * 1) Matrix Factorization model, which maps items and users into comparable latent factors.
 * $$\hat{r}_{ui} = \mu + b_i + b_u + {p_u^T}q_i$$, where $${p_u}\in{R^d}$$ is the user's latent factor vector and $${q_i}\in{R^d}$$ is the item's latent factor vector. $$\mu$$ is the average rating, $$b_i$$ and $$b_u$$ are the item and user biases respectively, and $${p_u^T}q_i$$ is the personalization term.
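The prediction rule above can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `predict` and all numeric values are hypothetical and chosen for easy arithmetic, not taken from the paper.

```python
def predict(mu, b_u, b_i, p_u, q_i):
    """Return mu + b_i + b_u + p_u^T q_i for a single (user, item) pair."""
    if len(p_u) != len(q_i):
        raise ValueError("p_u and q_i must have the same dimension d")
    # Inner product of the user and item latent factor vectors.
    interaction = sum(pf * qf for pf, qf in zip(p_u, q_i))
    return mu + b_i + b_u + interaction


# Hypothetical values, for illustration only.
mu = 50.0          # average rating over the data set
b_u = 5.0          # this user rates higher than average
b_i = -3.0         # this item is rated lower than average
p_u = [1.0, 2.0]   # user latent factors (d = 2)
q_i = [0.5, -1.0]  # item latent factors (d = 2)

r_hat = predict(mu, b_u, b_i, p_u, q_i)
print(r_hat)  # 50.0 - 3.0 + 5.0 + (0.5 - 2.0) = 50.5
```

Note that the first three terms do not depend on the pairing of this user with this item; only the inner product does.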


 * Bias Modelling: $$\hat{r}_{ui} = \mu + b_i + b_u$$. $$b_i$$ and $$b_u$$ are the item and user biases respectively; neither depends on the interaction between the user and the item. The bias component therefore involves no personalization at all, yet it accounts for much of the variance the model explains. The paper claims that the model reduces the variance to around 510.3 ($${R^2}$$ = 52.9%). Out of this, the vast majority (41.4%) can be attributed to pure biases. The complete bias model for track i includes enhanced user and item biases: the taxonomy bias adds up the biases for album, artist and genre, together with the item popularity trend across time, and the user bias adds up session biases as well.
 * Consider $$\hat{r}_{ui} = \mu + b_{ui} + {p_u^T}q_i$$, where $$b_{ui}$$ is the modeled bias and $${p_u^T}q_i$$ is the personalization model. $$p_u$$ is the user factor vector, which also accounts for session characteristics, and $$q_i$$ is the factor vector for tracks, which also accounts for the track's album and artist.
 * Learning: all parameters $$\theta$$ of $$\hat{r}_{ui} = \mu + b_{ui} + {p_u^T}q_i$$ are learned by stochastic gradient descent (SGD).
 * Catch: The process is based on modeling the additive components of the model, each trying to reflect a unique characteristic of the data. Significant effort is spent on estimating the biases, which tend to capture much of the data variability.
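The SGD learning step can be sketched as follows for the basic model $$\hat{r}_{ui} = \mu + b_i + b_u + {p_u^T}q_i$$. This is an illustrative sketch under several assumptions, not the paper's implementation: the taxonomy, temporal and session terms are left out, the squared-error objective with L2 regularization is assumed, and the learning rate `lr`, regularization weight `reg`, the function names and the toy ratings (on a small 1 to 5 scale) are all hypothetical.

```python
import math
import random


def train_sgd(ratings, n_users, n_items, d=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Fit mu, b_u, b_i, p_u, q_i by SGD on regularized squared error."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global average rating
    b_u = [0.0] * n_users
    b_i = [0.0] * n_items
    p = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_items)]
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)  # visit ratings in random order each epoch
        for u, i, r in order:
            pred = mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i]))
            err = r - pred
            # Gradient step on each additive component of the model.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(d):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return mu, b_u, b_i, p, q


def rmse(ratings, predict_fn):
    """Root mean squared error of predict_fn(u, i) over the given ratings."""
    return math.sqrt(sum((r - predict_fn(u, i)) ** 2 for u, i, r in ratings) / len(ratings))


# Hypothetical toy data: (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 1.0)]
mu, b_u, b_i, p, q = train_sgd(ratings, n_users=3, n_items=3)

baseline_rmse = rmse(ratings, lambda u, i: mu)
trained_rmse = rmse(
    ratings,
    lambda u, i: mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(p[u], q[i])),
)
print(baseline_rmse, trained_rmse)
```

Comparing the training RMSE of the fitted model against the constant predictor $$\mu$$ gives a rough feel for how much of the variability the biases and factors absorb. On real data the comparison should be made on held-out ratings.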