3652
Comment:

← Revision 3 as of 20140212 16:35:15 ⇥
3649

Deletions are marked like this.  Additions are marked like this. 
Line 80:  Line 80: 
* Online Collaborative Filtering. Jacob Abernethy, Kevin Canini, John Langford, Alex Simma http://canini.me/research_files/OnlineCollaborativeFiltering.pdf  * Online Collaborative Filtering. Jacob Abernethy, Kevin Canini, John Langford, Alex Simma http://canini.me/research_files/OnlineCollaborativeFiltering.pdf 
Online Collaborative Filtering
Contents
 Overview
 Usage
 Complementary functions
Overview.
 The problem of collaborative filtering is typically defined as the task of inferring consumer preferences: Given an observed set of product preferences for a set of users, can we accurately predict the unobserved preferences?
 Notation. We define a collaborative filtering problem as a distribution D over triples (a, b, r) ⊂ A × B × R where A and B are finite sets of size n and m respectively. We are given a set M of triples {(a, b, r)} and want to find a function f (a, b) which minimizes the expected squared error
 Typically, we think of A as our set of users, B as our set of products, and r as user a’s “rating” of product b. In most movie recommendation datasets, r is a number in {1, 2, 3, 4, 5} as in the number of “stars”, although in other settings we may only be given r ∈ {0, 1}, as in liked/disliked.
Usage.
 Basic overview of usage steps are:
 Convert. convert input data into OnlineCF compatible format.
 Configuration and Train. set parameters for training
 Load
 Predict
Convert.
Since, currently, we support one input path, we need to convert input data and combine set of triples, item and user features into one file. In order to implement custom parsing of input data, use InputConverter class Below is example for Movie Lens dataset converter.
MovieLensConverter converter = new MovieLensConverter(); converter.convert(pathToPreferences, pathToMovieGenres, convertedOutputPath);
Configuration and Train.
 In order to achieve good performance in prediction we need to configure iteration count, matrix rank and matrix factorization update functions.
OnlineCF recommender = new OnlineCF(); recommender.setInputPreferences(convertedOutputPath); recommender.setIteration(150); recommender.setMatrixRank(3); recommender.setSkipCount(1); // after how many steps we should synchronize values in each task recommender.setUpdateFunction(MeanAbsError.class); recommender.setOutputPath(outputFileName); recommender.train();
Load.
 After training, model will be saved into output file by default In order to use prediction functions we need to load it.
recommender.load(pathToTrainedModel, false);
Predict.
// estimate score double estimatedScore = recommender.estimatePreference(userId, itemId); // estimate user similarities double userSimilarity = recommender.calculateUserSimilarity(user1, user2); // Pair<K, V>  where K predicted similar user, V predicted similarity score List<Pair<Long, Double>> similarUsers = recommender.getMostSimilarUsers(userId, count); // estimate item similarities double itemSimilarity = recommender.calculateItemSimilarity(item1, item2); // Pair<K, V>  where K predicted similar item, V predicted similarity score List<Pair<Long, Double>> similarItems = recommender.getMostSimilarItems(itemId, count);
Complementary functions
 There some classes which can be useful to know while using Online Collaborative Filtering.
InputConverter. For parsing input data and converting into OnlineCF compatible format. (see MovieLensConverter)
OnlineUpdate.Function. For matrix factorization functions It will be used while training and estimating user preference (see MeanAbsError)
References
Online Collaborative Filtering. Jacob Abernethy, Kevin Canini, John Langford, Alex Simma http://canini.me/research_files/OnlineCollaborativeFiltering.pdf