Link Prediction Approach to Recommender Systems
T. Jaya Lakshmi · S. Durga Bhavani
Abstract
Recommender systems are a widely studied problem with a myriad of available solutions. A novel approach proposed in the literature uses link prediction in social networks: the typical user-item information is modelled as a bipartite network in which predicting a link amounts to recommending an item to a user. Standard recommender system methods suffer from the problems of sparsity and scalability. Since link prediction measures involve computations pertaining to small neighborhoods in the network, this approach leads to a scalable solution to recommendation. One issue in this conversion is that link prediction is modelled as a binary classification task, whereas recommendation is usually solved as a regression task in which the rating of a link is to be predicted. We overcome this issue by predicting the top k links as recommendations with high ratings, without predicting the actual rating. Our work extends similar approaches in the literature by focusing on exploiting probabilistic measures for link prediction. Moreover, in the proposed approach, prediction measures that utilize the temporal information available on the links prove to be more effective in improving the accuracy of prediction. The approach is evaluated on the benchmark 'Movielens' dataset. We show that the use of temporal probabilistic measures helps in improving the quality of recommendations. The temporal random-walk based measure T_Flow improves recommendation accuracy by 4%, and the temporal co-occurrence probability measure improves prediction accuracy by 10%, over the item-based collaborative filtering method in terms of AUROC score.
T. Jaya Lakshmi
School of Engineering and Applied Sciences, SRM University - Andhra Pradesh, India. E-mail: [email protected]

S. Durga Bhavani
School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India. E-mail: [email protected]

arXiv [cs.IR]
Many e-commerce websites provide a wide range of products to users. Users commonly have different needs and tastes, based on which they buy products. Providing the most appropriate products to users makes the buying process efficient and improves user satisfaction. Enhanced user satisfaction keeps the user loyal to the website and improves sales and thus the profits of retailers. Therefore, more retailers have started recommending products to users, which requires efficient analysis of user interests in products. E-commerce leaders like Amazon and Netflix use recommender systems to recommend products to users. Standard recommendation algorithms such as content-based filtering and collaborative filtering model the ratings given by users to items as a matrix and predict the ratings of customers for unrated items based on user/item similarity.

Recommender systems recommend items to users. Items include products and services such as movies, music, books, web pages, news, jokes and restaurants. The recommendation process utilizes data including user demographics, product descriptions and the previous actions of users on items, such as buying, rating and watching. The information can be acquired explicitly, by collecting ratings given by users on items, or implicitly, by monitoring users' past behavior, such as songs listened to on music websites, news and movies watched on news and movie websites, items bought on e-commerce websites or books read on book-listing websites. The use of intelligent recommendation systems improved the revenue of Amazon by 35%, caused a business growth of 24% for BestBuy, and increased views by 75% on Netflix and by 60% on YouTube [7]. Therefore, building a personalized recommendation system has profound significance not only in the commercial arena but also in fields such as health care, news, food and academia, among many others.
Each domain needs to consider different features.

A recommender system is a typical application of link prediction in heterogeneous networks. The link prediction problem infers future links in a network represented as a graph, where nodes are users and edges denote interactions between users. In the context of recommender systems, the nodes may be of two types: items and users. A transaction of a user buying an item can be shown as an edge from the user node to the item node. The recommendation problem can be viewed as the task of selecting unobserved links for each user node, and can thus be modeled as a link prediction problem.

In this work, we have applied various link prediction measures to the bipartite network in the context of recommender systems and verified the efficacy of those measures. We have chosen the medium-sized MovieLens1M dataset for experimentation. The contributions made in this paper are:
– Evaluated the efficacy of baseline link prediction measures for the recommendation task.
– Extended the existing probabilistic measures, called the co-occurrence probability measure and the temporal co-occurrence probability measure, to make them suitable for the recommendation problem.

Fig. 1: Example rating matrix R denoting ratings given by users to 4 products.
Fig. 2: An example of a user-item bipartite network with edges denoting the rating given by a user to an item.
Schafer et al. [36] define the problem of recommender systems as follows: given a set of users $U = \{u_1, u_2, \ldots, u_m\}$, a set of items $I = \{I_1, I_2, \ldots, I_n\}$ and the ratings $R$, where $R_{ij}$ represents the rating given by user $u_i$ to item $I_j$, the task of a recommender system is to recommend to a user $u_i$ a new item $I_j$ which the user did not buy.

For example, consider the matrix given in Fig. 1, where rows correspond to users and columns denote products. The matrix entry $R_{ij}$ represents the rating given by user $u_i$ to item $I_j$. The main task of recommender systems is to predict the unrated entries in the rating matrix.

Recommender systems are natural examples of weighted bipartite networks. A bipartite network contains exactly two types of nodes and a single type of edge, which exists only between nodes of different types. A bipartite network is defined as $G = (V_1 \cup V_2, E)$, where $V_1$ and $V_2$ are the sets of the two types of nodes and $E$ represents the set of edges between nodes of type $V_1$ and $V_2$. Fig. 2 depicts a sample scenario of users buying items on e-commerce sites such as Amazon, modeled as a bipartite network.
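As a minimal sketch of this modelling step, the following Python snippet converts a small rating matrix into a weighted user-item bipartite network stored as adjacency sets. The matrix values, the node labels and the helper name `rating_matrix_to_bipartite` are illustrative assumptions rather than data from Fig. 1, and a rating of 0 is assumed to mean "unrated".

```python
from collections import defaultdict

def rating_matrix_to_bipartite(R, users, items):
    """Return (edges, adj) for the weighted bipartite graph of a rating matrix.

    R[i][j] is the rating of users[i] for items[j]; 0 is assumed to mean
    'unrated' (an assumption of this sketch, not of the paper).
    """
    edges = {}                 # (user, item) -> rating, used as edge weight
    adj = defaultdict(set)     # node -> neighbours, always of the other type
    for i, u in enumerate(users):
        for j, p in enumerate(items):
            if R[i][j] != 0:
                edges[(u, p)] = R[i][j]
                adj[u].add(p)
                adj[p].add(u)
    return edges, dict(adj)

# Hypothetical 3 x 4 rating matrix (rows: users, columns: items).
users = ["u1", "u2", "u3"]
items = ["I1", "I2", "I3", "I4"]
R = [
    [5, 0, 3, 0],
    [0, 4, 0, 0],
    [1, 0, 0, 2],
]
edges, adj = rating_matrix_to_bipartite(R, users, items)

# Every user-item pair without an edge is a candidate link.
unrated = [(u, p) for u in users for p in items if (u, p) not in edges]
```

The pairs collected in `unrated` are the candidate links among which a link prediction measure would select recommendations.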
Fig. 3: Taxonomy of recommender systems
The papers on recommender systems in the literature are discussed from the algorithmic as well as the domain perspective in the next two sections. Fig. 3 gives a taxonomy of the literature.

3.1 Domain Perspective

Product recommendation is the most explored domain. E-commerce sites like Amazon, Flipkart and eBay use various collaborative filtering techniques for recommending products to their customers. Video recommendations for movies, TV shows and web series by companies like Netflix and YouTube are increasing exponentially. Netflix launched a competition with prize money of 1 million dollars for improving the RMSE of movie recommendation by 10%, releasing 100 million customer ratings [5]. A news recommendation system has been implemented in [27] using Google News, by constructing a Bayesian network for identifying user interests and then applying content-based, collaborative filtering and hybrid techniques. Research article recommendation, food recommendation and health recommendation are other domains among many.

Valdez et al. integrate recommender systems into individual medical records to help patients improve their autonomy. The authors collected health-related information from patients' records, performed text processing to extract features, and evaluated the system using metrics from the information extraction domain [38].

All these domains need different pre-processing techniques, algorithms and evaluation metrics.
Content-based recommendation uses item descriptions and constructs user profiles that contain information about user preferences [32]. These preferences may include the genre, director and actors of a movie, the author of a book, and so on. The recommendation of an item to a user is then based on the similarity between the item description and the user profile. This method has the benefit of recommending items to users that have never been rated by them [29]. Content-based recommendation systems require complete descriptions of items and detailed user profiles, which is one of their main limitations [29].

Content-based recommender systems are divided into three general approaches: explicit, implicit and model-based. In explicit content-based methods, the profile information of users is collected through questionnaires about their preferences for the items. Implicit methods build user profiles implicitly by searching for similarities in the descriptions of the items liked and disliked by the user. The third class, model-based methods, constructs user profiles using a learning method: item descriptions are given as input features to a supervised learning algorithm, which produces user ratings of items as output.

The item descriptions are often textual information contained in documents, web sites and news messages. User profiles are modeled as weighted vectors of these item descriptions. The advantage of this method is the ability to recommend newly introduced items to users [29]. Besides the requirement for complete item descriptions and detailed user profiles noted above, privacy is a further limitation of content-based systems, since users may be unwilling to share their preferences with others.
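To make the matching step concrete, the sketch below builds a user profile as a rating-weighted sum of item feature vectors and ranks unseen items by cosine similarity to that profile. The feature names, the toy items and the weighting scheme are assumptions chosen for illustration; the cited systems may construct profiles differently.

```python
import math

def build_profile(rated, item_features):
    """Rating-weighted sum of the feature vectors of the items a user rated."""
    profile = {}
    for item, rating in rated.items():
        for feat, value in item_features[item].items():
            profile[feat] = profile.get(feat, 0.0) + rating * value
    return profile

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical item descriptions encoded as binary genre indicators.
item_features = {
    "m1": {"action": 1, "scifi": 1},
    "m2": {"romance": 1},
    "m3": {"action": 1},
    "m4": {"romance": 1, "drama": 1},
}
rated = {"m1": 5, "m2": 1}   # the user's known ratings (hypothetical)

profile = build_profile(rated, item_features)
unseen = [m for m in item_features if m not in rated]
ranking = sorted(unseen,
                 key=lambda m: cosine(profile, item_features[m]),
                 reverse=True)
```

Note that `m3` and `m4` can be ranked although the user never rated them, which is the benefit of the content-based approach mentioned above; the sketch also shows why it depends on having a feature description for every item.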
Collaborative filtering systems, however, do not require such hard-to-obtain, well-structured item descriptions. Instead, they are based on users' preferences for items, which can carry a more general meaning than is contained in an item description. Indeed, viewers generally select a film to watch based on more criteria than only its genre, director or actors.

Collaborative filtering (CF) techniques depend only on the user's past behavior and provide personalized recommendations of items to users [26]. CF techniques take a rating matrix as input, where the rows of the matrix correspond to users, the columns correspond to items and the cells correspond to the rating given by the user to the item [20]. The rating given by a user to an item represents the degree of preference of the user towards the item. The major advantage of CF methods is that they are domain independent. CF methods are classified into model-based CF [18] and memory-based CF. Model-based techniques model ratings by describing both items and users in terms of factors induced by matrix factorization strategies. Data sparsity is a serious problem for CF methods.

Model-based approaches such as the dimensionality reduction techniques Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) are popular techniques that address the problem of sparsity. However, when certain users or items are ignored, useful information for recommendations may be lost, reducing the quality of recommendation [33]. A limitation of collaborative filtering systems is the cold start problem, i.e., these methods cannot recommend new items to existing users, as there is no past buying history for the item. Similarly, it is difficult to recommend items to a new user without knowing the user's interests in the form of ratings.

Memory-based methods compute a set of nearest neighbors for users as well as items using similarity measures like the Pearson coefficient, cosine distance and Manhattan distance. Memory-based CF techniques are further classified into user-based and item-based [37].
User-based CF [12]: User-based methods compute the similarity between users based on their ratings of items [13]. This creates a notion of user neighborhoods. These methods associate a set of nearest neighbors with each user and then predict the user's rating for unrated items using the neighbors' ratings for that item.
Item-based CF [34]: Similarly, the item neighborhood depicts the number of users who rate the same items [35]. The item rating for a given user can then be predicted based upon the ratings given in their user neighborhood and the item neighborhood. These methods associate an item with a set of similar neighbors, and then predict the user's rating for an item using the ratings given by the user to the nearest neighbors of the target item.
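The item-based prediction step can be sketched as follows. This is a minimal illustration under stated assumptions: similarity is the cosine between item rating vectors with missing ratings treated as zero, the prediction is a similarity-weighted average over the k most similar items the user has rated, and the toy ratings and function names are hypothetical. Practical systems often use mean-centred ratings or other similarity measures.

```python
import math

def item_vector(ratings, item):
    """Ratings received by an item, as a sparse vector user -> rating."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity of two sparse vectors (missing entries count as 0)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict(ratings, user, target, k=2):
    """Similarity-weighted average of the user's ratings on the k items
    most similar to the target item; None if no similar item exists."""
    target_vec = item_vector(ratings, target)
    sims = []
    for item, rating in ratings[user].items():
        if item == target:
            continue
        s = cosine(target_vec, item_vector(ratings, item))
        if s > 0:
            sims.append((s, rating))
    top = sorted(sims, reverse=True)[:k]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u1": {"i1": 5, "i2": 4},
    "u2": {"i1": 4, "i2": 5, "i3": 4},
    "u3": {"i2": 2, "i3": 1},
    "u4": {"i1": 5, "i3": 5},
}
pred = predict(ratings, "u1", "i3")
```

In this toy data `i1` is more similar to `i3` than `i2` is, so the prediction for `u1` is pulled towards the rating that `u1` gave to `i1`.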
The hybrid approach uses both types of information, collaborative and content-based. The content-boosted CF algorithm [28] uses the item profile information to recommend items to new users.
A context-aware recommendation system is able to label each user action with an appropriate context and effectively tailor the system output to the user in that given context [3]. In [17], the authors propose recommending an activity to a social network user based on demographic information. Kefalas et al. represent their data as a k-partite network modeled as a tensor and propose novel algorithms. In the same way, in [4], citations are recommended to researchers while also considering tags and time. Standard recommender system algorithms cannot address these issues.
Li et al. map transactions to a bipartite user–item interaction graph, thus converting recommendation into a link prediction problem [22]. They propose a kernel-based recommendation approach that generates random walks between user–item pairs and defines similarities between user–item pairs based on these random walks. The approach indirectly examines the customers and items related to a user–item pair to foresee the existence of an edge between them. The graph has two types of nodes, users (U) and items (I), with edges denoting transactions by users on items, and the graph kernel is defined on the context of user–item pairs.

Zhang et al. propose a model for music recommendation [40]. The authors represent the recommendation data as a bipartite graph with user and item nodes, where the weight of a link is a complex number depicting the dual preference of users in the form of like and dislike, and use it to improve the similarity between users.

In a graph-based recommender system, the task of recommendation is equivalent to predicting a link between a user and an item based on graph analysis. Link prediction in graphs is commonly a classification task that predicts the existence of a link in the future [23], whereas the recommender system problem is modeled as a regression task that predicts the rating of a link. The predicted links with high ratings are generally recommended to users. The standard recommendation techniques take the rating matrix as input and predict the empty cells in the matrix. This approach suffers severely from the sparsity of the matrix. A ranking-oriented collaborative filtering approach is more meaningful than a rating-based approach because recommendation is a ranking task that recommends the top-k ranked items to a user [9]. In some applications, such as recommending web pages, rating information may not be available.
Graph-based recommendation can efficiently utilize the heterogeneous information available in the networks by expanding the neighborhoods, and can compute the proximity between users and items [37].

A probabilistic measure for predicting future links in bipartite networks is given in [14]. Several link prediction measures for various kinds of networks are summarized in [19].

Our aim is to evaluate the efficacy of these measures in the context of recommender systems by modeling them as bipartite graphs. In this paper, we apply various existing link prediction measures on bipartite graphs in order to build a recommender system. The link prediction approach does not give the actual rating, but gives the top-k item recommendations for a user. For experimentation, we have chosen the benchmark 'Movielens' dataset. The next section gives an overview of the application of link prediction measures on bipartite graphs.
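The top-k formulation can be sketched in a few lines of Python. The score used here is a simple common-neighbour style count, namely the number of users who share at least one item with u and are also linked to the candidate item p; it is chosen purely for illustration and stands in for any link prediction measure. The toy graph and all function names are assumptions of this sketch.

```python
def two_hop_users(adj, u):
    """Users that share at least one item with u (u itself excluded)."""
    return {v for p in adj[u] for v in adj[p]} - {u}

def cn_score(adj, u, p):
    """Number of users two hops away from u that are linked to item p."""
    return len(two_hop_users(adj, u) & adj[p])

def recommend_top_k(adj, u, items, k):
    """Rank the items u has no edge to and return the k best-scoring ones."""
    candidates = [p for p in items if p not in adj[u]]
    scored = [(cn_score(adj, u, p), p) for p in candidates]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))  # high score first, name breaks ties
    return [p for s, p in scored[:k] if s > 0]

# Hypothetical purchase graph stored as adjacency sets for both node types.
adj = {
    "u1": {"i1", "i2"},
    "u2": {"i1", "i2", "i3"},
    "u3": {"i2", "i3", "i4"},
    "u4": {"i5"},
    "i1": {"u1", "u2"},
    "i2": {"u1", "u2", "u3"},
    "i3": {"u2", "u3"},
    "i4": {"u3"},
    "i5": {"u4"},
}
items = ["i1", "i2", "i3", "i4", "i5"]
top2 = recommend_top_k(adj, "u1", items, k=2)
```

No rating is predicted at any point: the output is only an ordered list of items, which is what distinguishes this formulation from the regression view of recommendation.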
Graph-based recommendation algorithms compute recommendations on a bipartite graph in which nodes represent users and items, and an edge forms between a user node and an item node when the user buys the item. Once the transactions are modeled as a graph, all the standard link prediction methods defined in [23], [25] can be applied directly on the graph to predict future links [21]. All the measures used in [23] are based on neighbors and paths. In particular, common neighbors, which lie on paths of length two, play an important role. All the measures are extended to heterogeneous environments in [19], with bipartite networks mentioned as a special case.

We follow notation similar to that of [19], given below.
– $(u, p)$ is a pair of user and product nodes without an edge between them.
– $k$ represents the hop distance between two nodes and $t$ denotes time.
– $\Gamma_k(x)$ is the set of $k$-hop neighbors of node $x$. $\Gamma_1(x)$ refers to the set of all nodes connected by an edge of any type to $x$, and is generally written as $\Gamma(x)$.
– $\Gamma(u) \cap \Gamma(p)$ denotes the set of common neighbors of nodes $u$ and $p$. In a bipartite graph, this set is empty.
– $\Gamma_k(u) \cap \Gamma_k(p)$ contains all the common neighbors within $k$-hop distance of nodes $u$ and $p$. In a bipartite graph, $k$ needs to be even for a path to start and end at nodes of the same type, and odd for a path to start and end at nodes of different types. As a recommender system tries to recommend items to users, the odd-length paths are meaningful.
– $P_k(u, p)$ denotes the set of paths connecting $u$ and $p$ by at most $k$ edges.

4.1 Baseline link prediction measures in bipartite graph

In recommendation systems modeled as a bipartite graph, the task is to recommend products to users. Product nodes and user nodes are connected by odd-length paths. Even-length paths connect nodes of the same type, which is not meaningful in this scenario. Keeping this in mind, the baseline link prediction measures for bipartite graphs are given below. We use the suffix B with all link prediction measures to indicate that they are used in the context of a bipartite network.

– Common Neighbors (CN): The common neighbor measure in a bipartite graph is given as follows:
$$CN_B(u, p) = |\Gamma_2(u) \cap \Gamma(p)| \tag{1}$$
The Jaccard Coefficient, Adamic Adar and Preferential Attachment measures are also neighborhood-based and are defined in a similar way, as follows.
– Jaccard Coefficient (JC): The Jaccard Coefficient is the normalized CN measure.
$$JC_B(u, p) = \frac{|\Gamma_2(u) \cap \Gamma(p)|}{|\Gamma_2(u) \cup \Gamma(p)|} \tag{2}$$
– Adamic Adar (AA): This measure gives importance to the common neighbors with low degree. The following definition for bipartite networks has been hinted at in [10].
$$AA_B(u, p) = \sum_{z \in \Gamma_2(u) \cap \Gamma(p)} \frac{1}{\log(|\Gamma(z)|)} \tag{3}$$
– Preferential Attachment (PA): This measure does not change in the bipartite environment, because it is concerned only with the degree of the nodes, whatever their type may be.
$$PA_B(u, p) = |\Gamma(u)| * |\Gamma(p)| \tag{4}$$
– Katz (KZ): This measure is based on the total number of paths between $u$ and $p$, bounded by a limit and penalized by path length.
$$KZ_B(u, p) = \sum_{l} \beta^{l} \, |P_l(u, p)| \tag{5}$$
– Page Rank (PR): This measure can be extended to heterogeneous networks by including the heterogeneous edges in the random walk.
$$PR_B(u) = \frac{1 - \alpha}{|E|} + \alpha \sum_{z \in \Gamma(u)} \frac{PR(z)}{|\Gamma(z)|} \tag{6}$$
where $|E|$ is the total number of links in $G$.
– Rooted Page Rank (RPR): In order to make the PR measure symmetric, Rooted Page Rank is computed for an edge $(u, p)$ as follows.
$$RPR_B(u, p) = PR(u, p) + PR(p, u) \tag{7}$$
– PropFlow (PF): This is a random walk beginning at node $u$ and ending at $p$ within $l$ steps. The random walk from node $u$ to $p$ terminates either when it reaches $p$ or when it revisits any node. $PF_B(u, p)$ is the probability of information flow from $u$ to $p$ based on random transmission along all paths, defined recursively as follows.
$$PF_B(u, p) = \sum_{l=1}^{L} \; \sum_{path \in paths_l(u, p)} \; \sum_{\forall (z_1, z_2) \in path} PF(z_1, z_2) \tag{8}$$
where
$$PF_B(z_1, z_2) = PF(a, z_1) * \frac{w(z_1, z_2)}{\sum_{z \in \Gamma(z_1)} w(z_1, z)} \tag{9}$$
with $a$ the node preceding $z_1$ in the random walk, $PF(a, z_1) = 1$ if $a$ is the starting node, and $paths_l(u, p)$ the set of paths of length $l$ between $u$ and $p$. In a bipartite network, $l$ is odd.
$$PF_B(u, p) = \sum_{l=1}^{L} \; \sum_{path \in P_l(u, p)} \; \sum_{\forall (z_1, z_2) \in path} PF_B(z_1, z_2) \tag{10}$$
where $PF_B(z_1, z_2)$ is as defined in Eq. 9.

4.2 Temporal measures for link prediction in bipartite graph

Time-score ($TS_B$)
We extend the Time-Score measure proposed for homogeneous networks [31] to the bipartite environment as follows:
$$TS_B(u, p) = \sum_{path \in P(u, p)} \frac{w(path) * \beta^{\,r(path)}}{|latest(path) - oldest(path)| + 1} \tag{11}$$
where $w(path)$ is equal to the harmonic mean of the edge weights of the edges in $path$, $\beta$ is a damping factor ($0 < \beta < 1$), $latest(path) = \max_{e \text{ on } path} t(e)$, $oldest(path) = \min_{e \text{ on } path} t(e)$ and $r$ is a recency factor, defined as $r(path) = current\_time - latest(path)$.

Link-score ($LS_B$)
Choudhary et al. extend the Time-score measure to obtain a path-based measure called Link-score [6] for homogeneous networks. To obtain the Link-score between a pair of nodes $u$ and $p$ that are not directly connected, the authors define a Time Path Index ($TPI$) on each path between the nodes $u$ and $p$. $TPI$ evaluates the path weight based on the time stamps of the links involved in a path. Link-score is the sum of the $TPI$ of each path between the nodes $u$ and $p$. We extend Link-score to bipartite networks by considering only paths of odd length between two nodes instead of paths of any length. The modified definitions of $TPI$ and Link-Score are given in equations 12 and 13.
$$TPI_{path} = \frac{w(path) * \beta^{\,(current\_time - avg(path))}}{|current\_time - latest(path)| + 1} \tag{12}$$
where $avg(path)$ is the average active year, which is the average of the years of the most recent interaction of the edges on the odd-length path, and all other terms are as defined in equation 11.
$$LS_B(u, p) = \sum_{l=3}^{L} \frac{Avg(TPI_{P_l(u, p)})}{l - 1} \tag{13}$$
where $L$ is the maximum length of paths between nodes $u$ and $p$.

T_Flow ($TF_B$)
T_Flow [30] is a random-walk based measure, which is an extension of the PropFlow measure defined in [25]. Munasinghe et al. [30] define T_Flow to compute the information flow between a pair of nodes $u$ and $p$ through all random walks starting from node $u$ and ending at node $p$, including link weights as well as the activeness of links, by recursively giving more weight to recently formed links and taking the summation. We extend the T_Flow measure to bipartite networks as follows:
$$TF_B(u, p) = \sum_{i=1}^{l} \; \sum_{path \in P_l(u, p)} \; \sum_{\forall e \in path} TF_B(z_1, z_2) * (1 - \alpha)^{r(path)} \tag{14}$$
If $(u, p) \in E$, then $TF_B(u, p)$ is given by
$$TF_B(u, p) = TF_B(a, u) * \frac{w(u, p)}{\sum_{z \in \Gamma(u)} w(u, z)} * (1 - \alpha)^{r(path)} \tag{15}$$
where $t_u$ is the time stamp of the link when the random walk visits the node $u$ and $t_p$ is the time stamp of the link when the random walk visits node $p$.

4.3 Probabilistic measure for link prediction in bipartite graph

A Probabilistic Graphical Model (PGM) represents the structure of a graph in a natural way by considering the nodes of the graph as random variables and the edges as dependencies between them. By representing a graph as a PGM, the problem of link prediction, which is the calculation of the probability of link formation between two nodes $u$ and $p$, is translated into computing the joint probability of the random variables $U$ and $P$.

Recommender systems represented as bipartite networks are large in size. Therefore, finding the joint probabilities of link formation is intractable.
But the links in the graphs are also sparse, with nodes generally directly connected to only a few other nodes. For example, on e-commerce sites, out of hundreds of thousands of items, a user buys only a few. This property allows the PGM distribution to be represented tractably. The model of this framework is simple to understand. Inference between two random variables is the same as finding the joint probability of those two random variables in the PGM. Many algorithms are available for computing the joint probability of variables given evidence on others. These inference algorithms work directly on the graph structure and are generally faster than computing the joint distribution explicitly. With all these advantages, link prediction in a PGM is more effective.

In most cases, the pair-wise interactions of entities are available in the event logs. For example, a user buying/rating three items, say $p_1$, $p_2$ and $p_3$, is available in the corresponding transaction database. In order to model the unknown distribution of co-occurrence probabilities, the events available in the event logs can be used as constraints to build a model for the unknown distribution. Probabilistic Graphical Models efficiently utilize this higher-order topological information and are thus efficient in the link prediction task [39].

The probabilistic model helps in estimating the joint probability distribution over the variables of the model [11]. That is, a probabilistic model represents the probability distribution over all possible deterministic states of the model. This facilitates the inference of the marginal probabilities of the individual variables of the model.

Wang et al. [39] are among the first researchers who modeled the problem of link prediction using Markov Random Fields (MRFs). Kashima et al. [16] propose a probabilistic model of network evolution and apply it to predicting future links. The authors show that by intelligently selecting the parameters in an evolution model, the problem of network evolution reduces to the problem of link prediction. Clauset et al. [8] propose a probabilistic model based on the hierarchical structure of the network. In a hierarchical structure, the vertices are divided into groups that further subdivide into groups of groups, and so on. The model infers the hierarchical structure from network data and can be used for the prediction of missing links. The learning task uses the observed network data and infers the most likely hierarchical structure through statistical inference.

The models of Kashima et al. [16] and Clauset et al. [8] are global models and are not scalable to large networks. The method of Wang et al. [39] uses local probabilistic information of graphs for link prediction, and we adopt their method in our work. The following section explains the algorithm for link prediction using MRFs proposed by Wang et al. [39].

Wang et al. [39] propose a measure called Co-occurrence Probability (COP), to be computed between a pair of nodes without an edge between them. The procedure for computing COP has three steps. The first step chooses a small set of nodes which may play a significant role in future link formation. The second step computes a Markov Random Field with the nodes chosen in the first step. The third step infers the joint probability between the two nodes using the constructed MRF. The measure has been extended to temporal networks in [15]. A heterogeneous version of the measure is defined in [14].

Though Hetero-COP has been defined in [14], the algorithm has to be reworked for the bipartite context. In this paper, we extend the COP measure to the bipartite environment and name it B-COP. The B-COP of a missing link $(u, p)$ is computed along the lines of COP, as given in Algorithm 1. The extended computation procedure for bipartite networks is given in the subsequent sections.
Algorithm 1 B-COP measure for Link Prediction in bipartite graphs
Input: $G = (V, E)$, where $V = V_1 \cup V_2$ is the set of the two types of nodes and $E$ is the set of links from nodes of $V_1$ to $V_2$.
Output: B-COP$(u, p)$, where $(u, p)$ is a missing link.
Step 1: Extract B-cliques from G, from the event logs, using Algorithm 3. Call it BCliq.
Step 2: Compute BCNS, the central neighborhood set of (u, p), using Algorithm 2.
Step 3: Extract the B-cliques formed with the nodes in BCNS and compute the clique potentials. This forms the local MRF with the nodes in BCNS.
Step 4: Return B-COP(u, p), which is the joint probability of the link (u, p), computed using the junction tree algorithm.

A clique in a bipartite graph can be considered as a complete bipartite sub-graph. A sample B-clique is shown in Fig. 4.

Fig. 4: An example B-clique.

5.1 Computation of Central Neighborhood Set in Bipartite graph (BCNS)

BCNS(u, p) is computed using a breadth-first search based algorithm on the bipartite graph as follows. All paths between u and p are obtained using the breadth-first search (BFS) algorithm. Then, the frequency score of each path is computed by summing the occurrence counts of the nodes on the path. The occurrence count of a node is the number of times the node appears in all paths. The paths are then ordered in increasing order of length, with paths of equal length ordered in decreasing order of frequency score. The size of the central neighborhood set is further restricted by considering only the top k nodes. The procedure for computing BCNS(u, p) is described in Algorithm 2. The computation of BCNS for the toy example given in Fig. 5 is shown in Table 1.

Algorithm 2 Bipartite Central Neighborhood Set(G, u, p, l, maxSize)
Input: G: a graph; u: starting node; p: ending node; l: maximum path length; maxSize: Central Neighborhood Set size threshold
Output: BCNS, the Bipartite Central Neighborhood Set between u and p
Step 1: Compute paths of length ≤ l between u and p.
Step 2: Find the occurrence count $O_k$ of each node k in the paths between u and p.
Step 3: Compute the frequency-score $F_p$ of each path p as follows: $F_p = \sum_{k \in p} O_k$. The frequency-score of a path is the sum of the occurrence counts of all nodes along the path.
Step 4: Sort the paths in increasing order of path length and then in decreasing order of frequency-score. Let the ranked list of paths be P.
Step 5:
while size(BCNS) ≤ maxSize do
    Add nodes of path p ∈ P to BCNS
end while
return BCNS
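The steps of Algorithm 2 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: paths are restricted to simple paths, ties between paths of equal length and score are broken by node labels, and nodes are added one at a time so that the result never exceeds maxSize. The toy graph and the function names are hypothetical, and enumerating all paths is only practical for small values of l.

```python
from collections import Counter, deque

def simple_paths(adj, u, p, max_len):
    """All simple paths from u to p with at most max_len edges,
    found by a breadth-first expansion of partial paths (Step 1)."""
    paths, queue = [], deque([[u]])
    while queue:
        path = queue.popleft()
        if path[-1] == p:
            paths.append(path)
            continue
        if len(path) - 1 == max_len:
            continue
        for nxt in sorted(adj[path[-1]]):
            if nxt not in path:
                queue.append(path + [nxt])
    return paths

def bcns(adj, u, p, max_len, max_size):
    """Bipartite central neighbourhood set of the pair (u, p)."""
    paths = simple_paths(adj, u, p, max_len)
    occ = Counter(node for path in paths for node in path)          # Step 2
    score = {tuple(q): sum(occ[n] for n in q) for q in paths}       # Step 3
    ranked = sorted(paths,                                          # Step 4
                    key=lambda q: (len(q), -score[tuple(q)], q))
    result = []                                                     # Step 5
    for path in ranked:
        for node in path:
            if node not in result:
                if len(result) == max_size:
                    return result
                result.append(node)
    return result

# Hypothetical user-item graph; there is no edge between u1 and i3.
adj = {
    "u1": {"i1", "i2"},
    "u2": {"i1", "i2", "i3"},
    "u3": {"i2", "i3", "i4"},
    "i1": {"u1", "u2"},
    "i2": {"u1", "u2", "u3"},
    "i3": {"u2", "u3"},
    "i4": {"u3"},
}
central = bcns(adj, "u1", "i3", max_len=3, max_size=5)
```

Because the end points u and p lie on every path, including them in the frequency score adds the same constant to all paths and does not change the ordering among paths of equal length.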
Fig. 5: A toy example for illustrating the computation of BCNS between nodes u and p. Node weights are the occurrence counts.

Table 1: All paths between nodes u and p in Fig. 5 and their frequency scores.
path (p)                    frequency-score ($F_p$)
u − p − u − p               O_p + O_u = 6
u − p − u − p               O_p + O_u = 5
u − p − u − p               O_p + O_u = 4
u − p − u − p − u − p       O_p + O_u + O_p + O_u = 9

Table 1 shows all paths between the nodes u and p of the graph in Fig. 5, sorted in increasing order of path length and then in decreasing order of frequency score, along with their frequency-scores. One can observe that all these paths are of odd length. For the bipartite graph in Fig. 5, BCNS(u, p) = {u, p, u, p, p} if maxSize is taken as 5.

5.2 Construction of local MRF

After computing the BCNS, the B-cliques containing only nodes of the BCNS are extracted. This forms the clique graph of the local MRF. MRF construction needs the computation of clique potentials. The clique potential table of a B-clique is computed using the NDI.

In most cases, information on homogeneous cliques, which contain nodes of only one type, is available in the event logs. For example, in coauthorship networks, the group of authors who publish a paper together forms a homogeneous clique of author nodes, and an author who publishes a paper in a conference forms a heterogeneous edge between the author node and the conference node. In the case of recommender systems, however, user cliques and item cliques are not readily available in the event logs. We propose an algorithm for extracting B-cliques, shown in Algorithm 3. Since the number of users is huge in comparison to the much smaller set of items, the extraction of B-cliques can start from the item cliques.

Algorithm 3 Extraction of B-Cliques from user-item bipartite graph
Input: G = (V, E) where V = U ∪ I, U is the set of user nodes, I is the set of item nodes and E ⊆ U × I represents weighted edges.
Output: Bcliq, set of maximal B-cliques of G.
Step 1: Extract the set of all item cliques, Item_Cliq, as follows:
    Item_Cliq = φ
    for each user u ∈ U do
        I_u = φ
        for each i ∈ I and (u, i) ∈ E do
            I_u = I_u ∪ {i}
        end for
        // I_u is formed by taking all the items to which u gives a rating.
        Item_Cliq = Item_Cliq ∪ {I_u}
    end for
Step 2: Extract the set of all user cliques, User_Cliq, as follows:
    User_Cliq = φ
    for each item i ∈ I do
        U_i = φ
        for each u ∈ U and (u, i) ∈ E do
            U_i = U_i ∪ {u}
        end for
        // U_i contains all users that have rated item i.
        User_Cliq = User_Cliq ∪ {U_i}
    end for
Step 3:
    Bcliq = φ
    for each item i in I do
        J ← I
        for each user v in U_i do
            J ← J ∩ I_v
        end for
        // J = ∩_{v ∈ U_i} I_v
        Bcliq_i = J ∪ U_i
        Bcliq = Bcliq ∪ {Bcliq_i}
    end for
return Bcliq
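The three steps of Algorithm 3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: edges are given as (user, item) pairs with the rating weights ignored, each B-clique is stored as a pair (users, items) so that the two node types stay distinguishable, and the function name and the sample edges are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# A B-clique is represented here as (set of users, set of items).
BClique = Tuple[FrozenSet[str], FrozenSet[str]]


def extract_b_cliques(edges: Iterable[Tuple[str, str]]) -> Set[BClique]:
    """Extract one B-clique per item from (user, item) rating edges."""
    items_of: Dict[str, Set[str]] = defaultdict(set)  # Step 1: I_u for every user u
    users_of: Dict[str, Set[str]] = defaultdict(set)  # Step 2: U_i for every item i
    for user, item in edges:
        items_of[user].add(item)
        users_of[item].add(user)

    b_cliques: Set[BClique] = set()
    for item, users in users_of.items():
        # Step 3: J is the intersection of I_v over all users v in U_i.
        common_items = set.intersection(*(items_of[v] for v in users))
        b_cliques.add((frozenset(users), frozenset(common_items)))
    return b_cliques


# Hypothetical rating edges.
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i2"),
         ("u3", "i2"), ("u3", "i3")]
cliques = extract_b_cliques(edges)
```

Storing the result as a set of frozen pairs removes duplicates automatically when two items yield the same B-clique.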
The B-clique extraction algorithm first extracts all homogeneous cliques of items, Item_Cliq, and of users, User_Cliq. A homogeneous user clique is formed by all the users who rate/buy the same item. Similarly, a homogeneous clique of items is formed by all the items that a user rates/buys. To extract a B-clique, first consider an item i. Then compute the items rated in common by all the users v who have rated i. The union of U_i with the items rated in common by all users in U_i forms a B-clique. The process of extracting B-cliques for the toy example in Fig. 6 is illustrated below:

Fig. 6: User-Item bipartite graph

U = { u , u , u , u , u }
I = { i , i , i , i , i }

Item cliques:
Item clique corresponding to user u = { i , i }
Item clique corresponding to user u = { i , i }
Item clique corresponding to user u = { i , i , i , i }
Item clique corresponding to user u = { , i , i }
Item clique corresponding to user u = { , i , i }

User cliques:
User clique corresponding to item i = { u , u }
User clique corresponding to item i = { u , u , u }
User clique corresponding to item i = { u , u , u }
User clique corresponding to item i = { u , u , u }
User clique corresponding to item i = { u , u , u }

The extraction of B-cliques from the above homogeneous cliques is shown in Fig. 7.
Fig. 7: Illustration of extracting B-cliques from User-Item event logs

B-COP score

Once the local MRF of a pair of nodes u and p is constructed, the B-COP score between the nodes is obtained using the junction tree inference algorithm. Note that the B-COP score for a link u-p cannot be computed if u and p are in disjoint cliques, as there exists no path connecting these cliques.

The experimental evaluation of the proposed approach is given in the next section.

The experimentation is carried out on the benchmark MovieLens recommender system data set, whose details are given below.

6.1 Dataset

The data set used for experimental evaluation contains more than ten million ratings given by 71,567 users on 10,681 movies of the online movie recommender service MovieLens [1]. In the graph representation of recommender systems, users and movies (items) are considered as nodes and the ratings given by users to the movies are considered as the weights on the links. In the MovieLens dataset, the train-test splits are given in [1]. Users who have rated at least 20 movies have been chosen randomly to be included in this dataset. The benchmark data set [2] is given with an 80%-20% split, with 80% as the training set and 20% as the test set. The training and test sets are formed by splitting the ratings data such that, for every user, 80% of his/her ratings are placed in the training set and the rest in the test set. In this experimentation, 5-fold cross validation is used. All 5 sets of training and test datasets are made available at [2]. The evaluation metrics used for recommender systems are given in the following section.

6.2 Evaluation Metrics

Evaluation metrics in recommender systems can be classified as
– Accuracy measures: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Normalized Mean Absolute Error (NMAE).
– Set recommendation metrics: Precision, Recall, Area Under Receiver Operating Characteristic (AUROC) and Area Under Precision-Recall curve (AUPR).
– Rank recommendation metrics: Half-life, discounted cumulative gain and Rank-score.
Most of the measures listed above use the rating to calculate the error and hence are not applicable in our context. We use AUROC, AUPR and Rank-score for evaluating performance.
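Because link prediction scores are used only to rank candidate user-item links, AUROC can be computed directly from the scores without any rating value. The sketch below uses the standard pairwise interpretation of AUROC (the probability that a randomly chosen formed link is scored above a randomly chosen non-formed link, with ties counted as one half). The function name and the sample scores are hypothetical, and the quadratic loop is meant for illustration only, not for a data set of MovieLens size.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: label 1 marks a link present in the test set, 0 an absent one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four candidate links.
example = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```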
Rank-score

The Rank-score metric measures the ability of a recommendation algorithm to produce a ranked list of recommended items. A recommender system method is effective if the ranking given by the method matches the user's order of buying the items in the recommended list. Rank-score is defined as follows. For a user u, the quality of the list of items recommended to u by the algorithm is captured by rankscore_p:

$$rankscore_{max} = \sum_{j=1}^{|T|} 2^{-\frac{j-1}{\alpha}}$$

$$rankscore_{p} = \sum_{j \in T} 2^{-\frac{rank(j)-1}{\alpha}}$$

$$Rankscore = \frac{rankscore_{p}}{rankscore_{max}}$$

where rank(j) is the rank given by the recommender algorithm to item j, T is the set of items of interest and α is the ranking half-life, an exponential reduction factor that can take any numeric value.

6.3 Results and Discussion

The prediction scores of all baseline link prediction measures are computed using the tool LPMade [24], with default parameters. In the computation of
TS_B, LS_B and T PI, the damping factor β is taken as 0.5. The decay factor α is taken as 0.1 in the computation of TF_B. In TCOP, a maximum path length of 10 is used for computing BCNS. The accuracy of all these link prediction measures is compared with the standard User-based and Item-based collaborative filtering (CF) methods. The Pearson correlation coefficient is used to find similar users in User-based CF and cosine similarity is used for finding item similarity in Item-based CF.

The results obtained for AUROC, AUPR and Rank-score on the MovieLens dataset are given in Table 2.

The first observation in this experimentation is that some of the link prediction measures, like Katz and PropFlow, could produce better recommendations than Item-based CF. The use of temporal measures seems to help in improving the quality of recommendations. The time of formation of a link, that is, the time at which a user rates a movie, plays a crucial role, as the user's preferences change over time. The temporal measures TS, LS, TF and TCOP assign more weight to the recent ratings. Therefore, temporal measures performed better than all the other measures, including User-based CF and Item-based CF. Fig. 8 depicts a situation where non-temporal measures predict a link between the nodes 1 and 1080 and temporal measures predict correctly that the link will not be formed. This is clearly evident from the fact that B − Cliq
Table 2: Performance of LP measures for recommending movies to users in the MovieLens bipartite network

LP measure                  Rank-score   AUROC    AUPR
Non-temporal LP measures    CN_B, JC_B, AA_B, PA_B, KZ_B, PF_B, COP_B
Temporal LP measures        TS_B, LS_B, TF_B, TCOP_B
methods
User-based CF               3.9942       0.6925   0.0415
Item-based CF               3.0274       0.7136   0.0491

Fig. 8: A snapshot from the MovieLens dataset where the predicted link is not formed.

Table 2 shows that the AUPR score has also improved greatly, from 0.0960 (of TF) to 0.2351. A similar trend is observed for the evaluation measure of Rank-score. All the temporal measures perform better than User-based CF and Item-based CF. TCOP is rated highest by Rank-score.
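The Rank-score metric of Section 6.2 can be sketched in Python as follows. This sketch assumes the usual half-life form of the metric, in which an item of interest at rank r contributes 2^(-(r-1)/α) and the total is normalized by the best achievable value. The function name, the convention that items of interest missing from the ranked list contribute nothing, and the sample ranking are assumptions made for illustration.

```python
from typing import Sequence, Set


def rank_score(ranked_items: Sequence[str], relevant: Set[str], alpha: float) -> float:
    """Half-life rank score: achieved utility divided by the maximum achievable utility."""
    # rank(j): 1-based position given by the recommender to item j.
    rank_of = {item: position for position, item in enumerate(ranked_items, start=1)}

    # rankscore_p: sum over items of interest that appear in the ranked list.
    achieved = sum(2 ** (-(rank_of[j] - 1) / alpha) for j in relevant if j in rank_of)

    # rankscore_max: all |T| items of interest placed at ranks 1..|T|.
    best = sum(2 ** (-(j - 1) / alpha) for j in range(1, len(relevant) + 1))
    return achieved / best if best > 0 else 0.0


# Hypothetical ranked recommendation list with two items of interest.
score = rank_score(["a", "b", "c", "d"], {"a", "c"}, alpha=1.0)
```

A smaller α makes the contribution of an item decay faster with its rank, so the score rewards placing items of interest near the top of the list.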
Fig. 9: A snapshot from the MovieLens dataset where the predicted link is formed.
In this paper, the recommender systems problem is solved using a link prediction approach. The link prediction approach is scalable as it is based on the local neighborhood of the large sparse graph. The standard recommender system approaches do not utilize the temporal information available on the links effectively. In this work, some extensions are proposed to existing temporal measures in bipartite graphs. One of the main contributions of this work is an algorithm for computing the temporal cooccurrence probability measure on bipartite graphs and its application to a movie recommendation system. Temporal measures for link prediction such as Time score, Link Score, T_Flow and the Temporal cooccurrence measure achieve improvement in recommendation quality by utilizing this temporal information more efficiently. However, the link prediction approach to recommender systems does not address the cold start problem. In the future, we would like to work on predicting the actual rating of a link.
References
1. https://grouplens.org/datasets/, 2009.
2. http://files.grouplens.org/datasets/movielens/ml-10m-README.html, 2009.
3. Gediminas Adomavicius, Bamshad Mobasher, Francesco Ricci, and Alexander Tuzhilin. Context-aware recommender systems. AI Magazine, 32(3):67–80, 2011.
4. Zafar Ali, Guilin Qi, Pavlos Kefalas, Waheed Ahmad Abro, and Bahadar Ali. A graph-based taxonomy of citation recommendation models. Artificial Intelligence Review, pages 1–44, 2020.
5. Robert M Bell and Yehuda Koren. Lessons from the Netflix prize challenge. ACM SIGKDD Explorations Newsletter, 9(2):75–79, 2007.
6. Pankaj Choudhary, Nishchol Mishra, Sanjeev Sharma, and Ravindra Patel. Link score: A novel method for time aware link prediction in social network. ICDMW, 2013.
7. Michael Chui. Artificial intelligence the next digital frontier? McKinsey and Company Global Institute, 47:3–6, 2017.
8. Aaron Clauset, Cristopher Moore, and Mark EJ Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453(7191):98, 2008.
9. Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys '10, pages 39–46. ACM, 2010.
10. Darcy A. Davis, Ryan Lichtenwalter, and Nitesh V. Chawla. Supervised methods for multi-relational link prediction. Social Network Analysis and Mining, 3(2):127–141, 2013.
11. Marek J. Druzdzel. Some properties of joint probability distributions. In Proceedings of the Tenth International Conference on Uncertainty in Artificial Intelligence, UAI'94, pages 187–194. Morgan Kaufmann Publishers Inc., 1994.
12. Zan Huang, Wingyan Chung, and Hsinchun Chen. A graph model for e-commerce recommender systems. J. Am. Soc. Inf. Sci. Technol., 55(3):259–274, 2004.
13. Zan Huang, Wingyan Chung, and Hsinchun Chen. A graph model for e-commerce recommender systems. Journal of the American Society for Information Science and Technology, 55(3):259–274, 2004.
14. T. Jaya Lakshmi and S. Durga Bhavani. Link prediction in temporal heterogeneous networks. In Intelligence and Security Informatics, pages 83–98. Springer International Publishing, 2017.
15. T. Jaya Lakshmi and S. Durga Bhavani. Temporal probabilistic measure for link prediction in collaborative networks. Applied Intelligence, 47(1):83–95, Jul 2017.
16. H. Kashima, T. Kato, Y. Yamanishi, M. Sugiyama, and K. Tsuda. Link propagation: A fast semi-supervised learning algorithm for link prediction. In Proceedings of the 2009 SIAM International Conference on Data Mining, pages 1099–1110. Philadelphia, PA, USA, May 2009.
17. Pavlos Kefalas, Panagiotis Symeonidis, and Yannis Manolopoulos. A graph-based taxonomy of recommendation algorithms and systems in LBSNs. IEEE Transactions on Knowledge and Data Engineering, 28(3):604–622, 2015.
18. Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.
19. T Jaya Lakshmi and S Durga Bhavani. Link prediction measures in various types of information networks: a review. In , pages 1160–1167. IEEE, 2018.
20. Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. Recommendation Systems, pages 292–324. Cambridge University Press, 2nd edition, 2014.
21. Jing Li, Lingling Zhang, Fan Meng, and Fenhua Li. Recommendation algorithm based on link prediction and domain knowledge in retail transactions. Procedia Computer Science, 31:875–881, 2014.
22. Xin Li and Hsinchun Chen. Recommendation as link prediction in bipartite graphs: A graph kernel-based machine learning approach. Decision Support Systems, 54(2):880–890, 2013.
23. David Liben-Nowell and Jon Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007.
24. Ryan N. Lichtenwalter and Nitesh V. Chawla. LPmade: Link prediction made easy. Journal of Machine Learning Research, 12:2489–2492, 2011.
25. Ryan N. Lichtenwalter, Jake T. Lussier, and Nitesh V. Chawla. New perspectives and methods in link prediction. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD'10, pages 243–252. ACM, 2010.
26. Greg Linden, Brent Smith, and Jeremy York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76–80, 2003.
27. Jiahui Liu, Peter Dolan, and Elin Rønby Pedersen. Personalized news recommendation based on click behavior. In Proceedings of the 15th International Conference on Intelligent User Interfaces, pages 31–40, 2010.
28. Prem Melville, Raymond J. Mooney, and Ramadass Nagarajan. Content-boosted collaborative filtering for improved recommendations. In Eighteenth National Conference on Artificial Intelligence, pages 187–192. American Association for Artificial Intelligence, 2002.
29. Raymond J. Mooney and Loriene Roy. Content-based book recommending using learning for text categorization. In Proceedings of the Fifth ACM Conference on Digital Libraries, DL '00, pages 195–204. ACM, 2000.
30. Lankeshwara Munasinghe. Time-aware methods for link prediction in social networks. PhD Thesis, The Graduate University for Advanced Studies, 2013.
31. Lankeshwara Munasinghe and Ryutaro Ichise. Time aware index for link prediction in social networks. In DaWaK, volume 6862 of Lecture Notes in Computer Science, pages 342–353. Springer, 2011.
32. Michael J. Pazzani and Daniel Billsus. The adaptive web. chapter Content-based Recommendation Systems, pages 325–341. Springer-Verlag, 2007.
33. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Analysis of recommendation algorithms for e-commerce. In Proceedings of the 2nd ACM Conference on Electronic Commerce, EC '00, pages 158–167. ACM, 2000.
34. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, WWW '01, pages 285–295. ACM, 2001.
35. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, pages 285–295, 2001.
36. J. Ben Schafer, Joseph A. Konstan, and John Riedl. E-commerce recommendation applications. Data Mining and Knowledge Discovery, 5(1):115–153, 2001.
37. Bita Shams and Saman Haratizadeh. Graph-based collaborative ranking. Expert Syst. Appl., 67(C):59–70, 2017.
38. André Calero Valdez, Martina Ziefle, Katrien Verbert, Alexander Felfernig, and Andreas Holzinger. Recommender systems for health informatics: state-of-the-art and future perspectives. In Machine Learning for Health Informatics, pages 391–414. Springer, 2016.
39. Chao Wang, Venu Satuluri, and Srinivasan Parthasarathy. Local probabilistic models for link prediction. In Proceedings of Seventh IEEE International Conference on Data Mining, ICDM '07, pages 322–331. IEEE Computer Society, 2007.
40. Lingling Zhang, Minghui Zhao, and Daozhen Zhao. Bipartite graph link prediction method with homogeneous nodes similarity for music recommendation.