Reliable graph-based collaborative ranking

Abstract

GRank is a recent graph-based recommendation approach that uses a novel heterogeneous information network to model users' priorities and analyzes it to directly infer a recommendation list. Unfortunately, GRank neglects the semantics behind different types of paths in the network, and in the process it may use unreliable paths that are inconsistent with the general idea of similarity in neighborhood collaborative ranking. That negligence undermines the reliability of the recommendation list generated by GRank. This paper presents a novel framework for reliable graph-based collaborative ranking, called ReGRank, that ranks items based on reliable recommendation paths that are in harmony with the semantics behind different approaches in neighborhood collaborative ranking. To our knowledge, ReGRank is the first unified framework for neighborhood collaborative ranking that, in addition to traditional user-based collaborative ranking, can also be adapted for preference-based and representative-based collaborative ranking. Experimental results show that ReGRank significantly improves on the state-of-the-art neighborhood and graph-based collaborative ranking algorithms.


Bita Shams and Saman Haratizadeh
Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran


Keywords:

Collaborative ranking, Pairwise preferences, Heterogeneous networks, Meta-path analysis, Neighborhood recommendation

1. Introduction

Recommender systems are information filtering systems that help people by filtering for the items that are of interest to them. Although recommender systems can be designed to make recommendations based on distributed knowledge in peer-to-peer architectures [4,6], here we focus on the standard centralized recommendation service architecture that analyzes a centralized knowledge base for making recommendations. Collaborative filtering is the main class of recommendation algorithms; it exploits users' historical feedback to predict and recommend what users will want in the future. Traditional collaborative filtering algorithms, known as collaborative rating methods, build a model that predicts how users will rate items. More recently, collaborative filtering algorithms have been directed toward learning users' rankings over items. This latter approach, called collaborative ranking, has gained more attention because recommendation algorithms should improve the quality of Top-N recommendation, which is inherently a ranking task [13,26].

Neighborhood collaborative ranking algorithms, also called NC-Rank methods, typically calculate users' similarities through the Kendall correlation and its variants, which take into account the number of agreements and disagreements of users over common pairwise comparisons of items, and find the users most similar to the target user. They then analyze the pairwise comparisons of those similar users to predict the ranking of items for the target user [1,13,18,30]. NC-Rank algorithms generally run into trouble on sparse datasets, where the information is not rich enough to calculate reliable similarities and to find useful neighborhoods for estimating users' preferences. More precisely, in sparse datasets users rarely have common pairwise preferences, and so most of the estimated similarity values obtained by Kendall and its variants [12,30] will be equal to zero [18].
Even if two users do share a set of common pairwise preferences, it is not large enough to ensure that the similarity values are reliable [18,19]. In such situations, NC-Rank algorithms form the neighborhood essentially at random, and so they fail to infer the true ranking of items for the target user [18].

Graph-based collaborative ranking (GRank) [19] resolves this sparsity issue by calculating extended similarities among entities in a heterogeneous information network, called the tripartite preference graph (TPG), that reflects the priorities of users over items. TPG captures the relations between the entities of neighborhood collaborative ranking, namely users, pairwise comparisons and items' representatives (which refer to the winner or loser sides of items). GRank uses the personalized PageRank algorithm on this graph to propagate rank from a target node to other nodes via the existing paths in the network. The amount of rank propagated from a target user node to an item node is then used to estimate the closeness of that item to the target user, which in turn determines the position of that item in the sorted recommendation list for that user.

Heterogeneous information networks such as TPG comprise different types of entities, resulting in different types of paths, each reflecting a specific semantic. For instance, from the rating-oriented perspective, a certain type of path may connect the target user to items that are similar to his favorite items, while another kind of path connects a user to items that his friends like. These types of relations are in harmony with different classes of neighborhood collaborative filtering: the former follows the idea of the item-based recommendation approach, while the latter follows the idea of the user-based recommendation approach. Therefore, it is important for a graph-based recommendation algorithm to take into account the semantics behind different types of paths in heterogeneous networks [8,9,27].
Otherwise, it might score items based on unreliable paths that are inconsistent with the semantics behind the desired approach to neighborhood recommendation. Unfortunately, GRank neglects these semantics and interprets every existing type of path between any two nodes of the graph as evidence for the closeness of those nodes. Therefore, it needs to be investigated whether GRank scores items through reliable paths (i.e. paths that are in harmony with the desired concept of similarity), and if it does not, we need a way to identify the reliable paths in TPG before using them for making recommendations.

To handle these issues, we first formalize different perspectives on neighborhood collaborative ranking, namely user-based, preference-based, and items' representative-based collaborative ranking. Thereafter, we determine which paths of TPG are in harmony with each perspective, and call them reliable recommendation meta-paths for that perspective. Once the reliable recommendation meta-paths are extracted, we show that GRank scores items through paths that are not reliable in any class of neighborhood collaborative ranking. Finally, we present a framework for graph-based collaborative ranking, called ReGRank, which guarantees that items are scored only based on reliable paths. ReGRank can be adapted to follow the semantics behind user-based, pairwise preference-based and representative-based neighborhood collaborative ranking.

The main contributions of this paper can be summarized as follows:

- We present a novel approach to describing a large set of paths on a heterogeneous information network with a short string, called the string description of meta-paths. This enables us to define and analyze reliable recommendation paths from different perspectives in neighbor-based collaborative ranking.
- We formalize the user-based, representative-based, and preference-based recommendation approaches in a graph-based collaborative ranking framework called ReGRank.
  To our knowledge, this is the first comprehensive research on different classes of neighbor-based collaborative ranking algorithms.
- We systematically show how each approach in neighborhood collaborative ranking can be modeled through a particular set of paths in TPG. We also show that TPG contains some paths that are not reliable for calculating similarity in any class of NC-Rank methods.
- We introduce three novel network structures, all of whose paths are reliable from the different perspectives of NC-Rank, and we show how to construct and use those network structures to make reliable recommendations in ReGRank.
- We provide a comprehensive set of experiments to evaluate different classes of ReGRank. The results show that user-based and representative-based ReGRank improve on the current state-of-the-art neighborhood and graph-based collaborative ranking methods.

The rest of this paper is organized as follows. In Section 2, we discuss current collaborative ranking methods, especially those that follow the neighborhood recommendation approach, which is the scope of this paper. Then, we summarize the preliminaries of graph-based collaborative ranking in Section 3. Next, in Section 4, we show how to concisely describe the recommendation meta-paths of recommendation algorithms by string descriptions. The concept of string description enables us to formally define the reliable recommendation meta-paths of neighborhood collaborative ranking in Section 5. Thereafter, we present a novel framework for reliable graph-based collaborative ranking in Section 6. Finally, we present experimental evaluations of our approach in Section 7, and conclude the paper with directions for future work in Section 8.

2. Related work

As the scope of this paper lies in the category of neighborhood collaborative ranking, here we briefly review the alternative approach to collaborative ranking, called the latent factor model, and then discuss neighborhood collaborative ranking algorithms in more detail. We also note that current neighborhood collaborative ranking methods respond to queries using a client/server architecture in which a centralized data center holds all available information.

2.1. Latent factor models

The latent factor model (LFM) approach to collaborative ranking transforms users and items into a common latent feature space in which the interest of users in items can be estimated by calculating the inner product of their corresponding feature vectors. CofiRank-NDCG [31] directly optimizes a convex upper bound of a ranking loss function based on the normalized discounted cumulative gain (NDCG). Shi et al. [25] presented an LFM approach that leads to correct prediction of the top-1 probability of each item for a user. URM [34] aggregates ranking-oriented and rating-oriented loss functions to improve the quality of the total ranking. ListPMF [11] adapts the probabilistic matrix factorization approach to maximize the log-posterior of the predicted and observed preference orders of users. BoostMF [2] aggregates boosting and matrix factorization models to improve the quality of top-N recommendation.

Another group of LFM algorithms maps users and items to a latent feature space that results in correct prediction of users' pairwise preferences. CofiRank-ordinal is one of the pioneers in this category; it minimizes the number of discordant comparisons among items [32] when rating data is available. Bayesian personalized ranking (BPR) is a pairwise collaborative ranking framework for users' implicit feedback. ABPR [17] adapts Bayesian personalized ranking to heterogeneous implicit feedback. CofiSet [16] learns the pairwise preferences among item sets rather than single items. PushAtTop [3] is another LFM approach that weighs the pairwise comparisons corresponding to items that are highly preferred by each user. CLIMF [23] and xCLIMF [24] optimize the mean reciprocal rank (MRR) of the recommendation list. UOCCF [10] is a recent collaborative ranking framework that aggregates CLIMF and probabilistic matrix factorization to improve top-N recommendation quality in the case of implicit feedback.
We note that this paper focuses on neighborhood collaborative ranking, and so our algorithm is entirely different from LFM techniques, as it does not map users and items to any other feature space. Instead, it explores the concept of similarity between entities such as users and items to estimate a target user's priorities.

The neighborhood class of collaborative ranking methods, to which the current research belongs, explores the opinions of similar users to predict the total ranking of items for the target user. Although some researchers have investigated list-wise [29] or point-wise [7] approaches to calculating users' similarity, most current NC-Rank algorithms follow the pairwise approach, which takes into account users' pairwise preferences over items [1,12,14,30,33]. More precisely, most current NC-Rank algorithms follow a three-step framework. The first step is to calculate users' similarities according to their agreement and disagreement over common pairwise preferences. The second step is to estimate a preference matrix for the target user based on the pairwise preferences of the k users most similar to him. Finally, the third step is to aggregate the estimated pairwise preferences to infer the total ranking of items for the target user. This framework was originally introduced by EigenRank [13], which uses the Kendall correlation to estimate users' similarities and a random walk approach for ranking inference. Wang et al. [30] slightly modified the Kendall correlation in order to weigh different pairwise preferences according to their popularity and their strength in a rating dataset. Kalloori et al. [5] suggested calculating the similarities using the EDRC metric, which models each user u as a preference graph of items in which a link indicates the preference of item i over item j for user u; the similarities between the preference graphs of users are then used to estimate the similarities of the users. The main flaw of these algorithms is their ineffectiveness on sparse data, where users rarely have common pairwise preferences. SibRank [18] exploits a signed bipartite network to calculate extended similarities even among users that do not have any common pairwise preferences.
Although SibRank is able to calculate users' similarities in sparse data, it cannot estimate the preferences of the target user when no information is available in his neighborhood. GRank [19] is another network-oriented collaborative ranking framework, which applies personalized PageRank over the so-called tripartite preference graph (TPG) to directly estimate the interest of users in items. This approach enables GRank to assign higher scores to items that are popular in the neighborhood while still being capable of using information that is available outside the neighborhood. However, GRank only takes into account the length and number of existing paths between a target user and item representatives, without considering the semantics behind the different types of paths in TPG. This omission affects GRank's reliability, as it might score items based on paths that do not reflect a reasonable sense of relevance between a target user and an item representative, or that even contradict the concept of similarity in the desired neighborhood recommendation approach.

Our suggested framework, ReGRank, differs from current NC-Rank algorithms in several respects. First, it does not rely on common pairwise comparisons to calculate users' similarities as in [13,30]. Instead, it exploits a graph-based proximity measure that is able to calculate transitive similarities even when users have no common pairwise preferences. Moreover, ReGRank directly estimates the interest of users in items based on all available information in the system, while most NC-Rank algorithms define a neighborhood around the target user and only explore the opinions of users in that neighborhood for scoring items [1,13,18,30,33]. It should also be noted that, unlike GRank, ReGRank takes into account the semantics behind each type of path in TPG and only uses reliable paths that follow the concept of similarity in a neighborhood collaborative ranking approach.
Finally, note that while traditional NC-Rank algorithms fall into the category of user-based collaborative ranking methods [1,13,18,30,33], GRank [19] cannot be placed in any category, neither user-based nor item-based. ReGRank, on the other hand, is a unified framework that can be adapted for user-based, preference-based and representative-based neighborhood collaborative ranking.

3. Preliminaries

A rating-oriented system can be modeled as a bipartite network with adjacency matrix $B$ whose element $b_{u,i}$ indicates the opinion of a user $u$ about an item $i$. However, this approach is not applicable to ranking-oriented systems, in which users' feedback on items is not an absolute value and the choices of the users should be analyzed with regard to the choice context. More precisely, a ranking-oriented dataset can be represented as a set $O = \{ o = \langle u, i, j \rangle \mid u \in U,\ i \in I,\ j \in I \}$, where $o = \langle u, i, j \rangle$ denotes that user $u$ has preferred item $i$ over item $j$. Such a dataset might contain a record indicating that user $u$ prefers item A over item B, while some other item such as C may be preferred over A by the same user. That means that there are two sides to each item $i$: the desirable side, $i_d$, which refers to the case in which a user prefers $i$ over another item, and the undesirable side, $i_u$, which refers to the case in which a user prefers another item to $i$.

Formally, given a set of users $U = \{u_1, \dots, u_m\}$ and a set of items $I = \{i_1, \dots, i_n\}$, we use a set $V$ to represent the main entities of a ranking-oriented recommender system and define it as $V = \langle U, R, P \rangle$, where $U$ is the set of users, $R = \{ i_d, i_u \mid i \in I \}$ is the set of representatives denoting the desirable and undesirable sides of items, and $P = \{ \langle i, j \rangle \mid i \in I,\ j \in I \}$ is the set of pairwise preferences. Note that each $p = \langle i, j \rangle$ indicates the preference of $i$ over $j$. For legibility, we call the first item $p$'s desirable item, denoted by $p_d$, and the second one $p$'s undesirable item, denoted by $p_u$. The agreement function $a$ and the support function $\mathrm{sup}$ define the relations between the different entities of ranking-oriented recommender systems.

Definition 1.

Given a user $u \in U$ and a preference $p \in P$, the agreement function $a: U \times P \to \{0,1\}$ indicates the relation between users and pairwise preferences:

$$a(u, p) = \begin{cases} 1, & \langle u, p_d, p_u \rangle \in O \\ 0, & \text{otherwise} \end{cases}$$

where $O$ is the observation set of preferences.

Definition 2.

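Given a preference $p = \langle i, j \rangle$ and a representative $r$, the support function $\mathrm{sup}: P \times R \to \{0,1\}$ indicates whether the preference $p$ supports the representative $r$ or not:

$$\mathrm{sup}(p, r) = \begin{cases} 1, & r = i_d \\ 1, & r = j_u \\ 0, & \text{otherwise} \end{cases}$$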
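The two definitions above can be illustrated with a minimal Python sketch. The class names `Preference` and `Representative`, the side labels `"d"` and `"u"`, and the toy observation set are assumptions made for illustration only; they are not part of the original formulation.

```python
from typing import NamedTuple


class Preference(NamedTuple):
    desirable: str    # p_d, the item that was preferred
    undesirable: str  # p_u, the item it was preferred over


class Representative(NamedTuple):
    item: str
    side: str  # "d" for the desirable side i_d, "u" for the undesirable side i_u


def agreement(user, pref, observations):
    """a(u, p): 1 if <u, p_d, p_u> is in the observation set O, else 0."""
    return 1 if (user, pref.desirable, pref.undesirable) in observations else 0


def support(pref, rep):
    """sup(p, r): 1 if p supports r, i.e. r = (p_d)_d or r = (p_u)_u, else 0."""
    if rep.side == "d" and rep.item == pref.desirable:
        return 1
    if rep.side == "u" and rep.item == pref.undesirable:
        return 1
    return 0


# Hypothetical observation set: user u1 prefers B over A.
O = {("u1", "B", "A")}
p = Preference("B", "A")
```

With this toy data, `agreement("u1", p, O)` evaluates to 1, and `p` supports exactly the two representatives `Representative("B", "d")` and `Representative("A", "u")`.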

Using the agreement and support functions, we can construct the tripartite preference graph (TPG) of a pairwise preference dataset as in Definition 3 [20]. Fig. 1 shows the network schema of TPG. Users are connected to the preferences with which they agree, and representatives are connected to the preferences that support them.

Definition 3. A tripartite preference graph is a tripartite graph $TPG = \langle (U, P, R), (E_{UP}, E_{PR}) \rangle$, where $U$ is the set of users, $P$ is the set of possible pairwise preferences, and $R$ is the set of items' representatives. $E_{UP} = \{ (u, p) \mid a(u, p) = 1,\ u \in U,\ p \in P \}$ is the set of edges that connect a user to a preference if there is an agreement relation between that user and that preference. Also, $E_{PR} = \{ (p, r) \mid \mathrm{sup}(p, r) = 1,\ p \in P,\ r \in R \}$ is the set of edges linking each pairwise preference to the representatives that it supports.

Fig. 1. Network schema of the tripartite preference graph (TPG).
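As a rough illustration of Definition 3, the sketch below builds the two edge sets from a list of observed triples. It is a simplification under stated assumptions: nodes are encoded as tagged tuples (a hypothetical encoding), and only preference nodes that occur in at least one observation are materialized, since unobserved preferences would have no user edges.

```python
from collections import defaultdict


def build_tpg(observations):
    """Return (E_UP, E_PR) for triples (u, i, j) meaning 'u prefers i over j'."""
    e_up = set()  # user -- preference edges (agreement relation)
    e_pr = set()  # preference -- representative edges (support relation)
    for user, i, j in observations:
        pref = ("P", i, j)
        e_up.add((("U", user), pref))
        e_pr.add((pref, ("R", i, "d")))  # i appears on its desirable side
        e_pr.add((pref, ("R", j, "u")))  # j appears on its undesirable side
    return e_up, e_pr


def adjacency(e_up, e_pr):
    """Undirected adjacency lists over all TPG nodes."""
    adj = defaultdict(set)
    for a, b in e_up | e_pr:
        adj[a].add(b)
        adj[b].add(a)
    return adj


# Hypothetical toy dataset.
observations = [("Jack", "B", "A"), ("John", "B", "A"), ("Lee", "A", "D")]
e_up, e_pr = build_tpg(observations)
adj = adjacency(e_up, e_pr)
```

In this toy graph, the preference node `("P", "B", "A")` is adjacent to the users Jack and John and to the representatives `("R", "B", "d")` and `("R", "A", "u")`.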

More precisely, TPG is a tripartite graph that models the subjective relations between users and items' representatives through intermediate nodes of pairwise preferences. This intermediate layer makes it possible to model the choice context in which a user has preferred or not preferred an item (see Fig. 2c). Moreover, the items' representatives layer enables TPG to model the different sides of an item $i$: $i_d$ reflects the situation in which $i$ is preferred over another item, and $i_u$ refers to the situation in which another item is preferred over $i$. Fig. 2 illustrates a schematic example of TPG and its difference from the traditional bipartite graph (BG) representation of recommender systems. As illustrated in Fig. 2b, BG only represents the interaction between users and items, and so Martin and Lee are both connected to B in the same way, although they have picked item B in different contexts. TPG, on the other hand, clearly reflects that these users do not agree on any pairwise comparison.

Fig. 2. BG and TPG representation of a pairwise preference dataset.
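The contrast between the two representations can be made concrete with a small sketch. The data below is hypothetical and is not taken from Fig. 2: it simply assumes that Martin picked B when the alternative was A, while Lee picked B when the alternative was C.

```python
# Hypothetical choices: (user, picked item, item it was preferred over).
observations = [("Martin", "B", "A"), ("Lee", "B", "C")]


def bg_edges(observations):
    """Bipartite view: link each user to the item he picked; context is lost."""
    return {(user, picked) for user, picked, _ in observations}


def tpg_user_pref_edges(observations):
    """TPG view: link each user to the full pairwise preference he agreed with."""
    return {(user, (picked, other)) for user, picked, other in observations}


def common_neighbors(edges, a, b):
    """Nodes adjacent to both a and b in an edge set of (user, node) pairs."""
    neighbors_a = {n for u, n in edges if u == a}
    neighbors_b = {n for u, n in edges if u == b}
    return neighbors_a & neighbors_b


bg_common = common_neighbors(bg_edges(observations), "Martin", "Lee")
tpg_common = common_neighbors(tpg_user_pref_edges(observations), "Martin", "Lee")
```

Here `bg_common` contains B, so the bipartite view treats the two users alike, whereas `tpg_common` is empty because the users share no pairwise preference node.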

Given the graph representation of a recommender system, a graph-based recommendation algorithm should define a function $\mathrm{score}: T_q \times T_t \to \mathbb{R}$ in order to answer queries of the form $Q = (T_q, T_t)$, where $T_q$ is the object type of the query nodes and $T_t$ is the object type of the target nodes. More precisely, the score function calculates scores for objects of type $T_t$ based on their degree of closeness to some particular objects of type $T_q$ through paths that we call recommendation paths, as in Definition 4.

Definition 4. Given a network $G$ and a query format $Q = (T_q, T_t)$, any path connecting a node $v_q \in T_q$ to another node $v_t \in T_t$ is called a recommendation path over $G$ with respect to $Q$.

Graph-based collaborative ranking algorithms seek to answer queries of the form $Q = (U, R)$ and score representatives according to their closeness to the target user. Therefore, ranking-oriented recommendation paths are those paths that connect a user node to a desirable or undesirable representative in the graph representation of a ranking-oriented recommender system. As this paper focuses on graph-based collaborative ranking frameworks, we simply refer to ranking-oriented recommendation paths as recommendation paths over $G$. As stated earlier, a main challenge that a graph-based recommendation algorithm faces is to determine which recommendation paths should be taken into account by the scoring function. In the following sections, we follow a systematic approach to determine the reliable recommendation paths for different classes of neighbor-based collaborative ranking, and we show how to use these reliable recommendation paths to score items more precisely.

4. String description of recommendation meta-paths

In this paper we seek to determine which recommendation paths in TPG are reliable. Clearly, it is neither feasible nor useful to analyze the reliability of each recommendation path between every pair of user and representative nodes individually. So, we use a high-level description of recommendation paths, called recommendation meta-paths. Let $G = \langle V, E \rangle$ be a heterogeneous network and $S_G = (T, L)$ be its network schema [27], in which $T = \{ \tau(v) \mid v \in V \}$ contains the object types of $G$'s vertices, and $L = \{ (\tau(v), \tau(w)) \mid (v, w) \in E \}$. The set of network meta-paths is defined as the whole set of paths on the graph of the network schema $S_G$ and is denoted by $\Psi = \{ \psi = (t_1, \dots, t_k) \mid \forall l < k,\ (t_l, t_{l+1}) \in L \}$. We note that each meta-path provides a high-level description of one possible type of path over $G$ as a heterogeneous network. For instance, the UPR meta-path is a path over TPG's schema that starts at U, passes its outgoing edge to P and then passes an edge to reach R. This meta-path describes all paths over TPG that connect a user to a representative node through a preference node, such as the user-preference-representative paths illustrated in Fig. 2c. The concept of meta-paths in heterogeneous networks enables us to describe the semantics behind different relations among the entities of recommender systems [8,9,21,22,27,35]. Motivated by the concept of meta-path, we define the concept of recommendation meta-paths with regard to the query format $Q = (T_q, T_t)$ over the network in Definition 5.

Definition 5. A recommendation meta-path in a graph $G$ with respect to a query $Q = (T_q, T_t)$ is a path defined on $G$'s network schema $S_G = (T, L)$ from $T_q$ to $T_t$, where $T_q$ and $T_t$ are the node types corresponding to the query and the target objects of recommendation. Also, $\Psi_Q = \{ \psi = (t_1, \dots, t_k) \mid t_1 = T_q,\ t_k = T_t,\ \forall l < k,\ (t_l, t_{l+1}) \in L \}$ is the set of all recommendation meta-paths existing in $G$.

Different recommendation meta-paths can model different approaches to recommendation.
For instance, in a heterogeneous network comprised of users, movies, actors and the relations between them, the $\langle \text{user} - \text{user} - \text{movie} \rangle$ meta-path can be used to recommend a movie to a user because his friend has watched it. This is exactly the approach used in user-based recommendation. On the other hand, the $\langle \text{user} - \text{actor} - \text{movie} \rangle$ meta-path can help us relate a target user to a movie in which his favorite actor has played a role. This latter approach is an example of the content-based class of recommendation methods. Typically, a graph contains a large number of recommendation meta-paths, and only some of them are reliable when using a certain NC-Rank approach.

Here we define the string description of the recommendation meta-path set $\Psi_Q$, denoted by $\mathrm{strdesc}$, to concisely express the patterns appearing in the meta-paths that belong to $\Psi_Q$. The string descriptions of meta-paths in a graph $G$ are built from a finite alphabet $\Sigma = \{t_1 \dots t_n\}$, where $\{t_1, \dots, t_n\}$ is the set of node types in $G$. Each member $t_i$ of $\Sigma$ describes a meta-path in $G$ of length zero from node type $t_i$ to itself, which is conceptually equivalent to starting from a node of type $t_i$ and staying there. Also, a meta-path $\psi = (t_1, t_2, \dots, t_k)$ is described by a string $s = t_1 t_2 \dots t_k$ that represents starting from a node of type $t_1$ and following the meta-path that ends in a node of type $t_k$. For instance, the meta-path $\psi = (U, P, R, P, U)$ over TPG is described by the string $UPRPU$, while the string $U$ describes the meta-path $\psi = (U)$. We define a set of operations that are needed to systematically obtain complicated string descriptions from simple ones. Given two string descriptions of meta-paths $\alpha = \{t_1 \dots t_n\}$ and $\beta = \{t'_1 \dots t'_m\}$:

- Select operation, denoted by "|", is used to describe different meta-path options between nodes. For example, if $t_1 = t'_1$ then $\gamma = (\alpha|\beta)$ is a string description representing that either of the meta-paths $\alpha$ or $\beta$ may be selected to be followed from a node of type $t_1$ ($= t'_1$).
  For example, if $\alpha = UPU$ and $\beta = UPRPU$, then $\alpha|\beta$ is represented by the string description $\gamma = UPU|UPRPU$, which states that either a UPU or a UPRPU path can be selected for connecting two user nodes (that is, nodes of type U).
- Join operation, denoted by ".", is used to describe concatenating two meta-paths. For example, if $t_n = t'_1$, then $\alpha.\beta$ is the string description of a meta-path that connects nodes of type $t_1$ to nodes of type $t'_m$ by following the meta-path described by $\beta$ immediately after finishing the meta-path described by $\alpha$. Formally, we have $\alpha.\beta = [t_1 \dots t_n].[t'_1 \dots t'_m] = (t_1 \dots t_n t'_2 \dots t'_m)$. For instance, if $\alpha = UP$ and $\beta = PR$, then $\alpha.\beta$ is equal to $UPR$.
- Repeat operation, denoted by "*", is used to describe meta-paths that are generated by repeatedly following some meta-path $\alpha$ zero or more times. It is applicable to the string description of a meta-path only if the starting and ending node types of that meta-path are the same. For example, if $t_1 = t_n$, then $\alpha^*$ describes a meta-path that starts from and ends at the object type $t_1$ by following the meta-path described by $\alpha$ zero or more times. Formally, $\alpha^* = \{\alpha^0, \alpha^1, \dots, \alpha^k, \dots\}$, and since $\alpha$ is a string description of a meta-path starting at node type $t_1$, $\alpha^0$, which means following $\alpha$ zero times, denotes not moving from $t_1$ anywhere. The string $\alpha^k$ is generated by applying the join operation over $\alpha$ $k - 1$ times. In other words, $\alpha^*$ refers to the strings that belong to the set $\{ t_1,\ t_1 \dots t_n,\ t_1 \dots t_n t_2 \dots t_n,\ \dots \}$. For example, if $\gamma = UPU$ then $\gamma^* = [UPU]^* = \{U, UPU, UPUPU, \dots\}$.

To make the whole concept clear, let us define the string description of the recommendation meta-paths of GRank, denoted by $\mathrm{strdesc}_{GRank}$. GRank seeks to answer the query $Q = (U, R)$, as it seeks to determine the closeness of representatives to a given target user. For this purpose, it applies personalized PageRank over TPG, and so it considers all recommendation meta-paths in TPG. The set of TPG's recommendation meta-paths can be obtained through a breadth-first search on the network schema starting at the user node U.
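The select, join and repeat operations defined above behave much like alternation, concatenation and the Kleene star on strings. The following Python sketch is one possible reading of them, assuming that every node type is written as a single character (as with U, P and R in TPG); the function names and the `max_times` cut-off are illustrative choices, not part of the original definitions.

```python
import re


def select(alpha, beta):
    """alpha|beta: either meta-path may be followed from the shared start type."""
    if alpha[0] != beta[0]:
        raise ValueError("select needs meta-paths that start at the same node type")
    return {alpha, beta}


def join(alpha, beta):
    """alpha.beta: follow beta right after alpha, writing the shared type once."""
    if alpha[-1] != beta[0]:
        raise ValueError("join needs alpha to end where beta starts")
    return alpha + beta[1:]


def repeat(alpha, max_times):
    """Enumerate alpha^0, alpha^1, ..., alpha^max_times (a finite prefix of alpha*)."""
    if alpha[0] != alpha[-1]:
        raise ValueError("repeat needs a meta-path that starts and ends at the same type")
    strings = [alpha[0]]  # alpha^0: stay at the start node type
    for _ in range(max_times):
        strings.append(join(strings[-1], alpha))
    return strings


print(select("UPU", "UPRPU") == {"UPU", "UPRPU"})  # True
print(join("UP", "PR"))                            # UPR
print(repeat("UPU", 2))                            # ['U', 'UPU', 'UPUPU']

# The infinite set [UPU]* corresponds to the regular expression U(PU)*.
upu_star = re.compile(r"U(?:PU)*")
print(all(upu_star.fullmatch(s) for s in repeat("UPU", 5)))  # True
```

Expressing a string description as a regular expression makes it possible to test mechanically whether the type sequence of a concrete path matches it.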
These meta-paths first pass edges described by $\alpha_1 = UP$ to reach a preference node. Then they leave P and return to it by passing edges of the form $\alpha_2 = PU$ and $\alpha_3 = UP$, or $\alpha_4 = PR$ and $\alpha_5 = RP$. These paths are described by a string of the form $[\alpha_2.\alpha_3 \mid \alpha_4.\alpha_5] = [PUP|PRP]$. This pattern can be repeated an arbitrary number of times, which is expressed by $[PUP|PRP]^*$. Finally, these paths end by passing the meta-path $\alpha_6 = PR$ to reach a representative node R. Therefore, $\mathrm{strdesc}_{GRank} = \alpha_1.[\alpha_2.\alpha_3 \mid \alpha_4.\alpha_5]^*.\alpha_6 = UP.[PUP|PRP]^*.PR$.

Lemma 1. Given two string descriptions $\alpha = \{t_1 \dots t_n\}$ and $\beta = \{t'_1 \dots t'_m\}$ where $t_1 = t_n = t'_1$, we have $[\alpha^*].[\beta] = (t_1 \dots t_{n-1})^* t'_1 \dots t'_m$.

Proof:

From the definition of the repeat operation, we know that $[\alpha^*]$ ends with either $t_1$ or $t_n$, which are the same and both equal to $t'_1$. So $[\alpha^*] = (t_1 \dots t_{n-1})^* t_n$ and $[\beta] = t'_1 \dots t'_m$, and according to the definition of the join operation, we have $[\alpha^*].[\beta] = (t_1 \dots t_{n-1})^* t'_1 \dots t'_m$.

Lemma 2.

Given two string descriptions of meta-paths $\alpha = \{t_1 \dots t_n\}$ and $\beta = \{t'_1 \dots t'_m\}$ where $t_n = t'_1 = t'_m$, we have $[\alpha].[\beta^*] = t_1 \dots t_n (t'_2 \dots t'_m)^*$.

Proof:

The proof is similar to that of Lemma 1.

As we showed, $\mathrm{strdesc}_{GRank} = UP.[PUP|PRP]^*.PR$. Using Lemma 1 and Lemma 2, we can further simplify the string description to see that $\mathrm{strdesc}_{GRank} = U[PU|PR]^*PR$. Using the string description of meta-paths, we can concisely describe the reliable recommendation meta-paths of NC-Rank algorithms and also check whether the recommendation paths of an algorithm such as GRank are consistent with the string description of the reliable recommendation meta-paths of a particular class of NC-Rank algorithms.

5. Discovering reliable meta-paths for neighborhood collaborative ranking

Graph-based collaborative ranking algorithms should define a ranking function $\mathrm{rank}: U \times I \to \mathbb{R}$ that scores each item based on the degree of closeness of its corresponding representatives to the target user. More formally, a graph-based collaborative ranking algorithm, $alg$, needs to define a ranking score function as

$$\mathrm{rank}_{alg}(u, i) = f\big(\mathrm{score}_{alg}(u, i_d),\ \mathrm{score}_{alg}(u, i_u)\big) \qquad (1)$$

where $i_d$ and $i_u$ are the desirable and undesirable representatives of an item $i$, and $\mathrm{score}_{alg}(u, r)$ is a function that measures the degree of closeness between a user $u$ and a representative $r$ based on the recommendation meta-paths that are considered reliable in the recommendation algorithm $alg$. In this setting, the function $f$ must be defined to aggregate the scores of the representatives of each item $i$ in order to estimate the overall desirability of $i$ for user $u$. The first step in defining such a scoring function is to find the reliable recommendation meta-paths for an algorithm $alg$.

In this section we first define the concept of similarity or closeness that is used in a neighborhood collaborative ranking method $alg$. We also show how to describe all recommendation meta-paths that are in harmony with that concept of similarity, using a string description $\mathrm{strdesc}_{alg}$. We do this for the user-based, preference-based and representative-based approaches to neighborhood collaborative ranking. Finally, we show that GRank uses some recommendation meta-paths that are not in harmony with the rules of any of the three approaches and can therefore be considered unreliable meta-paths.

Like neighborhood collaborative rating algorithms, user-based neighborhood collaborative ranking (UNC-Rank) algorithms estimate $\mathrm{score}_{alg}(u, r)$ using the opinions of $u$'s neighbors about $r$. More precisely, UNC-Rank algorithms define the scoring function according to the following rules:

- Rule U-1:

A representative is close to the target user if it is voted for directly by him or by users similar to him. A user u votes for a representative r if he agrees with some pairwise preference that supports r.

- Rule U-2:

A user is similar to the target user if he is directly similar to him or is similar to users that are similar to the target user.

Now, the question is how to calculate direct users' similarities using TPG. TPG reflects the similarity over pairwise preferences as well as the similarity over representatives [20]. The first one, which we call type-1 user-user similarity, indicates that two users are similar if they agree over some pairwise preferences. For instance, Jack and John are similar as they both have preferred B over A, and D over A (see Fig. 3a). This similarity can be determined through the paths <John, A<B, Jack> and <John, A<D, Jack>. Trivially, paths of this type are described by UPU meta-paths.

On the other hand, type-2 user-user similarity indicates that two users are similar if they agree upon the overall desirability/undesirability of items. For instance, in Fig. 3b, we can infer that Jack is more similar to John than to Lee, as Jack and John have both preferred some item over A while Lee has preferred A over D. We can infer the similarity between Jack and John through a path that starts at Jack, passes one of his pairwise preferences, the undesirable representative of A, and one of John's pairwise preferences, and ends at John. Such a path is described by meta-path UPRPU over TPG.

Type-1 and type-2 direct user-user similarities can therefore be obtained through meta-paths UPU and UPRPU in TPG's schema, respectively. Recall that UPU is a meta-path over TPG as it is a valid path over TPG's schema: it starts at U, passes an outgoing edge to reach P, and finally returns to U.

Fig. 3. A schematic example to illustrate different types of direct similarities among users. a) based on pairwise preferences. b) based on desirability/undesirability of items.

According to the recursive definition of users' similarities (Rule U-2) and the scoring logic (Rule U-1), we can define the string description of user-based reliable recommendation meta-paths as strdesc_UNC = (UPU|UPRPU)*UPR (see Definition 6). Note that UPU|UPRPU denotes the two alternatives for finding users that are directly similar to the target user, and its repetition (UPU|UPRPU)* describes meta-paths between users that are directly or indirectly similar to each other. Meta-path UPR connects users to the representatives that they have directly voted for. Therefore, the concatenation of (UPU|UPRPU)* and UPR, denoted by (UPU|UPRPU)*UPR, describes meta-paths to representatives that are voted for by the target user or by users that are directly or indirectly similar to him. UNC-Rank algorithms should thus estimate the closeness of a user to a representative through paths that are consistent with strdesc_UNC, as in Definition 6.

Definition 6. strdesc_UNC is the string description describing the reliable recommendation meta-paths for user-based NC-Rank algorithms. strdesc_UNC = (UPU|UPRPU)*UPR indicates paths that start at the target user u, then reach users similar to the target user based on type-1 or type-2 similarities (i.e. following meta-paths of the form (UPU|UPRPU)*), and finally pass the meta-path UPR.

Preference-based neighborhood collaborative ranking (PNC-Rank) takes into account similarities among pairwise preferences to define the scoring function. Preference-based algorithms define the scoring function using these rules:

- Rule P-1:

A representative is close to a user if it is supported by his concordant preferences.

- Rule P-2:

A preference p is concordant to the target user u if u agrees with p or agrees with preferences similar to p.

- Rule P-3:

A preference p' is similar to preference p if it is directly similar to p or is similar to preferences that are similar to p.

The next step is to define direct similarity among pairwise preferences. Like items' similarities in collaborative rating algorithms, we assume that two pairwise preferences p and p' are directly similar if there are (ideally a large number of) users who agree with both p and p'. Fig. 4a depicts a schematic example of two pairwise preferences (B<A) and (B<D) that are similar, as users that prefer A over B also prefer D over B. On the other hand, the two pairwise preferences shown in Fig. 4b are dissimilar: there is no path connecting them, since no user agrees with both of them. These similarities can be captured through PUP paths in TPG, which can be extended to find indirectly similar preferences through repetitions of PUP paths, expressed by [PUP]*.

Fig. 4. A schematic example to illustrate similarity/dissimilarity between pairwise preferences.

According to Rules P-1 to P-3, PNC-Rank algorithms make recommendations according to paths that are described through the concatenation of UP, [PUP]*, and PR. Therefore, we can define strdesc_PNC = UP[PUP]*PR, as stated in Definition 7.

Definition 7. strdesc_PNC refers to the string description of the reliable recommendation meta-paths of preference-based NC-Rank algorithms. We define strdesc_PNC = UP[PUP]*PR, which is generated by applying join operations over the meta-paths UP, [PUP]*, and PR. Note that UP meta-paths connect users to their direct concordant preferences, [PUP]* enables us to find preferences that are directly or indirectly similar to those concordant preferences, and finally, PR determines the representatives that are supported by these direct or indirect concordant preferences.

Representative-based neighborhood collaborative ranking (RNC-Rank) defines the scoring method using these rules:

- Rule R-1:

A representative is close to a target user u if it is similar to representatives that u has voted for.

- Rule R-2:

A representative r' is similar to representative r if it is directly similar to r or is similar to representatives that are similar to r.

Similar to preference similarity, we assume that two representatives are directly similar if they have been voted for by similar, ideally equal, sets of users. For instance, in Fig. 5a, the two highlighted representatives are similar, as several users have voted for both of them and ranked them above B and C. In contrast, the two representatives highlighted in Fig. 5b seem dissimilar, as there is no user voting for both of them. These similarities can be captured through repetition of the meta-path (RPUPR) in TPG, and so the recommendation paths of representative-based algorithms are described by strdesc_RNC = UPR(RPUPR)*, as stated in Definition 8.

Definition 8. strdesc_RNC denotes the string description of the reliable recommendation meta-paths from the representative-based NC-Rank perspective.

We define strdesc_RNC = UPR(RPUPR)*, which is generated through the join of the UPR and (RPUPR)* meta-paths: these paths first connect the target user to the representatives that he has directly voted for, and then pass meta-paths of the form (RPUPR)* to find directly or indirectly similar representatives.

Fig. 5. A schematic example to illustrate similarity/dissimilarity among representatives.
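The three string descriptions can be checked mechanically with regular expressions. The following is a minimal sketch, assuming the string descriptions read (UPU|UPRPU)*UPR, UP[PUP]*PR, and UPR(RPUPR)*, and assuming that a join shares the endpoint of the two joined meta-paths, so that the patterns are written over plain node-type sequences (for example, UPU joined with UPR is the sequence U-P-U-P-R, written "UPUPR"). The names `RELIABLE` and `reliable_for` are hypothetical and not part of the original framework.

```python
import re

# Node-type sequences: U = user, P = pairwise preference, R = representative.
# Assumption: joined meta-paths share their endpoint, so the string
# descriptions are rewritten over node sequences without repeated endpoints.
RELIABLE = {
    "user-based": re.compile(r"U(PU|PRPU)*PR"),       # (UPU|UPRPU)*UPR
    "preference-based": re.compile(r"UP(UP)*R"),      # UP[PUP]*PR
    "representative-based": re.compile(r"UPR(PUPR)*"),  # UPR(RPUPR)*
}


def reliable_for(type_sequence):
    """Return the approaches whose string description accepts the sequence."""
    return sorted(
        name
        for name, pattern in RELIABLE.items()
        if pattern.fullmatch(type_sequence)
    )


if __name__ == "__main__":
    for seq in ["UPR", "UPUPR", "UPRPUPR", "UPRPR"]:
        print(seq, reliable_for(seq))
```

A sequence that passes two representatives through a single preference, such as "UPRPR", is accepted by none of the three patterns in this sketch.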

Once the string descriptions of NC-Rank methods are extracted, we can answer the question of whether GRank scores items through reliable recommendation meta-paths or not. As mentioned before, GRank applies personalized PageRank to score items through paths whose string description covers all existing paths in TPG. For instance, in Fig. 6, GRank might score items for Jack through a path whose meta-path is not compatible with any of the reliable recommendation meta-paths described by strdesc_UNC, strdesc_PNC, or strdesc_RNC. To see why, notice that the sequence RPR appears in that meta-path, while it cannot appear in any of the reliable recommendation meta-paths.

Paths that pass an RPR meta-path implicitly assume that two representatives are similar if they are connected to a common pairwise preference. That does not seem reasonable, as it implies that all desirable representatives are similar to all undesirable representatives. Note that each item has been compared to all other items in the preference layer, and so each desirable item's representative is connected to the undesirable representatives of other items through RPR meta-paths. Therefore, inferring similarity through these paths will mislead the ranking algorithm. For instance, in the mentioned example, the path connects Jack to A because it implicitly assumes that a representative of B is similar to a representative of C, which in turn is similar to a representative of A. Voting for A in this way clearly contradicts the original opinion of Jack, who prefers C over A.

Considering such paths, which are not compatible with the desired meaning of similarity or closeness in neighborhood recommendation, may divert the ranking algorithm from the real interests of the users and prevent it from making accurate recommendations to them. Therefore, GRank, which applies personalized PageRank over TPG and considers all existing meta-paths, is not totally consistent with any of the basic approaches in neighborhood collaborative ranking.

Fig. 6. A schematic example to illustrate invalid meta-paths in TPG.

Top-N recommendation through reliable meta-paths for neighborhood collaborative ranking

So far, we have defined the string description strdesc_alg that describes Γ_alg, the set of recommendation meta-paths that are reliable from the viewpoint of user-based, preference-based, and representative-based recommendation. Here, we propose a novel collaborative ranking framework, called ReGRank, that scores items through the reliable recommendation meta-paths of TPG.

Personalized PageRank [15] is one of the most acknowledged proximity metrics. It weighs the whole set of paths in a network to assign a proximity value to a pair of nodes consisting of a target node and some other node in the network [8]. Personalized PageRank models the behavior of a random surfer who starts at the node corresponding to the target user, and then randomly surfs the network to reach other entities. The random surfer increases the rank of each node every time that he reaches that node. Trivially, higher ranks are assigned to the nodes that are more reachable from the target user. That makes sense in recommender systems, as they seek to find items for which there is much evidence relating them to the target user [8,9,19].

However, since there are some unreliable meta-paths in the original TPG network, it is not safe to use personalized PageRank directly on TPG. On the other hand, it is hard to individually consider each reliable recommendation meta-path (ρ ∈ Γ_alg) and find each instance of that meta-path in TPG for calculating the overall proximities among nodes, because Γ_alg is an infinite set of recommendation meta-paths. Therefore, ReGRank projects TPG to a graph g_alg over user and representative nodes, which contains all reliable meta-paths described by strdesc_alg. Thereafter, it applies personalized PageRank on g_alg to rank items using the reliable meta-paths that have been defined for that class of recommendation (e.g. user-based).

Generally, network projection is used to compress a network with n types of nodes to another network with l different types of nodes, where l < n. That provides the ability to directly show the relations between a particular set of objects with possibly different types. Here, we project TPG to a 2-type graph consisting of users and representatives, as we are interested in recommendation meta-paths that start at users and end at representatives. To project TPG, we should determine which types of relations should be kept in the projected graph, and also how to weigh each relation in the projected graph.

Projection strategy

As mentioned, there are different approaches to neighborhood recommendation, each defining a different set of meta-paths, strdesc_UNC, strdesc_PNC, and strdesc_RNC. Here, we construct projected graphs, each containing the meta-paths that reflect one of those approaches to recommendation. In other words, each projected network should contain all paths described by one of these string descriptions and no other recommendation meta-paths. For this purpose, we first define how to project a network through a set of meta-paths in Definition 9.

For instance, if θ = {UPU, UPR}, the projected network of Fig. 7a will be like the one shown in Fig. 7b. Martin and Mike are connected to each other, as there are paths between Mike and Martin in TPG that are consistent with meta-path UPU. Note that Jack is not connected to other users due to the absence of any UPU paths between Jack and them. Moreover, each user is connected to the representatives that he has voted for. As an example, Mike is connected to a representative through a path that starts at Mike, passes one of his pairwise preferences, and ends at that representative, following meta-path UPR. Note that there is no edge from representatives to users, as we have projected TPG over the meta-path set θ = {UPU, UPR}, in which there are no meta-paths from representatives to users. Furthermore, the projected network contains two directed links, each associated with a weight, between users who have a UPU path in TPG.
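The projection in this example can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the toy preferences, the representative names (`A_u`, `B_d`, and so on), and the function `project` are hypothetical, edge weights are ignored, and self-loops between a user and himself are left out for simplicity.

```python
def project(user_prefs, pref_reps, theta):
    """Return the directed edges of TPG projected over the meta-paths in theta.

    user_prefs: dict user -> set of pairwise preferences the user agrees with
    pref_reps:  dict preference -> set of representatives it supports
    theta:      subset of {"UPU", "UPR", "RPU"}
    """
    # Invert the user-preference relation to walk P -> U.
    pref_users = {}
    for user, prefs in user_prefs.items():
        for pref in prefs:
            pref_users.setdefault(pref, set()).add(user)

    edges = set()
    for user, prefs in user_prefs.items():
        for pref in prefs:
            if "UPU" in theta:  # users sharing a pairwise preference
                edges |= {(user, other) for other in pref_users[pref] if other != user}
            if "UPR" in theta:  # user votes for a representative
                edges |= {(user, rep) for rep in pref_reps[pref]}
            if "RPU" in theta:  # reverse direction of a vote
                edges |= {(rep, user) for rep in pref_reps[pref]}
    return edges


# Hypothetical toy data: "A<B" means B is preferred over A.
user_prefs = {"Mike": {"A<B"}, "Martin": {"A<B"}, "Jack": {"C<D"}}
pref_reps = {"A<B": {"A_u", "B_d"}, "C<D": {"C_u", "D_d"}}

edges = project(user_prefs, pref_reps, {"UPU", "UPR"})
```

With θ = {UPU, UPR}, Mike and Martin are linked in both directions, Jack is linked only to his own representatives, and no edge leaves a representative.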

Fig. 7. A schematic example to explain the compressing mechanism through the set of meta-paths θ = {UPU, UPR}. a) the original network. b) the projected network.

Definition 9.

Projection of TPG through a set of meta-paths θ = {ρ_1, …, ρ_k} generates a graph g with network schema T_g = <A_g, R_g>, where A_g = {⋃_{ρ∈θ} ρ.start} ∪ {⋃_{ρ∈θ} ρ.end} and R_g = {(ρ.start, ρ.end) | ρ ∈ θ}, and where ρ.start and ρ.end indicate the node types of the starting and ending nodes of ρ, respectively.

Propositions 1-3 state that projecting TPG with regard to the sets of meta-paths θ_UNC, θ_PNC, and θ_RNC will generate three networks g_UNC, g_PNC, and g_RNC, and that the set of recommendation meta-paths of each of those networks is described by strdesc_UNC, strdesc_PNC, and strdesc_RNC, respectively. Therefore, applying personalized PageRank of the target user over these networks will score items according to meta-paths that are described by strdesc_UNC, strdesc_PNC, or strdesc_RNC, respectively.

Fig. 8. Network schema of TPG's transformation for UNC-Rank, PNC-Rank, and RNC-Rank.
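The claim that each projected network only contains reliable recommendation meta-paths can be explored by enumerating walks over the projected schemas. The sketch below assumes the schemas are U→U, U→R, R→U for UNC-Rank, U→U, U→R for PNC-Rank, and U→R, R→U for RNC-Rank, and that every projected edge stands for a TPG path through exactly one preference node. The names `SCHEMAS`, `walks`, and `expand` are hypothetical helpers for illustration.

```python
# Projected schemas as adjacency strings over node types (assumed from the
# projection sets {UPU, UPR, RPU}, {UPU, UPR}, and {UPR, RPU}).
SCHEMAS = {
    "UNC": {"U": "UR", "R": "U"},
    "PNC": {"U": "UR"},
    "RNC": {"U": "R", "R": "U"},
}


def walks(schema, start, end, max_len):
    """Breadth-first enumeration of node-type walks with at most max_len edges."""
    found, frontier = [], [start]
    for _ in range(max_len):
        frontier = [w + nxt for w in frontier for nxt in schema.get(w[-1], "")]
        found += [w for w in frontier if w[-1] == end]
    return found


def expand(walk):
    """Map a walk in the projected graph back to its TPG node-type sequence.

    Each projected edge X -> Y corresponds to a TPG path X-P-Y.
    """
    return "P".join(walk)


if __name__ == "__main__":
    for name, schema in SCHEMAS.items():
        print(name, [expand(w) for w in walks(schema, "U", "R", 3)])
```

In this sketch, no expanded walk contains the sequence RPR, because none of the three schemas has an edge between two representatives.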

Proposition 1.

Projecting TPG over the meta-paths in θ_UNC = {UPU, UPR, RPU} results in a graph g_UNC, and the set of all recommendation meta-paths in g_UNC, from users to representatives, is equal to the set of all reliable meta-paths described by strdesc_UNC = (UPU|UPRPU)*UPR.

Justification:

If we project TPG over θ_UNC = {UPU, UPR, RPU}, we obtain a graph g_UNC with the network schema of Fig. 8a. According to Definition 9, the projected network will contain two types of nodes, user and representative nodes, and three types of edges: between users that are connected through UPU paths, from users to representatives based on UPR paths, and from representatives to users based on RPU paths. Therefore, the network schema has edges from U to R, from R to U, and from U to U, as in Fig. 8a.

The set of all recommendation meta-paths of a network can be obtained through breadth-first search over its schema, starting at the U node [27]. The paths start at node U, then return to U either by passing meta-path UU or by passing meta-paths UR and RU. These patterns show two alternatives for moving from U to U, and so can be expressed by (UU|URRU). This loop can be repeated any number of times, so we use the expression (UU|URRU)* to describe paths between user nodes in g_UNC. Additionally, these paths can be joined with the UR meta-path to form a recommendation meta-path from users to representatives. Therefore, these recommendation meta-paths can be described by strdesc_g = [(UU|URRU)*].[UR] = (UU|URRU)*UR.

The remaining step is to prove that the set of recommendation meta-paths over g_UNC, described by strdesc_g, is equivalent to the set of reliable recommendation meta-paths over TPG that is described by strdesc_UNC. More formally, we should prove that {(UU|URRU)*UR} ≅ {(UPU|UPRPU)*UPR}. As mentioned, UR and RU edges in g_UNC refer to UPR and RPU paths in TPG, respectively. So, we can infer that {URRU} ≅ {UPRPU}. Additionally, we know that UU edges in g_UNC refer to UPU paths in TPG. Therefore, we can infer that {(UU|URRU)*} ≅ {(UPU|UPRPU)*}. Furthermore, as (UU|URRU)*UR is obtained through concatenation of (UU|URRU)* with UR, we should join {(UPU|UPRPU)*} to {UPR} to form its equivalent over TPG. So, we can infer that strdesc_g = {(UU|URRU)*UR} ≅ [(UPU|UPRPU)*].[UPR] = {(UPU|UPRPU)*UPR} = strdesc_UNC.

Corollary 1.

Projecting TPG with regard to the set of meta-paths θ_UNC results in the network g_UNC. Personalized PageRank of the target user in g_UNC only takes into account the reliable meta-paths of the user-based approach to NC-Rank.

Proposition 2.

Projecting TPG over the meta-paths in θ_PNC = {UPU, UPR} results in a graph g_PNC, all of whose recommendation meta-paths are described by strdesc_PNC = UP[PUP]*PR.

Justification:

Projecting TPG over θ_PNC = {UPU, UPR} will generate a network g_PNC in which users are linked to each other if there is a UPU path between them in TPG, and a user is linked to a representative if there exists a UPR path between them in TPG. Accordingly, the projected graph will contain user-user and user-representative edges, and its schema will be as in Fig. 8b. Similar to Proposition 1, we can see that the recommendation paths of g_PNC are described by strdesc_g = {(UU)*UR}, because the recommendation meta-paths contain some number of repetitions of loops from U to U, and then they all pass the UR meta-path to form a recommendation meta-path from users to representatives.

As {UU} ≅ {UPU} and {UR} ≅ {UPR}, we can infer that {(UU)*} ≅ {(UPU)*}. Consequently, we have {(UU)*UR} = {[(UU)*].[UR]} ≅ {[(UPU)*].[UPR]} = {(UPU)*UPR}. Additionally, we have {(UPU)*UPR} = {UPR, UPU.UPR, UPU.UPU.UPR, …} = {UP.PR, UP.PUP.PR, UP.PUP.PUP.PR, …} = {UP[PUP]*PR}, where "." denotes the join operation. So, we can infer that strdesc_g = {(UU)*UR} ≅ {(UPU)*UPR} = {UP[PUP]*PR} = strdesc_PNC.

Corollary 2.

Projecting TPG with regard to the set of meta-paths θ_PNC results in the network g_PNC. Personalized PageRank of the target user in g_PNC only takes into account the reliable meta-paths of the preference-based approach to NC-Rank.

Proposition 3.

Projecting TPG over the meta-paths in θ_RNC = {UPR, RPU} results in a graph g_RNC, all of whose recommendation meta-paths are described by strdesc_RNC = UPR(RPUPR)*.

Justification: Projecting TPG over θ_RNC = {UPR, RPU} results in a network in which users and representatives are connected to each other if there is a path of the form UPR/RPU between them. The projected network, denoted by g_RNC, is a bipartite directed network with the schema of Fig. 8c. As mentioned, all recommendation meta-paths of g_RNC are obtained by applying breadth-first search over its schema. Trivially, these paths contain the UR link and some repetition of loops from R to R, which is described by a string of the form (RUUR)*. Therefore, the recommendation meta-paths of g_RNC are denoted by strdesc_g = UR(RUUR)*, which is generated by applying a join operation over UR and (RUUR)*.

As {UR} ≅ {UPR} and {RU} ≅ {RPU}, we can infer that {RUUR} ≅ {RPUPR}. Furthermore, as {UR(RUUR)*} is generated through concatenation of UR with (RUUR)*, its equivalent over TPG will be obtained by joining {UPR} and {(RPUPR)*}. So, we have strdesc_g = {UR(RUUR)*} = {[UR].[(RUUR)*]} ≅ {[UPR].[(RPUPR)*]} = {UPR(RPUPR)*} = strdesc_RNC.

Corollary 3. Projecting TPG with regard to the set of meta-paths θ_RNC results in the network g_RNC. Personalized PageRank of the target user in g_RNC only takes into account the reliable meta-paths of the representative-based approach to NC-Rank.

It is worth mentioning that there is an interesting relation between the different network schemas. g_PNC takes into account relations over pairwise preferences, but it does not provide the ability to propagate ranking from representatives. On the other hand, g_RNC models users' similarity over representatives, but it ignores the relations between users and preferences. g_UNC is where g_PNC and g_RNC meet: it lets similarities be calculated based on both kinds of available evidence, common pairwise preferences and common representatives among users.

Weight assignment

Now that we have projected TPG using a set of meta-paths, the remaining question is how to weigh each edge in the projected network. Our final goal is to use personalized PageRank, the most well-known proximity measure, to score items based on the probability that a random surfer starting at the target user will reach each representative. Those probabilities can be adjusted by assigning weights to the edges of the graphs resulting from the projection process. To preserve the information available in the original data, we need to assign a weight to each link (i, j) in a projected graph based on the probability that the random surfer will move from i to j in the original graph TPG when following the meta-paths in θ. In other words, in a reasonable projection process, TPG must be projected to a weighted directed network with adjacency matrix M_θ, where M_θ(i, j) approximates the probability of transition from i to j in the original network (TPG) when the transition paths are defined by the set of meta-paths θ.

Accordingly, we first estimate the transition matrix M_ρ that reflects the transition probability from i to j using a meta-path ρ ∈ θ. Then, we aggregate the whole set of transition matrices to infer the adjacency matrix of the projected network over θ. To calculate the transition matrix of each meta-path, we define the basic adjacency matrices M_UP, M_PU, M_PR, and M_RP as:

$$M_{UP}(u, p) = \begin{cases} \dfrac{1}{deg(u)}, & (u, p) \in E_{UP} \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

$$M_{PU}(p, u) = \begin{cases} \dfrac{1}{deg(p)}, & (u, p) \in E_{UP} \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

$$M_{PR}(p, r) = \begin{cases} \dfrac{1}{deg(p)}, & (p, r) \in E_{PR} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

$$M_{RP}(r, p) = \begin{cases} \dfrac{1}{deg(r)}, & (p, r) \in E_{PR} \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

where deg(v) indicates the number of nodes adjacent to a node v in TPG, E_UP is the set of edges between users and preferences, and E_PR is the set of edges between preferences and representatives.

In the next step, the transition matrix of each meta-path is obtained through $M_{A_1 A_2 \ldots A_k} = \prod_{i=1}^{k-1} M_{A_i A_{i+1}}$, where $A_1 A_2 \ldots A_k$ indicates a meta-path with length k starting from node type $A_1$ and ending at node type $A_k$. For instance, the basic adjacency matrices M_UP and M_PR should be multiplied to obtain the transition matrix that represents the probability of transition among nodes using meta-path UPR, i.e. M_UPR = M_UP × M_PR.

We can then define the adjacency matrix of the projected network for each set of projection meta-paths (i.e. θ_UNC, θ_PNC, and θ_RNC) through a linear combination of the transition matrices of its members (Eqs. 6-8). In each case, the entry M_θ(i, j) is taken from the transition matrix of the meta-path ρ ∈ θ that connects the node type of i to the node type of j, and is normalized by sums of the form Σ_j M_ρ(i, j) over the corresponding transition matrices. For g_UNC (Eq. 6), M_UPU is used when i, j ∈ U, M_UPR when i ∈ U and j ∈ R, and M_RPU when i ∈ R and j ∈ U. For g_PNC (Eq. 7), M_UPU is used when i, j ∈ U, and M_UPR when i ∈ U and j ∈ R. For g_RNC (Eq. 8), M_UPR is used when i ∈ U and j ∈ R, and M_RPU when i ∈ R and j ∈ U.

Once the networks are constructed, we can use personalized PageRank on these networks to infer a recommendation. Personalized PageRank with a target user u can be calculated by iteratively computing Eq. 9:

$$PPR_u = \alpha\, M^{T} PPR_u + (1 - \alpha)\, d \tag{9}$$

where PPR_u denotes the personalized PageRank values of the target user u, α is the damping factor, and d is the personalization vector, which is a one-hot encoding of the target user u. We define the scoring function for each representative r as $closeness_{alg}(u, r) = PPR_u(r)$. PPR values usually converge in a small number of iterations (20 iterations in our experiments). We can score the different representatives from the user-based, preference-based, and representative-based NC-Rank perspectives by substituting M with one of the transition matrices M_θUNC, M_θPNC, or M_θRNC. When the score values of the representatives are calculated, we can rank items by aggregating the score values of their corresponding representatives as in Eq. 10:

$$score_{alg}(u, i) = closeness_{alg}(u, i_d) - closeness_{alg}(u, i_u) \tag{10}$$

where $i_d$ and $i_u$ are the desirable and undesirable representatives of item i.

Experimental Setting and results

We conducted a comprehensive set of experiments to evaluate the performance of the ReGRank framework, which models user-based, preference-based, and representative-based NC-Rank approaches. We denote the user-based variant of ReGRank as U-ReGRank, the preference-based variant as P-ReGRank, and the representative-based variant as R-ReGRank. We assess these algorithms on three publicly available datasets, Movielens100K, Movielens1M, and FilmTrust, that are widely used for evaluating recommendation systems [1,3–5]. Both Movielens datasets are comprised of ratings at 5 levels (i.e. 1, 2, …, 5), while FilmTrust contains ratings from 0.5 to 4 at 8 levels (i.e. 0.5, 1, 1.5, …, 4).

We evaluate the recommendation algorithms through the well-known evaluation methodology of the collaborative ranking literature [18,20,26,32]: we take a fixed number UPL of ratings of each user as his training data, and his other ratings are considered as test samples. This methodology is widely accepted in the community for several reasons. First, it enables us to compare the performance of algorithms under different levels of sparsity. Second, the number of higher ratings in the test set simulates real-world recommender systems that should only suggest a small number of items among a large number of unrated items [3,28]. For each value of UPL, 5 variants of the training set are generated via random sampling, and the average performance on their corresponding test sets is reported. In this paper, we assess the performance of algorithms with regard to data sparsity by changing UPL from 10 to 50 for all datasets. We ensure that for each UPL, each user has at least 10 items in the test set, as recommendation algorithms are evaluated through their Top-10 recommendation.

We assess the performance of algorithms using the Normalized Discounted Cumulative Gain (NDCG) of their Top-N recommendation. Let U be the set of users, $l_u$ be the recommendation list for user u, and $l_u^{*}$ be the ideal recommendation list for user u.
Then, we can calculate the average NDCG of the Top-N recommendation using Eq. 11:

$$NDCG@N = \frac{1}{|U|} \sum_{u \in U} \frac{DCG@N(l_u)}{DCG@N(l_u^{*})} \tag{11}$$

where $l_u$ and $l_u^{*}$ are the recommendation list and the ideal recommendation list of user u, and DCG is computed through Eq. 12:

$$DCG@N = \sum_{i=1}^{N} \frac{2^{rel_i} - 1}{\log_2(i + 1)} \tag{12}$$

where $rel_i$ is the relevancy of the i-th item in the recommendation list. In addition to recommendation quality, we assess the scalability of the ReGRank framework by measuring its recommendation time.

We assess the performance of ReGRank against state-of-the-art neighbor-based collaborative ranking and graph-based recommendation algorithms. We also compare it to CofiRank, the most well-known matrix factorization algorithm for collaborative ranking. We briefly explain the algorithms below:

- EigenRank: EigenRank [12] is the most famous algorithm in the family of neighbor-based collaborative ranking algorithms. It uses Kendall correlation to find similarities between users' rankings, and then uses a random walk algorithm to aggregate the pairwise preferences and infer the total ranking of the target user.

- NN-GK: NN-GK [5] is a recent neighborhood recommendation algorithm for pairwise preference datasets. NN-GK calculates users' similarity through Kruskal's gamma (GK), which is equivalent to Kendall correlation in rating datasets. Thereafter, it scores each item by summing up the neighbors' opinions over its comparisons with other items.

- SibRank: SibRank [18] is a state-of-the-art neighbor-based collaborative ranking algorithm that calculates users' similarities based on signed multiplicative rank propagation from a target user's node on a signed bipartite preference network.

- PrefRank: PrefRank [7] presents another novel approach, which transforms users' pairwise preferences to a preference score and then uses a traditional rating approach to calculate users' and items' similarities. We call its user-based variant U-PrefRank and its item-based variant I-PrefRank.

- GRank: GRank [20] is the state-of-the-art algorithm that processes TPG to make recommendations. GRank applies personalized PageRank on TPG in order to estimate the desirability/undesirability of items. Thereafter, to infer the overall score of an item, it aggregates the scores of its corresponding representative nodes.

- CofiRank: CofiRank [31] is one of the state-of-the-art algorithms in the class of matrix factorization collaborative ranking. CofiRank learns the latent factors of users and items while optimizing a structured loss function. CofiRank's framework has been adapted for ordinal and NDCG ranking loss functions [32]. CofiRank-Ordinal minimizes the ordinal loss function, which refers to the number of discordant preferences in the recommendation list. On the other hand, CofiRank-NDCG tries to maximize the NDCG, which reflects the quality of the recommendation list with respect to the ideal one. We used the publicly available code for CofiRank, with the optimal parameter values provided in [32].

We implemented the traditional neighbor-based collaborative ranking algorithms, EigenRank, NN-GK, SibRank, and PrefRank, using a neighborhood size of 100, which is reportedly the neighborhood size that leads to the best performance for those algorithms. Additionally, we set the damping factor to 0.85 for GRank, EigenRank, SibRank, and the different classes of the ReGRank framework.
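The scoring step shared by the ReGRank variants, personalized PageRank followed by aggregation of representative scores, can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the projected network is given as a hypothetical nested dictionary of transition weights, representatives are named with the hypothetical suffixes `_d` and `_u`, the damping factor is 0.85 with 20 iterations as in the setup above, and an item is scored as the value of its desirable representative minus that of its undesirable one.

```python
def personalized_pagerank(M, target, alpha=0.85, iters=20):
    """Power iteration for personalized PageRank.

    M: dict node -> {neighbour: transition weight} (assumed row-normalized)
    target: the node that receives the whole personalization mass
    """
    nodes = set(M) | {j for row in M.values() for j in row}
    ppr = {v: 1.0 if v == target else 0.0 for v in nodes}
    for _ in range(iters):
        # Restart mass goes to the target; the rest flows along weighted edges.
        nxt = {v: (1.0 - alpha) if v == target else 0.0 for v in nodes}
        for i, row in M.items():
            for j, weight in row.items():
                nxt[j] += alpha * ppr[i] * weight
        ppr = nxt
    return ppr


def item_scores(ppr, items):
    """Score of an item: desirable minus undesirable representative value."""
    return {i: ppr.get(i + "_d", 0.0) - ppr.get(i + "_u", 0.0) for i in items}


# Hypothetical projected network: u1 votes for A_d and is similar to u2,
# who votes for B_d and A_u.
M = {
    "u1": {"u2": 0.5, "A_d": 0.5},
    "u2": {"B_d": 0.5, "A_u": 0.5},
}
ppr = personalized_pagerank(M, "u1")
scores = item_scores(ppr, ["A", "B"])
```

Nodes without outgoing edges simply absorb probability mass in this sketch; a production implementation would need an explicit policy for such dangling nodes.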

We first analyze and compare the performance of the different classes of ReGRank against each other. As seen in Tables 1-3, U-ReGRank and R-ReGRank significantly outperform P-ReGRank at lower values of UPL. For instance, in the case of UPL=10, R-ReGRank and U-ReGRank achieve NDCG@10 of 71% for FilmTrust, 69% for ML100K, and 71% for ML-1M. On the other hand, P-ReGRank's performance does not exceed 69% in FilmTrust, 63% in ML100K, and 65% in ML-1M when UPL=10. At higher UPL values, P-ReGRank performs similarly to R-ReGRank and U-ReGRank in the Movielens datasets, and even 2% better in the FilmTrust dataset. That result can be explained through the datasets' sparsity: in sparse datasets, users rarely have common pairwise preferences (e.g. Fig. 4b), and so the similarities between pairwise preferences are not reliable enough. In such a situation, type-2 similarity between users or similarity between representatives is more informative. On the other hand, as UPL increases, users have more common pairwise preferences, and consequently, similarities between pairwise preferences are more reliable for the recommendation. It should be noted that P-ReGRank could not achieve high performance in the case of UPL=50 in the FilmTrust dataset due to its high sparsity, which is a consequence of the low number of users having more than 50 ratings in the FilmTrust dataset. More precisely, in such a dataset, pairwise preferences' similarities are estimated based on a small number of users that cannot provide enough information for an accurate estimation.

We also compare the ReGRank algorithms to GRank, the other state-of-the-art recommendation approach over TPG. Experimental results show the superiority of ReGRank over GRank in the majority of evaluation conditions, especially when UPL takes low values. For instance, in the case of UPL=10, ReGRank shows an improvement of 2% in FilmTrust, 7% in Movielens100K, and 6% in Movielens1M. The reason is that in sparse datasets, TPG contains a large number of paths of the form RPR and PRP, which are not reliable in any neighborhood recommendation approach. Therefore, GRank fails to make accurate recommendations in sparse datasets. At higher UPL values, users have more pairwise preferences that form a large number of reliable recommendation paths passing users and pairwise preferences. Therefore, unreliable recommendation paths have a smaller effect on the scoring mechanism of GRank at higher values of UPL.

We also compare the ReGRank algorithms to the traditional neighbor-based collaborative ranking algorithms EigenRank, NN-GK, SibRank, and PrefRank. The ReGRank algorithms significantly outperform all other algorithms in the majority of evaluation conditions. For instance, for UPL=30 in Movielens1M, the ReGRank algorithms improve NDCG@10 by up to 21% compared to PrefRank, and by up to 4% compared to EigenRank, NN-GK, and SibRank. An interesting point is that the performance of the different classes of PrefRank is significantly lower than that of the other neighborhood algorithms, which calculate users' similarity over pairwise preferences. This result implies that transforming users' rankings or pairwise preferences to numerical scores will probably eliminate some useful information and decrease the recommendation performance.

Finally, we compare our algorithm to the different classes of CofiRank as representatives of the matrix factorization approach. ReGRank significantly outperforms both CofiRank-Ordinal and CofiRank-NDCG in all experimental conditions. As an example, in the case of UPL=50 in the FilmTrust dataset, U-ReGRank, P-ReGRank, and R-ReGRank achieve NDCG@10 of 67%, 66%, and 67%, while CofiRank-NDCG and CofiRank-Ordinal show performance of 60% and 63%, respectively. An interesting point is that, unlike the neighborhood algorithms, CofiRank-NDCG loses performance at higher UPL values. For instance, the performance of CofiRank-NDCG is 68% and 60% in the Movielens1M dataset when UPL=10 and UPL=50, respectively. In contrast, U-ReGRank increases its performance from 71% to 75% when UPL changes from 10 to 50 in Movielens1M. That is because neighborhood algorithms can capture the local taste of users and improve their recommendation accuracy at higher UPL values, whereas model-based algorithms typically learn a model that fits the whole data. More precisely, model-based algorithms learn the global tastes of users and miss information that is available in users' local neighborhoods.
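For reference, the NDCG@N values discussed above can be computed along the following lines. This is a sketch under assumptions: it uses the common exponential gain (2^rel − 1) with a log2 position discount, takes the ideal list to be the user's test items sorted by rating, and the function names `dcg`, `ndcg_at_n`, and `mean_ndcg` are hypothetical.

```python
import math


def dcg(relevances, n):
    """Discounted cumulative gain of the first n relevance values."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 1)
        for pos, rel in enumerate(relevances[:n], start=1)
    )


def ndcg_at_n(ranked_items, test_ratings, n=10):
    """NDCG@n of one user's ranked list against his test ratings."""
    gains = [test_ratings[item] for item in ranked_items]
    ideal = sorted(test_ratings.values(), reverse=True)
    best = dcg(ideal, n)
    return dcg(gains, n) / best if best > 0 else 0.0


def mean_ndcg(rankings, ratings_by_user, n=10):
    """Average NDCG@n over all users (rankings: user -> ranked item list)."""
    values = [ndcg_at_n(rankings[u], ratings_by_user[u], n) for u in rankings]
    return sum(values) / len(values)


# Hypothetical test ratings for a single user.
ratings = {"a": 5, "b": 3, "c": 1}
perfect = ndcg_at_n(["a", "b", "c"], ratings, n=3)
reversed_order = ndcg_at_n(["c", "b", "a"], ratings, n=3)
```

A list ordered by true rating reaches the maximum value of 1, and any other order scores lower.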

Table 1. Comparison of algorithms in terms of NDCG@10 in Movielens100k dataset

Columns: Algorithm, UPL=10, UPL=20, UPL=30, UPL=40, UPL=50
Rows: U-ReGRank, P-ReGRank, GRank, U-PrefRank, I-PrefRank, EigenRank, SibRank, NN-GK, CofiRank-NDCG, CofiRank-Ordinal

Columns: Algorithm, UPL=10, UPL=20, UPL=30, UPL=40, UPL=50
Rows: U-ReGRank, P-ReGRank, GRank, U-PrefRank, I-PrefRank, EigenRank, SibRank, NN-GK, CofiRank-NDCG, CofiRank-Ordinal

Table 3. Comparison of algorithms in terms of NDCG@10 in FilmTrust dataset

Columns: Algorithm, UPL=10, UPL=20, UPL=30, UPL=40, UPL=50
Rows: U-ReGRank, R-ReGRank 0.713±0.007, U-PrefRank, I-PrefRank, EigenRank, SibRank, NN-GK, CofiRank-NDCG, CofiRank-Ordinal

We also analyze the scalability of different neighborhood algorithms in terms of computational complexity and running time. Figure 9 illustrates the running time of neighborhood algorithms, measured in seconds on a Linux-based PC with an Intel Core i7-5820K processor at 3.3 GHz and 32GB of RAM. Note that it is not plausible to compare the running times of neighborhood and matrix-factorization algorithms: neighborhood recommendation typically makes up-to-date recommendations, which requires online processing, while model-based algorithms require offline learning of the latent factors of users and items. Here, we focus on the recommendation process in the neighborhood recommendation approach.

Given m as the number of users and n as the number of items, projecting TPG over UPU path will maximally generate edges and projecting over UPU/UPR will maximally generate edges. Accordingly, contains + 2 links, contains + links, and contains links. ReGRank makes recommendations through personalized PageRank of the target user over , , and . Recall that personalized PageRank can be implemented at computational complexity of O(te), where e denotes the number of links and t is the number of iterations required for convergence. Therefore, R-ReGRank makes recommendations to a target user at computational complexity of () , while P-ReGRank and U-ReGRank recommend at computational complexity of ( + ) . So, R-ReGRank is more scalable than the other variants of ReGRank. This is also reflected in the running times of the different extensions of the ReGRank framework; for instance, R-ReGRank makes a recommendation in 0.07 seconds, P-ReGRank in 0.73 seconds, and U-ReGRank in 0.8 seconds in Movielens1M when UPL=50.
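
The personalized PageRank computation mentioned above can be sketched as a power iteration that visits each of the e links once per iteration, for t iterations until convergence. This is a minimal illustration, not the authors' implementation: the damping factor of 0.85, the handling of dangling nodes, the function name, and the toy graph are all assumptions, and the graph is not one of the projected networks.

```python
from collections import defaultdict

def personalized_pagerank(edges, source, alpha=0.85, tol=1e-9, max_iter=1000):
    """Power iteration for personalized PageRank restarted at `source`.

    edges: list of (u, v, w) directed links with positive weight w.
    Returns (scores, iterations_used).
    """
    out_weight = defaultdict(float)
    nodes = {source}
    for u, v, w in edges:
        out_weight[u] += w
        nodes.update((u, v))

    rank = {n: 0.0 for n in nodes}
    rank[source] = 1.0
    for t in range(1, max_iter + 1):
        nxt = {n: 0.0 for n in nodes}
        # One pass over all e links: the per-iteration cost is linear in e.
        for u, v, w in edges:
            nxt[v] += alpha * rank[u] * w / out_weight[u]
        # Restart mass, plus mass stuck on nodes without out-links,
        # returns to the target user so scores keep summing to 1.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        nxt[source] += (1.0 - alpha) + alpha * dangling
        delta = sum(abs(nxt[n] - rank[n]) for n in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank, t

# Hypothetical toy graph: one target user "u" linked to two items.
edges = [("u", "i1", 1.0), ("u", "i2", 1.0), ("i1", "u", 1.0), ("i2", "u", 1.0)]
scores, iterations = personalized_pagerank(edges, source="u")
print(scores, iterations)
```

Because each iteration touches every link exactly once, the total work grows with the product of the iteration count and the number of links, which is why a projected network with fewer links yields faster recommendations.
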
The running time of R-ReGRank is also up to 40 times less than that of U-PrefRank, 60 times less than I-PrefRank, 80 times less than the pairwise-oriented neighborhood collaborative ranking algorithms NN-GK, SibRank, and EigenRank, and 180 times less than GRank in the Movielens1M dataset when UPL is set to 40. These running times are also compatible with their computational complexity, which is ( + ) for GRank, ( + ) for I-PrefRank, ( + ) for U-PrefRank, and ( + + ) for NN-GK, SibRank, and EigenRank with neighborhood size of k.

Fig. 9. Runtime of recommendation process in neighborhood collaborative ranking algorithms over Movielens1M dataset.

Conclusion

In this paper, we investigated the semantics behind different meta-paths of TPG and their relations to neighborhood collaborative ranking algorithms. We then presented a framework, ReGRank, that makes recommendations through meta-paths that are reliable from the different perspectives of neighborhood collaborative ranking. ReGRank models the reliable recommendation meta-paths of NC-Rank algorithms and then projects TPG onto other networks containing these reliable recommendation meta-paths. Experimental results showed significant improvement of the suggested framework over well-known and state-of-the-art graph-based and neighborhood collaborative ranking algorithms. Furthermore, we observed that using paths that are consistent with reliable user-based and representative-based meta-paths is more effective in sparse datasets, while using paths that are in harmony with preference-based meta-paths leads to better performance in denser datasets.

Despite the overall improvement of the ReGRank framework over existing algorithms, there are several interesting directions for extending the current work. ReGRank scores items based on a diverse set of reliable meta-paths. Although those meta-paths may have different values in scoring items, ReGRank, in its current form, does not make any effort to weight them based on their importance. One possible direction for extending the current research is to assign such weights to the reliable meta-paths. Furthermore, ReGRank follows an offline algorithm to weight each edge in the projected networks, so it would be valuable to design an online algorithm for updating the weights of links as new data is inserted into the system. Finally, it is possible to extend the current research to support reliable graph-based recommendation using heterogeneous information networks containing other types of information, such as content and context.