# Recommendations Based on Listwise Learning-to-Rank by Incorporating Social Information.

1. IntroductionWith the popularity of the Internet and the rapid development of cloud computing technology, the amount of data on the Internet increases dramatically. How to find valuable information from the vast data is becoming more and more challenging for users. In such a context, recommendation system came into being, which could provide users with a recommendation list to meet their preference. Since the recommendation system can effectively solve information overload problem, it has drawn a lot of attention from academia and industry.

Collaborative filtering (CF) is currently one of the most widely used algorithms in the recommendation field. Based on the assumption that similar users have similar interests, CF predicts the ratings of items and then generates the recommendation list according to the descending order of predicted ratings. About rating prediction, many works have been proposed, such as social-aware prediction [1], location-aware prediction [2], time-aware prediction [3], but they all ignore the ordering information of rating data, which is important to improve rating prediction accuracy.

Furthermore, some researchers recently have found that rating-based recommendation is not always accurate. For example, suppose a user rates 2 points for item A and 3 points for item B. By using different algorithms to predict the ratings, we will get different recommendation results. One prediction is item A scoring 2.5 points and item B scoring 3.6 points; the other one is item A scoring 2.5 points and item B scoring 2.4 points. MAE (mean absolute error) is one of the most widely used evaluation criteria of CF algorithms. The two algorithms have the same value of MAE (i.e. (0.5 + 0.6)/2 = 0.55), meaning that their rating prediction accuracy is the same. However, the order of item A and B in their recommendation lists is quite opposite. This example verifies the conclusion that rating prediction accuracy is difficult to interpret as a measure of recommendation accuracy [4].

In order to solve the problem, researchers began to integrate learning-to-rank (LTR) into CF algorithms, believing that directly optimizing the recommendation list is more in line with the goal of recommendation system. Similar to the principle of machine learning, LTR-based CF algorithms learn an optimal ranking model based on historical rating records. Since the ranking model focuses on the order of items in recommendation list rather than the rating, it can output a recommendation list more consistent with the user preference. In recent years, LTR-based CF algorithms have received more and more attention [5], but they still suffer from traditional challenges in recommendation field, such as data sparsity problem, malicious ratings, and so on. Some methods have been proposed in related works, such as incorporating users' social networks [6], employing clustering models [7], but they all aim to predict the missing ratings rather than directly optimize the recommendation list. Currently little effort has been devoted to exploiting LTR-based CF in solving those traditional challenges.

In this paper, we propose a recommendation method based on listwise LTR by incorporating social information (ListSo). It considers both the trust relationship between users and ordering information of rating data, and learns an optimal recommendation model by listwise LTR algorithm. The contribution of this paper is three-fold:

(1) Unlike the previous studies, we propose an enhanced measurement to compute the user similarity. By using the Plackett-Luce probability distribution model, the information included in the order of items is fully exploited to find more accurate similar neighbors.

(2) While predicting the rating, the influence of neighbors is integrated into the matrix factorization model, effectively alleviating the data sparsity problem. Meanwhile, the model is also used to explore the trust relationship between users, so as to reduce of impact of malicious rating.

(3) By using the listwise LTR algorithm, an optimal recommendation model is learned from the training dataset. Experiments show that ListSo can be applied to large-scale datasets.

The rest of this paper is organized as follows: Section 2 surveys the related works on recommendation methods. Section 3 shows the framework and details of our approach, including the similarity evaluation, rating prediction, trust evaluation and listwise LTR algorithm. Section 4 describes the experimental results. Finally, we draw conclusions and outline our future work in Section 5.

2. Related Work

Many CF algorithms have been proposed in the recommendation field, which can be divided into two categories: rating-based CF algorithms and LTR-based CF algorithms [9]. Next, we will introduce related works of the two categories in detail.

2.1 Rating-based CF Algorithms

Rating-based CF algorithms can be divided into two categories: memory-based and model-based. Memory-based CF is one of the earliest and most widely used recommendation algorithms, which has been successfully applied to many e-commerce systems. Memory-based CF finds similar users or items as neighbors according to the rating data, and predicts missing ratings based on those neighbors, so it can be further divided into user-based CF and item-based CF. Wang et al. [10] took users' interests and the timeliness of items into consideration, and found the nearest neighbor set by combining with item-based CF algorithms. Jia et al. [11] proposed a novel CF recommendation algorithm based on the double neighbor choosing strategy. They chose the trustworthy neighbor set of target user by considering the similarity and trust relationship between users at the same time. Li et al. [12] differentiated the non-target users into two types that without recommending ability and with recommending ability. They only cared about the latter users and proposed a "domain nearest neighbor" method to predict the missing ratings of items.

Different from memory-based CF, model-based CF learns a model by training the rating data, and then uses it to predict missing ratings. The popular models include clustering model, Bayesian model, linear regression model, Markov model, matrix factorization model, etc. The matrix factorization model is the most widely used among them. It decomposes the high-dimensional user-item rating matrix into low-dimensional user latent matrix and item latent matrix, thereby reducing the dimensionality of rating data and alleviating the data sparsity problem. Guo et al. [13] combined the community structure and interest clusters information into matrix factorization model by adding a clustering-based regularization term to improve the objective function. Jamali et al. [14] proposed a recommendation model named SocialMF. They assumed that the latent feature vector of the target user was determined by that of his friends, and introduced the concept of trust propagation into the process of matrix factorization. Yang et al. [15] classified the user's friends according to the categories of items, and then gave them different level of trust, but it may further aggravate the data sparsity problem.

In conclusion, either memory-based CF or model-based CF, they both aim to predict missing ratings. However, the recommendation list got by sorting predicted ratings can not accurately reflect user's preference, which has been illustrated by the example in the introduction. So, recently researchers have begun to study how to incorporate learning-to-rank into CF to optimize the recommendation list directly.

2.2 LTR-based CF Algorithms

Learning-to-rank (LTR) is a supervised machine learning algorithm, which aims to get the optimal ranking model by training the dataset [5]. By learning the optimal parameters automatically, LTR can integrate complex features and reduce the defect of considering single factors while ranking, so it has become popular in the recommendation field. According to the input data, LTR can be divided into three categories: pointwise, pairwise and listwise. Pointwise LTR inputs individual items and predicts their missing ratings, which is very similar to rating-based CF. Pairwise LTR defines the partial order between items according to their ratings, and generates the recommendation list by integrating the partial order of all item pairs. Listwise LTR inputs all the items and directly optimizes the entire list, which is the most popular among them.

Liu et al. [4] predicted the partial order of unrated item pairs based on similar users, and generated a recommendation list by incorporating all the partial orders. Pan et al. [16] converted the user-item matrix into the partial order matrix according to the user's browsing records, and proposed a personalized ranking algorithm based on Bayesian theory. Both [4] and [16] have high recommendation accuracy, but their time complexity will be high when being applied to large datasets. Shi et al. [17] combined listwise LTR with matrix factorization and got the recommendation list by minimizing the loss function representing the uncertainty between training lists and predicted lists. Weimer et al. [18] proposed an algorithm called CoFiRank which directly optimized the evaluation criteria NDCG. Huang et al. [19] directly predicted a total order of items for each user based on similar users' probability distributions over permutations of the items.

As a kind of machine learning algorithms, LTR can effectively process large-scale datasets, but it still suffers from data sparsity problem, as users only rate a few items. In this paper, we will address this issue by mining the information included in the order of items and incorporating social information into recommendation.

3. The Recommendation Approach

Traditional CF-based recommendation algorithms find similar users according to the historical ratings, and then predict missing ratings for the target user based on those similar users. However, if some neighbors give malicious ratings on some items, the recommendation accuracy will be severely affected. Aiming at this problem, both similarity and trust relationship are taken into consideration, thereby improving the recommendation accuracy and reducing the impact of malicious ratings.

3.1 Framework of Our Approach

Now we describe the framework of our approach as shown in Fig. 1. For simplicity, we suppose that the rating dataset is already provided or acquired.

Our approach has four subcomponents: similarity evaluation, rating prediction, trust evaluation and listwise learning-to-rank algorithm. The similarity evaluation employs the Plackett-Luce probability distribution model to calculate the user similarity, thereby making full use of the information included in the order of items and finding more accurate neighbors. The rating prediction is performed by integrating the influence of neighbors into matrix factorization model, effectively alleviating the data sparsity problem. The trust evaluation explores the trust relationship between users from their social information, so as to reduce the impact of malicious ratings. Based on the rating and trust relationship, the listwise LTR algorithm learns an optimal ranking model from the training dataset. Finally, the ranking model will output a recommendation list more consistent with the user preference. The major symbols used in this paper are listed in Table 1.

3.2 Similarity Evaluation

The key step in CF-based recommendation algorithms is finding similar users. Methods of calculating the user similarity include Pearson correlation, cosine similarity, modified cosine similarity, etc [12]. The Pearson correlation is most widely used in related papers, which is defined as follows:

Definition 1 (Pearson correlation) If [r.sub.ui] and [r.sub.vi] respectively denotes the rating of user u and v for item i, [[bar.r].sub.u] and [[bar.r].sub.v] respectively denotes the average rating of user u and v, [I.sub.u,v] is the commonly rated item set of user u and v, then the similarity between user u and v based on Pearson correlation is:

[mathematical expression not reproducible] (1)

However, in this paper we argue that using the Pearson correlation to compute the user similarity is not accurate. For example, assume user A, B and C commonly rate item a, b, c, d, and the ratings range from 0 to 100. Their rating vectors for the four items are respectively denoted as A = {0,20,80,100}, B = {10,0,80,100}, C = {0,22,100,89}. As shown in the Fig. 2, by sorting the items according to their ratings, the recommendation list for each user is obtained. It can be seen that the recommendation order of user B for item a and b is contrary to that of user A, while the recommendation order of user C for item c and d is contrary to that of user A. As a result, user B would recommend item d to user A, while user C would recommend item c to user A. It can be seen that the recommendation of user B is more consistent with user A preference, so the user similarity between A and B should be larger. However, according to Definition 1, we can get PCC(A,B)=PCC(A,C)=0.97, meaning the similarity between A and B is the same as the similarity between A and C, which is unreasonable. In the actual recommendation system, users are more concerned about the items ranked ahead in the recommendation list, such as c and d, while paying less attention to those items ranked behind in the list, such as a and b, because low rating means the items do not meet the user preference. So the order of items in the recommendation list should also be considered while computing the user similarity.

In this paper, both ratings and order of items are taken into account, so as to find more accurate similar users. By using the Plackett-Luce model in [20] to represent the user with the probability distribution over the permutations of rated items, the user similarity is computed more accurately.

The main idea of Plackett-Luce model is as follows: given a set I containing n items and the corresponding ranking function, the item set I has n! kinds of permutations. Each permutation can be denoted as [pi] =< [[pi].sub.1], [[pi].sub.2],..., [[pi].sub.n] >, where [[pi].sub.i] [member of] I denotes the item ranked at position i. The set of all possible permutations is denoted as [[ohm].sup.I] . Each item in the permutation has its own rating, so different permutation has different probability. Those permutations where items with larger ratings are ranked higher should have larger probability.

Definition 2 (Permutation Probability) Given a permutation [pi] =< [[pi].sub.1], [[pi].sub.2],..., [[pi].sub.n] > and the rating of the corresponding items [mathematical expression not reproducible], the probability of [pi] is defined as:

[mathematical expression not reproducible] (2)

Where [phi](*) is a monotonically increasing and strictly positive function. In this paper, we assume [phi](x) = [e.sup.x].

In the real network environment, the number of items in the set I is often very large, so computing the probability of n!different permutations will consume lots of time. Cao et al. [21] proposed a permutation model, which only focused on the items in the top-k positions of the permutation.

Definition 3 (Top-k permutation set) Given an item set I={[i.sub.1],[i.sub.2],[i.sub.3],...,[i.sub.n]}, a top-k permutation set [T.sub.k] ([i.sub.1], [i.sub.2],..., [i.sub.k]) of I includes all the permutations where the top-k items are [i.sub.1], [i.sub.2], ..., [i.sub.k] [member of] I, which can be denoted as

[mathematical expression not reproducible] (3)

According to Definition 3, two top-k permutation sets are different if the top-k items are different. So for set I of n items, there are n!/ (n - k)! different top-k permutation sets in all, which are denoted as set [T.sup.I.sub.k]. For example, let I={[i.sub.1],[i.sub.2],[i.sub.3]}. There are 3!=6 different permutations of I: < [i.sub.1],[i.sub.2],[i.sub.3]>, < [i.sub.1],[i.sub.3],[i.sub.2]>, < [i.sub.2],[i.sub.1],[i.sub.3]>, < [i.sub.2],[i.sub.3],[i.sub.1]>, < [i.sub.3],[i.sub.2],[i.sub.1]>, < [i.sub.3],[i.sub.1],[i.sub.2]>. If k=1, there are 3!/(3-1)!=3 different top-1 permutation sets: [T.sup.I.sub.1] = {[T.sub.1]([i.sub.1]),[T.sub.1]([i.sub.2]),[T.sub.1]([i.sub.3])}, where [T.sub.1]([i.sub.1]) = {< [[i.sub.1].bar],[i.sub.2],[i.sub.3] >, < [[i.sub.1].bar],[i.sub.3], [i.sub.2] >}, [T.sub.1]([i.sub.2]) = {< [[i.sub.2].bar],[i.sub.1], [i.sub.3] >, < [[i.sub.2].bar],[i.sub.3], [i.sub.1] >}, [T.sub.1]([i.sub.3]) = {< [[i.sub.3].bar],[i.sub.1], [i.sub.2] >, < [[i.sub.3].bar],[i.sub.2], [i.sub.1] >}.

Cao et al. [21] gave an efficient way to compute the probability of the top-k permutation set.

LEMMA The probability of the top-k permutation set P([T.sub.k] ([i.sub.1], [i.sub.2],...,[i.sub.k] )) is the probability of items [i.sub.1], [i.sub.2],..., [i.sub.k] being ranked in the top-k positions, which can be computed as:

[mathematical expression not reproducible] (4)

Where [mathematical expression not reproducible] denotes the rating of the item ranked in position j. Thus, computing the probability of the top-k permutation sets in [T.sup.I.sub.k] is much more efficient than computing the probabilities of all the permutations of I.

The above is an introduction of Plackett-Luce model. Next we will illustrate how to apply Plackett-Luce model to compute the user similarity.

In a recommendation system, for any two users u and v, let [I.sub.u,v] be the set of items they commonly rated, and [mathematical expression not reproducible] be the set of top-k permutation sets of [I.sub.u,v]. Given the rating of u and v, their probability distributions over [mathematical expression not reproducible] can be computed by Eq.(4), denoted as [P.sub.u] and [P.sub.v] respectively. Then the similarity between u and v can be measured by the distance between the probability distribution [P.sub.u] and [P.sub.v] . The closer [P.sub.u] and [P.sub.v] is, the more similar u and v is.

The Kullback-Leibler (KL) distance [22] is commonly used to measure the distance of probability distributions in probability theory, so we use it to compute the user similarity in this paper. For user u and v, the KL distance between [P.sub.u] and [P.sub.v] is computed as follows:

[mathematical expression not reproducible] (5)

Where [P.sub.u] (g) and [P.sub.v](g) represents the probability of the top-k permutation set [mathematical expression not reproducible] for u and v respectively, which can be computed by Eq.(4). According to Eq.(5), it can be seen that the KL distance is asymmetric, i.e., [D.sub.KL] ([P.sub.u]||[P.sub.v]) [not equal to] [D.sub.KL] ([P.sub.v]||[P.sub.u]). Therefore a symmetric similarity measurement based on KL distance is proposed.

Definition 4 (User similarity) The user similarity between u and v is defined as follows:

s(u, v) = 1 - 1/2([D.sub.KL] ([P.sub.u] ||[P.sub.v]) + [D.sub.KL] ([P.sub.v] ||[P.sub.u])) (6)

However, according to Eq.(6), when the number of commonly rated items of u and v is small, the user similarity will be too large. In order to solve the problem, we multiply s(u,v) by [mathematical expression not reproducible], where c is threshold of |[I.sub.u,v]|, i.e.

[mathematical expression not reproducible] (7)

When the number of commonly rated items is less than c, the similarity will be reduced to the normal level.

For each user [u.sub.i], compute the similarity s'(u, v) between [u.sub.i] and each other user. Select the top-k users with larger similarity as the neighbors of [u.sub.i], denoted as [N.sub.i], which will be used in subsequent rating prediction.

Using the algorithm to compute the user similarity in Fig. 2, we can get s'(A,B) = 0.95, s'(A, C) = 0.82, which verifies the conclusion that the user similarity between A and B should be larger.

Algorithm 1. User Similarity Computation Input: user u; user v; common rated item set [I.sub.u,v]; the rating data of user u and user v; the threshold c; parameter k Output: the similarity between u and v: s'(u,v) 1: [mathematical expression not reproducible]; 2: [mathematical expression not reproducible]; // get all the possible top-k permutation sets of [I.sub.u,v] according to Definition 3 3: for each top-k permutation set [mathematical expression not reproducible] do 4: compute the probability of [mathematical expression not reproducible]for user u by Eq.(4), denoted as [P.sub.u] (g); 5: compute the probability of [mathematical expression not reproducible] for user v by Eq.(4), denoted as [P.sub.v] (g); 6: [mathematical expression not reproducible] 7: End for 8: [mathematical expression not reproducible]; 9: [mathematical expression not reproducible]; 10: return s'(u,v);

3.3 Rating Prediction

Based on the idea "users with similar rating behavior tend to give similar ratings on items [23]", CF-based recommendation algorithms predict missing ratings according to similar users. However, they still suffer from data sparsity problem. There exists massive items in recommendation system, but most users only rate a few of them, so their rating data is very sparse, making CF-based algorithms hard to find accurate similar users.

In this paper, the matrix factorization model is used to alleviate the data sparsity problem. The main idea is to decompose the high-dimensional user-item matrix R into low-dimensional user latent matrix U and item latent matrix V, which is as follows:

R [approximately equal to] [U.sup.T]V (8)

Then the rating of user [u.sub.i] for item j can be predicted as:

[[??].sub.ij]= g ([U.sup.T.sub.i][V.sub.j]) (9)

During the training process, the model constantly adjusts the user latent matrix U and item latent matrix V, so as to minimize the bias between predicted ratings and actual ratings. However, in the actual recommendation system, the rating behavior of the target user is often influenced by his similar neighbors. In order to further improve the rating prediction accuracy, the influence of similar users is integrated into the process of matrix factorization, modifying Eq.(9) as follows:

[mathematical expression not reproducible] (10)

Where [N.sub.t] is the neighbor set of user [u.sub.i]; s'(i,k)is the user similarity between [u.sub.i] and [u.sub.k], computed by Eq.(7); [alpha] controls the influence of similar neighbors; g(x) is the logistic function, assumed as g (x) = 1/1 + [e.sup.-x] in this paper. It can be seen that when the neighbor user [u.sub.k] gives a high rating on item j ([U.sup.T.sub.K][V.sub.j] is large), the predicted rating [[??].sub.ij] is supposed to be increased as well.

3.4 Trust Evaluation

In traditional CF-based recommendation algorithms, if there are some neighbors giving malicious ratings on items, the recommendation accuracy will be severely affected. There are also works about the trust-based recommendation [11][15], but they still suffer from data sparsity problem, due to the fact that trust relationship between users is also sparse under the big data environment.

To solve the problem, in this section the matrix factorization model is also used to predict the unknown trust relationship between users. By decomposing the high-dimensional user-trust matrix T into low-dimensional user latent matrix U and trust latent matrix Z, the trust relationship between [u.sub.i] and [u.sub.k] can be predicted as follows:

[[??].sub.ik] = g ([U.sup.T.sub.i][Z.sub.k]) (11)

Where U is the same user latent matrix in rating prediction. g(x) is the logistic function that bounds the range of [U.sup.T.sub.i][Z.sub.k] within [0,1], assumed as g(x) = 1/1 + [e.sup.-x] in this paper.

In real social network, the trust relationship between users is asymmetric, because A trusts B does not mean B trusts A. So [t.sub.ik] [member of] {0,1} can not accurately reflect the trust between users. The structure of social network needs to be taken into account while computing the trust value between users. For example, as shown in Fig. 3, when user [u.sub.i] trusts many users, the trust value [t.sub.ik] between user [u.sub.i] and user [u.sub.k] should be lowered. When user [u.sub.k] is trusted by many users, [t.sub.ik] should be enhanced.

Based on the above analysis, the original trust value [t.sub.ik] [member of] {0,1} is modified as follows [24]:

[mathematical expression not reproducible] (12)

Where [d.sup.+] ([u.sub.i]) denotes the number of users that [u.sub.i] trusts, and [d.sup.-] ([u.sub.k]) denotes the number of users who trust user [u.sub.k].

3.5 Listwise Learning-to-rank Algorithm

Listwise LTR algorithm can directly optimize the final recommendation list, so as to avoid the inaccuracy caused by generating recommendation list based on the descending order of ratings. To employ listwise LTR algorithm, we need to take the cross entropy between the predicted list and correct list as the loss function [5]. By minimizing the loss function, we finally get an optimal ranking model, which outputs a recommendation list more consistent with the user preference. However, listwise LTR algorithm still faces the challenge of data sparsity problem and malicious rating. In this section, we incorporate the rating data and trust relationship into listwise LTR algorithm. Before introducing the specific algorithm, some definitions are given as follows.

Definition 5 (Top-one probability) It is the probability of an item with rating [r.sub.jj] being ranked at the top-one position in the user's recommendation list, which is computed as follows:

[mathematical expression not reproducible] (13)

Where [l.sub.i] denotes the recommendation list of user [u.sub.i], and n is the number of items in the recommendation list.

On the one hand, we get the correct list according to the rating and trust relationship provided by the training dataset. On the other hand, we get the predicted list according to the rating and trust relationship predicted by our approach. Based on top-one probability, the cross entropy between the predicted list and the correct list is taken as the loss function, which is as follows:

[mathematical expression not reproducible] (14)

Where [r.sub.ij] is the rating provided by the training dataset, [[??].sub.ij] is the predicted rating computed by Eq.(10), and [I.sup.R.sub.ij] is an indicator function, which equals 1 if user [u.sub.t] rated item j, and 0 otherwise. [t.sup.*.sub.ik] is the modified trust value computed by Eq.(12), [[??].sub.ij] is the predicted trust value computed by Eq.(11), and [I.sup.T.sub.ik] equals 1 if user [u.sub.i] trusts user [u.sub.k], and 0 otherwise. [[lambda].sub.t] is a parameter that controls the importance of trust regularization, and [lambda] is the regularization coefficient used to reduce over-fitting. [||*||.sup.2.sub.F] denotes the Frobenius norm.

Having formulated the non-convex loss function as shown in Eq.(14), the next step is the training process. Compute the gradient of each latent vector, such as [U.sub.i], [V.sub.j], [Z.sub.k] defined in Section 3.1, and learn them by stochastic gradient descent as shown in Eq.(15), from which the local minimum of the cross-entropy loss function can be obtained.

[mathematical expression not reproducible] (15)

Where [eta] denotes the learning rate. Repeat the iteration and compute the MAE of loss function L. When the MAE is less than the predefined threshold [epsilon] or the maximum iterations t-max is reached, the training process is completed. By sorting items according to the descending order of their top-one probability, we can get a recommendation list more consistent with the user preference.

Algorithm 2. listwise learning-to-rank algorithm Input: training dataset; calculated user similarity by Algorithm 1, learning rate [eta]; maximum iterations t-max; [[lambda].sub.t],[lambda],[alpha]; Output: the recommendation list for each user; 1. randomly initialize U, V, Z 2. compute the initial loss function L by Eq.(14) 3: for t[left arrow]1 to t-max do 4. update [U.sub.i], [V.sub.j], [Z.sub.k] by Eq.(15) 5. record the loss function in the last iteration as L 6: for each user [u.sub.i] [member of] U 7: for each item j [member of]V 8: update the predicted rating [[??].sub.ij] by Eq.(10) 9: end for 10: end for 11: for each user [u.sub.i] [member of] U 12: for each user [u.sub.k] [member of] U,k [not equal to] i 13: update the predicted trust value [[??].sub.ik] by Eq.(11) 14: end for 15: end for 16: update the loss function L' by Eq.(14) 17: if(||L'-L|| <[member of]) then 18: break; 19: t[left arrow]t+1 20: end for //the iteration completes 21: for each user [u.sub.i] [member of] U 22: sort items according to the descending order of their top-one probability 23: return the recommendation list. 24: end for

3.6 Complexity Analysis

Evaluating the complexity comprises the computation process of ListSo: (1) The computational complexity of user similarity is O([m[n.sub.1]!]/[([n.sub.1]-k)!]), where m is the number of users, [n.sub.1] is the average number of items that are commonly rated by two users, and k denotes the parameter in the top-k model. (2) Computing the loss function L is of complexity O(|R|d + |R|[m.sub.1]d + |T|d + |T|[m.sub.2]d), where |R| is the number of observed ratings in matrix R, |T| is the number of observed trust relationship in matrix T, d is the dimensionality of latent vector, [m.sub.1] is the average number of neighbor users, and [m.sub.2] is the average number of users that a user trust. (3) Computing the gradients [mathematical expression not reproducible] are of complexity O(|R|d + |R|[m.sub.1]d + |T|d + |T|[m.sub.2] d), O(|R|d + |R|[n.sub.2] [m.sub.1]d), O(|T|d + |T|[m.sub.2] d), respectively, where [n.sub.2] is the average number of items that a user rated. (4) Computing the top-one probability of the items to be predicted is of complexity O([n.sub.3]d + [n.sub.3][m.sub.1]d), where [n.sub.3] is the number of items to be predicted.

Actually, the value of k is usually small, which will be illustrated in experiments. So it can be inferred that the total computational complexity has a linear correlation with the number of users, the number of items to be predicted and the number of observed data in matrix R and T. Therefore, the proposed approach can be applied to large-scale dataset.

4. Experiments

4.1 Dataset

During our experiments, we adopt two public real-world datasets, FilmTrust [25] and Epinion [26]. They contain rating of items and trust relationship between users. The rating of Epinion is an integer between 1 and 5, while the rating of FilmTrust is a number from 0.5 to 4.0 with an interval of 0.5. The details are shown in Table 2.

4.2 Evaluation Metric

Our proposed approach directly optimizes the recommendation list by listwise LTR algorithm, so we employ NDCG (Normalized Discounted Cumulative Gain) [27] and ERR (Expected Reciprocal Rank) [28], which are widely used in LTR algorithms, as the criteria to measure the recommendation accuracy. The larger the NDCG and ERR are, the higher the recommendation accuracy will be.

NDCG. In recommendation system, the rating can be regarded as the relevance. Assume there are m users, and [r.sup.j.sub.u] denotes the rating of the item ranked at position j in the recommendation list of user u. k [member of] R denotes the cutoff threshold. For user u, the [NDCG.sub.u] is defined as:

[mathematical expression not reproducible] (16)

NDCG @ k denotes the average of [NDCG.sub.u] @ k for all users, defined as

[mathematical expression not reproducible] (17)

ERR. In practice, users usually consider recommended items from front to back, until a satisfactory item is found. So the probability that the user considers the item at i-th position is related to the user's satisfaction with those items before i-th position. ERR is used to evaluate the relevance of recommended items.

[mathematical expression not reproducible] (18)

Where [r.sub.max] denotes the maximal value of ratings. [r.sub.max] =100 in most recommendation systems.

4.3 Impact of Parameters

In order to study the impact of parameters, for each user, we randomly select ten items for testing, and reserve the remaining data for training.

According to Eq.(10), parameter [alpha] controls the weight of the influence of neighbors in the rating prediction. In order to further explore the influence of [alpha] on the recommendation accuracy, we vary the value of [alpha] from 0 to 1, with a step value of 0.1, and conduct experiments in the two datasets. Fig. 4 shows that no matter in which dataset, our approach always achieves the highest recommendation accuracy when [alpha] = 0.6, indicating that considering the influence of neighbors can effectively improve the recommendation accuracy.

According to Eq.(14), parameter [[lambda].sub.t] controls the importance of trust regularization. We also vary the value of [[lambda].sub.t] from 0 to 1, with a step value of 0.1, and conduct experiments in the two datasets. Fig. 5 shows that our approach always achieves the highest recommendation accuracy when [[lambda].sub.t] = 0.8, indicating that incorporating the trust relationship can effectively improve the recommendation accuracy.

Parameter d is the dimensionality of latent vector. We vary the value of d from 10 to 35, with a step value of 5, and record the runtime and recommendation accuracy when testing our approach in the two datasets. Note that the time units in FilmTrust and Epinion datasets are second(s) and minute(min) respectively.

According to Fig. 6, no matter in which dataset, when the value of d increases, the runtime of our approach increases much more rapidly than the recommendation accuracy. For example, in FilmTrust dataset, when d increases from 10 to 35, the value of NDCG@5 only increases by 7.2%, but the runtime increases by 204%. This shows that just a small value of d can ensure relatively high recommendation accuracy in short runtime. Therefore, we set d=10 in all the experiments.

4.4 Runtime Comparison

The top-k probability model is used when computing the user similarity. The larger the k is, the more accurate the user similarity will be. But according to Section 3.6, the computational complexity of user similarity is [mathematical expression not reproducible], which means the computational complexity increases with the value of k by exponential growth. In order to get the reasonable value of k, we let k=1,2,3 respectively. For each user, we also randomly select ten items for testing, and reserve the remaining data for training.

According to Table 3, no matter in which dataset, with the increase of k, the runtime increases much more rapidly than the recommendation accuracy. For example, in FilmTrust dataset, the value of NDCG@5 when k=2 is 3.6% larger than that when k=1, but the runtime increases by 311%. When k=1, our approach can already achieve relatively high recommendation accuracy in short runtime. So we set k=1 in our experiments.

In order to evaluate the efficiency of our approach, we compare our approach ListSo with the following classical recommendation methods:

1SoRec [24]: It is a social-aware recommendation method fusing the user-item rating records with the user's social network by probabilistic matrix factorization, which belongs to traditional rating-based CF algorithms.

2BPMF [29]: It is a Bayesian Probabilistic Matrix Factorization algorithm by using Markov chain Monte Carlo (MCMC) methods to estimate the parameters, which belongs to traditional rating-based CF algorithms.

3ListRank-MF [17]: It combines a listwise LTR algorithm with matrix factorization, and gets a recommendation list by minimizing the loss function representing the uncertainty between training lists and output lists. It belongs to LTR-based CF algorithms.

4ListPMF [30]: This method employs an improved probability matrix factorization model to predict the user's preference sequence, and recommends the items by maximizing the posterior probability of the predicted preference sequence for the observed preference sequence. It belongs to LTR-based CF algorithms.

The main parameters of our approach are set as follows: [alpha] = 0.7, [beta] = 0.6, d = 10, k = 1. The other compared methods are set to the optimal parameters described in their original paper. We vary the number of items to be predicted from 10 to 30, with a step value of 5, and conduct experiments in the FilmTrust and Epinion datasets.

According to Fig. 7, the runtime of ListSo is shorter than that of BPMF and ListPMF, but longer than that of SoRec and ListRank-MF. The reason is as follows: compared with SoRec, ListSo adds the similarity evaluation based on probability distribution model; compared with ListRank-MF, ListSo adds the trust evaluation. ListSo is a hybrid CF algorithm, involving the similarity evaluation, rating prediction and trust evaluation simultaneously. So ListSo consumes larger amount of computation sources while achieving higher recommendation accuracy. But according to Fig. 7 and the complexity analysis in Section 3.6, it can be seen that with the increase of the number of items to be predicted, the runtime of ListSo still keeps a linear growth trend, indicating that ListSo can be applied to large-scale datasets.

4.5 Recommendation Accuracy Comparison

In order to evaluate the proposed user similarity computation method, we record the method, which replaces the proposed user similarity computation method with Pearson correlation given in Definition 1, as ListPe. By conducting experiments in FilmTrust and Epinion, we compare the recommendation accuracy of ListSo, ListPe and the other four recommendation methods in Section 4.4, shown as Table 4. The experimental procedure is the same as that described in Section 4.4.

According to Table 4 and Table 5, we can get the following conclusions:

(1) Almost all the methods perform better in FilmTrust than in Epinion. The reason lies in the larger data sparsity of Epinion, which lowers the accuracy of recommendation models.

(2) Compared with ListPe, the recommendation accuracy of ListSo gets improved averagely by more than 4% in FilmTrust and more than 5% in Epinion. This indicates that our proposed user similarity computation method is efficient in improving the recommendation accuracy.

(3) ListSo has the largest value of NDCG@5 and ERR@5, indicating that it outperforms the other five methods in recommendation accuracy. ListSo fully utilizes the ordering information of items to find more accurate neighbors, and employs listwise LTR algorithm to directly optimize the recommendation list, thereby achieving the highest recommendation accuracy.

4.6 The Ability to Resist Malicious Ratings

SoRec, ListRank-MF, ListPMF and ListSo all belong to CF algorithms based on neighbor users. In this kind of algorithms, there may be some illegal users giving malicious rating due to the purpose of profit. Once the system regards them as neighbor users, the recommendation accuracy will be severely affected. In order to measure their ability to resist malicious ratings, we compare ListSo with SoRec, ListRank-MF and ListPMF by gradually increasing the proportion of illegal users in the system. The main parameters of our approach are set as follows: [alpha] = 0.7, [beta] = 0.6, d = 10, k = 1. The other compared methods are set to the optimal parameters described in their original paper. For each user, we also randomly select ten items for testing, and reserve the remaining data for training.

According to Fig. 8, with the increase of illegal users, the recommendation accuracy of ListRank-MF and ListPMF decreases rapidly, because they don't establish any defense mechanism. Once the neighbor user gives malicious ratings, the recommendation accuracy will be severely affected. SoRec fuses the social network while making recommendation, so it has anti-attack ability, but it fails to mine the ordering information of items. ListSo uses the probability distribution model to compute the user similarity, and combines with the social information to recommend the items, so it not only has higher recommendation accuracy, but also can resist malicious ratings.

5. Conclusions

This paper presents a recommendation method based on listwise learning-to-rank by incorporating social information. Firstly, in view of the low accuracy of Pearson correlation, both of the rating and order of items are taken into account by Plackett-Luce model, so as to find more accurate similar users. Secondly, in order to alleviate the data sparsity problem, the improved matrix factorization model by integrating the influence of similar users is proposed to predict the rating. In addition, we also employ the matrix factorization model to explore the trust relationship between users from their social information. Finally, based on the predicted rating and trust relationship, a listwise learning-to-rank algorithm is proposed to learn an optimal ranking model, which can output the recommendation list more consistent with the user preference. Comprehensive experiments conducted on two public real-world datasets show that our approach outperforms previous rating-based CF algorithms and LTR-based CF algorithms.

In the future, more detailed information about users will be taken into account, such as the user's label. We will also consider the influence of time factor, so as to enhance the applicability of our approach in dynamic environment.

References

[1] Zhao Z, Yang Q, Cai D, et al, "Expert finding for community-based question answering via ranking metric network learning," in Proc. of International Joint Conference on Artificial Intelligence, pp. 3000-3006, July 9-15, 2016. Article (CrossRef Link).

[2] Liu J, Tang M, Zheng Z, et al, "Location-aware and personalized collaborative filtering for web service recommendation," International Journal of Computer Engineering In Research Trends, vol. 3, no. 5, pp. 356-360, 2016. Article (CrossRef Link).

[3] Hu Y, Peng Q, Hu X, et al, "Time aware and data sparsity tolerant web service recommendation based on improved collaborative filtering," IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 782-794, 2015. Article (CrossRef Link).

[4] Liu N N, Yang Q, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proc. of International ACM S-G-R Conference on Research and Development in Information Retrieval ACM, pp. 83-90, July 20-24, 2008. Article (CrossRef Link).

[5] Grotov A, Rijke M D, "Online learning to rank for information retrieval: SIGIR 2016 Tutorial," in Proc. of International ACM S-G-R Conference on Research and Development in Information Retrieval, pp. 1215-1218, July 17-21, 2016. Article (CrossRef Link).

[6] Zhao Z, Zhang L, He X, et al, "Expert finding for question answering via graph regularized matrix completion," IEEE Transactions on Knowledge & Data Engineering, vol.27, no.4, pp. 993-1004, 2015. Article (CrossRef Link).

[7] Lei Y, Wang Z, Meng L, et al, "Clustering and recommendation for semantic web service in time series," KSII Transactions on Internet And Information Systems, vol.8, no.8, pp. 2743-2762, 2014. Article (CrossRef Link).

[8] Kang G, Tang M, Liu J, et al, "Diversifying web service recommendation results via exploring service usage history," IEEE Transactions on Services Computing, vol. 9, no. 4, pp. 566-579, 2016. Article (CrossRef Link).

[9] Ricci F, Rokach L, Shapira B, "Introduction to recommender systems handbook," Recommender Systems Handbooks, vol.22, no.1, pp. 1-4, 2011. Article (CrossRef Link).

[10] Wang Y B, Meng X W, Hu X, "Information aging-based collaborative filtering recommendation algorithm," Journal of Electronics & Information Technology, vol. 35, no. 10, pp. 2391-2396, 2013. Article (CrossRef Link).

[11] Jia D, Zhang F, "A collaborative filtering recommendation algorithm based on double neighbor choosing strategy," Journal of Computer Research & Development, vol.50, no.5, pp. 1076-1084, 2013. Article (CrossRef Link).

[12] Cong L, Liang C, Li M, "A collaborative filtering recommendation algorithm based on domain nearest neighbor," Journal of Computer Research & Development, vol.45, no.9, pp. 1532-1538, 2008. Article (CrossRef Link).

[13] Guo H Y, Liu G S, Su B, et al, "Collaborative filtering recommendation algorithm combining community structure and interest clusters," Journal of Computer Research and Development, vol.53, no.8, pp. 1664-1672, 2016. Article (CrossRef Link).

[14] Jamali M, Ester M, "A matrix factorization technique with trust propagation for recommendation in social networks," in Proc. of ACM Conference on Recommender Systems, pp. 135-142, September 26-30, 2010. Article (CrossRef Link).

[15] Yang X, Steck H, Liu Y, "Circle-based recommendation in online social networks," in Proc. of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1267-1275, August 12-16, 2012. Article (CrossRef Link).

[16] Pan W, Zhong H, Xu C, et al, "Adaptive bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol.73, no.1, pp. 173-180, 2015. Article (CrossRef Link).

[17] Shi Y, Larson M, Hanjalic A, "List-wise learning to rank with matrix factorization for collaborative filtering," in Proc. of ACM Conference on Recommender Systems, pp. 269-272, September 26-30, 2010. Article (CrossRef Link).

[18] Weimer M, Karatzoglou A, Le Q V, et al, "COFI RANK, maximum margin matrix factorization for collaborative ranking," in Proc. of International Conference on Neural Information Processing Systems, pp. 1593-1600, December 03-06, 2007. Article (CrossRef Link).

[19] Huang S, Wang S, Liu T Y, et al, "Listwise collaborative filtering," in Proc. of International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 343-352, August 09-13, 2015. Article (CrossRef Link).

[20] Mollica C, Tardella L, "Bayesian mixture of Plackett-Luce models for partially ranked data," Statistics, 2015. Article (CrossRef Link).

[21] Cao Z, Qin T, Liu T Y, et al, "Learning to rank: from pairwise approach to listwise approach," in Proc. of International Conference on Machine Learning, pp. 129-136, June 20-24, 2007. Article (CrossRef Link).

[22] Morita C, Tsukimoto H, "Knowledge discovery from numerical data," Knowledge-Based Systems, vol.10, no.7, pp. 413-419, 1998. Article (CrossRef Link).

[23] Zheng X, Luo Y, Xu Z, et al, "Tourism destination recommender system for the cold start problem," KSII Transactions on Internet And Information Systems, vol.10, no.7, pp. 3192-3212, 2016. Article (CrossRef Link).

[24] Ma H, Yang H, Lyu M R, et al, "SoRec: social recommendation using probabilistic matrix factorization," in Proc. of ACM Conference on Information & Knowledge Management, pp. 931-940, October 26-30, 2008. Article (CrossRef Link).

[25] Barinova O, Lempitsky V, Kholi P, "On detection of multiple object instances using hough transforms," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 9, pp. 1773-1784, 2012. Article (CrossRef Link).

[26] Oren M, Papageorgiou C, Sinha P, et al, "Pedestrian detection using wavelet templates," in Proc. of 1997 Conference on Computer Vision and Pattern Recognition, pp. 193, June 17-19, 1997. Article (CrossRef Link).

[27] Kalervo Jarvelin and et al, "IR evaluation methods for retrieving highly relevant documents," in Proc. of International ACM S-G-R Conference on Research and Development in Information Retrieval, pp. 41-48, July 24-28, 2000. Article (CrossRef Link).

[28] Chapelle O, Metlzer D, Zhang Y, et al, "Expected reciprocal rank for graded relevance," in Proc. of ACM Conference on Information and Knowledge Management, pp. 621-630, November 02-06, 2009. Article (CrossRef Link).

[29] Salakhutdinov R, Mnih A, "Bayesian probabilistic matrix factorization using markov chain monte carlo," in Proc. of International Conference on Machine Learning, pp. 880-887, July 05-09, 2008. Article (CrossRef Link).

[30] Liu J, Wu C, Xiong Y, et al, "List-wise probabilistic matrix factorization for recommendation," Information Sciences, vol. 278, pp. 434-447, 2014. Article (CrossRef Link).

Chen Fang received the B.S. degree in electronic science and technology from Zhengzhou information science and technology institute, Henan, China, in 2015. He is pursuing the M.S. degree in information security at Zhengzhou information science and technology institute. His research interests include cloud service management and data mining.

Hengwei Zhang received the B.S., M.S. and Ph.D degrees in information security from Zhengzhou information science and technology institute, Henan, China, in 2000, 2006, and 2015, respectively. He is now an associate professor at the Zhengzhou information science and technology institute. His research interests include information security and cloud computing.

Ming Zhang received the B.S. degree in electronic science and technology from Zhengzhou information science and technology institute, Henan, China, in 2016. He is pursuing the M.S. degree in information security at Zhengzhou information science and technology institute. His research interests include cloud resource scheduling and trust evaluation.

Jindong Wang received the B.S., M.S. degrees in information security from Zhengzhou information science and technology institute, Henan, China, in 1983, 1990 respectively. He is now a professor at the Zhengzhou information science and technology institute. His research interests include cloud computing, cyberspace security, game theory. He has published 40 papers and 1 book in these areas.

Chen Fang, Hengwei Zhang, Ming Zhang and Jindong Wang

Zhengzhou Information Science and Technology Institute, P.R. China

[e-mail: 17756230629@163.com]

(*) Corresponding author: Hengwei Zhang

Received April 13, 2017; revised July 10, 2017; revised August 2, 2017; accepted September 17, 2017; published January 31, 2018

http://doi.org/10.3837/tiis.2018.01.006

Table 1. Maior symbols and definitions Symbols Definitions (R) = [([r.sub.ij]).sub.mxn] user-item matrix, where m is the number of users and n is the number of items. [r.sub.ij] the rating of user [u.sub.i] for item j T = [{[t.sub.ik]}.sub.mxm] user-trust matrix, where [t.sub.ik] [member of] {0,1} indicates whether user [u.sub.i] trusts user [u.sub.k] or not U d x m dimensional user latent matrix [U.sub.i]=[[[u.sub.i1], the i-th column vector of U, [u.sub.i2],...,[u.sub.id]].sup.T] representing the latent vector of user [u.sub.i] V d x n dimensional item latent matrix the j-th column vector of V, representing the latent vector of item j Z d x m dimensional trust latent matrix [Z.sub.k] = [[[Z.sub.k1], the k-th column vector of Z, [Z.sub.k2],...,[Z.sub.kd]].sup.T] representing the trust latent vector of user [u.sub.k] dimensionality of latent vectors Table 2. Data Statistics Dataset Users Items Ratings Density Trust FilmTrust 1,508 2,071 35,497 1.1400% 1,853 Epinion 40,163 139,738 664,824 0.0118% 487,183 Table 3. The influence of k for the recommendation performance Dataset FilmTrust Epinion NDCG@5 ERR@5 runtime NDCG@5 ERR@5 runtime k=1 0.732 0.804 12.5s 0.708 0.773 5.6min k=2 0.758 0.837 51.3s 0.736 0.805 17.2min k=3 0.772 0.851 123s 0.758 0.832 33.3min Table 4. Experimental results on NDCG@5 items to be Dataset SoRec BPMF ListRank-MF ListPe ListPMF predicted [n.sub.3]=10 FilmTrust 0.694 0.699 0.701 0.703 0.711 Epinion 0.658 0.665 0.669 0.672 0.679 [n.sub.3]=15 FilmTrust 0.698 0.704 0.705 0.708 0.713 Epinion 0.659 0.663 0.671 0.673 0.674 [n.sub.3]=20 FilmTrust 0.689 0.695 0.693 0.699 0.703 Epinion 0.647 0.653 0.652 0.656 0.657 [n.sub.3]=25 FilmTrust 0.714 0.718 0.722 0.724 0.729 Epinion 0.664 0.668 0.677 0.686 0.681 [n.sub.3]=30 FilmTrust 0.704 0.715 0.714 0.719 0.720 Epinion 0.667 0.673 0.678 0.679 0.683 items to be ListSo predicted [n.sub.3]=10 0.732 0.708 [n.sub.3]=15 0.741 0.713 [n.sub.3]=20 0.726 0.694 [n.sub.3]=25 0.763 0.727 [n.sub.3]=30 0.751 0.718 Table 5: Experimental results on ERR@5 items to be Dataset SoRec BPMF ListRank-MF ListPe ListPMF predicted [n.sub.3]=10 FilmTrust 0.752 0.761 0.763 0.771 0.775 Epinion 0.720 0.729 0.731 0.735 0.743 [n.sub.3]=15 FilmTrust 0.757 0.762 0.765 0.773 0.782 Epinion 0.722 0.728 0.732 0.735 0.747 [n.sub.3]=20 FilmTrust 0.748 0.754 0.759 0.765 0.773 Epinion 0.709 0.715 0.722 0.723 0.728 [n.sub.3]=25 FilmTrust 0.767 0.777 0.781 0.784 0.797 Epinion 0.726 0.734 0.737 0.741 0.753 [n.sub.3]=30 FilmTrust 0.765 0.772 0.780 0.783 0.794 Epinion 0.736 0.741 0.744 0.751 0.761 items to be ListSo predicted [n.sub.3]=10 0.804 0.773 [n.sub.3]=15 0.812 0.779 [n.sub.3]=20 0.798 0.761 [n.sub.3]=25 0.826 0.785 [n.sub.3]=30 0.822 0.794

Printer friendly Cite/link Email Feedback | |

Author: | Fang, Chen; Zhang, Hengwei; and, Ming Zhang; Wang, Jindong |
---|---|

Publication: | KSII Transactions on Internet and Information Systems |

Article Type: | Report |

Date: | Jan 1, 2018 |

Words: | 8795 |

Previous Article: | Exact and Approximate Symbol Error Probability of cooperative systems with best relay selection and all participating relaying using Amplify and... |

Next Article: | QoS-aware Cross Layer Handover Scheme for High-Speed vehicles. |

Topics: |