Latent Unexpected Recommendations
PAN LI,
New York University, Stern School of Business
ALEXANDER TUZHILIN,
New York University, Stern School of Business

Unexpected recommender systems constitute an important tool for tackling the problems of filter bubbles and user boredom: they aim to provide recommendations that are unexpected and satisfying to target users at the same time. Previous unexpected recommendation methods focus only on the straightforward relations between current recommendations and user expectations by modeling unexpectedness in the feature space, and therefore sacrifice accuracy measures in order to improve unexpectedness performance. In contrast to these prior models, we propose to model unexpectedness in the latent space of user and item embeddings, which allows us to capture hidden and complex relations between new recommendations and historic purchases. In addition, we develop a novel Latent Closure (LC) method to construct a hybrid utility function and provide unexpected recommendations based on the proposed model. Extensive experiments on three real-world datasets illustrate the superiority of our proposed approach over state-of-the-art unexpected recommendation models, which leads to a significant increase in the unexpectedness measure without sacrificing any accuracy metric under all experimental settings in this paper.

CCS Concepts: • Information systems → Recommender systems.

Additional Key Words and Phrases:
Unexpected Recommendation, Beyond-Accuracy Objectives, Latent Closure, Latent Embeddings, Latent Space
ACM Reference Format:
Pan Li and Alexander Tuzhilin. 2020. Latent Unexpected Recommendations.
ACM Trans. Intell. Syst. Technol. 1, 1, Article 1 (January 2020).

Authors' addresses: Pan Li, New York University, Stern School of Business, [email protected]; Alexander Tuzhilin, New York University, Stern School of Business, [email protected].
Recommender systems have been playing an important role in information dissemination and online commerce, assisting users in filtering the best content while shaping their consumption behavior patterns at the same time. However, classical recommender systems face the problem of filter bubbles [43, 46]: target users only receive recommendations of their most familiar items and lose reach to many other available items. They also lead to the problem of user boredom [28, 29], which significantly deteriorates user satisfaction with recommender systems. For example, even a Harry Potter fan may feel unsatisfied if the system keeps recommending the Harry Potter series all the time.

To address these two problems, researchers have introduced recommendation objectives beyond accuracy, including unexpectedness, serendipity, novelty and diversity [53], the goal of which is to provide novel, surprising and satisfying recommendations. Among them, unexpectedness is of particular interest for its close relation to user satisfaction and its ability to improve recommendation performance [1, 2]. Therefore, we focus on modeling unexpectedness and providing unexpected recommendations in this paper.
In prior literature, researchers have proposed to define unexpectedness in multiple ways, including deviations from primitive prediction results [14, 42], unexpected combinations of feature patterns [5], and feature distance from previous consumptions [2]. They subsequently provide unexpected recommendations based on these definitions and achieve significant improvements in terms of certain unexpectedness measures.

However, as shown in the prior literature [76, 78], these improvements in unexpectedness come at the cost of sacrificing accuracy measures, which severely limits the practical use of unexpected recommendations, since the major goal of a recommender system is to enhance overall user satisfaction. This is the case for the following reasons. First, previous models focus only on the straightforward relations between current recommendations and user expectations by modeling unexpectedness in the feature space, without taking into account the deep, complex and heterogeneous relations between users and items. Second, prior modeling of unexpectedness relies completely on explicit user and item information and may not work well when the consumption records are sparse, noisy or even missing. Finally, the distance metric between discrete items, which is crucial for defining unexpectedness, is hard to formulate in the discrete feature space, and this may lead to unintentional biases in the estimation of user preferences. Therefore, prior unexpected recommendation models can be further improved, which constitutes the main topic of this paper.

To address these concerns, we propose to define unexpectedness in the latent space containing latent embeddings of users and items, as opposed to the feature space that only holds explicit information about them. Specifically, we propose a novel Latent Closure (LC) method to model unexpectedness that:

• captures latent, complex and heterogeneous relations between users and items to effectively model the concept of unexpectedness;
• provides unexpected recommendations without sacrificing any accuracy performance;
• efficiently computes unexpectedness for large-scale recommendation services.

The proposed unexpected recommendation model follows a three-stage procedure. First, we map the features of users and items into the latent space and represent users and items as latent embeddings there.
These embeddings are obtained using several state-of-the-art mapping approaches, including Heterogeneous Information Network Embeddings (HINE) [12, 54, 62],
AutoEncoder (AE) [21, 52] and
MultiModal Embeddings (ME) [45]. We subsequently utilize the concept of "closure" from differential geometry and define the unexpectedness of a new item as the distance between the embedding of that item and the closure of all previously consumed item embeddings. Finally, we combine this unexpectedness measure with the estimated rating of the item to construct a hybrid utility function for providing unexpected recommendations.

In this paper, we make the following contributions:

(1) We propose latent modeling of unexpectedness. Although many papers have recently explored latent spaces for recommendation purposes, it is not clear how to do it for unexpected recommendations, which constitutes the topic of this work.

(2) We construct a hybrid utility function based on the proposed unexpectedness measure and provide unexpected recommendations accordingly. We also demonstrate that this approach significantly outperforms all other unexpected recommendation baselines considered in this paper.

(3) We conduct extensive experiments in multiple settings and show that it is indeed the latent modeling of unexpectedness that leads to a significant increase in unexpectedness measures without
sacrificing any accuracy performance. Thus, the proposed method helps users break out of their filter bubbles without sacrificing recommendation performance.

The rest of the paper is organized as follows. We discuss related work in Section 2 and present the proposed latent modeling of unexpectedness in Section 3. The unexpected recommendation model is introduced in Section 4. The experimental design on three real-world datasets is described in Section 5, and the results and discussion are presented in Section 6. Finally, Section 7 summarizes our contributions and concludes the paper.
In this section, we provide an overview of the related work covering three fields: beyond-accuracy metrics, unexpected recommendations, and latent embeddings for recommendations. We highlight the importance of combining unexpected recommendations with latent modeling approaches to achieve strong recommendation performance.
As researchers have pointed out, accuracy is not the only important objective of recommendations [41]; other beyond-accuracy metrics should also be taken into account, including unexpectedness, serendipity, novelty, diversity, coverage and so on [14, 27]. Note that these metrics are closely related to each other but still differ in definition and formulation. Therefore, prior literature has proposed multiple recommendation models to optimize each of these metrics separately.

Serendipity measures the positive emotional response of the user to a previously unknown item and indicates how surprising these recommendations are to the target users [9, 53]. Representative methods to improve serendipity performance include Serendipitous Personalized Ranking (SPR) [39], which extends traditional personalized ranking methods by considering serendipity information in the AUC optimization process, and Auralist [74], which utilizes a topic modeling approach to capture serendipity information and provide serendipitous recommendations accordingly.

Novelty measures the percentage of new recommendations that the users have not seen before or known about [41]. It is computed as the percentage of unknown items in the recommendations. Researchers have proposed multiple methods to improve the novelty measure of recommendations, including clustering of long-tail items [47], innovation diffusion [25], graph-based algorithms [56] and ranking models [44, 67].

Diversity measures the variety of items in a recommendation list, which is commonly modeled as the aggregate pairwise similarity of recommended items [77]. Typical models to improve the diversity of recommendations include the Determinantal Point Process (DPP) [10, 13], which proposes a novel algorithm to greatly accelerate greedy MAP inference and provide diversified recommendations accordingly; greedy re-ranking methods [8, 32, 59, 64, 77], which provide diversified recommendations based on the combination of an item's relevance and its average distance to items already in the recommended list; and latent factor models that optimize diversity measures [22, 57, 61].

Coverage measures the degree to which recommendations cover the set of available items [3, 14, 19]. To improve the coverage measure, researchers propose coverage optimization [3, 4] and popularity reduction methods [65] to balance between relevance and coverage objectives [70].

Of all the beyond-accuracy metrics, in this paper we focus only on the unexpectedness measure and aim at providing unexpected recommendations, for its close relation to user satisfaction and its ability to improve recommendation performance [1, 2]. Moreover, the proposed unexpected recommendation algorithm is capable of improving serendipity and diversity measures as well, as shown in our experimental results.
Different from other beyond-accuracy metrics, unexpectedness measures those recommendations that are not included in user expectations and depart from what users would expect from the recommender system. Researchers have shown the importance of incorporating unexpectedness into recommendations, which can overcome the overspecialization problem [2, 24], broaden user preferences [19, 74, 75] and increase user satisfaction [2, 39, 74]. Unexpectedness captures the deviation of a particular recommender system from the results obtained from other primitive prediction models [5, 14, 42], as well as the deviation from user expectations [2, 35]. To improve the unexpectedness measure of the final recommendations, existing models can be classified into three categories: rule-based approaches, model-based approaches and utility-based approaches, as we show in Table 1.

Rule-based approaches typically involve the pre-definition of a set of rules or recommendation strategies for unexpected recommendations, including partial similarity [26], k-furthest-neighbor [50] and graph-based approaches [34, 63]. Rule-based approaches are generally simple to implement and easy to put into practice, as most of them incorporate unexpectedness into classical models instead of starting from scratch. Besides, rule-based approaches allow for more control over the model, as the rules and recommendation strategies are often explicitly specified by the designers, which also improves the explainability and interpretability of the resulting unexpected recommendation model. However, they require pre-defined strategies to be set prior to recommendation, scalability is a serious concern, and these models typically lack generalizability because they focus only on specific domains and applications.

Model-based approaches aim to improve the novelty and unexpectedness of the recommended items by proposing new models and data structures that go beyond the traditional collaborative filtering paradigm. Representative models that optimize the unexpectedness objective include personalized ranking [67], innovator identification [30, 31] and transition cost graphs [56]. Model-based approaches are backed by mathematical foundations that guarantee either convergence or stability of the learning process, thus making them robust to different settings and giving them greater potential for generalizability. However, they are often hard to interpret, for there is no natural way to translate mathematical formulations into explicit rules or recommendation strategies. It is therefore relatively hard to control the degree of unexpectedness that we aim to incorporate into the recommendation model. Finally, model-based approaches might not take full advantage of all available information due to the restrictions of a specific model form.

Utility-based approaches involve the construction of a hybrid utility function as a combination of estimated relevance and degree of unexpectedness. Researchers in [20, 23, 69] have followed this direction of research. Specifically, [2] proposed to include user expectations in the hybrid utility function and achieved state-of-the-art unexpected recommendation performance. Utility-based methods allow for more control over the recommendation strategy and are easier to implement and put into practice as well. In particular, the construction of unexpectedness does not depend on the estimation of user preferences towards the candidate item, which makes it model-agnostic. On the other hand, the unexpectedness hyperparameter plays an important role in determining the recommendation performance of the hybrid model, thus requiring proper hyperparameter optimization.

One important limitation of all prior unexpected recommendation models is that they focus only on the straightforward relations between users and items and define unexpectedness in the feature space, without taking into account the deep, complex interactions underlying their feature information. Therefore, previous unexpected recommendations might not reach the optimal recommendation performance, as discussed in [54, 71, 72]. In addition, they face the trade-off dilemma between optimizing the accuracy and unexpectedness objectives. To address these limitations, in this paper we propose to define unexpectedness in the latent space instead, thus obtaining significant improvements over previous models.
Table 1. Classification of Unexpected Recommendation Research

Rule-Based Approaches (Literature: [50], [11], [26], [34], [63])
  Methods: K-Furthest Neighbor; Frequency Discount; Taxonomy-Based Similarity; Partial Similarity; Social Network; Graph Theory
  Strengths: Easy to implement; Allow for model control; Improves interpretability
  Weaknesses: Require pre-defined rules; Lack of scalability; Lack of generalizability

Model-Based Approaches (Literature: [31], [30], [39], [56], [67])
  Methods: Matrix Factorization; Learning to Rank; Re-Ranking; Clustering; Graph Theory
  Strengths: Robust and generalizable; Mathematical foundation; Efficient optimization
  Weaknesses: Lack of interpretability; Restricted model control; Limited model input

Utility-Based Approaches (Literature: [69], [23], [20], [74], [2], [35])
  Methods: Weighted Sum Model; Weighted Product Model; Probabilistic Model; Neural Network Model
  Strengths: Balance between objectives; Allow for model control; Model-agnostic
  Weaknesses: Require hyperparameter optimization; Explicit information only
Another body of related work involves embedding approaches that effectively map users and items into the latent space and extract the deep, complex and heterogeneous relations between them. Different embedding methods fit different recommendation applications. When heterogeneous feature data is available, the Heterogeneous Information Network Embedding (HINE) approach [12, 54, 55] utilizes the heterogeneous information network (HIN) data structure to extract complex heterogeneous relations between user and item features and thus provide better recommendations to the target users. When rich interactions between users and items are available, the AutoEncoder (AE) approach [17, 21, 36, 37, 49, 52] utilizes deep neural network (DNN) techniques to obtain semantic-aware representations of users and items as embeddings in the latent space, model their relationship, and provide recommendations accordingly. Finally, when a multimodal dataset is available, researchers propose the Multimodal Embedding (ME) approach [45] to combine information from different sources and obtain strong recommendation performance.

Compared with classical approaches, latent embedding methods have several important advantages that enable recommender systems to provide more satisfying recommendations [38, 73], as discussed in Section 1. Therefore, in this paper we define unexpectedness using these latent embedding methods, which contributes to the strong recommendation performance.
In this section, we introduce the proposed latent modeling of unexpectedness. We compare the new definition with feature-based definitions and illustrate the superiority and benefits of the proposed approach.
As introduced in prior literature [2, 14, 42], an important component for modeling unexpectedness is the expected set, which contains the previous consumptions of the user. The idea is that users should feel no unexpectedness towards recommended items that they have purchased before or that are very similar to their purchases, for they understand that typical recommender systems collect their historic behaviors and provide similar recommendations based on these records.

To construct user expectations, [2] proposes to form the expected set by taking into account explicit feature information of users and items. For example, in book recommendation, the expected set is constructed based on features such as alternative editions, books in the same series, with the same subjects and classifications, with the same tags, and so on. Unexpectedness is subsequently defined by a positive, unbounded function of the distance of the recommended item from the set of expected items.

However, this definition focuses only on the straightforward relations between users and items and falls short of addressing deeper correlations beyond the explicit feature information. For example, if a certain user has been a frequent customer of McDonald's and Carl's Jr., then a recommendation of Burger King might not be unexpected to that user, although these restaurants belong to different franchises and offer different menus, as shown in their feature information.

Besides, feature-based modeling of unexpectedness typically assumes the same importance for each feature during the calculation of unexpectedness, while in reality this is not necessarily the case. For example, in music recommendation, genre information plays a more important role in determining the degree of unexpectedness than profile information such as time of release. In addition, the distance function is hard to define in the discrete feature space.

Therefore, in this paper we propose to construct the expected set in the latent space by taking the closure of item embeddings. Unexpectedness is subsequently defined as the distance between the new item embedding and the closure of the expected set in the latent space. Compared with feature-based definitions, latent modeling of unexpectedness has several important advantages, as discussed in Section 1. In particular, the proposed Latent Closure (LC) model is capable of utilizing richer information, such as user reviews and multimodal data, to determine the degree of unexpectedness, whereas previous models typically do not take this information into account, as shown in Table 2. These benefits are also supported by strong experimental results.
                      Latent Modeling    Feature Modeling
Algorithms            LC                 SPR    Auralist    HOM-LIN    DPP
Latent Embeddings     ✓                  ✗      ✗           ✗          ✗
Explicit Features     ✓                  ✗      ✗           ✓          ✗
User Reviews          ✓                  ✗      ✓           ✗          ✗
Pre-Defined Rules     ✓                  ✗      ✓           ✓          ✗
Past Transactions     ✓                  ✓      ✓           ✓          ✓
User Ratings          ✓                  ✓      ✓           ✓          ✓
Table 2. Comparison of Unexpected Recommendation Methods
In the next section, we will introduce the idea of latent closure and how to construct user expectations based on the proposed latent closure method.
As discussed in the previous section, we propose to compute user expectations in the latent space rather than in the original feature space. In addition, we point out that the modeling of user expectations should go beyond the direct aggregation of previous consumptions, and should also
take into account items that are similar to the consumed items, where similarities between items are captured by the Euclidean distance in the latent space. Therefore, it is natural to take the "closure" of all consumed item embeddings to model the expected set, as opposed to using individual item embeddings in the latent space.

According to differential geometry [18], there are three common geometric structures in high-dimensional latent spaces that can be naturally extended to model the closure of latent embeddings, namely the Hypersphere, the Hypercube and the Convex Hull. The particular choice of latent closure depends on the assumption we make about the relations between users and items in the latent space.

• Latent HyperSphere (LHS)
The hypersphere in $\mathbb{R}^n$ is defined as the set of points $(x_1, x_2, \cdots, x_n)$ such that $x_1^2 + x_2^2 + \cdots + x_n^2 = r^2$, where $r$ is the radius of the hypersphere. Under this definition, we assume that the expected set of items for each user grows homogeneously in all directions of the latent space.

• Latent HyperCube (LHC)
The hypercube is a closed, compact, convex figure, whose 1-skeleton consists of groups of opposite parallel line segments aligned in each of the space's dimensions, perpendicular to each other and of the same length. Under this definition, we assume that the expected set of each user grows homogeneously in the $n$ perpendicular directions.

• Latent Convex Hull (LCH)
The convex hull of a set of points $X$ in Euclidean space is the smallest convex set that contains all points in $X$. Under this definition, we assume that the expected set maintains its convexity as it grows. In addition, if we construct the expected set as the convex hull of consumed item embeddings, the convexity property guarantees the feasibility of the recommendation task as an optimization problem by Slater's condition [58].

We visualize the definition of unexpectedness based on these geometric structures in Figures 1a, 1b and 1c. These latent closure approaches capture latent semantic interactions between users and items and construct the expected set for each user accordingly. Compared to feature-based definitions [2], latent closures utilize richer information, including user and item features, to model user expectations more precisely. The process of finding closures in high-dimensional latent spaces is not significantly different from the process in two-dimensional space: for LHS and LHC, we only need to find the two furthest points in the latent space to identify the centroid of the latent closure, while for LCH we follow the QuickHull algorithm [7] to identify the latent structure. Experimental results show that all three geometric structures consistently obtain significant improvements over the baseline models, while no structure dominates the other two.

To sum up, in this paper we utilize the latent closure method to model unexpectedness in the latent space. We hereby propose the following definition:

Definition 3.1. The unexpectedness of a new item is the distance between the embedding of that item and the closure of all previously consumed item embeddings.

In the next section, we will discuss the specific techniques for obtaining latent embeddings and the methods for providing unexpected recommendations accordingly.
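To make these definitions concrete, the following minimal NumPy sketch (our illustration under simplifying assumptions, not the authors' released code) computes the distance from a candidate item embedding to a latent hypersphere, and to an axis-aligned bounding box standing in for the hypercube; the convex-hull case can be handled analogously with a QuickHull implementation such as scipy.spatial.ConvexHull.

```python
import numpy as np

def hypersphere_unexpectedness(candidate, consumed):
    """Distance from a candidate embedding to the latent hypersphere (LHS)
    spanned by the consumed-item embeddings; 0 if the candidate lies inside."""
    consumed = np.asarray(consumed)
    # As described above, the centroid is identified from the two furthest points.
    pairwise = np.linalg.norm(consumed[:, None, :] - consumed[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(pairwise), pairwise.shape)
    center = (consumed[i] + consumed[j]) / 2.0
    radius = pairwise[i, j] / 2.0
    return float(max(np.linalg.norm(candidate - center) - radius, 0.0))

def hypercube_unexpectedness(candidate, consumed):
    """Distance to an axis-aligned bounding box, used here as a simple stand-in
    for the latent hypercube (LHC); 0 if the candidate lies inside."""
    consumed = np.asarray(consumed)
    low, high = consumed.min(axis=0), consumed.max(axis=0)
    # Per-dimension overshoot outside the box, then the Euclidean norm.
    overshoot = np.maximum(low - candidate, 0.0) + np.maximum(candidate - high, 0.0)
    return float(np.linalg.norm(overshoot))
```

Both functions return zero for items falling inside the closure, so only items outside the user's expected set receive a positive unexpectedness value.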
To effectively model unexpectedness in the latent space and demonstrate the robustness of the proposed model, we utilize three state-of-the-art latent embedding approaches, namely HINE, AE and ME, to map users and items into the latent space and subsequently calculate unexpectedness.
(a) Latent Convex Hull (b) Latent Hypersphere (c) Latent Hypercube
Fig. 1. Visualization of Latent Closure and the Unexpectedness. Blue points stand for all the available items; orange points represent the consumed items; the green point refers to the newly recommended item. We define unexpectedness as the distance between the new item and the latent closure generated by all consumed items.
To capture the complex and multi-dimensional relations in the data records, the Heterogeneous Information Network (HIN) [62] has become an effective data structure for recommendations, modeling multiple types of objects and multiple types of links in one single network. It includes users, items, transactions, ratings, entities extracted from reviews, and feature information. We link the associated entities with the corresponding users and items in the network and utilize the meta-path embedding approach [12] to obtain node embeddings.

We denote the heterogeneous network as $G = (V, E, T)$, in which each node $v$ and each link $e$ are assigned specific types $T_v$ and $T_e$. To effectively learn node representations, we use the skip-gram mechanism to maximize the probability of each context node $c_t$ within the neighborhood of $v$, denoted as $N_t(v)$, where the subscript $t$ ($t \in T_V$) limits the node to a specific type:

$$\arg\max_{\theta} \sum_{v \in V} \sum_{t \in T_V} \sum_{c_t \in N_t(v)} \log P(c_t \mid v; \theta) \qquad (1)$$

Thus, it is important to calculate $P(c_t \mid v; \theta)$, the conditional probability of context node $c_t$ given node $v$. We follow [16] and revise the network embedding model accordingly to deal with the heterogeneous information network. Specifically, we use heterogeneous random walks to generate paths over multiple types of nodes in the network. Given a heterogeneous information network $G = (V, E, T)$, a meta-path of the network is generated in the form $V_1 \xrightarrow{R_1} V_2 \xrightarrow{R_2} V_3 \cdots V_n$, wherein $R = R_1 \circ R_2 \circ \cdots \circ R_{n-1}$ defines the composite relation between the start and the end of the heterogeneous random walk. The transition probability between two nodes within each random walk is defined as follows:

$$p(V_{t+1} \mid V_t) = \begin{cases} \dfrac{C(T_{V_t}, T_{V_{t+1}})}{|N_{t+1}(V_t)|}, & (V_t, V_{t+1}) \in E \\ 0, & (V_t, V_{t+1}) \notin E \end{cases} \qquad (2)$$

where $C(T_{V_t}, T_{V_{t+1}})$ stands for the transition coefficient between the type of node $V_t$ and the type of node $V_{t+1}$. We have six different transition coefficients that correspond to the six different relations in the network: $C_{UU}$, $C_{UE}$, $C_{UI}$, $C_{EI}$, $C_{EE}$ and $C_{II}$ (U: User, I: Item, E: Entity/Feature). $|N_{t+1}(V_t)|$ stands for the number of nodes of type $T_{V_{t+1}}$ in the neighborhood of $V_t$. We apply the heterogeneous random walk
iteratively to each node and generate the collection of meta-path sequences. The user and item embeddings are then obtained through the aforementioned skip-gram mechanism.

Fig. 2. Heterogeneous Information Network Embedding Method
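As an illustration of the walk-generation step, the sketch below (a simplified rendering with hypothetical node-type labels 'U', 'I' and 'E', and with the transition coefficients supplied as a plain dictionary) draws one type-aware random walk over an adjacency-list view of the HIN; the resulting node sequences are what the skip-gram objective of Equation (1) is trained on.

```python
import random

def heterogeneous_random_walk(adj, node_types, start, length, coeff):
    """Draw one type-aware random walk of `length` nodes starting at `start`.

    adj        : dict node -> list of neighboring nodes
    node_types : dict node -> type label, e.g. 'U', 'I' or 'E'
    coeff      : dict (type, type) -> transition coefficient C
    """
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = adj.get(current, [])
        if not neighbors:
            break
        # Weight each neighbor by the transition coefficient between node types,
        # mirroring Equation (2) up to normalization.
        weights = [coeff.get((node_types[current], node_types[n]), 1.0)
                   for n in neighbors]
        walk.append(random.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

Repeating such walks from every node yields the meta-path corpus on which the skip-gram objective is optimized.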
Apart from modeling interactions between users and items through a HIN, the AutoEncoder (AE) approach constitutes an important tool for learning latent representations of user and item features, transforming discrete feature vectors into continuous feature embeddings. We denote the feature information for user $a$ as $u_a = \{u_{a1}, u_{a2}, \cdots, u_{am}\}$ and the feature information for item $b$ as $i_b = \{i_{b1}, i_{b2}, \cdots, i_{bn}\}$, where $m$ and $n$ stand for the dimensionality of the user and item feature vectors, respectively. The goal is to train two separate neural networks: an encoder that maps feature vectors into latent embeddings, and a decoder that reconstructs feature vectors from latent embeddings. For effectiveness and efficiency of the training process, we formulate both the encoder and the decoder as multi-layer perceptrons (MLP). The MLP learns the hidden representations using the following equations:

$$y_a = \Phi(u_a), \quad y_b = \Phi(i_b) \qquad (3)$$

where $y_a$ and $y_b$ represent the latent embeddings and $\Phi$ stands for a fully connected layer with activation functions. We apply another fully connected layer for reconstruction and optimization. Note that in this step we train the global autoencoder for all users and items in the entire dataset simultaneously to obtain the hidden representations.

In addition to the aforementioned approaches, when dealing with datasets that include multiple modalities, such as movie and video data (which are usually associated with images and subtitles), multimodal embeddings [45, 68] constitute an efficient tool to combine information from different sources. Specifically, for the video recommendation task, we illustrate the model for obtaining video embeddings in Figure 4. First, we initialize the embeddings for the text, audio and image data through a Fully Convolutional Network (FCN) with an L2-norm regularization term. For the text data, we use average pooling to obtain the semantic information as the average of word embeddings. We then concatenate these embeddings and apply another layer of the Fully Convolutional Network to obtain multimodal embeddings of the input video that capture the joint information of subtitles, sound and graphics.

Fig. 3. AutoEncoder Embedding Method

Fig. 4. Multimodal Embedding Method
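For concreteness, the sketch below shows one way to realize the single-layer MLP encoder and decoder described above; the paper does not specify a deep-learning framework, so PyTorch and the training details here are only illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureAutoEncoder(nn.Module):
    """Single-layer MLP encoder and decoder for user or item feature vectors."""

    def __init__(self, feature_dim, embedding_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feature_dim, embedding_dim), nn.ReLU())
        self.decoder = nn.Linear(embedding_dim, feature_dim)

    def forward(self, x):
        z = self.encoder(x)           # latent embedding y = Phi(x), Equation (3)
        return self.decoder(z), z     # reconstruction and embedding

def train_autoencoder(features, epochs=50, lr=1e-3):
    """Jointly optimize encoder and decoder with a reconstruction (MSE) loss."""
    model = FeatureAutoEncoder(features.shape[1])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        reconstruction, _ = model(features)
        loss = loss_fn(reconstruction, features)
        loss.backward()
        optimizer.step()
    return model
```

After training, the encoder output z serves as the latent embedding used to build the expected set in the next section.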
Based on the latent embedding approaches introduced in the previous section, we map the users and items into the continuous latent space and model the expected set for each user as the latent
closure of item embeddings. Specifically, we feed the user and item features as input into the latent embedding models and obtain their latent representations. We subsequently formulate unexpectedness as the distance between the embedding of the new item and the latent expected set:

$$Unexp_{u,i} = d(i;\, LC(N_i)) \qquad (4)$$

where $N_i = (i_1, i_2, \cdots, i_n)$ contains the embeddings of all consumed items. This unexpectedness metric is well defined as the minimal distance from the new item to the boundary of the closure in the latent space. We then perform unexpected recommendation based on the hybrid utility function:
$$Utility_{u,i} = EstRating_{u,i} + \alpha \cdot Unexp_{u,i} \qquad (5)$$

which is a linear combination of the estimated rating and the unexpectedness. The key idea is that, instead of recommending similar items that the users are already very familiar with, as classical recommenders do, we recommend unexpected and useful items that the users might not have thought about but that nevertheless fit their tastes well. The two adversarial forces of accuracy and unexpectedness work together to produce the optimal recommendations and thus obtain the best recommendation performance and user satisfaction. We present the entire framework in Algorithm 1.

Algorithm 1: Latent Unexpected Recommendation
  Data: Users; Items; Historic Actions; Other feature information
  Result: List of Recommended Items
  Map users and items into the latent space;
  for each user u in Users do
      for each item i in Items do
          Unexp(u, i) = d(i; LC(N_i));
          Utility(u, i) = EstRating(u, i) + α · Unexp(u, i);
      end
      Recommend Top-N(Utility);
  end
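A minimal Python rendering of Algorithm 1 follows; `estimate_rating` stands for whichever collaborative filtering model is plugged in (NCF, SVD, etc.) and `closure_distance` for one of the latent closure distances of Section 3.2, so both names are placeholders rather than the authors' actual implementation.

```python
import heapq

def latent_unexpected_recommend(users, items, consumed_embeddings, item_embedding,
                                estimate_rating, closure_distance,
                                alpha=0.03, top_n=5):
    """Recommend Top-N items per user by the hybrid utility of Equation (5)."""
    recommendations = {}
    for u in users:
        expected_set = consumed_embeddings[u]   # embeddings of items u has consumed
        scored = []
        for i in items:
            unexp = closure_distance(item_embedding[i], expected_set)
            utility = estimate_rating(u, i) + alpha * unexp
            scored.append((utility, i))
        # Keep the N items with the highest hybrid utility.
        top = heapq.nlargest(top_n, scored, key=lambda pair: pair[0])
        recommendations[u] = [i for _, i in top]
    return recommendations
```

In practice the candidate items would typically be restricted to those the user has not yet consumed, and `closure_distance` can be any of the LHS, LHC or LCH variants.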
To validate the performance of our approach, we conduct extensive experiments on three large-scale real-world applications and compare the results of our model with state-of-the-art baselines. The experimental setup is introduced in this section. Specifically, we design the experiments to address the following research questions:

RQ1: How does the proposed model perform compared to baseline unexpected recommendation models?
RQ2: Can we achieve significant improvements in the unexpectedness measure while keeping the same level of accuracy performance?
RQ3: Are the improvements robust to different experimental settings?
We evaluate our model on three real-world datasets: the Yelp Challenge Dataset (Round 12), which contains ratings and reviews of users and restaurants; the TripAdvisor Dataset, which contains check-in information of users and hotels; and the Video Dataset, which includes traffic logs we collected from a large-scale industrial video platform. Specifically, we use four days of traffic logs for training and the following day for evaluation. We list the descriptive statistics of these datasets in Table 3. To avoid cold-start and sparsity issues, we filter out users and items that appear fewer than 5 times in all three datasets.
Dataset          Yelp          TripAdvisor          Video
Table 3. Descriptive Statistics of Three Datasets
We perform Bayesian optimization [60] to select optimal hyperparameters for the proposed method as well as for the baseline models. The hyperparameter α is set to 0.03, where we achieve the best balance between the accuracy and unexpectedness measures. In addition, the dimension of the latent embeddings is 128, which efficiently captures the relations between users and items, as shown in [73]. Detailed parameter settings are further introduced in the next section.

As discussed in Section 4, for the three datasets we select three state-of-the-art embedding approaches to model unexpectedness in the latent space. Specifically, the Yelp dataset contains explicit users, items and ratings, as well as substantial amounts of meta-information, including text reviews, a friendship network, user demographics and geolocation. It is therefore suitable for the Heterogeneous Information Network Embedding (HINE) approach, which addresses the heterogeneous relationships within the Yelp dataset. Due to the multimodality of the video data, we utilize the Multimodal Embedding (ME) approach to calculate the unexpectedness between users and videos in the Video dataset. The TripAdvisor dataset only includes users, items and their associated feature information, which makes the AutoEncoder (AE) approach a reasonable choice for obtaining latent embeddings.

We point out that, although testing the same embedding approach on all three datasets could further increase the validity of our approach, it is not practical to do so. By implementing our model with three different embedding approaches, we illustrate the strength of modeling unexpectedness in the latent space; this point does not rely on the specific design of the embedding approaches.

Our proposed latent unexpected recommendation model follows a three-step training procedure: first, we utilize the latent embedding approaches to map users and items into the latent space; second, we calculate the unexpectedness and construct the hybrid utility function for each user; finally, we provide unexpected recommendations based on the hybrid utility function and update our model accordingly.

To obtain the heterogeneous information network embeddings from the Yelp dataset, we extract the users, restaurants and feature labels from the dataset to construct the nodes of the heterogeneous information network. We link the user nodes and item nodes with their associated feature nodes, and we also link a user node with an item node if the user has visited that restaurant before. We conduct heterogeneous random walks [54] of length 100 starting from each node to generate the sequences of nodes, and we repeat this process 10 times. Then we apply the skip-gram mechanism following the procedures in [16], with window size 2, minimal term count 1 and 100 iterations, to map the nodes into the latent space and obtain the corresponding latent embeddings.

To obtain the autoencoder embeddings from the TripAdvisor dataset, we utilize one layer of MLP (Multi-Layer Perceptron) as the encoder to generate latent representations for each user and
item, and then use one layer of MLP as the decoder to reconstruct the original information. We jointly optimize the encoder and decoder to generate the latent embeddings.

To obtain the multimodal embeddings from the Video dataset, we decompose the input videos into texts, audios and images, and apply an FCN (Fully-Connected Network) with an L2-norm regularization term to obtain latent embeddings for each modality separately. We then concatenate the text, audio and image embeddings and pass them through another FCN layer to generate the final multimodal embeddings.

For performance comparison, we select the deep-learning based Neural Collaborative Filtering (NCF) model [17] as well as five popular collaborative filtering algorithms, including the k-Nearest Neighbor approach (KNN) [6], the Singular Value Decomposition approach (SVD) [51], the Co-Clustering approach [15], the Non-Negative Matrix Factorization approach (NMF) [33] and the Factorization Machine approach (FM) [48], to verify the robustness of the proposed model. We implement the model in the Python environment using the "Surprise", "SciPy" and "Gensim" packages. All experiments are performed on a laptop with a 2.50GHz Intel Core i7 and 8GB RAM. The training procedure is time-efficient: it takes 3 hours, 0.5 hours and 1 hour, respectively, for our proposed model to obtain latent embeddings on the Yelp, TripAdvisor and Video datasets. The subsequent unexpected recommendation process takes less than one hour to complete.
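To illustrate the Yelp node-embedding step with the hyperparameters listed above (walks of length 100 repeated 10 times, window size 2, minimal count 1, 100 iterations, 128-dimensional embeddings), the sketch below trains a skip-gram model with the Gensim package using its version 4 API; the walk corpus and node naming are hypothetical.

```python
from gensim.models import Word2Vec

def train_node_embeddings(walks, dim=128):
    """Learn node embeddings from heterogeneous random-walk sequences.

    walks: list of node-id sequences, e.g. [["u_12", "i_5", "e_burgers", "i_7"], ...]
           (node identifiers here are purely illustrative).
    """
    model = Word2Vec(
        sentences=walks,
        vector_size=dim,   # dimensionality of the latent embeddings
        window=2,          # skip-gram window size
        min_count=1,       # minimal term count
        sg=1,              # use the skip-gram objective
        epochs=100,        # number of training iterations
    )
    return {node: model.wv[node] for node in model.wv.index_to_key}
```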
To compare the performance of the proposed Latent Closure (LC) method and the baseline models, we measure the recommendation results along two dimensions: accuracy, in terms of the RMSE, MAE, Precision@N and Recall@N metrics [19], and unexpectedness, in terms of the Unexpectedness, Serendipity and Diversity metrics [14]. Specifically, we calculate unexpectedness through Equation (4) following our proposed definition, while serendipity and diversity are computed following the standard measures in the literature [14, 77].
$$Serendipity = \frac{|RS \cap \overline{PM} \cap USEFUL|}{|RS|} \qquad (6)$$

Serendipity is computed as the percentage of serendipitous recommendations, where RS stands for the items recommended by the target model, PM stands for the items recommended by a primitive prediction algorithm (usually linear regression), and USEFUL stands for the items whose utility is above the average level. Diversity is computed as the average intra-list distance.
$$Diversity = \sum_{i \in RS} \ \sum_{j \in RS,\, j \neq i} sim(i, j) \qquad (7)$$

A short computational sketch of these two metrics is given after the baseline descriptions below.

We implement several state-of-the-art unexpected recommendation models as baselines and report their performance in terms of the aforementioned metrics. The baseline models include SPR, Auralist, DPP and HOM-LIN. Note that we do not include a neural network approach because there is no deep-learning based model for unexpected recommendations in the literature.

• SPR [39].
Serendipitous Personalized Ranking is a simple and effective method for serendipitous item recommendation that extends traditional personalized ranking methods by considering item popularity in AUC optimization, which makes the ranking sensitive to the popularity of negative examples.

• Auralist [74].
Auralist is a personalized recommendation system that balances the desired goals of accuracy, diversity, novelty and serendipity simultaneously. Specifically, in music recommendation, the authors combine Artist-based LDA recommendation with
two novel components: Listener Diversity and Musical Bubbles. We adjust the algorithm accordingly to fit our restaurant and hotel recommendation scenarios.

• DPP [10].
The determinantal point process (DPP) is an elegant probabilistic model of repulsion with applications in various machine learning tasks. The authors propose a fast greedy MAP inference approach for DPP to generate relevant and diverse recommendations.

• HOM-LIN [2].
HOM-LIN is the state-of-the-art unexpected recommendation algorithm, where the authors propose to define unexpectedness as the distance between items and the expected set of users in the feature space, and linearly combine unexpectedness with estimated ratings to provide recommendations.
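As a reference for how the two beyond-accuracy metrics above can be computed per recommendation list, the following sketch implements our reading of Equations (6) and (7); the cosine similarity and the set inputs are illustrative assumptions, and the exact primitive model and usefulness threshold are application-specific.

```python
import numpy as np

def serendipity(rs, pm, useful):
    """Fraction of recommended items that are useful and not produced by the
    primitive prediction model (Equation (6))."""
    rs, pm, useful = set(rs), set(pm), set(useful)
    if not rs:
        return 0.0
    return len((rs - pm) & useful) / len(rs)

def diversity(rs, embedding):
    """Aggregate pairwise similarity over the recommended list (Equation (7)),
    using cosine similarity between item embeddings as sim(i, j)."""
    def sim(a, b):
        va, vb = embedding[a], embedding[b]
        return float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))
    return sum(sim(i, j) for i in rs for j in rs if i != j)
```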
To illustrate the differences in recommendation performance between our proposed model and the baseline methods, we conduct significance testing over the experimental results. Specifically, the significance level is determined by rerunning the unexpected recommendation models with random initialization multiple times and conducting Student's t-test to compute the p-value. We report the significance level together with our results in the next section.
Note that the cold-start problem is very important in recommender systems. Our proposed unexpected recommender system does not encounter this problem for the following reasons. First, for the user-side cold-start problem, we do not provide unexpected recommendations, as new users have very few interactions and normally do not face the problem of boredom; instead, we suggest providing classical recommendations, which produce similar items to help users identify and reinforce the content they are interested in. Second, for the item-side cold-start problem, the new item embeddings can be obtained through classical cold-start embedding methods [66], after which we can calculate unexpectedness and provide unexpected recommendations accordingly.
In this section, we report the experimental results on three real-world datasets to answer the research questions in Section 5.
To start with, we compare the recommendation performance of the proposed latent unexpectedness model with the baseline unexpected recommendation models. Specifically, the proposed LC method provides unexpected recommendations through $Utility_{u,i} = EstRating_{u,i} + \alpha \cdot Unexp_{u,i}$, where unexpectedness is calculated using the Latent HyperSphere introduced in Section 3.2 and the estimated ratings are computed through the deep-learning based method Neural Collaborative Filtering (NCF) and five other popular collaborative filtering algorithms: Factorization Machine (FM), CoClustering (CC), Singular Value Decomposition (SVD), Non-negative Matrix Factorization (NMF) and K-Nearest Neighbor (KNN). We denote the corresponding unexpected recommendations provided through the hybrid utility functions as NCF+LC, FM+LC, CC+LC, SVD+LC, NMF+LC and KNN+LC, respectively.

As shown in Tables 4, 5 and 6, by utilizing the proposed latent modeling of unexpectedness, all six unexpected recommendation models consistently and significantly outperform the baseline methods in both accuracy and unexpectedness measures. Specifically, we observe an average increase of 5.21% in RMSE, 8.11% in MAE, 1.14% in Precision, 1.57% in Recall, 48.77% in Unexpectedness, 8.30% in Serendipity and 27.69% in Diversity compared to the second-best baseline model in the Yelp dataset. That is to say, the proposed latent modeling of unexpectedness enables us to provide more unexpected and more useful recommendations at the same time. Also, we show that the superiority of latent unexpectedness is robust to the specific selection of collaborative filtering algorithm, as we obtain a significant increase in performance measures for all six algorithms and do not observe any significant difference in the unexpectedness metric among these methods.
Yelp Dataset
Model     RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF+LC    0.9169   0.7078
FM+LC     0.9180
Table 4. Comparison of unexpected recommendation performance in the Yelp dataset; "*" stands for 95% statistical significance
TripAdvisor Dataset
Model     RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF+LC
SVD+LC    0.9908   0.7519   0.7093   0.9569   0.0585   0.4614   0.0477
NMF+LC    1.0280   0.7594   0.6864   0.9735   0.0584   0.4629   0.0488
KNN+LC    0.9981   0.7493   0.6909   0.9743
Table 5. Comparison of unexpected recommendation performance in the TripAdvisor dataset; "*" stands for 95% statistical significance
Video Dataset
Model      RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF+LC
SPR        0.4610   0.3638   0.2298   0.2870   0.6300   0.9593   0.2137
Auralist   0.4515   0.3610   0.2304   0.2890   0.6462   0.9462   0.1980
HOM-LIN    0.4498   0.3608   0.2310   0.2912   0.6732   0.9473   0.2154
DPP        0.4770   0.3670   0.2271   0.2870   0.6593   0.9328   0.2154
Table 6. Comparison of unexpected recommendation performance in the Video dataset; "*" stands for 95% statistical significance
Model     RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF       0.9154   0.7070   0.7761   0.6318   0.0492   0.1666   0.3492
NCF+LC    0.9169   0.7078   0.7783   0.6291
FM        0.9197   0.6815   0.7699   0.6223   0.0326   0.0978   0.0135
FM+LC     0.9180   0.6888   0.7704   0.6278
CC        0.9499   0.7040   0.7655   0.5913   0.0338   0.1595   0.3106
CC+LC     0.9514   0.7007   0.7626   0.5926
SVD       0.9132   0.7071   0.7792   0.6244   0.0457   0.1352   0.0479
SVD+LC    0.9136   0.7039   0.7722   0.6212
NMF       0.9533   0.7081   0.7797   0.6318   0.0333   0.1954   0.3268
NMF+LC    0.9522   0.7026   0.7781   0.6238
KNN       0.9123   0.7748   0.7687   0.6285   0.0448   0.0977   0.0129
KNN+LC    0.9133   0.7715   0.7674   0.6287

(a) RMSE (b) MAE (c) Unexpectedness (d) Serendipity
Table 7. Comparison of recommendation performance with and without unexpectedness in the Yelp dataset; "*" stands for 95% statistical significance. We observe significant improvements in the unexpectedness measures in (c) and (d), while there is no significant change in the accuracy measures in (a) and (b).
As discussed in the previous section, an important problem with incorporating unexpectedness into recommendations is the trade-off between accuracy and novelty measures [76, 78], which is crucial to the practical use of unexpected recommendations. In this section, we compare the unexpected recommendation performance obtained with the hybrid utility functions against classical recommender systems that provide recommendations based on estimated ratings only.

As shown in Tables 7, 8 and 9 and the corresponding plots, when including unexpectedness in the recommendation process, we consistently obtain significant improvements in the unexpectedness, serendipity and diversity measures, while we do not witness any loss in the accuracy measures. Therefore, we show that it is indeed the proposed latent closure approach that enables us to provide useful and unexpected recommendations simultaneously, which is crucial for the successful deployment of unexpected recommendation models in industrial applications.
Model     RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF       0.9588   0.7291   0.7230   0.9815   0.0222   0.3960   0.0010
NCF+LC    0.9624   0.7310   0.7201   0.9810
FM        1.0105   0.7440   0.7068   0.9590   0.0222   0.3979   0.0017
FM+LC     1.0230   0.7450   0.7031   0.9638
CC        1.0178   0.7543   0.6845   0.9732   0.0234   0.3973   0.0015
CC+LC     1.0230   0.7539   0.6887   0.9754
SVD       0.9868   0.7533   0.7010   0.9565   0.0231   0.3967   0.0006
SVD+LC    0.9908   0.7519   0.7093   0.9569
NMF       1.0241   0.7609   0.6850   0.9681   0.0227   0.3979   0.0010
NMF+LC    1.0280   0.7594   0.6864   0.9735
KNN       0.9940   0.7531   0.6969   0.9689   0.0233   0.3979   0.0019
KNN+LC    0.9981   0.7493   0.6909   0.9743

(a) RMSE (b) MAE (c) Unexpectedness (d) Serendipity
Table 8. Comparison of recommendation performance with and without unexpectedness in the TripAdvisor dataset; "*" stands for 95% statistical significance. We observe significant improvements in the unexpectedness measures in (c) and (d), while there is no significant change in the accuracy measures in (a) and (b).
In addition, we study the impact of the hyperparameter α in Equation (5), which controls the balance between unexpectedness and usefulness in the hybrid utility function. A higher value of α indicates that the recommendation model favors unexpected recommendations over useful recommendations, while a lower value of α recommends more useful items as opposed to unexpected items.

We plot the change of the accuracy and novelty measures with respect to different α values in Figure 5. The figure illustrates that when we select a relatively small value of α (e.g., α = 0.03), we obtain a significant increase in unexpectedness (8.40%, 10.81% and 9.10%, respectively, in the three datasets) while the decrease in accuracy performance is not statistically significant for any of the three datasets. It is also worth noting that if we select a large value of α, we risk significantly deteriorating the accuracy performance of the recommendations.

In this paper, we show that the proposed latent modeling of unexpectedness significantly improves recommendation performance and provides recommendations that are indeed unexpected. In total, we conduct experiments on 3 different datasets, using 3 different latent embedding approaches, 6 different collaborative filtering algorithms, 7 different evaluation metrics and 3 different geometric structures for modeling unexpectedness, resulting in 378 experimental settings, where all 378 results support our claims. We observe significant improvements in unexpectedness, serendipity and diversity measures, while we do not witness any loss in accuracy measures compared to plain collaborative filtering algorithms that do not include unexpectedness in the recommendation process. In addition, when compared to baseline unexpected recommendation models, our model significantly outperforms them in both accuracy and unexpectedness measures.
Model     RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF       0.3815   0.2877   0.2544   0.3630   0.6402   0.8678   0.2317
NCF+LC    0.3810   0.2854   0.2560   0.3615
FM        0.3920   0.3013   0.2472   0.3280   0.6398   0.8552   0.2396
FM+LC     0.3924   0.3044   0.2498   0.3265
CC        0.4129   0.3302   0.2560   0.3646   0.6479   0.8382   0.2362
CC+LC     0.4167   0.3296   0.2569   0.3676
SVD       0.3806   0.2895   0.2392   0.3232   0.6495   0.8480   0.2346
SVD+LC    0.3888   0.2862   0.2455   0.3253
NMF       0.4462   0.3285   0.2480   0.3391   0.6548   0.8655   0.2385
NMF+LC    0.4405   0.3330   0.2494   0.3439
KNN       0.4103   0.3048   0.2531   0.3173   0.6416   0.8632   0.2385
KNN+LC    0.4088   0.3091   0.2608   0.3212

(a) RMSE (b) MAE (c) Unexpectedness (d) Serendipity
Table 9. Comparison of recommendation performance with and without unexpectedness in the Video dataset; "*" stands for 95% statistical significance. We observe significant improvements in the unexpectedness measures in (c) and (d), while there is no significant change in the accuracy measures in (a) and (b).

(a) Yelp (b) TripAdvisor (c) Video
Fig. 5. Comparison of Accuracy-Novelty Trade-off
To sum up, the superiority of the latent modeling of unexpectedness is robust to the following factors.

• Various Datasets.
We conduct experiments on three different datasets, the Yelp, TripAdvisor and Video datasets, and obtain consistent improvements on all three.

• Multiple Latent Embedding Approaches.
To construct unexpectedness in the latent space, we utilize three state-of-the-art latent embedding approaches, Heterogeneous Information Network Embeddings (HINE), AutoEncoder Embeddings (AE) and Multimodal Embeddings (ME), and obtain similarly superior recommendation performance over the baseline models.

• Specific Collaborative Filtering Algorithms.
We select six representative collaborative filtering algorithms to estimate user ratings and form the hybrid utility function accordingly. These methods include the deep-learning based approach NCF and five other popular models: FM, CC, SVD, NMF and KNN. The latent modeling of unexpectedness enables each collaborative filtering algorithm to provide more unexpected recommendations without losing any accuracy measure.

• Selective Evaluation Metrics.
We evaluate the recommendation performance using the accuracy measures RMSE, MAE, Precision and Recall, and the unexpectedness measures Unexpectedness, Serendipity and Diversity. The proposed model significantly outperforms the baseline unexpected recommendation models on all seven metrics.

• Different Geometric Shapes of Latent Closures.
As discussed in Section 3.2, there are three common geometric structures in high-dimensional latent spaces that are suitable for modeling the closure of latent embeddings: Latent HyperSphere (LHS), Latent HyperCube (LHC) and Latent Convex Hull (LCH). We calculate unexpectedness using the three structures separately and provide unexpected recommendations accordingly. As shown in Tables 10, 11 and 12, the specific selection of geometric structure does not influence the recommendation performance: we obtain similar results and no structure dominates the other two. Instead, it is really the latent modeling of unexpectedness that contributes to the significant improvements in recommendation performance.
Finally, we conduct a case study to reveal the effectiveness of modeling unexpectedness through latent embedding approaches. Specifically, we visualize the learned embedding vectors to provide insight into their semantic information in the latent space. Taking the Yelp dataset as an example, we randomly select 100 restaurants from the dataset and obtain their corresponding embeddings through the HINE method. In Figure 6, we show the visualization of those embeddings through t-SNE [40], in which similar restaurants are clustered close to each other. We can see that cafes and bakeries are clustered to the left, burger bars and fast food restaurants are clustered to the right, and Asian restaurants are clustered to the far right in the latent space. Therefore, we show that the latent embedding approaches used in this paper are indeed capable of capturing latent relations among items and thus provide a precise modeling of unexpectedness.
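Such a visualization can be produced with standard tooling; the sketch below (an illustrative use of scikit-learn and Matplotlib, not the authors' plotting code) projects a set of item embeddings to two dimensions with t-SNE as in Figure 6.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_item_embeddings(embeddings, labels, perplexity=30, seed=0):
    """Project item embeddings to 2-D with t-SNE and label each point."""
    coords = TSNE(n_components=2, perplexity=perplexity,
                  random_state=seed).fit_transform(embeddings)
    plt.figure(figsize=(6, 5))
    plt.scatter(coords[:, 0], coords[:, 1], s=12)
    for (x, y), name in zip(coords, labels):
        plt.annotate(name, (x, y), fontsize=6)   # e.g. restaurant names or categories
    plt.title("t-SNE projection of item embeddings")
    plt.tight_layout()
    plt.show()
```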
In this paper, we propose a novel latent modeling of unexpectedness that simultaneously provides unexpected and satisfying recommendations. Specifically, we define the unexpectedness of a new item as the distance between the embedding of that item in the latent space and the closure of all previously consumed item embeddings. This new definition enables us to capture latent, complex and heterogeneous relationships between users and items, which significantly improves the performance and practicability of unexpected recommendations.
Model      RMSE     MAE      Pre@5    Rec@5    Unexp    Ser      Div
NCF+LCH    0.9158   0.7076   0.7798   0.6308   0.1478   0.4889   0.4170
NCF+LHS    0.9169   0.7078   0.7783   0.6291   0.1450   0.4905   0.4178
NCF+LHC    0.9180   0.7013   0.7725   0.6270   0.1478   0.4930   0.4178
FM+LCH     0.9178   0.6820   0.7700   0.6123   0.1422   0.4593   0.4198
FM+LHS     0.9180   0.6888   0.7704   0.6278   0.1378   0.4603   0.4164
FM+LHC     0.9162   0.6798   0.7698   0.6195   0.1402   0.4608   0.4198
CC+LCH     0.9504   0.7038   0.7596   0.5864   0.1400   0.4660   0.3869
CC+LHS     0.9514   0.7007   0.7626   0.5926   0.1355   0.4793   0.3961
CC+LHC     0.9501   0.7072   0.7645   0.5774   0.1349   0.4644   0.3847
SVD+LCH    0.9134   0.7076   0.7701   0.6175   0.1240   0.4569   0.3524
SVD+LHS    0.9136   0.7039   0.7722   0.6212   0.1214   0.4630   0.3511
SVD+LHC    0.9126   0.7081   0.7720   0.6133   0.1192   0.4534   0.3602
NMF+LCH    0.9522   0.7054   0.7722   0.6233   0.1390   0.4869   0.4030
NMF+LHS    0.9522   0.7026   0.7781   0.6238   0.1466   0.4894   0.4045
NMF+LHC    0.9558   0.7013   0.7692   0.6260   0.1471   0.4852   0.4012
KNN+LCH    0.9128   0.7751   0.7659   0.6273   0.1220   0.4365   0.3259
KNN+LHS    0.9133   0.7715   0.7674   0.6287   0.1288   0.4380   0.3388
KNN+LHC    0.9117   0.7753   0.7662   0.6272   0.1327   0.4421   0.3427
Table 10. Comparison of unexpected recommendations in the Yelp dataset using different geometric structures. "*" stands for 95% statistical significance.
Model     RMSE    MAE     Pre@5   Rec@5   Unexp   Ser     Div
NCF+LCH   0.9635  0.7317  0.7210  0.9795  0.0579  0.4622  0.0478
NCF+LHS   0.9624  0.7310  0.7201  0.9810  0.0586  0.4635  0.0472
NCF+LHC   0.9652  0.7305  0.7214  0.9814  0.0593  0.4647  0.0469
FM+LCH    1.0275  0.7445  0.7040  0.9656  0.0543  0.4631  0.0393
FM+LHS    1.0230  0.7450  0.7031  0.9638  0.0581  0.4637  0.0388
FM+LHC    1.0218  0.7472  0.7020  0.9632  0.0561  0.4607  0.0407
CC+LCH    1.0285  0.7541  0.6865  0.9703  0.0552  0.4619  0.0471
CC+LHS    1.0230  0.7539  0.6887  0.9754  0.0587  0.4629  0.0491
CC+LHC    1.0200  0.7539  0.6864  0.9730  0.0562  0.4667  0.0498
SVD+LCH   0.9937  0.7517  0.7085  0.9594  0.0544  0.4621  0.0499
SVD+LHS   0.9908  0.7519  0.7093  0.9569  0.0585  0.4614  0.0477
SVD+LHC   0.9884  0.7541  0.7091  0.9474  0.0562  0.4654  0.0485
NMF+LCH   1.0262  0.7533  0.6881  0.9775  0.0544  0.4627  0.0499
NMF+LHS   1.0280  0.7594  0.6864  0.9735  0.0584  0.4629  0.0488
NMF+LHC   1.0265  0.7600  0.6853  0.9711  0.0559  0.4677  0.0504
KNN+LCH   1.0001  0.7483  0.6907  0.9763  0.0543  0.4631  0.0492
KNN+LHS   0.9981  0.7493  0.6909  0.9743  0.0588  0.4625  0.0488
KNN+LHC   0.9950  0.7524  0.6927  0.9701  0.0564  0.4671  0.0500
Table 11. Comparison of unexpected recommendations in the TripAdvisor dataset using different geometric structures. "*" stands for 95% statistical significance.
Model     RMSE    MAE     Pre@5   Rec@5   Unexp   Ser     Div
NCF+LCH   0.3799  0.2870  0.2572  0.3638  0.7049  0.9819  0.2538
NCF+LHS   0.3810  0.2854  0.2560  0.3615  0.7070  0.9830  0.2538
NCF+LHC   0.3817  0.2846  0.2549  0.3632  0.7101  0.9852  0.2536
FM+LCH    0.3906  0.2998  0.2510  0.3278  0.7112  0.9840  0.2518
FM+LHS    0.3924  0.3044  0.2498  0.3265  0.7096  0.9833  0.2510
FM+LHC    0.3940  0.3056  0.2506  0.3302  0.7177  0.9833  0.2518
CC+LCH    0.4157  0.3240  0.2564  0.3624  0.7101  0.9817  0.2512
CC+LHS    0.4167  0.3296  0.2569  0.3676  0.7053  0.9815  0.2519
CC+LHC    0.4151  0.3358  0.2553  0.3659  0.7065  0.9830  0.2508
SVD+LCH   0.3841  0.2925  0.2400  0.3277  0.7010  0.9844  0.2408
SVD+LHS   0.3888  0.2862  0.2455  0.3253  0.7018  0.9810  0.2412
SVD+LHC   0.3836  0.2841  0.2433  0.3271  0.7007  0.9812  0.2454
NMF+LCH   0.4423  0.3306  0.2380  0.3491  0.7008  0.9799  0.2488
NMF+LHS   0.4405  0.3330  0.2494  0.3439  0.6999  0.9792  0.2450
NMF+LHC   0.4433  0.3387  0.2420  0.3459  0.6961  0.9803  0.2438
KNN+LCH   0.4106  0.3107  0.2584  0.3175  0.7007  0.9817  0.2558
KNN+LHS   0.4088  0.3091  0.2608  0.3212  0.7014  0.9814  0.2558
KNN+LHC   0.4069  0.3099  0.2620  0.3248  0.7073  0.9830  0.2519
Table 12. Comparison of unexpected recommendations in the Video dataset using different geometric structures. "*" stands for 95% statistical significance.

Fig. 6. t-SNE Visualization of Latent Embeddings
To achieve this, we design a hybrid utility function as a linear combination of estimated ratings and unexpectedness, which optimizes the accuracy and unexpectedness objectives of recommendations simultaneously. Furthermore, we demonstrate that the proposed approach consistently and significantly outperforms all baseline models in terms of the unexpectedness, serendipity and diversity measures without losing any accuracy performance.

The contributions of this paper are threefold. First, we propose the latent modeling of unexpectedness. Although exploring the latent space is a common idea in recommendation, it is not obvious how to do so for unexpected recommendations, as discussed in Section 3. Second, we construct a hybrid utility function that combines the proposed unexpectedness measure with the estimated rating value and provides unexpected recommendations based on the resulting hybrid utility values. We demonstrate that this approach significantly outperforms all other unexpected recommendation baselines. Third, we conduct extensive experiments in multiple settings and show that it is indeed the latent modeling of unexpectedness that leads to the significant increase in unexpectedness measures without sacrificing any accuracy performance. Thus, the proposed approach helps users break out of their filter bubbles.

As future work, we plan to conduct live experiments in real business environments in order to further evaluate the effectiveness of unexpected recommendations and to analyze both their qualitative and quantitative aspects in online retail settings through A/B tests. Specifically, we plan to deploy our model on an industrial platform and measure its performance using business metrics, including CTR and GMV. Moreover, we will further explore the impact of unexpected recommendations on user satisfaction. Finally, we plan to design algorithms that automatically incorporate the concept of unexpectedness into a deep-learning recommendation framework that optimizes the recommendation performance and the construction of latent embeddings at the same time.
REFERENCES
[1] Panagiotis Adamopoulos. 2014. On discovering non-obvious recommendations: Using unexpectedness and neighborhood selection methods in collaborative filtering systems. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining. ACM, 655–660.
[2] Panagiotis Adamopoulos and Alexander Tuzhilin. 2015. On unexpectedness in recommender systems: Or how to better expect the unexpected. ACM Transactions on Intelligent Systems and Technology (TIST). ACM, 54.
[3] Gediminas Adomavicius and YoungOk Kwon. 2011. Improving aggregate recommendation diversity using ranking-based techniques. IEEE Transactions on Knowledge and Data Engineering. IEEE, 896–911.
[4] Gediminas Adomavicius and YoungOk Kwon. 2011. Maximizing aggregate recommendation diversity: A graph-theoretic approach. Citeseer.
[5] Takayuki Akiyama, Kiyohiro Obara, and Masaaki Tanizaki. 2010. Proposal and Evaluation of Serendipitous Recommendation Method Using General Unexpectedness. In PRSAT@RecSys. 3–10.
[6] Naomi S Altman. 1992. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician. Taylor & Francis Group, 175–185.
[7] C Bradford Barber, David P Dobkin, and Hannu Huhdanpaa. 1996. The quickhull algorithm for convex hulls. ACM Transactions on Mathematical Software (TOMS). 469–483.
[8] Andrea Barraza-Urbina. 2017. The exploration-exploitation trade-off in interactive recommender systems. In Proceedings of the Eleventh ACM Conference on Recommender Systems. 431–435.
[9] Li Chen, Yonghua Yang, Ningxia Wang, Keping Yang, and Quan Yuan. 2019. How Serendipity Improves User Satisfaction with Recommendations? A Large-Scale User Evaluation. In The World Wide Web Conference. ACM, 240–250.
[10] Laming Chen, Guoxin Zhang, and Eric Zhou. 2018. Fast greedy MAP inference for Determinantal Point Process to improve recommendation diversity. In Advances in Neural Information Processing Systems. 5622–5633.
[11] Yu-Shian Chiu, Kuei-Hong Lin, and Jia-Sin Chen. 2011. A social network-based serendipity recommender system. IEEE, 1–5.
[12] Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 135–144.
[13] Mike Gartrell, Ulrich Paquet, and Noam Koenigstein. 2017. Low-rank factorization of determinantal point processes. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. 1912–1918.
[14] Mouzhi Ge, Carla Delgado-Battenfeld, and Dietmar Jannach. 2010. Beyond accuracy: evaluating recommender systems by coverage and serendipity. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 257–260.
[15] Thomas George and Srujana Merugu. 2005. A Scalable Collaborative Filtering Framework Based on Co-Clustering. In Proceedings of the Fifth IEEE International Conference on Data Mining. 625–628.
[16] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 855–864.
[17] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th international conference on World Wide Web. International World Wide Web Conferences Steering Committee, 173–182.
[18] Sigurdur Helgason. 2001. Differential geometry and symmetric spaces. Vol. 341. American Mathematical Soc.
[19] Jonathan L Herlocker, Joseph A Konstan, Loren G Terveen, and John T Riedl. 2004. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS) 22, 1 (2004), 5–53.
[20] Yoshinori Hijikata, Takuya Shimizu, and Shogo Nishida. 2009. Discovery-oriented collaborative filtering for improving user satisfaction. In Proceedings of the 14th international conference on Intelligent user interfaces. 67–76.
[21] Geoffrey E Hinton and Ruslan R Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science.
[22] In Proceedings of the 7th ACM conference on Recommender systems. 379–382.
[23] Leo Iaquinta, Marco De Gemmis, Pasquale Lops, Giovanni Semeraro, Michele Filannino, and Piero Molino. 2008. Introducing serendipity in a content-based recommender system. In Hybrid Intelligent Systems, 2008. HIS'08. Eighth International Conference on. IEEE, 168–173.
[24] Leo Iaquinta, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, and Piero Molino. 2010. Can a recommender system induce serendipitous encounters? In E-commerce. InTech.
[25] Masayuki Ishikawa, Peter Geczy, Noriaki Izumi, and Takahira Yamaguchi. 2008. Long tail recommender utilizing information diffusion theory. Vol. 1. IEEE, 785–788.
[26] Junzo Kamahara, Tomofumi Asakawa, Shinji Shimojo, and Hideo Miyahara. 2005. A community-based recommendation system to reveal unexpected interests. IEEE, 433–438.
[27] Marius Kaminskas and Derek Bridge. 2016. Diversity, serendipity, novelty, and coverage: a survey and empirical analysis of beyond-accuracy objectives in recommender systems. ACM Transactions on Interactive Intelligent Systems (TiiS) 7, 1 (2016), 1–42.
[28] Komal Kapoor, Vikas Kumar, Loren Terveen, Joseph A Konstan, and Paul Schrater. 2015. I like to explore sometimes: Adapting to dynamic user novelty preferences. In Proceedings of the 9th ACM Conference on Recommender Systems. ACM, 19–26.
[29] Komal Kapoor, Karthik Subbian, Jaideep Srivastava, and Paul Schrater. 2015. Just in time recommendations: Modeling the dynamics of boredom in activity streams. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining. ACM, 233–242.
[30] Noriaki Kawamae. 2010. Serendipitous recommendations via innovators. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 218–225.
[31] Noriaki Kawamae, Hitoshi Sakano, and Takeshi Yamada. 2009. Personalized recommendation based on the personal innovator degree. In Proceedings of the third ACM conference on Recommender systems. ACM, 329–332.
[32] John Paul Kelly and Derek Bridge. 2006. Enhancing the diversity of conversational collaborative recommendations: a comparison. Artificial Intelligence Review 25, 1-2 (2006), 79–95.
[33] Daniel D Lee and H Sebastian Seung. 2001. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems. 556–562.
[34] Kibeom Lee and Kyogu Lee. 2015. Escaping your comfort zone: A graph-based recommender system for finding novel recommendations among relevant items. Expert Systems with Applications 42, 10 (2015), 4851–4858.
[35] Pan Li and Alexander Tuzhilin. 2019. Latent Modeling of Unexpectedness for Recommendations. Proceedings of ACM RecSys 2019 Late-breaking Results (2019), 7–10.
[36] Pan Li and Alexander Tuzhilin. 2019. Latent multi-criteria ratings for recommendations. In Proceedings of the 13th ACM Conference on Recommender Systems. 428–431.
[37] Pan Li and Alexander Tuzhilin. 2020. DDTCDR: Deep Dual Transfer Cross Domain Recommendation. In Proceedings of the 13th International Conference on Web Search and Data Mining. 331–339.
[38] Yen-Yu Lin, Tyng-Luh Liu, and Hwann-Tzong Chen. 2005. Semantic manifold learning for image retrieval. In Proceedings of the 13th annual ACM international conference on Multimedia. 249–258.
[39] Qiuxia Lu, Tianqi Chen, Weinan Zhang, Diyi Yang, and Yong Yu. 2012. Serendipitous personalized ranking for top-n recommendation. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on, Vol. 1. IEEE, 258–265.
[40] Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, Nov (2008), 2579–2605.
[41] Sean M McNee, John Riedl, and Joseph A Konstan. 2006. Being accurate is not enough: how accuracy metrics have hurt recommender systems. In CHI'06 extended abstracts on Human factors in computing systems. 1097–1101.
[42] Tomoko Murakami, Koichiro Mori, and Ryohei Orihara. 2007. Metrics for evaluating the serendipity of recommendation lists. In Annual conference of the Japanese society for artificial intelligence. Springer, 40–46.
[43] Tien T Nguyen, Pik-Mai Hui, F Maxwell Harper, Loren Terveen, and Joseph A Konstan. 2014. Exploring the filter bubble: the effect of using recommender systems on content diversity. In Proceedings of the 23rd international conference on World Wide Web. ACM, 677–686.
[44] Jinoh Oh, Sun Park, Hwanjo Yu, Min Song, and Seung-Taek Park. 2011. Novel recommendation based on personal popularity tendency. IEEE, 507–516.
[45] Yingwei Pan, Tao Mei, Ting Yao, Houqiang Li, and Yong Rui. 2016. Jointly Modeling Embedding and Translation to Bridge Video and Language. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4594–4602.
[46] Eli Pariser. 2011. The filter bubble: How the new personalized web is changing what we read and how we think. Penguin.
[47] Yoon-Joo Park and Alexander Tuzhilin. 2008. The long tail of recommender systems and how to leverage it. In Proceedings of the 2008 ACM conference on Recommender systems. 11–18.
[48] Steffen Rendle. 2010. Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on. IEEE, 995–1000.
[49] David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. 1985. Learning internal representations by error propagation. Technical Report. California Univ San Diego La Jolla Inst for Cognitive Science.
[50] Alan Said, Benjamin Kille, Brijnesh J Jain, and Sahin Albayrak. 2012. Increasing diversity through furthest neighbor-based recommendation. Proceedings of the WSDM 12 (2012).
[51] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2002. Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems. In Fifth International Conference on Computer and Information Science. Citeseer.
[52] Suvash Sedhain, Aditya Krishna Menon, Scott Sanner, and Lexing Xie. 2015. Autorec: Autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web. ACM, 111–112.
[53] Guy Shani and Asela Gunawardana. 2011. Evaluating recommendation systems. In Recommender systems handbook. Springer, 257–297.
[54] Chuan Shi, Binbin Hu, Xin Zhao, and Philip Yu. 2018. Heterogeneous Information Network Embedding for Recommendation. IEEE Transactions on Knowledge and Data Engineering (2018).
[55] Chuan Shi, Yitong Li, Jiawei Zhang, Yizhou Sun, and S Yu Philip. 2017. A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering 29, 1 (2017), 17–37.
[56] Lei Shi. 2013. Trading-off among accuracy, similarity, diversity, and long-tail: a graph-based recommendation approach. In Proceedings of the 7th ACM conference on Recommender systems. 57–64.
[57] Yue Shi, Xiaoxue Zhao, Jun Wang, Martha Larson, and Alan Hanjalic. 2012. Adaptive diversification of recommendation results via latent factor portfolio. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval. 175–184.
[58] Morton Slater. 2014. Lagrange multipliers revisited. In Traces and Emergence of Nonlinear Programming. Springer, 293–306.
[59] Barry Smyth and Paul McClave. 2001. Similarity vs. diversity. In International conference on case-based reasoning. Springer, 347–361.
[60] Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 2951–2959. http://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf
[61] Ruilong Su, Li'Ang Yin, Kailong Chen, and Yong Yu. 2013. Set-oriented personalized ranking for diversified top-n recommendation. In Proceedings of the 7th ACM conference on Recommender systems. 415–418.
[62] Yizhou Sun and Jiawei Han. 2013. Mining heterogeneous information networks: a structural analysis approach. ACM SIGKDD Explorations Newsletter 14, 2 (2013), 20–28.
[63] Maria Taramigkou, Efthimios Bothos, Konstantinos Christidis, Dimitris Apostolou, and Gregoris Mentzas. 2013. Escape the bubble: Guided exploration of music preferences for serendipity and novelty. In Proceedings of the 7th ACM conference on Recommender systems. ACM, 335–338.
[64] Saúl Vargas, Linas Baltrunas, Alexandros Karatzoglou, and Pablo Castells. 2014. Coverage, redundancy and size-awareness in genre diversity for recommender systems. In Proceedings of the 8th ACM Conference on Recommender systems. 209–216.
[65] Saúl Vargas and Pablo Castells. 2011. Rank and relevance in novelty and diversity metrics for recommender systems. In Proceedings of the fifth ACM conference on Recommender systems. 109–116.
[66] Jizhe Wang, Pipei Huang, Huan Zhao, Zhibo Zhang, Binqiang Zhao, and Dik Lun Lee. 2018. Billion-scale commodity embedding for e-commerce recommendation in Alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 839–848.
[67] Jacek Wasilewski and Neil Hurley. 2019. Bayesian Personalized Ranking for Novelty Enhancement. In Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization. 144–148.
[68] Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Richang Hong, and Tat-Seng Chua. 2019. MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video. In Proceedings of the 27th ACM International Conference on Multimedia. 1437–1445.
[69] Li-Tung Weng, Yue Xu, Yuefeng Li, and Richi Nayak. 2007. Improving recommendation novelty based on topic taxonomy. IEEE, 115–118.
[70] Le Wu, Qi Liu, Enhong Chen, Nicholas Jing Yuan, Guangming Guo, and Xing Xie. 2016. Relevance meets coverage: A unified framework to generate diversified recommendations. ACM Transactions on Intelligent Systems and Technology (TIST) 7, 3 (2016), 1–30.
[71] Xiao Yu, Xiang Ren, Yizhou Sun, Quanquan Gu, Bradley Sturt, Urvashi Khandelwal, Brandon Norick, and Jiawei Han. 2014. Personalized entity recommendation: A heterogeneous information network approach. In Proceedings of the 7th ACM international conference on Web Search and Data Mining. ACM, 283–292.
[72] Xiao Yu, Xiang Ren, Yizhou Sun, Bradley Sturt, Urvashi Khandelwal, Quanquan Gu, Brandon Norick, and Jiawei Han. 2013. Recommendation in heterogeneous information networks with implicit user feedback. In Proceedings of the 7th ACM conference on Recommender systems. ACM, 347–350.
[73] Shuai Zhang, Lina Yao, and Aixin Sun. 2017. Deep learning based recommender system: A survey and new perspectives. arXiv preprint arXiv:1707.07435 (2017).
[74] Yuan Cao Zhang, Diarmuid Ó Séaghdha, Daniele Quercia, and Tamas Jambor. 2012. Auralist: introducing serendipity into music recommendation. In Proceedings of the fifth ACM international conference on Web Search and Data Mining. ACM, 13–22.
[75] Qianru Zheng, Chi-Kong Chan, and Horace HS Ip. 2015. An unexpectedness-augmented utility model for making serendipitous recommendation. In Industrial Conference on Data Mining. Springer, 216–230.
[76] Tao Zhou, Zoltán Kuscsik, Jian-Guo Liu, Matúš Medo, Joseph Rushton Wakeling, and Yi-Cheng Zhang. 2010. Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences.
[77] In Proceedings of the 14th international conference on World Wide Web. ACM, 22–32.
[78] Zainab Zolaktaf, Reza Babanezhad, and Rachel Pottinger. 2018. A Generic Top-N Recommendation Framework For Trading-off Accuracy, Novelty, and Coverage. In 2018 IEEE 34th International Conference on Data Engineering (ICDE).