Diversification in Session-based News Recommender Systems

Abstract

Recommender systems are widely applied in digital platforms such as news websites to personalize services based on user preferences. On news websites most users are anonymous, and the only available data are sequences of items in anonymous sessions. As a result, typical collaborative filtering methods, which are widely used in many applications, are not effective in news recommendation. In this context, session-based recommenders are able to recommend next items given the sequence of previous items in the active session. Neighborhood-based session-based recommenders have been shown to be highly effective compared to more sophisticated approaches. In this study we propose scenarios to make these session-based recommender systems diversity-aware and to address the filter bubble phenomenon. The filter bubble phenomenon is a common concern in news recommendation systems; it occurs when the system narrows the information and deprives users of diverse information. Results on four news datasets show that the proposed diversification scenarios improve the diversity measures of these session-based recommender systems.


arXiv [cs.IR] · Personal and Ubiquitous Computing manuscript No. (will be inserted by the editor)


Alireza Gharahighehi · Celine Vens

Received: date / Accepted: date


Keywords

Session-based recommender system · news recommendation · diversity · filter bubble phenomenon

A. Gharahighehi · C. Vens
Itec, imec research group at KU Leuven, Kortrijk, Belgium
KU Leuven, Campus KULAK, Department of Public Health and Primary Care, Kortrijk, Belgium
E-mail: {alireza.gharahighehi, celine.vens}@kuleuven.be

Nowadays recommender systems (RSs) are applied in almost every digital platform. These platforms try to adapt their services based on user needs in order to increase user satisfaction. RSs infer user needs and preferences using the previous interactions and activities of the user in the platform. In news aggregator websites, users are usually anonymous and therefore their profiles and long-term interaction histories are not available. In this situation the only available information is the sequence of interactions in the current session of the (anonymous) user. Moreover, the news domain is highly dynamic and the set of available news articles for recommendation changes rapidly. Therefore, a news RS should focus on these characteristics to capture recent trends and anonymous users' short-term preferences [13,15,3].

Session-based Recommender Systems (SBRSs) are applied when the user's long-term history is not available and the item set is highly dynamic. SBRSs are meant to recommend the next items given the sequence of visited items in the current session of an anonymous user. SBRSs use the collaborative and sequential information from previous sessions of anonymous users to rank and recommend candidate items for an active session. These methods are applied in many applications such as news recommendation [1], music recommendation and next basket prediction in e-commerce [12].

RSs are mostly designed to generate accurate recommendations based on previous interactions. As they are primarily optimised based on predictive accuracy they can narrow the scope of users' recommendations and tighten the filter bubble around the user. The filter bubble phenomenon occurs when users are exposed to similar topics and content and consequently are isolated from diverse viewpoints and content [23].
In news aggregator websites, in addition to the filter bubble phenomenon, focusing only on accuracy can form echo chambers and boost polarization, radicalization, and fragmentation among users [7]. To mitigate these issues, diversity should be considered in the recommendation lists to avoid recommending redundant items to the users and also to broaden users' horizons.

In this paper we propose diversification approaches for four state-of-the-art neighborhood-based SBRSs using news article metadata. To the best of our knowledge, most current SBRS methods only focus on providing accurate predictions, ignoring diversity of recommendation lists. In particular, we propose two simple yet effective methods to manipulate the candidate item selection or the neighbor selection, resulting in more diverse recommendation lists. We also study their combined effect. Our main question of interest is the accuracy/diversity trade-off, i.e., quantifying the accuracy loss as a cost for the introduced diversity. For this purpose, we use an evaluation measure that takes both goals into account, in addition to the standard accuracy and diversity measures. This is an extension of our previous study [4] where we introduced diversity in the session-based k nearest neighbor (SKNN) method [12], which is a neighborhood-based SBRS. Our extension in this paper is fourfold:

– First, instead of one model (SKNN), we generalize the diversification approaches to several neighborhood-based SBRSs, namely, SKNN [12], vector multiplication SKNN (VSKNN) [21], STAN [2] and VSTAN [22].
– Second, we provide a more comprehensive overview of related SBRS studies and diversification approaches.
– Third, we compare the results of the proposed approaches with the maximal marginal relevance (MMR) re-ranking approach.
– Finally, we extend the evaluation procedure by adding one more news dataset and an additional evaluation measure (topics coverage) to assess the filter bubble issue.

In the following, related studies about SBRSs and diversity are presented in Section 2. Next, in Section 3, the proposed scenarios to diversify recommendations of neighborhood-based SBRSs are explained. In Section 4, four news datasets are described and the experimental setup for designing and testing the proposed scenarios is discussed. Next, the results of applying these proposed scenarios are presented and discussed in Section 5. Finally, we conclude and outline some future research directions in Section 6.

graph neural network (SR-GNN) SBRS. In this model the sequences of items in the sessions are represented as graph structured data.

Numerous previous studies on SBRSs [20,12,22,21,17,13] have shown that neighborhood-based SBRSs surprisingly can outperform recent complex neural network-based SBRSs in both accuracy and computational cost. More specifically, [22] concluded that despite the recent surge of newly proposed neural network-based SBRSs, in most cases they are unable to outperform much simpler methods such as neighborhood-based SBRSs. For instance, they showed that VSTAN performs better than the recently proposed SR-GNN w.r.t. accuracy, while the training time of VSTAN is 10,000 times lower than that of SR-GNN.

2.2 Diversity in recommender systems

The concept of diversity was first introduced in the information retrieval community. A diversified list is more likely to contain the user's actual search intent [14]. In recommender systems diversification is applied to provide a wider range of content and therefore to address the filter bubble phenomenon. To measure the diversity of a recommendation list, a common evaluation measure is the average pair-wise distance between items in the ranked list [26]. This measure is called intra-list diversity (ILD), and a high ILD value means the recommended list contains items with a broad range of content. In recommender systems accuracy and ranking play important roles. Vargas and Castells [28] introduced a rank and relevance sensitive intra-list diversity measure (RR-ILD) that shows to what extent the recommender can diversify the list and preserve the relevant items in the high ranks.

Generally there are two diversification approaches in recommendation systems: re-ranking and diversity modeling. Re-ranking approaches such as [16,29,11] are post-processing methods that reorder the initial ranked list generated by a baseline recommender. While these methods are able to increase diversity, they need additional post-processing steps and are normally computationally expensive. On the other hand, diversity modeling methods such as [24,25,27] adapt the main recommender method to make it diversity-aware based on item metadata. Each of these methods can only be applied to the specific recommender method (for instance BPR) that is used as the main RS and is therefore not applicable to other types of RS such as SBRSs.

Although diversity has been extensively studied in user-based recommendation systems, it has received very limited attention in SBRSs. The aim of this article is to make the neighborhood-based SBRSs diversity-aware. Although our method is applicable in all domains where SBRSs are used, here we focus on news content.
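The ILD measure described above can be illustrated with a short sketch. It assumes that items are represented by embedding vectors and that cosine distance is the pair-wise distance; the function names are hypothetical and not taken from the authors' code.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector is treated as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def intra_list_diversity(embeddings):
    """Average pair-wise distance over all unordered item pairs in a list."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0  # assumption: a list with fewer than two items has no diversity
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)
```

A list of identical items yields an ILD of 0, while a list of mutually orthogonal embeddings yields 1 under this distance.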

In this section we explain how we make neighborhood-based SBRSs diversity-aware. We first explain the background related to these SBRSs, namely, SKNN, VSKNN, STAN and VSTAN in Section 3.1 and then propose our approaches to make them diversity-aware in Section 3.2. As mentioned above, these SBRSs are very promising in terms of accuracy and computational cost [20,12,13,22,21,17], compared to more complex types of SBRS.

3.1 Background

This section reviews the four neighborhood-based SBRSs that we make diversity-aware, namely SKNN [12], VSKNN [21], STAN [2] and VSTAN [22]. The last three models are all extensions of SKNN. SKNN is a memory-based SBRS that uses the items in the current session to select the nearest neighbor sessions and to predict the next items in the current session. To predict the score of a candidate item, SKNN uses the similarity of the item set in the neighbor sessions that include the candidate item, with the item set in the active session. This score is calculated using Eq. 1:

$$\hat{r}_{SKNN}(i, s) = \sum_{n \in N_s} w_{s,n} \times 1_n(i) \tag{1}$$

where $\hat{r}_{SKNN}(i, s)$ is the predicted score for session $s$ and candidate item $i$, $w_{s,n}$ is the similarity between session $s$ and session $n$, $1_n(i)$ is an indicator function that verifies whether item $i$ exists in session $n$, and $N_s$ is the set of neighbor sessions for session $s$. To calculate the similarity between two sessions one can use the cosine similarity measure:

$$w_{s,n} = \frac{s \cdot n}{\|s\| \times \|n\|} \tag{2}$$

where $s$ and $n$ are the binary vector representations for sessions $s$ and $n$ over the item set in the training sessions.

VSKNN [21] is an extension of SKNN that puts more focus on the recent items in the active session. In this method, instead of considering inclusion/exclusion of items in the active session as a binary value, real values are assigned to the items of the session based on their order. The last item of the active session gets the value 1 and the values of the less recent items in the vector are decayed based on a decay function. The choice of the decay function is a hyperparameter. Ludewig et al. [21] proposed linear, logarithmic, quadratic and inverse decay functions. After constructing this real-valued vector using one of these decay functions, $w_{s,n}$ is defined as the similarity between the binary vector of the neighbor session and this real-valued vector of the active session.

STAN [2] is another extension of SKNN that considers three additional components to calculate relevance scores: (1) recency of items in the active session, (2) recency of the neighbor session and (3) position of the candidate item in the neighbor session. They proposed three decay functions to include these three components in SKNN. To focus more on the more recent items in the active session, real values are calculated for items in the active session using the following exponential decay function:

$$s_i = \exp\left(\frac{p(s, i) - l(s)}{\lambda}\right) \tag{3}$$

where $i$ is an item in the active session $s$, $p(s, i)$ is the position of item $i$ in session $s$, $l(s)$ is the position of the last item of session $s$ and $\lambda$ is a hyperparameter (e.g. if $i$ is the third item of a session with length 4, $p(s, i) = 3$ and $l(s) = 4$). The similarity of this real-valued vector $s$ and the binary vector of neighbor session $n$ can be calculated using Eq. 2. To give a higher weight to neighbor sessions that are more recent w.r.t. the active session, they used the following decay function:

$$w_t(s, n) = \exp\left(-\frac{t(s) - t(n)}{\lambda}\right) \tag{4}$$

where $t(s)$ and $t(n)$ are the timestamps of the last event of the active session $s$ and the neighbor session $n$ ($t(s) > t(n)$) and $\lambda$ is a hyperparameter.
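The two decay weights of Eqs. 3 and 4 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, positions are assumed to be 1-based, and timestamps are assumed to be plain numbers in whatever unit the decay constant is expressed in.

```python
import math


def item_position_weight(position, session_length, lam):
    """Eq. 3: weight of the item at 1-based `position` in the active session."""
    return math.exp((position - session_length) / lam)


def neighbor_recency_weight(t_active, t_neighbor, lam):
    """Eq. 4: decay for a neighbor session whose last event is older."""
    return math.exp(-(t_active - t_neighbor) / lam)


def session_item_weights(items, lam):
    """Real-valued session vector of Eq. 3 as a dict: item -> weight.

    Assumption: each item occurs once in the session.
    """
    length = len(items)
    return {
        item: item_position_weight(pos, length, lam)
        for pos, item in enumerate(items, start=1)
    }
```

For the example in the text (third item of a session of length 4), the weight is exp(-1/λ); the last item of a session always receives weight 1, and a neighbor session that ended at the same time as the active session receives a recency weight of 1.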
Finally, to consider the position of the candidate item in the neighbor session, the following decay function is proposed:

$$w_n(i, n) = \exp\left(-\frac{|p(i, n) - p(i^*, n)|}{\lambda}\right) \times I_n(i) \times I_{n \cap s}(i^*) \tag{5}$$

where $i^*$ is the most recent item that occurred in the active session $s$ and that also occurred in the neighbor session $n$, and $\lambda$ is a hyperparameter. The final predicted relevance score in STAN is:

$$\hat{r}_{STAN}(i, s) = \sum_{n \in N_s} w_{s,n} \times w_t(s, n) \times w_n(i, n) \times 1_n(i) \tag{6}$$

VSTAN [22] is the most recent extension of SKNN and has two additional components compared to STAN: (1) the position of the last item in the active session that also exists in the neighbor session (Eq. 7) and (2) an idf weighting scheme (Eq. 8) that gives lower weights to frequent items:

$$w_s(n, s) = \exp\left(\frac{p(i^*, s) - l(s)}{\lambda}\right) \times I_{n \cap s}(i^*) \tag{7}$$

$$idf_i = \log\left(\frac{|S|}{|\{s \in S : i \in s\}|}\right) \times \lambda_{idf} \tag{8}$$

where $w_s(n, s)$ is the position weight of the last item in active session $s$ that also exists in neighbor session $n$, $idf_i$ is the idf weight of candidate item $i$, $S$ is the set of training sessions and $\lambda$ and $\lambda_{idf}$ are hyperparameters. The relevance score using VSTAN is calculated by:

$$\hat{r}_{VSTAN}(i, s) = \sum_{n \in N_s} w_{s,n} \times w_t(s, n) \times w_n(i, n) \times w_s(n, s) \times idf_i \times 1_n(i) \tag{9}$$

3.2 Diversification approaches

To make these neighborhood-based SBRSs (SKNN, VSKNN, STAN and VSTAN) diversity-aware we propose three diversification approaches: (1) diverse neighbor, (2) diverse candidate item and (3) their combination. We need a content representation for news articles and a distance measure (here we use the cosine measure) to apply these approaches. We use article embeddings, which are described in the next section, as the content representation in this study.

Diverse neighbor:

In this approach we assign higher weights to the neighbor sessions that have more content diversity. The diversity of a neighbor session is the average content dissimilarity of pairs of items in the session. Neighbors with higher average content dissimilarity are more likely to yield more diverse recommendations. The diversity of a neighbor session can be calculated as follows:

$$d_n = \frac{\sum_{i \in n} \sum_{j \in n \setminus \{i\}} dist_c(i, j)}{|n| (|n| - 1)} \tag{10}$$
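As a minimal sketch of Eq. 10, the diversity of a neighbor session can be computed from the embeddings of its items. The function names are hypothetical, cosine distance is assumed as the content dissimilarity, and returning 0 for sessions with fewer than two items is an assumption, since Eq. 10 is undefined in that case.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector is treated as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def session_diversity(item_embeddings):
    """Eq. 10: mean distance over all ordered pairs of distinct items."""
    size = len(item_embeddings)
    if size < 2:
        return 0.0  # assumption: Eq. 10 is undefined for a single-item session
    total = sum(
        cosine_distance(item_embeddings[i], item_embeddings[j])
        for i in range(size)
        for j in range(size)
        if i != j
    )
    return total / (size * (size - 1))
```

A session whose items all share the same embedding gets a diversity of 0 and therefore contributes nothing once this weight multiplies the neighbor's similarity.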

Diverse candidate item:

In this approach we assign a higher diversity weight to a candidate item with higher average dissimilarity with the items of the active session. This weight is calculated using the embedding of the candidate item and the embeddings of the articles in the active session based on the following equation:

$$d_{i,s} = \frac{\sum_{j \in s} dist(i, j)}{|s|} \tag{11}$$

where $d_{i,s}$ is the diversity weight of candidate item $i$ and active session $s$, and $dist(i, j)$ is the dissimilarity between the embeddings of items $i$ and $j$. These two proposed diversification approaches can be applied simultaneously to enhance diversity.

To make SKNN diversity-aware we apply these two diversity weights:

$$\hat{r}_{d\text{-}SKNN}(i, s) = d_{i,s} \sum_{n \in N_s} w_{s,n} \times d_n \times 1_n(i) \tag{12}$$

where $d_n$ is the diversity of session $n$ and $d_{i,s}$ is the average content dissimilarity of item $i$ and the items in session $s$. Similar weights ($d_n$ and $d_{i,s}$) are added to VSKNN, STAN and VSTAN to make them diversity-aware.
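The combined score of Eq. 12 can be sketched end to end on toy data. This is an illustration under stated assumptions, not the authors' implementation: sessions are lists of unique item ids, `emb` is a hypothetical dict from item id to embedding, cosine distance is the dissimilarity, and the neighbor sessions are assumed to be already selected.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector is treated as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def binary_cosine_similarity(session_a, session_b):
    """Eq. 2 for binary item vectors: |a & b| / sqrt(|a| * |b|)."""
    a, b = set(session_a), set(session_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def session_diversity(session, emb):
    """Eq. 10: mean distance over ordered pairs of distinct items."""
    size = len(session)
    if size < 2:
        return 0.0  # assumption: undefined case mapped to zero diversity
    total = sum(
        cosine_distance(emb[i], emb[j]) for i in session for j in session if i != j
    )
    return total / (size * (size - 1))


def candidate_diversity(candidate, active_session, emb):
    """Eq. 11: mean distance between the candidate and the session's items."""
    distances = [cosine_distance(emb[candidate], emb[j]) for j in active_session]
    return sum(distances) / len(active_session)


def d_sknn_score(candidate, active_session, neighbors, emb):
    """Eq. 12: d_{i,s} times the diversity-weighted SKNN sum."""
    total = 0.0
    for neighbor in neighbors:
        if candidate in neighbor:  # indicator 1_n(i)
            w = binary_cosine_similarity(active_session, neighbor)
            total += w * session_diversity(neighbor, emb)
    return candidate_diversity(candidate, active_session, emb) * total


# Toy data (hypothetical): "b" is orthogonal to "a", "c" lies between them.
emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
active = ["a"]
neighbors = [["a", "b"], ["a", "c"]]
score_b = d_sknn_score("b", active, neighbors, emb)
score_c = d_sknn_score("c", active, neighbors, emb)
```

Both neighbors are equally similar to the active session, so plain SKNN (Eq. 1) would score "b" and "c" identically; with the two diversity weights, "b" is ranked above "c" because it is more dissimilar to the active session and comes from a more diverse neighbor.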

Table 1: Dataset descriptions

Roularta Kwestie Globo.com Adressa

To compare the results of the proposed approaches with a diversification baseline, we use the MMR re-ranking approach [14]. In this approach multiple performance criteria (here accuracy and diversity) are used to re-rank the items of an initial recommendation list.
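A greedy MMR re-ranker can be sketched as below. The exact MMR variant and trade-off value used in the experiments are not stated here, so this follows the common formulation (relevance minus maximum similarity to the already selected items); `mmr_rerank`, `trade_off` and the default of 0.5 are assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_rerank(scores, emb, k, trade_off=0.5):
    """Greedily pick k items, balancing relevance against redundancy.

    scores: dict item -> relevance from the baseline recommender
    emb:    dict item -> embedding vector
    """
    remaining = dict(scores)
    selected = []
    while remaining and len(selected) < k:

        def mmr_value(item):
            # Redundancy is the highest similarity to any item picked so far.
            redundancy = max(
                (cosine_similarity(emb[item], emb[j]) for j in selected),
                default=0.0,
            )
            return trade_off * remaining[item] - (1.0 - trade_off) * redundancy

        best = max(remaining, key=mmr_value)
        selected.append(best)
        del remaining[best]
    return selected
```

With `trade_off=1.0` the function returns the original relevance ranking; lower values push near-duplicates of already selected items down the list. Because every selection step compares each remaining item with all selected items, the post-processing cost grows with both the list length and the candidate pool.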

We evaluate three diversity boosting approaches, namely, diverse neighbor, diverse candidate item and the combination of both, with the original methods. We use four news datasets, namely Adressa [6], Globo.com [1], Kwestie and Roularta, which are described in Table 1, to evaluate the performance of these approaches. The last two datasets were obtained from Roularta Media Group, a Belgian multimedia group. To calculate content dissimilarity, as explained in the previous section, we should form content representations for news articles. The CNN-based deep neural network approach proposed by [1] is used to generate article embeddings for the news articles of these datasets based on article title, summary, full text and tags.

We compare the performance of the SBRSs explained in the previous section (SKNN, VSKNN, STAN and VSTAN) with the proposed diversification approaches, diverse candidate item (I), diverse neighbor (D) and the combined approach (ID), based on five performance measures, namely precision (P@k), recall (R@k), expected intra-list diversity (ILD@k), rank and relevance sensitive expected intra-list diversity (RR-ILD@k) [28] and covered topics (CT@k). P@k and R@k are standard information retrieval accuracy measures that evaluate the model in predicting the relevant items in the top k recommendations. ILD@k is the average content dissimilarity between pairs of items in the top k recommendations, and RR-ILD@k is another diversity measure that considers the ranks and relevance of the top k recommendations in calculating diversity. CT@k is the average number of unique topics/keywords in the recommendation lists. CT@k is used to evaluate the merits of the diversification approaches in expanding the range of content recommended to the users by increasing the number of news topics in the recommendation lists and consequently addressing the filter bubble issue. A recommendation list with more unique topics is less likely to tighten the filter bubble around the user.

Table 2: The final tuned values for the hyperparameters of the neighborhood-based SBRSs

Method   Hyperparameter   Range        Adressa   Globo   Kwestie   Roularta
SKNN     sample size      [500,3000]   2500      700     500       500
         λ_spw            [0.1,5]      0.80      1.15    5.00      2.55
         λ_snh            [1,5]        1.29      1.00    4.42      1.86
         λ_inh            [0.5,5]      5.00      2.75    5.00      4.68
VSTAN    sample size      [500,3000]   1000      700     2500      500
         λ_spw            [0.1,5]      1.50      1.85    3.95      3.60
         λ_snh            [1,5]        1.57      3.57    2.71      2.14
         λ_inh            [0.5,5]      3.39      2.11    4.68      3.71
         λ_ipw            [0.1,5]      3.95      1.50    3.25      3.25
         λ_idf            [0.1,5]      0.10      4.3     1.50      0.10

To form the train and test sets, we use the approach by [20]. In this approach, the datasets are split into five partitions with the same duration. The sessions in the last day of each partition are considered as test sessions, and the sessions from the other days of the partition as training sessions. In these test sessions the last two items are regarded as the test items. The accuracy measures (P@k and R@k) are calculated based on the ability of the model to predict these test items in the test sessions. The reported performance in the next section is the average performance of the model over these five partitions.

As mentioned in the previous section, there are some hyperparameters to be tuned. We consider the last day of the first training set as the validation set and tune the hyperparameters based on P@10. The final values of the tuned hyperparameters are reported in Table 2.
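The splitting protocol can be sketched as follows. This is an illustration under assumptions rather than the procedure of [20] verbatim: each session is a hypothetical `(timestamp, items)` tuple, the timestamp decides which partition and which day a session belongs to, and the function names are invented for this example.

```python
from datetime import datetime, timedelta


def split_partitions(sessions, n_partitions=5, test_window=timedelta(days=1)):
    """Split sessions into equal-duration partitions; return (train, test) pairs.

    sessions: list of (timestamp, items) tuples with datetime timestamps.
    """
    start = min(t for t, _ in sessions)
    end = max(t for t, _ in sessions)
    span = (end - start) / n_partitions
    splits = []
    for p in range(n_partitions):
        is_last = p == n_partitions - 1
        lo = start + p * span
        hi = end if is_last else start + (p + 1) * span
        # The last partition is closed on the right so no session is dropped.
        part = [s for s in sessions if lo <= s[0] and (is_last or s[0] < hi)]
        cutoff = hi - test_window  # sessions in the final day form the test set
        train = [s for s in part if s[0] < cutoff]
        test = [s for s in part if s[0] >= cutoff]
        splits.append((train, test))
    return splits


def hold_out_last_items(items, n_test=2):
    """Split a test session into the visible prefix and the held-out items."""
    return items[:-n_test], items[-n_test:]
```

For example, `hold_out_last_items(["a", "b", "c", "d"])` gives the visible prefix `["a", "b"]` and the test items `["c", "d"]`; the measures are then computed per partition and averaged over the five partitions.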

The results of applying the proposed approaches for the Adressa dataset w.r.t. P@10, R@10, ILD@10, RR-ILD@10 and CT@10 are reported in Table 3. As shown in Table 3, while SKNN is the simplest form of neighborhood-based SBRSs, it has the best accuracy and diversity among these methods. Among the proposed approaches, the ID approach has the best performance in ILD@10 but it deteriorates the accuracy significantly. In SKNN and VSKNN the re-ranking approach performs better w.r.t. RR-ILD@10 compared to the proposed approaches. On the other hand, in STAN and VSTAN, the D approach has the best RR-ILD@10 and therefore can provide a better trade-off between diversity and accuracy.

The source code is available at https://github.com/alirezagharahi/d_SBRS

Table 3: Results of diversity-aware neighborhood-based SBRSs for Adressa. Bold values highlight the best performing method and italic values indicate improvement w.r.t. the base method.

Methods    P@10     R@10     ILD@10   RR-ILD@10   CT@10
SKNN       0.0788   0.3942   0.1849   0.0198      19.28
SKNN I     0.0654   0.3272
SKNN D     0.0786   0.3929
SKNN ID    0.0631   0.3153
SKNN Re
VSKNN
VSKNN D    0.0768   0.3840
VSKNN ID   0.0523   0.2614
VSKNN Re   0.0760   0.3800
STAN
STAN D     0.0751   0.3755
STAN ID    0.0497   0.2483
STAN Re    0.0778   0.3889
VSTAN
VSTAN D    0.0745   0.3723
VSTAN ID   0.0489   0.2444
VSTAN Re   0.0782   0.3911

For the Globo dataset, the results of applying the proposed approaches are shown in Table 4. According to this table, all the proposed approaches enhance diversity and the number of unique topics in recommendations at the cost of reduced accuracy. Among the neighborhood-based SBRSs, STAN has the best performance in all measures. The ID approach has the best RR-ILD@10 and CT@10 in SKNN and VSKNN, but the I approach has a better RR-ILD@10 in STAN and VSTAN than the other approaches. The re-ranking approach cannot outperform the best performing proposed approach in RR-ILD@10 and is therefore less effective.

Table 5 shows the performance of the proposed methods for the Kwestie dataset. As shown in this table, STAN has the best accuracy, diversity and RR-ILD@10, while VSTAN has the best topic coverage compared to the other baselines. All the proposed diversification approaches can improve diversity, RR-ILD@10 and topic coverage. In SKNN and VSKNN the I approach and the re-ranking approach have almost the same performance in RR-ILD@10. In STAN and VSTAN the I approach can effectively improve diversity while slightly reducing accuracy in all methods and therefore has the best RR-ILD@10 among the diversification approaches.

Table 4: Results of diversity-aware neighborhood-based SBRSs for Globo. Bold values highlight the best performing method and italic values indicate improvement w.r.t. the base method.

Methods    P@10     R@10     ILD@10   RR-ILD@10   CT@10
SKNN
SKNN D     0.0441   0.2206
SKNN ID    0.0408   0.2039
SKNN Re    0.0422   0.2108
VSKNN
VSKNN D    0.0460   0.2302
VSKNN ID   0.0442   0.2210
VSKNN Re   0.0422   0.2110
STAN
STAN D     0.0432   0.2161
STAN ID    0.0413   0.2066
STAN Re    0.0457   0.2286
VSTAN
VSTAN D    0.0408   0.2041
VSTAN ID   0.0378   0.1891
VSTAN Re   0.0448   0.2241

For the Roularta dataset, the results of applying the proposed approaches are reported in Table 6. According to this table, STAN has the best accuracy, diversity and RR-ILD@10, and VSTAN provides recommendations that cover more unique topics compared to the other neighborhood-based baselines. All the proposed approaches improve diversity, RR-ILD@10 and topic coverage. For VSTAN the I approach is the most effective way to diversify the recommendations, while in the other neighborhood-based approaches the ID approach is the most effective among the proposed diversification approaches.

To summarize, the recommendation lists have more diversity when more diverse neighbors are selected and when candidate items that are more dissimilar to the corresponding active sessions are recommended. In the neighborhood-based SBRSs, the nearest neighbors convey the collaborative information and, according to the results, using more diverse collaborative information gives a better trade-off between diversity and accuracy, i.e., it always has a higher RR-ILD@10 than the original methods. The diverse candidate item approach is based on content-based information, and in all methods and datasets it can bring more diverse recommendations. In almost all cases, at least one of the proposed diversification approaches can enhance RR-ILD@10 compared to the original neighborhood-based methods and can also outperform the re-ranking approach. The choice of the best and most effective approach depends on the SBRS method and the dataset. This is in line with the results of [20,22], where the choice of the best performing SBRS depends on the dataset.

Table 5: Results of diversity-aware neighborhood-based SBRSs for Kwestie. Bold values highlight the best performing method and italic values indicate improvement w.r.t. the base method.

Methods    P@10     R@10     ILD@10   RR-ILD@10   CT@10
SKNN
SKNN D     0.0309   0.1543
SKNN ID    0.0195   0.0976
SKNN Re    0.0329   0.1643
VSKNN
VSKNN D    0.0312   0.1561
VSKNN ID   0.0214   0.1070
VSKNN Re   0.0321   0.1605
STAN
STAN D     0.0355   0.1776
STAN ID    0.0270   0.1352
STAN Re    0.0406   0.2029
VSTAN
VSTAN D    0.0327   0.1635
VSTAN ID   0.0219   0.1094
VSTAN Re   0.0370   0.1849

To investigate how the proposed approaches address the filter bubble phenomenon, the recommendation lists generated by the diversity-aware approaches and the original models are compared w.r.t. the average number of unique topics (CT@10) that they recommend to the users. All the proposed approaches enhance CT@10 in all datasets, which indicates that the diversification approaches broaden the user reading experience and offer a wider range of content by recommending articles from more diverse topics and therefore mitigate the filter bubble issue. As shown in Table 7, in all datasets the Pearson correlations between diversity (ILD@10) and CT@10 of the methods are strong and statistically significant (p-values are less than 0.00001). Therefore, diversification can widen the filter bubbles around the users by recommending news from more diverse topics.
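The correlation analysis can be reproduced in outline with a plain Pearson coefficient over per-method (ILD@10, CT@10) pairs. The sketch below uses a hypothetical function name and made-up toy numbers, not values from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-method values, for illustration only.
ild_values = [0.10, 0.15, 0.20, 0.30]
ct_values = [4.0, 5.5, 6.0, 9.0]
r = pearson_r(ild_values, ct_values)
```

To also obtain a p-value, `scipy.stats.pearsonr` returns both the coefficient and the significance level, assuming SciPy is available.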

Table 6: Results of diversity-aware neighborhood-based SBRSs for Roularta. Bold values highlight the best performing method and italic values indicate improvement w.r.t. the base method.

Methods    P@10     R@10     ILD@10   RR-ILD@10   CT@10
SKNN
SKNN D     0.0348   0.1742
SKNN ID    0.0244   0.1220
SKNN Re    0.0344   0.1720
VSKNN
VSKNN D    0.0298   0.1491
VSKNN ID   0.0257   0.1286
VSKNN Re   0.0333   0.1667   0.
STAN
STAN D     0.0445   0.2225
STAN ID    0.0394   0.1972
STAN Re    0.0459   0.2297
VSTAN      0.0408   0.2041   0.1653   0.0076      4.46
VSTAN I    0.0335   0.1674
VSTAN D    0.0385   0.1926
VSTAN ID   0.0281   0.1403
VSTAN Re

Table 7: Pearson correlation between diversity (ILD@10) and topic coverage (CT@10)

          Roularta    Kwestie     Globo.com   Adressa
R         0.9594      0.9541      0.9251      0.8411
P-value   < 0.00001   < 0.00001   < 0.00001   < 0.00001

The main contribution of this study is to make neighborhood-based session-based recommender systems (SBRSs) diversity-aware. In news aggregator websites, focusing only on the predictive performance of the recommender can tighten the filter bubbles around the users and can intensify polarization and fragmentation among them. Diversification is a way to address these issues in news recommenders. We proposed scenarios to diversify the recommendation lists generated by these SBRSs. According to the results, all the scenarios improve diversity in all news datasets. The choice of the most effective approach depends on the method and the dataset. In all cases we can find a trade-off between diversity and accuracy. Moreover, diversification addresses the filter bubble phenomenon by increasing the number of unique news topics in the recommendation lists.

As future work, we will assess the possibility of enhancing the diversity of model-based SBRSs such as GRU4REC [8] and CHAMELEON [1]. In the loss functions of these model-based methods, regularization terms that penalize similar content should be applied. Moreover, we will apply the proposed scenarios to other domains such as music and e-commerce recommenders. In these domains there are other types of metadata, such as lyrics, genres, artists, item descriptions or a hierarchy of item categories, that should be used to diversify recommendations. Finally, we will investigate how the performance of neighborhood-based SBRSs can be enhanced w.r.t. other performance criteria such as fairness [5] and serendipity.

Acknowledgements

This work was executed within the imec.icon project NewsButler, a research project bringing together academic researchers (KU Leuven, VUB) and industry partners (Roularta Media Group, Bothrs and ML6). The NewsButler project is co-financed by imec and receives project support from Flanders Innovation & Entrepreneurship (project nr. HBC.2017.0628). The authors also acknowledge the Flemish Government (AI Research Program).

Conflict of interest

The authors declare that they have no conflict of interest.

References

1. Gabriel De Souza, P.M., Jannach, D., Da Cunha, A.M.: Contextual hybrid session-based news recommendation with recurrent neural networks. IEEE Access, 169185–169203 (2019)
2. Garg, D., Gupta, P., Malhotra, P., Vig, L., Shroff, G.: Sequence and time aware neighborhood for session-based recommendations: Stan. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1069–1072 (2019)
3. Gharahighehi, A., Vens, C.: Extended bayesian personalized ranking based on consumption behavior. In: Postproceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019) and the 28th Belgian Dutch Conference on Machine Learning (Benelearn 2019). Springer (2020)
4. Gharahighehi, A., Vens, C.: Making session-based news recommenders diversity-aware. In: OHARS'20: Workshop on Online Misinformation- and Harm-Aware Recommender Systems, p. (to appear) (2020)
5. Gharahighehi, A., Vens, C., Pliakos, K.: Fair multi-stakeholder news recommender system with hypergraph ranking (2020)
6. Gulla, J.A., Zhang, L., Liu, P., Özgöbek, Ö., Su, X.: The adressa dataset for news recommendation. In: Proceedings of the international conference on web intelligence, pp. 1042–1048 (2017)
7. Helberger, N.: On the democratic role of news recommenders. Digital Journalism (8), 993–1012 (2019)
8. Hidasi, B., Karatzoglou, A.: Recurrent neural networks with top-k gains for session-based recommendations. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 843–852 (2018)
9. Hidasi, B., Karatzoglou, A., Baltrunas, L., Tikk, D.: Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)
10. Hidasi, B., Quadrana, M., Karatzoglou, A., Tikk, D.: Parallel recurrent neural network architectures for feature-rich session-based recommendations. In: Proceedings of the 10th ACM conference on recommender systems, pp. 241–248. ACM (2016)
11. Jambor, T., Wang, J.: Optimizing multiple objectives in collaborative filtering. In: Proceedings of the fourth ACM conference on Recommender systems, pp. 55–62 (2010)
12. Jannach, D., Ludewig, M.: When recurrent neural networks meet the neighborhood for session-based recommendation. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 306–310 (2017)
13. Jugovac, M., Jannach, D., Karimi, M.: Streamingrec: a framework for benchmarking stream-based news recommenders. In: Proceedings of the 12th ACM Conference on Recommender Systems, pp. 269–273 (2018)
14. Kaminskas, M., Bridge, D.: Diversity, serendipity, novelty, and coverage: a survey and empirical analysis of beyond-accuracy objectives in recommender systems. ACM Transactions on Interactive Intelligent Systems (TiiS) (1), 1–42 (2016)
15. Karimi, M., Jannach, D., Jugovac, M.: News recommender systems–survey and roads ahead. Information Processing & Management (6), 1203–1227 (2018)
16. Kelly, J.P., Bridge, D.: Enhancing the diversity of conversational collaborative recommendations: a comparison. Artificial Intelligence Review (1-2), 79–95 (2006)
17. Kouki, P., Fountalis, I., Vasiloglou, N., Cui, X., Liberty, E., Al Jadda, K.: From the lab to production: A case study of session-based recommendations in the home-improvement domain. In: Fourteenth ACM Conference on Recommender Systems, pp. 140–149 (2020)
18. Li, J., Ren, P., Chen, Z., Ren, Z., Lian, T., Ma, J.: Neural attentive session-based recommendation. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1419–1428 (2017)
19. Linden, G., Smith, B., York, J.: Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet computing (1), 76–80 (2003)
20. Ludewig, M., Jannach, D.: Evaluation of session-based recommendation algorithms. User Modeling and User-Adapted Interaction 28