FeedRec: News Feed Recommendation with Various User Feedbacks
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
Tsinghua University, Beijing 100084, China
Microsoft Research Asia, Beijing 100080, China
{wuchuhan15,wufangzhao,taoqi.qt}@gmail.com, [email protected]
ABSTRACT
Personalized news recommendation techniques are widely adopted by many online news feed platforms to target user interests. Learning accurate user interest models is important for news recommendation. Most existing methods for news recommendation rely on implicit feedbacks like click behaviors for inferring user interests and model training. However, click behaviors are implicit feedbacks and usually contain heavy noise. In addition, they cannot help infer complicated user interest such as dislike. Besides, the feed recommendation models trained solely on click behaviors cannot optimize other objectives such as user engagement. In this paper, we present a news feed recommendation method that can exploit various kinds of user feedbacks to enhance both user interest modeling and recommendation model training. In our method we propose a unified user modeling framework to incorporate various explicit and implicit user feedbacks to infer both positive and negative user interests. In addition, we propose a strong-to-weak attention network that uses the representations of stronger feedbacks to distill positive and negative user interests from implicit weak feedbacks for accurate user interest modeling. Besides, we propose a multi-feedback model training framework by jointly training the model in the click, finish and dwell time prediction tasks to learn an engagement-aware feed recommendation model. Extensive experiments on a real-world dataset show that our approach can effectively improve the model performance in terms of both news clicks and user engagement.
KEYWORDS
News recommendation, News feed, User feedback
ACM Reference Format:
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang. 2021. FeedRec: News Feed Recommendation with Various User Feedbacks. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021), Jennifer B. Sartor, Theo D'Hondt, and Wolfgang De Meuter (Eds.). ACM, New York, NY, USA, Article 4, 9 pages. https://doi.org/10.475/123_4
INTRODUCTION

In recent years, online news feed services have gained huge popularity for users to obtain news information from never-ending feeds on their personal devices [13]. However, the huge volume of news articles streaming every day will overwhelm users [1]. Thus,
KDD 2021, August 2021, Singapore. © 2021 Copyright held by the owner/author(s). ACM ISBN 123-4567-24-567/08/06. https://doi.org/10.475/123_4
[Figure 1 appears here, showing a news feed with example articles. Panel (a) shows implicit weak feedbacks (click/skip) and explicit feedbacks (share/dislike); panel (b) shows implicit strong feedbacks (finish/quick close).]
Figure 1: An example of various user feedbacks on a news feed platform.

personalized news recommendation is important for news feed services to alleviate information overload and improve the reading experience of users [11, 14, 28].

Most existing news recommendation methods rely on click behaviors of users to infer their interests and train the recommendation model [14, 19, 20, 22, 23, 25]. For example, Okura et al. [14] proposed to use a gated recurrent unit (GRU) network to learn user representations from historical clicked news. Wang et al. [20] proposed to use a candidate-aware attention network to measure the relevance between clicked news and candidate news when learning user representations. Wu et al. [25] proposed to use a combination of multi-head self-attention and additive attention networks to learn user representations from clicked news. All these methods are trained by predicting future news clicks based on the user interests inferred from historical clicked news. However, click behaviors are implicit feedbacks and usually contain heavy noise [21, 31]. For example, users may click a news due to the attraction of a news title but close it quickly if the user is disappointed at the news content [27]. In addition, many user interests such as like and dislike cannot be indicated by the implicit click feedbacks, which are actually very important for improving the engagement of users on the news platform. Thus, it is insufficient to model user interests and train the recommendation model only based on news clicks.

Fortunately, on news feed platforms there are usually various kinds of user feedbacks. An example is shown in Fig. 1. Besides the weak implicit feedbacks such as click and skip, there are also explicit feedbacks like share and dislike (Fig. 1(a)) and strong implicit feedbacks like finishing the news article and closing the news webpage quickly after click (Fig. 1(b)).
These feedbacks can provide more comprehensive information for inferring user interests [17]. However, it is non-trivial to incorporate the various feedbacks into news feed recommendation due to several challenges. First, implicit feedbacks are usually very noisy. Thus, it is important to distill real positive and negative user interests from the noisy implicit feedbacks. Second, different feedbacks have very different characteristics, e.g., the intensity of user interests they reflect. Thus, the model needs to take their differences into consideration. Third, the feedbacks of a user may have some inherent relatedness. For example, a user may quickly close the webpage of a clicked news and then push the dislike button. Thus, it is important to model the relatedness between feedbacks for better modeling user interests.

In this paper, we present a news feed recommendation approach named FeedRec that can incorporate various user feedbacks into both user modeling and recommendation model training. In our method, we propose a unified framework to incorporate various explicit and implicit feedbacks of users, including click, skip, share, dislike, finish, and quick close, to infer both positive and negative interests of users. We use a heterogeneous Transformer to capture the relatedness among different kinds of feedbacks, and use several homogeneous Transformers to capture the relations among the same kind of feedbacks. In addition, we propose a strong-to-weak attention network that uses the representations of stronger feedbacks to distill accurate positive and negative interests from implicit weak feedbacks. Besides, we propose a multi-feedback model training framework that jointly trains the model using click prediction, finish prediction and dwell time prediction tasks to learn an engagement-aware feed recommendation model.
Extensive experiments on a real-world dataset validate that our approach can not only gain more news clicks but also effectively improve user engagement.

The contributions of this paper are summarized as follows:
• We propose a unified user modeling framework which can incorporate various explicit and implicit feedbacks to infer both positive and negative user interests.
• We propose a strong-to-weak attention network to distill accurate positive and negative user interests from implicit feedbacks with the guidance of strong feedbacks.
• We propose a multi-feedback model training framework by jointly training the model in click, finish and dwell time prediction tasks to learn engagement-aware feed recommendation models.
• We conduct extensive experiments on a real-world dataset to verify the effectiveness of our approach in news feed recommendation.
RELATED WORK

User modeling is critical for personalized news recommendation [10]. Most existing news recommendation approaches model user interests based on historical clicked news [2, 4–6, 8, 12, 14, 15, 19, 20, 22–25, 32, 33]. For example, Okura et al. [14] proposed an embedding-based news recommendation method that uses a GRU network to capture user interests from the representations of clicked news. Wang et al. [20] proposed to use a candidate-aware attention network to learn user representations from clicked news based on their relevance to candidate news. Wu et al. [23] proposed a news recommendation method with a personalized attention network that selects informative clicked news for user modeling according to the embeddings of user IDs. Wu et al. [25] proposed to use a multi-head self-attention mechanism to capture the relations between clicked news and use additive attention to select informative news for user modeling. Wang et al. [19] proposed to use a hierarchical dilated convolution neural network to learn multi-grained features of clicked news for representing users. These methods only consider the click behaviors of users. However, click behaviors are usually very noisy for inferring user interests because users may not click news only due to their interests. In addition, click behaviors cannot reflect many other kinds of user interests such as like or dislike. Thus, it is insufficient to accurately and comprehensively model user interests with click feedbacks only.

There are only a few news recommendation methods that consider user feedbacks beyond clicks in user modeling [3, 27, 29, 31]. For example, Gershman et al. [3] proposed to represent users by the news they carefully read, rejected, and scrolled. Yi et al. [31] proposed to use the dwell time of news reading as the weights of clicked news for user modeling. Wu et al. [27] proposed a user modeling method based on click preference and reading satisfaction, which uses news clicks and the reading satisfaction derived from dwell time and news content length to model users. Xie et al. [29] proposed to model users' interests by their click, non-click and dislike feedbacks. They used click- and dislike-based user representations to distill positive and negative user interests from non-clicks, respectively. However, these methods mainly rely on clicked news to model the positive interests of users, which may not be accurate enough due to the heavy noise in click behaviors. Different from them, our approach can incorporate the various feedbacks of users into user modeling to distill both positive and negative interests, which can capture user interests more comprehensively and accurately. In addition, our approach jointly trains the model in various tasks including click prediction, finish prediction and dwell time prediction, which can learn an engagement-aware feed recommendation model.
In this section, we introduce the details of our FeedRec approach for news feed recommendation. We first introduce its user modeling framework, then describe the model architecture for news modeling, and finally introduce our multi-feedback model training method.
The user modeling framework of our FeedRec approach is shown in Fig. 2. It aims to accurately infer the user preferences for subsequent news feed recommendation by distilling positive and negative user interests from both the explicit and implicit feedbacks it incorporates. In our approach, we consider six kinds of user feedbacks in total, including click, skip, share, dislike, finish and quick close. As shown in Fig. 1(a), the click feedback is obtained from users' click behaviors on the displayed news articles, which is a commonly used implicit positive feedback for user modeling. Users can also skip some news without click, such as the third news in Fig. 1(a), which is regarded as an implicit negative feedback. In addition, along with each displayed news, there are buttons for users to provide explicit feedbacks such as share and dislike. For example, the user shares the second news in Fig. 1(a) while reporting a dislike of the fourth news. Besides, there are also implicit feedbacks stronger than click and skip. For example, as shown in Fig. 1(b), after clicking a news, a user may finish reading this news (including watching the embedded video), which usually indicates a positive interest. However, the user may also take a quick read of only a few seconds after click and then close the news webpage, which is an indication of unsatisfaction. We use the news reading behaviors with dwell time shorter than T seconds to construct this kind of feedback.

[Figure 2 appears here.]

Figure 2: The user modeling framework of our FeedRec approach.

Next, we introduce the architecture of our user modeling framework. We first use a shared news encoder to obtain the embedding of each feedback and its associated news article. We denote the feedback sequence as [D_1, D_2, ..., D_N], where N is the sequence length. It is converted into a feedback embedding sequence, which is denoted as E = [e_1, e_2, ..., e_N].

Next, we apply a heterogeneous feedback Transformer [18] to the feedback embedding sequence to capture the relations between different feedbacks. The feedbacks from the same user may have some inherent relatedness [29]. For example, the finish and quick close feedbacks usually appear after clicks. In addition, some skips may also have correlations to the previous clicks because a user may only choose to read a few news on similar topics [9]. For example, in Fig. 1(a) the user clicks and shares the second news while skipping the third news, which may be because both of them are about the same football team. Thus, we use a heterogeneous feedback Transformer to capture the relations among various kinds of feedbacks in a feedback sequence.
It receives the feedback embedding sequence E as the input, and outputs a hidden feedback representation sequence H = [h_1, h_2, ..., h_N]. To help the subsequent user modeling process that separately models different kinds of feedbacks, we group the hidden feedback representations by their types. We denote the representation sequences of share, finish, click, skip, quick close and dislike feedbacks respectively as H^s = [h^s_1, h^s_2, ..., h^s_{N_s}], H^f = [h^f_1, h^f_2, ..., h^f_{N_f}], H^c = [h^c_1, h^c_2, ..., h^c_{N_c}], H^n = [h^n_1, h^n_2, ..., h^n_{N_n}], H^q = [h^q_1, h^q_2, ..., h^q_{N_q}] and H^d = [h^d_1, h^d_2, ..., h^d_{N_d}], where N_s, N_f, N_c, N_n, N_q and N_d represent the numbers of the corresponding feedbacks.

Following is a homogeneous feedback Transformer, which is applied to each kind of feedbacks to learn feedback-specific representations. Different kinds of feedbacks usually have very different characteristics. For example, click and skip feedbacks are usually abundant but noisy, while share and dislike feedbacks are strong but sparse. Thus, they may need to be handled differently. In addition, the relations between the same kind of feedbacks are also important for user interest modeling [29]. For example, researchers have found that modeling the interactions between clicked news can help better infer user interests [25]. Since the heterogeneous Transformer may not focus on capturing the relatedness between the same kind of feedback, we apply independent Transformers to each kind of feedbacks to learn feedback-specific representations for them and meanwhile capture the relations among the same kind of feedbacks. We denote the feedback-specific representation sequences of share, finish, click, skip, quick close and dislike as R^s = [r^s_1, r^s_2, ..., r^s_{N_s}], R^f = [r^f_1, r^f_2, ..., r^f_{N_f}], R^c = [r^c_1, r^c_2, ..., r^c_{N_c}], R^n = [r^n_1, r^n_2, ..., r^n_{N_n}], R^q = [r^q_1, r^q_2, ..., r^q_{N_q}] and R^d = [r^d_1, r^d_2, ...
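The type-grouping step above can be sketched as follows (a minimal NumPy illustration; the `group_by_type` helper, array shapes, and type labels are our own illustrative names, not from the paper):

```python
import numpy as np

def group_by_type(H, types):
    """Split a hidden feedback representation sequence H (N x d) into
    per-type sequences, e.g. H^s, H^f, H^c, ..., preserving order."""
    groups = {}
    for h, t in zip(H, types):
        groups.setdefault(t, []).append(h)
    return {t: np.stack(v) for t, v in groups.items()}

# Toy example: 5 feedbacks with d = 4 dimensional hidden states.
H = np.random.rand(5, 4)
types = ["click", "skip", "click", "share", "finish"]
grouped = group_by_type(H, types)
# grouped["click"] stacks the two click representations into a (2, 4) array.
```

In the model itself, each per-type sequence would then be fed to its own homogeneous Transformer; the grouping itself is just bookkeeping.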
r^d_{N_d}], respectively.

Based on the representation sequences of each kind of feedbacks, we then propose a strong-to-weak attention network to distill accurate positive and negative interests from implicit weak feedbacks (e.g., clicks) based on their relevance to stronger feedbacks (e.g., share and finish). Since explicit feedbacks like share and dislike are usually reliable, we can directly regard them as pure positive and negative feedbacks, respectively. We apply two separate attention networks [30] to them to learn an explicit positive feedback representation u^p_e and an explicit negative feedback representation u^n_e, which are formulated as follows:

α^p_k = exp(q_s · r^s_k) / Σ_{j=1}^{N_s} exp(q_s · r^s_j),   u^p_e = Σ_{k=1}^{N_s} α^p_k r^s_k,   (1)

α^n_k = exp(q_d · r^d_k) / Σ_{j=1}^{N_d} exp(q_d · r^d_j),   u^n_e = Σ_{k=1}^{N_d} α^n_k r^d_k,   (2)

where q_s and q_d are attention query vectors. Next, we use the explicit positive feedback representation u^p_e to select informative finish feedbacks and build a representation u^p_i of implicit strong positive feedback, which is formulated as follows:

β^p_k = exp(u^p_e · r^f_k) / Σ_{j=1}^{N_f} exp(u^p_e · r^f_j),   u^p_i = Σ_{k=1}^{N_f} β^p_k r^f_k.   (3)

The implicit strong negative feedback representation u^n_i is computed in a similar way from the representations of quick close feedbacks, which is formulated as follows:

β^n_k = exp(u^n_e · r^q_k) / Σ_{j=1}^{N_q} exp(u^n_e · r^q_j),   u^n_i = Σ_{k=1}^{N_q} β^n_k r^q_k.   (4)

Click and skip feedbacks are usually noisy for inferring positive and negative interests [27, 29]. This is because clicks do not necessarily mean like or satisfaction, and those seen but skipped news may also be relevant to user interests. Thus, we need to distill the real positive and negative user interests from them. To address this problem, we select click and skip feedbacks based on their relevance to strong feedbacks for learning positive and negative user interest representations.
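The dot-product attention pooling shared by Eqs. (1)–(4) can be sketched as follows (a minimal NumPy version; the variable names and random toy inputs are ours, not the paper's implementation):

```python
import numpy as np

def attention_pool(query, R):
    """Pool a representation sequence R (N x d) into one vector using a
    query (d,): alpha_k = softmax_k(query . r_k), output = sum_k alpha_k r_k."""
    scores = R @ query                     # (N,)
    alpha = np.exp(scores - scores.max())  # numerically stabilized softmax
    alpha = alpha / alpha.sum()
    return alpha @ R                       # (d,)

# Strong-to-weak distillation: the explicit positive representation u_pe
# (pooled from share feedbacks with a query q_s) is itself used as the
# query over finish feedbacks, as in Eq. (3).
R_s = np.random.rand(3, 8)        # share feedback representations
q_s = np.random.rand(8)           # attention query vector (assumed learnable)
u_pe = attention_pool(q_s, R_s)   # Eq. (1)
R_f = np.random.rand(5, 8)        # finish feedback representations
u_pi = attention_pool(u_pe, R_f)  # Eq. (3)
```

With a zero query the softmax is uniform, so the output is simply the mean of the sequence, which is a convenient sanity check.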
We use the summation of u^p_e and u^p_i as the attention query for distilling the click-based and skip-based weak positive interests (denoted as u^p_c and u^p_n), which are computed as follows:

γ^p_k = exp[(u^p_e + u^p_i) · r^c_k] / Σ_{j=1}^{N_c} exp[(u^p_e + u^p_i) · r^c_j],   u^p_c = Σ_{k=1}^{N_c} γ^p_k r^c_k,   (5)

γ^n_k = exp[(u^p_e + u^p_i) · r^n_k] / Σ_{j=1}^{N_n} exp[(u^p_e + u^p_i) · r^n_j],   u^p_n = Σ_{k=1}^{N_n} γ^n_k r^n_k.   (6)

The click- and skip-based weak negative feedbacks (denoted as u^n_c and u^n_n) are computed similarly by using u^n_e + u^n_i as the attention query. In this way, we can distill accurate positive and negative user interest information from the noisy feedbacks.

The last step is feedback aggregation. It aims to aggregate different kinds of feedbacks into summarized representations by considering their different importance and functions. We first aggregate the explicit positive feedback u^p_e and implicit strong positive feedback u^p_i into a unified strong positive feedback representation s^p, which is formulated as follows:

δ^p = σ(v_p · [u^p_e; u^p_i]),   s^p = δ^p u^p_e + (1 − δ^p) u^p_i,   (7)

where σ is the sigmoid function and v_p is a learnable vector. In a similar way, we aggregate the explicit negative feedback u^n_e and implicit strong negative feedback u^n_i into a unified strong negative feedback representation s^n as follows:

δ^n = σ(v_n · [u^n_e; u^n_i]),   s^n = δ^n u^n_e + (1 − δ^n) u^n_i,   (8)

where v_n is a learnable vector. Similarly, we aggregate the click-based and skip-based positive feedbacks (u^p_c and u^p_n) into a weak positive feedback representation w^p, and aggregate u^n_c and u^n_n into a weak negative feedback representation w^n. We finally aggregate the four kinds of feedbacks, i.e., s^p, w^p, w^n and s^n, into a unified user embedding u, which is formulated as follows:

u = λ_{sp} s^p + λ_{wp} w^p + λ_{sn} s^n + λ_{wn} w^n,   (9)

where λ_{sp}, λ_{wp}, λ_{sn} and λ_{wn} are learnable scalar parameters.

Next, we briefly introduce the details of the news encoder in our approach. The architecture of the news encoder is shown in Fig. 3.

[Figure 3 appears here.]

Figure 3: The architecture of the news encoder.

For each feedback on news, we compute five kinds of embeddings. The first one is the text embedding, which is computed from the news title through a Transformer [18] network to capture the semantic information of news. The second one is the position embedding, which aims to encode the positional information of the feedback. The third one is the feedback embedding, which encodes the type of feedback to help better distinguish different kinds of feedbacks. The fourth one is the dwell time embedding, which aims to encode user engagement information. We use a quantization function t̃ = ⌊log₂(t + 1)⌋ to convert the real-valued dwell time t into a discrete value t̃ for building the embedding table. The last one is the time interval embedding, which aims to better capture the relatedness between adjacent feedbacks. We use the same quantization function to convert the time interval between the current and previous feedbacks into a discrete variable for embedding. The five kinds of embeddings are added together to form a unified news embedding for subsequent user modeling and model training. This embedding is deactivated when encoding candidate news.
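The dwell-time quantization used for the embedding table can be sketched as follows (a minimal Python version; we assume a base-2 logarithm in the reconstructed formula t̃ = ⌊log₂(t + 1)⌋, since the base was lost in the source):

```python
import math

def quantize_dwell_time(t_seconds):
    """Bucket a real-valued dwell time t (in seconds) into a discrete
    embedding index via t_tilde = floor(log2(t + 1)); the buckets grow
    exponentially, so long reads share coarse buckets."""
    return math.floor(math.log2(t_seconds + 1))

# 5- and 6-second reads fall into the same bucket (index 2), while
# 10 seconds (index 3) and 100 seconds (index 6) are clearly separated.
```

The logarithmic bucketing keeps the embedding table small while preserving resolution for the short dwell times that matter most for detecting quick closes.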
[Figure 4 appears here.]

Figure 4: The multi-feedback model training framework of FeedRec.

In this section, we introduce the multi-feedback training framework in our approach. Existing news recommendation methods mainly rely on the click signals to train the recommendation model. However, there are usually some gaps between news clicks and user engagement or satisfaction, because users may leave the news page quickly if they are not satisfied with the quality of news content. Thus, we propose to jointly train the model in three tasks, including click prediction, finish prediction and dwell time prediction, to encode both click and user engagement information. The model training framework is shown in Fig. 4. We use the user encoder to learn a user embedding u from the feedback sequence and use the news encoder to encode the candidate news into its embedding e. We denote the predicted click, finish and dwell time scores of this pair of user and candidate news as ŷ, ẑ and t̂ respectively, which are computed as follows:

ŷ = u · e,   ẑ = u · (W_z e),   t̂ = max[0, u · (W_t e)],   (10)

where W_z and W_t are learnable parameters.

Following [25], we also use negative sampling techniques to construct training samples. For each clicked news, we sample K skipped news displayed on the same page, and jointly predict the three kinds of scores for these K + 1 news. The loss functions are formulated as follows:

L_R = −log[exp(ŷ⁺) / (exp(ŷ⁺) + Σ_{i=1}^{K} exp(ŷ⁻_i))],

L_F = −z⁺ log[σ(ẑ⁺)] − (1 − z⁺) log[1 − σ(ẑ⁺)],

L_T = |t⁺ − t̂⁺|,   (11)

where ŷ⁺ and ŷ⁻_i are the predicted click scores for a clicked news and its associated i-th skipped news, and ẑ⁺, z⁺, t̂⁺ and t⁺ stand for the predicted finish score, real finish label, predicted dwell time, and real dwell time of a clicked news, respectively. We use the log function to transform the raw dwell time and then normalize it.
Besides, since we expect the weak positive feedback to be different from the weak negative feedback, we propose a positive-negative disentangling loss L_D to help distill more accurate positive and negative user interests by regularizing w^p and w^n as follows:

L_D = (w^p · w^n) / (||w^p|| × ||w^n||),   (12)

where || · || denotes the L2-norm. The final unified loss function L is a weighted summation of the four loss functions, which is formulated as follows:

L = L_R + α L_F + β L_T + γ L_D,   (13)

where α, β and γ are loss coefficients that control the relative importance of the corresponding loss functions.

In our experiments, since there is no off-the-shelf dataset for news recommendation that contains multiple kinds of user feedbacks, we constructed one by ourselves from the news feed platform of Microsoft News. The dataset contains the behavior logs of 10,000 users in about one month, i.e., from Sep. 1st, 2020 to Oct. 2nd, 2020. The logs in the last week were used for test, and the rest were used for training and validation (with the logs on the last day used for validation). The statistics of this dataset are shown in Table 1. We can see that explicit feedbacks like share and dislike are relatively sparse, while implicit feedbacks are much richer. The distributions of the number of each kind of feedback provided by a user are shown in Fig. 5. We can find that the number of skip feedbacks is approximately log-normal, while the numbers of other kinds of feedbacks obey long-tail distributions. Since skip feedbacks are dominant in our dataset, we randomly sample only 10% of skips to reduce the length of the input sequence. We also show the distribution of dwell time in our dataset in Fig. 6. An interesting phenomenon is that the distribution has two peaks, one of which appears approximately between 0 and 10 seconds.
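The disentangling regularizer in Eq. (12) is the cosine similarity between the weak positive and weak negative representations; minimizing it pushes the two apart. A minimal sketch:

```python
import numpy as np

def disentangling_loss(w_p, w_n):
    """L_D (Eq. 12): cosine similarity between the weak positive and weak
    negative feedback representations."""
    return (w_p @ w_n) / (np.linalg.norm(w_p) * np.linalg.norm(w_n))

# Orthogonal representations incur zero loss; perfectly aligned ones
# incur the maximum loss of 1.
```

Note that this term can go negative for anti-aligned representations, so it actively rewards pointing w^p and w^n in opposite directions rather than merely decorrelating them.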
This may be because users are sometimes disappointed at the news content and quickly close the webpage. Thus, we accordingly set the dwell time threshold T to 10 seconds to construct the quick close feedbacks, and we will discuss the influence of T in the hyperparameter analysis section.

Table 1: Detailed statistics of the dataset.

[Figure 5 appears here.]

Figure 5: Distributions of different kinds of feedbacks (skip, click, quick close, finish, share, dislike).

[Figure 6 appears here.]

Figure 6: Dwell time distribution of the dataset.

on the validation sets. We used AUC, MRR, nDCG@5 and HR@5 to measure the click-based model performance. In addition, we used several metrics to measure the model performance in terms of user engagement. We used the ratio of the share/dislike frequency of the top 5 ranked news to the overall share/dislike frequency in the dataset to measure share- and dislike-based performance, and we also reported the average finishing ratio of the top 5 ranked news and their average dwell time if clicked. We independently repeated each experiment 5 times and reported the average results.
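As a reference for the ranking metrics above, nDCG@5 over binary click labels can be computed as follows (a standard textbook formulation, not code from the paper):

```python
import numpy as np

def ndcg_at_k(relevance, k=5):
    """nDCG@k for a ranked list of binary relevance labels (1 = clicked).
    DCG discounts each hit by 1/log2(rank + 1); nDCG divides by the DCG
    of the ideal (perfectly sorted) ranking."""
    rel = np.asarray(relevance, dtype=float)
    top = rel[:k]
    discounts = 1.0 / np.log2(np.arange(2, top.size + 2))
    dcg = (top * discounts).sum()
    ideal = np.sort(rel)[::-1][:k]
    idcg = (ideal * discounts[:ideal.size]).sum()
    return dcg / idcg if idcg > 0 else 0.0
```

A single clicked item ranked first gives the perfect score of 1.0; pushing it to rank 2 discounts the score by a factor of 1/log₂(3).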
In this section, we compare the performance of our FeedRec approach with many baseline methods, including: (1) EBNR [14], an embedding-based news recommendation method with a GRU network; (2) DKN [20], a deep knowledge-aware network for news recommendation; (3) NPA [23], a neural news recommendation method with personalized attention; (4) NAML [22], a neural news recommendation method with attentive multi-view learning; (5) LSTUR [1], a news recommendation method that models long- and short-term user interests; (6) NRMS [25], which uses multi-head self-attention for news and user modeling; (7) FIM [19], a fine-grained interest matching approach for news recommendation; (8) DFN [29], a deep feedback network for feed recommendation; (9) CPRS [27], a news recommendation approach with click preference and reading satisfaction. The click-based and user-engagement performance of different methods are shown in Tables 2 and 3, respectively.

Table 2: Performance comparison in terms of news clicks.

Methods      AUC     MRR     nDCG@5  HR@5
EBNR [14]    0.6112  0.2622  0.2790  0.1062
DKN [20]     0.6076  0.2591  0.2768  0.1045
NPA [23]     0.6210  0.2685  0.2882  0.1095
NAML [22]    0.6192  0.2670  0.2871  0.1089
LSTUR [1]    0.6224  0.2701  0.2896  0.1099
NRMS [25]    0.6231  0.2707  0.2904  0.1103
FIM [19]     0.6250  0.2729  0.2925  0.1114
DFN [29]     0.6296  0.2748  0.2948  0.1140
CPRS [27]    0.6334  0.2781  0.2972  0.1156
FeedRec      —       —       —       —

Table 3: Performance comparison in terms of user engagement. ↑ means higher is better, while ↓ means lower is better.

Methods      Share (↑)  Dislike (↓)  Finish (↑)  Dwell Time/s (↑)
EBNR [14]    1.1203     0.9679       0.0671      84.061
DKN [20]     1.1169     0.9729       0.0655      83.494
NPA [23]     1.1288     0.9588       0.0691      84.579
NAML [22]    1.1269     0.9593       0.0689      84.487
LSTUR [1]    1.1325     0.9610       0.0696      84.712
NRMS [25]    1.1343     0.9583       0.0709      84.793
FIM [19]     1.1365     0.9595       0.0711      85.010
DFN [29]     1.1398     0.9519       0.0745      85.346
CPRS [27]    1.1455     0.9434       0.0772      86.129
FeedRec      —          —            —           —

We have several findings from the results. First, compared with the methods based only on click feedbacks, the methods that consider other user feedbacks (i.e., DFN, CPRS and FeedRec) achieve better performance in terms of both news clicks and user engagement. This shows that click feedbacks may not be sufficient to model user interests accurately, and that other feedbacks such as dislike and dwell time can provide complementary information for user modeling. Second, among the methods that can exploit multiple kinds of user feedbacks, CPRS and FeedRec perform better than DFN. This may be because dislike feedbacks are relatively sparse, which may be insufficient to distill negative user interests accurately. Third, our FeedRec approach outperforms all compared methods in both click- and engagement-based metrics. This is probably because our approach can effectively exploit the various feedbacks of users to model their interests more accurately. In addition, our multi-feedback model training framework considers not only news clicks but also engagement signals, which can help learn a user engagement-aware recommendation model that improves user experience.
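The click-based ranking metrics reported above (MRR, nDCG@5, HR@5) are computed per impression and then averaged over the test set. A minimal sketch follows; the HR@5 definition used here (whether any clicked news appears in the top 5) is an assumption, as the paper does not spell it out.

```python
import math

def mrr(rels):
    # rels: 0/1 click labels of candidate news, in the model's ranked order.
    for i, r in enumerate(rels):
        if r:
            return 1.0 / (i + 1)
    return 0.0

def dcg_at_k(rels, k):
    # Discounted cumulative gain over the top-k positions.
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    # Normalize by the DCG of the ideal ordering.
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def hr_at_k(rels, k):
    # Hit ratio: 1 if any clicked news is ranked within the top k.
    return float(any(rels[:k]))
```

For example, an impression whose only clicked item is ranked third gives MRR = 1/3 and nDCG@5 = 0.5.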
In this section, we study the influence of different feedbacks on the model performance. We compare the performance of FeedRec and its variants with one kind of feedbacks removed, and the results are shown in Fig. 7.

Figure 7: Influence of different user feedbacks on the model performance.

Figure 8: Effectiveness of several core model components.

We find that the performance declines when any kind of feedbacks is dropped. Among them, the click feedback plays the most important role, which is intuitive. However, it is interesting that the skip feedback is the second most important. This may be because skips can also provide rich clues for inferring user interests (usually negative ones) to support user modeling. In addition, finish and quick close feedbacks are also important. This may be because both kinds of feedbacks are indications of users' news reading satisfaction, which is important for modeling user preferences. Besides, share and dislike feedbacks are also useful, but their contributions are relatively small. This may be because, although these explicit feedbacks are strong indications of user preference, they are usually sparse in practice. Thus, it is important to incorporate other implicit feedbacks like finish to model user interests more comprehensively.
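The feedback types discussed in this ablation can be derived from raw behavior logs roughly as follows. This is an illustrative sketch under assumed field names, not the paper's data pipeline; the 10-second quick-close threshold is the tuned value reported later in the paper.

```python
from dataclasses import dataclass

@dataclass
class Behavior:
    # Hypothetical per-impression record; field names are illustrative.
    clicked: bool
    finished: bool
    shared: bool
    disliked: bool
    dwell_time: float  # seconds spent on the clicked news (0 if not clicked)

QUICK_CLOSE_T = 10.0  # dwell time threshold T in seconds

def feedback_types(b: Behavior) -> list:
    """Map one impressed news item to the feedback types it produces."""
    if not b.clicked:
        return ["skip"]                  # impressed but not clicked: weak negative signal
    types = ["click"]
    if b.finished:
        types.append("finish")           # article read to the end: implicit positive signal
    elif b.dwell_time < QUICK_CLOSE_T:
        types.append("quick_close")      # closed quickly: implicit negative signal
    if b.shared:
        types.append("share")            # explicit positive feedback
    if b.disliked:
        types.append("dislike")          # explicit negative feedback
    return types
```

Note that one click can carry several feedback types at once (e.g., click + finish + share), which is exactly why a unified multi-feedback user model is needed.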
In this section, we validate the effectiveness of the core model components in our FeedRec approach and the loss functions used for model training. We first compare the performance of our approach and its variants with one component removed, as shown in Fig. 8. From the results, we find that the heterogeneous feedback Transformer contributes most. This may be because it can capture the global relatedness between all the feedbacks of a user. In addition, the strong-to-weak attention network is also very useful. This is because it can select informative feedbacks for user modeling while taking the information of strong feedbacks into consideration, which helps distill positive and negative user interests more precisely. Moreover, the homogeneous Transformer can also improve the performance. This may be because it can better capture the diverse characteristics of different kinds of feedbacks and benefit user modeling.

Figure 9: Influence of different loss functions.

We also study the influence of each loss function on model training by removing it from the unified training loss. The results are shown in Fig. 9. We find that the positive-negative disentangling loss can effectively improve the model performance. This may be because it pushes the model to distill positive and negative interest information more accurately, which is beneficial for recommendation. In addition, both the finish prediction and dwell time prediction losses are helpful. This may be because finish and dwell time signals are correlated with user satisfaction; incorporating them into model training can help learn an engagement-aware user model, which improves the recommendation performance.

Figure 10: Effect of different embeddings in the news encoder.

Finally, we investigate the influence of several kinds of embeddings in the news encoder, including position embedding, feedback embedding, dwell time embedding and time interval embedding, by removing one of them at a time. We illustrate the results in Fig. 10. We find that the feedback embedding plays the most important role. This is because the embedding of feedback type is very useful for distinguishing different kinds of feedbacks. (We do not remove the text embedding because the performance would be quite unsatisfactory.) In addition, the dwell time embedding is also important. This may be because dwell time embeddings provide rich information for inferring the satisfaction of users. Besides, both position and time interval embeddings are useful. This is because position embeddings help capture the feedback order, and time interval embeddings help better model the relatedness between adjacent feedbacks.

Figure 11: Influence of the dwell time threshold 𝑇.

In this section, we present some analysis of several critical hyperparameters in our approach, including the dwell time threshold 𝑇 for constructing quick close feedbacks and the coefficients (i.e., 𝛼, 𝛽 and 𝛾) that control the importance of different tasks. We first vary the threshold 𝑇 from 0 to 25 seconds to study its influence on model performance. The results are shown in Fig. 11. We find that the performance is suboptimal when the threshold 𝑇 is too small (e.g., 5 seconds). This may be because many negative feedbacks with short reading dwell time cannot be exploited. In addition, the performance also declines when 𝑇 becomes too large, because many positive feedbacks will be mistakenly regarded as negative ones, which harms user interest modeling. Thus, in our approach the threshold 𝑇 is set to 10 seconds, which is also consistent with the findings in [26].

We then study the influence of the three loss coefficients. We first tune the finish prediction loss coefficient 𝛼 under 𝛽 = 𝛾 = 0. The results are shown in Fig. 12(a). We find that the performance is not optimal when 𝛼 is either too small or too large. This may be because the finish signals are not fully exploited when 𝛼 is very small, while the main click prediction task is harmed if the coefficient becomes too large. Thus, we empirically set 𝛼 to 0.2. Then, we tune the dwell time prediction loss coefficient 𝛽 under 𝛼 = 0.2 and 𝛾 = 0 (Fig. 12(b)). The dwell time signals are not fully exploited when 𝛽 is too small, while the click prediction task is not fully respected when 𝛽 is too large. Thus, we set 𝛽 to 0.15 according to the results. Finally, we search the value of the positive-negative disentangling loss coefficient 𝛾 under the previous settings of 𝛼 and 𝛽. We observe that a moderate value of 𝛾 such as 0.2 is suitable for our approach. This may be because the positive and negative feedbacks cannot be effectively distinguished when 𝛾 is too small, while this regularization loss is over-emphasized when 𝛾 is too large.

Figure 12: Influence of different loss coefficients on the model performance. (a) Finish prediction loss coefficient 𝛼. (b) Dwell time prediction loss coefficient 𝛽. (c) Positive-negative disentangling loss coefficient 𝛾.
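The unified multi-task objective tuned above can be sketched as follows, with the tuned coefficients as defaults. The exact loss forms are not reproduced in this section, so the binary cross-entropy, regression, and cosine-similarity terms below are assumptions chosen for illustration, not the paper's precise formulation.

```python
import numpy as np

def bce(p, y, eps=1e-12):
    # Binary cross-entropy; an assumed form for the click/finish prediction losses.
    p = np.clip(np.asarray(p, float), eps, 1 - eps)
    y = np.asarray(y, float)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def unified_loss(click_p, click_y, finish_p, finish_y,
                 dwell_p, dwell_y, pos_u, neg_u,
                 alpha=0.2, beta=0.15, gamma=0.2):
    """L = L_click + alpha * L_finish + beta * L_dwell + gamma * L_disentangle."""
    l_click = bce(click_p, click_y)
    l_finish = bce(finish_p, finish_y)
    # Dwell time prediction as regression on (assumed) normalized targets.
    l_dwell = float(np.mean((np.asarray(dwell_p) - np.asarray(dwell_y)) ** 2))
    # Illustrative disentangling term: penalize cosine similarity between the
    # positive and negative user interest vectors so they are pushed apart.
    cos = float(np.dot(pos_u, neg_u) /
                (np.linalg.norm(pos_u) * np.linalg.norm(neg_u)))
    l_dis = max(cos, 0.0)
    return l_click + alpha * l_finish + beta * l_dwell + gamma * l_dis
```

With this structure, shrinking any coefficient toward zero recovers a click-only model, which matches the ablation behavior reported above.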
In this paper, we present a news feed recommendation approach that can exploit various kinds of user feedbacks. In our approach, we propose a unified user modeling framework to incorporate various explicit and implicit user feedbacks to comprehensively capture user interests. In addition, we propose a strong-to-weak attention network that uses strong feedbacks to distill accurate positive and negative user interests from weak implicit feedbacks. Besides, we propose a multi-feedback model training framework that trains the model in the click, finish and dwell time prediction tasks to learn engagement-aware feed recommendation models. Extensive experiments on a real-world dataset validate that our approach can effectively improve model performance in terms of both news clicks and user engagement.
REFERENCES
[1] Mingxiao An, Fangzhao Wu, Chuhan Wu, Kun Zhang, Zheng Liu, and Xing Xie. 2019. Neural News Recommendation with Long- and Short-term User Representations. In ACL. 336–345.
[2] Suyu Ge, Chuhan Wu, Fangzhao Wu, Tao Qi, and Yongfeng Huang. 2020. Graph Enhanced Representation Learning for News Recommendation. In WWW. 2863–2869.
[3] Anatole Gershman, Travis Wolfe, Eugene Fink, and Jaime G Carbonell. 2011. News personalization using support vector machines. (2011).
[4] Linmei Hu, Chen Li, Chuan Shi, Cheng Yang, and Chao Shao. 2020. Graph neural news recommendation with long-term and short-term interest modeling. Information Processing & Management 57, 2 (2020), 102142.
[5] Linmei Hu, Siyong Xu, Chen Li, Cheng Yang, Chuan Shi, Nan Duan, Xing Xie, and Ming Zhou. 2020. Graph neural news recommendation with unsupervised preference disentanglement. In ACL. 4255–4264.
[6] Dhruv Khattar, Vaibhav Kumar, Vasudeva Varma, and Manish Gupta. 2018. Weave&Rec: A word embedding based 3-D convolutional network for news recommendation. In CIKM. ACM, 1855–1858.
[7] Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In ICLR.
[8] Dongho Lee, Byungkook Oh, Seungmin Seo, and Kyong-Ho Lee. 2020. News Recommendation with Topic-Enriched Knowledge Graphs. In CIKM. 695–704.
[9] Lei Li, Dingding Wang, Tao Li, Daniel Knox, and Balaji Padmanabhan. 2011. SCENE: a scalable two-stage personalized news recommendation system. In SIGIR. 125–134.
[10] Miaomiao Li and Licheng Wang. 2019. A Survey on Personalized News Recommendation Technology. IEEE Access.
[11] In ICDE. IEEE, 505–516.
[12] Danyang Liu, Jianxun Lian, Shiyin Wang, Ying Qiao, Jiun-Hung Chen, Guangzhong Sun, and Xing Xie. 2020. KRED: Knowledge-Aware Document Representation for News Recommendations. In RecSys. 200–209.
[13] Nuno Moniz and Luís Torgo. 2018. Multi-source social feedback of online news feeds. arXiv preprint arXiv:1801.07055 (2018).
[14] Shumpei Okura, Yukihiro Tagami, Shingo Ono, and Akira Tajima. 2017. Embedding-based news recommendation for millions of users. In KDD. 1933–1942.
[15] TYSS Santosh, Avirup Saha, and Niloy Ganguly. 2020. MVL: Multi-View Learning for News Recommendation. In SIGIR. 1873–1876.
[16] Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. JMLR 15, 1 (2014), 1929–1958.
[17] Liang Tang, Bo Long, Bee-Chung Chen, and Deepak Agarwal. 2016. An empirical study on recommendation with multiple types of feedback. In KDD. 283–292.
[18] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NIPS. 5998–6008.
[19] Heyuan Wang, Fangzhao Wu, Zheng Liu, and Xing Xie. 2020. Fine-grained Interest Matching for Neural News Recommendation. In ACL. 836–845.
[20] Hongwei Wang, Fuzheng Zhang, Xing Xie, and Minyi Guo. 2018. DKN: Deep Knowledge-Aware Network for News Recommendation. In WWW. 1835–1844.
[21] Hongyi Wen, Longqi Yang, and Deborah Estrin. 2019. Leveraging post-click feedback for content recommendations. In RecSys. 278–286.
[22] Chuhan Wu, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. 2019. Neural News Recommendation with Attentive Multi-View Learning. In IJCAI. 3863–3869.
[23] Chuhan Wu, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang, and Xing Xie. 2019. NPA: Neural news recommendation with personalized attention. In KDD. 2576–2584.
[24] Chuhan Wu, Fangzhao Wu, Mingxiao An, Yongfeng Huang, and Xing Xie. 2019. Neural News Recommendation with Topic-Aware News Representation. In ACL. 1154–1159.
[25] Chuhan Wu, Fangzhao Wu, Suyu Ge, Tao Qi, Yongfeng Huang, and Xing Xie. 2019. Neural News Recommendation with Multi-Head Self-Attention. In EMNLP-IJCNLP. 6390–6395.
[26] Chuhan Wu, Fangzhao Wu, Yongfeng Huang, and Xing Xie. 2020. Neural news recommendation with negative feedback. CCF Transactions on Pervasive Computing and Interaction 2, 3 (2020), 178–188.
[27] Chuhan Wu, Fangzhao Wu, Tao Qi, and Yongfeng Huang. 2020. User Modeling with Click Preference and Reading Satisfaction for News Recommendation. In IJCAI-PRICAI. 3023–3029.
[28] Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, et al. 2020. MIND: A Large-scale Dataset for News Recommendation. In ACL. 3597–3606.
[29] Ruobing Xie, Cheng Ling, Yalong Wang, Rui Wang, Feng Xia, and Leyu Lin. 2020. Deep Feedback Network for Recommendation. In IJCAI-PRICAI. 2519–2525.
[30] Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. Hierarchical attention networks for document classification. In NAACL-HLT. 1480–1489.
[31] Xing Yi, Liangjie Hong, Erheng Zhong, Nanthan Nan Liu, and Suju Rajan. 2014. Beyond clicks: dwell time for personalization. In RecSys. 113–120.
[32] Hui Zhang, Xu Chen, and Shuai Ma. 2019. Dynamic News Recommendation with Hierarchical Attention Network. In ICDM. IEEE, 1456–1461.
[33] Qiannan Zhu, Xiaofei Zhou, Zeliang Song, Jianlong Tan, and Li Guo. 2019. DAN: Deep attention neural network for news recommendation. In