Automatic Generation of Natural Language Explanations
Felipe Costa
Aalborg Universitet, Selma Lagerløfs Vej 300, Aalborg, Denmark
[email protected]
Sixun Ouyang
Insight Centre for Data Analytics, University College Dublin, Belfield, Dublin 4, Ireland
[email protected]
Peter Dolog
Aalborg Universitet, Selma Lagerløfs Vej 300, Aalborg, Denmark
[email protected]
Aonghus Lawlor
Insight Centre for Data Analytics, University College Dublin, Belfield, Dublin 4, Ireland
[email protected]
ABSTRACT
An important task for a recommender system is to generate explanations according to a user's preferences. Most current methods for explainable recommendation use structured sentences to provide descriptions along with the recommendations they produce. However, those methods neglect the review-oriented way of writing a text, even though it is known that such reviews have a strong influence on users' decisions.

In this paper, we propose a method for the automatic generation of natural language explanations that predicts how a user would write about an item, based on user ratings of different item features. We design a character-level recurrent neural network (RNN) model, which generates an item's review explanations using long short-term memory (LSTM). The model generates text reviews given a combination of the review and rating scores that express opinions about different factors or aspects of an item. Our network is trained on a sub-sample of the large real-world dataset BeerAdvocate. Our empirical evaluation using natural language processing metrics shows that the quality of the generated text is close to that of a real user-written review, capturing negation, misspellings, and domain-specific vocabulary.
CCS CONCEPTS
• Information systems → Recommender systems; • Computing methodologies → Natural language generation; Neural networks

KEYWORDS
Recommender Systems, Explainability, Explanations, Neural Networks
ACM Reference format:
Felipe Costa, Sixun Ouyang, Peter Dolog, and Aonghus Lawlor. 2017. Automatic Generation of Natural Language Explanations. In Proceedings of 2nd Workshop on Deep Learning for Recommender Systems, Como, Italy, August 2017 (DLRS'17).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
DLRS'17, August 2017, Como, Italy
© 2017 Copyright held by the owner/author(s). ACM ISBN 123-4567-24-567/08/06...$15.00
https://doi.org/10.475/123_4
INTRODUCTION

One of the key challenges for a recommender system is to predict the probability that a target user likes a given item, taking into account the user's history and their similarity to other users. However, making predictions in this way does not explain why the item matches the user's preferences. Recent works have introduced the concept of explainable recommender systems, which try to generate explanations according to users' preferences rather than only predicting a numerical rating for an item. In this work we develop an approach using character-level neural networks to generate readable explanations.

Current explainable recommendation methods propose to mine user reviews to generate explanations. In [27] they propose an explicit factor model, which first extracts aspects and user opinions by phrase-level sentiment analysis of user-generated reviews, then generates both recommendations and disrecommendations according to specific product features, personalised to the user's interests and the hidden features learned. On the other hand, in [8] they propose a tripartite graph to enrich the user-item binary relation to a user-item-aspect ternary relation. In each of these works, aspects are extracted from reviews to generate explainable recommendations, but user opinions and influences from social relations are not considered as a source of explanation. In [23] they propose the social collaborative viewpoint regression model, which detects viewpoints and uses social relations in a latent variable model. A viewpoint is represented as a tuple of a concept, a topic, and a sentiment label, derived from both user reviews and trusted social relations.

Explanations generated in this manner lack natural language expressions, since the sentences are generated in a modular way. However, it is well established by [25] that a good explanation must be clear and interesting to the target user, since this information has a significant influence on the user's decision.
On-line user-generated reviews present clear and interesting information about items, since they describe users' personal usage experience. Furthermore, this source plays an important role on the user side, since users tend to trust the opinions of other users [5, 13, 22].
Recurrent neural networks (RNNs) have recently demonstrated very good performance in natural language generation, since the generating function can be learned automatically from massive text corpora. Because RNNs suffer from the vanishing gradient problem, long short-term memory (LSTM) has been applied to the text generation field, leading to significant improvements on this issue. Another advantage of using LSTM is its ability to keep in memory the long-range dependencies among words and characters. The combination of RNNs with LSTM has shown promising results on text datasets as different as Shakespeare's poems, scientific papers, and Linux source code [12].

Most natural language text generation approaches focus on the raw textual content and often neglect its contextual information. Context, such as the specific location, time, and sentiment, is an important factor in the creation of user-generated on-line reviews and should not be neglected. Recent research on recommender systems has demonstrated improvements achieved by including context [1]. This paper incorporates this information to enrich the generated sentences with particular contextual features.

In this paper, we propose a technique for the automatic generation of explanations, based on generating text reviews given a vector of ratings that express opinions about different factors of an item. Our method is based on a character-level LSTM trained on a sub-sample of the large real-world dataset BeerAdvocate. It is divided into three modules: a context encoder, an LSTM decoder, and the review generation. The ratings are normalised, then concatenated to the characters to feed the LSTM cells, which generate characters contextualised by the normalised ratings. The generative review module performs weighted generation based on the ratings vector given as input.
The weights learn soft alignments between generated characters and sentiment, where we adaptively compute encoder-side context vectors used to predict the next characters.

Automatically generated review-oriented explanations are useful for companies and users, who can benefit from the helpfulness of the explanations when assessing an item recommendation. [4] shows that character-level generation has advantages over other techniques, such as unsupervised learning of grammar and punctuation, and can be more efficient than word-level generation, since it allows for the prediction and generation of new words and strings.

This paper presents the following contributions:
• A context-aware review generation model based on rating scores.
• Generation of reviews that are readable from a human perspective.
PRELIMINARIES

In this section, we provide the basic definitions and preliminaries for generating natural language explanations. Given a set of items I and a target user u:
• An item is a product (beer) represented by i ∈ I.
• Explicit feedback is an action represented by the matrix X_u: U × I → R, where u ∈ U is a user, i ∈ I is an item, and r ∈ R represents a rating that user u has given to item i. Each rating r is a vector corresponding to a set of five features: appearance, aroma, palate, taste, and overall.
• Reviews are another form of explicit feedback, in text format, represented by the matrix X_a: U × I → T, where u ∈ U is a user, i ∈ I is an item, and t ∈ T represents a review that user u has given to item i.

Ratings are attributes that express a user's opinion about a certain item; however, it is difficult to compose a judgement of a product based only on a rating score. User-generated reviews are richer, since the user can give explanations according to different features and aspects of a specific item. There are many approaches to generating explanations for different types of recommender systems, including collaborative filtering [9] and case-based approaches [18]. Explanations have been shown to increase the effectiveness of the recommendation and the user's satisfaction [26] in various evaluation settings. The current state of the art in explainable recommender systems does not offer human-oriented explanations. To address this issue, our model targets the problem of generating explanations in a review-oriented, natural language form.

We formulate the item explanation generation problem as follows. Given an input ratings vector r_i = (r_1, ..., r_{|r_i|}), we aim to generate an item explanation e_i = (w_1, ..., w_{|t_i|}), maximising the conditional probability p(e|r).
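As a concrete illustration of the definitions above, the two forms of explicit feedback can be held in simple Python structures. This is only a sketch with hypothetical user/item identifiers and scores, not the paper's code:

```python
# Explicit feedback: each (user, item) pair maps to a five-feature rating
# vector (appearance, aroma, palate, taste, overall), as defined above.
ratings = {
    ("u1", "beer42"): [4.0, 4.5, 3.5, 4.0, 4.0],  # hypothetical scores
}

# Reviews: the same (user, item) pair maps to free-text feedback.
reviews = {
    ("u1", "beer42"): "pours a deep amber with a thin head. sweet aroma.",
}

FEATURES = ["appearance", "aroma", "palate", "taste", "overall"]
assert len(ratings[("u1", "beer42")]) == len(FEATURES)
```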
Note that the rating r_i is the average of the evaluations of target item i, in a fixed numerical representation, while the review t_i is a character sequence of variable length. We set |r| = 5 in our task, as we have 5 features with different rating values. The model learns to compute the likelihood of generated reviews given a set of input ratings. This conditional probability p(e|r) is given in Eq. 1:

p(e|r) = ∏_{s=1}^{|e|} p(w_s | w_{<s}, r)    (1)

where w_{<s} = (w_1, ..., w_{s−1}).

Neural networks have only recently started to attract attention in the recommender systems community. In [14] they study recurrent neural networks in different architectures for a collaborative recommender system, with experiments showing good performance. Despite the good performance, this work suffers from the same problem as the others: it is not explainable.

The work of [2] is among the first to utilise review text as side information to improve the performance of a recommender system, with solutions rooted in recurrent neural networks. Our work differs in that we are in fact trying to generate explanations in the form of a user-generated review, to improve a user's understanding of recommended items.

What we would like to achieve is an alignment between the variables or features which lead to the recommendation of one item or another and a descriptive text, where the rules of text composition are learned from existing reviews. We would thus like to achieve an alignment similar to what others have achieved in different domains, such as text generation for images [11].

Learning the rules for generating reviews can be accomplished by representing the input as sentences, words, or characters.
In [19] and [24] they propose tree-based neural network models for natural language inference based on words and their context. We study character-level explanation generation to further improve the state of the art. The work of [12] provides the first insights into why the LSTM variant of neural networks performs so well. A similar technique was used in [4], which builds on that work to generate product reviews in the restaurant domain.

Encoding rating vectors in the training phase allows the system to calculate the probability of the next character based on the given rating. In previous work, [6] showed an efficient method for generating the next word in a sequence when an attention mechanism is added, showing that this idea improves performance for long sequences.

Character-level generation has shown improvement over word-level generation on the text generation problem using RNNs [4]. This is because, at the character level, the neural network can autonomously learn grammatical and punctuation rules. In [4] they mention that the character-level RNN provides slightly worse performance than the equivalent word-based model; however, it shows improvements in terms of computational cost, which grows with the size of the input and output dictionaries, and, in contrast, it allows for the prediction and generation of new words and strings.

In [15] they focus on character-level review generation and classification where the ratings are used as auxiliary information. Our work differs from both aforementioned approaches to character-level text generation by utilising richer data (ratings are used to make explicit the quality of a product along different features, identified as a source of the user's preferences) and by providing a first attempt to generate explanations with character-level networks that reflect user differences and preferences.
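The character-level factorisation of p(e|r) in Eq. 1 can be sketched in a few lines; `char_prob` stands in for the trained model, and the uniform toy model below is only there to make the sketch runnable:

```python
import math

def review_log_likelihood(chars, ratings, char_prob):
    """Sum of log p(w_s | w_<s, r) over positions s -- the log of Eq. 1."""
    total = 0.0
    for s, w in enumerate(chars):
        # probability of character w given the prefix and the rating vector
        total += math.log(char_prob(w, chars[:s], ratings))
    return total

# toy stand-in model: uniform distribution over a 27-character alphabet
uniform = lambda w, prefix, r: 1.0 / 27

ll = review_log_likelihood("good", [0.8] * 5, uniform)  # 4 * log(1/27)
```

Working in log space avoids underflow for long character sequences, which matters since reviews are hundreds of characters long.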
Recurrent neural networks (RNNs) are feed-forward networks with temporal activation that process and learn from sequential data. In the training step, the network receives an input vector X_t at time t and the cell state of the previous time step t−1; the input weight matrix is W_x and the state weight matrix is W_h. The RNN then passes the cell state h_t to the next time step and produces a prediction Y_t via a softmax layer, as shown in Eq. 2:

h_t = tanh(X_t ⊙ W_x + h_{t−1} ⊙ W_h)
Y_t = softmax(h_t ⊙ W + b)    (2)

According to Eq. 2, if we continue feeding values to X_t, the input weight matrix W_x and state weight matrix W_h will be adjusted to suit the input values. RNNs suffer from a vanishing gradient problem: depending on the activation functions, sequential information gets lost over time. To handle this issue, [10] introduced long short-term memory (LSTM) cells, later improved by [7] with forget gates to discard some information.

LSTM is an improved version of the RNN controlled by a sequential connection of gates: a forget gate, an input gate, and an output gate. When receiving input data x_t at time t and the cell state C_{t−1} from the previous time step t−1, those values are concatenated together for the next computation. The result first feeds the forget gate, which decides which information has to be discarded; f_t represents the output of the forget gate at time t, and W_f and b_f are its weight matrix and bias, respectively. The next step for the LSTM cell is to determine which information should be stored in the cell state through the input gate. At the update step, i_t denotes the input gate output, and W_i and b_i are its parameters. The cell creates a candidate state C′_t through a tanh layer. The candidate state, together with the previous cell state, the forget gate output f_t, and the input gate output i_t, is used to update the current state C_t. Finally, the data goes to the output gate, which uses a sigmoid layer to determine which part of the cell state is output, then multiplies it with the tanh of the current cell state C_t to yield the character with the highest probability:

X = [x_t, C_{t−1}]
f_t = σ(X ⊙ W_f + b_f)
i_t = σ(X ⊙ W_i + b_i)
C′_t = tanh(X ⊙ W_c + b_c)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ C′_t
o_t = σ(X ⊙ W_o + b_o)
H_t = o_t ⊙ tanh(C_t)    (3)

Figure 1: Generative Concatenative Network

Generative RNN models can be applied in many fields, as most data can be represented as a sequence; this is especially true for text generation. State weights enable generative RNNs to generate coherent text: one character is fed into the network at each time step, and these inputs affect the state weights. This project builds on the generative concatenative network presented by [15], which uses an LSTM RNN character-based generation model, adding auxiliary information according to ratings for different feature preferences.

In [12] they define a character-level language model: given a sequence of characters as input to an LSTM neural network, the model calculates the probability of the next character in the sequence with a softmax function at each time step s, then generates the character as output. Given a set of C characters, we encode each character as a C-dimensional vector {x_t}, t = 1, ..., T, and feed them to the recurrent network to obtain a sequence of H-dimensional hidden vectors {H^L_t}, t = 1, ..., T, at the last layer of the network. To obtain predictions for the next character in the sequence, the output is passed through a top activation layer to produce a sequence of vectors ŷ, where ŷ = W_y · H^L_t and W_y is a [K × D] parameter matrix. The output vectors are interpreted as holding the log probabilities of the next character in the sequence, and the objective is to minimise the average cross-entropy loss over all targets.

In [15] they propose to generate text conditioned on an auxiliary input x_aux, which is concatenated with the character representation x^(t)_char, as seen in Fig. 1. They train their network on the concatenated input x′(t) = [x^(t)_char; x_aux]. At training time, x_aux is a feature of the training set, while during the generation step they define some x_aux and concatenate it with each character sampled from ŷ^(t). They replicate the auxiliary information x_aux at each input to allow the model to focus on learning the complex interactions between the auxiliary input and the language, rather than just memorising the input. However, they consider only the overall rating or temperature for a given item, neglecting the user's preferences over different aspects.

Similar to [12] and [15], our model is based on an LSTM RNN network for generating reviews. Our model adds a set of auxiliary information to each character in the context encoder module.

Figure 2: Generative Explanations

In our model, the context encoder module encodes the input character using one-hot encoding and concatenates a set of ratings to it before feeding it into the network, as seen in Fig. 2. In our experiments, we generate a dictionary of all the characters in the corpus to record their positions; it is used for encoding in the training step and for decoding in the generation step. For each character in the reviews, a one-hot vector is generated using its position in the dictionary. The one-hot vector is then concatenated with a set of auxiliary information associated with the review, as shown in Eq. 4. As auxiliary information, our model uses the numeric values of the users' ratings, rescaled to the range [0, 1]:

X′_t = [onehot(x_char); x_auxiliary]    (4)

As mentioned previously, [15] proposed a GCN model concatenating characters with some auxiliary information, i.e. the overall rating or temperature, and was able to generate some remarkable samples. It uses one piece of auxiliary information to enrich the probability defining the next character. We propose an improvement to the concatenation process, where we consider a vector of auxiliary data, i.e. a set of rating scores for different features of items, instead of only one dimension of auxiliary information. During review generation, our model generates distinct pieces of text tuned to the distribution of the applied ratings.

A non-linear softmax layer is used in our model to compute the probability of each character. During the generation process, the model takes a prime text, which is a start symbol of each review, concatenated with a series of rating scores. The model then passes its output through a softmax layer, as shown in Eq. 5, where H_t is the output of an LSTM cell and W and b are the weight and bias of the softmax layer, respectively:

Y_t = softmax(H_t ⊙ W + b)    (5)

This procedure is applied recursively, and characters are generated until the pre-defined end symbol is produced.

By using LSTM cells for character-level explainable review generation, and merging them with the vector of ratings, we allow the model to learn grammar and punctuation, making it more efficient than word-level models [4], since our model can predict and generate new words and strings. Our model therefore generates explanations for recommender systems from a review-oriented perspective, improving the quality of the explanation text presented to the user in the form of a review.

EXPERIMENTS

Empirical experiments used a customised LSTM RNN library written in Python using TensorFlow. There are 2 hidden layers with 1024 LSTM cells per layer. During training, a wrapper mechanism is used to prevent over-fitting. The input data was split into 100 batches with a batch size of 128, and each batch has a sequence length of 280.
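As a minimal end-to-end sketch of the pieces described above, the following NumPy code implements one LSTM step following Eq. 3 (note that, as written in Eq. 3, the input is concatenated with the previous cell state; many textbook LSTM formulations concatenate the hidden state instead), the one-hot-plus-ratings encoding of Eq. 4, and the recursive sampling loop of Eq. 5. All shapes, the toy vocabulary, and the random weights are illustrative, not the trained model from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = list("ab") + ["<str>", "<end>"]          # toy character dictionary
CHAR_TO_IDX = {c: i for i, c in enumerate(VOCAB)}
N_RATINGS, HIDDEN = 5, 16
IN_DIM = len(VOCAB) + N_RATINGS                  # one-hot chars + ratings

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(char, aux_ratings):
    """Eq. 4: one-hot character vector concatenated with [0,1] ratings."""
    onehot = np.zeros(len(VOCAB))
    onehot[CHAR_TO_IDX[char]] = 1.0
    return np.concatenate([onehot, aux_ratings])

# random gate parameters W_f, W_i, W_c, W_o and biases (illustrative only)
W = {g: rng.normal(0, 0.1, (IN_DIM + HIDDEN, HIDDEN)) for g in "fico"}
b = {g: np.zeros(HIDDEN) for g in "fico"}
W_y = rng.normal(0, 0.1, (HIDDEN, len(VOCAB)))   # softmax layer weights
b_y = np.zeros(len(VOCAB))

def lstm_step(x_t, c_prev):
    """One cell update following Eq. 3, with X = [x_t, C_{t-1}]."""
    X = np.concatenate([x_t, c_prev])
    f_t = sigmoid(X @ W["f"] + b["f"])           # forget gate
    i_t = sigmoid(X @ W["i"] + b["i"])           # input gate
    c_cand = np.tanh(X @ W["c"] + b["c"])        # candidate state C'_t
    c_t = f_t * c_prev + i_t * c_cand            # updated cell state
    o_t = sigmoid(X @ W["o"] + b["o"])           # output gate
    return o_t * np.tanh(c_t), c_t               # H_t, C_t

def generate(aux_ratings, max_len=50):
    """Eq. 5: sample characters recursively until <end> (or max_len)."""
    c_t, char, out = np.zeros(HIDDEN), "<str>", []
    for _ in range(max_len):
        h_t, c_t = lstm_step(encode(char, aux_ratings), c_t)
        logits = h_t @ W_y + b_y
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                     # softmax over characters
        char = VOCAB[rng.choice(len(VOCAB), p=probs)]
        if char == "<end>":
            break
        out.append(char)
    return "".join(out)

sample = generate(np.array([0.8, 0.9, 0.7, 0.85, 0.9]))
```

With untrained random weights the sample is of course gibberish; the point is the data flow: the same rating vector is re-concatenated at every time step, exactly as the context encoder described above requires.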
We tested our model on a sub-sample of the large real-world dataset BeerAdvocate. The original dataset consists of approximately 1.5 million reviews from 1998 to 2011. Each review includes ratings in terms of five categories: appearance, aroma, palate, taste, and overall impression. Reviews include item and user ids, followed by each of these five ratings, and a plain text review. Summary statistics of the extracted sub-sample are shown in Table 1.

Table 1: Dataset Statistics
The BeerAdvocate dataset contains several beer categories, and we selected a sub-sample based on just 5 categories: "american ipa", "russian imperial stout", "american porter", "american amber/red ale", and "fruit/vegetable beer". Since reviews that are too short or even empty would cause problems during training, we filter our sub-sample to include only reviews with at least 50 characters. For our experiments we concentrate on generating reviews conditioned on the size of the reviews in each beer category, and we select 4k reviews of each category for our training datasets.

We first generate a dictionary of all characters, i.e. punctuation, numbers, and letters, then transform each character into a one-hot vector using that dictionary. We train the network sequentially, feeding each review in as a sequence, so it is essential to signal to the network the start and end position of each review. We do this by appending start and end symbols, i.e. <str> and <end>, to each review for both the training and generation modules. In order to generate explanations for different ratings, we concatenate the input characters with the ratings of the review the character belongs to. In addition, we normalise the scale of the ratings to [0, 1].

Current methods to explain recommendations do not present the information to the user in a natural language way. Our proposed method explains the recommendation to a target user in the style of a user-generated review. To measure the quality of the presented text, we used a suite of natural language readability metrics: Automated Readability Index (ARI) [16], Flesch reading ease (FRE) [20], Flesch-Kincaid grade level (FGL) [21], Gunning-Fog index (GFI) [20], simple measure of gobbledygook (SMOG) [17], Coleman-Liau index (CLI) [21], LIX [3], and RIX [21]. The Flesch reading ease score is considered the oldest method to calculate readability, through the analysis of the number of words and sentence length. An updated version of this metric is the Flesch-Kincaid grade level. The Gunning-Fog index is commonly used to confirm that a text can be read easily by the intended audience. The SMOG score is an improvement on the Gunning-Fog index, showing better accuracy overall. The Automated Readability Index relies on the ratio of the number of characters per word. The Lix score gauges word length by the percentage of long words.

Figure 3: Readability metrics with epoch. The metrics are detailed in Sec. 5.4.
The initial test of our explanation generation concerns readability. We use the 8 readability evaluation metrics mentioned above on both generated and reference reviews. We first select 10 reviews from our sample dataset as the reference reviews. Using the same users and items as these 10 reviews, and considering the different learning curve of the model across epochs, we generated 10 reviews per epoch from the model. We then apply the readability metrics to the generated and reference reviews to evaluate the text. The readability results are shown in Fig. 3, where we observe that the generated reviews reach the same level of readability as the user reviews on all metrics after 20 epochs.
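The relative readability reported in Fig. 4 is simply the ratio of each metric's mean score on the generated reviews to its mean score on the reference reviews. A sketch, with hypothetical scores (any of the metric functions above could be plugged in):

```python
def relative_readability(gen_scores, ref_scores):
    """score(gen) / score(ref) per metric, using means over the reviews."""
    out = {}
    for metric, gens in gen_scores.items():
        refs = ref_scores[metric]
        out[metric] = (sum(gens) / len(gens)) / (sum(refs) / len(refs))
    return out

gen = {"ARI": [8.1, 7.9], "FRE": [70.0, 68.0]}   # hypothetical metric scores
ref = {"ARI": [8.0, 8.0], "FRE": [69.0, 69.0]}
rel = relative_readability(gen, ref)
```

A ratio near 1.0 for a metric means the generated text is indistinguishable from the human-written references under that metric.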
Figure 4: Extent of readability for different metrics. The readability of generated text is shown relative to the mean score for the user-generated reviews.
As Fig. 3 shows, the readability evaluation metrics illustrate the capacity of the model to generate reviews which are close to the user's style of writing. We take the readability scores of the reviews generated in the final epochs and normalise them by the scores obtained from the reference reviews to show the relative readability in Figure 4. This emphasises that the neural network generated reviews are close in style to the human-written reviews, as determined by a broad range of readability metrics which are sensitive to different qualities of the text. It is important that our explanations are legible, easy to understand, and appear to be written in a recognisable style.

We have established that our model can generate natural language text which reaches the overall readability level of the user-generated reviews. We now investigate the different kinds of explanations that can be generated when we modify the auxiliary values at the generation stage. We use the ratings of 5 aspects of the beers as auxiliary values; they represent each user's preferences and the general rating opinion about a target beer. We choose a user-item pair (U, I) and compute the average ratings for each feature for both the user, R̄_user = Σ_i R_user,i / |R_user|, and the item, R̄_beer. The precise contribution of user/item ratings is controlled with a weighting parameter α, and we demonstrate three different text samples generated through Eq. 6. As Eq. 6 shows, α controls the auxiliary values, and we generate reviews based on them. When α is close to 1, the generated review will be more like a review that the user would write. With α close to 0, the generated review will be closer to the general rating of all users for that beer. To investigate the divergence of generated reviews, we set α equal to 1, 0.5, and 0, which correspond to the opinion of the user on beers in general, the review the user might compose for that beer, and the general reviews of the beer.
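The weighting between the user's average ratings and the beer's average ratings described above can be sketched as follows (the rating vectors are hypothetical per-feature averages, not values from the dataset):

```python
import numpy as np

def blend_ratings(r_user, r_beer, alpha):
    """Auxiliary rating vector: alpha * R_user + (1 - alpha) * R_beer."""
    return alpha * np.asarray(r_user) + (1 - alpha) * np.asarray(r_beer)

r_user = np.array([0.9, 0.8, 0.7, 0.9, 0.85])   # user's per-feature averages
r_beer = np.array([0.6, 0.7, 0.8, 0.5, 0.65])   # beer's per-feature averages

mixed = blend_ratings(r_user, r_beer, 0.5)      # half user, half item
```

Setting alpha to 1 recovers the user's own average ratings, setting it to 0 recovers the beer's community averages, and intermediate values interpolate between the two personas fed to the generator.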
R_auxiliary = α × R_user + (1 − α) × R_beer    (6)

According to Fig. 5, the first review (α = 1) shows the opinion of the user on beers in general, which has a positive sentiment overall. When we look at the last review (α = 0), the generated text is closer to the general rating of all users for that beer.

Figure 5: Sample generated reviews for α = {1, 0.5, 0}. The dissimilar sentences are highlighted in bold.

α = 1.0: with some sweetness. the taste is also a bit more like a beer. there's a little bit of a sweet fruitiness to it as well. the mouthfeel is a bit thin for the style. drinkability is good. i would drink this all night long, but i wouldn't try to get more than one.

α = 0.5: but i wouldn't try to get my entire offering into a pint glass. i was pretty surprised this is the beer i was expecting a bit more.

α = 0.0: but i wouldn't try to get my entire offerings. i wouldn't recommend it.

CONCLUSION

In this paper, we propose a model to automatically generate natural language explanations for recommender systems. Our explanations provide easily intelligible and useful reasons for a user to decide whether to purchase a certain product. This has important benefits for the field of recommender systems, since these explanations can help a user make a better decision more quickly, as users place a high degree of trust in the reviews of others.

As our experiments with natural language readability metrics show, we were able to generate readable English text with specific characteristics that match user-generated review text.

In the future we will extend the automatic generation of natural language explanations in two ways: (1) personalised explanations that reflect the user's preferences, where the explanation of the product is tailored to the user's ratings, preferred aspects, and expressed sentiments; (2) testing our model in larger review domains, such as hotels and restaurants.

ACKNOWLEDGMENTS
This work is supported by Science Foundation Ireland through the Insight Centre for Data Analytics under grant number SFI/12/RC/2289, and by Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq (grant
REFERENCES
[1] Gediminas Adomavicius and Alexander Tuzhilin. 2015. Context-aware recommender systems. In Recommender Systems Handbook. Springer, 191–226.
[2] Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, and Aaron Courville. 2015. Learning Distributed Representations from Reviews for Collaborative Filtering. In Proceedings of the 9th ACM Conference on Recommender Systems (RecSys '15). ACM, New York, NY, USA, 147–154.
[3] Jonathan Anderson. 1983. Lix and Rix: Variations on a little-known readability index. Journal of Reading 26, 6 (1983), 490–496.
[4] A. Bartoli, A. d. Lorenzo, E. Medvet, D. Morello, and F. Tarlao. 2016. "Best Dinner Ever!!!": Automatic Generation of Restaurant Reviews with LSTM-RNN. In . 721–724.
[5] Dan Cosley, Shyong K Lam, Istvan Albert, Joseph A Konstan, and John Riedl. 2003. Is seeing believing?: How recommender system interfaces affect users' opinions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 585–592.
[6] Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, and Ke Xu. Learning to Generate Product Reviews from Attributes.
[7] Felix A Gers, Jürgen Schmidhuber, and Fred Cummins. 2000. Learning to forget: Continual prediction with LSTM. Neural Computation 12, 10 (2000), 2451–2471.
[8] Xiangnan He, Tao Chen, Min-Yen Kan, and Xiao Chen. 2015. TriRank: Review-aware Explainable Recommendation by Modeling Aspects. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM '15). 1661–1670.
[9] Jonathan L Herlocker, Joseph A Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work. ACM, 241–250.
[10] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[11] Andrej Karpathy and Li Fei-Fei. 2017. Deep Visual-Semantic Alignments for Generating Image Descriptions. IEEE Trans. Pattern Anal. Mach. Intell. 39, 4 (2017), 664–676.
[12] Andrej Karpathy, Justin Johnson, and Fei-Fei Li. 2015. Visualizing and Understanding Recurrent Networks. CoRR abs/1506.02078 (2015). International Conference on Learning Representations.
[13] Bart P Knijnenburg, Martijn C Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 441–504.
[14] Young-Jun Ko, Lucas Maystre, and Matthias Grossglauser. 2016. Collaborative Recurrent Neural Networks for Dynamic Recommender Systems. In Proceedings of The 8th Asian Conference on Machine Learning (ACML 2016), Hamilton, New Zealand, November 16-18, 2016 (JMLR Workshop and Conference Proceedings), Robert J. Durrant and Kee-Eung Kim (Eds.), Vol. 63. JMLR.org, 366–381.
[15] Zachary Chase Lipton, Sharad Vikram, and Julian McAuley. 2015. Capturing Meaning in Product Reviews with Character-Level Generative Text Models. CoRR abs/1511.03683 (2015).
[16] Lei Liu, Georgia Koutrika, and Shanchan Wu. 2015. LearningAssistant: A novel learning resource recommendation system. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on. IEEE, 1424–1427.
[17] G Harry Mc Laughlin. 1969. SMOG grading: a new readability formula. Journal of Reading 12, 8 (1969), 639–646.
[18] David McSherry. 2005. Explanation in recommender systems. Artificial Intelligence Review 24, 2 (2005), 179–197.
[19] Zhao Meng, Lili Mou, Ge Li, and Zhi Jin. 2016. Context-Aware Tree-Based Convolutional Neural Networks for Natural Language Inference. Springer International Publishing, Cham, 515–526.
[20] Maria Soledad Pera and Yiu-Kai Ng. 2012. BReK12: A Book Recommender for K-12 Users. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '12). 1037–1038.
[21] Maria Soledad Pera and Yiu-Kai Ng. 2013. What to Read Next?: Making Personalized Book Recommendations for K-12 Users. In Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13). 113–120.
[22] Pearl Pu, Li Chen, and Rong Hu. 2012. Evaluating recommender systems from the user's perspective: survey of the state of the art. User Modeling and User-Adapted Interaction 22, 4 (2012), 317–355.
[23] Zhaochun Ren, Shangsong Liang, Piji Li, Shuaiqiang Wang, and Maarten de Rijke. 2017. Social Collaborative Viewpoint Regression with Explainable Recommendations. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17). 485–494.
[24] Jian Tang, Yifan Yang, Samuel Carton, Ming Zhang, and Qiaozhu Mei. 2016. Context-aware Natural Language Generation with Recurrent Neural Networks. CoRR abs/1611.09900 (2016).
[25] Nava Tintarev and Judith Masthoff. 2007. Effective explanations of recommendations: user-centered design. In Proceedings of the 2007 ACM Conference on Recommender Systems. ACM, 153–156.
[26] Nava Tintarev and Judith Masthoff. 2012. Evaluating the effectiveness of explanations for recommender systems. User Modeling and User-Adapted Interaction 22, 4 (2012), 399–439.
[27] Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, and Shaoping Ma. 2014. Explicit Factor Models for Explainable Recommendation Based on Phrase-level Sentiment Analysis. In