Publication


Featured research published by Pengjie Ren.


Conference on Information and Knowledge Management | 2017

Neural Attentive Session-based Recommendation

Jing Li; Pengjie Ren; Zhumin Chen; Zhaochun Ren; Tao Lian; Jun Ma

In e-commerce scenarios where user profiles are invisible, session-based recommendation is used to generate recommendations from short sessions. Previous work considers only the user's sequential behavior in the current session, whereas the user's main purpose in the current session is not emphasized. In this paper, we propose a novel neural network framework, the Neural Attentive Recommendation Machine (NARM), to tackle this problem. Specifically, we explore a hybrid encoder with an attention mechanism to model the user's sequential behavior and capture the user's main purpose in the current session, which are later combined into a unified session representation. We then compute the recommendation score for each candidate item with a bi-linear matching scheme based on this unified session representation. We train NARM by jointly learning the item and session representations as well as their matchings. We carried out extensive experiments on two benchmark datasets. Our experimental results show that NARM outperforms state-of-the-art baselines on both datasets. Furthermore, we find that NARM achieves a significant improvement on long sessions, which demonstrates its advantage in modeling the user's sequential behavior and main purpose simultaneously.
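The hybrid encoder and bi-linear matching described in the abstract can be sketched roughly as follows (a minimal, dependency-free illustration; in the actual model the hidden states come from a gated recurrent encoder and the attention and bi-linear parameters are learned, so all concrete choices below are ours):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def narm_session_repr(hidden_states):
    """Hybrid encoder: the last hidden state serves as the 'global' code
    (sequential behavior); an attention-weighted sum of all states serves
    as the 'local' code (main purpose); the two are concatenated."""
    c_global = hidden_states[-1]
    # attention weights: similarity of each state to the last one
    weights = softmax([dot(h, c_global) for h in hidden_states])
    c_local = [sum(w * h[i] for w, h in zip(weights, hidden_states))
               for i in range(len(c_global))]
    return c_global + c_local  # unified session representation

def bilinear_score(item_emb, session_repr, B):
    """Bi-linear matching: score = item^T B c_session."""
    return dot(item_emb, [dot(row, session_repr) for row in B])
```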


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2017

Leveraging Contextual Sentence Relations for Extractive Summarization Using a Neural Attention Model

Pengjie Ren; Zhumin Chen; Zhaochun Ren; Furu Wei; Jun Ma; Maarten de Rijke

As a framework for extractive summarization, sentence regression has achieved state-of-the-art performance in several widely-used practical systems. The most challenging task within the sentence regression framework is to identify discriminative features to encode a sentence into a feature vector. So far, sentence regression approaches have neglected to use features that capture contextual relations among sentences. We propose a neural network model, Contextual Relation-based Summarization (CRSum), to take advantage of contextual relations among sentences so as to improve the performance of sentence regression. Specifically, we first use sentence relations with a word-level attentive pooling convolutional neural network to construct sentence representations. Then, we use contextual relations with a sentence-level attentive pooling recurrent neural network to construct context representations. Finally, CRSum automatically learns useful contextual features by jointly learning representations of sentences and similarity scores between a sentence and sentences in its context. Using a two-level attention mechanism, CRSum is able to pay attention to important content, i.e., words and sentences, in the surrounding context of a given sentence. We carry out extensive experiments on six benchmark datasets. CRSum alone can achieve comparable performance with state-of-the-art approaches; when combined with a few basic surface features, it significantly outperforms the state-of-the-art in terms of multiple ROUGE metrics.
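The attentive-pooling idea behind CRSum can be illustrated with a toy sketch (our simplification: plain vector attention in place of the paper's attentive pooling CNN/RNN, with cosine similarity standing in for the learned sentence-context score):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attentive_pool(vecs, probe):
    """Attentive pooling: weight each vector by its match with a probe
    vector, then take the weighted sum."""
    weights = softmax([dot(v, probe) for v in vecs])
    dim = len(vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, vecs)) for i in range(dim)]

def context_similarity(sentence_vec, context_word_vecs):
    """Score a sentence against its context: pool the context's word
    vectors attentively with the sentence as probe, then take the cosine
    similarity between sentence and pooled context."""
    ctx = attentive_pool(context_word_vecs, sentence_vec)
    na = math.sqrt(dot(sentence_vec, sentence_vec))
    nb = math.sqrt(dot(ctx, ctx))
    return dot(sentence_vec, ctx) / (na * nb) if na and nb else 0.0
```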


NLPCC | 2013

Understanding Temporal Intent of User Query Based on Time-Based Query Classification

Pengjie Ren; Zhumin Chen; Xiaomeng Song; Bin Li; Haopeng Yang; Jun Ma

Web queries are time-sensitive, which implies that users' information needs change over time. Recognizing the temporal intents behind user queries is crucial to improving the performance of search engines. However, to the best of our knowledge, this problem has not been studied in existing work. In this paper, we propose a time-based query classification approach to understand users' temporal intents automatically. We first analyze the shared features of queries' temporal intent distributions. Then, we present a query taxonomy that groups queries according to their temporal intents. Finally, for a new query, we propose a machine learning method to decide its class from its search frequency over time as recorded in Web query logs. Experiments demonstrate that our approach can understand users' temporal intents effectively.
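Classifying a query by its search-frequency series might rest on simple shape features like the following (these features and names are our illustrative stand-ins, not the paper's actual feature set):

```python
def temporal_features(freq_series):
    """Extract simple shape features from a query's search-frequency
    series over time: mean level, variance (burstiness), and the number
    of local peaks (recurring interest). Illustrative choices only."""
    n = len(freq_series)
    mean = sum(freq_series) / n
    var = sum((x - mean) ** 2 for x in freq_series) / n
    peaks = sum(1 for i in range(1, n - 1)
                if freq_series[i - 1] < freq_series[i] > freq_series[i + 1])
    return {"mean": mean, "variance": var, "peaks": peaks}
```

A downstream classifier would map such feature vectors to temporal-intent classes in the taxonomy.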


Information Retrieval | 2015

Mining and ranking users' intents behind queries

Pengjie Ren; Zhumin Chen; Jun Ma; Shuaiqiang Wang; Zhiwei Zhang; Zhaochun Ren

How to understand the intents behind user queries is crucial to improving the performance of Web search systems; the NTCIR-11 IMine task focuses on this problem. In this paper, we address the NTCIR-11 IMine task in two phases, referred to as Query Intent Mining (QIM) and Query Intent Ranking (QIR). (1) QIM mines users' potential intents by clustering short text fragments related to a given query. (2) QIR ranks those mined intents in a proper way. Two challenges arise in handling these tasks: (1) how to precisely estimate the intent similarity between user queries that consist of only a few words, and (2) how to properly rank intents in terms of multiple factors, e.g., relevance, diversity, and intent drift. For the first challenge, we investigate two interesting phenomena by analyzing query logs and document datasets, namely "Same-Intent-Co-Click" (SICC) and "Same-Intent-Similar-Rank" (SISR). SICC means that when users issue different queries, these queries represent the same intent if the users click on the same URL. SISR means that if two queries denote the same intent, issuing them to a search engine should yield similar search results. We then propose similarity functions for QIM based on these two phenomena. For the second challenge, we propose a novel intent ranking model that considers multiple factors as a whole. We perform extensive experiments and an interesting case study on the Chinese dataset of the NTCIR-11 IMine task. Experimental results demonstrate the effectiveness of our proposed approaches for both QIM and QIR.
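The SICC observation suggests a simple query similarity based on clicked URLs; a minimal sketch (Jaccard overlap is our choice of function, not necessarily the paper's exact one):

```python
def sicc_similarity(clicked_urls_q1, clicked_urls_q2):
    """Same-Intent-Co-Click: different queries whose clicks land on the
    same URLs likely share an intent. Jaccard overlap of the clicked-URL
    sets turns that observation into a similarity score in [0, 1]."""
    a, b = set(clicked_urls_q1), set(clicked_urls_q2)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```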


Asia-Pacific Web Conference | 2015

Sleep Quality Evaluation of Active Microblog Users

Kai Wu; Jun Ma; Zhumin Chen; Pengjie Ren

In this paper, we propose a novel method to evaluate the sleep quality of Active Microblog Users (AMUs) based on data from Sina Microblog, the largest microblog platform in China with 500 million registered users. A microblog user is called an AMU if s/he posts more than 100 microblogs during a year. Our study is meaningful because the number of AMUs in China is huge and the results can reflect the lifestyle of these people. The main contributions of this paper are as follows. First, we obtained 700 million microblogs from 0.55 million microblog users as our dataset. Then, we detected the likely sleep start and end times of each AMU with a novel pattern and algorithm. Finally, we designed an evaluation system that scores each AMU's sleep quality. In our experiments, we compared the sleep quality of AMUs in different cities of China and, using LDA, found differences in topics between the high- and low-score groups.
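The sleep start/end detection can be approximated naively: the longest gap between a user's posting times over the 24-hour cycle is a plausible sleep window (a stand-in for the paper's pattern and algorithm, not its actual method):

```python
def sleep_window(post_hours):
    """Guess a user's sleep window as the longest gap between
    consecutive posting hours on a 24-hour cycle. Returns the hour the
    gap starts and its length. Naive illustration only."""
    hs = sorted(post_hours)
    # wrap-around gap: last post of the day to first post of the next
    best_start, best_len = hs[-1], (hs[0] + 24) - hs[-1]
    for a, b in zip(hs, hs[1:]):
        if b - a > best_len:
            best_start, best_len = a, b - a
    return best_start, best_len
```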


Journal of Computer Science and Technology | 2017

Relation Enhanced Neural Model for Type Classification of Entity Mentions with a Fine-Grained Taxonomy

Kai-Yuan Cui; Pengjie Ren; Zhumin Chen; Tao Lian; Jun Ma

Inferring the semantic types of entity mentions in a sentence is a necessary yet challenging task. Most existing methods employ a very coarse-grained type taxonomy, which is too general and not exact enough for many tasks, yet their performance drops sharply when the taxonomy is extended to a fine-grained one with several hundred types. In this paper, we introduce a hybrid neural network model for type classification of entity mentions with a fine-grained taxonomy. There are four components in our model, namely the entity mention component, the context component, the relation component, and the already-known-type component, which extract features from the target entity mention, its context, its relations, and the already known types of the entity mentions in the surrounding context, respectively. The features learned by the four components are concatenated and fed into a softmax layer to predict the type distribution. We carried out extensive experiments to evaluate our proposed model. Experimental results demonstrate that it achieves state-of-the-art performance on the FIGER dataset. Moreover, we extracted larger datasets from Wikipedia and DBpedia. On the larger datasets, our model achieves performance comparable to the state-of-the-art methods with the coarse-grained type taxonomy, but performs much better than those methods with the fine-grained type taxonomy in terms of micro-F1, macro-F1, and weighted-F1.
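At its core, the four-component architecture concatenates feature vectors and applies a softmax layer; a toy sketch (dimensions and parameters are hypothetical, and each component's features would come from its own learned sub-network):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def predict_type_dist(mention_f, context_f, relation_f, known_type_f, W, b):
    """Concatenate the four components' feature vectors, then apply a
    linear layer (W, b stand for learned parameters) followed by softmax
    to obtain a distribution over the fine-grained types."""
    x = mention_f + context_f + relation_f + known_type_f
    logits = [sum(wij * xj for wij, xj in zip(row, x)) + bi
              for row, bi in zip(W, b)]
    return softmax(logits)
```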


NLPCC | 2014

Social Media as Sensor in Real World: Geolocate User with Microblog

Xueqin Sui; Zhumin Chen; Kai Wu; Pengjie Ren; Jun Ma; Fengyu Zhou

People exist in two dimensions in the real world: time and space. Detecting users' locations automatically is significant for many location-based applications such as dietary recommendation and tourism planning. With the rapid development of social media such as Sina Weibo and Twitter, more and more people publish messages at any time that contain real-time location information, which makes it possible to detect users' locations automatically from social media. In this paper, we propose a method to detect a user's city-level location based only on his/her published posts. Our approach combines two components: a Chinese location library and a model based on word distributions over locations. The former matches location names mentioned in a post; the latter mines the location information implied by the non-location words in the post. Furthermore, for a user's detected location sequence, we consider the transfer speed between adjacent locations to smooth the sequence in context. Experiments on a real dataset from Sina Weibo demonstrate that our approach significantly outperforms baseline methods in terms of Precision, Recall, and F1.
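The two components can be sketched as a two-stage procedure: match against a location library first, then fall back to scoring cities by word distributions (a naive Bayes-style simplification of the paper's model; all names below are ours):

```python
import math

def detect_city(post_words, location_library, word_given_city, cities):
    """Two-stage city-level detection for one post.
    Stage 1: direct match of post words against a location-name library
             (a dict mapping surface names to canonical cities).
    Stage 2: otherwise, score each city by the log-probability of the
             post's words under that city's word distribution."""
    for w in post_words:
        if w in location_library:
            return location_library[w]
    best, best_score = None, float("-inf")
    for c in cities:
        # unseen (word, city) pairs get a small smoothing probability
        score = sum(math.log(word_given_city.get((w, c), 1e-6))
                    for w in post_words)
        if score > best_score:
            best, best_score = c, score
    return best
```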


Conference on Information and Knowledge Management | 2018

An Attentive Interaction Network for Context-aware Recommendations

Lei Mei; Pengjie Ren; Zhumin Chen; Liqiang Nie; Jun Ma; Jian-Yun Nie

Context-aware Recommendations (CARS) have attracted a lot of attention recently because of the impact of contextual information on user behavior. Recent state-of-the-art methods represent the relations between users/items and contexts as a tensor, with which it is difficult to distinguish the impacts of different contextual factors and to model complex, non-linear interactions between contexts and users/items. In this paper, we propose a novel neural model, the Attentive Interaction Network (AIN), to enhance CARS by adaptively capturing the interactions between contexts and users/items. Specifically, AIN contains an Interaction-Centric Module that captures the interaction effects of contexts on users/items, and a User-Centric Module and an Item-Centric Module that model how the interaction effects influence the user and item representations, respectively. The user and item representations under interaction effects are combined to predict the recommendation scores. We further employ an effect-level attention mechanism to aggregate multiple interaction effects. Extensive experiments on two rating datasets and one ranking dataset show that the proposed AIN outperforms state-of-the-art CARS methods. In addition, we find that AIN provides recommendations that are better explained with respect to contexts than those of existing approaches.
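The effect-level attention idea can be sketched in miniature (the elementwise-product interaction and the scoring vector are simplifying assumptions of ours, not the paper's exact formulation):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def effect_level_aggregate(user_vec, context_vecs, att_vec):
    """Each contextual factor yields an interaction effect on the user
    representation (modeled here as an elementwise product). The effects
    are then aggregated with attention weights scored by att_vec, so
    influential contextual factors dominate the result."""
    effects = [[u * c for u, c in zip(user_vec, cv)] for cv in context_vecs]
    weights = softmax([dot(e, att_vec) for e in effects])
    dim = len(user_vec)
    return [sum(w * e[i] for w, e in zip(weights, effects))
            for i in range(dim)]
```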


ACM Transactions on Information Systems | 2018

Sentence Relations for Extractive Summarization with Deep Neural Networks

Pengjie Ren; Zhumin Chen; Zhaochun Ren; Furu Wei; Liqiang Nie; Jun Ma; Maarten de Rijke

Sentence regression is a type of extractive summarization that achieves state-of-the-art performance and is commonly used in practical systems. The most challenging task within the sentence regression framework is to identify discriminative features to represent each sentence. In this article, we study the use of sentence relations, e.g., Contextual Sentence Relations (CSR), Title Sentence Relations (TSR), and Query Sentence Relations (QSR), so as to improve the performance of sentence regression. CSR, TSR, and QSR refer to the relations between a main body sentence and its local context, its document title, and a given query, respectively. We propose a deep neural network model, Sentence Relation-based Summarization (SRSum), that consists of five sub-models: PriorSum, CSRSum, TSRSum, QSRSum, and SFSum. PriorSum encodes the latent semantic meaning of a sentence using a bi-gram convolutional neural network. SFSum encodes the surface information of a sentence, e.g., sentence length, sentence position, and so on. CSRSum, TSRSum, and QSRSum are three sentence relation sub-models corresponding to CSR, TSR, and QSR, respectively. CSRSum evaluates the ability of each sentence to summarize its local context; it applies a CSR-based word-level and sentence-level attention mechanism to simulate the context-aware reading of a human reader, in which words and sentences that have anaphoric relations or local summarization abilities are easily remembered and attended to. TSRSum evaluates the semantic closeness of each sentence to its title, which usually reflects the main ideas of a document; it applies a TSR-based attention mechanism to simulate reading with the main idea (title) in mind. QSRSum evaluates the relevance of each sentence to given queries for query-focused summarization; it applies a QSR-based attention mechanism to simulate the attentive reading of a human reader with queries in mind, recognizing which parts of the given queries are more likely answered by a sentence under consideration. Finally, as a whole, SRSum automatically learns useful latent features by jointly learning representations of query sentences, content sentences, and title sentences as well as their relations. We conduct extensive experiments on six benchmark datasets, covering generic multi-document summarization and query-focused multi-document summarization. On both tasks, SRSum achieves comparable or superior performance to state-of-the-art approaches in terms of multiple ROUGE metrics.


European Conference on Information Retrieval | 2016

Supervised Local Contexts Aggregation for Effective Session Search

Zhiwei Zhang; Jingang Wang; Tao Wu; Pengjie Ren; Zhumin Chen; Luo Si

Existing research on Web search has mainly focused on the optimization and evaluation of single queries. However, in complex search tasks, users usually need to interact with the search engine multiple times before their needs are satisfied, a process known as session search. The key to this problem is how to use the session context from preceding interactions to improve search accuracy for the current query. Unfortunately, existing research on this topic offers only limited modeling of session contexts, which in fact can exhibit considerable variation. In this paper, we propose Supervised Local Context Aggregation (SLCA) as a principled framework for complex session context modeling. In SLCA, the global session context is formulated as a combination of local contexts between consecutive interactions. These local contexts are further weighted by multiple weighting hypotheses. Finally, supervised ranking aggregation is adopted for effective optimization. Extensive experiments on the TREC 2011/2012 Session tracks show that the proposed SLCA algorithm outperforms many other session search methods and achieves state-of-the-art results.
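One plausible reading of the SLCA formulation, sketched under our own assumptions about the weighting hypotheses (the real framework learns the hypothesis coefficients via supervised ranking aggregation):

```python
def uniform_weights(n):
    """Hypothesis: every local context matters equally."""
    return [1.0 / n] * n

def recency_weights(n):
    """Hypothesis: later interactions matter more."""
    total = n * (n + 1) / 2
    return [(i + 1) / total for i in range(n)]

def slca_global_context(local_contexts, hypotheses, coefficients):
    """Global session context as a weighted combination of the local
    contexts between consecutive interactions. Each weighting hypothesis
    assigns per-context weights; the hypotheses are blended by learned
    coefficients, and the blended weights combine the context vectors."""
    n = len(local_contexts)
    combined = [0.0] * n
    for hyp, coef in zip(hypotheses, coefficients):
        for i, w in enumerate(hyp(n)):
            combined[i] += coef * w
    dim = len(local_contexts[0])
    return [sum(w * lc[i] for w, lc in zip(combined, local_contexts))
            for i in range(dim)]
```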

Collaboration


Dive into Pengjie Ren's collaborations. Top co-authors:

Jun Ma (Shandong University)

Zhaochun Ren (University of Amsterdam)

Kai Wu (Shandong University)

Shuaiqiang Wang (University of Jyväskylä)