
Zhendong Niu

Beijing Institute of Technology


Publication

Featured research published by Zhendong Niu.

International Conference on Data Mining | 2012

Fine-grained Product Features Extraction and Categorization in Reviews Opinion Mining

Sheng Huang; Xinlan Liu; Xueping Peng; Zhendong Niu

With the growth of user-generated content on the Web, opinion mining of product reviews is increasingly becoming a research practice of great value to e-commerce, search and recommendation. Unfortunately, the number of reviews can reach hundreds or even thousands, especially for popular items, which makes it laborious for potential buyers and manufacturers to read through them and make a wise decision. Moreover, the free format and uncertain expressions of reviews make fine-grained product feature extraction and categorization more difficult than traditional information extraction. In this work, we propose to treat product feature extraction as a sequence labeling task and employ a discriminative learning model using Conditional Random Fields (CRFs) to tackle it. We incorporate part-of-speech features and sentence structure features into the CRF learning process. For product feature categorization, we introduce semantic knowledge-based and distributional context-based similarity measures to calculate the similarities between product feature expressions; we then propose an effective graph pruning-based categorization algorithm to classify the collection of feature expressions into different semantic groups. Empirical studies demonstrate the effectiveness and efficiency of our approaches compared with counterpart methods.

Journal of Computers | 2012

Personalized web search using clickthrough data and web page rating

Xueping Peng; Zhendong Niu; Sheng Huang; Yumin Zhao

Personalized Web search carries out retrieval for each user according to his/her interests. We propose a novel technique to construct a personalized information retrieval model from users’ clickthrough data and Web page ratings. The model builds on user-based collaborative filtering and a top-N resource recommendation algorithm, and consists of three parts: the user profile, user-based collaborative filtering, and the personalized search model. First, we compute a preference score for the user from the click-sequence score and Web page ratings to construct the user profile. Then the model finds users similar to a given user with a user-based collaborative filtering algorithm and calculates a recommendation score for each Web page. Finally, personalized information retrieval is modeled under three cases: rating information from the user himself/herself is used; at least rating information from similar users is used; or no rating information is used. Experimental results indicate that our technique significantly improves search performance.
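
The user-based collaborative filtering step named in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the similarity measure or scoring formula, so the sketch uses cosine similarity and a similarity-weighted average, and the names `cosine_similarity`, `recommend_scores` and the sample `profiles` are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse score dicts (page -> score)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend_scores(target, profiles, k=2):
    """Score pages the target user has not seen, using the k most similar users.

    Assumption: a page's score is the similarity-weighted average of the
    neighbours' preference scores; the paper's own formula is not given here.
    """
    neighbours = sorted(
        ((cosine_similarity(profiles[target], prof), user)
         for user, prof in profiles.items() if user != target),
        reverse=True,
    )[:k]
    totals, weights = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with nothing in common
        for page, score in profiles[user].items():
            if page in profiles[target]:
                continue  # only score pages that are new to the target user
            totals[page] = totals.get(page, 0.0) + sim * score
            weights[page] = weights.get(page, 0.0) + sim
    return {page: totals[page] / weights[page] for page in totals}


# Hypothetical preference scores (e.g. derived from clicks and ratings).
profiles = {
    "alice": {"p1": 5.0, "p2": 3.0},
    "bob": {"p1": 5.0, "p2": 3.0, "p3": 4.0},
    "carol": {"p4": 2.0},
}
result = recommend_scores("alice", profiles)
```

Here only "bob" overlaps with "alice", so the single unseen page "p3" inherits his score; "carol" shares no pages and contributes nothing.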

Web Information Systems Engineering | 2013

CGMF: Coupled group-based matrix factorization for recommender system

Fangfang Li; Guandong Xu; Longbing Cao; Xiaozhong Fan; Zhendong Niu

With the advent of social influence, social recommender systems have become an active research topic for making recommendations based on the ratings of users that have close social relations with the given user. The underlying assumption is that a user’s taste is similar to that of his/her friends in a social network. In fact, users enjoy different groups of items with different preferences. A user may be trusted by his/her friends more on some specific groups than on all groups. Unfortunately, most existing social recommender systems are not able to differentiate a user’s social influence in different groups, resulting in unsatisfactory recommendations. Moreover, most existing systems rely mainly on social relations and overlook the influence of relations between items. In this paper, we propose an innovative coupled group-based matrix factorization model for recommender systems that leverages the user and item groups learned by topic modeling and incorporates couplings between users and items and within users and items. Experiments conducted on publicly available data sets demonstrate the effectiveness of our approach.

International Conference Natural Language Processing | 2011

News topic detection based on hierarchical clustering and named entity

Sheng Huang; Xueping Peng; Zhendong Niu

News topic detection is the process of organizing news story collections and real-time news/broadcast streams into news topics. Unlike traditional text analysis, it is a process of incremental clustering and is generally divided into retrospective topic detection and online topic detection. This paper considers how the features of modern news data have changed over time, and presents a new topic detection strategy based on hierarchical clustering and named entities. The topic detection process is likewise divided into retrospective and online steps, and named entities in the news stories are employed in the topic clustering algorithm. To improve the efficiency and precision of the online step, this paper first clusters the news stories in each time window into micro-clusters, and then extracts three representation vectors for each micro-cluster to calculate its similarity to existing topics. The experimental results show a remarkable improvement compared with the most widely applied recent topic detection method.

International Conference on Multimedia and Information Technology | 2008

Mining Web Access Log for the Personalization Recommendation

Xueping Peng; Yujuan Cao; Zhendong Niu

This paper presents a personalized recommendation model that recommends potentially interesting resources to users based on their Web access logs. The model builds on the Apriori algorithm and the tf-idf technique, and consists of three parts: resource description, user preference extraction, and personalized recommendation. First, the model generates a text vector-space representation of each resource by analyzing the resource information obtained by mining the users' Web access logs. Then it derives the interest set by applying the Apriori algorithm to these vectors. Finally, it recommends filtered and sorted resources to users through a content-based recommendation model.
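
The resource description step relies on tf-idf weighting, which can be sketched as follows. This is one common tf-idf variant (relative term frequency times the natural log of inverse document frequency), chosen here as an assumption; the abstract does not say which variant the paper uses, and the function name `tfidf_vectors` and the sample resources are hypothetical. The Apriori step is not shown.

```python
from collections import Counter
from math import log


def tfidf_vectors(docs):
    """Build a sparse tf-idf vector for each resource.

    docs maps a resource id to its list of tokens.
    """
    n_docs = len(docs)
    # Document frequency: in how many resources does each term occur?
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for rid, tokens in docs.items():
        term_freq = Counter(tokens)
        vectors[rid] = {
            term: (count / len(tokens)) * log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


# Hypothetical resources extracted from an access log.
docs = {
    "r1": ["java", "web"],
    "r2": ["java", "mining"],
}
vectors = tfidf_vectors(docs)
```

A term that appears in every resource ("java" above) gets weight zero, so the vectors emphasize the terms that distinguish one resource from another.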

International Conference on Asian Digital Libraries | 2011

A discretization algorithm of numerical attributes for digital library evaluation based on data mining technology

Yumin Zhao; Zhendong Niu; Xueping Peng; Lin Dai

We present a discretization algorithm for numerical attributes of digital collections. In our research, data mining technology is introduced into digital library evaluation to provide better decision-making support. However, data prediction algorithms do not work well with traditional discretization methods during the data mining process, because the numerical attributes of digital collections are complicated and do not share the same scale of distribution distance. We study the characteristics of numerical attributes and put forward a discretization method based on the Z-score idea from mathematical statistics. This algorithm can reflect the dynamic semantic distance for different numerical attributes and significantly enhances the precision and recall of data prediction algorithms. Furthermore, a nonlinear conditional relationship among attributes of digital collections was discovered during the study of the discretization algorithm; this relationship affects the results of traditional data mining algorithms in actual applications.
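
The Z-score idea behind this discretization can be sketched in a few lines of Python. This is not the paper's algorithm, whose details the abstract does not give; it only shows the general principle of standardizing an attribute before binning so that attributes on different scales receive comparable bins. The function name `zscore_discretize` and the cut points (-1, 0, 1) are assumptions.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def zscore_discretize(values, cuts=(-1.0, 0.0, 1.0)):
    """Map each value to a bin index based on its Z-score.

    cuts are Z-score thresholds in ascending order (hypothetical defaults);
    a value with Z-score z falls in bin bisect_right(cuts, z).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every Z-score is treated as 0.
        return [bisect_right(cuts, 0.0)] * len(values)
    return [bisect_right(cuts, (v - mu) / sigma) for v in values]


# Two hypothetical attributes on very different scales get the same bins.
small_scale = zscore_discretize([1, 2, 3, 4, 5])
large_scale = zscore_discretize([100, 200, 300, 400, 500])
```

Because the bins are defined in units of standard deviation, rescaling an attribute does not change the bin each value lands in, unlike equal-width binning on raw values.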

International Conference on Computer Engineering and Technology | 2010

Search with index replication in power-law like peer-to-peer networks

Kun Zhao; Zhendong Niu; Yumin Zhao; Jun Yang

Many unstructured peer-to-peer applications exhibit characteristics of complex networks, such as a power-law degree distribution. Motivated by the fact that high-degree nodes are well connected to each other, we design a novel cluster-based search protocol that takes advantage of cluster-based index replication. The search success rate is improved by one order of magnitude, and the index storage cost is reduced by almost one order of magnitude as well. We also study the search performance through a theoretical model and give the mathematical relationship between search performance and the cluster threshold c. We further evaluate the cluster-based techniques with simulator-based experiments, and the results confirm our mathematical analysis.

Computer and Information Technology | 2008

The study on Detecting Near-Duplicate WebPages

Yujuan Cao; Zhendong Niu; Weiqiang Wang; Kun Zhao

Reprinting information among websites produces a great deal of redundant WebPages. To improve search efficiency and user satisfaction, an algorithm to Detect near-Duplicate WebPages (DDW) is proposed. In the course of developing a near-duplicate detection system for a multi-billion page repository, we make two research contributions. First, we consider both syntactic and semantic information to represent documents and compute their similarities. Second, after classifying WebPages into different categories, we index the features in each category and then search for near-duplicates only within the same category. From Google search results for 72 queries, we manually select 5835 near-duplicate WebPages and insert them into an existing collection of about 768,763 WebPages as the test data. The experimental results demonstrate that our approach outperforms I-Match algorithms. In a large-scale test, approximately linear time and space complexity is achieved.

Engineering Applications of Artificial Intelligence | 2018

Concept coupling learning for improving concept lattice-based document retrieval

Shufeng Hao; Chongyang Shi; Zhendong Niu; Longbing Cao

The semantic information in any document collection is critical for query understanding in information retrieval. Existing concept lattice-based retrieval systems mainly rely on the partial order relation of formal concepts to index documents. However, the methods used by these systems often ignore the explicit semantic information between the formal concepts extracted from the collection. In this paper, a concept coupling relationship analysis model is proposed to learn and aggregate the intra- and inter-concept coupling relationships. The intra-concept coupling relationship employs the common terms of formal concepts to describe the explicit semantics of formal concepts. The inter-concept coupling relationship adopts the partial order relation of formal concepts to capture the implicit dependency of formal concepts. Based on the concept coupling relationship analysis model, we propose a concept lattice-based retrieval framework. This framework represents user queries and documents in a concept space based on fuzzy formal concept analysis, utilizes a concept lattice as a semantic index to organize documents, and ranks documents with respect to the learned concept coupling relationships. Experiments are performed on the text collections acquired from the SMART information retrieval system. Compared with classic concept lattice-based retrieval methods, our proposed method achieves at least 9%, 8% and 15% improvement in terms of average MAP, IAP@11 and P@10 respectively on all the collections.
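
Two of the metrics reported in this abstract, MAP and P@10, have standard definitions that can be sketched as below. The sketch uses the usual textbook formulas (precision at a cutoff k, and average precision averaged over queries); the paper's exact evaluation protocol is not described in the abstract, and the function names and sample ranking are hypothetical. The interpolated metric IAP@11 is not shown.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranking for one query: relevant documents at ranks 1 and 3.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p_at_2 = precision_at_k(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
```

For this ranking, the precision at the two relevant ranks is 1/1 and 2/3, so the average precision is their mean; P@10 is simply `precision_at_k` with `k=10`.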

International Conference on Information Engineering and Computer Science | 2010

An Study on Personalized Recommendation Model Based on Search Behaviors and Resource Properties

Xueping Peng; Sheng Huang; Zhendong Niu

This paper presents a personalized recommendation model that recommends potentially interesting resources to users based on their search behaviors and on resource properties. The model builds on user-based collaborative filtering and a top-N resource recommendation algorithm, and consists of three parts: user preference description, similar-user calculation, and the resource recommendation model. First, the model derives users' preferences for resources by calculating the relevance score between the query string and the resource, the score of the resource owner, the score of the resource category, and the score of the browse sequence. Then it finds users similar to a given user from the calculated preferences. Finally, it recommends filtered and sorted resources to users based on the top-N resource recommendation model. Our experiments show that this recommendation model is more accurate than a model based purely on users' search behaviors.
