Publication


Featured research published by Tieyun Qian.


Information Sciences | 2007

On the strength of hyperclique patterns for text categorization

Tieyun Qian; Hui Xiong; Yuanzhen Wang; Enhong Chen

The use of association patterns for text categorization has attracted great interest, and a variety of useful methods have been developed. However, the key characteristics of pattern-based text categorization remain unclear. Indeed, there are still no concrete answers to the following two questions. First, what kind of association pattern is the best candidate for pattern-based text categorization? Second, what is the most desirable way to use patterns for text categorization? In this paper, we focus on answering these two questions. More specifically, we show that hyperclique patterns are more desirable than frequent patterns for text categorization. Along this line, we develop an algorithm for text categorization using hyperclique patterns. As demonstrated by our experimental results on various real-world text documents, our method provides much better computational performance than state-of-the-art methods while retaining classification accuracy.
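As a rough illustration of what distinguishes a hyperclique pattern from a merely frequent one, the sketch below computes h-confidence, the standard measure used to define hyperclique patterns: the support of an itemset divided by the largest support of any single item in it. The function names, the toy documents, and the 0.8 threshold are assumptions made for this example; it is not the categorization algorithm from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def h_confidence(itemset, transactions):
    """supp(itemset) divided by the largest single-item support in it."""
    max_single = max(support({item}, transactions) for item in itemset)
    if max_single == 0:
        return 0.0
    return support(itemset, transactions) / max_single


def hyperclique_patterns(transactions, min_hconf, size=2):
    """All itemsets of the given size whose h-confidence meets the threshold."""
    items = sorted(set().union(*transactions))
    return [
        candidate
        for candidate in combinations(items, size)
        if h_confidence(candidate, transactions) >= min_hconf
    ]


# Toy corpus (hypothetical): each document is a set of terms.
docs = [
    {"stock", "market", "the"},
    {"stock", "market", "the"},
    {"goal", "match", "the"},
    {"goal", "match", "the"},
]

patterns = hyperclique_patterns(docs, min_hconf=0.8)
```

In this toy corpus the pair ("stock", "the") is as frequent as ("market", "stock"), but its h-confidence is only 0.5 because "the" occurs in every document, so a threshold of 0.8 filters it out while keeping the tightly coupled pairs.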


Conference on Information and Knowledge Management | 2013

Early prediction on imbalanced multivariate time series

Guoliang He; Yong Duan; Tieyun Qian; Xu Chen

Multivariate time series (MTS) classification is an important topic in time series data mining, and many efficient models and techniques have been introduced to cope with it. However, early classification on imbalanced MTS data largely remains an open problem. To deal with this issue, we adopt a multiple under-sampling and dynamical subspace generation method to obtain initial training sets, and each training set is used to train a base learner. Finally, an ensemble classifier is introduced for early classification on imbalanced MTS data. Experimental results show that our proposed methods can achieve effective early prediction on imbalanced MTS data.
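The multiple under-sampling step can be sketched as follows, assuming it means drawing several random majority-class subsets, each the size of the minority class, so that every base learner sees balanced data. The dynamical subspace generation step and the base learners themselves are not specified in the abstract and are left out; `balanced_subsets`, `majority_vote`, and the toy data are hypothetical names for illustration only.

```python
import random


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Build n_subsets balanced training sets by random under-sampling.

    Each set holds all minority examples plus an equally sized random
    sample (without replacement) of the majority examples.
    """
    if len(majority) < len(minority):
        raise ValueError("majority class must be at least as large as minority")
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(majority, len(minority))
        subsets.append(sampled + list(minority))
    return subsets


def majority_vote(predictions):
    """Combine base-learner predictions by taking the most common label."""
    return max(set(predictions), key=predictions.count)


# Toy data (hypothetical): (label, series_id) pairs stand in for labeled MTS.
majority = [("neg", i) for i in range(20)]
minority = [("pos", i) for i in range(4)]

subsets = balanced_subsets(majority, minority, n_subsets=5)
```

Each of the five subsets would then train one base learner, and `majority_vote` shows one simple way the ensemble could combine their outputs.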


Knowledge Discovery and Data Mining | 2009

Simultaneously Finding Fundamental Articles and New Topics Using a Community Tracking Method

Tieyun Qian; Jaideep Srivastava; Zhiyong Peng; Phillip C.-Y. Sheu

In this paper, we study the relationship between fundamental articles and new topics and present a new method to detect recently formed topics and their typical articles simultaneously. Based on community partition, the proposed method first identifies the emergence of a new theme by tracking the change of the community where the top cited nodes lie. Next, the paper with a high citation number belonging to this new topic is recognized as a fundamental article. Experimental results on a real dataset show that our method can detect new topics with only a subset of data in a timely manner, and the identified papers for these topics are found to have a long lifespan and keep receiving citations in the future.


World Wide Web | 2014

Users' interest grouping from online reviews based on topic frequency and order

Jianfeng Si; Qing Li; Tieyun Qian; Xiaotie Deng

Large volumes of online review data can reveal consumers’ major interests in products of a given domain, which has attracted great research interest from the academic community. Most existing work focuses on review summarization, aspect identification, or opinion mining from an item’s point of view, such as the quality or popularity of products. Considering that the users who write those reviews pay different amounts of attention to product aspects according to their own interests, in this article we aim to learn K user interest groups indicated by their review writings. Identifying these K interest groups can facilitate a better understanding of major and potential consumers’ concerns, which is crucial for applications such as customer-oriented product design or diversified marketing strategies. Instead of using a traditional text clustering approach, we treat the groupId/clusterId as a hidden variable and use a permutation-based structural topic model called KMM. Through this model, we infer the distribution of the K interest groups by discovering not only the frequency of product aspects (Topic Frequency), but also the occurrence priority of the respective aspects (Topic Order). Together they present an informative summarization of the raw review corpus. Our experiments on several real-world review datasets demonstrate a competitive solution.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2014

Co-training on authorship attribution with very few labeled examples: methods vs. views

Tieyun Qian; Bing Liu; Ming Zhong; Guoliang He

Authorship attribution (AA) aims to identify the authors of a set of documents. Traditional studies in this area often assume that a large set of labeled documents is available for training. However, in real life, it is hard or expensive to collect a large set of labeled data. For example, in the online review domain, most reviewers (authors) only write a few reviews, which are not enough to serve as training data for accurate classification. In this paper, we present a novel two-view co-training framework that iteratively identifies the authors of a few unlabeled documents to augment the training set. The key idea is to first represent each document as several distinct views, and then to adopt a co-training technique to exploit the large amount of unlabeled documents. Starting from 10 training texts per author, we systematically evaluate the effectiveness of co-training for authorship attribution with limited labeled data. Two methods and three views are investigated: logistic regression (LR) and support vector machines (SVM) as methods, and character, lexical, and syntactic views. The experimental results show that LR is particularly effective for improving co-training in AA, and the lexical view performs the best among the three views when combined with an LR classifier. Furthermore, the co-training framework does not make much difference between one classifier from two views and two classifiers from one view. Instead, it is the learning approach and the view that play a critical role.


International Conference on Tools with Artificial Intelligence | 2015

Active Learning for Multivariate Time Series Classification with Positive Unlabeled Data

Guoliang He; Yong Duan; Yifei Li; Tieyun Qian; Jinrong He; Xiangyang Jia

Traditional time series classification with supervised learning algorithms needs a large set of labeled training data. In reality, the amount of labeled data is often small, while there is a huge amount of unlabeled data. However, manually labeling these unlabeled examples is time-consuming and expensive, and sometimes even impossible. Although some semi-supervised and active learning methods have been proposed to handle univariate time series data, few works have addressed positive and unlabeled data for multivariate time series (MTS) classification because the data are more complex. In this paper, we focus on active learning for multivariate time series classification with positive unlabeled data. First, we propose a sample selection strategy to find the most informative unlabeled examples for manual labeling. Second, we introduce two active learning approaches to obtain a high-confidence training dataset for classification. Experiments on real datasets demonstrate the validity of our proposed approaches.


Web-Age Information Management | 2015

Coherent Topic Hierarchy: A Strategy for Topic Evolutionary Analysis on Microblog Feeds

Jiahui Zhu; Xuhui Li; Min Peng; Jiajia Huang; Tieyun Qian; Jimin Huang; Jiping Liu; Ri Hong; Pinglan Liu

Topic evolutionary analysis on microblog feeds can help reveal users’ interests and public concerns from a global perspective. However, it is not easy to capture the evolutionary patterns, since semantic coherence is usually difficult to express and the timeline structure is hard to organize. In this paper, we propose a novel strategy in which a coherent topic hierarchy is designed to deal with these challenges. First, we incorporate the sparse biterm topic model to extract coherent topics from microblog feeds. Then the topology of these topics is constructed by the basic Bayesian rose tree combined with topic similarity. Finally, we devise a cross-tree random walk with restart model to bond each pair of sequential trees into a timeline hierarchy. Experimental results on microblog datasets demonstrate that the coherent topic hierarchy is capable of providing meaningful topic interpretations, achieving high clustering performance, and presenting motivated patterns for topic evolutionary analysis.
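For readers unfamiliar with the final step, here is a minimal sketch of a generic random walk with restart on a small weighted graph. The abstract does not specify the cross-tree variant, so this shows only the standard technique; the graph, the node names, the restart probability, and the handling of dangling nodes are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Proximity of every node to `seed` under a random walk with restart.

    adj maps each node to a dict {neighbor: weight}; every neighbor must
    also appear as a key. At each step the walker jumps back to `seed`
    with probability `restart`, otherwise follows an edge chosen in
    proportion to its weight.
    """
    nodes = list(adj)
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0
    for _ in range(max_iter):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                # Assumption: a node with no out-edges sends its mass to the seed.
                nxt[seed] += (1 - restart) * scores[u]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * scores[u] * w / total
        if sum(abs(nxt[n] - scores[n]) for n in nodes) < tol:
            return nxt
        scores = nxt
    return scores


# Toy chain of topics (hypothetical): t1 - t2 - t3 - t4 with unit weights.
topics = {
    "t1": {"t2": 1.0},
    "t2": {"t1": 1.0, "t3": 1.0},
    "t3": {"t2": 1.0, "t4": 1.0},
    "t4": {"t3": 1.0},
}

scores = random_walk_with_restart(topics, seed="t1", restart=0.5)
```

The resulting scores form a probability distribution over nodes that decays with distance from the seed, which is the kind of proximity signal that could be used to link a topic in one tree to its closest counterparts in the next.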


World Wide Web | 2014

Topic formation and development: a core-group evolving process

Tieyun Qian; Qing Li; Bing Liu; Hui Xiong; Jaideep Srivastava; Phillip C.-Y. Sheu

Recent years have witnessed increased interest in topic detection and tracking (TDT). However, existing work mainly focuses on overall trend analysis and is not designed for understanding the evolving process of topics. To this end, this paper aims to reveal the underlying process and reasons for topic formation and development (TFD). Along this line, based on community partitioning in social networks, a core-group model is proposed to explain the dynamics and to segment topic development. This model is inspired by the cell division mechanism in biology. Furthermore, according to the division phase and interphase in the life cycle of a core group, a topic is separated into four states: birth, extending, saturation, and shrinkage. In this paper, we mainly focus our studies on scientific topic formation and development using the citation network structure among scientific papers. Experimental results on two real-world data sets show that the division of a core group brings about the generation of a new scientific topic. The results also reveal that the progress of an entire scientific topic is closely correlated with the growth of a core group during its interphase. Finally, we demonstrate the effectiveness of the proposed method in several real-life scenarios.


World Wide Web | 2014

Exploiting small world property for network clustering

Tieyun Qian; Qing Li; Jaideep Srivastava; Zhiyong Peng; Yang Yang; Shuo Wang

Graph partitioning is a traditional problem with many applications, and a number of high-quality algorithms have been developed. Recently, demand for social network analysis has aroused new research interest in graph partitioning/clustering. Social networks differ from conventional graphs in that they exhibit key properties such as the power-law and small-world properties. Currently, these features are largely neglected in popular partitioning algorithms. In this paper, we present a novel framework which leverages the small-world property for finding clusters in social networks. The framework consists of several key features. Firstly, we define a total order, which combines the edge weight, the small-world weight, and the hub value, to better reflect the connection strength between two vertices. Secondly, we design a strategy using this ordered list to greedily, yet effectively, refine existing partitioning algorithms for common objective functions. Thirdly, the proposed method is independent of the original approach, so it can be integrated with any type of existing graph clustering algorithm. We conduct an extensive performance study on both real-life and synthetic datasets. The empirical results clearly demonstrate that our framework significantly improves the output of the state-of-the-art methods. Furthermore, we show that the proposed method returns clusters of higher internal and external quality.


Chinese National Conference on Social Media Processing | 2016

Social Spammer Detection via Structural Properties in Ego Network

Baochao Zhang; Tieyun Qian; Yiqi Chen; Zhenni You

Social media have become popular communication platforms in recent years. A huge number of users disseminate and share information on these websites. Due to their popularity, social media have attracted numerous malicious users (spammers) who send spam, spread malware, and run phishing scams. It is highly desirable to automatically distinguish legitimate users from spammers. Existing approaches mainly use behavior, content, or profile information as features to characterize social spammers. However, to avoid being caught by the websites, spammers sometimes pretend to post normal messages and continuously change their behavior. This makes the behavior-based and content-based approaches less effective.

Collaboration


Dive into Tieyun Qian's collaboration.

Top Co-Authors


Qing Li

City University of Hong Kong


Jaideep Srivastava

Qatar Computing Research Institute


Enhong Chen

University of Science and Technology of China
