Min Peng | Researchain

Publication

Featured research published by Min Peng.

World Wide Web | 2017

A probabilistic method for emerging topic tracking in Microblog stream

Jiajia Huang; Min Peng; Hua Wang; Jinli Cao; Wang Gao; Xiuzhen Zhang

Microblogs are a popular and open platform for discovering and sharing the latest news about social issues and daily life. Because microblog streams are updated so quickly, effective tools for monitoring them are urgently needed. Emerging topic tracking is one such tool: it reveals which new events are currently attracting the most online attention. However, because microblog feeds change quickly, are noisy, and are short, two challenges must be addressed in emerging topic tracking. One is detecting emerging topics early, long before they become hot, and the other is effectively monitoring evolving topics over time. In this study, we propose a novel emerging topic tracking method, which aligns emerging word detection from a temporal perspective with coherent topic mining from a spatial perspective. Specifically, we first design a metric to estimate word novelty and fading based on locally weighted linear regression (LWLR), which highlights the novelty of words expressing an emerging topic and suppresses the novelty of words expressing an existing topic. We then track emerging topics by leveraging topic novelty and fading probabilities, which are learnt by designing and solving an optimization problem. We evaluate our method on a microblog stream containing over one million feeds. Experimental results show that the proposed method performs promisingly, in both effectiveness and efficiency, at detecting emerging topics and tracking topic evolution over time.
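
To make the LWLR step more concrete, here is a minimal sketch of locally weighted linear regression applied to a word's frequency over time slots. It only illustrates the regression itself; the abstract does not give the paper's actual novelty and fading metric, so the function name `lwlr_fit`, the Gaussian kernel, the bandwidth `tau`, and the sample counts are all assumptions made for illustration.

```python
import math

def lwlr_fit(xs, ys, x0, tau):
    """Fit a line around x0 using Gaussian-kernel weights.

    Returns (slope, intercept) of the weighted least-squares line.
    Points close to x0 dominate the fit; tau controls how local it is.
    """
    weights = [math.exp(-((x - x0) ** 2) / (2.0 * tau ** 2)) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0.0:
        # All weight sits on one x value, so no slope can be estimated.
        return 0.0, mean_y
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical per-slot counts for two words (illustrative data only).
slots = [0, 1, 2, 3, 4, 5]
rising = [1, 1, 2, 4, 8, 15]   # a word whose usage is accelerating
steady = [5, 5, 5, 5, 5, 5]    # a word with constant usage

rising_slope, _ = lwlr_fit(slots, rising, x0=5, tau=1.5)
steady_slope, _ = lwlr_fit(slots, steady, x0=5, tau=1.5)
```

Under these assumptions, a positive local slope at the most recent slot would suggest a word that is gaining attention, while a slope near zero would suggest a word tied to an already established topic.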

World Wide Web | 2018

Personalized app recommendation based on app permissions

Min Peng; Guanyin Zeng; Zhaoyu Sun; Jiajia Huang; Hua Wang; Gang Tian

With the development of science and technology, the popularity of smartphones has driven exponential growth in the mobile application market. How to help users select the applications they prefer has become a hot topic in recommendation research. Because traditional recommendation algorithms are based on popularity and download counts, they often fail to recommend desirable applications. At the same time, many users pay close attention to the permissions that applications request, for privacy and security reasons. Few recommendation algorithms take apps’ permissions, functionalities and users’ interests into account together. Some consider only permissions while neglecting users’ interests; others simply perform a linear combination of apps’ permissions, functionalities and users’ interests to implement top-N recommendation. In this paper, we devise a recommendation method based on both permissions and functionalities. After demonstrating the correlation between apps’ permissions and users’ interests, we design an app risk score calculation method, ARSM, based on an app-permission bipartite graph model. Furthermore, we propose a novel matrix factorization algorithm, MFPF, based on users’ interests, apps’ permissions and functionalities, to handle personalized app recommendation. We compare our work with several state-of-the-art recommendation algorithms, and the results indicate that it improves recommendation accuracy remarkably.

Conference on Information and Knowledge Management | 2015

Topic Detection from Large Scale of Microblog Stream with High Utility Pattern Clustering

Jiajia Huang; Min Peng; Hua Wang

With the popularity of social media, detecting topics from microblog streams has become an increasingly important task. However, it is challenging because microblog streams are high-dimensional, short, noisy, fast-changing, and huge in volume. In this paper, we propose a high utility pattern clustering (HUPC) framework over microblog streams. This framework first extracts a group of representative patterns from the microblog stream, and then groups these patterns into topic clusters. This approach works well on large-scale microblog streams because it clusters the patterns that better describe topics, rather than clustering noise and microblogs directly. Furthermore, the proposed framework can detect coherent topics and newly emerging topics simultaneously. Extensive experimental results on Twitter streams and Sina Weibo streams show that the developed method achieves better performance than other existing topic detection methods, making it a desirable solution for detecting events from microblog streams.

Web Information Systems Engineering | 2013

High Quality Microblog Extraction Based on Multiple Features Fusion and Time-Frequency Transformation

Min Peng; Jiajia Huang; Hui Fu; Jiahui Zhu; Li Zhou; Yanxiang He; Fei Li

Online social media carries a massive number of messages related to social events. Some of them contain useful and meaningful information, while others might not be worth reading. In this paper, for a given social event, we focus on extracting high quality information from massive social media messages, that is, information that has valuable textual content, is widely propagated, and is posted by authoritative users. We propose an extraction framework that obtains high quality information by considering different social media features globally. Specifically, in order to reduce computing time and improve extraction precision, some important social media features are transformed into the wavelet domain and then fused to obtain a weighted ensemble value. A large-scale Sina microblog dataset is used to evaluate the framework’s performance. Experimental results show that the proposed framework is effective at extracting high quality information.

Conference on Information and Knowledge Management | 2015

Central Topic Model for Event-oriented Topics Mining in Microblog Stream

Min Peng; Jiahui Zhu; Xuhui Li; Jiajia Huang; Hua Wang; Yanchun Zhang

In microblog services, data is generated and arrives as a stream that propagates discussions of public events. Discovering event-oriented topics from the stream leads to a better understanding of changes in public concern. However, because of the massive scale of the data stream, traditional static topic models, such as LDA, are no longer suitable for topic detection and tracking tasks. In this paper, we propose a central topic model (CenTM), in which a Multi-view Clustering algorithm with Two-phase Random Walk (MC-TRW) is devised to aggregate the latent topics of LDA into central topics. Furthermore, we alternate the aggregation of central topics by MC-TRW with sequential topic inference to improve scalability in the streaming setting, thereby deriving the dynamic central topic model (DCenTM). Specifically, our model is able to uncover the intrinsic characteristics of the central topics and predict the trend of their intensity along a life cycle. Experimental results demonstrate that the proposed central topic model is event-oriented and generalizes well; it can therefore handle topic trend prediction effectively and precisely in massive data streams.

ACM Transactions on Knowledge Discovery From Data | 2018

Mining Event-Oriented Topics in Microblog Stream with Unsupervised Multi-View Hierarchical Embedding

Min Peng; Jiahui Zhu; Hua Wang; Xuhui Li; Yanchun Zhang; Xiuzhen Zhang; Gang Tian

This article presents an unsupervised multi-view hierarchical embedding (UMHE) framework to sufficiently reveal the intrinsic topical knowledge in social events. Event-oriented topics are highly related to such events, as they can provide explicit descriptions of what has happened in the social community. In many real-world cases, however, it is difficult to include all attributes of microblogs; more often, only the textual aspects are available. Traditional topic modelling methods have failed to generate event-oriented topics from the textual aspects alone, since these methods often overlook the inherent relations between topics. Meanwhile, metrics in the original word vocabulary space might not effectively capture semantic distances. Our UMHE framework overcomes this severe information deficiency and poor feature representation. UMHE first develops a multi-view Bayesian rose tree to preliminarily generate prior knowledge about latent topics and their relations. With such prior knowledge, we design an unsupervised translation-based hierarchical embedding method to better represent these latent topics. By applying self-adaptive spectral clustering on the embedding space and the original space concomitantly, we eventually extract event-oriented topics as word distributions to express social events. Our framework is purely data-driven and unsupervised, without any external knowledge. Experimental results on the TREC Tweets2011 dataset and a Sina Weibo dataset demonstrate that the UMHE framework can not only construct a hierarchical structure with high fitness but also yield topic embeddings with salient semantics; therefore, it can derive event-oriented topics with meaningful descriptions.

ACM Transactions on Information Systems | 2017

Parallelization of Massive Textstream Compression Based on Compressed Sensing

Min Peng; Wang Gao; Hua Wang; Yanchun Zhang; Jiajia Huang; Qianqian Xie; Gang Hu; Gang Tian

Compressing textstreams generated by social networks can both reduce storage consumption and improve the efficiency of operations such as searching. However, the compression process is challenging because of the large scale of textstreams. In this article, we propose a textstream compression framework based on compressed sensing theory and design a series of matching parallel procedures. The new approach uses a linear projection technique in the textstream compression process, achieving fast compression speed and a low compression ratio. Elaborate parallel procedures are designed so that both compression and decompression of large-scale textstreams run efficiently. Decompression is implemented by finding approximate solutions of underdetermined linear systems. Experimental results show that the new method can efficiently accomplish the compression and decompression tasks on a large amount of text generated by social networks.
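
As a rough illustration of the two ideas named in this abstract, the sketch below compresses a sparse vector with a random linear projection and then approximates a solution of the resulting underdetermined system with plain matching pursuit. The abstract does not say which projection matrix or solver the article uses, so the Gaussian matrix, the matching pursuit solver, and all function names here are assumptions chosen for brevity, and the code is sequential rather than parallel.

```python
import random

def gaussian_matrix(m, n, seed=0):
    """Random m x n projection matrix with N(0, 1/m) entries (assumed choice)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]

def project(phi, x):
    """Compression step: y = phi * x, mapping n values to m < n values."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

def matching_pursuit(phi, y, steps):
    """Greedy approximate solution of phi * x = y for a sparse x.

    Each step picks the column most correlated with the residual and
    moves the matching coefficient, so the residual norm never grows.
    """
    m, n = len(phi), len(phi[0])
    cols = [[phi[i][j] for i in range(m)] for j in range(n)]
    norms = [sum(c * c for c in col) for col in cols]
    x_hat = [0.0] * n
    residual = list(y)
    for _ in range(steps):
        corr = [sum(c * r for c, r in zip(col, residual)) for col in cols]
        j = max(range(n), key=lambda k: abs(corr[k]) / norms[k] ** 0.5)
        step = corr[j] / norms[j]
        x_hat[j] += step
        residual = [r - step * c for r, c in zip(residual, cols[j])]
    return x_hat, residual

def l2_norm(v):
    return sum(a * a for a in v) ** 0.5

# Hypothetical sparse term-weight vector: 20 terms, 2 of them non-zero.
x = [0.0] * 20
x[3], x[11] = 2.0, -1.0
phi = gaussian_matrix(8, 20, seed=1)
y = project(phi, x)                      # 20 values compressed to 8
x_hat, residual = matching_pursuit(phi, y, steps=10)
```

Matching pursuit is used here only because it fits in a few lines; how closely `x_hat` matches `x` depends on the sparsity of the input and on the number of measurements.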

Web-Age Information Management | 2015

Coherent Topic Hierarchy: A Strategy for Topic Evolutionary Analysis on Microblog Feeds

Jiahui Zhu; Xuhui Li; Min Peng; Jiajia Huang; Tieyun Qian; Jimin Huang; Jiping Liu; Ri Hong; Pinglan Liu

Topic evolutionary analysis on microblog feeds can help reveal users’ interests and public concerns from a global perspective. However, it is not easy to capture the evolutionary patterns, since semantic coherence is usually difficult to express and the timeline structure is hard to organize. In this paper, we propose a novel strategy in which a coherent topic hierarchy is designed to deal with these challenges. First, we incorporate the sparse biterm topic model to extract coherent topics from microblog feeds. Then the topology of these topics is constructed by the basic Bayesian rose tree combined with topic similarity. Finally, we devise a cross-tree random walk with restart model to bond each pair of sequential trees into a timeline hierarchy. Experimental results on microblog datasets demonstrate that the coherent topic hierarchy is capable of providing meaningful topic interpretations, achieving high clustering performance, and presenting motivated patterns for topic evolutionary analysis.
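
For readers unfamiliar with random walk with restart, the following minimal sketch shows the standard single-graph form, iterating r = (1 - c) W r + c e until it settles. The cross-tree variant devised in the paper is not described in the abstract, so this is only the generic technique; the function name, the restart probability, and the toy graph are assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=100):
    """Relevance of every node to `seed` on a weighted graph.

    adj[i][j] is the weight of the edge from node j to node i.
    Columns are normalized so that each step redistributes probability,
    and with probability `restart` the walker jumps back to the seed.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    scores = list(start)
    for _ in range(iters):
        scores = [
            (1.0 - restart) * sum(
                adj[i][j] / col_sums[j] * scores[j]
                for j in range(n) if col_sums[j] > 0
            ) + restart * start[i]
            for i in range(n)
        ]
    return scores

# Toy path graph 0 - 1 - 2 - 3 (symmetric adjacency, illustrative only).
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path, seed=0)
```

In a topic-hierarchy setting, the nodes could stand for topics in two consecutive trees and the scores for how strongly each later topic relates to a given earlier one, though that mapping is an interpretation rather than something the abstract states.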

Computer Supported Cooperative Work in Design | 2008

Authorization approaches for advanced permission-role assignments

Hua Wang; Jianming Yong; Jiuyong Li; Min Peng

Role-based access control (RBAC) has been proven to be a flexible and useful access control model for information sharing in distributed collaborative environments. Permission-role assignment (PRA) is one important process in the access model. However, problems may arise during the procedures of PRA: conflicting permissions may be assigned to one role, and as a result, the role with those permissions can derive unexpected access capabilities. This paper aims to analyze the problems that arise during permission-role assignment in distributed collaborative environments and to develop authorization allocation algorithms that address them. The algorithms are extended to the case of PRA with mobility of the permission-role relationship. Finally, comparisons with other related work are discussed to demonstrate the effectiveness of the paper's approach.
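
The conflict problem described in this abstract can be pictured with a small sketch: before a permission is granted to a role, it is checked against the permissions the role already holds. This is not the paper's authorization allocation algorithm, which the abstract does not spell out; the function names, the pairwise conflict set, and the example permissions are hypothetical.

```python
def can_grant(role_perms, new_perm, conflicts):
    """True if new_perm conflicts with none of the role's current permissions."""
    return all(frozenset((new_perm, held)) not in conflicts
               for held in role_perms)

def grant(roles, role, perm, conflicts):
    """Assign perm to role only when no declared conflict would result.

    roles maps a role name to its set of permissions and is updated in place.
    Returns True when the permission was granted, False when it was refused.
    """
    current = roles.setdefault(role, set())
    if not can_grant(current, perm, conflicts):
        return False
    current.add(perm)
    return True

# Hypothetical conflict: one role must not both issue and approve payments.
conflicts = {frozenset(("issue_payment", "approve_payment"))}
roles = {}
first = grant(roles, "clerk", "issue_payment", conflicts)
second = grant(roles, "clerk", "approve_payment", conflicts)
```

Storing each conflict as an unordered pair keeps the check symmetric, so the order in which the two permissions are requested does not matter.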

Knowledge and Information Systems | 2017

Dynamic sampling of text streams and its application in text analysis

Gang Tian; Jiajia Huang; Min Peng; Jiahui Zhu; Yanchun Zhang

A large number of texts are rapidly generated as streaming data in social media. Since it is difficult to process such text streams with limited memory in real time, researchers are resorting to text stream compression and sampling to obtain a small portion of valuable information from the streams. In this study, we investigate the crucial question of how to use less memory space to store more valuable texts to maintain the global information of the stream. First, we propose a text stream sampling framework based on compressed sensing theory, which can sample a text stream with a lightweight framework to reduce the space consumption while still retaining the most valuable texts. We then develop a query word-based retrieval task as well as a topic detection and evolution analysis task on the sample stream to evaluate the performance of the framework in retaining valuable information. The framework is evaluated from several aspects using two representative datasets of social media, including compression ratio, runtime, information reserved rate, and efficiency of the text analysis tasks. Experimental results demonstrate that the proposed framework outperforms baseline methods and is able to complete the text analysis tasks with promising results.
