Yulong Pei
Eindhoven University of Technology
Publication
Featured research published by Yulong Pei.
advances in social networks analysis and mining | 2016
Jianpeng Zhang; Yulong Pei; George H. L. Fletcher; Mykola Pechenizkiy
Due to the growing presence of large-scale and streaming graphs such as social networks, graph sampling and clustering play an important role in many real-world applications. One key aspect of graph clustering is the evaluation of cluster quality. However, little attention has been paid to evaluation measures for clustering quality on samples of graphs. As first steps towards appropriate evaluation of clustering methods on sampled graphs, in this work we present two novel evaluation measures for graph clustering called δ-precision and δ-recall. These measures effectively reflect the match quality of the clusters in the sampled graph with respect to the ground-truth clusters in the original graph. We show in extensive experiments on various benchmarks that our proposed metrics are practical and effective for graph clustering evaluation.
computer-based medical systems | 2017
Negar Ahmadi; Yulong Pei; Mykola Pechenizkiy
Alcoholism is a common disorder that leads to brain defects and associated cognitive, emotional and behavioral impairments. Finding and extracting discriminative biological markers that are correlated with healthy and alcoholic brain patterns helps us to use automatic methods for detecting and classifying alcoholism. Many brain disorders can be detected by analysing electroencephalography (EEG) signals. In this paper, to extract the required markers, we analyse the EEG signals of two groups: alcoholic and control subjects. By applying the wavelet transform, band-limited EEG signals are decomposed into five frequency sub-bands. Principal component analysis (PCA) is employed to choose the most information-carrying channels. By examining various features from different frequency sub-bands, six discriminative features are selected for classification. From a functional brain network perspective, lower synchronization in the Beta frequency sub-band and loss of lateralization in the Alpha frequency sub-band are observed in alcoholic subjects. From a signal processing perspective, we found that alcoholic subjects have lower values of fractal dimension, energy and entropy than control subjects. Five different classifiers are used to classify the alcoholic and control subjects, all with very high accuracies (more than 90%). Comparing the classifiers, SVM, random forest and gradient boosting perform best, with accuracies near 100%. Our study shows that the fractal dimension, entropy and energy of channel C1 in the Alpha frequency sub-band are the most important features for classification.
international joint conference on artificial intelligence | 2018
Yulong Pei; Jianpeng Zhang; George H. L. Fletcher; Mykola Pechenizkiy
Roles of nodes in a social network (SN) represent their functions, responsibilities or behaviors within the SN. Roles typically evolve over time, making role analytics a challenging problem. Previous studies either neglect role transition analysis or perform role discovery and role transition learning separately, leading to inefficiencies and limited transition analysis. We propose a novel dynamic non-negative matrix factorization (DyNMF) approach to simultaneously discover roles and learn role transitions. DyNMF explicitly models temporal information by introducing a role transition matrix and clusters nodes in SNs from two views: the current view and the historical view. The current view captures structural information from the current SN snapshot and the historical view captures role transitions by looking at roles in past SN snapshots. DyNMF efficiently provides more effective analytics capabilities, regularizing roles by temporal smoothness of role transitions and reducing uncertainties and inconsistencies between snapshots. Experiments on both synthetic and real-world SNs demonstrate the advantages of DyNMF in discovering and predicting roles and role transitions.
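To make the notion of a role transition matrix concrete, the following is a minimal Python sketch. It is not DyNMF: the abstract describes learning roles and transitions jointly through matrix factorization, whereas this sketch assumes hard role labels are already known for two consecutive snapshots and simply counts transitions. The function name `transition_matrix`, the dictionary inputs and the fallback for unseen roles are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, List


def transition_matrix(
    prev_roles: Dict[str, int],
    curr_roles: Dict[str, int],
    num_roles: int,
) -> List[List[float]]:
    """Estimate a row-stochastic role transition matrix by counting.

    Entry [r][c] is the fraction of nodes with role r in the previous
    snapshot that have role c in the current snapshot. Only nodes present
    in both snapshots are counted.
    """
    counts = Counter(
        (prev_roles[node], curr_roles[node])
        for node in prev_roles
        if node in curr_roles
    )
    matrix = []
    for r in range(num_roles):
        row_total = sum(counts[(r, c)] for c in range(num_roles))
        if row_total == 0:
            # Assumption: a role nobody held previously is treated as persistent.
            row = [1.0 if c == r else 0.0 for c in range(num_roles)]
        else:
            row = [counts[(r, c)] / row_total for c in range(num_roles)]
        matrix.append(row)
    return matrix


# Hypothetical role labels for four nodes in two consecutive snapshots.
previous = {"a": 0, "b": 0, "c": 1, "d": 1}
current = {"a": 0, "b": 1, "c": 1, "d": 1}
T = transition_matrix(previous, current, num_roles=2)
```

Each row of `T` sums to one, so a row can be read as the distribution over next roles for nodes that currently hold that role.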
Expert Systems With Applications | 2018
Rosa Sicilia; Stella Lo Giudice; Yulong Pei; Mykola Pechenizkiy; Paolo Soda
In recent years social networks have emerged as a critical means of spreading information, bringing several advantages. At the same time, unverified and instrumentally relevant information statements in circulation, known as rumours, are becoming a potential threat to society. Although several works have studied how to identify which topics in social microblogs are rumours, there is still a need to detect whether an individual post is a rumour or not. In this paper we address this challenge by presenting a novel rumour detection system that leverages newly designed features, including influence potential and network characteristics measures. We tested our approach on a real dataset composed of health-related posts collected from Twitter. We observe promising results, as the system is able to correctly detect about 90% of rumours, with acceptable levels of precision.
Companion of The Web Conference 2018 (WWW '18) | 2018
Wouter Ligtenberg; Yulong Pei; George H. L. Fletcher; Mykola Pechenizkiy
We introduce the Tink library for distributed temporal graph analytics. Increasingly, reasoning about temporal aspects of graph-structured data collections is an important aspect of analytics. For example, in a communication network, time plays a fundamental role in the propagation of information within the network. Whereas existing tools for temporal graph analysis are built as stand-alone systems, Tink is a library in the Apache Flink ecosystem, thereby leveraging its advanced, mature features such as distributed processing and query optimization. Furthermore, with Flink the data can be processed and cleaned with little effort, without resorting to different tools before analysis. Tink focuses on interval graphs in which every edge is associated with a starting time and an ending time. The library provides facilities for temporal graph creation and maintenance, as well as standard temporal graph measures and algorithms. Furthermore, the library is designed for ease of use and extensibility.
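The interval-graph model described above, in which every edge carries a starting time and an ending time, can be illustrated with a small single-machine Python sketch. This is not Tink's API, which is built on Apache Flink; the names `TemporalEdge`, `snapshot` and `reachable` are hypothetical, and treating the interval as half-open, `[start, end)`, is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class TemporalEdge:
    """A directed edge that is active during the half-open interval [start, end)."""
    source: str
    target: str
    start: int
    end: int

    def active_at(self, t: int) -> bool:
        return self.start <= t < self.end


def snapshot(edges: Iterable[TemporalEdge], t: int) -> List[TemporalEdge]:
    """Return the edges that are active at time t."""
    return [e for e in edges if e.active_at(t)]


def reachable(edges: Iterable[TemporalEdge], origin: str, t: int) -> Set[str]:
    """Nodes reachable from origin using only edges active at time t."""
    adjacency: Dict[str, List[str]] = {}
    for e in snapshot(edges, t):
        adjacency.setdefault(e.source, []).append(e.target)
    seen, stack = {origin}, [origin]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical communication network with three timed links.
edges = [
    TemporalEdge("a", "b", 0, 5),
    TemporalEdge("b", "c", 3, 8),
    TemporalEdge("c", "d", 6, 9),
]
```

Querying `reachable(edges, "a", t)` at different values of `t` shows how the set of nodes that information from `a` can reach depends on which intervals overlap the query time.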
bioinformatics and biomedicine | 2017
Rosa Sicilia; Stella Lo Giudice; Yulong Pei; Mykola Pechenizkiy; Paolo Soda
In recent years social networks have emerged as a critical means of spreading information. Despite the positive consequences of this phenomenon, unverified and instrumentally relevant information statements in circulation, known as rumours, are becoming a potential threat to society. Recently, there have been several studies on topic-independent rumour detection on Twitter. In this paper we present a novel rumour detection system that focuses on a specific topic, namely health-related rumours on Twitter. To this end, we constructed a new subset of features including influence potential and network characteristics features. We tested our approach on a real dataset and observed promising results, as it is able to correctly detect about 89% of rumours, with acceptable levels of precision.
International Conference on Complex Networks and their Applications | 2017
Jianpeng Zhang; Kaijie Zhu; Yulong Pei; George H. L. Fletcher; Mykola Pechenizkiy
Most existing sampling algorithms on graphs (i.e., network-structured data) focus on sampling from memory-resident static graphs and assume the entire graphs are always available. However, the graphs encountered in modern applications are often too large and/or too dynamic to be processed with limited memory. Furthermore, existing sampling techniques are inadequate for preserving the inherent clustering structure, which is an essential property of complex networks. To tackle these problems, we propose a new sampling algorithm that dynamically maintains a representative sample and is capable of retaining clustering structure in graph streams at any time. Performance of the proposed algorithm is evaluated through empirical experiments using real-world networks. The experimental results have shown that our proposed CPIES algorithm can produce clustering-structure representative samples and outperforms current online sampling algorithms.
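For context on what maintaining a sample over a graph stream with limited memory looks like, the sketch below shows classic uniform reservoir sampling over an edge stream. It is not the CPIES algorithm from the abstract and makes no attempt to preserve clustering structure; it is only a generic online baseline of the kind such algorithms are typically compared against. The function name, the `Edge` alias and the `seed` parameter are hypothetical.

```python
import random
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def reservoir_sample_edges(stream: Iterable[Edge], k: int, seed: int = 0) -> List[Edge]:
    """Keep a uniform random sample of at most k edges from a stream.

    Uses a single pass and O(k) memory, so the full graph never needs
    to be resident in memory.
    """
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for i, edge in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k edges.
            reservoir.append(edge)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = edge
    return reservoir


# Hypothetical stream: a path graph with 1000 edges, seen one edge at a time.
sample = reservoir_sample_edges(((i, i + 1) for i in range(1000)), k=10)
```

Because every edge is kept with equal probability, densely connected regions are thinned at the same rate as sparse ones, which is one reason a uniform sample may not retain cluster structure.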
acm symposium on applied computing | 2016
Jianpeng Zhang; Mykola Pechenizkiy; Yulong Pei; Julia Efremova
In real-world pattern recognition tasks, data with multiple-manifold structure is ubiquitous and unpredictable. Performing effective clustering on such data is a challenging problem. In particular, it is not obvious how to design a similarity measure for multiple manifolds. In this paper, we address this problem by proposing a new manifold distance measure that better captures both local and global spatial manifold information. We also define a new local density estimate that accounts for the density characteristic; it represents local density more accurately and is less sensitive to parameter settings. In addition, a two-phase exemplar determination method is proposed to select the cluster centers automatically. Experiments on several synthetic and real-world datasets show that the proposed algorithm achieves higher clustering effectiveness and better robustness on data with varying density, multi-scale and noise overlap characteristics.
intelligent data analysis | 2018
Jianpeng Zhang; Yulong Pei; George H. L. Fletcher; Mykola Pechenizkiy
arXiv: Social and Information Networks | 2018
Yulong Pei; Xin Du; Jianpeng Zhang; George H. L. Fletcher; Mykola Pechenizkiy