Qinli Yang
University of Electronic Science and Technology of China
Publication
Featured research published by Qinli Yang.
knowledge discovery and data mining | 2015
Junming Shao; Zhichao Han; Qinli Yang; Tao Zhou
In this paper, we introduce a new community detection algorithm, called Attractor, which automatically spots communities in a network by examining the changes of distances among nodes (i.e., distance dynamics). The fundamental idea is to envision the target network as an adaptive dynamical system, where each node interacts with its neighbors. The interaction will change the distances among nodes, while the distances will affect the interactions. Such interplay eventually leads to a steady distribution of distances, where the nodes sharing the same community move together and the nodes in different communities keep far away from each other. Building upon the distance dynamics, Attractor has several remarkable advantages: (a) It provides an intuitive way to analyze the community structure of a network, and more importantly, faithfully captures the natural communities (with high quality). (b) Attractor allows detecting communities on large-scale networks due to its low time complexity (O(|E|)). (c) Attractor is capable of discovering communities of arbitrary size, and thus small-size communities or anomalies, usually existing in real-world networks, can be well pinpointed. Extensive experiments show that our algorithm allows effective and efficient community detection and has good performance compared to state-of-the-art algorithms.
IEEE Transactions on Knowledge and Data Engineering | 2013
Junming Shao; Xiao He; Christian Böhm; Qinli Yang; Claudia Plant
Synchronization is a powerful and inherently hierarchical concept regulating a large variety of complex processes ranging from the metabolism in a cell to opinion formation in a group of individuals. Synchronization phenomena in nature have been widely investigated and models concisely describing the dynamical synchronization process have been proposed, e.g., the well-known Extensive Kuramoto Model. We explore the potential of the Extensive Kuramoto Model for data clustering. We regard each data object as a phase oscillator and simulate the dynamical behavior of the objects over time. By interaction with similar objects, the phase of an object gradually aligns with its neighborhood, resulting in a nonlinear object movement naturally driven by the local cluster structure. We demonstrate that our framework has several attractive benefits: 1) It is suitable to detect clusters of arbitrary number, shape, and data distribution, even in difficult settings with noise points and outliers. 2) Combined with the Minimum Description Length (MDL) principle, it allows partitioning and hierarchical clustering without requiring any input parameters which are difficult to estimate. 3) Synchronization faithfully captures the natural hierarchical cluster structure of the data and MDL suggests meaningful levels of abstraction. Extensive experiments demonstrate the effectiveness and efficiency of our approach.
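The idea of treating each data object as a phase oscillator that aligns with its neighborhood can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed neighborhood radius `eps`, the function names `synchronize` and `clusters_from`, and the convergence tolerances are all assumptions made for illustration, and the MDL-based parameter selection mentioned in the abstract is omitted.

```python
import math

def synchronize(points, eps, max_iter=50, tol=1e-9):
    """Move each point toward its eps-neighbors using a Kuramoto-style sine coupling.

    `eps` (neighborhood radius) is an assumed input; the abstract describes
    choosing parameters automatically via MDL, which is not done here.
    """
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        new_pts = []
        max_shift = 0.0
        for p in pts:
            # Neighbors within Euclidean distance eps (this includes p itself).
            nbrs = [q for q in pts if math.dist(p, q) <= eps]
            # Per-dimension update: average of sin(difference) over the neighborhood.
            moved = [
                p[d] + sum(math.sin(q[d] - p[d]) for q in nbrs) / len(nbrs)
                for d in range(len(p))
            ]
            max_shift = max(max_shift, math.dist(p, moved))
            new_pts.append(moved)
        pts = new_pts
        if max_shift < tol:
            break  # positions have stopped changing
    return pts

def clusters_from(pts, merge_tol=1e-3):
    """Assign the same label to points whose final positions coincide."""
    labels, centers = [], []
    for p in pts:
        for i, c in enumerate(centers):
            if math.dist(p, c) <= merge_tol:
                labels.append(i)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels

data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]
labels = clusters_from(synchronize(data, eps=0.5))
```

Points that end up at the same location after the dynamics settle are read off as one cluster; points with no neighbors stay where they are and can be treated as outliers.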
Computers, Environment and Urban Systems | 2012
Ebenezer Danso-Amoako; Miklas Scholz; Nickolas Kalimeris; Qinli Yang; Junming Shao
This study aims to provide a rapid screening tool for assessment of sustainable flood retention basins (SFRBs) to predict corresponding dam failure risks. A rapid expert-based assessment method for dam failure of SFRB supported by an artificial neural network (ANN) model has been presented. Flood storage was assessed for 110 SFRB and the corresponding Dam Failure Risk was evaluated for all dams across the wider Greater Manchester study area. The results show that Dam Failure Risk can be estimated by using the variables Dam Height, Dam Length, Maximum Flood Water Volume, Flood Water Surface Area, Mean Annual Rainfall (based on Met Office data), Altitude, Catchment Size, Urban Catchment Proportion, Forest Catchment Proportion and Managed Maximum Flood Water Volume. A cross-validation R² value of 0.70 for the ANN model signifies that the tool is likely to predict variables well for new data sets. Traditionally, dams are considered safe because they have been built according to high technical standards. However, many dams that were constructed decades ago do not meet the current state-of-the-art dam design guidelines. Spatial distribution maps show that dam failure risks of SFRB located near cities are higher than those situated in rural locations. The proposed tool could be used as an early warning system in times of heavy rainfall.
pacific-asia conference on knowledge discovery and data mining | 2013
Junming Shao; Xiao He; Qinli Yang; Claudia Plant; Christian Böhm
Complex graph data now arises in various fields like social networks, protein-protein interaction networks, ecosystems, etc. To reveal the underlying patterns in graphs, an important task is to partition them into several meaningful clusters. The question is: how can we find the natural partitions of a complex graph which truly reflect the intrinsic patterns? In this paper, we propose RSGC, a novel approach to graph clustering. The key philosophy of RSGC is to consider graph clustering as a dynamic process towards synchronization. Each vertex is viewed as an oscillator and interacts with other vertices according to the graph connection information. During the process towards synchronization, vertices with similar connectivity patterns tend to naturally synchronize together to form a cluster. Inherited from the powerful concept of synchronization, RSGC shows several desirable properties: (a) it provides a novel perspective for graph clustering based on the proposed interaction model; (b) RSGC allows discovering natural clusters in graphs without any data distribution assumption; (c) RSGC is also robust against noise vertices. We systematically evaluate the RSGC algorithm on synthetic and real data to demonstrate its superiority.
ACM Transactions on Knowledge Discovery From Data | 2016
Junming Shao; Qinli Yang; Hoang-Vu Dang; Bertil Schmidt; Stefan Kramer
Clustering very large datasets while preserving cluster quality remains a challenging data-mining task to date. In this paper, we propose an effective scalable clustering algorithm for large datasets that builds upon the concept of synchronization. Inherited from the powerful concept of synchronization, the proposed algorithm, CIPA (Clustering by Iterative Partitioning and Point Attractor Representations), is capable of handling very large datasets by iteratively partitioning them into thousands of subsets and clustering each subset separately. Using dynamic clustering by synchronization, each subset is then represented by a set of point attractors and outliers. Finally, CIPA identifies the cluster structure of the original dataset by clustering the newly generated dataset consisting of point attractors and outliers from all subsets. We demonstrate that our new scalable clustering approach has several attractive benefits: (a) CIPA faithfully captures the cluster structure of the original data by performing clustering on each separate subset iteratively instead of using any sampling or statistical summarization technique. (b) It allows clustering very large datasets efficiently with high cluster quality. (c) CIPA is parallelizable and also suitable for distributed data. Extensive experiments demonstrate the effectiveness and efficiency of our approach.
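The partition, represent, and re-cluster pipeline described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the CIPA implementation: it performs a single partitioning round rather than iterating, uses a simple interleaved split, and relies on a hypothetical helper `sync_attractors` with an assumed fixed radius `eps`. Singleton attractors play the role of outliers here.

```python
import math

def sync_attractors(points, eps, iters=30, merge_tol=1e-3):
    """Synchronize points and return the distinct final positions (point attractors).

    Hypothetical helper: `eps`, `iters`, and `merge_tol` are assumed parameters.
    """
    pts = [list(p) for p in points]
    for _ in range(iters):
        nxt = []
        for p in pts:
            # Each point interacts with neighbors within distance eps (including itself).
            nbrs = [q for q in pts if math.dist(p, q) <= eps]
            nxt.append([
                p[d] + sum(math.sin(q[d] - p[d]) for q in nbrs) / len(nbrs)
                for d in range(len(p))
            ])
        pts = nxt
    attractors = []
    for p in pts:
        # Keep one representative per group of coinciding final positions.
        if not any(math.dist(p, a) <= merge_tol for a in attractors):
            attractors.append(p)
    return attractors

def partition_then_merge(points, n_subsets, eps):
    """Cluster each subset separately, then cluster the collected attractors."""
    # Interleaved split; any partitioning scheme could be substituted here.
    subsets = [points[i::n_subsets] for i in range(n_subsets)]
    representatives = []
    for subset in subsets:  # each subset is independent, so this loop could run in parallel
        representatives.extend(sync_attractors(subset, eps))
    return sync_attractors(representatives, eps)

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1)]
final_attractors = partition_then_merge(points, n_subsets=2, eps=0.5)
```

Because each subset is processed without reference to the others, the per-subset step is the part that lends itself to parallel or distributed execution.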
Knowledge and Information Systems | 2017
Junming Shao; Xinzuo Wang; Qinli Yang; Claudia Plant; Christian Böhm
How can the challenges of the “curse of dimensionality” and “scalability” in clustering be addressed simultaneously? In this paper, we propose arbitrarily oriented synchronized clusters (ORSC), a novel effective and efficient method for subspace clustering inspired by synchronization. Synchronization is a basic phenomenon prevalent in nature, capable of controlling even highly complex processes such as opinion formation in a group. Control of complex processes is achieved by simple operations based on interactions between objects. Relying on the weighted interaction model and iterative dynamic clustering, our approach ORSC (a) naturally detects correlation clusters in arbitrarily oriented subspaces, including arbitrarily shaped nonlinear correlation clusters. Our approach is (b) robust against noise and outliers. In contrast to previous methods, ORSC is (c) easy to parameterize, since there is no need to specify the subspace dimensionality or other difficult parameters. Instead, all interesting subspaces are detected in a fully automatic way. Finally, (d) ORSC outperforms most comparison methods in terms of runtime efficiency and is highly scalable to large and high-dimensional data sets. Extensive experiments have demonstrated the effectiveness and efficiency of our approach.
international conference on data mining | 2012
Junming Shao; Qinli Yang; Afra Wohlschlaeger; Christian Sorg
Alzheimer's disease (AD) is the most common cause of age-related dementia, which prominently affects the human connectome. Diffusion weighted imaging (DWI) provides a promising way to explore the organization of white matter fiber tracts in the human brain in a non-invasive way. However, the immense amount of data from millions of voxels of a raw diffusion map prevents an easy path to usable knowledge. In this paper, we focus on the question of how we can identify disrupted spatial patterns of the human connectome in AD based on a data mining framework. Using diffusion tractography, the human connectomes for each individual subject were constructed based on two diffusion derived attributes: fiber density and fractional anisotropy, to represent the structural brain connectivity patterns. Then, these human connectomes were further mapped into a series of unweighted graphs by discretization. After frequent subgraph mining, the abnormal score was finally defined to identify disrupted subgraph patterns in patients. Experiments demonstrated that our data-driven approach, for the first time, allows identifying selective spatial pattern changes of the human connectome in AD that perfectly matched grey matter changes of the disease. Our findings further bring new insights into how AD propagates and disrupts the regional integrity of large-scale structural brain networks in a fiber connectivity-based way.
IEEE Transactions on Knowledge and Data Engineering | 2018
Junming Shao; Feng Huang; Qinli Yang; Guangchun Luo
In this paper, we propose a prototype-based classification model for evolving data streams, called SyncStream, which allows dynamically modeling time-changing concepts, making predictions in a local fashion. Instead of learning a single model on a fixed or adaptive sliding window of historical data or ensemble learning a set of weighted base classifiers, SyncStream captures evolving concepts by dynamically maintaining a set of prototypes in a proposed P-Tree, which are obtained based on error-driven representativeness learning and synchronization-inspired constrained clustering. To identify abrupt concept drifts in data streams, heuristic approaches based on PCA and statistical analysis have been introduced. To further learn the associations among distributed data streams, the extended P-Tree structure and KNN-style strategy are introduced. We demonstrate that our new data stream classification approach has several attractive benefits: (a) SyncStream is capable of dynamically modeling the evolving concepts from even a small set of prototypes. (b) Owing to synchronization-based constrained clustering and P-Tree, SyncStream supports efficient and effective data representation and maintenance. (c) SyncStream is also tolerant of inappropriate or noisy examples via error-driven representativeness learning. (d) SyncStream allows learning relationships among distributed data streams at the instance level. The experimental results indicate its efficiency and effectiveness.
Brain Imaging and Behavior | 2018
Junming Shao; Chun Meng; Masoud Tahmasian; Felix Brandl; Qinli Yang; Guangchun Luo; Cheng Luo; Dezhong Yao; Lianli Gao; Valentin Riedl; Afra M. Wohlschläger; Christian Sorg
Brain imaging reveals schizophrenia as a disorder of macroscopic brain networks. In particular, default mode and salience network (DMN, SN) show highly consistent alterations in both interacting brain activity and underlying brain structure. However, the same networks are also altered in major depression. This overlap in network alterations raises the question of whether DMN and SN changes are different across both disorders, potentially indicating distinct underlying pathophysiological mechanisms. To address this question, we acquired T1-weighted, diffusion-weighted, and resting-state functional MRI in patients with schizophrenia, patients with major depression, and healthy controls. We measured regional gray matter volume, inter-regional structural and intrinsic functional connectivity of DMN and SN, and compared these measures across groups by generalized Wilcoxon rank tests, while controlling for symptoms and medication. When comparing patients with controls, we found in each patient group SN volume loss, impaired DMN structural connectivity, and aberrant DMN and SN functional connectivity. When comparing patient groups, SN gray matter volume loss and DMN structural connectivity reduction did not differ between groups, but in schizophrenic patients, functional hyperconnectivity between DMN and SN was lower than in depressed patients. Results provide evidence for distinct functional hyperconnectivity between DMN and SN in schizophrenia and major depression, while structural changes in DMN and SN were similar. Distinct hyperconnectivity suggests different pathophysiological mechanisms underlying aberrant DMN-SN interactions in schizophrenia and depression.
database systems for advanced applications | 2018
Zhong Zhang; Zhili Qin; Peiyan Li; Qinli Yang; Junming Shao
Multi-view learning attempts to generate a classifier with better performance by exploiting relationships among multiple views. Existing approaches often focus on learning the consistency and/or complementarity among different views. However, not all consistent or complementary information is useful for learning; instead, only class-specific discriminative information is essential. In this paper, we propose a new robust multi-view learning algorithm, called DICS, by exploring the Discriminative and non-discriminative Information existing in Common and view-Specific parts among different views via joint non-negative matrix factorization. The basic idea is to learn a latent common subspace and view-specific subspaces, and more importantly, discriminative and non-discriminative information from all subspaces is further extracted to support a better classification. Extensive experiments on seven real-world data sets have demonstrated the effectiveness of DICS and show its superiority over many state-of-the-art algorithms.