Aaron F. McDaid | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Aaron F. McDaid is active.

Explore More

Publication

Featured researches published by Aaron F. McDaid.

Computational Statistics & Data Analysis | 2010

Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models

Paul D. McNicholas; Thomas Brendan Murphy; Aaron F. McDaid; Dermot Frost

Model-based clustering using a family of Gaussian mixture models, with parsimonious factor analysis-like covariance structure, is described and an ecient algorithm for its implementation is presented. This algorithm uses the alternating expectationconditional maximization (AECM) variant of the expectation-maximization (EM) algorithm. Two central issues around the implementation of this family of models, namely model selection and convergence criteria, are discussed. These central issues also have implications for other model-based clustering techniques and for the implementation of techniques like the EM algorithm, in general. The Bayesian information criterion (BIC) is used for model selection and Aitken’s acceleration, which is shown to outperform the lack of progress criterion, is used to determine convergence. A brief introduction to parallel computing is then given before the implementation of this algorithm in parallel is facilitated within the master-slave paradigm. A simulation study is then carried out to confirm the eectiveness of this parallelization. The resulting software is applied to two data sets to demonstrate its eectiveness when compared to existing software.

advances in social networks analysis and mining | 2010

Detecting Highly Overlapping Communities with Model-Based Overlapping Seed Expansion

Aaron F. McDaid; Neil J. Hurley

As research into community finding in social networks progresses, there is a need for algorithms capable of detecting overlapping community structure. Many algorithms have been proposed in recent years that are capable of assigning each node to more than a single community. The performance of these algorithms tends to degrade when the ground-truth contains a more highly overlapping community structure, with nodes assigned to more than two communities. Such highly overlapping structure is likely to exist in many social networks, such as Facebook friendship networks. In this paper we present a scalable algorithm, MOSES, based on a statistical model of community structure, which is capable of detecting highly overlapping community structure, especially when there is variance in the number of communities each node is in. In evaluation on synthetic data MOSES is found to be superior to existing algorithms, especially at high levels of overlap. We demonstrate MOSES on real social network data by analyzing the networks of friendship links between students of five US universities.

Computational Statistics & Data Analysis | 2013

Improved Bayesian inference for the stochastic block model with application to large networks

Aaron F. McDaid; Thomas Brendan Murphy; Nial Friel; Neil J. Hurley

An efficient MCMC algorithm is presented to cluster the nodes of a network such that nodes with similar role in the network are clustered together. This is known as block-modeling or block-clustering. The model is the stochastic blockmodel (SBM) with block parameters integrated out. The resulting marginal distribution defines a posterior over the number of clusters and cluster memberships. Sampling from this posterior is simpler than from the original SBM as transdimensional MCMC can be avoided. The algorithm is based on the allocation sampler. It requires a prior to be placed on the number of clusters, thereby allowing the number of clusters to be directly estimated by the algorithm, rather than being given as an input parameter. Synthetic and real data are used to test the speed and accuracy of the model and algorithm, including the ability to estimate the number of clusters. The algorithm can scale to networks with up to ten thousand nodes and tens of millions of edges.

advances in social networks analysis and mining | 2011

Partitioning Breaks Communities

Fergal Reid; Aaron F. McDaid; Neil J. Hurley

Considering a clique as a conservative definition of community structure, we examine how graph partitioning algorithms interact with cliques. Many popular community-finding algorithms partition the entire graph into non-overlapping communities. We show that on a wide range of empirical networks, from different domains, significant numbers of cliques are split across separate partitions, as produced by such algorithms. We examine the largest connected component of the sub graph formed by retaining only edges in cliques, and apply partitioning strategies that explicitly minimise the number of cliques split. We conclude that, due to the connectedness of many networks, any community finding algorithm that produces partitions must fail to find at least some significant structures. Moreover, contrary to traditional intuition, in some empirical networks, strong ties and cliques frequently do cross community boundaries.

advances in social networks analysis and mining | 2012

Percolation Computation in Complex Networks

Fergal Reid; Aaron F. McDaid; Neil J. Hurley

K-clique percolation is an overlapping community finding algorithm which extracts particular structures, comprised of overlapping cliques, from complex networks. While it is conceptually straightforward, and can be elegantly expressed using clique graphs, certain aspects of k-clique percolation are computationally challenging in practice. In this paper we investigate aspects of empirical social networks, such as the large numbers of overlapping maximal cliques contained within them, that make clique percolation, and clique graph representations, computationally expensive. We motivate a simple algorithm to conduct clique percolation, and investigate its performance compared to current best-in-class algorithms. We present improvements to this algorithm, which allow us to perform k-clique percolation on much larger empirical datasets. Our approaches perform much better than existing algorithms on networks exhibiting pervasively overlapping community structure, especially for higher values of k. However, clique percolation remains a hard computational problem, current algorithms still scale worse than some other overlapping community finding algorithms.

Physical Review E | 2011

Seeding for pervasively overlapping communities.

Conrad Lee; Fergal Reid; Aaron F. McDaid; Neil J. Hurley

In some social and biological networks, the majority of nodes belong to multiple communities. It has recently been shown that a number of the algorithms specifically designed to detect overlapping communities do not perform well in such highly overlapping settings. Here, we consider one class of these algorithms, those which optimize a local fitness measure, typically by using a greedy heuristic to expand a seed into a community. We perform synthetic benchmarks which indicate that an appropriate seeding strategy becomes more important as the extent of community overlap increases. We find that distinct cliques provide the best seeds. We find further support for this seeding strategy with benchmarks on a Facebook network and the yeast interactome.

advances in social networks analysis and mining | 2014

Overlapping stochastic community finding

Aaron F. McDaid; Neil J. Hurley; Brendan Murphy

Community finding in social network analysis is the task of identifying groups of people within a larger population who are more likely to connect to each other than connect to others in the population. Much existing research has focussed on non-overlapping clustering. However, communities in real-world social networks do overlap. This paper introduces a new community finding method based on overlapping clustering. A Bayesian statistical model is presented, and a Markov Chain Monte Carlo (MCMC) algorithm is presented and evaluated in comparison with two existing overlapping community finding methods that are applicable to large networks. We evaluate our algorithm on networks with thousands of nodes and tens of thousands of edges.

knowledge discovery and data mining | 2010