
Publication


Featured research published by Isobel Claire Gormley.


BMC Bioinformatics | 2010

Probabilistic principal component analysis for metabolomic data

Gift Nyamundanda; Lorraine Brennan; Isobel Claire Gormley

Background: Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model.

Results: Here, probabilistic principal component analysis (PPCA), which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data.

Conclusions: The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight into the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
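For intuition about the PPCA model this paper builds on (not the authors' MetabolAnalyze implementation, which is an R package), the maximum-likelihood PPCA fit of Tipping and Bishop has a closed form that can be sketched in a few lines; the data below are synthetic:

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form maximum-likelihood PPCA fit (Tipping & Bishop).

    Returns the d x q loadings matrix W and the isotropic noise
    variance sigma^2, estimated as the mean of the discarded eigenvalues.
    """
    Xc = X - X.mean(axis=0)                 # centre the data
    S = np.cov(Xc, rowvar=False)            # d x d sample covariance
    vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]  # sort descending
    sigma2 = vals[q:].mean()                # noise variance from trailing eigenvalues
    W = vecs[:, :q] * np.sqrt(np.maximum(vals[:q] - sigma2, 0.0))
    return W, sigma2

# Synthetic high-ish dimensional data for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
W, sigma2 = ppca_ml(X, q=2)
print(W.shape)  # loadings are d x q, here (5, 2)
```

Because PPCA is a generative model, quantities such as the noise variance come with a likelihood, which is what enables the BIC-based choice of the number of components described in the abstract.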


Journal of the American Statistical Association | 2008

Exploring Voting Blocs Within the Irish Electorate

Isobel Claire Gormley; Thomas Brendan Murphy

Irish elections use a voting system called proportional representation by means of a single transferable vote (PR–STV). Under this system, voters express their vote by ranking some (or all) of the candidates in order of preference. Which candidates are elected is determined through a series of counts where candidates are eliminated and surplus votes are distributed. The electorate in any election forms a heterogeneous population; that is, voters with different political and ideological persuasions would be expected to have different preferences for the candidates. The purpose of this article is to establish the presence of voting blocs in the Irish electorate, to characterize these blocs, and to estimate their size. A mixture modeling approach is used to explore the heterogeneity of the Irish electorate and to establish the existence of clearly defined voting blocs. The voting blocs are characterized by their voting preferences, which are described using a ranking data model. In addition, the care with which voters choose lower tier preferences is estimated in the model. The methodology is used to explore data from two Irish elections. Data from eight opinion polls taken during the six weeks prior to the 1997 Irish presidential election are analyzed. These data reveal the evolution of the structure of the electorate during the election campaign. In addition, data that record the votes from the Dublin West constituency of the 2002 Irish general election are analyzed to reveal distinct voting blocs within the electorate; these blocs are characterized by party politics, candidate profile, and political ideology.


International Conference on Machine Learning | 2006

A latent space model for rank data

Isobel Claire Gormley; Thomas Brendan Murphy

Rank data consist of ordered lists of objects. A particular example of these data arises in Irish elections using the proportional representation by means of a single transferable vote (PR-STV) system, where voters list candidates in order of preference. A latent space model is proposed for rank (voting) data, where both voters and candidates are located in the same D-dimensional latent space. The relative proximity of candidates to a voter determines the probability of a voter giving high preferences to a candidate. The votes are modelled using a Plackett-Luce model which allows for the ranked nature of the data to be modelled directly. Data from the 2002 Irish general election are analyzed using the proposed model, which is fitted in a Bayesian framework. The estimated candidate positions suggest that party politics played an important role in this election. Methods for choosing D, the dimensionality of the latent space, are discussed and models with D = 1 or D = 2 are proposed for the 2002 Irish general election data.
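As a rough sketch of the Plackett-Luce ingredient used here: a ballot is built stage by stage, with each next choice drawn from the not-yet-ranked candidates in proportion to a positive "worth" parameter. The worth values below are illustrative only, not estimates from the election data:

```python
import itertools
import numpy as np

def plackett_luce_prob(ranking, worth):
    """Probability of an ordered (possibly partial) ballot under Plackett-Luce.

    At each stage the next preference is chosen from the remaining
    candidates with probability proportional to its worth parameter.
    """
    remaining = worth.sum()
    prob = 1.0
    for c in ranking:
        prob *= worth[c] / remaining
        remaining -= worth[c]
    return prob

# Hypothetical worths for three candidates (illustrative values)
worth = np.array([2.0, 1.0, 1.0])
p_full = plackett_luce_prob([0, 1, 2], worth)    # a complete ballot
total = sum(plackett_luce_prob(r, worth)
            for r in itertools.permutations(range(3)))
print(p_full, total)  # the six full-ballot probabilities sum to 1
```

Because the stage-wise construction stops wherever the ballot does, the same formula handles voters who rank only some of the candidates, which matches the PR-STV setting described above.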


PLOS Computational Biology | 2011

Transcriptomic Coordination in the Human Metabolic Network Reveals Links between n-3 Fat Intake, Adipose Tissue Gene Expression and Metabolic Health

Melissa J. Morine; Audrey C. Tierney; Ben van Ommen; Hannelore Daniel; Sinead Toomey; Ingrid M.F. Gjelstad; Isobel Claire Gormley; Pablo Perez-Martinez; Christian A. Drevon; Jose Lopez-Miranda; Helen M. Roche

Understanding the molecular link between diet and health is a key goal in nutritional systems biology. As an alternative to pathway analysis, we have developed a joint multivariate and network-based approach to analysis of a dataset of habitual dietary records, adipose tissue transcriptomics and comprehensive plasma marker profiles from human volunteers with the Metabolic Syndrome. With this approach we identified prominent co-expressed sub-networks in the global metabolic network, which showed correlated expression with habitual n-3 PUFA intake and urinary levels of the oxidative stress marker 8-iso-PGF2α. These sub-networks illustrated inherent cross-talk between distinct metabolic pathways, such as between triglyceride metabolism and production of lipid signalling molecules. In a parallel promoter analysis, we identified several adipogenic transcription factors as potential transcriptional regulators associated with habitual n-3 PUFA intake. Our results illustrate advantages of network-based analysis, and generate novel hypotheses on the transcriptomic link between habitual n-3 PUFA intake, adipose tissue function and oxidative stress.


Bayesian Analysis | 2009

A grade of membership model for rank data

Isobel Claire Gormley; Thomas Brendan Murphy

A grade of membership (GoM) model is an individual-level mixture model which allows individuals to have partial membership of the groups that characterize a population. A GoM model for rank data is developed to model the particular case when the response data are ranked in nature. A Metropolis-within-Gibbs sampler provides the framework for model fitting, but the intricate nature of the rank data models makes the selection of suitable proposal distributions difficult. ‘Surrogate’ proposal distributions are constructed using ideas from optimization transfer algorithms. Model fitting issues such as label switching and model selection are also addressed. The GoM model for rank data is illustrated through an analysis of Irish election data where voters rank some or all of the candidates in order of preference. Interest lies in highlighting distinct groups of voters with similar preferences (i.e. ‘voting blocs’) within the electorate, taking into account the rank nature of the response data, and in examining individuals’ voting bloc memberships. The GoM model for rank data is fitted to data from an opinion poll conducted during the Irish presidential election campaign in 1997.


Computational Statistics & Data Analysis | 2016

Clustering with the multivariate normal inverse Gaussian distribution

Adrian O'Hagan; Thomas Brendan Murphy; Isobel Claire Gormley; Paul D. McNicholas; Dimitris Karlis

Many model-based clustering methods are based on a finite Gaussian mixture model. The Gaussian mixture model implies that the data scatter within each group is elliptically shaped. Hence non-elliptical groups are often modeled by more than one component, resulting in model over-fitting. An alternative is to use a mean-variance mixture of multivariate normal distributions with an inverse Gaussian mixing distribution (MNIG) in place of the Gaussian distribution, to yield a more flexible family of distributions. Under this model the component distributions may be skewed and have fatter tails than the Gaussian distribution. The MNIG-based approach is extended to include a broad range of eigendecomposed covariance structures. Furthermore, MNIG models where the other distributional parameters are constrained are considered. The Bayesian Information Criterion is used to identify the optimal model and number of mixture components. The method is demonstrated on three sample data sets and a novel variation on the univariate Kolmogorov-Smirnov test is used to assess goodness of fit.
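A minimal sketch of the BIC-based selection step described above, with hypothetical log-likelihoods and parameter counts standing in for fitted MNIG models of varying complexity:

```python
import math

def bic(loglik, n_params, n_obs):
    """BIC = k * ln(n) - 2 * loglik; under this sign convention the
    model with the LOWEST BIC is preferred (some software negates it)."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical candidate fits: (label, log-likelihood, free parameters).
# The numbers are illustrative, not results from the paper.
candidates = [("G=2, constrained", -1510.0, 9),
              ("G=3, constrained", -1480.0, 14),
              ("G=3, full covariance", -1472.0, 29)]
n = 500  # assumed sample size
best = min(candidates, key=lambda c: bic(c[1], c[2], n))
print(best[0])  # the extra flexibility of the full model is not worth its 29 parameters
```

The penalty term is what stops the eigendecomposed covariance families from always favouring the most flexible structure: a richer model must improve the log-likelihood by more than its extra parameters cost.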


BMC Bioinformatics | 2013

MetSizeR: selecting the optimal sample size for metabolomic studies using an analysis based approach

Gift Nyamundanda; Isobel Claire Gormley; Yue Fan; William M. Gallagher; Lorraine Brennan

Background: Determining sample sizes for metabolomic experiments is important but, due to the complexity of these experiments, there are currently no standard methods for sample size estimation in metabolomics. Since pilot studies are rarely done in metabolomics, currently existing sample size estimation approaches which rely on pilot data cannot be applied.

Results: In this article, an analysis-based approach called MetSizeR is developed to estimate sample size for metabolomic experiments even when experimental pilot data are not available. The key motivation for MetSizeR is that it considers the type of analysis the researcher intends to use for data analysis when estimating sample size. MetSizeR uses information about the data analysis technique and prior expert knowledge of the metabolomic experiment to simulate pilot data from a statistical model. Permutation-based techniques are then applied to the simulated pilot data to estimate the required sample size.

Conclusions: The MetSizeR methodology, and a publicly available software package which implements the approach, are illustrated through real metabolomic applications. Sample size estimates, informed by the intended statistical analysis technique, and the associated uncertainty are provided.


BMC Bioinformatics | 2010

Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

Melissa J. Morine; Jolene McMonagle; Sinead Toomey; Clare M. Reynolds; Aidan P. Moloney; Isobel Claire Gormley; Peadar Ó Gaora; Helen M. Roche

Background: Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets.

Results: Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p < 0.05), followed by muscle (601 genes) and adipose (16 genes). Results from modified GSEA showed that the high-CLA beef diet affected diverse biological processes across the three tissues, and that the majority of pathway changes reached significance only with the bi-directional test. Combining the liver tissue microarray results with plasma marker data revealed 110 CLA-sensitive genes showing strong canonical correlation with one or more plasma markers of metabolic health, and 9 significantly overrepresented pathways among this set; each of these pathways was also significantly changed by the high-CLA diet. Closer inspection of two of these pathways, selenoamino acid metabolism and steroid biosynthesis, illustrated clear diet-sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect.

Conclusion: Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease.


Computational Statistics & Data Analysis | 2012

Computational aspects of fitting mixture models via the expectation-maximization algorithm

Adrian O'Hagan; Thomas Brendan Murphy; Isobel Claire Gormley

The Expectation-Maximization (EM) algorithm is a popular tool in a wide variety of statistical settings, in particular in the maximum likelihood estimation of parameters when clustering using mixture models. A serious pitfall is that in the case of a multimodal likelihood function the algorithm may become trapped at a local maximum, resulting in an inferior clustering solution. In addition, convergence to an optimal solution can be very slow. Methods are proposed to address these issues: optimizing starting values for the algorithm and targeting maximization steps efficiently. It is demonstrated that these approaches can produce superior outcomes to initialization via random starts or hierarchical clustering and that the rate of convergence to an optimal solution can be greatly improved.
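A minimal sketch, on synthetic 1-D data, of the local-maximum problem and the random-restart baseline the paper improves upon: run basic EM for a two-component Gaussian mixture from several random starts and keep the solution with the highest log-likelihood (the paper's initialization and targeting strategies are more refined than this):

```python
import numpy as np

def em_gmm_1d(x, mu0, n_iter=200):
    """Basic EM for a two-component 1-D Gaussian mixture from given start means."""
    pi = np.array([0.5, 0.5])
    mu = np.array(mu0, dtype=float)
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means and variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-3)  # guard against variance collapse
    dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.log(dens.sum(axis=1)).sum(), mu  # (log-likelihood, fitted means)

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])
# EM can get trapped at a local maximum, so try several random starts
# and keep the run with the highest log-likelihood
fits = [em_gmm_1d(x, rng.choice(x, size=2)) for _ in range(10)]
best_loglik, best_mu = max(fits, key=lambda f: f[0])
print(np.sort(best_mu))  # best run recovers means near -3 and 3
```

Each restart costs a full EM run, which is exactly the inefficiency that motivates the smarter starting-value and targeted-maximization strategies proposed in the paper.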


The Annals of Applied Statistics | 2014

Clustering South African Households Based on their Asset Status Using Latent Variable Models.

Damien McParland; Isobel Claire Gormley; Tyler H. McCormick; Samuel J. Clark; Chodziwadziwa Kabudula; Mark A. Collinson

The Agincourt Health and Demographic Surveillance System has since 2001 conducted a biannual household asset survey in order to quantify household socio-economic status (SES) in a rural population living in northeast South Africa. The survey contains binary, ordinal and nominal items. In the absence of income or expenditure data, the SES landscape in the study population is explored and described by clustering the households into homogeneous groups based on their asset status. A model-based approach to clustering the Agincourt households, based on latent variable models, is proposed. In the case of modeling binary or ordinal items, item response theory models are employed. For nominal survey items, a factor analysis model, similar in nature to a multinomial probit model, is used. Both model types have an underlying latent variable structure; this similarity is exploited and the models are combined to produce a hybrid model capable of handling mixed data types. Further, a mixture of the hybrid models is considered to provide clustering capabilities within the context of mixed binary, ordinal and nominal response data. The proposed model is termed a mixture of factor analyzers for mixed data (MFA-MD). The MFA-MD model is applied to the survey data to cluster the Agincourt households into homogeneous groups. The model is estimated within the Bayesian paradigm, using a Markov chain Monte Carlo algorithm. Intuitive groupings result, providing insight into the different socio-economic strata within the Agincourt region.

Collaboration


Isobel Claire Gormley's top co-authors and their affiliations.

Top Co-Authors

Adrian O'Hagan

University College Dublin


Helen M. Roche

University College Dublin


Joseph M. Queally

University College Hospital


Sinead Toomey

Royal College of Surgeons in Ireland


Stephen A. Brennan

University College Hospital
