Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Mark A. Girolami is active.

Publication


Featured research published by Mark A. Girolami.


Neural Computation | 1999

Independent component analysis using an extended infomax algorithm for mixed subgaussian and supergaussian sources

Te-Won Lee; Mark A. Girolami; Terrence J. Sejnowski

An extension of the infomax algorithm of Bell and Sejnowski (1995) is presented that is able to blindly separate mixed signals with sub- and supergaussian source distributions. This was achieved by using a simple type of learning rule first derived by Girolami (1997) by choosing negentropy as a projection pursuit index. Parameterized probability distributions that have sub- and supergaussian regimes were used to derive a general learning rule that preserves the simple architecture proposed by Bell and Sejnowski (1995), is optimized using the natural gradient by Amari (1998), and uses the stability analysis of Cardoso and Laheld (1996) to switch between sub- and supergaussian regimes. We demonstrate that the extended infomax algorithm is able to easily separate 20 sources with a variety of source distributions. Applied to high-dimensional data from electroencephalographic recordings, it is effective at separating artifacts such as eye blinks and line noise from weaker electrical signals that arise from sources in the brain.
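The switching update the abstract describes can be sketched in a few lines. This is a minimal batch-mode illustration, not the paper's exact implementation; the learning rate, batch shapes, and variable names are illustrative assumptions.

```python
import numpy as np

def extended_infomax_step(W, x, lr=0.01):
    """One natural-gradient update of the extended infomax rule.

    W : (n, n) current unmixing matrix
    x : (n, T) batch of mixed observations
    The diagonal switching matrix K holds +/-1 per source, chosen by the
    sign of a stability criterion (after Cardoso and Laheld), selecting
    the supergaussian or subgaussian regime of the learning rule.
    """
    n, T = x.shape
    u = W @ x                                    # current source estimates
    # stability-based regime switch: positive -> supergaussian (+1)
    k = np.sign(np.mean(1.0 / np.cosh(u) ** 2, axis=1) * np.mean(u ** 2, axis=1)
                - np.mean(np.tanh(u) * u, axis=1))
    K = np.diag(k)
    # natural-gradient extended infomax update: (I - K tanh(u)u^T - uu^T) W
    dW = (np.eye(n) - K @ (np.tanh(u) @ u.T) / T - (u @ u.T) / T) @ W
    return W + lr * dW
```

Iterating this step on a two-channel mixture of supergaussian (e.g. Laplacian) sources drives the unmixing matrix toward a separating solution.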


IEEE Transactions on Neural Networks | 2002

Mercer kernel-based clustering in feature space

Mark A. Girolami

The article presents a method for both the unsupervised partitioning of a sample of data and the estimation of the possible number of inherent clusters which generate the data. This work exploits the notion that performing a nonlinear data transformation into some high-dimensional feature space increases the probability of the linear separability of the patterns within the transformed space and therefore simplifies the associated data structure. It is shown that the eigenvectors of a kernel matrix which defines the implicit mapping provide a means to estimate the number of clusters inherent within the data, and a computationally simple iterative procedure is presented for the subsequent feature space partitioning of the data.
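The eigenvalue-based cluster-count idea can be illustrated with a small sketch. This is a related spectral heuristic under an RBF kernel, not the paper's exact iterative procedure; `sigma` and the cumulative-energy threshold `thresh` are illustrative parameters.

```python
import numpy as np

def estimate_num_clusters(X, sigma=1.0, thresh=0.9):
    """Estimate the cluster count from the eigenvalues of an RBF kernel matrix.

    Heuristic: well-separated clusters contribute one dominant eigenvalue
    each to the (normalized) kernel matrix, so the number of eigenvalues
    needed to capture most of the spectral mass indicates the number of
    inherent clusters.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    K = np.exp(-d2 / (2 * sigma ** 2))             # Mercer (RBF) kernel matrix
    w = np.linalg.eigvalsh(K / len(X))[::-1]       # eigenvalues, descending
    w = np.clip(w, 0.0, None)
    cum = np.cumsum(w) / np.sum(w)                 # cumulative spectral mass
    return int(np.searchsorted(cum, thresh) + 1)
```

On two tight, well-separated Gaussian blobs the kernel matrix is nearly block-constant and two eigenvalues dominate, so the estimate is 2.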


IEEE Signal Processing Letters | 1999

Blind source separation of more sources than mixtures using overcomplete representations

Te-Won Lee; Michael S. Lewicki; Mark A. Girolami; Terrence J. Sejnowski

Empirical results were obtained for the blind source separation of more sources than mixtures using a previously proposed framework for learning overcomplete representations. This technique assumes a linear mixing model with additive noise and involves two steps: (1) learning an overcomplete representation for the observed data and (2) inferring sources given a sparse prior on the coefficients. We demonstrate that three speech signals can be separated with good fidelity given only two mixtures of the three signals. Similar results were obtained with mixtures of two speech signals and one music signal.
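Step (2) above, inferring sources under a sparse prior with a fixed overcomplete mixing matrix, can be sketched as an L1-regularized least-squares problem. This uses ISTA (proximal gradient) as a stand-in solver; it is not the paper's inference procedure, and `lam` and the iteration count are illustrative.

```python
import numpy as np

def infer_sparse_sources(A, x, lam=0.05, n_iter=500):
    """MAP-style sparse source inference for the model x = A s + noise.

    A : (m, n) overcomplete mixing matrix with n > m (more sources than
        mixtures); the Laplacian prior on s becomes an L1 penalty.
    Solved by iterative soft-thresholding (ISTA).
    """
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ s - x)                # gradient of the data-fit term
        z = s - g / L
        s = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return s
```

With a 2x3 mixing matrix and a 1-sparse true source, the L1 prior selects the single active source rather than spreading the signal across columns.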


Computers & Mathematics with Applications | 2000

A Unifying Information-Theoretic Framework for Independent Component Analysis

Te-Won Lee; Mark A. Girolami; Anthony J. Bell; Terrence J. Sejnowski

We show that different theories recently proposed for independent component analysis (ICA) lead to the same iterative learning algorithm for blind separation of mixed independent sources. We review those theories and suggest that information theory can be used to unify several lines of research. Pearlmutter and Parra [1] and Cardoso [2] showed that the infomax approach of Bell and Sejnowski [3] and the maximum likelihood estimation approach are equivalent. We show that negentropy maximization also has equivalent properties, and therefore, all three approaches yield the same learning rule for a fixed nonlinearity. Girolami and Fyfe [4] have shown that the nonlinear principal component analysis (PCA) algorithm of Karhunen and Joutsensalo [5] and Oja [6] can also be viewed from information-theoretic principles since it minimizes the sum of squares of the fourth-order marginal cumulants, and therefore, approximately minimizes the mutual information [7]. Lambert [8] has proposed different Bussgang cost functions for multichannel blind deconvolution. We show how the Bussgang property relates to the infomax principle. Finally, we discuss convergence and stability as well as future research issues in blind source separation.
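The infomax/maximum-likelihood equivalence the abstract refers to can be stated compactly. A sketch in standard notation (not reproduced from the paper): with sources estimated as $u = Wx$ and squashed outputs $y_i = g_i(u_i)$,

```latex
H(y) \;=\; H(x) \;+\; \log\lvert\det W\rvert \;+\; \sum_i E\!\left[\log g_i'(u_i)\right],
```

so choosing the nonlinearity such that $g_i' = p_i$, the assumed source density, makes the infomax objective $H(y)$ equal the expected log-likelihood $\log\lvert\det W\rvert + \sum_i E[\log p_i(u_i)]$ up to the constant $H(x)$. Both objectives then yield the same natural-gradient learning rule $\Delta W \propto (I - E[\varphi(u)\,u^{\top}])\,W$ with score function $\varphi_i = -p_i'/p_i$.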


Proteomics Clinical Applications | 2007

Clinical proteomics: A need to define the field and to begin to set adequate standards

Harald Mischak; Rolf Apweiler; Rosamonde E. Banks; Mark R. Conaway; Joshua J. Coon; Anna F. Dominiczak; Jochen H. H. Ehrich; Danilo Fliser; Mark A. Girolami; Henning Hermjakob; Denis F. Hochstrasser; Joachim Jankowski; Bruce A. Julian; Walter Kolch; Ziad A. Massy; Christian Neusuess; Jan Novak; Karlheinz Peter; Kasper Rossing; Joost P. Schanstra; O. John Semmes; Dan Theodorescu; Visith Thongboonkerd; Eva M. Weissinger; Jennifer E. Van Eyk; Tadashi Yamamoto

The aim of this manuscript is to initiate a constructive discussion about the definition of clinical proteomics, study requirements, pitfalls and (potential) use. Furthermore, we hope to stimulate proposals for the optimal use of future opportunities and seek unification of the approaches in clinical proteomic studies. We have outlined our collective views about the basic principles that should be considered in clinical proteomic studies, including sample selection, choice of technology and appropriate quality control, and the need for collaborative interdisciplinary efforts involving clinicians and scientists. Furthermore, we propose guidelines for the critical aspects that should be included in published reports. Our hope is that, as a result of stimulating discussion, a consensus will be reached amongst the scientific community leading to guidelines for the studies, similar to those already published for mass spectrometric sequencing data. We contend that clinical proteomics is not just a collection of studies dealing with analysis of clinical samples. Rather, the essence of clinical proteomics should be to address clinically relevant questions and to improve the state-of-the-art, both in diagnosis and in therapy of diseases.


Molecular & Cellular Proteomics | 2010

Naturally occurring human urinary peptides for use in diagnosis of chronic kidney disease

David M. Good; Petra Zürbig; Àngel Argilés; Hartwig W. Bauer; Georg Behrens; Joshua J. Coon; Mohammed Dakna; Stéphane Decramer; Christian Delles; Anna F. Dominiczak; Jochen H. H. Ehrich; Frank Eitner; Danilo Fliser; Moritz Frommberger; Arnold Ganser; Mark A. Girolami; Igor Golovko; Wilfried Gwinner; Marion Haubitz; Stefan Herget-Rosenthal; Joachim Jankowski; Holger Jahn; George Jerums; Bruce A. Julian; Markus Kellmann; Volker Kliem; Walter Kolch; Andrzej S. Krolewski; Mario Luppi; Ziad A. Massy

Because of its availability, ease of collection, and correlation with physiology and pathology, urine is an attractive source for clinical proteomics/peptidomics. However, the lack of comparable data sets from large cohorts has greatly hindered the development of clinical proteomics. Here, we report the establishment of a reproducible, high resolution method for peptidome analysis of naturally occurring human urinary peptides and proteins, ranging from 800 to 17,000 Da, using samples from 3,600 individuals analyzed by capillary electrophoresis coupled to MS. All processed data were deposited in a Structured Query Language (SQL) database. This database currently contains 5,010 relevant unique urinary peptides that serve as a pool of potential classifiers for diagnosis and monitoring of various diseases. As an example, by using this source of information, we were able to define urinary peptide biomarkers for chronic kidney diseases, allowing diagnosis of these diseases with high accuracy. Application of the chronic kidney disease-specific biomarker set to an independent test cohort in the subsequent replication phase resulted in 85.5% sensitivity and 100% specificity. These results indicate the potential usefulness of capillary electrophoresis coupled to MS for clinical applications in the analysis of naturally occurring urinary peptides.


Journal of The American Society of Nephrology | 2007

Advances in Urinary Proteome Analysis and Biomarker Discovery

Danilo Fliser; Jan Novak; Visith Thongboonkerd; Àngel Argilés; Vera Jankowski; Mark A. Girolami; Joachim Jankowski; Harald Mischak

Noninvasive diagnosis of kidney diseases and assessment of the prognosis are still challenges in clinical nephrology. Definition of biomarkers on the basis of proteome analysis, especially of the urine, has advanced recently and may provide new tools to solve those challenges. This article highlights the most promising technological approaches toward deciphering the human proteome and applications of the knowledge in clinical nephrology, with emphasis on the urinary proteome. The data in the current literature indicate that although a thorough investigation of the entire urinary proteome is still a distant goal, clinical applications are already available. Progress in the analysis of human proteome in health and disease will depend more on the standardization of data and availability of suitable bioinformatics and software solutions than on new technological advances. It is predicted that proteomics will play an important role in clinical nephrology in the very near future and that this progress will require interactive dialogue and collaboration between clinicians and analytical specialists.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003

Probability density estimation from optimally condensed data samples

Mark A. Girolami; Chao He

The requirement to reduce the computational cost of evaluating a point probability density estimate when employing a Parzen window estimator is a well-known problem. This paper presents the Reduced Set Density Estimator that provides a kernel-based density estimator which employs a small percentage of the available data sample and is optimal in the L2 sense. While only requiring O(N^2) optimization routines to estimate the required kernel weighting coefficients, the proposed method provides similar levels of performance accuracy and sparseness of representation as Support Vector Machine density estimation, which requires O(N^3) optimization routines, and which has previously been shown to consistently outperform Gaussian Mixture Models. It is also demonstrated that the proposed density estimator consistently provides superior density estimates for similar levels of data reduction to that provided by the recently proposed Density-Based Multiscale Data Condensation algorithm and, in addition, has comparable computational scaling. The additional advantage of the proposed method is that no extra free parameters are introduced such as regularization, bin width, or condensation ratios, making this method a very simple and straightforward approach to providing a reduced set density estimator with comparable accuracy to that of the full sample Parzen density estimator.
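The estimator family discussed above has a simple common form: a weighted sum of kernels placed on a set of centers. The sketch below shows that form; the Reduced Set Density Estimator obtains sparse weights from an L2-optimal quadratic program (not reproduced here), while the full Parzen estimator is the special case where every sample is a center with weight 1/N.

```python
import numpy as np

def weighted_kde(x_eval, centers, weights, h):
    """Evaluate a weighted Gaussian kernel density estimate in 1-D.

    x_eval  : (M,) evaluation points
    centers : (K,) kernel centers (a subset of the sample for a reduced
              set estimator, the whole sample for Parzen)
    weights : (K,) nonnegative weights summing to 1
    h       : kernel bandwidth
    """
    d2 = (x_eval[:, None] - centers[None, :]) ** 2
    K = np.exp(-d2 / (2 * h ** 2)) / np.sqrt(2 * np.pi * h ** 2)
    return K @ weights           # (M,) density values
```

Because the weights sum to 1 and each Gaussian kernel integrates to 1, the resulting estimate is itself a valid density.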


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2003

On an equivalence between PLSI and LDA

Mark A. Girolami; Ata Kabán

Latent Dirichlet Allocation (LDA) is a fully generative approach to language modelling which overcomes the inconsistent generative semantics of Probabilistic Latent Semantic Indexing (PLSI). This paper shows that PLSI is a maximum a posteriori estimated LDA model under a uniform Dirichlet prior, therefore the perceived shortcomings of PLSI can be resolved and elucidated within the LDA framework.
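The claimed relationship can be sketched in standard LDA notation (this is an illustrative summary, not the paper's exact derivation). For the document-topic proportions $\theta_d$ under a uniform Dirichlet prior $\alpha = 1$,

```latex
\hat{\theta}_d
  \;=\; \arg\max_{\theta_d}\;
        \Big[\, \log p(w_d \mid \theta_d, \beta)
              + \log \mathrm{Dir}(\theta_d \mid \mathbf{1}) \,\Big]
  \;=\; \arg\max_{\theta_d}\; \log p(w_d \mid \theta_d, \beta),
```

since $\mathrm{Dir}(\theta_d \mid \mathbf{1})$ is constant on the simplex. The MAP objective therefore coincides with PLSI's maximum-likelihood objective under the identification $p(z \mid d) = \theta_{dz}$.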


Neural Computation | 2006

Variational Bayesian multinomial probit regression with Gaussian process priors

Mark A. Girolami; Simon Rogers

It is well known in the statistics literature that augmenting binary and polychotomous response models with gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favor of gaussian process (GP) priors over functions, and employing variational approximations to the full posterior, we obtain efficient computational methods for GP classification in the multiclass setting. The model augmentation with additional latent variables ensures full a posteriori class coupling while retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multiclass informative vector machines (IVM), emerge in a natural and straightforward manner. This is the first time that a fully variational Bayesian treatment for multiclass GP classification has been developed without having to resort to additional explicit approximations to the nongaussian likelihood term. Empirical comparisons with exact analysis using Markov chain Monte Carlo (MCMC) and with Laplace approximations illustrate the utility of the variational approximation as a computationally economic alternative to full MCMC, and it is shown to be more accurate than the Laplace approximation.
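The data augmentation the abstract builds on can be illustrated by the multinomial probit generative model itself. This is a minimal sketch, assuming per-class latent function values are already given (e.g. drawn from K independent GP priors); names and shapes are illustrative.

```python
import numpy as np

def sample_class_labels(F, rng):
    """Multinomial probit model via gaussian latent (auxiliary) variables.

    F : (N, K) latent function values f_k(x_n) for K classes.
    Each auxiliary variable is m_nk = f_k(x_n) + eps_nk with
    eps_nk ~ N(0, 1), and the observed label is argmax over k.
    Conditioning on these m's is what makes Gibbs sampling (and the
    variational treatment) of the posterior tractable.
    """
    M = F + rng.standard_normal(F.shape)   # gaussian auxiliary variables
    return np.argmax(M, axis=1)            # observed class labels
```

When one class's latent function dominates the others by much more than the unit noise scale, that class is selected almost surely.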

Collaboration


Dive into Mark A. Girolami's collaborations.

Top Co-Authors

Walter Kolch

University College Dublin

Mingjun Zhong

Dalian University of Technology