Markus Harva
Helsinki University of Technology
Publications
Featured research published by Markus Harva.
Signal Processing | 2004
Harri Valpola; Markus Harva; Juha Karhunen
In many models, variances are assumed to be constant, although this assumption is often unrealistic in practice. Joint modelling of means and variances is difficult in many learning approaches, because it can lead to infinite probability densities. We show that a Bayesian variational technique which is sensitive to probability mass instead of density is able to jointly model both variances and means. We consider a model structure where a Gaussian variable, called a variance node, controls the variance of another Gaussian variable. Variance nodes make it possible to build hierarchical models for both variances and means. We report experiments with artificial data which demonstrate the ability of the learning algorithm to find variance sources that explain and characterize well the variances in the multidimensional data. Experiments with biomedical MEG data show that variance sources are present in real-world signals.
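The variance-node construction can be illustrated with a small generative sketch. This is an illustrative reading of the abstract, not the authors' code, and all names are made up: a Gaussian variable u sets the log-variance of another Gaussian variable x, and the resulting marginal of x has heavier tails than a plain Gaussian, which is what lets variance sources explain bursts in signals such as MEG recordings.

```python
import numpy as np

rng = np.random.default_rng(0)

# A Gaussian "variance node" u controls the variance of x. Parameterising
# the variance as exp(u) keeps it positive without any constraint.
n = 1000
u = rng.normal(loc=0.0, scale=1.0, size=n)      # variance node (Gaussian)
x = rng.normal(loc=0.0, scale=np.exp(u / 2.0))  # x | u ~ N(0, exp(u))

# The marginal of x is super-Gaussian: its excess kurtosis is positive.
excess_kurtosis = np.mean(x**4) / np.mean(x**2) ** 2 - 3.0
print(excess_kurtosis > 0.0)
```

Since exp(u) is itself random, x is a continuous scale mixture of Gaussians, which is the simplest case of the hierarchical variance models the abstract describes.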
Signal Processing | 2007
Markus Harva; Ata Kabán
Linear factor models with non-negativity constraints have received a great deal of interest in a number of problem domains. In existing approaches, positivity has often been associated with sparsity. In this paper we argue that sparsity of the factors is not always a desirable option, but rather a technical limitation of the currently existing solutions. We then reformulate the problem in order to relax the sparsity constraint while retaining positivity. This is achieved by employing a rectification nonlinearity rather than a positively supported prior directly on the latent space. A variational learning procedure is derived for the proposed model and contrasted with existing related approaches. Both i.i.d. and first-order AR variants of the proposed model are provided and they are experimentally demonstrated with artificial data. Application to the analysis of galaxy spectra shows the benefits of the method in a real-world astrophysical problem, where the existing approach is not a viable alternative.
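The rectification idea can be sketched generatively. The sizes and names below are illustrative, not from the paper: Gaussian latents are passed through max(0, ·), giving non-negative factors that are dense rather than sparse whenever the Gaussian mean sits well above zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Non-negativity via rectification of Gaussian latents, instead of a
# positively supported prior placed directly on the latent space.
n_samples, n_factors, n_dims = 500, 3, 10
r = rng.normal(loc=2.0, scale=1.0, size=(n_samples, n_factors))
s = np.maximum(0.0, r)                             # rectified, non-negative factors
W = np.abs(rng.normal(size=(n_factors, n_dims)))   # illustrative mixing matrix
x = s @ W + 0.1 * rng.normal(size=(n_samples, n_dims))  # observed data

# With mean 2 and unit scale, only ~2% of latents clip to zero: the
# factors are positive yet dense, decoupling positivity from sparsity.
print(np.all(s >= 0.0), float(np.mean(s == 0.0)))
```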
Monthly Notices of the Royal Astronomical Society | 2006
Louisa A. Nolan; Markus Harva; Ata Kabán; Somak Raychaudhury
Efficient predictive models and data analysis techniques for the analysis of photometric and spectroscopic observations of galaxies are not only desirable, but also required, in view of the overwhelming quantities of data becoming available. We present the results of a novel application of Bayesian latent variable modelling techniques, where we have formulated a data-driven algorithm that allows one to explore the stellar populations of a large sample of galaxies from their spectra, without the application of detailed physical models. Our only assumption is that the galaxy spectrum can be expressed as a linear superposition of a small number of independent factors, each a spectrum of a stellar subpopulation that cannot be individually observed. A probabilistic latent variable architecture that explicitly encodes this assumption is then formulated, and a rigorous Bayesian methodology is employed for solving the inverse modelling problem from the available data. A powerful aspect of this method is that it formulates a density model of the spectra, based on which we can handle observational errors. Further, we can recover missing data, either from the original set of spectra, which might have incomplete spectral coverage of each galaxy, or from previously unseen spectra of the same kind. We apply this method to a sample of 21 ultraviolet–optical spectra of well-studied early-type galaxies, for which we also derive detailed physical models of star formation history (i.e. age, metallicity and relative mass fraction of the component stellar populations). We also apply it to synthetic spectra made up of two stellar populations, spanning a large range of parameters. We apply four different data models, starting from a formulation of principal component analysis (PCA), which has been widely used.
We explore alternative factor models, relaxing the physically unrealistic assumption of Gaussian factors, as well as excluding the negative flux values that PCA allows, and show that other models perform equally well or better, while yielding more physically acceptable results. In particular, the more physically motivated assumptions of our rectified factor analysis enable it to perform better than PCA, and to recover physically meaningful results. We find that our data-driven Bayesian modelling allows us to identify those early-type galaxies that contain a significant stellar population that is ≲1-Gyr old. This analysis also indicates that our sample of early-type spectra shows no evidence of more than two major stellar populations differing significantly in age and metallicity. This method will help us to search for such young populations in a large ensemble of spectra of early-type galaxies, without fitting detailed models, and thereby to study the underlying physical processes governing the formation and evolution of early-type galaxies, particularly those leading to the suppression of star formation in dense environments. In particular, this method would be a very useful tool for automatically discovering various interesting subclasses of galaxies, for example, post-starburst or E+A galaxies.
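The linear-superposition assumption, and its use for recovering missing spectral coverage, can be illustrated with a toy least-squares sketch. Unlike the paper's Bayesian density model, which learns the components from data, this assumes the component spectra are already known; all arrays here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(5)

# Each "spectrum" is a non-negative combination of a few component spectra,
# so a low-rank fit on the observed wavelengths can fill in a gap.
n_wave, n_factors, n_gal = 60, 2, 30
components = np.abs(rng.normal(size=(n_factors, n_wave)))  # component spectra
weights = np.abs(rng.normal(size=(n_gal, n_factors)))      # per-galaxy mixing
spectra = weights @ components

galaxy = spectra[0].copy()
missing = np.zeros(n_wave, dtype=bool)
missing[20:30] = True                  # simulate incomplete wavelength coverage
obs = ~missing

# Fit the mixing weights from observed wavelengths only, then predict the gap.
w_hat, *_ = np.linalg.lstsq(components[:, obs].T, galaxy[obs], rcond=None)
reconstruction = w_hat @ components

print(float(np.max(np.abs(reconstruction[missing] - spectra[0][missing]))))
```

In this noise-free toy the gap is recovered essentially exactly; the paper's probabilistic treatment is what extends the idea to noisy spectra with observational errors.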
Neurocomputing | 2008
Markus Harva; Somak Raychaudhury
A method for estimating time delays between signals that are irregularly sampled is presented. The approach is based on postulating a latent variable model from which the observed signals have been generated and computing the posterior distribution of the delay. This is achieved partly by exact marginalisation and partly by using MCMC methods. Experiments with artificial data show the effectiveness of the proposed approach, while results with real-world gravitational lens data provide the main motivation for this work.
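As a toy stand-in for the problem setting (the paper computes a posterior over the delay via exact marginalisation plus MCMC, which is not reproduced here), one can grid over candidate delays and score each by interpolation mismatch between the two irregularly sampled signals; everything below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two irregularly sampled, noisy views of the same underlying pattern,
# one delayed by true_delay relative to the other.
true_delay = 1.5
t1 = np.sort(rng.uniform(0.0, 20.0, 80))
t2 = np.sort(rng.uniform(0.0, 20.0, 80))
f = lambda t: np.sin(t) + 0.5 * np.sin(0.3 * t)
y1 = f(t1) + 0.05 * rng.normal(size=t1.size)
y2 = f(t2 - true_delay) + 0.05 * rng.normal(size=t2.size)

def mismatch(d):
    # interpolate signal 1 at the delay-shifted times of signal 2
    pred = np.interp(t2 - d, t1, y1)
    return np.mean((pred - y2) ** 2)

delays = np.linspace(0.0, 3.0, 301)
errors = np.array([mismatch(d) for d in delays])
estimate = delays[np.argmin(errors)]
print(float(estimate))
```

The point of the paper's Bayesian treatment is precisely that a full posterior over the delay, rather than a single grid minimum like this, is needed when sampling is sparse and noisy.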
Pattern Recognition | 2010
Juan C. Cuevas-Tello; Peter Tiňo; Somak Raychaudhury; Xin Yao; Markus Harva
We study the problem of estimating the time delay between two signals representing delayed, irregularly sampled and noisy versions of the same underlying pattern. We propose and demonstrate an evolutionary algorithm for the (hyper)parameter estimation of a kernel-based technique in the context of an astronomical problem, namely estimating the time delay between two gravitationally lensed signals from a distant quasar. Mixed types (integer and real) are used to represent variables within the evolutionary algorithm. We test the algorithm on several artificial data sets, and also on real astronomical observations of quasar Q0957+561. By carrying out a statistical analysis of the results we present a detailed comparison of our method with the most popular methods for time delay estimation in astrophysics. Our method yields more accurate and more stable time delay estimates. Our methodology can be readily applied to current state-of-the-art optical monitoring data in astronomy, but can also be applied in other disciplines involving similar time series data.
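A minimal sketch of a mixed-type evolutionary search follows. The objective is a toy surrogate, not the paper's kernel-based delay fitness, and the gene names are invented: each individual carries an integer gene and a real gene, mutated with type-appropriate operators under elitist truncation selection.

```python
import numpy as np

rng = np.random.default_rng(3)

def fitness(n_kernels, width):
    # toy objective with optimum at n_kernels = 5, width = 0.8
    return (n_kernels - 5) ** 2 + 10.0 * (width - 0.8) ** 2

# population of (integer, real) individuals
pop = [(rng.integers(1, 10), rng.uniform(0.1, 2.0)) for _ in range(20)]
for _ in range(50):
    parents = sorted(pop, key=lambda ind: fitness(*ind))[:10]  # truncation selection
    children = []
    for n, w in parents:
        n_child = int(np.clip(n + rng.integers(-1, 2), 1, 10))       # integer mutation
        w_child = float(np.clip(w + 0.1 * rng.normal(), 0.05, 2.0))  # real mutation
        children.append((n_child, w_child))
    pop = parents + children  # elitism: parents survive alongside children

best = min(pop, key=lambda ind: fitness(*ind))
print(best)
```

Keeping the parents in the population means the best individual never worsens, so the search settles on the integer optimum while the real gene is refined by small Gaussian steps.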
2006 16th IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing | 2006
Markus Harva; Somak Raychaudhury
A method for estimating time delays between signals that are irregularly sampled is presented. The approach is based on postulating a latent variable model from which the observed signals have been generated and computing the posterior distribution of the delay. This is achieved partly by exact marginalisation and partly by using MCMC methods. Experiments with artificial data show the effectiveness of the proposed approach, while results with real-world gravitational lens data provide the main motivation for this work.
International Symposium on Neural Networks | 2005
Markus Harva; Ata Kabán
Linear factor models with non-negativity constraints have received a great deal of interest in a number of problem domains. In existing approaches, positivity has often been associated with sparsity. In this paper we argue that sparsity of the factors is not always a desirable option, but rather a technical limitation of the currently existing solutions. We then reformulate the problem in order to relax the sparsity constraint while retaining positivity. A variational inference procedure is derived and contrasted with existing related approaches. Both i.i.d. and first-order AR variants of the proposed model are provided and these are experimentally demonstrated in a real-world astrophysical application.
International Joint Conference on Neural Networks | 2006
Markus Harva
In many applications of supervised learning, the conditional average of the target variables is not sufficient for prediction. The dependencies between the explanatory variables and the target variables can be much more complex, calling for modelling the full conditional probability density. The ubiquitous problem with such methods is overfitting: due to the flexibility of the model, the likelihood of any data point can be made arbitrarily large. In this paper a method for predicting uncertainty by modelling the conditional density is presented, based on conditioning the scale parameter of the noise process on the explanatory variables. The regularisation problems are solved by learning the model using variational EM. Results with synthetic data show that the approach works well, and experiments with real-world environmental data are promising.
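A minimal numerical sketch of the core idea follows, using plain gradient descent on the Gaussian negative log-likelihood rather than the paper's variational EM; the linear parameterisation and all data are invented for illustration. Both the conditional mean and the log-variance are functions of the input, so the noise scale is conditioned on the explanatory variable.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic heteroscedastic data: y ~ N(2x, exp(x - 1)), i.e. the noise
# level itself depends on the input x.
x = rng.uniform(-1.0, 1.0, 2000)
y = 2.0 * x + rng.normal(scale=np.exp(0.5 * (x - 1.0)))

# Model: y ~ N(a*x + b, exp(c*x + d)); fit a, b, c, d by gradient descent
# on the average Gaussian negative log-likelihood.
a = b = c = d = 0.0
lr = 0.05
for _ in range(2000):
    resid = y - (a * x + b)
    inv_var = np.exp(-(c * x + d))
    g_mean = -resid * inv_var                      # d(NLL)/d(mean)
    g_log_var = 0.5 * (1.0 - resid**2 * inv_var)   # d(NLL)/d(log-variance)
    a -= lr * np.mean(g_mean * x)
    b -= lr * np.mean(g_mean)
    c -= lr * np.mean(g_log_var * x)
    d -= lr * np.mean(g_log_var)

print(a, b, c, d)  # should approach the generating values a=2, b=0, c=1, d=-1
```

Because the likelihood penalises both residuals and over-confident variances, the fitted log-variance slope tracks the input-dependent noise, which is exactly the kind of predictive uncertainty the abstract targets.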
Journal of Machine Learning Research | 2007
Tapani Raiko; Harri Valpola; Markus Harva; Juha Karhunen
Archive | 2003
Harri Valpola; Antti Honkela; Markus Harva; Alexander Ilin; Tapani Raiko; Tomas Östman