Ella Bingham
Helsinki University of Technology
Publications
Featured research published by Ella Bingham.
Knowledge Discovery and Data Mining | 2001
Ella Bingham; Heikki Mannila
Random projections have recently emerged as a powerful method for dimensionality reduction. Theoretical results indicate that the method preserves distances quite nicely; however, empirical results are sparse. We present experimental results on using random projection as a dimensionality reduction tool in a number of cases, where the high dimensionality of the data would otherwise lead to burdensome computations. Our application areas are the processing of both noisy and noiseless images, and information retrieval in text documents. We show that projecting the data onto a random lower-dimensional subspace yields results comparable to conventional dimensionality reduction methods such as principal component analysis: the similarity of data vectors is preserved well under random projection. However, using random projections is computationally significantly less expensive than using, e.g., principal component analysis. We also show experimentally that using a sparse random matrix gives additional computational savings in random projection.
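The distance-preservation claim can be illustrated with a small sketch (illustrative sizes and toy Gaussian data, not the paper's image or text experiments). It compares a dense Gaussian projection against an Achlioptas-style sparse projection, the kind of sparse random matrix the abstract refers to:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 5000, 500            # points, original dim, reduced dim
X = rng.standard_normal((n, d))     # toy high-dimensional data

# Dense random projection: i.i.d. Gaussian entries, scaled so that
# squared distances are preserved in expectation.
R = rng.standard_normal((d, k)) / np.sqrt(k)
Xp = X @ R

# Sparse random projection (entries +sqrt(3), 0, -sqrt(3) with
# probabilities 1/6, 2/3, 1/6): two thirds of the entries are zero,
# so the matrix multiply is cheaper, with similar guarantees.
S = rng.choice([np.sqrt(3.0), 0.0, -np.sqrt(3.0)],
               size=(d, k), p=[1 / 6, 2 / 3, 1 / 6]) / np.sqrt(k)
Xs = X @ S

def pairwise_dists(A):
    """All pairwise Euclidean distances via the Gram matrix."""
    G = A @ A.T
    sq = np.diag(G)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * G, 0.0)
    return np.sqrt(D2)[np.triu_indices(len(A), 1)]

orig = pairwise_dists(X)
dense_distortion = np.max(np.abs(pairwise_dists(Xp) / orig - 1))
sparse_distortion = np.max(np.abs(pairwise_dists(Xs) / orig - 1))
print("dense RP max distortion:", dense_distortion)
print("sparse RP max distortion:", sparse_distortion)
```

With k = 500 the worst-case relative distortion over all pairs typically stays within a few percent of the Johnson–Lindenstrauss prediction, for both the dense and the sparse matrix.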
International Journal of Neural Systems | 2000
Ella Bingham; Aapo Hyvärinen
Separation of complex valued signals is a frequently arising problem in signal processing. For example, separation of convolutively mixed source signals involves computations on complex valued signals. In this article, it is assumed that the original, complex valued source signals are mutually statistically independent, and the problem is solved by the independent component analysis (ICA) model. ICA is a statistical method for transforming an observed multidimensional random vector into components that are mutually as independent as possible. In this article, a fast fixed-point type algorithm that is capable of separating complex valued, linearly mixed source signals is presented and its computational efficiency is shown by simulations. Also, the local consistency of the estimator given by the algorithm is proved.
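The fixed-point update at the heart of such an algorithm can be sketched for a single component on whitened data. The sketch below is an assumption-laden caricature: it uses the kurtosis-style nonlinearity g(u) = u, toy 4-QAM sources, and no deflation scheme, so it recovers one source only, up to an arbitrary phase.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20000

# Two independent, circular, non-Gaussian complex sources (4-QAM symbols).
qam = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
S = qam[rng.integers(0, 4, size=(2, T))]

# Random complex mixing matrix and observed mixtures.
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
X = A @ S

# Whiten the observations so that E{z z^H} = I.
C = (X @ X.conj().T) / T
d, E = np.linalg.eigh(C)
Z = E @ np.diag(d ** -0.5) @ E.conj().T @ X

# One-unit fixed-point iteration with g(u) = u (kurtosis-based):
#   w <- E{ z (w^H z)* |w^H z|^2 } - 2 w,  then normalize.
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w.conj() @ Z                            # current estimate w^H z
    w = (Z * (y.conj() * np.abs(y) ** 2)).mean(axis=1) - 2 * w
    w /= np.linalg.norm(w)

y = w.conj() @ Z
# Up to phase and ordering, y should match one of the original sources.
corr = max(abs((y * S[0].conj()).mean()), abs((y * S[1].conj()).mean()))
print("|correlation| with the closest source:", corr)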
European Journal of Operational Research | 2001
Ella Bingham
A fuzzy traffic signal controller uses simple "if–then" rules which involve linguistic concepts such as medium or long, presented as membership functions. In neurofuzzy traffic signal control, a neural network adjusts the fuzzy controller by fine-tuning the form and location of the membership functions. The learning algorithm of the neural network is reinforcement learning, which gives credit for successful system behavior and punishes for poor behavior; those actions that led to success tend to be chosen more often in the future. The objective of the learning is to minimize the vehicular delay caused by the signal control policy. In simulation experiments, the learning algorithm is found successful at constant traffic volumes: the new membership functions produce smaller vehicular delay than the initial membership functions.
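The fuzzy-rule mechanism can be illustrated with a minimal sketch. The membership-function shapes, the queue-length variable, and the two rules below are invented for illustration; they are not the controller from the paper, where the neural network would adjust exactly these shapes during learning.

```python
def tri(x, a, b, c):
    """Triangular membership function with peak at b and support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative linguistic terms for the approaching-vehicle queue length.
def medium(q): return tri(q, 2.0, 5.0, 8.0)
def long(q):   return tri(q, 6.0, 10.0, 14.0)

def extension_seconds(q):
    """Weighted-average defuzzification of two toy rules:
       IF queue is medium THEN extend green by 3 s;
       IF queue is long   THEN extend green by 6 s."""
    w_med, w_long = medium(q), long(q)
    if w_med + w_long == 0.0:
        return 0.0
    return (3.0 * w_med + 6.0 * w_long) / (w_med + w_long)

# A queue of 7 vehicles is partly "medium" and partly "long",
# so the controller interpolates between the two rule outputs.
print(extension_seconds(7.0))
```

Neurofuzzy learning would shift the breakpoints (2, 5, 8, ...) of these membership functions to reduce the observed vehicular delay.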
Proceedings of the National Academy of Sciences of the United States of America | 2008
Lee Hsiang Liow; Mikael Fortelius; Ella Bingham; Kari Lintulaakso; Heikki Mannila; Lawrence J. Flynn; Nils Chr. Stenseth
Do large mammals evolve faster than small mammals or vice versa? Because the answer to this question contributes to our understanding of how life history affects long-term and large-scale evolutionary patterns, and how microevolutionary rates scale up to macroevolutionary rates, it has received much attention. A satisfactory or consistent answer to this question is lacking, however. Here, we take a fresh look at this problem using a large fossil dataset of mammals from the Neogene of the Old World (NOW). Controlling for sampling biases, calculating per capita origination and extinction rates of boundary-crossers and estimating survival probabilities using capture-mark-recapture (CMR) methods, we found the recurring pattern that large mammal genera and species have higher origination and extinction rates, and therefore shorter durations. This pattern is surprising in the light of molecular studies, which show that smaller animals, with their shorter generation times and higher metabolic rates, have greater absolute rates of evolution. However, higher molecular rates do not necessarily translate to higher taxon rates because both the biotic and physical environments interact with phenotypic variation, in part fueled by mutations, to affect origination and extinction rates. To explain the observed pattern, we propose that the ability to evolve and maintain behavior such as hibernation, torpor and burrowing, collectively termed "sleep-or-hide" (SLOH) behavior, serves as a means of environmental buffering during expected and unexpected environmental change. SLOH behavior is more common in some small mammals, and, as a result, SLOH small mammals contribute to higher average survivorship and lower origination probabilities among small mammals.
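The per capita boundary-crosser rates mentioned above have a standard formulation due to Foote; the sketch below assumes that formulation (the paper may use a variant), with invented toy counts:

```python
import math

def per_capita_rates(n_bt, n_b, n_t, dt):
    """Per capita origination and extinction rates from boundary-crosser
    counts, in the style of Foote (2000). Assumed formulation.
      n_bt: taxa crossing both the bottom and the top boundary
      n_b:  all bottom-boundary crossers (n_bt plus those going extinct)
      n_t:  all top-boundary crossers (n_bt plus those originating)
      dt:   interval length in Myr
    """
    p = -math.log(n_bt / n_t) / dt   # origination rate per Lmy
    q = -math.log(n_bt / n_b) / dt   # extinction rate per Lmy
    return p, q

# Toy counts for a 2-Myr interval: 40 genera cross both boundaries,
# 60 cross the bottom (20 go extinct), 55 cross the top (15 originate).
p, q = per_capita_rates(40, 60, 55, 2.0)
print("origination:", p, "extinction:", q)
```

Comparing such rates between size classes, after controlling for sampling, is what yields the paper's finding that large mammals turn over faster.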
Neural Processing Letters | 2003
Ella Bingham; Ata Kabán; Mark A. Girolami
The problem of analysing dynamically evolving textual data has arisen within the last few years. An example of such data is the discussion appearing in Internet chat lines. In this Letter a recently introduced source separation method, termed complexity pursuit, is applied to the problem of finding topics in dynamical text and is compared against several blind separation algorithms for the problem considered. Complexity pursuit is a generalisation of projection pursuit to time series and it is able to use both higher-order statistical measures and temporal dependency information in separating the topics. Experimental results on chat line and newsgroup data demonstrate that the minimum complexity time series indeed do correspond to meaningful topics inherent in the dynamical text data, and also suggest the applicability of the method to query-based retrieval from a temporally changing text stream.
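The core idea, finding the projection of a multivariate time series whose prediction residual is least complex, can be caricatured in two dimensions. The sketch below is not the actual algorithm (which uses gradient-based optimisation and a proper coding-length criterion); it grid-searches for the direction whose lag-1 prediction residual is most non-Gaussian, a crude stand-in for short coding length at fixed variance:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000

# Source 1: AR(1) process with sparse (Laplacian) innovations --
# temporally predictable, low coding complexity. Source 2: Gaussian
# white noise. Both are invented stand-ins for topic time series.
innov = rng.laplace(size=T)
s1 = np.zeros(T)
for t in range(1, T):
    s1[t] = 0.9 * s1[t - 1] + innov[t]
s2 = rng.standard_normal(T)
S = np.vstack([s1 / s1.std(), s2])

A = np.array([[1.0, 0.7], [0.4, 1.0]])   # mixing matrix
X = A @ S

def residual_nongaussianity(theta):
    """Project onto angle theta, remove the best lag-1 linear
    prediction, and score the residual's excess kurtosis."""
    w = np.array([np.cos(theta), np.sin(theta)])
    y = w @ X
    a = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])   # lag-1 predictor
    r = y[1:] - a * y[:-1]
    r = (r - r.mean()) / r.std()
    return abs(np.mean(r ** 4) - 3.0)

thetas = np.linspace(0.0, np.pi, 361)
best = thetas[np.argmax([residual_nongaussianity(t) for t in thetas])]
y = np.array([np.cos(best), np.sin(best)]) @ X
corr = abs(np.corrcoef(y, S[0])[0, 1])
print("|correlation| with the low-complexity source:", corr)
```

The winning direction recovers the structured source: its innovation sequence is sparse, hence cheaply codable, which is exactly the "minimum complexity time series" the abstract refers to.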
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2002
Ella Bingham; Jukka Kuusisto; Krista Lagus
In this study we show experimental results on using Independent Component Analysis (ICA) and the Self-Organizing Map (SOM) in document analysis. Our documents are segments of spoken dialogues carried out over the telephone in a customer service, transcribed into text. The task is to analyze the topics of the discussions, and to group the discussions into meaningful subsets. The quality of the grouping is studied by comparing it against a manual topical classification of the documents.
International Symposium on Neural Networks | 2000
Ella Bingham; Aapo Hyvärinen
Separation of complex valued signals is a frequently arising problem in signal processing. In this article it is assumed that the original, complex valued source signals are mutually statistically independent, and the problem is solved by the independent component analysis (ICA) model. ICA is a statistical method for transforming an observed multidimensional random vector into components that are mutually as independent as possible. In this article, a fast fixed-point type algorithm that is capable of separating complex valued, linearly mixed source signals is presented and its computational efficiency is shown by simulations. We also present a theorem on the local consistency of the estimator given by the algorithm.
Neurocomputing | 2008
Ata Kabán; Ella Bingham
Presence-absence (0-1) observations are special in that often the absence of evidence is not evidence of absence. Here we develop an independent factor model, which has the unique capability to isolate the former as an independent discrete binary noise factor. This representation then forms the basis of inferring missed presences by means of denoising. This is achieved in a probabilistic formalism, employing independent beta latent source densities and a Bernoulli data likelihood model. Variational approximations are employed to make the inferences tractable. We relate our model to existing models of 0-1 data, demonstrating its advantages for the problem considered, and we present applications in several problem domains, including social network analysis and DNA fingerprint analysis.
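The "absence of evidence is not evidence of absence" point can be made concrete with a toy sketch. This is emphatically not the paper's beta/Bernoulli variational factor model; it is a one-parameter caricature in which true presences are sometimes recorded as zeros, and the posterior probability that an observed 0 hides a true 1 is computed by Bayes' rule:

```python
import numpy as np

rng = np.random.default_rng(3)
n_sites, n_species = 200, 30

# Toy generative process (invented parameters): per-species presence
# rates, then a "miss" noise process that flips some 1s to 0s.
rate = rng.uniform(0.2, 0.8, size=n_species)   # P(truly present)
miss = 0.3                                     # P(presence recorded as 0)
truth = rng.random((n_sites, n_species)) < rate
observed = truth & (rng.random((n_sites, n_species)) >= miss)

# Posterior that an observed 0 hides a true presence:
#   P(truth=1 | obs=0) = rate*miss / (rate*miss + (1 - rate)).
post = rate * miss / (rate * miss + (1 - rate))

# Sanity check against the ground truth: among the observed zeros of
# each species, what fraction are actually hidden presences?
zeros = ~observed
empirical = (truth & zeros).sum(axis=0) / zeros.sum(axis=0)
print("mean |posterior - empirical|:", np.abs(post - empirical).mean())
```

The paper's model does the analogous inference without knowing `rate` and `miss`, by learning independent latent factors (including the discrete binary noise factor) from the 0-1 data itself.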
European Conference on Principles of Data Mining and Knowledge Discovery | 2003
Jouni K. Seppänen; Ella Bingham; Heikki Mannila
Topics in 0–1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0–1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.
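The lift statistic for an attribute pair is P(A, B) / (P(A) P(B)): above 1 for positively connected attributes, below 1 otherwise. A minimal sketch (toy generative topic data, invented parameters, not the paper's model or algorithm) shows the block structure in the lift matrix that such an algorithm can exploit:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 5000, 6

# Toy 0-1 data: topic 1 covers attributes 0-2, topic 2 covers
# attributes 3-5. Each row activates one topic, which switches its
# attributes on with probability 0.8.
X = np.zeros((n, d), dtype=int)
topic = rng.random(n) < 0.5
X[topic, :3] = rng.random((topic.sum(), 3)) < 0.8
X[~topic, 3:] = rng.random(((~topic).sum(), 3)) < 0.8

p = X.mean(axis=0)               # marginal probability of each attribute
joint = (X.T @ X) / n            # pairwise co-occurrence probabilities
lift = joint / np.outer(p, p)    # lift(i, j)

# Within-topic pairs have lift > 1; cross-topic pairs have lift < 1,
# so the lift matrix is approximately block-diagonal over the topics.
print(np.round(lift, 2))
```

Recovering the topics then amounts to reading the blocks off this matrix, which is the matrix-form description the abstract refers to.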
Pattern Analysis and Applications | 2009
Ella Bingham; Ata Kabán; Mikael Fortelius
We present a probabilistic multiple cause model for the analysis of binary (0–1) data. A distinctive feature of the aspect Bernoulli (AB) model is its ability to automatically detect and distinguish between “true absences” and “false absences” (both of which are coded as 0 in the data), and similarly, between “true presences” and “false presences” (both of which are coded as 1). This is accomplished by specific additive noise components which explicitly account for such non-content bearing causes. The AB model is thus suitable for noise removal and data explanatory purposes, including omission/addition detection. An important application of AB that we demonstrate is data-driven reasoning about palaeontological recordings. Additionally, results on recovering corrupted handwritten digit images and expanding short text documents are also given, and comparisons to other methods are demonstrated and discussed.