Publication


Featured research published by Mahesan Niranjan.


Neural Computation | 1993

A function estimation approach to sequential learning with neural networks

Visakan Kadirkamanathan; Mahesan Niranjan

In this paper, we investigate the problem of optimal sequential learning, viewed as a problem of estimating an underlying function sequentially rather than estimating a set of parameters of the neural network. First, we arrive at a suboptimal solution to the sequential estimate that can be mapped by a growing gaussian radial basis function (GaRBF) network. This network adds hidden units for each observation. A function space approach, in which the estimates are represented as vectors in a function space, is used to develop a growth criterion that limits this growth. A simplification of the criterion leads to two conditions that must be satisfied jointly: one on the distance of the present pattern from the existing unit centers in the input space, and one on the approximation error of the network for the given observation. This network is similar to the resource allocating network (RAN) (Platt 1991a), and hence RAN can be interpreted from a function space approach to sequential learning. Second, we present an enhancement to the RAN. The RAN either allocates a new unit based on the novelty of an observation or adapts the network parameters by the LMS algorithm. The function space interpretation of the RAN lends itself to an enhancement in which the extended Kalman filter (EKF) algorithm is used in place of the LMS algorithm. The performance of the RAN and the enhanced network is compared on the experimental tasks of function approximation and time-series prediction, demonstrating that the enhanced network achieves superior performance with fewer hidden units. The approach adopted here has led us toward the minimal network required for a sequential learning problem.
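
To make the growth rule concrete, here is a minimal sketch of a growing gaussian RBF network with the two joint novelty criteria described above. The thresholds, the fixed unit width, and the plain LMS update for the non-growth step are illustrative assumptions; the enhanced network in the paper would substitute an EKF update for the LMS step.

```python
import numpy as np

class GrowingRBF:
    """Toy RAN-style growing Gaussian RBF network (illustrative sketch only)."""

    def __init__(self, dist_thresh=0.5, err_thresh=0.1, lr=0.05, width=0.5):
        self.centers, self.weights = [], []   # hidden-unit centres and output weights
        self.dist_thresh = dist_thresh        # novelty threshold in input space
        self.err_thresh = err_thresh          # novelty threshold on prediction error
        self.lr, self.width = lr, width

    def _phi(self, x):
        # Gaussian activations of all hidden units for input x
        return np.array([np.exp(-np.sum((x - c) ** 2) / (2 * self.width ** 2))
                         for c in self.centers])

    def predict(self, x):
        if not self.centers:
            return 0.0
        return float(self._phi(x) @ np.array(self.weights))

    def observe(self, x, y):
        err = y - self.predict(x)
        dist = (min(np.linalg.norm(x - c) for c in self.centers)
                if self.centers else np.inf)
        # Joint growth criterion: far from existing centres AND large error
        if dist > self.dist_thresh and abs(err) > self.err_thresh:
            self.centers.append(np.asarray(x, dtype=float))
            self.weights.append(err)          # new unit absorbs the residual
        else:
            # Otherwise adapt existing weights (LMS here; EKF in the enhanced RAN)
            phi = self._phi(x)
            self.weights = list(np.array(self.weights) + self.lr * err * phi)

# Usage: learn y = sin(x) from a stream of observations
net = GrowingRBF()
for x in np.random.uniform(0, 2 * np.pi, 200):
    net.observe(np.array([x]), np.sin(x))
print(len(net.centers), "units;", "prediction at 1.0:", net.predict(np.array([1.0])))
```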


IEEE Transactions on Neural Networks | 1990

A theoretical investigation into the performance of the Hopfield model

Sreeram V. B. Aiyer; Mahesan Niranjan; Frank Fallside

An analysis is made of the behavior of the Hopfield model as a content-addressable memory (CAM) and as a method of solving the traveling salesman problem (TSP). The analysis is based on the geometry of the subspace set up by the degenerate eigenvalues of the connection matrix. The dynamic equation is shown to be equivalent to a projection of the input vector onto this subspace. In the case of content-addressable memory, it is shown that spurious fixed points can occur at any corner of the hypercube that lies on or near the subspace spanned by the memory vectors. The analysis also explains why the network frequently converges to an invalid solution when applied to the traveling salesman problem energy function. With the resulting expressions, the network can be made robust and can reliably solve the traveling salesman problem with tour sizes of 50 cities or more.
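
The subspace view lends itself to a small numerical illustration. The sketch below is only a toy reconstruction, not the paper's derivation: it replaces the connection matrix by the orthogonal projector onto the span of the stored patterns and iterates project-then-threshold, so that corrupted memories are usually recovered while hypercube corners close to the memory subspace can behave as spurious fixed points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store a few bipolar memory vectors as the columns of M
n, m = 16, 3
M = rng.choice([-1.0, 1.0], size=(n, m))

# Orthogonal projector onto the subspace spanned by the memories
P = M @ np.linalg.inv(M.T @ M) @ M.T

def recall(x, steps=10):
    """Iteratively project onto the memory subspace and threshold to a hypercube corner."""
    v = x.astype(float)
    for _ in range(steps):
        v = np.sign(P @ v)
        v[v == 0] = 1.0
    return v

# Corrupt one stored memory in a few positions and try to recall it
probe = M[:, 0].copy()
probe[:3] *= -1
print("recovered original memory:", np.array_equal(recall(probe), M[:, 0]))
```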


Neural Computation | 2000

Sequential Monte Carlo Methods to Train Neural Network Models

J.F.G. de Freitas; Mahesan Niranjan; Arnaud Doucet

We discuss a novel strategy for training neural networks using sequential Monte Carlo algorithms and propose a new hybrid gradient descent/sampling importance resampling algorithm (HySIR). In terms of computational time and accuracy, the hybrid SIR is a clear improvement over conventional sequential Monte Carlo techniques. The new algorithm may be viewed as a global optimization strategy that allows us to learn the probability distributions of the network weights and outputs in a sequential framework. It is well suited to applications involving on-line, nonlinear, and nongaussian signal processing. We show how the new algorithm outperforms extended Kalman filter training on several problems. In particular, we address the problem of pricing option contracts, traded in financial markets. In this context, we are able to estimate the one-step-ahead probability density functions of the options prices.
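
A compact sketch of the hybrid idea follows. It is not the published HySIR implementation: each particle of network weights takes a small gradient step before the usual sampling-importance-resampling update, and the toy one-hidden-unit model, noise levels, and particle count are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def net(w, x):
    # Tiny one-hidden-unit "network": y = w2 * tanh(w1 * x + w0)
    return w[..., 2] * np.tanh(w[..., 1] * x + w[..., 0])

def grad_step(w, x, y, lr=0.05, eps=1e-4):
    # One finite-difference gradient-descent step on squared error, per particle
    g = np.zeros_like(w)
    base = (y - net(w, x)) ** 2
    for i in range(w.shape[-1]):
        wp = w.copy()
        wp[..., i] += eps
        g[..., i] = ((y - net(wp, x)) ** 2 - base) / eps
    return w - lr * g

N, sigma_w, sigma_y = 200, 0.02, 0.1
particles = rng.normal(0.0, 1.0, size=(N, 3))       # particles over network weights
true_w = np.array([0.3, 1.5, 2.0])                  # toy "true" generating weights

for t in range(200):
    x = rng.uniform(-2, 2)
    y = net(true_w, x) + rng.normal(0, sigma_y)     # streaming noisy observation
    particles += rng.normal(0, sigma_w, particles.shape)   # diffusion (transition prior)
    particles = grad_step(particles, x, y)                   # hybrid part: gradient move
    logw = -0.5 * ((y - net(particles, x)) / sigma_y) ** 2   # importance weights
    w = np.exp(logw - logw.max()); w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]        # resampling step

xs = np.linspace(-2, 2, 50)
pred = np.array([net(particles, xv).mean() for xv in xs])    # posterior-mean prediction
print("RMS prediction error:", np.sqrt(np.mean((pred - net(true_w, xs)) ** 2)))
```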


British Machine Vision Conference | 1998

Realisable Classifiers: Improving Operating Performance on Variable Cost Problems.

Martin J. J. Scott; Mahesan Niranjan; Richard W. Prager

A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysis of a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the original ROC points of the existing classifiers. This hull is the maximum realisable ROC. A theorem for this method is derived and proved from an observation about the data, and experimental results verify that a superior classification system may be constructed using only the existing classifiers and the information of the original data. This new system is shown to produce the maximum realisable ROC, and as such provides a powerful technique for improving classification systems in problem domains within which classification costs may not be known a priori. Empirical results are presented for artificial data and for two real-world data sets: an image segmentation task and the diagnosis of abnormal thyroid condition.
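
The random-combination step is easy to illustrate: any operating point on the straight line joining two ROC points can be realised by choosing between the two classifiers with a biased coin flip per example. The two threshold classifiers and the score distributions below are invented purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)

def combined_classifier(clf_a, clf_b, lam):
    """Realise an operating point on the segment joining two ROC points:
    for each example, use clf_a with probability lam, otherwise clf_b."""
    def predict(x):
        return clf_a(x) if rng.random() < lam else clf_b(x)
    return predict

# Two hypothetical threshold classifiers on a 1-D score
clf_conservative = lambda x: x > 1.0    # low false-positive rate, low true-positive rate
clf_liberal      = lambda x: x > -1.0   # high false-positive rate, high true-positive rate

# Check that the combination interpolates between the two ROC points
neg = rng.normal(-1, 1, 5000)   # scores of negative examples
pos = rng.normal(+1, 1, 5000)   # scores of positive examples
for lam in (0.0, 0.5, 1.0):
    clf = combined_classifier(clf_conservative, clf_liberal, lam)
    fpr = np.mean([clf(x) for x in neg])
    tpr = np.mean([clf(x) for x in pos])
    print(f"lam={lam:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```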


Speech Communication | 1999

Parametric subspace modeling of speech transitions

Klaus Reinhard; Mahesan Niranjan

This paper describes an attempt at capturing segmental transition information for speech recognition tasks. The slowly varying dynamics of spectral trajectories carries much discriminant information that is only crudely modelled by traditional approaches such as HMMs. In approaches such as recurrent neural networks there is the hope, but not the convincing demonstration, that such transitional information could be captured. The method presented here starts from the very different position of explicitly capturing the trajectory of short-time spectral parameter vectors on a subspace in which the temporal sequence information is preserved. This is approached by introducing a temporal constraint into the well-known technique of Principal Component Analysis (PCA). On this subspace, the trajectory is modelled parametrically, and a distance metric is computed to perform classification of diphones. The Principal Curves method of Hastie and Stuetzle and the Generative Topographic Mapping (GTM) technique of Bishop, Svensen and Williams are used to describe the temporal evolution in terms of latent variables. On the difficult problem of /bee/, /dee/, /gee/, it was possible to retain discriminatory information with a small number of parameters. Experimental illustrations present results on the ISOLET and TIMIT databases.
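
As a rough illustration of the subspace-trajectory idea (using plain PCA in place of the temporally constrained variant, and a polynomial rather than a principal curve or GTM), one might project a sequence of spectral vectors onto a low-dimensional subspace and fit a parametric model of the resulting trajectory:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic "diphone": 20 frames of 12-dimensional spectral vectors drifting over time
T, D = 20, 12
t = np.linspace(0, 1, T)
frames = np.outer(t, rng.normal(size=D)) + np.outer(1 - t, rng.normal(size=D))
frames += 0.05 * rng.normal(size=(T, D))

# Plain PCA to a 2-D subspace (the paper adds a temporal constraint to this step)
X = frames - frames.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
traj = X @ Vt[:2].T                # 2-D trajectory, frame order preserved

# Parametric model of the trajectory: low-order polynomial in normalised time
coef = [np.polyfit(t, traj[:, k], deg=3) for k in range(2)]
recon = np.stack([np.polyval(c, t) for c in coef], axis=1)
print("trajectory fit RMSE:", np.sqrt(np.mean((recon - traj) ** 2)))
```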


Neural Computation | 1996

Pruning with replacement on limited resource allocating networks by f-projections

Christophe Molina; Mahesan Niranjan

The principle of F-projection, in sequential function estimation, provides a theoretical foundation for a class of gaussian radial basis function networks known as the resource allocating networks (RAN). The ad hoc rules for adaptively changing the size of RAN architectures can be justified from a geometric growth criterion defined in the function space. In this paper, we show that the same arguments can be used to arrive at a pruning with replacement rule for RAN architectures with a limited number of units. We illustrate the algorithm on the laser time series prediction problem of the Santa Fe competition and show that results similar to those of the winners of the competition can be obtained with pruning and replacement.
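
The pruning-with-replacement rule can be sketched as follows for a network with a fixed unit budget. The "least useful unit" heuristic used here (smallest absolute output weight) is a deliberate simplification of the F-projection-based criterion in the paper, and all thresholds are assumptions.

```python
import numpy as np

def observe(centers, weights, x, y, max_units=10,
            dist_thresh=0.5, err_thresh=0.1, width=0.5, lr=0.05):
    """One sequential update of a budget-limited Gaussian RBF network (sketch)."""
    if len(centers):
        phi = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2 * width ** 2))
        dist = np.min(np.linalg.norm(centers - x, axis=1))
    else:
        phi, dist = np.zeros(0), np.inf
    err = y - float(phi @ weights)

    if dist > dist_thresh and abs(err) > err_thresh:       # novel observation
        if len(centers) < max_units:                       # room left: grow as usual
            centers = np.vstack([centers, x])
            weights = np.append(weights, err)
        else:                                              # budget exhausted:
            k = np.argmin(np.abs(weights))                 # prune the least useful unit
            centers[k], weights[k] = x, err                # and replace it in place
    else:
        weights = weights + lr * err * phi                 # otherwise adapt the weights
    return centers, weights

# Usage: sequentially learn y = sin(x) with at most 10 hidden units
centers, weights = np.zeros((0, 1)), np.zeros(0)
for xv in np.random.uniform(0, 2 * np.pi, 500):
    centers, weights = observe(centers, weights, np.array([xv]), np.sin(xv))
print(len(centers), "units in use")
```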


Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop | 1999

Sequential support vector machines

N. de Freitas; Marta Milo; P. Clarkson; Mahesan Niranjan

We derive an algorithm to train support vector machines sequentially. The algorithm makes use of the Kalman filter and is optimal in a minimum variance framework. It extends the support vector machine paradigm to applications involving real-time and non-stationary signal processing. It also provides a computationally efficient alternative to the problem of quadratic optimisation.
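
A rough sketch of the sequential, minimum-variance flavour of this idea: treat the weights of a fixed kernel expansion as the state of a linear-Gaussian model and update them with the standard Kalman recursions as data arrive. The fixed expansion centres, the noise levels, and the squared-error criterion are simplifying assumptions rather than the paper's formulation.

```python
import numpy as np

def rbf(x, centers, gamma=1.0):
    # Feature vector: RBF kernel evaluations against a fixed set of centres
    return np.exp(-gamma * (x - centers) ** 2)

rng = np.random.default_rng(4)
centers = np.linspace(-3, 3, 15)          # fixed expansion points (an assumption)
d = len(centers)

w = np.zeros(d)                           # state estimate: expansion weights
P = np.eye(d) * 10.0                      # state covariance
Q, R = 1e-4 * np.eye(d), 0.05             # process and measurement noise (assumed)

for _ in range(500):
    x = rng.uniform(-3, 3)
    y = np.sin(x) + rng.normal(0, 0.1)    # streaming regression target
    h = rbf(x, centers)                   # measurement vector (linear in the weights)
    # Kalman filter time and measurement updates
    P = P + Q
    S = h @ P @ h + R                     # innovation variance (scalar)
    K = P @ h / S                         # Kalman gain
    w = w + K * (y - h @ w)
    P = P - np.outer(K, h) @ P

print("prediction at 1.0:", rbf(1.0, centers) @ w, "target:", np.sin(1.0))
```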


Neural Computation | 1995

On the practical applicability of VC dimension bounds

Sean B. Holden; Mahesan Niranjan

This article addresses the question of whether some recent Vapnik-Chervonenkis (VC) dimension-based bounds on sample complexity can be regarded as a practical design tool. Specifically, we are interested in bounds on the sample complexity for the problem of training a pattern classifier such that we can expect it to perform valid generalization. Early results using the VC dimension, while being extremely powerful, suffered from the fact that their sample complexity predictions were rather impractical. More recent results have begun to improve the situation by attempting to take specific account of the precise algorithm used to train the classifier. We perform a series of experiments based on a task involving the classification of sets of vowel formant frequencies. The results of these experiments indicate that the more recent theories provide sample complexity predictions that are significantly more applicable in practice than those provided by earlier theories; however, we also find that the recent theories still have significant shortcomings.
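
To give a feel for the scale of such predictions, the snippet below evaluates one commonly quoted distribution-free sufficient sample size (a Blumer et al.-style bound, used here purely for illustration; it is not necessarily one of the bounds compared in the article).

```python
import math

def vc_sample_bound(vc_dim, eps, delta):
    """A classical distribution-free sufficient sample size (Blumer et al. style):
    m >= max((4/eps) * log2(2/delta), (8 * vc_dim / eps) * log2(13/eps))."""
    return math.ceil(max((4 / eps) * math.log2(2 / delta),
                         (8 * vc_dim / eps) * math.log2(13 / eps)))

# Example: a classifier of VC dimension 10, 10% error tolerance, 95% confidence
print(vc_sample_bound(vc_dim=10, eps=0.1, delta=0.05))  # prints a requirement of several thousand examples
```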


IEEE Transactions on Speech and Audio Processing | 1994

Fully vector-quantized neural network-based code-excited nonlinear predictive speech coding

Lizhong Wu; Mahesan Niranjan; Frank Fallside

Recent studies have shown that nonlinear predictors can achieve about 2-3 dB improvement in speech prediction over conventional linear predictors. In this paper, we exploit the nonlinear prediction capability of neural networks and apply it to the design of improved predictive speech coders. Our studies concentrate on the following three aspects: (a) the development of short-term (formant) and long-term (pitch) nonlinear predictive vector quantizers; (b) the analysis of the output variance of the nonlinear predictive filter with respect to the input disturbance; and (c) the design of nonlinear predictive speech coders. These studies have resulted in a fully vector-quantized, code-excited, nonlinear predictive speech coder. Performance evaluations and comparisons with linear predictive speech coding are presented. These tests have shown the applicability of nonlinear prediction in speech coding and the improvement in coding performance.
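
The prediction-gain comparison can be mimicked on synthetic data. The sketch below uses a random tanh-feature expansion with a linear readout as a stand-in for the neural network predictors, and a toy nonlinear autoregressive signal in place of speech; it only illustrates how a nonlinear predictor can reduce residual variance relative to a linear one.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy "speech" signal: a nonlinear autoregressive process
N, p = 4000, 8
s = np.zeros(N)
for n in range(2, N):
    s[n] = 0.6 * s[n - 1] - 0.3 * np.sin(2.5 * s[n - 2]) + 0.05 * rng.normal()

# Lagged input matrix X and targets y for one-step prediction
X = np.stack([s[i:N - p + i] for i in range(p)], axis=1)
y = s[p:]

def ridge(A, y, lam=1e-3):
    # Regularised least-squares readout
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

# Linear predictor (conventional short-term linear prediction analogue)
w_lin = ridge(X, y)
err_lin = y - X @ w_lin

# Nonlinear predictor: random tanh features + linear readout (neural-net stand-in)
W = rng.normal(size=(p, 64)); b = rng.normal(size=64)
H = np.tanh(X @ W + b)
w_nl = ridge(H, y)
err_nl = y - H @ w_nl

gain_db = 10 * np.log10(np.var(err_lin) / np.var(err_nl))
print(f"nonlinear prediction gain over linear: {gain_db:.2f} dB")
```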


Neural Computation | 2000

Hierarchical Bayesian Models for Regularization in Sequential Learning

J.F.G. de Freitas; Mahesan Niranjan

We show that a hierarchical Bayesian modeling approach allows us to perform regularization in sequential learning. We identify three inference levels within this hierarchy: model selection, parameter estimation, and noise estimation. In environments where data arrive sequentially, techniques such as cross validation to achieve regularization or model selection are not possible. The Bayesian approach, with extended Kalman filtering at the parameter estimation level, allows for regularization within a minimum variance framework. A multilayer perceptron is used to generate the extended Kalman filter nonlinear measurement mapping. We describe several algorithms at the noise estimation level that allow us to implement on-line regularization. We also show the theoretical links between adaptive noise estimation in extended Kalman filtering, multiple adaptive learning rates, and multiple smoothing regularization coefficients.
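
A scalar sketch of the noise-estimation level: with a linear measurement the EKF reduces to an ordinary Kalman filter, and tracking the innovation variance to update the measurement-noise estimate on-line acts like an adaptive learning rate. The drift model, forgetting factor, and initial values below are assumptions chosen for illustration, not the algorithms in the paper.

```python
import numpy as np

rng = np.random.default_rng(6)

# Scalar sequential regression y_t = w_t * x_t + noise, with the true weight drifting
T = 400
true_w = np.cumsum(0.01 * rng.normal(size=T)) + 1.0

w_hat, P = 0.0, 1.0        # parameter estimate and its variance (parameter level)
Q, R = 1e-4, 1.0           # process noise; R is re-estimated on-line (noise level)
alpha = 0.05               # forgetting factor for the innovation-based noise estimate

for t in range(T):
    x = rng.uniform(0.5, 1.5)
    y = true_w[t] * x + rng.normal(0, 0.3)
    # Kalman (degenerate EKF) measurement update for a linear measurement
    P += Q
    innov = y - w_hat * x
    S = x * P * x + R
    K = P * x / S                       # gain shrinks as the noise estimate R grows
    w_hat += K * innov
    P -= K * x * P
    # Noise-estimation level: crude on-line tracking of the innovation variance
    R = (1 - alpha) * R + alpha * innov ** 2

print(f"final estimate {w_hat:.3f} vs true {true_w[-1]:.3f}, estimated R {R:.3f}")
```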

Collaboration


Dive into Mahesan Niranjan's collaborations.

Top Co-Authors


David Lovell

Commonwealth Scientific and Industrial Research Organisation
