Snigdhansu Chatterjee
University of Minnesota
Publication
Featured research published by Snigdhansu Chatterjee.
Bioinformatics | 2007
Nitai D. Mukhopadhyay; Snigdhansu Chatterjee
MOTIVATION: Interaction among time series can be explored in many ways. All of these approaches share the usual problems of low power and high-dimensional models. Here we attempt to build a causality network among a set of time series. Causal links are established using Granger causality, and the pathway is then constructed by finding the minimal spanning tree within each connected component of the inferred network. A false discovery rate measure is used to identify the most significant causalities. RESULTS: Simulations show good convergence and accuracy of the algorithm. Robustness of the procedure is demonstrated by applying the algorithm in a non-stationary time series setup. Application of the algorithm to a real dataset identified many causalities, some of which overlap with previously known ones. The assembled gene network exhibits features that are commonly attributed to naturally occurring networks.
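The pipeline described in this abstract (pairwise tests, a false discovery rate filter, then a spanning tree per connected component) can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not state: the Benjamini-Hochberg rule stands in for the false discovery rate step, edges are weighted by their p-values, edge direction is ignored when building the trees, and the p-values themselves are made-up numbers rather than the output of a real Granger test.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (cause, effect, p_value); hypothetical layout


def benjamini_hochberg(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Flag each p-value as significant using the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank  # largest rank that passes its threshold
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        keep[idx] = rank <= cutoff_rank
    return keep


def minimum_spanning_forest(edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm: one minimum spanning tree per connected component."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    forest: List[Edge] = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two separate trees, so keep it
            parent[root_u] = root_v
            forest.append((u, v, w))
    return forest


# Made-up pairwise test results, for illustration only.
candidate_edges: List[Edge] = [
    ("g1", "g2", 0.001),
    ("g2", "g3", 0.004),
    ("g1", "g3", 0.010),
    ("g4", "g5", 0.002),
    ("g3", "g4", 0.600),
]
flags = benjamini_hochberg([p for _, _, p in candidate_edges], q=0.05)
significant = [edge for edge, ok in zip(candidate_edges, flags) if ok]
forest = minimum_spanning_forest(significant)
```

In this toy input the weak g3-g4 link is filtered out, which leaves two connected components, and the redundant g1-g3 edge is dropped when the tree for the first component is built.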
Annals of Statistics | 2008
Snigdhansu Chatterjee; Partha Lahiri; Huilin Li
The empirical best linear unbiased prediction (EBLUP) method uses a linear mixed model to combine information from different sources. This method is particularly useful in small area problems. The variability of an EBLUP is traditionally measured by the mean squared prediction error (MSPE), and interval estimates are generally constructed using estimates of the MSPE. Such methods have shortcomings such as under-coverage or over-coverage, excessive length and lack of interpretability. We propose a parametric bootstrap approach to estimate the entire distribution of a suitably centered and scaled EBLUP. The bootstrap histogram is highly accurate, and differs from the true EBLUP distribution by only O(d^3 n^{-3/2}), where d is the number of parameters and n the number of observations. This result is used to obtain highly accurate prediction intervals. Simulation results demonstrate the superiority of this method over existing techniques of constructing prediction intervals in linear mixed models.
Annals of Statistics | 2005
Snigdhansu Chatterjee; Arup Bose
We introduce a generalized bootstrap technique for estimators obtained by solving estimating equations. Some special cases of this generalized bootstrap are the classical bootstrap of Efron, the delete-d jackknife and variations of the Bayesian bootstrap. The use of the proposed technique is discussed in some examples. Distributional consistency of the method is established and an asymptotic representation of the resampling variance estimator is obtained.
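To make the idea of a weight-based bootstrap for estimating equations concrete, here is a minimal sketch for the simplest case, the mean, which solves sum_i w_i (x_i - theta) = 0. The function names, the simulated data and the choice of 2000 replicates are illustrative assumptions and are not taken from the paper. Multinomial weights reproduce Efron's classical bootstrap, and i.i.d. exponential weights give a version of the Bayesian bootstrap.

```python
import random
from typing import Callable, List


def weighted_mean(x: List[float], w: List[float]) -> float:
    """Solve the weighted estimating equation sum_i w_i * (x_i - theta) = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def efron_weights(n: int, rng: random.Random) -> List[float]:
    """Multinomial(n; 1/n, ..., 1/n) counts, i.e. the classical bootstrap."""
    counts = [0.0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1.0
    return counts


def bayesian_weights(n: int, rng: random.Random) -> List[float]:
    """I.i.d. standard exponential weights; normalised, they are Dirichlet(1, ..., 1)."""
    return [rng.expovariate(1.0) for _ in range(n)]


def bootstrap_variance(
    x: List[float],
    draw_weights: Callable[[int, random.Random], List[float]],
    reps: int = 2000,
    seed: int = 0,
) -> float:
    """Resampling variance estimate of the weighted-mean estimator."""
    rng = random.Random(seed)
    estimates = [weighted_mean(x, draw_weights(len(x), rng)) for _ in range(reps)]
    centre = sum(estimates) / reps
    return sum((t - centre) ** 2 for t in estimates) / (reps - 1)


# Simulated data, for illustration only.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(50)]
var_efron = bootstrap_variance(data, efron_weights)
var_bayes = bootstrap_variance(data, bayesian_weights)
```

Both variance estimates should land close to the usual s^2 / n for the sample mean. Only the weight-drawing function changes between schemes, which is the point of treating them as special cases of one procedure.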
The Annals of Applied Statistics | 2009
Snigdhansu Chatterjee; Peihua Qiu
This paper deals with phase II, univariate, statistical process control when a set of in-control data is available, and when both the in-control and out-of-control distributions of the process are unknown. Existing process control techniques typically require substantial knowledge about the in-control and out-of-control distributions of the process, which is often difficult to obtain in practice. We propose (a) using a sequence of control limits for the cumulative sum (CUSUM) control charts, where the control limits are determined by the conditional distribution of the CUSUM statistic given the last time it was zero, and (b) estimating the control limits by bootstrap. Traditionally, the CUSUM control chart uses a single control limit, which is obtained under the assumption that the in-control and out-of-control distributions of the process are Normal. When the normality assumption is not valid, which is often true in applications, the actual in-control average run length, defined to be the expected time duration before the control chart signals a process change, is quite different from the nominal in-control average run length. This limitation is mostly eliminated in the proposed procedure, which is distribution-free and robust against different choices of the in-control and out-of-control distributions.
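The CUSUM recursion and the idea of a control limit that depends on the time since the statistic was last zero can be sketched as follows. This is only an illustration under stated assumptions: the allowance `k`, the observations and the `limits` table are hypothetical values chosen by hand, whereas the paper estimates its limits by bootstrap from in-control data.

```python
from typing import List, Optional, Tuple


def cusum_path(x: List[float], k: float) -> List[Tuple[float, int]]:
    """Upper CUSUM C_n = max(0, C_{n-1} + x_n - k), paired with T_n,
    the number of steps since C was last zero (T_n = 0 when C_n = 0)."""
    c, t = 0.0, 0
    path = []
    for xn in x:
        c = max(0.0, c + xn - k)
        t = t + 1 if c > 0.0 else 0
        path.append((c, t))
    return path


def first_signal(path: List[Tuple[float, int]], limits: List[float]) -> Optional[int]:
    """Return the first time n at which C_n exceeds the limit for its T_n.
    limits[j] applies when T_n == j; the last entry is reused for larger T_n."""
    for n, (c, t) in enumerate(path, start=1):
        if t > 0 and c > limits[min(t, len(limits) - 1)]:
            return n
    return None


# Hypothetical observations and hand-picked limits, for illustration only.
observations = [0.25, -0.5, 1.0, 1.5, 0.25, 2.0]
path = cusum_path(observations, k=0.5)
limits = [0.0, 1.0, 2.0, 2.5, 2.5]  # limits[0] is never used because T_n = 0 means C_n = 0
signal_time = first_signal(path, limits)
```

A conventional chart would replace the `limits` table with a single constant; indexing the limit by T_n is what allows a sequence of control limits.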
Journal of Statistical Planning and Inference | 2003
Arup Bose; Snigdhansu Chatterjee
We introduce a generalized bootstrap technique for estimators obtained by minimizing functions that are convex in the parameter. We establish the consistency of these schemes via representation theorems. A number of classical resampling schemes, like the delete-d jackknife, may be treated as special cases of this generalized bootstrap, and new ways of resampling are also introduced. Some of the schemes are computationally more efficient than classical techniques.
Statistics & Probability Letters | 2001
Arup Bose; Snigdhansu Chatterjee
For estimators of parameters defined as minimisers of Q(θ) = E f(θ, X), we study the asymptotic and generalised bootstrap properties. We concentrate on the case where Q does not have adequate smoothness for standard analysis to work. We describe the properties required by Q as well as bootstrap weights for consistency of the bootstrap.
Knowledge Discovery and Data Mining | 2012
Jaya Kawale; Snigdhansu Chatterjee; Dominick Ormsby; Karsten Steinhaeuser; Stefan Liess; Vipin Kumar
Dipoles represent long-distance connections between the pressure anomalies of two distant regions that are negatively correlated with each other. Such dipoles have proven important for understanding and explaining the variability in climate in many regions of the world, e.g., the El Nino climate phenomenon is known to be responsible for precipitation and temperature anomalies over large parts of the world. Systematic approaches for dipole detection generate a large number of candidate dipoles, but there exists no method to evaluate the significance of the candidate teleconnections. In this paper, we present a novel method for testing the statistical significance of the class of spatio-temporal teleconnection patterns called dipoles. One of the most important challenges in significance testing in a spatio-temporal context is how to handle the spatial and temporal dependencies that show up as high autocorrelation. We present a novel approach that uses the wild bootstrap to capture the spatio-temporal dependencies, in the special use case of teleconnections in climate data. Our approach to assessing statistical significance takes into account the autocorrelation, the seasonality and the trend in the time series over a period of time. This framework is applicable to other problems in spatio-temporal data mining to assess the significance of the patterns.
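As a rough illustration of the wild bootstrap mentioned in this abstract, the sketch below removes a linear trend from two series and flips the sign of both residuals at each time step with one shared random multiplier. Everything specific here is an assumption for illustration: the linear detrending (seasonality is ignored), the Rademacher multipliers, the shared-multiplier design and the simulated series are not taken from the paper, whose actual procedure may differ.

```python
import random
from typing import List, Tuple


def linear_detrend(y: List[float]) -> Tuple[List[float], List[float]]:
    """Least-squares line against the time index; returns (fitted, residuals)."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    slope = sum((t - t_bar) * (yt - y_bar) for t, yt in enumerate(y)) / sxx
    fitted = [y_bar + slope * (t - t_bar) for t in range(n)]
    residuals = [yt - ft for yt, ft in zip(y, fitted)]
    return fitted, residuals


def wild_bootstrap_pair(
    a: List[float], b: List[float], rng: random.Random
) -> Tuple[List[float], List[float]]:
    """One wild bootstrap replicate of two equal-length series.
    The same +/-1 multiplier is applied to both residuals at each time step,
    so the product of the two residuals at that step is left unchanged."""
    fit_a, res_a = linear_detrend(a)
    fit_b, res_b = linear_detrend(b)
    signs = [rng.choice((-1.0, 1.0)) for _ in a]
    a_star = [f + r * s for f, r, s in zip(fit_a, res_a, signs)]
    b_star = [f + r * s for f, r, s in zip(fit_b, res_b, signs)]
    return a_star, b_star


# Simulated series, for illustration only.
rng = random.Random(42)
series_a = [0.1 * t + rng.gauss(0.0, 1.0) for t in range(24)]
series_b = [-0.05 * t + rng.gauss(0.0, 1.0) for t in range(24)]
a_star, b_star = wild_bootstrap_pair(series_a, series_b, rng)
```

Because each residual keeps its magnitude and only its sign is randomised, the replicate retains the time-varying scale of the original residuals, and the shared multiplier keeps the two series linked at each time step.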
PLOS ONE | 2017
Alireza Ermagun; Snigdhansu Chatterjee; David Matthew Levinson
This empirical study sheds light on the spatial correlation of traffic links under different traffic regimes. We mimic the behavior of real traffic by pinpointing the spatial correlation between 140 freeway traffic links in a major sub-network of the Minneapolis—St. Paul freeway system with a grid-like network topology. This topology enables us to juxtapose the positive and negative correlation between links, which has been overlooked in short-term traffic forecasting models. To accurately and reliably measure the correlation between traffic links, we develop an algorithm that eliminates temporal trends in three dimensions: (1) hourly dimension, (2) weekly dimension, and (3) system dimension for each link. The spatial correlation of traffic links exhibits a stronger negative correlation in rush hours, when congestion affects route choice. Although this correlation occurs mostly in parallel links, it is also observed upstream, where travelers receive information and are able to switch to substitute paths. Irrespective of the time-of-day and day-of-week, a strong positive correlation is witnessed between upstream and downstream links. This correlation is stronger in uncongested regimes, as traffic flow passes through consecutive links more quickly and there is no congestion effect to shift or stall traffic. The extracted spatial correlation structure can augment the accuracy of short-term traffic forecasting models.
International Conference on Data Mining | 2013
James H. Faghmous; Matthew Le; Muhammed Uluyol; Vipin Kumar; Snigdhansu Chatterjee
As spatio-temporal data have become ubiquitous, an increasing challenge facing computer scientists is that of identifying discrete patterns in continuous spatio-temporal fields. In this paper, we introduce a parameter-free pattern mining application that is able to identify dynamic anomalies in ocean data, known as ocean eddies. Despite ocean eddy monitoring being an active field of research, we provide one of the first quantitative analyses of the performance of the most used monitoring algorithms. We present an incomplete-information validation technique that uses the performance of two methods to construct an imperfect ground truth to test the significance of patterns discovered as well as the relative performance of pattern mining algorithms. These methods, in addition to the validation schemes discussed, provide researchers new directions in analyzing large unlabeled climate datasets.
Journal of the Australian Mathematical Society | 2001
Arup Bose; Snigdhansu Chatterjee
We study the last passage time and its asymptotic distribution for minimum contrast estimators defined through the minimization of a convex criterion function based on U-functionals. This includes cases of non-smooth estimators for vector-valued parameters. We also derive a Bahadur-type representation and the law of the iterated logarithm for such estimators.