Pralay Senchaudhuri
Cytel
Publications
Featured research published by Pralay Senchaudhuri.
Journal of the American Statistical Association | 2000
Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri
Abstract Exact inference for the logistic regression model is based on generating the permutation distribution of the sufficient statistics for the regression parameters of interest conditional on the sufficient statistics for the remaining (nuisance) parameters. Despite the availability of fast numerical algorithms for the exact computations, there are numerous instances where a data set is too large to be analyzed by the exact methods, yet too sparse or unbalanced for the maximum likelihood approach to be reliable. What is needed is a Monte Carlo alternative to the exact conditional approach which can bridge the gap between the exact and asymptotic methods of inference. The problem is technically hard because conventional Monte Carlo methods lead to massive rejection of samples that do not satisfy the linear integer constraints of the conditional distribution. We propose a network sampling approach to the Monte Carlo problem that eliminates rejection entirely. Its advantages over alternative saddlepoint and Markov Chain Monte Carlo approaches are also discussed.
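To make the conditioning idea concrete, here is a minimal sketch (toy code, not the paper's network sampling algorithm) for the simplest case: one covariate of interest and only an intercept as nuisance. Conditioning on the sufficient statistic for the intercept, the number of successes, makes every outcome vector with that many successes equally likely under the null, so the exact conditional p-value is a count over the reference set; each additional nuisance covariate adds a linear integer constraint, which is what defeats naive rejection sampling.

```python
# Minimal sketch (toy code, not Cytel's network sampling method) of exact
# conditional inference for logistic regression with one covariate of interest
# and only an intercept as the nuisance parameter.
from itertools import combinations
from math import comb

def exact_conditional_pvalue(x, y):
    """One-sided exact conditional p-value for the slope, by full enumeration.
    Feasible only for small n; with more nuisance covariates the reference set
    picks up further linear integer constraints."""
    n, m = len(y), sum(y)                                  # condition on m successes
    t_obs = sum(xi for xi, yi in zip(x, y) if yi == 1)     # sufficient stat for slope
    hits = sum(sum(x[i] for i in ones) >= t_obs
               for ones in combinations(range(n), m))
    return hits / comb(n, m)

# Hypothetical small data set: covariate values and binary responses.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]
print(exact_conditional_pvalue(x, y))
```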
Journal of the American Statistical Association | 1988
Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri
Abstract This article discusses importance sampling as an alternative to conventional Monte Carlo sampling for estimating exact significance levels in a broad class of two-sample tests, including all of the linear rank tests (with or without censoring), homogeneity tests based on the chi-squared, hypergeometric, and likelihood ratio statistics, the Mantel-Haenszel trend test, and Zelen's test for a common odds ratio in several 2 × 2 contingency tables. Inference is based on randomly selecting 2 × k contingency tables from a reference set of all such tables with fixed marginals. Through a network algorithm, the tables are selected in proportion to their importance for reducing the variance of the estimated Monte Carlo p-value. Spectacular gains, up to four orders of magnitude, are achieved relative to conventional Monte Carlo sampling. The technique is illustrated on four real data sets.
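The variance-reduction principle can be seen in a self-contained toy example, unrelated to the paper's network algorithm for 2 × k tables: estimating a small binomial tail probability by naive Monte Carlo versus sampling from a tilted distribution and reweighting by the likelihood ratio.

```python
# Toy illustration of importance sampling for a small tail probability:
# estimate p = P(X >= t) for X ~ Binomial(n, 0.5), once by naive sampling
# and once by drawing from a tilted Binomial(n, q) and reweighting.
import math, random

n, t, q, reps = 50, 40, 0.8, 20_000
exact = sum(math.comb(n, k) * 0.5**n for k in range(t, n + 1))

naive = sum(sum(random.random() < 0.5 for _ in range(n)) >= t
            for _ in range(reps)) / reps

def pmf(k, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

weights = []
for _ in range(reps):
    x = sum(random.random() < q for _ in range(n))        # draw from the tilted law
    weights.append(pmf(x, 0.5) / pmf(x, q) if x >= t else 0.0)
importance = sum(weights) / reps

print(f"exact={exact:.3e}  naive={naive:.3e}  importance={importance:.3e}")
```

With the same number of draws, the naive estimate is usually zero while the reweighted estimate sits close to the exact value, which is the gap the paper's importance-sampling scheme exploits on a much harder sample space.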
Journal of Computational and Graphical Statistics | 1992
Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri
Abstract We present an efficient algorithm for generating exact permutational distributions for linear rank statistics defined on stratified 2 × c contingency tables. The algorithm can compute exact p values and confidence intervals for a rich class of nonparametric problems. These include exact p values for stratified two-population Wilcoxon, Logrank, and Van der Waerden tests, exact p values for stratified tests of trend across several binomial populations, exact p values for stratified permutation tests with arbitrary scores, and exact confidence intervals for odds ratios embedded in stratified 2 × c tables. The algorithm uses network-based recursions to generate stratum-specific distributions and then combines them into an overall permutation distribution by convolution. Where only the tail area of a permutation distribution is desired, additional efficiency gains are achieved by backward induction and branch-and-bound processing of the network. The algorithm is especially efficient for highly imbalanced ...
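The convolution step can be sketched in a few lines (hypothetical scores and strata, not the paper's network recursions): build each stratum's exact permutation distribution, then combine the strata by convolving the distributions.

```python
# Sketch of "convolve the strata": exact permutation distribution of a linear
# rank statistic within each stratum, combined across strata by convolution.
from itertools import combinations
from collections import Counter
from fractions import Fraction
from math import comb

def stratum_distribution(scores, m):
    """Distribution of the score sum for the m subjects assigned to treatment,
    with all C(n, m) assignments equally likely under the null."""
    n = len(scores)
    counts = Counter(sum(scores[i] for i in idx) for idx in combinations(range(n), m))
    return {t: Fraction(c, comb(n, m)) for t, c in counts.items()}

def convolve(d1, d2):
    out = {}
    for t1, p1 in d1.items():
        for t2, p2 in d2.items():
            out[t1 + t2] = out.get(t1 + t2, Fraction(0)) + p1 * p2
    return out

# Two hypothetical strata with integer scores and treatment-group sizes of 2.
overall = convolve(stratum_distribution([1, 2, 3, 4, 5], 2),
                   stratum_distribution([1, 2, 3, 4], 2))
p_value = sum(p for t, p in overall.items() if t >= 7)    # upper-tail exact p-value
print(float(p_value))
```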
Journal of the American Statistical Association | 1995
Pralay Senchaudhuri; Cyrus R. Mehta; Nitin R. Patel
Abstract Despite algorithmic advances in exact nonparametric inference, problems often occur that are too large for exact p value computations but too sparse for reliable asymptotic results. In these situations Monte Carlo methods are a good compromise. They bound the true p value within a confidence interval. But a factor discouraging the use of Monte Carlo p values is their sensitivity to the random number sequence. One can overcome this drawback by computing a 99% confidence interval of width less than .0001. Then the estimated p values become insensitive to the random number sequence up to three decimals. For all practical purposes, these estimates are invariant to random number sequences. The usual Monte Carlo method requires millions of samples to yield such an invariant estimate. The Monte Carlo scheme presented here decreases the sample size by two to three orders of magnitude. We illustrate this method with tests for r × c tables, two-sample survival data, and stratified 2 × 2 tables.
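The sample-size claim is easy to verify with back-of-the-envelope arithmetic: the width of a 99% confidence interval for a Monte Carlo p-value is roughly 2 z sqrt(p(1 - p)/n), so solving for n at a target width of .0001 shows why conventional sampling needs millions of draws.

```python
# Back-of-the-envelope check (my own arithmetic, not from the paper) of how
# many conventional Monte Carlo samples it takes to pin a p-value to three
# decimals, i.e. a 99% confidence interval of width below .0001.
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf(0.995)          # 99% two-sided critical value, about 2.576
width = 1e-4

def samples_needed(p):
    # CI width is roughly 2 * z * sqrt(p * (1 - p) / n); solve for n.
    return ceil((2 * z / width) ** 2 * p * (1 - p))

for p in (0.001, 0.01, 0.05):
    print(f"true p = {p:5.3f}  ->  about {samples_needed(p):,} samples")
# Even for a p-value near .001 this is on the order of millions of draws.
```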
Biometrics | 1998
Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri
The Cochran-Armitage test of trend is commonly used to determine if a dose-response relationship exists in a wide variety of biomedical settings including clinical trials, carcinogenicity studies, and toxicological risk assessment. For small, sparse, or unbalanced data sets, one generally adopts the exact version of the Cochran-Armitage test of trend for which numerical algorithms and software are readily available. No corresponding algorithms or software exist for the exact power and sample-size computations that are needed at design time, prior to gathering the data. This paper develops a network algorithm for computing the exact power of the Cochran-Armitage test of trend and applies it to several examples, thereby demonstrating that the corresponding asymptotic power computations can be rather misleading.
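For a small design the exact power can be computed by brute-force enumeration, which makes the definition concrete even though it does not scale the way the network algorithm does. The sketch below (hypothetical group sizes, dose scores, and response probabilities) enumerates every outcome vector, applies the asymptotic Cochran-Armitage test, and sums the binomial probabilities of the rejecting outcomes; evaluating it under the null gives the attained size as well.

```python
# Brute-force sketch of exact power for the Cochran-Armitage trend test
# (not the paper's network algorithm): enumerate all outcome vectors and add
# up the binomial probabilities of those in which the test rejects.
from itertools import product
from math import comb, sqrt
from statistics import NormalDist

def ca_z(scores, n, r):
    """One-sided Cochran-Armitage trend statistic (asymptotically N(0,1))."""
    N, R = sum(n), sum(r)
    if R == 0 or R == N:
        return 0.0                                   # degenerate table, never rejects
    pbar = R / N
    t = sum(x * (ri - ni * pbar) for x, ni, ri in zip(scores, n, r))
    var = pbar * (1 - pbar) * (sum(ni * x * x for x, ni in zip(scores, n))
                               - sum(ni * x for x, ni in zip(scores, n)) ** 2 / N)
    return t / sqrt(var)

def exact_power(scores, n, probs, alpha=0.05):
    zcrit = NormalDist().inv_cdf(1 - alpha)
    power = 0.0
    for r in product(*(range(ni + 1) for ni in n)):  # every possible outcome vector
        prob = 1.0
        for ri, ni, pi in zip(r, n, probs):
            prob *= comb(ni, ri) * pi**ri * (1 - pi)**(ni - ri)
        if ca_z(scores, n, r) >= zcrit:
            power += prob
    return power

scores, n = [0, 1, 2], [10, 10, 10]
print("power under a trend:", exact_power(scores, n, [0.1, 0.3, 0.5]))
print("size under the null:", exact_power(scores, n, [0.1, 0.1, 0.1]))
```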
Statistics in Medicine | 2000
Chris Corcoran; Cyrus R. Mehta; Pralay Senchaudhuri
The Cochran-Armitage test for trend is a popular statistical procedure for detecting increasing or decreasing probabilities of response when a categorical exposure is ordered. Such associations may arise in a variety of biomedical research settings, particularly in dose-response designs such as carcinogenicity experiments. Previously, computing limitations mandated the use of the asymptotic trend test, but with the availability of new algorithms, increased computing power, and appropriate software the exact trend test is now a practical option. Nevertheless, the exact test is sometimes criticized on the grounds that it is conservative. In this paper we investigate the implications of this conservatism by comparing the true type I error and power of three alternative tests of trend: the asymptotic test, the exact test, and an admissible exact test proposed by Cohen and Sackrowitz. The computations are performed by an extension to the network algorithm of Mehta et al. This allows us to make precise power comparisons between the tests under any given design without resorting to simulation. We show how this tool can guide investigators in choosing the most appropriate test by considering the design of two-year carcinogenicity studies carried out by the National Toxicology Program. We additionally compare the tests for various other combinations of sample sizes and number of groups or levels of exposure. We conclude that the asymptotic test, while more powerful where it is valid, generally does not preserve the type I error. This violation of the a priori testing level can be greatly affected by imbalance in the data or unequal spacing of dose levels.
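For reference, the exact conditional trend test being compared can be written down directly for small data: condition on the total number of responders, so the reference set of tables follows a multivariate hypergeometric law, and sum the null probabilities of tables whose trend statistic is at least as extreme as the observed one. The sketch below uses hypothetical data and brute-force enumeration rather than the extended network algorithm.

```python
# Minimal sketch of the exact (conditional) trend test: multivariate
# hypergeometric weights over all tables with the observed total responders.
from itertools import product
from math import comb

def exact_trend_pvalue(scores, n, r_obs):
    R = sum(r_obs)
    t_obs = sum(x * r for x, r in zip(scores, r_obs))     # linear trend statistic
    num = denom = 0
    for r in product(*(range(min(ni, R) + 1) for ni in n)):
        if sum(r) != R:
            continue                                       # keep total responders fixed
        weight = 1
        for ri, ni in zip(r, n):
            weight *= comb(ni, ri)
        denom += weight
        if sum(x * ri for x, ri in zip(scores, r)) >= t_obs:
            num += weight
    return num / denom

# Doses 0/1/2, ten animals per group, with 1, 3, and 6 responders respectively.
print(exact_trend_pvalue([0, 1, 2], [10, 10, 10], [1, 3, 6]))
```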
Computational Statistics & Data Analysis | 2007
Thomas J. Santner; Vivek Pradhan; Pralay Senchaudhuri; Cyrus R. Mehta; Ajit C. Tamhane
This paper compares the exact small-sample achieved coverage and expected lengths of five methods for computing the confidence interval of the difference of two independent binomial proportions. We strongly recommend that one of these be used in practice. The first method we compare is an asymptotic method based on the score statistic (AS) as proposed by Miettinen and Nurminen [1985. Comparative analysis of two rates. Statist. Med. 4, 213-226.]. Newcombe [1998. Interval estimation for the difference between independent proportions: comparison of seven methods. Statist. Med. 17, 873-890.] has shown that under a certain asymptotic set-up, confidence intervals formed from the score statistic perform better than those formed from the Wald statistic (see also [Farrington, C.P., Manning, G., 1990. Test statistics and sample size formulae for comparative binomial trials with null hypothesis of non-zero risk difference or non-unity relative risk. Statist. Med. 9, 1447-1454.]). The remaining four methods compared are the exact methods of Agresti and Min (AM), Chan and Zhang (CZ), Coe and Tamhane (CT), and Santner and Yamagami (SY). We find that the CT has the best small-sample performance, followed by AM and CZ. Although AS is claimed to perform reasonably well, it performs the worst in this study; about 50% of the time it fails to achieve nominal coverage even with moderately large sample sizes from each binomial treatment.
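The evaluation idea, computing exact achieved coverage by enumerating every pair of binomial outcomes, can be illustrated with a short sketch. For brevity the interval below is the simple Wald interval standing in for the five methods actually compared in the paper; only the interval function would change.

```python
# Exact achieved coverage of a confidence interval for p1 - p2, computed by
# enumerating all binomial outcomes. The Wald interval here is a stand-in for
# the AS, AM, CZ, CT, and SY methods compared in the paper.
from math import comb, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)            # nominal 95% two-sided critical value

def wald_interval(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se

def exact_coverage(p1, p2, n1, n2):
    cover = 0.0
    for x1 in range(n1 + 1):
        w1 = comb(n1, x1) * p1**x1 * (1 - p1)**(n1 - x1)
        for x2 in range(n2 + 1):
            w2 = comb(n2, x2) * p2**x2 * (1 - p2)**(n2 - x2)
            lo, hi = wald_interval(x1, n1, x2, n2)
            if lo <= p1 - p2 <= hi:
                cover += w1 * w2
    return cover

# Achieved coverage of the nominal 95% Wald interval at one hypothetical point.
print(exact_coverage(0.1, 0.2, 20, 20))
```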
Biometrics | 1994
Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri; Anastasios A. Tsiatis
An efficient numerical algorithm is developed for computing stopping boundaries for group sequential clinical trials. Patients arrive in sequence, and are randomized to one of two treatments. The data are monitored at interim time points, with a fresh block of patients entering the study from one monitoring point to the next. The stopping boundaries are derived from the exact joint permutational distribution of the linear rank statistics observed across all the monitoring times. Specifically, the algorithm yields the exact boundary generating function, Pr(W_1 < b_1, W_2 < b_2, ..., W_{i-1} < b_{i-1}, W_i = w_i), where W_j is the linear rank statistic at the jth interim time point. The distribution theory is based on assigning ranks after pooling all the patients who have entered the study, and then permuting the patients to the two treatments independently within each block of newly arrived patients. The methods are applicable for an arbitrary number of monitoring times, which need not be specified at the start of the study. The data may be continuous or categorical, and censored or uncensored. The randomization rule for treatment allocation can be adaptive. The algorithm is especially useful during the early stages of a clinical trial, when very little data have been gathered, and stopping boundaries are based on the extreme tails of the relevant boundary generating function. In that case the corresponding large-sample theory is not very reliable. To illustrate the techniques we present a group sequential analysis of a recently completed study by the Eastern Cooperative Oncology Group.
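The block permutation scheme can be mimicked with a toy Monte Carlo sketch (hypothetical responses, block sizes, and boundaries; the paper computes these joint boundary probabilities exactly rather than by simulation): treatment labels are permuted independently within each arrival block, ranks are recomputed over everyone who has entered so far at each look, and W_j is the treatment-arm rank sum at the jth look.

```python
# Toy Monte Carlo illustration of the block permutation scheme behind the
# exact boundary generating function (not the paper's exact algorithm).
import random

blocks = [[3.1, 0.7, 2.2, 5.0], [1.4, 4.8, 2.9, 0.3]]    # responses by arrival block
per_block_treated = 2                                     # 1:1 allocation in each block

def one_permutation():
    """One draw of (W_1, W_2, ...): permute labels within each block, re-rank
    over all patients entered so far, and record the treatment rank sum."""
    treated, entered, stats = [], [], []
    for blk in blocks:
        ids = list(range(len(blk)))
        random.shuffle(ids)                               # permute within the block
        treated.extend(blk[i] for i in ids[:per_block_treated])
        entered.extend(blk)
        rank = {v: i + 1 for i, v in enumerate(sorted(entered))}
        stats.append(sum(rank[v] for v in treated))       # W_j at the jth look
    return stats

draws = [one_permutation() for _ in range(50_000)]
# Estimate a joint boundary probability of the kind the exact algorithm
# computes, e.g. Pr(W_1 < b_1 and W_2 >= b_2) for candidate boundaries.
b1, b2 = 6, 22
print(sum(w1 < b1 and w2 >= b2 for w1, w2 in draws) / len(draws))
```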
Statistical Methods in Medical Research | 2003
N Rabbee; Brent A. Coull; Cyrus R. Mehta; Nitin R. Patel; Pralay Senchaudhuri
We propose a new method for computing power and sample size for linear rank tests of differences between two ordered multinomial populations. The method is flexible in that it is applicable to any general alternative hypothesis and for any choice of rank scores. We show that the method, though asymptotic, closely approximates existing exact methods. At the same time it overcomes the computational limitations of the exact methods. This advantage makes our asymptotic approach more practical for sample size computations at the planning stages of a large study. We illustrate the method with data arising from both proportional and non-proportional odds models in the two ordered multinomial setting.
Biometrics | 2017
Pranab Ghosh; Lingyun Liu; Pralay Senchaudhuri; Ping Gao; Cyrus R. Mehta
Two-arm group sequential designs have been widely used for over 40 years, especially for studies with mortality endpoints. The natural generalization of such designs to trials with multiple treatment arms and a common control (MAMS designs) has, however, been implemented rarely. While the statistical methodology for this extension is clear, the main limitation has been an efficient way to perform the computations. Past efforts were hampered by algorithms that were computationally explosive. With the increasing interest in adaptive designs, platform designs, and other innovative designs that involve multiple comparisons over multiple stages, the importance of MAMS designs is growing rapidly. This article provides breakthrough algorithms that can compute MAMS boundaries rapidly, thereby making such designs practical. For designs with efficacy-only boundaries the computational effort increases linearly with the number of arms and the number of stages. For designs with both efficacy and futility boundaries the computational effort doubles with each successive increase in the number of stages.
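What the boundaries must control can be illustrated with a small simulation under the global null (a hypothetical two-treatment, two-stage design with made-up candidate boundaries; the paper's contribution is computing this familywise error exactly and efficiently rather than by simulation).

```python
# Simulation sketch of the quantity MAMS efficacy boundaries must control:
# the familywise probability, under the global null, that any treatment arm
# crosses an efficacy boundary at any stage when all arms share one control.
import random
import statistics

arms, n_per_stage, stages = 2, 50, 2       # two treatment arms vs a common control
bounds = [2.79, 1.98]                       # hypothetical candidate z-boundaries

def trial_rejects():
    control = []
    treatment = [[] for _ in range(arms)]
    for stage in range(stages):
        control += [random.gauss(0, 1) for _ in range(n_per_stage)]
        for arm in treatment:
            arm += [random.gauss(0, 1) for _ in range(n_per_stage)]   # global null
        for arm in treatment:
            diff = statistics.mean(arm) - statistics.mean(control)
            z = diff / (2 / len(arm)) ** 0.5        # known unit variance, equal n
            if z >= bounds[stage]:
                return True                          # some arm crossed at this look
    return False

sims = 20_000
fwer = sum(trial_rejects() for _ in range(sims)) / sims
print(f"estimated familywise type I error: {fwer:.3f}")
```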