Publication


Featured research published by Pulak Ghosh.


Statistics in Medicine | 2011

A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time‐to‐event

Dimitris Rizopoulos; Pulak Ghosh

Motivated by a real data example on renal graft failure, we propose a new semiparametric multivariate joint model that relates multiple longitudinal outcomes to a time-to-event. To allow for greater flexibility, key components of the model are modelled nonparametrically. In particular, for the subject-specific longitudinal evolutions we use a spline-based approach, the baseline risk function is assumed piecewise constant, and the distribution of the latent terms is modelled using a Dirichlet Process prior formulation. Additionally, we discuss the choice of a suitable parameterization, from a practitioner's point of view, to relate the longitudinal process to the survival outcome. Specifically, we present three main families of parameterizations, discuss their features, and present tools to choose between them.


Bayesian Analysis | 2013

Posterior Consistency of Bayesian Quantile Regression Based on the Misspecified Asymmetric Laplace Density

Karthik Sriram; R. V. Ramamoorthi; Pulak Ghosh

We explore an asymptotic justification for the widely used and empirically verified approach of assuming an asymmetric Laplace distribution (ALD) for the response in Bayesian Quantile Regression. Based on empirical findings, Yu and Moyeed (2001) argued that the use of ALD is satisfactory even if it is not the true underlying distribution. We provide a justification to this claim by establishing posterior consistency and deriving the rate of convergence under the ALD misspecification. Related literature on misspecified models focuses mostly on i.i.d. models which in the regression context amounts to considering i.i.d. random covariates with i.i.d. errors. We study the behavior of the posterior for the misspecified ALD model with independent but non-identically distributed response in the presence of non-random covariates. Exploiting the specific form of ALD helps us derive conditions that are more intuitive and easily seen to be satisfied by a wide range of potential true underlying probability distributions for the response. Through simulations, we demonstrate our result and also find that the robustness of the posterior that holds for ALD fails for a Gaussian formulation, thus providing further support for the use of ALD models in quantile regression.
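
To make the ALD working likelihood in this abstract concrete, the following is a minimal sketch, not the authors' code. It assumes the standard ALD parameterization with location `mu`, scale `sigma` and quantile level `tau`, and an intercept-only model with the scale fixed at 1. All function names are hypothetical. Maximizing the ALD log-likelihood in the location is the same as minimizing the quantile check loss, so the maximizer is a sample quantile.

```python
import math


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_logpdf(y, mu, sigma, tau):
    """Log-density of the asymmetric Laplace distribution (standard form)."""
    return math.log(tau * (1.0 - tau) / sigma) - check_loss((y - mu) / sigma, tau)


def ald_loglik(ys, mu, sigma, tau):
    """ALD log-likelihood of an intercept-only model."""
    return sum(ald_logpdf(y, mu, sigma, tau) for y in ys)


def ald_mle_location(ys, tau, sigma=1.0):
    """Location that maximizes the ALD log-likelihood.

    The log-likelihood is concave and piecewise linear in mu, so a
    maximizer is always attained at one of the observations; searching
    over the data points is therefore enough for this illustration.
    """
    return max(sorted(ys), key=lambda m: ald_loglik(ys, m, sigma, tau))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 100.0]
    for tau in (0.25, 0.5, 0.9):
        print(tau, ald_mle_location(data, tau))
```

The outlier at 100 does not move the fitted median, which is the kind of robustness to the true error distribution that the abstract contrasts with a Gaussian formulation.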


Journal of the American Statistical Association | 2009

Bayesian Analysis of Cancer Rates From SEER Program Using Parametric and Semiparametric Joinpoint Regression Models

Pulak Ghosh; Sanjib Basu; Ram C. Tiwari

Cancer is the second leading cause of death in the United States. Cancer incidence and mortality rates measure the progress against cancer; these rates are obtained from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI). Lung cancer has the highest mortality rate among all cancers, whereas prostate cancer has the highest number of new cases among males. In this article, we analyze the incidence rates of these two cancers, as well as colon and rectal cancer. The NCI reports trends in cancer age-adjusted mortality and incidence rates in its annual report to the nation and analyzes them using the Joinpoint software. The location of the joinpoints signifies changes in cancer trends, whereas changes in the regression slope measure the degree of change. The Joinpoint software uses a numerical search to detect the joinpoints, fits regression within two consecutive joinpoints by least squares, and finally selects the number of joinpoints by either a series of permutation tests or the Bayesian information criterion. We propose Bayesian joinpoint models and provide statistical estimates of the joinpoints and the regression slopes. While the Joinpoint software and other work in this area assumes that the joinpoints occur on the discrete time grid, we allow a continuous prior for the joinpoints induced by the Dirichlet distribution on the spacings in between. This prior further allows the user to impose prespecified minimum gaps in between two consecutive joinpoints. We develop parametric as well as semiparametric Bayesian joinpoint models; the semiparametric framework relaxes parametric distributional assumptions by modeling the distribution of regression slopes and error variances using Dirichlet process mixtures. These Bayesian models provide statistical inference with finite sample validity. 
Through a simulation study, we demonstrate the performance of the proposed parametric and semiparametric joinpoint models and compare the results with the ones from the Joinpoint software. We analyze age-adjusted cancer incidence rates from the SEER Program using these Bayesian models with different numbers of joinpoints by employing the deviance information criterion and the cross-validated predictive criterion. In addition, we model the lung cancer incidence rates and the smoking rates jointly and explore the relation between the two longitudinal processes.
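
The continuous joinpoint prior described in this abstract can be illustrated with a short sketch. This is only one plausible construction under stated assumptions, not the authors' implementation: the spacings between consecutive joinpoints (including the two end segments) are drawn from a symmetric Dirichlet distribution, and a minimum gap is enforced by reserving `min_gap` for every spacing before distributing the remaining length. The function name and all parameters are hypothetical.

```python
import random


def sample_joinpoints(k, t_start, t_end, min_gap=0.0, alpha=1.0, rng=random):
    """Draw k ordered joinpoints in (t_start, t_end) from a Dirichlet prior on spacings.

    Assumption: each of the k + 1 spacings is min_gap plus a Dirichlet(alpha, ..., alpha)
    share of the remaining free length, so every spacing is at least min_gap.
    """
    free = (t_end - t_start) - (k + 1) * min_gap
    if free <= 0:
        raise ValueError("min_gap is too large for the interval and number of joinpoints")
    # A Dirichlet draw is a vector of independent Gamma draws divided by their sum.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(k + 1)]
    total = sum(gammas)
    spacings = [min_gap + free * g / total for g in gammas]
    # Cumulative sums of the first k spacings give the joinpoint locations.
    points = []
    position = t_start
    for spacing in spacings[:-1]:
        position += spacing
        points.append(position)
    return points


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_joinpoints(3, 1975.0, 2005.0, min_gap=2.0, rng=rng))
```

Because the joinpoints are cumulative sums of positive spacings, they are ordered by construction and can fall anywhere on the continuous time axis rather than on a discrete grid.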


Statistics | 2010

Multivariate measurement error models based on scale mixtures of the skew–normal distribution

Victor H. Lachos; F. V. Labra; Heleno Bolfarine; Pulak Ghosh

Scale mixtures of the skew–normal (SMSN) distributions form a class of asymmetric thick–tailed distributions that includes the skew–normal (SN) distribution as a special case. The main advantage of this class is that its members are easy to simulate and have a convenient hierarchical representation, which facilitates implementation of the expectation–maximization algorithm for maximum-likelihood estimation. In this paper, we assume an SMSN distribution for the unobserved value of the covariates and a symmetric scale mixture of normal distributions for the error term of the model. This provides a robust alternative to parameter estimation in multivariate measurement error models. Specific distributions examined include univariate and multivariate versions of the SN, skew–t, skew–slash and skew–contaminated normal distributions. The results and methods are applied to a real data set.
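
The ease of simulation mentioned in this abstract can be illustrated with the well-known stochastic representation of the univariate skew-normal distribution. This is a minimal sketch under stated assumptions, not the authors' code: location 0 and scale 1, a shape parameter `lam`, and hypothetical function names. The skew-t draw uses the usual scale-mixture form with a Gamma(nu/2, rate nu/2) mixing variable.

```python
import math
import random


def sample_skew_normal(lam, rng=random):
    """Draw from the standard skew-normal SN(lam).

    Representation: Z = delta * |U0| + sqrt(1 - delta^2) * U1 with U0, U1
    independent N(0, 1) and delta = lam / sqrt(1 + lam^2).
    """
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0 = abs(rng.gauss(0.0, 1.0))
    u1 = rng.gauss(0.0, 1.0)
    return delta * u0 + math.sqrt(1.0 - delta * delta) * u1


def sample_skew_t(lam, nu, rng=random):
    """Draw from a skew-t as a scale mixture of the skew-normal.

    W ~ Gamma(shape=nu/2, rate=nu/2); random.gammavariate takes a scale
    parameter, hence the second argument 2/nu.
    """
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return sample_skew_normal(lam, rng) / math.sqrt(w)


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_skew_normal(3.0, rng) for _ in range(20000)]
    delta = 3.0 / math.sqrt(10.0)
    # The theoretical mean of SN(lam) is delta * sqrt(2 / pi).
    print(sum(draws) / len(draws), delta * math.sqrt(2.0 / math.pi))
```

The same two-level structure (a latent half-normal variable and, for the heavier-tailed members, a latent mixing variable) is what makes a hierarchical formulation convenient for EM-type algorithms.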


European Journal of Operational Research | 2014

A semiparametric Bayesian approach to the analysis of financial time series with applications to value at risk estimation

M. Concepción Ausín; Pedro Galeano; Pulak Ghosh

GARCH models are commonly used for describing, estimating and predicting the dynamics of financial returns. Here, we relax the usual parametric distributional assumptions of GARCH models and develop a Bayesian semiparametric approach based on modeling the innovations using the class of scale mixtures of Gaussian distributions with a Dirichlet process prior on the mixing distribution. The proposed specification allows for greater flexibility in capturing the usual patterns observed in financial returns. It is also shown how to undertake Bayesian prediction of the Value at Risk (VaR). The performance of the proposed semiparametric method is illustrated using simulated and real data from the Hang Seng Index (HSI) and Bombay Stock Exchange index (BSE30).
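
For orientation, the sketch below shows the fully parametric baseline that this abstract relaxes: a GARCH(1,1) variance recursion with Gaussian innovations and a plug-in one-step-ahead VaR. It is not the semiparametric Bayesian method of the paper. The parameter values, function names, zero conditional mean, and initialization at the unconditional variance are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def garch11_next_variance(returns, omega, alpha, beta):
    """One-step-ahead conditional variance from a GARCH(1,1) recursion.

    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1},
    started at the unconditional variance omega / (1 - alpha - beta).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("parameters must satisfy omega > 0 and alpha + beta < 1")
    sigma2 = omega / (1.0 - alpha - beta)
    for r in returns:
        sigma2 = omega + alpha * r * r + beta * sigma2
    return sigma2


def gaussian_var(sigma2, level=0.05, mean=0.0):
    """Plug-in VaR at the given tail level, reported as a positive loss.

    Assumes Gaussian innovations; the paper replaces this assumption with
    a Dirichlet process scale mixture of Gaussians.
    """
    z = NormalDist().inv_cdf(level)
    return -(mean + math.sqrt(sigma2) * z)


if __name__ == "__main__":
    toy_returns = [0.5, -1.2, 0.3, -0.1, 2.0]  # made-up values for illustration
    s2 = garch11_next_variance(toy_returns, omega=0.1, alpha=0.1, beta=0.8)
    print(s2, gaussian_var(s2, level=0.05))
```

A Bayesian treatment would average the VaR over posterior draws of the parameters and the innovation distribution instead of plugging in fixed values as done here.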


Statistics in Medicine | 2010

Linear mixed models for skew‐normal/independent bivariate responses with an application to periodontal disease

Dipankar Bandyopadhyay; Victor H. Lachos; Carlos A. Abanto-Valle; Pulak Ghosh

Bivariate clustered (correlated) data often encountered in epidemiological and clinical research are routinely analyzed under a linear mixed model (LMM) framework with underlying normality assumptions of the random effects and within-subject errors. However, such normality assumptions might be questionable if the data set exhibits skewness and heavy tails. Using a Bayesian paradigm, we use the skew-normal/independent (SNI) distribution as a tool for modeling clustered data with bivariate non-normal responses in an LMM framework. The SNI distributions form an attractive class of asymmetric thick-tailed parametric structures that includes the skew-normal distribution as a special case. We assume that the random effects follow multivariate SNI distributions and the random errors follow SNI distributions, which provides substantial robustness over the symmetric normal process in an LMM framework. Specific distributions obtained as special cases, viz. the skew-t, the skew-slash and the skew-contaminated normal distributions, are compared, along with the default skew-normal density. The methodology is illustrated through an application to a real data set that records the periodontal health status of an interesting population using periodontal pocket depth (PPD) and clinical attachment level (CAL).


Statistics in Medicine | 2011

Assessing noninferiority in a three-arm trial using the Bayesian approach

Pulak Ghosh; Farouk S. Nathoo; Mithat Gonen; Ram C. Tiwari

Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm trial consists of a placebo, a reference and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with comparing this reference to an experimental treatment. In this paper, we consider the analysis of non-inferiority trials using Bayesian methods which incorporate both parametric as well as semi-parametric models. The resulting testing approach is both flexible and robust. The benefit of the proposed Bayesian methods is assessed via simulation, based on a study examining home-based blood pressure interventions.


Vikalpa | 2015

Big Data: Prospects and Challenges

Janakiraman Moorthy; Rangin Lahiri; Neelanjan Biswas; Dipyaman Sanyal; Jayanthi Ranjan; Krishnadas Nanath; Pulak Ghosh

We are living in an era of data deluge. The data that the human race has accumulated in the past decade far exceeds the data that was available to mankind during the preceding century. McKinsey & Co. foresees that society is ‘on the cusp of a tremendous wave of innovation, productivity, and growth as well as new modes of competition and value capture, all driven by Big Data’. They also expect that different stakeholders such as consumers, companies and businesses are likely to exploit the potential of Big Data. Eric Siegel, founder of Predictive Analytics World, estimates that on an average day we accumulate 2.5 quintillion bytes of data. Another important characteristic of ‘datafication’, as Viktor Mayer-Schönberger and Kenneth Cukier call it, is that ‘Data can frequently be collected passively, without much effort or even awareness on the part of those being recorded. And because the cost of storage has fallen so much, it is easier to justify keeping data than discard it.’ As the cost of storage has fallen and computing power has increased, data volumes that were once challenging can now be handled easily on a desktop computer. Several estimates about the accumulation of data have challenged our earlier imagination. Data scientists are increasingly measuring data quantities in petabytes and zettabytes. Businesses, governments and developmental organizations all foresee that Big Data is likely to create value in multiple


Computational Statistics & Data Analysis | 2015

Comorbidity of chronic diseases in the elderly

Jakob Stöber; Hyokyoung Grace Hong; Claudia Czado; Pulak Ghosh

Joint modeling of multiple health-related random variables is essential to develop an understanding of the public health consequences of an aging population. This is particularly true for patients suffering from multiple chronic diseases. Our contribution is a novel model for multivariate data where some response variables are discrete and some are continuous. It is based on pair copula constructions (PCCs) and has two major advantages over existing methodology. First, expressing the joint dependence structure in terms of bivariate copulas leads to a computationally advantageous expression for the likelihood function. This makes maximum likelihood estimation feasible for large multidimensional data sets. Second, different and possibly asymmetric bivariate (conditional) marginal distributions are allowed, which is necessary to accurately describe the limiting behavior of conditional distributions for mixed discrete and continuous responses. The advantages and the favorable predictive performance of the model are demonstrated using data from the Second Longitudinal Study of Aging (LSOA II).


Statistics in Medicine | 2013

Skew-elliptical spatial random effect modeling for areal data with application to mapping health utilization rates

Farouk S. Nathoo; Pulak Ghosh

Mixed models incorporating spatially correlated random effects are often used for the analysis of areal data. In this setting, spatial smoothing is introduced at the second stage of a hierarchical framework, and this smoothing is often based on a latent Gaussian Markov random field. The Markov random field provides a computationally convenient framework for modeling spatial dependence; however, the Gaussian assumption underlying commonly used models can be overly restrictive in some applications. This can be a problem in the presence of outliers or discontinuities in the underlying spatial surface, and in such settings, models based on non-Gaussian spatial random effects are useful. Motivated by a study examining geographic variation in the treatment of acute coronary syndrome, we develop a robust model for smoothing small-area health service utilization rates. The model incorporates non-Gaussian spatial random effects, and we develop a formulation for skew-elliptical areal spatial models. We generalize the Gaussian conditional autoregressive model to the non-Gaussian case, allowing for asymmetric skew-elliptical marginal distributions having flexible tail behavior. The resulting new models are flexible, computationally manageable, and can be implemented in the standard Bayesian software WinBUGS. We demonstrate performance of the proposed methods and comparisons with other commonly used Gaussian and non-Gaussian spatial prior formulations through simulation and analysis in our motivating application, mapping rates of revascularization for patients diagnosed with acute coronary syndrome in Quebec, Canada.

Collaboration


Pulak Ghosh's collaborators.

Top Co-Authors

Sarah Brown, University of Sheffield

Bhuvanesh Pareek, Indian Institute of Management Ahmedabad

Karthik Sriram, Indian Institute of Management Ahmedabad

Karl Taylor, University of Sheffield

Sudhir Voleti, Indian School of Business