Publication


Featured research published by Avishek Chakraborty.


The Annals of Applied Statistics | 2010

Modeling large scale species abundance with latent spatial processes

Avishek Chakraborty; Alan E. Gelfand; Adam M. Wilson; Andrew M. Latimer; John A. Silander

Modeling species abundance patterns using local environmental features is an important, current problem in ecology. The Cape Floristic Region (CFR) in South Africa is a global hot spot of diversity and endemism, and provides a rich class of species abundance data for such modeling. Here, we propose a multi-stage Bayesian hierarchical model for explaining species abundance over this region. Our model is specified at areal level, where the CFR is divided into roughly 37,000 one minute grid cells; species abundance is observed at some locations within some cells. The abundance values are ordinally categorized. Environmental and soil-type factors, likely to influence the abundance pattern, are included in the model. We formulate the empirical abundance pattern as a degraded version of the potential pattern, with the degradation effect accomplished in two stages. First, we adjust for land use transformation and then we adjust for measurement error, hence misclassification error, to yield the observed abundance classifications. An important point in this analysis is that only 28% of the grid cells have been sampled and that, for sampled grid cells, the number of sampled locations ranges from one to more than one hundred. Still, we are able to develop potential and transformed abundance surfaces over the entire region. In the hierarchical framework, categorical abundance classifications are induced by continuous latent surfaces. The degradation model above is built on the latent scale. On this scale, an areal level spatial regression model was used for modeling the dependence of species abundance on the environmental factors. To capture anticipated similarity in abundance pattern among neighboring regions, spatial random effects with a conditionally autoregressive prior (CAR) were specified. Model fitting is through familiar Markov chain Monte Carlo methods. 
While models with CAR priors are usually efficiently fitted, even with large data sets, with our modeling and the large number of cells, run times became very long. So a novel parallelized computing strategy was developed to expedite fitting. The model was run for six different species. With categorical data, display of the resultant abundance patterns is a challenge and we offer several different views. The patterns are of importance on their own, comparatively across the region and across species, with implications for species competition and, more generally, for planning and conservation.
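The statement that categorical abundance classes are induced by a continuous latent surface can be illustrated with a minimal sketch. The cut points and latent values below are hypothetical and chosen only for illustration; they are not taken from the paper, and the function name is an assumption.

```python
from bisect import bisect_right


def ordinal_category(latent_value, cut_points):
    """Map a continuous latent value to an ordinal class index.

    cut_points must be sorted in ascending order. With K cut points there
    are K + 1 classes, numbered 0..K. A value equal to a cut point is
    assigned to the upper class.
    """
    return bisect_right(cut_points, latent_value)


# Hypothetical cut points and latent values (illustration only).
cut_points = [0.0, 1.0, 2.5]
latent_surface = [-0.7, 0.3, 1.0, 3.2]

categories = [ordinal_category(z, cut_points) for z in latent_surface]
print(categories)
```

In a hierarchical model of this kind, the latent values would themselves be given a regression and spatial random-effect structure; the thresholding step shown here is only the final link from the latent scale to the observed classes.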


Journal of the American Statistical Association | 2013

Spline-Based Emulators for Radiative Shock Experiments With Measurement Error

Avishek Chakraborty; Bani K. Mallick; Ryan G. McClarren; C.C. Kuranz; Derek Bingham; M.J. Grosskopf; Erica M. Rutter; Hayes F. Stripling; R. Paul Drake

Radiation hydrodynamics and radiative shocks are of fundamental interest in the high-energy-density physics research due to their importance in understanding astrophysical phenomena such as supernovae. In the laboratory, experiments can produce shocks with fundamentally similar physics on reduced scales. However, the cost and time constraints of the experiment necessitate use of a computer algorithm to generate a reasonable number of outputs for making valid inference. We focus on modeling emulators that can efficiently assimilate these two sources of information accounting for their intrinsic differences. The goal is to learn how to predict the breakout time of the shock given the information on associated parameters such as pressure and energy. Under the framework of the Kennedy–O’Hagan model, we introduce an emulator based on adaptive splines. Depending on the preference of having an interpolator for the computer code output or a computationally fast model, a couple of different variants are proposed. Those choices are shown to perform better than the conventional Gaussian-process-based emulator and a few other choices of nonstationary models. For the shock experiment dataset, a number of features related to computer model validation such as using interpolator, necessity of discrepancy function, or accounting for experimental heterogeneity are discussed, implemented, and validated for the current dataset. In addition to the typical Gaussian measurement error for real data, we consider alternative specifications suitable to incorporate noninformativeness in error distributions, more in agreement with the current experiment. Comparative diagnostics, to highlight the effect of measurement error model on predictive uncertainty, are also presented. Supplementary materials for this article are available online.


Statistics in Medicine | 2014

Imputation of confidential data sets with spatial locations using disease mapping models

Thais Paiva; Avishek Chakraborty; Jerry Reiter; Alan E. Gelfand

Data that include fine geographic information, such as census tract or street block identifiers, can be difficult to release as public use files. Fine geography provides information that ill-intentioned data users can use to identify individuals. We propose to release data with simulated geographies, so as to enable spatial analyses while reducing disclosure risks. We fit disease mapping models that predict areal-level counts from attributes in the file and sample new locations based on the estimated models. We illustrate this approach using data on causes of death in North Carolina, including evaluations of the disclosure risks and analytic validity that can result from releasing synthetic geographies.


Computational Statistics & Data Analysis | 2013

Spatial interaction models with individual-level data for explaining labor flows and developing local labor markets

Avishek Chakraborty; María Asunción Beamonte; Alan E. Gelfand; M.P. Alonso; Pilar Gargallo; Manuel Salvador

As a result of increased mobility patterns of workers, explaining labor flows and partitioning regions into local labor markets (LLMs) have become important economic issues. For the former, it is useful to understand jointly where individuals live and where they work. For the latter, such markets attempt to delineate regions with a high proportion of workers both living and working. To address these questions, we separate the problem into two stages. First, we introduce a stochastic modeling approach using a hierarchical spatial interaction specification at the individual level, incorporating individual-level covariates, origin (O) and destination (D) covariates, and spatial structure. We fit the model within a Bayesian framework. Such modeling enables posterior inference regarding the importance of these components as well as the O-D matrix of flows. Nested model comparison is available as well. For computational convenience, we start with a minimum market configuration (MMC) upon which our model is overlaid. At the second stage, after model fitting and inference, we turn to LLM creation. We introduce a utility with regard to the performance of an LLM partition and, with posterior samples, we can obtain the posterior distribution of the utility for any given LLM specification which we view as a partition of the MMC. We further provide an explicit algorithm to obtain good partitions according to this utility, employing these posterior distributions. However, the space of potential market partitions is huge and we discuss challenges regarding selection of the number of markets and comparison of partitions using this utility. Our approach is illustrated using a rich dataset for the region of Aragon in Spain. In particular, we analyze the full dataset and also a sample. Future data collection will arise as samples of the working population so assessing population level inference from the sample is useful.
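The idea of a region with a high proportion of workers both living and working there can be sketched from an O-D matrix of flows. The measure below (share of resident workers who also work inside the market) is one simple, assumed reading of that idea and is not the utility defined in the paper; the flow counts and partition are hypothetical.

```python
def self_containment(od_flows, market):
    """Share of workers living in `market` who also work in `market`.

    od_flows[i][j] is the number of workers living in unit i and working
    in unit j. `market` is a collection of unit indices.
    """
    members = set(market)
    resident_workers = sum(sum(od_flows[i]) for i in members)
    stay_inside = sum(od_flows[i][j] for i in members for j in members)
    return stay_inside / resident_workers if resident_workers else 0.0


# Hypothetical 3-unit O-D matrix (rows: residence, columns: workplace).
od_flows = [
    [80, 15, 5],
    [20, 60, 20],
    [10, 10, 80],
]

# A candidate partition of the three units into two markets.
partition = [[0, 1], [2]]
scores = [self_containment(od_flows, market) for market in partition]
print(scores)
```

With posterior samples of the O-D matrix, a summary like this could be recomputed for each sample to obtain a distribution rather than a single number.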


Journal of Biomedical Optics | 2018

Pulsed terahertz imaging of breast cancer in freshly excised murine tumors

Tyler Bowman; Tanny Chavez; Kamrul Khan; Jingxian Wu; Avishek Chakraborty; Narasimhan Rajaram; Keith Bailey; Magda El-Shenawee

This paper investigates terahertz (THz) imaging and classification of freshly excised murine xenograft breast cancer tumors. These tumors are grown via injection of E0771 breast adenocarcinoma cells into the flank of mice maintained on a high-fat diet. Within 1 h of excision, the tumor and adjacent tissues are imaged using a pulsed THz system in the reflection mode. The THz images are classified using a statistical Bayesian mixture model with unsupervised and supervised approaches. Correlation with digitized pathology images is conducted using classification images assigned by a modal class decision rule. The corresponding receiver operating characteristic curves are obtained based on the classification results. A total of 13 tumor samples obtained from 9 tumors are investigated. The results show good correlation of THz images with pathology results in all samples of cancer and fat tissues. For tumor samples of cancer, fat, and muscle tissues, THz images show reasonable correlation with pathology where the primary challenge lies in the overlapping dielectric properties of cancer and muscle tissues. The use of a supervised regression approach shows improvement in the classification images although not consistently in all tissue regions. Advancing THz imaging of breast tumors from mice and the development of accurate statistical models will ultimately progress the technique for the assessment of human breast tumor margins.


The Journal of Thoracic and Cardiovascular Surgery | 2017

A prognostic tool to predict outcomes in children undergoing the Norwood operation

Punkaj Gupta; Avishek Chakraborty; Jeffrey M. Gossett; Mallikarjuna Rettiganti

Objectives: To create and validate a prediction model to assess outcomes associated with the Norwood operation. Methods: The public‐use dataset from a multicenter, prospective, randomized single‐ventricle reconstruction trial was used to create this novel prediction tool. A Bayesian lasso logistic regression model was used for variable selection. We used a hierarchical framework by representing discrete probability models with continuous latent variables that depended on the risk factors for a particular patient. Bayesian conditional probit regression and Markov chain Monte Carlo simulations were then used to estimate the effects of the predictors on the means of these latent variables to create a score function for each of the study outcomes. We also devised a method to calculate the risk of outcomes associated with the Norwood operation before the actual heart operation. The 2 study outcomes evaluated were in‐hospital mortality and composite poor outcome. Results: The training dataset used 520 patients to generate the prediction model. The model included patient demographics, baseline characteristics, cardiac diagnosis, operation details, site volume, and surgeon experience. An online calculator for the tool can be accessed at https://soipredictiontool.shinyapps.io/NorwoodScoreApp/. Model validation was performed on 520 observations using an internal 10‐fold cross‐validation approach. The prediction model had an area under the curve of 0.77 for mortality and 0.72 for composite poor outcome on the validation dataset. Conclusions: Our new prognostic tool is a promising first step in creating real‐time risk stratification in children undergoing a Norwood operation; this tool will be beneficial for the purposes of benchmarking, family counseling, and research.
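The area under the curve reported above measures how well a risk score ranks patients with the outcome ahead of those without it. A minimal sketch of that computation follows, using the pairwise-comparison form of the AUC; the scores and outcomes are hypothetical and unrelated to the trial data.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and observed outcomes (1 = event occurred).
risk_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
outcomes = [1, 0, 1, 0, 0]

model_auc = auc(risk_scores, outcomes)
print(round(model_auc, 3))
```

In a 10-fold cross-validation, the scores passed to such a function would come from models fitted without the fold being scored.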


Statistics and Computing | 2015

An adaptive spatial model for precipitation data from multiple satellites over large regions

Avishek Chakraborty; Swarup De; Kenneth P. Bowman; Huiyan Sang; Marc G. Genton; Bani K. Mallick

Satellite measurements have of late become an important source of information for climate features such as precipitation due to their near-global coverage. In this article, we look at a precipitation dataset during a 3-hour window over tropical South America that has information from two satellites. We develop a flexible hierarchical model to combine instantaneous rainrate measurements from those satellites while accounting for their potential heterogeneity. Conceptually, we envision an underlying precipitation surface that influences the observed rain as well as absence of it. The surface is specified using a mean function centered at a set of knot locations, to capture the local patterns in the rainrate, combined with a residual Gaussian process to account for global correlation across sites. To improve over the commonly used pre-fixed knot choices, an efficient reversible jump scheme is used to allow the number of such knots as well as the order and support of associated polynomial terms to be chosen adaptively. To facilitate computation over a large region, a reduced rank approximation for the parent Gaussian process is employed.


Seminars in Thoracic and Cardiovascular Surgery | 2018

An Empirically Derived Pediatric Cardiac Inotrope Score Associated With Pediatric Heart Surgery

Punkaj Gupta; Mallikarjuna Rettiganti; Andrew Wilcox; Mai-Anh Vuong-Dac; Jeffrey M. Gossett; Michiaki Imamura; Avishek Chakraborty

We aimed to empirically derive an inotrope score to predict real-time outcomes using the doses of inotropes after pediatric cardiac surgery. The outcomes evaluated included in-hospital mortality, prolonged hospital length of stay, and composite poor outcome (mortality or prolonged hospital length of stay). The study population included patients <18 years of age undergoing heart operations (with or without cardiopulmonary bypass) of varying complexity. To create this novel pediatric cardiac inotrope score (PCIS), we collected the data on the highest doses of 4 commonly used inotropes (epinephrine, norepinephrine, dopamine, and milrinone) in the first 24 hours after heart operation. We employed a hierarchical framework by representing discrete probability models with continuous latent variables that depended on the dosage of drugs for a particular patient. We used Bayesian conditional probit regression to model the effects of the inotropes on the mean of the latent variables. We then used Markov chain Monte Carlo simulations for simulating posterior samples to create a score function for each of the study outcomes. The training dataset utilized 1030 patients to make the scientific model. An online calculator for the tool can be accessed at https://soipredictiontool.shinyapps.io/InotropeScoreApp. The newly proposed empiric PCIS demonstrated a high degree of discrimination for predicting study outcomes in children undergoing heart operations. The newly proposed empiric PCIS provides a novel measure to predict real-time outcomes using the doses of inotropes among children undergoing heart operations of varying complexity.


Technometrics | 2017

Emulation of Numerical Models With Over-Specified Basis Functions

Avishek Chakraborty; Derek Bingham; Soma Sekhar Dhavala; C. C. Kuranz; R. Paul Drake; M.J. Grosskopf; Erica M. Rutter; Ben Torralva; James Paul Holloway; Ryan G. McClarren; Bani K. Mallick

Mathematical models are frequently used to explore physical systems, but can be computationally expensive to evaluate. In such settings, an emulator is used as a surrogate. In this work, we propose a basis-function approach for computer model emulation. To combine field observations with a collection of runs from the numerical model, we use the proposed emulator within the Kennedy-O’Hagan framework of model calibration. A novel feature of the approach is the use of an over-specified set of basis functions where the number of bases used and their inclusion probabilities are treated as unknown quantities. The new approach is found to have smaller predictive uncertainty and greater computational efficiency than the standard Gaussian process approach to emulation and calibration. Along with several simulation examples focusing on different model characteristics, we also use the method to analyze a dataset on laboratory experiments related to astrophysics.


Journal of the Royal Statistical Society: Series C (Applied Statistics) | 2011

Point pattern modelling for degraded presence‐only data over large regions

Avishek Chakraborty; Alan E. Gelfand; Adam M. Wilson; Andrew M. Latimer; John A. Silander

Collaboration


Top Co-Authors

Mallikarjuna Rettiganti

University of Arkansas for Medical Sciences

Punkaj Gupta

University of Arkansas for Medical Sciences

Jingxian Wu

University of Arkansas

Kamrul Khan

University of Arkansas