Hsin-Cheng Huang
Academia Sinica
Publications
Featured research published by Hsin-Cheng Huang.
Journal of the American Statistical Association | 1999
Noel A Cressie; Hsin-Cheng Huang
Abstract Suppose that a random process Z(s;t), indexed in space and time, has spatio-temporal stationary covariance C(h;u), where h ∈ ℝd (d ≥ 1) is a spatial lag and u ∈ ℝ is a temporal lag. Separable spatio-temporal covariances have the property that they can be written as a product of a purely spatial covariance and a purely temporal covariance. Their ease of definition is counterbalanced by the rather limited class of random processes to which they correspond. In this article we derive a new approach that allows one to obtain many classes of nonseparable, spatio-temporal stationary covariance functions and fit several such classes to spatio-temporal data on wind speed over a region in the tropical western Pacific Ocean.
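As a concrete illustration, one closed-form nonseparable class of the kind derived in the article can be evaluated and checked for nonseparability in a few lines. The parameter values below are arbitrary, and the exact parameterization is a sketch of the Cressie–Huang type, not necessarily the paper's fitted model:

```python
import math

def nonseparable_cov(h_norm, u, sigma2=1.0, a=1.0, b=1.0, d=2):
    """Evaluate a nonseparable space-time covariance of the Cressie-Huang type:
        C(h; u) = sigma2 / (a^2 u^2 + 1)^(d/2) * exp(-b^2 ||h||^2 / (a^2 u^2 + 1)),
    where h_norm is the spatial lag length ||h|| and u is the temporal lag.
    """
    denom = a * a * u * u + 1.0
    return sigma2 / denom ** (d / 2) * math.exp(-b * b * h_norm * h_norm / denom)

# A separable covariance satisfies C(h; u) * C(0; 0) == C(h; 0) * C(0; u).
# This class does not, which is what makes it nonseparable.
lhs = nonseparable_cov(1.0, 1.0) * nonseparable_cov(0.0, 0.0)
rhs = nonseparable_cov(1.0, 0.0) * nonseparable_cov(0.0, 1.0)
print(lhs, rhs)  # the two products differ
```

Note that setting u = 0 recovers a purely spatial (Gaussian-shaped) covariance, and setting h = 0 recovers a purely temporal one; the coupling through the shared denominator is what prevents factorization.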
Computational Statistics & Data Analysis | 1996
Hsin-Cheng Huang; Noel A Cressie
Abstract Consider a spatio-temporal stochastic process {Z(s; t): s ∈ D; t = 1, 2, …} and suppose it is of interest to predict {Z(s; t₀): s ∈ D} at some fixed time point t₀. Purely spatial methods use data Z(s₁; t₀), …, Z(sₙ; t₀) to construct a spatial predictor (e.g., kriging). But when data {Z(sᵢ; t): i = 1, …, n; t = 1, 2, …, t₀} are available, it is advantageous to treat the problem as one of spatio-temporal prediction. The US National Weather Service currently uses current snow water equivalent (SWE) data and a purely spatial model to predict SWE at sites where no observations are available. To improve SWE predictions, we introduce a spatio-temporal model that incorporates the SWE data from the past, resulting in a Kalman-filter prediction algorithm. A simple procedure for estimating the parameters in the model is developed, and an example is presented for the Animas River basin in southwest Colorado. Keywords: Cross-validation; Kriging; Second-order stationary; Spatio-temporal model
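The Kalman-filter prediction idea can be sketched for a simple scalar state-space model. The AR(1) dynamics and all parameter values below are illustrative placeholders, not the paper's fitted SWE model:

```python
def kalman_filter(observations, phi=0.8, q=0.25, r=1.0, m0=0.0, p0=1.0):
    """One-dimensional Kalman filter for the state-space model
        state:        x_t = phi * x_{t-1} + w_t,  w_t ~ N(0, q)
        observation:  z_t = x_t + v_t,            v_t ~ N(0, r)
    Returns the filtered means and variances; the final predict step
    is the building block for forecasting the state at time t0.
    """
    m, p = m0, p0
    means, variances = [], []
    for z in observations:
        # Predict: propagate mean and variance through the AR(1) state.
        m_pred = phi * m
        p_pred = phi * phi * p + q
        # Update: fold in the new observation via the Kalman gain.
        k = p_pred / (p_pred + r)
        m = m_pred + k * (z - m_pred)
        p = (1.0 - k) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances

# Hypothetical standardized SWE-like series at a single site.
obs = [1.0, 0.8, 1.2, 0.9, 1.1]
means, variances = kalman_filter(obs)
```

The filtered variance shrinks below the prior variance as observations accumulate, which is the sense in which borrowing past data improves on a purely spatial predictor at the current time.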
Journal of Computational and Graphical Statistics | 2002
Hsin-Cheng Huang; Noel A Cressie; John Gabrosek
Polar orbiting satellites remotely sense the earth and its atmosphere, producing datasets that give daily global coverage. For any given day, the data are many and measured at spatially irregular locations. Our goal in this article is to predict values that are spatially regular at different resolutions; such values are often used as input to general circulation models (GCMs) and the like. Not only do we wish to predict optimally, but because data acquisition is relentless, our algorithm must also process the data very rapidly. This article applies a multiresolution autoregressive tree-structured model, and presents a new statistical prediction methodology that is resolution consistent (i.e., preserves “mass balance” across resolutions) and computes spatial predictions and prediction (co)variances extremely fast. Data from the Total Ozone Mapping Spectrometer (TOMS) instrument, on the Nimbus-7 satellite, are used for illustration.
Journal of the American Statistical Association | 2010
Xiaotong Shen; Hsin-Cheng Huang
Extracting grouping structure, or identifying homogeneous subgroups of predictors, in regression is crucial for high-dimensional data analysis. In particular, a low-dimensional grouping structure, when captured in a regression model, enhances predictive performance and facilitates the model's interpretability. Grouping pursuit extracts the homogeneous subgroups of predictors most responsible for outcomes of a response. This is the case in gene network analysis, where grouping reveals gene functionalities with regard to the progression of a disease. To address challenges in grouping pursuit, we introduce a novel homotopy method for computing an entire solution surface through regularization involving a piecewise linear penalty. This nonconvex and overcomplete penalty permits adaptive grouping and nearly unbiased estimation, and is treated with a novel concept of grouped subdifferentials and difference convex programming for efficient computation. Finally, the proposed method not only achieves high performance as suggested by numerical analysis, but also has the desired optimality with regard to grouping pursuit and prediction, as shown by our theoretical results.
Journal of the American Statistical Association | 2006
Xiaotong Shen; Hsin-Cheng Huang
Central to statistical theory and application is statistical modeling, which typically involves choosing a single model or combining a number of models of different sizes and from different sources. Whereas model selection seeks a single best modeling procedure, model combination combines the strength of different modeling procedures. In this article we look at several key issues and argue that model assessment is the key to model selection and combination. Most important, we introduce a general technique of optimal model assessment based on data perturbation, thus yielding optimal selection, in particular model selection and combination. From a frequentist perspective, we advocate model combination over a selected subset of modeling procedures, because it controls bias while reducing variability, hence yielding better performance in terms of the accuracy of estimation and prediction. To realize the potential of model combination, we develop methodologies for determining the optimal tuning parameter, such as weights and subsets for combining via optimal model assessment. We present simulated and real data examples to illustrate main aspects.
Journal of Toxicology and Environmental Health | 2002
Song-Lih Huang; Wen-Ling Cheng; Chung-Te Lee; Hsin-Cheng Huang; Chang-Chuan Chan
Ambient particles may cause pulmonary inflammation with ensuing morbidity. Particle-induced production of proinflammatory cytokines in vitro has been used as an indicator of particle toxicity. To identify particle components related to particle toxicity, Andersen dichotomous impactors were used to collect ambient fine (PM2.5) and coarse (PM2.5–10) particles in central Taiwan, with extraction in endotoxin-free water. Mouse monocyte-macrophage cell line RAW 264.7 cells were exposed to particle extracts at 40 μg/ml for 16 h, and tumor necrosis factor-alpha (TNF-α) was measured in the medium by enzyme-linked immunosorbent assay (ELISA). Cell viabilities were all greater than 82%. Coarse particles stimulated higher TNF-α production than fine particles, and this was associated with greater particulate endotoxin content. Polymyxin B inhibited 42% of the TNF-α production elicited by coarse particles and 32% of that elicited by fine particles. In fine particles, TNF-α production was negatively correlated with Zn content, while no element in coarse particles correlated with TNF-α production. Results suggest that endotoxin and other components may be important factors for TNF-α production by macrophages in vitro.
Journal of Agricultural Biological and Environmental Statistics | 2005
Jun Zhu; Hsin-Cheng Huang; Jungpin Wu
An autologistic regression model consists of a logistic regression of a response variable on explanatory variables and an autoregression on responses at neighboring locations on a lattice. It is a Markov random field with pairwise spatial dependence and is a popular tool for modeling spatial binary responses. In this article, we add a temporal component to the autologistic regression model for spatial-temporal binary data. The spatial-temporal autologistic regression model captures the relationship between a binary response and potential explanatory variables, and adjusts for both spatial dependence and temporal dependence simultaneously by a space-time Markov random field. We estimate the model parameters by maximum pseudo-likelihood and obtain optimal prediction of future responses on the lattice by a Gibbs sampler. For illustration, the method is applied to study the outbreaks of southern pine beetle in North Carolina. We also discuss the generality of our approach for modeling other types of spatial-temporal lattice data.
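A minimal sketch of the autologistic conditional probabilities and the pseudo-likelihood objective follows. The covariates, neighborhood structure, and parameter values are hypothetical, and the purely spatial version shown here omits the temporal neighbors that the paper's model adds:

```python
import math

def autologistic_cond_prob(z_i, x_i, neighbors, beta, eta):
    """Conditional probability P(Z_i = z_i | neighboring responses) under
    an autologistic model with covariate coefficient beta and spatial
    autoregression coefficient eta; responses are coded 0/1.
    neighbors: list of 0/1 responses at adjacent lattice sites.
    """
    linpred = beta * x_i + eta * sum(neighbors)
    p1 = 1.0 / (1.0 + math.exp(-linpred))
    return p1 if z_i == 1 else 1.0 - p1

def log_pseudo_likelihood(z, x, neighbor_idx, beta, eta):
    """Sum of log conditional probabilities over all sites: the objective
    maximized in maximum pseudo-likelihood estimation."""
    return sum(
        math.log(autologistic_cond_prob(
            z[i], x[i], [z[j] for j in neighbor_idx[i]], beta, eta))
        for i in range(len(z))
    )

# Example: a 2x2 lattice with rook neighbors (hypothetical data).
z = [1, 0, 1, 1]                       # binary responses
x = [0.5, -0.2, 0.1, 0.3]              # one covariate per site
neighbors = [[1, 2], [0, 3], [0, 3], [1, 2]]
print(log_pseudo_likelihood(z, x, neighbors, beta=1.0, eta=0.5))
```

Maximizing this sum over (beta, eta), e.g. by a generic numerical optimizer, avoids the intractable normalizing constant of the full Markov random field likelihood, which is the appeal of the pseudo-likelihood approach.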
Journal of the American Statistical Association | 2004
Xiaotong Shen; Hsin-Cheng Huang; Jimmy Ye
Typical modeling strategies involve model selection, which has a significant effect on inference of estimated parameters. Common practice is to use a selected model ignoring uncertainty introduced by the process of model selection. This could yield overoptimistic inferences, resulting in false discovery. In this article we develop a general methodology via optimal approximation for estimating the mean and variance of complex statistics that involve the process of model selection. This allows us to make approximately unbiased inferences, taking into account the selection process. We examine the operating characteristics of the proposed methodology via asymptotic analyses and simulations. These results show that the proposed methodology yields correct inferences and outperforms common alternatives.
Journal of the American Statistical Association | 2007
Hsin-Cheng Huang; Chun-Shu Chen
In many fields of science, predicting variables of interest over a study region based on noisy data observed at some locations is an important problem. Two popular methods for the problem are kriging and smoothing splines. The former assumes that the underlying process is stochastic, whereas the latter assumes it is purely deterministic. Kriging performs better than smoothing splines in some situations, but is outperformed by smoothing splines in others. However, little is known regarding selecting between kriging and smoothing splines. In addition, how to perform variable selection in a geostatistical model has not been well studied. In this article we propose a general methodology for selecting among arbitrary spatial prediction methods based on (approximately) unbiased estimation of mean squared prediction errors using a data perturbation technique. The proposed method accounts for estimation uncertainty in both kriging and smoothing spline predictors, and is shown to be optimal in terms of two mean squared prediction error criteria. A simulation experiment is performed to demonstrate the effectiveness of the proposed methodology. The proposed method is also applied to a water acidity data set by selecting important variables responsible for water acidity based on a spatial regression model. Moreover, a new method is proposed for estimating the noise variance that is robust and performs better than some well-known methods.
Technometrics | 2000
Hsin-Cheng Huang; Noel A Cressie
In a series of recent articles on nonparametric regression, Donoho and Johnstone developed wavelet-shrinkage methods for recovering unknown piecewise-smooth deterministic signals from noisy data. Wavelet shrinkage based on the Bayesian approach involves specifying a prior distribution on the wavelet coefficients, which is usually assumed to have zero mean. There is no a priori reason why all prior means should be 0; indeed, one can imagine certain types of signals for which this is not a good choice of model. In this article, we take an empirical Bayes approach in which we propose an estimator for the prior mean that is "plugged into" the Bayesian shrinkage formulas. Another way we are more general than previous work is that we assume the underlying signal is composed of a piecewise-smooth deterministic part plus a zero-mean stochastic part; that is, the signal may contain a reasonably large number of nonzero wavelet coefficients. Our goal is to predict this signal from noisy data. We also develop a new estimator for the noise variance based on a geostatistical method that considers the behavior of the variogram near the origin. Simulation studies show that our method (DecompShrink) outperforms the well-known VisuShrink and SureShrink methods for recovering a wide variety of signals. Moreover, it is insensitive to the choice of the lowest-scale cut-off parameter, which is typically not the case for other wavelet-shrinkage methods.
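For orientation, the kind of baseline this article improves on can be sketched as one-level Haar wavelet shrinkage with the universal (VisuShrink-style) soft threshold. This is a generic illustration of wavelet shrinkage, not the DecompShrink method itself:

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar transform: returns approximation
    and detail coefficients; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[2 * i] + x[2 * i + 1]) / s for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / s for i in range(len(x) // 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: exactly reconstructs the original sequence."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def soft_threshold(coeffs, t):
    """Soft-thresholding: shrink each coefficient toward zero by t,
    setting coefficients smaller than t in magnitude to exactly zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def shrinkage_denoise(noisy, sigma):
    """Denoise via one-level Haar transform, applying the universal
    threshold sigma * sqrt(2 log n) to the detail coefficients only."""
    approx, detail = haar_dwt(noisy)
    t = sigma * math.sqrt(2.0 * math.log(len(noisy)))
    return haar_idwt(approx, soft_threshold(detail, t))
```

Small detail coefficients, which are dominated by noise, are zeroed out, while large ones, which carry the signal's jumps and ridges, survive; the article's contributions (a nonzero prior mean, a stochastic signal component, and a variogram-based noise-variance estimator) modify how that shrinkage is calibrated.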