Jianhua Z. Huang
Texas A&M University
Publication
Featured research published by Jianhua Z. Huang.
Journal of the American Statistical Association | 2008
Lifeng Wang; Hongzhe Li; Jianhua Z. Huang
Nonparametric varying-coefficient models are commonly used for analyzing data measured repeatedly over time, including longitudinal and functional response data. Although many procedures have been developed for estimating varying coefficients, the problem of variable selection for such models has not been addressed to date. In this article we present a regularized estimation procedure for variable selection that combines basis function approximations and the smoothly clipped absolute deviation penalty. The proposed procedure simultaneously selects significant variables with time-varying effects and estimates the nonzero smooth coefficient functions. Under suitable conditions, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. Here the oracle property means that the asymptotic distribution of an estimated coefficient function is the same as that when it is known a priori which variables are in the model. The method is illustrated with simulations and two real data examples, one for identifying risk factors in the study of AIDS and one using microarray time-course gene expression data to identify the transcription factors related to the yeast cell-cycle process.
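For readers unfamiliar with the penalty named in this abstract, the following is a minimal sketch of the smoothly clipped absolute deviation (SCAD) penalty in its standard scalar form. The function name `scad_penalty`, the default `a = 3.7`, and the idea of evaluating it at the norm of one coefficient function's basis coefficients are illustrative assumptions, not details taken from the abstract.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at a scalar theta (requires a > 2, lam > 0)."""
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero: encourages exact zeros.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that smoothly flattens the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no additional penalty.
    return lam * lam * (a + 1) / 2


# Hypothetical usage: penalize one coefficient function through the Euclidean
# norm of its basis coefficients (an assumption made for illustration).
basis_coefs = [0.3, -0.4]
group_norm = sum(c * c for c in basis_coefs) ** 0.5
penalty = scad_penalty(group_norm, lam=1.0)
```

Because the penalty is flat beyond `a * lam`, large coefficients are left essentially unshrunk, which is the behavior usually associated with the oracle property mentioned in the abstract.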
Biometrics | 2010
Mihee Lee; Haipeng Shen; Jianhua Z. Huang; J. S. Marron
Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row-column associations within high-dimensional data matrices. SSVD seeks a low-rank, checkerboard structured matrix approximation to data matrices. The desired checkerboard structure is achieved by forcing both the left- and right-singular vectors to be sparse, that is, having many zero entries. By interpreting singular vectors as regression coefficient vectors for certain linear regressions, sparsity-inducing regularization penalties are imposed on the least squares regression to produce sparse singular vectors. An efficient iterative algorithm is proposed for computing the sparse singular vectors, along with some discussion of penalty parameter selection. A lung cancer microarray dataset and a food nutrition dataset are used to illustrate SSVD as a biclustering method. SSVD is also compared with some existing biclustering methods using simulated datasets.
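The regression view of singular vectors described in this abstract can be sketched as an alternating scheme: fix one vector, regress the data on it, and shrink the result. The sketch below is a simplified illustration, not the paper's algorithm. It assumes a plain lasso-type soft-threshold with fixed, hand-picked thresholds `lam_u` and `lam_v`, and the names `soft_threshold` and `sparse_rank_one` are hypothetical.

```python
import numpy as np


def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_rank_one(X, lam_u, lam_v, n_iter=100, tol=1e-8):
    """Alternate two thresholded regressions so that X is roughly s * outer(u, v)."""
    # Start from the leading ordinary singular vectors.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        # With u fixed, the least squares fit for v is X.T @ u; threshold it.
        v_new = soft_threshold(X.T @ u, lam_v)
        if not v_new.any():
            break  # threshold too large: everything was shrunk to zero
        v_new /= np.linalg.norm(v_new)
        # With v fixed, the least squares fit for u is X @ v; threshold it.
        u_new = soft_threshold(X @ v_new, lam_u)
        if not u_new.any():
            break
        u_new /= np.linalg.norm(u_new)
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    s = u @ X @ v  # scale of the rank-one layer
    return s, u, v
```

Rows and columns whose entries in `u` and `v` are both nonzero form one block of the checkerboard; subtracting `s * np.outer(u, v)` from `X` and repeating would give further layers.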
Manufacturing & Service Operations Management | 2008
Haipeng Shen; Jianhua Z. Huang
Accurate forecasting of call arrivals is critical for staffing and scheduling of a telephone call center. We develop methods for interday and dynamic intraday forecasting of incoming call volumes. Our approach is to treat the intraday call volume profiles as a high-dimensional vector time series. We propose first to reduce the dimensionality by singular value decomposition of the matrix of historical intraday profiles and then to apply time series and regression techniques. Our approach takes into account both interday (or day-to-day) dynamics and intraday (or within-day) patterns of call arrivals. Distributional forecasts are also developed. The proposed methods are data driven, appear to be robust against model assumptions in our simulation studies, and are shown to be very competitive in out-of-sample forecast comparisons using two real data sets. Our methods are computationally fast; it is therefore feasible to use them for real-time dynamic forecasting.
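The two-step idea in this abstract, reducing dimension by SVD and then applying time series techniques, can be sketched as follows. This is a minimal illustration under stated assumptions: the interday score series are each modeled with an AR(1) model with intercept, which is one simple choice and not necessarily the specification used in the paper, and the function name `forecast_next_profile` is hypothetical. Any data transformation or distributional forecasting is omitted.

```python
import numpy as np


def forecast_next_profile(X, n_components=2):
    """One-day-ahead forecast of an intraday profile.

    X has shape (days, intervals): each row is one day's call volume profile.
    """
    # Step 1: reduce dimension with the SVD of the historical profile matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # (days, K) interday series
    loadings = Vt[:n_components]                     # (K, intervals) intraday patterns

    # Step 2: forecast each score series with a least squares AR(1) fit.
    next_scores = np.empty(n_components)
    for k in range(n_components):
        y, x = scores[1:, k], scores[:-1, k]
        design = np.column_stack([np.ones_like(x), x])
        (c, phi), *_ = np.linalg.lstsq(design, y, rcond=None)
        next_scores[k] = c + phi * scores[-1, k]

    # Step 3: map the forecast scores back to a full intraday profile.
    return next_scores @ loadings
```

Only `n_components` short series are modeled instead of one series per intraday interval, which is what keeps the approach computationally light.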
Journal of the American Statistical Association | 2012
Lisha Chen; Jianhua Z. Huang
Reduced-rank regression is an effective method for predicting multiple response variables from the same set of predictor variables. It reduces the number of model parameters and takes advantage of interrelations between the response variables and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of the regression coefficients as a group and show that this penalty satisfies certain desirable invariance properties. We develop two numerical algorithms to solve the penalized regression problem and establish the asymptotic consistency of the proposed method. In particular, the manifold structure of the reduced-rank regression coefficient matrix is considered and studied in our theoretical analysis. In our simulation study and real data analysis, the new method is compared with several existing variable selection methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
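A row-wise group-lasso penalty of the kind described in this abstract can be illustrated with two small functions: the penalty itself (the sum of Euclidean row norms) and the standard row-wise shrinkage step associated with it. This is a generic sketch, not either of the paper's two algorithms, which the abstract does not describe. The names `row_group_penalty` and `row_group_shrink` are hypothetical.

```python
import numpy as np


def row_group_penalty(B):
    """Sum of Euclidean norms of the rows of the coefficient matrix B."""
    return np.linalg.norm(B, axis=1).sum()


def row_group_shrink(B, lam):
    """Shrink each row of B toward zero; rows with norm <= lam become exactly zero."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return scale * B
```

Since each row of `B` corresponds to one predictor, zeroing a whole row removes that predictor from every response at once. The penalty is also unchanged when `B` is multiplied on the right by an orthogonal matrix, which is one example of an invariance of the type the abstract alludes to.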
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013
Chiwoo Park; Jianhua Z. Huang; Jim Ji; Yu Ding
This paper presents a method that enables automated morphology analysis of partially overlapping nanoparticles in electron micrographs. In the undertaking of morphology analysis, three tasks appear necessary: separate individual particles from an agglomerate of overlapping nano-objects; infer the particles' missing contours; and ultimately, classify the particles by shape based on their complete contours. Our specific method adopts a two-stage approach: the first stage executes the task of particle separation, and the second stage simultaneously conducts the tasks of contour inference and shape classification. For the first stage, a modified ultimate erosion process is developed for decomposing a mixture of particles into markers, and then an edge-to-marker association method is proposed to identify the set of evidences that eventually delineate individual objects. We also provide theoretical justification regarding the separation capability of the first stage. In the second stage, the set of evidences becomes the input to a Gaussian mixture model on B-splines, the solution of which leads to the joint learning of the missing contour and the particle shape. Using twelve real electron micrographs of overlapping nanoparticles, we compare the proposed method with seven state-of-the-art methods. The results show the superiority of the proposed method in terms of particle recognition rate.
Annals of Statistics | 2010
Guang Cheng; Jianhua Z. Huang
Supported by NSF Grant DMS-09-06497. Supported in part by NSF Grants DMS-06-06580, DMS-09-07170, NCI Grant CA57030 and Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST).
The Annals of Applied Statistics | 2010
Seokho Lee; Jianhua Z. Huang; Jianhua Hu
We develop a new principal components analysis (PCA) type dimension reduction method for binary data. Unlike standard PCA, which is defined on the observed data, the proposed PCA is defined on the logit transform of the success probabilities of the binary observations. Sparsity is introduced to the principal component (PC) loading vectors for enhanced interpretability and more stable extraction of the principal components. Our sparse PCA is formulated as solving an optimization problem with a criterion function motivated by penalized Bernoulli likelihood. A Majorization-Minimization algorithm is developed to efficiently solve the optimization problem. The effectiveness of the proposed sparse logistic PCA method is illustrated by application to a single nucleotide polymorphism data set and a simulation study.
Journal of the American Statistical Association | 2010
Lan Zhou; Jianhua Z. Huang; Josue G. Martinez; Arnab Maity; Veerabhadran Baladandayuthapani; Raymond J. Carroll
Hierarchical functional data are widely seen in complex studies where subunits are nested within units, which in turn are nested within treatment groups. We propose a general framework of functional mixed effects model for such data: within-unit and within-subunit variations are modeled through two separate sets of principal components; the subunit level functions are allowed to be correlated. Penalized splines are used to model both the mean functions and the principal components functions, where roughness penalties are used to regularize the spline fit. An expectation–maximization (EM) algorithm is developed to fit the model, while the specific covariance structure of the model is utilized for computational efficiency to avoid storage and inversion of large matrices. Our dimension reduction with principal components provides an effective solution to the difficult tasks of modeling the covariance kernel of a random function and modeling the correlation between functions. The proposed methodology is illustrated using simulations and an empirical dataset from a colon carcinogenesis study. Supplemental materials are available online.
The Annals of Applied Statistics | 2008
Haipeng Shen; Jianhua Z. Huang
We consider forecasting the latent rate profiles of a time series of inhomogeneous Poisson processes. The work is motivated by operations management of queueing systems, in particular, telephone call centers, where accurate forecasting of call arrival rates is a crucial primitive for efficient staffing of such centers. Our forecasting approach utilizes dimension reduction through a factor analysis of Poisson variables, followed by time series modeling of factor score series. Time series forecasts of factor scores are combined with factor loadings to yield forecasts of future Poisson rate profiles. Penalized Poisson regressions on factor loadings guided by time series forecasts of factor scores are used to generate dynamic within-process rate updating. Methods are also developed to obtain distributional forecasts. Our methods are illustrated using simulation and real data. The empirical results demonstrate how forecasting and dynamic updating of call arrival rates can affect the accuracy of call center staffing.
Electronic Journal of Statistics | 2008
Jianhua Z. Huang; Haipeng Shen; Andreas Buja
Two existing approaches to functional principal components analysis (FPCA) are due to Rice and Silverman (1991) and Silverman (1996), both based on maximizing variance but introducing penalization in different ways. In this article we propose an alternative approach to FPCA using penalized rank one approximation to the data matrix. Our contributions are four-fold: (1) by considering invariance under scale transformation of the measurements, the new formulation sheds light on how regularization should be performed for FPCA and suggests an efficient power algorithm for computation; (2) it naturally incorporates spline smoothing of discretized functional data; (3) the connection with smoothing splines also facilitates construction of cross-validation or generalized cross-validation criteria for smoothing parameter selection that allows efficient computation; (4) different smoothing parameters are permitted for different FPCs. The methodology is illustrated with a real data example and a simulation. AMS 2000 subject classifications: Primary 62G08, 62H25; secondary 65F30.
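A penalized rank one approximation with a power-type algorithm, as mentioned in this abstract, might be sketched as below. The criterion used here, ||X - u v'||^2 + lam * (u'u) * (v' Omega v), is one scale-invariant formulation assumed for illustration, and the second-difference matrix stands in for a spline roughness penalty. The function names and the choice of `Omega` are hypothetical, and smoothing parameter selection by cross-validation is not shown.

```python
import numpy as np


def second_difference_penalty(m):
    """Omega = D'D, where D takes second differences of a length-m vector."""
    D = np.diff(np.eye(m), n=2, axis=0)  # shape (m - 2, m), rows like [1, -2, 1]
    return D.T @ D


def smooth_rank_one(X, lam, n_iter=200, tol=1e-10):
    """Alternate closed-form updates of u and v for the assumed criterion."""
    m = X.shape[1]
    M = np.eye(m) + lam * second_difference_penalty(m)
    # Start from the leading right singular vector of X.
    v = np.linalg.svd(X, full_matrices=False)[2][0]
    for _ in range(n_iter):
        # With v fixed, the minimizing u is a rescaled projection X v.
        u = X @ v / (v @ M @ v)
        # With u fixed, the minimizing v is a smoothed version of X'u.
        v_new = np.linalg.solve(M, X.T @ u) / (u @ u)
        done = np.linalg.norm(v_new - v) <= tol * np.linalg.norm(v)
        v = v_new
        if done:
            break
    u = X @ v / (v @ M @ v)
    return u, v
```

With `lam = 0` the updates reduce to the ordinary power iteration for the leading singular pair; larger values of `lam` trade fit for smoothness of the component `v`. Applying the function to the residual matrix with a different `lam` would allow different smoothing for later components.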