Publication


Featured research published by Jianqing Fan.


Journal of the American Statistical Association | 2001

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

Jianqing Fan; Runze Li

Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously. Hence they enable us to construct confidence intervals for estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable. They are readily applied to a variety of parametric models such as generalized linear models and robust regression models. They can also be applied easily to nonparametric modeling by using wavelets and splines. Rates of convergence of the proposed penalized likelihood estimators are established. Furthermore, with proper choice of regularization parameters, we show that the proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known. Our simulation shows that the newly proposed methods compare favorably with other variable selection techniques. Furthermore, the standard error formulas are tested to be accurate enough for practical applications.
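As an illustration of a penalty with the properties listed above (symmetric, singular at the origin, bounded by a constant), the sketch below implements the SCAD penalty and its thresholding rule. The abstract on this page does not name SCAD; the formulas and the default a = 3.7 are recalled from the Fan and Li paper, and the function names are hypothetical. The thresholding rule is the closed-form solution only in the simple case of a single coefficient with an orthonormal design.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|); assumes lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        # Linear near the origin: the kink at 0 is what produces sparsity.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    # Constant for large |theta|: large coefficients are not over-penalized.
    return (a + 1) * lam * lam / 2


def scad_threshold(z, lam, a=3.7):
    """Minimizer of 0.5 * (z - theta)**2 + scad_penalty(theta, lam, a)."""
    t = abs(z)
    if t <= 2 * lam:
        # Behaves like soft thresholding for small inputs.
        return math.copysign(max(t - lam, 0.0), z)
    if t <= a * lam:
        return ((a - 1) * z - math.copysign(a * lam, z)) / (a - 2)
    # Large inputs are left untouched, which reduces bias.
    return z
```

The three branches of each function meet continuously, which corresponds to the abstract's requirement that the penalty yield continuous solutions.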


Archive | 1994

Local Polynomial Modelling and Its Applications

Jianqing Fan; Irène Gijbels

Introduction. Overview of Existing Methods. Framework for Local Polynomial Regression. Automatic Determination of Model Complexity. Applications of Local Polynomial Modeling. Applications in Nonlinear Time Series. Local Polynomial Regression for Multivariate Data. References. Index.


Journal of the American Statistical Association | 1992

Design-adaptive Nonparametric Regression

Jianqing Fan

In this article we study the method of nonparametric regression based on a weighted local linear regression. This method has advantages over other popular kernel methods. Moreover, such a regression procedure has the ability of design adaptation: It adapts to both random and fixed designs, to both highly clustered and nearly uniform designs, and even to both interior and boundary points. It is shown that the local linear regression smoothers have high asymptotic efficiency (i.e., can be 100% with a suitable choice of kernel and bandwidth) among all possible linear smoothers, including those produced by kernel, orthogonal series, and spline methods. The finite sample property of the local linear regression smoother is illustrated via simulation studies. Nonparametric regression is frequently used to explore the association between covariates and responses. There are many versions of kernel regression smoothers. Some estimators are not good for random designs, such as in observational studies, and ...
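A weighted local linear regression of the kind described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the Epanechnikov kernel, the fixed bandwidth h, and the function names are all assumptions made for the example. At each target point x0 it fits a straight line by kernel-weighted least squares and returns the fitted intercept.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear(x0, xs, ys, h):
    """Local linear estimate of the regression function at x0.

    Uses the closed form of the kernel-weighted least squares fit of
    y on (1, x - x0); the returned value is the fitted intercept.
    """
    k = [epanechnikov((x - x0) / h) for x in xs]
    s1 = sum(ki * (x - x0) for ki, x in zip(k, xs))
    s2 = sum(ki * (x - x0) ** 2 for ki, x in zip(k, xs))
    # Effective weights; they sum to S0 * S2 - S1 ** 2.
    w = [ki * (s2 - (x - x0) * s1) for ki, x in zip(k, xs)]
    total = sum(w)
    if total <= 0.0:
        raise ValueError("need at least two distinct design points within h of x0")
    return sum(wi * yi for wi, yi in zip(w, ys)) / total
```

Because a line is fitted locally, a response that is exactly linear in x is reproduced at interior and boundary points alike, which gives a small concrete check on the boundary behaviour the abstract mentions.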


Journal of the American Statistical Association | 1997

Generalized Partially Linear Single-Index Models

Raymond J. Carroll; Jianqing Fan; Irène Gijbels; M. P. Wand

The typical generalized linear model for a regression of a response Y on predictors (X, Z) has conditional mean function based on a linear combination of (X, Z). We generalize these models to have a nonparametric component, replacing the linear combination $\alpha_0^T X + \beta_0^T Z$ by $\eta_0(\alpha_0^T X) + \beta_0^T Z$, where $\eta_0(\cdot)$ is an unknown function. We call these generalized partially linear single-index models (GPLSIM). The models include the “single-index” models, which have $\beta_0 = 0$. Using local linear methods, we propose estimates of the unknown parameters $(\alpha_0, \beta_0)$ and the unknown function $\eta_0(\cdot)$ and obtain their asymptotic distributions. Examples illustrate the models and the proposed estimation methodology.


Annals of Statistics | 2004

Nonconcave penalized likelihood with a diverging number of parameters

Jianqing Fan; Heng Peng

A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed by Fan and Li to simultaneously estimate parameters and select important variables. They demonstrated that this class of procedures has an oracle property when the number of parameters is finite. However, in most model selection problems the number of parameters should be large and grow with the sample size. In this paper some asymptotic properties of the nonconcave penalized likelihood are established for situations in which the number of parameters tends to ∞ as the sample size increases. Under regularity conditions we have established an oracle property and the asymptotic normality of the penalized likelihood estimators. Furthermore, the consistency of the sandwich formula of the covariance matrix is demonstrated. Nonconcave penalized likelihood ratio statistics are discussed, and their asymptotic distributions under the null hypothesis are obtained by imposing some mild conditions on the penalty functions. The asymptotic results are augmented by a simulation study, and the newly developed methodology is illustrated by an analysis of a court case on the sexual discrimination of salary.


Journal of the American Statistical Association | 2001

Regularization of Wavelet Approximations

Anestis Antoniadis; Jianqing Fan

In this paper, we introduce nonlinear regularized wavelet estimators for estimating nonparametric regression functions when sampling points are not uniformly spaced. The approach can apply readily to many other statistical contexts. Various new penalty functions are proposed. The hard-thresholding and soft-thresholding estimators of Donoho and Johnstone are specific members of nonlinear regularized wavelet estimators. They correspond to the lower and upper envelopes of a class of the penalized least squares estimators. Necessary conditions for penalty functions are given for regularized estimators to possess thresholding properties. Oracle inequalities and universal thresholding parameters are obtained for a large class of penalty functions. The sampling properties of nonlinear regularized wavelet estimators are established and are shown to be adaptively minimax. To efficiently solve penalized least squares problems, nonlinear regularized Sobolev interpolators (NRSI) are proposed as initial estimators, which are shown to have good sampling properties. The NRSI is further ameliorated by regularized one-step estimators, which are the one-step estimators of the penalized least squares problems using the NRSI as initial estimators. The graduated nonconvexity algorithm is also introduced to handle penalized least squares problems. The newly introduced approaches are illustrated by a few numerical examples.
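For reference, the hard- and soft-thresholding rules of Donoho and Johnstone mentioned above can be written as follows. This is a minimal sketch with hypothetical function names; the universal threshold formula sigma * sqrt(2 log n) is the standard Donoho and Johnstone choice and is stated here from general knowledge, not from this abstract.

```python
import math


def hard_threshold(z, lam):
    """Keep z unchanged if |z| exceeds lam, otherwise set it to zero."""
    return z if abs(z) > lam else 0.0


def soft_threshold(z, lam):
    """Shrink z toward zero by lam, setting it to zero if |z| <= lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 * log(n)) for n coefficients."""
    return sigma * math.sqrt(2.0 * math.log(n))
```

Both rules set small coefficients to zero; they differ in that soft thresholding also shrinks the surviving coefficients by lam, while hard thresholding leaves them unchanged.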


Journal of the American Statistical Association | 2004

New Estimation and Model Selection Procedures for Semiparametric Modeling in Longitudinal Data Analysis

Jianqing Fan; Runze Li

Semiparametric regression models are very useful for longitudinal data analysis. The complexity of semiparametric models and the structure of longitudinal data pose new challenges to parametric inferences and model selection that frequently arise from longitudinal data analysis. In this article, two new approaches are proposed for estimating the regression coefficients in a semiparametric model. The asymptotic normality of the resulting estimators is established. An innovative class of variable selection procedures is proposed to select significant variables in the semiparametric models. The proposed procedures are distinguished from others in that they simultaneously select significant variables and estimate unknown parameters. Rates of convergence of the resulting estimators are established. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures are shown to perform as well as an oracle estimator. A robust standard error formula is derived using a sandwich formula and is empirically tested. Local polynomial regression techniques are used to estimate the baseline function in the semiparametric model.


National Science Review | 2014

Challenges of Big Data Analysis

Jianqing Fan; Fang Han; Han Liu

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
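The spurious correlation phenomenon named above can be illustrated with a small simulation. This is a hedged sketch, not code from the article: the sample size, the dimensions, the seed, and the function names are assumptions chosen for the example. It draws a response and many predictors that are all mutually independent, then records the largest absolute sample correlation between the response and the first p predictors.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n, dims, seed=0):
    """Largest |correlation| between y and the first p independent predictors.

    Returns a dict mapping each p in dims to that maximum. The same
    simulated predictors are reused across dimensions, so the values
    can only stay equal or grow as p increases.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    results, running_max, generated = {}, 0.0, 0
    for p in sorted(dims):
        while generated < p:
            x = [rng.gauss(0.0, 1.0) for _ in range(n)]
            running_max = max(running_max, abs(pearson(x, y)))
            generated += 1
        results[p] = running_max
    return results


if __name__ == "__main__":
    for p, r in max_spurious_correlation(50, [10, 100, 1000]).items():
        print(f"p = {p:5d}  max |corr| = {r:.3f}")
```

Every predictor here is independent of the response, so any sizeable correlation that appears at large p is an artifact of dimensionality and not a real association.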


Journal of the Royal Statistical Society, Series B (Statistical Methodology) | 2000

Two-step estimation of functional linear models with applications to longitudinal data

Jianqing Fan; J.-T. Zhang

Jianqing Fan and Jin-Ting Zhang, Department of Statistics, UNC-Chapel Hill, NC 27599-3260. July 16, 1999.

Functional linear models are useful in longitudinal data analysis. They include many classical and recently proposed statistical models for longitudinal data and other functional data. Recently, smoothing spline and kernel methods have been proposed for estimating their coefficient functions nonparametrically, but these methods are either intensive in computation or inefficient in performance. To overcome these drawbacks, in this paper, a simple and powerful two-step alternative is proposed. In particular, the implementation of the proposed approach via local polynomial smoothing is discussed. Methods for estimating standard deviations of estimated coefficient functions are also proposed. Some asymptotic results for the local polynomial estimators are established. Two longitudinal data sets, one of which involves time-dependent covariates, are used to demonstrate the proposed approach. Simulation studies show that our two-step approach improves on the kernel method proposed in Hoover et al. (1998) in several respects, such as accuracy, computation time, and visual appeal of the estimators.

Key words and phrases: Functional linear models, functional ANOVA, local polynomial smoothing, longitudinal data analysis. Short title: Functional linear models.


Journal of the American Statistical Association | 1995

Local Polynomial Kernel Regression for Generalized Linear Models and Quasi-Likelihood Functions

Jianqing Fan; Nancy E. Heckman; M. P. Wand

We investigate the extension of the nonparametric regression technique of local polynomial fitting with a kernel weight to generalized linear models and quasi-likelihood contexts. In the ordinary regression case, local polynomial fitting has been seen to have several appealing features in terms of intuitive and mathematical simplicity. One noteworthy feature is the better performance near the boundaries compared to the traditional kernel regression estimators. These properties are shown to carry over to generalized linear model and quasi-likelihood settings. We also derive the asymptotic distributions of the proposed class of estimators that allow for straightforward interpretation and extensions of state-of-the-art bandwidth selection methods.

Collaboration


Dive into Jianqing Fan's collaborations.

Top Co-Authors

Qiwei Yao

London School of Economics and Political Science

Yi Ren

Florida State University

Irène Gijbels

Université catholique de Louvain

Runze Li

Pennsylvania State University

Han Liu

Princeton University

Jiancheng Jiang

University of North Carolina at Charlotte

Yingying Fan

University of Southern California

Jinchi Lv

University of Southern California
