Publication


Featured research published by Kehui Chen.


Journal of the American Statistical Association | 2018

Network Cross-Validation for Determining the Number of Communities in Network Data

Kehui Chen; Jing Lei

The stochastic block model (SBM) and its variants have been a popular tool for analyzing large network data with community structures. In this article, we develop an efficient network cross-validation (NCV) approach to determine the number of communities, as well as to choose between the regular stochastic block model and the degree corrected block model (DCBM). The proposed NCV method is based on a block-wise node-pair splitting technique, combined with an integrated step of community recovery using sub-blocks of the adjacency matrix. We prove that the probability of under-selection vanishes as the number of nodes increases, under mild conditions satisfied by a wide range of popular community recovery algorithms. The solid performance of our method is also demonstrated in extensive simulations and two data examples. Supplementary materials for this article are available online.


Journal of the American Statistical Association | 2012

Modeling Repeated Functional Observations

Kehui Chen; Hans-Georg Müller

We introduce a new methodological framework for repeatedly observed and thus dependent functional data, aiming at situations where curves are recorded repeatedly for each subject in a sample. Our methodology covers the case where the recordings of the curves are scheduled on a regular and dense grid and also situations more typical for longitudinal studies, where the timing of recordings is often sparse and random. The proposed models lead to an interpretable and straightforward decomposition of the inherent variation in repeatedly observed functional data and are implemented through a straightforward two-step functional principal component analysis. We provide consistency results and asymptotic convergence rates for the estimated model components. We compare the proposed model with an alternative approach via a two-dimensional Karhunen-Loève expansion and illustrate it through the analysis of longitudinal mortality data from period lifetables that are repeatedly observed for a sample of countries over many years, and also through simulation studies. This article has online supplementary materials.


Physiological Entomology | 2013

Effects of diet and host access on fecundity and lifespan in two fruit fly species with different life-history patterns

James F. Harwood; Kehui Chen; Hans-Georg Müller; Jane-Ling Wang; Roger I. Vargas; James R. Carey

The reproductive ability of female tephritids can be limited and prevented by denying access to host plants and restricting the dietary precursors of vitellogenesis. The mechanisms underlying the delayed egg production in each case are initiated by different physiological processes that are anticipated to have dissimilar effects on lifespan and reproductive ability later in life. The egg‐laying abilities of laboratory‐reared females of the Mediterranean fruit fly (Ceratitis capitata Wiedmann) and melon fly (Bactrocera cucurbitae Coquillett) from Hawaii are delayed or suppressed by limiting access to host fruits and dietary protein. In each case, this is expected to prevent the loss of lifespan associated with reproduction until protein or hosts are introduced. Two trends are observed in each species: first, access to protein at eclosion leads to a greater probability of survival and a higher reproductive ability than if it is delayed and, second, delayed host access reduces lifetime reproductive ability without improving life expectancy. When host access and protein availability are delayed, the rate of reproductive senescence is reduced in the medfly, whereas the rate of reproductive senescence is generally increased in the melon fly. Overall, delaying reproduction lowers the fitness of females by constraining their fecundity for the remainder of the lifespan without extending the lifespan.


Journal of the American Statistical Association | 2011

Stringing High-Dimensional Data for Functional Analysis

Kun Chen; Kehui Chen; Hans-Georg Müller; Jane-Ling Wang

We propose stringing, a class of methods where one views high-dimensional observations as functional data. Stringing takes advantage of the high dimension by representing such data as discretized and noisy observations that originate from a hidden smooth stochastic process. Assuming that the observations result from scrambling the original ordering of the observations of the process, stringing reorders the components of the high-dimensional vectors, followed by transforming the high-dimensional vector observations into functional data. Established techniques from functional data analysis can be applied for further statistical analysis once an underlying stochastic process and the corresponding random trajectory for each subject have been identified. Stringing of high-dimensional data is implemented with distance-based metric multidimensional scaling, mapping high-dimensional data to locations on a real interval, such that predictors that are close in a suitable sample metric also are located close to each other on the interval. We provide some theoretical support, showing that under certain assumptions, an underlying stochastic process can be constructed asymptotically, as the dimension p of the data tends to infinity. Stringing is illustrated for the analysis of tree ring data and for the prediction of survival time from high-dimensional gene expression data and is shown to lead to new insights. In regression applications involving high-dimensional predictors, stringing compares favorably with existing methods. The theoretical results and proofs and also additional simulation results are provided in online Supplemental Material.
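The reordering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean distance between predictor columns as the sample metric and uses classical (Torgerson) scaling as a stand-in for the distance-based metric multidimensional scaling mentioned above. The function name `stringing_order` is hypothetical.

```python
import numpy as np

def stringing_order(X):
    """Order the p predictors (columns of X, shape n x p) along a 1-D interval.

    Assumptions (not from the paper): Euclidean distance between columns,
    classical MDS with a single coordinate.
    """
    cols = X.T                                   # one row per predictor
    sq = np.sum(cols ** 2, axis=1)
    # Squared pairwise distances between predictors, clipped at zero for safety.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * cols @ cols.T, 0.0)
    p = d2.shape[0]
    J = np.eye(p) - np.ones((p, p)) / p          # centering matrix
    B = -0.5 * J @ d2 @ J                        # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)               # eigenvalues in ascending order
    coords = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))  # leading 1-D coordinate
    return np.argsort(coords), coords

# Toy check: columns of a smooth (here linear-in-t) process are scrambled,
# then reordered. The ordering is identifiable only up to reversal.
rng = np.random.default_rng(0)
n, p = 50, 10
t = np.linspace(0.0, 1.0, p)
X = np.outer(rng.normal(size=n), t) + 0.01 * rng.normal(size=(n, p))
perm = rng.permutation(p)
order, coords = stringing_order(X[:, perm])
recovered = perm[order]                          # original column indices, in recovered order
```

After reordering, each row of `X[:, perm][:, order]` can be treated as a discretized, noisy curve and passed to standard functional data analysis tools.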


Journal of the American Statistical Association | 2015

Localized Functional Principal Component Analysis

Kehui Chen; Jing Lei

We propose localized functional principal component analysis (LFPCA), looking for orthogonal basis functions with localized support regions that explain most of the variability of a random process. The LFPCA is formulated as a convex optimization problem through a novel deflated Fantope localization method and is implemented through an efficient algorithm to obtain the global optimum. We prove that the proposed LFPCA converges to the original functional principal component analysis (FPCA) when the tuning parameters are chosen appropriately. Simulation shows that the proposed LFPCA with tuning parameters chosen by cross-validation can almost perfectly recover the true eigenfunctions and significantly improve the estimation accuracy when the eigenfunctions are truly supported on some subdomains. In the scenario that the original eigenfunctions are not localized, the proposed LFPCA also serves as a nice tool in finding orthogonal basis functions that balance between interpretability and the capability of explaining variability of the data. The analysis of country mortality data reveals interesting features that cannot be found by standard FPCA methods. Supplementary materials for this article are available online.


NeuroImage | 2014

FMEM: Functional mixed effects modeling for the analysis of longitudinal white matter Tract data

Ying Yuan; John H. Gilmore; Xiujuan Geng; Martin Styner; Kehui Chen; Jane-Ling Wang; Hongtu Zhu

Many longitudinal imaging studies have collected repeated diffusion tensor magnetic resonance imaging data to understand white matter maturation and structural connectivity patterns in normal controls and diseased subjects. There is an urgent demand for the development of statistical methods for the analysis of diffusion properties along fiber tracts and clinical data obtained from longitudinal studies. Jointly analyzing repeated fiber-tract diffusion properties and covariates (e.g., age or gender) raises several major challenges including (i) infinite-dimensional functional response data, (ii) complex spatial-temporal correlation structure, and (iii) complex spatial smoothness. To address these challenges, this article develops a functional mixed effects modeling (FMEM) framework to delineate the dynamic changes of diffusion properties along major fiber tracts and their association with a set of covariates of interest and the structure of the variability of these white matter tract properties in various longitudinal studies. Our FMEM consists of a functional mixed effects model for addressing all three challenges, an efficient method for spatially smoothing varying coefficient functions, an estimation method for estimating the spatial-temporal correlation structure, a test procedure with local and global test statistics for testing hypotheses of interest associated with functional response, and a simultaneous confidence band for quantifying the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of FMEM and to demonstrate that FMEM significantly outperforms the standard pointwise mixed effects modeling approach. We apply FMEM to study the spatial-temporal dynamics of white-matter fiber tracts in a clinical study of neurodevelopment.


Journal of Nutrition | 2012

Gender and Single Nucleotide Polymorphisms in MTHFR, BHMT, SPTLC1, CRBP2, CETP, and SCARB1 Are Significant Predictors of Plasma Homocysteine Normalized by RBC Folate in Healthy Adults

Andrew J. Clifford; Kehui Chen; Laura McWade; Gonzalo Rincon; Seung-Hyun Kim; Dirk M. Holstege; Janel E. Owens; Bitao Liu; Hans-Georg Müller; Juan F. Medrano; J.G. Fadel; Alanna J. Moshfegh; David J. Baer; Janet A. Novotny

Using linear regression models, we studied the main and 2-way interaction effects of the predictor variables gender, age, BMI, and 64 folate/vitamin B-12/homocysteine (Hcy)/lipid/cholesterol-related single nucleotide polymorphisms (SNP) on log-transformed plasma Hcy normalized by RBC folate measurements (nHcy) in 373 healthy Caucasian adults (50% women). Variable selection was conducted by stepwise Akaike information criterion or least angle regression and both methods led to the same final model. Significant predictors (where P values were adjusted for false discovery rate) included type of blood sample [whole blood (WB) vs. plasma-depleted WB; P < 0.001] used for folate analysis, gender (P < 0.001), and SNP in genes SPTLC1 (rs11790991; P = 0.040), CRBP2 (rs2118981; P < 0.001), BHMT (rs3733890; P = 0.019), and CETP (rs5882; P = 0.017). Significant 2-way interaction effects included gender × MTHFR (rs1801131; P = 0.012), gender × CRBP2 (rs2118981; P = 0.011), and gender × SCARB1 (rs83882; P = 0.003). The relation of nHcy concentrations with the significant SNP (SPTLC1, BHMT, CETP, CRBP2, MTHFR, and SCARB1) is of interest, especially because we surveyed the main and interaction effects in healthy adults, but it is an important area for future study. As discussed, understanding Hcy and genetic regulation is important, because Hcy may be related to inflammation, obesity, cardiovascular disease, and diabetes mellitus. We conclude that gender and SNP significantly affect nHcy.


Information Processing in Medical Imaging | 2013

A longitudinal functional analysis framework for analysis of white matter tract statistics

Ying Yuan; John H. Gilmore; Xiujuan Geng; Martin Styner; Kehui Chen; Jane-Ling Wang; Hongtu Zhu

Many longitudinal imaging studies have used or are using diffusion tensor imaging (DTI) to better understand white matter maturation in normal controls and diseased subjects. There is an urgent demand for the development of statistical methods for analyzing diffusion properties along major fiber tracts obtained from longitudinal DTI studies. Jointly analyzing fiber-tract diffusion properties and covariates from longitudinal studies raises several major challenges including (i) infinite-dimensional functional response data, (ii) complex spatial-temporal correlation structure, and (iii) complex spatial smoothness. To address these challenges, this article develops a longitudinal functional analysis framework (LFAF) to delineate the dynamic changes of diffusion properties along major fiber tracts and their association with a set of covariates of interest (e.g., age and group status) and the structure of the variability of these white matter tract properties in various longitudinal studies. Our LFAF consists of a functional mixed effects model for addressing all three challenges, an efficient method for spatially smoothing varying coefficient functions, an estimation method for estimating the spatial-temporal correlation structure, a test procedure with a global test statistic for testing hypotheses of interest associated with functional response, and a simultaneous confidence band for quantifying the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of LFAF and to demonstrate that LFAF significantly outperforms a voxel-wise mixed model method. We apply LFAF to study the spatial-temporal dynamics of white-matter fiber tracts in a clinical study of neurodevelopment.


Technometrics | 2014

Modeling Conditional Distributions for Functional Responses, With Application to Traffic Monitoring via GPS-Enabled Mobile Phones

Kehui Chen; Hans-Georg Müller

Motivated by problems involving a traffic monitoring system in which trajectory data are obtained from Global Positioning System-enabled mobile phones, we propose a novel approach to functional regression modeling, where instead of the usual mean regression the entire distribution of functional responses is modeled conditionally on predictors. An approach that sensibly balances flexibility and stability is obtained by assuming that the response functions are drawn from a Gaussian process, the mean and covariance function of which depend on predictors. The dependence of the mean function and covariance function of the response on the predictors is modeled additively. We demonstrate the proposed methods by constructing predicted curves and corresponding prediction regions for traffic velocity trajectories for a future time period, using current traffic velocity fields as predictor functions. The proposed functional regression and conditional distribution approach is of general interest for functional response settings, where in addition to predicting the conditional mean response function one is also interested in predicting the covariance surface of the random response functions, conditional on predictor curves.


Biometrika | 2018

A test of weak separability for multi-way functional data, with application to brain connectivity studies

Brian Lynch; Kehui Chen

This paper concerns the modelling of multi-way functional data where double or multiple indices are involved. We introduce a concept of weak separability. The weakly separable structure supports the use of factorization methods that decompose the signal into its spatial and temporal components. The analysis reveals interesting connections to the usual strongly separable covariance structure, and provides insights into tensor methods for multi-way functional data. We propose a formal test for the weak separability hypothesis, where the asymptotic null distribution of the test statistic is a chi-squared-type mixture. The method is applied to study brain functional connectivity derived from source localized magnetoencephalography signals during motor tasks.

Collaboration


Dive into Kehui Chen's collaborations.

Top Co-Authors

Jane-Ling Wang

University of California

Hongtu Zhu

University of Texas MD Anderson Cancer Center

James R. Carey

University of California

Jing Lei

Carnegie Mellon University

John H. Gilmore

University of North Carolina at Chapel Hill

Xiujuan Geng

University of North Carolina at Chapel Hill

Ying Yuan

University of North Carolina at Chapel Hill

Alanna J. Moshfegh

United States Department of Agriculture
