Xingdong Feng
Shanghai University of Finance and Economics
Publication
Featured research published by Xingdong Feng.
Medicine and Science in Sports and Exercise | 2010
Steven P. Broglio; Brock Schnebel; Jacob J. Sosnoff; Sunghoon Shin; Xingdong Feng; Xiaoqing He; Jerrad Zimmerman
INTRODUCTION Sport concussion represents the majority of brain injuries occurring in the United States with 1.6–3.8 million cases annually. Understanding the biomechanical properties of this injury will support the development of better diagnostics and preventative techniques. METHODS We monitored all football related head impacts in 78 high school athletes (mean age = 16.7 yr) from 2005 to 2008 to better understand the biomechanical characteristics of concussive impacts. RESULTS Using the Head Impact Telemetry System, a total of 54,247 impacts were recorded, and 13 concussive episodes were captured for analysis. A classification and regression tree analysis of impacts indicated that rotational acceleration (>5582.3 rad·s⁻²), linear acceleration (>96.1 g), and impact location (front, top, and back) yielded the highest predictive value of concussion. CONCLUSIONS These threshold values are nearly identical with those reported at the collegiate and professional level. If the Head Impact Telemetry System were implemented for medical use, sideline personnel can expect to diagnose one of every five athletes with a concussion when the impact exceeds these tolerance levels. Why all athletes did not sustain a concussion when the impacts generated variables in excess of our threshold criteria is not entirely clear, although individual differences between participants may play a role. A similar threshold to concussion in adolescent athletes compared with their collegiate and professional counterparts suggests an equal concussion risk at all levels of play.
Journal of the American Statistical Association | 2012
Huixia Judy Wang; Xingdong Feng
We develop a new multiple imputation approach for M-regression models with censored covariates. Instead of specifying parametric likelihoods, our method imputes the censored covariates by their conditional quantiles given the observed data, where the conditional quantiles are estimated through fitting a censored quantile regression process. The resulting estimator is shown to be consistent and asymptotically normal, and it improves the estimation efficiency by using information from cases with censored covariates. Compared with existing methods, the proposed method is more flexible as it does not require stringent parametric assumptions on the distributions of either the regression errors or the covariates. The finite sample performance of the proposed method is assessed through a simulation study and the analysis of a C-reactive protein dataset in the 2007–2008 National Health and Nutrition Examination Survey. This article has supplementary material online.
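As background for the quantile estimation mentioned in this abstract, the following is a minimal sketch of the check (pinball) loss that underlies quantile regression in general. It is not the authors' censored quantile regression procedure: there is no censoring and no covariate here, and the function names and simulated data are assumptions made for illustration. It only shows that minimizing the check loss over a constant recovers a sample quantile.

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def constant_quantile_fit(y, tau):
    """Minimize the summed check loss over constants.

    A minimizer can always be found among the data points, so it is
    enough to evaluate the loss at each observation (O(n^2) work).
    """
    # Row i holds the residuals y_j - y_i for the candidate value y_i.
    residuals = y[None, :] - y[:, None]
    losses = check_loss(residuals, tau).sum(axis=1)
    return y[np.argmin(losses)]

rng = np.random.default_rng(0)
y = rng.normal(size=1001)          # hypothetical data, for illustration only
tau = 0.9
q_fit = constant_quantile_fit(y, tau)
print("check-loss minimizer:", q_fit)
print("order statistic ceil(n * tau):", np.sort(y)[900])
```

With covariates, the constant is replaced by a linear predictor and the same loss is minimized over the regression coefficients.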
Genomics, Proteomics & Bioinformatics | 2007
Xingdong Feng; Shuguang Huang; Jianyong Shou; Birong Liao; Jonathan M. Yingling; Xiang Ye; Xi Lin; Lawrence M. Gelbert; Eric Wen Su; Jude E. Onyia; Shuyu Li
To determine cancer pathway activities in nine types of primary tumors and NCI60 cell lines, we applied an in silico approach by examining gene signatures reflective of consequent pathway activation using gene expression data. Supervised learning approaches predicted that the Ras pathway is active in ~70% of lung adenocarcinomas but inactive in most squamous cell carcinomas, pulmonary carcinoids, and small cell lung carcinomas. In contrast, the TGF-β, TNF-α, Src, Myc, E2F3, and β-catenin pathways are inactive in lung adenocarcinomas. We predicted an active Ras, Myc, Src, and/or E2F3 pathway in significant percentages of breast cancer, colorectal carcinoma, and gliomas. Our results also suggest that Ras may be the most prevailing oncogenic pathway. Additionally, many NCI60 cell lines exhibited a gene signature indicative of an active Ras, Myc, and/or Src, but not E2F3, β-catenin, TNF-α, or TGF-β pathway. To our knowledge, this is the first comprehensive survey of cancer pathway activities in nine major tumor types and the most widely used NCI60 cell lines. The “gene expression pathway signatures” we have defined could facilitate the understanding of molecular mechanisms in cancer development and provide guidance to the selection of appropriate cell lines for cancer research and pharmaceutical compound screening.
Molecular Diagnosis & Therapy | 2007
Yuni Xia; Andrew Campen; Dan Rigsby; Ying Guo; Xingdong Feng; Eric Wen Su; Mathew J. Palakal; Shuyu Li
Gene expression patterns can reflect gene regulations in human tissues under normal or pathologic conditions. Gene expression profiling data from studies of primary human disease samples are particularly valuable since these studies often span many years in order to collect patient clinical information and achieve a large sample size. Disease-to-Gene Expression Mapper (DGEM) provides a beneficial community resource to access and analyze these data; it currently includes Affymetrix oligonucleotide array datasets for more than 40 human diseases and 1400 samples. The data are normalized to the same scale and stored in a relational database. A statistical-analysis pipeline was implemented to identify genes abnormally expressed in disease tissues or genes whose expressions are associated with clinical parameters such as cancer patient survival. Data-mining results can be queried through a web-based interface at http://dgem.dhcp.iupui.edu/. The query tool enables dynamic generation of graphs and tables that are further linked to major gene and pathway resources that connect the data to relevant biology, including Entrez Gene and Kyoto Encyclopedia of Genes and Genomes (KEGG). In summary, DGEM provides scientists and physicians a valuable tool to study disease mechanisms, to discover potential disease biomarkers for diagnosis and prognosis, and to identify novel gene targets for drug discovery. The source code is freely available for non-profit use, on request to the authors.
The Annals of Applied Statistics | 2009
Xingdong Feng; Xuming He
Probe-level microarray data are usually stored in matrices, where the row and column correspond to array and probe, respectively. Scientists routinely summarize each array by a single index as the expression level of each probe-set (gene). We examine the adequacy of a uni-dimensional summary for characterizing the data matrix of each probe-set. To do so, we propose a low-rank matrix model for the probe-level intensities, and develop a useful framework for testing the adequacy of uni-dimensionality against targeted alternatives. This is an interesting statistical problem where inference has to be made based on one data matrix whose entries are not i.i.d. We analyze the asymptotic properties of the proposed test statistics, and use Monte Carlo simulations to assess their small sample performance. Applications of the proposed tests to GeneChip data show that evidence against a uni-dimensional model is often indicative of practically relevant features of a probe-set.
Economics Letters | 2014
Yanping Yi; Xingdong Feng; Zhuo Huang
We propose a method to estimate extreme conditional quantiles by combining the quantile GARCH model of Xiao and Koenker (2009) with the extreme value theory (EVT) approach. We first estimate the latent volatility process using the information in intermediate quantiles. We then apply EVT to the tail observations to obtain a sound estimate of the likelihood of experiencing an extreme event. Together, quantile autoregression and EVT improve the efficiency of extreme quantile estimation by borrowing information from neighboring quantiles. Monte Carlo simulation indicates that the proposed method is likely to provide more accurate estimates of the VaR of a financial portfolio when non-Gaussian tails are present.
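The EVT step described in this abstract can be illustrated with a generic tail extrapolation. The sketch below uses the Hill estimator and a Weissman-type quantile extrapolation on simulated i.i.d. Pareto data. These are standard EVT tools chosen here as an assumption. The abstract does not say which EVT estimator the authors use, and the quantile GARCH filtering step is not shown. The sample size, `k`, and `p` are hypothetical.

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill estimator of the extreme value index from the k largest values."""
    xs = np.sort(x)[::-1]                      # descending order
    return np.mean(np.log(xs[:k])) - np.log(xs[k])

def weissman_quantile(x, k, p):
    """Extrapolate the (1 - p) quantile from the (k+1)-th largest value."""
    n = len(x)
    xs = np.sort(x)[::-1]
    gamma = hill_tail_index(x, k)
    # An intermediate order statistic scaled out to the extreme level p.
    return xs[k] * (k / (n * p)) ** gamma

rng = np.random.default_rng(1)
alpha = 3.0                                    # true Pareto tail exponent
n, k, p = 20000, 500, 0.001
# Inverse-transform sampling: P(X > x) = x**(-alpha) for x >= 1.
x = (1.0 - rng.random(n)) ** (-1.0 / alpha)

gamma_hat = hill_tail_index(x, k)
q_hat = weissman_quantile(x, k, p)
true_q = p ** (-1.0 / alpha)
print("tail index estimate:", gamma_hat, "true:", 1.0 / alpha)
print("extreme quantile estimate:", q_hat, "true:", true_q)
```

The extrapolation borrows strength from the intermediate order statistic `xs[k]`, which is estimated far more reliably than the empirical quantile at level `1 - p`.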
Journal of the American Statistical Association | 2016
Xingdong Feng; Liping Zhu
In this article, we establish a novel connection between the null hypothesis H0 on the coefficients and a rank-reducible form of the varying coefficient model in quantile regression. We use B-splines to approximate the varying coefficients in the rank-reducible model, and make use of the fact that the null hypothesis H0 implies a unidimensional structure of a transformed coefficient matrix for the B-spline basis functions. By evaluating the unidimensional structure, we alleviate the difficulty of testing such hypotheses commonly considered in varying coefficient quantile models. We demonstrate through numerical studies that the proposed method can be much more powerful than the rank score test which is widely used in the quantile regression literature. Supplementary materials for this article are available online.
Biometrics | 2014
Xingdong Feng; Nell Sedransk; Jessie Q. Xia
Linear regressions are commonly used to calibrate the signal measurements in proteomic analysis by mass spectrometry. However, with or without a monotone (e.g., log) transformation, data from such functional proteomic experiments are not necessarily linear or even monotone functions of protein (or peptide) concentration except over a very restricted range. A computationally efficient spline procedure improves upon linear regression. However, mass spectrometry data are not necessarily homoscedastic; more often the variation of measured concentrations increases disproportionately near the boundaries of the instrument's measurement capability (dynamic range), that is, the upper and lower limits of quantitation. These calibration difficulties exist with other applications of mass spectrometry as well as with other broad-scale calibrations. Therefore the method proposed here uses a functional data approach to define the calibration curve and also the limits of quantitation under two assumptions: (i) that the variance is a bounded, convex function of concentration; and (ii) that the calibration curve itself is monotone at least between the limits of quantitation, but not necessarily outside these limits. Within this paradigm, the limit of detection, where the signal is definitely present but not measurable with any accuracy, is also defined. An iterative approach draws on existing smoothing methods to account simultaneously for both restrictions and is shown to achieve the global optimal convergence rate under weak conditions. This approach can also be implemented when convexity is replaced by other (bounded) restrictions. Examples from Addona et al. (2009, Nature Biotechnology 27, 633-641) both motivate and illustrate the effectiveness of this functional data methodology when compared with the simpler linear regressions and spline techniques.
Annals of Statistics | 2014
Xingdong Feng; Xuming He
The singular value decomposition is widely used to approximate data matrices with lower rank matrices. Feng and He [Ann. Appl. Stat. 3 (2009) 1634-1654] developed tests on dimensionality of the mean structure of a data matrix based on the singular value decomposition. However, the first singular values and vectors can be driven by a small number of outlying measurements. In this paper, we consider a robust alternative that moderates the effect of outliers in low-rank approximations. Under the assumption of random row effects, we provide the asymptotic representations of the robust low-rank approximation. These representations may be used in testing the adequacy of a low-rank approximation. We use oligonucleotide gene microarray data to demonstrate how robust singular value decomposition compares with its traditional counterparts. Examples show that the robust methods often lead to a more meaningful assessment of the dimensionality of gene intensity data matrices.
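The sensitivity of the ordinary singular value decomposition to outliers, which motivates this paper, can be seen in a small simulation. The sketch below is not the robust method from the paper. It only applies the standard SVD to a hypothetical rank-one "array by probe" matrix, with and without one contaminated entry. The matrix sizes, noise level, and outlier magnitude are assumptions.

```python
import numpy as np

def rank_k_approx(X, k):
    """Best rank-k approximation of X in Frobenius norm, via the SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def leading_share(X):
    """Fraction of the total sum of squares carried by the first singular value."""
    s = np.linalg.svd(X, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

rng = np.random.default_rng(0)
n_arrays, n_probes = 20, 11                    # hypothetical dimensions
row = rng.normal(8.0, 1.0, n_arrays)           # array (row) effects
col = rng.uniform(0.5, 1.5, n_probes)          # probe (column) affinities
X = np.outer(row, col) + rng.normal(0.0, 0.05, (n_arrays, n_probes))

clean_share = leading_share(X)
rel_err = np.linalg.norm(X - rank_k_approx(X, 1)) / np.linalg.norm(X)

X_out = X.copy()
X_out[3, 5] += 100.0                           # one grossly outlying measurement
out_share = leading_share(X_out)

print("rank-1 share, clean data:", clean_share)
print("rank-1 share, one outlier:", out_share)
```

A single contaminated entry is enough to make a nearly rank-one matrix look two-dimensional under the ordinary SVD, which is the kind of distortion a robust low-rank approximation aims to moderate.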
International Conference on Data Engineering | 2008
Andrew Campen; Yuni Xia; Dan Rigsby; Ying Guo; Xingdong Feng; Eric Wen Su; Mathew J. Palakal; Shuyu Li
Studies of gene expression in primary human disease tissue often span several years in order to achieve reasonably large sample sizes and to collect patient clinical information, making this data particularly valuable. Due to the lack of a central repository, this data has only been available through disparate and non-publicly accessible sources following publication. We developed disease-to-gene expression mapper (D-GEM) as a publicly accessible database and data mining toolbox for microarray data of human primary disease tissue. A statistical pipeline has also been implemented to identify genes over-expressed in disease tissue samples in comparison with normal control samples, or genes whose expression values are associated with clinical parameters such as patient survival rate. One potential application of this data is the identification of pathway specific cancer prognosis markers. By applying a novel, gene signatures for cancer prognosis in the context of known biological pathways in cancer development were identified and confirmed.