Publication


Featured research published by Huann Sheng Chen.


BMC Bioinformatics | 2004

Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes

Hongying Jiang; Youping Deng; Huann Sheng Chen; Lin Tao; Qiuying Sha; Jun Chen; Chung-Jui Tsai; Shuanglin Zhang

Background: Due to the high cost and low reproducibility of many microarray experiments, it is not surprising to find a limited number of patient samples in each study, and very few marker genes identified in common among different studies involving patients with the same disease. Therefore, it is both of great interest and a challenge to merge data sets from multiple studies to increase the sample size, which may in turn increase the power of statistical inferences. In this study, we combined two lung cancer studies using microarray GeneChip®, and employed two gene shaving methods and a two-step survival test to identify genes with expression patterns that can distinguish diseased from normal samples and that indicate patient survival, respectively.

Results: In addition to common data transformation and normalization procedures, we applied a distribution transformation method to integrate the two data sets. Gene shaving (GS) methods based on Random Forests (RF) and Fisher's Linear Discrimination (FLD) were then applied separately to the joint data set for cancer gene selection. The two methods discovered 13 and 10 marker genes (5 in common), respectively, with expression patterns differentiating diseased from normal samples. Among these marker genes, 8 and 7 were found to be cancer-related in other published reports. Furthermore, based on these marker genes, the classifiers we built from one data set predicted the other data set with more than 98% accuracy. Using the univariate Cox proportional hazard regression model, the expression patterns of 36 genes were found to be significantly correlated with patient survival (p < 0.05). Twenty-six of these 36 genes were reported as survival-related genes in the literature, including 7 known tumor-suppressor genes and 9 oncogenes. Additional principal component regression analysis further reduced the gene list from 36 to 16.

Conclusion: This study provided a valuable method of integrating microarray data sets with different origins, and new methods of selecting a minimum number of marker genes to aid in cancer diagnosis. After careful data integration, the classification method developed from one data set can be applied to the other with high prediction accuracy.


American Journal of Human Genetics | 2003

Transmission/Disequilibrium Test Based on Haplotype Sharing for Tightly Linked Markers

Shuanglin Zhang; Qiuying Sha; Huann Sheng Chen; Jianping Dong; Renfang Jiang

Studies using haplotypes of multiple tightly linked markers are more informative than those using a single marker. However, studies based on multimarker haplotypes have some difficulties. First, if we consider each haplotype as an allele and use the conventional single-marker transmission/disequilibrium test (TDT), then the rapid increase in the degrees of freedom with an increasing number of markers means that the statistical power of the conventional tests will be low. Second, the parental haplotypes cannot always be unambiguously reconstructed. In the present article, we propose a haplotype-sharing TDT (HS-TDT) for linkage or association between a disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. This method is applicable to both quantitative traits and qualitative traits. It is applicable to any size of nuclear family, with or without ambiguous phase information, and it is applicable to any number of alleles at each of the markers. The degrees of freedom (in a broad sense) of the test increase linearly as the number of markers considered increases but do not increase as the number of alleles at the markers increases. Our simulation results show that the HS-TDT has the correct type I error rate in structured populations and that, in most cases, the power of HS-TDT is higher than the power of the existing single-marker TDTs and haplotype-based TDTs.
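
For orientation, the conventional single-marker TDT that the abstract uses as a baseline can be written as a McNemar-type statistic over transmissions from heterozygous parents. The sketch below illustrates only that baseline for a biallelic marker, not the HS-TDT itself; the function name and the example counts are hypothetical.

```python
import math

def tdt_statistic(transmitted, untransmitted):
    """Conventional biallelic TDT (illustrative sketch).

    transmitted   -- count of heterozygous parents passing the allele
                     of interest to an affected child (b)
    untransmitted -- count passing the other allele (c)
    Returns the statistic (b - c)^2 / (b + c) and its p-value under a
    chi-square distribution with 1 degree of freedom.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

When each haplotype is treated as an allele, the number of categories, and with it the degrees of freedom, grows quickly with the number of markers, which is the power problem the abstract describes.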


Annals of Human Genetics | 2003

Qualitative semi-parametric test for genetic associations in case-control designs under structured populations.

Huann Sheng Chen; Xiaofeng Zhu; Hongyu Zhao; Shuanglin Zhang

Recently, statistical methods have been proposed that use genomic markers to control for population stratification in genetic association studies. However, these methods either have unacceptably low power when population stratification becomes strong or cannot control for population stratification well under admixture population models. In this paper, we propose a semiparametric association test to detect genetic association between a candidate marker and a qualitative trait of interest in case-control designs. The performance of the test is compared to that of existing methods through simulations. The results show that our method gives the correct type I error rate under both discrete population models and admixture population models, and that it is robust to the extent of population stratification. In most of the cases we considered, our method has higher power, and in some cases substantially higher power, than existing methods.


Plant Physiology | 2004

Metabolic Profiling of the Sink-to-Source Transition in Developing Leaves of Quaking Aspen

Mijeong Lee Jeong; Hongying Jiang; Huann Sheng Chen; Chung-Jui Tsai; Scott A. Harding

Profiles of small polar metabolites from aspen (Populus tremuloides Michx.) leaves spanning the sink-to-source transition zone were compared. Approximately 25% of 250 to 300 routinely resolved peaks were identified, with carbohydrates, organic acids, and amino acids being most abundant. Two-thirds of identified metabolites exhibited greater than 4-fold changes in abundance during leaf ontogeny. In the context of photosynthetic and respiratory measurements, profile data yielded information consistent with expected developmental trends in carbon-heterotrophic and carbon-autotrophic metabolism. Suc concentration increased throughout leaf expansion, while hexose sugar concentrations peaked at mid-expansion and decreased sharply thereafter. Amino acid contents generally decreased during leaf expansion, but an early increase in Phe and a later one in Gly and Ser reflected growing commitments to secondary metabolism and photorespiration, respectively. The assimilation of nitrate and utilization of stored Asn appeared to be marked by sequential changes in malate concentration and Asn transaminase activity. Principal component and hierarchical clustering analysis facilitated the grouping of cell wall maturation (pectins, hemicelluloses, and oxalate) and membrane biogenesis markers in relation to developmental changes in carbon and nitrogen assimilation. Metabolite profiling will facilitate investigation of nitrogen use and cellular development in Populus sp. varying widely in their growth and pattern of carbon allocation during sink-to-source development and in response to stress.


Genetic Epidemiology | 2012

Next generation analytic tools for large scale genetic epidemiology studies of complex diseases.

Leah E. Mechanic; Huann Sheng Chen; Christopher I. Amos; Nilanjan Chatterjee; Nancy J. Cox; Rao L. Divi; Ruzong Fan; Emily L. Harris; Kevin B. Jacobs; Peter Kraft; Suzanne M. Leal; Kimberly A. McAllister; Jason H. Moore; Dina N. Paltoo; Michael A. Province; Erin M. Ramos; Marylyn D. Ritchie; Kathryn Roeder; Daniel J. Schaid; Matthew Stephens; Duncan C. Thomas; Clarice R. Weinberg; John S. Witte; Shunpu Zhang; Sebastian Zöllner; Eric J. Feuer; Elizabeth M. Gillanders

Over the past several years, genome‐wide association studies (GWAS) have succeeded in identifying hundreds of genetic markers associated with common diseases. However, most of these markers confer relatively small increments of risk and explain only a small proportion of familial clustering. To identify obstacles to future progress in genetic epidemiology research and provide recommendations to NIH for overcoming these barriers, the National Cancer Institute sponsored a workshop entitled “Next Generation Analytic Tools for Large‐Scale Genetic Epidemiology Studies of Complex Diseases” on September 15–16, 2010. The goal of the workshop was to facilitate discussions on (1) statistical strategies and methods to efficiently identify genetic and environmental factors contributing to the risk of complex disease; and (2) how to develop, apply, and evaluate these strategies for the design, analysis, and interpretation of large‐scale complex disease association studies in order to guide NIH in setting the future agenda in this area of research. The workshop was organized as a series of short presentations covering scientific (gene‐gene and gene‐environment interaction, complex phenotypes, and rare variants and next generation sequencing) and methodological (simulation modeling and computational resources and data management) topic areas. Specific needs to advance the field were identified during each session and are summarized. Genet. Epidemiol. 36 : 22–35, 2012.


PLOS ONE | 2012

Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

Sapna Kumari; Jeff Nie; Huann Sheng Chen; Hao-ran Ma; Ron Stewart; Xiang Li; Meng-Zhu Lu; William Taylor; Hairong Wei

Background: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical.

Methods and Results: In this study, we compared eight gene association methods (Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson) and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined how the different methods behave on microarray data with different properties, and whether the biological processes affect the efficiency of the different methods.

Conclusions: We found that the Spearman, Hoeffding, and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method, which can find gene pairs with multiple types of relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of the different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best-performing method for gene association and coexpression network construction.
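
To make the contrast between rank-based and linear association measures concrete, the following is a minimal pure-Python sketch of three of the eight measures named above (Pearson, Spearman, and Kendall's tau-a). It is not the study's pipeline; the two expression vectors are hypothetical toy data chosen so that the genes are related monotonically but not linearly.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Hypothetical expression profiles with a monotone, nonlinear relationship.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [math.exp(v) for v in gene_a]

print("Pearson :", round(pearson(gene_a, gene_b), 3))
print("Spearman:", round(spearman(gene_a, gene_b), 3))
print("Kendall :", round(kendall_tau_a(gene_a, gene_b), 3))
```

On data like these, the rank-based measures report a perfect association while Pearson does not, which is one simple way such methods can produce structurally different networks from the same expression matrix.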


Cancer | 2012

Predicting US- and state-level cancer counts for the current calendar year: Part II: Evaluation of spatiotemporal projection methods for incidence

Li Zhu; Linda W. Pickle; Kaushik Ghosh; Deepa Naishadham; Kenneth Portier; Huann Sheng Chen; Hyune Ju Kim; Zhaohui Zou; James Cucinelli; Betsy A. Kohler; Brenda K. Edwards; Jessica B. King; Eric J. Feuer; Ahmedin Jemal

The current study was undertaken to evaluate the spatiotemporal projection models applied by the American Cancer Society to predict the number of new cancer cases.


Genetic Epidemiology | 2011

Entropy‐based information gain approaches to detect and to characterize gene‐gene and gene‐environment interactions/correlations of complex diseases

Ruzong Fan; Ming Zhong; S. Wang; Yiwei Zhang; Angeline S. Andrew; Margaret R. Karagas; Huann Sheng Chen; Christopher I. Amos; Momiao Xiong; Jason H. Moore

For complex diseases, the relationship between genotypes, environmental factors, and phenotype is usually complex and nonlinear. Our understanding of the genetic architecture of diseases has increased considerably in recent years. However, both conceptually and methodologically, detecting gene-gene and gene-environment interactions remains a challenge, despite the existence of a number of efficient methods. One method that offers great promise but has not yet been widely applied to genomic data is the entropy-based approach of information theory. In this article, we first develop entropy-based test statistics to identify two-way and higher order gene-gene and gene-environment interactions. We then apply these methods to a bladder cancer data set and thereby test their power and identify strengths and weaknesses. For two-way interactions, we propose an information gain (IG) approach based on mutual information. For three-way and higher order interactions, an interaction IG approach is used. In both cases, we develop one-dimensional test statistics to analyze sparse data. Compared to the naive chi-square test, the test statistics we develop have similar or higher power and are robust. Applying them to the bladder cancer data set allowed us to investigate the complex interactions between DNA repair gene single nucleotide polymorphisms, smoking status, and bladder cancer susceptibility. Although not yet widely applied, entropy-based approaches appear to be a useful tool for detecting gene-gene and gene-environment interactions. The test statistics we develop add to a growing body of methodologies that will gradually shed light on the complex architecture of common diseases. Genet. Epidemiol. 35:706-721, 2011.
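
As a rough illustration of the idea behind an information gain based on mutual information, the sketch below computes plug-in (empirical) entropies and one common definition of the two-way gain, IG = I(G1, G2; D) - I(G1; D) - I(G2; D). This definition is an assumption made here for illustration; the abstract does not spell out the paper's test statistics, and they are not reproduced. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable values."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def information_gain(g1, g2, disease):
    """Information about disease status carried jointly by two factors
    beyond what each carries alone (assumed definition, see lead-in)."""
    joint = list(zip(g1, g2))
    return (mutual_information(joint, disease)
            - mutual_information(g1, disease)
            - mutual_information(g2, disease))

# Hypothetical toy data: disease status is the XOR of two binary factors,
# so neither factor is informative alone but the pair is fully informative.
g1      = [0, 0, 1, 1]
g2      = [0, 1, 0, 1]
disease = [0, 1, 1, 0]

print("I(G1; D) =", mutual_information(g1, disease))
print("I(G2; D) =", mutual_information(g2, disease))
print("IG       =", information_gain(g1, g2, disease))
```

A pure interaction of this kind gives zero marginal mutual information for each factor and a positive gain for the pair, which is the pattern a single-factor test cannot see.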


Cancer | 2012

Predicting US and State-Level Cancer Counts for the Current Calendar Year: Part I – Evaluation of Temporal Projection Methods for Mortality

Huann Sheng Chen; Kenneth Portier; Kaushik Ghosh; Deepa Naishadham; Hyune Ju Kim; Li Zhu; Linda W. Pickle; Martin Krapcho; Steve Scoppa; Ahmedin Jemal; Eric J. Feuer

A study was undertaken to evaluate the temporal projection methods that are applied by the American Cancer Society to predict 4‐year‐ahead projections.


Bioinformatics | 2013

Genetic Simulation Resources

Bo Peng; Huann Sheng Chen; Leah E. Mechanic; Ben Racine; John Clarke; Lauren Clarke; Elizabeth M. Gillanders; Eric J. Feuer

Summary: Many simulation methods and programs have been developed to simulate genetic data of the human genome. These data have been widely used, for example, to predict properties of populations retrospectively or prospectively according to mathematically intractable genetic models, and to assist the validation, statistical inference and power analysis of a variety of statistical models. However, owing to the differences in type of genetic data of interest, simulation methods, evolutionary features, input and output formats, terminologies and assumptions for different applications, choosing the right tool for a particular study can be a resource-intensive process that usually involves searching, downloading and testing many different simulation programs. Genetic Simulation Resources (GSR) is a website provided by the National Cancer Institute (NCI) that aims to help researchers compare and choose the appropriate simulation tools for their studies. This website allows authors of simulation software to register their applications and describe them with well-defined attributes, thus allowing site users to search and compare simulators according to specified features. Availability: http://popmodels.cancercontrol.cancer.gov/gsr. Contact: [email protected]

Collaboration


Dive into Huann Sheng Chen's collaborations.

Top Co-Authors

Eric J. Feuer
National Institutes of Health

Shuanglin Zhang
Michigan Technological University

Leah E. Mechanic
National Institutes of Health

Qiuying Sha
Michigan Technological University

Bo Peng
University of Texas MD Anderson Cancer Center

Denise Riedel Lewis
National Institutes of Health

Jason H. Moore
University of Pennsylvania