Publication


Featured research published by Chenglong Yu.


DNA Research | 2010

A Novel Construction of Genome Space with Biological Geometry

Chenglong Yu; Qian Liang; Changchuan Yin; Rong L. He; Stephen S.-T. Yau

A genome space is a moduli space of genomes. In this space, each point corresponds to a genome. The natural distance between two points in the genome space reflects the biological distance between these two genomes. Currently, there is no method to represent genomes by a point in a space without losing biological information. Here, we propose a new graphical representation for DNA sequences. The breakthrough of the subject is that we can construct the moment vectors from DNA sequences using this new graphical method and prove that the correspondence between moment vectors and DNA sequences is one-to-one. Using these moment vectors, we have constructed a novel genome space as a subspace of R^N. It allows us to show that the SARS-CoV is most closely related to a coronavirus from the palm civet, not from a bird as initially suspected, and the newly discovered human coronavirus HCoV-HKU1 is more closely related to SARS than to any other known member of group 2 coronavirus. Furthermore, we reconstructed the phylogenetic tree for 34 lentiviruses (including human immunodeficiency virus) based on their whole genome sequences. Our genome space will provide a new powerful tool for analyzing the classification of genomes and their phylogenetic relationships.


The Journal of Clinical Endocrinology and Metabolism | 2015

Serum Uric Acid Levels and Risk of Metabolic Syndrome: A Dose-Response Meta-Analysis of Prospective Studies

Huiping Yuan; Chenglong Yu; Xinghui Li; Liang Sun; Xiaoquan Zhu; Chengxiao Zhao; Zheng Zhang; Ze Yang

CONTEXT: An excess circulating uric acid level, even within the normal range, is always comorbid with metabolic syndrome (MS), several of its components, and nonalcoholic fatty liver disease (NAFLD), which was regarded as hepatic manifestation of MS; however, these associations remain controversial. OBJECTIVE: This study aimed to quantitatively assess the relationship between the serum uric acid (SUA) levels and the MS/NAFLD risk. DESIGN: We searched for related prospective cohort studies including SUA as an exposure and MS/NAFLD as a result in MEDLINE (PubMed) and EMBASE databases up to January 31, 2015 and July 28, 2015, respectively. Pooled relative risks (RRs) and corresponding 95% confidence intervals (CIs) were extracted. A random-effects model was used to evaluate dose-response relationships. MAIN OUTCOMES: On the basis of 11 studies (54 970 participants and 8719 MS cases), a combined RR of 1.72 (95% CI, 1.45-2.03; P < .0001) was observed for the highest SUA level category compared with the lowest SUA level category. Furthermore, based on nine studies (51 249 participants and 8265 MS cases), dose-response analysis suggested that each 1 mg/dL SUA increment was roughly linearly associated with the MS risk (RR, 1.30; 95% CI, 1.22-1.38; P < .0001). Beyond that, SUA level increased NAFLD risk (RR, 1.46; 95% CI, 1.31-1.63). Each 1 mg/dL SUA level increment led to 21% increase in the NAFLD risk. CONCLUSIONS: This meta-analysis suggests that higher SUA levels led to an increased risk of MS regardless of the study characteristics, and were consistent with a linear dose-response relationship. In addition, SUA was also a causal factor for the NAFLD risk.
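
The random-effects pooling mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which estimator was used; the DerSimonian-Laird estimator below is an assumption, as are the function name `pool_random_effects` and the input numbers, which are made up for illustration and are not data from the study.

```python
import math

def pool_random_effects(rrs, cis):
    """Pool relative risks with a DerSimonian-Laird random-effects model.

    rrs: list of study relative risks.
    cis: list of (lower, upper) 95% confidence limits, one pair per study.
    Returns (pooled_rr, ci_lower, ci_upper, tau2).
    """
    # Work on the log scale; recover each standard error from the CI width.
    y = [math.log(r) for r in rrs]
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for lo, hi in cis]

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / s ** 2 for s in se]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance tau^2, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled),
            tau2)

# Hypothetical inputs, for illustration only.
rr, lo, hi, tau2 = pool_random_effects(
    [1.4, 1.9, 1.6], [(1.1, 1.8), (1.3, 2.8), (1.2, 2.1)])
```

When the studies agree closely, tau^2 is truncated to zero and the result coincides with a fixed-effect inverse-variance pool.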


DNA and Cell Biology | 2008

A protein map and its application.

Stephen S.-T. Yau; Chenglong Yu; Rong He

Graphical representation of gene sequences provides a simple way of viewing, sorting, and comparing various gene structures. Here we first report a two-dimensional graphical representation for protein sequences. With this method, we constructed the moment vectors for protein sequences, and mathematically proved that the correspondence between moment vectors and protein sequences is one-to-one. Therefore, each protein sequence can be represented as a point in a map, which we call protein map, and cluster analysis can be used for comparison between the points. Sixty-six proteins from five protein families were analyzed using this method. Our data showed that for proteins in the same family, their corresponding points in the map are close to each other. We also illustrate the efficiency of this approach by performing an extensive cluster analysis of the protein kinase C family. These results indicate that this protein map could be used to mathematically specify the similarity of two proteins and predict properties of an unknown protein based on its amino acid sequence.


Information Sciences | 2011

DNA sequence comparison by a novel probabilistic method

Chenglong Yu; Mo Deng; Stephen S.-T. Yau

This paper proposes a novel method for comparing DNA sequences. By using a graphical representation, we are able to construct the probability distributions of DNA sequences. These probability distributions can then be used to make similarity studies by using the symmetrised Kullback-Leibler divergence. After presenting our method, we test it using six DNA sequences taken from the threonine operons of Escherichia coli K-12 and Shigella flexneri. Our approach is then used to study the evolution of primates using mitochondrial DNA data. Our method allows us to reconstruct a phylogenetic tree for primate evolution. In addition, we use our technique to analyze the classification and phylogeny of the Tomato Yellow Leaf Curl Virus (TYLCV) based on its whole genome sequences. These examples show that large volumes of DNA sequences can be handled more easily and more quickly by our approach than by the existing multiple alignment methods. Moreover, our method, unlike other approaches, does not require human intervention, because it can be applied automatically.
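
As a minimal sketch of the comparison step described above, the symmetrised Kullback-Leibler divergence between two discrete probability distributions can be computed as follows. How the distributions are built from the graphical representation is not given in the abstract, so the inputs here are assumed to be ready-made probability vectors with a common support. The symmetrisation used is the plain sum of the two directed divergences; some authors use half of that sum instead.

```python
import math

def kl_divergence(p, q):
    """Directed Kullback-Leibler divergence D(p || q) in nats.

    Assumes p and q are probability vectors of equal length and that
    q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetrised_kl(p, q):
    """Symmetrised divergence: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical distributions, for illustration only.
p = [0.1, 0.2, 0.3, 0.4]
q = [0.25, 0.25, 0.25, 0.25]
distance = symmetrised_kl(p, q)
```

A matrix of such pairwise values can then be passed to any standard distance-based tree-building method.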


Gene | 2011

Protein map: an alignment-free sequence comparison method based on various properties of amino acids.

Chenglong Yu; Shiu-Yuen Cheng; Rong L. He; Stephen S.-T. Yau

In this paper, we propose a new protein map which incorporates various properties of amino acids. As a powerful tool for protein classification, this new protein map both considers phylogenetic factors arising from amino acid mutations and provides computational efficiency for the huge amount of data. The ten amino acid physico-chemical properties (the chemical composition of the side chain, two polarity measures, hydropathy, isoelectric point, volume, aromaticity, aliphaticity, hydrogenation, and hydroxythiolation) are utilized according to their relative importance. Moreover, during the course of calculation of genetic distances between pairs of proteins, this approach does not require any alignment of sequences. Therefore, the proposed model is easier and quicker in handling protein sequences than multiple alignment methods, and gives protein classification greater evolutionary significance at the amino acid sequence level.


PLOS ONE | 2013

Real time classification of viruses in 12 dimensions.

Chenglong Yu; Troy Hernandez; Hui Zheng; Shek-Chung Yau; Hsin-Hsiung Huang; Rong Lucy He; Jie Yang; Stephen S.-T. Yau

The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label and 30.0% for genus label. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in R^12. Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in R^12. Unlike the commonly used tree visualization methods which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g. a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels.
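
The abstract above does not define the 12-dimensional embedding. The sketch below assumes the form commonly given for the natural vector of a DNA sequence: for each nucleotide, its count, its mean position, and its normalized second central moment of position. The function names and the example sequences are hypothetical.

```python
import math

def natural_vector(seq):
    """12-dimensional vector: counts, mean positions, and normalized
    second central moments for A, C, G, T (assumed definition)."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        # 1-based positions at which this nucleotide occurs.
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        # Second central moment, normalized by n_k * n.
        d2_k = (sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
                if n_k else 0.0)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical toy sequences of different lengths, for illustration only.
d = euclidean(natural_vector("ACGTTGCA"), natural_vector("ACGGTA"))
```

Because every sequence maps to a vector of fixed length, distances can be computed without alignment, which is what makes classification fast enough to describe as real time.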


Journal of Theoretical Biology | 2015

A new method to cluster DNA sequences using Fourier power spectrum

Tung Hoang; Changchuan Yin; Hui Zheng; Chenglong Yu; Rong Lucy He; Stephen S.-T. Yau

A novel clustering method is proposed to classify genes and genomes. For a given DNA sequence, a binary indicator sequence of each nucleotide is constructed, and Discrete Fourier Transform is applied on these four sequences to attain respective power spectra. Mathematical moments are built from these spectra, and multidimensional vectors of real numbers are constructed from these moments. Cluster analysis is then performed in order to determine the evolutionary relationship between DNA sequences. The novelty of this method is that sequences with different lengths can be compared easily via the use of power spectra and moments. Experimental results on various datasets show that the proposed method provides an efficient tool to classify genes and genomes. It not only gives comparable results but also is remarkably faster than other multiple sequence alignment and alignment-free methods.
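
The pipeline described above (indicator sequences, discrete Fourier transform, power spectra, moments) can be sketched as follows. The indicator and power-spectrum steps follow the abstract directly. The moment normalization in `spectrum_moments` is a placeholder assumption, since the abstract does not give the formula, and all function names are hypothetical. A direct O(N^2) transform is used to keep the example dependency-free; a real implementation would use an FFT.

```python
import cmath

def indicator(seq, base):
    """Binary indicator sequence: 1 where seq has `base`, else 0."""
    return [1 if ch == base else 0 for ch in seq]

def power_spectrum(x):
    """Power spectrum |X(k)|^2 of a real sequence via a direct DFT."""
    n = len(x)
    spectrum = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        spectrum.append(abs(s) ** 2)
    return spectrum

def spectrum_moments(seq, orders=(1, 2, 3)):
    """Fixed-length feature vector built from power-spectrum moments.

    Normalization (dividing by n**j) is an assumption made for this
    sketch, not the formula used in the paper.
    """
    seq = seq.upper()
    n = len(seq)
    features = []
    for base in "ACGT":
        ps = power_spectrum(indicator(seq, base))
        for j in orders:
            # Skip k = 0, which only encodes the nucleotide count.
            features.append(sum(p ** j for p in ps[1:]) / n ** j)
    return features

# Sequences of different lengths yield vectors of the same length.
a = spectrum_moments("ACGTACGTAC")
b = spectrum_moments("ACGGTA")
```

The resulting vectors can be fed to any standard clustering routine that works on Euclidean distances.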


Molecular Psychiatry | 2017

The PHF21B gene is associated with major depression and modulates the stress response

Ma-Li Wong; Mauricio Arcos-Burgos; S. Liu; Jorge I. Vélez; Chenglong Yu; Bernhard T. Baune; Magdalene C. Jawahar; V. Arolt; Udo Dannlowski; Aaron Chuah; Gavin A. Huttley; R. Fogarty; Martin D. Lewis; Stefan R. Bornstein; Julio Licinio

Major depressive disorder (MDD) affects around 350 million people worldwide; however, the underlying genetic basis remains largely unknown. In this study, we took into account that MDD is a gene-environment disorder, in which stress is a critical component, and used whole-genome screening of functional variants to investigate the ‘missing heritability’ in MDD. Genome-wide association studies (GWAS) using single- and multi-locus linear mixed-effect models were performed in a Los Angeles Mexican-American cohort (196 controls, 203 MDD) and in a replication European-ancestry cohort (499 controls, 473 MDD). Our analyses took into consideration the stress levels in the control populations. The Mexican-American controls, comprised primarily of recent immigrants, had high levels of stress due to acculturation issues and the European-ancestry controls with high stress levels were given higher weights in our analysis. We identified 44 common and rare functional variants associated with mild to moderate MDD in the Mexican-American cohort (genome-wide false discovery rate, FDR, <0.05), and their pathway analysis revealed that the three top overrepresented Gene Ontology (GO) processes were innate immune response, glutamate receptor signaling and detection of chemical stimulus in smell sensory perception. Rare variant analysis replicated the association of the PHF21B gene in the ethnically unrelated European-ancestry cohort. The TRPM2 gene, previously implicated in mood disorders, may also be considered replicated by our analyses. Whole-genome sequencing analyses of a subset of the cohorts revealed that European-ancestry individuals have a significantly reduced (50%) number of single nucleotide variants compared with Mexican-American individuals, and for this reason the role of rare variants may vary across populations. PHF21B variants contribute significantly to differences in the levels of expression of this gene in several brain areas, including the hippocampus. Furthermore, using an animal model of stress, we found that Phf21b hippocampal gene expression is significantly decreased in animals resilient to chronic restraint stress when compared with non-chronically stressed animals. Together, our results reveal that including stress level data enables the identification of novel rare functional variants associated with MDD.


Fractals | 2015

GENERALIZED WEIERSTRASS–MANDELBROT FUNCTION MODEL FOR ACTUAL STOCKS MARKETS INDEXES WITH NONLINEAR CHARACTERISTICS

Lei Zhang; Chenglong Yu; J. Q. Sun

It is difficult to simulate the dynamical behavior of actual financial markets indexes effectively, especially when they have nonlinear characteristics. So it is significant to propose a mathematical model with these characteristics. In this paper, we investigate a generalized Weierstrass–Mandelbrot function (WMF) model with two nonlinear characteristics: fractal dimension D where 2 > D > 1.5 and Hurst exponent (H) where 1 > H > 0.5 firstly. And then we study the dynamical behavior of H for WMF as D and the spectrum of the time series γ change in three-dimensional space, respectively. Because WMF and the actual stock market indexes have two common features: fractal behavior using fractal dimension and long memory effect by Hurst exponent, we study the relationship between WMF and the actual stock market indexes. We choose a random value of γ and fixed value of D for WMF to simulate the S&P 500 indexes at different time ranges. As shown in the simulation results of three-dimensional space, we find that γ i...
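
For orientation, the classical Weierstrass-Mandelbrot cosine series, W(t) = sum over n of (1 - cos(γ^n t)) / γ^((2 - D) n) with γ > 1 and 1 < D < 2, can be evaluated with the short sketch below. This is the standard textbook form with zero phases, not the generalized model of the paper, whose definition is not given in the truncated abstract. The function name, default parameter values, and truncation length are assumptions.

```python
import math

def wm_function(t, D=1.5, gamma=1.5, n_terms=50):
    """Truncated classical Weierstrass-Mandelbrot cosine series at t.

    D is the fractal dimension (1 < D < 2), gamma > 1 sets the spacing
    of the frequency spectrum, and the infinite sum over n is cut off
    at |n| <= n_terms.
    """
    total = 0.0
    for n in range(-n_terms, n_terms + 1):
        total += (1.0 - math.cos(gamma ** n * t)) / gamma ** ((2.0 - D) * n)
    return total

# Sample the function on a regular grid (hypothetical time range).
series = [wm_function(0.01 * i) for i in range(1, 501)]
```

Up to truncation error, the series satisfies the scaling relation W(γt) = γ^(2 - D) W(t), which is the self-affine property that motivates its use as a model of rough time series.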


PLOS ONE | 2014

DFA7, a New Method to Distinguish between Intron-Containing and Intronless Genes

Chenglong Yu; Mo Deng; Lu Zheng; Rong Lucy He; Jie Yang; Stephen S.-T. Yau

Intron-containing and intronless genes have different biological properties and statistical characteristics. Here we propose a new computational method to distinguish between intron-containing and intronless gene sequences. Seven feature parameters based on detrended fluctuation analysis (DFA) are used, so that a 7-dimensional feature vector can be computed for any given gene sequence to be discriminated. Furthermore, a support vector machine (SVM) classifier with a Gaussian radial basis kernel function is applied to this feature space to classify the genes into intron-containing and intronless. We investigate the performance of the proposed method in comparison with other state-of-the-art algorithms on biological datasets. The experimental results show that our new method significantly improves the accuracy over those existing techniques.
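
The Gaussian radial basis kernel that the SVM uses to compare two feature vectors can be sketched as below. The seven DFA-based features themselves are not specified in the abstract, so the vectors here are hypothetical placeholders, and the kernel width `gamma` is an arbitrary illustrative value.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(vectors, gamma=1.0):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(u, v, gamma) for v in vectors] for u in vectors]

# Hypothetical 7-dimensional feature vectors, for illustration only.
genes = [
    [0.52, 0.61, 0.48, 0.55, 0.60, 0.47, 0.58],
    [0.50, 0.63, 0.49, 0.54, 0.59, 0.46, 0.57],
    [0.71, 0.40, 0.66, 0.35, 0.42, 0.69, 0.38],
]
K = gram_matrix(genes, gamma=5.0)
```

Kernel values close to 1 mark feature vectors that lie near each other; an SVM trained on such a matrix learns a nonlinear boundary between the two gene classes.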

Collaboration


Dive into Chenglong Yu's collaboration.

Top Co-Authors

Rong Lucy He

Chicago State University

Rong L. He

Chicago State University

Hui Zheng

University of Illinois at Chicago

Jie Yang

University of Illinois at Chicago
