Publication


Featured research published by Jingyao Li.


BMC Bioinformatics | 2013

Group sparse canonical correlation analysis for genomic data integration

Dongdong Lin; Ji-Gang Zhang; Jingyao Li; Vince D. Calhoun; Hong-Wen Deng; Yu-Ping Wang

Background: The emergence of high-throughput genomic datasets from different sources and platforms (e.g., gene expression, single nucleotide polymorphisms (SNP), and copy number variation (CNV)) has greatly enhanced our understanding of the interplay of these genomic factors as well as their influence on complex diseases. It is challenging to explore the relationship between these different types of genomic datasets. In this paper, we focus on a multivariate statistical method, canonical correlation analysis (CCA), for this problem. The conventional CCA method does not work effectively if the number of data samples is significantly less than that of biomarkers, which is a typical case for genomic data (e.g., SNPs). Sparse CCA (sCCA) methods were introduced to overcome this difficulty, mostly using penalizations with the l-1 norm (CCA-l1) or a combination of the l-1 and l-2 norms (CCA-elastic net). However, they overlook the structural or group effects within genomic data, which often exist and are important (e.g., SNPs spanning a gene interact and work together as a group).

Results: We propose a new group sparse CCA method (CCA-sparse group) along with an effective numerical algorithm to study the mutual relationship between two different types of genomic data (i.e., SNP and gene expression). We then extend the model to a more general formulation that can include the existing sCCA models. We apply the model to feature/variable selection from two datasets and compare our group sparse CCA method with existing sCCA methods on both simulated data and two real datasets (human gliomas data and NCI60 data). We use a graphical representation of the samples with a pair of canonical variates to demonstrate the discriminating characteristics of the selected features. Pathway analysis is further performed for biological interpretation of those features.

Conclusions: The CCA-sparse group method incorporates group effects of features into the correlation analysis while performing individual feature selection simultaneously. It outperforms the two sCCA methods (CCA-l1 and CCA-group) by identifying the correlated features with more true positives while controlling total discordance at a lower level on the simulated data, even if the group effect does not exist or there are irrelevant features grouped with truly correlated features. Compared with our proposed CCA-sparse group model, CCA-l1 tends to select fewer truly correlated features, while CCA-group tends to select more redundant features.
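
The abstract above does not give the authors' numerical algorithm, so the following is only a minimal sketch of one standard building block for sparse group penalties: the proximal operator of a combined l-1 and group l-2 penalty, as it might be applied to a canonical loading vector. The function name `sparse_group_prox`, the parameters `lam_l1` and `lam_group`, and the unweighted group penalty are assumptions made for illustration.

```python
import numpy as np

def sparse_group_prox(w, groups, lam_l1, lam_group):
    """Proximal operator of lam_l1 * ||w||_1 + lam_group * sum_g ||w_g||_2.

    w      : 1-D array of loadings (e.g., one weight per SNP).
    groups : list of index lists, one per group (e.g., SNPs within a gene).
    Hypothetical helper; groups are assumed to be non-overlapping.
    """
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    for idx in groups:
        # Step 1: element-wise soft-thresholding gives sparsity inside the group.
        v = np.sign(w[idx]) * np.maximum(np.abs(w[idx]) - lam_l1, 0.0)
        # Step 2: group-wise shrinkage can remove the whole group at once.
        norm = np.linalg.norm(v)
        if norm > lam_group:
            out[idx] = (1.0 - lam_group / norm) * v
    return out

# Toy usage: two groups of loadings.
w = np.array([3.0, -0.5, 0.2, 0.1, 2.0])
shrunk = sparse_group_prox(w, [[0, 1, 2], [3, 4]], lam_l1=0.5, lam_group=1.0)
```

Setting `lam_group` to zero reduces this to plain l-1 soft-thresholding, and setting `lam_l1` to zero reduces it to a pure group penalty, which mirrors how the general formulation in the abstract is said to include the existing sCCA models.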


Frontiers in Cell and Developmental Biology | 2014

Integrative analysis of multiple diverse omics datasets by sparse group multitask regression

Dongdong Lin; Ji-Gang Zhang; Jingyao Li; Hao He; Hong-Wen Deng; Yu-Ping Wang

A variety of high-throughput genome-wide assays enable the exploration of genetic risk factors underlying complex traits. Although these studies have had a remarkable impact on identifying susceptibility biomarkers, they suffer from issues such as limited sample size and low reproducibility. Combining individual studies of different genetic levels/platforms has the promise to improve the power and consistency of biomarker identification. In this paper, we propose a novel integrative method, namely sparse group multitask regression, for integrating diverse omics datasets, platforms, and populations to identify risk genes/factors of complex diseases. This method combines multitask learning with sparse group regularization, which will: (1) treat the biomarker identification in each single study as a task and then combine the tasks by multitask learning; (2) group variables from all studies for identifying significant genes; (3) enforce a sparse constraint on groups of variables to overcome the “small sample, but large variables” problem. We introduce two sparse group penalties, sparse group lasso and sparse group ridge, in our multitask model, and provide an effective algorithm for each model. In addition, we propose a significance test for the identification of potential risk genes. Two simulation studies are performed to evaluate the performance of our integrative method by comparing it with the conventional meta-analysis method. The results show that our sparse group multitask method significantly outperforms the meta-analysis method. In an application to our osteoporosis studies, 7 genes are identified as significant by our method and are found to have significant effects in three other independent studies used for validation. The most significant gene, SOD2, has been identified in our previous osteoporosis study involving the same expression dataset. Several other genes, such as TREML2, HTR1E, and GLO1, are shown to be novel susceptibility genes for osteoporosis, as confirmed by other studies.


bioinformatics and biomedicine | 2013

Network-based investigation of genetic modules associated with functional brain networks in schizophrenia

Dongdong Lin; Hao He; Jingyao Li; Hong-Wen Deng; Vince D. Calhoun; Yu-Ping Wang

We developed a new sparse multivariate regression method, collaborative sparse reduced rank regression (C-sRRR), for detecting genetic networks associated with brain functional networks in schizophrenia (SZ). Our study: 1) introduced both genetic and brain network structure to group single nucleotide polymorphisms (SNPs) and voxels simultaneously, utilizing the interacting effects implied in both features; 2) used collaborative sparse group lasso to perform genetic variant selection and a nuclear norm penalty to address the interrelationship among voxels; 3) developed an efficient algorithm for solving the non-smooth optimization. In real data analysis, we constructed 8605 genetic sub-networks (modules) from 722177 SNPs with a median module size of 9. A functional brain network was extracted which also showed significant discriminative characteristics between SZ patients and healthy controls. A subsampling strategy was applied to identify 57 highly ranked genes from 14 high-ranking modules. Fourteen of them are SZ susceptibility genes, and 6 genes were consistent with the findings of a previous study.
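
The nuclear norm penalty mentioned above is commonly handled with singular value thresholding. The sketch below shows that generic step only; it is not the authors' C-sRRR algorithm, and the function name `singular_value_threshold` and the parameter `tau` are assumptions made for illustration.

```python
import numpy as np

def singular_value_threshold(B, tau):
    """Proximal operator of tau * (nuclear norm of B).

    B   : coefficient matrix (e.g., SNPs x voxels), hypothetical shape.
    tau : non-negative threshold; larger values give lower rank.
    """
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Soft-threshold the singular values; small ones drop to exactly zero.
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# Toy usage: a rank-2 matrix is reduced to rank 1.
B = np.diag([3.0, 1.0])
B_low_rank = singular_value_threshold(B, tau=2.0)
```

Shrinking singular values rather than individual entries is what lets the penalty couple the columns (here, voxels) through a shared low-dimensional structure.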


international symposium on biomedical imaging | 2013

Identifying genetic connections with brain functions in schizophrenia using group sparse canonical correlation analysis

Dongdong Lin; Ji-Gang Zhang; Jingyao Li; Vince D. Calhoun; Yu-Ping Wang

We investigate the correspondence between genetic variations, measured as single nucleotide polymorphisms (SNPs), and brain activity measured by functional magnetic resonance imaging (fMRI). A group sparse canonical correlation analysis method (group sparse CCA) is proposed to explore the correlation between these two types of data, which are high dimensional with a small number of samples. It can exploit the group or structural information within the data while filtering out irrelevant features within each group. Our method outperforms the existing sparse CCA (sCCA) models in a simulation study. By applying it to the analysis of real data, we identified two pairs of significant canonical variates with correlations of 0.7692 and 0.7168, respectively. A gene and brain region of interest (ROI) correlation analysis was further performed on the two pairs of canonical variates to confirm the correlation between genes and the regions of interest in the brain.


BMC Systems Biology | 2013

An improved sparse representation model with structural information for Multicolour Fluorescence In-Situ Hybridization (M-FISH) image classification

Jingyao Li; Dongdong Lin; Hongbao Cao; Yu-Ping Wang

Background: Multicolour Fluorescence In-Situ Hybridization (M-FISH) images are employed for detecting chromosomal abnormalities such as chromosomal translocations, deletions, duplications and inversions. This technique uses mixed colours of fluorochromes to paint the whole chromosomes for rapid detection of chromosome rearrangements. The M-FISH datasets used in our research are obtained from microscopic scanning of a metaphase cell labelled with five different fluorochromes and a DAPI staining. The reliability of the technique lies in accurate classification of chromosomes (24 classes for male and 23 classes for female) from M-FISH images. However, due to imaging noise, misalignment between multiple channels and many other imaging problems, there is always a classification error, leading to wrong detection of chromosomal abnormalities. Therefore, how to accurately classify different types of chromosomes from M-FISH images becomes a challenging problem.

Methods: This paper presents a novel sparse representation model considering structural information for the classification of M-FISH images. In our previous work, a sparse representation based classification model was proposed. This model employed only individual pixel information for the classification. By using the structural information of neighbouring pixels together with the information of each pixel itself, the novel approach extends the previous one to the regional case. Based on Orthogonal Matching Pursuit (OMP), we developed a simultaneous OMP algorithm (SOMP) to derive an efficient solution of the improved sparse representation model by incorporating the structural information.

Results: The p-value from comparing the two models shows that the newly proposed model incorporating the structural information is significantly superior to our previous one. In addition, we evaluated the effect of several parameters, such as sparsity level, neighbourhood size, and training sample size, on the classification accuracy.

Conclusions: The comparison with our previously used sparse model demonstrates that the improved sparse representation model is more effective than the previous one for the classification of chromosome abnormalities.


BMC Bioinformatics | 2016

An integrative imputation method based on multi-omics datasets

Dongdong Lin; Ji-Gang Zhang; Jingyao Li; Chao Xu; Hong-Wen Deng; Yu-Ping Wang

Background: Integrative analysis of multi-omics data is becoming increasingly important to unravel functional mechanisms of complex diseases. However, the currently available multi-omics datasets inevitably suffer from missing values due to technical limitations and various constraints in experiments. These missing values severely hinder integrative analysis of multi-omics data. Current imputation methods mainly focus on using single omics data while ignoring biological interconnections and information embedded in multi-omics datasets.

Results: In this study, a novel multi-omics imputation method was proposed to integrate multiple correlated omics datasets for improving the imputation accuracy. Our method was designed to: 1) combine the estimates of missing values from the individual omics data itself as well as from other omics, and 2) simultaneously impute multiple missing omics datasets by an iterative algorithm. We compared our method with five imputation methods using single omics data at different noise levels, sample sizes and data missing rates. The results consistently demonstrated the advantage and efficiency of our method in terms of the imputation error and the recovery of mRNA-miRNA network structure.

Conclusions: We concluded that our proposed imputation method can utilize more biological information to minimize the imputation error and thus can improve the performance of downstream analysis such as genetic regulatory network construction.


bioinformatics and biomedicine | 2012

Classification of multicolor fluorescence in-situ hybridization (M-FISH) image using structure based sparse representation model

Jingyao Li; Dongdong Lin; Hongbao Cao; Yu-Ping Wang

We developed a structure based sparse representation model for classifying chromosomes in M-FISH images. The sparse representation based classification model used in our previous work only considered one pixel without incorporating any structural information. The newly proposed model extends the previous one to the multiple-pixel case, where each target pixel together with its neighboring pixels is used simultaneously for classification. We also extend the Orthogonal Matching Pursuit (OMP) algorithm to the multiple-pixel case, named the simultaneous OMP algorithm (SOMP), to solve the structure based sparse representation model. The classification results show that our new model outperforms the previous sparse representation model with a p-value less than 1e-6. We also discuss the effects of several parameters (neighborhood size, sparsity level, and training sample size) on the accuracy of the classification. Our proposed method can be affected by the sparsity level and the neighborhood size but is insensitive to the training sample size. The comparison indicates that the structure based sparse representation model can significantly improve the accuracy of chromosome classification, leading to improved diagnosis of genetic diseases and cancers.
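
As a rough illustration of the multiple-pixel idea, the sketch below implements a textbook form of simultaneous OMP, in which a target pixel and its neighbors share one set of dictionary atoms, followed by a minimum-residual class decision. It is not the authors' implementation: the names `somp` and `classify_patch`, the dictionary layout, and the residual-based decision rule are assumptions.

```python
import numpy as np

def somp(D, Y, sparsity):
    """Simultaneous OMP: all columns of Y share the same selected atoms.

    D : dictionary, shape (n_channels, n_atoms), columns assumed l2-normalised.
    Y : spectra of a target pixel and its neighbors, shape (n_channels, n_pixels).
    Returns (support, X) where X is row-sparse with shape (n_atoms, n_pixels).
    """
    residual = Y.copy()
    support = []
    coef = np.zeros((0, Y.shape[1]))
    for _ in range(sparsity):
        # Score each atom by its correlation with the residuals of all pixels.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        if support:
            scores[support] = -np.inf  # never select the same atom twice
        support.append(int(np.argmax(scores)))
        # Jointly refit all pixels on the atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], Y, rcond=None)
        residual = Y - D[:, support] @ coef
    X = np.zeros((D.shape[1], Y.shape[1]))
    X[support, :] = coef
    return support, X

def classify_patch(D, atom_labels, Y, sparsity):
    """Assign the class whose atoms reconstruct the patch with least error."""
    _, X = somp(D, Y, sparsity)
    best_class, best_err = None, np.inf
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        err = np.linalg.norm(Y - D[:, mask] @ X[mask, :])
        if err < best_err:
            best_class, best_err = int(c), err
    return best_class

# Toy usage: 4 channels, 4 atoms (2 per class), a patch of 2 pixels.
D = np.eye(4)
atom_labels = np.array([0, 0, 1, 1])
Y = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [0.5, 1.0]])
support, X = somp(D, Y, sparsity=2)
label = classify_patch(D, atom_labels, Y, sparsity=2)
```

Forcing a shared support is what encodes the assumption that neighboring pixels belong to the same chromosome class; with a single column in `Y` the routine reduces to ordinary OMP.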


Cytometry Part A | 2017

A patch-based tensor decomposition algorithm for M-FISH image classification

Min Wang; Ting-Zhu Huang; Jingyao Li; Yu-Ping Wang

Multiplex-fluorescence in situ hybridization (M-FISH) is a chromosome imaging technique which can be used to detect chromosomal abnormalities such as translocations, deletions, duplications, and inversions. Chromosome classification from M-FISH imaging data is a key step in implementing the technique. In the classified M-FISH image, each pixel in a chromosome is labeled with a class index and drawn with a pseudo-color so that geneticists can easily conduct diagnosis, for example, identifying chromosomal translocations by examining color changes between chromosomes. However, the information of pixels in a neighborhood is often overlooked by existing approaches. In this work, we assume that the pixels in a patch belong to the same class and use the patch to represent the center pixel's class information, by which we can use the correlations of neighboring pixels and the structural information across different spectral channels for the classification. On the basis of this assumption, we propose a patch-based classification algorithm using higher order singular value decomposition (HOSVD). The developed method has been tested on a comprehensive M-FISH database that we established, demonstrating improved performance. When compared with other pixel-wise M-FISH image classifiers such as fuzzy c-means clustering (FCM), adaptive fuzzy c-means clustering (AFCM), improved adaptive fuzzy c-means clustering (IAFCM), and sparse representation classification (SparseRC) methods, the proposed method gave the highest correct classification ratio (CCR), which can translate into improved diagnosis of genetic diseases and cancers.
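
For readers unfamiliar with HOSVD, the sketch below computes it for a single image patch treated as a three-way tensor (rows x columns x spectral channels). It shows the decomposition only, not the classification rule of the paper; the helper names and the 5 x 5 x 6 patch size are assumptions chosen for illustration.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along axis `mode`."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def hosvd(T):
    """Return (core, factors) with one orthonormal factor matrix per mode."""
    factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0]
               for m in range(T.ndim)]
    core = T
    for m, U in enumerate(factors):
        core = mode_product(core, U.T, m)  # project onto each mode's basis
    return core, factors

def reconstruct(core, factors):
    """Invert the decomposition by multiplying the core with each factor."""
    T = core
    for m, U in enumerate(factors):
        T = mode_product(T, U, m)
    return T

# Toy usage: a hypothetical 5 x 5 patch with 6 spectral channels.
rng = np.random.default_rng(0)
patch = rng.random((5, 5, 6))
core, factors = hosvd(patch)
patch_back = reconstruct(core, factors)
```

Keeping the spatial and spectral axes as separate modes, instead of flattening the patch into a vector, is what allows the factor matrices to capture neighborhood and cross-channel structure separately.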


international symposium on biomedical imaging | 2015

Detection of genetic factors associated with multiple correlated imaging phenotypes by a sparse regression model

Dongdong Lin; Jingyao Li; Vince D. Calhoun; Yu-Ping Wang

Recently, more evidence of polygenicity and pleiotropy has been found in genome-wide association (GWA) studies of complex psychiatric diseases (e.g., schizophrenia), where multiple interacting genetic variants may affect multiple phenotypic traits simultaneously. In this work, we propose a new sparse collaborative group-ridge low-rank regression model (sCGRLR) to study the pleiotropic effects of a group of genetic variants on multiple imaging-derived quantitative traits (i.e., endophenotypes). In the method, we enforce sparse and low-rank regularizations to reduce the number of features and then construct an effective gene or gene-set based statistical test to evaluate the significance of selected features. We show the advantage of our method over other gene-set pleiotropy analysis methods and other sparse multivariate regression methods in terms of type I error and power on simulated data. Finally, we demonstrate its application to real data analysis in a study of schizophrenia.


bioinformatics and biomedicine | 2015

Segmentation of Multicolor Fluorescence In-Situ Hybridization (M-FISH) image using an improved Fuzzy C-means clustering algorithm while incorporating both spatial and spectral information

Jingyao Li; Dongdong Lin; Yu-Ping Wang

Multicolor Fluorescence In-Situ Hybridization (M-FISH) is an imaging technique for rapid detection of chromosomal abnormalities, where the segmentation of chromosomes has been a challenge. Multi-channel information of M-FISH images can be used in a segmentation algorithm to exploit the correlated information across channels for better image segmentation. In addition, neighboring pixels share similar characteristics, so this spatial information can be further utilized to improve the robustness of the algorithm to noise. Motivated by this fact, in this paper we propose an improved Fuzzy C-means (FCM) clustering algorithm that incorporates both spatial and spectral information to overcome problems of conventional FCM such as sensitivity to noise. The experimental results on both simulated and real M-FISH images show that our proposed method achieves higher segmentation accuracy and a lower false ratio than both conventional FCM and the improved adaptive FCM (IAFCM) we recently proposed.
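
The abstract does not specify how spatial information enters the improved algorithm, so the sketch below covers only the conventional FCM baseline it is compared against, applied to multi-channel pixel vectors. The function name `fuzzy_c_means` and its default parameters are assumptions; a spatially aware variant would modify the membership update, which is not reproduced here.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Conventional fuzzy c-means.

    X : pixel features, shape (n_pixels, n_channels).
    m : fuzzifier (> 1); larger values give softer memberships.
    Returns (centers, U) where each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Update centers as membership-weighted means of the pixels.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Update memberships from inverse distances to the centers.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy usage: six two-channel "pixels" forming two well-separated groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)
```

Because each pixel is clustered independently of its neighbors, a single noisy pixel can receive the wrong label, which is the sensitivity to noise that the abstract says the spatial and spectral extension is meant to reduce.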

Collaboration


Dive into Jingyao Li's collaborations.

Top Co-Authors

Dongdong Lin
The Mind Research Network

Han Yan
Xi'an Jiaotong University

Li-Jun Tan
Hunan Normal University