Publication


Featured research published by Shili Lin.


PLOS Genetics | 2005

A Dinucleotide Deletion in CD24 Confers Protection against Autoimmune Diseases

Lizhong Wang; Shili Lin; Kottil Rammohan; Zhenqiu Liu; Jin Qing Liu; Runhua Liu; Nikki Guinther; Judy Lima; Qunmin Zhou; Tony Wang; Xincheng Zheng; Daniel J. Birmingham; Brad H. Rovin; Lee A. Hebert; Yee Ling Wu; D. Joanne Lynn; Glenn Cooke; C. Yung Yu; Pan Zheng; Yang Liu

It is generally believed that susceptibility to both organ-specific and systemic autoimmune diseases is under polygenic control. Although multiple genes have been implicated in each type of autoimmune disease, few are known to have a significant impact on both. Here, we investigated the significance of polymorphisms in the human gene CD24 and the susceptibility to multiple sclerosis (MS) and systemic lupus erythematosus (SLE). We used case-control studies to determine the association between CD24 polymorphism and the risk of MS and SLE. We also considered transmission disequilibrium tests using family data from two cohorts consisting of a total of 150 pedigrees of MS families and 187 pedigrees of SLE families. Our analyses revealed that a dinucleotide deletion at position 1527∼1528 (P1527del) from the CD24 mRNA translation start site is associated with a significantly reduced risk (odds ratio = 0.54 with 95% confidence interval = 0.34–0.82) and delayed progression (p = 0.0188) of MS. Among the SLE cohort, we found a similar reduction of risk with the same polymorphism (odds ratio = 0.38, confidence interval = 0.22–0.62). More importantly, using 150 pedigrees of MS families from two independent cohorts and the TRANSMIT software, we found that the P1527del allele was preferentially transmitted to unaffected individuals (p = 0.002). Likewise, an analysis of 187 SLE families revealed the dinucleotide-deleted allele was preferentially transmitted to unaffected individuals (p = 0.002). The mRNA levels for the dinucleotide-deletion allele were 2.5-fold lower than those of the wild-type allele. The dinucleotide deletion significantly reduced the stability of CD24 mRNA. Our results demonstrate that a destabilizing dinucleotide deletion in the 3′ UTR of CD24 mRNA conveys significant protection against both MS and SLE.
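
As a rough illustration of the kind of summary quoted in this abstract, an odds ratio with a Wald-type 95% confidence interval can be computed from a 2x2 case-control table as sketched below. The counts are hypothetical and are not taken from the study, the abstract does not say which interval method the authors used, and the function name `odds_ratio_ci` is an assumption made for this example.

```python
import math


def odds_ratio_ci(carrier_cases, noncarrier_cases,
                  carrier_controls, noncarrier_controls, z=1.96):
    """Odds ratio and Wald confidence interval on the log scale.

    All four counts must be positive. z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls
    )
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(
        1 / carrier_cases + 1 / noncarrier_cases
        + 1 / carrier_controls + 1 / noncarrier_controls
    )
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts, chosen only to show the calculation.
or_est, ci_low, ci_high = odds_ratio_ci(30, 70, 50, 50)
print(f"OR = {or_est:.2f}, 95% CI = {ci_low:.2f} to {ci_high:.2f}")
```

An interval lying entirely below 1, such as the 0.34–0.82 reported in the abstract, is what indicates a reduced risk among carriers of the allele.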


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2010

Sparse Support Vector Machines with L_{p} Penalty for Biomarker Identification

Zhenqiu Liu; Shili Lin; Ming Tan

The development of high-throughput technology has generated massive amounts of high-dimensional data, much of it discrete. Robust and efficient learning algorithms such as LASSO [1] are required for feature selection and overfitting control. However, most feature selection algorithms are applicable only to continuous data. In this paper, we propose a novel method for sparse support vector machines (SVMs) with L_{p} (p < 1) regularization. Efficient algorithms (LpSVM) are developed for learning a classifier that is applicable to high-dimensional data sets with both discrete and continuous data types. The regularization parameters are estimated by maximizing the area under the ROC curve (AUC) of the cross-validation data. Experimental results on protein sequence and SNP data attest to the accuracy, sparsity, and efficiency of the proposed algorithm. Biomarkers identified with our methods are compared with those from other methods in the literature. The software package in Matlab is available upon request.
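
To make the role of p < 1 concrete, the sketch below evaluates a penalty of the form sum of |w_j|^p on two weight vectors with the same Euclidean norm. This is only an illustration of why such penalties favor sparse solutions; the abstract does not spell out the exact LpSVM objective or algorithm, and the function name `lp_penalty` and the example vectors are assumptions.

```python
def lp_penalty(weights, p):
    """Return sum(|w_j| ** p) for a positive exponent p."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(w) ** p for w in weights)


# Two hypothetical weight vectors with the same Euclidean norm (2.0).
sparse_weights = [2.0, 0.0, 0.0, 0.0]
dense_weights = [1.0, 1.0, 1.0, 1.0]

for p in (1.0, 0.5):
    ratio = lp_penalty(dense_weights, p) / lp_penalty(sparse_weights, p)
    print(f"p = {p}: dense/sparse penalty ratio = {ratio:.3f}")
```

In this toy comparison the dense vector is penalized more heavily relative to the sparse one at p = 0.5 than at p = 1, which is the intuition behind using p < 1 for feature selection.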


Biometrics | 2012

Logistic Bayesian LASSO for identifying association with rare haplotypes and application to age-related macular degeneration.

Swati Biswas; Shili Lin

Rare variants have been heralded as key to uncovering missing heritability in complex diseases. These variants can now be genotyped using next-generation sequencing technologies; nonetheless, rare haplotypes may also result from combinations of common single nucleotide polymorphisms available from genome-wide association studies (GWAS). The National Eye Institute's data on age-related macular degeneration (AMD) is such an example. Studies on AMD had identified potential rare variants; however, due to lack of appropriate statistical tools, effects of individual rare haplotypes were never studied. Here we develop a method for identifying association with rare haplotypes for case-control design. A logistic regression based retrospective likelihood is formulated and is regularized using logistic Bayesian LASSO (LBL). In particular, we penalize the regression coefficients using appropriate priors to weed out unassociated haplotypes, making it possible for the rare associated ones to stand out. We applied LBL to the AMD data and identified common and rare haplotypes in the complement factor H gene, gaining insights into rare variants' contributions to AMD beyond the current literature. This analysis also demonstrates the richness of GWAS data for mapping rare haplotypes, a potential largely unexplored. Additionally, we conducted simulations to investigate the performance of LBL and compare it with Hapassoc. Our results show that LBL is much more powerful in identifying rare associated haplotypes when the false positive rates for both approaches are kept the same.


Human Heredity | 2009

Detection of Parent-of-Origin Effects Based on Complete and Incomplete Nuclear Families with Multiple Affected Children

Ji-Yuan Zhou; Yue-Qing Hu; Shili Lin; Wing K. Fung

Parent-of-origin effects are important in studying genetic traits. More than 1% of all mammalian genes are believed to show parent-of-origin effects. Some statistical methods may be ineffective or fail to detect linkage or association for a gene with parent-of-origin effects. Based on case-parents trios, the parental-asymmetry test (PAT) is simple and powerful in detecting parent-of-origin effects. However, it is common in practice to collect nuclear families with both parents as well as nuclear families with only one parent. In this paper, when only one parent is available for each family with an arbitrary number of affected children, we first develop a new test statistic, 1-PAT, to test for parent-of-origin effects in the presence of association between an allele at the marker locus under study and a disease gene. Then we extend the PAT to accommodate complete nuclear families each with one or more affected children. Combining families with both parents and families with only one parent, the C-PAT is proposed to detect parent-of-origin effects. The validity of the test statistics is verified by simulation in various scenarios of parameter values. A power study shows that using the additional information from incomplete nuclear families in the analysis greatly improves the power of the tests, compared to that based on only complete nuclear families. Also, by utilizing all affected children in each family, the proposed tests have higher power than when only one affected child from each family is selected. An additional power comparison also demonstrates that the C-PAT is more powerful than a number of other tests for detecting parent-of-origin effects.


Genetic Epidemiology | 2010

Detection of parent-of-origin effects using general pedigree data

Ji-Yuan Zhou; Jie Ding; Wing K. Fung; Shili Lin

Genomic imprinting is an important epigenetic factor in the study of complex traits, which has generally been examined by testing for parent‐of‐origin effects of alleles. For a diallelic marker locus, the parental‐asymmetry test (PAT) based on case‐parents trios and its extensions to incomplete nuclear families (1‐PAT and C‐PAT) are simple and powerful for detecting parent‐of‐origin effects. However, these methods are suitable only for nuclear families and thus are not amenable to general pedigree data. Use of data from extended pedigrees, if available, may lead to more powerful methods than randomly selecting one two‐generation nuclear family from each pedigree. In this study, we extend PAT to accommodate general pedigree data by proposing the pedigree PAT (PPAT) statistic, which uses all informative family trios from pedigrees. To fully utilize pedigrees with some missing genotypes, we further develop the Monte Carlo (MC) PPAT (MCPPAT) statistic based on MC sampling and estimation. Extensive simulations were carried out to evaluate the performance of the proposed methods. Under the assumption that the pedigrees and their associated affection patterns are randomly drawn from a population of pedigrees with at least one affected offspring, we demonstrated that MCPPAT is a valid test for parent‐of‐origin effects in the presence of association. Further, MCPPAT is much more powerful than PAT for trios or even PPAT for all informative family trios from the same pedigrees if there are missing data. Application of the proposed methods to a rheumatoid arthritis dataset further demonstrates the advantage of MCPPAT. Genet. Epidemiol. 34: 151–158, 2010.


Genetic Epidemiology | 2014

Detecting Rare Haplotype‐Environment Interaction With Logistic Bayesian LASSO

Swati Biswas; Shuang Xia; Shili Lin

Two important contributors to missing heritability are believed to be rare variants and gene‐environment interaction (GXE). Thus, detecting GXE where G is a rare haplotype variant (rHTV) is a pressing problem. Haplotype analysis is usually the natural second step to follow up on a genomic region that is implicated to be associated through single nucleotide variant (SNV) analysis. Further, rHTV can tag associated rare SNV and provide greater power to detect them than popular collapsing methods. Recently we proposed Logistic Bayesian LASSO (LBL) for detecting rHTV association with case–control data. LBL shrinks the unassociated (especially common) haplotypes toward zero so that an associated rHTV can be identified with greater power. Here, we incorporate environmental factors and their interactions with haplotypes in LBL. As LBL is based on retrospective likelihood, this extension is not trivial. We model the joint distribution of haplotypes and covariates given the case–control status. We apply the approach (LBL‐GXE) to the Michigan, Mayo, AREDS, Pennsylvania Cohort Study on Age‐related Macular Degeneration (AMD). LBL‐GXE detects interaction of a specific rHTV in the CFH gene with smoking. To the best of our knowledge, this is the first time in the AMD literature that an interaction of smoking with a specific (rather than pooled) rHTV has been implicated. We also carry out simulations and find that LBL‐GXE has reasonably good power for detecting interactions with rHTV while keeping the type I error rates well controlled. Thus, we conclude that LBL‐GXE is a useful tool for uncovering missing heritability.


BMC Genetics | 2003

Linkage analysis of the simulated data – evaluations and comparisons of methods

Swati Biswas; Charalampos Papachristou; Mark E Irwin; Shili Lin

The goal of this study is to evaluate, compare, and contrast several standard and new linkage analysis methods. First, we compare a recently proposed confidence set approach with MAPMAKER/SIBS. Then, we evaluate a new Bayesian approach that accounts for heterogeneity. Finally, the newly developed software SIMPLE is compared with GENEHUNTER. We apply these methods to several replicates of the Genetic Analysis Workshop 13 simulated data to assess their ability to detect the high blood pressure genes on chromosome 21, whose positions were known to us prior to the analyses. In contrast to the standard methods, most of the new approaches are able to identify at least one of the disease genes in all the replicates considered.


BMC Proceedings | 2009

Detection of imprinting and heterogeneous maternal effects on high blood pressure using Framingham Heart Study data

Jingyuan Yang; Shili Lin

Both imprinting and maternal effects could lead to parent-of-origin patterns in complex traits of human disorders. Statistical methods that differentiate these two effects and identify them simultaneously by using family-based data from retrospective studies are available. The usual data structures include case-parents triads and nuclear families with multiple affected siblings. We develop a likelihood-based method to detect imprinting and maternal effects simultaneously using data from prospective studies. The proposed method utilizes both affected and unaffected siblings in nuclear families by modeling familial genotypes and offspring's disease status jointly. Maternal effect is usually modeled as a fixed effect under the assumption that maternal variant allele(s) has (have) identical effect on any offspring. However, recent studies report that different people may carry different amounts of substances encoded by the mother's variant allele(s) (called maternal microchimerism), which could result in heterogeneity of maternal effects. The proposed method incorporates the heterogeneity of maternal effects by adding a random component to the logit of the penetrance. Our method was applied to the Framingham Heart Study data in two steps to detect single-nucleotide polymorphisms (SNPs) that may be associated with high blood pressure. In the first step, SNPs that affect susceptibility to high blood pressure through minor allele, genomic imprinting, or maternal effects were identified by using the proposed model without the random effect component. In the second step, we fitted the mixed effect model to the identified SNPs that have significant maternal effect to detect heterogeneity of the maternal effects.
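
The idea of "adding a random component to the logit of the penetrance" can be pictured with the minimal sketch below. It is not the authors' model: the coefficient names, the additive coding of genotypes, the omission of an imprinting term, and the way the random term is drawn are all assumptions made for illustration.

```python
import math
import random


def penetrance(child_copies, mother_copies, intercept,
               child_effect, maternal_effect, random_term=0.0):
    """Disease probability from a logistic model with an optional random term.

    child_copies and mother_copies are counts (0, 1, or 2) of the variant
    allele; random_term is added on the logit scale.
    """
    logit = (intercept
             + child_effect * child_copies
             + maternal_effect * mother_copies
             + random_term)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(1)
fixed_only = penetrance(1, 1, intercept=-2.0, child_effect=0.3,
                        maternal_effect=0.5)
# With heterogeneity: the same genotypes give different probabilities.
with_random = [
    penetrance(1, 1, intercept=-2.0, child_effect=0.3, maternal_effect=0.5,
               random_term=rng.gauss(0.0, 0.8))
    for _ in range(5)
]
print(fixed_only, with_random)
```

Setting the spread of the random term to zero recovers the fixed-effect model, which mirrors the two-step analysis described in the abstract: fit without the random component first, then test for heterogeneity.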


American Journal of Epidemiology | 2011

Detection of Parent-of-Origin Effects for Quantitative Traits in Complete and Incomplete Nuclear Families With Multiple Children

Feng He; Ji-Yuan Zhou; Yue-Qing Hu; Fengzhu Sun; Jingyuan Yang; Shili Lin; Wing K. Fung

For a diallelic genetic marker locus, tests like the parental-asymmetry test (PAT) are simple and powerful for detecting parent-of-origin effects. However, these approaches are applicable only to qualitative traits and thus are currently not suitable for quantitative traits. In this paper, the authors propose a novel class of PAT-type parent-of-origin effects tests for quantitative traits in families with both parents and an arbitrary number of children, which is denoted by Q-PAT(c) for some constant c. The authors further develop Q-1-PAT(c) for detection of parent-of-origin effects when information is available on only 1 parent in each family. The authors suggest the Q-C-PAT(c) test for combining families with data on both parental genotypes and families with data on only 1 parental genotype. Simulation studies show that the proposed tests control the empirical type I error rates well under the null hypothesis of no parent-of-origin effects. Power comparison also demonstrates that the proposed methods are more powerful than the existing likelihood ratio test. Although normality is commonly assumed in methods for studying quantitative traits, the tests proposed in this paper do not make any assumption about the distribution of the quantitative trait.


Journal of Computational Biology | 2006

Genome-wide tagging SNPs with entropy-based Monte Carlo method.

Zhenqiu Liu; Shili Lin; Ming Tan

The number of common single nucleotide polymorphisms (SNPs) in the human genome is estimated to be around 3-6 million. It is highly anticipated that the study of SNPs will help provide a means for elucidating the genetic component of complex diseases and variable drug responses. High-throughput technologies such as oligonucleotide arrays have produced an enormous amount of SNP data, which creates great challenges in genome-wide disease linkage and association studies. In this paper, we present an adaptation of the cross entropy (CE) method and propose an iterative CE Monte Carlo (CEMC) algorithm for tagging SNP selection. This differs from most SNP selection algorithms in the literature in that our method is independent of the notion of haplotype block. Thus, the method is applicable to whole genome SNP selection without prior knowledge of block boundaries. We applied this block-free algorithm to three large datasets (two simulated and one real) that are on the order of thousands of SNPs. The successful applications to these large scale datasets demonstrate that CEMC is computationally feasible for whole genome SNP selection. Furthermore, the results show that CEMC is significantly better than random selection, and it also outperformed another block-free selection algorithm for the dataset considered.
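
For readers unfamiliar with the cross entropy method, the sketch below shows its generic form for choosing a subset of items: sample inclusion masks from independent Bernoulli probabilities, keep the best-scoring samples, and move the probabilities toward them. It is not the CEMC algorithm itself. The toy objective (covering hypothetical groups of correlated markers with few selections) stands in for the entropy-based criterion, which the abstract does not specify, and all names and parameter values are assumptions.

```python
import random


def cross_entropy_select(score, n_items, n_samples=200, elite_frac=0.1,
                         smoothing=0.7, n_iters=30, seed=0):
    """Generic cross entropy search over 0/1 inclusion masks."""
    rng = random.Random(seed)
    probs = [0.5] * n_items
    n_elite = max(1, int(n_samples * elite_frac))
    best_mask, best_score = None, float("-inf")
    for _ in range(n_iters):
        # Draw candidate subsets from the current inclusion probabilities.
        samples = [[1 if rng.random() < q else 0 for q in probs]
                   for _ in range(n_samples)]
        scored = sorted(((score(m), m) for m in samples),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        elite = [m for _, m in scored[:n_elite]]
        # Move each probability toward its frequency among the elite samples.
        for j in range(n_items):
            freq = sum(m[j] for m in elite) / n_elite
            probs[j] = smoothing * freq + (1 - smoothing) * probs[j]
    return best_mask, best_score, probs


# Hypothetical groups of mutually correlated markers (indices 0..9).
GROUPS = [[0, 1, 2], [3, 4], [5, 6, 7, 8], [9]]


def toy_score(mask):
    """Reward each group with at least one selected marker; charge per marker."""
    covered = sum(1 for group in GROUPS if any(mask[j] for j in group))
    return covered - 0.1 * sum(mask)


best_mask, best_score, probs = cross_entropy_select(toy_score, n_items=10)
print(best_mask, best_score)
```

Because the search only needs a score for each candidate subset, nothing in it depends on block boundaries, which is the property the abstract emphasizes for whole-genome selection.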

Collaboration


Dive into Shili Lin's collaborations.

Top Co-Authors

Swati Biswas

University of Texas at Dallas

Pan Zheng

Children's National Medical Center

Yang Liu

Children's National Medical Center

Zhenqiu Liu

Cedars-Sinai Medical Center

Ji-Yuan Zhou

Southern Medical University

Lizhong Wang

University of Alabama at Birmingham

Runhua Liu

University of Alabama at Birmingham

Wing K. Fung

University of Hong Kong
