Lei M. Li
University of Southern California
Publications
Featured research published by Lei M. Li.
PLOS Genetics | 2008
Min Wei; Paola Fabrizio; Jia Hu; Huanying Ge; Chao Cheng; Lei M. Li; Valter D. Longo
Calorie restriction (CR), the only non-genetic intervention known to slow aging and extend life span in organisms ranging from yeast to mice, has been linked to the down-regulation of Tor, Akt, and Ras signaling. In this study, we demonstrate that the serine/threonine kinase Rim15 is required for yeast chronological life span extension caused by deficiencies in Ras2, Tor1, and Sch9, and by calorie restriction. Deletion of stress resistance transcription factors Gis1 and Msn2/4, which are positively regulated by Rim15, also caused a major although not complete reversion of the effect of calorie restriction on life span. The deletion of both RAS2 and the Akt and S6 kinase homolog SCH9 in combination with calorie restriction caused a remarkable 10-fold life span extension, which, surprisingly, was only partially reversed by the lack of Rim15. These results indicate that the Ras/cAMP/PKA/Rim15/Msn2/4 and the Tor/Sch9/Rim15/Gis1 pathways are major mediators of the calorie restriction-dependent stress resistance and life span extension, although additional mediators are involved. Notably, the anti-aging effect caused by the inactivation of both pathways is much more potent than that caused by CR.
PLOS Genetics | 2009
Min Wei; Paola Fabrizio; Federica Madia; Jia Hu; Huanying Ge; Lei M. Li; Valter D. Longo
The effect of calorie restriction (CR) on life span extension, demonstrated in organisms ranging from yeast to mice, may involve the down-regulation of pathways, including Tor, Akt, and Ras. Here, we present data suggesting that yeast Tor1 and Sch9 (a homolog of the mammalian kinases Akt and S6K) are central components of a network that controls a common set of genes implicated in a metabolic switch from the TCA cycle and respiration to glycolysis and glycerol biosynthesis. During chronological survival, mutants lacking SCH9 depleted extracellular ethanol and reduced stored lipids, but synthesized and released glycerol. Deletion of the glycerol biosynthesis genes GPD1, GPD2, or RHR2, among the most up-regulated in long-lived sch9Δ, tor1Δ, and ras2Δ mutants, was sufficient to reverse chronological life span extension in sch9Δ mutants, suggesting that glycerol production, in addition to the regulation of stress resistance systems, optimizes life span extension. Glycerol, unlike glucose or ethanol, did not adversely affect the life span extension induced by calorie restriction or starvation, suggesting that carbon source substitution may represent an alternative to calorie restriction as a strategy to delay aging.
PLOS ONE | 2008
Chao Cheng; Lei M. Li
Background: MicroRNAs (miRNAs) play crucial roles in a variety of biological processes by regulating the expression of their target genes at the mRNA level. A number of computational approaches regarding miRNAs have been proposed, but most of them focus on miRNA gene finding or target prediction. Little computational work has been done to investigate the effective regulation of miRNAs. Methodology/Principal Findings: We propose a method to infer the effective regulatory activities of miRNAs by integrating microarray expression data with miRNA target predictions. The method is based on the idea that changes in the regulatory activity of miRNAs are reflected in the expression changes of their target transcripts as measured by microarray. To validate this method, we apply it to microarray data sets that measure gene expression changes in cell lines after transfection or inhibition of several specific miRNAs. The results indicate that our method can detect activity enhancement of the transfected miRNAs as well as activity reduction of the inhibited miRNAs with high sensitivity and specificity. Furthermore, we show that our inference is robust with respect to false positives in target prediction. Conclusions/Significance: A large number of gene expression data sets are available in the literature, but the miRNA regulation underlying these data sets is largely unknown. The method is easy to implement and can be used to investigate the effective miRNA regulation underlying the expression change profiles obtained from microarray experiments.
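The idea in this abstract, that a change in miRNA activity shows up as a shift in the expression of its predicted targets, can be illustrated with a minimal sketch. This is not the authors' statistic: it swaps in a plain Mann-Whitney style z-score, and the function name `target_shift_z` and the input layout (a dict of gene to expression change, plus a set of predicted targets) are assumptions made for illustration.

```python
from math import sqrt

def target_shift_z(expr_change, targets):
    """Z-score for whether predicted targets rank lower or higher than other genes.

    expr_change: dict mapping gene -> expression change (hypothetical layout)
    targets: set of genes predicted to be targets of one miRNA
    A strongly negative score means targets tend to be down-regulated,
    which would be consistent with increased miRNA activity.
    """
    tgt = [v for g, v in expr_change.items() if g in targets]
    bg = [v for g, v in expr_change.items() if g not in targets]
    n, m = len(tgt), len(bg)
    if n == 0 or m == 0:
        raise ValueError("need at least one target and one non-target gene")
    # Mann-Whitney U: count (target, background) pairs where the target is higher;
    # ties contribute one half.
    u = sum((t > b) + 0.5 * (t == b) for t in tgt for b in bg)
    # Normal approximation to the null distribution of U (no tie correction).
    return (u - n * m / 2) / sqrt(n * m * (n + m + 1) / 12)
```

Because the score depends only on ranks, a handful of falsely predicted targets dilutes the shift rather than dominating it, which is in the spirit of the robustness to false positives mentioned in the abstract.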
Research in Computational Molecular Biology | 2003
Lei M. Li; Jong Hyun Kim; Michael S. Waterman
In this paper we describe a method for statistical reconstruction of haplotypes from a set of aligned SNP fragments. We consider the case of a pair of homologous human chromosomes, one from the mother and the other from the father. After fragment assembly and SNP detection, we wish to reconstruct the two haplotypes of the parents. Given a set of SNP sites inferred from the assembly alignment, we wish to divide the fragment set into two subsets, each of which represents one chromosome. Our method is based on a statistical model of sequencing errors, compositional information and haplotype memberships. We calculate probabilities of different haplotypes conditional on the alignment. Due to computational complexity, we first determine phases for neighboring SNPs. Then we connect them and construct haplotype segments. We also compute the accuracy or confidence of the reconstructed haplotypes. We discuss other issues such as alternative methods, parameter estimation, computational efficiency, and relaxation of assumptions.
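The first step described above, determining the phase of two neighboring SNPs from the fragments that cover both, can be sketched as follows. This is a simplified illustration and not the authors' model: it assumes biallelic sites coded 0/1, a single hypothetical per-base error rate `err`, equal prior probability for the two phases, and fragments represented as dicts from site index to observed allele.

```python
from math import log, exp

def phase_posterior(fragments, i, j, err=0.01):
    """Posterior probability that sites i and j are in 'cis' phase
    (haplotypes 00/11) rather than 'trans' phase (01/10)."""
    # Under cis phase a fragment shows equal alleles at i and j when both
    # bases are read correctly or both are misread (biallelic flip).
    p_same = (1 - err) ** 2 + err ** 2
    p_diff = 1 - p_same
    log_cis = 0.0
    log_trans = 0.0
    for frag in fragments:
        if i in frag and j in frag:  # only fragments covering both sites inform the phase
            concordant = frag[i] == frag[j]
            log_cis += log(p_same if concordant else p_diff)
            log_trans += log(p_diff if concordant else p_same)
    # Equal priors on the two phases, so the posterior is a logistic of the log-odds.
    return 1.0 / (1.0 + exp(log_trans - log_cis))
```

The returned posterior doubles as a confidence value for the pairwise phase; with no fragment spanning both sites it stays at the prior of 0.5.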
BMC Genomics | 2007
Chao Cheng; Paola Fabrizio; Huanying Ge; Valter D. Longo; Lei M. Li
Background: Three kinases, Sch9, PKA and TOR, are suggested to be involved in both replicative and chronological ageing in yeast. They function in pathways whose down-regulation leads to life span extension. Several stress response proteins, including the two transcription factors Msn2 and Msn4, mediate the longevity extension phenotype associated with decreased activity of Sch9, PKA, or TOR. However, the mechanisms of longevity, especially the underlying transcription program, have not been fully understood. Results: We measured the gene expression profiles in wild type yeast and three long-lived mutants: sch9Δ, ras2Δ, and tor1Δ. To elucidate the transcription program that may account for the longevity extension, we identified the transcription factors that are systematically and significantly associated with the expression differentiation in these mutants with respect to wild type by integrating microarray expression data with motif and ChIP-chip data, respectively. Our analysis suggests that three stress response transcription factors, Msn2, Msn4 and Gis1, are activated in all three mutants. We also identify some other transcription factors, such as Fhl1 and Hsf1, which may also be involved in the transcriptional modification in the long-lived mutants. Conclusion: Combining microarray expression data with other data sources such as motif and ChIP-chip data provides biological insights into the transcription modification that leads to life span extension. In the chronologically long-lived mutants sch9Δ, ras2Δ, and tor1Δ, several common stress response transcription factors are activated compared with the wild type according to our systematic transcription inference.
BMC Bioinformatics | 2007
Chao Cheng; Xiting Yan; Fengzhu Sun; Lei M. Li
Background: The identification of transcription factors (TFs) associated with a biological process is fundamental to understanding its regulatory mechanisms. From microarray data, however, the activity changes of TFs often cannot be directly observed due to their relatively low expression levels, post-transcriptional modifications, and other complications. Several approaches have been proposed to infer TF activity changes from microarray data. In some models, a linear relationship between gene expression and TF-gene binding strength is assumed. In other models, the target genes of a TF are first determined by applying a significance cutoff to binding affinity scores, and then expression differentiation is checked between the target genes and other genes. Results: We propose a novel method, referred to as BASE (binding association with sorted expression), to infer TF activity changes from microarray expression profiles with the help of binding affinity data. It searches for the maximum association between the binding affinity profile of a TF and the expression change profile along the direction of sorted differentiation. The method does not make a hard target gene selection; rather, the significance of TF activity changes is evaluated by permutation tests of binding association at the end. To show the effectiveness of this method, we apply it to three typical examples using different kinds of binding affinity data, namely, ChIP-chip data, motif discovery data, and positional weighted matrix scanning data, respectively. The implications obtained from all three examples are consistent with established biological results. Moreover, the inferences suggest new and biologically meaningful hypotheses for further investigation. Conclusion: The proposed method makes transcription inference from profiles of expression and binding affinity. The same machinery can be used to deal with various kinds of binding affinity data. The method does not require a linear assumption, and has the desirable property of scale-invariance with respect to TF-specific binding affinity. This method is easy to implement and can be routinely applied for transcriptional inferences in microarray studies.
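The general shape of such an analysis (sort genes by expression change, track how binding affinity accumulates along the sorted list, take the maximum deviation, then assess it by permutation) can be sketched as below. This is a simplified running-sum stand-in and not the published BASE statistic; the function names, the uniform background profile, and the assumption of non-negative affinities are all choices made for illustration.

```python
import random

def max_association(expr_change, affinity):
    """Largest signed gap between the cumulative affinity profile and a
    uniform background, scanning genes from most up- to most down-regulated."""
    order = sorted(range(len(expr_change)), key=lambda g: expr_change[g], reverse=True)
    total = sum(affinity) or 1.0  # normalizing makes the score invariant to affinity scale
    n = len(order)
    cumulative = 0.0
    best = 0.0
    for rank, g in enumerate(order, start=1):
        cumulative += affinity[g] / total
        deviation = cumulative - rank / n
        if abs(deviation) > abs(best):
            best = deviation
    return best

def permutation_pvalue(expr_change, affinity, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling affinities across genes."""
    rng = random.Random(seed)
    observed = abs(max_association(expr_change, affinity))
    shuffled = list(affinity)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(max_association(expr_change, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

No cutoff is applied to decide which genes count as targets, and multiplying all affinities of a TF by a constant leaves the score unchanged, mirroring the two properties the abstract highlights.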
Artificial Intelligence in Medicine | 2006
Panagiota Kitsantas; Myles Hollander; Lei M. Li
OBJECTIVE: Low birth weight (LBW) is a major public health problem. Compared with normal birth weight, LBW is positively associated with infant mortality and negatively associated with normative childhood cognitive and physical development. In the past two decades, research has identified important risk factors of LBW. In this study, we used classification trees to study the interactive nature of these factors. In particular, we (1) identify subgroups of women who are at a high risk of a LBW outcome in seven geographical regions of Florida, and (2) study the predictive performance of classification trees by comparing the tree-based results to those obtained using logistic regression. METHODS: The data, 181,690 singleton births, were derived from Florida birth certificates recorded in 1998. Classification trees and logistic regression models were built based on seven geographical regions. The outcome variable consisted of two classes, namely LBW (< 2500 g) and normal birth weight (≥ 2500 g) cases, while a large number of known risk factors were examined. Tree and logistic regression models were compared using receiver operating characteristic (ROC) curves, and sensitivity and specificity analyses. RESULTS: The use of classification trees revealed a number of high-risk subgroups. For instance, White, Hispanic or Other non-white mothers who were healthy and smoked with a weight gain of less than 20 lbs had a higher risk of a LBW birth compared to those with the same characteristics but with a weight gain of more than 20 lbs. Factors such as parity and marital status were important predictors for pregnancy outcomes among nonsmoking White, Hispanic or Other non-white mothers. Furthermore, we found that Black mothers were directly classified as a high-risk subgroup in the Panhandle, Northeast, and North Central regions, while in the Southern regions a series of other characteristics further defined the high-risk subgroup of Black mothers. Overall, the differences in predictive performance between tree models and logistic regression were minimal. CONCLUSION: The present study demonstrated that classification trees can be used to identify high-risk subgroups of mothers who are at risk of LBW outcomes. Although these exploratory tree analyses revealed a number of distinctive variable interactions for each geographical area, the variable selection was similar across all seven regions. This study also demonstrated that classification trees did not outperform logistic regression models or vice versa; both approaches provided useful analyses of the data.
BMC Genomics | 2008
Chao Cheng; Lei M. Li
Background: The cell cycle has long been an important model for studying genome-wide transcriptional regulation. Although several methods have been introduced to identify cell cycle regulated genes from microarray data, they cannot be directly used to investigate cell cycle regulated transcription factors (CCRTFs), because for many transcription factors (TFs) it is their activities rather than their expression levels that are periodically regulated across the cell cycle. To overcome this problem, it is useful to infer TF activities across the cell cycle by integrating microarray expression data with ChIP-chip data, and then examine the periodicity of the inferred activities. For most species, however, large-scale ChIP-chip data are still not available. Results: We propose a two-step method to identify CCRTFs by integrating microarray cell cycle data with ChIP-chip data or motif discovery data. In S. cerevisiae, we identify 42 CCRTFs, among which 23 have been verified experimentally. The cell cycle related behaviors (e.g. at which cell cycle phase a TF achieves the highest activity) predicted by our method are consistent with well-established knowledge about them. We also find that the periodic activity fluctuation of some TFs can be perturbed by the cell synchronization treatment. Moreover, by integrating expression data with in-silico motif discovery data, we identify 8 cell cycle associated regulatory motifs, among which 7 are binding sites for well-known cell cycle related TFs. Conclusion: Our method is effective at identifying CCRTFs by integrating microarray cell cycle data with TF-gene binding information. In S. cerevisiae, the TF-gene binding information is provided by systematic ChIP-chip experiments. In other species where systematic ChIP-chip data are not available, in-silico motif discovery and analysis provide an alternative. Therefore, our method is ready to be applied to microarray cell cycle data sets from different species. The C++ program for AC score calculation is available for download from URL http://leili-lab.cmb.usc.edu/yeastaging/projects/project-base/.
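The second step described above, examining whether an inferred TF activity profile is periodic over the cell cycle and at which point it peaks, can be illustrated with a minimal sketch. It is not the AC score computed by the authors' C++ program: it simply projects the activity series onto a sine and cosine at an assumed known cell-cycle period, and it assumes evenly spaced time points covering a whole number of cycles. The function name `periodicity` and its return layout are hypothetical.

```python
from math import sin, cos, pi, atan2

def periodicity(times, activity, period):
    """Return (fraction of variance explained by a sinusoid at `period`,
    time within the cycle at which the fitted sinusoid peaks)."""
    n = len(times)
    mean = sum(activity) / n
    centered = [a - mean for a in activity]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return 0.0, 0.0  # a flat profile carries no periodic signal
    # Fourier coefficients at the assumed cell-cycle frequency
    # (exact least squares only for evenly spaced samples over whole cycles).
    a = 2 / n * sum(c * cos(2 * pi * t / period) for c, t in zip(centered, times))
    b = 2 / n * sum(c * sin(2 * pi * t / period) for c, t in zip(centered, times))
    explained = (a * a + b * b) / 2 / variance
    peak_time = (atan2(b, a) / (2 * pi) * period) % period
    return explained, peak_time
```

The peak time plays the role of the "phase at which a TF achieves the highest activity"; in practice the explained-variance score would still need a significance assessment, for example by permuting time points.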
BMC Bioinformatics | 2006
Pierre Nicolas; Fengzhu Sun; Lei M. Li
Background: Single Nucleotide Polymorphisms (SNPs) are the most common type of polymorphism found in the human genome. Effective genetic association studies require the identification of sets of tag SNPs that capture as much haplotype information as possible. Tag SNP selection is analogous to the problem of data compression in information theory. According to Shannon's framework, the optimal tag set maximizes the entropy of the tag SNPs subject to constraints on the number of SNPs. This approach requires an appropriate probabilistic model. Compared to simple measures of Linkage Disequilibrium (LD), a good model of haplotype sequences can more accurately account for LD structure. It also provides machinery for predicting tagged SNPs and thereby for assessing the performance of tag sets through their ability to predict larger SNP sets. Results: Here, we compute the description code-lengths of SNP data for an array of models and we develop tag SNP selection methods based on these models and the strategy of entropy maximization. Using data sets from the HapMap and ENCODE projects, we show that the hidden Markov model introduced by Li and Stephens outperforms the other models in several aspects: description code-length of SNP data, information content of tag sets, and prediction of tagged SNPs. This is the first use of this model in the context of tag SNP selection. Conclusion: Our study provides strong evidence that the tag sets selected by our best method, based on Li and Stephens' model, outperform those chosen by several existing methods. The results also suggest that information content evaluated with a good model is more sensitive for assessing the quality of a tagging set than the correct prediction rate of tagged SNPs. In addition, we show that haplotype phase uncertainty has an almost negligible impact on the ability of good tag sets to predict tagged SNPs. This justifies the selection of tag SNPs on the basis of haplotype informativeness, although genotyping studies do not directly assess haplotypes. Software that implements our approach is available.
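The entropy-maximization strategy can be illustrated with a minimal sketch. It is not the authors' software: in place of the Li and Stephens hidden Markov model it uses the plain empirical distribution of observed haplotypes, and it selects sites greedily rather than searching for the optimal set. The function names and the haplotypes-as-strings input are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def joint_entropy(haplotypes, sites):
    """Empirical Shannon entropy (in bits) of the alleles at the given sites."""
    counts = Counter(tuple(h[s] for s in sites) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())

def greedy_tag_snps(haplotypes, k):
    """Pick up to k sites, each time adding the site that most increases
    the joint entropy of the selected set."""
    n_sites = len(haplotypes[0])
    chosen = []
    for _ in range(min(k, n_sites)):
        candidates = [s for s in range(n_sites) if s not in chosen]
        best = max(candidates, key=lambda s: joint_entropy(haplotypes, chosen + [s]))
        chosen.append(best)
    return chosen
```

A site in perfect LD with an already chosen tag adds no entropy, so the greedy step skips it in favor of a site carrying new haplotype information.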
Journal of Computational Biology | 2004
Lei M. Li; Jong Hyun Kim; Michael S. Waterman
In this paper, we describe a method for statistical reconstruction of haplotypes from a set of aligned SNP fragments. We consider the case of a pair of homologous human chromosomes, one from the mother and the other from the father. After fragment assembly, we wish to reconstruct the two haplotypes of the parents. Given a set of potential SNP sites inferred from the assembly alignment, we wish to divide the fragment set into two subsets, each of which represents one chromosome. Our method is based on a statistical model of sequencing errors, compositional information, and haplotype memberships. We calculate probabilities of different haplotypes conditional on the alignment. Due to computational complexity, we first determine phases for neighboring SNPs. Then we connect them and construct haplotype segments. Also, we compute the accuracy or confidence of the reconstructed haplotypes. We discuss other issues, such as alternative methods, parameter estimation, computational efficiency, and relaxation of assumptions.