Lipi R. Acharya
University of New Orleans
Publication
Featured research published by Lipi R. Acharya.
Journal of Proteomics | 2015
Yaoyang Zhang; Tao Xu; Bing Shan; Jonathan R. Hart; Aaron Aslanian; Xuemei Han; Nobel C. Zong; Haomin Li; Howard Choi; Dong Wang; Lipi R. Acharya; Lisa Du; Peter K. Vogt; Peipei Ping; John R. Yates
Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics.
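The abstract does not say how ProteinInferencer estimates its protein FDR. As a rough illustration of what controlling a protein-level FDR can mean, the sketch below assumes the common target-decoy estimate (decoys accepted divided by targets accepted). The function name, the scores, and the decoy flags are hypothetical and are not taken from the paper.

```python
def fdr_threshold(scored_proteins, max_fdr):
    """Return the lowest score cutoff whose accepted set keeps
    decoys / targets <= max_fdr, or None if no cutoff qualifies.

    scored_proteins: iterable of (score, is_decoy) pairs.
    Assumes higher scores are better and scores are distinct.
    """
    ranked = sorted(scored_proteins, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted down to this score.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical protein scores; True marks a decoy database entry.
proteins = [
    (9.0, False), (8.0, False), (7.0, False),
    (6.0, True), (5.0, False), (4.0, True),
]

cutoff_25 = fdr_threshold(proteins, 0.25)  # 1 decoy among 4 targets is allowed
cutoff_0 = fdr_threshold(proteins, 0.0)    # no decoy may be accepted
```

With these made-up scores, the looser limit admits proteins down to the score where one decoy sits among four targets, while the strict limit stops just above the first decoy.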
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012
Lipi R. Acharya; Thair Judeh; Zhansheng Duan; Michael G. Rabbat; Dongxiao Zhu
Reconstruction of signaling pathway structures is essential to decipher complex regulatory relationships in living cells. The existing computational approaches often rely on unrealistic biological assumptions and do not explicitly consider signal transduction mechanisms. Signal transduction events refer to linear cascades of reactions from the cell surface to the nucleus and characterize a signaling pathway. In this paper, we propose a novel approach, Gene Set Gibbs Sampling (GSGS), to reverse engineer signaling pathway structures from gene sets related to the pathways. We hypothesize that signaling pathways are structurally an ensemble of overlapping linear signal transduction events which we encode as Information Flows (IFs). We infer signaling pathway structures from gene sets, referred to as Information Flow Gene Sets (IFGSs), corresponding to these events. Thus, an IFGS only reflects which genes appear in the underlying IF but not their ordering. GSGS offers a Gibbs sampling like procedure to reconstruct the underlying signaling pathway structure by sequentially inferring IFs from the overlapping IFGSs related to the pathway. In the proof-of-concept studies, our approach is shown to outperform the existing state-of-the-art network inference approaches using both continuous and discrete data generated from benchmark networks in the DREAM initiative. We perform a comprehensive sensitivity analysis to assess the robustness of our approach. Finally, we implement GSGS to reconstruct signaling mechanisms in breast cancer cells.
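To make the distinction between an Information Flow (an ordered chain) and its gene set (the same genes with the ordering discarded) concrete, here is a minimal sketch. It only illustrates the data representation and the size of the search space; it is not the GSGS sampler itself, and the gene names and helper functions are hypothetical.

```python
from itertools import permutations


def edges_from_flow(flow):
    """Directed edges of a linear chain: each gene points to the next one."""
    return set(zip(flow, flow[1:]))


def network_from_flows(flows):
    """A pathway structure as the union of edges over all overlapping flows."""
    edges = set()
    for flow in flows:
        edges |= edges_from_flow(flow)
    return edges


# Two hypothetical overlapping information flows (ordered).
flows = [("A", "B", "C"), ("A", "B", "D")]

# The corresponding gene sets keep membership but drop the ordering.
gene_sets = [frozenset(flow) for flow in flows]

# The structure that the ordered flows encode.
true_network = network_from_flows(flows)

# Every permutation of a gene set is a candidate ordering that an
# inference procedure would have to choose among.
candidate_orderings = list(permutations(sorted(gene_sets[0])))
```

A gene set of size n admits n! candidate orderings, which is why the orderings are inferred sequentially from the overlap between gene sets rather than enumerated.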
International Conference on Machine Learning and Applications | 2009
Lipi R. Acharya; Dongxiao Zhu
Estimating the correlation structure of a gene set is a ubiquitous problem in many pattern analyses of replicated molecular profiling data. However, the commonly used Maximum Likelihood Estimate (MLE) approaches do not automatically accommodate replicated measurements. Often, an ad hoc preprocessing step, e.g., averaging (weighted, unweighted, or something in between), is needed, which might wipe out important patterns of low magnitude and/or cancel out patterns of similar magnitude. We treat each replicate individually as a random variable and design a finite mixture model to estimate an optimal correlation structure from replicated molecular profiling data. Assuming that the measurements are independent, identically distributed (i.i.d.) samples from a mixture of two multivariate normal distributions, one with a constrained set of parameters and the other with an unconstrained parameter structure, we employ an Expectation-Maximization (EM) algorithm to estimate component parameters. We carry out a comparative study, including both simulations and real-world data analysis, to assess the estimation of correlation structure using the proposed model and the constrained model given by the first component of the mixture. The two models were further tested for their performance in clustering real-world data. The mixture model approach is shown to have better overall performance.
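The claim that averaging replicates can cancel out patterns of similar magnitude can be seen in a toy case. The sketch below is not the mixture model or the EM algorithm from the paper. It only contrasts per-replicate Pearson correlation with the correlation obtained after averaging, using made-up values and a hypothetical `pearson` helper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns None when either sequence has zero variance,
    because the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: two replicates of one gene across four samples,
# and a single profile of a second gene.
rep1 = [1.0, 2.0, 3.0, 4.0]
rep2 = [4.0, 3.0, 2.0, 1.0]
y = [1.0, 2.0, 3.0, 4.0]

# Averaging the replicates flattens the profile to a constant.
averaged = [(a + b) / 2 for a, b in zip(rep1, rep2)]

corr_rep1 = pearson(rep1, y)          # strong positive association
corr_rep2 = pearson(rep2, y)          # strong negative association
corr_averaged = pearson(averaged, y)  # undefined: the pattern is gone
```

Each replicate carries a clear pattern on its own, but the averaged profile has no variance left, so any downstream correlation analysis sees nothing. This is the kind of loss that motivates modeling replicates individually.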
International Conference on Bioinformatics | 2013
Thair Judeh; Thaer Jayyousi; Lipi R. Acharya; Robert G. Reynolds; Dongxiao Zhu
With the increasing availability of gene sets, novel approaches that focus on reconstructing networks from gene sets are of interest. Currently, few computational approaches explore the search space of candidate networks using a parallel search. As such, novel methods that employ search agents are needed to help better escape local optima. In particular, gene sets may model signal transduction events, which refer to linear chains or cascades of reactions starting at the cell membrane and ending at the cell nucleus. These events may be indirectly observed as a set of unordered and overlapping gene sets. Thus, the underlying goal is to reverse engineer the order information within each gene set to reconstruct the underlying source network. To achieve this goal, we developed the Gene Set Cultural Algorithm to discover the true order of the gene sets and to reconstruct the underlying network. In a proof of concept study, we show that the Gene Set Cultural Algorithm can satisfactorily reconstruct three E. coli networks from the DREAM initiative using simulated and unordered gene sets as the input.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011
Dongxiao Zhu; Lipi R. Acharya; Hui Zhang
Estimation of pairwise correlation from incomplete and replicated molecular profiling data is a ubiquitous problem in pattern discovery analysis, such as clustering and networking. However, existing methods solve this problem by ad hoc data imputation, followed by averaging and correlation coefficient type approaches, which might annihilate important patterns present in the molecular profiling data. Moreover, these approaches do not consider and exploit the underlying experimental design information that specifies the replication mechanisms. We develop an Expectation-Maximization (EM) type algorithm to estimate the correlation structure using incomplete and replicated molecular profiling data with an a priori known replication mechanism. The approach is sufficiently general to be applicable to any known replication mechanism. When the replication mechanism is unknown, it reduces to the parsimonious model introduced previously. The efficacy of our approach was first evaluated by comprehensively comparing various bivariate and multivariate imputation approaches using simulation studies. Results from real-world data analysis further confirmed the superior performance of the proposed approach over the commonly used approaches, where we assessed the robustness of the method using data sets with up to 30 percent missing values.
Briefings in Functional Genomics | 2016
Jie Hou; Lipi R. Acharya; Dongxiao Zhu; Jianlin Cheng
The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start by reviewing the research on biological pathways and then discuss key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed.
Congress on Evolutionary Computation | 2014
Thair Judeh; Thaer Jayyousi; Lipi R. Acharya; Robert G. Reynolds; Dongxiao Zhu
With the increasing availability of gene sets and pathway resources, novel approaches that combine both resources to reconstruct networks from gene sets are of interest. Currently, few computational approaches explore the search space of candidate networks using a parallel search. In particular, search agents employed by evolutionary computational approaches may better escape false peaks compared to previous approaches. It may also be hypothesized that gene sets may model signal transduction events, which refer to linear chains or cascades of reactions starting at the cell membrane and ending at the cell nucleus. These events may be indirectly observed as a set of unordered and overlapping gene sets. Thus, the goal is to reverse engineer the order information within each gene set to reconstruct the underlying source network using prior knowledge to limit the search space. We propose the Gene Set Cultural Algorithm (GSCA) to reconstruct networks from unordered gene sets. We introduce a robust heuristic based on the arborescence of a directed graph that performs well for random topological sort orderings across gene sets simulated for four E. coli networks and five Insilico networks from the DREAM3 and DREAM4 initiatives, respectively. Furthermore, GSCA performs favorably when reconstructing networks from randomly ordered gene sets for the aforementioned networks. Finally, we note that from a set of 23 gene sets discretized from a set of 300 S. cerevisiae expression profiles, GSCA reconstructs a network preserving most of the weak order information found in the KEGG Cell Cycle pathway, which was used as prior knowledge.
Quantitative Biology | 2017
Lu Wang; Lipi R. Acharya; Changxin Bai; Dongxiao Zhu
Background: The precision medicine approach holds great promise for tailored diagnosis, treatment and prevention. Individuals can differ vastly in their genomic information and genetic mechanisms, and hence have unique transcriptomic signatures. The development of precision medicine has demanded moving beyond DNA sequencing (DNA-Seq) to much more pointed RNA sequencing (RNA-Seq) [Cell, 2017, 168: 584‒599]. Results: Here we conduct a brief survey of recent methodology development in transcriptome assembly using RNA-Seq. Conclusions: Since transcriptomes in human disease are highly complex, dynamic and diverse, transcriptome assembly is playing an increasingly important role in precision medicine research to dissect the molecular mechanisms of human diseases.
EURASIP Journal on Bioinformatics and Systems Biology | 2015
Lipi R. Acharya; Robert G. Reynolds; Dongxiao Zhu
The study of signaling networks is important for a better understanding of cell behaviors (e.g., growth, differentiation, metabolism, apoptosis) and for gaining deeper insights into the molecular mechanisms of complex diseases. While there have been many successes in developing computational approaches for identifying potential genes and proteins involved in cell signaling, new methods are needed for identifying network structures that depict underlying signal cascading mechanisms. In this paper, we propose a new computational approach for inferring signaling network structures from overlapping gene sets related to the networks. In the proposed approach, a signaling network is represented as a directed graph and is viewed as a union of many active paths representing linear and overlapping chains of signal cascading activities in the network. Gene sets represent the sets of genes participating in active paths without prior knowledge of the order in which genes occur within each path. From a compendium of unordered gene sets, the proposed algorithm reconstructs the underlying network structure through evolution of synergistic active paths. In our context, the extent of edge overlap among active paths is used to define the synergy present in a network. We evaluated the performance of the proposed algorithm in terms of its convergence and its recovery of true active paths by utilizing four gene set compendiums derived from the KEGG database. Evaluation of the results demonstrates the ability of the algorithm to reconstruct the underlying networks with high accuracy and precision.
Bioinformatics and Biomedicine | 2009
Dongxiao Zhu; Guorong Xu; Lipi R. Acharya
Correlation-based pattern discovery from replicated molecular profiling data enables essential data mining tasks, such as discovering biomolecule association networks and functional modules. Unfortunately, the existing approaches are not tailored to analyze replicated measurements, a problem further complicated by the variety of replication mechanisms. With few exceptions, existing approaches average or summarize over replicates of diverse magnitude, which might wipe out important patterns of low magnitude and/or cancel out patterns of similar magnitude. The averaging or summarizing procedure, originally targeted for univariate differential expression analysis, has become a nuisance in multivariate correlation-based pattern discovery. Multivariate approaches that treat each replicate individually provide a promising alternative. Here we propose a multivariate parsimonious correlation model for replicated molecular profiling data with blind replication mechanisms, and a constrained (less parsimonious) correlation model that explicitly considers the informed replication mechanisms. We derive a generalized formula for correlation-based pattern discovery for both blind and informed replication mechanisms. To promote its use among the biomedical research community, we develop correlation-based pattern discovery software with a Graphical User Interface (GUI) for analyzing replicated molecular profiling data.