Publication


Featured research published by Ryan Gill.


BMC Medical Genomics | 2011

Modeling microRNA-mRNA Interactions Using PLS Regression in Human Colon Cancer

Xiaohong Li; Ryan Gill; Nigel G. F. Cooper; Jae Keun Yoo; Susmita Datta

Background: Changes in microRNA (miRNA) expression patterns have been extensively characterized in several cancers, including human colon cancer. However, how these miRNAs and their putative mRNA targets contribute to the etiology of cancer is poorly understood. In this work, a bioinformatics computational approach with miRNA and mRNA expression data was used to identify the putative targets of miRNAs and to construct association networks between miRNAs and mRNAs to gain some insights into the underlying molecular mechanisms of human colon cancer.

Method: The miRNA and mRNA microarray expression profiles from the same tissues, including 7 human colon tumor tissues and 4 normal tissues, collected by the Broad Institute, were used to identify significant associations between miRNA and mRNA. We applied the partial least squares (PLS) regression method and bootstrap-based statistical tests to the joint expression profiles of differentially expressed miRNAs and mRNAs. From this analysis, we predicted putative miRNA targets and association networks between miRNAs and mRNAs. Pathway analysis was employed to identify biological processes related to these miRNAs and their associated predicted mRNA targets.

Results: The up-regulated mRNAs most significantly associated with a down-regulated miRNA by the proposed methodology were considered to be the miRNA targets. On average, approximately 16.5% and 11.0% of targets predicted by this approach were also predicted as targets by the common prediction algorithms TargetScan and miRanda, respectively. We demonstrated that our method detects more targets than a simple correlation-based association. Integrative mRNA:miRNA predictive networks from our analysis were constructed with the aid of Cytoscape software. Pathway analysis validated the miRNAs through their predicted targets that may be involved in cancer-associated biological networks.

Conclusion: We have identified an alternative bioinformatics approach for predicting miRNA targets in human colon cancer and for reverse engineering the miRNA:mRNA network using inversely related mRNA and miRNA joint expression profiles. We demonstrated the superiority of our predictive method compared to the correlation-based target prediction algorithm through a simulation study. We anticipate that the unique miRNA targets predicted by the proposed method will advance the understanding of the molecular mechanism of colon cancer and will suggest novel therapeutic targets after further experimental validation.


PLOS ONE | 2017

A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data

Xiaohong Li; Guy N. Brock; Eric C. Rouchka; Nigel G. F. Cooper; Dongfeng Wu; Timothy E. O’Toole; Ryan Gill; Abdallah M. Eteleeb; Liz O’Brien; Shesh N. Rai

Normalization is an essential step with considerable impact on high-throughput RNA sequencing (RNA-seq) data analysis. Although there are numerous methods for read count normalization, it remains a challenge to choose an optimal method because multiple factors contribute to read count variability, which affects the overall sensitivity and specificity. In order to properly determine the most appropriate normalization methods, it is critical to compare the performance and shortcomings of a representative set of normalization routines based on different dataset characteristics. Therefore, we set out to evaluate the performance of the commonly used methods (DESeq, TMM-edgeR, FPKM-CuffDiff, TC, Med, UQ and FQ) and two new methods we propose: Med-pgQ2 and UQ-pgQ2 (per-gene normalization after per-sample median or upper-quartile global scaling). Our per-gene normalization approach allows for comparisons between conditions based on similar count levels. Using the benchmark Microarray Quality Control Project (MAQC) and simulated datasets, we performed differential gene expression analysis to evaluate these methods. When evaluating MAQC2 with two replicates, we observed that Med-pgQ2 and UQ-pgQ2 achieved a slightly higher area under the Receiver Operating Characteristic Curve (AUC), a specificity rate > 85%, a detection power > 92% and an actual false discovery rate (FDR) under 0.06 given the nominal FDR (≤0.05). Although the top commonly used methods (DESeq and TMM-edgeR) yield a higher power (>93%) for MAQC2 data, they trade this off against a reduced specificity (<70%) and a slightly higher actual FDR than our proposed methods. In addition, the results from an analysis based on the qualitative characteristics of sample distribution for MAQC2 and human breast cancer datasets show that only our gene-wise normalization methods corrected data skewed towards lower read counts. However, when we evaluated MAQC3 with less variation in five replicates, all methods performed similarly. Thus, our proposed Med-pgQ2 and UQ-pgQ2 methods perform slightly better for differential gene analysis of RNA-seq data skewed towards lowly expressed read counts with high variation by improving specificity while maintaining a good detection power with a control of the nominal FDR level.
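
The two-stage idea described in this abstract (per-sample upper-quartile global scaling followed by a per-gene step) can be illustrated with a rough NumPy sketch. This is not the published UQ-pgQ2 procedure: the function name, the use of each gene's median across samples as the per-gene scale, and the handling of zero medians are assumptions made for illustration, and any scaling constants used in the paper are not reproduced.

```python
import numpy as np

def uq_then_per_gene_median(counts):
    """Illustrative two-stage normalization (hypothetical helper).

    counts: genes x samples array of non-negative read counts.
    Assumes every sample has at least one nonzero count.
    """
    counts = np.asarray(counts, dtype=float)

    # Stage 1: per-sample global scaling by the upper quartile
    # (75th percentile) of that sample's nonzero counts.
    upper_quartiles = np.array(
        [np.percentile(col[col > 0], 75) for col in counts.T]
    )
    scaled = counts / upper_quartiles  # broadcasts over sample columns

    # Stage 2: per-gene scaling by the gene's median across samples.
    gene_median = np.median(scaled, axis=1, keepdims=True)
    # Assumption: genes whose median is zero are left unscaled.
    gene_median[gene_median == 0] = 1.0
    return scaled / gene_median
```

In this sketch, a sample sequenced at twice the depth of another ends up with the same normalized values, since the first stage removes the depth difference before the per-gene step is applied.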


Current Pharmaceutical Design | 2014

Differential Network Analysis in Human Cancer Research

Ryan Gill; Somnath Datta; Susmita Datta

A complex disease like cancer is rarely caused by a single gene or protein. It is usually caused by the perturbation of the network formed by several genes or proteins. In the last decade several research teams have attempted to construct interaction maps of genes and proteins, either experimentally or by reverse engineering them with computational techniques. These networks were usually created under a certain condition such as an environmental condition, a particular disease, or a specific tissue type. Lately, however, there has been greater emphasis on finding the differential structure of the existing network topology under a novel condition or disease status to elucidate the perturbation in a biological system. In this review/tutorial article we briefly mention some of the research done in this area; we mainly illustrate the computational/statistical methods developed by our team in recent years for differential network analysis using publicly available gene expression data collected from a well-known cancer study. These data include a group of patients with acute lymphoblastic leukemia and a group with acute myeloid leukemia. In particular, we describe the statistical tests to detect the change in the network topology based on connectivity scores which measure the association or interaction between pairs of genes. The tests under various scores are applied to this data set to perform a differential network analysis on gene expression for human leukemia. We believe that, in the future, differential network analysis will be a standard way to view the changes in gene expression and protein expression data globally and these types of tests could be useful in analyzing the complex differential signatures.


Bioinformation | 2014

dna: An R package for differential network analysis.

Ryan Gill; Somnath Datta; Susmita Datta

Differential network analysis provides a framework for examining if there is sufficient statistical evidence to conclude that the structure of a network differs under two experimental conditions or if the structures of two networks are different. The R package dna provides tools and procedures for differential network analysis of genomic data. The focus of this package is on gene-gene networks, but the methods are easily adaptable for more general biological processes. This package includes preprocessing tools for simultaneously preparing a pair of networks for analysis, procedures for computing connectivity scores between pairs of genes based on many available statistical techniques, and tools for handling modules of genes based on these scores. Also, procedures are provided for performing permutation tests based on these scores to determine if the connectivity of a gene differs between the two networks, to determine if the connectivity of a particular set of important genes differs between the two networks, and to determine if the overall module structure differs between the two networks. Several built-in options are available for the types of scores and distances used in the testing procedures, and additionally, the procedures provide flexible methods that allow the user to define custom scores and distances.

Availability: dna is freely available from the Comprehensive R Archive Network, http://CRAN.R-project.org/package=dna
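
The permutation test for whether a single gene's connectivity differs between two networks can be sketched in Python. This is not the API of the dna package, which is written in R: the function names, the use of absolute Pearson correlation as the connectivity score, and the mean absolute difference as the test statistic are all assumptions chosen for illustration.

```python
import numpy as np

def connectivity(data):
    """Absolute Pearson correlation between genes (samples x genes input)."""
    scores = np.abs(np.corrcoef(data, rowvar=False))
    np.fill_diagonal(scores, 0.0)  # ignore self-connections
    return scores

def gene_statistic(x1, x2, gene):
    """Mean absolute difference in one gene's scores across two conditions."""
    s1, s2 = connectivity(x1), connectivity(x2)
    others = np.arange(x1.shape[1]) != gene
    return np.mean(np.abs(s1[gene, others] - s2[gene, others]))

def permutation_pvalue(x1, x2, gene, n_perm=500, seed=0):
    """Permutation p-value obtained by shuffling samples between conditions."""
    rng = np.random.default_rng(seed)
    observed = gene_statistic(x1, x2, gene)
    pooled = np.vstack([x1, x2])
    n1 = x1.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        permuted = gene_statistic(pooled[idx[:n1]], pooled[idx[n1:]], gene)
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Shuffling sample labels between the two conditions approximates the null distribution of the statistic when both conditions share one network, which is the general logic behind permutation tests of this kind.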


Archive | 2014

Using RNA-seq Data to Detect Differentially Expressed Genes

Douglas J. Lorenz; Ryan Gill; Ritendranath Mitra; Susmita Datta

RNA-sequencing (RNA-seq) technology has become a major choice in detecting differentially expressed genes across different biological conditions. Although microarray technology is used for the same purpose, statistical methods available for identifying differential expression for microarray data are generally not readily applicable to the analysis of RNA-seq data, as RNA-seq data comprise discrete counts of reads mapped to particular genes. In this chapter, we review statistical methods uniquely developed for detecting differential expression among different populations of RNA-seq data as well as techniques designed originally for the analysis of microarray data that have been modified for the analysis of RNA-seq data. We include a very brief description of the normalization of RNA-seq data and then elaborate on parametric and nonparametric testing procedures, as well as empirical and fully Bayesian methods. We include a brief review of software available for the analysis of differential expression and summarize the results of a recent comprehensive simulation study comparing existing methods.


Communications in Statistics-theory and Methods | 2006

Detecting abrupt leaks in blended underground storage tanks

Ryan Gill; Jerome P. Keating; Michael Baron

We suggest and compare two multiple change-point algorithms (segmentation and sequential) for accurate detection of the onset of abrupt leaks in blended underground storage tanks. We apply these algorithms to two simulated scenarios to demonstrate the advantages of the sequential algorithm, and then we apply the sequential algorithm to the Cary blended site data. In addition, we obtain a confidence set for the locations of the change points conditional on the number of change points by inverting the related hypothesis test.


Systems Biomedicine | 2014

An integrative exploratory analysis of –omics data from the ICGC cancer genomes lung adenocarcinoma study

Sinjini Sikdar; Hyoyoung Choo Wosoba; Younathan Abdia; Sandipan Dutta; Ryan Gill; Somnath Datta; Susmita Datta

It is known that all agents that cause cancer (carcinogens) also cause a change in the DNA sequence. In order to identify such often subtle changes, we attempt to integrate multiple molecular profile data sets released by the International Cancer Genome Consortium (ICGC). The list of data sets includes matched gene and microRNA expression profiles, somatic copy number variation, DNA methylation, and protein expression profiles for lung adenocarcinoma patients receiving treatments. We consider both unsupervised and supervised learning techniques (clustering and penalized regression) to identify interesting molecular markers corresponding to each type of –omics profiles that can differentiate patients. Associations between important markers of 2 types have been studied. An adaptive ensemble binary regression model has been presented that uses the entirety of available –omics profiles leading to a more accurate clinical prognosis for the patients in the given sample. This integrated study provides a more comprehensive picture of lung adenocarcinoma.


Computational Statistics & Data Analysis | 2007

Computation of estimates in segmented regression and a liquidity effect model

Ryan Gill; Kiseop Lee; Seongjoo Song

Weighted least squares (WLS) estimation in segmented regression with multiple change points is considered. A computationally efficient algorithm for calculating the WLS estimate of a single change point is derived. Then, iterative methods of approximating the global solution of the multiple change-point problem based on estimating change points one at a time are discussed. It is shown that these results can also be applied to a liquidity effect model in finance with multiple change points. The liquidity effect model we consider is a generalization of one proposed by Cetin et al. [2006. Pricing options in an extended Black Scholes economy with illiquidity: theory and empirical evidence. Rev. Financial Stud. 19, 493-529], allowing the magnitude of the liquidity effect to depend on the size of a trade. Two data sets are used to illustrate these methods.
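
For orientation, the single change-point WLS problem can be approximated by a brute-force grid search, sketched below in Python. This is a simple stand-in and not the computationally efficient algorithm derived in the paper. It assumes a continuous broken-line model y = b0 + b1*x + b2*max(x - c, 0); the function name and the default candidate grid (the interior observed x values) are hypothetical.

```python
import numpy as np

def fit_single_change_point(x, y, weights=None, candidates=None):
    """Grid-search WLS fit of y = b0 + b1*x + b2*max(x - c, 0).

    Returns (change_point, coefficients, weighted_sum_of_squares).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    if candidates is None:
        # Assumption: search only over interior observed x values.
        candidates = np.unique(x)[1:-1]

    sqrt_w = np.sqrt(w)
    best = None
    for c in candidates:
        design = np.column_stack([np.ones_like(x), x, np.maximum(x - c, 0.0)])
        # WLS is ordinary least squares on rows scaled by sqrt(weight).
        coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w, rcond=None)
        resid = y - design @ coef
        wss = float(np.sum(w * resid ** 2))
        if best is None or wss < best[0]:
            best = (wss, float(c), coef)
    return best[1], best[2], best[0]
```

Each candidate requires a full refit here, so the cost grows with the number of candidates. Multiple change points could then be approximated by repeating a search of this kind one change point at a time, as the abstract outlines.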


Briefings in Bioinformatics | 2016

Improving protein identification from tandem mass spectrometry data by one-step methods and integrating data from other platforms

Sinjini Sikdar; Ryan Gill; Susmita Datta

Motivation: Many approaches have been proposed for the protein identification problem based on tandem mass spectrometry (MS/MS) data. In these experiments, proteins are digested into peptides and the resulting peptide mixture is subjected to mass spectrometry. Some interesting putative peptide features (peaks) are selected from the mass spectra. Following that, the precursor ions undergo fragmentation and are analyzed by MS/MS. The process of identification of peptides from the mass spectra and the constituent proteins in the sample is called protein identification from MS/MS data. There are many two-step protein identification procedures, reviewed in the literature, which first attempt to identify the peptides in a separate process and then use these results to infer the proteins. However, in recent years, there have been attempts to provide a one-step solution to protein identification, which simultaneously identifies the proteins and the peptides in the sample.

Results: In this review, we briefly introduce the most popular two-step protein identification procedure, PeptideProphet coupled with ProteinProphet. Following that, we describe the difficulties with two-step procedures and review some recently introduced one-step protein/peptide identification procedures that do not suffer from these issues. The focus of this review is on one-step procedures that are based on statistical likelihood-based models, but some discussion of other one-step procedures is also included. We report comparative performances of one-step and two-step methods, which support the overall superiority of one-step procedures. We also cover some recent efforts to improve protein identification by incorporating other molecular data along with MS/MS data.


Systems Biomedicine | 2014

Bridging in vivo and in vitro data from Japanese Toxicogenomics Project using network analyses

Ryan Gill; Somnath Datta; Susmita Datta

Since experiments involving animal models are labor and time intensive, there are attempts to replace these measurements on animal models with in vitro assays, which have higher public acceptance on ethical grounds. In this work, we explore to what extent animal models can be replaced by in vitro assays in the context of a toxicogenomics study. The data from the Japanese Toxicogenomics Project are gene expression profiles measured by microarrays from both in vitro and animal samples. We apply a comprehensive genomic association network analysis in order to study the comparative behavior of the genomic networks for the in vivo vs. in vitro data. The genomic networks are computed based on association scores of gene-gene pairs using a partial least squares modeling of gene expression values adjusted for sacrifice time and dosage. We apply permutation-based statistical tests to compare the connectivity of a given gene, as well as a class of genes, in the two networks which may be affected by a given drug. The goal is to identify parts of these networks, including key genes, that are not significantly altered for in vivo vs. in vitro samples for the majority of the drugs.

Collaboration


Top Co-Authors

Susmita Datta

University of Louisville

Dongfeng Wu

University of Louisville
