Publication


Featured research published by Sergio Contrino.


Bioinformatics | 2012

InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data

Richard N. Smith; Jelena Aleksic; Daniela Butano; Adrian Carr; Sergio Contrino; Fengyuan Hu; Mike Lyne; Rachel Lyne; Alex Kalderimis; Kim Rutherford; Radek Stepan; Julie Sullivan; Matthew Wakeling; Xavier Watkins; Gos Micklem

Summary: InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of ‘widgets’ performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages. Availability: Freely available from http://www.intermine.org under the LGPL license. Supplementary information: Supplementary data are available at Bioinformatics online.


Nucleic Acids Research | 2012

modMine: flexible access to modENCODE data

Sergio Contrino; Richard N. Smith; Daniela Butano; Adrian Carr; Fengyuan Hu; Rachel Lyne; Kim Rutherford; Alexis Kalderimis; Julie Sullivan; Seth Carbon; E. Kephart; P. Lloyd; Eo Stinson; Nicole L. Washington; M. Perry; P. Ruzanov; Z. Zha; Suzanna E. Lewis; Lincoln Stein; Gos Micklem

In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database (http://intermine.modencode.org) described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.


Nucleic Acids Research | 2015

Araport: the Arabidopsis Information Portal

Vivek Krishnakumar; Matthew R. Hanlon; Sergio Contrino; Erik S. Ferlanti; Svetlana Karamycheva; Maria Kim; Benjamin D. Rosen; Chia-Yi Cheng; Walter Moreira; Stephen A. Mock; Joe Stubbs; Julie Sullivan; Konstantinos Krampis; Jason R. Miller; Gos Micklem; Matthew W. Vaughn; Christopher D. Town

The Arabidopsis Information Portal (https://www.araport.org) is a new online resource for plant biology research. It houses the Arabidopsis thaliana genome sequence and associated annotation. It was conceived as a framework that allows the research community to develop and release ‘modules’ that integrate, analyze and visualize Arabidopsis data that may reside at remote sites. The current implementation provides an indexed database of core genomic information. These data are made available through feature-rich web applications that provide search, data mining, and genome browser functionality, and also by bulk download and web services. Araport uses software from the InterMine and JBrowse projects to expose curated data from TAIR, GO, BAR, EBI, UniProt, PubMed and EPIC CoGe. The site also hosts ‘science apps,’ developed as prototypes for community modules that use dynamic web pages to present data obtained on-demand from third-party servers via RESTful web services. Designed for sustainability, the Arabidopsis Information Portal strategy exploits existing scientific computing infrastructure, adopts a practical mixture of data integration technologies and encourages collaborative enhancement of the resource by its user community.


Nucleic Acids Research | 2014

InterMine: extensive web services for modern biology

Alex Kalderimis; Rachel Lyne; Daniela Butano; Sergio Contrino; Mike Lyne; Joshua Heimbach; Fengyuan Hu; Richard N. Smith; Radek Štěpán; Julie Sullivan; Gos Micklem

InterMine (www.intermine.org) is a biological data warehousing system providing extensive automatically generated and configurable RESTful web services that underpin the web interface and can be re-used in many other applications: to find and filter data; export it in a flexible and structured way; to upload, use, manipulate and analyze lists; to provide services for flexible retrieval of sequence segments, and for other statistical and analysis tools. Here we describe these features and discuss how they can be used separately or in combinations to support integrative and comparative analysis.


Database | 2011

The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details.

Nicole L. Washington; Eo Stinson; M. Perry; P. Ruzanov; Sergio Contrino; Richard N. Smith; Z. Zha; Rachel Lyne; Adrian Carr; P. Lloyd; E. Kephart; Sheldon J. McKay; Gos Micklem; Lincoln Stein; Suzanna E. Lewis

The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at http://www.modencode.org. Database URL: http://www.modencode.org.


Genesis | 2015

Cross‐organism analysis using InterMine

Rachel Lyne; Julie Sullivan; Daniela Butano; Sergio Contrino; Joshua Heimbach; Fengyuan Hu; Alex Kalderimis; Mike Lyne; Richard N. Smith; Radek Štěpán; Rama Balakrishnan; Gail Binkley; Todd W. Harris; Kalpana Karra; Sierra A. T. Moxon; Howie Motenko; Steven B. Neuhauser; Leyla Ruzicka; Mike Cherry; Joel E. Richardson; Lincoln Stein; Monte Westerfield; Elizabeth A. Worthey; Gos Micklem

InterMine is a data integration warehouse and analysis software system developed for large and complex biological data sets. Designed for integrative analysis, it can be accessed through a user‐friendly web interface. For bioinformaticians, extensive web services as well as programming interfaces for most common scripting languages support access to all features. The web interface includes a useful identifier look‐up system, and both simple and sophisticated search options. Interactive results tables enable exploration, and data can be filtered, summarized, and browsed. A set of graphical analysis tools provide a rich environment for data exploration including statistical enrichment of sets of genes or other entities. InterMine databases have been developed for the major model organisms, budding yeast, nematode worm, fruit fly, zebrafish, mouse, and rat together with a newly developed human database. Here, we describe how this has facilitated interoperation and development of cross‐organism analysis tools and reports. InterMine as a data exploration and analysis tool is also described. All the InterMine‐based systems described in this article are resources freely available to the scientific community. genesis 53:547–560, 2015.


BMC Genomics | 2013

Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE

Quang M. Trinh; Fei-Yang Arthur Jen; Ziru Zhou; Kar Ming Chu; M. Perry; E. Kephart; Sergio Contrino; P. Ruzanov; Lincoln Stein

Background: Funded by the National Institutes of Health (NIH), the aim of the Model Organism ENCyclopedia of DNA Elements (modENCODE) project is to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for both model organisms C. elegans (worm) and D. melanogaster (fly). With a total size of just under 10 terabytes of data collected and released to the public, one of the challenges faced by researchers is to extract biologically meaningful knowledge from this large data set. While the basic quality control, pre-processing, and analysis of the data has already been performed by members of the modENCODE consortium, many researchers will wish to reinterpret the data set using modifications and enhancements of the original protocols, or combine modENCODE data with other data sets. Unfortunately this can be a time consuming and logistically challenging proposition. Results: In recognition of this challenge, the modENCODE DCC has released uniform computing resources for analyzing modENCODE data on Galaxy (https://github.com/modENCODE-DCC/Galaxy), on the public Amazon Cloud (http://aws.amazon.com), and on the private Bionimbus Cloud for genomic research (http://www.bionimbus.org). In particular, we have released Galaxy workflows for interpreting ChIP-seq data which use the same quality control (QC) and peak calling standards adopted by the modENCODE and ENCODE communities. For convenience of use, we have created Amazon and Bionimbus Cloud machine images containing Galaxy along with all the modENCODE data, software and other dependencies. Conclusions: Using these resources provides a framework for running consistent and reproducible analyses on modENCODE data, ultimately allowing researchers to use more of their time using modENCODE data, and less time moving it around.


Plant and Cell Physiology | 2016

ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery

Vivek Krishnakumar; Sergio Contrino; Chia-Yi Cheng; Irina Belyaeva; Erik S. Ferlanti; Jason R. Miller; Matthew W. Vaughn; Gos Micklem; Christopher D. Town; Agnes P. Chan

ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information of the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq and array expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a wide variety of public resources. Users can browse gene-specific data through Gene Report pages, identify and create gene lists based on experiments or indexed keywords, and run GO enrichment analysis to investigate the biological significance of selected gene sets. Developed by the Arabidopsis Information Portal project (Araport, https://www.araport.org/), ThaleMine uses the InterMine software framework, which builds well-structured data, and provides powerful data query and analysis functionality. The warehoused data can be accessed by users via graphical interfaces, as well as programmatically via web-services. Here we describe recent developments in ThaleMine including new features and extensions, and discuss future improvements. InterMine has been broadly adopted by the model organism research community including nematode, rat, mouse, zebrafish, budding yeast, the modENCODE project, as well as being used for human data. ThaleMine is the first InterMine developed for a plant model. As additional new plant InterMines are developed by the legume and other plant research communities, the potential of cross-organism integrative data analysis will be further enabled.


F1000Research | 2017

Forever in BlueGenes: A next-generation genomic data interface powered by InterMine

Yo Yehudi; Daniela Butano; Matthew Chadwick; Justin Clark-Casey; Sergio Contrino; Joshua Heimbach; Rachel Lyne; Julie Sullivan; Gos Micklem


SWAT4LS | 2016

Making Linked Data SPARQL with the InterMine Biological Data Warehouse.

Maxime Déraspe; Gail Binkley; Daniela Butano; Matthew Chadwick; J. Michael Cherry; Justin Clark-Casey; Sergio Contrino; Jacques Corbeil; Joshua Heimbach; Kalpana Karra; Rachel Lyne; Julie Sullivan; Yo Yehudi; Gos Micklem; Michel Dumontier

Collaboration


Top co-authors of Sergio Contrino.

Gos Micklem, University of Cambridge

Rachel Lyne, University of Cambridge

Fengyuan Hu, University of Cambridge

Lincoln Stein, Ontario Institute for Cancer Research

Adrian Carr, University of Cambridge