Pablo Marin-Garcia
University of Valencia
Publication
Featured research published by Pablo Marin-Garcia.
Nature Genetics | 2015
Ilkka Lappalainen; Jeff Almeida-King; Vasudev Kumanduri; Alexander Senf; John Dylan Spalding; Saif ur-Rehman; Gary Saunders; Jag Kandasamy; Mario Caccamo; Rasko Leinonen; Brendan Vaughan; Thomas Laurent; Francis Rowland; Pablo Marin-Garcia; Jonathan Barker; Petteri Jokinen; Angel Carreño Torres; Jordi Rambla de Argila; Oscar Martinez Llobet; Ignacio Medina; Marc Sitges Puy; Mario Alberich; Sabela de la Torre; Arcadi Navarro; Justin Paschall; Paul Flicek
Paul Flicek and colleagues provide an update on the European Genome-phenome Archive (EGA), a service of the European Bioinformatics Institute (EMBL-EBI) and the Centre for Genomic Regulation (CRG). The authors describe the EGA policies and infrastructure, how access decisions are made, methods for data submission and future plans for expansion of this database.
BMC Genomics | 2010
Yuan Chen; Fiona Cunningham; Daniel Ríos; William M. McLaren; James Smith; Bethan Pritchard; Giulietta Spudich; Simon Brent; Eugene Kulesha; Pablo Marin-Garcia; Damian Smedley; Ewan Birney; Paul Flicek
Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.
Scientific Reports | 2015
M. Mar Rodríguez; Daniel R. Perez; Felipe Javier Chaves; Eduardo Esteve; Pablo Marin-Garcia; Joan Vendrell; Mariona Jové; Reinald Pamplona; Wifredo Ricart; Manuel Portero-Otin; Matilde R. Chacón; José Manuel Fernández Real
The human intestine is home to a diverse range of bacterial and fungal species, forming an ecological community that contributes to normal physiology and disease susceptibility. Here, the fungal microbiota (mycobiome) in obese and non-obese subjects was characterized using Internal Transcribed Spacer (ITS)-based sequencing. The results demonstrate that obese patients could be discriminated by their specific fungal composition, which also distinguished metabolically “healthy” from “unhealthy” obesity. Clusters according to genus abundance co-segregated with body fatness, fasting triglycerides and HDL-cholesterol. A preliminary link to metabolites such as hexadecanedioic acid, caproic acid and N-acetyl-L-glutamic acid was also found. Mucor racemosus and M. fuscus were the species more represented in non-obese subjects compared to obese counterparts. Interestingly, the decreased relative abundance of the Mucor genus in obese subjects was reversible upon weight loss. Collectively, these findings suggest that manipulation of gut mycobiome communities might be a novel target in the treatment of obesity.
Nucleic Acids Research | 2015
Roberto Alonso; Francisco Salavert; Francisco García-García; Marta Bleda; Luz Garcia-Alonso; Alba Sanchis-Juan; Daniel Perez-Gil; Pablo Marin-Garcia; Rubén Sánchez; Cankut Cubuk; Marta R. Hidalgo; Alicia Amadoz; Rosa D. Hernansaiz-Ballesteros; Alejandro Alemán; Joaquín Tárraga; David Montaner; Ignacio Medina; Joaquín Dopazo
Babelomics has been running for more than a decade, offering a user-friendly interface for the functional analysis of gene expression and genomic data. Here we present its fifth release, which includes support for Next Generation Sequencing data, including gene expression (RNA-seq) and exome or genome resequencing. Babelomics has simplified its interface, which is now more intuitive. Improved visualization options, such as a genome viewer and an interactive network viewer, have been implemented. New technical enhancements on both the client and server sides make the user experience faster and more dynamic. Babelomics offers user-friendly access to a full range of methods that cover: (i) primary data analysis, (ii) a variety of tests for different experimental designs and (iii) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context. In addition to the public server, local copies of Babelomics can be downloaded and installed. Babelomics is freely available at: http://www.babelomics.org.
PLOS Genetics | 2009
Melissa K. Boles; Bonney Wilkinson; Laurens Wilming; Bin Liu; Frank J. Probst; Jennifer Harrow; Darren Grafham; Kathryn E. Hentges; Lanette P. Woodward; Andrea Maxwell; Karen Mitchell; Michael Risley; Randy L. Johnson; Karen K. Hirschi; James R. Lupski; Yosuke Funato; Hiroaki Miki; Pablo Marin-Garcia; Lucy Matthews; Alison J. Coffey; Anne Parker; Tim Hubbard; Jane Rogers; Allan Bradley; David J. Adams; Monica J. Justice
An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000 annotated exons and boundaries from over 900 genes in 41 recessive mutant mouse lines that were isolated in an N-ethyl-N-nitrosourea (ENU) mutation screen targeted to mouse Chromosome 11. Fifty-nine sequence variants were identified in 55 genes from 31 mutant lines. 39% of the lesions lie in coding sequences and create primarily missense mutations. The other 61% lie in noncoding regions, many of them in highly conserved sequences. A lesion in the perinatal lethal line l11Jus13 alters a consensus splice site of nucleoredoxin (Nxn), inserting 10 amino acids into the resulting protein. We conclude that point mutations can be accurately and sensitively recovered by large-scale sequencing, and that conserved noncoding regions should be included for disease mutation identification. Only seven of the candidate genes we report have been previously targeted by mutation in mice or rats, showing that despite ongoing efforts to functionally annotate genes in the mammalian genome, an enormous gap remains between phenotype and function. Our data show that the classical positional mapping approach of disease mutation identification can be extended to large target regions using high-throughput sequencing.
BMC Bioinformatics | 2017
Antonio Fabregat; Konstantinos Sidiropoulos; Guilherme Viteri; Oscar Forner; Pablo Marin-Garcia; Vicente Arnau; Peter D’Eustachio; Lincoln Stein; Henning Hermjakob
Background: Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; one of the main problems, from the point of view of analysis performance, is the constantly increasing size of the data samples. Results: Here, we present a new high-performance in-memory implementation of the well-established over-representation analysis method. To achieve the target, the over-representation analysis method is divided into four different steps and, for each of them, specific data structures are used to improve performance and minimise the memory footprint. The first step, finding out whether an identifier in the user’s sample corresponds to an entity in Reactome, is addressed using a radix tree as a lookup table. The second step, modelling the proteins, chemicals, their orthologues in other species and their composition in complexes and sets, is addressed with a graph. The third and fourth steps, which aggregate the results and calculate the statistics, are solved with a double-linked tree. Conclusion: Through the use of highly optimised, in-memory data structures and algorithms, Reactome has achieved a stable, high performance pathway analysis service, enabling the analysis of genome-wide datasets within seconds, allowing interactive exploration and analysis of high throughput data. The proposed pathway analysis approach is available in the Reactome production web site either via the AnalysisService for programmatic access or the user submission interface integrated into the PathwayBrowser. Reactome is an open data and open source project and all of its source code, including the one described here, is available in the AnalysisTools repository in the Reactome GitHub (https://github.com/reactome/).
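As an illustration of the first and last steps described above, the following minimal sketch pairs a prefix-tree identifier lookup with an over-representation statistic. It is not Reactome's implementation: a plain character trie stands in for the radix tree, the hypergeometric tail probability is an assumed choice of statistic (the abstract does not name one), and all class, function and identifier names are hypothetical.

```python
from math import comb


class Trie:
    """Plain character trie used as an identifier lookup table.

    A radix tree would additionally merge single-child chains; this
    simpler variant is an assumption made to keep the sketch short.
    """

    def __init__(self):
        self.children = {}
        self.entity = None  # set when an identifier ends at this node

    def insert(self, identifier, entity):
        node = self
        for ch in identifier:
            node = node.children.setdefault(ch, Trie())
        node.entity = entity

    def lookup(self, identifier):
        node = self
        for ch in identifier:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.entity


def hypergeom_pvalue(hits, sample_size, pathway_size, universe_size):
    """P(X >= hits) when sample_size entities are drawn without replacement
    from a universe that contains pathway_size pathway members."""
    upper = min(sample_size, pathway_size)
    total = comb(universe_size, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def over_representation(sample_ids, lookup, pathways, universe_size):
    """Map identifiers to entities, then score each pathway (hypothetical API)."""
    found = {lookup.lookup(i) for i in sample_ids} - {None}
    return {
        name: hypergeom_pvalue(len(found & members), len(found),
                               len(members), universe_size)
        for name, members in pathways.items()
    }
```

With a universe of 10 entities, a pathway of 3 members and a sample that hits all 3, the sketch returns 1/120, the probability of drawing exactly those three entities by chance.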
PLOS Computational Biology | 2018
Antonio Fabregat; Florian Korninger; Guilherme Viteri; Konstantinos Sidiropoulos; Pablo Marin-Garcia; Peipei Ping; Guanming Wu; Lincoln Stein; Peter D’Eustachio; Henning Hermjakob
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
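The efficiency argument above can be pictured with a small in-memory sketch: when each node keeps direct references to its neighbours, following a relationship is a single lookup on the node rather than a join over a link table. The code below is only an analogy in plain Python, not Neo4j or the ContentService; the toy hierarchy and every identifier in it are invented for illustration.

```python
from collections import deque


def descendants(graph, start):
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Each hop is one dictionary lookup on the current node.
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical toy hierarchy (not real Reactome identifiers).
toy_graph = {
    "pathway:signalling": ["pathway:sub_a", "reaction:r1"],
    "pathway:sub_a": ["reaction:r2", "reaction:r3"],
    "reaction:r3": ["entity:protein_x"],
}

contained = descendants(toy_graph, "pathway:signalling")
```

The traversal cost grows with the number of edges actually visited, which is the property that makes deeply nested pathway hierarchies cheaper to walk in a graph model.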
Bioinformatics | 2018
Antonio Fabregat; Konstantinos Sidiropoulos; Guilherme Viteri; Pablo Marin-Garcia; Peipei Ping; Lincoln Stein; Peter D’Eustachio; Henning Hermjakob; Janet Kelso
Motivation: Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. For web-based pathway visualization, Reactome uses a custom pathway diagram viewer that has been evolved over the past years. Here, we present comprehensive enhancements in usability and performance based on extensive usability testing sessions and technology developments, aiming to optimize the viewer towards the needs of the community. Results: The pathway diagram viewer version 3 achieves consistently better performance, loading and rendering of 97% of the diagrams in Reactome in less than 1 s. Combining the multi-layer html5 canvas strategy with a space partitioning data structure minimizes CPU workload, enabling the introduction of new features that further enhance user experience. Through the use of highly optimized data structures and algorithms, Reactome has boosted the performance and usability of the new pathway diagram viewer, providing a robust, scalable and easy-to-integrate solution to pathway visualization. As graph-based visualization of complex data is a frequent challenge in bioinformatics, many of the individual strategies presented here are applicable to a wide range of web-based bioinformatics resources. Availability and implementation: Reactome is available online at: https://reactome.org. The diagram viewer is part of the Reactome pathway browser (https://reactome.org/PathwayBrowser/) and also available as a stand-alone widget at: https://reactome.org/dev/diagram/. The source code is freely available at: https://github.com/reactome-pwp/diagram. Supplementary information: Supplementary data are available at Bioinformatics online.
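The abstract does not say which space partitioning data structure the viewer uses. As one common possibility, the sketch below uses a quadtree to find the diagram items that overlap a region (for example the visible viewport or the area under the cursor) without testing every item. All names are hypothetical, and the sketch assumes every inserted box lies within the root bounds.

```python
class Box:
    """Axis-aligned rectangle given by its top-left corner and size."""

    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h

    def intersects(self, other):
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)

    def contains(self, other):
        return (self.x <= other.x and other.x + other.w <= self.x + self.w
                and self.y <= other.y and other.y + other.h <= self.y + self.h)


class QuadTree:
    def __init__(self, bounds, capacity=4, depth=0, max_depth=8):
        self.bounds = bounds
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []        # (box, payload) pairs stored at this node
        self.children = None   # four sub-quadrants once the node is split

    def insert(self, box, payload):
        if self.children is not None:
            for child in self.children:
                if child.bounds.contains(box):
                    child.insert(box, payload)
                    return
            self.items.append((box, payload))  # straddles quadrants: keep here
            return
        self.items.append((box, payload))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        b = self.bounds
        hw, hh = b.w / 2, b.h / 2
        self.children = [
            QuadTree(Box(b.x + dx * hw, b.y + dy * hh, hw, hh),
                     self.capacity, self.depth + 1, self.max_depth)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pending, self.items = self.items, []
        for box, payload in pending:
            self.insert(box, payload)  # redistribute into the new quadrants

    def query(self, region):
        """Return payloads of all stored boxes that overlap region."""
        if not self.bounds.intersects(region):
            return []  # prune the whole subtree
        found = [p for box, p in self.items if box.intersects(region)]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(region))
        return found
```

A viewer could call `query` with the viewport rectangle to decide what to draw and with a small box around the pointer for hover detection; subtrees whose bounds do not overlap the region are skipped entirely, which is where the saving in CPU work comes from.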
Scientific Reports | 2016
M. Mar Rodríguez; Daniel R. Perez; Felipe Javier Chaves; Eduardo Esteve; Pablo Marin-Garcia; Joan Vendrell; Mariona Jové; Reinald Pamplona; Wifredo Ricart; Manuel Portero-Otin; Matilde R. Chacón; José Manuel Fernández Real
Scientific Reports 5: Article number: 14600; doi: 10.1038/srep14600; published online: 12 October 2015; updated: 24 February 2016. In this Article, Figure 4g is a duplication of Figure 5a. The correct Figure 4g appears in the correction as Fig. 1 (figure not reproduced here).
Oncotarget | 2017
Ana-Barbara García-García; M. Carmen Gómez-Mateo; Rebeca Hilario; Pilar Rentero-Garrido; Alvaro Martínez-Domenech; Veronica Gonzalez-Albert; A. Cervantes; Pablo Marin-Garcia; Felipe Javier Chaves; Antonio Ferrández-Izquierdo; Luis Sabater
Background: Pancreatic ductal adenocarcinoma (PDAC) is one of the most devastating malignancies in developed countries because of its very poor prognosis and high mortality rates. By the time PDAC is diagnosed, usually only 20-25% of patients are candidates for surgery, and the rate of survival for this cancer is low even when a patient with PDAC does undergo surgery. Lymph node invasion is an extremely poor prognostic factor for this disease. Methods: We analyzed the mRNA expression profile in 30 PDAC samples from patients with resectable local disease (stages I and II). Neoplastic cells were isolated by laser-microdissection in order to avoid sample ‘contamination’ by non-tumor cells. Due to important differences in the prognoses of PDAC patients with and without lymph node involvement (stage IIB and stages I-IIA, respectively), we also analyzed the association between the mRNA expression profiles from these groups of patients and their survival. Results: We identified expression profiles associated with patient survival in the whole patient cohort and in each group (stage IIB samples or stage I-IIA samples). Our results indicate that survival-associated genes are different in the groups with and without affected lymph nodes. Survival curves indicate that these expression profiles can help physicians to improve the prognostic classification of patients.