Publication


Featured research published by David Pérez-Rey.


international conference on biological and medical data analysis | 2006

OntoDataClean: ontology-based integration and preprocessing of distributed data

David Pérez-Rey; Alberto Anguita; José Crespo

Within the knowledge discovery in databases (KDD) process, the phases preceding data mining consume most of the time spent analysing data. Few research efforts have been carried out in these steps compared to data mining, suggesting that new approaches and tools are needed to support the preparation of data. In this regard, we present in this paper a new methodology of ontology-based KDD adopting a federated approach to database integration and retrieval. Within this model, an ontology-based system called OntoDataClean has been developed dealing with instance-level integration and data preprocessing. Within the OntoDataClean development, a preprocessing ontology was built to store the information about the required transformations. Various biomedical experiments were carried out, showing that data have been correctly transformed using the preprocessing ontology. Although OntoDataClean does not cover every possible data transformation, it suggests that ontologies are a suitable mechanism to improve quality in the various steps of KDD processes.


IEEE Journal of Biomedical and Health Informatics | 2015

Semantic Normalization and Query Abstraction Based on SNOMED-CT and HL7: Supporting Multicentric Clinical Trials

Sergio Paraiso-Medina; David Pérez-Rey; Anca I. D. Bucur; Brecht Claerhout; Raúl Alonso-Calvo

Advances in the use of omic data and other biomarkers are increasing the number of variables in clinical research. Additional data have stratified the population of patients and require that current studies be performed among multiple institutions. Semantic interoperability and standardized data representation are crucial tasks in the management of modern clinical trials. In the past few years, different efforts have focused on integrating biomedical information. Due to the complexity of this domain and the specific requirements of clinical research, the majority of data integration tasks are still performed manually. This paper presents a semantic normalization process and a query abstraction mechanism to facilitate data integration and retrieval. A process based on well-established standards from the biomedical domain and the latest semantic web technologies has been developed. Methods proposed in this paper have been tested within the EURECA EU research project, where clinical scenarios require the extraction of semantic knowledge from biomedical vocabularies. The aim of this paper is to provide a novel method to abstract from the data model and query syntax. The proposed approach has been compared with other initiatives in the field by storing the same dataset with each of those solutions. Results show an extended functionality and query capabilities at the cost of slightly worse performance in query execution. Implementations in real settings have shown that following this approach, usable interfaces can be developed to exploit clinical trial data outcomes.


Computing | 2012

Nanoinformatics: developing new computing applications for nanomedicine

Victor Maojo; Martin Fritts; Fernando Martín-Sánchez; Diana de la Iglesia; Raul E. Cachau; Miguel García-Remesal; José Crespo; Joyce A. Mitchell; Alberto Anguita; Nathan A. Baker; José María Barreiro; Sonia E. Benítez; Guillermo de la Calle; Julio C. Facelli; Peter Ghazal; Antoine Geissbuhler; Fernando D. González-Nilo; Norbert Graf; Pierre Grangeat; Isabel Hermosilla; Rada Hussein; Josipa Kern; Sabine Koch; Yannick Legré; Victoria López-Alonso; Guillermo López-Campos; Luciano Milanesi; Vassilis Moustakis; Cristian R. Munteanu; Paula Otero

Nanoinformatics has recently emerged to address the need for computing applications at the nano level. In this regard, the authors have participated in various initiatives to identify its concepts, foundations and challenges. While nanomaterials open up the possibility for developing new devices in many industrial and scientific areas, they also offer breakthrough perspectives for the prevention, diagnosis and treatment of diseases. In this paper, we analyze the different aspects of nanoinformatics and suggest five research topics to help catalyze new research and development in the area, particularly focused on nanomedicine. We also encompass the use of informatics to further the biological and clinical applications of basic research in nanoscience and nanotechnology, and the related concept of an extended “nanotype” to coalesce information related to nanoparticles. We suggest how nanoinformatics could accelerate developments in nanomedicine, similarly to what happened with the Human Genome and other -omics projects, on issues like exchanging modeling and simulation methods and tools, linking toxicity information to clinical and personal databases or developing new approaches for scientific ontologies, among many others.


Methods of Information in Medicine | 2011

Biomedical informatics publications: a global perspective. Part I: Conferences.

Victor Maojo; Miguel García-Remesal; Concha Bielza; José Crespo; David Pérez-Rey; Casimir A. Kulikowski

BACKGROUND In the past decade, Medical Informatics (MI) and Bioinformatics (BI) have converged towards a new discipline, called Biomedical Informatics (BMI) bridging informatics methods across the spectrum from genomic research to personalized medicine and global healthcare. This convergence still raises challenging research questions which are being addressed by researchers internationally, which in turn raises the question of how biomedical informatics publications reflect the contributions from around the world in documenting the research. OBJECTIVES To analyse the worldwide participation of biomedical informatics researchers from professional groups and societies in the best-known scientific conferences in the field. The analysis is focused on their geographical affiliation, but also includes other features, such as the impact and recognition of the conferences. METHODS We manually collected data about authors of papers presented at three major MI conferences: Medinfo, MIE and the AMIA symposium. In addition, we collected data from a BI conference, ISMB, as a comparison. Finally, we analyzed the impact and recognition of these conferences within their scientific contexts. RESULTS Data indicate a predominance of local authors at the regional conferences (AMIA and MIE), whereas other conferences with a world-wide scope (Medinfo and ISMB) had broader participation. Our analysis shows that the influence of these conferences beyond the discipline remains somewhat limited. CONCLUSIONS Our results suggest that for BMI to be recognized as a broad discipline, both in the geographical and scientific sense, it will need to extend the scope of collaborations and their interdisciplinary impacts worldwide.


BMC Medical Informatics and Decision Making | 2012

CDAPubMed: a browser extension to retrieve EHR-based biomedical literature.

David Pérez-Rey; Ana Jimenez-Castellanos; Miguel García-Remesal; José Crespo; Victor Maojo

BACKGROUND Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. RESULTS We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. CONCLUSIONS CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard PubMed interface. It has been tested on a public dataset of HL7-CDA documents, returning significantly fewer citations since queries are focused on characteristics identified within the EHR. For instance, compared with more than 200,000 citations retrieved by breast neoplasm, fewer than ten citations were retrieved when ten patient features were added using CDAPubMed. This is an open source tool that can be freely used for non-profit purposes and integrated with other existing systems.
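
Step (iii) of the abstract above, generating a PubMed query from MeSH terms, can be illustrated with a minimal Python sketch. This is not CDAPubMed's code: the function names and the sample terms are hypothetical, and combining the terms with AND is an assumption based on the abstract's remark that adding patient features narrows the result set. The `[MeSH Terms]` field tag and the `term` URL parameter are standard PubMed search syntax.

```python
from urllib.parse import urlencode


def build_pubmed_query(mesh_terms):
    """Join MeSH headings into one conjunctive PubMed query string."""
    # Each heading is quoted and restricted to the MeSH field
    # with PubMed's [MeSH Terms] search tag.
    return " AND ".join(f'"{term}"[MeSH Terms]' for term in mesh_terms)


def build_pubmed_url(mesh_terms):
    """Return a PubMed search URL for the given MeSH headings."""
    query = build_pubmed_query(mesh_terms)
    return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": query})


# Hypothetical terms, standing in for features identified in an HL7-CDA document.
patient_terms = ["Breast Neoplasms", "Female", "Middle Aged"]
print(build_pubmed_query(patient_terms))
print(build_pubmed_url(patient_terms))
```

Each additional term adds another AND clause, which is one plausible reading of why a query with ten patient features returns far fewer citations than the single heading alone.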


Bioinformatics | 2010

PubDNA Finder

Miguel García-Remesal; Alejandro Cuevas; David Pérez-Rey; Luis Martín; Alberto Anguita; Diana de la Iglesia; Guillermo de la Calle; José Crespo; Victor Maojo

SUMMARY PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. AVAILABILITY PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder


BioMed Research International | 2013

Using Nanoinformatics Methods for Automatically Identifying Relevant Nanotoxicology Entities from the Literature

Miguel García-Remesal; Alejandro García-Ruiz; David Pérez-Rey; Diana de la Iglesia; Victor Maojo

Nanoinformatics is an emerging research field that uses informatics techniques to collect, process, store, and retrieve data, information, and knowledge on nanoparticles, nanomaterials, and nanodevices and their potential applications in health care. In this paper, we have focused on the solutions that nanoinformatics can provide to facilitate nanotoxicology research. For this, we have taken a computational approach to automatically recognize and extract nanotoxicology-related entities from the scientific literature. The desired entities belong to four different categories: nanoparticles, routes of exposure, toxic effects, and targets. The entity recognizer was trained using a corpus that we specifically created for this purpose and was validated by two nanomedicine/nanotoxicology experts. We evaluated the performance of our entity recognizer using 10-fold cross-validation. The precision values range from 87.6% (targets) to 93.0% (routes of exposure), while recall values range from 82.6% (routes of exposure) to 87.4% (toxic effects). These results prove the feasibility of using computational approaches to reliably perform different named entity recognition (NER)-dependent tasks, such as augmented reading or semantic searches. This research is a “proof of concept” that can be expanded to stimulate further developments that could assist researchers in managing data, information, and knowledge at the nanolevel, thus accelerating research in nanomedicine.
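
For readers unfamiliar with the evaluation terms in the abstract above, the following Python sketch shows how precision and recall are computed from entity counts and how a 10-fold split partitions a corpus. It is a generic illustration, not the authors' evaluation code: the function names are hypothetical and the counts in the example are invented, not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts; an empty denominator yields 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k roughly equal folds."""
    # Item i is assigned to fold i % k, so fold sizes differ by at most one.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Invented counts for one entity category, for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f}")

# Each document index is held out for testing in exactly one of the folds.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

In a cross-validated evaluation, the recognizer would be trained on each `train` split and scored on the matching `test` split, with the per-fold scores then averaged.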


Methods of Information in Medicine | 2012

Biomedical informatics publications: a global perspective. Part II: Journals.

Victor Maojo; Miguel García-Remesal; Concha Bielza; José Crespo; David Pérez-Rey; Casimir A. Kulikowski

BACKGROUND Biomedical Informatics (BMI) is a broad discipline, having evolved from both Medical Informatics (MI) and Bioinformatics (BI). An analysis of publications in the field should provide an indication about the geographic distribution of BMI research contributions and possible lessons for the future, both for research and professional practice. OBJECTIVES In part I of our analysis of biomedical informatics publications we presented results from BMI conferences. In this second part, we analyse BMI journals, which provide a broader perspective and comparison between data from conferences and journals that ought to confirm or suggest alternatives to the original distributional findings from the conferences. METHODS We manually collected data about authors and their geographical origin from various MI journals: the International Journal of Medical Informatics (IJMI), the Journal of Biomedical Informatics (JBI), Methods of Information in Medicine (MIM) and The Journal of the American Medical Informatics Association (JAMIA). Focusing on first authors, we also compared these findings with data from the journal Bioinformatics. RESULTS Our results confirm those obtained in our analysis of BMI conferences - that local and regional authors favor their corresponding MI journals just as they do their conferences. Consideration of other factors, such as the increasingly open source nature of data and software tools, is consistent with these findings. CONCLUSIONS Our analysis suggests various indicators that could lead to further, deeper analyses, and could provide additional insights for future BMI research and professional activities.


BMC Bioinformatics | 2010

A method for automatically extracting infectious disease-related primers and probes from the literature

Miguel García-Remesal; Alejandro Cuevas; Victoria López-Alonso; Guillermo López-Campos; Guillermo de la Calle; Diana de la Iglesia; David Pérez-Rey; José Crespo; Fernando Martín-Sánchez; Victor Maojo

BACKGROUND Primer and probe sequences are the main components of nucleic acid-based detection systems. Biologists use primers and probes for different tasks, some related to the diagnosis and prescription of infectious diseases. The biological literature is the main information source for empirically validated primer and probe sequences. Therefore, it is becoming increasingly important for researchers to navigate this important information. In this paper, we present a four-phase method for extracting and annotating primer/probe sequences from the literature. These phases are: (1) convert each document into a tree of paper sections, (2) detect the candidate sequences using a set of finite state machine-based recognizers, (3) refine problem sequences using a rule-based expert system, and (4) annotate the extracted sequences with their related organism/gene information. RESULTS We tested our approach using a test set composed of 297 manuscripts. The extracted sequences and their organism/gene annotations were manually evaluated by a panel of molecular biologists. The results of the evaluation show that our approach is suitable for automatically extracting DNA sequences, achieving precision/recall rates of 97.98% and 95.77%, respectively. In addition, 76.66% of the detected sequences were correctly annotated with their organism name. The system also provided correct gene-related information for 46.18% of the sequences assigned a correct organism name. CONCLUSIONS We believe that the proposed method can facilitate routine tasks for biomedical researchers using molecular methods to diagnose and prescribe different infectious diseases. In addition, the proposed method can be expanded to detect and extract other biological sequences from the literature. The extracted information can also be used to readily update available primer/probe databases or to create new databases from scratch.
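
Phase (2) of the method above, detecting candidate sequences with finite state machine-based recognizers, can be pictured with a deliberately simple two-state scanner in Python. This is only a sketch under stated assumptions and not the authors' recognizers: the function name, the accepted alphabet (A, C, G, T, U) and the minimum length of 15 are all hypothetical choices, and real text would need the refinement that the paper assigns to phase (3).

```python
NUCLEOTIDES = set("ACGTU")  # assumed alphabet; ambiguity codes are ignored


def find_candidate_sequences(text, min_length=15):
    """Scan text with a two-state machine (outside / inside a run of
    nucleotide letters) and return runs of at least min_length letters."""
    candidates = []
    current = []  # letters of the run being read; empty means "outside"
    for ch in text + " ":  # trailing space flushes a run at end of text
        if ch.upper() in NUCLEOTIDES:
            current.append(ch.upper())  # stay in, or enter, the "inside" state
        else:
            # Any other character ends the run; short runs such as the
            # letters "a" or "cat" in ordinary words are discarded.
            if len(current) >= min_length:
                candidates.append("".join(current))
            current = []
    return candidates


# Illustrative input sentence, not taken from the paper's test set.
sample = "Forward primer 5'-AGAGTTTGATCCTGGCTCAG-3' was used."
print(find_candidate_sequences(sample))
```

The length threshold is what keeps ordinary English words from matching, at the cost of missing very short oligonucleotides; sequences broken by spaces or line breaks would also be split by this scanner.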


Computer Methods and Programs in Biomedicine | 2015

Enabling semantic interoperability in multi-centric clinical trials on breast cancer

Raúl Alonso-Calvo; David Pérez-Rey; Sergio Paraiso-Medina; Brecht Claerhout; Philippe Hennebert; Anca I. D. Bucur

BACKGROUND AND OBJECTIVES Post-genomic clinical trials require the participation of multiple institutions and the collection of data from several hospitals, laboratories and research facilities. This paper presents a standard-based solution to provide a uniform access endpoint to patient data involved in current clinical research. METHODS The proposed approach exploits well-established standards such as HL7 v3 or SPARQL and medical vocabularies such as SNOMED CT, LOINC and HGNC. A novel mechanism to exploit semantic normalization among HL7-based data models and biomedical ontologies has been created by using Semantic Web technologies. RESULTS Different types of queries have been used for testing the semantic interoperability solution described in this paper. The execution times obtained in the tests enable the development of end user tools within a framework that requires efficient retrieval of integrated data. CONCLUSIONS The proposed approach has been successfully tested by applications within the INTEGRATE and EURECA EU projects. These applications have been deployed and tested for: (i) patient screening, (ii) trial recruitment, and (iii) retrospective analysis; exploiting semantically interoperable access to clinical patient data from heterogeneous data sources.

Collaboration


Dive into David Pérez-Rey's collaborations.

Top Co-Authors

Raúl Alonso-Calvo

Technical University of Madrid

Victor Maojo

Technical University of Madrid

Miguel García-Remesal

Technical University of Madrid

José Crespo

Technical University of Madrid

Sergio Paraiso-Medina

Technical University of Madrid

Alberto Anguita

Technical University of Madrid

Diana de la Iglesia

Technical University of Madrid
