Diego Martínez
University of Santiago de Compostela
Publications
Featured research published by Diego Martínez.
International Journal of Human-Computer Studies / International Journal of Man-Machine Studies | 2005
M. Taboada; Diego Martínez; José Mira
We study the general question of how ontologies and reference terminologies can be used to make the development of knowledge bases more manageable, taking into account the methodologies and tools available nowadays. For this, we have carried out a case study on designing a knowledge base to support a diagnosis-aid application in ophthalmology. Ideally, starting from a pre-existing domain ontology, development of a knowledge base centres only on collecting specific knowledge for a particular application. In practice, this is a very time-consuming approach, as ontology repositories do not usually provide many information-seeking facilities. In addition, it is unlikely that a single ontology will include all the required knowledge. Consequently, the design of knowledge bases requires the combination and adaptation of one or more source ontologies. In this work, particular attention is paid to the proper merging of two ontologies using the tool PROMPT. Our study emphasizes the advantages of using PROMPT for merging ontologies containing closely related portions of knowledge, and makes some proposals for improvement. In a second step, our approach extends the evolving ontology with a new component that holds both a meta-model representing a very simplified structure of a terminology system in Protégé-2000 and a set of constraints expressed in the Protégé Axiom Language (PAL). This set of constraints allows us to check the consistency and coherence of the imported information. Defining meta-classes in Protégé-2000 links this component to the rest of the models in the knowledge base. We report our experience in the reuse of several knowledge sources using Protégé-2000 and several of its plug-ins.
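To make the constraint-checking idea concrete, here is a minimal Python sketch of coherence checks over imported terminology concepts. It is an illustrative analogue, not the authors' PAL constraints, whose syntax is not reproduced here:

```python
# Illustrative Python analogue (not the authors' PAL constraints) of
# consistency checks over terminology concepts imported into a knowledge base.
from dataclasses import dataclass, field

@dataclass
class Concept:
    code: str
    preferred_term: str
    parents: set = field(default_factory=set)  # codes of parent concepts

def check_consistency(concepts: dict) -> list:
    """Return a list of violation messages; an empty list means coherent."""
    violations = []
    for c in concepts.values():
        if not c.preferred_term:
            violations.append(f"{c.code}: missing preferred term")
        for p in c.parents:
            if p not in concepts:
                violations.append(f"{c.code}: dangling parent {p}")
        if c.code in c.parents:
            violations.append(f"{c.code}: concept is its own parent")
    return violations

kb = {
    "C1": Concept("C1", "glaucoma", {"C2"}),
    "C2": Concept("C2", "eye disease"),
}
print(check_consistency(kb))  # [] -> the imported fragment is coherent
```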
IEEE Transactions on Biomedical Engineering | 2009
M. Taboada; Rosario Lalin; Diego Martínez
Nowadays, providing interoperability between different biomedical terminologies is a critical issue for efficient information sharing. One obstacle to interoperability is the lack of automated methods to simplify the mapping process. In this study, we propose an automated approach to mapping external terminologies to the Unified Medical Language System (UMLS). Our approach applies a sequential combination of two basic matching methods classically used in ontology matching. First, a lexical technique identifies similar strings between the external terminology and the UMLS. Second, a structure-based technique partially validates the lexical alignment by computing paths to top-level concepts and checking the compatibility of these top-level concepts across the external terminology and the UMLS. The method was applied to the mapping of the large-scale biomedical thesaurus EMTREE to the complete UMLS Metathesaurus. In total, 47.9% coverage of EMTREE terms was reached, leading to 80% coverage of EMTREE concepts. Our method revealed high compatibility in 6 of the 15 top-level categories across terminologies. The structural validation covers 75.8% of the total lexical alignment. Overall, the method rules out a total of 6927 (7.9%) lexical mappings, with a global precision of 78%.
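The two-step matching strategy can be illustrated with a small, self-contained sketch. The term data and `compatible` category pairs below are toy stand-ins; real EMTREE and UMLS access would replace the dictionaries:

```python
# Toy sketch of the two-step mapping: lexical match, then structural validation.
def normalize(term: str) -> str:
    return " ".join(sorted(term.lower().replace("-", " ").split()))

def lexical_matches(ext_terms, umls_terms):
    index = {normalize(t): cui for t, cui in umls_terms.items()}
    return {t: index[normalize(t)] for t in ext_terms if normalize(t) in index}

def path_to_top(node, parent_of):
    while node in parent_of:          # climb the is-a hierarchy
        node = parent_of[node]
    return node

def validate(mappings, ext_parent, umls_parent, compatible):
    """Keep a lexical mapping only if its top-level categories are compatible."""
    return {t: cui for t, cui in mappings.items()
            if (path_to_top(t, ext_parent), path_to_top(cui, umls_parent)) in compatible}

# Toy data (real CUIs shown for flavor, but the hierarchies are invented)
umls_terms = {"Myocardial Infarction": "C0027051"}
ext_parent = {"myocardial infarction": "cardiovascular disease"}
umls_parent = {"C0027051": "C0012634"}
compatible = {("cardiovascular disease", "C0012634")}

found = lexical_matches(["myocardial infarction"], umls_terms)
print(validate(found, ext_parent, umls_parent, compatible))
```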
Database | 2014
M. Taboada; Hadriana Rodriguez; Diego Martínez; María Pardo; María Jesús Sobrido
Motivation: As the number of clinical reports in the peer-reviewed medical literature keeps growing, there is an increasing need for online search tools to find and analyze publications on patients with similar clinical characteristics. This problem is especially critical and challenging for rare diseases, where publications of large series are scarce. Through an applied example, we illustrate how to automatically identify new relevant case reports and semantically annotate the relevant literature to capture the phenotype of a rare disease named cerebrotendinous xanthomatosis. Results: Our results confirm that it is possible to automatically identify new relevant case reports with high precision and to annotate them with satisfactory quality (74% F-measure). Automated annotation that aims to describe all phenotypic abnormalities found in a disease may facilitate curation efforts by supporting phenotype retrieval and assessment of phenotype frequency. Availability and Supplementary information: http://www.usc.es/keam/PhenotypeAnnotation/. Database URL: http://www.usc.es/keam/PhenotypeAnnotation/
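As a reminder of the evaluation arithmetic behind a figure like the 74% F-measure, here is a minimal sketch; the counts are invented to land near that value and are not the paper's actual confusion matrix:

```python
# Toy evaluation arithmetic; the counts are invented to land near 74% F-measure.
def f_measure(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = f_measure(tp=148, fp=40, fn=64)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")  # F1 ~ 0.74
```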
International Parallel and Distributed Processing Symposium | 2009
Diego Martínez; José Carlos Cabaleiro; Tomás F. Pena; Francisco F. Rivera; Vicente Blanco
This paper presents a new LogP-based model, called LoOgGP, which allows an accurate characterization of MPI applications based on microbenchmark measurements. This new model is an extension of LogP for long messages in which both the overhead and gap parameters depend linearly on message size. The LoOgGP model has been fully integrated into a modelling framework to obtain statistical models of parallel applications, providing the analyst with an easy and automatic tool for estimating the LoOgGP parameters that characterize communications. The use of the LoOgGP model to obtain a statistical performance model of an image deconvolution application is illustrated as a case study.
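A minimal sketch of how the linear overhead parameter of a LoOgGP-style model could be estimated from microbenchmark data by least squares; the measurements below are toy values, not the paper's benchmarks:

```python
# Fit overhead(m) = o0 + o1*m by ordinary least squares (toy measurements).
import numpy as np

msg_sizes = np.array([1e3, 1e4, 1e5, 1e6])       # message size in bytes
overheads = np.array([12.1, 15.8, 55.2, 430.0])  # measured overhead (us, toy)

A = np.vstack([np.ones_like(msg_sizes), msg_sizes]).T
(o0, o1), *_ = np.linalg.lstsq(A, overheads, rcond=None)
print(f"o(m) ~ {o0:.2f} + {o1:.6f}*m")  # the gap g(m) would be fit the same way
```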
Journal of Biomedical Informatics | 2013
José Luis Iglesias Allones; M. Taboada; Diego Martínez; R. Lozano; María Jesús Sobrido
OBJECTIVE To explore semantic search as a way to improve management and user navigation in clinical archetype repositories. METHODS In order to support semantic searches across archetypes, an automated method based on SNOMED CT modularization is implemented to transform clinical archetypes into SNOMED CT extracts. Concurrently, query terms are converted into SNOMED CT concepts using the search engine Lucene. Retrieval is then carried out by matching query concepts with the corresponding SNOMED CT segments. RESULTS A test collection of 16 clinical archetypes, including over 250 terms, and a subset of 55 clinical terms from two medical dictionaries, MediLexicon and MedlinePlus, were used to test our method. The keyword-based service supported by the openEHR repository offered a benchmark against which to evaluate the gain in performance. In total, our approach reached 97.4% precision and 69.1% recall, providing a substantial improvement in recall (more than 70%) compared to the benchmark. CONCLUSIONS Exploiting medical domain knowledge from ontologies such as SNOMED CT may overcome some limitations of keyword-based systems and thus improve the search experience of repository users. An automated approach based on ontology segmentation is an efficient and feasible way to support modeling, management and user navigation in clinical archetype repositories.
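The retrieval step can be pictured as set overlap between concept extracts, as in the following sketch; the archetype identifiers and term-to-concept index are illustrative, not the paper's Lucene-backed implementation:

```python
# Retrieval as concept-set overlap (illustrative data; the paper uses Lucene
# and real SNOMED CT modules to build the extracts and the term index).
term_to_concept = {"chest pain": 29857009, "blood pressure": 75367002}
archetype_extracts = {
    "openEHR-EHR-OBSERVATION.blood_pressure.v1": {75367002, 364090009},
    "openEHR-EHR-EVALUATION.problem.v1": {29857009},
}

def search(query_terms):
    query_concepts = {term_to_concept[t] for t in query_terms if t in term_to_concept}
    return [aid for aid, extract in archetype_extracts.items()
            if query_concepts & extract]

print(search(["blood pressure"]))  # -> the blood-pressure archetype
```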
BMC Medical Informatics and Decision Making | 2012
M. Taboada; Diego Martínez; Belén Pilo; Adriano Jimenez-Escrig; Peter N. Robinson; María Jesús Sobrido
Background: Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels becomes crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. This is especially true for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and of querying this dataset about relationships between phenotypes and genetic variants at different levels of abstraction. Methods: Given the current availability of ontological and terminological resources that have already reached some consensus in biomedicine, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant phenotype-genotype bidirectional relationships. The work tests the use of Semantic Web technology in the biomedical research domain of cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies. Results: A framework to query relevant phenotype-genotype bidirectional relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework follow the standard logical model of the open-world assumption. Conclusions: This work demonstrates how Semantic Web technologies can be used to support the flexible representation and computational inference mechanisms required to query patient datasets at different levels of abstraction. The open-world assumption is especially well suited to describing only partially known phenotype-genotype relationships, in a way that is easily extensible. In the future, this type of approach could offer researchers a valuable resource to infer new data from patient data for statistical analysis in translational research. In conclusion, phenotype description formalization and mapping to clinical data are two key elements for interchanging knowledge between basic and clinical research.
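As an illustration of a Horn-like rule over an OWL patient model, here is a hedged sketch using the owlready2 library; the class and property names are invented, and the rule is only in the spirit of the 28 rules reported, not one of them:

```python
# Hedged owlready2 sketch: invented class/property names, one SWRL rule in the
# spirit of the 28 Horn-like rules described above (not taken from the paper).
from owlready2 import Thing, ObjectProperty, Imp, get_ontology

onto = get_ontology("http://example.org/ctx.owl")  # hypothetical IRI

with onto:
    class Patient(Thing): pass
    class Finding(Thing): pass
    class TendonXanthoma(Finding): pass          # concrete clinical finding
    class XanthomaPatient(Patient): pass         # abstract phenotype class
    class hasFinding(ObjectProperty):
        domain = [Patient]; range = [Finding]

    # A concrete finding in the patient record implies the abstract phenotype.
    rule = Imp()
    rule.set_as_rule(
        "Patient(?p), hasFinding(?p, ?f), TendonXanthoma(?f) -> XanthomaPatient(?p)"
    )
```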
Expert Systems | 2013
M. Taboada; M. Meizoso; Diego Martínez; D. Riaño; Albert Alonso
Natural language processing (NLP) has been used to process text pertaining to patient records and narratives. However, most of the methods used were developed for specific systems, so new research is necessary to assess whether such methods can be easily retargeted to new applications and goals with the same performance. In this paper, open-source tools are reused as building blocks on which a new system is built. The aim of our work is to evaluate the applicability of current NLP technology to a new domain: automatic knowledge acquisition of diagnostic and therapeutic procedures from clinical practice guideline free-text documents. To do this, two publicly available syntactic parsers, several terminology resources and a tool for identifying semantic predications were tailored to improve the performance of each tool individually. We apply this new approach to 171 sentences selected by experts from a clinical guideline, and compare the results with those of the tools applied without tailoring. The results show that, with some adaptation, open-source NLP tools can be retargeted to new tasks, providing accuracy equivalent to that of methods designed for specific tasks.
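The following sketch shows the flavor of such retargeting, assuming spaCy as a stand-in for the parsers actually used in the paper; the lexicon and the TREATS heuristic are toy examples (requires the en_core_web_sm model):

```python
# Retargeting generic NLP components: parse a guideline sentence, look terms up
# in a toy lexicon, and emit a TREATS predication when therapy and disease co-occur.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this spaCy model is installed
lexicon = {"aspirin": "Therapy", "myocardial infarction": "Disease"}  # toy

def predications(sentence: str):
    doc = nlp(sentence)
    found = {label: chunk.text for chunk in doc.noun_chunks
             for term, label in lexicon.items() if term in chunk.text.lower()}
    if {"Therapy", "Disease"} <= found.keys():
        yield (found["Therapy"], "TREATS", found["Disease"])

print(list(predications("Administer aspirin to patients with myocardial infarction.")))
```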
Knowledge Representation for Health Care | 2009
M. Taboada; M. Meizoso; David Riaño; Albert Alonso; Diego Martínez
Knowledge engineering makes it possible to automate entity recognition and relation extraction from clinical texts, which in turn can be used to facilitate clinical practice guideline (CPG) modeling. This paper presents a method to recognize diagnosis and therapy entities, and to identify relationships between these entities, in CPG free-text documents. Our approach applies a sequential combination of several basic methods classically used in knowledge engineering (natural language processing techniques, manually authored grammars, lexicons and ontologies) to gradually map sentences describing diagnostic and therapeutic procedures to an ontology. First, using a standardized vocabulary, our method automatically identifies guideline concepts. Next, for each sentence, it determines the patient conditions under which the descriptive knowledge of the sentence is valid. Then, it detects the central information units in the sentence in order to match the sentence with a small set of predefined relationships. The approach enables automated extraction of relationships concerning findings that are manifestations of a disease, and procedures that diagnose or treat a disease.
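A compact sketch of this staged mapping, with invented trigger patterns rather than the authors' grammars: first extract the patient condition guarding a sentence, then match the sentence against a small set of predefined relationships:

```python
# Staged sentence mapping with invented patterns: extract the guarding patient
# condition, then match trigger words against predefined relationships.
import re

RELATION_TRIGGERS = {
    "diagnoses": ("confirm", "diagnos", "test for"),
    "treats": ("administer", "prescribe", "treat"),
}

def map_sentence(sentence: str):
    condition = None
    m = re.search(r"\b(?:if|in patients with)\s+([^,]+)", sentence, re.I)
    if m:
        condition = m.group(1).strip()
    lowered = sentence.lower()
    for relation, triggers in RELATION_TRIGGERS.items():
        if any(t in lowered for t in triggers):
            return {"relation": relation, "condition": condition}
    return None

print(map_sentence("In patients with suspected stroke, perform CT to confirm the diagnosis."))
# -> {'relation': 'diagnoses', 'condition': 'suspected stroke'}
```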
International Parallel and Distributed Processing Symposium | 2007
Diego Martínez; Vicente Blanco; Marcos Boullón; José Carlos Cabaleiro; Casiano Rodríguez; Francisco F. Rivera
This paper presents a framework based on a user-driven methodology to obtain analytical models of MPI applications on parallel systems in a systematic and easy-to-use way. The methodology consists of two stages. In the first stage, instrumentation of the source code is performed using CALL, a profiling tool for interacting with the code in a simple and direct way. New features are added to CALL to obtain different performance metrics and to store the performance information in XML files. Using this information, an analytical model of the performance behavior is obtained in the second stage by means of R, a language and environment for statistical analysis. The structure of the whole framework is detailed in this paper, and some selected examples are used to show its practical use.
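The second stage can be sketched as follows, with Python standing in for the R environment the framework actually uses; the XML layout is invented, not CALL's real output format:

```python
# Second stage, sketched in Python instead of R: read timings from XML and fit
# a simple Amdahl-like candidate model time(p) = a + b/p (XML layout invented).
import xml.etree.ElementTree as ET
import numpy as np

XML = """<experiments>
  <run procs="2" time="4.1"/>
  <run procs="4" time="2.3"/>
  <run procs="8" time="1.4"/>
</experiments>"""

runs = [(float(r.get("procs")), float(r.get("time")))
        for r in ET.fromstring(XML).iter("run")]
procs = np.array([p for p, _ in runs])
times = np.array([t for _, t in runs])

A = np.vstack([np.ones_like(procs), 1.0 / procs]).T
(a, b), *_ = np.linalg.lstsq(A, times, rcond=None)
print(f"time(p) ~ {a:.2f} + {b:.2f}/p")
```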
Parallel, Distributed and Network-Based Processing | 2010
Diego Martínez; José Carlos Cabaleiro; Tomás F. Pena; Francisco F. Rivera; Vicente Blanco Pérez
A new method for obtaining models of the performance of parallel applications based on statistical analysis is presented in this paper. The method is based on Akaike's information criterion (AIC), which provides an objective mechanism to rank different models by their fit to experimental data. The input of the modeling process is a set of variables and parameters that may a priori influence the performance of the application. This set can be provided by the user. Using this information, the method automatically generates a set of candidate models. These models are fit to the experimental data and the AIC score of each model is calculated. The model with the best AIC score is selected. In addition, the AIC scores of all candidate models provide useful statistical information to help the user evaluate the quality of the selected model, as well as indications of how to interactively improve the modeling process. As a first case study, statistical models obtained for different implementations of the broadcast collective communication in Open MPI are shown. These models are very accurate, and their fit to the experimental data exceeds that of theoretical approaches based on the LogGP model. Finally, the NAS Parallel Benchmarks are also characterized using this new method, with good results in terms of accuracy.
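A minimal sketch of AIC-based candidate ranking over least-squares fits, using the Gaussian-error form AIC = n·ln(RSS/n) + 2k; the candidate terms and timing data are synthetic:

```python
# Rank candidate least-squares models by AIC = n*ln(RSS/n) + 2k (synthetic data).
import itertools
import numpy as np

rng = np.random.default_rng(0)
p = np.repeat([2.0, 4.0, 8.0, 16.0, 32.0], 4)     # 4 runs per process count
t = 1.0 + 10.0 / p + rng.normal(0, 0.05, p.size)  # synthetic timings

terms = {"const": np.ones_like(p), "1/p": 1.0 / p, "log p": np.log(p), "p": p}

def aic(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n, k = y.size, X.shape[1]
    return n * np.log(rss / n) + 2 * k

best = min(
    (combo for r in range(1, len(terms) + 1)
     for combo in itertools.combinations(terms, r)),
    key=lambda combo: aic(np.column_stack([terms[c] for c in combo]), t),
)
print("best model terms:", best)  # typically ('const', '1/p')
```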