Julian Varghese
University of Münster
Publications
Featured research published by Julian Varghese.
Database | 2016
Martin Dugas; Philipp Neuhaus; Alexandra Meidt; Justin Doods; Michael Storck; Philipp Bruland; Julian Varghese
Introduction: Information systems are a key success factor for medical research and healthcare. Currently, most of these systems apply heterogeneous and proprietary data models, which impede data exchange and integrated data analysis for scientific purposes. Due to the complexity of medical terminology, the overall number of medical data models is very high. At present, the vast majority of these models are not available to the scientific community. The objective of the Portal of Medical Data Models (MDM, https://medical-data-models.org) is to foster sharing of medical data models. Methods: MDM is a registered European information infrastructure. It provides a multilingual platform for exchange and discussion of data models in medicine, both for medical research and healthcare. The system is developed in collaboration with the University Library of Münster to ensure sustainability. A web front-end enables users to search, view, download and discuss data models. Eleven different export formats are available (ODM, PDF, CDA, CSV, MACRO-XML, REDCap, SQL, SPSS, ADL, R, XLSX). MDM contents were analysed with descriptive statistics. Results: MDM contains 4387 current versions of data models (in total 10 963 versions). 2475 of these models belong to oncology trials. The most common keyword (n = 3826) is ‘Clinical Trial’; most frequent diseases are breast cancer, leukemia, lung and colorectal neoplasms. Most common languages of data elements are English (n = 328 557) and German (n = 68 738). Semantic annotations (UMLS codes) are available for 108 412 data items, 2453 item groups and 35 361 code list items. Overall 335 087 UMLS codes are assigned with 21 847 unique codes. A few UMLS codes are used several thousand times, but there is a long tail of rarely used codes in the frequency distribution.
Discussion: Expected benefits of the MDM portal are improved and accelerated design of medical data models by sharing best practice, more standardised data models with semantic annotation and better information exchange between information systems, in particular Electronic Data Capture (EDC) and Electronic Health Records (EHR) systems. Contents of the MDM portal need to be further expanded to reach broad coverage of all relevant medical domains. Database URL: https://medical-data-models.org
Data Integration in the Life Sciences | 2015
Victor Christen; Anika Groß; Julian Varghese; Martin Dugas; Erhard Rahm
Medical forms are frequently used to document patient data or to collect relevant data for clinical trials. It is crucial to harmonize medical forms in order to improve interoperability and data integration between medical applications. Here we propose a (semi-) automatic annotation of medical forms with concepts of the Unified Medical Language System (UMLS). Our annotation workflow encompasses a novel semantic blocking, sophisticated match techniques and post-processing steps to select reasonable annotations. We evaluate our methods based on reference mappings between medical forms and UMLS, and further manually validate the recommended annotations.
BMC Medical Research Methodology | 2016
Martin Dugas; Alexandra Meidt; Philipp Neuhaus; Michael Storck; Julian Varghese
Background: The volume and complexity of patient data – especially in personalised medicine – is steadily increasing, both regarding clinical data and genomic profiles: Typically more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests etc.) are collected per patient in clinical trials. In oncology hundreds of mutations can potentially be detected for each patient by genomic profiling. Therefore data integration from multiple sources constitutes a key challenge for medical research and healthcare. Methods: Semantic annotation of data elements can help to identify matching data elements in different sources and thereby supports data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements shall contain the same annotations. However, large terminologies like SNOMED CT or UMLS don’t provide uniform coding. It is proposed to develop semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations shall be re-used if a matching data element is available in the metadata repository. Results: A web-based tool called ODMedit (https://odmeditor.uni-muenster.de/) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal. Conclusion: Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available, which is linked to a public metadata repository.
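The re-use rule described in the Methods (take the existing annotation whenever a matching data element is already in the repository) can be illustrated with a minimal sketch. This is not ODMedit's implementation: the class name `AnnotationRepository`, the exact-match-after-normalisation rule, and the placeholder codes are all assumptions made for illustration.

```python
def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so trivially different labels match."""
    return " ".join(label.lower().split())


class AnnotationRepository:
    """Hypothetical in-memory stand-in for a public metadata repository."""

    def __init__(self) -> None:
        self._codes_by_label: dict[str, tuple[str, ...]] = {}

    def annotate(self, label: str, proposed_codes: list[str]) -> tuple[str, ...]:
        """Return the stored codes if a matching element exists; else store the proposal."""
        key = normalize(label)
        if key in self._codes_by_label:
            # Re-use: the earlier annotation wins, which keeps coding uniform.
            return self._codes_by_label[key]
        codes = tuple(proposed_codes)
        self._codes_by_label[key] = codes
        return codes


repo = AnnotationRepository()
first = repo.annotate("Body  Weight", ["CODE-A"])   # placeholder code, not a real UMLS code
second = repo.annotate("body weight", ["CODE-B"])   # a second coder proposes a different code
```

In this sketch the second coder's proposal is discarded in favour of the stored annotation, so both elements end up with the same code. A real system would need fuzzier matching than exact string equality.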
Journal of the American Medical Informatics Association | 2018
Julian Varghese; Maren Kleine; Sophia Isabella Gessner; Sarah Sandmann; Martin Dugas
Objectives: To systematically classify the clinical impact of computerized clinical decision support systems (CDSSs) in inpatient care. Materials and Methods: Medline, Cochrane Trials, and Cochrane Reviews were searched for CDSS studies that assessed patient outcomes in inpatient settings. For each study, 2 physicians independently mapped patient outcome effects to a predefined medical effect score to assess the clinical impact of reported outcome effects. Disagreements were measured by using weighted kappa and solved by consensus. An example set of promising disease entities was generated based on medical effect scores and risk of bias assessment. To summarize technical characteristics of the systems, reported input variables and algorithm types were extracted as well. Results: Seventy studies were included. Five (7%) reported reduced mortality, 16 (23%) reduced life-threatening events, and 28 (40%) reduced non-life-threatening events; 20 (29%) had no significant impact on patient outcomes, and 1 showed a negative effect (weighted κ: 0.72, P < .001). Six of 24 disease entity settings showed high effect scores with medium or low risk of bias: blood glucose management, blood transfusion management, physiologic deterioration prevention, pressure ulcer prevention, acute kidney injury prevention, and venous thromboembolism prophylaxis. Most of the implemented algorithms (72%) were rule-based. Reported input variables are shared as standardized models on a metadata repository. Discussion and Conclusion: Most of the included CDSS studies were associated with positive patient outcomes effects but with substantial differences regarding the clinical impact. A subset of 6 disease entities could be filtered in which CDSS should be given special consideration at sites where computer-assisted decision-making is deemed to be underutilized. Registration number on PROSPERO: CRD42016049946.
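For readers unfamiliar with the agreement statistic mentioned above, the following is a minimal sketch of Cohen's weighted kappa for two raters on an ordinal scale. The abstract does not say which weighting scheme was used, so linear disagreement weights are an assumption here, and the toy ratings are invented for illustration only.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's weighted kappa with linear disagreement weights |i - j| / (k - 1)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Joint and marginal counts of the (ordinal) category indices.
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marginal_a = Counter(index[a] for a in rater_a)
    marginal_b = Counter(index[b] for b in rater_b)

    def weight(i, j):
        return abs(i - j) / (k - 1)

    observed_disagreement = sum(
        weight(i, j) * count / n for (i, j), count in observed.items()
    )
    expected_disagreement = sum(
        weight(i, j) * marginal_a[i] * marginal_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - observed_disagreement / expected_disagreement


# Toy example: two raters who agree on every item.
perfect = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], categories=[0, 1, 2, 3])
# Toy example: agreement no better than chance on a two-point scale.
chance = weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], categories=[0, 1])
```

With linear weights, a disagreement between adjacent scores is penalised less than one between the extremes of the scale, which suits an ordinal effect score.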
Bioinformatics | 2018
Sarah Sandmann; Mohsen Karimi; Aniek O. de Graaf; Christian Rohde; Stefanie Göllner; Julian Varghese; Jan Ernsting; Gunilla Walldin; Bert A. van der Reijden; Carsten Müller-Tidow; Luca Malcovati; Eva Hellström-Lindberg; Joop H. Jansen; Martin Dugas
Motivation: The application of next‐generation sequencing in research and particularly in clinical routine requires valid variant calling results. However, evaluation of several commonly used tools has pointed out that not a single tool meets this requirement. False positive as well as false negative calls necessitate additional experiments and extensive manual work. Intelligent combination and output filtration of different tools could significantly improve the current situation. Results: We developed appreci8, an automatic variant calling pipeline for calling single nucleotide variants and short indels by combining and filtering the output of eight open‐source variant calling tools, based on a novel artifact‐ and polymorphism score. Appreci8 was trained on two data sets from patients with myelodysplastic syndrome, covering 165 Illumina samples. Subsequently, appreci8's performance was tested on five independent data sets, covering 513 samples. Variation in sequencing platform, target region and disease entity was considered. All calls were validated by re‐sequencing on the same platform, a different platform or expert‐based review. Sensitivity of appreci8 ranged between 0.93 and 1.00, while positive predictive value ranged between 0.65 and 1.00. In all cases, appreci8 showed superior performance compared to any evaluated alternative approach. Availability and implementation: Appreci8 is freely available at https://hub.docker.com/r/wwuimi/appreci8/. Sequencing data (BAM files) of the 678 patients analyzed with appreci8 have been deposited into the NCBI Sequence Read Archive (BioProjectID: 388411; https://www.ncbi.nlm.nih.gov/bioproject/PRJNA388411). Supplementary information: Supplementary data are available at Bioinformatics online.
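The two performance measures reported above follow directly from comparing a caller's output with a validated truth set. The sketch below shows that calculation only; it is not part of appreci8, and the variant tuples and function name are hypothetical.

```python
def sensitivity_and_ppv(called: set, validated: set) -> tuple[float, float]:
    """Compare called variants with validated ones.

    Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP).
    """
    true_positives = len(called & validated)
    false_positives = len(called - validated)
    false_negatives = len(validated - called)
    if true_positives + false_negatives == 0 or true_positives + false_positives == 0:
        raise ValueError("both the call set and the validated set must be non-empty")
    sensitivity = true_positives / (true_positives + false_negatives)
    ppv = true_positives / (true_positives + false_positives)
    return sensitivity, ppv


# Invented toy variants as (chromosome, position, reference, alternative).
validated = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
             ("chr2", 300, "C", "T"), ("chr2", 400, "T", "G")}
called = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
          ("chr2", 300, "C", "T"), ("chr3", 500, "A", "G")}

sens, ppv = sensitivity_and_ppv(called, validated)
```

Here three of the four validated variants are found and one call is spurious, so both measures come out at 0.75.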
Journal of Medical Internet Research | 2017
Julian Varghese; Sarah Sandmann; Martin Dugas
Background: Medical coding is essential for standardized communication and integration of clinical data. The Unified Medical Language System by the National Library of Medicine is the largest clinical terminology system for medical coders and Natural Language Processing tools. However, the abundance of ambiguous codes leads to low rates of uniform coding among different coders. Objective: The objective of our study was to measure uniform coding among different medical experts in terms of interrater reliability and analyze the effect on interrater reliability using an expert- and Web-based code suggestion system. Methods: We conducted a quasi-experimental study in which 6 medical experts coded 602 medical items from structured quality assurance forms or free-text eligibility criteria of 20 different clinical trials. The medical item content was selected on the basis of mortality-leading diseases according to World Health Organization data. The intervention comprised using a semiautomatic code suggestion tool that is linked to a European information infrastructure providing a large medical text corpus of >300,000 medical form items with expert-assigned semantic codes. Krippendorff alpha (Kalpha) with bootstrap analysis was used for the interrater reliability analysis, and coding times were measured before and after the intervention. Results: The intervention improved interrater reliability in structured quality assurance form items (from Kalpha=0.50, 95% CI 0.43-0.57 to Kalpha=0.62, 95% CI 0.55-0.69) and free-text eligibility criteria (from Kalpha=0.19, 95% CI 0.14-0.24 to Kalpha=0.43, 95% CI 0.37-0.50) while preserving or slightly reducing the mean coding time per item for all 6 coders. Regardless of the intervention, precoordination and structured items were associated with significantly higher interrater reliability, but the proportion of items that were precoordinated significantly increased after intervention (eligibility criteria: OR 4.92, 95% CI 2.78-8.72; quality assurance: OR 1.96, 95% CI 1.19-3.25). Conclusions: The Web-based code suggestion mechanism improved interrater reliability toward moderate or even substantial intercoder agreement. Precoordination and the use of structured versus free-text data elements are key drivers of higher interrater reliability.
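As a reading aid for the odds ratios quoted above, here is a minimal sketch of the textbook odds ratio for a 2x2 table with a Wald 95% confidence interval. The abstract does not state how its odds ratios were estimated, so this method is an assumption, and the counts below are invented rather than taken from the study.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio (a*d)/(b*c) with a Wald interval computed on the log scale.

    a, b: outcome present / absent in the first group (e.g. after intervention)
    c, d: outcome present / absent in the second group (e.g. before intervention)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts: precoordinated vs. other items, after vs. before.
estimate, ci_low, ci_high = odds_ratio_wald_ci(10, 20, 5, 40)
```

An interval that excludes 1, as both intervals in the abstract do, is what is conventionally read as a significant change in the odds.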
Studies in Health Technology and Informatics | 2015
Julian Varghese; Sarah Schulze Sünninghausen; Martin Dugas
Studies in Health Technology and Informatics | 2015
Julian Varghese; Sarah Schulze Sünninghausen; Martin Dugas
Medical Informatics Europe | 2018
Iñaki Soto-Rey; Philipp Neuhaus; Philipp Bruland; Sophia Geßner; Julian Varghese; Stefan Hegselmann; Tobias Brix; Martin Dugas; Michael Storck
GMDS | 2018
Stefan Hegselmann; Michael Storck; Sophia Geßner; Philipp Neuhaus; Julian Varghese; Martin Dugas