Johan Hopstadius
Uppsala Monitoring Centre
Publications
Featured research published by Johan Hopstadius.
Statistical Methods in Medical Research | 2013
G. Niklas Norén; Johan Hopstadius; Andrew Bate
Large observational data sets are a great asset to better understand the effects of medicines in clinical practice and, ultimately, improve patient care. For an empirical pattern in observational data to be of practical relevance, it should represent a substantial deviation from the null model. For the purpose of identifying such deviations, statistical significance tests are inadequate, as they do not on their own distinguish the magnitude of an effect from its data support. The observed-to-expected (OE) ratio on the other hand directly measures strength of association and is an intuitive basis to identify a range of patterns related to event rates, including pairwise associations, higher order interactions and temporal associations between events over time. It is sensitive to random fluctuations for rare events with low expected counts but statistical shrinkage can protect against spurious associations. Shrinkage OE ratios provide a simple but powerful framework for large-scale pattern discovery. In this article, we outline a range of patterns that are naturally viewed in terms of OE ratios and propose a straightforward and effective statistical shrinkage transformation that can be applied to any such ratio. The proposed approach retains emphasis on the practical relevance and transparency of highlighted patterns, while protecting against spurious associations.
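As a concrete illustration of such a shrinkage transformation, the sketch below adds a constant to both the observed and the expected count before taking the ratio on a log2 scale. The function name and the choice alpha = 0.5 are illustrative assumptions, not the paper's exact estimator.

```python
import math

def shrunk_oe(observed, expected, alpha=0.5):
    """Shrinkage observed-to-expected ratio on the log2 scale.

    Adding alpha to both numerator and denominator pulls the ratio
    towards 1 (i.e. towards 0 on the log scale) when counts are small,
    while leaving well-supported ratios almost unchanged."""
    return math.log2((observed + alpha) / (expected + alpha))

# The same crude ratio of 10 is shrunk much harder when the
# expected count is small:
strong = shrunk_oe(100, 10)  # ample data support
weak = shrunk_oe(1, 0.1)     # a single observed case
```

With observed equal to expected, the measure is exactly 0; as counts grow, the shrunk ratio approaches the crude log ratio, so shrinkage selectively dampens patterns with weak data support.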
knowledge discovery and data mining | 2008
G. Niklas Norén; Andrew Bate; Johan Hopstadius; Kristina Star; I. Ralph Edwards
We introduce a novel pattern discovery methodology for event history data focusing explicitly on the detailed temporal relationship between pairs of events. At the core is a graphical statistical approach to summarising and visualising event history data, which contrasts the observed to the expected incidence of the event of interest before and after an index event. Thus, pattern discovery is not restricted to a specific time window of interest, but encompasses extended parts of the underlying event histories. In order to effectively screen large collections of event history data for interesting temporal relationships, we introduce a new measure of temporal association. The proposed measure contrasts the observed-to-expected ratio in a time period of interest to that in a pre-defined control period. An important feature of both the observed-to-expected graph itself and the measure of association is a statistical shrinkage towards the null hypothesis of no association. This provides protection against spurious associations and is an extension of the statistical shrinkage successfully applied to large-scale screening for associations between events in cross-sectional data, such as large collections of adverse drug reaction reports. We demonstrate the usefulness of the proposed pattern discovery methodology through a set of examples from a collection of over two million patient records in the United Kingdom. The identified patterns include temporal relationships between drug prescription and medical events suggestive of persistent or transient risks of adverse events, as well as temporal relationships between prescriptions of different drugs.
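The measure of temporal association can be sketched as a contrast between two shrunk observed-to-expected ratios, one for the period of interest and one for the control period. The exact estimator in the paper differs; the names and the constant 0.5 here are assumptions for illustration.

```python
import math

def shrunk_temporal_contrast(obs_risk, exp_risk, obs_ctrl, exp_ctrl, alpha=0.5):
    """Contrast the observed-to-expected ratio in a time period of
    interest against that in a pre-defined control period, with
    additive shrinkage so that low expected counts cannot produce
    extreme contrasts (illustrative sketch)."""
    oe_risk = (obs_risk + alpha) / (exp_risk + alpha)
    oe_ctrl = (obs_ctrl + alpha) / (exp_ctrl + alpha)
    return math.log2(oe_risk / oe_ctrl)
```

A positive value indicates that the event rate exceeds expectation more in the risk period than in the control period; when both periods show the same excess, as for an indication of treatment rather than an adverse effect, the contrast is 0.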
Drug Safety | 2008
Johan Hopstadius; G. Niklas Norén; Andrew Bate; I. Ralph Edwards
Background and objectives: Automated screening for excessive adverse drug reaction (ADR) reporting rates has proven useful as a tool to direct clinical review in large-scale drug safety signal detection. Some measures of disproportionality can be adjusted to eliminate any undue influence on the ADR reporting rate of covariates, such as patient age or country of origin, by using a weighted average of stratum-specific measures of disproportionality. Arguments have been made in favour of routine adjustment for a set of common potential confounders using stratification. The aim of this paper is to investigate the impact of using adjusted observed-to-expected ratios, as implemented for the Empirical Bayes Geometric Mean (EBGM) and the information component (IC) measures of disproportionality, for first-pass analysis of the WHO database. Methods: A simulation study was carried out to investigate the impact of simultaneous adjustment for several potential confounders based on stratification. Comparisons between crude and adjusted observed-to-expected ratios were made based on random allocation of reports to a set of strata with a realistic distribution of stratum sizes. In a separate study, differences between the crude IC value and IC values adjusted for (combinations of) patient sex, age group, reporting quarter and country of origin were analysed with respect to their concordance with a literature comparison. Comparison was made to the impact on signal detection performance of a triage criterion requiring reports from at least two countries before a drug-ADR pair was highlighted for clinical review. Results: The simulation study demonstrated a clear tendency of the adjusted observed-to-expected ratio towards spurious (and considerable) underestimation relative to the crude one, in the presence of any very small strata in a stratified database. With carefully implemented stratification that did not yield any very small strata, this tendency could be avoided.
Routine adjustment for potential confounders improved signal detection performance relative to the literature comparison, but the magnitude of the improvement was modest. The improvement from the triage criterion was more considerable. Discussion and conclusions: Our results indicate that first-pass screening based on observed-to-expected ratios adjusted with stratification may lead to missed signals in ADR surveillance, unless very small strata are avoided. In addition, the improvement in signal detection performance due to routine adjustment for a set of common confounders appears to be smaller than previously assumed. Other approaches to improving signal detection performance such as the development of refined triage criteria may be more promising areas for future research.
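The stratified adjustment discussed in this abstract rests on a stratum-wise expected count under independence of drug and event within each stratum. A minimal sketch (variable names are illustrative):

```python
def stratified_expected(strata):
    """Adjusted expected count for a drug-ADR pair: the sum over strata of
    (reports with the drug) * (reports with the event) / (total reports),
    assuming independence of drug and event within each stratum.

    Each stratum is a tuple (n_drug, n_event, n_total). Very small strata
    make the individual terms unstable, which is one route to spurious
    underestimation of adjusted observed-to-expected ratios."""
    return sum(n_drug * n_event / n_total
               for n_drug, n_event, n_total in strata)

# One large and one very small stratum; the small stratum contributes
# a disproportionate share of the expected count:
e = stratified_expected([(200, 50, 10_000), (2, 1, 5)])
```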
Pharmacoepidemiology and Drug Safety | 2011
G. Niklas Norén; Johan Hopstadius; Andrew Bate; I. Ralph Edwards
We read with interest the paper by Schuemie describing an implementation of Bayesian disproportionality analysis (Gamma Poisson Shrinker (GPS)) for longitudinal data and emphasizing the importance of graphical representation. In this commentary, we describe a previously published related approach to safety signal detection in longitudinal medical records,2,3 which we argue has important advantages. We describe some details of the method that can serve as an extension of the approach described by Schuemie. We believe this makes a more convincing case for graphical representation coupled with Bayesian analysis in safety surveillance of longitudinal data. As with Schuemie’s method, the method for temporal pattern discovery in Norén et al.2,3 uses Bayesian shrinkage to protect against spurious associations, contrasts event rates in different periods to filter out indications for treatment, and proposes a graphical statistical approach to characterize temporal patterns and facilitate clinical interpretation. In addition, it controls for time-constant confounders through a self-controlled design while incorporating information on unexposed patients separately to account for systematic variability in event rates over time. It contrasts four distinct periods relative to treatment initiation to highlight time-varying confounding by underlying disease. Its computational framework has been evaluated in the UK IMS Disease Analyzer data set of more …
international health informatics symposium | 2012
Johan Hopstadius; G. Niklas Norén
The identification of unanticipated statistical associations is a core activity in exploratory analysis of high-dimensional biomedical data. Specifically, post-marketing surveillance for harmful effects of medicines relies on effective algorithms to detect associations between drugs and suspected adverse drug reactions. The WHO global individual case safety reports database, VigiBase, holds over six million reports and covers more than ten thousand medicinal products and thousands of distinct medical concepts. It collects data from more than 100 countries across the world and its first reports date back to the late 1960s. Local patterns may not show up in database-wide analyses, and many others will vary substantially in strength or direction across data subsets. Still, routine screening of this and similar databases relies on global measures of association. In this paper, we propose a framework to detect local associations and characterise subset variability in high-dimensional data. We use shrinkage observed-to-expected ratios and employ multiple stratification by one or two covariates at a time. We consider subset-specific, stratified-then-pooled adjusted measures, and a novel measure to detect associations that hold in all-but-one subset. We use covariate permutation to select stratification covariates and gauge the vulnerability to spurious associations. Chance findings are a major concern: a naive subgroup analysis yielded more than 50% spurious local associations in VigiBase. To improve on this, we enforce conservative credibility intervals and also look for subset-specific associations that reproduce in at least one additional subset (e.g. two time periods). In addition to 119,500 global associations between drugs and medical events in VigiBase, such robust subgroup analysis uncovered 14,600 local associations, with an estimated 2.2% of these being spurious.
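The covariate permutation mentioned in this abstract can be sketched as follows. Shuffling the covariate across reports destroys any real subset structure, so subsets flagged on permuted data are chance findings by construction. The function names and the screening rule are assumptions for illustration, not the paper's implementation.

```python
import random

def permuted_flag_rate(values, covariate, flag_subsets, n_perm=200, seed=1):
    """Estimate how many subset-specific associations a screening rule
    flags by chance alone, via covariate permutation.

    values       -- per-report quantities used by the screening rule
    covariate    -- per-report subset label (e.g. country or time period)
    flag_subsets -- callable (values, labels) -> number of flagged subsets
    """
    rng = random.Random(seed)
    baseline = flag_subsets(values, covariate)   # flags on the real labels
    shuffled = list(covariate)
    chance = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                    # break real subset structure
        chance += flag_subsets(values, shuffled)
    return baseline, chance / n_perm             # real count vs chance rate
```

A covariate whose permuted flag rate approaches its baseline count is a poor stratification choice: its "local" associations are indistinguishable from noise.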
Pharmacoepidemiology and Drug Safety | 2012
G. Niklas Norén; Johan Hopstadius; Andrew Bate; I. Ralph Edwards
In “Safety surveillance of longitudinal databases: further methodological considerations”, published in this edition of Pharmacoepidemiology and Drug Safety, Dr. Schuemie1 contributes a commentary providing perspectives on signal detection in longitudinal observational databases. The discussion centres on two methods proposed for this purpose: IC temporal pattern discovery (ICTPD)2,3 and LGPS/LEOPARD,4 which share several fundamental similarities.5 We agree with many of Dr. Schuemie’s observations. In particular, we share his view that neither of the mentioned methods is perfect and that an approach combining each method’s respective strengths is likely to outperform either. Indeed, the similarity between the two approaches provides great opportunity to combine effective design choices. As implied by Dr. Schuemie, most differences in design can be accommodated as “parameter settings”1 and their impact independently evaluated. There is of course much to learn also from diametrically different approaches to analysis, such as those based on propensity scores.6 The critical dissection of results obtained on both real and simulated data allows for rapid improvements in methods to gain knowledge on outcomes of drug therapy. Both commentaries1,5 emphasize the importance of evaluation against real-world data. The Observational Medical Outcomes Partnership (OMOP) has concluded its first phase of methods evaluation in a wide variety of observational longitudinal databases7: Figure 1 displays the predictive ability of eight methods in each of 10 different data sources (not all methods were evaluated in each). The reference set consisted of nine true positive adverse drug reactions and 44 drug-adverse event pairs considered to represent negative controls. The measure of performance is the area under curve (AUC) of a receiver-operating characteristic curve. AUC averages the predictive ability of a screening algorithm over all possible thresholds.
It can be interpreted as the probability that the algorithm ranks a randomly selected true positive higher than a randomly selected negative control. Thus, an AUC of 1 indicates perfect performance, and one of 0.5 corresponds to no predictive ability at all. The evaluated methods clearly do have predictive ability but are far from perfect. They include ICTPD as well as simple disproportionality analysis solely based on the occurrence of events after therapeutic intervention, which is useful in reference to the evaluation on OMOP simulated data.1 The performance of LGPS/LEOPARD on real-world OMOP data has not been published. Given the limited number of test cases, there is considerable variability in the performance estimates, and they must be interpreted with caution. With that caveat, the relative performance of ICTPD is consistently high across the eight databases in which it was evaluated, with the highest AUC of all evaluated methods in six out of those eight. The high variability in performance of certain methods across data sources merits in-depth evaluation. In this context, it should be noted that the evaluation pertains to specific implementations of each type of design, which may vary, and not necessarily to the designs themselves. In contrast to these results, Dr. Schuemie reports poor performance of ICTPD in an evaluation against simulated data, using OMOP’s first generation data simulator (here referred to as OSIM1, for clarity) as in his previous publication.4 On the OSIM1 data, ICTPD achieves a mean average precision of only 0.11—far below the LGPS/LEOPARD mean average precision of 0.24 and the simple disproportionality analysis mean average precision of 0.20 (as represented by basic IC analysis). The strength of simulated data is its indisputable point of reference: there are no ambiguities as to what constitutes a true positive and a negative control.
Analysis in simulated data can be a good approach to evaluate specific aspects of a method and to identify areas of potential improvement. Notably, Schuemie observes that the poor performance of ICTPD on OSIM1 data may be due to a pattern of inflated expected counts in the control period immediately prior to initiation of drug treatment, which would reduce the method’s ability to compensate for confounding by the underlying disease. The results presented here are for the parameter settings of each method that maximized the AUC in a random effects meta-analysis across all databases: for ICTPD, this was a single control period from 180 days to 1 day prior to prescription and a 30-day risk period; for disproportionality analysis, it was IC-type shrinkage combined with an indefinite risk period and consideration of repeated events.
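The rank interpretation of AUC given in this commentary translates directly into a small computation over pairs of scores (a generic sketch, not OMOP code):

```python
def auc_from_scores(positives, negatives):
    """AUC as the probability that a randomly chosen true positive
    receives a higher score than a randomly chosen negative control;
    tied scores contribute one half."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Perfect ranking gives 1.0; identical score distributions give 0.5:
perfect = auc_from_scores([0.9, 0.8], [0.2, 0.1])
chance = auc_from_scores([1, 2], [1, 2])
```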
Data Mining and Knowledge Discovery | 2010
G. Niklas Norén; Johan Hopstadius; Andrew Bate; Kristina Star; I. Ralph Edwards
Drug Safety | 2013
Johanna Strandell; Ola Caster; Johan Hopstadius; I. Ralph Edwards; G. Niklas Norén
Drug Safety | 2008
Johan Hopstadius; G. Niklas Norén; Andrew Bate; I. Ralph Edwards
Archive | 2007
Johan Hopstadius; G. Niklas Norén; Andrew Bate; I. Ralph Edwards