A confidence interval robust to publication bias for random-effects meta-analysis of few studies
M. Henmi, S. Hattori, T. Friede*

Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
Department of Biomedical Statistics, Graduate School of Medicine, Osaka University, Osaka 565-0871, Japan
Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
Abstract
Systematic reviews aim to summarize all the available evidence relevant to a particular research question. If appropriate, the data from identified studies are quantitatively combined in a meta-analysis. Often only few studies regarding a particular research question exist. In these settings the estimation of the between-study heterogeneity is challenging. Furthermore, the assessment of publication bias is difficult, as standard methods such as visual inspection of funnel plots or formal hypothesis tests do not provide adequate guidance. Previously, Henmi and Copas (Statistics in Medicine 2010, 29: 2969–2983) proposed a confidence interval for the overall effect in random-effects meta-analysis that is robust to publication bias to some extent. As is evident from their simulations, these confidence intervals have improved coverage compared with standard methods. To our knowledge, the properties of their method have never been assessed for meta-analyses including fewer than five studies. In this manuscript, we propose a variation of the method by Henmi and Copas employing an improved estimator of the between-study heterogeneity, in particular when dealing with few studies only. In a simulation study, the proposed method is compared to several competitors. Overall, we found that our method outperforms the others in terms of coverage probabilities. In particular, an improvement compared with the proposal by Henmi and Copas is demonstrated. The work is motivated and illustrated by a systematic review and meta-analysis in paediatric immunosuppression following liver transplantations.
Keywords: Meta-analysis; publication bias; between-trial heterogeneity; confidence interval; coverage probability
Systematic reviews aim to summarize all the available evidence relevant to a particular research question. If appropriate, the data from identified studies are quantitatively combined in a meta-analysis. If the true effect is the same in all studies to be combined in a meta-analysis, then the so-called common-effect or fixed-effect model is appropriate. In practical applications this assumption often appears to be too strict, as some level of between-trial heterogeneity in the effects is suspected. Then the random-effects model is used. […] Confidence intervals based on t-quantiles have been proposed [5, 6, 7, 8]. With few studies only, however, they are often conservative and so long that they are uninformative [9]. Also, various likelihood-based methods have recently been assessed in the specific situation of few studies and were not found to be a solution to the problem [10]. The between-trial heterogeneity estimators mentioned above often result in zero estimates [4], with the notable exception of the method proposed by Chung et al [3]. Chung et al suggested the so-called Bayes modal (BM) estimator, which uses, in a Bayesian framework, a weakly informative prior for the between-trial heterogeneity to avoid zero estimates of the heterogeneity.

*Correspondence to: Tim Friede, Department of Medical Statistics, University Medical Center Göttingen, Germany; email: [email protected]
Furthermore, a fully Bayesian approach to random-effects meta-analysis with weakly informative priors for the between-trial heterogeneity parameter has some advantages in this situation, since zero estimates are avoided as with the BM estimator and, in addition, the uncertainty in estimating the heterogeneity is accounted for [4, 11]. Of course, the Bayesian credible intervals would not necessarily have frequentist properties. Evaluating the operating characteristics in extensive simulation studies, it was found that the frequentist coverage probabilities are often above the nominal level with conservative choices of the prior for the between-trial heterogeneity [4, 11]. Bender et al [12] recently provided an overview on the topic of meta-analyses with few studies.

In systematic reviews, relevant evidence is identified through systematic searches of literature databases. If all relevant studies were published, this would be sufficient. However, this is not always the case. The problem was first described as the 'file drawer problem' [13]. Today, various types of reporting biases are carefully distinguished, including publication bias, time lag bias, citation bias and outcome reporting bias, to name but a few. Studies might not be published at all for various reasons, or only with a certain delay, or in journals or languages that are more difficult to access (see e.g. Table 7.2.a in [14]). In the following we focus on the aspect of publication bias. Prospective registration of clinical trials is one way to tackle this problem. It has become standard practice not only to search at least two electronic databases of the literature but also to search at least one registry for clinical studies such as clinicaltrials.gov. The idea would be to include unpublished studies in systematic reviews.
However, access to unpublished results is often challenging as it requires the cooperation of investigators, sponsors etc.

A number of methods have been proposed over the years to deal with publication bias [15]. A popular way to interrogate data for publication bias is the visualization in form of a so-called funnel plot. In this scatter plot, each study contributes an estimate of an effect measure and its estimated standard error. The former is plotted on the x-axis, the latter on the y-axis. If no publication bias is present, we would expect the plot to be symmetric about the vertical line running through the average effect. Any absence of this symmetry might be interpreted as a signal that some form of reporting bias might be present. As this can be difficult to judge, formal hypothesis tests have been proposed (see e.g. [16]). The problem with the visual inspection as well as with the formal tests is that they become more powerful with larger numbers of studies, but are less sensitive with few studies only. In the context of funnel plots, trim-and-fill methods have been proposed to correct the overall effect for potential publication bias [17]. Following an alternative approach, several sensitivity analysis methods have been suggested based on selection functions describing the selective publication process [18, 19, 20]. For instance, Copas and Jackson [19] investigated the maximum bias over all possible selection functions which satisfy the (fairly weak) condition that studies with smaller standard errors are at least as likely to be selected as studies with larger standard errors. Building on their work, Henmi et al [20] developed sensitivity analyses that, in contrast to the proposal by Copas and Jackson [19], account for uncertainty in estimation.
Again, these methods are not designed for the setting of few studies only.

Our work is motivated by a systematic review and meta-analysis of controlled clinical trials assessing efficacy and safety of Interleukin-2 receptor antagonists (IL-2RA) in children having undergone liver transplantations [21], a rare surgical procedure in children. In total, only six relevant studies were identified, with little standardization with regard to the design of the studies, implying some level of heterogeneity. Although the authors carefully checked for publication bias using standard techniques, it cannot be excluded that in particular some smaller studies were not published if they resulted in inconclusive treatment effects.

In contrast to the approaches to publication bias described above, Henmi and Copas [22] proposed a method for random-effects meta-analysis that is robust to the selection of studies. They modified the DerSimonian-Laird (DL) confidence interval ([2], (10) in [22]) by replacing the random-effects estimator by the fixed-effect estimator of the overall effect and by replacing the normal quantiles by more accurate ones. The latter depend on the between-trial heterogeneity, and the DL estimator is used in the computation of the quantiles. Therefore, with few studies this approach may not work well. In this paper, we propose a modification of the Henmi-Copas method by replacing the estimator of the between-study heterogeneity in the computation of the quantiles by the one developed by Chung et al [3]. The properties of the new approach are assessed and compared to alternative methods, including the Henmi-Copas approach and a proposal by Doi et al [23], in Monte Carlo simulation studies considering in particular the case of few studies with and without publication bias. Our method is not conditional on having detected publication bias, e.g. in a funnel plot, since this would be very difficult with only few studies included in the meta-analysis.
But it is robust to the selection of studies even with few studies, as we will see below.

The manuscript is organized as follows. In the next section the new confidence interval for the overall effect is developed, starting by introducing notation and reviewing the method of Henmi and Copas [22]. The simulation study assessing the properties of the new confidence interval in comparison to existing methods is presented in Section 3. In Section 4 the proposed method is applied to the motivating example. We close with a brief discussion of our findings and their limitations.

Adopting the notation by Henmi and Copas [22], the true effect of an individual study i out of n independent studies is denoted by θ_i. Estimates y_i of the effects θ_i are observed with standard errors σ_i. Here we consider the normal-normal hierarchical model (NNHM), which is the standard model for random-effects meta-analysis. In the NNHM, it is assumed that the θ_i are from a normal distribution with expectation θ and variance τ², i.e.

θ_i | θ, τ² ~ N(θ, τ²), i = 1, ..., n. (1)

Furthermore, the effect estimators Y_i follow (at least approximately) a normal distribution with expectation θ_i and variance σ_i², i.e.

Y_i | θ_i ~ N(θ_i, σ_i²), i = 1, ..., n. (2)

From Equations (1) and (2) follows the marginal model

Y_i | θ, τ² ~ N(θ, σ_i² + τ²), i = 1, ..., n. (3)

If the between-trial heterogeneity τ² is 0, then the random-effects model reduces to the so-called fixed-effect or common-effect model.

The focus of our study is inference regarding θ, the overall effect. A standard method to construct an estimator and a (1 − α) confidence interval for θ was proposed by DerSimonian and Laird [2] (DL). In short, the DL estimator of θ is given by

θ̂_R = Σ ŵ_i Y_i / Σ ŵ_i, (4)

where ŵ_i = 1/(σ_i² + τ̂²_DL). Here, the DL estimator τ̂²_DL of the between-study heterogeneity τ² is given by

τ̂²_DL = max{ 0, (Q − (n − 1)) / (Σ w_i − Σ w_i² / Σ w_i) }. (5)

The weights w_i are the fixed-effect weights (with τ² = 0), which are w_i = 1/σ_i². Furthermore, Q is the so-called Q-statistic defined by

Q = Σ w_i (Y_i − θ̂_F)², (6)

where θ̂_F is the fixed (or common) effect estimator of the overall effect with

θ̂_F = Σ w_i Y_i / Σ w_i. (7)

If the estimator τ̂²_DL is assumed to be a fixed constant equal to the true value of τ², then it holds that

Z = (θ̂_R − θ) √(Σ ŵ_i) ~ N(0, 1). (8)

This results in the DerSimonian-Laird (1 − α) confidence interval (DL) for θ, which is given by

( θ̂_R − z_{1−α/2} / √(Σ ŵ_i), θ̂_R + z_{1−α/2} / √(Σ ŵ_i) ), (9)

where z_γ is the γ quantile of the standard normal distribution. The assumption that the estimate τ̂²_DL is the true value of τ² might be reasonable when the between-study heterogeneity can be estimated with high precision, i.e. when the number of studies included in the meta-analysis is large. In medical applications, however, this is frequently not the case. As noted by several authors, the application of the DL approach in meta-analyses with small to moderate numbers of studies results in coverage probabilities below the nominal level 1 − α [4].

Henmi and Copas [22] tackled the two problems that (a) the distribution of the pivot statistic is quite different from the standard normal distribution when the number of studies n is small, and (b) the estimators of θ are biased due to selective publication of smaller studies with less favourable results (publication bias). With respect to the latter they note that the common (or fixed) effect estimator θ̂_F is more robust to publication bias than the random-effects estimator θ̂_R, simply because smaller studies, which are less likely to be published when their outcome is not favourable, have a smaller weight in the construction of θ̂_F than in θ̂_R. To address the problem of the normal approximation they derive the distribution of the pivot statistic based on the fixed-effect estimator under the random-effects model.
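The standard DL computations in (4)-(9) are straightforward to implement. The following Python sketch (function and variable names are ours, not from the paper) returns the random-effects estimate, the DL heterogeneity estimate and the standard (1 − α) confidence interval:

```python
import numpy as np
from scipy.stats import norm

def dersimonian_laird(y, se, alpha=0.05):
    """DerSimonian-Laird random-effects estimate and (1 - alpha) CI.

    y  : study effect estimates (e.g. log odds ratios)
    se : their standard errors sigma_i
    """
    y, se = np.asarray(y, float), np.asarray(se, float)
    n = len(y)
    w = 1.0 / se**2                       # fixed-effect weights w_i = 1 / sigma_i^2
    theta_f = np.sum(w * y) / np.sum(w)   # common-effect estimate, Eq. (7)
    Q = np.sum(w * (y - theta_f)**2)      # Q-statistic, Eq. (6)
    # DL estimator of tau^2, Eq. (5), truncated at zero
    tau2 = max(0.0, (Q - (n - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_star = 1.0 / (se**2 + tau2)         # random-effects weights, Eq. (4)
    theta_r = np.sum(w_star * y) / np.sum(w_star)
    se_r = 1.0 / np.sqrt(np.sum(w_star))
    z = norm.ppf(1 - alpha / 2)           # normal quantile of Eq. (9)
    return theta_r, tau2, (theta_r - z * se_r, theta_r + z * se_r)
```

Note the truncation at zero in (5): for identical study results the sketch returns a heterogeneity estimate of exactly zero, which is the behaviour criticized below for the few-studies setting.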
More specifically, the variance of θ̂_F is

V(τ²) = (τ² Σ w_i² + Σ w_i) / (Σ w_i)². (10)

The variance V(τ²) can be estimated by plugging in τ̂²_DL for τ². We denote this estimator of V(τ²) by

V(τ̂²_DL) = (τ̂²_DL Σ w_i² + Σ w_i) / (Σ w_i)². (11)

Recall that the random-effects weights ŵ_i also depend on τ̂²_DL. Hence, the pivot statistic U is given by

U = (θ̂_F − θ) / √V(τ̂²_DL). (12)

The point in the derivation of the distribution of U by Henmi and Copas [22] is to take into account the random variation of τ̂²_DL in addition to that of θ̂_F, as follows. The distribution function of U can be written as

P(U ≤ u) = 1 − ∫₀^∞ P( Q ≤ f⁻¹(r/u) | R = r ) p_R(r) dr  (if u ≥ 0),
P(U ≤ u) = ∫_{−∞}^0 P( Q ≤ f⁻¹(r/u) | R = r ) p_R(r) dr  (if u < 0), (13)

where the random variable R and the function f are defined by

R = Σ w_i (Y_i − θ) / √(Σ w_i)  and  f(Q) = √( Σ w_i² {Q − (n − 1)} / ((Σ w_i)² − Σ w_i²) + 1 ), (14)

respectively. The function p_R(r) is the probability density function of R, which is the normal density with mean zero and variance 1 + τ² (Σ w_i² / Σ w_i). The conditional distribution of Q given R, which is necessary to calculate the integral in (13), is a little complicated, but it is well approximated by the gamma distribution whose mean and variance coincide with the exact conditional mean M(R) and variance V(R) of Q given R, respectively (see [22] and its Appendix A for the explicit formulas of M(R) and V(R) and their derivation). Since the conditional mean M(R) and variance V(R) depend on the unknown true value of τ², as does the variance of R, Henmi and Copas [22] proposed to use the DL estimator τ̂²_DL for τ² again to approximate these quantities.
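Henmi and Copas evaluate (13) by numerical integration with the gamma approximation to Q given R. As a cruder but transparent alternative, the quantiles of U can also be approximated by plain parametric simulation from the marginal model (3) with a plug-in value for τ². The sketch below is illustrative only and is not the authors' implementation; it simply draws study estimates, recomputes θ̂_F, τ̂²_DL and V(τ̂²_DL) each time, and reads off empirical quantiles of (12):

```python
import numpy as np

def u_quantiles(se, tau2, probs=(0.025, 0.975), n_sim=20000, seed=1):
    """Monte Carlo approximation to quantiles of the pivot
    U = (theta_F_hat - theta) / sqrt(V(tau2_DL_hat)), Eq. (12),
    simulating Y_i ~ N(theta, sigma_i^2 + tau2) as in Eq. (3).
    A plug-in value for tau2 is required, just as in the
    numerical-integration approach."""
    rng = np.random.default_rng(seed)
    se = np.asarray(se, float)
    n = len(se)
    w = 1.0 / se**2
    sw, sw2 = w.sum(), (w**2).sum()
    u = np.empty(n_sim)
    for k in range(n_sim):
        y = rng.normal(0.0, np.sqrt(se**2 + tau2))  # theta = 0 w.l.o.g.
        theta_f = np.sum(w * y) / sw
        Q = np.sum(w * (y - theta_f)**2)
        tau2_dl = max(0.0, (Q - (n - 1)) / (sw - sw2 / sw))  # Eq. (5)
        v = (tau2_dl * sw2 + sw) / sw**2                     # Eq. (11)
        u[k] = theta_f / np.sqrt(v)                          # pivot U, Eq. (12)
    return np.quantile(u, probs)
```

In small meta-analyses with positive heterogeneity the resulting 97.5% quantile typically exceeds the normal value 1.96, which is exactly why the DL interval (9) undercovers there.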
Under this setting, the (approximate) γ quantile u_γ of U can be obtained by means of numerical integration and optimization (see Appendix B in [22] for an implementation in R), and hence a (1 − α) confidence interval for θ is given by

( θ̂_F − u_{1−α/2} √V(τ̂²_DL), θ̂_F + u_{1−α/2} √V(τ̂²_DL) ). (15)

In simulation studies, Henmi and Copas [22] could show that their approach improves coverage probabilities as compared to standard procedures including the DL approach. With only few studies included in the meta-analysis, however, the performance is not satisfying. The poor performance of the method in this particular situation is caused (at least partly) by the use of the DL estimator τ̂²_DL in the computation of the quantiles of the pivot statistic U as above, since τ̂²_DL frequently results in zero estimates with few studies although the between-trial heterogeneity is positive, τ² > 0. A remedy is to place a weakly informative prior on the heterogeneity parameter [25, 26]. A number of suggestions have been made on the choice of such weakly informative priors for τ, including half-t [26] and half-normal distributions [4]. Here we follow Chung et al [3], who proposed to use a gamma distribution with shape η and rate λ as a prior for τ, specifically p(τ) = λ^η τ^{η−1} e^{−λτ} / Γ(η) with gamma function Γ(η). This choice means that the logarithm of the posterior of θ and τ is equal to the log likelihood plus a term depending only on τ but not on θ. Rather than using the mean or median of the posterior, Chung et al [3] consider the mode, which can be computed by numerical optimization. This estimator of τ is referred to as the Bayes modal (BM) estimator τ̂_BM. As defaults, Chung et al recommend η = 2 and λ close to 0. The BM estimator τ̂_BM can be interpreted as a penalized maximum likelihood (ML) estimator [3]. In this paper, we propose to replace the DL estimator τ̂²_DL in the computation of the quantiles of the pivot statistic U by the BM estimator τ̂²_BM.
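A minimal sketch of the BM estimator as a penalized profile likelihood is given below, assuming the Gamma(η, λ) prior on τ described above. Function names, the profiling of θ, and the search bounds are our own illustrative choices, not the authors' code:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bayes_modal_tau(y, se, eta=2.0, lam=1e-4):
    """Bayes modal (BM) estimate of the between-study SD tau in the
    spirit of Chung et al.: the mode of the profile posterior under a
    Gamma(eta, lam) prior on tau.  With eta = 2 the penalty term
    (eta - 1) * log(tau) keeps the estimate strictly away from zero."""
    y, se = np.asarray(y, float), np.asarray(se, float)

    def neg_log_post(tau):
        v = se**2 + tau**2
        theta_hat = np.sum(y / v) / np.sum(1.0 / v)        # profile out theta
        loglik = -0.5 * np.sum(np.log(v) + (y - theta_hat)**2 / v)
        logprior = (eta - 1.0) * np.log(tau) - lam * tau   # Gamma(eta, lam) prior
        return -(loglik + logprior)

    # ad hoc search interval; the upper bound just needs to be generous
    res = minimize_scalar(neg_log_post,
                          bounds=(1e-8, 10.0 * np.std(y) + 1.0),
                          method="bounded")
    return res.x
```

For three identical study results with unit standard errors the DL estimate is exactly zero, whereas this penalized-likelihood sketch returns a strictly positive value (about 1/√2 for λ close to 0), illustrating the avoidance of zero estimates discussed above.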
The choice of the BM estimator is motivated by its performance in comparison to other estimators in recent simulation studies (see e.g. Figures 2 and 3 in [4]). The resulting γ quantile is denoted by u^(BM)_γ. The (1 − α) confidence interval for θ is then given by

( θ̂_F − u^(BM)_{1−α/2} √V(τ̂²_DL), θ̂_F + u^(BM)_{1−α/2} √V(τ̂²_DL) ). (16)

In summary, our idea is that we still use the DL estimator τ̂²_DL in the construction of the pivot statistic U given in (12), in the same way as Henmi and Copas [22], but we use the BM estimator τ̂²_BM instead of τ̂²_DL in the approximate calculation of the distribution of U. The reason for the use of the DL estimator τ̂²_DL in the construction of U is that it makes it easier to calculate the distribution of the pivot statistic U while taking into account the effect of estimating τ². However, the distribution of U depends on the unknown true value of τ², and it is necessary to use some estimate of τ² to approximate this distribution. One possibility is to use the DL estimator τ̂²_DL again, as was done in [22], but this would be inaccurate unless the number of studies is sufficiently large. Hence, we propose the use of the BM estimator τ̂²_BM to improve the accuracy in estimating τ² and in approximating the distribution of U, which we expect to lead to an improvement of the coverage probabilities over the Henmi-Copas (HC) confidence interval (15). In the next section we show by simulation studies that the new confidence interval (16) actually improves on the HC confidence interval (15) as well as on the DL confidence interval (9) in terms of coverage probability, both with and without publication bias, especially when the number of studies is small.

Table 1: Summary of the scenarios considered in the simulation study

Parameter                        Values
Treatment effect θ               […]
Between-trial heterogeneity τ    0.05, […], […]
Number of studies n              3, 6, 9, […]
Moderate publication bias        β = 4, γ = 3
Severe publication bias          β = 4, γ = 1

In order to compare the performance of the proposed approach with previously suggested procedures, a Monte Carlo simulation study was conducted. As comparators, the methods by Henmi and Copas [22] (HC), Chung et al [3] (BM), Doi et al [23] (IVH) and DerSimonian and Laird [2] (DL) were included. The first one is known to be robust to publication bias to some extent, but its performance in meta-analyses with few studies only is unknown. The approach by Chung et al [3] was developed for the scenario of few studies but might not be robust to publication bias. Doi et al [23] proposed the inverse variance heterogeneity model. As with the HC approach, the interval is centred around an estimator assuming the common-effect model. Therefore, it might have attractive properties in settings with publication bias. In contrast to the HC approach, however, it is based on a normal approximation. This approach was not included in recent method comparison studies [24]. The DL approach was included here as it is often considered to be the standard approach to random-effects meta-analysis. The simulation model by Brockwell and Gordon [27] formed the basis for our simulation study. It was used in several recent simulation studies and therefore appeared to be a good choice. To account for publication bias, we used the same selection function (the probability that a study with an outcome y and associated standard error σ is selected into the meta-analysis)

P(selected | y, σ) = exp[ −β {Φ(−y/σ)}^γ ] (17)

as in [22], with the same sets of the parameters β and γ for moderate and severe publication bias. Here, Φ is the cumulative distribution function of the standard normal distribution. Table 1 summarizes the simulation scenarios considered. Per scenario, N = 2,000 simulation replications were run.

Figure 1 presents the simulated coverage probabilities for the different confidence intervals in the various scenarios. In all scenarios considered, the proposed method performs at least as well as the HC method in terms of the coverage probability. With larger numbers of studies, say n ≥ 9, and more pronounced between-trial heterogeneity, say τ ≥ […]. With n = 3 or n = 6, and only low levels of between-trial heterogeneity, τ = 0.05, the coverage probabilities of the BM approach are slightly higher than those of the proposed method. In the scenarios with publication bias, however, the coverage probabilities of the BM approach rapidly decrease well below the nominal level of 0.95 with increasing numbers of studies included in the meta-analysis and increasing levels of between-trial heterogeneity. Without publication bias, the coverage of the IVH interval is similar to the coverage of the DL interval, i.e. poor for small numbers of studies n and closer to the nominal level for larger n. In the settings with publication bias the coverage probabilities of the IVH intervals are generally larger than those of the DL approach, in particular with more pronounced heterogeneity τ and larger numbers of studies n. However, the coverage probabilities are below those achieved by the HC and HC-BM approaches. Overall, the coverage probabilities of the proposed approach are closest to the nominal level, whereas the coverages for the DL approach are well below the nominal level for several scenarios characterized by publication bias and small numbers of studies included in the meta-analysis.

In scenarios where different methods resulted in similar coverage probabilities close to the nominal level, it is of interest to compare the lengths of the intervals obtained by these methods. Shorter intervals with the same coverage would of course be preferred. Table 2 gives the median interval lengths of the different confidence intervals for various levels of publication bias and heterogeneity τ as well as numbers of studies n included in the meta-analysis. For instance, in the setting without publication bias and n = 3 studies, the median length of our proposed confidence interval (HC-BM interval) is 1.12, slightly smaller than the median length of the BM intervals (1.15), although its coverage of 0.96 is just below the coverage of the BM intervals (0.97).
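The selection function (17) used to induce publication bias in the simulations is simple to implement. The sketch below (function name is ours) can be used to thin simulated studies; small studies with unfavourable outcomes are the least likely to be retained:

```python
import numpy as np
from scipy.stats import norm

def selection_prob(y, sigma, beta, gamma):
    """Publication-selection function of Eq. (17): the probability that a
    study with outcome y and standard error sigma enters the meta-analysis.
    Larger beta and smaller gamma correspond to more severe selection."""
    return np.exp(-beta * norm.cdf(-y / sigma) ** gamma)

# Example: with (beta, gamma) = (4, 3), a study with a clearly positive
# outcome is almost always selected, while a null result is selected with
# probability exp(-4 * 0.5**3) = exp(-0.5), roughly 0.61.
```

In a simulation loop one would keep a generated study with probability `selection_prob(y, sigma, beta, gamma)` and redraw otherwise, until n studies have been "published".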
Similarly, with n = 6 studies the median lengths of the HC-BM and BM intervals are 0.76 and 0.67, respectively. In scenarios where the coverages of the HC and HC-BM intervals are close, the median lengths of the intervals are similar again. In the scenarios without publication bias, where the coverages of the DL and IVH intervals are similar, the IVH intervals tend to be longer than the DL intervals when heterogeneity is present (i.e. τ > 0).

Figure 1: Coverage probabilities of the various confidence intervals (circle: HC, cross: DL, dot: HC-BM, plus: BM, triangle: IVH) depending on the number of studies n included in the meta-analysis, for no, moderate and severe publication bias and for different degrees of between-trial heterogeneity τ.

Crins et al [21] report a systematic review and meta-analysis evaluating Interleukin-2 receptor antibodies (IL-2RA) for immunosuppression in children who underwent liver transplantation. The authors identified a total of six controlled studies including two randomized trials. Given the heterogeneity in the designs of the studies, some between-study heterogeneity in the treatment effects can be expected. Although Crins et al [21] did not identify any publication bias by visual inspection of funnel plots and formal tests for asymmetry of these plots, this provides little reassurance that indeed no publication bias is present, since the number of studies is fairly small, which hinders the identification of publication bias in funnel plots or formal hypothesis tests. Therefore, there is a need for methods for random-effects meta-analyses robust to publication bias in this setting. The endpoint acute rejections was reported in all six studies identified in the systematic review, whereas only three also reported the outcome steroid-resistant rejections. Table 3 summarizes the findings for both outcomes.

These data were previously considered by Friede et al [4], who applied several point estimators and confidence intervals of the overall effect, including DL and BM, to them. For acute rejections, DL and BM yielded log odds ratios (95% confidence intervals) of -1.59 (-2.21, -0.96) and -1.61 (-2.35, -0.87), respectively. The between-study heterogeneity was estimated as τ̂_DL = 0.16 and τ̂_BM = 0.38 with the DL and BM methods, respectively. The fixed-effect estimate of the overall effect is -1.56, smaller in absolute value than the random-effects estimates. The HC interval, given by (-2.24, -0.89), is centred around the fixed-effect estimate. The HC-BM interval proposed here is calculated as (-2.31, -0.82), which is considerably wider than the HC interval.

For steroid-resistant rejections, DL and BM resulted in log odds ratios (95% confidence intervals) of -1.21 (-2.28, -0.15) and -1.32 (-2.78, 0.14), respectively. Whereas the DL method results in a statistically significant treatment difference, the effect is not statistically significant with the BM approach, although the point estimate hints at a more pronounced treatment effect. This is explained by the larger between-study heterogeneity of τ̂_BM = 0.87 with the BM method, which compares to τ̂_DL = 0.14 with the DL method. These compare to the fixed-effect estimate of -1.17, with 95% confidence intervals of (-2.24, -0.09) and (-2.53, 0.20) for the HC and HC-BM methods, respectively. Again, the fixed-effect estimate is smaller in absolute value than the effects obtained from random-effects meta-analyses. Furthermore, the HC-BM confidence interval is wider than the HC interval. Here, this wider interval means that the effect is no longer statistically significant at the usual 5% level.
Meta-analyses of only a few studies are very common, but pose a number of challenges. These include the estimation of between-trial heterogeneity as well as the assessment of publication bias. Here we proposed a method that faces both challenges successfully. The confidence interval for the overall effect proposed by Henmi and Copas [22] was improved by replacing the DerSimonian-Laird estimator by the Bayes modal estimator of Chung et al [3] in the computation of the quantiles used to construct the confidence interval. The use of a weakly informative prior biases the Bayes modal estimator away from zero. This resulted in larger quantiles, in particular in situations with few studies and only small to moderate levels of between-trial heterogeneity, which improved the coverage of the confidence intervals.

There are a number of limitations. We focused on properties related to estimating the overall effect and did not consider other parameters such as the heterogeneity τ [31]. Furthermore, we refrained from investigating other selection functions, since Henmi and Copas state that their "experience of working with other such models suggests that the extent of bias depends much more on the choice of selection parameters [. . . ] than it does on the particular mathematical form of the selection function itself" [22]. Also, we did not include other comparators such as the Knapp-Hartung-Sidik-Jonkman approach [6, 7, 8], since extensive comparisons were included in the paper by Henmi and Copas [22] and also in more recent simulation studies [4, 11].

The normal-normal hierarchical model considered here is a standard model for random-effects meta-analyses. This model is very general but not without limitations, since effect estimates are modelled rather than the data directly, implying a two-step procedure.
For instance, considering binary outcomes and treatment effects summarized by odds ratios, Jackson et al [28] discuss six alternative generalised linear mixed models which are more efficient one-step procedures. Modelling the data directly can have particular benefits when dealing with rare events; see for example Günhan et al [29] or Gronsbell et al [30]. The approach taken here to improve the coverage of confidence intervals of the overall effect in pairwise meta-analysis might also be useful in more complex settings such as meta-regression or network meta-analysis. The exploration of such opportunities is out of the scope of this manuscript but subject of future research.

Highlights
What is already known

• Estimated overall effects from meta-analyses might be impacted by reporting bias
• A confidence interval for the overall effect has been proposed that is to some extent robust to the selection of studies

What is new

• The performance of the previously proposed robust confidence interval is assessed in meta-analyses with few studies and found not to work well in this setting
• The approach is refined, resulting in improved coverage probabilities of the confidence intervals, in particular in meta-analyses with few studies

Potential impact for RSM readers outside the authors' field

• The refined approach is recommended for application in meta-analyses with few studies, yielding more reliable results
Data availability statement
The data used in Section 4 are provided in Table 3. Furthermore, they are given in the paper by Crins et al [21] and are also included in the R package bayesmeta available from CRAN.
Acknowledgements
The authors are grateful to Professor John Copas (Warwick) for discussions during his visit to Tokyo and Osaka in spring 2019.
ORCID
Satoshi Hattori 0000-0001-5446-2305
Tim Friede 0000-0001-5347-7441
References

[1] Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, Kuß O, Higgins JPT, Langan D, Salanti G. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods 2016; 7:55–79.
[2] DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7:177–188.
[3] Chung Y, Rabe-Hesketh S, Choi IH. Avoiding zero between-study variance estimates in random-effects meta-analysis. Statistics in Medicine 2013; 32:4071–4089.
[4] Friede T, Röver C, Wandel S, Neuenschwander B. Meta-analysis of few small studies in orphan diseases. Research Synthesis Methods 2017; 8:79–91.
[5] Follmann DA, Proschan MA. Valid inference in random effects meta-analysis. Biometrics 1999; 55:732–737.
[6] Hartung J, Knapp G. On tests of the overall treatment effect in meta-analysis with normally distributed responses. Statistics in Medicine 2001; 20:1771–1782.
[7] Sidik K, Jonkman JN. A simple confidence interval for meta-analysis. Statistics in Medicine 2002; 21:3153–3159.
[8] Knapp G, Hartung J. Improved tests for a random effects meta-regression with a single covariate. Statistics in Medicine 2003; 22:2693–2710.
[9] Röver C, Knapp G, Friede T. Hartung-Knapp-Sidik-Jonkman approach and its modification for random-effects meta-analysis with few studies. BMC Medical Research Methodology 2015; 15:99.
[10] Seide SE, Röver C, Friede T. Likelihood-based random-effects meta-analysis with few studies: empirical and simulation studies. BMC Medical Research Methodology 2019; 19:16.
[11] Friede T, Röver C, Wandel S, Neuenschwander B. Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases. Biometrical Journal 2017; 59:658–671.
[12] Bender R, et al. Methods for evidence synthesis in the case of very few studies. Research Synthesis Methods 2018; 9:382–392.
[13] Rosenthal R. The file drawer problem and tolerance for null results. Psychological Bulletin 1979; 86:638–641.
[14] Higgins JPT, Green S (eds.). Cochrane Handbook for Systematic Reviews of Interventions. Chichester: Wiley; 2008.
[15] Jin ZC, Zhou XH, He J. Statistical methods for dealing with publication bias in meta-analysis. Statistics in Medicine 2015; 34:343–360.
[16] Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315:629–634.
[17] Duval S, Tweedie R. A nonparametric "trim and fill" method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association 2000; 95:89–98.
[18] Copas J, Shi JQ. Meta-analysis, funnel plots and sensitivity analysis. Biostatistics 2000; 1:247–262.
[19] Copas J, Jackson D. A bound for publication bias based on the fraction of unpublished studies. Biometrics 2004; 60:146–153.
[20] Henmi M, Copas JB, Eguchi S. Confidence intervals and P-values for meta-analysis with publication bias. Biometrics 2007; 63:475–482.
[21] Crins ND, Röver C, Goralczyk AD, Friede T. Interleukin-2 receptor antagonists for pediatric liver transplant recipients: a systematic review and meta-analysis of controlled studies. Pediatric Transplantation 2014; 18:839–850.
[22] Henmi M, Copas JB. Confidence intervals for random effects meta-analysis and robustness to publication bias. Statistics in Medicine 2010; 29:2969–2983.
[23] Doi SAR, Barendregt JJ, Khan S, Thalib L, Williams GM. Advances in the meta-analysis of heterogeneous clinical trials I: the inverse variance heterogeneity model. Contemporary Clinical Trials 2015; 45:130–138.
[24] [details lost] Research Synthesis Methods.
[25] Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Chichester: Wiley; 2004.
[26] Gelman A. Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis 2006; 1:515–534.
[27] Brockwell SE, Gordon IR. A comparison of statistical methods for meta-analysis. Statistics in Medicine 2001; 20:825–840.
[28] Jackson D, Law M, Stijnen T, Viechtbauer W, White IR. A comparison of seven random-effects models for meta-analyses that estimate the summary odds ratio. Statistics in Medicine 2018; 37:1059–1085.
[29] Günhan BK, Röver C, Friede T. Random-effects meta-analysis of few studies involving rare events. Research Synthesis Methods 2020; 11:74–90.
[30] [details lost] Statistics in Medicine.
[31] [details lost] Statistics in Medicine.

Table 2: Median lengths of the different confidence intervals for various levels of publication bias and heterogeneity τ as well as numbers of studies n included in the meta-analysis. [Table entries not recoverable.]