Gut | 2021

Incorrectly analysing stratified and minimised trials may lead to wrongfully rejecting superiority of interventions

Abstract

It is with great interest that we read the report of Yoshida et al on the effect of second-generation narrow band imaging compared with white light imaging on detecting early gastric cancer in high-risk patients. The trial was expertly designed with a large patient population and, although superiority of narrow band imaging could not be proven, has important implications for further research on this topic. However, an important issue concerning the analyses attracted our interest and we would like to comment on it.

The primary outcome, the difference in the proportion of patients in whom early gastric cancer was diagnosed, failed to reach statistical significance (p=0.412). This difference in proportions was tested for significance using Fisher’s exact test. This might not have been the proper method of analysis, as patients in the study were randomised using minimisation with a random component, stratified by institution, age and indication for endoscopy.

Imbalance of risk factors between treatment and control arms can occur by chance under simple randomisation, possibly leading to confounded treatment estimates. Stratification and minimisation are useful methods to ensure balance of risk factors between treatment arms. These methods can be beneficial in small and large trials, but for trials larger than 1000 patients little effect of minimisation on imbalance was found as compared with simple randomisation.

One of the assumptions of Fisher’s exact test is that samples are random and independent, which is not the case in this study. The problem that occurs with stratified or minimised randomisation is clustering between treatment groups, which introduces positive correlation between observations. This correlation violates the independence assumption and leads to standard errors (SEs) that are biased upwards, because tests for independent samples do not account for the correlation and therefore overestimate the variance of the treatment effect.
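The direction of this bias can be made explicit with a standard variance identity. As an illustrative sketch only (the symbols $\hat{p}_1$ and $\hat{p}_0$, denoting the observed proportions in the intervention and control arms, are introduced here and do not appear in the report), the variance of the estimated difference in proportions is:

```latex
\operatorname{Var}(\hat{p}_1 - \hat{p}_0)
  = \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_0)
  - 2\operatorname{Cov}(\hat{p}_1, \hat{p}_0)
```

A test that assumes independent samples implicitly sets the covariance term to zero. When balancing on prognostic factors induces a positive covariance between the arms, the true variance is smaller than the value the test assumes, so the SE is overestimated.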
As an SE that is biased upwards leads to inflated p values, not accounting for these balancing variables in the analysis may lead to wrongfully failing to reject the null hypothesis. This effect can be considerable: a reanalysis of a large trial showed twofold to fourfold increases in p values, and a simulation study showed reductions in power of up to 30 percentage points.[6] Thus, studies using either of the balancing methods should adjust their analysis for the balancing variables used in the randomisation procedure.

Considering the report of Yoshida et al, this could mean that they may have erroneously concluded that second-generation narrow band imaging was not superior to white light imaging. We cannot determine the precise effect adjustment would have had in the study by Yoshida et al, as a reanalysis requires the individual patient data. While adjustment is possible using Fisher’s exact test,[8] we suggest performing logistic regression analysis, as this allows adjustment for multiple minimisation variables, does not rely on inefficient stratification, and can be used to determine the confidence interval around the treatment effect estimate.

Unadjusted analyses in balanced randomised trials seem to be a recurring phenomenon. In 2012, a systematic review showed that only 26% of trials published in leading journals that used a balancing method correctly adjusted for all balancing factors. Even in more recent trials, analyses are often not adjusted for balancing factors, as shown by the study of Yoshida et al and by other trials in the leading journals of gastroenterology and hepatology.[9, 10]

When used correctly, minimisation and stratification are powerful tools for balancing randomised trials and improving the validity of studies. However, their use has important consequences for data analysis. As such, we urge trialists to include the balancing variables as adjustment factors in their statistical analyses.
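The adjusted analysis suggested above can be sketched in a few lines. The example below is a minimal illustration, not the analysis of the trial: the counts in `CELLS` are hypothetical, a single binary stratum stands in for the balancing variables, and the helper names (`solve`, `fit_logistic`, `expand`) are invented for this sketch. It fits a logistic regression by Newton-Raphson in plain Python, once with treatment only and once with the stratum added as a covariate, and prints the treatment odds ratio with a Wald 95% confidence interval. In practice an established statistical package would normally be used instead.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns (coefficients, Wald standard errors)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    info[j][k] += w * xi[j] * xi[k]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    # SEs from the diagonal of the inverse information matrix
    # (information from the final iteration, by which point the fit has converged).
    se = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se

def expand(cells, adjust):
    """Turn (treat, stratum, events, n) cells into one row per patient."""
    X, y = [], []
    for treat, stratum, events, n in cells:
        row = [1.0, float(treat)] + ([float(stratum)] if adjust else [])
        X += [row] * n
        y += [1] * events + [0] * (n - events)
    return X, y

# Hypothetical counts, NOT data from the trial: (treat, stratum, events, n)
CELLS = [(0, 0, 10, 100), (1, 0, 15, 100), (0, 1, 30, 100), (1, 1, 40, 100)]

for label, adjust in (("unadjusted", False), ("adjusted", True)):
    beta, se = fit_logistic(*expand(CELLS, adjust))
    lo, hi = (math.exp(beta[1] + z * se[1]) for z in (-1.96, 1.96))
    print(f"{label}: OR={math.exp(beta[1]):.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With several balancing variables, each would enter the model as an additional column (categorical factors such as institution as indicator columns); the treatment coefficient and its interval are read off in the same way.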

DOI 10.1136/gutjnl-2021-324936

Language English

Journal Gut
