In medicine and psychology, clinical relevance refers to the practical importance of a treatment effect, meaning whether a treatment has a real, perceptible impact on daily life. As medical and psychological treatments advance, it becomes increasingly important to understand how to effectively quantify the effects of these treatments.
Statistical significance is used mainly in hypothesis testing, where conclusions are drawn by testing a null hypothesis (for example, that there is no relationship between the variables). The significance level chosen (usually α = 0.05 or 0.01) is the probability of incorrectly rejecting a null hypothesis that is in fact true. If the difference between two groups is significant at α = 0.05, a difference at least as large as the one observed would be expected less than 5% of the time if it were due entirely to chance. However, this gives no indication of the magnitude or clinical importance of the difference.
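The gap between significance and magnitude can be illustrated with a small sketch. It assumes a two-sample z-test with equal group sizes and a known common standard deviation, a deliberate simplification; the function name `two_sided_p_value` and all numbers are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a two-sample z-test.

    Assumes equal group sizes and a known common standard deviation
    (a simplification made for this illustration).
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: a 0.5-point difference on a scale with SD 10.
mean_diff, sd = 0.5, 10.0
p_large = two_sided_p_value(mean_diff, sd, n_per_group=10_000)
p_small = two_sided_p_value(mean_diff, sd, n_per_group=50)

# The standardized difference is identical in both cases.
standardized_diff = mean_diff / sd

print(f"n = 10000 per group: p = {p_large:.4f}")
print(f"n = 50 per group:    p = {p_small:.4f}")
print(f"standardized difference = {standardized_diff:.2f}")
```

With the large sample the same small difference falls below α = 0.05, while with the small sample it does not, even though the size of the difference is unchanged.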
Practical significance concerns how effective an intervention or treatment is, that is, how large a change the treatment produces. In clinical treatment studies, it is usually expressed through quantitative measures such as effect size, number needed to treat (NNT), and preventive fraction. Effect size is one such measure: it quantifies how far a sample diverges from what would be expected and so conveys important information about a study's results. Reporting these measures helps medical professionals better assess the effectiveness of treatments.
Because effect sizes convey important information about study results, it is recommended that they be reported in addition to statistical significance.
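As a rough illustration of two of these measures, the sketch below computes Cohen's d (one common effect size, using a pooled standard deviation) and the number needed to treat, taken as the reciprocal of the absolute risk reduction. The text does not say which effect size is meant, so Cohen's d is an assumption here. The function names, symptom scores, and event rates are all made up for the example.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b), pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """NNT = 1 / absolute risk reduction, rounded up by convention."""
    absolute_risk_reduction = control_event_rate - treatment_event_rate
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce the event rate")
    return math.ceil(1 / absolute_risk_reduction)


# Hypothetical symptom scores (lower is better).
treatment_scores = [12, 15, 11, 14, 13]
control_scores = [17, 18, 16, 19, 15]
d = cohens_d(treatment_scores, control_scores)

# Hypothetical event rates: 40% of controls and 25% of treated patients relapse.
nnt = number_needed_to_treat(0.40, 0.25)

print(f"Cohen's d = {d:.2f}")
print(f"NNT = {nnt}")
```

A negative d here only reflects the direction of subtraction (treatment minus control on a scale where lower is better); its absolute value is what indicates the size of the effect.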
In psychology and psychotherapy, clinical significance is used as a technical term that indicates whether a treatment is effective enough to change a patient's diagnostic label. The question that clinical significance answers in clinical treatment research is "Is the treatment effective enough to return the patient to normal on diagnostic criteria?" For example, a treatment may produce a significant change in depressive symptoms (statistical significance), the change may be a large reduction in those symptoms (practical significance), and 40% of patients may no longer meet the diagnostic criteria for depression (clinical significance).
Even with a significant difference and a medium or large effect size, a treatment may still fail to transform a patient from a dysfunctional to a functional state.
There are many methods for calculating clinical significance, including the Jacobson-Truax method, the Gulliksen-Lord-Novick method, the Edwards-Nunnally method, the Hageman-Arrindell method, and the hierarchical linear model (HLM).
The Jacobson-Truax method involves calculating the Reliable Change Index (RCI), which is the difference between a participant's pre-test and post-test scores divided by the standard error of the difference. Participants are classified as "recovered," "improved," "unchanged," or "worsened" according to the direction of the RCI and whether the cut-off score was met.
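The RCI calculation and the four-way classification can be sketched as follows. Several details are assumptions that the description above does not state: the standard error of the difference is derived from the measure's standard deviation and reliability as sqrt(2) × SD × sqrt(1 − r), an RCI beyond ±1.96 is treated as reliable change, and lower scores are taken to mean fewer symptoms by default. The function names, parameter names, and example values are hypothetical.

```python
import math


def reliable_change_index(pre, post, sd, reliability):
    """RCI = (post - pre) / standard error of the difference."""
    se_measurement = sd * math.sqrt(1 - reliability)
    se_difference = math.sqrt(2 * se_measurement**2)
    return (post - pre) / se_difference


def classify(pre, post, sd, reliability, cutoff, lower_is_better=True, threshold=1.96):
    """Classify a participant using the RCI and a clinical cut-off score.

    Assumption: |RCI| > threshold counts as reliable change.
    """
    rci = reliable_change_index(pre, post, sd, reliability)
    # Flip the sign so that positive always means improvement.
    improvement = -rci if lower_is_better else rci
    passed_cutoff = post < cutoff if lower_is_better else post > cutoff

    if improvement > threshold:
        return "recovered" if passed_cutoff else "improved"
    if improvement < -threshold:
        return "worsened"
    return "unchanged"


# Hypothetical depression scale: SD 8, reliability 0.90, cut-off 14.
for post_score in (12, 20, 28, 40):
    label = classify(pre=30, post=post_score, sd=8, reliability=0.90, cutoff=14)
    print(f"pre = 30, post = {post_score}: {label}")
```

In this sketch a participant counts as "recovered" only when the change is reliable and the post-test score also crosses the cut-off; reliable change without crossing the cut-off is "improved".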
The Gulliksen-Lord-Novick method is similar to the Jacobson-Truax method, except that it takes regression to the mean into account. It does so by subtracting the group mean from the pre-test and post-test scores and dividing the result by the group standard deviation.
The Edwards-Nunnally method is a more stringent way of calculating clinical significance. It uses the reliability of the measure to move the pre-test score closer to the mean and then constructs a confidence interval around this adjusted pre-test score. As a result, a larger actual change from pre-test to post-test is required to show clinical significance than with the Jacobson-Truax method.
The Hageman-Arrindell method involves indices of both group change and individual change. The reliability of change indicates whether a patient has improved, stayed the same, or worsened. A second index, the clinical significance of change, sorts patients into four categories similar to those used by Jacobson-Truax: worsened, not reliably changed, improved but not recovered, and recovered.
Hierarchical linear modeling (HLM) uses growth curve analysis instead of relying solely on pre-test and post-test comparisons. It therefore requires at least three data points per patient rather than only two (pre-test and post-test).
In general, methods for calculating clinical significance are as varied as those for statistical and practical significance, reflecting both the differing effects of treatments and the variability among individual patients. So how can one determine whether a treatment actually improves a patient's quality of life, and what clinical significance lies behind that improvement?