Interaction between two exposures: determining odds ratios and confidence intervals for risk estimates

Abstract

In epidemiological research, it is common to investigate the interaction between risk factors for an outcome such as a disease, and hence to estimate the risk associated with exposure to either or both of the two risk factors under investigation. Interactions can be estimated on both the additive and the multiplicative scale using the same regression model. Here we review how to calculate interaction and estimate the risk and confidence interval for two exposures using a single regression model, and we describe the relationship between the measures, particularly the standard error for the combined exposure risk group.


Jesse Huang, Ingrid Kockum, Pernilla Stridh

* Equal contribution

Introduction

In epidemiological research, it is common to investigate the interaction between risk factors for an outcome such as a disease. This is done for a variety of reasons, for example to maximize the benefit of public health interventions by identifying at-risk groups or to better understand the etiology of disease [1]. Interactions can be investigated on different scales, usually the additive and the multiplicative scale [1], and it is recommended to investigate both [2]. Multiplicative interaction is usually estimated by fitting an interaction term between two exposures in a regression model [3]. Additive interaction is usually estimated by fitting separate risk models for the individual and double exposures, from which different measures of interaction can be estimated [4]. It is, however, also possible to estimate these measures from the same regression model that is used for estimating interaction on the multiplicative scale, as recommended by VanderWeele and Knol [2]. An advantage of this approach is that additive and multiplicative interaction are estimated at the same time and adjusted for covariates in the same way, which makes the estimates comparable. Here we review how to calculate interaction and estimate the risk and confidence interval for two exposures using a single regression model, and we describe the relationship between the measures, particularly the standard error for the combined exposure risk group.

1. The relationship between the risk and multiplicative interaction estimates for two exposures.

In a standard case-control study with a binary outcome Z = {0,1} and two binary exposures X, Y = {0,1}, the frequency of all combinations can be represented as in Table 1. Each cell count carries a subscript for the outcome: 1 for Z = 1 (+) and 0 for Z = 0 (-).

Table 1. Frequency distribution of two dichotomous exposures.

| Exposure 1 (X) | Exposure 2 (Y) | Outcome Z = 0 (-) | Outcome Z = 1 (+) |
|----------------|----------------|-------------------|-------------------|
| 0 (-)          | 0 (-)          | $a_0$             | $a_1$             |
| 0 (-)          | 1 (+)          | $b_0$             | $b_1$             |
| 1 (+)          | 0 (-)          | $c_0$             | $c_1$             |
| 1 (+)          | 1 (+)          | $d_0$             | $d_1$             |

The risk association between the outcome and the exposures can be estimated using the logistic regression model shown below.

[Equation 1.1] $Z \sim \beta_0 + \beta_X (X) + \beta_Y (Y) + \beta_{XY} (X \cdot Y)$

The odds ratio (OR) can be derived as the natural exponential of the regression coefficients, with $\beta_X$, $\beta_Y$, and $\beta_{XY}$ corresponding to the added risk due to only the first exposure (X, $OR_{10}$), the added risk due to only the second exposure (Y, $OR_{01}$), and the multiplicative interaction (MI) term, respectively.

[Equation 1.2] $OR_{10} = \exp(\beta_X)$
[Equation 1.3] $OR_{01} = \exp(\beta_Y)$
[Equation 1.4] $MI = \exp(\beta_{XY})$

The risk due to the combined exposure to both X and Y, $OR_{11}$, is similarly the exponential of $\beta_{X+Y}$, which is equal to the sum of the regression coefficients of the first exposure ($\beta_X$), the second exposure ($\beta_Y$), and the multiplicative term ($\beta_{XY}$).

[Equation 1.5] $\beta_{X+Y} = \beta_X + \beta_Y + \beta_{XY}$
[Equation 1.6] $OR_{11} = \exp(\beta_{X+Y})$

In addition, the 95% confidence intervals (CI) can be calculated using the corresponding standard error (SE) for each estimate.

[Equation 1.7] $CI_{10} = \exp(\beta_X \pm 1.96 \cdot SE_X)$
[Equation 1.8] $CI_{01} = \exp(\beta_Y \pm 1.96 \cdot SE_Y)$
[Equation 1.9] $CI_{MI} = \exp(\beta_{XY} \pm 1.96 \cdot SE_{XY})$
[Equation 1.10] $CI_{11} = \exp(\beta_{X+Y} \pm 1.96 \cdot SE_{X+Y})$

Measures of additive interaction as described by Rothman [5] include the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP), and the synergy index (SI), which can be estimated with the following [6]:

[Equation 1.11] $RERI = OR_{11} - OR_{10} - OR_{01} + 1$
[Equation 1.12] $AP = RERI / OR_{11}$
[Equation 1.13] $SI = (OR_{11} - 1) / [(OR_{10} - 1) + (OR_{01} - 1)]$

Example 1. Example of an R-based logistic regression analysis (glm), Equation 1.1

    > summary(glm(Z~X+Y+X*Y,data=test,family="binomial"))

    Call:
    glm(formula = Z ~ X + Y + X * Y, family = "binomial", data = test)

    Deviance Residuals:
       Min      1Q  Median      3Q     Max
    -1.223  -1.217   1.133   1.138   1.195

    Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.0930904  0.1246494   0.747    0.455
    X           -0.1345902  0.1792823  -0.751    0.453
    Y            0.0007283  0.1766264   0.004    0.997
    X:Y          0.1469936  0.2533262   0.580    0.562

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 1385.3  on 999  degrees of freedom
    Residual deviance: 1384.4  on 996  degrees of freedom
    AIC: 1392.4

    Number of Fisher Scoring iterations: 3

(Obtained using simulated data (n = 1000) with randomly generated distributions.)
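As a minimal sketch of Equations 1.2-1.6 and 1.11-1.13, the R snippet below plugs the coefficients reported in Example 1 into the formulas. The variable names (`b_x`, `or10`, and so on) are illustrative assumptions and are not part of the original analysis.

```r
# Coefficients copied from the Example 1 output (simulated data).
b_x  <- -0.1345902   # beta_X
b_y  <-  0.0007283   # beta_Y
b_xy <-  0.1469936   # beta_XY (interaction term)

# Equations 1.2-1.4: odds ratios for the single exposures and the
# multiplicative interaction.
or10 <- exp(b_x)
or01 <- exp(b_y)
mi   <- exp(b_xy)

# Equations 1.5-1.6: combined exposure estimate and odds ratio.
b_comb <- b_x + b_y + b_xy
or11   <- exp(b_comb)

# Equations 1.11-1.13: measures of additive interaction.
reri <- or11 - or10 - or01 + 1
ap   <- reri / or11
si   <- (or11 - 1) / ((or10 - 1) + (or01 - 1))

print(round(c(OR10 = or10, OR01 = or01, MI = mi, OR11 = or11,
              RERI = reri, AP = ap, SI = si), 4))
```

Because the simulated exposures have essentially no effect, the odds ratios are all close to 1; the snippet is meant to show the arithmetic rather than a meaningful interaction.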

2. The relationship between the effect estimate and standard error with the frequency distribution.

The effect estimates and standard errors can also be expressed in terms of the frequency data from Table 1, where each cell count carries a subscript for the outcome (for example, $d_1$ and $d_0$ are the numbers of doubly exposed individuals with Z = 1 and Z = 0). Odds ratios are defined as the odds of the outcome (positive:negative) among the exposed divided by the odds among the unexposed. For example, $OR_{11} = (d_1 / d_0) / (a_1 / a_0)$. This means Equation 1.6 can be written as:

[Equation 2.1] $\beta_{X+Y} = \ln\left( (d_1 / d_0) / (a_1 / a_0) \right)$

The same can be derived for the coefficients $\beta_X$ and $\beta_Y$.

[Equation 2.2] $\beta_X = \ln\left( (c_1 / c_0) / (a_1 / a_0) \right)$
[Equation 2.3] $\beta_Y = \ln\left( (b_1 / b_0) / (a_1 / a_0) \right)$

The intercept ($\beta_0$) is defined by the reference group 0/0 (a).

[Equation 2.4] $\beta_0 = \ln( a_1 / a_0 )$

Similarly, standard errors (SE) can be calculated in relation to the frequency table.

[Equation 2.5] $SE_X = \sqrt{ 1/c_1 + 1/c_0 + 1/a_1 + 1/a_0 }$
[Equation 2.6] $SE_Y = \sqrt{ 1/b_1 + 1/b_0 + 1/a_1 + 1/a_0 }$

The standard error of the double exposure estimate $\beta_{X+Y}$ ($SE_{X+Y}$) is then:

[Equation 2.7] $SE_{X+Y} = \sqrt{ 1/d_1 + 1/d_0 + 1/a_1 + 1/a_0 }$

In addition, the standard error of the intercept ($SE_0$) is determined by:

[Equation 2.8] $SE_0 = \sqrt{ 1/a_1 + 1/a_0 }$
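The frequency-based expressions in Equations 2.1-2.8 can be checked numerically. The sketch below uses hypothetical cell counts (they are not taken from the paper), with subscript 1 for cases (Z = 1) and 0 for controls (Z = 0), and compares the closed-form values with a logistic model fitted to the same grouped counts. The comparison for $SE_{X+Y}$ relies on the standard result that the variance of a sum of coefficients is the sum of all entries of their covariance matrix, which is an addition for illustration and not a step described in the text.

```r
# Hypothetical cell counts (assumed for illustration only).
a1 <- 120; a0 <- 130   # X = 0, Y = 0 (reference group)
b1 <- 110; b0 <- 140   # X = 0, Y = 1
c1 <- 125; c0 <- 115   # X = 1, Y = 0
d1 <- 150; d0 <- 110   # X = 1, Y = 1

# Equations 2.1-2.4: estimates from the frequency table.
beta_0    <- log(a1 / a0)
beta_x    <- log((c1 / c0) / (a1 / a0))
beta_y    <- log((b1 / b0) / (a1 / a0))
beta_comb <- log((d1 / d0) / (a1 / a0))

# Equations 2.5-2.8: standard errors from the frequency table.
se_0    <- sqrt(1 / a1 + 1 / a0)
se_x    <- sqrt(1 / c1 + 1 / c0 + 1 / a1 + 1 / a0)
se_y    <- sqrt(1 / b1 + 1 / b0 + 1 / a1 + 1 / a0)
se_comb <- sqrt(1 / d1 + 1 / d0 + 1 / a1 + 1 / a0)

# Logistic regression with an interaction term on the grouped counts.
tab <- data.frame(X = c(0, 0, 1, 1), Y = c(0, 1, 0, 1),
                  cases    = c(a1, b1, c1, d1),
                  controls = c(a0, b0, c0, d0))
fit <- glm(cbind(cases, controls) ~ X * Y, family = binomial, data = tab)
est <- coef(summary(fit))   # columns include "Estimate" and "Std. Error"

# Combined exposure from the model: sum of the three coefficients, with
# the variance taken as the sum of their covariance matrix entries.
terms         <- c("X", "Y", "X:Y")
beta_comb_glm <- sum(coef(fit)[terms])
se_comb_glm   <- sqrt(sum(vcov(fit)[terms, terms]))

comparison <- rbind(
  beta_0    = c(table = beta_0,    glm = est["(Intercept)", "Estimate"]),
  beta_X    = c(table = beta_x,    glm = est["X", "Estimate"]),
  beta_Y    = c(table = beta_y,    glm = est["Y", "Estimate"]),
  "beta_X+Y" = c(table = beta_comb, glm = beta_comb_glm),
  SE_0      = c(table = se_0,      glm = est["(Intercept)", "Std. Error"]),
  SE_X      = c(table = se_x,      glm = est["X", "Std. Error"]),
  SE_Y      = c(table = se_y,      glm = est["Y", "Std. Error"]),
  "SE_X+Y"  = c(table = se_comb,   glm = se_comb_glm)
)
print(round(comparison, 6))
```

The two columns should agree up to the numerical tolerance of the model fit, since a model with both main effects and the interaction reproduces the four observed group odds exactly.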

3. Standard error of the combined exposure risk estimate

Although standard errors are typically given for $\beta_X$ ($OR_{10}$) and $\beta_Y$ ($OR_{01}$), neither the estimate nor the standard error for the combined exposure ($\beta_{X+Y}$ and $SE_{X+Y}$, respectively) is typically provided. The estimate can be derived using Equation 1.5, and the equation for calculating $SE_{X+Y}$ can be derived from the estimates that are typically given (see Figure 1). In what follows, $d_1$ and $d_0$ are the numbers of doubly exposed individuals with Z = 1 and Z = 0, $a_1$ and $a_0$ are the corresponding counts in the unexposed reference group, and $\beta_0$ and $SE_0$ are the intercept and its standard error.

Starting with Equation 2.7, the standard error of the combined exposure risk ($SE_{X+Y}$) is:

$SE_{X+Y} = \sqrt{ 1/d_1 + 1/d_0 + 1/a_1 + 1/a_0 }$

From Equation 2.8, we know that $1/a_1 + 1/a_0 = (SE_0)^2$, therefore

[Equation 3.1] $SE_{X+Y} = \sqrt{ 1/d_1 + 1/d_0 + (SE_0)^2 }$

We can derive $d_1$ and $d_0$ from Equation 2.1 and Equation 2.4.

$\beta_{X+Y} = \ln\left[ (d_1 / d_0) / (a_1 / a_0) \right]$ (Equation 2.1)
$\beta_0 = \ln( a_1 / a_0 )$ (Equation 2.4)
$d_1 / d_0 = \exp(\beta_{X+Y}) \cdot \exp(\beta_0)$
$d_1 / d_0 = \exp(\beta_{X+Y} + \beta_0)$

[Equation 3.2] $d_0 = d_1 / \exp(\beta_{X+Y} + \beta_0)$

With $n$ the number of samples with both exposures:

$n = d_0 + d_1$
$n = d_1 / \exp(\beta_{X+Y} + \beta_0) + d_1$
$n = d_1 \cdot \left( 1 / \exp(\beta_{X+Y} + \beta_0) + 1 \right)$

[Equation 3.3] $d_1 = n / \left( 1 / \exp(\beta_{X+Y} + \beta_0) + 1 \right)$

Replace $d_1$ and $d_0$ in Equation 3.1 using Equations 3.2-3.3.

$SE_{X+Y} = \sqrt{ 1/d_1 + 1/\left( d_1 / \exp(\beta_{X+Y} + \beta_0) \right) + (SE_0)^2 }$
$SE_{X+Y} = \sqrt{ \left( 1 + \exp(\beta_{X+Y} + \beta_0) \right) / d_1 + (SE_0)^2 }$
$SE_{X+Y} = \sqrt{ \left( 1 + \exp(\beta_{X+Y} + \beta_0) \right) / \left( n / \left( 1 / \exp(\beta_{X+Y} + \beta_0) + 1 \right) \right) + (SE_0)^2 }$
$SE_{X+Y} = \sqrt{ \left( 1 + \exp(\beta_{X+Y} + \beta_0) \right) \cdot \left( 1 / \exp(\beta_{X+Y} + \beta_0) + 1 \right) / n + (SE_0)^2 }$

[Equation 3.4] $SE_{X+Y} = \sqrt{ \left( \exp(\beta_{X+Y} + \beta_0) + 1 / \exp(\beta_{X+Y} + \beta_0) + 2 \right) / n + (SE_0)^2 }$

[Equation 3.5] $SE_{X+Y} = \sqrt{ (J + 1/J + 2) / n + (SE_0)^2 }$

where $J = \exp(\beta_{X+Y} + \beta_0) = \exp(\beta_X + \beta_Y + \beta_{XY} + \beta_0) = OR_{X+Y} \cdot OR_0 = OR_{XY} \cdot OR_X \cdot OR_Y \cdot OR_0$, with $OR_0 = \exp(\beta_0)$ denoting the odds of the outcome in the reference group.

Using the measures from Example 1, we can calculate both the estimate and the standard error of the combined exposure ($SE_{X+Y}$), given the number of samples with both exposures (n = 245).

$\beta_{X+Y} = \beta_X + \beta_Y + \beta_{XY}$
$\beta_{X+Y} = -0.1345902 + 0.0007283 + 0.1469936$
$\beta_{X+Y} = 0.0131317$

$J = \exp(\beta_X + \beta_Y + \beta_{XY} + \beta_0)$
$J = \exp(-0.1345902 + 0.0007283 + 0.1469936 + 0.0930904)$
$J = 1.112069$

$SE_{X+Y} = \sqrt{ (J + 1/J + 2) / n + (SE_0)^2 }$
$SE_{X+Y} = \sqrt{ (1.112069 + 1/1.112069 + 2) / n + (0.1246494)^2 }$
$SE_{X+Y} = \sqrt{ (1.112069 + 1/1.112069 + 2) / 245 + (0.1246494)^2 }$
$SE_{X+Y} = 0.178634$

We can check this value by fitting the same logistic regression model using the derived variable T, which is left undefined for individuals with exactly one exposure:

T = { 1 if (X = 1 & Y = 1); 0 if (X = 0 & Y = 0) }

Example 2. Calculating the combined exposure risk estimate and standard error using R

    Call:
    glm(formula = Z ~ T, family = "binomial", data = test)

    Deviance Residuals:
       Min      1Q  Median      3Q     Max
    -1.223  -1.217   1.133   1.138   1.138

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  0.09309    0.12465   0.747    0.455
    T            0.01313    0.17863   0.074    0.941

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 696.06  on 502  degrees of freedom
    Residual deviance: 696.06  on 501  degrees of freedom
      (497 observations deleted due to missingness)
    AIC: 700.06

    Number of Fisher Scoring iterations: 3
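The calculation in this section can be wrapped in a small helper. The sketch below is one possible way to code Equation 3.5 and the corresponding 95% interval, exp(estimate ± 1.96 × SE); the function name `se_combined` and its argument names are hypothetical, and the inputs are the coefficients from Example 1 together with n = 245.

```r
# Equation 3.5 as a function (name and arguments are illustrative).
#   b0, b_x, b_y, b_xy: intercept and coefficients from the single model
#   se0:                standard error of the intercept
#   n:                  number of samples with both exposures
se_combined <- function(b0, b_x, b_y, b_xy, se0, n) {
  j <- exp(b0 + b_x + b_y + b_xy)   # J in Equation 3.5
  sqrt((j + 1 / j + 2) / n + se0^2)
}

# Values copied from the Example 1 output.
b0   <-  0.0930904
se0  <-  0.1246494
b_x  <- -0.1345902
b_y  <-  0.0007283
b_xy <-  0.1469936
n_both <- 245

# Estimate (Equation 1.5) and standard error (Equation 3.5).
b_comb  <- b_x + b_y + b_xy
se_comb <- se_combined(b0, b_x, b_y, b_xy, se0, n_both)

# Odds ratio and 95% confidence interval for the combined exposure.
or11 <- exp(b_comb)
ci11 <- exp(b_comb + c(-1, 1) * 1.96 * se_comb)

print(round(c(beta = b_comb, SE = se_comb, OR11 = or11,
              lower = ci11[1], upper = ci11[2]), 5))
```

The estimate and standard error should match the values for T in Example 2 (0.01313 and 0.17863) after rounding, without refitting the model on a derived variable.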

Conclusion

In this overview, we detailed the relationship between risk measures and multiplicative interaction, along with a method for assessing both the risk and the confidence interval associated with each exposure group using the same regression model. The simplified process is easier to implement and can save resources in studies requiring repetitive calculations, such as investigations of gene-gene interactions, which often involve a large number of exposure pairs, or estimation of the significance of interactions by bootstrapping, which requires extensive resampling.

References

1. Greenland, S., Interactions in epidemiology: relevance, identification, and estimation. Epidemiology, 2009. (1): p. 14-7.
2. VanderWeele, T.J. and M.J. Knol, A tutorial on interaction. Epidemiol Methods, 2014. (1): p. 33-72.
3. Hosmer, D.W., Jr., S. Lemeshow, and R. Sturdivant, Applied Logistic Regression. 2013, Hoboken, NJ: Wiley.
4. Andersson, T., et al., Calculating measures of biological interaction. Eur J Epidemiol, 2005. (7): p. 575-9.
5. Rothman, K.J. and S. Greenland, Modern epidemiology. 2 ed. 1998, Philadelphia: Lippincott-Raven.
6. Kalilani, L. and J. Atashili, Measuring additive interaction using odds ratios. Epidemiol Perspect Innov, 2006. 3.