On Some Statistical and Axiomatic Properties of the Injury Severity Score
Nassim Dehouche [email protected] University International College, Salaya, 73170, Thailand.
Abstract
The Injury Severity Score (ISS) is a standard aggregate indicator of the overall severity of multiple injuries to the human body. This score is calculated by summing the squares of the three highest values of the Abbreviated Injury Scale (AIS) grades across six body regions of a trauma victim. Despite its widespread usage over the past four decades, little is known in the (mostly medical) literature on the subject about the axiomatic and statistical properties of this quadratic aggregation score. To bridge this gap, the present paper studies the ISS from the perspective of recent advances in decision science. We demonstrate some statistical and axiomatic properties of the ISS as a multicriteria aggregation procedure. Our study highlights some unintended, undesirable properties that stem from arbitrary choices in its design and that can lead to bias in its use as a patient triage criterion.
Keywords:
Multicriteria decision making, Injury severity score, Triage.
The Injury Severity Score (ISS) is a widely used aggregation procedure for assessing injuries to multiple body parts and evaluating the urgency of care. The ISS aggregates multiple evaluations based on the Abbreviated Injury Scale (AIS) [1], a common evaluation scale for the severity of trauma to individual body parts. The AIS evaluates the severity of damage to each of nine body regions (head, face, neck, thorax, abdomen, spine, upper extremities, lower extremities, and external) on a scale of 0 to 5 (a grade of 6 additionally indicates untreatable injuries; this value being immaterial to the purpose of this paper, we omit it from our analysis). The ISS is the sum of the squares of the AIS scores of the three most severe injuries and is thus evaluated on a scale of 0 to 75. After reviewing past work on the ISS, and notably the seminal study [2] that introduced this aggregation procedure, this paper questions the choice of a quadratic procedure relative to two other arbitrary aggregation functions (the sum and the sum of cubes of the three highest AIS scores). Then we study some axiomatic properties of the ISS and propose that an injury severity aggregation procedure should be seen as an adjustment lever to optimize target criteria, rather than a rigid formula that seeks to capture fundamental aspects of the response of the human body to injuries with a quadratic formula (as has been wildly conjectured in the original study in the face of the high correlation of the ISS with mortality).

The ISS was introduced in a study by Baker et al. [2], which considered a sample of 2,128 motor vehicle occupants who were victims of accidents and were admitted to one of 8 hospitals in the city of Baltimore, Maryland, USA, over a period of two years (1968-1969). For this sample, the study recorded a ratio of hospital admissions to deaths of 8:1. For individual hospitals, this ratio ranged from 5:1 to 60:1, indicating different levels of severity of injuries for the typical patient that each hospital received. Table 1 presents the distribution of AIS for the main injury of each patient in the sample, while Table 5 details the mortality rates corresponding to the highest AIS grade of patients.
AIS Grade    Dead on arrival    Dead later    Survived    Unknown    Percentage
Table 1: Distribution of AIS grades over the sample of 2,128 patients in [2]

The authors find that the ISS explains 49% of the variance in mortality in the study sample.
Maximum AIS    Percentage died

The ISS is computed over six body regions:
• $R_1$: Head or neck
• $R_2$: Face
• $R_3$: Chest
• $R_4$: Abdominal or pelvic contents
• $R_5$: Extremities or pelvic girdle
• $R_6$: External

Formally, let us denote $AIS = \{R_1, \dots, R_6\}$ the AIS scores of an injured patient over these six body regions, which we will also refer to as the patient's AIS profile. The computation of the ISS aggregates these scores in two steps:
1. The three highest AIS scores are determined, that is $A = \max(AIS)$, $B = \max(AIS - \{A\})$, and $C = \max(AIS - \{A, B\})$.
2. The sum of squares of $A$, $B$, and $C$ is calculated, that is $ISS = A^2 + B^2 + C^2$.

The first step of the ISS aggregation procedure (use of the three maxima) is justified in [2] by the fact that considering the sum of squares of the AIS scores of the three most severe injuries considerably improved the correlation of the resulting score with mortality rates, whereas including the fourth highest AIS score had no appreciable effect. In this work, we will not analyze the first step of the aggregation procedure and focus on the second. However, in section 3.5, we show that statistical measures such as the correlation and standard deviation are not well suited for a variable such as the ISS, because they incorrectly assign it a cardinal value, which leads to inconsistent results. Thus, the criterion used to evaluate the appropriateness of the ISS may be flawed.

We should also mention an existing variant of the first step of the aggregation procedure, which questions not the use of three maxima for the AIS but the choice of body regions over which they are calculated. A widely used such variant has been introduced under the name New Injury Severity Score (NISS) [3]. Instead of considering the three most severely injured body regions, this variant considers the three most severe injuries overall, the reasoning being that the original ISS method can potentially disregard more severe injuries that happen to be in the same body region as the most severe injury. This medical modification is inconsequential to the analysis and claims made in this paper, which focus on the intrinsic mathematical properties of the method. Our results apply to both variants.

Thus the main focus of this study is the second step of the aggregation procedure. Indeed, in [2] the choice of aggregating the three maxima by summing their squares was rather lightly justified as "the simplest nonlinear function", without further explanation of the type of complexity being referred to.
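The two-step procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `iss` and the input convention (one AIS score per body region, in the order $R_1$ to $R_6$) are assumptions made for the example. Sorting is used instead of repeated set differences so that tied scores are kept as separate injuries.

```python
from typing import Sequence


def iss(region_scores: Sequence[int]) -> int:
    """Injury Severity Score from the AIS scores of the six body regions.

    `region_scores` is a hypothetical input convention: one score per
    region, in the order R1..R6, each in 0..5 (grade 6 is left out).
    """
    if len(region_scores) != 6:
        raise ValueError("expected one AIS score per body region (6 values)")
    if any(not 0 <= s <= 5 for s in region_scores):
        raise ValueError("AIS scores are expected in the range 0..5")
    # Step 1: keep the three highest scores (ties are preserved by sorting).
    a, b, c = sorted(region_scores, reverse=True)[:3]
    # Step 2: sum of their squares.
    return a * a + b * b + c * c


# Example: scores 4, 3 and 2 are the three maxima, so ISS = 16 + 9 + 4.
print(iss([4, 0, 3, 2, 0, 1]))  # 29
```

Under the NISS variant, the same second step would be applied to the three most severe injuries overall, regardless of region, which only changes how the three inputs are selected.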
This justification will be called into question in the present work, as the calculation of, say, the sum of cubes, or any other polynomial function of $A$, $B$, and $C$, is no more complex than that of the ISS. As for the use of linear functions (e.g. summing the three maxima), it is dismissed in similarly vague terms with the sentence "the quantitative relationship of the AIS scores is not known and is almost certainly nonlinear". The authors of the ISS further find that "the death rate for persons with two injuries of grades 4 and 3 was not comparable to that of persons with two injuries of grades 5 and 2 (sum = 7 in both cases)".

Table 3 describes the scales of the ISS ($A^2 + B^2 + C^2$), as well as the sum ($A + B + C$) and sum of cubes ($A^3 + B^3 + C^3$) functions. For $A, B, C \in \{0, 1, 2, 3, 4, 5\}$ such that $A \geq B \geq C$, and excluding the triplet $(0, 0, 0)$, there exist 55 possible $(A, B, C)$ triplets, resulting in 44 distinct possible values of the ISS ($A^2 + B^2 + C^2$), as well as 15 and 55 distinct values of $A + B + C$ and $A^3 + B^3 + C^3$, respectively.

We have computed all rank reversals between the ISS, the sum, and the sum of cubes, in other words, the number of pairs of injury profiles for which the rankings provided by two aggregation functions are reversed. Among the $\binom{55}{2} = 1485$ possible pairs of profiles, we counted those for which two of the functions $A + B + C$, $A^2 + B^2 + C^2$, and $A^3 + B^3 + C^3$ disagree regarding the comparison of the pair. For two patients $x$ and $y$, let $(A_x, B_x, C_x)$ and $(A_y, B_y, C_y)$ be their respective AIS profiles. We consider that there is discordance between the ISS and the sum of cubes aggregation function if ($A_x^2 + B_x^2 + C_x^2 > A_y^2 + B_y^2 + C_y^2$ and $A_x^3 + B_x^3 + C_x^3 < A_y^3 + B_y^3 + C_y^3$) or ($A_x^2 + B_x^2 + C_x^2 < A_y^2 + B_y^2 + C_y^2$ and $A_x^3 + B_x^3 + C_x^3 > A_y^3 + B_y^3 + C_y^3$). There exist 84 pairs of profiles for which there is such a discordance, which represents 5.6% of the 1485 possible pairs of profiles (i.e. for a uniform distribution of AIS scores, the ISS and sum of cubes aggregation functions would disagree 5.6% of the time). The ISS and the sum are in discordance for 8% of the pairs of profiles, whereas the sum of cubes and the sum are in discordance for 14.81% of the pairs of profiles. Although a minority, these cases of discordance are non-negligible, particularly for large volumes of patients.
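The enumeration behind these counts is small enough to reproduce directly. The sketch below is one possible way to do it, not the procedure used for the paper: the helper names `power_sum` and `discordant_pairs` are hypothetical, and discordance is taken to mean a strict reversal of the two rankings (ties on either function are not counted).

```python
from itertools import combinations, combinations_with_replacement

# All profiles (A, B, C) with 5 >= A >= B >= C >= 0, excluding (0, 0, 0).
profiles = [
    p for p in combinations_with_replacement(range(5, -1, -1), 3) if any(p)
]


def power_sum(profile, k):
    """Sum of the k-th powers of the three scores (k=2 gives the ISS)."""
    return sum(s ** k for s in profile)


def discordant_pairs(k1, k2):
    """Count pairs of profiles ranked in strictly opposite orders."""
    count = 0
    for x, y in combinations(profiles, 2):
        d1 = power_sum(x, k1) - power_sum(y, k1)
        d2 = power_sum(x, k2) - power_sum(y, k2)
        if d1 * d2 < 0:  # one function says x > y, the other x < y
            count += 1
    return count


n_pairs = len(profiles) * (len(profiles) - 1) // 2
print("profiles:", len(profiles), "pairs:", n_pairs)
for k in (1, 2, 3):
    distinct = len({power_sum(p, k) for p in profiles})
    print(f"power {k}: {distinct} distinct values")
for k1, k2 in ((2, 3), (1, 2), (1, 3)):
    d = discordant_pairs(k1, k2)
    print(f"powers {k1} vs {k2}: {d} discordant pairs ({100 * d / n_pairs:.2f}%)")
```

Counting ties as discordances, or weighting profiles by an empirical distribution of AIS scores instead of uniformly, would be a different design choice and would change the percentages.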
Rank    A+B+C    A^2+B^2+C^2    A^3+B^3+C^3
29      -        34             118
30      -        35             125
31      -        36             126
32      -        38             127
33      -        41             128
34      -        42             129
35      -        43             133
36      -        45             134
37      -        48             136
38      -        50             141
39      -        51             152
40      -        54             153
41      -        57             155
42      -        59             160
43      -        66             179
44      -        75             189
45      -        -              190
46      -        -              192
47      -        -              197
48      -        -              216
49      -        -              250
50      -        -              251
51      -        -              253
52      -        -              258
53      -        -              277
54      -        -              314
55      -        -              375
Table 3: Possible grades and their rank, for the sum, sum of squares (ISS), and sum of cubes

The seminal work [2] relied on the data in Table 4, which records the mortality rates for the AIS scores of the three most severe injuries, which we denote A, B, and C by decreasing order of severity.

Number of persons                 102           78            38
Most severe injury (A)            4             5             5
Second most severe injury (B)     3             3             4
Third most severe injury (C)      0-2    3      0-2    3      0-2    3
Percentage died                   18%    43%    59%    86%    62%    92%
Table 4: Mortality by AIS scores of the three most severe injuries in [2]

The use of the ISS was supported in [2] by the data reproduced in Table 5, in which we have additionally included the sum of the three highest AIS scores, the sum of their squares (the ISS), and the sum of their cubes, and calculated the (Pearson product-moment) correlation and Mutual Information of each aggregate with mortality rates. Figure 1, Figure 2, and Figure 3 respectively plot mortality rates according to the sum, sum of squares (ISS), and sum of cubes of the three highest AIS scores for the sample of 2,128 patients in [2].
A    B    C    A+B+C    A^2+B^2+C^2 (ISS)    A^3+B^3+C^3    Mortality rate
Measurement theory [5] assumes that there exists some empirical structure that one wishes to represent numerically (e.g. the body's response to multiple injuries) and defines strict qualitative properties that the empirical structure must verify in order to be represented numerically. Such a numerical artifact is said to possess an interval level of measurement if, throughout its scale, equal differences in the measure reflect equal differences in the empirical structure being measured. Nothing indicates that the AIS, and even less so the ISS, possesses such a property. The AIS and ISS can be more modestly considered to possess an ordinal level of measurement, that is to say, as indicators allowing the ranking of patients, e.g. for triage purposes. An ordinal measure is defined, in opposition to a cardinal one, as "a variable whose attributes can only be ranked" [4, 10]. For instance, we know that an underlying injury having an AIS score of 3 is less severe than a 4, which in turn is less severe than a 5, but it remains unknown whether the distance between a 3 and a 4 is equal to, greater than, or smaller than the distance between a 4 and a 5. It is the practice of assigning numerical values to the severity of these three injuries that sets the two numerical distances between them to be equal. The interpretation of the distances between ISS scores is similarly impossible. Indeed, the consecutive values in the domain of the ISS, represented in Table 3, only reflect an increase in the severity of the overall injury (ordinal information), but the extent of that increase cannot be given any interpretation (it contains no cardinal information). For instance, 50, 51, and 54 are three consecutive values in the domain of the ISS, without any possible value between 51 and 54. A patient whose condition goes from an ISS of 50 to 51 and then from 51 to 54 would have seen the severity of their injury increase by two (ordinal) units, not four (cardinal) units.

Giving a cardinal meaning to the ISS could have been justified if the difference between two consecutive values of this scale kept increasing, reflecting a higher level of degradation as the severity of an injury increases, but this is not the case. In Table 3, we can observe for instance that the gap between the thirty-second and thirty-third grades of the ISS (scores of 38 and 41, respectively) is wider than the gap between the thirty-fourth and thirty-fifth grades (scores of 42 and 43, respectively).
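The uneven spacing of the ISS scale can be checked by listing its attainable values. The following sketch enumerates them and looks up the ranks and gaps quoted above; the variable names (`iss_values`, `rank`, `gap_to_next`) are illustrative choices, and the zero score of an uninjured profile is excluded, as in Table 3.

```python
from itertools import combinations_with_replacement

# Attainable ISS values for AIS scores in 0..5, excluding the all-zero profile.
iss_values = sorted(
    {a * a + b * b + c * c
     for a, b, c in combinations_with_replacement(range(6), 3)} - {0}
)

# Rank of each attainable value (1 = least severe) and gap to the next one.
rank = {value: i + 1 for i, value in enumerate(iss_values)}
gap_to_next = {v: nxt - v for v, nxt in zip(iss_values, iss_values[1:])}

print(len(iss_values))                           # 44
print(rank[38], rank[41], gap_to_next[38])       # 32 33 3
print(rank[42], rank[43], gap_to_next[42])       # 34 35 1
print([v for v in range(50, 55) if v in rank])   # [50, 51, 54]
```

The gaps neither grow nor shrink monotonically along the scale, which is the observation the ordinal reading of the ISS rests on.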
The value of the ISS is only ordinal, that is, the information it provides serves to rank the overall severity of injuries to multiple body regions of patients, and not to measure any intrinsic property of these injuries. Further, [7] warns against considering the ISS/NISS as continuous statistical variables in correlation analyses with outcome measures (e.g. mortality), which has been the approach initially used to justify the quadratic aggregation of AIS grades in the original version of the ISS. If we accept the ISS as a purely ordinal indicator, a much simpler argument can be made to show that the very concept of measuring Pearson's correlation of the ISS with any other variable does not apply. Pearson's product-moment correlation is defined as the covariance of two variables divided by the product of their standard deviations [?]. Focusing on the ISS, we can observe that the concept of standard deviation does not apply to this variable.

Consider the toy example in Table 6, in which we measure the variance (and hence the standard deviation) of the ISS in three samples of two patients each. The two patients in each sample are of two consecutive ranks with regard to the ISS (28th and 29th, 32nd and 33rd, as well as 34th and 35th ranks, respectively). Note that the ISS profile of a patient in consecutive samples only differs by one unit of AIS (e.g. the three samples could correspond to a similar degradation of the injuries of Patient 1 and Patient 2 over three periods of time).

We observe a significantly higher standard deviation, and thus variance, in sample B than in sample A, which is not due to a wider dispersion of the severity of injuries in sample B, but is solely due to the cardinal properties of the ISS. There happen to be no possible ISS values between 38 and 41.
The range of ISS goes back to one unit in sample C, and we find the same variance as in sample A.

Sample    Patient 1      Patient 2      Patient 1 ISS    Patient 2 ISS    Variance of ISS
          ISS profile    ISS profile    (and rank)       (and rank)       in sample
A         (5,2,2)        (5,3,0)        33 (28th)        34 (29th)        0.5
B         (5,3,2)        (5,4,0)        38 (32nd)        41 (33rd)        4.5
C         (5,3,3)        (5,4,1)        43 (35th)        42 (34th)        0.5
Table 6: The variance of the ISS arbitrarily increases because of the uneven gaps between consecutive grades of the ISS scale

Thus, the very concept of a unit of deviation of the ISS is meaningless and no interpretation can be made of the standard deviation of this variable, and hence of its covariance or Pearson correlation with any other variable. These concepts being based on that of a deviation of the observed ISS values relative to the mean, it is impossible to separate the amount of deviation that is due to the observations from the amount due to the makeup of the ISS scale, with its uneven distances between grades. The calculations of the standard deviation and variance of the ISS, as well as of its Pearson correlation with mortality, and the analysis of said correlation, do not account for the average and standard deviation of the distance between two consecutive Injury Severity Scores (they are not one and zero, respectively). They implicitly consider this score to be cardinal (i.e. a measure of the amount of something). However, for measures of mortality, the average and standard deviation of the distance between two consecutive possible values are respectively one unit (depending on the decimal precision considered for mortality rates) and zero.

The ordinal nature of the ISS calls for the use of rank correlation. Spearman's rank-order correlation coefficient [8, 10] could be a more appropriate measurement of the association between ISS and mortality rates. Indeed, this statistic evaluates the monotonic association between two variables without utilizing cardinal information. However, it cannot be precisely evaluated in the presence of ties, which are common, as seen in Figures 1, 2, and 3. Moreover, this indicator would be sensitive to the intrinsic variance of the ISS for consecutive values of the AIS, illustrated with the example in Table 6. A more robust measurement of the association between mortality and ISS would be offered by Mutual Information [11]. This more general indicator, which is less sensitive to the cardinal properties of random variables and is not limited to linear relationships, compares probability distributions as a whole and measures how different the joint probability distribution of two random variables is from the product of their marginal distributions. Thus the Mutual Information $MI(X, Y)$ between two random variables $X$ and $Y$, given by $MI(X, Y) = H(X) - H(X|Y)$, is the amount of information (in bits) about one random variable that is gained by knowing the value of the other random variable, where $H(X)$ is the marginal entropy of $X$ and $H(X|Y)$ the conditional entropy of $X$ given $Y$. In its normalized form, mutual information quantifies this amount of information relative to the intrinsic entropy of each random variable. The normalized mutual information $NMI(X, Y)$ between $X$ and $Y$ is thus given by $NMI(X, Y) = \frac{2 \cdot MI(X, Y)}{H(X) + H(Y)}$. Using this indicator with the data in Table 5, for the three considered aggregation procedures and with p-values of order of magnitude 10 −, we find normalized amounts of Mutual Information of 0.46, 0.55, and 0.71 between mortality rates in Table 5 and the sum, sum of squares, and sum of cubes of AIS scores, respectively. For this data-set, there is thus a significantly higher amount of information concerning mortality rates contained in the sum of cubes than in the sum of squares, which confirms and quantifies the visual insight gained from Figure 2 and Figure 3 and suggests Mutual Information as a more appropriate measurement of the association between aggregate scores based on the AIS and mortality rates.
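Two points of the preceding discussion lend themselves to a short numerical sketch: the sample variances of the three pairs of ISS values in Table 6, and the definition of normalized Mutual Information. The code below is illustrative only. It assumes the common normalization $2 \cdot MI(X,Y) / (H(X) + H(Y))$, uses a plain plug-in (frequency-based) entropy estimate, and runs on a small hypothetical data set, not on the data of Table 5. The function names are likewise assumptions.

```python
from collections import Counter
from math import log2
from statistics import variance

# Sample variances (n - 1 denominator) of the ISS pairs of Table 6.
samples = {"A": [33, 34], "B": [38, 41], "C": [42, 43]}
print({name: variance(values) for name, values in samples.items()})
# {'A': 0.5, 'B': 4.5, 'C': 0.5}


def entropy(xs):
    """Plug-in Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def normalized_mutual_information(xs, ys):
    """NMI = 2 * MI / (H(X) + H(Y)), with MI = H(X) + H(Y) - H(X, Y)."""
    h_x, h_y = entropy(xs), entropy(ys)
    h_xy = entropy(list(zip(xs, ys)))
    mi = h_x + h_y - h_xy  # equal to H(X) - H(X|Y)
    return 2 * mi / (h_x + h_y) if h_x + h_y > 0 else 0.0


# Hypothetical toy data: an aggregate score and a binned mortality level.
scores = [1, 1, 2, 2, 3, 3, 4, 4]
mortality = ["low", "low", "low", "low", "high", "high", "high", "high"]
print(round(normalized_mutual_information(scores, mortality), 3))  # 0.667
```

Because the estimate only depends on which values coincide, relabelling the scores by any one-to-one transformation leaves the result unchanged, which is why this indicator is insensitive to the uneven gaps of the ISS scale.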
For an AIS profile of the form $(A, B, C)$, we introduce the notation $[x_A, x_B, x_C]$, such that $5 - A \geq x_A \geq -A$, $5 - B \geq x_B \geq -B$, and $5 - C \geq x_C \geq -C$, to indicate a change in the AIS profile of a patient (i.e. an overall degradation or improvement of their injuries), resulting in a new AIS profile $(A + x_A, B + x_B, C + x_C)$. We assume, without loss of generality, that these changes keep the three most severe injuries located in the same three body regions (out of the six AIS body regions previously grouped). For instance, $[-1, 0, +1]$ represents an improvement of the most severe injury of a patient by one AIS point (e.g. following care), and a degradation of their third most severe injury by one AIS point, without any change to their second most severe injury. These vectors can be conventionally added to ISS profiles to obtain the resulting ISS profiles, e.g. a patient whose ISS profile is $(4, B, 2)$ would see their ISS profile become $(4, B, 2) + [-1, 0, +1] = (3, B, 3)$. We use this notation to study some axiomatic properties [?] of the ISS.

A multicriteria aggregation procedure is said to be compensatory if it allows for trade-offs between criteria, i.e. the possibility of compensating a disadvantage on some criteria by an advantage on other criteria [13]. As such, the ISS being a simple aggregated score, it is a fully compensatory procedure, in that any disadvantage on any criterion (a lower AIS score) can be compensated by an advantage on any other criterion (a higher AIS score). For instance, degrading the second most severe injury by one AIS point while improving the third most severe injury by two AIS points would bring the same change to a weighted sum, no matter its initial value.

Should a patient accept a medical procedure that improves their third most severe injury by two AIS points, but degrades their second most severe injury by one AIS point (for instance during transportation or waiting for said procedure)? Let us consider the toy example in Table 7.

Patient                  Patient 1      Patient 2
Initial ISS Profile      (5, 4, 3)      (4, 4, 4)
Change                   [0, +1, -2]    [0, +1, -2]
Resulting ISS Profile    (5, 5, 1)      (5, 4, 2)
Table 7

An improvement in Patient 2's condition (decrease in ISS, from 48 to 45) is a degradation in Patient 1's condition (increase in ISS, from 50 to 51).

This property of the ISS function is arbitrary. It does not have anything to do with the fact that Patient 1 was initially in a slightly worse state than Patient 2. It is due to the fact that trade-offs between the AIS scores A, B, and C in the calculation of the ISS do not obey a fixed compensation rate. The very notion of improvement or degradation of the AIS score is thus meaningless. A weighted sum would not suffer from this drawback.

Table 8 shows a toy example in which Patient 1 and Patient 2 receive the same procedure twice (an improvement of their most severe injury by one AIS point followed by an improvement of their second most severe injury by one AIS point). Initially, the overall condition of Patient 2 (ISS of 33) is worse than that of Patient 1 (ISS of 32). However, after the first procedure, the order of severity of the conditions of the two patients alternates to Patient 1 (ISS of 25) being worse off than Patient 2 (ISS of 24), and then back to Patient 2 (ISS of 21) being in a worse condition than Patient 1 (ISS of 20) after the second procedure. Moreover, Table 9 shows a similar alternation of priority, but with the condition of the two patients progressively degrading over time. In a situation where the ISS is used as a triage rule, the order of priority between the two patients would arbitrarily alternate, although the degradation of their states would be identical.
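The effect of change vectors on the ISS can be traced with a few lines of Python. The sketch below rests on stated assumptions: the profiles (4, 4, 0) and (5, 2, 2) are chosen because they reproduce the ISS values quoted for Patient 1 and Patient 2 (32 and 33, then 25 and 24, then 20 and 21), profiles are re-sorted after each change so that "most severe" always refers to the current ordering, and the helper names `iss` and `apply_change` are hypothetical.

```python
def iss(profile):
    """Sum of squares of the three AIS scores of a profile (A, B, C)."""
    return sum(s * s for s in profile)


def apply_change(profile, change):
    """Add a change vector to a profile and re-sort by decreasing severity."""
    new = tuple(s + d for s, d in zip(profile, change))
    if any(not 0 <= s <= 5 for s in new):
        raise ValueError("resulting AIS scores must stay within 0..5")
    return tuple(sorted(new, reverse=True))


# Identical sequence of improvements applied to two patients.
p1, p2 = (4, 4, 0), (5, 2, 2)
history = [(iss(p1), iss(p2))]
for change in [(-1, 0, 0), (0, -1, 0)]:
    p1, p2 = apply_change(p1, change), apply_change(p2, change)
    history.append((iss(p1), iss(p2)))
print(history)  # [(32, 33), (25, 24), (20, 21)]

# The same trade-off [0, +1, -2] moves the ISS in opposite directions
# depending on the starting profile (illustrative profiles).
for start in [(5, 4, 3), (4, 4, 4)]:
    end = apply_change(start, (0, 1, -2))
    print(start, "->", end, "ISS change:", iss(end) - iss(start))
```

For a change $[0, +1, -2]$ applied to $(A, B, C)$, the ISS varies by $2B - 4C + 5$, which depends on the starting scores; with a weighted sum the variation would be the same constant for every profile.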
Patient                  Patient 1     Patient 2
Initial ISS Profile      (4, 4, 0)     (5, 2, 2)
Change                   [-1, 0, 0]    [-1, 0, 0]
Resulting ISS Profile    (4, 3, 0)     (4, 2, 2)
Change                   [0, -1, 0]    [0, -1, 0]
Resulting ISS Profile    (4, 2, 0)     (4, 2, 1)
Table 8

Patient                  Patient 1     Patient 2
Initial ISS Profile      (4, 4, 0)     (5, 2, 2)
Change                   [0, +1, 0]    [0, +1, 0]
Resulting ISS Profile    (5, 4, 0)     (5, 3, 2)
Change                   [0, 0, +1]    [0, 0, +1]
Resulting ISS Profile    (5, 4, 1)     (5, 3, 3)
Table 9

The independence property states that identical performance on one or more criteria should not influence the way two alternatives compare [12]. A transformation that keeps the value of a criterion equal across alternatives should not change the way these alternatives compare. In Table 10, we consider two pairs of ISS profiles, Patient 1 and Patient 2 versus Patient 3 and Patient 4. The only difference between these two pairs concerns the AIS score of the most severe injury (3 and 4 for Patient 1 and Patient 2 respectively, 4 and 5 for Patient 3 and Patient 4 respectively). An identical change, [0, +1, 0], is applied to the profiles (., 2, 0) and (., 0, 0) in both pairs. In the first pair, the condition of Patient 1 (ISS = 13), which was initially less severe than that of Patient 2 (ISS = 16), becomes more severe (18 > 17), whereas in the second pair the order of severity is unchanged (20 < 25 and 25 < 26).

Patient                  Patient 1     Patient 2     Patient 3     Patient 4
Initial ISS Profile      (3, 2, 0)     (4, 0, 0)     (4, 2, 0)     (5, 0, 0)
Initial ISS              13            16            20            25
Change                   [0, +1, 0]    [0, +1, 0]    [0, +1, 0]    [0, +1, 0]
Resulting ISS Profile    (3, 3, 0)     (4, 1, 0)     (4, 3, 0)     (5, 1, 0)
Resulting ISS            18            17            25            26
Table 10: An identical change to the second most severe injury ceteris paribus leads to different outcomes

The choice of an aggregation function is a highly sensitive one that impacts mortality rates. The ISS is optimal neither in terms of its correlation with mortality nor in regard to its axiomatic properties. By attempting to be two things at the same time (a cardinal measurement of the body's response to multiple injuries as well as an ordinal triage rule), the Injury Severity Score and similar, competing indices (New Injury Severity Score, Exponential Severity Score [14], etc.) achieve sub-optimal results in both regards. A complex, fundamental property such as the physiological response to injury is unlikely to be universally captured by a simple mathematical function (the ISS) of ordinal mathematical measures (the AIS). If it existed, such a function would be unlikely to be stumbled upon with arguments such as "the simplest nonlinear relationship is quadratic". However, research has mostly gone in this direction, and proposals compete on which function offers the highest correlation with mortality. We have shown such measurements (as well as those of the standard deviation/variance of the ISS) to be unfounded. We recommend systems thinking rather than setting an arbitrary formula in stone and conjecturing that it captures an essential property of the physiological response to injuries, when the sum of cubes, another arbitrary formula, shows a better association with mortality. The choice of an aggregation function to be used for AIS scores (ISS, sum of cubes, or any other function) should be made on a case-by-case basis, through simulations for the specific distribution of AIS scores in an emergency department, in a way that optimizes target criteria.
In our view, the unfounded, classical view in the literature of the ISS as a cardinal measurement of the severity of injuries and the ensuing correlation analyses with mortality have somehow hindered this actionable line of research.
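As a numerical companion to the independence example of Table 10, the sketch below applies the same change [0, +1, 0] to two pairs of profiles that differ only in the score of the most severe injury. The four profiles are those compatible with the leading AIS scores and the ISS values quoted in the text (13, 16, 20, and 25); the names `iss`, `shift`, and `pairs` are assumptions of this example.

```python
def iss(profile):
    """Sum of squares of the three AIS scores of a profile (A, B, C)."""
    return sum(s * s for s in profile)


def shift(profile, change):
    """Add a change vector to a profile, component by component."""
    return tuple(s + d for s, d in zip(profile, change))


change = (0, 1, 0)  # degrade the second most severe injury by one AIS point
pairs = {
    "Patients 1 and 2": ((3, 2, 0), (4, 0, 0)),
    "Patients 3 and 4": ((4, 2, 0), (5, 0, 0)),
}

for label, (x, y) in pairs.items():
    before = (iss(x), iss(y))
    after = (iss(shift(x, change)), iss(shift(y, change)))
    # A reversal occurs when the identical change flips the order of the pair.
    reversed_order = (before[0] < before[1]) != (after[0] < after[1])
    print(label, before, "->", after, "reversal:", reversed_order)
# Patients 1 and 2 (13, 16) -> (18, 17) reversal: True
# Patients 3 and 4 (20, 25) -> (25, 26) reversal: False
```

The same loop, run over a sample of profiles drawn from the distribution observed in a given emergency department, is one simple way to estimate how often a candidate aggregation function would reorder patients under identical changes.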