HHuman mortality at extreme age
Léo R. Belzile * , Anthony C. Davison † , Holger Rootzén and Dmitrii Zholud ‡ Abstract
We use a combination of extreme value statistics, survival analysis and computer-intensive methods toanalyze the mortality of Italian and French semi-supercentenarians. After accounting for the effects of thesampling frame, we conclude that constant force of mortality beyond 108 years describes the data well andthere is no evidence of differences between countries and cohorts. These findings are consistent with use ofa Gompertz model and with previous analysis of the International Database on Longevity and suggest thatany physical upper bound for the human lifespan is so large that it is unlikely to be approached. Power cal-culations suggest that it is unlikely that there could be an upper bound below 130 years. There is no evidenceof differences in survival between women and men after age 108 in the Italian data and the InternationalDatabase on Longevity, but survival is lower for men in the French data.
Solid empirical understanding of human mortality at extreme age is important as one basis for researchaimed at finding a cure for ageing (described, e.g., in Vijg & Campisi, 2008), and is also an element in thehotly-debated and societally-important question whether the current increase in expected lifespan in developedcountries, of about three months per year since at least 1840 (Oeppen & Vaupel, 2002), can continue. The limitto human lifespan, if any, also attracts considerable media attention (e.g. Guarino, 2018; Saplakoglu, 2018;Hoad, 2019).Einmahl et al. Einmahl et al. (2019) analyse data on mortality in the Netherlands and conclude that “thereindeed is a finite upper limit to the lifespan” for both men and women. Their dataset, provided by StatisticsNetherlands and consisting of about 285,000 “Dutch residents, born in the Netherlands, who died in the years1986–2015 at a minimum age of 92 years”, had not undergone any validation procedure. As might be expected,the vast majority died before their 100th birthdays: 99.5% lived 107 or fewer years, and 97% died at age 101 oryounger. The cohorts for analysis were taken to be the calendar years of death, and truncation of lifetimes wasnot taken into account. Hanayama & Sibuya Hanayama & Sibuya (2016) also find an upper lifespan limit of 123years for Japanese persons aged 100 or more, by fitting a generalized Pareto distribution to one-year and four-year birth cohorts, taking into account the sampling scheme. In both cases, any plateauing of mortality may beconfounded with the increase in hazard between ages 100 and 105, and this would invalidate the extrapolationto extreme age.The validity of conclusions about mortality at extreme age depends crucially on the quality of the dataon which they are based (Gavrilov & Gavrilova, 2019), as age misrepresentation for the very old is commoneven in countries with otherwise reliable statistical data (Poulain, 2010). Motivated by this, demographic re-searchers from 15 countries contribute to the International Database on Longevity (
IDL ), which currently con-tains 860 individually-validated life lengths of supercentenarians who had reached age 110 or more on 28September 2019; the data, which cover different time periods for different countries, can be obtained from * HEC Montréal, Department of Decision Sciences, 3000, chemin de la Côte-Sainte-Catherine, Montréal (QC), Canada H3T 2A7( [email protected] ) † EPFL-SB-MATH-STAT, Station 8, École polytechnique fédérale de Lausanne, 1015 Lausanne, Switzerland ‡ Department of Mathematical Sciences, Chalmers and Gothenburg University, Chalmers Tvärgata 3, 41296 Göteborg, Sweden a r X i v : . [ s t a t . A P ] O c t Belzile et al. . For some countries the
IDL has recently been extended to include dataon semi-supercentenarians, i.e., people who lived to an age of at least 105. As of October 2019, it containednew French data on 9612 semi-supercentenarians and 241 supercentenarians, which we call the
France 2019 data; all these supercentenarians but only some of the semi-supercentenarians were validated.A 2016 version of the
IDL ( IDL 2016 ) was analysed by Gampe (Gampe, 2010) and Rootzén & Zholud(Rootzén & Zholud, 2017), the latter with extensive discussion (Rootzén & Zholud, 2018). Both papers madeallowance for the sampling scheme, and in particular for the truncation of lifetimes that it entails. They con-cluded that there is no indication of an increase in mortality for ages above 110 years, and hence no indicationof a finite upper limit to the human lifespan. Rootzén & Zholud (Rootzén & Zholud, 2017) also found no dif-ferences in mortality between men and women or between populations from regions and countries as varied asJapan, the USA or Europe. These conclusions are striking, but the small size of the
IDL and the lack of balancebetween the subgroups limit the statistical power available to detect such differences.The Italian National Institute of Statistics (
ISTAT ) has recently produced an important new database (ISTAT,2018) containing individually validated birth dates and survival times in days of all persons in Italy who wereat least 105 years old at some point in the period from 1 January 2009 to 31 December 2015. Using advancedsurvival analysis tools, Barbi et al. (Barbi et al., 2018) found that death rates in this dataset “reach or closelyapproach a plateau after age 105” and found a small but statistically significant cohort effect.Data analysis must take into account the sampling scheme underlying such data. The
ISTAT lifetimes areleft-truncated because only individuals who attain an age of at least 105 years during the sampling period areincluded, and they are right-censored because the date of death of persons still alive in 2016 is unknown; seeFigure 1. The right-censored lifetimes are shown by the tick marks at the right side of the panel, and include theoldest person; ignoring either the truncation or the censoring could lead to incorrect conclusions. The
France2019 lifetimes are left- and right-truncated: only individuals who are observed to die during the sampling periodappear in the dataset. The statistical consequences are discussed in the Appendix A.2.In our analysis we take the sampling frame into account and pinpoint the age, if any, at which mortalityattains a plateau, and disentangle the effects of age and of birth cohort. We also compare mortality in the
ISTAT , France 2019 and
IDL 2016 databases, and between men and women.We use the generalized Pareto distribution from extreme value statistics in the main analysis, supplementedby fits of the Gompertz distribution, which is standard in demography. We first outline our main results andconclusions; the Appendix gives a more detailed description of our methods.
Results for ISTAT data
Lifetimes beyond 105 years are highly unusual and the application of extreme value models (de Haan & Ferreira,2006) is warranted. We use the generalized Pareto distribution, F ( x ) = (cid:40) − (1 + γx/σ ) − /γ + , x ≥ , γ (cid:54) = 0 , − e − x/σ , x ≥ , γ = 0 , (1)to model x , the excess lifetime above u years. In (1), a + = max( a, and σ > and γ ∈ R are scale and shapeparameters. For negative shape parameter γ the distribution has a finite upper endpoint at − σ/γ , whereas γ ≥ yields an infinite upper endpoint.The corresponding hazard function, often called the “force of mortality” in demography, is the densityevaluated at excess age x , conditional on survival to then, i.e., h ( x ) = f ( x )1 − F ( x ) = 1( σ + γx ) + , x ≥ , (2) preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age year of death a g e a t d e a t h ( y e a r s ) Figure 1: Lexis diagram of the
ISTAT data, showing age and calendar time at death for men (red crosses) andwomen (grey dot if only one woman, black dot if several). Censored observations are displayed in the rightmargin; the ticks indicate the ages of men (red) and women (black) still alive on 31 December 2015. Thesampling frame includes only persons aged at least 105 years and alive on 1 January 2009, or whose 105thbirthday occurred before 1 January 2016. The death counts for each calendar year are given at the top of thegraph for men (red) and women (black).where f ( x ) = d F ( x ) / d x is the generalized Pareto density function. If γ < , the hazard function tends toinfinity at the finite upper limit for exceedances. When γ = 0 , F is exponential and the hazard function isconstant, meaning that the likelihood that a living individual dies does not depend on age beyond the threshold.In this case, mortality can be said to have plateaued at age u .The choice of a threshold u such that Eq. [1] models exceedances appropriately is a basic problem in extremevalue statistics and is surveyed by Scarrott & MacDonald Scarrott & MacDonald (2012). If u is high enoughfor Eq. [1] to provide an adequate approximation to the distribution of exceedances, then the shape parameter γ is approximately unchanged if a higher threshold u (cid:48) is used, and the scale parameters for u and u (cid:48) have a knownrelationship, so a simple and commonly-used approach to the choice of threshold is to plot the parameters ofthe fitted distributions for a range of thresholds (Davison & Smith, 1990) and to use the lowest threshold abovewhich parameter estimates stabilise. This choice balances the extrapolation bias arising if the threshold is toolow with the increased variance incurred when taking u too high to retain an adequate number of observations.The upper left-hand panel of Figure 2 shows that for age thresholds close to 105 years the estimated shapeparameters for excess life lengths are negative, with 95% confidence intervals barely touching zero, but there isno systematic indication of non-zero shape above 107 years. The upper right-hand panel displays the estimatedscale parameter of the exponential model fitted to life lengths exceeding the threshold. The scale parametersdecrease for ages 105–107 but show no indication of change after age 107, where the scale parameter estimateis 1.45. Parameter stability plots suggest an exponential model and hence a constant hazard after age 107 or so,where a mortality plateau seems to be attained.The upper part of Table 1 shows results from fitting Eq. [1] and the exponential distribution to the ISTAT fora range of thresholds. The exponential model provides an adequate fit to the exceedances over a threshold at 108years, above which the hypothesis that γ = 0 , i.e., the exponential model is an adequate model simplification,is not rejected. Author’s Original Version preprint , version of October 5, 2020
Belzile et al.
105 107 109 111 - . . . . threshold (years) γ
105 107 109 111 . . . . threshold (years) σ e
105 107 109 111 - . . . . threshold (years) γ
105 107 109 111 . . . . threshold (years) σ e Figure 2: Parameter stability plots for the
ISTAT data (top) and for the
France 2019 data (bottom), showingthe shape γ of the generalized Pareto distribution (left) and the scale σ e of the exponential distribution (right)based on lifetimes that exceed the age threshold on the x -axis. The plots give maximum likelihood estimateswith 95% confidence intervals derived using a likelihood ratio statistic. The horizontal lines in the right-handpanels correspond to the estimated scale for excess lifetimes over 108 years for the ISTAT data. preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age ISTAT threshold
105 106 107 108 109 110 111 n u σ .
67 (0 .
04) 1 . .
06) 1 .
47 (0 .
08) 1 .
47 (0 .
11) 1 .
33 (0 .
15) 1 .
22 (0 .
23) 1 . . γ − .
05 (0 . − .
07 (0 . − .
02 (0 . − .
01 (0 .
06) 0 .
03 (0 .
09) 0 .
12 (0 .
17) 0 .
06 (0 . σ e .
61 (0 .
03) 1 . .
04) 1 .
45 (0 .
05) 1 .
45 (0 .
08) 1 .
36 (0 .
11) 1 .
35 (0 .
17) 1 .
58 (0 . p -value .
04 0 .
01 0 .
70 0 .
82 0 .
74 0 .
44 0 . p ∞ .
02 0 .
01 0 .
35 0 .
41 0 .
63 0 .
78 0 . France 2019 threshold
105 106 107 108 109 110 111 n u σ .
69 (0 .
02) 1 .
59 (0 .
03) 1 .
54 (0 .
04) 1 .
43 (0 .
06) 1 .
36 (0 .
08) 1 .
34 (0 .
13) 1 .
26 (0 . γ − .
06 (0 . − .
04 (0 . − .
04 (0 . − .
02 (0 .
03) 0 .
02 (0 .
05) 0 .
05 (0 .
07) 0 .
09 (0 . σ e .
62 (0 .
02) 1 .
53 (0 .
03) 1 .
49 (0 .
03) 1 .
41 (0 .
05) 1 .
38 (0 .
07) 1 .
39 (0 .
11) 1 .
36 (0 . p -value × − .
01 0 .
05 0 .
60 0 .
72 0 .
48 0 . p ∞ × − × − .
02 0 .
30 0 .
64 0 .
76 0 . Table 1: Estimates (standard errors) of scale and shape parameters ( σ , γ ) for the generalized Pareto distributionand of the scale parameter ( σ e ) for the exponential model for the ISTAT and
France 2019 datasets as a functionof threshold, with number of threshold exceedances ( n u ), p -value for the likelihood ratio test of γ = 0 andprobability that γ ≥ based on the profile likelihood ratio test under the generalized Pareto model ( p ∞ ). ISTAT France 2019 IDL 2016 n σ e (95% CI) n σ e (95% CI) n σ e (95% CI) women
375 1 .
45 (1 . , .
62) 1116 1 .
46 (1 . , .
56) 507 1 .
39 (1 . , . men
40 1 .
41 (0 . , .
98) 94 0 .
90 (0 . , .
11) 59 1 .
68 (1 . , . All
415 1 .
45 (1 . , .
61) 1210 1 .
41 (1 . , .
51) 566 1 .
42 (1 . , . Table 2: Estimates of the scale, σ e , of the exponential distribution, with 95% confidence intervals (CI). Thisdistribution is fitted to exceedances of 108 years in the ISTAT and
France 2019 data and of 110 years in the
IDL 2016 data analysed in Rootzén & Zholud (2017).The estimated scale parameter obtained by fitting an exponential distribution to the
ISTAT data for peopleolder than 108 is 1.45 (years) with 95% confidence interval (1 . , . . Hence the hazard is estimated to be0.69 (1/years) with 95% confidence interval (0 . , . ; above 108 years the estimated probability of survivingat least one more year at any given age is 0.5 with 95% confidence interval (0 . , . .We investigated birth cohort effects, but found none; see Appendix A.3. Results for
France 2019 data
Estimation for the
France 2019 data was done as described in Rootzén & Zholud (Rootzén & Zholud, 2017),taking into account the left- and right truncation of the lifetimes. Both the parameter stability plots in thelower panels of Figure 2 and the results given in the lower part of Table 1 indicate that the exponential modelis adequate above 108 years. For persons older than 108 the exponential scale parameter is estimated to be1.41 (years) with 95% confidence interval (1 . , . , the hazard is estimated to be 0.71 (years − ) with 95%confidence interval (0 . , . and the estimated probability of surviving at least one more year is 0.49 with95% confidence interval (0 . , . .Table 2 shows that estimates of the scale parameter for the exponential distribution for women and men for Author’s Original Version preprintpreprint
France 2019 data was done as described in Rootzén & Zholud (Rootzén & Zholud, 2017),taking into account the left- and right truncation of the lifetimes. Both the parameter stability plots in thelower panels of Figure 2 and the results given in the lower part of Table 1 indicate that the exponential modelis adequate above 108 years. For persons older than 108 the exponential scale parameter is estimated to be1.41 (years) with 95% confidence interval (1 . , . , the hazard is estimated to be 0.71 (years − ) with 95%confidence interval (0 . , . and the estimated probability of surviving at least one more year is 0.49 with95% confidence interval (0 . , . .Table 2 shows that estimates of the scale parameter for the exponential distribution for women and men for Author’s Original Version preprintpreprint , version of October 5, 2020
Belzile et al. the
France 2019 data differ. If men are excluded, then the estimated scale parameter increases from 1.41 to1.46 years, and if the oldest woman, Jeanne Calment, is also excluded, the estimate for women drops to 1.44years. Similarly to the
ISTAT data, survival for ages 105–107 was lower in earlier cohorts.
PowerPower
Our analysis above suggests that there is no upper limit to human lifetimes and that constant hazard adequatelymodels excess liftime if one considers only those persons whose lifetimes exceed 108 years: there is no evidencethat the force of mortality above this age is other than constant. One might wonder whether increasing forceof mortality would be detectable, however, as the number of persons attaining such ages is relatively small. Toassess this we performed a simulation study described in the Appendix A.7, mimicking the sampling schemesof the
ISTAT , France 2019 and
IDL 2016 datasets as closely as possible and generating samples from thegeneralized Pareto distribution with − . ≤ γ ≤ . To remove overlap between the last two datasets, wedropped France from IDL 2016 .Any biological limit to their lifespan should be common for all humans, whereas differences in mortalityrates certainly arise due to social and medical environments and can be accommodated by letting hazards varyby factors such as country or sex. With the overlap dropped we can treat the datasets as independent andcompute the power for a combined likelihood ratio test of γ = 0 (infinite lifetime) against alternatives with γ < (finite lifetime). For concreteness of interpretation we express the results in terms of the implied upperlimit of lifetime ι = u − σ/γ . The left-hand panel of Figure 3 shows the power curves for the three datasetsindividually and pooled. The power of the likelihood ratio test for the alternatives ι ∈ { , , } years,for example, is . / . / . for the ISTAT data above 108, . / . / . for the France 2019 data above108, and . / . / . for the IDL 2016 data above 110. The power for ι = 125 / / years based on allthree datasets is . / . / . , so it appears to be rather unlikely that an upper limit to the human lifespan, ifthere is such a limit, is below 130 years or so.Similar calculations give the power for testing the null hypothesis γ = 0 against alternatives γ < . Forcingall datasets to have the same shape parameter would allow them to have different endpoints so we reject theoverall null hypothesis if we reject the exponential hypothesis any of the three datasets. The power of thisprocedure is shown with a dashed black line in Figure 3. The resulting combined power exceeds . for γ < − . and equals . for the alternative γ = − . , giving strong evidence against a sharp increase in thehazard function after 108 years. Gompertz model
The hazard function of the generalized Pareto distribution cannot model situations in which the hazard increasesto infinity, but the upper limit to lifetimes is infinite. This possibility is encompassed by the Gompertz distribu-tion (Gompertz, 1825), which has long been used for modelling lifetimes and often provides a good fit to dataat lower ages (e.g., Thatcher, 1999). When the Gompertz model is expressed in the form F ( x ) = 1 − exp (cid:110) − ( e βx/σ − /β (cid:111) , x > , σ, β > ,σ is a scale parameter with the dimensions of x , and the dimensionless parameter β controls the shape of thedistribution. Letting β → yields the exponential distribution with mean σ ; small values of β correspond tosmall departures from the exponential model. The fact that β cannot be negative affects statistical comparisonof the Gompertz and exponential models; see Appendix A.8.The Gompertz distribution has infinite upper limit to its support, so it cannot be used to assess whether thereis a finite upper limit to the human lifespan. Its hazard function, σ − exp( βx/σ ) , is finite but increasing for all preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age . . .
751 120 130 140 150 160 170 180 upper limit to human lifespan p o w e r . . . − . − . − . − . − .
05 0 γ p o w e r data combined France IDL2016 Istat Figure 3: Power functions based on the
IDL 2016 (excluding French records),
France 2019 and
ISTAT databases and combined datasets, with rugs showing the lifetimes above 115. Left: power for the alterna-tive of a finite endpoint ι against the null hypothesis of infinite lifetime based on the likelihood ratio statistic.The endpoint cannot be lower than the largest observations in each database. Right: power of the Wald statisticfor the null hypothesis γ = 0 against the one-sided alternative γ < as a function of γ ; the dashed line repre-sents the power obtained by rejecting the null of exponentiality when any of the three one-sided test rejects. Thecurves are obtained by conditioning on the birthdates and left-truncated values in the databases, then simulatinggeneralized Pareto data whose parameters are the partial maximum likelihood estimates ( (cid:98) σ γ , γ ) . The simulatedrecords are censored if they fall outside the sampling frame for the ISTAT data and are simulated from a doublytruncated generalized Pareto distribution for
IDL 2016 and
France 2019 . See Appendix A.7 for more details.
Author’s Original Version preprintpreprint
Author’s Original Version preprintpreprint , version of October 5, 2020
Belzile et al. x ( β > ) or constant ( β = 0 ). The limiting distribution for threshold exceedances of Gompertz variables isexponential, and this limit is attained rather rapidly, so a good fit of the Gompertz distribution for lower x wouldbe compatible with good fit of the exponential distribution for threshold exceedances at higher values of x .Computations summarised in Appendix A.8 show that the exponential model, and hence also the Gompertzmodel with very small β , give equally good fits to the Italian and the French datasets above age 107, and thatthe Gompertz and generalised Pareto models fit equally well above age 105. Conclusions
None of the analyses of the
ISTAT , IDL 2016 or France 2019 data, for women and men separately or combined,indicates any deviation from exponentially distributed residual lifetimes, or equivalently from constant force ofmortality, beyond 108 years.Table 2 shows no differences between survival after age 108 in the
ISTAT data and survival after age 110 inthe
IDL 2016 data for women, for men, or for women and men combined, so we merged these estimates by tak-ing a weighted average with weights inversely proportional to the estimated variances. The resulting estimatesalso show no significant differences in survival between men and women, and we conclude that survival timesin years after age 108 in the
ISTAT data and after age 110 in the
IDL 2016 data are exponentially distributedwith estimated scale parameter . and 95% confidence interval (1 . , . . The corresponding estimatedprobability of surviving one more year is . with 95% confidence interval (0 . , . .There was no indication of differences in survival for women or the whole of the France 2019 data and inthe combined
ISTAT and
IDL 2016 data, but survival for men was lower in the
France 2019 data. A weightedaverage of the estimates for the
ISTAT data, the
France 2019 data and the
IDL 2016 data with France removedgives an exponential scale parameter estimate of . years with 95% confidence interval (1 . , . , andestimated probability . . , . of surviving one more year.Deleting the men from the France 2019 data or dropping Jeanne Calment changes estimates and confidenceintervals by at most one unit in the second decimal.There is high power for detection of an upper limit to the human lifespan up to around 130 years, based onfits of the generalized Pareto model to the three datatbases. Moreover there is no evidence that the Gompertzmodel, with increasing hazard, fits better than the exponential model, constant hazard, above 108 years.
Discussion
The results of the analysis of the newly-available
ISTAT data agree strikingly well with those for the
IDL supercentenarian database and for the women in the
France 2019 data. Once the effects of the sampling frameare taken into account by allowing for truncation and censoring of the ages at death, a model with constanthazard after age 108 fits all three datasets well; it corresponds to a constant probability of 0.49 that a livingperson will survive for one further year, with 95% confidence interval (0.48, 0.51). The power calculationsmake it implausible that there is an upper limit to the human lifespan of 130 years or below.Although many fewer men than women reach high ages, no difference in survival between the sexes isdiscernible in the
ISTAT and the
IDL 2016 data. Survival of men after age 108 is lower in the
France 2019 data, but it seems unlikely that this reflects a real difference between France and Italy and between France andthe other countries in the
IDL . It seems more plausible that this effect is due to some form of age bias or is afalse positive caused by multiple testing.If the
ISTAT and
France 2019 data are split by birth cohort, then we find roughly constant mortality fromage 105 for those born before the end of 1905, whereas those born in 1906 and later have lower mortality forages 105–107; this explains the cohort effects detected by Barbi et al. (2018). Possibly the mortality plateau isreached later for later cohorts. The plausibility of this hypothesis could be weighed if further high-quality databecome available. preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age A. Supplementary material
This section summarises the methods used to produce figures and perform our inferences, and adds goodness offit diagnostics and hazard estimates.
A.1. Reproducibility
The
ISTAT data can be obtained from the Italian National Institute of Statistics by registering at the ContactCenter (https://contact.istat.it) and mentioning the Semisupercentenarian Survey and Marco Marsili as contactperson.LATool is a
MATLAB toolbox for life length analysis that makes alternative analyses possible. It con-sists of three files that are available from https://doi.org/10.1007/s10687-017-0305-5 , anda data file, data.mat , which can be downloaded, together with the full IDL database, by registering at . R R Core Team (2020) was used for many of the analyses and to generate all the figures as well as Tables 1and 3. Standalone code to reproduce the analyses is provided online.
A.2. Truncation and censoring
Figure 1 shows that many persons in the
ISTAT dataset were alive on 31 December 2016, when samplingfinished, so their lifetimes must be treated as right-censored. Lifetimes above 105 years on 1 January 2009 areleft-truncated and individuals not aged at least 105 from 1 January 2009 to 31 December 2016 are not included.Both censoring and truncation must be handled correctly to avoid biased inferences; the effect of the truncationis that inferences must be conditioned on the event that an individual appears in the database. Below we outlinehow this is achieved; see Rootzén & Zholud (2017) and Davison (2018) for more details.Consider a sampling interval C = ( b, e ) of calendar time during which individuals aged over a threshold of u years were observed. Let x = age − u denote the excess age of an individual who dies aged older than u , havingreached age u at calendar time t . Assume that the excess ages x are independent with cumulative distributionfunction F ( x ; θ ) , probability density function f ( x ; θ ) , and survival function S ( x, θ ) = 1 − F ( x, θ ) , where θ is a vector of parameters to be estimated.The likelihood contribution for someone who died in C is then f ( x ) S { ( b − t ) + } , x > , whereas that for someone who is known to be alive at the end of C , and thus whose lifetime is censored, is S ( e − t ) S { ( b − t ) + } , t < e. The likelihood function L ( θ ) is the product of the likelihood contributions from all individuals included in thedata under study. Estimates for the generalized Pareto and Gompertz distributions were found by numericalmaximization of the log likelihood, with standard errors obtained from the inverse observed information matrix.Explicit expressions exist for the maximum likelihood estimator of the exponential distribution scale parameterand its standard error, (cid:98) σ e = (cid:80) i { x i − ( b − t i ) + } deaths , se( (cid:98) σ e ) = (cid:98) σ e √ deaths . (3)The IDL and
France 2019 data were left- and right-truncated, and this was taken into account in our analysis;the likelihood contribution for these individuals is f ( x i ) F ( e − t i ) − F { ( b − t i ) + } , b − t i ≤ x i ≤ e − t i . (4) Author’s Original Version preprint , version of October 5, 2020 et al.
105 107 109 111 . . . . age (years) l o c a l h a z a r d
105 107 109 . . . . age (years) l o c a l h a z a r d Figure 4: Local hazard estimates with 95% pointwise confidence intervals for
ISTAT birth cohorts 1886-1905(left) and 1906–1910 (right), with horizontal lines at 0.7. The rightmost point includes all survivors beyond 111years (respectively 109 years).Inappropriate analysis can lead to misleading results: for example, fitting an exponential distribution tothe
ISTAT individuals who survive beyond age 107 without accounting for truncation or censoring gives theestimate (cid:98) σ e = 1 . with 95% confidence interval (1 . , . , to be compared with (cid:98) σ e = 1 . with 95%confidence interval (1 . , . once the truncation and censoring are accounted for; these confidence intervalsdo not overlap. A.3.
ISTAT cohort effects
The local hazard estimates in Figure 4 are obtained by splitting the likelihood contribution of individuals intoyearly blocks, using disjoint intervals with ( b, e ) = ( a, a + 1) years to avoid using the same individuals severaltimes. For the highest interval we set e = ∞ and include all individuals who survived into that interval.The parameter stability plots in Figure 5 and the estimated hazard plots in Figure 4 show roughly constantmortality for the birth cohort 1886–1905 for the entire age range. Mortality is lower for ages 105 and 106 forthe birth cohort 1906–1910, but equals that for the earlier birth cohort at ages 107 and above. This reductionin mortality for the later birth cohort implies that plateauing for the entire ISTAT dataset does not start untilapproximately the age of 108.
A.4. Graphical diagnostics
A quantile-quantile (or QQ-) plot is a standard diagnostic of the fit of a specified distribution to data, but itmust be modified to accommodate censoring or truncation. We plot the ordered ages at death, x i , of non-censored ISTAT individuals against plotting positions from (cid:98) F − { (cid:101) G ( x ) } (Waller & Turnbull, 1992), where inour case (cid:98) F − is the quantile function of the exponential distribution fitted to ages exceeding 108 years and (cid:101) G is the Kaplan–Meier estimate of the distribution function, corrected for censoring and truncation (Tsai et al.,1987). To assess the variability of the plot we simulate new ages at death from the fitted exponential distribution,conditioning on the birth dates, truncation time and censoring indicator; this amounts to simulating new lifetimesfrom a doubly truncated exponential distribution, since individuals whose death is observed during the samplingframe cannot exceed the age they would reach on 31 December 2015. The left-hand panel of Figure 6 shows 100such curves, corresponding to x i ( b ) on the y -axis and (cid:98) F b − { (cid:101) G b ( x i ( b ) ) } ; under the parametric bootstrap, both (cid:98) F and (cid:101) G are re-estimated using the simulated samples. The right-hand panel of Figure 6 shows approximate95% pointwise and simultaneous confidence intervals obtained using a simulation envelope for (cid:98) F − b { (cid:101) G b ( x ) } preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age
105 107 109 111 - . . . . threshold (years) γ
105 107 109 111 . . . . threshold (years) σ e - . . . . threshold (years) γ . . . . threshold (years) σ e Figure 5: Parameter stability plots for the
ISTAT birth cohorts 1896–1905 (top) and 1906–1910 (bottom), forthe parameters γ of the generalized Pareto distribution (left) and for the scale σ e of the exponential distribution(right) obtained from exceedances of the age threshold on the x -axis. The plots give maximum likelihoodestimates with (profile) likelihood 95% confidence intervals. The horizontal line on the right panels correspondsto the estimated scale for exceedances above 108 for the full ISTAT dataset.(Davison & Hinkley, 1997, §4.2.4). Despite the discrepancy for individuals dying shortly after age 108, anexponential distribution adequately captures the observed mortality.
A.5. Differences between men and women
The imbalance in the number of men and women limits our ability to detect any effects of gender on mortalityat great age. To illustrate this, we conducted a simulation study based on the
ISTAT data to assess the powerof a likelihood ratio test for conditional lifetime exceedances above 108 years, for which we have 40 men and375 women; the lifetimes of 94 individuals, 15 of them men, are censored. We condition on the sampling frameand the sex of individuals and simulate new life trajectories for both men and women based on an exponentialdistribution with relative scale differing by a ratio λ > , corresponding to lower hazard for women. Thusindividuals who were older than 108 in 2009 survive at least beyond that age, but a simulated lifetime thatextends beyond 2016 is censored. For each of 10 000 simulated samples, we computed the likelihood ratiostatistic for comparison of fits to men and women separately and a combined fit; the statistic has an asymptotic χ distribution. We also considered conditioning only on the birth dates and the beginning of the samplingperiod to assess whether the right-censoring leads to loss of power, but the difference is negligible. Power of80% is achieved if λ ≈ . , corresponding to an average survival, σ e , of 1.85 years for women and 1.14 yearsfor men, with the corresponding differences for powers 20% and 50% being 0.28 years and 0.50 years. The Author’s Original Version preprint , version of October 5, 2020 et al.
108 112 116 theoretical quantiles o b s e r v e d q u a n t i l e s
108 112 116 theoretical quantiles o b s e r v e d q u a n t i l e s Figure 6: Exponential QQ-plot of the
ISTAT data for individuals who lived beyond 108 years and died before2016, with pointwise (dashed) and simultaneous (dotted) 95% confidence intervals obtained through simulation(right). The left panel shows trajectories for simulated lifetimes drawn from the exponential model for observednon-censored observations. The circles give the ordered ages at death of non-censored
ISTAT lifetimes againstthe plotting positions.power of the corresponding test for the
IDL 2016 data is expected to be similar.For the
France 2019 data a likelihood ratio test strongly rejects the hypothesis of equal hazard for womenand men. The ratio of the estimated hazards for men and women in this dataset is approximately 1.61. Thecombined power of the tests in the
ISTAT and
IDL 2016 data for detecting this ratio is approximately . . A.6. Hazard estimates
To construct a local hazard estimate based on the limiting generalized Pareto distribution, we note that thisdistribution has reciprocal hazard function r ( x ) = ( σ + γx ) + , where a + = max( a, for real a . A moreflexible functional form is r ( x ) = { σ + γx + g ( x ) } + , where g ( x ) is a smooth function of x that tends to zeroas x increases. For exploratory purposes we take g ( x ) = (cid:80) Kk =1 β k b k ( x ) , where b k ( x ) = ( κ k − x ) , κ < · · · < κ K ; here the κ k are fixed knots, and b k ( x ) = 0 for x ≥ κ k . The resulting cubic spline function g ( x ) has twocontinuous derivatives. This model has K + 2 parameters and corresponds to the generalized Pareto distributionfor x > κ K , but allows r ( x ) to depart from linearity for x ≤ κ K .Lifetime data are recorded to the nearest day and often there are ties for small x , so we use a discreteversion of the model, with x ∈ { , . . . , x max } days, where x max corresponds to 16 years after age 105. We let h ( x ) = 1 /r ( x ) denote the hazard function, set H ( x ) = (cid:80) xz =1 h ( z ) / for compatibility with the continuouscase, and obtain survivor and probability mass functions Pr(
X > x ) = exp {− H ( x ) } and Pr( X = x ) = h ( x ) exp {− H ( x ) } for x ∈ { , . . . , x max } .Let X denote a survival time (days) beyond 105 years. For each individual the available data are of the form ( s, d, x ) , where s = 0 if observation of X began at 105 years and, if not, s > is the age in days above 105at which observation of X began, d = 1 indicates death, X = x , and d = 0 indicates right-censoring, X > x . preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age
105 110 115 120 . . . . age (years) h a z a r d f u n c t i o n ( / y e a r )
105 110 115 120 . . . . age (years) h a z a r d f u n c t i o n ( / y e a r ) Figure 7: Local hazard estimation for the
ISTAT data using a spline approach with five random knots centered at , . . . , years. Left: original estimate (heavy solid line) with pointwise (dotted) and overall (dashed) 95%envelopes obtained from bootstrap replicates, of which are displayed (grey solid lines). The meanpositions of the knots are shown by the dashed vertical lines. The black rug shows (jittered) times of deaths,and the red rug shows (jittered) censored survival times; the rugs are suppressed for the lower ages, as theretoo many points to be informative. The right-hand panel shows the same output for the best-fitting exponentialmodel, for which γ = 0 .The likelihood is then a product of terms of the form Pr( X = x ) d × Pr(
X > x ) − d Pr(
X > s ) , x > s, and depends on the parameters σ, γ, β , . . . , β K , estimates of which are readily obtained by numerical maxi-mization of the log likelihood function. The resulting fit depends on the knots κ , . . . , κ K , but to reduce thisdependence we generate knots at random, roughly evenly spaced in an interval (0 , x max ) , where x max is cho-sen large enough that r ( x ) should be linear for x > x max , i.e., the generalized Pareto model is fitted when x > x max .The left-hand panel of Figure 7 shows a local estimate of the hazard function for the ISTAT data, constrainedto have the form of (2) above 110 years. The hazard dips up to 108 years or so, then rises and declines slowly.To assess the significance of this decline we performed a bootstrap analysis Efron & Tibshirani (1993); Davison& Hinkley (1997), generating 5000 replicate datasets, the hazard function estimates for 100 of which are shownin the panel. These suggest that the slow decrease after age 110 is not significant, and this is confirmed by thepointwise and overall 95% confidence bands. The initial dip seems to be a genuine feature, but above 108 yearsthe confidence bands include a wide range of possible functions, including a constant hazard.
A.7. Power
To assess the statistical power of our procedures, we compute the maximum profile likelihood estimates (cid:98) σ γ for the original data with γ fixed at the values − . , . . . , and simulate new excess lifetimes for each γ ,conditioning on the sampling frame. In the ISTAT data, for example, we calculate the ages of all individualson January 1st, 2009, retain the dates at which they reach 108 years and sample new excess lifetimes from thegeneralized Pareto distribution with parameters ( γ, (cid:98) σ γ ) , right-censoring any simulated lifetimes beyond the end Author’s Original Version preprint , version of October 5, 2020 et al. of 2016. This ensures that the power calculations are as relevant as possible, not only in terms of the samplingscheme but also in terms of the underlying parameters. For each simulated dataset we compute the directedlikelihood ratio statistic r = sign( (cid:98) γ − γ ) { (cid:96) ( (cid:98) γ, (cid:98) σ ) − (cid:96) (0 , (cid:101) σ e ) } / and the Wald statistic w = (cid:98) γ/ se ( (cid:98) γ ) fortesting the null hypothesis γ = 0 against the alternative γ < , which corresponds to testing for exponentialityagainst possible upper bounds to the human lifespan. The asymptotic null distribution of the statistics r and w is standard normal and we can assess the quality of this approximation by calculating the Wald statistics when γ = 0 . These simulations show that the estimator of the shape parameter is unbiased and normally distributed,but the distribution of the Wald statistics is left-skewed, leading to inflated Type I error. For the power study,we thus use the simulated null distribution for comparison; tests performed in the paper should be consideredliberal, meaning their level would be higher than that claimed.We proceed similarly for the endpoint ι . In order to simulate from generalized Pareto distributions with ι fixed at a given value, we reparametrise the generalized Pareto log likelihood function in terms of ι and σ andestimate the scale parameters (cid:98) σ ι for each of the original three datasets with ι fixed, then use the relation ι = u − σ/γ to obtain the three implied shape parameters (cid:98) γ ι . We then simulate new datasets with the sampling schemedescribed above, but using the three sets of parameter pairs ( (cid:98) γ ι , (cid:98) σ ι ) . For the joint test of the null hypothesisof an exponential distribution, ι = ∞ , against the alternative of a common but finite ι , we reparametrize thelikelihood in terms of ι and allow the three datasets to have different values of σ .The left-hand panel of Figure 3 shows how the empirical proportion of rejections for a test of nominal size based on the directed likelihood root statistic r = −{ (cid:96) gp ( (cid:98) ι, (cid:98) σ ) − (cid:96) exp ( (cid:101) σ e ) } / varies with ι = u − σ/γ for γ < , for the ISTAT , France 2019 and the rest of the
IDL 2016 data. Here ( (cid:98) ι, (cid:98) σ ) are the maximumlikelihood estimates for the generalized Pareto distribution parametrized in terms of a common finite ι and threescale parameters and (cid:101) σ e are the maximum likelihood estimates of the three exponential scale parameters underthe null hypothesis. A.8. Gompertz model
The reciprocal hazard function of the Gompertz distribution r ( x ) = σ exp( − βx/σ ) encodes the speed ofconvergence to the limiting extreme value distribution (Smith, 1987); even if β > , r (cid:48) ( x ) = β exp( − βx/σ ) → exponentially fast as x → ∞ . Any improvement in the fit of the Gompertz model for exceedances over somethreshold would be shown by evidence that β > .Table 3 summarises the fit of this model for various thresholds and the ISTAT data, without sex or cohorteffects. The hypothesis that the Gompertz distribution ( β > ) reduces to an exponential distribution ( β = 0 )is a boundary hypothesis, so the likelihood ratio statistic to test β = 0 does not have the usual approximate χ distribution; its large-sample distribution is a 50:50 mixture of a point mass at 0 and a χ distribution, sometimeswritten as χ + χ (Self & Liang, 1987). Barbi et al. Barbi et al. (2018) do not notice this, leading them tomis-state the significance of the difference of log-likelihoods in their Table 2 — the likelihood ratio statistic fortesting β = 0 against β > is w = 2 × .
292 = 0 . , and Pr( χ > . ≈ . , which is the significancelevel quoted in Barbi et al. (2018). In fact the correct (asymptotic) level would be Pr( χ > . ≈ . .Here the conclusion does not change, but it might in other contexts.In general, p -values obtained using a parametric bootstrap are likely to be more reliable than asymptoticapproximations of the form χ + χ and are preferable for such comparisons. In the present case thisentails simulating independent datasets like the original data from the boundary exponential distribution (thenull hypothesis, β = 0 ), and estimating the p -value by the empirical proportion of these simulated datasets inwhich the likelihood ratio statistics w ∗ are no smaller than the original value w , i.e., (cid:99) Pr ∗ ( w ∗ ≥ w ) , where theasterisk indicates a parametric bootstrap simulation and the subscript indicates that the simulation is under thenull hypothesis; see Chapter 4 of Davison & Hinkley (1997). This approach was used to obtain the p -values inTable 3, which show that the exponential model is statistically indistinguishable from the Gompertz model from preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age threshold
105 106 107 108 109 110
ISTAT n u σ .
67 (0 .
05) 1 .
71 (0 .
07) 1 .
48 (0 .
08) 1 .
47 (0 .
12) 1 .
36 (0 .
1) 1 .
35 (0 . β .
05 (0 .
02) 0 .
09 (0 .
04) 0 .
02 (0 .
05) 0 .
02 (0 .
07) 0 0 p -value .
02 0 .
01 0 .
37 0 .
45 1 1
France 2019 n u σ .
71 (0 .
03) 1 . .
03) 1 .
55 (0 .
05) 1 .
43 (0 .
06) 1 .
39 (0 . β .
08 (0 .
01) 0 .
05 (0 .
02) 0 .
06 (0 .
03) 0 .
02 (0 .
03) 0 0 p -value .
02 0 .
32 1 1
Table 3: Estimates (standard errors) of Gompertz parameters ( β , σ ) for the Italian ISTAT data (top) and the
France 2019 data (bottom) for different thresholds. The first row give the number of threshold exceedances( n u ), and the bootstrap p -values are for the likelihood ratio test of β = 0 against β > . Estimates of β reportedas zero are smaller than − . threshold
105 106 107 108 109 110
Gompertz .
01 1 .
65 2 .
21 0 .
22 0 .
68 1 . GP .
00 2 .
17 2 .
21 0 .
23 0 .
56 1 . exponential .
16 8 .
21 2 .
36 0 .
28 0 .
68 1 . Table 4: Deviances for comparison of extended generalized Pareto (GP) and the exponential, GP and Gompertzsub-models for different thresholds. The rows show the likelihood ratio statistic, i.e., twice the difference inlog-likelihood between the specified model and an encompassing extended generalized Pareto model.108 years onwards, though the Gompertz model fits better at 106 years and below for the
ISTAT data and at 107years and below for the
France 2019 data.Table 4 compares the fits of the Gompertz, generalized Pareto and exponential models to the
ISTAT data,with the baseline taken to be an extended generalized Pareto distribution that encompasses all three other mod-els; the details will be reported elsewhere. The generalized Pareto and Gompertz models fit equally well forall thresholds, since the differences between their respective deviances are minimal. Both are better than theexponential model below 107 years, but not above, in agreement with Table 3.
Author’s Original Version preprint , version of October 5, 2020 et al.
References
Barbi, E., Lagona, F., Marsili, M., Vaupel, J. W., & Wachter, K. W. (2018). The plateau of human mortality:Demography of longevity pioneers.
Science , 360(6396), 1459–1461.Davison, A. C. (2018). ‘The life of man, solitary, poore, nasty, brutish, and short’: Discussion of the paper byRootzén and Zholud.
Extremes , 21(3), 365–372.Davison, A. C. & Hinkley, D. V. (1997).
Bootstrap Methods and Their Application . Cambridge: CambridgeUniversity Press.Davison, A. C. & Smith, R. L. (1990). Models for exceedances over high thresholds (with Discussion).
Journalof the Royal Statistical Society: Series B (Statistical Methodology) , 52(3), 393–442.de Haan, L. & Ferreira, A. (2006).
Extreme Value Theory: An Introduction . Springer.Efron, B. & Tibshirani, R. J. (1993).
An Introduction to the Bootstrap . New York: Chapman & Hall.Einmahl, J. J., Einmahl, J. H. J., & de Haan, L. (2019). Limits to human life span through extreme value theory.
J. Amer. Statist. Assoc. , 114, 1075–1080.Gampe, J. (2010). Human mortality beyond age 110. In H. Maier & et al. (Eds.),
Supercentenarians chapter 13,(pp. 219–230). Heidelberg: Springer-Verlag.Gavrilov, L. A. & Gavrilova, N. S. (2019). Late-life mortality is underestimated because of data errors.
PLOSBiology , 17(2), 1–7.Gompertz, B. (1825). On the nature of the function expressive of the law of human mortality, and on a new modeof determining the value of life contingencies.
Philosophical Transactions of the Royal Society of London ,115, 513–585.Guarino, B. (2018). Good news for human life spans — at age 105, death rates suddenly stop going up.
Washington Post , (pp. June 28).Hanayama, N. & Sibuya, M. (2016). Estimating the upper limit of lifetime probability distribution, based ondata of Japanese centenarians.
J. Gerontol. A Biol. Sci. Med. Sci. , 71(8), 1014–1021.Hoad, P. (2019). ‘People are caught up in magical thinking’: was the oldest woman in the world a fraud?
TheGuardian , (pp. November 30).ISTAT (2018). Italian database on semisupercentenarians. See paper by Barbi et al. for instructions on how toobtain the data.Oeppen, J. & Vaupel, J. W. (2002). Broken limits to life expectancy.
Science , 296, 1029–1031.Poulain, M. (2010). On the age validation of supercentenarians. In H. Maier & et al. (Eds.),
Supercentenarians chapter 1, (pp. 4–30). Heidelberg: Springer-Verlag.R Core Team (2020).
R: A Language and Environment for Statistical Computing . R Foundation for StatisticalComputing, Vienna, Austria.Rootzén, H. & Zholud, D. (2017). Human life is unlimited — but short (with Discussion).
Extremes , 20(4),713–728.Rootzén, H. & Zholud, D. (2018). Rejoinder to discussion of the paper “Human life is unlimited — but short”.
Extremes , 21, 415–424.Saplakoglu, Y. (2018). Have humans reached their limit on life span? These researchers say no.
Live Science ,(pp. June 28).Scarrott, C. & MacDonald, A. (2012). A review of extreme-value threshold estimation and uncertainty quan-tification.
REVSTAT – Statistical Journal , 10, 33–60.Self, S. G. & Liang, K. Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratiotests under nonstandard conditions.
Journal of the American Statistical Association , 82, 605–610.Smith, R. L. (1987). Approximations in extreme value theory. Technical Report 205, Center for StochasticProcesses, University of North Carolina at Chapel Hill. preprint , version of October 5, 2020 Author’s Original Version uman mortality at extreme age
Journal of the Royal Statistical Society, Series A , 162, 5–43.Tsai, W.-Y., Jewell, N. P., & Wang, M.-C. (1987). A note on the product-limit estimator under right censoringand left truncation.
Biometrika , 74(4), 883–886.Vijg, J. & Campisi, J. (2008). Puzzles, promises, and a cure for ageing.
Nature , 544, 1065–1071.Waller, L. A. & Turnbull, B. W. (1992). Probability plotting with censored data.
American Statistician , 46(1),5–12.
Author’s Original Version preprintpreprint