Expectation of life at old age: revisiting Horiuchi-Coale and reconciling with Mitra

Data quality issues at advanced old age, such as incompleteness of registration of vital events and age misreporting, compromise estimates of the death rates and remaining life expectancy at those ages. Following up on Horiuchi and Coale (Population Studies 36: 317-326, 1982), Mitra (Population Studies 38: 313-319, 1984, Population Studies 39: 511–512, 1985), and Coale (Population Studies 39: 507–509, 1985), we examine the conventional approaches to constructing life tables from data deficient at advanced ages and the two adjustment methods by the mentioned authors. Contrary to earlier reports by Horiuchi, Coale, and Mitra, we show that the two methods are consistent and useful in drastically reducing the estimation errors in life expectancy as compared to the conventional approaches, i.e., the classical open age interval model and extrapolation of the death rates. Our results suggest complementing the classical estimates of life expectancy by adjustments using Horiuchi-Coale, Mitra, or other appropriate methods and avoiding the extrapolation method as a tool for estimating the life expectancy.


Introduction
The life table model, which describes the current mortality profile in terms of the hypothetical survival of individuals in a synthetic cohort, is an important tool for studying mortality and its many implications for areas such as insurance, social policy, and population projections (Chiang 1978; Preston et al. 2001). It builds upon age-specific death rates and produces various indicators of mortality, survival, and longevity for the age groups. A limitation of the model appears at the older ages, where data scarcity or deficiency forces statisticians to disregard age details and aggregate the available data into a single "open age interval" (Missov et al. 2016).
Typical data problems that prevent extending the life table to older ages are age exaggeration and other types of age misreporting, as well as poorly documented return migration following retirement and missing death records. A recent study by Randall and Coast (2016) suggests that the data quality at ages 60+ in low-income countries is still "very rough", with only a limited improvement over time in African countries. Even in developed countries with generally good mortality data, incomplete registration of migrants or misreported ages at death may bias the mortality estimates for the elderly (Khlat and Courbage 1996; Kibele et al. 2008; Preston et al. 1996).
In regional demography and demography of social groups, small population size may be another source of data limitation that demands lowering the age at the start of the open age interval (Scherbov & Ediev 2011).
Choosing a broader open age interval that begins at younger ages may help mitigate some of the data quality problems, including the problems of age exaggeration and small population size. Lowering the age at the beginning of the open age interval, however, may itself have severe consequences for the quality of life table estimates because of departures from stationarity, which is conventionally assumed for the age composition in the open age interval (Preston et al. 2001). In a stationary population, life expectancy at the beginning of the open age interval is inverse to the death rate:

$$e_a = \frac{1}{M_{a+}}. \tag{1}$$

Hereinafter, $a$ denotes the starting age of the open age interval, $e_a$ is the life expectancy at age $a$, and $M_{a+}$ is the death rate in the open age interval. In this paper, we call (1) the classical estimate (approach).
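The reciprocal relationship between life expectancy and the open-interval death rate under stationarity can be checked numerically. The sketch below is illustrative only: it assumes a Gompertz force of mortality above age a, and the parameter values (`mu_a`, `b`) and function names are hypothetical choices, not values from the paper.

```python
import math

def classical_life_expectancy(death_rate_open):
    """Classical estimate: reciprocal of the open-interval death rate."""
    return 1.0 / death_rate_open

def stationary_check(mu_a=0.15, b=0.1, step=0.001, span=60.0):
    """Compare 1/M_{a+} with the directly integrated e_a in a stationary population.

    Assumption: the hazard t years above age a is mu_a * exp(b * t) (Gompertz).
    """
    person_years = 0.0  # integral of survival; equals e_a because l(a) = 1
    deaths = 0.0        # integral of hazard * survival
    for i in range(int(round(span / step))):
        t = (i + 0.5) * step  # midpoint rule
        hazard = mu_a * math.exp(b * t)
        survival = math.exp(-(mu_a / b) * (math.exp(b * t) - 1.0))
        person_years += survival * step
        deaths += hazard * survival * step
    m_open = deaths / person_years  # death rate in the open age interval
    return classical_life_expectancy(m_open), person_years

classical, direct = stationary_check()
print(classical, direct)
```

In a stationary population the age composition is proportional to the survival function, so the two printed numbers should agree up to the integration error.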
The classical approach is not the only one available for cases with age reporting problems. The most common alternative method for calculating the last "problematic" age range of the life table is extrapolation of death rates based on their change at younger ages in combination with a mortality model (Mathers and Ho 2014; Missov et al. 2016; UN DESA/Population Division 1982; Wilmoth et al. 2007). The World Health Organization (WHO) (Mathers and Ho 2014) used to extrapolate the death rates above the age of 85 by assuming a logistic mortality model. The United Nations (UN DESA/Population Division 1982) constructs life tables for developing countries by fitting the Gompertz-Makeham model at younger ages and closing at age 85+. The Human Mortality Database (HMD) (Wilmoth et al. 2007) also corrects the original data at ages 80+. Extrapolation does not assume population stationarity or any other population model. Such extrapolation, however, ignores the original empirical data pertaining to the open age interval. Furthermore, as we demonstrate in this paper, the extrapolation method tends to be less accurate in terms of life expectancy than the other methods considered here.
Two other alternatives to the classical approach also work with the open age interval, just like the classical method, but relax the stationarity assumption underlying (1). Indeed, populations are rarely stationary. Improving survival and changing fertility and immigration modify the population age composition. Growing populations typically have a younger age composition, with more weight on younger ages with lower mortality (Preston et al. 2001); their death rate in the open age interval is lower than in a stationary population with similar mortality. As a result, the classical method is prone to overestimating life expectancies for such populations. To account for the effects of population growth, Horiuchi and Coale (1982) assumed the stable population model (Preston et al. 2001) with Gompertzian mortality (Gompertz 1825; Heligman and Pollard 1980) and developed an adjustment formula, referred to here as (2), in which $r$ is the annual growth rate of the population in the open age interval and $\alpha_a$ and $\beta_a$ are the model parameters (for numerical values, see Horiuchi and Coale 1982 or Ediev 2017).
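The direction of the classical method's bias in a non-stationary population can be illustrated with a small numerical experiment. This is a minimal sketch, not the Horiuchi-Coale adjustment itself: it assumes a stable population whose age composition above age a is proportional to exp(-r t) times survival, with Gompertz mortality; the parameter values and the function name `open_interval_bias` are hypothetical.

```python
import math

def open_interval_bias(r, mu_a=0.05, b=0.1, step=0.001, span=80.0):
    """Error of the classical estimate 1/M_{a+} in a stable population.

    Assumptions: hazard mu_a * exp(b * t) at t years above age a, and a stable
    age composition proportional to exp(-r * t) * survival(t).
    Returns (classical estimate) minus (true life expectancy at age a).
    """
    weighted_person_years = 0.0
    weighted_deaths = 0.0
    true_e = 0.0
    for i in range(int(round(span / step))):
        t = (i + 0.5) * step  # midpoint rule
        survival = math.exp(-(mu_a / b) * (math.exp(b * t) - 1.0))
        weight = math.exp(-r * t) * survival  # stable age composition
        weighted_person_years += weight * step
        weighted_deaths += mu_a * math.exp(b * t) * weight * step
        true_e += survival * step
    return weighted_person_years / weighted_deaths - true_e

for growth_rate in (-0.02, 0.0, 0.02):
    print(growth_rate, open_interval_bias(growth_rate))
```

Under these assumptions a positive growth rate shifts weight toward younger ages with lower mortality, so the classical estimate exceeds the true life expectancy; a negative growth rate has the opposite effect, and the bias vanishes when r is zero.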
Adjustment (2) was challenged by Mitra (1984, 1985), who also assumed population stability and derived a closed-form solution, referred to here as (3), which involves $\bar{x}$, the mean age of the population in the open age interval. Mitra's approach was, however, criticized for being prone to biases due to age exaggeration (Coale 1985). Indeed, if age exaggeration is the problem in the first place, one should be cautious about using the empirical mean population age $\bar{x}$ in an adjustment procedure. Partly because the discussion between Horiuchi-Coale and Mitra was never resolved, but also because of the strong assumptions used in both approaches, their methods have not found wide practical use among demographers and population statisticians.
Two developments since the time of the discussion between Horiuchi-Coale and Mitra call for revisiting their results. First, the empirical basis for mortality studies and the available computational resources have advanced substantially, as reflected in the rich collection of high-quality data in the HMD (2017). Second, life expectancy in many countries has systematically advanced since 1980. On the one hand, better survival to old age boosts the importance of the open age interval for life table estimates. On the other hand, higher (and improving) life expectancies call for testing whether the old methodology works well on current data. Our paper responds to these needs and introduces some useful modifications to the original adjustment formulas. We also aim at reconciling the dispute between Horiuchi-Coale and Mitra and at combining alternative methods for a better outcome.
Testing the models of expectation of life at old age on empirical data
For each open age interval, we calculate the aggregated death rate for the open age interval using the death rates and population exposures from the HMD:

$$M_{a+} = \frac{\sum_{x=a}^{\omega} M_x P_x}{\sum_{x=a}^{\omega} P_x}. \tag{4}$$

Here, $M_x$ and $P_x$ are the HMD death rate and the population exposure at age $x$, respectively, and $\omega$ is the maximum attainable age group (110+ in the HMD).
In a similar fashion, we examine the estimation errors in the methods of Horiuchi-Coale (2) and Mitra (3) and in the extrapolation method, where death rates are extrapolated into the open age interval based on their rate of increase at younger ages. We improved the stability of the Horiuchi-Coale and Mitra formulas by using the population growth rates averaged over the 10-year periods prior to the estimation year. If we used the annual rates (results not presented here), the adjusted life expectancies would contain more outliers, especially in the Mitra method. For extrapolations, we use the Gompertz model (Doray 2008; Gompertz 1825), in which the force of mortality increases exponentially with age. After some experimentation, we opted for the marginally better extrapolation without a jump of the death rate at age $a$, with parameters fitted on the 20-year age interval below the open age interval. This extrapolation is close to the common practices for cases with low-quality data. The Gompertz model is also useful as a bridge to the work by Horiuchi-Coale, who relied on the model in developing their own method.
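The aggregation of death rates over the open age interval and the Gompertz extrapolation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the fit is an ordinary least-squares regression of log death rates on age (the paper does not state the fitting procedure), single-year death rates are treated as hazards at the age midpoints, and all function names and input values are hypothetical.

```python
import math

def open_interval_death_rate(rates, exposures):
    """Exposure-weighted mean of the age-specific death rates at ages a and above."""
    deaths = sum(m * p for m, p in zip(rates, exposures))
    return deaths / sum(exposures)

def fit_gompertz(ages, rates):
    """Least-squares fit of log(rate) = intercept + slope * age (an assumed procedure)."""
    n = len(ages)
    logs = [math.log(m) for m in rates]
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def gompertz_life_expectancy(a, intercept, slope, step=0.01, span=80.0):
    """Life expectancy at age a when the fitted hazard is continued above a without a jump."""
    mu_a = math.exp(intercept + slope * a)
    total = 0.0
    for i in range(int(round(span / step))):
        t = (i + 0.5) * step  # midpoint rule
        total += math.exp(-(mu_a / slope) * (math.exp(slope * t) - 1.0)) * step
    return total

# Synthetic, made-up death rates for the 20 single-year ages below a = 85.
ages = [x + 0.5 for x in range(65, 85)]
rates = [0.02 * math.exp(0.1 * (x - 65.5)) for x in ages]
intercept, slope = fit_gompertz(ages, rates)
print(slope, gompertz_life_expectancy(85.0, intercept, slope))
```

Because the extrapolated life expectancy depends only on the fitted curve, it ignores the deaths and exposures actually observed in the open age interval, which is the limitation of the extrapolation method noted in the text.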
Comparative results for estimation errors of life expectancy at birth and at age $a$ in all methods are presented in Figs. 1 and 2 and Table 1. The percentage difference (5) tends to 100% when the first of the methods compared shows much greater errors than the second; it tends to −100% in the opposite case, when the second method performs much worse; and it equals zero when the two methods show similar absolute estimation errors.
The estimation errors in the classical method are predominantly positive because of the worldwide growth of the elderly population in the course of the demographic transition, which produced population age structures younger (of lower mortality) than the structure of the stationary populations assumed in the method. The biases were relatively small (yet quite substantial) for periods with shorter life expectancy at birth and soared to high levels as life expectancy grew. This agrees well with the formal derivations by Horiuchi-Coale and Mitra. Currently, closing the life table at age 65, both sexes combined, would produce an upward bias in life expectancy at birth as high as 10 years in some countries and more than 2.5 years in many other cases.
The extrapolation method, surprisingly, does not improve over the classical method in terms of bias and is rather unstable for open age intervals starting at younger ages. Only in cases with high life expectancy and a later onset of the open age interval does it systematically outperform the classical method (Fig. 2, plot k). Note that we used the Gompertz model, which assumes exponential growth of the death rates as a function of age. If we used a more optimistic logistic-type model (e.g., the Kannisto model (Thatcher et al. 1998), which better fits the pattern of mortality deceleration at the oldest-old ages and is used by the WHO and the HMD), the upward biases in life expectancy estimates would be even higher.
The Horiuchi-Coale formula provides a remarkable improvement in terms of estimation errors over both the classical and the extrapolation methods at all levels of life expectancy (Table 1, Figs. 1 and 2, plots f, h, l, m), although the parameters for the formula were estimated back in the 1980s. The vast reduction of estimation errors after applying the adjustment indicates that the method is rather robust to violations of its underlying assumptions (Gompertzian death rates and a stable population age structure). Table 1 also includes results for the Horiuchi-Coale formula where we kept the original values for the parameter $\alpha_a$ but re-estimated the other parameter, $\beta_a$, based on our database ("H.-C. (hmd)" columns of Table 1; see Ediev 2017 for the parameter's values). Updating the model parameters to the more complete HMD provides only a marginal improvement in terms of root-mean-squared error (RMSE). Yet, the method's RMSEs may perhaps be further reduced by fitting the model to more homogeneous data (to groups of populations with similar mortality dynamics and growth histories).
The Mitra formula is generally more accurate than the Horiuchi-Coale method (Table 1, Figs. 1 and 2, plot p), except for female life tables with low values of the life expectancy at birth. However, the method appears to be prone to producing outliers, especially overestimates of life expectancy (Figs. 1 and 2, plots d, g, i, j). The Mitra formula involves the population mean age in the open age interval, $\bar{x}$, an indicator that is easy to calculate for populations with good-quality data, such as the HMD populations, but problematic for populations with age exaggeration. Hence, we checked whether the formula remains accurate after substituting $\bar{x}$ by its prediction from a regression, referred to here as (6), on the growth rate and the observed death rate (Ediev 2017). Results for the Mitra formula with the approximate mean age (6) are shown in Table 1, in the "Mitra (regr.)" columns. Substituting the true mean age by its indirect estimate only marginally increases RMSEs. Even when based on indirect mean age estimates, the method remains more accurate (but also more prone to producing outliers) than the Horiuchi-Coale method. The differences between the two methods, however, are minor compared to the errors in the classical method. A closer inspection of cases where the Mitra method produces outlier estimates shows that it is the sensitivity of the method to the population growth rate that makes it unstable. Apparently, the term in (3) that is quadratic with respect to $r$ causes the method to produce strong overestimates of the life expectancy in cases with strong population growth or decline and when the population stability assumption is violated.
In all methods, errors tend to increase as life expectancy grows. Interestingly, all methods tend to err more often on the positive side, i.e., they overestimate life expectancy, although the non-classical methods are free from the classical method's source of error, which is rooted in the stationarity assumption. The reasons for the positive errors differ among the methods. The extrapolation method produces positive or negative errors depending on whether the death rates increase more steeply above or below the minimal age of the open age interval. Mortality acceleration at younger old ages (Horiuchi and Wilmoth 1997; Horiuchi 1997), which is more typical in female populations, explains positive biases in the extrapolation method. At the same time, mortality deceleration at older ages (Horiuchi and Wilmoth 1997, 1998; Horiuchi et al. 2003) may explain the somewhat more prevalent negative biases of the extrapolation method for the open age interval 85+ and the tendency of the method to produce negative errors for the open age interval 95+ (results not shown here). The Horiuchi-Coale and Mitra methods tend to produce positive errors because protracted periods of mortality decline at old age, as observed in many countries, produce population age structures even younger than the stable populations assumed in the two methods (Ediev 2014; Guillot 2003; Horiuchi and Preston 1988). The Mitra method, additionally, tends to overestimate the true life expectancy because of the aforementioned instability of the method in cases of strong population change.
The tendency of both the classical and the Mitra estimates to exaggerate the life expectancy suggests a novel combined approach in which the life expectancy estimate is obtained as the minimum of the two estimates:

$$e_a = \min\left(e_a^{\text{Clas.}},\, e_a^{\text{M.}}\right). \tag{7}$$

Here, the superscripts "Clas." and "M." refer to the classical and Mitra estimates. Estimation errors of life expectancy at birth in the combined method (7), also in comparison to the single estimation methods, are presented in Fig. 3. Taking the minimum of the two estimates helps avoid the outlier estimates of the Mitra method (compare the boxplots in the first column in Fig. 3 at a = 75 and a = 85 to the boxplots "d" in Figs. 1 and 2 for the Mitra method). At the same time, the combined method performs, in most of the cases, as well as the Mitra method (column 5 in Fig. 3) and outperforms all other single-method alternatives (columns 2-4 in Fig. 3).
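The combined estimate amounts to a one-line rule. The sketch below assumes that the classical and Mitra estimates have already been computed elsewhere; the function name and the input values are hypothetical and serve only to show how the rule caps an outlying Mitra estimate.

```python
def combined_life_expectancy(e_classical, e_mitra):
    """Combined estimate: the smaller of the classical and Mitra estimates."""
    return min(e_classical, e_mitra)

# Made-up illustrative inputs: (classical estimate, Mitra estimate) at age a.
cases = [(6.2, 5.8),   # typical case: the Mitra estimate is lower and is kept
         (6.2, 9.5)]   # outlier case: the classical estimate caps the Mitra estimate
for e_clas, e_m in cases:
    print(e_clas, e_m, combined_life_expectancy(e_clas, e_m))
```

The rule relies on both inputs tending to err upward, as reported in the text; it would not be appropriate for estimators that are biased in opposite directions.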
The results above show that both the Horiuchi-Coale and Mitra formulas perform well in reducing life expectancy estimation errors caused by aggregating data for the open age interval. The Mitra method is marginally more accurate but less stable. Its reliance on the possibly exaggerated mean population age may be overcome by using the indirect estimates of the mean age (6). Both methods are by far superior to the classical and the extrapolation methods. It is rather surprising that the authors of these two methods came up with contradictory results in their papers.
A closer examination of the original papers, however, shows that a large part of the numerical differences between Horiuchi-Coale and Mitra was, in fact, due to different inputs used in their calculations rather than to methodological differences. In particular, the largest discrepancy in the original papers was for $e_{65}$ for El Salvador in 1961: Mitra's estimates were larger by 3.12 years for women and 3.02 years for men. When we recalculated the life expectancies using similar inputs in both approaches (Mitra 1984, pp. 11-12), we found that the two approaches are more consistent: Mitra's formula gives estimates larger by 1.69 and 1.27 years, respectively. After our recalculation, the estimates also become closer for Canada, Japan, Switzerland, UK, Mexico, and Malaysia. Altogether, the two methods differ by more than 1 year in only two cases out of 13, El Salvador and Puerto Rico. If one takes into account that the Mitra method relied on potentially biased official estimates of the mean population age in the open age interval, it becomes clear that the authors were more consistent than they concluded.

Discussion
Our results show that the violation of the stationary population assumption of the classical life table method has strong consequences for the accuracy of life expectancy estimates. The errors in the classical method increase as mortality declines. For a currently low-mortality population, closing the life table at age 65 would produce an HMD-average upward bias of more than 3 years and even larger RMSEs in the life expectancy at birth calculated by the classical method. Continuing increases in life expectancy will drive the biases of the classical method to even higher levels.
The methods developed by Horiuchi-Coale and Mitra drastically reduce estimation errors in the expectation of life as compared to both the classical and the extrapolation methods. Wider usage of these methods should be encouraged for populations where data on old-age mortality are missing (for example, due to small population size) or corrupted by age exaggeration or age misreporting in general. Although estimation errors of all methods increased as life expectancy grew for HMD populations, the comparative advantage of the Horiuchi-Coale and Mitra methods has only strengthened over time. Among the methods considered, the Horiuchi-Coale method may be preferred as being close to the best (on average) estimates provided by the Mitra formula but being more stable. Even better results, however, may be obtained by considering several alternative methods and selecting the most reasonable estimate (see the combined method (7) for an example).
A wider usage of the Horiuchi-Coale and Mitra methods in demographic analysis may be facilitated by their perfect fit to a number of popular indirect demographic methods. The Brass Growth Balance method, the Preston and Coale method, the Hill Generalized Growth Balance method, and the Bennett and Horiuchi Synthetic Extinct Generations method all involve estimates of the population growth rate (Moultrie et al. 2013;United Nations 1983). These methods provide ready inputs for the Horiuchi-Coale and Mitra formulas.
The method of extrapolating the death rates into the open age interval does not appear to be a good alternative to the classical method. The WHO and the HMD use an S-shaped model for extrapolating the death rates as an alternative to the J-shaped Gompertz model used here. Indeed, logistic-type models were shown to better fit the deceleration of mortality at the oldest-old ages (Missov et al. 2016; Thatcher et al. 1998). However, our results for the younger open age intervals, where all mortality models fit closely to each other, suggest that the conclusion about the inferiority of extrapolation to the Horiuchi-Coale and Mitra methods applies, in general, to mortality models other than the Gompertz model as well. In fact, a logistic-type model might even accentuate the upward biases in life expectancy estimates. In many applications, however, it is important to extend the age profile of the death rates into the open age interval. Although our results discourage the use of the popular extrapolations, one may combine the more accurate adjusted estimate of life expectancy $e_a$ with the extrapolation model by constraining the parameters of the latter to fit the life expectancy estimate (Ediev 2017). Another area for further work is the study of estimation errors in nonparametric methods for the open age interval not considered here (Camarda 2012; Currie et al. 2004; de Beer 2012; Kostaki and Panousis 2001; Rizzi et al. 2015).
As an important consequence of the discrepancy between the actual death rate and life expectancy for the open age interval, the traditional approach to projecting the population in the open age interval (Preston et al. 2001) may overestimate the number of deaths in the open age interval by tens of percent. The adjustment formulas considered here, as well as related conditioned extrapolations of the death rates, may be used to compensate for this projection bias.
One may further improve life expectancy estimates by fitting the models on populations with more similar histories of growth and mortality reduction (e.g., regions of a given country). Another direction of improvement might be to consider population models more advanced than the stable population. For example, one may consider the effects of the changing growth rate and mortality on the population age composition (Brouard 1986; Ediev 2014; Guillot 2003; Horiuchi and Preston 1988). Results for the method combining the classical estimates with the Mitra method also suggest that pooling together several methods and making use of expert judgment about the likely direction of estimation biases may help reduce the estimation errors and stabilize the estimation results in particular country cases.
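One way to constrain an extrapolation model to a given life expectancy estimate is sketched below. It is only an illustration under assumptions and not the procedure of Ediev (2017): the Gompertz slope `b` is held fixed, and only the mortality level at age a (`mu_a`) is calibrated by bisection so that the model reproduces a target $e_a$. All names and numerical values are hypothetical.

```python
import math

def gompertz_e(mu_a, b, step=0.01, span=80.0):
    """Life expectancy at age a for the hazard mu_a * exp(b * t), t years above a."""
    total = 0.0
    for i in range(int(round(span / step))):
        t = (i + 0.5) * step  # midpoint rule
        total += math.exp(-(mu_a / b) * (math.exp(b * t) - 1.0)) * step
    return total

def calibrate_level(target_e, b, lo=1e-6, hi=10.0, iterations=60):
    """Find mu_a such that gompertz_e(mu_a, b) matches target_e.

    Bisection on a logarithmic scale; it works because life expectancy
    decreases monotonically as the mortality level rises.
    """
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if gompertz_e(mid, b) > target_e:
            lo = mid  # model life expectancy too high: raise the mortality level
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical adjusted estimate of e_a and an assumed slope.
mu_a = calibrate_level(target_e=6.0, b=0.1)
print(mu_a, gompertz_e(mu_a, 0.1))
```

The calibrated curve then provides an age profile of death rates inside the open age interval that is consistent with the adjusted life expectancy, at the cost of a possible jump in the death rate at age a.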
Endnote
1 The HMD modifies (smooths and extrapolates) the original data at some ages above age 80 using the logistic model. This might have distorted our findings for the highest open age intervals, especially for the extrapolation method. Yet, after rerunning our calculations on the raw, unmodified death rates also provided in the HMD, we obtained results similar to those presented in this paper. The root-mean-squared errors (RMSEs) obtained on the raw data differ only in the second decimal place from the RMSEs presented in Table 1 for all methods except the extrapolation. Even for the extrapolation method, there was only one case in which the RMSE based on the raw data differed in the first decimal place from the results presented in Table 1 and the difference was substantial in relative terms: females, life expectancy at birth in the range of 40-50 years, a = 85 (RMSE on the smoothed death rates, 0.04; RMSE on the raw data, 0.16). Data were downloaded on 12.02.2016 and are consistent with the inputs in Ediev (2017).