
Race and life expectancy in the USA in the Great Depression


Prior work has highlighted increases in life expectancy in the USA during the Great Depression. This contradicts the tenet that life expectancy is positively correlated with human welfare, but it coheres with recent literature on mortality and recessions. We construct Lee-Carter interval estimates of life expectancy during the Great Depression, based on trends before 1929. In this analysis, all-race life expectancy did not grow unusually during the Great Depression. However, nonwhites did see greater-than-expected increases in life expectancy in 1930–1933. We discuss potential explanations. We conclude by urging scholars of mortality during this time period to focus on race whenever the data permit it.


During the Great Depression (1930–1933), both infant and non-infant death rates declined in the USA (Fishback et al. 2007), and life expectancy increased (Tapia Granados and Diez Roux 2009). This occurred despite drops in the gross domestic product (GDP) and a rising unemployment rate. Given that life expectancy is often regarded as a proxy for social conditions (e.g., Lieberson, 1980; Ewbank, 1987), this is surprising. What is more, it contradicts prior research (albeit for cause-specific mortality), which found a countercyclical relationship between the economy and heart disease death rates during this time period (Brenner 1971). However, there is increasing evidence for a procyclical relation between mortality and the economy (e.g., Ruhm, 2000; Angelini and Mierau, 2014; Haaland and Telle, 2015; Ruhm, 2015; Sameem and Sylwester, 2017; Tapia Granados and Ionides, 2017; van den Berg et al., 2017). Our goal is to refine our understanding of mortality change specifically during the Great Depression, the largest recession in the USA since 1900.

Scholars have advanced several explanations for gains in life expectancy during economic downturns such as the Great Depression. First, the “income effect” explanation assumes that a fall in family income reduces consumption of health-damaging goods (e.g., alcohol, Khan et al., 2002). Second, the “hazards” explanation predicts, following elevated unemployment, fewer traffic-related and work-related accidents as well as reduced exposure to work-related hazards (Gerdtham and Ruhm 2006). Third, workers who remain employed during recessions may, out of fear of imminent job loss, reduce behaviors (e.g., alcohol consumption) that place them at risk of appearing deviant or delinquent (Catalano et al. 2002). Fourth, working-age adults who lose jobs may shift their time use to health-promoting activities for themselves and their family (e.g., exercise, parenting, caregiving for elderly parents, Ruhm, 2007). Whereas the “hazards” explanation enjoys the most empirical support in contemporary societies (Gerdtham and Ruhm 2006), the lack of historical data on health behaviors in the 1920s and 1930s makes it challenging to assess the relevance of these explanations to the Great Depression case.

Increasing life expectancy is the hallmark of mortality in the twentieth century (Oeppen and Vaupel, 2002; Vallin and Meslé, 2009; Canudas-Romo, 2010). Were increases in life expectancy, 1930–1933, unusual relative to prior trends, or can they be regarded as a continuation of them? Taking a counterfactual approach, we project (with uncertainty) life expectancy, 1930–1940, based on data from 1900–1929, using the Lee-Carter model. We then compare the resulting projection interval to observed life expectancy in the USA. We also disaggregate by sex and race (white/nonwhite), and—because of idiosyncratic compositional changes in the US death registration area, 1900–1933—we replicate the analysis using a balanced panel of states. To the best of our knowledge, the present work is the first to use the Lee-Carter approach for this type of historical counterfactual.

The contribution of this study to the literature on mortality response to social change is that it demographically contextualizes the rises in life expectancy during the Great Depression, by taking into account prior trends. Overall, we find that while all-race life expectancy rose during the Great Depression, the pattern cannot be regarded as unusual relative to the 1900–1929 trend. For nonwhites, the increases in life expectancy, 1930–1933, were greater (consistent with the point estimates of Tapia Granados and Diez Roux (2009)), and exceed the projection interval based on the 1900–1929 trend. Our analysis includes counterfactual projection intervals. We interpret the results as a demographically important increase in nonwhite life expectancy during the Great Depression; this is not statistical significance in a strict sense. We also consider that the Great Migration may have shifted the geography of the nonwhite population in ways that affected mortality. Our descriptive demographic analysis raises this migration hypothesis, but we are unaware of internal migration data from the USA in this time period with annual time resolution (not to mention stratification by race). Therefore, we are unable to test the migration hypothesis. We strongly recommend that scholars working on mortality during the Great Depression stratify their analyses by race whenever the data permit it.

Data and methods

Before describing our analytic approach in greater detail, we give a temporal definition of the Great Depression. Assigning a precise start date is a challenge (Eichengreen 2004). We use calendar-year mortality data, so we require only calendar-year precision in dating the Great Depression. Figure 1 presents two key economic indicators, the unemployment rate and inflation-adjusted gross domestic product per capita, using data from Carter et al. (2006). The unemployment rate rose dramatically in 1930 relative to 1929, and peaked in 1932 (on the difficulties of measuring unemployment during this period, cf. Darby 1976 and Wallis 1989). In 1933, unemployment was still high but declining, and by 1934 it had declined steeply. Per capita GDP reached a then-historic high in 1929 and tumbled in 1930–1933; in 1934, it started to rebound. The last “normal” year, so to say, is 1929, while the recovery starts in 1933. We define the Great Depression as 1930–1933, inclusive.

Fig. 1 Socioeconomic statistics, USA, 1900–1940. Unemployment rate (left y-axis), solid line. Per capita GDP (right y-axis) inflation-adjusted to 1996, dashed line

We use national data on mortality rates by age and sex for the USA, 1900–1940 (U.S. Department of Health 1956). These data are the longest series of pre-Great Depression mortality data available. Appendix I gives some descriptive statistics and time series plots of the input data. The USA’s mortality is racially imprinted (Preston et al. 2003), so it is logical to perform sub-analyses by race. We analyze data for the total population, as well as for whites and nonwhites separately; the data do not have more granular information on racial categories, but nonwhite in this period was predominantly black or African-American (Lerner 1975). All analyses are done separately by sex. For 1900–1932, the data are for the death registration area (DRA), a subset of the country (see Hetzel 1997, pp. 43–66). To test whether compositional changes in the DRA affect our results, we replicated the analysis using a balanced panel (the death registration states of 1910, using data from Linder and Grove (1943), Table 8). The balanced panel analysis is in Appendix III.

We used a Lee-Carter model (Lee and Carter 1992) to calculate a range of plausible life expectancies during the 1930s, based only on information from before 1930. Since their development 25 years ago, Lee-Carter models have enjoyed wide use in demography (Shang et al. 2011; Shang 2016). Lee-Carter has two parts, estimation and projection. Using data from 1900–1929, we fit the Lee-Carter b and k parameters using singular value decomposition (Lee 1992). Using the method of computing the a values recommended by Lee (2000), the projection is constrained to equal the data in the knot year of 1929 (see also Bell 1997). We then used these estimates to project mortality for 1930 through 1940. The Lee-Carter model as implemented here is described in much greater detail in Appendix II.
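The estimation step can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the authors' IDL code; the function name `fit_lee_carter` and the input layout (rows are age groups, columns are training years) are assumptions made for the example. It follows the variant described above, in which the a values are taken from the final training year so that the fit passes through the knot year.

```python
import numpy as np

def fit_lee_carter(log_rates):
    """Fit Lee-Carter a, b, k from a matrix of logged death rates.

    log_rates: array of shape (n_ages, n_years); rows are age groups and
    columns are training years (hypothetical layout for this sketch).
    """
    # a is the final training year's log rates (the Lee 2000 variant),
    # so the fitted surface equals the data in the knot year.
    a = log_rates[:, -1]
    q = log_rates - a[:, None]          # centered matrix, Q = M - a

    # SVD of the transpose: rows are years, columns are age groups.
    u, w, vt = np.linalg.svd(q.T, full_matrices=False)

    v1 = vt[0, :]                       # first right singular vector (ages)
    t = v1.sum()
    b = v1 / t                          # normalized so that sum(b) == 1
    k = u[:, 0] * t * w[0]              # one k value per training year
    return a, b, k
```

Because the last column of Q is zero by construction, the fitted k in the knot year is zero, which is what ties the projection to the observed 1929 data.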

The Lee-Carter technique is typically used for forecasting (Lee 2000), so it is worth further describing our application of Lee-Carter to historical data. We ask whether the increase in life expectancy, 1930–1933, was unusual relative to contemporary trends and variations. In short, we use data from before 1930 to construct a counterfactual near-term (10-year) projection. This provides an estimate of life expectancy in the 1930s based only on information from before the 1930s. Given that this series exhibits autocorrelation, the Lee-Carter approach is more appropriate than a polynomial extrapolation of the 1900–1929 life expectancy trend. An alternate approach would be ARIMA models, but these require at least 50 observations (years, in this context) to perform well (Box et al. 2008), so are not suited to the current problem, in which data begin in 1900. The Lee-Carter approach provides a projection interval (not a statistical confidence interval). The null hypothesis is that life expectancy during the Great Depression was not different from the twentieth century juggernaut of life expectancy up to 1929. Where the empirical data lie inside the projection interval, we fail to reject this null. This is not the same as saying life expectancy did not increase (Amrhein et al. 2017). The 1918 pandemic notwithstanding, life expectancy is a measure that tends to change slowly. Thus, the question at hand is more suited to variance-based measures—such as the Lee-Carter projection fan—than to regression discontinuity or similar designs.

The Lee-Carter projection is a random walk with drift of the model’s k parameter (Lee and Carter 1992; Li and Lee 2005), using the 1900–1929 mean annual increase as the drift and the 1900–1929 standard deviation. The process is repeated (all our results are based on 1,000,000 runs), generating a distribution of outcomes, from which the 95% projection interval is generated by taking the 2.5 and 97.5 percentiles. We modified the standard Lee-Carter approach to assume that the sexes are correlated, as opposed to independent random walks. That is to say, rather than model \(k_m\) and \(k_f\) as independent Brownian random walks based on their variances (Freedman 1983), we model the deviations as being drawn from a bivariate normal distribution based on the male:female variance-covariance matrix. Since the projection interval is based on the tail densities, this elaboration is not crucial to the result (i.e., the projection interval widths). However, it is justified theoretically (Noymer and Van 2014; Raftery et al. 2014) and is similar to the cointegrated approach discussed by Carter and Lee (1992) or the Poisson approach of Li (2013).
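To make the simulation concrete, here is a hedged Python sketch of a correlated random walk with drift for the male and female k parameters. The function names, argument layout, and the example numbers in the usage note are hypothetical. The sketch also stops at a percentile band for k itself; turning each simulated path into life expectancy (via a + b·k and a life table) is not shown.

```python
import numpy as np

def simulate_k_paths(k_last, drift, cov, horizon, n_runs, seed=0):
    """Simulate correlated male/female random walks with drift for k.

    k_last : length-2 sequence, k in the knot year for (male, female)
    drift  : length-2 sequence, mean annual change in k for each sex
    cov    : 2x2 variance-covariance matrix of the annual deviations
    Returns an array of shape (n_runs, horizon, 2).
    """
    rng = np.random.default_rng(seed)
    # Draw the male and female shocks jointly rather than independently.
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=(n_runs, horizon))
    steps = np.asarray(drift, dtype=float) + shocks
    # Accumulate the annual steps forward from the knot-year value.
    return np.asarray(k_last, dtype=float) + np.cumsum(steps, axis=1)

def projection_interval(paths, lower=2.5, upper=97.5):
    """Percentile band across runs, for each projected year and sex."""
    return np.percentile(paths, [lower, upper], axis=0)
```

A call such as `simulate_k_paths([0.0, 0.0], [-1.0, -1.2], [[1.0, 0.8], [0.8, 1.2]], horizon=11, n_runs=10000)` would produce paths for an 11-year horizon (1930 through 1940); the drift and covariance values here are placeholders, not estimates from the paper.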

Mortality was severely affected by an influenza pandemic in 1918, causing a conundrum for fitting k (Lee 1992). We chose to omit this year, pretending, so to say, that 1917 is followed by 1919. The graphs illustrate this by using an alternate line pattern, 1917–1919. Inclusion of 1918 would increase the standard deviation of the fitted k values. This would result in a wider projection interval, making it harder for the observed life expectancy to escape the interval. We chose the approach of excluding 1918 to avoid bias toward the null. Analysis was performed using IDL 8.7 (Exelis Visual Information Solutions, Inc., Boulder, CO).


Our results are principally graphical, shown in Figs. 2, 3, and 4. All the graphs have the same vertical and horizontal scale, allowing like-for-like comparisons. The shaded vertical bands indicate the Great Depression (1929–1933); 1929 is the knot year, in which the projection and observed data are aligned, and 1933 is the end of the Great Depression. Figure 2 presents the results for all races. During the Great Depression, life expectancy for either sex rose but did not escape the 95% projection interval; females are closer to the edge than are males. Interestingly, after the Great Depression ends (as defined, 1933), life expectancy decreases and then stabilizes, before starting to rise again in 1937. As a check on our calculations, we compared our life expectancy numbers to those of the Human Mortality Database (HMD) (Barbieri et al., 2015) and find excellent agreement (Footnote 1).

Fig. 2 All races life expectancy, 1900–1940, USA. Males (left) and females (right). With Lee-Carter projection interval for 1930–1940, based on 1900–1929. Due to the influenza pandemic, 1918 was omitted from the variance calculation (dashed lines)

Fig. 3 Same as Fig. 2, but for whites only

Fig. 4 Same as Fig. 2, but for nonwhites only

Figure 3 shows the same result for whites only. Not surprisingly, given the racial composition of the USA at the time, it is largely the same as Fig. 2, but the empirical life expectancy data are slightly closer to the center of the projection fan. The most interesting findings are for nonwhites, in Fig. 4: for males, in 1930 the observed life expectancy is inside but near the edge of the projection interval; this is similar to the finding for whites, although nonwhites are closer to the boundary of what would be considered a significant deviation from the prior trend. In 1931–1933, inclusive, nonwhite males’ life expectancy escapes the projection interval. Thus, during the Great Depression, nonwhite male life expectancy not only rose, but rose higher than expected from prior trends. Interestingly, when the worst of the Depression abated, nonwhite male life expectancy decreased again, and did not surpass its 1933 value until 1938, when it again exceeded the projection interval.

The most remarkable result is that for nonwhite females (Fig. 4). In 1930, like nonwhite males, life expectancy is at the edge of the projection interval. However, in 1931 and thereafter, nonwhite females’ life expectancy surpasses the projection fan. By 1940, the female life expectancy was about 2 years above even the upper bound of the projection interval. Contrast this to white females (Fig. 3), whose life expectancy was inside the projection interval, or to nonwhite males whose life expectancy in 1940 was about half a year outside the projection interval.

These findings are reinforced by some descriptive statistics on minima, maxima, and change (Table 1). These data summarize the observed trends and are split into the same training/projection period; they are not based on any Lee-Carter projections. The first six data columns refer to the training data for Lee-Carter (1900–1929, excluding 1918). The bottom part of the table pertains to the 1910–1929 balanced panel analysis of Appendix III. Not surprisingly given the general upward trend, minimum life expectancy occurs in 1900 for all race/sex combinations. For whites, the maximum life expectancy, 1900–1929, occurs in 1927, close to the end of the training period, as would be expected from the trend. For nonwhites, the maximum life expectancy occurs in 1922, seven years before the end of the window; this is in the wake of the 1920–1921 recession (see Fig. 1 and Wicker, 1966; Vernon, 1991) (Footnote 2).

Table 1 Table of before and after changes in life expectancy

Table 1 also gives the end minus start change in life expectancy (note, this is not the same as max minus min). The increases in life expectancy during the training period (viz., 1900–1929) were greater for nonwhites (3.96 more years of life expectancy gain for males, and 3.13 for females, indicated a and b, respectively, in Table 1). Bear in mind the much lower starting points for nonwhites, e.g., life expectancy below 30 years for nonwhite males (Footnote 3). The balanced panel shows more improvement for whites during the training period (but note that the training period for the balanced panel is not the same time span). These gains exhibit important racial differences: for whites, in the training period (1900–1929) the increases in life expectancy (in years of life per year of time) were 0.40 (c) for males and 0.42 (d) for females; for nonwhites it was 0.53 (e) for both sexes. The next six columns of Table 1 summarize 1929–1940. In keeping with generally upward trends, the minimum life expectancy data always occur in 1929, and the maxima always occur in 1939 or 1940. Unsurprisingly given Figs. 2, 3, and 4, all groups make substantial gains in life expectancy during the 12-year period beginning with the Great Depression, compared to the 30-year training period.

Given the profound compositional change in the death registration area, especially as regards race (see endnote 3), it is important to look at the balanced panel of states. From 1910–1929, white expansion of life expectancy was 0.38 years per year for either sex (f), while for nonwhites it was less, 0.31 (g) and 0.37 (h) for males and females, respectively. In the 12-year period beginning in 1929, life expectancy grew on average at a higher rate: for whites, 0.41 (i) and 0.47 (j) years per year for males and females, respectively. For nonwhites, the gains were astronomical: 0.78 (males) and 0.75 (females) years of life expectancy per calendar year (k and l, respectively). These increases are not due to changing composition of states, although changing composition of people, due to internal migration (and new birth cohorts), can affect the changes. The pace of improvement for nonwhites in the balanced panel states from 1929–1940 is nothing short of remarkable and is more than three times greater than Oeppen and Vaupel’s (2002) finding of 0.243 years per year for “best-practice” life expectancy gain at the global level (note also that that study was of record life expectancy among a sample of countries). Life expectancy is well-known to be affected by mortality levels at young ages, because child deaths result in more years of life lost. Nonetheless, the racial changes we see in life expectancy at birth are mirrored by life expectancy at age 15 (cf. Table 2, in Appendix I). The mortality changes we are studying are not concentrated in childhood.
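As a rough check on the “more than three times” comparison, using only the balanced-panel rates for nonwhites quoted above and the Oeppen and Vaupel (2002) figure of 0.243 years per year:

$$\frac{0.78}{0.243} \approx 3.21 \quad (\text{nonwhite males}), \qquad \frac{0.75}{0.243} \approx 3.09 \quad (\text{nonwhite females}). $$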


To briefly summarize the results, life expectancy improved during the Great Depression in the USA: more for females than males, and much more for nonwhites than whites. For nonwhites, both sexes escaped a Lee-Carter projection interval based on 1900–1929, and nonwhite females saw the most notable increases. Nothing in the balanced panel analyses (Appendix III) indicates that these findings are an artifact of the changing composition of the death registration area. At lower levels of life expectancy, a fixed percentage improvement in death rates makes a larger change in life expectancy (Karpinos, 1946; Mitra, 1979; Pollard, 1982; Keyfitz, 1985, pp. 62–72; Vaupel, 1986). Thus, in the present context, assuming the same proportional changes in death rates, we expect a slightly bigger response in life expectancy for nonwhites. However, the Lee-Carter analysis clearly shows that changes in nonwhite mortality were more profound.

The quality of mortality data for whites and nonwhites should not be assumed to be the same. Having complete death registration (and therefore being included in the death registration area) meant registering at least 90% of deaths (Hetzel 1997). Up to ten percent of deaths could be unregistered, and these could have been disproportionately nonwhite. Population denominators come from the census, for which nonwhite data quality was worse than that for whites (Karpinos 1939; Myers 1941; Price 1947). Although both nonwhite deaths and population were under-ascertained, it is unlikely that census undercounts mirrored death underregistration on an age-, sex-, and race-specific basis. Numerator-denominator mismatch can bias nonwhite death rates downward. Despite the stark differences in white and nonwhite life expectancies, in reality the gap may have been even larger (Elo 2001).

A related problem is age misreporting, thought to be greater among nonwhites. Complete birth registration, key in establishing age, came later than complete death registration, especially in poor southern counties where most nonwhites were born in the nineteenth and early twentieth centuries (Preston et al. 1998). This distorts age-specific death rates; in any event, data quality for nonwhites was poorer in this time period (Demeny and Gingrich 1967; Zelnik 1969; Ewbank 1987) and beyond (Elo and Preston 1994; Preston et al. 1996; Hill et al. 1997; Preston and Elo 2006). The life expectancy calculations require, as input, death rates at all ages, so the new series of infant mortality data (Eriksson et al. 2018) does not help in this application.

These data quality issues do not make our results uninterpretable. Our goal is to look at life expectancy differences over time, not to hang our hat on any particular point estimate. Many of the measurement issues with nonwhite life expectancy are constant over short time intervals and therefore do not affect inference about the Great Depression. Clearly, growth in life expectancy for nonwhites was greater than that for whites (Fig. 3 vs. Fig. 4). The data we analyze are aggregate vital statistics, fit for the purpose of identifying trends, but less suited to testing hypotheses about mechanisms.

The Great Depression overlapped with what is called the Great Migration, or the movement of blacks out of the South (both rural and urban) and into the more industrialized North (Eldridge and Thomas, 1964; Fligstein, 1981; Alexander, 1998). “More than 40% of the southern black population migrated out of the South between 1915 and 1970” (Boustan and Margo, 2016). This may have played a role in the racial differences in mortality change that we observe. Eriksson and Niemesh (2016) argue that black infant mortality was higher among births to migrants to the North. Thus, the mechanism may be slowing of the Great Migration during the Great Depression (Boone and Wilse-Samson, 2019). While the Great Migration is a potential explanation, annual data on internal migration flows by race are lacking (U.S. Bureau of the Census, 1946; Fishback et al., 2006; Boustan et al., 2010; Gutmann et al., 2016). Our results permit hypothesis formation that the Great Migration may have played a role in the mortality changes we observe, but we do not have the data to test this hypothesis (Footnote 4). We encourage other scholars to consider it (Footnote 5).

Direct assistance programs may also play a role in the observed trends. Fishback et al. ("Black-white differences in access to new deal relief under the WPA in 1940 and the FERA in 1933", unpublished) show that blacks benefited from public relief programs by 1933 (see also Liu and Fishback, 2019). However, note that the New Deal programs, which probably reduced infant mortality (Fishback et al. 2001), began in 1933. This is consistent with the gains seen in nonwhite life expectancy after 1936 (Fig. 4) but was too late to affect the 1929–1933 changes which are our principal focus. General improvements in public health programs during this period were either explicitly part of the New Deal and hence began in 1933 or were not limited to 1929–1933 (Duffy, 1990, pp. 256–270). Moreover, the period 1929–1933 was not a watershed in medical innovation. The first group of modern antibiotics did not come into use until 1937 (Lesch, 2007; Jayachandran et al., 2010) and no major vaccines were invented. In any case, it would be peculiar if a medical-technological innovation favored nonwhites (Link and Phelan, 1995).


Life expectancy increased during the Great Depression (Tapia Granados and Diez Roux, 2009). This is interesting in and of itself, especially since it may be regarded as counterintuitive. For the population as a whole, the rise in life expectancy in 1930–1933 does not exceed a Lee-Carter projection interval constructed from pre-1930 data, as described. Thus, although it is a prosaic explanation, continuation of secular trend (a juggernaut underway before the Great Depression) may well explain the pattern of life expectancy in the early 1930s. This is congruent with Stuckler et al. (2012) (see also Tapia Granados, 2012, 2013 and Stuckler et al., 2013). Our principal finding is that race-specific analyses reveal a divergence in life expectancy after 1930. Nonwhite Americans (overwhelmingly blacks during this period) show a sharp rise in life expectancy in 1930–1933 that exceeds the projection interval; this holds for both sexes.

Strengths of our approach include use of the widely accepted Lee-Carter method to compute a projection interval for life expectancy. Given the constraint of only 30 data points before the Great Depression (viz., 1900–1929), the Lee-Carter method is an appropriate way to make a counterfactual projection of life expectancy during the Great Depression, based only on prior mortality data. An additional strength of our approach is that we include a balanced panel, the death registration states of 1910. Although the balanced panel results do not affect the overall conclusions, this is only knowable ex-post. We also analyzed nonwhites and whites separately, which allowed us to show distinct differences in life expectancy patterns during and after the Great Depression.

This study has a number of limitations. Our principal finding refers to nonwhites, but, as discussed, this is the group for which data quality is poorest. Since we are more interested in trends than levels, we think our findings are robust, but clearly better data quality is always a desideratum. Choosing the best input data (i.e., training data for the trend) to calculate an uncertainty interval for life expectancy is tricky (Footnote 6).

Mortality data for the USA are available since 1900, and it is thus not possible to study trends in life expectancy over a longer period of time before the Great Depression. Compositional changes in the death registration area add to the input data challenges; using the balanced panel corrects for this, but at the cost of having ten fewer input observations. However, the balanced panel (Appendix III) is of states—not of people—and these states were on the receiving end of the Great Migration.

There is increasing evidence that mortality is procyclical: when the economy declines, so do death rates (Edwards, 2008; Ruhm, 2016) (Footnote 7). Great Depression findings (Fishback et al., 2007; Tapia Granados and Diez Roux, 2009; present work) agree with this. Heart disease was a more important cause of death in the 1960s than either during the Great Depression or nowadays (Goldman and Cook, 1984; Tate et al., 2016). Thus, the decline in the relative importance of heart disease mortality may explain some of the divergence between older and more recent work on this cyclicality. This could be one of the reasons both the Great Depression era and recent times are procyclical, while mid-century evidence is more elusive. Other prominent causes of death that have been linked to the economy are air pollution (Schwartz and Dockery, 1992), accidents (Ruhm 2015; He 2016), and alcohol-related deaths (Brenner, 1975; Norström, 2007). Replicating our projection-based analysis with a portfolio of cause-specific projections is not an alternative (Wilmoth, 1995).

A possible explanation for our findings is a temporary abatement of the Great Migration. Whether the nexus between our findings and the Great Migration is causal or coincidental, our findings are principally descriptive demography. This study uses vital statistics (i.e., aggregate data) and thus does not address causality in the way that microdata could. Nonetheless, this is a useful addition to knowledge about mortality in the Great Depression because of how our findings highlight nonwhite mortality changes, as well as how they show that the changes for the total population are hard to distinguish from the prior trend.

Our study refines prior work by using uncertainty intervals (specifically, a Lee-Carter projection interval based on 1900–1929) and by focusing on race. Prior studies have noted that life expectancy expanded during the Great Depression, but the present work underscores that racial differences are key, and that for whites the changes, while positive, were not remarkable. Our principal finding agrees with the idea that the Great Depression was pivotal for life expectancy, but highlights that this is much clearer for nonwhites. We urge scholars working on health and mortality during 1920–1940 to stratify their analyses by race wherever the data permit it.

Appendix I: Input data description

This appendix presents an overview of the input data: three tables of descriptive statistics, followed by graphs. Table 2 summarizes empirical changes in e(0) and e(15) from 1929 to 1933, demonstrating that the racial differences are not concentrated in childhood.

Table 2 Improvement in empirical e(0) and e(15) from 1929 to 1933, for all twelve population × sex × panel combinations

Fig. 5 Mortality rates by age, 1900–1940, all races. Solid lines are for the death registration area (1900–1932) and USA (1933–1940); dotted lines are for the death registration states of 1910 (1910–1940). Darker shading denotes input data to Lee-Carter model (up to 1929), but note that 1918 is excluded from the input data

Fig. 6 Mortality rates by age, 1900–1940, whites. Solid lines are for the death registration area (1900–1932) and USA (1933–1940); dotted lines are for the death registration states of 1910 (1910–1940). Darker shading denotes input data to Lee-Carter model (up to 1929), but note that 1918 is excluded from the input data

Tables 3 and 4 summarize the input data for the Lee-Carter model (1900–1929 for the main data and 1910–1929 for the balanced panel, excluding 1918 in both cases). The graphs include up to 1940 and thus, depict more data than are summarized in the tables. Since the Lee-Carter model operates on log scale, the means in the following tables are geometric means (cf. Schoen (1970) on the geometric mean in mortality analysis). The Root Mean Squared Error (RMSE) is for the following model fit to the input data: log(M(x))=α+β·year.
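The two summary statistics just described can be illustrated with a short Python sketch. The function name and inputs are hypothetical, and the RMSE is assumed here to be the square root of the mean squared residual on the log scale (dividing by the number of observations); the source does not state which denominator was used.

```python
import numpy as np

def summarize_rate_series(years, rates):
    """Geometric mean of a death-rate series and RMSE of a log-linear trend.

    years, rates: 1-D sequences of equal length; rates must be positive.
    """
    years = np.asarray(years, dtype=float)
    log_m = np.log(np.asarray(rates, dtype=float))

    # The mean on the log scale, transformed back, is the geometric mean.
    geometric_mean = np.exp(log_m.mean())

    # OLS fit of log(M(x)) = alpha + beta * year.
    beta, alpha = np.polyfit(years, log_m, 1)
    residuals = log_m - (alpha + beta * years)
    rmse = np.sqrt(np.mean(residuals ** 2))   # assumed definition, see lead-in
    return geometric_mean, alpha, beta, rmse
```

To mirror the training data, the `years` argument would skip 1918, for example `[y for y in range(1900, 1930) if y != 1918]`.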

Table 3 For the all-years data

Table 4 For the balanced panel data

Fig. 7 Mortality rates by age, 1900–1940, nonwhites. Solid lines are for the death registration area (1900–1932) and USA (1933–1940); dotted lines are for the death registration states of 1910 (1910–1940). Darker shading denotes input data to Lee-Carter model (up to 1929), but note that 1918 is excluded from the input data

Appendix II: Lee-Carter methodology

Here we provide greater detail on how the Lee-Carter projections were constructed. The complete IDL code and input data are available in the online SI.

The general description of the method is in the main body of the paper; what follows is a more step-by-step description of how we implemented the Lee-Carter models used in this paper:

  1. Create a matrix, M, of logged age-specific death rates over time. The columns are the years and the rows are the ages. The first row is the logged death rate for infants, over time; the second row is the logged death rate for age 1–4, over time; and so on.

    In our application, the first year is 1900 and the last year is 1929. Thus,

    $$\mathbf{M} = \left(\begin{array}{cccc} \log(M(0)^{1900}) & \log(M(0)^{1901}) & \cdots& \log(M(0)^{1929})\\ \log(M(\text{1--4})^{1900}) & \log(M(\text{1--4})^{1901}) & \cdots& \log(M(\text{1--4})^{1929})\\ \vdots& \vdots& \ddots& \vdots\\ \log(M(\geq85)^{1900}) & \log(M(\geq85)^{1901}) & \cdots& \log(M(\geq85)^{1929}) \end{array}\right) $$

    in which \(M(x)^{YYYY}\) is the age-specific death rate at age group x in year YYYY.

    This matrix contains the training data. In our case, 1918 was removed for reasons discussed in the main text; the 1917 column is followed by the 1919 column.

  2.

    Calculate the Lee-Carter a values. As discussed in Lee (2000), in place of the row means (cf. Lee and Carter, 1992), use the final year of the training data (1929, in this example). Create a matrix of the same dimensions as M, in which every column is a copy of the final column of M. Thus,

    $$\mathbf{a} = \left(\begin{array}{cccc} \log(M(0)^{1929}) & \log(M(0)^{1929}) & \cdots& \log(M(0)^{1929})\\ \log(M(\text{1--4})^{1929}) & \log(M(\text{1--4})^{1929}) & \cdots& \log(M(\text{1--4})^{1929})\\ \vdots& \vdots& \ddots& \vdots\\ \log(M(\geq85)^{1929}) & \log(M(\geq85)^{1929}) & \cdots& \log(M(\geq85)^{1929}) \end{array}\right) $$
  3.

    Create a new matrix, Q = M − a, i.e., subtract the final-year column from every column of M.

  4.

    Calculate the singular value decomposition (SVD; Golub and Van Loan, 1996, pp. 69–75) of Q. The decomposition is applied to the transpose of Q, which has m = {number of years} rows and n = {number of age groups} columns. The SVD generates W, an n-element vector of singular values; U, an m×n matrix; and V, an n×n matrix. Let T equal the sum of the elements of the first column of V.

  5.

    The Lee-Carter b values are the first column of V, normalized to sum to 1.0: b = V(:,1)/T.

  6.

    The Lee-Carter k values are the first column of U, multiplied by T, multiplied by the first singular value: k = U(:,1) · T · W(1). The k values are an m-element vector (i.e., one for each year).

  7.

    Some summary statistics of k, as estimated from the training data, will govern the behavior of the future (i.e., counterfactual) projection of the k parameter, which is the kernel of the Lee-Carter projection method.

  8.

    The drift parameter is estimated as the slope of k over time. In the equation k=α+βX, in which X is the vector of years (1900, 1901,...), the drift parameter is \(\hat {\beta }\), from OLS. The drift parameter is race- and sex-specific.

  9.

    The random deviations of the drift parameter are linked between the sexes. As discussed in the main body of the text, an “up” year for males tends to be an “up” year for females as well, and likewise for “down” years. To accomplish this, the random deviations used in the projection are drawn from a multivariate normal distribution rather than as independent univariate deviates. Generating multivariate pseudorandom deviates during the projection requires the variance-covariance (VCV) matrix of the male and female k steps from the training data (see, e.g., Selvin, 1998, p. 190). Linearly detrend the fitted k values by sex, then take first differences. The trend will be “baked into the cake,” so to speak, of the projection, so the deviations of interest here are those of the detrended series. Calculate the VCV matrix of these male and female k differences. This procedure is done separately for each racial group: the interdependencies modeled are between males and females and do not span races.

  10.

    The Lee-Carter projection is a Brownian random walk with drift, of the k parameter, using the drift parameter from step 8. The projected k values start at zero in the knot year (1929). For each year of the projection, they are adjusted by both the drift parameter from step 8 and by randomness. The randomness is modeled as a pseudorandom bivariate normal deviate. Most software has built-in multivariate normal routines; use the VCV from step 9 as input. If such a routine is not available, multivariate normal deviates can be generated from the Cholesky decomposition of the VCV matrix (see Selvin, 1998). The random deviation from the annual drift is normally distributed with the same variance as the series of estimated k values from step 6. It also has the same male:female covariance.

    At the end of each model iteration, we have a set of projected k values (for each sex), for the desired number of years in the “future.” Store these in memory.

  11.

    Repeat step 10 one million times. This is done separately by race. The sexes are linked as described in steps 9 and 10, but each race × sex series has its own collection of one million sequences of projected k values.

  12.

    For all years of each of the counterfactual projections (i.e., by race × sex), compute the 2.5 and 97.5 percentiles. These k values will determine the projection fans that are visualized in the paper.

  13.

    Next, convert the k percentiles as calculated in step 12 to life expectancy values. Do not do this for all one million results per sex and race; just use the values determined in step 12.

  14.

    First, calculate predicted M(x) values:

    $$\mathbf{M}_{i,p} = \mathbf{a} + \mathbf{b}k_{i,p} $$

    in which \(\mathbf{M}_{i,p}\) is the vector of logged M(x) values for projected year i at percentile p (i.e., 2.5 or 97.5), a is any single column of the matrix a discussed in step 2, b is as described in step 5, and \(k_{i,p}\) is as described in step 12. Note that \(\mathbf{M}_{i,p}\) is not the same as the input matrix M in step 1, and this a is a subset of the one in step 2.

  15.

    Exponentiate the predicted values from step 14 and convert the resulting M(x) values into life expectancies using the typical demographic formulae (see, e.g., Preston et al., 2001; Wachter, 2014).

  16.

    This completes the Lee-Carter projection procedure.
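Steps 1–12 above can be sketched in a few lines of NumPy. The following is a minimal illustration rather than the IDL implementation referred to in this appendix: the input rates are synthetic, and the function names (`fit_lee_carter`, `detrended_steps`, `project_k`), the five age groups, the 10,000 simulations (the paper uses one million), and the treatment of the 1917 to 1919 gap as a single step are all assumptions made for the example.

```python
import numpy as np

def fit_lee_carter(log_rates):
    """Steps 1-6. log_rates: (ages x years) matrix of logged death rates."""
    a = log_rates[:, -1]                   # step 2: final training year, not row means
    Q = log_rates - a[:, None]             # step 3: Q = M - a
    U, W, Vt = np.linalg.svd(Q.T, full_matrices=False)  # step 4: SVD of Q transposed
    v1 = Vt[0]                             # first column of V
    T = v1.sum()
    b = v1 / T                             # step 5: b sums to 1
    k = U[:, 0] * T * W[0]                 # step 6: one k per training year
    return a, b, k

def detrended_steps(years, k):
    """Steps 8-9: OLS drift of k, and first differences of the detrended k."""
    t = years - years.mean()               # centering does not change the slope
    slope, intercept = np.polyfit(t, k, 1)
    resid = k - (intercept + slope * t)
    return slope, np.diff(resid)           # assumption: 1917 -> 1919 counts as one step

def project_k(drifts, vcv, horizon, n_sims, rng):
    """Steps 10-12: random walk with drift from k = 0; 2.5 and 97.5 percentiles."""
    shocks = rng.multivariate_normal(np.zeros(2), vcv, size=(n_sims, horizon))
    paths = np.cumsum(shocks + np.asarray(drifts), axis=1)  # (n_sims, horizon, 2)
    return np.percentile(paths, [2.5, 97.5], axis=0)        # (2, horizon, 2)

# Synthetic training data (NOT the paper's data): 5 age groups, 1900-1929 without 1918
rng = np.random.default_rng(0)
years = np.array([y for y in range(1900, 1930) if y != 1918], dtype=float)
base = np.log(np.array([0.12, 0.015, 0.004, 0.012, 0.15]))  # hypothetical 1929 levels
decline = np.array([0.030, 0.035, 0.020, 0.010, 0.005])     # hypothetical annual declines
common = rng.normal(0, 0.03, size=years.size)  # shared shocks, so the sexes co-move

def synth(offset):
    noise = rng.normal(0, 0.01, size=(base.size, years.size))
    return base[:, None] + offset - decline[:, None] * (years - 1929) + common + noise

a_m, b_m, k_m = fit_lee_carter(synth(0.10))    # "males"
a_f, b_f, k_f = fit_lee_carter(synth(0.00))    # "females"
drift_m, steps_m = detrended_steps(years, k_m)
drift_f, steps_f = detrended_steps(years, k_f)
vcv = np.cov(np.vstack([steps_m, steps_f]))    # step 9: 2 x 2 male/female VCV
fan = project_k([drift_m, drift_f], vcv, horizon=11, n_sims=10_000, rng=rng)

# Step 14: predicted logged death rates, males, last projected year, 2.5th percentile of k
log_mx_low = a_m + b_m * fan[0, -1, 0]
```

Because a is taken from the final training year, the last column of Q is zero and the fitted k is zero in 1929 (up to rounding), which is why the projected paths can start at zero in the knot year. The product of b and k does not depend on the sign convention the SVD routine happens to return, since T changes sign together with the first columns of U and V.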

Appendix III: Balanced panel graphs

The graphs in this appendix replicate the main analysis of the paper, using the death registration states of 1910 (see footnote 8). The death registration area (DRA) of the USA changed considerably in the period 1900–1932, so some of the year-to-year variation is a compositional artifact. This can affect the Lee-Carter model, and the results in this appendix control for this. The solid lines with the shaded projection fan are the death registration states of 1910. Everything is done just as described in the main text, but the data come from a balanced panel of states, which controls for the changing composition. Unfortunately, this also changes the length of the training data, which can affect the variance used in the Lee-Carter projection, so it is not a perfect control. To assess whether having ten fewer observations (viz., missing 1900–1909) affects the Lee-Carter projection fan, we also analyzed the variable-composition DRA data, but starting in 1910. The dashed lines with the hatched projection fan are for the same data as in the main text (i.e., the DRA), but subsetted to 1910 onward.

Fig. 8

Same as Fig. 2 (in the main body of the paper), but for death registration states of 1910, only. The year-to-year variance in this figure is not influenced by changes in the composition of the death registration area (unlike Fig. 2). The dashed lines show data for all the death registration states, i.e., the same data and projection as in the main text, except starting in 1910. This is for comparative purposes as well as to control for the effect of using a shorter training period

Fig. 9

Same as Fig. 8, but for whites only

Fig. 10

Same as Fig. 8, but for nonwhites only

In short, while the graphs in the main text use all available data (1900–1929), the time series and projection fans in this appendix provide a comparison to a balanced panel, with two fans per graph to additionally control for the length of the estimation period being shorter than that in the main text. An alternate approach would be to use the death registration states of 1900 (see footnote 9) as the balanced panel, thus removing the question of different length time series. However, these states are not a good control because they are even more unrepresentative, being skewed heavily to the northeast. There is no comparable data set on the death registration states of 1920. Table 5 shows the width of the projection fans in 1940, for all the graphs. For the total population, and for whites, there is not a major impact of switching to a start date of 1910 or of using the balanced panel. For nonwhites, the differences are more substantial, in keeping with the changes associated with the Great Migration, discussed in the main text.

Table 5 Width of projection fan in 1940 (in years of life expectancy), for all eighteen population × sex × panel combinations


  1. This is only possible for all races, and only in 1933 and thereafter, per HMD data availability. For 1933–1939, the average difference (across all years and both sexes) between our calculations and those of the HMD is 0.045 years of life expectancy (maximum difference 0.093); this is negligible. For 1940, the average difference (across both sexes) is 0.145 years of life expectancy (maximum difference 0.153). The reason for the bigger (but still small) difference is a discrepancy of almost half a million between the population (exposures) used by Hornseth and Stanback (1954) (which was used to calculate our source of rates, U.S. Department of Health (1956)), and that of the HMD. The “custom” HMD population estimates (Andreeva and Barbieri, 2017, p. 14) imply population shrinkage of 211,000 between the census enumeration (1 April 1940) and mid-year 1940, which seems implausible.

  2. For the balanced panel (Appendix III), the nonwhite peak for females occurs at the end of the training data (1929), not 1922 (Table 1).

  3. Life expectancy for all races is always in between that for whites and nonwhites. However, the changing racial composition of the death registration area during this period accounts for the peculiar aspect that Δe(0) for all races is not sandwiched between that for whites and nonwhites. If the proportion nonwhite were the same at the start- and end-points, the Δe(0) would also be sandwiched. However, the proportion nonwhite changed drastically in this time period, due principally to changes in which states were in the death registration area, and this throws off the comparison. For instance, in 1900 for males, 2.1% of the registration population was nonwhite compared to 9.8% in 1929 (cf. Linder and Grove, 1943, table VIII). The 1929 death registration states included Nevada and New Mexico for the first time, and excluded only Texas (added in 1933) and Alaska and Hawai’i (not yet states) (Hetzel, 1997).

  4. Moreover, the Great Migration involved blacks moving from areas with worse data quality to areas with better data quality, which can also introduce a bias (Arthi et al. 2017).

  5. There has recently been a renaissance in Great Migration studies: cf. Eichenlaub et al. (2010), Collins and Wanamaker (2014), Black et al. (2015), Gutmann et al. (2016), Alexander et al. (2017), and Boustan (2017), among others.

  6. Here, we mean an uncertainty interval as regards continuation, or not, of prior trend (like our Lee-Carter projection interval), as opposed to statistical uncertainty of life expectancy point estimates. For the latter, see Wilson (1938), Chiang (1984), Brillinger (1986), and Lo et al. (2016).

  7. There is a debate about time scale (Brenner, 1979a,b, 1981) and whole- versus sub-populations (Sullivan and von Wachter, 2009; Noelke and Beckfield, 2014). See also Miller et al. (2009), Stevens et al. (2015), Cutler et al. (2016), and Seeman et al. (2018). The debate between Tapia Granados (2005a, b) and McKee and Suhrcke (2005) and Brenner (2005) is likewise relevant.

  8. These are California, Colorado, Connecticut, District of Columbia, Indiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Montana, New Hampshire, New Jersey, New York, Ohio, Pennsylvania, Rhode Island, Utah, Vermont, Washington, Wisconsin (Hetzel, 1997, p. 59).

  9. These are Connecticut, Delaware, District of Columbia, Indiana, Maine, Massachusetts, Michigan, New Hampshire, New Jersey, New York, Rhode Island, Vermont (op. cit. fn. 8).




We thank Raiza Balancio for research assistance. We thank the anonymous referees, as well as George Alter, Shari Eli, Ryan D. Edwards, Price Fishback, Beth Jarosz, Ron Lee, David Rehkopf, and Stew Tolnay for suggestions and feedback. Especial thanks to Leah Boustan for constructive criticism. A version of this work was presented at the 2016 PAA annual meeting in Washington, DC.

Author information




Designed study: AN. Performed data analysis: AMI, TTN, AN. Provided key feedback and refinement to analysis: TAB. Wrote manuscript first draft: AMI, AN. Edited/revised manuscript: TAB, AMI, AN. Proofread and approved manuscript: TAB, AMI, TTN, AN. Data and code available at: All authors read and approved the final manuscript.

Corresponding author

Correspondence to Andrew Noymer.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Bruckner, T.A., Ima, A.M., Nguyen, T.T. et al. Race and life expectancy in the USA in the Great Depression. Genus 75, 16 (2019).



  • Life expectancy
  • Lee-Carter model
  • Great Depression
  • mortality cycles