A two-parameter hazard function to describe age patterns of mortality in ancient Northwestern Europe

When seeking to describe the age patterns of mortality for ancient populations, researchers are often confronted with small sample sizes or with missing data for several age groups. The traditional approach to dealing with these challenges is to smooth or complement such patterns by matching them to a model life table, either directly or through the Brass logit transformation. This procedure requires an appropriate model life table, which may not be available. We propose a hazard model that is both flexible enough to accurately describe an age pattern of mortality in ancient Northwestern Europe and restrictive enough to complement incomplete data. This paper presents a hazard function that contains four free-to-choose parameters. Tested against a large collection of life tables for northwestern European countries from the 17th to the 21st century, the number of free-to-choose parameters is stepwise reduced from four to only two. Compared with the Brass logit transformation with the Princeton Model West as its reference, the presented two-parameter hazard model is shown to fit the abovementioned dataset much better. The mean fitting error is found to be half the size. Moreover, this model is shown to fit a 13th-century mortality age pattern much better. The proposed two-parameter hazard model is capable of fitting a wide range of age patterns of mortality more closely than the traditional approach can. We therefore conclude that the proposed model facilitates the smoothing and the completion of age patterns of mortality in ancient Northwestern Europe even if they deviate substantially from well-documented patterns.


Introduction
Ever since Gompertz formulated his law of mortality in the 1820s, new mathematical hazard functions have been suggested to adequately describe age patterns of mortality through forces of mortality (Thiele and Sprague 1871;Perks 1932;Siler 1983;Thatcher 1999;Turner and Hanley 2010: p.488), mortality probabilities (Heligman and Pollard 1980;De Beer and Janssen 2016) or survivorship (Wong and Tsui 2015).
The underlying concept of a hazard rate, which was originally known as the intensity, or force, of mortality, plays a vital role in various scientific disciplines and has been applied in fields as diverse as demography, actuarial science, epidemiology, biology and engineering (Weibull 1951;Siler 1979;Hoem 1983;Kizilersü et al. 2018). Unfortunately, in order to adequately capture the hazard rates due to juvenility, senescence, the so-called 'accident hump' and maternal mortality, these descriptions often require sophisticated mathematical functions containing many parameters (Heligman and Pollard 1980;Wong and Tsui 2015;De Beer and Janssen 2016). If a mortality pattern, based on a huge amount of data, is known precisely and completely, it can provide sufficient information to assign appropriate values to all of these parameters. If, however, a reconstructed mortality pattern is based on limited data, it might be quite irregular or only partly known. In such a case, the application of a sophisticated model may not work well. Due to over-parameterization, it might pick up patterns that do not reflect reality, but are instead suggested by irregularities or gaps (Gage and Dyke 1986: 281-283). Thus, when the data are limited, a simpler and much more restrictive model is needed to ensure reliable parameter fitting. Such a model would not only be able to correct for irregularities, it would also be able to fill in the gaps. Therefore, this kind of model is essential when describing mortality patterns of ancient populations, like those from the Early Modern Period or the Middle Ages.
An example of such a pattern is the age pattern of mortality that Russell (1948) was able to reconstruct for England's elite around 1300. Since information on all age groups below age 15 is missing, a restrictive model is necessary to complement Russell's pattern.
Often, demographers and historians turn to the Princeton Model life tables to complete a mortality pattern (Hollingsworth and Hollingsworth 1971: 143; Poos and Smith 1984: 141-142; Loschky and Childers 1993: 87-88, 92-94; Wrigley et al. 1997: 282; Daponte et al. 1997: 1262-1263; Benedictow 2004: 349-350). This renowned set of model life tables consists of four families, of which the Princeton Model West is the most widely used. Each of these four families has 25 levels, from the lowest life expectancy at level 1 to the highest at level 25. Most of these levels are constructed on the basis of a large number of life tables from the late 19th and the 20th century. However, the levels that are relevant for ancient populations are constructed by extrapolation only, since they lie outside the range of the empirical data used to construct these four families (Coale and Demeny 1966; Wrigley et al. 1997: 263; Preston et al. 2001: 196-197).
A well-established alternative to the direct use of model life tables is the Brass logit life table system, in which a life table is related to a standard model life table through a logit transformation (Brass 1971;Arriaga 1994: 111-117;Preston et al. 2001: 199-201;Wheldon et al. 2013: 102-103). Although highly valued in many cases, this method is not very successful when a mortality pattern differs substantially from the standard to which it is related (Coale et al. 1983: 10;Ferguson et al. 2003: 167-168). Since there are no suitable standards available for ancient populations, the Brass logit method may not be adequate to complete ancient age patterns of mortality. While a generalisation of this approach by Ewbank et al. (1983) substantially improves the fit, it also introduces two additional parameters, which is problematic in cases in which the mortality pattern is incomplete or highly irregular.
This study aims to formulate a simple, restrictive, mathematical function describing mortality age patterns that contains no more than two parameters. This function can be used in cases in which only a small amount of data is provided, and no suitable model life table is available to smooth an irregular pattern and to fill in existing gaps. Since the amount of ancient mortality data that is available is often too limited to enable researchers to differentiate between men and women, this function is designed to capture the mortality age pattern for both sexes combined.
Our study starts off by presenting a four-parameter model that consists of three separate terms, each of which has already been introduced in the mortality and survival literature. Subsequently, the number of free-to-choose parameters is reduced from four to two by formulating restrictions on the other two. This is done by testing the model against data from the Human Mortality Database and data on mortality in England over 17 decades between 1640 and 1810. We describe the procedure used for parameter fitting. Next, the four-parameter model is compared with the well-tested Siler model, and the restriction process is described. The two-parameter model's capacity to complete incomplete mortality age patterns is then validated, and the performance of the restricted two-parameter model is compared with that of the abovementioned models and the Brass logit method using the Princeton Model West as the reference standard. As life tables from the Human Mortality Database might be valid alternatives to the Princeton Model life tables, the oldest HMD life table is also used as the reference standard.

A four-parameter mortality model
Hazard rates, denoted as μₓ, are limiting cases of age-specific mortality rates, which are the proportions of persons between age x and age x + n dying yearly. Since forces of mortality from different causes are additive (Preston et al. 2001: 78), several actuaries (Thiele and Sprague 1871: 314; Heligman and Pollard 1980: 49-50; Hoem 1983: 216) and biologists (Siler 1979: 751; Siler 1983: 374-375) have suggested that hazard rates might be composed of competing risks, with each dominating a distinctive age group. Hence, most parametric mortality models that describe the full age pattern of mortality consist of separate additive terms (De Beer and Janssen 2016).
The Siler model is a well-known model that consists of an exponential term, which dominates in juvenility before the age of 10; a constant introduced by Makeham (1860), which dominates in young adulthood; and the Gompertz term, which is dominant during senescence, from age 30 onwards (Siler 1979, 1983):
[Equation (1)]
The Siler model contains five parameters: α and ω shape the juvenility term, θ shapes the Makeham term, and β and φ shape senescence. This well-established model contains only a limited number of parameters, and can be fitted reasonably well to a wide range of mortality patterns (Siler 1983; Gage and Dyke 1986; Gage 1988: 429-430; Engelman et al. 2014: 1367-1372). We will use it as the reference model to which the proposed model's performance is compared.
However, the exact form of each of these three terms is disputed. Kannisto suggested that a logistic form would describe old-age hazards much better than Gompertz's exponential form (Kannisto 1994; Thatcher et al. 1998; Thatcher 1999; De Beer and Janssen 2016: 3, 5; De Beer et al. 2017). Moreover, the constant Makeham term does not describe the 'accident hump' found in young adulthood (Thiele and Sprague 1871: 316; Heligman and Pollard 1980: 49-50). Quite recently, De Beer and Janssen (2016: 2-5) have introduced a logistic function to describe the mortality probability for adolescents and young adults. This function includes the excess mortality in early adulthood while also accounting for background mortality and is controlled by just a single parameter (De Beer and Janssen 2016: 4-5). Furthermore, instead of an exponential term, juvenility hazard rates can also be shaped as a power law (Weibull 1951: 293; Kizilersü et al. 2018: 10-11; De Beer and Janssen 2016: 5).
Taking Weibull's power form (Weibull 1951) for juvenility, the term that contains the logistic form for adolescents and young adults introduced by De Beer and Janssen (2016), and the logistic term for senescence proposed by Kannisto (1994), the model that we present, containing four parameters, is specified by
[Equation (2)]
Henceforth, these four parameters will be called juvenile mortality shape ω; accident hump level θ; rate-of-ageing β, i.e., the rate at which mortality increases with age; and ageing inflection point φ, i.e., the age at which this logistic curve changes from being convex to being concave (Thatcher 1999: 21).
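To illustrate the roles of β and φ in the senescence term, the following minimal sketch assumes a common reparameterisation of Kannisto's logistic hazard, exp(β(x − φ)) / (1 + exp(β(x − φ))). This form is an assumption, chosen because its inflection point lies at age φ, where it equals 0.5; it may differ in detail from the term used in the model. The function name and the example values are hypothetical, and the juvenile and young-adult terms are not reproduced.

```python
import math

def senescence_hazard(age, beta, phi):
    """Logistic hazard with rate-of-ageing beta and inflection age phi (assumed form)."""
    z = beta * (age - phi)
    return 1.0 / (1.0 + math.exp(-z))  # identical to exp(z) / (1 + exp(z))

# Hypothetical values: beta = 0.1 per year, inflection point at age 100.
for age in (60, 80, 100, 110):
    print(age, round(senescence_hazard(age, beta=0.1, phi=100.0), 4))
```

Below age φ the curve is convex and rises almost exponentially, which is why it resembles the Gompertz term over most of adult life; above φ it flattens.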
In this study, this model will not be tested in its hazard form since some of the data stemming from historical studies lack death rates, and instead, only mortality probabilities are provided (Russell 1948: 92-117; Wrigley, Davies, Oeppen and Schofield, 1997: 224, 239, 250, 290). This is hardly surprising, given that these data have been gathered by following people throughout their lives. Furthermore, mortality probabilities are often used for the calculation of life expectancy, which is highly relevant to economic and social historical research (De Beer and Janssen 2016: 6). Since it is our aim to develop a model for ancient populations, we will therefore formulate our model in terms of mortality probability ₙqₓ, and test it by finding the best fit to patterns of mortality probabilities, instead of death rates.
It is rather straightforward to formulate our hazard model (equation 2) in terms of the survivorship lₓ or mortality probability ₙqₓ. Indeed, the cumulative hazard Λₓ, the integral under the hazard curve, of equation (2) is given by
[Equation (3a)]
in which constant C guarantees that Λ₀ = 0, thus:
[Equation (3b)]
From equation (3a), survivorship lₓ and mortality probability ₙqₓ between age x and age x + n can simply be calculated by using the well-known demographic relationships (Gage 1988: 431; Preston et al. 2001: 42, 60; Kizilersü et al. 2018: 11):
[Equations (4) and (5)]
After the substitution of equations (3) and (4) in equation (5), our model is formulated in terms of mortality probabilities. We will test it in this form against data from the Human Mortality Database and data on mortality in England over 17 decades between 1640 and 1810.
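As a hedged illustration of the demographic relationships referred to above, the sketch below converts a cumulative hazard into survivorship and mortality probabilities, assuming the standard forms l(x) = exp(−Λ(x)) with a radix of 1, and q = 1 − l(x + n) / l(x) for the interval from x to x + n. The cumulative hazard used here is a Gompertz stand-in with hypothetical parameter values, not the model of equation (2); all function names are illustrative.

```python
import math

def gompertz_cumulative_hazard(x, a=0.0001, b=0.1):
    """Stand-in cumulative hazard: integral of a*exp(b*t) from 0 to x (not the paper's model)."""
    return (a / b) * (math.exp(b * x) - 1.0)

def survivorship(x, cum_hazard):
    """l_x with radix 1: the probability of surviving from birth to age x."""
    return math.exp(-cum_hazard(x))

def mortality_probability(x, n, cum_hazard):
    """nqx: the probability of dying between ages x and x + n, given survival to age x."""
    return 1.0 - survivorship(x + n, cum_hazard) / survivorship(x, cum_hazard)

# Abridged age groups 0, 1-4, 5-9, ..., 95-99 as (start age, width) pairs.
age_groups = [(0, 1), (1, 4)] + [(x, 5) for x in range(5, 100, 5)]
for x, n in age_groups[:5]:
    print(x, n, round(mortality_probability(x, n, gompertz_cumulative_hazard), 6))
```

Because any cumulative hazard function can be passed in, the same two helpers would apply unchanged to a different hazard model, provided its cumulative hazard is zero at age 0.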

Datasets
The dataset that is used throughout our study is from the Human Mortality Database (2017,2017,2018). This database provides annual life tables that go back well into the 19th century for Norway, Denmark, England & Wales, the Netherlands, Belgium, and France; and even as far back as 1751 for Sweden (Table 1). Although this rich database provides different types of life tables, in the present study, we use only life tables in which data for the two sexes are combined, since our aim is to formulate guidelines in cases in which the amount of data might be too limited to differentiate between the sexes, as often occurs in historical data.
Although the age patterns of mortality of ancient populations can hardly be expected to resemble current northwestern European patterns, we include in our dataset life tables from the late 20th century, and even from the 21st century. We did so because in order to apply our model to ancient populations, it must be extrapolated beyond the range of our datasets. Therefore, the longer the time span on which this model can be based, the more confident we can be that such an extrapolation is appropriate. Using period life tables for separate years, the numbers of life tables for these seven countries sum up to a total of 1320. Henceforth, this set of 1320 tables will be referred to as the HMD dataset.
Although the database provides life tables with more detail, we used life table information for age groups 0, 1-4, 5-9, 10-14, etc., up to the group aged 110 and above, again because this study focuses on cases with limited data. Furthermore, we used only the mortality probabilities from the life table, because, as we noted above, most studies on ancient populations provide only mortality probabilities.
Although the English data against which the model is tested are cohort data, we selected period life tables from the Human Mortality Database instead of cohort life tables. Given that in ancient populations the mortality age patterns were rather static (see for instance Loschky & Childers, 1993), we do not expect to observe big differences between period and cohort mortality data representing ancient populations. By contrast, the cohort data from the Human Mortality Database reflect fast-changing societal circumstances. Within one cohort, high infant mortality in, for example, 1850 is accompanied by low mortality at the oldest ages in 1950. Therefore, we regard the period data from the Human Mortality Database, which reflect the same societal circumstances at every age, as better suited to the (semi-)static circumstances of ancient populations.
Besides the Human Mortality Database, a second source was used to test our model against. Wrigley, Davies, Oeppen and Schofield (1997) painstakingly reconstructed families from information in parish registers throughout England and were able to calculate age-specific mortality probabilities for Early Modern England (EME). In their highly valued study, they calculated mortality probabilities for both sexes for age groups below age 15 (0, 1-4, 5-9, 10-14) and above age 25 (25-29, 30-34, …, 80-84) for 17 decades between 1640 and 1810 (Wrigley et al. 1997: 224, 239, 250, 290). These mortality probabilities contain valuable information, especially on the juvenile mortality term, as will be seen later on. This dataset of 17 tables will be referred to as the EME dataset. Please note that these data mention mortality probabilities only, and not death rates. Since data on mortality between ages 15 and 25 were missing, these mortality age patterns are incomplete.
As an application of the new model to an even older, and truly ancient, mortality pattern, Russell's (1948: 92-117) age-specific mortality pattern for the English elite born in the 13th century before 1276 was used as a test case (see Table 2). Russell reconstructed his data from the so-called Inquisitions Post Mortem, which are documents drawn up by officials at the deaths of vassals of the king of England. Besides providing an inventory of each vassal's properties, they frequently mentioned the vassal's date of death and the age of his heir(s). By combining two Inquisitions for each lord, one in which he was named heir and the other mentioning his date of death, Russell was able to reconstruct the age at death for 532 lords. By counting the heirs who had entered lordship and subtracting those who had died below certain ages (15, 20, ..., 90), he calculated the number of persons alive at those ages, which he then used to calculate the mortality probabilities of age groups (15-19 up to 90-94). In line with Hollingsworth (1975: 156), we have adjusted Russell's mortality probabilities to take into account the possibility that a person who entered lordship in some age group might have died within that age group as well. Note that since data on mortality for ages below age 15 are missing, this mortality age pattern is incomplete.
Although Russell reconstructed mortality patterns for 25-year cohorts for later periods, we did not use them because they were highly influenced by ensuing plague waves. Furthermore, Poos, Oeppen and Smith (2012) have shown that Russell's data for the cohorts 1301-1325, 1326-1348 and 1348-1377 are biased, so we regard these cohorts as unsuitable for testing the new model. Therefore, only Russell's cohort of the English elite born before 1276 will be used. These data will be referred to as the Russell dataset.

Procedure for parameter fitting
To find the best fit to a given mortality pattern, we employ the Nelder-Mead general minimization procedure (Nelder & Mead, 1965). In doing so, we took 0.5 to be the edge length of the original simplex. In line with Dennis and Woods (1987: 120), the procedure ended when the distance between the simplex's centre and each of its vertices was less than 0.00005.
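The search can be sketched in a few lines of plain Python. This is a simplified Nelder-Mead variant (reflection, expansion, inside contraction and shrink only), not necessarily the exact implementation used here. It assumes an axis-aligned initial simplex in which each new vertex lies 0.5 from the starting point, and it stops when every vertex is within 0.00005 of the simplex centre. The function name, the iteration cap and the test function are hypothetical.

```python
import math

def nelder_mead(f, x0, edge=0.5, tol=0.00005, max_iter=5000):
    """Minimise f from starting point x0 with a basic Nelder-Mead search."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced by `edge` along each axis
    # (an assumption; only the edge length is stated in the text).
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += edge
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        centre = [sum(v[i] for v in simplex) / (n + 1) for i in range(n)]
        # Stop when every vertex lies within `tol` of the simplex centre.
        if all(math.dist(v, centre) < tol for v in simplex):
            break
        best, worst = simplex[0], simplex[-1]
        # Centroid of all vertices except the worst one.
        c = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            """Point on the line through the centroid and the worst vertex."""
            return [c[i] + t * (worst[i] - c[i]) for i in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [[best[i] + 0.5 * (v[i] - best[i]) for i in range(n)]
                           for v in simplex]
    return min(simplex, key=f)

# Hypothetical smooth test function with its minimum at (1, -2).
print(nelder_mead(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.0, 0.0]))
```

In an application, f would be the loss function evaluated for a vector of model parameters, and x0 a plausible starting guess for those parameters.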
The fitting process, of course, requires a suitable loss function. Log-likelihood, which is often used in fitting mortality patterns, cannot serve as the loss function, since it makes use of numbers of deaths, which are not provided by the EME dataset (Thatcher, 1999: 29; Engelman, Caswell & Agree, 2014: 1394). Instead, the root mean squared error (Siler 1979: 753; Gage 1988: 430; Bongaarts 2005: 25; Saikia & Bora 2013: 106) is used here in a somewhat adapted form to prevent small age groups, like those for the elderly above age 100, from having as much influence on the results as much larger age groups. With this aim in mind, and analogous to Gage (1988: 432-434), who used weights to reduce the effects of defective data points, we weight age groups according to the age distribution in the created life table for a stationary population. To this end, numbers of person-years lived between age x and x + n, denoted as ₙLₓ, serve as weights to calculate the weighted root mean square error (WRMSE). Furthermore, to make the comparison of errors of different mortality patterns possible, it is divided by the variance; thus, MSE(q)/var(q) (De Beer & Janssen, 2016: 6):
[Equation (6)]
where all summations are from ages 0 to 110, while excluding age groups for which the data are missing in the empirical life table; ₙq̃ₓ and ₙqₓ are the mortality probabilities between age x and x + n in the constructed and empirical life table, respectively; and T*₀ = Σ ₙLₓ, with ₙLₓ representing the number of person-years in the constructed life table. If no age groups are missing in the empirical life table, T*₀ represents the number of person-years lived above age 0. However, for the EME dataset, for which the age groups 15-19 and 20-24 are missing, T*₀ = ₁L₀ + ₄L₁ + ₅L₅ + ₅L₁₀ + ₅L₂₅ + ₅L₃₀ + … + ₅L₈₀.
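The loss function can be sketched as follows. This is an assumed reading rather than a reproduction of equation (6): the squared errors and the variance of the empirical mortality probabilities are both weighted by person-years divided by T*₀, and the square root is taken of their ratio. The person-years are approximated from the survivorship curve with composite Simpson's rule on two subintervals, as described in the next paragraph. All function names and example values are hypothetical, and missing age groups are marked with None.

```python
import math

def person_years(l, x, n, subintervals=2):
    """Approximate nLx, the area under l between x and x + n, by composite Simpson's rule."""
    h = n / subintervals  # subintervals must be an even number
    total = l(x) + l(x + n)
    for k in range(1, subintervals):
        total += (4 if k % 2 else 2) * l(x + k * h)
    return total * h / 3.0

def wrmse(age_groups, q_model, q_observed, l_model):
    """Weighted, variance-normalised root mean square error (assumed form of the loss)."""
    rows = [(person_years(l_model, x, n), qm, qo)
            for (x, n), qm, qo in zip(age_groups, q_model, q_observed)
            if qo is not None]  # skip age groups missing from the empirical table
    t0 = sum(w for w, _, _ in rows)  # person-years summed over the observed groups
    mean_q = sum(w * qo for w, _, qo in rows) / t0
    mse = sum(w * (qm - qo) ** 2 for w, qm, qo in rows) / t0
    var = sum(w * (qo - mean_q) ** 2 for w, _, qo in rows) / t0
    return math.sqrt(mse / var)

# Hypothetical example: exponential survivorship, three age groups, one of them missing.
l = lambda x: math.exp(-0.02 * x)
groups = [(0, 1), (1, 4), (5, 5)]
print(wrmse(groups, [0.02, 0.08, 0.10], [0.03, None, 0.09], l))
```

Dividing by the variance makes the error dimensionless, so that values for life tables with very different mortality levels can be compared.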
Graphically, ₙLₓ is represented by the area below the curve of lₓ on age interval [x, x + n]. The calculation of ₙLₓ usually requires the estimation of ₙaₓ, the mean number of person-years lived in the interval by those dying in the interval (Preston et al., 2001: 42-47). However, since equations (3) and (4) describe the curve of lₓ exactly, the area below this curve can be approximated as accurately as required with the Repeated Simpson's Rule (Griffiths & Smith 1991: 177). We will use its two-subintervals version:
[Equation (7)]

Fit of the four-parameter model
To assess how well the four-parameter model (equation 2) performs compared to the Siler five-parameter model (equation 1), both models are fitted to each of the 1320 life tables of the HMD dataset and to the tables of the EME dataset. The results are given in Table 3, which reports the weighted root mean square error (WRMSE) (see equation 6). For the 17th and 18th centuries the four-parameter model performs better than the Siler model (Table 3). Compared with these Early Modern results, both models show closer fits (= smaller errors) in the 19th, 20th, and early 21st centuries. The Siler model shows better results for recent years, while the four-parameter model proposed here performs better for the first half of the 20th century. Keeping in mind the extra parameter that is contained in the Siler model, we can conclude that our four-parameter model performs quite well.

From a four-parameter model to a two-parameter model
The results of the previous section show that the mean error for our model with four parameters is about the same as the mean error for the Siler model with five parameters. When considering age patterns of mortality based on small sample sizes or with missing data on several age groups, a more restrictive model containing fewer variables may be required. In this section, the number of free-to-choose parameters of the four-parameter model (equation 2) will be reduced stepwise by formulating restrictions on two of its parameters, resulting in the two-parameter model.

Reduction from four to three parameters
The first parameter to be considered is φ, which is the age at which the senescence logistic term (4) has its point of inflection, or in other words, the age at which this logistic curve changes from being convex to being concave. At this age, the logistic curve reaches 0.5. Thatcher (1999: 20-21) has already shown that φ was about age 99 for six historical mortality patterns. As Table 4 indicates, this parameter shows only minor variations during the past four centuries. Despite the major demographic changes that have taken place during this long period, the point of inflection has remained largely the same. This suggests that this value for φ is applicable to all periods under study. Furthermore, since the seven countries all have about the same value for φ in Table 4, it seems appropriate to expect this value to be valid for the whole of Western Europe. Taking the age of inflection φ to be 100, in line with the average value for both the HMD and the EME datasets, the four-parameter model (equation 2) can be reduced to a three-parameter model:
[Equation (8)]
[Table 3: Goodness of fit when fitting the EME and the HMD datasets to the four-parameter model and the Siler model. Source data include Wrigley et al. (1997: 224, 239, 250, 290)]

Reduction from three to two parameters
To search for an option to reduce the number of free parameters even further, this three-parameter model was fitted to the 1320 life tables of the HMD dataset and the EME dataset. The overall mean WRMSE is 0.7283, which is clearly above that for the Siler model and for our four-parameter model. Figure 1 presents the results for the juvenile mortality shape parameter ω for the 1320 life tables of the HMD database. As Figure 1 shows, this parameter seems to increase from the mid-19th century onwards, which corresponds to declining juvenile hazard rates from that time onwards. These results are hardly surprising. However, before the mid-19th century, ω seems to remain constant at about 2.5.
Figure 1 therefore suggests that there is a baseline value for juvenile mortality, which is especially relevant for ancient populations, since it clearly links juvenile mortality to this particular baseline pattern. Unfortunately, most of the life tables before 1840 are from Sweden only, so the evidence that such a baseline value exists is somewhat limited.
Fortunately, however, the Early Modern England dataset, which we described in the previous section, provides data that can be used to investigate this pre-1850s period in a second country. Although age groups 15-19 and 20-24 are missing, these data can be fitted to the three-parameter model. The results, presented in Fig. 2, show that the value for ω is about 2.8 in England during the whole Early Modern Period. Although this value for England deviates somewhat from the value for Sweden, the difference between the two values is too small to conclude that England and Sweden had different baseline values. Until we have new evidence that proves otherwise, it seems plausible to take the mean value for both countries, i.e., 2.7, as the baseline value for Northwestern Europe.
The 1320 life tables of the HMD dataset, represented by points in Fig. 1, can be divided into two groups, each with its own distinctive rule. Of these life tables, 284 seem to correspond to the baseline value of 2.7, as they have ω at or near 2.7. For the remaining 1036 HMD life tables, juvenile mortality lies below this baseline value, since for these life tables, ω takes values above 2.7. To reduce our three-parameter model to a two-parameter model by the elimination of ω as a free-to-choose parameter for these 1036 cases, linear regression analysis is used. The line of regression of ω on the two remaining parameters β and θ of the three-parameter model (equation 8) is:
ω = 119.3·β − 1.01·θ − 14.5. (9)
This rule will be combined with our notion of a baseline value at ω = 2.7 in a two-step procedure. First, the best fitting values of β and θ for the three-parameter model are used to predict the value of ω by equation (9). Second, the outcome is set to 2.7 when equation (9) predicts a value below this baseline, which is the case for 376 life tables:
if ω < 2.7 then take ω = 2.7. (10)
Since we aim to eliminate ω as a free-to-choose parameter, it is important to evaluate how well this two-step procedure is able to predict ω correctly. For each of the 1320 life tables from the HMD dataset, the combination of these two steps predicts a value for ω that can be compared to the best-fitting value of ω for the three-parameter model. As can be seen from Fig. 3, for most tables, the two values of ω are fairly similar (R² = 0.95). We therefore apply equations (9) and (10) to further simplify our model.
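The two-step procedure amounts to a regression prediction with a floor at the baseline value. The sketch below uses the coefficients of rule (11.a) of the two-parameter model; the function and constant names are illustrative, and the parameter values in the example are hypothetical, chosen within the ranges of β and θ reported for the HMD dataset.

```python
BASELINE_OMEGA = 2.7  # baseline juvenile mortality shape for Northwestern Europe

def predict_omega(beta, theta):
    """Two-step rule: regression prediction of omega, floored at the baseline value."""
    omega = 119.3 * beta - 1.01 * theta - 14.5  # step 1: line of regression
    return max(omega, BASELINE_OMEGA)           # step 2: never below the baseline

# Hypothetical parameter pairs.
print(predict_omega(beta=0.10, theta=-6.0))    # regression value lies above the baseline
print(predict_omega(beta=0.078, theta=-4.4))   # regression value lies below it, so 2.7 is used
```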

Our two-parameter mortality model
After starting off with a four-parameter model (equation 2), we were able to fix one of its parameters, φ, at 100, thereby reducing the model to a three-parameter model. The combination of equations (9) and (10) is a fair option for reducing the number of free-to-choose parameters even further with only a limited loss of accuracy, thereby reducing the three-parameter model to a two-parameter model consisting of three rules:
ω = 119.3·β − 1.01·θ − 14.5; (11.a)
if ω < 2.7 then take ω = 2.7; (11.b)
[Equation (11.c)]
Parameter ω that shapes the power form (Weibull 1951) for juvenility is no longer a free-to-choose parameter. Instead, it is determined by the two parameters that shape the two other terms of equation (11.c). The two remaining parameters are θ, which controls the accident hump level of the term for adolescents and young adults; and β, which specifies the rate-of-ageing in the logistic term for senescence (Thatcher 1999).
[Fig. 3: Best fit (determined by equation (8)) of juvenile mortality shape ω compared with its predicted value (calculated by equations (9) and (10)) in the three-parameter model (n = 1320). Source data: Human Mortality Database (2017,2017,2018)]

Fit and application of the two-parameter model
To test the validity of the proposed two-parameter model (equation 11) with respect to its capacity to credibly complete age patterns of mortality, 11 life tables have been selected from the HMD dataset. After transforming them into incomplete age patterns by removing four consecutive age groups, the two-parameter model has been fitted to these patterns. The results are given in Table 5.
Compared with the range of β within the HMD dataset, which is 0.078-0.182, the deviation between the incomplete and the corresponding complete age pattern is more than 3% of this range for only two out of 66 incomplete patterns. The range of θ is from −8.4 to −4.4. The fit of this parameter is much more sensitive to incompleteness. For 17 out of 66 incomplete patterns, the deviation is more than 3% of this range. Eight of these patterns, four of which are for incomplete life tables of Sweden for 2014, deviate by even more than 5% of this range. Almost all deviations beyond 3% are found when the complete life table has a value below −7 for parameter θ, which corresponds to low mortality risks for young adults. Thus, for post-World War II mortality age patterns, our two-parameter model seems less suitable for completing incomplete patterns.
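To make these thresholds concrete: 3% of the range of β is 0.03 × (0.182 − 0.078) ≈ 0.0031, while 3% and 5% of the range of θ are 0.12 and 0.20, respectively. A minimal sketch of the comparison follows; the function name and the fitted values are hypothetical, and only the two parameter ranges are taken from the text.

```python
def share_of_range(fitted_incomplete, fitted_complete, low, high):
    """Absolute deviation between two fitted values as a fraction of the parameter's range."""
    return abs(fitted_incomplete - fitted_complete) / (high - low)

BETA_RANGE = (0.078, 0.182)   # range of beta within the HMD dataset
THETA_RANGE = (-8.4, -4.4)    # range of theta within the HMD dataset

# Hypothetical fitted values for one complete life table and its incomplete version.
print(share_of_range(0.104, 0.100, *BETA_RANGE) > 0.03)    # is the beta deviation beyond 3%?
print(share_of_range(-7.30, -7.20, *THETA_RANGE) > 0.03)   # is the theta deviation beyond 3%?
```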
With respect to the two-parameter model (equation 11), the mean error WRMSE for the whole HMD dataset is 0.088 (Table 6). To give an impression of such an error, Fig. 4 shows a HMD life table from the 19th century that is fitted with such an error. As the figure shows, the model follows the age pattern of mortality quite closely.
Since this model contains no more than two independent, free-to-choose parameters, the model is restrictive enough to cope with incomplete life tables and large irregularities without being very sensitive to accidental distortions. The trade-off is that this model cannot be fitted as well as the four-parameter model can, given that the errors are nearly twice as large, as Table 6 shows. When applying the 2-parameter model (11) to the 1320 life tables from the HMD dataset, the goodness of fit is about the same for all countries except Sweden and France. For Sweden, the larger error is due to the fact that Sweden is the only country with life tables going back well into the 18th century, when mortality crises occurred more frequently and were more severe. For France, the larger mean error is mainly caused by large distortions in the mortality pattern during World War I. Clearly, our model is unsuited to describing abnormal mortality patterns caused by epidemics or war.
As expected, we find that for the HMD dataset, the two-parameter model performs worse than the Siler model, which contains as many as five parameters. The differences are largest for the 21st century. Remarkably, however, we observe that for the Early Modern dataset, both models can be fitted with about the same level of accuracy. Note, however, that neither the four-parameter model nor the Siler model can be applied to the Russell dataset, since the mortality age pattern is missing below age 15.
Table 6 also shows the results of the comparison of our two-parameter model with the current well-established Brass logit life table system for dealing with irregular and fragmented data (Arriaga, 1994: 112-117). As we noted in the introduction, the Brass logit transformation involves relating a mortality age pattern to a reference standard after having applied a logit transformation to both age patterns. Since this transformation transforms logistic functions into straight lines, mortality age patterns are expected to transform into approximately linear functions, the relationship of which can be specified by two parameters. Therefore, this procedure is a two-parameter method if the reference standard is chosen beforehand. Comparison of our two-parameter model with the Brass two-parameter procedure gives us insight into how well our model performs.
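The Brass transformation can be illustrated with a short sketch. It uses the usual definition of the Brass logit of survivorship, Y = ½·ln((1 − l) / l), and a linear relation between the logit of the pattern of interest and that of the standard. The parameter names level and slope are chosen here to avoid confusion with the model parameters of this paper, and the standard survivorship values are hypothetical rather than taken from any Princeton table.

```python
import math

def brass_logit(l):
    """Brass logit of survivorship l (radix 1, 0 < l < 1)."""
    return 0.5 * math.log((1.0 - l) / l)

def brass_survivorship(l_standard, level, slope):
    """Survivorship implied by a linear relation between the two logits."""
    y = level + slope * brass_logit(l_standard)
    return 1.0 / (1.0 + math.exp(2.0 * y))  # inverse of the Brass logit

# Hypothetical standard survivorship at ages 1, 5, 20, 40, 60 and 80.
standard = [0.80, 0.70, 0.65, 0.55, 0.35, 0.08]
# A positive level raises mortality at all ages; the slope shifts the balance
# between mortality at young and at old ages.
print([round(brass_survivorship(l, level=0.2, slope=1.1), 3) for l in standard])
```

With level 0 and slope 1 the standard itself is returned, which shows why the quality of the fit depends so strongly on the choice of reference life table.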
The choice of the standard model life table, which we henceforth refer to as the reference life table, logically affects the outcomes of the method. Since our datasets contain a wide range of mortality patterns from pre-industrial to 21st-century western and northern Europe, we have chosen the most commonly used model life table, which is the Princeton Model West, as the reference. West level 12 is found to produce the smallest mean errors if the weighted root mean square error is used as the measure of the goodness of fit for the mortality probability.
The results for this level are given in Table 6. Figure 5 shows the fit to Russell's data on 13th-century mortality by both our two-parameter model and the Brass method. Note that neither the Siler model nor the four-parameter model can be fitted to this pattern because all of the necessary information on juvenile mortality is missing. Thus, only a comparison with the Brass method is feasible. Our two-parameter model appears to smooth irregular patterns and fill in gaps in a way that seems quite plausible. This model closely follows this 13th-century mortality pattern until age 80. The deviation of this model from Russell's figures for age groups 85-89 and 90-94 need not worry us, given that these figures were based on no more than four deaths and one death, respectively. The Brass method with Princeton Model West level 12 as its reference table predicts much higher mortality probabilities for the young and the old than might be expected. It predicts that the chance of a newborn dying within a year was as high as 0.33. Such a figure is well above the 0.14-0.20 range in the EME dataset (Wrigley et al., 1997: 224, 239, 250, 290). Even the earliest decade in this dataset shows a mortality probability for newborns of 0.15, which is less than half the value predicted by the Brass method with Princeton Model West level 12 as its reference table. For the elderly, the predicted chances of dying evidently deviate from Russell's figures from age 60 onwards, as Fig. 5 clearly shows. Overall, our two-parameter model displays a much closer fit than the Brass two-parameter method, which is reflected by its error being only half as large.
Fig. 5 Fit of the Russell dataset; i.e., the incomplete 13th-century mortality pattern (n = 532). Notes: Age groups 0, 1-4, 5-9, 10-14, 15-19, …, 95+. Source data: Russell (1948: 180-1; 202), but corrected for the possibility that persons who entered an age group might have died in that same age group as well (Hollingsworth 1975: 156). Best fit to the two-parameter model for β = 0.064; θ = −6.3; WRMSE = 0.20. Best fit to the Brass logit model with reference to Princeton Model West level 12 (chosen because it produces the smallest mean error for the HMD dataset and the EME dataset): a = −0.87; b = 1.37; WRMSE = 0.41. Best fit to the Brass logit model with reference to Princeton Model West level 24 (chosen because it produces the smallest error for this particular mortality pattern): a = −0.74; b = 0.72; WRMSE = 0.33.
If the choice of the reference table is no longer limited to level 12, thereby adding a third parameter to the Brass method, it is level 24 that has the smallest fitting error in this particular case. Since level 24 reflects late 20th-century mortality patterns, this standard completes Russell's pattern with an extremely low mortality probability of 0.037 for age group 1-4, which does not seem plausible for the 13th century. Indeed, in the 17th and 18th centuries, this figure lay within the 0.09-0.13 range, and it did not decline to a value below 0.04 in England until after World War I (Wrigley et al., 1997: 224, 239, 250, 290; Human Mortality Database, 2017). It appears that in this case, the Brass model with a free-to-choose reference level is too flexible to select an appropriate level to serve as the reference table and complement the Russell dataset convincingly. This fit of Princeton Model West level 24 to a 13th-century mortality pattern illustrates that the Brass method with a free-to-choose reference level does not automatically lead to the most appropriate reference table.
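How a free-to-choose reference level acts as an extra parameter can be sketched in Python as follows. The sketch fits the Brass relationship to each candidate reference table and keeps the one with the smallest weighted root mean square error. Everything here is an assumption made for illustration: the function names, the two candidate survivorship curves (which are not Princeton Model West levels), the equal weights, the least-squares fit on the logits, and the computation of the error on absolute differences in death probabilities. The WRMSE definition and weights used for Table 6 and Fig. 5 should be substituted where they differ.

```python
import math

def logit(l):
    """Brass logit of a survivorship value l, with 0 < l < 1 (assumed convention)."""
    return 0.5 * math.log((1.0 - l) / l)

def brass_fit_predict(l_obs, l_std):
    """Fit Y_obs = a + b * Y_std by least squares; return fitted survivorship."""
    ys = [logit(l) for l in l_std]
    yo = [logit(l) for l in l_obs]
    ms, mo = sum(ys) / len(ys), sum(yo) / len(yo)
    b = (sum((s - ms) * (o - mo) for s, o in zip(ys, yo))
         / sum((s - ms) ** 2 for s in ys))
    a = mo - b * ms
    return [1.0 / (1.0 + math.exp(2.0 * (a + b * s))) for s in ys]

def death_probs(l):
    """Probability of dying between successive ages, from survivorship values."""
    return [1.0 - l[i + 1] / l[i] for i in range(len(l) - 1)]

def wrmse(q_obs, q_fit, weights):
    """Weighted root mean square error (weights are an assumption here)."""
    num = sum(w * (o - f) ** 2 for w, o, f in zip(weights, q_obs, q_fit))
    return math.sqrt(num / sum(weights))

# Two hypothetical candidate reference tables (illustrative values only).
references = {
    "level_A": [0.80, 0.70, 0.66, 0.60, 0.50, 0.35, 0.15],
    "level_B": [0.95, 0.93, 0.92, 0.88, 0.80, 0.60, 0.25],
}

# Synthetic observed pattern: an exact Brass transform of level_B.
l_obs = [1.0 / (1.0 + math.exp(2.0 * (0.3 + 0.9 * logit(l))))
         for l in references["level_B"]]
weights = [1.0] * (len(l_obs) - 1)  # equal weights, purely for illustration

# Fit each candidate and record its error; the level is the third parameter.
errors = {
    name: wrmse(death_probs(l_obs),
                death_probs(brass_fit_predict(l_obs, std)),
                weights)
    for name, std in references.items()
}
best = min(errors, key=errors.get)
```

The selection only looks at the ages for which data exist. A candidate can therefore win on fitting error while still implying implausible values at ages that are missing from the data, which is the situation described above for level 24.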

Discussion
We set out to create a model that is capable of describing ancient age patterns of mortality. This model should be flexible enough to accurately describe a wide range of mortality patterns and restrictive enough to cope with the problems resulting from limited data (gaps, capriciousness). By choosing three components (mortality due to juvenility, mortality for adolescents and young adults, and mortality due to ageing), a four-parameter model was created that could describe age patterns of mortality as accurately as the Siler model.
Through a stepwise reduction of the number of free parameters, one parameter related to ageing was fixed to a constant value, and a second parameter related to juvenility was expressed in terms of the remaining two free parameters. As a result, the four-parameter model was reduced to a two-parameter model, thereby reducing its flexibility and making it more restrictive.
The two-parameter model (equation 11) has been found to be a simple, restrictive hazard model that is suitable for use in cases of limited data to fit mortality on a country-wide scale within north-western Europe from the 19th century onwards. With respect to completing incomplete mortality age patterns, the model has been found to be a valid tool, especially for patterns before World War II. If an appropriate model life table was available to serve as the reference standard, the Brass logit transformation could be fitted with at least the same accuracy as our two-parameter model. If, however, the age pattern of mortality differed substantially from the reference standard used, the two-parameter model introduced in this paper could be fitted much better than the Brass logit method. For the dataset used in this study, the mean fitting error for the new model was found to be 0.088, which was half that for the Brass logit method (with Princeton Model West level 12 as its reference standard, chosen because it produced the smallest mean fitting error to the HMD dataset and the EME dataset).
Since the two-parameter model we introduced has been shown to fit quite accurately a broad range of age patterns of mortality (i.e., from the patterns of 17th-century England, to 18th-century Sweden and England, to 21st-century western Europe), it may lead to acceptable results for earlier centuries as well. Our application of the model to a capricious 13th-century pattern, for which information on infant and child mortality was completely missing, has shown that this model produces a plausible age pattern of mortality. It should, however, be noted that for ages above 70, the new model seems to overestimate mortality risks somewhat. Thus, our model is suitable for use in cases of limited data. If, however, sufficient data are available to apply a more sophisticated model, the two-parameter model should be put aside, since it does not perform as well as, for instance, our four-parameter model (equation 2) or the Siler model (equation 1).
Furthermore, the two-parameter model cannot adequately describe abnormal mortality patterns caused by epidemics or war.
Well-established alternatives to our use of two parameters to describe ancient age patterns of mortality include the direct selection of a matching model life table and the Brass logit method, which uses a model life table for reference. Both alternatives require an appropriate model life table, which may not be available for an ancient population. Moreover, we demonstrated that when the Brass logit method was applied to the case of England's 13th-century elite, the age pattern of mortality deviated too much from the Princeton Model West life tables. Thus, this method was not able to produce an acceptable description. The two-parameter model proposed in this study might help in constructing model life tables for ancient populations.
Our model is designed to be applied to mortality age patterns for both sexes combined, since it is intended for use in cases in which there are not enough data to differentiate between the sexes. For cases in which the distinctions between men and women are meaningful, sex-specific two-parameter models are preferable. To obtain such models, our four-parameter model might be a good starting point for a stepwise reduction process, since it is probably suitable for describing both male and female patterns separately.
In evaluating our model, we used three datasets covering Northwestern Europe. The question of whether our model might work well for regions beyond Northwestern Europe cannot be answered based on the present study. Unfortunately, outside of Northwestern Europe, life tables for the 19th century are hardly available yet, and the few that exist are all from countries that directly border the seven countries already included in this study. An important finding of this study that we used to reduce the number of free parameters from four to two was the existence of a baseline value for juvenile mortality, which was reflected in the juvenile mortality shape parameter ω being 2.7. However, the evidence presented in this study is insufficient to conclude whether this value can be assumed to hold beyond Northwestern Europe. Instead, it seems quite possible that for a region that has a different climate and different bacteria and viruses, the baseline value for juvenile mortality will be found at a different value for the mortality shape parameter ω.
Furthermore, we want to point out that our model was tested against data on countrywide mortality only. These data were dominated by rural mortality, since the urban population formed only a small minority. Highly urbanised areas might therefore display patterns that deviate substantially from the two-parameter model developed in this study.
Thus, although further research is clearly needed to test its scope, our model seems capable of fitting a wide range of historical age patterns of mortality more closely than the traditional model life table approach can and could therefore be used to compile a broader evidence base for historical mortality research.
Abbreviations
WRMSE: Weighted root mean square error; HMD: Human Mortality Database; EME: Early Modern England; SD: Standard deviation