The effects of community clustering on under-five mortality in India: a parametric shared frailty modelling approach

The study of the effect of community clustering of under-five mortality has implications for both research and policy. Studies have shown the contribution of community factors to under-five mortality. However, these studies did not account for censoring. We examine the presence of community dependencies and determine the risk factors of under-five mortality in India and its six state-regions by employing a Weibull hazard model with gamma shared frailty. We took care to ensure that the frailty models used in the study represent a substantive assumption about the source of the frailty rather than merely reflecting how the data are organized. Data from the fourth round of the National Family Health Survey were used. The study found that, except for south India, children born in the same community in India and the other five state-regions shared similar characteristics of under-five mortality. The risk of under-five mortality decreased with an increase in mother’s schooling. Except in the northern region, female births were less likely to die within the first five years of life. We found a U-shaped relationship between preceding birth interval and under-five mortality. History of sibling’s death, multiple births and low birthweight significantly increase the risk of under-five mortality in all six state-regions. The Hindu–Muslim mortality gap and the mortality disadvantage of Scheduled Castes and Tribes are diminishing. Since the factors associated with under-five mortality were not necessarily the same across the six state-regions of India, adopting a uniform approach in dealing with under-five mortality in India may not benefit all the regions equally.

long way to go to reduce under-five mortality and achieve Goal 3.2 of the Sustainable Development Goals: reducing under-five mortality to 25 per 1000 live births (Assembly, 2017). Under-five mortality is also a major concern in India. Although India has cut down its under-five mortality rate by half, from 69 deaths per 1000 live births in 2008 to 36 deaths per 1000 live births in 2018 (SRS, 2013, 2018), it could not achieve Goal 4 of the Millennium Development Goals (International Institute for Population Sciences (IIPS) and ICF, 2017). Despite this significant decrease in the under-five mortality rate, one in every 28 children dies within five years of birth at the national level. In urban India, the ratio is one in every 39 children, and in rural India, it is one in every 25 children. Likewise, the under-five mortality rate varies considerably across the different states of India, with the central Indian states bearing a considerably high burden of under-five mortality compared to the south Indian states (SRS, 2018).
One plausible factor that prevented India from achieving the Millennium Development Goals may be the presence of regional and socioeconomic inequalities in mortality within the country. Dyson and Moore (1983) introduced the north-south divide in Indian demography based on fertility and mortality. The north had high birth and death rates while the south had comparatively lower birth and death rates. A considerable variation in the under-five mortality rate across the Indian districts was also reported by Bora and Saikia (2018). They suggest a particular focus on two northern states of India, namely Chhattisgarh and Uttar Pradesh, in dealing with child mortality. A recent study by Bora (2020) suggests prioritizing region-specific interventions such as economic, maternal education, and infrastructural development policies in the country's north, central, and northeast regions. Studies have also revealed inadequate access to healthcare services, low female literacy and poor household socioeconomic status as important factors that affect the health status of children in India (Po & Subramanian, 2011; Ram et al., 2013; Singh et al., 2011). It is, therefore, necessary to identify the mortality scenario among the marginalized or more impoverished strata of the population, communities and geographic regions in order to reduce under-five mortality rates to a desirable level (Antai, 2011). The high prevalence of under-five mortality could also be attributed to unobserved differences between or within communities. There is substantial evidence that the health outcomes of children are influenced by both individual and community-level characteristics in LMICs (Adedini et al., 2015; Deribew et al., 2007; Kayode et al., 2012; Kumar et al., 2012; Rudan et al., 2010; You et al., 2015). Unfortunately, these studies only account for the information on the event's occurrence or non-occurrence (Alotaibi et al., 2020; Niragire et al., 2011).
There have been several attempts to study the impact of regional or community characteristics on risk factors of under-five mortality in India (Arulampalam & Bhalotra, 2006; Bora, 2020; Gupta et al., 2016; Singh et al., 2011). These studies showed the significant contribution of community or regional factors affecting under-five mortality. However, these studies did not consider the information on time-to-event or time-to-death of the child and did not account for censoring.
Time-to-event data differ from other types of data in that they record both whether the event occurred and when it occurred. Standard regression models account only for whether the event has occurred. Because they ignore the time of occurrence, they cannot account for censoring. Censoring occurs when study participants do not experience the event during the study period. The Cox proportional hazard model is the most popular model to address censoring (Cox, 1972). However, the Cox model fails to provide unbiased estimates when the observations are dependent. Dependency among observations results in underestimated standard errors and, in the case of non-linear models such as Cox proportional hazard models, estimated parameters that are both biased and inconsistent (Trussell & Rodriguez, 1990). A new approach to studying child survival and under-five mortality, based on frailty models, has been adopted recently (Alotaibi et al., 2020; Niragire et al., 2011). The frailty model can assess and account for cluster-level variations in under-five mortality using information on the time-to-death of the child. These models are a generalization of multivariable survival regression models that can account for latent factors affecting the model estimates (Duchateau & Janssen, 2007; Gutierrez, 2002). The model is generally used when individuals or groups of individuals share unobserved frailty. It can be considered a random-effects version of the survival regression model. It therefore has the added advantage of estimating similarity in group characteristics while using time-to-event information.
The model's ability to identify the impact of community characteristics on the estimates of under-five mortality with time-to-event information will facilitate the re-evaluation of current policies and programmes targeted at reducing under-five mortality in the country (Alotaibi et al., 2020).
The study of the effect of community clustering of under-five mortality has implications for both research and policy. It will help determine the risk factors of mortality while accounting for correlated observations of children from the same community. Having identified the gaps in the existing literature, our study aims to determine whether there are dependencies between individuals within the same community, that is, to check whether children within a community share similar frailty. In addition, our study aims to determine the risk factors of under-five mortality adjusted for the unobserved community effects in India and its six state-regions. We employed a parametric shared frailty modelling approach using data from the most recent round of the National Family Health Survey, conducted in India in 2015-2016, for which unit-level data are in the public domain.

Data
Data from the National Family Health Survey, 2015-2016 (NFHS-4), have been used for the study. NFHS-4 is a large-scale sample survey conducted across all the 36 states and union territories of India. NFHS-4 used a two-stage stratified sampling design to collect information from the study respondents. In the first stage, a total of 28,586 Primary Sampling Units (PSUs) were selected with probability proportional to the PSU size. Twenty-two households were selected with systematic sampling from the selected PSUs in the second stage. Overall, 699,686 women in the age group 15 to 49, residing in 601,509 surveyed households were interviewed, with a response rate of 96.7%. See IIPS & ICF (2017) for detailed information on sampling procedures and data collection of NFHS-4.

Analytical sample
The study included 233,763 children born to women in the age group 15 to 49 years in the 5 years preceding the survey. Of these, 10,604 children died within five years of birth. Stillbirths, abortions, and miscarriages were not included in the study.

Outcome variable
The outcome of interest is the time-to-death of children before their fifth birthday. Children who died within five years of birth were considered to have had the event and were coded as 1, while children who survived until 59 months were censored and coded as 0. In NFHS-4, the information on child survival was collected retrospectively by interviewing the mother. The "age-at-death" of the child obtained from the mother was used as the time-to-death in the study.
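The outcome coding described above can be illustrated with a minimal Python sketch. The function name `code_outcome`, its arguments, and the rule that a child still alive at interview is censored at the current age (capped at 59 months) are assumptions made for illustration; they are not taken from the NFHS-4 recode files or from the authors' STATA code.

```python
from typing import Optional, Tuple

MAX_MONTH = 59  # last month of age counted as "under five"


def code_outcome(age_at_death: Optional[int],
                 current_age: Optional[int]) -> Tuple[int, int]:
    """Return (time_in_months, event) for one child.

    age_at_death: age at death in months, or None if the child is alive.
    current_age:  age at interview in months, or None if the child has died.
    event is 1 for a death before the fifth birthday and 0 for a censored record.
    """
    if age_at_death is not None and age_at_death <= MAX_MONTH:
        return age_at_death, 1
    # Alive at interview, or died after 59 months: treated as censored (assumption).
    if current_age is None:
        return MAX_MONTH, 0
    return min(current_age, MAX_MONTH), 0


print(code_outcome(7, None))   # death at 7 months
print(code_outcome(None, 30))  # alive, 30 months old at interview
print(code_outcome(None, 72))  # alive, older than 59 months at interview
```

Under these assumptions, every child contributes a pair (time, event), which is the input format expected by survival models.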

Explanatory variables
Explanatory variables included in the study are broadly classified as maternal and child characteristics. The mother-related explanatory variables are: mother's age at first birth (less than 18 years, 19 to 24 years, 25 to 29 years and 30 years and above), mother's schooling (no schooling, primary, secondary and higher), mother's marital status (currently married and not currently married), wealth quintiles (poorest, poorer, middle, richer and richest), religion (Hindu, Muslim and others), and caste (scheduled caste (SC), scheduled tribe (ST) and others). The child-related explanatory variables are: type of birth (singleton birth and multiple births), sex of the child (male and female), birth order (1, 2, 3-4 and 5 and above), birthsize (less than average, average and above-average), previous birth interval (first-order birth, less than 12 months, 12 to 23 months, 24 to 35 months, 36 to 47 months, 48 to 59 months, 60 to 85 months and 86 months and above), mode of delivery (non-caesarean and caesarean), assistance during delivery (skilled personnel and unskilled personnel) and history of sibling's death (yes and no).
During the survey, information on the child's size at birth was collected by interviewing the mothers as very large, larger than average, average, smaller than average, and very small (IIPS & ICF, 2017). For simplicity in analysis and interpretation, we categorized child's size at birth into three categories: less than average, average, and above average. Skilled personnel assisting the delivery includes doctors, nurses, auxiliary nurse midwives, midwives, and lady health visitors. Wealth quintiles are already estimated and given in the dataset.

Analytical procedure
A parametric shared frailty modelling approach was adopted in the study. Frailty can be considered as an unobserved random factor that affects the hazard function of an individual or a group of individuals (Wienke, 2014). The concept of frailty can be traced to Greenwood and Yule's (1920) work on "accident proneness". However, Vaupel et al. (1979) first introduced the term frailty in a univariate survival model. The extension of the model to measure correlation in survival data in Clayton (1978) and Clayton and Cuzick (1985) laid the foundation for considerable development in this area. This method has recently been adopted to study child survival in LMICs (Alotaibi et al., 2020; Niragire et al., 2011).
Importantly, frailty cannot be directly estimated from the data; rather, it is assumed to follow a distribution with mean 1 and variance $\theta$. An individual whose frailty is less than 1 has a lower-than-average hazard, and an individual whose frailty is greater than 1 has a higher-than-average hazard (Clayton & Cuzick, 1985; Gutierrez, 2002; Hougaard, 1986; Vaupel et al., 1979). Two or more individuals may share the same frailty value, and sharing a frailty value induces a dependency between those individuals. In our study, frailty is shared at the community level. A community is defined by the combination of Indian state or union territory and urban or rural place of residence. This gave 64 communities covering all 36 states and union territories of India. We took care to ensure that the shared frailty models used in the study represent a substantive assumption about the source of the frailty rather than merely reflecting how the data are organized (Gutierrez, 2002). We therefore fitted two sets of models. The first set assumed that there was no community variation. The second set assumed that there was community variation, which was accounted for by assigning a frailty distribution. Each set was fitted with three hazard distributions, namely, exponential, Weibull and Gompertz. Thus, six models were specified for the study. The models were compared using the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC). The model having the smallest value of BIC or AIC was considered the best model for the study.
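To make the idea of a shared frailty with mean 1 and variance θ concrete, the following Python sketch simulates gamma frailties (the frailty distribution used in this study) and Weibull survival times for children in one community. It is an illustration only: the function names, the seed, and all parameter values (θ = 0.5, p = 0.8, a linear predictor of −3) are hypothetical and are not estimates from the study.

```python
import math
import random


def draw_frailty(theta, rng):
    """Gamma frailty with mean 1 and variance theta (shape 1/theta, scale theta)."""
    return rng.gammavariate(1.0 / theta, theta)


def draw_weibull_time(u, linear_predictor, p, rng):
    """Inverse-transform draw from the hazard u * exp(linear_predictor) * p * t**(p - 1)."""
    unif = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    rate = u * math.exp(linear_predictor)
    return (-math.log(unif) / rate) ** (1.0 / p)


def simulate_cluster(n_children, theta, p, linear_predictor, rng):
    """All children in one community share the same frailty value u."""
    u = draw_frailty(theta, rng)
    times = [draw_weibull_time(u, linear_predictor, p, rng) for _ in range(n_children)]
    return u, times


rng = random.Random(2016)  # hypothetical seed
frailties = [draw_frailty(0.5, rng) for _ in range(20000)]
mean_u = sum(frailties) / len(frailties)
var_u = sum((u - mean_u) ** 2 for u in frailties) / (len(frailties) - 1)
print(round(mean_u, 2), round(var_u, 2))  # sample mean and variance of the frailties
```

Because every child in a simulated community is generated with the same `u`, survival times within a community are correlated even though they are independent given `u`; this is the dependency the shared frailty model is meant to capture.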

Models without shared frailty
If $T$ is the random time-to-failure (time-to-death of a child before his/her fifth birthday), then the generalized hazard model without shared frailty is defined as
$$h_i(t) = h_0(t)\, e^{\beta X_i}, \tag{1}$$
where $i = 1, 2, \ldots, n$ and $h_0(t)$ is the baseline hazard function. $X_i$ is the vector of covariates belonging to the $i$th individual and $\beta$ is the standard regression parameter. The hazard function is assumed to follow an exponential, Gompertz or Weibull distribution in our study.

Models with shared frailty
Assume the $n$ individuals are divided into $k$ groups and, for the $j$th group, an unobserved frailty parameter $u_j$ is defined, $j = 1, 2, \ldots, k$. Then the generalized shared frailty model with $T$ as the random time-to-failure (time-to-death of a child before his/her fifth birthday) is defined as
$$h_{ij}(t \mid u_j) = u_j\, h_0(t)\, e^{\beta X_{ij}}, \tag{2}$$
where $j = 1, 2, \ldots, k$ and $h_0(t)$ is the baseline hazard function. $X_{ij}$ is the vector of covariates belonging to the $i$th individual in the $j$th group and $\beta$ is the standard regression parameter.
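A short derivation shows why a shared frailty induces dependence within a group. It assumes a gamma frailty with mean 1 and variance $\theta$, as used in this study, and writes $H_{ij}(t)$ for the cumulative hazard of individual $i$ in group $j$ at frailty 1; the symbols $H_{ij}$, $S_j$ and $n_j$ are notation introduced here for illustration.

```latex
% Cumulative hazard at frailty u_j = 1
H_{ij}(t) = e^{\beta X_{ij}} \int_0^t h_0(s)\, ds

% Given u_j, individuals in group j are independent:
S_j(t_{1j}, \ldots, t_{n_j j} \mid u_j)
  = \exp\Big( -u_j \sum_{i=1}^{n_j} H_{ij}(t_{ij}) \Big)

% The gamma frailty (mean 1, variance \theta) has Laplace transform
% E[e^{-u s}] = (1 + \theta s)^{-1/\theta}, so integrating out u_j gives
S_j(t_{1j}, \ldots, t_{n_j j})
  = \Big( 1 + \theta \sum_{i=1}^{n_j} H_{ij}(t_{ij}) \Big)^{-1/\theta}

% Independence is recovered in the limit of no frailty variance:
\lim_{\theta \to 0} S_j(t_{1j}, \ldots, t_{n_j j})
  = \prod_{i=1}^{n_j} \exp\big( -H_{ij}(t_{ij}) \big)
```

The joint survival function factorizes into individual terms only when $\theta \to 0$, which is why a test of $\theta = 0$ serves as a test for the presence of community-level dependence.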
For an exponential hazard distribution, $h_0(t) = 1$ and thus (2) is defined as
$$h_{ij}(t \mid u_j) = u_j\, e^{\beta X_{ij}}.$$
For a Gompertz hazard distribution, $h_0(t) = e^{pt}$ and thus (2) is defined as
$$h_{ij}(t \mid u_j) = u_j\, e^{\beta X_{ij}}\, e^{pt},$$
where $p$ is the shape parameter. For a Weibull hazard distribution, $h_0(t) = p t^{p-1}$ and thus (2) is defined as
$$h_{ij}(t \mid u_j) = u_j\, e^{\beta X_{ij}}\, p t^{p-1},$$
where $p$ is the shape parameter. Both parameters, $\beta$ and $p$, are estimated from the data.
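The three hazard specifications above can be written as a small Python sketch. The function names and the example coefficient values are hypothetical and serve only to show how the frailty, the covariates and the baseline hazard combine multiplicatively.

```python
import math


def baseline_hazard(t, dist, p=1.0):
    """Baseline hazard h0(t) for the three distributions used in the text."""
    if dist == "exponential":
        return 1.0
    if dist == "gompertz":
        return math.exp(p * t)
    if dist == "weibull":
        return p * t ** (p - 1)
    raise ValueError(f"unknown distribution: {dist}")


def conditional_hazard(t, x, beta, u=1.0, dist="weibull", p=1.0):
    """h(t | u) = u * exp(beta . x) * h0(t); u = 1 gives the model without frailty."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return u * math.exp(linear_predictor) * baseline_hazard(t, dist, p)


# Hypothetical coefficients for two binary covariates.
beta = [-0.15, 1.45]
h_ref = conditional_hazard(12.0, [0, 0], beta, u=1.3, dist="weibull", p=0.6)
h_exp = conditional_hazard(12.0, [1, 0], beta, u=1.3, dist="weibull", p=0.6)
print(h_exp / h_ref)  # equals exp(-0.15) up to rounding, whatever t and u are
```

Within a community (same `u`), the ratio of two hazards depends only on the covariates and `beta`, which is why exponentiated coefficients from these models are read as hazard ratios.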

Test for equality of survival curve
Unlike the Cox model, which assumes that the covariates act proportionately on the hazard function, parametric survival models assume a particular distribution whose parameters depend on the covariates (Gutierrez, 2002; Wienke, 2014). Because we adopted a parametric approach that assumes a particular distribution for the hazard function, we used the log-rank test to compare the equality of survival curves across two or more population groups rather than to test for proportionality. The log-rank test tests the hypothesis of no difference between two or more survival distributions at any point in time. The method is best suited for comparing survival distributions from data with censored observations (Bland & Altman, 2004; Harrington, 2014). The survival functions were estimated using the Kaplan-Meier method.
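The Kaplan-Meier estimator and the two-group log-rank test described above can be sketched in plain Python as follows. This is an unweighted textbook version for illustration; it does not apply the survey weights or design adjustments used in the study, and the function names and the toy data are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; events are 1 (death) or 0 (censored)."""
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, ei in zip(times, events) if ei == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def logrank_two_groups(times, events, groups):
    """Two-group log-rank test; groups are coded 0/1. Returns (chi-square, p-value), 1 df."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei == 1}):
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        d1 = sum(1 for ti, ei, g in zip(times, events, groups)
                 if ti == t and ei == 1 and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Toy data: months to death or censoring for six children in two groups.
toy_times = [1, 2, 3, 10, 11, 12]
toy_events = [1, 1, 1, 1, 1, 1]
toy_groups = [1, 1, 1, 0, 0, 0]
print(kaplan_meier(toy_times, toy_events))
print(logrank_two_groups(toy_times, toy_events, toy_groups))
```

A small p-value indicates that the survival curves of the two groups differ; with more than two groups the same observed-minus-expected logic extends to a chi-square test with more degrees of freedom.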
We estimated the under-five mortality rate using the "syncmrates" package in STATA. The package is meant for estimating under-five mortality rates using the synthetic cohort probability method (Masset, 2016;Rutstein & Rojas, 2006). It also provides the 95% confidence interval (CI) for the estimated under-five mortality rates.
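The core arithmetic of the synthetic cohort probability method can be sketched as follows: death probabilities are computed for a sequence of age segments and combined through the product of the segment survival probabilities. The sketch assumes the standard DHS age segments (0, 1-2, 3-5, 6-11, 12-23, 24-35, 36-47 and 48-59 months); the function names and all counts are hypothetical, and the sketch does not reproduce how the "syncmrates" package computes exposure, applies weights, or derives confidence intervals.

```python
AGE_SEGMENTS = ["0", "1-2", "3-5", "6-11", "12-23", "24-35", "36-47", "48-59"]


def component_probability(deaths, exposed):
    """Probability of dying within one age segment among children exposed to it."""
    return deaths / exposed


def under_five_rate(component_probs):
    """Under-five mortality per 1,000 live births: 1000 * (1 - product of (1 - q_i))."""
    survival = 1.0
    for q in component_probs:
        survival *= 1.0 - q
    return 1000.0 * (1.0 - survival)


# Hypothetical deaths and exposed children per age segment.
deaths = [300, 40, 30, 35, 30, 15, 10, 5]
exposed = [10000, 9800, 9700, 9600, 9500, 9400, 9300, 9200]
probs = [component_probability(d, e) for d, e in zip(deaths, exposed)]
u5mr = under_five_rate(probs)
print(dict(zip(AGE_SEGMENTS, probs)))
print(u5mr)
```

Combining segment-specific probabilities in this way lets children who have not yet reached age five at the time of the survey contribute to the segments they have completed.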
First, statistical analysis was conducted at the national level. To understand the regional disparity in risk factors affecting under-five mortality, separate statistical analyses were conducted for the six state-regions of India. See Additional file 1: Appendix S1 for details of states included in the six state-regions. As the association between previous birth interval and under-five mortality risk may be modified by mother's age at first birth, we performed a sensitivity analysis to examine the plausible influence of a mother's age at first birth on the relationship between previous birth intervals and under-five mortality. Additionally, we performed a sensitivity analysis to compare the effects of birthsize and birthweight on under-five mortality. This was done to ensure that the use of birthsize as a proxy of birthweight does not change the results. All analyses presented in the paper are appropriately weighted and adjusted for the complex survey design used in NFHS-4. All statistical analyses were performed in STATA 16.0.
Table 1 shows the percentage distribution of 233,763 births by maternal and child characteristics for India and its six state-regions. Sixty-two percent of births occurred to mothers whose age at first birth was between 19 and 24 years. The percentage of births reported by mothers who had attained no schooling, primary, secondary, and higher was 32%, 15%, 44%, and 9%, respectively. The percentage of births by mothers' schooling differed by state-regions. Ninety-eight percent of births occurred to currently married mothers. The percentage share of births decreased as the household wealth quintile increased. However, there were regional differences in the percentage of births across wealth quintiles. Seventy-two percent of the births belonged to Hindu households. Around 19% and 21% of births occurred in SC and ST households, respectively.
Male births outnumbered female births at the national level and in all six state-regions. The percentage of male births ranged from 51% in the northeast region to 53% in the north region. About 98% of the births were singleton births. A little more than one-third (36%) of the births were first-order births and 31% were second-order births. Eight percent of births were fifth- or higher-order births. The percentage of first-order births ranged between 33% in the central region and 45% in the south region. The percentage of second-order births ranged between 27% in the northeast region and 39% in the south region. The percentage of fifth- or higher-order births ranged between 1% in the south region and 11% in the northeast and central regions. The majority of births were of average birthsize. Less than 15% of births had less than average birthsize. The percentage of births with average birthsize ranged between 61% in the west region and 75% in the north region. One-fifth (20%) of births in our sample had a previous birth interval of 24 to 35 months. The percentage of births with a 24 to 35 months birth interval ranged between 17% in the south region and 22% in the central region. Similarly, the percentage of births with a previous birth interval of less than 12 months ranged between 1% in the west region and 2% in the central region. A large majority of births occurred through a non-caesarean mode of delivery. A little more than three-fourths of the births were assisted by skilled personnel. Eleven percent of births had a history of sibling's death. Five percent of births in the south region had a history of sibling's death, compared with 17% and 13% of births in the central and east regions, respectively.
Table 2 shows the estimated under-five mortality rates by selected child characteristics, such as sex of the child, type of birth, birth order, birthsize, previous birth interval, mode of delivery, assistance during delivery, and history of sibling's death. The estimated under-five mortality rate was lower for females (48.1, 95% CI: 45.8-50.4) than males (53.4, 95% CI: 51.2-55.7). The estimated under-five mortality rates were also lower for females than for males in the east, northeast, west, and south regions. The estimated under-five mortality rate among multiple births was 211.4 (95% CI: 194.1-228.7) deaths per 1,000 live births compared to 48.2 (95% CI: 46.7-49.6) deaths per 1,000 live births among the singleton births. The estimated under-five mortality rates varied considerably by birth order. The estimated under-five mortality rates among births of 1, 2, 3 to 4 and 5 and above birth orders were 49.2 (95% CI: 46.5-52.0), 40.1 (95% CI: 37.9-42.4), 55.9 (95% CI: 52.8-59.1) and 89.5 (95% CI: 83.0-95.9) deaths per 1000 live births, respectively. The estimated under-five mortality rate among babies less than average size at birth was significantly higher than the rates among babies who were of average and above-average size at birth. The estimated under-five mortality rates among births with birthsize less than average, average and above-average were 104.0 (95% CI: 98.6-109.3), 42.0 (95% CI: 40.2-43.7) and 44.6 (95% CI: 41.4-47.8) deaths per 1,000 live births, respectively. An interesting picture emerges when we look at the estimated under-five mortality rates by preceding birth intervals. The estimated under-five mortality rate was highest for birth intervals less than 12 months. The estimated under-five mortality rates declined with an increase in the birth interval until 48-59 months, after which the estimated under-five mortality rates increased. A similar pattern was observed in all the six state-regions. 
The estimated under-five mortality rate was significantly higher among non-caesarean births (54.4, 95% CI: 52.7-56.2), births assisted by unskilled personnel (77.0, 95% CI:

Selecting the best model
The AIC and BIC values for different models are shown in Table 3. While $M_1$, $M_2$, and $M_3$ represent the models without a frailty distribution, $M_4$, $M_5$, and $M_6$ represent the models with gamma shared frailty distribution. A comparison of the AIC and BIC values across the models indicates that models with gamma shared frailty distribution performed better than their counterparts with similar hazard distributions. Notably, model $M_1$ with exponential hazard distribution without frailty performed the worst. Similarly, among the models with shared frailty distributions, $M_4$ performed the worst. Model $M_6$ with Weibull hazard distribution and gamma shared frailty distribution had the smallest AIC and BIC values. As $M_6$ appeared to be the best fit among the six models, subsequent analyses and interpretations are based on models with Weibull hazard distribution and gamma shared frailty distribution. Table 4 shows the hazard ratios of under-five mortality by maternal and child characteristics in India and its six state-regions based on the Weibull survival model with gamma shared frailty.
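The model comparison above rests on two standard formulas, AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximized log-likelihood. The sketch below applies them to made-up values; the model labels, log-likelihoods and parameter counts are hypothetical and are not the figures in Table 3.

```python
import math


def aic(loglik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * loglik


# Hypothetical fits: label -> (maximized log-likelihood, number of parameters).
fits = {
    "weibull_no_frailty": (-1520.0, 12),
    "weibull_gamma_frailty": (-1495.0, 13),
}
n_obs = 5000  # hypothetical number of observations

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores)
print(best_by_aic, best_by_bic)
```

Both criteria reward a higher log-likelihood and penalize extra parameters, with BIC penalizing more heavily in large samples; a frailty model is preferred only if the gain in fit outweighs the cost of the additional variance parameter.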

At the national level
Births to mothers whose age at first birth was less than 18 years were more likely to die before their fifth birthday (HR = 1.08, 95% CI: 1.02-1.14) than births to mothers whose age at first birth was between 25 and 29 years. Compared to births to mothers with no formal schooling, births to mothers with secondary (HR = 0.90, 95% CI: 0.85-0.94) and higher (HR = 0.70, 95% CI: 0.63-0.78) levels of schooling were less likely to die before their fifth birthday. Births to widowed, divorced or separated mothers were 1.16 (95% CI: 1.01-1.34) times as likely as births to currently married mothers to die before the fifth birthday. Compared to births to mothers in the poorest wealth quintile, births to mothers in the middle (HR = 0.89, 95% CI: 0.84-0.95), richer (HR = 0.79, 95% CI: 0.73-0.86) and richest (HR = 0.60, 95% CI: 0.55-0.67) wealth quintiles were less likely to die before five years of age. Births in other caste families were less likely to die before their fifth birthday than those in SC families (HR = 0.93, 95% CI: 0.88-0.98). Female births were less likely to die before their fifth birthday than male births (HR = 0.86, 95% CI: 0.83-0.89). Multiple births were 4.26 (95% CI: 3.93-4.62) times as likely as singleton births to die before the fifth birthday. Births with less than average birthsize were 2.17 times (95% CI: 2.07-2.27) as likely to die before the fifth birthday as births with average birthsize, and births with above-average birthsize were 1.21 times (95% CI: 1.15-1.28) as likely.
Compared to births with previous birth interval 36 to 47 months, births with birth interval less than 12 months (HR = 3.54, 95% CI: 3.14-3.98), 12 to 23 months (HR = 1.77, 95% CI: 1.64-1.92), 24 to 35 months (HR = 1.14, 95% CI: 1.05-1.24), 60 to 85 months (HR = 1.19, 95% CI: 1.07-1.34) and 86 months and above (HR = 1.76, 95% CI: 1.56-2.00) were more likely to die within five years of birth. Caesarean births and births attended by a skilled birth attendant were less likely to die within five years of birth. Births with a history of sibling's death were more likely to die than those with no history of sibling death (HR = 2.35, 95% CI: 2.24-2.47).
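Hazard ratios such as those reported above are exponentiated regression coefficients, HR = exp(β). The sketch below shows the conversion in both directions, assuming Wald-type 95% intervals that are symmetric on the log scale; that assumption, and the function names, are illustrative and not confirmed by the source. The worked example uses the reported estimate for female versus male births (HR = 0.86, 95% CI: 0.83-0.89).

```python
import math

Z95 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def hr_with_ci(beta, se):
    """Hazard ratio and 95% CI from a log-hazard coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - Z95 * se), math.exp(beta + Z95 * se)


def coef_from_hr(hr, lower, upper):
    """Approximate coefficient and standard error implied by a reported HR and 95% CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return beta, se


beta_female, se_female = coef_from_hr(0.86, 0.83, 0.89)
hr, lower, upper = hr_with_ci(beta_female, se_female)
print(beta_female, se_female)
print(round(hr, 2), round(lower, 2), round(upper, 2))
```

Because the reported values are rounded to two decimals, the recovered coefficient and standard error are approximations of the underlying estimates.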

At the state-region level
Multiple births, birthsize less than average, and history of sibling's death were associated with elevated risk of under-five mortality in all six state-regions of India. Multiple births were 3 to 4 times as likely to die within the first five years of life as singleton births. Likewise, births with below-average birthsize were over 2 times as likely to die within the first five years of life as births with average birthsize. Births in a family with a history of sibling's death were twice as likely to die within the first five years of life as those without a history of sibling's death in the north, central and east regions, and three times as likely in the northeast, west and south regions. Births with very short birth intervals (< 12 months) and very long birth intervals (> 85 months) were associated with higher risk of under-five mortality in all six state-regions. Except in the south state-region, births with a birth interval of 12 to 23 months had higher chances of dying within five years of birth. Births with a birth interval of 24 to 35 months were more likely to die in the north and central state-regions. Except in the north region, male births were more likely to die within the first five years of life than female births.
The association of maternal programme-related factors with under-five mortality was not uniform across the six state-regions. Births delivered through the caesarean mode of delivery had a lower risk of under-five mortality only in the north, east and northeast state-regions. Likewise, births assisted by unskilled health personnel had a higher risk of under-five mortality only in the central, east, west and south state-regions. Mother's socioeconomic characteristics were also not uniformly associated with under-five mortality in the six state-regions. For example, children of mothers with a higher level of schooling had lower risk of under-five mortality compared with children of mothers with no schooling in the north, central, east, and south state-regions. Likewise, births in richer and richest wealth quintile households were less likely to die within five years of birth than those born in the poorest wealth quintile households in the north, central, east, and south state-regions.
Figure 2 shows the Kaplan-Meier survival curves for under-five mortality by sex of the child, type of birth, birth order, birthsize, previous birth interval, mode of delivery, assistance during delivery, and history of sibling's death. The log-rank test for equality of survival curves across two or more categories of a variable was also performed, and the p-values are provided against the variables in Fig. 2. The log-rank test was significant for all the variables, indicating that the survival curves were different. Female births had better chances of survival within the first five years of birth than male births. Likewise, singleton births had better chances of survival compared to multiple births. While second-order births had the best survival probability, births with birth order 5 or more had the lowest survival probability.
Births with a birth interval of less than 12 months had the lowest survival probability, while births with birth intervals of 36 to 47 months and 48 to 59 months had the highest survival probability. The survival probability for the 60 to 85 months birth interval was lower than for the 36 to 47 months and 48 to 59 months intervals. The survival probability for birth intervals of 86 months and above was higher than for the 12 to 23 months interval but lower than for the 24 to 35 months interval. Caesarean births, births assisted by skilled health personnel, and births having no history of sibling's death had a higher probability of survival than their counterparts.

Community clustering of under-five mortality
The parameter ln(θ) in Table 4 shows the variation in unobserved effects on under-five mortality explained by assigning a frailty term. The parameter was estimated with the null hypothesis that θ = 0. The likelihood ratio test for θ = 0 was significant for India and five of the state-regions, namely, north, central, east, northeast, and west, indicating that the characteristics affecting the probability of under-five mortality may be similar within the community.
Table 5 shows the hazard ratios of under-five mortality by previous birth intervals according to the mother's age at first birth in India. Among children born to mothers whose age at first birth was lower than 30 years, the risk of under-five mortality was higher for birth intervals shorter or longer than 36 to 47 months. However, this relationship was not evident among children of mothers whose age at first birth was 30 years or higher.
Table 6 shows the hazard ratios of under-five mortality by birthweight. The risk of under-five mortality was higher among births with a birthweight of less than 2500 g or of 4000 g or more. This result is similar to the results obtained using birthsize: the risk of under-five mortality was higher among births of less than average or above-average birthsize. Hence, our results are robust to the choice of birthsize or birthweight.
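The likelihood ratio test for θ = 0 described in this section compares the log-likelihoods of the models with and without frailty. Because θ = 0 lies on the boundary of the parameter space, the usual reference distribution is a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom, so the p-value is half the chi-square tail probability. The sketch below shows that calculation; the function name and the log-likelihood values are hypothetical and are not taken from Table 4.

```python
import math


def frailty_lr_test(loglik_no_frailty, loglik_frailty):
    """Likelihood ratio test of theta = 0 with a boundary-corrected p-value.

    Returns (LR statistic, p-value). The p-value is 0.5 * P(chi2_1 > LR),
    using P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    lr = 2.0 * (loglik_frailty - loglik_no_frailty)
    if lr <= 0.0:
        return 0.0, 1.0  # no improvement in fit: no evidence of frailty
    return lr, 0.5 * math.erfc(math.sqrt(lr / 2.0))


# Hypothetical maximized log-likelihoods.
lr_stat, p_value = frailty_lr_test(-1520.0, -1495.0)
print(lr_stat, p_value)
```

A significant result supports the presence of unobserved heterogeneity shared within communities; a non-significant result, as reported for the south region, means the model without frailty fits about as well.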

Discussion
This is the first study to discuss the effect of community dependency on under-five mortality in India and its six state-regions accounting for censoring. The presence of community dependencies results in the underestimation of the standard error of the estimate (Trussell & Rodriguez, 1990). Our study provides a more robust estimate of under-five mortality by adjusting for the unobserved community effects using the information on the time-to-death of the child. We used a Weibull hazard model with gamma shared frailty to understand the impact of unobserved community effects on the risk factors of under-five mortality. The result showed that, except for south India, children born in the same communities in India and in the other five state-regions, namely, north, central, east, northeast, and west, shared similar characteristics of under-five mortality. Several studies have documented the effects of community characteristics on under-five mortality in India (Bora, 2020; Gupta et al., 2016; Kravdal, 2004; Kumar et al., 2012; Singh et al., 2011). However, these studies neither considered the time-to-death information nor accounted for censoring. Obtaining robust estimates of under-five mortality risk factors that account for the impact of unobserved community factors is necessary to make target-oriented policies and programmes to reduce under-five mortality rates in India.
The risk of under-five mortality decreased with an increase in mother's schooling. At the national level, births that occurred to mothers with secondary or higher schooling had lower risk of under-five mortality compared with births to mothers with no schooling. The relationship between mother's schooling and risk of under-five mortality among their children was seen in the north, central, west, and south regions. This finding is consistent with the past Indian studies (Basu & Stephenson, 2005; Caldwell, 1994; Mandal & Chouhan, 2020; Mandal et al., 2019; Rajna et al., 1998; Vikram & Vanneman, 2020). In fact, earlier studies have argued that the importance of mother's schooling in reducing child mortality in India is becoming more potent over time (Bourne & Walker, 1991; Kravdal, 2004; Singh et al., 2011).
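As a reminder of how the Weibull hazard model with gamma shared frailty referred to in this Discussion works, the following minimal sketch may help. It assumes one common parameterisation, which may differ from the one used in the estimation: a baseline cumulative hazard λ·t^p·exp(xβ), and a community-level frailty z that multiplies the hazard and follows a gamma distribution with mean 1 and variance θ. All function names and parameter values are hypothetical.

```python
import math


def cumulative_hazard(t, lam, p, linear_predictor=0.0):
    """Weibull cumulative hazard: lam * t**p * exp(x'beta)."""
    return lam * t ** p * math.exp(linear_predictor)


def conditional_survival(t, z, lam, p, linear_predictor=0.0):
    """Survival to time t for a child in a community with frailty z."""
    return math.exp(-z * cumulative_hazard(t, lam, p, linear_predictor))


def marginal_survival(t, theta, lam, p, linear_predictor=0.0):
    """Survival averaged over gamma frailty with mean 1 and variance theta."""
    h = cumulative_hazard(t, lam, p, linear_predictor)
    if theta == 0.0:
        # No frailty: the model reduces to an ordinary Weibull model.
        return math.exp(-h)
    return (1.0 + theta * h) ** (-1.0 / theta)


# Hypothetical values: time in months, decreasing hazard (p < 1).
t, lam, p = 60.0, 0.01, 0.5
for theta in (0.0, 0.5, 1.0):
    print(theta, round(marginal_survival(t, theta, lam, p), 4))
# A community with z = 2 faces twice the hazard of one with z = 1.
print(conditional_survival(t, 2.0, lam, p), conditional_survival(t, 1.0, lam, p))
```

In this sketch, all children in a community share the same z, which is what induces the within-community similarity in mortality risk; θ = 0 corresponds to no community clustering.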
Female births were less likely to die within the first five years of life compared with male births in our study. The finding is consistent across the state-regions except for the north region. This finding is interesting given that India is still marked by considerable son preference. Our finding is in contrast to that of earlier studies that show a higher risk of under-five mortality among female children compared to male children (Arnold et al., 1998; Arokiasamy, 2004; Das Gupta & Mari Bhat, 1997). Only Ladusingh and Singh (2006), in their study on north-eastern states of India, reported a female advantage in survival in the first five years of life. Such a reversal in the trend could indicate that the postnatal discrimination against female children is reducing in the country. In addition, due to the declining fertility and availability of sex-detection technology, the postnatal discrimination against female children is shifting to prenatal discrimination against the female foetus, a point that was also noted by Bhat and Zavier (2003). In such a situation, more and more female children are likely to be born in small families and to be wanted.
We found a U-shaped relationship between preceding birth interval and under-five mortality. The under-five mortality risks were lowest among births with birth intervals of 36 to 47 months and 48 to 59 months. The mortality risks were much higher among children with birth intervals less than 36 months and greater than 59 months, with the risks being considerably higher among births with birth intervals less than 36 months. The U-shaped relationship between previous birth interval and under-five mortality was prominent among children of mothers whose age at first birth was below 30 years. Our finding aligns with the WHO recommendation of a three to five years interval between two consecutive births (World Health Organization, 2007). Our finding is also in alignment with other studies that have found a substantially higher risk of infant mortality among births with birth intervals less than three years (Molitoris et al., 2019; Rutstein, 2005). While the mortality risk plateaued after birth intervals of 36 months in Rutstein (2005), the mortality risks increased after 59 months in our study. Our findings also add to a relatively small body of research that shows that birth intervals longer than 60 months are disproportionately associated with higher risk of adverse maternal outcomes, which are known to be associated with foetal loss, low birthweight, preterm birth, and mortality in the first few years of life (Conde-Agudelo & Belizán, 2000; Conde-Agudelo et al., 2006; Skjaerven et al., 2002; Zhu et al., 1999). Our study also adds to Barclay et al. (2020), who found effects of very long birth intervals (> 60 months) on outcomes such as preterm birth, low birthweight and hospitalization during childhood. Hanley et al. (2017) also found that birth intervals longer than 60 months increased the risk of low-birth-weight babies.
Death of a preceding sibling was associated with higher risk of under-five mortality, net of other independent variables. This relationship was seen in all the six state-regions of India. This finding is consistent with earlier studies that reported an association between the death of a preceding sibling and subsequent infant death after controlling for maternal-level unobserved heterogeneity (Arulampalam & Bhalotra, 2006, 2008). Given that 11% of the births in India during the five years preceding NFHS-4 occurred to mothers who had experienced death of a child in the past, there is an urgent need for policy makers and programme managers to focus on such mothers and births. Our study also calls for greater focus on low-birth-weight babies as these babies comprised 18% of total live births in India (IIPS & ICF, 2017). In our study, children from multiple births were also more likely to die before their fifth birthday. Bills et al. (2018) similarly found that multiple births were at higher risk of death at 2, 7, and 42 days after delivery. Since multiple births are relatively uncommon (about 2%) in India, the mortality burden of multiple births is likely to be small.
Earlier studies on child mortality have indicated a Muslim mortality advantage in India despite Muslim parents being poorer and less educated than Hindu parents (Bhalotra & Van Soest, 2008; Bhalotra et al., 2010; Bhat & Zavier, 2005; Deolalikar, 2008; Geruso & Spears, 2014; Shariff, 1995). However, we did not find any Muslim under-five mortality advantage at the national level. We found a Muslim under-five mortality advantage only in the east region, where Muslim children were only 0.86 times as likely as Hindu children to die within five years of birth. On the contrary, we found a Muslim under-five mortality disadvantage in the central region. The estimated under-five mortality rates from NFHS-4 also indicate closing Hindu and Muslim mortality gaps at the national level (IIPS & ICF, 2017). These findings point towards a reversal of the Muslim under-five mortality advantage that was present for many decades in India. Similarly, we find a diminishing of the Scheduled Castes or Scheduled Tribes under-five mortality disadvantage in all the six state-regions of India. This finding is also in contrast to previous studies that highlighted the Scheduled Castes' and Scheduled Tribes' disadvantage with respect to child mortality (Bora et al., 2019; Dommaraju et al., 2008; Subramanian et al., 2006; Vishwakarma et al., 2020).
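For readers less familiar with hazard ratios, the east-region figure of 0.86 quoted above can be read as follows. The short sketch below converts a hazard ratio into the corresponding log-hazard coefficient and the percentage difference in hazard relative to the reference group; the function name is hypothetical and the snippet is only an interpretive aid, not part of the estimation.

```python
import math


def describe_hazard_ratio(hazard_ratio):
    """Return the log-hazard coefficient and the percentage change in hazard."""
    log_hr = math.log(hazard_ratio)  # coefficient on the log-hazard scale
    percent_change = (hazard_ratio - 1.0) * 100.0  # relative to the reference group
    return log_hr, percent_change


# Hazard ratio for Muslim versus Hindu children in the east region (from the text).
log_hr, pct = describe_hazard_ratio(0.86)
print(f"log(HR) = {log_hr:.3f}, change in hazard = {pct:.0f}%")
```

A hazard ratio below 1 therefore indicates a lower hazard of death than in the reference group, and a ratio above 1 a higher one.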
The reversal in mortality advantage or disadvantage for some of these groups may be attributed to the Government of India's greater focus on improving the health of the poor and the marginalized population subgroups. The ambitious National Rural Health Mission (NRHM), now known as National Health Mission (NHM), was launched in 2005 to improve the health of the poor and the marginalized population subgroups, such as rural poor, scheduled castes or tribes, women, and children (National Health Portal, 2018). Another aim of the NRHM was to bring architectural corrections in the public health system of the country. Janani Suraksha Yojana (JSY), now strengthened and renamed as Janani Shishu Suraksha Karyakram (JSSK), is an important programme that aims at promoting institutional delivery among pregnant women belonging to the poor and the marginalized subgroups to effectively reduce maternal and neonatal mortality (National Health Portal, 2015). Since its implementation, institutional deliveries have risen sharply, from 39% in 2005 to 79% in 2015-2016 (IIPS & ICF, 2017). Studies have also shown that such a sharp increase in institutional deliveries was accompanied by a significant decline in perinatal and neonatal deaths in India (Goudar et al., 2015). Socio-economic gaps in the utilization of other maternal and child health care services have also narrowed over the last two decades (IIPS & ICF, 2017). Another important intervention in this direction is the Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA, 2005). MGNREGA provides a legal guarantee of at least 100 days of employment every year at minimum wages for at least one able-bodied person in every rural poor household (https://nrega.nic.in/amendments_2005_2018.pdf). Limited research shows that MGNREGA has positive and significant effects on women's participation in household decision-making (De Mattos & Dasgupta, 2017). Participation in MGNREGA was also associated with reduced infant malnutrition in Rajasthan, India (Nair et al., 2013).

Limitations
Our study has a few limitations. First, our study is based on the retrospective survey data on child survival collected by interviewing women in the age group 15 to 49 years. Owing to the retrospective nature of the data, there is a possibility of recall bias when mothers report the age-at-death of their children. The precision of the information on age-at-death of the child is important in establishing the actual proportion of deaths in a specific age group in order to have an accurate overall estimate of mortality (Alexander & Alkema, 2018). To reduce the recall bias, we restricted our analysis to births during the five years preceding the NFHS-4. Second, we could not include birthweight in the regressions because of the large number of missing cases. However, the exclusion of birthweight was compensated by including birthsize in the regression models. Although birthsize is a subjective measure, some earlier studies have shown that birthsize is a good proxy of birthweight in developing countries like India (Mani et al., 2012; Singh & Tripathi, 2013; Titaley et al., 2010). The sensitivity analysis also reveals no difference in using birthsize as a proxy of birthweight. Third, we could not include variables related to antenatal care in the regression models because the information related to antenatal care was collected only in reference to the most recent birth in the five years preceding NFHS-4. Finally, owing to inadequate sample size, we could not perform the sensitivity analysis for different state-regions.

Conclusion
We extended and updated the literature on under-five mortality by examining the factors associated with under-five mortality in India using information on over 0.23 million births that occurred during the 5 years preceding NFHS-4. NFHS-4 is India's most recent population representative household survey for which unit-level data are available in the public domain. Such a large sample size offered us a unique opportunity to examine the associations in each of the six state-regions separately. Moreover, we used a Weibull hazard model with gamma shared frailty to understand the impact of unobserved community effects on the risk factors of under-five mortality. By doing so, we were able to show that births share similar characteristics of under-five mortality within the community, a finding that has rarely received attention in the existing literature. In addition, we were able to demonstrate the effects of variables such as birth interval and multiple births effectively. We were also able to show how the association of certain variables, such as sex of the child, religion and caste, with under-five mortality has changed over the last decade. Our study complements existing literature by providing a more robust estimate of under-five mortality risk factors for India and its six state-regions. The model's ability to account for censoring makes the estimates more robust than the estimates from the previous studies. By analysing the state-regions separately, we were able to identify factors that may contribute to the reduction in under-five mortality in these specific regions. Since the factors associated with under-five mortality were not necessarily the same across the six state-regions of India, adopting a uniform approach in dealing with under-five mortality in India may not benefit all the regions equally.