The potential impact of co-residence structures on socio-demographic inequalities in COVID-19 mortality

During the COVID-19 pandemic, confinement measures were adopted across the world to limit the spread of the virus. In France, these measures were applied between March 17 and May 10. Using high-quality population census data and focusing on co-residence structures on French territory, this article analyzes how co-residence patterns unevenly put different socio-demographic groups at risk of being infected and dying from COVID-19. The research ambition is to quantify the possible impact of heterogeneity in co-residence structures on socio-economic inequalities in mortality stemming from within-household transmission of the virus. Using a simulation approach, the article highlights pronounced theoretical inequalities in vulnerability to COVID-19 related to cohabitation structures, as well as a reversal of the social gradient of vulnerability as the age of the infected person increases. Among young age categories, infection is simulated to lead to more deaths in the less educated or foreign-born populations. Among the older ones, the inverse holds: infections have a greater potential to provoke deaths through the transmission of the virus within households headed by a highly educated or a native-born person. Demographic patterns such as the cohabitation of multiple generations and the survival of both partners of a couple help to explain these results. Even though inter-generational co-residence and large households are more common among the lower educated and foreign-born in general, the higher educated are more likely to still live with their partner at higher ages.


Introduction
In early 2020, the COVID-19 pandemic caused a global health crisis, affecting more than 120 million people in 219 countries and claiming more than 2.7 million lives in a year 1. Several international studies have already shown how COVID-19 unevenly affects different populations and social groups within countries. In the USA, the over-exposure of African-Americans in particular has been highlighted, pointing out that the health inequalities structuring American society were exacerbated by the pandemic (van Dorn et al. 2020; Yancy 2020), with prevalence rates three times higher for cases and six times higher for deaths in predominantly black counties compared to white ones (Khunti et al. 2020). In general, pre-existing social inequalities in health seem to favor health inequalities in the COVID-19 pandemic, as has been shown in China (Chen et al. 2020), Italy (Group and et al 2020), and Great Britain (ONS 2020b). Controlling for age, mortality rates in deprived areas of England were more than twice as high as those in privileged areas (ONS 2020a). In France, the number of deaths has increased sharply, with marked differences according to the country of birth of the deceased. All causes taken together, deaths of people born abroad in March and April 2020 increased by 48% compared to the same period in 2019, against 22% for deaths of people born in France (Papon and Robert-Bobée 2020). A large variety of factors could underlie these socio-economic differences, including living conditions 2, working conditions 3, or pre-existing health inequalities 4. In this article, we focus on the role that household composition could play in creating socio-economic differences in the risk of dying from COVID-19. By focusing on household structure, we follow earlier research that has shown how demographic factors, such as age, are likely to be important determinants of variation in mortality due to COVID-19 across countries and geographical areas (Dowd et al. 2020; Esteve et al. 2020b; Esteve et al. 2020a; Bajos et al. 2020).
1 See the Johns Hopkins University Coronavirus Resource Center.
However, the role played by demographic factors is not limited to the age of populations. Demographics can help shed light on the causes of variations in fatality rates by addressing factors such as prevalence of chronic disease, population density, economic disparities, sanitary conditions, and household size and composition (Nepomuceno et al. 2020). Inhabitants of the same dwelling indeed expose each other to the risk of within-household transmission of the virus (Li et al. 2020), and in France, the first results of the EpiCoV survey 5 suggested the crucial importance of familial transmission, with a rate of positive serological tests 6.1 times higher for individuals living with another previously infected person than for people living alone. The lockdown measures adopted in France between March and May and again in November and December, designed to limit social interactions and reduce the virus' reproduction rate, give an even more central role to co-residency structures in the evolution of mortality inequalities in the pandemic caused by the SARS-CoV-2 coronavirus 6.
By focusing on French data, our main goal is to investigate how co-residence patterns can shape socio-economic inequalities in COVID-19 mortality in France through the channel of within-household transmission. In this work, we analyze inequalities related to education level, migration status, and citizenship status. Education, citizenship, and nativity status are benchmarks in much work on inequality, and studying them helps to echo international work on COVID-19 social inequalities. The remainder of the article first presents a theoretical reflection on differences in household composition across educational levels or nativity status groups that may have an influence on mortality inequalities caused by within-household transmission ("Background" section). At the end of this section, we formulate a set of hypotheses to be tested. The data and the method are presented in a second part ("Data and method" section), followed by the results of our micro-simulation models ("Results" section) and a final discussion.
2 Population density of inhabited neighborhoods, phenomena of spatial segregation, inequalities in access to housing.
3 Inability to stop going to work physically, (non)existence of sick leave.
4 Inequalities in access to health insurance, co-morbidity factors.
5 Coordinated by the Institut national de la statistique et des études économiques (INSEE), the Institut national d'études démographiques (INED), and the Direction de la Recherche, des Études, de l'Évaluation et des Statistiques (DREES).
6 Even if the effectiveness of lockdown measures and the limitation of the epidemic may depend on a vector of causes such as institutions, health behaviors, prevalence of co-mortality factors, climate, or compliance with social distancing measures.

Background
In this paper, we aim to study the impact that household arrangements can have on group differences in mortality arising from the within-household transmission of COVID-19, even though household arrangements can affect the risk of dying from COVID-19 through other pathways, such as the need for external help and the need for contact with people outside the household. Given this focus on mortality due to the within-household transmission of the virus and the strong age gradient in case fatality ratios (Verity et al. 2020), two factors become important: the size of households (a limiting factor in the potential number of people to whom the virus can be transmitted within the household following an initial infection) and the age of household members. The more household members there are, and the older they are, the higher the risk that household members die after a person becomes infected with COVID-19. We label this risk that household members die from COVID-19 after within-household transmission as vulnerability. We measure it as the average number of deaths that an initial infection could trigger within a household through the transmission of the virus to the other household members. In the next section, we discuss how three co-residence features that shape a household's vulnerability are expected to vary by education and migration status: living alone, household size, and multi-generational households.

Living alone at all ages
The possibility of within-household transmission of COVID-19 is determined by a dichotomous feature of household composition: those made up of several people and those made of people living alone and therefore not at risk of becoming infected by other household members. We here only consider how living alone reduces within-household transmission of COVID-19. Unquestionably, living with relatives provides advantages of mutual assistance or moral support (Arpino et al. 2020), and individuals who live alone might be more at risk of becoming infected by COVID-19 through non-household members, but these aspects are not central to the analysis presented.
An increasing number of people live alone in European countries, and this phenomenon affects all age groups to varying degrees (Esteve et al. 2020c). The age at which one leaves the parental home, which may be linked to economic (Portela and Dezenaire 2014) or cultural (Van de Velde 2008) factors, matters for the share of individuals living alone at young ages. After that, the likelihood of moving in with a partner and having children, then widowhood at older ages, are the main successive factors explaining single-person households (Reher and Requena 2017; Requena et al. 2019). In France, during the first lockdown, 16% of the population lived alone, including a substantial number of people over 75 years of age (Bernard et al. 2020). Through the postponement of household formation and the increase in life expectancy, living alone became more common at all ages before retirement over the last decades (Bodier et al. 2015). After retirement, the proportion of people living alone is much higher and affects women more often, especially after 80, because women are on average younger than their partners, live longer, and re-couple less often after a break-up (Francine et al. 2001).
These average age trends differ by socio-demographic characteristics. Before the age of 45, education is positively correlated with the probability of living alone. Whereas higher educated persons are more likely to live in a couple, lower educated persons are more likely to live with children because of higher rates of single parenthood. These differences then diminish, and after the age of 60, a higher level of education is associated with a lower probability of living alone. This pattern stems from education-related health inequalities, which reduce the risk of being widowed at higher ages for higher educated individuals as compared to lower educated individuals (Blanpain 2016).
Hypothesis 1a: Higher educated individuals have a reduced risk of dying after the within-household transmission of COVID-19 because they are more likely to currently live in a one-person household. However, this negative relationship between vulnerability and education can reverse with age.
Regarding immigration, a stronger link to migration is on average associated with a lower probability of living alone. This might mirror the educational gradient discussed above since immigrants and descendants of immigrants are over-represented in the least educated categories. However, inter-generational cohabitation logic could also explain these figures if there is a higher prevalence of multi-generational households in immigrant populations.
Hypothesis 1b: Since people with a migration history are less likely to live alone, migration status should increase vulnerability to dying from COVID-19 after within-household transmission, regardless of the age of the person initially infected.

Household size
The higher the number of people in a dwelling, the more potential for within-household transmission of COVID-19. Several factors can lead to large families. In France, the number of children is not evenly distributed by level of parental education, with an overrepresentation of the least educated in large families (Pirus 2004).
Stepfamilies are also associated with a higher average number of children and are more prevalent among the least educated (Bodier et al. 2015). Immigrants are almost twice as likely as non-immigrants to live at home with three or more children. Descendants of immigrants, however, live in households with three or more children in almost the same proportions as non-immigrants and have similar fertility behavior (Blanpain and Lincot 2015).
Hypothesis 2: Household size is expected to increase the vulnerability of the least educated populations and those with an immigration background, at all ages.

Multi-generational households
Since the mortality rate of COVID-19 is highly age-dependent, household size analysis should be accompanied by an analysis in terms of the age of its inhabitants. In particular, the heterogeneous patterns of inter-generational cohabitation can cause high mortality differentials across social groups. Living with one's parents at different stages of the life cycle can impact inequalities in vulnerability to COVID-19. Education is positively related to the age at first childbearing (Davie 2012). At the same age, the children of parents with higher education can therefore transmit the virus to older people within their households if they still live in the parental home. However, living in the parental home happens less often in more educated households. Among 30-year-olds, lower educated individuals live with their parents three times more often than university graduates. Even after the first departure, it is not uncommon to see people returning to their parents' home. The reasons put forward in the surveys to explain a return to one's parents' home vary according to age, but across all age categories, the loss of employment, financial problems, or health problems account for half of the returns to the parents' home 7 . These problems predominantly affect individuals with precarious employment status and difficult working conditions, a population in which the least qualified people and foreigners are over-represented.
Hypothesis 3: Education and the absence of links with migration reduce vulnerability by lowering the likelihood of living in a multi-generational household.
All of these socially influenced cohabitation patterns suggest that the inequalities in COVID-19 vulnerability linked to within-household transmission depend on multiple factors that may have different directions and significance. Our aim is to give a quantitative indication of the relative importance of these factors in shaping the vulnerability of socio-economic groups to dying from COVID-19 after within-household transmission of the virus.

Data
The data used are from the Census of the French population carried out by the National Institute of Statistics and Economic Studies (INSEE) from 2009 to 2013. Each year, the census collects information on a subsample of all French households. The French population is divided into five groups, with data on each group being collected from a representative sample of households every 5 years. The inclusion of 5 years of data therefore covers the whole French population. The final sample used in our analysis includes 19.6 million observations representing 31% of the total French population living in private households. Each observation is weighted in order to make the sample representative of the total population in the median year of data collection (2011).
The only selection criterion we used to construct our sample was whether individuals lived in private households. Inhabitants of collective dwellings such as retirement homes were excluded from the sample by the statistical institute providing the data. The data are organized into households and provide information on age, education, citizenship, and migratory status for all individuals in each household. No case was excluded because of missing information on one or more of these variables.

Decomposing direct and indirect risks
We aim to document group differences in the number of deaths that are expected to arise after a person becomes infected with COVID-19 and subsequently exposes other household members to infection with the virus. In this case, deaths can be of two types: direct deaths stemming from a "primary" infection, when an individual becomes infected outside of the household, and indirect deaths caused by "secondary" infections, i.e., linked to the transmission of the virus from the primarily infected person to other household members. Following the methodology of Esteve and colleagues (Esteve et al. 2020b), we compute for each individual the expected total number of deaths if that person becomes infected with COVID-19. We assume that primary infections occur at random because we concentrate on variation that arises after transmission within the household. Direct deaths per infection equal the age-specific probability of dying once infected. Indirect deaths are computed by multiplying the number of co-residents the infected person has in each age category by both the age-specific probability of within-household infection 8 and the age-specific probability of death following an infection. As co-residential patterns differ throughout the life cycle, individuals are classified into 10-year age groups.
Formally, we compute the average total number of deaths per infection in a given age category as:

$$\mathrm{Deaths}_{a} = m_{a} + \frac{\sum_{i=1}^{N_{a}} \left( \sum_{a \in A} n_{i,a} \, r_{a} \, m_{a} \right) p_{i}}{\sum_{i=1}^{N_{a}} p_{i}} \qquad (2)$$

where the first term counts direct deaths and the second term counts indirect deaths, with $N_a$ the total number of inhabitants in age category $a$, $n_{i,a}$ the total number of individual $i$'s co-resident members in age group $a \in A$ with $A$ the set of all age groups, $p_i$ the individual weight, $r_a$ the age-specific probability of infection computed by Davies et al. (2020) 9, and $m_a$ the age-specific infection fatality ratios as estimated for 10-year age groups using official epidemiological data in France between 27 May 2020 and 22 February 2021 10, 11.
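The decomposition into direct and indirect deaths can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout (`age_group`, `coresidents`, `weight`), the function names, and the three age groups with their rates are hypothetical placeholders, not the estimates of Davies et al. (2020) or the French infection fatality ratios.

```python
from collections import defaultdict

# Hypothetical rates for three illustrative age groups (placeholders,
# NOT the values used in the article).
r = {"0-9": 0.4, "40-49": 0.8, "70-79": 0.8}     # within-household infection probability
m = {"0-9": 0.0, "40-49": 0.001, "70-79": 0.05}  # infection fatality ratio


def indirect_deaths(coresidents, r, m):
    """Expected deaths among co-residents: sum over age groups of n * r * m."""
    return sum(n * r[a] * m[a] for a, n in coresidents.items())


def average_total_deaths(people, r, m):
    """Weighted mean of direct + indirect expected deaths,
    by age group of the primarily infected person."""
    num = defaultdict(float)
    den = defaultdict(float)
    for person in people:
        a = person["age_group"]
        total = m[a] + indirect_deaths(person["coresidents"], r, m)
        num[a] += total * person["weight"]
        den[a] += person["weight"]
    return {a: num[a] / den[a] for a in num}


# Toy sample: one record per individual, with a survey weight and the
# number of co-residents in each age group.
people = [
    {"age_group": "0-9", "weight": 2.0, "coresidents": {"40-49": 2}},
    {"age_group": "0-9", "weight": 1.0, "coresidents": {"40-49": 1, "70-79": 1}},
    {"age_group": "70-79", "weight": 1.0, "coresidents": {}},
]

result = average_total_deaths(people, r, m)
```

In this toy sample, the child who shares a home with a 70-79-year-old dominates the average for the 0-9 group, which mirrors the role the text assigns to the age of co-residents.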

Assessing the role of within-household transmission on COVID-19 social inequalities
Since the age-specific probabilities of dying $m_a$ are assumed invariant across social variables but co-residence structures are expected to vary according to the reference level of education or migration status 12, assessing the role of within-household transmission on inequalities in COVID-19 vulnerability implies averaging the second part of Eq. (2) not only by age but also by education level or nativity status. Formally, we compute the average number of indirect deaths following an infection in a given age category and a given socio-demographic category as:

$$\mathrm{Indirect}_{a,s_v} = \frac{\sum_{i=1}^{N_{a,s_v}} \left( \sum_{a \in A} n_{i,a,s_v} \, r_{a} \, m_{a} \right) p_{i}}{\sum_{i=1}^{N_{a,s_v}} p_{i}} \qquad (3)$$

with $N_{a,s_v}$ the total number of inhabitants in age category $a$ and social category $s$ from variable $v$, $n_{i,a,s_v}$ the total number of individual $i$'s co-resident members in age group $a \in A$ with $A$ the set of all age groups, $p_i$ the individual weight, $r_a$ the age-specific probability of infection, and $m_a$ the age-specific infection fatality ratios.
Comparing different values of Eq. (3) for different categories s of the same sociodemographic variable v identifies how the average differences in co-residence structures between social groups can shape the theoretical inequalities of mortality related to the transmission of the virus within households.
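As a sketch of how such a comparison could be computed, the snippet below averages expected indirect deaths within each combination of age group and head-of-household category and scales the result to 1000 primary infections. The field names, the function `indirect_index`, and all rates are illustrative assumptions, not the article's code or estimates.

```python
from collections import defaultdict


def indirect_index(people, r, m, per=1000):
    """Weighted average of expected indirect deaths per `per` primary infections,
    keyed by (age group of the infected person, head-of-household category)."""
    num = defaultdict(float)
    den = defaultdict(float)
    for person in people:
        key = (person["age_group"], person["head_category"])
        expected = sum(n * r[a] * m[a] for a, n in person["coresidents"].items())
        num[key] += expected * person["weight"]
        den[key] += person["weight"]
    return {key: per * num[key] / den[key] for key in num}


# Hypothetical rates and a toy sample (placeholders only).
r = {"0-9": 0.5, "30-39": 0.5, "60-69": 0.5}
m = {"0-9": 0.0, "30-39": 0.0002, "60-69": 0.01}
people = [
    {"age_group": "30-39", "head_category": "primary", "weight": 1.0,
     "coresidents": {"60-69": 1}},            # lives with an older relative
    {"age_group": "30-39", "head_category": "university", "weight": 1.0,
     "coresidents": {"30-39": 1, "0-9": 1}},  # lives with a partner and a child
]

index = indirect_index(people, r, m)
gap = index[("30-39", "primary")] - index[("30-39", "university")]
```

With these made-up inputs, the whole gap between the two categories comes from who the infected person lives with, since the fatality ratios are held identical across social groups.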
Before moving to the main analysis, we present the socio-demographic variables we use to compare the vulnerability of social groups.

Head-of-household level variables
The objective is to study the dispersion of the vulnerability index as a function of socio-economic variables, in particular, education, citizenship, and nativity variables. However, not all inhabitants of the same dwelling belong to the same education or migration category. We therefore assign socio-demographic variables at the household level rather than at the individual level using information on the reference person, which is identified in each household following the INSEE methodology 13. The benefits of this household approach are especially clear for individuals who have not completed their education and still live with their parents. The head-of-household variables v from Eq. (3) that we use to study socio-economic differences in vulnerability are "education," "citizenship status," and "nativity status." Educational level has been classified according to the recommendations of the Conference of European Statisticians for the 2010 Population and Housing Censuses and contains five levels describing the level of education completed: less than primary, primary, lower secondary, upper secondary, or university completed. Education was preferred to occupational categories because the latter did not allow for the inclusion in the analysis of the inactive (including students and retirees) or the unemployed recorded at the time of the cross-sectional data collection. The nativity status distinguishes between individuals born in France and foreign-born individuals, whereas the citizenship status is partitioned into three categories: citizen by birth, naturalized citizen, and non-citizen 14. Information on having parents who migrated to France was not available in the data.

Co-residence patterns by age
We start the analysis by describing the household composition of French households and subsequently calculate the connected vulnerability to dying from COVID-19 and how this differs by education, citizenship, and nativity status. Figure 1 shows the average household composition by 10-year age groups. The average number of co-residents ranges from 3.2 for young children to 0.6 for those over 80 years of age (Fig. 1). Children are the population with the highest average number of co-residents. Between 20 and 29, individuals start leaving the parental home, leading to a fall in the average number of co-residents. Between the ages of 30 and 50, partnering and fertility behaviors increase the average size of households. From the age of 50 onward, children leaving the household, coupled with increases in widowhood probability, lead to a gradual fall in the number of co-residents. Unlike other age groups, people over 70 essentially live with individuals of their own generation.
Figure 2 translates the age differences in household composition into risks of dying from COVID-19. For each age group, the figure indicates the expected number of deaths per 1000 random COVID-19 infections. This graph makes the distinction between "direct" and "indirect" deaths according to the decomposition presented in Eq. (2). The lighter part of the bars shows the age gradient of the case fatality ratio computed with French epidemiological data and relates to "direct" deaths, i.e., the risk that primarily infected persons pass away themselves. The darker part indicates the additional deaths that transmission of the virus to co-residents could cause. These indirect deaths range from 1.2 deaths per 1000 random infections in the 0-9 age category to 25.2 deaths per 1000 primary infections among persons over 80 years of age, with 21.4 among people between 70 and 79. In other words, the number of deaths that could arise from a person above 70 transmitting the virus to other household members is more than 17 times higher than the number of deaths that children aged 0-9 could cause in that manner. Below 60 years of age, this dispersion is lower. Indirect deaths range from 1.7 deaths per 1000 random infections among 30-39-year-olds to 5.0 among 50-59-year-olds. This figure more than doubles to 10.4 indirect deaths per 1000 primary infections among 60-69-year-olds. These differences are related to the fact that individuals live less regularly with their parents at age 30 and relatively more with children under 10. By contrast, between 50 and 69, individuals regularly live with their partners, since the widowhood rate is relatively low in these age categories, and can therefore transmit the virus to a person from the same generation, whose age constitutes a non-negligible COVID-19 mortality risk.
Fig. 2 Vulnerability of households to COVID-19-related deaths by age
13 The reference person of the household is determined automatically using a rule based on the number of persons in the household. If the household has only one person, this person is the reference person. If the household has two persons, if they are a couple and of different sex, the man is the reference person; otherwise, the reference person is the oldest active person, or if neither of them is active, the oldest person. If the household has three or more persons, the three oldest persons in the household are considered. If a couple is identified among them, the oldest working man of this couple, failing this the oldest working woman of this couple, failing this the oldest woman of this couple is the reference person; otherwise, the reference person is the oldest working person among them, or if none of the three oldest persons in the household is active, the oldest person among them.
14 The similarity of the distribution of the individual level variables with the statistics published by the INSEE was verified upstream of the analysis of the head-of-household variables. The distribution of the sample by head-of-household categories is also available in Appendix A.

Direct and indirect risks by age
These figures highlight that the vulnerability of a population to the virus depends not only on its age structure (direct deaths), but also on the number and age of other household members to whom each individual can transmit the virus (indirect deaths). Within-household transmission matters for population vulnerability at all ages. The younger the infected individuals are, the greater the indirect deaths are as a proportion of all deaths. Among people under 20, more than 99% of predicted deaths following a primary infection are the deaths of another family member to whom the virus would be transmitted. As age increases, the absolute predicted number of indirect deaths per primary infection increases.
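To make the "proportion of all deaths" statement concrete, the share of indirect deaths can be written out as below. The symbols $D$ and $I$ are shorthand introduced here, not the article's notation, for direct and indirect deaths per 1000 primary infections. Applying the 99% threshold to the 0-9 group, with its 1.2 indirect deaths per 1000 infections, is an illustrative assumption, since the text states the threshold for all people under 20.

$$\text{share}_{\text{indirect}} = \frac{I}{D + I}, \qquad \frac{I}{D + I} > 0.99 \iff D < \frac{I}{99}$$

$$I = 1.2 \;\Rightarrow\; D < \frac{1.2}{99} \approx 0.012 \text{ direct deaths per 1000 primary infections}$$

Under that assumption, direct deaths in the youngest group would amount to roughly one per 100,000 primary infections or fewer, which is why almost all predicted deaths at these ages are indirect.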
The next section focuses on documenting inequalities in theoretical indirect deaths due to within-household transmission between social groups defined by education category, country of birth, and citizenship status of the head-of-household 15. Figure 3 displays the estimated average number of indirect deaths following the infection of 1000 individuals belonging to a given age and head-of-household education level category. In other words, each bar of the graph is a value of

$$\mathrm{Index}_{a,s_v} = \frac{\sum_{i=1}^{N_{a,s_v}} \left( \sum_{a \in A} n_{i,a,s_v} \, r_{a} \, m_{a} \right) p_{i}}{\sum_{i=1}^{N_{a,s_v}} p_{i}} \times 1000$$

with $v$ being the reference education level of the household, $s_v$ all the possible education levels, and $a$ the 9 age categories. Crossing the age of primarily infected persons with their household's reference level of education highlights two social gradients: between age groups and between different levels of education within the same age group (Fig. 3).

Education, nativity, and the reversal of the social gradient in mortality over the life cycle
In the youngest age groups, the lower the household's reference level of education, the higher the number of indirect deaths following a primary infection. Up to the age of 60, the educational gradient of our COVID-19 demographic vulnerability index is very pronounced within each age group. The greatest dispersion occurs between 30 and 39 years of age, for which persons living in a household with the lowest level of education are estimated to cause 7.5 times more deaths through within-household transmission following their own infection than a person of the same age category in a household where the reference degree is a university degree. As the age of the primarily infected person increases, the education-related vulnerability inequalities shrink, and between ages 60 and 69, the head-of-household level of education no longer seems to play a major role. However, after age 70, the direction of our educational gradient in mortality is inverted, with our average vulnerability index increasing as the head-of-household education level rises. For instance, in households with a university degree as the reference education level, the infection of an octogenarian will cause significantly more deaths compared to the infection of an octogenarian living in a household with the lowest reference education level. This difference is estimated to be around 44% and to account for an additional 958 deaths per 100,000 primary infections.
15 All the results of the micro-simulation models are available in Appendix G.
Using the birthplace variable, we observe the same shift as the age of primarily infected persons increases (Fig. 4). Following an infection, people living in households headed by a foreign-born person can cause more indirect deaths through within-household transmission up to the age of 60. The relative difference peaks between 20 and 29 years of age. At that age, foreign-born household inhabitants are expected to cause on average 2.4 times more deaths through within-household transmission following an infection. After 30, this ratio shrinks.
From the age of 60 onward, not only does the total number of indirect deaths per infection increase substantially, but inequalities in vulnerability are also reversed. The reference population 16 suffers a comparative disadvantage and experiences a higher average number of indirect deaths per infection. However, these differences remain small: the reversal of our index gradient is less pronounced than for the level of education, although it appears slightly earlier in the life cycle. Estimated mortality differences amount to 60 additional indirect deaths per 100,000 primary infections among octogenarians and 133 among septuagenarians. Regarding citizenship, the findings are quite similar, even if the inversion of the gradient is less clear (Fig. 5).
Social inequalities in expected COVID-19 mortality related to within-household transmission of the virus thus differ in both magnitude and direction depending on the age of the infected person, a result that is interesting and analytically challenging. In the next section, we dive deeper into two questions that emerge from the results:
• Why are the households of the higher educated and native-born less vulnerable under the age of 60?
• Why is it that from the age of 60 onward, inequalities in vulnerability to the virus following an infection seem to diminish and reverse across groups?

Additional analysis
In the literature review, we discussed previous research to formulate expectations about the prevalence of various household structures among different socio-economic groups and how these could affect vulnerability to COVID-19. In this regard, we discussed the number of single-person households, household size, and multi-generational households as factors that could cause differences in vulnerability across social groups. Lower educated individuals and individuals with a migration background or without citizenship were expected to live in households that are more vulnerable to within-household transmission-related COVID-19 deaths for various reasons. First, they were expected to live in larger households. Table 1 shows the number of household members depending on the educational level, nativity status, and citizenship of the head-of-household. Large households are indeed more common among the lower educated, the foreign-born, and non-citizens. Second, they were expected to live more often in multi-generational households. Tables 7, 8, 9, 10, 11, and 12 in Appendices C and D confirm this expectation: under the age of 50, lower educated individuals and individuals with a migration background live on average with older individuals or with at least one person who is one or two generations older. Finally, they were expected to be less likely to live alone during the primary parenting ages. Table 2 shows the proportion of individuals living alone by age, education, nativity status, and citizenship variables. A positive educational gradient in living alone is indeed observed at younger ages, particularly between ages 20 and 39. Similarly, native-born individuals and French citizens are more likely to live alone.
These three factors combined explain why lower educated individuals, the foreign-born, and non-citizens are expected to be at a higher risk of dying from COVID-19 after becoming infected by a household member. However, these gradients in vulnerability reverse at later ages. For education, the clearest explanation can be found in the number of persons living alone. Table 2 shows that after age 60, the higher educated are less likely to live alone than lower educated persons. To estimate to what extent these differences in the likelihood of living in a single-person household can explain our results, we restrict the analysis to households of two or more persons in additional analysis (Appendix Figs. 15, 16, and 17 and Tables 13 and 15). For the education variable, the reversal of the gradient indeed disappears and the gaps beyond the age of 70 turn into a disadvantage for the least educated categories. A parametric model estimated for households with two or more persons is consistent with these results (Appendix G, Tables 16 and 17). These models also confirm that, for nativity status and citizenship, differences in living in a single-person household cannot explain why the gradient in vulnerability reverses at later ages.
Additional analysis (Appendix C, D, and E) shows that the main reason is that, even though older persons who were not born in France or are not citizens live in larger households and are less likely to live alone, they live with fewer persons who are at a high risk of dying from COVID-19. In other words, the age composition of these households is different. Whereas the native-born and citizens predominantly live with their partner or alone at later ages, the foreign-born and non-citizens less often live with a person from their own generation, i.e., their partner.17
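The age-composition argument above can be made concrete with a small formula. What follows is only a sketch under stated assumptions: the symbols D_i, H_i, \tau and \mu are introduced here for illustration and are not the article's notation, and only direct transmission from the primary case to each co-resident is considered.

```latex
% Sketch only: all symbols are assumptions made for illustration.
% H_i      : set of co-residents of the primary case i
% a_j      : age of co-resident j
% \tau(a)  : assumed probability that a co-resident of age a is infected
% \mu(a)   : assumed probability of death once infected at age a
\[
  D_i \;=\; \sum_{j \in H_i} \tau(a_j)\,\mu(a_j)
\]
% A two-person household (primary case plus a partner aged 75) is more
% vulnerable than a four-person household whose three co-residents are
% aged 30, 35 and 40 whenever
\[
  \tau(75)\,\mu(75) \;>\; \tau(30)\,\mu(30) + \tau(35)\,\mu(35) + \tau(40)\,\mu(40).
\]
```

Because fatality rises steeply with age, the inequality can hold even though the second household is larger, which is consistent with the explanation given for the foreign-born and non-citizens at later ages.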

Robustness and sensitivity tests
Our results are consistent with detailed demographic explanations. However, our microsimulation model relies on assumptions that may shape our results. First, our modeling assumes that the probability of being infected with the virus varies with age. This assumption and the probabilities of infection incorporated into the model are based on the work of Davies and colleagues (Davies et al. 2020). Setting the probability of transmission to the other inhabitants of the dwelling to 1 for all individuals gives an upper bound on the number of deaths that could result from an infection and allows us to check the sensitivity of our results to this age-specific infection rate hypothesis. Figures 12, 13, and 14 in Appendix F.2 summarize the results of these models and show that our results, in particular the reversal of mortality inequalities with the age of the primarily infected person, are robust to changes in the transmission rate.
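The logic of this sensitivity test can be sketched in a few lines of Python. This is a minimal illustration, not the article's simulation code: the function names, the household, and every numeric rate are hypothetical placeholders, and only direct transmission from the primary case is modeled.

```python
from typing import Callable, Sequence

# All numbers below are hypothetical placeholders, NOT the rates used in the article.


def assumed_infection_prob(age: int) -> float:
    """Assumed probability that an exposed co-resident of this age is infected."""
    return 0.4 if age < 20 else 0.8


def assumed_fatality(age: int) -> float:
    """Assumed probability of death once infected, increasing with age."""
    if age < 50:
        return 0.001
    if age < 70:
        return 0.01
    return 0.08


def expected_indirect_deaths(
    coresident_ages: Sequence[int],
    infection_prob: Callable[[int], float] = assumed_infection_prob,
    fatality: Callable[[int], float] = assumed_fatality,
) -> float:
    """Expected deaths among co-residents after one primary infection.

    Only direct transmission from the primary case is considered
    (a simplifying assumption of this sketch).
    """
    return sum(infection_prob(a) * fatality(a) for a in coresident_ages)


coresidents = [8, 41, 72]  # ages of the people sharing the dwelling

baseline = expected_indirect_deaths(coresidents)
# Sensitivity test: every co-resident is infected with certainty.
upper_bound = expected_indirect_deaths(coresidents, infection_prob=lambda age: 1.0)

print(f"baseline:    {baseline:.4f}")
print(f"upper bound: {upper_bound:.4f}")
```

Since every infection probability is at most 1, the second figure can never fall below the first, which is why the test yields an upper bound.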
In addition, the age-specific mortality rates are pivotal in shaping the results. We made the methodological choice to apply mortality rates computed from comprehensive French epidemiological data. However, we tested the robustness of our results to a change in these mortality rates by using the age-specific mortality rates calculated by Verity et al. (2020) and used, for example, in the work of Esteve and colleagues (Esteve et al. 2020b; Esteve et al. 2020a). The results obtained with these alternative mortality rates are available in Appendix F.3 and are consistent with the model used in the article. Only the inequalities observed between the native-born and foreign-born for primary infection at older ages are slightly smaller. Finally, our model has been designed to address inequalities by education, birthplace, and citizenship status separately. We provide in the Appendix the details of a simple parametric model that checks the robustness of our results by controlling jointly, rather than separately, for education and birthplace category. This model also includes a set of additional demographic variables and is reproduced for different household sizes (Appendix F.4).

Discussion
The aim of this article was to investigate to what extent differences in co-residence structures could cause variation in deaths through the within-household transmission of COVID-19 across social groups. Previous research has shown that mortality related to COVID-19 is socially stratified. Among the possible reasons that those from disadvantaged social groups are more likely to die from COVID-19 are differences in co-residence structures. In this article, we investigated to what extent this could be the case by relying on a simulation exercise. If a random person becomes infected with COVID-19, how many individuals are expected to die from the transmission of the virus to this person's household members? We found that education, citizenship, and place of birth of the head of household are all related to this expected number of deaths. The simulations showed that the infection of lower educated individuals, of persons who were born abroad, and of non-citizens is indeed expected to lead to more deaths related to within-household transmission of the virus than the infection of higher educated individuals, the native-born, and citizens. However, an interesting element that came forward in the analysis is that these socio-economic gradients in vulnerability reverse with age. At higher ages, higher educated individuals, the native-born, and citizens are more likely to still live with their partner. Given that the partners of older individuals are at higher risk of dying after infection with COVID-19, this increases the risk that someone dies after a higher educated individual, a native-born person, or a citizen becomes infected with COVID-19.
There are important limitations that have to be taken into account when interpreting this result. Firstly, the current work does not account for differences in co-morbidity between the socio-demographic categories studied. It is possible that case fatality ratios at higher ages differ by education due to co-morbidity, a refinement we could not take into account. Secondly, information on collective housing was not available either, an issue we have to leave for future research. Thirdly, in addition to demographic data, the calibration of the model was enriched with epidemiological data concerning the probability of virus transmission and mortality by age group. The likelihood of transmission probably differs according to other variables too. Future work could weight the likelihood of virus transmission by the type of family, intimate, or social link between each pair of inhabitants within the same household. Fourthly, we did not take into account possible differences by gender. Women might more often live with household members (children or parents) and therefore live in more vulnerable households, but at later ages, women might be more likely to live alone as they are more likely to survive their partner. Future research can look into this further. Finally, it has to be emphasized that our analysis relies on simulations based on estimated case fatality ratios. The extent to which within-household transmission-related mortality varies across social groups in reality is a question that only future data collections can answer.
Nonetheless, our results do confirm the concern that socio-economically disadvantaged individuals are more likely to live in households vulnerable to COVID-19. However, this conclusion only holds under the age of 60. At later ages, co-residence structures change and socio-economically advantaged individuals are more likely to live with persons at high risk of dying from COVID-19. This could be taken into account when interpreting observed socio-economic gradients in mortality.

Conclusion
Cohabitation structures and their heterogeneity by age and social group play an important role in the theoretical social inequalities in COVID-19 mortality. By measuring vulnerability inequalities as the theoretical differences in deaths caused by the virus through within-household transmission following a random initial infection, the results highlight that these inequalities are substantial regardless of the age of the person initially infected. However, the direction of the differences in average indirect deaths per infection changes with the age of the initially infected person. In the youngest age categories, the number of indirect deaths per infection is higher in the least educated households and in those with a migration background. These figures then balance out before turning to the disadvantage of native-born or more educated populations for initial infections of individuals over the age of 60. These results, which underline that cohabitation patterns alone can generate pronounced social inequalities in mortality, find a strong resonance at a time when a return to lockdown measures is re-entering public debate as a third wave of the epidemic emerges in France. Also, by decomposing vulnerability inequalities on the basis of two criteria, namely the age and the socio-demographic category of the primarily infected person, we posit that the social gradient in COVID-19 vulnerability linked to within-household transmission follows a finer logic than the social gradient in mortality usually presented in the literature, as its direction changes with the age of the initially infected person. The same pattern is observed for education level and place of birth, but a comparison of the age-specific cohabitation patterns for these two variables shows that the underlying mechanisms differ after the age of 60. For the level of education, single-person households are the main factor in inequalities, whereas for foreign-born individuals, households composed of several generations and smaller differences in widowhood relative to native-born persons reduce the observed inequalities.

Nativity status    Share
Foreign-born       17%
Native-born        83%
Note: 27% of individuals live in a household in which the head-of-household has a university degree. 83% of individuals live in a household in which the head-of-household was born in France.
Note: 95% of citizens by birth live in a household in which the head-of-household is a citizen by birth. 2% of them live in a household in which the head-of-household is not a citizen.
Note: 60.3% of people aged 0-9 living in a household headed by a citizen by birth live with at least one other person aged between 0 and 9.

Table 11 Proportion of individuals with at least one co-resident by age and head-of-household education level
[Table body not recovered. Header: proportion of individuals with at least one co-resident by age category; columns: co-residents' age; rows: head-of-household variable, beginning with "Less than primary education".]
Note: 60.3% of people aged 0-9 living in a household headed by a citizen by birth live with at least one other person aged between 0 and 9.

Table 12 Proportion of individuals with at least one co-resident by age and head-of-household nativity status

F.1 Sample restrictions
As a robustness test of our interpretations, the microsimulation exercise was repeated with the sample restricted in two different ways. First, persons living alone were excluded from the sample in order to quantify the importance of single-person households in the variations in indirect deaths observed between sub-populations. This test aims both to verify that the observed results are not biased by people living alone and to provide elements for assessing the explanations given for the inequalities observed at different ages, including the role that living alone may play depending on the categorical variable studied (Figs. 15, 16, and 17). Second, the results were replicated for fixed household sizes to provide additional support for our interpretations (Tables 13, 14, and 15).
Reference level: citizen by birth
Note: When the full sample is considered, the primary infection of a 0-9-year-old living in a household headed by a naturalized citizen triggers a vulnerability index 58.7% higher than the primary infection of a person of the same age living in a household headed by a person from the reference category.
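The two sample restrictions described in F.1 can be sketched as simple filters applied before averaging the simulated indirect deaths. The sketch below rests on assumptions: the record layout, the group labels "A" and "B", and all values are hypothetical and chosen only to show how single-person households can mask a gap between groups.

```python
from collections import defaultdict

# Hypothetical simulation output: one row per primary infection.
# "group" stands for a head-of-household category; values are illustrative only.
rows = [
    {"group": "A", "household_size": 1, "indirect_deaths": 0.00},
    {"group": "A", "household_size": 2, "indirect_deaths": 0.04},
    {"group": "A", "household_size": 2, "indirect_deaths": 0.02},
    {"group": "B", "household_size": 1, "indirect_deaths": 0.00},
    {"group": "B", "household_size": 1, "indirect_deaths": 0.00},
    {"group": "B", "household_size": 2, "indirect_deaths": 0.06},
    {"group": "B", "household_size": 4, "indirect_deaths": 0.02},
]


def mean_indirect_deaths(sample):
    """Average simulated indirect deaths per infection, by group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in sample:
        totals[row["group"]] += row["indirect_deaths"]
        counts[row["group"]] += 1
    return {group: totals[group] / counts[group] for group in totals}


def exclude_living_alone(sample):
    """First restriction: drop primary cases who live alone."""
    return [row for row in sample if row["household_size"] > 1]


def fixed_household_size(sample, size):
    """Second restriction: keep only households of one given size."""
    return [row for row in sample if row["household_size"] == size]


full = mean_indirect_deaths(rows)
no_singles = mean_indirect_deaths(exclude_living_alone(rows))
size_two = mean_indirect_deaths(fixed_household_size(rows, 2))

print("full sample:      ", full)
print("no one-person hh: ", no_singles)
print("two-person hh:    ", size_two)
```

In this toy data the two groups have the same average on the full sample, but a gap appears once people living alone are excluded, which is the kind of compositional effect the restriction is meant to isolate.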

F.4 Econometric model
A parametric model is used to control for the effect of a set of variables on simulated indirect deaths following an infection. The model is an ordinary least squares regression with interaction terms between categorical variables. The dependent variable is the individual number of theoretical indirect deaths from our microsimulation models. This model tests whether the effects of education level and nativity status put forward in our analysis are robust to a parametric estimation controlling for a set of other socio-demographic factors. These variables are sex, region of residence, age, and the urban or rural character of the area of residence. Besides these variables, the head-of-household education level and nativity status are included, and the estimates associated with these variables are used to compute their estimated effect on simulated indirect deaths. To compute the effects for both head-of-household education level and nativity status, three models were estimated. The first includes either the interaction between age and education level or the interaction between age and nativity status, along with the set of other socio-demographic variables. The second model includes the same set of variables and both interaction terms (age with head-of-household education level, and age with head-of-household nativity status) to control simultaneously for both variables. Formally, model 2 equation was:
The third model replicates the second one on the sub-sample of households with at least two persons.
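A specification consistent with the verbal description of model 2 could be written as follows. This is a sketch only: the symbols, the indexing, and the choice of omitted reference categories are assumptions made for illustration and should not be read as the authors' exact equation.

```latex
% Sketch only: notation is assumed, not taken from the article.
% D_i        : simulated indirect deaths following the primary infection of individual i
% h(i)       : head of the household of individual i
% a_0,e_0,n_0: omitted reference categories for age, education and nativity status
% X_i        : other controls (sex, region of residence, urban/rural area)
\[
  D_i = \alpha
      + \sum_{a \neq a_0} \theta_a \,\mathbf{1}[\mathrm{age}_i = a]
      + \sum_{a}\sum_{e \neq e_0} \beta_{a,e}\,
          \mathbf{1}[\mathrm{age}_i = a]\,\mathbf{1}[\mathrm{edu}_{h(i)} = e]
      + \sum_{a}\sum_{n \neq n_0} \gamma_{a,n}\,
          \mathbf{1}[\mathrm{age}_i = a]\,\mathbf{1}[\mathrm{nat}_{h(i)} = n]
      + X_i^{\top}\delta + \varepsilon_i
\]
```

Under this reading, \beta_{a,e} would be the estimated difference in indirect deaths at age a between education level e and the reference level, holding nativity status and the other controls fixed, and \gamma_{a,n} would play the same role for nativity status.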
All coefficients are significant at the 1% level. All estimation results are presented in the tables below as differences in the number of deaths per 100,000 infections relative to a reference level.
Reference level: native-born
Note: In the model with education control, the random primary infections of 100,000 persons aged 0-9 living in a household headed by a foreign-born person are associated with an estimated 81 additional deaths compared with 100,000 infections of individuals of the same age and with similar socio-demographic characteristics but living in a household headed by a native-born person.
Note: In the model with nativity control, the random primary infections of 100,000 persons aged 0-9 living in a household headed by an individual with a lower secondary education ([3]) are associated with an estimated 67 fewer deaths compared with 100,000 infections of individuals of the same age and with similar socio-demographic characteristics but living in a household headed by a person with less than primary education (reference level).