Moving from North to North: how are the students’ university flows?

Student mobility has been widely discussed and studied, and it has social, economic, and political consequences. In Italy, this form of mobility is relevant mainly in terms of south-north flows, while the mobility of northern students toward the South and Centre of Italy is negligible. To the best of our knowledge, a proper focus on the dynamics among northern regions has not yet been carried out. This study focuses on the interregional mobility of northern first-year students. To this end, we use a longitudinal dataset with students’ individual histories from 2008 to 2017, obtained from the cohort-based datasets collected using the Italian Ministry of University’s administrative databases. Descriptive and model-based analyses are employed to assess the association between the propensity to move and individual characteristics, as well as some territorial variables. A longitudinal analysis is also carried out, showing an increase both in the population entering the university system and in mobility flows across northern regions. The results show that students’ educational experiences influence the propensity to move. However, the most relevant driver of the phenomenon is the attractiveness of areas with a higher supply of university courses and a better economic context.


Introduction
Over the last 25 years, university education in the EU 15 has undergone a transition from an elitist to a mass form of education. In 2011, the European Higher Education Council noted that "learning mobility is widely considered to contribute to enhancing the employability of young people through the acquisition of key skills and competences, including especially language competences and intercultural understanding, but also social and civic skills, entrepreneurship, problem-solving skills and creativity in general". This European statement refers both to "students' degree mobility" (i.e., students enrol in a university outside their area of residence to complete a full degree) and to "students' credit mobility" (i.e., a limited period spent abroad for study or traineeship, as in the Erasmus programme), across and within countries, as well as across university education levels. Degree mobility can be inter-country (across countries) or domestic (within countries). Inter-country degree mobility is concentrated in a few countries: the top destinations (such as the UK, Germany, and France) host almost 80% of the mobile student population and are characterised by a substantial number of degree-mobile students coming from outside the EU. As for domestic mobility, the size and patterns of the phenomenon vary according to the spread of universities across regions in the same country and to urbanisation, employment opportunities, and regional education system levels. Bulgaria, Cyprus, Hungary, and Lithuania are characterised by a few universities that receive a significant number of domestic students: some big universities, located in the capital, attract students from across the country. In Britain, many studies have been conducted on the relationship among student mobility, social inequality, and future jobs. In Italy, the degree mobility rate towards neighbouring regions increased from 10.8% in 2008 to 15.4% in 2017.
However, the literature on the Italian context mainly focuses on domestic mobility from the South towards the Centre and the North of Italy. In 2017, in most regions of Southern Italy, movers accounted for over 30% of students in undergraduate courses (Attanasio & Priulla, 2020). It is important to underline that almost one-third of all universities and five of the twelve big public universities are located in the North. Indeed, 44% of all university courses supplied in Italy in 2017/18 were run in the North: 43% of scientific courses, 46% of health courses, and 43% of economics and law courses. The northern university system admitted 48% of all Italian first-year students in 2017/18, corresponding to 138,921 students: 11.1% more than in 2010, despite a decrease of 1.9% in the number of first-year students in the universities of central Italy and a decrease of 10.4% in the universities of southern Italy (ANVUR, Rapporto Biennale, 2018).
Although the mobility of northern students has been explored for specific areas using local administrative data, a broad analysis of mobility drivers and patterns across the whole northern area has not yet been performed. Moreover, the distinctive geographical, social, and economic characteristics of the northern area motivate a description of mobility within it, even though this mobility is less prominent than the south-north one. The aim of this study is exploratory, and the focus is on Italian domestic mobility across northern regions ("North to North"). We are particularly interested in undergraduate university courses, and several research hypotheses are formulated to guide the description.
We consider both spatial and temporal patterns in northern mobility, together with other factors relating to the pre-university educational experience and the geographical context. The analysis of the phenomenon takes into account the supply of university courses in the area. This supply is characterised by the presence of some big universities in four regions: Emilia Romagna, Veneto, Lombardy, and Piedmont. There are also important urban areas with several universities, such as Milan and Turin. This particular geographical distribution affects the patterns of interregional degree mobility.
This study is performed on a dataset based on the Database Mobysu.It (2016). This database collects cohort information on the careers of first-year students enrolled in Italian universities from 2008 to 2017. The analysis proceeds in steps. First, we study degree mobility in terms of the students' individual characteristics: gender, region of residence, nationality, individual educational experience, and university of destination. The longitudinal dimension of the dataset is also considered in terms of the time evolution of domestic degree mobility, with all the cohorts from 2008/09 to 2016/17. The approach combines a descriptive analysis with ordinal regression models aimed at assessing the role of individual and contextual variables in the propensity of first-year students to move.
The paper is organised as follows. The "Literature review" section surveys the related literature, while the "Research questions" section states the research questions. The dataset and the statistical methods adopted in the analysis are reported in the "Data and methods" section. The descriptive analysis and the results of the model estimation are presented in the "Results" section. Finally, the "Conclusions" section offers a discussion of the relevant results and conclusions.

Literature review
Student mobility for higher education is an issue that matters at the academic, institutional, social and economic level. Indeed, university students play a significant role in the diffusion of knowledge-based processes and, as such, represent crucial tools in promoting local innovation and economic growth (Abramovsky, Harrison, & Simpson, 2007).
In the literature, there are two strands on student mobility: the first focuses on international migration (Beech, 2018; Brooks & Waters, 2009; Gümüş, Gök, & Esen, 2020; Javed, Zainab, Zakai, & Malik, 2019), and the second concerns "domestic (degree)" migration, which usually occurs in a context of regional inequality, where the mobility of university students has important implications for future social mobility. Focusing on domestic mobility, Barrioluengo and Flisi (2017) noticed a strong heterogeneity in European countries: Bulgaria, Cyprus, Hungary, and Lithuania are the countries where the highest differences across universities exist, that is, where only a few universities receive a significant number of mobile students. Internal youth mobility towards urban areas is also witnessed in Poland (see Dolinska, Jonczy, & Rokita-Poskart, 2020), driving permanent migration and growing differences in the development of the regions and regional capitals. In Van Bouwel and Veugelers (2013), the results of different studies on domestic student mobility are reported. In the USA, university prestige accounts for only a modest proportion of the inter-state migration of students. In the Netherlands, students are not guided by the educational quality of university programmes, but rather by the availability of urban amenities, thus supporting the "consumption perspective" of higher education over the "investment perspective". In Scotland and Wales, students who can enter a high-quality university in their own nation are less likely to move away for higher education. In Japan, meanwhile, quality differentials significantly increase the likelihood that students move away from their home region for higher education. In England, many studies have been conducted on the relationship among students' mobility, social inequality, and future jobs.
In general, London plays the role of an "escalator region" for young people with upward mobility into professional and managerial occupations. Indeed, while London is particularly advantageous, there is a "migration premium" for upward social mobility associated with moving to a large UK city, compared to staying in the home region. Thus, social class represents a key factor that drives the mobility choices of young people, with disadvantaged students less likely to leave home (Donnelly & Gamsu, 2018; Holdsworth, 2009). Moreover, in Britain, Oxbridge (i.e., Oxford and Cambridge) has a special role, with implications of superior social or intellectual status, and students are consequently willing to travel to study there. In this context of heterogeneity across countries and of inter-regional inequalities, Italian domestic student mobility has its own special characteristics. These have become more marked in recent years, and they are related to Italian regional inequalities, with students from the poorer South being more likely to travel towards the Centre and the North of Italy (Attanasio & Enea, 2019). This pattern of students' mobility is consistent with the general interregional flows of highly skilled migrants from the southern regions of Italy, which are driven by the search for more favourable socioeconomic contexts (Ballarino & Panichella, 2021; Etzo, 2011; Nifo & Vecchione, 2014). Thus, the imbalance between labour market conditions and education at the regional level pushes students and graduates to leave, producing a constant out-migration of better or richer students and a decline in human capital in the origin regions (Faggian & McCann, 2009). Most of the literature on domestic migration, either of students or graduates, aims to identify the factors affecting it.
A wide literature is devoted to the analysis of the patterns and drivers of internal migration at the international level (see, for instance, Green, 2018; Bernard, Bell, & Charles-Edwards, 2014; Bernard, 2017; Prakhov & Bocharova, 2019) and in Italy (Bonifazi, Heins, Licari, & Tucci, 2020; Mencarini, 1996). D'Agostino, Ghellini, and Longobardi (2019) point out the role of contextual factors in students' mobility from the South to the Centre and the North, controlling for individual characteristics. Positive self-selection due to family background, expressed in terms of parental education levels, has been pointed out by Impicciatore and Tosi (2019). Other useful indications on the potential factors affecting domestic mobility patterns are given in Dotti et al. (2013) and in Fratesi and Percoco (2014), where the focus is on selective South-North migration in Italy, driven by a lack of labour mobility and market competition.
Other studies, like Ciriaci (2014) and De Angelis, Mariani, and Torrini (2017), point to the role played in mobility choices by the research and teaching quality of the universities of destination, the parents' education level, and the family's economic conditions. The significant rise in the interregional mobility of Italian students in university education has been documented since the end of the 2000s, in terms of the incidence of movers from southern regions. This is despite the relatively low geographical mobility of students from the Centre and North of Italy.

Research questions
Several factors affect degree mobility, which may be relative to the context of destination or origin, at a macro level, or to the individual, at a micro level. The available data, collected at "university admission", allows for an evaluation of the time trend of the students' mobility phenomenon in the northern area.
Our research questions consider, firstly, the contextual and individual aspects of the students' sociodemographic background. We also look at aspects of the educational experience in secondary school and some characteristics of the university of destination. In particular, our first set of hypotheses concerns some well-known results and can be disentangled in the following way:
H1. Do female students have lower mobility rates?
D'Agostino et al. (2019) show that females have a lower mobility rate, and this evidence is stronger among students whose parents have a lower educational background; see also Türk (2019).
H2. Does the lack of supply of university courses in the area of origin explain degree mobility?
There are some differences in migration behaviour across students living in the North. In our analysis, the differences associated with structural reasons, for instance, the absence of specific fields of study in the residence area, are evaluated both by comparing the mobility trends in each northern region and by including the province of residence in the modelling approach. For instance, Tosi, Impicciatore, and Rettaroli (2019) affirm that students living in regions with a mega-university, offering a wide variety of courses, are less likely to migrate. The role of the supply of courses in some areas of residence is pointed out in D'Agostino et al. (2019) and in Santelli, Scolorato, and Ragozini (2019).
H3. Are students from scientific and classical "Liceo" more inclined to study far from home?
H3 concerns some covariates related to the characteristics of Italian secondary-school experience, as the type of secondary school is associated with socioeconomic background, which affects, in turn, the mobility of students. The type of school attended was included in the modelling approach in Tosi et al. (2019), D'Agostino et al. (2019), and Checchi and Flabbi (2007). These three studies claim that classical and scientific "Liceo" students are generally middle-upper class, and, thus, they are more likely to migrate.
H4. Are the most talented secondary school students more likely to be mobile?
The positive relationship between a student's ability, measured in terms of secondary school marks, and the propensity to migrate, controlling for family background, has been assessed in Tosi et al. (2019). This research shows that top-grade students are more likely to move within the North of Italy.
H5. Do students who studied in private secondary schools present higher rates of migration?
In the Italian educational system, the relationship between the mobility rate and the public-private status of secondary schools is mainly due to the fact that private high schools may be an indirect measure of socioeconomic status. Their positive role on mobility propensity has been assessed in many studies (Donnelly & Gamsu, 2018;Holdsworth, 2009).
H6. Are higher rates of mobile first-year university students registered in private universities?
As discussed in Santelli et al. (2019), in relation to the mobility of students from Campania, one of the reasons for migration is university prestige, which is tied particularly to some private and top-ranked Italian universities.

H7. Are there important differences in mobility rates across fields of studies?
A comparative analysis of the influence of individual and contextual factors across different fields of study is set out in D'Agostino et al. (2019).
H8. Does interregional degree mobility grow at the regional level?
Any trend analysis of the degree of mobility might reveal some specific regional patterns. This kind of regional analysis for the northern Italian university system is not common in the existing literature. In the present analysis, however, the richness of the data source allows for a detailed study of just this question.

Data
The database was obtained from the Italian Ministry of Education (MIUR), and it includes micro-level longitudinal information on university students' careers from 2008 to 2017 (Database Mobysu.It, 2016). The database collects information on students' secondary school and university experience and their general sociodemographics. The dataset contains information on student mobility among universities, which is not available from individual universities; these naturally have data only on their own students. Specifically, the database has about 200 to 300 variables per record (student), with a total number of records ranging from about 270,000 to about 295,000, depending on the cohort. The information collected in the original datasets is organised into cohorts, and for each year of observation, students entering the university system are followed up to July 2019. Some previous studies based on this dataset are available: in particular, the results reported in Attanasio and Enea (2019) assess changes in southern mobility over a decade.
In the present work, we use all available cohorts (2008/09 to 2017/18) to describe the time patterns in mobility and the last available cohort (2017/18) to provide up-to-date evidence on mobility drivers that might prove useful for decision-makers.
In our analysis, we select all first-year students living in northern Italy. We exclude, however, first-year students who are: starting with a master's degree; enrolled in medicine, veterinary, or other 5-year courses; enrolled at telematic universities; or enrolled with more than ten credits from previous university degrees, as they cannot be considered "real" first-year students.
As we are interested in student mobility in northern Italy, the variable of interest is defined by the distance covered by moving first-year students. However, as a measure, distance is difficult to define. Different types of variables could be used: distance in km, the route, time spent on travel, the means of transport. We decided to define mobility on three different levels based on the distance between the region of origin and that of the destination university. The variable of interest, the status, is classified as stayer (who remains in the region of residence), half mover (who moves to a neighbouring region), and mover (who moves to a non-neighbouring region of the North of Italy or to other regions), and it measures increasing levels of mobility: at home, not far from home, and far from home. The last step in the data-cleansing process is the deletion of records with missing values. After this step, the records under study are 861,428: 464,947 females (53.97%) and 396,481 males (46.03%), with a median age of nineteen and without important differences between genders. Most of these first-year students were born in Italy (809,162 or 93.93% of the observed data), and most are Italian citizens (825,225 or 95.80% of the students' population).
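The selection rules above can be read as a simple record filter. The following is a minimal Python sketch of that reading, not the authors' procedure (the paper's analyses are run in R). All field names and category labels are hypothetical, since the database schema is not described here, and 5-year courses are assumed to be flagged as "single_cycle".

```python
from dataclasses import dataclass


@dataclass
class Enrolment:
    # Hypothetical fields: the real variable names are not given in the text.
    lives_in_north: bool        # resident in a northern region
    degree_level: str           # "bachelor", "master", or "single_cycle" (assumed labels)
    field: str                  # e.g. "medicine", "veterinary", "economics"
    telematic_university: bool  # enrolled at a telematic (online) university
    prior_credits: int          # credits recognised from previous university degrees


def is_eligible(e: Enrolment) -> bool:
    """Return True if the record passes the exclusion criteria listed in the text."""
    if not e.lives_in_north:
        return False
    if e.degree_level == "master":
        return False
    # Medicine, veterinary, and other 5-year courses are excluded.
    if e.degree_level == "single_cycle" or e.field in {"medicine", "veterinary"}:
        return False
    if e.telematic_university:
        return False
    # More than ten prior credits: not a "real" first-year student.
    if e.prior_credits > 10:
        return False
    return True
```

Under these assumptions, a student with exactly ten prior credits is kept, because the text excludes only those with more than ten.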
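The three-level status variable can be sketched as a lookup on region adjacency. The Python sketch below is illustrative only. The adjacency map is an assumption based on shared regional borders, with Piedmont and Valle d'Aosta grouped as in the paper's tables, and it may differ from the definition actually used. Any destination outside the North is assumed to count as a move "far from home".

```python
# Assumed adjacency between northern regions (shared borders); illustrative only.
NEIGHBOURS = {
    "Piedmont-VDA": {"Liguria", "Lombardy", "Emilia Romagna"},
    "Liguria": {"Piedmont-VDA", "Emilia Romagna"},
    "Lombardy": {"Piedmont-VDA", "Emilia Romagna", "Veneto", "Trentino Alto Adige"},
    "Trentino Alto Adige": {"Lombardy", "Veneto"},
    "Veneto": {"Lombardy", "Trentino Alto Adige", "Friuli Venezia Giulia", "Emilia Romagna"},
    "Friuli Venezia Giulia": {"Veneto"},
    "Emilia Romagna": {"Piedmont-VDA", "Liguria", "Lombardy", "Veneto"},
}

# Ordinal coding of the outcome: increasing levels of mobility.
STATUS_LEVEL = {"stayer": 1, "half mover": 2, "mover": 3}


def status(origin: str, destination: str) -> str:
    """Classify a first-year student by region of residence and region of destination."""
    if destination == origin:
        return "stayer"
    if destination in NEIGHBOURS[origin]:
        return "half mover"
    # Non-neighbouring northern regions and all other Italian regions.
    return "mover"
```

For example, under this assumed map a Ligurian student enrolling in Lombardy is a mover, while one enrolling in Piedmont is a half mover.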

Methods
The first step in the data analysis is based on simple descriptive statistics for the last cohort and on a trend analysis from 2008 to 2017; the second step is based on a generalized linear model applied to the 2008 and 2017 datasets. The usual descriptive statistics are considered along with Chi-squared testing procedures, while the mobility time trend is examined through graphical representations of the population percentages of moving first-year students conditional on the field of study. The analysis of conditional time trends for other factors included in the study is omitted because their effects are negligible.
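As an illustration of the Chi-squared procedure mentioned above, the following Python sketch computes Pearson's statistic for a two-way table, for instance status by gender. It is a generic textbook computation, not the authors' code, and any table passed to it here would be hypothetical.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a two-way table.

    `table` is a list of rows of observed counts (e.g. rows = gender,
    columns = stayer / half mover / mover).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof
```

With population-sized counts, the statistic grows with the sample size for the same proportions, so even small differences in proportions produce very large values.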
Mobility flows are driven within a multi-regional system, and the representation of geographical complexity of the phenomenon should include terms of "separation" and of "interaction" (as suggested in Rogers, Willekens, Little, & Raymer, 2002) between aspects of "emissiveness" of provinces of origin and of "attractiveness" of regions of destination.
In this study, a generalized linear model with fixed effects is adopted in analysing the "determinants" of mobility. The same analysis is also performed using the Lasso procedure to overcome strong correlations among covariates and obtain further evidence on the significance of the estimated effects. In fact, the Lasso estimation procedure results are not influenced by the p-value issue related to the size of the dataset and discussed in Lin, Lucas, and Shmueli (2013).
The joint effects of the covariates are analysed through an ordered logit model assuming the variable status as the outcome. The definition of mobility adopted is an ordinal variable by construction, as the students choose their path according to a changing attitude to migration. This attitude can be affected by various factors. These include the province of residence, corresponding to level 3 of the NUTS European territorial classification. This is included in the model specification to account for the student's origin following a fixed-effect approach. For the sake of brevity, as the number of the provinces is 47, we do not include this covariate and the corresponding estimates in the descriptive analysis and the "Results" section. We recover this information providing graphical representations in the model estimation results sub-section. For the sake of simplicity, the model is estimated considering only the data referred to the first (2008) and the last (2017) available cohorts. The joint consideration of these two subsets allows for an identification of the change in the mobility propensity observed in the descriptive analysis reported in Figs. 2, 3, 4, and 5. The time effect is measured considering a dummy variable identifying the cohort. The interactions between time and explanatory variables are tested, but these effects are not significant in the classical model selection procedure or in the Lasso penalisation.
The model is specified as a vector generalized linear model (VGLM). The class-specific linear predictor is a linear combination of the covariates with category-specific coefficients, where j identifies the category of the response factor (three ordered categories in our analysis) and X is the set of covariates. A non-parallel model hypothesis is used to capture the different effects of the mobility drivers we observed in the two model equations. A more parsimonious model can be obtained by adopting the partial proportional assumption for the model parameters, but the dataset used for the analysis is large enough to dismiss this kind of hypothesis.
The model specification is, then, integrated considering the type of model family for the contrasts of the ordinal outcome variable. The adopted formulation is based on the adjacent categories' hypothesis. The assumed link function is the logit one which defines the log odds of the change of status (from stayer, j = 1, to half mover, j = 2, and from half mover to mover, j = 3).
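In generic notation, a non-parallel adjacent-categories logit model with three ordered categories can be written as follows. The symbols $\pi_j$, $\beta_{0j}$, and $\beta_j$ are introduced here for illustration and are not taken from the text. The form shown is the standard one for this model family.

```latex
% Adjacent-categories logits, non-parallel: one coefficient vector per equation
\log\frac{\pi_{j+1}(X)}{\pi_{j}(X)} = \eta_j = \beta_{0j} + X^{\top}\beta_j,
\qquad j = 1, 2, \qquad \pi_j(X) = P(Y = j \mid X).

% Implied category probabilities (stayer, half mover, mover)
\pi_1 = \frac{1}{1 + e^{\eta_1} + e^{\eta_1 + \eta_2}}, \qquad
\pi_2 = e^{\eta_1}\,\pi_1, \qquad
\pi_3 = e^{\eta_1 + \eta_2}\,\pi_1 .
```

A parallel specification would impose $\beta_1 = \beta_2$. The non-parallel form lets each covariate act differently on the two transitions.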
Given the large dataset used, which is a population, the p-values do not have the usual classical inferential meaning. The analysis includes the results of a regularisation approach (Lasso) for evaluating the size of the estimated parameters. As expected, some of the significant coefficients are negligible within the Lasso regression framework. We decide to use the penalised regression approach as a robust method for measuring the intensity of the relationships identified with the descriptive analyses. Lasso regression is also robust in the presence of collinearity among regressors, which can be a source of bias in the classical ordinal model estimation obtained with the VGLM approach. Under the non-parallel assumption, Lasso regularisation considers the penalised likelihood function, based on an additive penalty term in which λ determines the degree to which the coefficients (B) are shrunk toward zero, and α, the weight that mixes the L1 and L2 penalties, is fixed at 1 to obtain the Lasso.
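To make the role of λ and α concrete, the following Python sketch evaluates an elastic-net penalty and the soft-thresholding operator, which solves the one-dimensional Lasso problem. It assumes the usual elastic-net form, λ Σ (α|b| + ½(1 − α)b²), and is not the authors' implementation.

```python
def elastic_net_penalty(coefs, lam, alpha=1.0):
    """Elastic-net penalty; with alpha = 1 it reduces to the Lasso (L1) penalty."""
    return lam * sum(alpha * abs(b) + 0.5 * (1.0 - alpha) * b * b for b in coefs)


def soft_threshold(b, t):
    """Minimiser of 0.5 * (x - b)**2 + t * abs(x): shrinks b toward zero by t."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    # Coefficients smaller than the threshold are set exactly to zero.
    return 0.0
```

The second function shows why small but statistically significant coefficients can vanish under the Lasso: any estimate whose magnitude falls below the threshold is set to zero, and larger ones are shrunk by the same amount.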
All the analyses are developed in R (R Core Team, 2020). The core functions are used for the basic analysis of the relationships. The model estimation is obtained using the VGAM library in R (Yee, 2015), and it is estimated considering the fixed effect specification and the Lasso regression approach. The library ordinalNet (Wurm, Rathouz, and Hanlon, 2017) is used to this end.

Results
The structure of this section mirrors that in the "Data and methods" section. Thus, we first report a preliminary analysis of student mobility. The second part of the section is devoted to the model estimation results.

Preliminary analysis of students' mobility
This sub-section is divided into three parts: general data on students' mobility, a descriptive analysis of the response status conditional on the covariates, and a trend analysis of the stayers, half movers, and movers.

General data on students' mobility
The distribution of first-year students in the northern regions is reported in Table 1, classified by origin and by destination region. The last row of Table 1 reports percentages of enroled first-year students in each region coming from all other northern regions.
The origin/destination array allows for a description of the directions of the mobility of northern first-year students. Lombardy, Veneto, and Emilia Romagna are the regions with the highest number of universities, and they have the lowest proportions of outgoing movers. These three regions are followed by Piedmont, while the highest levels of half movers and movers come from Trentino Alto Adige. Mobility towards neighbouring regions affects Veneto, Liguria, and Piedmont with the Valle d'Aosta (VDA). In terms of destination, Lombardy seems to be the preferred region, with the highest proportion of migration from other regions (49%, 69%, and 27% of students migrating from, respectively, Emilia Romagna, Piedmont and VDA, and Liguria). However, its proportion of incoming students from other regions is the lowest, revealing how the share of first-year students in Lombardy coming from other regions is largely outweighed by residents. On the other hand, Friuli Venezia Giulia and Emilia Romagna show the highest proportions of students from other northern regions. Mobility paths are further disentangled by the pairwise balances, reported in Table 2, where each element (i, j) is the difference between incoming students in region i from region j and outgoing residents of region i towards region j. The last column reports the total balance of each region of origin: Liguria, Trentino Alto Adige, Piedmont with VDA, and Veneto have negative balances, while Emilia Romagna, Lombardy, and Friuli Venezia Giulia present positive balances.
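The pairwise balance described above can be sketched in a few lines of Python. The function names are hypothetical and the sketch is not the authors' code: `flows[o][d]` is assumed to hold the number of students resident in region `o` who enrol in region `d`.

```python
def pairwise_balance(flows):
    """balance[i][j] = incoming to i from j minus outgoing from i to j."""
    regions = list(flows)
    return {
        i: {j: flows[j][i] - flows[i][j] for j in regions if j != i}
        for i in regions
    }


def total_balance(balance):
    """Row sums of the pairwise balances: the net gain or loss of each region."""
    return {i: sum(row.values()) for i, row in balance.items()}
```

By construction the pairwise balances are antisymmetric (the balance of i with j is the negative of the balance of j with i), so the total balances across a closed set of regions sum to zero.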

Descriptive analysis of the response status conditional on the covariates
Descriptive statistics and graphics are presented in two parts, according to the hypotheses specified in the "Research questions" section. The first part (Table 3) includes gender (H1), the type of secondary school (H3), the secondary school grade (H4), the school ownership (H5), the university ownership (H6), and the field of studies (H7). In contrast, the second part (Table 4 and Fig. 1) regards the students' region of residence (H2) and their destination.
The marginal distribution in Table 3 ("Total North Italy") describes the composition of the 2017 dataset in terms of the characteristics of northern first-year students. Compared to the Italian population of first-year students (Rapporto ANVUR, 2018), it shows a lower proportion of women (55.1% in Italy) and more students coming from scientific and classical "Liceo" (34.2% against 11.1% in Italy). These dissimilarities may be partially due to the selection applied in our dataset.
The numbers in Table 3 can give some general preliminary insights. The most interesting result is the higher propensity of "Liceo" students to move, especially those from a classical "Liceo". This is due to the Italian secondary school structure, in which there are big differences in the typology of high schools, and in which these typologies may be a proxy of the socioeconomic background of students (Contini & Scagni, 2010). In other words, classical and scientific "Liceo" students are more inclined to move because of their socioeconomic background. Moreover, we can observe that females are slightly more inclined to move than males; students with high marks are more inclined to move; private universities present higher proportions of half movers and movers than public universities; and students doing health degrees are less "mobile" than others. In 2017, 13.6% of movers were enrolled in private universities, which registered higher proportions of incoming movers (3.6%). This result may be linked to the attractiveness of top-ranked institutions in Lombardy, which account for 68% of the northern movers enrolled at private universities. This feature may also be linked to the higher percentage of movers enrolled in social-area courses, given the structure of these private universities. Table 4 points out the proportions of mobility levels and the marginal distributions by region of residence and destination.
Table 1 First-year students' distribution by region of residence, region of destination, and status (stayer, half mover, and mover) (2008-2017)
First of all, it is very important to notice that movers are on average less than 2%. However, the percentages rise above 10% in Trentino Alto Adige and reach about 9% in Liguria and Friuli Venezia Giulia, while half movers are between 10% and 14% in all the regions except Veneto and Trentino Alto Adige, where these numbers are much higher. As expected, all these proportions are strongly connected to the geography of the North of Italy and the presence of three big university cities (Milan, Turin, and Bologna). The regions in which these cities are located have very high percentages of stayers, while the other important university city, Padua, though smaller than the other ones, does not absorb all the students in its region because of its proximity to Trentino and Friuli.
The attractiveness of the northern regions (H2), in general, and of some big universities or urban areas is furthermore depicted in Fig. 1. The left panel (A) shows all flows, including students attending a university in their residence region, and the right panel (B) represents the same phenomenon excluding non-migrating students. These plots are derived considering the full dataset of first-year students in the 2017 cohort, including the mobility towards "other" Italian regions.

Trend analysis of the stayers, half movers, and movers
The final analysis concerns the general time pattern of students' degree mobility ratios in the northern regions, introduced in Figs. 2 and 3. Panel A in Fig. 2 depicts the increasing proportion of moving northern first-year students, with a higher positive slope in the last years (2014)(2015)(2016)(2017). That feature is associated with a general growth of young northerners entering university education in the last 5 years of observation, as reported in Panel B in Fig. 2, after the decreasing trend in the previous period. The pattern of mobile first-year students in the northern area seems to be stable during the first period while rising in the last four surveyed years. The growth indexes for the movers, half movers, and stayers are computed assuming 2008 as the baseline. The lines show an increasing growth rate in half movers and movers, especially from 2013 to 2014, while during the period 2010-2013 movers experienced negative growth rates. The negative growth rates of stayers during the years before 2016 seem to be balanced by positive rates of movers and half movers during 2011-2014. Finally, all rates show an increase in the total population of first-year students, particularly since 2015.  Fig. 3 Rates of total mobility (movers plus half movers) by field of study (2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017) If we consider the time patterns conditional to the field of study and the region of origin, mobility increases for all the fields of studies, as reported in Fig. 3, with the highest increase due to students enroling in courses of the "humanities" field (from 15.8% in 2008 to 20.2% in 2017). The lowest are in "health" degrees, which do not include the degrees in "Medicine". Rates of mobility in the "scientific" and "social" fields have parallel trajectories.
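A growth index relative to the 2008 baseline can be sketched as below. The exact formula used for the figures is not stated, so this Python sketch assumes a percentage change with respect to the baseline count, under which negative values mean fewer students than in 2008. The function name is hypothetical.

```python
def growth_rate(counts, base_year=2008):
    """Percentage change of each yearly count with respect to the baseline year.

    `counts` maps year -> number of students in a sub-population
    (e.g. movers, half movers, or stayers).
    """
    base = counts[base_year]
    return {year: 100.0 * (n - base) / base for year, n in sorted(counts.items())}
```

Applied separately to movers, half movers, and stayers, this yields one line per sub-population, all starting from zero in the baseline year.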
In Fig. 4, the evolution of the first-year sub-population mobility levels is described separately for each region of residence. The temporal evolution of the growth indexes of the sub-populations differs substantially from region to region. The growth rates of movers increase mainly in Friuli Venezia Giulia and Liguria, but Emilia Romagna also experienced a sudden rise from 2014. The growth rate of mobility towards neighbouring regions increases in almost all the northern regions except in Piedmont with VDA and in Trentino Alto Adige. Non-mobile first-year students present a stable growth rate overall, except in Trentino Alto Adige, where all growth rates are stationary or decreasing. This evidence suggests that some regions face a general reduction in university enrolment among the resident population. Finally, this longitudinal focus shows that Lombardy and Piedmont are the only northern regions where the growth rates of movers are negative at least until 2015.

Model estimation results
The estimation results of the model, defined in the "Methods" section, are reported in Table 5. The iteratively reweighted least squares (IRLS) estimation results are compared with those of the Lasso regression. The comparison shows that the two sets of results largely overlap, although the classical estimates are sometimes larger in magnitude than the Lasso ones. More precisely, the Lasso estimates have a reduced size compared to the classical ones, because of the shrinkage induced by the penalisation. The Lasso penalisation has a more prominent effect on the model for the transition from half mover to mover, where a larger set of estimated parameters is constrained to zero. With respect to the univariate analysis reported in the "Preliminary analysis of students' mobility" section, the model includes age as a dichotomous variable (≤20; >20).
Table 5 Regression model. Results of the ordinal logit model and Lasso (with effects of region of destination and province of residence, summarised in the "Conclusions" section)
The model summarises the joint effects of all the factors under consideration, including the region of destination, as a proxy of attractiveness, and the province of origin (summarised graphically in the following section) as a factor partially explaining mobility propensity. The estimated effects are generally coherent between the two components of the ordinal logit model. The results with opposite signs regard only Lombardy and Piedmont + VDA. In general, the estimated coefficients are significantly different from zero in the first regression, while the p-values in the second regression show a lower significance for the considered factors. Aside from the typical shrinkage effect, the results of the Lasso regression are consistent with the classical estimates of the first regression, while some effects that are significant in the second regression model are negligible for the Lasso approach.
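The shrinkage effect mentioned above can be illustrated with a minimal sketch. In the simplest textbook setting (linear regression with an orthonormal design and objective 0.5 * RSS + lam * |beta|), the Lasso estimate of each coefficient is the soft-thresholded unpenalised estimate. The penalised logit model has no such closed form, so this is only a qualitative illustration of why Lasso coefficients are smaller in magnitude and why weak effects are set exactly to zero. The coefficient values and the penalty `lam` are hypothetical.

```python
def soft_threshold(beta, lam):
    """Lasso solution for a single coefficient under an orthonormal design.

    The unpenalised estimate is pulled towards zero by lam, and set exactly
    to zero when its magnitude does not exceed lam.
    """
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


# Hypothetical unpenalised estimates, for illustration only.
unpenalised = {"x1": 0.80, "x2": -0.30, "x3": 0.05}
lam = 0.10  # hypothetical penalty level
penalised = {name: soft_threshold(b, lam) for name, b in unpenalised.items()}
```

In this toy example the two larger coefficients keep their sign but lose 0.10 in magnitude, while the weakest one is removed, which mirrors the pattern described for the second model component.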
The results of the model estimation can be finally studied for assessing the relationships hypothesised in the "Research questions" section.

Gender and age (H1)
In detail, our results on the gender difference in mobility propensity seem to depart from the previous literature. In fact, in contrast with the results in D'Agostino et al. (2019), northern females present a higher probability in the S vs HM transition, while gender is not significant in the HM vs M transition. Age shows an unusual result, as the students who are older than nineteen are slightly less mobile.

Area of origin (H2)
The association between mobility status and the area of origin (region or province) is significant: most of the coefficients of the province of residence are significantly different from zero (see Fig. 5). Observing the differences among the regions of residence (Table 4), we can identify three regions with a large proportion of "movers" (Trentino Alto Adige, Friuli Venezia Giulia, and Liguria), which may be explained by their geographical location and the absence of large universities there. All the regions' parameters have the same sign in both equations, save Piedmont and Lombardy, whose coefficients are positive in the HM vs M equation. This is due to the presence of large universities in these regions. Moreover, by including the province of residence in the model as fixed effects, it is possible to isolate the "geographical" aspect, which is crucial in our analysis. In particular, the analysis of the two plots in Fig. 5 (where Bologna is the baseline) highlights the peculiarity of the result concerning the provinces: most of the provinces with a large (low) coefficient in the model for the transition S vs HM show a low (large) coefficient in the model for the transition HM vs M. This result means that areas with a high probability of mobility to adjacent regions show a low probability of mobility to the other regions and vice versa. Interestingly, most of the students from the provinces of Mantova, Belluno, Novara, and Alessandria move to adjacent regions. In those cases, the closest university is in a neighbouring region. The graphic description of mobility in the provinces of origin summarises the differential propensity to move nearby or further away for the first-year students living in each northern province.
Secondary school track (H3, H4, H5)
The influence of school experience, in terms of type of school track attended, school ownership, and student performance, is in line with previous results in Table 3.
The "Other Liceo" category collects different tracks, mainly represented by the Human Sciences Liceo, which is not residual but central in terms of number of students enrolled. Thus, this Liceo is a useful reference category for highlighting the effects of the traditional Liceo types, such as the scientific and classical ones.
In the North of Italy, students coming from a classical "Liceo" migrate more than others; in fact, the log odds for classical "Liceo" are positive in both model components. They are followed by all the other "Liceo", including the scientific ones, while the other types of school have low percentages of half movers and movers (H3).

Secondary school grade
Students' performance represents another relevant driver of mobility propensity (Table 3), confirming the "self-selected mobility" of top-mark students (H4). The high school mark, ranging from 60 to 100, is marginally associated with mobility, as the movers generally report higher grades. Model estimates are in line with the increase in percentages of half movers and movers from 13.7% and 1.5% (proportions with a grade lower than 90) to 19.8% and 2.9% (proportions with grade 100 or 100 cum laude). Moreover, the log odds estimated in the model for the half mover probability differ from the corresponding ones for movers. In the latter case, the Lasso estimates show that only the effect connected with the top marks is statistically significant.

High school ownership
Hypothesis H5 is supported by the models. The log odds for private secondary schools are positive and statistically significant in both equations. The related interpretation should consider that private secondary schools are relatively few in Italy and are not always linked to higher quality educational tracks. They are generally tied to the social and economic conditions of the family.

University ownership and field of studies (H6, H7)
In Table 3, we observe that the proportions of movers and half movers (3.6% and 18.4%) in private northern universities are higher than those in public ones (1.4% and 13.8%). Hypothesis H6 is only partially supported, as the estimated log odds are statistically significant only in the first equation. In Italy, there are some private universities (in the North, they are concentrated in Milan) with a recognised brand that are often top-ranked at the international level. A similar result has been found in Santelli et al. (2016). At the same time, percentages of movers are similar for the "Sciences", the "Social Sciences", and the "Humanities". In fact, the corresponding estimates show positive and close log odds (higher in the equation HM vs M). As expected, the propensity to move for "Science", "Social", and "Humanities" first-year students is higher than for those in "Health" degrees, as the 3-year degree courses in health sciences are offered in every region. The "Health" field of study has been chosen as the reference category because it is geographically homogeneous, with a low propensity to mobility. Nevertheless, the observed differences are not relevant from a statistical point of view.

Region of destination
The inclusion of the region of destination covariate brings interesting results. The size of the log odds corresponding to the region of destination is very relevant: Emilia Romagna shows the highest attractiveness for half movers (as it borders several regions); Piedmont and Lombardy are the only regions that show an increased probability of receiving students from non-adjacent regions. In substance, the log odds of the regions of destination can be interpreted as an attractiveness index. Finally, the model identifies negative log odds for younger first-year university students. The propensity to move is higher in the older first-year group, once we account for the other covariates, but only the coefficient estimated for the first component is significant.
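To make the reading of log odds as an attractiveness index concrete, the sketch below converts log odds into odds ratios with respect to the reference category and ranks the destinations accordingly. The region labels and coefficient values are hypothetical placeholders, not the estimates in Table 5.

```python
import math


def odds_ratio(log_odds):
    """Odds ratio relative to the reference category (exp of the log odds)."""
    return math.exp(log_odds)


# Hypothetical log odds for three destination regions, for illustration only.
log_odds_by_region = {"Region A": 1.2, "Region B": 0.4, "Region C": -0.3}

# Rank destinations from most to least attractive.
ranking = sorted(log_odds_by_region, key=log_odds_by_region.get, reverse=True)
ratios = {region: odds_ratio(b) for region, b in log_odds_by_region.items()}
```

A log odds of zero corresponds to an odds ratio of one, that is, the same attractiveness as the reference region; positive values indicate a more attractive destination and negative values a less attractive one, other covariates being equal.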

Trends of mobility (H8)
The evidence on trend patterns of mobility is described in the "Preliminary analysis of students' mobility" section. The model evaluates the evolution of the phenomenon by comparing 2008 and 2017 data. The time effect is positive and significant only in the first component (S vs HM).

Conclusions
This paper provides, for the first time, a description of degree mobility among first-year northern Italian university students, and analyses some of the factors in their moving habits and the temporal evolution of the phenomenon from 2008 to 2017. This explorative study has made it possible to disentangle relevant information on the students' mobility patterns among the northern regions, which have important economic implications for local territories. We focus on two mobility levels: the first towards an adjacent region and the second to a non-adjacent region. The results show that most students choose a local university, which can also be located in a neighbouring region. Milan and Turin, the two big Italian northern cities, do not follow this rule: they attract students from all the northern regions. It seems that the choices of northern first-year students are influenced more by the "investment-perspective" than by the "consumption-perspective" (Van Bouwel & Veugelers, 2013), as Milan and Turin offer better economic conditions and better job opportunities. Another effect, which is difficult to control for, is the attractiveness of big cities, as Milan and Turin are the only large cities in the North of Italy. Moreover, the inclusion in the model of the province of residence as fixed effects allows for an easy summary of the geographical territories from which students tend to migrate. A measure of the "emissiveness" of provinces of origin and of the "attractiveness" of regions of destination is given by the parameters of the logistic models. A further investigation of the interaction between origin and destination of students could be interesting, but the number of interaction parameters is prohibitive, even from a computational point of view.
It is interesting to notice how flows of movers are similar for the "Sciences", "Social Sciences", and "Humanities", while, as expected, the 3-year degree courses in "Health" do not show mobility. It appears that mover patterns are driven by the presence of some "brand" and/or prestigious universities, sometimes private, spread equally over the "Sciences", "Social Sciences", and the "Humanities". The individual variable most affecting degree mobility among the northern regions is education in a classical "Liceo", which proves a useful proxy for the middle-upper social classes.
This explorative research is limited to the mobility of first-year students in the northern regions; however, the outcomes of our analysis provide an interesting insight into mobility within the whole northern area, which has been explored in more detail by splitting the analysis into short-distance and long-distance movers. Bologna, Milan, and Turin are the preferred northern cities of destination for Italian students as a whole (Attanasio & Priulla, 2020), but our study points out that Turin is not attractive to northern first-year students. Further analyses of the relationships with mobility from the bachelor to the master level might prove of interest in the future. These future studies could usefully be flanked by a survey aimed at "discovering" student expectations and perspectives.
Funding This work is supported by Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR), PRIN 2017. "From high school to job placement: micro data life course analysis of university student mobility and its impact on the Italian North-South divide". [grant n. 2017HBTK5P]. P.I. Massimo Attanasio.
Availability of data and materials Data will not be shared because they are part of a funded research involving a non-diffusion policy.

Declarations
Competing interests