Ravenstein Revisited: The Analysis of Migration, Then and Now

In 1876, 1885 and 1889, Ernst Ravenstein, an Anglo-German geographer, published papers on internal and international migration in Britain, Europe and North America. He generalized his findings as “laws of migration”, which have informed subsequent migration research. This paper aims to compare Ravenstein’s approach to investigating migration with how researchers have studied the phenomenon more recently. Ravenstein used lifetime migrant tables for counties from the 1871 and 1881 censuses of the British Isles. Data on lifetime migrants are still routinely collected but, because of the indeterminate time interval, they are rarely used to study internal migration. Today, internal migration is measured using alternative sources: fixed interval migrant data from censuses and surveys, continuous records of migrations from registers, and “big data” from telecommunications and internet companies.
Ravenstein described and mapped county-level lifetime migration patterns through the concepts of “absorption” and “dispersion”, using migration rates and net balances. Recently, researchers have used lifetime migrant stocks from consecutive censuses to estimate country to country flows for the world. In the last decade, an Australian-led team has built an international database of internal migration flow data and summary measures. Methods were developed to investigate the modifiable areal unit problem (MAUP), in order to design summary internal migration measures comparable across countries. Indicators of internal migration were produced for countries covering 80 percent of the world’s population.
Ravenstein observed that most migrants moved only short distances, anticipating the development of “gravity” models of migration. Recent studies have calibrated the relationship between migration and distance using gravity models. For mid-19th century Britain, Ravenstein found the dominant direction of internal migration to be towards the “centres of commerce and industry”. Urbanization is still the dominant flow direction in most countries, though, late in the process, suburbanization, counter-urbanization and re-urbanization can occur. Ravenstein focussed on place-specific migration, whereas today researchers describe migration flows using area typologies, seeking spatial generality. Ravenstein said little about migrant attributes except that women migrated more than men. In recent decades, the behaviour of migrants by age, sex, education, ethnicity, social class and partnership status has been studied intensively, using microdata from censuses and surveys.
Knowledge about processes influencing internal and international migration has rarely been built into demographic projections. Scenarios that link migration with sub-national or national inequalities and with climate or environmental change are influencing the design of policies to reduce inequalities or slow global warming. 
* This article belongs to a special issue on “Internal Migration as a Driver of Regional Population Change in Europe: Updating Ravenstein”.


Introduction
In the eighth and ninth decades of the 19th century, Ernst Georg Ravenstein, an Anglo-German cartographer, published three papers on migration. The first two used data from the 1871 and 1881 Censuses to describe the patterns of internal migration across the British Isles (Ravenstein 1876, 1885). In the third paper, he extended his analysis to Europe, the United States and Canada, examining "the foreign element" in their populations (Ravenstein 1889). These papers continue to influence research into internal and international migration. The aim of this paper is to compare the data, methods and results of Ravenstein's work with research on migration in recent decades, to show both the connections and differences between his ideas and contemporary approaches. This paper is therefore a review of the field of internal migration with excursions into international migration. The raw materials are Ravenstein's papers and a selection of papers and books published in recent decades on internal migration, which connect to themes introduced in the 19th century work. Ravenstein's most cited paper, published in 1885, has the title "Laws of Migration".¹ He presented the paper at the Statistical Society of London (later the Royal Statistical Society), followed by a lengthy question and answer session (Ravenstein 1885: 228-235). A key outcome of the "Discussion of Mr. Ravenstein's Paper" was that both author and discussants agreed that the "laws" were not immutable rules but rather "empirical generalizations" specific to the time and place of the evidence. Table 1 presents these "Laws of Migration" from the 1885 paper. Tables A1 to A3 in the Appendix report interpretations of the "laws" by later scholars.
Ravenstein's work on migration has been expertly reviewed and interpreted by subsequent scholars (Grigg 1977; Dorigo/Tobler 1983; Greenwood 2019), so remarks here will be brief. In his 1876 paper, Ravenstein covers much the same ground as in his 1885 paper on the British Isles, but without the key ingredient that has grabbed the attention of later scholars, his list of seven Laws of Migration (Table 1). Greenwood (2019: 271) extracts five key points from Ravenstein (1876), which anticipate his 1885 Laws of Migration (Table A1). Ravenstein's 1885 list (Table 1) mixes elaborate statements of the regularity in which more than one proposition is made (Laws 1885-1 and 1885-2)² with pithy summary phrases (Laws 1885-3 to 1885-7). Grigg (1977) extends the list of Ravenstein laws to eleven (Table A2) using short sentences or phrases. Dorigo and Tobler (1983) interpret Ravenstein's statements as precursors of later generalizations about push and pull factors influencing migration flows (Table A3). They suggest that one of Ravenstein's Laws (1983-1 in Table A3) anticipates the scale and zonation dimensions of the Modifiable Areal Unit Problem or MAUP (Openshaw 1983). Another of Ravenstein's Laws (1983-2 in Table A3), Dorigo and Tobler propose, anticipates the hypothesis that previous migrations by "friends and families" channel later migration. This hypothesis was investigated by the Swedish geographer Hägerstrand (1957), using detailed longitudinal migration records. A third suggestion is that the configuration of the transportation system is likely to play a significant role in channelling migration flows (Ravenstein's Law 1983-3, Table A3). So, we have the seven original Laws proposed by Ravenstein plus four added by Grigg and three by Dorigo and Tobler, fourteen in total. Ideally, this impressive number of empirical generalizations should be tested in any study of new internal migration data. Like most researchers, we select a sub-set to compare the approaches of Ravenstein then and migration scholars now.

Tab. 1: Ravenstein's "Laws of Migration" in his 1885 paper
Year-Law# | Text | Pages
1885-1 | We have already proved that the great body of our migrants only proceed a short distance, and that there takes place consequently a universal shifting or displacement of the population, which produces "currents of migration" setting in the direction of the great centres of commerce and industry which absorb the migrants. In forming an estimate of this displacement we must take into account the number of natives of each county which furnishes the migrants, as also the population of the towns or districts which absorb them. | p. 198
1885-2 | It is the natural outcome of this movement of migration, limited in range, but universal throughout the country, that the process of absorption would go on in the following manner: The inhabitants of the country immediately surrounding a town of rapid growth, flock into it; the gaps thus left in the rural population are filled up by migrants from more remote districts, until the attractive force of one of our rapidly growing cities makes its influence felt, step by step, to the most remote corner of the kingdom. Migrants enumerated in a certain centre of absorption will consequently grow less with the distance proportionately to the native population which furnishes them, and a map exhibiting by tints the recruiting process of any town ought clearly to demonstrate this fact. That this is actually the case will be found by referring to maps 3, 4, 8 and 9. These maps show at the same time that facilities of communication may frequently countervail the disadvantages of distance. | p. 198-9
1885-3 | The process of dispersion is the inverse of that of absorption and exhibits similar features. | p. 199
1885-4 | Each main current of migration produces a compensating counter-current. | p. 199
1885-5 | Migrants proceeding long distances generally go by preference to one of the centres of commerce or industry. | p. 199
1885-6 | The natives of towns are less migratory than those of the rural parts of the country. | p. 199
1885-7 | Females are more migratory than males. | p. 199
Source: Ravenstein (1885: 198-199)

We structure the rest of the paper as follows. Section 2 considers the wide set of definitions of migration used in data collection, presenting them in a common graphical framework.
Section 3 discusses how "raw" data may be used to estimate "refined" indicators of migration behaviour and how gaps in empirical information can be filled. Section 4 reports on recent research on internal migration patterns, tracking recent work linked to Ravenstein's generalizations. The final section sets out an agenda for future research aimed at improving our knowledge of both internal and international migration.

Data: Definitions and sources then and now
To describe and understand migration requires robust definitions of the phenomenon and reliable quantitative and qualitative data based on one or more of the definitions. In this section of the paper, we review definitions of migration and offer a framework that links them.

What is migration?
For Ravenstein, migration was the displacement of people between their place of birth and their place of enumeration at the census, because those were the only comprehensive records available from 19th century censuses. In the contemporary world we need to distinguish between different types of migration based on national or local boundaries. If migration takes place between countries, it should properly be labelled "international migration". If migration takes place within a country, it is usually labelled "internal" or "domestic" migration. Some scholars and some official statistics agencies distinguish between local mobility (migrations within local areas) and inter-area migration. Ravenstein wondered about how urban and rural migration differed, but the 1871 and 1881 censuses provided no direct information about migration within cities. We consider the distinction between local mobility and inter-area migration arbitrary because it depends on local government boundaries, which in many countries are subject to periodic reorganization, destroying temporal comparability. So, the ideal data set should record the geo-coded address of origin and the geo-coded address of destination so that tables of flows for any spatial system or scale can be created. Several difficulties arise which mean this ideal is rarely achieved. The first is that not everyone agrees with these distinctions. Ravenstein did not distinguish between internal and international migration, treating, for example, the Irish element in the population of England and Wales in a similar way to the foreign element. Papers that discuss international migration often use "migration" without specifying that this is only convenient shorthand. The same occurs in papers about internal migration. The second difficulty is that official statistical agencies are reluctant to publish migration information for flows between small areas due to uncertainty or because of the risk of disclosure.
Official statistical offices (such as the Australian Bureau of Statistics) often perturb the counts in published census tables, which creates inconsistency across tables at the same spatial scale and across spatial scales for the same tables. A third difficulty arises because it is hard to capture geography in a census or survey question. Most statistical agencies rely on a write-in answer, but people's knowledge of places may be deficient. The cost involved in converting a write-in place name answer to a geocoded reference is considerable. So, in 1961, when the General Register Office of England and Wales decided to introduce a question on migration in the year before the census, the tabulation of lifetime migration was restricted to "home" countries within the United Kingdom (UK) and foreign countries, dropping the birthplace by county tabulation used in censuses from 1871 to 1951 (Friedlander/Roshier 1966).

Households or individuals?
Human migration is defined as the relocation between dwellings of groups of individuals known as households over time and space. People migrate in small groups (households), which are either a nuclear family unit or a couple or a single person or several unrelated individuals, who share some common aspect of life together such as meals or housework or a set of relationships (husband-wife, child-parent, student-fellow student). Household numbers and compositions alter as individuals join to form households or leave households to create or join new ones. Migration is associated with both household formation and dissolution. As activities shared among members change, so do the precise definitions adopted in census or survey questionnaires. An alternative definition, which is necessary when using administrative records, defines a household as the group of people who live at the same address. However, migration of households is rarely studied, except in the Netherlands (Van Imhoff/Keilman 1992), because complex changes take place at the same time in household membership (e.g. separation of partners, death of a spouse, birth of a new child). Most research using aggregate data on migration focusses on individuals, whose characteristics are either fixed (e.g. place of birth), stable (e.g. gender) or change systematically (e.g. age). However, there is a large and expanding literature that analyses family and household mobility using household survey data. Mulder (2018) provides a comprehensive review of the field to which she has contributed many rigorous and insightful studies.

Migrant attributes
The literature which investigates aggregate migration flows focuses most attention on age and sex differentials. However, many more attributes are important in migration processes. These include health status (Wallace/Kulu 2014), labour force status (Marois et al. 2019a), family status (Mulder 2018), educational attainment (Bernard/Bell 2018), motivations (Coulter et al. 2011) and ethnic status (Darlington-Pollock et al. 2019). Such attributes have been investigated using census microdata or survey data. Note that migrant attributes tend to be measured at the end of the migration interval and so their impacts are not precisely measured unless longitudinal microdata are available (e.g. from the population registers of Nordic countries or the census-based longitudinal studies of UK devolved administrations).

Usual residence
The term "usual residence" refers to the dwelling unit in which people normally reside (where they sleep, eat and do household work). Sometimes a detailed definition is provided in the official questionnaire, such as the place where people spend most of their nights. However, working members of a household may spend more time at a distant work location than at the residence they regard as "usual". People who use more than one residence need to choose which to report as their usual residence. Administrative registers, ranging from comprehensive population lists to registrations for local or national tax purposes, require reporting of a de jure (legal) residence, though many allow registration of temporary residences.

A duration criterion for migration or an instantaneity?
To distinguish between usual and temporary residence, statistical offices employ length of residence as a criterion. For example, people who cross international boundaries are asked how long they will be staying at their destinations. The United Nations specifies that countries use a 12-month duration criterion to distinguish between long-term and short-term migrants, with the exact method of gathering this information left to national statistical offices. This may involve gathering electronic information on all entries and exits, administering a sample survey at entry or exit, or deriving the information from registers. The duration criterion is linked to visa regulations governing visits, temporary and permanent migration. When the duration of stay is short, usually up to 3 months but up to 6 months in many countries, the entrant is classified as a visitor. For durations from 3 or 6 months up to 12 months, the entrant is classified as a temporary or short-term immigrant.
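The duration thresholds described above can be expressed as a simple classification rule. The following is a minimal sketch only: the function name, the category labels and the adjustable visitor threshold are assumptions made for illustration and do not reproduce the procedure of any particular statistical office.

```python
def classify_entrant(duration_months, visitor_limit=3.0):
    """Classify an international entrant by duration of stay in months.

    visitor_limit is an assumed parameter: 3 months in the usual case,
    6 months in countries that use the longer threshold.
    """
    if duration_months < 0:
        raise ValueError("duration must be non-negative")
    if duration_months < visitor_limit:
        return "visitor"
    if duration_months < 12:
        return "short-term migrant"
    # 12 months or more: the long-term criterion recommended by the UN
    return "long-term migrant"


for months in (1, 3, 11, 12, 30):
    print(months, classify_entrant(months))
```

Passing visitor_limit=6 reproduces the variant in which stays of up to six months are treated as visits.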

Migration and space
Internal migration measurement is particularly sensitive to the number, size and shape of regions, as migration is recorded when people cross region boundaries. If a national territory is divided into a few regions, a lower count of migrants or migrations will be recorded than if a larger number of regions, necessarily smaller, is used. The exact shape and borders of regions will also affect the counts of migrants or migrations. This dependence of migration measures on the spatial systems used in measurement hampers comparison of internal migration across both countries and time periods. Bell and colleagues (Bell et al. 2002, 2015a/b, 2018; Bell/Muhidin 2011; Stillwell et al. 2014, 2016; Rees et al. 2017a; IMAGE Project 2020a)³ have pioneered the development of system-wide measures of internal migration, in which the effect of size and zonation of areal units is established and stable summary measures are derived for international comparisons of internal migration. Stillwell et al. (2018) use zone design software that assembles basic areal units into larger zones (IMAGE Project 2020b) to investigate the effects of scale and zone configuration on migration indicators and distance decay parameters. The main result of this research is that, above a threshold number of regions, many system-wide measures remain reasonably constant. The test data set employed by Stillwell et al. (2018) consists of migrant flows between 404 Local Authority Districts in the UK derived from the Special Migration Statistics from the UK's 2011 Census (Duke-Williams et al. 2018). Note that the IMAGE software does not "solve" the MAUP. Rather, it allows users to explore the effect of spatial framework design on their own migration system and select scales and zone systems which can be compared between countries or time periods.
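The scale and zonation effects can be illustrated with a toy example. The sketch below is not the IMAGE software: the six basic areal units, the flow counts and the four zone systems are all invented for illustration. It simply counts the migrants who cross a zone boundary under each zone system.

```python
# Hypothetical migrant counts between six basic areal units, keyed by
# (origin unit, destination unit). All numbers are invented.
flows = {
    (0, 1): 50, (1, 0): 40, (1, 2): 30, (2, 3): 20,
    (3, 4): 25, (4, 5): 35, (0, 5): 5, (5, 2): 10,
}


def inter_zone_migrants(flows, zone_of):
    """Sum the flows whose origin and destination lie in different zones."""
    return sum(n for (o, d), n in flows.items() if zone_of[o] != zone_of[d])


# Each zone system maps a basic unit to the zone that contains it.
zonings = {
    "six zones": {u: u for u in range(6)},
    "three zones (pairs)": {u: u // 2 for u in range(6)},
    "three zones (alternative)": {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 0},
    "two zones": {u: u // 3 for u in range(6)},
}

counts = {name: inter_zone_migrants(flows, z) for name, z in zonings.items()}
for name, count in counts.items():
    print(f"{name}: {count} migrants cross a zone boundary")
```

With these invented flows, the count falls as units are merged into fewer zones (the scale effect), and the two three-zone systems give different counts even though the number of zones is the same (the zonation effect).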

Full migration histories
It is vital, when using existing migration data or in designing new methods of measurement, to understand the key concepts involved. Here we use a set of graphs (Fig. 1) to identify the most important types of measure.
Ideally, we would like to track the full migration career of individuals. Figure 1A shows an example history for one individual in a time-space diagram. The horizontal axis proceeds from left to right, from time t_b when the individual is born through to time t_d when the individual dies. The open arrows, which straddle the border between regions i and j, represent a migration. The person experiences one migration from region i to region j in the t_b to t_1 interval, a return migration from region j to region i in the time interval t_1 to t_2 and a further migration to region j in the time interval t_2 to t_d. The individual makes three migrations over their lifetime, the second of which is a return migration to the region of birth and the third a return migration to a region with a previous spell of residence. A full set of migration histories from which migration could be measured would consist of all lifelines that entered, stayed in or left the system of interest. To measure the intensity of migration we would divide the number of migration events by the person-time of exposure in the regions of interest. Only the highest quality population registers, in the Nordic countries, provide the real-world equivalent of Figure 1A. In longitudinal studies based on censuses or surveys, such as the Longitudinal Study of England and Wales (CeLSIUS 2019), there are missing elements. Reconstruction of migration histories through retrospective questions in surveys suffers from survivor bias, as respondents who have died or emigrated cannot be interviewed.
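The intensity calculation described above (migration events divided by person-time of exposure) can be sketched for a handful of complete lifelines. The data structure, a list of residence spells for each person, and all the values are assumptions made for illustration.

```python
def migration_intensity(lifelines):
    """Return migration events per person-year of exposure.

    Each lifeline is a list of residence spells (start, end, region),
    ordered in time; a change of region between spells is one migration.
    """
    events = 0
    exposure = 0.0
    for spells in lifelines:
        exposure += sum(end - start for start, end, _ in spells)
        events += sum(1 for a, b in zip(spells, spells[1:]) if a[2] != b[2])
    return events / exposure


# Hypothetical lifelines, with time measured in years since birth.
lifelines = [
    # i -> j -> i -> j: three migrations, the pattern of the example individual
    [(0, 20, "i"), (20, 35, "j"), (35, 50, "i"), (50, 70, "j")],
    [(0, 60, "i")],                 # a lifetime stayer
    [(0, 30, "j"), (30, 50, "i")],  # a single migration
]

intensity = migration_intensity(lifelines)
print(f"{intensity:.4f} migrations per person-year")
```

Here four migrations are observed over 180 person-years of exposure. Data of this completeness are exactly what only the best population registers provide.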

Lifetime migration
Lifetime migration data, as analysed by Ravenstein (1876, 1885), compare only place of birth with place of current residence (Fig. 1B). In the 1871 and 1881 Censuses, a question was asked about a person's birthplace, using the county as the geographical unit. Tables were constructed for the populations of England and Wales, Scotland and Ireland by county of residence and county of birth. There is no information about the exact timing of the migration or about how many migrations occurred between birth and date of enumeration. In the decade 1871-1880, male life expectancy was estimated as 41.4 years and female as 44.6 years (ONS 2015). So, it is likely that most of the migration recorded in the 1871 Census took place between 1831 and 1871. Similar tables of lifetime migration at county scale were produced in censuses from 1891 to 1951, except for 1931 (Friedlander/Roshier 1966). In 1961, lifetime migration tables were restricted to UK home countries and a fixed one-year interval question was substituted ("where were you living one year ago?"). In the sample Census of 1966 and the full Census of 1971, two migration questions were asked using one-year and five-year intervals. From 1981 to 2011, only the one-year question was asked.
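A county of birth by county of residence table of this kind supports simple net balance calculations. The sketch below uses an invented three-county table; labelling a county with a positive net lifetime balance as one of "absorption" and a county with a negative balance as one of "dispersion" is a simplifying assumption and may not match Ravenstein's exact definitions.

```python
# Hypothetical lifetime migrant table: table[birth][residence] = persons.
table = {
    "A": {"A": 800, "B": 150, "C": 50},
    "B": {"A": 30, "B": 600, "C": 20},
    "C": {"A": 40, "B": 110, "C": 450},
}


def net_lifetime_balance(table):
    """Return lifetime in-migrants minus lifetime out-migrants per county."""
    counties = list(table)
    balance = {}
    for c in counties:
        in_migrants = sum(table[b][c] for b in counties if b != c)
        out_migrants = sum(table[c][r] for r in counties if r != c)
        balance[c] = in_migrants - out_migrants
    return balance


balance = net_lifetime_balance(table)
labels = {c: "absorption" if b > 0 else "dispersion" for c, b in balance.items()}
for county in table:
    print(county, balance[county], labels[county])
```

In a closed system the balances sum to zero, because every lifetime out-migrant from one county is a lifetime in-migrant to another.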

Migrants by last residence
Common questions included in censuses are "where was your last residence?" (open interval) or "where was your last residence in the previous X years?" (fixed interval), because this is the easiest question for respondents to answer. Figure 1C shows the time-space graph for the open interval case, where the timing is uncertain. Figure 1D presents the fixed interval case, where the location at the start of the fixed interval is uncertain. Uncertainty is indicated by using a pecked timeline or lifeline for the migrant. Most countries that collect last migration data (with timing undefined) also collect duration of residence. These data can be used to approximate fixed interval measures, but only if the duration is reported in finely graded intervals, and if the space to which the duration refers is clearly specified. These conditions are rarely satisfied. This type of question provides important descriptive information but, like the lifetime migration question, picks out only one migration without precise information about when the migration occurred. Rees (1985) showed that it was difficult to use the results as an input for estimating or forecasting the population. However, Schmertmann (1999) proposes three methods for estimating multistate transition hazard rates from last migration data. The methods are compared using a micro-simulation model with invented test data. The author then applies the methods to estimate fixed period transition flows from last migrant data for a Brazilian three-state system (Paraná, São Paulo, Rest of Brazil). He demonstrates that the back-projection method produces plausible estimates.

Fixed interval migrants
The left-hand graph in Figure 1D shows the lifeline of a typical fixed interval migrant, based on a retrospective census or survey question. Data on fixed interval migrants are generated by asking the question "where were you living X years ago?" (where X is typically 1, 5 or 10 years) or "where were you living at the time of the last census?" (as in the French census, where censuses have been held at irregular rather than regular intervals). It is difficult to compare migration volumes and intensities from questions with different time intervals because of return and repeat migration (Rees 1977). Papers from the IMAGE project (Bell et al. 2015b; Stillwell et al. 2016; Rees et al. 2017a) report on both 1-year and 5-year migration results but do not compare them directly. Rogers et al. (2003) and Dyrting (2018) have developed models for translating between 1-year and 5-year probabilities, so in future harmonization may be possible if the assumptions of the models hold for a country. However, Figure 1D shows three graphs which have fixed start and end states in the period which may not be captured by a census or survey question: the exist-die, born-survive and born-die transitions, all of which might involve an inter-regional migration. These additional life-state to life-state transitions were recognized by Rees and Wilson (1977) in their book on Spatial Demographic Analysis as building blocks of transition demographic accounts. In principle, the born-survive transition could be captured by cross-tabulating the region of residence at the census of 0-year-olds or 5-year-olds against their place of birth, derived from the Births Register, but this is rarely done.

Fixed interval migrants, conditional on duration of stay
Many surveys of international migration (ONS 2019) incorporate a condition that, to be counted as a long-term migrant, a person must reside at the destination for 12 months or more, following a recommendation by the United Nations. Durations of stay of 3 months up to 12 months classify migrants as Short-Term, while durations of less than 3 months lead to classification as Visitors (Fig. 1E). Nowok and Willekens (2011) develop a framework that identifies types of migrants, conditional on how long they stay at their destination after migration. The authors specify a general process of migration drawing on statistical theory and, using a microsimulation model of an invented sample of migrants, show how different intensities result in different expected transitions over intervals of different lengths (Figs. 5 and 6 in Nowok and Willekens 2011). This work should be an important building block in the development of a tool to handle the Modifiable Temporal Unit Problem (MTUP) discussed later in section 3.5. The rightmost graph illustrates the situation when border crossings are counted. A migrant may make more than one border crossing in a time interval of interest, as in the middle graph. This person would have been recognised once as a fixed interval migrant between region i and region j. Such migrants are recognised as "performing" repeat migrations (left-hand graph in Fig. 1F). When those migrations are a sequence of region i to j and then region j to i, they are return migrants (right-hand graph in Fig. 1F). Later in the paper we show that it is necessary to estimate these two types of migrant and the number of moves they make to convert a migrant count into a migration count.

Bell et al. (2015a) reported on the types of internal migration data gathered by countries across the world. Table 1 identifies the types of retrospective questions used in the 2000 round of censuses.
The commonest question is asked about place of birth (lifetime migration), followed by duration of residence, last migration and then the five-year, one-year and other interval questions, upon which most comparative analysis is based. In addition, Bell et al. (2015a) record the number of countries reporting migrations (events) from registers, though such data are gathered mainly in European countries.
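The difference between a count of migrants (transitions over a fixed interval) and a count of migrations (moves) can be shown with a few invented residence histories. In this minimal sketch, each history lists the region of residence at six successive annual dates, so that it spans one five-year interval; the names and data are hypothetical.

```python
def summarise_interval(regions):
    """Return (number of moves, whether start and end regions differ)."""
    moves = sum(1 for a, b in zip(regions, regions[1:]) if a != b)
    is_interval_migrant = regions[0] != regions[-1]
    return moves, is_interval_migrant


# Hypothetical histories: region of residence at six annual dates.
histories = {
    "stayer": ["i", "i", "i", "i", "i", "i"],
    "single mover": ["i", "i", "j", "j", "j", "j"],
    "repeat mover": ["i", "j", "j", "k", "k", "k"],
    "return mover": ["i", "j", "j", "i", "i", "i"],
}

summary = {name: summarise_interval(h) for name, h in histories.items()}
total_moves = sum(moves for moves, _ in summary.values())
total_migrants = sum(1 for _, flag in summary.values() if flag)
print(f"migrations (moves): {total_moves}")
print(f"five-year migrants: {total_migrants}")
```

The five-year question records two migrants although five moves took place: the return mover is not counted at all and the repeat mover is counted once. Because the dates are annual, five successive one-year questions would record five transitions in total, which is why one-year and five-year results cannot be compared by simple scaling.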

Sources of migration data
The categories of Census, Register and Survey cover a range of strategies and coverages of the population. Censuses are normally designed to be complete enumerations of the population and surveys are normally samples, designed to be representative of the national population. Often, difficult-to-handle questions, such as those about migration, are assigned to long-form representative samples to reduce the burden of coding geographic responses. In some cases, the migration questions are left out of the census and included in a separate national survey, as in the US Census of 2010, where the migration question was not asked in the main census but was included instead in the annual American Community Survey. This strategy is adequate for many kinds of migration measure (total inflows, total outflows, age and gender profiles) for larger administrative units but provides little information on origin-destination flows, because the N categories of interest (e.g. ~3,000 US counties) turn into N² flow dyads (3,000 by 3,000, or 9 million data points). The modern US migration researcher would have a hard time replicating Ravenstein's analysis of counties of absorption and dispersion.
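The sparseness problem follows from simple arithmetic. In the sketch below, the number of areas is taken from the text, while the number of sampled migrants is a purely hypothetical figure chosen for illustration and is not the actual sample size of any survey.

```python
def mean_sampled_migrants_per_dyad(n_areas, sampled_migrants):
    """Return (number of origin-destination cells, mean sample per cell)."""
    dyads = n_areas * n_areas  # N categories give N squared flow cells
    return dyads, sampled_migrants / dyads


n_areas = 3_000              # approximate number of US counties (from the text)
sampled_migrants = 450_000   # hypothetical annual sample of migrants

dyads, mean_per_dyad = mean_sampled_migrants_per_dyad(n_areas, sampled_migrants)
print(f"{dyads:,} cells, {mean_per_dyad:.2f} sampled migrants per cell on average")
```

Under this assumption the average cell would contain a small fraction of one sampled migrant, so most origin-destination cells would be empty and the remainder too thin for reliable estimation.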
Administrative Registers avoid the small sample problem by aiming to cover the whole of a target population such as all usual residents or all taxpayers or all patients in the health care system. So, coverage issues replace the sampling issues of survey or within census samples, substituting the advantages of continuous recording and capacity to generate good time series of migration information. However, registers are often restricted in the range of co-variates available. For example, the National Health Patient Register used to measure UK migration provides just basic demographic information (location, gender, age) for statistical release and imposes very strict safeguards and terms and conditions of use for associated health data. Fully developed registration systems such as that of Finland (Statistics Finland 2019) overcome this diffi cult by linking demographic information from the population register with socio-economic information from the tax and benefi t records and employment data from the business register.
For many questions about migration, researchers use national or international household surveys or microdata samples from censuses (Bell et al. 2015a). Such sample data are very useful where available, yielding new insights into migration behaviour by migration cohort  or by migration motivation (Coulter et al. 2011) by type of area, but they are too small for the analysis of the place to place migration that Ravenstein studied.
Alternative detailed survey data are collected by commercial polling or marketing fi rms who also need detailed place-specifi c data. Thomas et al. (2014) explore the potential of a large commercial survey data, produced by marketing company (Axciom 2019). Under licence the authors had access to unit postcode data for respondent households and their residents. They demonstrate that results for the de- terminants of migration are close to those from offi cial surveys, and that unweighted and reweighted data produce similar results. Stillwell and Thomas (2016) use postcode-to-postcode migration data to estimate intra-zonal migration distances. They show that traditional methods of estimating this variable are fl awed. Distance decay parameters calibrated for spatial interaction models of migration differ signifi cantly using individual rather than zone data. There has been huge interest in using electronic data generated by users of internet platforms and of smartphone devices to provide statistical information on a comparable basis for member states (UN 2019a). To date, there has been relatively little progress. The data belong to private corporations, which make limited data available but retain the right to change the algorithms that generate the information. Migration is not a variable that is directly generated from available internet queries and phone location traces. Corporations are reluctant to release the most detailed data for testing or calibration purposes. Nevertheless, for many countries that lack good offi cial statistics systems, "big data" are a useful source, particularly for establishing the location of most used residence and for monitoring daily journeys (Manley/Dennett 2018).
A second example of the use of telecommunications data is provided by Lai et al. (2019). In a Namibia case study where both census and mobile phone call data records (CDRs) are available, the authors construct a model that predicts migration flows between 13 regions from CDR data and some fixed covariates. The aim was to develop migration flow data for the years between the latest census and the next. A linear model with covariates had high goodness of fit (R² = 0.94). There is no way of fully testing the model until the next census, but it clearly has considerable potential for countries without continuous records of migration. Mobile phone penetration in less developed countries is already high and getting higher.
An important study was carried out by a team of experts for the European Union to investigate the potential of using social media data to fill the gap between the latest official Labour Force Survey data and "now" (European Commission 2019). The authors commented that "the first results of the application of the stocks model are experimental, but they are promising" but that "the approach taken to estimate EU Mobility Flows has not yet offered any plausible results" (European Commission 2019: 15). The problem, recognised by the authors, was that official statisticians have virtually no control over social media data from Facebook or Twitter. Use of call data records seems more promising.

3 Filling the gaps: Harmonizing and estimating migration

Ravenstein (1876) used lifetime migration data from the 1871 Census of Great Britain and Ireland, while his 1885 paper employed similar tables from the 1881 Census. There were two gaps in the data: one spatial and one temporal. The spatial gap was the absence of flow data between counties in one "home" nation (England and Wales together, Ireland and Scotland) and counties in another (Ravenstein 1885: Map 6). Outside the home nation, only birthplace by home nation was made available. The temporal gap was the absence of migration flows from one census to the next. The timing of the transition between birth county and residence county was unknown. Ravenstein did not have the statistical tools to fill these spatial and temporal gaps. Contemporary researchers take the view that once the phenomenon to be studied has been identified, efforts should be made to use the partial data available, along with auxiliary data and statistical models, to make an estimate of the true flows. We now discuss a sequence of critiques, improvements and new methods that have generated new estimates of migration, which can inform and improve official statistical practice.

3.1 Harmonizing spatial units used to measure internal migration
Ravenstein was fortunate in being able to study historic counties, whose boundaries did not change between 1871 and 1881. American, Australian and Canadian researchers can use state or province boundaries, which are fixed from the date of admission to their respective Union, Commonwealth or Confederation, and French researchers can rely on boundaries for départements, first created in 1790 by the National Constituent Assembly of Revolutionary France. Other European countries have experienced greater volatility in both regional and municipal boundaries. The UK, for example, has experienced frequent changes in the boundaries of its local authorities, which have shrunk in number and grown in population through mergers, in order to achieve economies of scale in administration. Look-up tables based on the populations of the smallest geographical units are used to apportion population statistics from the "old" geographic units to "new" geographic units (Simpson 2002). This works reasonably well for statistics that apply to one spatial entity. However, for migration flows, which refer to two spatial units, the origin and the destination, this method is problematic. To convert migration flows from one geography to another, it is essential to aggregate flows between small basic spatial units (BSUs), which ideally nest within the larger units of interest. If there are still overlaps, then the centroids of zones can be used to match BSUs via a point-in-polygon function (Bell et al. 2015a).
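The aggregation of BSU flows to larger zones can be sketched in a few lines of Python. This is a minimal illustration, assuming that each BSU nests within exactly one zone; the function name `aggregate_flows`, the four-BSU flow matrix and the look-up vector are hypothetical and not taken from any of the cited studies.

```python
import numpy as np

def aggregate_flows(bsu_flows, bsu_to_zone, n_zones):
    """Sum a BSU-to-BSU flow matrix into a zone-to-zone flow matrix."""
    bsu_flows = np.asarray(bsu_flows, dtype=float)
    n_bsu = bsu_flows.shape[0]
    # membership[b, z] = 1 if BSU b lies in zone z (hypothetical look-up table)
    membership = np.zeros((n_bsu, n_zones))
    membership[np.arange(n_bsu), bsu_to_zone] = 1.0
    # Origins are aggregated by the left product, destinations by the right
    return membership.T @ bsu_flows @ membership

# Hypothetical flows between four BSUs (rows: origins, columns: destinations)
bsu_flows = np.array([
    [0, 5, 2, 1],
    [4, 0, 3, 2],
    [1, 2, 0, 6],
    [2, 1, 7, 0],
])
lookup = np.array([0, 0, 1, 1])  # BSUs 0-1 form zone 0, BSUs 2-3 form zone 1
zone_flows = aggregate_flows(bsu_flows, lookup, 2)
print(zone_flows)
```

Flows between BSUs in the same zone end up on the diagonal of the aggregated matrix, which is why the origin and the destination have to be aggregated together rather than apportioned separately.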

3.2 Estimating missing internal migration data
Since the 1966 UK Census (a 10 percent sample), migration origin-to-destination flow data sets have been produced, at first on user demand at cost (1971, 1981) and then as publicly accessible data sets in 1991, 2001 and 2011, free at the point of use, though with restriction to safe settings for the most detailed tables (Duke-Williams et al. 2018). These flow data sets provide tables for regions, local authorities, and small areas (e.g. output areas and super-output areas), though the detail of the tables shrinks with spatial scale to avoid disclosure issues. However, the difficulties that Ravenstein experienced, namely flow tables being available only within UK "home" countries and the lack of harmonised flow data for the years between censuses, persist for flow tables derived from the main UK administrative register, the Patient Register of the National Health Service (NHS) (Lomax 2013a: Chapters 3 and 4). Flow tables are available for each mid-year to mid-year interval for each home country, but flows between local authority districts (LADs) in one home country and LADs in another are not generated. The solution is to borrow information from a closely aligned source, in this case the full 2001 or 2011 Census flow matrices, and to adjust these borrowed flows to agree with the known marginals derived from the home country NHS registers for years 2001-02 to 2010-11. The generic algorithm used is called "iterative proportional fitting" (IPF), applied to each sub-matrix in Figure 2 (Lomax 2013a/b; Lomax/Norman 2016). The algorithm has been re-invented and renamed in many social disciplines. Yule (1912) used it to standardize cross-tabulations; Kruithof (1937) developed a double factor method for use with telephone call data; Deming and Stephan (1940) employed the method to adjust census tables; Bishop et al. (1975) wrote a definitive guide to the method.
IPF is known as bi-proportional fitting in spatial interaction modelling (Wilson 1971) and has been employed in migration modelling (Stillwell 1978, 1979). When employed in economics, IPF is referred to as the RAS method and used to estimate input-output accounts; when employed to re-weight sample survey results, the method is referred to as statistical raking (Mercer et al. 2018). The Lomax work achieved consistent year-by-year estimates of migration flows between 389 local authority districts in the UK using IPF techniques.
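A minimal sketch of two-dimensional IPF may help to fix ideas. It is written in Python with NumPy and is not the code used in the studies cited; the seed matrix (standing in for borrowed census flows) and the marginal totals (standing in for register-based out- and in-migration totals) are invented for illustration.

```python
import numpy as np

def ipf_2d(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Scale a seed matrix until its row and column sums match the targets."""
    fitted = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    if not np.isclose(row_targets.sum(), col_targets.sum()):
        raise ValueError("row and column targets must share the same total")
    for _ in range(max_iter):
        # Step 1: scale each row to its target total
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        # Step 2: scale each column to its target total
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        # Columns now match exactly, so only the rows need checking
        if np.allclose(fitted.sum(axis=1), row_targets, atol=tol):
            break
    return fitted

# Hypothetical seed flows between three areas (zero diagonal: no intra-area flows)
seed = np.array([
    [0.0, 20.0, 10.0],
    [15.0, 0.0, 25.0],
    [5.0, 30.0, 0.0],
])
out_totals = np.array([33.0, 44.0, 33.0])  # assumed known out-migration totals
in_totals = np.array([22.0, 55.0, 33.0])   # assumed known in-migration totals
fitted = ipf_2d(seed, out_totals, in_totals)
print(fitted.round(2))
```

Zero cells in the seed stay zero, so structural zeros such as the diagonal are preserved, while the interaction pattern of the seed is retained as far as the margins allow.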
American demographer Andrei Rogers and colleagues have worked intensively on methods for estimating internal migration when direct data are not available. In the monograph The Indirect Estimation of Migration (Rogers et al. 2010), the authors summarise the achievements of three decades of research on migration estimation, acknowledging the key contribution of Willekens (1983) on the use of log-linear models to understand the contributions of origin, destination and interaction factors to migration flow matrices. One of the motivations for writing the book was concern that in 2010 the US Census Bureau would no longer ask a fixed interval question on residence 5 years ago but instead rely on the annual (but sample) American Community Survey to provide information on internal migration within the United States (US Census Bureau 2019). Log-linear modelling helps to estimate missing flow data from marginals and proxy interaction data. The Rogers team also worked on modelling the age variation of migration intensity, using the migration by age data assembled in the Migration and Settlement study at the International Institute for Applied Systems Analysis (Rogers/Castro 1981a; Rogers/Willekens 1986), developing a multi-exponential model of the age profile of migrants. The average exponents of the profile across the 17 country inter-regional matrices by age in the study have been widely used to disaggregate or improve migration by age information (e.g. Sander et al. 2014a). Subsequently, the age profile model has been refined by Wilson (2010) and linked to life course event profiles (Bernard/Bell 2015). The decomposition of migration by age is hinted at by Ravenstein (1876) but no data were available for investigation (Greenwood 2019: 271).
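The multi-exponential age profile can be illustrated with the commonly used seven-parameter form of the Rogers-Castro schedule: a declining childhood curve, a labour-force peak and a constant. The Python sketch below uses illustrative parameter values chosen for this example; they are assumptions, not the averages reported for the 17-country study.

```python
import numpy as np

def rogers_castro(ages, a1, alpha1, a2, alpha2, mu2, lambda2, c):
    """Seven-parameter model migration schedule (childhood + labour force + constant)."""
    ages = np.asarray(ages, dtype=float)
    childhood = a1 * np.exp(-alpha1 * ages)
    labour = a2 * np.exp(-alpha2 * (ages - mu2) - np.exp(-lambda2 * (ages - mu2)))
    return childhood + labour + c

ages = np.arange(0, 91)
# Illustrative parameter values (assumed for this sketch)
intensity = rogers_castro(ages, a1=0.02, alpha1=0.1, a2=0.06, alpha2=0.1,
                          mu2=20.0, lambda2=0.4, c=0.003)
schedule = intensity / intensity.sum()   # normalise so the age shares sum to one

total_flow = 10_000                      # hypothetical all-age flow between two areas
flows_by_age = total_flow * schedule     # disaggregate the total by single year of age
peak_age = int(ages[np.argmax(schedule)])
print(peak_age, flows_by_age[:5].round(1))
```

Applying a normalised schedule to an all-age total is the simplest way in which such a profile can be used to disaggregate flows by age when no age detail has been observed.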

3.3 Filling the gaps in international migration data
Despite the prominence of international migration in political discourses in most countries and the existence of national statistics (at least on immigration), there is no system for producing harmonised measures of international migration flows. The best data currently available are the national census tables of migrants by country of residence and country of birth, collected by the Statistics Division of the United Nations and published by the UN's International Organization for Migration (IOM 2019). These data are analogous to the 1871 and 1881 census data analysed by Ravenstein, referring to country to country flows rather than county to county flows. The United Nations Population Division assembles data from national statistical offices on net international migration where available and produces estimates as the residual balance after natural change is subtracted from total population change. The IOM data portal (IOM 2019) points to the category migrant flows, but these are only available in Europe from EUROSTAT (the Statistical Office of the European Union). Flow estimates were compiled partially from data supplied by national statistical offices of EU member states and partially from estimates compiled by academic researchers working with EUROSTAT (Raymer et al. 2011, 2013). Poulain, Perrin and Singleton (2006) carried out a comprehensive survey of sources, definitions and measures of migration available from EU member states. The THESIM Report (Towards Harmonised European Statistics on International Migration) covers EU migration policy and data regulation, administrative systems of data collection, statistics on international migration flows (where available), residence and work permits, asylum applications, and removals of persons refused leave to remain, together with twenty-five country reports written by national experts using a uniform template.
The edited collection by Raymer and Willekens (2008) builds on the THESIM work by reviewing models available for generating better migration estimates. A schema is proposed for carrying out the estimation by reconciling conflicting origin country and destination country estimates or by borrowing information from other countries where it was not available for a country. In the case of the UK, reliable data on flows to/from other EU countries were missing and had to be borrowed from origins or destinations with better statistics.
Raymer and colleagues working together in the MIMOSA (Migration Modelling for Statistical Analysis) project developed a model for estimating international migration flows between member states in the EU and EFTA and flows to and from the Rest of the World. The model combines the procedure for harmonization proposed by Van der Erf and Van der Gaag (2007), which works from the most to the least reliable estimates, with the log-linear modelling framework set out in Raymer (2008) and a gravity model for intra-EU and extra-EU flows where flow estimates are missing (Raymer/Abel 2009; Abel 2010; Raymer et al. 2011). The estimates were further refined in the IMEM (Integrated Modelling of European Migration) project, which incorporates expert views about the reliability of flows by country and provides measures of uncertainty/confidence (prediction intervals) around those estimates within a Bayesian framework (Raymer et al. 2013).
These estimates of migration flows between EU and EFTA states have been used as inputs in a subsequent demographic forecasting project, DEMIFER (Demographic and Migratory Flows Affecting European Regions and Cities) (Rees et al. 2010; Rees et al. 2012). Scenarios of future migration are developed that are linked to migration and regional development in national and EU policies. The projection model used a multi-level approach (Kupiszewski/Kupiszewska 2011), in which migration flows between EU states were explicitly forecast and then allocated to regions within EU states using information on immigration and emigration or population totals by region for EU states. Dennett and Wilson (2013, 2016) took the MIMOSA/DEMIFER estimates and used them to generate a full matrix of interregional migration across the EU that combines international flows between EU member states and internal interregional migration estimates.

3.4 Estimating period migration data from lifetime migration data
Section 3.2 showed how migration flows can be estimated from partial data, where reliable sub-totals were available together with a full matrix of reliable proxy migration flows to adjust. Section 3.3 discussed ways in which a full flows matrix could be constructed from partial information by using information from destination countries with high quality statistics where origin countries had very poor information. In this section (3.4), we report on methods of estimating period-specific migration flows between counties (internal migration) or countries (international migration) from tables of population classified by current residence and place of birth, the same as the lifetime migrant tables used by Ravenstein (1876, 1885). Table 3 sets out the contributions to these estimations. The first entry identifies Ravenstein's use of lifetime migrant data using information from two censuses. He did not attempt to estimate period migration data but has stimulated subsequent researchers to investigate possible methods of doing so. Important work on the estimation of migrant flows from migrant stocks was carried out by Friedlander and Roshier (1966). The aim of the analysis was to use lifetime migrant tables from the censuses of 1851 to 1951 to create period estimates for inter-census migration between counties in England and Wales. The authors built a model that uses two successive lifetime migrant tables, information on births and deaths in the inter-census intervals and three important assumptions. The first is that persons who die do not make any migrations and die at their place of enumeration at the first census. The second is that only one migration is recognised between censuses and additional (repeat) migrations are ignored. The third assumption is that flows from origin 1 to destination 2 are cancelled out by flows in the opposite direction.
Table 3 shows that we had to wait until the 2010s for a methodological breakthrough to be made by Abel (2013). He has subsequently refined the estimates in four further papers. Figure 3 sets out the 3-dimensional cube of migration flows invented by Abel (2013) that enabled the estimation of period migration from lifetime migrant stocks. Abel skilfully uses simple numerical illustrations to explain the procedures for estimating period migration flows from two consecutive migrant stock tables. Two faces of the cube represent lifetime migrant data at two successive censuses (Matrix A and Matrix B). We wish to estimate the contents of the third face, migrant transitions for a fixed time interval (Matrix C). Abel's insight was to realise that if provisional estimates or assumptions of the elements in the three-dimensional array were proposed, they could be adjusted to agree with the marginal totals in Matrices A and B. Matrix C elements were simply a sum of array elements over country of birth.
Fig. 3: A framework for estimating period migration from lifetime migration. Source: Drawn by the authors, based on a model by Abel (2013).

Notes to Table 3: Abel/Cohen (2019) also discuss two contributions (Beine et al. 2011 and Bertoli et al. 2015) which use the net differencing of country i to country j migrant stocks in two successive censuses. These methods resemble those used by Friedlander/Roshier (1966).

However, it is also useful to set out the general algebraic framework for the estimation, as in Table 4. The notation derives from work on demographic accounting (Rees/Wilson 1977; Rees 1985 and Rees/Willekens 1986) and is set out in Table 5. The variables to be estimated are migrant flows by country of birth (i), country of origin at the first census (j) and country of destination (k), represented by the array variable in Table 4D. How do we arrive at these period estimates of migrant flows by place of birth? We start with information on lifetime population stocks at the first census (Matrix A in Fig. 3, Table 4A) and at the second census (Matrix B in Fig. 3, Table 4B). These marginal tables can be used to adjust an initial distribution (seed values) of migration flows in the full array. The problem translates into one familiar to users of contingency tables. Iterative proportional fitting methods can be used to estimate adjusted array elements, associated with an underpinning log-linear statistical model (Deming/Stephan 1940; Bishop et al. 1975; Willekens 1983; Rogers et al. 2010). Once the array of variables has been generated (Fig. 3), the required migrant flows matrix (Matrix C in Fig. 3, Table 4E) is obtained as a sum of the array variables over place of birth.

A condition for a successful solution to iterative proportional fitting (IPF) is that the totals of elements in the constraint matrices (Matrix A and Matrix B in Fig. 3) are the same. Otherwise the IPF procedure will never converge. It is necessary to reduce the migrant total for Matrix A by the deaths that occur in the inter-census interval and to reduce the migrant total for Matrix B by the births that occur between censuses, as Friedlander and Roshier (1966) had earlier incorporated in their methods. In Table 4A the total deaths in a country are shaded in pink, indicating that the counts are available from vital statistics records or UN estimations using, for example, the Demographic and Health Survey. The country of birth of persons dying is not known, but can be estimated by applying the proportion from the population stocks table to the total deaths over all birthplaces to produce a table of estimated deaths. Births are handled differently, simply by assigning total births to the country of residence at the second census. Subtraction of deaths from population by country of residence at time t and subtraction of births from population by country of residence at time u produces two tables of estimated survivors in the third panels of Table 4A and 4B. The totals of these two sub-tables must be made equal, for example, by constructing an average (Abel 2013, 2018; Abel/Sander 2014). The two sub-table population elements are adjusted to agree with this common total.
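The demographic adjustment of the two stock tables can be sketched as follows. The Python code is an illustration only: the two-country stock tables, deaths and births are invented, births are treated as native-born residents of the country in which they occur (our reading of the rule described above), and the common total is taken to be the simple average of the two survivor totals.

```python
import numpy as np

def estimated_survivor_tables(stock_t, stock_u, deaths, births):
    """Remove deaths from the first stock table and births from the second,
    then rescale both tables to a common total.
    Rows are countries of birth, columns are countries of residence."""
    stock_t = np.asarray(stock_t, dtype=float)
    stock_u = np.asarray(stock_u, dtype=float)
    # Share of each birthplace among the residents of every country at time t
    birthplace_share = stock_t / stock_t.sum(axis=0, keepdims=True)
    # Deaths by residence are apportioned to birthplaces using those shares
    est_deaths = birthplace_share * np.asarray(deaths, dtype=float)[None, :]
    survivors_t = stock_t - est_deaths
    # Births in country j are assumed to be native-born residents of j at time u
    survivors_u = stock_u - np.diag(np.asarray(births, dtype=float))
    # Force both tables to the same grand total (here: the average of the two)
    common = 0.5 * (survivors_t.sum() + survivors_u.sum())
    table_a = survivors_t * common / survivors_t.sum()
    table_b = survivors_u * common / survivors_u.sum()
    return table_a, table_b, est_deaths

# Hypothetical two-country example
stock_t = [[1000, 100], [50, 500]]   # population by birthplace and residence at time t
stock_u = [[1020, 120], [60, 520]]   # population by birthplace and residence at time u
deaths = [105, 60]                   # deaths by country of residence between t and u
births = [80, 40]                    # births by country of residence between t and u
table_a, table_b, est_deaths = estimated_survivor_tables(stock_t, stock_u, deaths, births)
print(est_deaths)
print(table_a.sum(), table_b.sum())
```

The two returned tables play the role of the constraint matrices (Matrix A and Matrix B) in the subsequent IPF step.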
The IPF estimation model requires specification of initial or seed values of all array elements (Table 4C). This process of estimating the origin-destination-birthplace tables involves two steps: (1) making an assumption about the diagonal cells in the array (people who reside in the same country at both start and end of the time interval), and (2) estimating initial values of the off-diagonal elements (people who reside in a different country at the end of the period from that at the start). The options for the diagonal include maximization of stayers (as in Abel 2013), minimization of stayers or an average of the two. Abel (2018) suggests that options for the off-diagonal elements include adopting an independence model or introducing an interaction matrix which reflects the difficulty of moving between countries (e.g. spherical distance, airline connections). Azose and Raftery (2018a/b) use a weighted average of the assumption of maximum stayers and the independence model assumption, developing a method for optimizing the weights. The IPF procedure is carried out conditional on the margin totals and diagonal elements.
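A compact sketch of the fitting step is given below in Python. It is a simplification rather than a reproduction of the published procedures: the seed merely gives stayers a larger weight (`stayer_weight`, an assumed value) instead of fixing the diagonal, and the toy stock tables are constructed so that the total for each birthplace is the same at both censuses.

```python
import numpy as np

def fit_flow_array(seed, stock_t, stock_u, n_iter=200):
    """Fit y[i, j, k] (birthplace i, residence j at time t, residence k at time u)
    to two birthplace-by-residence stock tables by iterative proportional fitting."""
    y = np.asarray(seed, dtype=float).copy()
    for _ in range(n_iter):
        # Match the first stock table: sum over destinations k equals stock_t[i, j]
        y *= (stock_t / y.sum(axis=2))[:, :, None]
        # Match the second stock table: sum over origins j equals stock_u[i, k]
        y *= (stock_u / y.sum(axis=1))[:, None, :]
    return y

n = 2
stayer_weight = 20.0                      # assumed weight favouring j == k (stayers)
seed = np.ones((n, n, n)) + (stayer_weight - 1.0) * np.eye(n)[None, :, :]

# Hypothetical survivor stocks: rows = birthplace, columns = residence
stock_t = np.array([[900.0, 100.0], [60.0, 440.0]])
stock_u = np.array([[880.0, 120.0], [80.0, 420.0]])

y = fit_flow_array(seed, stock_t, stock_u)
flows = y.sum(axis=0)                     # Matrix C: origin j by destination k
print(flows.round(1))
```

The off-diagonal cells of `flows` are the estimated period migrant transitions; different seed choices (maximum stayers, independence, or a weighted mix) give different estimates from the same pair of stock tables.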
A spatial interaction model could be developed for a sub-set of country-to-country migrant flows for which reasonable estimates have been made. Such a model was developed by Kim and Cohen (2010), fitting their model to a UN data set of international migration flows between industrialized countries. However, the drawback would be that if researchers wanted to fit a global spatial interaction model to estimated flows, there would be an element of "double-counting". Given a choice of the seed values for the migrant flow array, estimates of the full array of migrant flows can be generated and then summed to produce the country to country matrix of flows (Matrix C in Fig. 3, Table 4E).
The Abel estimates generate migrant flows only for persons existing at time t and surviving at time u. Demographic accounting theory (Rees/Wilson 1977) demonstrates that for a complete picture of population change and migration it is necessary to consider migrants hidden in the deaths and births statistics. Table 6 expands the matrix in Table 4E by adding transitions for infants born in the period t to u who survive in the destination country at time u (shaded green), persons alive in the country of origin at time t who die in the t to u interval (shaded blue) and infants born in the period who also die before the end of the time interval (shaded brown). The off-diagonal elements in each of these additional sub-matrices in Table 6 represent persons with (at least) one migration in the time interval. The number of such "hidden" migrants is likely to be small (perhaps 1 percent of exist-survive migrants) but it is better to estimate them than to include them in the closure error. Methods for estimating these migrant flows are described in Rees and Wilson (1977).

Both Dennett (2016) and Azose and Raftery (2018a) regard the Abel estimates of bilateral migrant flows as under-estimates of the "true" flows. Of concern are migrants who return to their origin country during a time interval and migrants who make more than one migration. We have discussed these migration histories earlier (see Fig. 1A). The claim of under-estimation derives from a comparison of the Abel estimates with the estimates of migration flows between EU member states discussed earlier (Raymer et al. 2011; Raymer et al. 2013). However, these international migration estimates are based, largely, on adjusted counts of entries to and exits from national population registers, where commonly no duration of residence criterion is applied. So, these register-based estimates count migration events (Fig. 1F, rightmost graph) rather than (transition) migrants (Fig. 1D, leftmost graph). In other words, the comparison is between migrants and migrations. Both are "true" flows, conditional on their definitions. The choice of which estimates to use depends on the application in which they are used. Rees (1985) showed that either type of flow data could be used in a population projection based on a components-of-change model, but that a matching projection model needed to be used. If, for example, you wished to compute the carbon footprint of migrant travel, you would need to use the migration flows rather than the migrant flows.

Tab. 4: Country to country migrant flows estimated using the Abel …

Tab. 5: Variable and subscript notation for lifetime and period migrants. Entries include: place of residence at time u; n, number of zones in the system studied; +, sum over the subscript replaced. Note: Subscripts can be placed below the variable they are attached to or above.
Azose and Raftery (2018a/b) make estimates of the migration events additional to the number of migrants by borrowing results from three pieces of work on internal migration that compared one-year migrant rates with five-year migrant rates in censuses where the migration question had asked respondents where they were living one year and five years ago (Long/Boertlein 1990; Rogers et al. 2003 and Newbold 2005). Azose and Raftery (2018a/b) use an average of 1.5 for the Long-Boertlein Index, the ratio of the measured five-year interval migrants to five times the one-year interval migrants as generated from retrospective questions on previous residence. The method assumes that migration intensity is constant over the five-year interval and that populations are approximately constant. Azose and Raftery adopt a one-year retrospective interval equivalent to the 12-month duration of the UN definition of long-term migration (Table 7). Note that despite the paper's title, Azose and Raftery (2018a/b) do not identify the separate contributions of return and transit migrations. Table 7 shows how the Azose and Raftery migration estimates fit into an event-based population accounts table.

Tab. 6: Expanded …
The analyses in Table 3 are not the end of the story. It is necessary to extend the estimation to decompose migration flows by age, as in Friedlander and Roshier (1966). This would provide a better basis for using international migration flows in a global population projection than assuming that the age profile matched the average age profile parameters of the 17 country case studies in Rogers and Castro (1981a). However, this assumes that country of birth and country of residence tables are available by age. The censuses and surveys containing lifetime migrant data should hold birth date or age at last birthday information, but special tabulations would need to be requested from censuses or might be generated from sample census microdata.

3.5 The "one-year/five-year" problem
The problem of estimating consistent migration probabilities from data with different retrospective intervals had been observed for France by Courgeau (1973a), though his solution was to say that good longitudinal migration histories were needed to solve the problem. Courgeau's paper first identified the clear distinction between migrants (persons making a spatial transition between two points in time) and migrations (the events of changing locations).

Tab. 7: Estimates of country to country migration flows. Notes: M = long-term migration using the UN 12-month duration definition. Migrant flow estimates derived from an average of minimum stayer and independence model by multiplication of five-year migrant estimates by the Long-Boertlein ratio (5-year migrant count/5 × one-year migrant count) set to 1.5 based on empirical estimates in Rogers et al. (2003) and Newbold (2005).
In the 1970s and early 1980s, an ambitious study of future population change at regional scale using multi-state methods for 17 countries was carried out at the International Institute for Applied Systems Analysis (IIASA), located in Laxenburg, Austria, by Andrei Rogers and Frans Willekens (Rogers/Willekens 1986). As is usual in studies across countries, there were difficulties in harmonizing the data needed as input to the software for implementing the projections (Willekens/Rogers 1978). The key harmonization problem was to align national migration data with model requirements for migration over five-year intervals, matched with five-year ages. Countries with migration events derived from population registers simply added up five years of migrations. Countries which relied on censuses for migration information could supply data based on either a one-year retrospective migration question or a five-year question. Occasionally, two questions were asked in the national census, as in the UK in 1971 (Rees 1979). The existence of two sets of data for the UK enabled Kitsul and Philipov (1981) to experiment with a high- and low-intensity movers model, which gave reasonably good results, bearing in mind that the one-year data were available for just one year of the five in the five-year data.
Having surveyed contemporary research on estimating migrant and migration flows, we turn attention in the next section to contemporary research into a selection of Ravenstein's empirical findings.

4 Recent research on internal migration patterns
This section is not designed to be a comprehensive review of what we know about contemporary internal migration but rather provides illustrations of how contemporary migration scholarship carries out analyses of the internal migration patterns that Ravenstein studied.

4.1 Comparisons of internal migration across countries: case study collections
Using lifetime migration data from censuses in the 1880s and earlier, Ravenstein (1889: Maps 1 and 2) describes the regional distribution of foreigners for the countries of Europe by region and the net gains (absorption) and losses (dispersion) in international migration at country scale. Drawing on the 1880 censuses in the United States of America (USA), he maps the distribution of the foreign-born population in 1880 across the states of the USA and the provinces of Canada (Map 3), the gains and losses of state-born, national-born and foreign-born migrants by state and province (Map 4), and examples of migration fields of dispersion (Map 5) and absorption (Map 6). Since the 1880s, the systematic comparison of internal or international migration patterns across countries has not figured prominently, with most work focussing on national case studies, brought together in edited and informative collections (Nam et al. 1990; White 2016; Champion et al. 2018). Comparison between countries was difficult because data types, variable definitions and theoretical frameworks differ from chapter to chapter, with the editors making heroic attempts in discussion chapters to overcome the differences in chapter design and achieve a true synthesis of knowledge. Rogers and Willekens (1986) summarize the results of an international project covering 17 countries in which authors carried out and interpreted the same analysis, a multi-regional population projection incorporating inter-regional migration by age and sex using common software (Willekens/Rogers 1978). One common output across the case studies was a table of life expectancy by regions of birth and regions of subsequent residence (e.g. Rees 1979: Table 28). The span of life spent outside the region of birth was an index of internal migration intensity. In the late 1990s, a study for the Council of Europe (Rees/Kupiszewski 1999) focussed on internal migration within ten member states.
Data on internal migration flows supplied by national collaborators at the smallest available scale made possible the elucidation of spatial patterns. These were urban concentration or de-concentration; the relationship of internal migration (or overall population change where internal migration statistics were not available) with population density; the links between internal migration and economic conditions, signalled by unemployment rates; and the relationship of internal migration to gender and the life course. The study identified that many countries were still urbanizing; a few exhibited counter-urbanization; and others exhibited an intermediate pattern of net flows both down the density gradient from the top and up the density gradient from the bottom.

4.2 Development of a robust methodology for comparing internal migration across countries
A step change in the ability to compare internal migration came with the publication of a paper proposing fifteen measures organized by four measurement dimensions for internal migration: intensity, distance decay, connectivity and effect on population redistribution (Bell et al. 2002). The main aim of the paper was to design summary indices of internal migration for a country which could be submitted to international statistical databases (UN 2019b/c) and constitute the internal migration …

At the core of the IMAGE research was a recognition that the MAUP (Openshaw 1983) could potentially invalidate any comparison across countries and that methods were needed to compensate or control for the problem. The MAUP has two aspects. The first, the scale problem, is that internal migration measures tend to increase with increasing numbers and decreasing sizes of the zones used. The second, the zoning problem, is that, for any given number of zones, there are multiple ways of creating those zones from smaller basic spatial units. Summary measures may differ drastically across different groupings of the same number of zones (Openshaw 1983: 24, Fig. 2). Stillwell et al. (2014) developed a software package that, given the input of a matrix of flows between basic spatial units for a country and associated populations, computes aggregations of the flow data for specified numbers of zones and recomputes 15 national summary indices of migration (Table 2 in Stillwell et al. 2014). For any aggregation (number of zones), values of the summary indicators are computed for a user-defined number, up to 1,000, of different zoning systems, with reports of the median, upper and lower quartile values. It is then possible to evaluate the effect of scale (number of zones) and zonation (number of different constructions of the same number of zones) on the outcome summary indicator.
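The logic of such an experiment can be sketched with a toy example in Python. This is not the software described above: the flows are synthetic, the function names are hypothetical, and BSUs are grouped at random without regard to contiguity, which a real zoning system would respect. The indicator computed is a crude migration intensity, that is, inter-zone migrants per 100 population.

```python
import numpy as np

def crude_intensity(flows, zone_of, population):
    """Migrants crossing a zone boundary per 100 population."""
    crosses = zone_of[:, None] != zone_of[None, :]
    return 100.0 * flows[crosses].sum() / population.sum()

def zoning_summary(flows, population, n_zones, n_systems, rng):
    """Lower quartile, median and upper quartile of the intensity
    over many random zoning systems with the same number of zones."""
    n_bsu = flows.shape[0]
    values = []
    for _ in range(n_systems):
        # Random zoning: every zone receives at least one BSU (contiguity ignored)
        zone_of = np.concatenate([np.arange(n_zones),
                                  rng.integers(0, n_zones, n_bsu - n_zones)])
        rng.shuffle(zone_of)
        values.append(crude_intensity(flows, zone_of, population))
    return np.percentile(values, [25, 50, 75])

rng = np.random.default_rng(0)
n_bsu = 40
flows = rng.poisson(5, size=(n_bsu, n_bsu)).astype(float)  # synthetic BSU flows
np.fill_diagonal(flows, 0.0)
population = np.full(n_bsu, 10_000.0)                       # synthetic populations

coarse = zoning_summary(flows, population, n_zones=5, n_systems=200, rng=rng)
fine = zoning_summary(flows, population, n_zones=20, n_systems=200, rng=rng)
print(coarse.round(3), fine.round(3))
```

The gap between the medians for 5 and 20 zones illustrates the scale effect, while the spread between the quartiles for a given number of zones illustrates the zoning effect.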
The scale effect is used to construct summary measures of migration intensity, drawing on an original proposal by Courgeau (1973b), further developed by Bell and Muhidin (2009) and Courgeau et al. (2012). Bell et al. (2015b) use the Courgeau-derived methodology to compare migration intensities across the world, linking intensities with development status, with confidence that the intensities were comparable. The authors used the Aggregate Crude Migration Intensity (ACMI), designed to be comparable across countries, to reveal the extraordinary variation in migration intensity around the world, from a high of 52 percent per annum in South Korea to a low of only 5 percent in India. Ravenstein did not discuss migration intensity directly in his papers because the period of exposure to the risk of migration in lifetime migrant data was uncertain. Stillwell et al. (2016) compare the effect of distance on migration volumes using the IMAGE software suite to reveal the scale and distance effects and make possible robust comparisons (see sub-section 4.3). In Rees et al. (2017), net migration measures were used to examine redistribution patterns and a general theory linked to urbanization was proposed. Papers on Latin America, Asia and Europe (Charles-Edwards et al. 2017a/b; Bell et al. 2020; Rowe et al. 2019) compare countries within world regions across three of the four dimensions of internal migration identified in Bell et al. (2002).

Distance and migration
The evidence that Ravenstein presented for his short distance generalization (1885-1 in Table 1) and remarks on long-distance migration (1885-5 in Table 1) came from maps of migration rates to or from single counties or through tables of lifetime migrants and so were essentially qualitative observations. In the 20th century, researchers measured distances from migrant origins to destinations and related the size of flows to the distances and other variables. A gravity model of inter-city movement was proposed by Zipf (1946), based on Newton's formulation of the attractions of planetary bodies. Since 1946, a very large number of gravity models of migration have been proposed and implemented and the model continues to be used. Poot et al. (2016) characterise some recent applications of the gravity model of migration as "the successful comeback of an ageing superstar". The authors make interesting proposals for how the model might be used in other applications, such as population forecasting.
However, the variety of model forms and explanatory variables in the corpus of gravity model case studies makes it difficult to compare the impact of distance on migration across countries. The IMAGE project team therefore approached the task of comparing the effect of distance on migration by using a standard model for all countries: the doubly constrained spatial interaction model of Wilson (1971) with observed out-migration totals as the origin terms and in-migration totals as the destination terms, a negative power function and distance measured using national grid Cartesian co-ordinates. The IMAGE software was used to compute distance decay parameters at different population scales and employing different zoning solutions for a fixed number of zones. Figure 4 presents results for a set of countries reporting fixed interval migration for one-year data. Mean distance decay parameters (beta) are averaged over 1,000 different zoning configurations at each selected zone population size.
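A doubly constrained model with a negative power function can be sketched as follows. This is an illustrative implementation under stated assumptions, not the IMAGE code: the function names, the example co-ordinates and totals are hypothetical, intra-zonal cells are set to zero, and beta is calibrated by bisection on the mean log distance, which is one common choice rather than necessarily the procedure used by the IMAGE team.

```python
import numpy as np


def doubly_constrained(out_totals, in_totals, dist, beta, n_iter=500, tol=1e-12):
    """T_ij = A_i O_i B_j D_j d_ij^(-beta), with intra-zonal cells set to zero."""
    off_diagonal = ~np.eye(len(dist), dtype=bool)
    f = np.zeros(dist.shape)
    f[off_diagonal] = dist[off_diagonal] ** (-beta)
    a = np.ones(len(out_totals))
    for _ in range(n_iter):
        b = 1.0 / (f.T @ (a * out_totals))     # B_j = 1 / sum_i A_i O_i f_ij
        a_new = 1.0 / (f @ (b * in_totals))    # A_i = 1 / sum_j B_j D_j f_ij
        done = np.max(np.abs(a_new - a) / a) < tol
        a = a_new
        if done:
            break
    return np.outer(a * out_totals, b * in_totals) * f


def mean_log_distance(flows, dist):
    """Flow-weighted mean of ln(distance) over inter-zonal cells."""
    off_diagonal = ~np.eye(len(dist), dtype=bool)
    weights = flows[off_diagonal]
    return (weights * np.log(dist[off_diagonal])).sum() / weights.sum()


def calibrate_beta(observed, dist, low=0.0, high=5.0, n_steps=60):
    """Bisection: find the beta whose modelled mean log distance matches the observed one."""
    out_totals, in_totals = observed.sum(axis=1), observed.sum(axis=0)
    target = mean_log_distance(observed, dist)
    for _ in range(n_steps):
        mid = 0.5 * (low + high)
        modelled = doubly_constrained(out_totals, in_totals, dist, mid)
        if mean_log_distance(modelled, dist) > target:
            low = mid    # modelled moves are too long: more distance friction is needed
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical zone centroids (Cartesian co-ordinates) and migration totals.
coords = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 40.0], [60.0, 50.0]])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
O = np.array([300.0, 500.0, 200.0, 400.0])   # out-migration totals
D = np.array([350.0, 450.0, 250.0, 350.0])   # in-migration totals

synthetic = doubly_constrained(O, D, dist, beta=1.5)
print(calibrate_beta(synthetic, dist))
```

Because the balancing factors absorb the origin and destination totals, beta is the only free parameter, which is what makes the estimates comparable across countries once zone size is controlled.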
The ranges of betas at different zonal population sizes (Stillwell et al. 2016, Figure 8) are moderate to small. The graphs of Figure 4 show a feature characteristic of most IMAGE analyses: over ranges where the population sizes are small and numbers of zones are large, the distance decay parameter varies substantially by size and moves in different directions in different countries. However, for systems with fewer, larger zones the parameters for each country are relatively stable and can be compared. For example, the mean betas in the population size range 200,000 to 600,000 show few crossovers. We can be sure, for example, that Swiss migrants are more constrained by distance than Australian migrants. For one-year migrants, betas are stable from zone sizes of 100,000 population and larger. Countries exhibit greater variation in betas when migrants are measured over five years than over one year. Stillwell et al. (2016: 1672) summarise that, of those countries which collect one-year migration data, the frictional effect of distance is lowest in the USA, Canada and Australia and higher in much of Western Europe. For those countries that collect five-year data, those with higher levels of development display lower levels of distance friction.
Most demographic phenomena vary systematically by sex and age. The friction of distance is no exception. Figure 5 plots origin-specific distance decay parameters generated in a study for the UK government by a team led by Stewart Fotheringham and Tony Champion (ODPM 2002; Fotheringham et al. 2004). Gravity models were fitted to a time series of migration flows (events) between 96 geographic zones in England and Wales over 14 years for 14 sex-age groups. The shaded box shows the upper quartile, median and lower quartile of the distribution of beta values across origin zones. The upper and lower whiskers are located 1.5 times the quartile-median difference above and below the box.
The numbered circles represent outlier zones beyond the whiskers. There is relatively little difference between male and female results by age, because men and women mostly migrate as couples or as families. There are two exceptions. At ages 16-19 many young people enter higher education, which in England and Wales often involves a migration. The lower median and downshifted box plot for young women probably reflect a preference of either the daughters or their parents that they attend university closer to home. At ages 60 and over, both the median and the extent of the box plot are higher for women. At these ages men and women may no longer be together because of death or divorce. Men die earlier and so are less exposed to the necessity of migration after the death of a spouse or partner. At other ages the male and female plots are very similar but vary systematically by age. Younger adult ages experience lower frictions of distance (that is, smaller negative numbers) in a pattern that resembles the plot of migration intensity by age.

Gravity models of migration
The gravity model has been used in hundreds of applications to understand flows of people, goods, services and investments. Economists have shown a revived interest, as Poot et al. (2016) suggest. However, because disciplines still work in silos, Ramos (2016), in his review of gravity models, still claims that the recently assembled UN and World Bank databases of lifetime migrants constitute "bilateral data on migratory flows", whereas in section 3.4 we have reviewed the recent attempts to convert what is migrant stock data into estimates of migrant and migration flows over fixed time intervals. There is not space here to treat gravity models (also called spatial interaction models) fully but we briefly describe some important papers. Sen and Smith (1995) and Crymble (2019) provide valuable overviews of gravity models. Wilson (1971) developed a family of models based on entropy maximizing methods and Willekens (1983) showed how similar log-linear models could be derived from statistical theory. Flowerdew and Murray (1982), Flowerdew and Lovett (1988), Flowerdew (2010) and Congdon (1993) describe Poisson regression models which deal with small numbers or zeroes in origin-destination cells. Abel (2010) provides an example of a gravity model used to fill in missing cells in a flow table. He fits a binomial regression model, using an expectation-maximization algorithm and co-variates suggested by theory, to those flows between European countries that are reliably estimated. The model is then used with known co-variates to estimate cells where there are no reliable flow estimates. Raymer et al. (2019b) extend this approach to estimating migration flows between ASEAN countries, for which there are no reliable estimates at all; gravity model parameters are instead borrowed from a model fitted to migration flows between European countries and applied to ASEAN country co-variates.
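The Poisson regression form of the gravity model mentioned above can be sketched with iteratively reweighted least squares. The specification (origin population, destination population and distance, all in logs), the function name and the synthetic data are assumptions made for illustration; they are not taken from any of the cited papers. The flows are generated without noise so that recovery of the parameters can be checked.

```python
import numpy as np


def fit_poisson_gravity(flows, pop, dist, n_iter=50):
    """Fit ln E[M_ij] = b0 + b1 ln P_i + b2 ln P_j + b3 ln d_ij by IRLS.

    Only inter-zonal (off-diagonal) cells are used. Zero flows are admissible
    because the observed counts are never logged directly.
    """
    n = len(pop)
    i, j = np.where(~np.eye(n, dtype=bool))
    y = flows[i, j]
    X = np.column_stack([np.ones(len(y)), np.log(pop[i]),
                         np.log(pop[j]), np.log(dist[i, j])])
    mu = y + 0.5                       # starting means, defined even where y == 0
    eta = np.log(mu)
    for _ in range(n_iter):
        z = eta + (y - mu) / mu        # working response for the log link
        XtW = X.T * mu                 # Poisson weights equal the fitted means
        coef = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ coef
        mu = np.exp(eta)
    return coef


# Hypothetical zones: populations, centroid co-ordinates and "true" parameters.
pop = np.array([100.0, 200.0, 400.0, 800.0])
coords = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 40.0], [60.0, 50.0]])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
true_coef = np.array([-2.0, 0.8, 0.9, -1.5])

# Noise-free expected flows implied by the assumed parameters (diagonal left at zero).
flows = np.zeros((4, 4))
off = ~np.eye(4, dtype=bool)
flows[off] = np.exp(true_coef[0]
                    + true_coef[1] * np.log(pop)[:, None]
                    + true_coef[2] * np.log(pop)[None, :]
                    + true_coef[3] * np.log(np.where(off, dist, 1.0)))[off]

print(fit_poisson_gravity(flows, pop, dist))
```

The last coefficient plays the role of the distance decay parameter; with real count data it would be estimated with sampling error rather than recovered exactly.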
Here we describe some recent work which has advanced our understanding or exposed problems that still need to be solved.
A key assumption made in gravity models of internal migration is that there are no restrictions on free movement within national spaces, whereas between nations restrictions based on immigration policy are ubiquitous. History provides examples of violations of this assumption, ranging from the eviction of crofters in Highland Scotland by landowners, leading to mass migration to Scotland's cities and other countries in the Highland Clearances (1750 to 1860), to China's Hukou system (1950 to the present), which, through restricted access to benefits, makes it difficult for rural migrant families to relocate to the cities for which they provide vital labour (Pradier 2018). On the other hand, Fotheringham et al. (2004) found policy variables at local scale (employment or investment programmes) to have little influence on migration within England and Wales.
Some important decisions need to be made when constructing a gravity model of internal migration flows. First, it is necessary to choose the variable(s) that will represent the impedance between origin and destination. Normally, a simple distance measure is used. However, Shen (2016) has shown, in modelling interprovincial migration in China, that the mis-specification of the spatial interaction effect (predictions of the "friction of distance") leads to absolute mean errors of 32 percent in migration flows compared with 15 percent for each of origin "emissiveness" and destination "attractiveness". More attention is needed to investigate alternatives to the minimum distance between origin and destination. Linked questions arise about whether the model should be constructed using intra-zone migration (if available) and how the intra-zone distance might be estimated. Stillwell and Thomas (2016) show, employing a person level data set recording postcode of origin and destination available from Acxiom Ltd, a marketing technology and services company (Acxiom 2019), that conventional geometric methods are inaccurate and that including intra-zonal flows and a more precise intra-zonal distance based on postcode locations improves the goodness of fit of a spatial interaction model substantially. Second, it is necessary to experiment with the mathematical function of impedance which will be used, because predictions of flows are dependent on the decision. Openshaw and Connolly (1977) provide a methodology for spatial interaction modellers to use. Third, a decision is needed on whether to use a global distance prediction parameter or a set of local parameters attached either to the origin or to the destination. A related choice is whether to use a one-step model or a hierarchical model with several steps (Fotheringham et al. 2004). Fourth, it is necessary to select or write software that will calibrate the chosen model successfully (Dennett/Wilson 2016). Can the model be transformed into a linear equation for which standard regression software will fit the coefficients, or should an iterative routine that searches for optimum parameters be used? Fifth, consideration is needed about which variables to include in predicting out-migration from origins and in-migration to destinations. We think Ravenstein would be proud of the sophisticated analyses that his simple "law" about migrants moving short distances has generated.

Fig. 5: Box plots of origin-specific distance decay parameters by sex and age group generated from a model used to predict internal migration flows between 96 zones (Former Health Service Areas) in England and Wales, 1983-84 to 1997-98
Source: ODPM (2002), The Development of a Migration Model, Figure A21.7, p. 301. Crown Copyright.

Directions of migration flows
Ravenstein's empirical generalisations arose out of his detailed attention to places. His papers are filled with lists of places (zones) from which lifetime migrants come or to which they go. Today such lists are regarded as the raw material from which we can discover spatial generalisations. Places have meaning if you are familiar with the geography of the country being studied, but we try today to attach numerical information to that general knowledge. Ravenstein, being a cartographer by training, drew many maps to illustrate the patterns or directions of flows. But there is an issue. His maps and ours today are highly selective and convey only a fraction of the information embedded in the data. This can be termed a 3N problem, where N is the number of regions within a country for which you have migration information. To represent the matrix of flows, you need to draw N maps of outflows, N maps of inflows and N maps of the balances. But there are many indexes you can use: the raw numbers, the raw data converted into percentage shares by division by the total inflow or outflow, or rates obtained by division by the population of the sending region (outflows) or receiving region (inflows). If V is the number of useful variables for characterising the flows (age, sex, education, ethnicity) then 3N×V maps are needed. In effect, you have created an atlas of evidence for the directions of migration. Could not the maps be reduced by plotting simultaneously all the origin to destination flows as lines? Such a plot would be unreadable. Ravenstein (1885) does draw some line maps for selected sets of flows but is careful to avoid clashes. Friedlander and Roshier (1966) plot the largest flows, which reveal the importance of Glamorgan (where most of the South Wales coalfield is located) in the 1890/99 and 1900/09 decades.
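The arithmetic behind the 3N problem is easy to make concrete. The short sketch below uses hypothetical function names and an arbitrary example of 12 regions and 4 variables; it also counts the directed origin-destination flows that a single line map would have to carry.

```python
def maps_needed(n_regions, n_variables=1):
    """N outflow maps + N inflow maps + N balance maps, repeated for each variable."""
    return 3 * n_regions * n_variables


def directed_flows(n_regions):
    """Number of origin-destination pairs, excluding moves within a region."""
    return n_regions * (n_regions - 1)


# Hypothetical example: 12 regions, 4 characterising variables.
print(maps_needed(12), maps_needed(12, 4), directed_flows(12))
```

Both counts grow quickly with N, which is why selective mapping or summary graphics are needed in practice.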
The French demographer Daniel Courgeau, who gave the term "migration field" to the pattern of outflows or inflows around a geographical zone, fails to produce any maps of migration in his monograph on Les Champs Migratoires en France (Courgeau 1970). For visualization, he relies on graphs of migration against generalised distance bands.
Several techniques can be used to overcome some of these problems. Figure 6 shows how you can plot legible maps for net migration volumes using a proportional circle to represent the absolute number, while using colour to represent direction (gain or loss), for three separate annual intervals during a decade. The maps show continuity of the patterns but with changes in intensity in the third interval when the world was still experiencing the consequences of the global financial crisis of 2008/09. Another technique, promoted by British geographer Daniel Dorling, is the population cartogram, in which zones are drawn with an area proportional to their population, using hexagonal grids. Dorling and Thomas (2004) plot an atlas of cartograms based mainly on information from the 2001 Census. Each page offers both the variable on a standard geographic base (a transverse Mercator projection of Great Britain, produced by Ordnance Survey GB) and the hexagonal population cartogram. On pages 61 to 75 of Dorling and Thomas (2004), lifetime migration from 4 home countries and 11 countries outside the United Kingdom are plotted using the 2001 Census data together with change from the previous census in 1991. These maps are proud successors to those that Ravenstein drew in 1885 and Friedlander and Roshier produced in 1966. Local knowledge is brought to bear to explain unusual changes. For example, "The English born share of the local population has risen by two percentage points or more in Corby (as Scottish born immigrants to the steel works there have left or died)" (Dorling/Thomas 2004: 61).
One very successful graphic for depicting migration flows that is becoming widely used is the circular plot of migration, developed by Nikola Sander, using Circos open source software (Abel/Sander 2014; Sander et al. 2014a). Figure 7 adapts an example from Sander et al. (2018). The plot is for flows between regions in the UK. Design decisions include selecting the number of regions that can be read in the static plot (12) and arranging them in a logical order around the circle. Northern regions are placed at the top, Southern regions at the bottom; regions have neighbours with which they exchange significant flows; only the larger flows are depicted so that the number of crossings is minimized. There is potential to design interactive versions of the circular plots, which can be referenced as part of the supplementary material associated with a publication. Failure to take good design decisions can result in what Sander et al. (2018) refer to as a "hairball". Azose and Raftery (2018b) use Circos-based plots to demonstrate the differences between their estimates of country to country migration flows and those of Abel and Sander (2014). Circular plots are used for each of the 15 Asian country cases of internal migration in Bell et al. (2020).

Drivers of migration
It is helpful to consider two kinds of drivers (factors) which influence migration: the context of the places where migrants live and migrate between and the individual characteristics of the migrants which influence their propensity to migrate.
The most important contextual drivers are the level of, and changes in, economic development, associated with industrialization and urbanization, as Ravenstein recognised. For most of history cities have been the organisers of the space-economy (Jacobs 1970). Starting from a low base in the late 18th century, the share of the world's population which lives in urban places has grown to 58 percent in 2018 (UN 2018). In developed countries the urban population makes up between 68 percent and 82 percent of the total, with levels of 50 percent in Asia and 43 percent in Africa. The main demographic component contributing to urban population growth has been internal migration, supplemented by international immigration in some developed countries and by natural increase in developing country cities. In selected European countries and in the USA in some recent decades, there have been net flows out of cities (Rees/Kupiszewski 1999), which have been termed counter-urbanization. However, this counter trend has often not persisted and a new wave of urban-ward migration has resumed. The populations in some cities in eastern Europe, eastern Asia (Japan, Korea) and North America have shrunk in recent decades, either as a result of manufacturing decline (e.g. Detroit, St Louis) or because of population ageing (e.g. Nagasaki, Busan). However, cities have also been spreading as new suburbs are added or new towns develop within the commuting field of metropolitan centres. Suburbs and commuting towns are growing through internal migration from centre cities and sometimes through direct immigration from outside the country.
To monitor the migration processes producing urbanization, suburbanization, counter-urbanization and re-urbanization requires careful delineation of functional urban regions (FURs) split into cores, suburbs and peripheries. Many countries have developed geographical definitions of FURs and the European Commission has funded several projects to produce harmonised FUR definitions across EU member states. But FUR definitions differ across countries and most developing countries lack them. FURs also expand their boundaries over time, necessitating decisions about temporal harmonization of boundaries. Different zones within FURs (e.g. core, suburbs, periphery) contain rural settlements; zones outside FURs also contain urban settlements. Rees et al. (1996) analyse net internal migration trends using FUR and density frameworks, illustrating the complexities.
To surmount this definitional problem, Rees and Kupiszewski (1999) used population density by small areas within a country as a means of tracking changing internal migration patterns in ten EU member states, comparing results from the mid-1980s with those from the mid-1990s, straddling the revolution in political systems between 1989 and 1991. Figure 8, re-drawn from the Great Britain case study (Rees et al. 1996), illustrates the utility of the density classification. The data used are internal migration flows, classified by broad age for males (the female graphs are very similar), into and out of GB small areas (wards in England and Wales and postal sectors in Scotland) from which net migration rates were computed. Each graph plots the overall GB value for the net internal migration rate in each age band against density classes arranged from least dense (leftmost) to most dense (rightmost).
The graphs for all age bands except young adults slope downwards from left to right, with positive net migration into low density areas which falls away to zero at middling densities and becomes negative at the higher densities (the city cores). The declines are steepest for the middle working ages (30-44) of high labour force participation and family formation. The graphs for the childhood ages (1-15) parallel those of the 30-44 ages but are lower at the lowest density end, suggesting some differences between family and non-family migration. The downward slopes indicate strong suburbanization and counter-urbanization in the 1990-91 observation period, though this is likely to have been a high-water mark for this process. The slope was less steep for the late working ages, 45 to pensionable age (65 for men and 60 for women in 1991). In the final age band, pensionable age and above, there is net migration towards more dense areas on the left side of the graphs followed by migration away from dense areas. This was interpreted by the authors as a desire among retired migrants not to locate in the most remote areas but in small towns in metropolitan peripheries where health care and other services were more accessible.
Note (Fig. 8): Counts of net internal migrants and population by electoral ward are aggregated to population density classes and rates computed. Source: Data from the 1991 Census, Special Migration Statistics, Crown Copyright. Indicators computed by the authors. Re-drawn from Rees et al. (1996: 72, Fig. 16).
Figure 8 represents the relationship between net internal migration and density for one country and time interval. Other researchers had studied this relationship across time in one country (Courgeau 1992) and across several countries (Fielding 1989). Their technique was not to group observation areas into density bands but to plot net migration rates against density for all areas and then fit a linear regression against the logarithm of population density.
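The regression of net migration rates on the logarithm of population density can be sketched in a few lines. The function name and the five synthetic areas are assumptions for illustration only; the data are constructed so that the rate falls by 1.2 per unit increase in log density.

```python
import numpy as np


def density_gradient(net_rate, density):
    """Slope and intercept of net migration rate regressed on ln(population density).

    A negative slope means net gains in low-density areas and net losses in dense areas;
    a positive slope means the reverse.
    """
    slope, intercept = np.polyfit(np.log(density), net_rate, 1)
    return slope, intercept


# Hypothetical areas: persons per square km and net migration rate per 1,000 population.
density = np.array([10.0, 50.0, 200.0, 1000.0, 5000.0])
net_rate = 6.0 - 1.2 * np.log(density)

slope, intercept = density_gradient(net_rate, density)
print(slope, intercept)
```

Repeating the fit for successive periods gives a series of slopes whose sign change marks a shift between urbanizing and counter-urbanizing regimes.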
Courgeau (1992) splits the 95 départements into rural and urban parts, showing clearly the transition from a strong urbanizing trend in 1954-62 and 1962-68, through a flat trend in 1968-75, to a strong counterurbanizing trend in 1975-82.
In the discussion above, we have seen how two individual drivers of migration, age and sex, interact with the context, represented by the population densities of different settlements. The age profiles of internal migration were studied in the Migration and Settlement project at IIASA led by Andrei Rogers, whose team first proposed a model for describing the variation of migration rate with age as a set of multi-exponential functions linked to the labour force ages, childhood ages, and retirement ages. Rogers and Castro (1981a) used the model to characterise the internal migration profiles of the inter-regional migration flows in the 17 countries of the project.
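A reduced form of the multi-exponential model migration schedule can be written down directly. The sketch below uses the commonly cited seven-parameter version (a childhood curve, a labour force curve and a constant) and omits the retirement component; the parameter values are hypothetical and chosen only to give a plausible shape.

```python
import numpy as np


def model_schedule(age, a1, alpha1, a2, alpha2, mu2, lambda2, c):
    """Seven-parameter model migration schedule (retirement component omitted)."""
    childhood = a1 * np.exp(-alpha1 * age)
    labour_force = a2 * np.exp(-alpha2 * (age - mu2)
                               - np.exp(-lambda2 * (age - mu2)))
    return childhood + labour_force + c


# Hypothetical parameter values for illustration only.
ages = np.arange(0, 91)
params = dict(a1=0.02, alpha1=0.10, a2=0.06, alpha2=0.10,
              mu2=20.0, lambda2=0.40, c=0.003)
schedule = model_schedule(ages, **params)

peak_age = int(ages[np.argmax(schedule)])
print(peak_age, schedule[peak_age])
```

The childhood term declines from birth, the labour force term rises steeply from the late teens (rate lambda2) and then declines more slowly (rate alpha2), and the constant sets a floor at older ages.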
Subsequently, additional functions were added for post-retirement migration by the elderly (Rogers 1992) and the double peaks at entry to and exit from higher education (Wilson 2010). In a Migration and Settlement case study for the UK, Rees (1979) observed that the retirement peak was confined to flows from big cities to retirement regions and that international migration inflows had a subdued childhood slope and a later peak in the labour force function. Wilson (2020) has replaced the rising slope of migration intensity in the age band of retirement with a function that sees an increase followed by a decrease at older ages. Rogers and Castro (1981b) analyse age schedules of migration by reason for migration, but data sets providing this additional information are rare in censuses though more common in surveys. The model of the variation of migration by age has been extensively used in other analyses and the schedule parameters have been borrowed to decompose all age migration flows (e.g. in Sander et al. 2014b as input to the Wittgenstein world projections of population and human capital).
Bernard, Bell and Charles-Edwards have presented an alternative scheme for working with age schedules of migration (Bernard/Bell 2012, 2015; Bernard et al. 2014a, 2016), which is simpler to implement and more informative. Country profiles are classified by age at peak (years), normalized migration intensity at peak and overall migration intensity. They define three clusters of countries. The first cluster comprises countries with young age peaks (ages 21 or 22), high concentrations at peak ages but low overall intensity. The second cluster has peak ages from 21 to 24 with low concentrations at peak ages and moderate intensities. Countries in a third cluster have later peak ages (26 to 29), moderate to low concentrations at peak ages and relatively high overall migration intensities (Fig. 5 in Bernard et al. 2014a). In an important contribution to migration theory, Bernard et al. (2014b: 212) proposed: "a framework that links contextual factors to the age structure of migration through proximate determinants that directly affect migration ages: the prevalence, timing, and spread of life-course transitions. We focused on four key transitions that are concentrated at young adult ages and mark the passage to adulthood - exit from education, entry to the labor force, union formation, and childbirth - and sought to link these to the early-to-mid-20s peak commonly found in the age profile of migration. The proposed framework enabled us to identify the determinants of cross-national differences in the age structure of migration and to quantify their relative importance across a global sample of 27 countries".
Other migrant attributes also have important effects. In World Population and Human Capital in the Twenty-First Century (Lutz et al. 2014; for a précis and review see Rees 2018), the authors incorporate fertility and mortality rates decomposed by educational attainment categories into a model for projecting the populations of the world's countries. The model is driven by an independent projection of how educational attainment will change over the current century: improving education raises survival rates but decreases fertility rates in the developing world. The outcome is a set of projected populations significantly lower than those of the United Nations Population Division (UN 2015). Education provides skills and knowledge to improve both economic and social well-being. To a certain extent, education also acts as a surrogate for socioeconomic status, the occupational and income components of which can change through upward or downward social mobility, whereas the stock of an individual's human capital is more stable, although subject to gradual obsolescence as knowledge grows or to decreases in mental capacity at older ages. Bernard and Bell (2018) showed that, using IPUMS microdata from censuses and surveys, it was possible to measure the differences in propensity to migrate across educational attainment levels in a large set of countries. The authors showed that there was a near universal positive relationship between level of education and the propensity to migrate. Education was one of the few socio-economic variables for which this was the case. Therefore, it should be possible to introduce education as a driver of internal migration within sub-national population projections. The links between education and migration were largely ignored in Lutz et al. (2014). In recent work for the European Commission, Marois et al. (2019a/b, 2020) use a large microsimulation model of the European population to connect international migration to the attributes of individuals in the sample microdata in order to assess whether the negative economic consequences of population ageing can be mitigated through higher immigration, higher labour force participation and better integration of new arrivals into the labour force. They find that policies that raise labour force participation rates at older ages, that encourage women to enter the labour market and that promote integration of foreign workers will reduce the increase in the economic dependency rate and mean that immigration levels need not be higher than today.
Internal migration also varies by ethnicity and birthplace. International migrants by race or ethnicity or birthplace have initial distributions that differ from the native population. Subsequently, later generations migrate out of their initial settlement locations to other parts of the country. The consequences for the population composition by race have been monitored in US state projections by Frey (2015), for the population composition of local authorities in the UK by Rees et al. (2011, 2017b) and for the birthplace composition of Australian state and local populations by Raymer et al. (2019a). The USA has experienced a shift in the destinations of Hispanic immigrants from traditional ports of entry to interior cities. In the UK the concentration of ethnic minorities in large cities has been lessened through out-migration. In Australia immigration has been concentrated in the major state capitals but with destinations widening over time. The roles of internal and international migration as contributors to national and local ethnic population change have been assessed for 2001-based projected populations (Rees et al. 2013).
This analysis extended a decomposition analysis developed for world population regions by Bongaarts and Bulatao (1999), who claim that, when the world is divided into two regions, developed and developing, migration does not matter. This is a view that does not apply when you drill down to individual countries, sub-national regions and ethnic populations.

Temporal trends in internal migration
In this review of contemporary analysis of internal and international migration, we have not referred to changes over time. Abel and Sander (2014) and Abel (2018) showed that the view of international migration as an expanding phenomenon was incorrect for the period 1960 to 2015. The volume of flows did grow over time in line with world population change but, when expressed as a rate in relation to population, exhibited fluctuating trends rather than the explosive growth claimed by popular commentators. Zelinsky (1971) sets out a profoundly influential set of hypotheses about the "mobility transition". Note that his definition of mobility encompassed human movements including temporary migration, circular migration, daily commuting and visiting, a wider concept than the residential migration reviewed in this paper. He saw "circulation" as replacing the need for migration in later phases of the economic and social development of nation-states (Zelinsky 1971, Fig. 2). With hindsight we might extend Zelinsky's definition of circulation to include greater levels and distances of daily commuting, more working from home, and telecommunications, including the internet and broadband, replacing the need for migration. Other influential theories include the model of urbanization, counter-urbanization and reurbanization proposed by Geyer and Kontuly (1993). A theoretical framework linking development to population redistribution through net internal migration was proposed by Rees et al. (2017a, Fig. 8) and used in the country studies of Latin America, Asia (Charles-Edwards et al. 2017a/b, 2019; Bell et al. 2020) and Europe (Rowe et al. 2019). The relationship of internal migration with development was considered in most of the analyses in Table 8, though most are cross-sectional except for Bell et al. (2018).
Testing these theories and clarifying the directions of recent change require analysis of long national time series in a robust comparative framework as set out in Bell et al. (2002). Champion and Shuttleworth (2017a) draw on official migration statistics, based on changes in patient address recorded in the National Health Service Register, to examine trends in longer and shorter distance migration. Champion and Shuttleworth (2017b) use the England and Wales Longitudinal Study, which links individuals across censuses from 1971 to 2011, to look at trends in the distances of migration by type of person, covering age, marital status, country of birth, occupation and education. They tested the hypothesis, put forward by Cooke (2011, 2013), that migration rates were now declining, after demographic composition changes (ageing) had been accounted for. In collaboration, Champion et al. (2018) recruited authors from the US, UK, Australia, Japan, Sweden, Germany and Italy to explore trends in internal migration in a set of highly developed countries. Bell et al. (2018) contributed a chapter which examined trends in intensity in 66 countries in the IMAGE database for which observations were available for more than one period, focusing on changes in the 2000-2010 decade. Was Cooke's decline trend confirmed? The answer is yes, no and maybe (see Champion et al. 2018 for the range of case studies with differing outcomes).

5 Summary and discussion

Since Ravenstein, how far have we come?
In the introduction, we briefly summarised Ravenstein's key insights into internal migration in Britain and in other countries in Europe. His articles also reveal an interest in international migration through his maps of foreign-born immigrants in the British Isles, selected European countries, the USA and Canada. In section 2, we placed the data set used, tables of lifetime migrants by country of residence at the census by country of birth, in a more general conceptual framework of definitions of migrants and migration. In section 3, we tracked further use of this type of migration data to the present decade, when solutions have been put forward for estimating fixed period migration from lifetime migration data. We made suggestions for the further refinement of these estimates. In section 4, we showed how some of Ravenstein's empirical and verbal generalizations had been theorized and quantified in 20th and 21st century research, covering the topics of migration and distance, gravity models of migration, migration flow patterns and visualizations, drivers of internal migration and the need for extending comparative internal migration analysis over time. We now set out an agenda for future work on both internal and international migration.

Agenda for making further progress
The goal in comparing internal migration across countries is to bring the "Cinderella" of population change components (migration) to the demographic ball. Ideally, tables of comparable measures of internal and international migration should appear alongside the tables of indicators for mortality, fertility and net international migration published by the United Nations, the International Organization for Migration and the World Bank. We ask a series of questions, the answers to which may contribute to further progress in estimating, for as many countries as possible, both international and internal migration flows where data are missing or inadequate.

How can we maximise the number of countries and time periods for which we have comparable internal migration measures?
The first suggestion focuses on using the rich resources assembled in the IMAGE database so that the measures of migration can be assembled for as many countries as possible and for as many time periods as possible. For example, the models used to produce international migration flow estimates from lifetime migrant data could be applied to improving estimates of internal migration. Bell et al. (2015a) find that 109 countries report collection of lifetime migrant data, but little use has yet been made of this resource in IMAGE project analyses. Many countries use census or survey questions on last migration/migrant data, which often have an indeterminate time reference. Estimating fixed interval migration from these sources would build up the space-time coverage of comparable internal migration measures. The IMAGE analyses report on 5-year and 1-year fixed interval measures separately. The problem of converting between these intervals has been investigated many times, but no general solution has been adopted. Further research is needed, building on the literature and the recent model proposed by Dyrting (2018). A method for investigating the spatial MAUP has been enabled by the IMAGE studio. An equivalent tool for investigating the temporal MTUP (Modifiable Temporal Unit Problem) needs to synthesize the conditional framework of Nowok and Willekens (2011) and the migrant heterogeneity models (mover-stayer or high mobility-low mobility) of Dyrting (2018) and Schmertmann (1999). A goal would be to specify a general method for converting migration data of different intervals to a common interval for international comparison: one that derives the underpinning intensity of migration from the different data sets and then estimates migration transition rates based on a standard interval.
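
The conversion goal described above can be illustrated with a deliberately simple sketch. It is not the method of Dyrting (2018), Schmertmann (1999) or Nowok and Willekens (2011). It assumes a basic mover-stayer structure in which a fixed share of the population never moves and everyone else moves at a constant hazard, and it ignores return and repeat moves. The function name, its parameters and the input values are hypothetical.

```python
import math


def convert_intensity(p_obs, t_obs, t_target, stayer_share=0.0):
    """Convert a migration intensity between observation intervals.

    Illustrative assumption: a share `stayer_share` of the population never
    moves, and everyone else moves at a constant hazard, so the share who
    have moved within t years is
        (1 - stayer_share) * (1 - exp(-hazard * t)).
    Return and repeat moves are ignored.
    """
    mover_share = 1.0 - stayer_share
    if not 0.0 <= p_obs < mover_share:
        raise ValueError("observed intensity must be below the mover share")
    # Back out the constant hazard implied by the observed interval.
    hazard = -math.log(1.0 - p_obs / mover_share) / t_obs
    # Re-apply that hazard over the target interval.
    return mover_share * (1.0 - math.exp(-hazard * t_target))


# Hypothetical input: 20% of people changed region over a 5-year interval.
five_year = 0.20
one_year_homogeneous = convert_intensity(five_year, 5, 1)
one_year_mover_stayer = convert_intensity(five_year, 5, 1, stayer_share=0.6)
print(round(one_year_homogeneous, 4), round(one_year_mover_stayer, 4))
```

Under these assumptions the implied 1-year intensity depends on the assumed stayer share, which is one way of seeing why a 5-year measure cannot simply be divided by five and why heterogeneity among migrants has to be modelled before intervals can be harmonised.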

How should we improve the estimates of internal and international migration flows?
In section 3, methods for filling gaps in internal and international migration flow data sets were discussed. Innovative measures that make best use of available data collections have been proposed. However, official statistical agencies are slow to take up these methods and use them to provide continuous time series, despite, in some cases, having funded the projects. There are no magic solutions to this problem. It is necessary to engage in an open dialogue with official statisticians and with other government departments (including local government) which use migration statistics. Useful venues are conferences and meetings attended by official statisticians, staff of other government departments and academics. These conversations are best held outside of formal sessions, over coffee or meals. Treating the problem as a shared task is better than treating it as an occasion for scoring points.

How can we use knowledge of internal and international migration in policy making?
Most countries and international organizations make use of population projections in formulating medium and long-term plans. Assumptions about internal and international migration are essential ingredients, but these are rarely linked to policies. Internal migration assumptions usually involve continuation of inter-regional migration rates for a representative recent period. Poot et al. (2016) suggest there is scope to embody gravity models of migration into forecasts, although there would be a challenge in forecasting push factors at the origin and pull factors at the destination, as well as a need for better measures of impedance, as Shen (2016) has pointed out. The uncertainty around future population numbers can be handled through specifying variant projections based on judgement or through generating probabilistic projections which provide confidence intervals. Scenarios that specifically link international migration to policies (Rees et al. 2012; Abel et al. 2016; Cafaro/Dérer 2019; Marois et al. 2020) provide forecasts of future populations conditional on the acceptance of specific policies. Such scenarios need a body of evidence to show that policies will result in a set direction of change in migration. Time series extrapolation, no matter how sophisticated, has a poor track record in forecasting international migration (Bijak et al. 2019).
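
The idea of embodying a gravity model in a forecast, noted above, can be sketched in its most basic textbook form. This is an illustration only, not the specification used by Poot et al. (2016) or Shen (2016): projected population stands in for the push and pull factors, straight-line distance stands in for impedance, and the parameters `k`, `alpha`, `beta` and `gamma`, the region names and all numbers are hypothetical.

```python
def gravity_flow(origin_mass, dest_mass, distance,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Basic gravity model: flow = k * O^alpha * D^beta / distance^gamma.

    All parameter values are placeholders; in practice they would be
    calibrated against observed migration flows.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * origin_mass ** alpha * dest_mass ** beta / distance ** gamma


# Hypothetical projected populations (thousands) and distances (km).
populations = {"A": 500.0, "B": 2000.0, "C": 800.0}
distances = {("A", "B"): 100.0, ("A", "C"): 50.0, ("B", "C"): 120.0}


def distance_between(i, j):
    """Look up a symmetric distance stored under either ordering."""
    return distances[(i, j)] if (i, j) in distances else distances[(j, i)]


# Forecast every origin-destination flow from the projected populations.
forecast = {
    (i, j): gravity_flow(populations[i], populations[j],
                         distance_between(i, j), k=0.001)
    for i in populations
    for j in populations
    if i != j
}

for (i, j), flow in sorted(forecast.items()):
    print(f"{i} -> {j}: {flow:.3f}")
```

With equal exponents on origin and destination the predicted flows are symmetric, which real migration systems are not. Making the forecast directional requires separate origin (push) and destination (pull) variables, and forecasting those variables is the challenge the text identifies.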

What is the case for a world migration survey?
Willekens et al. (2016: 898-899) have proposed a World Migration Survey (WMS), echoing a proposal by the International Union for the Scientific Study of Population: "The WMS should include respondents and partnering institutes in countries of origin, transit, and destination and, ideally, should be longitudinal." Such a global migration survey might be implemented through the International Organization for Migration, the UN Population Division and the Organization for Economic Co-operation and Development. The expertise of international population research centres such as the Max Planck Institute for Demographic Research (Rostock), the Wittgenstein Centre for Demography and Global Human Capital (Vienna) and the Economic and Social Research Council Centre for Population Change (Southampton), and of global polling organizations such as the Pew Centre or Gallup, should be harnessed. The WMS will no doubt be besieged by researchers wanting to load their questions into the survey. Procedures should be implemented that require question or design proposals to be accompanied by realistic cost/benefit analyses.
Key questions that we suggest should be included are those that would unlock the value of current information so that existing but unsatisfactory data could be converted to fixed interval migration data. Willekens et al. (2016: 897) suggest estimation of "flow data" as a target concept without elucidating how this is defined and measured. We hope that the measurement framework set out in section 2 of this paper will help clarify what measures should be included. Willekens et al. (2016) recommend "longitudinal data", without specifying exactly what that means: alternatives include regular cross-section surveys, surveys in which individuals are linked over time or full population registers.

What about climate change and migration?
Future research on migration cannot ignore the climate crisis, which threatens the existence of whole island nations and of the share of the world's population that lives on low-lying land vulnerable to sea-level rise. A one-degree rise in global mean temperature has already increased storm frequency, the magnitude of droughts and the flows of persons displaced by climate change. A requirement of future funding of migration research should be that it takes the climate emergency into account as fully as possible (Muttarak/Jiang 2015).

Concluding remarks
This paper began with a brief survey of the achievements of the Anglo-German geographer Ernst Ravenstein. His papers have influenced much research into migration in the century and a quarter since publication. We have reviewed how scholars now investigate the phenomena he wrote about. He was an international scholar and investigated migration both nationally and internationally. We hope he would be pleased with what the field of migration research has achieved and would not be too daunted by the challenges of understanding migration in the modern world.

Appendix: Summaries of Ravenstein's Findings by Later Authors
Tab. A1: Key points from Ravenstein's 1876 papers as interpreted by Greenwood

| Year-# | Text | Pages |
|---|---|---|
| 1876-1 | Centres of industry and commerce tend to grow more rapidly than agricultural areas, with large towns growing more rapidly than surrounding rural areas. The observed differences in population growth patterns are due to migration. | p.271 |
| 1876-2 | People migrate to towns from nearby rural areas, and migrants from more distant places fill the gaps left in the rural areas. Thus, Ravenstein sees migration occurring in stages, and he believes that with respect to growing towns the nearby potential labour supply is quickly depleted. | p.271 |
| 1876-3 | Migration decreases with distance, an observation based on London and to a lesser extent on Glasgow, but other places as well. | p.271 |
| 1876-4 | Migration is not determined by distance alone, but rather also is shaped by 'facility of access' and 'local circumstances', which at least in part apparently means the quality of roads and the transportation system. Thus, at least in some limited sense, the public sector may facilitate migration. | p.271 |
| 1876-5 | Intervening opportunities tend to absorb migrants to any given urban destination. | p.271 |

| Year-# | Text | Pages |
|---|---|---|
| 1977-1 | The majority of migrants go only a short distance. | p.44 |
| 1977-2 | Migration proceeds step by step. | p.47 |
| 1977-3 | Migrants proceeding long distances generally go by preference to one of the great centres of commerce and industry. | p.48 |
| 1977-5 | The natives of towns are less migratory than those of rural districts. | p.48 |
| 1977-6 | Females are more migratory than males within the kingdom of their birth, but males move more frequently abroad. | p.49 |
| 1977-7 | Most migrants are adults; families rarely migrated. | p.49 |
| 1977-8 | Towns grow more by migration than natural increase. | p.50 |
| 1977-9 | Migration increases as industries develop and the means of transport improves. | p.52 |
| 1977-10 | The major direction of migration is from the rural areas to the towns. | p.52 |
| 1977-11 | The main causes of migration are economic. | p.53 |