Household Forecasts for the Planning of Long-Term Domestic Water Demand: Application to London and the Thames Valley

Methods for forecasting households in London and the Thames Valley were developed for input to forecasts of domestic water consumption. Households were forecast by ethnicity, size and property type. South Asian-headed households consumed more water (per capita) than average. Forecast populations for 60 Local Authorities were extracted from a UK-wide forecast and aggregated to six Water Resource Zones (WRZs). Household populations by age, sex and ethnicity were multiplied by trended headship rates to forecast households. Households were classified by size and property type using census microdata. Water demand was generated using modelled consumption rates, based on policy interventions. Between 2011 and 2101, the region will experience 85% growth in populations and 113% in households. The household growth will vary across WRZs between 54% and 126%. Water demand in London and the Thames Valley is forecast to grow by 90%, 69% and 46% under status quo, moderate and extreme conservation scenarios. The future growth in water demand under all scenarios poses a huge challenge for the region, already under water stress.

compulsory metering to 2031. These growth figures indicate a formidable challenge for Thames Water in increasing water supply, reducing leakage and persuading households to limit their consumption.
The forecasts in this paper cover a long-term horizon of 90 years, 2011-2101. For large water infrastructure projects, the public enquiry and infrastructure design stage can take up to 10 years, the construction stage may last a further 10 years, and at least 70 years is required as the pay-back period for capital loans to reduce their cost. Loans to water utility companies are regarded as a safe long-term investment because water is a necessity and the risks are spread over millions of bill-payers. Pension funds are attracted to utility investments because they need a guaranteed stream of reliable income to fund pension payments, which grow as the population ages.
Although Water Resource Management Plans (WRMPs) need only specify how water supply is to be enhanced to meet demand over 25 years (WRSE, 2019), it would be prudent for water utilities to consider much longer time horizons.
Forecasting domestic water demand comprises the following top-level tasks: (1) choosing the geographies for which forecasts are made, (2) forecasting populations, (3) forecasting households, (4) forecasting properties, (5) forecasting per household and per capita water consumption and (6) combining these to produce domestic water demand forecasts for water resource zones. Advice on methods for forecasting these inputs is set out in a manual, WRMP19 Population, Household, Property and Occupancy Forecasting, produced for the Environment Agency and researched and written by UK Water Industry Research (UKWIR, 2015). We discuss each of these tasks in turn.
Water demand forecasts need to be delivered for geographies relevant to the organisation of water supply. These are water resource areas, the boundaries of which relate to river catchments and water pipeline networks and which are unrelated to the administrative or statistical areas used for official forecasts. The normal approach is to use the latest official local authority forecast and assign the projected outputs to water resource zones using "look across" conversion tables based on small area statistics. The method used in our forecasts was to allocate local authority level population forecasts to 2011 Census Output Areas (OAs) and then to use the geographic centres of OAs to allocate them to water resource zones. Other researchers have used larger Lower Super Output Areas (LSOAs) (UKWIR, 2015) or smaller scale individual properties and their geo-references (Edge Analytics, 2018).
Population forecasting is implemented using either trend-based or housing-based methods. Trend-based forecasts employ single-region, bi-regional (see footnote 2) or multi-regional cohort-component models using best estimates of component mortality, fertility and internal migration rates and either rates or flows for international migration. The model is driven by assumptions of how the component rates or flows are likely to change in the future. The assumptions use past trends and extrapolate them into the future. They may also be based on departures from trend conditional on new policies or anticipated non-trend events.
The main alternative to the trend-based forecasting model is a model based on forecast changes in housing properties, derived from local plans and converted into population and household forecasts by applying occupancy and vacancy rates. Housing-based methods are used for short-term and small area forecasts; trend-based methods are used for long-term and large area forecasts.
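The housing-based conversion described above can be illustrated with a minimal sketch. The function name, the input values and the simplification of one household per occupied dwelling are assumptions made for illustration; they are not taken from the forecasts in this paper.

```python
def housing_based_forecast(dwellings, vacancy_rate, persons_per_household):
    """Convert a planned dwelling stock into households and population.

    Assumes one household per occupied dwelling (an illustrative simplification).
    """
    occupied_dwellings = dwellings * (1.0 - vacancy_rate)
    households = occupied_dwellings
    population = households * persons_per_household
    return households, population


# Hypothetical inputs: 10,000 dwellings, 3% vacant, 2.4 persons per household.
households, population = housing_based_forecast(10_000, 0.03, 2.4)
print(round(households), round(population))
```

In a trend-based forecast the direction of calculation is reversed: households are derived from the projected population and are then assumed to find existing or new housing.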
Household forecasting methods are dependent, in trend forecasts, on the projected population by age and gender. From a recent census, rates at which members of an age-gender group represent a household are estimated. Projected populations by age and gender are multiplied by household representative rates (HRRs; see footnote 3) to produce forecasts of household numbers. In England and Wales, official trend assumptions for HRRs were made based on comparison of census information from 1971 to 2011 (DCLG, 2016a, 2016b). Trends in HRRs have been used in household forecasts up to 2014. ONS (2018) used trends between 2001 and 2011 to forecast HRRs to 2018 using Labour Force Survey data but assumed constant values thereafter.
Near constant HRRs were assumed because secular increases in HRRs for most age and sex groups had come to an end in the severely constrained housing supply situation in 2018. Alternative household forecasting methods used in different countries and by international agencies are reviewed in the second section of the paper.
It is necessary in both housing-constrained and demographic trend projection methods to forecast properties. In the former approach, housing plans of local authorities are used to form a view about future new build. This approach must rely on varied and inconsistent plans, in which land is allocated to new housing. However, there are no guarantees that new housing will be delivered to the planned schedule. Delivery will depend on private market conditions, land speculation and hoarding, and whether public subsidies or investment are available. In trend forecasts, the forecast households are assumed to be able to occupy existing or new housing. New build may lag behind migrant demand. Insufficient properties may be released through out-migration or mortality of householders. In this case more people may be squeezed into the existing stock and HRRs will decrease. This happened when a wave of new immigrants from
Footnote 2. A bi-regional model was used to produce the ETHPOP projections. Rogers (1985) developed the multi-regional cohort-component model to embody the effect of inter-regional migration on future populations. His model has been used widely in sub-national population projections (e.g. ONS, 2018). The model requires detailed data on flows of migrants or migrations between regions, disaggregated by age and sex. It is often difficult to estimate reliable migration rates for a multi-regional projection. For example, the ETHPOP model would have required the estimation of 1,811,184 origin-destination rates (389 origins × 388 destinations × 12 ethnic groups). Rogers (1976), van Imhoff et al. (1997) and Wilson and Bell (2004) investigated the performance of aggregated forms of the multi-regional model. The bi-regional model gave results close to those of the multi-regional model, using equivalent input data for the same spatial system. The bi-regional model applies total out-migration rates to each region and computes total in-migration as the total out-migration rate multiplied by the combined population of the other regions. So, the number of migration rates that needed to be computed for the ETHPOP model was 9,336 (origin rates plus rates for other regions combined for the ethnic groups = [389 + 389] × 12). At each time step of the model, a check is made that total out-migrants/out-migrations equal total in-migrants/in-migrations. In-migrants/in-migrations are adjusted to ensure this equality.
Footnote 3. Household representatives were formerly termed heads of household. Heads were self-identified in a census or survey questionnaire. This authoritarian model of household organisation has been replaced by a neutral rule-based method based on all inter-person relationships within a household.
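The counts of migration rates quoted above for the multi-regional and bi-regional forms of the ETHPOP model can be checked with a few lines of arithmetic. The balancing function that follows is only a sketch: the text states that in-migrants are adjusted to equal out-migrants, and proportional scaling is our assumption about how such an adjustment might be made, with hypothetical flows.

```python
n_areas = 389          # local authority areas in the ETHPOP system
n_ethnic_groups = 12   # ethnic groups

# Multi-regional model: one rate per origin-destination pair and ethnic group.
multi_regional_rates = n_areas * (n_areas - 1) * n_ethnic_groups

# Bi-regional model: one rate per area plus one for the rest of the system.
bi_regional_rates = (n_areas + n_areas) * n_ethnic_groups

print(multi_regional_rates)  # 1811184
print(bi_regional_rates)     # 9336


def balance_in_migrants(out_migrants, in_migrants):
    """Scale in-migrants proportionally so that their total equals total
    out-migrants (proportional scaling is an assumption of this sketch)."""
    factor = sum(out_migrants) / sum(in_migrants)
    return [flow * factor for flow in in_migrants]


# Hypothetical flows for two areas.
balanced = balance_in_migrants([100.0, 50.0], [80.0, 40.0])
```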

| A SYSTEM FOR FORECASTING DOMESTIC WATER DEMAND
The analysis system designed to produce long-term forecasts of domestic water consumption for Thames Water's WRZs is illustrated in Figure 1. The boxes identify models and associated data sets. Four types of model are distinguished by shading pattern. The top left boxes describe the PHC prediction model, while the bottom left boxes specify the PHC forecasting model (for details see Rees et al., 2016; Rees et al., 2017a; Wohland et al., 2018). The boxes in the middle and bottom right describe the household forecasting model, the focus of this paper. The bottom box in the figure brings together all the components in the analysis to produce domestic water demand forecasts.
The model for predicting household water demand showed that the key drivers were household size, property type and ethnicity of the household representative (Nawaz et al., 2019). Table 1  Other Ethnic and South Asian households. The latter consume between 32% and 65% more water, controlling for household size, than the former. It is therefore essential that the forecasts of households in a water demand model include a classification by the three factors represented in Table 1.
Size  Detached  Semi-detached  Terrace  Flat  Detached  Semi-detached  Terrace  Flat
1  209  190  201  164  222  199  212  164
2  319  294  309  269  336  304  318  275
3  434  396  410  368  443  418  426  367
4  495  464  480  450  525  486  494  457
5  602  553  577  586  632  580  590  558
6+  743  614  752  675  686  716  735
The calibration of the model included the rateable value of the property, which captured the variation in housing unit extent (particularly of the garden), but this was fixed at its current value as there was no means of forecasting this tax variable (Nawaz et al., 2019).
Dummy variables representing the water resource zone proved insignificant, indicating that water consumption did not vary by geographical location within the London and Thames Valley region, once
We used forecasts of ethnic populations by age and sex from a UK-wide bi-regional, cohort-component model (ETHPOP) for Local Authority Districts (LADs) (Wohland et al., 2018). LADs are projected in stages. Constant long-term fertility rates for twelve ethnic groups were assumed for local authorities in the NewETHPOP projections. In the absence of time series of such rates across several decades at local authority scale in London and the Thames Valley, this assumption was adopted as the "least worst" option.
In a supplementary report for Thames Water (Rees, Norman, Wohland, Clark, & Nawaz, 2017b), we investigate the use of a fertility scenario that assumes convergence of the fertility rates of ethnic minority groups on a constant White British and Irish target.
The convergence assumed was symmetric, so that high fertility ethnic minority rates decline while low fertility ethnic minority rates rise. The argument for symmetric convergence rather than uniform decline was that the integration to British norms would apply also to low fertility groups such as the Chinese, whose fertility rates would rise as more students settled. A fourth variant population projection was implemented using this convergence scenario. The impact was to lower the growth of the Thames Water region population by 2% by 2101.
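The idea of symmetric convergence can be sketched as follows. The linear path, the convergence period and the fertility rates used here are hypothetical choices made for illustration; the supplementary report should be consulted for the assumptions actually used.

```python
def converge_rate(start_rate, target_rate, year, start_year=2011, end_year=2061):
    """Move a group's fertility rate linearly from its starting value to a
    constant target (linear path and period are assumptions of this sketch)."""
    if year <= start_year:
        return start_rate
    if year >= end_year:
        return target_rate
    share = (year - start_year) / (end_year - start_year)
    return start_rate + share * (target_rate - start_rate)


target = 1.8  # hypothetical constant target rate
high_group_2036 = converge_rate(2.8, target, 2036)  # declines towards the target
low_group_2036 = converge_rate(1.3, target, 2036)   # rises towards the target
print(high_group_2036, low_group_2036)
```

Under symmetric convergence both groups reach the target by the end year, one from above and one from below, whereas uniform decline would lower the rates of both.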
The analysis of Kulu and Hannemann (2016)

| Household definitions
The household is defined as a small group of people who live together and share living arrangements. The UN (2009) identifies two definitions of the household. The first is a group of people that resides in a dwelling. The second is the same group but the members share common housekeeping. A third definition of the household is sometimes used, which includes family members who spend time away from the family home and who the members insist should be included in the household (Wittenberg, Collinson, & Harris, 2017). In this paper, we use the second definition of the household.
Most household forecasting methods build on population forecasts. An essential first step, therefore, is to allocate the forecast population between the household population (~98%) and population

| The household representative method
The household representative rates are used in this approach to forecasting households. The method depends on the definition of a marker person in the household group. The UN terms this marker person the head, who is self-identified in the census or survey return.
However, many societies are uncomfortable with the patriarchal implications of "head". The term has been replaced by "household representative" in many official forecasts. Household members are no longer asked to identify the marker person. Rules based on person attributes are instead used to define a household representative (ONS, 2017). The household representative rate is the proportion of people in the household population who are household representative persons. If we multiply the household population classified by age and gender by the household representative rate for a future year, we obtain the projected number of households.
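The calculation described above can be written as a short sketch. The age-gender groups, populations and rates are hypothetical values chosen for illustration, not figures from the DCLG or ONS projections.

```python
def forecast_households(household_population, hrr):
    """Sum household population x household representative rate over groups."""
    return sum(household_population[group] * hrr[group]
               for group in household_population)


# Hypothetical household populations and HRRs for a future year.
household_population = {("male", "25-34"): 1000.0, ("female", "25-34"): 1200.0}
hrr = {("male", "25-34"): 0.5, ("female", "25-34"): 0.25}

total_households = forecast_households(household_population, hrr)
print(total_households)  # 800.0
```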

| The household membership method
An alternative forecasting approach is the household membership method, used when information about the household representative is missing but when administrative or census microdata are available.
For example, Harper and Mayhew (2016) and NISRA (2015a, 2015b) adopt a method that merges the household formation and typology steps. It is assumed that each dwelling address contains one household. Households are allocated to a type based on their composition by individuals. Household membership rates are computed by dividing the household population, classified by age, sex and household type, by the household population by age and sex.
Household membership rates are multiplied by the projected household population by age and sex to produce forecast numbers of people in households in each class. To convert household population by type into households by type, it is necessary to divide by the average number of persons per household for each type, so an additional classification of households by size is required.
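A minimal sketch of the household membership calculation follows. The two age groups, the two household types, the average sizes and all counts are hypothetical, and sex is omitted for brevity.

```python
# Base-year household population by age group and household type (hypothetical).
base = {
    "young": {"one-person": 200.0, "couple": 800.0},
    "old": {"one-person": 500.0, "couple": 500.0},
}
projected_population = {"young": 1100.0, "old": 1200.0}  # household population
average_size = {"one-person": 1.0, "couple": 2.0}        # persons per household

# Membership rate: share of each age group living in each household type.
rates = {
    age: {htype: count / sum(types.values()) for htype, count in types.items()}
    for age, types in base.items()
}

# Forecast people by household type, then divide by average size.
people_by_type = {
    htype: sum(rates[age][htype] * projected_population[age] for age in rates)
    for htype in average_size
}
households_by_type = {
    htype: people_by_type[htype] / average_size[htype] for htype in average_size
}
print(households_by_type)
```

The final division is the step that makes a classification of households by size necessary.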

| The household transition method
The household transition rate method is an attractive option because of its link to multi-state demographic methods. Multi-regional population models have been generalised to households by Van Imhoff and Keilman (1991), with accompanying operational software, LIPRO. Crucial ingredients are data on events that happen to individuals, which lead to transitions of households from one type to another. Longitudinal data are needed to generate event counts and subsequent transition probabilities, along with a set of assumptions. However, the method is anchored in the chosen household typology and requires longitudinal estimates, which are rarely available.
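The core of a transition approach can be illustrated with a much simplified sketch, in which persons are classified by household position and moved between positions by a matrix of transition probabilities. This is not the LIPRO model: the two states, the probabilities and the omission of births, deaths and migration are all assumptions made for illustration.

```python
states = ["living alone", "in couple"]

# transition[i][j]: probability of moving from state i to state j in one period
# (hypothetical values; each row sums to 1).
transition = [
    [0.90, 0.10],
    [0.05, 0.95],
]


def project_one_step(population, transition):
    """Apply the transition probabilities to persons by household position."""
    n = len(population)
    return [
        sum(population[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


population = [2000.0, 1000.0]  # hypothetical persons in each state
next_population = project_one_step(population, transition)
print(dict(zip(states, next_population)))
```

In practice the transition probabilities would be estimated from longitudinal data, which is the requirement that limits use of the method.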

| The micro-simulation method
The micro-simulation method offers attribute richness and makes household projection a function of individual microdata organised by households, from which household tables can be easily generated. The household is a continually changing group entity. Members of the group can enter or leave the household through demographic or relationship changes. Because individuals arrive, change and leave, the household can morph from one type to another quite quickly. To overcome such complexity, researchers build micro-simulation models using individuals as the entities, not households. A sample of individuals in private household or communal populations is used as the base population. Events are simulated by sampling from distributions of event probabilities dependent on key individual attributes (Duley & Rees, 1991). Bélanger and Sabourin (2017) describe a software package for constructing such micro-simulations. A second method is to use an exogenous aggregate projection and to construct populations of individuals by sampling from a list of households (Wu, Birkin, & Rees, 2010). Individual candidates are swapped into and out of the micro-simulated population until the fit with the exogenous constraint tables is optimum. Van Imhoff and Post (1997) demonstrate the equivalence of macro- and micro-simulation population forecasting models and give guidance on use. To date micro-simulation models have rarely been applied by national statistical offices to the projection of households, because of the complexities involved. However, static versions are increasingly used to assess fiscal (Adams, 2016) and pension (Emmerson, Reed, & Shephard, 2004) policies.

| Adding household types
There is little agreement about which household typologies should be used in forecasts. Household numbers, sizes and structures are vital for the formulation of public and private policies influencing welfare, housing and consumption. Official typologies combine notions of size, relationship and age mix but all dimensions lack detail for use in water demand forecasting. The four UK national statistical offices produce household forecasts for each home country using different methods and typologies (DCLG, 2016a, 2016b; NISRA, 2015a, 2015b; NRS, 2017; WG, 2017). Complex methods are needed to reproduce official household typologies at local scales (Wilson, 2013). Zeng et al. (1997, 1999, 2006 and 2013)
over-crowding. The lack of linkage between household and dwelling attributes may also result from the difficulty of forecasting numbers and types of dwelling unit from official housing plans.

| Choosing a household forecasting method for water demand forecasts
Which method of household forecasting should be used when producing household forecasts for planning of domestic water supply? The data requirements and complexities of the household transition and the micro-simulation methods rule those out. The household representative method, as used by DCLG and ONS for England at local authority level, provides a database of past and trended rates for use in our forecast. But the official typology fails to link to the key drivers of water demand. So, following the household membership method, we develop proportions of households by size, property type and ethnicity from census microdata, which can be added to the results of a household representative rate model. We describe this combined model in the next section of the paper.

| Data and typologies for forecasting households
The data sources we use to project forward the number of house-

| Forecasting populations and households
To forecast households, we employ a sequence of model equations. Table 4 provides definitions of terms used in the model. Table 5 (5A to 5C) sets out the population forecasting, the household/communal population split and the household formation steps of the model.  Wohland et al., 2016.
A feature of this forecast was the inclusion of ethnicity. The ETHPOP forecasts used a standard cohort-component design embedded in a bi-regional model. LAD populations were projected to 2061. In the Thames Water analysis, the LAD ETHPOP forecasts were extended to 2101 (Step A1 in Table 5

| Estimating and forecasting communal and household populations (Step B)
The populations forecast at Step A1 are the total numbers of usual residents. In order to estimate and forecast the number of households, the total population (P) needs to be split into the household population (HP) and the communal population (CP)
Note: See Table 4 for definitions of the variables and subscripts.
are subtracted from the total projected population to give the household population by LAD, age, gender and ethnicity (Step B5).
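The final subtraction in Step B can be sketched as below, using P, CP and HP as in the text. How the communal population itself is estimated is not shown; the group keys and counts are hypothetical.

```python
def household_population(total_population, communal_population):
    """Return HP = P - CP for each (LAD, age, gender, ethnicity) group."""
    result = {}
    for group, total in total_population.items():
        communal = communal_population.get(group, 0.0)
        if communal > total:
            raise ValueError(f"communal population exceeds total for {group}")
        result[group] = total - communal
    return result


# Hypothetical projected totals (P) and communal populations (CP).
P = {
    ("Wandsworth", "20-24", "female", "South Asian"): 1000.0,
    ("Wandsworth", "85+", "female", "South Asian"): 200.0,
}
CP = {
    ("Wandsworth", "20-24", "female", "South Asian"): 20.0,
    ("Wandsworth", "85+", "female", "South Asian"): 30.0,
}
HP = household_population(P, CP)
print(HP)
```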

| The household formation model (Step C)
The DCLG household projection model uses Household Representa-  (Table 5C and  The trends for females in Wandsworth are shown in Figure 3B.
DCLG projects rising HRR rates for older ages and declining rates at younger ages. Similar plots are shown in Figure 3C and 3D

| The household typology model (Step D)
Household typologies are added in Step D of the model (Table 5), using ethnicity, household size and property type rather than DCLG's official household classification (Table 3A) or ONS's replacement (Table 3B) or Harper and Mayhew's administrative emulation (Table 3C). Size is measured as persons per household, from 1 to 6 or more, with water use increasing with number. The property types are detached, semi-detached, terraced and flats, which define a gradient of decreasing outdoor water use related to gardens.

In Step D (Table 5), for each LAD, we needed to estimate a five-dimension
These regional estimates of HCPs can be tailored for each LAD using published local census tables as marginals to adjust the regional tables. We used three marginal tables for LADs: tables of occupancy (QS406EW), HRPs by HRP ethnicity (DC4201EW) and HRP gender by HRP age by property type (commissioned table CT0621). Iterative proportional fitting (IPF) is used to make local estimates of the full age, gender, ethnicity grouping, property type and occupancy tables, "seeding" the estimates using the appropriate regional counts and then using an iterative technique to modify these seed counts so that they match the available marginal counts for the LAD. This process was implemented in Visual Basic in Excel.
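The principle of IPF can be shown with a two-dimensional sketch in Python. The analysis described above fitted a five-dimensional table to three marginal tables in Visual Basic; the code below is not that implementation, and the seed counts and marginal targets are hypothetical.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Scale a seed table alternately to row and column targets until the
    row sums match the targets within the tolerance."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Fit rows: scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [value * target / row_sum for value in table[i]]
        # Fit columns: scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match exactly; stop once the rows also match.
        max_error = max(abs(sum(table[i]) - row_targets[i])
                        for i in range(len(row_targets)))
        if max_error < tol:
            break
    return table


# Hypothetical regional seed counts and local marginal totals.
seed = [[40.0, 10.0], [20.0, 30.0]]
row_targets = [60.0, 40.0]
col_targets = [55.0, 45.0]
fitted = ipf(seed, row_targets, col_targets)
```

The fitted table keeps the interaction structure of the regional seed while matching the local marginal totals, which is the property exploited in Step D.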

| Geo-conversion from LADs to WRZs (Step E)
The geographical units used in the population and household forecasts are lower-tier LADs in England. The populations and households for LADs need to be converted into results for WRZs.
Step E1 (
After the application of the look down procedure described above, there will be estimates of the projected number of households in each OA. The task then is to allocate each OA to a WRZ. Because some OAs do not nest exactly within a WRZ, a 'best fit' rule is adopted: the population and household counts of each OA are allocated to the WRZ that contains the OA's Population Weighted Centroid.
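The 'best fit' rule can be sketched with a standard ray-casting point-in-polygon test. The zone boundaries, OA centroids and household counts below are hypothetical; in the analysis itself the allocation used digital WRZ boundary data, and this code is not that implementation.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def allocate_to_zones(output_areas, zones):
    """Sum OA household counts by the zone containing each OA centroid."""
    totals = {name: 0.0 for name in zones}
    for (x, y), households in output_areas:
        for name, polygon in zones.items():
            if point_in_polygon(x, y, polygon):
                totals[name] += households
                break
    return totals


# Hypothetical zones (two adjacent squares) and OAs (centroid, households).
zones = {
    "WRZ_A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "WRZ_B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
output_areas = [((3, 4), 120.0), ((12, 5), 80.0), ((9.5, 9), 50.0)]
zone_households = allocate_to_zones(output_areas, zones)
print(zone_households)  # {'WRZ_A': 170.0, 'WRZ_B': 80.0}
```

Because each OA is assigned whole to a single zone, totals across zones always equal the total across OAs, at the cost of small errors where an OA straddles a boundary.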

| Forecasting household water demand
The projected numbers of households are multiplied in Step F1 (

| THE FORECASTS OF HOUSEHOLDS
The total numbers of forecast households and populations, along with time series indicators are set out in

| Forecast households by average size
If the household size distribution remains constant in future, then household water demand will rise in parallel with the population.
However, the size distribution will change depending on the net outcome of the HRR trend assumptions of DCLG, the constant HCP assumptions and the shifts in age distribution, which drive the conversion of population forecasts into household forecasts. If household size falls then water demand will rise, because PCC is higher for smaller households than larger (Nawaz et al., 2019). In

| Forecast households by size, property type and HRP age and gender
The combination of the forecast household populations, the trended household formation rates (slightly upward) and the constant household typology proportions enables a projection to be made of the composition of future stocks of households. Figure 8 shows household compositions, for 2011 and 2061, for Wandsworth (left hand charts) and West Oxfordshire (right hand charts). Each diagram shows the number of households by size in Figure 8A and by property type in Figure 8B, by age and sex of the HRP, with male HRP households on the left side and female on the right side. We have summed over the ethnicity of the household head. The charts are slightly skewed towards having more male HRPs, but this skewness diminishes as age increases and over time.
Smaller households with one or two persons are the most common, although by 2051 this is less pronounced (Figure 8A). Flats and, to a lesser extent, terraced properties are the dominant property type in Wandsworth in all projection years (Figure 8B). Two-person younger households become more prominent in 2051.
Figure 9, which graphs total domestic water consumption under three scenarios of small change in PHC (the Business as Usual scenario, including a compulsory metering programme), modest conservation of water (the Light Green scenario) and drastic conservation of water (the Dark Green scenario). Table 7
and Other Ethnic. The former grouping consumed more water, controlling for household size, property type and spatial zone, than did the latter (Table 1). The main reason for these differences was the role of ritual washing in the Muslim faith, practised by a large proportion of people of South Asian heritage.
The second intellectual contribution of the paper was to develop a method for forecasting household types relevant to forecasting domestic water demand, namely, household size by type of property occupied. DCLG had for many years used a complex, one-size fits all typology of households that was unsuitable for use in forecasting domestic water demand.
The third intellectual contribution of the paper was to show that plausible long-term forecasts could be developed for subnational areas (Figures 4 to 7). One feature of the forecasts is that for the Thames Water region, growth is not linear but curvilinear, with declines for larger households after 2081. The slowdown in growth at the end of the long-term period results from a constant net gain from international migration eventually being counter-balanced by natural decrease implicit in the assumption of below-replacement fertility. Long-term forecasts are necessary for utilities investing in water supply infrastructure. The water supply works take a long time to plan and build, are subject to many extensions of planned delivery dates and need a long payback period in order to obtain loans at low interest rates.

| The role of changes in ethnic composition and share of households by water resource zone
In the discussion of Table 7, we identified the contributions to forecast domestic water demand made by population change, additional change and changes in household water consumption, mainly due to the roll out of metering to between 65 and 85% of households.
Here we examine changes due to differences in household growth in the whole Thames Water supply area due to shifts in the distribution of population between the two ethnic groupings and between water resource zones. A shift in the population to the South Asian grouping will tend to increase water consumption. A shift in population to the London WRZ will tend to decrease consumption because most new build in Greater London will be flats.
Households living in flats consume less water because they have window boxes rather than gardens to water. Table 9 summarises
being equal, to an increase in household water consumption, assuming the differences between households with heads in the two groupings, identified in Table 1, persist.
Overall, the household numbers in the second block of columns and the shares of the Thames Water region total, show that domestic water consumption is dominated by the London water resource zone, which gains 4.6% share. The only other zone gaining share is Slough-Wycombe-Aylesbury, which despite the shift in ethnic composition, only gains 0.14% share of the total. The growth in households in the London water resource zone is the major factor behind the overall forecast increase in the Thames Water region. This increase will be moderated a little by the increase in households living in flats and by the slowdown in household growth after 2081 in London, which does not happen in the outer water resource zones. The influence of these factors along with those discussed earlier in connection with Table 7 is complex and merits a full decomposition analysis in future work.

| Comparisons of forecasts in the 2014 plan, this paper and the revised 2019 plan
The forecasts of populations and households for the domestic water supply area of Thames Water, described in this paper, were delivered
is higher at 185%.
argue that this is because this projection reflects the result of including ethnic heterogeneity in the method, which means higher growth for the London and Slough-Wycombe-Aylesbury water resource zones.

| Alternative scenarios for international migration and ethnic fertility
Uncertainty in forecasts can be assessed through formal error prediction methods or through variant projections. Elsewhere we apply historical errors in UK sub-national projections developed by UKWIR (2015) to the household forecasts. Table 10 has presented results for a set of variants for each WRMP round. Norman (2019a, 2019b) assessed
decades. This reduced the fertility rates of South Asian groups but raised the rates for other groups with low fertility. Overall, the population increase was moderated but only marginally.

| Improving the population and household models
How might the demographic and household models be improved?
The key feature of the demographic model used to project LAD populations in the Thames Water region was that it produced results for each ethnic grouping. Ethnicity is used extensively in official statistics, but ethnic group definitions have been subject to
Ideally, a forecasting model should be tested against reliable historical statistics. Population and household statistics in two successive censuses (e.g. 2001 and 2011) provide the jump-off and target populations for such an exercise, with trends in demographic rates and trends in household formation rates for 1991 to 2011 as input to assumption setting. However, it was not possible to assemble the same data for domestic consumption because the time series available only covered the period 2006 to 2015.
In this paper we have reviewed household forecasting methods and modified existing methods to meet the needs of domestic water demand forecasting. We have extended our previous ethnic population forecasts, which projected forward 50 years from 2011 to 2061, to a 90-year horizon. The population forecasts were converted into household forecasts using a modified form of the household representative method. A new typology for households that combines property type and size, linked to ethnicity, was used to forecast households by size, property type and ethnicity. These households were used with forecasts of per household and per capita water consumption to project total water demand for London and the Thames Valley to 2101 (Nawaz et al., 2019). The current paper uses demographic and statistical techniques to generate knowledge for designing plans to meet future water demand. It is an exercise in applied social science, which has successfully knitted together analysis from three different domains: population, households and water consumption.
Henderson of TWUL supplied the digital boundary data used in geoconverting Local Authority results to WRZs. Data, models and code used in this study are available from third parties, the authors and online as described in the supplemental file, Metadata-PSP-18-0065.docx, which can be accessed along with a file containing data and code at https://doi.org/10.5518/729.

DISCLAIMER
The projections and interpretations in this paper should not be construed as reflecting the official views of TWUL, GLA, ONS or DCLG (from December 2017 the Ministry of Housing, Communities and Local Government, MHCLG) and are the responsibility of the authors.