
The appraisal of transport projects involves using transport models to forecast the impacts of the transport project some decades into the future. These forecasts are instrumental in determining:

- how much travel demand is captured by the transport project and its impacts on the existing transport networks;
- the economic benefits of the project;
- the revenues for public transport schemes and toll roads.

The transport model forecasts also inform calculations of social and environmental impacts.

In order to understand these issues, transport model forecasts are made for the opening year of the transport project, which may be 10-15 years in the future, allowing for the time involved in the planning processes and in the scheme design and construction phases. But as the transport scheme will have an economic and/or engineering life of 30 years or more, and must be demonstrated to be able to carry the forecast travel demands, forecasts are also likely to be made for later years in the life of the scheme (eg 15 years after opening).

Transport model forecasts of 20-50 years into the future are self-evidently subject to uncertainties and it is common to analyse these uncertainties. Most governments require some consideration of risk and uncertainty, but the scope varies widely.

I do not propose to address the arguments for and against rigorous risk assessment here, but to illustrate some methods of estimating the uncertainty in transport model forecasts.

The UK government's guidelines on transport modelling and appraisal are given in WebTAG, which includes guidance on forecasting and uncertainty. Originally Unit 3.15.5 dealt with this topic, but it has now been superseded by Unit M4.

In both, the first requirement is to produce an "uncertainty log" which summarises all known assumptions and uncertainties, covering the forecast inputs and the model parameters and specification. Associated with this is the recommendation that sensitivity tests are used to identify those modelling aspects to which the forecasts are most sensitive.

Whatever we choose to call it, common to any attempt to understand forecasting uncertainty is something like an uncertainty log which identifies the key sources of forecasting uncertainty, and which is established using knowledge, judgement and sensitivity tests.

There has been much research on the topic of uncertainty and I was involved in one of the early studies by the UK government which reported in 1980 (entitled Highway Appraisal under Uncertainty). A paper describing the work can be found in Transportation, Volume 9 (1980) pp 249-267, entitled Uncertainty in the Context of Highway Appraisal.

The literature identifies three categories of error: forecast input error (IE), model specification error and model estimation error. Because the latter two sources are difficult to separate, I classify them both as model errors (ME).

In the study, we used a model developed for a then current highway project. The process followed was to:

- identify the main sources of error (the uncertainty log);
- quantify those errors and attribute an error distribution (normal, gamma, rectangular etc.);
- use Monte Carlo techniques to simulate the combined impacts of all error sources.

In the Monte Carlo process, we ran the traffic model many times, in each run revising the forecast inputs and the model specifications (the coefficients) using sampled values for each source of error from their respective distributions.
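As a minimal sketch of this loop, with a placeholder model and hypothetical error distributions (not the study's actual figures):

```python
import random

def run_traffic_model(demand, routeing_coeff):
    """Stand-in for the full traffic model, which in the study was
    run once per Monte Carlo draw. Returns a forecast flow."""
    return demand * routeing_coeff

N_RUNS = 2_000
base_demand = 50_000                 # hypothetical central demand forecast
forecasts = []
for _ in range(N_RUNS):
    # Sample each error source from its assumed distribution:
    demand = base_demand * random.gauss(1.0, 0.10)   # input error (IE)
    routeing = random.gauss(1.0, 0.05)               # model error (ME)
    forecasts.append(run_traffic_model(demand, routeing))

# The spread of the resulting forecasts measures the combined uncertainty
forecasts.sort()
p5 = forecasts[int(0.05 * N_RUNS)]
p95 = forecasts[int(0.95 * N_RUNS)]
```

In the original study the "model" in this loop was the complete highway forecasting system, which is what made the exercise so expensive to run.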

The sources of error identified and quantified were:

- planning data: population, workforce and employment by category (IE);
- income forecasts (IE);
- fuel cost forecasts (IE);
- base year matrix (ME);
- assignment routeing parameter (ME);
- speed/flow relationships (ME).

We quantified the expected range of error for a given input or model coefficient using judgement, research and/or external views.

- For __planning data__ uncertainties, the key issue was the accuracy of the assumed future growth rates of each item. An analysis of the variations in small area growth rates over a 20-year period gave an indication of the range of variation, and an assumption was made on the proportion of this range which might reasonably be expected to be forecast by the planners.
- A similar approach was taken to __income__ forecasts, in which the range of forecasting error was inferred from an analysis of historic GDP growth trend variations.
- For the __fuel cost__ forecasts, UK government estimates of the range of uncertainty were used.
- The errors in the __base year matrix__ were based on previous research (broadly reflecting survey sampling errors).
- The uncertainty over the current value of the __assignment routeing parameter__ was inferred from the available research, and further allowance was made for uncertainty in the sensitivity of the future value to changes in income.
- Uncertainties in the estimated __road speeds__ were taken from the evidence of speed/flow studies.

Finally, given estimates of the size of the uncertainties, the error distributions were generally a matter of judgement.

Having to run a complex model repeatedly in a Monte Carlo simulation of errors is not usually practical in a study, many models taking a considerable time to run. Consequently, a follow-on research project looked at whether this effort could be reduced using experimental design techniques.

In the following I illustrate some of the issues and solutions with examples drawn from transport project studies.

In any practical approach, something of the nature of the uncertainty log is the starting point. The log should cover only the important sources, and sensitivity tests and judgement should be used to eliminate minor sources of uncertainty from consideration.

For the study of Speedrail (a high speed rail line between Sydney and Canberra) in 1999 the major forecasting uncertainties identified were:

- base market: the estimates of base market demands by mode of transport, derived from surveys (ME);
- market growth: the predictions of market growth, based on population and GDP growth forecasts (IE) and dependent on the elasticities of travel demand to income growth (ME);
- the transport project impacts: the forecast diversion of trips to Speedrail from existing modes of transport, the induced travel by Speedrail and the fare yield (ME).

For the surface access mode share forecasts in connection with planning the development of Stansted Airport in 2006/8, the major sources of forecast input errors in the air passenger forecasts included (this is not a complete list):

- air passenger characteristics: % transfer passengers, % foreign passengers, % business passengers (IE);
- transport network factors: road journey times and public transport services (IE);
- economic factors: vehicle operating costs, public transport fares, taxi fares, airport car parking charges (IE);
- other factors: vehicle occupancies (ME).

On the East Coast High Speed Rail Study (between Melbourne and Brisbane) which reported in 2014, the major sources of forecasting uncertainty were:

- scenario factors: economic growth, population, the future aviation scenario (IE);
- model forecasting uncertainties: current travel demands (survey sampling errors), projections of demand growth (the income elasticities), the mode share won by high speed rail, the induced high speed rail patronage (ME).

There are a number of issues to address in practical appraisals of uncertainty:

1. the extent to which uncertainty can be addressed solely with sensitivity tests, as seems to be implied by WebTAG;
2. whether more attention is given to input errors than model errors, again as seems to be implied by WebTAG;
3. how to estimate the size of the errors and their distributions;
4. how to determine the combined effects of the errors.

Concerning point 1, sensitivity tests are excellent for addressing the impacts of specific, important issues. In the studies of Speedrail and of East Coast High Speed Rail, sensitivity tests were used to investigate the impacts of changes in the characteristics of the competing transport modes (including competitive reactions to the high speed rail services). They are also useful if there are a few very significant uncertainties whose impacts need to be highlighted. In the case of East Coast High Speed Rail the pace of population and economic growth of the eastern states of Australia was addressed in sensitivity tests.

When there are a significant number of sources of uncertainty, the value of sensitivity tests alone reduces, the problem being that the many sources of uncertainty cannot be combined in a discrete number of simple sensitivity tests without generating very unlikely scenarios. In such circumstances, Monte Carlo techniques should be able to provide a more balanced assessment of the risks.

Concerning point 2, this is a matter for the context and jurisdiction. But, in principle, an estimate of forecasting uncertainty based only on input errors understates the risk, which may be serious in some contexts.

In estimating the size of the errors and their distributions, as shown by the earlier example, there is inevitably much judgement, which can be informed by:

- analysis of historic trends (for planning and economic data);
- information on uncertainty ranges available externally (from government institutions, academic studies etc);
- statistical data from model estimations, giving confidence limits on coefficients;
- validation exercises suggesting the range of error of model outputs (discussed later).

The selection of error distributions is mainly based on judgement: for trip matrices gamma or some similar non-negative distribution would be appropriate, while rectangular distributions are useful where there is no strong reason to expect the error distribution to be centrally focused.
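For illustration, the three distribution types mentioned can be sampled as follows. The parameters are hypothetical, chosen so that each multiplicative factor has a mean of 1:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Normal: symmetric error, e.g. around an income growth-rate forecast
income_growth = rng.normal(loc=0.02, scale=0.005, size=n)

# Gamma: non-negative, suited to trip-matrix cell errors
# (shape and scale chosen here so the mean multiplier is 1.0)
matrix_factor = rng.gamma(shape=25.0, scale=1.0 / 25.0, size=n)

# Rectangular (uniform): where there is no reason to expect
# the error to be centrally focused
fuel_cost_factor = rng.uniform(low=0.8, high=1.2, size=n)
```

The gamma draws can never go negative, which matters when the factor multiplies a trip matrix; a normal distribution with a large spread could produce impossible negative demands.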

With large numbers of sources of error, as there inevitably are in complex transport model systems, the estimation of forecasting uncertainty can be intimidating. As I have earlier remarked, while the original research demonstrated that estimates of forecasting uncertainty could be produced, the Monte Carlo process involving repeated runs of complex transport model systems is not practical for most studies.

There are other issues too. With a large number of error sources, the work required to determine the uncertainties associated with each may also be unacceptable. Further complicating the task of combining errors is the possibility of correlations between them, particularly if large numbers of model coefficients are included in the uncertainty log.

What all this implies is the need to design the approach to the study of uncertainty in a way which can meet practical constraints but does not compromise its value. The discussion that follows is based on some real examples which illustrate ways in which this may be done.

On the projects in which I have been involved, there were two fundamental aspects of the approach to estimating uncertainty.
The __first__ was to develop a much simplified, more aggregate version of the model in which the key uncertainties could be represented, and which was
capable of being used in a Monte Carlo framework. Associated with this, the __second__ aspect was to focus on the overall
errors of the model output, rather than on the individual errors associated with every element of the model specification
(eg all the model coefficients).

As an example, in one project (A) we created a simplified model of the form:

Scheme patronage = base market * market growth * diversion * (1+induced travel)

The base market was aggregated into around 20 segments (by trip purpose, mode of transport and broad geographic sector), for which survey error calculations were made. For these segments also, calculations of the growth factor errors were made (based on uncertainties in the population and GDP forecasts and the elasticities of travel demand to income growth).
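The growth-factor error calculation for a segment can be sketched in the same Monte Carlo spirit. The functional form and all figures below are purely illustrative, not those used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Illustrative 20-year growth factors for one segment, each with
# its own uncertainty (hypothetical means and spreads):
pop_growth = rng.normal(1.25, 0.08, n)   # population growth factor
gdp_growth = rng.normal(1.60, 0.15, n)   # GDP-per-head growth factor
elasticity = rng.normal(0.90, 0.20, n)   # demand elasticity to income

# Demand growth = population growth x (income growth ^ elasticity)
demand_growth = pop_growth * gdp_growth ** elasticity

low, mid, high = np.percentile(demand_growth, [5, 50, 95])
```

The spread between `low` and `high` is the segment's growth-factor uncertainty, combining the input errors (population, GDP) with the model error (the elasticity).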

The segments were further aggregated for the diversion error analysis. Sensitivity tests of variations to the mode choice model coefficients were used to derive an overall estimate of the uncertainty of the diversion forecast for each aggregated segment. The range of uncertainty for the induced travel forecast was based on judgement.

The simplified model and the associated levels of uncertainty were incorporated in the "@Risk" package. Monte Carlo analyses were used to derive the overall range of uncertainty of the forecasts of patronage (and yield).

A variant of this approach was used in other studies (B and C), both concerned with forecasting the mode shares for new public transport facilities.

As before, the key uncertainties, a subset of the full range of uncertainties, were first identified (in the log) and sensitivity tests were used to identify the effects on the model forecasts of each individual source of uncertainty. These were then combined in a Monte Carlo simulation of the form:

F_{k} = F_{0} * (1 + u_{1k}) * (1 + u_{2k}) * (1 + u_{3k}) * ... * (1 + u_{nk})

where:

F_{0} is the core model forecast

F_{k} is a revised forecast "k" allowing for uncertainties in each source

u_{i} are the individual sources of uncertainty and u_{ik} are the sampled uncertainty perturbations in forecast "k" for each source "i".

This simple model is run many times, sampling values of u_{ik} from their respective distributions of uncertainty, to give a distribution of the
forecast F_{k}. In both studies the Monte Carlo approach was implemented
within a spreadsheet.
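The spreadsheet scheme is straightforward to reproduce. The sketch below uses hypothetical rectangular distributions for six sources; the half-widths and the core forecast are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
F0 = 10_000  # core model forecast (illustrative patronage figure)

# One column of perturbations per uncertainty source u_i, here all
# rectangular with hypothetical half-widths:
half_widths = [0.15, 0.10, 0.10, 0.05, 0.05, 0.20]
u = np.column_stack([rng.uniform(-w, w, n) for w in half_widths])

# F_k = F_0 * prod_i (1 + u_ik)
Fk = F0 * np.prod(1.0 + u, axis=1)

p5, p50, p95 = np.percentile(Fk, [5, 50, 95])
```

Because the perturbations combine multiplicatively, the resulting distribution is mildly right-skewed even though each individual source is symmetric, so the median forecast sits slightly below the core forecast.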

In study C, it was further postulated that some sources could be correlated and the combined variance calculations were amended to allow for partial correlations between these sources.
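Partial correlation between sources can be introduced by sampling the perturbations jointly rather than independently, for example from a multivariate normal distribution. The correlation coefficient and standard deviations below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical: two error sources (say, population and GDP forecast
# errors) partially correlated with rho = 0.6, s.d.s of 10% and 15%.
sds = np.array([0.10, 0.15])
rho = 0.6
cov = np.array([[sds[0]**2,            rho * sds[0] * sds[1]],
                [rho * sds[0] * sds[1], sds[1]**2           ]])

# Jointly sampled perturbations, one row per Monte Carlo draw
u = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

# The sample reproduces the intended correlation and can be fed into
# the multiplicative forecast model in place of independent draws
sample_rho = np.corrcoef(u[:, 0], u[:, 1])[0, 1]
```

Ignoring a positive correlation of this kind understates the combined variance, since correlated errors tend to push the forecast in the same direction at the same time.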

Also in study C, the models were far too complex to consider estimating the uncertainty due to model errors by sensitivity testing individual coefficients. However, the forecasts of market share made by this model had been extensively validated against the market shares achieved by other, similar projects. Therefore, instead of individual model parameter tests, we used the results of the validation to estimate the likely overall range of uncertainty around the market share forecasts.

In all three studies, individual sensitivity tests were also used to illustrate the impacts of those combinations of uncertainties of most concern to the decision-makers.