A prerequisite for using any simulation model is a certain amount of
confidence in its output. The process of building confidence depends
on human nature and is sometimes hard to explain. Yet, an organized
process towards model acceptance would help. Such an acceptance
process may be composed of the following four
elements (117):
- Verification - have the hypothesized behavioral rules been
implemented correctly?
- Validation - do the hypothesized behavioral rules produce
correct emergent behavior, such as correct fundamental diagrams? Note
that this does not specify a quantitative procedure; plausibility,
consistency with theory and experience, and documentation of emergent
behavior are the important elements here.
- Calibration - have the model parameters been optimized to
(possibly site-specific) settings? This requires a decision on a data
set and a decision on an objective function that can quantify
the closeness of the simulation to the data set (see the sketch after
this list).
- Accreditation - given a question, is the model indeed powerful
enough to help answer it?
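To make the calibration element concrete, the following is a minimal
sketch of a formal calibration loop, assuming a least-squares
objective over measured flow-density pairs. The parameter names, the
data values, and the stand-in model function are our own illustration,
not part of any specific simulation package:

import numpy as np
from scipy.optimize import minimize

# Hypothetical measured flow-density pairs (the "data set" decision).
density   = np.array([0.05, 0.15, 0.30, 0.50, 0.70])   # vehicles per cell
flow_data = np.array([0.22, 0.55, 0.60, 0.40, 0.20])   # vehicles per time step

def simulated_flow(density, params):
    # Placeholder for a full model run at the given densities; in
    # practice this would execute the microsimulation and return the
    # emergent flow.  A smooth toy function stands in for it here.
    v_max, p_brake = params
    return (1.0 - p_brake) * v_max * density * (1.0 - density)

def objective(params):
    # The "objective function" decision: sum of squared deviations
    # between simulation output and the data set.
    return np.sum((simulated_flow(density, params) - flow_data) ** 2)

result = minimize(objective, x0=[1.0, 0.2], method="Nelder-Mead")
print("calibrated parameters:", result.x)

Minimizing such an objective is straightforward once the data set and
the objective function are fixed; as discussed below, the difficulty
for microscopic models is that realistic parameter spaces are much
larger than in this two-parameter toy example.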
Note that this process is not uni-directional. For example, if one
cannot calibrate a model very well for a given scenario and a given
objective function, one will go back and change the microscopic rules
and then have to go through verification and validation again.
Also, a formally correct verification process can be shown to be
mathematically hard or computationally impossible except in very
simple situations (see, e.g., Chapters 14-16 in (119)).
Intuitively, the problem is that seemingly unrelated parts of the
implementation can interact in complicated ways, and exhaustively
testing all combinations is impossible. For that reason, both
practitioners and theoreticians suggest that one needs to allocate
resources intelligently between verification and validation.
Sometimes the word ``validation'' is also used when a simulation
model, after calibration to a scenario and data set A, is run
under another scenario to test its predictive performance. Since this
represents in principle the same procedure - running the simulation
model against a scenario without further adjustment - we do not see a
problem in using the word validation in both cases.
Next, one needs to decide on which networks to run the above
studies. The following seem useful:
- Building block cases such as ``traffic in a loop'' or ``traffic
through a yield sign''. The chapters of the Highway Capacity
Manual (114), despite being under discussion, seem to be a
good starting point here. These cases may not be very useful
for calibration, since ``clean'' data on them is difficult if
not impossible to obtain. Yet, they would certainly allow a
plausibility check of a simulation model, and a comparison to other
simulation models (see the sketch after this list).
- Complicated test cases, which test a variety of behaviors, such as
merging or traffic signals, in a larger context (i.e., when
interacting). It would be best to have test cases from the real
world, together with real data. These test cases would ideally be made
available electronically.
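As an illustration of the first kind of building block, here is a
minimal sketch of ``traffic in a loop'', implemented as a simple
stochastic cellular automaton; the rules and parameter values are our
own illustration, and any car-following logic could be substituted.
Sweeping the density and recording the emergent flow yields exactly
the kind of fundamental diagram that the validation step asks to be
documented:

import numpy as np

def ring_flow(length=1000, density=0.2, v_max=5, p_brake=0.3,
              steps=1000, transient=500, rng=None):
    # One-lane loop of `length` cells; returns the average flow
    # (vehicles per cell per time step) after a transient period.
    rng = rng or np.random.default_rng()
    n = int(density * length)
    pos = np.sort(rng.choice(length, size=n, replace=False))
    vel = np.zeros(n, dtype=int)
    moved = 0
    for t in range(steps):
        gap = (np.roll(pos, -1) - pos - 1) % length  # free cells ahead
        vel = np.minimum(vel + 1, v_max)             # accelerate
        vel = np.minimum(vel, gap)                   # never hit the leader
        brake = rng.random(n) < p_brake              # random slowdowns
        vel = np.where(brake, np.maximum(vel - 1, 0), vel)
        pos = (pos + vel) % length                   # advance on the ring
        if t >= transient:
            moved += vel.sum()
    return moved / ((steps - transient) * length)

# Sweep densities to document the emergent fundamental diagram.
for rho in np.arange(0.05, 0.95, 0.05):
    print(f"density {rho:.2f}  flow {ring_flow(density=rho):.3f}")

Even without ``clean'' field data to calibrate against, such a sweep
makes a model's emergent flow-density behavior documentable and
comparable across simulation packages.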
Of course, models have always been validated and calibrated,
e.g. (26,72,97).
For fluid-dynamical models, calibration can be
formalized (33,34). Yet, we
would like to stress that there are two diverging tendencies here:
- Models which are simple (i.e., have few parameters) are easy to
calibrate formally, in the sense that one can adjust the parameters so
that some objective function is minimized. Yet, the model may be too
simple to truly reflect the ``meaning'' of the data.
- Models which have many parameters are in principle capable of
representing a much wider variety of dynamics. Yet, they are
difficult to calibrate formally because they have too many degrees of
freedom. Here, the intuition of the developer, who prescribes the
simplifications, is important; usually this is done by making the
problem more homogeneous than it is (for example, by prescribing that
drivers fall into only a few behavioral classes). Microscopic models
fall into this category.
Ref. (36) nicely illustrates the problem: the
authors indeed decide on an objective function (matching the two
parameters of a two-fluid model description of the real-world
traffic); yet the procedure remains trial and error in the sense that
the authors themselves decide which aspects of NETSIM they believe to
be important.
This indicates, consistent with our own experience, that formal
calibration (in the sense of a formal procedure, as opposed to
trial and error) of microscopic models is currently very hard
to achieve. This, together with the generally valid argument that
calibration does not protect one against having the wrong model,
implies to us that at the ``validation'' level, comparable and
meaningful test suites should be constructed, and that model
behavior in these test suites should be publicly documented. This
effort should be geared towards understanding the strengths and
weaknesses of each participating model (as opposed to deciding which
is the ``best'' model).
In this paper, we concentrate on the ``validation'' part in
the above sense, in conjunction with ``building block'' test cases. We
see this as an important first step; in the future, we would like to
be able to say something like ``the simulations in this study are
based on driving rules whose emergent behavior is documented in the
appendix'', which would recognize the fact that the rules may have
changed since the last ``major'' publication. This does not preclude
attempting to construct more realistic test scenarios in the future.