A metropolitan region can have 10 million or more inhabitants, and
simulating all of them places considerable demands on computational
performance. This demand is compounded by the repeated execution of
the relaxation iterations. In contrast to simulations in the natural
sciences, traffic particles (travelers, vehicles) possess internal
intelligence. This internal intelligence translates into rule-based
code, which does not vectorize and therefore does not run efficiently
on traditional vector supercomputers, such as the Cray series. It
does, however, run well on modern workstation architectures, which
makes traffic simulations ideally suited for clusters of PCs, also
called Beowulf clusters. One uses domain decomposition: each CPU
obtains a patch of the geographical region, and information and
vehicles are exchanged between the patches via message passing, for
example with MPI (Message Passing Interface) (27).
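As a rough sketch of the domain-decomposition idea, the following fragment moves a vehicle across the boundary between two patches. The class names, the toy movement rule, and the in-memory `exchange()` function (standing in for MPI messages between CPUs) are all illustrative assumptions, not the actual implementation.

```python
# Sketch of domain decomposition: each patch owns a set of links;
# vehicles crossing a patch boundary go into an outbox, and an
# exchange phase (MPI messages in a real code) delivers them.

class Vehicle:
    def __init__(self, vid, position, destination):
        self.vid = vid
        self.position = position        # current link id
        self.destination = destination  # target link id

class Patch:
    """One CPU's share of the road network (a set of link ids)."""
    def __init__(self, links):
        self.links = set(links)
        self.vehicles = []
        self.outbox = []                # vehicles leaving this patch

    def step(self):
        """Advance every vehicle one link toward its destination."""
        staying = []
        for v in self.vehicles:
            if v.position != v.destination:
                v.position += 1         # toy movement rule
            if v.position in self.links:
                staying.append(v)
            else:
                self.outbox.append(v)   # crossed the patch boundary
        self.vehicles = staying

def exchange(patches):
    """Stand-in for the MPI message-passing phase."""
    for p in patches:
        for v in p.outbox:
            for q in patches:
                if v.position in q.links:
                    q.vehicles.append(v)
                    break
        p.outbox = []

# Two patches splitting links 0..9; one vehicle drives across the cut.
west, east = Patch(range(0, 5)), Patch(range(5, 10))
west.vehicles.append(Vehicle(1, position=3, destination=8))
for _ in range(6):
    for p in (west, east):
        p.step()
    exchange((west, east))
```

In a distributed implementation, `exchange()` would serialize each outbox and send it to the neighboring CPU with an MPI point-to-point message rather than appending to an in-memory list.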
Two important numbers for judging the performance of a parallel simulation are the speed-up and the real time ratio. The speed-up is S(p) = T(1)/T(p), the computing time on one CPU divided by the computing time on p CPUs; the real time ratio is RTR = T_sim/T_comp, the amount of simulated time divided by the computing time needed to simulate it.
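The two quantities, and the multiplicative relation RTR(p) = RTR(1) · S(p) between them, can be written out as follows. The single-CPU RTR of about eight and the RTR of 200 are taken from this section; the computing times used below (10800 s and 432 s for a 24-hour scenario) are illustrative assumptions chosen to be consistent with them.

```python
def speedup(t_serial, t_parallel):
    """S(p) = T(1) / T(p): gain of p CPUs over one CPU."""
    return t_serial / t_parallel

def real_time_ratio(simulated_seconds, computing_seconds):
    """RTR = simulated time / computing time."""
    return simulated_seconds / computing_seconds

# Assumed computing times for a 24-hour scenario: 10800 s on one CPU,
# 432 s on p CPUs (consistent with RTR(1) = 8 and RTR(p) = 200).
rtr_1 = real_time_ratio(24 * 3600, 10800)  # single-CPU RTR: 8.0
s_p   = speedup(10800, 432)                # speed-up: 25.0
rtr_p = rtr_1 * s_p                        # RTR(p) = RTR(1) * S(p) = 200.0
```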
Fig. 5 shows measured and predicted computing speeds as a function of the number of CPUs for the queue micro-simulation and the Switzerland scenario. Both plots refer to the simulation of the morning peak, as explained earlier in this paper. The top plot shows the real time ratio (RTR); the bottom plot shows the speed-up.

As one can see, the two plots are related by a vertical shift of the data, which, because of the logarithmic scale, corresponds to multiplication by a constant. That constant is the RTR of the simulation on a single CPU; in our case, the RTR on a single CPU is about eight.

The difference between RTR and speed-up becomes important when the scenario size changes: In the RTR plot, the graph would be shifted to the left for smaller scenarios and to the right for larger ones, meaning that the maximally reachable RTR would not change but would be reached at a different number of CPUs. In the speed-up plot, the graph would be shifted down or up, respectively, meaning that the maximally reachable speed-up depends on the size of the scenario. In both plots, then, the results depend on the scenario size; it is impossible to make parallel performance predictions without knowledge of the scenario size.
High computing speeds matter: For real time applications, one wants
the simulation to run considerably faster than reality, so that the
prediction is finished well before reality catches up. For
transportation planning, about 50 iterations between simulation and
plans generation are necessary, meaning 1200 hours of simulated
traffic for a 24-hour scenario. With our real time ratio of 200, the
computing time for this would still be 6 hours for the
micro-simulation alone.
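The arithmetic behind these figures is easy to check:

```python
# Computing-time estimate for the transportation-planning use case,
# using the numbers given in the text.
iterations = 50
scenario_hours = 24
rtr = 200                                        # real time ratio

simulated_hours = iterations * scenario_hours    # 1200 h of simulated traffic
computing_hours = simulated_hours / rtr          # 6.0 h for the micro-simulation
```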
[Fig. 5: real time ratio (top) and speed-up (bottom) as functions of the number of CPUs]
Besides the micro-simulation, the feedback mechanism also consumes computing time: including re-routing and agent database operations, it currently takes roughly 45 min per iteration for the morning peak Switzerland scenario (15). This is clearly the bottleneck of the current approach; better implementations are under investigation.
In summary, an iteration using our current implementation takes less than one hour. Running fifty iterations thus takes about two days.
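A back-of-the-envelope budget is consistent with these totals. The 45-min feedback time is from the text; the 10-min micro-simulation time per iteration is an assumption (the text only implies it is small enough to keep an iteration under one hour).

```python
# Per-iteration time budget, morning peak Switzerland scenario.
feedback_min = 45           # re-routing + agent database (from the text)
microsim_min = 10           # assumed micro-simulation time per iteration
iteration_min = feedback_min + microsim_min      # 55 min, i.e. < 1 hour

iterations = 50
total_days = iterations * iteration_min / 60 / 24  # roughly two days
```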