A metropolitan region can consist of 10 million or more inhabitants, and simulating them places considerable demands on computational performance. These demands are compounded by the repeated execution of the relaxation iterations. In contrast to simulations in the natural sciences, traffic particles (travelers, vehicles) possess internal intelligence. This internal intelligence translates into rule-based code, which does not vectorize and therefore does not run efficiently on traditional vector supercomputers, such as the Cray series. It does, however, run well on modern workstation architectures, which makes traffic simulations ideally suited for clusters of PCs, also called Beowulf clusters. One uses domain decomposition, that is, each CPU obtains a patch of the geographical region. Information and vehicles are exchanged between the patches via message passing, for example using MPI (Message Passing Interface) (27).
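The idea of domain decomposition can be sketched as follows. This is a minimal illustration with a toy network, not the paper's implementation: a real decomposition would use a geographic or graph partitioner, and the boundary exchange would go through MPI messages rather than a Python list.

```python
# Toy road network: link id -> (from_node, to_node).  Purely illustrative.
links = {
    "a": (0, 1), "b": (1, 2), "c": (2, 3), "d": (3, 4), "e": (4, 5),
}

def partition(nodes, n_cpus):
    """Assign each node to a CPU (patch).  A real system would use a
    geographic or graph partitioner; here we simply split the sorted
    node list into contiguous chunks."""
    nodes = sorted(nodes)
    size = -(-len(nodes) // n_cpus)  # ceiling division
    return {n: i // size for i, n in enumerate(nodes)}

nodes = {n for ft in links.values() for n in ft}
owner = partition(nodes, 2)

# Boundary links connect nodes owned by different CPUs: vehicles crossing
# them must be handed to the neighboring patch via message passing.
boundary = [l for l, (f, t) in links.items() if owner[f] != owner[t]]
print(boundary)
```

Only the boundary links generate communication; all other links are simulated entirely within one patch, which is why the approach scales well when patches are compact.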
Two important numbers for judging the performance of a parallel simulation are the speed-up and the real time ratio (RTR). The RTR is the simulated time divided by the computing time, rtr(p) = T_sim / T(p), where T(p) is the computing time on p CPUs; a simulation with rtr = 100 simulates traffic 100 times faster than real time. The speed-up is the computing time on one CPU divided by the computing time on p CPUs, S(p) = T(1) / T(p).
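The two measures can be computed directly from wall-clock timings. The timing values below are hypothetical, chosen only so that the single-CPU RTR comes out at eight, matching the value quoted later in the text:

```python
# Sketch of the two performance measures with hypothetical timing data.
T_SIM = 24 * 3600.0        # simulated period: a 24-hour day, in seconds

computing_times = {        # hypothetical wall-clock times in seconds
    1: 10800.0,            # single CPU
    8: 1500.0,
    64: 300.0,
}

def real_time_ratio(p):
    """Simulated time divided by computing time on p CPUs."""
    return T_SIM / computing_times[p]

def speedup(p):
    """Computing time on 1 CPU divided by computing time on p CPUs."""
    return computing_times[1] / computing_times[p]

for p in sorted(computing_times):
    print(p, real_time_ratio(p), speedup(p))
```

Note that the RTR depends on the simulated period, while the speed-up does not; this is why the two measures react differently to changes in scenario size.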
Fig. 5 shows measured and predicted computing speeds as a function of the number of CPUs for the queue micro-simulation and the Switzerland scenario. Both plots refer to the simulation of the morning peak, as explained earlier in this paper. The top plot shows the real time ratio (RTR); the bottom plot shows the speed-up. As one can see, the two plots are related by a vertical shift of the data, which on a logarithmic scale corresponds to multiplication by a constant factor. That factor is the RTR of the simulation on a single CPU; in our case, the RTR on a single CPU is about eight. The difference between RTR and speed-up becomes important when the scenario size changes: In the RTR plot, the curve would be shifted to the left or right for smaller or larger scenarios, respectively, meaning that the maximally reachable RTR would not change, but would be reached at a different number of CPUs. In the speed-up plot, the curve would be shifted down or up, respectively, meaning that the maximally reachable speed-up depends on the size of the scenario. That is, for both plots the results depend on the scenario size; it is impossible to make parallel performance predictions without knowledge of the scenario size.
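The vertical shift follows from the definitions: since rtr(p) = T_sim / T(p) and S(p) = T(1) / T(p), one has rtr(p) = rtr(1) × S(p), so on a logarithmic axis the RTR curve is the speed-up curve shifted by the constant log rtr(1). A small check, with hypothetical speed-up values but rtr(1) = 8 as quoted in the text:

```python
import math

rtr_single = 8.0                                 # rtr(1), as in the text
speedups = {1: 1.0, 4: 3.6, 16: 12.0, 64: 36.0}  # hypothetical S(p) values

for p, s in speedups.items():
    rtr = rtr_single * s                         # rtr(p) = rtr(1) * S(p)
    shift = math.log10(rtr) - math.log10(s)      # vertical offset on log axis
    print(p, rtr, round(shift, 9))               # shift is log10(8) for all p
```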
High computing speeds matter: For real time applications, one wants the simulation to be considerably faster than reality so that the prediction is finished well before reality catches up. For transportation planning, about 50 iterations between simulation and plans generation are necessary, meaning 50 × 24 = 1,200 hours of simulated traffic for a 24-hour scenario. With our real time ratio of 200, the computing time for this would still be 6 hours for the micro-simulation alone.
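The arithmetic behind this claim is simple enough to verify directly:

```python
# Back-of-the-envelope check of the computing-time claim in the text:
# 50 iterations of a 24-hour scenario at a real time ratio of 200.
iterations = 50
simulated_hours_per_iteration = 24
real_time_ratio = 200

total_simulated_hours = iterations * simulated_hours_per_iteration  # 1200 h
computing_hours = total_simulated_hours / real_time_ratio
print(computing_hours)  # -> 6.0 hours for the micro-simulation alone
```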
Besides the micro-simulation, the feedback mechanism also consumes computing time; including re-routing and agent database operations, it currently takes roughly 45 min per iteration for the morning-peak Switzerland scenario (15). This is clearly the bottleneck of the current approach; better implementations are under investigation.
In summary, an iteration using our current implementation takes less than one hour. Running fifty iterations thus takes about two days.
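Combining the numbers quoted above gives a rough per-iteration budget. The split of micro-simulation time across iterations is an assumption for illustration (6 hours of micro-simulation spread evenly over 50 iterations):

```python
# Rough iteration budget from the numbers in the text (illustrative split).
iterations = 50
microsim_total_min = 6 * 60    # 6 h of micro-simulation across all iterations
feedback_min = 45              # re-routing + agent database, per iteration

per_iteration_min = microsim_total_min / iterations + feedback_min
total_days = iterations * per_iteration_min / 60 / 24
print(per_iteration_min, total_days)  # just over 52 min; roughly two days
```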