Next: Master-Slave Approach Up: Large scale transportation simulations Previous: Introduction

Domain decomposition

Domain decomposition might be defined as partitioning the geographical region into subregions of approximately equal size (Fig. 1). It is one of the crucial issues of parallel computing. After partitioning the domain into subdomains, each CPU in the system is assigned to one of these subdomains and performs the calculation on that subdomain.

**Figure:** From [6].Domain decomposition of transportation network. *Left:* Global view. *Right:* View of a slave CPU. The slave CPU is only aware of part of the network which is attached to its local nodes. This includes links which are shared with neighbor domains.
$\includegraphics{distrib-gz.eps}$

Since some of the vehicles in the traffic might leave a subdomain and enter into another subdomain on the way to their destinations, the traffic flow information near the boundary of the neighbor subdomains (or CPUs) needs to be exchanged. This is necessary in order to maintain the consistency between the CPUs.

In the following, we will describe the domain decomposition method for the cellular automata (CA) implementation which is used in TRANSIMS [12]. That particular implementation, however, is used for exposition only; the parallelization approach works on any driving logic which has a similar structure. The domain decomposition for parallelization is straightforward if the state at time t+1depens only on information from time step t, and on neighboring cells. Therefore, an updating process in such a system is in principle composed of two elements, namely, a communication for the boundary information at time step t, and an update from time step t to t+1. In the actual implementation, we use two communcations and two sub-updates per time step, see later.

Traffic simulations fulfill two conditions which make this approach efficient:

Domains of similar size: The street network can be partitioned into domains of similar size. A realistic measure for size is the accumulated length of all streets associated with a domain.
Short-range interactions: For driving decisions, the distance of interactions between drivers is limited. In our CA implementation, on links all of the TRANSIMS-1999 [12] rule sets have an interaction range of 37.5 meters (= 5 cells, each of which has a length of 7.5 meters) which is small with respect to the average link length. Therefore, the network easily decomposes into independent components.

We decided to cut the street network in the middle of links rather than at intersections; THOREAU [7] does the same. This separates the traffic complexity at the intersections from the complexity caused by the parallelization and makes optimization of computational speed easier.

In the implementation, each divided link is fully represented in both CPUs. Each CPU is responsible for one half of the link. In order to maintain consistency between CPUs, the CPUs send information about the first five cells of ``their'' half of the link to the other CPU. Five cells is the interaction range of all CA driving rules on a link. By doing this, the other CPU knows enough about what is happening on the other half of the link in order to compute consistent traffic. Therefore the resulting simplified update sequence on the split links is as follows (Fig. 2):

**Figure:** Example of parallel logic of a split link with two lanes.The figure shows the general logic of one time step. Remember that with a split link, one CPU is responsible for one half and another CPU is responsible for the other half. These two halves are shown separately but correctly lined up. The dotted part is the ``boundary region'', which is where the link stores information from the other CPU. The arrows denote when the information is transferred from one CPU to the other via boundary exchange [6].
$\resizebox*{0.8\textwidth}{0.8\textheight}{\includegraphics{parking-fig.eps}}$

Change lanes.
Exchange boundary information.
Calculate speed and move vehicles forward.
Exchange boundary information.

Note, however, that use of the CA can be viewed as a didactic example; any traffic simulation logic where the state at time t+1 uses only information from time t and where interaction is local can be parallelized in this way.

Next: Master-Slave Approach Up: Large scale transportation simulations Previous: Introduction

Nurhan Cetin
2001-05-31