When implementing the above concept, one needs to decide which technologies to use. In our case, the most important criterion was that large-scale scenarios (several million agents) should be feasible, followed by the desire for flexibility and interoperability.
The framework's strategic/mental layer provides the mental state of the agents and allows them to learn about their environment and make decisions about their behavior. We divide this layer into a central agent database and several behavioral modules that model the different kinds of decisions that affect an agent's plan. For example, one module chooses activity durations, and another chooses routes. Figure 3 graphically depicts the relationships and interactions between these components. The agent database provides each agent with part of its mental state, and with a high-level decision-making ability. The modules provide the rest of the mental state and more detailed decision making abilities.
The task of the agent database is to maintain, for each agent in the simulation, some number of plans plus their scores, to select plans according to their scores, to add new plans, and to remove plans that perform poorly. This looks like a standard database, and in fact our first prototype was implemented in MySQL, an open-source relational database (32). A standard relational database is, however, not well suited to data that has hierarchies of variable-length objects. In our case, we have a large number of agents, each of which has a variable number of plans, each of which has a variable number of activities/legs, each of which has a route description of variable length. Since the plans of a particular agent are added and removed one by one, they end up spread out in memory within the database, resulting in slow performance. In addition, the relational database approach is awkward to use since, once more, the information belonging to one agent is not in one place.
In fact, one would need an object-oriented database rather than a standard relational database. However, object-oriented databases are slow, which is a direct consequence of the difficulty of inserting variable-length objects into linear memory. On the other hand, many properties of databases, such as a state that remains consistent even after crashes, are not needed for our purposes. It is therefore tempting to implement the agent database completely in software. For performance reasons, we chose C++ and made heavy use of the STL (Standard Template Library). This allows the program to implement a Person class, which contains one or more Plan objects. Each plan contains a sequence of activities and legs, and each leg contains the description of the route. Since the STL is used, it is straightforward to, say, add a plan to a person or remove one. Also, since the whole agent database is written in C++, it is straightforward to do computations such as plan selection based on a logit model. The number of plans that an agent database in software can hold is limited by the memory that a single process can address. On a 32-bit architecture, this limit is 2 GByte. Since one plan needs about 0.5 KByte in our current implementation, it can hold about 1 million agents with a maximum of 4 plans each (4 million plans at 0.5 KByte each is about 2 GByte).
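The nested structure described above can be sketched with STL containers as follows. This is only a minimal illustration, not the actual implementation: all type names, fields, and function names are hypothetical, and the logit selection assumes a single scale parameter beta and takes the random number u as an argument so that the example stays deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types; names and fields are illustrative assumptions.
struct Activity {
    std::string type;        // e.g. "home" or "work"
    int link_id = 0;         // location of the activity
    double end_time = 0.0;   // seconds after midnight
};

struct Leg {
    std::string mode;        // e.g. "car"
    std::vector<int> route;  // variable-length sequence of link ids
};

struct Plan {
    std::vector<Activity> activities;
    std::vector<Leg> legs;
    double score = 0.0;
};

struct Person {
    int id = 0;
    std::vector<Plan> plans;  // variable number of plans per agent
};

// Logit choice probabilities p_i = exp(beta*s_i) / sum_j exp(beta*s_j).
// The best score is subtracted first to avoid overflow in exp().
std::vector<double> logit_probabilities(const Person& person, double beta) {
    assert(!person.plans.empty());
    double best = person.plans.front().score;
    for (const Plan& plan : person.plans) best = std::max(best, plan.score);
    std::vector<double> p;
    double sum = 0.0;
    for (const Plan& plan : person.plans) {
        p.push_back(std::exp(beta * (plan.score - best)));
        sum += p.back();
    }
    for (double& x : p) x /= sum;
    return p;
}

// Pick a plan index from the logit probabilities, given u drawn
// uniformly from [0, 1) by the caller.
std::size_t select_plan(const Person& person, double beta, double u) {
    const std::vector<double> p = logit_probabilities(person, beta);
    double cumulative = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        cumulative += p[i];
        if (u < cumulative) return i;
    }
    return p.size() - 1;  // guard against rounding in the cumulative sum
}

// Remove the lowest-scoring plan, but always keep at least one plan.
void remove_worst_plan(Person& person) {
    if (person.plans.size() <= 1) return;
    auto worst = std::min_element(
        person.plans.begin(), person.plans.end(),
        [](const Plan& a, const Plan& b) { return a.score < b.score; });
    person.plans.erase(worst);
}
```

Adding a plan is then a single `push_back` on `person.plans`, and everything belonging to one agent stays reachable from one object, which is the property the relational prototype lacked.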
The agent database needs to communicate with external strategy generation modules, and to send plans to the mobility simulation (Fig. 3). All communication uses exactly the same plans format. This format uses XML; an example is in Fig. 4. As one can see, the format is rather intuitive; this is in stark contrast to the TRANSIMS files. However, the main advantage of XML is its extensibility. That is, one can add fields to the format without breaking existing parsers. In particular, one can add fields only to a subset of agents, for example a format to describe a conditional strategy (Sec. 2.3). Such extensions would be very hard to do with TRANSIMS. It is important to note that the principal units of description are ``agents'' and ``plans''. Any external module using the same principal units will be able to communicate with our system. Somewhat unexpectedly, file size turned out to be a minor issue with XML: when compressed, XML files have about the same size as TRANSIMS files with the same information.
A question remains of how to feed performance information from the mobility simulation back to the strategic modules (Fig. 3). Our current solution is that the entire output of the physical simulation consists of events which are output directly when they happen. For example, a traveler can depart, can enter/leave a link, etc. That is, the simulation of the physical system performs no data aggregation; this is done by the other modules themselves.
At this point, we are still investigating whether events should be in plain text or in XML format; there are some performance advantages to the former, but in the long run these will probably be outweighed by the flexibility advantages of the latter. An XML events format roughly looks as follows:
8#4
Note that such a line is generated separately for each event.
The agent database, for example, will read through the events information and register, for each agent, events that are necessary to compute the score. Since at this point the score depends on activity arrival and departure times only (see Sec. 4.7), these are the only events that the agent database will consider. In contrast, the router will read through the events and look for link entering/leaving events. If an agent enters a link, the router will store that information somewhere. If an agent leaves a link, the router will search for the corresponding link enter event, compute the link travel time, and enter that into some averaging mechanism for the link.
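The router's use of link enter/leave events described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `Event` fields, the `EventType` values, and the class name are hypothetical, events are assumed to be already parsed and delivered in time order, and a plain mean over the whole run stands in for whatever averaging mechanism the router actually uses.

```cpp
#include <map>
#include <utility>

// Hypothetical event record; field names are illustrative assumptions.
enum class EventType { EnterLink, LeaveLink };

struct Event {
    double time;   // seconds after midnight
    int agent_id;
    int link_id;
    EventType type;
};

// Collects link travel times from a stream of events.
class LinkTravelTimes {
public:
    void handle(const Event& e) {
        const std::pair<int, int> key(e.agent_id, e.link_id);
        if (e.type == EventType::EnterLink) {
            // Remember when this agent entered this link.
            enter_time_[key] = e.time;
        } else if (e.type == EventType::LeaveLink) {
            // Find the matching enter event and record the travel time.
            auto it = enter_time_.find(key);
            if (it == enter_time_.end()) return;  // no matching enter event
            Stats& s = stats_[e.link_id];
            s.sum += e.time - it->second;
            s.count += 1;
            enter_time_.erase(it);
        }
    }

    // Mean observed travel time on a link, or `fallback` if the link
    // was never traversed.
    double mean(int link_id, double fallback) const {
        auto it = stats_.find(link_id);
        if (it == stats_.end() || it->second.count == 0) return fallback;
        return it->second.sum / static_cast<double>(it->second.count);
    }

private:
    struct Stats {
        double sum = 0.0;
        long count = 0;
    };
    std::map<std::pair<int, int>, double> enter_time_;
    std::map<int, Stats> stats_;
};
```

A natural choice for `fallback` would be the free-speed travel time of the link. The agent database could use the same pattern, listening only for the activity arrival and departure events it needs for scoring.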
The advantage of events is that they are very easy to implement in the simulation of the physical system. In contrast, any data aggregation inside the simulation of the physical system is, in our experience, a continuous source of errors. This has to do with the fact that the team that writes the simulation is not truly interested in correct aggregation: their main tool to check simulation correctness is the visual impression (and maybe some traffic flow considerations). In contrast, the team that is responsible for, say, the router or the agent database has a much higher interest in the correctness of the aggregation, since without it their module will not function. In our experience, seemingly trivial aspects such as this are rather important for the long-term robustness of the system.
The modules need to be called in a certain sequence in order to make the system run. For example, choosing new activity locations will necessitate new routes to and from the changed activities, so the route planning module should be called sometime after the activity location module is called. But routes and activity times do not (strictly) depend on each other, so it would be possible to make calls to either the activity time choice module or the router without calling the other one, or call them both in an arbitrary order.
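One way to derive a valid calling sequence from such dependencies is a topological sort. The sketch below uses Kahn's algorithm; it is an illustration rather than the mechanism used in the framework, and the function name and the module names in the usage note are hypothetical. Modules that do not depend on each other may come out in either order, which matches the freedom described above.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <stdexcept>
#include <string>
#include <vector>

// depends_on[m] lists the modules that must be called before module m.
// Returns one calling sequence that respects all dependencies, or throws
// if the dependencies are circular.
std::vector<std::string> call_order(
    const std::map<std::string, std::vector<std::string>>& depends_on) {
    std::map<std::string, int> unmet;  // number of unmet dependencies
    std::map<std::string, std::vector<std::string>> dependents;
    for (const auto& entry : depends_on) {
        unmet[entry.first] += 0;  // make sure the module is registered
        for (const auto& dep : entry.second) {
            unmet[dep] += 0;
            unmet[entry.first] += 1;
            dependents[dep].push_back(entry.first);
        }
    }

    // Start with all modules that depend on nothing.
    std::queue<std::string> ready;
    for (const auto& entry : unmet) {
        if (entry.second == 0) ready.push(entry.first);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string module = ready.front();
        ready.pop();
        order.push_back(module);
        // Calling this module satisfies one dependency of each dependent.
        for (const auto& d : dependents[module]) {
            if (--unmet[d] == 0) ready.push(d);
        }
    }

    if (order.size() != unmet.size()) {
        throw std::runtime_error("circular module dependencies");
    }
    return order;
}
```

For example, if a hypothetical "router" module depends on "activity_location" while "activity_times" depends on nothing, the returned sequence places "activity_location" before "router" and leaves the position of "activity_times" unconstrained.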
At this point, let us assume that we treat period-to-period replanning only, and that each period corresponds to a day. Within-period replanning will be briefly discussed in Sec. 6. Let us assume further that the list of available modules is known, as well as the dependencies between them, and that the dependencies can be fulfilled without calling modules in a ``circular'' order. Let us also assume that there is some initial plans file which contains every agent, and in which each agent has exactly one completely specified plan. Such initial plans files can be generated with variations of the methods discussed in this paper, but the system is easier to explain if one assumes the file is already there. Finally, let us assume that the mobility simulation was run based on the initial plans file, and that it has written events to a file. This initial condition is now followed by many iterations, each composed of the following sequence of actions:
The specifications of an external strategy module are perhaps already clear at this point. The minimum requirements are:
The specifications of the mobility simulation are perhaps also already clear at this point. The minimum requirements are:
Furthermore, future versions will necessitate a consistent way to deal with travel in different modes. It is clear that, in order to execute traffic with different modes, the use of these modes needs to be planned by the agent database and its external modules. However, as a simplification one could just assume that the execution follows exactly the plan - this would correspond to a system without congestion, without unexpected variability, etc. In that case, there are two options:
9#5
A simulation that can simulate the walk mode would let the agent walk
along the specified route. A simulation that cannot simulate the walk
mode would just assume that the walk takes 20 minutes, as specified in
the duration, and move the agent to the next activity
accordingly.
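The fallback behavior just described can be sketched as follows. This is a minimal sketch: the `PlannedLeg` fields, the set of supported modes, and the `simulate` callback are hypothetical stand-ins for whatever a particular mobility simulation provides.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical leg description taken from a plan.
struct PlannedLeg {
    std::string mode;         // e.g. "walk" or "car"
    double departure_time;    // seconds after midnight
    double planned_duration;  // seconds, as specified in the plan
};

// Returns the arrival time at the next activity. If the simulation
// supports the leg's mode, the duration comes from `simulate`; otherwise
// the planned duration is taken at face value.
double execute_leg(const PlannedLeg& leg,
                   const std::set<std::string>& simulated_modes,
                   const std::function<double(const PlannedLeg&)>& simulate) {
    if (simulated_modes.count(leg.mode) > 0) {
        return leg.departure_time + simulate(leg);
    }
    return leg.departure_time + leg.planned_duration;
}
```

With a walk leg planned for 20 minutes and a simulation that only knows the car mode, the agent simply arrives 1200 seconds after departure.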
As mentioned elsewhere, the agent database needs a scoring function in order to give scores to plans that were executed. That scoring function needs to be entirely based on events information, and it needs to score the complete period (e.g. day). An example of a utility-based scoring function will be presented in Sec. 4.7.
An open problem is how to couple the scoring function used by the agent database to the scoring functions used by the external strategy generation modules. Because of stochastic effects, they need not be completely consistent, but as mentioned before, some conceptual overlap is necessary. At this point, we solve this problem by manually defining the goals of the external modules. This is a subject of further investigation.