Results

NOTE: Due to time constraints, material that is too difficult to explain here has been removed. The removed material is:

Still to be tried: It would also make sense to use different kinds of utility functions for activity duration, depending on the type of activity. Staying home and drinking beer would have a logarithmic utility, whereas work could have a parabolic shape (as in Adrian Schneider's work) or some kind of triangular peak, reflecting that your boss won't let you work more than 8 hours. Some activities, such as Kindergarten, would be binary: they would have a fixed amount of utility for doing them, and 0 for not doing them.

In this section we describe the results obtained from three scenarios based on the modules, parameters, network and initial conditions described above.

Routes Only

In the Routes Only scenario, we run the framework with times replanning disabled, so that only route replanning may occur. All agents are forced to keep the initial activity times throughout, departing home at 6:00 AM and staying at work exactly 8 h. This scenario demonstrates how well the agents distribute themselves among the available routes.

Figure 6 shows the relaxation/learning behavior of the agents within this scenario, using two global performance measures: overall average score, and overall average travel time. One sees here that the average score relaxes to about 103.5 within 100-150 iterations, while average travel time relaxes to about 61 min within 20-30 iterations. Compared to the free-speed travel time of 54 min, the agents lose about 7 min due to congestion in this scenario.

**Figure 6:** Relaxation of scores and travel times for all three scenarios of the baseline case. These plots display the average values of the (a) score and (b) travel time collected over the entire population of agents during each iteration.
(a) Average Scores; (b) Average Travel Times

Figure 7 shows the departure and arrival time distributions for this scenario, at iteration 0 (Fig. 7(a)) and iteration 250 (Fig. 7(b)). One can see from this figure that the work arrival time distribution (WATD) starts out at iteration 0 with an average of about 80-85 veh per 5 min time-bin, corresponding to roughly 1'000 veh/h, and lasting for about 2 hours. This makes sense, as the capacity of the bottleneck for a single home-to-work route is 1'000 veh/h and there are 2'000 veh that want to traverse that route. After 250 iterations, the arrival rate increases to about 750 veh/5 min, or 9'000 veh/h, which corresponds to the total capacity of nine routes of 1'000 veh/h each. After 15 minutes, over 90% of the agents have arrived at work, with the remaining arriving in the next 10 minutes. This extra 10 min comes from the incomplete equilibrium in the route distribution caused by the 10% route replanning, which is explained further below. Since agents cannot change their work duration in this scenario, the work departure time distribution (WDTD) is the same as the WATD shifted by 8 hours, and since there are no bottlenecks on the route home, the home arrival time distribution (HATD) is the same as the WDTD shifted by 39 minutes.
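The flow figures quoted above follow from simple unit conversions. A minimal sketch of that arithmetic, using only the numbers given in the text (the function names are illustrative and not part of the framework):

```python
# Convert between 5-minute histogram bin counts and hourly flows,
# and estimate how long a bottleneck needs to discharge a queue.
BIN_MINUTES = 5
BINS_PER_HOUR = 60 // BIN_MINUTES  # twelve 5-minute bins per hour


def bin_count_to_hourly_flow(veh_per_bin: float) -> float:
    """Vehicles per 5-minute bin -> vehicles per hour."""
    return veh_per_bin * BINS_PER_HOUR


def discharge_time_minutes(n_vehicles: int, capacity_veh_per_h: float) -> float:
    """Minimum time (in minutes) for n_vehicles to pass a bottleneck."""
    return 60.0 * n_vehicles / capacity_veh_per_h


if __name__ == "__main__":
    print(bin_count_to_hourly_flow(750))           # nine routes combined
    print(discharge_time_minutes(2000, 1000))      # single route
    print(discharge_time_minutes(2000, 9 * 1000))  # nine routes
```

With these numbers, 2'000 vehicles need 120 min on a single 1'000 veh/h route and a little over 13 min when nine such routes are used, consistent with the arrival windows of roughly 2 hours and 15 minutes described above.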

**Figure 7:** Histograms showing home departure time distribution (HDTD), work arrival time distribution (WATD), work departure time distribution (WDTD), and home arrival time distribution (HATD) of the three scenarios, before and after relaxation (250 iterations). The histograms are taken over 5 minute time-bins. For clarity the range stops at 900 vehicles, though some of the initial departure peaks are above this value; these peaks are labeled with their actual values.
(a) Distributions before relaxation (common to all scenarios); (b) Routes Only distributions after relaxation; (c) Times Only distributions after relaxation; (d) Routes and Times distributions after relaxation

The only degree of freedom in this scenario is the route choice. Figure 8(a) displays the usage of the different routes as a function of iteration. This figure has several features. First, as expected, all agents start out using the initial route (number 7, "middle"), while the other eight routes (represented in the figure by route numbers 5 and 9) start out with no agents. Second, the percentage of agents using the middle route decreases at an approximately negative exponential rate. This makes sense, since 10% of all agents perform replanning each iteration. Some agents return to the middle route due to random plan selection, but most stay on the other routes, lowering the percentage of agents using the middle route by roughly 10% of its previous value each iteration, until the agents are using all routes equally. It takes about 40 iterations for the middle route to reach about the same usage percentage as the other routes.
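The decay described above can be illustrated with a deliberately simplified model. This is an assumption made for illustration, not the framework's actual dynamics: in each iteration, the middle route's excess share over an even split of one ninth shrinks by the replanning fraction of 10%.

```python
def middle_route_share(iteration: int,
                       replan_fraction: float = 0.10,
                       n_routes: int = 9) -> float:
    """Share of agents on the initial route under a toy geometric-decay model.

    Assumption (not from the framework): the excess over the even share
    1/n_routes decays by `replan_fraction` per iteration.
    """
    even_share = 1.0 / n_routes
    excess_at_start = 1.0 - even_share  # all agents start on the middle route
    return even_share + excess_at_start * (1.0 - replan_fraction) ** iteration


if __name__ == "__main__":
    for it in (0, 10, 20, 40):
        print(it, round(middle_route_share(it), 3))
```

In this toy model the middle route's share is within about one and a half percentage points of the even split after 40 iterations, which is in line with the roughly 40 iterations observed in the figure.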

The third feature of this figure is that after equilibrium most routes appear to be used on average by 10% of the agents at a time, rather than the roughly 11.1% (one ninth) expected when nine equivalent routes are available. In addition, some routes appear to be used by 20% of the agents during certain iterations. These phenomena are explained by the fact that 10% of the agents perform replanning in each iteration, leaving the other 90% to choose freely which route they want to use. These 90% split up approximately evenly among the nine available routes, giving each a usage of about 10%, and the 10% who replan tend to choose the same route, which then has a total usage of about 20%. Only three routes are displayed here; if all nine were displayed, there would be a "spike" of 20% route usage occurring for some route in each iteration. This extra group of agents using one particular route causes that route to empty out later than the rest, extending the period over which agents arrive at work, as seen in Fig. 7(b).

Agents who replan tend to choose the same route because of fluctuations in route usage. During a given iteration, some route will happen to be used least, and thus have the best travel time, so the router will use it for all (or most) of the replanned routes in the next iteration. These fluctuations are also driven by the fact that slightly more or less than 10% of the agents may be replanned in each iteration, due to the probabilistic selection of agents for replanning. Note that it is not until near iteration 100 that the spikes appear to be in equilibrium.
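The 10% and 20% usage levels follow directly from the split described above. A minimal sketch of that bookkeeping, assuming an idealized iteration in which the non-replanning agents split exactly evenly and all replanning agents pick the same route (the function name and the choice of favoured route are illustrative):

```python
def route_shares(n_routes: int = 9,
                 replan_fraction: float = 0.10,
                 favoured_route: int = 0) -> list[float]:
    """Idealized per-route usage shares for one iteration.

    Assumptions: the (1 - replan_fraction) non-replanning agents split
    evenly over all routes, and every replanning agent is sent to
    `favoured_route` (the route that was least used in the last iteration).
    """
    base = (1.0 - replan_fraction) / n_routes
    shares = [base] * n_routes
    shares[favoured_route] += replan_fraction
    return shares


if __name__ == "__main__":
    shares = route_shares()
    print([round(s, 3) for s in shares])
```

If the favoured route rotates so that each of the nine routes receives the spike equally often, the usage of any single route averaged over many iterations is again one ninth; the 10% level seen in the plot is the baseline between spikes.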

**Figure 8:** Route distributions for (a) the Routes Only scenario and (b) the Routes and Times scenario. The middle route has a distinctive curve since it is the one initially used by all agents. The other eight routes have curves qualitatively similar to those of Routes 5 and 9, except that the "spikes" occur in different iterations.
(a) Routes Only; (b) Routes and Times

The general interpretation of the above results for the Routes Only scenario is that the agents equilibrate to the different routes as best as they can, and once equilibrated stay in a very stable arrangement.

Times Only

In the second scenario, Times Only, we run the framework with route replanning disabled, so that only times replanning may occur. Here all agents must always use the middle route for their trips from home to work. This scenario demonstrates how well the agents distribute themselves through time, i.e. how they handle peak-hour spreading.

Figure 6 includes the average scores and travel times for this scenario. One can see that these measures contain a considerable amount of oscillation in comparison to those of the other two scenarios, though the oscillation appears to diminish as the iterations continue. We presume that the oscillations are due to the system being in a chaotic regime, though more investigation is necessary to learn the exact cause. For now we only observe that they exist.

The average score oscillates around about 100.7, moving between 100.5 and 101.2, taking at least 200 iterations to reach this state. The average travel time centers around 72 min, oscillating between 68 min and 75 min, again taking about 200 iterations to get to that state. This scenario seems to find the worst scores and travel times of the three. It makes sense that the average travel times come out worse, since the agents cannot get around the 1'000 veh/h bottleneck of the middle route, while agents in the other scenarios can use nine times the capacity of this route for their home to work trips. This in turn explains why the average score is the lowest, because with more time being spent by the agents in travel, they have less time to spend working or at home, so they lose more potential score, which lowers their best achievable score.

Figure 7 shows the departure and arrival time distributions for this scenario, at iteration 0 (Fig. 7(a)) and iteration 250 (Fig. 7(c)). One can see that after 250 iterations, the WATD is still spread out over 2 hours and is still limited to 1'000 veh/h. This is as much as can be expected when all agents use the same route. The main peak of the home departure time distribution (HDTD) has shifted to about 5:30am, earlier than in iteration 0, with a secondary peak around 7am. Figure 9 shows a close-up of the distributions during the morning rush hour. In this figure one can see that about 3/4 of the agents arrive in the 90 min before 7am, and about 1/4 arrive in the half hour after. This makes sense, because agents arriving at work early are effectively penalized at -6/h, while those arriving late are penalized at three times this rate. So, an agent arriving 30 min late incurs the same penalty as one arriving 90 min early. Returning to Fig. 7(c), we see that the WDTD and HATD are much more spread out than the WATD, lasting about 3 hours. Presumably many agents work late to offset their arrival penalty, though we have not verified this; it also remains open whether Vickrey's model predicts this behavior.
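The equivalence between arriving 30 min late and 90 min early can be checked directly from the penalty rates quoted above. A minimal sketch, in which the function name and the sign convention for the offset are illustrative choices:

```python
BETA_EARLY_PER_H = -6.0                   # effective penalty for arriving early (from the text)
BETA_LATE_PER_H = 3.0 * BETA_EARLY_PER_H  # late arrival is penalized three times as much


def arrival_penalty(offset_minutes: float) -> float:
    """Score penalty for arriving `offset_minutes` away from the work start.

    Convention (illustrative): negative offset = early, positive = late.
    """
    hours = abs(offset_minutes) / 60.0
    rate = BETA_LATE_PER_H if offset_minutes > 0 else BETA_EARLY_PER_H
    return rate * hours


if __name__ == "__main__":
    print(arrival_penalty(-90))  # 90 min early
    print(arrival_penalty(30))   # 30 min late
```

Both calls give the same penalty, so an arrival window three times as long before 7am as after it is what one would expect if agents spread out until the marginal penalties balance.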

**Figure 9:** Histograms showing HDTD and WATD for the Times Only scenario, after relaxation (250 iterations). The histograms are taken over 5 minute time-bins.

The general interpretation of the results for the Times Only scenario is that even with a time choice module that simply mutates existing plans, the feedback mechanism and the agent database allow agents to learn enough about the system to find a plausible distribution of departure times.

Routes and Times

In the Routes and Times scenario, we finally allow agents to use both the routing module and the time choice module to develop new plans. Agents who perform time replanning also perform route replanning on the resulting plan, as discussed in Sec. 4.4. This scenario demonstrates the complete relaxation behavior of the agents, where they may spread out over space and time.

Figure 6 includes the relaxation of scores and travel times for this scenario. One sees here that the average score is never perfectly relaxed, with what appears to be a slight oscillation with a period of about 800 iterations. However, after about 300 iterations the score seems to be rather stable, oscillating around 108. The average travel time initially reaches the free-speed travel time within 100 iterations, then deviates from this value, eventually flattening out at about 55 min. The travel times may also have an oscillation, though it might also be a one-time "bump." More iterations would be required to find this out. It seems reasonable that this occurs because the agents are able to compensate for slightly varying travel times, meaning the travel time is not as important to them when they have more degrees of freedom to explore. In any case, this scenario finds a better average score and better average travel time than the other two scenarios, as expected given the larger number of degrees of freedom given to the agents.

Figure 7 shows the departure and arrival time distributions for this scenario, at iteration 0 (Fig. 7(a)) and iteration 250 (Fig. 7(d)). In iteration 250, the HDTD peak has shifted to about 6:45am, a later time than that at iteration 0, or that at iteration 250 for the other scenarios. It makes sense that the peak is at a later time than that of the Times Only scenario, as the average travel times in this scenario are shorter. The time of 6:45am makes sense as well, because most agents only need 15 min for the home to work trip. This is supported by the narrow WATD peak, which indicates that most agents arrive at work between 6:50am and 7am. The peak is nearly the same as the HDTD peak, only shifted by 15 min. See also Fig. 10 for a close-up of those peaks. Naturally, both the HDTD and WATD peaks are wider in this scenario than in Routes Only, since the agents can explore alternate departure times from home. They are not as wide as those in Times Only, since agents can also take alternate routes to avoid congestion, and don't have to spread out in time as much.

**Figure 10:** Histograms showing HDTD and WATD for the Routes and Times scenario, after relaxation (250 iterations). The histograms are taken over 5 minute time-bins.

Figure 8(b) displays the usage of the different routes as a function of iteration. As with the Routes Only scenario, all agents start out using the middle route, while the representative routes number 5 and 9 start with no agents. Also as in the Routes Only scenario, the percentage of agents using the middle route decreases rapidly while the percentage of agents using the alternate routes increases. However, since 20% of the agents are given the chance to change their routes each iteration (agents who perform time replanning also re-route), the exchange of agents from the middle route to the other routes occurs more rapidly.

This figure shows higher oscillations in route usage compared to the Routes Only scenario. In that scenario, route equilibration is the only option for agents trying to avoid congestion. Agents using some route tend to "notice" other agents using the same route, in the sense that their trip was made longer by the presence of the other agents. In this scenario, however, agents can also avoid congestion by choosing different departure times. So, agents using the same route may do so at totally different times, and may not notice each other at all, since they did not encounter any congestion from other agents along that route. Thus, they do not have much reason to try to switch routes, causing less of an equalization among the route choices. Another way to put it is that the temporal spreading allows the routes to remain equivalent to each other, even if the number of agents using each route differs greatly.

The general interpretation of the results for the Routes and Times scenario is that both modules work together well to allow the agents to explore both spatial and temporal degrees of freedom to obtain better plans than possible with just one degree of freedom.

Varying $\beta$

Here we vary the $\beta$ plan selection parameter to higher and lower values from the baseline value of 2/, to see how selecting the best plan more or less often affects the score and travel time relaxation rates. We tried these values for $\beta$: 0.001/, 0.01/, 0.1/, 1/, 2/, 4/, 10/, and $\infty$. A value of $\infty$ means agents always choose the plan with the best score.
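To illustrate how such a parameter trades off random against best-plan selection, the sketch below assumes a logit form in which a plan with score $S_i$ is chosen with probability proportional to $\exp(\beta S_i)$. This functional form is an assumption made here for illustration; the section itself only states that larger $\beta$ selects the best plan more often and that an infinite $\beta$ always selects it. The scores in the example are hypothetical.

```python
import math


def selection_probabilities(scores: list[float], beta: float) -> list[float]:
    """Plan selection probabilities under an assumed logit model.

    p_i is proportional to exp(beta * score_i). An infinite beta puts all
    probability on the best-scoring plan(s).
    """
    best = max(scores)
    if math.isinf(beta):
        winners = [s == best for s in scores]
        n_winners = sum(winners)
        return [1.0 / n_winners if w else 0.0 for w in winners]
    # Subtract the best score before exponentiating for numerical stability.
    weights = [math.exp(beta * (s - best)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    plan_scores = [100.0, 103.0, 101.0]  # hypothetical scores of three stored plans
    for beta in (0.001, 2.0, math.inf):
        print(beta, [round(p, 3) for p in selection_probabilities(plan_scores, beta)])
```

Under this assumed form, a very small $\beta$ gives almost uniform selection among the stored plans, while the baseline value already concentrates most of the probability on the best plan when scores differ by a few units.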

Figures 11(a) and 11(b) show the effect of $\beta$ on the Routes Only scenario. The relaxed score and travel time averages are the same; only the rate of approach to those values differs. With a lower value of $\beta$ , agents are allowed more random selection among their plans, so the system approaches the steady-state at a slower rate, which makes sense. The infinite $\beta$ , which causes agents to always choose the best plan they have, allows for the fastest relaxation of both scores and travel times.

**Figure 11:** Relaxation of scores and travel times for the three scenarios, comparing the relaxation behavior with varying $\beta$ values. These plots display the average values of the score (left) and travel time (right) collected over the entire population of agents during each iteration.
(a) Average Scores for Routes Only; (b) Average Travel Times for Routes Only; (c) Average Scores for Times Only; (d) Average Travel Times for Times Only; (e) Average Scores for Routes and Times; (f) Average Travel Times for Routes and Times

Figures 11(c) and 11(d) show the effect of $\beta$ on the Times Only scenario. One can see that the oscillations have a higher amplitude and lower frequency for lower values of $\beta$. For the scores, all the curves seem to have roughly the same worst score (lower bound) of about 100.5. However, with lower $\beta$ the system is able to find better (higher) scores, though it cannot stay at those values. Similarly, on the travel time plots, one can see that the worst possible travel time (upper bound) doesn't change much, but better (lower) travel times are reached with lower values of $\beta$. A careful look at both figures shows that the best travel time is out of phase with the best score. For example, in one of the curves, the best travel time appears around iteration 350, while the best score appears around iteration 375. A possible explanation, which we have not yet verified, is that some agents find a better time choice that gives them an advantage over the other agents, which initially drives the average travel time down; other agents then find the same choice, which makes it worse for the original agents.

Figures 11(e) and 11(f) show the effect of $\beta$ on the Routes and Times scenario. As with the Routes Only scenario, the relaxed value of the score seems to remain essentially the same, but the different $\beta$ values approach it differently.

Overall, it appears that the value of $\beta$ does not matter very much for the scenarios with the routing module enabled. Perhaps this is because the routing module makes decisions with some "intelligence" behind them, providing an additional learning mechanism for the agents. The Times Only scenario is possibly affected more by the value of $\beta$ because the decisions made by the agent database are the only ones that have any effect on the learning behavior.

Varying $\beta _{travel}$

Here we vary the marginal utility of travel time, $\beta _{travel}$, from its baseline value of -6/h, to see how making travel time more or less important in the score calculation affects the score and travel time relaxation rates. We tried these values for $\beta _{travel}$: -0.06/h, -0.6/h, -6/h, -60/h, and -600/h.

Figures 12(a) and 12(b) show the effect of $\beta _{travel}$ on the Routes Only scenario. As expected, higher magnitudes of $\beta _{travel}$ cause the average score to relax to a lower value, since, all else being equal, the same travel time costs the agent more. For all values smaller in magnitude than -600/h, the curves seem to have the same relaxation behavior as seen in Fig. 6. For $\beta _{travel}=$ -600/h, the score takes about 200 more iterations to relax, while the travel time takes only 10-20 more iterations to relax.
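The effect of $\beta _{travel}$ on the relaxed score can be made concrete by evaluating the travel term alone. The sketch below assumes that travel contributes linearly to the score, as $\beta _{travel}$ times the travel time; this is the usual reading of a marginal utility but is an assumption here, and the function name is illustrative. The 61 min travel time is the relaxed Routes Only value from the baseline case.

```python
def travel_disutility(beta_travel_per_h: float, travel_time_min: float) -> float:
    """Score contribution of travel, assuming a linear term beta_travel * time."""
    return beta_travel_per_h * travel_time_min / 60.0


if __name__ == "__main__":
    relaxed_travel_time_min = 61.0  # Routes Only baseline, from the text
    for beta_travel in (-0.06, -0.6, -6.0, -60.0, -600.0):
        print(beta_travel, round(travel_disutility(beta_travel, relaxed_travel_time_min), 2))
```

Under this assumption the same 61 min of travel costs about 6 score units at -6/h but about 610 at -600/h, which gives a sense of why the curve for the largest magnitude relaxes to a much lower score.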

**Figure 12:** Relaxation of scores and travel times for the different scenarios, comparing the relaxation behavior with varying $\beta _{travel}$ values. These plots display the average values of the score (left) and travel time (right) collected over the entire population of agents during each iteration. For better comparison of the relaxation rate, we shift the average score curve for $\beta _{travel}=$ -600/h up by 500.
(a) Average Scores for Routes Only; (b) Average Travel Times for Routes Only; (c) Average Scores for Times Only; (d) Average Travel Times for Times Only; (e) Average Scores for Routes and Times; (f) Average Travel Times for Routes and Times

Figures 12(c) and 12(d) show the effect of $\beta _{travel}$ on the Times Only scenario. Here we see that different values of $\beta _{travel}$ can also change the oscillation amplitude and frequency for the scores and the travel times. For $\beta _{travel}=$ -60/h, the oscillation frequency is higher and the amplitude is smaller, and for $\beta _{travel}=$ -600/h, the oscillation is nearly nonexistent. For the two smallest magnitudes of the marginal utility of travel, the score and travel time curves look qualitatively like those of the Routes and Times scenario in Fig. 6. They improve at first, then slightly deviate from the best value obtained. This supports the idea that the behavior of the Routes and Times scenario comes from the fact that travel time is less important to the agents when they are able to adjust their routes and their activity schedules simultaneously. In addition, as with the Routes Only scenario, Times Only takes longer to relax when the magnitude of $\beta _{travel}$ is larger.

Figures 12(e) and 12(f) show the effect of $\beta _{travel}$ on the Routes and Times scenario. Here we once again get basically the same relaxation behavior, offset only by the different weights of the travel time in the overall score. The scores for $\beta _{travel}=$ -600/h take longer to relax, but all scores for the Routes and Times scenario relax to higher values than those of the other two scenarios. Furthermore, one can see that for the lower magnitudes of the marginal utility of travel, the travel times stay flat at close to the free-speed travel time. The deviation from this level is larger for higher magnitudes of $\beta _{travel}$, with $\beta _{travel}=$ -6/h being the only one that deviates and appears to return to the lower value.

Overall, it appears that reasonable values of $\beta _{travel}$ , as compared to the other marginal utility parameters, lead to the same relaxation behavior (if not the same scores) in the scenarios with the routing module enabled. The Times Only scenario's relaxation behavior is more affected by the value of $\beta _{travel}$ , and it appears that a value of -6/h may represent an unstable case at the border between two more stable regimes: one where travel time dominates the decision making, and one where it is not very important at all.