
Relation to game theory

A Nash Equilibrium (NE) is a state in which no agent can improve its payoff by unilaterally changing its strategy. In the terms of this text, the system is at a NE if no agent can improve its score by unilaterally selecting a different (routing/activity/...) option. Note that an equilibrium in game theory is a static concept; it is therefore not the same as an equilibrium in dynamical systems.
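
To make this definition concrete, consider a minimal sketch (purely illustrative, not taken from this text): $N$ agents each pick one of two routes, and an agent's score is minus the number of agents on its own route. The function is_nash_equilibrium simply tests every unilateral deviation.

def payoff(agent, choices):
    # Hypothetical two-route congestion game: an agent's payoff is
    # minus the load on its chosen route.
    return -sum(1 for c in choices if c == choices[agent])

def is_nash_equilibrium(choices, options=(0, 1)):
    # NE check: no agent may gain from a unilateral switch.
    for agent in range(len(choices)):
        current = payoff(agent, choices)
        for option in options:
            deviated = list(choices)
            deviated[agent] = option
            if payoff(agent, deviated) > current:
                return False
    return True

print(is_nash_equilibrium([0, 0, 1, 1]))  # True: even split, no switch helps
print(is_nash_equilibrium([0, 0, 0, 1]))  # False: a route-0 agent gains by switching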

For static assignment (Chap. 28), we encountered this as Wardrop's first principle, which was the starting point for the theory of static assignment. We also saw that, under certain conditions, the static assignment solution is unique, meaning that there is exactly one NE.

The construct of a NE says nothing about how a system can reach it. In standard game theory, it is assumed that each agent completely pre-computes its moves and then submits a ``strategy book'' to the referee, who then plays the game for the agents. The NE definition implies that the solution is (marginally) stable if exactly one player deviates from the NE; nothing is said about stability if two players deviate simultaneously.
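
A tiny (again hypothetical) coordination game illustrates the point: two players each pick option A or B and are rewarded only when they match. At the NE (A, A), a lone deviator is worse off, but if both players deviate at the same time, the system lands on the other equilibrium (B, B) and does not return.

def reward(a, b):
    # Two-player coordination game: payoff 1 iff the players match.
    return 1 if a == b else 0

print(reward("A", "A"))  # 1: (A, A) is a NE ...
print(reward("B", "A"))  # 0: ... since a single deviator loses
print(reward("B", "B"))  # 1: two simultaneous deviators reach a different NE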

Sometimes, a NE is a fixed point of a certain type of deterministic learning dynamics. A typical example is best reply, where each player plays what would have been optimal in the last period. If an agent has several equally good options, it chooses the same one as in the last period (if applicable). Under best reply, a NE, once reached, is repeated forever. Again, this says nothing about stability, since fixed points can be attractive ($=$ stable), neutral, or repulsive ($=$ unstable).
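
For illustration, here is a best-reply sketch in the hypothetical two-route game from above: in each period, every agent picks the route that would have been best given the others' last-period choices, and keeps its previous choice on ties.

def best_reply_step(choices):
    # One synchronous best-reply step (illustrative).
    new = []
    for agent, prev in enumerate(choices):
        loads = [0, 0]
        for j, c in enumerate(choices):
            if j != agent:
                loads[c] += 1  # congestion caused by the other agents
        if loads[0] != loads[1]:
            new.append(0 if loads[0] < loads[1] else 1)
        else:
            new.append(prev)   # tie: repeat last period's choice
    return new

state = [0, 0, 1, 1]                    # the NE from above
print(best_reply_step(state) == state)  # True: the NE is a fixed point

One can check that synchronous best reply started from the uneven state [0, 0, 0, 1] sends all agents to route 1 and then all back to route 0, i.e. the dynamics oscillates; this again shows that the fixed-point property alone guarantees no convergence.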

There are subtleties involved in the translation from game theory to dynamical systems. Most importantly, one has to assume that in the dynamical-system interpretation the agents do not actively optimize any quantity beyond what the dynamics prescribes. Rather, their behavior is completely given by the dynamic description, and these dynamics sometimes happen to have the NE as a fixed point. The situation is different, for example, if an agent attempts to optimize its average reward over all iterations.

When moving from deterministic to stochastic simulations, the usual changes are necessary. In particular, the NE has to be suitably redefined, for example by demanding that no agent should be able to improve its expected reward. Although this sounds feasible in theory, it is difficult in practice, since we do not know how to compute the expected reward via simulation. An approximation would be to simulate the transition from iteration $n$ to $n+1$ with many different random seeds and to average over all occurring rewards; however, this is neither computationally efficient nor plausible from the point of view of reality.
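
The following sketch shows the seed-averaging idea; all names and the placeholder dynamics are assumptions made for illustration. The drawback mentioned above is directly visible: one estimate costs n_seeds full transitions.

import random

def simulate_transition(state, rng):
    # Stand-in for one stochastic simulation step from iteration n to n+1;
    # returns the reward obtained in that transition (placeholder dynamics).
    return state + rng.gauss(0.0, 1.0)

def expected_reward(state, n_seeds=1000):
    # Estimate the expected reward by averaging over many random seeds.
    total = 0.0
    for seed in range(n_seeds):
        rng = random.Random(seed)  # one independent seed per run
        total += simulate_transition(state, rng)
    return total / n_seeds

print(expected_reward(state=5.0))  # approximately 5.0 for this placeholder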

In conclusion, we are left with a system that has some relation to game theory but is not exactly the same. It is possible to change our system so that it maps exactly onto game theory, but only by moving it farther away from what we would expect as plausible human behavior.

