Games with vector payoffs: a dynamic programming approach
 Kamble, Vijay Sukumar
 Advisor(s): Walrand, Jean
Abstract
In several decision-making scenarios in adversarial environments, a decision-maker cares about multiple objectives at the same time. For example, in certain defense operations, an agent might be interested in simultaneously defending multiple targets from an enemy. In a repeated game against an unknown opponent, a player wants to minimize 'regret', i.e., to choose a strategy that performs well in hindsight relative to each strategy in some given class of strategies. In dynamic games with asymmetric information, where a player lacks some information that other players have, a typical goal is to choose a strategy that gives appropriate worst-case guarantees simultaneously over all possibilities. Many of these scenarios can be modeled as a vector-valued sequential game between the agent and an adversary. This thesis is concerned with characterizing and efficiently computing the optimal worst-case guarantees on the losses that an agent can achieve in such games.
The main contribution of this work is to show that for large classes of sequential games, these optimal guarantees can be characterized as the fixed point of a dynamic programming operator defined on the space of extremal (either maximal or minimal) elements of subsets of some partially ordered topological space. We first present this result in detail for the model of discounted repeated games with vector payoffs, then extend it to stochastic games with multiple states, and finally to reachability games (which model several types of pursuit-evasion games that arise in defense operations). For each of these models, we prove several structural properties of the set of these optimal guarantees and the corresponding optimal strategies. This approach opens up the possibility of using many well-known dynamic-programming-based methods and algorithms for approximating these guarantees and computing approximately optimal strategies. One such method, based on approximate value iteration, is presented for the case of repeated games.
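The fixed-point idea above can be illustrated in its simplest scalar form. The sketch below is a generic value-iteration loop on a small, made-up two-state Markov decision process; it is not the thesis's set-valued operator on extremal elements, only the familiar scalar analogue: the Bellman operator is a contraction under discounting, so iterating it converges to its unique fixed point.

```python
import numpy as np

def value_iteration(P, r, beta, tol=1e-8):
    """Iterate the Bellman operator T(v)[s] = max_a [r(a,s) + beta * P(a) v]
    until (approximately) reaching its unique fixed point.
    T is a beta-contraction in the sup norm, so the iteration converges."""
    n = r.shape[1]
    v = np.zeros(n)
    while True:
        # One application of the dynamic programming operator:
        # P @ v computes, for each action a, the expected next-state value.
        tv = np.max(r + beta * (P @ v), axis=0)
        if np.max(np.abs(tv - v)) < tol * (1 - beta) / (2 * beta):
            return tv
        v = tv

# Hypothetical two-state, two-action example: P[a] is the transition
# matrix under action a, r[a, s] the reward for action a in state s.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
r = np.array([[1.0, 0.0],
              [0.5, 2.0]])
v_star = value_iteration(P, r, beta=0.9)
```

In the thesis's setting the iterates are not vectors of scalars but sets of extremal guarantee vectors, and the operator involves a min-max over the agent's and adversary's strategies; the contraction-to-fixed-point structure that makes this loop converge is the property being generalized.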
This approach yields the first characterization of the minimax-optimal regret and the corresponding optimal strategy for expected regret minimization in repeated games with discounted losses. Further, it yields the first known procedure for efficiently computing an approximately optimal strategy for the uninformed player in Aumann and Maschler's celebrated model of zero-sum discounted repeated games with incomplete information on one side.