Optimal strategies for repeated games

We extend the optimal strategy results of Kelly and Breiman from discrete random variables to arbitrary random variables with expectations. Let Fn be the fortune obtained at the nth time period by using any given strategy and let Fn* be the fortune obtained by using the Kelly–Breiman strategy. We show (Theorem 1(i)) that Fn/Fn* is a supermartingale with E(Fn/Fn*) ≤ 1 and, consequently, E(lim Fn/Fn*) ≤ 1. This establishes one sense in which the Kelly–Breiman strategy is optimal. However, this criterion for 'optimality' is blunted by our result (Theorem 1(ii)) that E(Fn/Fn*) = 1 for many strategies differing from the Kelly–Breiman strategy. This ambiguity is resolved, to some extent, by our result (Theorem 2) that Fn*/Fn is a submartingale with E(Fn*/Fn) ≥ 1 and E(lim Fn*/Fn) ≥ 1; and E(Fn*/Fn) = 1 if and only if at each time period j, 1 ≤ j ≤ n, the strategies leading to Fn and Fn* are 'the same'.


Introduction
Suppose a gambler is given the opportunity to bet a fixed fraction γ of his (infinitely divisible) capital on successive flips of a biased coin: on each flip, with probability p > 1/2 he wins an amount equal to his bet and with probability q = 1 − p he loses his bet. What is a good choice for γ and why is it good?
This question is subtle because the obvious answer has an obvious flaw. The obvious answer is for the gambler to choose γ = 1 to maximize the expected value of his fortune. The obvious flaw is that he is then broke in n or fewer trials with probability 1 − p^n, which tends to 1 as n tends to ∞. A germinal answer was given by Kelly [10]: a gambler should choose γ = p − q so as to maximize the expected value of the log of his fortune. He shows that a gambler who chooses γ = p − q will 'with probability 1 eventually get ahead and stay ahead of one using any other value of γ' ([10], p. 920).
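Kelly's prescription is easy to check numerically. The following sketch (ours, not the paper's; the value p = 0.6 is an arbitrary choice) grid-searches the expected log-growth per trial for the biased coin and confirms that the maximizing fraction is p − q:

```python
import math

# Biased coin: win the stake with probability p, lose it with probability q = 1 - p.
p, q = 0.6, 0.4

def growth_rate(g):
    """Expected log-growth per trial when betting the fixed fraction g."""
    return p * math.log(1 + g) + q * math.log(1 - g)

# Grid search over betting fractions; the maximizer should be p - q = 0.2.
fractions = [i / 10000 for i in range(0, 9999)]
best = max(fractions, key=growth_rate)

print(best)          # close to p - q = 0.2
print(1 - p ** 10)   # probability of ruin within 10 trials when betting everything
```

Setting the derivative p/(1 + γ) − q/(1 − γ) to zero gives γ = p − q directly, so the grid search is only a sanity check.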
In an important paper, Breiman [4] generalizes in two directions: he considers strategies other than fixed-fraction strategies, and he generalizes the random variable as follows. Let X be a random variable taking values in {1, 2, …, s} ≡ I, let 𝒞 = {A1, A2, …, Ar} be a class of subsets of I whose union is I, and let o1, o2, …, or be positive numbers (odds). If for one round of betting a gambler bets fractional amounts β1, β2, …, βr of his capital on the events {X ∈ A1}, …, {X ∈ Ar}, then when X = i he gets a payoff of Σ βj oj summed over those j with i in Aj. In this setting Breiman discusses several 'optimal' properties of the fixed-fraction strategy which chooses β1, β2, …, βr so as to maximize the expected value of the log of the fortune and then bets these fractions on each trial, leading to the fortune Fn* at the conclusion of the nth trial. He shows that if Fn is a fortune resulting from the use of any strategy, then lim Fn/Fn* almost surely exists and E(lim Fn/Fn*) ≤ 1. In what follows we shall be concerned solely with magnitude results, like this asymptotic magnitude result of Breiman's, but the reader should be aware that under additional hypotheses Breiman also shows that T(x), the time required to have a fortune exceeding x, has an expectation which is asymptotically minimized by the above fixed-fraction strategy. The problem of how to apportion capital between various random variables is exactly the problem of portfolio selection, and so it is correct to suppose that these results on optimal allocation of capital are of considerable interest to economists, as Kelly recognized ([10], p. 926). He also prophetically realized that economists, familiar with logarithmic utility, could easily misunderstand his result and think, incorrectly, that the choice of maximizing the expected value of the log of the fortune depended upon using logarithmic utility for money. For discussion see [15], p. 216 and [17].
An interesting concise discussion of the 'capital growth model of Kelly [10], Breiman [4], and Latane [11]' from an economic point of view can be found in [3].
A brief discussion of Kelly's proof will motivate his criterion and allow us to make an important conceptual distinction between his results and Breiman's. Suppose a gambler bets the fixed fraction γ of his capital at each toss of the p-coin. Kelly considers the exponential growth rate G = lim (1/n) log(Fn/F0). If our gambler has W wins and L losses in the first n trials, Fn = F0 (1 + γ)^W (1 − γ)^L, so G = p log(1 + γ) + q log(1 − γ) by the law of large numbers. The growth rate G is maximized by γ = p − q, and if he uses another γ his G will be less and therefore eventually so will his fortune. A complication enters when we consider, as Kelly did not, strategies which are not fixed-fraction strategies. In that case we can have different strategies with the same G, e.g., use γ = 1 for the first 1000 trials and then use γ = p − q. This complication is intrinsic in the use of G and it has consequences which are quite serious for any application. For example, two strategies which at trial n give fortunes, respectively, of 1 and exp(√n) both have G = 0! It is obviously unsatisfactory to regard these two strategies with the same G as 'the same', but it is done because using G makes it easy to extend the Kelly results to more general situations which involve more general random variables; using G the argument is a simple one employing either the law of large numbers or techniques which 'rely heavily on those used to generalize the law of large numbers' ([15], p. 218). Breiman understood the problems created by using G and so he considered Fn/Fn*, not (Fn/Fn*)^(1/n). This is mathematically more difficult, but the results are more useful.
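The bluntness of G as a criterion can be made concrete in a small sketch (our illustration, not the paper's): a fortune frozen at 1 and a fortune of exp(√n) are enormously different after n trials, yet both have exponential growth rate tending to 0.

```python
import math

# Exponential growth rate G = (1/n) * log(F_n).  To avoid overflowing
# exp(sqrt(n)) for large n, work with log-fortunes directly.
def growth_rate(log_fortune, n):
    return log_fortune / n

for n in [100, 10_000, 1_000_000]:
    g_constant = growth_rate(math.log(1.0), n)   # fortune stays at 1
    g_sqrt = growth_rate(math.sqrt(n), n)        # fortune exp(sqrt(n)); rate 1/sqrt(n)
    print(n, g_constant, g_sqrt)
```

Both rates converge to 0, so the G criterion calls the two strategies 'the same' even though one fortune is unboundedly larger than the other.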

Definitions and lemmas
We shall consider situations with the property that at each time period a gambler can lose no more than the amount he invests, e.g., buying stock or betting on Las Vegas table games. Since there is a real limit to a gambler's liability, based on his total fortune, a broad interpretation of the phrase 'the amount he invests' will allow the inclusion of such situations as selling stock short or entering commodity futures contracts.
We suppose that there are a finite number of situations 1, 2, …, N on which a gambler can bet various fractions of his (infinitely divisible) capital. The random variables X1, X2, …, XN represent, respectively, the outcome of a unit bet on situations 1, 2, …, N. Because the loss can be no more than the investment, Xk ≥ −1 for 1 ≤ k ≤ N. (Breiman considers the amount returned to the gambler after he has given up his bet in order to play, a real example of this sequence of events being betting on the horses. Here the amount the gambler gets back is ≥ 0, which corresponds to the amount he wins being ≥ −1.) We further suppose, with no loss of applicability, that in all of what follows each Xk has an expectation, i.e., that E(|Xk|) is finite. These will be the only restrictions on the random variables, and so we are considering a substantially larger class than the discrete random variables Breiman considers.
We also suppose that the gambler can repeatedly reinvest and change the proportion of the capital bet on the situations. The outcome at time j corresponds to the random variables X1(j), X2(j), …, XN(j). For each k, 1 ≤ k ≤ N, the results of repeated betting of one unit on the kth situation is a sequence Xk(1), Xk(2), …, Xk(m), … of independent random variables, each having the same distribution as Xk. In contrast to this independence, it is quite important for applications that X1, X2, …, XN be allowed to be dependent.

MARK FINKELSTEIN AND ROBERT WHITLEY
A strategy for the game will be a sequence γ(1), …, γ(m), … of vectors, γ(m) = (γ1(m), γ2(m), …, γN(m)), giving the fractional amount γk(m) of the capital which at the mth bet is bet on the kth situation. Thus γk(m) ≥ 0, 1 ≤ k ≤ N, and Σ_{k=1}^{N} γk(m) ≤ 1. We allow the possibility that γ(m) can depend, as a Borel-measurable function, on the past outcomes X1(1), …, XN(1), X1(2), …, XN(2), …, X1(m−1), …, XN(m−1). (Breiman includes the sure-thing bet X0 ≡ 1, so that betting γ0 on X0 is the same thing as putting γ0 aside; in this way his γ's always sum to 1. We shall not do this.) Letting Fm be the fortune which is the result of m bets using γ(1), γ(2), …, γ(m), and F0 be the initial fortune,

(1)   Fm = F0 ∏_{j=1}^{m} [ 1 + Σ_{k=1}^{N} γk(j) Xk(j) ].
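Formula (1) is straightforward to implement. The following sketch (our illustration, with hypothetical inputs) computes the fortune produced by an arbitrary strategy sequence and outcome sequence:

```python
def fortune(F0, strategies, outcomes):
    """Formula (1): F_m = F_0 * prod_j (1 + sum_k gamma_k(j) * X_k(j)).

    strategies[j][k] is the fraction bet on situation k at time j+1;
    outcomes[j][k] is the result X_k(j+1) of a unit bet (always >= -1).
    """
    F = F0
    for gammas, xs in zip(strategies, outcomes):
        # A legal strategy vector is non-negative and bets at most everything.
        assert all(g >= 0 for g in gammas) and sum(gammas) <= 1
        F *= 1 + sum(g * x for g, x in zip(gammas, xs))
    return F

# Fixed-fraction betting of 0.2 on a single +/-1 coin, over one sample run:
outcomes = [[1], [1], [-1], [1]]           # win, win, lose, win
strategies = [[0.2]] * len(outcomes)
print(fortune(1.0, strategies, outcomes))  # 1.2 * 1.2 * 0.8 * 1.2 = 1.3824
```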

Proof.
(i) The function f(x) = log (1 + x) is strictly concave on (−1, ∞), and so for x = (x1, …, xN) a value of X, α and β in D, and 0 < λ < 1,

(5)   log(1 + (λα + (1 − λ)β) · x) ≥ λ log(1 + α · x) + (1 − λ) log(1 + β · x),

with equality only if α · x = β · x, an inequality which also holds if either α · x or β · x is −1. Integrating (5) with respect to the probability measure P of the space on which X is defined,

(6)   φ(λα + (1 − λ)β) ≥ λφ(α) + (1 − λ)φ(β).

If both φ(α) and φ(β) are maximal, then equality holds in (6), and so, from (5), α · X = β · X except possibly where both sides of (5) are −∞, i.e., only at values of X where α · X = β · X = −1. In any case, α · X = β · X almost surely.

(iii) Let x = (x1, …, xN) be a value of X and consider the function g(γi) = log(1 + Σ_{k≠i} γk xk + γi xi). By the mean value theorem, the difference quotient (g(γi) − g(0))/γi is dominated by the L1 function |Xi|/ε for γi small. Result (iii) follows from the Lebesgue dominated convergence theorem.
Here is a simple example which conceptually illustrates a practical use of the Kelly–Breiman criterion: maximize E(log Fm).

Example 1. Define two random variables X1 and X2 by flipping a fair coin: if heads, then X1 = 100 and X2 = −10; if tails, then X1 = −1 and X2 = 1. The payoff from X1 is far superior to the payoff from X2, but because X1 and X2 are (completely) correlated and have payoffs with opposite signs, the criterion will mix both in order to smooth out the rate of capital growth. A simple calculation shows that φ(γ) = E(log(1 + γ1X1 + γ2X2)) is maximized over D on the face γ1 + γ2 = 1 at γ1* ≈ 0.54 and γ2* ≈ 0.46. (Lemma 3 will discuss the basic problem created by maxima occurring at non-interior points of D.) The extent to which the criterion will sacrifice expectation is surprising: betting everything on X1 gives expectation E(X1) = 49.5 per unit of capital, while the optimal mix gives only about 24.7. A fascinating example of the use of this criterion, in which the underlying idea is the same as this example, is in hedging a warrant against its stock as described in [15], pp. 220–222.

Example 2. For λ > 0, let X have density λe^{−λ(x+1)} for x ≥ −1, and 0 for x < −1; an exponential shifted to allow losses. We shall show that there is a unique γ*, 0 ≤ γ* < 1, which maximizes φ(γ) = E(log(1 + γX)), 0 ≤ γ ≤ 1; γ* = 0 iff λ ≥ 1.
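Example 1 can be checked numerically (a sketch of ours, not the authors'): on the face γ1 + γ2 = 1 the objective reduces to a function of γ1 alone, φ(γ1) = ½ log(110γ1 − 9) + ½ log(2 − 2γ1), and a grid search reproduces γ1* ≈ 0.54 (the exact maximizer, by calculus, is 119/220):

```python
import math

# Example 1 restricted to the face gamma_1 + gamma_2 = 1:
# heads (prob 1/2): payoff 1 + 100*g - 10*(1-g) = 110*g - 9
# tails (prob 1/2): payoff 1 - g + (1-g)        = 2 - 2*g
def phi(g):
    return 0.5 * math.log(110 * g - 9) + 0.5 * math.log(2 - 2 * g)

# phi is finite only for 9/110 < g < 1; search a fine grid inside that interval.
lo = 9 / 110
grid = [lo + (1 - lo) * i / 200000 for i in range(1, 200000)]
g_star = max(grid, key=phi)
print(g_star)   # about 0.5409, i.e. 119/220
```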
For future reference we note that it is not obvious that φ is continuous at 1; part of the computation involves an integration by parts and a change of variable. In Example 2, γ* = 0 iff E(X) ≤ 0, i.e., a gambler bets on X only if it has positive expectation. This is a special case of a more general result. Breiman [4], p. 65, calls a game favorable if there is a strategy such that the associated fortune Fn tends almost surely to ∞ with n, and he shows that this condition is equivalent to φ(γ*) being positive ([4], Proposition 3, p. 68). Lemma 2 establishes the equivalence with the intuitive Condition (iv).
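The equivalence in Example 2 between γ* = 0 and E(X) ≤ 0 can be seen from a short calculus check (sketched here for the reader; it is implicit in the text). Since φ is concave on [0, 1] with

```latex
\varphi'(\gamma) \;=\; E\!\left(\frac{X}{1+\gamma X}\right),
\qquad
\varphi'(0) \;=\; E(X),
```

we have γ* > 0 iff φ′(0) > 0 iff E(X) > 0. For the shifted exponential of Example 2, substituting u = x + 1,

```latex
E(X) \;=\; \int_{-1}^{\infty} x\,\lambda e^{-\lambda(x+1)}\,dx
      \;=\; \int_{0}^{\infty} (u-1)\,\lambda e^{-\lambda u}\,du
      \;=\; \frac{1}{\lambda} - 1,
```

which is ≤ 0 exactly when λ ≥ 1, matching the claim that γ* = 0 iff λ ≥ 1.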
The surprising fact is that for one variable X, a gambler does not bet all his fortune no matter what b is, but he does bet all his fortune on two independent copies of X, for b large enough.
The reader who has carried out the calculations of Examples 2 and 3 knows that, because of the possible singularity on Σγk = 1, it is not clear that φ attains a maximum, and the differentiability of φ on the boundary Σγk = 1 is even less clear.

Think of a continuous strictly increasing concave function f on [0, 1] and redefine it at 1 so that its value there is less than f(0). If this redefined function were E(log(1 + γX)), then X would be a most interesting game with no Kelly–Breiman optimal strategy: with unit fortune, if a gambler bet an amount less than 1 he could always do better by betting slightly more, but betting all would be worst.
One result of Lemma 3 is that there is an optimal γ*, so no game can have the property discussed in the paragraph above. Another result of Lemma 3 is that φ is continuous, when finite. This is important because when we compute γ*, a numerical calculation which will generally give γ* to a certain number of decimals, we want to know that using this approximation to the exact γ* will give close to optimal performance.
The other result is a substitute for differentiation when γ* has Σγk* = 1, which allows us to derive the basic inequalities (9) and (10). Note that if all the random variables X1, X2, …, XN are discrete, with a finite number of values, as they are in [4], then we can differentiate φ at γ*: for then if γ* · X equals −1 it does so with positive probability and φ(γ*) = −∞ < φ(0) = 0, contrary to φ(γ*) being a maximum; thus φ is actually defined on a neighborhood of γ* (which may extend outside D) and is differentiable as in Lemma 1. The problems which Lemma 3 resolves are those which arise from more general random variables.

Lemma 3.
(i) The function φ is continuous where finite, and attains a maximum at a point γ* in D.
First, the advantage in using γ*, as indicated by the fact that E(lim Fn/Fn*) ≤ 1, does not require passage to the limit. This was noted by Durham, for the case of two branching processes, in the proof of Theorem 1 of [6], p. 571. In fact, given that the limiting result is true and given a finite strategy γ(1), …, γ(m), extend it by setting γ(j) = γ* for j > m; then Fn/Fn* = Fm/Fm* for n ≥ m and E(Fm/Fm*) ≤ 1. This should reassure the careful investor who wonders whether a strategy good in the long run may not be inferior in any practical number of trials; disregard of this point leads to an overvaluation of games of the type which produces the St. Petersburg paradox.
Second, by our analysis of φ, we are able to show in Theorem 1 that the presence of the expectation in E(lim Fn/Fn*) ≤ 1 raises serious problems in any superficial attempt to use this as an indication of the superiority of γ*.

(ii) Suppose that γ bets only on those Xk with γk* > 0, i.e., that γl(j) = 0 if l ∉ K, 1 ≤ l ≤ N, and all j. If Σγk* = 1, then further suppose that Σγk(j) = 1 for all j. Then Fn/Fn* is a martingale with E(Fn/Fn*) = 1.
By Lemma 3, E(Xk/(1 + γ* · X)) equals a constant c for k in K and is ≤ c for all k; this constant c = 0 if Σγk* < 1. It follows that Fn/Fn* is a positive supermartingale, with E(Fn/Fn*) ≤ E(Fn−1/Fn−1*) ≤ … ≤ E(F0/F0) = 1, and so by the supermartingale convergence theorem it converges almost surely to a finite limit. Using Fatou's lemma, E(lim Fn/Fn*) ≤ 1. An examination of (12) shows that under the conditions of (ii), E(Fm+1/Fm+1* | 𝒞m) = Fm/Fm*, and (ii) follows. The requirement in Theorem 1(ii) that if Σγk* = 1 then we must have Σγk(j) = 1, for all j, in order to be sure to get a martingale, is made clear by a one-variable example. Let X ≡ 2. Then φ(γ) = log(1 + 2γ) is maximal at γ* = 1. The gambler will do worse betting any amount less than 1, even though he still bets on the same random variable as γ* does, the key observation being φ′(1) > 0. In general, the 'partial derivatives' E(Xk/(1 + γ* · X)), k in K, may be positive if Σγk* = 1, whereas they are all 0 if Σγk* < 1.
The surprising result of Theorem 1 is the broad conditions in (ii) under which E(Fn/Fn*) = 1. To see what the surprise is, we shall superficially interpret Theorem 1(i): since 'on the average', and 'for large n', Fn/Fn* ≤ 1, the gambler 'does better' with Fn* than with Fn. But then Theorem 1(ii) tells us that if the gambler simply bets on the same variables as γ* does, but in any proportions at all, and if γ* bets all so does he, then E(Fn/Fn*) = 1. So with the same intuitive interpretation as above, 'on the average' the gambler does the same with Fn as with Fn*, so it really does not matter which strategy he uses! But we know that it does matter. For example, in a repeated biased-coin toss, if he plays a fixed-fraction strategy betting an amount γ ≠ γ* = p − q, then almost surely Fn/Fn* → 0. Yet we have E(Fn/Fn*) = 1. In general it will not help to look at lim Fn/Fn*.
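This coexistence of E(Fn/Fn*) = 1 with Fn/Fn* → 0 can be verified exactly for small n by enumerating all coin sequences (a sketch of ours; p = 0.6, γ = 0.7, and n = 10 are arbitrary choices):

```python
import math
from itertools import product

# Fixed-fraction betting of g (not the Kelly fraction g_star = p - q) on a
# biased coin.  Enumerating all 2^n flip sequences shows E(F_n/F_n*) = 1
# exactly, even though the per-flip expected log of the ratio is negative,
# so the ratio itself tends to 0 almost surely.
p, q = 0.6, 0.4
g_star = p - q          # Kelly-Breiman fraction
g = 0.7                 # some other fixed fraction

n = 10
expectation = 0.0
for flips in product([1, -1], repeat=n):      # +1 = win, -1 = loss
    prob = math.prod(p if x == 1 else q for x in flips)
    ratio = math.prod((1 + g * x) / (1 + g_star * x) for x in flips)
    expectation += prob * ratio

print(expectation)   # 1.0 up to rounding
```

The single-flip computation explains why: E((1 + gX)/(1 + g*X)) = p(1 + g)/(2p) + q(1 − g)/(2q) = 1 for every g, since 1 + g* = 2p and 1 − g* = 2q.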
For example, if on the first flip of the coin he bets all his fortune, and from then on he bets p − q, then Fn/Fn* > 1 with probability p. Theorem 2 will help our understanding of this situation by showing that Fn* is the only denominator with E(Fn/Fn*) ≤ 1 for all Fn; in fact E(Fn*/Fn) > 1 if Fn does not come from a strategy equivalent to using γ* repeatedly. The suspicious reader will note that this characterization of the sense in which γ* is optimal contains an expectation. Anyone attempting to state intuitively the result of Theorem 2(i) in the form 'Fn* is better than Fn because, on the average, Fn*/Fn ≥ 1', should also be willing to apply the same interpretation to Theorem 1(ii) and conclude that often, 'Fn/Fn* = 1 on the average and so Fn and Fn* are often the same after all'.