
Expected payout and optimal strategy for simple game


Larry Penn

Mar 16, 1999
You are given a randomly shuffled deck of 52 cards; as usual, 26 are red and
26 are black. You have the opportunity to play the following game in which
you will be successively displaying and then discarding the next (top) card
in the deck:

Whenever a black card is displayed, you win one dollar. Whenever a red card
is displayed, you lose one dollar. At any time you can choose not to
display the next card, in which case the game ends and your winnings and
losings are final. You are allowed to keep track of what cards have already
been displayed.

For example, if you are feeling lucky, and you keep turning over the first
26 cards, each of which happens to be black, then you will choose to stop at
that point since you will know that only red cards (losers) are left; in
this case you will win the maximum possible 26 dollars.
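To make the mechanics concrete, here is a minimal simulation sketch (the function and parameter names are mine, not part of the original post):

```python
import random

def play(reds=26, blacks=26, keep_playing=None, rng=None):
    """Play one game.  keep_playing(reds_left, blacks_left, winnings)
    returns True to turn over the next card; None means 'never stop'."""
    rng = rng or random.Random()
    deck = ["R"] * reds + ["B"] * blacks
    rng.shuffle(deck)
    winnings = 0
    while deck:
        if keep_playing and not keep_playing(deck.count("R"), deck.count("B"), winnings):
            break
        # Display and discard the top card: black wins $1, red loses $1.
        winnings += 1 if deck.pop() == "B" else -1
    return winnings
```

Note that turning over all 52 cards always nets exactly zero, so any positive expectation comes entirely from the option to stop early.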

Question 1: What is your expected payout under an optimal strategy (i.e., a
strategy that maximizes your expected winnings)?

Question 2: Generalizing, assume the deck has R red cards and B black cards.
Is there a closed form solution (or asymptotic solution) to the expected
payout of this game? Is there a closed form or asymptotic solution to
determining whether the expected payout is positive as opposed to zero (if
it’s zero then your optimal strategy is to stop right there, i.e. not to
turn over even the first card)?


Charles Whitmer

Mar 31, 1999
Larry Penn wrote in message ...
(John Haugsand and Robert Israel have already covered the answers for finite cases, so I'll consider the case with a large number of cards.)
 
Asymptotic solution
-------------------
 
Short answer when R+B is large:
 
Let A=.3691363808 and x0=.8399236757, then
 
1) If R=B, you expect to win A*sqrt(R+B) dollars with optimal play.
2) Do not play if R-B>=x0*sqrt(R+B).
3) If R-B<x0*sqrt(R+B), then the amount you expect to win is:
 
   F = A*sqrt(R+B)*exp((R-B)^2/(2*(R+B)))*(1+erf((R-B)/sqrt(2*(R+B)))) - (R-B)
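In code, the three cases above can be collected as follows (a sketch; the constant names A and X0 are mine):

```python
from math import erf, exp, sqrt

A = 0.3691363808
X0 = 0.8399236757

def asymptotic_payout(R, B):
    """Large-deck approximation to the expected payout of a deck
    with R red and B black cards under optimal play."""
    if R - B >= X0 * sqrt(R + B):
        return 0.0  # case 2: do not play at all
    z = (R - B) / sqrt(R + B)
    # case 3 (which reduces to case 1 when R = B, since exp(0)*(1+erf(0)) = 1):
    return A * sqrt(R + B) * exp(z * z / 2) * (1 + erf(z / sqrt(2))) - (R - B)
```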
 
 
 
Slightly longer answer:
 
Consider the game to be played with a random variable x(t) following Brownian motion, where t is the time to the end of the game and is decreasing.  We constrain the path of x(t) such that x(0)=0, and we assume that the variance of the Brownian motion is 1 per unit time.  The goal of the game is to stop playing with x as large as possible.
 
It is easy to see that the card game is a finite approximation to this game if we identify x=R-B and t=R+B.
 
So now suppose that at time t we are located at x, and we need to decide whether to continue.  This decision clearly depends on x, but the only other scale we have to decide how big or small x is, is sqrt(t)!  Thus our optimal playing decision depends only on the ratio x/sqrt(t).  For x/sqrt(t) above some limiting value x0, we will stop, otherwise we play.
 
If we call the amount of money we can expect to win with optimal play F(x,t), then the same dimensional argument tells us that:
 
   F(x,t)=sqrt(t)*f(x/sqrt(t))
 
To make this more obvious, consider the following.  Suppose you have carefully computed F(x,1), i.e. for all x values with 1 time unit left to play.  Now I want you to compute F(x,1/2).  You can save yourself some trouble by noting that the Brownian paths are self-similar, i.e. there is a one-to-one mapping of paths on [1/2,0] to paths on [1,0].  Explicitly, choose any x(t) on [1/2,0] and think of sqrt(2)*x(t/2) as a path on [1,0].  Thus, each path you consider in calculating F(x,1/2) you have already considered in calculating F(x,1).  The only difference is the x-scale, which does not change the optimal play of the game.  (Would you play the card game differently if you won and lost 50 cents on each card instead of a dollar?)
 
We can now write down a recursion relation, similar to the one given by Robert Israel, relating the F function at two time steps separated by a small dt.  Given that this relates scaled versions of the same function f, we'll be able to find a differential equation for f.  There are several equivalent ways to do this, and all are a little messy, so I'll just jump to the conclusions that can be drawn:
 
1) f' (the derivative of f) is continuous.
2) f''(x) = f(x) + x*f'(x) + 2*x
 
The general solution to the diff.eq. is:
 
   f(x)=A*exp(x^2/2)+B*exp(x^2/2)*erf(x/sqrt(2))-x
 
For large negative x, we shouldn't be able to win much more than -x; since erf(x/sqrt(2)) tends to -1 there, the exponentially growing terms cancel only when B=A, so we must have B=A>0.  (These A and B are constants of integration, not the card counts.)
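As a quick numerical sanity check (my own, using central differences), this f does satisfy the differential equation above:

```python
from math import erf, exp, sqrt

A = B = 0.3691363808  # any common value works for the ODE check

def f(x):
    # general solution with B = A
    return A * exp(x * x / 2) + B * exp(x * x / 2) * erf(x / sqrt(2)) - x

h = 1e-4
for x in (-1.5, -0.3, 0.0, 0.7, 1.2):
    f1 = (f(x + h) - f(x - h)) / (2 * h)           # f'(x), central difference
    f2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2   # f''(x), central difference
    assert abs(f2 - (f(x) + x * f1 + 2 * x)) < 1e-4
```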
 
Now we need to patch together this part of the solution for x<x0 with the part for x>x0 where f(x)=0.  This means that we need to find a value for A such that f(x0)=0 and f'(x0)=0 for some x0.  This happens to be possible only with:
 
    A=.3691363808...
    x0=.8399236757...
 
There appears to be no simple expression for either one, but it can be said that:
 
    A=sqrt(Pi/2)*(1-x0^2)
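Numerically, x0 can be pinned down by substituting this relation into f(x0)=0 and bisecting (a sketch of my own):

```python
from math import erf, exp, pi, sqrt

def matched_f(x0):
    # f(x0) with A eliminated via A = sqrt(Pi/2)*(1 - x0^2)
    A = sqrt(pi / 2) * (1 - x0 ** 2)
    return A * exp(x0 ** 2 / 2) * (1 + erf(x0 / sqrt(2))) - x0

lo, hi = 0.5, 1.0  # matched_f(0.5) > 0 > matched_f(1.0)
for _ in range(60):
    mid = (lo + hi) / 2
    if matched_f(mid) > 0:
        lo = mid
    else:
        hi = mid
x0 = (lo + hi) / 2
A = sqrt(pi / 2) * (1 - x0 ** 2)
```

This reproduces x0 = 0.8399236757... and A = 0.3691363808... above.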
 
And that's it.
 
I have calculated a few of the finite cases, as per Robert Israel's method, and looked at F(R,B)/sqrt(R+B), which I claim should converge to A.
 
 R      B     F(R,B)  F(R,B)/sqrt(R+B)
 13     13    1.8408    0.3610
 26     26    2.6245    0.3639
 52     52    3.7316    0.3659
104    104    5.2946    0.3671
208    208    7.5033    0.3679

 
This seems to match nicely, as do other values of F(R,B) for R not equal to B.
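For reference, the exact finite values in the table come from the recursion F(R,B) = max(0, (R*(F(R-1,B)-1) + B*(F(R,B-1)+1))/(R+B)) with boundary values F(R,0)=0 and F(0,B)=B; here is a memoized sketch (my rendering of the method credited to Robert Israel):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def F(R, B):
    """Exact expected payout with R red and B black cards left, playing optimally."""
    if B == 0:
        return 0.0       # only losing cards remain: stop now
    if R == 0:
        return float(B)  # only winning cards remain: draw them all
    # Expected value of drawing one more card, then continuing optimally:
    draw = (R * (F(R - 1, B) - 1) + B * (F(R, B - 1) + 1)) / (R + B)
    return max(0.0, draw)  # stopping is always worth exactly 0
```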
 
Enjoy,
Chuck
 
 

 