\documentclass[12pt]{article}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}
\reversemarginpar
\topmargin -1in
\oddsidemargin .25in \textheight 9.4in \textwidth 6.4in
\renewcommand{\familydefault}{\sfdefault}
\renewcommand{\rmdefault}{cmss}
\begin{document}
\parindent 24pt \parskip 10pt
{\bf
\begin{LARGE}
\begin{center}
{\bf 5 Reputation and Repeated Games with Symmetric Information}
\end{center}
February 17, 2014\\
Eric Rasmusen, Erasmuse@indiana.edu.
Http://www.rasmusen.org.
\newpage
\noindent
{\bf The Chainstore Paradox}
\noindent
Suppose that we repeat { Entry Deterrence I} 20 times in the context of a
chainstore that is trying to deter entry into 20 markets where it has outlets.
First, though, let's look at the Prisoner's Dilemma.
\begin{center} {\bf Prisoner's Dilemma }
\end{center}
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Column}\\
& & & {\it Silence} & & {\it Blame } \\ & & {\it
Silence } & 5,5 & $\rightarrow$ & -5,10 \\ & {\bf Row:}
&&$\downarrow$& & $\downarrow$ \\ & & {\it Blame } & 10,-5
& $\rightarrow$ & {\bf 0,0} \\
\end{tabular}
What if we repeat it twice? $N$ times? An infinite number of times?
\newpage
Because the one-shot { Prisoner's Dilemma}
has a dominant-strategy equilibrium, blaming is the only Nash outcome for the
finitely repeated { Prisoner's Dilemma}, not just the only perfect outcome.
The backwards induction
argument does not prove that blaming is the unique Nash
outcome. Why not? See the next page of slides.
\newpage
Here is why blaming is the only Nash outcome:
1. No strategy in
the class that calls for $Silence$ in the last period can be a Nash strategy,
because the same strategy with $Blame$ replacing $Silence$ would dominate it.
2. If both players have strategies calling for blaming in the last period, then
no strategy that does not call for blaming in the next-to-last period is Nash,
because a player should deviate by replacing $Silence$ with $Blame$ in the
next-to-last period. The same argument then applies to the third-to-last period, and so on back to the first.
Uniqueness is only on the equilibrium
path. Nonperfect Nash strategies could call for cooperation at nodes away
from the equilibrium path.
The strategy of always blaming is not a dominant strategy, not even weakly.
If the one-shot
game has multiple Nash equilibria, the perfect equilibrium of the finitely
repeated game has not only the one-shot outcomes, but others.
Benoit \& Krishna (1985).
\newpage
What if we repeat the { Prisoner's Dilemma} an infinite
number of times?
Defining payoffs in games that last an infinite number of periods
presents the problem that the total payoff is infinite for any positive payment
per period.
\noindent
1 Use an {\bf overtaking criterion}. Payoff stream $\pi$ is preferred to
$\tilde{\pi}$ if there is some time $T^*$ such that for every $T \geq T^*$, $$
\sum_{t=1}^T \delta^t \pi_t > \sum_{t=1}^T \delta^t \tilde{\pi}_t. $$
\noindent 2 Specify that the discount rate is strictly positive, and use the
present value. Since payments in distant periods count for less, the discounted
value is finite unless the payments grow at a rate at least as high as the discount rate.
\noindent
3 Use the average payment per period, a tricky method since some sort
of limit needs to be taken as the number of periods averaged goes to infinity.
\\
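
\noindent As a worked illustration of methods 2 and 3, suppose (purely as an assumed example) that a player receives a constant payment $x$ each period, starting one period from now, with discount rate $r>0$ and $\delta = \frac{1}{1+r}$. The present value is a geometric sum,
$$
\sum_{t=1}^{\infty} \delta^t x = \frac{\delta x}{1-\delta} = \frac{x}{r},
$$
which is finite, while the average payment per period is
$$
\lim_{T \rightarrow \infty} \frac{1}{T} \sum_{t=1}^{T} x = x.
$$
With $x=5$ and $r=1$, for instance, the present value is 5 and the average payment is also 5.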
\newpage
Here is a strategy that yields an equilibrium with SILENCE.
\noindent
{\bf The Grim Strategy}\\
{\it 1 Start by choosing {\it Silence}.\\
2 Continue to choose {\it Silence} unless some player has chosen $Blame$, in
which case choose $Blame$ forever.}
The GRIM STRATEGY is an example of a trigger strategy.
Robert Porter (1983), ``A Study of Cartel Stability: The Joint Executive
Committee, 1880--1886,'' {\it Bell Journal of Economics},
examines price wars between railroads in the 19th century. It is the classic reference.
Slade (1987)
concluded that price wars among gas stations in Vancouver used small punishments
for small deviations rather than big punishments for big deviations.
Now think back to the 20-repeated Entry Deterrence game.
\newpage
Not
every strategy that punishes blaming is perfect. A notable example is the
strategy of Tit-for-Tat.
\noindent
{\bf Tit-for-Tat}\\
{\it 1 Start by choosing {\it Silence}.\\
2 Thereafter, in period $n$ choose the action that the other player chose in
period $(n-1)$.}
Tit-for-Tat is almost never
perfect in the infinitely repeated { Prisoner's Dilemma}
because it is not rational for Column to punish Row's initial $Blame$.
The deviation that kills the potential equilibrium is
not from $Silence$, but from the off-equilibrium
action rule of {\it Blame in response to a Blame}.
Adhering
to Tit-for-Tat's punishments results in a miserable alternation of $Blame$ and
$Silence$, so Column would rather ignore Row's first $Blame$.
Problem 5.5 asks you to show this formally.
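
A sketch of the calculation, under assumptions not stated on the slide: both players use the discount factor $\delta$, payoffs are as in the Prisoner's Dilemma table, and the current period is undiscounted. Suppose Row has just chosen $Blame$ against Column's $Silence$, and Row follows Tit-for-Tat from now on. If Column punishes as Tit-for-Tat prescribes, play alternates and Column receives $10, -5, 10, -5, \ldots$; if Column instead ignores the $Blame$ and chooses $Silence$, both players choose $Silence$ forever:
\begin{align*}
\pi_{Column}(\mbox{punish}) &= \frac{10 - 5\delta}{1-\delta^2}, \\
\pi_{Column}(\mbox{ignore}) &= \frac{5}{1-\delta} = \frac{5 + 5\delta}{1-\delta^2}.
\end{align*}
Ignoring is strictly better when $5 + 5\delta > 10 - 5\delta$, that is, when $\delta > 0.5$. A player who deviates once from mutual $Silence$ and then returns to Tit-for-Tat gets the same alternating stream, so that deviation is unprofitable only when $\delta \geq 0.5$. On this reckoning the two requirements are compatible only at $\delta = 0.5$.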
\newpage
\noindent
{\bf Theorem 1 (the Folk Theorem)
{ In an infinitely repeated n-person game with finite action sets at each
repetition, any profile of actions observed in any finite number of repetitions
is the unique outcome of some subgame perfect equilibrium given
{\bf
Condition 1:} The rate of time preference is zero, or positive and sufficiently
small;
{\bf Condition 2:} The probability that the game ends at any
repetition is zero, or positive and sufficiently small; and
{\bf Condition 3:} The set of payoff profiles that strictly Pareto dominate the
minimax payoff profiles in the mixed extension of the one-shot game is
$n$-dimensional.}
\newpage
\noindent
{\bf Condition 1: Discounting}
The Grim Strategy imposes the heaviest
possible punishment for deviant behavior.
\begin{center} {\bf The Prisoner's Dilemma }
\end{center}
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Column}\\
& & & {\it Silence} & & {\it Blame } \\ & & {\it
Silence } & 5,5 & $\rightarrow$ & -5,10 \\ & {\bf Row:}
&&$\downarrow$& & $\downarrow$ \\ & & {\it Blame } & 10,-5
& $\rightarrow$ & {\bf 0,0} \\
\end{tabular}
$$
\pi(equilibrium) = 5 + \frac{5}{r}
$$
$$
\pi(BLAME) = 10 + 0
$$
These are equal at $r=1$, so the critical discount factor is $\delta = \frac{1}{1+r} = 0.5$.
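
Spelling out the comparison, with the first-period payoff undiscounted as in the expressions above: following the Grim Strategy is at least as good as a one-time $Blame$ whenever
$$
5 + \frac{5}{r} \geq 10 \;\; \Longleftrightarrow \;\; r \leq 1 \;\; \Longleftrightarrow \;\; \delta = \frac{1}{1+r} \geq 0.5.
$$
Equivalently, $\frac{5}{1-\delta} \geq 10$. For example, at an assumed $r = 0.25$ ($\delta = 0.8$) the equilibrium payoff is $5 + 20 = 25 > 10$, while at $r = 2$ it is $7.5 < 10$ and the Grim Strategy no longer deters blaming.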
\newpage
\noindent
{\bf Condition 2: A probability of the game ending}
If $\theta>0$, the game ends
in finite time with probability one. The expected
number of repetitions is finite.
The probability that the game lasts till infinity is zero.
Compare with the Cauchy distribution (Student's t with one degree of freedom) which has no mean.
It still behaves like a discounted infinite
game, because the expected number of future repetitions is always large, no
matter how many have already occurred. It is ``stationary''.
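
A hedged sketch of why, taking $\theta$ to be the probability that the game ends after any given repetition (the slide does not define it explicitly): the probability that the game is still being played $t$ repetitions from now is $(1-\theta)^t$, so the expected number of future repetitions is
$$
\sum_{t=1}^{\infty} (1-\theta)^t = \frac{1-\theta}{\theta},
$$
whatever the number already played. With $\theta = 0.01$, for example, it is always 99. A payment of $x$ per repetition, discounted at rate $r$, is then worth
$$
\sum_{t=1}^{\infty} \left( \frac{1-\theta}{1+r} \right)^t x = \frac{(1-\theta)x}{r + \theta},
$$
just as if the game were infinite with discount factor $\frac{1-\theta}{1+r}$.
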
The game still has no Last Period, and
it is still true that imposing one, no matter how far beyond the expected number
of repetitions, would radically change the results.
``1 The game
will end at some uncertain date before $T$.''
``2 There is a constant
probability of the game ending.'' \\
\newpage
Amazing Grace on Stationarity
\begin{quotation} \noindent {\it When we've been there ten thousand years,\\
Bright shining as the sun,\\ We've no less days to sing God's praise \\
Than when we'd first begun.} \end{quotation}
\newpage
\noindent
{\bf Condition 3: Dimensionality }
\noindent
The ``minimax payoff'' is the payoff that results if
all the other players pick strategies solely to punish player $i$, and he
protects himself as best he can.
The set of strategies $s_{-i}^{i*}$ is a set of $(n-1)$ {\bf minimax
strategies} chosen by all the players except $i$ to keep $i$'s payoff as low as
possible, no matter how he responds. $s_{-i}^{i*}$ solves
\begin{equation}
\label{e5.1} \stackrel{Minimize}{s_{-i}}\;\; \stackrel{Maximum}{s_{i}}
\pi_i(s_i, s_{- i}). \end{equation}
Player $i$'s {\bf minimax payoff},
{\bf minimax value}, or {\bf security value}: his payoff from this.
We'll come back and talk about this more after finishing up the dimensionality condition.
\newpage
The dimensionality condition is needed only for games with three or more
players.
It is satisfied if there is some payoff profile for each player in
which his payoff is greater than his minimax payoff but still different from the
payoff of every other player.
Thus, a 3-person Ranked Coordination game would fail it.
The condition is necessary because establishing the desired behavior requires
some way for the other players to punish a deviator without punishing
themselves.
\includegraphics[width=6in]{fig05-01.jpg}
\begin{center}
{\bf The Dimensionality Condition}
\end{center}
\newpage
\noindent
{\bf Minimax and Maximin}
{\it The strategy $s_i^*$ is a {\bf maximin strategy} for player $i$ if, given
that the other players pick strategies to make $i$'s payoff as low as possible,
$s_i^*$ gives $i$ the highest possible payoff. In our notation, $s_i^*$ solves}
\begin{equation} \label{e5.2}
\stackrel{Maximize}{s_i}\;\; \stackrel{Minimum}{s_{-i}} \pi_i(s_i,s_{-i}).
\end{equation}
The minimax and maximin
strategies for a two-player game with Player 1 as $i$:
\begin{center}
\begin{tabular}{cccc}
Maximin:& $Maximum$& $Minimum$ & $\pi_1$ \\ & $s_1$ &$s_2$& \\
& & & \\
Minimax:&
$Minimum$& $Maximum$ & $\pi_1$ \\ & $s_2$ &$s_1$& \\
\end{tabular}
\end{center}
In the { Prisoner's Dilemma}, the minimax and maximin strategies are both
{\it Blame}.
\newpage
\begin{center} {\bf Another Minimaxing Game }
\end{center}
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Tom}\\
& & & {\it Left} & & {\it Right } \\
& & {\it
Up } & 0,0 & & 1,-1 \\
& {\bf Joe:}
&& & & \\
& & {\it Down } & 1,2
& & {\bf 3,3} \\
\end{tabular}
If Tom picks {\it Left}, the most Joe can get is 1, from {\it Down}. Tom minimaxes Joe using {\it Left}.
If Joe picks {\it Up}, the most Tom can get is 0, from {\it Left}. Joe minimaxes Tom using {\it Up}.
If Joe picks {\it Down}, the worst he can do is 1, from Tom picking {\it Left}. That is Joe's maximin strategy.
If Tom picks {\it Left}, the worst he can get is 0, if Joe picks {\it Up}. That is Tom's maximin strategy.
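
The four statements can be checked from the payoff table, restricting attention to pure strategies and writing $\pi_J$ and $\pi_T$ for Joe's and Tom's payoffs and $s_J$ and $s_T$ for their actions (notation introduced here for the check):
\begin{align*}
\min_{s_T} \max_{s_J} \pi_J &= \min \{ \max(0,1), \; \max(1,3) \} = 1, \\
\min_{s_J} \max_{s_T} \pi_T &= \min \{ \max(0,-1), \; \max(2,3) \} = 0, \\
\max_{s_J} \min_{s_T} \pi_J &= \max \{ \min(0,1), \; \min(1,3) \} = 1, \\
\max_{s_T} \min_{s_J} \pi_T &= \max \{ \min(0,2), \; \min(-1,3) \} = 0.
\end{align*}
In the first line the inner terms are for $Left$ and $Right$, in the second and third for $Up$ and $Down$, and in the fourth for $Left$ and $Right$ again.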
\newpage
Joe's Maximin value: The highest payoff Joe can assure himself if the other players are out to get
him.
Joe's Maximin strategy: A strategy that assures Joe of his maximin payoff.
Joe's Minimax value: The lowest payoff Joe's opponent can limit him to.
Tom's Minimax strategy against Joe: Tom's strategy that limits Joe to Joe's minimax payoff.
The minimax and maximin
strategies for a two-player game :
\begin{center}
\begin{tabular}{cccc}
1's maximin strategy & $Maximum$& $Minimum$ & $\pi_1$ \\ & $s_1$ &$s_2$& \\
& & & \\
2's strategy &
$Minimum$& $Maximum$ & $\pi_1$ \\
to minimax 1: & $s_2$ &$s_1$& \\
\end{tabular}
\end{center}
\newpage
Under minimax, Player 2 is purely malicious but must choose his mixing probability first,
in his attempt to cause player 1 the maximum
pain.
Under maximin, Player 1 chooses his mixing probability first, in the belief that Player 2 is out to
get him.
In variable-sum games, minimax is for sadists and maximin for
paranoids.
The maximin strategy need not be unique.
Since maximin behavior can also be viewed as minimizing the maximum loss that
might be suffered, decision theorists refer to such a policy as a {\bf minimax
criterion.}
\newpage
\begin{center}
{\bf
The Minimax Illustration Game}
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Column}\\
& & & $Left$ & & $Right$ \\ & & $ Up $ &
$-2, \fbox{2} $ & & $\fbox{1},-2$ \\ & {\bf Row:} & $Middle$& $ \fbox{1}, -2$
& & $-2, \fbox{2}$ \\ & & $Down$ & $ 0, \fbox{1} $ & &
$0,\fbox{1}$ \\
\end{tabular}
\end{center}
\vspace{-12pt}
In the Minimax Illustration Game Row can guarantee himself a payoff of 0 by
choosing $Down$, so that is his maximin strategy.
Column cannot hold Row's
payoff down to 0 by using a pure strategy, so his minimax strategy must be mixed.
Column's minimax strategy is {\it (Probability
0.5 of Left, Probability 0.5 of Right)}.
Row would respond with $Down$,
for a minimax payoff of 0, since either $Up$, $Middle$, or a mixture of the two
would give him a payoff of $-0.5$ ($=0.5 (-2) + 0.5 (1)$).
It happens that {\it Down, (Probability
0.5 of Left, Probability 0.5 of Right)} is a Nash equilibrium too.
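
To see this more generally, let $\theta$ denote the probability that Column puts on $Left$ (a symbol introduced here). Row's expected payoffs are
\begin{align*}
\pi_{Row}(Up) &= -2\theta + (1-\theta) = 1 - 3\theta, \\
\pi_{Row}(Middle) &= \theta - 2(1-\theta) = 3\theta - 2, \\
\pi_{Row}(Down) &= 0.
\end{align*}
At $\theta = 0.5$ the first two equal $-0.5$, so Row's best response is $Down$. At $\theta = 1$ Row gets 1 from $Middle$, and at $\theta = 0$ he gets 1 from $Up$, which is why no pure strategy holds him to 0. By this calculation any $\theta$ between $\frac{1}{3}$ and $\frac{2}{3}$ keeps both $1-3\theta$ and $3\theta-2$ at or below 0, so the 50:50 mixture is one convenient choice.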
\newpage
\begin{center}
{\bf
The Minimax Illustration Game}
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Column}\\
& & & $Left$ & & $Right$ \\ & & $ Up $ &
$-2, \fbox{2} $ & & $\fbox{1},-2$ \\ & {\bf Row:} & $Middle$& $ \fbox{1}, -2$
& & $-2, \fbox{2}$ \\ & & $Down$ & $ 0, \fbox{1} $ & &
$0,\fbox{1}$ \\
\end{tabular}
\end{center}
\vspace{-12pt}
Row's
strategy for minimaxing
Column is {\it (Probability 0.5 of Up, Probability 0.5 of Middle)}. Column then gets 0 with $Left$, $Right$, or a mixture.
Column's maximin
strategy is {\it (Probability 0.5 of Left, Probability 0.5 of Right)}, and his
minimax payoff is 0.
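
As a check, let $\gamma$ be the probability Row puts on $Up$ and $1-\gamma$ the probability on $Middle$ (symbols introduced here; $Down$ is left out because it gives Column 1 whatever he does). Column's expected payoffs are
\begin{align*}
\pi_{Column}(Left) &= 2\gamma - 2(1-\gamma) = 4\gamma - 2, \\
\pi_{Column}(Right) &= -2\gamma + 2(1-\gamma) = 2 - 4\gamma.
\end{align*}
One of these is positive unless $\gamma = 0.5$, when both are 0. Symmetrically, if Column plays $Left$ with probability 0.5 he gets $0.5(2) + 0.5(-2) = 0$ against $Up$ or $Middle$ and 1 against $Down$, so he is assured of at least 0.
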
The {\bf Minimax Theorem} (von Neumann [1928])
says that a minimax equilibrium exists in pure or mixed strategies for
every two-person zero-sum game and is identical to the maximin equilibrium.
\newpage
\bigskip
\noindent
{\bf Precommitment}
\noindent What if we allow players to commit at the start to a strategy for the rest of the game?
If precommitted strategies are chosen simultaneously, the equilibrium outcome of the finitely
repeated { Prisoner's Dilemma} calls for always blaming.
What about in sequence?
The outcome depends on the particular values of the parameters, but one possible equilibrium is the
following:
Row moves first and chooses the strategy ({\it Silence} until Column $Blame$s; thereafter always
$Blame$), and Column chooses ({\it Silence} until the last period; then $Blame$).
The observed outcome? Why is it Nash? The game has a second-mover advantage.
\newpage
\noindent The One-Sided Prisoner's Dilemma (Reputation)
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Consumer
(Column)}\\ & & & {\it Buy} & & {\it Boycott } \\ &
& {\it High Quality } & 5,5 & $\leftarrow$ & 0,0 \\ & {\bf Seller
(Row):} &&$\downarrow$& & $\updownarrow$ \\ & & {\it Low Quality }
& 10, -5 & $\rightarrow$ & {\bf 0,0} \\
\end{tabular}
The Nash and iterated dominance equilibria are both ({\it Low Quality, Boycott}), but it is not a
dominant-strategy equilibrium.
The Consumer does not have a dominant strategy, because if the Seller were to
choose {\it High Quality}, the Consumer would choose {\it Buy}, to obtain the payoff of 5;
but if the Seller chooses {\it Low Quality}, the Consumer would choose {\it Boycott}, for a payoff of zero.
{\it Low Quality} is, however, weakly dominant for the Seller, which makes ({\it Low Quality, Boycott}) the
iterated dominance equilibrium.
\newpage
\begin{center}
{\bf Product Quality}, Klein \& Leffler (1981)
\end{center}
\noindent
{\bf The Order of Play}\\
1 An endogenous number $n$ of firms decide to enter the market at cost $F$.\\
2 A firm that has entered chooses its quality to be $High$ or $Low$, incurring
the constant marginal cost $c$ if it picks $High$ and zero if it picks $Low$.
The choice is unobserved by consumers. The firm also picks a price $p$.\\
3 Consumers decide which firms to buy from. The amount bought from firm $i$ is denoted $q_i$.\\
4 All consumers observe the quality of all goods purchased in that period.\\
5 The game returns to (2) and repeats.
\newpage
\noindent {\bf Payoffs}\\
Consumers buy $q(p) = \sum_{i=1}^n q_i$ of high quality, where $\frac{dq}{dp} < 0$, and 0 of low quality.
If a firm stays out,
its payoff is zero.\\ If firm $i$ enters, it receives $-F$ immediately. Its
current end-of-period payoff is $q_ip$ if it produces $Low$ quality and $q_i(p-
c)$ if it produces $High$ quality. The discount rate is $r \geq 0$.
\vspace*{1in}
\noindent
An equilibrium:
{\bf Firms.} $\tilde{n}$ firms enter. Each produces high quality and
sells at price $\tilde{p}$. If a firm ever deviates from this, it thereafter
produces low quality (and sells at the same price $\tilde{p} $).
\noindent {\bf Buyers.} Buyers start by choosing randomly among the firms
charging $\tilde{p}$. Thereafter, they remain with their initial firm unless it
changes its price or quality, in which case they switch randomly to a firm that
has not changed its price or quality.
\newpage
The equilibrium must satisfy three constraints: incentive compatibility, competition, and market
clearing.
The {\bf incentive compatibility} constraint says that the individual firm must
be willing to produce high quality.
\begin{equation}\label{e5.3}
\frac{q_i p}{1+r} \leq \frac{q_i(p-c)}{r} \;\;\;\;\;\;\;\;\; (incentive \;
compatibility). \end{equation}
That means the price must satisfy:
\begin{equation}\label{e5.4} \tilde{p} \geq
(1+r)c. \end{equation}
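
The step from (\ref{e5.3}) to (\ref{e5.4}) can be spelled out as follows, assuming $q_i > 0$ and $r > 0$ so that both sides can be multiplied by $\frac{r(1+r)}{q_i}$:
\begin{align*}
\frac{q_i p}{1+r} \leq \frac{q_i (p-c)}{r}
\;\; &\Longleftrightarrow \;\; rp \leq (1+r)(p-c) \\
&\Longleftrightarrow \;\; (1+r)c \leq p.
\end{align*}
On this reading, the left side of (\ref{e5.3}) is the value of selling low quality once at the high-quality price, and the right side is the value of the profit stream from continuing to sell high quality.
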
The second constraint is that competition drives profits to zero, so firms
are indifferent between entering and staying out of the market.
\begin{equation}
\label{e5.5} \frac{q_i(p-c)}{r} = F \;\;\;\;\;\;\;\;\; (competition)
\end{equation}
Substituting $p = (1+r)c$ from (\ref{e5.4})
gives
\begin{equation}\label{e5.6} q_i = \frac{F }{c}.
\end{equation}
\newpage
Third,
the output must equal the quantity demanded by the
market.
\begin{equation}\label{e5.7} nq_i = q(p). \;\;\;\;\;\;\;\;\; (market
\;\; clearing) \end{equation}
Combining equations (\ref{e5.4}), (\ref{e5.6}),
and (\ref{e5.7}) yields \begin{equation}\label{e5.8}
\tilde{ n} = \frac{cq([1+r]c)}{F }. \end{equation}
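
The chain of substitutions, taking the price constraint $\tilde{p} \geq (1+r)c$ to hold with equality as the substitution in the text does, is
\begin{align*}
\tilde{p} - c &= (1+r)c - c = rc, \\
\frac{q_i (rc)}{r} &= q_i c = F \;\; \Longrightarrow \;\; q_i = \frac{F}{c}, \\
\tilde{n} &= \frac{q(\tilde{p})}{q_i} = \frac{c \, q([1+r]c)}{F}.
\end{align*}
As a purely hypothetical numerical example (none of these values come from the model), let $c = 4$, $r = 0.25$, $F = 40$, and $q(p) = 100 - 10p$. Then $\tilde{p} = 5$, $q_i = 10$, $q(5) = 50$, and $\tilde{n} = 5$; each firm's profit stream is worth $\frac{10(5-4)}{0.25} = 40 = F$.
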
What if there were no entry cost?
Would profits be dissipated?
\newpage
\noindent {\bf Reputation: Umbrella Branding }
What if there are two goods? Could a firm do better by using umbrella branding, selling both
under the threat of losing its entire reputation if one of them turns out to be defective?
What is your intuition?
Would it matter if the seller was a monopoly or not?
\newpage
\begin{center}
{\bf Customer Switching Costs}, Farrell \& Shapiro (1988)
\end{center}
{\bf Players}\\
Firms Apex and Brydox, and a series of customers, each of whom is first called
a youngster and then an oldster.
\noindent {\bf The Order of Play }\\
1a Brydox, the initial incumbent, picks the incumbent price $p_{1} ^i$.\\ 1b
Apex, the initial entrant, picks the entrant price $p_{1}^e$.\\ 1c The oldster
picks a firm.\\
1d The youngster picks a firm.\\ 1e Whichever firm attracted the youngster
becomes the incumbent.\\ 1f The oldster dies and the youngster becomes an
oldster.\\
2a Return to (1a), possibly with new identities for entrant and incumbent.
\newpage
\noindent {\bf Payoffs}\\ The discount factor is $\delta$. The customer
reservation price is $R$ and the switching cost is $c$. The per period payoffs
in period $t$ are, for $j \in \{i,e\}$,
\vspace{1in}
Payoff for firm $j$: \\
\begin{tabular}{ll}
\hspace{-1.5in} & $\left\{ \begin{tabular}{ll} 0 & if no customers are
attracted.\\
$p_t^j$ & if just oldsters or just youngsters \\ $2p_t^j$ & if
both oldsters and youngsters \\ \end{tabular} \right.$
\end{tabular}
\vspace{1in}
The payoff for an oldster: \\
\hspace{-1.5in} \begin{tabular}{ll} $ $& $\left\{ \begin{tabular}{ll} $R -
p_t^i$ & if he buys from the incumbent.\\ $R - p_t^e - c$ & if he switches to
the entrant. \\ \end{tabular} \right.$ \end{tabular}
\vspace{1in}
The payoff for a youngster: \\
\hspace{-1.5in} \begin{tabular}{ll} & $\left\{ \begin{tabular}{ll} $R -
p_t^i $ & if he buys from the incumbent.\\ $R -p_{t}^e$ & if he buys from the
entrant.\\ \end{tabular} \right.$ \end{tabular}
\newpage
{\it A {\bf Markov strategy} is a strategy that, at each node, chooses the
action independently of the history of the game except for the immediately
preceding action (or actions, if they were simultaneous). }
Here, a firm's Markov strategy is its price as a function of whether the
particular firm is the incumbent or the entrant, and not a function of the entire
past history of the game.
There are two ways to use Markov strategies:
(1) The right way. Look for equilibria that
use Markov strategies ({\bf perfect Markov
equilibrium}).
(2) The wrong way. Disallow non-Markov strategies and then look for
equilibria.
\newpage
Brydox, the initial incumbent, moves first. It does not want Bertrand competition and zero profits. So it chooses $p^i$ low enough that
Apex is not tempted to choose $p^e < p^i-c$ and steal away the oldsters.
Entrant Apex's
profit is $p^i$ if it chooses $p^e = p^i$ and serves just youngsters (we need for it to get ALL the youngsters in equilibrium---open-set problem) and
$2(p^i-c)$ if it chooses $p^e = p^i-c$ and serves both oldsters and youngsters.
Brydox chooses $p^i$ to make Apex indifferent between these alternatives, so
\begin{equation} \label{e5.9}
p^i=2(p^i-c), \end{equation} and
\begin{equation} \label{e5.10} p^i = 2c.
\end{equation}
Apex will get all the youngsters, and therefore
in equilibrium, Apex and Brydox take turns being the incumbent. Also, Apex charges
the same price as Brydox, which is the most it can get away with charging the youngsters:
$$
p^e= p^i = 2c.
$$
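
To check the indifference condition, compare Apex's two options for a given incumbent price $p^i$:
$$
2(p^i - c) > p^i \;\; \Longleftrightarrow \;\; p^i > 2c.
$$
Any incumbent price above $2c$ therefore invites undercutting, and $2c$ is the highest price that does not. With an assumed switching cost of $c = 3$, for example, $p^i = 6$ gives Apex 6 from either option, whereas $p^i = 7$ gives it 7 from matching and 8 from undercutting.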
\newpage
Let's compute the payoffs. First, note that the Oldsters pay the same price as the Youngsters, even though they are the captive customers.
The equilibrium payoff of the current
entrant is the immediate payment of $p^e$ plus the discounted value of being the
incumbent in the next period:
\begin{equation} \label{e5.11}
\pi_e^* = p^e + \delta \pi_i^*. \end{equation}
The incumbent's payoff is the immediate payment of $p^i$ plus the discounted value of
being the entrant next period: \begin{equation} \label{e5.12} \pi_i^* = p^i +
\delta \pi_e^*.
\end{equation}
In equilibrium the incumbent and the entrant sell the same
amount at the same price, so $\pi_i^*= \pi_e^*$ and
\begin{equation} \label{e5.13}
\pi_i^* = 2c + \delta \pi_i^*. \end{equation} It follows that
\begin{equation} \label{e5.14} \pi_i^* = \pi_e^* = \frac{2c}{1 - \delta}.
\end{equation}
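
The same value follows without imposing $\pi_i^* = \pi_e^*$ in advance. Substituting the entrant's payoff equation into the incumbent's and using $p^e = p^i = 2c$,
\begin{align*}
\pi_i^* &= p^i + \delta (p^e + \delta \pi_i^*), \\
\pi_i^* &= \frac{p^i + \delta p^e}{1 - \delta^2} = \frac{2c(1+\delta)}{(1-\delta)(1+\delta)} = \frac{2c}{1-\delta}.
\end{align*}
With assumed values $c = 3$ and $\delta = 0.9$, each firm's payoff is $\frac{6}{0.1} = 60$.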
\newpage
\bigskip \noindent {\bf 5.6 Evolutionary Equilibrium: { Hawk-Dove} }
\noindent
{\it A strategy $s^*$ is an {\bf evolutionarily stable strategy}, or {\bf ESS},
if, using the notation $\pi(s_i,s_{-i})$ for player $i$'s payoff when his
opponent uses strategy $s_{-i}$, for every other strategy $s'$ either}
\begin{equation}\label{e4.5} \pi( s^*,s^*) > \pi( s',s^*) \end{equation} {\it
or} \begin{equation}\label{e4.6}
\begin{array}{l} (a) \;\; \pi( s^*,s^*) = \pi( s',s^*)\\ {\rm and}\\ (b) \;\;
\pi( s^*,s') > \pi( s',s'). \\ \end{array} \end{equation}
\noindent If condition (\ref{e4.5}) holds, then a population of players using
$s^*$ cannot be invaded by a deviant using $s'$. If condition (\ref{e4.6})
holds, then $s'$ does well against $s^*$, but badly against itself, so that if
more than one player tried to use $s'$ to invade a population using $s^*$, the
invaders would fail.
\newpage
\noindent
{\it A strategy $s^*$ is an {\bf evolutionarily stable strategy}, or {\bf ESS},
if, using the notation $\pi(s_i,s_{-i})$ for player $i$'s payoff when his
opponent uses strategy $s_{-i}$, for every other strategy $s'$ either}
\begin{equation}\label{e4.5} \pi( s^*,s^*) > \pi( s',s^*) \end{equation} {\it
or} \begin{equation}\label{e4.6}
\begin{array}{l} (a) \;\; \pi( s^*,s^*) = \pi( s',s^*)\\ {\rm and}\\ (b) \;\;
\pi( s^*,s') > \pi( s',s'). \\ \end{array} \end{equation}
Condition (\ref{e4.5}) is satisfied when $s^*$ is a strong Nash equilibrium (although not every strong
Nash strategy is an ESS).
Condition (\ref{e4.6}) is satisfied if $s^*$ is only a weak Nash strategy, but the weak alternative $s'$
is not a best response to itself.
ESS is a refinement of Nash: Nash plus:
(a) it has the highest payoff of any strategy used in equilibrium (which rules out equilibria with
asymmetric payoffs),
(b) any other best response $s'$ is not as good a response to $s'$ as $s^*$ is.
\newpage
ESS is a refinement of Nash: Nash plus:
(a) it has the highest payoff of any strategy used in equilibrium (which rules out equilibria with
asymmetric payoffs),
(b) $s^*$ does better against any other best response $s'$ than $s'$ does against itself.
Example: The Battle of the Sexes. The mixed strategy equilibrium is an ESS, because a player
using it has as high a payoff as any other player. The two pure strategy equilibria are not made up
of ESS's, though, because in each of them one player's payoff is higher than the other's.
Ranked Coordination has two pure strategy equilibria. They both use ESS's.
The ``bad'' equilibrium strategy is an ESS,
because given that the other players are using it, no player could do as well by deviating.
The mixed-strategy equilibrium is a best response to itself.
\newpage
Example: The Utopian Exchange Economy. In Utopia, each citizen can produce either one or two
units of individualized output. He will then go into the marketplace and meet another citizen.
If either of them produced only one unit, trade cannot increase their payoffs.
If both of them
produced two, they can trade one unit for one unit, and both end up happier with more variety.
\newpage
\begin{center} {\bf The Utopian Exchange Economy Game }
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Jones}\\
& & & {\it Low Output} & & {\it High Output} \\ & &
{\it Low Output} & {\bf 1, 1} & $\leftrightarrow$ & { 1, 1}\\ & {\bf
Smith:} &&$\updownarrow$& & $\downarrow$ \\ & & {\it High Output } &
{ 1,1} & $\rightarrow$ & {\bf 2,2} \\
\end{tabular}
\end{center}
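This game has two Nash equilibria, both in pure strategies: ({\it Low Output, Low Output}), which is weak, and ({\it High Output, High Output}), which is strict. There is no mixed-strategy equilibrium, because any weight on {\it High Output} by one player makes {\it High Output} the other's unique best response.
{\it High Output} is
an ESS by condition (\ref{e4.5}): it is a strict Nash equilibrium.
{\it Low Output} fails to meet part (b) of condition (\ref{e4.6}). {\it High Output} is a weak best response to it, and {\it High Output} does better against itself than {\it Low Output} does against it.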
If the economy began with all
citizens choosing {\it Low Output}, then if Smith deviated to {\it High
Output} he would not do any better, but if {\it two} people deviated to {\it
High Output}, they would do better in expectation because they might meet each
other and receive (2,2).
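
In terms of the ESS definition, with $\pi(s, s')$ the payoff to $s$ against $s'$, the check is:
\begin{align*}
\pi(High, High) = 2 &> \pi(Low, High) = 1, \\
\pi(Low, Low) = 1 &= \pi(High, Low) = 1, \\
\pi(Low, High) = 1 &< \pi(High, High) = 2.
\end{align*}
The first line is the strict inequality that makes {\it High Output} an ESS. The second and third lines show that against {\it Low Output} the alternative {\it High Output} is an equally good reply and does better against itself, so {\it Low Output} is not an ESS. If a fraction $\epsilon$ of the population (a symbol introduced here) chose {\it High Output}, a {\it High Output} citizen would expect $(1-\epsilon)(1) + \epsilon (2) = 1 + \epsilon > 1$.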
\newpage
\noindent {\bf An Example of ESS: { Hawk-Dove} }
A resource worth $V=2$ ``fitness units'' is at stake when
the two birds meet. If they both fight, the loser incurs a cost of $C=4$, which
means that the expected payoff when two Hawks meet is $-1$ ($=0.5[2] + 0.5[-4]$)
for each of them.
\begin{center} {\bf Table 5 { Hawk-Dove}: Economics Notation }
\begin{tabular}{lllccc} & & &\multicolumn{3}{c}{\bf Bird Two}
\\ & & & {\it Hawk} & & $ Dove$ \\ & & $Hawk$
& -1,-1 & $\rightarrow$ & {\bf 2,0}\\ & {\bf Bird One:} &&$\downarrow$& &
$\uparrow$ \\ & & {\it Dove } & {\bf 0, 2} & $\leftarrow$ & 1,1 \\
\end{tabular}
\end{center}
\vspace{-24pt}
{\it Payoffs to: (Bird One, Bird Two). Arrows show how a player can increase
his payoff. }
\bigskip
\begin{center} {\bf Table 6 { Hawk-Dove}: Biology Notation}
\begin{tabular}{lllccc}
& & &\multicolumn{3}{c}{\bf Bird Two}\\ & &
& {\it Hawk} & & $ Dove$ \\ & & $Hawk$ & -1 & & 2
\\ & {\bf Bird One:} && & & \\ & & {\it Dove } & 0 & & 1 \\
\multicolumn{6}{l}{\it Payoffs to: (Bird One) } \end{tabular} \end{center}
\newpage
{ Hawk-Dove} has no symmetric pure-strategy Nash equilibrium, and hence no
pure-strategy ESS, since in the two asymmetric Nash equilibria, $Hawk$ gives a
bigger payoff than $Dove$, and the doves would disappear from the population.
In the mixed-strategy ESS, the equilibrium strategy is to be a hawk with
probability 0.5 and a dove with probability 0.5, which can be interpreted as a
population 50 percent hawks and 50 percent doves.
The equilibrium is stable in a
sense similar to the Cournot equilibrium. If 60 percent of the population were
hawks, a bird would have a higher fitness level as a dove. If ``higher
fitness'' means being able to reproduce faster, the number of doves increases
and the proportion returns to 50 percent over time.
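
The 50 percent figure can be derived from the payoffs in Table 6. Let $\theta$ be the proportion of hawks (a symbol introduced here). The expected fitness of each type is
\begin{align*}
\pi(Hawk) &= -1 \cdot \theta + 2(1-\theta) = 2 - 3\theta, \\
\pi(Dove) &= 0 \cdot \theta + 1 \cdot (1-\theta) = 1 - \theta.
\end{align*}
These are equal when $2 - 3\theta = 1 - \theta$, so $\theta = 0.5$. At $\theta = 0.6$ a hawk gets $0.2$ and a dove gets $0.4$, so doves do better; at $\theta = 0.4$ a hawk gets $0.8$ and a dove $0.6$, so hawks do better.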
\newpage
{\bf The bourgeois strategy} (a correlated strategy) is an ESS. Under this strategy, the bird
behaves as a hawk if it arrives first,
and a dove if it arrives second.
The bourgeois strategy has an expected payoff of 1 from meeting itself, and
behaves exactly like a 50:50 randomizer when it meets a strategy that ignores
the order of arrival, so it can successfully invade a population of 50:50
randomizers.
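
These payoffs can be verified, assuming each bird is equally likely to arrive first and writing $B$ for the bourgeois strategy and $M$ for the 50:50 randomizer (labels introduced here):
\begin{align*}
\pi(B, B) &= 0.5(2) + 0.5(0) = 1, \\
\pi(M, M) &= 0.25(-1 + 2 + 0 + 1) = 0.5, \\
\pi(B, M) &= \pi(M, B) = 0.5.
\end{align*}
Bourgeois does as well against the randomizers as they do against each other, and better against itself ($1 > 0.5$), so the randomizers fail part (b) of the ESS definition once bourgeois birds appear. A pure hawk or pure dove gets $0.5(-1) + 0.5(2) = 0.5$ or $0.5(0) + 0.5(1) = 0.5$ against bourgeois, less than 1.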
\newpage
The ESS is suited to games in which all the players are identical and
interacting in pairs.
The
approach requires specifying three things:\\
(1) the initial population proportions
and the probabilities of interactions,\\
(2) the pairwise interactions, \\
(3)
the dynamics by which players with higher payoffs increase in number in the
population.
\newpage
Slow dynamics also makes the starting point of the game important, unlike the
case when adjustment is instantaneous. Figure 2, taken from David Friedman
(1991), shows a way to graphically depict evolution in a game in which all three
strategies of $Hawk$, $Dove$, and $Bourgeois$ are used. A point in the triangle
represents a proportion of the three strategies in the population. At point
$E_3$, for example, half the birds play $Hawk$, half play $Dove$, and none play
$Bourgeois$, while at $E_4$ all the birds play $Bourgeois$.
\includegraphics[width=150mm]{fig05-02.jpg}
\begin{center} {\bf Evolutionary Dynamics in the Hawk-Dove-Bourgeois Game} \end{center}
\newpage
\includegraphics[width=150mm]{fig05-02.jpg}
The figure also shows the importance of mutation in biological games. If the
population of birds is 100 percent dove, as at $E_2$, it stays that way in
the absence of mutation, since if there are no hawks to begin with, the fact
that they would reproduce at a faster rate than doves becomes irrelevant.
}
\end{LARGE}
}
\end{document}