\documentclass[12pt]{article}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}
\reversemarginpar \topmargin -1in
\oddsidemargin .25in \textheight 9.4in \textwidth 6.4in
\begin{document}
\parindent 24pt \parskip 10pt
\setcounter{page}{502}
\noindent 23 November 2005. \\
Eric Rasmusen, Erasmuse@indiana.edu
\begin{LARGE}
\begin{center}
{\bf Mathematical Appendix}
\end{center}
\end{LARGE}
\noindent
This appendix has three purposes: to remind some readers of the definitions of
terms they have seen before, to give other readers an idea of what the terms
mean, and to list a few theorems for reference. In accordance with these limited
purposes, some terms such as ``boundary point'' are left undefined. For fuller
exposition, see Rudin (1964) on real analysis, Debreu's {\it Theory of Value}
(1959), and Chiang (1984) and Takayama (1985) on mathematics for economists.
Intriligator (1971) and Varian (1992) both have good mathematical appendices and
are strong in discussing optimization, and Kamien \& Schwartz (1991) covers
maximizing by choice of functions. Border's 1985 book is entirely about fixed
point theorems. Stokey \& Lucas (1989) is about dynamic programming. Fudenberg
\& Tirole (1991a) is the best source of mathematical theorems for use in game
theory.
The web is very useful for mathematical definitions. See
http://en.wikipedia.org, http://mathworld.wolfram.com, and
http://planetmath.org.
\noindent
{\bf *A.1 Notation}
\begin{description}
\item[ $\sum$ ] {\bf Summation.} $\sum_{i=1}^3 x_i = x_1+x_2+x_3$.
\item[ $\Pi$ ] {\bf Product.} $\Pi_{i=1}^3 x_i = x_1x_2x_3.$
\item[$|x|$] {\bf Absolute value} of $x$. If $x \geq 0$ then $|x| = x$ and if
$x <0$ then $|x| = -x$.
\item[ $|$ ] {\bf ``Such that,''} ``given that,'' or ``conditional upon.''
$\{x|x<3 \}$ denotes the set of real numbers less than three. $Prob(x|y <5)$
denotes the probability of $x$ given that $y$ is less than 5.
\item[:] {\bf ``Such that.''} $\{x:x<3 \}$ denotes the set of real numbers
less than three. The colon is a synonym for $|$.
\item[${\bf R}^n$] The set of $n$-dimensional vectors of {\bf real numbers }
(integers, fractions, and the least upper bounds of any subsets thereof).
\item[ $ \{x,y,z \}$] {\bf A set of elements} $x, y$, and $z$. The set
$\{3,5\}$ consists of two elements, 3 and 5.
\item[ $\in$ ] {\bf ``Is an element of.''} $a \in \{2,5\}$ means that $a$ takes
either the value 2 or 5.
\item[ $\subset$ ]{\bf Set inclusion.} If $X=\{2,3,4\}$ and $Y=\{2,4\}$, then
$Y\subset X$ because $Y$ is a subset of $X$.
\item[ ] $[x,y]$ The {\bf closed interval} with endpoints $x$ and $y$. The
interval $[0, 1000]$ is the set $\{x| 0 \leq x \leq 1000 \}$. Square brackets
are also used as delimiters.
\item[ $(x,y)$ ] The {\bf open interval} with endpoints $x$ and $y$. The
interval $(0, 1000)$ is the set $\{x| 0 < x < 1000 \}$. $(0,1000]$ would be a
half-open interval, the set $\{x| 0 < x \leq 1000 \}$. Parentheses are also
used as delimiters.
\item[ $x!$ ] {\bf x-factorial.} $x! =x (x-1)(x-2)....(2) (1)$. $4! = 4(3)(2)
(1) = 24$. \item[ $ \left( \begin{array}
{l} a \\ b\\ \end{array}
\right)$] The number of unordered combinations of $b$ elements from a set
with $a$ elements. $\left( \begin{array}{l} a \\ b\\ \end{array} \right) =
\frac{a!}{b!(a-b)!}$, so $ \left( \begin{array}{l}
4 \\ 3\\ \end{array}
\right) = \frac{4!} {3!(4-3)!}= 24/6 = 4. $ (See {\it combination} and
$permutation$ below.)
\item[ $\times$ ] The {\bf Cartesian product.} $X \times Y$ is the set of
points $\{x,y\}$, where $x \in X$ and $y \in Y$.
\item[ $\epsilon$ ] An {\bf arbitrarily small positive number.} If my payoff
from both $Left$ and $Right$ equals 10, I am indifferent between them; if my
payoff from $Left$ is changed to $10 + \epsilon$, I prefer $Left$.
\item[ $\sim$ ] We say that $X \sim F$ if the random variable $X$ is {\bf
distributed according to} distribution $F$.
\item[ $\exists$] {\bf ``There exists...''} $\exists x>0: 9-x^2=0$.
\item[ $\forall$] {\bf ``For all...''} $\forall x \in [0,3], x^2 <10.$
\item[ $\equiv$] {\bf ``Equals by definition''} ``For clarity, let us define the
average income $x \equiv \frac{\theta^{w-1}}{(a-1)^2 +b^2 + c}$ for use in the
expressions below.''
\item[ $\rightarrow$] If $f$ {\bf maps} space $X$ into space $Y$ then $f: X
\rightarrow Y$.
\item[ $\frac{d f}{d x}$, $\frac{d^2 f}{d x^2}$ ] The {\bf first and second
derivatives} of a function. If $f(x) = x^2$ then $\frac{d f}{d x} = 2x$ and
$\frac{d^2 f}{d x^2} = 2$.
\item[ $f', f''$ ] The {\bf first and second derivatives} of a function. If
$f(x) = x^2$ then $f'= 2x$ and $f'' = 2$. Primes are also used on variables (not
functions) for other purposes: $x'$ and $x''$ might denote two particular values
of $x$.
\item[ $\frac{\partial f}{\partial x}$, $\frac{\partial^2 f}{\partial x
\partial y}$ ] {\bf Partial derivatives } of a function. If $f(x,y) = x^2y$
then $\frac{\partial f}{\partial x} =2xy$ and $\frac{\partial^2 f}{\partial x
\partial y} = 2x$.
\item[ $y_{-i}$] The set $y$ minus element $i$. If $y= \{y_1,y_2,y_3 \}$, then
$y_{-2} =\{y_1, y_3 \}$.
\item[ $Max( x,y)$ ] The {\bf maximum} of two numbers $x$ and $y$.
$Max(8,24)=24$.
\item[ $Min( x,y )$ ] The{\bf minimum} of two numbers $x$ and $y$. $Min(5,3)=
3$.
\item[$\lceil x \rceil$ ]{\bf Ceiling} ($x$). A number rounded up to the
nearest integer. $\lceil 4.2 \rceil =5$. This notation is not well known in
economics.
\item[$\lfloor x \rfloor$ ] {\bf Floor } ($x $). A number rounded down to
the nearest integer. $\lfloor 6.9 \rfloor =6$. This notation is not well known
in economics.
\item[ $Sup \;X$] The {\bf supremum (least upper bound)} of set $X$. If $ X =
\{x | 0 \leq x < 1000 \}$, then $sup\; X = 1000$. The supremum is useful because
sometimes, as here, no maximum exists.
\item[ $Inf\; X$ ] The {\bf infimum (greatest lower bound)} of set $X$. If $
X =\{x | 0 \leq x < 1000 \}$, then $inf \;X = 0$.
\item[ $Argmax$ ] The {\bf argument that maximizes} a function. If $e^* =
argmax\; EU(e)$, then $e^*$ is the value of $e$ that maximizes the function
$EU(e)$. The argmax of $f(x)=x - x^2$ is 1/2.
\item[$Maximum$] The {\bf greatest value} that a function can take. $Maximum
(x-x^2) = 1/4$.
\item[$Minimum$] The {\bf least value} that a function can take. $Minimum
(-5+x^2) = -5$.
\end{description}
\noindent {\bf *A.2 The Greek Alphabet}
\bigskip \begin{tabular}{lll}
$ A$ & $\alpha$ & alpha \\
$B$ & $\beta$ & beta \\ $\Gamma$ & $\gamma$ & gamma \\
$\Delta$ & $\delta$ & delta \\ $E$ & $\epsilon$ or $\varepsilon$ & epsilon\\
$Z$ & $\zeta$ & zeta\\ $H$ & $\eta$ & eta\\ $\Theta$ & $\theta$ & theta\\ $I$
& $\iota$ & iota\\ $K$ & $\kappa$ &kappa\\ $\Lambda$ & $\lambda$ & lambda\\
$M$ & $\mu$ & mu\\ $N$ & $\nu$ & nu\\ $\Xi$ & $\xi$ & xi\\ $O$ & $o$ &
omicron\\ $\Pi$ & $\pi$ &pi\\ $P$ & $\rho$ & rho\\
$\Sigma$ & $\sigma$ & sigma\\ $T$ & $\tau$ & tau\\ $\Upsilon$ & $\upsilon$ &
upsilon\\ $\Phi$ & $\phi$ & phi\\ $X$ & $\chi$ & chi\\ $\Psi$ & $\psi$ & psi\\
$\Omega$ & $\omega$ & omega
\end{tabular}
\bigskip
\noindent {\bf *A.3 Glossary}
\begin{description}
\item[{\bf almost always}] See ``generically.''
\item[{\bf annuity}] A riskless security paying a constant amount each year for
a given period of years, with the amount conventionally paid at the end of each
year.
\item[{\bf closed}] A closed set in ${\bf R}^n$ includes its boundary points.
The set $\{x: 0 \leq x \leq 1000 \}$ is closed.
\item[{\bf combination}] The number of unordered sets of $b$ elements from a
set with $a$ elements, denoted $ \left( \begin{array}{l} a \\
b\\
\end{array} \right) = \frac{a!}{b!(a-b)!}$. If we form sets of 2 element from
the set $ A=\{w,x,y,z\}$, the possibilities are $ \{w,x \}, \{w,y \}, \{w,z
\}, \{x,y \}, \{x,z \}, \{y,z \}$. Thus, $ \left( \begin{array} {l} 4 \\ 2\\
\end{array} \right) = \frac{4!}{2!(4-2)!}= 24/6 = 6. $ (See $permutation$
for the ordered version.)
\item[{\bf compact}] If set $X$ in ${\bf R}^n$ is closed and bounded, then $X$
is compact. Outside of Euclidean space, however, a set being closed and
bounded does not guarantee compactness.
\item[{\bf complete metric space}]
A metric space that includes the limits of all possible Cauchy sequences. All
compact metric spaces and all Euclidean spaces are complete.
\item[{\bf concave function}] The continuous function $f(x)$ defined on interval
$X$ is concave if for all elements $w$ and $z$ of $X$, $f(0.5w+0.5z) \geq
0.5f(w)+0.5f(z)$. If $f$ maps {\bf R} into {\bf R} and $f$ is concave, then
$f'' \leq 0$. See Figure 1.
\includegraphics[width=150mm]{figa-01.jpg}
\begin{center} {\bf Figure 1: Concavity and Convexity }
\end{center}
\item[{\bf continuous function}] Let $d(x,y)$ represent the distance between
points $x$ and $y$. The function $f$ is continuous if for every $\epsilon > 0$
there exists a $\delta(\epsilon)>0$ such that $d(x,y) < \delta (\epsilon)$
implies $d(f(x),f(y)) < \epsilon$.
\item[{\bf continuum}] A continuum is a closed interval of the real line, or a
set that can be mapped one-to-one onto such an interval.
\item[{\bf contraction}] The mapping $f(x)$ is said to be a contraction if there
exists a number $c < 1$ such that for the metric $d$ of the space $X$,
\begin{equation} \label{eA2.1}
d(f(x),f(y)) \leq c*d(x,y),\; {\rm for \;\; all}\; x,y \in X. \end{equation}
\item[{\bf convex function}] The continuous function $f(x)$ is convex if for all
elements $w$ and $z$ of $X$, $f(0.5w+0.5z) \leq 0.5f(w)+0.5f(z)$. See Figure 1.
Convex functions are only loosely related to convex sets.
\item[{\bf convex set}] If set $X$ is convex, then if you take any two of its
elements $w$ and $z$ and a real number $t: 0 \leq t \leq 1$, then $tw + (1-t)z$
is also in $X$.
\item[{\bf correspondence}] A correspondence is a mapping that maps each point
to one or more other points, as opposed to a function, which only maps to one.
\item[{\bf domain}] The domain of a mapping is the set of elements it maps
from-- something like the land that it can alter as it pleases. (The mapping
maps from the domain onto the range.)
\item[{\bf Leibniz's integral rule}] This is the rule for differentiation
under the integral sign.
\begin{equation}
\frac{\partial} {\partial z} \int_{a(z)}^{b(z) } f(x,z) dx = f(b(z),z)
\frac{\partial b(z)} {\partial z} -
f(a(z),z)\frac{\partial a(z)} {\partial z} +
\int_{a(z)}^{b(z) } \frac{\partial f(x,z)}{ \partial z} dx.
\end{equation}
\item[{\bf function}] If $f$ maps each point in $X$ to exactly one point in $Y$,
$f$ is called a function. The two mappings in Figure 1 are functions, but the
mapping in Figure 2 is not.
\item[{\bf generically}] If a fact is true on set $X$ generically, ``except on
a set of measure zero,'' or ``almost always,'' then it is false only on a
subset of points $Z$ that have the property that if a point is randomly chosen
using a density function with support $X$, a point in $Z$ is chosen with
probability zero. This implies that if the fact is false on $z \in {\bf R}^n$
and $z$ is perturbed by adding a random amount $\epsilon$, the fact is true on
$z+\epsilon$ with probability one.
\item[{\bf integration by parts}] This is a technique to rearrange integrals so
they can be solved more easily. It uses the formula
\begin{equation}
\int_{z=a}^{b} g(z) h'(z) dz = g(z) h(z)\left|_{z=a}^{b} \right. - \int_{z=a}
^{b} h(z) g'(z) dz.
\end{equation}
To derive this, differentiate $g(z) h(z)$ using the chain rule, integrate each
side of the equation, and rearrange.
\item[{\bf Lagrange multiplier}] The Lagrange multiplier $\lambda$ is the
marginal value of relaxing a constraint in an optimization problem. If the
problem is\\
\{ \begin{tabular}{c} $Maximize$\\$x$ \end{tabular} $x^2$ subject to $x \leq 5$
\}, then $\lambda = 2x^* = 10$.
\item[{\bf lattice}] A lattice is a partially ordered set (the $\geq$ ordering
is defined) where for any two elements $a$ and $b$, the values $inf(a,b)$ and
$sup(a,b)$ are also in the set.
A lattice is complete if the infimum and supremum of each of its subsets are in
the lattice.
\item[{\bf list}] (or $n$-Tuple) A set of $n$ elements in which the element
positions are ordered. $(Up, Down, Up)$ is a list, but $(Down, Up, Up)$ are
$(Up, Down)$ are different lists. See {\it set} and {\it multiset}.
\item[{\bf lower semicontinuous correspondence}] The correspondence $\phi$ is
lower semicontinuous at the point $x_0$ if \begin{equation} \label{eA2.2} x_n
\rightarrow x_0,\; y_0 \in \phi (x_0),\; {\rm implies} \;\; \exists y_n \in
\phi(x_n) \;\;{\rm such\;\;that}\;\;y_n \rightarrow y_0, \end{equation} which
means that associated with every $x$ sequence leading to $x_0$ is a $y$ sequence
leading to its image. See Figure 2. This idea is not as important as upper
semicontinuity.
\item[{\bf maximand}] A maximand is what is being maximized. In the problem
``Maximize $f(x, \theta)$ by choice of $x$'', the maximand is $f$.
\item[{\bf mean-preserving spread}] See the {\bf Risk} section below.
\item[{\bf measure zero}] See ``generically.''
\item[{\bf metric}] The function $d(w,z)$ defined over elements of set $X$ is a
metric if (1) $d(w,z) > 0$ if $w \neq z$ and $d(w,z) = 0$ if and only if $w =
z$; (2) $d(w,z) =d(z,w)$; and (3) $d(w,z) \leq d(w,y) + d(y,z)$ for points
$w,y,z \in X$.
\item[{\bf metric space}] Set $X$ is a metric space if it is associated with a
metric that defines the distance between any two of its elements.
\item[{\bf multiset}]
A set of $n$ elements
in which position in the listing is does not matter but multiplicity does.
$(Up, Down, Up)$ is a multiset, and $(Down, Up, Up)$ is the same multiset,
but $(Up, Down)$ is a different multiset. See {\it list} and {\it set}.
\item[{\bf $n$-Tuple}] (or Tuple). See {\it list}.
\item[{\bf one-to-one}] The mapping $f: X \rightarrow Y$ is one-to-one if every
point in set $X$ maps to a different point in $Y$, so $x_1 \neq x_2$ implies
$f(x_1) \neq f(x_2)$. An example is $f(x)=x/2$ with $X=[0,1]$ and $Y=[0,2]$.
\item[{\bf onto}] The mapping $f: X \rightarrow Y$ is onto $Y$ if every point in
$Y$ is mapped onto by some point in $X$. An example is $f(x)=x^2$ with $X=[-1,1]
$ and $Y=[0,1]$.
\item[{\bf open}] In the space ${\bf R}^n$, an open set is one that does not
include all its boundary points. The set $\{x: 0 \leq x < 1000 \}$ is open
(even though it does include {\it one} of its boundary points). In more general
spaces, an open set is a member of a topology.
\item[{\bf permutation}] The number of lists (sets with an order) of $b$
elements from a set with $a$ elements, which equals $\frac{a!}{ (a-b)!}$. If
we form sets of 2 elements from the set $ A=\{w,x,y,z \}$, the possibilities
are\\
$ \{w,x \}, \{x,w \}, \{w,y \}, \{y,w \}, \{w,z \}, \{z,w \}, \{x,y \}, \{y,x
\}, \{x,z \}, \{z,x \}, \{y,z \}, \{z,y \}$. \\
The number of these is $ \frac{4!}{ (4-2)!}= 24/2 = 12. $ (See
$combination$ for the unordered version.)
\item[{\bf perpetuity}] A riskless security paying a constant amount each year
in perpetuity, with the amount conventionally paid at the end of each year.
\item[{\bf quasi-concave}] The continuous function $f$ is quasi-concave if for
$w \neq z$, $f(0.5w + 0.5z) > min [f(w),f(z)]$, or, equivalently, if the set
$\{x \in X|f(x) > b\}$ is convex for any number $b$. Every concave function is
quasi-concave, but not every quasi-concave function is concave.
\item[{\bf quasilinear utility}] A utility function is quasilinear in variable
$w$ if a monotonic transformation can make it linear in $w$ and $w$ is separable
from all other variables in the utility function. $u(w,x) = w + \sqrt(x)$ and
$u(w,x) = log( w + \sqrt(x))$ are quasilinear; $u(w,x) = w x $ and $u(w,x) =
log( w ) + \sqrt(x) $ are not.
\item[{\bf range}] The range of a mapping is the set of elements to which it
maps-- the property over which it can spew its output. (The mapping maps
from the domain onto the range.)
\item[{\bf risk}] See the {\bf Risk} section below.
\item[{\bf set}] A set is a collection of objects in which position of
listing and multiplicity do not matter. $\{Up, Down\}$, $\{ Down,
Up\}$, and $\{Up, Down, Down\}$ are all the same set of 2 elements. See
{\it list} and {\it multiset}.
\item[{\bf stochastic dominance}] See the {\bf Risk} section below.
\item[{\bf strict}] The word ``strict'' is used in a variety of contexts to mean
that a relationship does not hold with equality or is not arbitrarily close to
being violated. If function $f$ is concave and $f' >0$, then $f'' \leq 0$, but
if $f$ is strictly concave, then $f'' < 0$. The opposite of ``strictly'' is
``weakly.'' The word ``strong'' is sometimes used as a synonym for ``strict.''
\item[{\bf supermodular}] See the {\bf Supermodularity} section below.
\item[{\bf support}] The support of a probability distribution $F(x)$ is the
closure of the set of values of $x$ such that the density is positive. If each
output between 0 and 20 has a positive probability density, and no other output
does, then the support of the output distribution is [0,20].
\item[ {\bf topology}] Besides denoting a field of mathematics, a topology is a
collection of subsets of a space called ``open sets'' that includes (1) the
entire space and the empty set, (2) the intersection of any finite number of
open sets, and (3) the union of any number of open sets. In a metric space, the
metric ``induces'' a topology by defining an open set. Imposing a topology on a
space is something like defining which elements are close to each other, which
is easy to do for ${\bf R}^n$ but not for every space (e.g., spaces consisting
of functions or of game trees).
\item[{\bf upper semicontinuous correspondence}] The correspondence $\phi:
X\rightarrow Y$ is upper semicontinuous at point $x_0$ if \begin{equation}
\label{eA2.5}
x_n \rightarrow x_0,\; y_n \in \phi (x_n),\; y_n \rightarrow y_0, \;\;{\rm
implies}\;\; y_0 \in \phi (x_0), \end{equation}
which means that every sequence of points in $\phi(x)$ leads to a point also
in $\phi(x)$. See Figure 2. An alternative definition, appropriate only if $Y$
is compact, is that $\phi$ is upper semicontinuous if the set of points
$\{x,\phi(x) \}$ is closed.
\includegraphics[width=150mm]{figa-02.jpg}
\begin{center} {\bf Figure 2 Upper Semicontinuity} \end{center}
\item[ {\bf vector}] A vector is a list (a set with order) that has a certain
structure I will not describe here, but which, for example, a list of real
numbers, a point in ${\bf R}^n$, will satisfy. The point $(2.5, 3, -4)$ is a
vector in ${\bf R}^3$.
\item[{\bf weak}] The word ``weak'' is used in a variety of contexts to mean
that a relationship might hold with equality or be on a borderline. If $f$ is
concave and $f' >0$, then $f'' \leq 0$, but to say that $f$ is weakly concave,
while technically adding nothing to the meaning, emphasizes that $f'' = 0$ under
some or all parameters. The opposite of ``weak'' is ``strict'' or ``strong.''
\end{description}
\bigskip
\noindent {\bf *A.4 Formulas and Functions}
$log (xy) = log(x) + log(y)$.
$log (x^2) = 2log (x)$.
$a^x = (e^{log (a)})^x$.
$e^{rt} = (e^r)^t$.
$e^{a+b} = e^a e^b$.
$a>b \Rightarrow ka < kb$, if $k<0$.
\noindent {\it The Quadratic Formula}: Let $ ax^2 + bx + c = 0. $ Then $ x =
\frac{-b \pm \sqrt{b^2-4ac}}{2a}. $
\noindent {\it Derivatives}
\begin{tabular}{l|l} \hline $f(x)$ & $f'(x)$\\ \hline & \\ $x^a$ & $ax^{a-1}$\\
& \\ $1/x$ & $-\frac{1}{x^2}$\\ & \\ $\frac{1}{x^2}$ & $-\frac{2}{x^3}$\\ & \\
$e^x$ & $e^x$\\ & \\ $e^{rx}$ & $re^{rx}$\\ & \\ $log(ax)$ & $1/x$\\ & \\
$log(x)$ & $1/x$\\ & \\ $a^x$ & $a^x log(a)$\\ & \\ $ f(g(x))$ & $f'(g(x))
g'(x)$ \\ & \\ \hline \end{tabular}
\bigskip \noindent {\it Determinants}
$ \left| \begin{array}{cc} a_{11} & a_{12}\\ a_{21} & a_{22}\\ \end{array}
\right| = a_{11}a_{22} - a_{21}a_{12}. $
$$ \left| \begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}
\\ a_{31} & a_{32} & a_{33}\\ \end{array} \right| = a_{11}a_{22}a_{33} - a_{11}
a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33} +a_{13}a_{21}a_{32} -
a_{13}a_{22}a_{31}. $$
\noindent {\bf Table 1: Some Useful Functional Forms}
\begin{tabular}{l|llll}
\hline
& & & & \\
${\it f(x)}$ & ${\it f'(x)}$ & ${\it f''(x)}$ & {\it Slope for $x > 0$} & {\it
Curvature}\\
& & & & \\
\hline
& & & & \\
$log(x)$ & $\frac{1}{x}$ & $-\frac{1}{x^2}$ & increasing & concave\\ & &
& & \\
$\sqrt{x}$ & $\frac{1}{2\sqrt{x}}$ & $-\frac{1}{4x^{(3/2)}}$ & increasing &
concave\\
& & & & \\
$x^2$ & $ 2x$ & 2 & increasing& convex\\
& & & & \\
$\frac{1}{x}$& $-\frac{1}{x^2}$& $\frac{2}{x^3}$ & decreasing & convex\\
& & & & \\
$7-x^2$ & $ -2x$ & -2 & decreasing& concave\\
& & & & \\ $7x-x^2$ & $ 7-2x$ & -2 & increasing/decreasing & concave\\ &
& & & \\ \hline \end{tabular}
The signs of derivatives can be confusing. The function $f(x) = x^2$ is
increasing at an {\it increasing} rate, but the function $f(x) = \frac{1}{x}$
is decreasing at a {\it decreasing} rate, even though $f''>0$ in each case.
\bigskip \noindent {\bf *A.5 Probability Distributions} The definitive listing
of probability distributions and their characteristics is the three-volume
series of Johnson \& Kotz (1970). A few major distributions are listed here. A
{\bf probability distribution} is the same as a {\bf cumulative density
function} for a continuous distribution. Any single value has infinitesimal
probability if the distribution is continuous (unless there is a probability
atom there), so rather than speaking of the probability of a value, we speak of
the probability of the value being in an interval, or of the {\bf density} at a
single value.
\noindent {\bf The Exponential Distribution}\\
The exponential distribution, which has the set of nonnegative real numbers as
its support, has the density function for mean $\lambda$ of
\begin{equation} \label{eA2.3d}
f(x)= \frac{ e^{ -x/\lambda}}{\lambda}.
\end{equation}
The cumulative density function is
\begin{equation} \label{eA2.3e}
F(x)= 1 - e^{-x/\lambda}.
\end{equation}
\noindent {\bf The Uniform Distribution}\\
A variable is uniformly distributed over support $X$ if each point in $X$ has
equal probability. If the support is $[\alpha,\beta]$, the mean is
$\alpha+\beta)/2$, the density is
\begin{equation} \label{eA2.3}
f(x)=\left\{ \begin{array}{ll} 0 & x<\alpha \\
& \\
\frac{1}{\beta-\alpha} & \alpha \leq x \leq \beta \\
& \\
0 & x>\beta,\\
\end{array} \right.
\end{equation}
and the cumulative density function is \begin{equation} \label{eA2.4} F(x)=
\left\{ \begin{array}{ll}
0 & x<\alpha \\
& \\
\frac{x-\alpha}{\beta-\alpha} & \alpha\leq x\leq \beta \\
& \\
1 & x>\beta \end{array} \right.
\end{equation}
\noindent {\bf The Normal Distribution}\\
The normal distribution is a two-parameter single-peaked distribution which
has as its support the entire real line. The density function for mean $\mu$
and variance $\sigma^2$ is \begin{equation} \label{eA2.3a} f(x)= \frac{1}
{\sqrt{2\pi \sigma^2}} e^{\frac{-(x-\mu)^2}{2 \sigma^2}} \end{equation} The
cumulative density function is the the integral of this, often denoted $\Phi(x)
$, which cannot be simplified analytically. You can find values at websites
such as \\
http://www.math2.org/math/stat/distributions/z-dist.htm or from a spreadsheet
program.
\noindent {\bf The Lognormal Distribution}\\
If $log(x)$ has a normal distribution, $x$ has a lognormal distribution.
This is a skewed distribution which has the set of positive real numbers as
its support since the logarithm of a negative number is not defined. The mean
is $e^{\sigma^2/2}$.
\bigskip \noindent {\bf A.6 Supermodularity}
Suppose that there are $N$ players in a game, subscripted by $i$ and $j$, and
that player $i$ has a strategy consisting of $\overline{s}^i$ elements,
subscripted by $s$ and $t$, so his strategy is the vector $ y^i = (y^i_{1},
\ldots, y^i_{ \overline{s}^i})$. Let his strategy set be $S^i$ and his payoff
function be $\pi^i(y^i, y^{-i}; z)$, where $z$ represents a fixed parameter.
We say that the game is a {\bf supermodular game} if the following four
conditions are satisfied for every player $i = 1, \ldots N$:
\noindent (A1) $S^i$ is a complete lattice.
\noindent (A2) $\pi^i: S \rightarrow R \cup \{-\infty \}$ is order
semicontinuous in $y^i$ for fixed $y^{-i}$, and order continuous in $y^{-i}$
for fixed $y^i$, and has a finite upper bound.
\noindent
(A3) $\pi^i$ is supermodular in $y^i$, for fixed $y^{-i}$. For all strategy
profiles $y$ and $y'$ in $S$,
\begin{equation} \label{eA2.13c}
\pi^i(y) + \pi^i(y') \leq \pi^i(supremum\{y,y'\} ) + \pi^i( infimum \{y,y'\}
). \end{equation}
\noindent
(A4) $\pi^i$ has increasing differences in $y^i$ and $y^{-i}$. For all $y^i
\geq y^{i}'$, the difference $\pi^i(y^i, y^{-i} ) - \pi^i(y^{i}', y^{-i} ) $ is
nondecreasing in $y^{-i}$.
In addition, it is sometimes useful to use a fifth assumption:
\noindent
(A5) $\pi^i$ has increasing differences in $y^i$ and $z$ for fixed $y^{-i}$;
for all $y^i \geq y^{i}'$, the difference $\pi^i( y^i, y^{-i},z ) - \pi^i(
y^{i}', y^{-i}, z ) $ is nondecreasing with respect to $z$.
\bigskip \noindent The conditions for {\bf smooth supermodularity} are:
\noindent A1$'$ The strategy set is an interval in $\boldmath{R^{\overline{s}
^i}} $: \begin{equation} \label{e13.53} S^i = [\underline{y^i}, \overline{y^i}].
\end{equation}
\noindent A2$'$ $\pi^i$ is twice continuously differentiable on $S^i$.
\noindent A3$'$ (Supermodularity) Increasing one component of player $i$'s
strategy does not decrease the net marginal benefit of any other component: for
all $i$, and all $s$ and $t$ such that $ 1 \leq s < t \leq \overline{s}^i$,
\begin{equation} \label{e13.54} \frac{\partial^2 \pi^i}{ \partial y^i_{s}
\partial y^i_{t}} \geq 0. \end{equation}
\noindent A4$'$ (Increasing differences in one's own and other strategies)
Increasing one component of $i$'s strategy does not decrease the net marginal
benefit of increasing any component of player $j$'s strategy: for all $i \neq
j$, and all $s$ and $t$ such that $1 \leq s \leq \overline{s}^i$ and $1
\leq t \leq \overline{s}^j $, \begin{equation} \label{e13.55}
\frac{\partial^2 \pi^i}{ \partial y^i_{s} \partial y^j_{t}} \geq 0 .
\end{equation}
\noindent The fifth assumption becomes
\noindent A5$'$: (Increasing differences in one's own strategies and parameters)
Increasing parameter $z$ does not decrease the net marginal benefit to player
$i$ of any component of his own strategy: for all $i$, and all $s$ such
that $ 1 \leq s \leq \overline{s}^i$, \begin{equation} \label{e13.56a}
\frac{\partial^2 \pi^i}{ \partial y^i_{s} \partial z } \geq 0 . \end{equation}
\bigskip
\noindent {\bf Theorem 1}\\
{\it If the game is supermodular, there exists a largest and smallest Nash
equilibrium in pure strategies.}
Theorem 1 is useful because it shows (a) existence of an equilibrium in pure
strategies, and (b) if there are at least two equilibria (note that the largest
and smallest equilibria might be the same strategy profile), then two of them
can be ranked in the magnitudes of the components of each player's equilibrium
strategy.
\noindent {\bf Theorem 2}\\ { If the game is supermodular and assumption
(A5$'$) is satisfied, then the largest and smallest equilibria are
nondecreasing functions of the parameter $z$.}
\noindent {\bf Theorem 3}\\ { If a game is supermodular, then for each player
there is a largest and smallest serially undominated strategy, where both of
these strategies are pure. }
\noindent {\bf Theorem 4}\\ {Let $\underline{y^i}$ denote the smallest
element of player $i$'s strategy set $S^i$ in a supermodular game. Let $y^*$
and $y^*'$ denote two equilibria, with $y^* \geq y^*'$, so $y$ is the ``big''
equilibrium. Then, \\ 1. If $\pi^i( \underline{y^i}, y^{-i})$ is increasing in
$y^{-i}$, then $\pi^i (y^*) \geq \pi^i (y^*')$.\\ 2. If $\pi^i(
\underline{y^i},y^{-i})$ is decreasing in $y^{-i}$, then $\pi^i (y^*) \leq
\pi^i (y^*')$.\\ 3. If the condition in (1) holds for a subset $N_1$ of players,
and the condition in (2) holds for the remainder of the players, then the big
equilibrium $y^*$ is the best equilibrium for players in $N_1$ and the worst for
the remaining player, and the small equilibrium $y^*'$ is the worst equilibrium
for players in $N_1$ and the best for the remaining players.}
The theorems here are taken from Milgrom \& Roberts (1990). Theorem 1 is their
corollary to Theorem 5. Theorem 2 is their Theorem 6 and corollary. Theorem 3
is their Theorem 5, and 4 is their Theorem 7. For more on supermodularity,
see Milgrom \& Roberts (1990), Fudenberg \& Tirole (1991, pp. 489-497), or
Vives's 2005 survey article. For a mathematician's view, see Topkis's
1998 {\it Supermodularity and Complementarity}.
\bigskip
\noindent {\bf A.7 Fixed Point Theorems}
Fixed points theorems say that various kinds of mappings from one set to another
result in at least one point being mapped back onto itself. The most famous
fixed point theorem is Brouwer's Theorem, illustrated in Figure 3. I will use a
formulation from page 952 of Mas-Colell, Whinston \& Green (1994).
\includegraphics[width=150mm]{figa-03.jpg}
\begin{center} {\bf Figure 3 A Mapping with Three Fixed Points}
\end{center}
\noindent {\bf The Brouwer Fixed Point Theorem.} Suppose that set A in $R^N$ is
nonempty, compact, and convex; and that $f: A\rightarrow A $ is a continuous
function from $A$ into itself. ( ``Compact'' means closed and bounded, in
Euclidean space.) Then $f$ has a fixed point; that is, there is an $x$ in A
such that $x = f(x)$.
The usefulness of fixed point theorems is that an equilibrium is a fixed
point. Consider equilibrium prices. Let $p$ be a point in $N$-space consisting
of one price for each good. Let $P$ be the set of all possible price points.
$P$ will be convex and compact if we limit it to finite prices. Agents in the
economy look at $p$ and make decisions about consumption and output. These
decisions change $p$ into $f(p)$. An equilibrium is a point $p*$ such that
$f(p*) = p*$. If you can show that $f$ is continuous, you can show that $p*$
exists.
This is also true for Nash equilibria. Let $s$ be a point in $N- space$
consisting of one strategy for each player -- a strategy profile. Let $S$ be the
set of all possible strategy profiles. This will be compact and convex if we
allow mixed strategies (for convexity) and if strategy sets are closed and
bounded. Each strategy profile $s$ will cause each player to react by choosing
his best response $f(s)$. A Nash equilibrium is $s*$ such that $f(s*) = s*$.
If you can show that $f$ is continuous -- which you can do if payoff functions
are continuous -- you can show that $s*$ exists.
The Brouwer theorem is useful in itself, and conveys the intuition of fixed
point theorems, but to prove existence of prices in general equilibrium and
existence of Nash equilibrium in game theory requires the Kakutani fixed point
theorem. That is because the mappings involved are not one-to-one functions,
but one-to-many-point correspondences. In general equilibrium, one firm might
be indifferent between producing various amounts of output. In game theory, one
player might have two best responses to another player's strategy.
\noindent
{\bf The Kakutani Fixed Point Theorem} (Kakutani [1941]) Suppose that set $A$ in
$R^N$ is a nonempty, compact, convex set and that $f: A\rightarrow A $ is an
upper hemicontinuous correspondence from A into itself, with the property
that the set $f(x)$ is nonempty and convex for every $x$. Then $f$ has a
fixed point; that is, there is an $x$ in $A$ such that $x $ is one
element of $f(x)$.
Other fixed point theorems exist for other kinds of mappings --- for example
for a mapping from a set of functions back into itself. In deciding which
theorem to use, care must be taken to identify the mathematical nature of the
set of strategy profiles and the smoothness of the best response functions.
\bigskip
\noindent {\bf *A.8 Genericity}
Suppose we have a space $X$ consisting of the interval between 0 and 100 on the
real line, $[0,100]$, and a function $f$ such that $f(x) = 3$ except that
$f(15) = 5$. We can then say any of the following:
1 $f(x) =3$ except on a set of measure zero.
2 $f(x) =3$ except on a null set.
3 Generically, $f(x) =3$.
4 $f(x) =3$ almost always.
5 The set of $x $ such that $f(x) =3$ is dense in $X$.
6 The set of $x $ such that $f(x) =3$ has full measure.
These all convey the idea that if parameters are picked using a continuous
random density, $f(x)$ will be 3 with probability one, and any other value of
$f(x)$ is very special in that sense. If you start with point $x$, and add
a small random perturbation $\epsilon$, then with probability one, $f(x +
\epsilon) = 3$. So unless there is some special reason for $x$ to take the
particular value of 15, you can count on observation $f(x) =3$.
Statements like these always depend on the definition of the space $X$. If,
instead, we took a space $Y$ consisting of the integers between 0 and 100,
which is {0,1,2,...,100}, then it is not true that ``f(x) = 3 except on a set
of measure zero.'' Instead, if $x$ is chosen randomly, there is a 1/101
probability that $x = 15$ and $f(x)= 5$.
The concept of ``a set of measure zero'' becomes more difficult to implement
if the space $X$ is not just a finite interval. I have not defined the
concept in these notes; I have just pointed to usage. This, however, is enough
to be useful to you. A course in real analysis would teach you the
definitions. As with the concepts of ``closed'' and ``bounded'',
complications can arise even in economic applications because of infinite
spaces and in dealing with spaces of functions, game tree branchings, or other
such objects.
Now, let us apply the idea to games. Here is an example of a theorem that
uses genericity.
\noindent {\bf Theorem:} ``Generically, all finite games of perfect information
have a unique subgame perfect equilibrium.''
\noindent {\bf Proof.} A game of perfect information has no simultaneous moves,
and consists of a tree in which each player moves in sequence. Since the game
is finite, each path through the tree leads to an end node. For each end node,
consider the decision node just before it. The player making the decision
there has a finite number $N$ of choices, since this is a finite game. Denote
the payoffs from these choices as $(P_1, P_2,...,P_N)$. This set of payoffs
has a unique maximum, because generically no two payoffs will be equal. (If they
were, and you perturbed the payoffs a little, with probability one they would
no longer be equal, so games with equal payoffs have measure zero.) The player
will pick the action with the biggest payoff. Every subgame perfect equilibrium
must specify that the players choose those actions, since they are the unique
Nash strategies in the subgames at the end of the game.
Next, consider the next-to-last decision nodes. The player making the
decision at such a node has a finite number of choices, and using the payoffs
determined from the optimal choice of final moves, he will find that some move
has the maximum payoff. The subgame perfect equilibrium must specify that move.
Continue this procedure until you reach the very first move of the game. The
player there will find that some one of his finite moves has the largest payoff,
and he will pick that one move. Each player will have one best action choice at
each node, and so the equilibrium will be unique. Q.E.D.
Genericity entered this as the condition that we are ignoring special games
in the theorem's statement -- games that have tied payoffs. Whether those are
really special or not depends on the context.
\bigskip \noindent
{\bf *A.9 Discounting}
\noindent A model in which the action takes place in real time must specify
whether payments and receipts are valued less if they are made later, i.e.,
whether they are {\bf discounted.} Discounting is measured by the discount
rate or the discount factor.
\noindent {\it The {\bf discount rate}, $r$, is the extra fraction of a unit of
value needed to compensate for delaying receipt by one period. }
\noindent {\it The {\bf discount factor}, $\delta$, is the equivalent in present
units of value of one unit to be received one period from the present.}
The discount rate is analogous to the interest rate, and in some models the
interest rate determines the discount rate. The discount factor represents
exactly the same idea as the discount rate, and $\delta= \frac{1}{1+r}$. Models
use $r$ or $\delta$ depending on notational convenience. Not discounting is
equivalent to $r=0$ and $\delta = 1$, so the notation includes zero discounting
as a special case.
Whether to put discounting into a model involves two questions. The first is
whether the added complexity will be accompanied by a change in the results or
by a surprising demonstration of no change in the results. A second, more
specific question is whether the events of the model occur in real time, so that
discounting is appropriate. The bargaining game of Alternating Offers from
Section 12.3 can be interpreted in two ways. One way is that the players make
all their offers and counteroffers between dawn and dusk of a single day, so
essentially no real time has passed. The other way is that each offer consumes
a week of time, so that the delay before the bargain is reached is important to
the players. Discounting is appropriate only in the second interpretation.
Discounting has two important sources: time preference and a probability that
the game might end, represented by the rate of time preference, $\rho$, and the
probability each period that the game ends, $\theta$. It is usually assumed
that $\rho$ and $\theta$ are constant. If they both take the value zero, the
player does not care whether his payments are scheduled now or ten years from
now. Otherwise, a player is indifferent between $\frac{x}{1+\rho}$ now and $x$
guaranteed to be paid one period later. With probability $(1-\theta)$ the game
continues and the later payment is actually made, so the player is indifferent
between $(1-\theta)x/(1+\rho)$ now and the promise of $x$ to be paid one period
later contingent upon the game still continuing. The discount factor is
therefore \begin{equation}\label{e1} \delta = \frac{1}{1+r} = \frac{(1-\theta)}
{(1+\rho)}. \end{equation}
Table 2 summarizes the implications of discounting for the value of payment
streams of various kinds. We will not go into how these are derived, but they
all stem from the basic fact that a dollar paid in the future is worth $\delta$
dollars now. Continuous time models usually refer to rates of payment rather
than lump sums, so the discount factor is not so useful a concept, but
discounting works the same way as in discrete time except that payments are
continuously compounded. For a full explanation, see a finance text (e.g.,
Appendix A of Copeland \& Weston [1988]).
\begin{small} \begin{center} {\bf Table 2 Discounting}
\begin{tabular}{|lll|}
\hline & \multicolumn{2}{c|}{\bf Discounted Value} \\ & {\bf r-notation} &
{\bf $\delta$-notation}\\ {\bf Payoff Stream} & (discount rate) & (discount
factor)\\ \hline & &\\ $x$ at the end of one period & $\frac{x}{1+r}$ & $\delta
x$\\ & &\\ $x$ at the end of each period in perpetuity & $\frac{x}{ r}$ &
$\frac{\delta x}{1-\delta} $\\ & &\\ $x$ at the start of each period in
perpetuity & $x + \frac{x}{ r}$ & $\frac{ x}{1-\delta} $\\ & &\\ $x$ at the
end of each period up through $T$ (first formula)& $\sum_{t=1}^T \frac{x}{(1+r)
^t}$ & $\sum_{t=1}^T \delta^t x$\\ & &\\ $x$ at the end of each period up
through $T$ (second formula)& $ \frac{x}{r} \left( 1- \frac{1}{(1+r)^T} \right)$
& $\frac{\delta x} {1-\delta} \left( 1- \delta^T \right)$ \\ & &\\ \hline &
&\\ $x$ at time $t$ in continuous time& $ x e^{-rt}$ & ---\\ & &\\ Flow of $x$
per period up to time $T$ in continuous time& $ \int_0^T x e^{-rt} dt$ &
---\\ & &\\ Flow of $x$ per period in perpetuity, in continuous time& $
\frac{x}{r}$ & ---\\ & &\\ \hline \end{tabular} \end{center} \end{small}
The way to remember the formula for an annuity over a period of time is to use
the formulas for a payment at a certain time in the future and for a perpetuity.
A stream of $x$ paid at the end of each year is worth $\frac{x}{r}$. A payment
of $Y$ at the end of period $T$ has a present value of $\frac{-Y}{(1+r)^T}$.
Thus, if at the start of period $T$ you must pay out a perpetuity of $x$ at the
end of each year, the present value of that payment is $ \left(\frac{x}{r}
\right)\left(\frac{1} {1+r} \right)^T$. One may also view a stream of payments
each year from the present until period $T$ as the same thing as owning a
perpetuity but having to give away a perpetuity in period $T$. This leaves a
present value of $\left(\frac{x}{r} \right) \left(1 - (\frac{1}{ 1+r })^T
\right)$, which is the second formula for an annuity given in Table 2. Figure
4 illustrates this approach to annuities and shows how it can also be used to
value a stream of income that starts at period $S $ and ends at period $T$.
\includegraphics[width=150mm]{figa-04.jpg}
\begin{center} {\bf Figure 4: Discounting }
\end{center}
Discounting will be left out of most dynamic games in this book, but it is an
especially important issue in infinitely repeated games, and is discussed
further in Section 5.2.
\bigskip \noindent {\bf *A.10 Risk}
We say that a player is {\bf risk averse} if his utility function is strictly
concave in money, which means that he has diminishing marginal utility of money.
He is {\bf risk neutral} if his utility function is linear in money. The
qualifier ``in money'' is used because utility may be a function of other
variables too, such as effort.
We say that probability distribution $F$ {\bf dominates} distribution $G$ in
the sense of {\bf first-order stochastic dominance} if the cumulative
probability that the variable will take a value less than $x$ is greater for $G$
than for $F$, i.e. if
\begin{equation} \label{e7.9}
{\rm for \;\;any\;\; } x,\; F(x) \leq G(x),
\end{equation}
and (\ref{e7.9}) is a strong inequality for at least one value of $x$. The
distribution $F$ dominates $G$ in the sense of {\bf second-order stochastic
dominance} if the area under the cumulative distribution $G$ up to $G(x)$ is
greater than the area under $F$, i.e. if
\begin{equation} \label{e7.10} {\rm for \;\;any\;\; } x, \int_{-\infty}^x F(y)
dy \leq \int_{-\infty}^x G(y)dy, \end{equation} and (\ref{e7.10}) is a strong
inequality for some value of $x$. Equivalently, $F$ dominates $G$ if, limiting
$U$ to increasing functions for first-order dominance and increasing concave
functions for second-order dominance,
\begin{equation} \label{e9} {\rm for \;all\; functions\; } U, \int_{-\infty}
^{+\infty} U(x) dF(x) > \int_{-\infty}^{+\infty} U(x) dG(x).
\end{equation}
If $F$ is a first-order dominant gamble, it is preferred by all players; if $F$
is a second-order dominant gamble, it is preferred by all risk-averse players.
If $F$ is first-order dominant it is second-order dominant, but not vice versa.
Milgrom (1981b) has used stochastic dominance to carefully define what we mean
by {\bf good news}. Let $\theta$ be a parameter about which the news is received
in the form of message $x$ or $y$, and let utility be increasing in $\theta$.
The message $x$ is more favorable than $y$ (is ``good news'') if for every
possible nondegenerate prior for $F(\theta) $, the posterior $F(\theta|x)$
first-order dominates $F(\theta|y)$.
Rothschild \& Stiglitz (1970) shows how two gambles can be related in other
ways equivalent to second-order dominance, the most important of which is the
{\bf mean-preserving spread}. Informally, a mean-preserving spread is a density
function which transfers probability mass from the middle of a distribution to
its tails. More formally, for discrete distributions placing sufficient
probability on the four points $a_1$, $a_2$, $a_3$, and $a_4$,
{\it A {\bf mean-preserving spread} is a set of four locations $a_1