\documentclass[12pt, fleqn, usenames,dvipsnames]{article}
\usepackage{amsmath}
\usepackage{eurosym}
% \usepackage[fleqn]{nccmath}
% \usepackage{amsthm}
%\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{hyperref}
\usepackage{verbatim}
\hypersetup{breaklinks=true,
pagecolor=white,
colorlinks=true,
linkcolor= blue,
hyperfootnotes= true,
urlcolor=blue
}
\urlstyle{rm}
\usepackage{fancyhdr} \pagestyle{fancy} \fancyhead{}\fancyfoot{} \rhead{\thepage}
\newcommand{\margincomment}[1]
{\mbox{}\marginpar{\tiny\hspace{0pt}#1}}
\newcommand{\comments}[1]{}
\renewcommand{\baselinestretch}{1.2}
% \oddsidemargin -.1in
%\evensidemargin -.1in
%\textwidth 7in
\begin{document}
% \begin{raggedright}
\parindent 18pt
\parskip 10pt
\titlepage
%\vspace*{12pt}
\begin{center}
\begin{Large}
{\sc **On the Efficiency and Desirability of Disclosure of Whether Someone Has a Clean Criminal Record \\
\footnote{(Steve--- here's a new possible title--- we should stress the positive aspect of being able to prove a clean record. Also we should include not just employers but volunteering and marriage, if we can fit them in.}\\
%\bigskip
Eric Rasmusen and Steven Shavell
May 1, 2016
}%end of small capitals
\end{Large}
{\it Abstract}
\end{center}
\begin{footnotesize}
werwerwewerr
This is not ready for circulation.
\noindent
Eric Rasmusen: Dan R. and Catherine M.
Dalton
Professor, Department of Business Economics and Public Policy, Kelley
School
of Business, Indiana University, Bloomington Indiana.
\href{mailto:erasmuse@indiana.edu}{ Erasmuse@indiana.edu}.
Phone: (812) 345-8573. Messaging: 8123458573@vtext.com.
\noindent
Steven Shavell:
This paper:
\url{http://www.rasmusen.org/papers/crime-records-rasmusen-shavell.pdf}. (not posted yet)
{
\noindent
Keywords:sdfsdf.}
\noindent
{ We would like to thank sdfsdfsd for their comments. }
\end{footnotesize}
\newpage
\noindent
{\sc 1. Introduction}
Not written yet.
\noindent
\underline{ 2. Basic assumptions: crime-averse and crime-prone people. }
The total population is of size 1 and is made up of two mutually exclusive groups of individuals, crime prone and crime averse. Crime-prone people are a fraction $\beta$ of the population, and crime-averse make up the remainder, $\gamma$.
People live for two periods, youth and maturity, earning a wage of $w$ in each period. We assume zero discounting.
In the youth period, a crime-prone person has the opportunity to commit a harmful act, causing harm $h>0$, where $h$ does not vary across people. If a crime-prone person commits the harmful act, he enjoys a benefit $b$, where $b$ is distributed uniformly on $[0,h]$.\footnote{It might well be that we can relax the uniform density assumption and just use a general $f(b)$. That is all that is needed for the comparison between regimes. The only possible use of the uniform density is in the comparative statics on the detection probability $p$ for the full-information case. Eric put in the assumption because he thought it would allow him to solve for $p^*$, but it turns out solving is still too hard.} If there is a type of crime-prone person who is indifferent between committing a crime and not, we will denote his type by $b^*$. It will be seen when we define social welfare that the commission of the harmful act lowers social welfare by $h- b$, which is positive, so it makes sense to call the act a crime.
Crime-averse people have $b \leq 0$ and hence never commit crimes. We can think of them as people who don't enjoy any benefits $b$ from committing the act, or who would experience disutility from doing so (like from beating someone up, or from stealing...due to guilt...whatever).
If someone commits a crime in youth, he is instantly detected with probability $p$ and suffers a jail sentence $s \leq 1$. There are no legal errors once someone is detected. The disutility to a person who is punished is $d\cdot s $. After serving his sentence, he works during the rest of his youth, a length $1-s$.
The cost to the state of maintaining $p$ is $c(p)$, where $ c' > 0 $ and $c'' > 0.$
There is also a social cost of jailing a person of $k \cdot s$, where $s$ is the length of the sentence and $ k > 0.$ We require that the government choose a sentence such that $s \leq \overline{s} <1$. This represents the need to limit sentences out of a sense of limited retributive justice (it is unfair to sentence someone to 10 years for shoplifting a radio) or the need for marginal deterrence (if someone is sentenced to 10 years for shoplifting a radio, he will commit the more profitable crime of robbery instead of shoplifting).\footnote{We could have different sentences for each period if we had different $\overline{s}$'s. Maybe we should think about different values of $s$ too. } The state makes three choices about law enforcement: it chooses $p$, $s$, and whether or not to disclose criminal records to employers (that is, to disclose whether a person was caught for a crime).
In the second period, maturity, all people work and nobody commits crimes.
In youth, everyone is equally productive, with output $r_1$ and wage $w_1= r_1>0$. In maturity, crime-prone people are less productive than crime-averse people. The productivity parameter of a crime-prone person is $r_\beta$ and of a crime-averse person is $r_\gamma$, where $ r_1 < r_\beta < r_\gamma .$ The interpretation of the assumption is that people who are disposed to commit crimes have personal characteristics that lead them to benefit from criminal acts (social irresponsibility, low intelligence, poor impulse control, lack of respect for authority, etc.), and that those same characteristics lower productivity in adult jobs that require trust.
A person's actual productivity, however, depends on his job as well as his parameter.
There are two kinds of maturity jobs, good jobs and bad jobs. The productivity of a crime-averse person is $r_\gamma$ in a good job and $r_\gamma- x_\gamma$ in a bad job. The productivity of a crime-prone person is $r_\beta$ in a bad job and $r_\beta- x_\beta$ in a good job. Thus, $ x_\gamma$ and $x_\beta$ are the losses from matching a person to the wrong job. We assume that $r_\gamma> r_\beta$. We make no assumption on whether $r_\gamma- x_\gamma> r_\beta$; it could be that a crime-prone person is better at a bad job than a crime-averse person would be. We assume that engaging in crime has no effect on a person's productivity (in contrast to Rasmusen (1996)).
Denote the equilibrium wage of someone in a good job with a clean criminal record by $\overline{w}_\gamma$ and the wage with a conviction as $\underline{w}_\gamma$. Similarly, the wages for a bad job are $\overline{w}_\beta$ and $\underline{w}_\beta$. Note that these are independent of whether the worker is crime-averse or crime-prone, but if the quality of the worker is known, the crime-averse workers will never take bad jobs and the crime-prone workers will never be offered good jobs. The wage of an employee will equal his expected marginal product due to competition among employers.
What employers know about the productivity of employees will be discussed below. Note, however, that since the productivity of a crime-prone worker is lower in a good job than in a bad job, and only crime-prone workers commit crimes, the equilibrium wage of a worker with a criminal record in a good job will always be lower than his wage in a bad job: $\underline{w}_\gamma = r_\beta- x_\beta <\underline{w}_\beta. $
\noindent
\underline{Basic assumptions: social welfare. } Social welfare is (a) the sum of the benefits people derive from committing crimes and the net production of people on the labor market, less (b) the sum of the disutility suffered by people in jail ($d \cdot s$ for each person in jail $s$ years), the cost to the state of jailing people ($k \cdot s$ for each person in jail $s $ years), the cost $c(p)$ of maintaining $p$ over the two periods, and the harm done by crime ($h$ for each person who commits the harmful act).
Comment: this is an intuitively appealing social welfare measure. And, if we literally added up the utilities of people (and had them pay taxes to finance state activities), we'd see it is the sum of their utilities.
\noindent
\underline{ The Equilibrium under Full Information.}\\
Suppose whether people are crime-averse or crime-prone is public knowledge, but it is still difficult to detect crime. What is the optimal penalty?
In this situation, in the second period, crime-prone people will get bad jobs at a wage of $w=r_\beta$ and crime-averse people will get good jobs at a wage of $w=r_\gamma$.
If $b$ is high enough, a crime-prone person will commit a crime. Denote by $b^*$ the benefit to a crime-prone person just indifferent about committing a crime in period 1, which is determined by his expected loss from crime. Since the second-period wage is unaffected by that decision, the cutoff benefit is determined by the disutility of prison and the lost income. In youth, a person's flow utility for the period from abstaining from crime is, since $w_1 = r_1$,
\begin{equation} \label{e0}
U^1(crime-prone, honest) = r_1,
\end{equation}
and his utility from crime is
\begin{equation} \label{e0}
U^1(crime-prone,criminal) = b+ (1-p)r_1 + p (1-s)r_1 - p\cdot s \cdot d = b+ r_1 -ps (r_1 +d)
\end{equation}
so
\begin{equation} \label{e0}
b^* = p\cdot s (r_1+ d)
\end{equation}
In the second period, a crime-averse person's utility is
\begin{equation} \label{e0}
U^2(crime-averse) = r_\gamma,
\end{equation}
and a crime-prone person's
utility is
\begin{equation} \label{e0}
U^2(crime-prone) = r_\beta.
\end{equation}
Crime-prone people with $b> b^*$ commit crimes; those with $b \leq b^*$ are deterred. Thus, the fraction of crime-prone people deterred in period $1$ is $F(b^*) =b^*/h$ and the fraction of crime-prone people who commit crimes is $1 - F(b^*)= 1 - b^*/h$.
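As a purely illustrative calculation (the numbers are hypothetical and are not calibrated to anything in the paper), take $p = 0.5$, $s = 0.2$, $r_1 = 0.2$, $d = 1$, and $h = 1$. Then
\begin{equation}
b^* = p\cdot s (r_1+ d) = 0.5 \times 0.2 \times 1.2 = 0.12,
\end{equation}
so $12\%$ of crime-prone people are deterred and the remaining $88\%$ commit the crime.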
The cost to victims and the cost to government of detection and punishment is, since there are $\beta$ crime-prone people in total,
\begin{equation} \label{e0}
\begin{array}{lll}
U(victims, govt.) & = &\displaystyle - (1- \frac{b^*}{h})\beta h \\
&& \\
&& \displaystyle - (1- \frac{b^*}{h}) \beta p \cdot s \cdot k - c(p) \\
\end{array}
\end{equation}
The first line of the social welfare equation below consists of the utility of the crime-averse people in both periods plus the utility of the crime-prone people who commit and do not commit crimes in the first period. The second line consists of the utility of the crime-prone people in the second period, the disutility of victims of crime, and the disutility of the government.
\begin{equation} \label{e0}
\begin{array}{lll}
Welfare &= & \displaystyle \gamma r_1 + \gamma r_\gamma + \left(\int_0^{b^*} r_1 f(b)db \right) \beta + \left(\int_{b^*}^h ( b+ r_1 -ps (r_1 +d) ) f(b)db \right) \beta \\
&&\\
&& \displaystyle +\beta r_\beta
- (1- \frac{b^*}{h})\beta h
- (1- \frac{b^*}{h}) \beta p \cdot s \cdot k - c(p) \\
\end{array}
\end{equation}
This simplifies to
\begin{equation} \label{e0}
\begin{array}{lll}
Welfare &= & \displaystyle \gamma r_1 + \gamma r_\gamma +
\displaystyle \frac{b^*}{h} r_1 \beta + \frac{h^2}{2h} \beta - \frac{(b^*)^2}{2h} \beta + \left(1- \frac{b^*}{h}\right) (r_1- p\cdot s (r_1+ d)) \beta \\
&&\\
&&\displaystyle + \beta r_\beta
- \left(1- \frac{b^*}{h}\right) \beta h - \left(1- \frac{b^*}{h}\right) \beta p \cdot s \cdot k - c(p) \\
\end{array}
\end{equation}
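As a check on the simplification, the terms in the first line that multiply $\beta$ (the youth-period utility of crime-prone people) can be collected further. A sketch, using only the uniform density and $b^* = p\cdot s (r_1+ d)$:
\begin{equation}
\begin{array}{lll}
\displaystyle \frac{b^*}{h} r_1 + \frac{h^2-(b^*)^2}{2h} + \left(1- \frac{b^*}{h}\right)(r_1 - b^*) &=& \displaystyle r_1 + \frac{(h-b^*)(h+b^*)}{2h} - \frac{2b^*(h-b^*)}{2h} \\
&&\\
&=& \displaystyle r_1 + \frac{(h-b^*)^2}{2h} \\
\end{array}
\end{equation}
That is, each crime-prone person earns $r_1$ in youth plus the expected net private gain from crime, $\int_{b^*}^h (b-b^*) f(b)db$.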
We can differentiate social welfare to get the optimal $p$, using $ \frac{db^*}{dp } = s( r_1+d) $,
\begin{equation}
\begin{array}{lll}
\frac{d\; welfare }{dp}
& = & \displaystyle \frac{s( r_1+d) r_1\beta}{h}
- \frac{2 b^*s(r_1 + d) \beta}{2h}
- \frac{s( r_1+d) (r_1- p\cdot s (r_1+ d) ) \beta}{h}\\
& & \\
& &\displaystyle - \left(1- \frac{b^*}{h}\right) \cdot s (r_1+ d) \beta +
\frac{ s( r_1+d) \beta h}{h}
+ \frac{ s( r_1+d) \beta psk }{h}
-\left(1- \frac{b^*}{h}\right) \beta \cdot s \cdot k - c'(p) \\
& & \\
& & \\
& = & \displaystyle \frac{s( r_1+d) r_1\beta}{h} - \frac{ ps(r_1 + d)s(r_1 + d) \beta}{ h}
-\frac{s (r_1+d) r_1\beta}{h}
+ \frac{s^2( r_1+d)^2p \beta}{h}\\
& & \\
& &\displaystyle - \left(1- \frac{ps(r_1 + d)}{h}\right) \cdot s (r_1+ d) \beta + s( r_1+d) \beta \\
& & \\
&&\displaystyle + \frac{ s( r_1+d) \beta psk }{h}
-\left(1- \frac{ps(r_1 + d)}{h}\right) \beta \cdot s \cdot k - c'(p) \\
& & \\
& & \\
& = & \displaystyle \beta \Big[
- \left(1- \frac{ps(r_1 + d)}{h}\right) \cdot s (r_1+ d) + s( r_1+d) \\
& & \\
& & \displaystyle + \frac{ s( r_1+d) psk }{h}
-\left(1- \frac{ps(r_1 + d)}{h}\right) \cdot s \cdot k \Big] - c'(p) \\
& & \\
& & \\
& = & \displaystyle \frac{\beta}{h} \Big[ - \left(h- ps(r_1 + d) \right) \cdot s (r_1+ d) + h s( r_1+d) \\
& & \\
& & + s( r_1+d) psk
-\left(h- ps(r_1 + d) \right) \cdot s \cdot k \Big] - c'(p) \\
& & \\
& & \\
& = & \displaystyle \frac{\beta}{h} \Big[ ps^2(r_1 + d)^2
+ s^2( r_1+d) p k
- hsk + ps^2(r_1 + d) k \Big] - c'(p) \\
\end{array}
\end{equation}
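Collecting the common factors in the last line may make the first-order condition easier to read. The following is the same expression rewritten, with no new assumptions beyond $b^* = p\cdot s (r_1+ d)$:
\begin{equation}
\frac{d\; welfare }{dp} = \frac{\beta s}{h} \Big[ ps(r_1 + d)(r_1+d+2k) - hk \Big] - c'(p) = \frac{\beta s}{h} \Big[ b^*(r_1+d+2k) - hk \Big] - c'(p)
\end{equation}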
That ends up rather a mess, but we can do comparative statics on $p^*$ using the implicit function theorem. Let's check the second derivative:
\begin{equation}
\begin{array}{lll}
\frac{d^2\; welfare }{dp^2}
& = & \displaystyle \frac{\beta}{h} \Big[ s^2(r_1 + d)^2
+ s^2( r_1+d) k
+ s^2(r_1 + d) k \Big] - c''(p) \\
\end{array}
\end{equation}
Almost every term is positive, instead of the negative that we want for a concave maximand. So if $c''=0$, the government would choose enough punishment to stomp out crime completely. If $c''>0$ and is big enough, we'd get a smaller $p$ and some crime, and there might even be multiple equilibria. At any maximum, though, we'd have $\frac{d^2\; welfare }{dp^2} <0$ and we could use the implicit function theorem.
Our only other parameters besides $s$ (dealt with later-- it turns out that $s^* = \overline{s}$) are $h, r_1, d, k$.
\begin{equation}
\begin{array}{lll}
\frac{d^2\; welfare }{dp \cdot d h }
& = & \displaystyle - \frac{\beta}{h^2} \Big[ ps^2(r_1 + d)^2
+ s^2( r_1+d) p k
- hsk + ps^2(r_1 + d) k \Big] - \frac{\beta}{h } \Big[ sk \Big] \\
& &\\
& = & \displaystyle - \frac{\beta}{h^2} \Big[ ps^2(r_1 + d)^2
+ s^2( r_1+d) p k
+ ps^2(r_1 + d) k \Big] + \frac{\beta}{h^2} \Big[ hsk \Big] - \frac{\beta}{h^2 } \Big[ hsk \Big] \\
&& \\
& = & \displaystyle - \frac{\beta}{h^2} \Big[ ps^2(r_1 + d)^2
+ s^2( r_1+d) p k
+ ps^2(r_1 + d) k \Big]<0 \\
\end{array}
\end{equation}
So $p$ goes down with $h$, since at a maximum $\frac{d^2\; welfare }{dp^2 } <0$ and
\begin{equation}
\displaystyle \frac{dp }{d h } = -
\frac{\frac{d^2\; welfare }{dp \cdot d h }}{ \frac{d^2\; welfare }{dp^2 } }
\end{equation}
How about $r_1$ and $d$?
\begin{equation}
\begin{array}{lll}
\frac{d^2\; welfare }{dp \cdot d r_1 }
& = &\displaystyle \frac{\beta}{h} \Big[ 2ps^2 (r_1 + d)
+ s^2 p k
+ ps^2 k \Big] >0 \\
\end{array}
\end{equation}
Thus, since $\frac{d^2\; welfare }{dp^2 } <0$ at a maximum, $\frac{dp }{d r_1 }>0$. The parameters $r_1$ and $d$ only enter in sum, so $d$ would have the same comparative statics.
\begin{equation}
\begin{array}{lll}
\frac{d^2\; welfare }{dp \cdot d k }
& = & \displaystyle \frac{\beta}{h} \Big[
s( r_1+d) ps
-\left(h- ps(r_1 + d) \right) \cdot s \Big] \\
&&\\
& = & \displaystyle \frac{\beta s}{h} \Big[
( r_1+d) ps
- h+ ps(r_1 + d) \Big] \\
&&\\
& = & \displaystyle \frac{\beta s}{h} \Big[
b^*
- h + b^* \Big] \\
\end{array}
\end{equation}
If $b^* > h/2$ then $\frac{d^2\; welfare }{dp \cdot d k }>0$ and, at a maximum, $\frac{dp }{d k }>0$. What is happening is that if $b^*$ is big, then few crime-prone people are engaging in crime, and the effect of an increase in $p$ is to reduce the number of prisoners, because increased deterrence more than makes up for the greater probability of detecting a criminal. Thus, a higher per-unit prison cost $k$ makes a high $p$ more desirable. If, on the other hand, $b^* < h/2$, so that most crime-prone people are engaging in crime, the effect of an increase in $p$ is to increase the number of them who go to prison, increasing prison costs. In that case, an increase in $k$ makes detection less attractive.
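The threshold $b^* = h/2$ can be seen directly from the expected prison time served per crime-prone person, which is the probability of committing a crime times $p \cdot s$. A short calculation under the uniform density, using $b^* = p\cdot s (r_1+ d)$:
\begin{equation}
\frac{d}{dp} \left[ \left(1- \frac{b^*}{h}\right) p s \right] = s \left(1- \frac{b^*}{h}\right) - \frac{ps \cdot s(r_1+d)}{h} = s \left(1- \frac{2b^*}{h}\right)
\end{equation}
This is negative exactly when $b^* > h/2$, in which case raising $p$ lowers the total prison cost, which is $\beta k$ times the bracketed quantity.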
\noindent
Proposition: The optimal sentence is $s = \overline{s}$.
\noindent
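Proof: The probability $p$ of detection enters welfare only via $b^*$, $ps$, and $c(p)$. Recall that $b^*= ps(r_1+d)$, so $p$ only enters via $c(p)$ and the product $ps$. Suppose that the optimal level of $ps$ equals $X$. We can maintain $ps=X$ while reducing the cost $c(p)$ by shrinking $p$ and increasing $s$, so long as $s < \overline{s}$. Hence the optimum has $s = \overline{s}$.
With open records, let $\theta$ denote the fraction of crime-prone people who are not convicted. The good-job wage $w^*$ under open records then exceeds the good-job wage under closed records:
\begin{equation} \label{e0}
w^* = \frac{\gamma r_\gamma + \theta \beta (r_{\beta}-x_\beta)}{\gamma + \theta \beta} > w^* (closed \;records, good ) = \gamma r_{\gamma} + \beta (r_{\beta} - x_{\beta} )
\end{equation}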
because, noting that $\gamma = 1-\beta$,
\begin{equation} \label{e0}
\begin{array}{lll}
\displaystyle
(1-\beta) r_\gamma + \theta \beta (r_{\beta}-x_\beta) &>&
( (1-\beta) + \theta \beta ) ((1-\beta) r_{\gamma} + \beta (r_{\beta} - x_{\beta} ))\\
&&\\
r_\gamma - \beta r_\gamma + \theta \beta (r_{\beta}-x_\beta) &>&
r_\gamma + \beta^2 r_\gamma - 2\beta r_\gamma + \theta \beta r_\gamma - \theta \beta^2 r_\gamma \\
&& + \beta (r_{\beta}-x_\beta) - \beta^2 (r_{\beta}-x_\beta) +
\theta \beta^2 (r_{\beta}-x_\beta) \\
&&\\
\theta \beta (r_{\beta}-x_\beta) &>&
\beta \Big[ \beta r_\gamma - r_\gamma+ \theta r_\gamma - \theta \beta r_\gamma \\
&&+ (r_{\beta}-x_\beta) - \beta (r_{\beta}-x_\beta) + \theta \beta (r_{\beta}-x_\beta) \Big] \\
&&\\
0 &>&
- (1-\beta) r_\gamma + \theta r_\gamma -\theta (r_{\beta}-x_\beta) + (1-\beta)(r_{\beta}-x_\beta)
- \theta \beta (r_\gamma - (r_{\beta}-x_\beta)) \\
&& \\
0 &>&
- (1-\beta) (r_\gamma- (r_{\beta}-x_\beta)) + \theta (1-\beta) (r_\gamma - (r_{\beta}-x_\beta) )
\\
\end{array}
\end{equation}
END OF PROOF.
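As a numerical illustration of the inequality, take the values reported in Example 1 below ($\beta = \gamma = .5$, $r_\gamma = 1$, $r_\beta = .6$, $x_\beta = .4$, $h=1$, $p^* \approx .70$, $b^* \approx .12$), and assume that the good-job wage takes the form $w^* = (\gamma r_\gamma + \theta \beta (r_\beta - x_\beta))/(\gamma + \theta \beta)$ with $\theta = 1 - p(1-b^*/h) \approx .384$. Because the inputs are rounded, the figures are approximate:
\begin{equation}
w^* \approx \frac{.5 (1) + (.384)(.5)(.2)}{.5 + (.384)(.5)} = \frac{.5384}{.692} \approx .78 > .5 (1) + .5 (.2) = .6
\end{equation}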
A crime-averse person will have lifetime utility of
\begin{equation} \label{e0}
U(crime-averse) = r_1 + w^*
\end{equation}
A crime-prone person over his lifetime will have, if he abstains from crime,
\begin{equation} \label{e0}
U(crime-prone, abstain) = r_1 + w^*
\end{equation}
If he does commit a crime, his expected payoff is
\begin{equation} \label{e0}
U(crime-prone, commit) = b+ (1-p)r_1 + p (1-s)r_1 - psd + (1-p) w^* + p r_\beta
\end{equation}
Equating, we get
\begin{equation} \label{e0}
\begin{array}{l}
r_1 + w^* = b^*+ (1-p)r_1 + p (1-s)r_1 - psd + (1-p) w^* + p r_\beta \\
\\
r_1 + pw^* - r_1 +pr_1 -pr_1 + psr_1 +psd - pr_\beta = b^*\\
\\
b^* (open\; records)= p(w^* - r_\beta) + ps(r_1 +d) \\
\end{array}
\end{equation}
The full-information and closed-records value of $b^*$ was
\begin{equation} \label{e0}
b^* (closed\; records)= p\cdot s (r_1+ d),
\end{equation}
so if $w^*> r_\beta$ and $p$ and $s$ remain the same, $b^*$ is higher under open records
and there is less crime.
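For a sense of magnitudes, here is a hypothetical illustration that treats $w^*$ as given rather than solving for it. The numbers are assumptions chosen for convenience: $p = .5$, $s = .2$, $r_1 = .2$, $d = 1$, $w^* = .8$, and $r_\beta = .6$.
\begin{equation}
\begin{array}{lll}
b^* (closed\; records) &=& .5 (.2)(1.2) = .12 \\
&&\\
b^* (open\; records) &=& .5(.8 - .6) + .12 = .22 \\
\end{array}
\end{equation}
In this illustration the wage penalty for a conviction adds $p(w^* - r_\beta) = .10$ to the deterrence threshold.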
Let's try to solve for $b^*$. With the uniform density, $\theta = (h -ph + pb^*)/h$, so the expression for $w^*$ we found was
\begin{equation} \label{e0}
w^*= \displaystyle
\frac{ \gamma r_\gamma + \left( \frac{(h -ph + pb^*)\beta}{h} \right) (r_{\beta}-x_\beta) }{\gamma + \left( \frac{(h -ph + pb^*)\beta}{h} \right) }
\end{equation}
So we have
\begin{equation} \label{e0}
\begin{array}{lll}
b^* (open\; records) &= & p\left(\frac{ \gamma r_\gamma + \left( \frac{(h -ph + pb^*)\beta}{h} \right) (r_{\beta}-x_\beta) }{\gamma + \left( \frac{(h -ph + pb^*)\beta}{h} \right) } - r_\beta\right) + ps(r_1 +d) \\
& &\\
b^* \left(\gamma + \frac{(h -ph + pb^*)\beta}{h} \right) &= & p \gamma r_\gamma + p\left( \frac{(h -ph + pb^*)\beta}{h} \right) (r_{\beta}-x_\beta) \\
& & - p r_\beta \left(\gamma + \frac{(h -ph + pb^*)\beta}{h} \right) + ps(r_1 +d) \left(\gamma + \frac{(h -ph + pb^*)\beta}{h} \right) \\
& &\\
b^* h \gamma +b^*(h -ph + pb^*)\beta &= & p h \gamma r_\gamma + p( h -ph + pb^*)\beta (r_{\beta}-x_\beta) - p h r_\beta \gamma \\
& & - p r_\beta (h -ph + pb^*)\beta + ps(r_1 +d) h\gamma + ps(r_1 +d) (h -ph + pb^*)\beta \\
& &\\
b^* h (1-\beta) +b^* h\beta -phb^*\beta + p( b^*)^2\beta &= & p h (1-\beta) r_\gamma + [p h \beta -p^2h \beta + p^2b^* \beta ] (r_{\beta}-x_\beta) \\
&&- p h r_\beta (1-\beta) - p r_\beta h \beta + r_\beta p^2h \beta - r_\beta p^2 b^* \beta\\
& & + ps(r_1 +d) h(1-\beta) + ps(r_1 +d) (h -ph + pb^*)\beta \\
\end{array}
\end{equation}
This is a quadratic equation in $b^*$ that can be solved using Mathematica,
but the solution is too complicated to be useful.
Social welfare is now
\begin{equation} \label{e0}
\begin{array}{lll}
Welfare &= & \displaystyle \gamma r_1 + \gamma w^* + \left(\int_0^{b^*} r_1 f(b)db \right) \beta + \left(\int_0^{b^*} w^* f(b)db \right) \beta \\
& & \displaystyle + \left(\int_{b^*}^h (b+ (1-p)r_1 + p (1-s)r_1 - psd + (1-p) w^* + p r_\beta) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h h f(b)db\right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
\end{array}
\end{equation}
\noindent
Proposition: Social welfare is higher when records are open than when they are closed.
\noindent
Proof. We will show that even if $p$ and $s$ take the values that are optimal under closed records, social welfare is higher with open records. A fortiori it will be if those policy variables are optimized for open records.
First,
rewrite open-records social welfare differently, in terms of our old expression $\theta$ for the percentage of crime-prone people not convicted:
\begin{equation} \label{e0}
\begin{array}{lll}
Welfare &= &\displaystyle \gamma w^* + \left(\int_0^{b^*} f(b)db \right) \beta w^* + \left(\int_{b^*}^h f(b)db \right) \beta ((1-p) w^* + p r_\beta ) \\
& & \displaystyle +\gamma r_1 + \left(\int_0^{b^*} r_1 f(b)db \right) \beta + \left(\int_{b^*}^h (r_1- spr_1 - psd ) f(b)db \right) \beta \\
&& \displaystyle + \left(\int_{b^*}^h (b -h) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
& & \\
&= &
\displaystyle \gamma w^* + \theta \beta w^* + (1-\theta) \beta r_\beta \\
& & \displaystyle +\gamma r_1 + \beta r_1 - \left(\int_{b^*}^h ( spr_1 + psd ) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h (h-b) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
& & \\
&= &
\displaystyle \bigg( \gamma + \theta \beta \bigg) \bigg( \frac{\gamma r_\gamma + \theta \beta (r_\beta-x_\beta)}{
\gamma + \theta \beta} \bigg) + (1-\theta) \beta r_\beta \\
& & \displaystyle +\gamma r_1 + \beta r_1 - \left(\int_{b^*}^h ( spr_1 + psd ) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h (h-b) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
& & \\
&= &
\displaystyle \bigg[ \gamma r_\gamma + \theta \beta r_\beta-\theta \beta x_\beta + \beta r_\beta -\theta \beta r_\beta \bigg] \\
& & \displaystyle +\gamma r_1 + \beta r_1 - \left(\int_{b^*}^h ( spr_1 + psd ) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h (h-b) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
& & \\
&= &
\displaystyle \bigg[ \gamma r_\gamma + \beta (r_\beta- x_\beta) + (1- \theta ) \beta x_\beta \bigg] \\
& & \displaystyle +\gamma r_1 + \beta r_1 - \left(\int_{b^*}^h ( spr_1 + psd ) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h (h-b) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
\end{array}
\end{equation}
The equivalent to the expression in the brackets in the first line for closed-records welfare is $\gamma r_\gamma + \beta (r_\beta- x_\beta)$, so open records adds the positive term $(1-\theta)\beta x_\beta$. The second line would be the same except that we know $b^*$ is higher with open records (there is less crime), so the negative third term of line two is smaller in magnitude under open records. The first two terms of the third line are negative and shrink in magnitude as $b^*$ rises, so they subtract less under open records. The last term, $-c(p)$, is the same for both under our temporary assumption that $p$ is chosen to be the same. Thus, social welfare is higher under open records when $p$ and $s$ are at the level optimal for closed records, and would be even higher if they were chosen at the level optimal for open records.
END OF PROOF
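The gain from sorting can be read off the bracketed terms. A sketch, using $1-\theta = p\int_{b^*}^h f(b)db$, which under the uniform density equals $p(1-b^*/h)$:
\begin{equation}
\Big[ \gamma r_\gamma + \beta (r_\beta- x_\beta) + (1- \theta ) \beta x_\beta \Big] - \Big[ \gamma r_\gamma + \beta (r_\beta- x_\beta) \Big] = p \left(1- \frac{b^*}{h}\right) \beta x_\beta \geq 0
\end{equation}
This is the number of convicted crime-prone people, $p(1-b^*/h)\beta$, times the mismatch loss $x_\beta$ that each avoids by being sorted into a bad job.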
\noindent
Proposition: Welfare can be higher under open records than under full information.
\noindent
Proof. [This proof needs more formalization] Again we will take $p$ to be optimized for the alternative, full information, rather than for open records, and show that open records can have higher welfare anyway.
\begin{equation} \label{e0}
\begin{array}{lll}
Welfare (full \; info ) &= & \displaystyle \gamma (r_1 + r_\gamma) + \left(\int_0^{b^*} r_1 f(b)db \right) \beta + \left(\int_{b^*}^h ( b+ r_1 -ps (r_1 +d) ) f(b)db \right) \beta \\
&&\\
&& \displaystyle +\beta r_\beta
- (1- \frac{b^*}{h})\beta h
- (1- \frac{b^*}{h}) \beta p \cdot s \cdot k - c(p) \\
\end{array}
\end{equation}
Note that
\begin{equation} \label{e0}
\begin{array}{lll}
b^* (full \; info ) &= & \displaystyle ps(r_1 +d) \\
&&\\
b^* (open\; records) &= & \displaystyle p(w^* - r_\beta) + ps(r_1 +d) \\
\end{array}
\end{equation}
There is more crime under full information, because it does not have the disincentive of a lower wage for a convicted crime-prone person than for an unconvicted crime-prone person. This will reduce social welfare via lost wages in youth and via increased cost of crime to victims and increased punishment to the government and the criminals. [show that]. On the other hand, output is lower under open records, because crime-prone people are mismatched into good jobs.
Thus, consider a case where $x_\beta$, the loss from mismatch, is positive but sufficiently small. This will make $w^*$ very close to $\frac{\gamma r_\gamma + \theta \beta r_\beta}{\gamma + \theta \beta}$, and the welfare loss from mismatch will be small. The deterrence effect of conviction will be large, however, when $w^*$ is large, so $b^*$ will remain larger under open records than under full information. Thus, the advantages of the open records case in terms of such things as reduced victimization costs will dominate the disadvantage of mismatch.
END OF PROOF.
Unfortunately, we cannot derive the optimal detection probability $p$ for the open-records case. This is for two reasons. First, social welfare is a function of $w^*$ and $b^*$. Both are functions of $p$ and of each other. Thus, to differentiate with respect to $p$ it is first necessary to solve for
$w^*$ and $b^*$. This can be done under the assumption of uniform $f(b)$, but $b^*$ is the solution to a quadratic equation that has too many terms to be usefully interpreted and $w^*$ has that intricate solution in both its numerator and its denominator. Intuitively, the complexity is because when the detection probability rises, that increases $b^*$ (it reduces crime), but when crime falls, the good-job wage falls too because more crime-prone types are pooled with crime-averse types. In turn, the fall in the good-job wage tends to have the opposite effect on $b^*$, reducing the disincentive to crime because it reduces the difference in wages between a good job and a bad one.
The second reason that we cannot derive the optimal detection probability $p$ is that the sentence, $s$, is no longer necessarily equal to the maximum, $\overline{s}$. This is itself an interesting negative result, so we will set it out as a proposition:
\noindent
Proposition: The optimal $s$ sometimes equals $\overline{s}$, but not necessarily, in the open-records model.
\noindent
Proof: We can use numerical examples to show this. First, here are parameters for which the optimal sentence is $s^*=0$:
\noindent
Example 1: $h = 1, d = 1, r_1 = .2, r_\gamma = 1, r_\beta = .6, k = 20, x_\beta = .4,
\beta = .5, \overline{s} = .2, c(p) = \frac{p^2}{5}.$ \\
$s^*=0, p^* \approx .70, b^* \approx .12, w^* \approx .78.$
Second, here are parameters for which the optimal sentence is $s^*= \overline{s}$:
\noindent
Example 2: $h = 1, d = 7, r_1 = .2, r_\gamma = 1, r_\beta = .6, k = 1, x_\beta = .4,
\beta = .2, \overline{s} = .2, c(p) = \frac{p^2}{5}.$ \\
$s^*=.2, p^* \approx .59, b^* \approx .97, w^* \approx .84.$
The Mathematica logs for these calculations can be found at \url{rasmusen.org/papers/crime1.txt} and \url{rasmusen.org/papers/crime2.txt}. END OF PROOF
Our previous argument from Becker for $s=\overline{s}$ now fails. That argument was that $p$ and $s$ appear in the social welfare function only as the product $ps$ except in the last term $-c(p)$, so that in optimizing $ps$, $p$ should be chosen as small as possible and $s$ chosen large enough to put $ps$ at the optimal level. Now, however, social welfare is
\begin{equation}
\begin{array}{lll}
Welfare (open) &= &
\displaystyle \bigg[ \gamma r_\gamma + \beta (r_\beta- x_\beta) + (1- \theta ) \beta x_\beta \bigg] \\
& & \displaystyle +\gamma r_1 + \beta r_1 - \left(\int_{b^*}^h ( ps r_1 + psd ) f(b)db \right) \beta \\
&& \displaystyle - \left(\int_{b^*}^h (h-b) f(b)db \right) \beta - \left(\int_{b^*}^h p \cdot s \cdot k f(b)db\right) \beta - c(p) \\
\end{array}
\end{equation}
Now $p$ appears in isolation from $s$ in several places. First, $p$ appears in $\theta$ without $s$, since
$
\theta= 1- p\left( \int_{b^*}^{h} f(b)db \right). $ Second, $w^*$ is a function of $\theta$ and thus of $p$, and $b^*$ (which appears in the integrals in the welfare equation), is a function of $w^*$. Thus, it is no longer true that we can optimize $ps$ by choice of either $p$ or $s$ without any concern except that $p$ adds an extra cost.
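Under the uniform density on $[0,h]$, this expression for $\theta$ has a simple closed form, which decomposes into those who abstain and those who offend but are not detected:
\begin{equation}
\theta = 1 - p\left(1- \frac{b^*}{h}\right) = \frac{b^*}{h} + (1-p)\left(1- \frac{b^*}{h}\right) = \frac{h - ph + pb^*}{h}
\end{equation}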
\noindent
{\bf Discussion}. We should talk about the bad side of stigma: that it reduces the return to work of convicts, so their crime rate increases. We can note that a policy superior to closed records is the California city one of paying convicts not to commit crimes, or of subsidizing their wages in bad jobs. We can make a little model of that, separate from our main model because we won't try to optimize $p$ in it.
It is useful to extend the reasoning behind the closed-records arguments to other kinds of transactions than simple sales of labor. By the same logic, if an auto manufacturer's cars are more dangerous than average, that fact should not be revealed by the government. Revealing it would put the manufacturer closer to bankruptcy and hence increase its temptation to cut corners again for short-run gain. If a restaurant is unsanitary, it should be fined, but the public should not be made aware of its cockroaches and unwashed hands, because, again, that would put financial pressure on the business and tempt it to evade taxes and commit fraud.
Do people really think that someone's convictions for drunk driving should be under seal, so he will be able to more easily get a job as a truck driver instead of a store clerk? Do they think an embezzler's convictions should be secret, so he can go back to his old job as an accountant? Do they think a lawyer's thefts from his client should be concealed, lest he return to crime instead of the bar? Going a step further, if a hotel knowingly hires a convict and he harms a guest, should the hotel be immune from suit because it is fulfilling a praiseworthy action? Is there any model in which stigma is inefficient that could not be applied to criminal businesses as well as to criminal individuals?
Indeed, the idea of closed records is intrinsically the idea that some crime should be encouraged: to wit, the crime of employment fraud. When someone falsely claims he does not have a criminal record in order to gain employment, we have all the elements of criminal fraud. Thus, the closed-records argument is that we should divert ex-convicts into employment fraud for fear of alternative crimes they might commit. That does not invalidate the argument, but it calls attention to the fact that the tradeoff is not harmless hiring versus harmful crime: it is the hope that crimes committed against employers will substitute for crimes committed against other people in society.
Introspection is useful too. Would you hire someone to paint your living room who you knew had been released from prison for armed robbery just two years ago? We would not. If other information is available, that will be used too, of course. One of us knows at least two pedophiles he would trust to paint his living room, though he would still not hire them as babysitters. And many economics departments would surely consider hiring the microeconomics theorist Rafael Robb when he is released from prison in 2017, despite his bludgeoning his wife to death. After a decade in prison there would be some doubt as to whether he can still write publishable papers, but he is unlikely to murder colleagues.\footnote{ A dean might balk at hiring him, however. On the 2017 release, see the comment at
\url{http://jimfishertruecrime.blogspot.com/2013/01/rafael-robb-professor-and-wife-killers.html}.}
One must also think of social interactions. If records are sealed, a woman (or her less romantic friends and family) will not be able to find out if her prospective husband has a criminal record that he has been lying about. Is that desirable? If the records are sealed, the ex-con has a better chance of finding someone who will marry him, which is part of reformation and would diminish the probability that he returns to crime.
What people have in mind, we think, is the image of some unfortunate child of a poor neighborhood who has been to prison for some non-violent crime (no specific crime: imagining it is drug dealing or larceny or any other non-generic crime spoils the picture) and wants to go straight yet will be hired by no employer, so he resorts to armed robbery. This image assumes that the imaginer knows better than employers who is worth hiring. Here, the image conflicts with two standard principles of economics: that people actually engaged in a business know it better than the academic economist or the government official, and that people like to make profits by hiring the most cost-effective labor they can.
We would like to replace that image with the image of some unfortunate child of a poor neighborhood who has {\it not} been to prison for some non-violent crime and who wants to go straight yet is hired by no employer, so he resorts to armed robbery. Why would an employer not hire this innocent? The reason is that in a closed-records regime, the employer would not know he was an innocent. Closed records deprive the honest individual of the ability to prove he is honest. He cannot distinguish himself from the people around him. Thus, the employer who wants to hire someone without a criminal record, having to rely on observables, instead requires a college degree or parents from a wealthier neighborhood. A closed-records regime hurts someone with a clean record as much as it helps someone with a criminal record.
\bigskip
\noindent
{\bf Notes from the end of the first draft}
Some guesses/conjectures relating to the solution of $p$ and $s$ in this regime with history known:
Another comment about why, apparently, the Becker argument can't apply. If the cost of jailing people is very high, that is, $k$ is very high, then it's very expensive to obtain deterrence via jail. And we have the option of using suspended sentences and penalizing people by making them get the wage $w(r_{\beta})$ instead of $w(\underline{r})$ for life $L$. This has to be optimal if $k$ gets large enough, which would contradict the optimality of $s = L$ under the Becker argument.
Anyhow, supposing that, in general, the optimal $s$, $s^*$, is less than $L$, the after-release wage penalty becomes relevant. We ought to write down the first-order condition for that and interpret it.
We should also discuss the possibility that $s^* = 0$, the suspended sentence possibility. Perhaps we should prove that if $k$ is high enough, this must be true.
We should then also write down the first order condition for the optimal $p$ and interpret it.
Finally, we need to prove that social welfare is higher under the regime with history disclosed than under the regime without. That might be somewhat tricky (and if so, I think that would be a good thing, since it would mean that the paper's key result is more subtle than meets the eye). Presumably, a natural way to demonstrate the result is to show that we can do better welfare-wise than the solution under the no-history regime. That would be easy, I think, if $s^* < L$ under the no-history regime, for then if we kept $s^*$ and $p^*$ and just allowed history to be known, we'd get more deterrence due to the wage drop during the period $L - s^*$. But we can't use that argument since $s^* = L$ under the no-disclosure regime. Thus, my suspicion is that we'd need to do something like keeping $p^*$ but reducing $s$ slightly below $L$. At the margin, this reduction should increase deterrence due to your wage reduction effect during $(L - s)$, and it also increases efficiency in production (since instead of having a uniform wage, employers get information and more efficiently choose $z$ for those with a record). I am hoping that these two benefits at the margin would allow the proof to go through. We'd probably have to make use of the first-order condition for the optimal $p$ under the no-disclosure regime to prove this. I'm not sure; it's conjecture.
Again, if we have a version of the model that we can solve, with a uniform distribution of $b$, etc., it would be nice to do that for this regime with information and to do comparative statics for it.
One more thing: you commented in your section III, on pp. 539-540, on the notion that the drop in wages suffered by crime-prone guys with records might lead them to commit more crimes (or maybe have other negative social effects). Your main point was, give the crime-prone guys money to offset the negative externality due to the wage drop. Yes, this undoes the deterrent benefit from the wage drop, but it still allows employers to make efficient use of the information from their records. In our model, this can be shown since there's efficiency in the labor market due to the $ry(z) - z$ function. Thus, I propose that we show a simple result. The result would be that, even if there's a negative externality of low wages, disclosing history is still optimal (presuming that subsidizing wages is socially costless).
In sum, via the model, we'd have described the optimal policy under the no disclosure and the disclosure regimes; shown that the disclosure regime is unambiguously superior; observed that a suspended sentence is possibly optimal under the disclosure regime but is useless under the no disclosure regime; and added the corollary about negative externalities from low wages that I just mentioned.
Because this is a complete, coherent, and natural model, I think it might be paid some attention; and because its main point is unmistakable (releasing information about histories is socially good), it might provoke more people to show why we're wrong (more than the somewhat subpar efforts I've seen and that you have as referee).
Of course, there'd be an introduction and conclusion to the article, with some comments and interpretation...
\footnote{Do we want to use the example of city blacks, or, better, city young people in general for a particular city? I think the proportion who commit crimes is over 50\%, so the outlook is bleak for all of them if criminal records are closed--- employers will offer only crime-prone jobs in that city, rationally believing the average youth is a criminal.
We should also consider the special case, theoretically, of where the marginal product of the crime-prone people is negative, $r_\beta=0$. Note that we don't need $\beta$ to necessarily be a big fraction of the population. }
\newpage
\noindent
{\bf References} (these are possibly relevant, copied from Eric's Expressive Law paper; we will cut out a lot of them)
Becker, Gary S. 1968. ``Crime and Punishment: An Economic Approach.'' {\it Journal of Political Economy} 76: 169-217.
Bernstein, Lisa. 1992. ``Opting Out of the Legal System: Extralegal Contractual Relations
in the Diamond Industry.'' {\it Journal of Legal Studies} 21: 115-157.
Bernstein, Lisa. 2001. ``Private Commercial Law in the Cotton Industry: Creating
Cooperation through Rules, Norms, and Institutions.'' {\it Michigan Law Review} 99 (7):
1724-1788.
Funk, Patricia. 2004. ``On the Effective Use of Stigma as a Crime-Deterrent.'' {\it
European Economic Review} 48: 715-728.
Harel, Alon and Alon Klement. 2007.
``The Economics of Stigma: Why More Detection
of Crime May Result in Less Stigmatization.'' {\it
Journal of Legal Studies} 36: 355-377.
McAdams, Richard and Eric Rasmusen. 2007. ``Norms and the Law.'' {\it Handbook of Law and Economics, Vol 2}, eds. A. Mitchell Polinsky and Steven Shavell, chapter 20, pp. 1573-1618. Amsterdam, Elsevier.
Rasmusen, Eric. 1996. ``Stigma and Self-Fulfilling Expectations of Criminality.'' {\it
Journal of Law and Economics}
39 (2): 519-543.
$$
\begin{aligned}
\int_b^\infty x f(x)\,dx &= \int_b^\infty x \lambda e^{-\lambda x}\,dx \\
&= \lambda \bigg[ \frac{-x e^{-\lambda x}}{\lambda}\bigg|_b^\infty + \frac{1}{\lambda} \int_b^\infty e^{-\lambda x}\,dx \bigg] \\
&= \lambda \bigg[ 0 - \frac{-b e^{-\lambda b}}{\lambda} + \frac{1}{\lambda}\, \frac{-e^{-\lambda x}}{\lambda}\bigg|_b^\infty \bigg] \\
&= \lambda \bigg[ - \frac{-b e^{-\lambda b}}{\lambda} + \frac{1}{\lambda} \Big(0 - \frac{-e^{-\lambda b}}{\lambda}\Big) \bigg] \\
&= b e^{-\lambda b} + \frac{e^{-\lambda b}}{\lambda}
\end{aligned}
$$
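\noindent
As a rough check on the last expression, assume (as the integrand suggests, though it is not stated) that $f$ is the exponential density $f(x) = \lambda e^{-\lambda x}$ on $x \geq 0$, with $b \geq 0$. The tail probability is then $\int_b^\infty \lambda e^{-\lambda x}\,dx = e^{-\lambda b}$, and dividing the result by it gives the conditional mean
$$
E[x \mid x \geq b] = \frac{b e^{-\lambda b} + e^{-\lambda b}/\lambda}{e^{-\lambda b}} = b + \frac{1}{\lambda},
$$
which is what the memoryless property of the exponential distribution would lead one to expect. Setting $b = 0$ recovers the unconditional mean $1/\lambda$.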
%\end{raggedright}
\end{document}