The rate of return to the HighScope Perry Preschool Program
James J. Heckman¹, Seong Hyeok Moon², Rodrigo Pinto², Peter A. Savelyev², Adam Yavitz³
Department of Economics, University of Chicago, 1126 East 59th Street, Chicago, Illinois 60637, United States
Article info
Article history:
Received 18 April 2009
Received in revised form 28 October 2009
Accepted 2 November 2009
Available online 18 November 2009
JEL classification:
D62
I22
I28
Keywords:
Rate of return
Cost–benefit analysis
Standard errors
Perry Preschool Program
Compromised randomization
Early childhood intervention programs
Deadweight costs
Abstract
This paper estimates the rate of return to the HighScope Perry Preschool Program, an early intervention
program targeted toward disadvantaged African-American youth. Estimates of the rate of return to the Perry
program are widely cited to support the claim of substantial economic benefits from preschool education
programs. Previous studies of the rate of return to this program ignore the compromises that occurred in the
randomization protocol. They do not report standard errors. The rates of return estimated in this paper
account for these factors. We conduct an extensive analysis of sensitivity to alternative plausible
assumptions. Estimated annual social rates of return generally fall between 7 and 10%, with most estimates
substantially lower than those previously reported in the literature. However, returns are generally
statistically significantly different from zero for both males and females and are above the historical return
on equity. Estimated benefit-to-cost ratios support this conclusion.
© 2009 Published by Elsevier B.V.
1. Introduction
President Barack Obama has actively promoted early childhood
education as a way to foster economic efficiency and reduce inequality.
4
He has also endorsed accountability and transparency in government.
5
In an era of tight budgets and fiscal austerity, it is important to prioritize
expenditure and use funds wisely. As the size of government expands,
there is a renewed demand for cost–benefit analyses to weed out
political pork from economically productive programs.
6
The economic case for expanding preschool education for disadvan-
taged children is largely based on evidence from the HighScope Perry
Preschool Program, an early intervention in the lives of disadvantaged
children in the early 1960s.
7
In that program, children were randomly
assigned to treatment and control group status and have been
systematically followed through age 40. Information on earnings,
employment, education, crime and a variety of other outcomes are
collected at various ages of the study participants. In a highly cited
paper, Rolnick and Grunewald (2003) report a rate of return of 16% to the
Perry program.
8
Belfield et al. (2006) report a 17% rate of return.
Critics of the Perry program point to the small sample size of
the evaluation study (123 treatments and controls), the lack of
a substantial long-term effect of the program on IQ, and the
absence of statistical significance for many estimated treatment
effects.
9
Hanushek and Lindseth (2009) question the strength of the
evidence on the Perry program, claiming that estimates of its impact
are fragile.
The literature does little to assuage these concerns. All of the reported
estimates of rates of return are presented without standard errors,
Journal of Public Economics 94 (2010) 114–128
Corresponding author. Tel.: +1 773 702 0634; fax: +1 773 702 8490.
1
Henry Schultz Distinguished Service Professor of Economics at the University of
Chicago, Professor of Science and Society, University College Dublin, Alfred Cowles
Distinguished Visiting Professor, Cowles Foundation, Yale University and Senior Fellow,
American Bar Foundation.
2
Ph.D. candidate, Department of Economics, University of Chicago.
3
Research Professional at the Economic Research Center, University of Chicago.
4
See Dillon (2008).
5
Weekly address of the President, January 31, 2009, as cited in Bajaj and Labaton (2009).
6
The MacArthur Foundation has recently launched an initiative to promote the
application of cost–benefit analysis in the service of making government effective. See
Fanton (2008).
7
See, e.g., Shonkoff and Phillips (2000) or Karoly et al. (2005). No other early childhood
intervention has a follow-up into adult life as late as the Perry program. For example, the
benefit–cost study of the Abecedarian Program only follows people to age 21, and relies
heavily on extrapolation of future earnings (Barnett and Masse, 2007).
8
The rate of return estimates presented in Rolnick and Grunewald (2003) are based
on cost and benefit estimates reported in Schweinhart et al. (1993).
9
See Herrnstein and Murray (1994, pp. 404–405). Heckman et al. (2009b) show
statistically significant treatment effects for males and females using small sample
permutation tests. They also find close agreement between small sample tests and
large sample tests in the Perry sample.
0047-2727/$ – see front matter © 2009 Published by Elsevier B.V.
doi:10.1016/j.jpubeco.2009.11.001
leaving readers uncertain as to whether the estimates are statistically
significantly different from zero.
The paper by Rolnick and Grunewald (2003) is based on the age-27
data. It does not conduct a sensitivity analysis for the effects of
alternative assumptions, nor does it present a standard error for the
estimated rate of return.
10
The study by Belfield et al. (2006) is
based on the age-40 data we use. It does not report standard errors
for its estimates. It conducts a limited sensitivity analysis.
11
Any computation of the lifetime rate of return to the Perry program
must address four major challenges: (a) the randomization protocol
was compromised; (b) there are no data on participants past age 40
and it is necessary to extrapolate out-of-sample to obtain earnings
profiles past that age to estimate lifetime impacts of the program;
(c) some data are missing for participants prior to age 40; and (d) there
is difficulty in assigning reliable values to non-market outcomes such
as crime. The last point is especially relevant to any analysis of the
Perry program because crime reduction is one of its major benefits.
Unless these challenges are carefully addressed, the true rate of return
remains uncertain as does the economic case for early intervention.
This paper presents rigorous estimates of the rate of return and the
benefit-to-cost ratio for the Perry program. Our analysis improves on
previous studies in seven ways. (1) We account for compromised
randomization in evaluating this program. As noted in Heckman et al.
(2009b), in the Perry study, the randomization actually implemented in
this program is somewhat problematic because of reassignment of
treatment and control status after random assignment. (2) We develop
standard errors for all of our estimates of the rate of return and for the
benefit-to-cost ratios accounting for components of the model where
standard errors can be reliably determined. (3) For the remaining
components of costs and benefits where meaningful standard errors
cannot be determined, we examine the sensitivity of estimates of rates
of return to plausible ranges of assumptions. (4) We present estimates
that adjust for the deadweight costs of taxation. Previous estimates
ignore the costs of raising taxes in financing programs. (5) We use a
much wider variety of methods to impute within-sample missing
earnings than have been used in the previous literature, and examine
the sensitivity of our estimates to the application of alternative
imputation procedures that draw on standard methods in the literature
on panel data.
12
(6) We use state-of-the-art methods to extrapolate
missing future earnings for both treatment and control group
participants. We examine the sensitivity of our estimates to plausible
alternative assumptions about out-of-sample earnings. We also report
estimates to age 40 that do not require extrapolation. (7) We use local
data on costs of education, crime, and welfare participation whenever
possible, instead of following earlier studies in using national data to
estimate these components of the rate of return.
Table 1 summarizes the range of estimates from our preferred
methodology, defended later in this paper. Estimates from a diverse
set of methodologies can be found in the Appendix, Part J. All point in
the same direction. Separate rates of return are reported for benefits
accruing to individuals vs. those that accrue to society at large that
include the impact of the program on crime, participation in welfare,
and the resulting savings in social costs.
This estimate of the overall annual social rate of return to the Perry
program is in the range of 7–10%. For the benefit of non-economist
readers, annual rates of return of this magnitude, if compounded and
reinvested annually over a 65 year life, imply that each dollar invested
at age 4 yields a return of 60–300 dollars by age 65. Stated another
way, the benefit-cost ratio for the Perry program, accounting for deadweight
costs of taxes and assuming a 3% discount rate, ranges from 7
to 12 dollars per person, i.e., each dollar invested returns in present
Table 1
Selected estimates of IRRs (%) and benefit-to-cost ratios.

                          To individual           To society (a),         To society (a),
                                                  high murder cost (b)    low murder cost (b)
                                                  ($4.1M)                 ($13K)
Return                    All (d) Male    Female  All (d) Male    Female  All (d) Male    Female
IRR, by deadweight loss (c)
  0%                      7.6     8.4     7.8     9.9     11.4    17.1    9.0     12.2    9.8
                          (1.8)   (1.7)   (1.1)   (4.1)   (3.4)   (4.9)   (3.5)   (3.1)   (1.8)
  50%                     6.2     6.8     6.8     9.2     10.7    14.9    8.1     11.1    8.1
                          (1.2)   (1.1)   (1.0)   (2.9)   (3.2)   (4.8)   (2.6)   (3.1)   (1.7)
  100%                    5.3     5.9     5.7     8.7     10.2    13.6    7.6     10.4    7.5
                          (1.1)   (1.1)   (0.9)   (2.5)   (3.1)   (4.9)   (2.4)   (2.9)   (1.8)
Benefit–cost ratios, by discount rate
  0%                      –       –       –       31.5    33.7    27.0    19.1    22.8    12.7
                                                  (11.3)  (17.3)  (14.4)  (5.4)   (8.3)   (3.8)
  3%                      –       –       –       12.2    12.1    11.6    7.1     8.6     4.5
                                                  (5.3)   (8.0)   (7.1)   (2.3)   (3.7)   (1.4)
  5%                      –       –       –       6.8     6.2     7.1     3.9     4.7     2.4
                                                  (3.4)   (5.1)   (4.6)   (1.5)   (2.3)   (0.8)
  7%                      –       –       –       3.9     3.2     4.6     2.2     2.7     1.4
                                                  (2.3)   (3.4)   (3.1)   (0.9)   (1.5)   (0.5)

Notes: Kernel matching using NLSY data is used to impute missing values for earnings before age-40, and PSID projection for extrapolation of later earnings. For details of these
procedures, see Section 3. In calculating benefit-to-cost ratios, the deadweight loss of taxation is assumed to be 50%. Nine separate types of crime are used to estimate the social cost
of crime; see the Appendix, Part H for details. Standard errors in parentheses are calculated by Monte Carlo resampling of prediction errors and bootstrapping; see the Appendix, Part
K for details. Lifetime net benefit streams are adjusted for compromised randomization. For details, see Section 4.
(a) The sum of returns to program participants and the general public.
(b) “High” murder cost accounts for the standard statistical value of life, while “Low” does not.
(c) Deadweight cost is dollars of welfare loss per tax dollar.
(d) “All” is computed from an average of the profiles of the pooled sample, and may be lower or higher than the profiles for each gender group.
10
More extensive sensitivity analyses can be found in Schweinhart et al. (1993) for
the age 27 data on which Rolnick and Grunewald (2003) draw their cost and benefit
estimates. See also Barnett (1996).
11
This study builds on the cost–benefit analyses of two previous studies: Barnett
(1985) uses the data up to age 19, and Barnett (1996) uses the data up to age 27.
While neither study reports the rate of return to the Perry program, they show that the
present value of the net benefit of the program is still positive at a very high real
discount rate (11%), which implies that the rate of return is greater than this. They also
explore the consequences of alternative assumptions about costs and benefits. Our
analysis builds on and extends these important studies.
12
See, e.g., MaCurdy (2007) for a survey of these methods.
value terms 7 to 12 dollars back to society.
13
We report a range of
estimates because of uncertainty about some components of benefits
and costs for which standard errors cannot be assigned. These
estimates are above the historical return to equity.
14
However, our
estimates are substantially below the estimates of the rate of return to
the Perry program reported in previous studies. This difference is
driven mainly by our approach to evaluating the social costs of crime.
We present an extensive sensitivity analysis of the consequences of
alternative assumptions about the social cost of crime for the
estimated rate of return. The benefit-to-cost ratios presented in the
bottom of Table 1 support the rate of return analysis. The rest of the
paper justifies the estimates presented in Table 1.
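As an illustrative check of the compounding statement above, the following Python sketch computes the value of one dollar compounded annually at 7% and at 10%. The 61-year horizon (age 4 to age 65) and the function name are assumptions made for this illustration and are not taken from the paper's calculations; the results are of the same order of magnitude as the 60–300 dollar range quoted in the text.

```python
def compound_value(rate: float, years: int, principal: float = 1.0) -> float:
    """Value of `principal` after `years` of annual compounding at `rate`."""
    return principal * (1.0 + rate) ** years


# Assumption for illustration: invested at age 4, valued at age 65.
YEARS = 65 - 4

if __name__ == "__main__":
    for rate in (0.07, 0.10):
        value = compound_value(rate, YEARS)
        print(f"rate {rate:.0%}: one dollar grows to about {value:,.0f} dollars")
```

The exact figures depend on the assumed horizon, so this sketch should be read as a sanity check on magnitudes only.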
This paper proceeds in the following way. Section 2 discusses the
Perry program and how it was evaluated. Section 3 discusses the
sampling plan used to collect the outcomes of the experiment and the
empirical problems it creates, which require imputation and extrapo-
lation to compute the rate of return. Problems of estimating non-market
benefits of the program are also discussed. Section 4 presents our
estimates and their sensitivity to alternative plausible assumptions. We
contrast our approach with the approaches taken by other analysts. In
the final section, we summarize our findings and draw conclusions.
2. Perry: experimental design and background
The HighScope Perry Preschool Program was an early childhood
education program conducted at the Perry Elementary School in
Ypsilanti, Michigan, during the early 1960s. Beginning at age three and
lasting two years, treatment consisted of a 2.5-hour preschool program
on weekdays during the school year, supplemented by weekly home
visits by teachers.
The curriculum was based on supporting children's cognitive
and socio-emotional development through active learning where
both teachers and children had major roles in shaping children's
learning. Children were encouraged to plan, carry out, and reflect
on their own activities through a plan-do-review process. Adults
observed, supported, and extended children's play as appropriate.
They also encouraged children to make choices, problem solve,
and engage in activities. Instead of providing lessons, Perry
emphasized reflective and open-ended questions asked by teachers.
Examples are: "What happened? How did you make that?
Can you show me? Can you help another child?" (Schweinhart
et al., 1993, p. 33).
15
2.1. Eligibility criteria
Five cohorts of preschoolers were enrolled in the program in the
early to the mid-1960s. Drawn from the community served by the
Perry Elementary School, participants were located through a survey
of families associated with that school, as well as through neighbor-
hood group referrals, and door-to-door canvassing. Disadvantaged
children living in adverse circumstances were identified using IQ
scores and a family socioeconomic status (SES) index. Those with IQ
scores outside the range of 70–85 were excluded, as were those with
untreatable mental defects.
2.2. The compromised randomization protocol
A potential problem with the Perry study is that after random
assignment, treatment and controls were reassigned, compromising
the original random assignment and making simple interpretation of
the evidence problematic. In addition, there was some imbalance in
the baseline variables between treatment and control groups. Heck-
man et al. (2009b) discuss the Perry selection and randomization
protocols in detail. They correct for the imbalance in pre-program
variables and the compromise in randomization using matching. We
use their procedures in this analysis.
16
2.3. Evidence on selective participation
Weikart et al. (1978) claim that virtually all eligible families
agreed to participate in the program, implying that there is no issue of
bias arising from selective participation of more motivated families
from the pool of eligible participants.
17
2.4. Study follow-up
Follow-up interviews were conducted when participants were
approximately 15, 19, 27, and 40 years old. Attrition remains low
throughout the study, with over 90% of the original sample participating
in the age-40 interview. At these interviews, participants provided
detailed information about their lifecycle trajectories including schooling,
economic activity, marital life, child rearing, and incarceration. In addition,
Perry researchers collect administrative data in the form of school records,
police and court records, and welfare program participation records.
2.5. The previous literature and its critics
As the oldest and most cited early childhood intervention evaluated
by the method of random assignment, the Perry study serves as a
flagship for policy makers advocating public support for early childhood
programs. Schweinhart et al. (2005) and Heckman et al. (2009b) find
substantial treatment effects. Crime reduction is a major benefit of this
program.
18
The latter study systematically addresses several important
statistical issues that arise in analyzing the Perry data including its small
sample size. The authors show that for the Perry data small sample
permutation inference (based on randomly assigning treatment labels
for treatments and controls) produces the same inference about the null
hypothesis of no treatment effect as is produced from application of test
statistics that are justified only in large samples. Thus, concerns over the
small sample size of the Perry study are unfounded.
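The idea of permutation inference described above can be sketched in a few lines of Python. This is a minimal illustration only: it reshuffles treatment labels freely, whereas Heckman et al. (2009b) account for the compromised randomization protocol, which this sketch does not attempt. The function name, the one-sided test, and the default number of draws are assumptions made for this example.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_draws=10_000, seed=0):
    """One-sided Monte Carlo permutation p-value for mean(treated) > mean(control).

    Treatment labels are reassigned at random, and the share of reshuffled
    samples whose mean difference is at least as large as the observed
    difference is reported.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)  # random reassignment of treatment labels
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Hypothetical outcome values, not Perry data.
    treated = [10, 11, 12, 13, 14, 15]
    control = [1, 2, 3, 4, 5, 6]
    print(permutation_p_value(treated, control))
```

Because the reference distribution is built from the sample itself, the test does not rely on large-sample approximations, which is the property the text appeals to.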
Table 2 presents some descriptive statistics on treatment–control
differences. Additional detail about the program can be found in the
Appendix, Part A for this paper.
For the cost-benefit analysis of this program, the HighScope
Foundation collaborated with outside researchers and produced a
13
A 3% discount rate is consistent with the recommendations of OMB (1992, Web
Appendix C) and GAO (1991).
14
The estimated mean returns are above the post-World War II stock market rate of
return on equity of 5.8% (see DeLong and Magin, 2009).
15
The Appendix, Part A provides further information on the program.
16
The randomization protocol used in the Perry Preschool Program was complex. For
each designated eligible entry cohort, children were assigned to treatment and control
groups in the following way. (1) Participant status of the younger siblings is the same
as that of their older siblings; (2) Those remaining were ranked by their entry IQ score
with odd- and even-ranked subjects assigned to separate groups; (3) Some individuals
initially assigned to one group were swapped between groups to balance gender and
mean SES scores, with Stanford–Binet scores held more or less constant. This
produced an imbalance in family background variables; (4) A coin toss randomly
selected one group as the treatment group and the other as the control group; (5)
Some individuals provisionally assigned to treatment, whose mothers were employed
at the time of the assignment, were swapped with control individuals whose mothers
were not employed. The rationale for this swap was that it was difficult for working
mothers to participate in home visits assigned to the treatment group. For further
discussion of the Perry randomization protocol, see the Appendix, Part L and Heckman
et al. (2009b).
17
Heckman et al. (2009b) discuss the external validity of the Perry study. Using the
NLSY sample of African-Americans who were born in the same years as the Perry
participants, they estimate that 17% of the males and 15% of the females in the NLSY
would be eligible for the Perry program if it were applied nationwide. Perry over-
represents the most disadvantaged segment of the African-American population of
children.
18
Their findings are generally consistent with findings from a recent study of Head
Start (Garces et al., 2002). Those authors find that African-Americans who participated
in Head Start are significantly less likely to have been booked or charged with a crime.
series of studies.
19
These studies report high internal rates of return
(IRR): 16% by Rolnick and Grunewald (2003) and 17% by Belfield et al.
(2006). Our analysis challenges these estimates. Unlike the estimates
reported in previous studies, our estimated rates of return recognize
problems with the data and problems raised by imbalances in pre-
program variables between treatments and controls and by compro-
mised randomization. The previous studies are unable to answer
many important questions: How reliable are the IRR estimates? Can
we conclude that the estimated IRRs are statistically significantly
different from zero? Are all assumptions, accounting rules and
estimation methods employed in previous studies reasonable? How
would different plausible earnings imputation and extrapolation
methods impact estimates of the IRR? If crime costs drive the IRR
results, as previous studies have found, what are the consequences of
estimating these costs under different plausible assumptions?
3. Program costs and benefits
The internal rate of return (IRR) is the annualized rate of return
that equates the present values of costs and benefits between
treatment and control group members. Lifetime benefits and costs
through age 40 are directly measured using follow-up interviews.
Extrapolation can be used to extend these profiles through age 65.
Alternatively, we also compute rates of return through age 40 to
eliminate uncertainty due to extrapolation. The scope of our
evaluation is confined to the costs and benefits of education, earnings,
criminal behavior, tax payments, and reliance on public welfare
programs. There are no reliable data on health outcomes, marital and
parental outcomes, the quality of social life and the like.
20
Hence, our
estimated rate of return likely understates the true rate of return,
although we have no direct evidence on this issue. We present
separate estimates of rates of return for private benefits and more
inclusive social benefits.
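To make the definition concrete, the following Python sketch solves for the rate at which the present value of a net benefit stream equals zero and, for comparison, computes a benefit-to-cost ratio at a fixed discount rate. It is a minimal illustration under stated assumptions: the function names, the bisection solver, the search bracket, and the example cash flows are hypothetical and are not the estimation procedure or data used in this paper.

```python
from typing import Sequence


def present_value(rate: float, flows: Sequence[float]) -> float:
    """Present value of annual flows, with flows[0] occurring at time 0."""
    return sum(flow / (1.0 + rate) ** t for t, flow in enumerate(flows))


def internal_rate_of_return(net_flows: Sequence[float],
                            lo: float = -0.99, hi: float = 1.0,
                            tol: float = 1e-10, max_iter: int = 200) -> float:
    """Rate at which the present value of net flows is zero (bisection).

    Assumes the present value changes sign exactly once on [lo, hi].
    """
    f_lo = present_value(lo, net_flows)
    f_hi = present_value(hi, net_flows)
    if f_lo * f_hi > 0:
        raise ValueError("present value does not change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = present_value(mid, net_flows)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


def benefit_cost_ratio(rate: float, costs: Sequence[float],
                       benefits: Sequence[float]) -> float:
    """Ratio of discounted benefits to discounted costs at a given rate."""
    return present_value(rate, benefits) / present_value(rate, costs)


if __name__ == "__main__":
    # Hypothetical stream: a cost of 100 today, then 12 per year for 20 years.
    net = [-100.0] + [12.0] * 20
    print(internal_rate_of_return(net))
    print(benefit_cost_ratio(0.03, [100.0], [0.0] + [12.0] * 20))
```

The two summaries answer different questions: the IRR requires no choice of discount rate, while the benefit-to-cost ratio must be reported for a stated discount rate, as in Table 1.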
3.1. Initial program cost
We use estimates of initial program costs reported in Barnett
(1996). These include both operating costs (teacher salaries and
administrative costs) and capital costs (classrooms and facilities). This
information is summarized in the Appendix, Part C. In undiscounted
year-2006 dollars, cost of the program per child is $17,759.
3.2. Program benefits: education
Perry promoted educational attainment through two avenues: total
years of education attained and rates of progression to a given level of
education. This pattern is particularly evident for females. Treated
females received less special education, progressed more quickly
through grades, earned higher GPAs, and attained higher levels of
education than their control group counterparts. The statistical
significance of these differences depends on the methodology used,
but all results point in the same direction. For males, however, the
impact of the program on schooling attainment is weak at best.
21
In this section, we report estimates of tuition and other pecuniary
costs paid by individuals to regular K-12 educational institutions,
colleges, and vocational training institutions, and additional social
costs incurred by society to educate them.
22
The amount of edu-
cational expenditure that the general public spends is greater if
persons attain more schooling or if they progress through school less
efficiently. The Appendix, Part D, presents detailed information on
educational attainment and costs in Perry.
3.2.1. K-12 education
To calculate the cost of K-12 education, we assume that all Perry
subjects went to public school at the annual cost per pupil in the state of
Michigan during the period in question, $6645.
23
Treatment group
members spent only slightly more time in the K-12 system, in spite of
the discrepancy between treatment and control group graduation rates.
Among females, control subjects were held back in school more often.
This equalized the social cost of educating them in the K-12 system with
the social cost of the treatment group. Society spent comparable
amounts of resources on individuals during their K-12 education
regardless of their treatment experience, albeit for different reasons.
19
Barnett (1985) uses age-19 data. Barnett (1996) uses age-27 data. Belfield et al.
(2006) use age-40 data.
Table 2
Descriptive statistics.

Outcome                   Age       Female                Male
                                    Control    Treatment  Control    Treatment
Sample size                         26         25         39         33
Mother's age              At birth  25.7       26.7       25.6       26.5
                                    (1.5)      (1.2)      (1.1)      (1.1)
Parent's HS grade-level   3         9.1        9.4        9.6        9.5
                                    (0.4)      (0.5)      (0.3)      (0.4)
Stanford–Binet IQ         3         79.6       80.0       77.8       79.2
                                    (1.3)      (0.9)      (1.1)      (1.2)
HS graduation (%)         27        31%        84%        54%        48%
                                    (9%)       (7%)       (8%)       (9%)
Currently employed (%)    27        55%        80%        56%        60%
                                    (10%)      (8%)       (8%)       (9%)
Yearly earnings (a) ($)   27        10,523     13,530     14,632     17,399
                                    (2068)     (2200)     (2129)     (2155)
Currently employed (%)    40        82%        83%        50%        70%
                                    (8%)       (8%)       (8%)       (8%)
Yearly earnings (a) ($)   40        20,345     24,434     24,730     32,023
                                    (3883)     (4752)     (4495)     (4938)
Ever on welfare (%)       18–27     82%        48%        26%        32%
                                    (8%)       (10%)      (7%)       (8%)
Ever on welfare (%)       26–40     41%        50%        38%        20%
                                    (10%)      (10%)      (8%)       (7%)
Arrests, murder (b)       40        0.04       0.00       0.05       0.03
                                    (0.04)     (–)        (0.04)     (0.03)
Arrests, rape (b)         40        0.00       0.00       0.36       0.12
                                    (–)        (–)        (0.16)     (0.06)
Arrests, robbery (b)      40        0.04       0.00       0.36       0.24
                                    (0.04)     (–)        (0.15)     (0.14)
Arrests, assault (b)      40        0.00       0.04       0.59       0.33
                                    (–)        (0.04)     (0.18)     (0.14)
Arrests, burglary (b)     40        0.04       0.00       0.59       0.42
                                    (0.04)     (–)        (0.19)     (0.16)
Arrests, larceny (b)      40        0.19       0.00       1.03       0.33
                                    (0.10)     (–)        (0.30)     (0.22)
Arrests, MV theft (b)     40        0.00       0.00       0.15       0.03
                                    (–)        (–)        (0.11)     (0.03)
Arrests, all felonies (b) 40        0.42       0.04       3.26       2.12
                                    (0.18)     (0.04)     (0.68)     (0.60)
Arrests, all crimes (b)   40        4.85       2.20       12.41      8.21
                                    (1.27)     (0.53)     (1.95)     (1.78)
Ever arrested (%)         40        65%        56%        95%        82%
                                    (10%)      (10%)      (4%)       (7%)

Notes: Numbers in parentheses are standard errors.
Source: Perry Preschool data. See Schweinhart et al. (2005).
(a) In year-2006 dollars.
(b) Mean occurrence.
20
The Appendix, Part B, summarizes the data sources which we use in this paper.
21
This pattern was noted in Heckman (2005). Heckman et al. (2009b) discuss this
phenomenon in the context of the local labor market in which Perry participants reside.
In the late 1970s, as Perry participants entered the workforce, the local male-friendly
high-wage automotive manufacturing sector was booming. Persons did not need high
school diplomas to get good entry-level jobs in manufacturing (see Goldin and Katz,
2008).
22
All monetary values are in year-2006 dollars unless otherwise specied. Social
costs include the additional funds, beyond tuition paid, required to educate students.
23
See Total expenditure per pupil in public elementary and secondary education for years
1974–1980, as reported by the Digest of Education Statistics (1975–1982, each year, in
year-2006 dollars). We assume that public K-12 education entails no private cost for individuals.
Detailed per-pupil expenditures for Ypsilanti schools are not available for the relevant years.
Most treatment females who stayed longer obtained diplomas, while
most control females who stayed longer repeated grades and many
eventually dropped out of school. For males, educational experiences
were very similar between treatments and controls.
3.2.2. GED and special education
Some male dropouts acquired high school certificates through
GED testing. Our estimates of the private costs of K-12 education
include the cost of getting a GED.
24
Female control subjects received
more special education than treatment subjects. For males, there
was no difference in receipt of special education by treatment status.
Special services require additional spending. To calculate this cost,
we use estimates from Chambers et al. (2004), who provide a
historical trend of the ratio of per-pupil costs for special and regular
education.
25
3.2.3. 2- and 4-year colleges
To calculate the cost of college education, we use each individual's
record of credit hours attempted multiplied by the cost per credit hour
(including both student-paid tuition costs and public institutional
expenditures), taking into account the type of college attended.
26
Male
control subjects attended more college classes than male treatment
subjects, the reverse of the pattern for females. As a result, the social
cost of college education is bigger for the control group among males
while it is bigger for the treatment group among females.
After the age-27 interview, many Perry subjects progressed to higher
education. Without having detailed information about educational
attainment between the age-27 and age-40 interviews, we make some
crude cost estimates. For college education, we assume some college
education to be equivalent to 1-year attendance at a 2-year college. For
2-year or 4-year college degrees, we take the tuition and expenditure
estimates used for college going before age 27.
27
Without detailed
information on whether a subject did or did not get any financial
support, we assume that the private cost for a 2-year master's degree is
the same as that for a 4-year bachelor's degree. Control males and
treatment females pursued higher education more vigorously than did
their same-sex counterparts, although only the treatment effect for
females is statistically significant.
3.2.4. Vocational training
Some subjects attended vocational training programs. Among
males, control group members were more likely to attend vocational
programs, although the treatment effect is not precisely determined.
Among females, the pattern is reversed and the treatment effect is
precisely determined. Thus, the public spent more resources to train
control males and treatment females than their respective counter-
parts.
28
Individual costs are calculated using the number of months
each Perry subject attended a vocational training institute. Table 3
summarizes the components of estimated educational costs. The
other components of costs and benets are discussed later.
3.3. Program benets: employment and earnings
To construct lifetime earnings profiles, we must solve two practical
problems. First, job histories were constructed retrospectively only for
a fixed number of previous job spells.
29
Missing data must be imputed
using econometric techniques. Second, data on the Perry sample ends
at the time of the age-40 interview. In order to generate lifetime
profiles, it is necessary to predict earnings profiles beyond this age or
else to estimate rates of return through age 40. The latter assumption is
conservative in assuming no persistence of treatment effects past age
40. We report both sets of estimates in this paper. The proportion of
non-missing earnings data is only about 70% for ages 19–40. The
Appendix, Part G presents descriptive statistics and the procedures
used to extrapolate earnings when extrapolation is used.
3.3.1. Imputation
To impute missing values for periods prior to the age-40 interview, we
use four different imputation procedures and compare the estimates
based on them. First, we use simple piecewise linear interpolation, based
on weighted averages of the nearest observed data points around a
missing value. This approach is used by Belfield et al. (2006). For truncated
spells,
30
we first impute missing employment status with the mean of the
corresponding gender-treatment data from the available sample at the
relevant time period, and then we interpolate. Second, we impute missing
values using estimated earnings functions fit on a matched 1979 National
Longitudinal Survey of Youth (NLSY79) low-ability
31
African-American
subsample of the same age as Perry subjects. Heckman et al. (2009b) show
that this subsample from the NLSY79 is similar in characteristics and
outcomes to the Perry controls. The NLSY79 longitudinal data are far more
complete than the Perry data. We estimate earnings functions for each
NLSY79 gender–age cross-section using education dummies, work experience
and its square as regressors and then impute from this equation the
missing values for the corresponding Perry gender–age cross-section. For
truncated spells, we assume symmetry around the truncation points.
Third, we use a kernel procedure that matches each Perry subject to
similar observations in the NLSY79 sample to impute missing values in
Perry. Each Perry subject is matched to all observations in the NLSY79
comparison group sample, but with different weights that depend on a
measure of distance in characteristics between Perry experimentals and
comparison group members.
32
This procedure weights more heavily NLSY
sample participants who more closely match Perry subjects. For truncated
spells, we first match the length of spells, and then earnings. Fourth, we
estimate dynamic earnings functions using the method of Hause (1980),
discussed by MaCurdy (2007), for each NLSY79 age–gender group. This
procedure decomposes individual earnings processes into observed
abilities, unobserved time-invariant components and serially correlated
shocks. The procedure uses the estimated parameters of the Hause model
to impute missing values in the Perry earnings data. For truncated spells,
we assume symmetry around truncation points. All four methods are
24
For detailed statistics about the GED, see Heckman et al. (forthcoming).
25
In 1968–69, this ratio was about 1.92; in 1977–78, it was 2.17. Since Perry subjects
attended K-12 education in the interval between these two periods, we set the ratio to
2 and apply it to all K-12 schooling years, which gives an additional $6645 annual per-
pupil cost for special education.
26
Total cost is the sum of private tuition and public expenditure. For student-paid tuition
costs at a 2-year college, we use the 1985 tuition per credit hour for Washtenaw
Community College ($29); for a 4-year college, that of Michigan State University for the
same year ($42). To calculate public institutional expenditure per credit hour, we divide
the national mean of total per-student annual expenditure (National Center for Education
Statistics, 1991, Expenditure per Full-Time-Equivalent Student, Table 298) by 30, a
typical credit-hour requirement for full-time students at U.S. colleges. This calculation
yields $590 per credit hour for 2-year colleges and $1765 for 4-year colleges.
27
For missing information on educational attainment, we use the corresponding
gender-treatment group mean.
28
We assume that all costs are paid by the general public. Estimates by Tsang (1997)
suggest per-trainee costs which are 1.8 times the per-pupil costs of regular high school
education.
29
At each interview, participants were asked to provide information about their
employment history and earnings at each job for several previous jobs. From this
interview design, three problems arise. First, for people with high job mobility, some
past jobs are unreported. Second, for people who were interviewed in the middle of a
job spell, it may not be possible to precisely specify the end point of that job spell;
that is, we have a right-censoring problem for the job spells at the time of interview.
Third, even when the dates for each job spell can be precisely specified, it is not
possible to identify how earnings profiles evolve within each job spell because an
interviewee reports only one earnings value for each job.
30
As noted in the previous footnote, there are job spells in progress at the time of
interview.
31
This low-ability subsample is selected by initial background characteristics that
mimic the eligibility rule actually used in the Perry program. NLSY79 is a nationally-
representative longitudinal survey whose respondents are almost the same age
(birth years 1956–1964) as the Perry sample (birth years 1957–1962). We extract a
comparison group from this data using birth order, socioeconomic status (SES) index,
and AFQT test score. These restrictions are chosen to mimic the program eligibility
criteria of the Perry study. For details, see the Appendix, Part F.
32
We use the Mahalanobis (1936) distance.
118 J.J. Heckman et al. / Journal of Public Economics 94 (2010) 114–128
conservative in that they impose the same earnings structure on the
missing data for treatment and controls. The fourth method preserves
differences in pre-existing patterns of unobservables between treatments
and controls. See the Appendix, Part G, for further discussion.
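The third imputation procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth, the two covariates, and all numbers are assumptions made for the example. Only the use of the Mahalanobis distance and the idea of weighting every comparison observation by its closeness to the Perry subject come from the text.

```python
import numpy as np

def mahalanobis_kernel_impute(x_perry, X_nlsy, y_nlsy, bandwidth=1.0):
    """Impute one missing outcome as a kernel-weighted mean of comparison outcomes.

    x_perry   : (k,) characteristics of one Perry subject
    X_nlsy    : (n, k) characteristics of the comparison sample
    y_nlsy    : (n,) observed earnings in the comparison sample
    bandwidth : hypothetical smoothing parameter (an assumption)
    """
    X_nlsy = np.asarray(X_nlsy, dtype=float)
    y_nlsy = np.asarray(y_nlsy, dtype=float)
    diff = X_nlsy - np.asarray(x_perry, dtype=float)
    # Inverse covariance of the comparison-sample characteristics.
    cov_inv = np.linalg.pinv(np.cov(X_nlsy, rowvar=False))
    # Squared Mahalanobis distance from the Perry subject to each observation.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    # Gaussian kernel (an assumption): closer observations get larger weights.
    w = np.exp(-0.5 * d2 / bandwidth**2)
    w /= w.sum()
    return float(w @ y_nlsy), w

# Made-up comparison sample: columns are years of schooling and experience.
X = np.array([[10.0, 5.0], [12.0, 8.0], [11.0, 6.0], [14.0, 3.0]])
y = np.array([15000.0, 22000.0, 18000.0, 25000.0])
imputed, weights = mahalanobis_kernel_impute([11.0, 6.0], X, y)
```

Every comparison observation receives a positive weight, and the observation that coincides with the Perry subject's characteristics receives the largest one.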
3.3.2. Extrapolation
Given the absence of earnings data after age 40, we employ three
extrapolation schemes to extend sample earnings profiles to later ages.
First, we use March 2002 Current Population Survey (CPS) data to obtain
earnings growth rates up to age 65. Since the CPS does not contain
measures of cognitive ability, it is not possible to extract low-ability
subsamples from the CPS that are comparable to the Perry control group.
We use CPS age-by-age growth rates (rather than levels of earnings) of
three-year moving average of earnings by race, gender, and educational
attainment to extrapolate earnings, thereby avoiding systematic selec-
tion effects in levels.
33
We link the CPS changes to the final Perry earnings.
Second, we use the Panel Study of Income Dynamics (PSID) to
extrapolate earnings profiles past age 40. In the PSID, there is a word-
completion test score from which we can extract a low-ability
subsample in a fashion similar to the way we extract a matched sample
from the NLSY79 using AFQT scores.
34
To extrapolate Perry earnings
profiles, we first estimate a random effects model of earnings using
lagged earnings, education dummies, age dummies and a constant as
regressors. We use the fitted model to extrapolate earnings after age 40.
35
Third, we also use individual parameters from an estimated Hause
(1980) model. For computing rates of return, we obtain complete lifetime
earnings profiles from these procedures and compare the results of using
alternative approaches to extrapolation on estimated rates of return.
36
All
three methods are conservative in that they impose the same earnings
dynamics on treatments and controls. However, in making projections all
three methods account for age-40 individual earnings differences
between treatments and controls.
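The first extrapolation scheme, which links age-by-age growth rates to the final observed Perry earnings, can be sketched as follows. The function name, the growth rates, and the starting earnings are hypothetical; the sketch only shows why common growth rates impose the same dynamics on both groups while preserving the age-40 difference in levels.

```python
def extrapolate_with_growth_rates(last_earnings, growth_rates):
    """Chain age-by-age growth rates onto the last observed earnings value.

    last_earnings : earnings at the final observed age (age 40 in Perry)
    growth_rates  : growth rates for ages 41, 42, ... in the matching
                    race/gender/education cell (hypothetical values below)
    Returns the projected earnings for each subsequent age.
    """
    profile = []
    level = last_earnings
    for g in growth_rates:
        level *= 1.0 + g
        profile.append(level)
    return profile

# Two hypothetical subjects share the same growth rates (same dynamics)
# but start from different age-40 levels, so the proportional gap persists.
rates = [0.02, 0.01, -0.01]
treated = extrapolate_with_growth_rates(30000.0, rates)
control = extrapolate_with_growth_rates(24000.0, rates)
```

Because only growth rates are borrowed from the external sample, the higher average earnings level of that sample never enters the projection.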
The earnings analyzed in Table 3 and the Appendix, Part G (Tables
G.4 and G.5), include all types of fringe benefits listed in Employer
Costs for Employee Compensation (ECEC), a Bureau of Labor Statistics
(BLS) compensation measure. Even though the share of fringe benefits
in total employee compensation varies across industries, due to data
limitations, our calculations assume the share to be constant at its
economy-wide average regardless of industry.
37
3.4. Program benefits: criminal activity
Crime reduction is a major benefit of the Perry program.
38
Valuing
the effect of crime reduction in terms of costs and benefits is not trivial
given the difficulty in assigning reliable monetary values to non-
Table 3
Summary of lifetime costs and benefits (in undiscounted 2006 dollars).
                         Crime     Murder           Male                 Female
                         ratio (a) cost (b)  Treatment    Control  Treatment    Control
Cost of education (c)
  K-12/GED (d)                                 107,575     98,855     98,678     98,349
  College, age ≤ 27 (e)                           6705     19,735     21,816     16,929
  Education, age > 27 (e)                         2409       3396       7770       1021
  Vocational training (f)                         7223     12,202       3120        674
  Lifetime effect (g)                              −10,275               14,409
Cost of crime (h)
  Police/court                                   105.7      152.9       24.7       53.8
  Correctional                                    41.3       67.4        0.0        5.3
  Victimization          Separate  High          370.0      729.7        2.9      320.7
                         Separate  Low           153.3      363.0        2.9       16.1
                         By type   Low           215.0      505.7        2.8       43.3
  Lifetime effect (g)    Separate  High             −433                 −352.2
                         Separate  Low              −283                 −47.6
                         By type   Low              −364                 −74.9
Gross earnings (i)
  Age ≤ 27                                     186,923    185,239    189,633    165,059
  Ages 28–40                                   370,772    287,920    356,159    290,948
  Ages 41–65                                   563,995    503,699    524,181    402,315
  Lifetime effect (g)                              145,461               211,651
Cost of welfare (j)
  Age ≤ 27                                          89        115       7064     13,712
  Ages 28–40                                       831       2701     11,551       5911
  Ages 41–65                                      1533       2647       6528       7363
  Lifetime effect (g)                              −3011                 −1844
(a) A ratio of victimization rate (from the National Crime Victimization Survey) to arrest rate (from the Uniform Crime Report), where "By type" uses common ratios based on a crime being either violent or property and "Separate" does not.
(b) "High" murder cost accounts for value of a statistical life, while "Low" does not.
(c) Source: National Center for Education Statistics (Various) for 1975–1982 (annually).
(d) Based on Michigan per-pupil expenditures (special education costs calculated using National Center for Education Statistics, Various, 1975–1982, annually).
(e) Based on expenditure per full-time-equivalent student (from National Center for Education Statistics, 1991).
(f) Based on regular high school costs and estimates from Tsang (1997).
(g) Treatment minus control.
(h) In thousand dollars.
(i) Gross earnings before taxation, including all fringe benefits. Kernel matching and PSID projection are used for imputation and extrapolation, respectively.
(j) Includes all kinds of cash assistance and in-kind transfers.
33
Belfield et al. (2006) use mean values of CPS earnings in their Perry extrapolations.
In doing so, they neglect the point that Perry subjects belong to the bottom of the
distribution of ability and the further point that CPS subjects are sampled from a more
general population with higher average ability.
34
For information on how we extract this subsample from PSID, see the Appendix,
Part F.
35
By taking residuals from a regression of earnings on a constant, period dummies and
birth year dummies, we can remove fluctuations in earnings due to period-specific and
cohort-specific shocks. See Rodgers et al. (1996) for a description of the procedure we use.
36
For all profiles used here, survival rates by age, gender and education also are
incorporated, which are obtained from National Vital Statistics Reports (2004). Belfield
et al. (2006) do not account for negative correlation between educational attainment
and death rates.
37
The share of fringe benefits has fluctuated over time, with a historical average of
about 30%. Given the limitations of our data, we apply the economy-wide average
share at the corresponding year to each person's earnings assuming all fringe benefits
are tax-free.
38
See, for example, Schweinhart et al. (2005), and Heckman et al. (2009b). Table 2
shows that the effect is mainly due to males. Heckman et al. (2009a) find evidence that
may explain this pattern. Program treatment effects for males mainly operate through
enhancing noncognitive or behavioral skills that are very predictive of criminal
behavior.
market outcomes. In this sub-section, we improve on the previous
studies (for example, Belfield et al., 2006) by exploring the impact on
rates of return and cost–benefit analysis of a variety of assumptions
and accounting rules. For each subject, the Perry data provide a full
record of arrests, convictions, charges and incarcerations for most of
the adolescent and adult years. They are obtained from administrative
data sources.
39
The empirical challenges addressed in this section
are twofold: obtaining a complete lifetime profile of criminal activities
for each person, and assigning values to that criminal activity. The
Appendix, Part H presents a comprehensive analysis of the crime data
which we summarize in this section.
3.4.1. Lifetime crime profiles
Even though the arrest records for Perry participants cover most
of their adolescent and adult lives, information about criminal acti-
vities stops at the time of the age-40 interview. To overcome this
problem, we use national crime statistics published in the Uniform
Crime Report (UCR), which are collected by the Federal Bureau of
Investigation (FBI) from state and local agencies nationwide. The UCR
provides arrest rates by gender, race, and age for each year. We apply
population rates to estimate missing crime.
40
See the discussion in the
Appendix, Part H.
3.4.2. Crime incidence
Estimating the impact of the program on crime requires estimating
the true level of criminal activity at each age and obtaining reasonable
estimates of the social cost of each crime. For a crime of type c at time
t, the total social cost of that crime, $V_t^c$, can be calculated as the product of
the social cost per unit of crime, $C_t^c$, and the incidence, $I_t^c$:
$$V_t^c = C_t^c \times I_t^c.$$
We do not directly observe the true incidence level $I_t^c$. Instead, we
only observe each subject's arrest record at age t for crime c, $A_t^c$.
41
If we know the incidence-to-arrest ratio $I_t^c / A_t^c$ from other data sources,
we can estimate $V_t^c$ by multiplying the three terms in the following
expression:
$$V_t^c = C_t^c \times \frac{I_t^c}{A_t^c} \times A_t^c.$$
To obtain the incidence-to-arrest ratio $I_t^c / A_t^c$ for each crime of type c at
time t, we use two national crime datasets: the Uniform Crime Report
(UCR) and the National Crime Victimization Survey (NCVS).
42
The UCR
provides comprehensive annual arrest data between 1977 and 2004 for
state and local agencies across the U.S. The NCVS is a nationally-
representative household-level data set on criminal victimization which
provides information on levels of unreported crime across the U.S. By
combining these two sources, we can calculate the incidence-to-arrest
ratio for each crime of type c at time t. As noted in the Federal Bureau of
Investigation (2002), however, the crime typologies derived from the
UCR and those of the NCVS are not strictly comparable. To overcome
this problem, we developed a unified categorization of crimes across
the NCVS, UCR, and Perry data sets for felonies (the Appendix, Part H,
Table H.4) and misdemeanors (the Appendix, Part H, Table H.5). The
Appendix, Part H, Table H.7, shows our estimated incidence-to-arrest
ratios for these crimes. To check the sensitivity of our results to the
choice of a particular crime categorization, we use two sets of incidence-
to-arrest ratios and compare results. For the first set, we assume
that each crime type has a different incidence-to-arrest ratio. These
are denoted "Separated" in our tables. For the second set, we use
two broad categories, violent vs. property crime. These are denoted
by "Property vs. violent" in our tables. Further, to account for local
context, we calculate ratios using UCR/NCVS crime levels that are
geographically specific to the Perry program: only crimes committed
or arrests made in Metropolitan Sampling Areas of the
Midwest.
43
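The arithmetic of the social cost calculation, unit cost times incidence-to-arrest ratio times observed arrests, can be sketched as follows. The function names and every number are hypothetical and serve only to show how the three terms combine; they are not estimates from the Appendix.

```python
def incidence_to_arrest_ratio(victimizations, arrests_reported):
    """Ratio of victimizations (NCVS-type count) to arrests (UCR-type count)."""
    if arrests_reported <= 0:
        raise ValueError("arrest count must be positive")
    return victimizations / arrests_reported

def social_cost_of_crime(unit_cost, ratio, arrests):
    """V = C * (I / A) * A for one crime type and one period."""
    return unit_cost * ratio * arrests

# All numbers below are hypothetical, chosen only to show the arithmetic.
ratio = incidence_to_arrest_ratio(victimizations=500_000, arrests_reported=100_000)
cost = social_cost_of_crime(unit_cost=2_000.0, ratio=ratio, arrests=3)
```

In this made-up case each observed arrest stands for five incidents, so three arrests at a unit cost of $2000 imply a social cost of $30,000.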
3.4.3. Unit costs of crime
Using a simplified version of a decomposition developed in
Anderson (1999) and Cohen (2005), we divide crime costs into
victim costs and Criminal Justice System costs, which consist of police,
court, and correctional costs.
3.4.3.1. Victim costs. To obtain total costs from victimization levels,
we use unit costs from Cohen (2005). Different types of crime are
associated with different victimization unit costs. Some crimes are
not associated with any victimization costs. In the Appendix, Part H,
Table H.13, we summarize the unit cost estimates used for different
types of crime.
3.4.3.2. Police and court costs. Police, court, and other administrative
costs are based on Michigan-specific cost estimates per arrest
calculated from the UCR and Expenditure and Employment Data for
the Criminal Justice System (CJEE) micro datasets.
44
Since we only
observe arrests, and do not know whether and to what extent the
courts were involved (for example, whether there was a trial ending
in acquittal), we assume that each arrest incurred an average level of
all possible police and court costs. This unit cost was applied to all
observed arrests (regardless of crime type).
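The per-arrest police and court cost described here (and detailed in the accompanying footnote: a state and local component plus a federal component, computed for benchmark years and interpolated in between) can be sketched as follows. The benchmark years are those named in the footnote; all expenditure and arrest figures are invented, and linear interpolation is an assumption, since the text does not state which interpolation is used.

```python
import numpy as np

def per_arrest_cost(state_local_spending, state_arrests,
                    federal_spending, national_arrests):
    """Per-arrest police/court cost: state-local part plus federal part."""
    return state_local_spending / state_arrests + federal_spending / national_arrests

# Benchmark years named in the text; dollar and arrest figures are invented.
years = np.array([1982, 1987, 1992, 1997, 2002])
costs = np.array([
    per_arrest_cost(1.0e9, 400_000, 5.0e9, 10_000_000),
    per_arrest_cost(1.2e9, 400_000, 6.0e9, 10_000_000),
    per_arrest_cost(1.4e9, 400_000, 7.0e9, 10_000_000),
    per_arrest_cost(1.6e9, 400_000, 8.0e9, 10_000_000),
    per_arrest_cost(1.8e9, 400_000, 9.0e9, 10_000_000),
])

def cost_in_year(year):
    """Linear interpolation between benchmark years (an assumption)."""
    return float(np.interp(year, years, costs))
```

The same unit cost is then applied to every observed arrest in that year, regardless of crime type.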
3.4.3.3. Correctional costs. Estimating correctional costs in Perry is a
more straightforward task, as the data include a full record of
incarceration, parole, and probation for each subject. To estimate the
unit cost of incarceration, we use expenditures on correctional
institutions by state and local governments in Michigan divided by
the total institution population. To estimate the unit costs of parole
and probation, we perform a similar calculation.
45
39
The earliest records cover ages 8–39 and the oldest cover ages 13–44. However,
there are some limitations. At the county (Washtenaw) level, arrests, all convictions,
incarceration, case numbers, and status are reported. At the state (Michigan) level,
arrests are only reported if they lead to convictions. For the 38 Perry subjects spread
across the 19 states other than Michigan at the time of the age-40 interview, only 11
states provided criminal records. No corresponding data are provided for subjects
residing abroad.
40
We use the year-2002 UCR for this extrapolation.
41
Given these data limitations, we do not model each individual's criminal behavior
so that we are not able to fully account for the individual level dynamics of criminality.
We address the heterogeneity of criminal activity across Perry sample members by
including a number of tables and figures in the Appendix, Part H on estimation of the
costs of crime and by conducting sensitivity analyses at the aggregate level by
including or excluding a group of hard-core criminals who repeat crimes.
42
The Federal Bureau of Investigation website provides annual reports based on UCR
(http://www.fbi.gov/ucr/ucr.htm). NCVS data are available at the Department of Justice
website (http://www.ojp.usdoj.gov/bjs/cvict.htm).
43
For this purpose, the Midwest is defined as Ohio, Michigan, Indiana, Illinois,
Wisconsin, Minnesota, Iowa, Missouri, North Dakota, South Dakota, Nebraska, and
Kansas. The City of Ypsilanti, where the Perry program was conducted, belongs to the
Detroit Metropolitan Sampling Area. For a comparison of these ratios between the
local and national levels, see the Appendix, Part H.3.
44
From Bureau of Justice Statistics (2003), we obtain total expenditures on police
and judicial–legal activities by federal, state, and local governments. We divide the
expenditures from Michigan state and local governments by the total arrests in this
area obtained from UCR. To account for federal agencies' involvement, we add another
per-arrest police/court cost, which is calculated by dividing the total expenditure of
the federal government by the total arrests at the national level. This calculation is done for
years 1982, 1987, 1992, 1997, and 2002. For periods between selected years, we use
interpolated values. See the Appendix, Part H.
45
Belfield et al. (2006) compute crime-specific criminal justice system costs. Even
though in principle this approach could be more accurate than ours, we do not adopt it
in this paper because their data source is questionable and we could not find any other
relevant sources. Their unit cost estimates are obtained from a study of the police and
courts of Dade County, Florida which has quite different characteristics from
Washtenaw County, Michigan where the Perry experiment was conducted. We
examine the sensitivity of our estimates to alternate ways to measure costs in Table 5.
See the Appendix, Part H for further discussion.
3.4.4. Estimated social costs of crime
Table 3 summarizes our estimated social costs of crime. Our
approach differs from that used by Belfield et al. (2006) in several
respects. First, in estimating victimization-to-arrest ratios, police and
court costs, and correctional costs, we use local data rather than national
figures. Second, we use two different values of the victim cost of murder:
an estimate of the statistical value of life ($4.1 million) and an estimate
of assault victim cost ($13,000).
46
We report separate rates of return for
each estimate. Only four murders are observed in the Perry arrest
records.
47
If one uses the statistical value of life as the cost of murder to
the victim, a single murder might dominate the calculation of the rate of
return. To avoid this problem, Barnett (1996) and Belfield et al. (2006)
assign murder the same low cost as assault. We adopt this method as
one approach for valuing the social cost of murder, but we also explore
an alternative that includes the statistical value of life in murder
victimization costs. Contrary to intuition, however, assuming a lower
murder cost is not conservative in terms of estimating the rate of
return because the lone treated male murderer committed his crime at a
very early age (21) while the two control male murderers committed
their crimes in their late 30s. As a result, assigning a high victimization
cost to murder decreases the rate of return for males. Given the temporal
pattern of murder, we present rate of return estimates using both high
and low victim costs for murder (the former includes the statistical
value of life, and the latter does not) and compare the results. Third, we
assume that there are no victim costs associated with driving
misdemeanors and drug-related crimes. Whereas Belfield et al.
(2006) assign substantial victim costs to these types of crimes, we
consider them to be victimless. Although such crimes could be the
proximal cause of victimizations, such victimizations would be directly
associated with other crimes for which we already account.
48
This
approach results in a substantial decrease of crime cost compared to the
cost of crime used in previous studies because these specific crimes
account for more than 30% of all crime reported in the Perry study.
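The timing argument about murder costs can be made concrete with a small discounting sketch. The $4.1 million cost, the treated murder at age 21, and the two control murders come from the text; the control age of 38 (the text says only "late 30s") and discounting back to age 3 are assumptions made for illustration.

```python
def present_value(amount, age, rate, base_age=3):
    """Discount an amount incurred at `age` back to `base_age`."""
    return amount / (1.0 + rate) ** (age - base_age)

def net_murder_benefit(rate, cost=4.1e6, treated_age=21, control_age=38):
    """Control-minus-treatment murder victim costs for males, discounted.

    One treated murder at age 21 and two control murders (from the text);
    control_age=38 and base_age=3 are assumptions for illustration.
    """
    control = 2 * present_value(cost, control_age, rate)
    treated = present_value(cost, treated_age, rate)
    return control - treated

undiscounted = net_murder_benefit(0.0)   # two murders vs. one: positive
at_seven_pct = net_murder_benefit(0.07)  # the early treated murder dominates
```

Under these assumed ages, the net term changes sign at a discount rate of about 4.2% (where $(1+r)^{17} = 2$), so at rates in the range of the estimated returns a high murder cost counts against the program for males.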
3.5. Tax payments
Taxes are transfers from the taxpayer to the rest of society, and
represent benefits to recipients that reduce the welfare of the taxed
unless services are received in return. Our analysis considers benefits
to recipients, benefits to the public, and total social benefits (or costs).
The latter category nets out transfers but counts costs of collecting and
avoiding taxes.
Higher earnings translate into higher absolute amounts of income tax
payments (and consumption tax payments) that are beneficial to the
general public excluding program participants. Since U.S. individual
income tax rates and the corresponding brackets have changed over
time, in principle we should apply relevant tax rates according to period,
income bracket, and filing status. In addition, most wage earners must
pay the employee's share of the Federal Insurance Contribution Act
(FICA) tax, such as the Social Security tax and the Medicare tax. In 1978,
the employee's marginal and average FICA tax rate for a four-person
family at half of U.S. median income was 6.05% of taxable earnings. It
gradually increased over time, reaching 7.65% in 1990, and has remained
at that level ever since.
49
Here, we simplify the calculation by applying a
15% individual tax rate and 7.5% FICA tax rate to each subject's taxable
earnings in each year.
50
Belfield et al. (2006) use the employer's share of
FICA tax in addition to these two components in computing the benefit
to the general public, but we do not. A recent consensus among
economists is that the employer's share of payroll taxes is passed on to
employees in the form of lower wages than would otherwise be paid.
51
Since this tax burden is already incorporated in realized earnings,
we do not count it in computing the benefit accrued to the
general public, even though employers, who are also among the general
public, pay some money to the government. The Appendix, Part J,
Tables J.1–J.3, show how individual gross earnings are decomposed
into net earnings and tax payments under this assumption.
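The simplified tax rule can be written out directly. The two rates are the ones stated in the text; the function name and the earnings figure are hypothetical, and the sketch treats its input as taxable earnings only, so fringe benefits (assumed tax-free in the paper) would have to be removed beforehand.

```python
INCOME_TAX_RATE = 0.15  # flat individual rate stated in the text
FICA_RATE = 0.075       # employee share stated in the text

def decompose_earnings(taxable_earnings):
    """Split one year's taxable earnings into tax payments and net earnings."""
    income_tax = INCOME_TAX_RATE * taxable_earnings
    fica = FICA_RATE * taxable_earnings
    return {
        "income_tax": income_tax,
        "fica": fica,
        "net_earnings": taxable_earnings - income_tax - fica,
    }

# Hypothetical taxable earnings of $20,000 in one year.
parts = decompose_earnings(20_000.0)
```

The employer's share of FICA is deliberately absent, in line with the argument that it is already reflected in realized earnings.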
3.6. Use of the welfare system
Most Perry subjects were significantly disadvantaged and received
considerable amounts of financial and non-financial assistance from
various welfare programs. Differentials in the use of welfare are another
important source of benefit from the Perry program. We distinguish
transfers, which may benefit one group in the society at the expense of
another, from the costs associated with making such transfers. Only the
latter should be counted in computing gains to society as a whole.
We have two types of information on the use of the welfare
system: incidence of welfare dependence, and actual welfare
payments. The Appendix, Part I, Table I.1, presents descriptive
statistics comparing welfare incidence, the length of welfare spells,
and the welfare benefits that are actually received by treatments and
controls. One finding is that control females depend on welfare
programs more heavily than treatment females before age 27. That
pattern is reversed at later ages.
52
For males, the scale of welfare usage
is lower, with controls more likely to use welfare at all ages.
Two types of data limitations affect our calculation. One is that we
do not have enough information about receipt of various in-kind
transfer programs, such as medical, housing, education, and energy
assistance, which represent a large portion of total U.S. welfare
expenditures. The other is that even for cash assistance programs such
as General Assistance (GA), AFDC/TANF, and Unemployment Insurance
(UI), we do not have complete lifetime profiles of cash transfers
for each individual. Given these limitations, we adopt the following
method to estimate full lifetime profiles of welfare receipt.
First, we use the NLSY79 and PSID comparison samples to impute the
amount received from various cash assistance and food stamp programs.
Prior to age 27, we employ the NLSY79 black low-ability subsample.
Since only the total number of months on welfare programs is known for
the Perry sample during this age range, such imputations are unavoidable.
We impute individual monthly welfare receipt for each year using
coefficients from NLSY79 individual welfare payments for the
corresponding year regressed on gender and education indicators, a
46
See Cohen (2005) and, for a literature review, see Viscusi and Aldy (2003), who
provide a range of $2–9 million for the value of a statistical life.
47
One is committed by a control female, two by control males, and one by a treated male.
48
Driving misdemeanors include driving without a license; suspended license;
driving under the influence of alcohol or drugs; other driving misdemeanors; failure to
stop at an accident; improper license plate. Drug-related crimes include drug abuse,
sale, possession, or trafficking. Belfield et al. (2006) use $3538 to evaluate the cost of
driving misdemeanors and $2620 for drug-related crimes (in year-2006 dollars).
Belfield et al. (2006) compute expected victim costs for these cases, which include, for
example, probable risk of death. This practice leads to double counting, and thus to
overstating savings in victimization costs due to the Perry program.
49
See Tax Policy Center (2007).
50
The effective tax rate for the working poor is much higher than this because
people lose eligibility for various welfare programs or have benefits withdrawn as income
increases. See Moffitt (2003). Because, in computing the rate of return, we account for
the effects of Perry on all kinds of welfare benefits, including in-kind transfers, we
apply the baseline tax rates to earnings data alone to avoid double counting.
51
Congressional Budget Office (2007). Anderson and Meyer (2000) present
empirical evidence supporting this view.
52
Belfield et al. (2006) suggest that delayed child rearing and higher educational
attainment among treatment females can explain this phenomenon. However, this
pattern is at odds with evidence from the NLSY79 in which greater use of welfare is
associated with lower educational attainment. Bertrand et al. (2000) show that a
person's welfare participation can be affected by behaviors of others in a network.
Since the Perry program was conducted in a small town (Ypsilanti, Michigan) and the
treated females have known each other from their childhood, they could presumably
share and exchange information about welfare programs. This may have made it easier
for them to apply and receive benefits. In the NLSY79, which samples randomly from
many communities, this effect is unlikely to be at work. If the network effect
dominated, the observed contradictory pattern should be interpreted as the composite
of treatment effect and network effect. While having a better social network also could
be a treatment effect, this distinction would be useful for investigating the external
validity of this program.
dummy variable for teenage pregnancy, number of months in wedlock,
employment status, earnings, and the number of biological children.
53
In
this regression, welfare payments include food stamps and all kinds of
cash assistance available in the NLSY79 dataset, such as Unemployment
Insurance (UI), AFDC/TANF, Social Security, Supplemental Security
Income (SSI), and any other cash assistance. For ages 28–40, the Perry
records provide both the total number of months on welfare and the
cumulative amount of receipts through UI, AFDC, and food stamps. We use
the observed amounts for these programs. For other welfare programs, we
use a regression-based imputation scheme similar to that used to analyze
the data prior to age 27. The total amount is computed as a sum of these
two components. To extrapolate this profile past age 40, we use the PSID
dataset, which contains profiles over longer stretches of the life cycle than
does the NLSY79. As with the earnings extrapolation, we target the low-
ability subsample of the PSID dataset. We first estimate a random effects
model of welfare receipt using a lagged dependent variable, education
dummies, age dummies, and a constant as regressors. We use the fitted
model to extrapolate. As with the NLSY79 imputation, the dependent
variable in this model includes all cash assistance and food stamps.
54
Second, to account for in-kind transfers, we employ the Survey of
Income and Program Participation (SIPP) data. In SIPP, we calculate the
probability of being in specific in-kind transfer programs for a less-
educated black population born between the years 1956 and 1965,
using the year-1984, -1996, and -2004 micro datasets.
55
We estimate
linear probability models for participation in each variety of programs
using gender and educational attainment variables as predictors.
56
This
calculation is done separately for Medicaid, Medicare, housing assis-
tance, education assistance, energy assistance, public training programs,
and other public service programs. We interpolate values for missing
data in periods between the years of available data for each SIPP series.
Past 2004, we use year-2004 estimates assuming that the current
welfare system continues. To convert this probability to monetary
values, we use estimates of Moffitt (2003) for real expenditures on the
combined federal, state, and local spending for the largest 84 means-
tested transfer programs. We adjust upward cash assistance amounts by
a product of the probability of participation in each program and the
ratio of real expenditures of the in-kind program to that of cash
assistance, so that the resulting amount becomes the expected cash
value of in-kind receipt. We aggregate across programs to obtain overall
totals. Table 3 summarizes our estimated proles of welfare use.
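The in-kind adjustment described above, scaling cash assistance by the participation probability and by the ratio of in-kind to cash expenditures and then summing across programs, can be sketched as follows. The program list, probabilities, and expenditure ratios are hypothetical placeholders, not values estimated from SIPP or taken from Moffitt (2003).

```python
def expected_in_kind_value(cash_assistance, programs):
    """Expected cash value of in-kind receipt, summed across programs.

    programs maps a program name to a pair
    (participation probability, in-kind-to-cash expenditure ratio).
    """
    return sum(cash_assistance * p * ratio for p, ratio in programs.values())

# Hypothetical probabilities and expenditure ratios for two programs.
programs = {
    "Medicaid": (0.30, 2.0),
    "housing assistance": (0.10, 0.5),
}
cash = 1_000.0
in_kind = expected_in_kind_value(cash, programs)
total_welfare = cash + in_kind
```

With these made-up inputs, $1000 of cash assistance is scaled up by $650 of expected in-kind value.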
For society, each dollar of welfare involves administrative costs.
Based on Michigan state data, Belfield et al. (2006) estimate a cost to
society of 38 cents for every dollar of welfare disbursed. We use this
estimate to calculate the cost of welfare programs to society.
57
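The in-kind valuation and the administrative-cost calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and every numeric input other than the 38-cent overhead are hypothetical, and applying the overhead to the combined cash and in-kind total is our assumption.

```python
def expected_in_kind_value(cash_amount, programs):
    """Expected cash value of in-kind receipt implied by a cash amount.

    programs: iterable of (participation_probability, expenditure_ratio),
    where expenditure_ratio is real spending on the in-kind program
    divided by real spending on cash assistance.
    """
    return sum(cash_amount * p * ratio for p, ratio in programs)


def welfare_cost_to_society(disbursed, overhead_per_dollar=0.38):
    """Administrative cost to society per dollar of welfare disbursed
    (38 cents in the Belfield et al. (2006) estimate quoted in the text)."""
    return disbursed * overhead_per_dollar


# Hypothetical inputs for illustration only (not values from the paper).
cash = 1000.0
programs = [(0.30, 1.5), (0.10, 0.4)]  # (probability, expenditure ratio)

in_kind = expected_in_kind_value(cash, programs)
total_disbursed = cash + in_kind  # assumption: overhead applies to both parts
print(in_kind, welfare_cost_to_society(total_disbursed))
```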
3.7. Other program benefits
Other possibly beneficial effects of the Perry program that are not
easily quantified include the psychic cost of education, the utility gain
from committing crime, the value of leisure, the value of marital and
parental outcomes, the contribution of the program to child care, the value
of wealth accumulation, the value of social life, the value of improved
health and longevity, and any intergenerational effects of the program.
58
These benefits are not included in our analysis due to data limitations.
4. Internal rates of return and benefit-to-cost ratios
In this section, we calculate internal rates of return and
benefit-to-cost ratios for the Perry program under various assumptions and
estimation methods. The internal rate of return (IRR) compares
alternative investment projects in a common metric. For each gender
and treatment group, we construct average life cycle benefit and cost
profiles and then compute IRRs. We also compute standard errors for all
of the estimated IRRs and benefit-to-cost ratios. The computation of
standard errors proceeds in three steps. In the first step, we use the
bootstrap to simultaneously draw samples from Perry, the NLSY79, and
the PSID.
59
For each replication, we re-estimate all parameters that are
used to impute missing values, and re-compute all components used in
the construction of lifetime profiles. Notice that in this process, all
components whose computations do not depend on the comparison
group data also are re-computed (e.g., social cost of crime, educational
expenditure, etc.) because the replicated sample consists of randomly
drawn Perry participants. In the second step, we adjust all imputed
values for prediction errors on the bootstrapped sample by plugging in
an error term which is randomly drawn from comparison group data by
a Monte Carlo resampling procedure. Combining these two steps allows
us to account for both estimation errors and prediction errors. Finally, we
compute point estimates of IRRs for each replication to obtain bootstrap
standard errors. The Appendix, Part K, describes in greater detail the
procedure used to compute our standard errors.
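The three steps can be sketched in stylized form as follows. This is a simplified illustration under strong assumptions, not the procedure of the Appendix, Part K: it resamples a single vector of subject-level values, does not re-estimate any imputation model, and draws prediction errors from a hypothetical residual pool standing in for the comparison group data. All names are ours.

```python
import numpy as np


def bootstrap_se(values, imputed, residual_pool, statistic, n_reps=1000, seed=0):
    """Bootstrap standard error combining estimation and prediction error.

    values        : subject-level outcomes (observed or imputed)
    imputed       : boolean flags, True where the value was imputed
    residual_pool : prediction errors to draw from (hypothetical stand-in
                    for residuals estimated on comparison-group data)
    statistic     : function mapping a 1-D array to a scalar (e.g. an IRR)
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    imputed = np.asarray(imputed, dtype=bool)
    residual_pool = np.asarray(residual_pool, dtype=float)
    n = len(values)
    estimates = np.empty(n_reps)
    for b in range(n_reps):
        # Step 1: draw subjects with replacement (estimation error).
        idx = rng.integers(0, n, size=n)
        draw = values[idx].copy()
        mask = imputed[idx]
        # Step 2: perturb imputed values with randomly drawn prediction errors.
        draw[mask] += rng.choice(residual_pool, size=int(mask.sum()), replace=True)
        # Step 3: recompute the point estimate on this replicate.
        estimates[b] = statistic(draw)
    return estimates.std(ddof=1)


# Hypothetical data, for illustration only.
se = bootstrap_se(
    values=[10.0, 12.0, 9.0, 15.0, 11.0, 13.0],
    imputed=[False, True, False, True, False, False],
    residual_pool=[-1.0, -0.5, 0.5, 1.0],
    statistic=np.mean,
    n_reps=200,
)
print(se)
```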
Table 4 and the Appendix, Part J, show estimated IRRs computed
using various methods for estimating earnings profiles and crime costs
and under various assumptions about the deadweight cost of taxation.
Numbers in parentheses below each estimate are the associated
standard errors. We first use the victim cost associated with a murder
as $4.1 million, which includes the statistical value of life (column
labeled "High"). We also provide calculations setting the victim cost of
murder to that of assault, which is about $13,000, to avoid the problem
that a single murder might dominate the evaluation (final two sets of
columns). To gauge the sensitivity of estimated returns to the way
crimes are categorized, we first assume that crimes are separated and
compute victimization ratios separately for each crime (columns labeled
"Separated"). We then aggregate crimes into two categories, property
and violent crimes, and compute victimization ratios within each
category. Average costs of crime are computed for each category. The
estimates reported in these tables account for the deadweight costs of
taxation: dollars of welfare loss per tax dollar. Since different studies
suggest various estimates of the size of the deadweight cost of taxation,
in this paper we show results under various assumptions for the size of
deadweight loss associated with $1 of taxation: 0, 50, and 100%.
60
There
are many components of the calculation that are affected by this
consideration: initial program cost, school expenditure paid by the
general public, welfare receipt and the overhead cost of welfare paid by
the general public, and all kinds of criminal justice system costs such as
police, court, and correctional costs.
61,62
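As a rough illustration of how the deadweight adjustment enters, the sketch below scales a publicly funded outlay by one plus the assumed deadweight rate. The function name is ours and the formula is a stylized reading of "dollars of welfare loss per tax dollar". Applied to the initial program cost of 17,759 (year-2006 dollars) reported in Table 8, a 50% rate gives roughly the 26,639 shown there.

```python
def with_deadweight(public_outlay, deadweight_rate):
    """Social cost of a tax-financed outlay when each tax dollar carries
    `deadweight_rate` dollars of welfare loss (stylized assumption)."""
    return public_outlay * (1.0 + deadweight_rate)


initial_program_cost = 17_759  # Table 8, no deadweight loss

# The three cases considered in the text: 0, 50, and 100%.
for rate in (0.0, 0.5, 1.0):
    print(f"{rate:.0%}: {with_deadweight(initial_program_cost, rate):,.1f}")
```

The same scaling would apply to the other publicly funded components listed above (school expenditure, welfare receipt and its overhead, and criminal justice system costs), but not to income taxes paid by Perry subjects, which the text excludes to avoid double counting.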
53
The exact imputation equation is given in the Appendix, Part I.
54
We remove cohort and year effects. See the Appendix, Parts G and I.
55
Since SIPP does not contain any kind of ability measure, we use a subsample whose
educational attainment is less than or equal to some college credits without diploma.
56
We t the same equation to both treatment and control group members.
57
This cost consists of two components: the cost of administering welfare disbursement
and the cost accrued due to overpayments and payments to ineligible families.
58
In this study, we do not include the child care cost that parents of program subjects
would have paid without this program. Thus, we likely underestimate the true benefits.
Belfield et al. (2006) count child care for participants as a benefit, even for women who do
not work, and hence they inflate the benefits of the Perry program. The effect of the
program on longevity is accounted for to some degree because we adjust all profiles used
in this study for survival rates by age, gender and educational attainments.
59
For procedures that use different nonexperimental samples, we conduct the same
analysis.
60
See Browning (1987), Heckman and Smith (1998), Heckman et al. (1999), and
Feldstein (1999) for discussion of estimates of deadweight costs.
61
We do not apply this adjustment to income tax paid by Perry subjects to avoid
double counting. Observed earnings are already adjusted for taxes. See the discussion
in the Appendix, Part J.
62
Previous studies do not adjust reported estimates for the deadweight cost of taxation.
On this point, Barnett (1985) writes that "because the estimated net effect of preschool is to
decrease public expenditure, ignoring these costs tends to underestimate the social benefits
from preschool. Therefore it has a conservative influence on the results" (p. 84, footnote
number 2). However, the Perry program increased public expenditure on education. The
timing of costs and benefits matters in computing rates of return and benefit-cost ratios.
Thus, the impact of ignoring deadweight losses is far from clear. We specifically incorporate
deadweight costs into our estimation and conduct sensitivity analyses to address this issue
and clarify its impact. The Appendix, Part J, presents the estimated IRRs and benefit-to-cost
ratios under different assumptions about the deadweight costs of taxation.
122 J.J. Heckman et al. / Journal of Public Economics 94 (2010) 114–128
The Perry program incurs the deadweight costs associated with
initial funding but saves the deadweight costs associated with
taxes used to fund transfer recipients. Table 1 provides a comparison
of results across different assumptions of the deadweight costs
and the real discount rate for the selected earnings imputation and
extrapolation method. Table 4 presents results across different
earnings imputation and extrapolation methods using our
preferred estimate of 50% for the deadweight cost.
63
Estimates based
on kernel matching imputation and PSID projection of missing
earnings are reported in Table 1. Table 4 shows that our estimates are
robust to the choice of alternative extrapolation/interpolation
procedures.
64
A complete set of our results can be found in the Appendix,
Part G.
Heckman et al. (2009b) document that the randomization protocol
implemented in the Perry program is somewhat problematic.
65
Post-randomization reassignments were made to promote compliance with
the program. This and other modifications to simple random
assignment created an imbalance between treatment and control groups in
family background such as father's presence at home, an index of
socioeconomic status (SES), and mother's employment status at
program entry.
66
This imbalance could induce a spurious relationship
between outcomes and treatment assignments, which violates the
assumption of independence that is the core concept of a randomized
experiment. To produce valid IRRs and standard errors, analysts of the
Perry data should take into account the corruption of the randomization
protocol used to evaluate the Perry program and examine its effects on
estimated rates of return. One way to control for potential biases is to
condition all lifetime cost and benefit streams on the variables that
determine reassignment. By adjusting cost and benefit streams for these
variables, corruption-adjusted IRRs and the associated standard errors
can be computed.
67
All results presented in Table 4 are adjusted for
compromised randomization. The Appendix, Part L compares estimates
with and without this adjustment. The adjustment does not greatly
change the estimated IRR although it affects the strength of individual
level treatment effects (Heckman et al., 2009b).
68
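The idea of conditioning on the variables that determine reassignment can be illustrated with a simple regression adjustment. This sketch is not the Freedman–Lane procedure used in the paper; it is an ordinary least squares stand-in that we substitute for illustration, with hypothetical data, to show how an imbalanced pre-program covariate can distort a raw treatment-control comparison.

```python
import numpy as np


def adjusted_treatment_effect(outcome, treated, covariates):
    """OLS coefficient on the treatment indicator, conditioning on
    pre-program covariates (illustrative stand-in, not Freedman-Lane)."""
    y = np.asarray(outcome, dtype=float)
    d = np.asarray(treated, dtype=float)
    z = np.asarray(covariates, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), d, z])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


# Hypothetical example: the covariate (e.g. an indicator for a working
# mother) is more common among controls, mimicking the imbalance above.
d = np.array([0, 0, 0, 1, 1, 1])
z = np.array([1, 1, 0, 0, 0, 1])
y = 2.0 + 3.0 * d + 5.0 * z  # true treatment effect is 3 by construction

raw = y[d == 1].mean() - y[d == 0].mean()
adjusted = adjusted_treatment_effect(y, d, z)
print(raw, adjusted)
```

In this constructed example the raw difference in means understates the effect because the covariate raises outcomes and is over-represented among controls; conditioning on it recovers the effect built into the data.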
Table 4
Internal rates of return (%), by imputation and extrapolation method and assumptions about crime costs, assuming 50% deadweight cost of taxation.

Returns                                            To individual          To society, including the individual (nets out transfers)
Victimization/arrest ratio (a)                                            Separated              Separated              Property vs. violent
Murder victim cost (b)                                                    High ($4.1M)           Low ($13K)             Low ($13K)
Imputation                         Extrapolation   All (c)   Male Female  All (c)   Male Female  All (c)   Male Female  All (c)   Male Female
Piecewise linear interpolation (d) CPS                 6.0    5.0    7.7      8.9    9.7   15.4      7.7    9.7    9.5      7.7   10.1   10.2
                                                     (1.7)  (1.8)  (1.8)    (4.9)  (4.2)  (4.3)    (2.6)  (3.0)  (2.7)    (3.9)  (4.5)  (3.6)
                                   PSID                4.8    2.5    7.4      7.3    8.0   15.3      7.6    9.2   10.0      7.2    9.5   10.5
                                                     (1.6)  (1.8)  (1.5)    (5.0)  (4.1)  (3.7)    (2.7)  (3.1)  (2.8)    (3.7)  (4.4)  (3.1)
Cross-sectional regression (e)     CPS                 5.0    4.8    6.8      7.3    8.3   14.2      7.4   10.0    8.7      7.2   10.1    9.2
                                                     (1.4)  (1.5)  (1.3)    (4.5)  (4.1)  (4.0)    (2.3)  (2.9)  (2.2)    (3.4)  (4.0)  (3.3)
                                   PSID                4.9    4.3    5.9      8.6    9.8   14.9      7.2   10.0    7.8      7.2   10.4    8.7
                                                     (1.6)  (1.8)  (1.5)    (2.3)  (3.3)  (5.2)    (2.9)  (3.0)  (1.5)    (3.7)  (4.1)  (1.5)
                                   Hause               4.8    4.9    6.8      7.3    8.5   14.9      7.2   10.0    8.7      7.1   10.1    9.3
                                                     (1.4)  (1.4)  (1.2)    (4.0)  (4.2)  (3.4)    (2.7)  (2.9)  (2.3)    (3.0)  (4.1)  (3.2)
Kernel matching (f)                CPS                 6.9    7.6    6.6      8.1    9.5   14.7      8.5   11.2    8.8      8.5   11.1    9.4
                                                     (1.3)  (1.1)  (1.4)    (4.5)  (4.1)  (3.2)    (2.5)  (2.9)  (2.9)    (3.5)  (4.3)  (3.5)
                                   PSID                6.2    6.8    6.8      9.2   10.7   14.9      8.1   11.1    8.1      8.1   11.4    9.0
                                                     (1.2)  (1.1)  (1.0)    (2.9)  (3.2)  (4.8)    (2.6)  (3.1)  (1.7)    (2.9)  (3.0)  (2.0)
                                   Hause               6.3    8.0    7.1      8.4    9.7   14.6      8.8   11.2    9.3      8.5   11.2    9.6
                                                     (1.2)  (1.2)  (1.3)    (4.3)  (4.0)  (4.0)    (2.3)  (2.5)  (2.4)    (3.2)  (4.2)  (3.7)
Hause (g)                          CPS                 7.1    6.5    6.5      8.0    8.9   14.7      8.5   10.5    8.6      8.3   10.5    9.1
                                                     (2.5)  (2.7)  (2.0)    (4.7)  (4.2)  (4.2)    (2.6)  (2.2)  (2.7)    (3.1)  (4.0)  (3.3)
                                   PSID                7.0    6.0    6.2      9.7   10.5   14.8      8.8   11.0    7.4      8.8   11.3    8.4
                                                     (3.0)  (2.9)  (2.2)    (3.7)  (3.8)  (5.6)    (3.2)  (3.4)  (2.5)    (3.7)  (3.1)  (3.2)
                                   Hause               6.5    5.7    6.3      7.8    8.7   14.5      8.2   10.6    8.5      8.2   11.0    9.4
                                                     (2.3)  (2.0)  (1.8)    (4.7)  (4.2)  (3.5)    (2.5)  (3.0)  (2.7)    (3.3)  (4.0)  (3.6)

Notes: Standard errors in parentheses are calculated by Monte Carlo resampling of prediction errors and bootstrapping. All estimates are adjusted for compromised randomization.
All available local data and the full sample are used unless otherwise noted.
(a) A ratio of victimization rate (from the NCVS) to arrest rate (from the UCR), where "Property vs. violent" uses common ratios based on a crime being either violent or property and
"Separated" does not.
(b) "High" murder cost valuation accounts for statistical value of life, while "Low" does not.
(c) The "All" IRR represents an average of the profiles of a pooled sample of males and females, and may be lower or higher than the profiles for each gender group.
(d) Piecewise linear interpolation between each pair reported.
(e) Cross-sectional regression imputation using a cross-sectional earnings estimation from the NLSY79 black low-ability subsample.
(f) Kernel-matching imputation matches each Perry subject to the NLSY79 sample based on earnings, job spell durations, and background variables.
(g) Based on the Hause (1980) earnings model.
63
In principle, since different types of taxes are levied by different jurisdictions that
create different deadweight losses, we should account for this feature of the welfare
system. In practice, this is not possible. The Appendix, Part J, Tables J.1–J.3, take the
reader through the calculation underlying three of the cases reported in this table.
64
The kernel matching procedure imposes the fewest functional form assumptions on
the earnings equations on the Perry-NLSY79 matched data, and is more data-sensitive
than crude interpolation schemes. The PSID projection imposes an autoregressive earnings
function with freely specified covariates on the Perry-PSID matched data, which is the least
functional-form dependent. At the same time it enables us to link post-age-40 earnings
with the observed earnings at 40 and respects differences in unobservables between
treatments and controls. For details, see the Appendix, Part G.
65
For details on the randomization protocol, see Weikart et al. (1978) and Heckman
et al. (2009b). The latter paper analyzes program treatment effects accounting for the
corrupted randomization. For further discussion of the randomization procedure, see
the Appendix, Part L.
66
The fraction of children with working mothers at study entry is much higher in the
control group (31%) than in the treatment group (9%). This trend continues to hold when
males and females are viewed separately. In contrast, treatment females had a greater
fraction of fathers present at study entry than control females (68% vs. 42%), while
treatment males had a smaller fraction of fathers present than control males (45% vs. 56%).
67
We use the Freedman–Lane procedure discussed in Heckman et al. (2009b). They
show that the results from this procedure agree with the results obtained from other
parametric and semiparametric estimation procedures.
68
While Barnett (1985) recognizes this problem, he does not adjust his estimates for any
imbalances in the randomization protocol. He only discusses the possibility of imbalance in
treatment assignment based on mother's working status. As previously noted, however,
the imbalance is observed on several variables. We explicitly adjust our estimates for this
problem and compare our estimates with unadjusted ones in the Appendix, Part L. The
effects on the estimated rates of return are not always in the same direction.
The estimated rates of return reported in Table 4 are comparable
across different imputation and extrapolation schemes. Kernel
matching imputation tends to produce slightly higher estimates.
Alternative extrapolation methods have a more modest effect on
estimated rates of return.
69
Alternative assumptions about the
victim cost of murder (whether to include the statistical value of
life) affect the estimated rates of return in a counterintuitive
fashion. Assigning a high number to the value of a life lowers the
estimated rate of return for males because the one murder committed by a
treatment group male occurs earlier than the two committed by
males in the control group.
70
Although estimated costs are sensitive
to assumptions about the victimization cost associated with murder,
they are not very sensitive to the crime categorization method.
Adjusting for deadweight losses of taxes lowers the rate of return to
the program. Our estimates of the overall rate of return hover in the
range of 7–10%, and are statistically significantly different from zero
in most cases.
71
Table 5 presents some sensitivity analyses. First, it demonstrates
how IRRs change when we exclude outliers from our computation.
We consider two types of outliers: subjects who attain more than
a 4-year college degree and a group of hard-core criminal
offenders. In the Perry dataset, we observe 2 subjects who acquired
master's degrees: 1 control male and 1 treatment female. Compared
to the full sample result, excluding these subjects has only modest
effects on the estimated IRRs. To exclude the hard-core criminals,
we use the total number of lifetime charges through age 40
including juvenile crimes. We exclude the top 5% (six persons) of
hard-core offenders from the full sample and recalculate the IRRs.
72
Eliminating these offenders increases the estimated social IRRs
obtained from the pooled sample, and strengthens the precision of
the estimates.
Table 5 also compares our estimates with two other sets of IRRs: one
based on national figures in all computations and the other on
crime-specific criminal justice system (CJS) costs used in Belfield et al. (2006).
Accounting for local costs instead of relying on national figures increases
estimated IRRs. The signs and magnitudes of the effect of using local costs
differ by cost component. As noted in Section 3, schooling expenditures
for K-12 education in Michigan are slightly higher than the
corresponding figures at the national level, although accounting for
this has only modest effects on the estimated IRRs. The modest effect
arises because total education expenditure is not substantially different
between the control and the treatment group after accounting for grade
retention, special education, and years in regular schooling. Using local
crime costs increases the estimated IRR and increases the precision of
the estimates. No clear pattern emerges from using local vs. national
victimization-to-arrest ratios. However, criminal justice system costs for
Michigan are unambiguously higher than the corresponding national
costs and accounting for these raises the estimated rate of return. Using
the crime-specific CJS costs employed by Belfield et al. (2006) has mixed
effects on the estimated IRR depending on the victimization-to-arrest
ratio used. As noted in Section 3, their estimated costs of the criminal
justice system are based on data collected from a specific local area
whose characteristics are quite different from the region where the
Perry program was conducted.
4.1. Benefit-to-cost ratios
As noted in Hirshleifer (1970), use of the IRR to evaluate
programs is potentially problematic.
73
For this reason, it is useful
69
The profiles labeled as "All" pool the data regardless of gender. The IRR from pooled
samples can be higher or lower than the IRR for each separate gender subsample.
70
Thus, when Belfield et al. (2006) assign a low value to murder victimization, they
are actually presenting an upward-biased estimate.
71
Tables J.4 and J.5 in the Appendix, Part J, present the estimates for other
assumptions about deadweight loss.
72
The Appendix, Part H, Table H.12, identifies these individuals. This group consists
of 3 control males and 3 treatment males. This small group commits about 25% of all
charges and 27% of all felony charges associated with non-zero victim cost.
Table 5
Sensitivity analysis of internal rates of return (%).

Returns                              To individual          To society, including the individual (nets out transfers)
Victimization/arrest ratio (a)                              Separated              Separated              Property vs. violent
Murder victim cost (b)                                      High ($4.1M)           Low ($13K)             Low ($13K)
                                         All   Male Female  All (c)   Male Female  All (c)   Male Female  All (c)   Male Female
Full sample                              6.2    6.8    6.8      9.2   10.7   14.9      8.1   11.1    8.1      8.1   11.4    9.0
(s.e.)                                 (1.2)  (1.1)  (1.0)    (2.9)  (3.2)  (4.8)    (2.6)  (3.1)  (1.7)    (2.9)  (3.0)  (2.0)
Excluding MAs (d)                        6.1    6.7    6.8      9.1   10.6   15.0      7.9   10.9    8.4      7.9   11.2    9.2
(s.e.)                                 (1.1)  (1.2)  (1.0)    (2.8)  (3.3)  (4.6)    (2.5)  (3.0)  (1.7)    (2.8)  (3.1)  (2.1)
Excluding hard-core criminals (e)        6.3    6.8    6.8     10.2   11.0   14.9      9.7   11.5    8.1      9.5   11.7    9.0
(s.e.)                                 (1.3)  (1.1)  (1.0)    (2.8)  (3.1)  (4.8)    (2.6)  (3.2)  (1.7)    (2.6)  (3.0)  (2.0)
Based on national figures (f)            6.3    6.9    6.6      9.1   10.2   14.6      8.0   10.5    7.7      8.2   10.9    8.4
(s.e.)                                 (1.2)  (1.2)  (1.1)    (2.8)  (3.0)  (4.9)    (2.5)  (3.2)  (1.6)    (2.7)  (3.2)  (2.1)
Based on crime-specific CJS costs
  from Belfield et al. (2006) (g)        6.4    7.0    6.6      9.1   10.7   14.8      8.0   11.1    8.0      8.2   11.5    8.6
(s.e.)                                 (1.2)  (1.3)  (1.1)    (2.8)  (3.1)  (4.6)    (2.7)  (3.1)  (1.6)    (3.0)  (3.1)  (2.2)

Notes: All estimates reported are adjusted for compromised randomization. Kernel matching imputation and PSID projection are used for missing earnings. Standard errors in
parentheses are calculated by Monte Carlo resampling of prediction errors and bootstrapping. Deadweight cost of taxation is assumed at 50%. All available local data and the full
sample are used unless otherwise noted.
(a) A ratio of victimization rate (from the NCVS) to arrest rate (from the UCR), where "Property vs. violent" uses common ratios based on a crime being either violent or property and
"Separated" does not.
(b) "High" murder cost accounts for statistical value of life, while "Low" does not.
(c) "All" is computed from an average of the profiles of the pooled sample, and may be lower or higher than the profiles for each gender group.
(d) Excluding 2 subjects with master's degrees (1 control male and 1 treatment female).
(e) Excluding the top 5% of hard-core recidivists (3 control males and 3 treatment males).
(f) National means are used in all computations. All national figures are obtained from the same source of local data.
(g) Based on crime-specific CJS costs from Belfield et al. (2006).
73
The internal rate of return is the solution $\rho$ of a polynomial equation
$\sum_{t=1}^{T} \frac{Y_t^{Tr} - Y_t^{Ctl}}{(1+\rho)^t} - \bar{C} = 0$,
where $Y_t^{Tr}$ and $Y_t^{Ctl}$ denote lifetime net benefit streams at time $t$
for treatment and control group, respectively, and $\bar{C}$ is the initial program cost. Given
the high order of this equation ($T = 65 - 3 = 62$), there may be multiple real solutions
and some solutions may be complex. Fortunately, however, a unique positive real root
$\rho$ is found for all of the combinations of methodologies and assumptions examined in
this paper. While we obtain a unique positive real IRR, it is still problematic whether
the IRR or the benefit-to-cost ratio can serve as a correct criterion for a policy decision
comparing two mutually exclusive projects since the project with the higher internal
rate of return can have a lower present value than a rival project. See Hirshleifer (1970,
pp. 76–77). Carneiro and Heckman (2003) discuss the limits of the IRR in the context of
evaluating early childhood programs.
to consider benefit-to-cost ratios using different discount rates.
Table 1 and Tables J.6–J.9 in the Appendix, Part J, present the
benefit-to-cost ratios and the associated standard errors under different
assumptions about the discount rate, the deadweight cost of
taxation, the method of extrapolation and the method of
interpolation. Table L.1 shows the benefit-to-cost ratios adjusted for
compromised randomization. The effects of the adjustment are modest.
For typical discount rates (3–5%) that are below the internal rates
of return presented in Table 1, we estimate substantial
benefit-to-cost ratios.
74
The benefit-to-cost ratios generally support the rate of
return analysis.
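For concreteness, the two summary measures can be computed from a stream of treatment-minus-control net benefits as in the sketch below. It follows the footnoted definition of the IRR as a root of a polynomial in 1/(1+ρ); the function names and the two-period example stream are hypothetical and are not taken from the paper's data.

```python
import numpy as np


def irr_roots(net_benefits, initial_cost):
    """Positive real rates rho solving sum_t b_t / (1 + rho)^t - C = 0,
    with t = 1..T and b_t the treatment-minus-control net benefit."""
    b = np.asarray(net_benefits, dtype=float)
    # Substituting x = 1 / (1 + rho) gives b_T x^T + ... + b_1 x - C = 0.
    coeffs = np.concatenate([b[::-1], [-float(initial_cost)]])
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    x = real[(real > 0) & (real < 1)]  # rho > 0 exactly when 0 < x < 1
    return np.sort(1.0 / x - 1.0)


def benefit_cost_ratio(net_benefits, initial_cost, discount_rate):
    """Present value of net benefits divided by the initial program cost."""
    b = np.asarray(net_benefits, dtype=float)
    t = np.arange(1, len(b) + 1)
    return float(np.sum(b / (1.0 + discount_rate) ** t) / initial_cost)


# Hypothetical stream: cost of 100 now, a single benefit of 121 in period 2.
stream, cost = [0.0, 121.0], 100.0
print(irr_roots(stream, cost))                # rates at which the ratio equals 1
print(benefit_cost_ratio(stream, cost, 0.03))
```

Returning all positive real roots, rather than a single number, makes it easy to check whether the root is unique for a given benefit stream.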
4.2. Crime vs. other outcomes
Table 6 decomposes the benefit-to-cost ratio reported in Table 1
(reproduced in the first three columns of the table) into components
due to crime reduction and other components. The percentage
contributions of crime and other components are given in the
third row in each segment of Table 6. For both males and females, the
contribution of crime to the overall benefit-cost ratio is substantial
when a high value of victim's life is assumed. The contribution
declines when lower values are placed on victim lives.
For females, both the rate of return and the benefit-cost ratio are
higher the higher the value placed on a murder victim's life. For males, the
rate of return decreases when a higher value is placed on life. The
benefit-cost ratio increases. This pattern is explained by the time
pattern of murders among male treatments and controls. The one
male treatment murder occurs at an early age. The two male control
murders occur at later ages. At a zero rate of discount, the timing of
the murders does not matter. Because of the high internal rates of
return that we estimate, the timing matters. At a sufficiently high
discount rate, the male benefit-to-cost ratios decline as the value of
life increases.
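The role of timing can be seen in a small numerical sketch. The years used below are hypothetical placeholders, since the ages at which the murders occurred are not stated in this section; only the $4.1 million victim cost and the one-versus-two pattern come from the text.

```python
def murder_cost_saving(value_of_life, rate, treatment_year, control_years):
    """Discounted victim cost avoided in the control-group comparison
    minus the discounted victim cost incurred by the treatment group."""
    avoided = sum(value_of_life / (1.0 + rate) ** t for t in control_years)
    incurred = value_of_life / (1.0 + rate) ** treatment_year
    return avoided - incurred


V = 4.1e6                 # "High" victim cost of murder used in the paper
early, late = 15, (25, 30)  # hypothetical years after program entry

for rate in (0.0, 0.03, 0.10):
    print(rate, round(murder_cost_saving(V, rate, early, late)))
```

At a zero discount rate the two later control-group murders outweigh the single earlier treatment-group murder, so a higher value of life raises measured benefits. At a sufficiently high rate the earlier murder dominates in present value and the sign reverses, which is the mechanism described above.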
4.3. Age 40 analysis
Table 7 presents a more conservative version of our preferred
analysis (kernel matching for imputation and PSID extrapolation with a
50% deadweight loss). Instead of computing the rate of return and
benefit-cost ratios through age 65, we compute rates of return and
benefit-cost ratios only through age 40, assuming that benefits stop
after that age. This eliminates the need to extrapolate past age 40, and
thus eliminates one source of model uncertainty. We compare the
estimates from the third row of Table 4 with the estimates that assume
that benefits stop at age 40.
Unsurprisingly, rates of return and benefit-cost levels fall somewhat,
but in most cases they remain precisely determined. Even under this
very conservative assumption, the rate of return is substantial, precisely
estimated, and above the historical yield on equity.
4.4. Comparisons with previous studies
Internal rates of return for the Perry program have been reported
in two previous studies by Rolnick and Grunewald (2003) and Belfield
et al. (2006). Our estimates of the IRR are lower than those reported in
previous studies. While a number of factors produce this effect, the
dominant source of the difference between our estimates and the
previous estimates is in our treatment of crime and its social costs. In
particular, as noted in Section 3, treatment of the social cost of some
victimless crimes plays a crucial role.
Table 8 compares the estimates reported in previous studies with
those reported in this paper. While we employ a variety of methods
to estimate lifetime costs and benefits, here we present estimates
from a method that is similar to the one used in previous studies:
linear interpolation and CPS-based extrapolation for missing
earnings along with crimes separated by type and a low murder
cost ($13,000) assumption. We present two sets of results: one
with no deadweight loss, and the other with a 50% deadweight loss,
which we prefer. Each set is reported adjusting and not adjusting
for the compromised randomization of the program, a feature
never considered in previous studies of the Perry program. As
previously noted, for both the benefit-cost ratio and the rate of
return, adjustment has fairly modest consequences. In addition, no
previous study accounts for the deadweight cost of taxation.
Compared to the two previous studies, our estimated program
benefits are smaller for education and crime, and larger for earnings
and welfare costs.
The difference in education cost estimates is mainly caused by
our treatment of vocational training. As shown in the Appendix, Part J,
Table J.1, the costs of college education and vocational training differ
between control and the treatment groups, while the K-12 education
costs do not. Previous studies do not account for costs of vocational
training.
Our estimated earning differentials are larger for males and
smaller for females compared to those reported in previous studies.
While our imputation method for missing earnings before age 40 is
similar in many respects to the method used in previous studies,
our extrapolation method for ages after 40 is different. We account
for the fact that Perry subjects are at the bottom of the distribution
74
The appropriate social discount rate is a hotly debated topic. Some have argued for
a zero or negative social discount rate (Dasgupta et al., 2000). U.S. Government
recommendations (OMB, 1992 and GAO, 1991) suggest a 2.43% discount rate is
appropriate for long term investment projects like the Perry program.
Table 6
Decomposition of benefit-to-cost ratios: crime vs. other outcomes.

                Total                     Crime                     Other outcomes
Discount rate        All    Male  Female       All    Male  Female       All    Male  Female
(a) High murder cost
0%                  31.5    33.7    27.0      19.7    20.7    16.8      11.8    13.0    10.2
                  (11.3)  (17.3)  (14.4)     (8.6)  (11.3)  (15.3)     (3.0)   (4.0)   (3.6)
                       –       –       –     62.7%   61.3%   62.1%     37.3%   38.7%   37.9%
3%                  12.2    12.1    11.6       8.0     7.2     8.3       4.2     4.9     3.3
                   (5.3)   (8.0)   (7.1)     (4.0)   (5.1)   (7.6)     (1.1)   (1.4)   (1.4)
                       –       –       –     65.3%   59.5%   71.5%     34.7%   40.5%   28.5%
5%                   6.8     6.2     7.1       4.5     3.5     5.5       2.3     2.7     1.6
                   (3.4)   (5.1)   (4.6)     (2.5)   (3.2)   (5.0)     (0.6)   (0.7)   (0.8)
                       –       –       –     66.1%   56.4%   76.8%     33.9%   43.6%   23.2%
7%                   3.9     3.2     4.6       2.6     1.6     3.7       1.3     1.6     0.9
                   (2.3)   (3.4)   (3.1)     (1.7)   (2.1)   (3.4)     (0.4)   (0.4)   (0.5)
                       –       –       –     66.5%   50.5%   80.1%     33.5%   49.5%   19.9%
(b) Low murder cost
0%                  19.1    22.8    12.7       7.3     9.8     2.5      11.8    13.0    10.2
                   (5.4)   (8.3)   (3.8)     (3.2)   (5.5)   (1.5)     (3.0)   (4.0)   (3.6)
                       –       –       –     38.1%   42.8%   19.5%     61.9%   57.2%   80.5%
3%                   7.1     8.6     4.5       2.9     3.6     1.2       4.2     4.9     3.3
                   (2.3)   (3.7)   (1.4)     (1.5)   (2.6)   (0.7)     (1.1)   (1.4)   (1.4)
                       –       –       –     40.2%   42.2%   26.5%     59.8%   57.8%   73.5%
5%                   3.9     4.7     2.4       1.6     1.9     0.8       2.3     2.7     1.6
                   (1.5)   (2.3)   (0.8)     (1.0)   (1.7)   (0.4)     (0.6)   (0.7)   (0.8)
                       –       –       –     41.0%   41.3%   31.9%     59.0%   58.7%   68.1%
7%                   2.2     2.7     1.4       0.9     1.1     0.5       1.3     1.6     0.9
                   (0.9)   (1.5)   (0.5)     (0.7)   (1.2)   (0.3)     (0.4)   (0.4)   (0.5)
                       –       –       –     41.9%   39.1%   36.1%     58.1%   60.9%   63.9%

Notes: The categories "Crime" and "Other outcomes" sum up to the "Total". Standard
errors in parentheses are calculated by Monte Carlo resampling of prediction errors
and bootstrapping; see the Appendix, Part K, for details. The percentages reported are
the contributions of each component. Kernel matching is used to impute missing values
in earnings before age 40, and PSID projection for extrapolation of later earnings. For
details of these procedures, see Section 3. In calculating benefit-to-cost ratios,
deadweight loss of taxation is assumed at 50%. Lifetime net benefit streams are
adjusted for corrupted randomization by being conditioned on unbalanced
pre-program variables. For details, see Section 4.
of ability by using CPS age-by-age growth rates (rather than levels
of earnings) to make extrapolations by race, gender and
educational attainment. Previous studies neglect this aspect of the Perry
sample and use national means of CPS earnings to project missing
earnings, inflating the level of earnings for both treatments and
controls.
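The distinction between extrapolating with growth rates and extrapolating with levels can be sketched as follows. The function name and all numbers are hypothetical; the sketch only shows the mechanics of chaining age-by-age growth rates onto a subject's last observed earnings.

```python
def extrapolate_by_growth(last_observed, growth_rates):
    """Project earnings forward by chaining age-by-age growth rates onto
    the last observed level, instead of substituting reference-sample levels."""
    path = []
    level = float(last_observed)
    for g in growth_rates:
        level *= 1.0 + g
        path.append(level)
    return path


# Hypothetical inputs: earnings of 20,000 at age 40 and growth rates for
# ages 41-43 taken from a reference cell (race, gender, education).
print(extrapolate_by_growth(20_000, [0.02, 0.00, -0.01]))
```

Because the projection starts from each subject's own observed earnings, a low observed level carries forward, whereas substituting national mean levels would overstate earnings for a sample drawn from the bottom of the distribution.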
Table 7
Comparison of IRRs and B/C ratios between age ranges (kernel matching and PSID projection).
Return to Individual Society
a
Society Society
Arrest ratio
b
Separate Separate Property vs. violent
Murder cost
c
High ($4.1M) Low ($13K) Low ($13K)
All
e
Male Fem. All Male Fem. All Male Fem. All Male Fem.
Deadweight loss
d
IRR 50% Age 65 6.2 6.8 6.8 9.2 10.7 14.9 8.1 11.1 8.1 8.1 11.4 9.0
(1.2) (1.1) (1.0) (2.9) (3.2) (4.8) (2.6) (3.1) (1.7) (2.9) (3.0) (2.0)
Age 40 4.9 5.9 5.6 8.8 10.3 14.9 7.5 10.7 7.9 7.5 11.1 8.9
(1.9) (1.5) (1.2) (3.4) (3.3) (5.2) (3.0) (3.0) (2.3) (3.4) (3.1) (2.4)
Discount rate
Benetcost ratios 0% Age 65 –––31.5 33.7 27.0 19.1 22.8 12.7 21.4 25.6 14.0
(11.3) (17.3) (14.4) (5.4) (8.3) (3.8) (6.1) (9.6) (4.3)
Age 40 –––22.5 24.7 18.2 12.4 15.4 7.3 14.3 17.8 8.3
(9.5) (14.6) (11.2) (4.7) (7.0) (3.3) (5.4) (8.2) (3.7)
3% Age 65 –––12.2 12.1 11.6 7.1 8.6 4.5 7.9 9.5 5.1
(5.3) (8.0) (7.1) (2.3) (3.7) (1.4) (2.7) (4.4) (1.7)
Age 40 –––9.8 9.7 9.5 5.4 6.5 3.2 6.1 7.4 3.8
(4.9) (7.3) (6.3) (2.2) (3.4) (1.5) (2.6) (4.0) (1.7)
5% Age 65 –––6.8 6.2 7.1 3.9 4.7 2.4 4.3 5.1 2.8
(3.4) (5.1) (4.6) (1.5) (2.3) (0.8) (1.7) (2.8) (1.1)
Age 40 –––5.8 5.2 6.3 3.2 3.8 1.9 3.6 4.2 2.3
(3.2) (4.8) (4.3) (1.4) (2.2) (0.9) (1.6) (2.6) (1.1)
7% Age 65 –––3.9 3.2 4.6 2.2 2.7 1.4 2.5 2.9 1.7
(2.3) (3.4) (3.1) (0.9) (1.5) (0.5) (1.1) (1.8) (0.7)
Age 40 –––3.5 2.7 4.2 1.9 2.3 1.2 2.1 2.4 1.5
(2.2) (3.3) (3.0) (0.9) (1.5) (0.6) (1.1) (1.8) (0.7)
Notes: Kernel matching is used to impute missing values in earnings before age 40, and PSID projection for extrapolation of later earnings. For details of these procedures, see
Section 3. In calculating benefit-to-cost ratios, deadweight loss of taxation is assumed at 50%. Lifetime net benefit streams are adjusted for corrupted randomization by being
conditioned on unbalanced pre-program variables. For details, see Section 4. Standard errors in parentheses are calculated by Monte Carlo resampling of prediction errors and
bootstrapping; see the Appendix, Part K, for details.
^a The sum of returns to program participants and the general public.
^b “High” murder cost accounts for statistical value of life, while “Low” does not.
^c Deadweight cost is dollars of welfare loss per tax dollar.
^d A ratio of victimization rate (from the NCVS) to arrest rate (from the UCR), where “Property vs. violent” uses common ratios based on a crime being either violent or property and
“Separate” does not.
^e “All” is computed from an average of the profiles of the pooled sample, and may be lower or higher than the profiles for each gender group.
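The two summary measures in this table are linked: the benefit-cost ratio falls as the discount rate rises, and the IRR is the discount rate at which the ratio equals one. The following is a minimal sketch of that relationship, not the authors' estimation procedure. The cost and benefit streams are hypothetical numbers invented for illustration, and the function names and the bisection solver are assumptions of this sketch.

```python
from typing import Sequence


def present_value(flows: Sequence[float], rate: float) -> float:
    """Discount a stream of annual flows (year 0 first) at a constant rate."""
    return sum(flow / (1.0 + rate) ** year for year, flow in enumerate(flows))


def benefit_cost_ratio(benefits: Sequence[float], costs: Sequence[float],
                       rate: float) -> float:
    """Ratio of discounted benefits to discounted costs."""
    return present_value(benefits, rate) / present_value(costs, rate)


def internal_rate_of_return(benefits: Sequence[float], costs: Sequence[float],
                            lo: float = 0.0, hi: float = 1.0,
                            tol: float = 1e-10) -> float:
    """Find the rate where net present value is zero, by bisection.

    Assumes NPV is positive at `lo` and negative at `hi`, which holds when
    costs come early and benefits come late.
    """
    def npv(rate: float) -> float:
        return present_value(benefits, rate) - present_value(costs, rate)

    if npv(lo) <= 0.0 or npv(hi) >= 0.0:
        raise ValueError("NPV must change sign between lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if npv(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical streams (NOT Perry data): two years of program cost,
# followed by 38 years of constant annual benefits.
example_costs = [100.0, 100.0] + [0.0] * 38
example_benefits = [0.0, 0.0] + [25.0] * 38

for rate in (0.0, 0.03, 0.05, 0.07):
    ratio = benefit_cost_ratio(example_benefits, example_costs, rate)
    print(f"discount rate {rate:.0%}: benefit-cost ratio = {ratio:.2f}")

example_irr = internal_rate_of_return(example_benefits, example_costs)
print(f"IRR = {example_irr:.1%}")
```

Because costs are incurred early and benefits accrue later, a higher discount rate shrinks benefits more than costs, which is the pattern visible across the 0%, 3%, 5%, and 7% rows of the table.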
Table 8
Comparison with previous studies.
| Author | Rolnick and Grunewald (2003)^a | Belfield et al. (2006)^b | Belfield et al. (2006)^b | This paper^c | This paper^c | This paper^c | This paper^c | This paper^c | This paper^c |
|---|---|---|---|---|---|---|---|---|---|
| Deadweight cost | 0% | 0% | 0% | 0% | 0% | 0% | 50% | 50% | 50% |
| | All | Male | Female | All | Male | Female | All | Male | Female |
| Education cost | 9034 | 14,382 | 2349 | 4325 | 11,318 | (5547) | 6434 | 16,819 | (8227) |
| Earnings | 43,583 | 68,429 | 82,690 | 78,010 | 42,965 | 127,485 | 78,010 | 42,965 | 127,485 |
| Crime cost | 101,132 | 386,985 | 14,602 | 66,780 | 101,924 | 17,164 | 75,062 | 112,248 | 22,564 |
| Welfare cost | 381 | 3118 | (1333) | 3698 | 2421 | 5502 | 5547 | 3631 | 8253 |
| Total benefit | 154,130 | 472,914 | 98,309 | 152,813 | 158,627 | 144,605 | 165,053 | 175,662 | 150,075 |
| Initial program cost | 17,759 | 17,759 | 17,759 | 17,759 | 17,759 | 17,759 | 26,639 | 26,639 | 26,639 |
| Benefit/cost ratio, unadj.^d | 8.7 | 26.6 | 5.5 | 8.6 | 8.9 | 8.1 | 6.2 | 6.6 | 5.6 |
| (s.e.)^e | (n.a.) | (n.a.) | (n.a.) | (3.9) | (4.3) | (5.0) | (3.0) | (3.9) | (3.6) |
| Benefit/cost ratio, adj.^f | n.a. | n.a. | n.a. | 9.2 | 9.8 | 8.0 | 6.6 | 5.4 | 7.3 |
| (s.e.)^e | (n.a.) | (n.a.) | (n.a.) | (3.5) | (4.0) | (4.7) | (2.7) | (3.0) | (3.2) |
| IRR to society, unadj. (%)^d | 16.0 | 21.0 | 8.0 | 8.6 | 10.6 | 11.6 | 8.0 | 9.8 | 10.2 |
| (s.e.)^e | (n.a.) | (n.a.) | (n.a.) | (2.6) | (2.8) | (3.2) | (2.9) | (3.4) | (3.1) |
| IRR to society, adj. (%)^f | n.a. | n.a. | n.a. | 8.3 | 10.4 | 11.0 | 7.7 | 9.7 | 9.5 |
| (s.e.)^e | (n.a.) | (n.a.) | (n.a.) | (2.4) | (2.2) | (2.9) | (2.6) | (3.0) | (2.7) |
Notes: All monetary values are in year-2006 dollars. Discount rate is assumed to be 3 percent following GAO (1991) and OMB (1992).
^a IRRs are recalculated from Rolnick and Grunewald (2003), Table 1A, excluding child care cost. Original cost and benefit estimate calculations can be found in Schweinhart et al.
(1993).
^b Recalculated from Belfield et al. (2006, Table 9), excluding child care cost.
^c Linear interpolation and CPS projection are used for missing earnings. Separated crime types and low murder cost ($13,000) are used for social cost of crime.
^d Unadjusted for compromised randomization.
^e Standard errors are calculated using bootstrapping. They are not computed in Rolnick and Grunewald (2003), Barnett (1996), or Belfield et al. (2006).
^f Adjusted for compromised randomization.
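The unadjusted benefit/cost ratios in Table 8 can be checked directly from the "Total benefit" and "Initial program cost" rows. The sketch below is only an arithmetic check on figures transcribed from the table. The dictionary layout and function name are assumptions of this sketch, and reading the 50% deadweight-cost column as scaling the program cost by 1.5 is an interpretation that happens to match the reported 26,639 to within rounding.

```python
# Figures transcribed from Table 8 (year-2006 dollars, 3% discount rate).
# Each entry: (total benefit, initial program cost, reported unadjusted B/C ratio).
# The keys are labels chosen for this sketch only.
table8 = {
    ("Rolnick and Grunewald (2003)", "0%", "All"): (154_130, 17_759, 8.7),
    ("Belfield et al. (2006)", "0%", "Male"): (472_914, 17_759, 26.6),
    ("Belfield et al. (2006)", "0%", "Female"): (98_309, 17_759, 5.5),
    ("This paper", "0%", "All"): (152_813, 17_759, 8.6),
    ("This paper", "0%", "Male"): (158_627, 17_759, 8.9),
    ("This paper", "0%", "Female"): (144_605, 17_759, 8.1),
    ("This paper", "50%", "All"): (165_053, 26_639, 6.2),
    ("This paper", "50%", "Male"): (175_662, 26_639, 6.6),
    ("This paper", "50%", "Female"): (150_075, 26_639, 5.6),
}


def benefit_cost_ratio(total_benefit: float, program_cost: float) -> float:
    """Discounted total benefit per dollar of discounted program cost."""
    return total_benefit / program_cost


for (study, deadweight, group), (benefit, cost, reported) in table8.items():
    ratio = benefit_cost_ratio(benefit, cost)
    print(f"{study}, deadweight {deadweight}, {group}: "
          f"{benefit:,} / {cost:,} = {ratio:.2f} (reported {reported})")

# Interpretation (an assumption of this sketch): with a deadweight cost of
# 50 cents per tax dollar, each dollar of program cost is counted as 1.5 dollars.
base_cost = 17_759
scaled_cost = base_cost * 1.5
print(f"{base_cost:,} x 1.5 = {scaled_cost:,.1f}")
```

The unadjusted ratios are simple quotients of the two rows. The adjusted ratios and the standard errors cannot be recovered this way, because they depend on the underlying individual-level data.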
126 J.J. Heckman et al. / Journal of Public Economics 94 (2010) 114–128
Another major difference between our study and previous ones
lies in our treatment of the social cost of crime. Our estimated
effect of crime reduction is much smaller than that reported in two
previous studies. We assume no victim costs are associated with
driving misdemeanors and drug-related crimes, while Belfield
et al. (2006) assign substantial victim costs to these crimes. We
consider these crimes as victimless. Even though these crimes may
be associated with other crimes which may generate victims, such
crimes would be recorded separately in the Perry crime record. This
more careful accounting for crime results in a substantial decrease in
the estimated cost of crime because victimless crimes account for
more than 30% of all crime incidence reported in the Perry crime
record.
5. Conclusion
This paper estimates the rate of return and the benefit–cost ratio
for the Perry Preschool Program, accounting for locally determined
costs, missing data, the deadweight costs of taxation, and the value
of non-market benefits and costs. It improves on previous estimates
by accounting for corruption in the randomization protocol, by
developing standard errors for these estimates, and by exploring the
sensitivity of estimates to alternative assumptions about missing
data and the value of non-market benefits. Our estimates are robust
to a variety of alternative assumptions about interpolation,
extrapolation, and deadweight losses. In most cases, they are statistically
significantly different from zero. This is true for both males and
females.
75
In general, the estimated annual rates of return are above
the historical return to equity of about 5.8% but below previous
estimates reported in the literature. Table 1 summarizes our
estimates of the rate of return for selected methodologies. Our
benefit-to-cost ratio estimates support the rate of return analysis.
Benefits on health and the well-being of future generations are not
estimated due to data limitations. All things considered, our analysis
likely provides a lower bound on the true rate of return to the Perry
Preschool Program.
76
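As a rough illustration of the statement that the estimates are statistically different from zero, the sketch below divides a few IRR estimates from the comparison of IRRs between age ranges by their reported standard errors. Comparing that ratio with the two-sided 5% normal critical value of 1.96 is an assumption of this sketch. The reported standard errors come from Monte Carlo resampling and bootstrapping, and the inference underlying the paper's statement may differ from this normal approximation.

```python
# Two-sided 5% critical value of the standard normal distribution.
NORMAL_CRITICAL_5PCT = 1.96

# (label, IRR estimate in %, standard error) transcribed from the table
# comparing IRRs between age ranges (50% deadweight loss).
irr_estimates = [
    ("Individual, all, age 65", 6.2, 1.2),
    ("Individual, all, age 40", 4.9, 1.9),
    ("Society, high murder cost, separate, all, age 65", 9.2, 2.9),
    ("Society, low murder cost, separate, all, age 65", 8.1, 2.6),
    ("Society, low murder cost, separate, all, age 40", 7.5, 3.0),
    ("Society, low murder cost, property vs. violent, all, age 65", 8.1, 2.9),
]


def estimate_to_se_ratio(estimate: float, standard_error: float) -> float:
    """Ratio of a point estimate to its standard error."""
    return estimate / standard_error


for label, estimate, standard_error in irr_estimates:
    ratio = estimate_to_se_ratio(estimate, standard_error)
    verdict = "exceeds" if ratio > NORMAL_CRITICAL_5PCT else "does not exceed"
    print(f"{label}: {estimate} / {standard_error} = {ratio:.2f} "
          f"({verdict} {NORMAL_CRITICAL_5PCT})")
```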
Acknowledgements
We are grateful to Lena Malofeeva and Larry Schweinhart of the
HighScope Foundation for their comments and their continued
support of our ongoing collaboration. We are grateful to the editor,
Dennis Epple, and two anonymous referees for their comments and to
Steve Durlauf, Jeff Grogger, Steven Barnett, Clive Belfield, and
participants at the Public Policy and Economics seminar at the Harris
School, University of Chicago, March, 2009. This research was
supported by the Committee for Economic Development by a grant
from the Pew Charitable Trusts and the Partnership for America's
Economic Success (PAES); the JB & MK Pritzker Family Foundation;
Susan Thompson Buffett Foundation; and NICHD (R01HD043411).
The views expressed in this presentation are those of the authors and
not necessarily those of the funders listed here. Supplemental results
are available in the Appendix.
Appendix. Supplementary results
Supplemental results associated with this article can be found, in
the online version, at doi:10.1016/j.jpubeco.2009.11.001.
References
Anderson, D.A., 1999. The aggregate burden of crime. Journal of Law and Economics
42 (2), 611–642 October.
Anderson, M., 2008. Multiple inference and gender differences in the effects of early
intervention: a reevaluation of the Abecedarian, Perry Preschool and early training
projects. Journal of the American Statistical Association 103 (484), 1481–1495
December.
Anderson, P.M., Meyer, B.D., 2000. The effects of the unemployment insurance payroll
tax on wages, employment, claims and denials. Journal of Public Economics 78 (1–2),
81–106 October.
Bajaj, V., Labaton, S., February 1 2009. Big risks for U.S. in trying to value bad bank assets.
New York Times Business/Economy. URL http://www.nytimes.com/2009/02/02/
business/economy/02value.html (accessed 2/4/2009).
Barnett, W.S., 1985. The Perry Preschool Program and its long-term effects: A
benefit–cost analysis. Early Childhood Papers 2. High/Scope Educational Research
Foundation, Ypsilanti, MI.
Barnett, W.S., 1996. Lives in the Balance: Age 27 Benefit–Cost Analysis of the High Scope
Perry Preschool Program. High/Scope Press, Ypsilanti, MI.
Barnett, W.S., Masse, L.N., 2007. Comparative benefit–cost analysis of the Abecedarian
program and its policy implications. Economics of Education Review 26 (1),
113–125 February.
Belfield, C.R., Nores, M., Barnett, W.S., Schweinhart, L., 2006. The High/Scope Perry
Preschool Program: cost–benefit analysis using data from the age-40 followup.
Journal of Human Resources 41 (1), 162–190.
Bertrand, M., Luttmer, E.F.P., Mullainathan, S., 2000. Network effects and welfare
cultures. Quarterly Journal of Economics 115 (3), 1019–1055 August.
Browning, E.K., 1987. On the marginal welfare cost of taxation. American Economic
Review 77 (1), 11–23 March.
Carneiro, P., Heckman, J.J., 2003. Human capital policy. In: Heckman, J.J., Krueger, A.B.,
Friedman, B.M. (Eds.), Inequality in America: What Role for Human Capital
Policies? MIT Press, Cambridge, MA, pp. 77–239.
Chambers, J.G., Parrish, T.B., Harr, J.J., 2004. What are we spending on special education
services in the United States, 1999–2000? Report 1, Special Education Expenditure
Project (SEEP). Center for Special Education Finance, United States Department of
Education, Office of Special Education Programs, Washington, DC.
Cohen, M.A., 2005. The Costs of Crime and Justice. Routledge, New York.
Congressional Budget Office, 2007. Historical Effective Federal Tax Rates: 1979 to 2005.
Congressional Budget Office, Washington, DC.
Dasgupta, P., Mäler, K.-G., Barrett, S., 2000. Intergenerational equity, social discount
rates and global warming, unpublished manuscript, Department of Economics,
University of Cambridge. Revised version of the paper with the same title that was
published in Discounting and Intergenerational Equity, (Washington, DC: Resources
for the Future, 1999).
DeLong, J., Magin, K., 2009. The U.S. equity return premium: past, present and future.
Journal of Economic Perspectives 23 (1), 193–208 Winter.
Dillon, S., December 17 2008. Obama pledge stirs hope in early education. New York
Times US Politics, early edition. URL http://www.nytimes.com/2009/02/02/business/
economy/02value.html (accessed 2/4/2009).
Fanton, J., 2008. Philanthropy, benefit–cost analysis, and public policy, remarks by
Jonathan F. Fanton at the 2008 Benefit–Cost Analysis Conference, Washington D.C.,
June 25, 2008. URL http://www.macfound.org/site/apps/nlnet/content3.aspx?
c=lkLXJ8MQKrH b=4255617 ct=5597397.
Federal Bureau of Investigation, 2002. Crime in the United States. Department of Justice,
Federal Bureau of Investigation, Washington, DC, database, 1995–2001. URL http://
www.fbi.gov/ucr/02cius.htm.
Feldstein, M., 1999. Tax avoidance and the deadweight loss of the income tax. Review of
Economics and Statistics 81 (4), 674–680 November.
Garces, E., Thomas, D., Currie, J., 2002. Longer-term effects of Head Start. American
Economic Review 92 (4), 999–1012 September.
Goldin, C., Katz, L.F., 2008. The Race between Education and Technology. Belknap Press
of Harvard University Press, Cambridge, MA.
Hanushek, E., Lindseth, A.A., 2009. Schoolhouses, Courthouses, and Statehouses:
Solving the Funding-Achievement Puzzle in America's Public Schools. Princeton
University Press, Princeton, NJ.
Hause, J.C., 1980. The fine structure of earnings and the on-the-job training hypothesis.
Econometrica 48 (4), 1013–1029 May.
Heckman, J.J., 2005. Invited comments. In: Schweinhart, L.J., Montie, J., Xiang, Z.,
Barnett, W.S., Belfield, C.R., Nores, M. (Eds.), Lifetime Effects: The High/Scope Perry
Preschool Study Through Age 40. Monographs of the High/Scope Educational
Research Foundation, 14. High/Scope Press, Ypsilanti, MI, pp. 229–233.
Heckman, J.J., Smith, J.A., 1998. Evaluating the welfare state. In: Strom, S. (Ed.),
Econometrics and Economic Theory in the Twentieth Century: The Ragnar Frisch
Centennial Symposium. Cambridge University Press, New York, pp. 241–318.
Heckman, J.J., Humphries, J.E., Mader, N., LaFontaine, P.A., forthcoming. The GED
and the Problem of Noncognitive Skills in America. University of Chicago Press,
Chicago.
Heckman, J.J., LaLonde, R.J., Smith, J.A., 1999. The economics and econometrics of active
labor market programs. In: Ashenfelter, O., Card, D. (Eds.), Handbook of Labor
Economics, vol. 3A. North-Holland, New York, pp. 1865–2097. Ch. 31.
Heckman, J.J., Malofeeva, L., Pinto, R., Savelyev, P.A., 2009a. The powerful role of
noncognitive capabilities in explaining the effects of the Perry Preschool Program,
unpublished manuscript, University of Chicago, Department of Economics.
Heckman, J.J., Moon, S.H., Pinto, R., Savelyev, P.A., Yavitz, A.Q., 2009b. A Reanalysis of
the HighScope Perry Preschool Program, unpublished manuscript, University of
Chicago, Department of Economics. First draft, September, 2006.
75 A recent paper by Anderson (2008) claims to find no effect for the Perry program
for males. He focuses on only a few arbitrarily weighted outcomes and does not
compute rates of return or benefit–cost ratios.
76 Our analysis answers many of the objections raised by Nagin (2001) against
previous cost–benefit studies of the returns to early interventions. We consider costs
and benefits to society, rather than governments or individuals alone. However, due to
data limitations, we do not value the psychic benefits of crime reduction to society at
large apart from the reduction in victimization costs.
Herrnstein, R.J., Murray, C.A., 1994. The Bell Curve: Intelligence and Class Structure in
American Life. Free Press, New York.
Hirshleifer, J., 1970. Investment, Interest, and Capital. Prentice-Hall, Englewood Cliffs, NJ.
Karoly, L.A., Kilburn, M.R., Cannon, J.S., 2005. Early Childhood Interventions: Proven
Results, Future Promise. RAND, Santa Monica, CA.
MaCurdy, T.E., 2007. A practitioner's approach to estimating intertemporal
relationships using longitudinal data: lessons from applications in wage dynamics. In:
Heckman, J.J., Leamer, E. (Eds.), Handbook of Econometrics. Vol. 6, Part 1 of
Handbooks in Economics. Elsevier, Amsterdam, pp. 4057–4167. Ch. 62.
Mahalanobis, P.C., 1936. On the generalized distance in statistics. Proceedings of the
National Institute of Sciences of India 2 (1), 49–55.
Moffitt, R.A., 2003. The negative income tax and the evolution of U.S. welfare policy.
Journal of Economic Perspectives 17 (3), 119–140 Summer.
Nagin, D.S., 2001. Measuring the economic benefits of developmental prevention
programs. Crime and Justice 28, 347–384.
National Center for Education Statistics, 1991. Digest of Education Statistics, 1990. U. S.
Department of Education, Washington, DC.
National Center for Education Statistics, Various. Digest of Education Statistics. National
Center for Education Statistics, Washington, DC.
Office of Management and Budget, October 1992. Circular No. A-94, revised January
2008.
Rodgers, J.D., Brookshire, M.L., Thornton, R.J., 1996. Forecasting earnings using
age-earnings profiles and longitudinal data. Journal of Forensic Economics 9 (2), 169–210.
Rolnick, A., Grunewald, R., 2003. Early Childhood Development: Economic Develop-
ment with a High Public Return. Tech. Rep. Federal Reserve Bank of Minneapolis,
Minneapolis, MN.
Schweinhart, L.J., Barnes, H.V., Weikart, D., 1993. Significant Benefits: The High/Scope
Perry Preschool Study Through Age 27. High/Scope Press, Ypsilanti, MI.
Schweinhart, L.J., Montie, J., Xiang, Z., Barnett, W.S., Belfield, C.R., Nores, M., 2005.
Lifetime Effects: The High/Scope Perry Preschool Study Through Age 40. High/
Scope Press, Ypsilanti, MI.
Shonkoff, J.P., Phillips, D., 2000. From Neurons to Neighborhoods: The Science of Early
Child Development. National Academy Press, Washington, DC.
Tax Policy Center, 2007. Individual income tax brackets, 1945–2007.
Tsang, M.C., 1997. The cost of vocational training. International Journal of Manpower
18 (1/2), 63–89.
United States General Accounting Office, May 1991. Discount Rate Policy. Government
Printing Office, Washington, DC.
Viscusi, W.K., Aldy, J.E., 2003. The value of a statistical life: a critical review of market
estimates throughout the world. Journal of Risk and Uncertainty 27 (1), 5–76
August.
Weikart, D.P., Epstein, A.S., Schweinhart, L., Bond, J.T., 1978. The Ypsilanti Preschool
Curriculum Demonstration Project: Preschool Years and Longitudinal Results.
High/Scope Press, Ypsilanti, MI.