Highlights:
- In social policy as in medicine, many positive initial research findings for a program or treatment are not reproduced in subsequent rigorous evaluations.
- The good news is that some important positive findings do reproduce. This report highlights a recent example—the Per Scholas employment and training program for low-income workers.
- A newly reported, high-quality randomized controlled trial (RCT) found that Per Scholas increased participants’ average earnings by a remarkable 27 percent, or $4,829, in the third and latest year of the study’s follow-up, compared to the control group (statistically significant, p<0.01).
- These findings closely replicate those of a prior high-quality RCT completed eight years earlier, which found an earnings increase of 32 percent ($4,663) in the second and final year of that study.
- Earnings effects of this size in a high-quality RCT are very unusual. The replication of these effects in two such studies means that the findings are very unlikely to be a statistical fluke, and policy officials would likely produce major earnings gains for many low-income workers through a faithfully implemented expansion of Per Scholas.
- A brief comment from the study team follows the main report.
When a credible study finds that a social program produces large effects on important life outcomes, the skeptic may say, “I’ll believe it when I see it replicated.” The skeptic has a point. As we have discussed previously, in social policy just as in medicine, positive findings of even a well-conducted randomized controlled trial (RCT), widely considered the strongest evaluation method, sometimes do not reproduce in a second high-quality study. The reasons for failed replications may include, for example: (i) the original finding was a statistical fluke, as statistically significant positive findings sometimes appear by chance; (ii) the original finding was valid, but occurred only because of unique factors that were not present in the replication study (e.g., in a job training RCT, tight labor market conditions that were no longer present during the new study); or (iii) in the replication study, the program was not faithfully implemented in adherence to the key features that generated the effect in the initial study.
The nihilist might go further than the skeptic and dismiss the blockbuster RCT findings as being perhaps interesting to researchers but of little policy or practical value. The world is ever changing, the nihilist may say, and what works in one setting at a particular point in time tells us very little about whether the program will work when it is implemented at another time in a different place.
We side with the skeptic, not the nihilist, for the simple reason that there are examples of RCT findings of large effects that were successfully replicated. An important historical example in social policy is Los Angeles’ replication of Riverside, California’s effective welfare-to-work program in the 1990s (described here). In this Straight Talk report, we highlight a recently reported example of successful replication: the Per Scholas employment and training program for low-income workers. Here is our overview of the new RCT findings (our full three-page summary is linked here):
This was a well-conducted randomized controlled trial (RCT) of Per Scholas’ WorkAdvance program, which provides training and employment services in the information technology sector to low-income workers in the Bronx (a borough of New York City). The study, which randomly assigned 700 individuals to Per Scholas or to a control group, found that Per Scholas substantially increased both employment and earnings. In the third year following random assignment, Per Scholas group members earned an average of $4,829 (or 27 percent) more than control group members, and 81 percent of the Per Scholas group was employed versus 75 percent of the control group. Both effects were statistically significant. Moreover, the earnings effect grew over time, from approximately zero in year one to $3,744 in year two to $4,829 in year three. The study also found that Per Scholas significantly improved participants’ self-reported life satisfaction and reduced their use of the Supplemental Nutrition Assistance Program (SNAP).
Source: Hendra et al. 2016 and Schaberg 2017 (see references below).
What is remarkable about these findings is not only that the earnings effect is so large (a $4,829, or 27 percent, increase in year three), but that the findings closely replicate those of a high-quality RCT of Per Scholas completed eight years earlier. The earlier study, conducted by different researchers with a two-year follow-up (as opposed to three), found that the program increased earnings in the second year by $4,663, or 32 percent. Our two-page summary of the earlier study is linked here.
So is this the promised land for evidence-based policy—i.e., strong, replicated evidence of meaningful improvement in people’s lives? Can policy officials now have strong confidence that if they implement Per Scholas on a larger scale in a similar population (adhering, of course, to the program’s key features), they will see major earnings gains for the low-income workers who participate?
We think the short answer is “yes” for two reasons. First, the successful replication provides high confidence that the initial finding was not just a statistical fluke, since the chance of two such flukes in succession is extremely small. Second, the successful replication shows that Per Scholas’ positive effects apply over different time periods and economic conditions—the relatively healthy labor market of the mid-2000s in the first study, as well as the weaker labor market of 2011-2015 in the wake of the Great Recession in the second study.
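The "two flukes in succession" argument can be illustrated with a rough calculation. This is a simplified sketch, not taken from either study: it assumes the program truly has no effect, that the two trials are independent, and that each uses a conventional two-sided 0.05 significance threshold, so a chance finding in the favorable direction occurs with probability 0.025 per study. The function name is hypothetical.

```python
# Simplified illustration (assumptions, not figures from either study):
# - the program truly has no effect,
# - the studies are statistically independent,
# - each study tests at a two-sided significance level `alpha`.

def prob_all_false_positives(alpha: float, n_studies: int) -> float:
    """Chance that every study shows a significant POSITIVE effect by luck alone."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    per_study = alpha / 2  # only half of two-sided false positives favor the program
    return per_study ** n_studies

one_fluke = prob_all_false_positives(0.05, 1)
two_flukes = prob_all_false_positives(0.05, 2)
print(f"One chance positive finding:  {one_fluke:.4f}")
print(f"Two chance positive findings: {two_flukes:.6f}")
```

Under these assumptions, two chance positive findings in a row would occur well under once in a thousand times. The new study's reported p<0.01 would make the figure smaller still.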
The only important caveat, we believe, is that both RCTs were conducted at Per Scholas’ program site in the Bronx. While the studies convincingly establish that the Bronx site is highly effective, it is still possible that Per Scholas’ other sites around the country may be less so (e.g., due to different staff capabilities or different relationships with local employers at these sites). Per Scholas now operates at six sites in the United States, including the Bronx. As the program expands, we would encourage a replication RCT at one or preferably more of the other sites to hopefully confirm that the program’s large effects generalize across different settings.
The skeptic has learned through long experience that many initial positive findings do not hold up in stronger studies or replication trials. The Per Scholas findings should impress the skeptic. A further successful replication at another site should turn her into a full believer.
Response provided by Rick Hendra, lead study author
We were very pleased to see these briefs and feel that it is important to emphasize, as the authors do, that social programs (such as Per Scholas) can be quite impactful and can replicate their impacts under even the most rigorous research design. Briefs like this are an important antidote to the cynicism about social programs that sometimes grows to the point where commentators feel that “nothing works.” The tone, framing, and content of the briefs seem right to us.
In addition to the factors that make replication difficult that were mentioned in the brief, one additional factor relates to the ability to detect a program’s impact. Evaluations are often designed so that they have an 80 percent chance of being able to detect the impact of a program when the program has an impact to be detected. The likelihood of detecting an impact in both of two independent evaluations, therefore, is even lower (0.80 × 0.80, or around 64 percent). To do so, programs must have a large enough signal to show through the noise twice and have effects that are robust to various external conditions. In other words, it is really remarkable when a program’s impacts can be replicated, and Per Scholas’ program is clearly an effective program.
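The 64 percent figure follows from multiplying the two studies' detection probabilities. A minimal sketch of that arithmetic, assuming the evaluations are independent and each has the same 80 percent power (the function name is hypothetical):

```python
# Illustration of the power arithmetic described above.
# Assumptions: the studies are independent and share the same statistical power.

def prob_all_detect(power: float, n_studies: int) -> float:
    """Chance that every one of n_studies detects an effect that really exists."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must be between 0 and 1")
    return power ** n_studies

both = prob_all_detect(0.80, 2)
three = prob_all_detect(0.80, 3)
print(f"Both of two studies detect the effect: {both:.0%}")
print(f"All of three studies detect the effect: {three:.0%}")
```

Under the same assumptions, the chance that a third study also detects the effect falls to roughly half. That is one reason a failed replication does not by itself show that a program is ineffective.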
The brief mentions that Per Scholas reduced the use of SNAP benefits, but doesn’t mention that the program also reduced the use of unemployment insurance benefits, TANF/welfare, and publicly funded health insurance (by 10 percentage points). The program’s effect on income is also important. Many programs increase earnings but lead to an offsetting reduction in public assistance, making the effect on net income insignificant. In Per Scholas, the effect on earnings more than offset the public assistance reductions, leading to an impact on overall income. This is very important because previous studies have found that Next Generation effects (effects on participants’ children) translate only when income goes up, not just earnings.[1] We shouldn’t speculate beyond the data, but we are hoping that in the future we can collect criminal justice records as well. As other scholars have emphasized, we are probably undercounting the benefits of these programs by not including criminal justice records.
We are relieved to see that the briefs do not overemphasize Per Scholas’ focus on the information technology sector as the key to its impacts. The P/PV study found impacts that are just as large in other sectors. This, coupled with recent findings on Project Quest and some of the early results coming out of the PACE evaluation, strongly suggests that what we see with Per Scholas is more about its use of a highly effective strategy that has legs beyond this one site or sector. Finally, we agree with the recommendation for another test of Per Scholas or a similar program in a different labor market so that we can learn more about the active ingredients (through a multi-arm or factorial test) and moderating conditions (if any).
[1] Morris, P., Gennetian, L., & Duncan, G. (2005). “Effects of Welfare and Employment Policies on Young Children: New Findings on Policy Experiments Conducted in the Early 1990s.” Social Policy Report XIX:II.
References:
Hendra, R., Greenberg, D.H., Hamilton, G., Oppenheim, A., Pennington, A., Schaberg, K. & Tessler, B.L. (2016). Encouraging evidence on a sector-focused advancement strategy: two-year impacts from the WorkAdvance demonstration. MDRC. Linked here.
Schaberg, K. (2017). Can sector strategies promote longer-term effects? Three-year impacts from the WorkAdvance demonstration. MDRC. Linked here.