The Harvard Family Research Project separated from the Harvard Graduate School of Education to become the Global Family Research Project as of January 1, 2017. It is no longer affiliated with Harvard University.

Karen Pittman, Senior Vice President of the International Youth Foundation, spoke about the challenges to evaluating youth development programs, issues with promoting policy changes and scale-up, and other countries’ experiences with youth development programming.

We asked Karen Pittman, Senior Vice President of the International Youth Foundation, to speak with us about what some of the challenges are to evaluating youth development programs, what issues arise when trying to promote policy changes and scale-up, and what other countries’ experiences might bring to youth development programming in the United States.

What are some of the challenges to evaluating youth development programs?

One of the primary challenges for the youth development field is the definition of outcomes, particularly intermediate ones. Many programs are not able to achieve larger outcomes such as reduced teen pregnancy, reduced violence, and increased literacy. Yet funding tends to be problem- or outcome-focused, and there is pressure from funders for programs to overcommit. You need to find intermediate and proximate outcomes that are positive, achievable, and also understandable to, or seen as important by, parents, policymakers, and others. Fortunately, good work has been done in identifying and building basic competencies in youth. These competencies translate into intermediate skills that help young people to navigate (work in diverse settings), socialize, and learn. If we see these intermediate outcomes now, then down the road, we should be able to point to achievement of larger outcomes.

The other primary challenge is defining what the “it” of programs is. Much of the research says, at a basic level, that what these programs deliver is relationships. But how do you define that? The Fund for the City of New York, Youth Development Project, is trying to answer this. It brought together about a dozen seasoned youth organizations to look at the research base on inputs and outputs and actually operationalize it. They developed three booklets around this [see New & Noteworthy section for information].

There are also challenges around comparison or control groups. In a healthy community, youth are often in more than one program or organization. This is how it should be. But you have a hard time bringing traditional evaluation methods such as comparison and control groups to this. It is difficult to find young people who receive “no services” in the true sense of the word. So the comparison, at the organizational level, is really a comparison of a known organization’s services and activities against those of unknown organizations. This is why many large organizations (e.g., Girls Inc.) revert to evaluating the impact of a specific program within their organization (using their own members as a control group) rather than try to get a sense of the value of the overall experience.

What elements might a “good” evaluation of youth development programs include?

Good evaluations of youth development programs should include several features. They should employ a participatory approach, including youth in the evaluation and working closely with the director and staff to understand their theory of change and to find out how the outcomes connect in a concrete way to what they are doing. Evaluators should also get a sense from staff as to where else people are getting services. Depending on the program, evaluators also need to pay careful attention to “doses” and services. Young people come in and out and can get very different services. Good evaluations also try to unbundle outputs and outcomes as much as possible. And, as I just mentioned, it is important to get as much information as possible about the other programs, organizations, and resources available in participants’ lives.

What are some of the challenges/lessons in translating research findings about successful youth programming into policy changes?

One of the problems is that there is not much research about what works: the number of unevaluated programs far outweighs the number evaluated. We are moving in uncharted territory. Even where there is evaluation research, there are traditionally a few problems.

First, there is the “black box” problem. Even when you have some fairly robust outcomes, you often aren’t sure how you got them. Youth programming is complex: was it the field trips, the relationships, the youth workers that worked? So you are left saying “adopt the whole package.” This is one of the lessons from YouthBuild: you have to implement all the components to get the outcomes. But, as YouthBuild learned, it is very hard to get people to do the whole thing because they want to take it apart. Then you can have quality problems.

The second problem is the outcomes themselves. One of the ways to effect policy change is to say very clearly “if you do this, this changes.” Unfortunately, we do not have those rigorous increases or reductions or outcomes that have a lot of policy currency. We should also make a distinction, although an artificial one, between place-based programs and curricula. Place-based programs are more complex—the outcomes you get from those are tricky to translate into policy. We have not done a good job of translating lessons learned into our public institutions. Good evaluation also requires upfront thinking about replication and scale issues. Typically, this means serving more kids or having the program taken over into the school. Unfortunately, integration of youth programs into the school day and curricula is not happening. I think that, in some ways, this is because evaluations have not thought about what needs to be seen to change policy.

Two broad types of evidence are needed. The first centers around the question of effectiveness. Does the program produce results? In particular, does it produce results that have currency with the public institution being targeted for uptake? If schools are expected to pick up programs, we have to demonstrate links in school attendance, academic progress, classroom disruption, etc. If juvenile justice or public health dollars are to be reallocated, we have to demonstrate links to youth problems or to the intermediate outcomes that the public and policymakers link to youth problems. Youth development purists have strong negative reactions to these suggestions. But in the end, demonstrating improvements in civic awareness or leadership skills is nice but insufficient if no one cares. To the extent that the target institutional home is known, evaluation outcomes should be developed with this organization in mind.

The second type speaks to scale. Can the program be replicated or expanded? Again, if it is being institutionalized, could teachers do it? Are they willing? Are they proficient? Do they produce the same results as community-based workers? Could volunteers do it? Could it be picked up by organizations other than the one that created it? Is it affordable? Is there an infrastructure that could take it on to move it out to significantly more young people? To the extent that the program has been delivered in different settings, to different populations, by different actors, these should be seen as pretests of robustness. They should be examined, not washed over.

What are some of the challenges/lessons in expanding successful programs? Are there some examples where this has been successfully done?

The “black box” problem also exists when thinking about expansion. It is hard to say which program dimensions are critical and which are “nice.” It is very hard in doing an expansion or replication to be sure that you have the right components. It is also important to distinguish between going to scale in a place and nationwide. A lot of programs have gone to scale nationally, but have not done so where they live. This is an important point. National programs replicate by picking the strongest possible candidates for local implementation. Locally, any program going to scale will quickly run through its top picks and be forced to rely on organizations that may not have the capacity or experience needed to deliver the program. Technical assistance, training, and good models and principles become very important. Handbooks don’t do it.

The Beacon Schools in New York City offer one of the best examples we have of going to scale in a place. They addressed the local capacity issue directly by building a very robust and aggressive technical assistance network for Beacons implementers. The results, I think, have been impressive. There is variation among the sites, but there is a bottom line of quality that is often not seen when over 70 organizations in one place are implementing a program idea.

The Beacons also offer a very good example of how to define outcomes. They have been very conservative in the outcomes they have promised. They were careful not to commit to reducing crime rates and drug abuse rates. The Beacons were built around a youth rights issue: young people in every neighborhood need to have places to go, people to talk to, things to do. The Beacons implementers knew if they opened centers, kids would use them. They have documented this hypothesis well. In moving quickly, Beacons implementers and technical assistance providers did not hold delivery of any service hostage to quality; that is, they did not let the perfect become the enemy of the good. However, they have worked very hard to bring best practice knowledge into the network.

Are there experiences/lessons of youth programming from other countries that might inform work in the U.S.?

The main thing I have learned from working in other countries is that they tend to first think sustainability, then scale, then evaluation for effectiveness. It is not an anti-evaluation bias; it is just that they do not typically think “let’s pilot something for ten years and really, really get it right and then take it to scale.” What other countries see as most important when dealing with youth issues, locally or nationally, is that something is there for young people. In many instances, there is nothing there, and to think about piloting something for 50 kids until they get it right seems almost unconscionable. They take the best of what they know and work to bring it quickly to as many kids as possible in a sustainable way. Then they come back and improve the quality. What many in other countries assume is that quality is “fixable” and it is most important to get the system in place.

This is what the Beacons did. There was no pilot with a major evaluation on which the next batch hinged. In the first year, there were 9, then 27, and now 89. The approach was very much, “we’re going to get stuff out to as many kids as possible and our benchmark will be doubling and redoubling each year until we have one in each neighborhood.” Along the way, through a good public and private partnership, and investment in technical assistance and training, they have been working on quality.

The difference between programming here and programming elsewhere thus appears in some basic assumptions. In the United States, once we demonstrate the results of something, we face the issue of where the money for it comes from. In other countries, they figure out what they can do with what they have and work very quickly with communities to think about what to do and then get it implemented. The idea that you build it, pilot it, demonstrate that it works, and then someone will come along with the money, is a Western, if not a uniquely American, view.

The other lesson is that there is much more of a balance in other countries between process and outcome evaluation. In this country, we assume that with evaluation we want to see individual-level changes. Other countries are often much more content to approach evaluation as a look at the process, and they seek community-level changes. It isn’t an issue of methodological sophistication. It is really that those in other countries do not necessarily need or want to know that 26 percent of kids in a given program increased their test scores. They want to know that a reasonable number of kids are in the program and that the community is engaged and behind it. They want to know that those kids who have left the program are doing something in the community. They would not be very happy if the young person whose test scores increased left the community. In this process, they look for elements of scale and sustainability.

Karen Horsch, Research Associate, HFRP

© 2016 Presidents and Fellows of Harvard College
Published by Harvard Family Research Project