How to make evaluation work (Evaluation Series: Part 2)

Anthony Betori
May 26

In my last blog, I explored some of the history of evaluation, and how that history can help us understand why evaluation can be so heavy and so unhelpful.

In this blog, we’ll explore some higher-level ideas to support building evaluation tools and strategies that work better and feel better, starting with the idea that first made all this stuff make sense to me: Goodhart’s Law.

Charles Goodhart, an economist, wrote in 1975 that “any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” That gobbledygook has since been translated into a much easier phrase to understand: “When a measure becomes a target, it ceases to be a good measure.”

The core of Goodhart’s idea is that evaluation is supposed to help us, but the very act of evaluating can lead to unintended consequences that distract us from our original goal. With my classes, I always use the following example:

Imagine that you own a factory that makes shoes. You are proud of your business, and you are hoping to expand so that you can serve more customers. You tell your employees that, in order to grow sales, everyone is expected to make 20 pairs of shoes a month. This will mean you’ll have more shoes to sell.

What could go wrong?

Well, there are three ways your employees can respond:

They can distort the system. They could whisper to their fellow worker, “I actually can make 30 pairs in a month, so I slow down to make 20 and can spend more time on breaks.”
They can distort the data. They could say, “Well, they didn’t say I can’t count shoes with mistakes. If we skip the quality check, I can easily get to 20 each month.”
Finally, they can work to improve the system. They could say, “I have an idea to make more shoes! If we buy this machine, it will help.” Then they’ll make the target number of shoes.

There are three corresponding ways you as the factory owner can respond.

Make it difficult to distort the system. You can micromanage your employees, make sure they’re always working, monitor every step to ensure they’re doing what they’re supposed to, and punish those who break the rules. Yikes.
Make it difficult to distort the data. You can require constant quality checks to make sure every stage is functioning. Exhausting.
Or, you can bring your workers in on the vision of the project overall, make sure they understand why we have the goals we have, and incentivize work to improve the system, not the outcome. Now, your employees are looking for ways to make more shoes because they get a bonus for extra shoes.

Donald Wheeler has this to say in Understanding Variation:

“Before you can improve any system, you must listen to the voice of the system. You must understand how the inputs affect the outputs of the system. Finally, you must be able to change the inputs (and possibly the system) to achieve the desired results. This will require sustained effort, constancy of purpose, and an environment where continual improvement is the operating philosophy.”

Put another way: “If you want to improve some process, you have to ignore the goal first, in favor of examining the process itself.”

So what does this mean for you, as you manage a nonprofit program?

First, you’ve got to involve your team in the full vision of your program. Why do you do what you do? What are you trying to change? How does what you’re doing contribute to that change?

Second, you’ve got to lead with humility, and let the team critique and change your systems. Obviously, manage this change and scale it to what’s possible, but things must change.

Third, evaluate the correct things with the appropriate tools. You likely don’t have the statistical firepower or cash/personnel to evaluate everything you could evaluate, or even everything you want to evaluate. Instead, consider what you must evaluate, and build the tools to support that need.

As we discussed in the previous blog, we haven’t actually been evaluating nonprofit programs for all that long. I think that is partially why we see so much variance in what evaluation looks like and why it sometimes feels like we’re inventing something new. We likely are inventing something new, or at least applying an older idea to a new situation. This also means we need to approach evaluation at a scale that makes sense for what we can actually do with the evaluation data, and work at every stage to ensure that the measures themselves don’t become the point of the program. The point of the program is the change you want to see in your community. Never lose sight of that.

That’s a lot of words and a lot of ideas, so let’s break it down into some quick tips to consider:

Evaluate with intention - only measure what you need to measure.
Do only what you have to do for people who aren’t your clients - grantors have requirements, but follow point one: if what they ask for is outside what you think you should measure, measure only what they ask for.
Start with a specific question - how is the activity you’re doing contributing to the change you want to see? That should be your guiding light.
Measure as little as possible - cut every question you don’t need. PLEASE! And be careful with demographic data collection. If you aren’t using every domain of identity to make decisions or ensure you’re serving who you’re meant to, then cut those questions. The price of admission to our services should not be a complete biography that goes nowhere.
Remember you’re contributing to an emerging science - talk about evaluation with others, and research how to do it best. Keep learning. Don’t give up. Take your time. And if you figure something out, publish it/share it/present on it/etc.

Alright, get out there and evaluate! I’d love to know what you think as well. You know how to get a hold of me. And if you’d like to book a three-hour class on this topic for your workplace, email me!

P.S. This blog is certified AI-free, and partly inspired by Tiny Experiments by Anne-Laure Le Cunff. Thank you to that book and author for helping me get my butt off the couch and into my office chair to write these ideas I’ve had for ages.
