by Scott Bayley

Three approaches to measurement in program evaluation

Three different approaches for measuring anything

In my experience, staff working in international development often struggle to devise measures of outcome variables and abstract concepts such as capacity or empowerment.

The three approaches

I’m going to suggest three different approaches for measuring anything, each with different strengths and weaknesses:

  1. Ask stakeholders for their opinion

  2. Measure the consequences of the change that you are interested in

  3. Directly measure the attribute of interest.


An example will help to clarify how this works in practice. A friend of mine is a world-class athlete who travels the world attending rowing competitions. In the lead-up to a race she trains twice a day, six days a week. She is keen to add muscle mass while reducing body fat.

From time to time my friend will ask me “Do you think I am adding muscle and losing fat?” This is an example of approach #1.

She also regularly weighs herself on the scales to check her weight (approach #2).

Finally, my friend goes to the local university twice a year, pays a fee and spends the day having her body scanned, being weighed in and out of the water, having skin fold tests, etc. She then comes away with a report on how much muscle, fat and bone she has in her left arm, right arm, left leg, right leg, torso, and so on (approach #3).

Now asking me for my opinion (approach #1) is quick and cheap, but it is also terribly inaccurate. Extensive research on cognitive bias has conclusively shown that stakeholder perceptions are highly inaccurate, even for experts.

Weighing herself every week (approach #2) tells my friend about changes in her weight, but not the balance of muscle to fat, hence this approach isn’t sufficiently accurate for her purposes.

Approach #3 is the most time-consuming and expensive of all, but it’s also the most accurate.

This same logic applies to our measurement of program outcomes and concepts such as capacity and empowerment. Methods that are quick and cheap are generally highly inaccurate.
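To see why the scales in approach #2 fall short, a little arithmetic helps. The sketch below uses entirely hypothetical numbers (they are not real measurements of anyone) and assumes, for simplicity, that a change in body weight is just the change in muscle plus the change in fat. Under that assumption, three very different outcomes produce exactly the same reading on the scales.

```python
# Hypothetical changes in body composition over a training block, in kilograms.
# All figures are invented purely for illustration.
scenarios = {
    "gained muscle, lost fat": {"muscle": +2.0, "fat": -2.0},
    "no change at all": {"muscle": 0.0, "fat": 0.0},
    "lost muscle, gained fat": {"muscle": -2.0, "fat": +2.0},
}


def scale_reading_change(change):
    """The scales only see the total: change in muscle plus change in fat."""
    return change["muscle"] + change["fat"]


for name, change in scenarios.items():
    print(f"{name}: scales show {scale_reading_change(change):+.1f} kg")
```

Every scenario shows the same change on the scales, so the weight reading alone cannot tell the desired outcome apart from the opposite one. The same problem arises whenever a program indicator tracks a consequence that several different underlying changes could produce.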

So what does a sound performance measure look like? In 2018 I prepared guidance for the Australian Department of Foreign Affairs and Trade that advocated for:

Utility: The information being collected needs to support program management (decision making, learning and continuous improvement, reporting).

Validity: The performance indicator needs to measure the desired outcome/concept – not something else. (I’ve noticed this is a common challenge for UN agencies).

Reliability: The performance indicator should measure consistently over time, with minimal random error.

Sensitivity: When the result changes, the performance measure should be sensitive to this change.

Simplicity: It should be relatively straightforward to collect and analyse the data.

Affordability: The program needs to be able to afford to collect the information.
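Two of these criteria, reliability and sensitivity, lend themselves to a quick numerical check. The following is a minimal sketch only, using invented scores on a hypothetical 0 to 100 capacity scale. The function names and the choice of standard deviation and difference in means as rough proxies are assumptions made for illustration; they are not taken from the guidance described above.

```python
from statistics import mean, stdev

# Hypothetical repeated readings of the SAME unchanged outcome.
indicator_a = [62, 61, 63, 62, 62]  # readings cluster tightly
indicator_b = [50, 71, 58, 75, 56]  # readings scatter widely


def random_error(readings):
    """Spread of repeated readings of an unchanged outcome.

    A smaller spread suggests a more reliable indicator.
    """
    return stdev(readings)


# Hypothetical readings taken before and after a real change in the outcome.
before = [62, 61, 63]
after = [70, 69, 71]


def detected_change(before_readings, after_readings):
    """Difference in average reading; a rough proxy for sensitivity."""
    return mean(after_readings) - mean(before_readings)


more_reliable = "A" if random_error(indicator_a) < random_error(indicator_b) else "B"
print("More reliable indicator:", more_reliable)
print("Change detected:", detected_change(before, after))
```

In this sketch both indicators average the same score, but indicator B's readings vary so much from one occasion to the next that a real change of a few points would be hard to distinguish from noise. Reliability and sensitivity therefore need to be considered together.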

My 2001 journal article ‘Measuring customer satisfaction’ discusses the technical aspects of sound measurement. You can download a free copy from my website.

Scott Bayley is Managing Director of Scott Bayley Evaluation Services and a former Principal Consultant for Monitoring, Evaluation and Learning at Oxford Policy Management (OPM) for the Asia Pacific region.

Scott Bayley is Senior Principal Specialist, MEL at Oxford Policy Management (OPM).
Scott leads OPM Australia’s monitoring, evaluation and learning (MEL) work for the Australian Department of Foreign Affairs and Trade and the New Zealand Ministry of Foreign Affairs and Trade.


