By Craig Charney | Insights | Series II | No. 4 | June 2014
If you are struggling with surveys, evaluations, assessments, or market research in developing countries, email the Survey Doctor with your question.
Q: Training programs are a frequent component of our international development efforts, including one we are now running with civil society organizations. But I’ve never really felt we understood how to evaluate training, other than giving a test at the end to trainees. How can we deliver a fuller sense of the results of the program to the clients – and the funders?
A: Finding out what the trainees know at the end of the program is certainly part of evaluating training programs – but it’s only one part. You also need to discover more about the trainees’ experience, what they actually learned, whether it affected what they do, and whether this achieved the program’s objectives.
There are methods for each of these – but they are not interchangeable, as work we did on a civil society training evaluation shows.
The four levels of training evaluation are reaction, learning, behavior, and results. This model was spelled out by the late Donald Kirkpatrick, the leading expert on the subject.
Reaction is the experience of the participants in the program – in short, customer satisfaction. This includes their evaluations of the program overall, trainers, curriculum, materials, facilities (and food, accommodation, and transport if provided by the program), along with whether they think the program has achieved or will achieve its aims. These evaluations should also probe for comments and suggestions on how the program can be improved. Participants are often the source of very valuable feedback that can identify barriers that may be preventing the training from attaining its goals or provide ideas that will increase its effectiveness.
The best way to get their feedback is in writing at the end of the course. If this is impossible, follow up as soon as possible after it ends, with reminder emails or phone interviews for those who do not respond to the initial feedback request. For this type of evaluation, there is no reason not to obtain a 100% response rate. The survey can include both quantitative and qualitative questions. (If there are large numbers of participants, the questionnaire may need to be more quantitative in orientation and processed by computer, with a limited number of qualitative interviews conducted as follow-ups.)
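As a simple illustration of computer-processed feedback, here is a minimal sketch in Python, assuming a hypothetical spreadsheet (feedback.csv) with one row per invited trainee and a 1-to-5 overall rating column; the file and column names are invented for this example:

    import csv

    def summarize_feedback(path):
        """Tally the response rate and mean satisfaction from a feedback file."""
        ratings, invited = [], 0
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                invited += 1
                # An empty cell means the trainee has not responded yet.
                if row["overall_rating"]:
                    ratings.append(int(row["overall_rating"]))
        if invited == 0 or not ratings:
            print("No responses yet.")
            return
        print(f"Response rate: {len(ratings) / invited:.0%}")
        print(f"Mean overall rating: {sum(ratings) / len(ratings):.2f} out of 5")

    summarize_feedback("feedback.csv")

A low response rate flagged this way is the cue to send the reminders mentioned above.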
Learning is the most familiar type of evaluation – what did the trainees learn during the program? But it’s worth remembering that knowledge is only one of the things training tries to instill. There are also the dimensions of skill development and attitude change.
So a test of knowledge at the end of a course – the most common evaluation of learning – is important, but not the only way to measure it. Evaluation can also cover skills, which may mean a live demonstration, or attitude change, which could call for qualitative survey work. However, these are only static measures at a point in time – they do not prove that something has changed as a result of the course. To do that, you need a pre-test of participants before the program, a control group to compare them to, or, ideally, both (unless you know for sure the participants knew nothing of the subject matter beforehand).
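To see what the pre-test and control group each contribute, consider a simple difference-in-differences calculation – a standard way to isolate a program effect. The scores below are invented for illustration:

    # Hypothetical mean knowledge-test scores (0-100), invented for illustration.
    trained_pre, trained_post = 52.0, 74.0   # participants, before and after
    control_pre, control_post = 51.0, 58.0   # comparable non-participants

    # The trainees' gain alone overstates the effect if everyone improved
    # over the same period (e.g., through on-the-job experience).
    naive_gain = trained_post - trained_pre                       # 22 points

    # Subtracting the control group's change isolates the training's part.
    training_effect = naive_gain - (control_post - control_pre)   # 15 points

    print(f"Naive gain: {naive_gain:.0f} points")
    print(f"Estimated training effect: {training_effect:.0f} points")

Without the pre-test there is no gain to compute; without the control group, the full 22-point gain would be credited to the course.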
The third level of evaluation is behavior – whether what participants actually do has changed because of the training. This implies both that they continue to understand the changes required and that they actually perform them. For instance, in an anti-corruption or compliance training program, this might include whether they remember the procedures they were taught and whether they follow them.
Behavioral evaluation requires survey work at an appropriate interval after the end of the training program. You can survey the former trainees themselves, but it may also be useful to survey their immediate supervisors or subordinates, particularly if you are concerned that trainees might not be fully accurate or objective in their reporting or assessment of their own behavior. Here, too, it is important to be able to compare the results to a pre-assessment before the training and/or a control group that has not experienced it if you want to prove that the behavior is the product of the training.
The last level of assessment is probably the most important yet the least often tested: change in the results of the trainees’ work. In other words, has the program not just affected the conduct of trainees, but actually yielded the overall effects it aimed at? This might be greater organizational effectiveness or productivity or broader impacts on the domains the trainees work in, such as higher incomes, better health, or more learning.
Testing results involves measuring the program’s expected outcomes, whether among trainees’ organizations or in the domains where they operate. In some cases there may be fairly ready measures of impact – for example, health or economic statistics from the project area – but in most cases some type of survey work is required to isolate the program’s impacts. Identifying the right targets for the surveys then becomes crucial. When the evaluation assesses the project’s overall impact, surveys of the project beneficiaries are important. When it concerns organizations, it may mean surveying organizational leaders, the beneficiaries they interact with, or their funders and partners, depending on the aims of the training program. Either way, a baseline survey before the training and/or a control group is very important to specify impact. (For an example of an evaluation of a training program for civil society organizations that identified impacts on the groups’ work, see our study of USAID’s I-PACS program in Afghanistan, available here, which made comparisons both over time and with control groups.)
Of course, training evaluation surveys face the same methodological issues as other evaluation surveys. These include the adequacy of the sample, comparison to a control group, the extent and statistical significance of changes noted, whether they were sustained, and the relation of benefits to costs. Warren Bobrow produced a useful article summarizing these issues as they apply to training evaluations.
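As one illustration of the statistical-significance point, here is a minimal sketch of a comparison between trainee and control-group test scores, using Python’s scipy library; the scores are invented for the example:

    from scipy import stats

    # Hypothetical post-training test scores, invented for illustration.
    trainee_scores = [74, 81, 69, 77, 85, 72, 79, 88, 70, 76]
    control_scores = [61, 58, 66, 63, 59, 70, 62, 57, 65, 60]

    # Welch's t-test: is the trainees' higher average unlikely to be chance?
    t_stat, p_value = stats.ttest_ind(trainee_scores, control_scores,
                                      equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # A p-value below the conventional 0.05 threshold suggests a real
    # difference - though with samples this small, the adequacy of the
    # sample is exactly the concern raised above.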
Evaluations of training programs also require the same conceptual apparatus as other non-economic evaluations: a clear theory of change, indicators that can be measured, and the like. This is why, for training evaluations of any substantial scale, professional evaluation help is recommended!
Donald L. Kirkpatrick and James D. Kirkpatrick, Evaluating Training Programs: The Four Levels, 3rd ed. (San Francisco: Berrett-Koehler, 2006).
Jack J. Phillips, Handbook of Training Evaluation and Measurement Methods, 2nd ed. (Houston: Gulf Publishing, 1991).