Designing an impact evaluation work plan: a step-by-step guide
May 4, 2021
This article is the second part of our 2-part series on impact evaluation. In the first article, “Impact evaluation: overview, benefits, types and planning tips,” we introduced impact evaluation and some helpful steps for planning and incorporating it into your M&E plan.
In this blog, we will walk you through the next steps in the process, from understanding the core elements of an impact evaluation work plan to designing your own impact evaluation to identify the real difference your interventions are making on the ground. Elements in the work plan include, but are not limited to, the purpose, scope and objectives of the evaluation, key evaluation questions, designs and methodologies, and more. Stay with us as we take a deep dive into each element of the impact evaluation work plan!
Key elements in an impact evaluation work plan
Developing an appropriate evaluation design and work plan is critically important in impact evaluation. Evaluation work plans are also called terms of reference (ToR) in some organisations. While the format of an evaluation design may vary on a case-by-case basis, it should always cover some essential elements, including:
- Background and context
- The purpose, objectives and scope of the evaluation
- Theory of change (ToC)
- Key evaluation questions the evaluation aims to answer
- Proposed designs and methodologies
- Data collection methods
- Specific deliverables and timelines
1. Background and context
This section provides information on the background of the intervention to be evaluated. The description should be concise, ideally kept under one page, and focus only on the issues pertinent to the evaluation: the intended objectives of the intervention, the timeframe and progress achieved at the time of the evaluation, the key stakeholders involved, and the organisational, social, political and economic factors that may influence the intervention's implementation.
2. Defining impact evaluation purpose, objectives and scope
Consultation with the key stakeholders is vital to determine the purpose, objectives and scope of the evaluation and identify some of its other important parameters.
The evaluation purpose refers to the rationale for conducting an impact evaluation. Evaluations undertaken to support learning should be clear about who is intended to learn from the evaluation, how they will be engaged in the evaluation process to ensure it is seen as relevant and credible, and whether there are specific decision points where this learning is expected to be applied. Evaluations undertaken to support accountability should be clear about who is being held accountable, to whom and for what.
The objectives of an impact evaluation reflect what the evaluation aims to find out, for example to measure impact and to analyse the mechanisms producing that impact. It is best to have no more than two or three objectives, so that the team can explore a few issues in depth rather than examine a broader set superficially.
The scope of the evaluation covers the time period, the geographical and thematic coverage, the target groups and the issues to be considered. It must be realistic given the time and resources available. Specifying the scope clearly identifies the implementing organisation's expectations and the priorities the evaluation team must focus on, so that resources are not wasted on areas of secondary interest. The central scope is usually specified in the work plan or the terms of reference (ToR) and the extended scope in the inception report.
3. Theory of change (ToC)
A theory of change (ToC), sometimes called a project framework, is a vital building block for any evaluation work, and every evaluation should begin with one. A ToC may also be represented as a logic model or a results framework. It illustrates the project's goals, objectives, outcomes and underlying assumptions, and explains how project activities are expected to produce a series of results that contribute to achieving the intended or observed project objectives and impacts.
A ToC also identifies which aspects of the intervention should be examined, which contextual factors should be addressed, what the likely intermediate outcomes will be and how the validity of the assumptions will be tested. It also explains what data should be gathered and how they will be synthesized to reach justifiable conclusions about the effectiveness of the intervention. Alternative causal paths and major external factors influencing outcomes may also be identified in a project theory.
A ToC also helps to identify gaps in logic or evidence that the evaluation should focus on, and provides the structure for a narrative about the value and impact of an intervention. All in all, a ToC helps the project team to determine the best impact evaluation methods for their intervention. ToCs should be reviewed and revised on a regular basis and kept up to date at all stages of the project lifecycle – be this at project design, implementation, delivery, or close.
More on the theory of change, logic model and results framework.
4. Key impact evaluation questions
Impact evaluations should be focused on key evaluation questions that reflect the intended use of the evaluation. An impact evaluation will generally answer three types of questions: descriptive, causal and evaluative. Each type of question can be answered through a combination of different research designs and data collection and analysis methods.
- Descriptive questions ask how things were, how they are now, and what changes have taken place since the intervention.
- Causal questions ask what produced the changes and whether or not, and to what extent, observed changes are due to the intervention rather than other factors.
- Evaluative questions ask about the overall value of the intervention, taking into account both intended and unintended impacts. They determine whether the intervention can be considered a success, an improvement or the best option.
Key evaluation questions for an impact evaluation are often structured around the OECD-DAC evaluation criteria:
- Relevance
- Effectiveness
- Efficiency
- Impact
- Sustainability
5. Impact evaluation design and methodologies
Measuring direct causes and effects can be quite difficult. The choice of methods and designs for impact evaluation is therefore not straightforward and comes with a unique set of challenges. There is no single right way to undertake an impact evaluation; the full range of options should be discussed, and a combination of methods and designs that suits the particular situation should be considered.
Generally, the evaluation methodology is designed on the basis of how the key descriptive, causal and evaluative evaluation questions will be answered, how data will be collected and analysed, the nature of the intervention being evaluated, the available resources and constraints and the intended use of the evaluation.
The choice of methods and designs also depends on causal attribution, including whether comparison groups are needed and how they will be constructed. In some cases, quantifying the impacts of an intervention requires estimating the counterfactual, that is, what would have happened to the beneficiaries in the absence of the intervention. In most cases, mixed-method approaches are recommended, as they build on qualitative and quantitative data and make use of several methodologies for analysis.
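The counterfactual idea can be written compactly in the standard notation of the evaluation literature. The formula below is a minimal sketch in our own notation, not taken from the original article:

```latex
% Average impact on beneficiaries: observed outcome minus the (unobserved) counterfactual
\text{Impact} \;=\; E[\,\text{outcome with intervention}\,] \;-\; E[\,\text{outcome without intervention (counterfactual)}\,]
```

The second term can never be observed directly for the beneficiaries themselves, which is why comparison groups, matching or modelling are needed to estimate it.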
In all types of evaluations, it is important to dedicate sufficient time to developing a sound evaluation design before any data collection or analysis begins. The proposed design should be reviewed at the start of the evaluation and updated regularly; this helps to manage the quality of the evaluation throughout the project cycle. Engaging a broad range of stakeholders, following established ethical standards and having an evaluation reference group review the design and draft reports all contribute to the quality of the evaluation.
Descriptive Questions
In most cases, an effective combination of quantitative and qualitative data will provide a more comprehensive picture of what changes have taken place since the intervention. Data collection options include, but are not limited to: interviews; questionnaires; structured or unstructured and participatory or non-participatory observations recorded through notes, photos or video; biophysical measurements or geographical information; and existing documents and data, such as existing data sets, official statistics, project records and social media data.
Causal Questions
Answering causal questions requires a research design that addresses “attribution” and “contribution.” Attribution means the observed changes were caused entirely by the intervention; contribution means the intervention partially caused, or contributed to, the changes. In practice, it is difficult for an organisation to fully claim attribution for a change, because changes within a community are usually the result of a mix of factors beyond the intervention itself, such as shifts in the economic and social environment or in national policy.
The design for answering causal questions could be ‘experimental,’ ‘quasi-experimental’ or ‘non-experimental.’ Let’s take a look at each design separately:
Experimental: involves the construction of a control group through random assignment of participants. Experimental designs can produce highly credible impact estimates but are often expensive and for certain interventions, difficult to implement. Examples of experimental designs include:
- Randomized controlled trial (RCT): in this type of experiment, two groups, a treatment group and a comparison group, are created and participants are assigned to each group at random. Before the intervention the two groups are statistically equivalent in terms of both observed and unobserved factors, so differences that emerge as the project progresses can be attributed to the intervention. Outcome data for both groups, along with baseline data and background variables, help determine the change (a minimal calculation sketch follows below).
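To make this concrete, here is a minimal sketch of how an RCT impact estimate is often computed as a simple difference in mean outcomes. The file and column names are hypothetical and used only for illustration:

```python
import pandas as pd

# Hypothetical endline survey: one row per participant, with the randomly
# assigned group ("treatment" or "control") and the outcome of interest.
df = pd.read_csv("endline_survey.csv")  # assumed columns: "group", "outcome"

# Because assignment was random, the control group's mean outcome stands in for
# the counterfactual, so the impact estimate is a simple difference in means.
mean_treatment = df.loc[df["group"] == "treatment", "outcome"].mean()
mean_control = df.loc[df["group"] == "control", "outcome"].mean()

print(f"Estimated average impact: {mean_treatment - mean_control:.2f}")
```

In a real evaluation you would also report a confidence interval or significance test for this difference, but the core logic is the comparison shown here.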
Quasi-experimental: unlike an experimental design, a quasi-experimental design constructs a valid comparison group through matching, regression discontinuity, propensity scores or other statistical means in order to control for and measure the differences between individuals who received the intervention being evaluated and those who did not. Examples of quasi-experimental designs include:
- Difference-in-differences: this measures the improvement or change over time among an intervention's participants relative to the improvement or change among non-participants (a short calculation sketch follows after this list).
- Propensity score matching: individuals in the treatment group are matched with non-participants who have similar observable characteristics. The average difference in outcomes between matched individuals is the estimated impact. This method rests on the assumption that there are no unobserved differences between the treatment and comparison groups.
- Matched comparisons: this design compares participants of the intervention being evaluated with matched non-participants after the intervention is completed.
- Regression discontinuity: individuals are ranked on a specific, measurable criterion, with a cut-off point determining who is eligible to participate. Impact is measured by comparing the outcomes of participants and non-participants close to the cut-off. Outcome data, data on the ranking criterion (e.g. age or an eligibility index) and data on socioeconomic background variables are used.
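The difference-in-differences logic mentioned above can be illustrated with a short Python sketch. The file and column names are hypothetical, assumed only for the example:

```python
import pandas as pd

# Hypothetical panel data with assumed columns:
#   "participant" (1 = took part in the intervention, 0 = did not)
#   "post"        (1 = endline round, 0 = baseline round)
#   "outcome"     (the indicator of interest)
df = pd.read_csv("panel_data.csv")

# Mean outcome for each of the four participant/time combinations.
means = df.groupby(["participant", "post"])["outcome"].mean()

# Change over time among participants minus change over time among non-participants.
did_estimate = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print(f"Difference-in-differences impact estimate: {did_estimate:.2f}")
```

The same estimate can also be obtained from a regression with an interaction term between participation and time, which makes it easier to add control variables.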
Non-experimental: when experimental and quasi-experimental designs are not feasible, a non-experimental design can be used. This design takes a systematic look at whether the evidence is consistent with what would be expected if the intervention were producing the impacts, and at whether other factors could provide an alternative explanation.
- Hypothetical and logical counterfactuals: this approach estimates what would have happened in the absence of the intervention. It involves consulting key informants to identify either a hypothetical counterfactual (what they think would have happened without the intervention) or a logical counterfactual (what would logically have happened without it).
- Qualitative comparative analysis: this design is particularly useful where there are a number of different ways of achieving positive impacts, and where data can be iteratively gathered about a number of cases to identify and test patterns of success.
Evaluative Questions
To answer these questions, one needs to identify criteria against which to judge the results and decide how well the intervention performed overall, or how successful or unsuccessful it was. This includes determining what level of impact will count as significant. Once the appropriate data are gathered, the results are judged against these evaluative criteria.
For this type of question, you should have a clear understanding of what counts as ‘success’: is it an improvement in quality, in value, or something else? One way to make this explicit is to use a rubric that defines different levels of performance for each evaluative criterion, and to decide what evidence will be gathered and how it will be synthesized to reach defensible conclusions about the worth of the intervention.
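As a simple illustration of what such a rubric might look like in practice, here is a minimal sketch; the performance levels, descriptions and example judgement are invented for demonstration and would normally be agreed with stakeholders for each criterion:

```python
# Illustrative rubric: invented levels and descriptions, for demonstration only.
rubric = {
    "excellent": "target met or exceeded for all priority groups, no significant negative effects",
    "adequate": "target largely met, with minor gaps for some groups",
    "poor": "little or no evidence of the intended change, or significant negative effects",
}

def judge(criterion: str, level: str, evidence: str) -> str:
    """Record a judgement for one evaluative criterion against the agreed rubric."""
    if level not in rubric:
        raise ValueError(f"Unknown performance level: {level}")
    return f"{criterion}: {level} ({rubric[level]}); evidence: {evidence}"

print(judge("Effectiveness", "adequate", "most targeted households adopted the new practice"))
```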
These are just a handful of commonly used impact evaluation methodologies in international development. To explore more, check out the Australian Government's guidelines, “Choosing Appropriate Designs and Methods for Impact Evaluation.”
6. Data collection methods for impact evaluation
According to BetterEvaluation, well-chosen and well-implemented methods for data collection and analysis are essential for all types of evaluations and must be specified during the evaluation planning stage. One should have a clear understanding of the objectives and assumptions of the intervention, what baseline data exist and are available for use, what new data need to be collected, how frequently and in what form, and what data beneficiaries will need to provide.
Reviewing the key evaluation questions can help to determine which data collection and analysis method can be used to answer each question and which data collection tools can be leveraged to gather all the necessary information. Sources for data can be stakeholder interviews, project documents, survey data, meeting minutes, and statistics, among others.
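One common way to organise this is an evaluation matrix that maps each key question to the methods and sources used to answer it. The sketch below is illustrative only; the questions, methods and sources are invented for the example:

```python
# Hypothetical evaluation matrix: each key evaluation question is mapped to the
# data collection methods and sources planned to answer it.
evaluation_matrix = [
    {
        "question": "What changes in household income have occurred since the intervention?",
        "methods": ["household survey", "analysis of official statistics"],
        "sources": ["baseline and endline survey data", "national statistics office"],
    },
    {
        "question": "To what extent are the observed changes attributable to the intervention?",
        "methods": ["difference-in-differences", "key informant interviews"],
        "sources": ["panel survey data", "project staff and community leaders"],
    },
]

# Quick completeness check: every question should have at least one method and one source.
for row in evaluation_matrix:
    assert row["methods"] and row["sources"], f"Incomplete plan for: {row['question']}"
```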
However, many outcomes of a development intervention are complex and multidimensional and may not be captured by a single method. Using a combination of qualitative and quantitative data collection methods, also called a mixed-methods approach, is therefore highly recommended: it combines the strengths and counteracts the weaknesses of qualitative and quantitative evaluation tools, allows for a stronger evaluation design overall and provides a better understanding of the dynamics and results of the intervention.
But how do you know which method is right for you?
It is a good idea to consider all possible impact evaluation methods and to carefully weigh their advantages and disadvantages before making a choice. The methods you select must be credible, useful and cost-effective in producing the information that matters for your intervention. As mentioned above, many impact evaluations use mixed methods, a combination of qualitative and quantitative approaches. Each method's shortcomings can be offset by using it in combination with others. Combining methods also increases the credibility of evaluation findings, as information from different data sources converges, and helps the team gain a deeper understanding of the intervention, its effects and its context.
7. Impact evaluation deliverables and timelines
Deliverables typically include an ‘inception report,’ a ‘draft report’ and the ‘final evaluation report’; for complex evaluations, ‘monthly progress reports’ might also be required. These reports contain detailed descriptions of the methodology that will be used to answer the evaluation questions, as well as the proposed sources of information and data collection procedures. They should also set out a detailed schedule for the tasks to be undertaken, the activities to be implemented and the deliverables, and clarify the roles and responsibilities of each member of the evaluation team.
We hope you found this article helpful. Our intention behind the two-part series was to explain impact evaluation and its key components in a simple manner so that you can plan and implement your own impact evaluation more accurately and effectively.
Before we sign off, just a quick reminder that this list is not all-inclusive but rather a selection of key elements that many organisations choose to include in their impact evaluation work plan or ToR. If your organisation includes additional elements in its evaluation work plan, do reach out to us and we'd be happy to add them here.
This article is partly based on the Methodological brief “Overview of Impact Evaluation,” by Patricia Rogers at UNICEF, 2014.
By Chandani Lopez Peralta, Content Marketing Manager at TolaData.