Can evaluations support a more adaptive and flexible development approach?

Fredrik Magnussen from Civita discusses the purpose of evaluation and shares some ideas for the ongoing organizational aid reform in Norway.
Fredrik Magnussen works on the think tank Civita's project on development policy and assistance. He has a master's degree in Conflict Studies from the University of Ottawa and a bachelor's degree in Cultural and Social Studies from the University of Oslo. He recently published the report Does Norwegian development aid keep up with global development? (Henger norsk bistand med i den globale utviklingen?).
Aid and development programs operate in increasingly changing contexts. This calls for more flexible approaches to the delivery of aid programs.
But despite an ever-expanding library of alternative development methods and a plethora of "future-of-aid" debates, workshops and new aid instruments, many donors still emphasize demonstrating impact and assessing progress against pre-determined outputs and outcomes that can be measured quantitatively.
While the traditional development industry slowly realizes that different tactics are needed, moving from policy to practice has proven to be a difficult and entangled process. The pathologies of the aid world, such as self-interest, accountability mechanisms, risk avoidance, short time frames and slow speed, are often counterproductive. This is especially evident within large government donors, who, because of political considerations and organizational structures, often end up with superficial changes.
The dominant approach to evaluations is still based on traditional "project cycle management", which often takes place only at the end of the project cycle and is executed in a narrow, retrospective "checks-and-balances" manner. However, this might not be fully appropriate for addressing the dynamics and complexities on the ground.
For government donors, evaluations are an integral part of the (often political) results agenda. In most cases, external consultants carry out this work. Hence, an industry of professional outside evaluators and external consultants has emerged in response to donor-led demand. This has arguably led to a standardization of methodology and of (pre-determined) criteria. Such a priori identification of indicators may, however, say more about the evaluation system than it does about the project itself. A narrow and linear focus simplifies some of the complexity, which might make sense from a policy/strategy (or even portfolio) point of view, but it faces obvious challenges at the project level, resulting in variations in quality, scope and depth.
Being "results-driven" often means relying too heavily on numbers and concrete measures to make decisions. But this might be the wrong kind of knowledge, leading to wrong decisions. Additionally, evaluations often follow a course of action that mostly involves thinking "inside the box" rather than "outside the box".
In most cases accountability outweighs learning; they are usually incompatible goals. Instead of talking about "lessons learned", it might be closer to reality to call them just lessons, or "lessons spurned". A lot of important and good information is left out of the polished, publicly available reports, which reflect what the aid system requires rather than what actually happens. This widens the gap between field and report, which in turn contributes to less learning and less transfer of knowledge.
A failed project is not necessarily a bad thing. We should keep in mind that while a project could fail according to stated project criteria, it could perform well in the broader context. On the flip side, it is also possible for a project to achieve its stated criteria but perform poorly in the system context. Evaluations should be able to reflect on both perspectives. Any project should have mechanisms that unveil negative consequences, or "Grimpact". A recent article about Campbell's Law and the Cobra Effect (unintended consequences) shows that relying too heavily on outcome measurement can leave things far worse than when you started. This demonstrates that good intentions do not always translate into good interventions.

Opportunities within the current system: 

Let's be honest. Evaluations will not magically create results. In their current state, they may not be fully fit for purpose. The uniqueness of each project and the fluidity of its environment make it challenging to impose a rigidly uniform framework. But with the results agenda probably here to stay, the trick is to maneuver within (and adjust) the existing framework. It may be possible to turn the results agenda into a positive force if we play the game to change the rules.
But this requires a rethink in order to fit better with the dynamics and complexities on the ground. We could start by focusing on the specific rather than the general, replacing best practice with more locally adapted solutions. However, this demands a great deal of effort, investment and political will. The ongoing organizational aid reform in Norway could be an opportunity to overhaul existing practice. Maybe it is time to re-inject some common sense into our development efforts? Here are some ideas:
Bridge the gaps between planning, monitoring and assessments/evaluations: The lines between these blur in complex systems. Any project requires timely and relevant information, which means there is a need for continuous feedback rather than reliance on a single snapshot at the end of the project cycle based on pre-determined criteria. A more comprehensive approach could relieve some of the pressure on evaluations as ends in themselves. While evaluations remain integral to the aid machinery, maybe we have been putting too much weight on the end product and not enough on the process itself. Instead of striving for predictability, a broader and more honest approach to evaluations in complex systems might be better suited to inform policy and increase the quality of interventions. This could reinforce learning by providing broader insights, influencing possibilities and creating space for reflection. A first step towards a more systematic approach and accelerated learning could be portfolio scanning and learning labs. Eventually, there is also a need to improve how results are communicated - finding ways to translate lab-related findings into the external world.
Context-specific analysis: There is also a need to develop a deeper understanding of contexts. I agree with Mark Hoffmann that we must move beyond the mantra of the importance of contexts: We must come to understand how and why contexts matter, and not simply that they do.
Looking not only at how a program may affect the context, but also at how the context can affect the project - these are interlinked perspectives that should be reflected in evaluations. Such two-way assessment is reflected, for example, in "outside-in" and "inside-out" analysis. The former seeks to understand the situation on the ground (including the constraints and opportunities for a project's influence on it), while the latter looks at the intervention itself (activities, outputs and outcomes). Placing the project in the larger context in which it operates can identify a wider range of factors and development dynamics.
Utility and pragmatism: Maybe we could be more honest about utility, impact and the purpose of evaluations. To paraphrase the late Canadian "PCIA scholar" Kenneth Bush: I view utility as the single most important criterion. The central (and fundamentally political) questions here are: Useful for whom? Useful for what? Whose interests are being served (or not)? If the purpose of an evaluation has not been clearly determined, its value will likely diminish. To counter this, applying a utilization-focused evaluation approach could help identify the right audience. An entry point could be more pragmatism, recognizing that there is no single right answer or definition of impact. Impact can mean different things to different people in different contexts and interventions.
Experimentation: It may well be time to move away from standardized evaluation approaches and look more keenly at others that better capture the dynamics of development - for example non-linear impact assessment, real-time evaluations and flexible/dynamic indicators. There are also goal-free evaluation approaches such as outcome harvesting and the most significant change (MSC) approach, which do not measure progress towards predetermined outcomes but instead identify and collect evidence of what has actually been achieved.
Information and flexibility during the lifespan of a program could very well yield increased insight, better quality and more effective development. By loosening the shackles of prespecified indicators, enough space might be created to allow for different (and arguably more realistic) interpretations of impact.
The purpose of evaluations should move beyond merely judging success or failure; they should strive to provide feedback, generate learning and encourage change when appropriate.
Published 11.06.2019
Last updated 11.06.2019