Adaptive Evaluation for Innovation and Scaling
Oct 10th 2023
The scaling of innovations often involves system change. Adaptive Evaluation offers a way to support and assess that change.
In 2022, Lok Swathya Mandali (LSM), a women-owned health collective enterprise supported by the Self-Employed Women’s Association (SEWA) in India, applied agile processes and a novel adaptive approach to design and evaluation to increase its women agents’ sales of allopathic and ayurvedic medicines. Across several design iterations, they experimented with solutions including new incentive structures, sales training, and product training. The final design offered agents a free cough syrup for reaching a modest monthly sales target of INR 500. This, together with earlier initiatives, doubled the number of agents earning above INR 500 and above INR 1,000.
LSM’s journey exemplifies how improving the sales agents’ lives required an exploratory process of change. Yet many ‘impact’ evaluations in development, notably randomized control trials (RCTs) and quasi-experimental methods, insist on keeping designs fixed until the end of the study. While these impact evaluations are good at assessing the statistical causal attribution of a specific intervention, they fail to address critical questions about effecting real, complex change. For example, what conditions enable a solution to work? How can the most effective designs be discovered within a time frame relevant to decision-making? How can solutions be adapted, improved, and scaled for greater impact in new contexts?
⚬ ⚬ ⚬
Let’s zoom out and consider the remarkable story of women’s collective groups in India. LSM emerged from this long-term process. When India became independent, most women lacked the agency to voice their concerns. A wave of social mobilizations followed, including the formation of MYRADA in 1968 and SEWA in 1972, both of which promoted women’s groups. In the 1980s, NGOs like PRADAN furthered this initial momentum with support for village self-help groups. By the early 2000s, national and state governments adopted the idea, creating programs of state support for self-help groups, reaching about 90 million women in India to date. But despite remarkable progress, there’s still much to do to enhance the recognition, dignity, and independence of rural women.
Social and economic development is a transformative process that goes beyond income and assets. It involves change processes, such as leadership, organization, mobilization, innovation, and scaling. Effective vaccine delivery for diseases like polio, for instance, required scientific achievements, community mobilizations, and public health efforts. Change is influenced by human behavior, contexts, and systems. Implementing self-help groups for women involves raising awareness, shifting mindsets, challenging village customs, and reforming government functioning. Ultimately change requires imagination, exploration, experimentation, and learning from experiences. Some well-known development organizations, such as BRAC, Pratham, and One Acre Fund, embrace this approach, but these are rare. Worse, traditional impact evaluations are ill-adapted to work effectively with this type of exploratory change.
⚬ ⚬ ⚬
‘Adaptive’ Evaluation was conceived at Imago Global Grassroots, which was founded to address the ‘missing middle problem’ in traditional development efforts, where top-down programs by governments and multilaterals often fail to reach the most vulnerable, while local initiatives struggle to scale. Imago began working from the bottom up with grassroots organizations to help them scale, starting in India with SEWA, SRIJAN, PRADAN, Pratham, and the Transform Rural India Foundation. It then expanded to Latin America and North America, working with Paraguay’s Poverty Stoplight and the Harlem Children’s Zone, among others.
Working with these organizations, it became clear that scaling up involves understanding, navigating, and, in particular, changing systems. This process is typically non-linear and often unpredictable. Adaptive Evaluation was born of the need to restructure existing measurement methodologies to support innovation and scaling processes that seek to address complex development questions.
Adaptive Evaluation is suited to innovating and scaling change processes for several reasons. It attempts to understand approaches and solutions within the context of the larger system, not just a proposed solution’s impact on a target group. It is inherently participatory, with an emphasis on co-creation with implementation teams and end-users to inform what works and why. It emphasizes learning through short cycles of testing and integrating insights. It entertains multiple hypotheses, adapting them to emerging field realities. It uses a variety of techniques, both quantitative and qualitative, from various disciplines, including systems thinking, design thinking, economics, and the social sciences. Finally, an Adaptive Evaluation is a continuous journey, not limited to fixed data collection points, aligning with the ongoing nature of change processes. In sum, it builds on existing traditions of evaluation, including “developmental” and “realistic” evaluation, with more structure, especially around techniques of systems diagnosis, hypothesis testing, and iteration.
This contrasts with standard impact evaluations, which test only a few hypotheses about the solution’s impact, with limited attention to how the solution interacts with the larger system, and which use a narrow range of techniques, namely RCTs and quasi-experimental methods. Moreover, outside of a few exceptions, standard impact evaluations are often extractive, maintain distance from implementation, and measure only at fixed points of baseline, midline, and endline.
In what follows, we describe the approaches and process of an Adaptive Evaluation. Imago is currently engaged in five Adaptive Evaluations involving: (1) the design of an enterprise support system for SEWA’s enterprises in India; (2) the activation of civil society and government to empower rural women in two states in India through convergence of rural development programs; (3) two projects on the design of programs to support learning recovery of children in Brazil affected by the COVID-19 school close-downs; and (4) the development of an adaptive monitoring and learning strategy for a portfolio of solutions to address gender-based violence with UNDP Tunisia.
⚬ ⚬ ⚬
An Adaptive Evaluation uses three main approaches: systems diagnosis, theory-based assessment of change processes, and iterative design, outlined below.
Change processes often involve complex systems with multiple actors, entities, and interactions shaping an innovation’s success or failure. Yet systems diagnoses are often absent from development discourse.
In our experience, an effective process involves building a system map in a workshop using design thinking techniques. Participants, including key actors in the system, list stakeholders and collectively answer questions about interrelationships, incentives, power dynamics, and mindsets. We capture insights using figurines, post-its, and markers, later digitizing them for examination and deliberation.
Identifying blockages, levers, and potential pathways is crucial in assessing the system map. Participants discuss how potential solutions flow through the system, identifying touchpoints for effectiveness and managing resistance. These pathways form initial theories of change. Role-playing specific situations in a solution pathway can enhance empathy and add insight.
System insights can be further explored through “circular” interviews, a qualitative technique in which various actors are interviewed to understand their relationships and uncover tensions and dependencies. There are also formal mathematical system modeling techniques. Careful use of formal modeling can bring clarity, but excessive reliance on it can lead to rigidity and opaqueness, hindering practical implementation.
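To make the idea concrete, a formal model can be as simple as a stock-and-flow simulation. The sketch below is a purely hypothetical illustration (all names and parameters are invented, not drawn from Imago’s work), tracking how a pool of sales agents might evolve under assumed recruitment and dropout rates:

```python
# Minimal stock-and-flow sketch of a hypothetical system model.
# The "stock" is the number of active agents; the "flows" are
# recruitment (inflow) and dropout (outflow). All rates are invented.

def simulate(months=24, agents=100.0, recruit_rate=0.08, dropout_rate=0.05):
    """Return the month-by-month trajectory of the agent pool."""
    history = [agents]
    for _ in range(months):
        agents += agents * recruit_rate - agents * dropout_rate
        history.append(agents)
    return history

trajectory = simulate()
# With recruitment outpacing dropout, the pool grows steadily;
# reversing the rates would show it shrinking instead.
```

Even a toy model like this can sharpen workshop discussions, because participants must state rates and relationships explicitly; the risk the text notes is treating such a model as the system itself.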
Systems diagnosis is often the first step of an Adaptive Evaluation. It should be periodically updated as users engage with the system and develop new insights, and, of course, as the system itself changes over time.
Theory-based assessment builds testable hypotheses from a solution’s change pathway within the system, delineating causal steps and touchpoints with various actors. These are typically most effectively co-created with implementers and users.
A crucial step is examining the assumptions and hypotheses underlying each step in a theory of change and how these can be tested with existing or new data. At this point, the use of standard impact evaluation techniques, including RCTs, may be feasible for specific tests, but these are undertaken within a sequence of dynamic and continuous learning. Process tracing, using quantitative and qualitative evidence, is a core technique, as it allows for frequent hypothesis updating against unfolding evidence. Conceptually, causal hypotheses are explored through logical diagnostic tests as opposed to statistical causal testing.
To illustrate, consider sales training for women sales agents at LSM to improve their income. This requires the following causal steps: a well-designed training program, active agent participation, high recall of lessons, and applying skills to work — resulting in increased sales commissions. Simple evidence like interviews, administrative data, surveys, and agent follow-ups can test each step’s effectiveness. Logical inference can be used to draw conclusions from the data in most cases; this is mostly just common sense and doesn’t require any statistical expertise! If it is found, for example, that training attendance was low, it is safe to infer that the sales training solution is not enhancing the agents’ income, let alone their agency. Alternatively, an RCT may be employed, where agents are randomly assigned to different interventions. This often depends on the intervention partners’ capacity and willingness to randomize across groups.
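The logical inference above can be sketched in a few lines of code. In this illustrative sketch (step names and evidence are hypothetical), each causal step is treated as a necessary condition the theory of change must pass, so the earliest failing step undermines everything downstream:

```python
# Process-tracing sketch: each causal step is a necessary condition
# for the next, so evidence against an early step undermines the
# whole chain. Step names and evidence below are hypothetical.

CAUSAL_CHAIN = [
    "training program well designed",
    "agents attend training",
    "agents recall lessons",
    "agents apply skills in sales work",
    "sales commissions increase",
]

def first_broken_step(evidence):
    """Return the earliest step whose evidence fails, or None if all pass."""
    for step in CAUSAL_CHAIN:
        if not evidence.get(step, False):
            return step
    return None

# Hypothetical field evidence: attendance was low.
evidence = {
    "training program well designed": True,
    "agents attend training": False,  # e.g. attendance below 50%
}

print(first_broken_step(evidence))  # -> agents attend training
```

The point is not the code but the logic: no statistics are needed to conclude that a training no one attends cannot be raising incomes.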
For evaluation enthusiasts, some of the techniques that may be used — in order of increasing technical sophistication and decreasing scope for participation — are outcome harvesting, process tracing, RCTs, and quasi-experimental designs.
Iterative designs are fundamental to innovation and scaling. They put insights from theory-based assessments into action through repeated cycles of experimentation, testing, reflection, and learning. Iterative designs are related to agile implementation, design thinking, and Problem Driven Iterative Adaptation. They require close coordination with partners to ensure evaluation insights are incorporated into the design. The length of the iterative cycles varies with the problem, ranging from rapid A/B testing to longer cycles with more extensive data collection. Identifying positive outliers and distilling lessons are common techniques.
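At the rapid A/B testing end of that spectrum, a single cycle’s comparison can be checked with very light statistics. The sketch below (all sales figures are hypothetical, not LSM data) uses a simple permutation test to ask whether the gap between two incentive designs could plausibly be noise:

```python
import random
import statistics

def permutation_pvalue(a, b, n_iter=10_000, seed=0):
    """Share of random label shuffles producing a mean gap at least as
    large as the observed one -- a rough guard against reading noise
    as a design improvement."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        gap = abs(statistics.mean(pooled[:len(a)]) -
                  statistics.mean(pooled[len(a):]))
        if gap >= observed:
            hits += 1
    return hits / n_iter

# Hypothetical monthly sales (INR) under two incentive designs.
control = [420, 380, 510, 450, 390, 470, 430, 400]
incentive = [560, 610, 480, 590, 540, 620, 500, 570]
p = permutation_pvalue(control, incentive)
# A small p-value suggests the incentive arm's higher sales are
# unlikely to be chance, warranting another iteration at larger scale.
```

A test this light fits inside a short cycle; fuller RCT machinery can wait, as the text notes, until repeated iterations have produced a robust protocol.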
A great example of iterative learning is BRAC’s 1970s poultry raising program, aimed at providing women working at home with additional income. BRAC initially introduced high-yield chickens, but they proved susceptible to diseases, requiring vaccines and antibiotics. This created a new hurdle as the women lacked vaccine administration knowledge. BRAC then designed specialized programs to train local women as veterinary professionals. Another hurdle emerged for vaccine storage in areas without electricity. After several unsuccessful attempts, they ingeniously used bananas as natural coolers! But challenges persisted: the chickens faced diet issues. BRAC embarked on a years-long quest, experimenting to create the best chicken feed formula for high survival rates, even producing their own maize seeds to improve feed quality.
Another challenge was the distribution of chickens. BRAC addressed this by training women as professional chicken rearers and investing in incubation and trucking facilities. It took 25 years of continuous efforts to establish a modern and profitable mill. BRAC’s unwavering commitment and experimental approach led to its success.
These three approaches form the basis of Adaptive Evaluation, with a range of specific tools in each area. Navigating their mix can be challenging for organizations involved in innovation and scaling. There are four criteria that guide the selection of approaches, as briefly outlined here.
The Type of Question: Techniques such as RCTs, which determine whether a solution works, are often misapplied to assess how it works and scales. Instead, system-based approaches probe system functioning and pathways for change, theory-based assessments test causal mechanisms, and iterative designs tackle questions around obtaining and using feedback.
The Level of Complexity: Complexity (as defined by Bamberger et al.) increases with solution components, stakeholders, end-user diversity, and geographical spread. Techniques like RCTs and quasi-experimental designs suit lower complexity levels, while system mapping, process tracing, case studies, and positive deviance are more suitable for higher complexity levels. Techniques for higher complexity can be used for lower complexity levels, albeit with lower causal attribution, but the opposite is not usually true. For instance, user journeys or beneficiary interviews also work for low complexity, but RCTs may not be feasible at higher complexity levels due to difficulties with randomization in complex settings.
The Stage of Innovation and Scaling: Early stages of innovation benefit from testing what works using theory-based or iterative designs. Scaling up, however, demands more focus on system-based approaches, as it involves reaching more beneficiaries. Iterative cycles may be shorter initially, but longer for higher levels of scale. RCTs become beneficial after repeated innovation cycles lead to more robust protocols that can be tested more extensively. RCTs may also help to convince donors and funders, but they are of course only possible if there is potential to randomize.
Time, Cost, and Effort: Different approaches vary in time, cost, and effort implications. Full-blown RCTs and quasi-experimental designs tend to be more costly and time-consuming. Qualitative interviews and journey mapping usually cost less. Process tracing varies substantially depending on the depth of analysis but can be designed to be cost-effective.
⚬ ⚬ ⚬
The Adaptive Evaluation process consists of three phases: interpretation, innovation, and scaling. These phases do not always occur sequentially and often happen in repeated cycles during the evaluation journey. To illustrate each phase, we will use an example of a social enterprise aiming to empower women in rural India, inspired by Imago’s work.
The interpretation phase of an Adaptive Evaluation involves developing a comprehensive understanding of the problem and an initial examination of solutions. Thus, the interpretation phase for a social enterprise aiming to expand women’s agency involves examining general and specific theories around the critical consciousness of gender inequities, patriarchy, work practices, and decision-making, among others. A common activity of this phase of the evaluation is a system workshop with key stakeholders, including, among others, members of the social enterprise, rural women, and government officials. This should be complemented with historical accounts and data on household decision-making and practices wherever possible. The phase concludes with identifying potential solutions, such as creating a gender resource center, based on learnings from all these sources. Hypotheses are built for each step of associated theories of change.
The innovation phase involves cycles of prototyping, testing, and refinement, using agile, design thinking, or similar processes. In our gender resource center example, design thinking follows five steps. Empathizing entails listening to rural women describe barriers to their entitlements; defining synthesizes these collective voices into a problem statement. Ideating involves brainstorming solutions for the gender resource center’s design and implementation. Prototyping builds out one idea, and testing evaluates its effectiveness, typically using theory-based methods. Lessons from prototyping update hypotheses and system understanding. Throughout, the innovation phase draws on rural women’s feedback and quick cycles of learning and discovery.
Scaling is crucial to development and evaluation. It demands a nuanced understanding of the system and an iterative process, with longer cycles and focused tracking of implementation and standardization. Before going to scale, it is important to ensure that solutions have been prototyped and proven effective. Systems diagnosis of the scaling domain is again key. Gender resource centers may scale through expansion of the social enterprise, replication by other organizations, or strategic partnerships. A needs assessment then identifies resource gaps for scaling, leading to further iterative phases of design adaptation across the scaled contexts.
⚬ ⚬ ⚬
We believe that coping with these challenges is the only way that evaluations can be a means to pursue social change. Philosopher Roberto Unger saw evaluation as a tool “to associate the explanation of what exists with the imagination of transformative opportunity. Not some horizon of ultimate possibles but the real possible which is always the adjacent possible…” Adaptive Evaluation is just a step toward realizing this grand vision of change.