Erdoo Karen Jay-Yina

Senior Evaluation Officer
CGIAR's Independent Advisory and Evaluation Service (IAES)

Evaluation, Public Health, MEL, Storytelling.

My contributions

    • I enter this rich discussion from my experience managing the ongoing evaluation of the CGIAR GENDER (Generating Evidence and New Directions for Equitable Results) Platform, which is being coordinated by IAES. From this vantage point, I explore questions 2, 3 and 1 in some detail, beginning with an overview of the evaluation context and the design of the evaluation, and closing with one key takeaway from applying the guidelines.


      The guidelines present four interlinked dimensions (research design, inputs, processes and outputs) that consider the many variables in the delivery and uptake of high-quality research, framed by the QoR4D frame of reference and the OECD DAC criteria. Their application is by no means linear. The ongoing GENDER Platform evaluation served as a test case. The evaluation aims to assess the Platform’s progress, document lessons learned, and provide forward-looking recommendations as the Platform transitions to an expanded mandate as an impact Platform.

      Although the evaluation was not framed around an explicit “quality of science” (QoS) criterion, the guidelines were a useful toolbox for situating QoS in an agricultural research for development (AR4D) context while answering the central evaluation questions against five DAC evaluation criteria: relevance, effectiveness, efficiency, coherence, and sustainability. The Platform evaluation, which was conducted by a multidisciplinary team led by an evaluator, integrated participatory, theory-driven, utilisation-focused and feminist approaches and used mixed methods in data collection.

      By way of context, the GENDER Platform synthesizes and amplifies research, builds capacity, and sets directions to enable CGIAR to have an impact on gender equality, opportunities for youth, and social inclusion in agriculture and food systems. The Platform is organized around three interconnected modules (Evidence, Methods and Alliances). The guidelines were applied to the Evidence module, which aims to improve the quantity and quality of gender-related evidence.


      In terms of the evaluation design, and in line with the inception report, the evaluation team developed sub-evaluation matrices that addressed the impact pathways and results framework of the Platform’s modules. These sub-matrices fed into an overarching parent evaluation matrix. The sub-matrices, the overarching matrix and other outputs were reviewed by a team of external peer reviewers, including some members of IAES’s evaluation reference group, and by the Platform team to strengthen their validity. The reviews informed subsequent revisions of the documents.

      The four QoS dimensions have been integral to evaluating the Evidence module. The subject matter experts who led the Evidence module assessment mapped the four dimensions to the focal evaluation criteria and then applied them systematically, in a nested manner, based on that mapping. Each of the Platform’s three module assessments then fed into the overarching Platform evaluation in a synergistic manner.


      From this test case, one of several takeaways is that the convergence of different lenses is pivotal in applying the guidelines. The multidisciplinary evaluation team, in this case, benefited from both an “evaluator lens”, being led by an evaluator, and a “researcher lens”, with subject matter experts who were gender researchers leading the assessment of the Evidence module. In applying the guidelines, the evaluation team straddled both perspectives to unpack the central evaluation questions mapped along the four QoS dimensions. Multidisciplinary evaluation teams may not be feasible in every context, but where they are, such multidisciplinarity may prove valuable in applying the guidelines. However, it is essential that such teams invest sufficient time in capacity sharing and cross-learning to shorten the learning curve toward the convergence needed to assess “QoS” effectively, or to mainstream it along the standard OECD DAC criteria as was done in this case. The guidelines (and other derivative user-friendly products) can serve as a ready-to-use resource in both cases.

      High-quality research can be as challenging to assess as it is to deliver. Researchers, program managers, and other actors may also find the guidelines useful as a framing tool for thinking through evaluator perspectives at the formative and/or summative stages of the research or programming value chains for more targeted implementation and programming strategies. Application of the guidelines in process and performance evaluations across different contexts and portfolios will reveal insights to further strengthen and refine the tool.

      Finally, the GENDER Platform evaluation report and the Evidence module assessment, which detail the application of the guidelines, will soon be released by IAES.

    • Dear John and colleagues,

      Excellent question, which sparked reflections based on insights from the recently completed independent reviews of 12 CGIAR research programmes (CRPs). CRPs are global research-for-development programmes covering themes that range from single-crop programmes like RICE to integrated cross-cutting programmes like Climate Change, Agriculture & Food Security (CCAFS).

      How does your own work relate to the topic question?

      Following an earlier announcement, the evaluation function of the CGIAR Advisory Services recently completed independent, rapid reviews that covered the quality of science as well as the effectiveness of the outcomes achieved, zooming in on progress along the Theory of Change (ToC) and on the usefulness of the ToC.

      What is a real-world example of a localized project design?

      The evidence need not be decoupled from the risks and assumptions: together they give a big picture of the ground realities, irrespective of the size and type of intervention, programme or initiative. To put things in perspective, the ToC for CRPs are layered. First, every CRP has a ToC that contributes to the overall CGIAR Strategy and Results Framework. Cascading down, each CRP in turn has different Flagship programmes (FPs), and each FP contributes through specific impact pathways nested within the overall ToC. The FP ToC were co-designed and developed in collaboration with project teams, reflecting a bottom-up approach, and this process was much appreciated overall in the reviews.

      The CRP reviews found that, although most CRPs incorporated evidence from previous independent evaluations and impact assessments, both at conceptualization and during implementation, the ToC had varying levels of use and evolution. For some of the CRPs, the reviews found value in the process, in cultivating ToC-thinking even among scientists, but limited evidence of the ToC being used as a measurement tool linked to the results framework.

      What would an evidence-based, evolving Theory of Change look like for that project?

      Given the global nature of CGIAR and of the majority of CRPs, grounding in the context has been key. ToC are very context- and programme-specific. Framed within its context, one of the CRPs (Forests, Trees and Agroforestry, FTA) had a considerably evolved use of its ToC, with adapted annual targets and indicators suited to the field realities. Other CRPs, such as WHEAT, did not make any changes to their ToC. One of the conclusions of the WHEAT review was that its ToC was good for “(1) priority setting, (2) assessing contribution of scientific outputs, (3) seeking and justifying funding, (4) mapping trajectory to impact and (5) reporting” but unsuitable for assessing the effectiveness of the CRP or its flagships. Why? The review report says “because that was not its purpose.” Intentionality matters when developing ToC. So as not to limit their usage in evidence generation, learning, reporting and reprogramming, the teams developing and iterating a ToC have to be intentional about co-designing it as an iterative evidence tool: tying in the indicators, linking the drivers and risks, and testing the assumptions and causal pathways.

      What opportunities and obstacles do you see?

      Adaptive management was not found to be necessarily tidy: revising the ToC based on evidence, assumptions and risks could make the process, as well as the aggregation and reporting of results, cumbersome. Yet this can be managed if reporting is consistently structured around the indicators and targets linked to the (updated or revised) ToC. The suite of metrics has to reflect the design, implementation and scale-up of scientific innovations on the ground, and it has to be flexible, useful and coherent enough to give a clear picture of progress. The context, in turn, has to promote a learning-by-doing approach. When the underpinning ToC, the evolution of the system and CRP metrics, and the evidence, with its associated risks and assumptions, are revisited, captured and tracked coherently, process tracing or contribution analysis of particular causal pathways becomes easier. On the other hand, when ToC are not context-specific (in time and place), as was the case in one of the CRPs (Grain, Legumes and Dryland Cereals, GLDC), accurate reflection on progress is challenging, as some ToC impact pathways may become obsolete.

      Reading the other responses has been interesting; your question clearly sparked an intriguing discussion. Should you and colleagues be interested in more information on earlier reflections around ToC in CGIAR and on the actual CRP reviews, you can check out the hyperlinks.

      Best regards,

      Erdoo Karen Jay-Yina