How to evaluate science, technology and innovation in a R4D context? New guidelines offer some solutions

11 contributions


Dear colleagues,

The Evaluation Function of CGIAR would like to reopen last year’s discussion on “How to evaluate science, technology and innovation in a development context?” Contributions received last year were a key building block of the Evaluation Guidelines on Applying Quality of Research for Development Frame of Reference to Process and Performance Evaluations (with FAQ tab)!

In February, we organized a workshop to usher in the launch of the beta version of the Evaluation Guidelines and to foster a common understanding among evaluators and subject matter experts of approaches and entry points to evaluating the quality of science (QoS) in CGIAR and in like-minded organizations, i.e. FAO, GEF, UNEP and IDRC (see participants in Annex). The workshop allowed us to draw broader lessons from assessing and evaluating QoS, and to identify opportunities to roll out and monitor the use and uptake of the Guidelines in CGIAR and beyond. The first in the series of reflections from the participants is a Q&A with Juha Uitto, Director of the GEF evaluation office.

We would like to hear your reflections on the beta version of the Evaluation Guidelines (make sure to read the FAQs; the Spanish version will be available on 15 May):

  1. Do you think the Guidelines respond to the challenges of evaluating quality of science and research in process and performance evaluations?
  2. Are the four dimensions (Research Design, Inputs, Processes, and Outputs) clear and useful to break down during evaluative inquiry? (see section 3.1)
  3. Would a designated quality of science (QoS) evaluation criterion capture the essence of research and development (section 3.1)?
  4. Do you have experience of using other evaluation criteria to evaluate interventions at the nexus of science, research, innovation and development? Please describe and cite.
  5. What are additional data collection and analysis methods that should be elaborated for evaluating science and research in process and performance evaluations? (see textbox 3, figure 8 and tables 5, 6 and 8)
  6. How can CGIAR support the roll-out of the Guidelines with the evaluation community and like-minded organizations?

Many thanks in advance!


  • Many thanks for sharing the document and for the opportunity to comment. I looked at the guidelines from the perspective of an evaluator with limited knowledge of the CGIAR system embarking on a new evaluation.

    For someone in my position, the guidance provides accessible background on the CGIAR approach, definitions, and so on, and useful links to other relevant materials. Overall, the guidance provides an interesting conceptual framework in Ch 3, a flexible guide in Ch 4, and a compendium of methods and questions that would also be useful in other evaluation contexts.

    It’s challenging to design a streamlined approach to look at impact over extended timeframes for such a wide range of research outputs. There is perhaps too much emphasis on uptake via formal academic publications. This is somewhat balanced by the efforts to look at impacts in real time in systems research, and there are some excellent (CG) examples of multidisciplinary studies in this regard. I didn’t see any reference to the use of evaluation case studies. These proved useful in grounding an earlier CGIAR programme evaluation I was involved with, and they captured pathways to development outcomes that may not be reflected in formal literature or project reporting.

  • I think the guidelines are a set of recommendations for evaluating projects or organisations. If CGIAR wants to help roll them out, they could perhaps promote them to their community and the organisations they support.

    • They could also organise training sessions to help people understand how to apply the guidelines in their day-to-day work.
    • They could create online resources, such as videos or guides, to help people better understand the guidelines and how to apply them.
    • They could also work with partners to develop tools and methodologies for evaluating projects or organisations using the guidelines.
    • They could work with partners to develop mentoring programmes to help organisations apply the guidelines and improve over time.
    • They could organise events to promote the guidelines and create opportunities for stakeholders to meet and exchange ideas on how to apply them in their work.
    • These events could also be used as a platform to share examples of projects or organisations that have successfully applied the guidelines and to discuss lessons learned.


    As an agricultural technician with a lot of experience, I would suggest helping to set up resource people who can explain the guidelines to stakeholders and answer their questions. You could also contribute to the creation of resources to help people better understand the guidelines and their application. You could propose that CGIAR work with other organisations to develop tools and methodologies for assessing projects or organisations using the guidelines. You could also propose that CGIAR organise events to promote the guidelines and create opportunities for stakeholders to meet and exchange ideas on how to apply them in their work.

  • How can CGIAR support the roll-out of the Guidelines with the evaluation community and like-minded organizations?

    I believe that CGIAR can help like-minded organizations use the guidelines by emphasizing their best feature: flexibility.

    Flexibility is necessary. The guidelines were informed by the work of CGIAR, which is tremendously varied. A common evaluation design would not be appropriate for CGIAR. Neither would it be appropriate for most like-minded organizations.

    Flexibility is a middle ground. Instead of using a common evaluation design, each project might be evaluated with a one-off bespoke design. Often this is not practical. The cost and effort of individualization limit the number, scope, and timeliness of evaluations. A flexible structure is a practical middle ground. It suggests what like-minded organizations and their stakeholders value and provides a starting place when designing an evaluation.

    Flexibility serves other organizations. The very thing that makes the guidelines useful for CGIAR also makes them useful to other organizations. Organizations can adopt what is useful, then add and adapt whatever else meets their purposes and contexts.

    Perhaps CGIAR could offer workshops and online resources (including examples and case studies) that suggest how to select from, adapt, and add to its criteria. It would not only be a service to the larger community, but a learning opportunity for CGIAR and its evaluation efforts.

  • Thanks, Seda, for your important question. As the Guidelines state several times, they were informed by the International Development Research Centre (IDRC) RQ+ Assessment Instrument. Hence some useful ideas and suggestions from a development organization are an integral part of the Guidelines.

    Perhaps the easiest way to answer your question is to use Table 7 on p. 19, “Qualitative data themes, indicators per Quality of Science dimension with assessment criteria”. This table was developed for evaluating CGIAR research for development projects. As far as I can see, most of the themes and indicators of quality in a science-based research for development project are just as relevant to evaluating quality in a development project. Under design, as an evaluator I would want to know whether the design was coherent and clear and whether the methodologies fit the planned interventions. Under inputs, I would be looking at the skill base and diversity of the project team, whether the funding available was sufficient to complete the project satisfactorily, and whether the capacity building was appropriate for planned activities and would be sufficient to sustain impact after the project finished. Under processes, my main questions would concern the recognition and inclusiveness of partnerships, whether the roles and responsibilities were well defined, and whether there were any risks or negative consequences that I should be aware of. Finally, under outputs, I would be interested in whether the communication methods and tools were adequate, whether planned networking included engagement of appropriate and needed stakeholders, whether the project was sufficiently aware of whether the enabling environment was conducive to its success, whether, where relevant, links were being made with policy makers, and whether scaling readiness was part of stakeholder engagement.

    Section 4 of the Guidelines, on the Key Steps in Evaluating Quality of Science in research for development, proposes methods which are also relevant to development projects. These include review of documents, interviews, focus group discussions, social network analysis, the Theory of Change and the use of rubrics to reduce subjectivity when using qualitative indicators. The use of rubrics is a cornerstone of the IDRC RQ+ Assessment Instrument.

  • Do you think the Guidelines respond to the challenges of evaluating quality of science and research in process and performance evaluations?

    As an international evaluation expert, I have been fortunate to evaluate a large range of projects and programs covering research (applied and non-experimental), development and humanitarian interventions. Over the past decade, I have had opportunities to employ various frameworks and guidelines to evaluate CGIAR projects and program proposals, especially with the World Agroforestry Center (ICRAF) and the International Institute of Tropical Agriculture (IITA) in Central Africa (Cameroon & Congo). For example, when leading the final evaluation of the Sustainable Tree Crops Programme, Phase 2 (PAP2CP), managed by IITA-Cameroon, the team and I revised the OECD-DAC framework and criteria to include a science criterion addressing research dimensions such as inclusion and exclusion criteria.

    When designing high-quality research protocols for a science evaluation, establishing inclusion and exclusion criteria for study participants is a standard and required practice. Inclusion criteria are defined as the key features of the target population that the evaluators will use to answer their research question (e.g. demographic and geographic characteristics of the targeted location in the two regions of Cameroon). These criteria are important for understanding the area of research and gaining better knowledge of the study population. Conversely, exclusion criteria cover features of potential study participants who meet the inclusion criteria but present additional characteristics that could interfere with the success of the evaluation or increase their risk of an unfavorable outcome (e.g. characteristics of eligible individuals that make them highly likely to be lost to follow-up, miss scheduled appointments to collect data, provide inaccurate data, have comorbidities that could bias the results of the study, or increase their risk of adverse events). These criteria can also be considered, to some extent, as part of the cross-cutting themes, but they are still not covered by the OECD-DAC evaluation criteria and framework, and can therefore become a challenge when evaluating the quality of science and research in performance evaluations.

    Are four dimensions clear and useful to break down during evaluative inquiry (Research Design, Inputs, Processes, and Outputs)? 

    A thorough review of the four dimensions shows that they are clear and useful, especially when dealing with a mixed methods approach involving both quantitative and qualitative methods and adequate indicators. However, given that context and rationale are always the best drivers of objectivity for the research design and for research processes, including the collection of reliable and valid data and evidence to support decision-making, it is very important that evaluators not only define appropriate inclusion and exclusion criteria when designing scientific research but also evaluate how those decisions will affect the external validity of the expected results. Making those judgments requires in-depth knowledge of the area of research (context and rationale), as well as of the direction in which each criterion could affect the external validity of the study (in addition to the four dimensions).

    Serge Eric

  • Dear Seda, what a great contribution. Thanks. Proving the quality of science is important, yet insufficient. It, and the explanations around it, fall short for an organisation claiming its research programme is for development. The guidelines feel faint on this. As you say, and as I alluded to in my response, you want them to be a stronger, more compelling read.

  • I would like to complement the interventions below, and also the FAQ question about the quality of research vs. the development programme. As hinted by their title, the guidelines focus primarily on the scientific aspect. Of note here is the "outputs" dimension, which refers to the quality of research outputs and contributions to the advancement of science. I think the authors could more clearly identify how concrete development results related to a particular research area can be considered as well. This is still not clear to me in the other three dimensions. Could you perhaps point us to the relevant parts of the guidelines on this?

  • Do you think the Guidelines respond to the challenges of evaluating quality of science and research in process and performance evaluations?

    On February 27 and 28, 2023, I attended a workshop in Rome, Italy, on the new set of evaluation guidelines from CGIAR’s Independent Advisory and Evaluation Service (IAES). These build on the Quality of Research for Development (QoR4D) Frame of Reference of the CGIAR Independent Science for Development Council (ISDC), and provide the framing, criteria, dimensions, and methods for assessing QoR4D, both within CGIAR and in other like-minded organizations. The hybrid online and in-person event was designed to help practitioners across and beyond the CGIAR system to understand and apply the new guidelines in their own evaluative contexts.

    I found the workshop informative, resourceful and impactful. The main lessons learnt/takeaways for me from the workshop were:

    • Improved evaluation processes to assess the success and effectiveness of quality of science to provide evidence for policymaking;
    • The value of exchange of skills/experience between facilitators and participants for a particular evaluation project;
    • Sharing and documenting best practices, drawing on the knowledge and experience of IAES;
    • The development of working ‘standards’ or principles to ensure effective engagement with donors and relevant actors; and
    • Supporting and advocating for public funding opportunities and supporting government by identifying where we can build capacity for effective innovation support and how we can effectively monitor and evaluate publicly funded projects.

    One challenge remains: how to apply QoR4D to evaluate contributions to the SDGs.

    I will use my takeaways from the workshop in my next book chapter, entitled “Nature-based solution to preserve the wetlands along the critical zone of River Nyong”, as well as in my MSc lecture.

    Dr Norbert Tchouaffe
    Pan-African Institute for Development

    1. Do you think the Guidelines respond to the challenges of evaluating quality of science and research in process and performance evaluations?

    Having been involved in evaluating CGIAR program and project proposals as well as program performance over the past decade, I have used an evolving range of frameworks and guidelines. For the 2015 Phase I CRP evaluations, we used a modified version of the OECD-DAC framework including the criteria relevance/coherence, effectiveness and impact and sustainability. The lack of a quality of science criterion in the OECD-DAC framework was addressed, but quality of science was evaluated without designated elements or dimensions. Partnerships were evaluated as cross-cutting, and the evaluation of governance and management was not directly linked to the evaluation of quality of science. For the 2020 Phase II CRP evaluative reviews, we used the QoR4D Frame of Reference with the elements relevance, credibility, legitimacy and effectiveness, together with three dimensions: inputs, processes and outputs. Quality of science was firmly anchored in the elements credibility and legitimacy, and all three dimensions had well-defined indicators. During the 2020 review process, the lack of a design dimension was highlighted, given its importance in evaluating coherence, methodological integrity and fitness, as well as the comparative advantage of CGIAR in addressing both global and regional problems.

    The beta version of the Evaluation Guidelines encapsulates all of these valuable lessons learnt from a decade of evaluations and, in this respect, it responds to the challenges of evaluating quality of science and research in process and performance evaluations. During its development, other evaluation frameworks and guidelines were also consulted to gain a greater understanding of the evaluation of both research and development activities. As a result, it is flexible and adaptable, and thus useful and usable by research for development organizations, research institutes and development agencies.

    Recently, the Evaluation Guidelines were used retrospectively to revisit the evaluative reviews of 2020 Phase II CRPs with a greater understanding of qualitative indicators in four dimensions. Application of the Guidelines provided greater clarity of the findings and enhanced the ability to synthesize important issues across the entire CRP portfolio.

    2. Are four dimensions clear and useful to break down during evaluative inquiry (Research Design, Inputs, Processes, and Outputs)? (see section 3.1)

    The four dimensions are clear and useful especially if accompanied by designated criteria with well-defined indicators. They are amenable to a mixed methods evaluation approach using both quantitative and qualitative indicators. In addition, they provide the flexibility to use the Guidelines at different stages of the research cycle from proposal stage where design, inputs and planned processes would be evaluated to mid-term and project completion stages where outputs would then become more important.

    3. Would a designated quality of science (QoS) evaluation criterion capture the essence of research and development (section 3.1)?

    From my own use of the quality of science criterion with intrinsic elements of credibility (robust research findings and sound sources of knowledge) and legitimacy (fair and ethical research processes and recognition of partners), it captures the essence of research and research for development. Whether it will capture the essence of development alone will depend on the importance of science to the development context.

  • Dear Svetlana,

    Hi and thanks for the opportunity to comment on the guidelines. I enjoyed reading them, yet only had time to respond to the first two questions.

    My responses come with a caveat: I do not have a research background. However, while working with agricultural scientists, I observed that the then-current preoccupation with assessing impact among the ultimate client group, as gauged by movements in the relative values of household assets, tended to mask the relative lack of information and interest about the capacity and capabilities of local R&D / extension systems before, during, and after investment periods. Their critical role in the process often got reduced to being treated as assumptions or risks to "good" scientific products or services.

    This made it difficult to link any sustainable impact among beneficiaries with information on institutional capacity at the time that research products were being developed. This may also have explained how believing in (hopelessly inflated) rate of return studies required a suspension of disbelief, thus compromising the prospects of efforts to assess the impact of research making much difference among decision-makers.

    Moving on: my responses to two of your questions follow, and I hope you find some of them interesting, useful even.

    1.    Do you think the Guidelines respond to the challenges of evaluating quality of science and research in process and performance evaluations?

    Responding to this question assumes/depends on knowing the challenges to which the guidelines refer. In this regard, Section 1.1 is a slightly misleading read given the title. Why?

    The narrative neither spells out how the context has changed nor, therefore, how and why these changes pose challenges to evaluating the Quality of Science. Rather, it describes CGIAR’s ambition for transformative change across system transformation (a tautology?), resilient agri-food systems, genetic innovation, and five (unspecified) SDGs. And it concludes by explaining that, while CGIAR funders focus on development outcomes, the evaluation of CGIAR interventions must respond to both the QoR4D (research oriented to deliver development outcomes) and OECD/DAC (development orientation) frameworks.

    The reasons given for the insufficiency of the 6 OECD DAC criteria in evaluating CGIAR’s core business (the unpredictable and risky nature of research and the long time it takes to witness outcomes) do not appear peculiar to CGIAR’s core business, relative to other publicly funded development aid. Yes, it may take longer given the positioning of the CG system but, as we are all learning, operating environments are as inherently unpredictable as the results. Context matters. Results defy prediction; they emerge. Scientific research, what it offers, and with what developmental effect is arguably not as different as the guidelines suggest. As for evaluating scientific research, the peculiarity is whom CGIAR employs and the need to ensure a high standard of science in what they do: its legitimacy and credibility. The thing is, it is not clear how these two elements, drawn from the QoR4D frame of reference, cover, so to speak, the peculiarities of CGIAR’s core business and so fill the gap left by the 6 OECD DAC criteria. Or am I missing something?

    The differences between Process and Performance Evaluations are not discernible as defined at the beginning of Section 2.2. Indeed, they appear remarkably similar, so much so that I asked myself: why have two when one would do? Process Evaluations read as summative self-assessments across CGIAR, and outcomes are in the scope of Performance Evaluations. Performance Evaluations read as more formative and repeat lines of inquiry similar to those of Process Evaluations: assessing organisational performance and operating models as well as processes, against organisational functioning, instruments, mechanisms and management practices together with assessments of experience with CGIAR frameworks, policies, etc. There is no mention of assumptions. Why, given the “unpredictable and risky nature of research”? Assumptions, by proxy, define the unknown, and for research managers and (timely) evaluations they should be afforded an importance no less than the results themselves. See below.

    The explanation of the differences between the Relevance and Effectiveness criteria as defined by OECD/DAC and by QoR4D in Table 2 is circumscribed. While the difference to do with Relevance explicitly answers the question of “why CGIAR?”, that for Effectiveness is far too vague (to forecast and evaluate). Why can the reasons CGIAR delivers knowledge, products, and services (to address a problem and contribute to innovative solutions) not be framed as objectives and/or results? Especially when the guidelines claim Performance Evaluations will be assessing these.

    2. Are four dimensions clear and useful to break down during evaluative inquiry (Research Design, Inputs, Processes, and Outputs)? (see section 3.1)

    This section provides a clear and useful explanation of the four interlinked dimensions – Research Design, Inputs, Processes, and Outputs in Figure 3 that are used to provide a balanced evaluation of the overall Quality of Science. 

    A few observations:

    “Thinking about Comparative Advantage during the project design process can potentially lead to mutually beneficial partnerships, increasing CGIAR’s effectiveness through specialization and redirecting scarce resources toward the System’s relative strength”.…

    1)    With this in mind, and as mentioned earlier in section 2.3, it would be useful to explain how the research design includes proving, not asserting, that CGIAR holds a comparative advantage by going through the four-step process described in the above technical note. These steps generate evidence with which to claim that CGIAR does or does not have a comparative advantage, in order to arrive at a go/no-go investment decision.

    2)    Table 3 is great in mapping the QoS’s four dimensions with the six OECD/DAC criteria and I especially liked the note below on GDI. I remain unclear, however, why the Coherence criterion stops at inputs and limits its use to internal coherence. External coherence matters as much, if not more, and especially concerning how well and to what extent the outputs complement and are harmonised and coordinated with others and ensure they add value to others further along the process.  

    3)    While acknowledging the centrality of high scientific credibility and legitimacy, it is of equal importance to also manage and coordinate processes to achieve and maintain the relevance of the outputs as judged by the client. 

    4)    I like the description of processes, especially the building and leveraging of partnerships.

    5)    The scope of enquiry for assessing the Quality of Science should also refer to the assumptions, specifically those that have to hold for the outputs to be taken up by the client organisation, be they a National Extension Service or someone else. Doing this should not be deferred to an impact study or performance evaluation. I say this because, as mentioned earlier, the uncertainty and unpredictability associated with research has as much to do with the process leading up to delivering outputs as with managing the assumption that the process along the impact pathway will continue once the outputs have been “delivered”. Whether that assumption holds must not be discovered only when it is too late. Checking it early helps mitigate the risk of rejection. Scoring well on the Quality of Science criterion does not guarantee that the product or service is accepted and used by the client, remembering that it is movement along the pathway, not the QoS, that motivates those who fund CGIAR.