RE: Artificial intelligence in the context of evaluation | Eval Forward

Dear Muriel,

I agree A.I. brings so much potential for evaluation, and many questions all at once! 

In the Office of Evaluation of WFP, as we have looked to increase our ability to be more responsive to colleagues’ needs for evidence, the recent advancements in artificial intelligence (A.I.) came as an obvious solution to explore. Therefore, I am happy to share some of the experience and thoughts we have accumulated as we have started connecting with this field. 

Our starting point for looking into A.I. was recognizing that we were limited in our capacity to make the most of the wealth of knowledge contained across our evaluations, to address our colleagues’ learning needs. This was mainly because manually locating and extracting evidence on a given topic of interest, to synthesize or summarize it for them, takes so much time and effort.

So, we are working on developing an A.I.-powered solution to automate evidence search using Natural Language Processing (NLP) tools, allowing users to query our evidence with questions in natural language, a little like we do in any search engine on the web. Then, making the most of recent technology leaps in the field of generative A.I., such as ChatGPT, the solution could also deliver text that is newly generated from the extracted text passages, such as summaries of insights.

We also expect that automating text retrieval will have additional benefits, such as helping to tag documents automatically and more systematically than humans can, to support analytics and reporting. A.I. will also give us an opportunity to direct relevant evidence to audiences based on their function, interests and location, just like Spotify or Netflix do.

Once we have a solution that performs well in the search results it returns, we hope it can be replicated to serve other similar needs.

Beyond these uses that we are specifically exploring in the WFP Office of Evaluation, I see other benefits of A.I. to evaluations, such as:

  • Automating processes routinely conducted in evaluations, such as the synthesizing of existing evidence to generate brief summaries that could feed evaluations as secondary data.
  • Improving access to knowledge or guidance, and facilitating the curation of evidence for reporting, e.g., in annual reporting exercises.
  • Facilitating the generation of syntheses and identification of patterns from evaluation or review-type exercises.
  • Improving editing through automated text review tools to help enhance language.

I hope these inputs are useful, and I look forward to hearing the experiences of others, as we are all learning as we go. This field is indeed full of promise and risks, and it surely moves us out of our comfort zones.

Best,

Aurelie