Well done, Elías, for this brilliant contribution, which I could not have produced myself.
You sum up the situation perfectly and recommend exactly what should be done. While some of the latest contributions were somewhat prescriptive or even theoretical, yours is thoroughly grounded in practical experience. And that is what we, as members, are looking for in this kind of platform for interaction and exchange.
I had the opportunity to visit Benin in June 2014 as a World Bank consultant supporting the national team implementing a 5-year community development programme that had started as Phase II of a similar project. I was struck by the Government's efforts and its ambition to institutionalize Monitoring and Evaluation across all sectors; those were still the early years of implementing the 2012-2021 government policy that you mention in your contribution (or so I imagine). I was able to visit several government ministries and found in each an M&E service collating various sectoral data. At that time, not all the means were yet available, but that was 6 years ago, and reading you now, I understand that we have in our hands a rather interesting experience that could inspire several countries, African ones especially, but others as well, helping us to be more practical in our recommendations and to stop confining our exchanges to the normative and the theoretical.
Congratulations once again on this contribution, and good luck to Benin.
More than a month has passed since we started this debate on the mismatch between monitoring and evaluation, even though these two functions have always been considered complementary and therefore inseparable. As a first reaction, however, I must express my surprise that only 4 contributions have been recorded on this topic. Why such a weak response from the members of our group?
Beyond this surprise, I have reviewed the 3 reactions that specifically address questions of monitoring and evaluation practice, and I propose to relaunch the debate on this topic so that we can draw up some recommendations. For the record, and to make my recommendations clear, I will focus my intervention on the monitoring function, distinguishing it from evaluation practice within any monitoring-evaluation system, because it seems to me that the term "monitoring-evaluation" barely conceals the existing mismatch between the two functions, since they do not receive the same attention at either the national or the international level.
As the first to respond, Natalia recommends that theories of change would be more useful if they were developed during the planning or formulation phase of the intervention and served as the basis of the monitoring-evaluation system. This is the essence of monitoring and evaluation theory as many specialized textbooks present it.
She also suggests that evaluations could be more useful in terms of learning from the intervention if the Theory of Change questions and the evaluation questions were fed by the questions formulated by programme teams after analysing the monitoring data. But is that not what we are supposed to do anyway? And if so, why is it generally not done that way?
In her contribution, Aurélie acknowledges that evaluation is better developed as a practice than its sister function of monitoring, perhaps because evaluations are mainly carried out when supported by dedicated external funding, and are therefore tied to an external financier. This is indeed the general case, easily observed in the least developed countries. She also asks: why has the monitoring function not yet received the same interest from donors? Why are monitoring systems not required as a priority, given how essential this tool is for learning from past actions and improving future actions in time? She even seems to offer an answer, referring to a study: countries need to develop an overall culture of results-based management, which begins, even before monitoring, with results-based planning. But she does not explain why this is not yet in place, despite the fact that 4 years have passed since the SDGs were launched. She concludes her contribution by acknowledging that in many institutions, both national and international, monitoring remains widely undervalued and under-invested, and suggests that it is up to evaluators to play a role in supporting the emergence of the monitoring function within their respective spheres of influence, even if it means setting aside the sacrosanct principle of independence for a while. But she does not show us how evaluators can manage to bring to light this much-desired monitoring function where large donors and major capacity-building programmes have failed.
The third contribution comes from Diagne, who begins by acknowledging that when developing a monitoring-evaluation system, there is a greater focus on functions and tools than on the field -or scope- and purpose of the system, taking into account the information needs of the project owner and other stakeholders. He says that if the main objective of a monitoring-evaluation system is to accompany the implementation of an intervention in a spirit of constant critical reflection, in order to achieve the results assigned to that intervention and to raise alerts about critical conditions in its implementation, then the monitoring-evaluation system needs to be reviewed - I would personally say redesigned. And he concludes by stressing that development policies do not give enough importance to the monitoring and evaluation of the SDGs; they confine themselves to collecting data from programmes implemented with foreign partners in order to report progress against a particular indicator, which is far from good practice as regards monitoring and evaluation.
At least two contributions (Aurélie and Diagne) acknowledge that a major overhaul of monitoring-evaluation is needed in the era of results-based management and of all its other results-based corollaries.
What we can observe from all these contributions is that there is unanimity on the interest and importance of strengthening the complementarity between monitoring and evaluation as two mutually reinforcing functions, but we do not know how to build a monitoring function worthy of current evaluation practice. As the saying goes, identifying the right causes of a problem is already half the solution. The main cause of the mismatch between monitoring and evaluation is that evaluation has been consolidated by financiers and development partners because it addresses their concerns about the performance of the programmes they fund or implement. Monitoring, on the other hand, is a function of greater benefit to the countries receiving development assistance, and that function does not yet seem important to those countries' governments, for various reasons. Since there is very little external investment in the monitoring function at the national level, this feeds the mismatch between the two functions. Therefore, if anything can be done to mitigate this shortcoming, it is to encourage donors and development partners to invest in strengthening national monitoring and evaluation systems, and to run programmes that convince the governments of recipient countries of the interest and importance of strengthening those systems.
Let us hope that this contribution will relaunch the debate on this topic.
I am contributing to the new discussion initiated by our colleague Carlos Tarazona.
Drawing on my own experience, logical frameworks and results chains are planning tools that can help in the formulation phase of any results-oriented (or results-based) development action, be it a policy, a programme, or a project. However, most of the time these tools require considerable technical expertise to be used in a truly professional way and to achieve a sound formulation of a given development action. That said, the many development practitioners who have no grounding in these planning and formulation tools run the risk of not sufficiently understanding the logical framework or results chain of "their" development action.
This is where the use of "logic models" or "theories of change" comes in, especially during the implementation phase, and even more importantly during the evaluation phase, of a given development action. In this case, the members of the implementation team sit down together at the start of the implementation phase to "draw" a "logic model" or a "theory of change" in order to understand how the development action will unfold in its implementation area, and how the "logical framework" or "results chain" of that action will actually play out as a series of cause-effect relationships between its different elements, moving from "resources/inputs" to "activities" to "outputs" to "outcomes" and then to "impact". Drawing the "logic model" or "theory of change" of a development action, whether expressed as a diagram or as text, will help development practitioners translate the "logical framework" or "results chain" in a more expressive and accessible way, so as to unveil and understand the "change strategy" of that development action. The consequences of such an effort are: (1) a better understanding of the implementation strategy of the development action; (2) a wealth of information for better programming of the development action's activities; and (3) the set-up of a sound monitoring and evaluation system for that development action.
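To make that cause-effect reading concrete, here is a minimal sketch, in Python, of a results chain rendered as a series of IF-THEN statements. The level names follow the chain described above; the example statements are hypothetical, loosely echoing the insecticide-treated-net case discussed elsewhere in this thread.

```python
# Illustrative sketch only: a results chain as an ordered set of levels.
# The example statements below are hypothetical.
LEVELS = ["inputs", "activities", "outputs", "outcomes", "impact"]

chain = {
    "inputs": "funds, staff, insecticide-treated nets",
    "activities": "distribute nets to target households",
    "outputs": "nets delivered into households' hands",
    "outcomes": "households sleep under nets; malaria incidence falls",
    "impact": "improved living and income conditions",
}

def narrate(chain, levels=LEVELS):
    """Render each cause-effect link as an IF ... THEN ... statement."""
    return [f"IF {chain[lo]} THEN {chain[hi]}"
            for lo, hi in zip(levels, levels[1:])]

for line in narrate(chain):
    print(line)
```

Read aloud, the four printed statements are exactly the "change strategy" the team is asked to draw and discuss at the start of implementation.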
However, it is not generally the case that a "logic model" or "theory of change" is ready at the start of the implementation phase; some development actions take so long during the formulation phase that the recipient agency rushes to start implementation. And here comes the second situation in which "logic models" and "theories of change" are used: the evaluation phase. A sound evaluation exercise for a given development action would certainly rely on a "logic model" or "theory of change" that can help the evaluators understand what that development action was supposed to do, at least in the minds of those who formulated it, and compare that with what the development action actually did. If a "logic model" or "theory of change" was drawn up at the start of the implementation phase, it should be used, and perhaps improved, on condition that the implementation team validates it. Otherwise, the first task of the evaluators would be to draw up a "logic model" or "theory of change" for the development action under evaluation, in order to define the different pathways to be examined during the evaluation exercise (parameters, indicators, data to be collected, etc.).
In the end, I would say that the debate should not be about whether to use a "logical framework" or "results chain" on the one hand, or a "logic model" or "theory of change" on the other; the debate should be about the value added by using different methods and techniques to ensure good implementation and a sound evaluation of a given development action. In short, it is not a matter of THIS OR THAT, but of THIS AND THAT.
I hope this helps…
First of all, a big thank you to Hynda, who puts on the table a highly relevant topic of debate, one we must all bring answers to, without taboos, in order to place evaluation at the centre of the many related considerations about development work, in every possible sense of the term "development".
In her message, Hynda highlights the term "mistake" as it is perceived by many of us in everyday life. However, we should place this term in the context of public policy planning in order to distinguish the intentional from the unintentional error in planning in our countries, and in public policies more specifically. Then, Oumar tries to bring some snatches of an answer but very quickly ends up lapsing into the normative instead of staying in the real - what is done and why.
For my part, given my modest experience in administration and my modest research into understanding how what I will call the "development theater" works, I would caricature the sphere of development as the coexistence of different roles played by different actors, and therefore the presence of different rationalities.
I would like to say at the outset that no one is naïve enough to believe that development is an apolitical endeavour that obeys exclusively technical considerations. So when we talk about error in this context, we must talk about those supposedly unintentional errors that we could identify in our evaluation of public policies; it is also necessary that evaluation appreciates how, and in what knowledge context, these policies were formulated. This is where evaluation can become an interesting instrument, showing us that the mistakes we identify through our evaluations are far from being unintentional. Indeed, such errors are strongly related to the balance of power that exists in the "theater of development" when a public policy is planned. Here I agree with Oumar, who admits that we have not done many evaluations in our countries, and that even when they are done, they are rather done as part of a "folkloric ballet" - very often "imposed" by foreign donors.
And since these evaluations are done in the context of development programmes and projects, their repercussions in the sphere of public policy planning remain limited, if not nonexistent, and the results of these evaluations are never seen as tools to support decision-making. It is important to recognize the separation, in some of our countries, between development programmes and projects financed by foreign donors and public policies funded through public budgets, which therefore most often fall within the domain of national sovereignty.
This is not to say that there are no other mistakes in public policy planning, such as the lack of scientific and technical knowledge needed to develop a coherent public policy with relevant objectives and realistic outcomes. Mistakes that may stem from a real lack of knowledge (whether proven skills, reliable statistics, etc.) can also come from other causes related to the famous "balance of power" mentioned above, and this brings us back to the need to distinguish the intentional from the unintentional in our planning mistakes.
There are errors related to the absence of a "one and only" document that would allow any reader to understand the public policy that some senior sector official talks about. I have personally come across many examples of senior sectoral officials who spoke of a sectoral policy that existed only in their "head".
There are also errors related to the setting of objectives and results that are relevant, clear, evaluable, etc. As a Results-Based Management specialist, I know something about the resistance of some high-level officials and their need to avoid this kind of debate in the public policy planning document, when it exists and is made public. This makes things more difficult when talking about the "accountability" of decision-makers in terms of achieving the objectives.
Other types of errors can be identified with respect to the allocation of resources for a public policy and the logical links between the allocated resources and the objectives and results assigned to that policy.
Finally, there is another type of mistake that relates to the "changeability" of public policy. In a number of sectors, a public policy is launched on the basis of ideas that are still insufficiently identified or understood and that sector policy officers are eager to implement in the field; then, as feedback (inconsistencies, protests, etc.) comes back from the field, public officers improve the "content" of the policy, and this happens recurrently throughout the life of the policy, which makes it difficult to evaluate.
All this to say that the problem is not in public policy or its evaluation.
It is only to be expected that, under such circumstances, the evaluation of a public policy elaborated in an administrative "straitjacket", devoid of logic and without knowledge, is not possible and will have no conclusive result for the improvement of development work ... It can only be used to tell the sectoral politician what he likes to hear, and that is neither the role of evaluation nor what is done in the advanced world.
Mustapha Malki, PhD
535 avenue Ampere #5
Laval, QC, Canada
Contribution to the exchange between Emile and Bintou on the necessary distinction between outputs and outcomes.
Outputs are all the goods and services produced through the project's activities via the use of the project's resources and inputs (in Emile's case, the distributed insecticide-treated nets). Outcomes are the changes (of course, they should be "positive" changes, otherwise we should close that project) that can appear in the living and income conditions of the project's target beneficiaries (in Emile's case, the reduction of malaria incidence).
The difference between the two is that outputs - like activities and inputs - are part of the project's "controlled" environment (you can decide what and how much to buy and distribute), while outcomes remain the influence that the project intends to have IF AND ONLY IF THE PROJECT'S TARGET BENEFICIARIES USE WHAT THE PROJECT DISTRIBUTED. This is why outcomes are part of the project's "influenced" environment.
And this is what makes outcomes more difficult to achieve than outputs: the project management unit has no real control over the changes among beneficiaries. It then depends on how relevant the implemented activities were in generating outputs that can really solve the problem situation identified at the onset. If we may borrow concepts from marketing, and if we assume that outcomes represent the changes requested by the beneficiaries (the "demand") and the outputs are the means to bring about these changes (the "supply"), then the "supply" needs to meet the "demand" for the changes to occur.
Contribution to Dowsen's reaction
Yes, this is what I do myself before going on to draft a Results Framework (or Logical Framework) at the start of project design, and this results framework then serves for setting up the M&E plan and for guiding any later evaluation. I start with the problem situation assessment (i.e. the Problem Tree tool), using the "cause-effect" causality law. Then, by turning each problem identified in the Problem Tree into a positive statement, I develop the Objective Tree, followed by a fine-tuning of the Objective Tree using the "means-end" causality law. From the Objective Tree, I can identify the best alternative "results chain" and move very easily from it to the results (or logical) matrix, and so on...
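The "negative statement to positive statement" step can be sketched in a few lines of Python. This is purely illustrative: the problem statements and the rewording map are hypothetical (echoing the malaria-net example from this thread), standing in for the analyst's manual reformulation work.

```python
# Hypothetical Problem Tree; the REWORD map stands in for the analyst's
# manual "negative -> positive" reformulation of each statement.
REWORD = {
    "high malaria incidence": "malaria incidence reduced",
    "low use of treated nets": "use of treated nets increased",
    "nets unavailable locally": "treated nets available locally",
}

problem_tree = {
    "high malaria incidence": {              # effect
        "low use of treated nets": {         # its cause
            "nets unavailable locally": {},  # a deeper cause
        },
    },
}

def to_objective_tree(node):
    """Recursively reword each problem, keeping the tree shape intact,
    so the cause-effect structure reads as a means-end structure."""
    return {REWORD[p]: to_objective_tree(children) for p, children in node.items()}

objective_tree = to_objective_tree(problem_tree)
print(objective_tree)
```

The point the sketch makes is that only the statements change; the tree's structure, and hence the candidate results chains one can trace through it, carries over unchanged from the Problem Tree to the Objective Tree.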
Contribution to Reagan Ronald's reaction on the quality of handbooks.
I am not sure that any of us has attributed the poor quality of evaluation handbooks to evaluators or to international consultants in evaluation. Personally, I made it clear that sometimes a handbook's content can be of good quality, yet be presented and disseminated through a very poor communication and dissemination process. Based on what I know, the content of many handbooks was prepared by high-quality consultants in evaluation. However, drawing on my minor competency in knowledge and information systems and communication, a good handbook in general, and in evaluation in particular, must - as a communicative tool - meet 4 necessary criteria: (1) good, appropriate, relevant, and purposeful content; (2) an adequate means of dissemination; (3) good knowledge of the targeted population; and (4) an environment conducive to the use of the information. For many handbooks, we focused more on (1) and a bit less on (2) and (3), and this is not enough to give birth to good-quality handbooks on any subject, not only on evaluation guidelines. Moreover, the consultant in charge of the content can be quite good in terms of content (i.e. the substantive knowledge) but may not be very qualified in terms of communication. This is why I always recommend building a team of an evaluator plus a communication specialist in order to produce a good-quality handbook on evaluation.
Hope that I added a bit to this discussion.
Dear Natalia et al.,
Thank you for putting on the table an important challenge for both the evaluator and the manager of a development project. And I want to apologize for not being able to answer earlier; the situation in my country had taken over my mind and all my time during the last 3 weeks. The question of clearly distinguishing an output from an outcome is of the utmost importance for the development project manager, the evaluator, and the project's monitoring and evaluation staff alike. And I doubt that the problem is really a terminology problem, at least theoretically speaking. In my modest experience, the problem has its origin in several factors that I will try to explain below:
Having said this, good training on monitoring and evaluation for project staff, based on a sound logframe and results chain, can sometimes be the key to this problem. And to support this, I would like to share an experience I had in Sudan in 2003, on a project co-funded by IFAD and the Islamic Development Bank (IsDB) in North Kordofan State.
I was contracted by IFAD to support the consolidation of the monitoring and evaluation system of this 7-year project while it was in its 4th year (a first anomaly). The project was to deliver several outputs, including a 60-kilometre tarmac road between the State capital, El-Obeid, and the State's second city, Bara, entirely financed by IsDB.
Locked up for 3 days with the entire project team, I could clearly see, through the outcome indicators proposed to me, that the project management team, including the person principally responsible for monitoring and evaluation, was unable to clearly differentiate between the deliverable (the tarmac road) and the effects this deliverable could have on its beneficiaries' living and income conditions. Slowly, my intervention and assistance made it possible for the project staff to start differentiating between a deliverable and its effect - as a development intervention - which can be perceived only at the level of the social segments benefiting from the deliverable, and not in the deliverable per se. I fully understand that the transformation of a stony road into a tarmac road is a change, but without including the human dimension in our vision, it is difficult to pinpoint the development achieved. As proof, where can we perceive the development brought by a new deliverable, completed and then left unused for 3 years, for example, if human beings do not take advantage of it to change their living and income conditions (isn't that so, Hynda?). Thus, from the second day onwards, the project team members started to differentiate things, suggesting better outcome indicators - completely different from output indicators - which served, 3 years later, for a good evaluation of the effects of the deliverable "tarmac road".
Thus, this little story highlights the necessary link that needs to be established between monitoring and evaluation from the start of a project - by mobilizing all the resources necessary for the monitoring and evaluation system, including the necessary skills - so that evaluation can be carried out without much difficulty.
But even more importantly, although I am in favour of the evaluator's "freedom of expression" (Isha), this necessary link between monitoring and evaluation will certainly lead to better ToRs for evaluation, guaranteeing that freedom within the framework defined by the project team. Without this link, too much freedom of expression for the evaluator may put a project at risk of receiving an evaluation report that is meaningless.
Sorry for having been a little long, but the importance of the question asked by Natalia forced me to go into certain details. I hope I have contributed a little to this discussion.
Thanks for bringing up this important issue on monitoring and evaluation, one of the most important challenges of any M&E system, related to its "social learning" dimension. Besides, it was quite informative to read the contributions submitted within this debate - at your suggestion - especially those of Ronald and Zahid.
The situation you depict is similar to what you might find in other African countries - I was involved between 2013 and 2015 in a very interesting AfDB initiative entitled "Africa 4 Results" and had the chance to visit some Western and Eastern African countries, where I faced a very similar situation.
I do not have all the information necessary to argue anything about your country, but I have the feeling that in your case the building of a national M&E system seems to have started from the "hardware" part and did not pay attention to the "software" side. Sometimes I am inclined to believe that in your case much attention was given to projects, and the monitoring data collected from projects do not feed into national policies. And on this I would join my voice to Ronald's and Zahid's contributions.
Having said that, we need to acknowledge that the construction of a national M&E system must start with the publication of a general M&E legal framework, which will first require the Government to have a mid-term strategic plan for "multi-dimensional" development, to which a results framework is annexed. This national strategic plan must be prepared through a "true" participatory approach and, in the end, be endorsed by Parliament.
At the second level, this national "multi-dimensional" development plan will serve each sector as a reference framework for establishing a mid-term sectoral strategic plan, to which a sectoral results framework is annexed. Each sectoral strategic plan must be approved by the Government and should bear a results framework that links the sectoral strategy to the mid-term national development plan.
At this level, any new project or programme will need to have a results framework that links it to the sectoral plan. This is the "software" part that I mentioned above.
After that, the "hardware" part of the national M&E system is set up on the basis of a concept note showing the inter-relations between the different levels of the national M&E system; the standard form of the M&E unit at the different levels; the data collection procedures and methods; the reporting system and its timing; etc.
With this, one can assume that monitoring data collected at the project level can easily be aggregated at the sectoral level, allowing the sectoral plan to feed back into the national strategic plan.
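As a toy illustration of that roll-up, aggregation is essentially a sum keyed at each level. The Python sketch below uses invented project names, sectors, an invented indicator, and invented figures; it is only meant to show the project-to-sector-to-national mechanics.

```python
# Toy sketch: monitoring values reported by projects roll up to their
# sector and then to the national level. All names and figures are invented.
from collections import defaultdict

project_reports = [
    {"project": "P1", "sector": "agriculture", "indicator": "households reached", "value": 1200},
    {"project": "P2", "sector": "agriculture", "indicator": "households reached", "value": 800},
    {"project": "P3", "sector": "health", "indicator": "households reached", "value": 500},
]

def roll_up(reports, key):
    """Sum each indicator's values, grouped by the given key function."""
    totals = defaultdict(int)
    for r in reports:
        totals[(key(r), r["indicator"])] += r["value"]
    return dict(totals)

sector_totals = roll_up(project_reports, key=lambda r: r["sector"])
national_totals = roll_up(project_reports, key=lambda r: "national")
```

Note that this only works if every project reports against indicators defined in the sectoral results framework: the "software" part (shared results frameworks) is what makes the "hardware" part (aggregation machinery) meaningful.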
In the situation you bring up, which started with the "hardware" part, the majority of senior and line Government staff may feel that M&E is just an additional "administrative" workload imposed from the top, and the lack of conviction in M&E will be very apparent.
Thinking about disseminating M&E results is highly recommended, but talking about the "value for money" of M&E may simply be inappropriate, as M&E work is a sort of "quality insurance" or "life insurance" for development. Using such a metaphor, one can easily admit that having a "quality insurance" or "life insurance" policy certainly has a cost, but omitting to have that insurance will certainly carry a cost at least ten times higher. This is why I believe "value for money" is not the right concept to apply to a given M&E system. I do not want to be too provocative, but I feel that this issue of "value for money" is just a "proxy" indicator of a lack of conviction towards M&E work.
Mustapha Malki, PhD
535 avenue Ampere #5
Laval, QC, Canada
Thanks, dear Naser, for bringing this issue again to the forefront.
We should not stop 'hammering' home that evaluation cannot and should not be disconnected from monitoring, and we should do all we can to connect the two from the start, at the moment a development action is formulated, be it a project, a programme, or a policy.
It is a fact - and nobody can deny it - that most of the time development actions are:
But why is this still happening after eighteen years of the MDG endeavour?
Because of weak or insufficient M&E capacities within national systems in almost all developing countries, but also because of a 'striking' reluctance and lack of political will to adopt a national M&E framework for national development. Again, that fear of M&E as a control and audit system is in the air...
Besides, whenever international organizations plead the need to build national capacities on this issue, the stress and focus are put on evaluation, and very little consideration is given to monitoring.
And again, I would claim that monitoring and evaluation - and not monitoring or evaluation - are the two 'legs' on which a development action stands if it seeks to achieve its expected results; choosing one or the other would simply mean that our development action - like a person standing on one leg - will certainly fall short of achieving its expected results.
That's what I wanted to say as a rejoinder to Naser's contribution...
Many thanks to our dear Hynda for opening a very interesting debate on the challenges and constraints that hinder the emancipation of evaluation in some countries. All that has been said is quite valid; nevertheless, the lack of understanding of the evaluation function, as evoked by Hynda, very often perceived as a form of control and pushing many individuals into positions of resistance for different reasons, remains one of the challenges that must be addressed. From my modest experience in the various results-based management training workshops that I lead, in their monitoring and evaluation component, I always start by demystifying the monitoring and evaluation functions among participants, asking a simple question: do we do monitoring and evaluation in our daily life? And I engage in a frank and serene debate with the participants, leading them to evoke examples from everyday life where human beings practise monitoring-evaluation in a rather intuitive and fortuitous way. The example of a car trip to a destination we have never been to, arriving on a specific date and at a specific time, following a precise route we have never taken, is the one that comes up quite often. And here we begin to dissect our actions, finally discovering that we do monitoring-evaluation quite often, sometimes without realizing it, and concluding that in the end monitoring-evaluation works in our favor rather than to our disadvantage.
However, there are other challenges for the evaluation function that I can personally put forward, by way of illustration and without being exhaustive, and that lie in the immediate environment of the evaluation function, including:
This is what I wanted to share with colleagues as a contribution to this debate ...
Dear all,

As I follow the thread of this discussion, I become more convinced that platforms such as the EvalForwARD CoP have to exist for evaluation practitioners of all backgrounds: they can only provide assets and advantages to all of us. Why am I saying this? Because I feel and "smell" some confusion in the air about how "evaluation" is conceptualized.

In my modest experience in Monitoring & Evaluation (M&E), I see "evaluation" as strongly bounded by the Theory of Change defined during the project/programme formulation stage and by the results framework we assign to a given development action, be it a project, a programme, or a policy. Though many things need to be evaluated in any project/programme in order to be more comprehensive in our understanding of what worked and what didn't, we have to be faithful to what that project/programme was assumed or assigned to change. And on this, I join my voice to Emmanuel Bizimungu and Dr. Emile Houngbo in saying that we cannot evaluate anything and everything; we have to stay "targeted". Quoting Robert Chambers, I would say that we should opt for an "optimal ignorance" so as not to get our research efforts diluted in different senses and directions.

In some interventions in this discussion thread, I gather that some friends are using the term "evaluation" as if it were a sectoral study or assessment, a sort of "état des lieux", as we say in French, or a "state of the art" study of the agricultural sector. If this is the case, let us use the words properly and keep the term "evaluation" for what it is meant to be: "the systematic and objective assessment of an on-going or completed project, programme or policy, its design, implementation and results. The aim is to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact and sustainability… An assessment, as systematic and objective as possible, of a planned, on-going, or completed development intervention." 
(OECD, 2002 – Glossary of Key Terms in Evaluation and Results-Based Management).

It is then clear that "evaluation" is something different from an "état des lieux" or a "state of the art" study in terms of objectives, orientation, and use, although they share some common features. A peculiar characteristic of evaluation is that we have to develop an evaluation matrix, backed by evaluation questions and a strong, robust research methodology, before we start collecting any data.

Furthermore, we have to keep in mind that, as the same glossary clearly states, "evaluation in some instances involves the definition of appropriate standards, the examination of performance against those standards, an assessment of actual and expected results and the identification of relevant lessons". This is why evaluation – the discipline and not the perceived term – has over the last decade been developing into a new social science, for which specialists get officially accredited in some countries, such as Canada.

Sorry for being too long, but there was a need to clear my mind and draw colleagues' attention to this perceived slight confusion.

Mustapha
Hello everyone,

When I decided to join this community, I had great hope for simple but interesting debates on the importance of M&E in general, and evaluation in particular, for the development practitioner, and on encouraging the generalization of its practice for sustainable development by 2030.

The debate on developmental evaluation, launched by our colleague Prosper and which I am following, is a debate that only academics who perfectly master the art of "intellectual speculation" can afford, because they have the time for it. Moreover, in my humble opinion, such a debate can bring nothing to the development practitioner except additional confusion about the usefulness and importance of both monitoring and evaluation.

First of all, I note from what I read in this debate that we are not all on the same wavelength with respect to the concept of developmental evaluation. Some contributions lean towards the concept developed by M.Q. Patton, quoted several times in an article and PPT presentation shared by our colleague Koffi; others evoke a concept very close to evaluation in general, which aims to assess the effectiveness and efficiency of interventions, as presented by our colleague Émile.

In the first case, having read several books and articles by Patton, he evokes an evaluation approach that accompanies the intervention (i.e. a project, a program, or a policy) throughout its implementation, so that the evaluation results are used by the intervention team to improve the performance of the intervention, or possibly its continuous reformulation until it meets the needs of the beneficiaries. I believe, to simplify the debate, that this is the expected role of the Monitoring function in any M&E system. Why then do we try to wrap it in a new packaging called "Developmental Evaluation"? 
If, within the team in charge of an intervention, more importance and sufficient means are given to the monitoring function, for instance by developing participative mechanisms within that function, I am certain that one will reach convincing results in terms of performance and adaptation of the intervention, exactly as Patton's "Developmental Evaluation" concept proposes. The only difference is that this Monitoring function will be less costly for the team and driven by the team's internal resources, something that almost all M&E training manuals recommend.

In the second case, our colleague Émile evokes what is really the role of evaluation, since it must focus on the effects and impacts at the macro-economic level in relation to the main development indicators. For my part, and to put it simply, this is what must be attributed to the "Evaluation" function in the monitoring-evaluation system of a given intervention; whether or not we add the adjective "developmental" does not change this "Evaluation" function. In fact, which project, program, or national or sectoral policy does not intervene in the developmental sphere? And which evaluation of such a project, program, or policy is not intended to appreciate the effects - especially what are commonly known as end effects - and impacts?
Having said that, I think our community is made up mostly of practitioners in the field who want to see debates develop that can bring them practical solutions, appropriate to their problems, that they can implement on the ground. So my recommendation is to develop simple debates on current topics and to avoid unnecessary confusion for our development practitioners. Instead, let us help them strengthen their monitoring and evaluation systems by further reinforcing their "Monitoring" function and further developing their "Evaluation" function.
Thanks, dear Eoghan, for taking the time to go through my contribution and give more information about the evaluation.
The picture of introducing CA and getting it adopted by farmers is very similar to what we have seen over the last 3 decades in technology dissemination and adoption (intensive packages for cereal cropping, use of quality seeds, mechanization, herbicide use, water-saving irrigation techniques, etc.). That general picture always shows some of the following aspects:
1. The project's technical staff are very enthusiastic to show their "successes" in the field by pointing to the large numbers of farmers enrolled by the project, and jump without hesitation to consider this a huge rate of technology adoption. They become very defensive when one asks whether they took the time to know their beneficiaries in depth.
2. Farmers are keen to apply a new technology when someone else is covering the cost. But when the project closes, we then see properly what is happening among farmers. Most of the time, farmers who participated in a closed project start asking when the next project will start and whether they will be part of it, as if the closed project was just a game and the game is now over (I am becoming a bit cynical about this).
3. Little is done in terms of evaluating project outcomes, impacts, the sustainability of both, and so forth...
I am telling you this because in 2013 I was involved in a 4-year Maghrebi CA project funded by Australia and implemented by ICARDA in Algeria, Morocco and Tunisia. I was involved in setting up the M&E plan for that project and trained a group of Maghrebi researchers and development practitioners in Results-Based Management so that they could make good use of that M&E plan. All the social actors involved in that project praised the work done (M&E Plan + RBM Training), especially the Australians, who were very keen to put strong pressure on ICARDA to set up the M&E Plan. But the project closed after 4 years in the same way as I have seen many projects close (you can imagine the picture - business as usual).
But in your case, and the case of your evaluated CA project, I am happy to see that your message paints the picture as it is in reality, i.e. that CA was not that "rosy" technology that could fit most farmers in Africa, especially as it was applied in a "one-size-fits-all" approach, with little knowledge - not to say no knowledge - of the beneficiaries, and that the case presents some shortcomings that you are not hiding - and it shows what evaluation has to do, and how. It is good to read a balanced contribution on a new technology.
As for the issue of sampling, especially with a "fixed constituency" over the 4-5 years between the baseline and the project's end, it is always tricky to achieve the required robustness in our survey. But you tackled the issue through triangulation, using multiple sources of data, and honestly I would have gone the same way. Still, locating 317 farmers among 385-390 at the end of the project is quite an endeavour in itself. That is why I mentioned in my previous contribution the need, in such cases, to make the sample bigger at the baseline in order to cover such attrition at the end.
Finally, the way you presented things made me more curious and "hungry" to look at the evaluation report. Without engaging in a formal commitment, I will download the evaluation report - for which I am very thankful to you - and try to squeeze out some time to read it (summer is rushing away, and missions and travels will start again very soon in September).
Good luck and kind greetings
Dear Mr. Molloy,
I have read with great attention your contribution referring to the evaluation of CASU in Zambia. I must congratulate your department on such an achievement. However, I have two points to make here.
You mention at the start of your contribution that the Conservation Agriculture project is "targeting over 300,000 smallholder farmers"; that is the entire population of the project. You also mention that "the main focus of the evaluation was to assess the extent to which conservation agriculture has been sustainably adopted by Zambian beneficiary farmers … also sought to assess what outcomes were evident (positive and/or negative) from the project’s activities, and what were the impacts on food security, income, and soil health".
The first point I want to raise concerns the adoption study that you highlighted in your message. Though I do not have all the details about how the study was conducted and what results it achieved, I would like to use this opportunity to share some experience on adoption studies - a sort of evaluation of outcomes and of whether those outcomes are sustainable over time. Everett Rogers, one of the gurus of technology adoption by farmers, instructs us not to check the adoption rate just once or at an arbitrary moment. Adoption studies require awareness of the technology adoption process among farmers in order to set up appropriate protocols to study it. I have seen many adoption studies report high rates of adoption at the end of a project, while very few farmers were still using the technology 5-10 years after the project's end. This is because what looks like adoption to researchers is just experimentation to farmers, so real adoption by farmers comes long after the moment of project end.
The second point I want to raise concerns the household survey undertaken by the University of Zambia and the sample size used by the research team. Besides other activities conducted within this evaluation (among which focus groups with 650 beneficiary farmers), you mention that "a household-level impact assessment survey to collect quantitative data amongst a sample of over 300 farmers, in order to assess progress against the baseline survey".
Nobody can deny that a survey is only truly valuable when it is reliable and representative of the entire population of the project's beneficiaries. This is why determining the ideal survey sample size, with robust external and internal validity, is quite important: it allows the research team to infer and extrapolate the results obtained from the research sample to the entire population of the project's beneficiaries.
Using a correct survey sample size is crucial for any research, and project evaluation is research. Too big a sample wastes precious resources such as time and money, while too small a sample, though it can yield sound results (strong internal validity), will certainly not allow inference and extrapolation of its results to the entire project population (weak external validity).
So the sample size cannot be set by how much a research team can handle, but by how accurate the survey data ought to be - in other words, how closely the research team wants the results obtained from a sample to match those of the entire project population.
In statistics and statistical probability, we use two measures that affect the accuracy of the data and matter greatly for the sample size: (1) the margin of error, in most cases 5%; and (2) the confidence level, in most cases 95%. Based on these two measures, and given the population size, the research team can calculate how many respondents (people who might completely fill in the survey questionnaire) it actually needs; that is the survey sample. Beyond this, the research team must plan for a sufficient response rate - that is, the number of "really exploitable" survey questionnaires - by distributing additional questionnaires beyond the sample, so that enough completed questionnaires are available to exploit. The table gives an idea of the sample size for a project population of 300,000 individuals. For example, if we target 380-390 "exploitable" questionnaires, we allow 20-25% more questionnaires so that the survey is not put at risk of weak robustness.
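As a rough illustration of the calculation described above, the required sample can be computed with Cochran's formula plus a finite population correction. This is only a sketch: the function name and the 25% non-response buffer are my own assumptions, not part of the CASU evaluation's method.

```python
import math

def sample_size(population, margin_of_error=0.05, z=1.96, p=0.5):
    """Cochran's formula with finite population correction.

    z=1.96 corresponds to a 95% confidence level; p=0.5 is the most
    conservative assumption about the population proportion.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)  # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)                  # finite population correction
    return math.ceil(n)

required = sample_size(300_000)            # exploitable questionnaires needed
with_buffer = math.ceil(required * 1.25)   # assumed 25% extra to absorb non-response
```

For a population of 300,000 this yields roughly 384 required questionnaires, consistent with the 380-390 range mentioned above; distributing about 25% more then covers a 20-25% non-response rate.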
In conclusion, I believe that the sample size for the mentioned household survey, as part of the CASU evaluation undertaken, was somewhat lower than what probability theory would require. Of course, this has no consequence for the results obtained within the sample as such, but the survey findings cannot be strongly and robustly inferred and extrapolated to the entire population of the project's beneficiaries, because the sample's external validity is weak when the principles of probabilistic sampling are not respected.
I would like to contribute to the debate on gender mainstreaming in the evaluation of development actions - I use the generic term "development action" to refer to a project, program or policy. I find Georgette's debate quite important, and it will have to be lifted out of a discussion that would otherwise remain philosophical and sterile, as development practitioners badly need practical elements to make the necessary corrections to their way of doing things. It goes without saying that the "gender" dimension is very important for development, but this should not lead us to use it as a "master key" to apply to all development actions; rather, we must deal with this "gender" issue in a systematic and mandatory way in development actions that have an undeniable gender dimension.
Reading between the lines of this debate, I am inclined to think that we deal with this issue only at the time of the evaluation - a view I come across very often among practitioners of "simple and simplified" evaluation. However, this aspect must be dealt with well in advance: in general, at the time of formulating the development action and designing its results framework, and in particular when choosing the indicators and the data collection - which should be disaggregated by gender - and when establishing the monitoring and evaluation system for the development action.
If a given development action is built around a strong gender dimension, the reading of the project document, its results framework and indicators, its monitoring and evaluation plan, etc., must reflect that strong gender dimension even before the action's activities are launched on the ground. Without such an integrative Results-Based Management perspective, the evaluation will be totally disconnected from the rest of the activities of the development action, including monitoring-evaluation activities, and will not help us bring all the necessary answers to the questions about gender that we might ask ourselves at the time of an evaluation.
It is this restrictive debate on evaluation, which reduces and "chops up" the management process of a development action's cycle, that I wanted to address in my first message on this platform; for a few weeks now I have wanted to draw the attention of all members to the danger of speaking about evaluation in a restrictive way, outside of a Results-Based Management perspective.
A word to the wise!