Well done, Elias, for this excellent contribution, which I would not have been able to produce myself.
You summarize the situation perfectly and recommend exactly what to do. If some of the recent contributions were somewhat prescriptive or even theoretical, yours is inspired by practical experience. And that's what we're looking for as members of this kind of platform for interaction and exchange.
I had the opportunity to visit Benin in June 2014 as a World Bank consultant to support the national team implementing a 5-year community development program that began as Phase II of a similar project. I was surprised by the Government's efforts and its ambition to institutionalize Monitoring and Evaluation in all sectors; these were still the first years of the implementation of the 2012-2021 government policy that you mention in your contribution (at least, I imagine). I visited several government ministries and found that an M&E service existed, collating data from the sector. At that time, not all the means were yet available, but 6 years later, reading you now, I understand that we have in our hands a rather interesting experience that could inspire several countries, especially African ones but others as well, and help us become more practical in our recommendations and move our exchanges beyond the normative and the theoretical.
Congratulations once again on this contribution and good luck to Benin.
It has been more than a month since we started this discussion on the mismatch between monitoring and evaluation, although these two functions have always been considered complementary and therefore inseparable. However, as a first reaction, I must express my surprise that only 4 contributions have been recorded for this theme. Why such a weak reaction from our group members?
Beyond this surprise, I have reviewed the 3 reactions that address specifically the issue of monitoring and evaluation practice and propose to relaunch the debate on this topic so that we can draw some recommendations. For the record, and in order to be clear in my recommendations, I will focus my intervention on the monitoring function to distinguish it from the evaluation practice in any monitoring-evaluation system because it seems to me that the term 'monitoring-evaluation' hides very poorly the existing mismatch between the two functions, as these do not receive the same attention both nationally and internationally.
As the first to respond, Natalia recommends that theories of change would be more useful if they were developed during the planning or formulation phase of the intervention and would serve as the foundation of the monitoring-evaluation system. This is the essence of the theory of monitoring and evaluation in what many specialized textbooks suggest.
She also suggests that evaluations could be more useful in terms of learning from the intervention if ToC and evaluation questions are fed from questions formulated by program teams after analysis of monitoring data.
But isn’t that what we are supposed to do? And if so, why is it generally not done that way?
In her contribution, Aurélie acknowledges that evaluation is better developed as a practice than its sister function of monitoring, perhaps because evaluations are done primarily when supported by dedicated external funding and are thus linked to an external funder. This is in fact the general case, easily observed in the least developed countries. She also asks: why has the monitoring function not yet received the same interest from donors, and why are monitoring systems not required as a priority, given how essential this tool is for learning from past actions and improving future actions in time? She even seems to give part of an answer by referring to a study: countries need to develop a general results-based management culture, which begins, even before monitoring, with results-based planning. But she does not explain why this is not yet in place, even though it has been 4 years since the SDGs were launched. She concludes her contribution by acknowledging that in many institutions, both national and international, monitoring is still largely underestimated and under-invested, and suggests that it is up to evaluators to support the emergence of the monitoring function in their respective spheres of influence, even if it means putting aside the sacrosanct principle of independence for a time. But she does not show us how evaluators can succeed in bringing out this much-desired monitoring function where large donors and large capacity building programs have failed.
The third contribution comes from Diagne, who begins by recognizing that when developing a monitoring-evaluation system, there is a greater focus on functions and tools than on the field - or scope - and the purpose of the system, taking into account the information needs of the funder and other stakeholders. He says that if the main purpose of a monitoring-evaluation system is to accompany the implementation of an intervention in a spirit of constant critical reflection, in order to achieve the results assigned to this intervention and to alert on the critical conditions of its implementation, then a review - I would personally say a redesign - of the monitoring-evaluation system is necessary. And he concludes by stressing that development policies do not give enough importance to monitoring and evaluating the SDGs; they merely compile data from programmes implemented with foreign partners to express progress against a particular indicator, which is far from good practice with regard to monitoring and evaluation.
At least two contributions (Aurélie and Diagne) recognize that a major overhaul of monitoring and evaluation is needed in the era of results-based management and all its results-based corollaries.
What we can note from all these contributions is that there is unanimity on the interest and importance of strengthening the complementarity between monitoring and evaluation as two functions that reinforce each other, but we do not know how to build a monitoring function worthy of the current practice of evaluation. As the saying goes, identifying the right causes of a problem is already half the solution. The major cause of the mismatch between monitoring and evaluation is that evaluation has been consolidated by funders and development partners because it addresses their concerns over the performance of the programs they fund or implement. Monitoring, on the other hand, is a function that mostly benefits the countries receiving development assistance, and for several reasons it does not yet seem to be important to these countries' governments. As there is very little external investment in the monitoring function at the national level, this fuels the mismatch between the two functions. So if anything can be done to mitigate this mismatch, it is to encourage donors and development partners to invest in strengthening national monitoring and evaluation systems, and to conduct programmes that convince the governments of recipient countries of the interest and importance of strengthening national monitoring-evaluation systems.
Let us hope that this contribution will relaunch the debate on this topic...
I am contributing to the new discussion launched by our colleague Carlos Tarazona.
Relying on my own experience, logical frameworks and result chains are planning tools that can help at the formulation phase of any result-oriented (or result-based) developmental action, be it a policy, a programme, or a project. However, most of the time these tools require strong technical expertise to be used professionally and to achieve a sound formulation of a given developmental action. That said, most development practitioners who are unfamiliar with these planning and formulation tools run the risk of not sufficiently understanding the logical framework or the result chain of “their” developmental action.
This is where “logic models” or “theories of change” come in, especially during the implementation phase – and more importantly during the evaluation phase – of a given developmental action. In this case, members of the implementing team sit together at the start of the implementation phase to “draw” a “logic model” or a “theory of change” in order to understand how the developmental action will evolve in its implementation area and how the “logical framework” or the “result chain” of that developmental action will unfold in reality in a series of cause-effect relations between its different elements, moving from “resources/inputs” to “activities”, to “outputs”, to “outcomes” and then to “impact”. Drawing the “logic model” or “theory of change” of a developmental action – expressed either as a drawing or as text – helps development practitioners translate the “logical framework” or the “result chain” into a more expressive and easier way to unveil and understand the “change strategy” of that developmental action. The consequences of such an endeavour are: (1) a better understanding of the developmental action's implementation strategy; (2) a lot of information for better programming of the developmental action's activities; and (3) the setup of a sound monitoring & evaluation system for that developmental action.
However, a “logic model” or “theory of change” is not always made ready at the start of the implementation phase; some developmental actions take so much time during the formulation phase that the recipient agency rushes to start the implementation. And here comes the second use of “logic models” and “theories of change”, at the evaluation phase. A sound evaluation exercise for a given developmental action would certainly rely on a “logic model” or “theory of change” that can help evaluators understand what that developmental action was supposed to do – at least in the heads of the people who formulated it – and compare it with what the developmental action really did. If a “logic model” or “theory of change” of a developmental action was drawn at the start of the implementation phase, it should be used and maybe improved, on the condition that it is validated by the implementing team. If not, then the first task of the evaluators would be to elaborate a “logic model” or “theory of change” for the “evaluand” developmental action in order to define the different avenues that should be looked at during the evaluation exercise (parameters, indicators, data to be collected, etc.).
In the end, I would say that the debate should not be whether to use a “logical framework” or a “result chain”, on one side, or a “logic model” or a “theory of change”, on the other; the debate must be about the value added by using different methods and techniques to ensure a good implementation and a sound evaluation of a given developmental action. In brief, it is not a THIS OR THAT issue, but rather a THIS AND THAT one.
Hope this helps…
Mustapha Malki, PhD
First of all, a big thank you to Hynda, who puts on the table a highly relevant topic of debate, to which we must all bring answers without taboos, in order to put evaluation at the center relative to the many related considerations about development work, in every possible sense of the term “development”.
In her message, Hynda highlights the term "mistake" as it is perceived by many of us in everyday life. However, we should place this term in the context of public policy planning in order to distinguish intentional from unintentional errors in planning in our countries, and in public policies more specifically. Then, Oumar tries to bring some elements of an answer but very quickly slips into the normative instead of staying with the real - what is done and why.
For my part, given my modest experience in administration and my modest research to understand how what I will call the "development theater" works, I caricature the sphere of development by the existence of different roles played by different actors and therefore the presence of different rationalities.
I would like to say at the outset that nobody is naïve enough to believe that development is apolitical work that obeys exclusively technical considerations. So when we talk about error in this context, we must talk about the unintentional errors that we could identify in our evaluation of public policies, and evaluation must also appreciate how, and in what knowledge context, these policies were formulated. This is where evaluation can become an interesting instrument, showing us that the mistakes we identify through our evaluations are far from being unintentional. Indeed, such errors are strongly related to the balance of power that exists in the "theater of development" when planning a public policy. Here I agree with Oumar, who admits that we have not done a lot of evaluations in our countries and that even when they are sometimes done, they are done as part of a "folkloric ballet" - very often "imposed" by foreign donors.
And since these evaluations are done in the context of development programs and projects, the repercussions in the sphere of public policy planning remain limited, if not nil, and the results of these evaluations are never seen as tools to help decision-making. It is important to recognize the separation in some of our countries between development programs and projects financed by foreign donors and public policies funded through public budgets, which therefore most often fall within the domain of national sovereignty.
This is not to say that there are no other mistakes in public policy planning, such as a lack of the scientific and technical knowledge needed to develop a coherent public policy with relevant objectives and realistic outcomes. Mistakes that can come from a real lack of knowledge (proven skills, reliable statistics, etc.) can also come from other causes related to the famous "balance of power" mentioned above, and this brings us back to the need to distinguish the intentional from the unintentional in our planning mistakes.
There are errors related to the absence of a "one and only" document that would allow any reader to understand the public policy that some senior sector official talks about. I have personally experienced many examples of senior sectoral officials who spoke of a sectoral policy that existed only in their "head".
There are also errors related to the setting of objectives and results that are relevant and clear, evaluable, etc. As a Results-Based Management Specialist, I know something about the resistance some high level officials have and their need to avoid this kind of debate in the public policy planning document, when it exists and is made public. This makes things more difficult when talking about "accountability" of decision-makers in terms of achieving the objectives.
Other types of errors can be identified with respect to the allocation of resources for public policy and the logical links of allocated resources to the objectives and results assigned to the public policy.
Finally, there is another type of mistake that relates to the "changeability" of the public policy. In a number of sectors, public policy is launched on the basis of ideas that are still insufficiently identified or understood and that sector policy officers are eager to implement in the field; then, as feedback (inconsistencies, protests, etc.) comes back from the field, public officers improve the "content" of the policy, and this is done recurrently throughout the life of the policy, which makes it difficult to evaluate.
All this to say that the problem is not in public policy or its evaluation.
It is only fair that under such circumstances, the evaluation of a public policy elaborated in an administrative "straitjacket" devoid of logic and without knowledge is not possible and will have no conclusive result on the improvement of the development work ... It can only be used to tell the sectoral politician what he likes to hear, and this is not the role of evaluation and what is done in the advanced world.
535 avenue Ampere #5
Laval, QC, Canada
Contribution to the exchange between Emile and Bintou on the necessary distinction between outputs and outcomes.
Outputs are all goods and services developed through the project's activities via the use of the project's resources and inputs (in Emile's case, these are the distributed insecticide-treated nets). Outcomes would be the changes (of course, they should be "positive" changes, otherwise we should close that project) that can appear in the living and income conditions of the project's target beneficiaries (in Emile's case, this is the reduction of malaria incidence).
The difference between the two is that outputs - as well as activities and inputs - are part of the project's "controlled" environment (you can decide what and how much to buy and distribute), while outcomes remain the influence that the project intends to have IF AND ONLY IF THE PROJECT'S TARGET BENEFICIARIES USE WHAT THE PROJECT DISTRIBUTED. This is why outcomes are part of the project's "influenced" environment.
And this is what makes outcomes more difficult to achieve than outputs: the project management unit has not the slightest control over the changes among beneficiaries. It all depends on how relevant the implemented activities were in generating outputs that can really solve the problem situation identified at the outset. If we can borrow concepts from marketing, and if we assume that outcomes represent the changes requested by the beneficiaries (that is the "demand") and the outputs are the means to bring about these changes (that is the "supply"), then the "supply" needs to meet the "demand" in order for changes to occur.
Contribution to Dowsen's reaction
Yes, this is what I do myself before I go on to draft a Results Framework (or Logical Framework) at the start of project design, and I use this results framework for setting the M&E plan and for guiding any evaluation later. I start with the problem situation assessment (i.e. the Problem Tree tool) using the "cause-effect" causality law. Then, turning each problem identified in the Problem Tree into a positive statement, I develop the Objective Tree, followed by a fine-tuning of the Objective Tree using the "means-end" causality law. From the Objective Tree, I can identify the best alternative "result chain" and move with it very easily to the results (or logical) matrix, and so on...
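For colleagues who like to see such steps written down, here is a minimal Python sketch of the move from Problem Tree to Objective Tree to candidate result chains. It is only an illustration under my own assumptions: the names `Node`, `to_objective_tree` and `result_chains` are hypothetical, the malaria statements are invented for the example, and the rewording of each problem into a positive statement remains the analyst's work, which the code simply takes as given.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One box in a Problem Tree or an Objective Tree (hypothetical structure)."""
    statement: str
    children: list["Node"] = field(default_factory=list)  # causes, or means


def to_objective_tree(problem: Node, positive: dict[str, str]) -> Node:
    """Mirror a Problem Tree into an Objective Tree.

    `positive` maps each problem statement to the positive statement
    written by the analyst; cause-effect links become means-end links.
    """
    return Node(
        statement=positive[problem.statement],
        children=[to_objective_tree(c, positive) for c in problem.children],
    )


def result_chains(objective: Node) -> list[list[str]]:
    """List every path from the root objective (end) down to a leaf (means)."""
    if not objective.children:
        return [[objective.statement]]
    return [
        [objective.statement] + chain
        for child in objective.children
        for chain in result_chains(child)
    ]


# Invented example data, for illustration only.
problem_tree = Node(
    "Malaria incidence is high",
    [
        Node("Households lack insecticide-treated nets"),
        Node("Stagnant water remains near homes"),
    ],
)
positive_statements = {
    "Malaria incidence is high": "Malaria incidence is reduced",
    "Households lack insecticide-treated nets": "Households own and use insecticide-treated nets",
    "Stagnant water remains near homes": "Stagnant water near homes is drained",
}

objective_tree = to_objective_tree(problem_tree, positive_statements)
chains = result_chains(objective_tree)
for chain in chains:
    print(" <- ".join(chain))
```

Each printed line is one alternative "result chain"; choosing the best one and fine-tuning it remains a matter of judgment that no code can replace.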
Contribution to Reagan Ronald's reaction on handbooks' quality.
I am not sure that any of us has attributed the poor quality of evaluation handbooks to evaluators or international consultants in evaluation. Personally, I made it clear that a handbook's content can sometimes be of good quality but be presented and disseminated through a very poor communication process. Based on what I know, the content of many handbooks was prepared by high quality consultants in evaluation. However, relying on my minor competency in knowledge and information systems and communication, a good handbook, in general, and in evaluation, in particular, must rely - as a communicative tool - on 4 necessary criteria: (1) a good, appropriate, relevant, and purposeful content; (2) an adequate means of dissemination; (3) a good knowledge of the targeted population; and (4) an environment conducive to the use of the information. For many handbooks, we focused more on (1) and a bit less on (2) and (3), and this is not enough to give birth to good quality handbooks on any subject, not only on evaluation guidelines. Moreover, the consultant in charge of content can be quite good in terms of content (i.e. the substantive knowledge) but may not be very qualified in terms of communication. This is why I always recommend building a team of an evaluator and a communication specialist to produce a good quality handbook on evaluation.
Hope that I added a bit to this discussion.
Dear Natalia et al.,
Thank you for putting on the table an important challenge to both the evaluator and the manager of a development project. And I want to apologize for not being able to answer earlier; the situation in my country had taken over my mind and took all my time during the last 3 weeks. The question of clearly distinguishing an output from an outcome is of utmost importance for the development project manager, for the evaluator, and for the project monitoring and evaluation staff. And I doubt that the problem is really a terminology problem, at least theoretically speaking. According to my modest experience, the problem has its origin in several factors that I will try to explain below:
That said, good training of project staff in monitoring and evaluation, based on a good logframe and result chain, can sometimes be the key to this problem. And to support this, I would like to share an experience I personally had in Sudan in 2003 on a project co-funded by IFAD and the Islamic Development Bank (IsDB) in North Kordofan State.
I was contracted by IFAD to support the consolidation of the monitoring and evaluation system of this 7-year project while it was in the 4th year (first anomaly). The project was to deliver several outputs, including a 60 kilometre tarmac road between the State capital, El-Obeid, and the State second city, Bara, entirely financed by IsDB.
Locked up for 3 days with the entire project team, I was able to clearly see, through the effect indicators proposed to me, that the project management team, including the principal responsible for monitoring and evaluation, was unable to clearly differentiate between the deliverable (the tarmac road) and the effects this deliverable could engender on its beneficiaries' living and income conditions. And slowly, my intervention and assistance made it possible for the project staff to start differentiating between a deliverable and its effect - as a development intervention - which can be perceived only at the level of the social segments benefiting from a deliverable and not in the deliverable per se. I fully understand that the transformation of a stony road into a tarmac road is a change, but without the inclusion of the human dimension in our vision, it is difficult to pinpoint the development achieved. As proof, where could we perceive any development from a new deliverable that is completed but stays closed for 3 years, for example, if human beings do not take advantage of it to change their living and income conditions (isn't that so, Hynda?)? Thus, from the second day onwards, the project team members started to differentiate things, suggesting better outcome indicators – completely different from output indicators – which served 3 years later for a good evaluation of the effects of the deliverable "tarmac road".
Thus, this little story highlights the necessary link that needs to be established between monitoring and evaluation from the start of a project – through mobilizing all necessary resources for the monitoring and evaluation system, including the necessary skills – so that evaluation can be done without much difficulty.
But even more importantly, although I am in favour of the evaluator's "freedom of expression" (Isha), this necessary link between monitoring and evaluation will certainly lead to better ToRs for evaluation, guaranteeing this freedom within the framework defined by the project team. Without this link, too much freedom of expression for the evaluator may put a project at risk of receiving an evaluation report that is meaningless.
Sorry to have been a little long but the importance of the question asked by Natalia forced me to resort to certain details. I hope I have contributed a little bit to this discussion.
Thanks for raising this important issue, which touches on one of the most important challenges of any M&E system: its 'social learning' dimension. Besides, it was quite informative to read the contributions submitted within this debate – thanks to your suggestion – especially those of Ronald and Zahid.
The situation you depict is the one that is similar to what you might find in other African countries – I was involved between 2013 and 2015 in a very interesting AfDB initiative entitled "Africa 4 Results" and had a chance to visit some Western and Eastern African countries to face a very similar situation.
I don't have all the necessary information to argue anything about your country, but I have the feeling that in your case, the building of a national M&E system seems to have started from the "hardware" part without paying attention to the "software" issue. I am inclined to believe that in your case much attention was given to projects, and that the data collected through project monitoring do not fit into national policies. And on this I would join my voice to Ronald's and Zahid's contributions.
That said, we need to acknowledge that the construction of a national M&E system must start with the publication of a general M&E legal framework that will first require the Government to have a mid-term strategic plan of "multi-dimensional" development, to which a results framework is annexed. This national strategic plan must have been prepared through a "true" participatory approach and be endorsed in the end by the Parliament.
At the second level, this national "multi-dimensional" development plan will serve for each sector as a reference framework to establish a mid-term strategic sectoral plan to which is annexed a sectoral results framework. Each strategic sectoral plan must be approved by the Government and should bear a results framework that links the sectoral strategy to the mid-term national development plan.
At this level, any new project or programme will need to have a results framework that will link this project or programme to the sectoral plan. This is the "software" part that I mentioned above.
After that, the "hardware" part of the national M&E system is set up on the basis of a concept note showing the inter-relations between the different levels of the national M&E system; the standard form of the M&E unit at the different levels; the data collection procedures and methods; the reporting system and its timing; etc.
With this, one can assume that monitoring data, once collected at the project level, can easily be aggregated at the sectoral level, allowing the sectoral plan to feed back into the national strategic plan.
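To make this aggregation more concrete, here is a small Python sketch of what I have in mind. It is an illustration under my own assumptions, not a description of any existing national system: the project records, the indicator names, the national results "N1" and "N2", and the functions `aggregate_by_sector` and `feed_national_plan` are all hypothetical, and simple summation only makes sense for additive indicators (counts, kilometres), not for rates or percentages.

```python
from collections import defaultdict

# Hypothetical project-level monitoring records. Each project's results
# framework names the sectoral indicator it contributes to.
project_data = [
    {"project": "P1", "sector": "health", "indicator": "nets_distributed", "value": 1200},
    {"project": "P2", "sector": "health", "indicator": "nets_distributed", "value": 800},
    {"project": "P3", "sector": "transport", "indicator": "km_road_paved", "value": 60},
]

# Hypothetical links from each sectoral indicator to a national-plan result.
sector_to_national = {
    ("health", "nets_distributed"): "N1: improved public health",
    ("transport", "km_road_paved"): "N2: improved rural access",
}


def aggregate_by_sector(records):
    """Sum additive indicator values per (sector, indicator) pair."""
    totals = defaultdict(int)
    for record in records:
        totals[(record["sector"], record["indicator"])] += record["value"]
    return dict(totals)


def feed_national_plan(sector_totals, links):
    """Group sectoral totals under the national result they are linked to."""
    national = defaultdict(dict)
    for (sector, indicator), total in sector_totals.items():
        national[links[(sector, indicator)]][f"{sector}/{indicator}"] = total
    return {result: dict(values) for result, values in national.items()}


sector_totals = aggregate_by_sector(project_data)
national_view = feed_national_plan(sector_totals, sector_to_national)
print(sector_totals)
print(national_view)
```

The point of the sketch is that the aggregation only works because every project record already carries its link to a sectoral indicator, and every sectoral indicator its link to a national result: that is the "software" part. Without those links, the "hardware" has nothing coherent to aggregate.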
In a situation such as the one you describe, starting with the "hardware" part, the majority of Government senior and line staff might feel that M&E is just an additional "administrative" workload imposed from the top, and the lack of conviction in M&E will be very apparent.
Thinking of disseminating M&E results is highly recommended, but talking about M&E "value for money" may just be seen as inappropriate, as M&E work is a sort of "quality insurance" or "life insurance" for development. Using such a metaphor, one can easily admit that having a "quality insurance" or a "life insurance" certainly has a cost, but omitting to have that insurance will certainly have an "at-least-ten-times" higher cost. This is why I believe "value for money" is not the right concept to apply to a given M&E system. I do not want to be too provocative, but I feel that this issue of "value for money" is just a "proxy" indicator for a lack of conviction towards M&E work.
Mustapha Malki, PhD
Thanks, dear Naser, for bringing this issue again to the forefront.
We should not stop 'hammering' that evaluation cannot and should not be disconnected from monitoring, and we should do all we can to connect them from the start, at the moment of developmental action formulation, be it a project, a programme, or a policy.
It is a fact - and nobody can deny that - that most of the time developmental actions are:
But why is this still happening after eighteen years of the MDG endeavour?
Because of weak or insufficient M&E capacities within national systems in almost all developing countries, but also because of a 'striking' reluctance and lack of political will to adopt a national M&E framework for national development. Again, this fear of M&E as a control and audit system is in the air...
Besides, whenever international organizations plead the need to build national capacities on this issue, the stress and focus are put on evaluation, and very little consideration is given to monitoring.
And again, I would claim that monitoring and evaluation - and not monitoring or evaluation - are the two 'legs' of a system on which will stand a developmental action seeking to ensure achieving its expected results; choosing the one or the other would just mean that our development action - as a person standing on one leg - will certainly fall short of achieving its expected results.
That's what I wanted to say as a rejoinder to Naser's contribution...
Many thanks to our dear Hynda for opening a very interesting debate on the challenges and constraints that hinder the emancipation of evaluation in some countries. All that has been said is quite valid, nevertheless the lack of understanding of the evaluation function, as evoked by Hynda, very often perceived as a control and forcing many individuals to positions of resistance for different reasons, remains one of the challenges that must be addressed. From my modest experience in the various results-based management training workshops that I lead, in their monitoring and evaluation component, I always start by demystifying the monitoring and evaluation functions among participants by asking a simple question: do we do monitoring and evaluation in our daily life? And I engage in a frank and serene debate with the participants by taking them to evoke examples of the everyday life where the human being practices the monitoring-evaluation in a rather intuitive and fortuitous way. The example of a trip by car to a destination where we have never been to arrive at a specific date and time, according to a precise route that we have never taken, is the example that comes up quite often. And here we begin to dissect our actions to finally discover that we do quite often monitoring-evaluation, sometimes without realizing it, and concluding that eventually the monitoring-evaluation is rather in our favor than to our disadvantage.
However, there are other challenges for the evaluation function that I can personally advance, by way of illustration and without being exhaustive, and that are better housed in the immediate environment of the evaluation function, including:
This is what I wanted to share with colleagues as a contribution to this debate ...
Dear all,
As I am following the thread of this discussion, I get more convinced that platforms such as the EvalForwARD CoP have to exist for evaluation practitioners of all backgrounds: they can only provide assets and advantages to all of us. Why am I saying this? Because I feel and "smell" in the air some confusion in conceptualising "evaluation".
According to my modest experience in Monitoring & Evaluation (M&E), I see "evaluation" as strongly bounded by the Theory of Change defined during the project/programme formulation stage, and by the results framework we assign to a given developmental action, be it a project, a programme, or a policy. Though lots of things need to be evaluated in any project/programme in order to be more comprehensive in our understanding of what worked and what didn't, we have to be faithful to what that project/programme was assumed or assigned to change. And for this, I join my voice to Emmanuel Bizimungu and Dr. Emile Houngbo, saying that we cannot evaluate anything and everything but have to stay "targeted". Quoting Robert Chambers, I would say that we should opt for an "optimal ignorance" so as not to get our research efforts diluted in different senses and directions.
In some interventions in this discussion thread, I assume that some friends are using the term "evaluation" as if it were a sectoral assessment study, a sort of "état des lieux", as we say in French, or a "state of the art study" of the agricultural sector. If this is the case, let us use the words properly and keep the term "evaluation" for what it is meant: "the systematic and objective assessment of an on-going or completed project, programme or policy, its design, implementation and results. The aim is to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact and sustainability… An assessment, as systematic and objective as possible, of a planned, on-going, or completed development intervention." (OECD, 2002 – Glossary of key terms in evaluation and Results-based Management).
It is then clear that "evaluation" is something different from doing an "état des lieux" or a "state of the art study" in terms of objectives, orientation, and use, although there are some common features shared among all. But for evaluation, as a peculiar characteristic that it bears, we have to develop an evaluation matrix backed by some evaluation questions and a strong and robust research methodology before we start collecting any data.
Furthermore, we have to keep in mind that, as the same glossary clearly mentions, "evaluation in some instances involves the definition of appropriate standards, the examination of performance against those standards, an assessment of actual and expected results and the identification of relevant lessons". This is why evaluation – the discipline and not the perceived term – has over the last decade been developing into a new social science, for which specialists get officially accredited in some countries, such as Canada, for example.
Sorry for being too long, but there was a need to clear my mind and draw the attention of colleagues to the slight confusion I perceived.
Mustapha
Hello everyone,
When I decided to join this community, I had great hope for simple but interesting debates on the importance of M&E in general, and of evaluation in particular, for the development practitioner, and for encouraging the generalization of its practices for sustainable development by 2030.
The debate on developmental evaluation, launched by our colleague Prosper, and which I am following, is a debate that only academics who have perfectly mastered the art of "intellectual speculation" can afford, because they have the time for it. Moreover, in my humble opinion, such a debate can bring nothing to the development practitioner except additional confusion about the usefulness and importance of both monitoring and evaluation.
First of all, I note from what I read in this debate that we are not all on the same wavelength with respect to the concept of developmental evaluation. Some contributions lean towards the concept developed by M.Q. Patton, quoted several times in an article and a PPT presentation shared by our colleague Koffi; others evoke a concept very close to evaluation in general, which aims to assess the effectiveness and efficiency of interventions, as presented by our colleague Émile.
In the first case, having read several of Patton's books and articles, I understand that he describes an evaluation approach that accompanies the intervention (i.e. a project, a program, or a policy) throughout its implementation, so that the intervention team uses the evaluation results to improve the performance of the intervention, or possibly to reformulate it continuously until it meets the needs of the beneficiaries. I believe, to simplify the debate, that this is the expected role of the Monitoring function in any M&E system. Why then do we try to wrap it in new packaging called "Developmental Evaluation"? If, in the team in charge of an intervention, more importance and sufficient means are given to the monitoring function, for instance by developing participative mechanisms within it, I am certain that the team will reach convincing results in terms of performance and adaptation of the intervention, exactly as Patton's "Developmental Evaluation" concept proposes. The only difference is that this Monitoring function will be less costly for the team and driven by its internal resources, something that almost all M&E training manuals recommend.
In the second case, our colleague Émile evokes what is really the role of evaluation, since it must focus on the effects and impacts at the macro-economic level in relation to the main development indicators. For my part, and to put it simply, this is what must be attributed to the "Evaluation" function in the monitoring-evaluation system of a given intervention; whether or not we add the adjective "developmental" does not change this "Evaluation" function. In fact, which project, program, or national or sectoral policy does not intervene in the developmental sphere? And which evaluation of such a project, program, or policy is not intended to appreciate the effects, especially what are commonly known as end effects, and the impacts?
Having said that, I think our community is made up mostly of practitioners in the field who want to see debates that bring them practical solutions, appropriate to their problems, that they can implement on the ground. So my recommendation is to develop simple debates on current topics and to avoid unnecessary confusion for our development practitioners. Instead, let us help them strengthen their monitoring and evaluation system by further strengthening their "Monitoring" function and further developing their "Evaluation" function.
Thanks, dear Eoghan, for taking time to go through my contribution and give more information about the evaluation.
The picture of introducing CA and getting it adopted by farmers is very similar to what we have done for the last 3 decades in technology dissemination and adoption (intensive package on cereal cropping, use of quality seeds, mechanization, herbicide use, water-saving irrigation techniques, etc.). That general picture always shows some of the following aspects:
1. Project technical staff are very enthusiastic about showing their "successes" in the field by pointing to the large numbers of farmers enrolled by the project, and jump without hesitation to treating this as a high rate of technology adoption. They become very defensive when one asks them whether they took the time to get to know their beneficiaries in depth.
2. Farmers are keen to apply a new technology when someone else is covering the cost. But once the project is closed, we see what is really happening among farmers. Most of the time, farmers who participated in a closed project start asking when the next project will start and whether they will be part of it, as if the closed project had been just a game and the game was now over (I am becoming a bit cynical on this).
3. Little is done in terms of evaluation of the project outcomes, impacts, sustainability of both and so forth...
I am telling you this because I was involved in 2013 in a 4-year Maghreb CA project funded by Australia and implemented by ICARDA in Algeria, Morocco and Tunisia. I was involved in setting up the M&E plan for that project and trained a group of Maghreb researchers and development practitioners in Results-Based Management so that they could make good use of that M&E plan. All social actors involved in that project praised the work done (M&E Plan + RBM Training), especially the Australians, who were very keen to put strong pressure on ICARDA to set up the M&E Plan. But the project was closed after 4 years in the same way as I have seen many projects close (you can imagine the picture - business as usual).
But in your case, and the case of the CA project you evaluated, I am happy to see that your message paints the picture as it is in reality, i.e. that CA was not that "rosy" technology that could fit most farmers in Africa, especially as it was applied in a "one-size-fits-all" approach, with little knowledge - not to say "no" knowledge - of the beneficiaries, and that the case presents some shortcomings that you are not hiding. This is what evaluation has to do, and how it has to do it. Good to read a balanced contribution on a new technology.
As for the issue of sampling, especially with a "fixed constituency" for 4-5 years between baseline and project end, it is always tricky to get the required robustness in our survey. But you tackled the issue through triangulation, using multiple sources of data, and honestly I would have gone the same way. Still, locating 317 farmers among 385-390 at the end of the project is quite an endeavour in itself. That's why I mentioned in my previous contribution the need in such cases to make the sample bigger at the baseline in order to cover such turmoil at the end.
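To illustrate the point about making the baseline sample bigger, here is a minimal sketch in Python. It assumes, purely for illustration, that the retention rate seen in this case (317 farmers located out of about 385) is what one would expect in a similar project, and it simply divides the number of farmers wanted at the end by that rate. The function name `baseline_size_for_attrition` is my own and the calculation is a rule of thumb, not the method used by the evaluation team.

```python
import math


def baseline_size_for_attrition(target_at_endline, expected_retention):
    """Number of farmers to enrol at baseline so that, after attrition,
    about `target_at_endline` of them can still be located at project end.

    `expected_retention` is the share of baseline farmers expected to be
    found again (an assumption to be set from past experience).
    """
    if not 0 < expected_retention <= 1:
        raise ValueError("expected_retention must be in (0, 1]")
    return math.ceil(target_at_endline / expected_retention)


# Retention observed in this case: 317 farmers located out of about 385.
observed_retention = 317 / 385

# Baseline size needed to still have about 385 farmers at project end,
# if the same retention were to apply (illustrative assumption).
print(baseline_size_for_attrition(385, observed_retention))
```

With these figures the sketch suggests enrolling roughly 470 farmers at baseline rather than 385.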
Finally, the way you presented the things made me more curious and "hungry" to look at the evaluation report. Without engaging in a formal commitment, I will download the evaluation report for which I am very thankful to you and try to squeeze some time to read (summer time is rushing away and missions and travels will start again very soon in September).
Good luck and kind greetings
Dear Mr. Molloy,
I have read with great attention your contribution on the evaluation of CASU in Zambia. I must congratulate your department on such an achievement. However, I have two points to make here.
You mention at the start of your contribution that the Conservation Agriculture project is "targeting over 300,000 smallholder farmers". That is the entire population of the project. You also mention that "the main focus of the evaluation was to assess the extent to which conservation agriculture has been sustainably adopted by Zambian beneficiary farmers … also sought to assess what outcomes were evident (positive and/or negative) from the project’s activities, and what were the impacts on food security, income, and soil health".
The first point I want to raise concerns the adoption study that you highlighted in your message. Though I don't have all the details about how the study was conducted and what results it achieved, I would like to use this opportunity to share some experience with adoption studies, which are a sort of outcome evaluation that also asks whether those outcomes are sustainable over time. Everett Rogers, one of the gurus of technology adoption by farmers, instructs us not to measure the adoption rate just once, or at any arbitrary time. One has to understand the technology adoption process among farmers in order to design adoption studies and set up appropriate protocols for them. I have seen many adoption studies report high adoption rates at the end of a project, while very few farmers were still using the technology 5-10 years later. This is because what looks like adoption to researchers is just experimentation to farmers; real adoption by farmers comes well after the end of the project.
The second point I want to raise concerns the household survey undertaken by the University of Zambia and the sample size used by the research team. Besides other activities conducted within this evaluation (among which focus groups with 650 beneficiary farmers), you mention "a household-level impact assessment survey to collect quantitative data amongst a sample of over 300 farmers, in order to assess progress against the baseline survey".
Nobody can deny that a survey is only truly valuable when it is reliable and representative of the entire population of the project's beneficiaries. This is why determining the right survey sample size, with robust external and internal validity, is so important: it allows the research team to infer and extrapolate the results obtained on the sample to the entire population of the project's beneficiaries.
Using a correct survey sample size is crucial for any research, and project evaluation is research. A sample that is too big wastes precious resources such as time and money, while a sample that is too small, though it can yield sound results (strong internal validity), will certainly not allow its results to be inferred and extrapolated to the entire project population (weak external validity).
So, the sample size cannot be determined by how much a research team can handle but by how accurate the survey data ought to be. In other words, by how closely the research team wants the results obtained on a sample to match those of the entire project population.
In statistics and probability, we use two measures that affect the accuracy of data and that are of great importance for the sample size: (1) the margin of error, for which in most cases we use 5%; and (2) the confidence level, for which in most cases we use 95%. Based on these two measures, and given the population size, the research team can calculate how many respondents (people who completely fill in the survey questionnaire) it actually needs; that is the survey sample. Besides all this, the research team must allow for the response rate, that is, the share of "really exploitable" survey questionnaires, and field additional questionnaires beyond the sample so that it ends up with a sufficient number of completed questionnaires to exploit. The table can give an idea on the sample size for a project population of 300,000 individuals. For example, if we target 380-390 "exploitable" questionnaires, we allow 20-25% more questionnaires so that the survey is not put at risk of weak robustness.
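For colleagues who want to check these figures themselves, here is a minimal sketch in Python. It assumes Cochran's formula for a proportion with a finite population correction and the most conservative proportion p = 0.5; this is one common way to obtain the 380-390 range for a population of 300,000, not necessarily the method behind any particular table. The function names are my own.

```python
import math
from statistics import NormalDist


def required_sample_size(population, margin_of_error=0.05,
                         confidence=0.95, proportion=0.5):
    """Sample size for estimating a proportion (Cochran's formula with a
    finite population correction). p = 0.5 is the most conservative choice."""
    # z-score for a two-sided confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinitely large population
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Correction for a finite population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


def questionnaires_to_field(sample_size, allowance):
    """Add an allowance (e.g. 0.20 to 0.25) for unusable questionnaires."""
    return math.ceil(sample_size * (1 + allowance))


n = required_sample_size(300_000)            # 384 exploitable questionnaires
print(n)
print(questionnaires_to_field(n, 0.20))      # 461 questionnaires to field
print(questionnaires_to_field(n, 0.25))      # 480 questionnaires to field
```

Under these assumptions, a population of 300,000 requires 384 exploitable questionnaires, and the 20-25% allowance brings the number of questionnaires to field to between 461 and 480.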
In conclusion, I believe that the sample size for the household survey undertaken as part of the CASU evaluation was a bit lower than what probability sampling would require. Of course, this has no consequence for the results obtained within the sample as such, but the survey findings cannot be robustly inferred and extrapolated to the entire population of the project's beneficiaries, because the external validity of the sample is weak when the principles of probability sampling are not respected.
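To give an order of magnitude, here is a minimal sketch in Python of the margin of error actually achieved for a given sample size. It assumes simple random sampling, a proportion of 0.5 and a 95% confidence level, and it uses 300 respondents as an illustrative figure for "over 300 farmers"; the real survey design may differ, so this is indicative only. The function name is my own.

```python
import math
from statistics import NormalDist


def achieved_margin_of_error(sample_size, population,
                             confidence=0.95, proportion=0.5):
    """Margin of error for a proportion under simple random sampling,
    with a finite population correction."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    correction = math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error * correction


# Illustrative figure: 300 respondents out of 300,000 beneficiaries
print(round(achieved_margin_of_error(300, 300_000) * 100, 1))  # in percent
# For comparison: 384 respondents
print(round(achieved_margin_of_error(384, 300_000) * 100, 1))  # in percent
```

Under these assumptions, 300 respondents give a margin of error of roughly 5.7%, against the usual 5% target reached with 384 respondents.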
I would like to contribute to the debate on gender mainstreaming in the evaluation of development actions - I use the generic term development action to refer to a project, program or policy. I find the debate launched by Georgette quite important, and it must not remain philosophical and sterile, given how much development practitioners need practical elements to correct the way they do things. It goes without saying that the "gender" dimension is very important for development, but this should not lead us to use it as a "master key" for all development actions; rather, we must deal with the "gender" issue in a systematic and mandatory way in development actions that have an undeniable gender dimension.
Reading between the lines of this debate, I cannot help thinking that we are dealing with this issue only at the time of the evaluation - a debate that I come across very often among practitioners of "simple and simplified" evaluation. However, this aspect must be dealt with well in advance: in general, at the time the development action is formulated and its results framework designed; in particular, in the choice of indicators and in data collection, which should be disaggregated by gender; and in the establishment of the monitoring and evaluation system for the development action.
If a given development action is articulated on a strong gender dimension, the reading of the project document, its results framework and indicators, and its monitoring and evaluation plan, etc., must reflect this strong gender dimension, even before the activities of this development action are launched on the ground. Without such an integrative perspective of Results-Based Management, the evaluation will be totally disconnected from the rest of the activities of a development action, including monitoring-evaluation activities, and will not help us in such a situation to bring all the necessary answers to questions about gender that we might ask ourselves at the time of an evaluation.
It is to this restrictive debate on evaluation, which reduces and "chops up" the management cycle of a development action, that I wanted to make an initial contribution in my first message on this platform. For a few weeks now I have wanted to draw the attention of all members to the danger of speaking about evaluation in a restrictive way, outside a Results-Based Management perspective.
A word to the wise!