Most evaluations require adjusting methodology to real-world constraints, whether they be economic, technical, organizational, political, or related to time or data.
Evaluators working in the areas of food and agriculture are constantly seeking creative ways to produce credible evaluation findings and recommendations while working under one or more of these constraints. Big data and data science can make a difference.
With the rapid expansion of big data and data science in all areas of our personal and professional lives, a vast range of new tools and techniques for data collection, analysis and dissemination are becoming available, many of which hold exciting potential for the evaluation of development programmes.
The potential of big data technologies is especially evident in the current pandemic context, where they can make it easier to collect and analyze data remotely, safely, rapidly, and economically.
Challenges facing current evaluations
Let us take a look at some of the main challenges currently facing many evaluations:
- Costs of data collection and analysis: Data collection represents a major cost in most evaluations, and the scope of an evaluation is often limited by such costs.
- Small samples limit the kinds of analysis possible: Evaluators are often under pressure to reduce sample size, which limits their ability to conduct disaggregated analysis and many kinds of statistical analysis.
- Constructing counterfactuals (comparison groups): Data costs and small samples make it difficult to construct a counterfactual design.
- Exclusion of hard-to-reach groups: Many vulnerable groups are more expensive or difficult to reach and may risk being left out.
- Exclusion of data that is more difficult to measure: Data on behavior, processes, and attitudes may be excluded.
- Addressing complexity: Most conventional evaluation designs are not able to collect the kinds of data required to adequately analyse complex programmes.
- Sustainability and longitudinal analysis: It is difficult and expensive to collect the kinds of longitudinal data required to evaluate whether programmes are sustained over time.
How big data and data analytics can strengthen evaluations
Below is a list of some widely used big data techniques, followed in Box 1 by examples of their actual or potential applications in food and agriculture evaluations.
- Satellites and drones. Satellites provide lower-resolution images covering larger areas, while drones provide higher-resolution images of smaller areas. Large numbers of variables can be collected economically and quickly over a vast area and over long periods of time.
- Remote sensors (Internet of things). Sensors can track movement, the use of services such as drinking water, sanitation and irrigation, and compliance with protocols (such as carbon-reducing rice cultivation technologies).
- GPS location data. These data can record the location of infrastructure or events, such as traffic accidents, and track movement, for example to locate refugees.
- Social media, including radio call-in programmes. Analysis of Twitter, Facebook and other social media posts can identify and track potential problems like poverty hot-spots and ethnic tensions, in addition to attitudes, behavior and participation in different kinds of groups.
- Internet search data. Search data can be used to predict future actions. For example, searches for information on fertilizers might signal future farming decisions.
- Integrated data platforms and agency data files. Data analytics can merge multiple sources of data into a single platform so that previously unrecognized relationships can be identified.
- Biometric data. Health and other biometric indicators can be tracked with monitors attached to the bodies of humans or livestock.
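As a rough illustration of the text analysis mentioned for social media posts and radio call-in programmes, the sketch below counts how often a set of watched terms appears per region. The term list, the region labels, and the sample posts are all hypothetical; a real study would need a validated vocabulary, local-language handling, and attention to who is and is not represented in the data.

```python
import re
from collections import Counter

# Hypothetical keyword list; a real study would build and validate its own.
HUNGER_TERMS = {"hungry", "hunger", "starving", "famine"}

def count_terms_by_region(posts, terms=HUNGER_TERMS):
    """Return {region: occurrences of watched terms} for (region, text) pairs."""
    counts = Counter()
    for region, text in posts:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts[region] += sum(1 for w in words if w in terms)
    return dict(counts)

# Made-up posts or call-in transcripts, tagged with a region.
sample_posts = [
    ("north", "No rain again, families are hungry"),
    ("north", "Hunger is spreading after the failed harvest"),
    ("south", "Good maize prices at the market this week"),
]
print(count_terms_by_region(sample_posts))
```

Tracking such counts over time, rather than at a single point, is what would allow an increase in hunger-related terms to be detected.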
Many of the techniques can be used to construct pretest-posttest (or baseline-end of project) designs, with or without a comparison group. The data generated by satellite images and some other sources can also be used to construct statistically strong comparison-group designs using techniques such as propensity score matching. Many of the technologies can also generate continuous, time-series data permitting the use of more sophisticated designs.
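To make the idea of propensity score matching more concrete, here is a minimal sketch in plain Python. It assumes a single covariate per village (a baseline vegetation index of the kind that might be derived from satellite images), estimates each village's probability of being in the programme with a small logistic regression, and then pairs every programme village with the comparison village whose score is closest. The data, variable names, and the built-in effect of 2 are invented for illustration; a real analysis would use many covariates, check balance and overlap, and typically rely on an established statistics package.

```python
import math

def predict_proba(w, x):
    """Logistic model: probability of programme participation given covariates x."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns [intercept, *coefs]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def att_by_matching(X, treated, outcome):
    """Average effect on programme units, using 1-nearest-neighbour matching
    (with replacement) on the estimated propensity score."""
    w = fit_logistic(X, treated)
    scores = [predict_proba(w, x) for x in X]
    controls = [i for i, t in enumerate(treated) if t == 0]
    diffs = []
    for i, t in enumerate(treated):
        if t == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcome[i] - outcome[j])
    return sum(diffs) / len(diffs)

# Hypothetical villages: one covariate (baseline vegetation index, scaled 0-1),
# a programme indicator, and an outcome built as 10 * covariate + 2 * treated.
X = [[0.1], [0.2], [0.4], [0.6], [0.8], [0.4], [0.6], [0.6], [0.8]]
treated = [0, 0, 0, 0, 0, 1, 1, 1, 1]
outcome = [1.0, 2.0, 4.0, 6.0, 8.0, 6.0, 8.0, 8.0, 10.0]

n_t = sum(treated)
naive = (sum(o for o, t in zip(outcome, treated) if t) / n_t
         - sum(o for o, t in zip(outcome, treated) if not t) / (len(treated) - n_t))
att = att_by_matching(X, treated, outcome)
print(f"naive difference: {naive:.2f}, matched estimate: {att:.2f}")
```

In this constructed example the programme villages started from better baseline conditions, so the naive difference in means (3.8) overstates the effect, while matching on the propensity score recovers the built-in effect of 2. Real data will rarely offer such exact matches.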
|Box 1 - Examples of big data use/potential in food and agriculture evaluations|
|Satellite data: For a large irrigation rehabilitation program of FAO in Afghanistan, the evaluation team used Google Earth to verify preliminary information from enumerators on the expansion of rehabilitated canals and changes in vegetation, comparing conditions before and after the programme. In the context of a WFP Impact Evaluation of the Livestock Insurance Scheme in Ethiopia, the vegetation index was used as an indicator of an ongoing drought that would affect pastoralists’ livelihoods and entitle them to receive an insurance payment.|
|Remote sensing: IFAD used remote sensing in a recent Country Strategy and Programme Evaluation in Nepal to identify the state of sensitive ecosystems such as inhabited mountain slopes, and to track degradation of vegetation due to human and livestock activity.|
|GPS location data: could be used to track migration patterns from GPS-enabled phones, or the time spent by women to collect water.|
|Social media data: to identify potential poverty hot spots through the increase of terms associated with hunger, for example.|
|Call-in radio programmes: textual analysis can be used to document frequency of references to different problems and concerns of farmers in different regions.|
|Internet search data: using frequency of searches for information about destination cities as a predictor of migration from areas of high to lower unemployment.|
|PDF files and other organizational records (transactional data): merging previously unconnected organizational documents to create an integrated data platform to identify previously undetected associations and patterns.|
|Integrated data platforms that merge many secondary survey data files, public records, and social service agency client files. This permits the identification of patterns and analysis of the effects of contextual variables on program performance.|
|Biometric data [the “quantified self”]: cost-effective ways to collect health and biometric data on individuals and communities.|
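The satellite examples in Box 1 rely on a vegetation index to compare conditions over time. The box does not say which index was used; the sketch below assumes the widely used Normalized Difference Vegetation Index (NDVI), computed from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The reflectance values are made up, and real imagery would also require cloud masking, calibration, and comparison of the same season across years.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel's reflectances."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here to avoid division by zero
    return (nir - red) / total

def mean_ndvi(pixels):
    """Average NDVI over a list of (nir, red) reflectance pairs."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)

# Made-up reflectance pairs for the same plot before and after a programme.
before = [(0.30, 0.20), (0.28, 0.22), (0.32, 0.24)]
after = [(0.50, 0.10), (0.48, 0.12), (0.45, 0.15)]

change = mean_ndvi(after) - mean_ndvi(before)
print(f"Mean NDVI change: {change:+.3f}")
```

A positive change suggests denser or healthier vegetation in the later image, while a sustained drop could serve as a drought signal of the kind described for the livestock insurance example.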
Transitioning to a new information ecosystem will pose challenges
The successful integration of big data into evaluation will require strengthening ties between evaluators and data scientists and developing a common approach to the evaluation of programmes. This would require working together on capacity development, pilot-evaluation exercises, and team building to overcome misunderstandings and, in some cases, mistrust. It would be well worth the effort.