This paper summarises analyses of survey and on-site spending data for the endline evaluation of the MVP. The MVP was initiated soon after the UN Millennium Project’s recommendations, leading to limitations in the project’s design, including the absence of a prospective comparison group for impact evaluation. This study exemplifies methods for retrospective observational studies and addresses the challenge of selecting a comparison group with scarce pre-intervention data.
Averaged across the ten MV1s, the project had a significant and favourable impact on 30 of 40 outcomes of interest and no significant adverse effects. The largest impacts were seen on agriculture, maternal health, and HIV and malaria outcomes. However, substantial variation between sites was observed. The three sites that could not be matched with DHS data were estimated to have been most favourably affected by the project, which could indicate poor matching with comparison villages, or that these sites truly had the greatest impacts, or something in between. Considering outcomes highlighted in previous evaluations,12,13 we found that the MV1s had lower under-5 stunting and mortality estimates than did the comparison villages.
Our indices combined various outcomes along hypothesised causal pathways. For example, bednet ownership, bednet use, and malaria prevalence were grouped together in the HIV and malaria index. Among these outcomes, the largest estimated impact was on bednet ownership, which was most directly linked to project activities, whereas the estimated impact on malaria prevalence was less than half as large.
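One common way to group related outcomes into a single index is to average their standardised (z-scored) values against the comparison group. The sketch below assumes this simple construction with hypothetical outcome names; the evaluation's actual index construction may differ.

```python
import statistics

def zscore_index(village_outcomes, reference):
    """Average of standardised outcomes: each outcome is centred and
    scaled by the reference (comparison-group) mean and SD, and the
    resulting z-scores are averaged into one index.
    Sketch only: outcomes are assumed pre-oriented so that higher
    values are favourable; the real construction may differ."""
    zs = []
    for outcome, value in village_outcomes.items():
        mean = statistics.mean(reference[outcome])
        sd = statistics.pstdev(reference[outcome])
        zs.append((value - mean) / sd)
    return statistics.mean(zs)
```

A village scoring above the comparison mean on every component would receive a positive index value; offsetting deviations cancel out.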
According to our assessment of target attainment, averaged across the ten MV1s, roughly a third of the targets were reached. All maternal health targets except one were reached. Some targets were reached across education, child health, HIV and malaria, and water and sanitation indices.
Total on-site spending in the MV1s decreased from an average of $132 to $109 between the project’s first phase and second phase. Although it would be interesting to see how the impacts of the project changed as on-site spending decreased, we did not estimate impacts over time because we did not have project-collected data in the comparison villages from before 2015. The available trend data in the MV1s between 2010 and 2015 showed that outcomes improved, averaged across the ten sites.
This study has some limitations. Our impact estimates are interpretable as impacts only if two assumptions hold. First, we assumed unconfoundedness, ie, that within strata defined by observed variables, the outcomes in the MV1s and comparison villages would have been the same (on average) without the project.25,49,58,65–69 Second, we assumed that outcomes in the comparison villages were not affected by the project.49
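In standard potential-outcomes notation (our formulation, not taken verbatim from the protocol), with $T_i$ indicating whether village $i$ received the project, $Y_i(1)$ and $Y_i(0)$ its potential outcomes, and $X_i$ its observed matching variables, the two assumptions can be written as:

```latex
% Assumption 1 (unconfoundedness): within strata of the observed
% covariates X_i, assignment is as good as random.
\bigl(Y_i(0),\, Y_i(1)\bigr) \perp\!\!\!\perp T_i \mid X_i

% Assumption 2 (no interference): each village's observed outcome
% depends only on its own assignment, so that
Y_i = T_i\, Y_i(1) + \bigl(1 - T_i\bigr)\, Y_i(0)
```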
The plausibility of unconfoundedness was limited by the non-random design and the scarcity of baseline data for the comparison villages. Our approach used the available data, matching on many variables that were measured before project implementation or were not affected by the MVP.26 Some matching variables were estimated from the DHS sample. We were unable to adjust for residual confounding by the true (unobserved) variables, although research suggests that such residual confounding is not substantial in most cases.70,71 For three countries (Nigeria, Ethiopia, and Tanzania), DHS data were not available, so only geographical variables were used. Two unmeasured variables, local political buy-in and community ownership, were not included in the matching; because they might have affected the selection of the MV sites, they are possible confounders. We assessed unconfoundedness, and the results did not undermine the assumption's credibility (appendix).
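The core of covariate matching can be sketched as nearest-neighbour selection over pre-intervention variables. This is illustrative only: the variable names are hypothetical, covariates should be standardised first, and the evaluation's actual procedure matched on many more variables.

```python
import math

def match_comparisons(mv_sites, candidates, covariates):
    """For each MV site, choose the candidate village minimising
    Euclidean distance over the listed pre-intervention covariates.
    Sketch only; not the evaluation's actual matching algorithm."""
    matches = {}
    for site_name, site in mv_sites.items():
        best = min(
            candidates,
            key=lambda c: math.dist([site[k] for k in covariates],
                                    [c[k] for k in covariates]),
        )
        matches[site_name] = best["name"]
    return matches
```

In practice, matching without replacement, calliper restrictions, or propensity-score distances would change which comparisons are selected.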
There are several possible routes by which outcomes in the comparison villages could have been affected by the project. First, residents might have migrated between the MV sites and comparison villages; however, at endline evaluation, household heads in the MV1s had lived there for almost 10 years (on average), whereas household heads in the comparison villages had lived in the MV sites for less than 1 year (on average; appendix). Second, residents of comparison villages might have accessed project services, particularly at health facilities. Third, comparison villages might have heard about and adopted MVP interventions. Fourth, outcomes in the MV sites could have affected outcomes in comparison villages, eg, through reduced malaria contagion or sharing of HIV knowledge.72,73 Fifth, government spending in the comparison villages might have been affected by the project as a result of MV sites being targeted or deprioritised for investments. Our matching procedure required comparison villages to be at least 10 km from MV sites, which should mitigate all but the fifth of these interference issues.
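A 10 km buffer of this kind can be enforced with great-circle distances between village coordinates. The helper below is a sketch under that assumption (function names and inputs are hypothetical, not from the evaluation's code):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points
    (haversine formula, mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def outside_buffer(candidate, mv_sites, min_km=10.0):
    """True if the candidate village lies at least min_km from every
    MV site; such candidates remain eligible as comparisons."""
    return all(haversine_km(candidate[0], candidate[1], lat, lon) >= min_km
               for lat, lon in mv_sites)
```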
Generalisability and sustainability are difficult to assess, so extrapolating the results to different scales, locations, and time periods should be done with caution. In particular, even if local political buy-in and community ownership were not confounding variables, they could affect the generalisability of the results.74
Our analyses did not take into account spatial correlations (the tendency of closer areas to have more similar outcomes) beyond accounting for clustering into villages and countries.
We did not collect spending data in the comparison villages because of scarce evaluation resources and concerns about the accuracy of data recalled from up to 10 years previously. Knowledge of the difference in spending with and without the project could have enabled a cost-effectiveness analysis.
As in much social science research, neither the intervention recipients nor the evaluation team was masked to project assignment. Likewise, recall bias and respondent fatigue might have affected data quality. Although some data were missing because of non-response (appendix), most variables were almost complete, and response rates were similar in the MV1s and comparison villages (appendix). Results from an available-case analysis were very similar to those from the multiple-imputation analysis.
Our non-factorial design allowed estimation of neither the effects of the component interventions nor their interactions, preventing assessment of the extent of any synergistic effects.75,76 Similarly, we could not separate the effects of project management from those of the intervention activities. This paper does not include a process evaluation of the project's causal pathways.
The MVP did not meet its goal of achieving all of the MDGs, mirroring low attainment of the MDGs across sub-Saharan Africa as a whole.64 Both Africa-wide MDG efforts and half of the MV1s received less donor funding than was recommended in the UN Millennium Project's report.4,77,78
The achievements of the MVP in health lend support to the project's emphasis on strengthening the continuum of care from households, to primary care facilities, and on to tertiary care facilities. In particular, we believe that the project's cadres of paid, professionalised community health workers, empowered with smartphones to aid service delivery and real-time disease monitoring, contributed to the positive results. The project was also an early adopter of interventions and technologies that have since been implemented by development organisations and governments, in part because of the MVP's demonstration and advocacy. These include free mass distribution of insecticide-treated bednets, home-based malaria testing by community health workers using rapid diagnostic tests, mobile health applications for the collection of real-time operational data, and micro-grid solar-powered electrification in rural areas. Although poverty is difficult to define and measure accurately, the project's overall positive impact on household asset ownership is a promising indication that living standards improved.
This impact evaluation was restricted to a cross-sectional, endline comparison with matched villages, using methods specified in the protocol. In future, additional comparisons would be interesting and useful as sensitivity analyses; for example, the MV1s (from data collected by the MVP) could be compared with national rural areas (from DHS data). Rural development initiatives such as the MVP should be viewed as only one component of an integrated national strategy to end extreme poverty. The MVP was not able to address national-scale infrastructure or systems (such as highways, railways, or supply chains) that are crucial for development in rural and urban areas. Nevertheless, this endline evaluation might allow some policy implications to be drawn from the project.