PRODUSE Impact - ME Guide
Photos
© GIZ / Marco Hüls
Layout
creative republic
Thomas Maxeiner Visual Communications
Frankfurt, Germany
www.creativerepublic.net
Language Editor
Sina Mabwa
This paper has also been published as an annex of the study Productive Use of Energy (PRODUSE) - Measuring
Impacts of Electrification on Micro-Enterprises in Sub-Saharan Africa.
If information from this guide or parts of the addenda are used, please cite as: Peters, J., Bensch, G. and Schmidt,
C.M. (2013): Impact Monitoring and Evaluation of Productive Electricity Use – An Implementation Guide for
Project Managers. In: Mayer-Tasch, L., Mukherjee, M., Reiche, K. (eds.), Productive Use of Energy (PRODUSE):
Measuring Impacts of Electrification on Micro-Enterprises in Sub-Saharan Africa. Eschborn.
PRODUSE is a joint initiative of the Energy Sector Management Assistance Program (ESMAP), the Africa
Electrification Initiative (AEI), the EUEI Partnership Dialogue Facility (EUEI PDF) and Deutsche Gesellschaft
für Internationale Zusammenarbeit (GIZ). Further information on www.produse.org.
Productive Use of Energy – PRODUSE
Impact Monitoring and Evaluation
of Productive Electricity Use –
An Implementation Guide for Project Managers
By Jörg Peters, Gunther Bensch and Christoph M. Schmidt56
Table of Contents
1. Introduction
2. Classical M&E vs. Impact M&E
2.1. Outcomes, Impacts and Highly-Aggregated Impacts
2.2. Second Round Effects
3. Developing a Productive Use Impact M&E System
3.1. General strategies to isolate the project’s effect
3.2. Three PUE Impact M&E Modules
4. Step-by-step Towards an Effective PUE Impact M&E System
5. Aid Items
Aid 1. The Results Chain Concept and Demarcation Between Outcomes and Impacts
Aid 2. Strategies to Identify the Counterfactual Situation
Aid 3. List of Indicators
Aid 4. Outline of Terms of Reference for Short-Term Experts
Aid 5. Outline of Inception Report
Addenda
References
1. Introduction
The existing literature on the methodology of impact evaluations targets academic evaluation researchers or
practitioners who are keen to become acquainted with evaluation methods.57 Practitioners who are
more interested in setting up a hands-on monitoring and evaluation (M&E) scheme to obtain robust insights
into the impacts of concrete interventions, however, can hardly be expected to familiarise themselves with
these methodological issues at the level of advanced econometric research.
Intending to close this gap, the guide provides assistance on how to design an impact M&E system or an
impact evaluation study for productive electricity use in electrification interventions (hereafter ‘PUE impact M&E
system’) and is tailored to examine electricity take-up and income generation in small and
micro-enterprises (SMEs).
It targets managers of electrification projects who are particularly interested in monitoring and evaluating
the impacts of electrification on SMEs (hereafter, for simplicity, ‘project managers’).58 It is
equally geared towards researchers or practitioners in charge of the evaluation itself, the ‘researchers’.
Educating this audience on methodological issues is not the focus of this discussion.
Rather, its major aim is to raise awareness of important parameters in the design of a PUE impact M&E system
and to provide project managers with an accessible menu of requisite steps, which is also intended to
encourage the further development of local evaluation capacities. While this guide focuses on evaluating the
impacts of electrification on SMEs, the principal steps of the proposed PUE impact M&E system are
generic and can be transferred to other development projects.
Three different modules are presented, representing the spectrum of potential approaches and their respective
advantages and limitations. Depending on the methodological approach, the PUE impact M&E system
can either be implemented by project staff, or external consultants or researchers with special evaluation
skills have to be contracted. For the case in which such researchers are contracted, the present chapter guides
the project manager on how to effectively steer and backstop the assignment.
In order to stress the demarcation between classical M&E and impact M&E, the guide briefly reviews the different
results of an intervention: outcomes, impacts and highly-aggregated impacts. Classical M&E systems
typically monitor project activities and sometimes outcomes, but not impacts (ADB 2006). Section 2 elaborates
on this and also discusses the problems and pitfalls that one encounters when the impacts of electrification
on SMEs are to be evaluated.
Section 3 first introduces principal strategies to assess the impact of an intervention. Subsequently, three
modules are presented: one simpler module based on a short SME survey (Module a), one module based on an
extended and profound SME survey (Module b) and one module based on anecdotal case studies (Module c).
Modules (a) and (b) deliver data that can be analysed statistically. All modules have been applied during the
PRODUSE study and within other projects.59 A discussion of their respective opportunities and limitations
complements the proposal of the modules.
55) If information from this annex or parts of the addenda are used, please cite as: Peters, J., G. Bensch and C.M. Schmidt (2013) Impact Monitoring
and Evaluation of Productive Electricity Use – An Implementation Guide for Project Managers. In: Mayer-Tasch, L., Mukherjee, M., Reiche,
K. (eds.), Productive Use of Energy (PRODUSE): Measuring Impacts of Electrification on Small and Micro-Enterprises in Sub-Saharan Africa.
56) The authors are grateful for valuable comments by Anna Brüderle, Nadja Kabierski-Chakrabarti, Lucius Mayer-Tasch, Kilian Reiche and
Colin Vance.
57) See Ravallion (2008) and Gertler et al. (2010) for examples of handbooks that comprehensively introduce the methodology of impact
evaluations.
58) The guide is not meant to replace existing handbooks and guides on impact M&E (see above and refer to Addendum 1) or, more
specifically, on evaluation or survey methodology (see for example Ravallion 2008, Iarossi 2007 and Warwick and Lininger 1975).
59) The impact M&E approaches presented here were applied as part of the PRODUSE study in Benin, Ghana and Uganda. Comparable M&E
studies were also implemented in Burkina Faso, Benin, Indonesia, Rwanda, Senegal and Mozambique. In addition to the PRODUSE report,
published reports are Bensch and Peters (2010), Bensch, Peters and Schraml (2010) and Harsdorff and Peters (2010). Methodologically
more elaborated methods are used, for example, in Bensch, Kluve and Peters (2011) or Peters, Vance and Harsdorff (2011).
Section 4 is the core of the guide and presents the process of designing a PUE impact study step by step. Step 1,
‘Getting Started’, pays particular attention to outlining the decision process, Step 2 describes the process of design-
ing the study, Step 3 the survey preparation, Step 4 the implementation of the survey and Step 5 the data analysis
and reporting. To facilitate its practical applicability, an addendum section contains references to further readings
and, most importantly, sample questionnaires that have been used in the PRODUSE study and other evaluations.
2. Classical M&E vs. Impact M&E
2.1. Outcomes, Impacts and Highly-Aggregated Impacts
Any programme implemented in practice aims at making a genuine difference to the state of well-being in
the target population. To this end, the programme directly influences the state of outcome variables that are
intended to trigger intermediate impacts and, eventually, highly-aggregated impacts on income, nutrition or
other variables of fundamental importance. In the case of productive electricity usage, one might consider the
example of a programme that subsidises the extension or densification of the national grid. Here, a results
chain, which in generic terms connects the intervention’s inputs and activities to its outcomes and impacts,
would, for example, consist of the following links: the desired outcome with regard to productive use is that
SMEs get connected. An intermediate impact then is that the firm uses electricity productively, for example by
employing a machine or by extending its operating hours. The next step in the results chain is the effect
that this electricity usage has on the firm’s production process (increased productivity), while the highly-aggregated
impact occurs at the level of the firm owner or the firm’s employees in the form of higher incomes
(see Aid 1 for a simple visualisation of a PUE results chain).
Outcomes are typically clearly attributable to the project’s intervention. Both intermediate and highly-aggregated
impacts, in contrast, might be caused by a combination of different factors. Apart from the project’s intervention,
such other factors may be the firm’s development along its secular growth path, rising or falling demand for the
firm’s products as a result of general economic development or changes in market prices of the firm’s products. In
the project’s results chain, this insight is expressed as the so-called attribution gap between outcomes and im-
pacts. Before attempting any quantification of either of them, the careful enumeration of what the outcomes and
impacts are and what the project could achieve in principle should be the starting point of every evaluation effort.
The results chain also shows the difficulty of an impact evaluation: on the one hand, only the highly-aggregated
impact variables are of ultimate interest when gauging the effectiveness and success of a programme. The
intermediate impact variables, higher profitability, for example, are only means to that end. On the other hand,
the more aggregated the impact indicator is, the more difficult and costly it is to isolate the net effects of the
intervention on the impact indicators. In our example, the net effects are the effects of the electrification project
alone. Gross effects, in contrast, also include influences due to external factors that would have taken
place in the absence of the project as well. Disentangling the electrification impact from these other influences is
much more difficult for highly-aggregated impacts than for intermediate impacts. In other words, the attribution
of causes and effects becomes more difficult.
Therefore, the question of which level of results to monitor and evaluate is a crucial question to be addressed
by the electrification project’s manager. In this spirit, an impact evaluation intends to go beyond the demands
of a classical monitoring system by also investigating the indirect benefits (impacts) of the intervention. A
classical monitoring system, by contrast, is basically restricted to tracking progress of programme implementation
and to the review of achievements of the programme’s intended direct benefits (outcomes). The
present guide provides a pragmatic outline on how to design the implementation of a PUE impact M&E system
that allows both outcomes and impacts to be assessed.
The approaches described in this guide mainly aim at intermediate impacts such as higher profitability or
firm creation. Since, for instance, entrepreneurial activity is a promising avenue to economic development,
these intermediate impacts can be considered as a prerequisite and, thus, as proxies for highly-aggregated
impacts. While there is certainly no guarantee that intermediate impacts will ultimately translate into highly-aggregated
impacts, convincing evidence for the presence of intermediate impacts is an important piece of
information when assessing whether the programme has induced positive, highly-aggregated impacts or
not. Intermediate impacts can, hence, be seen as ‘stepping stones’ in the endeavour to identify the genuine
impact of the intervention on the ultimately meaningful dimensions of people’s well-being.
2.2. Second Round Effects
Even if the net effect of electrification on connected firms (the micro-effect) can be isolated successfully, this
is only one step towards a meaningful assessment of the programme’s impact. In order to obtain the benefi-
cial effect on the local economy as a whole (the macro-effect), one needs to account for so-called second
round effects. The most important second round effect is the crowding out effect. Crowding out effects occur
if the benefit to one enterprise is at the expense of other enterprises. For example, if a small shop attracts
more customers thanks to its new electric light bulbs, other non-connected shops may lose because their old
customers now buy at the connected shop.
In principle, the intervention area as a whole only benefits, i.e. the regional macro-effect is only positive, if (i)
productive electricity usage replaces imported goods with locally produced ones, (ii) goods for export are
produced using electricity or (iii) the total productivity of the local economy increases, for example via increased
usage of mills instead of mortar and pestle, liberating productive capacities for other purposes.
While it is difficult to fully account for such crowding out effects, they have to be kept in mind in both design-
ing a PUE impact M&E system and interpreting its findings. At least an attempt should be made to obtain
indicative evidence for such effects. This could be achieved, for example, by including non-connected SMEs in
the PUE impact M&E system as a control group or by probing qualitatively into the question of where the
customers of newly-connected enterprises are coming from.
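The difference between the micro-effect and the macro-effect can be illustrated with a small calculation. The following is a minimal sketch only: the shop names and revenue figures are invented, `revenue_changes` is a hypothetical helper, and the raw changes are gross changes that have not been cleaned of general trends.

```python
# Illustrative only: invented monthly revenues for three shops in one village.
# Only shop_a gets connected; shop_b and shop_c remain non-connected.
before = {"shop_a": 100.0, "shop_b": 80.0, "shop_c": 60.0}
after = {"shop_a": 130.0, "shop_b": 65.0, "shop_c": 50.0}
connected = {"shop_a"}


def revenue_changes(before, after):
    """Return the revenue change per firm between the two survey rounds."""
    return {firm: after[firm] - before[firm] for firm in before}


changes = revenue_changes(before, after)

# Micro-effect: change observed among connected firms only.
micro_effect = sum(changes[f] for f in changes if f in connected)

# Change among non-connected firms; a negative value hints at crowding out.
non_connected_change = sum(changes[f] for f in changes if f not in connected)

# Macro-effect: change for the local economy as a whole.
macro_effect = micro_effect + non_connected_change

print(f"micro-effect:        {micro_effect:+.1f}")
print(f"non-connected firms: {non_connected_change:+.1f}")
print(f"macro-effect:        {macro_effect:+.1f}")
```

In this invented case most of the connected shop’s gain is mirrored by losses of its competitors, so a survey that covers connected firms only would overstate the benefit to the village.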
Further second round effects are possible. Budget effects, for example, can be detected, if people in a village
spend parts of their limited budget on new products (e.g. photocopies, cold drinks) that were not available
before electrification. As a consequence, they reduce their expenditures on products they used to buy before
electrification. This becomes very evident in the case of expenditures for electricity itself – a typically
‘imported’ good. People no longer buy their candles at the local shop, thereby shifting parts of the added value
in the supply chain out of the region.
3. Developing a Productive Use Impact M&E System
The fundamental decision to be taken by the project manager is whether evidence on impacts beyond the
attribution gap should be provided to donors, partner institutions or the public. If the answer is yes, this guide
shall help to design and implement appropriate impact M&E activities.
3.1. General strategies to isolate the project’s effect
The methodological challenge of any impact evaluation is to isolate the net effects of an intervention and to
attribute changes in indicators causally and specifically to the intervention. For this reason, the evaluation strategy has to
identify the counterfactual situation, i.e. what would have happened to the beneficiaries’ (e.g. connected SMEs’)
relevant outcome variables (e.g. revenue) in the absence of the intervention. Comparing the counterfactual situation
to the factual situation, i.e. what has actually transpired after the intervention, provides a valuable assessment
of the true impact of the project. As a matter of course, however, the counterfactual situation is unobservable:
we can never know for sure what change would have occurred among the beneficiary group if the programme
had not been implemented and the programme impact can at best be estimated in a convincing manner.
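The idea can be written compactly in potential-outcomes notation. This notation is an assumption of this sketch and is not used elsewhere in the guide: $Y_i^{1}$ denotes the outcome of firm $i$ (e.g. revenue) with the intervention, $Y_i^{0}$ the outcome of the same firm without it, and $N$ the number of beneficiary firms.

```latex
\[
  \Delta_i = Y_i^{1} - Y_i^{0},
  \qquad
  \bar{\Delta} = \frac{1}{N} \sum_{i=1}^{N} \bigl( Y_i^{1} - Y_i^{0} \bigr)
\]
```

For a connected firm only $Y_i^{1}$ is observed, so the net effect $\Delta_i$ and its average $\bar{\Delta}$ cannot be computed directly; $Y_i^{0}$ has to be approximated.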
To find such a convincing estimate, we have to plausibly approximate this unobservable counterfactual situ-
ation. In practice, three main so-called identification strategies are available: (i) simple before-after compari-
son (the same firms are interviewed before and after electrification), (ii) cross-sectional comparison (con-
nected and non-connected firms are interviewed at one point in time) and (iii) before-after comparison with
control group (firms are interviewed before electrification, some of which get connected; connected and non-
connected firms are interviewed again after electrification). The three strategies differ in their methodologi-
cal robustness, i.e. the extent to which the evaluation is able to deliver valid and reliable results on the net
effects of the electrification. In general, the most robust approach is the before-after comparison with control
group, although exceptions might exist, for example if no adequate control group is available. An in-depth explanation
of these identification strategies can be found in Aid 2. This includes a discussion of the assumptions
under which each strategy is able to obtain the net effect.
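The three identification strategies can be compared on a toy data set. The sketch below uses invented revenue figures and hypothetical function names; strategy (iii) is computed here as what is commonly called a difference-in-differences estimate, which is one standard way to implement a before-after comparison with control group and rests on the assumption that both groups would have developed in parallel without the project.

```python
from statistics import mean

# Invented revenue figures per firm; "before"/"after" refer to electrification.
connected = {"before": [90, 100, 110], "after": [130, 140, 150]}
non_connected = {"before": [70, 80, 90], "after": [90, 100, 110]}


def before_after(group):
    """(i) Simple before-after comparison for one group of firms."""
    return mean(group["after"]) - mean(group["before"])


def cross_sectional(treated, control):
    """(ii) Connected vs. non-connected firms at one point in time (after)."""
    return mean(treated["after"]) - mean(control["after"])


def before_after_with_control(treated, control):
    """(iii) Change among connected firms minus change among control firms."""
    return before_after(treated) - before_after(control)


estimate_i = before_after(connected)
estimate_ii = cross_sectional(connected, non_connected)
estimate_iii = before_after_with_control(connected, non_connected)

print("(i)   before-after:              ", estimate_i)
print("(ii)  cross-sectional:           ", estimate_ii)
print("(iii) before-after with control: ", estimate_iii)
```

In these invented data the non-connected firms also grow by 20 and were already 20 below the connected firms before electrification. Strategy (i) attributes the general growth to the project, strategy (ii) attributes the pre-existing difference to it, and only strategy (iii) nets out both.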
In general, outcomes or impacts close to the attribution gap can be investigated using simpler approaches,
while the – in view of the results chain – more remote impacts have to be addressed by more sophisticated
ones. The choice of an adequate approach depends on the level of impacts the project manager wants to
obtain credible evidence on (see Table 66 in Section 3.2). Of course, available funds also play an important role,
since (iii) requires much more effort than (i) or (ii).
3.2. Three PUE Impact M&E Modules
In the following, we propose three modules for a PUE impact M&E system that are tailored to measure
impacts in the context of productive electricity use and that have been field-tested in various developing
countries: Module (a), based on a short enterprise survey, Module (b), based on a profound enterprise survey
and Module (c), a case study approach based on semi-structured in-depth interviews.
Since Module (a) and (b) deliver data for quantitative statistical analysis and Module (c) delivers information
that is interpreted qualitatively, the decision between the three proposed modules leads to the discussion
about the pros and cons of so-called qualitative and quantitative research. It is important to highlight that the
terms ‘quantitative’ and ‘qualitative’ do not refer to the nature of elicited information but only to how the col-
lected data is analysed. The major demarcation between Module (a) and (b) on the one hand and Module (c) on
the other is the sample size. In all three modules, both quantitative as well as qualitative questions can be
included in the questionnaire. The advantage of the case study approach in Module (c) is the more
open way in which interviews are conducted. The interviewer can adapt the interview spontaneously if this seems
worthwhile, and the interviewee can more readily deviate from the intended line of questioning.
Owing to the nature of these interviews, a case study approach can only be based on a limited number of inter-
views and, hence, delivers anecdotal insights only. This also leads to the advantage of larger sample size surveys
– as proposed in Module (a) and (b) – which enable the researcher to average across many observations, thereby
benefiting from the law of large numbers. The price of this advantage of generalisation is that the researcher
is constricted to the corset of a structured questionnaire. One remedy is to combine the two general approach-
es, i.e. to complement the larger sample size surveys by selected in-depth interviews (see White 2002).
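The gain from averaging across many observations can be quantified by the standard error of a sample mean, which shrinks with the square root of the sample size. The sketch below is illustrative only: it assumes independent observations from a simple random sample, the standard deviation of 50 is invented, and the sample sizes are merely chosen to lie within the ranges that Table 66 gives for the three modules.

```python
import math


def standard_error(std_dev, n):
    """Standard error of a sample mean for n independent observations."""
    return std_dev / math.sqrt(n)


assumed_std_dev = 50.0  # hypothetical spread of the indicator across firms

# 10 firms (case studies), 75 firms (short survey), 300 firms (profound survey)
for n in (10, 75, 300):
    se = standard_error(assumed_std_dev, n)
    print(f"n = {n:3d}: standard error of the mean = {se:.1f}")
```

Quadrupling the sample size halves the standard error, which is why a handful of case studies cannot deliver more than anecdotal insights on average effects.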
In contrast to the profound survey, the short enterprise survey in Module (a) aims at ‘easy to get and handle’
information (see Table 66) and abstains completely from eliciting more aggregated impacts such as profits or
improvements in market access. The aim of this modesty is to avoid difficult data processing (which includes
dealing with missing values, see Step 5a) and deriving misleading findings on more complex issues that might
result if no sufficient methodological effort is dedicated (e.g. with regards to sample size or advanced statisti-
cal data analysis). Module (a) envisages providing evidence on outcomes and on impacts that are close to the
attribution gap. The module then resorts to plausibility when linking the observed changes in the direct
results and impacts of the intervention to higher impacts. If the survey, for example, shows a considerable
take-up of machinery, one could plausibly assume that this also positively affects productivity and, hence, firm
profits and employee wages. Module (b), the profound survey, by contrast, aims to provide direct evidence for
such effects. A plausible counterfactual situation is established and the impact of electrification on, for exam-
ple, firm profits can be assessed by comparing the electricity-using firm to its counterfactual.
Module (c), the case study approach, is included since SMEs are less homogeneous and less numerous than households,
making a statistical analysis more difficult. For example, only one or two larger firms might exist in a
target region. Including them in a larger sample size survey is not reasonable, since the advantage of larger
sample sizes – taking the average across many observations – cannot be exploited for obvious reasons.
Restricting oneself to the corset of a structured questionnaire is, hence, not necessary. Doing more open and case
study-like interviews is much more sensible in such a case. Another reason for applying the case study approach
is to account for unintended effects or to probe deeper into certain issues than structured questionnaires can do
(e.g. crowding out effects as delineated in Section 2.2). The findings, of course, have to be interpreted against the
backdrop of the non-representative selection. Transferring them to other surroundings or enterprises can only be
done to a limited extent (even more limited than for larger sample size studies). However, such case studies can
definitely help to understand complex processes among beneficiaries and provide anecdotal evidence of electrification
impacts that can, not least, be fed into the design of future larger sample size surveys.
The following Table 66 catalogues the main features of the three modules introduced above – including their
respective advantages and limitations. Of course, the components can be modified for specific reasons and
the different parts of the three modules can be combined. Based on our experience in various projects, we
believe that the modules are a reasonable compilation of M&E activities that are required to yield the described
results and recommend that project managers design their PUE impact M&E system along these lines.
Please note that although Module (b) would be commonly referred to as the ‘rigorous’ way of doing M&E, this
term is purposefully avoided. The reason is that, as White (2002) points out, ‘… the real basis for rigor is the
proper application of techniques. Badly or misleadingly applied, both quantitative and qualitative techniques
give bad or misleading conclusions.’ In this sense, all modules proposed here can and should be applied rigorously.
Table 66: Main features of the three modules

Module (a): Short Enterprise Survey
Module (b): Profound Enterprise Survey
Module (c): Anecdotal Case Study Approach

Main purpose
  Module (a): Providing evidence on impacts close to the project’s direct outcomes that can be assessed with a less extensive survey set-up and without applying advanced statistical data analysis for causal attribution. Relation to ultimate poverty impacts is instead established on a plausibility basis only by results chains.
  Module (b): Providing evidence on the causal relationship between electrification and ultimate development indicators using state-of-the-art evaluation techniques.
  Module (c): Collect anecdotal evidence on electricity usage and its impacts. In particular on issues that can hardly be addressed in structured interviews (e.g. impacts on particular SMEs that do not qualify for the other two modules due to non-comparability).

Identification Strategy (background information: see Aid 2)
  Module (a): Before-after comparison
  Module (b): Cross-sectional, before-after comparison or before-after comparison with control group. The baseline survey in a before-after strategy allows to also obtain profound knowledge about the target region.60
  Module (c): Before-after comparison or retrospective questions (with critical qualitative assessment)

Sampling Method (also see Step 4e, Section 4)
  Module (a): Simple random sampling
  Module (b): Simple random sampling or stratified random sampling
  Module (c): Simple random sampling or non-random sampling of SMEs of particular interest. If combined with one of the other two modules, typical firm types that have shown up during the surveys can be selected.

Sample Size (also see Step 4d, Section 4)
  Module (a): Small sample (50-100 SMEs).
  Module (b): Larger sample (>300 SMEs).
  Module (c): 5-20 selected SMEs.

Covered Indicators (list of indicators: see Aid 3)
  Module (a): Direct results of the intervention. Collected information has to be
    - easy to determine by respondent
    - low sensitivity to formulation of questions
    - unaffected by an auspices bias61
    - easy to quantify and process.
    Additional indicators on project-relevant questions can be added.
  Module (b): All indicators of the short enterprise survey are integrated in this module. In addition, the more detailed questionnaire allows for gathering the more-difficult-to-obtain information e.g. on firm income: detailed questions on sales, raw materials, labour and capital input avoid sensitivity and auspices biases in assessing the firm income. Additional indicators on project-relevant questions can be added.
  Module (c): The interviews should attempt to collect information corresponding to the indicators listed in Aid 3 (including quantifiable business figures). The unique feature of this module are the open-ended questions that provide the opportunity to follow unexpected threads in the interview, e.g. on reasons for connecting or not connecting or market access barriers. Indirect and second round effects may also be brought up, e.g. if the respondent is aware of competitors who have not benefited from the intervention. Additional indicators on project-relevant questions can be added.

Questionnaire
  Module (a): Structured, but short, focused on easy-to-get-information. Interview length around 30 minutes.
  Module (b): Structured, covering all dimensions of firm activity, accounting for seasonality; decisive variables such as employment or firm profits are addressed in more detail and in multiple ways in order to allow cross-checking. Interview length around 60 minutes.
  Module (c): Open: interview guideline should be pursued while leaving space for spontaneous, discursive deviations in directions indicated by the respondent. Interview length 30-120 minutes.

Sample Questionnaires

Information Processing
  Module (a): Simple data analysis with Excel, a sample data entry sheet in Addendum 5.
  Module (b): Statistically advanced data analysis using statistical software (SPSS, STATA, etc.).
  Module (c): Systematic analysis of interview notes along the lines of the guiding questions underlying the qualitative exercise (see for example the ‘PRODUSE Guidelines for Qualitative Interviews’, Addendum 4).

Implementation
  Module (a): Can be implemented by own project staff, interns or consultants without particular skills in evaluation methods or statistics; supervision by experienced evaluation researchers is recommendable.
  Module (b): Profound skills and experience required in all stages, i.e. survey design and implementation as well as data analysis; some background in development (and electrification) projects and knowledge of the respective country recommendable; for data collection, backstopping of experienced local enumerators by methodologically skilled researchers
  Module (c): Should be implemented by or under close supervision of lead researcher; recommendable to hire consultants familiar with (qualitative) evaluations.

60) It can be particularly interesting from the project’s perspective to include an already electrified control region. This allows the project to
gain insights about the behaviour of the rural population/enterprises after electrification (see PRODUSE Chapter 3 and Peters (2009) for
methodological details of this approach).
61) Auspices bias refers to the frequently observed tendency of an interviewee to give a response the enumerator (does not) like(s) to hear.
For example, an entrepreneur in a connected firm might answer more positively in an electrification project’s impact survey, because s/he
is thankful for the project. Likewise, s/he might give biased answers because s/he expects additional support from the project.
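The two random sampling methods named in Table 66 can be sketched as follows. This is an illustration under stated assumptions, not a procedure prescribed by the guide: the sampling frame of 100 SMEs, the three sectors used as strata, the proportional allocation and both function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical sampling frame of 100 SMEs as (firm_id, sector) pairs.
frame = (
    [(i, "trade") for i in range(0, 60)]
    + [(i, "manufacturing") for i in range(60, 90)]
    + [(i, "services") for i in range(90, 100)]
)


def simple_random_sample(frame, n, rng):
    """Draw n firms so that every firm has the same chance of selection."""
    return rng.sample(frame, n)


def stratified_random_sample(frame, n, rng):
    """Draw firms separately per sector, proportional to each sector's size.

    Rounding may make the total differ slightly from n for other inputs.
    """
    strata = {}
    for firm in frame:
        strata.setdefault(firm[1], []).append(firm)
    sample = []
    for firms in strata.values():
        share = round(n * len(firms) / len(frame))
        sample.extend(rng.sample(firms, share))
    return sample


rng = random.Random(42)  # fixed seed so that the draw is reproducible
srs = simple_random_sample(frame, 20, rng)
stratified = stratified_random_sample(frame, 20, rng)

print("simple random:", Counter(sector for _, sector in srs))
print("stratified:   ", Counter(sector for _, sector in stratified))
```

With simple random sampling the sector composition of the sample varies from draw to draw, whereas the stratified draw reproduces the sector shares of the frame exactly, which guarantees that small groups such as the ten service firms are represented.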
4. Step-by-step Towards an Effective PUE Impact M&E System
The project manager might scrutinise the demands of the project, choose an appropriate identification
approach (see Section 3.1) and apply it using one of the three modules (see Section 3.2). But what is the best
sequence of making these choices and which are the questions to be addressed systematically in this process?
This section presents step-by-step guidance for designing a PUE impact M&E system, suggesting
which stakeholders should be involved at which stages of the process. Steps 1 and 2 have to be carried out by
the project manager or at least require his or her close involvement. Steps 3 to 5 are mostly the responsibility
of the project staff members or of the external researchers to whom the implementation of the PUE impact
M&E system is assigned. In order to complement the guidance and information provided here and for further
readings, one may consult the M&E guides referred to in Addendum 1.
Step 1: Getting Started
Before thinking about the concrete design of the PUE impact M&E system in Step 2, the project manager
should address the following basic considerations.
Do the additional benefits of a PUE impact M&E system compared to a classical M&E system justify the
additional costs from the project’s perspective? If yes, continue with Step 1b.
The intention of conducting impact M&E should be communicated to all other project stakeholders including
local partner institutions on both the political and implementation level (e.g. utilities, ministry). They should
be included in the design process, if possible.
The project’s results chain is the conceptual framework of the PUE impact M&E system. If no results chain with
regard to productive electricity use exists, it has to be drafted by the project management in order to get a clear
picture of which transmission channels from inputs to impacts exist. Accordingly, the results chain helps to
determine appropriate outcome and impact indicators. Even if a results chain has already been established, a
review is recommended at the time the PUE impact M&E system is designed, not least since adaptations in the
project design might have occurred in the meantime. A stylised model results chain is provided in Aid 1.62
Step 2: Designing
The second step is then to design the PUE impact M&E system. This comprises the following sub-steps:
• determination of the objectives of the PUE impact M&E system (Step 2a)
• decision on the impact indicators (Step 2b)
• choice of the appropriate module (Step 2c)
• selection of staff members or external researchers to implement the PUE impact M&E system (Step 2d)
• adaptation of selected module to project needs and particularities (Step 2e).
As depicted in Figure 9, decisions on a certain sub-step may have repercussions on previous sub-steps. For
example, if it is decided in Step 2d to hire an external researcher, s/he should review the previous steps, including
the indicators to be examined. Likewise, the decision on which module to apply (Step 2c) can also affect the
selection of indicators (Step 2b).
62) Note that in reality a results chain is much more complex in most cases. The purpose of the results chain presented here is only to
illustrate the idea of a theory of change underlying the project and its importance to the impact M&E system.
Figure 9: Steps in the Design of the PUE Impact M&E Approach
[Diagram: Step 2A (objectives), Step 2B (indicators), Step 2C (module) and Step 2D (evaluator), followed by
Step 3 (survey preparation). Legend: arrows for the proposed procedure and for potential repercussions.]
Source: own illustration
Step 2a: Determination of the Objective and Scope of the PUE Impact M&E System
The first step in designing a project-specific PUE impact M&E system is to agree on its objective. The crucial
point here is the scope, i.e. which parts of the results chain are to be covered. Does the project want
to monitor or assess connected firms and the usage of electricity only or also higher impacts like profits or
employment? At this point, the principal research questions have to be formulated.
The objective may be subject to change when deciding on the characteristics of the M&E scheme as indicated
in Figure 9 above. For example, this can be the case if budgetary restrictions turn out to impede the implemen-
tation of a more sophisticated method (Step 2c) or if indicators considered as indispensable in Step 2b make it
necessary to reconsider the objectives of the PUE impact M&E system.
Indicators are direct and unambiguous measures of progress toward the intended goals of a project. Indicators
for the evaluation of impacts on productive electricity use range from simply counting the number of connected
firms and the appliances they use to the change in their profits, the number of employed workers and the
wages they earn. A systematic catalogue of indicators is given in Aid 3. Based on these indicators, concrete
questions are to be formulated for the questionnaire. The choice of indicators has clear implications for the module
to be chosen in Step 2c (see also Section 3.2). For example, the indicator item ‘used appliances’ can be checked
with less effort, i.e. with Module (a), than ‘firm profits’ (for which Module (b) is required). Accordingly, the list of
indicators in Aid 3 contains a recommendation for each indicator as to which module is required to measure it.
Projects might want to include additional indicators to account for particularities in their project setup. In this
case, GTZ (2007) delineates aspects to be considered when constructing project-specific indicators. Such
guidelines are important to follow in order to attain a priori neutral indicators that reliably record the degree
of progress in the achievement of the proposed results. M&EED Group (2006) lists a range of potential indica-
tors applicable to productive use of electricity. Potential impacts that have not been intended by the project
– be they positive or negative – should also be considered and captured with appropriate indicators. For all
chosen indicators, it should be checked at this stage whether relevant data can be obtained from other sourc-
es. This includes official statistics but also baseline data from other projects or the project itself.
Most indicators require interviews with firm owners. Some impact indicators may necessitate further
interviewees, for example in order to obtain the perception of employees on the impact of electricity on their
working environment. Such research questions, however, are best included in complementary qualitative
interviews conducted in Module (c). Other examples are the impact on the community as a whole, on the local
environment or impacts related to the choice of the electricity source.
Mini-grids fed by diesel generators, for example, may result in high long-term costs and dependency on external
suppliers, whereas micro-hydro projects may interfere with the local water provision of households and farmers.
One of the three modules proposed in Section 3.2 must be selected: the Short Enterprise Survey, the Profound
Enterprise Survey or the Anecdotal Case Study Approach. The module decision should be based on a comparison
of the advantages and limitations of each module (see Table 66) with the objectives of the evaluation (see Step
2a) and the available budget. Modifications of the selected module can be carried out in line with particular
needs of the project. An extensive calibration should be done by the staff member(s) or consultant(s) to whom
the implementation of the PUE impact M&E system is assigned as we explain in the following Step 2d.
Step 2d: Assign the Implementation to Qualified Staff Members or External Experts
The different modules require different levels of skills and resources. The module presentation in Section 3.2
indicates the methodological know-how and time required to implement each
module. For Module (b), the hired researchers have to meet the following requirements: experience with
impact evaluations, statistical skills, experience with development projects and, if possible, electrification
projects. If it is intended to apply econometric methods during the data analysis, the researcher should be
familiar with statistics and econometrics, ideally documented by a list of publications in academic
peer-reviewed journals in the fields of impact evaluation and applied econometrics.
Survey preparation varies substantially between the different modules. For Module (a), the sub-steps of this
task mostly do not apply, since its features are already pre-defined, e.g. the before-after approach is the only
recommended identification strategy (Step 3a) and no control regions are to be included (Step 3e). Modules (b)
and (c), in contrast, require considerably more effort with regard to both desk and field work (Steps 3e to 3h). The
field work implies a mission of the researchers to visit the target and potential control regions but also to
meet the project staff (in particular if the researchers are international experts), to finalise the methodology
and to train the survey team.
As described in Section 3.1, there are different ways of identifying the impacts of electrification. An appropriate
comparison, the so-called counterfactual situation, has to be established. If the PUE impact M&E system is set
up at the beginning of the electrification project, in principle all strategies are possible. If it is decided after the
project has electrified the target regions that impacts should be examined, the cross-sectional approach is
the only possible one. Methodologically, the before-after comparison with a control group is the most robust
approach – but, as a matter of course, it also requires more resources, since two surveys have to be done (be-
fore and after) in two regions (project’s target region plus control region). Without special methodological and
statistical skills, the cross-sectional comparison is the most difficult one for Module (a) and (b). Hence, as a
general rule it is recommended to set up the PUE impact M&E system before the first regions are electrified
and make either a simple before-after comparison or the extended version including a control group. If the
cross-sectional approach is chosen (e.g. because it is too late to establish a baseline), it has to be done by
experienced evaluation researchers. See Aid 2 for a more detailed description of the three approaches.
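The before-after comparison with a control group comes down to simple arithmetic, which can be sketched as follows in Python. The function name and all figures are illustrative assumptions, not data from any PRODUSE study; a real analysis would additionally test for statistical significance.

```python
from statistics import mean

def difference_in_differences(target_before, target_after,
                              control_before, control_after):
    """Change in the target region minus change in the control region."""
    change_target = mean(target_after) - mean(target_before)
    change_control = mean(control_after) - mean(control_before)
    return change_target - change_control

# Hypothetical monthly profits per firm (arbitrary currency unit).
target_before = [100, 120, 80]
target_after = [150, 160, 110]
control_before = [90, 110, 100]
control_after = [100, 120, 110]

# Profits rise by 40 in the target region and by 10 in the control region,
# so the estimated impact of electrification is 30.
print(difference_in_differences(target_before, target_after,
                                control_before, control_after))
```

A simple before-after comparison would attribute the full increase of 40 to the project; the control group removes the part of the change that would have occurred anyway.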
Step 3b: Submitting an Inception Report
An inception report should be drafted by the researchers to outline briefly the intended procedure at the out-
set of the assignment. It provides an opportunity for the project staff to get acquainted with the intended
approach and to intervene if deemed necessary. The submission of an inception report is, hence, particularly
recommended in case the researcher is an external person or entity but can also be a valuable preparatory
instrument for in-house discussions.
This inception report is best structured as follows: (i) project description, (ii) methodology and (iii)
implementation. The first section should present basic information on the electrification project including its
results chain. The second section should first explain briefly the selection of modules. In a second step,
adaptations to the chosen module(s) can be illustrated. The purpose of the third section is to present an outline of the
data collection and analysis process supplemented by a time schedule. This should also include the envisaged
sample size and sampling method, if possible already specifying the different SME types to be interviewed.
Based on the proposed approach outlined in Step 3b, a questionnaire has to be developed that covers the
requirements determined in steps 2a and 2b and that corresponds to the approach chosen in 2c. Of course, the
questionnaire for Module (a) is much shorter than the one for Module (b). In all cases, the questionnaire should
be well organised and furnished with complementary annotations for the enumerator, where necessary.
For Module (c), the questionnaire is more an interview guideline delineating the aspects that should be ad-
dressed during the interview in spite of its principal openness. Sample questionnaires for Module (a) and (b)
are provided in and Addendum 3. For Module (c), the PRODUSE guidelines for qualitative interviews that have
been developed for the Uganda case study (see Addendum 4) contain a list of guiding questions; of course,
this list cannot always be transferred one-to-one and needs to be adjusted to the particular case, since the
research objective may deviate in other countries and projects.
At least for modules (a) and (b), pre-testing the questionnaire with 5-20 interviews is imperative to scrutinise
the formulation of questions (the interviews for Module (c) are more conversational so that questions do not
need to be as accurate). It is most suitable to do this pre-test with the already selected and trained enumera-
tors (Step 3h). At the same time, the pre-test can serve as a training component for the enumerators. It is also
highly recommendable for the researchers to do field trips to the target region and some focus group discus-
sions with target group representatives to check the appropriateness and completeness of the questionnaire.
For Module (a) the team may even consist of project staff only. Additionally, interns or consultants can be
hired. By contrast, Module (b) requires one or two teams of around four enumerators and one field supervisor,
depending on the sample size and availability of time and means of transport, of course. As a rule of thumb,
one can expect 4 and 6 interviews per enumerator per day for Module (b) and (a), respectively. Interviews for
Module (c) should be conducted by the hired researchers themselves, supported by local consultants familiar
with the situation and social customs in the target region.
Information that allows assessing the comparability of potential control regions and the target region of the
electrification project should already be collected as part of the preparatory desk work. In addition, a field trip
to the target areas of the intervention is generally indispensable. While the comparability of villages can best
be assessed on the ground by visual inspection, the following list of criteria can provide some guidance:
• level of economic activity
• distance to the capital and/or regional centres
• population size
• main source of income (agricultural and non-agricultural products)
• road accessibility (distance to asphalt roads, accessibility by cars and/or trucks)
• transit traffic
• existence of a regular market in the village
• political relevance
• availability of other services (such as vocational training or microfinance)
• presence of other development projects.
Talking to local key informants such as village chiefs, teachers or NGO representatives can help to get a better
picture of the villages that are considered to be included.
The determination of the sample size for Module (a) or (b), in principle, is based on statistical considerations.
However, a statistically accurate determination of the required sample size, commonly referred to as power
analysis, will not be possible in most cases. This statistically appropriate sample size mainly depends on the
specific impact indicators (e.g. firm profits or employment, usage of electric lighting) and the extent to which
they are expected to change due to electrification: the smaller the expected change, the larger the sample size
that is required to derive robust and clear interpretations from statistical results. Put differently, if one finds
statistically significant evidence for an impact of electrification on, for example, firm profits, there is little
reason to worry about whether the sample was large enough. The problem is rather whether to interpret a no-effect
result as genuine evidence that the intervention had no effect or as a reflection of an insufficient sample size, given
the setup of statistical significance tests: the sample may simply be too small
to detect a positive impact. The objective of a power analysis is precisely to avoid such inconclusiveness.
See, for example, Magnani (1997) for an accessible presentation of power analysis (see footnote 63). Among the
parameters required to determine the sample size are the following (with + or - indicating whether the parameter
increases or decreases the required sample size):
a) the number of firms in the target population [+]
b) the heterogeneity of firms in the target region [+]
c) the expected magnitude of the intervention’s impact (e.g. 20 % higher profits for connected SMEs in
comparison to comparable non-connected ones) [-]
d) the desired degree of confidence that an observed change would not have occurred by chance (the
level of statistical significance) [+]
e) the desired degree of confidence that an actual change of the magnitude specified above will be
detected (statistical power) [+] (see footnote 64).
Only d) and e) are at the discretion of the researcher. To gauge the concrete realisation of all other parameters
will be difficult in most cases, however. Nevertheless, a rough power calculation conducted with approximate
values will indicate how the required sample size changes if, for example, firm profits are taken as an impact
indicator compared to lighting hours usage (see Bloom 1995 for more details on sensitivity tests).
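A rough power calculation of this kind can be sketched in a few lines of Python. The sketch assumes the textbook normal-approximation formula for comparing the means of two equally sized groups; it ignores the size of the target population and any clustering of the sample, and the function name and input values are purely illustrative.

```python
import math
from statistics import NormalDist

def sample_size_per_group(expected_diff, std_dev, alpha=0.05, power=0.80):
    """Approximate firms needed per group to detect a difference in means.

    expected_diff: expected impact (parameter c), in units of the indicator
    std_dev:       spread of the indicator across firms (parameter b)
    alpha:         significance level (parameter d), two-sided test
    power:         statistical power (parameter e)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * std_dev / expected_diff) ** 2
    return math.ceil(n)

# Hypothetical values: profits vary with a standard deviation of 50.
print(sample_size_per_group(expected_diff=20, std_dev=50))  # 99 firms per group
print(sample_size_per_group(expected_diff=10, std_dev=50))  # 393 firms per group
```

Halving the expected impact roughly quadruples the required sample, which illustrates why indicators with small expected changes are costly to evaluate.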
As a pragmatic alternative to power analysis, one might resort to rules of thumb: the purpose of any (quanti-
tative) evaluation study is to compare samples of firms with each other, for example connected to non-con-
nected firms or firms before electrification to the same firms after electrification. In order to allow for statisti-
cal analysis, as a rule of thumb, the sample size per subgroup must not fall below 30 firms, e.g. 30 connected
63) As a matter of course, the presentation can only be superficial at this point. For further readings on the power of surveys see also Cohen
(1988).
64) For indicators expressed as proportions (e.g. share of energy expenditures in total SME expenditures before the intervention) the initial
or baseline level of the indicator additionally affects the required sample size.
and 30 non-connected firms. However, the number of relevant subgroups increases with the set of firm char-
acteristics to be taken into account. For example, if the analysis furthermore distinguishes between com-
merce and manufacturing firms, the required sample size already increases to 120. Assuming that more firm
categories have to be accounted for (regional differences, firms sizes, industries, etc.) a sample size of 200-500
seems reasonable and allows for the application of many statistical tools. At least for Module (b) considerations
on this rule of thumb and the subgroups to account for should be provided in the inception report (see Step 3b).
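The rule of thumb amounts to multiplying the number of subgroups by 30. A minimal Python sketch, in which the function name is an illustrative assumption:

```python
from math import prod

MIN_FIRMS_PER_SUBGROUP = 30  # rule of thumb stated in the text

def minimum_sample_size(categories_per_characteristic,
                        per_subgroup=MIN_FIRMS_PER_SUBGROUP):
    """Subgroups are all combinations of the firm characteristics considered."""
    return prod(categories_per_characteristic) * per_subgroup

# Connected vs non-connected firms only.
print(minimum_sample_size([2]))     # 60
# Additionally distinguishing commerce from manufacturing firms.
print(minimum_sample_size([2, 2]))  # 120
```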
For Module (c), the number of interviewed firms can be determined according to the budget. Here as well,
certain differences between firms that can be important for answering the research questions have to be
taken into account.
For example, one might be interested in the (non-)use of electricity and its impacts on service firms supplying
non-tradable goods and firms that are producing exportable goods as well as those producing non-exportable
goods (exportable in this context refers to trade with regions beyond the intervention zone of the electrification
intervention). In this case at least 1-2 representatives of each subgroup – further distinguished according to
their connection status – should be visited.
The purpose of sampling is to select firms for interviews from the totality of firms in the target region (and
potentially in a control region) in a way that is governed by chance, not by the researcher’s or enumerator’s
choice (probability sampling). The resulting randomness of sample selection is crucial for guaranteeing the
representativeness of the collected data. Module (c) is an exception by allowing as well for purposive sampling
of firms according to specific demands or ex-ante expectations. These expectations depend on the project
setup and the target region. For example, one might expect special insights on impacts in export-oriented
firms. For the qualitative part of the Uganda case study on electricity usage in two export-oriented fishing
communities at Lake Victoria (see Addendum 4), to take another example, three groups were identified
beforehand: voluntary non-users, ‘non-performers’ that get connected but do not seem to benefit from the
connection and ‘winners’ that get connected and seem to be able to improve their performance. The type of
firms on which Module (c) should be targeted has to be elaborated on before the survey. This should also be
addressed in the inception report (Step 3b). Yet, in a case in which Module (c) is combined with another mod-
ule, the researchers can decide that firms to be interviewed qualitatively are selected after the survey accord-
ing to, for example, stylised firm types determined during the surveys.
For Modules (a) and (b) some form of probability sampling has to be applied. In the ideal situation, the
researchers have a comprehensive enumeration of all firms in the target area to draw a random sample from. In most
cases, however, such a list will not be available, only a list of villages to be electrified. Often, more than a dozen
villages are electrified, so that surveying all of them is hardly an option from a logistical and budgetary point
of view. The first step of sampling is therefore to select a subset of villages (see footnote 65). A random selection where the
probability that a village is selected is directly linked to its population size is advisable (see e.g. Iarossi 2007 for
details). In particular for Module (a) the researcher might simply pick a subset of villages from the target
region – either by chance or based on certain ad-hoc representativeness considerations. For example, one
could choose a certain number of villages from each of the (sub-)regions the project intervenes in.
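One common way of linking the selection probability to population size is systematic sampling along the cumulated population, sketched below in Python. This is one possible procedure, not necessarily the one described by Iarossi (2007); village names, population figures and the function name are hypothetical.

```python
import random

def select_villages_pps(village_populations, n_villages, rng=random):
    """Systematic selection with probability proportional to population size.

    Note: a village whose population exceeds the sampling interval can be
    drawn more than once; this sketch does not treat that case specially.
    """
    total = sum(village_populations.values())
    interval = total / n_villages
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + i * interval for i in range(n_villages)]

    selected, cumulative, next_target = [], 0, 0
    for village, population in village_populations.items():
        cumulative += population
        # Pick the village whenever a target falls inside its population range.
        while next_target < n_villages and targets[next_target] < cumulative:
            selected.append(village)
            next_target += 1
    return selected

# Hypothetical list of villages to be electrified, with population sizes.
villages = {"A": 500, "B": 1200, "C": 800, "D": 300, "E": 1500, "F": 700}
print(select_villages_pps(villages, 3, random.Random(42)))
```

Passing a seeded `random.Random` instance makes the draw reproducible, which is useful for documenting the selection in the inception report.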
Per village, a certain number of firms then has to be selected – depending on the total sample size defined in
Step 3f. The most pragmatic approach is simple random sampling (within the villages): if a list of firms exists,
the field supervisor simply draws randomly the required number in the respective village. If no such list exists,
the field supervisor assigns the enumerators to different parts of the village, where the number of firms can
normally be obtained from some key informant. Since SMEs in rural parts of developing countries are often
not recognisable as such, the key informant should furthermore be consulted about the location of the
65) In contrast to the ideal situation mentioned above, this is referred to as clustered random sampling. Because observations from
one cluster do not differ as much as observations from different clusters do, one needs a larger sample size to capture the variation
between firms. The choice of the sampling scheme therefore has repercussions for the sample size determination (see 4d and Warwick
and Lininger 1975).
individual enterprises. The first firm to be interviewed is picked by chance by the field supervisor or the enu-
merator. Afterwards, the enumerator visits every nth firm along a predefined route – with the n depending on
the required sample size and on the number of firms that exist in the respective part of the village.
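The every-nth-firm rule can be sketched as follows in Python. It assumes that the firms along the predefined route can be listed or counted in walking order; the function name and the example figures are illustrative.

```python
import random

def every_nth_firm(firms_on_route, sample_size, rng=random):
    """Random starting firm, then every nth firm along the route."""
    n = max(1, len(firms_on_route) // sample_size)  # sampling interval
    start = rng.randrange(n)                        # first firm picked by chance
    return firms_on_route[start::n][:sample_size]

# Hypothetical part of a village with 40 firms, of which 8 are to be interviewed.
route = [f"firm {i}" for i in range(1, 41)]
print(every_nth_firm(route, 8, random.Random(7)))
```

With 40 firms and 8 interviews the interval is n = 5, so the enumerator interviews every fifth firm after the randomly chosen first one.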
In brief, as long as the interviewed firms are selected randomly, basic representativeness can be expected.
Further structural sampling errors that occur in many settings can be avoided if the field research team con-
forms to the following two principles: (i) cover the whole intervention area, especially in terms of centrally and
remotely located firms and (ii) do not skip absent firms but revisit them later. Otherwise a certain part of the
local economy (e.g. shops that only open in the evening hours) may be excluded from the sample.
In case of the profound enterprise survey, i.e. Module (b), the hired consultants might consider other, more
elaborate forms of sampling, for example stratified random sampling. Here, firms are grouped into ‘strata’
beforehand. Stylised firm types such as ‘manufacturing’ and ‘services’ are one example of strata. Geography is
another logical choice for stratification, because location is likely to be correlated with a number of other
variables that are of relevance for the evaluation. For example, for a baseline study the enterprises in a village
can be stratified into ‘village centre firms’ and ‘more remote firms’. If information on the outline of the upcoming
grid is available, this may as well be used to stratify enterprises into firms located closer to the upcoming
grid and those located further away. Stratified sampling ensures that the two groups are adequately represented
in the sample to be drawn and not, due to chance, underrepresented. If, for example, two in three
firms in an intervention area are located in the village centre, two in three interviewed firms should be located
there. For this approach it is necessary to know beforehand for each of the different ‘strata’ the number of
SMEs it contains.
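The proportional allocation described above can be sketched in Python as follows. The stratum sizes and the function name are hypothetical.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split the sample across strata in proportion to the number of SMEs.

    Rounding can make the parts add up to slightly more or less than
    total_sample; adjust by hand in that case.
    """
    total_firms = sum(stratum_sizes.values())
    return {name: round(total_sample * size / total_firms)
            for name, size in stratum_sizes.items()}

# Two in three firms are located in the village centre.
strata = {"village centre firms": 80, "more remote firms": 40}
print(proportional_allocation(strata, 30))
# {'village centre firms': 20, 'more remote firms': 10}
```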
Another option is to purposefully oversample firms that are more likely to connect in the future in order to
ensure that sufficient information is obtained about them. This option is particularly relevant if the research-
er worries about the risk of a low electrification rate among SMEs in general or among SMEs of a specific firm
type of interest. In our example, one might expect that the village centre firms are closer to the future power
lines and therefore more likely to connect to the future grid. In the case of oversampling it is important to use
weights during data analysis in order to reconstitute representativeness. Details on the implementation of
the different sampling approaches and additional methods can be found in the standard literature on survey
methodology (see, for example, Iarossi 2007, Magnani 1997 or Warwick and Lininger 1975). Apart from simple
random sampling, all sampling approaches should be implemented by methodologically skilled researchers.
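How weights reconstitute representativeness after oversampling can be illustrated with a small Python sketch. It assumes the simplest case, in which each stratum's weight is the number of firms in the population divided by the number of firms sampled; all names and figures are hypothetical.

```python
def design_weights(population_counts, sample_counts):
    """Each sampled firm stands for population / sample firms of its stratum."""
    return {s: population_counts[s] / sample_counts[s] for s in sample_counts}

def weighted_mean(values_by_stratum, weights):
    numerator = sum(weights[s] * v
                    for s, values in values_by_stratum.items() for v in values)
    denominator = sum(weights[s] * len(values)
                      for s, values in values_by_stratum.items())
    return numerator / denominator

# Village centre firms are oversampled: 20 of 40, against 10 of 80 remote firms.
population = {"centre": 40, "remote": 80}
profits = {"centre": [200] * 20, "remote": [100] * 10}
sample = {s: len(v) for s, v in profits.items()}

weights = design_weights(population, sample)  # centre: 2.0, remote: 8.0
print(weighted_mean(profits, weights))        # close to 133.3
```

The unweighted sample mean would be about 166.7, which overstates average profits because the better-performing centre firms are overrepresented in the sample.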
Interviews for Module (c) are conducted by the researchers themselves. The field work team for Module (a) can
consist of project staff only. Additionally, interns or consultants can be hired. If enumerators or consultants are
hired, they can be trained in a few hours to do the interviews, depending on the complexity of the questionnaire.
Module (b) requires a team of around four enumerators and one field supervisor, depending on the sample
size and availability of means of transport. As a rule of thumb, an enumerator can conduct four interviews per
day. They have to be trained and backstopped by a methodologically skilled researcher. During the training,
the enumerators and the field supervisor have to become acquainted with the objective of the study and the
meaning and purpose of each question. Furthermore, the enumerators have to be taught how to deal with
non-responses, to pay attention to consistency problems and to report complementary qualitative informa-
tion in written comments or verbally to the field work supervisor. The training takes around 1.5 days in the
‘classroom’ and should be interactive, e.g. by means of role plays of interviews.
The training can be combined with a pre-test of the questionnaire, which is in this case conducted by the
freshly trained enumerators under supervision of the field supervisor and the researcher. It is recommendable
to contract the same enumerators for data entry afterwards. Data entry should also be taught during the
training course. Pre-test and data entry training take another 1.5-2 days.
In many regions, the employees of the SMEs to be interviewed will not speak English, French or Portuguese, so
that some form of translation has to be applied. Whether the questionnaire itself is translated or enumerators
translate the questions in an ad-hoc manner depends on the particular region and the local language
that is spoken (see footnote 66). This should be discussed with the local survey partners familiar with the languages that are
spoken in the survey region.
Step 4: Implementation
In particular for Modules (a) and (b), thorough logistical planning is a precondition for a successful
implementation of the survey. Transport to and within the target region has to be ensured. For Module (a), one
enumerator can do 6-7 interviews per day. The longer questionnaire in Module (b) normally makes it difficult to do
more than 4 or 5 questionnaires per day. As a matter of course, in both cases this depends on the distance
from the base camp to the survey village on the respective day and on the distance between the SMEs to be
interviewed, which may be located in more than one village.
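For planning purposes, these daily interview rates translate into a rough number of field days. The Python sketch below deliberately ignores travel days, revisits and training; the function name and the sample sizes are illustrative assumptions.

```python
import math

def field_days(sample_size, enumerators, interviews_per_day):
    """Interview days needed when all enumerators work in parallel."""
    return math.ceil(sample_size / (enumerators * interviews_per_day))

# Module (b): 300 firms, four enumerators, 4 interviews each per day.
print(field_days(300, 4, 4))  # 19
# Module (a): 60 firms, two enumerators, 6 interviews each per day.
print(field_days(60, 2, 6))   # 5
```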
The sampling strategy determined in steps 3f and 3g has to be implemented in each village. In Module (b) this
has to be done by the field supervisor, who assigns the enumerators to different parts of the village. The enu-
merators should make sure that the interviewees are the actual owners with full insights into their firm’s
operation – if necessary through an appointment or revisiting the firm later. In addition, it is recommended to
do a short village level interview with, for example, the village chief to obtain an assessment of the local busi-
ness environment, market access and most important barriers, reliability of the electricity grid and general
income sources. After the first interviews have been completed, the questionnaires should be checked by
the field supervisor for consistency and completeness. Potential problems and respective solutions can be
discussed with the enumerators.
For Module (c) the interview length depends on the issues to be discussed with the respondent. But even if the
number of questions is known, the duration is less predictable than for structured questionnaires, as sponta-
neous deviations from the interview guideline are possible and even desired. If enterprises state that positive
or negative impacts of electrification of whatever sort exist, the researcher should – on the spot – check for
other potential sources of this impact. For example, the interviewee can be simply asked if other explanations
are possible for why her/his situation has improved, e.g. if the firm benefits from other development projects
(in general, the comparability criteria mentioned in Step 3e represent a useful starting point when trying to
elicit potential triggers of change).
It seems reasonable to take two hours as the maximum duration for the qualitative interview to avoid over-
burdening the enterprise. In this case, it might also be considered to give an in-kind remuneration to the
respondent to compensate for her/his loss of time. In addition, the interview might be divided and spread over
the day. Thereby, the interviewer also has the occasion to observe the business at different times of the day.
For Module (a) and (b) the entry of the collected data is a highly important step. If a proper digitalisation of the
questionnaire information is not assured, even the best collected data will not be useful. Therefore, much
effort has to be put into preparing an easy-to-use and trouble-free data entry template that helps to avoid
data entry mistakes from the outset. In the same way, the training of staff to enter the data (preferably, this is
done by the enumerators themselves, see Step 3h) and backstopping the data entry (which can be done by the
field supervisor) including quality assurance are of particular importance. The best way is to supervise the
entry of the first 3-4 waves of questionnaires directly and check afterwards for each questionnaire whether
the data is entered correctly. Once the data entry staff seems to work independently, picking just a sample of
questionnaires for quality control is sufficient.
66) To give an example, while Wolof is a widely spoken language in Senegal, even well-educated people are often unable to read it. Hence,
enumerators prefer to translate on the spot from French into Wolof. In Rwanda, in contrast, Kinyarwanda is also widely used as a written
language. Therefore, enumerators prefer to work with translated questionnaires. In Benin, many different languages are spoken within
one region, so that enumerators adapt on-site to the language the interviewed firm (or household) speaks. Translating the French
questionnaire into one local language would not make sense.
A code sheet for additional response categories or open questions has to be provided to the data entry staff
(ideally after the first 3-4 waves of questionnaires have produced the most common answers) to avoid time-consuming
ex-post recoding and ensure uniform usage of codes. The data can be entered in an Excel spreadsheet
and easily transferred to other statistical packages for data analysis afterwards. A sample data entry
sheet is provided in Addendum 5.
For Module (c), the data can only be processed to the extent that it is quantifiable. Depending on the number
of interviewed firms, this is not always necessary. For the main body of collected information one might rath-
er speak of ‘digesting’ the interviews. How this is implemented depends on whether the interviews have been
done by the principal researcher or by someone else. In the latter case, a systematic way of reporting the
information has to be developed. This digestion step bears the risk that information gets lost and is time
consuming – another reason for assigning the interview work directly to the researcher. The staff member
who conducts the interviews should at least be in close contact with the researchers responsible for the final
report, also during the reporting phase.
For Module (b), a common challenge is how to deal with non-responses that lead to missing values in the data. One
approach is to drop observations for which values are missing. To the extent that these values are not missing at
random, however, this will induce biased estimates. One can easily imagine that specific firms, for example those
exhibiting particularly high or particularly low profits, are more or less willing to respond to questions on profits.
Hence, the researcher has to find ways to impute missing values in order to avoid biased results. The easiest
way is to simply fill in mean values for certain subcategories of firms. For example, one might impute profits
according to the number of customers. More sophisticated imputation algorithms are presented by King et al.
(2001) or implemented in statistical software packages such as the module ICE within STATA.
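The simple mean imputation by subcategory described above can be sketched as follows. This is only a minimal illustration: the firm records, the two customer categories ("low", "high") and the function name impute_group_mean are hypothetical, and the sketch does not replace the multiple imputation procedures mentioned above.

```python
from statistics import mean

# Hypothetical survey records; profit is None where the respondent did not answer.
firms = [
    {"id": 1, "customers": "low", "profit": 100.0},
    {"id": 2, "customers": "low", "profit": None},
    {"id": 3, "customers": "low", "profit": 140.0},
    {"id": 4, "customers": "high", "profit": 400.0},
    {"id": 5, "customers": "high", "profit": None},
    {"id": 6, "customers": "high", "profit": 500.0},
]


def impute_group_mean(records, group_key, value_key):
    """Return copies of the records with missing values set to the group mean."""
    # Collect the observed (non-missing) values per subcategory.
    observed = {}
    for record in records:
        if record[value_key] is not None:
            observed.setdefault(record[group_key], []).append(record[value_key])
    group_means = {group: mean(values) for group, values in observed.items()}

    completed = []
    for record in records:
        row = dict(record)
        # Flag imputed values so that they can be reported separately.
        row["imputed"] = record[value_key] is None
        if row["imputed"]:
            # Stays None if no firm in the subcategory answered the question.
            row[value_key] = group_means.get(record[group_key])
        completed.append(row)
    return completed


completed = impute_group_mean(firms, "customers", "profit")
for row in completed:
    print(row["id"], row["customers"], row["profit"], row["imputed"])
```

Keeping the flag for imputed values makes it possible to check later whether the results change when imputed observations are excluded.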
For Module (a), the problem of missing data will be much less severe, as the principal idea of this module is to
aim for the easy-to-get data. Variables for which one expects high non-response rates should consequently
not be included in the Module (a) questionnaire.
For Module (a) and (b), basic data analysis can be done with Excel, which suffices to calculate frequencies, percentage
distributions (proportions), means, medians and ratios. Advanced data analysis for Module (b) (regressions,
difference-in-differences, matching etc.) has to be done using special statistical software packages like SPSS or
STATA. These techniques can only be applied by researchers familiar with statistics and econometrics, ideally
documented by a list of academic publications in the fields of impact evaluation and applied econometrics.
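For readers who prefer a script to a spreadsheet, the basic statistics named above (frequencies, percentage distributions, means, medians and ratios) can be computed in a few lines. The following is a sketch using only the Python standard library; the records and variable names are hypothetical and merely stand in for an entered data sheet.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical Module (a) records, one dictionary per interviewed firm.
firms = [
    {"sector": "tailoring", "employees": 2, "monthly_sales": 300.0},
    {"sector": "tailoring", "employees": 1, "monthly_sales": 150.0},
    {"sector": "milling", "employees": 4, "monthly_sales": 800.0},
    {"sector": "retail", "employees": 1, "monthly_sales": 200.0},
]

# Frequencies and percentage distribution of firms by sector.
frequency = Counter(firm["sector"] for firm in firms)
percentage = {sector: 100.0 * n / len(firms) for sector, n in frequency.items()}

# Mean and median of monthly sales.
mean_sales = mean(firm["monthly_sales"] for firm in firms)
median_sales = median(firm["monthly_sales"] for firm in firms)

# A simple ratio: total sales per employee across all firms.
sales_per_employee = (
    sum(firm["monthly_sales"] for firm in firms)
    / sum(firm["employees"] for firm in firms)
)

print(frequency, percentage, mean_sales, median_sales, sales_per_employee)
```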
The applied methods should be based on the established literature on impact evaluation: Ravallion (2008)
provides a comprehensive overview of impact evaluation methods in development projects. Peters (2009)
proposes hands-on solutions in electrification projects that are feasible even with limited research budgets.
Examples of applied evaluations in development projects are numerous. There are many excellent papers in
the literature but most of them have been elaborated based on surveys or data sets beyond the scope of the
PUE impact M&E systems presented here. The following papers, however, are examples for methodologically
proper evaluations based on limited sample sizes and can be considered as role models for methods to be
applied in Module (b): Becerril and Abdulai (2010), Becchetti and Costantino (2008), Bensch, Kluve and Peters
(2011), Kondo et al. (2008), Peters, Vance and Harsdorff (2011), Schmook and Vance (2009).
For Module (c), the collected qualitative information has to be analysed systematically along the lines of the
guiding research questions. This includes a critical assessment of who has been referred to as information
sources and how to interpret the statements of the respondents.
Step 5b: Reporting
The final report of a PUE impact M&E effort should contain a documentation of the important steps sketched
in this guide. First, the project should be described with a focus on its theory of change (results chain); this
includes activities, important steps, regional foci, objectives and intended impacts. The study and survey
implementation including sampling method and sample size as well as the identification strategy have to be
presented. For Module (b), the extent to which the applied methods are in line with the related literature
should be documented. In particular for Module (c), the analytical approach has to be clearly delineated in
order to allow for inter-subjective verifiability.
The collected data can then be used to describe the socio-economic situation in the survey (and control)
region. Only variables that are not expected to be affected by the project should be included in this description.
The variables to be affected, that is, the indicators selected from the list in Aid 3 (see Step 2b), can then be
presented in the results chapter. Sample selection issues or other potential caveats that might distort the
accuracy of the findings should be critically discussed.
Analysing, understanding and digesting the collected information requires sufficient time, which should be
granted to the researchers. This should also involve a discussion of the preliminary results with the project
staff and others, e.g. the local partner institution(s). The time for data analysis and reporting can range from
around 2 months in Module (a), 3 months in Module (c) to 3-6 months in Module (b). In particular, if advanced
statistics and econometrics are to be employed, the data analysis and reporting cannot be done in a few
weeks. Note that the effective number of working days to be budgeted is lower. A longer period of 6 months is advisable
in order to allow for the review and feedback rounds that are required to draft an understandable
report on a high methodological level delivering policy-relevant results.
Among the different objectives of a PUE impact M&E system are learning effects for the project itself. Therefore,
beyond the pure analysis of the data and its reporting, researchers should derive recommendations useful for
the project and beyond. In the first place, of course, this concerns suggestions to improve the potentials to gen-
erate positive impacts (or also to avoid negative ones). For example, the PUE impact M&E might reveal that re-
gional differences in impacts exist (e.g. due to different market access or different production patterns or enter-
prise types). This would lead to the recommendation to focus more on certain regions or types of firms.
A potential recommendation could as well be to modify the communication towards the public based on
which impacts could be evidenced or not. For example, in one segment a PUE impact M&E could find substan-
tial benefits for the target group (e.g. households that enjoy lighting) and in another segment impacts are
rather thin (e.g. no substantial productive take-up of electricity). The report should formulate this explicitly
and recommend calibrating the communication of impacts (e.g. ‘Do not promise substantial productive use
impacts, but highlight the social impact of the project among households’).
Beyond the recommendations directly linked to impact results and potentials, other insights gained during
the field work should be captured and used for developing further recommendations. The field work during
impact surveys always brings the researchers extremely close to the target region and its people as well as
intermediate partners such as private or community operators. Experience in many projects has shown that this
close interaction often reveals weaknesses of the project implementation as well as potentials to improve it.
5. Aid Items
The following aids are composed of short instructions that shall give guidance for and ease implementation of
PUE impact M&E activities.
Aid 1. The Results Chain Concept and Demarcation Between Outcomes and Impacts
Aid 2. Strategies to Identify the Counterfactual Situation
Aid 3. List of Indicators
Aid 4. Outline of Terms of Reference for Short-Term Experts
Aid 5. Outline of Inception Report
Aid 1. The Results Chain Concept and Demarcation Between Outcomes and Impacts
The demarcation between outcomes and impacts can be visualised using the results chain concept. For a
stylised illustration, the results chain of an electrification project is presented that promotes the provision of
electricity by supporting the national utility to extend the electricity grid (see Figure 10). In this case it is the
ambition of the programme to connect households and SMEs. Hence, the outcomes of the programme are
connected households or firms. The purposes for which electricity is used in connected households and firms are of
course also relevant to the project, but the project can hardly influence this usage. The usage of electricity
lies beyond the so-called attribution gap and, therefore, is an intermediate impact. Everything that happens
as a result of this usage, for example an increase in productivity, constitutes an impact. Here, potentially
observed changes can hardly be attributed to the programme alone but may as well be due to external factors.
Please note that the results chain presented here cannot serve as a blueprint for electrification projects. The
reality of electrification projects (and development projects in general) is much more complex. Figure 10 only
serves to introduce the principle of the project’s theory of change and to highlight that it has to be clear
before a PUE impact M&E can be designed, which results are considered as direct ones (hence: outcomes) and
indirect ones (impacts). In most cases, it requires several results chains, not just one, to visualise the different
channels via which the project intends to achieve its outcomes and impacts.
Aid 2. Strategies to Identify the Counterfactual Situation
In order to determine the true effect of electrification on the chosen outcome indicators, one would have to
compare the outcome variable after electrification to the counterfactual situation of not having received it.
The counterfactual situation shows how the firm would perform if it had not been connected. Such a direct
comparison is obviously impossible, since we can never observe both situations: the firm either gets connected
once the electricity service is available or it does not. To solve this, an identification strategy is required
that allows replacing the unobservable counterfactual situation by something that is observable. This section
briefly describes the different strategies that exist and that are referred to in Section 3.2. Basically, one can
compare the connected firm after the project to the same firm before electrification or, alternatively, one can
compare the connected firm to another, unconnected firm at the same time.
In all approaches a particularity of electrification projects has to be taken into account: customers, be they
households or enterprises, decide whether or not to connect to the grid or to obtain a solar home system. In
most projects, comparatively high connection fees and installation costs prevent a considerable share of
households and enterprises from getting connected. For evaluation purposes, this creates the temptation to
compare the connected firms to the non-connected ones in order to determine the impact of electrification.
However, this is very likely to be a comparison of apples and oranges, since the firms that
have decided to connect or purchase a solar system differ from those that have not, resulting in a
self-selection bias (see below for examples).
Figure 10: Exemplary Results Chain of Grid Extension Programme
[Figure: results chain, read from input to impact. INPUT: project personnel, funds. ACTIVITIES: national utility
receives funds and technical support to implement grid extension. OUTPUT: national grid is extended to previously
non-electrified villages. Attribution gap. INTERMEDIATE IMPACT: micro-enterprises use electricity to power electric [...].
IMPACT: income generation, savings in energy, productivity, job creation.]
1. Simple Before-After Comparison
For this approach, impact indicators are compared for the same firm before and after electrification. Any differ-
ence is then attributed to the electrification intervention. The underlying assumption is that the firm before
electrification is the counterfactual situation of the firm after electrification. In other words, performance of the
firm would not have changed, had there been no electrification intervention. One can imagine that this does not
hold true in many cases. For example, different harvest yields over time might affect the purchasing power of the
firm’s customers in the region thereby affecting the firm’s performance. In addition, other external factors could
change that are unobservable for the researchers. A before-after comparison can be a valid approach only if such
factors can be ruled out or are otherwise taken into account, for example on the basis of qualitative interviews
with key informants. Quantifying these factors, if they exist, will however be difficult in most cases.
An advantage of the before-after comparison is that it does not suffer from self-selection problems since only
connected firms are examined.
2. Cross-Sectional Comparison
In the cross-sectional approach, connected firms are compared to non-connected firms. The difference in
indicators is then considered as the impact. The basic identification assumption here is that the non-connected
firms behave as the connected ones would if they had not connected to the grid. This amounts to assuming
that there are no systematic differences between those firms that get connected and those that do not.
However, this assumption is likely to be violated in an oversimplified approach: for example, one might expect
that better-educated entrepreneurs are more likely to get connected. At the same time, better-educated
entrepreneurs are likely to be more productive and, hence, have higher profits. As a consequence, these better-
educated entrepreneurs are more likely to be connected and to exhibit better performance indicators at the
same time. By simply comparing mean values of, for example, profits between connected and non-connected
firms, one would ascribe to the connection status at least part of a difference that is in fact
induced by differences in the educational level.
Such confounding effects can be separated into observable and unobservable differences between connected
and non-connected firms. The education of firm owners or employees, for example, is observable and can be
accounted for by applying multivariate regression techniques. Thereby, the effect of being connected to the
grid can be isolated holding other firm characteristics constant.
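The idea of holding an observable characteristic constant can be illustrated without a full regression. The following sketch compares connected and non-connected firms within education groups and averages the within-group differences. It is a simplified stand-in for the multivariate regression techniques mentioned above, not the method used in the cited studies, and all numbers and names are hypothetical.

```python
from statistics import mean

# Hypothetical firms: (education of owner, connected to the grid?, monthly profit).
firms = [
    ("high", True, 300.0), ("high", True, 320.0), ("high", True, 310.0),
    ("high", False, 290.0),
    ("low", True, 110.0),
    ("low", False, 100.0), ("low", False, 90.0), ("low", False, 95.0),
]


def naive_difference(rows):
    """Difference in mean profits, ignoring education."""
    connected = [profit for _, is_connected, profit in rows if is_connected]
    others = [profit for _, is_connected, profit in rows if not is_connected]
    return mean(connected) - mean(others)


def stratified_difference(rows):
    """Average of within-education differences, weighted by group size."""
    weighted_sum, total = 0.0, 0
    for level in sorted({education for education, _, _ in rows}):
        connected = [p for e, c, p in rows if e == level and c]
        others = [p for e, c, p in rows if e == level and not c]
        if not connected or not others:
            continue  # no comparison possible within this group
        size = len(connected) + len(others)
        weighted_sum += size * (mean(connected) - mean(others))
        total += size
    if total == 0:
        raise ValueError("no education group contains both types of firms")
    return weighted_sum / total


naive = naive_difference(firms)
adjusted = stratified_difference(firms)
print(naive, adjusted)
```

In this constructed example the naive difference is much larger than the education-adjusted one, because well-educated owners are over-represented among the connected firms.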
In contrast to the educational level, some of the differences may be hard to capture and remain unobserved.
One example of a potentially unobserved difference that might violate the identification assumption is the
entrepreneur’s motivation. It is hardly measurable, it potentially affects the decision to connect and might as
well affect impact variables such as profits. Again, without controlling for the entrepreneur’s motivation one
will ascribe parts of the difference between connected and non-connected firms to the grid connection,
although the connected firm would also exhibit better performance outcomes without a connection, because the
difference is due to the omitted variable motivation. This is commonly referred to as selection into treatment
and ultimately leads to biased results.
Problems resulting from such systematic differences between firms have to be addressed both on the level of
data collection and analysis. On the level of data collection, all important characteristics of firms and firm
owners that potentially affect the decision to connect and the impact indicators should be included. The
study team needs to assess what driving forces are behind the decision to connect. Variables that are difficult
to capture with a structured questionnaire might be grasped in accompanying qualitative interviews.
On the level of data analysis, methods like matching can improve the comparability of connected and non-connected
firms (PRODUSE Chapter 3, Peters, Vance and Harsdorff 2011 and Bensch, Kluve and Peters 2011). The
quest for more comparable firms becomes easier if the non-connected firms are taken from a region where
electricity service is not available, for example because it is not covered by the grid (see again Peters, Vance and
Harsdorff 2011 and Bensch, Kluve and Peters 2011). The challenge in implementing such a control region
approach is to find regions that are comparable to the region under evaluation. It is difficult to rule out that
differences between regions that are invisible at first (preparatory) glance are uncovered during the field
work (see PRODUSE Chapter 5).
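To give an intuition of what matching does, the following sketch pairs each connected firm with the most similar non-connected firm in terms of observed characteristics and averages the profit differences. It is a deliberately simple nearest-neighbour variant on raw, unscaled characteristics, not the propensity score matching applied in the cited papers; the data, the two characteristics and the function name are hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical firms: (owner's years of schooling, firm age in years, monthly profit).
connected = [(10, 5, 300.0), (4, 2, 120.0)]
non_connected = [(9, 5, 280.0), (3, 2, 100.0), (6, 10, 150.0)]


def nearest_neighbour_effect(treated, controls):
    """Mean profit difference between each treated firm and its closest control.

    Closeness is the Euclidean distance on the observed characteristics
    (all fields except the last one, which holds the profit). Controls can
    be reused for several treated firms (matching with replacement).
    """
    differences = []
    for *traits, profit in treated:
        match = min(controls, key=lambda control: dist(traits, control[:-1]))
        differences.append(profit - match[-1])
    return mean(differences)


unmatched = mean(f[-1] for f in connected) - mean(f[-1] for f in non_connected)
matched = nearest_neighbour_effect(connected, non_connected)
print(unmatched, matched)
```

In practice the characteristics would need to be put on a comparable scale first, and firms without a sufficiently similar counterpart would be dropped from the comparison.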
3. Before-After Comparison with a Control Group
By using a control group, the above-mentioned problems of simple before-after comparisons are largely eliminated:
the control group accounts for the fact that the environment of the firm under investigation is changing
over time, which might also influence the impact indicators. If, for example, harvest yields change and
thereby the purchasing power of the local population as well, this will also affect the sales of firms in the re-
gion. Simply comparing the before and after sales of a meanwhile connected firm would bias the assessed
impact of electrification on sales. Including other non-connected firms from the region helps to estimate the
effect of the changed harvest yields, so that this effect can be differenced out to get the net impact of electri-
fication. Methodologically, this is done by comparing the change in impact indicators of connected firms to
the change in impact indicators for non-connected firms. The remaining assumption that has to be formulated
is that the environmental change, harvest yields for example, affects both connected and non-connected
firms in the same way. If doubts about the validity of this assumption remain, propensity score matching can
improve the comparability of control and connected firms and, hence, the validity of the assumption. As for
the cross-sectional approach, the control enterprises are in the optimal case recruited from a comparable
non-electrified region that remains non-electrified.
In addition, the problems arising from selection-into-treatment effects (see Section 2) are excluded as long as
the unobserved characteristics that affect both the decision to connect and the impact indicators remain
constant over time.
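The differencing logic of comparing changes between connected and non-connected firms can be written down in a few lines. The sketch below uses hypothetical sales figures; it only illustrates the arithmetic and leaves aside standard errors and further control variables, which would require a regression framework.

```python
from statistics import mean

# Hypothetical firms: (connected?, sales before electrification, sales after).
firms = [
    (True, 200.0, 270.0), (True, 200.0, 250.0),
    (False, 180.0, 205.0), (False, 180.0, 215.0),
]


def mean_change(rows, connected):
    """Mean before-after change in sales for one group of firms."""
    return mean(after - before for c, before, after in rows if c == connected)


# Simple before-after comparison for connected firms only.
change_connected = mean_change(firms, True)
# Change among non-connected firms, capturing external factors such as harvests.
change_control = mean_change(firms, False)
# Net impact: the change of connected firms minus the change of the control group.
net_impact = change_connected - change_control

print(change_connected, change_control, net_impact)
```

With these numbers, the simple before-after comparison would attribute an increase of 60 to electrification, whereas half of it also occurs among non-connected firms and is differenced out.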
The fourth approach combines the cross-sectional approach with the before-after comparison with a control
group. At the baseline stage, a control region is selected that is already covered by the electricity service. This
allows for a cross-sectional comparison of non-connected firms in the yet non-electrified project region to con-
nected firms in a region that had been electrified before – be it in an earlier phase of the same electrification
project or by another project (see Bensch, Kluve and Peters 2011 and Peters, Vance and Harsdorff 2011). In addition,
an ex-post evaluation can be conducted using the control region (and those firms in the project region that
chose not to get connected) to filter out the effects of changing external conditions (see Peters 2009).
The second, third and fourth approaches presented are non-experimental methods, which use statistical
techniques to construct the counterfactual. There is another group of impact evaluation methods known
as randomised experiment designs, which are similar to controlled medical experiments in that they use randomisation
to obtain the counterfactual. Since these designs are typically not applicable to productive use interventions,
the reader is simply referred to Gertler et al. (2010) and Ravallion (2008) at this point. Finally, another option
(mentioned in Section 3.2) for the case in which no information has been gathered before project implementation
is to employ retrospective questions that try to elicit ex-post the information of interest from the interviewee
(Bamberger 2010). These, however, have the strong disadvantage of being subject to respondents’ recall
errors; they should therefore be applied only in particular cases, and their results should be interpreted cautiously.
Aid 3. List of Indicators
This list presents impact indicators for productive electricity use among SMEs. The observation fields of
expected impacts are supplemented with indicators and sub-indicators (‘What to measure’) that represent
options from which the project can choose. The following column indicates whether a specific
indicator should be elicited by means of a short or a profound SME survey. Recommended indicators and
sub-indicators that form a minimum selection of essential impacts to be observed are highlighted in dark grey.
A sample questionnaire for both the short and the profound enterprise survey is provided in Addendum 2 and
Addendum 3.
Columns: Observation Field | MDG Relevance | Indicator | What to Measure | Recommended Approach to Study | Essential Indicator
Improved food storage | MDG 1 | Penetration of food storage appliances | Number and percentage of SMEs owning fridges, freezers for food storage (commercial use) | Short and profound SME survey |
Aid 4. Outline of Terms of Reference for Short-Term Experts
1. Background
In this section of the Terms of Reference (ToR) information shall be provided on the context of the evaluation, for
example if the evaluation is embedded in an overarching monitoring and/or evaluation process. The relevance of
the evaluation shall be outlined briefly. It shall furthermore be explained why the evaluation takes place at the
projected point of time.
This section shall contain details on the project or programme that is going to be evaluated. These details comprise
basic information such as the name, project/programme period and geographic intervention zone as well as
a description of the concrete activities of the intervention and the rationale behind them.
In this section of the ToR, the specific objectives of the evaluation have to be listed and, if deemed appropriate,
briefly explained. These objectives may, for example: (i) serve as a data basis for monitoring activities, (ii) assess
the impacts of the electrification intervention on an empirically sound basis or – more specifically – (iii) deter-
mine the (net) employment and income effects of the SME electrification intervention or (iv) develop recommen-
dations concerning potential complementary activities. If the study is a baseline, the objectives may rather be to
(v) provide benchmark data for a potential ex-post impact evaluation, (vi) portray the local economic conditions
in the project areas or (vii) reduce uncertainty about demand assumptions in the target region.
It is furthermore helpful to set up in this section a list of questions that the evaluation design is meant to help
answer (e.g. which of the observed changes can be causally attributed to the SME electrification and which
can only be plausibly attributed to it?).
4. Methods
The methods to be applied (including identification strategy and sample size) and their application to the concrete
setting shall be outlined in this section.
5. Implementation
The responsibilities of the different persons and organisations involved in the evaluation shall be defined here. This
also entails the definition of tasks to be executed by the evaluator. An indicative list of tasks is presented below:
7. Exploration and selection of control sites
8. First review of questionnaire
9. Planning organisation of the survey and logistics with the assistance of field work personnel
10. Training course for the survey team (supervisor and enumerators) concerning survey objective, design
and execution as well as data compilation
11. Pre-test at one of the project sites
12. Final revision of questionnaire
13. Review of organisation and logistics for the survey with the assistance of field work personnel
15. Provision of organisational and methodological backstopping to the field work personnel
16. Adaptations of the intended survey methodology (sample size, target villages/ regions/ SMEs,
questionnaire, etc.), if needed
6. Timeline
Along the different tasks and task sets, a timeline for the whole survey process is to be established.
Aid 5. Outline of Inception Report
If consultants are engaged to design an impact M&E system or to conduct baseline or impact studies, they
should be required to submit an inception report. The inception report has to be submitted by those who will
implement the PUE Impact M&E. This is imperative in particular if external researchers are hired. The purpose of
the inception report is to familiarise those responsible for the PUE Impact M&E with the proposed methodological
approach at the outset of the research effort.
1.2. General Conditions and Context
Sector and area specific conditions that either favour or adversely affect project implementation and the
achievement of impacts are to be specified here. This particularly includes other donors’ activities in the intervention
area(s).
2. Methodology
3. Implementation
This chapter shall include a table depicting the concrete sample size, if possible specifying the number of different
types of SMEs to be interviewed. The data collection and data analysis process shall be outlined, supplemented
by a time schedule defining what (task) is going to be done by whom (persons involved) with what (resources) by
when (date).
Addenda
The ‘Guide to Monitoring and Evaluation for Energy Projects’ prepared by the international working group of
M&EED (Monitoring and Evaluation in Energy for Development) in 2006 proposes a step by step approach to
building project-specific monitoring and evaluation procedures for energy access projects while being more
concerned with monitoring:
www.hedon.info/docs/MandEEDGuideFinalVersionEnglish.pdf
An extensive guide on ‘Monitoring and Evaluation in Rural Electrification Projects: A Demand-Oriented Approach’
was compiled by the Energy Sector Management Assistance Program (ESMAP) in 2003. This approach,
intended to be both poverty and gender sensitive, blends qualitative and quantitative techniques of
participatory assessments and socio-economic impact surveys.
http://go.worldbank.org/JN30SKKFR0
Within the series ‘Directions in Development’ the World Bank published in 2000 a ‘Handbook for Practitioners’
on ‘Evaluating the Impact of Development Projects on Poverty.’ It is a comprehensive guide both delivering the
methodological evaluation background and guidance on good evaluation practice.
http://go.worldbank.org/8E2ZTGBOI0
The publication by the Network of Networks on Impact Evaluation ‘Impact Evaluations and Development:
NONIE Guidance on Impact Evaluation’ (2009) contains an introduction into the theory and practice of rigor-
ous impact evaluation. The first block is on methodological and conceptual issues while the second deals with
managing impact evaluation and addresses aspects of evaluability, benefits and costs and planning.
http://www.worldbank.org/ieg/nonie/guidance.html
The book ‘Impact Evaluation in Practice’ published in 2010 is a comprehensive non-technical introduction to
the topic of impact evaluation and its practice in development. The material ranges from motivating impact
evaluation, to the advantages of different methodologies, to power calculations and costs. The book is geared
specifically towards development practitioners and policymakers designing prospective impact evaluations.
www.worldbank.org/ieinpractice
Disclaimer
Following these links means leaving this guide and entering an external web link. The links provide additional
information that may be useful or interesting and is being provided consistent with the intended purpose of
this guide. However, we cannot attest to the accuracy of information provided by this link or any other linked
site. Providing links to an external web site does not constitute an endorsement by the authors of this guide
of the sponsors of the site or the information or products presented on the site.
References
Bamberger, M. (2010): Reconstructing Baseline Data for Impact Evaluation and Results Measurement. PREM
Notes, Special Series on Nuts and Bolts of M&E Systems, No. 4, World Bank, Washington D.C.
Becchetti, L. and Costantino, M. (2008): The Effects of Fair Trade on Affiliated Producers: An Impact Analysis
on Kenyan Farmers. World Development 36 (5): 823-842.
Becerril, J. and Abdulai, A. (2010): The Impact of Improved Maize Varieties on Poverty in Mexico: A Propensity
Score-Matching Approach. World Development, Vol. 38 (7), pp. 1024-1035.
Bensch, G. and Kluve, J. and Peters, J. (2011): Impacts of Rural Electrification in Rwanda. Journal of Development
Effectiveness 3 (4): 567-588.
Estache, A. (2010): A Survey of Impact Evaluations of Infrastructure Projects, Programs and Policies. ECARES
working paper 2010-005, Université Libre de Bruxelles, Brussels.
Gertler, P. J. and Martinez, J. and Premand, P. and Rawlings, L.B. and Vermeersch, C. M. J. (2010): Impact
Evaluation in Practice. World Bank Publications. Washington, D.C.
GTZ – Deutsche Gesellschaft für Technische Zusammenarbeit (2007): Impact Monitoring Guide. Salvador.
Iarossi, G. (2007): The Power of Survey Design. A User’s Guide for Managing Surveys, Interpreting Results, and
Influencing Respondents. World Bank, Washington D.C.
IEA – International Energy Agency (2012): World Energy Outlook 2012. Paris.
IEG - Independent Evaluation Group (2008): The Welfare Impact of Rural Electrification: A reassessment of
the costs and benefits. An IEG Impact Evaluation, World Bank. Washington D.C.
Kondo, T. and Orbeta, A. and Dingcong, C. and Infantado, C. (2008): Impact of Microfinance on Rural House-
holds in the Philippines. Network of Networks on Impact Evaluation (NONIE) Working Paper No. 4.
King, G. and Honaker, J. and Joseph, A. and Scheve, K. (2001): Analyzing Incomplete Political Science Data: An
Alternative Algorithm for Multiple Imputation. American Political Science Review, Vol. 95 (1): 49-69.
M&EED Group (2006): A Guide to Monitoring and Evaluation for Energy Projects. Monitoring and Evaluation
in Energy for Development International Working Group. Retrieved from:
http://www.gvepinternational.org/sites/default/files/resources/MEED_Guide_final_version_english.pdf
Magnani, R. (1997): Sampling Guide. Food and Nutrition Technical Assistance Project (FANTA), Washington
D.C.
Peters, J. (2009): Evaluating Rural Electrification Programs: The Methodology of Ex-Ante Impact Assessments.
Well-Being and Social Policy, Vol. 5 (2), pp. 25-40.
Peters, J. and Vance, C. and Harsdorff, M. (2011): Grid Extension in Rural Benin – Micro-manufacturers in the
Electrification Trap. World Development, Vol. 39 (5), pp. 773-783.
UN – United Nations (2005): The Energy Challenge for Achieving the Millennium Development Goals. New
York.
UN – United Nations (2010): Energy for a Sustainable Future. The Secretary General’s Advisory Group on
Energy and Climate Change. New York.
Warwick, D. P. and Lininger, C. A. (1975): The Sample Survey: Theory and Practice. New York.
White, H. (2002): Combining Quantitative and Qualitative Approaches in Poverty Analysis. World Develop-
ment 30 (3): 511-522.