Introduction

This section presents a standard protocol for drawing a national probability
sample for an Afrobarometer survey. Regardless of whether a previous
survey has been conducted in a country, a new sample has to be drawn for each round of Afrobarometer
surveys. Whereas the standard sample size for Round 3 surveys will be
1200 cases, a larger sample size will be required in societies that are extremely
heterogeneous (such as South Africa and Nigeria), where the sample size will be
increased to 2400. Other adaptations may be necessary within some countries to
account for the varying quality of the census data or the availability of census
maps. The sample is designed as a representative cross-section
of all citizens of voting age in a given country. The goal is to give every adult
citizen an equal and known chance of selection for interview. We strive to reach
this objective by (a) strictly applying random selection methods
at every stage of sampling and by (b) applying sampling with probability proportionate
to population size wherever possible. A randomly selected sample of 1200 cases
allows inferences to national adult populations with a margin of sampling error
of no more than plus or minus 2.8 percent with a confidence level of 95 percent.
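
As a rough check on sampling error for the two standard sample sizes, the sketch below applies the textbook margin-of-error formula for a simple random sample. It assumes the most conservative proportion (p = 0.5) and a z-value of 1.96 for 95 percent confidence. The function name is illustrative, and the formula ignores the clustering and stratification of the actual design, so the results are approximations only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a proportion,
    assuming a simple random sample of size n (an assumption)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1200, 2400):
    # Prints the margin in percentage points, rounded to one decimal.
    print(n, round(100 * margin_of_error(n), 1))
# 1200 2.8
# 2400 2.0
```
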
If the sample size is increased to 2400, the margin of error shrinks to plus
or minus 2 percent.

Sample Universe

The sample universe for Afrobarometer surveys includes all citizens of
voting age within the country. In other words, we exclude anyone who
is not a citizen and anyone who has not attained this age (usually 18 years) on
the day of the survey. Also excluded are areas determined to be either inaccessible
or not relevant to the study, such as those experiencing armed conflict or natural
disasters, as well as national parks and game reserves. As a matter of practice,
we have also excluded people living in institutionalized settings, such as students
in dormitories and persons in prisons or nursing homes.

What should be done about
areas experiencing political unrest? On the one hand, we want to include them because
they are politically important. On the other hand, we want to avoid stretching
out the fieldwork over many months while we wait for the situation to settle down.
It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to
come up with a general rule that will fit all imaginable circumstances. We will
therefore make judgments on a case-by-case basis on whether or not to proceed
with fieldwork or to exclude or substitute areas of conflict. National Partners
are requested to consult Core Partners on any major delays, exclusions or substitutions
of this sort.

Sample Design

The sample design is a clustered, stratified, multi-stage, area probability sample.
To repeat the main sampling principle, the objective of the design is
to give every sample element (i.e. adult citizen) an equal and known
chance of being chosen for inclusion in the sample. We strive to reach this objective
by (a) strictly applying random selection methods at every stage
of sampling and by (b) applying sampling with probability proportionate to population
size wherever possible. In a series of stages, geographically defined
sampling units of decreasing size are selected. To ensure that the sample
is representative, the probability of selection at various stages is adjusted
as follows:

- The sample is stratified by key social
characteristics in the population such as sub-national area (e.g. region/province)
and residential locality (urban or rural). The area stratification reduces the
likelihood that distinctive ethnic or language groups are left out of the sample.
And the urban/rural stratification is a means to make sure that these localities
are represented in their correct proportions.
- Wherever possible,
and always in the first stage of sampling, random sampling is conducted with probability
proportionate to population size (PPPS). The purpose is to guarantee
that larger (i.e., more populated) geographical units have a proportionally greater
probability of being chosen into the sample.
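
To illustrate the PPPS principle, here is a minimal sketch of one common way to implement it: systematic selection along a running total of unit populations. This is not necessarily the procedure given in Appendix 6, which is not reproduced here. The function name, the unit labels, and the population figures are hypothetical, and a pseudo-random number generator stands in for a table of random numbers.

```python
import random

def ppps_select(units, n_select, rng=None):
    """Systematic selection with probability proportionate to population size.

    units    -- list of (name, population) pairs, in sampling-frame order
    n_select -- number of units to draw
    Returns the selected names. A unit larger than the sampling interval
    can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(pop for _, pop in units)
    interval = total / n_select
    start = rng.random() * interval            # random start within the first interval
    targets = [start + i * interval for i in range(n_select)]

    chosen, cumulative, t = [], 0, 0
    for name, pop in units:
        cumulative += pop                      # running population total
        while t < n_select and targets[t] < cumulative:
            chosen.append(name)                # this unit's range contains the target
            t += 1
    return chosen

# Hypothetical frame: five units with invented population counts.
eas = [("EA-1", 1000), ("EA-2", 100), ("EA-3", 100), ("EA-4", 100), ("EA-5", 700)]
print(ppps_select(eas, 2, random.Random(1)))
```

In this invented frame the interval is 1000 persons, so EA-1 (population 1000) is certain to be drawn, while EA-5 is seven times as likely as EA-2 to supply the second selection.
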

The sampling design has four stages:

- A first stage to stratify and randomly select primary sampling units;
- A second stage to randomly select sampling start points;
- A third stage to randomly choose households;
- A final stage involving the random selection of individual respondents.

We shall deal with each of these stages in turn.

STAGE ONE: Selection of Primary Sampling Units (PSUs)

The primary sampling units (PSUs) are the smallest well-defined geographic units for which
reliable population data are available. In most countries, these will be Census
Enumeration Areas (or EAs). Most national census data and maps
are broken down to the EA level. In the text that follows we will use the acronyms
PSU and EA interchangeably because, when census data are employed, they refer
to the same unit. We strongly recommend that National Investigators (NIs) use official national
census data as the sampling frame for Afrobarometer surveys. Where recent or reliable
census data are not available, NIs are asked to inform the relevant Core Partner
before they substitute any other demographic data. Where the census is out of
date, NIs should consult a demographer to obtain the best possible estimates of
population growth rates. These should be applied to the outdated census data in
order to make projections of population figures for the year of the survey. It
is important to bear in mind that population growth rates vary by area (region)
and (especially) between rural and urban localities. Therefore, any projected
census data should include adjustments to take such variations into account. Indeed,
we urge NIs to establish collegial working relationships with professionals
in the national census bureau, not only to obtain the most recent census data,
projections, and maps, but to gain access to sampling expertise. NIs may even
commission a census statistician to draw the sample to Afrobarometer specifications,
provided that provision for this service has been made in the survey budget.
Regardless of who draws the sample, the NIs should thoroughly acquaint themselves
with the strengths and weaknesses of the available census data and the availability
and quality of EA maps. The country and methodology reports should cite the exact
census data used, its known shortcomings, if any, and any projections made from
the data. At minimum, the NI must know the size of the population and the urban/rural
population divide in each region in order to specify how to distribute population
and PSUs in the first stage of sampling. National Investigators should
obtain this written data before they attempt to stratify the sample. Once
this data is obtained, the sample population (either 1200 or 2400) should be stratified,
first by area (region/province) and then by residential locality (urban or rural).
In each case, the proportion of the sample in each locality in each region should
be the same as its proportion in the national population as indicated by the updated
census figures. Having stratified the sample, it is then possible to determine
how many PSUs should be selected for the country as a whole, for each region,
and for each urban or rural locality. The total number of PSUs
to be selected for the whole country is determined by calculating the maximum
degree of clustering of interviews one can accept in any PSU. Because PSUs (which
are usually geographically small EAs) tend to be socially homogeneous, we do not
want to select too many people in any one place. Thus, the Afrobarometer has established
a standard of no more than 8 interviews per PSU. For a sample size of 1200, the
sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size
of 2400, there must be 300 PSUs/EAs. These PSUs should then be allocated
proportionally to the urban and rural localities within each regional stratum
of the sample. Let’s take a couple of examples from a country with a sample
size of 1200. If the urban locality of Region X in this country constitutes 10
percent of the current national population, then the sample for this stratum should
be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of
Region Y constitutes 4 percent of the current national population, then the sample
for this stratum should be 6 PSUs (4 percent of 150 PSUs).

The next step is to select particular
PSUs/EAs using random methods. Using the above example of the rural localities
in Region Y, let us say that you need to pick 6 sample EAs out of a census list
that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created
by the national census bureau are of equal or roughly equal population size, then
selection is relatively straightforward. Just number all EAs consecutively, then
make six selections using a table of random numbers. This procedure, known as
simple random sampling (SRS), will ensure that each EA will have an equal probability
of being sampled. If the PSUs/EAs have different population sizes,
however, then random sampling must be conducted with probability proportionate
to population size (PPPS). The idea here is that units with larger populations
should have a proportionally greater chance (probability) of being chosen. The
PPPS method is not difficult to use and is described in Appendix 6.

Once EAs have been randomly selected, they should be plotted on a national map.
Use this map to plan out the deployment routes for the various field teams. In
some cases, a few EAs may be so inaccessible or so dangerous that substitution
of PSUs becomes necessary. Such substitutions are acceptable as long as they never constitute more
than 5 percent of all PSUs. The best method
is to randomly draw another EA in the hope that it will fall in a more convenient
location. Please record which EAs are substitutes and justify why they were substituted.
If more than 5 percent of PSUs require substitution, then the NI should discard
the entire Stage 1 sample and draw a new one.

Oversampling (Optional)

In some countries, the NI may be concerned that a random sample might
miss a politically important minority group. Or, even if this minority is represented
in the sample in accordance with its share of the national population, there may
be too few cases to make reliable generalizations about the attitudes of this
group. Under these circumstances, over-sampling is permissible,
as was done in Round 1 for the Tuareg, Ijaw, and Coloured minorities in Mali, Nigeria
and South Africa respectively. Purposive over-sampling will also be required as
a condition of one donor’s funding in Round 3; USAID wishes to gather extra
information on certain regions where its projects are located, probably in Mali,
Mozambique, Senegal, South Africa, and Zambia. Note that the over-sample should
be coterminous with a given sampling stratum, usually a region. The NI should
consult the relevant Core Partner about any planned over-sampling and keep detailed
records that allow correct weighting factors to be calculated to correct for over-sampling
at the stage of data analysis.

Additional Cluster (Optional)

In countries where regions are too numerous or too scattered to provide a logistically
feasible sampling frame, an additional stage of clustering can be considered,
as follows:

* Choose a suitable geographic unit between region/province
and EA: e.g. administrative district. In large countries, it may not be practical
to visit all districts or even all regions. Number and stratify all districts
and, using PPPS, randomly choose a subset of these districts. Preferably, the
subset should not be less than half of the total number of districts in the country.
And the subset should always cover all relevant social variations nationwide.
* A population limit shall be set for districts that should be self-representing
(i.e. large districts which must be represented in the sample). Self-representing
districts will thus have a probability equal to one of inclusion in the sample.
* Once PPPS is applied, other districts will have a probability of inclusion
in the sample proportional to their population size.

Additional Stratum (Optional)

In urban areas that have extremely diverse housing patterns,
the NI may choose to add an additional layer of stratification to increase the
likelihood that the sample does not leave out high-density (especially
informal) settlements. Using a street map, a city or town can be divided
into high-, medium-, and low-density areas. It can then be required that PSUs are
represented equally (or better yet, in proportion to population sizes, if these
are known) within the sample for that city or town.

STAGE TWO: Selecting Sampling Start Points (SSPs)

Within each PSU/EA, Field Teams travel to a randomly selected sampling start point (SSP).
Thus the number of start points is the same as the number of PSUs (150
or 300). A sampling start point (SSP) is required so that interviewers
know where to start random walk patterns within each PSU (see next section). This
procedure has the effect of further clustering the sample into manageable areas
that are reachable on foot or by a short vehicle ride. Either in the office
or in the field, the Field Supervisor (FS) selects the SSP using one of the following
three methods.

The ideal method

If the FS is able to obtain a list of all households in a selected EA, then this method should be used.
Possible sources include the national census bureau or the office of district
administrator or local government authority. Once a list is available, the field
supervisor should randomly (using a random numbers table) choose eight households,
and send one Interviewer to each. A detailed map showing all households in the
EA and matching them with the listed names is necessary for this method.
(Note: If this method is used, it is not necessary to apply Stage Three: Selection
of Households. Go straight to Stage Four: Selection of Respondents).
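
The list-based draw described above amounts to a simple random sample without replacement. The sketch below assumes the household listing is available as a Python list and uses a pseudo-random generator in place of a random numbers table. The function name and the household identifiers are hypothetical.

```python
import random

def pick_households(household_list, n=8, rng=None):
    """Draw n distinct households from a complete EA listing."""
    rng = rng or random.Random()
    if len(household_list) < n:
        raise ValueError("listing has fewer households than interviews required")
    return rng.sample(household_list, n)       # sampling without replacement

# Hypothetical listing of 120 households in one EA.
listing = [f"HH-{i:03d}" for i in range(1, 121)]
selected = pick_households(listing, rng=random.Random(7))
print(selected)
```

Each of the eight selected households would then be assigned to one Interviewer, as the text describes.
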

An alternative method (where maps are available for the PSU)

If the
census bureau has provided EA maps, the FS can randomly select a start point using
a grid. The FS places a ruler with numbers along the top of the map and a ruler
with numbers along the side of the map. He/she then uses a table of random numbers
(or a set of numbered cards) to select a number for the top axis and a number
for the side axis, resulting in a random combination (e.g. “9 and 6”).
A horizontal line is then drawn on the map from the number chosen on the side, and
a vertical line is drawn from the number chosen on the top. The point on the
map where these two lines intersect is the sampling start point. The SSP is marked
on the map, and given to the field team for that area. The fieldwork team then
locates the nearest housing settlement to this point, and travels there (or as
near as they can to the point). In rural areas, finding the SSP may require the
field team to consult with local residents. Because we never know in advance
the actual conditions on the ground in all the PSUs, the FS may need to
choose a second sampling start point as a reserve or substitute if the SSP is
inappropriate or inaccessible.

Another alternative (where maps are not available)

When maps are not available for the selected PSU,
the following procedure should be used. The FS contacts a local government councilor
or another official knowledgeable about the area. This person is consulted to
determine how many housing settlements (e.g. villages) are in the PSU. These settlements
must have identifiable boundaries that do not overlap with one another. These
settlements are numbered and, using numbered cards, the FS asks the informant
to randomly select one card. The settlement identified by the selected number
is the settlement where the interviews will be conducted.

IMPORTANT: At
the start point, the FS must be certain to preserve randomness by rotating
the place where Interviewers begin their random walk pattern. If the Team starts
on a main road at one SSP, they should start off the road at the next SSP. If
the Team starts in a central place (like a school) in one EA, they should start
in a peripheral place in the next EA. And so on. The logic of random sampling
is to avoid ANY kind of pattern in the units selected at any stage.

STAGE THREE: Selecting Households

Having arrived at the
sampling start point, the Team is ready to select households. For the
purposes of the Afrobarometer, a household is defined as a group of people who
presently eat together from the same pot. By this definition, a household does
not include persons who are currently living elsewhere for purposes of studies
or work. Nor does a household include domestic workers or temporary visitors (even
if they eat from the same pot or slept there on the previous night). And, in practice,
we want to select our respondent from among persons in the household who will
be available for interview on that same day. In multi-household dwelling
structures (like blocks of flats, compounds with multiple spouses, or backyard
dwellings for renters, relatives, or household workers), each household is treated
as a separate sampling unit.

IMPORTANT: The third (household) and fourth
(respondent) stages of sampling are conducted by Interviewers.
Interviewers must be carefully trained and supervised to ensure that they follow
Afrobarometer sampling instructions to the exact letter. These sampling instructions
are summarized below and spelled out on the first two pages of every questionnaire.
Field Supervisors are responsible for ensuring that their teams
of Interviewers understand their parts of the sampling methodology and execute
them correctly.

The method for selecting households is as follows:

In well-populated urban and rural areas, with single-dwelling units:

Starting as near as possible to the SSP, the FS should choose any random point
(like a street corner, a school, or a water source) being careful to randomly
rotate the choice of such landmarks. The four Interviewers should be instructed
to walk away from this point in the following random directions:
The Walk Pattern: Interviewer 1 walks towards the sun, Interviewer 2
away from the sun, Interviewer 3 at a right angle to Interviewer 1, and Interviewer
4 in the opposite direction from Interviewer 3. If the Team contains more
than four Interviewers, then the FS should take them to another randomly selected
place near the SSP to begin their walk patterns. When interviews are to
be conducted during the night by the whole team (excluding call-backs), the team
should use the moon or some other random landmark to begin the walk pattern (Field
Supervisors should just make sure that Interviewers disperse in directions opposite
to each other).

Each Interviewer should use the day code to
establish an interval (n) for household selection. The day code introduces randomness
into the interval. It is calculated by adding together the digits of the day
of the month as follows. On the 5th, 14th and 23rd of the month the interval
would be 5, but on the 6th, 15th and 24th it would be 6. And so on. On some
days (the 1st and 10th of the month) the Interviewer moves to the adjacent dwelling
structure (because the sampling interval is 1). On the 29th of the month the Interviewer
must leave the widest gap, selecting only every eleventh house. In every
case, the Interviewer selects the nth house on the right.
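
The day code rule above can be written as a short function. The sketch below sums the digits of the day of the month and then picks every nth dwelling from an ordered list. Both function names are illustrative, and representing the dwellings on the right-hand side as a list in walking order is an assumption made for the example.

```python
def day_code(day_of_month):
    """Sampling interval n: the sum of the digits of the day of the month."""
    if not 1 <= day_of_month <= 31:
        raise ValueError("day_of_month must be between 1 and 31")
    return sum(int(digit) for digit in str(day_of_month))

def select_dwellings(dwellings_on_right, day_of_month):
    """Every nth dwelling along the walk, where n is the day code.

    dwellings_on_right is assumed to list dwellings in the order the
    Interviewer passes them on the right-hand side.
    """
    n = day_code(day_of_month)
    return dwellings_on_right[n - 1::n]

print([day_code(d) for d in (5, 14, 23, 1, 10, 29)])   # [5, 5, 5, 1, 1, 11]
print(select_dwellings(list(range(1, 13)), 14))        # [5, 10]
```
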
In well-populated urban and rural areas, with multiple-dwelling units:
If the start point is a block of flats, or if the walk pattern includes a block
of flats, then the Interviewer should start on the top floor and work his/her
way downwards, stopping at every nth flat on the right. In an exception to the
normal walk pattern that applies only to blocks of flats, the Interviewer should
visit only alternate floors of the block.

In sparsely populated rural areas, with small villages or single-dwelling farms:

In such areas, there may be only a few households around a given start point.
We do not wish to over-cluster the sample by conducting too many (e.g. all 8)
interviews in one small village. In these cases, the following guidelines shall
apply: If there are 15 or fewer households within walking distance of the start
point, the field team shall drop only one Interviewer there. If there are 16-30
households within walking distance of the start point, two Interviewers can be
dropped there. (If there are more than 50 households, the whole team can be dropped
off as usual). If only one or two Interviewers can be dropped at the start point,
the rest of the team should drive to the nearest housing settlement within
the same EA and closest to the SSP, where the next one, two or three Interviewers
shall be dropped according to the same rule. And so on.

In sparsely populated rural areas, with commercial farms:

In countries where
commercial farms are large and contain populous settlements of farm workers, effort
should be made to avoid collecting all eight interviews for that EA on one farm.
To do this, the field supervisor should drop two Interviewers at the first farm
(either the first randomly chosen from a comprehensive list of farms within the
EA, or the first nearest the randomly selected start point), and then drop the
remaining two Interviewers at the next farm. Once the first two are finished,
they are moved to another farm for two more interviews, and the same with the
second pair, so that eight interviews are obtained from four separate farms in
each EA. It is important that all selected farms are within the selected EA.
Households should be chosen from lists of households on the farm, or by using
a standard random walk pattern. Remember to include both the farm owner’s
and farm workers’ dwellings on the lists or on the walk pattern. Once the
team’s eight interviews are completed, the field supervisor should move
the team on to the next selected EA and repeat the procedure.

Interviewer’s second interview

In a Team of four, each Interviewer is to obtain
two interviews per EA (4 Interviewers x 2 interviews = 8 interviews, the quota
for the EA). After completing the first interview, he or she should follow the
same procedure as before. He/she continues walking in the same direction and chooses
the nth dwelling on the right (where n = the day code). And so on. If the settlement
comes to an end and there are no more houses, the Interviewer should turn at right
angles to the right and keep walking, again looking for the nth dwelling on the
right. This procedure is repeated until the Interviewer finds an eligible dwelling
containing an eligible household.

STAGE FOUR: Selecting Individual Respondents

Once the household is chosen, the
Interviewer is responsible for randomly selecting the individual respondent
within the household who will be interviewed. To ensure that women are
not underrepresented, the Afrobarometer sets a gender stratum
of an equal number of men and women in the overall sample. To achieve this
balance, the gender of respondents is alternated for each interview. First, the
Interviewer determines from the previous interview whether a man or a woman is
to be interviewed. The Interviewer then lists (in any order) the first
names of all the household members of that gender who are 18 years and
older, even those not presently at home but who will return to the house that
evening. From the list (which is numbered, see p. 2 of the questionnaire), the
interviewer randomly selects the actual person to be interviewed by asking a household
member to choose a numbered card from a blind deck of cards. The interviewer should
interview only the person whose number is selected and no one else in that household.
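
The respondent selection steps above can be sketched as follows. The code alternates gender between interviews, lists household members of the required gender aged 18 or older, and draws one at random, with a pseudo-random choice standing in for the blind deck of numbered cards. The function names, the dictionary layout, and the household members are hypothetical. The case where no member of the required gender is eligible is not covered in this section, so the sketch simply returns None.

```python
import random

def next_gender(previous_gender):
    """Alternate the required gender from one interview to the next."""
    return "female" if previous_gender == "male" else "male"

def select_respondent(household, gender, rng=None):
    """Randomly pick one adult of the required gender from the household."""
    rng = rng or random.Random()
    eligible = [m["name"] for m in household
                if m["gender"] == gender and m["age"] >= 18]
    if not eligible:
        return None            # handling of this case is not specified here
    return rng.choice(eligible)

# Hypothetical household listing.
household = [
    {"name": "Amina", "gender": "female", "age": 34},
    {"name": "Kofi", "gender": "male", "age": 41},
    {"name": "Esi", "gender": "female", "age": 16},
    {"name": "Abena", "gender": "female", "age": 62},
]
required = next_gender("male")                 # previous interview was with a man
print(required, select_respondent(household, required, random.Random(3)))
```
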
If the person selected refuses to be interviewed, the
Interviewer replaces the household by continuing the walking pattern and again
selecting the nth dwelling on the right (where n = the day code).
Note: In the Afrobarometer, we substitute households, not respondents. Under no
circumstances must the interviewer substitute another individual in the same household
for a respondent selected randomly by means of the numbered card method. It is
not acceptable, for example, to substitute a spouse, parent, child, sibling (or
domestic worker or visitor) in the same household for a selected respondent who
happens not to be at home at the time.

If there is no one at
home in the selected household on the first try, the Interviewer should make one
call-back later in the day. Or, if the designated respondent is not at
home, the Interviewer should make an appointment to meet them later in the day.
Again, a call-back will be necessary in order to find the selected respondent
and to conduct the interview. It is also acceptable for the Interviewer to enquire
about the whereabouts of the selected respondent (they may perhaps be at work)
and, if nearby, to walk to that place to conduct the interview. If the
call-back is unsuccessful, say because the respondent has still not returned home
for the appointment, then, and only then, the Interviewer may substitute the household.
If the house is still empty or the selected respondent is not at home at the time
of the call-back, the Interviewer must substitute that household with the very
next household found in the direction of the walk pattern. This slight
change in the walk pattern is necessary under these circumstances since the Interviewer
may already have had a successful call earlier in the day in the household that
is located at the sampling interval.

Reducing Household Substitutions

Round 3 draws on experiences from Round 2. All substitution
figures above 5 percent are considered high in the Afrobarometer surveys. We would
urge NIs to reduce substitutions, whether of Primary Sampling Units (PSUs)
or of households, through better planning. Many household substitutions seem
to occur because of the timing of the interviews. Our data show that most interviews
take place between 8:00 am and 6:00 pm. We can minimize substitutions through the
following means:

a. Plan around the working timetables of rural or urban
communities. This means scheduling interviews to take place perhaps towards the
end of the day in some areas.

b. In urban areas, gender strata are often
difficult to meet because a lot of men are at work, especially when interviews
are conducted during the week. We therefore advise that interviews in urban areas
be spread to include weekends. When planning deployments in urban areas, ensure
that at least one day of interviews falls on a weekend.

c. If a minority language
group is in the sample, NIs need to plan ahead to ensure that field teams have
the right translations of the questionnaire. This means drawing the sample well
before the other fieldwork activities.
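
The 5 percent benchmark for substitutions mentioned at the start of this section can be checked with simple arithmetic during fieldwork. The sketch below is illustrative only: the function name and the example counts are hypothetical, and integer arithmetic is used so that a rate of exactly 5 percent is not flagged as "above 5 percent".

```python
def substitution_is_high(n_substituted, n_total, threshold_percent=5):
    """True when substitutions exceed the threshold share of all units."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100 * n_substituted > threshold_percent * n_total

# Hypothetical counts for a sample of 150 PSUs.
print(substitution_is_high(7, 150))   # False (about 4.7 percent)
print(substitution_is_high(8, 150))   # True  (about 5.3 percent)
```
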