The Design Effect

3. What is the design effect?

The design effect (Deff) is a measure of the statistical inefficiency of a study design. It is calculated relative to the efficiency under simple random sampling (SRS), which is where we sample individuals at random from the target population. SRS is a highly efficient design statistically even though it tends to be a horrible design here from a practical viewpoint! SRS for a nationwide study would mean sampling single individuals from the country as a whole, driving from village to village each time, which would be a logistical nightmare.

Higher values of Deff indicate that there is some statistical inefficiency introduced by our study design, although this should not necessarily be seen as a bad thing. A study design that is 5 times less efficient may be 100 times cheaper to run, meaning we can recruit more people and overall get a higher power. As with all things related to study design, statistical efficiency is just one aspect that must be balanced against other considerations when coming up with an optimal design.

The design effect is mathematically defined as the variance of our estimator of interest under the current study design divided by the variance under SRS. Let c be our estimator of the prevalence under the multi-site design, and let s be our estimator under SRS. Then we can say:

$D_{eff} = \frac{var(\hat{p}_c)}{var(\hat{p}_s)}$

We have mentioned both of these variances in previous sections when talking about CIs. The variance under the multi-site analysis is given by $var(\hat{p}_c) = \frac{s^2}{c}$, and the variance under SRS is given by $var(\hat{p}_s) = \frac{\hat{p}(1 - \hat{p})}{nc}$. For our example data this gives Deff= 5.56. This is a fairly large design effect, meaning there is a large impact of overdispersion between sites that is creating a high degree of statistical inefficiency.

3.1 The effective sample size

Another way to think about this is in terms of the effective sample size (ESS). This is the number of effectively independent samples we have in our study, which will be smaller than the actual sample size whenever there is statistical inefficiency. ESS is simply calculated as the true sample size divided by the design effect: $n_E = \frac{nc}{D_{eff}}$. In our case we find that nE= 250 / 5.56 = 44.93, which we would round down to 44 to be conservative. Some people (myself included) find this to be a very useful way of thinking about statistical efficiency. Even though we went to all the trouble of sampling 250 individuals, the fact that people are so similar within a site means that, effectively, we only sampled 44 distinct individuals in total. This is a very “real world” way of thinking about something that can otherwise feel quite abstract.

Now that we have an estimate of the sample size that takes account of correlations we could use this value within our original CI formula in place of the true sample size. Using nE in place of nc we obtain the interval $\hat{p} \pm z\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_E}}$. At first glance this appears to give us yet another way of constructing CIs, but don’t be fooled. Once you substitute in the formula for nE and trace it back to the variance you’ll find that this simplifies to $\hat{p} \pm z\sqrt{\frac{\hat{p}(1 - \hat{p})}{nc}D_{eff}}$ and eventually to $\hat{p} \pm z\sqrt{\frac{s^2}{c}}$, which was our formula from the second CI method. So, the design effect doesn’t give us a magic way of dealing with the issue of ICC, it just gives us a different way of thinking about it. Rather than taking a whole new statistical approach it’s like we use the original approach but we “punish” the method for the inefficient design by stretching out the CIs to be wider. Either way you choose to look at it you get the same result.

3.2 Relating Deff to the ICC

The design effect can be related to the intra-cluster correlation coefficient (denoted r) through the formula:

Deff = 1 + (n − 1)r

Notice that when r = 1 (perfect correlation within sites) we find that Deff reaches its maximum value of n, and the effective sample size reaches its minimum value of $n_E = \frac{nc}{n} = c$. This should make some intuitive sense - when there is perfect correlation within sites we effectively have one observation per site because all subsequent observations will be dictated by the first.

But this formula also reveals one of the tricky things about Deff; that it only applies for a particular sample size. From the formula above we can see that a survey with twice the sample size (i.e., twice the n) will have roughly twice the Deff. This makes it difficult to compare values between studies. A design effect of 2 can mean a lot or a little, depending on the sample size of the study. On the other hand, the parameter r does not change with sample size, and so is more of an intrinsic property of the population. For this reason, we prefer to talk in terms of the ICC inside the DRpower package.