
7 Confidence Intervals
7.1 Confidence Intervals
The sampling distribution describes how the statistic varies across samples. The confidence interval is a way to turn knowledge about that sampling distribution into a statement about the unknown parameter. A \(Z\%\) confidence interval for the mean means that, if we repeatedly drew samples and constructed an interval from each one, about \(Z\%\) of those intervals would contain the population mean, \(\mu\).
Note that a \(Z\%\) confidence interval does not imply a \(Z\%\) probability that the true parameter lies within a particular calculated interval. The interval you computed either contains the true mean or it does not.
In practice, people often interpret confidence intervals informally as “showing the uncertainty around our estimate”: wider intervals correspond to higher sampling variability and less precise information about \(\mu\). Just as with standard errors, we can estimate confidence intervals using theory-driven or data-driven approaches. We will focus on data-driven approaches first.
Computation.
Consider the sample mean. We simulate the sampling distribution of the sample mean and construct a \(90\%\) confidence interval by taking the \(5^{th}\) and \(95^{th}\) percentiles of the sampling distribution. We then expect that approximately \(90\%\) of confidence intervals constructed this way contain the theoretical population mean.
As an example, take the mean of a uniform random sample with a sample size of \(n=1000\).
Code
# Create 300 samples, each with 1000 random uniform variables
x_samples <- matrix(nrow=300, ncol=1000)
for(i in seq(1,nrow(x_samples))){
    x_samples[i,] <- runif(1000)
}
sample_means <- apply(x_samples, 1, mean) # mean for each sample (row)
# Middle 90%
mq <- quantile(sample_means, probs=c(.05,.95))
paste0('we are 90% confident that the mean is between ',
    round(mq[1],2), ' and ', round(mq[2],2) )
## [1] "we are 90% confident that the mean is between 0.49 and 0.52"
hist(sample_means,
    breaks=seq(.4,.6, by=.001),
    border=NA, freq=F,
    col=rgb(0,0,0,.25), font.main=1,
    main='90% Confidence Interval for the Mean')
abline(v=mq)
The \(5^{th}\) and \(95^{th}\) percentiles are called the “critical values” for the \(90\%\) confidence interval. The \(2.5^{th}\) and \(97.5^{th}\) percentiles are the critical values for the \(95\%\) confidence interval.
This is a repeated-sampling demonstration. In practice we have only one sample, so we estimate the interval by resampling, as described in the next section.
Interval Size.
Confidence intervals shrink with more data, as averaging washes out random fluctuations. Here is the intuition for estimating the weight of an apple:
- With \(n=1\) apple, your estimate depends entirely on that one draw. If it happens to be unusually large or small, your estimate can be far off.
- With \(n=2\) apples, the estimate averages out their idiosyncrasies. An unusually heavy apple can be balanced by a lighter one, lowering how far off you can be. You are less likely to get two extreme values than just one.
- With \(n=100\) apples, individual apples barely move the needle. The average becomes stable.
Code
# Create 300 samples, each of size n
par(mfrow=c(1,3))
for(n in c(25, 100, 250)){
    x_samples <- matrix(nrow=300, ncol=n)
    for(i in seq(1,nrow(x_samples))){
        x_samples[i,] <- runif(n)
    }
    # Compute means for each row (for each sample)
    sample_means <- apply(x_samples, 1, mean)
    # 90% Confidence Interval
    mq <- quantile(sample_means, probs=c(.05,.95))
    # message() is used because paste0() alone does not print inside a loop
    message('n=', n, ': we are 90% confident that the mean is between ',
        round(mq[1],2), ' and ', round(mq[2],2) )
    hist(sample_means,
        breaks=seq(.1,.9, by=.005),
        border=NA, freq=F,
        col=rgb(0,0,0,.25), font.main=1,
        main=paste0('n=',n))
    abline(v=mq)
}
For a fixed sample size \(n\), there is a trade-off between precision (the width of a confidence interval) and accuracy (the probability that a confidence interval contains the theoretical value). A narrower interval is more precise but contains the theoretical value less often.
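To make this trade-off concrete, here is a minimal sketch that fixes \(n=100\) and compares interval widths at several confidence levels, reusing the repeated-sampling setup from above. The seed, the chosen levels, and the names `conf_levels` and `widths` are illustrative assumptions.

```r
# Illustrative sketch: interval width vs. confidence level at a fixed sample size
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 100
# 300 samples of size n; each column of the matrix is one sample
sample_means <- colMeans(matrix(runif(300*n), nrow=n, ncol=300))

conf_levels <- c(.50, .90, .99)
widths <- sapply(conf_levels, function(p){
    a <- (1-p)/2  # probability left in each tail
    mq <- quantile(sample_means, probs=c(a, 1-a))
    unname(mq[2] - mq[1])
})
# Higher confidence levels should give wider intervals
data.frame(conf_levels, widths)
```

Asking for more confidence at the same \(n\) widens the interval; only more data can improve both at once.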
7.2 Resampling Intervals
Often, we have only one sample. In practice, we can use resampling procedures to estimate a confidence interval. E.g., we repeatedly resample data and construct a bootstrap or jackknife sampling distribution. Then we compute the confidence intervals using the upper and lower quantiles of the sampling distribution.
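As a minimal sketch of the bootstrap version, the code below resamples a single dataset with replacement and takes percentiles of the resampled means. The data (`USArrests$Murder`, built into R), the seed, and the number of resamples are assumptions chosen purely for illustration; the object names mirror those used in the code of the next section.

```r
# Illustrative sketch: bootstrap percentile interval from a single sample
set.seed(1)                      # arbitrary seed, only for reproducibility
sample_dat <- USArrests$Murder   # assumed example data (built into R)
sample_mean <- mean(sample_dat)

# Resample with replacement and recompute the mean each time
bootstrap_means <- replicate(2000, {
    dat_b <- sample(sample_dat, replace=TRUE)
    mean(dat_b)
})

# Middle 95% of the bootstrap distribution
boot_ci_percentile <- quantile(bootstrap_means, probs=c(.025,.975))
boot_ci_percentile
```

A jackknife version would replace the resampling step with leave-one-out means.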
7.3 Normal Approximation
Given the sampling distribution is approximately Normal, the usual confidence intervals are symmetric. For the sample mean \(M\), we can then construct the interval \([M - E, M + E]\), where \(E\) is a “margin of error” on either side of \(M\). The margin of error for the Normal distribution is theoretically known, and can be calculated from the standard error and a critical value. Given a standard error \(SE(M)\) and the standard Normal quantile \(q(1-\alpha/2)\), we know from theory that \(E=q(1-\alpha/2)\times SE(M)\).
The critical values, \(M \pm q(1-\alpha/2) SE(M)\), define the Normal confidence interval associated with \(1-\alpha\) coverage. A coverage level of \(1-\alpha\) means that if the same sampling procedure were repeated \(100\) times from the same population, approximately \(100(1-\alpha)\) of the intervals are expected to contain the true population mean. [1] We can also compute from theory that \(\pm 1.96~ SE(M)\) corresponds to the critical values of the Normal distribution for \(95\%\) coverage, where \(SE(M)\) is estimated using the bootstrap distribution or theory (classical SEs: \(\hat{S}/\sqrt{n}\)). The \(95\%\) interval is the most common, and has quantile value \(q=1.96\). The \(90\%\) interval has quantile value \(q=1.65\). The values for \(M\) and \(SE(M)\) are estimated from data.
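The coverage claim can be checked by simulation. The sketch below assumes Uniform(0,1) data with known mean \(0.5\), builds the interval \(M \pm 1.96~SE(M)\) with classical standard errors for many independent samples, and records how often it contains the true mean. The seed, the number of repetitions, and the variable names are illustrative assumptions.

```r
# Illustrative sketch: empirical coverage of M +/- 1.96*SE(M)
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 1000
mu <- 0.5    # true mean of Uniform(0,1)
covered <- replicate(2000, {
    x <- runif(n)
    m <- mean(x)
    se <- sd(x)/sqrt(n)          # classical standard error
    e <- qnorm(.975)*se          # margin of error; qnorm(.975) is about 1.96
    (m - e < mu) & (mu < m + e)  # does this interval contain mu?
})
mean(covered)  # share of intervals containing mu; expected to be near 0.95
```

Each individual interval either contains \(\mu\) or does not; the \(95\%\) describes the procedure across repetitions.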
The main advantages of the Normal approximation are that it
- can be computed formulaically as above and
- works well for estimating extreme probabilities, where resampling methods tend to be worse.
The main disadvantage is that
- the sampling distribution might be far from Normal.
In the example below, the percentile interval and the two Normal intervals (with bootstrap and classical standard errors) are all quite similar, but that need not always be the case.
Code
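# Assumes sample_dat (the observed sample), sample_mean (its mean),
# and bootstrap_means (bootstrap means of sample_dat) already exist
# Bootstrap Distribution with Percentile CI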
hist(bootstrap_means, breaks=25,
main='Percentile vs Normal 95% CIs',
font.main=1, border=NA,
freq=F, ylim=c(0,0.7),
xlab=expression(hat(b)[b]))
boot_ci_percentile <- quantile(bootstrap_means, probs=c(.025,.975))
abline(v=boot_ci_percentile, lty=1)
# Normal Approximation with Bootstrap SEs
x <- seq(5,10,by=0.01)
se_boot <- sd(bootstrap_means)
fx <- dnorm(x,sample_mean,se_boot)
lines(x, fx, col="blue", lty=1)
boot_ci_normal <- qnorm(c(.025,.975), sample_mean, se_boot)
abline(v=boot_ci_normal, col="blue", lty=3)
# Normal Approximation with IID Theory SEs
classic_se <- sd(sample_dat)/sqrt(length(sample_dat))
fx2 <- dnorm(x,sample_mean,classic_se)
lines(x, fx2, col="red", lty=1)
ci_normal <- qnorm(c(.025,.975), sample_mean, classic_se) #sample_mean+c(-1.96, +1.96)*classic_se
abline(v=ci_normal, col="red", lty=3)
7.4 Misc. Topics
One-Sided Intervals.
Above, our confidence intervals were two-sided: they contained the middle \(Z\%\) of the sampling distribution. We can also construct one-sided intervals that extend to infinity in one direction.
A one-sided interval is shifted to one side, containing one tail rather than the middle. For example, an upper-bounded interval uses \((-\infty, q_{0.95}]\), where \(q_{0.95}\) is the \(95^{\text{th}}\) percentile of the bootstrap distribution: we are \(95\%\) confident the true value is at most \(q_{0.95}\). A lower-bounded interval uses \([q_{0.05}, \infty)\), where \(q_{0.05}\) is the \(5^{\text{th}}\) percentile of the bootstrap distribution: we are \(95\%\) confident the true value is at least \(q_{0.05}\).
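A minimal sketch of both one-sided intervals, assuming a uniform sample and a bootstrap distribution of its mean. The data, the seed, and the names `boot_means`, `upper_bounded`, and `lower_bounded` are hypothetical and chosen only for illustration.

```r
# Illustrative sketch: one-sided 95% percentile intervals
set.seed(1)        # arbitrary seed, only for reproducibility
x <- runif(1000)   # assumed example sample
boot_means <- replicate(2000, mean(sample(x, replace=TRUE)))

q05 <- quantile(boot_means, probs=.05)
q95 <- quantile(boot_means, probs=.95)
upper_bounded <- c(-Inf, q95)  # (-Inf, q_0.95]: the mean is at most q95
lower_bounded <- c(q05, Inf)   # [q_0.05, Inf): the mean is at least q05
upper_bounded
lower_bounded
```

Note that each one-sided \(95\%\) bound coincides with one endpoint of the two-sided \(90\%\) interval.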
Prediction Intervals.
Note that \(Z\%\) confidence intervals do not generally cover \(Z\%\) of the data (those types of intervals are covered later). In the examples above, notice the confidence interval for the mean differs from the confidence interval of the median, and so both cannot cover \(90\%\) of the data. The confidence interval for the mean is roughly \([0.48, 0.52]\), which theoretically covers only a \(0.52-0.48=0.04\) proportion of uniform random data, much less than the proportion \(0.9\).
In addition to confidence intervals, we can also compute a prediction interval, which estimates the variability of new data rather than of a statistic. To do so, we compute the lower/upper quantiles of the data.
Code
x <- runif(1000)
# Middle 90% of values
xq0 <- quantile(x, probs=c(.05,.95))
paste0('we are 90% confident that a future data point will be between ',
round(xq0[1],2), ' and ', round(xq0[2],2) )
## [1] "we are 90% confident that a future data point will be between 0.05 and 0.95"
hist(x,
breaks=seq(0,1,by=.01), border=NA,
main='Prediction Interval', font.main=1)
abline(v=xq0)
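To contrast the two kinds of interval, the sketch below draws fresh data and measures the share that falls inside a \(90\%\) prediction interval versus a \(90\%\) Normal confidence interval for the mean. The seed, the size of the new sample, and the helper `inside` are hypothetical choices for illustration.

```r
# Illustrative sketch: share of new data inside each interval
set.seed(1)  # arbitrary seed, only for reproducibility
x <- runif(1000)
# 90% prediction interval: quantiles of the data itself
pred_int <- quantile(x, probs=c(.05,.95))
# 90% Normal confidence interval for the mean (classical SE)
m <- mean(x)
se <- sd(x)/sqrt(length(x))
conf_int <- m + qnorm(c(.05,.95))*se

x_new <- runif(10000)  # fresh draws from the same population
# Hypothetical helper: proportion of values v inside the interval int
inside <- function(v, int) mean(v >= int[1] & v <= int[2])
c(prediction=inside(x_new, pred_int), confidence=inside(x_new, conf_int))
```

The prediction interval should capture roughly \(90\%\) of the new draws, while the confidence interval for the mean captures only a few percent of them.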
7.5 Further Reading
See
[1] Generally, a coverage level of \(1-\alpha\) means \(Prob( M - E < \mu < M + E)=1-\alpha\). Notice that \(Prob( M - E < \mu < M + E) = Prob( - E < \mu - M < + E) = Prob( \mu + E > M > \mu - E)\). So if the interval \([\mu - 10, \mu + 10]\) contains \(95\%\) of all \(M\), then the interval \([M-10, M+10]\) will also contain \(\mu\) in \(95\%\) of the samples because whenever \(M\) is within \(10\) of \(\mu\), the value \(\mu\) is also within \(10\) of \(M\). But for any particular sample, the interval \([\hat{M}-10, \hat{M}+10]\) either does or does not contain \(\mu\). Similarly, if you compute \(\hat{M}=9\) for your particular sample, a coverage level of \(1-\alpha=95\%\) does not mean \(Prob(9 - E < \mu < 9 + E)=95\%\).

