Statistical Inference: Confidence Intervals

Statistical inference is the process of using data from a sample to make estimates or draw conclusions about an entire population. In VCE Specialist Mathematics, this focuses on estimating the population mean ($\mu$) and the population proportion ($p$).

1. Point and Interval Estimates

Point Estimate: A single value used to estimate a population parameter.
- The sample mean ($\bar{x}$) is a point estimate for the population mean ($\mu$).
- The sample proportion ($\hat{p}$) is a point estimate for the population proportion ($p$).
Interval Estimate (Confidence Interval): A range of values, derived from sample statistics, that is likely to contain the value of an unknown population parameter.

KEY TAKEAWAY: A point estimate on its own says nothing about its precision; a confidence interval gives a range of plausible values together with a stated level of confidence.

2. Confidence Intervals for the Mean ($\mu$)

A confidence interval for the population mean is constructed when the population standard deviation ($\sigma$) is known, or when the sample size ($n$) is large enough ($n \ge 30$) to use the sample standard deviation ($s$) as an approximation for $\sigma$.

The General Formula

The $C\%$ confidence interval for $\mu$ is given by:
$$\bar{x} \pm z \cdot \frac{\sigma}{\sqrt{n}}$$

Where:
* $\bar{x}$ is the sample mean.
* $z$ is the critical value for the desired confidence level.
* $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean.
* $M = z \cdot \frac{\sigma}{\sqrt{n}}$ is the margin of error.
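
As a minimal sketch of the formula above, the function below computes the interval from $\bar{x}$, $\sigma$, $n$ and $z$. The function name and the numbers in the example are hypothetical and chosen only for illustration.

```python
from math import sqrt


def mean_confidence_interval(x_bar, sigma, n, z=1.96):
    """Return (lower, upper) for the interval x_bar +/- z * sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)   # standard error of the mean
    margin = z * standard_error        # margin of error M
    return x_bar - margin, x_bar + margin


# Hypothetical data: sample mean 50, known sigma 8, sample size 64.
lower, upper = mean_confidence_interval(50, 8, 64)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (48.04, 51.96)
```

Here the standard error is $8/\sqrt{64} = 1$, so the margin of error equals $z$ itself.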

Common Critical Values ($z$)

The value of $z$ is determined by the standard normal distribution $Z \sim N(0, 1)$.

| Confidence Level | $z$ value (approx) | $z$ value (exact) |
|---|---|---|
| 90% | 1.645 | $\text{invNorm}(0.95, 0, 1)$ |
| 95% | 1.96 | $\text{invNorm}(0.975, 0, 1)$ |
| 99% | 2.576 | $\text{invNorm}(0.995, 0, 1)$ |

Note: For some technology-free questions, VCAA may specify using $z \approx 2$ for a 95% confidence interval.
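
The critical values can also be checked outside a CAS calculator. The sketch below uses Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of invNorm; the helper name `critical_value` is an assumption made for this example.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z such that P(-z < Z < z) = confidence for Z ~ N(0, 1)."""
    # A C% interval leaves (1 - C) / 2 in each tail, so z is the quantile
    # at 1 - (1 - C) / 2 (for example 0.975 when C = 0.95).
    return NormalDist(mu=0, sigma=1).inv_cdf(1 - (1 - confidence) / 2)


for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: z = {critical_value(confidence):.3f}")
```

The printed values should agree with the approximate column of the table to three decimal places.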

EXAM TIP: If a question asks for the “distance between the sample mean and the population mean” at a given confidence level, it is asking for the margin of error ($M$), the largest such distance consistent with the interval.

3. Margin of Error and Interval Width

The width ($W$) of a confidence interval is the distance between the upper and lower bounds.
$$W = \text{Upper Bound} - \text{Lower Bound} = 2 \times \text{Margin of Error}$$
$$W = 2z \frac{\sigma}{\sqrt{n}}$$

Factors Affecting Width

Confidence Level: Increasing the confidence level (e.g., 95% to 99%) increases $z$, which increases the width.
Sample Size ($n$): Increasing the sample size decreases the width. Since $W \propto \frac{1}{\sqrt{n}}$, to halve the width, you must quadruple the sample size.
Standard Deviation ($\sigma$): A larger population standard deviation increases the width.
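
The effect of each factor can be checked numerically. The following is a small sketch with hypothetical values ($\sigma = 10$, $n = 100$ as the baseline); the function name is illustrative.

```python
from math import sqrt


def interval_width(z, sigma, n):
    """Width W = 2 * z * sigma / sqrt(n) of a z-interval for the mean."""
    return 2 * z * sigma / sqrt(n)


w_100 = interval_width(1.96, 10, 100)   # baseline: 95%, n = 100
w_400 = interval_width(1.96, 10, 400)   # quadrupled sample size
w_99 = interval_width(2.576, 10, 100)   # 99% confidence, n = 100

print(f"{w_100:.3f} {w_400:.3f} {w_99:.3f}")  # 3.920 1.960 5.152
```

Quadrupling $n$ halves the width (3.920 to 1.960), while moving from 95% to 99% confidence widens the interval (3.920 to 5.152).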

Determining Required Sample Size

To find the minimum sample size $n$ required to achieve a specific margin of error $M$:
$$n = \left( \frac{z \cdot \sigma}{M} \right)^2$$
Always round $n$ up to the nearest whole number to ensure the margin of error is not exceeded.
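
A minimal sketch of this calculation, including the rounding-up step, is shown below. The function name and the example values ($\sigma = 15$, $M = 2$) are hypothetical.

```python
from math import ceil


def required_sample_size(sigma, margin, z=1.96):
    """Smallest whole n with z * sigma / sqrt(n) <= margin."""
    # Rounding up (not to the nearest integer) keeps the margin of error
    # at or below the target.
    return ceil((z * sigma / margin) ** 2)


# Hypothetical: sigma = 15, target margin of error 2, 95% confidence.
print(required_sample_size(15, 2))  # 217
```

The unrounded value is $(1.96 \times 15 / 2)^2 = 14.7^2 = 216.09$, so $n = 216$ would give a margin slightly above 2 and $n = 217$ is required.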

VCAA FOCUS: Questions often ask how the sample size must change to achieve a certain reduction in width. If the new width is to be $k$ times the original ($0 < k < 1$), with $z$ and $\sigma$ unchanged, the sample size must be multiplied by $\frac{1}{k^2}$. For example, to reduce the width by $\frac{2}{3}$ (to $\frac{1}{3}$ of the original), $k = \frac{1}{3}$, so $n$ must be multiplied by $3^2 = 9$.
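
This scaling rule follows from the width formula. As a short derivation, write $W_1, n_1$ for the original width and sample size and $W_2, n_2$ for the new ones (labels introduced here for illustration), let $k = W_2 / W_1$ be the ratio of new width to old, and assume $z$ and $\sigma$ do not change:

$$\frac{W_2}{W_1} = \frac{2z\sigma/\sqrt{n_2}}{2z\sigma/\sqrt{n_1}} = \sqrt{\frac{n_1}{n_2}} = k \quad\Longrightarrow\quad n_2 = \frac{n_1}{k^2}$$

With $k = \frac{1}{3}$ this gives $n_2 = 9n_1$, and with $k = \frac{1}{2}$ it gives $n_2 = 4n_1$, matching the "quadruple the sample size to halve the width" rule.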

4. Confidence Intervals for Proportions ($p$)

When dealing with categorical data (e.g., “Yes/No” responses), we estimate the population proportion $p$ using the sample proportion $\hat{p} = \frac{x}{n}$.

The General Formula

For a large sample size, the distribution of $\hat{P}$ is approximately normal: $\hat{P} \approx N\left(p, \frac{p(1-p)}{n}\right)$.
The $C\%$ confidence interval for $p$ is:
$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Conditions for Validity

This approximation is generally considered valid if $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$.
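
The sketch below combines the formula with the validity check above. The function name and the survey numbers are hypothetical; the check uses counts directly, since $n\hat{p}$ is the number of successes and $n(1-\hat{p})$ the number of failures.

```python
from math import sqrt


def proportion_confidence_interval(successes, n, z=1.96):
    """Return (lower, upper) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    # Rule of thumb from the notes: n * p_hat >= 10 and n * (1 - p_hat) >= 10.
    if successes < 10 or n - successes < 10:
        raise ValueError("normal approximation may not be valid for this sample")
    p_hat = successes / n
    # The standard error uses the sample estimate p_hat, not the unknown p.
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 120 "Yes" responses out of 400.
lower, upper = proportion_confidence_interval(120, 400)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")  # 95% CI: (0.2551, 0.3449)
```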

COMMON MISTAKE: Using the population proportion $p$ in the standard error formula when constructing a confidence interval. Since $p$ is unknown (that’s why we are making the interval!), we must use the sample estimate $\hat{p}$ to calculate the standard error.

5. Interpretation of Confidence Intervals

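It is a common misconception that a 95% confidence interval has a “95% probability of containing the population mean.” Once an interval has been calculated, the fixed (but unknown) parameter either lies inside it or does not; the 95% describes the method used to construct intervals, not any single interval.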

Correct Interpretation: If we were to take many random samples of the same size and construct a 95% confidence interval from each sample, approximately 95% of those intervals would contain the true population parameter ($\mu$ or $p$).
The Center: The center of the confidence interval is always the sample statistic ($\bar{x}$ or $\hat{p}$), not the population parameter.
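
The repeated-sampling interpretation can be illustrated with a simulation. This is a sketch under stated assumptions: a normal population with hypothetical parameters $\mu = 100$ and $\sigma = 15$, samples of size 25, and a fixed random seed so the run is repeatable.

```python
import random
from math import sqrt
from statistics import fmean


def coverage_rate(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and centre the interval on its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(coverage_rate(mu=100, sigma=15, n=25, trials=2000))
```

The printed proportion should be close to 0.95. Each interval is centred on a different $\bar{x}$, while $\mu$ stays fixed, which is why the 95% refers to the long-run behaviour of the procedure.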

Summary Table: Mean vs. Proportion

| Feature | Population Mean ($\mu$) | Population Proportion ($p$) |
|---|---|---|
| Point Estimate | $\bar{x}$ | $\hat{p} = \frac{x}{n}$ |
| Standard Error | $\frac{\sigma}{\sqrt{n}}$ | $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ |
| Confidence Interval | $\bar{x} \pm z\frac{\sigma}{\sqrt{n}}$ | $\hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ |

STUDY HINT: Practice using your CAS calculator to find confidence intervals quickly. In the Statistics menu, look for One-Sample Z Interval for means and One-Prop Z Interval for proportions. Knowing how to do this manually is essential for “Show that” or technology-free questions.
