Hypothesis testing is a formal statistical process used to make decisions about a population parameter (such as the mean \(\mu\) or proportion \(p\)) based on sample data. It involves weighing evidence to decide whether to reject a null hypothesis in favour of an alternative hypothesis.
Hypothesis testing is analogous to a court trial:
* The Null Hypothesis (\(H_0\)): The “status quo” or the assumption of “no effect.” In a trial, this is the “presumption of innocence.” We assume \(H_0\) is true unless evidence suggests otherwise.
* The Alternative Hypothesis (\(H_1\)): The claim the researcher is seeking evidence for (analogous to the “guilty” verdict).
* The Test Statistic: A single value calculated from sample data (e.g., the sample mean \(\bar{x}\) or sample proportion \(\hat{p}\), often standardised to a \(z\)-score) used to measure how far the sample result deviates from what the null hypothesis predicts.
KEY TAKEAWAY: In VCE Specialist Mathematics, the null hypothesis \(H_0\) always involves an equality (e.g., \(\mu = \mu_0\) or \(p = p_0\)), whereas the alternative hypothesis \(H_1\) involves an inequality (\(<\), \(>\), or \(\neq\)).
Hypotheses must be defined before collecting data. They can be one-tailed (directional) or two-tailed (non-directional).
| Test Type | Null Hypothesis (\(H_0\)) | Alternative Hypothesis (\(H_1\)) |
|---|---|---|
| One-tail (Right) | \(H_0: \mu = \mu_0\) | \(H_1: \mu > \mu_0\) |
| One-tail (Left) | \(H_0: \mu = \mu_0\) | \(H_1: \mu < \mu_0\) |
| Two-tail | \(H_0: \mu = \mu_0\) | \(H_1: \mu \neq \mu_0\) |

For a population proportion \(p\), the hypotheses take the same three forms:

| Test Type | Null Hypothesis (\(H_0\)) | Alternative Hypothesis (\(H_1\)) |
|---|---|---|
| One-tail (Right) | \(H_0: p = p_0\) | \(H_1: p > p_0\) |
| One-tail (Left) | \(H_0: p = p_0\) | \(H_1: p < p_0\) |
| Two-tail | \(H_0: p = p_0\) | \(H_1: p \neq p_0\) |
EXAM TIP: When writing hypotheses, always define the parameter in words. For example: “where \(\mu\) is the mean heart rate of participants in the dark.”
To judge how unusual the sample result would be if \(H_0\) were true, we calculate a z-score, which measures how many standard deviations of the sampling distribution (standard errors) the sample statistic lies from the hypothesised population parameter.
If the population standard deviation \(\sigma\) is known:
\[Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\]
REMEMBER: The denominator represents the standard error of the sampling distribution. For proportions, we use the value of \(p\) from the null hypothesis (\(p_0\)) to calculate the standard error.
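As an illustration of both cases, here is a minimal Python sketch. It assumes the standard one-sample proportion statistic \(Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}\), which is what using \(p_0\) in the standard error amounts to. The function names and the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def z_for_mean(x_bar, mu_0, sigma, n):
    """Test statistic for a mean when the population sd (sigma) is known."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu_0) / standard_error


def z_for_proportion(p_hat, p_0, n):
    """Test statistic for a proportion; the standard error uses p_0, not p_hat."""
    standard_error = sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error


# Hypothetical data, for illustration only.
z_mean = z_for_mean(x_bar=52.0, mu_0=50.0, sigma=8.0, n=64)   # SE = 8/8 = 1
z_prop = z_for_proportion(p_hat=0.56, p_0=0.5, n=100)         # SE = 0.05
print(z_mean, z_prop)
```

With these numbers the mean sits 2 standard errors above \(\mu_0\), and the proportion about 1.2 standard errors above \(p_0\).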
The \(p\)-value is the probability of obtaining a sample statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true.
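Let \(Z_{obs}\) be the test statistic calculated from the sample data. For a right-tailed test the \(p\)-value is \(\Pr(Z \geq Z_{obs})\); for a left-tailed test it is \(\Pr(Z \leq Z_{obs})\); for a two-tailed test it is \(2\Pr(Z \geq |Z_{obs}|)\).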
COMMON MISTAKE: Students often forget to double the \(p\)-value for a two-tailed test. If \(H_1\) uses \(\neq\), you must account for extremes in both directions.
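A minimal sketch of how the \(p\)-value depends on the form of \(H_1\), using Python's standard-library `statistics.NormalDist`. The function name `p_value` and the tail labels `'right'`, `'left'` and `'two'` are hypothetical, introduced only for this example.

```python
from statistics import NormalDist


def p_value(z_obs, tail):
    """p-value of a z test; tail is 'right', 'left' or 'two' (illustrative labels)."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "right":        # H1 uses >
        return 1 - std_normal.cdf(z_obs)
    if tail == "left":         # H1 uses <
        return std_normal.cdf(z_obs)
    if tail == "two":          # H1 uses "not equal": double the area beyond |z_obs|
        return 2 * (1 - std_normal.cdf(abs(z_obs)))
    raise ValueError("tail must be 'right', 'left' or 'two'")


# Hypothetical observed statistic, for illustration only.
print(p_value(2.0, "right"))  # one-tail area
print(p_value(2.0, "two"))    # exactly double the one-tail area
```

Taking the absolute value in the two-tailed branch means the same code works whether the sample statistic falls above or below the hypothesised value.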
The significance level (\(\alpha\)) is a pre-determined threshold used to decide whether the \(p\)-value is small enough to reject \(H_0\). Common levels are \(0.05\) (5%) and \(0.01\) (1%). If the \(p\)-value is less than \(\alpha\), we reject \(H_0\); otherwise we do not reject \(H_0\).
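The comparison between the \(p\)-value and \(\alpha\) can be sketched as a small function. This assumes the common convention that \(H_0\) is rejected only when the \(p\)-value is strictly less than \(\alpha\); the name `decide` is hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision as a short string."""
    # Assumption: reject H0 only when the p-value is strictly below alpha.
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"


# The same hypothetical p-value can lead to different decisions.
print(decide(0.03, alpha=0.05))  # reject H0
print(decide(0.03, alpha=0.01))  # do not reject H0
```

This is why \(\alpha\) has to be fixed before looking at the data: choosing it afterwards would let the decision be steered by the result.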
All else being equal, the \(p\)-value will decrease (making it more likely that \(H_0\) is rejected) if:
* The sample size \(n\) increases.
* The distance between the sample mean \(\bar{x}\) and the hypothesised mean \(\mu_0\) increases (in the direction of \(H_1\) for a one-tailed test).
* The population standard deviation \(\sigma\) (or variance \(\sigma^2\)) decreases.
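These three effects can be checked numerically. The sketch below assumes a right-tailed test for a mean with known \(\sigma\) and changes one quantity at a time from a baseline; all values are hypothetical and `right_tail_p` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_p(x_bar, mu_0, sigma, n):
    """p-value for H1: mu > mu_0 when sigma is known."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)


# Hypothetical baseline: z = (52 - 50) / (8 / 4) = 1
baseline = right_tail_p(x_bar=52.0, mu_0=50.0, sigma=8.0, n=16)

# Change one factor at a time; each change gives z = 2.
larger_n = right_tail_p(x_bar=52.0, mu_0=50.0, sigma=8.0, n=64)
bigger_gap = right_tail_p(x_bar=54.0, mu_0=50.0, sigma=8.0, n=16)
smaller_sigma = right_tail_p(x_bar=52.0, mu_0=50.0, sigma=4.0, n=16)

print(baseline, larger_n, bigger_gap, smaller_sigma)
```

Each change increases \(z\) from 1 to 2, so each of the three \(p\)-values is smaller than the baseline.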
VCAA FOCUS: You must be able to state the conclusion in the context of the original problem. Avoid saying “\(H_0\) is true”; instead, say “There is insufficient evidence at the \(\alpha\) level of significance to suggest that [contextual claim]…”
Errors can occur because we are making a decision about a population based only on a sample.
| Error Type | Definition | Probability |
|---|---|---|
| Type I Error | Rejecting \(H_0\) when \(H_0\) is actually true. | \(\alpha\) (Significance level) |
| Type II Error | Failing to reject \(H_0\) when \(H_0\) is actually false. | \(\beta\) |
APPLICATION: In medical testing, a Type II error might mean failing to detect a disease in a sick patient, while a Type I error might mean telling a healthy patient they are sick. The choice of \(\alpha\) often depends on which error is more dangerous.
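The trade-off between the two error types can be illustrated numerically. The sketch below goes slightly beyond the definitions above: it assumes a right-tailed z test with known \(\sigma\) and one specific true mean `mu_true`, so that \(\beta\) can be computed directly. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_probability(mu_0, mu_true, sigma, n, alpha=0.05):
    """Beta for a right-tailed z test (H1: mu > mu_0) when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)          # reject H0 when Z > z_crit
    shift = (mu_true - mu_0) / (sigma / sqrt(n))    # centre of Z under mu_true
    return std_normal.cdf(z_crit - shift)           # Pr(Z <= z_crit | mu_true)


# Hypothetical scenario: H0 says mu = 50, but the true mean is 52.
beta_at_5_percent = type_ii_probability(50, 52, sigma=8, n=64, alpha=0.05)
beta_at_1_percent = type_ii_probability(50, 52, sigma=8, n=64, alpha=0.01)
print(beta_at_5_percent, beta_at_1_percent)
```

In this scenario, lowering \(\alpha\) from 0.05 to 0.01 makes a Type I error less likely but raises \(\beta\), which is the tension the choice of \(\alpha\) has to balance.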