Once you have the regression equation \(\hat{y} = a + bx\), substitute a value of \(x\) to predict \(y\):
Example: Suppose the regression equation is \(\widehat{\text{score}} = 31.5 + 9.2 \times \text{hours}\).
To predict the score for 5 hours of study, substitute \(x = 5\): \(\hat{y} = 31.5 + 9.2(5) = 77.5\) marks.
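The substitution above can also be checked with a few lines of Python. This is only a minimal sketch: the function name `predict` and its parameter names are illustrative choices, not part of any course material.

```python
def predict(a: float, b: float, x: float) -> float:
    """Return the predicted value y-hat = a + b*x for a fitted line.

    a is the intercept, b is the slope, x is the explanatory value.
    (Hypothetical helper written for this example.)
    """
    return a + b * x


# Coefficients from the example above: intercept 31.5, slope 9.2 marks per hour.
score = predict(31.5, 9.2, 5)
print(f"Predicted score for 5 hours: {score:.1f} marks")  # 77.5 marks
```

The result is formatted to one decimal place because floating-point arithmetic can leave a tiny rounding error in the raw value.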
| | Interpolation | Extrapolation |
|---|---|---|
| Definition | Predicting within the range of the observed data | Predicting outside the range of the observed data |
| Reliability | Generally reliable | Potentially unreliable |
| Example | Data ranges 1–8 hours; predict for 4 hours | Data ranges 1–8 hours; predict for 15 hours |
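The distinction in the table comes down to one comparison: is the \(x\)-value inside the observed range or not? A rough sketch follows. The function name `classify_prediction` is made up for illustration, and treating the endpoints of the range as "inside" is an assumption of this sketch.

```python
def classify_prediction(x: float, x_min: float, x_max: float) -> str:
    """Label a prediction by where x sits relative to the observed data range.

    Assumption: values equal to x_min or x_max count as inside the range.
    """
    if x_min <= x <= x_max:
        return "interpolation"
    return "extrapolation"


# Data ranges from 1 to 8 hours, as in the table's example row.
print(classify_prediction(4, 1, 8))   # interpolation
print(classify_prediction(15, 1, 8))  # extrapolation
```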
The linear relationship observed within the data range may not continue beyond it. The true relationship may:
- Level off (reach a maximum/minimum)
- Change direction
- Follow a curve
Example: Predicting the exam score for 20 hours of study using \(\hat{y} = 31.5 + 9.2x\) gives \(\hat{y} = 31.5 + 9.2(20) = 215.5\) marks, which is clearly impossible for a test out of 100. The data cannot tell us what happens beyond the observed range, and this is the danger of extrapolation.
Context: A study finds \(\widehat{\text{fuel used}} = 2.3 + 0.08 \times \text{distance}\), with fuel used in litres and distance in km, based on data for distances of 10–200 km.
| Prediction | Type | Reliable? |
|---|---|---|
| Distance = 50 km → 6.3 L | Interpolation | Yes |
| Distance = 150 km → 14.3 L | Interpolation | Yes |
| Distance = 500 km → 42.3 L | Extrapolation | Questionable |
| Distance = 5 km → 2.7 L | Extrapolation | Questionable |
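The rows of this table can be reproduced with a short script that makes the prediction and labels it in one step. This is a sketch under stated assumptions: the names `fuel_prediction`, `A`, `B`, `X_MIN` and `X_MAX` are illustrative, and the endpoints 10 km and 200 km are treated as inside the data range.

```python
A, B = 2.3, 0.08        # intercept (litres) and slope (litres per km) from the study
X_MIN, X_MAX = 10, 200  # observed distance range in km


def fuel_prediction(distance_km: float) -> tuple[float, str]:
    """Return (predicted fuel in litres, prediction type) for a given distance.

    Hypothetical helper: the prediction is rounded to one decimal place,
    and endpoints of the data range are assumed to count as interpolation.
    """
    fuel = round(A + B * distance_km, 1)
    inside = X_MIN <= distance_km <= X_MAX
    kind = "interpolation" if inside else "extrapolation"
    return fuel, kind


for d in (50, 150, 500, 5):
    fuel, kind = fuel_prediction(d)
    print(f"{d} km -> {fuel} L ({kind})")
```

Note that 5 km is extrapolation even though it is a small, plausible distance: what matters is whether the value lies inside 10–200 km, not whether it looks reasonable.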
KEY TAKEAWAY: Interpolation (within the data range) is generally reliable. Extrapolation (outside the data range) may be unreliable, so always identify which type of prediction you are making.
EXAM TIP: VCAA often gives a prediction scenario and asks you to identify it as interpolation or extrapolation and to comment on its reliability. Always state the data range and whether the \(x\)-value falls inside or outside it.
VCAA FOCUS: The word “limitations” in the key knowledge (KK) specifically invites discussion of extrapolation, weak association, non-linearity, and causation. Cover every limitation that is relevant to the question in your answer.