14  Inference for Means

14.1 Conditions for Inference

1. Random

The data must come from a random sample, or random assignment in an experiment.

2. Large

Check the following to determine whether the sample is large enough for the sampling distribution to be approximately normal.

In the case of sample means, the sampling distribution is approximately normal if at least one of the following conditions is met:

  • Normal/Large Condition: the population distribution is stated to be Normal, or the sample size is large (\(n \geq 30\)).

  • When \(n<30\), check that a graph of the sample data shows no strong skew or outliers.

3. Sampling Independence

Observations in our sample must be independent of each other.

In random samples from a population, observations are never strictly independent, because we sample without replacement: the population changes with every person we sample and remove from it. However, this effect is small enough to ignore as long as the population from which we’re sampling is at least 10 times as large as our sample (the 10% condition). This needs to be stated!
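The 10% condition above amounts to a single comparison. As a minimal sketch (the function name here is ours, not standard terminology):

```python
def ten_percent_condition(n, population_size):
    """Independence is reasonable to assume when the population
    is at least 10 times as large as the sample."""
    return population_size >= 10 * n

# e.g. a sample of 50 students from a school of 1200
print(ten_percent_condition(50, 1200))   # population is >= 10 * 50, condition met
print(ten_percent_condition(50, 400))    # population too small, condition fails
```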

14.2 One-Sample \(t\)-procedures

14.2.1 One-Sample \(t\)-interval

One-Sample \(t\)-interval

A \(C\)% confidence interval for the unknown population mean \(\mu\) when all conditions are met is calculated with the following:

\[\bar x \pm t^\ast \frac{s_x}{\sqrt{n}}\]

where \(t^\ast\) is the critical value for the Student’s \(t\) distribution corresponding with degrees of freedom \(df = n - 1\) with \(C\)% of its area between \(-t^\ast\) and \(t^\ast\).

See Appendix D for examples on how to calculate \(t^\ast\) for a given confidence level \(C\)% and degrees of freedom \(df\).
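As a sketch of how \(t^\ast\) and the interval could be computed from scratch (assuming no statistics library is available; the Simpson-rule CDF and bisection used here are just one of many ways to invert the \(t\) distribution):

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    # P(T <= x): symmetry about 0 plus Simpson's rule on [0, |x|]
    if x < 0:
        return 1 - t_cdf(-x, df, steps)
    h = x / steps
    s = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + s * h / 3

def t_star(conf, df):
    # critical value with conf% of the area between -t* and t*,
    # found by bisection on the CDF
    target = 1 - (1 - conf / 100) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(xbar, s, n, conf):
    # one-sample t-interval: xbar +/- t* * s / sqrt(n)
    tc = t_star(conf, n - 1)
    me = tc * s / math.sqrt(n)
    return xbar - me, xbar + me
```

For example, `t_star(95, 9)` reproduces the familiar table value near 2.262, and `t_interval(25.0, 4.0, 10, 95)` gives the corresponding 95% interval for \(\bar x = 25\), \(s_x = 4\), \(n = 10\).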

Why?

All confidence intervals are calculated by: \(\text{point estimate } \pm \text{ margin of error}\).

The point estimate for the true population mean \(\mu\) is \(\bar x\).

The margin of error is always calculated as \((\text{critical value})(\text{standard error})\).

We use \(t^\ast\) critical values for \(t\)-intervals (which is why they are called \(t\)-intervals).

The standard error is calculated as \(\frac{s_x}{\sqrt{n}}\).
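Putting these pieces together with hypothetical sample numbers (\(\bar x = 4.8\), \(s_x = 1.2\), \(n = 16\), and the table value \(t^\ast = 2.131\) for 95% confidence with \(df = 15\)):

```python
import math

# hypothetical sample summary, for illustration only
xbar, s, n = 4.8, 1.2, 16
t_star = 2.131                  # t* for 95% confidence, df = n - 1 = 15

se = s / math.sqrt(n)           # standard error: 1.2 / 4 = 0.3
me = t_star * se                # margin of error: critical value * standard error
interval = (xbar - me, xbar + me)

print(interval)                 # point estimate +/- margin of error
```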

14.3 The Student’s \(t\) distribution

The Student’s \(t\) distribution

Draw an SRS of size \(n\) from a large population that has a Normal distribution with mean \(\mu\) and standard deviation \(\sigma\). The statistic \[t=\frac{\bar x - \mu}{s_x / \sqrt{n}}\] has a \(t\)-distribution with degrees of freedom \(df = n - 1\), denoted as \(t_{n-1}\). When the population distribution isn’t Normal, this statistic will be approximately \(t_{n-1}\) if the sample size is large enough.
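The statistic itself is a one-line computation; a minimal sketch (the function name is ours):

```python
import math

def t_statistic(xbar, mu, s, n):
    """Standardized distance of the sample mean from mu,
    measured in standard-error units."""
    return (xbar - mu) / (s / math.sqrt(n))

# e.g. xbar = 52, hypothesized mu = 50, s = 5, n = 25
# standard error = 5 / sqrt(25) = 1, so t = 2.0 with df = 24
print(t_statistic(52, 50, 5, 25))   # 2.0
```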

As the \(df\) increases, the \(t\)-distribution approaches \(N(0,1)\) (standard normal) (See Figure @ref(fig:t-dist-to-norm)). This happens because \(s_x\) estimates \(\sigma\) more accurately as \(n\) increases. So using \(s_x\) in place of \(\sigma\) causes little extra variation when the sample is large enough.
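One way to see this convergence is to compare the \(t\) density to the standard normal density out in the tails, say at \(x = 3\) (a small illustration built directly from the density formulas):

```python
import math

def t_pdf(x, df):
    # Student's t density with df degrees of freedom
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    # standard normal N(0, 1) density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# tail density at x = 3: heavier for small df, approaching the normal as df grows
for df in (2, 10, 30, 100):
    print(df, t_pdf(3, df))
print("normal", normal_pdf(3))
```

For every finite \(df\) the \(t\) tail density at \(x = 3\) sits above the normal's, and it shrinks toward the normal value as \(df\) increases.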

14.3.1 Why a \(t\)-distribution?

When we’re conducting inference for a population proportion, there’s only one parameter (\(p\)) that we don’t know. The sampling distributions in these cases follow a Normal curve very well (as long as conditions are met), allowing us to use \(z\)-procedures. However, when we’re conducting inference for a population mean, there is additional uncertainty created by the fact that there are two parameters we don’t know: the population mean \(\mu\) and the population standard deviation \(\sigma\). Since we must now estimate \(\sigma\) with \(s_x\), we get a different sampling distribution that is not quite normal, especially at small \(n\).

As a result, we use what’s called the Student’s \(t\) distribution (developed by William Sealy Gosset, a brewer at Guinness), which is a slightly more conservative version of a normal distribution. The \(t\) distribution is still symmetric with a single peak at 0, but with more area in the tails. The statistic \(t\) has the same interpretation as any standardized \(z\) statistic: it tells how many standard errors \(\bar x\) is from the distribution’s mean \(\mu\).