13  Inference for Proportions

13.1 Conditions for Inference

1. Random

The data must come from a random sample or from random assignment in an experiment.

2. Large

Check that the sample is large enough for the sampling distribution of the sample proportion to be approximately normal.

For proportions, remember, the population itself is never anywhere near normal (it’s always two bars, yes and no). So instead we check the Large Counts Condition: the sample must contain enough successes and enough failures (the usual rule is at least 10 of each).

This ensures that our sample proportion can take on enough different values (and make enough bars in a histogram) to create an approximately normal sampling distribution.
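As a rough illustration, the Large Counts Condition can be checked in a few lines of Python. This is only a sketch: the function name `large_counts_ok` is hypothetical, and the threshold of 10 is the common textbook convention, assumed here rather than taken from this section.

```python
def large_counts_ok(successes: int, n: int, threshold: int = 10) -> bool:
    """Return True when both successes and failures reach the threshold.

    The threshold of 10 is an assumed convention, not a value stated in the text.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold


print(large_counts_ok(60, 100))  # 60 successes, 40 failures -> True
print(large_counts_ok(4, 50))    # only 4 successes -> False
```

Note that the number of successes equals \(n\hat p\) and the number of failures equals \(n(1 - \hat p)\), so counting them directly is the same as checking those two products.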

3. Sampling Independence

Observations in our sample must be independent of each other.

When we sample without replacement from a population, observations are never truly independent, because the population changes with every person we sample and remove from it. However, this effect is small enough to ignore as long as the population from which we’re sampling is at least 10 times as large as our sample (often called the 10% condition). This check needs to be stated explicitly!
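The "at least 10 times as large" rule is simple arithmetic, sketched below. The function name `ten_percent_ok` and the example numbers are made up for illustration.

```python
def ten_percent_ok(sample_size: int, population_size: int) -> bool:
    """Return True when the population is at least 10 times the sample size."""
    return population_size >= 10 * sample_size


# Hypothetical example: a sample of 100 students from a school of 1,500.
print(ten_percent_ok(100, 1500))  # 1500 >= 1000 -> True

# Hypothetical example: a sample of 100 students from a school of 800.
print(ten_percent_ok(100, 800))   # 800 < 1000 -> False
```

In practice the exact population size is often unknown, so it is usually enough to argue that the population is clearly larger than 10 times the sample size.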

13.2 One-Sample \(z\)-procedures

13.2.1 One-Sample \(z\)-interval

One-Sample \(z\)-interval

A \(C\)% confidence interval for the unknown population proportion \(p\) when all conditions are met is calculated with the following:

\[\hat p \pm z^\ast \sqrt{\frac{\hat p (1 - \hat p)}{n}}\]

where \(z^\ast\) is the critical value for the standard Normal curve with \(C\)% of its area between \(-z^\ast\) and \(z^\ast\).

See Appendix C for examples of how to calculate \(z^\ast\) for a given confidence level \(C\)%.
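One possible way to carry out this calculation is sketched below in Python, using only the standard library. It is not necessarily the method used in Appendix C. The function names `z_star` and `one_sample_z_interval` and the example counts are hypothetical, and the sketch assumes all three conditions have already been checked.

```python
from math import sqrt
from statistics import NormalDist


def z_star(confidence: float) -> float:
    """Critical value with `confidence` of the standard Normal area between -z* and z*."""
    tail = (1 - confidence) / 2          # area in each tail
    return NormalDist().inv_cdf(1 - tail)


def one_sample_z_interval(successes: int, n: int, confidence: float = 0.95):
    """Return (lower, upper) for p-hat +/- z* * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star(confidence) * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


print(round(z_star(0.95), 2))  # 1.96

# Hypothetical data: 60 successes in a sample of 100.
lower, upper = one_sample_z_interval(60, 100, confidence=0.95)
print(round(lower, 3), round(upper, 3))  # 0.504 0.696
```

The confidence level is passed as a proportion (0.95 rather than 95), which is a design choice of this sketch.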

Why?

All confidence intervals are calculated by: \(\text{point estimate } \pm \text{ margin of error}\)

The point estimate for the true population proportion is the sample proportion \(\hat p\).

The margin of error is always calculated as \((\text{critical value})(\text{standard error})\)

We use \(z^\ast\) critical values for \(z\)-intervals (which is why they are called \(z\)-intervals).

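The standard error estimates the standard deviation of the sampling distribution of \(\hat p\), using \(\hat p\) in place of the unknown \(p\). It is calculated as \(\sqrt{\frac{\hat p (1 - \hat p)}{n}}\).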
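To see the pieces fit together, here is a short worked example with made-up numbers: suppose \(n = 100\), \(\hat p = 0.60\), and we want a 95% confidence interval, for which \(z^\ast \approx 1.96\).

\[\text{standard error} = \sqrt{\frac{0.60(1 - 0.60)}{100}} = \sqrt{0.0024} \approx 0.049\]

\[\text{margin of error} = (1.96)(0.049) \approx 0.096\]

\[\hat p \pm \text{margin of error} = 0.60 \pm 0.096 \quad\Rightarrow\quad (0.504,\ 0.696)\]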