[Figure: “The Chi-squared Sampling Distribution”: an interactive plot of the \(\chi^2\) density (x: Chi-squared Statistic, y: Density) for a chosen degrees of freedom. A code sketch of the cell appears under 16.1 below.]
16 \(\chi^2\) Tests
\(\chi^2\) (pronounced “kai” (/ˈkaɪ, ˈxiː/) squared) tests are used in situations where we have 2 or more proportions and we want to see whether the observed distribution of those proportions is something that could plausibly occur by chance. We cover three types.
16.1 Sampling distribution of \(\chi^2\)
The sampling distribution of \(\chi^2\) is not a Normal distribution. It is a right-skewed distribution that takes only positive values because \(\chi^2\) can never be negative. When the expected counts are all at least 5, the sampling distribution of the \(\chi^2\) statistic is close to a \(\chi^2\) distribution with degrees of freedom determined by the test we perform. The \(\chi^2\) distributions are a family of distributions that take only positive values and are skewed to the right; a particular \(\chi^2\) distribution is specified by giving its degrees of freedom.
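The interactive cell that produced the figure at the top of this chapter can be sketched as follows, assuming Observable Plot (`Plot`), `d3`, and `jStat` are loaded; the degrees of freedom is fixed at 3 here, where the original page used a reactive `df1` input:

```js
// Density curve of a chi-squared distribution. Assumes Plot (Observable
// Plot), d3, and jStat are in scope; df is fixed here, where the page
// used a reactive df1 slider.
const df = 3;

Plot.plot({
  marks: [
    Plot.line(d3.range(0, 10, 0.01), {
      x: (x) => x,                             // chi-squared statistic
      y: (x) => jStat.chisquare.pdf(x, df),    // density at that value
      strokeWidth: 3,
      stroke: "steelblue"
    }),
    Plot.ruleX([0]),
    Plot.ruleY([0])
  ],
  x: { domain: [0, 10], label: "Chi-squared Statistic" },
  y: { domain: [0, 0.3], label: "Density" },
  caption: "The Chi-squared Sampling Distribution"
})
```

Note how the curve is right-skewed and sits entirely over positive values, as described above.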
16.2 Conditions
Random: The data comes from a well-designed random sample or from a randomized experiment.
10%: When sampling without replacement, check that \(n \leq 0.10N\)
Large Counts: All expected counts are at least 5
Be careful and remember that:
The chi-square test statistic compares observed and expected counts. Don’t try to perform calculations with the observed and expected proportions in each category.
When checking the Large Sample Size condition, be sure to examine the expected counts, not the observed counts.
16.3 \(\chi^2\) Goodness-of-Fit
We do this test when we wonder, “Does this data fit with what they are saying?” So how “good” does the distribution of data that we see “fit” with the distribution that they claim?
Hypotheses:
\(H_0\): The stated distribution of the categorical variable in the population of interest is correct. \[H_0:p_1=p_{0_1},p_2=p_{0_2}, \cdots ,p_c=p_{0_c} \] where there are \(c\) categories in the categorical variable
\(H_a\): The stated distribution of the categorical variable in the population of interest is not correct. \[H_a: \text{at least one of the } p_i \text{ is not equal to } p_{0_i}\]
Expected Counts:
The expected count for each category (\(E_i\)) is the sample size (\(n\)) times the stated probability of the category \(p_{0_i}\). \[E_i=np_{0_i}\]
Degrees of freedom:
We calculate the degrees of freedom by taking the # of categories - 1 \[df=c-1\]
Chi-Square Test Statistic: \[\chi^2=\sum\frac{(Observed-Expected)^2}{Expected} = \sum\frac{(x_i-E_i)^2}{E_i}\]
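As a minimal sketch of the whole computation in the same JavaScript/jStat setting as the plot cell above (`chisqGOF` is an illustrative name, not a library function; jStat is assumed available, e.g. via `const { jStat } = require("jstat")`):

```js
// Chi-squared goodness-of-fit: observed counts and null probabilities in;
// test statistic, df, and p-value out.
function chisqGOF(observed, nullProbs) {
  const n = observed.reduce((a, b) => a + b, 0);      // sample size
  const expected = nullProbs.map((p) => n * p);       // E_i = n * p_0i
  const chi2 = observed.reduce(
    (sum, o, i) => sum + (o - expected[i]) ** 2 / expected[i], 0);
  const df = observed.length - 1;                     // df = c - 1
  const pValue = 1 - jStat.chisquare.cdf(chi2, df);   // right-tail area
  return { chi2, df, pValue };
}
```

The p-value is the right-tail area `1 - jStat.chisquare.cdf(chi2, df)`, which is what the Table C and \(\chi^2\texttt{cdf}\) lookups in the example below approximate.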
16.3.1 Example
Carrie made a 6-sided die in her ceramics class and rolled it 90 times to test if each side was equally likely to show up. The table summarizes the outcomes of her 90 rolls.
Outcome of roll | 1 | 2 | 3 | 4 | 5 | 6 | Total |
---|---|---|---|---|---|---|---|
Frequency | 12 | 28 | 12 | 13 | 10 | 15 | 90 |
State
\(H_0:\) The ceramic die’s sides are all equally likely to occur
\(H_a:\) The ceramic die’s sides are not all equally likely to occur
Alternatively, \[H_0: p_1 = p_2 = \cdots = p_6 = \frac{1}{6}\] \[H_a: \text{at least one } p_i \neq \frac{1}{6}\]
Plan
If the conditions are met, I will conduct a \(\chi^2\) Goodness-of-Fit Test.
Outcome of roll | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Theoretical probability | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
Expected count | 15 | 15 | 15 | 15 | 15 | 15 |

Random: We can assume that Carrie’s die rolls were random attempts.
Large: Each of the calculated expected counts is at least 5.
Independent: One roll of the die does not affect any other roll, so we can treat the rolls as independent.
Do
Outcome of roll | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Observed | 12 | 28 | 12 | 13 | 10 | 15 |
Theoretical probability | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
Expected | 15 | 15 | 15 | 15 | 15 | 15 |
\(\frac{(Observed - Expected) ^ 2}{Expected}\) | \(\frac{(12 - 15) ^ 2}{15}\) | \(\frac{(28 - 15) ^ 2}{15}\) | \(\frac{(12 - 15) ^ 2}{15}\) | \(\frac{(13 - 15) ^ 2}{15}\) | \(\frac{(10 - 15) ^ 2}{15}\) | \(\frac{(15 - 15) ^ 2}{15}\) |

\[ \begin{aligned} \chi^2&=\sum\frac{(x_i - E_i)^2}{E_i} \\ &= \frac{(12 - 15) ^ 2}{15} + \frac{(28 - 15) ^ 2}{15} + \frac{(12 - 15) ^ 2}{15} + \frac{(13 - 15) ^ 2}{15} + \frac{(10 - 15) ^ 2}{15} + \frac{(15 - 15) ^ 2}{15} \\ &= 0.6000 + 11.2667 + 0.6000 + 0.2667 + 1.6667 + 0.0000 \\ &= 14.4 \end{aligned} \]

\[df = c - 1 = 6 - 1 = 5\]

Looking up the test statistic and degrees of freedom on Table C, notice that 14.4 is between 13.39 and 15.09; therefore, \(0.01 < \text{p-value} < 0.02\).
Alternatively, do \(\chi^2\mathtt{cdf}(14.4, 1000, 5) = 0.0132586\).
Conclude
Since our \(\text{p-value} = 0.01326 < 0.05 = \alpha\), we reject the null hypothesis that the ceramic die’s sides are equally likely to show up. Therefore, there is convincing evidence that the ceramic die is not fair.
16.3.2 Doing it on a calculator
For context, if you have two vectors \(\vec x\) and \(\vec y\), you can find the difference between them by doing \(\vec x - \vec y\), which takes the element-wise difference of the two vectors (subtract the first entry from the first entry of the other, the second from the second, …). For example,
\[ \begin{aligned} &\text {Let } \vec x = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \text{ and } \vec y = \begin{bmatrix} 7 \\ 6 \\ 5 \end{bmatrix}\\ & \text{Then }\vec x - \vec y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 7 \\ 6 \\ 5 \end{bmatrix} = \begin{bmatrix} -6 \\ -4 \\ -2 \end{bmatrix} \end{aligned} \]
The calculator lists below work the same way: every operation is applied element-wise, so whatever we do to a list happens to each entry in the list.
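The same element-wise idea, as a sketch in JavaScript with plain arrays standing in for the vectors:

```js
// Element-wise difference of two same-length arrays
const x = [1, 2, 3];
const y = [7, 6, 5];
const diff = x.map((xi, i) => xi - y[i]);  // [-6, -4, -2]
```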
- Get ready to edit your lists by going to `stat > EDIT > edit`
  - You’ll see all your available lists, `L1`, `L2`, …
  - To clear a given list, go up to the header of the list (e.g. `L1` should be highlighted if you want to clear it), then hit `clear` and `enter`
- Enter the list of your observed values for the problem.
  - Let’s say this is `L1`
- Enter your theoretical probabilities from the null hypothesis. Make sure they correspond to the observed elements one-by-one; they should be in the same order as when you entered the observed values.
  - Let’s say this is `L2`
- Define your next list as the expected counts (go up to the header of the list that you want to fill in and make sure it’s highlighted), then enter your sample size times the theoretical probabilities.
  - Let’s say this is `L3`, so make sure that `L3` is highlighted and then enter `your sample size * L2`
- Now that you have your observed values and expected values, we can calculate the components of the \(\chi^2\) test statistic. In other words, calculate \[\frac{(Observed-Expected)^2}{Expected}\] for each observed and expected pair.
  - Let’s say you want this in `L4`, so make sure that `L4` is highlighted and then enter `(L1 - L3)^2/L3`
- We now have all the components, and the final task is to sum up the values (the code sketch after this list mirrors the whole workflow).
  - Three options:
    - `stat > CALC > 1-Var Stats`, and then you should see `1-Var Stats` on the command screen. Enter the list that you want to sum up (which in the example would look like `1-Var Stats L4`).
    - `2nd > stat (list) > MATH > sum(` and then just input your desired list to sum (which in the example would be `sum(L4)`).
    - (On newer calculators ONLY) use \(\chi^2\mathtt{GOF-Test}\)
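For comparison, here is the same list workflow as a JavaScript sketch, with plain arrays standing in for `L1` through `L4` and jStat assumed available for the p-value (Carrie’s die from the example above):

```js
const L1 = [12, 28, 12, 13, 10, 15];                    // observed counts
const L2 = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6];              // null probabilities
const n  = L1.reduce((a, b) => a + b, 0);               // sample size: 90
const L3 = L2.map((p) => n * p);                        // expected counts: all 15
const L4 = L1.map((o, i) => (o - L3[i]) ** 2 / L3[i]);  // components
const chi2 = L4.reduce((a, b) => a + b, 0);             // 14.4
const pValue = 1 - jStat.chisquare.cdf(chi2, L1.length - 1);  // ≈ 0.0133
```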
Here’s a video tutorial using the example from Page 3 in your packets.
16.4 \(\chi^2\) Test of Homogeneity
We do this test when we wonder, “Is the distribution of this group’s data the same as that other group’s distribution of data?” In other words, how similar (“homogeneous”) is one group/treatment compared to another?
Hypotheses:
\(H_0\): The distribution of the variable is the same across all groups/treatments.
\[ \begin{aligned} H_0: &p_{1, 1} = p_{1,2} = \cdots =p_{1, c}, \\ &p_{2, 1} = p_{2,2} = \cdots =p_{2, c},\\ &\cdots,\\ &p_{r, 1} = p_{r,2} = \cdots =p_{r, c} \end{aligned} \]
where there are \(r\) rows (categories in the variable) and \(c\) columns (populations/samples/groups/treatments) in the two-way table. Each \(p_{i, j}\) corresponds to the \(i^{th}\) row and the \(j^{th}\) column.
\(H_a\): The distribution of the variable is not the same across all groups/treatments.
(The symbolic notation would be too messy, so it’s not included)
Expected Counts:
The expected count for each cell (\(E_{i,j}\)) is the total count for the corresponding row (\(\sum_{j = 1}^c x_{i,j}\)) times the total count for the corresponding column (\(\sum_{i = 1}^r x_{i,j}\)) divided by the total count (\(\sum_{j = 1}^c \sum_{i = 1}^r x_{i,j}\)).
\[\text{Expected} = \frac{(\text{row total}) \cdot (\text{column total})}{\text{total}}\]
\[E_{i,j} = \frac{(\sum_{j = 1}^c x_{i,j}) \cdot (\sum_{i = 1}^r x_{i,j})} {\sum_{j = 1}^c \sum_{i = 1}^r x_{i,j}}\]
Degrees of freedom:
We calculate the degrees of freedom by taking the # of rows - 1 times the # of columns - 1. \[df=(r - 1) \cdot (c-1)\]
\(\chi^2\) Test Statistic:
Calculate the \(\chi^2\) test statistic the same as always (don’t mind the complicated-looking formula here).
\[\chi^2=\sum\frac{(Observed-Expected)^2}{Expected} = \sum_{j = 1}^c \sum_{i = 1}^r\frac{(x_{i,j}-E_{i,j})^2}{E_{i,j}}\]
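A sketch of the whole two-way-table computation in the same JavaScript/jStat setting (`chisqTwoWay` is an illustrative name; the identical arithmetic also serves the Test of Independence below):

```js
// Chi-squared test for a two-way table of observed counts.
// rows x cols array in; chi2, df, and p-value out. jStat assumed available.
function chisqTwoWay(observed) {
  const rowTotals = observed.map((row) => row.reduce((a, b) => a + b, 0));
  const colTotals = observed[0].map(
    (_, j) => observed.reduce((sum, row) => sum + row[j], 0));
  const total = rowTotals.reduce((a, b) => a + b, 0);
  let chi2 = 0;
  observed.forEach((row, i) =>
    row.forEach((x, j) => {
      const expected = rowTotals[i] * colTotals[j] / total; // (row)(col)/total
      chi2 += (x - expected) ** 2 / expected;
    }));
  const df = (observed.length - 1) * (observed[0].length - 1);
  return { chi2, df, pValue: 1 - jStat.chisquare.cdf(chi2, df) };
}
```

Running it on the astrology table from the example below, `chisqTwoWay([[169, 256, 114], [65, 65, 18]])`, gives \(\chi^2 \approx 10.58\), \(df = 2\), and a p-value of about 0.005.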
16.4.1 Example
(Modified question; compare to the Test of Independence.)
The General Social Survey (GSS) asked random samples of 234 associate’s degree holders, 321 bachelor’s degree holders, and 132 master’s degree holders for their opinion about whether astrology is very scientific, sort of scientific, or not at all scientific. Here is a two-way table of counts for the people in the sample, by level of higher education:
Opinion about astrology | Associate’s | Bachelor’s | Master’s | Total |
---|---|---|---|---|
Not at all scientific | 169 | 256 | 114 | 539 |
Very or sort of scientific | 65 | 65 | 18 | 148 |
Total | 234 | 321 | 132 | 687 |

(Columns: degree held.)
Do the data provide convincing evidence of a difference in the distribution of astrology opinion among the three groups of degree holders?
State
\(H_0\): The distribution of astrology opinion is the same across the three groups of degree holders.
\(H_a\): The distribution of astrology opinion is not the same across the three groups of degree holders.
Plan
If the conditions are met, I will use a \(\chi^2\) Test of Homogeneity.

Expected counts (columns: degree held):

Opinion about astrology | Associate’s | Bachelor’s | Master’s |
---|---|---|---|
Not at all scientific | 183.59 | 251.847 | 103.563 |
Very or sort of scientific | 50.41 | 69.153 | 28.437 |

Random: Each sample of degree holders was randomly selected.
Large: All of the expected counts are at least 5.
Independence: There are at least \(10n_1 = 2340\) associate’s degree holders, \(10n_2 = 3210\) bachelor’s degree holders, and \(10n_3 = 1320\) master’s degree holders in the respective populations, so the 10% condition is met.
Do
\[\begin{aligned} \chi^2 = &\frac{(169-183.59)^2}{183.59} + \frac{(256-251.847)^2}{251.847} + \frac{(114-103.563)^2}{103.563} +\\ &\frac{(65-50.41)^2}{50.41} + \frac{(65-69.153)^2}{69.153} + \frac{(18-28.437)^2}{28.437} = 10.582\\\\ df =& (2-1)(3-1) = 2\\\\ \text{p-value } =& \chi^2\texttt{cdf(10.582, 1000, 2)} \approx 0.005 \end{aligned}\]
Conclude
Since \(\text{p-value} \approx 0.005 < 0.05 = \alpha\), we reject the null hypothesis. There is convincing evidence that the distribution of astrology opinion is different between degree holders.
16.5 \(\chi^2\) Test of Independence
We do this test when we wonder, “Are these two variables in this set of data independent or not?” In other words, this is a more formal way of doing what we did in Chapter 6, when we compared the probability of a single event with a conditional probability.
Hypotheses:
\(H_0\): There is no association between the two variables; that is, the two variables are independent.
\[ \begin{aligned} H_0: &p_{1, 1} = p_{1,2} = \cdots =p_{1, c}, \\ &p_{2, 1} = p_{2,2} = \cdots =p_{2, c},\\ &\cdots,\\ &p_{r, 1} = p_{r,2} = \cdots =p_{r, c} \end{aligned} \]
where there are \(r\) rows (categories in the first variable) and \(c\) columns (categories in the second variable) in the two-way table. Each \(p_{i, j}\) corresponds to the \(i^{th}\) row and the \(j^{th}\) column.
\(H_a\): There is an association between the two variables; that is, the two variables are not independent.
(The symbolic notation would be too messy, so it’s not included)
Expected Counts:
The expected count for each cell (\(E_{i,j}\)) is the total count for the corresponding row (\(\sum_{j = 1}^c x_{i,j}\)) times the total count for the corresponding column (\(\sum_{i = 1}^r x_{i,j}\)) divided by the total count (\(\sum_{j = 1}^c \sum_{i = 1}^r x_{i,j}\)).
\[\text{Expected} = \frac{(\text{row total}) \cdot (\text{column total})}{\text{total}}\]
\[E_{i,j} = \frac{(\sum_{j = 1}^c x_{i,j}) \cdot (\sum_{i = 1}^r x_{i,j})} {\sum_{j = 1}^c \sum_{i = 1}^r x_{i,j}}\]
Degrees of freedom:
We calculate the degrees of freedom by taking the # of rows - 1 times the # of columns - 1. \[df=(r - 1) \cdot (c-1)\]
Chi-Square Test Statistic: \[\chi^2=\sum\frac{(Observed-Expected)^2}{Expected} = \sum_{j = 1}^c \sum_{i = 1}^r\frac{(x_{i,j}-E_{i,j})^2}{E_{i,j}}\]
16.5.1 Example
(Original question; compare to the Test of Homogeneity.)
The General Social Survey (GSS) asked a random sample of adults for their opinion about whether astrology is very scientific, sort of scientific, or not at all scientific. Here is a two-way table of counts for the people in the sample who held one of three levels of higher education:
Opinion about astrology | Associate’s | Bachelor’s | Master’s | Total |
---|---|---|---|---|
Not at all scientific | 169 | 256 | 114 | 539 |
Very or sort of scientific | 65 | 65 | 18 | 148 |
Total | 234 | 321 | 132 | 687 |

(Columns: degree held.)
Do the data provide convincing evidence of an association between astrology opinion and degree held by an adult?
State
\(H_0\): There is no association between astrology opinion and degree held by an adult.
\(H_a\): There is an association between astrology opinion and degree held by an adult.
Plan
If the conditions are met, I will use a \(\chi^2\) Test of Independence.
Expected counts (columns: degree held):

Opinion about astrology | Associate’s | Bachelor’s | Master’s |
---|---|---|---|
Not at all scientific | 183.59 | 251.847 | 103.563 |
Very or sort of scientific | 50.41 | 69.153 | 28.437 |

Random: The sample of adults was randomly selected.
Large: All of the expected counts are at least 5.
Independence: There are at least \(10n = 6870\) adults in the population, so the 10% condition is met.
Do
\[\begin{aligned} \chi^2 = &\frac{(169-183.59)^2}{183.59} + \frac{(256-251.847)^2}{251.847} + \frac{(114-103.563)^2}{103.563} +\\ &\frac{(65-50.41)^2}{50.41} + \frac{(65-69.153)^2}{69.153} + \frac{(18-28.437)^2}{28.437} = 10.582\\\\ df =& (2-1)(3-1) = 2\\\\ \text{p-value } =& \chi^2\texttt{cdf(10.582, 1000, 2)} \approx 0.005 \end{aligned}\]
Conclude
Since \(\text{p-value} \approx 0.005 < 0.05 = \alpha\), we reject the null hypothesis. There is convincing evidence that there is an association between an adult’s opinion about astrology and the degree that they hold.
16.6 Calculator for Homogeneity/Independence
There’s no difference in how \(\chi^2\), \(df\), and the p-value are calculated between the \(\chi^2\) Test of Homogeneity and the \(\chi^2\) Test of Independence.
- Enter the observed data from the two-way table in matrix `[A]`
  - Access this by `2nd` > \(\texttt{x}^{\texttt{-1}}\) (`matrix`) > `EDIT`
  - Make sure to edit the dimensions of the matrix to match the number of rows `x` the number of columns
- Do a \(\chi^2\texttt{-Test}\)
  - Access this by `stat` > `TESTS`
  - Leave Observed as `[A]` and Expected as `[B]`
  - Just hit Calculate
  - You will see the \(\chi^2\) test statistic value, p-value, and df
- Matrix `[B]` is now populated with the expected counts.
  - View matrix `[B]` the same way that you edited matrix `[A]`
16.7 Follow-up (Component) Analysis
Given our \(\chi^2\) statistic, \[\chi^2=\sum\frac{(Observed-Expected)^2}{Expected}\]
Each individual term \(\frac{(Observed-Expected)^2}{Expected}\) in the sum is called a component of the \(\chi^2\) statistic. The magnitude of a component measures how far a given observed value is from its expected value.
When doing a follow-up analysis of a \(\chi^2\) test, look at the components of the calculated \(\chi^2\) statistic. The categories with the largest components contribute the most to the test statistic. Then explain in which direction those categories pull the observed distribution, i.e., whether each such observed count is higher or lower than expected.
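A sketch of this follow-up step in the same JavaScript setting, using the die data from the goodness-of-fit example (revisited just below):

```js
// Components of the chi-squared statistic, and the category that
// contributes the most to it
const observed = [12, 28, 12, 13, 10, 15];
const expected = [15, 15, 15, 15, 15, 15];
const components = observed.map(
  (o, i) => (o - expected[i]) ** 2 / expected[i]);  // [0.6, 11.267, ...]
const big = components.indexOf(Math.max(...components));  // 1, i.e. the "2" side
const direction = observed[big] > expected[big] ? "more often" : "less often";
// The "2" side shows up more often than expected under a fair die.
```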
16.7.1 Example
(From the \(\chi^2\) GOF Example)
If there is convincing evidence of a difference in the distribution of die sides, perform a follow-up analysis.
A reminder of our results:
Outcome of roll | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Observed | 12 | 28 | 12 | 13 | 10 | 15 |
Expected | 15 | 15 | 15 | 15 | 15 | 15 |
\(\frac{(Observed - Expected) ^ 2}{Expected}\) | 0.6000 | 11.2667 | 0.6000 | 0.2667 | 1.6667 | 0.0000 |
Based on our components, the 2 side of the die contributes the most to our statistic (11.2667). The 2 side is showing up much more often than we would expect if the die were actually fair.