10 Random Variables
A random variable is a variable whose value is a numerical outcome of a random phenomenon.
The probability distribution of a random variable \(X\) gives us all possible values of \(X\), and their corresponding probabilities. Probability distributions are typically given as tables, histograms (with probability on the y-axis, instead of frequency), or density curves (like the Normal curve).
For any given named distribution (e.g., Binomial, Normal, Geometric, Uniform), we will always have a probability density function (pdf) and a cumulative distribution function (cdf).
For a discrete random variable, we use the pdf to find the probability of a single value. For example, if I want to find the probability that a random variable \(X\) is equal to \(x\), then I’d plug the value of \(x\) into the pdf of \(X\) to get \(P(X = x)\).
We use cdfs to find the probability of falling within a certain interval. For example, if I want to find the probability that a random variable \(X\) is less than or equal to \(x\), then I’d plug the value of \(x\) into the cdf of \(X\) to get \(P(X \leq x)\).
10.1 Discrete Random Variables
A discrete random variable describes a process that only has specific, predefined numerical outcomes. For example, we can use a discrete random variable to find the probability of Michael Jordan scoring 50 points in a single game or the probability of rolling a 3 or lower on a six-sided die.
When we have a discrete random variable \(X\) whose probability distribution is
\[ \begin{aligned} \textbf{Value:}& ~~~~ x_1 ~~~~ x_2 ~~~~ x_3 ~~~~ \cdots \\ \textbf{Probability:}& ~~~~ p_1 ~~~~ p_2 ~~~~ p_3 ~~~~\cdots \end{aligned} \]
we know the following about the mean and standard deviation of \(X\):
\[\mu_X = E(X) = \sum x_i P(x_i)\]
\[\sigma_X = \sqrt{\sum \left( x_i - \mu_X \right) ^2 P \left( x_i \right)}\]
Keep in mind that the standard deviation formula here (and elsewhere) is equivalent to the formula for population standard deviation. The only difference is that we weight each squared difference by \(P(x_i)\) instead of dividing the sum by \(n\).
Just as a quick reminder, our formula for sample standard deviation is:
\[s_X = \sqrt{\frac{\sum(x_i - \bar x)^2}{n - 1}}\]
Our formula for population standard deviation (which we won’t use in this class) is:
\[\sigma_X = \sqrt{\frac{\sum(x_i - \mu_X)^2}{n}}\]
Taking this formula, we can notice that:
\[ \begin{aligned} \sigma_X &= \sqrt{\frac{\sum(x_i - \mu_X)^2}{n}} &(1)\\ &=\sqrt{E((X - \mu_X)^2)} &(2)\\ &=\sqrt{\sum \left( x_i - \mu_X \right) ^2 P \left( x_i \right)} &(3)\\ \end{aligned} \]
In other words, remember that we define standard deviation as the square root of the mean of the squared differences from the mean.
We get to step (2) by recognizing that by adding up all the squared differences from the mean and dividing by \(n\), we are taking the mean of the squared differences (this treats each of the \(n\) values as equally likely). Therefore, \(\frac{\sum(x_i - \mu_X)^2}{n} = E((X - \mu_X)^2)\).
And in the case of discrete random variables, we find a mean by weighting each value by its probability, just as in \(\mu_X = E(X) = \sum x_iP(x_i)\). Applying the same weighting to the squared differences \((x_i - \mu_X)^2\) allows us to rewrite step (2) as step (3).
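As a quick numerical check of these formulas, here is a minimal Python sketch. It assumes a fair six-sided die as the random variable (an illustrative choice, not an example from these notes), so every value is equally likely. Under that assumption, the probability-weighted formula and the divide-by-\(n\) population formula should agree.

```python
import math
import statistics

# Hypothetical example: a fair six-sided die, so every value has P(x_i) = 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Mean: sum of x_i * P(x_i)
mu = sum(x * p for x, p in zip(values, probs))

# Standard deviation: square root of the sum of (x_i - mu)^2 * P(x_i)
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

# Population standard deviation of the same six values (divides by n)
sigma_pop = statistics.pstdev(values)

print(mu, sigma, sigma_pop)
```

The two standard deviations match only because the six values are equally likely. With unequal probabilities, the weighted formula is the one to use.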
10.1.1 Binomial Random Variables
Binomial random variables have parameters \(n\) and \(p\), and can be written \(B(n, p)\). Remember, Normal random variables have parameters \(\mu\) and \(\sigma\) and can be written \(N(\mu,\sigma)\).
The pdf of a Binomial Random Variable (i.e. the binomial formula) is:
\[ \begin{aligned} P(X=k) &= {n \choose k} p^k (1-p)^{n-k}\\ \text{where } k &= 0, 1, 2, 3, \cdots, n \end{aligned} \]
To apply this formula in a graphing calculator: \(\texttt{2nd -> vars (distr) -> binompdf}\)
Usage: \(\texttt{binompdf(n, p[,x])}\)
The cdf of a Binomial Random Variable is:
\[ \begin{aligned} P(X\leq k) &= \sum_{i = 0}^k {n \choose i} p^i (1-p)^{n-i} \end{aligned} \]
In graphing calculator: \(\texttt{2nd -> vars (distr) -> binomcdf}\)
Usage: \(\texttt{binomcdf(n, p[,x])}\)
The mean and standard deviation of a binomial random variable are given by:
\[ \begin{aligned} \mu_X &= n p \\ \sigma_X &= \sqrt{n p q} \\ &\text{, where } q = 1-p. \end{aligned} \]
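For readers without a graphing calculator, the binomial formulas can be sketched in plain Python. The function names `binom_pdf` and `binom_cdf` are made up here to mirror the calculator commands, and the setting (\(n = 10\), \(p = 0.3\)) is a hypothetical example. The cdf simply adds up the pdf for \(0\) through \(k\).

```python
import math

def binom_pdf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add up the pdf for 0 through k."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

# Hypothetical setting: 10 trials with success probability 0.3
n, p = 10, 0.3
mu = n * p                           # mean: np
sigma = math.sqrt(n * p * (1 - p))   # standard deviation: sqrt(npq)

print(binom_pdf(n, p, 3))  # P(X = 3)
print(binom_cdf(n, p, 3))  # P(X <= 3)
print(mu, sigma)
```

The argument order `(n, p, k)` follows the calculator usage shown above, so results can be compared directly against `binompdf` and `binomcdf`.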
Identifying Binomial Situations
We can identify a binomial setting when we successfully recognize these four things about a random phenomenon:
Binary?
The possible outcomes of each trial can be classified as “success” or “failure”.
Independent?
Trials must be independent; that is, knowing the result of one trial must not tell us anything about the result of another trial.
Number?
The number of trials \(n\) of the chance process must be fixed in advance.
Same?
There is the same probability \(p\) of success on each trial.
10% Condition
The second condition (independence) is often not perfectly met, as in the case of an SRS from some population. Imagine choosing 10 students from a class of 15 females and 15 males. As we choose people, the remaining population changes, which changes the probability that the next person chosen will be male or female.
When we lack complete independence, the consequence is negligible as long as our sample is small relative to the population from which we are sampling. If we were choosing our 10 people from a school of 3,300 students, the change in probability from person to person would be small enough to ignore.
The general rule is that the sample needs to be at most \(\frac{1}{10}\), or 10%, of the population. We refer to this as the 10% condition. \[ n \leq (0.10) N \]
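To see how much the probability shifts from one pick to the next, here is a small Python sketch. It uses the class of 30 from the text and, as an added assumption, supposes that exactly half of the 3,300-student school is male. The helper name is hypothetical.

```python
def p_second_male_given_first_male(males, total):
    """Probability the 2nd pick is male, given the 1st pick was male (no replacement)."""
    return (males - 1) / (total - 1)

# Class from the text: 15 males out of 30 students
small = p_second_male_given_first_male(15, 30)

# School of 3,300 students, assuming (hypothetically) that exactly half are male
large = p_second_male_given_first_male(1650, 3300)

# How far each probability has moved from the starting value of 0.5
print(0.5 - small)
print(0.5 - large)

# 10% condition for a sample of n = 10
n = 10
print(n <= 0.10 * 30)    # sample is a third of the class
print(n <= 0.10 * 3300)  # sample is a tiny fraction of the school
```

In the small class the probability moves noticeably after a single pick, while in the large school the change is tiny, which is the situation the 10% condition is meant to identify.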
Normal Approximation to the Binomial Distribution
Remember that as \(n\) gets large, a binomial random variable \(X\) can take on more and more different values, and it can become tedious to continue to treat \(X\) as a discrete random variable. As \(n\) gets larger, we can treat \(X\) as a continuous random variable, more specifically:
As \(n\) gets larger, the binomial distribution gets closer to a normal distribution.
However, before we use a normal distribution to approximate a binomial distribution, we have to check the following condition:
Large Counts condition
We can use a Normal distribution to model a binomial distribution if we know that the Large Counts Condition is fulfilled: \[np \geq 10 \text{ and }n(1-p) \geq 10\] In other words, the expected number of successes and the expected number of failures must each be greater than or equal to 10.
Doing a Normal Approximation to a Binomial Distribution
First, verify all the conditions for a binomial setting and the Large Counts Condition. Since we know that we have a binomial setting, the distribution that we want to use is \(Normal(np, \sqrt{npq})\). Then proceed with the calculations according to this distribution.
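The procedure can be sketched in Python using `statistics.NormalDist` from the standard library. The setting (\(n = 100\), \(p = 0.4\), and the cutoff \(k = 45\)) is a hypothetical example. The sketch evaluates the Normal cdf directly at \(k\) with no continuity correction, since these notes do not introduce one, so expect the two answers to be close but not identical.

```python
import math
from statistics import NormalDist

def binom_cdf(n, p, k):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Hypothetical setting: n = 100 trials, p = 0.4
n, p = 100, 0.4
q = 1 - p

# Large Counts condition: np >= 10 and n(1 - p) >= 10
large_counts_ok = n * p >= 10 and n * q >= 10

# Approximating distribution: Normal(np, sqrt(npq))
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))

k = 45
exact = binom_cdf(n, p, k)   # exact binomial probability P(X <= 45)
normal = approx.cdf(k)       # Normal approximation to the same probability

print(large_counts_ok, exact, normal)
```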
10.1.2 Geometric Random Variables
If \(X \sim G(p)\), then \(X\) has a geometric distribution with parameter \(p\), the probability of “success” on each trial.
The pdf of a geometric random variable is:
\[ \begin{aligned} P(X=x) &= (1-p)^{x-1}p\\ \text{where } x &= 1, 2, 3, \cdots \end{aligned} \]
In graphing calculator: \(\texttt{2nd -> vars (distr) -> geometpdf}\)
Usage: \(\texttt{geometpdf(p, x)}\)
The cdf of a geometric random variable is:
\[ \begin{aligned} P(X\leq x) &= \sum_{i=1}^x(1-p)^{i-1}p \end{aligned} \]
In graphing calculator: \(\texttt{2nd -> vars (distr) -> geometcdf}\)
Usage: \(\texttt{geometcdf(p, x)}\)
The mean and standard deviation of a geometric random variable are given by:
\[ \begin{aligned} \mu_X &= \frac{1}{p} \\ \sigma_X &= \frac{\sqrt{q}}{p} \\ &\text{, where } q = 1-p. \end{aligned} \]
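The geometric formulas can be sketched in Python in the same way. The function names `geomet_pdf` and `geomet_cdf` are made up here to mirror the calculator commands, and the setting (rolling a fair die until the first 6, so \(p = 1/6\)) is a hypothetical example.

```python
import math

def geomet_pdf(p, x):
    """P(X = x): the first success happens on trial x."""
    return (1 - p) ** (x - 1) * p

def geomet_cdf(p, x):
    """P(X <= x): the first success happens on or before trial x."""
    return sum(geomet_pdf(p, i) for i in range(1, x + 1))

# Hypothetical setting: rolling a fair die until the first 6, so p = 1/6
p = 1 / 6
q = 1 - p
mu = 1 / p                 # mean number of trials until the first success
sigma = math.sqrt(q) / p   # standard deviation

print(geomet_pdf(p, 3))  # P(X = 3)
print(geomet_cdf(p, 3))  # P(X <= 3)
print(mu, sigma)
```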
Identifying Geometric Settings
A geometric setting is very similar to a binomial setting, except that the number of trials \(n\) is not fixed. Instead, we count the number of trials until the first “success” occurs.
A geometric setting is defined as a series of observations where these 4 conditions are met:
Binary?
The possible outcomes of each trial can be classified as “success” or “failure”.
Independent?
Trials must be independent; that is, knowing the result of one trial must not tell us anything about the result of any other trial.
Trials until?
The goal is to count the number of trials until the first success occurs.
Success?
On each trial, the probability \(p\) of success must be the same.
10.2 Operations with Random Variables
10.2.1 Linear Transformations
When we multiply a random variable \(X\) by a constant \(b\) and/or add a constant \(a\), we perform a linear transformation of the form
\[ a + bX \]
The mean of \(a + bX\) is transformed in the same way that we change \(X\):
\[ \mu_{a+bX}=a+\mu_{bX}=a+b\mu_X \]
The standard deviation of \(a + bX\) is only affected by multiplication. Because a standard deviation cannot be negative, it scales by the absolute value of \(b\):
\[ \sigma_{a+bX}=\sigma_{bX}=|b|\sigma_X \]
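A short Python sketch can confirm these rules by computing the mean and standard deviation of \(a + bX\) directly from its distribution. The random variable (one roll of a fair die) and the constants \(a = 2\), \(b = 3\) are hypothetical choices, with \(b\) taken to be positive.

```python
import math

def mean_sd(values, probs):
    """Mean and standard deviation of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))
    return mu, sigma

# Hypothetical X: one roll of a fair six-sided die
x_values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
mu_x, sigma_x = mean_sd(x_values, probs)

# Hypothetical transformation: a + bX with a = 2 and b = 3
a, b = 2, 3
y_values = [a + b * x for x in x_values]  # same probabilities as X
mu_y, sigma_y = mean_sd(y_values, probs)

print(mu_y, a + b * mu_x)    # direct calculation vs. the mean rule
print(sigma_y, b * sigma_x)  # direct calculation vs. the standard deviation rule
```

Adding \(a\) shifts every value by the same amount, so it moves the mean but leaves the spread unchanged.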
10.2.2 Sum and Difference between Random Variables
In addition to linear transformations, we can also, to an extent, describe the result of adding or subtracting random variables.
If we have two random variables \(X\) and \(Y\), we can describe the mean and standard deviation of their sum or difference with the following:
\[\mu_{X \pm Y} = \mu_X \pm \mu_Y\]
If the random variables \(X\) and \(Y\) are independent, then
\[ \sigma_{X \pm Y}^2 = \sigma_X^2 + \sigma_Y^2 \]
And it follows that:
\[ \sigma_{X \pm Y} = \sqrt{\sigma_X^2 + \sigma_Y^2} \]
What is \(\sigma_X^2\)?
The variance of a random variable \(X\) is:
\[ Var(X) = \sigma_X^2 \]
Keep in mind that the formula for the standard deviation here only applies if we know that \(X\) and \(Y\) are independent of each other.
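These rules can be checked by brute force in Python. The sketch below assumes \(X\) and \(Y\) are two independent rolls of a fair die (a hypothetical example), lists all 36 equally likely pairs, and compares the results with the formulas.

```python
import math
from itertools import product

die = [1, 2, 3, 4, 5, 6]

def mean_sd(outcomes):
    """Mean and standard deviation of equally likely outcomes."""
    mu = sum(outcomes) / len(outcomes)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in outcomes) / len(outcomes))
    return mu, sigma

mu_x, sigma_x = mean_sd(die)

# Two independent rolls: all 36 (x, y) pairs are equally likely
sums = [x + y for x, y in product(die, die)]
diffs = [x - y for x, y in product(die, die)]

mu_sum, sigma_sum = mean_sd(sums)
mu_diff, sigma_diff = mean_sd(diffs)

# Formula: variances add for both the sum and the difference
sigma_formula = math.sqrt(sigma_x**2 + sigma_x**2)

print(mu_sum, mu_diff)
print(sigma_sum, sigma_diff, sigma_formula)
```

Note that the sum and the difference have the same standard deviation: subtracting \(Y\) still adds its variability.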
10.3 Interpretations
10.3.1 Probabilities and Means
Be aware that you are still expected to know how to interpret probabilities. Now that we are working with means (expected values), we are going to extend our long-run probability interpretation to means too.
Remember, if we are given a probability \(p\), we interpret it in the form of:
If [do the situation] many, many times, [this thing] will happen about [\(p\)] of the time.
This extends to any mean \(\mu\):
If [do the situation] many, many times, we will have an average of about [\(\mu\)].
Remember that you should be describing the situation properly.
If we are looking at the average number of heads in 50 coin flips, we’d say:
If we flip a coin 50 times many, many times, we will see an average of about 25 heads.
Or if we are looking at the probability that we get exactly 25 heads in 50 coin flips (which is \(\approx\) 0.1123), we’d say:
If we flip a coin 50 times many, many times, we will observe getting exactly 25 heads about 0.1123 of the time.
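The long-run interpretation can be illustrated with a small simulation. This is only a sketch: the number of repetitions (20,000) and the fixed random seed are arbitrary choices, and simulated values will land near, not exactly on, the theoretical ones.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def heads_in_50_flips():
    """Flip a fair coin 50 times and count the heads."""
    return sum(rng.random() < 0.5 for _ in range(50))

repetitions = 20_000  # stands in for "many, many times"
counts = [heads_in_50_flips() for _ in range(repetitions)]

average_heads = sum(counts) / repetitions
share_with_25 = counts.count(25) / repetitions

# Exact values from the binomial formulas, for comparison
exact_mean = 50 * 0.5
exact_prob = math.comb(50, 25) * 0.5**50

print(average_heads, exact_mean)
print(share_with_25, exact_prob)
```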
10.3.2 Standard Deviation
We previously interpreted standard deviation in Chapter 1. Now we can tone down the interpretation and just say something along the lines of:
[The variable] typically varies by [standard deviation] from the [mean].