7 Experimental Design
A sample survey aims to gather information about a population without disturbing the population in the process. Sample surveys are one kind of observational study.
Observational studies involve only observing individuals and measuring variables of interest, with no attempt to influence the responses.
Unfortunately, this means that many variables are not controlled for, so we cannot establish causation between the explanatory and response variables.
Response variables measure an outcome of a study.
Explanatory variables may help explain or predict changes in a response variable.
Because of this lack of control, we cannot be certain that it is the explanatory variable, rather than some other variable, that is actually driving changes in the response.
Confounding occurs when the effects of two variables on a response variable cannot be separated from each other.
Confounding variables are variables other than the explanatory variable that may have an effect on the response variable.
When our goal is to understand cause and effect, an experiment is the only source of fully convincing data since results in observational studies typically have some confounding variable in play. For this reason, the distinction between observational study and experiment is one of the most important in statistics.
An experiment deliberately imposes some treatment (the specific condition applied) on individuals (experimental units or subjects if the experimental units are human) to measure their responses.
7.1 Experiment Principles
In general, the quality of experiments (their internal validity) can be judged by the degree to which they have four things: comparison, randomization, control, and replication. Stronger internal validity gives us a more convincing cause-and-effect link in our experiment. Whenever you are describing or evaluating the design of an experiment, be sure to discuss all four of these!
- Comparison
- Use a design that compares two or more treatments.
- Randomization
- Use chance to assign experimental units to treatments. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among treatment groups.
- Control
- Keep other variables that might affect the response the same for all groups.
- Replication
- Use enough experimental units in each group so that differences in the effects of the treatments can be distinguished from chance differences between the groups.
The logic of a randomized comparative experiment depends on our ability to treat all the subjects the same in every way except for the actual treatments being compared. Good experiments, therefore, require careful attention to details to ensure that all subjects really are treated identically.
7.1.1 Placebos
The response to a dummy treatment is called the placebo effect. Subjects are given a placebo treatment to control for the placebo effect.
For example, if I tell someone that I am giving them an energy drink (when it in fact provides no “energy”), and they feel energized afterward, they have experienced the placebo effect.
It’s well known that someone’s mental state can easily affect their physical state, so it’s important to control for the placebo effect. Typically this comes up in medical settings, where you might give one group a pill containing the actual medicine and another group a placebo (a pill identical in every way except for the medicine) and conduct the experiment in a blind.
Conducting an experiment in a blind means that you give treatments to patients without allowing them to know which treatment they are taking.
However, whenever possible, experiments with human subjects take it a bit further and are conducted in a double-blind, where neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received.
7.2 Experiment Designs
7.2.1 Completely Randomized Design
In a completely randomized design, the experimental units are assigned to the treatments completely by chance. This is similar to (but NOT the same as) a simple random sample (SRS), because in both cases we ignore other variables. Here’s the difference: In an SRS, we’re picking some people (our sample) to study, and ignoring the rest. In a completely randomized experiment, however, we already have our sample (the people in our experiment), and we’re randomly deciding how we’re going to study each person (or, which treatment they’re going to get). So in complete randomization, the randomization is in the assignment, not in the selection, of people in our study.
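The random-assignment step of a completely randomized design can be sketched in a few lines of Python (the subject names and group sizes here are invented for illustration):

```python
import random

# Hypothetical subjects; in practice these are the units already in the study.
subjects = [f"subject_{i}" for i in range(1, 21)]

random.seed(42)  # fixed seed so the sketch is reproducible
shuffled = subjects[:]
random.shuffle(shuffled)  # chance alone decides the assignment

# Split the shuffled list into two equal treatment groups.
treatment_a = shuffled[:10]
treatment_b = shuffled[10:]
```

Notice that the randomization happens after the subjects are in hand: nothing about how these 20 subjects were originally selected is random here.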
7.2.2 Randomized Block Design
In a randomized block design, the experimental units are first divided into blocks according to some characteristic of the experimental units that is expected to affect the response. This is similar to stratified random sampling; however, we are not taking a sample. Any reference to stratified random sampling is wrong when describing an experiment design.
After each experimental unit is assigned to their block, the experiment is carried out in each block, where a completely randomized design is carried out within the block.
Afterwards, you analyze the results within each block, and finally combine all the results and analyze the differences between blocks.
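This two-step process can be sketched in Python, assuming a made-up blocking variable (sex) and two treatments:

```python
import random
from collections import defaultdict

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical experimental units, each tagged with a blocking variable (sex).
units = [(f"u{i}", "M" if i % 2 else "F") for i in range(1, 13)]

# Step 1: group the units into blocks by the blocking variable.
blocks = defaultdict(list)
for name, sex in units:
    blocks[sex].append(name)

# Step 2: carry out a completely randomized design WITHIN each block.
assignment = {}
for block_units in blocks.values():
    random.shuffle(block_units)
    half = len(block_units) // 2
    for name in block_units[:half]:
        assignment[name] = "treatment A"
    for name in block_units[half:]:
        assignment[name] = "treatment B"
```

Because the randomization is done separately inside each block, every block ends up with both treatments represented, so block-to-block differences cannot be mistaken for treatment effects.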
7.2.3 Matched Pairs Design
A matched pairs design is a special case of a randomized block design that uses blocks of size 2. In this kind of design, you have to have “matched pairs.” In other words, you need two extremely similar individuals to make up each block. In some cases, a single person forms each block and that person receives both treatments in randomized order (because who is more similar to a person than themselves?).
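Within each pair, a coin flip decides which member gets which treatment. A minimal sketch, with invented pair labels:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical matched pairs (e.g., twins, or pre-matched similar individuals).
pairs = [("p1a", "p1b"), ("p2a", "p2b"), ("p3a", "p3b")]

assignment = {}
for first, second in pairs:
    # A coin flip decides which member of the pair gets which treatment.
    if random.random() < 0.5:
        assignment[first], assignment[second] = "A", "B"
    else:
        assignment[first], assignment[second] = "B", "A"
```

The same logic covers the single-person version: the coin flip then decides the order in which one subject receives the two treatments.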
7.3 Inference
The main purpose of experiments is to be able to infer something about what we did. Does A actually affect B? Is it true for anyone else other than the people we experimented on?
An observed effect so large that it would rarely occur by chance is said to be statistically significant. In other words, if we assume a claim is true and then find that the data we collected would be very unlikely under that assumption, we say we have statistically significant evidence against the claim.
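We can make “would rarely occur by chance” concrete with a small randomization test (the response values below are invented): re-shuffle the pooled responses many times, as if the treatment had no effect, and count how often a difference at least as large as the observed one shows up.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Invented response data from a hypothetical two-group experiment.
group_a = [12, 15, 14, 16, 13, 17]
group_b = [9, 11, 10, 12, 8, 10]

observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

# If the treatment truly had no effect, any re-shuffling of the pooled
# responses into two groups of 6 would be equally likely.
pooled = group_a + group_b
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:6]) / 6 - sum(pooled[6:]) / 6
    if diff >= observed:
        count += 1

p_value = count / trials  # chance of a difference this large, by chance alone
```

If `p_value` is small (say, below 0.05), a difference as large as the one we observed would rarely arise from the random assignment alone, so we would call the result statistically significant.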
7.3.1 Scope of Inference
The scope of inference refers to the type of inferences (conclusions) that can be drawn from a study. The types of inferences we can make (inferences about the population and inferences about cause-and-effect) are determined by two factors in the design of the study.
|  | Were individuals randomly assigned to groups? **Yes** | Were individuals randomly assigned to groups? **No** |
|---|---|---|
| **Were individuals randomly selected from a population? Yes** | Can make inferences about the population. Can make inferences about cause and effect. *(Rare in the real world)* | Can make inferences about the population. Cannot make inferences about cause and effect. *(Some observational studies)* |
| **Were individuals randomly selected from a population? No** | Cannot make inferences about the population. Can make inferences about cause and effect. *(Most experiments)* | Cannot make inferences about the population. Cannot make inferences about cause and effect. *(Some observational studies)* |