We've all heard in school that “correlation does not imply causation.” The term “causality” has a nice intuitive definition, but a precise one has proved elusive. Causality in the real world seldom falls into one neat pattern or another. The simplest pattern has a few recognizable features: cause precedes effect in a sequential pattern; there is a direct link between cause and effect; and there is a clear beginning and a clear ending. Can you think of an example? A correlation between two variables does not imply causation. A poor score on the test, for instance, does not point to a single cause; there could be other reasons (the student may not have studied well, for example).
It is entirely possible that girls who are prone to eating disorders are also attracted to soap operas. There are several reasons why common-sense conclusions about cause and effect might be wrong.
Correlated occurrences may be due to a common cause. For example, the fact that red hair is correlated with blue eyes stems from a common genetic specification that codes for both. A correlation may also be observed when there is causality behind it—for example, it is well established that cigarette smoking not only correlates with lung cancer but actually causes it.
But in order to establish cause, we have to rule out the possibility that smokers are more likely to live in urban areas, where there is more pollution—and any other possible explanation for the observed correlation.
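The common-cause situation described above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the variable names, the noise model, and the sample size are all hypothetical, and the "adjustment" step works only because the simulation knows exactly how the hidden factor enters each variable, which real studies never do.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_common_cause(n=10_000, seed=0):
    """Two traits that never influence each other but share a hidden cause."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n)]       # the common cause
    trait_a = [h + rng.gauss(0, 1) for h in hidden]    # driven by hidden, not by trait_b
    trait_b = [h + rng.gauss(0, 1) for h in hidden]    # driven by hidden, not by trait_a
    return hidden, trait_a, trait_b


hidden, trait_a, trait_b = simulate_common_cause()

# The raw correlation is clearly positive even though neither trait causes the other.
raw = pearson(trait_a, trait_b)

# Subtract the hidden factor's contribution (known here only because we built the data).
rest_a = [a - h for a, h in zip(trait_a, hidden)]
rest_b = [b - h for b, h in zip(trait_b, hidden)]
adjusted = pearson(rest_a, rest_b)

print(f"raw correlation:      {raw:.2f}")
print(f"after removing cause: {adjusted:.2f}")
```

Once the shared factor is accounted for, the correlation between the two traits all but disappears, which is the pattern one would expect when a common cause, rather than a direct link, explains the data.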
In many cases, it seems obvious that one action causes another; however, there are also many cases when it is not so clear, except perhaps to the already-convinced observer. In the case of soap-opera-watching anorexics, we can neither exclude nor embrace the hypothesis that the television is a cause of the problem; additional research would be needed to make a convincing argument for causality. Another hypothesis might be that girls inclined to suffer poor body image are drawn to soap operas on television because the shows satisfy some need related to their poor body image.
None of these hypotheses is tested in a study that simply asks who is watching soaps and who is developing eating disorders, and finds a correlation between the two.
Causation vs Correlation
How, then, does one ever establish causality? This is one of the most daunting challenges for public health professionals and pharmaceutical companies. In a controlled study, two groups of people who are comparable in almost every way are given two different sets of experiences (such as one group watching soap operas and the other game shows), and the outcomes are compared.
If the two groups have substantially different outcomes, then the different experiences may have caused the different outcomes. There are obvious ethical limits to controlled studies, which is why epidemiological or observational studies are so important.
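To see why making the two groups comparable matters, here is a rough simulation contrasting self-selected groups with randomly assigned ones. Everything in it is an assumption made for illustration: the hidden `trait`, the function names, and the rule that people with a positive trait choose the experience themselves. By construction, the experience has no effect at all on the outcome.

```python
import random


def mean(values):
    return sum(values) / len(values)


def group_gap(assign, n=20_000, seed=1):
    """Difference in mean outcome between the two groups.

    `assign(trait, rng)` decides who gets the experience. The outcome
    depends only on the hidden trait, never on the experience itself.
    """
    rng = random.Random(seed)
    exposed, unexposed = [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)              # hidden predisposition
        outcome = trait + rng.gauss(0, 1)    # the experience has NO effect
        if assign(trait, rng):
            exposed.append(outcome)
        else:
            unexposed.append(outcome)
    return mean(exposed) - mean(unexposed)


def self_selected(trait, rng):
    # Observational setting: people with the trait seek out the experience.
    return trait > 0


def randomized(trait, rng):
    # Controlled setting: a coin flip decides, ignoring the trait.
    return rng.random() < 0.5


observational_gap = group_gap(self_selected)
experimental_gap = group_gap(randomized)

print(f"gap when people choose for themselves: {observational_gap:.2f}")
print(f"gap under random assignment:           {experimental_gap:.2f}")
```

In the self-selected comparison the groups differ sharply even though the experience does nothing, while random assignment makes the groups alike on the hidden trait and the gap shrinks to roughly zero.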
These are studies in which large groups of people are followed over time, and their behavior and outcomes are observed. In these studies, it is extremely difficult (though sometimes still possible) to tease out cause and effect, as opposed to a mere correlation. This was the case with cigarette smoking, for example. At the time that scientists, industry trade groups, activists and individuals were debating whether the observed correlation between heavy cigarette smoking and lung cancer was causal or not, many other hypotheses were considered (such as sleep deprivation or excessive drinking), and each one was dismissed as insufficiently describing the data.
When the stakes are high, people are much more likely to jump to causal conclusions. This seems to be doubly true when it comes to public suspicion about chemicals and environmental pollution.
There has been a lot of publicity over the purported relationship between autism and vaccinations, for example. As vaccination rates went up across the United States, so did autism.
And if you splice the data in just the right way, it looks like some kids with autism have had more vaccinations.
A few days later, while listening to a Data Skeptic podcast, it hit me: I had assumed that studying abroad caused students to have better grades and career prospects, when all the statistics showed was that the two were correlated.
Sometimes, especially with health, such claims tend towards the unbelievable, like a Guardian headline claiming that a diet of fish leads to less violence. The real explanation is usually much less exciting.
For example, students who take music lessons may perform better in school, but they are also more likely to have grown up in an environment with a large emphasis on education and the resources needed to succeed academically. These students would therefore have higher school achievement with or without the music lessons.
Music lessons and school performance happen to rise in tandem because they are both products of a similar background, but one does not necessarily cause the other. Likewise, people who stay in school longer typically have more resources, which also means they can afford better health care. Most of the time these mistakes are made not out of a deliberate effort to deceive (although that does occur) but out of an honest misunderstanding of the idea of causation.
What the statistics, especially those in the study abroad email, show is a selection bias. In each study, the individuals observed do not come from a representative slice of society, but instead are all drawn from similar groups, leading to a skewed result.
While it might be possible that studying abroad somehow motivated lagging students to graduate on time, the more likely explanation is that the students who chose to go abroad were those in a better position academically in the first place.
They would graduate on time with high GPAs regardless of whether they went to another country. It takes a lot of work and preparation to go to another country to study for a year, and the students who feel confident enough to do so are the ones who are on top of their studies.
In this real-world case, the selection bias is towards better students. The sample of students who study abroad is not indicative of students as a whole; rather, it includes only the best-prepared students, and therefore it is no surprise that this group has significantly better academic and career outcomes. The study abroad experience may look great in hindsight, but if we selected only the best students and had them do anything, it would be misleading to say the phenomenon led to better grades.
Say, for example, we own a bottled water company and we want to gather some positive stats to help with sales. We hire a few students to stand outside the honors class and give our water only to the top students. We then conduct a study that shows conclusively that students who drink our brand get better grades. Because we selected a specific group of subjects to include in our study, we can make it look as though our water caused an increase in grades.
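The bottled water trick is easy to reproduce with a handful of made-up numbers. The names, grades, and honors cutoff below are entirely hypothetical, and the grades are fixed before any water is handed out, so the water cannot have influenced them.

```python
# Hypothetical class. Grades were set BEFORE anyone received water.
grades = {"Ana": 95, "Ben": 91, "Cy": 88, "Dee": 74, "Eli": 69, "Fay": 62}

HONORS_CUTOFF = 90  # assumed threshold for who gets handed a bottle


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Selection step: only the top students are given the water.
drinkers = {name for name, grade in grades.items() if grade >= HONORS_CUTOFF}

drinker_mean = mean(g for name, g in grades.items() if name in drinkers)
other_mean = mean(g for name, g in grades.items() if name not in drinkers)

print(f"average grade, water drinkers: {drinker_mean}")  # 93.0
print(f"average grade, everyone else:  {other_mean}")    # 73.25
```

The "study" reports a gap of almost twenty points, yet the gap was created entirely by who was chosen to receive the water. The same arithmetic applies to any comparison in which the group being studied was picked, or picked itself, on the basis of the outcome being measured.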