Priyankaa Nigam

Why do we need Statistics?

Excitement is rarely anyone's first reaction to the subject of statistics. This post aims to change that. You'll learn the basics of statistics ...



"Statistics describe Hidden Realities!"


Put yourself in the shoes of a high school senior. It's fall, which means it's time to think about college applications. The United States is home to an impressive 5,300 higher education institutions. The question then becomes, "How do you pick which universities to apply to?" If you know a few descriptive statistics, such as the 25th to 75th percentile GPA and SAT/ACT scores of the students who were admitted to a particular college, you can make an educated choice based on whether you have a reasonable chance of being accepted.
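As a rough illustration of how such a percentile range can be used, here is a minimal Python sketch. The GPA list is made up for the example, and the helper names (middle_50_range, describe_chance) are hypothetical; real admissions data would come from the college itself.

```python
import statistics

# Hypothetical GPAs of students admitted to one college (invented data).
admitted_gpas = [3.2, 3.4, 3.5, 3.6, 3.7, 3.7, 3.8, 3.9, 3.9, 4.0]


def middle_50_range(values):
    """Return the 25th and 75th percentiles of the values."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def describe_chance(gpa, values):
    """Place an applicant's GPA relative to the middle 50% of admitted students."""
    q1, q3 = middle_50_range(values)
    if gpa < q1:
        return "below the middle 50%"
    if gpa > q3:
        return "above the middle 50%"
    return "within the middle 50%"


q1, q3 = middle_50_range(admitted_gpas)
print(f"Middle 50% GPA range: {q1} to {q3}")
print("An applicant with a 3.7 GPA is", describe_chance(3.7, admitted_gpas))
```

An applicant whose GPA falls within or above this range has a more realistic shot than one who falls below it, although admission depends on much more than a single number.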


Statistics is an objective method of observing reality, which may reveal things we didn't see before. If we can interpret these findings correctly, we can use them to make informed decisions in many practical settings. The more statistical concepts we know, the more tools we have at our disposal to solve practical problems. Let's look at some examples of statistical concepts and why it's crucial to interpret them properly.


Simpson's Paradox


Simpson's Paradox occurs when a trend that appears in several separate groups of data disappears or reverses when the groups are combined into one set.


Let's pretend you've developed two possible treatments for COVID-19: Drug A and Drug B. You are curious as to which one is more effective, so you conduct two human clinical trials.



In both trials, Drug A appears to be superior to Drug B. However, what if we look at the combined data from Trial 1 and Trial 2?





Drug B now seems to have a better success rate than Drug A. It's the opposite of what we expected to happen.


For correct interpretation of the data, it is crucial to recognize and understand this apparent contradiction.
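The reversal can be reproduced with a few lines of arithmetic. The Python sketch below uses hypothetical patient counts, chosen only to illustrate the effect; the function names are also just for this example.

```python
# Hypothetical (recovered, treated) counts per trial and drug.
# These numbers are invented for illustration only.
trials = {
    "Trial 1": {"Drug A": (81, 87), "Drug B": (234, 270)},
    "Trial 2": {"Drug A": (192, 263), "Drug B": (55, 80)},
}


def success_rate(recovered, treated):
    """Fraction of treated patients who recovered."""
    return recovered / treated


def combined_rate(trials, drug):
    """Success rate for one drug after pooling every trial."""
    recovered = sum(results[drug][0] for results in trials.values())
    treated = sum(results[drug][1] for results in trials.values())
    return success_rate(recovered, treated)


# Compare the drugs inside each trial first.
for name, results in trials.items():
    rate_a = success_rate(*results["Drug A"])
    rate_b = success_rate(*results["Drug B"])
    print(f"{name}: Drug A {rate_a:.1%}, Drug B {rate_b:.1%}")

# Then compare them after pooling both trials.
print(f"Combined: Drug A {combined_rate(trials, 'Drug A'):.1%}, "
      f"Drug B {combined_rate(trials, 'Drug B'):.1%}")
```

With these counts, Drug A wins inside each trial but loses once the trials are pooled. The reason is that most of Drug A's patients come from Trial 2, where both drugs do worse, while most of Drug B's patients come from Trial 1, where both drugs do better. The unequal group sizes, not the drugs themselves, produce the reversal.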



Correlation (is not Causation)


Correlation tells us to what extent different variables are related to each other.


The relationship can be positive (as one variable increases, the other increases), negative (as one variable increases, the other decreases), or absent altogether.
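One common way to put a number on this is the Pearson correlation coefficient, which ranges from -1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). The sketch below computes it by hand in Python; the three small data sets are invented to show each case.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Invented data: one x variable and three possible y variables.
x = [1, 2, 3, 4, 5]
rises_with_x = [2.1, 3.9, 6.2, 8.0, 9.8]
falls_with_x = [9.8, 8.0, 6.2, 3.9, 2.1]
unrelated_to_x = [3, 1, 4, 1, 3]

r_positive = pearson_r(x, rises_with_x)
r_negative = pearson_r(x, falls_with_x)
r_none = pearson_r(x, unrelated_to_x)

print(f"positive: {r_positive:.2f}")
print(f"negative: {r_negative:.2f}")
print(f"none:     {r_none:.2f}")
```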


Understanding the relationship between two variables is important in making predictions about their future behavior. This has applications in medical and psychological fields as well as in business, government, and everyday life.


For instance, the amount of time students spend studying each day and their GPA have a strong positive correlation. According to the graph below, a student should expect to earn a GPA of 3.8 if he or she studies for four hours daily.
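A prediction like this usually comes from fitting a straight line through the data points. The following Python sketch fits a least-squares line to a small, made-up set of study hours and GPAs and then reads off the prediction for four hours of study. The data and the fit_line helper are assumptions for illustration, not the figures behind the post's graph.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: daily study hours and the GPA of five students.
study_hours = [1, 2, 3, 4, 5]
gpas = [2.9, 3.3, 3.4, 3.8, 4.0]

slope, intercept = fit_line(study_hours, gpas)
predicted_gpa = slope * 4 + intercept

print(f"Each extra hour is associated with about {slope:.2f} GPA points")
print(f"Predicted GPA for 4 hours of study: {predicted_gpa:.2f}")
```

The fitted line describes the average trend in the data. It is an expectation for a typical student, not a guarantee for any individual one.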


Correlational studies are useful for determining the strength and direction of a relationship, but they reveal little about what causes it. Some correlations do arise because one variable causes the other. It is also possible that a third variable, known as a confounding variable, drives both of them, or that the correlation is merely coincidental.


A classic example is the strong positive correlation between ice cream sales and the number of homicides.


When ice cream sales go up, so do murders. But does that prove that ice cream consumption causes murders? That's completely ridiculous! Hotter temperatures are likely to be a contributing factor in both the rise in ice cream sales and the rise in homicide rates. When the temperature rises, ice cream sales spike. Also, since more people are out and about on the streets, the number of potential victims increases.

A spurious correlation between two completely unrelated incidents, homicide and ice cream sales, results when the hidden factor, hot temperature, is ignored.
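A small simulation can make the role of the hidden factor concrete. In the Python sketch below, temperature drives both ice cream sales and homicides, and the two never influence each other directly. All coefficients and noise levels are invented, and partial_r is a helper written for this example; it uses the standard partial correlation formula to measure what is left of the relationship once temperature is taken into account.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Simulated days: temperature is the only thing the two outcomes share.
rng = random.Random(42)
temperature = [rng.uniform(0, 35) for _ in range(1000)]
ice_cream_sales = [20 + 3 * t + rng.gauss(0, 10) for t in temperature]
homicides = [2 + 0.1 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson_r(ice_cream_sales, homicides)
controlled_r = partial_r(ice_cream_sales, homicides, temperature)

print(f"Ice cream vs. homicides: r = {raw_r:.2f}")
print(f"Same, controlling for temperature: r = {controlled_r:.2f}")
```

In this simulation, the raw correlation is strong, while the correlation that remains after controlling for temperature is close to zero. That pattern is what we would expect when a confounding variable is responsible for the apparent link.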

Many people make the mistake of assuming causation where there is only correlation, which can lead to misleading conclusions and faulty decisions. In the above scenario, attributing the murders to ice cream sales could very well lead the government to ban the sale of ice cream.


It is important to understand statistical concepts properly before using them, because misinterpreting them can lead us astray.


Key Takeaways:

  • Learning and understanding statistical principles gives us the ability to tackle real-world problems.

  • With its emphasis on empirical evidence, statistics has the capacity to shed light on hidden truths.

  • It is essential to have a firm grasp of statistical concepts to interpret data correctly; a weak grasp can lead to misleading inferences and wrong decisions.




