# Correlations

Let’s imagine it is a hot summer day and you are very thirsty. You look around and see a vending machine selling water.

You know what to do to get what you need: you insert a coin, and the machine should respond by giving you a bottle of water, unless it is broken, which unfortunately can happen.

Inserting the coin puts your relationship with the machine to the test. If the payment is correct, the machine will respond by giving you the much-needed bottle of water.

In the world of statistics, we measure relationships between different variables using calculators and software. There are different ways to test for associations or correlations, depending on the type of variables that you are measuring.

But beware: correlations can be deceiving. Just because your software lets you draw a line of best fit doesn’t mean that the line is appropriate. Let’s take a look at two of the most common tests of correlation.

## Pearson Correlation

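The Pearson correlation looks at two continuous variables to see whether there is a linear relationship between them. The magnitude and direction of this association are described by the correlation coefficient (r), which ranges from -1 to 1. The closer r is to 1 or -1, the stronger the linear relationship, and the sign tells you whether the variables move in the same or in opposite directions. If r is close to 0, there is likely no linear correlation. Squaring r gives the coefficient of determination (r2), which ranges from 0 to 1 and describes the proportion of the variation in one variable that is explained by the other.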
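To make the coefficient concrete, here is a minimal sketch that computes Pearson's r by hand using only the Python standard library. The helper name `pearson_r` and the temperature and sales figures are made up for illustration; in practice your statistical software will do this for you.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (numerator) and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: daily temperature (°C) and bottles of water sold
temperature = [20, 22, 25, 27, 30, 33]
bottles_sold = [11, 14, 15, 19, 24, 27]

r = pearson_r(temperature, bottles_sold)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

The sign of r carries the direction of the relationship, while r squared is never negative, which is why the two should not be used interchangeably.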

Assumptions for Pearson Correlation

1. Both measured variables are normally distributed
2. There is a linear relationship between the two variables
3. The data points are spread evenly around the line of best fit along its whole length (homoscedasticity): the scatter above and below the line stays roughly constant instead of fanning out at one end
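The linearity assumption is easy to overlook, so here is a small sketch of what happens when it fails. The data are invented and `pearson_r` is a hypothetical helper written for this example: y is determined entirely by x (y is x squared), yet because the relationship is a symmetric curve and not a line, Pearson's r comes out at zero.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up data: y depends perfectly on x, but along a U-shaped curve
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curve = pearson_r(x, y)
print(f"Pearson r for y = x^2: {r_curve:.3f}")
```

A scatter plot would reveal the pattern immediately, which is a good reason to look at your data before trusting a single coefficient.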

Importantly, finding a strong correlation between two variables does not imply causation. It may indicate that both variables are influenced by a hidden confounder.
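A hidden confounder can be illustrated with a short simulation. Everything below is an assumption made for the sake of the example: the variable names, the coefficients, and the noise levels are invented, and `pearson_r` and `partial_r` are hypothetical helpers. Temperature drives both water sales and ice cream sales, and neither sales figure influences the other. The two still end up strongly correlated, and the correlation largely disappears once temperature is accounted for with a first-order partial correlation.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical setup: temperature is the confounder behind both sales figures
temperature = [rng.uniform(15, 35) for _ in range(200)]
water_sales = [2.0 * t + rng.gauss(0, 3) for t in temperature]
ice_cream_sales = [1.5 * t + rng.gauss(0, 3) for t in temperature]

r_water_ice = pearson_r(water_sales, ice_cream_sales)
r_water_temp = pearson_r(water_sales, temperature)
r_ice_temp = pearson_r(ice_cream_sales, temperature)
partial = partial_r(r_water_ice, r_water_temp, r_ice_temp)

print(f"water vs ice cream:           r = {r_water_ice:.2f}")
print(f"controlling for temperature:  r = {partial:.2f}")
```

In real data the confounder is often unmeasured, so this kind of adjustment is only possible when you already suspect what the hidden variable is.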

## Spearman Correlation

Spearman’s correlation, on the other hand, first ranks the values of each variable in order and then measures the relationship between the ranks. It is a non-parametric test, which means that the data does not need to be normally distributed. It makes fewer assumptions than the Pearson correlation, but it is not assumption-free: the relationship between the two variables should be monotonic (consistently increasing or consistently decreasing), although it does not need to be linear.

The Spearman correlation requires variables that are measured on at least an ordinal scale. As long as the values of each variable have a meaningful order, they can be sorted from smallest to largest and ranked.

When might it be more appropriate to measure Spearman’s correlation? You might look to see if there is a correlation between an Olympian’s age and their finishing position in a race, since finishing position is an ordinal variable. You can interpret the coefficient in a similar way as you would with the Pearson correlation test: values near 1 or -1 indicate a strong relationship, and values near 0 indicate a weak one.
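The rank-then-correlate idea can be sketched in a few lines of standard-library Python. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the athlete data are hypothetical. Each variable is converted to ranks, with tied values sharing their average rank, and Pearson's r is then computed on the ranks.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at sorted position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: athlete age and finishing position in a race
ages = [19, 22, 24, 27, 29, 31, 34, 36]
finishing_position = [3, 1, 2, 5, 4, 7, 6, 8]
rho = spearman_rho(ages, finishing_position)
print(f"Spearman rho (age vs position) = {rho:.3f}")

# A monotonic but non-linear relationship: the ranks agree perfectly
x = [1, 2, 3, 4, 5]
y = [math.exp(v) for v in x]
print(f"Pearson r = {pearson_r(x, y):.3f}, Spearman rho = {spearman_rho(x, y):.3f}")
```

The second example shows the practical difference between the two tests: because y always increases with x, the ranks line up exactly and Spearman's coefficient reaches 1, while Pearson's r falls short of 1 because the points do not lie on a straight line.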

## Takeaway

Looking to test the strength of your relationship? Can you quantify your variables (or your love)? Then look no further than Pearson and Spearman correlation. Using statistical software, you can quickly figure out whether two variables are linearly associated (Pearson) or monotonically associated (Spearman).

If your data isn’t normally distributed, or if it is ordinal, the Spearman test can be used to assess it. Remember to interpret your results cautiously because, as we will soon cover, correlation does not equal causation.