Another frustrating aspect of science and statistics is the challenge of determining true values.
While we can get a better approximation by increasing our sample size, for many measures we may never know the real value exactly. If the real value is so elusive, why do we so often see single averages and point estimates reported?
Many researchers are now advocating for more transparent reporting of confidence intervals. Simply put, a confidence interval is a range of values within which the true value is likely to fall.
Remember the normal distribution? If we know that our dependent variable is continuous, randomly sampled, and normally distributed – we can easily estimate confidence intervals.
Remember that about 68 percent of the data in a sample will lie within one standard deviation of the mean. But we can increase our certainty – about 95 percent of the data falls within two standard deviations of the mean.
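As a quick sanity check, a simulation like the sketch below (drawing from a normal distribution with an arbitrary mean of 100 and standard deviation of 15) reproduces these proportions.

```python
import numpy as np

# Illustrative only: simulate a normally distributed sample and check
# how much of it falls within 1 and 2 standard deviations of the mean.
rng = np.random.default_rng(42)
sample = rng.normal(loc=100, scale=15, size=100_000)  # hypothetical values

mean, sd = sample.mean(), sample.std()
within_1sd = np.mean(np.abs(sample - mean) <= 1 * sd)
within_2sd = np.mean(np.abs(sample - mean) <= 2 * sd)

print(f"Within 1 SD: {within_1sd:.1%}")  # roughly 68%
print(f"Within 2 SD: {within_2sd:.1%}")  # roughly 95%
```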
How exactly would we calculate these confidence intervals? There is a table of Z values, which gives you the number to plug into the formula that determines a confidence interval.
For example, the Z value for a 95% confidence interval is 1.960. For 99%, this value is 2.576. We can then use the following formula:
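CI = x̄ ± Z × (σ / √n)

where x̄ is the sample mean, σ is the standard deviation of the sample, n is the sample size, and Z is the value taken from the table above.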
Let’s take a close look at the formula and see what happens when we change the standard deviation and the sample size.
We can work through a few practical examples to see how these intervals change, as in the sketch below.
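This is a minimal sketch using a hypothetical mean of 50 and arbitrary values for the standard deviation and sample size; it shows the pattern we care about, namely that the interval widens as the standard deviation grows and narrows as the sample size grows.

```python
import math

Z_95 = 1.960  # Z value for a 95% confidence interval

def confidence_interval(mean, sd, n, z=Z_95):
    """Return the (lower, upper) bounds of the confidence interval."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical mean of 50 with varying standard deviation and sample size.
for sd in (5, 10):
    for n in (25, 100, 400):
        low, high = confidence_interval(50, sd, n)
        print(f"sd={sd:>2}, n={n:>3}: 95% CI [{low:.2f}; {high:.2f}]")
```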
In short, we can define the confidence interval as an estimate from observed data that gives a range of values for an unknown population parameter. It can also tell you about the reliability of your effect size and measurements.
For example, if you are looking to see whether a gene's expression is increased or decreased after a certain intervention or treatment, you can calculate an effect size. If the value is positive, expression has increased; if it is negative, expression has decreased. However, even with a low P-value, your effect size might be inconclusive.
If the effect size comes with an interval like 95% CI [-0.2; 0.2], it includes zero. That means you don't have enough power to determine whether the expression of this gene increases or decreases. If there were a stronger effect, the 95% CI would not contain zero.
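As an illustration, the sketch below uses made-up expression values and the difference in means as a simple stand-in for the effect size, then checks whether the 95% CI spans zero.

```python
import numpy as np
from scipy import stats

# Hypothetical log-expression values for one gene, before and after treatment.
control = np.array([5.1, 4.8, 5.3, 5.0, 4.9, 5.2])
treated = np.array([5.2, 5.0, 5.4, 4.9, 5.1, 5.3])

# Effect size here is simply the difference in means (treated - control).
effect = treated.mean() - control.mean()

# Standard error of the difference in means (unequal variances).
se = np.sqrt(treated.var(ddof=1) / len(treated) +
             control.var(ddof=1) / len(control))
z = stats.norm.ppf(0.975)  # 1.960 for a 95% interval

low, high = effect - z * se, effect + z * se
print(f"Effect size: {effect:.2f}, 95% CI [{low:.2f}; {high:.2f}]")

if low <= 0 <= high:
    print("CI includes zero: direction of the change is inconclusive.")
else:
    print("CI excludes zero: the direction of the change is supported.")
```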
Confidence intervals are important to calculate because they help contextualize the results of a statistical test. Taking into account the sample size and standard deviation of your dataset, they estimate the range of values within which the population parameter is likely to fall. This also means that a statistically significant result can still be uncertain: if its 95% confidence interval is wide or barely excludes zero, the size of the effect remains poorly pinned down.