**Regression to the mean shows that following an extreme random
event, the next random event is likely to be less extreme.**

The concept was first presented by Sir Francis Galton. A more comprehensive statement of regression to the mean is that:

**In any series of events with complex phenomena that are
dependent on many variables, and where chance is involved, extreme outcomes
tend to be followed by more moderate ones.**

It is not a mathematical law but a statistical tendency.

To be able to understand regression to the mean, we first need to understand correlation.

**Correlation** __is a relationship__, thus **action A relates to outcome B**.

The strength or weakness of the correlation of shared factors [or causes] will show the level of dependency of the outcome [or score] on those factors.

If the correlation is strong there will be little to no regression to the mean, whereas a weak or non-existent correlation between shared factors will likely show regression to the mean.
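The link between correlation strength and regression can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a definitive model: every case gets two scores, each built from one shared factor plus independent chance, and the hypothetical parameter `shared_weight` sets how much of a score comes from the shared factor. The function names are made up for this illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(shared_weight, n=20000, seed=0):
    """Return two score lists that share one factor; the rest is chance.

    Assumption: scores are standard normal, so the mean is 0.
    """
    rng = random.Random(seed)
    noise_weight = (1 - shared_weight ** 2) ** 0.5
    first, second = [], []
    for _ in range(n):
        shared = rng.gauss(0, 1)
        first.append(shared_weight * shared + noise_weight * rng.gauss(0, 1))
        second.append(shared_weight * shared + noise_weight * rng.gauss(0, 1))
    return first, second

def followup_of_extremes(first, second, threshold=1.5):
    """Average first and second score for cases whose first score was extreme."""
    extreme = [(a, b) for a, b in zip(first, second) if a > threshold]
    mean_first = sum(a for a, _ in extreme) / len(extreme)
    mean_second = sum(b for _, b in extreme) / len(extreme)
    return mean_first, mean_second

if __name__ == "__main__":
    for weight in (1.0, 0.7, 0.3, 0.0):
        first, second = simulate(weight)
        m1, m2 = followup_of_extremes(first, second)
        print(f"weight={weight} r={pearson(first, second):.2f} "
              f"extreme first={m1:.2f} follow-up={m2:.2f}")
```

In this setup the correlation between the two scores works out to `shared_weight ** 2`. With a weight of 1 the follow-up score equals the extreme first score, so nothing regresses; as the weight falls towards 0, the follow-up average drifts back towards the mean of 0.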

__Examples:__

**[1] Strong correlation of factors - TEMPERATURE**

The only factor determining temperature, the average kinetic energy of the molecules, is shared by all scales, hence each value in Celsius will have exactly one corresponding value in Fahrenheit.

So the temperature in Celsius and Fahrenheit will have a correlation coefficient of 1, and a plot of one scale against the other will be a straight line.
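As a quick check of the temperature example, the sketch below converts a few Celsius readings to Fahrenheit with the standard formula F = C × 9/5 + 32 and computes the Pearson correlation coefficient by hand. The sample readings are arbitrary values chosen for illustration, and `pearson` is a helper written for this example rather than a library function.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Arbitrary sample readings in Celsius (illustrative values only).
celsius = [-40.0, -10.0, 0.0, 20.0, 37.0, 100.0]

# Standard conversion: every Celsius value maps to exactly one Fahrenheit value.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r = pearson(celsius, fahrenheit)
print(f"correlation coefficient: {r:.6f}")
```

Because one scale is an exact linear function of the other, the coefficient should come out as 1 up to floating-point rounding, whichever readings are used.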

**[2] Weak to non-existent correlation of factors - BOTTLED WATER CONSUMPTION and DEATHS FROM AIR CRASHES**

If we looked at all countries in the world and plotted death rates from air crashes per country, in any specific year, against per capita consumption of bottled water, the plot would show no pattern at all. Thus the correlation coefficient is close to 0, i.e. there is no correlation.

We need to be especially careful of regression to the mean when we are trying to establish causality between two factors.

As we have already established:

**Correlation** is a relationship, thus **action A relates to outcome B**.

Whereas:

**Causation** can be defined as **action A causes outcome B.**

**Causation** **produces a correlation between action A and outcome B, but a correlation, even a perfect one, does not by itself establish causation.**

- Correlation and causation are often confused because the human mind likes to find patterns even when they do not exist.
- We can __not__ assume causation just because we see an apparent connection or link between A and B, such as when we see that B follows A or we see A and B simultaneously.
- We can only show causation when we know __how__ A causes B and __why__ it causes B.
- We need to be able to __separate action A from all other variable factors__.
- We must also be able to show how action A can be replicated under the same conditions to achieve outcome B, AND be able to support this with empirical evidence [thus following scientific method].

**This is something that many people, social media, the general media, and
sometimes even trained scientists, fail to understand or recognise.**

Regression To The Mean - Examples

Can you recall a wonderful evening out? The weather was fabulous, your table was great and the restaurant was excellent, the food and wine were superb and you were with great people.

Then you tried to repeat the experience and somehow it wasn't as good and you were disappointed. Why was this? It was because your perfect evening was due to a random series of chance events that all fell in place on the wonderful evening.

Whenever you try to repeat the perfect experience, chances are that at least one thing won't be perfect the next time.

Regression to the mean is prevalent in sport and can explain the “*manager of the month curse*” in football.

This award is usually won by managers who have had four or more wins in a row, often because of a combination of skill and luck.

When the luck runs out, the “*curse*” strikes.
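The mix of skill and luck behind the "curse" can be sketched with a toy simulation. Everything here is an assumption made for illustration: each manager has a fixed chance of winning any match (drawn between 0.3 and 0.7), a month has four matches, and the hypothetical award goes to anyone who wins all four. Nothing changes between the two months except chance.

```python
import random

def simulate_award(n_managers=20000, matches_per_month=4, seed=1):
    """Compare award winners' next month with the league-wide average."""
    rng = random.Random(seed)

    # Hypothetical skill: a fixed probability of winning any single match.
    skills = [rng.uniform(0.3, 0.7) for _ in range(n_managers)]

    def month_wins(win_probability):
        # Luck: each match is an independent draw against the manager's skill.
        return sum(rng.random() < win_probability
                   for _ in range(matches_per_month))

    month1 = [month_wins(p) for p in skills]
    month2 = [month_wins(p) for p in skills]

    # Award winners: a perfect first month.
    winners = [i for i, wins in enumerate(month1) if wins == matches_per_month]

    avg_next_for_winners = sum(month2[i] for i in winners) / len(winners)
    avg_next_for_all = sum(month2) / n_managers
    return len(winners), avg_next_for_winners, avg_next_for_all

if __name__ == "__main__":
    count, winners_next, everyone_next = simulate_award()
    print(f"award winners: {count}")
    print(f"winners' average wins next month: {winners_next:.2f} of 4")
    print(f"all managers' average wins next month: {everyone_next:.2f} of 4")
```

Under these assumptions the award winners should still do better than the average manager in the following month, because skill carries over, but noticeably worse than their perfect month, because the luck does not.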

Regression to the mean can be harmless, but it becomes a problem when it is misinterpreted:

*"For example, **imagine you ran a hospital** and were told that
hospital-acquired infections were five times higher than average last
month. A colleague tells you they know the cause and it can be solved by
using more prophylactic antibiotics.*

*You agree and in the following month you’re told that prophylactic
antibiotic use is through the roof and infections have come down. Your
mind makes a causal connection and you’re now convinced of the need for
widespread prophylactic antibiotics, a potentially dangerous connection
given that the unusual infection rate could have been due to chance
events.*

*Now your hospital budget will be tighter because of the costs of
using more antibiotics, and you’re contributing to the serious problem of
antibiotic resistance."*
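The hospital scenario can be explored with a rough simulation in which nothing is ever changed. The figures are assumptions chosen purely for illustration (500 patients a month, a constant 2% infection risk, and a "spike" defined as more than two standard deviations above the average), not real hospital data.

```python
import random

def monthly_infections(rng, patients=500, risk=0.02):
    """Count infections in one month when every patient has the same fixed risk."""
    return sum(rng.random() < risk for _ in range(patients))

def month_after_spike(months=5000, seed=2):
    """Return (overall mean, mean of spike months, mean of the months after them)."""
    rng = random.Random(seed)
    counts = [monthly_infections(rng) for _ in range(months)]

    mean = sum(counts) / months
    sd = (sum((c - mean) ** 2 for c in counts) / months) ** 0.5

    # Hypothetical definition of an unusually bad month.
    threshold = mean + 2 * sd

    # Pair each spike month with the month that follows it.
    pairs = [(counts[i], counts[i + 1])
             for i in range(months - 1) if counts[i] > threshold]

    avg_spike = sum(a for a, _ in pairs) / len(pairs)
    avg_next = sum(b for _, b in pairs) / len(pairs)
    return mean, avg_spike, avg_next

if __name__ == "__main__":
    mean, avg_spike, avg_next = month_after_spike()
    print(f"average month: {mean:.1f} infections")
    print(f"spike months: {avg_spike:.1f} infections")
    print(f"months right after a spike: {avg_next:.1f} infections")
```

With a constant underlying risk, the months following a spike should fall back to roughly the overall average even though no antibiotics policy was introduced, which is the drop a manager could easily mistake for the effect of an intervention.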

Here are 4 suggestions for identifying and countering false attribution of cause and effect, and for understanding regression to the mean:

- **Gather more data**
- **Identify the variables**
- **Wait and watch**
- **Rinse and repeat**
