This post continues the series of posts on performance measures. In our previous article we talked about Cohen’s Kappa.

A common way to measure the performance of a regression algorithm is Pearson's correlation between the true and the predicted values. However, Pearson's correlation has one drawback in this setting: it ignores any bias that might exist between the true and the predicted values. Let me give an example. The figure below shows the relationship between two variables: x is sampled from the standard normal distribution, and y = x + 10. We can treat x as the target variable and y as the predicted value of a statistical model.

The correlation between these two variables is exactly 1, so it is perfect.
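We can verify this in a few lines. The post itself gives no code, so this is a minimal sketch in Python with NumPy (the sample size of 1,000 and the random seed are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)  # target variable, sampled from N(0, 1)
y = x + 10                     # "predictions" with a constant bias of 10

# Pearson's correlation is blind to the additive bias
r = np.corrcoef(x, y)[0, 1]
print(round(r, 4))  # 1.0
```

Shifting y by any constant leaves Pearson's correlation untouched, which is exactly the drawback described above.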

[Figure: perfect correlation]

However, when measuring the agreement between the true and the predicted values, what we care about is whether they take the exact same value. When they do, the point lies on the 45-degree line through the origin. This is the red line shown below. It is clear that the red line indicating perfect agreement is parallel to, but does not coincide with, the relationship between x and y.

[Figure: Perfect correlation but no agreement]

So what can we do? The concordance correlation coefficient comes to the rescue! The concordance correlation coefficient measures the agreement between two variables, penalising both poor correlation and systematic bias away from the 45-degree line. In this case its value is around 0.02, indicating essentially no agreement between the two variables.
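Lin's concordance correlation coefficient has a simple closed form: 2·cov(x, y) divided by the sum of the two variances plus the squared difference of the means. A minimal Python implementation, not from the post, applied to the same x and y = x + 10 (helper name and seed are my own):

```python
import numpy as np

def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two samples."""
    x, y = np.asarray(x), np.asarray(y)
    sxy = np.cov(x, y, bias=True)[0, 1]  # population covariance
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = x + 10
print(round(concordance_cc(x, y), 3))  # ~0.02, despite Pearson's r = 1
```

With unit variances and a mean difference of 10, the formula gives roughly 2 / (1 + 1 + 100) ≈ 0.02, matching the value quoted above: the squared bias term in the denominator is what drags the coefficient towards zero.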

Unfortunately, the concordance correlation coefficient is not widely used in the evaluation of predictive models. I believe this to be an important omission and I would urge any data scientist to start using it for regression modelling.

You can use it in R via the epiR package.

So, next time you run a regression model, compute both Pearson's correlation coefficient and the concordance correlation coefficient and check the difference between them. You'll see that the concordance correlation coefficient is usually more conservative, taking a lower value than Pearson's correlation coefficient. It is, however, a safer measure of performance in regression, because it protects you in those cases where a strong bias in the predictions is an issue.