

Linear regression is without question the most famous statistical algorithm. It is often the first algorithm taught in machine learning courses, and it is surprisingly effective across a huge range of problems. It also has several nice properties, such as coefficients with a clear interpretation: each one is the expected change in the response for a one-unit change in that predictor, holding the other predictors fixed.

However, linear regression is often misused. The two most common problems I’ve seen in practice are:

  1. Not checking that the assumptions of the model hold.
  2. Not running any diagnostic checks on the fitted model.
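As a rough illustration of what a basic check looks like, here is a minimal sketch in Python with NumPy. It is not the dashboard's code; the helper names `fit_ols` and `residual_shape` and the simulated data are assumptions made up for this example. It fits a regression by least squares and then looks at the skewness and excess kurtosis of the residuals, both of which should be close to zero if the errors are roughly normal.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residuals)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, y - design @ coef


def residual_shape(residuals):
    """Sample skewness and excess kurtosis of the residuals."""
    z = (residuals - residuals.mean()) / residuals.std()
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)


# Simulated data (hypothetical): same linear signal, two kinds of noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=500)
y_ok = 1.0 + 2.0 * x + rng.normal(0, 1, size=500)
y_skewed = 1.0 + 2.0 * x + rng.exponential(1.0, size=500)

for label, y in [("normal errors", y_ok), ("skewed errors", y_skewed)]:
    coef, resid = fit_ols(x, y)
    skew, kurt = residual_shape(resid)
    print(f"{label}: slope={coef[1]:.2f} skew={skew:.2f} excess kurtosis={kurt:.2f}")
```

Both fits recover a slope close to 2, which is exactly why the problem is easy to miss: the coefficients look fine, and only the residuals reveal that the error assumption is violated in the second case.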

This dashboard follows the ideas set out in my article on data science protocols. It runs a linear regression together with a set of established diagnostic tests in order to surface any issues with the model. This provides a safer way to run and use linear regression. The dashboard tests for:

  1. Multicollinearity
  2. Global assumptions (skewness, kurtosis, link function and heteroskedasticity)
  3. Outliers
  4. Model diagnostics
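To make two of these checks concrete, the sketch below computes variance inflation factors (a common multicollinearity measure) and Cook's distance (a common influence measure for spotting outliers) using NumPy. This is an illustration under my own assumptions, not the code behind the dashboard: the function names `vif` and `cooks_distance` and the toy data are hypothetical.

```python
import numpy as np


def _design(X):
    """Prepend an intercept column to a 2-D predictor matrix."""
    return np.column_stack([np.ones(X.shape[0]), X])


def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = _design(np.delete(X, j, axis=1))
        # Regress predictor j on all the other predictors.
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


def cooks_distance(X, y):
    """Cook's distance of every observation for an OLS fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    D = _design(X)
    n, p = D.shape
    hat = D @ np.linalg.pinv(D.T @ D) @ D.T  # hat (projection) matrix
    h = np.diag(hat)                         # leverage of each observation
    resid = y - hat @ y
    s2 = (resid @ resid) / (n - p)           # residual variance estimate
    return resid**2 / (p * s2) * h / (1.0 - h) ** 2


# Toy data (hypothetical): x3 is almost a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1, x2, x4 = rng.normal(size=(3, 200))
x3 = x1 + x2 + 0.1 * rng.normal(size=200)
print(vif(np.column_stack([x1, x2, x3, x4])).round(1))

# A perfectly linear series with one planted outlier at the last index.
x = np.arange(20.0)
y = 1.0 + 2.0 * x
y[19] += 10.0
print(int(np.argmax(cooks_distance(x, y))))  # index of the planted outlier
```

A frequently quoted rule of thumb treats a VIF above 5 or 10 as a sign of problematic collinearity; in this toy data the three entangled predictors land far above that, while the independent one stays near 1. Thresholds like these are conventions rather than hard rules, so they are best read alongside the other diagnostics.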

Let me know if you have any feedback.

Link to dashboard: https://stylianos-kampakis.shinyapps.io/linear_regression/

 


If you would like to learn more about how data science can be used in your business, make sure to check out my book The Decision Maker's Handbook to Data Science. Also, make sure to check out my courses, as well as my webinars:
  1. What it's like to be a data scientist
  2. The importance of data strategy
