Hello Data Ninjas!

This introductory blog post is all about the need for regularization in the Ordinary Least Squares (OLS) method. I have tried to keep the concepts simple, using plain text and a few small code sketches. In coming posts, I will try to use equations and visuals for better comprehension.
Linear regression is the process of fitting a line (or curve) so that the sum of squared differences between the estimated values and the actual values is minimized, i.e., minimization of the squared residuals. This method is also called the Ordinary Least Squares (OLS) method.
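To make that objective concrete, here is a minimal sketch in Python. The data and numbers are made up purely for illustration; NumPy's `lstsq` solves exactly this minimize-the-squared-residuals problem.

```python
# A minimal OLS sketch on synthetic data: fit y = slope*x + intercept
# by minimizing the sum of squared residuals.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=50)  # true line plus noise

# Design matrix with a column of ones for the intercept term.
X = np.column_stack([x, np.ones_like(x)])

# lstsq solves min_w ||X @ w - y||^2 -- the OLS objective.
w, rss, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print("slope, intercept:", w)  # should come out near (3.0, 2.0)
print("sum of squared residuals:", rss)
```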
But the Ordinary Least Squares method is not sufficient whenever the ratio of observations to the number of variables is low in regression modelling. Prediction accuracy gets compromised when there are many variables and few data points. Methods like ridge regression and the lasso provide a possible solution to this problem.
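Here is a small, made-up simulation with scikit-learn that shows the problem: with only 25 observations and 20 variables, plain OLS overfits, and a ridge penalty usually gives a noticeably lower test error. The data, sizes, and penalty strength are all invented for illustration.

```python
# Toy illustration (synthetic data): OLS vs. ridge when observations
# are scarce relative to the number of variables.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n_train, n_test, p = 25, 200, 20  # few observations, many variables

X = rng.normal(size=(n_train + n_test, p))
true_w = rng.normal(size=p)
y = X @ true_w + rng.normal(scale=2.0, size=n_train + n_test)

X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

ols = LinearRegression().fit(X_tr, y_tr)
ridge = Ridge(alpha=5.0).fit(X_tr, y_tr)  # alpha picked by hand here

print("OLS   test MSE:", mean_squared_error(y_te, ols.predict(X_te)))
print("Ridge test MSE:", mean_squared_error(y_te, ridge.predict(X_te)))
```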
Ridge regression generally yields better predictions than the OLS solution, through a better compromise between bias and variance. Its main drawback is that all predictors are kept in the model, so it is not very interesting if you seek a parsimonious model or want to apply some kind of feature selection.
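You can see the "all predictors are kept" behaviour directly. In the toy setup below (synthetic again), only three of ten variables actually matter, yet every ridge coefficient comes out non-zero, just shrunk.

```python
# Ridge shrinks coefficients but does not zero them out.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
# Only the first 3 variables matter in this synthetic setup.
y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=50)

ridge = Ridge(alpha=10.0).fit(X, y)
print("non-zero ridge coefficients:", np.sum(ridge.coef_ != 0), "of", X.shape[1])
print(np.round(ridge.coef_, 3))
```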
To achieve sparsity, the lasso is more appropriate, but it will not necessarily yield good results in the presence of high collinearity (it has been observed that when predictors are highly correlated, the prediction performance of the lasso is dominated by ridge regression).
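Running the lasso on the same kind of toy data shows the contrast: several coefficients land at exactly zero, which is what gives you sparsity and built-in feature selection. As before, the penalty value is just an illustrative choice.

```python
# Lasso sets some coefficients exactly to zero -> built-in feature selection.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=50)

lasso = Lasso(alpha=0.1).fit(X, y)
print("non-zero lasso coefficients:", np.sum(lasso.coef_ != 0), "of", X.shape[1])
print(np.round(lasso.coef_, 3))
```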
I hope this blog post succeeded in introducing the shortcomings of OLS and the need for regularization in regression models. If so, keep visiting this blog for more insight into Data Analytics & Digital Marketing. You can also subscribe to blog posts using the subscription options available in the right sidebar.