Can we use the Normal Equation for Logistic Regression?

Just like we use the Normal Equation to find the optimum theta value in Linear Regression, can we use a similar formula for Logistic Regression? If not, why not? I'd be grateful if someone could explain the reasoning behind it. Thank you.



Solution 1:[1]

Unfortunately no. Only two methods in classification theory have closed-form solutions: linear regression and linear discriminant analysis / Fisher's discriminant.

In general it is considered something of a miracle that it "works" even for linear regression. As far as I know, it is nearly impossible to prove that "you cannot solve logistic regression in closed form"; however, the general understanding is that it will never be possible. You can do it if your features are binary only and you have very few of them (the solution is exponential in the number of features), which was shown a few years ago, but in the general case it is believed to be impossible.

So why did it work so well for linear regression? Because once you compute the derivatives, you will notice that the resulting problem is a set of linear equations: m equations in m variables, which we know can be solved directly through matrix inversion (and other techniques). When you differentiate the logistic regression cost, the resulting problem is no longer linear. It is convex (and thus has a global optimum), but not linear, and consequently current mathematics does not provide us with tools strong enough to find the optimum in a closed-form solution.
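The contrast above can be made concrete with a short sketch (NumPy; the data and variable names are illustrative, and the iterative solver shown is standard Newton's method / IRLS, one common way to handle the nonlinear system):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]  # design matrix with intercept
w_true = np.array([1.0, 2.0, -3.0])

# Linear regression: differentiating the squared-error cost gives the
# *linear* system (X^T X) theta = X^T y, solvable once in closed form.
y_lin = X @ w_true + 0.1 * rng.normal(size=100)
theta_lin = np.linalg.solve(X.T @ X, X.T @ y_lin)

# Logistic regression: differentiating the log-loss gives
# X^T (sigmoid(X theta) - y) = 0, which is *nonlinear* in theta,
# so we must iterate instead of inverting a matrix once.
p_true = 1.0 / (1.0 + np.exp(-X @ w_true))
y_log = (rng.random(100) < p_true).astype(float)  # noisy binary labels

theta = np.zeros(3)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ theta))   # current predicted probabilities
    W = p * (1.0 - p)                      # per-sample weights, change each step
    grad = X.T @ (p - y_log)               # gradient of the log-loss
    H = X.T @ (X * W[:, None])             # Hessian
    theta -= np.linalg.solve(H, grad)      # one Newton step
```

Note that the inner loop re-solves a linear system at every step because the weights `W` depend on the current `theta`; in linear regression that dependence disappears, which is exactly why a single closed-form solve suffices there.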

That being said, there exists a closed-form solution (absolutely impractical computationally) if all your input variables are categorical (they can only take finitely many values that you can enumerate): https://www.tandfonline.com/doi/abs/10.1080/02664763.2014.932760?journalCode=cjas20

Solution 2:[2]

Yes, if we could develop a mathematical tool to solve the differentiated form of the cost function, the way matrix inversion solves it in the case of linear regression. But no such tool is available up to now. So, for now, a big NO.

Solution 3:[3]

Yes, but I'm not sure to what extent. This works for binary logistic regression.

Note that we are dealing with logistic regression, not linear regression. So if we use the normal equation as-is, which is meant for linear regression, the solution for theta would fit only the y = 0s, not both the 1s and 0s.

The trick is to turn the binary logistic targets y (1s and 0s) into linear-regression targets. It is quite simple:

From the logistic function, y in terms of theta*x:

    y = 1 / (1 + e**(-theta*x))    # corresponds to linear regression's y = theta*x

Inverting to get theta*x in terms of y:

    theta*x = -ln(1/y - 1)

This means mapping the normal equation's y from {0, 1} into (-inf, inf). Since infinite values cannot be used in the normal equation, we can use an approximation such as [-99999, 99999]. The closer to infinity, the better.

In short, from y = [0s 1s] to y = [-99999s 99999s].
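The mapping above can be sketched as follows. This is the answerer's heuristic rather than standard practice, and the data and variable names (e.g. `big`) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.c_[np.ones(200), rng.normal(size=(200, 2))]  # design matrix with intercept
w_true = np.array([0.5, 2.0, -1.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

# Replace the binary labels with large finite stand-ins for
# logit(1) = +inf and logit(0) = -inf, as suggested above.
big = 99999.0
z = np.where(y == 1, big, -big)

# Ordinary normal equation applied to the transformed targets z.
theta = np.linalg.solve(X.T @ X, X.T @ z)

# Classify by the sign of the linear score.
preds = (X @ theta > 0).astype(float)
```

One caveat: since theta is linear in z, the choice of `big` only rescales theta uniformly; the decision boundary (the sign of `X @ theta`) is the same for any positive value, so this is really a least-squares classifier rather than a closer and closer approximation of logistic regression.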

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: (no author listed)
Solution 2: ANUJ KUMAR
Solution 3: vencoder