How to obtain RMSE out of lm result?
I know there is a small difference between $sigma and the concept of root mean squared error. So I am wondering: what is the easiest way to obtain the RMSE from the lm function in R?
res <- lm(randomData$price ~ randomData$carat +
            randomData$cut + randomData$color +
            randomData$clarity + randomData$depth +
            randomData$table + randomData$x +
            randomData$y + randomData$z)
length(coefficients(res))
returns 24, so I can no longer write the model out by hand.
So, how can I evaluate the RMSE based on the coefficients derived from lm?
Solution 1:[1]
Residual sum of squares:
RSS <- c(crossprod(res$residuals))
Mean squared error:
MSE <- RSS / length(res$residuals)
Root MSE:
RMSE <- sqrt(MSE)
Pearson estimated residual variance (the square of sigma as returned by summary.lm):
sig2 <- RSS / res$df.residual
Statistically, MSE is the maximum likelihood estimator of residual variance, but is biased (downward). The Pearson one is the restricted maximum likelihood estimator of residual variance, which is unbiased.
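The relationship between the two estimators can be checked numerically. The following is a minimal sketch that assumes the built-in mtcars data and an illustrative model (mpg ~ wt + hp); the object name fit and the choice of predictors are not from the question.

```r
# Illustrative fit on the built-in mtcars data (not the asker's data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

rss  <- sum(residuals(fit)^2)      # residual sum of squares
n    <- length(residuals(fit))     # number of observations used in the fit
mse  <- rss / n                    # ML estimate: divides by n
sig2 <- rss / fit$df.residual      # Pearson estimate: divides by residual df

# The two estimates differ only by the factor n / df.residual
ratio_ok <- isTRUE(all.equal(sig2, mse * n / fit$df.residual))

# summary.lm stores the square root of sig2 as sigma
sigma_ok <- isTRUE(all.equal(summary(fit)$sigma^2, sig2))
```

Because df.residual is smaller than n, mse is always the smaller of the two, which is the downward bias mentioned above.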
Remark
- Given two vectors x and y, c(crossprod(x, y)) is equivalent to sum(x * y) but much faster. c(crossprod(x)) is likewise faster than sum(x ^ 2).
- sum(x) / length(x) is also faster than mean(x).
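The equivalences in the remark can be verified with a short sketch. It assumes two arbitrary simulated vectors and checks only that the results agree; it does not measure speed.

```r
set.seed(1)                 # arbitrary seed so the sketch is reproducible
x <- rnorm(1000)
y <- rnorm(1000)

# crossprod() returns a 1 x 1 matrix; c() drops it to a plain number
dot_ok <- isTRUE(all.equal(c(crossprod(x, y)), sum(x * y)))
ss_ok  <- isTRUE(all.equal(c(crossprod(x)), sum(x^2)))

# Results may differ in the last digits, so compare with all.equal(), not ==
mean_ok <- isTRUE(all.equal(sum(x) / length(x), mean(x)))
```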
Solution 2:[2]
To get the RMSE in one line, with just functions from base, I would use:
sqrt(mean(res$residuals^2))
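The same one-liner extends to data the model has not seen, since predict() applies the fitted coefficients without writing the model out by hand. This is a sketch under assumptions: it uses the built-in mtcars data, an arbitrary row split, and a formula written with a data argument (predict() with newdata relies on plain column names in the formula, so it would not work as intended with randomData$ prefixes).

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# In-sample: observed minus fitted equals the stored residuals
rmse_resid  <- sqrt(mean(residuals(fit)^2))
rmse_fitted <- sqrt(mean((mtcars$mpg - fitted(fit))^2))

# Held-out rows: predict() applies the coefficients for you
train <- mtcars[1:24, ]            # arbitrary split, for illustration only
test  <- mtcars[25:32, ]
fit_train <- lm(mpg ~ wt + hp, data = train)
rmse_test <- sqrt(mean((test$mpg - predict(fit_train, newdata = test))^2))
```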
Solution 3:[3]
I think the other answers might be incorrect. The MSE of regression is the SSE divided by (n - k - 1), where n is the number of data points and k is the number of predictors (model parameters excluding the intercept).
Simply taking the mean of the residuals squared (as other answers have suggested) is the equivalent of dividing by n instead of (n - k - 1).
I would calculate RMSE by sqrt(sum(res$residuals^2) / res$df.residual).
The quantity in the denominator, res$df.residual, gives you the degrees of freedom, which is the same as (n - k - 1); res$df works too, but only through partial matching of the component name. Take a look at this for reference: https://www3.nd.edu/~rwilliam/stats2/l02.pdf
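A quick check of the degrees-of-freedom identity, as a sketch on the built-in mtcars data with an illustrative two-predictor model. Note that with factor predictors, k counts every dummy column, which is how a model with nine terms can end up with 24 coefficients.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

n <- nrow(mtcars)                 # 32 observations, none missing
k <- length(coef(fit)) - 1        # predictors, excluding the intercept

# Residual degrees of freedom stored by lm() match n - k - 1
df_ok <- fit$df.residual == n - k - 1

# The df-corrected quantity described in this answer
res_se <- sqrt(sum(fit$residuals^2) / fit$df.residual)
```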
Solution 4:[4]
Just do
sigma(res)
And you've got it.
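Note that sigma() divides the residual sum of squares by the residual degrees of freedom, so it is the residual standard error, not the divide-by-n RMSE. If the latter is wanted, one possible conversion is sketched below, assuming the built-in mtcars data and a recent version of R in which sigma(), nobs(), and df.residual() are available for lm objects.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

s <- sigma(fit)                                   # divides RSS by df.residual

# Rescale from the df.residual denominator to the n denominator
rmse_from_sigma <- s * sqrt(df.residual(fit) / nobs(fit))
rmse_direct     <- sqrt(mean(residuals(fit)^2))   # divides RSS by n
```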
Solution 5:[5]
Check out the rmse() function in the Metrics package.
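A minimal usage sketch, assuming the built-in mtcars data and that rmse() takes the observed values first and the predictions second. Since Metrics is a CRAN package that may not be installed, the call is guarded and a base R equivalent is computed alongside it.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Base R equivalent, always available
rmse_base <- sqrt(mean((mtcars$mpg - fitted(fit))^2))

# Metrics may not be installed, so guard the call
if (requireNamespace("Metrics", quietly = TRUE)) {
  rmse_pkg <- Metrics::rmse(actual = mtcars$mpg, predicted = fitted(fit))
  stopifnot(isTRUE(all.equal(rmse_pkg, rmse_base)))
}
```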
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | Arthur |
Solution 4 | JF Collin |
Solution 5 | jans |