How to exclude NA values in lm function (regression)?
I am doing a regression analysis with 70 countries. My dependent variable is 'Inequality' and my independent variable is 'Sanction'.
My original columns look as follows:
1: 'Year' (1914-2006; coded as 'numeric')
2-71: Sanction (binary; no data missing; coded as 'numeric')
72-141: Inequality (numeric; Gini coefficient; some data missing; coded as 'numeric')
In a first step, I bound the columns in R:
Inequality <- data.frame(Gini=c("I_1","I_2","I_3"...,"I_70"))
Sanction <- data.frame(Sanction=c("S_1","S_2","S_3"...,"S_70"))
model1<- cbind(Inequality, Sanction)
In a second step, I tried to run the regression (this is where the problem occurs):
model1 = lm(Gini ~Sanction, model1, na.action = na.exclude)
This is the error that appears:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
My question is: why does the lm function still complain about the NA values that exist in the Inequality columns (as some data is missing)? Why is it apparently not sufficient to use na.action = na.exclude?
I believe the mistake could be earlier in this process; that's why I have shown all of it.
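One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: `c("I_1", "I_2", ...)` builds a vector of column names (text), not the numbers stored in those columns, so `lm` tries to coerce the strings to numbers and every entry becomes NA. The example below assumes a hypothetical wide data frame `dat` with made-up values and only 3 countries instead of 70; the column names `S_1`, `I_1`, etc. and the name `long` are assumptions for illustration. It stacks the country columns into long format before fitting.

```r
# Hypothetical wide data: 3 countries instead of 70, made-up values.
dat <- data.frame(
  Year = 2001:2005,
  S_1 = c(0, 1, 0, 1, 0), S_2 = c(1, 1, 0, 0, 1), S_3 = c(0, 0, 1, 1, 0),
  I_1 = c(30.1, NA, 31.4, 32.0, 29.8),
  I_2 = c(41.2, 40.7, NA, 39.9, 42.3),
  I_3 = c(NA, 35.5, 36.1, 34.8, 35.0)
)

# A vector of quoted names is text, not the data behind those names.
names_only <- data.frame(Gini = c("I_1", "I_2", "I_3"),
                         stringsAsFactors = FALSE)
is.character(names_only$Gini)                  # the column holds strings
suppressWarnings(as.numeric(names_only$Gini))  # coercion yields NA everywhere

# Stack the country columns into long format: one row per country-year.
# unlist() concatenates the columns in order, matching rep() below.
n_countries <- 3
long <- data.frame(
  Year     = rep(dat$Year, times = n_countries),
  Country  = rep(seq_len(n_countries), each = nrow(dat)),
  Sanction = unlist(dat[paste0("S_", seq_len(n_countries))], use.names = FALSE),
  Gini     = unlist(dat[paste0("I_", seq_len(n_countries))], use.names = FALSE)
)

# With a numeric response, na.exclude drops incomplete rows for fitting
# and pads residuals/fitted values with NA at those positions.
fit <- lm(Gini ~ Sanction, data = long, na.action = na.exclude)
length(residuals(fit))  # same length as nrow(long)
```

With genuinely numeric columns, `na.action = na.exclude` handles the missing Gini values as expected; the NAs in the original error appear to come from the coercion of text, not from the missing data itself.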
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow