Correlation problems with two variables with NA
I have two variables and I want to know if they are correlated. They are distributed like this:
X = 14,15,16,18,12,13,14,15
Y = NA, 13,12, NA, NA, 16,16, NA
When I run
cor(X, Y)
I get NA.
Solution 1:[1]
If you can tolerate omitting all points for which NA appears in either X or Y, then you can call cor() with the option use='complete.obs':
X <- c(14, 15, 16, 18, 12, 13, 14, 15)
Y <- c(NA, 13, 12, NA, NA, 16, 16, NA)
cor(X, Y, use='complete.obs', method='pearson')
[1] -0.9393364
You can verify for yourself that the above result is the same as using:
X <- c(15, 16, 13, 14)
Y <- c(13, 12, 16, 16)
cor(X, Y, method='pearson')
i.e. just dropping those data points for which either X or Y has an NA value.
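If you would rather not retype the surviving values by hand, the same check can be sketched with complete.cases() from base R. The names keep, r_manual and r_builtin are only illustrative and are not part of the original answer:

```r
X <- c(14, 15, 16, 18, 12, 13, 14, 15)
Y <- c(NA, 13, 12, NA, NA, 16, 16, NA)

# TRUE only at positions where neither X nor Y is NA
keep <- complete.cases(X, Y)

# Number of pairs that actually enter the calculation
sum(keep)

# Correlation on the manually filtered pairs
r_manual <- cor(X[keep], Y[keep], method = "pearson")

# Correlation with the filtering left to cor()
r_builtin <- cor(X, Y, use = "complete.obs", method = "pearson")

# The two values should agree up to floating-point tolerance
all.equal(r_manual, r_builtin)
```

Only four pairs remain after the NA values are dropped, so the estimate rests on very few observations and should be read with that in mind.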
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Tim Biegeleisen |