Little's MCAR test in Python
How can I run Little's test in Python to check whether data are missing completely at random (MCAR)? I have looked at the R package for this test, but I want to do it in Python. Is there an alternative approach to testing for MCAR?
Solution 1:[1]
You can use rpy2 to call the MCAR test from R. Note that using rpy2 requires writing a little R code.
Set up rpy2 in Google Colab
# rpy2 libraries
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2.robjects import globalenv
# Import R's base package
base = importr("base")
# Import R's utility packages
utils = importr("utils")
# Select mirror
utils.chooseCRANmirror(ind=1)
# For automatic translation of Pandas objects to R
pandas2ri.activate()
# Enable R magic
%load_ext rpy2.ipython
# Make your Pandas dataframe accessible to R
globalenv["r_df"] = df
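Before handing the frame to R, it can help to check in pandas which columns actually contain missing values and how many distinct missingness patterns there are. A minimal sketch, assuming a small made-up frame (example_df and its column names are placeholders, not part of the original setup):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for your own dataframe
example_df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, 4.0],
    "col2": [np.nan, 2.0, 3.0, 4.0],
    "col3": [1.0, 2.0, 3.0, 4.0],
})

# Names of the columns that contain at least one missing value
cols_with_na = example_df.columns[example_df.isna().any()].tolist()

# Number of distinct row-wise missingness patterns
n_patterns = example_df.isna().drop_duplicates().shape[0]

print(cols_with_na)
print(n_patterns)
```

The list in cols_with_na is what you would otherwise have to type by hand when restricting the test to columns with missing data.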
You can now get R functionality within your Python environment by using R magics. Use %R for a single line of R code and %%R when the whole cell should be interpreted as R code.
To install an R package use:
utils.install_packages("package_name")
You may also need to load it before it can be used:
%R library(package_name)
For Little's MCAR test, we should install the naniar package. Its installation is slightly more complicated, as we also need to install remotes to download it from GitHub; for other packages the general procedure should be enough.
utils.install_packages("remotes")
%R remotes::install_github("njtierney/naniar")
Load the naniar package:
%R library(naniar)
Pass your r_df to the mcar_test function:
# mcar_test on whole df
%R mcar_test(r_df)
If an error occurs, try including only the columns with missing data:
%%R
# mcar_test on columns with missing data
r_dfMissing <- r_df[c("col1", "col2", "col3")]
mcar_test(r_dfMissing)
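If you would rather avoid R altogether, the test statistic can also be sketched directly with NumPy and SciPy. The code below is an illustrative approximation, not a port of naniar's code: it assumes all columns are numeric, estimates the mean and covariance with a plain EM loop under a multivariate normal model, and omits the small-sample covariance correction some formulations apply, so results may differ slightly from R. The function names em_mvn and little_mcar_test are made up for this example.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2


def em_mvn(X, max_iter=200, tol=1e-6):
    """EM estimates of mean and covariance for an array containing NaNs."""
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(max_iter):
        filled = X.copy()
        extra = np.zeros((p, p))  # summed conditional covariances of missing parts
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            # Regress the missing coordinates on the observed ones
            coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + extra) / n
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma


def little_mcar_test(df):
    """Chi-square statistic, degrees of freedom and p-value for Little's test."""
    X = df.to_numpy(dtype=float)
    X = X[~np.isnan(X).all(axis=1)]  # rows with nothing observed carry no information
    p = X.shape[1]
    mu, sigma = em_mvn(X)
    patterns, inverse = np.unique(np.isnan(X), axis=0, return_inverse=True)
    inverse = inverse.ravel()
    d2, dof = 0.0, -p
    for j, pattern in enumerate(patterns):
        o = ~pattern
        rows = X[inverse == j][:, o]
        # Distance between this pattern's observed means and the overall estimate
        diff = rows.mean(axis=0) - mu[o]
        d2 += rows.shape[0] * (diff @ np.linalg.solve(sigma[np.ix_(o, o)], diff))
        dof += int(o.sum())
    return {"statistic": float(d2), "df": dof,
            "p_value": float(chi2.sf(d2, dof)),
            "missing_patterns": len(patterns)}


# Demo data: missingness depends only on row position, not on the values
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
demo.iloc[50:80, 0] = np.nan
demo.iloc[80:, 2] = np.nan
result = little_mcar_test(demo)
print(result)
```

Under the null hypothesis that the data are MCAR, the statistic is approximately chi-square distributed, so a small p-value is evidence against MCAR. For anything beyond a quick check, comparing the output against the R result on the same data is a sensible precaution.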
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |