Data partitioning with createDataPartition: cross-validation problem
I am trying to get predictions from a model with multiple variables. The data set, eplt, is made of 7 scores and one final exam score, moy_exam2. I want to predict the latter using the 7 scores. I have 29441 obs, like this:
'data.frame': 19643 obs. of 8 variables:
$ HG : num 11.5 14 7.5 10.5 9.5 9.5 10 14 11.5 14 ...
$ Math : num 8 7.25 9.25 13.25 4.25 ...
$ Ar : num 11.2 12.8 8.5 11.5 9.5 ...
$ Fr : num 4 4.25 6.5 6.75 5.5 ...
$ EI : num 8 10.5 2.5 4 7 9.5 8.5 9.5 12 14 ...
$ SVT : num 5.25 9.25 7 11.5 12.5 ...
$ PC : num 11.5 16.75 4.25 13.75 10 ...
$ moy_exam2: num 8.15 9.48 7.23 10.33 7.44 ...
I decided on 85% for training and 15% for testing the model, so to partition the data with createDataPartition I tried this:
# Load the data
data("neplt")
# Inspect the data
library(tidyverse)
sample_n(neplt, 3)
# createDataPartition, R2, RMSE and MAE come from caret
library(caret)
library(Rcpp)
# Split the data into training and test sets
set.seed(1, sample.kind = "Rounding")
# The outcome is piped in as the first argument, so the data frame is not passed again
training.samples <- neplt$moy_exam2 %>%
  createDataPartition(p = 0.85, list = FALSE, times = 1)
train.data <- neplt[training.samples, ]
test.data <- neplt[-training.samples, ]
# Build the model
model <- lm(moy_exam2 ~ ., data = train.data, na.action = na.omit)
# Make predictions and compute the R2, RMSE and MAE
predictions <- model %>% predict(test.data)
data.frame(
  R2 = R2(predictions, test.data$moy_exam2),
  RMSE = RMSE(predictions, test.data$moy_exam2),
  MAE = MAE(predictions, test.data$moy_exam2)
)
I get the error
Error in split_indices(as.integer(splitv), attr(splitv, "n")) :
function 'Rcpp_precious_remove' not provided by package 'Rcpp'
I don't use any split_indices function here, and Rcpp is already loaded. So I continue executing, but the program gets stuck on the createDataPartition line.
I cleaned the eplt data using na.omit and also with na.exclude to remove any doubt about NA missing values. Then I tried adding the sample.kind = "Rounding" argument to set.seed to get it to work. Still, RStudio keeps loading indefinitely, and the console shows a + sign.
Is this related to memory capacity? Or is the number of possible samples so large that it could not finish in 100 years? It has been running for hours with no results!
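For a sense of scale, a random 85/15 split of roughly 19643 rows is a cheap operation. The sketch below uses base R's sample() on simulated data instead of createDataPartition; the columns x1, x2 and y and their values are made up for illustration and are not the real scores. If something like this finishes at once, data size alone is unlikely to explain a run lasting hours.

```r
# Simulated stand-in for the real data (hypothetical columns, not the actual scores)
set.seed(1)
n <- 19643
sim <- data.frame(
  x1 = runif(n, 0, 20),
  x2 = runif(n, 0, 20)
)
sim$y <- 0.5 * sim$x1 + 0.3 * sim$x2 + rnorm(n)

# Plain random 85/15 split with base R (no stratification, unlike createDataPartition)
train_idx <- sample(seq_len(n), size = floor(0.85 * n))
train <- sim[train_idx, ]
test <- sim[-train_idx, ]

# Fit a linear model on the training rows and score it on the held-out rows
fit <- lm(y ~ ., data = train)
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((pred - test$y)^2))
```

Note that sample() draws rows uniformly, whereas createDataPartition balances the split on the outcome, so the two are not interchangeable for the final analysis.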
Solution 1:[1]
I had a similar problem and error code when running summarySE. It seems like others have had issues like this too: Rcpp package doesn't include Rcpp_precious_remove
I installed and loaded Rcpp again and it worked thereafter!
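Roughly, what I did can be sketched as below. The helper needs_rcpp_update is a hypothetical name of mine, and the 1.0.7 threshold is an assumption about the release in which Rcpp_precious_remove first appeared, so treat it as a guess rather than a documented requirement. It is probably best run in a fresh R session before any package that depends on Rcpp is loaded.

```r
# Hypothetical helper: TRUE when the installed version is older than the assumed minimum
needs_rcpp_update <- function(installed, minimum = "1.0.7") {
  package_version(installed) < package_version(minimum)
}

# Look up the installed Rcpp version; treat a missing package as version 0.0.0
installed <- tryCatch(
  as.character(packageVersion("Rcpp")),
  error = function(e) "0.0.0"
)

# Reinstall only in an interactive session and only when the version looks too old
if (interactive() && needs_rcpp_update(installed)) {
  install.packages("Rcpp")
}
```

After reinstalling, restart R and load the packages again before rerunning the createDataPartition call.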
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | TylerH |