'Vectorised argument for a function in R. The function gives out multiple data frames, whereas I'd like it to output only one

I'd like to compute trimmed mean for each trimming proportion alpha, and then see which trimming proportion gives the minimal variance of the trimmed means, when Bootstrap simulations of size N=200 are applied. The problem that I have, is that when I try to create a data frame of column1 = mean and column2 = variance, the code that I wrote creates each output of mean and variance as separate data frame, so I cannot look up the trimming proportion and trimmed mean which have the minimal variance.

The function gives out "data.frame" 9 times. I guess it's because the alpha argument is vectorized. The code:

 tmean_var <- function(n,N,alpha){
      set.seed(1)
      data <- rnorm(n)
      data_aug1 <- c(data, -data)
      data_aug2 <- c(data, 2 * median(data) - data)
      est <- data.frame()
      tmean <- replicate(N, {
      sample <- base::mean(sample(x = data[(round(alpha*n)+1):(n-round(alpha*n))],
                                    size = n-2*round(alpha*n), replace = TRUE))
    })
      
    mean <- base::mean(tmean)
    var <- var(tmean) * (n-2*round(alpha * n))
    df <- data.frame(mean = mean, var = var)
    class(df)
    }
    f <- Vectorize(tmean_var, vectorize.args = "alpha")
    f(n,N,alpha)

How could I make the output to be one dataframe not nine?

r


Solution 1:[1]

This should do it. Rather than try to use Vectorize() on a function that doesn't inherently take vector arguments, you could just use sapply() and lapply() across the values of alpha you provide as below:

tmean_var <- function(n,N,alpha){
  set.seed(1)
  data <- rnorm(n)
  data_aug1 <- c(data, -data)
  data_aug2 <- c(data, 2 * median(data) - data)
  est <- data.frame()
  tmean <- lapply(alpha, function(a){replicate(N, {
    sample <- base::mean(sample(x = data[(round(a*n)+1):(n-round(a*n))],
                                size = n-2*round(a*n), replace = TRUE))
    })
  })
  
  mean <- sapply(tmean, mean)
  var <- sapply(seq_along(tmean), function(i)var(tmean[[i]]) * (n-2*round(alpha[i] * n)))
  df <- data.frame(mean = mean, var = var, alpha=alpha)
#  class(df)
}

out <- tmean_var(100, 200, c(.1, .2, .3))
out
#>         mean       var alpha
#> 1 0.10555709 0.8066377   0.1
#> 2 0.06868891 0.8331401   0.2
#> 3 0.21791984 0.9024612   0.3

Created on 2022-05-13 by the reprex package (v2.0.1)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 DaveArmstrong