map function based on a column value of dataframes stored in a list

I am trying to use map() to do something fairly complex. I have a list of monthly data frames (which should be kept separate), and for each one I'd like to combine the values of the Result column with an external vector, where the vector to use depends on a categorical variable (cat) inside the data frame. I defined two functions to pass inside map(), but I am getting an error. Ideally I would also like to store the new values in a new column of each data frame in the list, but I am not sure how to do that with mutate() given that the object is a list.

Thanks a lot

rm(list = ls())

setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
#> Error: RStudio not running
getwd()
#> [1] "C:/Users/Angela/AppData/Local/Temp/RtmpYhghZ1/reprex-3a1458d04eec-pure-tayra"

#load required packages 
library(mc2d)
#> Loading required package: mvtnorm
#> 
#> Attaching package: 'mc2d'
#> The following objects are masked from 'package:base':
#> 
#>     pmax, pmin
library(gplots)
#> 
#> Attaching package: 'gplots'
#> The following object is masked from 'package:stats':
#> 
#>     lowess
library(RColorBrewer)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(reprex)
library(tidyverse)
set.seed(99)
iters<-1000

df<-data.frame(id=c(1:30),cat=c(rep("a",12),rep("b",18)),month=c(1:6,1,6,4,1,5,2,3,2,5,4,6,3:6,4:6,1:5,5),n=rpois(30,5))

df$n[df$n == 0] <- 3
se<-rbeta(iters,96,6)
epi.a<-rpert(iters,min=1.5, mode=2, max=3)
p=0.2
p2=epi.a*p

df<-as_tibble(df)
# this function ensures each `n` from `df` is iterated over the 1000 values of `se` and `p2`, generating 1000 results
iter_n <- function(n) map2_dbl(.x = se, .y = p2, ~ 1 - (1 - .x * .y) ^ n)
list_1 <- df %>% mutate(Result = map(n, ~iter_n(.x))) %>% unnest(Result)%>% group_split(month)
list_1[[1]]
#> # A tibble: 4,000 x 5
#>       id cat   month     n Result
#>    <int> <chr> <dbl> <dbl>  <dbl>
#>  1     1 a         1     5  0.953
#>  2     1 a         1     5  0.927
#>  3     1 a         1     5  0.904
#>  4     1 a         1     5  0.945
#>  5     1 a         1     5  0.872
#>  6     1 a         1     5  0.840
#>  7     1 a         1     5  0.896
#>  8     1 a         1     5  0.944
#>  9     1 a         1     5  0.925
#> 10     1 a         1     5  0.937
#> # ... with 3,990 more rows

p3a=rbeta(iters,50,5)
p3b=rbeta(iters,40,6)

iter_n2a<-function(Result) map_dbl(p3a, ~ prod(1 - Result * .x))
iter_n2b<-function(Result) map_dbl(p3b, ~ prod(1 - Result * .x))

list_2 <- list_1%>% map( ~ mutate(n_p = if_else(.x$cat == "a",
                                    map(.x$Result,  ~ iter_n2a(.x)),
                                    map(.x$Result,  ~ iter_n2b(.x)))))
#> Error in UseMethod("mutate"): no applicable method for 'mutate' applied to an object of class "list"

Created on 2022-05-06 by the reprex package (v2.0.1)



Solution 1:[1]

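If you use dplyr::rowwise() to set the grouping per data frame in list_1, it will compute without error. Note that the data frame is passed into the pipeline explicitly (rowwise(.x) %>% mutate(...)); in the original call, mutate() was never given a data frame as its first argument, hence the "applied to an object of class list" error. The example below only processes the first two data frames (list_1[1:2]); drop the subsetting to run it on the whole list: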

list_2 <- list_1[1:2] %>% map( ~ rowwise(.x) %>%
                            mutate(n_p = if_else(cat == "a",
                                                map(Result,  ~ iter_n2a(.x)),
                                                map(Result,  ~ iter_n2b(.x)))))
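
The same pattern can be tried on a much smaller example. This is only a sketch: toy_list and the multipliers in mult are made-up stand-ins (not from the question), and the point is just that mutate() gets .x as its first argument inside map():

```r
library(dplyr)
library(purrr)

# Toy stand-in for list_1: a small tibble split into one data frame per month
toy_list <- tibble(
  cat    = c("a", "b", "a", "b"),
  month  = c(1, 1, 2, 2),
  Result = c(0.9, 0.8, 0.7, 0.6)
) %>%
  group_split(month)

# Hypothetical per-category multipliers, used only for illustration
mult <- c(a = 0.5, b = 0.25)

# Pass .x to mutate() explicitly so it receives a data frame;
# the multiplier is looked up by the value of `cat` in each row
toy_out <- map(toy_list, ~ mutate(.x, scaled = Result * unname(mult[cat])))

toy_out[[1]]
```

Each element of toy_out is still a separate data frame, now with the extra scaled column.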


list_2[[1]][1,"n_p"] %>% 
  unlist()
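
Since n_p is a list-column (one vector of simulated values per row), it may be easier to inspect by summarising or unnesting it instead of calling unlist(). A minimal sketch on a made-up tibble (toy and its values are assumptions, not the real output):

```r
library(dplyr)
library(purrr)
library(tidyr)

# Toy tibble with a list-column, mimicking the shape of n_p
toy <- tibble(
  id  = 1:2,
  n_p = list(c(0.1, 0.2, 0.3), c(0.4, 0.5, 0.6))
)

# Option 1: one summary number per row
toy_summary <- toy %>% mutate(n_p_mean = map_dbl(n_p, mean))

# Option 2: one row per simulated value
toy_long <- toy %>% unnest(n_p)
```

The same two calls could be applied to each element of list_2 with map().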

I have no idea if those values are sensible (i.e. whether the calculation is correct).
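
One thing that can be checked is the arithmetic itself. With rowwise(), Result is a single number per row, and prod() of a single number is that number, so iter_n2a(r) should equal 1 - r * p3a. The sketch below re-creates p3a and iter_n2a from the question and uses an arbitrary value r = 0.9 (an assumption for illustration); it says nothing about whether the formula is the right one for the problem:

```r
library(purrr)

set.seed(99)
p3a <- rbeta(1000, 50, 5)  # same shape parameters as in the question

iter_n2a <- function(Result) map_dbl(p3a, ~ prod(1 - Result * .x))

r <- 0.9                   # arbitrary single Result value

by_map    <- iter_n2a(r)
by_vector <- 1 - r * p3a   # vectorised equivalent for a scalar Result

all.equal(by_map, by_vector)
range(by_map)
```

If the two agree, the map_dbl()/prod() wrapper adds nothing for scalar input, and a plain vectorised expression would give the same numbers. If a product over several Result values was intended, Result would need to be passed as a vector rather than row by row.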

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Nate