map function based on a column value of dataframes stored in a list
I am trying to use the map function for something fairly complex. I have a list of dataframes (these are monthly dataframes and should be kept separate), and for each one I would like to combine the values of the Result column with an external vector, where the vector used should depend on a categorical variable inside the dataframe. I therefore defined two different functions to pass inside map, but I am getting an error. Ideally I would also create a new column in each dataframe of the list to store the new values, but I am not sure how to do that with "mutate" given that the object is a list.
Thanks a lot
rm(list = ls())
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
#> Error: RStudio not running
getwd()
#> [1] "C:/Users/Angela/AppData/Local/Temp/RtmpYhghZ1/reprex-3a1458d04eec-pure-tayra"
#load required packages
library(mc2d)
#> Loading required package: mvtnorm
#>
#> Attaching package: 'mc2d'
#> The following objects are masked from 'package:base':
#>
#> pmax, pmin
library(gplots)
#>
#> Attaching package: 'gplots'
#> The following object is masked from 'package:stats':
#>
#> lowess
library(RColorBrewer)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(reprex)
library(tidyverse)
set.seed(99)
iters<-1000
df<-data.frame(id=c(1:30),cat=c(rep("a",12),rep("b",18)),month=c(1:6,1,6,4,1,5,2,3,2,5,4,6,3:6,4:6,1:5,5),n=rpois(30,5))
df$n[df$n == 0] <- 3
se<-rbeta(iters,96,6)
epi.a<-rpert(iters,min=1.5, mode=2, max=3)
p=0.2
p2=epi.a*p
df<-as_tibble(df)
# this function ensures any `n` from `df` is iterated over the 1000 values of `se` and `p2`, generating 1000 results
iter_n <- function(n) map2_dbl(.x = se, .y = p2, ~ 1 - (1 - .x * .y) ^ n)
list_1 <- df %>% mutate(Result = map(n, ~iter_n(.x))) %>% unnest(Result)%>% group_split(month)
list_1[[1]]
#> # A tibble: 4,000 x 5
#> id cat month n Result
#> <int> <chr> <dbl> <dbl> <dbl>
#> 1 1 a 1 5 0.953
#> 2 1 a 1 5 0.927
#> 3 1 a 1 5 0.904
#> 4 1 a 1 5 0.945
#> 5 1 a 1 5 0.872
#> 6 1 a 1 5 0.840
#> 7 1 a 1 5 0.896
#> 8 1 a 1 5 0.944
#> 9 1 a 1 5 0.925
#> 10 1 a 1 5 0.937
#> # ... with 3,990 more rows
p3a=rbeta(iters,50,5)
p3b=rbeta(iters,40,6)
iter_n2a<-function(Result) map_dbl(p3a, ~ prod(1 - Result * .x))
iter_n2b<-function(Result) map_dbl(p3b, ~ prod(1 - Result * .x))
list_2 <- list_1%>% map( ~ mutate(n_p = if_else(.x$cat == "a",
map(.x$Result, ~ iter_n2a(.x)),
map(.x$Result, ~ iter_n2b(.x)))))
#> Error in UseMethod("mutate"): no applicable method for 'mutate' applied to an object of class "list"
Created on 2022-05-06 by the reprex package (v2.0.1)
Solution 1:[1]
If you use dplyr::rowwise() to set row-wise grouping on each dataframe in list_1, it will compute without error:
list_2 <- list_1[1:2] %>% map( ~ rowwise(.x) %>%
mutate(n_p = if_else(cat == "a",
map(Result, ~ iter_n2a(.x)),
map(Result, ~ iter_n2b(.x)))))
list_2[[1]][1,"n_p"] %>%
unlist()
I have no idea whether those values are sensible (i.e. whether the calculation is correct).
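As a possible alternative, here is a minimal sketch on toy data. It rests on two assumptions. First, the error in the question appears to come from mutate() being called without a dataframe as its first argument, so the sketch passes .x explicitly. Second, Result is a single number per row, in which case prod(1 - Result * p) is just 1 - Result * p. The names toy_list, p3 and toy_out are hypothetical and only serve the illustration; p3 stands in for p3a and p3b, shortened to 5 draws.

```r
library(dplyr)
library(purrr)

set.seed(1)

# Hypothetical toy data: two small "monthly" dataframes kept in a list
toy_list <- list(
  tibble(cat = c("a", "b"), Result = c(0.9, 0.8)),
  tibble(cat = c("b", "a"), Result = c(0.7, 0.6))
)

# Hypothetical external vectors, one per category (stand-ins for p3a / p3b)
p3 <- list(a = rbeta(5, 50, 5), b = rbeta(5, 40, 6))

# Pass each dataframe (.x) to mutate() explicitly and build a list-column.
# The inner function uses named arguments (r, k) so that it does not clash
# with the outer lambda's .x; k selects the vector matching the row's cat.
toy_out <- map(toy_list, ~ mutate(
  .x,
  n_p = map2(Result, cat, function(r, k) 1 - r * p3[[k]])
))

toy_out[[1]]$n_p[[1]]
```

Each element of n_p is then a numeric vector with one value per draw. If a long format is preferred, tidyr::unnest(n_p) could be applied to each dataframe afterwards, as the question already does for Result.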
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Nate |