purrr::pmap() with a named list of columns and a '~' function doesn't respect the names from .l?

This is my first time posting, so bear with me. I'm using purrr::pmap() to map a function over 3 columns of a tibble() to create a 4th column:

library(tidyverse)

set.seed(123)

df <- tibble(a = as.character(1:3), b = sample(LETTERS, 3), c = sample(letters, 3))
df

For the sake of argument, the mutate() below produces the expected result for this simplistic example, but my use case is more complex than str_c(...): it triggers a SQL query based on the contents of columns a, b and c (plus some other data transforms).

df %>%
  mutate(str = str_c(a, b, c))

Using an un-named list in pmap(.l) and ~str_c() generates the expected outcome as below:

df %>% 
  mutate(str = pmap_chr(list(a, b, c), 
                        ~str_c(..1, ..2, ..3)))

Naming the columns in the .l argument and assigning the function through function() also works as expected:

df %>% 
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c), 
                        function(list_a, list_b, list_c) str_c(list_a, list_b, list_c)))

But what I can't understand is why the following fails with the error that object 'list_a' is not found:

df %>% 
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c), 
                        ~str_c(list_a, list_b, list_c))) 

Obviously the issue is simple to resolve by using function() instead of ~, but I'd like to understand why ~ function assignment doesn't seem to respect the names assigned in .l for my understanding of R.



Solution 1:[1]

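We can do this without using pmap:

library(dplyr)
library(purrr)
library(stringr)  # provides str_c()
df %>%
    # invoke() and cur_data() are deprecated in recent releases of purrr and dplyr;
    # do.call(str_c, pick(everything())) is the current equivalent
    mutate(str = invoke(str_c, cur_data()))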

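The idea of using named arguments is that the names should match the formal arguments of the function. With function(list_a, list_b, list_c), the names in .l are matched to those arguments. A ~ formula, however, is converted into a function whose only arguments are ..., .x, .y and ., so the named elements end up inside ... and no variable called list_a exists in the function body. str_c itself takes a variadic argument, i.e. any number of inputs: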

str_c(..., sep = "", collapse = NULL)

For that reason, it can be called with the columns listed one by one, and that call is vectorized. pmap, by contrast, loops over each row, which is not really needed for str_c.

Just as with paste, we can use either do.call or invoke (from purrr) to pass the variadic arguments. cur_data() returns the current data, and a data.frame/tibble is a list whose elements (columns) have equal length.
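As a minimal sketch of the do.call idea outside of mutate(), the example below rebuilds the question's data by hand (the values are copied from the output shown in this answer, so no random sampling is involved) and passes the tibble's columns to str_c as its variadic arguments. The variable name out is only illustrative.

```r
library(tibble)
library(stringr)

# Same shape as the question's df, with the values written out explicitly
df <- tibble(a = c("1", "2", "3"),
             b = c("O", "S", "N"),
             c = c("c", "j", "r"))

# A tibble is a list of columns, so do.call() hands each column to str_c()
# as one argument; this is equivalent to str_c(df$a, df$b, df$c)
out <- do.call(str_c, df)
out
# "1Oc" "2Sj" "3Nr"
```

Because str_c is vectorized, a single call handles all rows at once, with no row-wise loop.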

To check what pmap passes to the function, concatenate all the elements with c(...). For each row this returns a named vector, whose elements can be accessed by name (if needed):

df %>% 
   mutate(str = pmap(list(list_a = a, list_b = b, list_c = c), 
                         ~ c(...))) %>% 
   pull(str)
[[1]]
list_a list_b list_c 
   "1"    "O"    "c" 

[[2]]
list_a list_b list_c 
   "2"    "S"    "j" 

[[3]]
list_a list_b list_c 
   "3"    "N"    "r" 

i.e. the elements can then be picked out by name:

df %>% 
   mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                         ~ {
                           tmp <- c(...)
                           str_c(tmp["list_a"], tmp["list_b"], tmp["list_c"])
                         }))
# A tibble: 3 × 4
  a     b     c     str  
  <chr> <chr> <chr> <chr>
1 1     O     c     1Oc  
2 2     S     j     2Sj  
3 3     N     r     3Nr  
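To see why the formula version cannot find list_a, it may help to inspect the function that purrr builds from the formula. The sketch below uses as_mapper() for that, then shows one possible workaround (an assumption of this example, not something from the question): capturing ... as a named list and evaluating the expression inside it with with(). The names f, l and out and the sample values are illustrative only.

```r
library(purrr)
library(stringr)

# A formula becomes a function whose formals are only `...`, `.x`, `.y`
# and `.`; there is no formal called list_a for the names in .l to match
f <- as_mapper(~ str_c(list_a, list_b, list_c))
names(formals(f))
# "..." ".x" ".y" "."

# The named elements of .l therefore arrive inside `...`. Capturing them
# with list(...) makes them reachable by name again.
l <- list(list_a = c("1", "2"), list_b = c("A", "B"), list_c = c("x", "y"))
out <- pmap_chr(l, ~ with(list(...), str_c(list_a, list_b, list_c)))
out
# "1Ax" "2By"
```

With function(list_a, list_b, list_c) the names are matched directly to formals, which is why that version in the question works without any of this.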

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
