purrr::pmap with a named list of columns and a '~' function doesn't respect the names from .l?
This is my first time posting, so bear with me. I'm using purrr::pmap() to map a function over 3 columns of a tibble() to create a 4th column:
library(tidyverse)
set.seed(123)
df <- tibble(a = as.character(1:3), b = sample(LETTERS, 3), c = sample(letters, 3))
df
For the sake of argument, the mutate() below produces the expected outcome for this simplified example, but my use case is more complex than str_c(...): it triggers a SQL query based on the content of columns a, b, c (plus some other data transforms).
df %>%
  mutate(str = str_c(a, b, c))
Using an unnamed list in pmap(.l) with ~str_c() generates the expected outcome, as below:
df %>%
  mutate(str = pmap_chr(list(a, b, c),
                        ~str_c(..1, ..2, ..3)))
Naming the columns in the .l argument and defining the function with function() also works as expected:
df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                        function(list_a, list_b, list_c) str_c(list_a, list_b, list_c)))
But what I can't understand is why the following fails with the error that object 'list_a' is not found:
df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                        ~str_c(list_a, list_b, list_c)))
Obviously the issue is simple to resolve by using function() instead of ~, but, to improve my understanding of R, I'd like to know why a function defined with ~ doesn't seem to respect the names assigned in .l.
Solution 1:[1]
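We can do this without using pmap:
library(dplyr)
library(purrr)
library(stringr)  # provides str_c
df %>%
  # cur_data() is the current data; invoke() calls str_c with its columns as arguments
  mutate(str = invoke(str_c, cur_data()))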
The idea of using named arguments is that the names should match the arguments of the function. Here, str_c takes a variadic argument, i.e. any number of inputs:
str_c(..., sep = "", collapse = NULL)
For that reason, it can be called by passing the columns one by one, which is the vectorized option. pmap, by contrast, loops over each row, which is not really needed for str_c.
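To make the point about matching argument names concrete, here is a minimal sketch of what a ~ formula turns into. It assumes purrr is installed and uses paste0 in place of str_c to avoid an extra dependency; the input values are illustrative. The function built from a formula has no arguments named list_a, list_b or list_c, so the names from .l have nothing to bind to and end up inside `...`.

```r
library(purrr)

# as_mapper() is what purrr uses to turn a formula into a function
f <- as_mapper(~ paste0(..1, ..2, ..3))

# The only formal arguments are `...`, `.x`, `.y` and `.`;
# nothing is called list_a, list_b or list_c
names(formals(f))

# Named inputs are collected by `...`: they stay reachable by
# position (..1, ..2, ..3) but are not variables in the body
f(list_a = "1", list_b = "O", list_c = "c")
# → "1Oc"
```

With function(list_a, list_b, list_c), by contrast, the names in .l match the formal arguments directly, which is why that version works.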
Just like with paste, we can use either do.call or invoke (from purrr) to pass variadic arguments. cur_data() returns the current data, and a data.frame/tibble is a list with elements/columns of equal length.
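The do.call route can be sketched in base R alone. The data frame below is a hypothetical stand-in for the question's df with fixed values, and paste0 stands in for str_c; do.call is used rather than invoke, which newer purrr releases mark as deprecated.

```r
# Hypothetical stand-in for the question's df, with fixed values
df <- data.frame(a = c("1", "2", "3"),
                 b = c("O", "S", "N"),
                 c = c("c", "j", "r"),
                 stringsAsFactors = FALSE)

# A data frame is a list of columns, so do.call() passes each
# column to paste0() as a separate (named) argument
res <- do.call(paste0, df)
res
# → "1Oc" "2Sj" "3Nr"
```

The column names a, b and c are passed along as argument names, but since paste0 only has `...` ahead of its own named arguments, they are simply absorbed there.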
If we want to check what pmap passes to the function, we can concatenate all the elements with c(...): each call returns a named vector whose elements can be accessed by name (if needed):
df %>%
  mutate(str = pmap(list(list_a = a, list_b = b, list_c = c),
                    ~ c(...))) %>%
  pull(str)
[[1]]
list_a list_b list_c
   "1"    "O"    "c"

[[2]]
list_a list_b list_c
   "2"    "S"    "j"

[[3]]
list_a list_b list_c
   "3"    "N"    "r"
That is, the names can be used to index the vector inside the lambda:
df %>%
  mutate(str = pmap_chr(list(list_a = a, list_b = b, list_c = c),
                        ~ {
                          tmp <- c(...)
                          str_c(tmp["list_a"], tmp["list_b"], tmp["list_c"])
                        }))
# A tibble: 3 × 4
  a     b     c     str
  <chr> <chr> <chr> <chr>
1 1     O     c     1Oc
2 2     S     j     2Sj
3 3     N     r     3Nr
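A variation on the same idea, offered as a sketch rather than as part of the original answer: instead of indexing a vector built with c(...), the lambda can rebuild a named list with list(...) and evaluate the expression inside it using base R's with(). This assumes purrr is installed; the input vectors are illustrative and paste0 stands in for str_c.

```r
library(purrr)

res <- pmap_chr(
  list(list_a = c("1", "2", "3"),
       list_b = c("O", "S", "N"),
       list_c = c("c", "j", "r")),
  # list(...) rebuilds a named list for the current row;
  # with() then evaluates the expression using those names
  ~ with(list(...), paste0(list_a, list_b, list_c))
)
res
# → "1Oc" "2Sj" "3Nr"
```

This keeps the ~ shorthand while still referring to the inputs by the names given in .l, at the cost of an extra list construction per row.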
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |