Calling user-defined functions from dplyr::mutate
I'm working on a project that involves many different tibbles, all of which have a period
variable in the format YYYYMM. Below is an example of what all my tibbles look like:
tibble_1 <- tibble::tibble(
period = c(201901, 201912, 201902, 201903),
var_1 = rnorm(4),
var_2 = rnorm(4)
)
But for some operations (e.g. time series plots) it's easier to work with an actual Date variable, so I'm using mutate to transform the period variable into a date as follows:
tibble_1 %>%
dplyr::mutate(
date = lubridate::ymd(stringr::str_c(period, "01"))
)
Since I will be doing this a lot, and the date transformation is not the only mutation I am going to be doing when calling mutate, I'd like to have a user-defined function that I can call from within the mutate call. Here's my function:
period_to_date <- function() {
lubridate::ymd(stringr::str_c(period, "01"))
}
Which I would later call like this:
tibble_1 %>%
dplyr::mutate(
date = period_to_date()
)
The problem is that R can't find the period object (which is not really an object in itself, but a column of the tibble).
> Error in stri_c(..., sep = sep, collapse = collapse, ignore_null =
TRUE) : object 'period' not found
I'm pretty sure I need to define a data mask so that the environment in which period_to_date is executed can look for the object in its parent environment (which should always be the caller environment, since the tibble containing the period column is not always the same), but I can't seem to figure out how to do it.
Solution 1:[1]
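The function has no way of knowing which object you want to modify: inside the function body, R looks for period in the function's own environment and then in the environment where the function was defined, not in the tibble. Pass the period column to the function as an argument and use it like this: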
period_to_date <- function(period) {
lubridate::ymd(stringr::str_c(period, "01"))
#Can also use
#as.Date(paste0(period,"01"), "%Y%m%d")
}
tibble_1 %>%
dplyr::mutate(date = period_to_date(period))
# period var_1 var_2 date
# <dbl> <dbl> <dbl> <date>
#1 201901 -0.476 -0.456 2019-01-01
#2 201912 -0.645 1.45 2019-12-01
#3 201902 -0.0939 -0.982 2019-02-01
#4 201903 0.410 0.954 2019-03-01
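Because the function now receives the vector as an argument, it no longer depends on mutate and can be checked on a plain vector. Below is a minimal sketch using the base-R variant mentioned in the comment above; the name period_to_date_base is hypothetical and only used for illustration:

```r
# Hypothetical base-R variant of period_to_date (no lubridate/stringr needed).
# Appends "01" to each YYYYMM value and parses the result as a Date.
period_to_date_base <- function(period) {
  as.Date(paste0(period, "01"), format = "%Y%m%d")
}

# Works on a bare vector, outside of any mutate() call
dates <- period_to_date_base(c(201901, 201912))
print(dates)
```

Inside mutate the call looks the same as before: date = period_to_date_base(period).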
Solution 2:[2]
Consider passing the column as an argument to your function:
library(dplyr)
period_to_date <- function(x) {
lubridate::ymd(stringr::str_c(x, "01"))
}
df <- data.frame(x = 1:3, period = c('201903', '202001', '201511'))
df %>% mutate(p2 = period_to_date(period))
#> x period p2
#> 1 1 201903 2019-03-01
#> 2 2 202001 2020-01-01
#> 3 3 201511 2015-11-01
Created on 2020-01-10 by the reprex package (v0.3.0)
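If the column is always called period and only the tibble changes, another option is to wrap the mutate call itself rather than the expression inside it. The sketch below assumes exactly that; add_period_date is a hypothetical helper name, and it uses dplyr's .data pronoun to refer to the period column of whatever data frame is passed in:

```r
library(dplyr)

# Hypothetical helper: adds a `date` column to any data frame that has a
# YYYYMM `period` column. `.data$period` refers to the column of the data
# frame being mutated, so no global `period` object is needed.
add_period_date <- function(data) {
  data %>%
    mutate(date = lubridate::ymd(stringr::str_c(.data$period, "01")))
}

tibble_1 <- tibble::tibble(
  period = c(201901, 201912, 201902, 201903),
  var_1 = rnorm(4),
  var_2 = rnorm(4)
)

result <- add_period_date(tibble_1)
print(result)
```

Further mutations can be chained afterwards, for example add_period_date(tibble_1) %>% mutate(...). The trade-off is that the column name is hard-coded in the helper, whereas passing the column as an argument, as in the solutions above, keeps the function reusable for differently named columns.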
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | mrhellmann |