'How can I use a vector variable to avoid repeating references to non-adjacent columns in R?
Question
In R, can I used a vector that holds the names of data frame columns to avoid repeated code?
vec_columns <- c("col1", "col2", "col8", "col54")
Background
I started off looking to solve the problem that is answered in this question, which works on its own terms: Coalesce columns and create another column to specify source
But in my specific use case, I have many columns and the ones that I want to coalesce()
are not adjacent to each other, so the <tidy-select>
used in that solution doesn't work for me.
Modified Example from Original Question
In the original question, the OP was using contiguous columns that began in column #1, but I have leading columns that are not part of the coalesce()
, plus the columns I want to coalesce()
are separated from each other.
df_2 <-
data.frame(
Name = c("A", "B", "C", "D", "E"), #Adding a name column not in original posted question
group_1 = c(NA, NA, NA, NA, 2),
group_2 = c(NA, 4, NA, NA, 1),
group_3 = c(NA, NA, 5, NA, NA),
group_4 = c(1, NA, NA, 2, NA),
group_5 = c(NA, 3, NA, NA, NA)
)
> df_2
Name group_1 group_2 group_3 group_4 group_5
1 A NA NA NA 1 NA
2 B NA 4 NA NA 3
3 C NA NA 5 NA NA
4 D NA NA NA 2 NA
5 E 2 1 NA NA NA
This solution below I created and does exactly what I want for output:
df_2 %>%
mutate(one_col = coalesce(group_2, group_3, group_5)) %>%
rowwise() %>%
mutate(group_col = c("group_2", "group_3", "group_5")[!is.na(c_across(c(group_2, group_3, group_5)))][1])
# A tibble: 5 x 8
# Rowwise:
Name group_1 group_2 group_3 group_4 group_5 one_col group_col
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 A NA NA NA 1 NA NA NA
2 B NA 4 NA NA 3 4 group_2
3 C NA NA 5 NA NA 5 group_3
4 D NA NA NA 2 NA NA NA
5 E 2 1 NA NA NA 1 group_2
The Problem
But as you can see, I have to repeat those column names 3 times. Future proofing myself, I envision where I might want or 10 columns in the coalesce()
. I want to just set a variable holding a vector of column names once, but doing the obvious myvec <- c("group_2", "group_3", "group_5")
and inserting it doesn't work.
EDIT
Reading the comments, I got to an answer that meets the need, but I'm reluctant to answer it myself because credit goes to the commenter(s).
This achieves what I wanted:
myvec <- c("group_2", "group_3","group_5")
df_2 %>%
mutate(one_col = coalesce(!!!select(.,myvec))) %>% rowwise() %>%
mutate(group_col = myvec[!is.na(c_across(myvec))][1])
# A tibble: 5 x 8
# Rowwise:
Name group_1 group_2 group_3 group_4 group_5 one_col group_col
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 A NA NA NA 1 NA NA NA
2 B NA 4 NA NA 3 4 group_2
3 C NA NA 5 NA NA 5 group_3
4 D NA NA NA 2 NA NA NA
5 E 2 1 NA NA NA 1 group_2
The main goal here is really to get the group_col
column which records which column was used in the coalesce()
, but with the flexibility to handle non-adjacent columns, and also to change the order of the columns in the coalesce()
. For example, below I reverse the order of the myvec
so that row 2, "B" selects 3 and group_5, instead of 4 and group_2 in the above example.
> myvec2 <- c("group_5", "group_3","group_2")
> df_2 %>%
+ mutate(one_col = coalesce(!!!select(.,myvec2))) %>% rowwise() %>%
+ mutate(group_col = myvec2[!is.na(c_across(myvec2))][1])
Note: Using an external vector in selections is ambiguous.
i Use `all_of(myvec2)` instead of `myvec2` to silence this message.
i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
This message is displayed once per session.
# A tibble: 5 x 8
# Rowwise:
Name group_1 group_2 group_3 group_4 group_5 one_col group_col
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 A NA NA NA 1 NA NA NA
2 B NA 4 NA NA 3 3 group_5
3 C NA NA 5 NA NA 5 group_3
4 D NA NA NA 2 NA NA NA
5 E 2 1 NA NA NA 1 group_2
I am mildly concerned about the message I received about external vector in selections is ambiguous, so perhaps I'll need to use that all_of()
function, but this did work.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|