'Reshape wide data to panel format with multiple paired columns using tidyr
In a large survey, the data are encoded as (Respondent, question, year binary, question, year binary). As in:
R Question1 | 2008 Bin. | Question2 | 2009 Bin.
1 abc... 1 def... 1
What we would like is a Respondent/Year setup as in:
R | Year | Question
1 2008 abc...
1 2009 def...
The challenge we're finding is that columns need to gathered or pivoted in pairs. In other words, Question1 and Year 2008 are one pair and Question2 and Year 2009 are a second pair (and so on). Base R reshape
has an option to group columns in lists so that the paired columns get reshaped together. This works reasonably well (though there would need to be minor additional scrubbing of the data afterwards).
What would the syntax be, though, with tidyr::gather
and/or tidyr::pivot_longer
?
This question has been discussed elsewhere (see two links below) but the pivot_longer
solutions seem considerably more complicated than should be necessary for this data example:
Panel data with pivot_longer, here: https://github.com/tidyverse/tidyr/issues/577
Also see gathering multiple sets of columns: Gather multiple sets of columns
library(dplyr)
df <- data.frame(
R = 1:3,
Question1 = letters[1:3],
Bin_2008 = c(1,1,1),
Question2 = letters[4:6],
Bin_2009 = c(1,1,1)
)
# look at data
df
# R Question1 Bin_2008 Question2 Bin_2009
# 1 1 a 1 d 1
# 2 2 b 1 e 1
# 3 3 c 1 f 1
# reshape works, how to do with tidyr?
df2 <-
reshape(
df,
varying = list(c("Question1", "Question2"), c("Bin_2008", "Bin_2009")),
v.names = c("question", "year_bin"),
idvar = "R",
direction = "long"
)
# reshape creates time variable coded 1 or 2 but we want 2008 or 2009, so minor recode
df2 <- df2 %>%
mutate(
year = case_when(
time == 1 ~ "2008",
time == 2 ~ "2009"
)
)
# re-order columns so data is Respondent-year
df2 <- df2 %>%
select(R, year, everything())
df2
# R year time question year_bin
# 1.1 1 2008 1 a 1
# 2.1 2 2008 1 b 1
# 3.1 3 2008 1 c 1
# 1.2 1 2009 2 d 1
# 2.2 2 2009 2 e 1
# 3.2 3 2009 2 f 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|