Multiple rows to one row in R [closed]

In R, I have a data frame whose values are spread over several rows. I would like to collapse it into a data frame with just one row that holds all the non-NA values. I have a data frame like this:

df <- data.frame(A = c("time", "time", "time"),
                 B = c("place", "place", "place"),
                 C = c(NA, 1, NA),
                 D = c(NA, NA, 2),
                 E = c(3, NA, NA),
                 `F` = c(4, 4, NA),
                 G = c(NA, 5, NA))

     A     B  C  D  E  F  G
1 time place NA NA  3  4 NA
2 time place  1 NA NA  4  5
3 time place NA  2 NA NA NA

And I want a dataframe like this:

     A     B C D E F G
1 time place 1 2 3 4 5

I tried using the function reshape like here: Turning one row into multiple rows in r

I also tried the function unique, but then I lose a lot of data: Selecting unique rows in matrix using R



Solution 1:[1]

We can group by 'A', 'B' and select the first non-NA element across other columns

library(dplyr)
df1 %>%
     group_by(A, B) %>%
     summarise(across(everything(), ~ .[order(is.na(.))][1]), .groups = 'drop')

-output

# A tibble: 1 x 8
#  A     B         C     D     E     F     G H    
#  <chr> <chr> <int> <int> <int> <int> <int> <lgl>
#1 time  place     1     2     3     4     5 NA   
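
To see why `.[order(is.na(.))][1]` picks the first non-NA element, here is a minimal base R sketch on a standalone vector. The vector `x` and the helper name `first_non_na` are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper, named here only for illustration.
first_non_na <- function(x) x[order(is.na(x))][1]

x <- c(NA, 1L, NA)

is.na(x)          # TRUE FALSE TRUE
order(is.na(x))   # 2 1 3: FALSE sorts before TRUE, so non-NA positions come first
first_non_na(x)   # 1

# A column with no non-NA value stays NA, like column H above.
first_non_na(c(NA, NA, NA))   # NA
```

If a column held two different non-NA values within a group, this approach would keep the first one and silently drop the other.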

Or with coalesce

library(purrr)
df1 %>%
    group_by(A, B) %>% 
    summarise(across(everything(), ~ reduce(., coalesce)), .groups = 'drop')
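
As a rough illustration of what `reduce(., coalesce)` does to a single column, the sketch below applies it to one standalone vector. The vector `x` is made up for the example; `reduce` comes from purrr and `coalesce` from dplyr.

```r
library(dplyr)
library(purrr)

x <- c(NA, 1L, NA)

# reduce() combines the elements pairwise from left to right.
# coalesce() returns its first argument unless that is NA, so the
# running result ends up being the first non-NA element.
res <- reduce(x, coalesce)
res                                    # 1

# The same computation written out by hand.
coalesce(coalesce(x[1], x[2]), x[3])   # 1
```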

data

df1 <- structure(list(A = c("time", "time", "time"), B = c("place", 
"place", "place"), C = c(NA, 1L, NA), D = c(NA, NA, 2L), E = c(3L, 
NA, NA), F = c(4L, NA, NA), G = c(NA, 5L, NA), H = c(NA, NA, 
NA)), class = "data.frame", row.names = c(NA, -3L))

Solution 2:[2]

Solution using colMeans to account for diverging values:

1. Create test data

df <- structure(list(
  A = c("time", "time", "time"), 
  B = c("place", "place", "place"), 
  C = c(NA, 1L, NA), 
  D = c(NA, NA, 2L), 
  E= c(3L, NA, NA), 
  F = c(4L, 4L, NA), 
  G = c(NA, 5L, NA)), 
  row.names = c(NA, -3L),
  class = c("data.table", "data.frame"))

2. Use unique for first two columns and colMeans for the rest, cast to data.frame:

cbind(unique(df[, 1:2]), as.data.frame.list(colMeans(df[,3:7], na.rm = TRUE)))

Returns:

      A     B C D E F G
1: time place 1 2 3 4 5
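
A small base R sketch of how colMeans behaves when values do diverge. The data frame `toy` is invented for this example: column F holds two different non-NA values and column H holds none.

```r
# Made-up data, not taken from the question.
toy <- data.frame(F = c(4, 6, NA),
                  G = c(NA, 5, NA),
                  H = c(NA_real_, NA, NA))

res <- colMeans(toy, na.rm = TRUE)
res
# F is averaged to 5, G keeps its single value 5,
# and H becomes NaN because no non-NA value is left to average.
```

So diverging values are averaged rather than dropped, and an all-NA column comes back as NaN rather than NA.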

Solution 3:[3]

You can use na.omit() with summarise, a la:

library(tidyverse)

df %>% group_by(A, B) %>% 
       summarise(C = mean(na.omit(C)),
                 D = mean(na.omit(D)),
                 E = mean(na.omit(E)),
                 F = mean(na.omit(F)),
                 G = mean(na.omit(G)))

Each of the columns C-G in your example data holds a single distinct non-NA value, so, following your comment, I have used mean() to average the non-NA observations. When the values in a column agree, the mean is just that value.

data:

df <- structure(list(A = c("time", "time", "time"), B = c("place",
"place", "place"), C = c(NA, 1L, NA), D = c(NA, NA, 2L), E = c(3L,
NA, NA), F = c(4L, 4L, NA), G = c(NA, 5L, NA)), class = "data.frame", row.names = c(NA,
-3L))
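
The repeated mean(na.omit(...)) calls above could probably be condensed with across(). This is only a sketch of that variant, not the answerer's code: it assumes dplyr 1.0.0 or later, rebuilds the same example data, and stores the result in a name (`out`) chosen here for illustration.

```r
library(dplyr)

df <- data.frame(A = c("time", "time", "time"),
                 B = c("place", "place", "place"),
                 C = c(NA, 1L, NA),
                 D = c(NA, NA, 2L),
                 E = c(3L, NA, NA),
                 F = c(4L, 4L, NA),
                 G = c(NA, 5L, NA))

# Average the non-NA values of every column from C to G within each A/B group.
out <- df %>%
  group_by(A, B) %>%
  summarise(across(C:G, ~ mean(.x, na.rm = TRUE)), .groups = "drop")

out
```

Here `mean(.x, na.rm = TRUE)` plays the same role as `mean(na.omit(.x))`, and `.groups = "drop"` returns an ungrouped tibble.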

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 akrun
Solution 2 dario
Solution 3