How to merge output of lapply iterated function into 1 dataframe? [duplicate]

I have a function, response, that processes data which has been split into several data frames.

I also have a list of these data frames, partylist.

I am trying to iterate through the list of data subsets using lapply and then collect the results in one data frame.

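# Function called "response"

library(dplyr)  # provides select() and all_of()

response <- function(dat, p) {
  # Keep only the column named in p (a character string)
  y <- select(dat, all_of(p))
  # Count each answer category and convert the counts to a data frame
  sums_df <- as.data.frame(table(y))
  # Relative frequency in percent, rounded to whole numbers
  sums_df$rfreq <- round(sums_df$Freq / sum(sums_df$Freq) * 100, digits = 0)
  # Keep the first four categories and drop the raw Freq column
  sums_df[1:4, c(1, 3)]
}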

#Code for iterating through the list of dfs.

lapply(partylist, response, p = "f78a")

#Output:

[[1]]
                    y rfreq
1      Instämmer helt    29
2    Instämmer delvis    40
3  Instämmer knappast     6
4 Instämmer inte alls     2

[[2]]
                    y rfreq
1      Instämmer helt    32
2    Instämmer delvis    38
3  Instämmer knappast     8
4 Instämmer inte alls     2

Can anybody suggest how I would do this?

A similar question was asked here but it never got answered.



Solution 1:[1]

Your data:

outputs <- list(structure(list(y = c("Instämmer helt", "Instämmer delvis", 
"Instämmer knappast", "Instämmer inte alls"), 
 rfreq = c(29L, 40L, 6L, 2L)), class = "data.frame", 
 row.names = c(NA, -4L)), 
 structure(list(y = c("Instämmer helt", "Instämmer delvis", 
                         "Instämmer knappast", "Instämmer inte alls"), 
 rfreq = c(32L, 38L, 8L, 2L)), class = "data.frame", 
 row.names = c(NA, -4L)))

Reduce can be used to add the columns, as @Allan Cameron said. Combined with merge, it can also bind the rfreq columns side by side without repeating the y column.

Reduce(function(df1,df2) merge(df1,df2, by = "y", suffixes = 1:2), outputs)

#                   y rfreq1 rfreq2
#1    Instämmer delvis     40     38
#2      Instämmer helt     29     32
#3 Instämmer inte alls      2      2
#4  Instämmer knappast      6      8

This approach also works on a list with more than two elements, but the resulting column names are then duplicated: the suffixes 3, 4, ... are not added automatically.

# Creating two more elements so now `outputs` has four elements
outputs[[3]] <- outputs[[1]]
outputs[[4]] <- outputs[[2]]

# Exactly the same code

Reduce(function(df1,df2) merge(df1,df2, by = "y", suffixes = 1:2), outputs) 

# The result:
#                   y rfreq1 rfreq2 rfreq1 rfreq2
#1    Instämmer delvis     40     38     40     38
#2      Instämmer helt     29     32     29     32
#3 Instämmer inte alls      2      2      2      2
#4  Instämmer knappast      6      8      6      8
#Warning message:
#In merge.data.frame(df1, df2, by = "y", suffixes = 1:2) :
#  column names ‘rfreq1’, ‘rfreq2’ are duplicated in the result
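One possible way around the duplicated names, sketched here as an assumption rather than as part of the original answer, is to rename each rfreq column before merging so that merge never has to add a suffix. The names categories, renamed and merged are made up for this illustration, and the data are rebuilt so that the snippet runs on its own.

```r
# Same values as in `outputs` above, rebuilt so the sketch is self-contained
categories <- c("Instämmer helt", "Instämmer delvis",
                "Instämmer knappast", "Instämmer inte alls")
outputs <- list(
  data.frame(y = categories, rfreq = c(29L, 40L, 6L, 2L)),
  data.frame(y = categories, rfreq = c(32L, 38L, 8L, 2L))
)
outputs[[3]] <- outputs[[1]]
outputs[[4]] <- outputs[[2]]

# Give every rfreq column its own name (rfreq1, rfreq2, ...) before merging
renamed <- Map(function(df, i) {
  names(df)[names(df) == "rfreq"] <- paste0("rfreq", i)
  df
}, outputs, seq_along(outputs))

# No name clashes remain, so merge() needs no suffixes and raises no warning
merged <- Reduce(function(df1, df2) merge(df1, df2, by = "y"), renamed)
names(merged)
```

Because the names are made unique up front, the same code should work for any number of list elements.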

Updates

As for why the row order is swapped in the resulting data frame: by default, merge sorts the merged rows lexicographically on the by column, as explained in its documentation:

The rows are by default lexicographically sorted on the common columns, but for sort = FALSE are in an unspecified order.

To avoid this default behavior, we can set sort = FALSE:

Reduce(function(df1,df2) merge(df1,df2, by = "y", suffixes = 1:2, sort = FALSE), outputs)

#                    y rfreq1 rfreq2 rfreq1 rfreq2
#1      Instämmer helt     29     32     29     32
#2    Instämmer delvis     40     38     40     38
#3  Instämmer knappast      6      8      6      8
#4 Instämmer inte alls      2      2      2      2
#Warning message:
#In merge.data.frame(df1, df2, by = "y", suffixes = 1:2, sort = FALSE) :
#  column names ‘rfreq1’, ‘rfreq2’ are duplicated in the result
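Since the documentation describes the row order under sort = FALSE as unspecified, a more explicit option is to reorder the merged rows afterwards. The following is a minimal sketch, assuming the first data frame in the list carries the desired category order and that every category is present in every data frame. The names categories and merged are made up for this illustration.

```r
# Same values as the original two-element `outputs`, rebuilt to be self-contained
categories <- c("Instämmer helt", "Instämmer delvis",
                "Instämmer knappast", "Instämmer inte alls")
outputs <- list(
  data.frame(y = categories, rfreq = c(29L, 40L, 6L, 2L)),
  data.frame(y = categories, rfreq = c(32L, 38L, 8L, 2L))
)

# Merge as before (rows come back sorted lexicographically on y)
merged <- Reduce(
  function(df1, df2) merge(df1, df2, by = "y", suffixes = c("1", "2")),
  outputs
)

# Reorder the rows to follow the category order of the first data frame
merged <- merged[match(outputs[[1]]$y, merged$y), ]
rownames(merged) <- NULL
merged
```

Using match() makes the intended order explicit instead of relying on how merge happens to arrange unsorted rows.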

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1