'How to aggregate row values with multiple row criteria
I'm trying to aggregate values present in rows in which repeated values present in other columns must be used as a criteria:
This is what I have:
Candidate, City, Ballot Box Number, Number of Votes
A Halifax 1 100
A Halifax 2 100
A Halifax 3 100
A Glasgow 4 100
B Halifax 1 100
B Halifax 2 0
B Halifax 3 50
B Glasgow 4 3
This is what I want:
Candidate,City,Number of Votes
A,Halifax,300
A,Glasgow,100
B,Halifax,150
B,Glasgow,3
It's similar to Dynamic Table functions in Excel. I need to make partial aggregations to extract useful information from this dataset.
Already tried aggregate
and rowsums
functions, but wasn't able to insert multiple conditions.
Solution 1:[1]
Next time, please make it easy to reproduce your example data. Use this code:
library(tidyverse)
candidate <- c("A","A","A","A","B","B","B","B")
city <- c("Halifax", "Halifax", "Halifax", "Glasgow", "Halifax", "Halifax", "Halifax", "Glasgow")
ballot_box <- c(1,2,3,4,1,2,3,4)
votes <- c(100,100,100,100,100,0,50,3)
df <- data.frame(candidate, city, ballot_box, votes)
df %>% dput() #copy and paste the output
structure(list(candidate = c("A", "A", "A", "A", "B", "B", "B",
"B"), city = c("Halifax", "Halifax", "Halifax", "Glasgow", "Halifax",
"Halifax", "Halifax", "Glasgow"), ballot_box = c(1, 2, 3, 4,
1, 2, 3, 4), votes = c(100, 100, 100, 100, 100, 0, 50, 3)), class = "data.frame", row.names = c(NA,
-8L))
Here is your solution:
library(tidyverse)
df %>% select(-ballot_box) %>% group_by(candidate, city) %>% summarize(votes = sum(votes))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | OTA |