Barplot of summed values per category in R - currently plotting highest value only
I have a data set (named data) as follows
site year month supplier FG total
540853 2015 1 790122 T25 3
540853 2015 3 790122 T25 5
540853 2015 3 790122 V24 8
540853 2015 4 790122 V24 1
540853 2015 4 790122 T25 6
540853 2015 4 790122 W29 4
540853 2015 5 790122 W29 9
540853 2015 5 790122 V24 2
540853 2015 5 790122 T25 7
I would like to create a bar plot for all the months in 2015, supplier 790122, showing the sum of totals for each FG. The x-axis would have T25, W29 and V24. The y-axis would read 21 for T25 (3+5+6+7), 13 for W29 (4+9) and 11 for V24 (8+1+2).
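As a quick check of the sums expected from the sample rows above, here is a minimal base R sketch. It rebuilds the nine rows with read.table (the name sums is only an illustrative choice, not something from the original post) and adds up total per FG.

```r
# Rebuild the sample rows from the question so the check runs on its own.
data <- read.table(header = TRUE, text = "
site year month supplier FG total
540853 2015 1 790122 T25 3
540853 2015 3 790122 T25 5
540853 2015 3 790122 V24 8
540853 2015 4 790122 V24 1
540853 2015 4 790122 T25 6
540853 2015 4 790122 W29 4
540853 2015 5 790122 W29 9
540853 2015 5 790122 V24 2
540853 2015 5 790122 T25 7
")

# Keep only 2015 and supplier 790122, then sum total within each FG.
sel  <- subset(data, year == 2015 & supplier == 790122)
sums <- aggregate(total ~ FG, sel, sum)
print(sums)  # T25 = 21, V24 = 11, W29 = 13 for these nine rows
```

These are the bar heights the finished plot should show for this sample.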
I initially plotted using the following code
plot1 <- ggplot(subset(data, year %in% c("2015") & supplier %in% c("790122")),
aes(x = factor(FG), y = total)) +
geom_bar(stat = "identity", position = "dodge") +
theme(panel.grid = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black"))
This produced a barplot that I thought was correct. However, I later wanted to add a table beside it so the readers could see the exact values for each FG, rather than reading it from the graph. Upon doing this I realised that the values in the barplot did not match the values in the table.
I plotted a second graph with the following code
for (i in 790122){
For_summary <- subset(data, year %in% c("2015") & supplier %in% i)
summary_tbl <- data.frame(ddply(For_summary, c("FG"), summarise, S = sum(total)))
colnames(summary_tbl) <- c("FG", "total")
}
plot2 <- ggplot(summary_tbl,
aes(x = factor(FG), y = total)) +
geom_bar(stat = "identity", position = "dodge") +
ylim(0,25) + labs(title = "plot 2") +
theme(panel.grid = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black"))
This gave a barplot with the values I want (see attached image). In the first barplot the R code seems to plot only the highest value for each of the FG.
Can anyone advise on what part of the code is doing this and how I can plot it correctly without having to create the summary_tbl in the for loop first?
Solution 1:[1]
Try aggregating the data first:
df2 <- aggregate(total~FG, df, sum)
ggplot(df2, aes(FG, total)) +
geom_bar(stat="identity")
Or as mentioned in the comments, it is possible to summarize in the function:
ggplot(df, aes(FG, total)) +
geom_bar(stat="summary", fun="sum")  # ggplot2 >= 3.3.0; older versions use fun.y="sum"
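A third option is to remove the position = "dodge" argument from your original code. With stat = "identity", every row is drawn as its own bar; under "dodge" the rows that share an FG are drawn on top of one another, so only the tallest is visible. The default position, "stack", piles them up instead, which gives the sum. Keep position = "dodge" for grouped bar graphs.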
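The difference between the two positions can be sketched on the sample rows from the question. This is an illustration under assumptions rather than the asker's exact script: the data frame is rebuilt with read.table, the theme() call is left out for brevity, and the names sel, plot_dodged, plot_stacked, top_dodged and top_stacked are made up here. layer_data() is used only to inspect where the top edge of each drawn rectangle ends up.

```r
library(ggplot2)

# Sample rows from the question, rebuilt so the sketch runs on its own.
data <- read.table(header = TRUE, text = "
site year month supplier FG total
540853 2015 1 790122 T25 3
540853 2015 3 790122 T25 5
540853 2015 3 790122 V24 8
540853 2015 4 790122 V24 1
540853 2015 4 790122 T25 6
540853 2015 4 790122 W29 4
540853 2015 5 790122 W29 9
540853 2015 5 790122 V24 2
540853 2015 5 790122 T25 7
")
sel <- subset(data, year == 2015 & supplier == 790122)

# Original call: rows sharing an FG are drawn at the same spot,
# so the tallest rectangle covers the others.
plot_dodged <- ggplot(sel, aes(x = factor(FG), y = total)) +
  geom_bar(stat = "identity", position = "dodge")

# Without position = "dodge" the default "stack" piles the rows up,
# so the top of each bar sits at the sum for that FG.
plot_stacked <- ggplot(sel, aes(x = factor(FG), y = total)) +
  geom_bar(stat = "identity")

# Highest top edge (ymax) per x position in each plot.
built_dodged  <- layer_data(plot_dodged)
built_stacked <- layer_data(plot_stacked)
top_dodged  <- tapply(built_dodged$ymax,  as.integer(built_dodged$x),  max)
top_stacked <- tapply(built_stacked$ymax, as.integer(built_stacked$x), max)
print(top_dodged)   # per-FG maximum
print(top_stacked)  # per-FG sum
```

The stacked version shows thin segment boundaries only if an outline colour is set; with the default fill the bar looks like a single block at the summed height.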
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |