Barplot of summed values per category in R - currently plotting highest value only

I have a data set (named data) as follows

  site  year  month  supplier   FG   total  
540853  2015      1    790122  T25       3  
540853  2015      3    790122  T25       5  
540853  2015      3    790122  V24       8  
540853  2015      4    790122  V24       1  
540853  2015      4    790122  T25       6  
540853  2015      4    790122  W29       4  
540853  2015      5    790122  W29       9  
540853  2015      5    790122  V24       2  
540853  2015      5    790122  T25       7 

I would like to create a bar plot for all the months in 2015, supplier 790122, showing the sum of totals for each FG. The x-axis would have T25, W29 and V24. The y-axis would read 21 for T25 (3+5+6+7), 13 for W29 (4+9) and 11 for V24 (8+1+2).
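To make the question reproducible, the sample rows can be typed in as a data frame and the target sums computed in base R. This is only a sketch: the column types (numeric ids, character FG) and the names sel and expected are assumptions, not something stated in the question.

```r
# Sample rows from the table above, typed in by hand.
# Assumption: ids are numeric and FG is character.
data <- data.frame(
  site     = 540853,
  year     = 2015,
  month    = c(1, 3, 3, 4, 4, 4, 5, 5, 5),
  supplier = 790122,
  FG       = c("T25", "T25", "V24", "V24", "T25", "W29", "W29", "V24", "T25"),
  total    = c(3, 5, 8, 1, 6, 4, 9, 2, 7)
)

# Keep only the 2015 rows for supplier 790122, then sum `total` within each FG.
sel <- subset(data, year == 2015 & supplier == 790122)
expected <- tapply(sel$total, sel$FG, sum)
print(expected)
```

The values in expected are the bar heights a correct plot should show, one per FG.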

I initially plotted using the following code

plot1 <- ggplot(subset(data, year %in% c("2015") & supplier %in% c("790122")), 
                aes(x = factor(FG), y = total)) + 
         geom_bar(stat = "identity", position = "dodge") +
         theme(panel.grid = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black"))

This produced a barplot that I thought was correct. However, I later wanted to add a table beside it so the readers could see the exact values for each FG, rather than reading it from the graph. Upon doing this I realised that the values in the barplot did not match the values in the table.

I plotted a second graph with the following code

for (i in 790122){
  For_summary <- subset(data, year %in% c("2015") & supplier %in% i)
  summary_tbl <- data.frame(ddply(For_summary, c("FG"), summarise, S = sum(total)))
  colnames(summary_tbl) <- c("FG", "total")
}

plot2 <- ggplot(summary_tbl, 
                aes(x = factor(FG), y = total)) + 
  geom_bar(stat = "identity", position = "dodge") +
  ylim(0,25) + labs(title = "plot 2") +
  theme(panel.grid = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black"))

This gave a barplot with the values I want (see attached image). In the first barplot the R code seems to plot only the highest value for each of the FG.


Can anyone advise on what part of the code is doing this and how I can plot it correctly without having to create the summary_tbl in the for loop first?



Solution 1:[1]

Try aggregating the data first (df here stands for your data frame):

df2 <- aggregate(total~FG, df, sum)

ggplot(df2, aes(FG, total)) + 
  geom_bar(stat="identity")

Or, as mentioned in the comments, you can let ggplot2 do the summing inside the plot call (in ggplot2 3.3.0 and later the argument is fun; older versions called it fun.y):

ggplot(df, aes(FG, total)) +
  geom_bar(stat="summary", fun="sum")

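A third option is to remove the position = "dodge" argument from your original code. With no grouping variable, dodging draws every row of an FG as a separate bar starting at zero in the same place, so only the tallest one is visible. The default position ("stack") piles those rows on top of each other, so the bar height is the sum. Keep position = "dodge" for grouped bar graphs.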
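The effect of the position argument can be checked numerically rather than by eye. The sketch below assumes ggplot2 is installed and uses a hand-typed copy of the question's rows; demo, base, stacked and dodged are names introduced here for illustration. layer_data() returns the coordinates ggplot2 computed for each drawn rectangle.

```r
library(ggplot2)

# Hand-typed copy of the question's sample rows; `demo` is an illustrative name.
demo <- data.frame(
  FG    = c("T25", "T25", "V24", "V24", "T25", "W29", "W29", "V24", "T25"),
  total = c(3, 5, 8, 1, 6, 4, 9, 2, 7)
)

base <- ggplot(demo, aes(x = FG, y = total))

# Default position ("stack"): rows of one FG are piled up, so the top is the sum.
stacked <- layer_data(base + geom_bar(stat = "identity"))

# position = "dodge" with no grouping aesthetic: rows of one FG are all drawn
# from zero at the same x, so only the tallest one remains visible.
dodged <- layer_data(base + geom_bar(stat = "identity", position = "dodge"))

# Highest bar top per FG (groups follow alphabetical order: T25, V24, W29).
top_stacked <- tapply(stacked$ymax, stacked$group, max)
top_dodged  <- tapply(dodged$ymax, dodged$group, max)
print(rbind(stacked = top_stacked, dodged = top_dodged))
```

The stacked row should match the per-FG sums, while the dodged row should show only the largest single value per FG, which is the behaviour described in the question.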


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
