How to add a legend to a combined line + bar graph in ggplot2
I have a dataframe with the following info
- first column as number of clusters
- second column as within cluster variation (line)
- third column as reduction in within cluster variation (bar)
I used the following code to plot it (the resulting graph is shown below), but I'm not sure how to add a legend for this combined (line and bar) plot.
#plot with ggplot2
ggplot(data) +
geom_col(aes(x=cluster_number, y=2*reduction_in_withincluster_variation), fill="grey", colour="black")+
geom_line(aes(x=cluster_number, y=within_cluster_variation), size=1)+
geom_point(aes(x=cluster_number, y=within_cluster_variation), pch =16)+
scale_x_continuous(breaks=seq(1, 16, 1)) +
labs(y = "Within-cluster variation", x = "Number of clusters")+
scale_y_continuous(breaks=seq(0, 0.14, 0.02), sec.axis = sec_axis(~./2, breaks=seq(0, 0.07, 0.01), name = "Reduction in within-cluster variation"))+
theme_classic(base_size = 13)
Any suggestions would be really helpful
thanks!
The plot currently looks like this. I would like to add a legend for the line and the bars on the right side of the graph.
Solution 1:[1]
To get a legend to show for a given geom, you need to assign something that would be represented in a legend within aes(). Normally, this would be a column in your data, and legend key entries are created and assigned colors based on the values of each observation in that column. However, you can also assign a character string to an aesthetic such as fill=, color=, or linetype= within aes(). This creates a legend key entry for that particular geom and labels the entry with that string.
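As a minimal sketch of that idea (the data frame df and the label "my line" are made up purely for illustration), mapping a constant string to color= inside aes() is enough to produce a legend entry:

```r
library(ggplot2)

# Hypothetical toy data, only for illustration
df <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 4))

# color = "my line" is a constant inside aes(), so ggplot2 treats it as a
# one-level discrete variable and builds a legend entry labelled "my line"
p_demo <- ggplot(df, aes(x = x, y = y)) +
  geom_line(aes(color = "my line")) +
  # Without a manual scale the line would get the default hue color
  scale_color_manual(values = c("my line" = "black")) +
  # Drop the legend title, which would otherwise read "colour"
  labs(color = NULL)

p_demo
```

Naming the element in values= (here "my line" = "black") ties the color to the label explicitly, which helps once there is more than one entry.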
We can use this method applied to OP's example. They have not shared data, but here's an example with data that kind of mimics what OP shows (without the secondary y axis stuff, etc):
library(ggplot2)
data <- data.frame(
cluster_number = 1:16,
points_and_lines = c(0.14, 0.077, 0.052, 0.045, 0.038, 0.035, 0.033, 0.032, 0.032, 0.031, 0.029, 0.028, 0.027, 0.025, 0.024, 0.022),
bar_heights = c(0.12, 0.04, 0.02, 0.018, 0.012, 0.01, 0.008, 0.007, 0.007, 0.006, 0.004, 0.004, 0.004, 0.003, 0.002, 0.001)
)
p <- ggplot(data) +
geom_col(aes(x=cluster_number, y=bar_heights), fill="grey", colour="black")+
geom_line(aes(x=cluster_number, y=points_and_lines), size=1)+
geom_point(aes(x=cluster_number, y=points_and_lines), pch =16)+
scale_x_continuous(breaks=seq(1, 16, 1)) +
labs(y = "Within-cluster variation", x = "Number of clusters")+
theme_classic(base_size = 13)
p
Here's the application of the strategy noted above. In this case, I want to use fill= for the columns and color= for the lines and points. This has the net effect of combining a line and a point in the legend key for "lines and points" and showing a grey box for the columns. If you want only a line or only a point in that legend entry, put color= into aes() for just one of the two geoms.
Note that we not only have to add fill and color within aes() for the geoms, but we also need to: (1) set the legend titles to NULL to avoid having a legend title (unless you want one), and (2) define the actual colors with scale_fill_manual() and scale_color_manual(); otherwise, ggplot2 will map them to the default hue scale.
ggplot(data) +
geom_col(aes(x=cluster_number, y=bar_heights, fill="bars"), colour="black")+
geom_line(aes(x=cluster_number, y=points_and_lines, color="lines and points"), size=1)+
geom_point(aes(x=cluster_number, y=points_and_lines, color="lines and points"), pch =16)+
scale_x_continuous(breaks=seq(1, 16, 1)) +
labs(y = "Within-cluster variation", x = "Number of clusters", color=NULL, fill=NULL)+
scale_fill_manual(values="grey") +
scale_color_manual(values="black") +
theme_classic(base_size = 13) +
theme(legend.position="top")
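For completeness, here is a sketch of how the same approach might carry over to OP's original code, including the doubled bars and the secondary axis. OP's data was not shared, so the values below are mock values derived from the sample data above, and the legend labels are my own choice. The legend is placed on the right, as OP asked.

```r
library(ggplot2)

# Mock values (NOT OP's data): the line values are copied from the sample
# data above, and the bar values are that sample's bar heights halved so
# that they fit OP's doubled-bar / halved-secondary-axis setup
data <- data.frame(
  cluster_number = 1:16,
  within_cluster_variation = c(0.14, 0.077, 0.052, 0.045, 0.038, 0.035, 0.033, 0.032,
                               0.032, 0.031, 0.029, 0.028, 0.027, 0.025, 0.024, 0.022),
  reduction_in_withincluster_variation = c(0.12, 0.04, 0.02, 0.018, 0.012, 0.01, 0.008, 0.007,
                                           0.007, 0.006, 0.004, 0.004, 0.004, 0.003, 0.002, 0.001) / 2
)

p_op <- ggplot(data, aes(x = cluster_number)) +
  # fill is mapped to a string, so the bars get their own legend entry
  geom_col(aes(y = 2 * reduction_in_withincluster_variation, fill = "Reduction in variation"),
           colour = "black") +
  # line and points share one string, so they share one legend key
  # (linewidth replaces size for lines in ggplot2 >= 3.4.0)
  geom_line(aes(y = within_cluster_variation, color = "Within-cluster variation"), linewidth = 1) +
  geom_point(aes(y = within_cluster_variation, color = "Within-cluster variation"), pch = 16) +
  scale_x_continuous(breaks = seq(1, 16, 1)) +
  scale_y_continuous(
    breaks = seq(0, 0.14, 0.02),
    sec.axis = sec_axis(~ . / 2, breaks = seq(0, 0.07, 0.01),
                        name = "Reduction in within-cluster variation")
  ) +
  scale_fill_manual(values = c("Reduction in variation" = "grey")) +
  scale_color_manual(values = c("Within-cluster variation" = "black")) +
  labs(y = "Within-cluster variation", x = "Number of clusters", color = NULL, fill = NULL) +
  theme_classic(base_size = 13) +
  theme(legend.position = "right")

p_op
```

Since "right" is the default legend position, the last theme() call is only there to make the placement explicit.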
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | chemdork123 |