Fitting a line to the mean values of a multilevel variable using geom_smooth
I have this dataframe.
I create a plot with "value" on the y-axis and the levels of "Columna_S" (S1 to S10) on the x-axis. The plot is faceted by "VS" and "Inst" (with facet_grid) and coloured by "Estatus".
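library(dplyr)
library(ggplot2)

# Per-group mean, standard error, and an interval of mean ± 2 standard errors
stats <- na.omit(data) %>%
  group_by(VS, Estatus, Inst, Columna_S) %>%
  summarise(media = mean(value),
            desvio = sd(value),
            error_est = desvio / sqrt(n()),
            intervalo_sup = media + (2 * error_est),
            intervalo_inf = media - (2 * error_est))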
ggplot() +
  geom_errorbar(data = stats,
                aes(x = Columna_S,
                    ymin = intervalo_inf,
                    ymax = intervalo_sup,
                    color = Estatus),
                width = 0.4) +
  geom_point(data = stats,
             aes(x = Columna_S,
                 y = media,
                 color = Estatus)) +
  facet_grid(Inst ~ VS, scales = "free") +
  theme_classic()
However, instead of an independent point for each mean, I'd like to use geom_smooth
to plot a line (plus standard error) that fits the means along the x-axis.
I have tried the code below, but it is not generating the desired outcome.
na.omit(data) %>%
  ggplot(aes(x = Columna_S, y = value, colour = Estatus, na.rm = TRUE)) +
  geom_smooth(aes(fill = Estatus, na.rm = TRUE)) +
  scale_y_continuous(breaks = seq(0, 100, by = 25), limits = c(0, 100)) +
  facet_grid(Institution ~ Value_system, scales = "free") +
  theme_classic()
Solution 1:[1]
You have a factor on the x-axis, so I am not sure it makes sense to put a smooth line through it. If that's what you need, set the group aesthetic explicitly so that geom_smooth fits one line per "Estatus" across the factor levels:
data %>%
  filter(!is.na(Columna_S)) %>%
  ggplot(aes(x = Columna_S, y = value, colour = Estatus)) +
  geom_smooth(aes(group = Estatus)) +
  facet_grid(Inst ~ VS, scales = "free") +
  theme_classic()
Alternatively, you can draw a line through the group means with stat_summary:
data %>%
  filter(!is.na(Columna_S)) %>%
  ggplot(aes(x = Columna_S, y = value, colour = Estatus)) +
  stat_summary(aes(group = Estatus), fun = mean, geom = "line") +
  facet_grid(Inst ~ VS, scales = "free") +
  theme_classic()
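If the standard error is wanted as well, one option is to let stat_summary compute it too. The sketch below is not part of the original answer: the `toy` data frame and its values are made up for illustration (only the column names follow the question), and it uses ggplot2's `mean_se()` with `mult = 2` to mirror the mean ± 2 standard error intervals from the question.

```r
library(ggplot2)

# Hypothetical toy data; only the column names follow the question
set.seed(42)
s_levels <- paste0("S", 1:10)
toy <- expand.grid(
  Columna_S = factor(s_levels, levels = s_levels),  # keeps S10 after S9
  Estatus   = c("A", "B"),
  replica   = 1:6
)
toy$value <- 40 + 3 * as.integer(toy$Columna_S) + rnorm(nrow(toy), sd = 8)

p_means <- ggplot(toy, aes(x = Columna_S, y = value,
                           colour = Estatus, group = Estatus)) +
  # error bars: mean ± 2 standard errors at each level
  stat_summary(fun.data = mean_se, fun.args = list(mult = 2),
               geom = "errorbar", width = 0.4) +
  # line joining the per-level means
  stat_summary(fun = mean, geom = "line") +
  theme_classic()

print(p_means)
```

With the real data, the same two stat_summary layers could be added in front of the facet_grid(Inst ~ VS, scales = "free") call used above.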
Solution 2:[2]
It is a year late, but…
I had a similar problem: I have an ordered factor and want to fit a straight line through its levels, but the solution above does not work for me. No errors are given; there are just no lines on the plot.
The solution I found works because in ggplot you can plot datasets on top of each other (you are not restricted to one data frame). So:
- Create a numeric version of the factor.
- Plot the box plots, using the faceting etc. as above.
- Then add a geom_smooth layer on top of the factor layer, passing data= (the same data frame).
- Use the numeric version of the factor as x for this layer, and keep the same y.
It works! Because the positions of the factor levels equal the numeric values, the lines plot in the right place. If it is still needed I can add the code.
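A minimal sketch of these steps, under some assumptions: the `toy` data frame and the `S_num` column are hypothetical names invented here, and `method = "lm"` is chosen because a straight line is wanted. The discrete layer is added first so that ggplot2 sets up a discrete x scale, and the numeric layer is then drawn at positions 1, 2, ... on that same scale.

```r
library(ggplot2)

# Hypothetical toy data; only the column names follow the question
set.seed(1)
s_levels <- paste0("S", 1:10)
toy <- expand.grid(
  Columna_S = factor(s_levels, levels = s_levels),  # keeps S10 after S9
  Estatus   = c("A", "B"),
  replica   = 1:5
)
toy$value <- 50 + 2 * as.integer(toy$Columna_S) + rnorm(nrow(toy), sd = 5)

# Step 1: numeric version of the factor (position of each level, 1..10)
toy$S_num <- as.integer(toy$Columna_S)

# Step 2: box plots on the factor; steps 3-4: geom_smooth on the numeric copy
p_fit <- ggplot(toy, aes(x = Columna_S, y = value, colour = Estatus)) +
  geom_boxplot() +
  geom_smooth(
    data = toy,
    aes(x = S_num, y = value, colour = Estatus, group = Estatus),
    method = "lm", formula = y ~ x, se = TRUE
  ) +
  theme_classic()

print(p_fit)
```

The order of the layers matters in this sketch: if the numeric layer came first, ggplot2 would create a continuous x scale and then reject the factor values.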
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | StupidWolf |
Solution 2 | DaveG |