How to convert months to factors while keeping the months in calendar order?
I have an original data frame (df) containing around 10 years of data (1994-2003). The output of head(df) is shown below:
Sl.no Date Year Month Season val1 val2 val3
1 1 1993-12-01 1993 Dec Winter 21.0 16.0 3.0
2 2 1994-01-01 1994 Jan Winter 21.0 15.5 0.0
3 3 1994-02-01 1994 Feb Winter 21.0 18.5 0.0
4 4 1994-03-01 1994 Mar Spring 30.0 24.0 1.9
5 5 1994-04-01 1994 Apr Spring 35.5 27.0 0.5
6 6 1994-05-01 1994 May Spring 36.0 30.0 1.5
Since I wanted to convert Month to a factor in order to draw a boxplot, I used:
df$Month <- as.factor(format(df$Date, "%b"))
levels(df$Month) <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec")
However, the output appeared as below (the months no longer match the original df):
Sl.no Date Year Month Season val1 val2 val3
1 1 1993-12-01 1993 Mar Winter 21.0 16.0 3.0
2 2 1994-01-01 1994 May Winter 21.0 15.5 0.0
3 3 1994-02-01 1994 Apr Winter 21.0 18.5 0.0
4 4 1994-03-01 1994 Aug Spring 30.0 24.0 1.9
5 5 1994-04-01 1994 Jan Spring 35.5 27.0 0.5
6 6 1994-05-01 1994 Sep Spring 36.0 30.0 1.5
In the output above the months are scrambled, whereas they should follow the sequence given by Date.
How can I fix this? Your help will be highly appreciated. Kind regards
Solution 1:[1]
Use
df$Month <- factor(format(df$Date, "%b"), month.abb, ordered = TRUE)
Demo of the problem you're facing:
set.seed(1)
M <- sample(month.abb, 20, TRUE)
M
# [1] "Apr" "May" "Jul" "Nov" "Mar" "Nov" "Dec" "Aug" "Aug" "Jan" "Mar" "Mar" "Sep" "May"
# [15] "Oct" "Jun" "Sep" "Dec" "May" "Oct"
your_attempt <- as.factor(M)
your_attempt
# [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Apr Aug Dec Jan Jul Jun Mar May Nov Oct Sep
## At this step, you're basically asking R to replace "Apr" with "Jan",
## "Aug" with "Feb", and so on. Not what you're looking for....
levels(your_attempt) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
your_attempt
# [1] Jan Aug May Sep Jul Sep Mar Feb Feb Apr Jul Jul Nov Aug Oct Jun Nov Mar Aug Oct
# Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## ordered = TRUE is not strictly required; it depends on what you want to do
new_attempt <- factor(M, levels = month.abb, ordered = TRUE)
new_attempt
# [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < Oct < Nov < Dec
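As a rough sketch of how this carries over to the data frame in the question: the small df below is a hypothetical stand-in built from the six Date and val1 values shown in head(df), not the asker's full data. One assumption worth flagging is that format(Date, "%b") returns month names in the session locale, so in a non-English locale they may not match month.abb. Indexing month.abb by the numeric month ("%m") is one way to sidestep that.

```r
# Hypothetical stand-in for the question's df (first six rows only)
df <- data.frame(
  Date = seq(as.Date("1993-12-01"), by = "month", length.out = 6),
  val1 = c(21, 21, 21, 30, 35.5, 36)
)

# Take the numeric month and use it to index month.abb, so the labels
# do not depend on the locale the way format(..., "%b") does
month_num <- as.integer(format(df$Date, "%m"))
df$Month <- factor(month.abb[month_num], levels = month.abb)

# Levels are in calendar order; the values still follow Date
levels(df$Month)
as.character(df$Month)

# boxplot() arranges the groups by factor level
boxplot(val1 ~ Month, data = df)
```

Months with no observations still appear as empty positions on the axis, because all twelve levels are kept. If that is unwanted, droplevels(df$Month) should remove the unused ones.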
Solution 2:[2]
The month() function from the lubridate package will handle this for you.
library(lubridate)
df$Month <- month(df$Date, label=TRUE, abbr=TRUE)
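A quick check of what month() returns, assuming lubridate is installed. The dates vector is made up for illustration, using the first three dates from the question. With label = TRUE the result is an ordered factor that carries all twelve months as levels, so no manual levels<- step is needed. The label text itself follows the locale, so only locale-independent properties are inspected here.

```r
library(lubridate)

# Illustrative dates taken from the first rows of the question's df
dates <- as.Date(c("1993-12-01", "1994-01-01", "1994-02-01"))

m <- month(dates, label = TRUE, abbr = TRUE)

is.ordered(m)   # TRUE: an ordered factor
nlevels(m)      # 12: every month is a level, even those not present
as.integer(m)   # 12 1 2: the codes are the calendar month numbers
```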
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | A5C1D2H2I1M1N2O1R2T1 |
Solution 2 | mac |