How to convert months to a factor while keeping the months in sequence?

I have an original data frame (df) containing around 10 years of data (1994-2003). The output of head(df) is shown below:

Sl.no       Date Year Month Season            val1            val2     val3
1     1 1993-12-01 1993   Dec Winter          21.0            16.0      3.0
2     2 1994-01-01 1994   Jan Winter          21.0            15.5      0.0
3     3 1994-02-01 1994   Feb Winter          21.0            18.5      0.0
4     4 1994-03-01 1994   Mar Spring          30.0            24.0      1.9
5     5 1994-04-01 1994   Apr Spring          35.5            27.0      0.5
6     6 1994-05-01 1994   May Spring          36.0            30.0      1.5

Since I wanted to convert Month to a factor in order to draw a boxplot, I used:

df$Month <- as.factor(format(df$Date, "%b"))
levels(df$Month) <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec")

However, the output appeared as below (the months no longer match the Date column as they did in the original df):

Sl.no       Date Year Month Season          val1             val2      val3
1     1 1993-12-01 1993   Mar Winter          21.0            16.0      3.0
2     2 1994-01-01 1994   May Winter          21.0            15.5      0.0
3     3 1994-02-01 1994   Apr Winter          21.0            18.5      0.0
4     4 1994-03-01 1994   Aug Spring          30.0            24.0      1.9
5     5 1994-04-01 1994   Jan Spring          35.5            27.0      0.5
6     6 1994-05-01 1994   Sep Spring          36.0            30.0      1.5

In the df above, the months are scrambled: for example, the row dated 1993-12-01 is now labelled Mar instead of Dec. The Month values should follow the Date column.

How can I rectify this problem? Your help will be highly appreciated. Kind regards



Solution 1:[1]

Use

df$Month <- factor(format(df$Date, "%b"), month.abb, ordered = TRUE)

Demo of the problem you're facing:

set.seed(1)
M <- sample(month.abb, 20, TRUE)
M
#  [1] "Apr" "May" "Jul" "Nov" "Mar" "Nov" "Dec" "Aug" "Aug" "Jan" "Mar" "Mar" "Sep" "May"
# [15] "Oct" "Jun" "Sep" "Dec" "May" "Oct"

your_attempt <- as.factor(M)
your_attempt
#  [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Apr Aug Dec Jan Jul Jun Mar May Nov Oct Sep

## At this step, you're basically asking R to replace "Apr" with "Jan",
##   "Aug" with "Feb", and so on. Not what you're looking for....
levels(your_attempt) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", 
                          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

your_attempt
#  [1] Jan Aug May Sep Jul Sep Mar Feb Feb Apr Jul Jul Nov Aug Oct Jun Nov Mar Aug Oct
# Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

## ordered = TRUE is not strictly required; levels = month.abb alone already
##   puts the levels in calendar order. Whether you need it depends on what you want to do
new_attempt <- factor(M, levels = month.abb, ordered = TRUE)
new_attempt
#  [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < Oct < Nov < Dec
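
Applied to a data frame like the one in the question, the fix can be sketched as follows. The data frame here is a made-up stand-in built from the six rows of head(df). The second variant, which indexes month.abb by month number, is an addition that is not part of the original answer; it is meant for sessions whose locale is not English, where format(..., "%b") would return month names that do not match month.abb and would therefore give NA.

```r
# Stand-in for the question's data frame (first six rows only; an assumption)
df <- data.frame(
  Date = as.Date(c("1993-12-01", "1994-01-01", "1994-02-01",
                   "1994-03-01", "1994-04-01", "1994-05-01")),
  val1 = c(21.0, 21.0, 21.0, 30.0, 35.5, 36.0)
)

# Variant 1: as in the answer; assumes an English LC_TIME locale
# df$Month <- factor(format(df$Date, "%b"), levels = month.abb)

# Variant 2: locale-independent; $mon is 0-based, hence the + 1
df$Month <- factor(month.abb[as.POSIXlt(df$Date)$mon + 1], levels = month.abb)

as.character(df$Month)  # labels still follow the Date column
levels(df$Month)        # levels are in calendar order

# Boxes are drawn in level order, i.e. Jan, Feb, ..., Dec
boxplot(val1 ~ Month, data = df)
```

The key point in both variants is that the desired order is passed through the levels argument when the factor is created, rather than assigned with levels<- afterwards.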

Solution 2:[2]

The month() function from the lubridate package will handle this for you. With label=TRUE it returns an ordered factor whose levels are already in calendar order.

library(lubridate)
df$Month <- month(df$Date, label=TRUE, abbr=TRUE)
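
A quick check of what this call returns, assuming lubridate is installed; the dates vector is a made-up example. The level names follow the session locale, so only locale-independent properties are inspected here.

```r
library(lubridate)

dates <- as.Date(c("1993-12-01", "1994-01-01", "1994-02-01"))  # example input
m <- month(dates, label = TRUE, abbr = TRUE)

is.ordered(m)   # TRUE: label = TRUE yields an ordered factor
nlevels(m)      # 12, even though only three months occur in the data
as.integer(m)   # level positions match month numbers: 12 1 2
```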

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 A5C1D2H2I1M1N2O1R2T1
Solution 2 mac