Start of the week on Monday in Spark
This is my dataset:
from pyspark.sql import SparkSession, functions as F
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([('2021-02-07',), ('2021-02-08',)], ['date']) \
    .select(
        F.col('date').cast('date'),
        F.date_format('date', 'EEEE').alias('weekday'),
        F.dayofweek('date').alias('weekday_number')
    )
df.show()
#+----------+-------+--------------+
#|      date|weekday|weekday_number|
#+----------+-------+--------------+
#|2021-02-07| Sunday|             1|
#|2021-02-08| Monday|             2|
#+----------+-------+--------------+
dayofweek returns weekday numbers that start on Sunday (1 = Sunday, ..., 7 = Saturday).
How can I return weekday numbers with the week starting on Monday instead of Sunday? I.e.:
+----------+-------+--------------+
|      date|weekday|weekday_number|
+----------+-------+--------------+
|2021-02-07| Sunday|             7|
|2021-02-08| Monday|             1|
+----------+-------+--------------+
Solution 1:[1]
Apparently, there is a weekday SQL function which can do it. It returns 0 for Monday through 6 for Sunday, so adding 1 gives the desired numbering. It can be accessed using expr.
from pyspark.sql import SparkSession, functions as F
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([('2021-02-07',), ('2021-02-08',)], ['date']) \
    .select(
        F.col('date').cast('date'),
        F.date_format('date', 'EEEE').alias('weekday'),
        F.expr('weekday(date) + 1').alias('weekday_number'),
    )
df.show()
#+----------+-------+--------------+
#|      date|weekday|weekday_number|
#+----------+-------+--------------+
#|2021-02-07| Sunday|             7|
#|2021-02-08| Monday|             1|
#+----------+-------+--------------+
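If the weekday function is not available in your Spark version, the same result can probably be obtained by shifting the output of dayofweek with modular arithmetic. The sketch below is not part of the original answers: monday_first, spark_dayofweek and monday_first_column are hypothetical helper names, and the pure-Python part only emulates dayofweek's numbering so that the mapping can be checked without a Spark installation.

```python
from datetime import date


def monday_first(sunday_first: int) -> int:
    """Map 1 = Sunday ... 7 = Saturday (dayofweek) to 1 = Monday ... 7 = Sunday."""
    return (sunday_first + 5) % 7 + 1


def spark_dayofweek(d: date) -> int:
    """Emulate dayofweek's numbering in plain Python (1 = Sunday ... 7 = Saturday)."""
    return d.isoweekday() % 7 + 1


def monday_first_column(date_col="date"):
    """Hypothetical helper: the same arithmetic as a Spark column expression."""
    # Imported lazily so the pure-Python check runs without pyspark installed.
    from pyspark.sql import functions as F
    return ((F.dayofweek(date_col) + 5) % 7 + 1).alias("weekday_number")


# Check the mapping on the two dates from the question.
for d in (date(2021, 2, 7), date(2021, 2, 8)):
    print(d, spark_dayofweek(d), monday_first(spark_dayofweek(d)))
# 2021-02-07 1 7
# 2021-02-08 2 1
```

In the select from the question, monday_first_column('date') would take the place of the F.expr(...) line. Unlike the expr version, this keeps everything in the Column API, at the cost of a less obvious formula.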
Solution 2:[2]
You can try this:
F.date_format(F.col('date'), 'u').alias('weekday_number')
For some reason, the 'u' pattern is not listed in Spark's documentation of datetime patterns for formatting. Note that date_format returns a string, so cast the result to int if you need a numeric column.
You might also need to add this configuration line:
spark.conf.set('spark.sql.legacy.timeParserPolicy', 'LEGACY')
Thanks for your feedback and very happy to help =)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Christophe |