PySpark: Return next week's Saturday

I'm trying to return the date of next week's Saturday from a date-type column, rel_d.

Normally, in Python, I'd work out the number of days until next week's Saturday and add it to rel_d:

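from dateutil.relativedelta import relativedelta

def next_saturday(dt):
    # weekday() is 0 for Monday and 5 for Saturday, so 12 - weekday() lands on the following week's Saturday
    next_sat_dt = dt + relativedelta(days=(12 - dt.weekday()))
    return next_sat_dt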

Creating a UDF in PySpark for this seems heavyweight. Is there a built-in Spark operation that could do it faster?



Solution 1:[1]

You can chain two next_day calls in PySpark to reach next week's Saturday. next_day returns the first date after its input that falls on the given day of the week.

Note that Spark treats the week as running from Sunday to Saturday: dayofweek returns 1 for Sunday and 7 for Saturday.

So if you jump to the next Sunday and then to the next Saturday after it, you land on the Saturday of the following week, which is what you need.

You can also add multiples of 7 days with F.date_add to reach the Saturday of any later week.

from pyspark.sql import functions as F
df = df.withColumn('next_saturday_date', F.next_day(F.next_day(F.col('rel_d'), 'Sun'), 'Sat'))
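
As a rough sanity check of the two-hop idea, the sketch below emulates next_day in plain Python, so it runs without a Spark session. The helper names (next_day, next_week_saturday, question_next_saturday) and the weeks_ahead parameter are illustrative and not part of the original answer. The emulation assumes that Spark's next_day returns the first matching date strictly after its input. The weeks_ahead parameter mirrors adding multiples of 7 days with F.date_add.

```python
from datetime import date, timedelta

# Weekday numbers follow date.weekday(): Monday=0 ... Sunday=6.
WEEKDAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}


def next_day(d, day_name):
    """Plain-Python stand-in for Spark's next_day (assumption: the result is
    always strictly later than d, even when d already falls on day_name)."""
    days_ahead = (WEEKDAYS[day_name] - d.weekday() - 1) % 7 + 1  # always 1..7
    return d + timedelta(days=days_ahead)


def next_week_saturday(d, weeks_ahead=1):
    """Hop to the next Sunday, then to the next Saturday; weeks_ahead > 1
    adds whole weeks, like wrapping the result in a date_add of 7 * k days."""
    saturday = next_day(next_day(d, "Sun"), "Sat")
    return saturday + timedelta(days=7 * (weeks_ahead - 1))


def question_next_saturday(d):
    """The question's helper, rewritten with timedelta instead of relativedelta."""
    return d + timedelta(days=12 - d.weekday())


if __name__ == "__main__":
    start = date(2023, 1, 1)  # a Sunday
    for offset in range(7):
        d = start + timedelta(days=offset)
        # Columns: input date, weekday name, two-hop result, question's result
        print(d, d.strftime("%a"), next_week_saturday(d), question_next_saturday(d))
```

Under these assumptions the two versions agree for Monday through Saturday inputs and differ for a Sunday input: the two-hop version treats Sunday as the start of a week and lands 13 days later, while the question's helper treats Monday as the start and lands 6 days later. If rel_d can fall on a Sunday, it is worth deciding which convention you want.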

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Itachi