Spark 3.0 timestamp parsing doesn't work even after passing the format
This is an issue I'm facing with Spark 3.0; parsing worked before without even specifying a format.
Now I've tried explicitly specifying the format, but it still doesn't work.
Here's the input format:
Here's the code I wrote:
Clearly, the format "MM/dd/yyyy hh:mm" should have worked, but it doesn't.
So I must be missing a couple of things here.
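For reference, a minimal sketch of the kind of code that reproduces this failure is below; the column name and the sample value are assumptions borrowed from Solution 1, not the original post's data.
import org.apache.spark.sql.functions.to_timestamp
import spark.implicits._  // assuming a spark-shell session where `spark` is in scope
// Hypothetical input resembling the sample in Solution 1
val df = Seq("12/21/2018 15:17").toDF("a")
// On Spark 3.0 this typically fails (SparkUpgradeException) or yields null:
// "hh" is clock-hour-of-am-pm (1-12), so it cannot represent the value 15
df.select(to_timestamp($"a", "MM/dd/yyyy hh:mm")).show()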
Solution 1:[1]
What you're doing is no longer correct: since Spark 3.0 there have been major changes to datetime formatting.
Here is a working example:
import org.apache.spark.sql.functions.to_timestamp
import spark.implicits._
val df = Seq("12/21/2018 15:17").toDF("a")
df.select(to_timestamp($"a", "M/d/yyyy H:mm")).show()
Notice the capital H? It stands for hours 0-23, while the lowercase h stands for hours 1-12 and needs an am/pm marker to resolve.
reference: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
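A related note, not from the original answer: Spark 3.0 also ships the spark.sql.legacy.timeParserPolicy setting, which can restore the pre-3.0 parser instead of fixing the pattern; fixing the pattern is usually the better choice.
// Optional fallback: revert to the pre-Spark-3.0 (SimpleDateFormat-based) parser
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")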
Solution 2:[2]
It's been a while, but I'm writing anyway: I was experiencing the same problem with Spark 2.4.6. Then I used Spark SQL and it worked very well. I found the solution in this link: SparkSQL - Difference between two time stamps in minutes
Example:
sqlContext.sql("select (bigint(to_timestamp(end_timestamp,'yyyy-MM-dd HH:mm:ss'))-bigint(to_timestamp(start_timestamp,'yyyy-MM-dd HH:mm:ss')))/(60) as duration from table limit 2")
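For completeness, here's a rough DataFrame-API equivalent of that query in Scala; the column names start_timestamp and end_timestamp come from the SQL above, while the df value and the rest of the scaffolding are assumptions.
import org.apache.spark.sql.functions.to_timestamp
import spark.implicits._
// Cast each parsed timestamp to epoch seconds (what bigint(...) does in the SQL)
// and divide the difference by 60 to get the duration in minutes
val duration = df.select(
  ((to_timestamp($"end_timestamp", "yyyy-MM-dd HH:mm:ss").cast("long") -
    to_timestamp($"start_timestamp", "yyyy-MM-dd HH:mm:ss").cast("long")) / 60
  ).as("duration"))
duration.show(2)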
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Matt |
| Solution 2 | codergirrl |