'Azure Databricks Delta Table modifies the TIMESTAMP format while writing from Spark DataFrame
I am new to Azure Databricks,I am trying to write a dataframe output to a delta table which consists TIMESTAMP column. But strangely it changes the TIMESTAMP pattern after writing to delta table.
My DataFrame Output column holds the value in this format 2022-05-13 17:52:09.771
,
But After writing it to the Table, The column value is getting populated as
2022-05-13T17:52:09.771+0000
I am using below function to generate this Dataframe output
val pretsUTCText = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
val tsUTCText: String = pretsUTCTextNew.format(ts)
val tsUTCCol : Column = lit(tsUTCText)
val df = df2.withColumn(to_timestamp(timestampConverter.tsUTCCol,"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"))
The Dataframe output is returning 2022-05-13 17:52:09.771
as TIMESTAMP pattern.
But After writing it to Delta Table I see the same value is getting populated as 2022-05-13T17:52:09.771+0000
Thanks in Advance. I could not find any solution.
Solution 1:[1]
I have just found the same behaviour on Databricks as you, and it behaves differently than the Databricks document. It seems after some versions Databricks show timezone as a default so you see additional +0000. I think you can use date_format function when you populate data if you don't want it. Also, I think you don't need 'Z' in format text as it is for timezone. See the screenshot below.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | PhuriChal |