Spark Delta table restore to version
I am trying to restore a Delta table to a previous version using Spark with Java, running in a local IDE. The code is as below:
import io.delta.tables.*;
DeltaTable deltaTable = DeltaTable.forPath(spark, <path-to-table>); // look up the table by path
// or, alternatively: DeltaTable deltaTable = DeltaTable.forName(spark, <table-name>);
deltaTable.restoreToVersion(0); // restore table to oldest version
deltaTable.restoreToTimestamp("2019-02-14"); // restore to a specific timestamp
This snippet comes from the Databricks documentation, but the methods it uses are not available in delta-core version 0.8.0. They are also not in the API docs.
Is this only available in the Databricks Runtime?
Currently I have to load the previous version and rewrite the DataFrame using Delta. Is there a better way to do it?
Solution 1:[1]
Delta Lake version 0.8 does not have the restoreToVersion and restoreToTimestamp methods. There is no trace of such methods in open-source Delta Lake 0.8, as you can check in the delta-lake repository.
So currently, as far as I know, there is no method other than rewriting from a previous version, as explained in the answers to this question.
EDIT
As commented by boyangeor, restoreToVersion and restoreToTimestamp are available in open-source Delta Lake from version 1.2 onward.
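As a rough illustration of the restore API mentioned in the edit, here is a minimal Python sketch. It assumes delta-spark 1.2 or later and an already configured SparkSession; the helper name `restore_delta_table` is made up for this example and is not part of Delta Lake.

```python
def restore_delta_table(delta_table, version=None, timestamp=None):
    """Restore a Delta table in place to an earlier version or timestamp.

    `delta_table` is expected to be a delta.tables.DeltaTable (Delta Lake 1.2+).
    This helper is hypothetical; it only wraps the two restore methods.
    """
    # Require exactly one of the two selectors so the intent is unambiguous.
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return delta_table.restoreToVersion(version)
    return delta_table.restoreToTimestamp(timestamp)


# Usage, assuming an existing SparkSession named `spark`:
# from delta.tables import DeltaTable
# table = DeltaTable.forPath(spark, "/tmp/delta-table")
# restore_delta_table(table, version=0)
# restore_delta_table(table, timestamp="2019-02-14")
```

On delta-core 0.8.0 these methods do not exist, so the call would fail there; rewriting from a previous version remains the fallback.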
Solution 2:[2]
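A manual rollback, in place of restoreToVersion, starts by reading the table as of the earlier version; writing that DataFrame back over the table then completes the rollback. Reading the earlier version looks like this:
In Python: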
delta_table_path = "/tmp/delta-table"
df = spark.read.format("delta").option("versionAsOf", 0).load(delta_table_path)
df.show()
In Java:
String delta_table_path = "/tmp/delta-table";
Dataset<Row> df = spark.read().format("delta").option("versionAsOf", 0).load(delta_table_path);
df.show();
In Scala:
val delta_table_path = "/tmp/delta-table"
val df = spark.read.format("delta").option("versionAsOf", 0).load(delta_table_path)
df.show()
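The snippets above only read the older version. To actually roll the table back on a Delta Lake release without restoreToVersion, the DataFrame has to be written back over the table. A minimal Python sketch of that step follows; the function name `rollback_by_overwrite` is hypothetical, and the sketch assumes the data files of the older version have not been removed by a vacuum.

```python
def rollback_by_overwrite(spark, table_path, version):
    """Overwrite the table's current contents with those of an earlier version.

    `spark` is an existing SparkSession configured for Delta Lake.
    """
    # Time-travel read of the table as it was at `version`.
    old_df = (
        spark.read.format("delta")
        .option("versionAsOf", version)
        .load(table_path)
    )
    # Write the old snapshot back as a new commit on the same table.
    # overwriteSchema lets the write succeed if the schema changed since then.
    (
        old_df.write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")
        .save(table_path)
    )


# Usage, assuming an existing SparkSession named `spark`:
# rollback_by_overwrite(spark, "/tmp/delta-table", 0)
```

This does not erase history: the overwrite is recorded as a new table version whose contents match the chosen older version, so the intermediate versions stay available for time travel.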
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |