How to fetch the latest schema change in BigQuery and restore a deleted column within 7 days
Right now I fetch the columns with the query below:
SELECT COLUMN_NAME, DATA_TYPE
FROM `Dataset`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name="User"
But if I drop a column with the command: ALTER TABLE user DROP COLUMN blabla
the column is not actually deleted for 7 days, according to the official documentation. If I run the query above, the column is still in the schema; I just cannot insert data into it or see it in the GCP console. This inconsistency causes a real problem, because I want to write a bash script that monitors schema changes and performs some operations based on them.
My questions are:
- How can I fetch the correct schema in Spanner, one that reflects the recently deleted column?
- If the column is not actually deleted, is there an easy way to restore it?
Solution 1:[1]
If you want to find the recently deleted column, you can try searching through Cloud Logging. I'm not sure what tools Spanner supports, but if you want to use Bash you can use gcloud to fetch the logs, though it will be difficult to parse the output and extract the information you want. The command below fetches the logs for google.cloud.bigquery.v2.JobService.InsertJob, since an ALTER TABLE is treated as an InsertJob, and filters them on the actual query text where it says drop. The regex I used is not strict (for the sake of example); I suggest making it stricter.
gcloud logging read 'protoPayload.methodName="google.cloud.bigquery.v2.JobService.InsertJob" AND protoPayload.metadata.jobChange.job.jobConfig.queryConfig.query=~"Alter table .*drop.*"'
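Once the query text has been pulled out of a log entry, a stricter match can be applied locally. The following is only a sketch under assumptions: dropped_columns is a hypothetical helper name, it expects the SQL text as its first argument, and it only recognizes plain or backtick-quoted identifiers in DROP COLUMN clauses of a statement that starts with ALTER TABLE. It relies on grep -o, which GNU and BSD grep both provide.

```shell
#!/bin/sh
# Hypothetical helper: print the column name(s) dropped by an ALTER TABLE
# statement, one per line. The SQL text is passed as the first argument.
dropped_columns() {
  # Flatten newlines so a multi-line statement is matched as one line.
  sql=$(printf '%s' "$1" | tr '\n' ' ')

  # Ignore anything that does not begin with ALTER TABLE.
  printf '%s\n' "$sql" |
    grep -qiE '^[[:space:]]*alter[[:space:]]+table[[:space:]]' || return 0

  printf '%s\n' "$sql" |
    # Emit each DROP COLUMN clause (optionally with IF EXISTS) on its own line.
    grep -oiE 'drop[[:space:]]+column[[:space:]]+(if[[:space:]]+exists[[:space:]]+)?`?[A-Za-z_][A-Za-z0-9_]*`?' |
    # The column name is the last word of the clause; strip any backticks.
    awk '{ gsub(/`/, "", $NF); print $NF }'
}

# Example: prints "blabla"
dropped_columns 'Alter TABLE user drop column blabla'
```

How the query text gets from the gcloud output into this function is left out on purpose, since it depends on the output format you request from gcloud and on your environment.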
Sample snippet from the output of the gcloud command above (column PADDING was deleted by the query):
If you have options other than Bash, I suggest creating a BigQuery sink for your logs so that you can run queries there to get this information. You can also use client libraries (Python, Node.js, etc.) to query either the sink or Cloud Logging directly.
As per this SO answer, you can use BigQuery's time travel feature to query the deleted column. That answer also explains why BigQuery retains the deleted column for 7 days and gives a workaround for deleting a column immediately. See the linked answer for the actual query used to retrieve the deleted column and for the workaround.
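For orientation, the general shape of a time travel query can be sketched in shell as below. This is not the exact query from the linked answer. It assumes the standard FOR SYSTEM_TIME AS OF clause, a hypothetical helper named time_travel_query, and a timestamp that is both earlier than the column drop and still inside the table's time travel window.

```shell
#!/bin/sh
# Hypothetical helper: print a BigQuery query that reads a table as it
# existed a given number of hours ago (time travel).
# Arguments: 1 = dataset.table, 2 = hours back (non-negative integer).
time_travel_query() {
  table=$1
  hours=$2
  # Refuse anything that is not a plain non-negative integer.
  case $hours in
    ''|*[!0-9]*)
      echo "hours must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf 'SELECT * FROM `%s` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL %s HOUR)\n' \
    "$table" "$hours"
}

# Example wiring (assumes the bq CLI is installed and authenticated):
#   bq query --use_legacy_sql=false "$(time_travel_query Dataset.User 1)"
```

Building the query as a string keeps the function usable without credentials; the result could then be wrapped in a CREATE TABLE ... AS statement to copy the older version of the table, if that fits your case.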
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |