How to change Hudi table version via Hudi CLI

How do I change the table version via the Hudi CLI?

Steps:

  1. SSH into the EMR cluster.
  2. Start the Hudi CLI with /usr/lib/hudi/cli/bin/hudi-cli.sh. The Hudi CLI reports version 1.
  3. Connect to my table: connect --path s3://bucket/db/table

In the desc output for the table I see version=3, but I want to write to the table with Hudi 0.9.0, so I would like to set the table to version=2. The error I get is:

org.apache.hudi.exception.HoodieException: Unknown versionCode:3
  at org.apache.hudi.common.table.HoodieTableVersion.lambda$versionFromCode$1(HoodieTableVersion.java:54)
  at java.util.Optional.orElseThrow(Optional.java:290)
  at org.apache.hudi.common.table.HoodieTableVersion.versionFromCode(HoodieTableVersion.java:54)
  at org.apache.hudi.common.table.HoodieTableConfig.getTableVersion(HoodieTableConfig.java:246)
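
For reference, the version that desc reports is stored under the key hoodie.table.version in the .hoodie/hoodie.properties file below the table base path, so it can also be checked without the Hudi CLI. The following is only a minimal sketch: the function name hudi_table_version is made up for illustration, and the usage line assumes the AWS CLI is installed and can read the bucket from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hudi): reads a hoodie.properties stream on
# stdin and prints the value of hoodie.table.version.
hudi_table_version() {
  # Keep the last matching line, drop the key, and strip whitespace.
  grep -E '^hoodie\.table\.version[[:space:]]*=' \
    | tail -n 1 \
    | cut -d= -f2- \
    | tr -d '[:space:]'
}

# Example usage on the cluster (assumes the AWS CLI and read access):
#   aws s3 cp s3://bucket/db/table/.hoodie/hoodie.properties - | hudi_table_version
```

If the key is missing, the function prints nothing, so an empty result should be treated as "unknown" rather than as a specific version.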


Solution 1:[1]

Sadly, I'm not aware of any way to use Hudi 0.9.0 to downgrade the table from version 3 to 2, because of the error you are getting: 0.9.0 does not recognize table version 3, and it has no way of knowing what 0.10.0 writes differently.

Recently, AWS made EMR release 6.6 available, but it isn't well documented yet. I'd recommend switching to it, because it ships a Hudi 0.10 release, whose CLI understands table version 3 and can then do that downgrade.

This page should be updated once 6.6 is added to the docs: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-app-versions-6.x.html
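
Once a 0.10 CLI is available on the cluster, the downgrade could look roughly like the session printed by the sketch below. The Hudi CLI has a downgrade table command with a --toVersion option, but I have not verified which spelling of the value the EMR build accepts (the enum name TWO or the number 2), so treat TWO here as an assumption. The function print_downgrade_session is a made-up helper that only prints the commands; they are meant to be typed at the hudi-> prompt after starting /usr/lib/hudi/cli/bin/hudi-cli.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the Hudi CLI commands for a downgrade session.
# $1 = table base path, $2 = target table version (assumed format, see above).
print_downgrade_session() {
  table_path="$1"
  target_version="$2"
  printf 'connect --path %s\n' "$table_path"
  # desc before and after, to compare the reported table version.
  printf 'desc\n'
  printf 'downgrade table --toVersion %s\n' "$target_version"
  printf 'desc\n'
}

# Path taken from the question; TWO is an assumed value for --toVersion.
print_downgrade_session "s3://bucket/db/table" "TWO"
```

It is worth copying the .hoodie folder somewhere safe before running the downgrade, and confirming with the second desc that the version changed before pointing a 0.9.0 writer at the table.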

Side note: if you are using the bootstrap action script provided by AWS to mitigate the log4j vulnerability, I'd recommend taking the script provided for 6.5 and editing it for 6.6. There is no 6.6 script available at this time, but I did that and was not able to detect any vulnerabilities afterwards.

This page explains the bootstrap action: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-log4j-vulnerability.html

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 JWorth