How to pass hbase-site.xml to Google Cloud Dataflow template
We have a setup where an HBase cluster is running on Google Cloud, and I want to write into HBase tables from Dataflow. For this, I want to pass one hbase-site.xml file for staging and a different hbase-site.xml for the production environment. However, I am not able to find an option to pass a resource file to a Dataflow template. Is there any option in Dataflow similar to --files in Spark or --classpath in Flink for adding this?

I can certainly add hbase-site.xml to src/main/resources, which works, but I want a different hbase-site.xml for each of the two environments, so having an option like this would be very beneficial.
Solution 1:[1]
Are you using Beam HBaseIO, and is it possible to pass these parameters as part of the Configuration provided to it? If so, you could probably update your template to accept this config (or the values needed to create it) as a PipelineOption, and parse them in the main class.
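A minimal sketch of that idea is below. It assumes a Flex Template (or a directly launched pipeline) where options are parsed in main; the option names (--hbaseZookeeperQuorum, --hbaseTableId) and the HBaseWriteOptions interface are illustrative, not part of any existing template.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.Validation;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class HBaseWritePipeline {

  /** Environment-specific HBase settings passed on the command line instead of hbase-site.xml. */
  public interface HBaseWriteOptions extends PipelineOptions {
    @Description("Comma-separated ZooKeeper quorum of the target HBase cluster")
    @Validation.Required
    String getHbaseZookeeperQuorum();
    void setHbaseZookeeperQuorum(String value);

    @Description("HBase table to write to")
    @Validation.Required
    String getHbaseTableId();
    void setHbaseTableId(String value);
  }

  public static void main(String[] args) {
    HBaseWriteOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(HBaseWriteOptions.class);

    // Build the HBase Configuration from the options instead of loading hbase-site.xml
    // from the classpath, so staging and prod can pass different values at launch time.
    Configuration hbaseConf = HBaseConfiguration.create();
    hbaseConf.set("hbase.zookeeper.quorum", options.getHbaseZookeeperQuorum());

    Pipeline pipeline = Pipeline.create(options);

    // ... transforms producing a PCollection<Mutation> go here, then e.g.:
    // mutations.apply("WriteToHBase",
    //     HBaseIO.write()
    //         .withConfiguration(hbaseConf)
    //         .withTableId(options.getHbaseTableId()));

    pipeline.run();
  }
}
```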
If you want the file to be available locally (on the worker VM), you probably need to set up a custom container to be used by your template.
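One way to do that is to bake the file into a custom SDK container image, building a separate image per environment. A rough Dockerfile sketch follows; the base image tag and the target path are assumptions, and the resulting image would typically be referenced through Dataflow's --sdkContainerImage pipeline option.

```dockerfile
# Illustrative only: pick the Beam SDK base image matching your SDK version.
FROM apache/beam_java11_sdk:2.xx.0

# Bake the environment-specific hbase-site.xml into the image; build one image
# per environment (staging, prod) with the appropriate file.
COPY hbase-site.xml /opt/hbase/conf/hbase-site.xml
```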
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | chamikara |