Apache Beam: run Docker in a pipeline
The Apache Beam pipeline (Python) I'm currently working on contains a transform that runs a Docker container.
While that works well during local testing with the DirectRunner, I'm wondering how one would deploy such a pipeline:
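For context, the kind of transform in question can be sketched roughly like this: a helper shells out to `docker run` for each element. The image name and payload handling are hypothetical placeholders, not the actual pipeline's code, and the helper assumes a Docker daemon is reachable on the worker.

```python
import subprocess
from typing import List


def docker_command(image: str, payload: str) -> List[str]:
    # Build the `docker run` invocation for one pipeline element.
    # 'image' and 'payload' are illustrative placeholders.
    return ["docker", "run", "--rm", image, payload]


def run_in_container(image: str, payload: str) -> str:
    # Requires a Docker daemon on the worker -- exactly the
    # requirement that hosted runners fail to meet without
    # privileged Docker-in-Docker.
    result = subprocess.run(
        docker_command(image, payload),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Inside the pipeline this would be wrapped in a `DoFn` (or a simple `beam.Map(lambda el: run_in_container("my-image", el))`), which is why the worker environment must be able to launch containers.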
- Google Dataflow isn't possible. While we can use a custom Docker image, Docker-in-Docker would require the containers on Dataflow to be run with the `--privileged` flag.
- Apache Spark isn't possible either (only tested using Google's Dataproc service), since the pipeline is apparently executed within a Docker container itself.
So, what would be the easiest way to deploy such a pipeline?
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0 per Stack Overflow's attribution requirements.