'Apache Beam - Flink Container Docker Issue (java.io.IOException: Cannot run program "docker": error=2,)

I am processing a Kafka stream with Apache Beam by running the Beam Flink container they provide.

docker run --net=host apache/beam_flink1.13_job_server:latest

After setting up the container I run a script using python main.py

from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

if __name__ == '__main__':
    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",
        "--environment_type=LOOPBACK",
    ])

    pipeline = beam.Pipeline(options=options)

    result = (
        pipeline
        | "Read from kafka" >> ReadFromKafka(
            consumer_config={
                "bootstrap.servers": 'localhost:9092',
            },
            topics=['demo'],
            expansion_service='localhost:8097',
        )

        | beam.Map(print)
    )

    pipeline.run()

When doing so the Beam Flink container receives the pipeline submission but runs into the following error during execution:

Jul 15, 2021 6:24:50 PM org.apache.flink.runtime.executiongraph.Execution transitionState
INFO: Source: Impulse (1/1) (b8ee3d07bb3fba0ce7b8c6e8f1055168) switched from RUNNING to FINISHED.
Jul 15, 2021 6:24:50 PM org.apache.beam.runners.fnexecution.environment.DockerCommand runImage
WARNING: Unable to pull docker image apache/beam_java8_sdk:2.31.0, cause: Cannot run program "docker": error=2, No such file or directory
Jul 15, 2021 6:24:50 PM org.apache.flink.runtime.taskmanager.Task transitionState
WARNING: [3]Read from kafka/KafkaIO.Read/KafkaIO.Read.ReadFromKafkaViaSDF/{ParDo(GenerateKafkaSourceDescriptor), KafkaIO.ReadSourceDescriptors} (1/2)#0 (ede2e66e82e968a1df246f14a9c7e1f6) switched from INITIALIZING to FAILED with failure cause: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
    at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:451)
    at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:436)
    at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:303)
    at org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:38)
    at org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:202)
    at org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.open(ExecutableStageDoFnOperator.java:243)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:437)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:582)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:562)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Cannot run program "docker": error=2, No such file or directory

Are there any dependencies on running this container?



Solution 1:[1]

I had "python" instead of "docker", the problem was because of I was working with python3, for my case the solution was with:

sudo ln -s /usr/bin/python3 /usr/bin/python

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tyler2P