Category "flink-streaming"

The RemoteEnvironment cannot be used when submitting a program through a client, or running in a TestEnvironment context

I was trying to run the Apache Beam word count with Kafka as both input and output, but on submitting the jar to the Flink cluster I got this error: The Remot…
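
A common cause, sketched here under assumptions: the pipeline pins a remote environment while the jar is also submitted through flink run, which already wraps the program in the client's context environment. A minimal Flink-side illustration with made-up names:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class WordCountJob {
        public static void main(String[] args) throws Exception {
            // Wrong when the jar goes through `flink run` or the REST API: the
            // client has already bound the program to a cluster, so asking for
            // a remote environment again raises "The RemoteEnvironment cannot
            // be used when submitting a program through a client".
            // StreamExecutionEnvironment.createRemoteEnvironment("jobmanager", 8081, "job.jar");

            // Let Flink pick whatever environment the submission context
            // provides (remote cluster under `flink run`, local in the IDE).
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromElements("to", "be", "or", "not", "to", "be")
               .map(String::toUpperCase)
               .print();

            env.execute("word-count-sketch");
        }
    }

With Beam's FlinkRunner the equivalent is leaving --flinkMaster at its [auto] default when submitting through flink run, rather than pointing it at the cluster a second time.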

Inconsistent results when joining multiple tables in Flink

We have 4 CDC sources defined, whose data we need to combine into one result table. We create a table for each source using the SQL API, e.g.: "CREATE TA…
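
Without the full DDL this is only a guess, but a frequent source of "inconsistent" results is sinking the changelog that a regular join over CDC inputs produces into a sink that is not keyed for upserts, so intermediate retractions show up as contradictory rows. A sketch with two of the sources, hypothetical table names, and the flink-cdc MySQL connector:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class CdcJoinSketch {
        public static void main(String[] args) {
            TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

            // One CDC-backed table per source (schemas are made up here).
            tEnv.executeSql(
                "CREATE TABLE orders (" +
                "  order_id BIGINT, customer_id BIGINT, amount DECIMAL(10,2)," +
                "  PRIMARY KEY (order_id) NOT ENFORCED" +
                ") WITH ('connector' = 'mysql-cdc'," +
                " 'hostname' = 'db', 'port' = '3306'," +
                " 'username' = 'flink', 'password' = '***'," +
                " 'database-name' = 'shop', 'table-name' = 'orders')");

            tEnv.executeSql(
                "CREATE TABLE customers (" +
                "  customer_id BIGINT, name STRING," +
                "  PRIMARY KEY (customer_id) NOT ENFORCED" +
                ") WITH ('connector' = 'mysql-cdc'," +
                " 'hostname' = 'db', 'port' = '3306'," +
                " 'username' = 'flink', 'password' = '***'," +
                " 'database-name' = 'shop', 'table-name' = 'customers')");

            // The result table is an upsert sink keyed on the join's unique
            // key, so retractions from the regular join collapse correctly.
            tEnv.executeSql(
                "CREATE TABLE result_table (" +
                "  order_id BIGINT, name STRING, amount DECIMAL(10,2)," +
                "  PRIMARY KEY (order_id) NOT ENFORCED" +
                ") WITH ('connector' = 'upsert-kafka'," +
                " 'topic' = 'orders_enriched'," +
                " 'properties.bootstrap.servers' = 'kafka:9092'," +
                " 'key.format' = 'json', 'value.format' = 'json')");

            // executeSql submits the continuous INSERT job.
            tEnv.executeSql(
                "INSERT INTO result_table " +
                "SELECT o.order_id, c.name, o.amount " +
                "FROM orders o JOIN customers c ON o.customer_id = c.customer_id");
        }
    }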

Flink Python DataStream API Kafka Producer Sink Serialization

I'm trying to read data from one Kafka topic and write to another after doing some processing. I'm able to read data and process it; when I try to write it to…
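
In PyFlink-1.14-style code the usual trip-up is a missing output_type on the transformation: without it, records stay pickled Python objects that the sink's serializer cannot handle. A minimal sketch, assuming plain string messages and a local broker:

    from pyflink.common import Types
    from pyflink.common.serialization import SimpleStringSchema
    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.datastream.connectors import FlinkKafkaConsumer, FlinkKafkaProducer

    env = StreamExecutionEnvironment.get_execution_environment()
    # The Kafka connector jar must be on the classpath, e.g.:
    # env.add_jars("file:///path/to/flink-sql-connector-kafka_2.11-1.14.4.jar")

    props = {'bootstrap.servers': 'localhost:9092', 'group.id': 'demo'}

    source = FlinkKafkaConsumer('input-topic', SimpleStringSchema(), props)

    # Declaring output_type is what makes SimpleStringSchema on the sink
    # work; otherwise the map emits pickled objects the producer rejects.
    stream = env.add_source(source).map(lambda s: s.upper(),
                                        output_type=Types.STRING())

    sink = FlinkKafkaProducer('output-topic', SimpleStringSchema(), props)
    stream.add_sink(sink)

    env.execute('kafka-roundtrip-sketch')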

No ExecutorFactory found to execute the application in Flink 1.11.1

First of all, I have read this post about the same issue and tried to follow the solution that worked for him (create a new quickstart with mvn and migrate t…
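
For reference, the usual cause in 1.11 is that flink-clients stopped being a transitive dependency of the streaming API, so the executor factories it registers via ServiceLoader are absent at runtime. A pom.xml fragment (the Scala suffix must match the other Flink artifacts in the build):

    <!-- Flink 1.11 no longer pulls flink-clients in transitively; without it,
         execute() fails with "No ExecutorFactory found to execute the application". -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>1.11.1</version>
    </dependency>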

Create pyFlink DataStream Consumer from Tweets Kafka Producer in Python

I want to create a streaming Kafka consumer in pyFlink which can read tweet data after deserialization (JSON). I have pyflink version 1.14.4 (the latest version). Can I…
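
A minimal consumer sketch for pyflink 1.14.4, assuming the tweets arrive as flat JSON with hypothetical fields (adjust the row type to the producer's actual schema) and that the Kafka connector jar is available:

    from pyflink.common import Types
    from pyflink.common.serialization import JsonRowDeserializationSchema
    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.datastream.connectors import FlinkKafkaConsumer

    env = StreamExecutionEnvironment.get_execution_environment()
    # env.add_jars("file:///path/to/flink-sql-connector-kafka_2.11-1.14.4.jar")

    # Hypothetical tweet fields; must mirror the JSON the producer writes.
    row_type = Types.ROW_NAMED(['id', 'text', 'user'],
                               [Types.LONG(), Types.STRING(), Types.STRING()])

    deserializer = JsonRowDeserializationSchema.builder() \
        .type_info(row_type).build()

    consumer = FlinkKafkaConsumer(
        topics='tweets',
        deserialization_schema=deserializer,
        properties={'bootstrap.servers': 'localhost:9092',
                    'group.id': 'tweet-reader'})
    consumer.set_start_from_earliest()

    env.add_source(consumer).print()
    env.execute('tweet-consumer-sketch')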

If we cancel a job with a savepoint, and the job got cancelled but the savepoint failed, how do we restore the job now?

I have deployed a Flink job in application mode using a native Kubernetes deployment and am stopping the job along with a savepoint (I'm using the REST API for that), but…
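
For orientation, the shape of the calls involved, with placeholder IDs and paths: stop-with-savepoint is a single REST call, and if the savepoint side failed after the job was already cancelled, the usual escape hatch is resubmitting from the last retained checkpoint (which requires retention on cancellation to have been configured beforehand):

    # Stop with savepoint; the response carries a triggerid you can poll:
    curl -X POST http://<jobmanager>:8081/jobs/<jobid>/stop \
         -H 'Content-Type: application/json' \
         -d '{"targetDirectory": "s3://bucket/savepoints", "drain": false}'

    # If the savepoint failed but the job is gone, resubmit from the last
    # retained checkpoint instead (needs
    # execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION):
    flink run -s s3://bucket/checkpoints/<jobid>/chk-<n> application.jar

In application mode the restore path can also be passed as -Dexecution.savepoint.path=<path> to flink run-application instead of the -s flag.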

Flink TaskManager not reconnecting to the new JobManager

I have configured Flink in HA mode as mentioned here. I wanted to test fault tolerance, hence I did the following: set up a Flink cluster with 2 JobManagers…
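
One thing worth double-checking, sketched with placeholder hosts: in ZooKeeper HA mode the TaskManagers discover the leading JobManager through ZooKeeper, so the HA keys must be present in every TaskManager's flink-conf.yaml as well; jobmanager.rpc.address is not consulted for leader lookup.

    # flink-conf.yaml, identical on JobManagers and TaskManagers:
    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
    high-availability.storageDir: hdfs:///flink/ha
    high-availability.cluster-id: /my-cluster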

Flink checkpoint not replaying the Kafka events which were in process during the savepoint/checkpoint

I want to test end-to-end exactly-once processing in Flink. My job is: Kafka-source -> mapper1 -> mapper-2 -> kafka-sink. I had put a Thread.sleep(100…
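
For context: events in flight at checkpoint time are not stored in the checkpoint; the source's offsets are, so after a restore the source rewinds and those events are reprocessed rather than lost, becoming visible downstream only when the sink's Kafka transaction commits. A sketch with the 1.14-era KafkaSource/KafkaSink, topic names and addresses made up:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExactlyOnceSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Offsets are part of the checkpoint: on restore the source
            // rewinds, so events sleeping in a mapper during the checkpoint
            // are replayed, not lost. They surface downstream only when the
            // sink's transaction commits (on checkpoint completion).
            env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);

            KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("input-topic")
                .setGroupId("eos-demo")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

            KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                    .setTopic("output-topic")
                    .setValueSerializationSchema(new SimpleStringSchema())
                    .build())
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("eos-demo")
                .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
               .map(String::trim)   // stand-in for mapper1 / mapper-2
               .sinkTo(sink);

            env.execute("exactly-once-sketch");
        }
    }

The verifying consumer must set isolation.level=read_committed; with the default read_uncommitted it also sees records from aborted transactions, and exactly-once will appear broken even when it is working.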

Apache Flink - writing stream to S3 error - null uri host

I have a Flink data pipeline that transforms the log file downloaded from S3 and writes it back in Parquet format to another S3 bucket. I have configured the S…
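
The "null uri host" message comes out of the Hadoop S3A filesystem when the bucket portion of the URI does not parse as a hostname; classic triggers are an underscore in the bucket name, credentials embedded in the URI, or a path missing the bucket entirely. A sketch of a well-formed sink path with a made-up bucket (row format used here for brevity; the same path rules apply to the bulk/Parquet variant):

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    public class S3SinkSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.enableCheckpointing(60_000);  // StreamingFileSink finalizes files on checkpoints

            // Wrong: "s3://my_bucket/out"        underscore -> host parses as null
            // Wrong: "s3://KEY:SECRET@bucket/o"  credentials belong in flink-conf.yaml
            //                                    (s3.access-key / s3.secret-key)
            StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("s3a://my-bucket/output"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build();

            env.fromElements("one", "two").addSink(sink);
            env.execute("s3-sink-sketch");
        }
    }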

Integration testing a Flink job

I've written a small Flink application. It has some input and enriches it with data from an external source. It's a RichAsyncFunction, and within the open metho…
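
The documented pattern for this is MiniClusterWithClientResource from flink-test-utils, with the external client behind the RichAsyncFunction replaced by a stub the test controls. A condensed sketch (JUnit 4, hypothetical names; the map stands in for the async enrichment step):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.SinkFunction;
    import org.apache.flink.test.util.MiniClusterWithClientResource;
    import org.junit.Assert;
    import org.junit.ClassRule;
    import org.junit.Test;

    public class EnrichmentJobIT {

        @ClassRule
        public static final MiniClusterWithClientResource FLINK =
            new MiniClusterWithClientResource(
                new MiniClusterResourceConfiguration.Builder()
                    .setNumberTaskManagers(1)
                    .setNumberSlotsPerTaskManager(2)
                    .build());

        @Test
        public void enrichesInput() throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(2);
            CollectSink.values.clear();

            // Stand-in for AsyncDataStream.unorderedWait(input, asyncFn, ...);
            // inject a stubbed client into the RichAsyncFunction (constructor
            // argument or factory) so open() does not reach the real service.
            env.fromElements("a", "b")
               .map(v -> v + "-enriched")
               .addSink(new CollectSink());

            env.execute();

            Assert.assertTrue(CollectSink.values.contains("a-enriched"));
        }

        // Collecting sink per Flink's testing docs: a static list, because
        // the sink instance is serialized and shipped to the (local) cluster.
        private static class CollectSink implements SinkFunction<String> {
            static final List<String> values =
                Collections.synchronizedList(new ArrayList<>());
            @Override
            public void invoke(String value, Context ctx) {
                values.add(value);
            }
        }
    }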

Apache Flink: AWS S3 timeout exception when starting a job from a savepoint

I have a Flink job with large state in a Map operator. We are taking a savepoint of around 80 GB, storing it to AWS S3. We have a parallelism of around 100 for…
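
One well-trodden cause with large savepoints and high parallelism is exhausting the S3A client's connection pool: around a hundred subtasks opening state streams at once against a default pool of roughly 15 connections turns into timeouts waiting for a connection. A flink-conf.yaml sketch (flink-s3-fs-hadoop forwards s3.* keys to fs.s3a.*; the values are illustrative, not tuned):

    # Larger HTTP connection pool and thread pool for the S3A client:
    s3.connection.maximum: 500
    s3.threads.max: 100
    # More headroom before a single read is abandoned (milliseconds):
    s3.connection.timeout: 120000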

Flink Missing Events With Windowed Processor (Event Time Windows) and Kafka Source

We have a Streaming Job that has 20 separate pipelines, with each pipeline having one or many Kafka topic sources and some pipelines having Windowed Processor…
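
Without the full topology this is guesswork, but with many Kafka sources feeding event-time windows the usual suspects are watermark skew (an operator's watermark is the minimum over all its inputs) and late events being silently dropped. A sketch showing where lateness and idleness are dialed in, with a made-up Event type:

    import java.time.Duration;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.OutputTag;

    public class WindowLatenessSketch {
        // Minimal stand-in for the real event type.
        public static class Event {
            public String key; public long ts; public long count;
            public Event() {}
            public Event(String key, long ts, long count) {
                this.key = key; this.ts = ts; this.count = count;
            }
        }

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Event> events = env
                .fromElements(new Event("a", 1_000L, 1), new Event("a", 2_000L, 1))
                .assignTimestampsAndWatermarks(
                    WatermarkStrategy
                        .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(30))
                        .withTimestampAssigner((e, ts) -> e.ts)
                        // Without idleness, one empty Kafka partition or topic
                        // holds the watermark back for the whole pipeline.
                        .withIdleness(Duration.ofMinutes(1)));

            final OutputTag<Event> lateTag = new OutputTag<Event>("late") {};

            events
                .keyBy(e -> e.key)
                .window(TumblingEventTimeWindows.of(Time.minutes(5)))
                .allowedLateness(Time.minutes(1))   // keep late-but-close events
                .sideOutputLateData(lateTag)        // make dropped events observable
                .reduce((a, b) -> new Event(a.key, Math.max(a.ts, b.ts), a.count + b.count))
                .print();

            env.execute("window-lateness-sketch");
        }
    }

Routing late data to a side output is a cheap way to confirm whether the "missing" events are actually arriving behind the watermark rather than being lost upstream.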