Category "apache-flink"

if we cancel the job with savepoint, job got cancelled and savepoint was failure how to restore this job now

I have deployed flink job in application mode using native kubernetes deployment and stopping job along with savepoint (I'm using rest api command for that) but

java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/scala/StreamExecutionEnvironment

package com.knoldus import org.apache.flink.api.java.utils.ParameterTool import org.apache.flink.streaming.api.scala._ import org.apache.flink.streaming.api.win

Flink TaskManager not reconnecting to the new Jobmanager

I have configured Flink in HA mode as mentioned here: I wanted to test the fault tolerance, hence I did the following: Setup Flink cluster with 2 JobManagers

how to filter None in Row Type Pyflink

I'm using Pyflink and was wondering if there is more Generic way to filter None value or handle none json format. # main(): try: data_stream = data_

How to drain the window after a Flink join using coGroup()?

I'd like to join data coming in from two Kafka topics ("left" and "right"). Matching records are to be joined using an ID, but if a "left" or a "right" record i

Flink checkpoint not replaying the kafka events which were in process during the savepoint/checkpoint

I want to test end-to-end exactly once processing in flink. My job is: Kafka-source -> mapper1 -> mapper-2 -> kafka-sink I had put a Thread.sleep(100

Apache Flink - writing stream to S3 error - null uri host

I have a Flink data pipeline that transforms the log file downloaded from S3 and write back in parquet file format to another S3 bucket. I have configured the S

Cannot access Flink dashboard localhost:8081 on windows

I follow the first steps to install Flink. I can start the cluster without any problem $ start-cluster.sh Starting cluster. Starting standalonesession daemon on

Integration testing flink job

I've written a small flink application. I has some input, and enriches it with data from an external source. It's an RichAsyncFunction and within the open metho

What should I use instead deprecated FlinkKafkaConsumer? Scala Flink

I try to get data from Kafka to Flink, I use FlinkKafkaConsumer but Intellij shows me that it is depricated and also ssh console in Google Cloud shows me this e

Windowing on mapped stream is giving different results

Use Case : Map the input stream of sometype to another type and sink. Also apply the aggragator on the mapped stream to count the number of events of each event

FLINK1.8 , ERROR INFO : state is larger than the maximum permitted memory-backed state. use RocksDBStateBackend not useful

I'm running hibench's flinkbench with FLINK1.8 Stateful Wordcount (wordcount) This workload counts words cumulatively received from Kafka every few seconds. T

I'm looking for code to connect PyFlink to Pulsar

I've been trying various examples, but I need to see how to connect PyFlink to Pulsar. I have Pulsar 2.8.0, Flink 1.13.1 and Scala 2.11. I just need to see how

Kafka Client Timeout of 60000ms expired before the position for partition could be determined

I'm trying to connect Flink to a Kafka consumer I'm using Docker Compose to build 4 containers zookeeper, kafka, Flink JobManager and Flink TaskManager. For z

Apache Flink: AWS S3 timeout exception when starting a job from a savepoint

I have a Flink job which has large state in a Map operator. We are taking savepoint which has around 80GB storing to AWS S3. We have around 100 parallelism for

Apache Flink + CEP - Detect same events

I'd like to detect events that share the same property. Suppose I have a simple case class: case class Record(name: String, value: Int) Suppose there is the

Flink Missing Events With Windowed Processor(Event Time Windows) and Kafka Source

We have a Streaming Job that has 20 separate pipelines, with each pipeline having one/many Kafka topic sources and with some pipelines having Windowed Processor

What is the Java version that the Flink can support in 2022?

Let's say if I start a new Flink Java project, and if I look for "stable Flink Java production experience", which version should I need to use? The official doc

where's socketTextStream in pyflink

I want to translate the following code into pyflink and run it in pyflink-shell.sh afterwards. public class MapDemo { private static int index = 1; pub

Kinesis Analytics SQL query to narrow down the sensors that are not sending data

Context: We use Kinesis analytics to process our sensor data and find anomalies in the sensor data. Goal: We need to identify the sensors that didn’t send