Category "apache-flink"

Integration testing flink job

I've written a small flink application. I has some input, and enriches it with data from an external source. It's an RichAsyncFunction and within the open metho

What should I use instead deprecated FlinkKafkaConsumer? Scala Flink

I try to get data from Kafka to Flink, I use FlinkKafkaConsumer but Intellij shows me that it is depricated and also ssh console in Google Cloud shows me this e

Windowing on mapped stream is giving different results

Use Case : Map the input stream of sometype to another type and sink. Also apply the aggragator on the mapped stream to count the number of events of each event

FLINK1.8 , ERROR INFO : state is larger than the maximum permitted memory-backed state. use RocksDBStateBackend not useful

I'm running hibench's flinkbench with FLINK1.8 Stateful Wordcount (wordcount) This workload counts words cumulatively received from Kafka every few seconds. T

I'm looking for code to connect PyFlink to Pulsar

I've been trying various examples, but I need to see how to connect PyFlink to Pulsar. I have Pulsar 2.8.0, Flink 1.13.1 and Scala 2.11. I just need to see how

Kafka Client Timeout of 60000ms expired before the position for partition could be determined

I'm trying to connect Flink to a Kafka consumer I'm using Docker Compose to build 4 containers zookeeper, kafka, Flink JobManager and Flink TaskManager. For z

Apache Flink: AWS S3 timeout exception when starting a job from a savepoint

I have a Flink job which has large state in a Map operator. We are taking savepoint which has around 80GB storing to AWS S3. We have around 100 parallelism for

Apache Flink + CEP - Detect same events

I'd like to detect events that share the same property. Suppose I have a simple case class: case class Record(name: String, value: Int) Suppose there is the

Flink Missing Events With Windowed Processor(Event Time Windows) and Kafka Source

We have a Streaming Job that has 20 separate pipelines, with each pipeline having one/many Kafka topic sources and with some pipelines having Windowed Processor

What is the Java version that the Flink can support in 2022?

Let's say if I start a new Flink Java project, and if I look for "stable Flink Java production experience", which version should I need to use? The official doc

where's socketTextStream in pyflink

I want to translate the following code into pyflink and run it in pyflink-shell.sh afterwards. public class MapDemo { private static int index = 1; pub

Kinesis Analytics SQL query to narrow down the sensors that are not sending data

Context: We use Kinesis analytics to process our sensor data and find anomalies in the sensor data. Goal: We need to identify the sensors that didn’t send