'Why does co-partitioning of two Kstreams in kafka require same number of partitions for both the streams?
I wanted to know why does co-partitioning of two Kstreams in kafka require same number of partitions for both the streams as is given in the documentation in below URL: enter link description here
Solution 1:[1]
As the name "co-partition" indicates, you want to put data from different topics but same key to the same Kafka Streams application instance. If you don't have the same number of partitions, it's not possible to get this behavior.
Assume you have topic A with 2 partitions and topic B with 3 partitions. Thus, it can happen that one record with key X is hashed to partitions A-0 and B-1 (ie, not same partition number). However, for a different key Y it might be hashed to A-0 but B-2.
Only if the number of partitions is the same for both topics, records with same key end up in the same partitions (of different topics of course), and this allows to process A-0/B-0 and A-1/B-1 etc together.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |