How do the commit interval and cache max buffer behave in a sub-topology with multiple state stores?

Taken from the book: Mastering Kafka Streams and ksqlDB

Kafka Streams uses a depth-first strategy when processing data. When a new record is received, it is routed through each stream processor in the topology before another record is processed.

Also from the documentation: https://docs.confluent.io/platform/current/streams/architecture.html#

Kafka Streams does not use a backpressure mechanism because it does not need one. Using a depth-first processing strategy, each record consumed from Kafka will go through the whole processor (sub-)topology for processing and for (possibly) being written back to Kafka before the next record will be processed.
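To check that I am reading the two quotes correctly, here is how I picture depth-first processing, as a toy model in plain Java. This is not Kafka Streams code: `DepthFirstSketch`, `run`, and the three lambda "processors" are names I made up for illustration, and the model deliberately ignores caching, which is the part I am unsure about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model only: none of these names come from the Kafka Streams API.
class DepthFirstSketch {

    // Pushes each record through every processor, in order,
    // before the next record is touched (depth-first).
    static List<String> run(List<String> records, List<UnaryOperator<String>> processors) {
        List<String> trace = new ArrayList<>();
        for (String record : records) {
            String value = record;
            for (int i = 0; i < processors.size(); i++) {
                value = processors.get(i).apply(value);
                trace.add("p" + (i + 1) + ":" + value);
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        // Three hypothetical processors in a row, each appending a marker.
        UnaryOperator<String> first = v -> v + "a";
        UnaryOperator<String> second = v -> v + "b";
        UnaryOperator<String> third = v -> v + "c";

        List<String> trace = run(List.of("r1", "r2"), List.of(first, second, third));
        for (String step : trace) {
            System.out.println(step);
        }
    }
}
```

In this model, record `r1` visits all three processors before `r2` is looked at. What I cannot fit into this picture is a cache sitting in front of each processor.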

My question here is:

1) When a sub-topology (i.e. a task) has multiple stateful processors (KTable operations), each with its own state store:

1-A) Does each stateful processor make use of the cache memory? If so, given that the cache memory is distributed evenly across the threads, and that a thread executes its tasks (in sequence, I guess), how is that memory allocated to the cache of each stateful processor of the task being executed?
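For concreteness, these are the settings I mean, with the per-thread split worked out. This is a sketch with example values, not my real configuration; the keys `cache.max.bytes.buffering`, `num.stream.threads`, and `commit.interval.ms` are the Kafka Streams configuration names, while `CacheShareSketch` and `perThreadCacheBytes` are hypothetical helpers. The even split per thread is my understanding from the documentation; how that share is then divided among the stores of a task is exactly what I am asking, so the sketch stops at the thread level.

```java
import java.util.Properties;

// Hypothetical helper: only the three configuration keys are real Kafka Streams names.
class CacheShareSketch {

    // Per-thread share, assuming the total cache is split evenly across stream threads.
    static long perThreadCacheBytes(long totalCacheBytes, int numStreamThreads) {
        return totalCacheBytes / numStreamThreads;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("cache.max.bytes.buffering", "10485760"); // 10 MiB, example value
        props.setProperty("num.stream.threads", "4");               // example value
        props.setProperty("commit.interval.ms", "30000");           // 30 s, example value

        long total = Long.parseLong(props.getProperty("cache.max.bytes.buffering"));
        int threads = Integer.parseInt(props.getProperty("num.stream.threads"));

        // 10485760 / 4 = 2621440 bytes (2.5 MiB) per thread
        System.out.println(perThreadCacheBytes(total, threads));
    }
}
```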

1-B) If each stateful processor makes use of a cache, how does the depth-first processing happen? If I have, let's say, 3 stateful processors in a row, how does a record go through the 3 processors? I understand that reaching the commit interval or filling the cache triggers forwarding downstream, flushing the state store, and writing to the changelog topic of a stateful processor's state store. However, when there are 3 in a row, I get confused as to how this would work.

1-B-1) Will records be forwarded between processors only when the commit interval is reached or a cache is full? And would the time for a record to traverse the sub-topology therefore be, hypothetically (if the processors' caches never fill up), the commit interval multiplied by 3?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
