Why is retention.ms of a Kafka Streams repartition topic set to -1 by default? Doesn't this retain messages in the repartition topic indefinitely?

I think it's related to the below links, but I don't understand.

It's possible to provide topic configurations such as "retention.ms" and "cleanup.policy" for Kafka Streams internal topics like *-changelog topics, so that logs that are no longer needed get deleted.
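
For readers unfamiliar with how such settings are passed: one documented route is the "topic." prefix (the string that StreamsConfig.topicPrefix(...) prepends), which Kafka Streams uses as default topic configs when it creates internal topics. Below is a minimal sketch using plain java.util.Properties so it runs without Kafka on the classpath. The class name, application id, broker address, and the one-day retention value are placeholders I made up, and the sketch does not show which internal topic types end up honoring the override.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys are real Kafka Streams config names.
class InternalTopicConfigSketch {

    static Properties streamsProps() {
        Properties props = new Properties();
        // Placeholder values for a local setup.
        props.setProperty("application.id", "example-app");
        props.setProperty("bootstrap.servers", "localhost:9092");
        // "topic." + "retention.ms" is what StreamsConfig.topicPrefix("retention.ms") yields.
        // The value (one day in milliseconds) is an arbitrary example.
        props.setProperty("topic.retention.ms", "86400000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps().getProperty("topic.retention.ms"));
    }
}
```

These properties would then be handed to the KafkaStreams constructor in a real application.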

But for internal topics like *-repartition topics, it's not possible to provide topic configuration values, even though the default "retention.ms" for a repartition topic is "-1", which means infinite retention. How can I delete or manage repartition topics? Otherwise the repartition topics will keep growing and could eventually fill the disk.

How can I manage repartition topics? What is purgeData? I couldn't find any explanation of it in the documentation.



Solution 1:[1]

Fact

  • retention.ms for the repartition topics is -1 by default and there's no way to override this value in kafka-streams client code.

What I misunderstood

  • The size of the repartition topics would grow without bound, since retention.ms for the repartition topics is -1.

Fix misunderstanding

  • There's a method called maybeCommit in the StreamThread class.
  • maybeCommit is called repeatedly inside the loop that processes stream records.
  • Inside the maybeCommit method (version 2.7.1), there is the following comment:

    try to purge the committed records for repartition topics if possible

  • Based on this, my understanding is that once records in a repartition topic have been processed downstream and their offsets committed, those records are purged periodically.
  • Therefore, there's no need to set or manage retention.ms for the repartition topics.
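
To illustrate why a retention.ms of -1 need not mean unbounded growth, here is a toy model in Java. It is not Kafka Streams code: the class RepartitionPurgeSketch and all of its methods are made up for illustration. It only mimics the idea in the quoted comment, namely that records below the committed offset can be deleted regardless of the time-based retention setting. In the real client I believe the purge is sent as a delete-records request through the admin client (Admin#deleteRecords with RecordsToDelete.beforeOffset), but treat that as an assumption to verify against the version you run.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a single repartition-topic partition; all names are hypothetical.
class RepartitionPurgeSketch {
    private final Deque<String> log = new ArrayDeque<>(); // records still on "disk"
    private long logStartOffset = 0;   // offset of the oldest retained record
    private long logEndOffset = 0;     // offset the next appended record will get
    private long committedOffset = 0;  // next offset the downstream task will read

    // Upstream task writes a record into the repartition topic.
    public void append(String record) {
        log.addLast(record);
        logEndOffset++;
    }

    // Downstream task processes up to maxRecords and commits its position.
    public void consumeAndCommit(int maxRecords) {
        committedOffset = Math.min(logEndOffset, committedOffset + maxRecords);
    }

    // Delete every record below the committed offset, ignoring time-based retention.
    public void purgeCommitted() {
        while (logStartOffset < committedOffset) {
            log.removeFirst();
            logStartOffset++;
        }
    }

    public int retained() { return log.size(); }

    public long logStartOffset() { return logStartOffset; }

    public static void main(String[] args) {
        RepartitionPurgeSketch partition = new RepartitionPurgeSketch();
        for (int i = 0; i < 5; i++) {
            partition.append("record-" + i);
        }
        partition.consumeAndCommit(3);
        partition.purgeCommitted();
        // prints "retained=2 logStartOffset=3"
        System.out.println("retained=" + partition.retained()
                + " logStartOffset=" + partition.logStartOffset());
    }
}
```

In this model the retained size tracks the backlog of uncommitted records rather than the total number of records ever written, which matches the behavior described above.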

Please leave a comment or correct this if I'm wrong.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 yaboong