Is Kafka log cleaner logging anything on AWS MSK?

I use an AWS MSK cluster with broker logging to CloudWatch turned on. Logging works and I can see the broker logs. We have some topics with cleanup.policy=compact and some with cleanup.policy=delete. The system has been running on the new cluster for about 2 weeks now.

From my research (e.g. https://zendesk.engineering/an-investigation-into-kafka-log-compaction-5e520f4291f0) I understand that Kafka should run the log cleaner (obviously) and that this activity should leave some traces in the logs. However, in my CloudWatch log group I cannot find the word "cleaner" or "cleaned", nor any other trace of the log cleaner running.

Is the log cleaner running at all? It obviously should be, but I can't find anything in the logs to confirm this, and we also have a lot of messages that have been eligible for cleanup for about 2 weeks but are still not cleaned.

Kafka cluster version is 2.8.1



Solution 1:[1]

It is quite likely that these logs are not shown in MSK because, by default, they do not go to the main log stream. From https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-log-LogCleaner.html:

Please note that Kafka comes with a preconfigured kafka.log.LogCleaner logger in config/log4j.properties:

log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log
log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
log4j.additivity.kafka.log.LogCleaner=false

That means that the logs of LogCleaner go to logs/log-cleaner.log file at INFO logging level and are not added to the main logs (per log4j.additivity being off).
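The effect of switching additivity off can be illustrated with Python's standard logging module, where the `propagate` flag plays a similar role. This is only an analogy, not Kafka or log4j code: the logger names and the in-memory streams below are made up for the demonstration.

```python
import io
import logging


def build_loggers():
    """Create a parent logger and a child 'cleaner' logger with separate outputs.

    Hypothetical names: 'demo_kafka' stands in for the main broker log,
    'demo_kafka.log.LogCleaner' for the dedicated cleaner logger.
    """
    main_stream = io.StringIO()      # stands in for the main server log
    cleaner_stream = io.StringIO()   # stands in for log-cleaner.log

    main = logging.getLogger("demo_kafka")
    main.setLevel(logging.INFO)
    main.handlers.clear()
    main.addHandler(logging.StreamHandler(main_stream))
    main.propagate = False  # keep the demo isolated from the real root logger

    cleaner = logging.getLogger("demo_kafka.log.LogCleaner")
    cleaner.setLevel(logging.INFO)
    cleaner.handlers.clear()
    cleaner.addHandler(logging.StreamHandler(cleaner_stream))
    # Rough equivalent of log4j.additivity.<logger>=false:
    # records are NOT passed up to the parent's handlers.
    cleaner.propagate = False

    return main, cleaner, main_stream, cleaner_stream
```

With this setup, a message logged through the child logger ends up only in `cleaner_stream`. A collector that reads only `main_stream` would never see it, which would match the symptom described in the question if MSK ships only the main broker log.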

This is a bit misleading, though, because the LogCleaner only takes care of compacted topics. I'm not sure where (or at which log level, since AWS MSK only exports INFO level logs) the deletion of messages in topics with the delete cleanup policy is logged.

I would contact AWS support to ask what they do with these logs and whether there is a way to access them.

Alternatively, you could set up open monitoring with Prometheus, which exposes the metrics that Kafka publishes over JMX. If enabled, there should be a LogCleaner metric (max-clean-time-secs) that will at least tell you whether the cleaner is running, and you may find other useful information to troubleshoot your issue.
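As a quick check before wiring up a full Prometheus server, the exporter endpoint can be read directly. The following is a minimal sketch, assuming open monitoring is enabled and the broker's JMX exporter is reachable at port 11001 under /metrics from where the script runs. The exact metric names depend on the exporter's naming rules, so the filter simply matches any sample line containing "cleaner". The broker host name in the usage comment is a placeholder.

```python
import urllib.request


def fetch_metrics(host, port=11001, timeout=10):
    """Download the Prometheus text exposition from one broker.

    Assumption: the JMX exporter listens on port 11001 and serves /metrics.
    """
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def cleaner_metric_lines(metrics_text):
    """Return sample lines (not '#' comment lines) that mention the log cleaner."""
    return [
        line
        for line in metrics_text.splitlines()
        if line and not line.startswith("#") and "cleaner" in line.lower()
    ]


# Usage (placeholder host name, run from a machine that can reach the broker):
#   text = fetch_metrics("b-1.example-cluster.kafka.us-east-1.amazonaws.com")
#   for line in cleaner_metric_lines(text):
#       print(line)
```

If no lines come back, either the cleaner metrics are not exported under a name containing "cleaner" or the cleaner is not registered at all, so it is worth inspecting the full output before drawing conclusions.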

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gerard Garcia