How to fix the Java Kafka Producer error "Received invalid metadata error in produce request on partition" and Out of Memory when broker is down

I am building a Kafka producer example in Java. It sends simple values of the form "Test" + Integer to Kafka. With the properties below, I start the producer client and, while messages are in flight, kill a broker. Instead of retrying, the producer suddenly fails with the error below.

The cluster has 3 brokers; the topic has 3 partitions and a replication factor of 3, with no min.insync.replicas set.

Below are the configured properties:

config.put(ProducerConfig.ACKS_CONFIG, "all");
config.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");
config.put(CommonClientConfigs.RETRIES_CONFIG, 60);
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
config.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 10000);
config.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
config.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000);
config.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1048576);
config.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
config.put(ProducerConfig.LINGER_MS_CONFIG, 0);
config.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 1073741824); // 1GB

The result when I kill all of my brokers, or sometimes just one of them, is shown below.

**Error:**
WARN org.apache.kafka.clients.producer.internals.Sender  - [Producer clientId=producer-1] Got error produce response with correlation id 124 on topic-partition testing001-0, retrying (59 attempts left). Error: NETWORK_EXCEPTION
27791 [kafka-producer-network-thread | producer-1] WARN org.apache.kafka.clients.producer.internals.Sender  - [Producer clientId=producer-1] Received invalid metadata error in produce request on partition testing001-0 due to org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.. Going to request metadata update now
28748 [kafka-producer-network-thread | producer-1] ERROR org.apache.kafka.common.utils.KafkaThread  - Uncaught exception in thread 'kafka-producer-network-thread | producer-1':
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(Unknown Source)
at java.nio.ByteBuffer.allocate(Unknown Source)
at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:560)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:496)
at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163)
at java.lang.Thread.run(Unknown Source)


Solution 1:[1]

I assume you are testing the producer. When a producer connects to a Kafka cluster, you pass it one or more broker host:port pairs as a comma-separated string; in your case there are three brokers. During initialization, the producer fetches the cluster metadata from one of those brokers. Assume your producer only publishes messages to a single topic. The cluster maintains a leader broker for every partition of that topic (leadership is per partition, not per topic). Once the producer has identified the leader of a partition, it sends that partition's messages only to that leader for as long as the leader stays alive.
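As a rough illustration of the comma-separated broker list, here is a minimal sketch that uses only `java.util.Properties` and the plain config keys `bootstrap.servers` and `acks` (the string values behind `ProducerConfig.BOOTSTRAP_SERVERS_CONFIG` and `ProducerConfig.ACKS_CONFIG`). The host names and the class and method names are hypothetical; substitute your own broker addresses.

```java
import java.util.Properties;

public class BootstrapConfigSketch {
    // Hypothetical host names, assumed for illustration only.
    static final String BOOTSTRAP = "broker1:9092,broker2:9092,broker3:9092";

    /** Builds a small config using the plain Kafka property keys. */
    static Properties producerConfig() {
        Properties config = new Properties();
        config.put("bootstrap.servers", BOOTSTRAP);
        config.put("acks", "all");
        return config;
    }

    /** Splits the comma-separated bootstrap list into individual host:port entries. */
    static String[] bootstrapBrokers(Properties config) {
        return config.getProperty("bootstrap.servers").split(",");
    }

    public static void main(String[] args) {
        for (String broker : bootstrapBrokers(producerConfig())) {
            System.out.println("bootstrap broker: " + broker);
        }
    }
}
```

Listing more than one broker means the producer still has somewhere to fetch metadata from if one of the listed brokers is down at startup.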

In your test scenario you are deliberately killing broker instances. When that happens, the Kafka cluster has to elect a new leader for the affected partitions, and the producer has to refresh its metadata to learn who the new leader is (the "Going to request metadata update now" line in your log). If the metadata changes frequently (for example, you kill another broker in the meantime), the producer may be working with stale metadata, which it reports as invalid metadata.
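The stale-metadata situation described above can be sketched with a toy model. This is not Kafka code: `ToyCluster`, `send`, and the leader-reassignment rule are all hypothetical simplifications, assumed only to show why a cached leader stops working after a broker is killed and why a metadata refresh fixes it only while some broker is still alive.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StaleMetadataSketch {

    /** Toy cluster: tracks live brokers and the current leader of each partition. */
    static class ToyCluster {
        private final Set<Integer> liveBrokers = new HashSet<>();
        private final Map<Integer, Integer> leaderByPartition = new HashMap<>();

        ToyCluster(int brokers, int partitions) {
            for (int b = 0; b < brokers; b++) liveBrokers.add(b);
            for (int p = 0; p < partitions; p++) leaderByPartition.put(p, p % brokers);
        }

        /** Kills a broker and moves its partitions to any surviving broker (-1 if none). */
        void kill(int broker) {
            liveBrokers.remove(broker);
            for (Map.Entry<Integer, Integer> e : leaderByPartition.entrySet()) {
                if (e.getValue() == broker) {
                    e.setValue(liveBrokers.isEmpty() ? -1 : liveBrokers.iterator().next());
                }
            }
        }

        /** Snapshot of the leader map, standing in for a metadata response. */
        Map<Integer, Integer> metadata() {
            return new HashMap<>(leaderByPartition);
        }

        /** A write succeeds only if the target is alive and is the current leader. */
        boolean accepts(int partition, int broker) {
            return liveBrokers.contains(broker) && leaderByPartition.get(partition) == broker;
        }
    }

    /** Sends to the cached leader, refreshing metadata on failure. Returns attempts used, or -1. */
    static int send(ToyCluster cluster, Map<Integer, Integer> cached, int partition, int retries) {
        for (int attempt = 1; attempt <= retries + 1; attempt++) {
            if (cluster.accepts(partition, cached.get(partition))) {
                return attempt;
            }
            // Cached leader is stale or dead: replace the cache with a fresh snapshot.
            cached.clear();
            cached.putAll(cluster.metadata());
        }
        return -1; // retries exhausted
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(3, 3);
        Map<Integer, Integer> cached = cluster.metadata();
        System.out.println("fresh metadata: " + send(cluster, cached, 0, 60));   // 1
        cluster.kill(0); // the leader of partition 0 dies
        System.out.println("leader killed: " + send(cluster, cached, 0, 60));    // 2
        cluster.kill(1);
        cluster.kill(2); // no brokers left
        System.out.println("all brokers down: " + send(cluster, cached, 0, 60)); // -1
    }
}
```

With one broker killed, a single refresh is enough to find the new leader. With every broker down, no refresh can help and the retries run out, which mirrors the "kill all brokers" case in the question.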

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Steephen