CloudWatch Agent: batch size equal to "1" - is it a bad idea?

If I understand correctly, the CloudWatch Agent publishes events to CloudWatch using a form of batching, whose size is controlled by two parameters:

batch_count:

Specifies the max number of log events in a batch, up to 10000. The default value is 1000.

batch_size:

Specifies the max size of log events in a batch, in bytes, up to 1048576 bytes. The default value is 32768 bytes. This size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event.
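To make that size calculation concrete, here is a minimal Python sketch of how a batch's size could be computed under the formula quoted above; the function name and sample messages are illustrative, not part of the agent.

```python
# Illustrative only: computes a batch's size the way the docs describe it,
# i.e. the sum of each event message in UTF-8 plus 26 bytes per log event.
PER_EVENT_OVERHEAD = 26  # bytes of overhead counted for each event

def batch_size(messages):
    return sum(len(m.encode("utf-8")) + PER_EVENT_OVERHEAD for m in messages)

events = ["GET /health 200", "user lögin failed"]  # 'ö' takes 2 bytes in UTF-8
print(batch_size(events))  # (15 + 26) + (18 + 26) = 85 bytes
```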

My guess is that, in order to eliminate the possibility of losing any log data when an EC2 instance is terminated (since termination destroys all local logs), batch_count should be set to 1. Am I right that this is the only way to achieve this, and how would it affect performance? Would it have any noticeable side effects?



Solution 1:[1]

Yes, it's a bad idea. You are probably more likely to lose data that way. The PutLogEvents API that the agent uses is limited to 5 requests per second per log stream (source). With a batch_count of 1, you'd only be able to publish 5 log events per second. If the application were to produce more than that consistently, the agent wouldn't be able to keep up.
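As a back-of-the-envelope check (a sketch, assuming only the 5 requests/second/stream limit cited above), the throughput ceiling is simply the request limit times batch_count:

```python
# Rough per-stream throughput ceiling, assuming the PutLogEvents limit
# of 5 requests per second per log stream mentioned above.
MAX_REQUESTS_PER_SECOND = 5

def max_events_per_second(batch_count):
    return MAX_REQUESTS_PER_SECOND * batch_count

print(max_events_per_second(1))     # 5 events/s  -> easily overwhelmed
print(max_events_per_second(1000))  # 5000 events/s with the default batch_count
```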

If you absolutely can't afford to lose any log data, maybe you should be writing that data to a database instead. There will always be some risk of losing log data, even with a batch_count of 1. The host could always crash before the agent polls the log file... which, by the way, happens every 5 seconds by default (source).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Daniel Vassallo