Logstash with persistent queue

I started Logstash with the following configuration.

Inside logstash.yml:

queue.type: persisted
queue.max_bytes: 8gb
queue.checkpoint.writes: 1

Pipeline configuration file:

input {
    beats {
        port => "5043"
    }
}
filter {
    grok {
        match => {
            "message" => "%{COMBINEDAPACHELOG}"
        }
    }
    geoip {
        source => "clientip"
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test"
        document_type => "tw"
    }
}

My scenario is as follows:

  1. Elasticsearch is turned off.

  2. While Elasticsearch is off, Logstash receives logging events.

  3. Logstash is then turned off as well.

When I turn Logstash and Elasticsearch back on, Logstash does not send the events it received during step 2, that is, while Elasticsearch was down and Logstash was still receiving events.



Solution 1:[1]

According to the current Logstash documentation, persistent queues now behave like this:
"When the persistent queue feature is enabled, Logstash will store events on disk. Logstash commits to disk in a mechanism called checkpointing."

https://www.elastic.co/guide/en/logstash/current/persistent-queues.html
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html#durability-persistent-queues
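
As a rough illustration of the durability settings that documentation page covers, a logstash.yml along these lines is what I would expect to work. Treat it as a sketch: the path.queue value is a hypothetical location (when the key is omitted, the queue lives under path.data), and the other values are simply the ones from the question.

```yaml
# logstash.yml (sketch, not a verified configuration)
queue.type: persisted          # buffer events on disk instead of in memory
path.queue: /var/lib/logstash/queue   # hypothetical path; must survive restarts
queue.max_bytes: 8gb           # upper bound on the on-disk queue size
queue.checkpoint.writes: 1     # checkpoint after every written event (default is 1024)
```

Setting queue.checkpoint.writes to 1 trades throughput for durability, so it is worth confirming that the queue directory is actually being written to before blaming the checkpoint settings.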

Solution 2:[2]

Is that all you have in logstash.yml for your pipeline? You should define your pipeline settings in either logstash.yml or pipelines.yml. For example, in pipelines.yml it should look like:

- pipeline.id: Beats
  path.config: "/LogStash/pipelines/beatspipeline.yml"
  queue.type: persisted
  path.queue: /Logstash/data/queue
  queue.max_bytes: 10gb

The documentation doesn't explicitly state that you must configure per-pipeline settings, but this method has always worked for me.
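
To show how the per-pipeline and global settings relate, here is a small sketch of a pipelines.yml with two pipelines. The pipeline ids and config paths are hypothetical. As far as I know, any setting not given for a pipeline falls back to the value in logstash.yml, and pipelines.yml is only read when Logstash is started without -f or -e.

```yaml
# pipelines.yml (sketch; pipeline ids and paths are hypothetical)
- pipeline.id: beats
  path.config: "/etc/logstash/conf.d/beats.conf"
  queue.type: persisted        # set explicitly for this pipeline only
  queue.max_bytes: 8gb
- pipeline.id: other
  path.config: "/etc/logstash/conf.d/other.conf"
  # no queue.* keys here, so this pipeline uses the values from logstash.yml
```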

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 straville
Solution 2 Grunt