Logstash with persistent queue
I started Logstash with the following configuration.
Inside logstash.yml:
queue.type: persisted
queue.max_bytes: 8gb
queue.checkpoint.writes: 1
Pipeline configuration file:
input {
    beats {
        port => "5043"
    }
}
filter {
    grok {
        match => {
            "message" => "%{COMBINEDAPACHELOG}"
        }
    }
    geoip {
        source => "clientip"
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "test"
        document_type => "tw"
    }
}
Here is my situation:
1. Elasticsearch is turned off.
2. While Elasticsearch is off, Logstash receives log events.
3. Logstash is then turned off as well.
Now, if I turn Logstash and Elasticsearch back on, Logstash doesn't send the events it received during step 2, that is, while Elasticsearch was off and Logstash was still receiving events.
Solution 1:[1]
Nowadays Logstash Persistent Queues behave like this:
"When the persistent queue feature is enabled, Logstash will store events on disk. Logstash commits to disk in a mechanism called checkpointing."
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html#durability-persistent-queues
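As a rough sketch of the settings that govern this checkpointing, a logstash.yml could look like the fragment below. The setting names come from the Logstash persistent queue documentation linked above. The queue path is a hypothetical location chosen for illustration, and the sizes simply mirror the question.

```yaml
# logstash.yml (sketch; the path below is an illustrative assumption)
queue.type: persisted                  # buffer events on disk instead of in memory
path.queue: /var/lib/logstash/queue    # hypothetical location; defaults to <path.data>/queue
queue.max_bytes: 8gb                   # total on-disk capacity of the queue
queue.checkpoint.writes: 1             # checkpoint after every written event (most durable, slowest)
```

With queue.checkpoint.writes set to 1, events accepted while Elasticsearch is down should remain in the queue directory across a restart, provided the restarted Logstash instance points at the same path.queue (or path.data) as before.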
Solution 2:[2]
Is that all you have in logstash.yml for your pipeline? You should be defining your pipeline settings in either logstash.yml or pipelines.yml. For example, it should look like:
- pipeline.id: Beats
  path.config: "/LogStash/pipelines/beatspipeline.yml"
  queue.type: persisted
  path.queue: /Logstash/data/queue
  queue.max_bytes: 10gb
The documentation doesn't explicitly state that you must configure per-pipeline settings, but I know this method has always worked.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | straville | 
| Solution 2 | Grunt | 
