Logstash with persistent queue
I started Logstash with the following configuration.
In logstash.yml:
queue.type: persisted
queue.max_bytes: 8gb
queue.checkpoint.writes: 1
Pipeline configuration file:
input {
  beats {
    port => "5043"
  }
}
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test"
    document_type => "tw"
  }
}
I have the following situation:
1. Elasticsearch is turned off.
2. While Elasticsearch is off, Logstash receives logging events.
3. Logstash is then turned off too.
When I turn Logstash and Elasticsearch back on, Logstash does not send the events it received during step 2, that is, while Elasticsearch was off and Logstash was still receiving events.
Solution 1:[1]
Nowadays Logstash Persistent Queues behave like this:
"When the persistent queue feature is enabled, Logstash will store events on disk. Logstash commits to disk in a mechanism called checkpointing."
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html
https://www.elastic.co/guide/en/logstash/current/persistent-queues.html#durability-persistent-queues
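As a rough illustration of the settings that influence checkpointing and shutdown behavior, a logstash.yml along the following lines could be used. This is only a sketch: the path is hypothetical and the values are assumptions for illustration, not a confirmed fix for the question above.

```yaml
# logstash.yml (illustrative sketch; path and values are assumptions)
queue.type: persisted                  # buffer events on disk instead of in memory
path.queue: /var/lib/logstash/queue    # hypothetical location; defaults to path.data/queue
queue.max_bytes: 8gb                   # total on-disk capacity of the queue
queue.checkpoint.writes: 1             # checkpoint after every written event (slower, more durable)
queue.drain: false                     # on shutdown, leave unprocessed events on disk instead of waiting to drain
```

With queue.checkpoint.writes set to 1, each event is checkpointed as it is written, which trades throughput for durability. Larger values reduce disk I/O at the cost of possibly losing the most recent events on a hard crash.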
Solution 2:[2]
Is that all you have in logstash.yml for your pipeline? You should define your pipeline settings in either logstash.yml or pipelines.yml. For example, in pipelines.yml it should look like:
- pipeline.id: Beats
  path.config: "/LogStash/pipelines/beatspipeline.yml"
  queue.type: persisted
  path.queue: /Logstash/data/queue
  queue.max_bytes: 10gb
The documentation doesn't explicitly state that you must configure per-pipeline settings, but I know this method has always worked.
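For context, settings left out of a pipelines.yml entry fall back to the values in logstash.yml, and pipelines.yml is ignored when Logstash is started with -f or -e. A minimal sketch of how the two files interact, with hypothetical pipeline ids and paths:

```yaml
# pipelines.yml (illustrative sketch; ids and paths are hypothetical)
- pipeline.id: beats
  path.config: "/etc/logstash/conf.d/beats.conf"
  queue.type: persisted       # overrides the logstash.yml value for this pipeline only
  queue.max_bytes: 10gb
- pipeline.id: other
  path.config: "/etc/logstash/conf.d/other.conf"
  # no queue.* keys here, so this pipeline inherits queue.type from logstash.yml
```

If Logstash is launched with an explicit -f path, only the queue settings in logstash.yml apply, so it is worth checking which of the two files the running instance is actually reading.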
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | straville |
Solution 2 | Grunt |