'Apache Pulsar - use cases for infinite retention of a topic
I am actually planing our next version of our telemetry system architecture. I am strongly considering Pulsar at the messaging solution.
To better understand what's this technology is best for, can someone share their use cases of why their use the infinite retention of a topic other than audit trail ?
I was main goal is to see if our telemetry data could be simply stored in a pulsar topic and query that for analytics purpose instead of using a time series database like Apache Druid.
Thanks !
Solution 1:[1]
The use-case I've had for infinite retention is when you want to store the history going back to the beginning: e.g. in an event-sourcing style approach, the longer you're keeping the events archived, the more able you are to remix your state.
With durable-log style storage, remember that it heavily optimizes for slurping the log starting at some point. For higher-volume queries or queries with strict latency requirements, this is generally pretty unsuited for that sort of workload, and even more so if you can't limit reads to a single partition (remember also that with multiple partitions, even the ordering of the messages in the log may be difficult to reconstruct). For infrequent queries with loose latency requirements, though, storing them in pulsar might not be that bad, especially if you'd be using pulsar already to feed data into the time-series store (as you can then dispense with the time-series store).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Levi Ramsey |