spark-streaming-kafka-0-8 vs spark-streaming-kafka-0-10
I am new to the big data field. I need to build a demo that streams data from a Kafka topic using Spark Streaming, applies some aggregation and filtering, and then saves the result. I'm using Spark 2.3 and Kafka 2.1, and I need to know which version of the Spark Streaming Kafka integration to use: 0.8 or 0.10. The Spark 2.3 integration guide ->
https://spark.apache.org/docs/2.3.0/streaming-kafka-integration.html
says that 0.8 is deprecated and 0.10 is stable, but the 0.10 integration guide ->
https://spark.apache.org/docs/2.3.0/streaming-kafka-0-10-integration.html
says that 0.8 is stable and 0.10 is experimental.
Which of them should I use?
Solution 1:[1]
Version 0.8 is stable, but Kafka no longer provides technical support for it. I think you should go with the latest version.
Solution 2:[2]
You said you're using Kafka 2.1, so you should use Spark's Kafka 0.10 integration, mainly because it is built on the new Consumer API, as mentioned on that page.
In Spark 2.4, the same library was upgraded to use the Kafka 2.0 client libraries, but it kept the 0.10 name (SPARK-18057).
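For the Spark 2.3 setup described in the question, choosing the 0.10 integration mostly comes down to which artifact the build depends on. A minimal sketch of that dependency follows, assuming an sbt build on Scala 2.11 and the exact version 2.3.0; neither the build tool nor those versions is stated in the question, so adjust them to match your cluster.

```scala
// build.sbt fragment (assumed setup: sbt, Scala 2.11, Spark 2.3.0)
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark Streaming itself; "provided" assumes the cluster supplies Spark at runtime
  "org.apache.spark" %% "spark-streaming" % "2.3.0" % "provided",
  // Kafka 0.10+ integration (new Consumer API); use this instead of spark-streaming-kafka-0-8
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.3.0"
)
```

The integration artifact brings in a compatible Kafka client transitively, so a separate `kafka-clients` dependency is normally not needed.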
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Rohit Yadav |
Solution 2 | OneCricketeer |