Spark hangs on union with zero running tasks
I have two RDDs of type RDD[T]
For example:
val a: RDD[Integer] = ....
val b: RDD[Integer] = ...
When I perform
val z = a.union(b)
println(z)
Spark hangs forever:
[Stage 23:=============================> (1 + 0) / 2]
Not sure why it shows 0 running tasks.
Environment:
Spark 1.6
Scala 2.11.6
a and b contain 10 records each, so the input is small.
Has anyone come across this case, where the number of running tasks is zero and Spark hangs and never finishes?
Solution 1:[1]
Apparently setting
--conf spark.driver.host=127.0.0.1
solved the problem for me.
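For reference, the same property can also be set in code rather than on the command line. The following is only a minimal sketch, assuming a local master and the Spark 1.6 RDD API. The application name and the sample data are hypothetical and not taken from the original job. The property has to be set before the SparkContext is created.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object UnionExample {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode; "union-example" is a made-up application name.
    val conf = new SparkConf()
      .setAppName("union-example")
      .setMaster("local[*]")
      // Equivalent of passing --conf spark.driver.host=127.0.0.1 to spark-submit.
      .set("spark.driver.host", "127.0.0.1")

    val sc = new SparkContext(conf)

    // Hypothetical stand-ins for the two small RDDs from the question.
    val a = sc.parallelize(1 to 10)
    val b = sc.parallelize(11 to 20)

    val z = a.union(b)

    // An action such as collect() is what actually runs the job;
    // println(z) alone only prints the RDD's description.
    z.collect().foreach(println)

    sc.stop()
  }
}
```

Whether 127.0.0.1 is the right value depends on the setup: it fits a driver and executors on the same machine, while on a cluster the executors need an address at which they can actually reach the driver.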
Thanks to Melitta Dragaschnig & rrusso2007.
Edit: I was facing this problem when doing a union between two DataFrames, one of which was created by reading from Cassandra (using the DataStax Spark Cassandra Connector).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |