'Databricks Spark: java.lang.OutOfMemoryError: GC overhead limit exceeded i

I am executing a Spark job in Databricks cluster. I am triggering the job via a Azure Data Factory pipeline and it execute at 15 minute interval so after the successful execution of three or four times it is getting failed and throwing with the exception "java.lang.OutOfMemoryError: GC overhead limit exceeded". Though there are many answer with for the above said question but in most of the cases their jobs are not running but in my cases it is getting failed after successful execution of some previous jobs. My data size is less than 20 MB only.

My cluster configuration is: Databricks cluster Configuration

So the my question is what changes I should make in the server configuration. If the issue is coming from my code then why it is getting succeeded most of the time. Please advise and suggest me the solution.



Solution 1:[1]

This is most probably related to executor memory being bit low .Not sure what is current setting and if its default what is the default value in this particular databrics distribution. Even though it passes but there would lot of GCs happening because of low memory hence it would keep failing once in a while . Under spark configuration please provide spark.executor.memory and also some other params related to num of executors and cores per executor . In spark-submit the config would be provided as spark-submit --conf spark.executor.memory=1g

Solution 2:[2]

You may try increasing memory of driver node.

Solution 3:[3]

Sometimes the Garbage Collector is not releasing all the loaded objects in the driver's memory.

What you can try is to force the GC to do that. You can do that by executing the following:

spark.catalog.clearCache()
for (id, rdd) in spark.sparkContext._jsc.getPersistentRDDs().items():
    rdd.unpersist()
    print("Unpersisted {} rdd".format(id))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Shridhar
Solution 2 Vivek Atal
Solution 3 George Sotiropoulos