Spark job failing on Jackson dependencies

I have a Spark job that started failing after we upgraded from CDH 5.5.4, which shipped Spark 1.5.0, to CDH 5.13.0, which ships Spark 1.6.0.

The job runs with the new Spark dependencies, but I see strange behavior for one Spark job:

1) Its Oozie launcher is sometimes marked as succeeded and other times as killed.

2) The Spark job itself fails on jackson-databind:

    2018-01-05 19:07:17,672 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.VerifyError: Bad type on operand stack
    Exception Details:
      Location:
        org/apache/spark/metrics/sink/MetricsServlet.<init>(Ljava/util/Properties;Lcom/codahale/metrics/MetricRegistry;Lorg/apache/spark/SecurityManager;)V @116: invokevirtual
      Reason:
        Type 'com/codahale/metrics/json/MetricsModule' (current frame, stack[2]) is not assignable to 'com/fasterxml/jackson/databind/Module'



Solution 1:[1]

The error you are getting is a Java bytecode verification error. Verification happens when the JVM links a class, before any of its code is allowed to run. The purpose of this step is to ensure that the bytecode didn't come from a malicious or broken compiler and actually follows the JVM's rules.

Read more about it here: http://www.oracle.com/technetwork/java/security-136118.html

Now, to address your problem: this error is also thrown when your code finds different jars/classes at runtime than the ones it was compiled against.

The MetricsServlet class in the spark-core lib instantiates an object of type MetricsModule, which is packaged inside the metrics-json jar, and then registers this object with its ObjectMapper as a generic Module object. Note: MetricsModule extends the Module class from the jackson-databind jar. So, in simple terms, an object of type MetricsModule is being up-cast to its parent class Module.
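For context, here is a minimal sketch of the kind of call MetricsServlet makes; this is an illustration, not Spark's exact source. The invokevirtual at offset @116 in the error is essentially the registerModule call below, whose declared parameter type is com.fasterxml.jackson.databind.Module:

    import java.util.concurrent.TimeUnit;
    import com.codahale.metrics.json.MetricsModule;      // from metrics-json
    import com.fasterxml.jackson.databind.ObjectMapper;  // from jackson-databind

    public class MetricsServletSketch {
        public static void main(String[] args) {
            // registerModule(Module) is declared against
            // com.fasterxml.jackson.databind.Module. The bytecode verifier
            // rejects this call if the MetricsModule found at runtime was
            // compiled against a different (e.g. shaded) Module class.
            ObjectMapper mapper = new ObjectMapper().registerModule(
                new MetricsModule(TimeUnit.SECONDS, TimeUnit.MILLISECONDS, false));
            System.out.println(mapper.version());
        }
    }

With both classes coming from the stock metrics-json and jackson-databind jars, this compiles and runs fine; the VerifyError only appears when the runtime classpath mixes in a MetricsModule built against a relocated Module.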

However, the MetricsModule class in your environment is not loaded from the metrics-json jar but from some other foreign jar or third-party library, where it extends a different Module parent class. That jar must have been compiled against some.other.package.Module rather than the original com.fasterxml.jackson.databind.Module from jackson-databind.

E.g., the uber JAR for the CosmosDB connector for Spark packages both the MetricsModule and Module classes, but the latter is relocated to "cosmosdb_connector_shaded.jackson.databind.Module", giving the exact same error:

"Type 'com/codahale/metrics/json/MetricsModule' (current frame, stack[2]) is not assignable to 'com/fasterxml/jackson/databind/Module'"

To resolve this class conflict, you need to find the JAR from which the MetricsModule class is actually loaded. Use the -verbose:class JVM option on your Spark driver JVM to track this, as shown below.
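One way to pass the flag, assuming you launch with spark-submit (the class name and jar below are placeholders, not from the original question):

    spark-submit \
      --conf "spark.driver.extraJavaOptions=-verbose:class" \
      --class com.example.MyApp \
      my-app.jar

On a HotSpot JVM the driver log will then contain lines like "[Loaded com.codahale.metrics.json.MetricsModule from file:/...]", which tell you exactly which JAR supplies each class.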

Solution 2:[2]

@sg1's explanation is accurate. In my case, I fixed this error by adding the jars to spark.driver.extraClassPath instead of copying them into Spark's jars/ directory. You can also try shading the conflicting dependency, such as Jackson, in your uber jar; sketches of both approaches follow.
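Hedged sketches of both approaches; the paths, versions, class name, and relocation prefix are placeholders rather than values from the original answer. Prepending the correct jackson-databind and metrics-json jars to the driver classpath:

    spark-submit \
      --conf "spark.driver.extraClassPath=/path/to/jackson-databind-2.x.jar:/path/to/metrics-json-3.x.jar" \
      --class com.example.MyApp \
      my-app.jar

Or, relocating your application's own Jackson copy with the Maven Shade plugin so it can no longer collide with Spark's:

    <!-- inside the maven-shade-plugin <configuration> element -->
    <relocations>
      <relocation>
        <pattern>com.fasterxml.jackson</pattern>
        <shadedPattern>my.shaded.com.fasterxml.jackson</shadedPattern>
      </relocation>
    </relocations>

Note that spark.driver.extraClassPath entries are prepended to the driver's classpath, which is why they win over a conflicting copy found elsewhere.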

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: sg1
Solution 2: Kartik Khare