Run Spark program locally with IntelliJ
I tried to run a simple test program in IntelliJ IDEA. Here is my code:
import org.apache.spark.sql.functions._
import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SparkSession}
object hbasetest {
  val spconf = new SparkConf()
  val spark = SparkSession.builder().master("local").config(spconf).getOrCreate()
  import  spark.implicits._
  def main(args : Array[String]) {
    val df = spark.read.parquet("file:///Users/cy/Documents/temp")
    df.show()
    spark.close()
  }
}
My dependencies list:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.1.0</version>
<!--<scope>provided</scope>-->
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.1.0</version>
  <!--<scope>provided</scope>-->
</dependency>
When I click the Run button, it throws an exception:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.TaskID.<init>(Lorg/apache/hadoop/mapreduce/JobID;Lorg/apache/hadoop/mapreduce/TaskType;I)V
I checked this post, but the situation didn't change after making the suggested modification. Can I get some help with running a local Spark application in IDEA? Thanks.
Update: I can run this code with spark-submit, but I would like to run it directly with the Run button in IDEA.
Solution 1:[1]
Are you using the Cloudera sandbox to run this application? In your POM.xml I can see CDH dependencies ('2.6.0-mr1-cdh5.5.0').
If you are using Cloudera, please use the dependencies below for your Spark Scala project, because the 'spark-core_2.10' artifact version changes.
<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.10.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.0-cdh5.1.0</version>
  </dependency>
</dependencies>
I used the reference below to run my Spark application.
Reference: http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
Solution 2:[2]
Here are the settings I use for Run/Debug configuration in IntelliJ:
*Main class:*
org.apache.spark.deploy.SparkSubmit
*VM Options:*
-cp <spark_dir>/conf/:<spark_dir>/jars/* -Xmx6g
*Program arguments:*
--master
local[*]
--conf
spark.driver.memory=6G
--class
com.company.MyAppMainClass
--num-executors
8
--executor-memory
6G
<project_dir>/target/scala-2.11/my-spark-app.jar
<my_spark_app_args_if_any>
The spark-core and spark-sql jars are declared in my build.sbt as "provided" dependencies, and their versions must match that of the Spark installation in spark_dir. I currently use Spark 2.0.2 with hadoop-aws jar version 2.7.2.
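As a sketch, a build.sbt matching this setup might look like the following; the project name is a placeholder, and the versions are assumptions that should be adjusted to match the Spark installation in spark_dir:

```scala
// Hypothetical build.sbt: names and versions must match your own project and Spark install.
name := "my-spark-app"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // "provided": available at compile time, supplied by the Spark installation at run time
  "org.apache.spark" %% "spark-core" % "2.0.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.0.2" % "provided",
  // hadoop-aws is only needed if you read from or write to S3
  "org.apache.hadoop" % "hadoop-aws" % "2.7.2"
)
```

With this layout, `sbt package` produces the jar under target/scala-2.11/ that the SparkSubmit run configuration above points at.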
Solution 3:[3]
It may be late for a reply, but I just had the same issue. Since you can run the code with spark-submit, you probably already have the related dependencies. My solution is:
- Change the related dependencies in IntelliJ Module Settings for your project from provided to compile. You may only need to change some of them, but you have to experiment; the brute-force solution is to change all of them.
- If you get further exceptions after this step, such as some dependencies being "too old", change the order of the related dependencies in the module settings.
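In a Maven project, the equivalent change is to switch the scope on the Spark dependencies in pom.xml. A sketch, using the artifact and version from the question:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.1.0</version>
  <!-- 'compile' (the Maven default) instead of 'provided', so IntelliJ
       puts the jar on the classpath when you press the Run button -->
  <scope>compile</scope>
</dependency>
```

Remember to switch the scope back to provided before building the jar you submit to a cluster, otherwise the Spark classes get bundled twice.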
Solution 4:[4]
I ran into this issue as well, and I also had an old Cloudera Hadoop reference in my code. (You have to click the 'edited' link in the original poster's link to see his original pom settings.)
I could leave that reference in as long as I put this at the top of my dependencies (order matters!). You should match the version against your own Hadoop cluster settings.
<dependency>
  <!-- THIS IS REQUIRED FOR LOCAL RUNNING IN INTELLIJ -->
  <!-- IT MUST REMAIN AT TOP OF DEPENDENCY LIST TO 'WIN' AGAINST OLD HADOOP CODE BROUGHT IN-->
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-core</artifactId>
  <version>2.6.0-cdh5.12.0</version>
  <scope>provided</scope>
</dependency>
Note that in the 2018.1 version of IntelliJ, you can check Include dependencies with "Provided" scope, which is a simple way to keep your pom scopes clean.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | jose praveen | 
| Solution 2 | Denis Makarenko | 
| Solution 3 | Kwitter | 
| Solution 4 | sethcall | 
