How to connect Snowflake with PySpark?

I am trying to connect to Snowflake with PySpark on my local machine.

My code is as follows:

from pyspark.sql.types import *
from pyspark.sql import SparkSession
from pyspark import SparkConf

conf = SparkConf()
# spark.jars expects a comma-separated list (no spaces around the comma)
conf.set('spark.jars',
         '/path/to/driver/snowflake-jdbc-3.12.17.jar,'
         '/path/to/connector/spark-snowflake_2.12-2.10.0-spark_3.2.jar')

spark = SparkSession.builder \
    .master("local") \
    .appName("snowflake-test") \
    .config(conf=conf) \
    .getOrCreate()


sfOptions = {
    "sfURL": "https://someurl.com",
    "sfAccount": "account",
    "sfUser": "user",
    "sfPassword": "password",
    "sfDatabase": "database",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "warehouse"
}

SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"

df = spark.read.format(SNOWFLAKE_SOURCE_NAME) \
    .options(**sfOptions) \
    .option("query", "select * from DimDate") \
    .load()

df.show()

When I run this I get the error:

py4j.protocol.Py4JJavaError: An error occurred while calling o46.load.

How can I fix this?



Solution 1:[1]

The Snowflake Spark connector "spark-snowflake_2.12:2.10.0-spark_3.2" needs to be used with Snowflake JDBC driver 3.13.14, but you are using JDBC version 3.12.17.

Swap in JDBC version 3.13.14 and test again. As FKayani pointed out, this is a compatibility issue between the Snowflake Spark connector jar and the JDBC jar.
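
For illustration, a minimal sketch of the corrected setup, assuming the 3.13.14 driver jar has been downloaded next to the connector jar (the paths are placeholders):

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf()
# Pair the connector with the JDBC driver version it was built against.
conf.set('spark.jars',
         '/path/to/driver/snowflake-jdbc-3.13.14.jar,'
         '/path/to/connector/spark-snowflake_2.12-2.10.0-spark_3.2.jar')

spark = SparkSession.builder \
    .master("local") \
    .appName("snowflake-test") \
    .config(conf=conf) \
    .getOrCreate()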

Solution 2:[2]

Please confirm that the correct JDBC driver version is on the classpath.

Recommended Client Versions: https://docs.snowflake.com/en/release-notes/requirements.html#recommended-client-versions
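
As an alternative to downloading jars by hand, you can let Spark resolve a matching connector/driver pair from Maven Central. A sketch, assuming the same versions as in Solution 1 (adjust the coordinates to the versions recommended for your Spark release):

from pyspark.sql import SparkSession

# spark.jars.packages pulls both artifacts (and their dependencies)
# from Maven Central at session startup.
spark = SparkSession.builder \
    .master("local") \
    .appName("snowflake-test") \
    .config("spark.jars.packages",
            "net.snowflake:spark-snowflake_2.12:2.10.0-spark_3.2,"
            "net.snowflake:snowflake-jdbc:3.13.14") \
    .getOrCreate()

The same coordinates can also be passed to pyspark or spark-submit via --packages.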

Solution 3:[3]

This looks similar to the error mentioned in this article: https://community.snowflake.com/s/article/Error-py4j-protocol-Py4JJavaError-An-error-occurred-while-calling-o74-load-java-lang-NoSUchMethodError-scala-Product-init-Lscala-Product-V

If you are using Scala 2.12, you need to downgrade it to 2.11. Note that, in this case, you will also have to use the matching version of the Spark connector for Snowflake.

Spark connectors for Snowflake can be found here. We recommend using the latest connector version that matches your Spark version and Scala 2.11.
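
Before downgrading, it is worth confirming which Scala version your local Spark build was compiled against, since the connector's _2.11/_2.12 suffix must match it. A quick diagnostic sketch from PySpark (note that _jvm is an internal accessor, so treat this as inspection only):

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local").getOrCreate()
# Prints e.g. "version 2.12.15"; choose the connector whose
# _2.xx suffix matches this Scala version.
print(spark.sparkContext._jvm.scala.util.Properties.versionString())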

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Anshul Thakur
Solution 2: FKayani
Solution 3: FKayani