Spark on Windows - java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0

On Windows 10, running from IntelliJ, the glob path "C:/hive/Orders_[0-9]*.csv" works when the code runs as a standalone Java Spark job, but it fails when the same code runs as a Spring Boot Spark job. It seems that Spring Boot is not detecting the native file system, and I am not sure how to resolve this.

Dataset<Row> DF1 = spark
                .read().format("csv")
                .option("header", "true")
                .option("delimiter", "\t")
                .load("C:/hive/Orders_[0-9]*.csv");

Error:

Error starting ApplicationContext. To display the auto-configuration report re-run your application with 'debug' enabled.
2019-09-04 21:59:27.701 ERROR [omni-ods-migration,,,] 8216 --- [           main] o.s.boot.SpringApplication               : Application startup failed

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'odsMigrationService': Invocation of init method failed; nested exception is java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:137)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:409)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1620)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:555)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:761)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:867)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:543)
    at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:693)
    at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:360)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:303)
    at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:134)
    at com.jcpenney.ods.OdsMigration.main(OdsMigration.java:20)
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
    at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1435)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:493)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
    at org.apache.hadoop.fs.Globber.listStatus(Globber.java:77)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:235)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2016)
    at org.apache.spark.deploy.SparkHadoopUtil.globPath(SparkHadoopUtil.scala:241)
    at org.apache.spark.deploy.SparkHadoopUtil.globPathIfNecessary(SparkHadoopUtil.scala:247)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:383)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$12.apply(DataSource.scala:379)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:379)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
    at com.jcpenney.ods.service.OdsMigrationService.readHfsFile(OdsMigrationService.java:588)
    at com.jcpenney.ods.service.OdsMigrationService.processOrders(OdsMigrationService.java:334)
    at com.jcpenney.ods.service.OdsMigrationService.run(OdsMigrationService.java:129)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:366)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:311)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:134)
    ... 16 common frames omitted

The code below works in Spring Boot as well, when the path is given with an exact file name instead of a glob pattern.

Dataset<Row> DF1 = spark
                .read().format("csv")
                .option("header", "true")
                .option("delimiter", "\t")
                .load("C:/hive/Orders_000001.csv");

How can I fix this?



Solution 1:[1]

Here is a possible solution:

  1. Download the Hadoop binaries for Windows from https://github.com/cdarlint/winutils (pick the folder that matches your Hadoop version).
  2. Extract the files to a directory (e.g. C:\hadoop). Make sure the directory structure looks like this: C:\hadoop\bin\winutils.exe
  3. Set the environment variable HADOOP_HOME to C:\hadoop
  4. Add the Hadoop bin directory to the Path environment variable: %HADOOP_HOME%\bin
  5. Copy hadoop.dll to Windows\System32 (may not be necessary).
  6. Restart the system.
  7. Java application specific: add this to the main method, before any Spark code runs: System.setProperty("hadoop.home.dir", "C:/hadoop/"); System.load("C:/hadoop/bin/hadoop.dll");
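
Step 7 can be wrapped in a small helper so that a missing DLL is reported instead of failing later inside Hadoop. This is only a minimal sketch: the class name HadoopNativeSetup and its methods are hypothetical (they are not part of Hadoop, Spark, or Spring Boot), and it assumes the directory layout from step 2, i.e. hadoop.dll sits in the bin folder under the Hadoop home directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for step 7; not part of Hadoop, Spark, or Spring Boot. */
final class HadoopNativeSetup {

    private HadoopNativeSetup() {}

    /** Builds <hadoopHome>/bin/hadoop.dll, assuming the layout from step 2. */
    static Path dllPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "hadoop.dll");
    }

    /**
     * Sets the hadoop.home.dir system property and loads hadoop.dll if the file exists.
     * Returns true when the DLL was loaded, false when the file was not found.
     */
    static boolean configure(String hadoopHome) {
        System.setProperty("hadoop.home.dir", hadoopHome);
        // System.load requires an absolute path to the native library.
        Path dll = dllPath(hadoopHome).toAbsolutePath();
        if (!Files.isRegularFile(dll)) {
            return false;
        }
        System.load(dll.toString());
        return true;
    }
}
```

In a Spring Boot application, the call HadoopNativeSetup.configure("C:/hadoop") would presumably go at the very top of the main method, before SpringApplication.run(...) and before any SparkSession is created, so that the native library is in place when Hadoop first lists the glob path. A false return value suggests that steps 1 and 2 should be rechecked.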

References:

  1. https://cwiki.apache.org/confluence/display/HADOOP2/WindowsProblems
  2. https://sparkbyexamples.com/spark/spark-hadoop-exception-in-thread-main-java-lang-unsatisfiedlinkerror-org-apache-hadoop-io-nativeio-nativeiowindows-access0ljava-lang-stringiz/
  3. https://blog.csdn.net/weixin_30802273/article/details/96528359

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 OneCricketeer