Delta Table / Athena And Spark

I have my delta table, which can be read from Athena.

When I try to read the same data with a query from Spark, I get the following error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 80.0 failed 4 times, most recent failure: Lost task 0.3 in stage 80.0 (TID 449, ip-172-31-22-178.ec2.internal, executor 2): java.lang.RuntimeException: s3://<path>/BDA/DELTA/CLIENTE/_symlink_format_manifest/PERIODO=202001/manifest is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [117, 101, 116, 10]
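
The two byte arrays in the message can be decoded to see what Spark found at the end of the file. A Parquet file ends with the 4-byte magic number "PAR1". The bytes actually found are consistent with the last characters of a text line ending in ".parquet" followed by a newline, which is what a manifest listing Parquet file paths would contain. A minimal sketch (the variable names are only for illustration):

```python
# Byte values copied from the error message above.
expected_tail = bytes([80, 65, 82, 49])
found_tail = bytes([117, 101, 116, 10])

print(expected_tail)  # b'PAR1'
print(found_tail)     # b'uet\n'
```

So Spark is treating the manifest, a plain text file, as if it were a Parquet data file.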

If I run that same query in Athena, there are no problems.



Solution 1:[1]

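This happens because your Delta table was already created with a manifest so that it can be read from Athena. As the error shows, Spark is picking up the file under _symlink_format_manifest and trying to read it as Parquet. If you want to read the table with Spark, query it as a Delta table by its path, like this: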

%sql select * from delta.`s3://path/tabla/` limit <n>
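
The same read can also be done from PySpark. This is only a sketch under some assumptions: the Delta Lake library is available on the cluster, `spark` is an existing SparkSession, the helper names `delta_sql` and `read_delta` are made up for illustration, and the default limit of 10 is arbitrary.

```python
def delta_sql(table_path: str, limit: int = 10) -> str:
    """Build a Spark SQL query that reads a Delta table directly by path."""
    # The path has to be wrapped in backticks after the `delta.` prefix.
    return f"SELECT * FROM delta.`{table_path}` LIMIT {limit}"


def read_delta(spark, table_path: str):
    """Load the table through the Delta reader instead of the Athena manifest."""
    # format("delta") makes Spark use the Delta transaction log (_delta_log)
    # rather than treating every file under the path as Parquet.
    return spark.read.format("delta").load(table_path)


# Usage on a cluster (assumes an existing SparkSession named `spark`):
# df = spark.sql(delta_sql("s3://path/tabla/"))
# df = read_delta(spark, "s3://path/tabla/")
```

Both forms point Spark at the table root rather than at the _symlink_format_manifest directory, which is meant for engines such as Athena.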

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Catherine Solano