Can AWS Athena update or insert data stored in S3?

The documentation says only that it is a query service; it does not state explicitly whether it can or cannot update data.

If Athena cannot insert or update, is there another AWS service that can, like a normal database?



Solution 1:[1]

Amazon Athena is, indeed, a query service -- it only allows data to be read from Amazon S3.

One exception, however, is that the results of the query are automatically written to S3. You could, therefore, use a query to generate results that could be used by something else. It's not quite updating data but it is generating data.

My previous attempts to use Athena output in another Athena query didn't work due to problems with the automatically-generated header, but there might be some workarounds available.

If you are seeking a service that can update information in S3, you could use Amazon EMR, which is basically a managed Hadoop cluster. It is very powerful and can most certainly update information in S3, but it is rather complex to learn.

Solution 2:[2]

Amazon Athena now supports inserting new data into an existing table with the INSERT INTO statement, using either the results of a SELECT query or a provided set of values. This support was announced in September 2019.

https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/

https://docs.aws.amazon.com/athena/latest/ug/insert-into.html

One limitation from the documentation: INSERT INTO is not supported on bucketed tables. For more information, see "Bucketing vs Partitioning" in the Athena documentation.
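The two forms of INSERT INTO mentioned above could look roughly like this. This is only a sketch: the table names (`events_archive`, `events_raw`) and the columns are hypothetical and assume both tables already exist in the current database with compatible schemas.

```sql
-- Form 1: append the results of a SELECT query to an existing table.
-- events_archive and events_raw are hypothetical tables.
INSERT INTO events_archive
SELECT id, name, dt
FROM events_raw
WHERE dt = '2019-09-20';

-- Form 2: append a provided set of values.
INSERT INTO events_archive (id, name, dt)
VALUES (1, 'signup', '2019-09-20'),
       (2, 'login',  '2019-09-20');
```

Each statement writes new files under the table's S3 location; existing files are not modified.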

Solution 3:[3]

AWS S3 is an object store. Both Athena and S3 Select are for queries. The only way to modify an object (file) in S3 is to retrieve it from S3, modify it, and upload it back to S3.

Solution 4:[4]

Solution 5:[5]

Athena supports CTAS (CREATE TABLE AS SELECT) statements as of October 2018. You can specify the output location and file format, among other options.

https://docs.aws.amazon.com/athena/latest/ug/ctas.html

To INSERT into tables you can write additional files in the same format to the S3 path for a given table (this is somewhat of a hack), or preferably add partitions for the new data.
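As a rough illustration of both ideas in this answer, the sketch below first creates a new table with CTAS and then registers a partition for newly uploaded files. The table names, columns, and the `example-bucket` paths are hypothetical, and the second statement assumes `events` is an existing table partitioned by `dt`.

```sql
-- CTAS: write the query results as Parquet to a chosen (empty) S3 location.
-- events_parquet, events_raw and the bucket path are hypothetical.
CREATE TABLE events_parquet
WITH (
  format = 'PARQUET',
  external_location = 's3://example-bucket/events_parquet/'
) AS
SELECT id, name, dt
FROM events_raw;

-- "Insert" by adding a partition that points at files already uploaded to S3.
ALTER TABLE events ADD IF NOT EXISTS
  PARTITION (dt = '2018-10-15')
  LOCATION 's3://example-bucket/events/dt=2018-10-15/';
```

Adding a partition only updates table metadata, so the new files must already be in the table's format at that location.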

Like many big data systems, Athena is not capable of handling UPDATE statements.

Solution 6:[6]

AWS now has a solution for this. Athena's Apache Iceberg integration is generally available, so you can perform CRUD (create, read, update, and delete) operations through Athena. Create the table with:

TBLPROPERTIES ( 'table_type' ='ICEBERG' [, property_name=property_value])

Then you can use the Iceberg features, including UPDATE and DELETE statements.
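To make the table property above concrete, here is a minimal sketch of creating an Iceberg table and then modifying rows in it. The table name `orders_iceberg`, its columns, and the `example-bucket` location are hypothetical; the statements assume an Athena engine version that supports Iceberg tables.

```sql
-- Create an Iceberg table; orders_iceberg and the S3 path are hypothetical.
CREATE TABLE orders_iceberg (
  order_id bigint,
  status   string
)
LOCATION 's3://example-bucket/orders_iceberg/'
TBLPROPERTIES ('table_type' = 'ICEBERG');

-- Create, update, and delete rows.
INSERT INTO orders_iceberg VALUES (1, 'NEW'), (2, 'NEW');

UPDATE orders_iceberg
SET status = 'SHIPPED'
WHERE order_id = 1;

DELETE FROM orders_iceberg
WHERE order_id = 2;
```

Iceberg does not edit S3 objects in place. Each change writes new data and metadata files, and the table's metadata tracks which files are current.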

For a quick introduction, you can watch this video (or search YouTube for "Insert / Update / Delete on S3 With Amazon Athena and Apache Iceberg | Amazon Web Services").

Also read "Considerations and Limitations".

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 John Rotenstein
Solution 2 Hariprasad
Solution 3 Ashan
Solution 4 Theo
Solution 5 Kirk Broadhurst
Solution 6 Bijohn Vincent