'Can AWS Athena update or insert data stored in S3?
The document just says that it is a query service but not explicitly states that it can or cannot perform data update.
If Athena cannot do insert or update, is there any other aws service which can do like a normal DB?
Solution 1:[1]
Amazon Athena is, indeed, a query service -- it only allows data to be read from Amazon S3.
One exception, however, is that the results of the query are automatically written to S3. You could, therefore, use a query to generate results that could be used by something else. It's not quite updating data but it is generating data.
My previous attempts to use Athena output in another Athena query didn't work due to problems with the automatically-generated header, but there might be some workarounds available.
If you are seeking a service that can update information in S3, you could use Amazon EMR, which is basically a managed Hadoop cluster. Very powerful and capable, and can most certainly update information in S3, but it is rather complex to learn.
Solution 2:[2]
Amazon Athena adds support for inserting data into a table using the results of a SELECT query or using a provided set of values
Amazon Athena now supports inserting new data to an existing table using the INSERT INTO statement.
https://docs.aws.amazon.com/athena/latest/ug/insert-into.html
Bucketed tables not supported
INSERT INTO is not supported on bucketed tables. For more information, see Bucketing vs Partitioning.
Solution 3:[3]
AWS S3 is a object storage. Both Athena and S3 Select is for queries. The only way to modify a object(file) in S3 is to retrieve from S3, modify and upload back to S3.
Solution 4:[4]
As of September 20, 2019 Athena also supports INSERT INTO
: https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-athena-adds-support-inserting-data-into-table-results-of-select-query/
Solution 5:[5]
Athena supports CTAS (create table as) statements as of October 2018. You can specify output location and file format among other options.
https://docs.aws.amazon.com/athena/latest/ug/ctas.html
To INSERT into tables you can write additional files in the same format to the S3 path for a given table (this is somewhat of a hack), or preferably add partitions for the new data.
Like many big data systems, Athena is not capable of handling UPDATE statements.
Solution 6:[6]
Finally there is a solution from AWS. Now you can perform CRUD (create, read, update and delete) operations on AWS Athena. Athena Iceberg integration is generally available now. Create the table with:
TBLPROPERTIES ( 'table_type' ='ICEBERG' [, property_name=property_value])
then you can use it's amazing feature.
For a quick introduction, you can watch this video. (Or search Insert / Update / Delete on S3 With Amazon Athena and Apache Iceberg | Amazon Web Services
on Youtube)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | John Rotenstein |
Solution 2 | Hariprasad |
Solution 3 | Ashan |
Solution 4 | Theo |
Solution 5 | Kirk Broadhurst |
Solution 6 | Bijohn Vincent |