Is it possible to stream data from Beam (Scio) to an S3 bucket?
Currently, I'm working on a project that extracts data from a BigQuery table using Scio in Scala.
I'm able to extract the data and ingest it into Elasticsearch, but I'd like to do the same using an S3 bucket as the storage target.
I'm able to write the data to a local text file using the saveAsTextFile method, and then upload it from my machine to the S3 bucket after adding the right libraries to sbt.
However, I don't know whether it's possible to write a saveAsCustomOutput step that writes the data directly to S3, instead of going through local storage.
Solution 1:[1]
It is possible, but instead of using the S3 bucket as the landing zone directly, I set up a Kinesis Data Stream. By adding a Kinesis event trigger to a Lambda function, I was able to stream the data into the S3 bucket.
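The Kinesis route described above can be sketched in Scio using Beam's KinesisIO together with saveAsCustomOutput. This is only a sketch under assumptions: the stream name, partition key, credentials, and region below are placeholders (not values from the answer), and the exact KinesisIO.Write builder methods vary between Beam versions, so check them against your version:

```scala
import com.spotify.scio._
import org.apache.beam.sdk.io.kinesis.KinesisIO
import com.amazonaws.regions.Regions

// Sketch only. Assumes the beam-sdks-java-io-kinesis module is on the
// classpath; all names and credentials below are hypothetical placeholders.
object ToKinesis {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, _) = ContextAndArgs(cmdlineArgs)

    sc.parallelize(Seq("record1", "record2")) // stand-in for the BigQuery extract
      .map(_.getBytes("UTF-8"))               // KinesisIO writes byte arrays
      .saveAsCustomOutput(
        "WriteToKinesis",
        KinesisIO
          .write()
          .withStreamName("my-stream")        // hypothetical stream name
          .withPartitionKey("partition-1")    // hypothetical partition key
          .withAWSClientsProvider("ACCESS_KEY", "SECRET_KEY", Regions.US_EAST_1)
      )

    sc.run().waitUntilFinish()
    ()
  }
}
```

From there, a Lambda function subscribed to the stream (or a Kinesis Data Firehose delivery stream) handles the final hop into the S3 bucket.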
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MasterC |