Query (SQL-like joins) remote CSV for data analysis
I would like to query (SQL with joins) CSV files sitting in a network folder for performing data analysis work. I'm not allowed to move the files out of the network folder due to regulatory reasons. Obviously, I also cannot import the CSV into a database table.
I'm beginning to explore Presto for this, but I'm not sure if it can handle this scenario. Any advice from Presto experts?
Solution 1:[1]
You could use SQLite: https://www.sqlite.org/index.html
SQLite is NOT a "regular" client-server DB; it is an embedded DB that stores all your data in a single local file (or even in RAM, if you want). Because there is no separate server, the DB file can sit next to the CSV files, so your data never leaves your network folder.
You can easily import the CSV files into that local DB, either into a real table or as a virtual table.
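As a rough illustration of the "real table" route, here is a minimal sketch using only Python's standard `sqlite3` and `csv` modules. It copies each CSV into an in-memory SQLite database and then runs an ordinary SQL join. The helper names `load_csv` and `query_csvs`, the table names, and the file paths are made up for this example. The sketch also assumes every CSV has a header row, trusted table names, and the same number of fields on every line.

```python
import csv
import sqlite3


def load_csv(conn, csv_path, table):
    """Copy one CSV file (first row = column names) into a new SQLite table."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Quote column names so headers with spaces still work.
        cols = ", ".join('"' + c.replace('"', '""') + '"' for c in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        # Values are stored as text; CAST them in the query when needed.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)


def query_csvs(files, sql):
    """files maps table name -> CSV path; returns all rows of the query."""
    conn = sqlite3.connect(":memory:")  # in RAM only, no database file is written
    try:
        for table, path in files.items():
            load_csv(conn, path, table)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Hypothetical usage (paths and column names are placeholders):
# rows = query_csvs(
#     {"customers": r"\\server\share\customers.csv",
#      "orders": r"\\server\share\orders.csv"},
#     "SELECT c.name, o.amount FROM customers c "
#     "JOIN orders o ON o.customer_id = c.id",
# )
```

With `:memory:` the imported copy exists only in the RAM of the machine running the script. If you would rather keep a persistent DB file, pass a path inside the network folder to `sqlite3.connect` instead.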
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |