I can create a .csv file with a Talend job. How do I convert the .csv to a .parquet file using the tSystem component?
I have a Talend job that creates a .csv file, and now I want to convert it to .parquet format using Talend v6.5.1. The only option I can think of is to use the tSystem component to call a Python script from the local directory where the .csv file lands temporarily. I know I can do this conversion easily with pandas or PySpark, but I am not sure whether the same code will work when called from tSystem in Talend. Can you please provide suggestions or instructions?
Code:
import pandas as pd
DF = pd.read_csv("Path")
DF.to_parquet("Path.parquet")  # to_parquet is a DataFrame method and takes the output path
Solution 1:[1]
If the script is an external file on your file system, you can try the following as the tSystem command:
"python \"myscript.py\" "
Here is a link to a Talend forum thread about this problem: https://community.talend.com/t5/Design-and-Development/how-to-execute-a-python-script-file-with-an-argument-using/m-p/23975#M3722
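For illustration, here is a minimal sketch of what the receiving side of such a call might look like. It only shows how arguments appended after the script name reach the Python code. The function name `parse_args` and the `--directory` option are hypothetical and do not come from the answer or the linked thread.

```python
import argparse


def parse_args(argv):
    """Parse the arguments that follow the script name on the command line."""
    parser = argparse.ArgumentParser(description="Convert a CSV file to Parquet.")
    # Positional argument: the file name without its .csv extension (assumed convention)
    parser.add_argument("filename", help="CSV file name without the .csv extension")
    # Optional argument: hypothetical flag for the folder where the CSV lands
    parser.add_argument("--directory", default=".", help="folder that holds the CSV file")
    return parser.parse_args(argv)


# Simulates the arguments of a call such as:
#   python "myscript.py" sales --directory C:/landing
args = parse_args(["sales", "--directory", "C:/landing"])
print(args.filename, args.directory)  # prints: sales C:/landing
```

In a real script, `parse_args(sys.argv[1:])` would be called instead of passing a hard-coded list.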
Solution 2:[2]
I was able to resolve the problem with the following script, which takes the file name (without extension) as its first argument:
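import sys
import pandas as pd  # to_parquet needs a Parquet engine such as pyarrow installed

# File name without extension, passed as the first command-line argument
filename = sys.argv[1]
# Folder that holds the CSV; the Parquet file is written next to it
base_dir = "C:\\Users\\your desktop\\Downloads\\TestXML\\"
test = pd.read_csv(base_dir + filename + ".csv")
test.to_parquet(base_dir + filename + ".parquet")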
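As a variation on the script above, the hard-coded folder can be avoided by passing the full CSV path and deriving the Parquet path from it. This is only a sketch under that assumption. The helper names `parquet_path_for` and `csv_to_parquet` are hypothetical, and pandas still needs a Parquet engine such as pyarrow installed for the conversion step.

```python
from pathlib import Path


def parquet_path_for(csv_path):
    """Return the path of a Parquet file sitting next to the given CSV file."""
    # with_suffix swaps only the last extension, e.g. sales.csv -> sales.parquet
    return Path(csv_path).with_suffix(".parquet")


def csv_to_parquet(csv_path):
    """Read one CSV file and write it as Parquet in the same folder."""
    import pandas as pd  # imported here so the path helper works without pandas

    target = parquet_path_for(csv_path)
    # index=False leaves the DataFrame's row index out of the Parquet file
    pd.read_csv(csv_path).to_parquet(target, index=False)
    return target
```

A script built on this would end with something like `csv_to_parquet(sys.argv[1])`, so the tSystem command only has to append the CSV path after the script name.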
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Corentin |
Solution 2 |