PySpark: read data into a DataFrame, transform in SQL, then save as parquet
New to Spark and Synapse. I need to do some transformations, including adding columns and changing data types. I am reading a CSV into a DataFrame. I'd like to register the DataFrame as a temp view, do my transformations in SQL (using a cell with %%sql), then save the result as a parquet file in another folder.
If I add columns in my temp view, do I need to save the temp view back to another DataFrame? Or does my original DataFrame now include the new columns? If not, how do I create a new DataFrame (that I can write as parquet) from my SQL temp view?
Or is there a link that shows a good approach to this task?
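For concreteness, here is roughly the flow I'm attempting in a single PySpark cell (the paths, view name, and column names below are placeholders):

```python
from pyspark.sql import SparkSession

# In a Synapse notebook a SparkSession named `spark` already exists;
# this line only matters when running outside Synapse.
spark = SparkSession.builder.getOrCreate()

# Placeholder path -- substitute your own storage account/container.
df = spark.read.csv(
    "abfss://container@account.dfs.core.windows.net/input/data.csv",
    header=True,
    inferSchema=True,
)

# Expose the DataFrame to SQL as a temp view.
df.createOrReplaceTempView("my_view")

# Transformation in SQL: add a column and change a data type
# (column names here are made up).
transformed_df = spark.sql("""
    SELECT *,
           CAST(some_column AS INT) AS some_column_int,
           current_date() AS load_date
    FROM my_view
""")

# Write the result as parquet to another folder (placeholder path).
transformed_df.write.mode("overwrite").parquet(
    "abfss://container@account.dfs.core.windows.net/output/"
)
```

What I can't tell is whether I even need the spark.sql() step shown above to capture the result, or whether a %%sql cell operating on the temp view would already be reflected in df.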
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow