PySpark DataFrame to valid JSON
I'm trying to convert a DataFrame to valid JSON, but I have not succeeded yet.
If I write it out like this:
fullDataset.repartition(1).write.json(f'{mount_point}/eds_ckan', mode='overwrite', ignoreNullFields=False)
I only get row-based JSON, with one object per line, like this:
{"col1":"2021-10-09T12:00:00.000Z","col2":336,"col3":0.0}
{"col1":"2021-10-16T20:00:00.000Z","col2":779,"col3":6965.396}
{"col1":"2021-10-17T12:00:00.000Z","col2":350,"col3":0.0}
Does anyone know how to convert it to valid JSON that is not row-based?
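For context, the output above looks like the JSON Lines layout: each line is a valid JSON object on its own, but the file as a whole is not a single JSON document. One possible workaround, sketched here under the assumption that the result is small enough to fit in memory, is to post-process the written part file with plain Python and wrap the records in one JSON array. The helper name `json_lines_to_array` and the file paths are hypothetical and not part of the original post.

```python
import json


def json_lines_to_array(src_path, dest_path):
    """Read a JSON Lines file and write its records as one JSON array.

    Hypothetical helper: assumes the whole file fits in memory.
    """
    records = []
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if line:  # ignore blank lines
                records.append(json.loads(line))

    # A list of dicts serializes as a single top-level JSON array.
    with open(dest_path, "w", encoding="utf-8") as dest:
        json.dump(records, dest)
    return records
```

It would be called with the path of the part file that Spark wrote inside the `eds_ckan` output directory (the actual file name is generated by Spark, so it has to be looked up first) and a destination path for the array-style file. Because every record is loaded at once, this only suits datasets that are small after `repartition(1)`.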
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow