'pyspark dataframe to valid json

Im trying to convert a dataframe to a valid json format, howeever I have not succeeded yet.

if I do like this:

fullDataset.repartition(1).write.json(f'{mount_point}/eds_ckan', mode='overwrite', ignoreNullFields=False)

I only get row based json like this:

{"col1":"2021-10-09T12:00:00.000Z","col2":336,"col3":0.0}
{"col1":"2021-10-16T20:00:00.000Z","col2":779,"col3":6965.396}
{"col1":"2021-10-17T12:00:00.000Z","col2":350,"col3":0.0}

Does anyone know how to convert it to valid json which is not row based?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source