Category "pyspark-sql"

pyspark.sql.utils.AnalysisException: Failed to find data source: kafka

I am trying to read a stream from Kafka using PySpark. I am using Spark version 3.0.0-preview2 and spark-streaming-kafka-0-10_2.12. Before this I just stat zoo

Spark SQL error: org.apache.spark.sql.catalyst.parser.ParseException: extraneous input '$' expecting

I am forming a query in a String Builder like below: println(dataQuery) Execution started at 2019-10-31 02:58:24.006019 PST res245: String = " SELECT transac

How to convert from Pandas' DatetimeIndex to DataFrame in PySpark?

I have the following code: # Get the min and max dates minDate, maxDate = df2.select(f.min("MonthlyTransactionDate"), f.max("MonthlyTransactionDate")).first()

How to apply the describe function after grouping a PySpark DataFrame?

I want to find the cleanest way to apply the describe function to a grouped DataFrame (this question can also extend to applying any DF function to a grouped DF). I