'How to create Dataframe form presto db table of Array Data type column using spark
I am trying to create spark Dataframe from presto db table which has few columns as Array DataType. I tried multiple ways but I am getting same exception
java.sql.SQLException: Unsupported type ARRAY
Approach 1:(By casting column in json) -Getting null value it is not throwing any exception
select json_format(cast(customtags as json)) as customtags from table;
Approach 2: (by casting it to String in select query itself) same exception
Approach 3: (By adding custom schema) : Throwing same Error
StructType custSchema = DataTypes.createStructType(
new StructField[] {
DataTypes.createStructField("customtags", DataTypes.createArrayType(
DataTypes.createStructType(
new StructField[]{
DataTypes.createStructField("name", DataTypes.StringType, true),
DataTypes.createStructField("value", DataTypes.StringType, true)
}
)
), true)
});
Dataset<Row> stgTblDF = sparksession
.read()
.format("jdbc")
.option("driver", "io.prestosql.jdbc.PrestoDriver")
.option("url","url")
.option("user", prestoCredentials.getUsername())
.option("password", prestoCredentials.getPassword())
.option("query", "select customtags from table").schema(custSchema).load();
Approach 4: (By casting as a string in custom schema) But same Exception
Properties connectionProperties = new Properties();
connectionProperties.put("user", prestoCredentials.getUsername());
connectionProperties.put("password", prestoCredentials.getPassword());
connectionProperties.put("driver", "io.prestosql.jdbc.PrestoDriver");
connectionProperties.put("customSchema", "customtags STRING");
Dataset<Row> stgTblDF = sparksession.read().jdbc("jdbc:presto://url","(select customtags from table)", connectionProperties);
Solution 1:[1]
This worked for me:
select json_format(cast(customtags as json)) as customtags from table;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Eric Aya |