How to extract values from a key-value map?

I have a column of type map, where the keys and values vary from row to row. I am trying to extract the value into a new column.


|symbols        |
|[3pea -> 3PEA] |
|[barello -> BA]|
|[]             |
|[]             |

Expected output:

|3PEA   |
|BA     |
|       |
|       |

Here is what I tried so far using a udf:

def map_value = udf((inputMap: Map[String, String]) => inputMap.map(x => x._2))

This fails with:

java.lang.UnsupportedOperationException: Schema for type scala.collection.immutable.Iterable[String] is not supported
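
The exception comes from the UDF's return type: mapping a Scala Map to non-pair elements yields an Iterable[String], and Spark cannot derive a schema for that type. A minimal sketch of one possible fix, assuming Spark 2.3+ and at most one entry per map, is to return Option[String] instead. The object name MapValueUdfSketch, the helper firstValue, the output column name value, and the sample rows are illustrative assumptions, not part of the original post.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object MapValueUdfSketch {
  // Pure helper: first value of the map, or None when the map is empty or null.
  def firstValue(m: Map[String, String]): Option[String] =
    Option(m).flatMap(_.values.headOption)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-value-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data mirroring the question's `symbols` column.
    val df = Seq(
      Map("3pea" -> "3PEA"),
      Map("barello" -> "BA"),
      Map.empty[String, String]
    ).toDF("symbols")

    // Option[String] maps to a nullable string column, unlike Iterable[String].
    val mapValue = udf(firstValue _)

    df.withColumn("value", mapValue(col("symbols"))).show(false)
    spark.stop()
  }
}
```

With this shape, rows holding an empty map get null in the new column. If a map can hold several entries, returning Seq[String] (for example m.values.toSeq) would be another supported return type.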

Solution 1:[1]

import org.apache.spark.sql.functions._
import spark.implicits._
// Simulate the data (an assumption: each row is an array holding one "key -> value" string).
val m = Seq(Array("A -> abc"), Array("B -> 0.11856755943424617"), Array("C -> kqcams"))
val df = m.toDF("map_data")

// Join the array into one string, split on "-> ", and keep the part after the arrow.
val df2 = df.select(split(concat_ws("", $"map_data"), "-> ").getItem(1).as("map_val"))

results in:

|            map_data|
|          [A -> abc]|
|[B -> 0.118567559...|
|       [C -> kqcams]|

|map_val            |
|abc                |
|0.11856755943424617|
|kqcams             |

Solution 2:[2]

As of the Spark Scala API v2.3, the SQL API v2.3, or the PySpark API v2.4, you can use the Spark SQL function map_values.

The following is in PySpark; Scala would be very similar.
Setup (assuming a working SparkSession named spark):

from pyspark.sql import functions as F

df = (
    spark.createDataFrame([
        {"key": ["3pea"],    "value": ["3PEA"]},
        {"key": ["barello"], "value": ["BA"]},
    ])
    # Build the map column from the parallel key and value arrays.
    .select(F.map_from_arrays(F.col("key"), F.col("value")).alias("symbols"))
)

 |-- symbols: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)

|        symbols|
| [3pea -> 3PEA]|
|[barello -> BA]|

Extracting the values with map_values then gives:

|    3PEA|
|      BA|
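
The extraction call itself is not shown in the answer above, so here is a hedged Scala sketch of what it could look like, assuming Spark 2.3+. map_values returns an array of the map's values; wrapping it in concat_ws is an assumption made here to flatten that array into a plain string, which leaves an empty string for empty maps, as in the question's expected output. The names MapValuesSketch, valuesAsString, and the column name value are illustrative.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws, map_values}

object MapValuesSketch {
  // Adds a string column with the map's values joined by ","
  // (a single-entry map therefore yields just that one value).
  def valuesAsString(df: DataFrame, mapCol: String): DataFrame =
    df.withColumn("value", concat_ws(",", map_values(col(mapCol))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-values-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data mirroring the question's `symbols` column.
    val df = Seq(
      Map("3pea" -> "3PEA"),
      Map("barello" -> "BA"),
      Map.empty[String, String]
    ).toDF("symbols")

    valuesAsString(df, "symbols").show(false)
    spark.stop()
  }
}
```

This avoids a UDF entirely, so Spark can optimize the expression. If the raw array is preferred, dropping concat_ws and keeping map_values(col(mapCol)) alone returns an array<string> column instead.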


