How do I convert a WrappedArray column in a Spark DataFrame to Strings?

bdguy

I am trying to convert a column that contains Array[String] to String, but I consistently get this error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 78.0 failed 4 times, most recent failure: Lost task 0.3 in stage 78.0 (TID 1691, ip-******): java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Ljava.lang.String; 

Here's the code:

val mkString = udf((arrayCol: Array[String]) => arrayCol.mkString(","))
val dfWithString = df.select($"arrayCol").withColumn("arrayString", mkString($"arrayCol"))

zero323

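WrappedArray is not an Array (which is a plain old Java array, not a native Scala collection). As the error shows, the UDF receives the column value as a WrappedArray, so a parameter declared as Array[String] fails with a ClassCastException. You can either change the signature to: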

import scala.collection.mutable.WrappedArray

(arrayCol: WrappedArray[String]) => arrayCol.mkString(",")

or use one of its supertypes, such as Seq:

(arrayCol: Seq[String]) => arrayCol.mkString(",")
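
Combined with the code from the question, the Seq variant might look like the sketch below. The local SparkSession, the sample rows, and the names SeqUdfExample and joinWithComma are assumptions made for illustration; only arrayCol, arrayString and mkString come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object SeqUdfExample {
  // Plain Scala function. Declaring the parameter as Seq[String] accepts the
  // collection Spark hands to the UDF (a WrappedArray in the question's error).
  val joinWithComma: Seq[String] => String = _.mkString(",")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("seq-udf-example")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with an array<string> column named arrayCol.
    val df = Seq(Seq("a", "b", "c"), Seq("d")).toDF("arrayCol")

    // Wrap the function as a UDF and add the joined string as a new column.
    val mkString = udf(joinWithComma)
    val dfWithString = df.withColumn("arrayString", mkString($"arrayCol"))
    dfWithString.show()

    spark.stop()
  }
}
```

Keeping the function as a plain value makes it easy to check outside Spark before registering it as a UDF.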

In recent Spark versions you can use concat_ws instead, which avoids the UDF entirely:

import org.apache.spark.sql.functions.concat_ws

df.select(concat_ws(",", $"arrayCol"))
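
As a fuller sketch of the concat_ws approach, the example below keeps the original column and adds the joined string next to it. The local SparkSession, the sample rows, and the names ConcatWsExample and withArrayString are assumptions for illustration; arrayCol and arrayString follow the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}

object ConcatWsExample {
  // Adds a comma-joined string column next to the existing array<string> column.
  def withArrayString(df: DataFrame): DataFrame =
    df.withColumn("arrayString", concat_ws(",", col("arrayCol")))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("concat-ws-example")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val df = Seq(Seq("a", "b", "c"), Seq("d")).toDF("arrayCol")

    withArrayString(df).show()

    spark.stop()
  }
}
```

Because concat_ws is a built-in function, Spark can handle it without calling into user code, so it is usually preferable to a UDF when a plain join with a separator is all that is needed.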
