How to add a list as a new column?

elham

I have a PySpark DataFrame and I want to attach a list to it as a new column. In pandas this is very easy: df['new_column'] = mylist. I tried the following:

df.withColumn("Normalized",sparlist).show(false)

But I get this error:

AssertionError: col should be Column

This is my list and the output I want:

mylist=['fg','af','ab','df','cd']

+---+------+
| id|mylist|
+---+------+
|  0|    fg|
|  1|    af|
|  2|    ab|
|  3|    df|
|  4|    cd|
+---+------+

mck

You can use F.array to create an array from a list:

import pyspark.sql.functions as F

mylist = [0,1,2]
df2 = df.withColumn('list', F.array(*[F.lit(i) for i in mylist]))

df2.show()
+---+---------+
| id|     list|
+---+---------+
|  0|[0, 1, 2]|
|  1|[0, 1, 2]|
|  2|[0, 1, 2]|
|  3|[0, 1, 2]|
|  4|[0, 1, 2]|
+---+---------+

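For your modified question, you can index the array by the id column (this assumes id is the zero-based position of each row's value in the list):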

mylist = ['fg','af','ab','df','cd']
df2 = df.withColumn('list', F.array(*[F.lit(i) for i in mylist])[F.col('id')])

df2.show()
+---+----+
| id|list|
+---+----+
|  0|  fg|
|  1|  af|
|  2|  ab|
|  3|  df|
|  4|  cd|
+---+----+
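
Building an array of literals puts every list element into the expression for each row, which may get unwieldy for long lists. One alternative, not part of the answer above and offered only as a sketch, is to turn the list into a small lookup DataFrame and join it on id. The helper names indexed_rows and attach_by_join are hypothetical, and the sketch makes the same assumption as the indexing approach: id is the zero-based position of each row's value in the list.

```python
def indexed_rows(values):
    """Pair each list element with its zero-based position: [(0, v0), (1, v1), ...]."""
    return list(enumerate(values))


def attach_by_join(spark, df, values, key="id", new_col="list"):
    """Attach `values` to `df` by joining on `key` (hypothetical helper).

    Assumes `key` holds the zero-based position of each row's value in `values`.
    Rows whose key has no match in the lookup get null in `new_col`.
    """
    # Build a small two-column lookup DataFrame: (key, new_col)
    lookup = spark.createDataFrame(indexed_rows(values), [key, new_col])
    # A left join keeps every row of the original DataFrame
    return df.join(lookup, on=key, how="left")
```

With an active SparkSession named spark, attach_by_join(spark, df, mylist).show() would display the joined result. Row order after a join is not guaranteed, so add .orderBy('id') if the order matters.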
