How can I create three new columns for my data?

Demetri Pananos

I've got some data that looks like

tweet_id               worker_id    option
397921751801147392  A1DZLZE63NE1ZI  pro-vaccine
397921751801147392  A3UJO2A7THUZTV  pro-vaccine
397921751801147392  A3G00Q5JV2BE5G  pro-vaccine
558401694862942208  A1G94QON7A9K0N  other
558401694862942208  ANMWPCK7TJMZ8   other

What I would like is a single row for each tweet id, with six columns identifying the worker id and the option for each of the three workers.

The desired output is something like

tweet_id            worker_id_1     option_1     worker_id_2     option_2     worker_id_3     option_3
397921751801147392  A1DZLZE63NE1ZI  pro-vaccine  A3UJO2A7THUZTV  pro-vaccine  A3G00Q5JV2BE5G  pro-vaccine

How can I achieve this with pandas?

Psidom

This is a reshape from long to wide format. Create a per-group counter column with cumcount() to serve as the suffix for the new column headers, pivot with pivot_table(), and finally flatten the resulting two-level column index by joining the levels together.

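import pandas as pd  # df is the DataFrame shown in the question

# number the rows within each tweet_id group: 1, 2, 3, ...
df['count'] = df.groupby('tweet_id').cumcount() + 1
# each cell holds exactly one string, so 'first' returns it unchanged
df1 = df.pivot_table(values=['worker_id', 'option'], index='tweet_id',
                     columns='count', aggfunc='first')
# flatten the (name, count) column pairs into names like worker_id_1
df1.columns = [x + "_" + str(y) for x, y in df1.columns]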
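
To try this end to end, here is a minimal self-contained sketch. It rebuilds the five sample rows from the question by hand (how you actually load your data is an assumption on my part) and uses aggfunc='first', since every (tweet_id, count) cell holds exactly one value. Tweets with fewer than three workers simply get NaN in the unused columns.

```python
import pandas as pd

# Sample rows copied from the question; building them inline is only for illustration.
df = pd.DataFrame({
    "tweet_id": [397921751801147392] * 3 + [558401694862942208] * 2,
    "worker_id": ["A1DZLZE63NE1ZI", "A3UJO2A7THUZTV", "A3G00Q5JV2BE5G",
                  "A1G94QON7A9K0N", "ANMWPCK7TJMZ8"],
    "option": ["pro-vaccine"] * 3 + ["other"] * 2,
})

# Running counter within each tweet: 1, 2, 3, ...
df["count"] = df.groupby("tweet_id").cumcount() + 1

# Spread worker_id and option across the counter values.
wide = df.pivot_table(values=["worker_id", "option"], index="tweet_id",
                      columns="count", aggfunc="first")

# Join the two column levels, e.g. ("worker_id", 2) becomes "worker_id_2".
wide.columns = [f"{name}_{i}" for name, i in wide.columns]

print(wide)
```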


An alternative option to pivot_table() is unstack():

df['count'] = df.groupby('tweet_id').cumcount() + 1
df1 = df.set_index(['tweet_id', 'count']).unstack(level = 1)
df1.columns = [x + "_" + str(y) for x, y in df1.columns]
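
Note that both approaches group the columns by name (all option_* columns together, all worker_id_* together), while the question asks for them interleaved. A hedged sketch of one way to get that order with the unstack() version: reindex the two-level columns explicitly before flattening. The sample DataFrame and the helper names (`ordered`, `n`) are mine, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question; building them inline is only for illustration.
df = pd.DataFrame({
    "tweet_id": [397921751801147392] * 3 + [558401694862942208] * 2,
    "worker_id": ["A1DZLZE63NE1ZI", "A3UJO2A7THUZTV", "A3G00Q5JV2BE5G",
                  "A1G94QON7A9K0N", "ANMWPCK7TJMZ8"],
    "option": ["pro-vaccine"] * 3 + ["other"] * 2,
})

df["count"] = df.groupby("tweet_id").cumcount() + 1
wide = df.set_index(["tweet_id", "count"]).unstack(level=1)

# Desired order: worker_id_1, option_1, worker_id_2, option_2, ...
n = int(df["count"].max())
ordered = [(name, i) for i in range(1, n + 1) for name in ("worker_id", "option")]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(ordered))

# Flatten the column levels and turn tweet_id back into a regular column.
wide.columns = [f"{name}_{i}" for name, i in wide.columns]
wide = wide.reset_index()

print(wide)
```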
