Add a new column to a dataframe based on results from other columns

juansalix

I'm very new to R, so I hope my question will be interesting. What I want to do is quite straightforward. Here's a sample of my dataset:

> head(belongliness)
   ACTIVITY_X ACTIVITY_Y ACTIVITY_Z   Event  cluster1    cluster2     cluster3    cluster4
1:         40         47         62 Head-up 0.1900989 0.768225365 0.0160654667 0.025610279
2:         60         74         95 Head-up 0.5392218 0.038558310 0.0064671635 0.415752686
3:         62         63         88 Head-up 0.7953673 0.044981152 0.0067121719 0.152939414
4:         60         56         82 Head-up 0.9941016 0.002608879 0.0003007537 0.002988748
5:         66         61         90 Head-up 0.7027407 0.048318016 0.0079239680 0.241017291
6:         60         53         80 Head-up 0.9541378 0.023338896 0.0024442116 0.020079071

I would like to add a new column, "winning cluster", to the right of column "cluster4". For each row, it should find the highest value among columns "cluster1" to "cluster4" and show the name of the column that holds it.

For row 1 that would be cluster2, for row 2 cluster1, for row 3 cluster1, and so on.

Any help is appreciated!

akrun

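If the dataset is a data.table, specify the columns of interest in .SDcols, get the column index of the highest value in each row with max.col, use that index to select the column name, and assign it by reference (:=) as 'winning_cluster'. The new column is appended after the existing ones, so it ends up to the right of cluster4. Note that max.col breaks ties at random by default; pass ties.method = "first" if you need a deterministic result.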

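library(data.table)
# max.col(.SD) gives, per row, the index of the largest value among the .SDcols columns;
# names(.SD)[...] maps that index back to the column name
belongliness[, winning_cluster := names(.SD)[max.col(.SD)],
             .SDcols = cluster1:cluster4]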
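
To try this without the full dataset, here is a minimal, self-contained sketch. It rebuilds only the first three rows shown in the question and drops the ACTIVITY columns for brevity. The cluster_cols vector is a name introduced here for illustration, and ties.method = "first" is an optional choice that makes the result reproducible; neither is required by the approach above.

```r
library(data.table)

# First three rows of the sample from the question (ACTIVITY_* columns omitted)
belongliness <- data.table(
  Event    = "Head-up",
  cluster1 = c(0.1900989, 0.5392218, 0.7953673),
  cluster2 = c(0.768225365, 0.038558310, 0.044981152),
  cluster3 = c(0.0160654667, 0.0064671635, 0.0067121719),
  cluster4 = c(0.025610279, 0.415752686, 0.152939414)
)

# Illustrative helper: the columns to compare, given as a character vector
cluster_cols <- paste0("cluster", 1:4)

# Per row, find the position of the largest value and look up its column name.
# ties.method = "first" picks the leftmost column when values are tied.
belongliness[, winning_cluster := names(.SD)[max.col(.SD, ties.method = "first")],
             .SDcols = cluster_cols]

print(belongliness$winning_cluster)
# [1] "cluster2" "cluster1" "cluster1"
```

The result matches what the question expects for rows 1 to 3. Passing the column names as a character vector is an alternative to the cluster1:cluster4 range and is convenient when the columns are not adjacent.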

