How to add a calculated column based on values in rows

Stanislav Jirak

I want to add a calculated column based on values in rows.

My data look like:

df.head(5)

  CountryCode question_code             answer  percentage
0     Austria          b1_a    Very widespread           8
1     Austria          b1_a  Fairly widespread          34
2     Austria          b1_a        Fairly rare          45
3     Austria          b1_a          Very rare           9
4     Austria          b1_a         Don`t know           4

I tried:

def scoring(df):
    df.answer == 'Very widespread':
        return df.percentage*(-2)
    df.answer == 'Fairly widespread':
        return df.percentage*(-1)
    df.answer == 'Fairly rare':
        return df.percentage
    df.answer == 'Very rare':
        return df.percentage*2
    df.answer == 'Don`t know':
        return 0

which yields:

File "", line 3
    df.answer == 'Very widespread':
                                  ^
SyntaxError: invalid syntax

Help would be appreciated.

Mustafa Aydın

You seem to have forgotten to write ifs and elifs:

def scoring(row):
    if row.answer == "Very widespread":
        return row.percentage*(-2)
    elif row.answer == "Fairly widespread":
        return row.percentage*(-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage*2
    elif row.answer == "Don't know":
        return 0

Here df is renamed to row because apply with axis=1 passes each row to the function rather than the whole frame. You can then call it like this:

df["scores"] = df.apply(scoring, axis=1)

to get

>>> df

  CountryCode question_code             answer  percentage  scores
0     Austria          b1_a    Very widespread           8     -16
1     Austria          b1_a  Fairly widespread          34     -34
2     Austria          b1_a        Fairly rare          45      45
3     Austria          b1_a          Very rare           9      18
4     Austria          b1_a         Don't know           4       0
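
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the five rows shown in the question, and I am assuming the last label is spelled with a plain apostrophe ("Don't know"), as in the output above. Note that the function falls through and returns None for any label it does not list.

```python
import pandas as pd

# Sample rows copied from the question (assumption: plain apostrophe in "Don't know").
df = pd.DataFrame({
    "CountryCode": ["Austria"] * 5,
    "question_code": ["b1_a"] * 5,
    "answer": ["Very widespread", "Fairly widespread", "Fairly rare",
               "Very rare", "Don't know"],
    "percentage": [8, 34, 45, 9, 4],
})

def scoring(row):
    # Each call receives one row (a Series), so plain if/elif comparisons work.
    if row.answer == "Very widespread":
        return row.percentage * (-2)
    elif row.answer == "Fairly widespread":
        return row.percentage * (-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage * 2
    elif row.answer == "Don't know":
        return 0
    # No branch matched: the function implicitly returns None here.

# axis=1 makes apply call scoring once per row instead of once per column.
df["scores"] = df.apply(scoring, axis=1)
print(df)
```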

Better yet, we can build a mapping from each answer to its multiplier beforehand:

mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}

After mapping the answers with it, the result can be multiplied by the percentages:

df["scores"] = df.answer.map(mapping).mul(df.percentage)

which gives the same result as above.
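
As a rough sketch of the mapping approach on the same sample rows (the frame is again rebuilt by hand, and the names unmapped and labels are only for illustration): any answer that is not a key of the dictionary comes back from map as NaN, so a label spelled differently, such as the backtick variant "Don`t know" from the question's printout, would need its own key or a cleanup step first.

```python
import pandas as pd

# Sample rows copied from the question (assumption: plain apostrophe in "Don't know").
df = pd.DataFrame({
    "answer": ["Very widespread", "Fairly widespread", "Fairly rare",
               "Very rare", "Don't know"],
    "percentage": [8, 34, 45, 9, 4],
})

mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}

# Look up each answer's multiplier, then multiply element-wise by percentage.
df["scores"] = df["answer"].map(mapping).mul(df["percentage"])
print(df)

# Labels missing from the mapping become NaN rather than raising an error.
labels = pd.Series(["Very rare", "Don`t know"])  # second label uses a backtick
unmapped = labels.map(mapping)
print(unmapped.isna().tolist())
```

Checking df["answer"].map(mapping).isna().any() before multiplying is a cheap way to catch such mismatches.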
