I want to add a calculated column based on values in rows.
My data look like:
df.head(5)
   CountryCode  question_code             answer  percentage
0      Austria           b1_a    Very widespread           8
1      Austria           b1_a  Fairly widespread          34
2      Austria           b1_a        Fairly rare          45
3      Austria           b1_a          Very rare           9
4      Austria           b1_a         Don`t know           4
I tried:
def scoring(df):
    df.answer == 'Very widespread':
        return df.percentage*(-2)
    df.answer == 'Fairly widespread':
        return df.percentage*(-1)
    df.answer == 'Fairly rare':
        return df.percentage
    df.answer == 'Very rare':
        return df.percentage*2
    df.answer == 'Don`t know':
        return 0
which yields:
  File "", line 3
    df.answer == 'Very widespread':
                                  ^
SyntaxError: invalid syntax
Help would be appreciated.
You seem to have forgotten to write the ifs and elifs:
def scoring(row):
    if row.answer == "Very widespread":
        return row.percentage*(-2)
    elif row.answer == "Fairly widespread":
        return row.percentage*(-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage*2
    elif row.answer == "Don't know":
        return 0
where df is renamed to row because apply will pass each row to the function rather than the whole frame. You can then do:
df["scores"] = df.apply(scoring, axis=1)
to get
>>> df
   CountryCode  question_code             answer  percentage  scores
0      Austria           b1_a    Very widespread           8     -16
1      Austria           b1_a  Fairly widespread          34     -34
2      Austria           b1_a        Fairly rare          45      45
3      Austria           b1_a          Very rare           9      18
4      Austria           b1_a         Don't know           4       0
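For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the five rows shown in the question, with a plain apostrophe in "Don't know"; that is an assumption about how the real data is stored.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's df.head(5);
# the plain apostrophe in "Don't know" is an assumption.
df = pd.DataFrame({
    "CountryCode": ["Austria"] * 5,
    "question_code": ["b1_a"] * 5,
    "answer": ["Very widespread", "Fairly widespread",
               "Fairly rare", "Very rare", "Don't know"],
    "percentage": [8, 34, 45, 9, 4],
})


def scoring(row):
    # `row` is a single row of the frame (a Series), not the whole frame
    if row.answer == "Very widespread":
        return row.percentage * (-2)
    elif row.answer == "Fairly widespread":
        return row.percentage * (-1)
    elif row.answer == "Fairly rare":
        return row.percentage
    elif row.answer == "Very rare":
        return row.percentage * 2
    elif row.answer == "Don't know":
        return 0


# axis=1 makes apply call scoring once per row
df["scores"] = df.apply(scoring, axis=1)
print(df)
```

Note that scoring falls through and returns None for any answer it does not list, so an unexpected label would show up as a missing score rather than an error.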
But better yet, we can build a mapping of multipliers beforehand and map the answer column with it:
mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}
After mapping the answers with this, the result can be multiplied by the percentages:
df["scores"] = df.answer.map(mapping).mul(df.percentage)
which gives the same result as above.
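One thing to watch with the mapping approach: Series.map turns any label that is not a key of the dict into NaN, and that NaN carries through the multiplication. The question's printout shows "Don`t know" with a backtick, which would not match the "Don't know" key. The sketch below uses a small hypothetical sample to show the effect, and normalizing the backtick first is just one possible fix, assuming the backtick really is in the stored data.

```python
import pandas as pd

mapping = {"Very widespread": -2,
           "Fairly widespread": -1,
           "Fairly rare": 1,
           "Very rare": 2,
           "Don't know": 0}

# Hypothetical sample: the second label uses a backtick instead of an apostrophe
answers = pd.Series(["Very rare", "Don`t know", "Don't know"])
percentages = pd.Series([9, 4, 4])

# Labels missing from the mapping become NaN, and NaN survives the multiplication
scores = answers.map(mapping).mul(percentages)
print(scores)

# One option: replace the backtick with an apostrophe before mapping
cleaned = answers.str.replace("`", "'", regex=False)
scores_cleaned = cleaned.map(mapping).mul(percentages)
print(scores_cleaned)
```

Checking scores.isna().any() after the mapping step is a cheap way to catch labels the dict does not cover.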