How can I calculate a new column in pandas whose value depends on another column? I need to build column-name strings to select the columns to sum

Daniel Luizet

I have a column called 'big' that holds a number (15, ..., 28, etc.). Depending on this number, I need to sum the columns named after the 5 preceding numbers. I mean something like this:

big c15 c16 c17 ... c27 c28
23 1 0 1 ... 1 0
21 1 1 0 ... 1 1
... 0 0 1 ... 1 0
25 1 0 1 ... 1 1

So, depending on the 'big' column, for example 25, my new column should hold the sum 'c24'+'c23'+'c22'+'c21'+'c20', and the result must be stored in the new column.

I have tried several approaches, but none of them works. My code is below:

def test_fun(df):
    if (df['big'] > 19).all():
        pc = []
        for i in range(1,6):
            x = 'c' + (df['big'] - i).apply(str)
            pc.append(x)
        y = df[pc].sum(axis = 1)
        return y
    elif (df['big'] == 19).all():
        pc = []
        for i in range(1,5):
            x = 'c' + (df['big'] - i).apply(str)
            pc.append(x)
        y = df[pc].sum(axis = 1)
        return y
    elif (df['big'] == 18).all():
        pc = []
        for i in range(1,4):
            x = 'c' + (df['big'] - i).apply(str)
            pc.append(x)
        y = df[pc].sum(axis = 1)
        return y
    else:
        pc = []
        for i in range(1,3):
            x = 'c' + (df['big'] - i).apply(str)
            pc.append(x)
        y = df[pc].sum(axis = 1)
        return y

df['new_column'] = df.apply(lambda row: test_fun(df), axis = 1)

I added several conditions because my table currently runs from column c15 to c28, but it will grow over time.

Finally, when I use df.apply() to apply my function row by row, I get several errors. One of them is:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

That's why I added .all() to my if/elif/else conditions. I also get:

raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index  'c27', 'c27', ...,\n   ('c26', 'c26',...  dtype='object')] are in the [columns]"

Do you know what I am doing wrong?
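
Both errors quoted above can be reproduced on a tiny frame. The sketch below uses a made-up two-row DataFrame (an assumption for illustration, not the data from the question) to show the likely cause: the lambda passes the whole `df` to the function instead of the current `row`, so `df['big']` is a Series rather than a single number.

```python
import pandas as pd

# Hypothetical miniature of the table in the question.
df = pd.DataFrame({"big": [23, 21], "c20": [1, 0], "c21": [0, 1], "c22": [1, 1]})

# With the whole frame, the comparison yields one boolean per row,
# which cannot be used directly in an `if`.
mask = df["big"] > 19
raised = False
try:
    if mask:
        pass
except ValueError as err:
    raised = True
    print("ValueError:", err)

# With the whole frame, the "column name" is a Series of names (one per row),
# not a single label, so df[[...]] cannot find it among the columns.
names = "c" + (df["big"] - 1).apply(str)
print(names.tolist())

# With a single row (what apply(axis=1) passes in), 'big' is a scalar
# and the column name is an ordinary string.
row = df.iloc[0]
name = "c" + str(row["big"] - 1)
print(name)
```

Under that reading, writing the function in terms of `row` and calling `df.apply(func, axis=1)` avoids both errors, and `.all()` is no longer needed.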

Chris

One way using pandas.DataFrame.apply:

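def get_big(series):
    # series is one row of the DataFrame, so series["big"] is a single number
    n = series["big"]
    # names of the 5 preceding columns, e.g. c24 ... c20 when n == 25
    indices = ["c%s" % i for i in range(n-1, n-6, -1)]
    # keep only the names that actually exist as columns
    indices = series.index.intersection(indices)
    return series[indices].sum()

df.apply(get_big, axis=1)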

Sample data

   c20  c21  c22  c23  c24  c25  c26  c27  c28  c29  big
0    0    1    1    0    1    0    1    1    1    0   21
1    1    0    1    0    0    0    1    0    1    0   28
2    1    1    0    1    0    1    0    1    0    0   20
3    0    0    0    0    1    0    0    1    0    1   20
4    1    1    0    1    0    0    0    0    0    0   23
5    1    0    0    1    0    0    0    1    0    0   25
6    0    1    0    0    1    1    1    0    1    0   23
7    1    0    1    0    0    0    0    1    0    1   20
8    1    0    1    0    1    1    0    0    0    1   26
9    0    0    0    1    1    0    1    1    0    1   25

Output:

0    0
1    1
2    0
3    0
4    2
5    2
6    1
7    0
8    3
9    2
dtype: int64
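
For completeness, here is a self-contained sketch that rebuilds the sample data above, runs the same `get_big` function, and stores the result in a column. The column name `new_column` is taken from the question; the way the frame is constructed from a list of rows is only an assumption for illustration.

```python
import pandas as pd

# Sample data from the answer: columns c20 ... c29 followed by 'big'.
columns = ["c%s" % i for i in range(20, 30)] + ["big"]
rows = [
    [0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 21],
    [1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 28],
    [1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 20],
    [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 20],
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 23],
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 25],
    [0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 23],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 20],
    [1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 26],
    [0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 25],
]
df = pd.DataFrame(rows, columns=columns)


def get_big(series):
    n = series["big"]
    # Names of the 5 preceding columns; missing ones are dropped below.
    indices = ["c%s" % i for i in range(n - 1, n - 6, -1)]
    indices = series.index.intersection(indices)
    return series[indices].sum()


# Store the per-row sums in the new column.
df["new_column"] = df.apply(get_big, axis=1)
print(df[["big", "new_column"]])
```

Because of the `intersection` step, rows whose 'big' value points below the first available column (for example 20 here) simply sum whatever columns exist, which removes the need for the separate 19/18 branches in the question.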

