How can I conditionally sum values from different columns after aggregation?

user2549803

I have this dataframe to begin with:

ID PRODUCT_ID        NAME  STOCK  SELL_COUNT DELIVERED_BY PRICE_A PRICE_B
1         P1  PRODUCT_P1     12          15          UPS   32,00   40,00
2         P2  PRODUCT_P2      4           3          DHL    8,00     NaN
3         P3  PRODUCT_P3    120          22          DHL     NaN  144,00
4         P1  PRODUCT_P1    423          18          UPS   98,00     NaN
5         P2  PRODUCT_P2      0           5          GLS   12,00   18,00
6         P3  PRODUCT_P3     53          10          DHL   84,00     NaN
7         P4  PRODUCT_P4     22           0          UPS    2,00     NaN
8         P1  PRODUCT_P1     94          56          GLS     NaN   49,00
9         P1  PRODUCT_P1      9          24          GLS     NaN    1,00

What I'm trying to achieve is, after aggregating by PRODUCT_ID, to sum PRICE_A or PRICE_B depending on which one has a value (prioritizing PRICE_A if both are set).

Based on @WeNYoBen's helpful answer, I now know how to conditionally apply aggregation functions depending on different columns:

import pandas as pd

def custom_aggregate(grouped):

    data = {
        'STOCK': grouped.loc[grouped['DELIVERED_BY'] == 'UPS', 'STOCK'].min(),
        'TOTAL_SELL_COUNT': grouped.loc[grouped['ID'] > 6, 'SELL_COUNT'].sum(min_count=1),
        'COND_SELL_COUNT': grouped.loc[grouped['SELL_COUNT'] > 10, 'SELL_COUNT'].sum(min_count=1),
        # THIS IS WHERE THINGS GET FOGGY...
        # I somehow need to add a second condition here that says:
        # if PRICE_A is missing but PRICE_B is set, use the PRICE_B value for the sum()
        'COND_PRICE': grouped.loc[grouped['PRICE_A'].notna(), 'PRICE_A'].sum()
    }

    d_series = pd.Series(data)
    return d_series

result = df_products.groupby('PRODUCT_ID').apply(custom_aggregate)

I really don't know if this is possible by using the .loc function. One way to solve this could be to create an additional column before calling .groupby that already contains the correct price values. But I thought there might be a more flexible way of doing this. I'd be happy to somehow apply a custom function for the 'COND_PRICE' value calculation that gets executed before passing the results to sum(). In SQL I could nest x levels of CASE WHEN END statements in order to implement this kind of logic. Just curious about how to implement this flexibility in pandas.
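For reference, the "additional column" idea and the SQL-style CASE WHEN chain mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: the small DataFrame is made-up toy data (not the table from the question), the column name PRICE is hypothetical, and prices are assumed to be floats already. `np.select` picks, per row, the first choice whose condition is true, which mirrors nested CASE WHEN branches.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only (not the question's table)
df = pd.DataFrame({
    'PRODUCT_ID': ['P1', 'P1', 'P2', 'P2'],
    'PRICE_A': [32.0, np.nan, 8.0, np.nan],
    'PRICE_B': [40.0, 49.0, np.nan, np.nan],
})

# Conditions are evaluated in order, like CASE WHEN ... WHEN ... ELSE ... END
conditions = [df['PRICE_A'].notna(), df['PRICE_B'].notna()]
choices = [df['PRICE_A'], df['PRICE_B']]

# Hypothetical helper column holding the price that should be summed
df['PRICE'] = np.select(conditions, choices, default=np.nan)

# min_count=1 keeps NaN for groups in which every price is missing
totals = df.groupby('PRODUCT_ID')['PRICE'].sum(min_count=1)
print(totals)
```

More branches can be added by appending further entries to `conditions` and `choices`.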

Thanks a lot.

BENY

The solution is to use fillna:

import pandas as pd

def custom_aggregate(grouped):

    data = {
        'STOCK': grouped.loc[grouped['DELIVERED_BY'] == 'UPS', 'STOCK'].min(),
        'TOTAL_SELL_COUNT': grouped.loc[grouped['ID'] > 6, 'SELL_COUNT'].sum(min_count=1),
        'COND_SELL_COUNT': grouped.loc[grouped['SELL_COUNT'] > 10, 'SELL_COUNT'].sum(min_count=1),
        # fillna: use PRICE_A where it is set, otherwise fall back to PRICE_B;
        # rows where both are NaN stay NaN and are skipped by sum()
        'COND_PRICE': grouped['PRICE_A'].fillna(grouped['PRICE_B']).sum()
    }

    d_series = pd.Series(data)
    return d_series
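
To check the COND_PRICE line in isolation, here is a minimal sketch that rebuilds only the relevant columns of the question's table. It assumes the prices are stored as floats (the comma decimals in the printout are taken to be display formatting), and the names `cond_price` and `cond_price_by_product` are made up for this example.

```python
import numpy as np
import pandas as pd

# Relevant columns from the question's table; prices assumed to be floats
df_products = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'PRODUCT_ID': ['P1', 'P2', 'P3', 'P1', 'P2', 'P3', 'P4', 'P1', 'P1'],
    'PRICE_A': [32.0, 8.0, np.nan, 98.0, 12.0, 84.0, 2.0, np.nan, np.nan],
    'PRICE_B': [40.0, np.nan, 144.0, np.nan, 18.0, np.nan, np.nan, 49.0, 1.0],
})

def cond_price(grouped):
    # PRICE_A where present, otherwise PRICE_B, then summed per group
    return grouped['PRICE_A'].fillna(grouped['PRICE_B']).sum()

# Selecting the two price columns keeps the grouping column out of apply()
cond_price_by_product = (
    df_products.groupby('PRODUCT_ID')[['PRICE_A', 'PRICE_B']].apply(cond_price)
)
print(cond_price_by_product)
```

Note that a plain `sum()` returns 0 for a group in which both prices are missing on every row; passing `min_count=1`, as the other aggregations do, would return NaN for such a group instead.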

Edited at 2020-12-8
