Create dynamic ranges and calculate mean

windwalker

I would like to create an additional column with averages based on column A, using dynamic ranges.

import numpy as np
import pandas as pd
    
test = {'A' : [100, 120, 70, 300, 190, 70, 300, 190, 70],
        'B' : [80, 50, 64, 288, 172, 64, 288, 172, 64],
        'C' : ['NO', 'NO', 'YES', 'NO', 'YES', 'YES', 'NO', 'YES', 'YES'],
        'D' : [0, 1, 0, 3, 2, 2, 3, 1, 4] }

df = pd.DataFrame(data=test)

     A      B      C      D    
0   100    80     NO      0
1   120    50     NO      1
2    70    64    YES      0
3   300   288     NO      3
4   190   172    YES      2
5    70    64    YES      2
6   300   288     NO      3
7   190   172    YES      1
8    70    64    YES      4

When item in column C is YES get average value from dynamic range in column A by using value in column D as starting row index and row index of current row -1 as highest row index.

Below is the outcome I am seeking to achieve.

     A      B      C      D    Dyn_Ave    
0   100    80     NO      0     NaN
1   120    50     NO      1     NaN
2    70    64    YES      0     110
3   300   288     NO      3     NaN
4   190   172    YES      2     185
5    70    64    YES      2     187
6   300   288     NO      3     NaN
7   190   172    YES      1     175
8    70    64    YES      4     188

My attempt at creating the column has lead me to a np.where approach, although I am coming across the following error - TypeError: Cannot index by location index with a non-integer key

df['Dyn_Ave'] = np.where(df['C'] == 'YES', df['A'].iloc[df['D']:df.loc['C'][-1]].mean(), np.nan)

Quang Hoang

Let's try:

s = df['A'].cumsum().shift(fill_value=0)

df['Dyn_Ave'] = np.where(df['C'] == 'YES', 
                         (s - s.reindex(df['D']).values) / (np.arange(len(df)) - df['D']),           
                         np.nan)

Output:

     A    B    C  D     Dyn_Ave
0  100   80   NO  0         NaN
1  120   50   NO  1         NaN
2   70   64  YES  0  110.000000
3  300  288   NO  3         NaN
4  190  172  YES  2  185.000000
5   70   64  YES  2  186.666667
6  300  288   NO  3         NaN
7  190  172  YES  1  175.000000
8   70   64  YES  4  187.500000

Explanation: Let's first forget about C=='YES' for a moment and paying attention to dynamic average. The average from row df['D'] to row j-1 can be seen as

(cumsum[j-1] - cumsum[df['D']-1])/(j-df['D'])

or:

(cumsum.shift()[j] - cumsum.shift()[df['D']) / (j-df['D'])

That's why we first calculate the cumsum, then shift it:

s = df['A'].cumsum().shift(fill_value=0)

To get the cumsum at df['D'], we use reindex and pass the underlying numpy array for subtraction:

(s - s.reindex(df['D']).values)

The number of rows can be easily seen as:

(np.arange(len(df)) - df['D'])

The last part is just filling in where C=='YES', just as you were trying to accomplish.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2020-12-20

Comments

0 comments

TOP Ranking

Article

Create dynamic ranges and calculate mean

Create dynamic ranges and calculate mean

pump.io port in URL

Loopback Error: connect ECONNREFUSED 127.0.0.1:3306 (MAMP)

Can't pre-populate phone number and message body in SMS link on iPhones when SMS app is not running in the background

How to import an asset in swift using Bundle.main.path() in a react-native native module

Failed to listen on localhost:8000 (reason: Cannot assign requested address)

Spring Boot JPA PostgreSQL Web App - Internal Authentication Error

ngClass error (Can't bind ngClass since it isn't a known property of div) in Angular 11.0.3

Using Response.Redirect with Friendly URLS in ASP.NET

Can a 32-bit antivirus program protect you from 64-bit threats

Double spacing in rmarkdown pdf

How to fix "pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, '<'" using YOLOv3?

3D Touch Peek Swipe Like Mail

Bootstrap 5 Static Modal Still Closes when I Click Outside

Assembly definition can't resolve namespaces from external packages

Vector input in shiny R and then use it

Emulator wrong screen resolution in Android Studio 1.3

Svchost high CPU from Microsoft.BingWeather app errors

Graphics Context misaligned on first paint

Python connect to firebird docker database

Is this docker-for-mac password dialog legit?

How to save models trained locally in Amazon SageMaker?