pandas: Calculate the difference from a grouped average

robroc

I have sensor data for multiple sensors by month and year:

import pandas as pd
df = pd.DataFrame([
 ['A', 'Jan', 2015, 13], 
 ['A', 'Feb', 2015, 10], 
 ['A', 'Jan', 2016, 12], 
 ['A', 'Feb', 2016, 11], 
 ['B', 'Jan', 2015, 7],
 ['B', 'Feb', 2015, 8], 
 ['B', 'Jan', 2016, 4], 
 ['B', 'Feb', 2016, 9]
], columns = ['sensor', 'month', 'year', 'value'])

In [2]: df
Out[2]:
    sensor month  year  value
0      A   Jan  2015     13
1      A   Feb  2015     10
2      A   Jan  2016     12
3      A   Feb  2016     11
4      B   Jan  2015      7
5      B   Feb  2015      8
6      B   Jan  2016      4
7      B   Feb  2016      9

I calculated the average for each sensor and month with a groupby:

month_avg = df.groupby(['sensor', 'month']).mean()['value']

In [3]: month_avg
Out[3]:
sensor  month
A       Feb      10.5
        Jan      12.5
B       Feb       8.5
        Jan       5.5

Now I want to add a column to df with the difference from the monthly averages, something like this:

  sensor month  year  value  diff_from_avg
0      A   Jan  2015     13            0.5
1      A   Feb  2015     10           -0.5
2      A   Jan  2016     12           -0.5
3      A   Feb  2016     11            0.5
4      B   Jan  2015      7            1.5
5      B   Feb  2015      8           -0.5
6      B   Jan  2016      4           -1.5
7      B   Feb  2016      9            0.5

I tried giving df the same multi-index as month_avg and then doing a simple subtraction, but it didn't work:

df = df.set_index(['sensor', 'month'])
df['diff_from_avg'] = month_avg - df.value

Thank you for any advice.

piRSquared

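Use transform to broadcast each group's mean back onto the original rows, then assign the result as a new column. This works on the original df, before any set_index, and assign returns a new DataFrame rather than modifying df in place:

diff_from_avg = df.value - df.groupby(['sensor', 'month']).value.transform('mean')
df.assign(diff_from_avg=diff_from_avg)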

  sensor month  year  value  diff_from_avg
0      A   Jan  2015     13            0.5
1      A   Feb  2015     10           -0.5
2      A   Jan  2016     12           -0.5
3      A   Feb  2016     11            0.5
4      B   Jan  2015      7            1.5
5      B   Feb  2015      8           -0.5
6      B   Jan  2016      4           -1.5
7      B   Feb  2016      9            0.5
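
For anyone wondering why transform lines up where the aggregated month_avg did not, here is a small self-contained sketch. It rebuilds the question's df, repeats the transform approach, and also shows one possible alternative that reuses the aggregated Series through a join on the two key columns. The names group_mean, via_transform, joined and via_join are only illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['A', 'Jan', 2015, 13],
        ['A', 'Feb', 2015, 10],
        ['A', 'Jan', 2016, 12],
        ['A', 'Feb', 2016, 11],
        ['B', 'Jan', 2015, 7],
        ['B', 'Feb', 2015, 8],
        ['B', 'Jan', 2016, 4],
        ['B', 'Feb', 2016, 9],
    ],
    columns=['sensor', 'month', 'year', 'value'],
)

# transform('mean') returns one value per original row (the mean of that
# row's group) and keeps df's index, so the subtraction aligns row by row.
group_mean = df.groupby(['sensor', 'month'])['value'].transform('mean')
via_transform = df.assign(diff_from_avg=df['value'] - group_mean)

# Alternative sketch: keep the aggregated Series (one row per group) and
# join it back onto df using the grouping columns as keys.
month_avg = df.groupby(['sensor', 'month'])['value'].mean().rename('month_avg')
joined = df.join(month_avg, on=['sensor', 'month'])
via_join = joined.assign(
    diff_from_avg=joined['value'] - joined['month_avg']
).drop(columns='month_avg')

print(via_transform)
```

Both routes should give the same diff_from_avg column. transform is shorter when only the difference is needed, while the join may be handy if you also want to keep the group average as its own column.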
