Calculate average and maximum value for a subset of rows in a pandas DataFrame

user5421875

I have a dataframe that looks like this:

 date        session  time     x1        x2   x3  x4   x5   x6
 2015-05-22  1        morning  Tom       129  1   129  45   67
 2015-05-22  1        morning  Kate      0    1   670  89   34
 2015-05-22  1        noon     GroupeId  0    1   45   56   13
 2015-05-26  2        noon     Hence     129  1   167  7    13
 2015-05-26  2        evening  Kate      0        987  876  478
 2015-05-26  3        night    Julie     0    1   567       8

I need to calculate a different aggregate per column for each session: the average of x2, the maximum of x4, and the sum of x3. The example has three sessions, but the real dataframe has many more rows and sessions. I found a lot of examples for averaging several columns, but that is not exactly what I'm looking for.

I tried building a multilevel dataframe with multi_df = df.set_index(['session', 'index'], inplace=False) and then calling multi_df.groupby(level=1).sum().to_csv('output.csv', sep='\t'), but the result doesn't make sense.

Any advice, or an example of the kind of transformation I'm looking for, is appreciated.

hilberts_drinking_problem

Are you looking for something like this, i.e. a way to aggregate with a specific function per column?

import pandas as pd

# Read the tab-separated data file.
df = pd.read_csv('temp.txt', sep='\t')

# Apply a different aggregation function to each column, per session.
df_agg = df.groupby('session').agg({
    'x2': 'mean',
    'x3': 'sum',
    'x4': 'max',  # the question asks for the maximum of x4
})

# You can apply more than one function to a column like so:
df_agg_multifunc = df.groupby('session').agg({
    'x2': ['mean', 'std'],
    'x3': ['sum', 'std'],
    'x4': ['max', 'std'],
})
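
If you want to try this without a data file, here is a minimal sketch that rebuilds the sample rows from the question inline. It assumes the two blank cells are missing values, and it uses named aggregation (available in pandas 0.25 and later) so the output columns get flat names. The output names x2_mean, x3_sum and x4_max are made up for illustration. It also shows the index-based variant: in the original attempt, groupby(level=1) groups by the second index level rather than by session, so grouping by the level name is probably closer to what was intended.

```python
import io

import pandas as pd

# Sample rows copied from the question; the two blank cells are assumed
# to be missing values and are read as NaN.
raw = """date,session,time,x1,x2,x3,x4,x5,x6
2015-05-22,1,morning,Tom,129,1,129,45,67
2015-05-22,1,morning,Kate,0,1,670,89,34
2015-05-22,1,noon,GroupeId,0,1,45,56,13
2015-05-26,2,noon,Hence,129,1,167,7,13
2015-05-26,2,evening,Kate,0,,987,876,478
2015-05-26,3,night,Julie,0,1,567,,8
"""
df = pd.read_csv(io.StringIO(raw))

# Named aggregation: output_name=(source_column, function).
# The output names are hypothetical; pick whatever suits you.
summary = df.groupby("session").agg(
    x2_mean=("x2", "mean"),
    x3_sum=("x3", "sum"),
    x4_max=("x4", "max"),
)
print(summary)

# Index-based variant: with a MultiIndex, group by the level *name*
# (or level=0) to aggregate per session.
by_level = (
    df.set_index(["session", "x1"])
    .groupby(level="session")["x4"]
    .max()
)
print(by_level)
```

Missing values are skipped by mean, sum and max by default, so the blank x3 cell in session 2 does not turn the sum into NaN. To write the result out as in the question, summary.to_csv('output.csv', sep='\t') should work.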


Edited at 2021-04-05
