I am working in Python and have a DataFrame like the one below. It contains subject_id (which refers to the patient ID), hour_measure (from 1 to 22), and the patient's other measurements.

  subject_id  |  hour_measure  |  heart rate  |  urinecolor  |  blood pressure
       3      |       1        |      40      |  red         |  high
       3      |       2        |      60      |  red         |  high
       3      |       ..       |      ..      |  ..          |  ..
       3      |       22       |      90      |  red         |  high

       4      |       3        |      60      |  yellow      |  low
       4      |       3        |      60      |  yellow      |  low
       4      |       22       |      90      |  red         |  high

I want to aggregate each subject's measurements: max, min, skew, etc. for the numeric features, and the first and last value for the categorical ones.

I wrote the following code:

import pandas as pd

df = pd.read_csv(path)

# numeric statistics per subject and hour
df1 = (df.groupby(['subject_id', 'hour_measure'])
         .agg(['sum', 'min', 'max', 'median', 'var', 'skew']))

# most frequent value (mode) for the categorical columns
f = lambda x: next(iter(x.mode()), None)
cols = df.select_dtypes(object).columns
df2 = df.groupby(['subject_id', 'hour_measure'])[cols].agg(f)
df2.columns = pd.MultiIndex.from_product([df2.columns, ['mode']])
print(df2)

# combine both results and move hour_measure into the columns
df3 = pd.concat([df1, df2], axis=1).unstack().reorder_levels([0, 2, 1], axis=1)
print(df3)

It gives me the aggregation for every hour.

I then tried grouping by subject_id only:

df1 = (df.groupby(['subject_id'])
        .agg([ 'sum','min','max', 'median','var','skew']))

It gives me the same output and still calculates the statistics for every hour, as follows:

     subject_id  |     heart rate_1     |  heartrate_2 .... 
                |  min    max     mean  | min   max   mean ....               

I want the output to look like this:

     subject_id  |      heart rate       |   respiratory rate    |      |  urine color
                 |  min  |  max  |  mean |  min  |  max  |  mean |  ..  |  first   |  last
         3       |  50   |  60   |  55   |  40   |  65   |  20   |      |  yellow  |  red

Can anyone tell me how to edit the code to produce the wanted output? Any help will be appreciated.


Let me know if this gets you close to what you're looking for. I did not run into your issue with grouping by every hour, so I'm not sure I understood your question completely.

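import pandas as pd

# sample dataframe
# (the listing was truncated: the middle blood_pressure value of each subject
# is an assumed placeholder; only the first and last values are known from the output)
df = pd.DataFrame({
    "subject_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "hour_measure": [1, 22, 12, 5, 18, 21, 8, 18, 4],
    "blood_pressure": ["high", "low", "high", "high", "low", "low", "low", "high", "high"],
})

# sort out numeric columns before aggregating them
# (the lines between "(" and ".agg" were lost; this chain is a reconstruction)
numeric_result = (
    df.select_dtypes(include="number")
    .groupby("subject_id")
    .agg(["min", "max", "mean"])
)

# sort out categorical columns before aggregating them
categorical_result = (
    df.set_index("subject_id")
    .select_dtypes(exclude="number")
    .groupby("subject_id")
    .agg(["first", "last"])
)

# combine numeric and categorical results
result = numeric_result.join(categorical_result)
print(result)
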
                 hour_measure                blood_pressure
                    min max       mean          first  last
1                     1  22  11.666667           high  high
2                     5  21  14.666667           high   low
3                     4  18  10.000000            low  high
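
Along the same lines, here is a rough sketch that is closer to the layout you asked for. It is only an illustration under a few assumptions: the column names heart_rate and urine_color and the sample values are made up to mirror your table, and I assume that "first" and "last" should mean the earliest and latest hour, which is why the rows are sorted by hour_measure before grouping. Grouping by subject_id alone and leaving hour_measure out of the aggregated columns gives one row per subject.

```python
import pandas as pd

# Hypothetical sample data; column names and values only mirror the question's table.
df = pd.DataFrame(
    {
        "subject_id": [3, 3, 3, 4, 4, 4],
        "hour_measure": [2, 1, 22, 22, 3, 4],
        "heart_rate": [60, 40, 90, 90, 60, 70],
        "urine_color": ["red", "yellow", "red", "red", "yellow", "yellow"],
    }
)

# Sort by hour so that "first"/"last" pick the earliest/latest measurement
# (assumption: that is what first and last are supposed to mean).
df = df.sort_values(["subject_id", "hour_measure"])

# Numeric measurement columns, excluding the grouping key and the hour column.
numeric_cols = df.select_dtypes(include="number").columns.difference(
    ["subject_id", "hour_measure"]
)
# Everything that is not numeric is treated as categorical here.
categorical_cols = df.select_dtypes(exclude="number").columns

# Group by subject only, so there is one row per subject.
grouped = df.groupby("subject_id")

numeric_stats = grouped[list(numeric_cols)].agg(["min", "max", "mean", "skew"])
categorical_stats = grouped[list(categorical_cols)].agg(["first", "last"])

# Both frames share the subject_id index and two-level columns, so they join cleanly.
result = numeric_stats.join(categorical_stats)
print(result)
```

If the row order in your file already reflects time, the sort_values step can be dropped. Any other statistic from your original list (sum, median, var) can be added to the numeric list in the same way.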

