Write way to calculate weighted average with pandas Dataframe with mean or directly multiply

rra

I have the following pandas Dataframe:

data_df = pd.DataFrame({'id':['era','bb','cs','jd','ek','gtf','okg','huf','mji','loj','djjf','wloe','rfm','cok'],
                        'doc':[1050 ,580,170,8, 7, 220, 45155,305,458,201,48,78,256,358],
                        'dif':[1,1,1,3,3,2,2,3,4,5,8,7,9,10],
                        'usg':[102,258,7083,84,165,627,784,85,658,261,85,12,235,45],
                        'snp':[4.2,2.5,1.03,2.56,1.25,3.56,1.49,2.95,1.05,0.815,0.78,0.612,0.365,0.164]})
data_df

    id    doc      dif       usg         snp
0   era   1050      1        102        4.200
1   bb    580       1        258        2.500
2   cs    170       1        7083       1.030
3   jd    8         3        84         2.560
4   ek    7         3        165        1.250
5   gtf   220       2        627        3.560
6   okg   45155     2        784        1.490
7   huf   305       3        85         2.950
8   mji   458       4        658        1.050
9   loj   201       5        261        0.815
10  djjf  48        8        85         0.780
11  wloe  78        7        12         0.612
12  rfm   256       9        235        0.365
13  cok   358       10       45         0.164

I need to compute the weighted average of all the columns and finally need to make a rank column for the id. I'm planning to give 30%,20%,35%,15% weight for "doc", "dif,"usg" and "snp" columns. If "doc","usg" and "snp" is high it should get the higher weighted value. But for "dif" column if lower the grade it should be get the higher value. I'm actually struggling with that part. How to do that (lower grades get high marks). I simply multiply and get the weighted average. Is that the correct way? Is it possible to write a lambda function for this calculation? My calculation is like that

data_df["Weighted_Average"] = data_df["doc"]*0.3 + data_df["dif"]*.2+ data_df["usg"]*.35+ data_df["snp"]*.15
data_df

    id    doc      dif       usg     snp   Weighted_Average
0   era   1050     1         102    4.2     351.53
1   bb    580      1         258    2.5     264.875
2   cs    170      1         7083   1.03    2530.4045
3   jd    8        3         84     2.56    32.784
4   ek    7        3         165    1.25    60.6375
5   gtf   220      2         627    3.56    286.384
6   okg   45155    2         784    1.49    13821.5235
7   huf   305      3         85     2.95    122.2925
8   mji   458      4         658    1.05    368.6575
9   loj   201      5         261    0.815   152.77225
10  djjf  48       8         85     0.78    45.867
11  wloe  78       7         12     0.612   29.0918
12  rfm   256      9         235    0.365   160.90475
13  cok   358      10        45     0.164   125.1746

However, my calculation is wrong since I didn't consider higher marks to get lower grades. (data_df["dif"]*.2)

I saw several of them use mean to calculate to weighted average and some use directly like this too. Which is the correct way? Any help would be appriciate.

Corralien
  1. Prepare data:
weights = {"doc": 0.3, "dif": 0.2, "usg": 0.35, "snp": 0.15}

df1 = data_df[weights.keys()]  # extract numeric columns
df2 = pd.DataFrame(weights, index=data_df.index)  # build weighted dataframe

df1 = df1.assign(dif=df1["dif"].max() + 1 - df1["dif"])  # reverse grade

Note: reverse grade according to your post. You can use 1/dif too.

  1. Compute your weighted average:
data_df["Weighted_Average"] = df1.mul(df2).sum(axis="columns")
>>> data_df.sort_values("Weighted_Average", ascending=False)
      id    doc  dif   usg    snp  Weighted_Average
6    okg  45155    2   784  1.490       13822.92350
2     cs    170    1  7083  1.030        2532.20450
8    mji    458    4   658  1.050         369.25750
0    era   1050    1   102  4.200         353.33000
5    gtf    220    2   627  3.560         287.78400
1     bb    580    1   258  2.500         266.67500
12   rfm    256    9   235  0.365         159.50475
9    loj    201    5   261  0.815         152.97225
13   cok    358   10    45  0.164         123.37460
7    huf    305    3    85  2.950         123.29250
4     ek      7    3   165  1.250          61.63750
10  djjf     48    8    85  0.780          44.86700
3     jd      8    3    84  2.560          33.78400
11  wloe     78    7    12  0.612          28.49180

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

Calculate weighted average with pandas dataframe

Calculate weighted average using a pandas/dataframe

Calculate weighted average based on 2 columns using a pandas/dataframe

Pandas: calculate weighted average by row using a dataframe and a series

Calculate weighted average results for multiple columns based on another dataframe in Pandas

computing weighted average in pandas dataframe

Annual weighted average of pandas dataframe

Calculate weighted average of dataframe rows with missing values

Pandas pivot table / groupby to calculate weighted average

Calculate weighted average with Pandas for decreasing cost

Calculate a rolling window weighted average on a Pandas column

How to calculate Volume Weighted Average Price (VWAP) using a pandas dataframe with ask and bid price?

groupby weighted average and sum in pandas dataframe

How to pivot a pandas dataframe to include a weighted average?

Efficient evaluation of weighted average variable in a Pandas Dataframe

Calculating weighted average by sorting and aggregating in a pandas dataframe

Grouping columns of dataframe by other dataframe and calculate weighted average of aggregated columns

How to calculate a weighted average?

How to resample and calculate amount weighted average price in Pandas?

Pandas calculate and apply weighted rolling average on another column

Calculate weighted mean in excel

Return groupby weighted average for multiple pandas dataframe columns as a dataframe

How to group by two columns, calculate weighted mean, return DataFrame, in Python

Pandas - Take bitwise similarly and calculate weighted mean based on that

Weighted average pandas

Pandas groupby weighted average

pandas cumulative weighted average

Weighted average on pandas

Calculate Weighted Average and Weighted Standard Deviation in DAX