I have a question about my code. My data looks like this:
I want to summarize, by SKU # and by reason, what percentage of the ordered quantity was not shipped, in this format:
What I did was use df.groupby to get the sum of the cut quantity by SKU #, which gave me a DataFrame that looks like this:

SKU  Reason
A    Transportation
A    Raw Material
B    Transportation
B    Raw Material
C    Transportation
C    Raw Material

How can I change it to this format?

A  Transportation  Raw Material
B  Transportation  Raw Material
C  Transportation  Raw Material
My plan is to merge this with a second DataFrame that holds the total ordered amount per SKU #, so that I can compute the percentage for each reason. I would be more than happy to hear about a better method as well!
Compute each line's unshipped quantity as a share of the total ordered for its SKU, then sum those shares in a pivot table:
import pandas as pd

# Sample data: one row per purchase order line
data = {
    'PO Nbr': [1, 2, 3, 4, 5],
    'SKU #': ['A', 'B', 'A', 'C', 'A'],
    'Ordered': [100, 100, 500, 100, 200],
    'Shipped': [100, 80, 450, 100, 30],
    'Reason for Not Shipped': ['', 'raw material', 'transportation', '', 'transportation']
}
df = pd.DataFrame(data)

# Unshipped units on each line as a share of the total ordered for that SKU
df['Pct Not Shipped'] = (df['Ordered'] - df['Shipped']) / df.groupby('SKU #')['Ordered'].transform('sum')

# Sum the shares per SKU (rows) and per reason (columns)
print(df.pivot_table(index='SKU #', columns='Reason for Not Shipped', values='Pct Not Shipped', aggfunc='sum'))
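For comparison, the plan described in the question (sum the cut quantity per SKU and reason, reshape to one row per SKU, then divide by the total ordered) might be sketched as below. It reuses the sample data from this answer; the `Cut` column and the variable names are made up for illustration, and filtering out fully shipped lines is an assumption about how the blank reason should be handled.

```python
import pandas as pd

# Same illustrative sample data as in the answer above.
df = pd.DataFrame({
    "SKU #": ["A", "B", "A", "C", "A"],
    "Ordered": [100, 100, 500, 100, 200],
    "Shipped": [100, 80, 450, 100, 30],
    "Reason for Not Shipped": ["", "raw material", "transportation", "", "transportation"],
})

# Hypothetical helper column: units ordered but not shipped on each line.
df["Cut"] = df["Ordered"] - df["Shipped"]

# Total ordered per SKU (includes fully shipped lines).
ordered = df.groupby("SKU #")["Ordered"].sum()

# Assumption: only lines with an actual shortfall carry a meaningful reason.
cut = df[df["Cut"] > 0]

# Long format: one value per (SKU, reason) pair, as groupby produces it.
cut_long = cut.groupby(["SKU #", "Reason for Not Shipped"])["Cut"].sum()

# Wide format: one row per SKU, one column per reason.
# reindex restores SKUs that had no shortfall at all (filled with 0).
cut_wide = cut_long.unstack(fill_value=0).reindex(ordered.index, fill_value=0)

# Divide each row by that SKU's total ordered quantity.
pct_not_shipped = cut_wide.div(ordered, axis=0)

print(pct_not_shipped)
```

The `unstack` call is what turns the stacked SKU/reason rows into one row per SKU; `div(..., axis=0)` aligns on the SKU index, so no explicit merge is needed.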