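I have a pandas DataFrame with many columns. I want to create a new DataFrame with only two columns. The first column should contain each distinct value that appears in a particular column of the original DataFrame. The second column should contain all the other data from the original rows that match the value in the first column.
For example, my input DataFrame is structured as follows: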
            Name               Menu      City
0    Foo Burgers  Burgers and Fries  New York
1       Cheesy's      Cheeseburgers  New York
2  Buggy Burgers     Insect Burgers    London
3        Fry Guy              Fries    London
4      Beermania               Beer    Munich
In code:
import pandas as pd

df = pd.DataFrame([["Foo Burgers", "Burgers and Fries", "New York"],
                   ["Cheesy's", "Cheeseburgers", "New York"],
                   ["Buggy Burgers", "Insect Burgers", "London"],
                   ["Fry Guy", "Fries", "London"],
                   ["Beermania", "Beer", "Munich"]], columns=["Name", "Menu", "City"])
How can I easily convert the DataFrame into the following target structure?
City Restaurants
0 New York [{"Name": "Foo Burgers", "Menu": "Burgers and Fries"}, {"Name":"Cheesy's", "Menu": "Cheeseburgers"}]
1 London [{"Name": "Buggy Burgers", "Menu": "Insect Burgers"}, {"Name":"Fry Guy", "Menu": "Fries"}]
2 Munich [{'Name': 'Beermania', 'Menu': 'Beer'}]
In code:
goal_df = pd.DataFrame([["New York", [{"Name": "Foo Burgers", "Menu": "Burgers and Fries"}, {"Name":"Cheesy's", "Menu": "Cheeseburgers"}], ],
["London", [{"Name": "Buggy Burgers", "Menu": "Insect Burgers"}, {"Name":"Fry Guy", "Menu": "Fries"}], ],
["Munich", [{"Name": "Beermania", "Menu": "Beer"}], ]], columns=["City", "Restaurants"])
You can do a groupby().apply with to_dict:
(df.drop('City', axis=1).groupby(df['City'])
.apply(lambda x: x.to_dict(orient='records'))
.reset_index(name='Restaurants')
)
Output:
City Restaurants
0 London [{'Name': 'Buggy Burgers', 'Menu': 'Insect Bur...
1 Munich [{'Name': 'Beermania', 'Menu': 'Beer'}]
2 New York [{'Name': 'Foo Burgers', 'Menu': 'Burgers and ...
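Note that the output above lists the cities alphabetically, because groupby sorts the group keys by default, whereas the target in the question keeps them in order of first appearance. A minimal sketch of one way to keep the original order is shown below, assuming the same df as in the question; it passes sort=False to groupby and builds the rows by iterating over the groups instead of using apply. The names rows and result are illustrative only and do not come from the original answer.

```python
import pandas as pd

# Same input data as in the question.
df = pd.DataFrame(
    [["Foo Burgers", "Burgers and Fries", "New York"],
     ["Cheesy's", "Cheeseburgers", "New York"],
     ["Buggy Burgers", "Insect Burgers", "London"],
     ["Fry Guy", "Fries", "London"],
     ["Beermania", "Beer", "Munich"]],
    columns=["Name", "Menu", "City"],
)

# sort=False keeps the cities in order of first appearance.
# For each group, only the non-key columns are turned into a list of dicts.
rows = [
    {"City": city,
     "Restaurants": group[["Name", "Menu"]].to_dict(orient="records")}
    for city, group in df.groupby("City", sort=False)
]

# Hypothetical name for the resulting two-column DataFrame.
result = pd.DataFrame(rows, columns=["City", "Restaurants"])
print(result)
```

Iterating over the groups is a little more verbose than apply, but it makes explicit which columns end up in each dict and does not depend on how apply treats the grouping column.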