I have a DataFrame:
Counties Numbers
Yabucoa Municipio, Puerto Rico 7766
Marion County, West Virginia 8756
Barbour County, Alabama 33445
Santa Cruz County, Arizona 447
Navajo County, Arizona 1500
Denver County, Colorado 67990
I am trying to sort it so that the states are in alphabetical order and the counties are sorted alphabetically within each state:
Counties Numbers
Barbour County, Alabama 33445
Navajo County, Arizona 1500
Santa Cruz County, Arizona 447
Denver County, Colorado 67990
Yabucoa Municipio, Puerto Rico 7766
Marion County, West Virginia 8756
Code for the DataFrame:
import pandas as pd

df_test = pd.DataFrame([
{'Counties': 'Yabucoa Municipio, Puerto Rico','Numbers': 7766},
{'Counties': 'Marion County, West Virginia','Numbers': 8756},
{'Counties': 'Barbour County, Alabama','Numbers': 33445},
{'Counties': 'Santa Cruz County, Arizona','Numbers': 447},
{'Counties': 'Navajo County, Arizona','Numbers': 1500},
{'Counties': 'Denver County, Colorado','Numbers': 67990}
])
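I have tried using sorted and split, but the code below does not give the desired output (it only reorders the pieces inside each string, not the rows):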
df_test['Counties'] = df_test['Counties'].apply(lambda x: ','.join(sorted(x.split(','))))
What should I do? Please help. Thanks!
One possible approach is the following. Start from the DataFrame:
import pandas as pd

df = pd.DataFrame(
[
{"Counties": "Yabucoa Municipio, Puerto Rico", "Numbers": 7766},
{"Counties": "Marion County, West Virginia", "Numbers": 8756},
{"Counties": "Barbour County, Alabama", "Numbers": 33445},
{"Counties": "Santa Cruz County, Arizona", "Numbers": 447},
{"Counties": "Navajo County, Alabama", "Numbers": 1500},
{"Counties": "Denver County, Colorado", "Numbers": 67990},
]
)
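Then build a key to reorder by: split each entry into a county column and a state column, and sort that by state first and county second: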
re_order_key = (
df["Counties"]
.str.split(",", expand=True)
.rename(columns={0: "county", 1: "state"})
.sort_values(by=["state", "county"])
)
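Use the index of the sorted key with iloc (this works because df has the default 0..n-1 integer index; with any other index, use loc instead):
df.iloc[re_order_key.index, :].reset_index(drop=True)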
This gives:
Counties Numbers
0 Barbour County, Alabama 33445
1 Navajo County, Alabama 1500
2 Santa Cruz County, Arizona 447
3 Denver County, Colorado 67990
4 Yabucoa Municipio, Puerto Rico 7766
5 Marion County, West Virginia 8756
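Putting the steps together, here is a minimal self-contained sketch that runs on the data from the question. It assumes every entry has the form "county, state", with the state after the last comma, and that the DataFrame's index is unique. The helper name sort_by_state_then_county is made up for illustration and is not part of pandas. Unlike the snippet above, it strips the space left over after the comma and selects rows with loc, so it does not depend on the default integer index.

```python
import pandas as pd


def sort_by_state_then_county(df, column="Counties"):
    """Return df sorted by state, then by county within each state.

    Hypothetical helper: assumes each value looks like "county, state".
    """
    # Split once from the right so the text after the last comma is the state.
    parts = df[column].str.rsplit(",", n=1, expand=True)
    key = pd.DataFrame(
        {
            "state": parts[1].str.strip(),   # drop the space after the comma
            "county": parts[0].str.strip(),
        },
        index=df.index,
    )
    # Sort the key, then reuse its index labels to reorder the original rows.
    order = key.sort_values(by=["state", "county"]).index
    return df.loc[order].reset_index(drop=True)


df_test = pd.DataFrame([
    {"Counties": "Yabucoa Municipio, Puerto Rico", "Numbers": 7766},
    {"Counties": "Marion County, West Virginia", "Numbers": 8756},
    {"Counties": "Barbour County, Alabama", "Numbers": 33445},
    {"Counties": "Santa Cruz County, Arizona", "Numbers": 447},
    {"Counties": "Navajo County, Arizona", "Numbers": 1500},
    {"Counties": "Denver County, Colorado", "Numbers": 67990},
])

result = sort_by_state_then_county(df_test)
print(result)
```

The original Counties column is left untouched; the split columns exist only in the temporary key, so nothing needs to be dropped afterwards.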