I have the following three CSV files:
1.csv:
id,status,env
aaaa,PASS,PROD
aaaa,PASS,DEV
bbbb,PASS,PROD
bbbb,PASS,DEV
2.csv:
id,successPct24,env
aaaa,"99.73",PROD
aaaa,"99.89",DEV
bbbb,"100.00",PROD
bbbb,"92.53",DEV
3.csv:
id,successPctMonth,env
aaaa,"99.70",PROD
aaaa,"99.90",DEV
bbbb,"100.00",PROD
bbbb,"99.91",DEV
The goal is to create a single CSV file with the following format:
id,status,successPct24,successPctMonth,env
So, using my sample CSV files, the single CSV should look like this:
aaaa,PASS,99.73,99.7,PROD
aaaa,PASS,99.89,99.9,DEV
bbbb,PASS,100.0,100.0,PROD
bbbb,PASS,92.53,99.91,DEV
I tried to do this with the following Python code...
import pandas as pd
csv1 = pd.read_csv("1.csv", index_col=[0], usecols=["id", "status"])
csv2 = pd.read_csv("2.csv", index_col=[0], usecols=["id", "successPct24"])
csv3 = pd.read_csv("3.csv", index_col=[0], usecols=["id", "successPctMonth", "env"])
firstcsv = csv1.join(csv2)
finalcsv = firstcsv.join(csv3)
# print (finalcsv)
finalcsv.to_csv('4.csv', index=True)
...but the resulting single CSV is incorrect:
aaaa,PASS,99.73,99.7,PROD
aaaa,PASS,99.73,99.9,DEV
aaaa,PASS,99.89,99.7,PROD
aaaa,PASS,99.89,99.9,DEV
aaaa,PASS,99.73,99.7,PROD
aaaa,PASS,99.73,99.9,DEV
aaaa,PASS,99.89,99.7,PROD
aaaa,PASS,99.89,99.9,DEV
bbbb,PASS,100.0,100.0,PROD
bbbb,PASS,100.0,99.91,DEV
bbbb,PASS,92.53,100.0,PROD
bbbb,PASS,92.53,99.91,DEV
bbbb,PASS,100.0,100.0,PROD
bbbb,PASS,100.0,99.91,DEV
bbbb,PASS,92.53,100.0,PROD
bbbb,PASS,92.53,99.91,DEV
I'm sure I'm missing a parameter or have something misconfigured. Any help with this would be greatly appreciated.
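You need to join on two columns, 'id' and 'env'. Because 'id' is not unique within a file, joining on 'id' alone pairs every 'aaaa' row with every other 'aaaa' row, which is where the duplicated rows come from.
Code: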
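import pandas as pd
# Read every column so that both key columns, 'id' and 'env', are available.
df1 = pd.read_csv("1.csv")
df2 = pd.read_csv("2.csv")
df3 = pd.read_csv("3.csv")
# Left-merge on the composite key (id, env) so each row matches exactly one row.
finalcsv = df1.merge(df2, how='left', on=['id', 'env']).merge(df3, how='left', on=['id', 'env'])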
Result:
     id status   env  successPct24  successPctMonth
0  aaaa   PASS  PROD         99.73            99.70
1  aaaa   PASS   DEV         99.89            99.90
2  bbbb   PASS  PROD        100.00           100.00
3  bbbb   PASS   DEV         92.53            99.91
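To see why the key columns matter, here is a small sketch that compares the two ways of matching rows. It assumes the sample data from 1.csv and 2.csv is held in memory through io.StringIO rather than read from disk; the names by_id, by_id_env and id_is_unique_key are only illustrative.

```python
import io
import pandas as pd

# Sample data copied from the question, kept in memory (an assumption made
# so the sketch runs without any files on disk).
CSV1 = """id,status,env
aaaa,PASS,PROD
aaaa,PASS,DEV
bbbb,PASS,PROD
bbbb,PASS,DEV
"""
CSV2 = """id,successPct24,env
aaaa,"99.73",PROD
aaaa,"99.89",DEV
bbbb,"100.00",PROD
bbbb,"92.53",DEV
"""

df1 = pd.read_csv(io.StringIO(CSV1))
df2 = pd.read_csv(io.StringIO(CSV2))

# Matching on 'id' alone: each id occurs twice on both sides,
# so every id produces 2 x 2 rows.
by_id = df1.merge(df2, how="left", on="id")

# Matching on both key columns: each (id, env) pair is unique,
# so rows pair up one-to-one.
by_id_env = df1.merge(df2, how="left", on=["id", "env"])

print(len(by_id))      # 8
print(len(by_id_env))  # 4

# Optional guard: validate="one_to_one" makes pandas raise MergeError
# when the merge keys are not unique on both sides.
id_is_unique_key = True
try:
    df1.merge(df2, how="left", on="id", validate="one_to_one")
except pd.errors.MergeError:
    id_is_unique_key = False
print(id_is_unique_key)  # False
```

Passing validate="one_to_one" to the real merge is one way to fail early if a future file contains a repeated (id, env) pair, instead of silently writing extra rows.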
If you need a different column order:
finalcsv = finalcsv[['id', 'status', 'successPct24', 'successPctMonth', 'env']]
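The question also asks for a CSV file rather than a printed DataFrame. The sketch below is one way to produce it, assuming the sample rows are built in memory instead of read from the three files; the names keys, merged and csv_text are only illustrative.

```python
import pandas as pd

# Sample rows from the question, built in memory so the sketch needs no files.
keys = {"id": ["aaaa", "aaaa", "bbbb", "bbbb"],
        "env": ["PROD", "DEV", "PROD", "DEV"]}
df1 = pd.DataFrame({**keys, "status": ["PASS"] * 4})
df2 = pd.DataFrame({**keys, "successPct24": [99.73, 99.89, 100.00, 92.53]})
df3 = pd.DataFrame({**keys, "successPctMonth": [99.70, 99.90, 100.00, 99.91]})

# Merge on the composite key, then put the columns in the requested order.
merged = (
    df1.merge(df2, how="left", on=["id", "env"])
       .merge(df3, how="left", on=["id", "env"])
)
merged = merged[["id", "status", "successPct24", "successPctMonth", "env"]]

# index=False leaves out the 0..3 row labels; with no path given,
# to_csv returns the CSV text instead of writing a file.
csv_text = merged.to_csv(index=False)
print(csv_text)
```

To write the file from the question instead of printing, the same call with a path should work: merged.to_csv("4.csv", index=False).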