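Hi, I'm trying to combine several existing columns into one new column and then delete the three original columns from a CSV file. I've been trying to do this with pandas, but without much luck. I'm new to Python.
My code first merges several CSV files in the same directory and then tries to manipulate the columns. The first merge works and I get an output.csv containing the merged data, but combining the columns does not.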
import glob
import pandas as pd

interesting_files = glob.glob("*.csv")
header_saved = False
with open('output.csv','wb') as fout:
    for filename in interesting_files:
        with open(filename) as fin:
            header = next(fin)
            if not header_saved:
                fout.write(header)
                header_saved = True
            for line in fin:
                fout.write(line)

df = pd.read_csv("output.csv")
df['HostAffected']=df['Host'] + "/" + df['Protocol'] + "/" + df['Port']
df.to_csv("newoutput.csv")
Effectively, I want to turn this:
Host,Protocol,Port
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,445
10.0.0.10,tcp,49707
10.0.0.10,tcp,49672
10.0.0.10,tcp,49670
into something like this:
HostsAffected
10.0.0.10/tcp/445
10.0.0.10/tcp/445
10.0.0.10/tcp/445
10.0.0.10/tcp/445
10.0.0.10/tcp/445
10.0.0.10/tcp/445
10.0.0.11/tcp/445
10.0.0.11/tcp/49707
10.0.0.11/tcp/49672
10.0.0.11/tcp/49670
10.0.0.11/tcp/49668
10.0.0.11/tcp/49667
There are other columns in the CSV as well.
I'm not a coder, I'm just trying to solve a problem, so any help is much appreciated.
As I see it, there are three options:
%timeit df['Host'] + "/" + df['Protocol'] + "/" + df['Port'].map(str)
%timeit ['/'.join(i) for i in zip(df['Host'],df['Protocol'],df['Port'].map(str))]
%timeit ['/'.join(i) for i in df[['Host','Protocol','Port']].astype(str).values]
Timings:
10 loops, best of 3: 39.7 ms per loop
10 loops, best of 3: 35.9 ms per loop
10 loops, best of 3: 162 ms per loop
Even though it is the slowest of the three, I think this is the most readable approach for you:
import io
import pandas as pd

data = '''\
ID,Host,Protocol,Port
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,445
1,10.0.0.10,tcp,49707
1,10.0.0.10,tcp,49672
1,10.0.0.10,tcp,49670'''

df = pd.read_csv(io.StringIO(data))  # Recreates a sample dataframe
cols = ['Host', 'Protocol', 'Port']
# Cast the three columns to str (Port is read as an integer), then join each row with "/"
newcol = ['/'.join(i) for i in df[cols].astype(str).values]
# Add the combined column and drop the three originals
df = df.assign(HostAffected=newcol).drop(columns=cols)
print(df)
Output:
ID HostAffected
0 1 10.0.0.10/tcp/445
1 1 10.0.0.10/tcp/445
2 1 10.0.0.10/tcp/445
3 1 10.0.0.10/tcp/445
4 1 10.0.0.10/tcp/445
5 1 10.0.0.10/tcp/445
6 1 10.0.0.10/tcp/445
7 1 10.0.0.10/tcp/49707
8 1 10.0.0.10/tcp/49672
9 1 10.0.0.10/tcp/49670
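To tie this back to the original script, here is a minimal sketch of the whole pipeline done in pandas: read every CSV, stack them, build the combined column, drop the three originals, and write the result. The function names `combine_host_columns` and `merge_csv_files` are made up for illustration and are not part of pandas. The sketch assumes every input file has the same header, including `Host`, `Protocol` and `Port`. The likely reason the question's concatenation fails is that `Port` is read as an integer column, so it has to be converted to strings before joining.

```python
import glob
import os

import pandas as pd


def combine_host_columns(df, cols=("Host", "Protocol", "Port"), new_col="HostAffected"):
    """Return a copy of df with `cols` joined by "/" into `new_col` (hypothetical helper)."""
    cols = list(cols)
    # Cast to str first: Port is numeric after read_csv and cannot be joined as-is.
    combined = ["/".join(row) for row in df[cols].astype(str).values]
    # Add the new column, then drop the originals; all other columns are kept.
    return df.assign(**{new_col: combined}).drop(columns=cols)


def merge_csv_files(pattern, output_path):
    """Read all CSVs matching `pattern`, combine the columns, write `output_path` (hypothetical helper)."""
    out_abs = os.path.abspath(output_path)
    # Skip the output file itself so that re-running does not re-ingest old results.
    paths = [p for p in sorted(glob.glob(pattern)) if os.path.abspath(p) != out_abs]
    frames = [pd.read_csv(p) for p in paths]
    merged = pd.concat(frames, ignore_index=True)
    result = combine_host_columns(merged)
    # index=False keeps the row index out of the written file.
    result.to_csv(output_path, index=False)
    return result
```

Calling `merge_csv_files("*.csv", "newoutput.csv")` from the directory holding the input files would then replace both steps of the original script. Letting pandas read each file also avoids copying header lines by hand.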