I wrote this code to compare two CSV files (f1 and f2), both of which have 3 columns and many rows.
If the item in cell 1 of f1 matches that of f2, and the item in cell 2 of f1 matches that of f2,
it should write the values of cell 1 and cell 2 of f1 and cell 3 of f2
to a file named network_python.csv.
Code:
t = {}
with open('file1.csv') as ff:
    for f1 in csv.DictReader(ff):
        with open('file2.csv') as ff:
            for f2 in csv.DictReader(ff):
                if int(f1['From'].strip()) == int(f2['From'].strip()) and int(f1['To'].strip()) == int(f2['To'].strip()):
                    print (f1['From'], f1['To'], f2['Mode'])
                    t.update({'From': f1['From'], 'To': f1['To'], 'Mode': f2['Mode']})
with open('network_python.csv', 'w') as csvfile:
    fieldnames = ['From', 'To', 'Mode']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for k,v in t.iteritems():
        writer.writerow(t)
Sample data in file1.csv
From To Mode
1 2 cw
2 1 cw
3 4 cwt
7 2 cbt
8 9 ct
Sample data in file2.csv
From To Mode
8 9 c
3 4 cw
1 2 cwt
7 2 ct
2 1 cb
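The code works (that is, it finds the correct matches), but when writing to the file it writes only one row, overwriting the previous results. Also, is there any way to make the code more efficient? It is very slow with large files. I searched through some questions here, but they did not fully answer mine. Thank you for your time.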
You can build a dict from the two CSV files, where the key is (from, to), and then combine them to get the result:
import csv
from collections import OrderedDict
with open('file1.csv') as f:
    reader = csv.reader(f)
    reader.next()
    rows = OrderedDict((tuple(row[:2]), None) for row in reader)
with open('file2.csv') as f:
    reader = csv.reader(f)
    reader.next()
    # Skip row if matching row wasn't present in file1.csv
    rows.update({tuple(row[:2]): row[2] for row in reader if tuple(row[:2]) in rows})
with open('network_python.csv', 'wb') as csvfile:
    fieldnames = ['From', 'To', 'Mode']
    writer = csv.writer(csvfile)
    writer.writerow(fieldnames)
    # Skip row if it wasn't present in file2.csv
    writer.writerows((k[0], k[1], v) for k, v in rows.iteritems() if v is not None)
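The code above uses Python 2 idioms (reader.next(), iteritems(), and 'wb' mode for the csv module). For Python 3, the same keyed-lookup idea could be sketched roughly as follows. The function name join_on_from_to and its parameters are hypothetical, chosen only for illustration. The sketch assumes comma-delimited files with a header row, and it compares the From and To cells as stripped strings rather than converting them to int as the question does. Unlike the OrderedDict version, it keeps repeated (From, To) pairs from file1 instead of collapsing them.

```python
import csv


def join_on_from_to(path1, path2, out_path):
    """Write From/To from the first file with Mode from the second file
    for every row whose (From, To) pair appears in both files."""
    # Build the lookup once: (From, To) -> Mode from the second file.
    # If a pair repeats in the second file, the last Mode wins.
    with open(path2, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        modes = {
            (row[0].strip(), row[1].strip()): row[2].strip()
            for row in reader
            if len(row) >= 3
        }

    # Scan the first file a single time, keeping its row order.
    matched = []
    with open(path1, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or short lines
            key = (row[0].strip(), row[1].strip())
            if key in modes:
                matched.append((key[0], key[1], modes[key]))

    # Open the output once and write every collected row.
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['From', 'To', 'Mode'])
        writer.writerows(matched)
    return matched
```

Calling join_on_from_to('file1.csv', 'file2.csv', 'network_python.csv') would read each input once, so the work grows with the combined size of the two files instead of their product as in the nested loops. All matches are collected in a list and written after the output file is opened a single time, which avoids the one-row result that comes from storing every match in the same three-key dict.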