我已经努力了几个小时,无法解决问题,所以请帮助我!我有2个包含一些元组的列表。
list_1 = [[('This', 'DT'), ('is', 'VBZ'), ('an', 'DT'), ('apple', 'NN')], [('This', 'DT'), ('Should', 'MD'), ('be', 'VB'), ('taught', 'VBN'), ('at','IN'), ('school', 'NN')], [('you', 'PRP'), ('might', 'MD'), ('take', 'VB'), ('them', 'PRP')]]
list_2 = [[('This', 'DT'), ('is', 'VBN'), ('an', 'DT'), ('apple', 'NNS')], [('This', 'DT'), ('Should', 'VB'), ('be', 'VB'), ('taught', 'VBN'), ('at','IN'), ('school', 'NNP')], [('you', 'PRP'), ('might', 'MD'), ('take', 'VB'), ('them', 'PRP')]]
我想比较每个句子的每个元组list_1
和list_2
元素明智的,仅返回具有不同标签的元组。
def get_similarity():
for list_1_sentence, list_2_sentence in zip(list_1, list_2):
if len(list_1) != len(list_2):
try:
raise Exception('this is an exception, some tags are missing')
except Exception as error:
print('caught this error : ', repr(error))
else:
return [(x, y) for x, y in zip(list_1_sentence, list_2_sentence) if x != y]
get_similarity()
问题在于它返回不同的元组,但是仅对于第一句话:
[(('is', 'VBZ), ('is', 'VBN)), ('apple', 'NN'), ('apple', 'NNS'))]
为什么不遍历所有句子?
您可以尝试以下方法:
list_1 = [[('This', 'DT'), ('is', 'VBZ'), ('an', 'DT'), ('apple', 'NN')], [('This', 'DT'), ('Should', 'MD'), ('be', 'VB'), ('taught', 'VBN'), ('at','IN'), ('school', 'NN')], [('you', 'PRP'), ('might', 'MD'), ('take', 'VB'), ('them', 'PRP')]]
list_2 = [[('This', 'DT'), ('is', 'VBN'), ('an', 'DT'), ('apple', 'NNS')], [('This', 'DT'), ('Should', 'VB'), ('be', 'VB'), ('taught', 'VBN'), ('at','IN'), ('school', 'NNP')], [('you', 'PRP'), ('might', 'MD'), ('take', 'VB'), ('them', 'PRP')]]
final_tags = list(filter(lambda x:x, [[c for c, d in zip(a, b) if c[-1] != d[-1]] for a, b in zip(list_1, list_2)]))
输出:
[[('is', 'VBZ'), ('apple', 'NN')], [('Should', 'MD'), ('school', 'NN')]]
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句