I'm looking for a fast way to compare each value in a dict against all the other values, so I need to iterate through every value of the dict.
I understand there is going to be a lot of repetition in the value checking, so I tried to update the iterable (pop already-iterated keys) during iteration, but it seems I can't modify the iterable while iterating over it.
Here is the code I'm using:
# comparing value to all value2
duplicates = []
for key, value in image_dict_copy.items():
    for key2, value2 in image_dict_copy.items():
        if hamming_distance(value, value2) > .85:
            duplicates.append((key, key2))
    image_dict_copy.pop(key)  # doesn't work
    print(len(image_dict_copy))  # trying to shrink the size of the iterable
Any suggestions as to how to improve the speed? It's pretty slow at the moment.
There are various ways to do this, but essentially, you want to compare every possible pair of keys in your dict. The easiest way is to not reinvent the wheel and use itertools:
import itertools
for k1, k2 in itertools.combinations(image_dict_copy, 2):
    if hamming_distance(image_dict_copy[k1], image_dict_copy[k2]) > .85:
        duplicates.append((k1, k2))
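Now, the computational complexity will still be quadratic, but you'll be doing about half the number of actual comparisons: n(n-1)/2 unordered pairs instead of the n² ordered pairs the nested loops visit, and no key is compared with itself.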
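To make that concrete, here is a minimal, self-contained sketch. The `hamming_similarity` function is only a hypothetical stand-in for your `hamming_distance` (I'm assuming it returns a score between 0 and 1 where higher means more alike, since you test `> .85`), and the hash strings and file names are made up for illustration:

```python
import itertools

def hamming_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in: fraction of positions where two equal-length strings match."""
    if len(a) != len(b):
        raise ValueError("hashes must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def find_duplicates(hashes: dict, threshold: float = 0.85) -> list:
    """Return (key1, key2) for each unordered pair scoring above the threshold."""
    return [
        (k1, k2)
        for (k1, v1), (k2, v2) in itertools.combinations(hashes.items(), 2)
        if hamming_similarity(v1, v2) > threshold
    ]

# Made-up example data: 16-character bit strings keyed by file name.
image_hashes = {
    "a.png": "1111000011110000",
    "b.png": "1111000011110001",  # differs from a.png in 1 of 16 positions
    "c.png": "0000111100001111",  # differs from a.png in every position
}

duplicates = find_duplicates(image_hashes)

# Count how many pairs each strategy visits for the same dict.
n = len(image_hashes)
nested_loop_pairs = n * n
combination_pairs = len(list(itertools.combinations(image_hashes, 2)))
```

Iterating over `hashes.items()` inside `combinations` also saves the two dict lookups per pair, though that is a small win compared with halving the number of comparisons.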
itertools.combinations is handy because it takes any iterable. But if you had a sequence, e.g. a list, this is the basic way to iterate over every unique pair (by index):
>>> keys = list('abcde')
>>> for i in range(len(keys)):
...     for j in range(i + 1, len(keys)):
...         print(keys[i], keys[j])
...
a b
a c
a d
a e
b c
b d
b e
c d
c e
d e
You could use the above approach if you did something like:
keys = list(image_dict_copy)
But just stick with itertools.
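If you want to convince yourself that the index-based loop and itertools visit the same pairs, a rough sketch like this works. The `unique_pairs` helper and the sample dict contents are hypothetical, introduced only for the comparison:

```python
import itertools

def unique_pairs(seq):
    """Yield (seq[i], seq[j]) for every pair of indices with i < j."""
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            yield seq[i], seq[j]

# Hypothetical stand-in for the dict from the question.
image_dict_copy = {"a": 1, "b": 2, "c": 3, "d": 4}

keys = list(image_dict_copy)  # a dict is not indexable, so snapshot its keys
index_pairs = list(unique_pairs(keys))
itertools_pairs = list(itertools.combinations(keys, 2))
```

Both lists come out in the same order, because `combinations` emits pairs in the order of the input iterable.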
And just for fun, if you really wanted to use pop, iterate over a copy backwards and pop from the back:
>>> keys = list('abcde')
>>> keys_copy = keys[:-1]
>>> for k1 in reversed(keys):
...     for k2 in reversed(keys_copy):
...         print(k1, k2)
...     if keys_copy:
...         _ = keys_copy.pop()
...
e d
e c
e b
e a
d c
d b
d a
c b
c a
b a
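The same shrinking idea can be applied to the dict itself, as long as nothing is removed while a loop over it is in progress. This is only a sketch of one possible way to do it: `pairs_by_popping` is a hypothetical helper, and it relies on `dict.popitem()`, which removes the most recently inserted item in Python 3.7+:

```python
def pairs_by_popping(mapping: dict):
    """Yield each unordered pair of keys once, consuming a copy of the mapping."""
    remaining = dict(mapping)  # work on a copy so the caller's dict survives
    while remaining:
        k1, _ = remaining.popitem()  # shrink the dict between inner loops
        for k2 in remaining:  # safe: remaining is not modified inside this loop
            yield k1, k2

pairs = list(pairs_by_popping({"a": 1, "b": 2, "c": 3}))
```

The pop happens before each inner loop starts rather than during it, which is what avoids the "dictionary changed size during iteration" error. It still does the same n(n-1)/2 comparisons as itertools, so it is not faster, just closer to what the question was attempting.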