I want to create a function called jaccard that works something like this
def jaccard(doc1, doc2):
inter = len(np.intersect1d(doc1, doc2))
union = len(np.union1d(doc1, doc2))
jaccard = float(inter)/union
return jaccard
Except I only want it to take one parameter and have the other parameter hard coded into the function.
I want to write a function that will generate this function with the hard coded parameter because I need to use it with thousands of parameters.
def jaccard(doc2):
inter = len(np.intersect1d(['Work with us --- The Missing Slate Magazine'], doc2))
union = len(np.union1d(['Work with us --- The Missing Slate Magazine'], doc2))
jaccard = float(inter)/union
return jaccard
So I want to generate a function like this.
The reason I want this is because I want to apply it to a column of a Pandas DataFrame. The data frame contains a list of strings. I want to find the jaccard distance for each of them with the hard coded parameter in the function.
Thank you in advance!
You can create closures in python. For instance
def jacard(doc1):
def _jacard(doc2):
inter = len(np.intersect1d(doc1, doc2))
union = len(np.union1d(doc1, doc2))
return float(inter)/union
return _jacard
then:
prepared_func = jacard(doc1)
afterwards:
results = map(prepared_func, some_array_of_doc2s)
partial argument binding using functools in this case is a shorthand for creating a closure with argument doc2 bound.
prepared_func = functools.partial(jaccard, doc1)
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments