I want to print the 10 words before and after a matched word in a string.
For example, I have
string = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
In the above string, I want to search for the word "experience" and get output like
"Location -MumbaiJob Role -Min 7years hands-on experience in Natural Language"
I tried (\S+)\s+exp+, but it only returns the one word before the match.
Splitting the words on one or more whitespace characters is probably the best approach:
import re

s = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
words = re.split(r'\s+', s)
try:
    index = words.index('experience')
except ValueError:
    # 'experience' does not occur as a whole word
    pass
else:
    # Take up to 5 words on each side of the match
    start = max(index - 5, 0)
    end = min(index + 6, len(words))
    print(' '.join(words[start:end]))
Prints:
-MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine
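Since the question asks for 10 words on each side, the same splitting idea can be wrapped in a small helper with a configurable window. This is only a sketch: the name context_window and its parameters are made up for illustration, and comparing case-insensitively after stripping surrounding punctuation is an assumption about what should count as a match.

```python
import string

def context_window(text, target, n=10):
    """Return up to n words on each side of the first word equal to target."""
    words = text.split()  # split() with no argument splits on runs of whitespace
    for i, word in enumerate(words):
        # Assumption: ignore case and punctuation attached to the word,
        # so "Experience," still matches "experience"
        if word.strip(string.punctuation).lower() == target.lower():
            start = max(i - n, 0)
            return ' '.join(words[start:i + n + 1])
    return None  # target not found

s = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
print(context_window(s, 'experience', n=10))
```

Slicing past the end of a list is safe in Python, so only the start index needs clamping with max().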
But if you insist on using a regular expression, then this should print up to 5 words preceding and 5 words following "experience":
import re
s = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
m = re.search(r'([\w,;!.+-]+\s+){0,5}experience(\s+[\w,;!.+-]+){0,5}', s)
if m:
    print(m[0])
Prints:
-MumbaiJob Role -Min 7years hands-on experience in Natural Language Processing, Machine
Update to Handle "experience" or "Experience"
I have also simplified the regular expression:
import re
s = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on Experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
# By splitting on one or more whitespace characters:
words = re.split(r'\s+', s)
try:
    index = words.index('experience')
except ValueError:
    try:
        index = words.index('Experience')
    except ValueError:
        index = None
if index is not None:  # index 0 is a valid match, so test against None
    start = max(index - 5, 0)
    end = min(index + 6, len(words))
    print(' '.join(words[start:end]))
# Using a regular expression:
m = re.search(r'(\S+\s+){0,5}[eE]xperience(\s+\S+){0,5}', s)
if m:
    print(m[0])
Prints:
-MumbaiJob Role -Min 7years hands-on Experience in Natural Language Processing, Machine
-MumbaiJob Role -Min 7years hands-on Experience in Natural Language Processing, Machine
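The regular-expression version can also be generalized so that the window size is a parameter and any capitalization matches. The following is a rough sketch rather than a definitive solution: regex_context is a hypothetical helper name, and the \b word boundaries are an added assumption so that the target does not match inside a longer word such as "inexperienced".

```python
import re

def regex_context(text, target, n=5):
    """Return the first match of target with up to n words on each side."""
    # re.escape protects against regex metacharacters in target;
    # doubled braces in the f-string produce a literal {0,n} quantifier.
    pattern = rf'(?:\S+\s+){{0,{n}}}\b{re.escape(target)}\b(?:\s+\S+){{0,{n}}}'
    m = re.search(pattern, text, flags=re.IGNORECASE)
    return m[0] if m else None

s = "About the company -Our client is one of the world's fastest-growing AI-based contract management solution providers.Exp -7+ Years Location -MumbaiJob Role -Min 7years hands-on Experience in Natural Language Processing, Machine Learning, Artificial Intelligence, and IBM Watson"
print(regex_context(s, 'experience'))
```

Passing n=10 gives the 10-word window asked for in the question.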