I have a long text; this is part of it:
C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].
I need to find all substrings like these:
[03_SNYuLOOO IC "Story Group".]
[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B]
I tried using
re.findall(r'^\[\d{2}_[\s\S]+\]$', text)
but it returns an empty list. What am I doing wrong?
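The ^ and $ anchors require the pattern to match the whole string, and your text starts with C:, not [, so the pattern cannot match at all. Even without the anchors, [\s\S]+ matches any character one or more times, as many as possible, grabbing every [ and ] on its way to the end of the string, so the final ] in the pattern would match the rightmost ] in the string.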
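To make both failure modes concrete, here is a minimal sketch on a shortened, made-up string (the `text` value below is an illustrative assumption, not the original data). It shows the anchored pattern finding nothing and the unanchored greedy pattern swallowing both bracketed items in one match.

```python
import re

# Hypothetical short sample with two bracketed items.
text = 'a: [01_first] b: [02_second]'

# Anchored: the string does not start with '[', so nothing matches.
anchored = re.findall(r'^\[\d{2}_[\s\S]+\]$', text)

# Unanchored but greedy: [\s\S]+ runs on to the last ']' in the string.
greedy = re.findall(r'\[\d{2}_[\s\S]+\]', text)

print(anchored)  # []
print(greedy)    # ['[01_first] b: [02_second]']
```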
You can use the following regular expression:
r'\[\d{2}_[^]]+]'
Details
\[ - a literal [
\d{2} - two digits
_ - an underscore
[^]]+ - one or more characters other than ]
] - a literal ]
See the Python demo:
import re
s='''C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].'''
print(re.findall(r'\[\d{2}_[^]]+]', s))
# => ['[03_SNYuLOOO IC "Story Group".]', '[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, \nul. Krasnobogatyrskaya, 2, is built.\n2, floor 3. com. 11. Office B]']
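If you would rather stay close to the original attempt, a lazy quantifier is one possible alternative. The sketch below uses a shortened, assumed sample string and compares the two patterns. They agree on this input, but they are not identical in general: the lazy form can accept a ] as its first consumed character, which the negated class never does.

```python
import re

# Shortened, hypothetical sample in the same shape as the text above.
s = '''C: name: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: Moscow,
Office B].'''

# Negated character class: stops at the first ']' by construction.
negated = re.findall(r'\[\d{2}_[^]]+]', s)

# Lazy quantifier: expands only as far as needed to reach a ']'.
lazy = re.findall(r'\[\d{2}_[\s\S]+?\]', s)

print(negated == lazy)  # True
```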