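How to use alternation in a lookbehind

Sebastian Zeki

Aim:

In R, I want to match sentences that contain "no", but only if the "no" is not preceded by "with", "there is" or "there are".

Input: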

The ground was rocky with no cracks in it
No diggedy, no doubt
Understandably, there is no way an elephant can be green

Expected output:

The ground was rocky with no cracks in it
Understandably, there is no way an elephant can be green

Attempt:

gsub(".*(?:((?<!with )|(?<!there is )|(?<!there are ))\\bno\\b(?![?:A-Za-z])|([?:]\\s*N?![A-Za-z])).*\\R*", "", input_string, perl=TRUE, ignore.case=TRUE)

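Problem:

The negative lookbehinds seem to be ignored, so all of the sentences get replaced. Is the problem the use of alternation in the lookbehind statement?

Wiktor Stribiżew

You may use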

(?mxi)^       # Start of a line (and free-spacing/case insensitive modes are on)
(?:           # Outer container group start
  (?!.*\b(?:with|there\h(?:is|are))\h+no\b) # no 'with/there is/are no' before 'no'
  .*\bno\b  # 'no' whole word after 0+ chars
  (?![?:])    # cannot be followed with ? or :
|             # or
  .*          # any 0+ chars
  [?:]\h*n(?![a-z]) # ? or : followed with 0+ spaces, 'n' not followed with any letter
)             # container group end
.*            # the rest of the line and 
\R*           # 0+ line breaks

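See the regex demo. In short: the pattern has two alternatives that match two types of lines. The first is a line that contains no as a whole word, provided the line has no with, there is or there are followed by horizontal whitespace right before a no, and the matched no is not followed by ? or :. The second is a line that contains ? or : followed by 0+ horizontal whitespace characters (\h) and then an n that is not followed by another letter.

See the R demo: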

sentences <- "The ground was rocky with no cracks in it\r\nNo diggedy, no doubt\r\nUnderstandably, there is no way an elephant can be green"
rx <- "(?mxi)^ # Start of a line
(?:            # Outer container group start
  (?!.*\\b(?:with|there\\h(?:is|are))\\h+no\\b) # no 'with/there is/are no' before 'no'
  .*\\bno\\b   # 'no' whole word after 0+ chars
  (?![?:])     # cannot be followed with ? or :
|              # or
  .*           # any 0+ chars
  [?:]\\h*n(?![a-z]) # ? or : followed with 0+ spaces, 'n' not followed with any letter
)              # container group end
.*             # the rest of the line and 0+ line breaks
\\R*"
res <- gsub(rx, "", sentences, perl=TRUE)
cat(res, sep="\n")

Output:

The ground was rocky with no cracks in it
Understandably, there is no way an elephant can be green

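Thanks to the x modifier, you can add comments to the regex pattern and format it with whitespace for better readability. Note that in this mode every literal whitespace character in the pattern must be written as \\h (horizontal whitespace), \\s (any whitespace), \\n (LF), \\r (CR), etc., because unescaped whitespace is ignored.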
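A minimal sketch of that whitespace rule, using a made-up test string and variable names that are not part of the original demo: with (?x) on, a plain space in the pattern is dropped, so the space has to be spelled out explicitly.

```r
# Hypothetical test string, chosen only for illustration
input <- "with no cracks"

# Under (?x) the literal spaces are ignored, so this pattern is really "withno"
ignored_space <- grepl("(?x) with no", input, perl = TRUE)

# Writing the space as \\h+ keeps it part of the pattern
explicit_space <- grepl("(?x) with \\h+ no", input, perl = TRUE)

print(c(ignored_space, explicit_space))
# [1] FALSE  TRUE
```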

The (?i) modifier stands for ignore.case=TRUE.
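As for the question as asked, the alternation does appear to be the culprit. With (?<!a)|(?<!b)|(?<!c), only one alternative needs to succeed, and since no position can be preceded by "with ", "there is " and "there are " all at once, at least one always does. Placing the lookbehinds one after another requires all of them to hold. The sketch below isolates just that part of the attempt; the variable names are illustrative, and it checks each "no" on its own rather than rejecting a whole line the way the pattern above does.

```r
x <- c("The ground was rocky with no cracks in it",
       "No diggedy, no doubt",
       "Understandably, there is no way an elephant can be green")

# Alternated negative lookbehinds: one succeeding alternative is enough,
# so this behaves as if there were no lookbehind at all
alternated <- "(?i)(?:(?<!with )|(?<!there is )|(?<!there are ))\\bno\\b"

# Chained negative lookbehinds: every one of them must succeed
chained <- "(?i)(?<!with )(?<!there is )(?<!there are )\\bno\\b"

alternated_hits <- grepl(alternated, x, perl = TRUE)
chained_hits <- grepl(chained, x, perl = TRUE)

print(alternated_hits)
# [1] TRUE TRUE TRUE
print(chained_hits)
# [1] FALSE  TRUE FALSE
```

Only the second sentence has a "no" that passes all three chained checks, which matches the expected output of the question once that line is removed.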
