Regular expression for comma-delimited values

埃菲

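I am trying to split a column on a comma delimiter. The column holds multiple values, for example: 139,239,338,323. For some reason, the code below works for the first output column, but the remaining columns come back empty.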

SELECT  
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){0}([^,\/]*),\/?') as Word0,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){1}([^,\/]*),\/?') as Word1,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){2}([^,\/]*),\/?') as Word2,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){3}([^,\/]*),\/?') as Word3,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){4}([^,\/]*),\/?') as Word4,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){5}([^,\/]*),\/?') as Word5,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){6}([^,\/]*),\/?') as Word6,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){7}([^,\/]*),\/?') as Word7,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){8}([^,\/]*),\/?') as Word8,
Regexp_extract(StringToParse,r'^(?:[^,\/]*,\/){9}([^,\/]*),\/?') as Word9
FROM
(SELECT event_list AS StringToParse FROM `mytable.2017`)

Mikhail Berlyant

Try the following in BigQuery Standard SQL:

#standardSQL
SELECT 
  SPLIT(StringToParse)[SAFE_OFFSET (0)] AS Word0, 
  SPLIT(StringToParse)[SAFE_OFFSET (1)] AS Word1, 
  SPLIT(StringToParse)[SAFE_OFFSET (2)] AS Word2, 
  SPLIT(StringToParse)[SAFE_OFFSET (3)] AS Word3, 
  SPLIT(StringToParse)[SAFE_OFFSET (4)] AS Word4, 
  SPLIT(StringToParse)[SAFE_OFFSET (5)] AS Word5, 
  SPLIT(StringToParse)[SAFE_OFFSET (6)] AS Word6, 
  SPLIT(StringToParse)[SAFE_OFFSET (7)] AS Word7, 
  SPLIT(StringToParse)[SAFE_OFFSET (8)] AS Word8, 
  SPLIT(StringToParse)[SAFE_OFFSET (9)] AS Word9 
FROM 
  (SELECT event_list AS StringToParse FROM `mytable.2017`) 

You can test or play with the query above using the dummy data below:

#standardSQL
WITH `mytable.2017` AS (
  SELECT '139,239,338,323' AS event_list UNION ALL
  SELECT '123,456,789,135'
)
SELECT 
  SPLIT(StringToParse)[SAFE_OFFSET (0)] AS Word0, 
  SPLIT(StringToParse)[SAFE_OFFSET (1)] AS Word1, 
  SPLIT(StringToParse)[SAFE_OFFSET (2)] AS Word2, 
  SPLIT(StringToParse)[SAFE_OFFSET (3)] AS Word3, 
  SPLIT(StringToParse)[SAFE_OFFSET (4)] AS Word4, 
  SPLIT(StringToParse)[SAFE_OFFSET (5)] AS Word5, 
  SPLIT(StringToParse)[SAFE_OFFSET (6)] AS Word6, 
  SPLIT(StringToParse)[SAFE_OFFSET (7)] AS Word7, 
  SPLIT(StringToParse)[SAFE_OFFSET (8)] AS Word8, 
  SPLIT(StringToParse)[SAFE_OFFSET (9)] AS Word9 
FROM 
  (SELECT event_list AS StringToParse FROM `mytable.2017`)   
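
As for why the query in the question fills only Word0: the delimiter in that pattern is `,\/`, a comma followed by a slash, and the sample data contains no slashes, so every pattern that has to skip at least one value finds no match. The sketch below checks this against the sample value from the question. The `adjusted_word1` pattern is my own assumption about what was intended, not part of the original answer.

```sql
#standardSQL
-- Illustrative only: the literal is the sample value from the question.
SELECT
  -- {0}: nothing has to be skipped, so the group captures the first value
  REGEXP_EXTRACT('139,239,338,323', r'^(?:[^,\/]*,\/){0}([^,\/]*),\/?') AS original_word0,
  -- {1}: one "value,/" chunk must be skipped, but the data has no '/', so this is NULL
  REGEXP_EXTRACT('139,239,338,323', r'^(?:[^,\/]*,\/){1}([^,\/]*),\/?') AS original_word1,
  -- Assumed adjustment: make the '/' optional in the skipped chunk and drop the
  -- trailing comma requirement so the last value can match too
  REGEXP_EXTRACT('139,239,338,323', r'^(?:[^,\/]*,\/?){1}([^,\/]*)') AS adjusted_word1
```

Even with such an adjustment, SPLIT is the simpler choice when the delimiter is a plain comma.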

If, for some reason, you must use a regular expression in this query, try the following:

#standardSQL
SELECT  
  -- [^,\/]+ matches each run of characters between delimiters, so the last
  -- value, which has no trailing comma, is returned as well. Empty fields are skipped.
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(0)]  AS Word0,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(1)]  AS Word1,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(2)]  AS Word2,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(3)]  AS Word3,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(4)]  AS Word4,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(5)]  AS Word5,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(6)]  AS Word6,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(7)]  AS Word7,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(8)]  AS Word8,
  REGEXP_EXTRACT_ALL(StringToParse, r'[^,\/]+')[SAFE_OFFSET(9)]  AS Word9
FROM
  (SELECT event_list AS StringToParse FROM `mytable.2017`)  

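Of course, in all of the examples above you can simplify the code by introducing a subquery that calls SPLIT or REGEXP_EXTRACT_ALL once, and then picking each array element in the outer select.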
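
A minimal sketch of that simplification, reusing the dummy data from above. The column alias `parts` is a name chosen here for illustration; it does not appear in the original answer.

```sql
#standardSQL
WITH `mytable.2017` AS (
  SELECT '139,239,338,323' AS event_list UNION ALL
  SELECT '123,456,789,135'
)
SELECT
  -- SAFE_OFFSET returns NULL instead of raising an error when the index is out of range
  parts[SAFE_OFFSET(0)] AS Word0,
  parts[SAFE_OFFSET(1)] AS Word1,
  parts[SAFE_OFFSET(2)] AS Word2,
  parts[SAFE_OFFSET(3)] AS Word3,
  parts[SAFE_OFFSET(4)] AS Word4,
  parts[SAFE_OFFSET(5)] AS Word5,
  parts[SAFE_OFFSET(6)] AS Word6,
  parts[SAFE_OFFSET(7)] AS Word7,
  parts[SAFE_OFFSET(8)] AS Word8,
  parts[SAFE_OFFSET(9)] AS Word9
FROM (
  -- Split once per row; the comma is the default delimiter for STRING input
  SELECT SPLIT(event_list) AS parts
  FROM `mytable.2017`
)
```

The string is split once per row instead of ten times, and the delimiter is stated in a single place, so switching to REGEXP_EXTRACT_ALL only means changing the inner select.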
