如何在T-SQL中清理字符串并提取数字后缀

帕特里克·B。

我有一个包含名称的字符串,在大多数情况下,该字符串的末尾带有一个或两个数字。此数字后缀应与名字分开。一个数字代表一种状态,应提取出来。如果有两个数字,则是从右数第二个数字;如果有一个数字,则是从右数第一个数字。这些数字用下划线分隔。在名称中也可以使用下划线。结果应为一列,其名称为clearname并提取状态。

我试图用标准字符串函数(例如Substring,Charindex,Patindex,LEN和son on)解决问题。但是我的方法很快变得非常笨重,难以维护。我想知道是否有一种具有常规SQl-Server功能的优雅解决方案(如果可能的话,不为正则表达式安装其他功能)。

SELECT _data.myStr
    -- , ... AS clearname  /*String cleaned from number_postfixes*/
    -- , ... AS Status     /*second number from the right*/
FROM (
    SELECT 'tree_leafs_offer_2_1' AS myStr  --clearname: tree_leafs_offer; cut off: _2_1; extracted status: 2
        UNION
    SELECT 'tree_leafs_offer_2_10' AS myStr --clearname: tree_leafs_offer_2_10; cut off: _2_10; extracted status: 2
        UNION
    SELECT 'tree_leafs_offer_2_2' AS myStr  --clearname: tree_leafs_offer; cut off: _2_2; extracted status: 2
        UNION
    SELECT 'tree_leafs_offer_1150_1' AS myStr   --clearname: tree_leafs_offer; cut off: _1150_1; extracted status: 1150
        UNION
    SELECT 'tree_leafs_offer_1150_10' AS myStr  --clearname: tree_leafs_offer; cut off: _1150_10; extracted status: 1150
        UNION
    SELECT 'builder_bundle_less_xl_1' AS myStr  --clearname: builder_bundle_less_xl; cut off: _1; extracted status: 1
        UNION
    SELECT 'builder_bundle_less_xl_10' AS myStr --clearname: builder_bundle_less_xl; cut off: _10; extracted status: 10
        UNION
    SELECT 'static_components_wolves_10_4' AS myStr --clearname: static_components_wolves; cut off: _10_4; extracted status: 4
        UNION
    SELECT 'coke_0_boring_components_bundle_grant_1' AS myStr   --clearname: oke_0_boring_components_bundle_grant; cut off: _1; extracted status: 1
        UNION
    SELECT 'coke_0_soccer18_end_1_4h_101' AS myStr  --clearname: coke_0_soccer18_end_1_4h; cut off: _101; extracted status: 101
        UNION
    SELECT 'coke_0_late_downsell_bundle_high_114' AS myStr  --clearname: coke_0_late_downsell_bundle_high; cut off: _114; extracted status: 114
        UNION
    SELECT 'itembundle_mine_bundle_small' AS myStr  --clearname: itembundle_mine_bundle_small; cut off: <nothing>; extracted status: NULL
) AS _data
As-Is Result:
-----------------
myStr:
---------------------------------------
builder_bundle_less_xl_1
builder_bundle_less_xl_10
coke_0_boring_components_bundle_grant_1
coke_0_late_downsell_bundle_high_114
coke_0_soccer18_end_1_4h_101
itembundle_mine_bundle_small
static_components_wolves_10_4
tree_leafs_offer_1150_1
tree_leafs_offer_1150_10
tree_leafs_offer_2_1
tree_leafs_offer_2_10
tree_leafs_offer_2_2

To-Be Result (two new columns):
-------------------
clearname:                              |Status
----------------------------------------------
builder_bundle_less_xl                  |   1
builder_bundle_less_xl                  |  10
coke_0_boring_components_bundle_grant   |   1
coke_0_late_downsell_bundle_high        | 114
coke_0_soccer18_end_1_4h                | 101
itembundle_mine_bundle_small            |NULL
static_components_wolves                |  10
tree_leafs_offer                        |1150
tree_leafs_offer                        |1150
tree_leafs_offer                        |   2
tree_leafs_offer                        |   2
tree_leafs_offer                        |   2
Shnugo

老实说:这种格式太糟糕了!如果这不是一次性的操作,那么您真的应该必须处理之前尝试更改它。

但是-如果您必须坚持使用-您可以尝试以下方法:

编辑:解决了状态位置的错误计算...

DECLARE  @tbl TABLE(ID INT IDENTITY,myStr VARCHAR(1000));
INSERT INTO @tbl VALUES
 ('tree_leafs_offer_2_1')
,('tree_leafs_offer_2_10')
,('tree_leafs_offer_2_2')
,('tree_leafs_offer_1150_1')
,('tree_leafs_offer_1150_10')
,('builder_bundle_less_xl_1')
,('builder_bundle_less_xl_10')
,('static_components_wolves_10_4')
,('coke_0_boring_components_bundle_grant_1')
,('coke_0_soccer18_end_1_4h_101')
,('coke_0_late_downsell_bundle_high_114')
,('itembundle_mine_bundle_small');

查询

WITH cte AS
(
    SELECT t.ID
          ,t.myStr 
            ,A.[key] AS Position
            ,A.[value] AS WordFragment
            ,B.CastedToInt
    FROM @tbl t
    CROSS APPLY OPENJSON(N'["' + REPLACE(t.myStr,'_','","') + '"]') A
    CROSS APPLY(SELECT TRY_CAST(A.[value] AS INT)) B(CastedToInt)
) 
SELECT ID
      ,myStr
        ,STUFF(
        (SELECT CONCAT('_',cte2.WordFragment)
        FROM cte cte2
        WHERE cte2.ID=cte.ID
            AND cte2.Position<=A.PositionHighestNonInt
        ORDER BY cte2.Position
        FOR XML PATH('')
        ),1,1,'') AS ClearName
        ,(SELECT cte3.CastedToInt FROM cte cte3 WHERE cte3.ID=cte.ID AND cte3.Position=A.PositionHighestNonInt+1) AS [Status]
FROM cte
CROSS APPLY (
                 SELECT ISNULL(MAX(x.Position),1000) 
                 FROM cte x 
                 WHERE x.ID=cte.ID AND x.CastedToInt IS NULL
             ) A(PositionHighestNonInt)
GROUP BY ID,myStr,PositionHighestNonInt;

结果

+----+---------------------------------------+--------+
| ID | ClearName                             | Status |
+----+---------------------------------------+--------+
| 1  | tree_leafs_offer                      | 2      |
+----+---------------------------------------+--------+
| 2  | tree_leafs_offer                      | 2      |
+----+---------------------------------------+--------+
| 3  | tree_leafs_offer                      | 2      |
+----+---------------------------------------+--------+
| 4  | tree_leafs_offer                      | 1150   |
+----+---------------------------------------+--------+
| 5  | tree_leafs_offer                      | 1150   |
+----+---------------------------------------+--------+
| 6  | builder_bundle_less_xl                | 1      |
+----+---------------------------------------+--------+
| 7  | builder_bundle_less_xl                | 10     |
+----+---------------------------------------+--------+
| 8  | static_components_wolves              | 10     |
+----+---------------------------------------+--------+
| 9  | coke_0_boring_components_bundle_grant | 1      |
+----+---------------------------------------+--------+
| 10 | coke_0_soccer18_end_1_4h              | 101    |
+----+---------------------------------------+--------+
| 11 | coke_0_late_downsell_bundle_high      | 114    |
+----+---------------------------------------+--------+
| 12 | itembundle_mine_bundle_small          | NULL   |
+----+---------------------------------------+--------+

这个想法:

  • 模型表中提供数据
  • 使用技巧OPENJSON将字符串拆分,然后找到可以转换为的部分INT
  • 查找最高的非整数片段。Status会是未来指数
  • 随着v2017你可以使用STRING_AGG,但v2016,我们必须使用一个基于XML的伎俩来连接所有碎片之前[Status]

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章