Using rstrip() and lstrip() to remove the first and last underscore character "_" in a string leads to a missing "t" character in Python 3.7

H同志

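I have a series of .txt files, and I want to remove the prefix and suffix from their names to make them easier to read (and to use in further analysis).

Dummy names look like "Test_abcdef_000001.txt", "Test_abcdef_000002.txt", or "Test_abcdeft_000001.txt".

To remove the "Test_" and "_000001.txt" parts, I use rstrip() and lstrip() as follows: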

import os

# directory is the folder that holds the .txt files
for file in os.listdir(directory):
    if file.endswith(".txt"):
        if file.startswith("Test"):
            print('old name is: ' + file + '\n')
            file = file.lstrip('Test_')
            for i in range(20):
                if file.endswith(str(i).zfill(6) + '.txt'):
                    file_1 = file.rstrip('_' + str(i).zfill(6) + '.txt')
                    print('New name is: ' + file_1 + '\n')

The first for loop scans all files in the directory. The second for loop, over i, handles the _000001 or _000002 suffix of the test names.

So, for example, for the following 4 test names, I expect 4 "new" test names:

Test_abcdtt_000001.txt --> abcdtt
Test_abct_000001.txt --> abct
Test_defg_000001.txt --> defg
Test_tcty_000001.txt --> tcty

However, in actual testing, I get the following results:

Test_abcdtt_000001.txt --> abcd
Test_abct_000001.txt --> abc
Test_defg_000001.txt --> defg
Test_tcty_000001.txt --> cty

In other words, every "t" character next to a "_" is lost, which is not what I want. Any suggestions on this problem?
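
The lost characters can be reproduced without any files on disk. The following is a minimal sketch, relying on the documented behaviour that str.lstrip() and str.rstrip() treat their argument as a set of characters to strip, not as a literal prefix or suffix. The variable names are only illustrative.

```python
# lstrip/rstrip remove every leading/trailing character that appears in the
# argument, so 'Test_' means "any of T, e, s, t, _", not the text "Test_".
name = "Test_tcty_000001.txt"

without_prefix = name.lstrip("Test_")
print(without_prefix)  # the 't' right after 'Test_' is stripped as well

stem = without_prefix.rstrip("_000001.txt")
print(stem)  # 't' is also in the suffix set, so trailing 't's would go too
```

Because "t" belongs to both character sets, any run of "t" that touches the prefix or the suffix is removed along with it.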

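Thank you for your time and support.

FYI: I am using Python 3.7 on a company computer, so assume that I cannot upgrade it to 3.9 and/or import any fancy libraries. Also, some of my file names may contain "_" inside, e.g. Test_ab_ty_ui_000001.txt; for that, the final result should be ab_ty_ui.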

阮明孝

Maybe try using re to match the pattern you want.

import os
import re

prefix = "Test"
# regex to get everything between 'Test_' and '_{digits}.txt'
regex = rf"^{prefix}_(.*)_(\d+)\.txt$"

# this could also be replaced with glob.glob(f"{directory}/{prefix}*") to be more efficient
# (note that glob returns paths that include the directory)
for file_name in os.listdir(directory):
    match = re.match(regex, file_name)
    if match:
        print(match.group(1))
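
To check the idea against the names from the question without needing a directory, the pattern can be wrapped in a small function. This is only a sketch: strip_affixes and PATTERN are hypothetical names, the "Test" prefix is hard-coded, and the pattern follows the answer's approach with the dot escaped and the end of the name anchored. It uses only the standard library, so it should run on Python 3.7.

```python
import re

# Greedy (.*) keeps inner underscores; the last '_<digits>.txt' is the suffix.
PATTERN = re.compile(r"^Test_(.*)_(\d+)\.txt$")


def strip_affixes(file_name):
    """Return the part between 'Test_' and '_<digits>.txt', or None."""
    match = PATTERN.match(file_name)
    return match.group(1) if match else None


for name in ["Test_abcdtt_000001.txt", "Test_tcty_000001.txt",
             "Test_ab_ty_ui_000001.txt", "notes.txt"]:
    print(name, "-->", strip_affixes(name))
```

Names that do not fit the pattern give None here, so the caller can decide whether to skip them or report them.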


Edited on 2021-09-17
