How do I move strings that share a defined substring from one list into a new list?

MadDanWithABox

I am trying to create two lists such that a bracketed substring that appears in the second list never appears in the first.

Given a starting list of strings and an empty list:

word_list = ['{a==meliorate}>ed>','{a==meliorate}>s>','{a==meliorate}','{anew}','{annex}>ing>','{anvil}>ed>','{anvil}>ing>','{anvil}','<un<{ban}>ed>','<re<{write}']

new_list=[]

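I want to split word_list so that half of the words are popped into new_list. However, for any given bracketed {substring}, if it is found in word_list it should not be found in new_list, and vice versa.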

So we would end up with:

word_list = ['{anew}','{anvil}>ed>','{anvil}>ing>','<re<{apply}','<un<{ban}>ed>']

new_list=['{a==meliorate}>ed>','{a==meliorate}>s>','{a==meliorate}','<re<{write}','{annex}>ing>']

My attempt so far:

regex = re.compile('.*({[a-z]+}).*')
matches=[]

for element in word_list:
    m = re.search(regex, element)
    if m:
        root = m.group(1)
        matches.append(root)

while counter < len(word_list)/2:
    randroot = random.choice(matches) #select a random {root}
    indices = [i for i, e in enumerate(matches) if e == randroot] #get indices of all words with given root
    for index in indices: #for each index of root-aligned words, appends corresponding word 
        new_list = word_list.pop(index)

However, my output seems to be random, with strings containing the same bracketed element appearing in both lists. Any help is much appreciated!

Zhenhir

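Another answer has already covered that your regex will not match any string containing "=", and that your comparison will not produce the output you expect.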
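To make the regex point concrete, here is a small check of the question's pattern against the variant with "=" added to the character class. The names old_pattern and new_pattern are only illustrative.

```python
import re

# The pattern from the question: only lowercase letters are allowed inside the braces.
old_pattern = re.compile(r".*({[a-z]+}).*")
# The adjusted pattern: "=" is also allowed inside the braces.
new_pattern = re.compile(r".*({[a-z=]+}).*")

word = "{a==meliorate}>ed>"

old_match = old_pattern.search(word)   # no match, because of the "==" inside the braces
new_match = new_pattern.search(word)   # matches, capturing the whole bracketed root

print(old_match)
print(new_match.group(1))
```

With the original pattern, every "{a==meliorate}" word is silently skipped, so its root never ends up in matches.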

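Perhaps the biggest problem is that popping elements from a list changes its length, and therefore changes the index of every element after the popped one. That is why your output looks more random than expected. If you pop an early element and then try to pop the last one, you will also run into an IndexError.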
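The index shift described above can be reproduced with a tiny list. This is only an illustration with made-up values, not the data from the question.

```python
items = ["a", "b", "c", "d"]
indices = [0, 3]  # positions computed before anything is removed

removed = [items.pop(indices[0])]  # removes "a"; items is now ["b", "c", "d"]

hit_index_error = False
try:
    # Index 3 pointed at "d" originally, but the list now has only 3 elements.
    removed.append(items.pop(indices[1]))
except IndexError:
    hit_index_error = True

print(items, removed, hit_index_error)
```

Even when no exception is raised, a stale index points at a different element than the one it was computed for, which is why the wrong words get moved.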

I have adjusted the code so that it does not rely on indices. That is probably the best way to handle an iterable whose length keeps changing.

#!/usr/bin/env python3
import re
import random

word_list = ['{a==meliorate}>ed>','{a==meliorate}>s>','{a==meliorate}','{anew}','{annex}>ing>','{anvil}>ed>','{anvil}>ing>','{anvil}','<un<{ban}>ed>','<re<{write}']

new_list=[]

regex = re.compile(r".*({[a-z=]+}).*")
matches=[]

for element in word_list:
    m = re.search(regex, element)
    if m:
        root = m.group(1)
        matches.append(root)

target = len(word_list) / 2
while len(new_list) < target:
    randroot = random.choice(matches) # select a random {root}
    found_words = [w for w in word_list if randroot in w] # get all words with given root in them

    if len(found_words) > target - len(new_list):
        continue

    new_list.extend(found_words)
    word_list = [w for w in word_list if w not in new_list] # remove all the words we just added

print(word_list)
print(new_list)

Explanation of the changes: I only added "=" to your regex so that it captures "a==meliorate". I stored the target in a variable because the length of word_list will change.

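Now I simply check whether the matched root appears in each string of word_list rather than looking for an exact match. This is not a completely foolproof approach, but looking at your input data I think it is safe to use here.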
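If the substring test ever proves too loose, one alternative is to extract each word's root and compare roots for equality. The following is a minimal sketch, assuming every word contains exactly one {root}; root_of is a hypothetical helper and not part of the code above.

```python
import re

# Same character class as the adjusted regex, without the surrounding ".*".
ROOT = re.compile(r"\{[a-z=]+\}")

def root_of(word):
    """Return the bracketed root of a word, or None if it has none (hypothetical helper)."""
    m = ROOT.search(word)
    return m.group(0) if m else None

words = ['{anvil}>ed>', '{anvil}', '{anew}', '<re<{write}']

# Exact comparison of roots instead of a substring test.
same_root = [w for w in words if root_of(w) == '{anvil}']
print(same_root)
```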

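The if check helps ensure that the two lists end up the same length. For example, we will not add "a==meliorate", which appears 3 times, if only 2 slots remain before reaching the target. Note, however, that this results in an infinite loop if the list cannot be split evenly.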
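One way to sidestep the infinite loop is to group the words by root first and then walk the shuffled roots exactly once. This is a sketch of a different technique from the loop above, under the assumption that an exact half split is not required: new_list receives at most half of the words. The names split_by_root and seed are hypothetical.

```python
import random
import re
from collections import defaultdict

ROOT = re.compile(r"\{[a-z=]+\}")

def split_by_root(words, seed=None):
    """Split words into (kept, moved) so that no root appears in both lists."""
    groups = defaultdict(list)
    for w in words:
        m = ROOT.search(w)
        # Words without a bracketed root form their own group.
        groups[m.group(0) if m else w].append(w)

    roots = list(groups)
    random.Random(seed).shuffle(roots)  # random order, reproducible when seed is given

    target = len(words) // 2
    kept, moved = [], []
    for r in roots:  # a single bounded pass, so this always terminates
        if len(moved) + len(groups[r]) <= target:
            moved.extend(groups[r])
        else:
            kept.extend(groups[r])
    return kept, moved

word_list = ['{a==meliorate}>ed>', '{a==meliorate}>s>', '{a==meliorate}', '{anew}',
             '{annex}>ing>', '{anvil}>ed>', '{anvil}>ing>', '{anvil}',
             '<un<{ban}>ed>', '<re<{write}']
kept, moved = split_by_root(word_list, seed=0)
print(kept)
print(moved)
```

Because whole groups are moved together, a root can never end up in both lists. The trade-off is that moved may come out shorter than half when the group sizes do not add up.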

We add the found words to new_list with extend. Then we rebuild word_list, excluding any values found in new_list.

Result:

['{a==meliorate}>ed>', '{a==meliorate}>s>', '{a==meliorate}', '{anew}', '<un<{ban}>ed>']
['{annex}>ing>', '{anvil}>ed>', '{anvil}>ing>', '{anvil}', '<re<{write}']


Edited on 2021-01-15
