How to force Django models to be released from memory

Teddy Ward:

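I want to use a management command to run a one-time analysis of the buildings in Massachusetts. I have reduced the offending code to an 8-line snippet that demonstrates the issue I'm seeing. The comments only explain why I want to do this at all. I am running the code below verbatim, in an otherwise blank management command: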

zips = ZipCode.objects.filter(state='MA').order_by('id')
for zip in zips.iterator():
    buildings = Building.objects.filter(boundary__within=zip.boundary)
    important_buildings = []
    for building in buildings.iterator():
        # Some conditionals would go here
        important_buildings.append(building)
    # Several types of analysis would be done on important_buildings, here
    important_buildings = None

When I run this exact code, I find that memory usage increases steadily with each iteration of the outer loop (I use print('mem', process.memory_info().rss) to check memory usage).

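It seems that the important_buildings list is hogging memory even after it goes out of scope. If I replace important_buildings.append(building) with _ = building.pk, it no longer consumes much memory, but I do need the list for some of the analysis.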

So, my question is: how can I force Python to release a list of Django models when it goes out of scope?

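Edit: I feel there is a bit of a catch-22 on Stack Overflow: if I write too much detail, nobody wants to take the time to read it (and it becomes a less widely applicable question), but if I write too little detail, I risk leaving out part of the problem. Anyway, I really appreciate the answers, and I plan to try some of the suggestions this weekend, when I finally get a chance to get back to this!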

Laurent S:

You haven't given much information about how big your models are or what links exist between them, so here are a few ideas:

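By default, QuerySet.iterator() loads 2000 elements into memory at a time (assuming you are using django >= 2.0). If your Building model contains a lot of information, this could use a lot of memory. You could try lowering the chunk_size parameter.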
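
A minimal sketch of passing a smaller chunk_size. The chunk_size keyword is the one mentioned above for QuerySet.iterator(); the helper name collect_important, the predicate, the value 200, and the FakeQuerySet stand-in (there only so the snippet runs without a database) are assumptions for illustration.

```python
def collect_important(queryset, is_important, chunk_size=200):
    """Stream rows from `queryset` in small chunks, keeping only the matches."""
    kept = []
    for obj in queryset.iterator(chunk_size=chunk_size):
        if is_important(obj):
            kept.append(obj)
    return kept


class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet, for demonstration only."""

    def __init__(self, rows):
        self.rows = rows
        self.last_chunk_size = None

    def iterator(self, chunk_size=2000):
        # A real QuerySet would fetch `chunk_size` rows at a time from the database.
        self.last_chunk_size = chunk_size
        return iter(self.rows)


fake = FakeQuerySet(range(10))
evens = collect_important(fake, lambda n: n % 2 == 0, chunk_size=200)
print(evens, fake.last_chunk_size)
```

With the real models, the first argument would be something like Building.objects.filter(boundary__within=zip.boundary). A smaller chunk only shrinks the fetch buffer; every object appended to the list still stays in memory until the list is released.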

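Does your Building model have links between instances that could create reference cycles the gc cannot find? You could use the gc debug features to get more detail.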
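
As a rough illustration of the gc debug features, the sketch below uses gc.DEBUG_SAVEALL so that objects found in reference cycles are kept in gc.garbage for inspection instead of being freed. The Node class is a hypothetical toy object, not a Django model; with real models you would filter gc.garbage for Building instances instead.

```python
import gc


class Node:
    """Toy object standing in for a model instance that links to another one."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    a, b = Node("a"), Node("b")
    # Reference cycle: the reference counts never drop to zero on their own.
    a.other, b.other = b, a


gc.collect()                    # start from a clean state
gc.set_debug(gc.DEBUG_SAVEALL)  # save unreachable objects in gc.garbage
make_cycle()
gc.collect()
cyclic_nodes = [obj for obj in gc.garbage if isinstance(obj, Node)]
print("Node instances found in cycles:", len(cyclic_nodes))

gc.set_debug(0)                 # restore normal behaviour
gc.garbage.clear()              # drop the references held by the debug list
```

If the equivalent filter over your own run turns up model instances, something in the analysis is creating cycles that only the cyclic collector can reclaim.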

Or, to short-circuit the ideas above, maybe just call del(important_buildings) and del(buildings), followed by gc.collect(), at the end of each loop to force garbage collection?

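The scope of your variables is the function, not just the for loop, so breaking your code into smaller functions might help. Note, though, that the Python garbage collector does not always return memory to the OS, so, as explained in this answer, you may need more brutal measures to see the rss go down.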
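
One way to picture the function-scope and gc.collect() suggestions together is the sketch below. It assumes CPython's reference counting, and FakeBuilding, analyse_zip, and the sample ids are hypothetical stand-ins for the real models and analysis. Weak references are used only to check whether the objects were actually freed.

```python
import gc
import weakref


class FakeBuilding:
    """Hypothetical plain-Python stand-in for the Django model."""

    def __init__(self, pk):
        self.pk = pk


live_refs = []  # weak references only, so they do not keep objects alive


def analyse_zip(pks):
    """Build the list, analyse it, and return only a small summary."""
    important_buildings = [FakeBuilding(pk) for pk in pks]
    live_refs.extend(weakref.ref(b) for b in important_buildings)
    # The list is a local, so it is released when the function returns.
    return len(important_buildings)


totals = []
for pks in ([1, 2, 3], [4, 5]):
    totals.append(analyse_zip(pks))
    gc.collect()  # also reclaims any reference cycles created during the analysis

still_alive = sum(1 for ref in live_refs if ref() is not None)
print(totals, still_alive)
```

Even when the objects are freed like this, the rss reported by the process may not shrink, for the reason given above: freed memory is not always handed back to the OS.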

Hope this helps!

EDIT:

To help you work out which code is using memory and how much, you can use the tracemalloc module, for example with the suggested code:

import linecache
import os
import tracemalloc

def display_top(snapshot, key_type='lineno', limit=10):
    snapshot = snapshot.filter_traces((
        tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
        tracemalloc.Filter(False, "<unknown>"),
    ))
    top_stats = snapshot.statistics(key_type)

    print("Top %s lines" % limit)
    for index, stat in enumerate(top_stats[:limit], 1):
        frame = stat.traceback[0]
        # replace "/path/to/module/file.py" with "module/file.py"
        filename = os.sep.join(frame.filename.split(os.sep)[-2:])
        print("#%s: %s:%s: %.1f KiB"
              % (index, filename, frame.lineno, stat.size / 1024))
        line = linecache.getline(frame.filename, frame.lineno).strip()
        if line:
            print('    %s' % line)

    other = top_stats[limit:]
    if other:
        size = sum(stat.size for stat in other)
        print("%s other: %.1f KiB" % (len(other), size / 1024))
    total = sum(stat.size for stat in top_stats)
    print("Total allocated size: %.1f KiB" % (total / 1024))

tracemalloc.start()

# ... run your code ...

snapshot = tracemalloc.take_snapshot()
display_top(snapshot)
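
Since the symptom here is growth from one outer-loop iteration to the next, comparing two snapshots may be more telling than a single one. The sketch below uses Snapshot.compare_to from the same tracemalloc module; the retained list and the allocation sizes are made up purely to simulate a leak.

```python
import tracemalloc

tracemalloc.start()
retained = []  # hypothetical stand-in for whatever keeps growing

before = tracemalloc.take_snapshot()
for i in range(3):
    # Simulate an outer-loop iteration that leaves a large list behind.
    retained.append([0] * 100_000)
after = tracemalloc.take_snapshot()

# Largest change first; each entry is attributed to the line that allocated it.
growth = after.compare_to(before, "lineno")
for stat in growth[:3]:
    print(stat)
top_growth_bytes = growth[0].size_diff
tracemalloc.stop()
```

In the management command, the two snapshots would be taken at the start and end of an outer-loop iteration (or a few iterations apart), so the top entries point at the lines whose allocations are not being released.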


Edited on 2020-06-3
