Multiprocessing and deadlock issues in Python 3

Jason White

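I am having trouble with my multiprocessing program. I suspect the fix is fairly simple and that I am just not implementing multiprocessing correctly. I have been researching what could be causing the problem, but all I have really found is people recommending a queue to prevent it, and that does not seem to stop it (again, I may just be implementing the queue incorrectly). I have been at this for a few days and was hoping to get some help. Thanks in advance!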

import csv
import multiprocessing as mp
import os
import queue
import sys
import time

import connections
import packages
import profiles


def execute_extract(package, profiles, q):
    # This is the package execution for the extract
    # It fires fine and will print the starting message below
    started_at = time.monotonic()
    print(f"Starting {package.packageName}")
    try:
        oracle_connection = connections.getOracleConnection(profiles['oracle'], 1)
        engine = connections.getSQLConnection(profiles['system'], 1)
        path = os.path.join(os.getcwd(), 'csv_data', package.packageName + '.csv')
        cursor = oracle_connection.cursor()

        if os.path.exists(path):
            os.remove(path)

        f = open(path, 'w')
        chunksize = 100000
        offset = 0
        row_total = 0

        csv_writer = csv.writer(f, delimiter='^', lineterminator='\n')
        # I am having to do some data cleansing.  I know this is not the most efficient way to do this, but currently
        # it is what I am limited to
        while True:
            cursor.execute(package.query + f'\r\n OFFSET {offset} ROWS\r\n FETCH NEXT {chunksize} ROWS ONLY')
            test = cursor.fetchone()
            if test is None:
                break
            else:
                while True:
                    row = cursor.fetchone()
                    if row is None:
                        break
                    else:
                        new_row = list(row)
                        new_row.append(package.sourceId[0])
                        new_row.append('')
                        i = 0
                        for item in new_row:
                            if type(item) == float:
                                new_row[i] = int(item)
                            elif type(item) == str:
                                new_row[i] = item.encode('ascii', 'replace')
                            i += 1
                        row = tuple(new_row)
                        csv_writer.writerow(row)
                        row_total += 1

            offset += chunksize

        f.close()
        # I know that execution is at least reaching this point.  I can watch the CSV files grow as more and more
        # rows are added for all the packages.  What I never get are either the success message or error message
        # below, and there are never any entries placed in the tables
        query = f"BULK INSERT {profiles['system'].database.split('_')[0]}_{profiles['system'].database.split('_')[1]}_test_{profiles['system'].database.split('_')[2]}.{package.destTable} FROM \"{path}\" WITH (FIELDTERMINATOR='^', ROWTERMINATOR='\\n');"
        engine.cursor().execute(query)
        engine.commit()

        end_time = time.monotonic() - started_at
        print(
            f"{package.packageName} has completed.  Total rows inserted: {row_total}.  Total execution time: {end_time} seconds\n")
        os.remove(path)
    except Exception as e:

        print(f'An error has occurred for package {package.packageName}.\r\n {repr(e)}')

    finally:
        # Here is where I am trying to add an item to the queue so the get method in the main def will pick it up and
        # remove it from the queue 
        q.put(f'{package.packageName} has completed')
        if oracle_connection:
            oracle_connection.close()
        if engine:
            engine.cursor().close()
            engine.close()


if __name__ == '__main__':
    # Setting mp creation type
    ctx = mp.get_context('spawn')
    q = ctx.Queue()

    # For the ETL I generate a list of class objects that hold relevant information.  profs contains a list of
    # connection objects (credentials, connection strings, etc.).  packages contains the information to run the
    # extract (destination tables, query string, package name for logging, etc.)
    profs = profiles.get_conn_vars(sys.argv[1])
    packages = packages.get_etl_packages(profs)

    processes = []
    # I'm trying to track both individual package execution time and overall time so I can get an estimate on rows 
    # per second 
    start_time = time.monotonic()

    sqlConn = connections.getSQLConnection(profs['system'])
    # Here I'm executing a SQL command to truncate all my staging tables to ensure they are empty and will not 
    # generate any key violations 
    sqlConn.execute(
        f"USE[{profs['system'].database.split('_')[0]}_{profs['system'].database.split('_')[1]}_test_{profs['system'].database.split('_')[2]}]\r\nExec Sp_msforeachtable @command1='Truncate Table ?',@whereand='and Schema_Id=Schema_id(''my_schema'')'")

    # Here is where I start generating a process per package to try and get all packages to run simultaneously
    for package in packages:
        p = ctx.Process(target=execute_extract, args=(package, profs, q,))
        processes.append(p)
        p.start()

    # Here is my attempt at managing the queue.  This is a monstrosity of fixes I've tried to get this to work
    results = []
    while True:
        try:
            result = q.get(False, 0.01)
            results.append(result)
        except queue.Empty:
            pass
        allExited = True
        for t in processes:
            if t.exitcode is None:
                allExited = False
                break
        if allExited & q.empty():
            break

    for p in processes:
        p.join()

    # Closing out the end time and writing the overall execution time in minutes.
    end_time = time.monotonic() - start_time
    print(f'Total execution time of {end_time / 60} minutes.')

Booboo

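I am not sure why you are getting a deadlock (and I am not at all convinced it has to do with your queue management), but I can say for certain that you can simplify your queue management logic by doing one of two things: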

Method 1

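Make sure your worker function execute_extract puts something on the results queue even when an exception occurs (I would suggest putting the Exception object itself). Then the entire main-process loop that begins with while True: and tries to get the results can be replaced with: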

results = [q.get() for _ in range(len(processes))]

You are then guaranteed a fixed number of messages on the queue, equal to the number of processes created.
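As a minimal sketch of Method 1: the worker below is a hypothetical stand-in for execute_extract (run_job, collect_results and the fail flag are illustrative names, not part of the original program). It puts exactly one item on the queue, either a status string or the exception object, so the parent can block on one get() per process.

```python
import multiprocessing as mp


def run_job(name, fail, q):
    """Hypothetical stand-in for execute_extract: always reports exactly once."""
    try:
        if fail:
            raise ValueError(f"{name} failed")
        result = f"{name} has completed"
    except Exception as exc:
        # Report the exception object itself instead of a success message.
        result = exc
    q.put(result)


def collect_results(jobs, ctx):
    """Start one process per (name, fail) job and read exactly one result each."""
    q = ctx.Queue()
    processes = [
        ctx.Process(target=run_job, args=(name, fail, q)) for name, fail in jobs
    ]
    for p in processes:
        p.start()
    # One blocking get() per process; no polling or exit-code checks needed.
    results = [q.get() for _ in range(len(processes))]
    for p in processes:
        p.join()
    return results
```

From a script this would be called under the if __name__ == '__main__': guard, for example collect_results([("a", False), ("b", True)], mp.get_context("spawn")). Note that a worker that is killed before it reaches q.put() would leave the matching get() blocked, so this pattern assumes every worker gets as far as reporting.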

Method 2 (even simpler)

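Simply reverse the order in which you wait for the subprocesses to complete and process the results queue. You do not know how many messages will be on the queue, but you do not start processing the queue until all the processes have returned. So however many messages are on the queue at that point is all you are ever going to get. Just retrieve them until the queue is empty: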

for p in processes:
    p.join()
results = []
while not q.empty():
    results.append(q.get())
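A self-contained sketch of the same ordering follows, with a hypothetical worker (report) that only puts one short string on the queue. One caveat worth keeping in mind: the Python documentation warns that a child which has put items on a queue does not terminate until its buffered items have been flushed to the underlying pipe, so joining before draining is only safe when the messages are small, as they are here.

```python
import multiprocessing as mp


def report(name, q):
    """Hypothetical worker that puts one short message on the queue."""
    q.put(f"{name} has completed")


def join_then_drain(names, ctx):
    """Join every child first, then read whatever ended up on the queue."""
    q = ctx.Queue()
    processes = [ctx.Process(target=report, args=(name, q)) for name in names]
    for p in processes:
        p.start()
    # Wait for every child to exit ...
    for p in processes:
        p.join()
    # ... then take whatever is on the queue; nothing else can arrive now.
    results = []
    while not q.empty():
        results.append(q.get())
    return results
```

Because the messages arrive in completion order, callers should not rely on the order of the returned list.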

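At this point I would normally suggest using a multiprocessing pool class such as multiprocessing.Pool, which does not require an explicit queue to retrieve results. But make either of these changes (I suggest Method 2, since I cannot see how it can cause a deadlock when only the main process is running at that point) and see whether your problem goes away. I am not, however, guaranteeing that your problem is not somewhere else in your code. While your code is overly complicated and inefficient, it is not obviously "wrong." At least you will know whether your problem is elsewhere.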
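For comparison, here is a rough sketch of the pool approach. The extract function is a hypothetical stand-in for the per-package work and simply returns a status string; the pool carries return values back to the parent, so no queue appears in the code.

```python
import multiprocessing as mp


def extract(name):
    """Hypothetical stand-in for a per-package extract; returns a status string."""
    return f"{name} has completed"


def run_with_pool(names, ctx, workers=2):
    """Run extract() for each name in a worker pool and return the results."""
    # The pool passes return values back to the parent; no explicit queue.
    with ctx.Pool(processes=workers) as pool:
        # map() blocks until every task is done and preserves input order.
        return pool.map(extract, names)
```

With map(), an exception raised inside a worker is re-raised in the parent when the results are collected, so a failed task cannot go unnoticed the way a missing queue message can.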

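And a question of my own: what does using a context acquired with ctx = mp.get_context('spawn') buy you, compared with just calling the methods on the multiprocessing module itself? If your platform supports a fork call, which would be the default context, would you not want to use that?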
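To see what a given platform offers, something like the following may help. The helper names (describe_start_methods, pick_context) are illustrative only. Keep in mind that the default start method depends on the platform and the Python version, and that spawn starts a fresh interpreter and re-imports the main module, whereas fork copies the parent process.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return the start methods this platform supports and its default."""
    return {
        "available": mp.get_all_start_methods(),
        # Calling get_start_method() also fixes the default for this process.
        "default": mp.get_start_method(),
    }


def pick_context(preferred="fork"):
    """Return the preferred context if supported here, otherwise spawn."""
    method = preferred if preferred in mp.get_all_start_methods() else "spawn"
    return mp.get_context(method)
```

Falling back to spawn is a reasonable assumption because it is the one start method available on every platform.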

Edited on 2021-03-5
