Why is numba faster than numpy here?


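JohnE:

I can't figure out why numba is beating numpy here (by more than 3x). Did I make some fundamental error in how I am benchmarking? This seems like the perfect situation for numpy, no? Note that, as a check, I also ran a variation combining numba and numpy (not shown), which as expected ran at the same speed as numpy without numba.

(By the way, this is a follow-up question to: Fastest way to numerically process 2d-array: dataframe vs series vs array vs numba)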

import numpy as np
from numba import jit
nobs = 10000 

def proc_numpy(x,y,z):

   x = x*2 - ( y * 55 )      # these 4 lines represent use cases
   y = x + y*2               # where the processing time is mostly
   z = x + y + 99            # a function of, say, 50 to 200 lines
   z = z * ( z - .88 )       # of fairly simple numerical operations

   return z

@jit
def proc_numba(xx,yy,zz):
   for j in range(nobs):     # as pointed out by Llopis, this for loop 
      x, y = xx[j], yy[j]    # is not needed here.  it is here by 
                             # accident because in the original benchmarks 
      x = x*2 - ( y * 55 )   # I was doing data creation inside the function 
      y = x + y*2            # instead of passing it in as an array
      z = x + y + 99         # in any case, this redundant code seems to 
      z = z * ( z - .88 )    # have something to do with the code running
                             # faster.  without the redundant code, the 
      zz[j] = z              # numba and numpy functions are exactly the same.
   return zz

x = np.random.randn(nobs)
y = np.random.randn(nobs)
z = np.zeros(nobs)
res_numpy = proc_numpy(x,y,z)

z = np.zeros(nobs)
res_numba = proc_numba(x,y,z)

Results:

In [356]: np.all( res_numpy == res_numba )
Out[356]: True

In [357]: %timeit proc_numpy(x,y,z)
10000 loops, best of 3: 105 µs per loop

In [358]: %timeit proc_numba(x,y,z)
10000 loops, best of 3: 28.6 µs per loop

I ran this on a 2012 MacBook Air (13.3) with the standard Anaconda distribution. I can provide more detail on my setup if it's relevant.

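Nir Friedman:

I think this question highlights (somewhat) the limitations of calling out to precompiled functions from a higher-level language. Suppose in C++ you write something like: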

for (int i = 0; i != N; ++i) a[i] = b[i] + c[i] + 2 * d[i];

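The compiler sees the whole expression at compile time. It can do a lot of really clever things here, including optimizing away temporaries (and loop unrolling).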

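In Python, however, consider what happens: when you use numpy, each "+" invokes operator overloading on the np array types (which are just thin wrappers around contiguous blocks of memory, i.e. arrays in the low-level sense) and calls out to a precompiled function (written in C, in NumPy's case) that performs the addition very quickly. But it does just one addition, and spits out a temporary.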
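To make the point about temporaries concrete, here is a minimal sketch of what an expression like the C++ one above turns into under NumPy. The array names `b`, `c`, `d` mirror the C++ snippet; `t1`, `t2` and the array size are illustrative assumptions, not part of the original benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)
b, c, d = (rng.standard_normal(1000) for _ in range(3))

# `a = b + c + 2 * d` is evaluated as three separate precompiled calls.
# Each call makes one full pass over its inputs and allocates a new array.
t1 = np.add(b, c)          # temporary 1: b + c
t2 = np.multiply(2, d)     # temporary 2: 2 * d
a_steps = np.add(t1, t2)   # final result: a third full-size array

# The one-line expression performs the same three calls behind the scenes.
a_expr = b + c + 2 * d
```

None of the three calls knows about the others, so there is no opportunity to merge them into a single pass over the data.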

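So while numpy is great, convenient and quite fast, it also slows things down: although it looks as if it is calling into a fast compiled language for the hard work, the compiler never gets to see the whole program; it is only fed isolated little bits. That is hugely detrimental to a compiler, especially modern ones, which are very intelligent and can retire multiple instructions per cycle when the code is well written.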
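One way to see how much of the cost comes from temporaries is to rewrite the question's `proc_numpy` so that it reuses preallocated buffers through the `out=` argument of NumPy's ufuncs. This is only a sketch: `proc_numpy_inplace` and its `out`/`tmp` scratch buffers are hypothetical names introduced here, and the function still makes one pass over memory per operation, so it removes allocations but not the isolated-call problem.

```python
import numpy as np

def proc_numpy(x, y, z):
    # Same arithmetic as in the question; every operator allocates a new array.
    x = x*2 - (y * 55)
    y = x + y*2
    z = x + y + 99
    z = z * (z - .88)
    return z

def proc_numpy_inplace(x, y, out, tmp):
    # Hypothetical variant: `out` and `tmp` are caller-provided scratch arrays.
    np.multiply(y, 55, out=tmp)      # tmp = y*55
    np.multiply(x, 2, out=out)       # out = x*2
    np.subtract(out, tmp, out=out)   # out = x*2 - y*55        (new x)
    np.multiply(y, 2, out=tmp)       # tmp = y*2
    np.add(out, tmp, out=tmp)        # tmp = x + y*2            (new y)
    np.add(out, tmp, out=out)        # out = x + y
    np.add(out, 99, out=out)         # out = x + y + 99         (z)
    np.subtract(out, .88, out=tmp)   # tmp = z - .88
    np.multiply(out, tmp, out=out)   # out = z * (z - .88)
    return out

nobs = 10000
rng = np.random.default_rng(0)
x = rng.standard_normal(nobs)
y = rng.standard_normal(nobs)
res_plain = proc_numpy(x, y, np.zeros(nobs))
res_inplace = proc_numpy_inplace(x, y, np.empty(nobs), np.empty(nobs))
```

The inputs `x` and `y` are left unmodified; only the two scratch buffers are written to. Whether this is faster on a given machine has to be measured, since it trades readability for fewer allocations.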

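Numba, on the other hand, uses a JIT. So at runtime it can work out that the temporaries are not needed and optimize them away. Basically, Numba gets a chance to have the program compiled as a whole, whereas numpy can only call small atomic blocks that have themselves been precompiled.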
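The following sketch shows the kind of single-pass loop a JIT can compile as a whole. It follows the question's `proc_numba`, but takes the length from the array instead of a global. The name `proc_fused` and the `ImportError` fallback (which just runs the loop as plain, slow Python so the example works without numba) are assumptions added for illustration.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: without numba, run the loop uncompiled.
    def njit(func):
        return func

@njit
def proc_fused(xx, yy, out):
    # One pass over the data: x, y and z are scalars that live in registers,
    # so no full-size temporary arrays are created.
    for j in range(xx.shape[0]):
        x = xx[j]
        y = yy[j]
        x = x*2 - (y * 55)
        y = x + y*2
        z = x + y + 99
        out[j] = z * (z - .88)
    return out

nobs = 10000
rng = np.random.default_rng(0)
x = rng.standard_normal(nobs)
y = rng.standard_normal(nobs)

# Under numba the first call triggers compilation; later calls reuse it.
res_fused = proc_fused(x, y, np.empty(nobs))
```

When benchmarking a function like this, call it once before timing so that compilation time is not counted in the measurement.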


Edited on 2020-08-07
