Calling a C function that uses CUDA from a Delphi program

JoãoGabriel sf

My goal is to have Delphi (or Free Pascal) code that calls a C function func, like this:

C/CUDA file:

/* this is the "progcuda.cu" file */
#include <stdio.h>

__global__ void foo(int *a, int *b, int *c, int n){
    /*
    add all the vector's element
    */
}


void func(int *a, int *b, int *c,int n){
    int *da,*db,*dc;
    cudaMalloc(&da, n*sizeof(int));
    cudaMalloc(&db, n*sizeof(int));
    cudaMalloc(&dc, n*sizeof(int));

    cudaMemcpy(da,a,sizeof(int)*n,cudaMemcpyHostToDevice);
    cudaMemcpy(db,b,sizeof(int)*n,cudaMemcpyHostToDevice);
    cudaMemcpy(dc,c,sizeof(int)*n,cudaMemcpyHostToDevice);

    foo<<<1,256>>>(da,db,dc,n);  /* foo takes four parameters, so n must be passed too */
    cudaMemcpy(c,dc,sizeof(int),cudaMemcpyDeviceToHost);

    /* do other stuff and call another Host and Device functions*/

    return;
}

Pascal main file:

// this is the "progpas.pas" file
program progpas;
{$mode objfpc}{$H+}
uses unitpas;

var
    ...


begin
    ...
    func(a, b, c, len);
    ...
end.

Pascal unit file:

// this is the "unitpas.pas" file
unit unitpas;
{$link progcuda.o}
interface

uses ctypes;
// ctypes names the pointer type pcint32; Pascal separates parameter groups with ';'
procedure func(a, b, c : pcint32; n : cint32); cdecl; external;
procedure foo(a, b, c : pcint32; n : cint32); cdecl; external;

implementation

end.

I have already found the article "Programming CUDA using Delphi or FreePascal", but it mostly shows ways of programming CUDA in Delphi itself.

I don't want to program CUDA in Delphi. I want to write the CUDA code in plain C/C++ and only call that C function from Delphi.

What is the problem? How can I link the .cu code to the Delphi code?

I am using Linux Ubuntu 16.04 LTS, but I also have CUDA and VS on Windows if necessary.

Note: A detailed explanation of how to do this would help (I am new to Pascal and to linking files).

I have already tried generating the .o object file and linking it with Free Pascal, using
$ nvcc progcuda.cu -c -o progcuda.o, then $ fpc progpas.pas
but it fails at the linking step.

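Note: I once linked an ordinary .o generated from C code to Pascal code, using gcc and the Free Pascal compiler, and that worked. But if I use nvcc instead of gcc and rename the file extension to .cu (still the same code), linking fails.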

Note: My Stack Overflow account is new, so I cannot upvote answers yet.

gflegar

I don't know anything about Delphi or Free Pascal, but I do know CUDA, C and C++, so maybe my solution will work for you too.

I will demonstrate it with a simple problem:

Contents of f.cu:

int f() { return 42; }

Contents of main.c:

extern int f();

int main() {
    return f();
}

The following works:

$ gcc -c -xc f.cu # need -xc to tell gcc it's a C file
$ gcc main.c f.o
(no errors emitted)

Now, when we try to replace gcc with nvcc:

$ nvcc -c f.cu
$ gcc main.c f.o
/tmp/ccI3tBM1.o: In function `main':
main.c:(.text+0xa): undefined reference to `f'
f.o: In function `__cudaUnregisterBinaryUtil()':
tmpxft_0000704e_00000000-5_f.cudafe1.cpp:(.text+0x52): undefined reference to `__cudaUnregisterFatBinary'
f.o: In function `__nv_init_managed_rt_with_module(void**)':
tmpxft_0000704e_00000000-5_f.cudafe1.cpp:(.text+0x6d): undefined reference to `__cudaInitModule'
f.o: In function `__sti____cudaRegisterAll()':
tmpxft_0000704e_00000000-5_f.cudafe1.cpp:(.text+0xa9): undefined reference to `__cudaRegisterFatBinary'
collect2: error: ld returned 1 exit status

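The problem here is that nvcc, when compiling f.cu, adds references to some symbols from the CUDA runtime API, and those symbols have to be linked into the final executable. My CUDA installation is in /opt/cuda, so I will use that path, but you have to replace it with wherever CUDA is installed on your system. So, if we link libcudart.so when building, we get: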

$ nvcc -c f.cu
$ gcc main.c f.o -L/opt/cuda/lib64 -lcudart
/tmp/ccUeDZcb.o: In function `main':
main.c:(.text+0xa): undefined reference to `f'
collect2: error: ld returned 1 exit status

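That looks better, with no strange errors, but the function f still cannot be found. The reason is that nvcc treats f.cu as a C++ file, so it mangles names when creating the object file. We therefore have to specify that f should use C linkage rather than C++ linkage (see http://en.cppreference.com/w/cpp/language/language_linkage). To do that, we modify f.cu as follows: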

extern "C" int f() { return 42; }

Now, when we do this:

$ nvcc -c f.cu
$ gcc main.c f.o -L/opt/cuda/lib64 -lcudart
(no errors emitted)
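
The name mangling described above can be made visible by demangling the symbol that a C++ compiler emits. The following is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI (the usual case on Linux); the helper name `demangle` is hypothetical and not part of the original answer. It relies on `abi::__cxa_demangle` from `<cxxabi.h>`.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <cstring>    // std::strncmp
#include <string>

// Hypothetical helper: returns the readable form of a mangled C++ symbol.
// Names that do not start with "_Z" (for example plain C symbols) are
// returned unchanged, because __cxa_demangle would otherwise try to
// interpret them as encoded type names.
std::string demangle(const char *symbol) {
    if (std::strncmp(symbol, "_Z", 2) != 0) {
        return symbol;
    }
    int status = 0;
    char *readable = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return symbol;  // not a valid mangled name
    }
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}
```

When `int f()` is compiled as C++, the object file exports `_Z1fv`, which `demangle` maps back to `f()`. main.c asks the linker for plain `f`, so the two names only match once `extern "C"` is added.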

I hope you will be able to adapt this to your language.

Edit: I tried a slightly more complex example:

// f.cu
#include <stdio.h>

__global__ void kernel() {
    printf("Running kernel\n");
}

extern "C" void f() {
    kernel<<<1, 1>>>();
    // make sure the kernel completes before exiting
    cudaDeviceSynchronize();
}

// main.c
extern void f();

int main() {
    f();
    return 0;
}

When compiling, I get:

    f.o:(.data.DW.ref.__gxx_personality_v0[DW.ref.__gxx_personality_v0]+0x0): undefined reference to `__gxx_personality_v0'
collect2: error: ld returned 1 exit status

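The symbol __gxx_personality_v0 is defined in libstdc++, which gcc (unlike g++) does not link by default. To fix the error, you also need to add the standard C++ library to the linker flags: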

$ nvcc -c f.cu
$ gcc main.c f.o -L/opt/cuda/lib64 -lcudart -lstdc++
$ ./a.out
Running kernel
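
Applied to the question's `func`, the same ingredients are needed: C linkage for the exported function, plus the CUDA runtime and libstdc++ at link time. As a sketch of the linkage part only, the prototype below uses the common `#ifdef __cplusplus` guard so that one declaration can be read by both a C compiler and nvcc's C++ front end. The body is a plain CPU stand-in, and element-wise addition is an assumption about what the elided kernel computes; in progcuda.cu the body would instead hold the `cudaMalloc`, `cudaMemcpy` and kernel-launch code from the question.

```cpp
// Prototype with C linkage, readable by both C and C++ compilers.
#ifdef __cplusplus
extern "C" {
#endif

void func(int *a, int *b, int *c, int n);

#ifdef __cplusplus
}
#endif

// CPU stand-in so the sketch builds without nvcc (assumption: the kernel
// adds the vectors element by element). The definition inherits C linkage
// from the declaration above, so the exported symbol is plain "func".
void func(int *a, int *b, int *c, int n) {
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

On the Pascal side this matches the `cdecl; external` declaration of `func` in unitpas.pas. The `__global__` kernel `foo` is launched from inside `func`, so it would normally not be declared or called from the Pascal code at all.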

Edited on 2020-11-23
