torch.sub causes CUDA out of memory

Hoin

I have two tensors, a and b, with sizes a.shape=(10000,10000,120) and b.shape=(10000,10000,120).

I'm trying to get a cost matrix between a and b, cost = torch.sum((a-b)**2, -1), where cost.shape=(10000,10000).

The problem is that when I try a-b or torch.sub(a, b, alpha=1), a CUDA out-of-memory error occurs.

I don't think it should cost that much. It works when the size of the tensor is small, like 2000.

Iterating with a for loop is not efficient. How can I deal with it?

Vvvvvv

It does cost that much (about 134 GiB).

Let's do some calculations.

Assuming your data is of type torch.float32, a will occupy a memory size of:

4 bytes (32 bits) * 10000 * 10000 * 120 = 4.8E10 bytes ≈ 44.7 GiB

So will b. When you compute a-b, the result has the same shape as a and therefore occupies the same amount of memory, which means you need a total of 44.7 GiB * 3 (≈ 134 GiB) of memory for this operation.

Is your available memory greater than 134 GiB?
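The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: `tensor_gib` is a hypothetical helper name, and it assumes float32 data (4 bytes per element) as in the answer.

```python
def tensor_gib(shape, bytes_per_element=4):
    """Return the size in GiB of a dense tensor with the given shape."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 2**30


shape = (10000, 10000, 120)
one = tensor_gib(shape)  # assumes torch.float32, i.e. 4 bytes per element
print(f"one tensor:     {one:.1f} GiB")      # one tensor:     44.7 GiB
print(f"a, b and a - b: {3 * one:.1f} GiB")  # a, b and a - b: 134.1 GiB
```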

Possible solution:

If you no longer need a or b afterwards, you can store the result in one of them, which avoids allocating another 44.7 GiB, like this:

torch.sub(a, b, out=a)  # In this case, the result goes to `a`
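To get the full cost matrix without another full-size temporary, the squaring step would also need to run in place. The sketch below shows that, plus a chunked variant that is not part of the answer above: it processes a block of rows at a time so that only a small slice of the difference exists at once. The function names and `chunk_rows` are hypothetical, and both functions assume the tensors do not require gradients.

```python
import torch


def cost_inplace(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Compute sum((a - b)**2, -1), overwriting `a` along the way."""
    torch.sub(a, b, out=a)  # a <- a - b, no new full-size buffer
    a.pow_(2)               # a <- a**2, still in place
    return a.sum(dim=-1)    # result has shape a.shape[:-1]


def cost_chunked(a: torch.Tensor, b: torch.Tensor, chunk_rows: int = 256) -> torch.Tensor:
    """Same result, leaving `a` and `b` untouched; temporaries are chunk-sized."""
    cost = torch.empty(a.shape[:-1], dtype=a.dtype, device=a.device)
    for start in range(0, a.shape[0], chunk_rows):
        stop = start + chunk_rows
        diff = a[start:stop] - b[start:stop]  # only chunk_rows rows at a time
        cost[start:stop] = diff.pow_(2).sum(dim=-1)
    return cost


# Small demonstration on the CPU with toy shapes.
a = torch.randn(8, 6, 4)
b = torch.randn(8, 6, 4)
print(cost_chunked(a, b, chunk_rows=3).shape)  # torch.Size([8, 6])
```

Either way, a and b themselves still have to fit in memory (about 89.4 GiB together at the sizes in the question), so the chunked loop helps most when the inputs can be kept on the CPU or produced block by block.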
