'Pytorch RuntimeError: CUDA out of memory with a huge amount of free memory

While training the model, I encountered the following problem:

RuntimeError: CUDA out of memory. Tried to allocate 304.00 MiB (GPU 0; 8.00 GiB total capacity; 142.76 MiB already allocated; 6.32 GiB free; 158.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

As we can see, the error occurs when trying to allocate 304 MiB of memory, while 6.32 GiB is free! What is the problem? As I can see, the suggested option is to set max_split_size_mb to avoid fragmentation. Will it help and how to do it correctly?

This is my version of PyTorch:

torch==1.10.2+cu113

torchvision==0.11.3+cu113

torchaudio===0.10.2+cu113



Solution 1:[1]

Your problem may be due to fragmentation of your GPU memory.You may want to empty your cached memory used by caching allocator.

 import torch
 torch.cuda.empty_cache()

Solution 2:[2]

I was trying this command:

python3 val.py --weights ./weights/yolov5l-xs-1.pt --img 1996 --data ./data/VisDrone.yaml

and I have a 24G Titan video Card.

Then I reduced the image size and worked for me. to:

python3 val.py --weights ./weights/yolov5l-xs-1.pt --img 1280  --data ./data/VisDrone.yaml

Results:

Class     Images     Labels          P          R     [email protected] [email protected]:.95: 100%|????????????????????????????????| 18/18 [00:50<00:00,  2.79s/it]
                 all        548      38759      0.653      0.537      0.584      0.375
          pedestrian        548       8844       0.74      0.631      0.708      0.375
              people        548       5125      0.677      0.506      0.574      0.258
             bicycle        548       1287      0.541      0.377       0.41      0.213
                 car        548      14064      0.828      0.868      0.904      0.681
                 van        548       1975      0.636      0.566      0.601      0.453
               truck        548        750      0.595      0.516      0.538      0.388
            tricycle        548       1045      0.601      0.416      0.457      0.288
     awning-tricycle        548        532      0.387      0.242      0.245      0.173
                 bus        548        251      0.782      0.653      0.725      0.565
               motor        548       4886      0.744      0.598      0.674      0.355

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Erol Gelbul
Solution 2 Oscar Rangel