RuntimeError: CUDA out of memory with FastAPI

I am building a backend server with FastAPI.

The server works well in general, but I found one situation that produces an error.

I attach two services to the FastAPI app as routers from api.routes.

As the code below shows, both the detected_images and filtered_images routers are included in the same app.

This is my code:

import uvicorn
from fastapi import FastAPI

app = FastAPI()

from api.routes import detected_images
app.include_router(detected_images.router)

from api.routes import filtered_images
app.include_router(filtered_images.router)

if __name__ == '__main__':
    uvicorn.run(app, port=8000, host="127.0.0.1")

But if I run this code as it is, I get an error.

My guess is that GPU memory is temporarily exceeded because both services are loaded in the same process at the same time.

This is my error message:

RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 8.00 GiB total capacity; 6.23 GiB already allocated; 0 bytes free; 6.25 GiB reserved in total by PyTorch)
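The "allocated" and "reserved" figures in this message can also be read from PyTorch directly, which is one way to check the guess about memory. The sketch below is a hypothetical helper (gpu_memory_report is not part of the original code) that could be called after each include_router line. It assumes PyTorch is the library holding the GPU memory, and it returns None when PyTorch or CUDA is not available.

```python
def gpu_memory_report(device=0):
    """Return this process's GPU memory use in MiB, or None without CUDA."""
    try:
        import torch  # imported lazily so the helper also runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    return {
        # memory currently occupied by tensors
        "allocated_mib": torch.cuda.memory_allocated(device) / mib,
        # memory held by PyTorch's caching allocator (occupied + cached)
        "reserved_mib": torch.cuda.memory_reserved(device) / mib,
    }


# Example: call once after each route module has been imported.
print("GPU memory:", gpu_memory_report())
```

Comparing the report before and after each import shows how much memory each route module takes at import time. The numbers cover only the current process, not other processes using the same GPU.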

However, if I split this into two separate Python scripts and run them one after the other, no error occurs.

The code below is how I worked around the error.

This is my process1:

import uvicorn
from fastapi import FastAPI

app = FastAPI()

from api.routes import detected_images
app.include_router(detected_images.router)

if __name__ == '__main__':
    uvicorn.run(app, port=8000, host="127.0.0.1")

This is my process2:

import uvicorn
from fastapi import FastAPI

app = FastAPI()

from api.routes import filtered_images
app.include_router(filtered_images.router)

if __name__ == '__main__':
    uvicorn.run(app, port=8001, host="127.0.0.1")

I also tried the following approach, but it did not work.

import uvicorn
from fastapi import FastAPI

app = FastAPI()

from api.routes import detected_images
app.include_router(detected_images.router)

import torch, gc
gc.collect()
torch.cuda.empty_cache()

from api.routes import filtered_images
app.include_router(filtered_images.router)

if __name__ == '__main__':
    uvicorn.run(app, port=8000, host="127.0.0.1")
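A likely reason this attempt has no effect, assuming the detected_images module keeps its model in a module-level variable (the route modules are not shown, so this is a guess): gc.collect() only frees objects that nothing references any more, and torch.cuda.empty_cache() only releases cached memory that is not occupied. A model that is still referenced stays where it is. The sketch below illustrates the reference rule with a plain Python stand-in, so FakeModel and registry are hypothetical names, not PyTorch objects.

```python
import gc
import weakref


class FakeModel:
    """Stand-in for a model object held by a route module."""


# Plays the role of a module-level variable that holds the model.
registry = {"model": FakeModel()}
alive = weakref.ref(registry["model"])  # lets us observe whether it was freed

gc.collect()
still_referenced = alive() is not None  # the registry still holds the model

del registry["model"]  # drop the last reference
gc.collect()
freed_after_del = alive() is None

print("alive while referenced:", still_referenced)
print("freed after reference dropped:", freed_after_del)
```

In the real application the same rule would mean that the GPU memory can only be returned after every reference to the model has been removed, which is not possible while the router still needs it.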

Is there any good way to solve this problem?

Solution 1:^[1]

You can try:

if __name__ == "__main__":
    uvicorn.run("agent:app", host="127.0.0.1", port=5000, reload=False, log_level="info", workers=0)

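reload must be False and workers must be set to 0 (these correspond to the --reload and --workers command-line options). Presumably the point is to keep uvicorn from starting extra processes, each of which would import the app and load the models onto the GPU again.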
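A related pattern, offered as a sketch rather than as part of the original answer: if the route modules load their models at import time (an assumption, since they are not shown), every process that imports the app gets its own copy on the GPU. Deferring the load until first use keeps one copy per process and avoids loading anything in a process that never serves a request. build_detector, get_detector and LOAD_COUNT are hypothetical names, and the placeholder object stands in for a real model.

```python
from functools import lru_cache

# Counts how many times the (hypothetical) expensive load actually runs.
LOAD_COUNT = {"detector": 0}


def build_detector():
    """Placeholder for the real loading code, e.g. reading weights to the GPU."""
    LOAD_COUNT["detector"] += 1
    return object()


@lru_cache(maxsize=1)
def get_detector():
    """Load the model on first call, then return the cached instance."""
    return build_detector()


# Nothing is loaded at import time; the first request triggers the load.
first = get_detector()
second = get_detector()
print("same instance:", first is second, "| loads:", LOAD_COUNT["detector"])
```

A route handler would then call get_detector() inside the request function instead of using a model created when the module is imported.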

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	sirok
