Python multiprocessing with TensorRT
I am trying to use a TensorRT engine for inference inside a Python class that inherits from multiprocessing.Process. The engine works in a standalone Python script on my system, but now that I am integrating it into the codebase, the multiprocessing used in the class seems to be causing problems.
I am not getting any errors; execution simply skips everything after the line `self.runtime = trt.Runtime(self.trt_logger)`. My VS Code debugger does not step into the function either.
The docs mention the following, which I do not fully understand:
The TensorRT builder may only be used by one thread at a time. If you need to run multiple builds simultaneously, you will need to create multiple builders. The TensorRT runtime can be used by multiple threads simultaneously, so long as each object uses a different execution context.
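One way to picture that rule is with plain-Python stand-ins (no TensorRT needed; `FakeEngine` and `FakeContext` below are hypothetical classes used only for illustration): the runtime/engine object may be shared across threads, but each thread creates and uses its own execution context.

```python
# Plain-Python analogy for the threading rule: one shared "engine",
# but every thread creates its own private "execution context".
import threading


class FakeContext:
    """Stand-in for an execution context; owned by exactly one thread."""

    def execute(self, x):
        return x * 2


class FakeEngine:
    """Stand-in for a deserialized engine; safely shared by all threads."""

    def create_execution_context(self):
        return FakeContext()


engine = FakeEngine()  # shared, like self.engine
results = {}


def worker(name, value):
    # One context per thread, never shared between threads.
    context = engine.create_execution_context()
    results[name] = context.execute(value)


threads = [threading.Thread(target=worker, args=(f't{i}', i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # each thread produced its result through its own context
```

The same structure applies with real TensorRT objects: deserialize the engine once, then call its context-creation method separately in each thread.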
The following parts of my code are started, joined and terminated from another file:
```python
# more imports
import logging
import multiprocessing

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit


class MyClass(multiprocessing.Process):
    def __init__(self, messages):
        multiprocessing.Process.__init__(self)
        # other stuff
        self.exit = multiprocessing.Event()

    def load_tensorrt_model(self, config):
        '''Load tensorrt model with engine'''
        logging.debug('Start')

        # Reading the config parameters related to the engine
        engine_file = config['trt_engine']['trt_folder'] + os.path.sep + config['trt_engine']['engine_file']
        class_names_file = config['trt_engine']['trt_folder'] + os.path.sep + config['trt_engine']['class_names_file']

        # Verify if all the necessary files are present, if so load the detection network
        if os.path.exists(engine_file) and os.path.exists(class_names_file):
            try:
                logging.debug('In try statement')
                self.trt_logger = trt.Logger()
                f = open(engine_file, 'rb')
                logging.debug('I can get here, but no further')
                self.runtime = trt.Runtime(self.trt_logger)
                logging.debug('Cannot get here')
                self.engine = self.runtime.deserialize_cuda_engine(f.read())
                # More stuff
```
I have found someone with a similar multithreading problem, but so far I have been unable to use it to solve my problem.
Any help is appreciated.
System specs:
- Python 3.6.9
- Jetson NX
- JetPack 4.4.1
- L4T 32.4.4
- TensorRT 7.1.3.0-1
- CUDA 10.2
- Ubuntu 18.04
Solution 1:[1]
I had the same problem. It seems `pycuda.autoinit` does not work well in a multiprocessing scenario.
Try replacing
```python
import pycuda.autoinit
```
with
```python
cuda.init()
self.cuda_context = cuda.Device(0).make_context()
```
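The reason this helps: `import pycuda.autoinit` creates the CUDA context at import time in the *parent* process, while `run()` executes in the *child*, so the child ends up without a usable context. The sketch below demonstrates the structural point in pure Python (no CUDA required); the commented lines show where the `cuda.init()`/`make_context()` calls from this answer would go.

```python
# Minimal sketch (no CUDA required): __init__ runs in the parent process,
# run() executes in the child, which is why per-process setup such as
# cuda.init()/make_context() belongs inside run().
import multiprocessing
import os


class Worker(multiprocessing.Process):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue
        self.parent_pid = os.getpid()  # recorded while still in the parent

    def run(self):
        # This body executes in the child process. The CUDA setup from
        # Solution 1 would go here, e.g.:
        #   cuda.init()
        #   self.cuda_context = cuda.Device(0).make_context()
        #   ... load engine, run inference ...
        #   self.cuda_context.pop()  # release the context before exiting
        self.queue.put((self.parent_pid, os.getpid()))


if __name__ == '__main__':
    q = multiprocessing.Queue()
    w = Worker(q)
    w.start()
    parent_pid, child_pid = q.get()
    w.join()
    print(parent_pid != child_pid)  # True: run() executed in a different process
```

If you create a context this way, remember to pop it (as in the comment above) when the process shuts down, otherwise the device resources leak until the process exits.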
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | lauthu |