On Windows, CuPy raises an error in a multiprocessing Pool if the GPU is already in use, even after calling multiprocessing.set_start_method('spawn')
I use the Chainer framework to train my CNN. To speed things up, I use CuPy and the multiprocessing package. However, even after adding the multiprocessing.set_start_method('spawn') statement, I still get an error like this:
CUDARuntimeError.init
TypeError: an integer is required
My environment is:
Windows 10
Python 3.7
CUDA 9.2
Before asking this question, I referred to this link: Cupy get error in multithread.pool if GPU already used. The full program is too long, so I show only the part that raises the error:
import multiprocessing as mp

import numpy as np

# cnn_eval and arg_wrapper_mp are defined elsewhere in the program (not shown).


class CNNEvaluation(object):
    def __init__(self, gpu_num, epoch_num=50, batchsize=256, dataset='cifar10',
                 valid_data_ratio=0.1, verbose=True):
        self.gpu_num = gpu_num
        self.epoch_num = epoch_num
        self.batchsize = batchsize
        self.dataset = dataset
        self.valid_data_ratio = valid_data_ratio
        self.verbose = verbose

    def __call__(self, net_lists):
        ctx = mp.get_context('spawn')
        evaluations = np.zeros(len(net_lists))
        # Evaluate the networks in batches of at most gpu_num, one process per GPU.
        for i in np.arange(0, len(net_lists), self.gpu_num):
            process_num = np.min((i + self.gpu_num, len(net_lists))) - i
            pool = ctx.Pool(process_num)
            arg_data = [(cnn_eval, net_lists[i+j], j, self.epoch_num, self.batchsize, self.dataset,
                         self.valid_data_ratio, self.verbose) for j in range(process_num)]
            evaluations[i:i+process_num] = pool.map(arg_wrapper_mp, arg_data)
            pool.terminate()
        return evaluations
The output looks like this:
Exception in thread Thread-6:
Traceback (most recent call last):
  File "D:\AppInstall\Anaconda3\envs\py3_7\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "D:\AppInstall\Anaconda3\envs\py3_7\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "D:\AppInstall\Anaconda3\envs\py3_7\lib\multiprocessing\pool.py", line 470, in _handle_results
    task = get()
  File "D:\AppInstall\Anaconda3\envs\py3_7\lib\multiprocessing\connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "cupy\cuda\runtime.pyx", line 134, in cupy.cuda.runtime.CUDARuntimeError.__init__
TypeError: an integer is required
Then the program freezes and never exits. I don't understand why I still get this error even though I added the multiprocessing.set_start_method('spawn') statement. Is it because I am running the program on Windows instead of Linux?
Solution 1:[1]
This is because CuPy exceptions cannot be pickled, i.e. an exception raised in a child process cannot be propagated to its parent process. This issue is going to be fixed in https://github.com/cupy/cupy/pull/2318.
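To illustrate the mechanism, here is a minimal sketch that does not need CuPy or a GPU. StatusError is a hypothetical stand-in (not the real CuPy class) for an exception whose __init__ expects an integer but stores only a formatted message in args. Unpickling calls the class again with those saved args, which produces the same "an integer is required" TypeError as in the traceback above. The safe_call helper is also an assumption of mine, not part of CuPy or Chainer: it shows one possible workaround, re-raising any worker failure as a plain RuntimeError that carries the original traceback as text and can cross the process boundary.

```python
import pickle
import traceback


class StatusError(Exception):
    """Hypothetical stand-in for an exception whose __init__ needs an int."""

    def __init__(self, status):
        if not isinstance(status, int):
            raise TypeError("an integer is required")
        self.status = status
        # Only the formatted message ends up in self.args.
        super().__init__("error code %d" % status)


def rebuild(exc):
    """Mimic the unpickling step: call the class again with the saved args."""
    cls, args = exc.__reduce__()[:2]
    return cls(*args)


def safe_call(func, *args):
    """Run func; re-raise any failure as a plain, picklable RuntimeError."""
    try:
        return func(*args)
    except Exception as exc:
        # Keep the original traceback as text so the parent can still read it.
        message = "%s: %s\n%s" % (type(exc).__name__, exc, traceback.format_exc())
        raise RuntimeError(message) from None


def failing_task(status):
    """Hypothetical worker body that fails with the unpicklable exception."""
    raise StatusError(status)


if __name__ == "__main__":
    try:
        rebuild(StatusError(2))
    except TypeError as err:
        print("rebuilding the original exception failed:", err)

    try:
        safe_call(failing_task, 2)
    except RuntimeError as err:
        restored = pickle.loads(pickle.dumps(err))
        print("wrapped exception survives pickling:", type(restored).__name__)
```

In the question's setup, the same try/except could go around the body of the module-level function passed to pool.map, so that the parent receives a readable error. In the traceback above the TypeError is raised in the pool's result-handling thread while unpickling, which would explain why the program freezes instead of exiting.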
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | kmaehashi |