Python multiprocessing shared memory error on close
I am using Python's multiprocessing.shared_memory module.
When I try to close a SharedMemory object, I see the following error: BufferError: memoryview has 1 exported buffer
Can someone please let me know what this error means?
Here is how to reproduce it (I am using PyArrow):
import pyarrow as pa
import pandas as pd
import numpy as np
from multiprocessing import shared_memory

a = pd.DataFrame(np.random.normal(size=(1000, 1000)))
batch = pa.RecordBatch.from_pandas(a)

# Measure the serialized size with a mock sink
mock_sink = pa.MockOutputStream()
with pa.RecordBatchStreamWriter(mock_sink, batch.schema) as stream_writer:
    stream_writer.write_batch(batch)
data_size = mock_sink.size()
print(data_size)

# Write the batch into shared memory
shm_a = shared_memory.SharedMemory(create=True, size=data_size)
buf = pa.py_buffer(shm_a.buf)
stream = pa.FixedSizeBufferWriter(buf)
with pa.RecordBatchStreamWriter(stream, batch.schema) as stream_writer:
    stream_writer.write_batch(batch)
print(shm_a.name)
shm_a.close()  # raises BufferError: memoryview has 1 exported buffer
Solution 1:[1]
I encountered the same problem. After some digging, I found a clue: from the code above, you can see that once we create a pa.py_buffer object from the shared memory's buf, shm_a.buf can no longer be released. After we delete that py_buffer object, the shared memory can be closed successfully without throwing the BufferError exception.
So the solution I came up with is as follows: delete the py_buffer (and anything else holding a reference to the buffer) before calling close(). This may not be a perfect solution; if anybody comes up with a better one, please post it.
Also note that the BufferError can still arise when we read the shared memory from multiple processes. I am still working on that.
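The mechanism behind the error can be demonstrated with the standard library alone, without PyArrow: shm.buf is a memoryview, and close() refuses to release it while another object still exports its buffer. In the code above, pa.py_buffer(shm_a.buf) is such an exporter, so `del buf` (and `del stream`) before shm_a.close() resolves it. A minimal sketch of the same situation using a plain memoryview as the extra exporter:

```python
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=16)

# Create a second export of the underlying buffer, analogous to
# pa.py_buffer(shm_a.buf) in the PyArrow example above.
view = memoryview(shm.buf)

try:
    shm.close()  # fails while the extra export is alive
except BufferError as e:
    print(e)  # BufferError: memoryview has 1 exported buffer

# Drop the extra export first (equivalent to `del buf` / `del stream`
# in the PyArrow code), then closing succeeds.
view.release()
shm.close()
shm.unlink()
```

The same ordering applies to the question's code: make sure every object wrapping shm_a.buf is deleted or released before shm_a.close() is called.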
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow