Serialization without Boost.Serialization

I'm trying to implement a simple serialization/deserialization method for my code to be able to pass an object over the network using MPI. In an ideal world I would have used Boost.Serialization and Boost.MPI for that but they are not installed on some of the clusters I have access to so I'm considering doing this myself.

My strategy is to serialize every object into a std::stringstream object and then send a message via MPI_Send using MPI_CHAR as the datatype. In that case I would pass std::stringstream::str().c_str() as the pointer and std::stringstream::str().size()*sizeof(char) as the size of the message.
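A minimal sketch of the sending side described above, assuming a hypothetical Particle type with two trivially copyable members (the type and the serialize function are illustrative names, not from my actual code). It keeps the result of str() in a named std::string, because str() returns a copy and a pointer taken from a temporary is only valid until the end of that statement. The MPI call is shown only as a comment so the snippet compiles without MPI.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical example type; not from the original question.
struct Particle {
    int id;
    double mass;
};

// Write the members in binary form into a stream and return the bytes.
std::string serialize(const Particle& p) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&p.id), sizeof p.id);
    out.write(reinterpret_cast<const char*>(&p.mass), sizeof p.mass);
    assert(out.good());
    return out.str();  // str() copies the bytes out of the stream buffer
}

// Sending side (sketch, needs <mpi.h>; dest and tag are placeholders):
//   std::string buffer = serialize(p);  // named, so the pointer stays valid
//   MPI_Send(buffer.data(), static_cast<int>(buffer.size()), MPI_CHAR,
//            dest, tag, MPI_COMM_WORLD);
```

Writing raw bytes like this assumes the sender and receiver share the same architecture and compiler layout, which is usually the case within one cluster.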

I've figured out how to serialize everything into a std::stringstream object. My deserialization method also takes a std::stringstream object and deserializes everything back. This works fine, except that I do not know how to create a std::stringstream object from an array of chars while avoiding the extra copy from the array into the stream. Should I change my deserialization method to work directly with an array of char using memcpy instead?



Solution 1:[1]

The MPI way of doing this would be to use MPI_Pack and MPI_Unpack. Of course, that is C and might not be as convenient as something using C++ features. For a simple example see http://www.mcs.anl.gov/research/projects/mpi/tutorial/mpiexmpl/src/bcast/C/pack/solution.html

Solution 2:[2]

Use an istrstream, which extracts from a char array. The header for this is <strstream>. And, yes, formally it's deprecated in the C++ Standard. The committee indulged in a great deal of wishful thinking in its early days. istrstream is not going to go away.
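A minimal sketch of what that could look like, assuming the buffer holds one int followed by one double written in binary form (that layout and the function name read_with_istrstream are illustrative assumptions). Compilers typically warn about the deprecated header, and as far as I know the C++26 standard drops it, so check what your toolchain supports.

```cpp
#include <cassert>
#include <strstream>  // deprecated header; expect a compiler warning

// Read an int and a double straight out of a received char buffer.
// istrstream extracts from the array in place instead of copying it.
bool read_with_istrstream(const char* data, std::streamsize size,
                          int& id, double& mass) {
    // A size of 0 would make istrstream treat data as a
    // null-terminated string, so insist on a positive length.
    assert(data != nullptr && size > 0);
    std::istrstream in(data, size);
    in.read(reinterpret_cast<char*>(&id), sizeof id);
    in.read(reinterpret_cast<char*>(&mass), sizeof mass);
    return static_cast<bool>(in);  // false if the buffer was too short
}
```

Since istrstream is an ordinary std::istream, a deserializer written against std::istream& rather than std::stringstream& would accept it unchanged.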

Solution 3:[3]

Although the accepted answer is the right way to go, the concrete question "except I do not know how to create a std::stringstream object from an array of chars" was not answered yet.

The answer is that effectively you can't.

Presumably you already have an object in memory and want to view it as a stringstream. Even if you could make that work, the stringstream would not be able to grow further in the existing memory (growing is a useful property of the stream buffer).
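What you can do is give up on std::stringstream specifically and wrap the existing memory in a read-only stream buffer. The following is a minimal sketch under that assumption; MemoryBuf and read_in_place are hypothetical names, and the payload layout (an int followed by a double, in binary form) is only for illustration. It shows the limitation mentioned above: only a get area is installed, so nothing can be written and the buffer cannot grow.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>

// Read-only stream buffer over existing memory (hypothetical helper).
class MemoryBuf : public std::streambuf {
public:
    MemoryBuf(const char* data, std::size_t size) {
        assert(data != nullptr);
        // setg takes non-const pointers. No put area is installed,
        // so the stream never writes through them.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }
};

// Read an int and a double from the array without copying it first.
bool read_in_place(const char* data, std::size_t size,
                   int& id, double& mass) {
    MemoryBuf buf(data, size);
    std::istream in(&buf);  // plain istream, not a stringstream
    in.read(reinterpret_cast<char*>(&id), sizeof id);
    in.read(reinterpret_cast<char*>(&mass), sizeof mass);
    return static_cast<bool>(in);  // false if the array was too short
}
```

This only helps if the deserialization code accepts a std::istream& instead of a std::stringstream&.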

Also, there is not much point in making heroic attempts to avoid copying data that is going to be sent in an MPI message. MPI might end up copying the data internally anyway, for example to "pinned" memory, depending on the exact method of communication (buffered, unbuffered, asynchronous, etc.). That is my understanding.
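If the copy is acceptable, the existing stream-based deserializer can stay as it is. A minimal sketch, assuming the received bytes hold an int followed by a double in binary form (the layout and the name read_with_copy are illustrative assumptions):

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Copy the received bytes into a string stream and read from that.
bool read_with_copy(const char* data, std::size_t size,
                    int& id, double& mass) {
    assert(data != nullptr);
    std::string bytes(data, size);   // copies the received array
    std::istringstream in(bytes);    // the stream keeps its own copy
    in.read(reinterpret_cast<char*>(&id), sizeof id);
    in.read(reinterpret_cast<char*>(&mass), sizeof mass);
    return static_cast<bool>(in);    // false if the array was too short
}
```

The std::string(data, size) constructor takes an explicit length, so embedded zero bytes in binary data are preserved.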

If the data is too large to be copied at all (e.g. many gigabytes), then it is better to use MPI_Datatypes, which can be a real challenge to handle in a generic way.

BTW, the accepted answer is what my MPI library (https://gitlab.com/correaa/boost-mpi3) does internally; it is probably what the original Boost.MPI does too.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Zulan
Solution 2 Pete Becker
Solution 3