'Wrapper for std::vector that makes Structure of Arrays look like Array of Structures
I have a vector of a struct with 8 different fields (integers and pointers) that serves as a database for my program. Usually only a few of these fields are actually used (often just one). It was fine originally, but now it's running out of memory when storing billions of elements. I want to store this data sparsely without having zero/null entries for the unused fields in each object. However, this is used all over the place in the codebase and is difficult to change.
I decided to store the individual fields as separate vectors and create a class that wraps these vectors, making a SoA appear as an AoS to the callers. The set of fields that are used is known at runtime during creation of the database. It needs to have a large number of std::vector member functions. The best I was able to come up with was some macros and lots of copy-paste lines of code to handle the individual field vectors:
#define SELECT_FIELD_VECT(FUNC) (use_uv() ? uv.FUNC : (use_dv() ? dv.FUNC : (use_sv() ? sv.FUNC : rv.FUNC)))
#define APPLY_FIELD_VECT(FUNC) { if(use_uv()) {uv.FUNC;} if(use_dv()) {dv.FUNC;} if(use_sv()) {sv.FUNC;} if(use_rv()) {rv.FUNC;} }
class md_tracker_t {
vector< match_track_data_uints_t > uv;
vector< delta_pair const * > dv;
vector< std::string const * > sv, rv;
public:
bool empty( void ) const { return SELECT_FIELD_VECT(empty()); }
size_t size( void ) const { return SELECT_FIELD_VECT(size ()); }
size_t capacity( void ) const { return SELECT_FIELD_VECT(capacity()); }
void clear( void ) { uv.clear(); dv.clear(); sv.clear(); rv.clear(); }
void shrink_to_fit( void ) { APPLY_FIELD_VECT(shrink_to_fit()); }
void reserve( size_t const sz ) { APPLY_FIELD_VECT(reserve(sz)); }
void resize ( size_t const sz ) { APPLY_FIELD_VECT(resize(sz)); }
void swap( md_tracker_t &mt ) { uv.swap( mt.uv ); dv.swap( mt.dv ); sv.swap( mt.sv ); rv.swap( mt.rv ); }
void push_back( match_track_data_t const &md ) {
if( use_uv() ) { uv.push_back( md.uints ); }
if( use_dv() ) { dv.push_back( md.deltas ); }
if( use_sv() ) { sv.push_back( md.signature ); }
if( use_rv() ) { rv.push_back( md.rulename ); }
}
void copy_from( size_t const from_ix, size_t const to_ix ) {
if( use_uv() ) { uv[to_ix] = uv[from_ix]; }
if( use_dv() ) { dv[to_ix] = dv[from_ix]; }
if( use_sv() ) { sv[to_ix] = sv[from_ix]; }
if( use_rv() ) { rv[to_ix] = rv[from_ix]; }
}
void add_from( md_tracker_t const &mt, size_t const ix ) {
if( use_uv() ) { uv.push_back( mt.uv[ix] ); }
if( use_dv() ) { dv.push_back( mt.dv[ix] ); }
if( use_sv() ) { sv.push_back( mt.sv[ix] ); }
if( use_rv() ) { rv.push_back( mt.rv[ix] ); }
}
match_track_data_t get_mtd( size_t const ix ) const {
assert( ix < size() );
return match_track_data_t( ( use_uv() ? uv[ix] : match_track_data_uints_t() ),
( use_dv() ? dv[ix] : 0 ),
( use_sv() ? sv[ix] : 0 ),
( use_rv() ? rv[ix] : 0 ) );
}
...
};
This works, but it's messy. It also only uses 4 of the 8 fields. I would like to add more fields later without having to change dozens of lines of code for each field. Is there a more compact/clean way to do this? Some magic with macros, templates, C++11, etc? Thank you.
Solution 1:[1]
Well, the little enable bits you have (use_uv()
, etc) are interesting and I'm pretty sure there's not a generic version of a similar function, so I gave it a go.
To make the data generic, you have to replace the "struct" part and those lovely field names with the indices of an std::tuple
, the generic struct. You can make up for it by extending std::tuple
to add accessors
struct match_track_data_t : std::tuple<match_track_data_uints_t, delta_pair, std::string, std::string> {
using std::tuple<match_track_data_uints_t, delta_pair, std::string, std::string>::tuple;
match_track_data_uints_t& uv() { return std::get<0>(*this); }
match_track_data_uints_t& uv() const { return std::get<0>(*this); }
delta_pair& dv() { return std::get<1>(*this); }
delta_pair& dv() const { return std::get<1>(*this); }
/* etc... */
};
For the vector, the data because a tuple of vectors. The use_*
flags become an array of predicates:
template <typename... Ts>
class md_tracker_t {
std::tuple<std::vector<Ts>...> data;
std::array<bool(*)(), sizeof...(Ts)> use_ix;
public:
md_tracker_t(std::array<bool(*)(), sizeof...(Ts)> use_ix) : use_ix(use_ix) { }
Now, I tried to classify each method in your class as some generic action:
select
- does something with the first enabled vector and returns the valueapply
- does something to each enabled vector (or all vectors)apply_join
- zips the tuple with some other tuple, and does something to each enabled pairget_mtd
- this function I couldn't mash into the other types
The apply*
functions should have a version that applies it only to enabled arrays, or all arrays.
If you map the entire public interface to those generic functions:
bool empty() const { return select([](auto& v) { return v.empty(); }); }
bool size() const { return select([](auto& v) { return v.size(); }); }
bool capacity() const { return select([](auto& v) { return v.capacity(); }); }
void clear() { apply<false>([](auto& v) { v.clear(); }); }
void shrink_to_fit() { apply([](auto& v) { v.shrink_to_fit(); }); }
void reserve( size_t const sz ) { return apply([&sz](auto& v) { v.reserve(sz); }); }
void resize ( size_t const sz ) { return apply([&sz](auto& v) { v.resize(sz); }); }
void swap( md_tracker_t &mt ) { apply_join<false>(mt, [](auto& v, auto& w) { v.swap(w); }); }
void push_back( std::tuple<Ts...> const &md ) { apply_join(md, [](auto& v, auto& e) { v.push_back(e); }); }
void copy_from( size_t const from_ix, size_t const to_ix ) { apply([&from_ix, &to_ix](auto& v) { v[to_ix] = v[from_ix]; }); }
void add_from( md_tracker_t const &mt, size_t const ix ) { apply_join(mt, [&ix](auto& v, auto& w) { v.push_back(w[ix]); }); }
std::tuple<Ts...> get_mtd( size_t const ix ) const { return get_mtd_impl(ix, std::index_sequence_for<Ts...>{}); }
All you have to do is implement them! (I'm skipping a lot of boilerplate here by using C++14 generic lambdas).
To implement select
, you can't just write a loop because std::tuple
can only be indexed with a compile time argument. So you have to go recursive instead: start at zero, and if [0] is enabled, apply the lambda to that vector OR repeat for the next index:
template <typename Functor, size_t Index = 0, typename std::enable_if<Index != sizeof...(Ts)>::type* = nullptr>
decltype(auto) select(Functor&& functor) const {
return use_ix[Index]()
? functor(std::get<Index>(data))
: select<Functor, Index + 1>(std::forward<Functor>(functor));
}
template <typename Functor, size_t Index, typename std::enable_if<Index == sizeof...(Ts)>::type* = nullptr>
decltype(auto) select(Functor&& functor) const { return decltype(functor(std::get<0>(data))){}; }
apply
is a little simpler, because you can just expand the entire tuple with std::index_sequence
and std::get
into a throwaway boolean array. The comma operator (always returns the right hand side) is kind of a cheat code to turn a void function into an expression:
template <bool conditional = true, typename Functor, size_t... Is>
void apply(Functor&& functor, std::index_sequence<Is...>) {
std::initializer_list<bool> { (!conditional || use_ix[Is]() ? (functor(std::get<Is>(data)), false) : false)... };
}
template <bool conditional, typename Functor>
void apply(Functor&& functor) {
return apply(std::forward<Functor>(functor), std::index_sequence_for<Ts...>{});
}
The boolean conditional
template argument is basically an override. If false
, it will apply the lambda no matter what the predicate returns.
A lot of functions, like push_back
and swap
take either a slice or another SoA and cross-join it. For those, we have apply_join
, which is almost the same as apply
except it handles the extra argument:
template <bool conditional = true, typename Functor, typename Arg, size_t... Is>
void apply_join(Functor&& functor, Arg& arg, std::index_sequence<Is...>) {
std::initializer_list<bool> { (!conditional || use_ix[Is]() ? (functor(std::get<Is>(data), std::get<Is>(arg)), false) : false)... };
}
template <bool conditional = true, typename Functor, typename Arg>
void apply_join(Functor&& functor, Arg& arg) {
return apply_join(std::forward<Functor>(functor), arg, std::index_sequence_for<Ts...>{});
}
Finally, get_mtd
just expands the tuple, applies the index operator to each one, then passes it to an std::tuple
:
template <size_t... Is>
std::tuple<Ts...> get_mtd_impl( size_t const ix, std::index_sequence<Is...>) const {
assert(ix < sizeof...(Ts));
return std::tuple<Ts...>(std::get<Is>(data)[ix]...);
}
And that's it!
};
Possibly more code than if you just did it by hand, but you can add fields all day without unnecessary boilerplate.
Usage:
using md_tracker = md_tracker_t<match_track_data_uints_t, delta_pair, std::string, std::string>;
Solution 2:[2]
Intel have created a template library for this exact purpose. See SDLT (SIMD Data Layout Templates)
SIMD Data Layout Templates (SDLT) is a C++11 template library providing containers that represent arrays of "Plain Old Data" objects (a struct whose data members do not have any pointers/references and no virtual functions) using layouts that enable generation of efficient SIMD (single instruction multiple data) vector code.
SDLT uses standard ISO C++11 code. It does not require a special language or compiler to be functional, but takes advantage of performance features (such as OpenMP* SIMD extensions and pragma ivdep ) that may not be available to all compilers. It is designed to promote scalable SIMD vector programming. To use the library, specify SIMD loops and data layouts using explicit vector programming model and SDLT containers, and let the compiler generate efficient SIMD code in an efficient manner. Many of the library interfaces employ generic programming, in which interfaces are defined by requirements on types and not specific types. The C++ Standard Template Library (STL) is an example of generic programming. Generic programming enables SDLT to be flexible yet efficient. The generic interfaces enable you to customize components to your specific needs.
The net result is that SDLT enables you to specify a preferred SIMD data layout far more conveniently than re-structuring your code completely with a new data structure for effective vectorization, and at the same time can improve performance.
I have no experience with this myself so would welcome comments from anyone who has tried it.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Bruce Adams |