How is CONDITION_VARIABLE implemented?
A longer version of the title question would be:
On my machine, sizeof(std::condition_variable) is 72 bytes. What are these 72 bytes used for?
Note: The size of std::condition_variable depends on the implementation. Some example sizes are given in Appendix A.
To understand how std::condition_variable works, it is enough for me to understand wait, notify_one, and the member objects. I will start with wait. The overload of wait that takes a predicate is given below.
template <class _Predicate>
void wait(unique_lock<mutex>& _Lck, _Predicate _Pred) { // wait for signal and test predicate
while (!_Pred()) {
wait(_Lck);
}
}
The above wait calls the no-predicate wait.
void wait(unique_lock<mutex>& _Lck) { // wait for signal
// Nothing to do to comply with LWG-2135 because std::mutex lock/unlock are nothrow
_Cnd_wait(_Mycnd(), _Lck.mutex()->_Mymtx());
}
This wait calls _Cnd_wait on _Mycnd(). _Cnd_wait is found here.
int _Cnd_wait(const _Cnd_t cond, const _Mtx_t mtx) { // wait until signaled
const auto cs = static_cast<Concurrency::details::stl_critical_section_interface*>(_Mtx_getconcrtcs(mtx));
_Mtx_clear_owner(mtx);
cond->_get_cv()->wait(cs);
_Mtx_reset_owner(mtx);
return _Thrd_success; // TRANSITION, ABI: Always returns _Thrd_success
}
_Cnd_t is a pointer to a _Cnd_internal_imp_t.
using _Cnd_t = struct _Cnd_internal_imp_t*;
The struct _Cnd_internal_imp_t is defined here.
struct _Cnd_internal_imp_t { // condition variable implementation for ConcRT
std::aligned_storage_t<Concurrency::details::stl_condition_variable_max_size,
Concurrency::details::stl_condition_variable_max_alignment>
cv;
[[nodiscard]] Concurrency::details::stl_condition_variable_interface* _get_cv() noexcept {
// get pointer to implementation
return reinterpret_cast<Concurrency::details::stl_condition_variable_interface*>(&cv);
}
};
I am now looking at the line cond->_get_cv()->wait(cs). To understand this line, I need to see the wait member function of Concurrency::details::stl_condition_variable_interface. This is a pure virtual function.
class __declspec(novtable) stl_condition_variable_interface {
public:
virtual void wait(stl_critical_section_interface*) = 0;
virtual bool wait_for(stl_critical_section_interface*, unsigned int) = 0;
virtual void notify_one() = 0;
virtual void notify_all() = 0;
virtual void destroy() = 0;
};
Edit 2
cond->_get_cv() returns a pointer to the abstract class stl_condition_variable_interface. At some point during construction, create_stl_condition_variable is called, which sets the vtable pointer. The vtable pointer of this object will point to the vtable of either stl_condition_variable_vista (given here) or stl_condition_variable_win7 (given here). The top answer to this Stack Overflow question explains some of the details.
In my case, the vtable pointer points to the vtable of stl_condition_variable_win7.
class stl_condition_variable_win7 final : public stl_condition_variable_interface {
public:
stl_condition_variable_win7() {
InitializeConditionVariable(&m_condition_variable);
}
~stl_condition_variable_win7() = delete;
stl_condition_variable_win7(const stl_condition_variable_win7&) = delete;
stl_condition_variable_win7& operator=(const stl_condition_variable_win7&) = delete;
void destroy() override {}
void wait(stl_critical_section_interface* lock) override {
if (!stl_condition_variable_win7::wait_for(lock, INFINITE)) {
std::terminate();
}
}
bool wait_for(stl_critical_section_interface* lock, unsigned int timeout) override {
return SleepConditionVariableSRW(&m_condition_variable,
static_cast<stl_critical_section_win7*>(lock)->native_handle(), timeout, 0)
!= 0;
}
void notify_one() override {
WakeConditionVariable(&m_condition_variable);
}
void notify_all() override {
WakeAllConditionVariable(&m_condition_variable);
}
private:
CONDITION_VARIABLE m_condition_variable;
};
So my 72 or 8 bytes are reserved to store a CONDITION_VARIABLE, and the essence of wait is to call SleepConditionVariableSRW. This function is described here.
END EDIT 2
Appendix A
The only member object of std::condition_variable is
aligned_storage_t<_Cnd_internal_imp_size, _Cnd_internal_imp_alignment> _Cnd_storage;
std::condition_variable contains the member function below, which allows the address of _Cnd_storage to be interpreted as a _Cnd_t.
_Cnd_t _Mycnd() noexcept { // get pointer to _Cnd_internal_imp_t inside _Cnd_storage
return reinterpret_cast<_Cnd_t>(&_Cnd_storage);
}
sizeof(std::condition_variable) is given by sizeof(_Cnd_storage), whose size and alignment constants are defined in xthreads.h.
// Size and alignment for _Mtx_internal_imp_t and _Cnd_internal_imp_t
#ifdef _CRT_WINDOWS
#ifdef _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size = 32;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size = 16;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 8;
#else // _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size = 20;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 4;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 4;
#endif // _WIN64
#else // _CRT_WINDOWS
#ifdef _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size = 80;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size = 72;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 8;
#else // _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size = 48;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 4;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size = 40;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 4;
#endif // _WIN64
#endif // _CRT_WINDOWS
Edit 1/Appendix B
I thought about this after posting the question, and I am not sure how to make it flow with the rest. std::condition_variable's only member is
aligned_storage_t<_Cnd_internal_imp_size, _Cnd_internal_imp_alignment> _Cnd_storage;
which is interpreted as _Cnd_internal_imp_t. _Cnd_internal_imp_t's only member is
std::aligned_storage_t<Concurrency::details::stl_condition_variable_max_size, Concurrency::details::stl_condition_variable_max_alignment> cv;
It is possible that stl_condition_variable_max_size != _Cnd_internal_imp_size. In fact, this is implied by this line:
static_assert(sizeof(_Cnd_internal_imp_t) <= _Cnd_internal_imp_size, "incorrect _Cnd_internal_imp_size");
This means that some of the 72 bytes may be "unused."
END EDIT 1
Questions:
- std::condition_variable reserves 72 bytes for a CONDITION_VARIABLE (see Edit 2). What are these 72 bytes used for?
- How could a std::condition_variable get away with fewer bytes? It appears as though on some machines std::condition_variable objects are only 8 bytes big. See: _INLINE_VAR constexpr size_t _Cnd_internal_imp_size = 8;
Solution 1:[1]
std::condition_variable reserves 72 bytes for a CONDITION_VARIABLE (see Edit 2). What are these 72 bytes used for?
There was another implementation of the condition variable, backed by the Concurrency Runtime (ConcRT). In Visual Studio 2012 it was the only implementation, but it turned out to be not very good.
Starting from VS 2015, there is a better implementation backed by the actual CONDITION_VARIABLE. Polymorphism is used to select different implementations for different Windows versions, as CONDITION_VARIABLE is available starting with Windows Vista, and a complete SRWLOCK is available starting with Windows 7. The polymorphism uses placement new rather than unions to hide the implementation details and to make the implementation conformant by making it a standard-layout class.
So there is room for multiple implementations, of which the ConcRT one is the largest.
Otherwise, sizeof(CONDITION_VARIABLE) == sizeof(void*), as well as sizeof(SRWLOCK) == sizeof(void*), though they aren't pointers internally. The rest of the space is wasted if the CONDITION_VARIABLE / SRWLOCK implementation is used.
Starting from Visual Studio 2019, Windows XP is no longer supported by the VS toolset (VS 2019 still supports it through the ability to install the VS 2017 toolset). So the ConcRT dependency and the ability to create a pre-Vista condition_variable were removed by my PR. A follow-up PR removed the ConcRT structure wrappers.
Starting from Visual Studio 2022, Windows Vista is no longer supported by the VS toolset either; my other PR to remove the SRWLOCK polymorphism is in flight.
Still, due to the ABI compatibility between VS 2015, VS 2017, VS 2019, and VS 2022, it is not possible to reduce the size of condition_variable.
Getting rid of placement new in the mutex constructor, and fixing the conformance issue of the mutex constructor being non-constexpr, is also hard (my attempt has failed).
So, VS 2019 and VS 2022 still have to reserve space for the ConcRT implementation, which is no longer used.
With the next ABI-breaking release of Visual Studio, it is highly likely that the implementation of condition_variable will change.
How could a std::condition_variable get away with fewer bytes?
The _CRT_WINDOWS implementation never needed to support Windows XP, so it does not have the ConcRT fallback. Still, it shares the implementation with the usual configuration, apparently for maintenance reasons.
Solution 2:[2]
This is an incomplete answer, but it does provide more information.
The constructor of std::condition_variable calls a function that creates the implementation of the condition variable within _Cnd_storage:
condition_variable() {
_Cnd_init_in_situ(_Mycnd());
}
When _CRT_WINDOWS is defined, it appears that the data stored there is two pointers, or one pointer and one pointer-sized integer; the first is probably a virtual function table pointer (for the stl_condition_variable_interface), and the other one is the state.
Depending on what the OS and libraries you are using provide, more or less machinery needs to be in the condition variable implementation.
That implementation may be in source code you do not appear to have access to.
The file at https://github.com/ojdkbuild/tools_toolchain_vs2017bt_1416/blob/master/VC/Tools/MSVC/14.16.27023/crt/src/stl/cond.c appears to contain _Cnd_init_in_situ, which simply forwards to Concurrency::details::create_stl_condition_variable(cond->_get_cv()).
Here is a VS2013 Concurrency::details::_Condition_variable. It does not, however, appear to be what is created there (it has no virtual base). It has two members:
void * volatile _M_pWaitChain;
Concurrency::critical_section _M_lock;
which may be similar to what is actually stored there (as it was a previous implementation of something similar). The critical section is probably redundant for a std::condition_variable, as it has an external mutex to work with.
What is in _M_pWaitChain I cannot say, beyond what its name suggests.
None of this is complete. I do know that modern condition variables know, when they are signaled, whether the lock is being held, and interact with which thread wakes up when the mutex is released; i.e., low-level, OS-internal scheduling stuff.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Yakk - Adam Nevraumont |