Calling `std::vector<A>::data()` on `A` with const or reference fields, before C++20

This is a followup on the answers for placement new on a class with reference field.

Calling std::vector<A>::data() on a type A that has reference or const fields returns a pointer to objects that can later be replaced through the original vector by placement new. Such a replacement changes a const or reference field of an original object while that object is still referred to by another pointer, the one returned by the call to data().

For example:

#include <iostream>
#include <vector>

struct A {
    const int i = 0;
};

int main() {
    std::vector<A> vec = {{1}, {2}};
    auto ptr = vec.data(); // pointer to the first element
    std::cout << ptr[1].i << std::endl; // 2
    vec.pop_back();
    vec.push_back({3}); // placement new, inside
    std::cout << ptr[1].i << std::endl; // 3
}

C++17 tried to resolve such issues by introducing std::launder, but it was later agreed that, while std::launder may solve other issues, it doesn't really solve the problem for the above use case, as noted in NB US042.
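For context, here is a minimal sketch of the kind of case std::launder does address: user code that replaces an object in storage it controls and still holds a pointer to the old object. The function name and the buffer are hypothetical and only serve the illustration. The vector case differs because the placement new happens inside the library, and user code holding a pointer from data() has no obvious point at which to launder it.

```cpp
#include <new>  // placement new, std::launder (C++17)

struct A {
    const int i = 0;
};

// Hypothetical helper: reuses one A-sized buffer for two objects and reads
// the second object through a laundered copy of the pointer to the first.
int read_replaced_through_launder() {
    alignas(A) unsigned char storage[sizeof(A)];

    A* first = new (storage) A{1};  // first object lives in storage
    first->~A();                    // end its lifetime (trivial destructor)
    new (storage) A{2};             // a new A now occupies the same storage

    // Before C++20, reading first->i directly was not covered by
    // [basic.life] because A has a const member; std::launder yields a
    // pointer to the object that currently lives at that address.
    return std::launder(first)->i;
}
```

Under these assumptions the function returns the value stored in the second object.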

Some questions - for C++ versions prior to C++20:

  • Since NB US042 was accepted as a change for the C++20 spec, but not marked as a DR, would it be advised, according to the spec only, to avoid the use of std::vector<A>::data() on a type A that has reference or const fields, as in the above example?

  • Or does the wording of the spec for std::vector<>::data() cover that, making it legal and leaving the implementability question to the library implementers?

  • If it is the latter, what can the library do to make it legal?

  • If it can't really do anything useful to make it legal, is it UB before C++20?

  • If it is UB before C++20, why wasn't this change considered to be a candidate for a DR, same as p0593r6? Most probably compilers do the right thing anyway, so why not mandate that retroactively?



Solution 1:[1]

Whether this code is valid depends on whether the implementation has strict pointer safety, a concept that was later removed from the language (in C++23).

Simply put, the pointer ptr is valid for arithmetic in the range [ptr, ptr+2) immediately after the call to data().

After the call to pop_back(), any saved pointer to index 1 is certainly invalid, due to the iterator invalidation rules for std::vector<T>::pop_back():

Effects: Invalidates iterators and references at or after the point of the erase.

After the call to pop_back(), the pointer ptr obtained earlier is no longer valid for arithmetic in its original range (using it to compute ptr+1 no longer results in a "safely-derived" pointer value).

After the call to push_back(), strict safety of arithmetic using ptr is not restored. However, indexing using the original ptr, which has not been invalidated by pop_back() (only the reachable range was reduced), is still allowed on implementations with "relaxed pointer safety". That is, the expressions ptr+1 and ptr[1] involve valid-but-unsafely-derived pointer values.

A new call to data() returns a pointer value which compares equal to the original, but which can, unlike the old saved pointer, be used to safely derive values out to the current length of the vector. Again, implementations with relaxed pointer safety don't care.
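As a sketch of the re-fetch approach described above, the following hypothetical helper performs the same pop_back()/push_back() sequence as the question, but reads element 1 through a pointer obtained from data() after the modifications rather than through the one saved before them. It assumes, as in the question, that push_back() does not reallocate because the capacity is already at least 2.

```cpp
#include <vector>

struct A {
    const int i = 0;
};

// Hypothetical helper: replaces the last element, then reads it through a
// pointer fetched after the vector was modified.
int read_second_after_replace() {
    std::vector<A> vec = {{1}, {2}};

    vec.pop_back();      // invalidates pointers and references to element 1
    vec.push_back({3});  // constructs a new A in the vacated slot

    const A* fresh = vec.data();  // obtained after the modifications
    return fresh[1].i;            // refers to the element created by push_back
}
```

The fresh pointer covers the vector's current range [fresh, fresh+2), so indexing it at 1 does not rely on any pointer value that pop_back() invalidated.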


The change made for C++20 for lifetime of objects having const members has no effect here, because the use of existing pointers to refer to the replacement object was not forbidden solely by [basic.life] but also by the iterator invalidation clause in pop_back.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1