'Inserting non-const pair into `std::unordered_map` is slower than const pair
I have some code like these (from cppcon), when inserting a non-const pair into a unordered_map
, the performance is very different to inserting with a const one.
#include <algorithm>
#include <chrono>
#include <iostream>
#include <iterator>
#include <unordered_map>
#include <vector>
using namespace std;
struct StopWatch {
StopWatch() : clk{std::chrono::system_clock::now()} {}
~StopWatch() {
auto now = std::chrono::system_clock::now();
auto diff = now - clk;
cout << chrono::duration_cast<chrono::microseconds>(diff).count() << "ms"
<< endl;
}
decltype(std::chrono::system_clock::now()) clk;
};
void Benchmark_Slow(int iters) {
std::unordered_map<string, int> m;
std::pair<const string, int> p = {};
while (iters--)
m.insert(p);
}
void Benchmark_Fast(int iters) {
std::unordered_map<string, int> m;
const std::pair<const string, int> p = {};
while (iters--)
m.insert(p);
}
int main(void) {
{
StopWatch sw;
Benchmark_Fast(1000000);
}
{
StopWatch sw;
Benchmark_Slow(1000000);
}
return 0;
}
A online demo: Compiler Explorer
128247ms
392454ms
It seems that the const
qualifier let the compiler to choose the unordered_map::insert(const value_type&)
overload instead of the unordered_map::insert( P&& value )
.
cppreference: unordered_map::insert
But I think that a forwarding templated universal reference insert(P&& value)
would be the same as an insert
with const lvalue reference, an identical copy operation.
But the emplace one(with non-const pair) runs much slower than insert one(with const pair).
Am I missing something here ? Or if this is something has a keyword to be searched on the google, I didn't find something answers that. Thank you in advance.
Solution 1:[1]
I think I do found a possible explanation.
from emplace it describe that if insertion fails, the constructed element would be destroyed immediately.
I follow the assembly code compiled with libstd++ of unordered_map::emplace
(which accept templated argument and do std::forward
) and unordered_map::insert
link provided by @Jarod42, I the emplace one always allocate a new hash_node
before it check if the key already in the map, because it's templated and it didn't know the argument type (maybe it only know it's is_convertible_to
), so it do the construction before examine the key. The one in libc++ seems recognize the type is a const reference thus do the same as the insert one, copy construct occurs only if the key is not exsist.
When I modified the code with different key to be inserted, the difference gone away. quick C++ benchmark
I don't know did I miss something else. I' m sorry for this trivial problem was posted.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |