'Inserting non-const pair into `std::unordered_map` is slower than const pair
I have some code like these (from cppcon), when inserting a non-const pair into a unordered_map, the performance is very different to inserting with a const one.
#include <algorithm>
#include <chrono>
#include <iostream>
#include <iterator>
#include <unordered_map>
#include <vector>
using namespace std;
struct StopWatch {
  StopWatch() : clk{std::chrono::system_clock::now()} {}
  ~StopWatch() {
    auto now = std::chrono::system_clock::now();
    auto diff = now - clk;
    cout << chrono::duration_cast<chrono::microseconds>(diff).count() << "ms"
         << endl;
  }
  decltype(std::chrono::system_clock::now()) clk;
};
void Benchmark_Slow(int iters) {
  std::unordered_map<string, int> m;
  std::pair<const string, int> p = {};
  while (iters--)
    m.insert(p);
}
void Benchmark_Fast(int iters) {
  std::unordered_map<string, int> m;
  const std::pair<const string, int> p = {};
  while (iters--)
    m.insert(p);
}
int main(void) {
  {
    StopWatch sw;
    Benchmark_Fast(1000000);
  }
  {
    StopWatch sw;
    Benchmark_Slow(1000000);
  }
  return 0;
}
A online demo: Compiler Explorer
128247ms
392454ms
It seems that the const qualifier let the compiler to choose the unordered_map::insert(const value_type&) overload instead of the unordered_map::insert( P&& value ).
cppreference: unordered_map::insert
But I think that a forwarding templated universal reference insert(P&& value) would be the same as an insert with const lvalue reference, an identical copy operation.
But the emplace one(with non-const pair) runs much slower than insert one(with const pair).
Am I missing something here ? Or if this is something has a keyword to be searched on the google, I didn't find something answers that. Thank you in advance.
Solution 1:[1]
I think I do found a possible explanation.
from emplace it describe that if insertion fails, the constructed element would be destroyed immediately.
I follow the assembly code compiled with libstd++ of unordered_map::emplace (which accept templated argument and do std::forward) and unordered_map::insert link provided by @Jarod42, I the emplace one always allocate a new hash_node before it check if the key already in the map, because it's templated and it didn't know the argument type (maybe it only know it's is_convertible_to), so it do the construction before examine the key. The one in libc++ seems recognize the type is a const reference thus do the same as the insert one, copy construct occurs only if the key is not exsist.
When I modified the code with different key to be inserted, the difference gone away. quick C++ benchmark
I don't know did I miss something else. I' m sorry for this trivial problem was posted.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | 
