'Inserting non-const pair into `std::unordered_map` is slower than const pair

I have some code like these (from cppcon), when inserting a non-const pair into a unordered_map, the performance is very different to inserting with a const one.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <iterator>
#include <unordered_map>
#include <vector>
using namespace std;
struct StopWatch {
  StopWatch() : clk{std::chrono::system_clock::now()} {}
  ~StopWatch() {
    auto now = std::chrono::system_clock::now();
    auto diff = now - clk;
    cout << chrono::duration_cast<chrono::microseconds>(diff).count() << "ms"
         << endl;
  }
  decltype(std::chrono::system_clock::now()) clk;
};
void Benchmark_Slow(int iters) {
  std::unordered_map<string, int> m;
  std::pair<const string, int> p = {};
  while (iters--)
    m.insert(p);
}
void Benchmark_Fast(int iters) {
  std::unordered_map<string, int> m;
  const std::pair<const string, int> p = {};
  while (iters--)
    m.insert(p);
}
int main(void) {
  {
    StopWatch sw;
    Benchmark_Fast(1000000);
  }
  {
    StopWatch sw;
    Benchmark_Slow(1000000);
  }
  return 0;
}

A online demo: Compiler Explorer

128247ms
392454ms

It seems that the const qualifier let the compiler to choose the unordered_map::insert(const value_type&) overload instead of the unordered_map::insert( P&& value ). cppreference: unordered_map::insert

But I think that a forwarding templated universal reference insert(P&& value) would be the same as an insert with const lvalue reference, an identical copy operation.

But the emplace one(with non-const pair) runs much slower than insert one(with const pair).

Am I missing something here ? Or if this is something has a keyword to be searched on the google, I didn't find something answers that. Thank you in advance.



Solution 1:[1]

I think I do found a possible explanation. from emplace it describe that if insertion fails, the constructed element would be destroyed immediately. I follow the assembly code compiled with libstd++ of unordered_map::emplace (which accept templated argument and do std::forward) and unordered_map::insert link provided by @Jarod42, I the emplace one always allocate a new hash_node before it check if the key already in the map, because it's templated and it didn't know the argument type (maybe it only know it's is_convertible_to), so it do the construction before examine the key. The one in libc++ seems recognize the type is a const reference thus do the same as the insert one, copy construct occurs only if the key is not exsist. When I modified the code with different key to be inserted, the difference gone away. quick C++ benchmark

I don't know did I miss something else. I' m sorry for this trivial problem was posted.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1