'incorrect output when using C++ lower/upper_bound function with custom compare operator

I've been experimenting with the "lower_bound()/upper_bound()" functions in C++ w.r.t. arrays/vectors, and I get incorrect results when applying custom compare operators to the function. My current understanding (based on https://www.cplusplus.com/reference/algorithm/upper_bound/) is that when you search for some value 'val' (of any datatype) in an array, it returns the first iterator position "it" in the array (from left to right) that satisfies !comp(val,*it), is this wrong? If so, how exactly does the searching work?

P.S. In addition, what is the difference of using lowerbound/upperbound when your searching criterion is a specific boolean compare function?


Here is an example that produced erroneous results:

auto comp2 = [&](int num, pair<int,int>& p2){return num>p2.second;};
vector<pair<int,int>> pairs = {{1,2},{2,3},{3,4}};   //this array should be binary-searchable with 'comp2' comparator, since pairs[i].second is monotonously increasing
int pos2 = upper_bound(pairs.begin(),pairs.end(),2,comp2)-pairs.begin();               
cout<<pos2<<endl;                                   //outputs 3, but should give 0 because !comp2(2,arr[0]) is true, and arr[0] is the ealiest element in the array

Thanks!



Solution 1:[1]

I think most (If not all) of the comparator functions are less, it can be std::less or something similar. So when we provide a custom comp function, we have to provide the less logic and think of it as less.

Now back to the upper_bound, it returns the first element greater than the value, which means our less should return true for it to stop (As Francois pointed out). While our comp function always returns false.

And your understanding about !comp(val,*it) is also not correct. It is the condition to continue the search, not to stop it.

Here is an example implementation of the upper_bound, let's take a look:

template<class ForwardIt, class T, class Compare>
ForwardIt upper_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp)
{
    ForwardIt it;
    typename std::iterator_traits<ForwardIt>::difference_type count, step;
    count = std::distance(first, last);
 
    while (count > 0) {
        it = first; 
        step = count / 2;
        std::advance(it, step);
        if (!comp(value, *it)) {
            first = ++it;
            count -= step + 1;
        } 
        else
            count = step;
    }
    return first;
}

You can see, if (!comp(value, *it)) is when the less return false, it means the value is greater than the current item, it will move forward and continue from the next item. (Because the items are increasing).

In the other case, it will try to reduce the search distance (By half the count) and hope to find earlier item that is greater than value.

Summary: You have to provide comp as less logic and let the upper_bound do the rest.

Solution 2:[2]

upper_bound returns the first element that satisfies comp(val, *it). In the link you provided, it shows

template <class ForwardIterator, class T>
  ForwardIterator upper_bound (ForwardIterator first, ForwardIterator last, const T& val)
{
    ForwardIterator it;
    iterator_traits<ForwardIterator>::difference_type count, step;
    count = std::distance(first,last);
    while (count>0)
    {
        it = first; step=count/2; std::advance (it,step);
        if (!(val<*it))                 // or: if (!comp(val,*it)), for version (2)
          { first=++it; count-=step+1;  }
        else count=step;
    }
    return first;
}

Returns an iterator pointing to the first element in the range [first,last) which compares greater than val.

The searching works by starting at position 0(first). It then uses count to see the range of values it needs to check. It checks the middle of the range (first+count/2), and if that does not satisfy the condition, that position is now first (discarding all values before it), and repeats with the new first and range. If it does satisfy the condition, then the algorithm can discard all values after that, and repeat with the new range. When the range drops to 0, the algorithm can end. It assumes that if arr[5] is false, arr[0], arr[1] ... arr[4] are also false. Same with if arr[8] is true, arr[9], arr[10] ... arr[n] are also true.

The reason your code does not work is because the comparator used returns num>p2.second, meaning it looks for a value of p2.second that is less than num. Since you put in 2 for num, and there is no p2.second less than that in the vector, the output points to a position outside of the vector because it didn't find anything.

The difference between upper_bound and lower_bound is that upper_bound looks for the first value that satisfies the condition, while lower_bound looks for the first value that does not satisfy the condition. So

lower_bound(v.begin(), v.end(), val, [](int it, int val) {return !(val < it);});

is the same as

upper_bound(v.begin(), v.end(), val, [](int val, int it){return val < it;});

Note that for lower_bound, the comparator used takes (*it, val), not (val, *it).

I guess the only difference is how easy it is to frame the comparator in those terms - realizing that a<b is the same as not a>=b.

More explained here. I liked the explanation that said it finds [lower_bound, upper_bound) when using the same comparator.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 H?i Ph?m LĂȘ
Solution 2 kenntnisse