C++ best practice when passing arguments to functions [closed]

I have read many posts claiming that const & is always good because it eliminates copying, and that passing by value is a bad idea. I have also recently read posts saying const & is bad for basic data types such as int, string, etc.

So does anyone know the "best practice" or most efficient way to pass arguments to a function in modern C++, both for simple data types such as int, double, and string, and for more complex types, e.g. vectors, objects, and pointers?

Thanks.



Solution 1:[1]

To answer this you have to look at the underlying platform. Techniques and practices that work on one may not work on another. So for the sake of this discussion let's focus on x86_64 and passing integral type arguments. Other architectures and calling conventions can be similar.

The rules are summarized like this:

  1. For integral type objects and floats/doubles, pass by value (explanation below).

  2. For smaller objects like std::optional<> and small structs containing only values of the types in rule 1, you can still pass by value.

  3. For any larger object, pass by const reference, or by non-const reference if the function must modify it.

  4. For std::string in particular, use std::string_view as the parameter type: it lets you pass a char pointer or char array without creating a std::string temporary.

  5. Modern C++ (C++11 and later) introduced "move semantics" and rvalue references (&&). A function taking an rvalue reference can "take over" the contents of the passed argument instead of making a copy. This technique is very useful for large objects.
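
The five rules above can be sketched in code roughly as follows. This is only an illustration, assuming C++17 (for std::string_view); the names scale, Point, manhattan, sum, count_spaces and Message are hypothetical and not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Rule 1: built-in arithmetic types are passed by value.
int scale(int value, int factor) { return value * factor; }

// Rule 2: a small struct of built-in members is also cheap to copy.
struct Point { int x; int y; };
int manhattan(Point p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

// Rule 3: a large object that is only read is passed by const reference.
long sum(const std::vector<int>& values) {
    long total = 0;
    for (int v : values) total += v;
    return total;
}

// Rule 4: std::string_view accepts std::string, char arrays and char
// pointers without constructing a temporary std::string.
std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// Rule 5: an rvalue-reference parameter lets the callee take over the
// argument's buffer instead of copying it.
class Message {
public:
    explicit Message(std::string&& body) : body_(std::move(body)) {}
    const std::string& body() const { return body_; }
private:
    std::string body_;
};

// Small usage example exercising each rule.
void demo_rules() {
    assert(scale(6, 7) == 42);
    assert(manhattan(Point{-3, 4}) == 7);
    assert(sum({1, 2, 3}) == 6);
    assert(count_spaces("a b c") == 2);             // string literal
    assert(count_spaces(std::string("x y")) == 1);  // std::string

    std::string text = "hello";
    Message m(std::move(text));  // text is left valid but unspecified
    assert(m.body() == "hello");
}
```

Where exactly "small" ends and "large" begins depends on the platform and the type, so treat the split between rules 2 and 3 as a judgment call rather than a hard limit.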

A more detailed explanation follows. When calling a function with integer arguments only (pointers count as integers), the following registers are used, in order: %rdi, %rsi, %rdx, %rcx, %r8 and %r9.

For returning values, %rax and %rdx are used. All these registers are 64 bits wide.

This is a departure from the default i386 calling convention, where everything was passed on the stack, i.e. every parameter had to be stored in memory before the call was made. Because the AMD64 ABI passes arguments in registers, calls are considerably faster: the values stay inside the CPU core and no memory access is needed.

So a function like

int func( int a, int b );

will use %rdi=a, %rsi=b, and the return value will be in %rax. Note that if func is a non-static member function of a class, the first argument will be the this pointer, so the sequence will be %rdi=this, %rsi=a, %rdx=b, and the return value will still be in %rax.
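
To make the hidden this argument concrete, here is a small sketch. Accumulator and accumulator_add are hypothetical names; the free function is only a hand-written picture of how the member call looks at the ABI level, not something the compiler literally generates.

```cpp
#include <cassert>

// Hypothetical class used only for illustration.
struct Accumulator {
    int base;
    // Non-static member function: `this` is an implicit first argument.
    int add(int a, int b) const { return base + a + b; }
};

// Hand-written free-function equivalent with the hidden pointer made
// explicit: self would travel in %rdi, a in %rsi and b in %rdx.
int accumulator_add(const Accumulator* self, int a, int b) {
    return self->base + a + b;
}

// Both spellings compute the same result.
void demo_this_pointer() {
    Accumulator acc{10};
    assert(acc.add(1, 2) == 13);
    assert(accumulator_add(&acc, 1, 2) == 13);
}
```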

So what happens if you pass that int by reference? Let's compare.

int func( int a, int b ) {
    return a+b;
}

int func( const int& a, const int& b ) {
    return a+b;
}

when compiled, these produce

func(int, int):                              # @func(int, int)
        leal    (%rdi,%rsi), %eax
        retq

func(int const&, int const&):                # @func(int const&, int const&)
        movl    (%rsi), %eax
        addl    (%rdi), %eax
        retq

Notice that each reference is passed as a pointer. The callee therefore has to load both values from memory (movl (%rsi), %eax followed by addl (%rdi), %eax), which is much more costly than the single instruction leal (%rdi,%rsi), %eax, which computes the sum without touching memory.

So in this case it is better for speed and cache usage to pass by value instead of by reference when dealing with integral (int-like) values.

The same applies to floats and doubles. The registers are different (%xmm0, %xmm1, etc.) but the logic is the same.
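
A floating-point version of the same comparison might look like the sketch below. The function names are hypothetical, and no assembly is shown because it was not part of the original answer; you can paste the pair into a compiler explorer to compare the output yourself. Keep in mind that when the compiler inlines the call, the difference between the two forms usually disappears.

```cpp
#include <cassert>

// By value: under the AMD64 ABI the arguments arrive in %xmm0 and %xmm1.
double add_by_value(double a, double b) { return a + b; }

// By const reference: the arguments arrive as pointers, so a non-inlined
// callee has to load both doubles from memory first.
double add_by_ref(const double& a, const double& b) { return a + b; }

// Both forms return the same result; only the calling cost differs.
void demo_doubles() {
    assert(add_by_value(1.5, 2.25) == 3.75);
    assert(add_by_ref(1.5, 2.25) == 3.75);
}
```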

For larger objects like std::vector or std::string, pass by const reference if the function or method does not modify the object, and by non-const reference if it does. The register rules for integral types then apply, because pointers and references are treated as integer-like.
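
As a minimal sketch of that distinction for std::vector (total and append_twice are hypothetical names):

```cpp
#include <cassert>
#include <vector>

// Read-only access: const reference, no copy of the vector is made.
long total(const std::vector<int>& values) {
    long result = 0;
    for (int v : values) result += v;
    return result;
}

// The function modifies the caller's vector: non-const reference.
void append_twice(std::vector<int>& values, int x) {
    values.push_back(x);
    values.push_back(x);
}

// The caller's vector is read in place and then modified in place.
void demo_vector_params() {
    std::vector<int> v{1, 2, 3};
    assert(total(v) == 6);
    append_twice(v, 4);
    assert(v.size() == 5);
    assert(total(v) == 14);
}
```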

For example

#include <string>
int len( const std::string& s ) {
    return s.size();
}

int call( const std::string& s ) {
    return len(s);
}

This compiles to

len(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&): # @len(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
        movl    8(%rdi), %eax
        retq
call(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&): # @call(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
        movl    8(%rdi), %eax
        retq

but if you pass a string by value like this

int len2( std::string s ) {
    return s.size();
}

int call2( std::string s ) {
    return len2(s);
}

then the len2 function itself is still simple, but the caller has to make a copy of the string, which leads to a very large caller:

len2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >): # @len2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
        movl    8(%rdi), %eax
        retq

call2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >): # @call2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
        pushq   %r15
        pushq   %r14
        pushq   %r12
        pushq   %rbx
        subq    $40, %rsp
        leaq    24(%rsp), %r12
        movq    %r12, 8(%rsp)
        movq    (%rdi), %r14
        movq    8(%rdi), %rbx
        cmpq    $15, %rbx
        jbe     .LBB7_1
        testq   %rbx, %rbx
        js      .LBB7_12
        movq    %rbx, %rdi
        incq    %rdi
        js      .LBB7_13
        callq   operator new(unsigned long)
        movq    %rax, %r15
        movq    %rax, 8(%rsp)
        movq    %rbx, 24(%rsp)
        testq   %rbx, %rbx
        jne     .LBB7_6
        jmp     .LBB7_9
.LBB7_1:                                # %entry.if.end_crit_edge.i.i
        movq    %r12, %r15
        testq   %rbx, %rbx
        je      .LBB7_9
.LBB7_6:                                # %if.end.i.i
        cmpq    $1, %rbx
        jne     .LBB7_8
        movb    (%r14), %al
        movb    %al, (%r15)
        jmp     .LBB7_9
.LBB7_8:                                # %if.end.i.i.i.i.i
        movq    %r15, %rdi
        movq    %r14, %rsi
        movq    %rbx, %rdx
        callq   memcpy@PLT
.LBB7_9:                                # %_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_.exit
        movq    %rbx, 16(%rsp)
        movb    $0, (%r15,%rbx)
        movq    8(%rsp), %rdi
        movq    16(%rsp), %rbx
        cmpq    %r12, %rdi
        je      .LBB7_11
        callq   operator delete(void*)
.LBB7_11:                               # %_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED2Ev.exit
        movl    %ebx, %eax
        addq    $40, %rsp
        popq    %rbx
        popq    %r12
        popq    %r14
        popq    %r15
        retq
.LBB7_13:                               # %if.end.i.i.i.i.i.i
        callq   std::__throw_bad_alloc()
.LBB7_12:                               # %if.then.i.i.i
        movl    $.L.str, %edi
        callq   std::__throw_length_error(char const*)
.L.str:
        .asciz  "basic_string::_M_create"
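
If a function like call2 really does take its std::string by value, the copy made when forwarding it can be avoided by moving the parameter into the callee. The sketch below is an addition to the answer, not the original author's code; call2_copy and call2_move are hypothetical names for the two variants.

```cpp
#include <cassert>
#include <string>
#include <utility>

int len2(std::string s) { return static_cast<int>(s.size()); }

// Copies: len2's parameter is copy-constructed from the lvalue s,
// which is what produces the long caller shown above.
int call2_copy(std::string s) { return len2(s); }

// Moves: s is not used again, so its buffer can be handed to len2.
int call2_move(std::string s) { return len2(std::move(s)); }

// Both variants return the same length.
void demo_move_into_callee() {
    assert(call2_copy("hello") == 5);
    assert(call2_move("hello") == 5);
}
```

This only removes the copy inside call2. For a function that merely reads the string, const std::string& or std::string_view remains the simpler choice.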

Reference: https://uclibc.org/docs/psABI-x86_64.pdf

Solution 2:[2]

For best practices, look up the C++ Core Guidelines.

Some general tips:

  1. For large objects, pass by const& (or & if you require mutability), because copying the object is more expensive than copying a pointer to it. (Note that rvalues also bind to const&; you can provide a && overload to move from them.)
  2. For small objects (built-in types), there isn't much point in passing by reference, as the object size is probably of the same order as a pointer.
  3. Prefer references to pointers, to avoid needing a nullptr check.
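
Tip 3 can be illustrated with a short sketch. Settings, the two functions, and the fallback value 3 are all hypothetical and chosen only for the example.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Settings { int retries; };

// Pointer parameter: the callee has to decide what a null argument means.
int retries_from_pointer(const Settings* s) {
    if (s == nullptr) return 3;  // hypothetical fallback value
    return s->retries;
}

// Reference parameter: the caller must supply an object, so no check is needed.
int retries_from_reference(const Settings& s) { return s.retries; }

// The pointer version needs an extra case that the reference version cannot have.
void demo_reference_vs_pointer() {
    Settings s{5};
    assert(retries_from_pointer(&s) == 5);
    assert(retries_from_pointer(nullptr) == 3);
    assert(retries_from_reference(s) == 5);
}
```

A pointer parameter is still reasonable when "no object" is a meaningful input; the reference form documents that an object is required.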

If you are writing template code, look up perfect forwarding. If you are using standard library containers, there are lightweight view types such as std::string_view (C++17) and std::span (C++20), which provide a non-owning view into contiguous data and can be passed by value.
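
A minimal sketch of perfect forwarding, assuming C++17: Tracker and store are hypothetical names, and Tracker exists only to count how many copies and moves happened on the way in.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper that records how it was constructed.
struct Tracker {
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker& other) : copies(other.copies + 1), moves(other.moves) {}
    Tracker(Tracker&& other) noexcept : copies(other.copies), moves(other.moves + 1) {}
};

// T&& is a forwarding reference here; std::forward preserves whether the
// caller passed an lvalue (copy) or an rvalue (move).
template <typename T>
Tracker store(T&& value) {
    return Tracker(std::forward<T>(value));
}

// An lvalue argument is copied once, an rvalue argument is moved once.
void demo_forwarding() {
    Tracker source;
    Tracker from_lvalue = store(source);
    assert(from_lvalue.copies == 1 && from_lvalue.moves == 0);

    Tracker from_rvalue = store(Tracker{});
    assert(from_rvalue.copies == 0 && from_rvalue.moves == 1);
}
```

The same single template thus serves both callers that want to keep their object and callers that are done with it.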

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2