Program behaves differently when running under Valgrind

I have a problem debugging my application with Valgrind (memcheck): it behaves differently than when I run it standalone. At one point the code tries to find the network interface, and its MAC address, that belongs to a given IP address. It iterates over all existing interfaces, first calling ioctl(sock, SIOCGIFADDR, &ifr) to read the IP address and, if this matches the one I am looking for, calling ioctl(sock, SIOCGIFHWADDR, &ifr) to read the MAC. Without Valgrind this works; with Valgrind the second call returns an empty address. Does anyone have an idea what could cause this?

Here is a list of all messages that Valgrind outputs (without details):

(1) ==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
(1) ==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
(1) ==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)

Here is the backtrace of the first message:

(1) ==1930== For lists of detected and suppressed errors, rerun with: -s
(1) ==1930== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
(1) ==3120== Syscall param mq_notify(notification) points to uninitialised byte(s)
(1) ==3120==    at 0xFCD3A1C: mq_notify (mq_notify.c:271)
(1) ==3120==    by 0x1016F6B7: NotificationReceiver<PisType::PisOperation>::registerInternalCallback(void (*)(sigval)) (NotificationReceiver.tpp:159)
(1) ==3120==    by 0x10170B93: NotificationReceiver<PisType::PisOperation>::setupMsgQueue(communication::CommunicationLink) (NotificationReceiver.tpp:99)
(1) ==3120==    by 0x10170CDF: NotificationReceiver<PisType::PisOperation>::NotificationReceiver(INotification::Connection) (NotificationReceiver.tpp:48)
(1) ==3120==    by 0x1016CD77: PisClientServer::createOperationReceiver() (PisClientServer.cpp:534)
(1) ==3120==    by 0x1016DA27: PisClientServer::create() (PisClientServer.cpp:419)
(1) ==3120==    by 0x1016DB67: PisClientServer::executeOperation(PisType::PisOperation) (PisClientServer.cpp:84)
(1) ==3120==    by 0x101CAFEB: StackHandler::create() (StackHandler.cpp:81)
(1) ==3120==    by 0x100A7B0B: main (stackhandler.cxx:130)
(1) ==3120==  Address 0x9ebeb5c4 is on thread 1's stack
(1) ==3120==  in frame #0, created by mq_notify (mq_notify.c:222)
(1) ==3120==  Uninitialised value was created by a stack allocation
(1) ==3120==    at 0xFCD38F0: mq_notify (mq_notify.c:222)

And the related code:

template <typename T>
void NotificationReceiver<T>::registerInternalCallback(
        internal_callback_t intCallback) {
    intCallbackParam_.commLink = commLink_;
    intCallbackParam_.msgQueue = getMsgQueue(commLink_);
    intCallbackParam_.msgQueueDescriptor = msgQueueDescriptor_;

    memset(&(intCallbackParam_.signalEvent), 0, sizeof(sigevent));
    intCallbackParam_.signalEvent.sigev_notify = SIGEV_THREAD;
    intCallbackParam_.signalEvent.sigev_value.sival_ptr = &intCallbackParam_;
    intCallbackParam_.signalEvent.sigev_notify_function = intCallback;
    intCallbackParam_.signalEvent.sigev_notify_attributes = NULL;

    mq_notify(msgQueueDescriptor_, &intCallbackParam_.signalEvent);
}

The same for the timedsend message:

(1) ==3120== Syscall param mq_timedsend(msg_ptr) points to uninitialised byte(s)
(1) ==3120==    at 0xFCD3B9C: __mq_timedsend (mq_timedsend.c:28)
(1) ==3120==    by 0xFCD3B9C: mq_timedsend (mq_timedsend.c:25)
(1) ==3120==    by 0x1016F1DB: NotificationSender<PisType::PisOperationStatus>::notify(PisType::PisOperationStatus const*) (NotificationSender.tpp:72)
(1) ==3120==    by 0x1016F38F: NotificationSender<PisType::PisOperationStatus>::notifyIfReady(PisType::PisOperationStatus const*, LogFile*) (NotificationSender.tpp:99)
(1) ==3120==    by 0x1016E8FF: PisClientServer::sendOperationStatus(INotification::Connection, PisType::PisOperationStatus) (PisClientServer.cpp:352)
(1) ==3120==    by 0x1016EB17: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:567)
(1) ==3120==    by 0x1017015B: NotificationReceiver<PisType::PisOperation>::internalCallback(sigval) (NotificationReceiver.tpp:209)
(1) ==3120==    by 0xFCD370B: notification_function (mq_notify.c:105)
(1) ==3120==    by 0xFD7851B: start_thread (pthread_create.c:477)
(1) ==3120==    by 0x4354577: clone (clone.S:78)
(1) ==3120==  Address 0x5450999 is on thread 3's stack
(1) ==3120==  in frame #1, created by NotificationSender<PisType::PisOperationStatus>::notify(PisType::PisOperationStatus const*) (NotificationSender.tpp:54)
(1) ==3120==  Uninitialised value was created by a stack allocation
(1) ==3120==    at 0x1016EA68: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:551)

template <typename T>
bool NotificationSender<T>::notify(const T* data) {
    // create empty message buffer
    char msgBuffer[MQ_MSGSIZE_BYTES];
    memset(msgBuffer, 0, MQ_MSGSIZE_BYTES);

    // make sure the data size doesn't exceed the message buffer size
    int dataSizeBytes = std::min(static_cast<int>(sizeof(T)), MQ_MSGSIZE_BYTES);

    // copy the data into the buffer
    memcpy(msgBuffer, data, dataSizeBytes);

    // set a timeout so that sending a message doesn't block the sender
    // indefinitely
    struct timespec timeout;
    timeout.tv_sec = time(NULL) + MQ_SEND_TIMEOUT_SEC;
    timeout.tv_nsec = 0;

    // send message
    if (mq_timedsend(msgQueueDescriptor_, msgBuffer, dataSizeBytes, 0,
                     &timeout) != 0) {
        return false;
    }

    numMessages_++;

    return true;
}

And for settime:

(1) ==3120== Syscall param timer_settime64(value) points to uninitialised byte(s)
(1) ==3120==    at 0xFCD2D60: __timer_settime64 (timer_settime.c:41)
(1) ==3120==    by 0xFCD2F47: timer_settime (timer_settime.c:81)
(1) ==3120==    by 0x102D3AAB: OS_Start_Timer (Timer.c:119)
(1) ==3120==    by 0x10306A1B: GOOSESubscriber_Enable (GOOSESubscriber.c:915)
(1) ==3120==    by 0x10268B53: IEC61850_Start (IEC61850API.c:942)
(1) ==3120==    by 0x1016A23B: PisClientServer::start() (PisClientServer.cpp:382)
(1) ==3120==    by 0x10178AC3: PisServer::start() (PisServer.cpp:235)
(1) ==3120==    by 0x1016DC4F: PisClientServer::executeOperation(PisType::PisOperation) (PisClientServer.cpp:94)
(1) ==3120==    by 0x1016EADB: PisClientServer::callbackStackHandler(PisType::PisOperation) (PisClientServer.cpp:563)
(1) ==3120==    by 0x1017015B: NotificationReceiver<PisType::PisOperation>::internalCallback(sigval) (NotificationReceiver.tpp:209)
(1) ==3120==    by 0xFCD370B: notification_function (mq_notify.c:105)
(1) ==3120==    by 0xFD7851B: start_thread (pthread_create.c:477)
(1) ==3120==  Address 0x5451170 is on thread 3's stack
(1) ==3120==  in frame #1, created by timer_settime (timer_settime.c:74)
(1) ==3120==  Uninitialised value was created by a stack allocation
(1) ==3120==    at 0xFCD2E94: timer_settime (timer_settime.c:74)

struct sigevent SignalEvent;
SignalEvent.sigev_notify = SIGEV_THREAD;
SignalEvent.sigev_notify_function = vTimeUp;
SignalEvent.sigev_value.sival_ptr = ptTimer;
SignalEvent.sigev_notify_attributes = NULL;

if(timer_create(CLOCK_MONOTONIC, &SignalEvent, &(ptTimer->tTimerID)) != 0)
{
    iReturnErrorCode = TIMER_ERROR_OS_FAILED;
}
else
{
    struct itimerspec NextTime;
    NextTime.it_value.tv_sec = u32TimeOut / 1000;
    NextTime.it_value.tv_nsec = (u32TimeOut % 1000) * 1000000;
    if(ptTimer->eType == OSTIMER_TYPE_ONESHOT)
    {
        NextTime.it_interval.tv_sec = 0;
        NextTime.it_interval.tv_nsec = 0;
    }
    else
    {
        NextTime.it_interval.tv_sec = NextTime.it_value.tv_sec;
        NextTime.it_interval.tv_nsec = NextTime.it_value.tv_nsec;
    }
    
    if(timer_settime(ptTimer->tTimerID, 0, &NextTime, NULL) != 0)
    {
        iReturnErrorCode = TIMER_ERROR_OS_FAILED;
    }
}

System/Version:

  • Embedded PowerPC P2020,
  • Valgrind 3.17
  • Yocto built Linux with 5.10 Kernel
  • powerpc-poky-linux-gcc 9.3.0

Version 3.17 is the last one for which there is a recipe compatible with my Yocto version. I have not yet managed to build Valgrind from source, but I will open a separate ticket for that.

On my first attempts, there were several messages about missing syscall wrappers (these are also missing in the latest version). So I copied the following lines from syswrap-ppc64-linux.c to syswrap-ppc32-linux.c:

LINXY(__NR_prlimit64, sys_prlimit64), // 325
LINXY(__NR_getsockopt, sys_getsockopt), // 340
LINXY(__NR_recvmsg, sys_recvmsg), // 342

Is this sufficient or do I need to do more here?

I extracted the affected piece of code and tested it separately:

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>      /* close() */
#include <sys/socket.h>  /* socket() */
#include <sys/ioctl.h>
#include <net/if.h>
#include <netinet/in.h>
#include <net/if_arp.h>
#include <arpa/inet.h>

int main()
{
    int iReturned = 0;
    struct ifreq ifr;
    struct ifconf ifc;
    char buf[1024];  // buffer for address 1024 should be large enough

    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP); //Get a datagram socket
    if (sock != -1)
    {
        ifc.ifc_len = sizeof(buf);
        ifc.ifc_buf = buf;
        if (ioctl(sock, SIOCGIFCONF, &ifc) != -1) //Get the IoConfiguration (an array of each adapter)
        {
            struct ifreq* it = ifc.ifc_req;
            const struct ifreq* const end = it + (ifc.ifc_len / sizeof(struct ifreq));  //Find the last point in the array

            for (; it != end; ++it) //Loop through each adapter
            {
                strcpy(ifr.ifr_name, it->ifr_name); //copy the adapter name into our local ifreq structure
                printf("Adapter name: %s \n", ifr.ifr_name );

                if (ioctl(sock,SIOCGIFADDR,&ifr)==-1) {
                    /* report the error and skip this adapter; keep the socket open for the next one */
                    int temp_errno=errno;
                    printf("%s\n",strerror(temp_errno));
                    continue;
                }
                struct sockaddr_in* ipaddr = (struct sockaddr_in*)&ifr.ifr_addr;
                printf("IP address: %s\n",inet_ntoa(ipaddr->sin_addr));

                if (ioctl(sock,SIOCGIFHWADDR,&ifr)==-1) {
                    int temp_errno=errno;
                    printf("%s\n",strerror(temp_errno));
                    continue;
                }

                const unsigned char* mac=(unsigned char*)ifr.ifr_hwaddr.sa_data;
                printf("%02X:%02X:%02X:%02X:%02X:%02X\n",
                    mac[0],mac[1],mac[2],mac[3],mac[4],mac[5]);
            }
        }
        else
        {
            iReturned = -1;
            /* handle error */
        }

        close(sock); /*Close the opened socket*/
    }
    else
    {
        iReturned = -1;
        /* handle error */
    }

    return iReturned;
}

Interestingly, the determination of the MAC in this case also works with Valgrind:

valgrind /MacLookup 
==4099== Memcheck, a memory error detector
==4099== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4099== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==4099== Command: /MacLookup
==4099== 
Adapter name: lo 
IP address: 127.0.0.1
00:00:00:00:00:00
Adapter name: eth0 
IP address: 192.168.0.3
00:D0:93:51:A3:1B
==4099== 
==4099== HEAP SUMMARY:
==4099==     in use at exit: 0 bytes in 0 blocks
==4099==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==4099== 
==4099== All heap blocks were freed -- no leaks are possible
==4099== 
==4099== For lists of detected and suppressed errors, rerun with: -s
==4099== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I assume this indicates that at least one of the Valgrind findings is genuine.



Solution 1:[1]

You need to determine whether the reported errors are genuine. If there are real errors, it's quite plausible that the behaviour under Valgrind differs from a standalone run.

I see that sigevent_t has a _pad member. I don't know if that gets used internally, but memcheck might be complaining about it if it is not initialized.
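As a rough illustration of what "fully initialized" means for a sigevent, here is a minimal sketch. The helper name init_thread_sigevent is made up for this example and is not part of any library. It clears every byte of the structure, including padding and unused union members, before assigning the fields. This only helps if the bytes memcheck complains about live in your own structure; it cannot change what glibc does with its own stack variables inside mq_notify.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose struct sigevent and SIGEV_THREAD */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical helper (not a library function): prepares a SIGEV_THREAD
 * notification request. memset defines every byte of the structure,
 * padding included, before the individual fields are assigned. */
void init_thread_sigevent(struct sigevent *sev,
                          void (*callback)(union sigval),
                          void *context)
{
    assert(sev != NULL);
    memset(sev, 0, sizeof(*sev));
    sev->sigev_notify = SIGEV_THREAD;
    sev->sigev_notify_function = callback;
    sev->sigev_notify_attributes = NULL;
    sev->sigev_value.sival_ptr = context;
}
```

The timer code in the question declares struct sigevent SignalEvent without clearing it first, so a helper along these lines could be tried there as well.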

For the timedsend error, are all msg_len bytes of the contents of msg_ptr initialized?
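The trace says the uninitialised value was created on the stack in callbackStackHandler, so the bytes probably come from the object that notify() copies into the buffer with memcpy, not from the buffer itself. Compiler-inserted padding or fields that were never assigned would both show up this way. The sketch below uses a made-up struct status_msg (the real PisOperationStatus layout is not shown in the question) to illustrate defining every byte before the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message type; on most ABIs the compiler inserts padding
 * bytes between 'code' and 'value'. */
struct status_msg {
    char code;
    int  value;
};

/* Fills a message so that every byte, padding included, is defined
 * before the struct is copied anywhere. */
void fill_status_msg(struct status_msg *msg, char code, int value)
{
    assert(msg != NULL);
    memset(msg, 0, sizeof(*msg));  /* defines the padding bytes too */
    msg->code = code;
    msg->value = value;
}

/* Copies at most buf_size bytes of the message into buf and returns the
 * number of bytes copied, similar to the memcpy in the question. */
size_t pack_status_msg(char *buf, size_t buf_size,
                       const struct status_msg *msg)
{
    size_t n = sizeof(*msg) < buf_size ? sizeof(*msg) : buf_size;
    memcpy(buf, msg, n);
    return n;
}
```

Strictly speaking, C allows padding to take unspecified values after a member store, but in practice the memset keeps those bytes defined as far as memcheck is concerned. If the warning disappears after zeroing the source object, the report was about padding or an unset field.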

And finally, I see that timespec64 also has a pad field that could be causing the timer_settime64 error.
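To rule out the caller's side, the itimerspec can be cleared completely before the fields are set. This is only a sketch; fill_itimerspec is a name invented here, and the millisecond conversion simply mirrors the code in the question. Note that the trace places the uninitialised bytes in a stack frame of timer_settime itself, so if they belong to glibc's internal 64-bit copy, zeroing your own structure will probably not silence the message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose struct itimerspec */
#endif
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: builds a timer specification from a timeout in
 * milliseconds. All bytes are cleared first, so the interval is already
 * zero for one-shot timers and no padding is left undefined. */
void fill_itimerspec(struct itimerspec *spec, unsigned int timeout_ms,
                     int one_shot)
{
    assert(spec != NULL);
    memset(spec, 0, sizeof(*spec));
    spec->it_value.tv_sec = timeout_ms / 1000;
    spec->it_value.tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
    if (!one_shot) {
        /* periodic timer: repeat with the same period */
        spec->it_interval.tv_sec = spec->it_value.tv_sec;
        spec->it_interval.tv_nsec = spec->it_value.tv_nsec;
    }
}
```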

My feeling is that the padding-related errors are false positives and that Valgrind needs to be improved to avoid them.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
