OpenCL clBuildProgram() access violation exception

I'm getting a strange error when building an OpenCL kernel. It happens when I call clBuildProgram():

err = clBuildProgram(program, 1, &ocl->device, "", NULL, NULL);

My process starts using more and more memory until it reaches 13 GB (normally it uses about 400 MB), and then it fails with:

"0xC0000005: Access violation executing location"


The weird part is that this happens only when I use the integrated GPU, an Intel HD 4000. If I choose another device, such as the GTX 960 or the CPU, it works fine.

Another strange thing is that if the kernel contains a syntax error, clBuildProgram() returns normally and reports the compilation error. The crash happens only when the code has no mistakes. Also, if I comment out part of my code, the build succeeds.

This is my function:

__kernel void update(__global struct PhysicsComponent_ocl_t* vecPhy, __constant struct BoxCollider_ocl_t* vecBx, __constant ulong* vecIdx, __constant float* deltaTime) {
    unsigned int i = get_global_id(0);
    unsigned int j = get_global_id(1);

    if (j > i) { //From size_t j = i + 1; i < vec.size()... 
        //Copy data to local memory to avoid race conditions
        struct AuxPhy_ocl_t phy1;
        copyPhyGL(&vecPhy[vecIdx[i]], &phy1);

        struct AuxPhy_ocl_t phy2;
        copyPhyGL(&vecPhy[vecIdx[j]], &phy2);

        if (collide(&phy1, &phy2, &vecBx[i], &vecBx[j])) {
            ////Check speed correction for obj 1
            struct mivec3_t speed1 = phy1.speed;
            struct mivec3_t speed2 = phy2.speed;

            modifySpeedAndVelocityOnCollision(&phy1, &phy2, &vecBx[i], &vecBx[j], *deltaTime);         //Check both objects, which is why the parameters are swapped
            modifySpeedAndVelocityOnCollision(&phy2, &phy1, &vecBx[j], &vecBx[i], *deltaTime);

            //Make the objects not move
            struct mivec3_t auxSub;
            multiplyVectorByScalarLL(&speed1, *deltaTime, &auxSub);
            substractVectorsLL(&phy1.position, &auxSub, &phy1.position);

            multiplyVectorByScalarLL(&speed2, *deltaTime, &auxSub);
            substractVectorsLL(&phy2.position, &auxSub, &phy2.position);

            //Copy data back to global
            copyPhyLG(&phy1, &vecPhy[vecIdx[i]]);
            copyPhyLG(&phy2, &vecPhy[vecIdx[j]]);
        }
    }
}

For example, if I comment out the last two function calls, the program builds:

//Copy data back to global
//copyPhyLG(&phy1, &vecPhy[vecIdx[i]]);
//copyPhyLG(&phy2, &vecPhy[vecIdx[j]]);

But those calls are not the cause, because if I keep them and comment out part of the kernel body instead, it also works:

__kernel void update(__global struct PhysicsComponent_ocl_t* vecPhy, __constant struct BoxCollider_ocl_t* vecBx, __constant ulong* vecIdx, __constant float* deltaTime) {
    unsigned int i = get_global_id(0);
    unsigned int j = get_global_id(1);

    if (j > i) { //From size_t j = i + 1; i < vec.size()... 
        //Copy data to local memory to avoid race conditions
        struct AuxPhy_ocl_t phy1;
        copyPhyGL(&vecPhy[vecIdx[i]], &phy1);

        struct AuxPhy_ocl_t phy2;
        copyPhyGL(&vecPhy[vecIdx[j]], &phy2);

        //Removed code was here

        copyPhyLG(&phy1, &vecPhy[vecIdx[i]]);
        copyPhyLG(&phy2, &vecPhy[vecIdx[j]]);
    }
}

This baffles me. The only explanation I can think of is that the code takes up too much space.

Here is the complete kernel code.



Solution 1:[1]

I ran into a similar problem, and in my case the cause was an infinite loop in one of my kernels. My guess is that the compiler tried to unroll or otherwise optimize it without checking the bounds.

To validate my hypothesis, I built my OpenCL program with optimizations turned off:

    int err = program.build("-cl-opt-disable");

and the build succeeded as I expected.

When you introduce a syntax error, compilation stops early and never reaches the optimization stage where the compiler bug resides.

The compilers for the other devices don't have this bug. They will give you back an executable that you can run, but it probably won't terminate (correctly).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 scrappy