Apply affine transformation to value

I use TF.js to run a keypoint prediction model on an input image in the browser, and I'd like to apply an affine transformation to the value of every keypoint using TF.js with the WebGL backend.

For the value of every keypoint, I'd like to apply translation, scaling and rotation.


Input

As a result of model prediction, I have a tensor of shape [n, 2], where each row is the [x, y] position of a keypoint in pixels.

My tensor

inputTensor.print();

> Tensor
    [[103.9713821, 128.1083069], // <- [x, y]
     [103.7512436, 107.0477371],
     [103.3587036, 115.1293793],
     [99.65448   , 92.0794601 ],
     [103.9862061, 101.7136688],
     [104.2239304, 95.8158569 ],
     [104.6783295, 82.7580566 ]]

Formula

I see tf.image.transform uses the following formula to compute the pixel position.

(x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k) where k = c0 x + c1 y + 1.

I have values for [a0, a1, a2, b0, b1, b2, c0, c1], so it seems I only need a way to apply this formula to every (x, y) pair in my tensor.
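
As a plain reference for what the formula computes per point, here is a minimal TypeScript sketch. It assumes the keypoints are laid out as a flat [x0, y0, x1, y1, ...] array (the layout dataSync() returns for an [n, 2] tensor); applyTransform and Transform8 are hypothetical names introduced only for illustration.

```typescript
// Coefficients in the order used in the question: [a0, a1, a2, b0, b1, b2, c0, c1].
type Transform8 = [number, number, number, number, number, number, number, number];

// Applies (x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k),
// with k = c0 x + c1 y + 1, to every [x, y] pair of a flat array.
function applyTransform(points: Float32Array, t: Transform8): Float32Array {
  const [a0, a1, a2, b0, b1, b2, c0, c1] = t;
  const out = new Float32Array(points.length);
  for (let i = 0; i + 1 < points.length; i += 2) {
    const x = points[i];
    const y = points[i + 1];
    const k = c0 * x + c1 * y + 1;
    out[i] = (a0 * x + a1 * y + a2) / k;
    out[i + 1] = (b0 * x + b1 * y + b2) / k;
  }
  return out;
}
```

When c0 = c1 = 0 the divisor k is always 1, so the formula reduces to a purely affine map.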


CPU Example (I need it on TF.js)

I've tried doing the transformation on the CPU using THREE.js. It works, but it is too slow. I hope it gives you an idea of what I expect.

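import { Matrix4, Vector3 } from 'three';

// dataSync() is typed as a union of typed arrays, so narrow it explicitly.
const landmarks = inputTensor.dataSync() as Float32Array;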

const output: Point3D[] = [];

for (let i = 0; i < landmarks.length - 1; i += 2) {
    const x = landmarks[i];
    const y = landmarks[i + 1];


    const mat4 = new Matrix4();
    mat4.identity();
    
    // Fill in with the basic values
    mat4.multiply(new Matrix4().makeTranslation(x, y, 0));

    // Scale 
    mat4.multiply(
        new Matrix4().makeScale(
            1 / scaleX,
            1 / scaleY,
            1,
        ),
    );
    // Rotate
    mat4.multiply(new Matrix4().makeRotationZ(rotate));
    // Translate
    mat4.multiply(
        new Matrix4().makeTranslation(
            translateX,
            translateY,
            0,
        ),
    );

    const p = new Vector3(x, y, 0).applyMatrix4(mat4);
    output.push(new Point3D(p.x, p.y, p.z));
}
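
The scale, rotation and translation do not depend on the point, so they can be folded into the affine coefficients once instead of rebuilding matrices inside the loop. The sketch below is an illustration under assumptions: it uses the same factor order as the loop above (scale · rotation · translation) but leaves out the per-point makeTranslation(x, y, 0), and composeAffine is a hypothetical helper name. It returns the coefficients in the [a0, a1, a2, b0, b1, b2, c0, c1] order of the formula quoted earlier.

```typescript
// Folds p' = S * R * (p + t) into [a0, a1, a2, b0, b1, b2, c0, c1].
// S scales by (1 / scaleX, 1 / scaleY), R rotates counterclockwise by `rotate`
// radians about the origin, t = (translateX, translateY).
function composeAffine(
  scaleX: number,
  scaleY: number,
  rotate: number,
  translateX: number,
  translateY: number,
): number[] {
  const sx = 1 / scaleX;
  const sy = 1 / scaleY;
  const cos = Math.cos(rotate);
  const sin = Math.sin(rotate);
  return [
    sx * cos,                                   // a0
    -sx * sin,                                  // a1
    sx * (cos * translateX - sin * translateY), // a2
    sy * sin,                                   // b0
    sy * cos,                                   // b1
    sy * (sin * translateX + cos * translateY), // b2
    0,                                          // c0 (no perspective term)
    0,                                          // c1
  ];
}
```

If the intended order of operations differs (for example, scaling before translating), the factors have to be multiplied in that order instead; only the a and b entries change.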


Note

As far as I see tf.image.transform doesn't work for me since it operates with the position of the element, but I need to operate with the value.



Solution 1:[1]

This is easy to do, but running a large matrix multiplication for every single point takes time. You can reduce the cost by applying it only when the input changes, at a lower refresh rate, or within a limited scope. Identity matrices are a faster way to determine how much of the input picture has changed; you are on the right track.

[ Example ]:

import numpy as np
import tensorflow as tf
# Crop the same-sized box from `picture` (current offsets) and from `char_1` (previous offsets).
y1 = tf.keras.layers.Cropping2D(cropping=((start_y, pic_height - box_height - start_y), (start, pic_width - box_width - start)))(picture)
target_1 = tf.keras.layers.Cropping2D(cropping=((previous_start_y, pic_height - box_height - previous_start_y), (previous_start, pic_width - box_width - previous_start)))(char_1)
# Mask is 1.0 where the current crop is >= the previous crop, else 0.0; then keep only those pixels of y1.
temp_3 = tf.where(tf.math.greater_equal(np.asarray(y1, dtype=np.float32), np.asarray(target_1, dtype=np.float32)), [1.0], [0.0]).numpy()
temp_3 = tf.math.multiply(temp_3, y1)
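
For the keypoint case in the question, the per-point cost can also be avoided by transforming all points with a single matrix product. The following is a minimal TF.js sketch, not a definitive implementation: it assumes an [n, 2] keypoint tensor and the eight coefficients [a0, a1, a2, b0, b1, b2, c0, c1] from the question, and transformKeypoints is a hypothetical helper name, not part of TF.js.

```typescript
import * as tf from '@tensorflow/tfjs';

// keypoints: [n, 2] tensor of [x, y] rows.
// t: [a0, a1, a2, b0, b1, b2, c0, c1], as in the formula from the question.
function transformKeypoints(keypoints: tf.Tensor2D, t: number[]): tf.Tensor2D {
  return tf.tidy(() => {
    const [a0, a1, a2, b0, b1, b2, c0, c1] = t;
    const m = tf.tensor2d([
      [a0, a1, a2],
      [b0, b1, b2],
      [c0, c1, 1],
    ]);
    const n = keypoints.shape[0];
    // Homogeneous coordinates: append a column of ones, giving shape [n, 3].
    const homogeneous = tf.concat([keypoints, tf.ones([n, 1])], 1);
    // One product for all points (m is transposed): each row becomes
    // [a0 x + a1 y + a2, b0 x + b1 y + b2, k].
    const projected = tf.matMul(homogeneous, m, false, true);
    const xy = tf.slice(projected, [0, 0], [n, 2]);
    const k = tf.slice(projected, [0, 2], [n, 1]);
    // Broadcast-divide each [x', y'] row by its own k.
    return tf.div(xy, k) as tf.Tensor2D;
  });
}
```

The result stays on the active backend, so with WebGL the values are only downloaded when data() or dataSync() is called on the returned tensor.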

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martijn Pieters