Translating Screen Coordinates [ x, y ] to Camera Pan and Tilt angles
I have an IP camera that supports PTZ (pan, tilt, zoom). I am currently streaming its live feed into the browser and want to let the user click a point on the screen so that the camera pans and tilts until the clicked position becomes the center of the view.
My camera pans 360 degrees and tilts from -55 to 90 degrees.
Is there an algorithm that will help me achieve this?
Solution 1:[1]
Let's start by declaring a 3D coordinate system around the camera (the origin). I will use the following: The z-axis points upwards. The x-axis is the camera direction with pan=tilt=0 and positive pan angles will move the camera towards the positive y-axis.
Then, the transform for a given pan/tilt configuration is:
T = Rz(pan) * Ry(-tilt)
The tilt rotation is applied first and the pan rotation second. This assumes the usual PTZ construction, in which the pan axis stays vertical and the tilt axis turns together with the pan stage. T is the transform that positions our virtual image plane in 3D space. Let's keep that in mind and go to the image plane.
If we know the vertical and horizontal field of view and assume that lens distortions are already corrected, we can set up our image plane as follows: The image plane is 1 unit away from the camera (just by declaration) in the view direction. Let the center be the plane's local origin. Then, its horizontal extents are ± tan(fovx / 2) and its vertical extents are ± tan(fovy / 2).
Now, given a pixel position (x, y) in this image (origin in the top left corner), we first need to convert this location into a 3D direction. We start by calculating the local coordinates in the image plane. For an image of pixel width w and pixel height h, these are:
lx = (2 * x / w - 1) * tan(fovx / 2) (local x-axis points to the right)
ly = (-2 * y / h + 1) * tan(fovy / 2) (local y-axis points upwards)
lz = 1 (image plane is 1 unit away)
This is the ray that contains the pixel, expressed in the camera's own frame. To express it in the global coordinate system under the assumption that there is no pan or tilt yet, map the local axes onto the global ones: the view direction lz lies along x, image-up ly lies along z and, assuming a right-handed system, image-right lx lies along -y:
dx = lz
dy = -lx
dz = ly
But now it is time to get rid of the no-pan, no-tilt assumption. That's where our initial transform comes into play. We just need to transform this ray with T, where pan and tilt are the camera's current angles:
tx = cos(pan) * cos(tilt) * dx - sin(pan) * dy - cos(pan) * sin(tilt) * dz
ty = sin(pan) * cos(tilt) * dx + cos(pan) * dy - sin(pan) * sin(tilt) * dz
tz = sin(tilt) * dx + cos(tilt) * dz
The resulting direction now describes the ray that contains the specified pixel in the global coordinate system that we set up in the beginning. All that's left is to calculate the new pan/tilt parameters:
pan = atan2(ty, tx)
tilt = atan2(tz, sqrt(tx^2 + ty^2))
As a sanity check, the image center (lx = ly = 0) gives (tx, ty, tz) = (cos(pan) * cos(tilt), sin(pan) * cos(tilt), sin(tilt)), so the formulas return the current pan and tilt unchanged.
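The whole chain, from clicked pixel to new pan/tilt angles, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a drop-in implementation for any particular camera. It assumes that all angles are in radians, that fovx and fovy are known for the current zoom level, that the tilt axis turns with the pan stage (so the ray is rotated by Rz(pan) * Ry(-tilt)), and that image-right maps to -y and image-up to +z in a right-handed system. The function name `click_to_pan_tilt` and its parameter names are hypothetical, and the numbers in the usage lines are made up.

```python
import math


def click_to_pan_tilt(x, y, w, h, fovx, fovy, pan, tilt):
    """Return (new_pan, new_tilt) in radians that would center pixel (x, y).

    x, y       : clicked pixel, origin in the top left corner
    w, h       : image width and height in pixels
    fovx, fovy : horizontal and vertical field of view in radians
    pan, tilt  : current camera angles in radians
    """
    # Local coordinates on an image plane placed 1 unit in front of the camera.
    lx = (2.0 * x / w - 1.0) * math.tan(fovx / 2.0)   # points right
    ly = (-2.0 * y / h + 1.0) * math.tan(fovy / 2.0)  # points up
    lz = 1.0                                          # view direction

    # Assumed axis mapping at pan = tilt = 0:
    # view direction -> +x, image-right -> -y, image-up -> +z.
    dx, dy, dz = lz, -lx, ly

    cp, sp = math.cos(pan), math.sin(pan)
    ct, st = math.cos(tilt), math.sin(tilt)

    # Rotate the ray by Rz(pan) * Ry(-tilt): tilt first, then pan (assumed order).
    tx = cp * ct * dx - sp * dy - cp * st * dz
    ty = sp * ct * dx + cp * dy - sp * st * dz
    tz = st * dx + ct * dz

    # Read the new angles back from the global direction.
    new_pan = math.atan2(ty, tx)
    new_tilt = math.atan2(tz, math.hypot(tx, ty))
    return new_pan, new_tilt


if __name__ == "__main__":
    # Illustrative values only: a 1920x1080 stream with a 60 x 35 degree field of view.
    new_pan, new_tilt = click_to_pan_tilt(
        x=1440, y=270, w=1920, h=1080,
        fovx=math.radians(60), fovy=math.radians(35),
        pan=math.radians(30), tilt=math.radians(10),
    )
    print(math.degrees(new_pan), math.degrees(new_tilt))
```

Before sending the result to the camera, it would probably need to be converted to the units the camera expects and the tilt clamped to its mechanical range (-55 to 90 degrees in the question). If the camera counts positive pan in the opposite direction, the sign of `dy` has to be flipped.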
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Nico Schertler |