Getting real world coordinates from image frame


== Why do we need this part? ==
Our purpose was to convert items on the image to real-world coordinates, i.e. we wanted to know an item's placement relative to the robot's placement. This is necessary for the robot to understand where objects are relative to it and, looking at the bigger picture, to know where it is on the soccer field.


== Pinhole camera model ==
For this, we used the pinhole camera model. I am not going to describe all the theory behind it; here is what the pinhole model does in one formula:
<math>s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}</math>

Here <math>(X, Y, Z)</math> is a point in the world, <math>(u, v)</math> is its pixel position in the image, the first matrix is the camera matrix (intrinsic parameters <math>f_x, f_y, c_x, c_y</math>), the second is the rotation-translation matrix <math>[R|t]</math> (extrinsic parameters), and <math>s</math> is a scale factor.


To learn about this model there are good enough resources available (start with [http://en.wikipedia.org/wiki/Pinhole_camera_model wiki] and [https://www.youtube.com/watch?v=uhP3jrxraMk udacity]); here I will focus more on the overall idea, the troubles we had and the tools we used.
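
To make the formula concrete, here is a minimal sketch (the numbers are made up for illustration, not from a real calibration) that carries out the multiplication for a single world point using OpenCV's C++ matrix types. OpenCV's projectPoints() does the same for whole point lists, including lens distortion.

<syntaxhighlight lang="cpp">
#include <opencv2/core/core.hpp>
#include <cstdio>

int main() {
    // Camera matrix A: focal lengths fx, fy and principal point cx, cy
    cv::Matx33d A(700,   0, 320,
                    0, 700, 240,
                    0,   0,   1);
    // Extrinsics [R|t]: identity rotation, world origin 1 m in front of the camera
    cv::Matx34d Rt(1, 0, 0, 0,
                   0, 1, 0, 0,
                   0, 0, 1, 1);
    // An example world point in homogeneous coordinates (X, Y, Z, 1)
    cv::Vec4d M(0.5, 0.0, 2.0, 1.0);

    cv::Vec3d m = A * (Rt * M);              // s * (u, v, 1)
    double u = m[0] / m[2], v = m[1] / m[2]; // divide out the scale s
    std::printf("pixel: (%.1f, %.1f)\n", u, v);
    return 0;
}
</syntaxhighlight>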
 
== How to map coordinates from 3D to 2D? ==
To be able to transform between the two coordinate systems we need to know the camera's intrinsic and extrinsic parameters. The former describe how any real-world object arrives at the camera's light sensor: they consist of parameters such as the camera's focal length, principal point and skew of the image axes. The latter give information about the camera's pose in the observed environment (3 rotations and 3 translations, as we live in a 3-dimensional world). To convert a 2-dimensional point into the 3-dimensional world we also need the extra assumption that the objects of interest lie on a plane that we determine.
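
One way to see why the plane assumption helps: if we choose the world coordinates so that the plane of interest is <math>Z = 0</math> (for us, the field the robot drives on), the third column of the rotation matrix drops out and the projection collapses into a single 3×3 matrix,

<math>s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = A \begin{bmatrix} r_{11} & r_{12} & t_1 \\ r_{21} & r_{22} & t_2 \\ r_{31} & r_{32} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} = H \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix},</math>

and as long as <math>H</math> is invertible (the camera does not lie in the plane itself), every pixel corresponds to exactly one point on that plane. This is what makes the 2D -> 3D direction below possible at all.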


How did we get the camera parameters and pose? We based our calibration system on the OpenCV implementations for finding all of these parameters. We didn't see any need to make anything top-notch in terms of speed, because the parameters only have to be found once and can then be reused indefinitely. OpenCV has functions specially designed for finding the camera matrix (intrinsic parameters) and the rotation-translation matrix (extrinsic parameters): see the [http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html documentation] for calibrateCamera(), findChessboardCorners(), drawChessboardCorners() and projectPoints(); the [http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html tutorial] might also be useful. If you are interested in the algorithms that make it all work, read the documentation or the source code.
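
For reference, a minimal sketch of such a calibration loop, assuming a chessboard with 9×6 inner corners, 25 mm squares and hypothetical image file names (the actual setup may differ):

<syntaxhighlight lang="cpp">
#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main() {
    const cv::Size patternSize(9, 6);  // inner corners of the chessboard (assumed)
    const float squareSize = 0.025f;   // square edge in metres (assumed)

    // Known 3D layout of the corners; the board itself defines the plane Z = 0
    std::vector<cv::Point3f> boardPoints;
    for (int y = 0; y < patternSize.height; ++y)
        for (int x = 0; x < patternSize.width; ++x)
            boardPoints.push_back(cv::Point3f(x * squareSize, y * squareSize, 0));

    std::vector<std::vector<cv::Point3f> > objectPoints;
    std::vector<std::vector<cv::Point2f> > imagePoints;
    cv::Size imageSize;

    // Hypothetical file names; in practice, frames grabbed from the camera
    for (int i = 0; i < 10; ++i) {
        cv::Mat img = cv::imread(cv::format("chessboard_%02d.png", i),
                                 cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;
        imageSize = img.size();
        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(img, patternSize, corners)) {
            objectPoints.push_back(boardPoints);
            imagePoints.push_back(corners);
        }
    }

    // Intrinsics (camera matrix, distortion) and per-view extrinsics (rvec, tvec)
    cv::Mat cameraMatrix, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                     cameraMatrix, distCoeffs, rvecs, tvecs);
    std::printf("RMS reprojection error: %.3f px\n", rms);
    return 0;
}
</syntaxhighlight>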
== How to map 2D point to 3D? ==
 
After collecting all these winnings in the form of parameters, we were able to project real-world 3D points onto the image plane. But as this wasn't our goal (we wanted 2D -> 3D), we had to keep going. There was a bit of chaos and many "wasted" days spent reversing this operation. We had problems with inverting matrices: OpenCV's Mat::inv() didn't give the right results, and a matrix pseudo-inverse didn't seem to work either – probably those matrices weren't invertible (the combined projection matrix is 3×4, i.e. not square, which alone rules out an ordinary inverse).


TODO: Dig deeper, what was the problem with not being able to invert those matrices.


In the end we solved the equations with [http://en.wikipedia.org/wiki/Cramer's_rule Cramer's rule].
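
Roughly, that computation looks like the sketch below (not our exact code; <code>A</code>, <code>R</code> and <code>t</code> stand for the calibrated camera matrix, rotation and translation, with <code>R</code> obtained from the calibration's rvec via cv::Rodrigues()). With the plane <math>Z = 0</math> assumption from above, the unknowns are <math>X</math>, <math>Y</math> and the scale <math>s</math>, which gives a 3×3 linear system that Cramer's rule solves directly:

<syntaxhighlight lang="cpp">
#include <opencv2/core/core.hpp>

// 3x3 determinant written out, so the Cramer's rule steps stay explicit
static double det3(const cv::Matx33d& M) {
    return M(0,0) * (M(1,1) * M(2,2) - M(1,2) * M(2,1))
         - M(0,1) * (M(1,0) * M(2,2) - M(1,2) * M(2,0))
         + M(0,2) * (M(1,0) * M(2,1) - M(1,1) * M(2,0));
}

// Map a pixel (u, v) back to the world plane Z = 0.
static cv::Point2d pixelToPlane(const cv::Matx33d& A, const cv::Matx33d& R,
                                const cv::Vec3d& t, double u, double v) {
    // Homography for the plane Z = 0: H = A * [r1 r2 t]
    cv::Matx33d H = A * cv::Matx33d(R(0,0), R(0,1), t[0],
                                    R(1,0), R(1,1), t[1],
                                    R(2,0), R(2,1), t[2]);
    // H * (X, Y, 1)^T = s * (u, v, 1)^T rearranged as M * (X, Y, s)^T = b:
    //   [h00 h01 -u] [X]   [-h02]
    //   [h10 h11 -v] [Y] = [-h12]
    //   [h20 h21 -1] [s]   [-h22]
    cv::Matx33d M(H(0,0), H(0,1), -u,
                  H(1,0), H(1,1), -v,
                  H(2,0), H(2,1), -1);
    cv::Vec3d b(-H(0,2), -H(1,2), -H(2,2));
    double D = det3(M);  // zero only in degenerate poses (camera in the plane)
    // Cramer's rule: replace column 0 (for X) or column 1 (for Y) with b
    cv::Matx33d Mx(b[0], M(0,1), M(0,2),
                   b[1], M(1,1), M(1,2),
                   b[2], M(2,1), M(2,2));
    cv::Matx33d My(M(0,0), b[0], M(0,2),
                   M(1,0), b[1], M(1,2),
                   M(2,0), b[2], M(2,2));
    return cv::Point2d(det3(Mx) / D, det3(My) / D);
}
</syntaxhighlight>

Calling pixelToPlane() with the pixel of a detected object then gives its (X, Y) position on the field plane, in the same units that were used during calibration.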
== Performance ==
TODO:
