Anand K Subramanian

#computer-vision #geometric-projection #image-processing #math #torch #code #camera

6 May 2025

3 mins

Estimating Camera Parameters from Depth Maps

Warping images based on depth and camera viewpoints for novel view synthesis

While reading through the DUSt3R paper, I came across an interesting algorithm.

How do you estimate the camera intrinsic parameters? Can it be posed as an optimization problem?

Recall that the intrinsic camera parameters form the matrix $K \in \mathbb{R}^{3 \times 3}$ given by

$$ K = \begin{pmatrix} f_x & s & x_0 \\ 0 & f_y & y_0 \\ 0 & 0 & 1 \end{pmatrix} $$

where $f_x, f_y$ are the focal lengths along the $x$ and $y$ axes of the image respectively, $s$ is the axis skew, and $x_0, y_0$ are the principal point offsets from the image plane's origin. The principal point is the point where the ray from the camera center, perpendicular to the image plane, intersects the image plane.

It is standard practice to assume that the camera has no skew ($s = 0$) and is calibrated so that the principal point lies at the center of the image plane, i.e. $x_0 = \frac{W}{2}, y_0 = \frac{H}{2}$. Furthermore, we assume that the pixels are (approximately) square, i.e. $f_x = f_y = f$. Thus, we only need to estimate the focal length $f$ to get the complete intrinsic matrix.
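As a minimal sketch of these assumptions, the intrinsic matrix can be assembled from a single focal length and the image size. The function name `intrinsics_from_focal` and the example values are hypothetical, chosen only for illustration.

```python
import numpy as np


def intrinsics_from_focal(f: float, width: int, height: int) -> np.ndarray:
    """Build K assuming zero skew, square pixels (f_x = f_y = f),
    and a principal point at the image center (W/2, H/2)."""
    return np.array([
        [f,   0.0, width / 2.0],
        [0.0, f,   height / 2.0],
        [0.0, 0.0, 1.0],
    ])


# Illustrative values only: a 640x480 image with a focal length of 500 pixels.
K = intrinsics_from_focal(500.0, width=640, height=480)
print(K)
```

With this parameterization, the focal length is the only unknown left in $K$.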

The point map $P \in \mathbb{R}^{W \times H \times 3}$ of the given image $I \in \mathbb{R}^{W \times H \times 3}$ and corresponding depth map $D \in \mathbb{R}^{W \times H}$ can be computed in a straightforward manner as

$$ P_{ij} = K^{-1} D_{ij} \begin{pmatrix} i \\ j \\ 1 \end{pmatrix} $$
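The unprojection above might be sketched in NumPy as follows. The function name `depth_to_pointmap` is hypothetical, and the arrays are indexed as `[i, j]` with shape `(W, H)` to match the notation here; most image libraries store arrays as `(H, W)`, so a transpose may be needed in practice.

```python
import numpy as np


def depth_to_pointmap(depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Unproject a (W, H) depth map into a (W, H, 3) point map,
    following P_ij = D_ij * K^{-1} (i, j, 1)^T."""
    W, H = depth.shape
    # Pixel index grids, each of shape (W, H), indexed as [i, j].
    i, j = np.meshgrid(np.arange(W), np.arange(H), indexing="ij")
    # Homogeneous pixel coordinates (i, j, 1) for every pixel: (W, H, 3).
    pixels = np.stack([i, j, np.ones_like(i)], axis=-1).astype(np.float64)
    # Apply K^{-1} to every pixel; for row vectors, p^T K^{-T} = (K^{-1} p)^T.
    rays = pixels @ np.linalg.inv(K).T
    # Scale each ray by its depth so that the z-coordinate equals D_ij.
    return depth[..., None] * rays
```

Because the last row of $K$ is $(0, 0, 1)$, the $z$-coordinate of every point equals the corresponding depth value.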

Given a point map $P$ (which is naturally expressed in the coordinate frame of its corresponding image $I$), the focal length $f$ can be estimated by solving the following minimization problem

$$ f^* = \operatorname*{arg\,min}_f \sum_{i=0}^{W} \sum_{j=0}^{H} C_{ij} \left\| (i', j') - f \left( \frac{P_{ijx}}{P_{ijz}}, \frac{P_{ijy}}{P_{ijz}} \right) \right\| $$

where $(i', j') = \left( i - \frac{W}{2}, j - \frac{H}{2} \right)$ are the pixel coordinates with respect to the image center (principal point), $\left( \frac{P_{ijx}}{P_{ijz}}, \frac{P_{ijy}}{P_{ijz}} \right)$ is the projection of the $x, y$-coordinates of the 3D point map onto the image plane at the index $(i, j)$, and $C_{ij}$ is the confidence map associated with the estimated depth map $D$. When the original (metric) depth map is known, $C_{ij} = 1 \; \forall i, j$.

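The above minimization finds the focal length that minimizes the reprojection error, i.e. the distance between each centered pixel location and the projection of its 3D point scaled by $f$.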
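One possible way to solve this minimization is iteratively reweighted least squares in the style of the Weiszfeld algorithm: start from the closed-form solution of the squared-norm version, then repeatedly re-solve with each pixel weighted by $C_{ij}$ divided by its current residual. The sketch below assumes this solver; the function name `estimate_focal`, the iteration count, and the `eps` clamp are hypothetical choices, not something prescribed by the text above.

```python
import numpy as np


def estimate_focal(pointmap, conf=None, n_iter=10, eps=1e-8):
    """Estimate f from a (W, H, 3) point map, assuming the principal
    point is at the image center. Weiszfeld-style reweighting (assumed solver)."""
    W, H, _ = pointmap.shape
    i, j = np.meshgrid(np.arange(W), np.arange(H), indexing="ij")
    # Centered pixel coordinates (i', j') = (i - W/2, j - H/2), flattened to (N, 2).
    pix = np.stack([i - W / 2.0, j - H / 2.0], axis=-1).reshape(-1, 2)
    pts = pointmap.reshape(-1, 3)
    c = np.ones(len(pts)) if conf is None else np.asarray(conf, float).reshape(-1)

    # Keep only points in front of the camera to avoid dividing by zero.
    valid = pts[:, 2] > eps
    pix, pts, c = pix[valid], pts[valid], c[valid]
    xy = pts[:, :2] / pts[:, 2:3]  # (P_x / P_z, P_y / P_z)

    dot = (pix * xy).sum(axis=-1)  # <(i', j'), xy> per pixel
    sq = (xy ** 2).sum(axis=-1)    # ||xy||^2 per pixel

    # Initialization: weighted least-squares solution (squared norm).
    f = (c * dot).sum() / (c * sq).sum()
    for _ in range(n_iter):
        # Reweight each pixel by confidence over its current residual norm.
        resid = np.linalg.norm(pix - f * xy, axis=-1)
        w = c / np.maximum(resid, eps)
        f = (w * dot).sum() / (w * sq).sum()
    return float(f)


# Synthetic check: build a point map from a known focal length and recover it.
W, H, f_true = 8, 6, 50.0
ii, jj = np.meshgrid(np.arange(W), np.arange(H), indexing="ij")
depth = 1.0 + 0.1 * (ii + jj)
pointmap = np.stack(
    [(ii - W / 2.0) * depth / f_true, (jj - H / 2.0) * depth / f_true, depth],
    axis=-1,
)
f_est = estimate_focal(pointmap)
print(f_est)
```

Weighting by the inverse residual targets the unsquared norm in the objective, which makes the estimate less sensitive to outlier points than a plain least-squares fit.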

Figure: Projecting 3D point maps back to the image plane for estimating focal length.

© 2025 Anand K Subramanian