Vol. 90, pp. 9795–9802, November 1993 | CARLO TOMASI AND TAKEO KANADE
The paper presents a factorization method for recovering scene geometry and camera motion from a stream of images, a problem that becomes ill-conditioned when objects are distant relative to their size. The method applies the singular value decomposition (SVD) to a $2F \times P$ measurement matrix that holds the image coordinates of $P$ points tracked through $F$ frames. Under orthographic projection, this matrix has rank 3, so it can be factored into two matrices: one representing camera rotation and the other representing object shape.
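The rank-3 factorization described above can be illustrated with a short NumPy sketch. It is not the authors' implementation. The function names `factorize` and `synthetic_measurements`, the choice to stack all x rows above all y rows, and the step of subtracting each row's mean to remove translation are assumptions made for this example. The sketch also stops at a factorization that is determined only up to an invertible $3 \times 3$ transformation; the further step of enforcing orthonormal rotation rows is omitted.

```python
import numpy as np


def factorize(W):
    """Factor a (2F, P) measurement matrix into rank-3 motion and shape.

    Returns M_hat (2F, 3), S_hat (3, P) and all singular values.
    M_hat and S_hat are defined only up to an invertible 3x3 matrix.
    """
    # Remove the per-row mean (the image of the point centroid).
    W_reg = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # Keep the three dominant singular triplets and split their weight evenly.
    root = np.sqrt(s[:3])
    M_hat = U[:, :3] * root
    S_hat = root[:, None] * Vt[:3]
    return M_hat, S_hat, s


def synthetic_measurements(F=20, P=30, seed=0):
    """Noise-free orthographic projections of P random points in F frames."""
    rng = np.random.default_rng(seed)
    S = rng.normal(size=(3, P))  # 3D points, one per column
    rows_x, rows_y = [], []
    for _ in range(F):
        # Two orthonormal rows play the role of the camera's image axes.
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        rows_x.append(Q[0])
        rows_y.append(Q[1])
    M = np.vstack(rows_x + rows_y)  # (2F, 3): x rows, then y rows
    t = rng.normal(size=(2 * F, 1))  # per-row image translation
    return M @ S + t
```

On noise-free synthetic data such as that produced by `synthetic_measurements`, the fourth and later singular values returned by `factorize` should be negligible compared with the first three, and `M_hat @ S_hat` should reproduce the mean-subtracted measurements.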
When occlusions or tracking failures leave the measurement matrix only partially filled in, the method iteratively grows a partial solution into a full one. Experiments in both laboratory and outdoor environments demonstrate the method's accuracy and robustness, and show that it does not introduce smoothing in either shape or motion. The rank theorem, which underpins the method, captures the redundancy in image sequences and enables efficient processing of large datasets. The method is particularly effective for image streams sampled at short intervals, which also makes feature tracking easier. The paper also discusses the relationship between the factorization method and Ullman's earlier work on structure from motion, highlighting the method's theoretical foundation and practical applications.
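One way to picture a single step of growing a partial solution is sketched below. This is an assumed illustration of how the rank-3 constraint can predict missing entries, not the paper's exact procedure. The function name `fill_missing_point` and its arguments are hypothetical. The sketch assumes a fully observed block of other points is available, that the data are noise-free or nearly so, and that the partially tracked point is visible in enough rows to determine its three shape coordinates.

```python
import numpy as np


def fill_missing_point(W_full, w_partial, visible):
    """Predict the unobserved image coordinates of one tracked point.

    W_full    : (2F, P) block of points observed in every frame.
    w_partial : (2F,) coordinates of one extra point; entries where
                `visible` is False are ignored.
    visible   : (2F,) boolean mask of the rows where the point was tracked.
    """
    # Estimate translation as the per-row mean of the fully observed block.
    t = W_full.mean(axis=1)
    # Rank-3 motion estimate from the fully observed block.
    U, s, _ = np.linalg.svd(W_full - t[:, None], full_matrices=False)
    M_hat = U[:, :3] * np.sqrt(s[:3])
    # Least-squares shape of the extra point from the rows where it is seen.
    s_p, *_ = np.linalg.lstsq(
        M_hat[visible], w_partial[visible] - t[visible], rcond=None
    )
    # Reproject the point into the rows where it was not observed.
    filled = w_partial.copy()
    filled[~visible] = M_hat[~visible] @ s_p + t[~visible]
    return filled
```

Because the point's shape is solved in the same coordinate frame as `M_hat`, the prediction does not depend on the invertible $3 \times 3$ ambiguity of the factorization. Repeating such a step for further points or frames is one plausible way to extend a partial solution.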