Beyond Pixels: Exploring New Representations and Applications for Motion Analysis by Ce Liu
This thesis explores new representations and applications for motion analysis beyond the pixel level. The focus is on analyzing motion from video sequences, proposing new representations such as layers and contours, and developing new algorithms for motion analysis. The thesis also introduces SIFT flow, a method for dense correspondence across scenes, and a nonparametric scene parsing system based on dense scene alignment.
The thesis begins with a discussion of motion analysis, which traditionally estimates a flow vector for every pixel by matching intensities. This approach has limitations: it models the grouping relationship among pixels poorly, and ground-truth data for evaluating it is lacking. To address these issues, the thesis proposes a human-assisted motion annotation system in which users specify layer configurations and motion hints. The system makes it possible to obtain ground-truth motion data for natural video sequences, which had been missing from the literature.
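To make the traditional pixel-wise formulation concrete, the following is a minimal sketch of flow estimation by intensity matching. It is not the thesis's algorithm: the function name `estimate_flow`, the brute-force search, the patch size, and the sum-of-absolute-differences cost are all assumptions chosen for illustration.

```python
def estimate_flow(frame0, frame1, radius=1, patch=1):
    """Brute-force per-pixel flow: for every pixel of frame0, pick the
    displacement (dy, dx) whose intensity patch in frame1 matches best.
    Illustrative only; names and parameters are hypothetical."""
    h, w = len(frame0), len(frame0[0])

    def clamp(v, hi):
        return min(max(v, 0), hi)

    def patch_cost(y, x, dy, dx):
        # Sum of absolute intensity differences over a small window,
        # with coordinates clamped at the image border.
        cost = 0
        for py in range(-patch, patch + 1):
            for px in range(-patch, patch + 1):
                a = frame0[clamp(y + py, h - 1)][clamp(x + px, w - 1)]
                b = frame1[clamp(y + py + dy, h - 1)][clamp(x + px + dx, w - 1)]
                cost += abs(a - b)
        return cost

    # Candidate displacements, smallest first, so ties resolve to less motion.
    candidates = sorted(
        ((dy, dx) for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)),
        key=lambda d: abs(d[0]) + abs(d[1]),
    )
    return [[min(candidates, key=lambda d: patch_cost(y, x, d[0], d[1]))
             for x in range(w)] for y in range(h)]


# A single bright pixel moves one step to the right between two frames.
frame0 = [[0] * 5 for _ in range(5)]
frame1 = [[0] * 5 for _ in range(5)]
frame0[2][2] = 9
frame1[2][3] = 9
flow = estimate_flow(frame0, frame1)
```

In this toy input the bright pixel is matched to its shifted position, while the flat corner has nothing to match on and falls back to zero motion only because of the tie-breaking rule. That ambiguity in textureless regions is one reason a purely pixel-wise formulation struggles.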
The thesis also explores layers as an interface through which people can interact with videos, enabling the detection and magnification of small motions. In addition, it introduces a contour motion analysis system for textureless objects under occlusion, demonstrating that simultaneous boundary grouping and motion analysis can handle challenging data on which traditional pixel-wise motion analysis fails.
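The idea of magnifying a small motion can be sketched for a single tracked point: measure its displacement from a reference position and scale that displacement. This is only an illustration under stated assumptions; the function `magnify_trajectory`, the choice of the first sample as the reference, and the factor `alpha` are hypothetical and do not describe the layer-based system in the thesis.

```python
def magnify_trajectory(points, alpha):
    """Scale each point's displacement from the first (reference) position.
    `points` is a list of (x, y) positions of one tracked feature over time.
    Hypothetical helper for illustration."""
    ref_x, ref_y = points[0]
    return [(ref_x + alpha * (x - ref_x), ref_y + alpha * (y - ref_y))
            for x, y in points]


# A feature that wobbles by at most half a pixel around (10, 5).
track = [(10.0, 5.0), (10.5, 5.0), (10.0, 5.25)]
magnified = magnify_trajectory(track, alpha=4)
```

With `alpha=4`, the half-pixel wobble becomes a two-pixel displacement. A complete system would also have to render the video with the modified motion, which this sketch leaves out.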
The second part of the thesis explores the benefits of matching local image structures instead of intensity values. It proposes SIFT flow, which establishes dense, semantically meaningful correspondence between two images across scenes by matching pixel-wise SIFT features. Using SIFT flow, the thesis develops a new framework for image parsing that transfers metadata, such as annotation, motion, and depth, from a large database to an unknown query image. The framework is demonstrated through new applications such as predicting motion from a single image and motion synthesis via object transfer.
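The two steps described above, matching per-pixel descriptors and then transferring metadata along the resulting correspondence, can be sketched as follows. This is a heavily simplified illustration, not SIFT flow itself: the descriptors are toy 2-vectors rather than SIFT features, each pixel is matched independently with no spatial smoothness term, and the names `match_descriptors`, `transfer_labels`, and the weight `eta` are assumptions.

```python
def match_descriptors(query, candidate, radius=2, eta=0.1):
    """For each query pixel, choose the displacement (dy, dx) within `radius`
    that minimizes L1 descriptor distance plus a small-displacement penalty.
    Pixels are matched independently (no smoothness term), unlike a full
    dense-correspondence method."""
    h, w = len(query), len(query[0])
    flow = [[(0, 0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best, best_cost = (0, 0), float("inf")
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue  # skip displacements that leave the image
                    l1 = sum(abs(a - b)
                             for a, b in zip(query[y][x], candidate[yy][xx]))
                    cost = l1 + eta * (abs(dy) + abs(dx))
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            flow[y][x] = best
    return flow


def transfer_labels(flow, candidate_labels):
    """Copy each query pixel's label from its matched candidate pixel."""
    return [[candidate_labels[y + dy][x + dx]
             for x, (dy, dx) in enumerate(row)]
            for y, row in enumerate(flow)]


# One-row toy images; each pixel carries a 2-dimensional descriptor.
query_desc = [[(1, 0), (1, 0), (0, 1), (0, 1)]]
cand_desc = [[(1, 0), (1, 0), (1, 0), (0, 1)]]
cand_labels = [["sky", "sky", "sky", "tree"]]

toy_flow = match_descriptors(query_desc, cand_desc)
query_labels = transfer_labels(toy_flow, cand_labels)
```

Here the third query pixel looks like "tree" but sits where the candidate image still shows "sky", so it is matched one pixel to the right and inherits the "tree" label. The same warping could carry any per-pixel metadata, such as motion or depth, in place of the label strings.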
The thesis also introduces a nonparametric scene parsing system based on label transfer. Its experimental results are promising and suggest that the system outperforms state-of-the-art techniques based on training classifiers.
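A nonparametric parser of this kind trains no classifier; it retrieves similar annotated images and reuses their labels. The sketch below illustrates that idea with nearest-neighbor retrieval followed by a per-pixel majority vote. It is an assumption-laden simplification: the dense alignment step is omitted (the neighbors' label maps are treated as already aligned to the query), plain voting stands in for whatever inference the actual system uses, and the names `nearest_neighbors`, `vote_labels`, and the `descriptor`/`labels` fields are hypothetical.

```python
from collections import Counter


def nearest_neighbors(query_desc, database, k=3):
    """Return the k database entries whose global descriptor is closest
    (squared Euclidean distance) to the query descriptor."""
    def dist(entry):
        return sum((a - b) ** 2
                   for a, b in zip(query_desc, entry["descriptor"]))
    return sorted(database, key=dist)[:k]


def vote_labels(label_maps):
    """Per-pixel majority vote over label maps assumed to be aligned
    to the query image."""
    h, w = len(label_maps[0]), len(label_maps[0][0])
    return [[Counter(m[y][x] for m in label_maps).most_common(1)[0][0]
             for x in range(w)] for y in range(h)]


# Toy database: a global descriptor plus a 2x2 annotation per image.
database = [
    {"descriptor": (0.0, 1.0), "labels": [["sky", "sky"], ["road", "road"]]},
    {"descriptor": (0.1, 0.9), "labels": [["sky", "tree"], ["road", "road"]]},
    {"descriptor": (0.2, 0.8), "labels": [["sky", "tree"], ["road", "car"]]},
    {"descriptor": (1.0, 0.0), "labels": [["wall", "wall"], ["floor", "floor"]]},
]

neighbors = nearest_neighbors((0.0, 1.0), database, k=3)
parsed = vote_labels([n["labels"] for n in neighbors])
```

The indoor image is never retrieved, so its labels cannot leak into the result. Adding newly annotated images to the database changes the output without any retraining, which is the practical appeal of the nonparametric design.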
The thesis is supported by the guidance and support of several advisors and colleagues, and acknowledges the contributions of many individuals and institutions. The thesis also includes a list of figures and tables, as well as a detailed abstract and acknowledgments section.