2013 | Heng Wang, Alexander Kläser, Cordelia Schmid, Cheng-Lin Liu
This paper introduces a video representation method based on dense trajectories and motion boundary descriptors for action recognition. The method captures local motion information using dense trajectories, which are extracted using a state-of-the-art optical flow algorithm. Trajectory shape, appearance (histograms of oriented gradients), and motion (histograms of optical flow) are characterized by trajectory-aligned descriptors. Additionally, a motion boundary histogram (MBH) descriptor, which relies on differential optical flow, is introduced to reduce the influence of camera motion. The MBH descriptor consistently outperforms other state-of-the-art descriptors, especially in real-world videos with significant camera motion.
The approach is evaluated on nine datasets (KTH, YouTube, Hollywood2, UCF Sports, IXMAS, UIUC, Olympic Sports, UCF50, and HMDB51) and shows significant improvement over current state-of-the-art results. The paper also discusses the impact of different parameters and computational complexity, providing a comprehensive evaluation of the proposed method.
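
To make the idea of a descriptor built on differential optical flow more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the central-difference gradient, the 8 orientation bins, and the L2 normalisation are all assumptions chosen for illustration. The sketch takes the horizontal and vertical flow components of one frame as 2D grids, differentiates each component spatially, and builds a magnitude-weighted orientation histogram per component.

```python
import math


def spatial_gradient(field):
    """Central-difference gradient of a 2D grid (list of rows).

    Borders fall back to one-sided differences. (Illustrative choice.)
    """
    h, w = len(field), len(field[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx[y][x] = (field[y][x1] - field[y][x0]) / max(x1 - x0, 1)
            gy[y][x] = (field[y1][x] - field[y0][x]) / max(y1 - y0, 1)
    return gx, gy


def orientation_histogram(gx, gy, bins=8):
    """Magnitude-weighted histogram of gradient orientations, L2-normalised."""
    hist = [0.0] * bins
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue  # flat regions contribute nothing
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def mbh(flow_u, flow_v, bins=8):
    """Return (MBHx, MBHy): one histogram per flow component."""
    return (
        orientation_histogram(*spatial_gradient(flow_u), bins),
        orientation_histogram(*spatial_gradient(flow_v), bins),
    )


# A spatially constant flow field (an idealised stand-in for camera
# translation) has zero spatial gradient, so both histograms stay empty.
uniform = [[2.0] * 4 for _ in range(4)]
uniform_x, uniform_y = mbh(uniform, uniform)

# A vertical motion boundary in the horizontal flow component produces
# gradients pointing along +x, which all land in the first bin.
boundary_u = [[0.0, 0.0, 3.0, 3.0] for _ in range(4)]
still_v = [[0.0] * 4 for _ in range(4)]
boundary_x, boundary_y = mbh(boundary_u, still_v)
```

Because only spatial changes in the flow are histogrammed, motion that is constant across the patch drops out, while the edges of independently moving regions remain. This is the intuition behind the descriptor's reduced sensitivity to camera motion; real camera motion is only approximately constant over a local patch.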