Sep 2008 | Alexander Kläser, Marcin Marszałek, Cordelia Schmid
This paper presents a novel spatio-temporal descriptor based on 3D gradients for video action recognition. The descriptor is built on histograms of oriented 3D spatio-temporal gradients. The key contributions are: (i) an efficient algorithm for computing 3D gradients at arbitrary spatial and temporal scales using integral videos; (ii) a generic 3D orientation quantization based on regular polyhedra; (iii) an in-depth evaluation and optimization of all descriptor parameters for action recognition; and (iv) application of the descriptor to three action datasets (KTH, Weizmann, Hollywood), where it matches or outperforms the state of the art.
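The integral-video idea behind contribution (i) can be sketched as follows. A minimal NumPy illustration (function names are hypothetical, not from the paper's code): after one pass of cumulative sums over the t, y and x axes, the sum of values inside any spatio-temporal cuboid costs eight lookups, independent of the cuboid's size, which is what makes gradient computation at arbitrary scales cheap.

```python
import numpy as np

def integral_video(v):
    """Integral video: cumulative sums over t, y and x.
    A leading zero border makes cuboid lookups uniform."""
    iv = v.cumsum(axis=0).cumsum(axis=1).cumsum(axis=2)
    return np.pad(iv, ((1, 0), (1, 0), (1, 0)))

def cuboid_sum(iv, t0, t1, y0, y1, x0, x1):
    """Sum of v[t0:t1, y0:y1, x0:x1] in O(1) via 3D inclusion-exclusion."""
    return (iv[t1, y1, x1]
            - iv[t0, y1, x1] - iv[t1, y0, x1] - iv[t1, y1, x0]
            + iv[t0, y0, x1] + iv[t0, y1, x0] + iv[t1, y0, x0]
            - iv[t0, y0, x0])
```

A mean gradient over any cuboid then follows by dividing such sums of per-pixel derivatives by the cuboid volume.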
The descriptor builds on the success of HOG descriptors for static images, generalizing them to 3D. It computes 3D gradients at arbitrary spatial and temporal scales using integral videos. Orientation quantization is performed with regular polyhedra, such as the icosahedron, which is more robust and efficient than traditional polar-coordinate binning. The descriptor is evaluated on three action datasets: KTH, Weizmann, and Hollywood. It achieves high accuracy on all three, outperforming existing methods on two and matching them on the third.
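The polyhedron-based quantization can be sketched like this. The 20 face centers of a regular icosahedron (equivalently, the vertices of a dodecahedron) serve as histogram bin directions; a 3D gradient votes for the bin whose direction it is closest to, weighted by its magnitude. This is a simplified winner-takes-all sketch with hypothetical function names; the paper's full scheme distributes votes over neighboring bins via thresholded projections.

```python
import numpy as np

PHI = (1 + 5 ** 0.5) / 2  # golden ratio

def icosahedron_face_centers():
    """The 20 face centers of a regular icosahedron (= dodecahedron
    vertices), normalized to unit length."""
    pts = [(sx, sy, sz) for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
    a, b = 1 / PHI, PHI
    for s1 in (-1, 1):
        for s2 in (-1, 1):
            pts += [(0, s1 * a, s2 * b), (s1 * a, s2 * b, 0), (s1 * b, 0, s2 * a)]
    p = np.array(pts, dtype=float)
    return p / np.linalg.norm(p, axis=1, keepdims=True)

def quantize_gradient(g, centers):
    """Winner-takes-all vote: the bin whose face center is closest in
    direction receives the full gradient magnitude."""
    h = np.zeros(len(centers))
    mag = np.linalg.norm(g)
    if mag > 0:
        h[np.argmax(centers @ (g / mag))] = mag
    return h
```

Because the bin directions come from a regular polyhedron, they cover the sphere uniformly, avoiding the pole singularities of polar-coordinate binning.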
The descriptor is constructed by dividing the video volume into subblocks, computing the mean gradient of each subblock, quantizing its orientation, and accumulating the quantized gradients into histograms. The histograms are concatenated into a feature vector, which is normalized and used for classification. All descriptor parameters are optimized for action recognition; the best results are achieved with an icosahedron and full orientation quantization.
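The assembly steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: names are hypothetical, and for self-containedness it quantizes into six axis-aligned bins (±x, ±y, ±t) rather than the paper's polyhedron-based bins.

```python
import numpy as np

# Simplified stand-in for polyhedron bins: 6 axis directions.
AXES = np.vstack([np.eye(3), -np.eye(3)])

def subblock_histogram(grad, sub=2, bins=AXES):
    """Split a cell's gradient field (T, Y, X, 3) into sub^3 subblocks,
    quantize each subblock's mean gradient, and sum the magnitude-weighted
    votes into one histogram."""
    T, Y, X, _ = grad.shape
    h = np.zeros(len(bins))
    for t in range(sub):
        for y in range(sub):
            for x in range(sub):
                block = grad[t*T//sub:(t+1)*T//sub,
                             y*Y//sub:(y+1)*Y//sub,
                             x*X//sub:(x+1)*X//sub]
                g = block.reshape(-1, 3).mean(axis=0)  # mean gradient
                mag = np.linalg.norm(g)
                if mag > 0:
                    h[np.argmax(bins @ (g / mag))] += mag
    return h

def descriptor(cell_histograms, eps=1e-6):
    """Concatenate per-cell histograms and L2-normalize the feature vector."""
    v = np.concatenate(cell_histograms)
    return v / (np.linalg.norm(v) + eps)
```

Averaging gradients over subblocks before quantization is what gives the descriptor its robustness to noise and small deformations.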
The descriptor is applied to the KTH, Weizmann, and Hollywood datasets, achieving state-of-the-art performance on all three. It outperforms existing methods on two of the datasets and matches on the third. The results show that the proposed descriptor is effective for action recognition in videos, with high accuracy and robustness to changes in illumination and small deformations. The method is efficient and memory-friendly, making it suitable for real-world applications. Future work includes learning descriptor parameters on a per-class basis to further improve performance.