Recognizing Action at a Distance

2003 | Alexei A. Efros, Alexander C. Berg, Greg Mori, Jitendra Malik
This paper presents a method for recognizing human actions at a distance, where a person may be as small as 30 pixels tall. This is the medium-resolution regime: figures are smaller than in near-field footage but larger than in the far field. The approach builds a spatio-temporal motion descriptor from optical flow measurements. Optical flow is treated as a noisy spatial pattern rather than as precise pixel displacements, and is smoothed and aggregated to form the descriptor. Actions are classified in a nearest-neighbor framework: the method retrieves the most similar action from a database of annotated video sequences. The same retrieval step supports more than classification. It enables the transfer of 2D/3D skeletons, the synthesis of novel action sequences through "Do as I Do" and "Do as I Say" techniques, and figure correction, in which imperfections in the input data are corrected using information from the database. The approach is tested on ballet, tennis, and football datasets. The results show that it classifies actions accurately even when the input data is noisy and the figures are small, that it is robust to misalignment, and that it can recover human skeletons from video sequences in which the figures are difficult to identify.
The paper concludes that the method provides a robust and effective approach for recognizing human actions at a distance, even in low-resolution scenarios.
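
The pipeline summarized above (smoothed optical flow, aggregated into a descriptor, compared by nearest neighbor) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that per-frame optical flow has already been computed for a figure-centered window and is supplied as an array of shape (height, width, 2). Splitting the flow into four non-negative channels before blurring is an assumption made here so that opposite motions do not cancel under smoothing. The function names, the `sigma` value, the dot-product similarity, and the synthetic clips are all hypothetical choices for illustration.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at three standard deviations."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur(channel, sigma):
    """Separable Gaussian blur with zero padding at the borders.

    Assumes both image dimensions are at least as long as the kernel.
    """
    kernel = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, channel, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")


def motion_descriptor(flow, sigma=1.5):
    """Turn one flow field (H, W, 2) into a unit-length descriptor vector."""
    fx, fy = flow[..., 0], flow[..., 1]
    # Assumption of this sketch: keep positive and negative parts of each
    # flow component in separate channels so blurring cannot cancel them.
    channels = [np.maximum(fx, 0), np.maximum(-fx, 0),
                np.maximum(fy, 0), np.maximum(-fy, 0)]
    vec = np.stack([blur(c, sigma) for c in channels]).ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def clip_descriptor(flows, sigma=1.5):
    """Aggregate over time by concatenating the per-frame descriptors."""
    return np.concatenate([motion_descriptor(f, sigma) for f in flows])


def classify(query_flows, database, sigma=1.5):
    """Return (label, score) of the database clip most similar to the query.

    `database` is a list of (label, flows) pairs whose clips have the same
    length and frame size as the query.
    """
    query = clip_descriptor(query_flows, sigma)
    best_label, best_score = None, -np.inf
    for label, flows in database:
        score = float(query @ clip_descriptor(flows, sigma))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


def make_clip(dx, dy, frames=3, size=16):
    """Synthetic clip: a square patch moving with constant flow (dx, dy)."""
    clip = []
    for _ in range(frames):
        flow = np.zeros((size, size, 2))
        flow[4:12, 4:12, 0] = dx
        flow[4:12, 4:12, 1] = dy
        clip.append(flow)
    return clip


# Toy database of annotated clips and a noisy query.
database = [("right", make_clip(1.0, 0.0)),
            ("left", make_clip(-1.0, 0.0)),
            ("up", make_clip(0.0, -1.0))]
rng = np.random.default_rng(0)
noisy_query = [f + rng.normal(0.0, 0.2, f.shape) for f in make_clip(1.0, 0.0)]
label, score = classify(noisy_query, database)
print(label, round(score, 3))
```

In this toy setting the blur is what makes the comparison tolerant of noisy flow: each descriptor entry pools evidence from a neighborhood instead of relying on a single pixel's displacement. A real system would replace the synthetic clips with flow computed from tracked, stabilized figure windows.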