Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review

July 1997 | Vladimir I. Pavlovic, Rajeev Sharma, and Thomas S. Huang
This paper reviews the literature on the visual interpretation of hand gestures for human-computer interaction (HCI). Hand gestures offer an attractive alternative to cumbersome interface devices, and interpreting them visually can help make HCI easy and natural. The review organizes the field by the methods used to model, analyze, and recognize gestures. An important dividing line is whether a system uses a 3D hand model or an image appearance model: 3D models allow more elaborate modeling but are computationally intensive, while appearance-based models are computationally efficient but lack generality. The paper also discusses implemented gestural systems and other potential applications of vision-based gesture recognition, and argues that, although progress is encouraging, further theoretical and computational advances are needed before gestures can be widely used in HCI. Future research directions, including integration with other natural modes of HCI, are discussed as well.

The paper is organized into seven sections. Section 2 covers gesture modeling, including definitions, taxonomies, and temporal and spatial modeling. Section 3 covers gesture analysis, including feature detection and parameter estimation. Section 4 covers gesture recognition, including the role of grammars and evaluation criteria. Section 5 surveys gesture-based systems and applications, Section 6 outlines future research directions, and Section 7 concludes the paper.

Gesture modeling involves defining what counts as a gesture, classifying gestures into communicative and manipulative types, and representing them with either 3D hand/arm models or appearance-based models; 3D models are more complex but permit more detailed modeling, whereas appearance-based models are simpler but less general. Temporal modeling identifies the three phases of a gesture: preparation, nucleus, and retraction. Spatial modeling describes the 3D spatial properties of hand and arm movements.

Gesture analysis extracts the relevant image features from video of a human operator engaged in HCI. This includes localizing the gesturer; detecting features such as silhouettes, contours, and fingertips; and estimating model parameters, i.e., computing the parameters of the chosen gesture model from the detected features. A minimal feature-extraction sketch follows.
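To make the analysis stage concrete, here is a minimal sketch of appearance-based feature extraction: skin-color segmentation followed by contour and fingertip detection. It is illustrative only; the HSV thresholds, the single-hand assumption, and the "topmost hull point as fingertip" heuristic are assumptions made here, not methods prescribed by the paper.

```python
import cv2
import numpy as np

# Illustrative HSV skin-color range; a real system would calibrate
# per user and lighting (these bounds are an assumption).
SKIN_LO = np.array([0, 40, 60], dtype=np.uint8)
SKIN_HI = np.array([25, 180, 255], dtype=np.uint8)

def extract_hand_features(frame_bgr):
    """Return (contour, fingertip) for the largest skin-colored region,
    or (None, None) if no plausible hand is found. OpenCV 4.x API."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LO, SKIN_HI)
    # Remove speckle noise from the binary skin mask.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None, None
    hand = max(contours, key=cv2.contourArea)   # assume the largest blob is the hand
    if cv2.contourArea(hand) < 1000:            # reject tiny noise blobs
        return None, None
    hull = cv2.convexHull(hand)
    # Crude fingertip estimate: the topmost point of the convex hull.
    fingertip = tuple(hull[hull[:, :, 1].argmin()][0])
    return hand, fingertip
```

In a full pipeline, features like the fingertip position would be tracked over time, and the resulting parameter sequence would be handed to the recognition stage described next.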
Gesture recognition then classifies and interprets those parameters in light of the accepted model and, possibly, the rules imposed by a grammar (a toy sequence classifier is sketched at the end of this summary). The recognition process may also feed back into the analysis stage by predicting the gesture model at the next time instant. The paper surveys recognition approaches based on 3D hand models, appearance-based models, and deformable 2D templates, and examines the main challenges: occlusion, computational complexity, and the need for accurate parameter estimation. It concludes that further research is needed to improve the accuracy, robustness, and speed of gesture recognition systems for HCI.
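To ground the recognition stage, the sketch below shows one simple strategy: nearest-template classification of parameter sequences under dynamic time warping (DTW), which tolerates variation in gesture speed. The paper surveys richer techniques (hidden Markov models, for example); everything in this sketch, including the function names and sample labels, is an illustrative assumption rather than the paper's own method.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two parameter sequences of
    shape (T, d), e.g., fingertip trajectories from the analysis stage."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-to-frame distance
            D[i, j] = cost + min(D[i - 1, j],            # insertion
                                 D[i, j - 1],            # deletion
                                 D[i - 1, j - 1])        # match
    return D[n, m]

def classify(sequence, templates):
    """Nearest-template classification: `templates` maps gesture labels to
    recorded parameter sequences; the label with the smallest DTW cost wins."""
    return min(templates, key=lambda label: dtw_distance(sequence, templates[label]))

# Usage (hypothetical data):
#   templates = {"wave": wave_seq, "point": point_seq}   # (T, d) arrays
#   label = classify(observed_seq, templates)
```

A grammar, in the sense discussed in Section 4, could then constrain which label sequences are admissible, and a predictive model could use the current label to bias the analysis stage at the next frame.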