"Put-That-There": Voice and Gesture at the Graphics Interface

1980 | Richard A. Bolt
The paper explores combining voice and gesture input at a graphics interface to make interaction more natural and intuitive. Users command simple shapes on a large-screen display by speaking and pointing. Because speech is paired with pointing, users can refer to items on the display with pronouns, which makes commands more natural and efficient. In turn, gesture aided by voice gives more precise references than pointing alone.

The research is conducted in the MIT Architecture Machine Group's "Media Room," a physical space in which the user's terminal is a room rather than a desktop CRT. The Media Room has a wall-sized projection screen and several input devices, including joysticks and touch-sensitive pads. It hosts a "Spatial Data-Management System" (SDMS), in which information is retrieved by navigating through space rather than by typing. The SDMS graphics world, called "Dataland," is navigated with the joysticks and touch controls.

Users can create, move, copy, and delete items with combined voice and gesture commands. For example, saying "Create a blue square there" while pointing creates a blue square at the pointed-to location. The paper discusses the two technologies that make this possible: connected speech recognition and space position sensing.

The paper also explores pronouns as "temporary variables" that stand in for whatever item the user is pointing at, and the role of spatial indexing in data management.
The research demonstrates the feasibility of a seamless integration of voice and gesture inputs in a graphics interface, paving the way for more natural and intuitive user interactions with digital environments.
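
To make the idea of pronouns as "temporary variables" concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Shape` class, the `run_command` function, and the input format (each spoken word paired with the screen point the user was indicating when the word was spoken) are all assumptions made for illustration. The sketch binds "that" to the shape nearest the pointed-to position and "there" to the pointed-to position itself.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this illustrates the idea only.
DEICTIC = {"that", "there"}  # words whose meaning comes from pointing


@dataclass
class Shape:
    color: str
    kind: str
    x: float
    y: float


def nearest_shape(shapes, point):
    """Return the shape whose position is closest to the pointed-to location."""
    px, py = point
    return min(shapes, key=lambda s: (s.x - px) ** 2 + (s.y - py) ** 2)


def run_command(shapes, utterance):
    """Apply a spoken command to the list of shapes.

    utterance: list of (word, pointed_xy) pairs, where pointed_xy is the
    screen position being pointed at when the word was spoken (assumed
    input format; it may be None for non-deictic words).
    """
    words = [w for w, _ in utterance]
    # Each deictic word acts as a temporary variable bound to a pointing sample.
    bound = {w: p for w, p in utterance if w in DEICTIC}

    if words[0] == "put":  # "put that there"
        target = nearest_shape(shapes, bound["that"])
        target.x, target.y = bound["there"]
        return target
    if words[0] == "create":  # "create a <color> <kind> there"
        shape = Shape(words[2], words[3], *bound["there"])
        shapes.append(shape)
        return shape
    raise ValueError(f"unsupported command: {' '.join(words)}")
```

For example, calling `run_command(shapes, [("put", None), ("that", (7.5, 8.2)), ("there", (3.0, 4.0))])` moves whichever shape lies nearest to (7.5, 8.2) to (3.0, 4.0). A real system would also need to align recognizer timestamps with the pointing data and handle missing or ambiguous references, which this sketch leaves out.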