Tactile-Augmented Radiance Fields

7 May 2024 | Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens
Tactile-Augmented Radiance Fields (TaRF) is a scene representation that brings vision and touch into a shared 3D space, allowing both the visual and the tactile signal to be estimated at any 3D position in a scene. A TaRF is captured from a collection of photos together with sparsely sampled touch probes. The approach rests on two insights: (i) common vision-based touch sensors can be registered to images using multi-view geometry techniques, and (ii) visually and structurally similar regions of a scene share similar tactile features.

After registering the captured touch signals to the visual scene, a conditional diffusion model is trained to generate tactile signals from RGB-D images, so that touch probes can be synthesized for novel scene locations. The resulting dataset contains more touch samples than previous real-world vision-touch datasets and pairs each touch signal with a spatially aligned visual signal.

The model is evaluated on several downstream tasks, including tactile localization and material classification; the results show that it estimates touch signals accurately and improves cross-modal prediction. The work takes a first step toward giving current scene representation techniques an understanding of both visual and tactile properties, which could be critical in applications ranging from robotics to virtual worlds. Limitations include potential misalignments caused by sensor errors and assumptions about scene structure. The work is supported by several grants and acknowledges contributions from multiple researchers.
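To make insight (i) concrete, here is a minimal sketch, not the authors' code, of how a touch probe registered in the same world frame as the cameras (for example, via structure from motion) can be projected into an image. The intrinsics, camera pose, and probe location below are hypothetical values chosen purely for illustration.

```python
# Minimal sketch (assumptions, not the paper's implementation): projecting a
# registered touch-probe location into a camera view, assuming the probe pose
# was recovered in the same world frame as the camera poses.
import numpy as np

def project_point(p_world, K, T_world_to_cam):
    """Project a 3D point in the world frame into pixel coordinates."""
    p_h = np.append(p_world, 1.0)              # homogeneous coordinates
    p_cam = (T_world_to_cam @ p_h)[:3]         # world -> camera frame
    if p_cam[2] <= 0:
        return None                            # point is behind the camera
    uv = K @ (p_cam / p_cam[2])                # perspective divide + intrinsics
    return uv[:2]

# Hypothetical values for illustration only.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
T = np.eye(4)                                  # camera placed at the world origin
touch_location = np.array([0.1, -0.05, 1.5])   # a registered touch probe, in meters
print(project_point(touch_location, K, T))     # approx. [353.3, 223.3]
```

Similarly, a toy sketch of the RGB-D-conditioned generation idea: a denoising network that concatenates a spatially aligned RGB-D crop with a noisy tactile image and predicts the noise. The layer sizes and shapes are assumptions for illustration, not the paper's architecture, which the summary above does not specify.

```python
# Minimal sketch (assumed architecture): an image-conditioned denoiser in the
# spirit of a diffusion model that maps RGB-D inputs to tactile images.
import torch
import torch.nn as nn

class TinyConditionalDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 noisy tactile channels + 4 RGB-D conditioning channels = 7 inputs
        self.net = nn.Sequential(
            nn.Conv2d(7, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),   # predicted noise on the tactile image
        )

    def forward(self, noisy_tactile, rgbd):
        # Condition by channel-concatenating the aligned RGB-D crop.
        return self.net(torch.cat([noisy_tactile, rgbd], dim=1))

denoiser = TinyConditionalDenoiser()
noisy = torch.randn(1, 3, 64, 64)   # noisy tactile image
rgbd = torch.randn(1, 4, 64, 64)    # spatially aligned RGB-D crop
print(denoiser(noisy, rgbd).shape)  # torch.Size([1, 3, 64, 64])
```

In the full system described above, such a conditional model is what lets touch be synthesized at novel scene locations from the RGB-D views rendered by the radiance field.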