5 Jun 2024 | Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, and Harold Soh
The paper "Octopi: Object Property Reasoning with Large Tactile-Language Models" by Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, and Harold Soh explores the integration of tactile perception with language to enable physical reasoning in embodied systems. The authors introduce PHYSICLEAR, a dataset that includes GelSight tactile videos and annotations for physical properties such as hardness, roughness, and bumpiness. They also present OCTopi, a system that leverages both tactile representation learning and large vision-language models (LLMs) to predict and reason about tactile inputs with minimal language fine-tuning. Evaluations on PHYSICLEAR show that OCTopi effectively uses intermediate physical property predictions to improve performance on various tactile-related tasks. The paper contributes to the field of tactile-enabled physical reasoning for embodied AI systems.The paper "Octopi: Object Property Reasoning with Large Tactile-Language Models" by Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, and Harold Soh explores the integration of tactile perception with language to enable physical reasoning in embodied systems. The authors introduce PHYSICLEAR, a dataset that includes GelSight tactile videos and annotations for physical properties such as hardness, roughness, and bumpiness. They also present OCTopi, a system that leverages both tactile representation learning and large vision-language models (LLMs) to predict and reason about tactile inputs with minimal language fine-tuning. Evaluations on PHYSICLEAR show that OCTopi effectively uses intermediate physical property predictions to improve performance on various tactile-related tasks. The paper contributes to the field of tactile-enabled physical reasoning for embodied AI systems.