5 Apr 2024 | Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang
This paper presents NeRF2Physics, a novel method for dense prediction of physical properties from a collection of images. The method leverages language-embedded feature fields and large language models (LLMs) to estimate physical properties of objects without requiring labeled data. Inspired by how humans reason about physics through vision, the method first extracts a 3D point cloud from a neural radiance field (NeRF) and fuses vision-language features into each point. It then uses LLMs to propose candidate materials for each object based on semantic knowledge. Finally, it estimates the physical properties of each point using zero-shot kernel regression and propagates the estimates across the entire object via spatial interpolation. The method is accurate, annotation-free, and applicable to any object in the open world.
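The per-point estimation step described above can be illustrated with a small sketch. It assumes a softmax-over-cosine-similarity kernel with a temperature parameter; the summary does not specify the kernel. The function names, the toy 2-D feature vectors, and the density values are all hypothetical and chosen only for illustration. In the method itself, the point features would be the fused vision-language features and the candidate materials would come from the LLM.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def kernel_regress(point_feature, material_features, material_values,
                   temperature=0.1):
    """Estimate a property at one point as a similarity-weighted average.

    Assumption: the kernel is exp(cosine_similarity / temperature),
    normalised over the candidate materials (a softmax weighting).
    """
    sims = [cosine(point_feature, m) for m in material_features]
    # Subtract the maximum similarity for numerical stability.
    top = max(sims)
    weights = [math.exp((s - top) / temperature) for s in sims]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, material_values)) / total


# Hypothetical candidates: toy text features and densities in kg/m^3.
material_features = [[1.0, 0.0], [0.0, 1.0]]
material_densities = [7800.0, 700.0]

# A point whose feature lies close to the first candidate material.
density = kernel_regress([0.9, 0.1], material_features, material_densities)
```

Because the weights depend only on feature similarity to the candidate materials, no labeled property data is needed at this step, which is consistent with the zero-shot description above.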
Experiments demonstrate the approach on several physical property reasoning tasks, including estimating the mass of common objects as well as their friction and hardness. The method outperforms existing baselines in mass estimation. It is evaluated on the ABO dataset and on a custom dataset of real-world objects with manually measured friction and hardness. The results show that it can produce reasonable predictions of a variety of physical properties without supervision.
The method is particularly effective in estimating physical properties that require volumetric integration, such as mass. It uses LLM-based estimates of surface thickness to integrate the predicted mass density across cuboids on the surface of the object. The method is also robust to errors in the geometry from NeRF, thanks to its feature fusion strategy. The method can be applied with or without object segmentation masks.
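The volumetric integration described above can be sketched as a sum over surface cuboids, where each cuboid contributes density × thickness × surface area. This is a minimal illustration rather than the authors' implementation. The per-patch surface areas are an assumption, since the summary only says that the density is integrated across cuboids on the surface. The function name and the example numbers are hypothetical.

```python
def integrate_mass(densities, thicknesses, patch_areas):
    """Sum the mass of surface cuboids.

    densities   -- predicted mass density per surface point, in kg/m^3
    thicknesses -- estimated surface thickness per point, in m
    patch_areas -- assumed surface area covered by each point, in m^2
    """
    if not (len(densities) == len(thicknesses) == len(patch_areas)):
        raise ValueError("all inputs must have the same length")
    # Each cuboid has volume area * thickness and mass density * volume.
    return sum(rho * t * a
               for rho, t, a in zip(densities, thicknesses, patch_areas))


# Two hypothetical patches: a thick low-density one, a thin dense one.
total_mass = integrate_mass(
    densities=[700.0, 7800.0],
    thicknesses=[0.02, 0.001],
    patch_areas=[0.5, 0.1],
)
```

With these made-up numbers the two patches contribute roughly 7 kg and 0.78 kg. The example also shows why the thickness estimate matters: a dense but thin shell can weigh far less than a solid interpretation of the same geometry would suggest.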
The method has potential applications in immersive computing and content creation, as well as in embodied AI and robot simulation. It can also be used to estimate crop biomass, which is important for agriculture but labor-intensive and destructive to measure manually. The method is supported by a variety of references and is available at https://ajzhai.github.io/NeRF2Physics.