Vision-based 3D occupancy prediction in autonomous driving: a review and outlook

2024 | Yanan ZHANG, Jinqing ZHANG, Zengran WANG, Junhao XU, Di HUANG
This paper presents a comprehensive review of vision-based 3D occupancy prediction for autonomous driving, covering its background, challenges, and recent advances. The task is to predict, from image inputs, the occupancy status and semantic category of each voxel in a 3D grid around the autonomous vehicle. It matters for driving systems because it provides a fine-grained representation of the scene and robust detection of undefined long-tail obstacles in 3D space. The occupancy representation originated in robotics, where predicting whether each voxel is occupied by an object supports effective collision avoidance.

The paper groups existing methods into three categories: feature enhancement, deployment-friendly, and label-efficient. Feature enhancement methods improve the network's ability to learn 3D features from 2D inputs. Deployment-friendly methods reduce computational cost and improve learning efficiency. Label-efficient methods aim for satisfactory performance with limited or no annotations.

The challenges it discusses include limited representation, limited detection, the expense of high-quality, dense, fine-grained 3D occupancy annotation, and the computational demands of the task. It also presents a chronological overview of vision-based 3D occupancy prediction methods and a hierarchically structured taxonomy of the task, and it maintains a regularly updated GitHub repository that collects related papers, datasets, and code.
The paper concludes with future outlooks for 3D occupancy prediction, emphasizing the need for further research in data, methodology, and task.
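
To make the occupancy representation described in this summary concrete, the following is a minimal sketch of a semantic voxel grid and a collision query against it. It is not taken from the paper: the `OccupancyGrid` class, its method names, the class ids, and the grid dimensions are all hypothetical, and the labels are set by hand, whereas in an actual system a network would predict them from camera images.

```python
from dataclasses import dataclass, field

# Hypothetical class ids for illustration; real datasets define their own.
FREE, CAR, PEDESTRIAN = 0, 1, 2


@dataclass
class OccupancyGrid:
    """Sparse semantic occupancy grid around the ego vehicle (illustrative)."""
    origin: tuple        # metric (x, y, z) of the grid's minimum corner
    voxel_size: float    # edge length of one cubic voxel, in metres
    shape: tuple         # voxel counts along (x, y, z)
    labels: dict = field(default_factory=dict)  # (i, j, k) -> class id

    def index(self, point):
        """Convert a metric point to integer voxel indices."""
        idx = tuple(int((p - o) // self.voxel_size)
                    for p, o in zip(point, self.origin))
        if not all(0 <= i < n for i, n in zip(idx, self.shape)):
            raise ValueError(f"point {point} lies outside the grid")
        return idx

    def set_label(self, point, class_id):
        """Record the semantic class of the voxel containing `point`."""
        self.labels[self.index(point)] = class_id

    def label_at(self, point):
        """Return the class id of the voxel containing `point` (FREE if unset)."""
        return self.labels.get(self.index(point), FREE)

    def is_occupied(self, point):
        """True if the voxel containing `point` holds any non-free class."""
        return self.label_at(point) != FREE

    def path_is_clear(self, waypoints):
        """Collision check: True if no waypoint falls in an occupied voxel."""
        return not any(self.is_occupied(p) for p in waypoints)


# Assumed 20 m x 20 m x 4 m grid with 0.5 m voxels, one hand-placed obstacle.
grid = OccupancyGrid(origin=(-10.0, -10.0, -1.0), voxel_size=0.5, shape=(40, 40, 8))
grid.set_label((2.3, 0.1, 0.2), CAR)
print(grid.path_is_clear([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
print(grid.path_is_clear([(0.0, 0.0, 0.0), (2.3, 0.1, 0.2)]))
```

The first query reports a clear path and the second a blocked one, because its last waypoint falls in the voxel labeled `CAR`. Note that the check only asks whether a voxel is non-free, not which class occupies it, which is why an occupancy grid can flag an obstacle that belongs to no predefined category.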