This paper presents a framework for placing local object detection within the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. The authors argue that traditional object detection methods, which consider all scales and locations in an image as equally likely, can be improved by incorporating probabilistic estimates of 3D geometry. They propose a statistical framework that allows simultaneous inference of object identities, surface orientations, and camera viewpoint using a single image from an uncalibrated camera. The framework is evaluated on a challenging dataset of outdoor images containing cars and people, often occluded and at various scales. The results demonstrate significant improvements over standard low-level detectors, highlighting the benefits of integrating 3D reasoning into object detection. The authors also introduce a new technique for estimating camera viewpoint based on image matching, which further enhances the accuracy of object detection. The paper concludes by discussing the limitations and potential extensions of the model, emphasizing the potential for a comprehensive vision system capable of complete image understanding.This paper presents a framework for placing local object detection within the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. The authors argue that traditional object detection methods, which consider all scales and locations in an image as equally likely, can be improved by incorporating probabilistic estimates of 3D geometry. They propose a statistical framework that allows simultaneous inference of object identities, surface orientations, and camera viewpoint using a single image from an uncalibrated camera. The framework is evaluated on a challenging dataset of outdoor images containing cars and people, often occluded and at various scales. The results demonstrate significant improvements over standard low-level detectors, highlighting the benefits of integrating 3D reasoning into object detection. The authors also introduce a new technique for estimating camera viewpoint based on image matching, which further enhances the accuracy of object detection. The paper concludes by discussing the limitations and potential extensions of the model, emphasizing the potential for a comprehensive vision system capable of complete image understanding.