Robust Object Detection with Interleaved Categorization and Segmentation

2007 | Bastian Leibe · Aleš Leonardis · Bernt Schiele
This paper presents a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. The approach combines object categorization and figure-ground segmentation as two closely collaborating processes. The key idea is that the tight coupling between these processes allows them to benefit from each other, improving overall performance. The core of the method is a flexible learned representation for object shape that combines information from different training examples in a probabilistic extension of the Generalized Hough Transform. This enables the system to detect categorical objects in novel images and infer a probabilistic segmentation from recognition results. The segmentation is then used to improve recognition by focusing on object pixels and discarding background influences. Additionally, the information about where a hypothesis draws its support is used in an MDL-based hypothesis verification stage to resolve ambiguities and account for partial occlusion. The method is evaluated on several large datasets, showing its effectiveness for both rigid and articulated objects. It achieves competitive performance with training sets that are one to two orders of magnitude smaller than those used in comparable systems.
The approach uses a codebook of local appearances, generated through clustering, to learn an Implicit Shape Model (ISM) that specifies where codebook entries may occur on an object. This model is flexible and requires fewer training examples to learn possible object shapes. The paper also discusses related work, including structural representations for object categorization and the transition from recognition to top-down segmentation. It introduces a codebook generation process using clustering methods, including k-means and agglomerative clustering. The paper presents an efficient average-link clustering algorithm for large-scale codebook generation.
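The voting idea behind the probabilistic Generalized Hough Transform can be illustrated with a small sketch. It is not the authors' implementation: the function names `cast_votes` and `best_center`, the parameters `match_threshold` and `bin_size`, the uniform weighting of matches and occurrences, and the coarse binned accumulator are all assumptions made for this example. Each image feature is matched to nearby codebook entries, and every matched entry votes for the object centers it was observed with during training.

```python
import numpy as np

def cast_votes(features, positions, codebook, occurrences, image_shape,
               match_threshold=0.5, bin_size=4):
    """Accumulate weighted Hough votes for candidate object centers.

    features:    (N, D) local descriptors sampled from the test image
    positions:   (N, 2) (x, y) image locations of those descriptors
    codebook:    (K, D) cluster centers (the codebook entries)
    occurrences: list of K arrays of shape (M_k, 2), each row a (dx, dy)
                 offset from the entry to the object center seen in training
    """
    height, width = image_shape
    acc = np.zeros((height // bin_size + 1, width // bin_size + 1))
    for feat, (x, y) in zip(features, positions):
        # Match the feature to every codebook entry within the threshold.
        dists = np.linalg.norm(codebook - feat, axis=1)
        matches = np.flatnonzero(dists < match_threshold)
        if matches.size == 0:
            continue
        p_match = 1.0 / matches.size  # assumed: uniform over matched entries
        for k in matches:
            offsets = occurrences[k]
            if len(offsets) == 0:
                continue
            p_occ = 1.0 / len(offsets)  # assumed: uniform over occurrences
            for dx, dy in offsets:
                cx, cy = x + dx, y + dy
                if 0 <= cx < width and 0 <= cy < height:
                    acc[int(cy) // bin_size, int(cx) // bin_size] += p_match * p_occ
    return acc

def best_center(acc, bin_size=4):
    """Return (x, y, score) for the center of the strongest accumulator bin."""
    row, col = np.unravel_index(np.argmax(acc), acc.shape)
    return (col + 0.5) * bin_size, (row + 0.5) * bin_size, acc[row, col]
```

Because each vote records which feature cast it, a real system can trace a hypothesis back to its supporting features, which is what the segmentation and verification stages rely on. This sketch keeps only the accumulated scores.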
The method is applied to object detection and segmentation, with a focus on integrating recognition and segmentation. The approach uses a probabilistic formulation for top-down segmentation, integrating learned knowledge of the recognized category with image support. The resulting procedure provides a pixel-wise figure-ground segmentation and a per-pixel confidence estimate. The segmentation is then used to improve recognition by focusing on object pixels and resolving ambiguities between overlapping hypotheses. The paper concludes with an experimental evaluation of the system's performance on various object categories, demonstrating its robustness to scale changes and effectiveness in cluttered real-world scenes.
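As a rough illustration of how a pixel-wise figure-ground map and a confidence estimate might be derived from a recognized hypothesis, the sketch below back-projects the patches that supported it. The function name `top_down_segmentation`, the `(x, y, weight, mask)` tuple layout, and the normalization by the summed weights are assumptions for this example and are not taken from the paper. Each patch is assumed to carry a figure-ground mask stored with its codebook occurrence at training time and to lie fully inside the image.

```python
import numpy as np

def top_down_segmentation(contributions, image_shape):
    """Combine per-patch figure-ground masks into per-pixel probabilities.

    contributions: list of (x, y, weight, mask) tuples, where (x, y) is the
                   top-left corner of a supporting patch, weight is its share
                   of the hypothesis score, and mask is an (h, w) array in
                   [0, 1] giving the stored figure label for each patch pixel
    image_shape:   (height, width) of the test image
    """
    p_fig = np.zeros(image_shape)
    p_gnd = np.zeros(image_shape)
    total = 0.0
    for x, y, weight, mask in contributions:
        h, w = mask.shape
        # Figure and ground evidence are accumulated separately.
        p_fig[y:y + h, x:x + w] += weight * mask
        p_gnd[y:y + h, x:x + w] += weight * (1.0 - mask)
        total += weight
    if total > 0:
        p_fig /= total
        p_gnd /= total
    segmentation = p_fig > p_gnd   # pixel-wise figure-ground decision
    confidence = p_fig - p_gnd     # signed per-pixel confidence
    return segmentation, confidence, p_fig, p_gnd
```

Pixels that no supporting patch covers receive zero evidence for both figure and ground, so they are labeled ground with zero confidence. In this sketch, that is how background regions are excluded from a hypothesis.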