COMPUTATIONAL MODELLING OF VISUAL ATTENTION


FEBRUARY 2001 | Laurent Itti and Christof Koch
Computational models of visual attention focus on bottom-up control of attentional deployment, emphasizing the role of saliency in determining where attention is directed. Five key trends in this field are: (1) saliency depends on context and is computed pre-attentively using center-surround mechanisms; (2) a saliency map encodes stimulus conspicuity across the visual scene; (3) inhibition-of-return prevents attention from immediately returning to previously attended locations; (4) attention and eye movements are closely linked, requiring careful coordinate-system management; and (5) scene understanding and object recognition influence attentional selection. These trends provide a framework for understanding both computational and neurobiological aspects of visual attention.

The brain regions involved in visual attention include the early visual processing areas, with the dorsal stream responsible for spatial localization and the ventral stream for object recognition; the prefrontal cortex plays a key role in modulating both streams. Computational models often use a saliency map to guide attention, with features such as intensity contrast, color opponency, and neuronal orientation tuning contributing to the saliency computation. These models incorporate pre-attentive feature extraction, spatial competition, and feedback modulation to enhance visual processing.

Saliency is computed through pre-attentive mechanisms that detect features such as intensity contrast, color opponency, and orientation. These features are processed in parallel across the visual field, and competition among the resulting feature maps yields a single saliency map encoding overall conspicuity. Attentional scanpaths are then generated through winner-take-all competition and inhibition-of-return, so that attention visits the most salient locations in turn. This scheme is supported by experimental evidence showing that attention can enhance visual processing and modulate early visual features.
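The center-surround feature extraction and feature-map combination described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the box blur standing in for Gaussian pyramid levels, the scale parameters `c` and `s`, and the simple min-max `normalize` (a crude stand-in for the full map-normalization operator, which also promotes maps with few strong peaks) are all assumptions made here for brevity.

```python
import numpy as np

def blur(img, k):
    """Separable box blur of width 2k+1 (a stand-in for one Gaussian pyramid level)."""
    pad = np.pad(img, k, mode="edge")
    kern = np.ones(2 * k + 1) / (2 * k + 1)
    out = np.apply_along_axis(lambda v: np.convolve(v, kern, mode="valid"), 0, pad)
    out = np.apply_along_axis(lambda v: np.convolve(v, kern, mode="valid"), 1, out)
    return out

def center_surround(feature, c=1, s=4):
    """Center-surround contrast: fine-scale response minus coarse-scale response."""
    return np.abs(blur(feature, c) - blur(feature, s))

def normalize(fmap):
    """Scale a feature map to [0, 1] (crude stand-in for the normalization operator)."""
    lo, hi = fmap.min(), fmap.max()
    return (fmap - lo) / (hi - lo) if hi > lo else fmap

def saliency_map(intensity, red, green):
    """Combine normalized center-surround feature maps into one saliency map.
    Only intensity contrast and red/green opponency are shown; a full model
    would add blue/yellow opponency and oriented (Gabor-like) channels."""
    maps = [
        center_surround(intensity),    # intensity contrast channel
        center_surround(red - green),  # red/green color-opponency channel
    ]
    return sum(normalize(m) for m in maps) / len(maps)
```

On a blank image with one bright patch, the saliency map produced this way peaks at the patch, since that is the only location where fine- and coarse-scale responses differ.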
Models of attention and recognition often integrate bottom-up and top-down cues, with attentional selection playing a crucial role in object recognition. For example, the MORSEL model demonstrates how attention is necessary for recognizing objects, while other models use hierarchical knowledge trees to guide attention to the most informative parts of a scene. Such models have been applied to tasks such as scene recognition and object detection, underscoring the importance of attention in visual processing. Integrating attention and recognition remains a challenge, since models must account for the interaction between attentional orienting and scene understanding. Future research aims to develop more complete models of attentional control, incorporating both bottom-up and top-down cues as well as neuroanatomical constraints. Such models will help advance our understanding of how attention is deployed in the visual system, with applications in artificial vision and robotics.
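The winner-take-all and inhibition-of-return mechanism described earlier can be sketched as a simple loop over a saliency map. This is a schematic version under stated assumptions: the hard `argmax` replaces the neural winner-take-all network, and the inhibition radius and its all-or-nothing suppression are arbitrary choices (real models use a graded, decaying inhibition).

```python
import numpy as np

def scanpath(saliency, n_fixations=5, ior_radius=4):
    """Generate an attentional scanpath from a saliency map.

    Winner-take-all: attend to the currently most salient location.
    Inhibition-of-return: suppress a disc around the attended location
    so that attention moves on rather than revisiting the same winner.
    """
    sal = saliency.astype(float).copy()
    h, w = sal.shape
    yy, xx = np.mgrid[0:h, 0:w]
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)  # winner-take-all
        fixations.append((int(y), int(x)))
        # inhibition-of-return: zero out a disc around the attended location
        sal[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = 0.0
    return fixations
```

Given a map with two peaks, the loop attends to the stronger peak first, inhibits it, and then moves to the weaker one, reproducing the ordered scanpath behavior described above.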