How does the brain automatically capture and focus visual attention? This paper explains both processes through a simple analysis of spatiotemporal variations in the visual environment. It assumes that the transmittance for the sensory signals is modulated by separate control circuits that sample input from the same area of the visual field but at a lower resolution. If the variations are related to the spatial resolution, which varies within wide limits over the retina, the visual field is “opened” up to a radius where it captures the most salient structures of the image. If the temporal variations of the signals are further emphasized, the high spatial frequencies begin to dominate. By modeling the visual system's response to spatiotemporal variations, the research sheds light on the mechanisms underlying visual attention. This research relates to *medicine* and *biological sciences*.
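
The gating idea described above, a control circuit that samples the same region at a lower resolution and modulates how much of the sensory signal gets through, can be illustrated with a toy one-dimensional sketch. This is not the paper's model: the block averaging, the gain rule (absolute temporal change of the coarse signal, normalized to its peak), and all function names (`coarse_sample`, `transmittance`, `gate`) are assumptions chosen only to make the mechanism concrete.

```python
def coarse_sample(frame, block):
    """Average consecutive blocks of `block` samples.

    Stands in for a control circuit that reads the same region
    at a lower spatial resolution (hypothetical choice).
    """
    cells = [frame[i:i + block] for i in range(0, len(frame), block)]
    return [sum(cell) / len(cell) for cell in cells]


def transmittance(prev_frame, frame, block):
    """Per-sample gain in [0, 1] derived from coarse temporal variation.

    Assumed rule: gain is the absolute change of each coarse cell
    between two frames, divided by the largest change in the frame.
    """
    prev_coarse = coarse_sample(prev_frame, block)
    cur_coarse = coarse_sample(frame, block)
    variation = [abs(c - p) for c, p in zip(cur_coarse, prev_coarse)]
    peak = max(variation)
    if peak == 0:
        # Nothing changed anywhere, so nothing is let through.
        return [0.0] * len(frame)
    gains = [v / peak for v in variation]
    # Every fine sample inherits the gain of the coarse cell covering it.
    return [gains[i // block] for i in range(len(frame))]


def gate(prev_frame, frame, block=4):
    """Multiply the full-resolution signal by the coarse-derived gain."""
    gains = transmittance(prev_frame, frame, block)
    return [g * x for g, x in zip(gains, frame)]


# A static background in which only the middle region changes.
prev_frame = [1.0] * 12
frame = [1.0] * 4 + [1.0, 3.0, 1.0, 3.0] + [1.0] * 4
gated = gate(prev_frame, frame, block=4)
```

In this example only the middle coarse cell changes between the two frames, so the fine detail in that region passes through unchanged while the static surround is suppressed. Replacing the normalization or the block size changes how far the "window" opens, which is the kind of dependence on resolution the text describes qualitatively.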