Depth perception is the visual ability to perceive the world in three dimensions. Although any animal capable of moving around its environment must be able to sense the distance of objects in that environment, the term perception is reserved for humans, who are the only beings that can tell each other about their experiences of distances.
Depth sensation is the ability to move accurately, or to respond consistently, based on the distances of objects in an environment. With this definition, every moving animal has some sensation of depth.
Depth perception arises from a variety of depth cues. These are typically classified into binocular cues, which require input from both eyes, and monocular cues, which require input from just one eye. Binocular cues include stereopsis, yielding depth from binocular vision through exploitation of parallax. Monocular cues include size: distant objects subtend smaller visual angles than near objects. A third class of cues requires synthetic integration of binocular and monocular cues.
- Motion parallax - When an observer moves, the apparent relative motion of several stationary objects against a background gives hints about their relative distance. This effect can be seen clearly when driving in a car: nearby things pass quickly, while far-off objects appear stationary. Some animals that lack binocular vision because of the wide placement of their eyes employ parallax more explicitly than humans for depth cueing (e.g. some types of birds, which bob their heads to achieve motion parallax, and squirrels, which move in lines orthogonal to an object of interest to do the same).1
- Depth from motion - One form of depth from motion, kinetic depth perception, is determined by dynamically changing object size. As objects in motion become smaller, they appear to recede into the distance; objects in motion that appear to be getting larger seem to be coming closer. Kinetic depth perception enables the brain to estimate the time to collision (also called time to contact, TTC) at a particular velocity. When driving, we constantly judge the dynamically changing headway (TTC) by kinetic depth perception.
- Color vision - Correct interpretation of color, and especially lighting cues, allows the beholder to determine the shape of objects, and thus their arrangement in space. The color of distant objects is also shifted towards the blue end of the spectrum (e.g. distant mountains). Painters, notably Cézanne, employ "warm" pigments (red, yellow and orange) to bring features forward towards the viewer, and "cool" ones (blue, violet, and blue-green) to indicate the part of a form that curves away from the picture plane.
- Perspective - The apparent convergence of parallel lines toward a vanishing point at infinity allows us to reconstruct the relative distance of two parts of an object, or of landscape features.
- Relative size - An automobile that is close to us looks larger than one that is far away; our visual system exploits the relative size of similar (or familiar) objects to judge distance.
- Aerial perspective - Due to light scattering by the atmosphere, objects that are a great distance away have lower luminance contrast and lower color saturation. In computer graphics, this is called "distance fog". The foreground has high contrast; the background has low contrast. Objects differing only in their contrast with a background appear to be at different depths.
- Depth from focus - The lens of the eye can change its shape to bring objects at different distances into focus. Knowing at what distance the lens is focused when viewing an object means knowing the approximate distance to that object.
- Occlusion (also referred to as interposition) - Occlusion (the blocking of one object from view by another) is also a cue that provides information about relative distance. However, this information only allows the observer to create a "ranking" of relative nearness.
- Peripheral vision - At the outer extremes of the visual field, parallel lines become curved, as in a photo taken through a fish-eye lens. This effect, although it is usually eliminated from both art and photos by the cropping or framing of a picture, greatly enhances the viewer's sense of being positioned within a real, three-dimensional space. (Classical perspective has no use for this so-called "distortion", although in fact the "distortions" strictly obey optical laws and provide perfectly valid visual information, just as classical perspective does for the part of the field of vision that falls within its frame.)
- Texture gradient - Suppose you are standing on a gravel road. The gravel near you can be clearly seen in terms of shape, size and color. As your gaze shifts towards the distant road, the texture can no longer be clearly differentiated; this gradual loss of detail signals increasing distance.
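The time-to-contact judgment mentioned under depth from motion can be illustrated with a small numerical sketch. It assumes an object of fixed physical size approaching at constant speed. Under the small-angle approximation, time to contact is roughly the visual angle divided by its rate of growth, a quantity often called tau. The function names and the driving scenario are illustrative assumptions, not a model of how the brain actually performs the computation.

```python
import math


def visual_angle(size_m: float, distance_m: float) -> float:
    """Visual angle (radians) subtended by an object of the given size."""
    return 2.0 * math.atan(size_m / (2.0 * distance_m))


def time_to_contact(theta_prev: float, theta_now: float, dt: float) -> float:
    """Estimate time to contact from two visual-angle samples dt seconds apart.

    Uses the approximation TTC ~ theta / (d theta / dt), which holds for
    small angles and a constant closing speed.
    """
    rate = (theta_now - theta_prev) / dt
    if rate <= 0.0:
        return math.inf  # not looming: the object is receding or holding distance
    return theta_now / rate


# Hypothetical scenario: a 1.8 m wide car, 50 m ahead, closing at 10 m/s.
width, distance, speed, dt = 1.8, 50.0, 10.0, 0.1
theta_prev = visual_angle(width, distance)
theta_now = visual_angle(width, distance - speed * dt)
ttc = time_to_contact(theta_prev, theta_now, dt)
true_ttc = (distance - speed * dt) / speed
print(f"estimated TTC: {ttc:.2f} s, true TTC: {true_ttc:.2f} s")
```

Note that the estimate uses only the image size and its rate of change; neither the physical size of the car nor its distance needs to be known.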
Binocular and oculomotor cues
Binocular cues use both eyes together to give depth information. The oculomotor cue of accommodation, by contrast, needs only one eye, since distance is inferred from the focus required to see an object sharply:
- Stereopsis or retinal disparity - Animals that have their eyes placed frontally can also use information derived from the different projection of objects onto each retina to judge depth. By using two images of the same scene obtained from slightly different angles, it is possible to triangulate the distance to an object with a high degree of accuracy. If an object is far away, the disparity between its images on the two retinas will be small. If the object is near, the disparity will be large. It is stereopsis that tricks people into perceiving depth when viewing Magic Eye pictures, autostereograms, 3D movies and stereoscopic photos.
- Accommodation - This is an oculomotor cue for depth perception. When we focus on far-away objects, the ciliary muscles relax and the lens is pulled thinner; when we focus on near objects, they contract and the lens becomes rounder. The kinesthetic sensations of the contracting and relaxing ciliary muscles (intraocular muscles) are sent to the visual cortex, where they are used for interpreting distance and depth.
- Convergence - This is a binocular oculomotor cue for distance and depth perception. To fixate the same object, the two eyeballs rotate inward so that their lines of sight converge on it. This convergence stretches the extraocular muscles, and kinesthetic sensations from these muscles also help in depth and distance perception. The angle of convergence is smaller when the eyes are fixating on far-away objects.
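The relation between convergence angle and distance can be sketched with simple triangle geometry. The sketch assumes the fixated point lies straight ahead, midway between the eyes, and uses an interpupillary distance of 63 mm as an illustrative value. The function names are hypothetical.

```python
import math

IPD_M = 0.063  # assumed interpupillary distance in metres (illustrative value)


def convergence_angle(distance_m: float, ipd_m: float = IPD_M) -> float:
    """Angle (radians) between the two lines of sight for a point straight ahead."""
    return 2.0 * math.atan(ipd_m / (2.0 * distance_m))


def distance_from_convergence(angle_rad: float, ipd_m: float = IPD_M) -> float:
    """Invert the relation: recover fixation distance from the convergence angle."""
    return ipd_m / (2.0 * math.tan(angle_rad / 2.0))


for d in (0.25, 1.0, 10.0):
    angle = convergence_angle(d)
    print(f"{d:5.2f} m -> {math.degrees(angle):6.3f} degrees")
```

Under these assumptions the angle shrinks rapidly with distance, so at larger distances a small error in the sensed angle corresponds to a large error in estimated distance.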
Of these various cues, only convergence, focus and familiar size provide absolute distance information. All other cues are relative (i.e., they can only be used to tell which objects are closer relative to others). Stereopsis is merely relative because a greater or lesser disparity for nearby objects could mean either that those objects differ more or less substantially in relative depth or that the foveated object is nearer or further away (the further away a scene is, the smaller the retinal disparity indicating the same depth difference).
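The ambiguity of disparity described above can be made concrete with a short sketch. It treats relative disparity as the difference in convergence angle between two points straight ahead, and again assumes a 63 mm interpupillary distance. All names and numbers are illustrative assumptions.

```python
import math

IPD_M = 0.063  # assumed interpupillary distance in metres (illustrative value)


def vergence(distance_m: float) -> float:
    """Convergence angle (radians) needed to fixate a point straight ahead."""
    return 2.0 * math.atan(IPD_M / (2.0 * distance_m))


def relative_disparity(near_m: float, far_m: float) -> float:
    """Angular disparity (radians) between two points at different distances."""
    return vergence(near_m) - vergence(far_m)


def depth_interval_for_disparity(near_m: float, disparity_rad: float) -> float:
    """Depth interval behind near_m that would produce the given disparity."""
    far_angle = vergence(near_m) - disparity_rad
    far_m = IPD_M / (2.0 * math.tan(far_angle / 2.0))
    return far_m - near_m


# A 1 cm depth step viewed at 0.5 m ...
disparity = relative_disparity(0.50, 0.51)
# ... yields the same disparity as a much larger depth step viewed at 5 m.
interval_at_5m = depth_interval_for_disparity(5.0, disparity)
print(f"disparity: {math.degrees(disparity) * 60:.2f} arcmin")
print(f"equivalent depth interval at 5 m: {interval_at_5m:.2f} m")
```

The same disparity therefore corresponds to very different depth intervals depending on the fixation distance, which is why an absolute cue such as convergence is needed to scale it.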
It would be an oversimplification to ignore the mental processes at work as a person sees with two normal eyes. Because binocular stereopsis is occurring, the brain can infer and perceive additional depth in the form of a mental construct. Closing one eye shuts down this stereo construct. Recent work toward improving the digital display of stereoscopic images has revitalized the field, as practical applications often do. Those working in the field have identified several processes of interpolation that were previously ignored or considered irrelevant. These link objects visible to only one eye into the mental construct while the observer views with both eyes in a forward direction. Recent literature has addressed the relationship between the stereo viewing area and the periphery, and recent analysis has demonstrated that objects just outside the angle of double visual coverage are in fact integrated by the mind into the stereo construct by a process of inference. Briefly stated, "all objects, in even moderate focus, within the central viewing field of a single eye, are an important part of the stereo construct". Their physical position is noted, and seen very accurately in the mental stereo visualization process, though they are visible to only one of the two eyes in use.
Most open-plains herbivores, especially hoofed grazers, lack binocular vision because they have their eyes on the sides of the head, providing a panoramic, almost 360°, view of the horizon and enabling them to notice the approach of predators from almost any direction. However, most predators have both eyes looking forwards, allowing binocular depth perception and helping them to judge distances when they pounce or swoop down onto their prey. Animals that spend a lot of time in trees take advantage of binocular vision in order to accurately judge distances when rapidly moving from branch to branch.
Matt Cartmill, a physical anthropologist and anatomist at Duke University Medical Center, has criticized this arboreal theory, citing other arboreal species which lack stereoscopic vision, such as squirrels and certain birds. Instead, he proposes a "Visual Predation Hypothesis," which argues that ancestral primates were insectivorous predators resembling tarsiers, subject to the same selection pressure for frontal vision as other predatory species. He also uses this hypothesis to account for the specialization of primate hands, which he suggests became adapted for grasping prey, somewhat like the way raptors employ their talons.
Depth perception in art
Photographs capturing perspective are two-dimensional images that often illustrate the illusion of depth. (This differs from a painting, which may use the physical matter of the paint to create a real presence of convex forms and spatial depth.) Stereoscopes and View-Masters, as well as 3-D movies, employ binocular vision by forcing the viewer to see two images created from slightly different positions (points of view). By contrast, a telephoto lens (used in televised sports, for example, to zero in on members of a stadium audience) has the opposite effect, flattening apparent depth. The viewer sees the size and detail of the scene as if it were close enough to touch, but the camera's perspective is still derived from its actual position a hundred meters away, so background faces and objects appear about the same size as those in the foreground.
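The telephoto effect described above follows from simple projection geometry, as the following sketch suggests. It assumes an idealized pinhole camera and two equally sized objects, one behind the other. The function name and distances are hypothetical.

```python
def apparent_size_ratio(camera_to_near_m: float, separation_m: float) -> float:
    """Ratio of the image size of a near object to that of an equally sized
    object standing separation_m behind it (pinhole-camera approximation)."""
    return (camera_to_near_m + separation_m) / camera_to_near_m


# Hypothetical distances: two spectators seated 10 m apart in depth.
for camera_distance in (2.0, 100.0):
    ratio = apparent_size_ratio(camera_distance, 10.0)
    print(f"camera {camera_distance:6.1f} m away: near face appears {ratio:.2f}x larger")
```

The ratio depends only on where the camera stands, not on the lens; magnification enlarges both faces equally, which is why the distant viewpoint makes foreground and background appear nearly the same size.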
Trained artists are keenly aware of the various methods for indicating spatial depth (color shading, distance fog, perspective and relative size), and take advantage of them to make their works appear "real". The viewer feels it would be possible to reach in and grab the nose of a Rembrandt portrait or an apple in a Cézanne still life, or to step inside a landscape and walk around among its trees and rocks.
Cubism was based on the idea of incorporating multiple points of view in a painted image, as if to simulate the visual experience of being physically in the presence of the subject and seeing it from different angles. The radical "High Cubist" experiments of Braque and Picasso circa 1909 are interesting but more bizarre than convincing in visual terms. Slightly later paintings by their followers, such as Robert Delaunay's views of the Eiffel Tower or John Marin's Manhattan cityscapes, borrow the explosive angularity of Cubism to exaggerate the traditional illusion of three-dimensional space.
A century after the Cubist adventure, the verdict of art history is that the most subtle and successful use of multiple points of view can be found in the pioneering late work of Cézanne, which both anticipated and inspired the first actual Cubists. Cézanne's landscapes and still lifes powerfully suggest the artist's own highly developed depth perception. At the same time, like the other Post-Impressionists, Cézanne had learned from Japanese prints the significance of respecting the flat (two-dimensional) rectangle of the picture itself; Hokusai and Hiroshige ignored or even reversed linear perspective and thereby remind the viewer that a picture can only be "true" when it acknowledges the truth of its own flat surface. By contrast, European "academic" painting was devoted to a sort of Big Lie that the surface of the canvas is only an enchanted doorway to a "real" scene unfolding beyond, and that the artist's main task is to distract the viewer from any disenchanting awareness of the presence of the painted canvas. Cubism, and indeed most of modern art, is a struggle to confront, if not resolve, the paradox of suggesting spatial depth on a flat surface, and to explore that inherent contradiction through innovative ways of seeing, as well as new methods of drawing and painting.
Disorders affecting depth perception
- Ocular conditions such as amblyopia, optic nerve hypoplasia, and strabismus may reduce the perception of depth.
- Since binocular depth perception by definition requires two functioning eyes, a person with only one functioning eye has no binocular depth perception and must rely on monocular cues.
It is typically thought that depth perception must be learned in infancy through unconscious inference.
1 The term 'parallax vision' is often used as a synonym for binocular vision, and should not be confused with motion parallax. The former allows far more accurate gauging of depth than the latter.