

It is shown that existing processing schemes of 3D motion perception, such as interocular velocity difference, changing disparity over time, and joint encoding of motion and disparity, do not offer a general solution to the inverse optics problem of local binocular 3D motion.

Instead we suggest that local velocity constraints in combination with binocular disparity and other depth cues provide a more flexible framework for the solution of the inverse problem.

In the context of the aperture problem we derive predictions from two plausible default strategies. Predicting perceived motion directions for ambiguous line motion provides an opportunity to distinguish between these strategies of 3D motion processing. Our theoretical results suggest that velocity constraints and disparity from feature tracking are needed to solve the inverse problem of 3D motion perception. It seems plausible that motion and disparity input is processed in parallel and integrated late in the visual processing hierarchy.

Humans and many other predators have two eyes that are set a short distance apart so that an extensive region of the world is seen simultaneously by both eyes from slightly different points of view.

Although the images of the world are essentially two-dimensional, we vividly see the world as three-dimensional. This is true for static as well as dynamic images. Here we elaborate on how the visual system may establish 3D motion perception from local input in the left and right eye. Using tools from analytic geometry we show that existing 3D motion models offer no general solution to the inverse optics problem of 3D motion perception.

We suggest a flexible framework of motion and depth processing and suggest default strategies for local 3D motion estimation. Our results on the aperture and inverse problem of 3D motion are likely to stimulate computational, behavioral, and neuroscientific studies because they address the fundamental issue of how 3D motion is represented in the visual system.

The representation of the three-dimensional (3D) external world from two-dimensional (2D) retinal input is a fundamental problem that the visual system has to solve [1]–[4]. This is true for static scenes in 3D as well as for dynamic events in 3D space.

For the latter the inverse problem extends to the inference of dynamic events in a 3D world from 2D motion signals projected into the left and right eye. In the following we exclude observer movements and only consider passively observed motion.

Velocity in 3D space is described by motion direction and speed. Motion direction can be measured in terms of azimuth and elevation angle, and motion direction together with speed is conveniently expressed as a 3D motion vector in a cartesian coordinate system.
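The relation between the angular description and the vector description can be sketched in a few lines. The axis convention is an assumption made for illustration, not taken from the text: x points rightward, y upward, z away from the observer along the line of sight; azimuth is measured in the horizontal x-z plane from the z-axis, and elevation from that plane. The function names are hypothetical.

```python
import math

def motion_vector(azimuth_deg, elevation_deg, speed):
    """Convert direction (azimuth, elevation) and speed into a 3D vector.

    Assumed convention: x right, y up, z away from the observer.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        speed * math.cos(el) * math.sin(az),  # x component
        speed * math.sin(el),                 # y component
        speed * math.cos(el) * math.cos(az),  # z component
    )

def direction_and_speed(v):
    """Inverse mapping: recover (azimuth_deg, elevation_deg, speed) from a vector."""
    x, y, z = v
    speed = math.sqrt(x * x + y * y + z * z)
    if speed == 0.0:
        return 0.0, 0.0, 0.0  # direction is undefined for a stationary point
    azimuth = math.degrees(math.atan2(x, z))
    elevation = math.degrees(math.asin(y / speed))
    return azimuth, elevation, speed

# Round trip for an arbitrary direction and speed
v = motion_vector(30.0, 10.0, 2.0)
print(v, direction_and_speed(v))
```

Under this convention a dense field of such vectors, one per local estimate, is the representation the next paragraph refers to.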

Estimating such a vector locally is highly desirable for a visual system because the representation of local estimates in a dense vector field provides the basis for the perception of 3D object motion, that is, the direction and speed of moving objects. This information is essential for interpreting events as well as planning and executing actions in a dynamic environment. If a single moving point, corner, or other unique feature serves as binocular input, then intersection of constraint lines, or triangulation together with a starting point, provides a straightforward and unique geometrical solution to the inverse problem in a binocular viewing geometry (see Methods and Fig.
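For a single unambiguous feature, the triangulation step can be illustrated with a minimal sketch. It assumes a simplified geometry that is not spelled out in the text: the eyes sit at (-i/2, 0, 0) and (i/2, 0, 0), both images are taken on one fronto-parallel screen at distance D, and the two image points share the same height. All function names are hypothetical.

```python
def project(point, eye_x, D):
    """Perspective projection of a 3D point onto the screen z = D,
    seen from an eye located at (eye_x, 0, 0)."""
    x, y, z = point
    t = D / z
    return (eye_x + t * (x - eye_x), t * y)

def triangulate(left_img, right_img, i, D):
    """Intersect the rays from the left and right eye through their
    screen points. Assumes both image points have the same height."""
    xL, yL = left_img
    xR, yR = right_img
    t = i / (i + xL - xR)          # common ray parameter at the intersection
    x = -i / 2 + t * (xL + i / 2)
    y = t * (yL + yR) / 2
    z = t * D
    return (x, y, z)

def motion_between(p_start, p_end):
    """3D displacement vector between two triangulated positions."""
    return tuple(b - a for a, b in zip(p_start, p_end))

# A point moves from the fixation point to a new position in depth
i, D = 6.5, 57.0                   # illustrative values in cm
start, end = (0.0, 0.0, 57.0), (1.0, 0.5, 60.0)
rec_start = triangulate(project(start, -i / 2, D), project(start, i / 2, D), i, D)
rec_end = triangulate(project(end, -i / 2, D), project(end, i / 2, D), i, D)
print(motion_between(rec_start, rec_end))
```

The recovery is unique only because the feature can be matched between the eyes and over time; that is exactly what is lost when the stimulus is an extended line inside an aperture.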

If, however, the moving stimulus has spatial extent, such as an edge, contour, or line inside a circular aperture [5], then local motion direction in corresponding receptive fields of the left and right eye remains ambiguous, and additional constraints are needed to solve the aperture and inverse problem in 3D.

The left and right eye, with nodal points a and c separated by interocular distance i, are verged on a fixation point F at viewing distance D. If an oriented stimulus (diagonal line) moves from the fixation point to a new position in depth along a known trajectory (black arrow), then perspective projection of the line stimulus onto local areas on the retinae or a fronto-parallel screen creates 2D aperture problems for the left and right eye (green and brown arrows).

The inverse optics and the aperture problem are well-known problems in computational vision, especially in the context of stereo [3][6], structure from motion [7], and optic flow [8].

Gradient constraint methods belong to the most widely used techniques of optic-flow computation from image sequences. They can be divided into local area-based [9] and into more global optic flow methods [10].

Both techniques employ brightness constancy and smoothness constraints in the image to estimate velocity in an over-determined equation system. It is important to note that optical flow only provides a constraint in the direction of the image gradient, the normal component of the optical flow.
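For reference, the standard textbook form of the gradient constraint makes the last point explicit. The symbols are introduced here for illustration and do not appear in the text: I is image brightness, subscripts denote partial derivatives, and (u, v) is the image velocity.

```latex
% Linearized brightness constancy (gradient constraint):
\[
  I_x\,u + I_y\,v + I_t = 0
\]
% One equation, two unknowns: only the flow component along the
% image gradient (the normal flow) is determined.
\[
  \mathbf{v}_n \;=\; -\,\frac{I_t}{\lVert \nabla I \rVert^{2}}\,\nabla I ,
  \qquad \nabla I = (I_x,\, I_y)^{\top}
\]
```

Any velocity of the form v_n plus an arbitrary component perpendicular to the gradient satisfies the same equation, which is the aperture problem in its 2D form.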

As a consequence, some form of regularization or smoothing is needed. Similar techniques in terms of error minimization and regularization have been offered for 3D stereo-motion detection [11]–[13]. Essentially these algorithms extend processing principles of 2D optic flow to 3D scene flow. Computational studies on 3D motion algorithms are usually concerned with fast and efficient encoding when tested against ground truth.

Here we are less concerned with the efficiency or robustness of a particular implementation. Instead we want to understand and predict behavioral characteristics of human 3D motion perception.

Any physiologically plausible solution to the inverse 3D motion problem has to rely on binocular sampling of local spatio-temporal information. There are at least three known cell types in early visual cortex that may be involved in local encoding of 3D motion. It is therefore not surprising that three approaches to binocular 3D motion perception have emerged in the literature: interocular velocity difference (IOVD), changing disparity over time (CDOT), and joint encoding of motion and disparity. These three approaches have generated an extensive body of research, but psychophysical results have been inconclusive and the nature of 3D motion processing remains an unresolved issue [25][26].

Despite the wealth of empirical studies on motion in depth, there is a lack of studies on true 3D motion stimuli. Previous psychophysical and neurophysiological studies typically employ stimulus dots with unambiguous motion direction or fronto-parallel random-dot surfaces moving in depth.

### The direction of retinal motion facilitates binocular stereopsis.

However, the aperture problem and local motion encoding, which feature so prominently in 2D motion perception [14]–[16], have been neglected in the study of 3D motion perception. Large and persistent perceptual bias has been found for dot stimuli with unambiguous motion direction [27]–[29], suggesting processing strategies that are different from the three main processing models [28]–[30]. It seems promising to investigate local motion stimuli with ambiguous motion direction, such as a line or contour moving inside a circular aperture [31], because they relate to local encoding [17]–[24] and may reveal principles of 3D motion processing [32].

The aim of this paper is to evaluate existing models of 3D motion perception and to gain a better understanding of binocular 3D motion perception. First, we show that existing models of 3D motion perception are insufficient to solve the inverse problem of binocular 3D motion. Second, we establish velocity constraints in a binocular viewing geometry and demonstrate that additional information is necessary to disambiguate local velocity constraints and to derive a velocity estimate.

Third, we compare two default strategies of perceived 3D motion when local motion direction is ambiguous. It is shown that critical stimulus conditions exist that can help to determine whether 3D motion perception favors slow 3D motion or averaged cyclopean motion. In the following we summarize shortcomings for each of the three main approaches to binocular 3D motion perception in terms of stereo and motion correspondence, 3D motion direction, and speed. We also provide a counterexample to illustrate the limitations of each approach.

The interocular velocity difference (IOVD) model is an influential processing model. It assumes that monocular spatio-temporal differentiation or motion detection [33] is followed by a difference computation between velocities in the left and right eye [34]–[36].

### Binocular Vision

The difference or ratio between monocular motion vectors in each eye, usually in a viewing geometry where interocular separation i and viewing distance D are known, provides an estimate of motion direction in terms of azimuth angle only. We argue that the standard IOVD model [29][37]–[40] is incomplete and ill-posed if we consider local motion encoding and the aperture problem.
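How a ratio of monocular velocities yields an azimuth estimate can be sketched with the usual small-angle approximation for a point near fixation on the midline. This is an illustrative textbook approximation under assumed conventions (eyes at x = -i/2 and x = +i/2, angular velocities in rad/s, positive z away from the observer), not the formulation given in the paper's Methods. The function name is hypothetical.

```python
import math

def iovd_azimuth(omega_left, omega_right, i, D):
    """Azimuth (degrees) of a trajectory in the horizontal plane, estimated
    from horizontal angular velocities in the left and right eye.

    Small-angle approximation around a fixation point at distance D:
      lateral velocity   vx ~ D   * (omega_left + omega_right) / 2
      velocity in depth  vz ~ D^2 * (omega_right - omega_left) / i
    """
    vx = D * (omega_left + omega_right) / 2.0
    vz = D ** 2 * (omega_right - omega_left) / i
    return math.degrees(math.atan2(vx, vz))

# Equal and opposite image velocities signal motion straight along the
# line of sight; equal velocities signal purely lateral motion.
print(iovd_azimuth(-0.01, 0.01, 6.5, 57.0))
print(iovd_azimuth(0.01, 0.01, 6.5, 57.0))
```

The sketch takes one velocity per eye as given. It therefore presupposes that the two velocities belong to the same feature, and it returns no elevation angle, which are the two gaps discussed next.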

In the following the limitations of the IOVD model are illustrated. The first limitation is easily overlooked: IOVD assumes stereo correspondence between motion in the left and right eye when estimating 3D motion trajectory. The model does not specify which motion vector in the left eye should correspond to which motion vector in the right eye before computing a velocity difference.

If there is only a single motion vector in the left and right eye then establishing a stereo correspondence appears trivial since there are only two positions in the left and right eye that signal dynamic information.


Nevertheless, stereo correspondence is a necessary prerequisite of IOVD processing, which quickly becomes challenging if we consider multiple stimuli that excite not only one but many local motion detectors in the left and right eye. It is concluded that without explicit stereo correspondence between local motion detectors the IOVD model is incomplete.

## A Bayesian approach to binocular stereopsis

The second problem concerns 3D motion trajectories with arbitrary azimuth and elevation angles. Consider a local contour with spatial extent, such as an oriented line inside a circular aperture, so that the endpoints of the line are occluded. This is known as the aperture problem in stereopsis [5][41]. If an observer maintains fixation at close or moderate viewing distance then the oriented line stimulus projects differently onto the left and right retina (see Fig.

When the oriented line moves horizontally in depth at a given azimuth angle, local motion detectors tuned to different speeds respond optimally to motion normal (perpendicular) to the orientation of the line. If the normal in the left and right eye serves as a default strategy for the aperture problem in 2D [14][16], then these vectors may have different lengths as well as orientations if the line or edge is oriented in depth.
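The 2D default strategy amounts to projecting the image velocity onto the line's normal. The sketch below assumes the line orientation is given in degrees counter-clockwise from the horizontal image axis; the function name and the numerical values are hypothetical.

```python
import math

def normal_component(velocity, orientation_deg):
    """Component of a 2D image velocity perpendicular to a line.

    orientation_deg: line orientation, counter-clockwise from the x-axis.
    Only this component is visible through an aperture that occludes
    the line's endpoints.
    """
    theta = math.radians(orientation_deg)
    nx, ny = -math.sin(theta), math.cos(theta)   # unit normal to the line
    s = velocity[0] * nx + velocity[1] * ny      # signed speed along the normal
    return (s * nx, s * ny)

# The same horizontal image motion seen through lines of slightly different
# orientation, as can happen in the two eyes when the line is slanted in depth.
left = normal_component((1.0, 0.0), 45.0)
right = normal_component((1.0, 0.0), 50.0)
print(left, right)
```

With different image orientations in the two eyes, the two normal vectors differ in both length and direction, even for identical horizontal image motion.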


Inverse perspective projection of the retinal motion vectors reveals that the velocity constraint lines are skew and an intersection of constraint lines (IOC) does not exist. In fact, an intersection exists only if the motion vectors in the left and right eye satisfy a specific constraint (see Methods). If the image planes are fronto-parallel, this condition takes a simpler form.
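Whether two back-projected constraint lines intersect can be checked with elementary analytic geometry: two non-parallel lines in 3D meet only if they are coplanar, in which case the shortest distance between them is zero. The sketch below is a generic test under assumed coordinates (eyes at x = -3.25 and x = +3.25, screen at z = 57); the function names and numbers are hypothetical and are not taken from the Methods.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_gap(p1, d1, p2, d2):
    """Shortest distance between the 3D lines p1 + t*d1 and p2 + s*d2.
    A gap of zero means the lines intersect (or coincide)."""
    n = _cross(d1, d2)
    n_len = math.sqrt(_dot(n, n))
    w = _sub(p2, p1)
    if n_len < 1e-12:                       # parallel lines
        c = _cross(w, d1)
        return math.sqrt(_dot(c, c)) / math.sqrt(_dot(d1, d1))
    return abs(_dot(w, n)) / n_len

def ray(eye, screen_point):
    """Line from an eye's nodal point through a point on the screen."""
    return eye, _sub(screen_point, eye)

left_eye, right_eye = (-3.25, 0.0, 0.0), (3.25, 0.0, 0.0)

# Image points with the same height: the rays are coplanar and intersect.
print(line_gap(*ray(left_eye, (0.5, 1.0, 57.0)), *ray(right_eye, (-0.5, 1.0, 57.0))))

# A vertical mismatch between the two image points makes the rays skew.
print(line_gap(*ray(left_eye, (0.5, 1.0, 57.0)), *ray(right_eye, (-0.5, 1.2, 57.0))))
```

In this simplified fronto-parallel setup, any vertical mismatch between the two image points is enough to make the rays skew.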

However, this constraint is easily violated, as illustrated in Fig. Constraint lines through projection points b and d do not intersect and 3D motion cannot be determined (see text for details). This is surprising because the model is based on spatio-temporal or speed-tuned motion detectors. The problem arises because computing motion trajectory without a constraint in depth does not solve the inverse problem.

As a consequence speed is typically approximated by motion in depth along the line of sight [37]. Another violation occurs when the line is slanted in depth and projects with different orientations into the left and right eye.

The resulting misalignment on the y-axis between motion vectors in the left and right eye is reminiscent of vertical disparity and the induced effect [42][43], with vertical disparity increasing over time. The stereo system can reconstruct depth from input with orientation disparity and even vertical disparity [44], but it seems unlikely that the binocular motion system can establish similar stereo correspondences. It is concluded that the IOVD model is incomplete and easily leads to ill-posed inverse problems.

These limitations are difficult to resolve within a motion processing system and point to contributions from disparity or depth processing. This alternative processing scheme uses disparity input and monitors changing disparity over time (CDOT).

Disparity between the left and right image is detected [45] and changes over time give rise to motion-in-depth perception [46]–[49]. We argue that this approach also has limitations when the inverse problem of local 3D motion is considered. Even assuming that CDOT can always establish a suitable stereo correspondence between features, including lines [5][41], the model still needs to resolve the motion correspondence problem. It needs to integrate disparity not only over time but also over 3D position to establish a 3D motion trajectory.

Although this may be possible for a global feature tracking system it is unclear how CDOT arrives at estimates of local 3D motion. Detecting local disparity change alone is insufficient to determine an arbitrary 3D trajectory.

CDOT has difficulty recovering arbitrary 3D motion direction because only motion-in-depth along the line of sight is well defined. As a consequence, the rate of change of disparity provides a speed estimate for motion-in-depth along the line of sight but not for arbitrary 3D motion trajectories.
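The link between disparity change and speed along the line of sight can be made explicit with a standard small-angle approximation. The notation is assumed here for illustration: a point on the midline at distance z subtends the vergence angle alpha(z), disparity delta is taken relative to the fixation distance D, and v_z is velocity along the line of sight.

```latex
\[
  \alpha(z) = 2\arctan\!\left(\frac{i}{2z}\right) \approx \frac{i}{z},
  \qquad
  \delta(z) = \alpha(z) - \alpha(D)
\]
\[
  \frac{d\delta}{dt} \approx -\,\frac{i}{z^{2}}\,v_z
  \quad\Longrightarrow\quad
  v_z \approx -\,\frac{D^{2}}{i}\,\frac{d\delta}{dt}
  \quad \text{near } z = D
\]
```

Only v_z appears in this relation. Lateral and vertical velocity components leave the rate of disparity change unchanged to first order, so they have to come from another source of information.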

In the context of local surface motion consider a horizontally slanted surface moving to the left or right behind a circular aperture.

Without corners or other unique features CDOT can only detect local motion in depth along the line of sight. Similarly in the context of local line motion, the inverse problem remains ill posed for a local edge or line moving on a slanted surface because additional motion constraints are needed to determine a 3D motion direction.

In summary, CDOT does not provide a general solution to the inverse problem of local 3D motion because it lacks information on motion direction. Even though CDOT is capable of extracting stereo correspondences over time, additional motion constraints are needed to represent arbitrary motion trajectories in 3D space.

The joint encoding of motion and disparity approach postulates that early binocular cells are both motion and disparity selective. Physiological evidence for the existence of such cells was found in cat striate cortex [22] and monkey V1 [50] (see, however, [51]).

Model cells in this hybrid approach extract motion and disparity energy from local stimulation. A read-out and decoding of population activity is needed to explain global 3D motion phenomena such as transparent motion and Pulfrich-like effects [52][53].