Spatial-Angular Analysis of Displays for Reproduction of
Amir Said (a) and Eino-Ville Talvala (b)
(a) Hewlett-Packard Laboratories, Palo Alto, CA, USA
(b) Electrical Engineering Dept., Stanford University, Stanford, CA, USA
ABSTRACT
From a signal processing perspective, we examine the main factors defining the visual quality of autostereoscopic 3-D displays, which are beginning to reproduce the plenoptic function with increasing accuracy. We propose using intuitive visual tools and ray-tracing simulations to gain insight into the signal processing aspects, and we demonstrate the advantages of analyzing what we call mixed spatial-angular spaces. With this approach we are able to intuitively demonstrate some basic limitations of displays using anisotropic diffusers or lens arrays. Furthermore, we propose new schemes for improved performance.
Keywords: Light ﬁeld, plenoptic function, autostereo displays, image processing, spatial-angular resolution
1. INTRODUCTION
As display technologies advance, we can foresee autostereo devices reproducing three-dimensional scenes, for multiple observers and without special viewing apparatus, through increasingly accurate reproductions of light fields. While there are many different approaches to achieve this goal, which are described in several books and surveys,1–6 a major challenge for newcomers to this research field is to sort out the different characteristics of each method. It is easy to focus too much on issues related to current technologies and implementations, and to neglect the fundamental problems that are inherent to a method, or to fail to recognize which techniques may be more or less promising in the long term.
Another common diﬃculty comes from the fact that a 3-D display is meant to be viewed from diﬀerent angles, and that the quality of the observed image keeps changing with viewer position, possibly in a non-smooth manner. Thus, photos of display prototypes convey very limited information about how well the device reproduces natural 3-D views, and how convincing and pleasing the viewing experience is. Video can convey signiﬁcantly more information, but its use is still not very common, and it is much harder to publish and maintain.
We also observe that there is very active research on capturing and rendering of light ﬁelds (in 2-D), with possible applications to computational photography,7 but there has been less work on the application of that research to full reproduction (3-D display) of light ﬁelds. There are many diﬃculties in translating well-known image/video processing properties—like pixelation and aliasing—into intuition about 3-D display visual quality.
Considering these problems, in this work we present the following contributions to the study of displays
capable of reproducing light ﬁelds:
• We discuss how images created from a type of mathematical representation, which we call spatial-angular images, provide rich and intuitive visual information for analysis of some fundamental factors that determine the quality of the reproduced 3-D views, and thus also the overall quality of a type of display.
• Using properties of spatial-angular images, we demonstrate that even a simple analysis of view distortion can show why, for diﬀerent types of displays, the quality of the reproduction must depend on the depth of the reproduced objects in relation to the display plane.
• We use ray-tracing simulations of displays to create realistic views of diﬀerent systems. This makes it easy to show how diﬀerent parameters change the view quality and type of visual artifacts, enabling more intuitive understanding of the diﬀerent factors and trade-oﬀs. Furthermore, simulation allows us to temporarily disregard physical laws. For example, it is interesting to know how a display would look when lens aberration is completely eliminated.
We also propose new approaches to designing better displays which exploit how new digital projectors, display technologies, and powerful computers allow us to ignore design constraints that were essential in the analog domain.
• Displays that use lens arrays and diffused light modulators at the focal plane commonly have a spatial view resolution (pixels per unit length) determined roughly by the lens size, and thus very small lenses are needed to avoid pixelation. We show that by changing the amount of diffusion and employing multiple projectors, the spatial resolution can be several times larger than that defined by the lens size.
• We extend the ideas of the modiﬁcation above, showing that it can be considered only a particular solution, in a much larger set of possible schemes for creating 3-D displays. New technology enables us to move to a new paradigm, where instead of thinking about combining traditional imaging optics, we can instead use multiple light modulators combined with elements for light diﬀusion, refraction or reﬂection (not necessarily lenses), to re-create the four-dimensional light ﬁeld of a scene. This requires hardware, now becoming available, that is able to manage and compute the huge amounts of information required to map the desired light ﬁeld rays to the inputs of the light modulators.
All the analyses and techniques presented in this work are applicable to displays that present parallax in both the horizontal and vertical direction. However, to simplify the notation and explanations, we show examples of displays with parallax in the horizontal direction only.
2. SPATIAL-ANGULAR IMAGES
We follow the terminology for light ray distribution used by Adelson and Bergen,8 and call the function describing the distribution of light radiance in space the plenoptic function. The original definition includes time and light wavelength as parameters, which we treat as implicit to simplify the notation. Thus, we assume that radiance at a point in space a = (x, y, z), propagating along direction d(φ, θ) = (sin φ cos θ, sin φ sin θ, cos φ), is defined by the five-dimensional function pg (x, y, z, φ, θ), with φ ∈ [0, π] and θ ∈ [0, 2π) corresponding to the variables of the standard spherical coordinate system.
In a transparent medium the plenoptic function has a certain amount of redundancy because radiance is conserved along each ray, i.e.,
pg (x, y, z, φ, θ) = pg (x + α sin φ cos θ, y + α sin φ sin θ, z + α cos φ, φ, θ), (1)
for all values of α that correspond to unoccluded points.
When studying an apparatus that is meant to recreate the plenoptic function on a surface, it is convenient to assume that the display is in plane z = 0. In this case, we define the display's plenoptic function as
pa (x, y, φ, θ) = pg (x, y, 0, φ, θ). (2)
Figure 1. The spherical (a) and Cartesian (b) parameterizations of the plenoptic function in a display located at plane z = 0, and a diagram (c) showing why light from a point source at (x0, y0, z0) defines a line in the (x, u) space.
One advantage of deﬁning the plenoptic function as pa (x, y, φ, θ) is that we use the same angular variables as in the well-known spherical system of coordinates. However, there are also advantages of using the Cartesian angular dimensions (u, v), and deﬁning the display’s plenoptic function as
Fig. 1(a) and (b) show the two representations.
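The relation between the two angular parameterizations can be sketched in a few lines. This is a minimal illustration, assuming that (u, v) are the x and y components of the direction vector d(φ, θ) divided by its z component, which is consistent with the film plane at z = 1 used for the angular images; the function names are hypothetical and are not taken from the original work.

```python
import math

def direction(phi, theta):
    """Unit direction d(phi, theta) in standard spherical coordinates."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def spherical_to_uv(phi, theta):
    """Assumed mapping: (u, v) = (d_x / d_z, d_y / d_z).

    Only rays leaving the display toward +z (d_z > 0) can be represented.
    """
    dx, dy, dz = direction(phi, theta)
    if dz <= 0.0:
        raise ValueError("ray does not propagate toward +z")
    return dx / dz, dy / dz

def uv_to_spherical(u, v):
    """Inverse of the assumed mapping, with theta wrapped into [0, 2*pi)."""
    phi = math.atan(math.hypot(u, v))
    theta = math.atan2(v, u) % (2.0 * math.pi)
    return phi, theta
```

Under this assumption a ray normal to the display maps to u = v = 0, and the representable directions are limited to φ < π/2, which is the half-space in front of the display.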
One advantage of this representation is that, from the basic geometry shown in Fig. 1(c), light emitted from
a point source located at (x0, y0, z0 ) maps to the following lines in the (x, u) and (y, v) spaces:
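The line relation can be checked with a short derivation, sketched here under the assumption that u and v denote the x and y slopes of a ray with respect to the z axis, so that a ray crossing the display at (x, y, 0) has direction proportional to (u, v, 1). Requiring that ray to pass through the point source gives:

```latex
\begin{align*}
(x, y, 0) + t\,(u, v, 1) &= (x_0, y_0, z_0)
  \quad\Rightarrow\quad t = z_0, \\
x + z_0\, u &= x_0, \\
y + z_0\, v &= y_0 .
\end{align*}
```

For a fixed depth z_0 each relation is a straight line. In an image with x on the horizontal axis and u on the vertical axis its slope is du/dx = -1/z_0, so the sign of the slope is opposite to the sign of the depth, and the line approaches vertical as z_0 approaches 0.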
It is possible to get a more intuitive understanding of the typical structure of pd (x, y, u, v) by creating images where pixel color and brightness are deﬁned by the values of the plenoptic function as we change variables in two chosen dimensions, while keeping the other dimensions at ﬁxed values. For example, using α and β to represent constants, we can create spatial images from pd (x, y, α, β), which correspond to views from an orthographic camera if axes are properly scaled. Angular dimension images, deﬁned as pd (α, β, u, v) correspond to images of perspective cameras with a pinhole at (α, β, 0), and ﬁlm plane z = 1.
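These slicing operations can be illustrated with a small sketch. It assumes the display's plenoptic function has already been sampled into a 4-D NumPy array indexed as `p_d[ix, iy, iu, iv]`; the array, its shape, and the helper names are illustrative assumptions, not part of the original work.

```python
import numpy as np

def spatial_image(p_d, iu, iv):
    """p_d(x, y, alpha, beta): fixed direction, i.e. an orthographic view."""
    return p_d[:, :, iu, iv]

def angular_image(p_d, ix, iy):
    """p_d(alpha, beta, u, v): fixed position, i.e. a pinhole view."""
    return p_d[ix, iy, :, :]

def spatial_angular_image(p_d, iy, iv):
    """p_d(x, alpha, u, beta): one spatial and one angular dimension."""
    return p_d[:, iy, :, iv]

# Hypothetical sampled light field: 8 x 6 positions, 5 x 4 directions.
rng = np.random.default_rng(0)
p_d = rng.random((8, 6, 5, 4))

print(spatial_image(p_d, 2, 1).shape)          # (8, 6)
print(angular_image(p_d, 3, 2).shape)          # (5, 4)
print(spatial_angular_image(p_d, 2, 1).shape)  # (8, 5)
```

Each helper only fixes two of the four indices, so all three image types are views of the same data; what changes is which pair of dimensions is left free.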
Mixed spatial and angular images, deﬁned in the forms pd (x, α, u, β) and pd (α, y, β, v), are unusual, but as we show below, can be very useful for understanding the capabilities of 3-D displays. To start, let us consider how a display at z = 0 would recreate the plenoptic function deﬁned by the set of objects in 3-D space shown in Fig. 2. Since we consider displays with only horizontal parallax, we constrain our analysis to images deﬁned by (x, u) dimensions.
Fig. 3 shows three examples of spatial-angular images (with constants listed in the caption). We can observe that the most striking feature is that they are composed of nearly-constant-color areas, separated by what may seem like straight lines. Actually these may not be perfectly straight, but we know that they have to be composed of overlapping segments of straight lines, as deﬁned by eq. (4), created by light emitted from surface points, beginning and ending at points (in the spatial-angular image) deﬁned by occlusion between scene objects.
Thus, the border points, deﬁning color changes in the object’s surface, or occlusion between objects, create the clearly visible transitions that are nearly straight.
We can also observe in Fig. 3 another fact predicted by eq. (4): the slope of the lines between regions is defined by the light source depth (z0). Thus, the red and black areas, defined by the cap that is located at positive z0, have transitions with negative slope; the blue and white areas, corresponding to the blue cap at z0 ≈ 0, are separated by nearly vertical lines; and the yellow and gray areas, for the cap at negative z0, create lines with positive slope.
Figure 2. Frontal, 45° and 90° left-side views of the simulated set of three-dimensional objects used as example throughout this document. The three sphere caps have centers at line x = y = 0, and the base of the blue cap is in the display plane (z = 0).
Figure 3. Examples of spatial-angular images for a display at z = 0 reproducing views of the objects shown in Fig. 2. The horizontal direction corresponds to the spatial x dimension, and the vertical direction corresponds to the angular u dimension. (a) pd (x, 0.25, u, 0); (b) pd (x, 0.15, u, 0); (c) pd (x, 0.15, u, 0.1).
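The dependence of slope on depth can be reproduced with a toy renderer. The sketch below is not the ray tracer used for the figures: it assumes flat, fronto-parallel segments instead of sphere caps, a ray convention in which a ray crossing the display at x with slope u reaches depth z0 at x0 = x + z0 u, and a viewer on the +z side so that the segment with the largest z0 hides the others. All names are hypothetical.

```python
import numpy as np

def render_spatial_angular(segments, xs, us, background=0):
    """Render a toy p_h(x, u) image; rows index u, columns index x.

    segments: iterable of (z0, x_min, x_max, label) with integer labels.
    """
    image = np.full((len(us), len(xs)), background, dtype=int)
    # Paint from the farthest segment to the nearest one, so that nearer
    # segments (larger z0) overwrite, i.e. occlude, the farther ones.
    for z0, x_min, x_max, label in sorted(segments, key=lambda s: s[0]):
        x0 = xs[None, :] + z0 * us[:, None]   # where each ray meets depth z0
        hit = (x0 >= x_min) & (x0 <= x_max)
        image[hit] = label
    return image

xs = np.linspace(-1.0, 1.0, 201)
us = np.linspace(-0.5, 0.5, 101)
segments = [(0.5, -0.2, 0.2, 1),    # in front of the display
            (0.0, -0.5, 0.5, 2),    # in the display plane
            (-0.5, -0.8, 0.8, 3)]   # behind the display
image = render_spatial_angular(segments, xs, us)
```

In the resulting array the borders of label 1 move toward smaller x as u grows, the borders of label 2 stay at fixed x, and the borders of label 3 move toward larger x, which mirrors the three slope regimes described for Fig. 3.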
Images of this type have been observed, and their properties discussed, by several authors.8–11 They are also known as epipolar images (EPI) in computer vision.12
In the next section we show the types of spatial-angular images that real displays can produce. To simplify the analysis, when we refer to a spatial-angular image, we mean the image created from the particular two-dimensional function
ph (x, u) = pd (x, 0, u, 0). (5)
Furthermore, in all our spatial-angular images the horizontal direction corresponds to the spatial dimension (x), and the vertical direction corresponds to the angular dimension (u), with point x = u = 0 at the center of the image. Since nearly all our analysis is qualitative, the axes and their values, being the same for all images, are not shown.
Since practical displays cannot show an inﬁnite amount of information, they must have light intensity nearly constant in discrete regions in the (x, y, u, v) space. We use the term spatial resolution to indicate (maybe roughly, when it is hard to deﬁne an exact value) how ﬁne the discretization in the spatial x dimension is, and the term angular resolution, for the same in the angular u dimension.
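The effect of finite spatial and angular resolution can be sketched by replacing a finely sampled spatial-angular image with cells of constant value. The example assumes axis-aligned rectangular cells, which is only the simplest case (the constant regions of a real display need not be rectangles), and it takes the value at each cell's center sample; the function and parameter names are hypothetical.

```python
import numpy as np

def quantize_spatial_angular(image, n_spatial, n_angular):
    """Piecewise-constant approximation of p_h(x, u).

    image: 2-D array, rows index u and columns index x.
    n_spatial, n_angular: number of cells along x and along u.
    Every sample is replaced by the sample at the center of its cell.
    """
    n_u, n_x = image.shape
    col_cell = np.arange(n_x) * n_spatial // n_x    # cell index of each column
    row_cell = np.arange(n_u) * n_angular // n_u    # cell index of each row
    cols = (2 * col_cell + 1) * n_x // (2 * n_spatial)  # center column per cell
    rows = (2 * row_cell + 1) * n_u // (2 * n_angular)  # center row per cell
    return image[np.ix_(rows, cols)]
```

Lowering `n_spatial` widens the cells horizontally (coarser spatial resolution), and lowering `n_angular` widens them vertically (coarser angular resolution); the output keeps the input's shape so the two can be compared directly.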
Fig. 4 shows the two images that we use as reference, corresponding to objects shown in Fig. 2, as they would be produced by an ideal display at plane z = 0. On the left side we have a view from an orthographic camera oriented at a 15° angle from the z-axis. On the right we have ph (x, u).∗
∗ All images are better viewed in color and magnified in the electronic version of this document.
Figure 4. An ideal 3-D display recreating the arrangement of objects shown in Fig. 2 would show the spatial view on the left, and the spatial-angular view on the right. These images are used for comparisons with those from real displays.
Figure 5. Views of the scene recreated by displays implemented using an anisotropic diffuser screen and multiple back-projectors: (a) 15 projectors; (b) 30 projectors; (c) 60 projectors.
3. APPLICATION TO 3-D DISPLAY ANALYSIS
Up to this point we have considered only the plenoptic function created by the objects in the 3-D scene we want to reproduce. An ideal display would recreate the same values, but current displays recreate only approximations. We analyze the quality of the approximation by simulating the display using ray-tracing,13 and creating one spatial view and one spatial-angular image of the display.
3.1 Multiple projectors with anisotropic diffusers or with double lenticular arrays
The first type of display that we consider consists of multiple projectors arranged in a horizontal arc, projecting from behind onto a screen made of an anisotropic light diffuser, with wide diffusion along the vertical direction and very narrow diffusion along the horizontal direction (see Agócs et al.14 for more details). This arrangement is very simple, and in a way very intuitive: light rays along different directions are generated by simply having light projected by different devices. To simplify this first analysis, we assume that each projected image has very high spatial resolution. Fig. 5 shows views of simulations of this type of display, when using different numbers of projectors.
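One way to build intuition for this arrangement is to ask which projector feeds each ray of the spatial-angular image. The sketch below is a strong simplification and not the simulation used in this work: it assumes projectors on a straight line at z = -distance instead of an arc, a diffuser with negligible horizontal spread, and the convention that a ray crossing the screen at x with slope u comes from position x - distance * u on the projector line. The function and parameter names are hypothetical.

```python
import numpy as np

def nearest_projector(xs, us, projector_x, distance):
    """Index of the projector closest to each back-traced ray.

    xs, us: 1-D arrays of screen positions and ray slopes.
    projector_x: 1-D array of projector positions along x.
    distance: projector line is at z = -distance (distance > 0).
    Returns an integer array with rows indexing u and columns indexing x.
    """
    # Where each ray, traced backwards, crosses the projector line.
    x_back = xs[None, :] - distance * us[:, None]
    # Pick the projector with the smallest horizontal offset from that point.
    return np.abs(x_back[..., None] - projector_x).argmin(axis=-1)

xs = np.linspace(-1.0, 1.0, 5)
us = np.array([-0.5, 0.0, 0.5])
index_map = nearest_projector(xs, us, np.array([-1.0, 0.0, 1.0]), distance=2.0)
```

Under these assumptions each projector behaves like a point source behind the screen, so the region of the (x, u) plane it covers is a slanted stripe, and adding projectors makes the stripes narrower.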
As we compare those images with the ideal in Fig. 4, we observe artifacts like line discontinuity/ghosting, and uneven brightness (increased in this example to be more visible). However, single images do not give any hint of how these artifacts change with viewer position. On the other hand, if we look at the corresponding spatial-angular images in Fig. 6, we can get an idea of what the artifacts are and how they change with viewer position, because these images do contain visual information about all the viewing angles.