Stereoscopic media
Digital Visual Effects Yung-Yu Chuang
3D is hot today
3D has a long history
• 1830s, stereoscope
• 1920s, first 3D film, The Power of Love
projected dual-strip in the red/green anaglyph format
• 1920s, teleview system
Teleview was the earliest alternate-frame sequencing form of film projection. Using two interlocked projectors, alternating left/right frames were projected one after another in rapid succession. Synchronized viewers attached to the armrests of the theater seats opened and closed in sync with the projection and took advantage of the viewer's persistence of vision, thereby creating a true stereoscopic image.
3D has a long history
• 1950s, the "golden era" of 3-D
• The attempts failed because immature technology resulted in viewer discomfort.
• 1980s, rebirth of 3D, IMAX
Why could 3D be successful today?
• 3D did not take off until digital processing made 3D films easier both to shoot and to watch.
• New technology for more comfortable viewing experiences
– Accurately-adjustable 3D camera rigs
– Digital processing and post-shooting rectification
– Digital projectors for accurate positioning
– Polarized screen to reduce cross-talk
3D TVs
Computers
Notebooks
Game consoles
Nintendo 3DS
HTC EVO 3D
3D contents (games)
3D contents (films)
3D contents (broadcasting)
3D cameras
Fuji Real3D W1 and W3 ($600)
Sony HDR-TD10E
Outline
• Human depth perception
• 3D displays
• 3D cinematography
• Stereoscopic media postprocessing
Human depth
perception
Space perception
• The ability to perceive and interact with the structure of space is one of the fundamental goals of the visual system.
• Our visual system reconstructs the world from two non-Euclidean inputs, the two distinct retinal images.
Binocular vision
• Two retinal images are different because the retinas are in slightly different places.
• Why two eyes?
1. See more of the world
2. Can still perceive the world after losing one eye.
• The combination of signals from each eye makes performance on many tasks
better with both eyes than with either eye alone.
Binocular vision
Horizontal visual field: 190° in total, of which 110° is binocular overlap
Overlapping binocular visual fields give predators a better chance to spot small, fast-moving objects
Vertical visual field: 60° up, 80° down
Binocular disparity
• Binocular disparity: the differences between the two retinal images of the same scene.
• Monocular: with one eye
• Stereopsis: the ability to use binocular disparity as a cue to depth.
• Note that, although stereopsis adds richness to depth perception, it is not a necessary
condition for depth perception. Example:
rabbits and 2D films.
Monocular cues to 3D space
• Every view of the world provides multiple depth cues.
• Usually, the cues reinforce each other,
combining to produce a convincing and reliable representation of 3D world.
• Occasionally, however, the cues are contradictory.
• Escher fools us by deliberately manipulating depth cues and other visual inferences. He arranges sensible local cues into a globally impossible story.
Monocular cues to 3D space
Occlusion
• Occlusion gives relative position of objects as a depth cue.
• It occurs in almost every scene, and some argue that it is the most reliable depth cue.
Occlusion
accidental viewpoints
Occlusion is a nonmetrical depth cue. It gives us only the relative order of objects, not their distance in the third dimension.
Size and position cues
• We have projective geometry embedded. We know that, all else being equal, smaller things are farther away.
The cue of relative size
Texture gradient
Larger objects in one area and smaller ones in another.
Smaller is interpreted as farther away, so the arrangement creates the perception of a ground plane.
Texture gradient
Why do we get less of a sense of depth?
Relative height
Objects at different distances on the ground plane will form images at different heights in the retinal image.
Objects farther away will be seen as higher in the image.
Familiar size
Familiar size: a depth cue based on knowledge of the typical size of objects.
Aerial perspective
Aerial (haze) perspective: light is scattered by the atmosphere, and more light is scattered when we look through more atmosphere.
Aerial perspective
Linear perspective
Parallel lines in 3D world appear to converge in the 2D image,
except for the ones parallel to the image plane.
vanishing point, vanishing line
Monocular cues
Pictorial depth cues
• All these monocular cues are pictorial depth cues produced by the projection of the 3D world onto the 2D surface of the retina.
• Combined with proper shading, these cues could be effective in illustrating 3D.
Pictorial depth cues
Monocular cues
• Because there are many monocular cues and they are no less important than binocular ones, some images look more stereoscopic than others. For example, 2D-to-3D conversion is easier for some images and more difficult for others.
Motion cues
• Motion parallax is a non-pictorial depth cue.
When your head moves, closer objects move faster than more distant ones.
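A minimal sketch of why nearer objects appear to move faster: for a sideways head translation, the apparent angular shift of a point straight ahead falls off with its distance. The function name and the 10 cm head shift are illustrative assumptions, not values from the slides.

```python
import math


def angular_shift_deg(head_shift_m: float, distance_m: float) -> float:
    """Apparent angular shift (degrees) of a point that was straight ahead,
    after the head translates sideways by head_shift_m."""
    return math.degrees(math.atan2(head_shift_m, distance_m))


if __name__ == "__main__":
    # Same 10 cm head movement, three object distances (hypothetical values).
    for distance in (0.5, 2.0, 10.0):
        print(f"{distance:5.1f} m -> {angular_shift_deg(0.1, distance):.2f} deg")
```

The shift shrinks monotonically with distance, which is the ordering the visual system can read as depth.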
Motion cues
• A very effective depth cue but it relies on head movements.
• Some 3D games are designed this way.
Binocular vision
interocular distance
~6.5cm
Accommodation and convergence
• Eyes need to be focused to see objects at different distances clearly.
• The human eye focuses via a process called accommodation, in which the lens gets fatter as we direct our gaze toward nearer objects.
Accommodation and convergence
Accommodation and convergence
• Humans can perceive depth from accommodation and convergence.
Binocular vision
Assume that the viewer is focusing on the red crayon.
red crayon
Binocular vision
• Note that the retinal images are inverted. The object of our gaze falls on the fovea, the center of the retina.
• The blue one happens to fall on corresponding retinal points.
Binocular vision
Horopter: the surface with zero disparity.
Panum’s fusional area: the region of space, in front of and behind the horopter, within which binocular vision is possible.
Diplopia: double vision
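A rough sketch of the geometry behind these terms, assuming both the fixated point and the other object lie straight ahead of the viewer. The vergence angle depends on distance, and the angular disparity of an object is the difference between its vergence angle and that of the fixation point (zero on the horopter). The sign convention (positive for nearer objects) and the function names are assumptions made for illustration.

```python
import math

INTEROCULAR_M = 0.065  # ~6.5 cm, the interocular distance quoted in these slides


def vergence_deg(distance_m: float, interocular_m: float = INTEROCULAR_M) -> float:
    """Angle between the two lines of sight for a point straight ahead."""
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))


def angular_disparity_arcmin(object_m: float, fixation_m: float,
                             interocular_m: float = INTEROCULAR_M) -> float:
    """Disparity of an object relative to the fixation point, in arcminutes.
    Assumed convention: positive means nearer than fixation (crossed)."""
    diff_deg = vergence_deg(object_m, interocular_m) - vergence_deg(fixation_m, interocular_m)
    return 60.0 * diff_deg


if __name__ == "__main__":
    # Fixating at 1 m; objects at 0.8 m, 1 m and 1.5 m (hypothetical distances).
    for z in (0.8, 1.0, 1.5):
        print(z, round(angular_disparity_arcmin(z, 1.0), 1))
```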
Binocular vision
left view right view
Binocular vision
• Our nervous system cannot measure the angle very accurately. Thus, we can only perceive relative depth.
• The role of eye movement is to bring the images within Panum’s fusional area.
Stereoscope
• Invented by Sir Charles Wheatstone in the 1830s. Our visual system treats binocular disparity as a depth cue, no matter whether it is produced by actual or simulated images.
Stereoscope
• Invented in the 1850s, it brought stereopsis to the masses.
Free vision
Free vision: the technique of converging
(crossing) or diverging the eyes in order to view a stereogram without a stereoscope.
Random dot stereograms
• For 100 years after the invention of the stereoscope, it was supposed that stereopsis occurred relatively late in the processing of visual stimuli; i.e., we recognize facial features and then use them to find depth.
Random dot stereograms
• Julesz thought that stereopsis might help reveal
camouflaged objects (the example of cats and mice) and invented random dot stereograms.
There is no monocular cue, yet we can recognize the shape. Disparity alone is enough for stereopsis!
Random dot stereograms
• Paint the scene in white and spray it with pepper.
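One way to build such a pair, sketched under simple assumptions: fill the left image with random dots, copy it to the right image, shift a central square horizontally, and refill the uncovered strip with fresh dots. Neither image alone shows the square. The function name, image size, and shift are illustrative choices.

```python
import random


def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary images as lists of rows.
    A central square is shifted left by `shift` pixels in the right view,
    so it carries disparity while each single view is pure noise."""
    lo = (size - square) // 2
    hi = lo + square
    if shift > lo:
        raise ValueError("shift is too large for this image size")
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    for y in range(lo, hi):
        # Copy the square from the left view to a shifted position.
        for x in range(lo, hi):
            right[y][x - shift] = left[y][x]
        # The strip uncovered by the shift gets new random dots.
        for x in range(hi - shift, hi):
            right[y][x] = rng.randint(0, 1)
    return left, right


if __name__ == "__main__":
    left, right = random_dot_stereogram()
    for row_l, row_r in zip(left[18:24], right[18:24]):
        print("".join(".#"[v] for v in row_l), " ", "".join(".#"[v] for v in row_r))
```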
Stereoblindness
• An inability to make use of binocular disparity as a depth cue.
• Approximately 3% to 5% of the population lacks stereoscopic depth perception.
Summary
• Monocular cues: occlusion, size and position cues, aerial perspective, linear perspective.
• Motion cues
• Accommodation and convergence cues
• Binocular cues: resolve stereo correspondence problem, Panum’s area.
3D displays
3D displays
passive, active, spectral glasses
3D displays
• Note that monocular cues can be produced by rendering/capturing the contents correctly.
• Most 3D displays enrich space perception by exploiting binocular vision. Thus, they have to present different content to each eye.
With glasses
shutter glasses: time multiplex
anaglyph: spectrum multiplex
polarization glasses: polarization multiplex
head mounted displays: space multiplex
Anaglyph glasses
Many color formats
Supported by YouTube and Google StreetView
Anaglyph glasses
Pros: cheap (can be home-made)
Cons: colors are lost, bad 3D
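A minimal sketch of one common anaglyph format (red/cyan), assuming images are stored as rows of (r, g, b) tuples: the red channel comes from the left view, and the green and blue channels from the right view. Other color formats combine the channels differently; the function name is hypothetical.

```python
def red_cyan_anaglyph(left, right):
    """Combine two equally sized RGB images (rows of (r, g, b) tuples).
    Red comes from the left view; green and blue come from the right view."""
    return [
        [(l_px[0], r_px[1], r_px[2]) for l_px, r_px in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


if __name__ == "__main__":
    left = [[(200, 10, 10), (90, 90, 90)]]
    right = [[(10, 150, 220), (80, 80, 80)]]
    print(red_cyan_anaglyph(left, right))
```

Because each eye receives only part of the spectrum, the original colors cannot be fully reproduced, which matches the drawback listed above.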
Polarization glasses
linear polarizer
Polarization glasses
Polarization glasses
• How to display two polarized frames?
• Need accurate calibration so that the frames are aligned.
• Need a non-depolarizing screen.
• Cheaper glasses. Used in theater.
Polarization glasses
Pros: cheaper, no need to sync
Cons: lower resolution
Problems with linear polarizer
• Non-aligned viewers might see cross-talk
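A small sketch of why head tilt causes cross-talk with linear polarizers, assuming ideal polarizers and two channels polarized at 90° to each other. By Malus's law, a filter rotated by an angle θ passes cos²θ of its own channel and sin²θ of the other one. The function names are illustrative.

```python
import math


def wanted_fraction(tilt_deg: float) -> float:
    """Fraction of the intended image's intensity that still passes the filter."""
    return math.cos(math.radians(tilt_deg)) ** 2


def leaked_fraction(tilt_deg: float) -> float:
    """Fraction of the other eye's image that leaks through (cross-talk)."""
    return math.sin(math.radians(tilt_deg)) ** 2


if __name__ == "__main__":
    for tilt in (0, 5, 10, 20, 45):
        print(f"tilt {tilt:2d} deg: wanted {wanted_fraction(tilt):.3f}, leaked {leaked_fraction(tilt):.3f}")
```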
Circular polarizer
Spectral glasses
• Wavelength multiplexing. Infitec GmbH. Adopted by Dolby 3D digital cinema.
• Advanced anaglyph by dividing colors better.
• We will talk about human color perception first.
Spectral power distribution
Spectral power distribution of a fluorescent light, with 400nm (bluish), 550nm (green) and 650nm (red) marked
Spectral glasses
• Anyway, we learn that although light is a complex combination of electromagnetic waves of different wavelengths, it can be represented by three primary colors due to human perception.
Spectral glasses
• Filters are added to the projectors and glasses so that only light of specific wavelengths can pass through. The color box adjusts colors.
Spectral glasses
Dolby 3D Digital Cinema
• Only requires one projector.
active hi-speed filter
Pros: no need for special screen
Cons: expensive filters, darker
specular
Shutter glasses
• Twice the frame rate (usually 120Hz).
• Liquid crystal. Needs to sync.
• Persistence of vision
Pros: good 3D
Cons: need to sync, darker, expensive, not good for multi-user
Shutter glasses
TI DLP technology
passive polarization
active shutter
passive shutter (2ms sync frame)
Head mounted displays
Pros: very good 3D, could be used with head trackers
Cons: expensive, heavy, closed, single-user
Autostereoscopic
• Steer light toward specific directions through accurate optical design.
– Spatial-multiplexed
– Time-multiplexed
Spatial-multiplexed
• The next question is how to let left eye see only the left image and right eye see only the right one.
Autostereoscopic
Barrier
• LC barrier could switch
between 2D and 3D display modes.
Lenticular
• It is also possible to switch between 2D and 3D display modes.
Slanted
• Makes the horizontal and vertical aspect ratio more consistent.
2-view
Multiple-view
Philips 9-view format
Multiple-view
• Need more inputs.
• Reduced resolution.
Pros: no glasses, multi-user
Cons: restricted viewing locations, bad 3D
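A simplified sketch of spatial multiplexing, assuming whole pixel columns are assigned to views in round-robin order. Real panels typically interleave at the sub-pixel level, and slanted layouts use a different pattern, so this only illustrates why each view keeps roughly 1/N of the horizontal resolution. The function name is hypothetical.

```python
def interleave_views(views):
    """Column-interleave N equally sized views (each a list of rows).
    Output column x is taken from view (x mod N)."""
    n = len(views)
    height, width = len(views[0]), len(views[0][0])
    return [
        [views[x % n][y][x] for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    left = [["L"] * 8 for _ in range(2)]
    right = [["R"] * 8 for _ in range(2)]
    for row in interleave_views([left, right]):
        print("".join(row))
```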
Time-multiplexed
• Directional backlight unit, 3D film
Pros: resolution is preserved
Cons: only 2 views
Comparisons
Common 3D formats
• Side-by-side
• Multi-view
• 2D+Z
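As an illustration of the side-by-side format, a frame can be split down the middle into the two views. This sketch assumes the left view occupies the left half of the frame and that frames are lists of pixel rows; when each view was squeezed to half width for packing, a player would additionally stretch it back horizontally. The function name is hypothetical.

```python
def split_side_by_side(frame):
    """Split a side-by-side frame (list of rows) into (left, right) views."""
    half = len(frame[0]) // 2
    left = [row[:half] for row in frame]
    right = [row[half:2 * half] for row in frame]
    return left, right


if __name__ == "__main__":
    frame = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    print(split_side_by_side(frame))
```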
2-view
Multi-view
2D+Z
Volumetric displays
• Non-volumetric displays fatigue viewers after long viewing times because of the inconsistency between focus and convergence.
• Volumetric displays are better in this respect, but they are much more expensive and require much more data (more views are required).
• Pros: good 3D, no glasses, multi-user
• Cons: often with limited size, suitable only for objects, not scenes
Actuality System
USC ICT
SONY RayModeler
Summary
• Many 3D displays will be produced in the coming years.
• Glasses-based 3D display technology is very mature.
• Autostereoscopic displays need more time and will be used for advertisement first.
• 3D contents are the major bottleneck.
• But 3D cameras are just around the corner.
3D cinematography
Capture stereoscopic photos
Capture stereoscopic photos
adjust baseline adjust convergence
A few terms
• Depth bracket
A few terms
• Depth chart
Sources of visual fatigue
• The most important point to be considered in stereoscopic cinema.
• Symptoms: conscious (headache, tiredness, soreness of the eyes) or unconscious
(perturbation of oculomotor system)
• Some even reported the case of an infant whose oculomotor system was permanently disturbed by viewing a stereoscopic movie.
Sources of visual fatigue
• Binocular asymmetry: photometric or geometric differences between the left and right retinal images. Kooi and Toet experimentally measured thresholds on various asymmetries that lead to visual discomfort.
– The human visual system is most sensitive to vertical binocular disparities.
– A horizontal disparity range of 35 arcmin is quite acceptable, while a disparity of 70 arcmin is too much to be viewed.
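To relate these arcmin figures to an actual display, a screen disparity in pixels can be converted to a visual angle. The sketch below assumes a viewer looking at the screen head-on; the pixel pitch and viewing distance are hypothetical, the 35 and 70 arcmin thresholds are the ones quoted above, and the "borderline" label for values in between is an assumption of this example.

```python
import math


def disparity_arcmin(disparity_px: float, pixel_pitch_mm: float,
                     viewing_distance_mm: float) -> float:
    """Visual angle (arcmin) subtended by a horizontal screen disparity."""
    size_mm = disparity_px * pixel_pitch_mm
    angle_rad = 2.0 * math.atan(size_mm / (2.0 * viewing_distance_mm))
    return math.degrees(angle_rad) * 60.0


def comfort_label(arcmin: float, ok_limit: float = 35.0, too_much: float = 70.0) -> str:
    """Classify a disparity against the thresholds quoted in the slides."""
    magnitude = abs(arcmin)
    if magnitude <= ok_limit:
        return "acceptable"
    if magnitude < too_much:
        return "borderline"  # label assumed for this example
    return "too much"


if __name__ == "__main__":
    # Hypothetical setup: 0.3 mm pixels viewed from 600 mm.
    for px in (10, 30, 50):
        angle = disparity_arcmin(px, 0.3, 600.0)
        print(px, round(angle, 1), comfort_label(angle))
```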
Sources of visual fatigue
Examples of asymmetry: vertical shift, size difference, distortion difference, horizontal shift (diverged)
Sources of visual fatigue
• Crosstalk (crossover or ghosting): typical values for crosstalk are 0.1-0.3% with polarization-based systems and 4-10% with LCD shutter glasses. It can be reduced by a preprocessing step called ghost-busting.
• Breaking the proscenium rule (breaking the stereoscopic window): a simple solution is to float the window.
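The idea behind crosstalk compensation can be sketched with a deliberately simple linear model, assumed here for illustration only: each eye perceives its own image plus a fraction c of the other one. Pre-subtracting the expected leak makes the perceived images match the intended ones, as long as no value has to be clamped. This is not the specific ghost-busting process used in production systems.

```python
def perceived(own, other, c):
    """Assumed model: the eye sees its own image plus a fraction c of the other."""
    return [o + c * x for o, x in zip(own, other)]


def ghost_bust(left, right, c):
    """Pre-compensate two images (lists of intensities in [0, 1]) so that,
    under the linear model above, each eye perceives the intended image."""
    scale = 1.0 - c * c
    new_left = [(l - c * r) / scale for l, r in zip(left, right)]
    new_right = [(r - c * l) / scale for l, r in zip(left, right)]

    def clamp(v):
        # Displays cannot show negative light, so dark regions limit the fix.
        return min(1.0, max(0.0, v))

    return [clamp(v) for v in new_left], [clamp(v) for v in new_right]


if __name__ == "__main__":
    left, right, c = [0.5, 0.8], [0.6, 0.4], 0.1
    shown_left, shown_right = ghost_bust(left, right, c)
    print(perceived(left, right, c))              # with ghosting
    print(perceived(shown_left, shown_right, c))  # after compensation
```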
Sources of visual fatigue
• Horizontal disparity limits: the eyes should not diverge, and there is a certain disparity limit below which the human visual system can fuse the two images.
• Vertical disparity: causes torsion motion of the ocular globes and is only tolerable for short
time intervals.
• Vergence-accommodation conflicts: focus
distance of the eyes is not consistent with their convergence angle. They happen very often for stereoscopic displays and could be relaxed by using the depth of field of the visual system.
Stereopsis
• Stereopsis could be broken for the following reasons: not aligned, large disparity, ghosting, failure to fuse
Vergence-accommodation
• Vergence (convergence, divergence): the angle formed by the optical axes of the two eyes in binocular vision.
• Plane of convergence.
Vergence-accommodation
• There is an area around the plane of convergence where vergence and accommodation agree, which is called the zone of comfort.
A discrepancy between vergence and accommodation could damage visual acuity before the age of 8.
real world stereoscopic displays
Principles to avoid fatigue
• Maintain coordination among views
• Have a continuous depth chart
• Place rest area between strong 3D shots
• Use shallow depth of field for excessive depth range
• Respect the stereoscopic window
Some other rules
Excessive depth range
Bending the stereoscopic window
Breaking the stereoscopic window
Floating the stereoscopic window
Without floating
With floating
Stereoscopic media
postprocessing
Display adaptation
Content-aware display adaptation
• Stereoscopic displays have different
resolutions, aspect ratios and comfort zones.
• To display stereoscopic media properly on different displays, we need content-aware
stereoscopic image display adaptation methods to perform image retargeting and depth
adaptation simultaneously.
2D media retargeting
• image resizing
• video resizing
• Adapt depths to the comfort zone to avoid
visual discomfort such as blur and double vision
Panum’s fusional area
Depth adaptation
Stereoscopic comfort zone
Comfort zone
Previous work (2D image resizing)
• Seam Carving for Content-Aware Image Resizing
[Avidan et al., SIGGRAPH07]
Previous work (2D image resizing)
• Optimized scale-and-stretch (OSS)
[Wang et al., SIGGRAPH ASIA08]
Previous work (linear shift)
Content-aware Display Adaptation
and Interactive Editing
for Stereoscopic Images
Mapping between disparities and depths
Figure: the same scene point appears at $x_L$ in the left image (L) and at $x_R$ in the right image (R).
$d = x_R - x_L$
Mapping between disparities and depths
e: interocular distance
D: viewing distance
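The formula on this slide did not survive extraction. As a hedged stand-in, the standard similar-triangles relation for a viewer facing the screen is Z = eD / (e - d), where d is the on-screen disparity (right-image position minus left-image position) in the same units as e, and Z is the perceived distance from the viewer. The sketch below assumes that relation; the default values are illustrative.

```python
def perceived_depth(d: float, e: float = 6.5, D: float = 60.0) -> float:
    """Perceived distance Z from the viewer for on-screen disparity d.
    Assumed relation: Z = e * D / (e - d), all lengths in the same unit (cm here).
    d = 0 puts the point on the screen, d < 0 in front of it, d > 0 behind it."""
    if d >= e:
        raise ValueError("disparity >= interocular distance: the eyes would have to diverge")
    return e * D / (e - d)


if __name__ == "__main__":
    for d in (-6.5, 0.0, 3.25):
        print(d, perceived_depth(d))
```

With these defaults, zero disparity gives Z = D = 60, a disparity of -6.5 gives Z = 30 (halfway to the screen), and a disparity of 3.25 gives Z = 120.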
Overview
• Minimize
Left
Right
Saliency detection
• Graph-based visual saliency algorithm [Harel et al., NIPS06]
Saliency values range from 0 to 1
Mesh representation
• An image is represented as a rectangular mesh
Quad importance
• Average of saliency values of all pixels in q
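A direct sketch of this definition, assuming the saliency map is a 2D list indexed as saliency[y][x] and each quad is an axis-aligned cell of the rectangular mesh given by half-open pixel bounds. The function name is hypothetical.

```python
def quad_importance(saliency, x0, y0, x1, y1):
    """Mean saliency over pixels with x0 <= x < x1 and y0 <= y < y1."""
    values = [saliency[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


if __name__ == "__main__":
    saliency = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 0.5],
    ]
    print(quad_importance(saliency, 0, 0, 2, 2))  # left quad
    print(quad_importance(saliency, 2, 0, 4, 2))  # right quad
```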
Image correspondence
• Build the stereoscopic constraints
• The state-of-the-art stereo methods are still far from perfection
• We only use sparse feature pairs
– Reliable
– “Sparse correspondences + image warping” are often enough to fool human brain
Feature correspondence
• Each feature point is expressed as a linear combination of the vertices of the quad that contains it
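The symbols on this slide are missing, so the following is only a sketch of one common way to express a point as a linear combination of mesh vertices: bilinear weights inside an axis-aligned quad. Once the weights are fixed, moving the quad's vertices moves the feature with them, which is what lets feature constraints be written in terms of mesh vertices. The function names and vertex ordering are assumptions.

```python
def bilinear_weights(px, py, x0, y0, x1, y1):
    """Weights of point (px, py) with respect to the quad corners, in the order
    top-left, top-right, bottom-left, bottom-right. The weights sum to 1."""
    u = (px - x0) / (x1 - x0)
    v = (py - y0) / (y1 - y0)
    return [(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v]


def combine(weights, vertices):
    """Position given by the weighted sum of the (possibly deformed) vertices."""
    x = sum(w * vx for w, (vx, _) in zip(weights, vertices))
    y = sum(w * vy for w, (_, vy) in zip(weights, vertices))
    return x, y


if __name__ == "__main__":
    w = bilinear_weights(2.5, 7.5, 0, 0, 10, 10)
    original = [(0, 0), (10, 0), (0, 10), (10, 10)]
    stretched = [(0, 0), (20, 0), (0, 10), (20, 10)]  # quad widened to 2x
    print(combine(w, original))
    print(combine(w, stretched))
```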
Energy minimization
• Energy function
Content preservation
• Distortion energy
• Line bending energy
• Alignment energy
• Consistent disparity energy
Energy minimization
Distortion energy
• The optimal is completely defined by and
Line bending energy
Ideal OSS
Our method
Energy minimization
• Energy function
Stereoscopic constraints
• Distortion energy
• Line bending energy
• Alignment energy
• Consistent disparity energy
Alignment energy
• Maintain purely horizontal parallax (avoid vertical parallax)
Consistent disparity energy
• Maintain the relative shapes and depths
• Keep the absolute shapes and depths
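The exact energy formulas are not present in the extracted slides. As a sketch under assumed forms, the alignment term can be written as the sum of squared vertical differences between matched features, and the disparity term as the sum of squared differences between new and target disparities. Here "absolute" is modeled as keeping the original disparities and "relative" as allowing a global scale and offset; both forms and all names are assumptions for illustration.

```python
def alignment_energy(left_pts, right_pts):
    """Sum of squared vertical offsets between matched (x, y) feature pairs.
    Zero means purely horizontal parallax."""
    return sum((yl - yr) ** 2 for (_, yl), (_, yr) in zip(left_pts, right_pts))


def disparity_energy(left_pts, right_pts, original_disparities, scale=1.0, offset=0.0):
    """Sum of squared deviations of new disparities from scale * d + offset.
    scale=1, offset=0 keeps absolute disparities; other values keep them
    only up to a global linear change (relative depths)."""
    total = 0.0
    for (xl, _), (xr, _), d in zip(left_pts, right_pts, original_disparities):
        total += ((xr - xl) - (scale * d + offset)) ** 2
    return total


if __name__ == "__main__":
    left = [(10.0, 5.0), (20.0, 8.0)]
    right = [(12.0, 5.0), (25.0, 9.0)]
    print(alignment_energy(left, right))
    print(disparity_energy(left, right, [2.0, 5.0]))
    print(disparity_energy(left, right, [1.0, 2.5], scale=2.0))
```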
Energy minimization
• Energy function
• Linear least-squares problem with a closed-form solution
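To show what a closed-form least-squares solution looks like in a much simpler setting, here is a 1D analogue that is not the method from these slides: resize a row of columns to a target width while minimizing the importance-weighted squared change of each column width. With the sum constraint handled by a Lagrange multiplier, each column absorbs a share of the change inversely proportional to its importance. All names are hypothetical, and importance values are assumed to be positive.

```python
def resize_columns(widths, importance, target_width):
    """Minimize sum(imp_i * (new_i - w_i)^2) subject to sum(new_i) = target_width.
    Closed form: new_i = w_i + delta * (1 / imp_i) / sum_j(1 / imp_j)."""
    delta = target_width - sum(widths)
    inverse = [1.0 / imp for imp in importance]
    total_inverse = sum(inverse)
    return [w + delta * inv / total_inverse for w, inv in zip(widths, inverse)]


if __name__ == "__main__":
    # The middle column is four times as important, so it changes the least.
    print(resize_columns([10.0, 10.0, 10.0], [1.0, 4.0, 1.0], 39.0))
```

In this toy case the unimportant columns each grow by 4 and the important one by 1, mirroring how salient regions are protected during retargeting.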
User editing
• User specifies depths (Z-axis)
• User specifies 3D position (XYZ-axis)
User editing
User editing
User editing
• User specifies depths (Z-axis)
Target disparity
• User specifies 3D position (XYZ-axis)
User editing
Target 2D position
Results
original
Saliency
Scaling
Our method
Results
Original Scaling Our method (relative)
Our method (absolute)
Results
Results
• Resizing + depth adaptation by user editing
Original Saliency
Results
• Depth adaptation by user editing
Results
• Depth adaptation by user editing
• Treat the boat as a 3D object
User study
• 24 subjects
• Image resizing
• Depth adaptation
User study – part 1
• Image resizing
– 1.5x width
– Comparison between OSS and our method
– Q: Which view is clearer and more comfortable?
User study – part 1
• Vote rate: 90.6% (174/192)
• 92.0% (160/174) preferred our results over those of OSS
No.          1     2     3     4     5     6     7     8
Vote rate    96%   71%   100%  79%   92%   92%   100%  96%
Prefer ours  87%   76%   100%  74%   95%   100%  96%   100%
User study – part 2
• Depth adaptation
– Q: which image’s foreground area is closer to you?
User study – part 2
• Vote rate 91.7% (176/192)
• Correct rate 88.6% (156/176)
No.           1     2     3     4     5     6     7     8
Vote rate     100%  96%   100%  96%   79%   92%   88%   83%
Correct rate  79%   83%   88%   91%   84%   91%   100%  95%
Nonlinear disparity mapping
• video
Stereoscopic copy and paste
• video