Stereoscopic media

(1)

Stereoscopic media

Digital Visual Effects, Yung-Yu Chuang

(2)

3D is hot today

(3)

3D has a long history

• 1830s, stereoscope

• 1920s, first 3D film, The Power of Love, projected dual-strip in the red/green anaglyph format

• 1920s, teleview system

Teleview was the earliest alternate-frame sequencing form of film projection. Through the use of two interlocked projectors, alternating left/right frames were projected one after another in rapid succession. Synchronized viewers attached to the arm-rests of the seats in the theater opened and closed at the same time, and took advantage of the viewer's persistence of vision, thereby creating a true stereoscopic image.

(4)

3D has a long history

• 1950s, the "golden era" of 3-D

• The attempts failed because immature technology resulted in viewer discomfort.

• 1980s, rebirth of 3D, IMAX

(5)

Why could 3D be successful today?

• 3D finally took off once digital processing made 3D films easier both to shoot and to watch.

• New technology for more comfortable viewing experiences

– Accurately-adjustable 3D camera rigs

– Digital processing and post-shooting rectification

– Digital projectors for accurate positioning

– Polarized screen to reduce cross-talk

(6)

3D TVs

(7)

Computers

(8)

Notebooks

(9)

Game consoles

Nintendo 3DS

(10)

HTC EVO 3D

(11)

3D contents (games)

(12)

3D contents (films)

(13)

3D contents (broadcasting)

(14)

3D cameras

Fuji Real3D W1 and W3 ($600)

(15)

Sony HDR-TD10E

(16)

Outline

• Human depth perception

• 3D displays

• 3D cinematography

• Stereoscopic media postprocessing

(17)

Human depth perception

(18)

Space perception

• The ability to perceive and interact with the structure of space is one of the fundamental goals of the visual system.

• Our visual system reconstructs the world from two non-Euclidean inputs, the two distinct retinal images.

(19)

Binocular vision

• Two retinal images are different because the retinas are in slightly different places.

• The combination of signals from each eye makes performance on many tasks better with both eyes than with either eye alone.

• Why two eyes?

1. See more of the world

2. Still can perceive the world after losing one.

(20)

Binocular vision

Overlapping binocular visual fields give predators a better chance to spot small, fast-moving objects.

Horizontal visual field: about 190° in total, of which about 110° is seen by both eyes

Vertical visual field: 60° up, 80° down

(21)

Binocular disparity

• Binocular disparity: the differences between the two retinal images of the same scene.

• Monocular: with one eye

• Stereopsis: the ability to use binocular disparity as a cue to depth.

• Note that, although stereopsis adds richness to depth perception, it is not a necessary condition for depth perception. Examples: rabbits and 2D films.

(22)

Monocular cues to 3D space

• Every view of the world provides multiple depth cues.

• Usually, the cues reinforce each other, combining to produce a convincing and reliable representation of the 3D world.

• Occasionally, however, the cues are contradictory.

• Escher fools us by deliberately manipulating depth cues and other visual inferences. He arranges sensible local cues into a globally impossible story.

(23)

Monocular cues to 3D space

(24)

Occlusion

• Occlusion gives the relative position of objects as a depth cue.

• It occurs in almost every scene, and some argue that it is the most reliable depth cue.

(25)

Occlusion

Occlusion is a nonmetrical depth cue. It only gives us the relative order, but no information about distance in the third dimension.

accidental viewpoints

(26)

Size and position cues

• Projective geometry is built into our visual system. We know that, all else being equal, smaller things are farther away.

The cue of relative size

(27)

Texture gradient

Larger objects in one area and smaller ones in another.

Smaller is interpreted as farther away; the arrangement creates the perception of a ground plane.

(28)

Texture gradient

Why do we get less of a sense of depth?

(29)

Relative height

Objects at different distances on the ground plane will form images at different heights in the retinal image.

Objects farther away will be seen as higher in the image.

(30)

Familiar size

Familiar size: a depth cue based on knowledge of the typical size of objects.

(31)

Aerial perspective

Aerial (haze) perspective: light is scattered by the atmosphere, and more light is scattered when we look through more atmosphere.

(32)

Aerial perspective

(33)

Linear perspective

Parallel lines in the 3D world appear to converge in the 2D image, except for the ones parallel to the image plane.

vanishing point

vanishing line
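The convergence of parallel lines can be checked with a small pinhole-camera sketch. The camera model (projection x = f·X/Z, camera looking along +Z) and the function names are assumptions made for illustration only.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) with Z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def vanishing_point(direction, focal_length=1.0):
    """Image point that all 3D lines with this direction converge to.

    Returns None when the direction is parallel to the image plane
    (no Z component), in which case the lines stay parallel in the image.
    """
    dx, dy, dz = direction
    if dz == 0:
        return None
    return (focal_length * dx / dz, focal_length * dy / dz)


# Two parallel rails running away from the camera along +Z.
far_left = project((-1.0, -1.0, 1.0 + 1e6))
far_right = project((1.0, -1.0, 1.0 + 1e6))
```

Both far points land very close to `vanishing_point((0, 0, 1))`, while a direction such as `(1, 0, 0)` has no vanishing point.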

(34)

Monocular cues

(35)

Pictorial depth cues

• All these monocular cues are pictorial depth cues produced by the projection of the 3D world onto the 2D surface of the retina.

• Combined with proper shading, these cues could be effective in illustrating 3D.

(36)

Pictorial depth cues

(37)

Pictorial depth cues

(38)

Pictorial depth cues

(39)

Monocular cues

• Because there are many monocular cues and they are no less important than binocular ones, some images could look more stereoscopic than others. For example, 2D-to-3D conversion is easier for some images but more difficult for others.

(40)

Motion cues

• Motion parallax is a non-pictorial depth cue. When your head moves, closer objects move faster than more distant ones.
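As a rough illustration of why closer objects move faster, the sketch below uses a pinhole model in which a sideways head shift produces an image shift inversely proportional to depth. The model, the unit focal length, and the function name are assumptions for illustration.

```python
def parallax_shift(depth, head_shift, focal_length=1.0):
    """Image displacement of a static point at the given depth when the
    viewpoint translates sideways by head_shift (pinhole model)."""
    return focal_length * head_shift / depth


near_shift = parallax_shift(depth=1.0, head_shift=0.1)   # object 1 m away
far_shift = parallax_shift(depth=10.0, head_shift=0.1)   # object 10 m away
```

For the same head movement, the near object shifts ten times as far in the image as the far one.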

(41)

Motion cues

• A very effective depth cue, but it relies on head movements.

• Some 3D games are designed this way.

(42)

Binocular vision

interocular distance: ~6.5 cm

(43)

Accommodation and convergence

• Eyes need to be focused to see objects at different distances clearly.

• The human eye focuses via a process called accommodation, in which the lens gets fatter as we direct our gaze toward nearer objects.

(44)

Accommodation and convergence

(45)

Accommodation and convergence

• Humans can perceive depth from accommodation and convergence.
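To give a feel for the convergence cue, the following sketch computes the angle between the two lines of sight for a fixation point straight ahead. The symmetric-fixation geometry and the function name are assumptions; the 6.5 cm default is the interocular distance quoted in these slides.

```python
import math


def vergence_angle_deg(distance_cm, interocular_cm=6.5):
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at the given distance (symmetric fixation assumed)."""
    return math.degrees(2.0 * math.atan(interocular_cm / (2.0 * distance_cm)))


near = vergence_angle_deg(30.0)     # roughly reading distance
far = vergence_angle_deg(1000.0)    # 10 m away
```

The angle is about 12 degrees at 30 cm but well under half a degree at 10 m, which suggests why convergence is mainly informative for nearby objects.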

(46)

Binocular vision

red crayon

Assume that the viewer is focusing on the red crayon.

(47)

Binocular vision

• Note that the retinal images are inverted. The object of our gaze falls on the fovea, the center of the retina.

• The blue one happens to fall on corresponding retinal points.

(48)

Binocular vision

Horopter: the surface with zero disparity.

Panum’s fusional area: the region of space, in front of and behind the horopter, within which binocular vision is possible.

Diplopia: double vision

(49)

Binocular vision

(50)

Binocular vision

(51)

Binocular vision

left view right view

(52)

Binocular vision

(53)

Binocular vision

(54)

Binocular vision

• Our nervous system cannot measure the angle very accurately. Thus, we can only perceive relative depth.

• The role of eye movement is to bring the images within Panum’s fusional area.

(55)

Stereoscope

• Invented by Sir Charles Wheatstone in the 1830s. Our visual system treats binocular disparity as a depth cue, no matter whether it is produced by actual or simulated images.

(56)

Stereoscope

• Invented in the 1850s, it is stereopsis for the masses.

(57)

Free vision

Free vision: the technique of converging (crossing) or diverging the eyes in order to view a stereogram without a stereoscope.

(58)

Random dot stereograms

• For 100 years after the invention of the stereoscope, it was supposed that stereopsis occurred relatively late in the processing of visual stimuli; i.e., we recognize facial features and then use them to find depth.

(59)

Random dot stereograms

• Julesz thought that stereopsis might help reveal camouflaged objects (the example of cats and mice) and invented random dot stereograms.

There is no monocular cue, yet we can recognize the shape. Disparity alone is enough for stereopsis!
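A random dot stereogram can be sketched in a few lines: both views share the same random dots, except that a central square is shifted horizontally in one view. The image size, square size, shift, and function name below are illustrative assumptions, not Julesz's original parameters.

```python
import random


def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary images as lists of rows.

    The right view equals the left view except that a central square is
    shifted `shift` pixels to the left, which gives it a disparity.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    lo = (size - square) // 2
    hi = lo + square
    for y in range(lo, hi):
        # Copy the square from the left view into its shifted position.
        for x in range(lo, hi):
            right[y][x - shift] = left[y][x]
        # Fill the strip uncovered by the shift with fresh random dots.
        for x in range(hi - shift, hi):
            right[y][x] = rng.randint(0, 1)
    return left, right


left_view, right_view = random_dot_stereogram()
```

Each view on its own is pure noise; the square exists only in the relationship between the two views.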

(60)

Random dot stereograms

• Paint the scene in white and spray it with pepper.

(61)

Stereoblindness

• An inability to make use of binocular disparity as a depth cue.

• Approximately 3% to 5% of the population lacks stereoscopic depth perception.

(62)

Summary

• Monocular cues: occlusion, size and position cues, aerial perspective, linear perspective.

• Motion cues

• Accommodation and convergence cues

• Binocular cues: resolve stereo correspondence problem, Panum’s area.

(63)

3D displays

(64)

3D displays

spectral glasses

passive

active

(65)

3D displays

• Note that monocular cues can be produced by rendering/capturing the contents correctly.

• Most 3D displays enrich space perception by exploiting binocular vision. Thus, they have to present different content to each of the two eyes.

(66)

With glasses

anaglyph: spectrum multiplex

polarization glasses: polarization multiplex

shutter glasses: time multiplex

head mounted displays: space multiplex

(67)

Anaglyph glasses

Many color formats

Supported by YouTube and Google Street View

(68)

Anaglyph glasses

Pros: cheap (home-made)

Cons: without colors, bad 3D
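One of the many anaglyph color formats, red/cyan, can be sketched as follows: the red channel comes from the left view and the green and blue channels from the right view. The image representation (rows of RGB tuples) and the function name are assumptions for illustration.

```python
def make_anaglyph(left, right):
    """Combine two same-sized RGB images (rows of (r, g, b) tuples) into a
    red/cyan anaglyph: red from the left view, green and blue from the right."""
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]


left_img = [[(10, 20, 30), (200, 0, 0)]]
right_img = [[(40, 50, 60), (0, 100, 150)]]
anaglyph = make_anaglyph(left_img, right_img)
```

Because each eye receives only part of the spectrum, the original colors cannot be reproduced faithfully, which matches the drawback listed above.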

(69)

Polarization glasses

linear polarizer

(70)

Polarization glasses

(71)

Polarization glasses

• How to display two polarized frames?

• Need accurate calibration so that the frames are aligned.

• Need a non-depolarizing screen.

• Cheaper glasses. Used in theaters.

(72)

Polarization glasses

Pros: cheaper, no need to sync

Cons: lower resolution

(73)

Problems with linear polarizer

• Non-aligned viewers might see cross-talk

(74)

Circular polarizer

(75)

Spectral glasses

• Wavelength multiplexing. Infitec GmbH. Adapted by Dolby 3D digital cinema.

• An advanced anaglyph that divides colors better.

• We will talk about human color perception first.

(76)

Spectral power distribution

Figure: spectral power distribution of a fluorescent light, with wavelength marks at 400nm (bluish), 550nm (green) and 650nm (red)

(77)

Spectral glasses

• Anyway, we learn that although light is a complex combination of electromagnetic waves of different wavelengths, it can be represented by three primary colors due to human perception.

(78)

Spectral glasses

• Filters are added to projectors and glasses so that only light of specific wavelengths can pass through. The color box adjusts colors.

(79)

Spectral glasses

(80)

Dolby 3D Digital Cinema

• Only requires one projector.

Pros: no need for active hi-speed filter, no need for special screen

Cons: expensive filters, darker

specular

(81)

Shutter glasses

• Twice the frequency (usually 120Hz).

• Liquid crystal. Needs to sync.

• Persistence of vision

Pros: good 3D

Cons: need to sync, darker, expensive, not good for multi-user
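The time multiplexing behind shutter glasses amounts to alternating frames between the eyes, so each eye sees half of the display rate. In this small sketch, the choice that even frames go to the left eye is an assumption, as are the function names.

```python
def eye_for_frame(frame_index):
    """Which eye's shutter is open for a given display frame
    (even frames for the left eye is an arbitrary convention here)."""
    return "left" if frame_index % 2 == 0 else "right"


def per_eye_rate(display_hz=120.0):
    """Effective refresh rate seen by each eye."""
    return display_hz / 2.0


schedule = [eye_for_frame(i) for i in range(4)]
```

At the usual 120Hz, each eye is still served at 60Hz, and persistence of vision hides the intervals in which its shutter is closed.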

(82)

Shutter glasses

(83)

TI DLP technology

passive polarization

active shutter

passive shutter (2ms sync frame)

(84)

Head mounted displays

Pros: very good 3D, could be used with head trackers

Cons: expensive, heavy, closed, single-user

(85)

Autostereoscopic

• Control light to radiate in specific directions by accurate optics calculation.

– Spatial-multiplexed

– Time-multiplexed

(86)

Spatial-multiplexed

• The next question is how to let the left eye see only the left image and the right eye see only the right one.
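On the image side, spatial multiplexing for a 2-view display can be sketched as column interleaving; the optics (barrier or lenses) then steer alternate columns to different eyes. The assignment of even columns to the left view and the function name are assumptions for illustration.

```python
def interleave_columns(left, right):
    """Build a 2-view spatially multiplexed frame: even pixel columns are
    taken from the left view, odd columns from the right view."""
    return [
        [left_row[x] if x % 2 == 0 else right_row[x] for x in range(len(left_row))]
        for left_row, right_row in zip(left, right)
    ]


left_img = [["L0", "L1", "L2", "L3"]]
right_img = [["R0", "R1", "R2", "R3"]]
panel = interleave_columns(left_img, right_img)
```

Each eye ends up with only half of the panel's columns, so horizontal resolution per view is halved.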

(87)

Autostereoscopic

(88)

Barrier

• LC barrier could switch between 2D and 3D display modes.

(89)

Lenticular

• It is also possible to switch between 2D and 3D display modes.

(90)

Slanted

• To make the horizontal and vertical aspect ratio more consistent.

(91)

2-view

(92)

Multiple-view

(93)

Philips 9-view format

(94)

Multiple-view

• Need more inputs.

• Reduced resolution.

Pros: no glasses, multi-user

Cons: location, bad 3D

(95)

Time-multiplexed

• Directional backlight unit, 3D film

Pros: resolution

Cons: only 2 views

(96)

Comparisons

(97)

Common 3D formats

• Side-by-side

• Multi-view

• 2D+Z
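Of the formats listed above, side-by-side is the simplest to decode: one frame carries both views as its two halves. The sketch below assumes the left view occupies the left half, which is a convention chosen here for illustration, as is the function name.

```python
def split_side_by_side(frame):
    """Split a side-by-side frame (list of pixel rows) into (left, right)
    views, assuming the left view is stored in the left half."""
    half = len(frame[0]) // 2
    left = [row[:half] for row in frame]
    right = [row[half:2 * half] for row in frame]
    return left, right


frame = [[1, 2, 3, 4], [5, 6, 7, 8]]
left_view, right_view = split_side_by_side(frame)
```

Each view has half the horizontal resolution of the carrier frame, so a player typically stretches it back to full width before display.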

(98)

2-view

(99)

Multi-view

(100)

2D+Z

(101)

Volumetric displays

• Non-volumetric displays fatigue viewers after long viewing times because of the inconsistency between focus and convergence.

• Volumetric displays are better in this respect, but they are much more expensive and require more data (more views are required).

• Pros: good 3D, no glasses, multi-user

• Cons: often limited in size, suitable only for objects, not scenes

(102)

Actuality System

(103)

USC ICT

(104)

SONY RayModeler

(105)

Summary

• Many 3D displays will be produced in the coming years.

• Glasses-based 3D display technology is very mature.

• Autostereoscopic displays need more time and will be used for advertisement first.

• 3D contents are the major bottleneck.

• But 3D cameras are around the corner.

(106)

3D cinematography

(107)

Capture stereoscopic photos

(108)

Capture stereoscopic photos

adjust baseline adjust convergence

(109)

A few terms

• Depth bracket

(110)

A few terms

• Depth chart

(111)

Sources of visual fatigue

• The most important point to be considered in stereoscopic cinema.

• Symptoms: conscious (headache, tiredness, soreness of the eyes) or unconscious (perturbation of the oculomotor system)

• Some even reported the case of an infant whose oculomotor system was permanently disturbed by viewing a stereoscopic movie.

(112)

Sources of visual fatigue

• Binocular asymmetry: photometric or geometric differences between the left and right retinal images. Kooi and Toet experimentally measured thresholds on various asymmetries that lead to visual discomfort.

– The human visual system is most sensitive to vertical binocular disparities.

– A 35 arcmin horizontal disparity range is quite acceptable, and 70 arcmin disparity is too much to be viewed.
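To relate the arcmin figures to something measurable on a screen, the sketch below converts a visual angle into the length it subtends at a given viewing distance. Treating the angular disparity as the angle subtended by the on-screen parallax is a small-angle approximation, and the 2 m viewing distance is only an example value.

```python
import math


def disparity_on_screen(arcmin, viewing_distance):
    """Length on the screen subtended by a visual angle given in arc minutes,
    seen from viewing_distance (result is in the same unit as the distance)."""
    angle = math.radians(arcmin / 60.0)
    return 2.0 * viewing_distance * math.tan(angle / 2.0)


comfortable_mm = disparity_on_screen(35, 2000.0)   # viewer 2 m from the screen
excessive_mm = disparity_on_screen(70, 2000.0)
```

Under these assumptions, 35 arcmin corresponds to roughly 2 cm of on-screen parallax at 2 m, and 70 arcmin to roughly 4 cm.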

(113)

Sources of visual fatigue

Figure panels: vertical shift, size difference, distortion difference, horizontal shift (diverged)

(114)

(115)

Sources of visual fatigue

• Crosstalk (crossover or ghosting): typical values for crosstalk are 0.1-0.3% with polarization-based systems, and 4-10% with LCD shutter glasses. It could be reduced by a preprocess called ghost-busting.

• Breaking the proscenium rule (breaking the stereoscopic window): a simple solution is to float the window.
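For the crosstalk bullet, a minimal sketch of ghost-busting is shown below. It assumes a simple linear leak model, in which each eye sees its own image plus a fraction `leak` of the other eye's image, and intensities in [0, 1]; real systems may use a different model, and the function names are hypothetical.

```python
def clamp01(value):
    """Limit an intensity to the displayable range [0, 1]."""
    return max(0.0, min(1.0, value))


def seen_through_glasses(shown_left, shown_right, leak):
    """Intensities reaching each eye under the linear leak model."""
    return (shown_left + leak * shown_right, shown_right + leak * shown_left)


def ghost_bust(target_left, target_right, leak):
    """Pre-compensate a pixel pair so that, after leakage, each eye sees
    the target value (exact only when no clamping is needed)."""
    scale = 1.0 - leak * leak
    shown_left = clamp01((target_left - leak * target_right) / scale)
    shown_right = clamp01((target_right - leak * target_left) / scale)
    return shown_left, shown_right


shown = ghost_bust(0.5, 0.2, leak=0.1)
seen = seen_through_glasses(shown[0], shown[1], leak=0.1)
```

The compensation needs to subtract light, so it fails where one view is black and the other is bright: `ghost_bust(0.0, 1.0, 0.1)` clamps at zero and a ghost remains.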

(116)

Sources of visual fatigue

• Horizontal disparity limits: the eyes should not diverge, and there is a certain limit below which the human visual system can fuse.

• Vertical disparity: causes torsion motion of the ocular globes and is only tolerable for short time intervals.

• Vergence-accommodation conflicts: the focus distance of the eyes is not consistent with their convergence angle. They happen very often with stereoscopic displays and could be relaxed by using the depth of field of the visual system.

(117)

Stereopsis

• Stereopsis could be broken for the following reasons: not aligned, large disparity, ghosting, failure to fuse

(118)

Vergence-accommodation

• Vergence, convergence, divergence: the angle formed by the optical axes of the two eyes in binocular vision.

• Plane of convergence.

(119)

Vergence-accommodation

• There is an area around it where vergence and accommodation agree, which is called the zone of comfort. The discrepancy between vergence and accommodation could damage visual acuity before the age of 8.

real world stereoscopic displays

(120)

Principles to avoid fatigue

• Maintain coordination among views

• Have a continuous depth chart

• Place rest area between strong 3D shots

• Use shallow depth of field for excessive depth range

• Respect the stereoscopic window

(121)

Some other rules

(122)

Excessive depth range

(123)

Bending the stereoscopic window

(124)

Breaking the stereoscopic window

(125)

Floating the stereoscopic window

(126)

Without floating

(127)

With floating

(128)

Stereoscopic media postprocessing

(129)

Display adaptation

(130)

Content-aware display adaptation

• Stereoscopic displays have different

resolutions, aspect ratios and comfort zones.

• To display stereoscopic media properly on different displays, we need content-aware stereoscopic image display adaptation methods to perform image retargeting and depth adaptation simultaneously.

(131)

2D media retargeting

Figure labels: image, image resizing; video, video resizing

(132)

Depth adaptation

• Adapt depths to the comfort zone to avoid visual discomfort such as blur and double vision

Panum’s fusional area

(133)

Stereoscopic comfort zone

• Vergence (rotation of both eyes in opposite directions to maintain binocular vision)

• Accommodation (change of focus)

• Since accommodation and vergence are reflexively coupled mechanisms, their artificial decoupling when viewing stereoscopic displays has often been theorized as a significant factor underlying the occurrence of visual discomfort

• The ranges of accommodation and vergence that can be achieved without any excessive errors in either direction are referred to as the zone of clear single binocular vision

(134)

Stereoscopic comfort zone

Comfort zone

(135)

Previous work (2D image resizing)

• Seam Carving for Content-Aware Image Resizing [Avidan et al., SIGGRAPH07]

(136)

Previous work (2D image resizing)

• Optimized scale-and-stretch (OSS) [Wang et al., SIGGRAPH ASIA08]

(137)

Previous work (linear shift)

(138)

Content-aware Display Adaptation and Interactive Editing for Stereoscopic Images

(139)

Mapping between disparities and depths

$d = x_R - x_L$

Figure labels: $x_L$, $x_R$, L, R

(140)

Mapping between disparities and depths

Z: perceived depth

d: screen disparity

e: interocular distance

D: viewing distance

L R

(141)

Mapping between disparities and depths

e: interocular distance

D: viewing distance
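The slide's own formula is not recoverable from the text, so the following is a sketch of the standard similar-triangles relation using the symbols defined above. It assumes the eyes are a distance e apart, the screen is at distance D, and the disparity is d = x_R - x_L, which gives d / e = (Z - D) / Z and hence Z = eD / (e - d). The function name is hypothetical.

```python
def perceived_depth(disparity, interocular, viewing_distance):
    """Perceived distance Z from the viewer for a screen disparity
    d = x_R - x_L, using Z = e * D / (e - d). All lengths share one unit."""
    if disparity >= interocular:
        raise ValueError("disparity >= interocular distance: the eyes would have to diverge")
    return interocular * viewing_distance / (interocular - disparity)


on_screen = perceived_depth(0.0, 6.5, 200.0)    # zero disparity
behind = perceived_depth(3.25, 6.5, 200.0)      # positive (uncrossed) disparity
in_front = perceived_depth(-6.5, 6.5, 200.0)    # negative (crossed) disparity
```

Zero disparity places the point on the screen, positive disparity behind it, and negative disparity in front of it; as d approaches e the perceived depth grows without bound.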

(142)

Overview

Left

Right

• Minimize

(143)

Saliency detection

• Graph-based visual saliency algorithm [Harel et al., NIPS06]

Saliency map scale: 0 to 1

(144)

Mesh representation

• An image is represented as a rectangular mesh

(145)

Quad importance

• Average of saliency values of all pixels in q
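The quad importance described above can be sketched directly. The sketch assumes the saliency map is a list of rows with values in [0, 1] and that the quad q is an axis-aligned cell of the initial mesh given by its pixel bounds; the function name is hypothetical.

```python
def quad_importance(saliency, x0, y0, x1, y1):
    """Mean saliency over the pixels of an axis-aligned quad covering
    columns x0..x1-1 and rows y0..y1-1."""
    values = [saliency[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


saliency_map = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 0.5],
]
background = quad_importance(saliency_map, 0, 0, 2, 2)
foreground = quad_importance(saliency_map, 2, 0, 4, 2)
```

Quads with high importance are the ones a content-aware warp should distort least.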

(146)

Image correspondence

• Build the stereoscopic constraints

• The state-of-the-art stereo methods are still far from perfect

• We only use sparse feature pairs

– Reliable

– “Sparse correspondences + image warping” are often enough to fool the human brain

(147)

Feature correspondence

• Feature extraction

– SIFT feature

• Feature matching

• Verification

– Estimate fundamental matrix using RANSAC

• Non-maximal suppression

(148)

Feature correspondence

• Each feature point is expressed as a linear combination of the mesh vertices.
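The symbols on this slide were lost, so the sketch below shows one common way to write a point as a linear combination of vertices: bilinear weights over the four corners of the axis-aligned quad that contains it. Whether the original method uses exactly these weights is an assumption, and the function names are hypothetical.

```python
def bilinear_weights(px, py, x0, y0, x1, y1):
    """Weights of point (px, py) with respect to the corners of the quad
    [x0, x1] x [y0, y1], ordered top-left, top-right, bottom-right, bottom-left."""
    u = (px - x0) / (x1 - x0)
    v = (py - y0) / (y1 - y0)
    return [(1 - u) * (1 - v), u * (1 - v), u * v, (1 - u) * v]


def combine(weights, vertices):
    """Position given by the weighted sum of (x, y) vertices."""
    x = sum(w * vx for w, (vx, _) in zip(weights, vertices))
    y = sum(w * vy for w, (_, vy) in zip(weights, vertices))
    return (x, y)


weights = bilinear_weights(2.5, 1.0, 0.0, 0.0, 10.0, 4.0)
original = combine(weights, [(0, 0), (10, 0), (10, 4), (0, 4)])
warped = combine(weights, [(0, 0), (20, 0), (20, 4), (0, 4)])
```

Because the weights stay fixed, the feature position after warping is linear in the unknown vertex positions, which keeps any energy built on features quadratic in the unknowns.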

(149)

Energy minimization

• Energy function

Content preservation

– Distortion energy

– Line bending energy

– Alignment energy

– Consistent disparity energy

(150)

Energy minimization

(151)

Distortion energy

• The optimal is completely defined by and

(152)

Line bending energy

Figure labels: Ideal, OSS, Our method

(153)

Energy minimization

• Energy function

Stereoscopic constraints

– Distortion energy

– Line bending energy

– Alignment energy

– Consistent disparity energy

(154)

Alignment energy

• Maintain purely horizontal parallax (avoid vertical parallax)

(155)

Consistent disparity energy

• Maintain the relative shapes and depths

• Keep the absolute shapes and depths
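The two stereoscopic constraints can be illustrated with simple quadratic penalties on matched feature pairs. The exact energy terms are not recoverable from the slides, so the forms below are assumptions: vertical offsets are penalized for alignment, and deviations from a per-feature target disparity are penalized for the absolute-depth case.

```python
def alignment_energy(pairs):
    """Sum of squared vertical offsets between matched features.
    pairs: list of ((x_left, y_left), (x_right, y_right)) warped positions."""
    return sum((yl - yr) ** 2 for (_, yl), (_, yr) in pairs)


def absolute_disparity_energy(pairs, target_disparities):
    """Sum of squared differences between each pair's disparity
    x_right - x_left and its target disparity."""
    return sum(
        ((xr - xl) - target) ** 2
        for ((xl, _), (xr, _)), target in zip(pairs, target_disparities)
    )


pairs = [((10.0, 5.0), (12.0, 5.0)), ((20.0, 8.0), (19.0, 9.0))]
e_align = alignment_energy(pairs)
e_disp = absolute_disparity_energy(pairs, [2.0, 0.0])
```

Both penalties are quadratic in the feature positions, which is compatible with a least-squares formulation of the overall energy.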

(156)

Energy minimization

• Energy function

• Linear least squares problem → closed form solution
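To show what a closed-form solution of a linear least squares problem looks like, here is a toy 1D analogue solved through the normal equations. It is not the paper's energy: the 1D "mesh", the soft pinning of the two end vertices with a large weight, and all function names are assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(rows, n):
    """Minimize sum(weight * (coeffs . x - rhs)^2) over x via the normal
    equations. rows: list of (coeffs, rhs, weight)."""
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for coeffs, rhs, weight in rows:
        for i in range(n):
            atb[i] += weight * coeffs[i] * rhs
            for j in range(n):
                ata[i][j] += weight * coeffs[i] * coeffs[j]
    return solve_linear_system(ata, atb)


def resize_1d(lengths, importance, target_width, pin_weight=1e6):
    """Place vertices on a line so that important segments keep their
    original length while the total width becomes target_width."""
    n = len(lengths) + 1
    rows = []
    for i, (length, weight) in enumerate(zip(lengths, importance)):
        coeffs = [0.0] * n
        coeffs[i], coeffs[i + 1] = -1.0, 1.0   # x[i+1] - x[i] should equal length
        rows.append((coeffs, length, weight))
    first = [0.0] * n
    first[0] = 1.0
    last = [0.0] * n
    last[-1] = 1.0
    rows.append((first, 0.0, pin_weight))           # pin the left end at 0
    rows.append((last, target_width, pin_weight))   # pin the right end
    return least_squares(rows, n)


xs = resize_1d([1.0, 1.0], [10.0, 1.0], target_width=3.0)
```

With importances 10 and 1, the important segment grows only to about 1.09 while the unimportant one absorbs most of the stretch, which is the behavior a content-aware warp aims for.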

(157)

User editing

• User specifies depths (Z-axis)

• User specifies 3D position (XYZ-axis)

(158)

User editing

(159)

User editing

(160)

User editing

• User specifies depths (Z-axis)

Target disparity

(161)

User editing

• User specifies 3D position (XYZ-axis)

Target 2D position

(162)

Results

original Scaling

Saliency Our method

(163)

Results

Original, Scaling, Our method (relative), Our method (absolute)

(164)

Results

(165)

Results

• Resizing + depth adaptation by user editing

Original Saliency

(166)

Results

• Depth adaptation by user editing

Figure: disparity versus feature index for (a) Original, (b) All negative, (c) Half negative, half positive, (d) All positive

(167)

Results

• Depth adaptation by user editing

• Treat the boat as a 3D object

(168)

User study

• 24 subjects

• Image resizing

• Depth adaptation

(169)

User study – part 1

• Image resizing

– 1.5x width

– Comparison with OSS and our method

– Q: which is clearer and more comfortable to view?

(170)

User study – part 1

• Vote rate 90.6% (174/192)

• 92.0% (160/174) preferred our results to OSS’s

No. 1 2 3 4 5 6 7 8

Vote rate 96% 71% 100% 79% 92% 92% 100% 96%

Prefer ours 87% 76% 100% 74% 95% 100% 96% 100%

(171)

User study – part 2

• Depth adaptation

(172)

User study – part 2

• Depth adaptation

– Q: which image’s foreground area is closer to you?

(173)

User study – part 2

• Vote rate 91.7% (176/192)

• Correct rate 88.6% (156/176)

No. 1 2 3 4 5 6 7 8

Vote rate 100% 96% 100% 96% 79% 92% 88% 83%

Correct rate 79% 83% 88% 91% 84% 91% 100% 95%

(174)

Nonlinear disparity mapping

• video

(175)

Stereoscopic copy and paste

• video