We perceive a 3D object when light reflected from its surface arrives at our retina.
The retinal image, however, is two-dimensional. Hence, the question is how the visual
system reconstructs a 3D representation of an object from 2D retinal images. It is well
known that a human observer can utilize depth cues available in an image to facilitate
this reconstruction. One of the useful cues is shading, the gradual variation of luminance across a surface (Horn, 1970). The luminance of a point on a surface depends not only on the intensity and incident angle of the light reaching the surface but also on the slant of the surface at that point, which determines the proportion of the reflected light that reaches the eyes of an observer. Hence, under the same illumination, shading provides information about the slant and in turn determines the relative depth of each point on the surface (Horn, 1986; Horn & Brooks, 1989). Ramachandran (1988a,
1988b) further showed that the human visual system uses two assumptions in extracting shape from shading: one is that there is a single light source illuminating
the whole scene, and the other is that the light is shining from above. With these two
constraints, the visual system is able to solve the shape-from-shading problem: a top-to-bottom luminance gradient from bright to dark suggests a convex surface, whereas a gradient from dark to bright suggests a concave one. This effect is best demonstrated in the crater illusion, in which inverting an image can alter the perceived surface from convex to concave and vice versa (Gibson, 1950).
The Ramachandranian shape from shading has its limits. As shown in the hollow-face illusion (Gregory, 1970), a hollow face is always perceived as convex regardless of whether the light comes from above or below. One interpretation of this illusion is that faces are familiar objects with known shapes and thus do not require shading information to resolve their 3D shape; hence, face perception should be insensitive to shading. This interpretation, however, contradicts a number of studies showing that face recognition is impaired when the luminance distribution of a face image is inverted, as in a photographic negative (Gilad, Meng, & Sinha, 2009;
Johnston, Hill, & Carman, 1992; Kemp, Pike, White, & Musselman, 1996; Liu &
Chaudhuri, 1997). In addition, it has been shown that it is more difficult to recognize
bottom-lit faces than top-lit faces (Hill & Bruce, 1996; Johnston et al., 1992; Liu, Collin,
Burton, & Chaudhuri, 1999; Liu, Collin, Rainville, & Chaudhuri, 2000). That is, a change of illumination condition has little effect on shape from shading for a face but does have an effect on face recognition. However, to the best of our knowledge, there is no reliable measurement of whether observers perceive the same depth in a hollow face or a bottom-lit face as they do in a normal top-lit face. Perhaps only part of the information in those unnaturally lit faces can be recovered by the visual system, and that information is enough to perceive a hollow face as convex but not enough for an observer to recognize a bottom-lit face as well as a top-lit one.
We thus investigated the effect of shading on face perception by observing how
face discrimination ability changes with lighting direction. To avoid a nuisance “naturalness” factor that may result from the Ramachandranian lighting-from-above constraint, we shifted the position of the light source laterally from the front to the side of the face. Hence, all light source positions are ecologically equally plausible. The illumination and
face information can be separated by an algorithm based on the assumptions that the
face is symmetric (Grammer & Thornhill, 1994; Rhodes, 1988; Rhodes, Geddes, Jeffery,
Dziurawiec, & Clark, 2002; Rhodes, Peters, Lee, Morrone, & Burr, 2005) and is
illuminated by a single light source. Facial characteristics may be partitioned into two
types: one is the information from the inherent coloration and reflectance properties of the facial surface, known as the surface albedo; the other is the information from the illumination of the 3D face shape, termed the illumination component. Since neither component can have a luminance less than zero, we can recast them in terms of their contrast variation around the mean (scaled to 1) over the image domain:

I(x, y) = [1 + C(x, y)][1 + L(x, y)] = 1 + C(x, y) + L(x, y) + C(x, y)L(x, y),    (1)

where C(x, y) is the contrast of the inherent coloration of the face, or albedo, and L(x, y) is the illumination component. Since both C and L in eq. 1 are constrained to be less than one, their product in the interaction term must be much less than 1 and can therefore be considered negligible in most situations. Thus,

I(x, y) ≈ 1 + C(x, y) + L(x, y).    (2)

Each term may be specified in terms of its symmetric (S) and asymmetric (A) components:

C(x, y) = C_S(x, y) + C_A(x, y),    L(x, y) = L_S(x, y) + L_A(x, y),    (3)

where L_S(x, y) = [L(x, y) + L(−x, y)]/2 and L_A(x, y) = [L(x, y) − L(−x, y)]/2 for the illumination component. The albedo component, which is assumed to be symmetric in a frontal view (C(x, y) = C(−x, y)), therefore has no asymmetric component (C_A = 0).
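This symmetric/asymmetric decomposition is straightforward to compute. The following is a minimal Python sketch for illustration only (not the actual implementation), assuming the face image is a NumPy array whose vertical midline coincides with the face's axis of symmetry:

```python
import numpy as np

def sym_asym_split(img):
    """Split an image into symmetric (S) and asymmetric (A) parts
    about its vertical midline: f_S = [f(x) + f(-x)]/2 and
    f_A = [f(x) - f(-x)]/2, so that f = f_S + f_A."""
    mirror = img[:, ::-1]          # left-right flip: f(-x, y)
    sym = (img + mirror) / 2.0
    asym = (img - mirror) / 2.0
    return sym, asym

# Sanity checks on a toy image: the two parts sum back to the
# original, and each has the expected mirror (anti)symmetry.
rng = np.random.default_rng(0)
img = rng.uniform(size=(8, 8))
s, a = sym_asym_split(img)
assert np.allclose(s + a, img)
assert np.allclose(s, s[:, ::-1])   # unchanged under mirroring
assert np.allclose(a, -a[:, ::-1])  # flips sign under mirroring
```

By construction, a perfectly symmetric image has a zero asymmetric part, which is why any asymmetry that survives the split is attributed to the illumination component.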
All that is needed for complete segregation of the surface albedo from the illumination component is to distinguish the residual symmetric illumination component LS from the entire albedo component CS. We hypothesize that (a) CS contains predominantly high-spatial-frequency energy whereas LS contains predominantly low-spatial-frequency energy, and (b) the remaining frequency bands in each case are of little relevance to face reconstruction. Hence, we can separate the two symmetric components
by a spatial frequency filter. Figure 1 demonstrates our method. Panel A shows the asymmetric low-spatial-frequency component, which represents the asymmetric illumination information, LA, of the original face image (Panel C), while Panel B shows the symmetric high-spatial-frequency component, which represents the albedo component, CS. The hybrid face in Panel E is the sum of LA and CS and is almost perceptually identical to the original face. That is, the combination of the asymmetric high-spatial-frequency and the symmetric low-spatial-frequency components (Panel D) plays little role in face perception. Hence, subtracting them from the original face image has little effect on face perception.
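The segregation just described can be sketched in code. The Python sketch below is a simplified stand-in for the actual procedure: it assumes mirror symmetry about the vertical midline and uses a crude FFT low-pass with a hypothetical `cutoff` value in place of whatever filter was actually used:

```python
import numpy as np

def lowpass(img, cutoff=0.1):
    """Crude isotropic low-pass: zero out all spatial frequencies
    above `cutoff` (in cycles/pixel)."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    keep = np.hypot(fx, fy) <= cutoff
    return np.real(np.fft.ifft2(np.fft.fft2(img) * keep))

def hybrid_face(img, cutoff=0.1):
    """Recombine the asymmetric low-SF component (illumination, LA)
    and the symmetric high-SF component (albedo, CS) of a face."""
    mirror = img[:, ::-1]
    sym = (img + mirror) / 2.0       # CS + LS (symmetric part)
    asym = (img - mirror) / 2.0      # asymmetric part (mostly LA)
    LA = lowpass(asym, cutoff)       # asymmetric low-SF: Panel A
    CS = sym - lowpass(sym, cutoff)  # symmetric high-SF: Panel B
    # Restore the mean luminance; the remaining asymmetric high-SF
    # and symmetric low-SF energy (Panel D) is discarded.
    return img.mean() + LA + CS
```

Here the returned image corresponds to the Panel E hybrid; the cutoff frequency is an arbitrary choice for illustration, not the value used in the study.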
Therefore, in the current study, we investigated how the change of the asymmetric low-spatial-frequency component with illumination condition affects face discrimination and depth judgment on hybrid faces. The hybrid-picture paradigm (Schyns & Oliva, 1997, 1999) was designed to study the relative contribution of specific spatial frequencies. In our case, the hybrid face is a combination of CS and LA or of CS and LS. If the illumination condition affects the retrieval of 3D shape information from a face, we should expect a change of illumination condition to affect both face discrimination and depth judgment on the hybrid faces. In addition, such an illumination effect should exert its influence through the asymmetric low-spatial-frequency component of the face images and not the symmetric ones.