The view angle and the geometry between the camera and objects are the key information in the proposed algorithm. In this chapter, we introduce how to calculate the view angle and the geometry between the camera and objects in an image. In Section 3.1, we introduce the basic principles of photo-imaging, including the thin lens equation and the concept of Effective Focal Length (EFL). In Section 3.2, we give the formula to calculate the view angle between objects in an image. The geometry between the user and objects is illustrated in Section 3.3.
3.1 The Principle of Photo-Imaging
Figure 3.1 illustrates an example of photo-imaging. L is a positive lens. C, located at the center of L, is the optical center of L. F is a focus of L. The distance between C and F, denoted as $f$, is the focal length of L. The line perpendicular to the lens and passing through C and F, denoted as A, is the optical axis of L. O, to the left of L, is an object, and I, to the right of L, is the image of O formed by L. The distance between O and L is called the object distance and denoted as $d_o$; the distance between C and I is called the image distance and denoted as $d_i$. The angle $\alpha$ is called the view angle of object O.
Figure 3.1: The relation of object distance, image distance and focal length.
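The thin lens equation, Eq. (3.1),
$$ \frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}, \tag{3.1} $$
gives the relation between $f$, $d_o$ and $d_i$. If we take a photo, $d_o$ is the distance from the object to the lens. Solving Eq. (3.1) for $d_i$, we have
$$ d_i = \frac{1}{1 - \frac{f}{d_o}}\, f \approx f. \tag{3.2} $$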
In particular, if $d_o \gg f$, $d_i$ can be approximated by $f$. This is the usual case in photography, and we implicitly use $f = d_i$ in the following discussion.
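The approximation can be checked numerically. The following is a minimal Python sketch, not part of the original derivation; the function name `image_distance` and the sample values (a 50 mm lens, objects at 0.5 m, 5 m and 50 m) are assumptions chosen only for illustration.

```python
def image_distance(f_mm: float, d_o_mm: float) -> float:
    """Image distance d_i solved from the thin lens equation 1/d_o + 1/d_i = 1/f."""
    if d_o_mm <= f_mm:
        # For d_o <= f the lens forms no real image on the film side.
        raise ValueError("object distance must exceed the focal length")
    return f_mm / (1.0 - f_mm / d_o_mm)


# Hypothetical values: a 50 mm lens and objects at increasing distances.
for d_o in (500.0, 5_000.0, 50_000.0):
    d_i = image_distance(50.0, d_o)
    print(f"d_o = {d_o:8.0f} mm  ->  d_i = {d_i:.3f} mm")
```

As the object distance grows, the printed image distance approaches the focal length, which is why $f = d_i$ is a reasonable working assumption for ordinary photographs.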
When a photograph is taken, the image is recorded on film or converted by a CCD or CMOS device into an electronic signal and saved as a file. Traditionally, the most popular film format is 35 mm still photography film, also known as 135 film. The frame size of 135 film is 36 mm × 24 mm. The "angle of view" or "field of view" of a photograph is the diagonal angular extent of the scene. To prevent possible ambiguity, we prefer the term field of view in what follows. If $D$ is the diagonal of the film, the field of view can be obtained by
$$ 2\arctan\frac{D}{2 d_i} \approx 2\arctan\frac{D}{2f}. \tag{3.3} $$
Unlike traditional film, the CCD or CMOS device in a digital camera has a size that is usually unknown to users, varies from one model to another, and is smaller than traditional film. It is therefore not convenient to calculate the field of view directly. To get rid of this problem, the concept of EFL is introduced as a standard description. Let $D_{135}$ denote the diagonal length of 135 film, $D$ denote the real diagonal length of the film, CCD or CMOS device, and $f$ denote the real focal length of the camera lens. The EFL of the lens is given by
$$ f_{EFL} = \frac{D_{135} \times f}{D}. \tag{3.4} $$
By this means, even without the information of the real size of the film, we can calculate the field of view by
$$ 2\arctan\frac{D_{135}}{2 f_{EFL}}. \tag{3.5} $$
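As an illustration of Eqs. (3.4) and (3.5), the sketch below computes the EFL and the field of view in Python. It assumes the standard 36 mm × 24 mm frame of 135 film for $D_{135}$; the function names and the sample sensor (7.2 mm diagonal, 4.8 mm lens) are hypothetical and not taken from the text.

```python
import math

# Diagonal of a 36 mm x 24 mm frame of 135 film (assumed frame size), about 43.27 mm.
D135_MM = math.hypot(36.0, 24.0)


def efl(f_mm: float, sensor_diag_mm: float) -> float:
    """Eq. (3.4): EFL from the real focal length f and the real sensor diagonal D."""
    return D135_MM * f_mm / sensor_diag_mm


def field_of_view_deg(f_efl_mm: float) -> float:
    """Eq. (3.5): diagonal field of view in degrees, computed from the EFL only."""
    return math.degrees(2.0 * math.atan(D135_MM / (2.0 * f_efl_mm)))


# Hypothetical camera: 4.8 mm lens in front of a sensor with a 7.2 mm diagonal.
f_efl = efl(4.8, 7.2)
print(f"EFL = {f_efl:.2f} mm, field of view = {field_of_view_deg(f_efl):.2f} degrees")
```

Because $D_{135} / (2 f_{EFL}) = D / (2f)$, the value obtained from the EFL agrees with the one Eq. (3.3) would give from the real sensor size, which is the point of introducing the EFL.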
3.2 View Angles between Objects
The view angle between two objects is the angular extent between them. In this section, we derive a formula to calculate the view angle between objects in an image. Figure 3.2 illustrates the coordinate layout of an image. The size of the image is $W \times H$, and the diagonal of the image is $\sqrt{W^2 + H^2}$. Assume there is a coordinate system whose origin O is located at the upper-left corner of the image. The upper-right corner, denoted as A, has coordinates $(W, 0)$; the lower-left corner, denoted as B, has coordinates $(0, H)$; and the center of the image, denoted as I, has coordinates $(W/2, H/2)$.

Figure 3.2: The coordinate layout of an image.

Let $f_{EFL}$ be the EFL used when taking this image. If there are two objects located at $P_1$ and $P_2$, we would like to develop a formula to calculate the view angle between them. First of all, the image is scaled so that its diagonal equals $D_{135}$; $P_1$ and $P_2$ have coordinates $(x_1, y_1)$ and $(x_2, y_2)$, respectively, in the scaled image, in which $\sqrt{W^2 + H^2} = D_{135}$. In addition, the lens center, denoted as C, has coordinates $(W/2, H/2, f_{EFL})$. Figure 3.3 illustrates the view angle $\alpha$ between $P_1$ and $P_2$. According to the law of cosines, we have
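$$ \|P_1 - P_2\|^2 = \|P_1 - C\|^2 + \|P_2 - C\|^2 - 2\,\|P_1 - C\|\,\|P_2 - C\| \cos\alpha. \tag{3.6} $$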
Then, $\alpha$ can be obtained from Eq. (3.7):
$$ \alpha = \arccos\left( \frac{\|P_1 - C\|^2 + \|P_2 - C\|^2 - \|P_1 - P_2\|^2}{2 \times \|P_1 - C\| \times \|P_2 - C\|} \right). \tag{3.7} $$
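The computation can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: pixel coordinates are scaled so that the image diagonal equals $D_{135}$ (taken here as the diagonal of a 36 mm × 24 mm frame), the image plane is placed at $z = 0$, and the name `view_angle_deg` is hypothetical.

```python
import math

# Diagonal of a 36 mm x 24 mm frame of 135 film (assumed frame size).
D135_MM = math.hypot(36.0, 24.0)


def view_angle_deg(p1, p2, width, height, f_efl_mm):
    """View angle between pixels p1 and p2 of a width x height image, per Eq. (3.7)."""
    # Scale pixel units so that the image diagonal equals D135.
    scale = D135_MM / math.hypot(width, height)
    P1 = (p1[0] * scale, p1[1] * scale, 0.0)
    P2 = (p2[0] * scale, p2[1] * scale, 0.0)
    # Lens center C sits above the image center at height f_EFL.
    C = (width * scale / 2.0, height * scale / 2.0, f_efl_mm)

    a = math.dist(P1, C)   # ||P1 - C||
    b = math.dist(P2, C)   # ||P2 - C||
    c = math.dist(P1, P2)  # ||P1 - P2||
    cos_alpha = (a * a + b * b - c * c) / (2.0 * a * b)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_alpha = max(-1.0, min(1.0, cos_alpha))
    return math.degrees(math.acos(cos_alpha))


# Hypothetical 4000 x 3000 image taken with an EFL of 28 mm.
print(view_angle_deg((1000, 1500), (3000, 1500), 4000, 3000, 28.0))
```

A useful sanity check is that the angle between two opposite corners of the image equals the diagonal field of view of Eq. (3.5).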
3.3 Geometry between Camera and Objects
Geometry defines the relation between the camera and objects through the coordinate systems associated with them. In this section, we introduce the coordinate systems attached to the camera and objects and the projection relation between them. The coordinate systems include the Earth frame and the AR frame, as illustrated in Fig. 3.4.
Figure 3.4: Geometry between user and objects.
The Earth frame is used for tracking objects moving on the ground. $B_E = \{b_{E1}, b_{E2}, b_{E3}\}$ is a basis of the Earth frame in which
$b_{E1}$ is the unit vector pointing north,
$b_{E2}$ is the unit vector pointing east,
$b_{E3}$ is the unit vector pointing perpendicularly toward the ground.
The locations of the POI stored in the database and of the camera we want to find are defined in the Earth frame. The AR frame is related to the smartphone display. We assume the device is held horizontally, the screen is facing the user, and the button is at the right. The axes x, y and z of the AR frame are defined as illustrated in Fig. 3.4, where $B_{AR} = \{b_{AR1}, b_{AR2}, b_{AR3}\}$ is a basis of the AR frame in which
$b_{AR1}$ is along the x-axis of the screen (right),
$b_{AR2}$ is along the y-axis of the screen (down),
$b_{AR3}$ is along the direction of the lens (front).
C, the center of the lens, is the origin of the AR frame. $f$ is the virtual focal length (if the screen is considered as a film). If an object is far from the lens, the distance from the lens to the image is roughly equal to $f$. If we place the screen in front of the lens at a distance of $f$, then a projection relation exists between objects and their images. Let
$C'$ denote the center of the screen,
$a$ denote an object, and
$a'$ denote the image of the object.
Let $[a]_{AR}$ and $[a']_{AR}$ denote the coordinates of $a$ and $a'$ with respect to the AR frame. Since $Ca' \parallel Ca$, the proportionality relation in Eq. (3.8) follows:
$$ [a]_{AR,x} : [a']_{AR,x} = [a]_{AR,y} : [a']_{AR,y} = [a]_{AR,z} : [a']_{AR,z}. \tag{3.8} $$
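Eq. (3.8) can be turned into a small projection routine. The Python sketch below assumes that the screen lies in the plane $z = f$ of the AR frame (since it is placed in front of the lens at distance $f$ along the lens direction), so that $[a']_{AR,z} = f$; the function name `project_to_screen` and the sample numbers are hypothetical.

```python
def project_to_screen(a_ar, f):
    """Project an object a, given in AR-frame coordinates, onto the screen plane z = f.

    By Eq. (3.8) the coordinates of a and a' are proportional, so a' = (f / a_z) * a.
    """
    x, y, z = a_ar
    if z <= 0:
        # Assumption: only objects in front of the lens are projected.
        raise ValueError("object must be in front of the lens (z > 0)")
    s = f / z
    return (x * s, y * s, f)


# Hypothetical object 10 units in front of the lens, with f = 5 in the same units.
print(project_to_screen((2.0, 1.0, 10.0), 5.0))  # prints (1.0, 0.5, 5.0)
```

The x and y components of the result are measured from C', the center of the screen, because C' lies on the z-axis of the AR frame.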