Background Projection - VIEW SYNTHESIS - Stereoscopic Video Synthesis from a Single View Using a New Vanishing-Point-Based Model

Chapter 3. VIEW SYNTHESIS

3.2 Background Projection

Before introducing the coordinate transform, we establish one preliminary result. Assume two points O₁: (W_O1, H_O1, D_O1) and O₂: (W_O2, H_O2, D_O2), where H_O1 equals H_O2. Their projections on the screen plane are P₁: (W_P1, H_P1, D_P1) and P₂: (W_P2, H_P2, D_P2). The position of the camera in world coordinates is C_O: (0, h_C, 0).

Because O₁ lies on the line C_O P₁ and O₂ lies on the line C_O P₂, O₁ and O₂ can be represented, with heights measured relative to the camera, by (W_P1*t, (H_P1-h_C)*t, D_P1*t) and (W_P2*s, (H_P2-h_C)*s, D_P2*s) respectively, where t and s are constants. Since H_O1 equals H_O2, (H_P1-h_C)*t equals (H_P2-h_C)*s, so when H_P1 equals H_P2 and neither equals h_C, t equals s. The screen plane equation (8) involves only height and depth, so P₁ and P₂, having the same height, also have the same depth. Consequently, D_O1 equals D_O2. That is, two points on the same horizontal plane have the same depth value when their projections on the screen plane have the same height.
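The argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it works in a camera-relative frame (camera at the origin, ground at height -h_C), uses the scale factor k = 1/t rather than t, and the plane coefficients a, b, c and all point coordinates are illustrative assumptions.

```python
def project_to_screen(point, a, b, c):
    """Project a camera-relative point (w, h, d) onto the screen plane a*h + b*d = c.

    The camera sits at the origin of this camera-relative frame, so the
    projection is the point scaled along its ray until it meets the plane.
    """
    w, h, d = point
    denom = a * h + b * d
    if denom == 0:
        raise ValueError("ray is parallel to the screen plane")
    k = c / denom
    return (w * k, h * k, d * k)


h_C = 1.5                      # assumed camera height above the ground
a, b, c = 0.2, 1.0, 1.0        # assumed (tilted) screen plane coefficients

# Two ground points (height -h_C relative to the camera): same depth, different width.
O1 = (1.0, -h_C, 5.0)
O2 = (3.0, -h_C, 5.0)
# A third ground point at a different depth.
O3 = (1.0, -h_C, 8.0)

P1, P2, P3 = (project_to_screen(O, a, b, c) for O in (O1, O2, O3))
```

Here P1 and P2 share the same screen height because O1 and O2 share the same depth, while P3 lands at a different height.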

Based on the vanishing-point model, a point (x, y) on the horizontal plane in screen coordinates is transformed into the width w and depth d in world coordinates by equations (1) and (2), where h_C, the height of the camera above the horizontal plane in world coordinates, is given by equation (3), L is the distance from the camera to the bottom of the screen in world coordinates, and W_frame and H_frame are the resolution of the video sequence. The derivations of equations (1), (2), and (3) follow.

Because the vanishing point is the farthest point in the vanishing-point model, the line from the camera to the vanishing point is parallel to the horizontal plane in world coordinates. When the camera aims at the farthest point, the vanishing point is located at the middle of the screen, as shown in Fig. 12-A. When the camera instead aims at some point on the horizontal plane, the vanishing point lies below or above the middle of the frame, as in Fig. 12-B and 12-C respectively. In Fig. 12, V is the vanishing point on the screen, M is the middle of the screen, B is the bottom of the screen, and the dotted line represents the screen plane.

Figure 12- The relative position of the vanishing point and the middle of the screen when (A) the camera aims at the farthest point, (B) the camera aims at the upper horizontal plane, (C) the camera aims at the lower horizontal plane.
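The dependence of the vanishing point on the camera tilt can be illustrated with a small sketch. It assumes that the optical axis is perpendicular to the screen and meets it at the middle M at focal length f, so that the vanishing point sits f*tan(pitch) above M for a downward pitch. The function name, the row convention (row 0 at the top of the frame), and the numeric values are hypothetical.

```python
import math


def vanishing_point_row(f, pitch_down, frame_height):
    """Image row (0 = top) of the vanishing point for a given downward camera pitch.

    Assumes the optical axis meets the screen at its middle and is perpendicular
    to it, so the vanishing point sits f * tan(pitch_down) above the middle.
    """
    offset_above_middle = f * math.tan(pitch_down)
    return frame_height / 2.0 - offset_above_middle


f, H_frame = 800.0, 480            # illustrative focal length (pixels) and frame height
row_level = vanishing_point_row(f, 0.0, H_frame)                  # level camera
row_up = vanishing_point_row(f, math.radians(-3.0), H_frame)      # camera tilted up
row_down = vanishing_point_row(f, math.radians(3.0), H_frame)     # camera tilted down
```

A row equal to H_frame/2 is the situation of Fig. 12-A. A larger row puts the vanishing point below the middle (Fig. 12-B), and a smaller row puts it above (Fig. 12-C).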

We take the case where the camera aims at the lower horizontal plane as an example. In Fig. 13-A, there are two points P and P’ on the screen, whose relation is shown in Fig. 14. The corresponding point of P’ on the horizontal plane is J’, and the corresponding point of P on the horizontal plane is J. When a point lies on the central line of the screen, as P does, its depth value d is SJ. As shown above, every screen point with the same y value whose corresponding point lies on the same horizontal plane has the same depth value. The depth of P’ can therefore be obtained by estimating the depth of P, the point with the same y value on the central line of the screen.

Because of OJS is equal to JOV and JOV could be divided into VOM and MOP, OJS is equal to VOM addition to POM . In the following derivation,

SO is h_C, SJ is d , OM is the focal length f , and VMO and PMO are both right angles.


The derivation of h_C is similar to that of equation (4). When the point P coincides with B, d equals L, so equation (3) follows from equation (4).
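The relations stated in the text (∠OJS = ∠VOM + ∠POM, SO = h_C, SJ = d, OM = f, right angles at M) imply tan(∠OJS) = h_C/d, tan(∠VOM) = VM/f, and tan(∠POM) = PM/f. The sketch below is reconstructed from that geometry alone and is not the thesis's equation (3) or (4) verbatim. The function names, the parameters vm, pm, bm (the lengths VM, PM, BM in the same units as f), and the numeric values are assumptions for illustration.

```python
import math


def depth_from_offsets(h_c, f, vm, pm):
    """Depth d = SJ of the ground point seen at screen point P on the central line.

    vm: length VM (vanishing point above the screen middle), same units as f
    pm: length PM (point P below the screen middle), same units as f
    """
    angle_ojs = math.atan(vm / f) + math.atan(pm / f)  # angle VOM + angle POM
    return h_c / math.tan(angle_ojs)                   # tan(OJS) = SO / SJ


def camera_height_from_bottom(L, f, vm, bm):
    """Camera height h_C, using d = L when P coincides with the screen bottom B."""
    return L * math.tan(math.atan(vm / f) + math.atan(bm / f))


f, vm, bm = 800.0, 40.0, 240.0     # illustrative values in pixels
L = 2.0                            # illustrative depth seen at the screen bottom
h_c = camera_height_from_bottom(L, f, vm, bm)
d_bottom = depth_from_offsets(h_c, f, vm, bm)      # recovers L
d_mid = depth_from_offsets(h_c, f, vm, 0.0)        # depth seen at the screen middle
```

Evaluating the depth at the bottom of the screen returns L again, which is the consistency condition the text uses to obtain h_C.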

In Fig. 13-B, the point P is located between V and M. The derivation is similar to that of Fig. 13-A, except that PM is positive in Fig. 13-A but negative in Fig. 13-B, which means that ∠POM is negative in Fig. 13-B. ∠OJS therefore equals |∠VOM| minus |∠MOP|.

If the camera aims at the upper horizontal plane, the derivation is also similar to that of equation (4). MV is positive in Fig. 13-A and 13-B but negative in Fig. 13-C, so ∠OJS equals |∠MOP| minus |∠VOM|.

Since, as shown above, the choice of case does not affect the transform equation, we take Fig. 13-A as the example without loss of generality.
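That the three cases reduce to one signed formula can be checked against a direct ray-ground intersection in the w = 0 cross-section. This is a sketch under assumptions: the camera is modelled with a downward pitch angle, VM is taken as f*tan(pitch) (positive when V is above M), PM is positive when P is below M, and all names and numbers are hypothetical.

```python
import math


def depth_by_angles(h_c, f, vm, pm):
    """Depth from the signed angle sum: tan(OJS) = h_c / d, OJS = VOM + POM."""
    return h_c / math.tan(math.atan(vm / f) + math.atan(pm / f))


def depth_by_ray(h_c, f, pitch, pm):
    """Depth from intersecting the viewing ray with the ground, in the w = 0 plane.

    pitch: downward tilt of the optical axis in radians (negative = tilted up)
    pm:    signed offset of P below the screen middle M
    """
    # Ray direction (depth, height) = f * axis - pm * screen_up.
    dir_depth = f * math.cos(pitch) - pm * math.sin(pitch)
    dir_height = -f * math.sin(pitch) - pm * math.cos(pitch)
    if dir_height >= 0:
        raise ValueError("ray does not hit the ground (P is not below V)")
    s = h_c / -dir_height          # ray parameter where the height reaches zero
    return s * dir_depth


h_c, f = 1.5, 800.0                # illustrative camera height and focal length
cases = {
    "13-A: camera down, P below M": (math.radians(5.0), 100.0),
    "13-B: camera down, P between V and M": (math.radians(5.0), -30.0),
    "13-C: camera up, P below V": (math.radians(-3.0), 120.0),
}
results = {}
for name, (pitch, pm) in cases.items():
    vm = f * math.tan(pitch)       # signed offset of V above M
    results[name] = (depth_by_angles(h_c, f, vm, pm), depth_by_ray(h_c, f, pitch, pm))
```

In each case the two depth values agree, so a single expression with signed VM and PM covers Fig. 13-A, 13-B, and 13-C.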

The width w of J’ in world coordinate can be simply estimated through the theorem of

width, we have to transform these information by


Figure 13- Illustration of the derivation of d and h_C with the point P located between (A) M and B, (B) V and M, in the case where the camera aims at the lower horizontal plane, and (C) in the case where the camera aims at the upper horizontal plane.


Figure 14- Another viewpoint from Fig. 13-A for illustration.

Next, we transform a point P: (x, y) on the vertical plane in image coordinates into depth d and width w in world coordinates. The fundamental idea is to search for the corresponding foothold on the horizontal plane, defined as the point on the horizontal plane with the same depth and width as the point P. The depth and width of the corresponding foothold can then be derived from equations (1) and (2).

In world coordinates, the screen plane is assumed to satisfy

ah+bd=c (8)

where a, b, and c are constants. In Fig. 15, assume two points p₁: (W_p1, H_p1, D_p1) and f₁: (W_f1, H_f1, D_f1) in world coordinates, where H_p1 is not equal to H_f1 but W_p1 equals W_f1 and D_p1 equals D_f1. p₁ is the point in world coordinates that projects to the point (x, y) in image coordinates. The projections of p₁ and f₁ on the screen plane are p₂: (W_p1*t, (H_p1-h_C)*t, D_p1*t) and f₂: (W_f1*s, (H_f1-h_C)*s, D_f1*s) respectively, where s and t are constants. Substituting p₂ and f₂ into equation (8) gives

[a*H_p1 - a*h_C + b*D_p1]*t = [a*H_f1 - a*h_C + b*D_f1]*s = c (9)

Because D_p1 equals D_f1 while H_p1 differs from H_f1, t equals s only when a is zero. The widths W_p1*t and W_f1*s on the screen plane are therefore the same only if W_p1 is zero or a is zero. In other words, the projected points of p₁ and f₁ on the screen plane have the same width in world coordinates (or the same x value in image coordinates) when the projected point p₂ of p₁ lies on the central line

ah+bd=c, w=0 (10)

or when the camera aims at the vanishing point as in Fig. 12-A. In these cases, the corresponding foothold can be found by intersecting the line (ah+bd=c, w=W_p1*t), which has the same width as the corresponding point, with the vanishing line separating the vertical and horizontal planes.
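The condition derived from equation (9) can be checked numerically. The sketch below solves equation (9) for the two scale factors and compares the resulting screen-plane widths. The function names and all numeric values are assumptions for illustration, and f₁ is placed on the ground (H_f1 = 0) as an assumed foothold.

```python
def ray_scale(a, b, c, h_c, H, D):
    """Scale factor k with [a*(H - h_c) + b*D] * k = c, as in equation (9)."""
    return c / (a * (H - h_c) + b * D)


def projected_widths(a, b, c, h_c, W, D, H_p, H_f):
    """Screen-plane widths W*t and W*s of two points sharing W and D but not H."""
    t = ray_scale(a, b, c, h_c, H_p, D)
    s = ray_scale(a, b, c, h_c, H_f, D)
    return W * t, W * s


h_c, b, c = 1.5, 1.0, 1.0          # illustrative values
D, H_p, H_f = 6.0, 2.0, 0.0        # p1 on a vertical plane, f1 assumed on the ground

upright = projected_widths(0.0, b, c, h_c, W=2.0, D=D, H_p=H_p, H_f=H_f)   # a = 0
central = projected_widths(0.2, b, c, h_c, W=0.0, D=D, H_p=H_p, H_f=H_f)   # W = 0
general = projected_widths(0.2, b, c, h_c, W=2.0, D=D, H_p=H_p, H_f=H_f)   # neither
```

The widths coincide in the first two calls and differ in the third, which is the case where the line p₂f₂ has to be found explicitly.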

Otherwise, when neither W_p1 nor a is zero, we need to find the line p₂f₂ on the screen plane.

In Fig. 15, it is assumed that there are four points, p₃: (0, (H_p1-h_C)*t, D_p1*t) and f₃: (0,

horizontal plane. The corresponding foothold in image coordinates can then be transformed into depth and width in world coordinates by equations (1) and (2). This yields the width w and the depth d in world coordinates for the point (x, y) on the vertical plane in image coordinates.

Figure 15- Illustration for the relation between p₃p₂ and f₃f₂.


Figure 16- The cross-sectional view on plane w=0 from Fig. 15.
