
Chapter 2 Drowsiness Detection and Warning System

2.2. System Operation

2.2.4. Parameter Estimation

In the parameter estimation stage, the parameters of percentage of eye closure over time, eye blinking frequency, eye closure duration, head orientation (including tilt, pan and rotation angles), mouth opening duration, and degree of gaze are estimated from the located driver's face. These parameters will later be used in the reasoning stage to infer the drowsiness level of the driver. To begin, the facial features are enclosed with rectangular windows. For the eyes, l_ee/1.5 × l_em/3.5 windows are used, and for the mouth an l_ee × l_em/2.5 window is employed, where l_ee is the length between the two eyes and l_em is the length between the center of the two eyes and the mouth (see Figure 10). Both l_ee and l_em are calculated from the face detected at the early stage of system processing. Characteristics of the facial features are computed within these windows and are used to estimate the parameters listed above for drowsiness reasoning.
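As a concrete illustration of the window geometry, the sketch below derives the eye and mouth window sizes from three feature points. The function name, the (x, y) point convention, and the example coordinates are illustrative assumptions; only the l_ee/1.5 × l_em/3.5 and l_ee × l_em/2.5 proportions come from the text.

```python
import math

def feature_windows(left_eye, right_eye, mouth):
    """Return (width, height) of the eye and mouth windows for a face
    given as three (x, y) feature points."""
    # l_ee: distance between the two eyes
    l_ee = math.dist(left_eye, right_eye)
    # l_em: distance from the midpoint of the eyes to the mouth
    eye_center = ((left_eye[0] + right_eye[0]) / 2,
                  (left_eye[1] + right_eye[1]) / 2)
    l_em = math.dist(eye_center, mouth)

    eye_window = (l_ee / 1.5, l_em / 3.5)   # one such window per eye
    mouth_window = (l_ee, l_em / 2.5)
    return eye_window, mouth_window

# Example face: l_ee = 60, l_em = 70
eye_w, mouth_w = feature_windows((100, 100), (160, 100), (130, 170))
# -> eye window 40 x 20, mouth window 60 x 28
```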

A. Eye Parameters

Considering an eye, it is observed that the vertical edge magnitude of an opening eye is typically larger than that of a closed eye. Referring to Figure 11, two images containing an opening eye and a closed eye and the associated vertical edge magnitude maps are displayed. The opening eye has relatively larger and stronger edge magnitudes than the closed eye. In addition, the higher the degree of eye closure, the smaller the summed edge magnitude. Let E_le and E_re be the averages of edge magnitude of the left and the right eyes, respectively. We define the degree d_e of eye closure as d_e = min{E_le, E_re}/255.

Fig. 11. (a) Opening eye, (b) closed eye.
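The degree of eye closure can be sketched as follows, assuming a vertical Sobel operator as the edge detector and grayscale windows given as lists of rows; the choice of operator and the scaling of its response into 0..255 are assumptions, as the text does not specify them.

```python
def vertical_edge_mean(win):
    """Mean |vertical Sobel| response over a grayscale window (list of rows)."""
    h, w = len(win), len(win[0])
    total, count = 0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Sobel kernel for vertical intensity change (horizontal edges)
            gy = (win[y+1][x-1] + 2*win[y+1][x] + win[y+1][x+1]
                  - win[y-1][x-1] - 2*win[y-1][x] - win[y-1][x+1])
            total += min(abs(gy) // 4, 255)   # scale response into 0..255
            count += 1
    return total / count if count else 0.0

def eye_closure_degree(left_win, right_win):
    """d_e = min{E_le, E_re} / 255, as defined above."""
    e_le = vertical_edge_mean(left_win)
    e_re = vertical_edge_mean(right_win)
    return min(e_le, e_re) / 255.0

open_win = [[0] * 6] * 3 + [[255] * 6] * 3   # sharp horizontal edge: "open" eye
flat_win = [[128] * 6] * 6                    # featureless: "closed" eye
# eye_closure_degree(open_win, open_win) -> 0.5; the flat pair -> 0.0
```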

Refer to Figure 12(a), where the calculated d_e values of a video clip of 300 images are graphically depicted. Note the local minima between images 79 and 149 in the figure.

We exhibit the eyes extracted from the sub-clips from images 77 to 84 in Figure 12(b) and from images 145 to 152 in Figure 12(c). The eyes present in images 80 and 148 are closed or almost closed. An image containing a closed eye actually corresponds to a local minimum in the d_e-curve. However, there are many local minima along the curve. Only those small enough correspond to the images containing closed eyes. In order to highlight those minima, we derive the v_e-curve from the d_e-curve according to v_e = (d_e − a)², where a is determined as follows. Let m denote the mean of the d_e values of the video clip under consideration. If the calculated mean is close to that of the previous video clip, a is set to the current mean; otherwise, it is set to the previous mean. Figure 12(d) displays the v_e-curve computed from the d_e-curve. Comparing the two curves, significant valleys in the d_e-curve have been emphasized as peaks in the v_e-curve. We threshold the v_e-curve to obtain a binary curve, referred to as the b_e-curve, shown in Figure 12(e).

Fig. 12. (a) The d_e-curve of a video clip of 300 images, (b) the sub-clip from images 77 to 84, (c) the sub-clip from images 145 to 152, (d) the v_e-curve, and (e) the b_e-curve.
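The derivation of the v_e- and b_e-curves from a d_e-curve can be sketched as below. The tolerance used to decide whether two clip means are "close" and the threshold value are illustrative assumptions.

```python
def ve_curve(de, prev_mean, close_tol=0.05):
    """Return the v_e-curve and the current clip mean.
    The baseline a is the current clip mean when it is close to the
    previous clip's mean, otherwise the previous mean."""
    m = sum(de) / len(de)
    a = m if abs(m - prev_mean) <= close_tol else prev_mean
    return [(d - a) ** 2 for d in de], m

def be_curve(ve, thresh):
    """Threshold the v_e-curve into a binary b_e-curve."""
    return [1 if v > thresh else 0 for v in ve]

de = [0.5, 0.5, 0.1, 0.5, 0.5]        # a deep valley at index 2 (closed eye)
ve, m = ve_curve(de, prev_mean=0.42)   # means agree, so a = current mean
be = be_curve(ve, thresh=0.05)         # -> [0, 0, 1, 0, 0]
```

Squaring the deviation from the clip baseline turns deep valleys of the d_e-curve into tall peaks while flattening shallow fluctuations, which is what makes a single threshold workable.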

Based on the b_e-curve, we easily calculate the parameters of percentage of eye closure over time (PERCLOS), blinking frequency (BF), and eye closure duration (D). Let n (= 300 in our experiments) be the number of images in a video clip and T_n be the duration of the clip. The percentage of eye closure over time is defined as PERCLOS = (Σ_i D_i)/T_n and the blinking frequency as BF = n_p/T_n, where n_p is the number of pulses along the b_e-curve and D_i denotes the duration of pulse i. The eye closure duration is then defined as D = max_{1 ≤ i ≤ n_p} D_i. Parameter estimation is performed every video clip.
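A minimal sketch of reading the three eye parameters off a binary b_e-curve follows, assuming PERCLOS is the fraction of the clip during which the eyes are closed, BF the number of pulses per unit time, D the longest pulse, each image spans T_n/n seconds, and a pulse is a maximal run of 1s.

```python
def pulse_durations(be, frame_time):
    """Durations (in seconds) of the maximal runs of 1s along a binary curve."""
    pulses, run = [], 0
    for b in be:
        if b:
            run += 1
        elif run:
            pulses.append(run * frame_time)
            run = 0
    if run:                                     # pulse reaching the clip end
        pulses.append(run * frame_time)
    return pulses

def eye_parameters(be, t_n):
    frame_time = t_n / len(be)
    pulses = pulse_durations(be, frame_time)
    perclos = sum(pulses) / t_n                 # fraction of time closed
    bf = len(pulses) / t_n                      # blinks per second
    d = max(pulses) if pulses else 0.0          # longest closure
    return perclos, bf, d

# 10 images over 10 s: two pulses, of 2 and 3 images
p, bf, d = eye_parameters([0, 1, 1, 0, 0, 1, 1, 1, 0, 0], t_n=10.0)
# -> p = 0.5, bf = 0.2, d = 3.0
```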

B. Head Parameters

Head parameters, including the tilt (α ), pan (β) and rotation (γ ) angles of the head, are to be estimated from the located face. Since the above parameters are three dimensional in nature while the located face is in the 2D image plane, only

approximate estimations of the parameters can be achieved. Recall that the located face is described in terms of a triangle that connects the two eyes and the mouth. The orientation of the face triangle somehow reflects the orientation of the head in the 3D space. Referring to the face model shown in Figure 10, if the head performs a tilt, the

line segment l_em will be foreshortened. Let l′_em be the foreshortened line segment. The tilt angle α of the head can then be easily determined as α = cos⁻¹(l′_em/l_em). Similarly, let l′_ee be the foreshortened line segment of l_ee when the head pans. The pan angle β of the head is given by β = cos⁻¹(l′_ee/l_ee). Finally, if the head rotates, its rotation angle γ is given by γ = cos⁻¹(|x_l − x_r|/l_ee), where x_l and x_r are the horizontal coordinates of the left and right eyes, respectively.
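The three approximate head angles can be sketched directly from these arccosine relations. Clamping the cosine argument guards against measurement noise pushing the ratio slightly above 1, and taking the absolute horizontal eye separation for the rotation term is a small assumption about the intended formula.

```python
import math

def clamp(v):
    """Keep the ratio inside acos's domain [-1, 1] despite measurement noise."""
    return max(-1.0, min(1.0, v))

def head_angles(l_em, l_em_obs, l_ee, l_ee_obs, x_l, x_r):
    """Approximate head angles (radians) from the face-triangle lengths:
    l_em, l_ee are the frontal model lengths, the *_obs values the
    foreshortened lengths observed in the image."""
    tilt = math.acos(clamp(l_em_obs / l_em))          # alpha
    pan = math.acos(clamp(l_ee_obs / l_ee))           # beta
    rot = math.acos(clamp(abs(x_l - x_r) / l_ee))     # gamma
    return tilt, pan, rot

a, b, g = head_angles(l_em=70, l_em_obs=35, l_ee=60, l_ee_obs=60,
                      x_l=100, x_r=160)
# 2:1 foreshortening of l_em -> tilt of 60 degrees; no pan or rotation
```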

C. Mouth Parameters

Refer to Figure 13, where a closed mouth and an opening mouth and the associated edge magnitude maps are displayed. The opening mouth has relatively larger and stronger edge magnitudes than the closed mouth. In addition, the higher the degree of mouth openness, the larger the summed edge magnitude. Let E_m be the average of edge magnitude of the mouth. We define the degree d_m of mouth openness as d_m = E_m/255.

Fig. 13. (a) Closed mouth, (b) opening mouth.

Refer to Figure 14(a), where the calculated d_m values of a video clip of 300 images are graphically depicted. Note the local maxima between images 235 and 287 in the figure.

We exhibit the mouths extracted from the sub-clip from images 212 to 285 in Figure 14(b). The mouths present between images 247 and 276 are opened or almost opened.

An image containing an opened mouth actually corresponds to a local maximum in the

d_m-curve. However, there are many local maxima along the curve. Only those large enough correspond to the images containing opened mouths. In order to highlight those maxima, we derive the v_m-curve from the d_m-curve according to v_m = (d_m − b)², where b is determined as follows. Let m denote the mean of the d_m values of the video clip under consideration. If the calculated mean is close to that of the previous video clip, b is set to the current mean; otherwise, it is set to the previous mean. Figure 14(d) displays the v_m-curve computed from the d_m-curve. Comparing the two curves, significant peaks in the d_m-curve have been emphasized in the v_m-curve. We threshold the v_m-curve to obtain a binary curve, referred to as the b_m-curve, shown in Figure 14(e).

Fig. 14. (a) The d_m-curve of a video clip of 300 images, (b) the sub-clip from images 221 to 285, (c) the v_m-curve, and (d) the b_m-curve.

Based on the b_m-curve, for each clip of 300 images we calculate the parameter of mouth openness duration (D_m). Let D_i denote the duration of pulse i. The mouth openness duration is then defined as D_m = max_{1 ≤ i ≤ n_p} D_i, where n_p is the number of pulses along the b_m-curve.
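The mouth openness duration can thus be read off the b_m-curve as the longest pulse in the clip, mirroring the eye closure duration; the run-length scan below is a minimal sketch, assuming each image spans T_n/n seconds.

```python
def mouth_openness_duration(bm, t_n):
    """D_m: duration (seconds) of the longest run of 1s along the b_m-curve."""
    frame_time = t_n / len(bm)
    longest = run = 0
    for b in bm:
        run = run + 1 if b else 0
        longest = max(longest, run)
    return longest * frame_time

# 8 images over 8 s: longest run of 1s is 4 images -> D_m = 4.0 s
d_m = mouth_openness_duration([0, 1, 1, 1, 1, 0, 1, 0], t_n=8.0)
```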

D. Gaze Parameter

When a person gazes at something, the person has near fixed eye and face

orientations. We evaluate these two conditions at any time instant based on the difference d_g between the predicted and observed horizontal displacements of the face

known from the Kalman filter during face tracking. Figure 15(a) shows the calculated

d_g-curve for a video clip of 300 images. We threshold the curve to obtain the binary b_g-curve shown in Figure 15(b). In this example video clip, there is a salient gaze occurring after image 170. We determine a degree of gaze for each clip of 300 video images by averaging the v_g values over the clip.

Fig. 15. (a) The d_g-curve of a video clip of 300 images and (b) the corresponding b_g-curve.
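The gaze test can be sketched as below. A constant-velocity predictor stands in for the Kalman filter used during face tracking, and averaging the thresholded (binary) values over the clip is used here as the degree of gaze; both are illustrative assumptions, as are the function names and the threshold value. Small d_g values indicate that the face is nearly fixed, i.e. gazing.

```python
def dg_curve(xs):
    """d_g per image: |predicted - observed| horizontal face position,
    with a constant-velocity prediction from the two previous images."""
    dg = []
    for i in range(2, len(xs)):
        predicted = xs[i - 1] + (xs[i - 1] - xs[i - 2])
        dg.append(abs(predicted - xs[i]))
    return dg

def gaze_degree(dg, thresh=1.0):
    """Fraction of gaze-like images in the clip (average of the binary curve)."""
    bg = [1 if d < thresh else 0 for d in dg]
    return sum(bg) / len(bg)

# A face fixed at one position yields degree 1.0;
# a face jumping back and forth yields degree 0.0.
```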
