
3. Techniques in Video-based Sleep Monitoring

3.2. Background Modeling

In image processing, moving objects or regions that do not belong to the background are called foreground. Conversely, objects that remain still throughout the video are called background. To determine which parts of a frame are foreground and which are background for further analysis, a set of image processing techniques known as background subtraction is used to separate the foreground from the scene.

According to Piccardi [10], background modeling techniques fall into three types: parametric background density estimation, non-parametric background density estimation, and spatial correlation approaches. Parametric methods such as the Gaussian mixture model (GMM) require some model parameters to be assigned in advance to obtain good results. Frame differencing and the local binary pattern (LBP) and local ternary pattern (LTP) background models fall into the category of non-parametric approaches. Spatial correlation approaches, such as co-occurrence of image variations, attempt to model the background scene using spatial information.

Each of the above approaches has its own strengths and weaknesses. To get the best performance, the user has to determine which situations each approach is suited to. The background modeling methods used in this thesis are described in the following subsections.

National Chengchi University (國立政治大學)

3.2.1. Consecutive Frames Subtraction

The simplest approach to background modeling is consecutive frames subtraction. If two frames are identical, the result is a completely dark image, because each pixel is subtracted from itself. If, on the other hand, the difference image contains bright areas, we can conclude that movement occurred in or near those regions. Subtracting consecutive frames therefore makes it possible to identify areas that do not belong to the still background. In Fig. 3-2, the bright areas belong to the foreground, and the remaining dark areas are considered part of the background.

Fig. 3-2: The first column shows the first frame of each pair of consecutive frames, the second column shows the second frame, and the last column shows the result of consecutive frames subtraction.
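As a minimal sketch of consecutive frames subtraction, the following Python function compares two grayscale frames pixel by pixel and marks a pixel as foreground when the absolute difference exceeds a threshold. The function name `frame_difference`, the representation of a frame as a list of rows, and the threshold value of 25 are illustrative assumptions, not values taken from this thesis.

```python
def frame_difference(prev_frame, next_frame, threshold=25):
    """Return a binary motion mask for two equally sized grayscale frames.

    Each frame is a list of rows of integer intensities (0-255).
    The threshold is a hypothetical value chosen for illustration.
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same height")
    mask = []
    for prev_row, next_row in zip(prev_frame, next_frame):
        if len(prev_row) != len(next_row):
            raise ValueError("frames must have the same width")
        # A large absolute difference means the pixel changed between frames.
        mask.append([1 if abs(b - a) > threshold else 0
                     for a, b in zip(prev_row, next_row)])
    return mask


frame_a = [[10, 10, 10],
           [10, 10, 10],
           [10, 10, 10]]
frame_b = [[10, 10, 10],
           [10, 200, 10],
           [10, 10, 12]]

motion_mask = frame_difference(frame_a, frame_b)     # one changed pixel
identical_mask = frame_difference(frame_a, frame_a)  # all zeros ("dark")
```

The small change from 10 to 12 in the bottom-right pixel stays below the threshold, which illustrates why some thresholding is usually needed to suppress sensor noise.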

3.2.2. Gaussian Mixture Model

GMM is an adaptive approach that models the color or intensity distribution of every pixel in an image as a mixture of several Gaussian distributions. As illustrated in Fig. 3-3, every pixel belonging to the background is modeled as part of the GMM. When a motion pixel appears, it is excluded from the model because its intensity differs greatly from the intensities represented in the model. GMM thus provides an adaptive mechanism for modeling the background color distribution of every pixel, and pixels that do not fit the current model are regarded as motion areas. One advantage of GMM is that its rate of adaptation can be tuned by setting a learning rate. It is also possible to assign different values to parameters such as the mean $\mu$ and the standard deviation $\sigma$ of the Gaussian distributions so that they adapt to various types of background. However, because GMM has to model each pixel in an image, its computational cost is high. Fig. 3-4 gives an example of applying GMM to a video clip. GMM can be used to locate motion areas and to rebuild the background when an object becomes still in the video.
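The adaptive behaviour described above can be sketched for a single pixel as follows. This is a simplified illustration in the spirit of the commonly used adaptive mixture update, not the implementation used in this thesis. The class name `PixelGMM` and every parameter value (number of components, learning rate, initial and minimum variance, matching range of 2.5 standard deviations, background ratio) are assumptions, and a single learning rate is used for the weights, means, and variances to keep the sketch short.

```python
import math


class PixelGMM:
    """Adaptive mixture of Gaussians for the intensity of a single pixel."""

    def __init__(self, max_components=3, learning_rate=0.05,
                 initial_variance=225.0, min_variance=4.0,
                 match_sigmas=2.5, background_ratio=0.7):
        # All defaults are hypothetical values chosen for illustration.
        self.max_components = max_components
        self.alpha = learning_rate
        self.initial_variance = initial_variance
        self.min_variance = min_variance
        self.match_sigmas = match_sigmas
        self.background_ratio = background_ratio
        self.components = []  # each entry is [weight, mean, variance]

    def _background(self):
        # Rank by weight / sigma: heavy, narrow Gaussians are the most stable.
        ranked = sorted(self.components,
                        key=lambda c: c[0] / math.sqrt(c[2]), reverse=True)
        chosen, total = [], 0.0
        for comp in ranked:
            chosen.append(comp)
            total += comp[0]
            if total > self.background_ratio:
                break
        return chosen

    def update(self, value):
        """Classify `value` against the current model, then adapt the model.

        Returns True when the pixel is judged to be foreground (motion).
        """
        matched = None
        for comp in self.components:
            if abs(value - comp[1]) <= self.match_sigmas * math.sqrt(comp[2]):
                matched = comp
                break
        is_foreground = not any(c is matched for c in self._background())

        # Decay every weight; the matched component (if any) is reinforced.
        for comp in self.components:
            comp[0] *= 1.0 - self.alpha
        if matched is not None:
            diff = value - matched[1]
            matched[0] += self.alpha
            matched[1] += self.alpha * diff
            matched[2] = max(self.min_variance,
                             matched[2] + self.alpha * (diff * diff - matched[2]))
        else:
            # No Gaussian explains the value: start a new one at this value.
            new_comp = [self.alpha, float(value), self.initial_variance]
            if len(self.components) < self.max_components:
                self.components.append(new_comp)
            else:
                weakest = min(range(len(self.components)),
                              key=lambda i: self.components[i][0])
                self.components[weakest] = new_comp

        total = sum(c[0] for c in self.components)
        for comp in self.components:
            comp[0] /= total
        return is_foreground


model = PixelGMM()
still = [model.update(100 + (i % 3) - 1) for i in range(50)]  # static scene
first_motion = model.update(200)                  # an object covers the pixel
stays = [model.update(200) for _ in range(100)]   # the object stops moving
```

In this run the pixel is reported as foreground when the intensity first jumps to 200, and as background again once the new intensity has persisted long enough for its Gaussian to dominate the mixture, which mirrors the background rebuilding described above. A larger learning rate shortens that delay at the cost of absorbing slow-moving objects sooner.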

Fig. 3-3: Every pixel belonging to the background is modeled as part of the GMM. $\mu$ and $\sigma$ denote the mean and the standard deviation of a Gaussian distribution, respectively.

Fig. 3-4: (Top) original frames captured from video; (bottom) corresponding motion after performing Gaussian mixture subtraction

3.2.3. Local Binary Pattern

LBP is a texture-based method for background subtraction proposed by Heikkilä and Pietikäinen [5]. Each pixel is modeled as a group of local binary pattern histograms calculated over a circular region around the pixel. The procedure for calculating LBP is illustrated in Fig. 3-5 and Eq. (3-1), in which $LBP_{P,R}(x_c, y_c)$ denotes the LBP code of a center pixel $(x_c, y_c)$, $g_c$ is the gray value of the center pixel, and $g_p$ is the gray value of the $p$-th of $P$ equally spaced neighboring pixels on a circle of radius $R$. LBP was shown to be tolerant to illumination variations, the multimodality of the background, and the introduction or removal of background objects. Furthermore, the method can run in real time.

$$LBP_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{3-1}$$
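The LBP computation can be sketched in Python as follows: each neighbor on the circle is thresholded against the center pixel, and the $p$-th result is weighted by $2^p$. The function name `lbp_code`, the nearest-pixel sampling of the circle (interpolation is also common), and the starting direction and order of the neighbors are assumptions made for illustration.

```python
import math


def lbp_code(image, row, col, num_neighbors=8, radius=1):
    """LBP code of the pixel at (row, col) in a grayscale image.

    `image` is a list of rows of intensities. The caller must keep
    (row, col) at least `radius` pixels away from the image border.
    """
    center = image[row][col]
    code = 0
    for p in range(num_neighbors):
        angle = 2.0 * math.pi * p / num_neighbors
        # Nearest-pixel sampling on the circle (an assumption of this sketch).
        r = row - int(round(radius * math.sin(angle)))
        c = col + int(round(radius * math.cos(angle)))
        if image[r][c] >= center:   # s(g_p - g_c) = 1 when g_p >= g_c
            code += 1 << p          # weight 2^p
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
# The same patch under a uniform brightness increase of 50.
brighter_patch = [[value + 50 for value in row] for row in patch]

code = lbp_code(patch, 1, 1)
brighter_code = lbp_code(brighter_patch, 1, 1)
```

Because only the sign of each difference is used, the two patches produce the same code, which is one way to see the tolerance to illumination variations mentioned above.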

Fig. 3-5: The procedure to generate a local binary pattern (see footnote 7)

After LBP has been calculated for every pixel of an image, the LBP background model can be established by specifying the radius of the region over which LBP statistics are gathered. As illustrated in Fig. 3-6, the histogram that serves as the LBP background model is gathered from the local binary patterns in this predefined region. This is the model for a single pixel only; every pixel in the image has its own histogram, which can be calculated separately. After all the background models have been built, the histogram computed from a new frame can be compared with the existing one, using a simple measure such as histogram intersection or a distance, to see how similar they are. The higher the similarity, the more likely it is that the patterns belong to the existing background. Conversely, if the similarity between the two histograms is lower than a specified threshold, the pixel is not considered part of that background.
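The comparison step can be sketched as follows, assuming normalized histograms and histogram intersection as the similarity measure. The function names and the threshold of 0.7 are hypothetical and serve only to illustrate the decision rule.

```python
def lbp_histogram(codes, num_bins=256):
    """Normalized histogram of the LBP codes gathered around one pixel."""
    if not codes:
        raise ValueError("at least one LBP code is required")
    hist = [0.0] * num_bins
    for code in codes:
        hist[code] += 1.0
    total = float(len(codes))
    return [count / total for count in hist]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def is_background(model_hist, new_hist, threshold=0.7):
    """True when the new histogram is similar enough to the model.

    The threshold is a hypothetical value chosen for illustration.
    """
    return histogram_intersection(model_hist, new_hist) >= threshold


model_hist = lbp_histogram([248, 248, 255, 248])    # existing background model
similar_hist = lbp_histogram([248, 255, 248, 248])  # same texture, new frame
changed_hist = lbp_histogram([248, 248, 0, 1])      # texture has changed

similar_score = histogram_intersection(model_hist, similar_hist)
changed_score = histogram_intersection(model_hist, changed_hist)
```

Here the unchanged texture yields a similarity of 1.0 and is accepted as background, while the changed texture shares only half of its histogram mass with the model and falls below the threshold.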

7. Retrieved and edited from http://www.scholarpedia.org/article/Local_Binary_Pattern

Fig. 3-6: The histogram of LBP background model

3.2.4. Local Ternary Pattern Model

LBP is a computationally efficient local texture descriptor that has been applied successfully to tasks such as texture classification, face recognition, and background modeling.

However, LBP is quite sensitive to random noise in near-uniform image regions, which are common in many home environments. The local ternary pattern (LTP) classifies the neighboring pixels in a region into three sets: greater than, approximately equal to, and less than the center pixel. The formula used to define LBP (Eq. (3-1)) needs only a slight modification to be used for calculating LTP (Eq. (3-2)). As shown in Fig. 3-7, LTP encodes the relationship as a ternary string, so that a slight change in a pixel's intensity value does not alter the corresponding representation, which gives better noise immunity than the original LBP [8].

Fig. 3-7: The procedure for generating a local ternary code

The rest of the procedure for establishing the background model is almost the same as that for LBP background modeling. The major difference is the number of histogram bins. Because the base of the code changes from 2 to 3, the number of bins grows from $2^P$ to $3^P$ for $P$ neighbors, for example from 256 to 6,561 when $P = 8$. If a large radius is adopted, the computational complexity becomes prohibitively high. There is always a trade-off between speed and accuracy.
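A minimal sketch of a ternary code for the eight immediate neighbors is given below. The function name `ltp_code`, the tolerance of 5 gray levels, and the digit assignment (0 for less than, 1 for approximately equal, 2 for greater than) are assumptions made for illustration; the thesis's own definition is the one given by Eq. (3-2).

```python
def ltp_code(image, row, col, tolerance=5):
    """Base-3 LTP code from the 8 immediate neighbors of (row, col).

    `image` is a list of rows of intensities, and (row, col) must not lie
    on the image border. The tolerance is a hypothetical value.
    """
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    center = image[row][col]
    code = 0
    for p, (dr, dc) in enumerate(offsets):
        diff = image[row + dr][col + dc] - center
        if diff > tolerance:
            digit = 2   # greater than the center
        elif diff < -tolerance:
            digit = 0   # less than the center
        else:
            digit = 1   # approximately equal to the center
        code += digit * 3 ** p   # weight 3^p instead of 2^p
    return code


flat_patch = [[100, 100, 100],
              [100, 100, 100],
              [100, 100, 100]]
noisy_patch = [[101, 99, 100],
               [98, 100, 102],
               [100, 101, 99]]
edge_patch = [[100, 100, 100],
              [100, 100, 100],
              [100, 100, 120]]

flat_code = ltp_code(flat_patch, 1, 1)
noisy_code = ltp_code(noisy_patch, 1, 1)
edge_code = ltp_code(edge_patch, 1, 1)

lbp_bins = 2 ** 8   # histogram bins for binary codes with 8 neighbors
ltp_bins = 3 ** 8   # histogram bins for ternary codes with 8 neighbors
```

The flat and the slightly noisy patch receive the same code, while a genuine intensity change in one neighbor produces a different code. The bin counts show the cost of this robustness: the histogram for ternary codes is more than 25 times larger than the one for binary codes at eight neighbors.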
