Model Accuracy, Robustness and Sensitivity

Background Model Maintenance via Density Estimation

3.1.2 Model Accuracy, Robustness and Sensitivity

To estimate a density distribution from a sequence of intensities $I_{0,x}, \ldots, I_{t,x}$² for a pixel at position $x$ via the GMM, three issues need to be addressed: model accuracy, robustness and sensitivity. Specifically, a mixture model consisting of $N$ Gaussian distributions at time instance $t$ can be denoted by

$$P(I_{t,x}) = \sum_{n=1}^{N} w_{t-1,x,n}\, \mathcal{N}\!\left(I_{t,x};\, \mu_{t-1,x,n},\, \sigma^2_{t-1,x,n}\right),$$

where $\mathcal{N}$ symbolizes a Gaussian probability density function $\mathcal{N}(I; \mu, \sigma^2)$.

² Here $I_{t,x} \in \mathbb{R}$ denotes the 1-D pixel intensity only. However, all our formulations can be easily extended to multi-dimensional color image processing, e.g., $I_{t,x} \in \mathbb{R}^3$.

Here, $\mu_{t-1,x,n}$ and $\sigma^2_{t-1,x,n}$ are the Gaussian parameters of the $n$th model, and $w_{t-1,x,n}$ is the respective mixture weight. To maintain this mixture model, the parameters $\mu_{t-1}$, $\sigma^2_{t-1}$ and $w_{t-1}$ need to be updated based on a new observation $I_{t,x}$. In the GMM, the update rule for $\mu$, for the case that $I_{t,x}$ matches the $n$th Gaussian model, is

$$\mu_{t,x,n} = (1-\rho)\,\mu_{t-1,x,n} + \rho\, I_{t,x},$$

where $\rho \in [0, 1]$ is a learning rate³ that controls how fast the estimate $\mu$ converges to new observations. Similar update rules can be applied to renew $\sigma^2$ and $w$, given corresponding learning rates.
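As a minimal sketch of the definitions above, the following Python fragment evaluates the mixture density for one pixel and applies the mean update for a matched Gaussian. The function names and the numeric values are illustrative assumptions, not taken from the source.

```python
import math

def gaussian_pdf(i, mu, var):
    """Density of a 1-D Gaussian N(i; mu, var), where var is the variance."""
    return math.exp(-0.5 * (i - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_density(i, weights, means, variances):
    """P(I) = sum_n w_n * N(I; mu_n, var_n) for a single pixel."""
    return sum(w * gaussian_pdf(i, m, v)
               for w, m, v in zip(weights, means, variances))

def update_mean(mu_prev, i, rho):
    """mu_t = (1 - rho) * mu_{t-1} + rho * I_t for the matched Gaussian."""
    return (1.0 - rho) * mu_prev + rho * i

# Illustrative values only (hypothetical two-model mixture for one pixel).
weights = [0.7, 0.3]
means = [100.0, 180.0]
variances = [25.0, 100.0]

p = mixture_density(104.0, weights, means, variances)
mu_new = update_mean(100.0, 104.0, rho=0.25)  # 0.75 * 100 + 0.25 * 104
```

With `rho=0.25`, the mean moves a quarter of the way from 100 to 104; a smaller `rho` would move it proportionally less per frame.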

When updating the Gaussian parameters $\mu$ and $\sigma^2$, their values should reflect the up-to-date statistics of a scene as accurately as possible. It is thus preferable to set their learning rates to large values so that Gaussian distributions fitting the new observations are derived quickly. As also noted in [36], setting higher learning rates for $\mu$ and $\sigma^2$ improves model convergence and accuracy, and has few side effects on model stability.
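To make the effect of the learning rate on convergence concrete, consider a simplified case, assumed here purely for illustration: the observation stays constant at $I$, it always matches the same model, and $\rho$ is fixed. Subtracting $I$ from both sides of the mean update gives a geometric decay of the estimation error:

$$
\begin{aligned}
\mu_{t} - I &= (1-\rho)\,(\mu_{t-1} - I), \\
\mu_{t_0+k} - I &= (1-\rho)^{k}\,(\mu_{t_0} - I).
\end{aligned}
$$

Under these assumptions, reducing the error to 5% of its initial value requires $k \ge \ln 0.05 / \ln(1-\rho)$ frames, i.e., 29 frames for $\rho = 0.1$ but 299 frames for $\rho = 0.01$.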

While the model estimation accuracy depends on the learning rates for $\mu$ and $\sigma^2$, the robustness-sensitivity (R-S) trade-off is affected by the learning rate for the mixture weight $w$. In the original GMM for background model estimation, Gaussian models are classified into foreground and background by thresholding their mixture weights. The Gaussian models that appear more often receive larger weights in the model updating process, and are likely to be labeled as background [54]. However, the frequency of model occurrence should not be the only factor that guides the changes of mixture weights. For example, one may prefer to give large weights to the Gaussian models of tree shadows (for background adaptation) while keeping small weights for those of parked cars (for foreground detection), despite the similar frequencies of occurrence of these two objects. By incorporating high-level information about pixel types, e.g., shadow or car, into the weight updating process, flexible background modeling can be carried out. As a surveillance system designates more pixel types, more appropriate controls on weight changes can be applied accordingly, which helps resolve the R-S trade-off in background modeling. Based on this observation, we propose a new bivariate learning rate control scheme for the GMM based on feedback of pixel types.

³ The definition of learning rate is inherited from [54].

3.2 Bivariate Learning Rate Control via High-Level Feedback

Our presentation of the proposed bivariate learning rate control via high-level feedback is divided into three parts. Firstly, an algorithm of background model maintenance using the GMM is proposed, wherein two types of learning rates are formally defined. We highlight the importance of the learning rate control for mixture weights and elaborate its relationship to foreground pixel labeling. Secondly, a feedback scheme that controls the learning rates for mixture weights is detailed. Under this feedback control, different learning rates can be applied to different image locations and scene types, which makes dynamic background adaptation possible. Thirdly, a heuristic based on frame difference is introduced to assist the learning rate control in adapting to very fast lighting changes. False alarms caused by, for example, sudden changes of sunshine in the background can hence be suppressed by this heuristic, while significant object motions can still be captured.

3.2.1 Background Model Maintenance

Given a new observation of pixel intensity $I_{t,x}$, the task of background model maintenance is to match this new observation to the existing Gaussian distributions, if possible, and to renew all the parameters of the Gaussian mixture model for this pixel. The detailed steps of the proposed background model maintenance using the GMM are shown in Algorithm 2.

For the model matching in Algorithm 2, $l(t, x)$ indexes the best-matched Gaussian model of $I_{t,x}$, if one exists. Otherwise, $l(t, x) = 0$ is set to indicate that $I_{t,x}$ is a brand-new observation and should be modeled by a new Gaussian distribution. The matching results of $I_{t,x}$ are recorded by model matching indicators $M_{t,x,n}$, which will be used in the later model update.

Unlike [54], which adopts a more complex formulation in model matching, i.e.,

$$l(t, x) = \arg\min_{n=1,\ldots,N} \frac{\left| I_{t,x} - \mu_{t-1,x,n} \right|}{\sigma_{t-1,x,n}}, \qquad (3.1)$$

a simple rule that selects the model of higher weight as the best match is used in Algorithm 2. If an observation falls within the scopes of multiple models, the proposed weight-based matching rule prefers matching it to the Gaussian model of the background (with higher weight) rather than to those of the foreground. Using this rule not only saves computational cost but also fits the proposed rate control scheme better, as will be discussed in more detail later.
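The difference between the two matching rules can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify here how the "scope" of a model is tested, so the sketch assumes a threshold of `t_sigma` standard deviations around the mean; the function names and `t_sigma=2.5` are hypothetical.

```python
import math

def match_by_distance(i, means, variances):
    """Rule (3.1): 1-based index of the model minimizing |I - mu_n| / sigma_n."""
    dists = [abs(i - m) / math.sqrt(v) for m, v in zip(means, variances)]
    return min(range(len(dists)), key=dists.__getitem__) + 1

def match_by_weight(i, weights, means, variances, t_sigma=2.5):
    """Weight-based rule: among the models whose scope contains I, pick the
    one with the largest weight; return 0 if no model matches.
    ASSUMPTION: the scope test is |I - mu_n| <= t_sigma * sigma_n."""
    candidates = [n for n, (m, v) in enumerate(zip(means, variances))
                  if abs(i - m) <= t_sigma * math.sqrt(v)]
    if not candidates:
        return 0  # l(t, x) = 0: brand-new observation
    return max(candidates, key=lambda n: weights[n]) + 1

# Illustrative values: model 1 is a heavy "background" model,
# model 2 a light "foreground" model with an overlapping scope.
weights = [0.8, 0.2]
means = [100.0, 110.0]
variances = [100.0, 16.0]

l_dist = match_by_distance(109.0, means, variances)
l_weight = match_by_weight(109.0, weights, means, variances)
l_none = match_by_weight(200.0, weights, means, variances)
```

For the observation 109, the normalized-distance rule selects the lightly weighted model 2, whereas the weight-based rule selects the heavier model 1, since the observation lies within the assumed scope of both.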

After model matching, we check if M_t,x,l(t,x) is equal to 0, which implies no model matched. If so, a model replacement is performed to incorporate I_t,x into the GMM; otherwise, a model update is executed. In the replacement phase, the least weighted Gaussian model is replaced by the current intensity observation. In the update phase, the following three rules,

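$$\mu_{t,x,l(t,x)} = \left(1 - \rho_{t,x,l(t,x)}(\alpha)\right) \mu_{t-1,x,l(t,x)} + \rho_{t,x,l(t,x)}(\alpha)\, I_{t,x}, \qquad (3.2)$$

$$\sigma^2_{t,x,l(t,x)} = \left(1 - \rho_{t,x,l(t,x)}(\alpha)\right) \sigma^2_{t-1,x,l(t,x)} + \rho_{t,x,l(t,x)}(\alpha) \left(I_{t,x} - \mu_{t,x,l(t,x)}\right)^2, \qquad (3.3)$$

$$w_{t,x,n} = \left(1 - \eta_{t,x}(\beta)\right) w_{t-1,x,n} + \eta_{t,x}(\beta)\, M_{t,x,n}, \qquad (3.4)$$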

are applied, where $\rho_{t,x,l(t,x)}(\alpha) \in \mathbb{R}$ denotes the learning rate for the Gaussian parameters $\mu$ and $\sigma^2$, and $\eta_{t,x}(\beta) \in \mathbb{R}$ is a new learning rate introduced in this research for controlling the updating speed of the mixture weight $w$. Here, the two scalars $\alpha$ and $\beta$ can be viewed as hyper-parameters over $\rho$ and $\eta$ for tuning their values. In [54], the learning rate $\rho$ is defined as

$$\rho_{t,x,l(t,x)}(\alpha) \doteq \alpha\, \mathcal{N}\!\left(I_{t,x};\, \mu_{t-1,x,l(t,x)},\, \sigma^2_{t-1,x,l(t,x)}\right), \qquad (3.5)$$

while in [36] it is given by a different definition, referred to here as (3.6). Although (3.6) results in quicker convergence in Gaussian parameter learning [36], we still choose (3.5) in our implementation for experimental comparisons and put our emphasis on the control of the learning rate $\eta$ for the mixture weight. In later experiments we will show that better performance can be achieved by controlling the learning rate $\eta$ than by tuning the rate $\rho$. Also, as noted in [36], typical values of $\alpha$ lie in $[0.001, 0.1]$ for both (3.5) and (3.6), yielding a wide range of convergence rates in Gaussian parameter estimation. Here we set $\alpha = 0.025$ as a default value for quick model learning.
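The replacement and update phases described above, with the rules (3.2) to (3.4) and the learning rate (3.5), might be sketched for a single pixel as follows. This is not the authors' Algorithm 2 but a hedged illustration: the matching indicator is assumed to be 1 for the matched model and 0 otherwise, the initial variance `init_var` of a replaced model is a hypothetical value, and the weights are left unchanged in the replacement phase because the source does not state how they are reset there.

```python
import math

def gaussian_pdf(i, mu, var):
    """Density of a 1-D Gaussian N(i; mu, var), where var is the variance."""
    return math.exp(-0.5 * (i - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def maintain_pixel(i, weights, means, variances, matched,
                   alpha=0.025, eta=0.025, init_var=225.0):
    """One maintenance step for one pixel.

    matched is the 1-based index l(t, x); 0 means no model matched.
    Returns new (weights, means, variances) lists; inputs are not modified.
    """
    weights, means, variances = list(weights), list(means), list(variances)
    n_models = len(weights)
    if matched == 0:
        # Replacement phase: the least-weighted model is replaced by the
        # current observation. ASSUMPTION: init_var and unchanged weights.
        k = min(range(n_models), key=weights.__getitem__)
        means[k] = i
        variances[k] = init_var
        return weights, means, variances
    # Update phase.
    k = matched - 1
    rho = alpha * gaussian_pdf(i, means[k], variances[k])                # (3.5)
    means[k] = (1.0 - rho) * means[k] + rho * i                          # (3.2)
    # (3.3) uses the already updated mean, as written in the text.
    variances[k] = (1.0 - rho) * variances[k] + rho * (i - means[k]) ** 2
    # ASSUMPTION: M = 1 for the matched model, 0 for all others.
    indicators = [1.0 if n == k else 0.0 for n in range(n_models)]
    weights = [(1.0 - eta) * w + eta * m                                 # (3.4)
               for w, m in zip(weights, indicators)]
    return weights, means, variances

w, m, v = maintain_pixel(104.0, [0.7, 0.3], [100.0, 180.0], [25.0, 100.0],
                         matched=1)
```

If the weights sum to one before the update and exactly one indicator equals 1, rule (3.4) keeps the sum at one, so no renormalization is needed in this sketch.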

In previous background modeling research, e.g., [25], [36], [54], a naive setting for the mixture weight update, i.e.,

$$w_{t,x,n} = (1-\alpha)\, w_{t-1,x,n} + \alpha\, M_{t,x,n}, \qquad (3.7)$$

is adopted. The rule (3.7) can be viewed as a special case of the proposed weight update (3.4) with $\eta_{t,x} = \alpha$. In (3.7), all image pixels are confined to an identical rate setting in mixture weight learning, so scene changes cannot be handled properly with respect to space and time. In contrast, our generalization assigns individual learning rates for mixture weights to image pixels and adapts them over time, which gives higher flexibility in regularizing background adaptation. Note that the index $n$ is not attached to $\eta_{t,x}$ because the rates of change of the weights $w_{t,x,n}$, $\forall n$, are designed to be consistent among the $N$ Gaussian models of the same image pixel. The computation of $\eta_{t,x}$ is linked to the high-level feedback of pixel types; the feedback control is described in Sec. 3.2.2.

⁴ Interested readers can find the details in [36].

Algorithm 2: Background model maintenance
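To illustrate why a per-pixel rate $\eta_{t,x}$ in (3.4) is more flexible than the global $\alpha$ in (3.7), the sketch below counts how many consecutive matched frames a newly created model needs before its weight exceeds a threshold. The threshold of 0.5, the initial weight of 0.05 and the three rates are hypothetical values chosen only for illustration; the function name is likewise an assumption.

```python
def frames_until_weight_exceeds(w0, eta, threshold=0.5, max_frames=100000):
    """Count consecutive matched frames (M = 1) until the weight updated by
    w <- (1 - eta) * w + eta * M exceeds the threshold."""
    w, frames = w0, 0
    while w <= threshold and frames < max_frames:
        w = (1.0 - eta) * w + eta * 1.0
        frames += 1
    return frames

alpha = 0.025  # global rate as in (3.7), i.e. eta = alpha for every pixel

# Hypothetical per-pixel rates: fast adaptation, the default, slow adaptation.
fast = frames_until_weight_exceeds(0.05, eta=0.2)
default = frames_until_weight_exceeds(0.05, eta=alpha)
slow = frames_until_weight_exceeds(0.05, eta=0.001)
```

Under these assumed values, the weight passes the threshold after 3 frames with the fast rate, 26 frames with `eta = alpha`, and several hundred frames with the slow rate, which is the kind of spatially varying behavior a single global rate cannot express.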

In the GMM, all the scene changes, regardless of being foreground or background, are modeled by Gaussian distributions. To further distinguish these two classes, a foreground indicator $F_{t,x,n}$ for each Gaussian model is defined using the
