
The panel data shown in Figure 1.4 represent a large vehicle in lane 2 and a small vehicle in lane 6, and depict the energy distributions and magnitudes of the vehicles.

Figure 1.4 shows that the spread-spectrum of a large vehicle is so wide that the neighboring lanes are influenced, and reveals that a large vehicle tends to have higher reflected energy and a wider spread-spectrum than a small one.

Conceptually, the energy magnitude and the spread-spectrum could be selected as features for classifying vehicles. However, in multi-lane environments, the spread-spectrum of a large vehicle may simultaneously produce a virtual pattern in the adjacent lanes that resembles a small vehicle or motorcycle, which makes it hard to tell whether a vehicle or motorcycle actually exists in those lanes.

In this work, the automatic learning framework shown in Figure 6.1 is proposed for developing a real-time vehicle classifier, and its flow chart can be described as follows. First, the received voltage data are converted into the frequency-domain information shown in Figure 1.2 via a Fast Fourier Transform. Then an algorithm is applied to determine whether the transformed information forms a vehicle. If the reflected signals of a vehicle are obtained, the features of the vehicle are extracted from a series of frequency-domain frames. Once sufficient training vehicles have been gathered (about 30 vehicles), a two-dimensional Gaussian mixture model (GMM) is introduced to describe the extracted features, and an expectation-maximization (EM) algorithm (1977) [45] is used to estimate the GMM parameters.

Figure 6.1: Automatic learning framework of real-time vehicle classifier. The first step is to retrieve the real-time voltage data, and the voltage data are large enough to generate 200 frequency-domain frames per second uniformly. Flow chart blocks: retrieve real-time voltage data; convert the voltage data to the frequency-domain information; is a vehicle? (yes/no); extract the features; enough vehicles? (yes/no); learn 2D GMM parameters via EM; form a real-time vehicle classifier.
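As a rough illustration of the first conversion step, the sketch below turns one block of voltage samples into a magnitude spectrum. It uses a naive discrete Fourier transform only to stay self-contained; a practical implementation would call an FFT routine. The function name `magnitude_spectrum` and the synthetic 64-sample tone are hypothetical and are not taken from the thesis.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return |X[k]| for k = 0 .. N/2 - 1 using a naive O(N^2) DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):  # keep the non-redundant half for real input
        acc = 0j
        for t, v in enumerate(samples):
            acc += v * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


# Synthetic frame (assumption): 64 samples of a tone that falls exactly in bin 5.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = magnitude_spectrum(frame)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

Each real voltage file would play the role of `frame`, and the resulting spectrum corresponds to one frequency-domain frame.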

6.1.1 Vehicle Detection Algorithm

The aim of the vehicle detection algorithm is to determine whether a vehicle exists from the panel data shown in Figure 1.4. For this purpose, a multi-threshold detection algorithm is proposed to detect vehicles, and it is described as follows.

Step 1 start point detection. Assume there are around 200 frequency-domain frames per second, based on the capability of the analog-to-digital converter, and that the arrival rate is uniform. The variance and magnitude thresholds (Th₁ and Th₂) are set to obtain the start point of a potential vehicle. If the number of successive frames that satisfy the thresholds (i.e., above Th₁ and Th₂) exceeds three within a lane’s frequency range, the start point of a potential vehicle is obtained. Otherwise, the reflected signals are regarded as background information. Note that Th₁ and Th₂ are set relatively low so that as many potential vehicles as possible are detected.

Step 2 end point detection. The end point is obtained if the number of successive frames that fall below the thresholds (i.e., below Th₁ and Th₂) exceeds Th₃. Th₃ is the least number of frames required to separate vehicles one by one, and it can be obtained from the following equation:

$$Th_3 = \frac{\text{frames} \times (\text{headway} - \text{length of detection zone})}{\text{the lowest speed of vehicle}}, \tag{6.1.1}$$

where the minimal headway has to be larger than the virtual loop (i.e., the detection zone); otherwise there is no way to separate vehicles one by one.

Step 3 minimal active duration detection. According to the start and end points, the active duration can be calculated, and the minimal active duration threshold Th₄ can be calculated based on the following equation:

$$Th_4 = \frac{\text{frames} \times (\text{vehicle length} + \text{length of detection zone})}{\text{the highest speed of vehicle}}. \tag{6.1.2}$$

If the active duration is less than Th₄, the data will be regarded as a virtual pattern or noise from the adjacent lanes.

Figure 6.2: Virtual vehicle detection. (a) is a scenario in which a virtual vehicle exists, and (b) and (c) show that there may be two vehicles in the adjacent lanes.

Step 4 virtual vehicle detection. Because the spread-spectrum of a large vehicle may simultaneously cause a virtual pattern in the adjacent lanes that is similar to a small vehicle or motorcycle, the appearing durations of the “real” and “virtual” vehicles overlap heavily in time. Therefore, an overlapping ratio threshold Th₅ is proposed to detect whether a virtual vehicle exists. The overlapping ratio in each lane is calculated by dividing the overlapping duration by the active duration. If any overlapping ratio in the neighboring lanes is above the threshold (i.e., above Th₅), a virtual vehicle is detected. Figure 6.2(a) illustrates a case in which a virtual vehicle can be detected; one of the vehicles could be a virtual vehicle. Figure 6.2(b) and (c) show cases in which there are two potential vehicles in the adjacent lanes.

Step 5 virtual vehicle obviation. After a virtual vehicle is detected, the energy magnitudes within the overlapping duration are compared. Since the reflected signals of a real vehicle are always larger than those of virtual ones, a lane that consistently has the higher energy magnitude in this comparison is regarded as containing the real vehicle, whereas the others are regarded as virtual vehicles.
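Steps 1 to 4 above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the boolean `active` sequence (True when a frame exceeds both Th₁ and Th₂), and all numeric values in the example are hypothetical, and segments still open when the data end are simply dropped.

```python
def th3_frames(frame_rate, min_headway, zone_length, lowest_speed):
    """Eq. (6.1.1): frames of silence needed to separate two vehicles."""
    return frame_rate * (min_headway - zone_length) / lowest_speed


def th4_frames(frame_rate, vehicle_length, zone_length, highest_speed):
    """Eq. (6.1.2): minimal active duration of a real vehicle, in frames."""
    return frame_rate * (vehicle_length + zone_length) / highest_speed


def detect_segments(active, start_run, th3, th4):
    """Steps 1-3 on one lane; returns (start, end) pairs, end exclusive."""
    segments = []
    start = None
    above = below = 0
    for i, flag in enumerate(active):
        if start is None:
            above = above + 1 if flag else 0
            if above > start_run:          # Step 1: start point found
                start = i - above + 1
                below = 0
        else:
            below = 0 if flag else below + 1
            if below > th3:                # Step 2: end point found
                end = i - below + 1
                if end - start >= th4:     # Step 3: long enough to be real
                    segments.append((start, end))
                start, above = None, 0
    return segments


def overlap_ratio(seg, other):
    """Step 4: overlapping duration divided by the active duration of seg."""
    overlap = min(seg[1], other[1]) - max(seg[0], other[0])
    return max(0, overlap) / (seg[1] - seg[0])


# Toy data (assumption): 40 active frames, then a 2-frame blip that is rejected.
active = [False] * 5 + [True] * 40 + [False] * 20 + [True] * 2 + [False] * 20
segments = detect_segments(active, start_run=3, th3=10, th4=20)
print(segments)
print(overlap_ratio((10, 30), (5, 45)))
```

With an assumed frame rate of 200 frames per second, `th3_frames(200, 10.0, 2.0, 5.0)` and `th4_frames(200, 4.0, 2.0, 30.0)` show how the two thresholds would be derived from site-specific headway, length and speed values.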

6.1.2 Vehicle Feature Extraction

The main difficulties regarding vehicle feature extraction are as follows. First, to provide accurate features, the vehicle detection algorithm and accurate lane positions are required in advance. However, the spread of the reflected signals of a large vehicle usually influences neighboring lanes, which degrades the accuracy of vehicle detection.

Second, a good feature for describing moving vehicles is difficult to obtain, because most features are affected by vehicle speed. One of our contributions is that the two proposed features can be obtained in real time and are less sensitive to speed and to the signals of vehicles in neighboring lanes. The magnitude and variance in the frequency range of the corresponding lane depend strongly on the vehicle size when a vehicle passes through the detection zone. Consequently, a pair of features, “maximal energy peak” and “maximal energy variance”, is proposed to form a learning sample x ∈ R²⁺.

After the reflected signals of a vehicle are obtained, the maximal energy peak can be extracted as follows. First, the maximal energy value within the corresponding lane is found in each frequency-domain frame shown in Figure 1.2(b). The series of maximal energies within the active duration then forms a frequency-domain characteristic, and the peak of this series is chosen as the maximal energy peak.

The other selected feature, the maximal energy variance, is processed in the same way, except that the maximal energy value is replaced by the energy variance, that is, the variance of the energies within each lane.
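The two features described above can be sketched as follows. The function name `extract_features`, the inclusive `lane_bins` range, and the use of the population variance are assumptions made for illustration; the text does not say which variance estimator is used.

```python
from statistics import pvariance


def extract_features(frames, lane_bins):
    """Return (maximal energy peak, maximal energy variance) for one vehicle.

    frames: magnitude spectra recorded within the active duration.
    lane_bins: (first_bin, last_bin), inclusive bin range of the lane.
    """
    lo, hi = lane_bins
    per_frame_max = []
    per_frame_var = []
    for frame in frames:
        lane = frame[lo:hi + 1]
        per_frame_max.append(max(lane))        # maximal energy in this frame
        per_frame_var.append(pvariance(lane))  # energy variance in this frame
    # The peak of each series over the active duration is the feature.
    return max(per_frame_max), max(per_frame_var)


# Toy frames (assumption): three spectra of five bins, lane occupying bins 1..3.
frames = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 3.0, 9.0, 3.0, 0.0],
    [0.0, 2.0, 4.0, 2.0, 0.0],
]
peak, variance = extract_features(frames, (1, 3))
print(peak, variance)
```

The pair `(peak, variance)` corresponds to one learning sample x.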

6.1.3 Two-Dimensional Gaussian Mixture Model

In this study, large and small vehicles are the two groups to be detected. Motorcycles are filtered out, because the effective reflected length of a motorcycle is too short and its magnitude is too small to distinguish.

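A GMM with two components can be described as:

$$p(x) = \alpha_1 g(x; \mu_1, \Sigma_1) + \alpha_2 g(x; \mu_2, \Sigma_2), \tag{6.1.3}$$

where g(x; µ, Σ) is a Gaussian density function shown as below:

$$g(x; \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^2 |\Sigma|}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right), \tag{6.1.4}$$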

and α₁, α₂ denote the non-negative weights, which add up to one. µ and Σ denote the mean and the covariance matrix of the Gaussian distribution, respectively. All of these parameters can be estimated by using the EM algorithm, and the iterative procedure stops when the difference between several successive iterations falls below a specified tolerance level. The expectation of the likelihood with two vehicle types is detailed as follows:

E(xpectation)-step: the expectation is evaluated at θ^(k), the estimate at step k. To simplify the discussion, the posterior probability β_j(x) of vehicle type j is expressed as follows:

$$\beta_j(x) = \frac{\alpha_j g(x; \mu_j, \Sigma_j)}{\alpha_1 g(x; \mu_1, \Sigma_1) + \alpha_2 g(x; \mu_2, \Sigma_2)}. \tag{6.1.5}$$

To maximize the likelihood J(θ), the update formulas for µ_j and σ²_j of each vehicle type j can be derived from the first-order necessary condition.

Moreover, in order to meet the equality constraint $\sum_{j=1}^{2} \alpha_j = 1$, Lagrange multipliers are introduced to define a new objective function J_new, from which α_j can also be derived. These updates are applied iteratively, such that the likelihood of the GMM approaches the maximum.
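For reference, the textbook EM updates for a two-component GMM can be written in the notation of (6.1.3) to (6.1.5) as below. The sample index i, the sample count N, and the iteration superscripts are assumptions introduced here, and the form may differ in detail from the derivation intended above (for instance, it updates the full covariance Σ_j rather than a scalar σ²_j).

```latex
\begin{align}
\beta_j^{(k)}(x_i) &= \frac{\alpha_j^{(k)}\, g\bigl(x_i; \mu_j^{(k)}, \Sigma_j^{(k)}\bigr)}
                           {\sum_{l=1}^{2} \alpha_l^{(k)}\, g\bigl(x_i; \mu_l^{(k)}, \Sigma_l^{(k)}\bigr)}, \\
\mu_j^{(k+1)} &= \frac{\sum_{i=1}^{N} \beta_j^{(k)}(x_i)\, x_i}{\sum_{i=1}^{N} \beta_j^{(k)}(x_i)}, \\
\Sigma_j^{(k+1)} &= \frac{\sum_{i=1}^{N} \beta_j^{(k)}(x_i)\,
                    \bigl(x_i - \mu_j^{(k+1)}\bigr)\bigl(x_i - \mu_j^{(k+1)}\bigr)^T}
                   {\sum_{i=1}^{N} \beta_j^{(k)}(x_i)}, \\
\alpha_j^{(k+1)} &= \frac{1}{N} \sum_{i=1}^{N} \beta_j^{(k)}(x_i).
\end{align}
```

The last line is the result of enforcing the sum-to-one constraint on the weights with a Lagrange multiplier.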

6.2 Estimation of Vehicle Speed

Most radars use the Doppler effect to obtain the relative speed of a target. However, the illumination direction of a road-side radar detector is perpendicular to the direction of traffic, so the Doppler effect is weak and hard to utilize in our study.

In this study, the single-loop-detector principle for estimating vehicle speed is employed. The formula is shown in equation (6.2.1). The vehicle length can be derived directly from the vehicle type obtained in the previous section, under the assumption that vehicles of the same type have equal length. In addition, the length of the detection zone is assumed to equal the 3 dB beamwidth of the antenna in the E-plane.

The denominator is the time difference between the start point and the end point obtained in Steps 1 and 2 of the vehicle detection algorithm. The estimated vehicle speed is very rough, because all variables in equation (6.2.1) carry considerable uncertainty.

Consequently, the vehicle speed estimated by equation (6.2.1) is only a theoretical value.

$$\text{vehicle speed} = \frac{\text{vehicle length} + \text{length of detection zone}}{\text{the occupancy of time}}. \tag{6.2.1}$$
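A small worked example of equation (6.2.1) follows. The function names, the 4.5 m vehicle length, the 2.0 m detection zone and the frame indices are illustrative assumptions only; the 200 frames per second figure comes from the detection algorithm described earlier.

```python
def occupancy_seconds(start_frame, end_frame, frame_rate=200.0):
    """Time between the start and end points, converted from frames to seconds."""
    return (end_frame - start_frame) / frame_rate


def estimate_speed(vehicle_length_m, zone_length_m, occupancy_s):
    """Eq. (6.2.1): rough speed estimate in metres per second."""
    if occupancy_s <= 0:
        raise ValueError("occupancy time must be positive")
    return (vehicle_length_m + zone_length_m) / occupancy_s


# Assumed values: 100 active frames, 4.5 m vehicle, 2.0 m detection zone.
occupancy = occupancy_seconds(5, 105)
speed = estimate_speed(4.5, 2.0, occupancy)
print(occupancy, speed, speed * 3.6)
```

Because the vehicle length is fixed per vehicle type, any misclassification feeds directly into the speed estimate, which is consistent with the caution expressed above.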

Chapter 7 Numerical results

7.1 Field Test of Lane Boundary Estimator

The real-world experimental environment shown in Figure 7.1 can be mapped to the sketch map shown in Figure 1.1. In the real-world experimental scenario, h is 5 meters and D is around 60 meters. The real-world data, including signal and video files, are gathered simultaneously via an industrial PC. The operating frequency of the road-side radar detector is 10.5 GHz, and the RF bandwidth is 50 MHz. The hardware blocks of the road-side radar detector include the CMOS transceiver, the digital signal processing (DSP) unit, and two antenna arrays [16], all of which are integrated in a metallic box. The algorithm in the DSP is written in C. Each range bin is approximately 78 cm long, although this may vary with the sensor resolution.

A total of 102,041 files involving 172 real-world vehicles are gathered, and each file comprises 512 voltage values. The frequency-domain information shown in Figure 2(b) can be obtained by using an FFT to convert the time-domain voltage values into the frequency domain. The summary of vehicle types in the multi-lane environment is shown in Table 7.1, which reveals that the learning samples comprise different vehicle types, including motorcycles, small cars, and large cars. Notably, since the properties of motorcycles are not suitable for identifying the lane positions, the motorcycle information is filtered out in the entire learning procedure.

Figure 7.1: Real-world experimental environments in the suburbs. (a) is a sunny scenario, and (b) is a rainy scenario.

7.1.1 Results of Single-Value Information

According to the feature proposed in [11], that is, a single passing vehicle contributes just one count to the corresponding bin, the accumulated vehicle count can be organized as the histogram shown in Figure 7.2. Then the GMM with different initial values (including a K-means initialization) is fitted with an EM algorithm to locate the lane positions, and the learning results with different initial values converge to the four primary outcomes shown in Figure 7.3. The peaks of the probability density functions describe the center of each lane, and the valleys of the probability density functions represent the lane boundaries. The recorded video images are used to verify whether each passing vehicle appeared in the corresponding lane range. The result shows that the traffic count in each lane cannot be captured accurately, since the Gaussian component with the larger variance mostly covers two adjacent lanes.

Table 7.1: Summary of vehicle types in a multi-lane environment, which includes motorcycles, small vehicles (e.g., Honda Civic, Toyota Camry, Volkswagen T4), and large vehicles (e.g., cement trucks, gravel trucks, and large buses).

Motorcycle Small vehicle Large vehicle

Lane 1 8 35 1

Lane 2 1 36 7

Lane 3 7 38 10

Lane 4 2 27 0

Figure 7.2: The histogram of the accumulated vehicle count mentioned in the patent [11]. That is, a single passing vehicle contributes just one count to the corresponding bin.

Figure 7.3: The learned Gaussian density functions of the GMM based on the histogram shown in Figure 7.2. Panels (a), (b), (c), and (d) are the results obtained by an EM algorithm with different initial values.

Table 7.2: The learned parameters of the GMM obtained by an EM algorithm.

Weight Mean Variance

Component 1 0.36 22.15 0.98

Component 2 0.30 25.66 1.21

Component 3 0.27 35.08 0.95

Component 4 0.06 38.43 0.98

Table 7.3: The learned parameters of the GMM obtained by the variant of the EM algorithm.

Weight Mean Variance

Component 1 0.39 22.29 0.97

Component 2 0.27 25.92 1.00

Component 3 0.28 35.10 0.95

Component 4 0.06 38.61 0.82

7.1.2 Results of Span and Conflict Information

The histograms of the accumulated span and conflict information are shown in Figure 7.4. The results learned by the EM algorithm and by its variant from these histograms are shown in Figure 7.5(a) and (b), respectively. Tables 7.2 and 7.3 list the GMM parameters obtained by the EM algorithm and by its variant, respectively.

Table 7.1 shows that lanes 3 and 4 contain more small cars and large cars than lanes 1 and 2, yet the accumulated span and conflict information in lanes 3 and 4 is less than that in lanes 1 and 2. This is because the peak information caused by vehicles passing in lanes 3 and 4 is less apparent than that in lanes 1 and 2 when vehicles pass through the front of the detection area simultaneously.

The learned cross points between lane 1 and lane 2 shown in Figure 7.5(a) and (b) are both around 24; hence, the lane boundary can be set to 24. To avoid the ambiguous areas, the range of lane 1 can be set to 21-23, and the range of lane 2 can be set to 25-27. Similarly, the range of lane 3 can be set to 34-36, while the range of lane 4 is set to 38-40. Using these rules, the classification results are summarized in Table 7.4.

Figure 7.4: The histograms of real-world data. The top histogram is for the accumulated span information, while the lower histogram is for the conflict information.

Figure 7.5: (a) shows the learned Gaussian density functions obtained by using an EM algorithm, and (b) shows the outcome obtained by using the variant of the EM algorithm. The difference between the learning results in (a) and (b) is that, given the same span and conflict information, the variance of the Gaussian components in (b) is not larger than that in (a). Both results are more accurate than those shown in Figure 7.3, which indicates that the span and conflict information are superior to the single-value information.
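The bin ranges chosen for the four lanes (21-23, 25-27, 34-36 and 38-40) amount to a simple lookup rule. The sketch below shows one way to express it; the names `LANE_RANGES` and `lane_of_bin` are hypothetical, and returning `None` for bins in the ambiguous gaps is an assumed convention.

```python
# Inclusive bin ranges per lane, taken from the learned lane boundaries.
LANE_RANGES = {1: (21, 23), 2: (25, 27), 3: (34, 36), 4: (38, 40)}


def lane_of_bin(bin_index):
    """Return the lane whose range contains bin_index, or None if ambiguous."""
    for lane, (lo, hi) in LANE_RANGES.items():
        if lo <= bin_index <= hi:
            return lane
    return None


print(lane_of_bin(22), lane_of_bin(24), lane_of_bin(39))
```

Bin 24, the learned boundary between lanes 1 and 2, deliberately maps to no lane.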

In our numerical experiments, based on the histograms generated from the proposed span and conflict information of the gathered radar data, the means of the Gaussian components estimated by an EM algorithm are usually very close to the means estimated by its variant, and hundreds of simulation results also show that the variance reduction can always be achieved by using the variant of the EM algorithm.

The variances of the Gaussian functions in Table 7.3, obtained by the variant of the EM algorithm, are not larger than those in Table 7.2, obtained by an EM algorithm.

The computational burden of the variant of the EM algorithm is less than that of the EM algorithm, because the learning samples for each component comprise only the points assigned to it on the basis of the highest likelihood.

For the same reason, the EM variant converges faster than the native EM. The native EM evaluates about N × M Gaussian probabilities, while the EM variant evaluates about N, which is independent of the number of components. Furthermore, the component variance shown in Figure 7.5(b) is smaller on average than that shown in Figure 7.5(a); that is, the proposed EM variant can be expected to reduce the variance.

The classification result in Table 7.4 shows that over 95% of the recorded passing vehicles are captured in the corresponding lanes. The accuracy can be further improved provided that the passing vehicles do not exhibit lane-changing behavior when passing through the detection area. Each vehicle type except for motorcycles reaches nearly the same high accuracy in each lane, whether on sunny or rainy days.

Table 7.4: The classification results (including vehicles that exhibit lane-changing behavior).

Lane 1 Lane 2 Lane 3 Lane 4 Learned results

21-23 33 1 0 0 34

25-27 3 43 0 0 46

34-36 0 0 48 0 48

38-40 0 0 0 27 27

Real-world 36 44 48 27 97.4%

Table 7.5: The classification results when only small and large vehicles are considered and the passing vehicles do not exhibit lane-changing behavior when passing through the detection area.

Sampling accuracy Type I Error α

Lane 1 100% 3.9%

Lane 2 100% 2.8%

Lane 3 100% 2.3%

Lane 4 100% 2.5%
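The 97.4% figure in Table 7.4 can be reproduced from the table entries as the share of vehicles on the diagonal. The sketch below does this arithmetic; the variable and function names are hypothetical.

```python
# Rows: learned bin ranges 21-23, 25-27, 34-36, 38-40.
# Columns: real-world lanes 1 to 4 (values copied from Table 7.4).
confusion = [
    [33, 1, 0, 0],
    [3, 43, 0, 0],
    [0, 0, 48, 0],
    [0, 0, 0, 27],
]


def diagonal_accuracy(matrix):
    """Fraction of all counts that lie on the main diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, correct / total


correct, total, accuracy = diagonal_accuracy(confusion)
print(correct, total, round(100 * accuracy, 1))
```

The four off-diagonal vehicles all sit between lanes 1 and 2, which matches the conflict area around bin 24.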

7.1.3 Motorcycles and Lane Change

Figures 7.6 and 7.7 show two scenarios selected from the real-world data. Both scenarios exhibit information that is adverse to identifying traffic lane boundaries. The first case, shown in Figure 7.6, indicates that a motorcycle is close to the dividing line when passing through the detection area. Since motorcycles may appear at any position on the road, motorcycle information cannot aid in identifying lane positions. The second case, shown in Figure 7.7, indicates that the passing vehicle exhibits lane-changing behavior when passing through the detection area.

Both cases generate the same span information, which ranges from 23 to 25 and falls exactly in the conflict area. In other words, both scenarios cause difficulty in learning traffic lane positions. Hence, to achieve higher accuracy, the installation location of the road-side radar should avoid areas where vehicles usually exhibit lane-changing behavior, and the signals of motorcycles should be filtered out.

Figure 7.6: A motorcycle scenario and its histogram of peak count. This case shows that the track of a motorcycle is not suitable for describing the lane boundary information.

Table 7.6: Lane range

7.2 Field Test of Vehicle Classifier

Since lanes 2 and 3 have more large vehicles than lanes 1 and 4, only lanes 2 and 3 are selected as targets for the vehicle classification learning. In total, 43 and 48 vehicles are obtained in lanes 2 and 3, respectively, by using the detection algorithm, provided that the passing vehicles do not exhibit lane-changing behavior when passing through the detection area and that the reflected signals of motorcycles are filtered out. If only small and large vehicles are considered, the accuracy of vehicles appearing in the corresponding lanes exceeds 95%.

Figure 7.7: A lane change scenario of a small vehicle and its histogram of peak count. It indicates that the lane-changing behavior of a vehicle is also not suitable for describing the lane boundary information.

Figure 7.8 shows the vehicle feature scatter diagrams for lanes 2 and 3. To distinguish the features of the different vehicle types, the large vehicles are marked as black squares, and the small vehicles are represented by dark gray rhombuses. Each vehicle sample is described by two features, the average energy maximum and the average energy variance, shown on the x-axis and y-axis of Figure 7.8, respectively. Notably, the scales of the axes differ so that all of the vehicle samples can be drawn within a fixed screen size.

Table 7.7 lists the estimated GMM parameters for lane 2 solved by the EM algorithm, and the results are plotted geometrically in Figure 7.9. The large vehicles are marked as dark squares, and the small vehicles as gray squares. An ellipse is plotted for each additional standard deviation. The final converged value of the maximum likelihood is -10.17, and the learned proportion of large vehicles is around 0.28, while the real proportion of large vehicles is 0.16. The accuracy for lane 2 is (31+7) / 43 ≈ 88.37%, as listed in Table 7.8.

Figure 7.8: (a) shows the scatter diagram in lane 2. (b) shows the scatter diagram in lane 3.

Table 7.7: GMM parameters for lane 2. The component with the smaller mean represents the small vehicles, and the component with the larger mean represents the large vehicles.

Component 1 (Small vehicles) Component 2 (Large vehicles)

Weight 0.72 0.28

Mean (137.13, 60.11) (277.08, 126.72)

Variance(x) 41.16 56.54

Variance(y) 17.99 28.89

Covariance(x, y) 25.05 33.04

Table 7.8: Comparison of the learned results and the real-world data for lane 2. The diagonal counts are vehicles that are classified correctly, and the non-diagonal counts are erroneous results.

Small vehicles Large vehicles Learned results

Component 1 31 0 31

Component 2 5 7 12

In real-world 36 7 Accuracy : 88.37%
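To illustrate how learned parameters such as those in Table 7.7 could drive a real-time classifier, the sketch below evaluates the posterior of equation (6.1.5) for a feature sample and picks the more probable component. It takes the table entries at face value as variances and a covariance, and the maximum-posterior decision rule, the function names and the test points are assumptions rather than details confirmed by the text.

```python
import math

# Lane 2 parameters copied from Table 7.7 (interpreted as variances/covariance).
COMPONENTS = {
    "small": {"weight": 0.72, "mean": (137.13, 60.11),
              "var_x": 41.16, "var_y": 17.99, "cov_xy": 25.05},
    "large": {"weight": 0.28, "mean": (277.08, 126.72),
              "var_x": 56.54, "var_y": 28.89, "cov_xy": 33.04},
}


def gaussian_2d(x, mean, var_x, var_y, cov_xy):
    """Bivariate Gaussian density, Eq. (6.1.4), for a 2x2 covariance matrix."""
    det = var_x * var_y - cov_xy ** 2
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # (x - mu)^T Sigma^{-1} (x - mu) using the closed-form 2x2 inverse.
    quad = (var_y * dx * dx - 2 * cov_xy * dx * dy + var_x * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def posteriors(x):
    """Eq. (6.1.5): posterior probability of each component given sample x."""
    scores = {}
    for name, c in COMPONENTS.items():
        density = gaussian_2d(x, c["mean"], c["var_x"], c["var_y"], c["cov_xy"])
        scores[name] = c["weight"] * density
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("sample is too far from both components")
    return {name: s / total for name, s in scores.items()}


def classify(x):
    """Assumed decision rule: choose the component with the larger posterior."""
    post = posteriors(x)
    return max(post, key=post.get)


print(classify((137.13, 60.11)), classify((277.08, 126.72)))
```

Samples that fall between the two clusters receive intermediate posteriors, which is where misclassifications such as the five small vehicles in Table 7.8 would be expected to arise.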
