Time-Frequency-Varying Standard Deviation Using 2D Interpolation

When analyzing a nonlinear FM signal or a multicomponent signal, a frequency-varying window width is preferred because it achieves higher energy concentration than a time-varying window width. Therefore, the relationship between the optimal standard deviation and the chirp rate is examined from another point of view. A Gaussian kernel w(t) with standard deviation σ and its Fourier transform W(f) are given by

$$w(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{t^2}{2\sigma^2}}, \qquad W(f) = \sqrt{2\pi}\,\sigma\, e^{-2\pi^2\sigma^2 f^2}. \tag{1.28}$$

The temporal and spectral spreads of the kernel function are respectively defined as

$$\delta_t^2 = \frac{w_2 - w_1^2}{w_0} = \frac{\sigma^2}{2}, \quad \text{where } w_i = \int_{-\infty}^{\infty} t^i\,|w(t)|^2\,dt, \tag{1.29}$$

$$\delta_f^2 = \frac{W_2 - W_1^2}{W_0} = \frac{1}{8\pi^2\sigma^2}, \quad \text{where } W_i = \int_{-\infty}^{\infty} f^i\,|W(f)|^2\,df. \tag{1.30}$$

The spreads are sometimes indicated with the Heisenberg box [9]. In the TF plane, the Gaussian kernel can be regarded as a two-dimensional (2D) mask, i.e., a box with time spread $\delta_t$ and frequency spread $\delta_f$. If the FWHM in (1.9) is employed, the Gaussian mask has width $2\sqrt{2\ln 2}\,\delta_t$ and height $2\sqrt{2\ln 2}\,\delta_f$, and the height-to-width ratio γ is given by

$$\gamma = \frac{2\sqrt{2\ln 2}\,\delta_f}{2\sqrt{2\ln 2}\,\delta_t} = \frac{\delta_f}{\delta_t} = \frac{1}{2\pi\sigma^2}. \tag{1.31}$$

This equation implies that the standard deviation σ of the Gaussian kernel can be determined by the height-to-width ratio γ of the 2D Gaussian mask. The TFR can be regarded as the convolution of the ideal TFR with the 2D Gaussian mask. Therefore, the problem is how to tune the shape of the mask (controlled by γ) at every TF point so that the energy of the TFR is as concentrated on its ridges as possible.
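The closed forms in (1.29)-(1.31) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `gaussian_spreads`, the grid size, and the integration span are illustrative assumptions, and the constant prefactors of w(t) and W(f) are omitted because they cancel in the normalized moments.

```python
import numpy as np

def gaussian_spreads(sigma, half_span=12.0, n=200001):
    """Numerically evaluate the spreads of (1.29)-(1.30) and the ratio of (1.31).

    half_span and n are illustrative grid parameters (assumptions).
    """
    # Grids wide enough that |w|^2 and |W|^2 have decayed to ~0 at the edges.
    t = np.linspace(-half_span * sigma, half_span * sigma, n)
    f = np.linspace(-half_span / sigma, half_span / sigma, n)

    # Kernel and spectrum shapes; constant prefactors cancel in the moments below.
    w = np.exp(-t**2 / (2.0 * sigma**2))
    W = np.exp(-2.0 * np.pi**2 * sigma**2 * f**2)

    def spread_sq(x, g):
        # Normalized second central moment of |g|^2 on a uniform grid
        # (the grid step cancels in the ratios, so plain sums suffice).
        p = np.abs(g) ** 2
        m0, m1, m2 = (np.sum(x**i * p) for i in range(3))
        return m2 / m0 - (m1 / m0) ** 2

    delta_t_sq = spread_sq(t, w)
    delta_f_sq = spread_sq(f, W)
    gamma = np.sqrt(delta_f_sq / delta_t_sq)
    return delta_t_sq, delta_f_sq, gamma

sigma = 0.3  # hypothetical standard deviation
delta_t_sq, delta_f_sq, gamma = gaussian_spreads(sigma)
print(delta_t_sq, sigma**2 / 2)                    # compare with (1.29)
print(delta_f_sq, 1 / (8 * np.pi**2 * sigma**2))   # compare with (1.30)
print(gamma, 1 / (2 * np.pi * sigma**2))           # compare with (1.31)
```

Because the first moment of a symmetric kernel vanishes, the normalized central moment used here coincides with the expression in (1.29) and (1.30).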

Figure 1.3: The ideal TFR (solid straight lines) of a linear FM signal with chirp rate a and three uniform TF masks (dashed rectangles) with different height-to-width ratios γ: (a) mask1 (γ > |a|), (b) mask2 (γ = |a|), and (c) mask3 (γ < |a|). The gray block in each sub-figure represents the region having the highest envelope of the convolution of the ideal TFR with the mask. This region coincides with the ideal TFR when mask2 (γ = |a|) is used.

To convey the idea behind the answer, the problem is first simplified by assuming that the 2D mask is “uniform”. For a discrete signal consisting of only one linear FM component,

$$x[m] = \exp\!\left(j2\pi\left(\frac{a\,(m\Delta_t)^2}{2} + b\,m\Delta_t\right)\right), \tag{1.32}$$

the exact chirp rate is a constant, i.e., $f'_{\text{inst}}[m] = a$. The ideal TFR of the signal is shown in Fig. 1.3 (solid straight lines). To examine the energy concentration of the convolution of the ideal TFR with the 2D uniform mask, three kinds of masks with height-to-width ratios γ > |a|, γ = |a| and γ < |a| are utilized, as depicted in Figs. 1.3(a), 1.3(b) and 1.3(c) (dashed rectangles), respectively. The gray block in each sub-figure of Fig. 1.3 represents the region of the TF points having the highest envelope of the convolution. This region coincides exactly with the ideal TFR when γ = |a|. Because the signal is a linear FM signal, the distributions of the convolution along the frequency axis are similar at all time instants. Therefore, Fig. 1.4 shows only the normalized envelopes of the convolutions at t = 80 s. The envelope is nonzero between 43 Hz and 128 Hz for all three masks; however, it is the most concentrated when the mask with γ = |a| is utilized.
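The behavior of the envelope at a fixed time instant can be reproduced with a short sketch. It is illustrative only and does not use the parameters behind Figs. 1.3 and 1.4: the chirp rate, the mask sizes, and the function names are hypothetical, a > 0 is assumed, and the three masks are chosen to give the same frequency support so that only the shape of the envelope differs. For a uniform rectangular mask, the envelope at a given frequency offset is simply the duration for which the ridge lies inside the mask.

```python
import numpy as np

def envelope_at_fixed_time(f_offsets, a, width, height):
    """Duration (s) for which the ridge f = a * tau lies inside a rectangular mask
    of the given width (s) and height (Hz) centered at (tau = 0, f = f_offset).
    Assumes a > 0."""
    half_w, half_h = width / 2.0, height / 2.0
    # The ridge is inside the mask's frequency band for tau in [lo_band, hi_band].
    lo_band = (f_offsets - half_h) / a
    hi_band = (f_offsets + half_h) / a
    # Intersect with the mask's time extent [-half_w, half_w].
    lo = np.maximum(lo_band, -half_w)
    hi = np.minimum(hi_band, half_w)
    return np.clip(hi - lo, 0.0, None)

def plateau_width(f_offsets, env):
    """Width (Hz) of the frequency interval on which the envelope attains its maximum."""
    df = f_offsets[1] - f_offsets[0]
    at_max = np.isclose(env, env.max(), rtol=1e-9, atol=0.0)
    return np.count_nonzero(at_max) * df

a = 1.0                                      # hypothetical chirp rate (Hz/s)
f_offsets = np.linspace(-15.0, 15.0, 3001)   # frequency offset from the ridge (Hz)
masks = {                                    # hypothetical (width in s, height in Hz)
    "mask1 (gamma > |a|)": (4.0, 16.0),
    "mask2 (gamma = |a|)": (10.0, 10.0),
    "mask3 (gamma < |a|)": (16.0, 4.0),
}

plateaus = {}
for name, (width, height) in masks.items():
    env = envelope_at_fixed_time(f_offsets, a, width, height)
    plateaus[name] = plateau_width(f_offsets, env / env.max())
    print(name, "plateau width (Hz):", round(plateaus[name], 2))
```

Under these assumptions the envelope is a trapezoid whose flat top has width |height - a * width|. The flat top shrinks to a single grid point when γ = |a|, which corresponds to the gray region collapsing onto the ideal TFR.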

Figure 1.4: The normalized envelopes of the convolutions of the ideal TFR with the three kinds of TF masks shown in Fig. 1.3 at t = 80 s.

For a 2D Gaussian mask, which is nonuniform, the height-to-width ratio γ = |a| is also the optimal choice, but the difference in concentration among the three cases would not be as significant as that shown in Fig. 1.4. According to (1.31), the optimal standard deviation $\sigma_{\text{opt}}$ is

$$\sigma_{\text{opt}} = \frac{1}{\sqrt{2\pi|a|}}. \tag{1.33}$$

Since this result is equivalent to that in (1.22) and (1.25), it is feasible to determine the optimal standard deviation from the shape of the 2D mask. Note that the chirp rate may be 0 or ±∞ in some cases, and thus an upper bound $\sigma_{\max}$ and a lower bound $\sigma_{\min}$ of the standard deviation should be defined. For instance, $2\sqrt{2\ln 2}\,\sigma_{\max}$ can be set equal to the signal length.
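A minimal sketch of this bounded choice follows. Setting γ = |a| in (1.31) gives σ = 1/√(2π|a|), which is then clipped to the interval between the two bounds. The function name `optimal_sigma`, the signal length, and the value of the lower bound are hypothetical; only the rule for the upper bound (FWHM equal to the signal length) comes from the text.

```python
import numpy as np

# FWHM of a Gaussian in units of its standard deviation.
FWHM_FACTOR = 2.0 * np.sqrt(2.0 * np.log(2.0))

def optimal_sigma(chirp_rate, sigma_min, sigma_max):
    """Standard deviation from gamma = |a| in (1.31), clipped to [sigma_min, sigma_max]."""
    a = np.abs(np.asarray(chirp_rate, dtype=float))
    with np.errstate(divide="ignore"):
        # a = 0 gives +inf and a = inf gives 0; both are handled by the clip below.
        sigma = 1.0 / np.sqrt(2.0 * np.pi * a)
    return np.clip(sigma, sigma_min, sigma_max)

signal_length = 10.0                         # hypothetical signal length (s)
sigma_max = signal_length / FWHM_FACTOR      # FWHM of the widest window = signal length
sigma_min = 0.01                             # hypothetical lower bound (assumption)

rates = np.array([0.0, 1.0 / (2.0 * np.pi), -1.0 / (2.0 * np.pi), np.inf])
print(optimal_sigma(rates, sigma_min, sigma_max))
```

A chirp rate of zero maps to the upper bound and an infinite chirp rate maps to the lower bound, so the window width stays finite and nonzero in both limiting cases.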

Consider the more complicated case in which the signal under analysis consists of multiple components or a nonlinear FM component. Because the chirp rate is no longer a constant, $f'_{\text{inst}}[m, n]$ is defined as follows:

• If $(m\Delta_t, n\Delta_f)$ is on the ridge (called an on-ridge point), $f'_{\text{inst}}[m, n]$ is defined as the chirp rate of the component occurring at this TF point.

• If $(m\Delta_t, n\Delta_f)$ is off the ridge (called an off-ridge point), $f'_{\text{inst}}[m, n]$ is undefined.

The ideal TFR of a monocomponent nonlinear FM signal is depicted in Fig. 1.5. The points q2, q4 and q5 are on-ridge points, while q1 and q3 are off-ridge points. According to Cohen’s derivation in (1.21), the optimal standard deviation of an on-ridge point with chirp rate $f'_{\text{inst}}[m, n]$ can be approximated by

$$\sigma_{\text{opt}}^2[m, n] \approx \frac{1}{2\pi} \cdot \frac{1}{\left|f'_{\text{inst}}[m, n]\right|}. \tag{1.34}$$

The problem is how to determine the optimal standard deviations for the off-ridge points.

Figure 1.5: The mask dilation strategy for a nonlinear FM signal. The solid line is the ideal TFR. For the on-ridge points (q2, q4 and q5), the height-to-width ratios γ of the masks are equal to the absolute values of the chirp rates. For the off-ridge points (q1 and q3), the mask at q1 should be the same as that at q2; however, γ of the mask at q3 should lie between those at q2, q4 and q5 to avoid overlapping with the ideal TFR.

Observe the shapes of the Gaussian masks at the on-ridge points q2, q4 and q5, as shown in Fig. 1.5. To achieve high energy concentration, the height-to-width ratio of the mask at q1 should be the same as that at q2; however, the height-to-width ratio of the mask at q3 should lie between those at q2, q4 and q5 to avoid overlapping with the ideal TFR.

This implies that at time instant m₀∆_t in Fig. 1.5, applying a single value σ[m₀] to the entire frequency band is worse than using σ[m₀, n]. Similarly, at frequency n₀∆_f, applying a single value σ[n₀] to the entire time interval would be worse than using σ[m, n₀].

To keep the complexity low, 2D interpolation is employed to obtain the values of γ (i.e., the values of $f'_{\text{inst}}[m, n]$) for all the off-ridge points. Once $f'_{\text{inst}}[m, n]$ has been determined at every TF point, the approximate optimal standard deviations for all the TF points can be obtained from (1.34).

In our simulations, 2D triangle-based linear interpolation of $f'_{\text{inst}}[m, n]$ is utilized. Although this interpolation method may not be optimal, it achieves higher energy concentration than the alternatives among several well-known interpolation methods applied to $f'_{\text{inst}}[m, n]$, $\tan^{-1}(f'_{\text{inst}}[m, n])$, or $1/(2\pi f'_{\text{inst}}[m, n])$, namely nearest-neighbor interpolation, triangle-based linear interpolation, triangle-based cubic interpolation, and the MATLAB 4 griddata method. There is always a tradeoff between energy concentration and complexity. It is therefore impractical to design an ASTFT that requires an enormous amount of computation, even if it has the highest energy concentration. Although the proposed technique is not the best in terms of energy concentration, it has a great advantage in terms of low complexity.
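The interpolation step can be sketched as follows, with several assumptions stated up front. SciPy's `griddata` with `method="linear"` (piecewise-linear interpolation over a Delaunay triangulation) stands in for the triangle-based linear interpolation mentioned above. The on-ridge chirp rates are assumed to be already known. The ridge, the TF grid, the bounds, and the function name `interpolate_sigma` are hypothetical. Filling the points outside the convex hull of the ridge with their nearest-neighbor values is a choice made here for illustration and is not taken from the text.

```python
import numpy as np
from scipy.interpolate import griddata

def interpolate_sigma(ridge_t, ridge_f, ridge_chirp, t_axis, f_axis,
                      sigma_min, sigma_max):
    """Interpolate on-ridge chirp rates over the TF grid, then apply (1.34) with clipping."""
    T, F = np.meshgrid(t_axis, f_axis, indexing="ij")
    points = np.column_stack([ridge_t, ridge_f])
    values = np.asarray(ridge_chirp, dtype=float)

    # Triangle-based (Delaunay) linear interpolation; rescale=True normalizes the
    # time and frequency axes, which have different units, before triangulating.
    chirp = griddata(points, values, (T, F), method="linear", rescale=True)

    # Assumption: grid points outside the convex hull of the ridge (returned as NaN)
    # take the chirp rate of the nearest on-ridge point.
    nearest = griddata(points, values, (T, F), method="nearest", rescale=True)
    chirp = np.where(np.isnan(chirp), nearest, chirp)

    with np.errstate(divide="ignore"):
        sigma = 1.0 / np.sqrt(2.0 * np.pi * np.abs(chirp))   # from (1.34)
    return chirp, np.clip(sigma, sigma_min, sigma_max)

# Hypothetical nonlinear FM ridge f(t) = 10 + 0.5 t^2, whose chirp rate is f'(t) = t.
ridge_t = np.linspace(0.0, 10.0, 21)
ridge_f = 10.0 + 0.5 * ridge_t**2
ridge_chirp = ridge_t

t_axis = np.linspace(0.0, 10.0, 21)   # m * Delta_t
f_axis = np.linspace(0.0, 70.0, 36)   # n * Delta_f
chirp_map, sigma_map = interpolate_sigma(ridge_t, ridge_f, ridge_chirp,
                                         t_axis, f_axis,
                                         sigma_min=0.05, sigma_max=2.0)
print(chirp_map.shape, sigma_map.min(), sigma_map.max())
```

The Delaunay triangulation requires on-ridge points that are not all collinear, so this sketch does not apply to a single linear FM ridge, for which the chirp rate is constant and no interpolation is needed.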
