Decomposition Stage - Singular Spectrum Analysis (SSA)

2 Signal Processing and Feature Extraction Techniques

2.1 Singular Spectrum Analysis (SSA)

2.1.1 Decomposition Stage

Consider a continuous, nonzero, real-valued time series. It can always be discretized, whether it is a single-channel series, 𝑥(𝑡), or a multi-channel series, 𝐱(𝑡). The discrete time series, 𝐱_𝑘, has a length of 𝑁 (𝑁 > 2) and a sampling interval of ∆𝑡, so that 𝐱_𝑘 = 𝐱(𝑘∆𝑡) for 1 ≤ 𝑘 ≤ 𝑁. In the multi-channel case, it is written as

$$\mathbf{x}_k = \begin{bmatrix} x_{k,1} & x_{k,2} & x_{k,3} & \cdots & x_{k,n} \end{bmatrix}^T \tag{2.1}$$

where 𝑛 is the number of channels and the superscript 𝑇 denotes the transpose of a matrix. If 𝑛 equals 1, multi-channel SSA (M-SSA) reduces to SSA and 𝐱_𝑘 becomes 𝑥_𝑘. Moreover, although a constant sampling interval is normally assumed, the subscripts 1, 2, 3, …, 𝑁 can be interpreted not only as discrete time instants but also as labels of any other linearly ordered structure. The decomposition stage described below consists of two steps: embedding and singular value decomposition (SVD).

First Step: Embedding

Embedding is a standard procedure in time series analysis; once the embedding is performed, the subsequent analysis can vary according to the purpose of the investigation. SSA starts by embedding the time series into a trajectory matrix. By specifying an integer, 𝐿, called the window length, the time series is mapped into 𝐾 = 𝑁 − 𝐿 + 1 lagged vectors with the dimension 𝐿, which form the columns of the trajectory matrix, 𝐗, defined in Equation (2.2). If the dimension of the lagged vectors needs to be emphasized, they are called 𝐿-lagged vectors (or, simply, lagged vectors) and the trajectory matrix is called the 𝐿-trajectory matrix.

doi:10.6342/NTU201901242

Both the rows and the columns of the trajectory matrix, 𝐗, are sub-series of the original time series, and Equation (2.2) defines a one-to-one correspondence between the trajectory matrix and the original time series. Provided the window length, 𝐿, is sufficiently large, each lagged vector can be considered as a separate series and used to investigate the dynamic characteristics of the time series. The simplest example is the well-known ‘moving average’ method, in which the averages of the lagged vectors are computed; much more sophisticated approaches also exist. In any case, the window length must be large enough for each lagged vector to incorporate the essential part of the dynamic characteristics. From another point of view, not only the row size but also the column size must be large enough to allow a clear separation in the subsequent SVD, where the singular vectors describe the detailed content of the original time series and the singular values carry the energy information corresponding to each frequency component (Bozzo et al., 2010).
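The moving-average remark above can be sketched in Python. This is only a minimal illustration for a single-channel series, assuming the usual convention that the lagged vectors form the columns of an 𝐿 × 𝐾 matrix with 𝐾 = 𝑁 − 𝐿 + 1; the helper name `trajectory_matrix` and the toy data are hypothetical and do not come from the original text.

```python
import numpy as np

def trajectory_matrix(x, L):
    """Return the L x K trajectory matrix of a 1-D series (hypothetical helper).

    Assumes K = N - L + 1 and that column j is the lagged vector
    (x_j, ..., x_{j+L-1}).
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    if not 1 < L < N:
        raise ValueError("window length L must satisfy 1 < L < N")
    K = N - L + 1
    return np.column_stack([x[j:j + L] for j in range(K)])

x = np.arange(1.0, 11.0)   # toy series 1, 2, ..., 10 (illustrative data)
L = 4
X = trajectory_matrix(x, L)

# Averaging each lagged vector (each column) gives the length-L moving average.
moving_average = X.mean(axis=0)

# Reference computed independently with a uniform convolution kernel.
reference = np.convolve(x, np.ones(L) / L, mode="valid")
```

Both `moving_average` and `reference` contain 𝐾 values, one per lagged vector, which is why the moving average can be viewed as the simplest use of the embedding.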

The trajectory matrix in Equation (2.2) possesses an obvious symmetry property: the transposed matrix, 𝐗^𝑇, is the trajectory matrix of the same time series with window length of 𝐾 rather than 𝐿.
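As a small worked illustration of this symmetry, consider a single-channel series with 𝑁 = 5 and 𝐿 = 2, so that 𝐾 = 𝑁 − 𝐿 + 1 = 4. The values of 𝑁 and 𝐿 are chosen only for illustration, and the relation 𝐾 = 𝑁 − 𝐿 + 1 is the usual SSA convention assumed here:

```latex
\mathbf{X} =
\begin{bmatrix}
x_1 & x_2 & x_3 & x_4 \\
x_2 & x_3 & x_4 & x_5
\end{bmatrix},
\qquad
\mathbf{X}^T =
\begin{bmatrix}
x_1 & x_2 \\
x_2 & x_3 \\
x_3 & x_4 \\
x_4 & x_5
\end{bmatrix}
```

The columns of 𝐗^𝑇 are the two 4-lagged vectors of the same series, so 𝐗^𝑇 is the trajectory matrix for a window length of 4.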

The 𝑖𝑗th element is 𝐱_𝑖𝑗 = 𝐱_{𝑖+𝑗−1}, which means that the trajectory matrix has equal elements on the ‘(positive sloping) skew-diagonals’ (or ‘anti-diagonals’), where (𝑖 + 𝑗) equals a constant. Thus, the trajectory matrix is a Hankel matrix (or catalecticant matrix). This useful characteristic is referred to as Hankel diagonals (skew-diagonals or anti-diagonals) and is used in the final step to convert matrices back into time series.
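The Hankel property can be checked numerically with a short Python sketch. It assumes a single-channel series and 𝐾 = 𝑁 − 𝐿 + 1; the series values and the helper name `antidiagonals` are illustrative only.

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0])   # illustrative series, N = 7
L = 3
K = x.size - L + 1                                   # assumed K = N - L + 1

# With 0-based indices, X[i, j] = x[i + j], i.e. x_{i+j-1} in 1-based notation.
X = np.array([[x[i + j] for j in range(K)] for i in range(L)])

def antidiagonals(M):
    """Return the anti-diagonals of M, ordered by increasing i + j."""
    rows, cols = M.shape
    flipped = M[::-1, :]   # flipping the rows turns anti-diagonals into diagonals
    return [flipped.diagonal(k) for k in range(-(rows - 1), cols)]

diagonals = antidiagonals(X)

# Hankel check: every anti-diagonal holds a single repeated value.
is_hankel = all(np.all(d == d[0]) for d in diagonals)

# Reading one value per anti-diagonal recovers the original series.
recovered = np.array([d[0] for d in diagonals])
```

There are 𝐿 + 𝐾 − 1 = 𝑁 anti-diagonals, one per sample, which is what makes the correspondence between the matrix and the series one-to-one.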

Second Step: Singular Value Decomposition

The second step is to apply the SVD to the trajectory matrix, 𝐗:

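$$\mathbf{X} = \mathbf{U}\boldsymbol{\sigma}\mathbf{V}^T = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_L \end{bmatrix} \boldsymbol{\sigma} \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_K \end{bmatrix}^T \tag{2.3}$$

where 𝛔 is the matrix of singular values, 𝜎_𝑖, arranged in descending order, the columns of 𝐔 are called the left-singular vectors, and the columns of 𝐕 are called the right-singular vectors of the trajectory matrix.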

The rank of the trajectory matrix, 𝐷, is given by:

$$D = \operatorname{rank}[\mathbf{X}] = \max\{i \text{ such that } \sigma_i > 0\} \tag{2.4}$$

In most applications, the trajectory matrix is of full rank, meaning that 𝐷 = min{𝐿, 𝐾}. The collection (𝜎_𝑖, 𝐮_𝑖, 𝐯_𝑖), where 𝑖 = 1, 2, …, 𝐷, is called the 𝑖th eigentriple, and the eigentriples form the basis of the decomposition.

By using SVD, the trajectory matrix can be written as a sum of 𝐷 elementary matrices, 𝐗_𝑖 = 𝜎_𝑖𝐮_𝑖𝐯_𝑖^𝑇:

$$\mathbf{X} = \sum_{i=1}^{D} \mathbf{X}_i = \sum_{i=1}^{D} \sigma_i \mathbf{u}_i \mathbf{v}_i^T \tag{2.5}$$
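The decomposition into eigentriples and elementary matrices can be sketched with NumPy. This is a minimal example under stated assumptions: a synthetic single-channel series (a noisy sinusoid invented for illustration), 𝐾 = 𝑁 − 𝐿 + 1, and a numerical tolerance for deciding which singular values count as positive.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(60)
# Synthetic test signal (illustrative only): sinusoid plus weak noise.
x = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)

L = 20
K = x.size - L + 1                                   # assumed K = N - L + 1
X = np.column_stack([x[j:j + L] for j in range(K)])  # L x K trajectory matrix

# Economy-size SVD: U is L x min(L, K), s is descending, Vt is min(L, K) x K.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Numerical rank: count singular values above a floating-point tolerance.
tol = s.max() * max(X.shape) * np.finfo(float).eps
D = int(np.sum(s > tol))

# Elementary matrices X_i = sigma_i * u_i * v_i^T, one per eigentriple.
elementary = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(D)]

# Their sum reproduces the trajectory matrix up to rounding error.
X_sum = sum(elementary)
```

Because the series contains noise, the matrix is of full rank here, so `D` equals min{𝐿, 𝐾}; for a noise-free sinusoid only two singular values would be significantly above zero.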

Under the assumption that the time series is stationary, the trajectory matrix in Equation (2.3) can be replaced by the lag-covariance matrix, 𝐂, and principal component analysis (PCA) can equivalently be performed instead of SVD for the efficient analysis of large-sized data. Two distinct methods are widely used to define the lag-covariance matrix, namely the BK approach (Broomhead and King, 1986) and the VG approach (Vautard and Ghil, 1989). An important observation is that the VG approach is equivalent to the BK approach applied after padding 𝐾 − 1 zeros before and after the original time series (Allen and Smith, 1996; Ghil and Taricco, 1997); hence, only the BK approach is introduced here:

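$$\mathbf{C} = \mathbf{X}^T\mathbf{X} = (\mathbf{U}\boldsymbol{\sigma}\mathbf{V}^T)^T(\mathbf{U}\boldsymbol{\sigma}\mathbf{V}^T) = \mathbf{V}\boldsymbol{\sigma}^2\mathbf{V}^T \tag{2.7}$$

where the singular vector matrix, 𝐕, is now equal to the eigenvector matrix and the singular value matrix is equal to the square root of the eigenvalue matrix. In this approach, the left-singular vectors can be derived as: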

$$\mathbf{u}_i = \frac{1}{\sigma_i}\mathbf{X}\mathbf{v}_i \tag{2.8}$$

and the elementary matrices in Equation (2.5) can be re-written as:

$$\mathbf{X} = \sum_{i=1}^{D}\mathbf{X}_i = \sum_{i=1}^{D}\mathbf{X}\mathbf{v}_i\mathbf{v}_i^T = \mathbf{X}\mathbf{v}_1\mathbf{v}_1^T + \mathbf{X}\mathbf{v}_2\mathbf{v}_2^T + \cdots + \mathbf{X}\mathbf{v}_D\mathbf{v}_D^T \tag{2.9}$$

By using these equations, SSA or M-SSA becomes much more efficient when dealing with large-sized data. The distribution of the singular values (or eigenvalues) in descending order is referred to as the singular spectrum and can be used to specify which principal components are to be included in the next step.
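The lag-covariance route of Equations (2.7) to (2.9) can be sketched as follows. This is a hedged illustration rather than the author's implementation: the data are synthetic, 𝐾 = 𝑁 − 𝐿 + 1 is assumed, 𝐂 is left unnormalized exactly as written in Equation (2.7), and the rank threshold `1e-6` is an arbitrary choice made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(80)
# Synthetic test signal (illustrative only).
x = np.cos(2 * np.pi * t / 16) + 0.2 * rng.standard_normal(t.size)

L = 50
K = x.size - L + 1                                   # assumed K = N - L + 1
X = np.column_stack([x[j:j + L] for j in range(K)])  # L x K trajectory matrix

# Eq. (2.7): K x K lag-covariance matrix (smaller than X X^T when K < L).
C = X.T @ X

# eigh returns eigenvalues in ascending order, so reorder to descending.
eigvals, V = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Singular values are the square roots of the eigenvalues of C.
sigma = np.sqrt(np.clip(eigvals, 0.0, None))
D = int(np.sum(sigma > sigma[0] * 1e-6))   # ad hoc numerical rank threshold

# Eq. (2.8): left-singular vectors u_i = X v_i / sigma_i, stored as columns.
U = (X @ V[:, :D]) / sigma[:D]

# Eq. (2.9): elementary matrices X_i = X v_i v_i^T.
elementary = [X @ np.outer(V[:, i], V[:, i]) for i in range(D)]
X_sum = sum(elementary)
```

In this example 𝐾 = 31 is smaller than 𝐿 = 50, so the eigenproblem is solved on a 31 × 31 matrix; the singular values and the reconstructed sum agree with a direct SVD of 𝐗 up to rounding error.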

The SVD in Equation (2.3) possesses a number of optimality properties. One of them is that, among all matrices of rank 𝑟, where 𝑟 < 𝐷, the matrix 𝐗̂ = ∑_{𝑖=1}^{𝑟} 𝐗_𝑖 provides the best approximation to the trajectory matrix, in the sense that the (Frobenius) norm of the error matrix is minimal (Golyandina and Zhigljavsky, 2013). Another optimality property relates to the directions determined by the eigenvectors, 𝐯₁, 𝐯₂, …, 𝐯_𝐷. Specifically, the first eigenvector, 𝐯₁, determines the direction onto which the variation of the projections of the lagged vectors is maximal (here the 𝐾-lagged vectors, i.e., the rows of 𝐗, since 𝐯_𝑖 has 𝐾 components). Every subsequent eigenvector determines a direction that is orthogonal to all previous directions and onto which the variation of the projections of the lagged vectors is again maximal. Therefore, it is natural to call the direction of the 𝑖th eigenvector, 𝐯_𝑖, the 𝑖th principal direction. Note that the elementary matrices, 𝐗_𝑖, are built up from the projections of the lagged vectors onto the 𝑖th principal directions.
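Both optimality properties can be illustrated numerically. The sketch below assumes a synthetic single-channel series, 𝐾 = 𝑁 − 𝐿 + 1, and an arbitrary truncation rank `r = 4`; it takes the vectors being projected to be the rows of 𝐗, since each 𝐯_𝑖 has 𝐾 components. The random search is only a sanity check, not a proof of optimality.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(100)
# Synthetic test signal (illustrative only): two sinusoids plus weak noise.
x = (np.sin(2 * np.pi * t / 10)
     + 0.5 * np.sin(2 * np.pi * t / 25)
     + 0.1 * rng.standard_normal(t.size))

L = 30
K = x.size - L + 1                                   # assumed K = N - L + 1
X = np.column_stack([x[j:j + L] for j in range(K)])  # L x K trajectory matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Rank-r approximation: sum of the first r elementary matrices.
r = 4
X_hat = (U[:, :r] * s[:r]) @ Vt[:r]

# Its Frobenius error equals the root sum of squares of the discarded
# singular values.
error = np.linalg.norm(X - X_hat, "fro")
expected_error = np.sqrt(np.sum(s[r:] ** 2))

# First principal direction: ||X v||^2 over unit vectors v peaks at v_1,
# where it equals sigma_1^2.
v1 = Vt[0]
best = np.linalg.norm(X @ v1) ** 2

# Compare against random unit directions in R^K.
trials = rng.standard_normal((1000, K))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
trial_energy = np.linalg.norm(X @ trials.T, axis=0) ** 2
```

None of the random directions yields a larger projection energy than 𝐯₁, and the truncation error depends only on the singular values that were left out, which is why the singular spectrum is a natural guide for choosing 𝑟.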

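In document: Signal Decomposition and Feature Extraction Techniques: Applied to Vibration Reduction and Isolation of Sensitive Equipment (pages 26-29)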