Forecasting Wavelet Transformed Time Series with Attentive Neural Networks
ICDM 2018
Yi Zhao1, Yanyan Shen*1, Yanmin Zhu1, Junjie Yao2
1Shanghai Jiao Tong University
2East China Normal University
Outline
Motivation
Preliminaries
Model
Experiments
Conclusion
Motivation
Forecasting complex time series demands both time-domain and frequency-domain information.
e.g., stock prices, web traffic, etc.
Various methods exist to extract local time-frequency features, which are important for predicting future values.
Fourier Transform
Short-time Fourier Transform
Wavelet Transform
Key idea: use the varying global trend to identify the most salient parts of the local time-frequency information, so as to better predict future values.
Preliminaries
Problem Statement
Given a time series $x_1, x_2, \ldots, x_T$, predict the future value $x_{T+1}$ via a function $f$: $\hat{x}_{T+1} = f(x_1, \ldots, x_T)$.
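This forecasting setup amounts to supervised learning over sliding windows; a minimal numpy sketch (the window length and horizon below are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def make_windows(series, window=20, horizon=1):
    """Slice a 1-D series into (input window, target) pairs:
    X[i] = series[i : i+window], y[i] = series[i+window+horizon-1]."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i : i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

series = np.arange(10.0)                 # toy series 0..9
X, y = make_windows(series, window=3, horizon=1)
print(X.shape, y.shape)                  # (7, 3) (7,)
print(y[0])                              # 3.0, the value following the first window
```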
Wavelets
Given a basic wavelet function $h$, we can get the family of wavelets $h_{a,b}$ by scaling and translation: $h_{a,b}(t) = \frac{1}{\sqrt{a}}\, h\!\left(\frac{t-b}{a}\right)$, where $a > 0$ is the scale and $b$ is the translation.
Continuous Wavelet Transform (CWT)
The continuous wavelet transform measures the "similarity" between the signal $x(t)$ and the basis function $h_{a,b}$: $W(a, b) = \frac{1}{\sqrt{a}} \int x(t)\, h^{*}\!\left(\frac{t - b}{a}\right) dt$.
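The CWT integral can be discretized directly; a minimal numpy sketch using a real-valued Morlet mother wavelet (the signal, scales, and sampling step are illustrative assumptions):

```python
import numpy as np

def morlet(t, w0=5.0):
    """Real-valued Morlet mother wavelet: psi(t) = exp(-t^2/2) * cos(w0 * t)."""
    return np.exp(-0.5 * t**2) * np.cos(w0 * t)

def cwt(x, scales, dt=1.0):
    """Discretized CWT: W[a, b] ~ (1/sqrt(a)) * sum_t x(t) * psi((t - b)/a) * dt.
    With a real wavelet, the complex conjugate in the definition is a no-op."""
    t = np.arange(len(x)) * dt
    W = np.empty((len(scales), len(x)))
    for i, a in enumerate(scales):
        for j, b in enumerate(t):
            W[i, j] = (x * morlet((t - b) / a)).sum() * dt / np.sqrt(a)
    return W

# toy signal: sum of a 1 Hz and a 3 Hz sinusoid, sampled at dt = 0.05 s
t = np.arange(0, 10, 0.05)
x = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 3.0 * t)
W = cwt(x, scales=np.arange(1, 16), dt=0.05)
print(W.shape)  # (15, 200): one row per scale, one column per translation
```

The double loop is quadratic in the series length; practical implementations use FFT-based convolution instead.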
Model Overview
[Figure: model pipeline. 1. input time series; 2. scalogram via wavelet transform; 3. CNN feature extraction; 4. attention module $f_{att}(\cdot\,; W)$; 5. fusion & prediction of $\hat{x}_{T+1}$, with an LSTM capturing the global trend.]
Preprocessing
Given an input time series $x_1, \ldots, x_T$, we denote by $W = [W(a, b)]$ the matrix of wavelet transform coefficients.
The scalogram is defined as the squared magnitude of the coefficients: $S(a, b) = |W(a, b)|^2$.
Source: Wavelet Tutorial by Robi Polikar, http://users.rowan.edu/~polikar/WTpart3.html
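The scalogram is commonly computed as the squared magnitude of the CWT coefficients; a minimal sketch, assuming a precomputed coefficient matrix, with min-max normalization added so the result can be fed to a CNN like an image (the normalization is an assumption, not stated in the slides):

```python
import numpy as np

def scalogram(W, eps=1e-8):
    """Scalogram S = |W|^2, min-max normalized to [0, 1] for use as CNN input."""
    S = np.abs(W) ** 2
    return (S - S.min()) / (S.max() - S.min() + eps)

W = np.array([[1.0, -2.0],
              [0.5,  3.0]])        # toy coefficient matrix (scales x translations)
S = scalogram(W)
print(S.min(), S.max() <= 1.0)     # 0.0 True
```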
Model
[Figure: model architecture. The scalogram columns $D_1, \ldots, D_T$ pass through a VGG-style CNN to produce local time-frequency features; an LSTM over the input series yields hidden states $h_1, \ldots, h_T$; the attention module weights the CNN output features using the last hidden state $h_T$; fusion & prediction outputs $\hat{x}_{T+1}$.]
Model
CNN: extract local time-frequency features.
Feed the scalogram to a stack of convolutional layers to obtain per-step feature vectors $D_1, \ldots, D_T$.
LSTM: learn the global long-term trend and obtain the hidden state $h_T$ of the last step.
Attention module: discriminate the importance of local features dynamically.
Given the local time-frequency features $D_1, \ldots, D_T$ and the last hidden state $h_T$:
Attention score: $e_t = f_{att}(D_t, h_T; W)$, normalized via softmax: $\alpha_t = \exp(e_t) \big/ \sum_{t'} \exp(e_{t'})$.
Weighted sum of local time-frequency features: $c = \sum_{t} \alpha_t D_t$.
Fusion & Prediction: combine the local context $c$ and the global state $h_T$ for prediction.
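The attention-and-fusion step can be sketched in numpy. The bilinear score function and all dimensions below are assumptions for illustration; the paper's exact $f_{att}$ may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_local, d_hidden = 20, 32, 64          # illustrative sizes

D = rng.normal(size=(T, d_local))          # local time-frequency (CNN) features, one per step
h = rng.normal(size=d_hidden)              # last LSTM hidden state (global trend)
W_att = rng.normal(size=(d_local, d_hidden)) * 0.1  # hypothetical bilinear score weights

# attention scores e_t = D_t^T W h, normalized with a numerically stable softmax
e = D @ W_att @ h                          # shape (T,)
alpha = np.exp(e - e.max())
alpha /= alpha.sum()

c = alpha @ D                              # weighted sum of local features, shape (d_local,)
z = np.concatenate([c, h])                 # fusion: local context + global state
print(z.shape)                             # (96,), fed to the final prediction layer
```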
Objective Function
Squared loss over the training samples: $L = \sum_{i} \big( \hat{x}^{(i)}_{T+1} - x^{(i)}_{T+1} \big)^2$.
Datasets
Stock opening prices
Collected from Yahoo! Finance.
Daily opening prices of 50 stocks across 10 sectors from 2007 to 2016.
Each stock has 2518 daily opening prices. Prices from 2007 to 2014 are used as training data; those in 2015 and 2016 are used for validation and testing, respectively.
Power consumption
Electric power consumption in one household over 4 years.
Sampled at a one-minute rate.
475,023 data points in year 2010.
Main Results
Metric:
Mean Squared Error: $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2$.
Baselines
Naïve: take the last value in the series as the predicted value.
Ensemble of LSTM & CNN: feed the concatenation of the VGGnet features and the last hidden state of the LSTM directly into the fusion and prediction module.
Case Study
Illustration of attention mechanism
Given an input of 20 stock prices, we show the scalogram and the attention weights.
The model attends to local features that are similar to the global trend, which helps in predicting the future value.
Conclusion
Wavelet transform is able to explicitly disclose the latent components at different frequencies in a complex time series.
We develop a novel attention-based neural network that leverages a CNN to extract local time-frequency features and simultaneously applies an LSTM to capture the long-term global trend.
Experimental results on two real-life datasets verify the usefulness of the time-frequency information in wavelet-transformed time series and the effectiveness of our method in terms of prediction accuracy.