5.2. Typhoon Intensity Class Estimation

5.2.1. Different models, optical flow

In this section, results obtained for the TC/xTC experiment are partially reused. That experiment showed that, among the candidates, the ‘model3’ and ‘model4’ network architectures performed best. Binary classification is a simpler problem than multiclass classification; however, the visual features learned by the models are likely to be similar, because the problems themselves are similar. It is therefore assumed that models which performed better on one problem will also perform relatively well on the other, so there is no need to repeat the experiment with network architectures that displayed inferior performance. The main difficulty in choosing one network architecture over another is finding a balance in the number of trainable parameters.

Networks with a small number of trainable parameters do not have enough capacity to solve a particular task, whereas networks with too many parameters tend to overfit. As a performance reference, the state-of-the-art CNN for image classification, “Xception” by Google, was used [52]. A model from the work by Lucas [48] was also chosen as a baseline. All networks were trained from scratch, using ensemble learning and 10-fold cross-validation, with the same data split that was used in the TC/xTC experiment. The experiment used four models and two types of data (images, images + optical flow). The performance of the different models on the different types of data is shown in Fig. 5.2.1.1.
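The combination of 10-fold cross-validation and ensemble learning can be sketched as follows. This is a minimal illustration, assuming one ensemble member per fold and probability averaging as the combination rule; the text specifies neither, and the function names are hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Split shuffled sample indices into k disjoint validation folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def ensemble_predict(member_probs):
    """Average class probabilities over ensemble members, then take argmax.

    member_probs has shape (n_members, n_samples, n_classes).
    Averaging is an assumed combination rule, not one stated in the text.
    """
    mean_probs = np.mean(member_probs, axis=0)
    return np.argmax(mean_probs, axis=1)

def metric_spread(member_probs, labels):
    """Mean and variance of per-member accuracy across the ensemble."""
    member_preds = np.argmax(member_probs, axis=2)
    acc = (member_preds == labels).mean(axis=1)
    return acc.mean(), acc.var()

# Each fold serves once as validation data; the rest trains one member.
folds = k_fold_indices(n_samples=100, k=10)
```

The variance returned by `metric_spread` corresponds to the spread among models within an ensemble that is used when comparing the two types of input data.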

Fig. 5.2.1.1. Evaluation metrics for different models. Data used: images (top), images + optical flow (bottom).

Regarding model performance, as expected, the ‘model3’ and ‘model4’ network architectures outperformed ‘model_d’, which was again prone to overfitting because of its overcomplicated design and excessive number of trainable parameters. The state-of-the-art Xception by Google performed slightly better; however, due to the nature of the Keras framework, it could not be combined with the LSTM part into a single model and trained jointly, and it was therefore discarded for further experiments. It could, of course, be used as an independent module alongside the LSTMs, but that might compromise the precision of the experiment.

The results of adding optical flow to the input data are difficult to interpret at first glance. The plots show that the mean values of the evaluation metrics did not improve, and that the variance of the evaluation metrics among models within the ensembles increased; that is, optical flow ‘confused’ the models instead of making predictions more precise. This can be explained by a number of factors. In the TC/xTC problem, the difference between the two phenomena is very significant in the peripheral area, where cloud formations differ the most. This can be observed by inspecting the figures of mean values (Fig. 5.1.2.2). There is a very noticeable difference between the TC and xTC classes, whereas the differences among the TC categories are not qualitative. Intuitively, typhoon intensity information is mostly related to the central area of the typhoon rather than to its peripheral cloud formations.

Optical flow estimation algorithms have limitations and perform well only where distinct visual features are present in the images. However, the central part of a typhoon is usually represented by a uniform blob of pixels, in which it is impossible to distinguish between moving and still parts of the image. This is usually the part of the image where the optical flow estimation algorithm struggles the most. It is also known that this is the area of the typhoon where the wind speed is highest. An example of the wind speed distribution within a typhoon is shown in Fig. 5.2.1.2.

Fig. 5.2.1.2. RapidScat gathered wind speed and direction data on Soudelor on Aug. 3, 2015, at 1900 UTC (3 p.m. EDT) just west of the Mariana Islands. RapidScat measured sustained winds around the center at more than 36 meters per second (70 knots/80.5 mph/129.6 kph). Credits: NASA JPL, Doug Tyle

It is assumed that optical flow has some correlation with the real movement of particles captured in a particular frame. However, because the algorithm relies solely on detecting visual features, it cannot capture motion close to the eye of a typhoon. In fact, in those areas the correlation between the speed (not the direction) of the detected motion and the real particle movement is negative (compare Fig. 5.2.1.2 and Fig. 5.2.1.3). The phenomenon is clearly seen in Fig. 5.2.1.4, where the strongest winds are expected in the central area of the picture, but the flow estimation algorithm fails to detect any significant motion there, because of the cloud density and its representation as an indistinguishable blob of pixels.

Fig. 5.2.1.3. Mean optical flow (pixel shift absolute values) for different intensity classes.

Fig. 5.2.1.4. Vector-type and color code visualization of optical flow for typhoon 201728 frame 93.
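The difficulty with uniform regions can be illustrated with the structure tensor used by gradient-based flow estimators such as Lucas-Kanade. This is a sketch under assumptions: the text does not state which optical flow algorithm was used, and the patches below are synthetic stand-ins rather than real satellite data. When the smallest eigenvalue of the tensor is close to zero, the equations that would yield a motion vector for the patch have no unique solution.

```python
import numpy as np

def min_structure_eigenvalue(patch):
    """Smallest eigenvalue of the 2x2 structure tensor of an image patch.

    Gradient-based flow estimators need this value to be well above zero
    to recover a unique motion vector for the patch.
    """
    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (columns).
    gy, gx = np.gradient(patch.astype(float))
    tensor = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                       [np.sum(gx * gy), np.sum(gy * gy)]])
    # eigvalsh returns eigenvalues in ascending order.
    return float(np.linalg.eigvalsh(tensor)[0])

rng = np.random.default_rng(0)
# Synthetic stand-in for the dense, uniform central cloud region.
uniform_patch = np.full((15, 15), 200.0)
# Synthetic stand-in for a textured peripheral region.
textured_patch = rng.uniform(0, 255, (15, 15))

uniform_score = min_structure_eigenvalue(uniform_patch)
textured_score = min_structure_eigenvalue(textured_patch)
```

For the uniform patch all gradients vanish, so the score is zero and no motion can be estimated regardless of how fast the underlying clouds move; the textured patch gives a strictly positive score.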

This experiment shows that using optical flow to classify the intensity category within TC does not improve prediction results; instead, it confuses the models and increases the variance of predictions, which means less confidence in those predictions.

Examples of predictions for different sequences and different models are shown in Fig. 5.2.1.5. The source data are IR images only, without optical flow.

Fig. 5.2.1.5. Examples of intensity prediction for different models. Data: images.

5.2.2. LSTM

In her work, Danlan [10] showed that the results of single-frame predictions can be significantly improved by exploiting the time-series nature of typhoon sequences and using recurrent neural networks, LSTM in particular. This experiment uses the ‘many to one’ variant of LSTM: the model takes as input n consecutive frames, from i-n up to i, and produces an output for frame i+1. The ‘many to one’ LSTM architecture is shown in Fig. 5.2.2.1.

Fig. 5.2.2.1. LSTM architecture.
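The ‘many to one’ input layout can be sketched as a sliding window over a typhoon sequence. This is an illustration only: the helper name is hypothetical, and the window is assumed to contain exactly n frames ending at frame i, which is one reading of the indexing described above.

```python
import numpy as np

def make_windows(frames, labels, n):
    """Pair n consecutive frames ending at index i with the label of frame i + 1.

    frames: array of shape (T, H, W); labels: array of shape (T,).
    Assumption: the window covers frames i-n+1 .. i (exactly n frames).
    """
    if len(frames) <= n:
        raise ValueError("sequence is too short for the requested window")
    xs, ys = [], []
    for i in range(n - 1, len(frames) - 1):
        xs.append(frames[i - n + 1 : i + 1])
        ys.append(labels[i + 1])
    return np.stack(xs), np.array(ys)

# Toy sequence: 6 frames of 2x2 pixels with made-up class labels.
frames = np.arange(6 * 2 * 2).reshape(6, 2, 2)
labels = np.array([2, 2, 3, 3, 4, 5])
x, y = make_windows(frames, labels, n=3)
```

A sequence of T frames yields T - n training examples under this layout, so longer windows leave fewer examples per sequence.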

The question to address is how many input frames (equivalently, how many hours, since the frame resolution is one hour) give the best predictions.

The architecture was configured as follows: the CNN architecture of ‘model4’ with 100 LSTM units added to it. The variable parameter is the number of input frames. An optimal value is expected to emerge from observing the variance of predictions among models within an ensemble. Varying the number of input frames should produce a trade-off: adding more temporal information improves the result, while feeding the model irrelevant information degrades it, and longer inputs also reduce the essential information available for training by effectively cutting off some number of frames.

Fig. 5.2.2.2. Evaluation metrics for some candidate models vs LSTMs with different input sequence length.

As can be observed from Fig. 5.2.2.2, the optimal input sequence length is around 15 frames. Examples of intensity class predictions for different models versus an LSTM with an input sequence length of 15 frames are shown in Fig. 5.2.2.3.

Fig. 5.2.2.3. Intensity class predictions. Different models vs LSTM with an input sequence length of 15 frames.

5.2.3. Results

The results of this experiment are better than those of related works. The approach of this work improved the accuracy of typhoon classification, reaching 67.50% (compared with 63.92% by Chen and 58% by Rodes-Guirao). This task, however, is artificial and is used only as a reference for evaluating the performance of the approach. In a real-life scenario, it is much more meaningful to use machine learning to estimate central pressure or sustained wind speed and derive the class from that estimate, rather than to estimate class indexes directly, because the class index is a very coarse discretization of the underlying physical characteristics.
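The coarseness of the class index can be illustrated by binning a continuous estimate. The thresholds below are hypothetical values in knots chosen for illustration; the text does not state the class boundaries used by the dataset, and the wind speed estimates are made-up numbers.

```python
import numpy as np

# Hypothetical class boundaries in knots (an assumption, not from the source).
WIND_THRESHOLDS_KT = np.array([34.0, 48.0, 64.0])

def wind_to_class(wind_kt):
    """Map estimated sustained wind speed to a class index in 0..3."""
    # np.digitize returns, for each value, the number of thresholds it has reached.
    return np.digitize(wind_kt, WIND_THRESHOLDS_KT)

# Made-up regression outputs: two nearly equal values and one much larger.
estimates = np.array([47.5, 48.5, 63.0])
classes = wind_to_class(estimates)
```

Under these assumed boundaries, the first two estimates differ by 1 knot yet fall into different classes, while the last two differ by 14.5 knots and share a class. A classifier trained on class indexes alone never sees this distinction.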

Source document: Applying Deep Learning to the Analysis of Typhoon Satellite Imagery (pp. 78-84)