Chapter 5: Results
National Chengchi University (國立政治大學)
Categorical variables related to time are summarized in Table 22. On average, flight delays have not decreased from 2014 to 2016. The only notable improvement is that flights in 2016 tend to arrive earlier, which is contrary to the results found in the descriptive statistics. Furthermore, since all month coefficients are positive relative to January, January is the month in which flights arrive the earliest and June to September the months in which they arrive the latest, confirming the trends found in the descriptive statistics.
Days of the week show few significant results. However, Saturday appears to be the day on which most flights arrive late, and Tuesday the day on which flights depart the earliest. This corresponds to Figure 4 (Number of flights operated per weekday), where Tuesday is the least busy and Saturday the busiest day of operations.
Regressing flights by hour group also reveals an interesting trend. Using the midnight group (00:00-00:59) as a reference, flights after 11:00 arrive late, with the 21:00-21:59 group arriving the latest. Departures after 11:00, on the other hand, tend to leave early. For departing flights, only the 06:00-06:59 group displayed a positive coefficient, which means that flights in this hour group are more likely to be delayed than flights in the reference group.
5.2. Data mining results
Table 23: Top lift factors for departure and arrival flights
Arrival Departure
Dimension Value Lift Dimension Value Lift
Origin Wuxi 2.041 Destination Wuxi 2.729
Origin Nanjing 1.924 Destination Nanjing 2.480
Airline Shenzhen Airlines 1.744 Operation Incoming delay 2.204
Origin Bali 1.583 Airline China Eastern 1.819
Origin Shanghai 1.506 Airline Shenzhen Airlines 1.656
Origin Shenzhen 1.474 Destination Shanghai 1.589
Origin Jakarta 1.432 Aircraft type Boeing 747-400 1.533
Airline China Eastern 1.402 Destination Shenzhen 1.503
Aircraft type Boeing 747-400 1.383 Time of day 15:00-16:59 1.448
Airline Asiana 1.354 Destination Los Angeles 1.437
The highest lift values overall for delayed arrival and departure flights are displayed in Table 23. For arrival flights, six of the ten highest-lift rules concern the origin airport, and airlines that serve these airports also show high lift values. For departure flights, a wider variety of dimensions is represented among the top 10 rules, which allows departure delay factors to be investigated further.
Flights from and to Wuxi and Nanjing are the top factors for delayed arrivals and delayed departures respectively. A lift of 2.041, for example, implies that flights from Wuxi are 2.041 times as likely to arrive delayed as the average flight, that is, 104.1% more likely. Similarly, flights that arrive late are much more likely to depart late as well, as the lift of 2.204 suggests. Another notable observation is that destinations contribute to an airline being delayed. Examples include Shenzhen Airlines, which operates to Wuxi and Shenzhen, and China Eastern, which operates to Wuxi, Nanjing and Shanghai.
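For reference, the lift of a rule can be computed from raw flight counts as the confidence of the rule divided by the baseline share of the consequent. The sketch below is illustrative only: the function name `lift` and all counts are hypothetical, chosen so that the result lands close to the Wuxi value in Table 23. They are not taken from the thesis dataset.

```python
def lift(n_total, n_antecedent, n_consequent, n_both):
    """Lift of the rule antecedent -> consequent from raw counts.

    lift = P(consequent | antecedent) / P(consequent)
    """
    confidence = n_both / n_antecedent   # e.g. P(delayed | origin is Wuxi)
    baseline = n_consequent / n_total    # e.g. P(delayed) over all flights
    return confidence / baseline


# Hypothetical counts (assumption, not thesis data):
# 100,000 arrivals, 20,000 of them delayed; 500 from Wuxi, 204 of those delayed.
wuxi_lift = lift(n_total=100_000, n_antecedent=500,
                 n_consequent=20_000, n_both=204)
print(round(wuxi_lift, 2))  # 2.04
```

With these made-up counts, 40.8% of the Wuxi flights are delayed against 20% of all flights, so the lift is about 2.04. The lift is a ratio of two probabilities rather than a probability itself, which is why it can exceed 1.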
The remainder of this section first summarizes lift rules for antecedents containing a single factor, to determine the effect of each factor on on-time performance and its correlation with it. A brief discussion of antecedents containing more than one factor follows, allowing for interaction between two or more factors. An in-depth analysis of the rules follows in the next section.
Single factor results
Table 24: Lift values of incoming on time performance factors
Antecedent Consequent Lift
Delayed arrival Delayed departure 2.204
On time arrival Delayed departure 0.672
Early arrival On time departure 1.200
On time arrival On time departure 1.107
Delayed arrival On time departure 0.608
The first factor considered was the one most strongly correlated with departure delay, namely incoming on-time performance, as shown in Table 24. A flight with a delayed arrival was 120.4% more likely than average to depart delayed (lift 2.204). The next highest rule was the negatively correlated on-time arrival rule (lift 0.672), which was obtained only by lowering the confidence threshold to 0.1. This implies that on-time arrivals followed by a late departure do not occur frequently enough to be considered at the original confidence threshold. For on-time departures, on the other hand, early and on-time arrivals are positively correlated and, as expected, delayed arrival is negatively correlated with on-time departure.
To investigate the effect of incoming on-time performance further, on-time departures were split into early and on-time departures, and delayed departures into delayed and overtime departures. To obtain sufficient rules, the confidence threshold was once again lowered to 0.1.
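The role of the confidence threshold can be illustrated with a small sketch. It assumes each flight is reduced to an (arrival status, departure status) pair and keeps only single-antecedent rules whose confidence reaches a minimum value. The function name `mine_rules` and the toy records are hypothetical, and this is not the mining tool used in the thesis.

```python
from collections import Counter


def mine_rules(records, min_confidence):
    """Return {(antecedent, consequent): (confidence, lift)} for every rule
    whose confidence is at least min_confidence."""
    n = len(records)
    antecedent_counts = Counter(a for a, _ in records)
    consequent_counts = Counter(c for _, c in records)
    pair_counts = Counter(records)

    rules = {}
    for (a, c), n_both in pair_counts.items():
        confidence = n_both / antecedent_counts[a]
        if confidence >= min_confidence:
            baseline = consequent_counts[c] / n
            rules[(a, c)] = (confidence, confidence / baseline)
    return rules


# Toy data (assumption): 100 flights as (arrival status, departure status).
records = (
    [("delayed arrival", "delayed departure")] * 20
    + [("delayed arrival", "on time departure")] * 10
    + [("on time arrival", "delayed departure")] * 10
    + [("on time arrival", "on time departure")] * 60
)

strict = mine_rules(records, min_confidence=0.2)
relaxed = mine_rules(records, min_confidence=0.1)
```

In this toy data the rule from on-time arrival to delayed departure has a confidence of about 0.14, so it is absent from `strict` and only appears in `relaxed`, with a lift below 1. This mirrors why negatively correlated rules tend to surface only once the threshold is lowered.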
Figure 14: Graph showing how departure on time performance follows that of incoming performance
Figure 14 shows that whatever on-time performance a flight displays on arrival, there is a high probability that it will display the same behavior on departure.
Most notably, early and on-time arrivals show a positive correlation with their departure counterparts, and early arrivals are more likely to depart early than on time. In contrast, the effect of a delayed arrival appears to be amplified: flights that arrive delayed have a high probability of departing delayed or even overtime (delayed by more than one hour). This may be due to operational inefficiencies among airlines, as discussed in later sections.
Lift values grouped by time of day generally follow the number of flights operated in that hour. However, towards the end of the day, as the number of flights decreases, lift values do not recover in the same way; instead, they follow the pattern of Figure 8. As Figure 15 shows, flights after 15:00 appear to be positively correlated with delay, whether arriving or departing.
Figure 14 data (lift of departure performance by arrival performance):
Arrival Early departure On time departure Delayed departure Overtime departure
Early 1.605 0.930 0.417 0.251
On time 1.013 1.689 0.738 0.394
Delayed 0.385 0.758 2.052 2.847
Figure 15: Lift values by hourgroup
As shown in Table 25, the airports most prone to delays are mainly in Asia, particularly China. The only destination on the list outside Asia is Los Angeles. To find characteristics of these destinations, lift rules were also applied to city pair size and distance.
Table 25: Lift values for top delayed airports
Origin Lift Destination Lift
Wuxi 2.041 Wuxi 2.729
Nanjing 1.923 Nanjing 2.480
Bali 1.583 Shanghai 1.589
Shanghai 1.506 Shenzhen 1.503
Shenzhen 1.474 Los Angeles 1.437
Jakarta 1.432 Beijing 1.268
Manila 1.335 Manila 1.189
Ningbo 1.296 Hong Kong 1.053
Bangkok 1.261 Bangkok 1.007
Guangzhou 1.222
Beijing 1.212
Hong Kong 1.196
Seoul 1.094
Ho Chi Minh 1.052
Since the airports in this list share no clear characteristics, lift rules were obtained for airport categories, with airports categorized by size and distance.
Figure 15 data (lift by hour group):
Hour group Arrival lift Departure lift
00:00-05:59 0.693 0.724
06:00-08:59 0.442 0.739
09:00-11:59 0.696 0.709
12:00-14:59 1.006 0.980
15:00-17:59 1.159 1.448
18:00-20:59 1.027 1.213
21:00-23:59 1.252 1.166
Figure 16: Lift values for airports categorized by size (above) and distance (below)
When destinations are grouped together, some trends emerge, as shown in Figure 16.
Flights operating to and from big airports (airports with over 50 million passengers per annum) are more likely to be delayed, regardless of whether they are arriving or departing. Short flights below 1,000km also have a higher probability of being delayed.
Furthermore, long-haul flights over 5,000km are more likely to depart delayed but, owing to the distance of the flight, can make up the delay en route and arrive early, as indicated by the much lower lift for arrival flights.
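The categorization step behind Figure 16 can be sketched as a pair of binning functions. The band labels follow the figure, but the function names and the exact cut-offs are assumptions, in particular how boundary values such as exactly 10 million passengers are assigned.

```python
def distance_band(distance_km):
    """Map a route distance in km to the distance bands of Figure 16."""
    if distance_km < 1000:
        return "0-999km"
    if distance_km < 3000:
        return "1000-2999km"
    if distance_km < 5000:
        return "3000-4999km"
    return "5000km+"


def airport_size_band(passengers_millions):
    """Map annual passengers (in millions) to the size bands of Figure 16.

    Assumption: cut-offs at 10, 30 and 50 million; the source does not
    state how boundary values are handled.
    """
    if passengers_millions < 10:
        return "Below 10 million"
    if passengers_millions <= 30:
        return "11-30 million"
    if passengers_millions <= 50:
        return "31-50 million"
    return "50 million +"


print(distance_band(650), "/", airport_size_band(65))
```

Each flight then carries one distance item and one size item, which can serve as antecedents in the lift rules.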
Figure 16 data, panel A (lift by airport size, annual passengers):
Airport size Arrival Departure
Below 10 million 0.946 0.988
11-30 million 0.877 0.926
31-50 million 0.986 0.941
50 million + 1.107 1.076
Figure 16 data, panel B (lift by distance):
Distance Arrival Departure
0-999km 1.184 1.197
1000-2999km 0.923 0.850
3000-4999km 0.951 0.927
5000km+ 0.695 1.157
Table 26: Lift values for delayed airlines
Arrival Departures
Airlines Lift Airlines Lift
Shenzhen Airlines 1.744 China Eastern 1.819
China Eastern 1.402 Shenzhen Airlines 1.656
Asiana 1.354 China Airlines 1.341
China Airlines 1.269 Mandarin Airlines 1.335
TransAsia 1.257 Peach Aviation 1.181
Thai 1.229 TransAsia 1.155
Mandarin Airlines 1.198 Hong Kong Airlines 1.138
Air Macau 1.161 Air China 1.006
UNI Air 1.099 Tiger Air Taiwan 0.986
Hong Kong Airlines 1.058 Cathay Pacific 0.978
Among airlines, six that appear in the delayed arrival list also appear in the delayed departure list, as shown in Table 26, further strengthening the observation that departure and arrival delays are correlated. At a confidence level of 0.2, the top 10 already includes airlines that are negatively correlated with departure delays (lift below 1), which may imply that the airline is not the most significant factor in determining flight delays.
Table 27: Lift values for overtime flights
Antecedent Consequent Lift
China Eastern Overtime arrival 1.886
TransAsia Overtime arrival 1.731
China Airlines Overtime arrival 1.245
China Eastern Overtime departure 3.207
To observe more rules, confidence thresholds would need to be lowered, and the rules would then tend to reflect chance observations rather than frequent ones.
However, after on-time performance is expanded to four classes, as in Table 27, association rules for overtime performance also appear. They show that China Eastern has the highest overtime probability for both departures and arrivals, in addition to some of the highest lift values for delays.
Comparing the list of airlines associated with arrival delays with that for departures reveals some differences. Of note is Asiana, which is associated with arrival delays but not with departure delays. This suggests that there may be some behavior related to turnaround times that merits investigation.
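The comparison of the two airline lists can be checked with simple set operations. The sketch below copies the airline names from Table 26; the variable names are illustrative.

```python
# Airlines listed in Table 26 (arrival and departure columns).
arrival_top10 = {
    "Shenzhen Airlines", "China Eastern", "Asiana", "China Airlines",
    "TransAsia", "Thai", "Mandarin Airlines", "Air Macau", "UNI Air",
    "Hong Kong Airlines",
}
departure_top10 = {
    "China Eastern", "Shenzhen Airlines", "China Airlines",
    "Mandarin Airlines", "Peach Aviation", "TransAsia",
    "Hong Kong Airlines", "Air China", "Tiger Air Taiwan", "Cathay Pacific",
}

in_both = arrival_top10 & departure_top10         # on both lists
arrival_only = arrival_top10 - departure_top10    # arrival list only
departure_only = departure_top10 - arrival_top10  # departure list only

print(len(in_both), sorted(arrival_only))
```

The intersection contains six airlines, and Asiana falls in the arrival-only set together with Thai, Air Macau and UNI Air.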
Figure 17: Lift values in relation to scheduled turnaround times
Figure 17 shows a clear pattern: the probability of a departure delay increases as turnaround time decreases. Airlines, or more specifically pilots, therefore have an incentive to arrive earlier when turnaround times are short, and hence short turnaround times are negatively correlated with arrival delay. Flight schedules with longer turnaround times give airlines more time to adjust should things go wrong during the turnaround. Longer turnaround times are therefore positively correlated with arrival delay, owing to the lower incentive to arrive early.
Airlines schedule longer turnaround times for flights with more passengers, or for flights at the end of the day with an overnight maintenance stop, as discussed earlier. Following this logic, we next analyze factors relating to passenger numbers and load factors.
Figure 17 data (lift by scheduled turnaround time):
Scheduled turnaround time Arrival delay lift Departure delay lift
Very short 0.817 1.217
Short 0.944 1.117
Medium 1.029 0.877
Long 0.981 0.779
Very long 1.224 0.763
Table 28: Lift rules for passenger loads
Antecedent Consequent Lift
By load factor
71-85% load Delayed arrival 1.033
51-70% load Delayed arrival 0.946
85%+ load Delayed arrival 0.890
Passenger loads Delayed departure (none)
By passenger number
201-300 passengers Delayed arrival 1.151
300+ passengers Delayed arrival 1.050
101-200 passengers Delayed arrival 0.885
0-100 passengers Delayed arrival 0.846
300+ passengers Delayed departure 1.372
201-300 passengers Delayed departure 1.053
0-100 passengers Delayed departure 1.049
101-200 passengers Delayed departure 0.909
No clear trend or pattern can be observed from the lift rules in Table 28. The lift values do, however, confirm the regression results that arrival delay decreases as load factor increases and that there is no significant relationship with departure delays. To investigate the effect of passenger numbers further, we look at aircraft size and aircraft classification.
Figure 18: Lift values of aircraft types
Figure 18 data (lift by aircraft type):
Aircraft type Arrival lift Departure lift
Airbus A320 0.759 0.967
Airbus A321 0.937 0.807
Airbus A330-200 1.117 0.874
Airbus A330-300 1.174 1.126
Airbus A340-300 1.245 1.216
Boeing 737-800 0.820 0.900
Boeing 747-400 1.383 1.533
Boeing 777-200 0.820 0.891
Boeing 777-300 0.957 0.975
McDonnell MD90 1.065 0.937
Figure 19: Lift values of aircraft classifications
Figure 18 shows that the highest lift values came from flights operated by the Boeing 747-400, followed by Airbus widebodies such as the A330 and A340. This may be because these are the oldest aircraft on the market and may frequently run into mechanical problems that result in delays. Interestingly, the Boeing 777, which is a widebody, displayed lift values similar to those of narrowbody aircraft. However, when aircraft are grouped by classification as shown in Figure 19, narrowbodies are negatively correlated with delays while their widebody counterparts are positively correlated. Regional aircraft may be an exception, as they are mainly operated by certain airlines prone to delays (e.g. Mandarin Airlines).
As a follow-up observation on passenger numbers, turnaround time and aircraft type, the rules that seem out of place are characteristic of low-cost airlines: narrowbodies, 100-200 passengers per flight and quick turnarounds.
Figure 19 data (lift by aircraft classification):
Aircraft class Arrival lift Departure lift
Airbus narrow 0.838 0.891
Airbus wide 1.168 1.081
Boeing narrow 0.903 0.901
Boeing wide 1.042 1.097
Regional 1.032 0.930
Figure 20: Comparison of LCC and FSC lift values
As expected, low-cost carrier (LCC) flights are negatively correlated with flight delays, as shown in Figure 20. Although the increase in delay probability for full-service carrier (FSC) flights is small (4.5% and 1.8% above the baseline for arrival and departure flights respectively), the probability of an LCC flight arriving late is 31.1% below the baseline.
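These percentages follow directly from the lift values reported for Figure 20, since a lift expresses the delay probability relative to the overall baseline. A minimal sketch of the conversion, where the function name is illustrative:

```python
def lift_to_percent_change(lift):
    """Percentage change in delay probability relative to the baseline."""
    return round((lift - 1.0) * 100, 1)


# Lift values reported for Figure 20.
figure_20 = {
    ("Full service carrier", "arrival"): 1.045,
    ("Full service carrier", "departure"): 1.018,
    ("Low cost carrier", "arrival"): 0.689,
    ("Low cost carrier", "departure"): 0.873,
}

for (carrier, direction), value in figure_20.items():
    print(carrier, direction, lift_to_percent_change(value))
```

A lift of 1.045 corresponds to +4.5% and a lift of 0.689 to -31.1%, in both cases relative to the delay probability of all flights rather than in absolute percentage points.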
Figure 21: Lift values for alliance affiliation
When analyzing how alliance affiliation affects on-time performance, only SkyTeam displays lift values above 1, as shown in Figure 21, indicating a positive correlation with flight delays. Another observation for further investigation is the difference in lift values between arrival and departure flights, which may reflect how efficiently alliance members coordinate their flight schedules.
Figure 20 data (lift by carrier type):
Carrier type Arrival lift Departure lift
Full service carrier 1.045 1.018
Low cost carrier 0.689 0.873
Figure 21 data (lift by alliance affiliation):
Alliance Arrival Departure
SkyTeam 1.172 1.257
Star Alliance 0.985 0.750
oneworld 0.756 0.867
Value Alliance 0.718 0.987
None 0.942 0.968
Figure 22: Lift values for competition on route
On the other side of the spectrum, having competitors seems to be negatively correlated with the probability of flight delays, except on routes with four to five competing airlines, as shown in Figure 22. For reasons that are not clear, the probability of a flight delay appears to decrease again on routes with more than five airlines.
Multiple factor results
Table 29: Highest lift values for 2 factor antecedents
Arrival 2 factor Lift
18:00-20:59 arrival, 300+ passengers 1.604
Within 999km, 300+ passengers 1.590
3-4 competitors, 300+ passengers 1.515
5+ competitors, within 999km 1.506
Within 999km, 2-4 hour planned turnaround 1.474
Very large airport, 21:00-23:59 arrival 1.470
Airbus widebody, very long turnaround 1.462
Airbus widebody, 21:00-23:59 arrival 1.459
5000km+ origin, 21:00-23:59 arrival 1.458
12:00-14:59 arrival, 8 hour+ turnaround 1.456
Departure 2 factor Lift
1-2 hour turnaround, 15:00-17:59 departure 1.979
Within 999km, 15:00-17:59 departure 1.810
300+ passengers, 15:00-17:59 departure 1.800
201-300 passengers, 0-59 minute turnaround 1.766
Boeing widebody, 15:00-17:59 departure 1.724
Airbus widebody, 1 hour turnaround 1.714
1-2 competitors, 15:00-17:59 departure 1.681
Medium sized airport, 300+ passengers 1.604
Within 999km, 300+ passengers 1.599
5000km destination, 1-2 hour turnaround 1.576
When analyzing rules of length three, the top results for departure delays all included a delayed incoming arrival, once again emphasizing that departure delay is correlated with incoming arrival delay. Incoming arrival on-time performance has therefore been removed from the dataset to reveal how other factors influence departure delays. A common factor in departure delay rules is departing between 15:00 and 17:59, consistent with the highest lift in the hour group analysis. Using longer rules, we can see which factors contribute to delays in that hour group, namely short distances, short turnaround times and high passenger loads.
Figure 22 data (lift by number of competitors on route):
Competitors Arrival Departure
No competitor 0.776 1.035
1-2 competitors 0.916 0.999
3-4 competitors 1.152 1.035
5+ competitors 0.947 0.960
For arrivals, however, flights carrying more than 300 passengers are more prone to delays. Another interesting observation is that short routes with many competitors are also prone to arrival delays. These routes normally operate a high-frequency schedule, and airspace congestion may be one reason for such delays.
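The two-factor rules discussed above can be sketched by enumerating every pair of attribute items on a flight and computing the lift of each pair for a fixed consequent. The function name `two_factor_lifts`, the item labels and the toy flights are hypothetical, and the thesis may have used a different mining procedure.

```python
from collections import Counter
from itertools import combinations


def two_factor_lifts(flights, consequent, min_confidence=0.1):
    """Lift of every two-item antecedent for a fixed consequent.

    flights: list of (set_of_attribute_items, outcome) pairs.
    """
    n = len(flights)
    baseline = sum(1 for _, outcome in flights if outcome == consequent) / n

    antecedent_counts = Counter()
    joint_counts = Counter()
    for items, outcome in flights:
        # sorted() gives each pair a canonical order, e.g. ("a", "b")
        for pair in combinations(sorted(items), 2):
            antecedent_counts[pair] += 1
            if outcome == consequent:
                joint_counts[pair] += 1

    lifts = {}
    for pair, n_antecedent in antecedent_counts.items():
        confidence = joint_counts[pair] / n_antecedent
        if confidence >= min_confidence:
            lifts[pair] = confidence / baseline
    return lifts


# Toy data (assumption): ten departures with two attribute items each.
afternoon_short = {"15:00-17:59 departure", "within 999km"}
morning_long = {"06:00-08:59 departure", "5000km+ destination"}
flights = (
    [(afternoon_short, "delayed")] * 3
    + [(afternoon_short, "on time")] * 1
    + [(morning_long, "delayed")] * 1
    + [(morning_long, "on time")] * 5
)

lifts = two_factor_lifts(flights, consequent="delayed")
top_pair = max(lifts, key=lifts.get)
```

In this toy data the afternoon short-haul pair has the highest lift (about 1.88), while the morning long-haul pair is below 1. With real data the number of candidate pairs grows quickly, which is one reason a confidence threshold is applied.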
To determine which factors affected each airline, the airline was fixed as one of the antecedents in order to identify the main cause of its delays. The results for each of the 10 airlines most correlated with arrival and departure delays are shown in Table 30.
Table 30: Main contributing factor for delayed airlines
Delayed arrival as consequent
Airline Dimension Dimension value Lift
Shenzhen Airlines Origin Airport Mega city origin 1.189
China Eastern Arrival time Afternoon arrival 1.972
Asiana Competition 5+ competitors 1.354
China Airlines Competition 3-4 competitors on route 1.653
TransAsia Arrival time Late evening arrival 1.499
Thai Competition 3-4 competitors on route 1.320
Mandarin Airlines Passengers 201-300 passengers 1.457
Air Macau Turnaround Very long turnaround 1.291
UNI Air (no significant dimension found)
Hong Kong Airlines Turnaround Very short turnaround 1.230
Delayed departure as consequent
Airline Dimension Dimension value Lift
China Eastern Operations Delayed incoming arrival 2.984
Shenzhen Airlines Operations Delayed incoming arrival 2.505
China Airlines Turnaround Very short 2.638
Mandarin Airlines Operations Delayed incoming arrival 2.151
Peach Aviation Operations Delayed incoming arrival 3.568
TransAsia Turnaround Very short 2.533
Hong Kong Airlines Operations Delayed incoming arrival 2.425
Air China Operations Delayed incoming arrival 2.319
Table 30 shows that arrival delays can stem from various factors, whereas departure delays usually occur because of late incoming arrivals. Surprisingly, an afternoon departure did not appear as the main factor for any airline.
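Fixing the airline as one antecedent and selecting its strongest companion factor can be sketched as follows. The two top lift values are taken from Table 30, while the lower-ranked rules and the function name `main_factor_per_airline` are hypothetical and included only to show the selection step.

```python
def main_factor_per_airline(rules, airlines):
    """For each airline, keep the co-occurring factor with the highest lift.

    rules: iterable of (antecedent_items, lift), where antecedent_items is a
    frozenset holding the airline and exactly one other factor.
    """
    best = {}
    for antecedent, lift in rules:
        for airline in airlines & antecedent:
            (factor,) = antecedent - {airline}  # the single remaining item
            if airline not in best or lift > best[airline][1]:
                best[airline] = (factor, lift)
    return best


rules = [
    (frozenset({"China Eastern", "Delayed incoming arrival"}), 2.984),   # Table 30
    (frozenset({"China Eastern", "Very short turnaround"}), 1.900),      # hypothetical
    (frozenset({"China Airlines", "Very short turnaround"}), 2.638),     # Table 30
    (frozenset({"China Airlines", "Delayed incoming arrival"}), 2.100),  # hypothetical
]

best = main_factor_per_airline(rules, airlines={"China Eastern", "China Airlines"})
```

Here `best` maps China Eastern to a delayed incoming arrival and China Airlines to a very short turnaround, matching the corresponding rows of the departure part of Table 30.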
Table 31: Main contributing factor for on time airlines
On time arrival as consequent
Airline Dimension Dimension value Lift
Air Asia Arrival time Late morning 1.581
Tiger Airways Arrival time Midday 1.432
Singapore Airlines Arrival time Midday 1.424
Japan Airlines Arrival time Late morning 1.468
Jetstar Asia Origin distance Medium 1.491
Philippine Airlines Origin size Large 1.453
Peach Aviation Arrival time Late morning 1.460
Vanilla Air Planned turnaround Very short 1.373
Malaysia Airlines Aircraft class Boeing narrowbody 1.213
Tiger Air Taiwan Arrival time Dawn 1.348
On time departure as consequent
Airline Dimension Dimension value Lift
Emirates Passengers 201-300 1.302
All Nippon Airways Departure time Early morning 1.260
Delta Turnaround Very long 1.222
Japan Airlines Operation Early arrival 1.284
Vietnam Airlines Turnaround Short 1.250
Air Asia Operation On time arrival 1.289
Air Asia X Operation On time arrival 1.300
United Passengers 300+ 1.225
Jetstar Asia Operation Early arrival 1.297
Singapore Airlines Operation Early arrival 1.300
Conversely, multifactor antecedents can also be used to determine which factors play a role in on-time performance. A similar process was carried out for the top 10 airlines for on-time arrival and departure. Table 31 ranks airlines by their single-factor lift for on-time performance and shows that arrival time is an important factor for airlines arriving on time. As with their delayed counterparts, an on-time or early arrival is positively associated with an on-time departure. One trend for further investigation is that Peach Aviation and Tiger Air Taiwan appear in the top 10 for on-time arrival yet also in the top 10 for delayed departure.