
CHAPTER 4 EMPIRICAL STUDY

4.4 Causality of Taiwan Single Auto-Vehicle Accidents

4.4.1 Data

The 2005 Taiwan single auto-vehicle (SAV) accident data was adopted to demonstrate the feasibility of the proposed approach for accident causality analysis. In particular, accident severity was considered as the target variable for this study. The primary reason for replacing the dataset used in the previous two sections with another dataset is that the rule support in the earlier dataset was extremely low, except for the bump-into-facility crash type. This demonstrates the uniqueness of those accident types and might leave no rules with relationships to analyze.

The 2005 Taiwan SAV accident data was also collected by police departments and included all death involved and injury only accidents. The total number of SAV accidents, excluding invalid cases, was 3,138. The number of invalid cases was 27, which accounted for 0.86% of the total cases. These cases were invalid mainly because attribute values of the driver's characteristics were unknown. They were excluded from the study because of their relatively small number. The collected attributes and their corresponding categories are summarized in Table 4-8.

TABLE 4-8 Attribute and Category

| Attribute | Category |
|---|---|
| Age | Under (<18), Young (18-35), Middle-aged (36-55), Elderly (>55) |
| Gender | Male, Female |
| License type | Regular, Occupational, Other |
| License condition | Valid, Invalid, Unknown |
| Occupations | Student, Working people, No job, Unknown |
| Trip purpose | Necessary (Working, school, business), Other |
| Trip time | MP (07-09), DOP (09-16), AP (16-19), NOP (19-23), Midnight (23-07) |
| Seat belt use | Fastening, Not fastening, Unknown |
| Cell phone use | Using, Not using, Unknown |
| Drinking condition | Drinking, Not drinking, Other |
| Road type | Highway, Urban, Rural |
| Speed limit | 50-, 51-79, 80+ |
| Road shape | Intersection, Segment, Ramp or other |
| Pavement material | Asphalt, Other, No pavement |
| Surface deficiency | Normal, Other (e.g. holes, soft, and so on) |
| Surface condition | Dry, Wet or other |
| Obstruction | Yes, No (within 15 meters) |
| Sight distance | Good, Poor (based on road design speed) |
| Signal type | Regular, Flash, No signal |
| Signal condition | Normal, Abnormal, No signal |
| Median | Island, Marker, Marking, None |
| Roadside marking | Yes, No |
| Weather | Sunny or cloudy, Rainy, Other |
| Illumination | With light, No light |
| Alignment | Straight, Curved, Other |
| Accident severity | Death involved, Injury only |

4.4.2 Classification with Rough Sets

The Taiwan 2005 SAV accident data was first analyzed with rough sets theory to generate a minimum rule set covering all objects. This analysis consisted of two steps: variable selection and rule induction. The former step identified the variables that were unable to differentiate accident severity. In the analysis, four out of 25 variables were identified as redundant: pavement material, surface deficiency, signal condition, and weather condition. This redundancy may arise for two reasons. First, their effects could be replaced by other variables. For example, the effect of the weather variable could be substituted by that of the surface condition variable, since rain would result in a wet surface. Weather affects more than surface conditions; for example, strong wind or heavy snowfall would make it harder for drivers to control their vehicles. However, these weather conditions rarely appear in Taiwan. Second, these redundant variables had no significant impact on accident severity. For example, 98.6% and 98.5% of the accidents were reported on roads with an asphalt pavement and on roads without surface deficiency, respectively. Therefore, the pavement material and surface deficiency variables were reported as redundant. After excluding the four redundant variables, the remaining 21 variables were considered in generating rules.
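The variable selection step can be illustrated with a small sketch. It assumes that a variable counts as redundant when dropping it leaves unchanged the number of cases whose severity is uniquely determined by the remaining variables (the rough sets positive region). The greedy elimination order, the function names, and the four toy records are illustrative assumptions, not the procedure or data of this study.

```python
from collections import defaultdict


def positive_region_size(rows, attrs, decision):
    """Count cases whose decision value is uniquely determined by `attrs`."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        groups[key].append(row[decision])
    # A group is consistent when all of its cases share one decision value.
    return sum(len(d) for d in groups.values() if len(set(d)) == 1)


def greedy_reduct(rows, attrs, decision):
    """Drop attributes one by one while the positive region stays the same."""
    kept = list(attrs)
    full = positive_region_size(rows, kept, decision)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if positive_region_size(rows, trial, decision) == full:
            kept = trial  # `a` adds no discriminating power given the rest
    return kept


# Toy records (hypothetical): weather duplicates the surface condition.
rows = [
    {"weather": "Rainy", "surface": "Wet", "belt": "Not fastening", "severity": "Death"},
    {"weather": "Rainy", "surface": "Wet", "belt": "Fastening", "severity": "Injury"},
    {"weather": "Sunny", "surface": "Dry", "belt": "Not fastening", "severity": "Injury"},
    {"weather": "Sunny", "surface": "Dry", "belt": "Fastening", "severity": "Injury"},
]
attrs = ["weather", "surface", "belt"]
kept = greedy_reduct(rows, attrs, "severity")
redundant = [a for a in attrs if a not in kept]
print("kept:", kept, "redundant:", redundant)
```

In the toy records, weather carries the same information as surface condition, so it is dropped first and surface condition is then retained; a different elimination order could retain weather instead.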

With 21 non-redundant explanatory variables, 315 rules were generated with rough sets theory to represent the 3,138 accident cases. This study applied the most frequently used algorithm, minimum covering, to generate rules. Its aim was to generate the smallest number of rules, each as short as possible, that cover all accidents. Of the 315 rules, 295 were exact rules and 20 were approximate rules. An exact rule refers to a situation in which the severity of an accident could be identified under a particular circumstance. An approximate rule, on the other hand, represents a circumstance under which the accident severity could not be uniquely determined.
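The distinction between exact and approximate rules can be made concrete with a short sketch. It assumes that a rule is a set of attribute-value conditions and that its support is the number of cases matching those conditions; the function name and the toy records are hypothetical and do not come from the SAV data.

```python
def rule_stats(rows, conditions, decision):
    """Summarize a rule given as {attribute: value} conditions."""
    matched = [
        row[decision]
        for row in rows
        if all(row.get(attr) == value for attr, value in conditions.items())
    ]
    outcomes = sorted(set(matched))
    if not matched:
        kind = None  # the rule covers no case
    elif len(outcomes) == 1:
        kind = "exact"  # severity is uniquely determined
    else:
        kind = "approximate"  # matching cases disagree on severity
    return {"support": len(matched), "outcomes": outcomes, "kind": kind}


# Toy records (hypothetical)
rows = [
    {"age": "Young", "drink": "Drinking", "alignment": "Curved", "severity": "Death"},
    {"age": "Young", "drink": "Drinking", "alignment": "Curved", "severity": "Death"},
    {"age": "Young", "drink": "Not drinking", "alignment": "Straight", "severity": "Injury"},
    {"age": "Elderly", "drink": "Not drinking", "alignment": "Straight", "severity": "Injury"},
    {"age": "Elderly", "drink": "Not drinking", "alignment": "Straight", "severity": "Death"},
]

rule_a = {"age": "Young", "drink": "Drinking"}
rule_b = {"drink": "Not drinking", "alignment": "Straight"}
print(rule_stats(rows, rule_a, "severity"))
print(rule_stats(rows, rule_b, "severity"))
```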

The rule support histogram is shown in Figure 4-2, where the number of rules (vertical axis) is plotted against the support (horizontal axis). The right-skewed shape shows that most rules were of low support, which suggests that most SAV accidents hold relatively unique patterns. On the other hand, some rules showed high support even though 21 factors were considered.

[Figure 4-2: histogram of rule support. Horizontal axis: support; vertical axis: no. of rules. Min. = 1.00, 1st Qu. = 2.00, Median = 6.00, Mean = 27.44, 3rd Qu. = 34.50, Max. = 279.00]

FIGURE 4-2 Rule support.

4.4.3 Determination of Rule Support Threshold for Differentiating Accidents

For the purpose of analysis, the accident cases were separated into two subsets: one subset included accidents whose rule support was high enough that a relationship could be claimed; the other consisted of the remaining accidents, for which a relationship may not exist. The threshold of rule support was chosen by examining the average hit rate of accidents at different levels of support. First, the whole dataset was tested. Second, accidents related to rules with support of one were excluded, and the remaining accidents were tested. Then, accidents related to rules with support less than or equal to two were excluded, and the remaining accidents were tested. The test continued until the accidents related to rules with support less than or equal to nine were excluded. In each test, decision trees were employed to obtain the average hit rate over 2,000 Monte Carlo simulations; in each simulation, 75% of cases were selected for training and the remaining 25% were used for testing**. Moreover, a reference average hit rate was created to assess the improvement. The reference hit rate was obtained by testing data randomly selected from the original dataset with a specified sample size and injury/death case ratio. The sample size and injury/death ratio were determined by the corresponding dataset selected by rough sets rules, as shown in Table 4-9.
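The testing loop described above can be sketched as follows. This is a minimal outline under stated assumptions: `fit_predict` is a hypothetical callback standing in for the decision tree (which is not reproduced here), the hit rate is taken as the share of correctly predicted cases, the split is stratified by severity, and all function names and the always-predict-injury baseline are illustrative only.

```python
import random


def select_by_support(case_support, k):
    """Indices of cases whose covering rule has support greater than k."""
    return [i for i, s in enumerate(case_support) if s > k]


def stratified_split(labels, train_frac, rng):
    """Sample `train_frac` of each class for training; the rest is for testing."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        idx = idx[:]
        rng.shuffle(idx)
        cut = int(len(idx) * train_frac)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def class_hit_rates(y_true, y_pred):
    """Share of correctly predicted cases per true class and overall."""
    rates = {}
    for c in set(y_true):
        hits = [t == p for t, p in zip(y_true, y_pred) if t == c]
        rates[c] = sum(hits) / len(hits)
    rates["overall"] = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return rates


def average_hit_rates(labels, fit_predict, n_runs, seed=0):
    """Average the hit rates over repeated stratified 75/25 splits."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(n_runs):
        train, test = stratified_split(labels, 0.75, rng)
        y_true = [labels[i] for i in test]
        y_pred = fit_predict(train, test)  # hypothetical classifier hook
        for key, value in class_hit_rates(y_true, y_pred).items():
            totals[key] = totals.get(key, 0.0) + value
    return {key: value / n_runs for key, value in totals.items()}


def always_injury(train, test):
    """Trivial baseline used only to exercise the loop."""
    return ["injury"] * len(test)


# Whole dataset class counts: 2,834 injury only and 304 death involved cases.
labels = ["injury"] * 2834 + ["death"] * 304
print(average_hit_rates(labels, always_injury, n_runs=20))
```

The trivial baseline never identifies a death involved case, which illustrates why the hit rate of the minority class is reported separately from the overall hit rate.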

TABLE 4-9 Dissimilar Strong Rules Leading to Death or Other

| Data | Included cases | Sample size: Total | Sample size: Injury | Sample size: Death | Injury/Death ratio |
|---|---|---|---|---|---|
| Whole | Whole | 3138 | 2834 | 304 | 9.32 |
| G1 | Support > 1 | 3010 | 2776 | 234 | 11.86 |
| G2 | Support > 2 | 2940 | 2772 | 168 | 16.50 |
| G3 | Support > 3 | 2907 | 2771 | 136 | 20.38 |
| G4 | Support > 4 | 2867 | 2767 | 100 | 27.67 |
| G5 | Support > 5 | 2837 | 2755 | 82 | 33.60 |
| G6 | Support > 6 | 2800 | 2741 | 59 | 46.46 |
| G7 | Support > 7 | 2773 | 2736 | 37 | 73.95 |
| G8 | Support > 8 | 2757 | 2720 | 37 | 73.51 |
| G9 | Support > 9 | 2725 | 2715 | 10 | 271.50 |
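The injury/death ratios in Table 4-9 follow directly from the case counts. The short check below recomputes them; the dictionary layout and the function name are illustrative choices, while the counts and reported ratios are copied from the table.

```python
# (injury only cases, death involved cases, reported ratio) per dataset
TABLE_4_9 = {
    "Whole": (2834, 304, 9.32),
    "G1": (2776, 234, 11.86),
    "G2": (2772, 168, 16.50),
    "G3": (2771, 136, 20.38),
    "G4": (2767, 100, 27.67),
    "G5": (2755, 82, 33.60),
    "G6": (2741, 59, 46.46),
    "G7": (2736, 37, 73.95),
    "G8": (2720, 37, 73.51),
    "G9": (2715, 10, 271.50),
}


def injury_death_ratio(injury, death):
    """Number of injury only cases per death involved case."""
    return injury / death


for name, (injury, death, reported) in TABLE_4_9.items():
    ratio = injury_death_ratio(injury, death)
    total = injury + death
    print(f"{name:5s} total={total:4d} ratio={ratio:.2f} reported={reported:.2f}")
```

The death involved cases fall from 304 in the whole dataset to 82 at G5 and 37 at G7, so each step trades a more homogeneous dataset against a smaller minority class.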

The average hit rate is shown in Figure 4-3. The hit rate for data selected by rough sets rules is illustrated with solid lines; the reference hit rate is drawn with dotted lines. The average hit rates increased with the exclusion of accidents related to low-support rules, especially for the minority class, fatal accidents. In particular, when only accidents related to rules with support greater than five (G5) or seven (G7) were retained, the average hit rate of death involved cases increased significantly, as labeled with solid circles in the graph. Although the G7 point showed a relatively significant increase, the G7 data contained only 37 death involved cases. On the other hand, the G5 data contained 82 death involved cases and raised the hit rate from 0.2 to around 0.5. Therefore, a support of six was taken as the threshold to differentiate between rules. That is, accidents related to rules with support greater than or equal to six were considered high-support-rule accidents.

** Stratified random sampling was employed to partition data into training and testing groups. That is, 75% of injury only cases were randomly chosen for training, and likewise 75% of death involved cases.

[Figure 4-3: line chart. Horizontal axis: Data (Whole, G1, G2, G3, G4, G5, G6, G7, G8, G9); vertical axis: Average hit rate (%), scaled from 0.0 to 1.0. Series: RS_Death, RS_Injury, RS_Overall, Ref_Death, Ref_Injury, Ref_Overall.]

FIGURE 4-3 Average hit rate with respect to accidents related to rules with different support.

* RS_Death, RS_Injury, and RS_Overall refer to the average hit rate for death involved, injury only, and overall cases selected by rough sets rules, respectively. Ref_Death, Ref_Injury, and Ref_Overall refer to the average hit rate of reference for death involved, injury only, and overall cases, respectively.

4.4.4 Rule Comparison for High-Rule-Support Accidents

Among all 315 rules, 164 were strong rules; 19 of those strong rules led to death involved or other accidents, and the remaining 145 led to injury only accidents. The following comparisons focused on the differences between death involved or other accidents and injury only accidents. In other words, the possible causal factors diverting an injury only accident to a death involved or other accident were examined. The three rules having no similarity to injury only rules and the remaining 16 strong rules are discussed in the following two parts, respectively.

1. Dissimilar death involved or other rules

There were three death involved or other rules having no similarity to injury only rules, as listed in Table 4-10. The first dissimilar rule, D1, describes young working drivers who had been drinking and might have been using cell phones while driving on a curved road with poor sight distance but with lighting. While normal drivers would lower their speeds to pass a curve safely, the leading-to-death rule suggests that the corresponding driving speeds were not low. Moreover, the curved road with poor sight distance raised the difficulty of driving. Although another 10 strong rules related to curved roads and led to injury only cases, none of them specified young drinking drivers. This might suggest that these drivers can easily misjudge the safe driving speed and cannot properly maneuver the vehicle while passing a curve with poor sight distance.

As shown in Table 4-10, the D2 and D3 rules describe death involved accidents in which the drivers were not wearing seatbelts and were possibly driving after drinking. Seatbelt use and drunk driving have long been critical policy issues for the government of Taiwan; violating either rule, especially the latter, leads to a substantial fine. Therefore, these two unlawful behaviors are expected to occur together only rarely, as described in D2 and D3. However, when both violations are committed, whether combined with an unfriendly road environment or not, a death involved case would likely occur.

TABLE 4-10 Dissimilar Strong Rules Leading to Death or Other

| Attribute¹ | D1 | D2 | D3 |
|---|---|---|---|
| Age | Young | -- | -- |
| Occupation | Working | -- | -- |
| Seat belt use | -- | Not using | Not using |
| Cell | Unknown | Unknown | -- |
| Drink | Drinking | Unknown | Unknown |
| Road type | -- | -- | Rural |
| Sight distance | Poor | -- | -- |
| Illumination | Yes | -- | Yes |
| Alignment | Curved | -- | -- |
| Severity | Death | Death | Death |

¹ Attributes that were unspecified in all three rules are omitted to save space.

2. Similar death involved or other rules

There were 16 death involved or other rules similar to injury only rules, as listed in Table 4-11. The S1 and S2 rules were the rules most similar to injury only rules; they were cited as similar rules by injury only rules 47 and 46 times, respectively.

The rule S1 describes regular-valid-licensed young male working drivers who, driving with unspecified purposes and wearing seatbelts, had been drinking alcohol and were driving around midnight on straight rural roads with low speed limits, a dry surface, median marking, and no signals. Although this describes drinking and driving behavior, drinking by itself cannot fully explain what shifted the accident to a fatal one. Some of the other strong rules also related to drinking and driving; however, as long as the drivers were not young people, it was not midnight, the quality of the corresponding road environment was not poor (i.e., it was an urban road, a road with a median island, or a road with a higher speed limit), or the surface was not dry, the accident severity was shown to be injury only. When the driver is young, the corresponding behavior could be somewhat risky, and a riskier driving environment is usually associated with midnight driving (Lin and Fearn, 2003). Moreover, a road of poor quality could not mitigate the impact of a crash, and a dry surface might encourage fast driving, especially under low traffic (midnight on rural roads). Therefore, the combined unfavorable factors led to death involved accidents.

As stated, the rule S2 describes a condition very similar to S1. These two rules were almost identical except that the rule S2 did not specify the drinking behavior but specified a road environment that may encourage fast driving through low traffic and good sight distance (driving around midnight along a straight rural road with illumination and roadside marking). Though the corresponding driver was not specified as drinking, the possibly faster driving also led to death involved accidents.

In contrast to the first two rules, the rules S3 and S4 describe accidents occurring on high-quality roads (highways or urban roads with median islands). The driving speeds on these roads are usually high, especially on highways with a minimum speed of 80 kph. Given high driving speeds combined with impaired maneuvering skills and lower situational awareness due to drinking, a death involved case is expected once an accident occurs. When compared to their similar rules, these death involved cases could have been injury only if the driver was not a young male (middle-aged, elderly or female), if the road was narrower (an urban road without roadside marking), or if the road did not mislead drivers into driving at an inappropriately high speed. Any one of these factors could reduce the driving speeds or make the drivers drive more carefully.

The rules S5, S6 and S7 describe accidents that occurred on low-speed-limit rural roads or in a low-traffic environment (midnight); in these rules the trip purposes were unspecified, the drinking conditions were unknown, and seatbelt usage was unknown. According to their similar rules, all else equal, the S5, S6 and S7 accidents became injury only if the driver did wear a seatbelt or if the driver was certainly not drinking. This highlights the injury-prevention effect of wearing a seatbelt and of avoiding the deteriorated maneuvering skills and lower situational awareness caused by drinking.

TABLE 4-11 Strong Rules Leading to Death or Other

| Attribute | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 | S13 | S14 | S15 | S16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | Young | Young | Young | Young | Young | Young | Young | Young | -- | -- | -- | -- | Young | Middle | Young | -- |
| Gender | Male | Male | -- | Male | Male | Male | -- | -- | Male | Male | Male | -- | Male | -- | -- | -- |
| License type | Regular | Regular | Regular | Regular | Regular | -- | Regular | -- | -- | Regular | Regular | -- | -- | -- | -- | -- |
| License con. | Valid | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | Valid | -- | -- | Valid | -- |
| Occupation | Working | Working | -- | Working | -- | Working | Working | Working | Working | -- | Working | Working | Working | -- | -- | -- |
| Purpose | Other | Other | Other | -- | -- | -- | -- | -- | -- | Other | -- | -- | -- | -- | -- | -- |
| Time | Midnight | Midnight | Midnight | Midnight | -- | Midnight | -- | DOP | Midnight | -- | -- | NOP | -- | -- | Midnight | Midnight |
| Protection | Using | Using | -- | Using | Unknown | Unknown | Unknown | Using | -- | -- | Unknown | -- | -- | -- | -- | Unknown |
| Cell | -- | -- | Not using | Not using | -- | Unknown | -- | -- | Unknown | Unknown | -- | Unknown | Unknown | Unknown | Unknown | -- |
| Drink | Drinking | -- | Drinking | Drinking | Unknown | -- | Unknown | -- | -- | -- | Unknown | Drinking | Unknown | Unknown | Unknown | -- |
| Road type | Rural | Rural | Urban | Highway | Rural | -- | Rural | Highway | Highway | -- | -- | -- | -- | Rural | -- | -- |
| Speed | -50 | -50 | -- | -- | -50 | -- | -- | 80+ | -- | -- | 51-79 | -50 | -- | -50 | 51-79 | 51-79 |
| Road shape | -- | Segment | Segment | Segment | Segment | Segment | Segment | Segment | -- | -- | Segment | -- | -- | Segment | -- | Other |
| Surf. status | Dry | Dry | -- | Dry | -- | -- | -- | Dry | -- | -- | -- | -- | -- | -- | -- | -- |
| Obstruction | -- | -- | No | -- | -- | No | -- | -- | -- | No | -- | -- | -- | -- | -- | -- |
| Sight dist. | -- | -- | -- | -- | Good | Good | Good | -- | -- | Poor | Good | -- | Good | -- | -- | -- |
| Signal type | No | No | -- | -- | No | -- | No | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Median | Marking | Marking | Island | -- | -- | Marking | -- | Island | Island | -- | -- | -- | -- | -- | Island | Island |
| Rd. side | -- | Yes | Yes | -- | -- | Yes | Yes | -- | -- | -- | -- | -- | No | -- | -- | -- |
| Illumination | -- | Yes | Yes | -- | Yes | -- | -- | -- | -- | No | -- | -- | Yes | -- | -- | -- |
| Alignment | Straight | Straight | Straight | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Severity | Other | Other | Other | Other | Death | Death | Death | Other | Death | Death | Death | Death | Death | Death | Death | Death |
| Similarity¹ | 47 | 46 | 18 | 16 | 11 | 7 | 7 | 7 | 3 | 3 | 3 | 2 | 1 | 1 | 1 | 1 |

¹ Similarity refers to the number of rules that were similar to this rule but led to injury only crashes.

The rule S8 describes young working people driving on highway segments with a dry surface during day off-peak periods and wearing seatbelts. When compared to the similar rules, all else equal, the accidents became injury only cases if the driver was certainly not drinking, if the driver held an occupational or military driving license, or if the trip time was during the afternoon peak hours. Only soldiers in charge of driving can obtain a military driving license; therefore, in a high-speed driving environment, drivers with occupational or military licenses are expected to be more capable than normal drivers of avoiding a fatal outcome once an accident occurs. Moreover, the traffic flow during peak hours is denser than that during off-peak hours; consequently, the corresponding driving speed is expected to be lower, and once an accident occurs, it should be less severe.

The rule S9, similar to S8, describes accidents that occurred on highways, but the drivers were specified as male drivers instead of young drivers; moreover, the trip time was around midnight rather than the day off-peak period. When compared to its similar rules, the S9 accidents could become less severe if the trip time was during afternoon peak periods. The denser traffic during peak hours might restrict the driving speed. Even though the drivers could be of high risk (young or male drivers), the environment might limit their driving speeds and the corresponding accidents might not be fatal.

The rule S10 describes regularly-licensed male drivers driving on roads with poor sight distance and without any obstructions. When compared to its similar rules, all else equal, the accidents could be less severe if there were obstructions on the roads. By definition, obstructions are any obstacles within 15 meters of the crash. This distance is much shorter than the defined safe sight distance, which is 45 meters at a normal driving speed of 40 kph, so a driver might spot the obstacles and lower his or her driving speed. Without such obstructions, on the other hand, male drivers who drive at relatively high speeds despite the poor sight distance end up in fatal accidents.

The rule S11 describes regularly-licensed working people driving on a medium-speed-limit road with good sight distance. Its similar rules suggest that these accidents could be less severe if the drivers were certainly not drinking. Similarly, the accidents under the rules S12 and S13 would be less severe if the drivers were certainly not using cell phones or not drinking and driving. The accidents under the same driving environment described by S15 were less severe if the drivers were elderly, who are usually considered to be of lower risk than young drivers. Even on a road encouraging fast driving (medium speed limit with a median island), elderly drivers might drive carefully and maintain a reasonable driving speed while young drivers might not.

The information provided by the remaining rules, S14 and S16, is relatively vague since most attributes were unspecified and all the behavioral attributes were either unspecified or unknown. Moreover, the associated similar rules were different in behavioral attributes. Therefore, it is relatively difficult to tell the differences between the selected rules and their associated similar rules.

4.4.5 Logistic Regression Analysis for the Remaining Accidents

Unlike the accident cases with strong causal relationships, the 363 accidents associated with weak-support rules or approximate rules were analyzed with regression methods to investigate the possible associations between factors and to extract the variations due to insufficient information. In particular, binary logistic regression models were adopted. The model structure was revised from the one proposed by Kim et al. (1995), where the accident severity was affected by driver characteristics, trip characteristics, behavioral factors, environmental factors, and interactions between driver and behavioral factors. Backward elimination was applied to select variables.
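For reference, a binary logit with the structure described above can be written as follows. This is a notational sketch only: the vectors $\mathbf{D}_i$, $\mathbf{T}_i$, $\mathbf{B}_i$, and $\mathbf{E}_i$ (driver, trip, behavioral, and environmental factors of accident $i$), the coefficient names, and the use of all pairwise driver-behavior products are assumptions introduced here, not the notation of Kim et al. (1995) or the specification that was finally estimated.

```latex
\ln\frac{\Pr(y_i = 1)}{1 - \Pr(y_i = 1)}
  = \beta_0
  + \boldsymbol{\beta}_D^{\top}\mathbf{D}_i
  + \boldsymbol{\beta}_T^{\top}\mathbf{T}_i
  + \boldsymbol{\beta}_B^{\top}\mathbf{B}_i
  + \boldsymbol{\beta}_E^{\top}\mathbf{E}_i
  + \boldsymbol{\gamma}^{\top}\left(\mathbf{D}_i \otimes \mathbf{B}_i\right)
```

Here $y_i = 0$ denotes the reference severity (injury only) and $y_i = 1$ the alternative severity, so a positive coefficient raises the odds of the more severe outcome. Backward elimination would start from the full set of terms and remove them one at a time.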

The reference severity was injury only and the estimation results were summarized in Table 4-12. The estimated Hosmer-Lemeshow p-value was 0.293 (> 0.100) which indicated
