Propagation Patterns - Literature Review - Exploring Congestion Propagation Patterns through Vehicle Detector Data Analysis and Visualization

Chapter 2 Literature Review

2.2 Propagation Patterns

Besides the detection of traffic states and events, how congestion propagates and cascades is one of the issues of greatest interest in research on traffic congestion.

Understanding propagation makes it possible to extrapolate the traffic state on nearby parts of the road network and to identify potential relationships between neighboring road segments. With this knowledge, we can further understand the structure of the road network and the cascading behavior of congestion once it occurs.

Long, Gao, Ren & Lian (2008) stated the importance of effectively identifying network bottlenecks for improving the network service level and preventing congestion.

Congestion is defined by critical standards based on average journey velocity (AJV), and a congestion propagation model based on the cell transmission model (CTM) is proposed.

Simulations are performed on the Sioux Falls network. The simulated results can serve as a reference for decisions on controlling traffic demand.

Wang, Z., et al. (2013) utilized GPS trajectories and provided multiple views for visually exploring and analyzing congestion at both the propagation graph level and the road segment level. The visualization shows speed variation at the pixel level, while the propagation result is shown at the road segment level.

Ji and Geroliminis (2014) observed congestion propagation on a macroscopic scale.

Using taxi GPS records as sparse probe vehicle data and the maximum connected component of congested links, they identified interconnected congested links and the critical congestion pockets. The proposed method can effectively separate the congestion pockets from the rest of the network and track the evolution of congestion through time.

2.3 Kernel Density Estimation (KDE) and Applications

An adjusted kernel density estimation approach is applied in this study. This section therefore reviews the related literature, including the general kernel density estimation method, discussions of the approach, and its applications in transportation and several other disciplines. Probability density estimation approaches can be divided into parametric and nonparametric approaches. In a parametric approach, the estimation is made based on assumptions about the distribution underlying the data set. However, there can be large gaps between the assumed parametric model and reality. As a nonparametric probability density estimation approach (Rosenblatt 1956; Whittle 1958; Parzen 1962), kernel density estimation does not require any assumption about the distribution of the data points. This provides more flexibility and allows researchers to discover more characteristics of the data set, such as its actual distribution.

Hence, kernel density estimation has proven to be of great importance in both theoretical and applied statistics. Rosenblatt (1956) and Parzen (1962) developed the current form of kernel density estimation, which is also termed the Parzen-Rosenblatt window method in fields such as signal processing and econometrics.

Yu (2009) studied KDE by investigating the most appropriate search bandwidth for six different probability distributions, evaluated by the mean integrated squared error (MISE) and the asymptotic mean integrated squared error (AMISE). Yu concluded that KDE with a variable search bandwidth can provide acceptable estimation results.

Xie and Yan (2008) proposed a network KDE method, adapted from the standard planar KDE, to address the shortcomings of planar KDE when the problem is network based. The innovation of this research is to represent the network space with lixels, which are linear units of equal network length. The approach was tested with 2005 traffic accident data and the road network of Bowling Green, Kentucky, and it was able to resolve the overestimation of density values. The impacts of different kernel functions and search bandwidths on the density calculation were also investigated; the search bandwidth was found to have the greatest influence because it controls the smoothness of the spatial pattern.

Chang (2012) applied KDE integrated with data mining to assess common physiological indicators of multiple diseases. To estimate the probability of illness for the patients being examined, KDE is applied to estimate the probability distribution of each common physiological indicator under different health conditions.

Hu (2012) established an approach for analyzing GPS trajectories and collected the trajectories of visitors in Yehliu Geopark. The possible spatial distribution of visitors within the park is calculated through KDE. In addition, the time factor is taken into account to investigate the locations of crowds and the spatial distribution of visitors. Ultimately, the density distribution of visitors within the park during different time periods is simulated.

The simulation result can be used to reconsider the space allocation and route design.

2.3.1 Standard Kernel Density Estimation (Standard KDE)

Assuming $(x_1, x_2, \ldots, x_n)$ is a univariate independent and identically distributed sample drawn from some distribution with an unknown density $f$, its kernel density estimator can be written as:

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \qquad (2.1)$$

where $K$ is the kernel function, a non-negative symmetric function that satisfies $\int_{-\infty}^{\infty} K(u)\,du = 1$. Since $K$ is a probability density function, $\hat{f}_h$ also has the properties of a probability density function.

In Eq. (2.1), $h$ is a positive number named the bandwidth or smoothing parameter. It controls the smoothness and precision of the kernel density estimate. A larger $h$ may lead to underfitting and fail to represent the shape of the real density function. By contrast, a smaller $h$ smooths the curve poorly and may lead to overfitting.

2.3.2 Planar Kernel Density Estimation (Planar KDE)

To perform density estimation for various spatial problems, the standard kernel density estimation concept is extended to 2-D planes. The general form of the planar kernel density estimator in a 2-D space can be written as:

$$\lambda(s) = \sum_{i=1}^{n} \frac{1}{\pi r^{2}}\, k\left(\frac{d_{is}}{r}\right)$$

where $\lambda(s)$ is the density at location $s$, $r$ is the bandwidth (search radius), and $k$ is the weight of a point $i$ at distance $d_{is}$ from location $s$, usually modeled as a function of the $d_{is}/r$ ratio. Instead of giving an equal weight to all points within bandwidth $r$, a distance decay effect is taken into account. That is, as the distance between a point and location $s$ increases, that point is weighted less when the overall density is calculated.

Some commonly applied kernel functions used to account for the distance decay effect are listed below (Gibin, Longley, & Atkinson, 2007; Levine, 2004):

I. Uniform (rectangular window)

II. Quartic function (approximating the Gaussian function), with common values chosen for the scaling factor K

III. Minimum variance function
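The distance decay idea can be illustrated with a minimal sketch. It assumes the planar estimator takes the common form in which every point within the bandwidth r contributes k(d_is / r) / (πr²) to the density at s. The function names, the sample coordinates, and the default scaling factor 3/π for the quartic-style kernel are illustrative assumptions, not values taken from this study.

```python
import math

def uniform_kernel(ratio):
    """Equal weight for every point inside the bandwidth (no distance decay)."""
    return 1.0 if ratio <= 1.0 else 0.0

def quartic_kernel(ratio, scale=3.0 / math.pi):
    """Quartic-style weight scale * (1 - ratio**2), zero outside the bandwidth.

    `scale` plays the role of the scaling factor K; its default is an assumption.
    """
    return scale * (1.0 - ratio ** 2) if ratio <= 1.0 else 0.0

def planar_kde(points, s, r, kernel):
    """Density at location s: sum of kernel(d_is / r) over points, divided by pi * r**2."""
    total = 0.0
    for (x, y) in points:
        d_is = math.hypot(x - s[0], y - s[1])  # Euclidean distance to s
        total += kernel(d_is / r)
    return total / (math.pi * r ** 2)

# Hypothetical point events: two inside the bandwidth, one far outside.
events = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
equal_weight = planar_kde(events, (0.0, 0.0), 1.0, uniform_kernel)
with_decay = planar_kde(events, (0.0, 0.0), 1.0, quartic_kernel)
print(equal_weight, with_decay)
```

With the quartic-style kernel, the event at distance 0.5 contributes only 75% of the weight of the event located exactly at s, whereas the uniform kernel weights both equally.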

2.3.3 Network Kernel Density Estimation (Network KDE)

To perform density estimation of point events under network constraints, network KDE was proposed (Xie & Yan, 2008). This approach differs from planar kernel density estimation in several aspects. Network KDE is a 1-D measurement, while planar KDE is a 2-D one. The network space is used as the context of the point events, and the kernel function is based on network distance instead of Euclidean distance. Hence, it performs better in density estimation where a planar KDE may over-detect clustered patterns. The general form of the network KDE can be expressed as:
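A form consistent with the description above, and with the network KDE of Xie and Yan (2008) as it is commonly cited, replaces the planar area term with the 1-D bandwidth. The notation is an assumption carried over from the planar case: λ(s) is the density at location s, r is the bandwidth measured along the network, d_is is the network distance from point event i to s, and k is the kernel function.

```latex
\lambda(s) = \sum_{i=1}^{n} \frac{1}{r}\, k\!\left(\frac{d_{is}}{r}\right)
```

Only point events whose network distance to s is within r contribute to the sum.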

2.4 Summary of Literature Review

The characteristics of the reviewed studies, in terms of data source, approach and type of road network, are summarized in Table 2.1. According to the review in the preceding sections, most studies focus on either traffic state and event detection or the propagation patterns of congestion. Furthermore, the data sources they utilize have some shortcomings.

Some of them are not open to the public, while others require high operating costs and complicated preprocessing techniques. To bridge these gaps, this study proposes an adjusted KDE approach that accounts for congestion detection, propagation patterns and visualization of vehicle detector (VD) data.

Table 2.1 Summary of Characteristics of Reviews

Chapter 3 Methodology

In this chapter, the characteristics of the adjusted network KDE are presented. The proposed approach is applied to determine the congestion cascading pattern in terms of the conditional probability of congestion incidents, and the potential relationship between adjacent road segments is investigated. The procedures for extracting network information, preprocessing VD data and performing kernel density estimation with the proposed approach are also explained in this chapter.

3.1 Adjusted Network Kernel Density Estimation (Adj. Network KDE)

In this study, we make some adjustments to the original network KDE approach.

To interpret the spatio-temporal characteristics of congestion propagation within an urban road network, the network kernel density estimation approach is applied.

Instead of using network distance, this research employs the “degree of adjacency”, based on the structure of the road network and the adjacency matrix. The locations of VDs do not follow a specific rule; a VD may be installed at the front, the middle or the end of a road segment.

Hence, VD data can only represent the whole road segment, and a precise network distance cannot be calculated. Furthermore, the conditional probability that congestion occurs on the upstream road segment given the occurrence of congestion on the downstream road segment is also considered. The adjusted form of the network KDE can be written as:

Here, s and i are both locations of VDs. In addition, each s can also be viewed as the center of several neighboring road segments, including itself, each of which contributes an effect to the adjacent locations i.

3.2 Data Description

The two main components of our data are introduced in the following sections: the representation of the urban road network structure, and the contents and preprocessing of the raw VD data. We apply the concept of the adjacency matrix to form the road network structure. String comparison is applied to filter the target VD set by our region of interest (ROI) and time intervals, and criteria are set for data preprocessing, including the elimination of erroneous records and some unit conversions.

3.2.1 Network Structure

The concept of the adjacency matrix is introduced to describe the network structure of our ROI. Most networks in previous research have been binary in nature; that is, an edge between two nodes either exists or does not (Newman 2004). A network with such an attribute can be represented by an $n \times n$ adjacency matrix $A$ with elements

$$A_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are connected,} \\ 0 & \text{otherwise.} \end{cases}$$

However, our road network is slightly different. Since all the VDs are located on road segments, our adjacency matrix is edge based. Furthermore, most of the arterials in our road network are bidirectional, so the direction of traffic is also considered.

That is, we have an $n \times n$ adjacency matrix $A$ with elements $A_{du}$, where $d$ is the entrance of a downstream road segment with respect to the exit of an upstream road segment $u$; $A_{du} = 1$ if the two are connected and 0 otherwise. The adjacency matrix of our ROI is constructed following this concept of adjacency, and part of it is shown in Figure 3.2. For some of the road segments, no VD is installed.

Another matrix containing the turning information is also constructed at this stage, as shown in Table 3.2. The turning information is extracted from the VD reference data set, and each relationship is tagged as S (straight), L (left turn) or R (right turn). For relationships that cannot be identified this way, coordinate information in terms of longitude and latitude is applied.

Figure 3.1 Road Network Example

Table 3.1 Adjacency Matrix Example

Rows are downstream road segments; columns are upstream road segments.

                                   Upstream arterial    A      B      C      D      …
                                   Upstream ID          A2E    B2E    C2N    D2N    …
                                   Upstream direction   East   East   North  North  …
                                   Upstream segment     CD     CD     AB     AB     …
Downstream:
Arterial  ID    Direction  Segment
A         A2E   East       CD                           1      0      1      0
B         B2E   East       CD                           0      1      0      0
C         C2N   North      AB                           0      0      1      0
D         D2N   North      AB                           0      1      0      1
…         …     …          …                            …

Figure 3.2 Part of the 1st Order Adjacency Matrix of Our ROI

(Upstream VD IDs shown in this part: VHSIP20, VHNJV20, VHMKV20, VHMM620, VHMML20, VFZK620, VG6J520; upstream direction: East.)

This study focuses on 1st and 2nd order adjacency and therefore requires the 1st and 2nd order adjacency matrices and the turning information. We multiply the 1st order adjacency matrix by itself to obtain the 2nd order adjacency matrix. In the 2nd order adjacency matrix, an element with value 1 is termed a 2nd order adjacency. A 2nd order adjacency relationship indicates that two road segments are connected through another road segment.
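The matrix product described above can be sketched as follows. The three-segment chain (X feeds Y, Y feeds Z) is a hypothetical example, with rows as downstream segments and columns as upstream segments as in Table 3.1. Keeping only pairs that are not already 1st order adjacent is an assumption made here for clarity, since the product of a matrix with a nonzero diagonal also reproduces the 1st order links.

```python
def matmul(a, b):
    """Plain square matrix product for small lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def second_order_adjacency(adj):
    """Mark pairs reachable through one intermediate segment.

    A pair is flagged when the squared matrix is nonzero there and the pair is
    not already 1st order adjacent (an assumption of this sketch).
    """
    squared = matmul(adj, adj)
    n = len(adj)
    return [[1 if squared[i][j] > 0 and adj[i][j] == 0 else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical 1st order adjacency: rows = downstream (X, Y, Z), columns = upstream.
first_order = [
    [1, 0, 0],  # X receives traffic only from itself
    [1, 1, 0],  # Y receives traffic from X
    [0, 1, 1],  # Z receives traffic from Y
]
second_order = second_order_adjacency(first_order)
print(second_order)
```

In this example only the pair (Z, X) is flagged, because Z is connected to X through Y.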

3.2.2 VD Data Processing

The dataset contains high-resolution VD data (recorded every 5 minutes) in Taipei City from January 2015 to March 2017, provided by the Traffic Control Center of the Taipei City Traffic Engineering Office. Figure 3.3 shows part of the raw data. Some preprocessing must be done to extract the target data of interest.

The raw data contain the device ID, date and time, lane order, volume and travel speed of large vehicles and regular passenger cars, lane occupancy, and average interval between vehicles. There are also columns for motorcycles; however, no motorcycles are actually detected. Table 3.3 explains the components of the VD data that are useful for this study. We use average travel speed as our major indicator of traffic congestion. The average travel speed is calculated by converting the large vehicle volume into passenger car volume based on the passenger car unit. Travel speeds on different lanes within a road segment are averaged. In our analysis, data are filtered by ROI and by the time interval of interest (different peak periods on weekdays).
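One possible reading of this calculation is sketched below: large vehicles are converted to passenger car units (PCU), the lane speed is the volume-weighted mean of the two vehicle classes, and the segment speed is the mean over lanes. The PCU factor of 2.0 and the function names are hypothetical; the factor actually used in this study is not stated here.

```python
def pcu_weighted_speed(big_volume, big_speed, car_volume, car_speed, pcu_big=2.0):
    """Volume-weighted mean speed of one lane, large vehicles converted to PCU.

    `pcu_big` is a hypothetical passenger car unit factor for large vehicles.
    """
    big_pcu = big_volume * pcu_big
    total = big_pcu + car_volume
    if total == 0:
        return None  # no vehicles detected in this interval
    return (big_pcu * big_speed + car_volume * car_speed) / total

def segment_speed(lane_speeds):
    """Average the lane speeds of one road segment, skipping empty lanes."""
    valid = [v for v in lane_speeds if v is not None]
    return sum(valid) / len(valid) if valid else None

# Hypothetical lane records for one 5-minute interval.
lane_1 = pcu_weighted_speed(big_volume=1, big_speed=40.0, car_volume=2, car_speed=30.0)
lane_2 = pcu_weighted_speed(big_volume=0, big_speed=0.0, car_volume=5, car_speed=25.0)
lane_3 = pcu_weighted_speed(big_volume=0, big_speed=0.0, car_volume=0, car_speed=0.0)
print(segment_speed([lane_1, lane_2, lane_3]))
```

Lanes with no detected vehicles are skipped rather than counted as zero speed, so an empty lane does not look like congestion.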

Figure 3.3 Raw VD Data

Table 3.3 Contents of Columns

Column       Contents
DeviceID     Name of the vehicle detector
DateTime2    Tag of date and time
LaneOrder    Number of the lane
BigVolume    Volume of large vehicles
BigSpeed     Speed of large vehicles
CarVolume    Volume of regular passenger cars
CarSpeed     Speed of regular passenger cars
LGID         Identifier of the direction of traffic

The two preprocessing procedures for VD data differ slightly in how the travel speed criterion is set. Two criteria are used in this study: Level of Service (LOS) C and average travel speed. The data preprocessing procedure is described as follows.

I. Filtering the data of the set of VDs based on our ROI and target time intervals by string matching techniques.

Sample record from the raw VD data: DEVICEID = VGUEI60, LANEORDER = 1, BIGVOLUME = 1, BIGSPEED = 41, CARVOLUME = 12, CARSPEED = 30.67, MOTORVOLUME = 0, MOTORSPEED = 0, AVGSPEED = 30.62, LANEOCCUPY = 3.5, DATETIME2 = 2015/1/1 00:05, RATE = 240, AVGINT = 161.5, LGID = 0.

Our ROI includes 96 road segments and 66 VDs. After the time format of the raw data is unified, string matching can be performed. Weekday data and weekend data are then separated.

II. Ignoring missing data and removing erroneous data caused by malfunctioning VD devices.

Erroneous data here mean records whose values are obviously unreasonable.

Examples include travel speeds that remain zero for several days, even during peak hours, and travel speeds that exceed the speed limit by more than 40%.

III. Constructing the incident chart for different criteria respectively.

A. For LOS C, according to Section 19.6 of the 2011 Taiwan Highway Capacity Manual (Transportation Planning Division, 2011), travel speed can be used to determine the LOS of urban road networks with different speed limits. The complete criteria are shown in Table 3.4. We take LOS C, which transportation management agencies often regard as the standard of light congestion, as our threshold. Under this state, in addition to facing more restrictions on lane changes, drivers and motorists also experience a certain degree of tension. In this study, congestion is recorded if the travel speed is lower than 30 km/hr.

The differences between the actual travel speed and the LOS C threshold are also calculated.

Table 3.4 LOS Criteria for Urban Road Network with 50 km/hr Speed Limit

B. For the average travel speed criterion, the normal traffic condition should be defined before incidents are identified, so we first construct a baseline for reference. The baseline is set to the weekly average travel speed within the week of the targeted time interval. The difference between the actual travel speed and the baseline value is calculated. Speeds lower than 80% of the baseline value are recorded.

IV. As a preparation step for further processing, the data recorded in step III are transformed into a binary data structure. A negative difference between the actual travel speed and the threshold is assigned 1, and all other values are assigned 0. The value 1 shows that a VD detected a possible congestion or incident during a certain time interval, while 0 indicates an acceptable level of service.
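Steps III and IV can be sketched as follows for one VD. The 30 km/hr threshold and the 80% ratio come from the text above; the function names, the sample speeds and the flat baseline of 40 km/hr are hypothetical, and the sketch assumes missing and erroneous records were already removed in step II.

```python
LOS_C_THRESHOLD = 30.0  # km/hr, the LOS C threshold used in step III.A
BASELINE_RATIO = 0.8    # step III.B records speeds below 80% of the baseline

def binary_by_los_c(speeds, threshold=LOS_C_THRESHOLD):
    """1 where the speed minus the threshold is negative, else 0."""
    return [1 if v - threshold < 0 else 0 for v in speeds]

def binary_by_baseline(speeds, baseline, ratio=BASELINE_RATIO):
    """1 where the speed falls below `ratio` times the baseline of that interval."""
    return [1 if v - ratio * b < 0 else 0 for v, b in zip(speeds, baseline)]

# Hypothetical 5-minute average travel speeds (km/hr) for one VD.
speeds = [42.0, 29.5, 30.0, 18.0]
baseline = [40.0, 40.0, 40.0, 40.0]  # hypothetical weekly average per interval

los_flags = binary_by_los_c(speeds)
baseline_flags = binary_by_baseline(speeds, baseline)
print(los_flags, baseline_flags)
```

A speed of exactly 30 km/hr is not flagged under LOS C, because only negative differences are assigned 1, but it is flagged under the baseline criterion in this example because it is below 80% of 40 km/hr.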

3.3 Analysis Procedure

Based on the road network structure and the data preprocessing, we obtain the 1st order and 2nd order adjacency relationships of the road segments and the binary incident chart of the VDs within our ROI. The analysis procedure is explained as follows and shown in Figure 3.4.

I. Detecting incidents

Based on the binary incident chart obtained from the data preprocessing stage, a cell with value 1 indicates that a possible congestion or incident takes place. For a single VD, a sequence of value 1 that lasts for at least 4 time intervals (20 minutes) is defined as a possible congestion incident.

II. Calculating the conditional probability that incidents occur on neighboring road segments p_is

The duration of each congestion incident is recorded in step I. During the congestion incident on a certain road segment, the number of consecutive time intervals identified as congested on each neighboring road segment is also recorded.

We define the ratio of the latter (on the upstream adjacent road segment) to the former (on the downstream road segment) as the conditional probability that a neighboring road segment is affected by the congested road segment.

III. Calculating the kernel density at each road segment

Based on the result of step II and the adjacency relationships obtained from data preprocessing, the kernel density can be calculated through Equation (3.1). A simple example is provided for illustration. For the road network shown in Figure 3.5, congestion occurs on the target road segment TG, on road segment 1,R, which is in the 1st order right turn relationship with respect to TG, and on road segment 2,SR, which is in the 2nd order straight-right turn relationship with respect to TG. Two congestion incidents were detected on TG; one started at 6:45 AM and ended at 7:15 AM, while the other started at 8:20 PM and ended at 8:50 PM. Both lasted for six time intervals (30 minutes). How p_is is obtained for these two congestion incidents is shown in Figure 3.6(a) and Figure 3.6(b), respectively. Four and three congested intervals were detected on 1,R during the two congestion incidents on TG, respectively. Three and three congested intervals were detected on 2,SR during the two congestion incidents on TG, respectively.

Figure 3.4 Analysis Procedure

Figure 3.5 Example Road Network for KDE

Segment   ~6:50   ~6:55   ~7:00   ~7:05   ~7:10   ~7:15   p_is
1,R                                                       4/6
2,SR                                                      3/6

Figure 3.6(a) p_is Calculation Example 1

Segment   ~8:25   ~8:30   ~8:35   ~8:40   ~8:45   ~8:50   p_is
1,R                                                       3/6
2,SR                                                      3/6

Figure 3.6(b) p_is Calculation Example 2
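Steps I and II of the analysis procedure can be sketched as follows, using the first example incident (six intervals on TG, four congested intervals on 1,R and three on 2,SR). Which particular intervals are congested on the neighboring segments is a hypothetical choice here, and the sketch counts all congested intervals inside the incident window, which is an assumption about how the count is taken.

```python
MIN_RUN = 4  # at least 4 consecutive 5-minute intervals (20 minutes)

def detect_incidents(flags, min_run=MIN_RUN):
    """Return (start, end) index pairs, end exclusive, for runs of 1s of length >= min_run."""
    incidents, start = [], None
    for idx, flag in enumerate(list(flags) + [0]):  # sentinel 0 closes a trailing run
        if flag == 1 and start is None:
            start = idx
        elif flag != 1 and start is not None:
            if idx - start >= min_run:
                incidents.append((start, idx))
            start = None
    return incidents

def conditional_probability(neighbor_flags, incident):
    """Congested intervals on the neighbor during the incident, divided by its duration."""
    start, end = incident
    return sum(neighbor_flags[start:end]) / (end - start)

# Binary incident chart rows; index 1..6 corresponds to ~6:50 through ~7:15.
tg   = [0, 1, 1, 1, 1, 1, 1, 0]
r1   = [0, 0, 1, 1, 1, 1, 0, 0]  # segment 1,R (hypothetical placement of the 4 intervals)
sr2  = [0, 0, 0, 1, 1, 1, 0, 0]  # segment 2,SR (hypothetical placement of the 3 intervals)

incidents = detect_incidents(tg)
p_1r = conditional_probability(r1, incidents[0])
p_2sr = conditional_probability(sr2, incidents[0])
print(incidents, p_1r, p_2sr)
```

A run of only three congested intervals, such as `[1, 1, 1, 0]`, is not reported as an incident because it is shorter than 20 minutes.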

Chapter 4 Case Study

A case study is performed by applying the proposed algorithm to the urban road network of Taipei City. The dataset covers a real arterial network in part of the Da-an District, Taipei City. The VD data from January 2015 to March 2017 are provided by the Traffic Control Center of the Taipei City Traffic Engineering Office. The point location of each VD is paired with a road segment. The network of our ROI is analyzed in this chapter in terms of the kernel density of congestion. Different scenarios are investigated, including a day with a special event and a week during the construction of bike lanes.

The analysis of each scenario is organized into an overview, a segment-wise perspective and a summary. The criteria of LOS C and average travel speed are analyzed separately.

4.1 Descriptions of the Case Study

Our ROI is bounded by 5 arterials within Taipei City. The boundaries are listed in Table 4.1. The ROI and the locations of the installed VDs are shown in Figure 4.1, where road segments are represented by thicker lines and square dots represent VDs. Our ROI contains 96 road segments, with different traffic directions counted separately. A total of 66 VDs are installed within this road network. There are no VDs on 30 road segments, while multiple VDs are installed on some road segments. Data from weekday peak hours are extracted for analysis. The VDs and their corresponding numbers and road segments are listed in Table 4.2.

Table 4.1 Boundaries of ROI

Boundary   Arterial
North      Jen-Ai Rd.
South      1. Xin-Hai Rd.  2. Roosevelt Rd.
West       Hang-Zhou S. Rd.
East       1. An-He Rd.  2. Le-Li St.

Figure 4.1 Road Network of the ROI and Location of VDs

Table 4.2 VDs in The ROI and Their Corresponding ID and Road Segments

ID NO Arterial Dir Block ID NO Arterial Dir Block

The following analysis is based on the kernel density estimation result for our ROI. Part of the kernel density estimation result for scenario 1 under the average travel speed criterion is shown in Table 4.3. For each ID, its adjacent road segments and
