
Exploring the Propagation Pattern of Traffic Congestion Through Analyzing and Visualizing Vehicle Detector Data


Academic year: 2022


Department of Civil Engineering, College of Engineering, National Taiwan University

Master Thesis

Exploring the Propagation Pattern of Traffic Congestion Through Analyzing and Visualizing Vehicle Detector Data

許家維 Chia-Wei Hsu

Advisor: Yu-Ting Hsu (許聿廷), Ph.D.

July 2018


Oral Examination Committee Certification


Acknowledgements

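It is honestly a little hard to believe that the writing of this thesis is finally drawing to a close. I think I will miss these quiet, focused days. Many things in life are like the wake behind a ship: they only seem beautiful once they have passed, and my time in the Transportation Division is one of them. In the summer between my junior and senior years I plucked up the courage to knock on the office door of the younger Professor Hsu (小許老師). During my interview, the elder Professor Hsu (大許老師) had me compose a poem on the spot. While doing coursework and research I cursed at code that would not produce results. There were countless late nights, the writing of the thesis, and the oral defense. All of these scenes are still vivid in my mind. Along the way I felt lost and confused; I thought about running away and about giving up, and I even wondered what on earth I was doing. But whenever I recalled Einstein's words, the lows and the disappointment faded away: if we knew what we were doing, it would not be called research, would it? Although much in this study remained beyond my ability, I am glad that I persevered to the very end. Years from now I may forget the details of this thesis, but I will always remember those who helped me and those who supported me all along.

In the blink of an eye I have spent six full years at National Taiwan University. I sincerely thank the teachers I met along the way, who made academic rigor, critical thinking, and humanistic concern part of my life and shaped them into a philosophy of life and an attitude toward the world. I am especially grateful to Professor Yu-Ting Hsu (許聿廷), who brought me into the field and supervised this thesis. He not only taught theory with tireless patience but also gave his students room to explore in their research, along with inspiration and support, and showed me what a good teacher looks like. I thank Professor Albert, Professor 朱致遠, Professor 賴勇成, and the elder Professor Hsu for their encouragement, guidance, and even wake-up calls in class, which taught me to be humble and down-to-earth, and yet still to dream. I thank Adrian and Professor Way for introducing me to English writing and for countless valuable comments that improved this thesis.

While studying congestion, I often found myself stuck in the mud, at a standstill and unable to find a way out. Fortunately I had all of you, and every difficulty was resolved. I thank my partners in Professor Hsu's group: 柏傑, who always led the way, and 至佑, who was always by my side, so that I had someone to guide me and keep me company when I felt helpless; and 乃慈, 毓軒, 思文, 書廷, 宜萱, 香吟, and 文宇, who looked after me like older brothers and sisters, so that I felt the warmth of home even while living away from it. Thank you, 士淵, for our heart-to-heart talks and for the 茶湯會 meetups we kept every Wednesday no matter how busy we were. Thank you to my lab mates 任宏, 明華, 子皓, 韻如, 柏維, 郁方, 子鈞, 依穎, Umav, 俊嘉, 大包, 婉菁, 晟松, and 洵顏 for helping one another with coursework, teaming up on the court, cheering me on, and literally feeding me. Thank you to my juniors 譽仁, 明儀, 薇亘, 子鈺, 智勛, 冠頡, 儒斌, 浩雅, 妤庭, and 祝銘, with whom I often exchanged ideas and who gave me many suggestions. Thank you to 彥伯, 仲皓, 承勳, 志浩, 祝姐, 星魚, 信總, my partners of the 34th cohort, and the 小浩浩 of the 36th cohort of 水服, for making my blue sea and sky possible. I thank my parents, who raised and supported me, for giving me a healthy body and a childhood free from want, so that I could concentrate on my studies without worry. Thank you to the staff of the Traffic Control Center for providing the data needed for this research and to my senior 瓅凱 for technical assistance; today's results would not exist without them. I thank the two oral examination committee members and Senior Engineer 林育生 for their encouragement and valuable suggestions, and I sincerely hope that the modest views in this thesis will prompt more people to pay attention to, discuss, and explore these issues.

August 6, 2018, my 693rd day holding the fort on the third floor of the Civil Engineering Building. Reluctant as I am, I must leave in the end and set out on a brand-new journey. I hope I never forget the dream I carried in my heart that year, that day, when I set out.

Never forget why you started.

I wish peace and happiness to all the friends who care about me.

Chia-Wei Hsu (許家維)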


Chinese Abstract

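For traffic management, monitoring road traffic conditions is an important task, and how to monitor and avoid road congestion is one of its main concerns. Causes of congestion include increased traffic volume due to higher private vehicle usage, insufficient capacity or poor design of the road system, and lane capacity reductions caused by accidents or construction. The impacts of congestion include longer commuting times, negative effects on drivers' emotions, lower quality of life, and potential threats to emergency response. Therefore, understanding in depth the extent and aspects affected by congestion, identifying possible bottlenecks in the road system, and providing reliable information to road users and traffic management agencies can support more proactive congestion prevention and improve traffic management strategies.

In the literature, traffic state and event detection, congestion propagation patterns, and data visualization based on different data sources have each been studied and discussed. Based on the analysis of high-resolution vehicle detector data, this study brings together data processing, pattern recognition, and visualization to build a complete congestion analysis framework. The raw vehicle detector data are first cleaned to handle missing and erroneous records, and the specific data needed for subsequent analysis are extracted. The spatio-temporal locations of congestion occurrences are then defined according to different congestion thresholds. The road network is built from the actual locations of the vehicle detectors, map data, and the concept of the adjacency matrix. An adjusted kernel density estimation method is then used to analyze congestion propagation patterns. Case studies over different time periods examine how upstream segments with first-order and second-order adjacency and different turning movements are affected by a downstream congestion source. From these, congestion propagation patterns and estimation principles are summarized for reference, and the variation of traffic flow data is presented through visualization.

Through the different scenario settings of the case studies and visualization at different scales, the locations in the network with a higher probability of congestion can be observed on the maps, along with the direction of propagation and the degree of impact after congestion occurs on each source segment. The phenomena observed on most segments in the network follow some general principles. Among upstream segments affected by congestion, first-order adjacent segments are affected more than second-order adjacent segments. For the same degree of adjacency, segments that go straight into the downstream segment are affected more than those that turn left into it, which in turn are affected more than those that turn right into it.

Keywords: vehicle detector, kernel density estimation, traffic state detection, congestion propagation pattern, visualization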


ABSTRACT

The monitoring of roadway traffic conditions is critical for traffic management, and the detection of traffic congestion is one of its major concerns. Traffic congestion may have various causes, including increased traffic volume due to higher private vehicle usage, inappropriate design or insufficient capacity of the road network, and layout changes on road segments owing to non-recurrent incidents such as traffic accidents or construction work. Its impacts include longer commuting times, negative effects on drivers' emotions, lower quality of life, and potential hazards to emergency response. Hence, a further understanding of how traffic congestion forms, propagates, and dissipates, together with the identification of possible bottlenecks, is critical for overall traffic management. Based on this knowledge, it is possible to provide drivers and traffic management agencies with reliable information to prevent traffic congestion more actively and thereby improve the quality of traffic management strategies.

In the current literature, traffic state detection, congestion propagation patterns, and traffic data visualization have each been studied and discussed. Based on high-resolution VD data, this study integrates data processing, pattern recognition, and visualization to develop a data analysis framework for a better understanding of traffic congestion in an urban network. Data cleaning is first performed to deal with missing and erroneous data, and then the specific data set needed for further analysis is extracted. Based on different thresholds of congestion detection, the spatio-temporal locations of congestion occurrences are recorded. The network structure is built from the actual locations of the VDs, map data, and the concept of the adjacency matrix. An adjusted kernel density estimation approach is proposed and applied to case studies in order to investigate the effects of congestion propagation on road segments with different characteristics in terms of connection type and adjacency.

Finally, general principles describing the propagation pattern of traffic congestion are summarized and presented through data visualization.

Based on the different scenarios of the case study and visualization results at different scales, the locations with a higher probability of congestion in the network, as well as the propagation direction and impact after congestion occurs, can be observed. Most of the road segments within the network follow some general principles. In terms of the impact on upstream road segments, segments of 1st order adjacency are affected more than segments of 2nd order adjacency. In addition, among segments of the same order of adjacency, the segment that goes straight into the congested segment is affected most by the source, the segment with a left turn comes second, and the segment with a right turn is affected least.

Keywords: congestion propagation pattern, kernel density estimation, traffic state detection, vehicle detectors, visualization


CONTENTS

Oral Examination Committee Certification

Acknowledgements

Chinese Abstract

ABSTRACT

CONTENTS

LIST OF FIGURES

LIST OF TABLES

Chapter 1 Introduction

1.1 Background

1.2 Research Objectives

1.3 Thesis Organization

Chapter 2 Literature Review

2.1 Traffic State and Event Detection

2.2 Propagation Patterns

2.3 Kernel Density Estimation and Applications

2.3.1 Standard Kernel Density Estimation

2.3.2 Planar Kernel Density Estimation

2.3.3 Network Kernel Density Estimation

2.4 Summary of Literature Review

Chapter 3 Methodology

3.1 Adjusted Network Kernel Density Estimation

3.2 Data Description

3.2.1 Network Structure

3.2.2 VD Data Processing

3.3 Analysis Procedure

Chapter 4 Case Study

4.1 Descriptions of the Case Study

4.2 Result Analysis

4.2.1 Scenario 1: 2015/12/28~2015/12/31

4.2.2 Scenario 2: 2016/4/18~2016/4/22

4.3 Summary of Insights from Case Study

Chapter 5 Conclusions and Future Work

5.1 Conclusions

5.2 Future Work

REFERENCE


LIST OF FIGURES

Figure 1.1 Planned Bike Lane Network in Downtown Taipei

Figure 1.2 Layout of Widened Sidewalk on Fu-Xing S. Road

Figure 1.3 Real Time Traffic Status of Taipei City

Figure 1.4 Thesis Organization

Figure 3.1 Road Network Example

Figure 3.2 Part of the 1st Order Adjacency Matrix of Our ROI

Figure 3.3 Raw VD Data

Figure 3.4 Analysis Procedure

Figure 3.5 Example Road Network for KDE

Figure 3.6 $p_{is}$ Calculation Example

Figure 4.1 Road Network of the ROI and Location of VDs

Figure 4.2 KDE of Scenario 1 with LOS C (2015/12/28~30)

Figure 4.3 KDE of Scenario 1 with LOS C (2015/12/31)

Figure 4.4 Upstream Influence from The Congestion of Segment 33 (S1_C)

Figure 4.5 Upstream Influence from The Congestion of Segment 43 (S1_C)

Figure 4.6 Upstream Influence from The Congestion of Segment 44 (S1_C)

Figure 4.7 KDE of Scenario 1 with Average Travel Speed (2015/12/28~30)

Figure 4.8 KDE of Scenario 1 with Average Travel Speed (2015/12/31)

Figure 4.9 Upstream Influence from The Congestion of Segment 40 (S1_avg)

Figure 4.10 Upstream Influence from The Congestion of Segment 44 (S1_avg)

Figure 4.11 Upstream Influence from The Congestion of Segment 56 (S1_avg)

Figure 4.13 Upstream Influence from The Congestion of Segment 33 (S2_C)

Figure 4.14 Upstream Influence from The Congestion of Segment 39 (S2_C)

Figure 4.15 Upstream Influence from The Congestion of Segment 84 (S2_C)

Figure 4.16 KDE of Scenario 2 with Average Travel Speed (2016/4/18~22)

Figure 4.17 Upstream Influence from The Congestion of Segment 40 (S2_avg)

Figure 4.18 Upstream Influence from The Congestion of Segment 46 (S2_avg)

Figure 4.19 Upstream Influence from The Congestion of Segment 58 (S2_avg)

Figure 4.20 Upstream Influence from The Congestion of Segment 56 (S2_avg)


LIST OF TABLES

Table 2.1 Summary of Characteristics of Reviews

Table 3.1 Adjacency Matrix Example

Table 3.2 Turning Matrix Example

Table 3.3 Contents of Columns

Table 3.4 LOS Criteria for Urban Road Network with 50 km/hr Speed Limit

Table 4.1 Boundaries of ROI

Table 4.2 VDs in The ROI and Their Corresponding ID and Road Segments

Table 4.3 Detailed KDE Result of Scenario 1 with Average Travel Speed


Chapter 1 Introduction

1.1 Background

Traffic congestion is one of the main focuses of traffic management. It is a state in which traffic demand exceeds roadway capacity. The characteristics of traffic congestion in urban road networks can be quite different from those on freeways because of traffic signals, intersections, and the complexity of road networks. Traffic congestion can be further divided into recurrent congestion, which usually occurs during peak hours, and non-recurrent congestion, which results from incidents such as traffic accidents, road construction, and large events.

Researchers are interested in several related topics, including the formation of traffic congestion, the estimation of its negative effects, the bottlenecks in the road network, and the strategies to prevent and ease congestion. To address these topics, congestion incidents first need to be identified from traffic data. How to detect traffic congestion through a systematic approach to collecting and analyzing traffic data has been a key issue for traffic management. In previous studies, traffic data were often extracted from loop detectors. However, loop detectors have some obvious shortcomings, such as difficult maintenance and high malfunction and misdetection rates. Hence, other types of detection facilities, for example electronic toll collection (ETC) sensors, monitors, and microwave vehicle detectors (VDs), have been installed. The ETC system has been operating on the freeways in Taiwan since 2014. Besides improving the service level of the freeway system, it also contributes to the collection of a large amount of traffic data. These data can be used for traffic management and are open to both academia and individuals for extended applications.

In Taipei City, vehicle detectors are widely installed within the urban road network, and high-resolution traffic data are collected, including point travel speed, traffic volume, and occupancy. The daily VD data are provided free of charge on the governmental open data platform, the Data.Taipei website. Through the investigation of these data, the characteristics of traffic flows can be observed and a baseline traffic condition can be determined. By comparing the traffic data of a set of target VDs within a Region Of Interest (ROI) during a certain time interval with the baseline, congestion incidents can be detected. Traffic congestion may manifest as a chain reaction, forming a shockwave across a certain extent of a roadway network (Li, She, Luo, & Yu, 2013). Some studies on traffic congestion forecasting have employed pheromone communication models (Kurihara, Tamaki, Numao, Yano, Kagawa, & Morita, 2009), density wave models (Nagatani, 2002), and so on. Understanding how a congestion incident propagates throughout a network and dissipates, based on the exploration of real data, is a research direction that can further enhance urban traffic management.
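The baseline comparison described above can be sketched as follows. This is only an illustrative sketch: the record layout keyed by `(vd_id, time_slot)`, the function names, and the threshold `ratio=0.6` are assumptions made for the example and are not taken from the thesis, whose actual congestion thresholds are defined in later chapters.

```python
from statistics import mean

def baseline_speeds(history):
    """Average speed per (vd_id, time_slot) over several historical days.

    history: list of dicts, one per day, mapping (vd_id, time_slot) -> km/h.
    """
    grouped = {}
    for day in history:
        for key, speed in day.items():
            grouped.setdefault(key, []).append(speed)
    return {key: mean(values) for key, values in grouped.items()}

def detect_congestion(observed, baseline, ratio=0.6):
    """Return the records whose speed falls below ratio * baseline speed.

    ratio is a hypothetical threshold used only for illustration.
    """
    return sorted(
        key for key, speed in observed.items()
        if key in baseline and speed < ratio * baseline[key]
    )

# Made-up speeds for two detectors in one time slot on two past days.
history = [
    {("VD01", "08:00"): 40.0, ("VD02", "08:00"): 36.0},
    {("VD01", "08:00"): 44.0, ("VD02", "08:00"): 34.0},
]
observed = {("VD01", "08:00"): 20.0, ("VD02", "08:00"): 33.0}

baseline = baseline_speeds(history)
congested = detect_congestion(observed, baseline)
print(congested)
```

In this toy input only the first detector drops well below its own baseline, so only that record is flagged.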

To provide pedestrians and cyclists with a safer environment, the Taipei City government has been implementing the bike lane network plan since 2014. Considering departure efficiency, that is, the time needed to clear the queue at traffic signals, three north-south arterials and three east-west arterials were selected. Each of them is at least 40 meters wide, and metro routes pass through four of them. The planned network is shown in Figure 1.1. For those with wider sidewalks, for example, Jen-Ai Road and Zhong-Shan N. Road, marking lines for bike lanes are painted on the original


are drawn. The layout of a widened sidewalk with a bike lane is shown in Figure 1.2.

Residents reported congestion and inconvenience during the bike lane construction on Fu-Xing S. Road and Xin-Sheng S. Road from March to September 2016. According to the travel speeds collected from vehicle detectors, travel speed decreased slightly, by 6.49% to 7.91%, during the construction, and the service level was degraded (Taipei City Traffic Engineering Office, 2016). However, the service level almost recovered after the construction work was completed. Hence, whether traffic flow characteristics and congestion propagation patterns differ between arterials under construction and other arterials is worth investigating. Moreover, a more detailed understanding of the relationships among neighboring road segments may provide traffic management agencies and individuals with valuable information for evaluating the influence of construction decisions, determining traffic management strategies, and providing navigation. Hence, high-resolution VD data collected during the construction in an ROI covering the arterials under construction can be extracted for further analysis. Characteristics of the congestion propagation pattern can then be observed, including the conditional probability that congestion occurs given the occurrence of another congestion, the potential relationship between adjacent road segments, and how traffic congestion on one road segment contributes to congestion on others.


Figure 1.1 Planned Bike Lane Network in Downtown Taipei

Figure 1.2 Layout of Widened Sidewalk on Fu-Xing S. Road


In this study, we seek a better understanding of how traffic congestion propagates through and influences a roadway network. The traffic control center of Taipei City provides a system for real-time traffic status inquiry that plots road performance information on a Google Map, as partly shown in Figure 1.3. Straightforward information can be extracted from the collected traffic data (Chen, Guo & Wang, 2015), while the cascading traffic pattern may further suggest driver behavior of diverting to circumvent congested road segments. Hence, the main purpose of this study is to investigate more deeply the subsequent effects of congestion. To monitor where and when traffic congestion occurs, we take point vehicular speed as the primary consideration. Based on the traffic data collected from vehicle detectors (VDs), we cluster these data by capturing the spatiotemporal variation of vehicular speed over the network so as to identify congestion incidents. Based on the identified congestion incidents, the affected road segments can be further determined.

Figure 1.3 Real Time Traffic Status of Taipei City


Ultimately, this research seeks to investigate the propagation of congestion incidents within an urban road network. By visualizing the bottlenecks and the shockwave after congestion occurs, we provide research insights so that precautionary traffic management strategies may be taken.

1.2 Research Objectives

In this study, we expect to gain a further understanding of the cascading pattern of traffic congestion based on high-resolution VD data, which may serve as a reference for determining traffic management strategies. The real-time traffic status inquiry system provided by the traffic control center of Taipei City and Google Maps visualize instantaneous traffic status in terms of vehicular speeds over the roadway network via a web-based inquiry interface, but we are more interested in the probability of congestion passing to neighboring areas. More specifically, the research objectives are summarized below:

I. Propose an adjusted probability density estimation approach to properly compute the conditional probability that congestion occurs on a certain road segment given the occurrence of another congestion.

II. Determine the potential relationship between adjacent road segments based on the degree of adjacency and the turning pattern (straight, left turn, or right turn).

III. Visualize the density estimation result and discuss how congestion on a road segment contributes to adjacent road segments and affects neighboring areas.
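As a rough illustration of the conditional probability named in objective I, the sketch below estimates it as a simple co-occurrence frequency over time intervals. This is an assumption made for the example: the thesis's own calculation procedure is not given in this part of the text and may differ (for instance by allowing time lags), and the function name and flag series are hypothetical.

```python
def conditional_congestion_probability(upstream, downstream):
    """Estimate P(upstream congested | downstream congested).

    upstream, downstream: equal-length lists of 0/1 flags, one per time
    interval. Returns None when the downstream segment is never congested.
    """
    if len(upstream) != len(downstream):
        raise ValueError("series must have the same length")
    downstream_count = sum(downstream)
    if downstream_count == 0:
        return None
    # Count intervals in which both segments are flagged as congested.
    joint_count = sum(1 for u, d in zip(upstream, downstream) if u and d)
    return joint_count / downstream_count

# Made-up congestion flags for eight consecutive time intervals.
downstream_flags = [0, 1, 1, 1, 0, 1, 0, 0]
upstream_flags = [0, 0, 1, 1, 0, 1, 1, 0]

p_is = conditional_congestion_probability(upstream_flags, downstream_flags)
print(p_is)
```

Here the downstream segment is congested in four intervals and the upstream segment is congested in three of those four.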


1.3 Thesis Organization

Figure 1.4 illustrates the organization of this thesis, which consists of five chapters. Chapter 2 reviews the literature on traffic state and event detection, the propagation patterns of congestion, and the applications of different forms of Kernel Density Estimation (KDE). Based on the gaps identified in Chapter 2, Chapter 3 proposes an adjusted form of the KDE approach, describes the data used for analysis, and presents the procedure for applying the proposed approach. Next, case studies using the road network of Taipei City are performed and the results are visualized in Chapter 4. Finally, conclusions and recommendations for future research are summarized in Chapter 5.


Figure 1.4 Thesis Organization


Chapter 2 Literature Review

Traffic congestion within road networks has drawn much attention from both governmental and private organizations involved in traffic management and data analysis. As detectors and monitors are installed at high density and abundant traffic data are collected, the potential patterns within these data have attracted increasing research interest and various case studies. These may lead to a further understanding of the characteristics of traffic flow and the structure underlying the road network, and to new ideas for traffic management strategies.

This chapter is organized as follows. Research papers applying different approaches to detecting traffic states and events are reviewed in Section 2.1. Literature discussing the propagation pattern of traffic congestion is reviewed in Section 2.2. Section 2.3 reviews studies on kernel density estimation itself and its application in various disciplines. Last, a brief summary is presented in Section 2.4.

2.1 Traffic State and Event Detection

Congested, slow, smooth, and accident are traffic states that can describe traffic flow on roads, providing critical information to travelers as well as transportation agencies (Li, She, Luo & Yu, 2013). Various data collected from loop detectors, camera surveillance systems, probe cars, and GPS, including travel time, traffic speed, and trajectories, are used for traffic state and event detection. Algorithms have been designed and case studies performed on freeways and in urban road networks.


Coifman (2002) stated that link travel time can reflect the traffic state to some extent. Direct measurement of travel time may require correlating observations from multiple locations, which means that additional detector hardware or new communication infrastructure is needed. Thus, a method was proposed that estimates trajectories and link travel time using only data from an individual dual loop detector. Basic traffic flow theory is applied to extrapolate local conditions over an extended link. When no incidents or delays are involved, this approach provides good travel time estimates.

Kerner, Demir, Herrtwich, Klenov, Rehborn, Aleksic & Haug (2005) introduced an approach that performs traffic state detection with floating car data (FCD). Probe cars are sent to collect travel times within a reporting section. A travel time increase due to the emergence of congestion and a travel time decrease due to its dissolution are recorded. Two or more probe cars can provide sufficient information for a typical traffic accident to be recognized. With probe cars making up 1.5% of all vehicles, this approach has a 65% probability of recognizing incidents that last longer than 20 minutes.

Li, She, Luo & Yu (2013) applied a freeway video surveillance system to traffic state detection. Existing surveillance camera infrastructure can provide data in video form. However, extracting traffic data from surveillance cameras involves difficulties such as camera angle and zoom. Based on the movement of vehicles in images, they proposed a system that leverages the existing surveillance infrastructure to estimate traffic flow speed and occupancy rate and to classify typical traffic states (congested, slow, and smooth). The detection accuracy during daytime is higher than 85%, while the accuracy for the congested state reaches 91.8%.


Wang, Lu, Yuan, Zhang & Van De Wetering (2013) proposed a method for visual analysis of urban traffic jams based on trajectory data. GPS trajectories of taxis are collected and strategies for extracting congestion information are developed. Trajectories are first cleaned and matched to the road network. Second, the traffic speed on each road segment is calculated. Last, spatio-temporal graphs showing congestion and its propagation provide descriptions of a traffic jam.

Anbaroglu, Heydecker & Cheng (2014) stated that the differences between urban networks and motorways, and the differing nature of recurrent congestion (RC) and non-recurrent congestion (NRC), limit the use of existing incident detection methods, which mostly focus on motorways and do not distinguish RC from NRC. In the proposed NRC detection method, substantially high link journey time (LJT) observations that occur simultaneously are clustered. Besides a minimum duration restriction, a localization index is introduced to describe the closeness between congestion clusters. Based on a sensitivity analysis using a weighted product model (WPM), they concluded that LJTs at least 40% higher than the expected value should be regarded as NRC.

2.2 Propagation Patterns

Besides the detection of traffic states and events, how congestion propagates and/or cascades is one of the topics of greatest interest in congestion-related research. Such knowledge makes it possible to extrapolate the traffic state of the nearby road network and the potential relationship between neighboring road segments. With it, we can further understand the structure of the road network and the cascading behavior of congestion once it has occurred.


Long, Gao, Ren & Lian (2008) stated the importance of effectively identifying network bottlenecks for improving the network service level and preventing congestion. Congestion is defined by critical standards based on average journey velocity (AJV), and a congestion propagation model based on the cell transmission model (CTM) is proposed. Simulations are performed on the Sioux Falls network. The simulation results can provide a reference for decisions on controlling traffic demand.

Wang et al. (2013) utilized GPS trajectories and provided multiple views for visual exploration and analysis at the level of propagation graphs and at the road segment level. The overall visualization shows speed variation at the pixel level, while the propagation result is shown at the road segment level.

Ji & Geroliminis (2014) observed congestion propagation on a macroscopic scale. Using taxi GPS data as sparse probe vehicle data and the maximum connected component of congested links, interconnected congested links and critical congestion pockets are identified. The proposed method can effectively distinguish congestion pockets from the rest of the network and track the evolution of congestion through time.

2.3 Kernel Density Estimation (KDE) and Applications

An adjusted kernel density estimation approach is applied in this study. Thus, related literature, including the general kernel density estimation method, discussions of the approach, and its applications in transportation and several other disciplines, is reviewed in this section. Probability density estimation approaches can be parametric or nonparametric. In a parametric approach, the estimation is made based on assumptions about the distribution underlying the data set. However, there can be large gaps between the assumed parametric model and reality. As a nonparametric probability density estimation approach (Rosenblatt 1956; Whittle 1958; Parzen 1962), kernel density estimation does not require any assumption about the distribution of the data points. This provides more flexibility and allows researchers to discover more characteristics of the data set, such as its actual distribution.

Hence, kernel density estimation has shown great importance in both theoretical and applied statistics. Rosenblatt (1956) and Parzen (1962) developed the current form of kernel density estimation, which is also termed the Parzen-Rosenblatt window method in fields such as signal processing and econometrics.

Yu (2009) studied KDE, investigating the most appropriate search bandwidth for six different probability distributions, evaluated by the mean integrated squared error (MISE) and the asymptotic mean integrated squared error (AMISE). Yu concluded that KDE with a variable search bandwidth can provide acceptable estimation results.

Xie and Yan (2008) proposed a network KDE method, adapted from the standard planar KDE, to address the shortcomings of the latter when the problem is network based. The innovation of this research is to represent network space with lixels, which are linear units of equal network length. The approach was tested with 2005 traffic accident data and the road network of Bowling Green, Kentucky. It can solve the problem of overestimated density values. The impacts of different kernel functions and search bandwidths on the density calculation were also investigated, and the search bandwidth was found to have the greatest influence because it controls the smoothness of the spatial pattern.


Chang (2012) applied KDE together with data mining to assess common physiological indicators of multiple diseases. To estimate the probability of illness of the patients being examined, KDE is applied to estimate the probability distribution of each common physiological indicator under different health conditions.

Hu (2012) established an approach to analyzing GPS trajectories and collected the trajectories of visitors in Yehliu Geopark. The possible spatial distribution of visitors within the park can be calculated through KDE. In addition, the time factor is taken into account to investigate the locations of crowds and the spatial distribution of visitors. Ultimately, the density distribution of visitors within the park during different time periods is simulated. The simulation results can be used to reconsider space allocation and route design.

2.3.1 Standard Kernel Density Estimation (Standard KDE)

Assuming $(x_1, x_2, \ldots, x_n)$ is a univariate independent and identically distributed sample drawn from some distribution with an unknown density $f$, its kernel density estimator can be written as:

$$\hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{2.1}$$

where $K$ is the kernel function, a non-negative symmetric function that satisfies $\int K(u)\,du = 1$. Since $K$ is a probability density function, $\hat{f}_h$ also has the properties of a probability density function. Commonly used kernel functions include:

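I. Uniform (rectangular window):

$$K(u) = \frac{1}{2}, \quad \text{for } |u| \le 1 \tag{2.2}$$

II. Triangular:

$$K(u) = 1 - |u|, \quad \text{for } |u| \le 1 \tag{2.3}$$

III. Epanechnikov (parabolic):

$$K(u) = \frac{3}{4}\left(1 - u^2\right), \quad \text{for } |u| \le 1 \tag{2.4}$$

IV. Quartic (biweight):

$$K(u) = \frac{15}{16}\left(1 - u^2\right)^2, \quad \text{for } |u| \le 1 \tag{2.5}$$

V. Gaussian:

$$K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}u^2} \tag{2.6}$$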

In Eq. (2.1), $h$ is a positive number called the bandwidth or smoothing parameter. It controls the smoothness and precision of the kernel density estimate. A larger $h$ may lead to underfitting, so that the estimate fails to represent the shape of the real density function. By contrast, a smaller $h$ smooths the curve poorly and may lead to overfitting.
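The estimator of Eq. (2.1) and the five kernels can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the sample values, the bandwidths, and the names `KERNELS` and `kde` are made up for illustration and do not come from the thesis.

```python
import math

# Kernel functions of Eqs. (2.2)-(2.6); the bounded ones return 0 for |u| > 1.
KERNELS = {
    "uniform": lambda u: 0.5 if abs(u) <= 1 else 0.0,
    "triangular": lambda u: 1 - abs(u) if abs(u) <= 1 else 0.0,
    "epanechnikov": lambda u: 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0,
    "quartic": lambda u: 15 / 16 * (1 - u * u) ** 2 if abs(u) <= 1 else 0.0,
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi),
}

def kde(x, sample, h, kernel="gaussian"):
    """Evaluate the estimator of Eq. (2.1) at point x with bandwidth h."""
    k = KERNELS[kernel]
    return sum(k((x - xi) / h) for xi in sample) / (len(sample) * h)

# Made-up sample with two loose clusters (illustrative values only).
sample = [1.0, 1.2, 1.9, 3.1, 3.3]

# Riemann-sum check that the estimate integrates to roughly 1.
step = 0.01
grid = [-5 + i * step for i in range(1501)]
area = sum(kde(x, sample, 0.5, "epanechnikov") for x in grid) * step

# Bandwidth effect at x = 2.5, which lies in the gap between the clusters.
narrow = kde(2.5, sample, 0.3, "epanechnikov")  # no sample point within 0.3
wide = kde(2.5, sample, 1.5, "epanechnikov")    # clusters smoothed together
print(round(area, 3), narrow, round(wide, 3))
```

With the narrow bandwidth the gap between the two clusters receives zero density, while the wide bandwidth blurs the two clusters into one mass, which mirrors the overfitting and underfitting behavior described for $h$.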


2.3.2 Planar Kernel Density Estimation (Planar KDE)

To perform density estimation for various spatial problems, the standard kernel density estimation concept is extended to 2-D planes. The general form of the planar kernel density estimator in a 2-D space can be written as:

$$\lambda(s) = \sum_{i=1}^{n} \frac{1}{\pi r^2}\, k\left(\frac{d_{is}}{r}\right) \tag{2.7}$$

where $\lambda(s)$ is the density at location $s$, $d_{is}$ is the distance from point $i$ to location $s$, and $r$ is the bandwidth in planar KDE. $k$ is the kernel, modeled as a function of the ratio $d_{is}/r$. Instead of giving an equal weight to all points within bandwidth $r$, a distance decay effect is taken into account. That is, as the distance between a point and location $s$ increases, that point is weighted less in the overall density.

Some commonly applied kernel functions used to account for the distance decay effect are expressed in an alternative form below (Gibin, Longley, & Atkinson, 2007; Levine, 2004):

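I. Gaussian function:

$$k\left(\frac{d_{is}}{r}\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{d_{is}^2}{2r^2}\right), \quad \text{when } 0 < d_{is} \le r \tag{2.8}$$

$$k\left(\frac{d_{is}}{r}\right) = 0, \quad \text{when } d_{is} > r$$

II. Quartic function (approximating the Gaussian function):

$$k\left(\frac{d_{is}}{r}\right) = K\left(1 - \frac{d_{is}^2}{r^2}\right), \quad \text{when } 0 < d_{is} \le r \tag{2.9}$$

$$k\left(\frac{d_{is}}{r}\right) = 0, \quad \text{when } d_{is} > r$$

Here $K$ is a scaling factor that ensures the basic assumption $\int k(u)\,du = 1$ is not violated. $\frac{3}{\pi}$ and $\frac{3}{4}$ are common values chosen for $K$.

III. Minimum variance function:

$$k\left(\frac{d_{is}}{r}\right) = \frac{3}{8}\left(3 - 5\,\frac{d_{is}^2}{r^2}\right), \quad \text{when } 0 < d_{is} \le r \tag{2.10}$$

$$k\left(\frac{d_{is}}{r}\right) = 0, \quad \text{when } d_{is} > r$$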
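A minimal sketch of the planar estimator of Eq. (2.7) with the quartic kernel of Eq. (2.9) is given below. The coordinates, the bandwidth, the choice of scaling factor $K = 3/\pi$, and the function names are assumptions made for illustration. An event located exactly at $s$ (distance zero) is treated here as lying inside the bandwidth, which is also an assumption of this sketch.

```python
import math

def quartic_kernel(ratio, scale=3 / math.pi):
    """Quartic distance-decay kernel of Eq. (2.9); zero beyond the bandwidth."""
    if ratio > 1:
        return 0.0
    return scale * (1 - ratio ** 2)

def planar_kde(location, events, r, kernel=quartic_kernel):
    """Evaluate Eq. (2.7) at `location` using Euclidean distances."""
    sx, sy = location
    total = 0.0
    for ex, ey in events:
        d = math.hypot(ex - sx, ey - sy)          # Euclidean distance d_is
        total += kernel(d / r) / (math.pi * r ** 2)
    return total

# Made-up event coordinates: one at the location, one exactly on the
# bandwidth boundary, and one far outside it.
events = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
density = planar_kde((0.0, 0.0), events, r=5.0)
print(density)
```

Only the first event contributes in this toy case: the second sits exactly at distance $r$, where the quartic kernel has decayed to zero, and the third lies beyond the bandwidth.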

2.3.3 Network Kernel Density Estimation (Network KDE)

To perform density estimation of point events under network constraints, network KDE was proposed (Xie & Yan, 2008). This approach differs from planar kernel density estimation in several aspects. Network KDE is a 1-D measurement, while planar KDE is a 2-D one. Network space is used as the point event context, and the kernel function is based on network distance instead of Euclidean distance. Hence, it performs better in density estimation where a planar KDE may over-detect clustered patterns. The general form of the network KDE can be expressed as:

$$\lambda(s) = \sum_{i=1}^{n} \frac{1}{r}\, k\left(\frac{d_{is}}{r}\right) \tag{2.11}$$
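The difference from planar KDE can be illustrated with a small sketch in which $d_{is}$ is a shortest-path distance along the network. This is a simplification rather than the lixel-based procedure of Xie & Yan (2008): events are assumed to sit on network nodes, the graph and its link lengths are made up, and the quartic kernel with scaling factor $3/4$ is one possible choice.

```python
import heapq

def network_distances(graph, source):
    """Dijkstra shortest-path lengths; graph maps node -> {neighbor: length}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist

def quartic_kernel(ratio, scale=0.75):
    """Quartic kernel of Eq. (2.9) with scaling factor 3/4."""
    return scale * (1 - ratio ** 2) if ratio <= 1 else 0.0

def network_kde(graph, location, event_nodes, r):
    """Evaluate Eq. (2.11) at `location` using network distances."""
    dist = network_distances(graph, location)
    return sum(quartic_kernel(dist[e] / r) / r for e in event_nodes if e in dist)

# Made-up linear network A - B - C - D with 100 m links.
graph = {
    "A": {"B": 100.0},
    "B": {"A": 100.0, "C": 100.0},
    "C": {"B": 100.0, "D": 100.0},
    "D": {"C": 100.0},
}
density = network_kde(graph, "A", ["B", "D"], r=250.0)
print(density)
```

The event at D is 300 m away along the network, beyond the 250 m bandwidth, so only the event at B contributes to the density at A.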

2.4 Summary of Literature Review

Table 2.1 summarizes the reviewed studies in terms of data source, approach, and type of road network. According to the review in the preceding sections, most studies focus on either traffic state and event detection or the propagation patterns of congestion. Furthermore, the data sources they use have some shortcomings. Some are not open to the public, while others require high operating costs and complicated preprocessing techniques. To bridge these gaps, this study proposes an adjusted KDE approach that accounts for congestion detection, propagation patterns, and visualization of VD data.


Table 2.1 Summary of Characteristics of Reviews


Chapter 3 Methodology

In this chapter, the characteristics of the adjusted network KDE are presented. The proposed adjusted KDE approach is applied to determine the congestion cascading pattern in terms of the conditional probability of congestion incidents, and the potential relationship between adjacent road segments is investigated. The procedures for extracting network information, preprocessing VD data, and performing KDE with the proposed approach are also explained in this chapter.

3.1 Adjusted Network Kernel Density Estimation (Adj. Network KDE)

In this study, we make some adjustments to the original network KDE approach. To interpret the spatio-temporal characteristics of congestion propagation within an urban road network, the network kernel density estimation approach is applied. Instead of network distance, this research employs the “degree of adjacency”, which is based on the structure of the road network and its adjacency matrix. The locations of VDs do not follow a specific rule; a VD may be at the front, the middle, or the end of a road segment. Hence, VD data can only represent the whole road segment, and a precise network distance cannot be calculated. Furthermore, the conditional probability that congestion occurs on the upstream road segment, given the occurrence of congestion on the downstream road segment, is also considered. The adjusted form of the network KDE can be written as:

$$\lambda(s) = \sum_{i=1}^{n} \frac{1}{r}\, p_{is}\, k\left(\frac{adj_{is}}{r}\right) \tag{3.1}$$

where $adj_{is}$ is the degree of adjacency between upstream road segment $i$ and downstream road segment $s$, and $p_{is}$ is the conditional probability that congestion occurs on $i$ given that congestion occurs on $s$. More specifically, $s$ and $i$ are both locations of VDs. In addition, each $s$ can be viewed as the center of several neighboring road segments, including itself, which contributes its effect to the adjacent segments $i$.

3.2 Data Description

Two main components of our data are introduced in the following sections: how we represent the structure of our urban road network, and the contents and preprocessing procedure of the raw VD data. We apply the concept of the adjacency matrix to form our road network structure. String comparison is applied to filter the target VD set by our region of interest (ROI) and time intervals, while criteria are set for data preprocessing, including the elimination of erroneous records and the conversion of units.

3.2.1 Network Structure

The concept of the adjacency matrix is introduced to describe the network structure of our ROI. Most networks in previous research have been binary in nature; that is, the edges between nodes are either present or not (Newman 2004). A network with such an attribute can be represented by an $n \times n$ adjacency matrix $A$ with elements

$$A_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are connected,} \\ 0 & \text{otherwise.} \end{cases}$$

However, our road network is slightly different. Since all the VDs are located on road segments, our adjacency matrix is edge based. Furthermore, most of the arterials in our road network are bidirectional, so the direction of traffic is also considered. That is, we have an $n \times n$ adjacency matrix $A$, where $d$ is the entrance of a downstream road segment with respect to the exit of an upstream road segment $u$, with elements

$$A_{du} = \begin{cases} 1 & \text{if } d \text{ and } u \text{ are connected,} \\ 0 & \text{otherwise.} \end{cases}$$

Figure 3.1 shows a sample road network, while Table 3.1 presents its 1st order adjacency matrix. We call the relationship 1st order adjacency when $A_{du} = 1$. The 1st order adjacency matrix of our ROI is constructed following this concept of adjacency, and part of it is shown in Figure 3.2. On some of the road segments, no VD is installed.

Another matrix containing the turning information is also constructed at this stage, as shown in Table 3.2. The turning information is extracted from the VD reference data set, and each connection is tagged as S (straight), L (left turn), or R (right turn). For connections that cannot be identified this way, coordinate information in terms of longitude and latitude is applied.


Figure 3.1 Road Network Example

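Table 3.1 Adjacency Matrix Example

| Downstream Arterial | Downstream ID | Downstream Direction | Downstream Segment | Upstream A (ID A2E, East, Segment CD) | Upstream B (ID B2E, East, Segment CD) | Upstream C (ID C2N, North, Segment AB) | Upstream D (ID D2N, North, Segment AB) | … |
|---|---|---|---|---|---|---|---|---|
| A | A2E | East | CD | 1 | 0 | 1 | 0 | |
| B | B2E | East | CD | 0 | 1 | 0 | 0 | |
| C | C2N | North | AB | 0 | 0 | 1 | 0 | |
| D | D2N | North | AB | 0 | 1 | 0 | 1 | |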


Figure 3.2 Part of the 1st Order Adjacency Matrix of Our ROI

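Table 3.2 Turning Matrix Example

| Downstream Arterial | Downstream ID | Downstream Direction | Downstream Segment | Upstream A (ID A2E, East, Segment CD) | Upstream B (ID B2E, East, Segment CD) | Upstream C (ID C2N, North, Segment AB) | Upstream D (ID D2N, North, Segment AB) | … |
|---|---|---|---|---|---|---|---|---|
| A | A2E | East | CD | self | 0 | R | 0 | |
| B | B2E | East | CD | 0 | self | 0 | 0 | |
| C | C2N | North | AB | 0 | 0 | self | 0 | |
| D | D2N | North | AB | 0 | L | 0 | self | |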
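The two example matrices can be generated from a list of connected segment pairs. The sketch below is one possible encoding, not the procedure used in this study: the names `SEGMENTS`, `LINKS`, and `build_matrices` are hypothetical, and rows are taken as downstream segments and columns as upstream segments, following the labels of Tables 3.1 and 3.2.

```python
SEGMENTS = ["A2E", "B2E", "C2N", "D2N"]

# (downstream ID, upstream ID, turn tag) for every connected pair in Table 3.2.
LINKS = [("A2E", "C2N", "R"), ("D2N", "B2E", "L")]


def build_matrices(segments, links):
    """Return (adjacency, turning) with rows = downstream, columns = upstream."""
    index = {seg: k for k, seg in enumerate(segments)}
    n = len(segments)
    adjacency = [[0] * n for _ in range(n)]
    turning = [[0] * n for _ in range(n)]
    for k in range(n):
        adjacency[k][k] = 1        # each segment is adjacent to itself
        turning[k][k] = "self"
    for down, up, turn in links:
        adjacency[index[down]][index[up]] = 1
        turning[index[down]][index[up]] = turn
    return adjacency, turning


adjacency, turning = build_matrices(SEGMENTS, LINKS)
```

Keeping the turn tag next to each link means the adjacency matrix and the turning matrix cannot drift apart when the network is edited.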

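Figure 3.2 (excerpt): rows are upstream segments (Ato, head) and columns are downstream segments (Bfrom, tail). Arterials: Xinyi (信義), Heping E. (和平東), Xinhai (辛亥), Jinshan S. (金山南). Cells without an ID or a direction are blank in the figure.

| No. | Arterial | ID | Direction | Segment | 0: Xinyi, VHSIP20, 杭金 | 1: Xinyi, VHNJV20, 金新 | 2: Xinyi, VHMKV20, 新建 | 3: Xinyi, VHMM620, 建復 | 4: Xinyi, VHMML20, 復敦 | 5: Xinyi, 敦光 | 6: Heping E., 南羅 | 7: Heping E., VFZK620, 羅金 | 8: Heping E., VG6J520, 金新 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Xinyi | VHSIP20 | | 杭金 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | Xinyi | VHNJV20 | | 金新 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | Xinyi | VHMKV20 | | 新建 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 3 | Xinyi | VHMM620 | | 建復 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| 4 | Xinyi | VHMML20 | | 復敦 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| 5 | Xinyi | | | 敦光 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 6 | Heping E. | | | 南羅 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| 7 | Heping E. | VFZK620 | | 羅金 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 8 | Heping E. | VG6J520 | | 金新 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 9 | Heping E. | | | 新建 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | Heping E. | VFTLH60 | | 建復 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 11 | Heping E. | | | 復敦 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | Heping E. | VFPMQ20 | | 敦基 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 13 | Heping E. | | West | 南羅 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 14 | Heping E. | VFZK620 | West | 羅金 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 15 | Heping E. | VG6J520 | West | 金新 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 16 | Heping E. | | West | 新建 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 17 | Heping E. | VFTLH60 | West | 建復 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 18 | Heping E. | VFQM660 | West | 復敦 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | Heping E. | | West | 敦基 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 20 | Xinhai | | | 汀羅 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 21 | Xinhai | VF9KB20 | | 羅新 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 22 | Xinhai | | | 新建 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 23 | Xinhai | VF9KW60 | | 建復 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 24 | Xinhai | VEWM560 | | 復基 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 25 | Xinhai | VDYN960 | | 基芳 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 26 | Xinhai | | West | 汀羅 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27 | Xinhai | VF9KB60 | West | 羅新 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 28 | Xinhai | | West | 新建 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 29 | Xinhai | VF9KW60 | West | 建復 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 30 | Xinhai | VEFMN20 | West | 復基 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 31 | Xinhai | VEFMN60 | West | 基芳 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 32 | Jinshan S. | VIPIZ61 | | 仁信 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 33 | Jinshan S. | VJSJD40 | | 信愛 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 34 | Jinshan S. | VG8IK40 | | 愛和 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 35 | Jinshan S. | VIPIZ61 | | 仁信 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |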


This study focuses on 1st and 2nd order adjacency and therefore requires the 1st and 2nd order adjacency matrices and the turning information. We multiply the 1st order adjacency matrix by itself to obtain the 2nd order adjacency matrix. In the 2nd order adjacency matrix, elements with value 1 are named 2nd order adjacency. A 2nd order adjacency relationship indicates that two road segments are connected through another road segment.
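The matrix product described above can be sketched in a few lines of Python. The example uses rows and columns 0 to 3 of the ROI matrix excerpt in Figure 3.2. Because the diagonal of the 1st order matrix is 1, the raw product also contains nonzero entries for pairs that are already 1st order adjacent; the sketch therefore keeps only pairs that are reachable in two steps and are not 1st order adjacent. That filtering rule, like the function names, is an assumption made for illustration.

```python
def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def second_order(first):
    """Mark pairs connected through another segment but not 1st order adjacent."""
    squared = matmul(first, first)
    n = len(first)
    return [[1 if squared[i][j] > 0 and first[i][j] == 0 else 0
             for j in range(n)] for i in range(n)]


# Rows/columns 0-3 of the Figure 3.2 excerpt (four consecutive Xinyi segments).
FIRST_ORDER = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
SECOND_ORDER = second_order(FIRST_ORDER)
```

In this chain, segment 0 reaches segment 2 through segment 1, and segment 1 reaches segment 3 through segment 2, so those two pairs are the only 2nd order adjacencies.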

3.2.2 VD data processing

The dataset contains high resolution VD data (recorded every 5 minutes) in Taipei City from January, 2015 to March, 2017, provided by the Traffic Control Center of Taipei City Traffic Engineering Office. Figure 3.3 shows part of the raw data. Some preprocessing work must be done in order to extract the target data we are interested in.

The raw data contain the device ID, date and time, lane order, volume and travel speed of large vehicles and regular passenger cars, lane occupancy, and average interval between vehicles. There are also columns for motorcycles; however, no motorcycles are actually detected. Table 3.3 explains the components of the VD data that are useful for this study. We use average travel speed as our major indicator of traffic congestion. The average travel speed is calculated after converting large-vehicle volume into passenger car volume based on the passenger car unit. Travel speeds on different lanes within a road segment are averaged. In our analysis, data are filtered by ROI and by the time interval of interest (different peak periods on weekdays).
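The text does not spell out the averaging formula, so the following Python sketch shows only one plausible reading: a volume-weighted lane speed in which each large vehicle counts as `pcu_factor` passenger cars, followed by a plain mean over lanes. The function names and the `pcu_factor` parameter are assumptions, and no particular passenger car unit value is implied.

```python
def lane_average_speed(big_volume, big_speed, car_volume, car_speed, pcu_factor):
    """Volume-weighted mean speed; a large vehicle counts as pcu_factor cars."""
    weighted_volume = pcu_factor * big_volume + car_volume
    if weighted_volume == 0:
        return None  # no vehicles observed in this interval
    weighted_speed = pcu_factor * big_volume * big_speed + car_volume * car_speed
    return weighted_speed / weighted_volume


def segment_average_speed(lane_speeds):
    """Plain mean over the lanes of a road segment that reported a speed."""
    observed = [v for v in lane_speeds if v is not None]
    return sum(observed) / len(observed) if observed else None


# Lane 2 of VHWGD40 in the raw data excerpt: 3 large vehicles at 51 km/hr
# and 8 passenger cars at 49.75 km/hr.
example_speed = lane_average_speed(3, 51, 8, 49.75, pcu_factor=1.0)
```

With `pcu_factor=1.0` the sketch reproduces the AVGSPEED value of 50.09 recorded for that lane in the raw data, which suggests that the raw column is a simple volume-weighted mean.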


Figure 3.3 Raw VD Data

Table 3.3 Contents of Columns

| Columns | Contents |
|---|---|
| DeviceID | Name of vehicle detectors |
| DateTime2 | Tag of date and time |
| LaneOrder | Number of lane |
| BigVolume | Volume of large vehicles |
| BigSpeed | Speed of large vehicles |
| CarVolume | Volume of regular passenger cars |
| CarSpeed | Speed of regular passenger cars |
| LGID | Identifier of the direction of traffic |

The two procedures for preprocessing VD data differ slightly in how the travel speed criterion is set. Two criteria are used in this study: Level of Service (LOS) C and average speed. The data preprocessing procedure is described as follows.

I. Filtering the data of the set of VDs based on our ROI and target time intervals by string matching techniques.

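Figure 3.3 (excerpt of the raw VD data):

| DEVICEID | LANEORDER | BIGVOLUME | BIGSPEED | CARVOLUME | CARSPEED | MOTORVOLUME | MOTORSPEED | AVGSPEED | LANEOCCUPY | DATETIME2 | RATE | AVGINT | LGID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VGUEI60 | 1 | 1 | 41 | 12 | 30.67 | 0 | 0 | 30.62 | 3.5 | 2015/1/1 00:05 | 240 | 161.5 | 0 |
| VGUEI60 | 2 | 0 | 0 | 5 | 42 | 0 | 0 | 42 | 1.25 | 2015/1/1 00:05 | 240 | 250 | 1 |
| VGUEI60 | 3 | 0 | 0 | 8 | 48 | 0 | 0 | 48 | 1.5 | 2015/1/1 00:05 | 240 | 222.25 | 1 |
| VMEKQ40 | 0 | 0 | 0 | 13 | 44 | 0 | 0 | 44 | 2.25 | 2015/1/1 00:05 | 240 | 178.75 | 0 |
| VMEKQ40 | 1 | 0 | 0 | 10 | 33.4 | 0 | 0 | 33.4 | 2.25 | 2015/1/1 00:05 | 240 | 184.25 | 0 |
| VLMR820 | 0 | 2 | 63.5 | 1 | 0 | 0 | 0 | 42.33 | 0.75 | 2015/1/1 00:05 | 240 | 250 | 0 |
| VLMR820 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0.5 | 2015/1/1 00:05 | 240 | 250 | 1 |
| VHWGD40 | 0 | 0 | 0 | 4 | 42.5 | 0 | 0 | 42.5 | 0.75 | 2015/1/1 00:05 | 240 | 250 | 0 |
| VHWGD40 | 1 | 0 | 0 | 3 | 37 | 0 | 0 | 37 | 0.75 | 2015/1/1 00:05 | 240 | 235.75 | 0 |
| VHWGD40 | 2 | 3 | 51 | 8 | 49.75 | 0 | 0 | 50.09 | 3 | 2015/1/1 00:05 | 240 | 176.25 | 1 |
| VHWGD40 | 3 | 0 | 0 | 4 | 34 | 0 | 0 | 34 | 0.75 | 2015/1/1 00:05 | 240 | 250 | 1 |
| VELJA00 | 0 | 1 | 40 | 0 | 0 | 0 | 0 | 21 | 0.5 | 2015/1/1 00:05 | 240 | 250 | 0 |
| VELJA00 | 1 | 1 | 40 | 3 | 50 | 0 | 0 | 50 | 0.75 | 2015/1/1 00:05 | 240 | 236 | 0 |
| VELJA00 | 2 | 0 | 0 | 6 | 37.67 | 0 | 0 | 37.67 | 1 | 2015/1/1 00:05 | 240 | 236 | 0 |
| VELJA00 | 3 | 1 | 25 | 0 | 0 | 0 | 0 | 34 | 0.5 | 2015/1/1 00:05 | 240 | 250 | 1 |
| VELJA00 | 4 | 0 | 0 | 7 | 40 | 0 | 0 | 40 | 1.25 | 2015/1/1 00:05 | 240 | 236.25 | 1 |
| VELJA00 | 5 | 0 | 0 | 4 | 45 | 0 | 0 | 45 | 0.5 | 2015/1/1 00:05 | 240 | 224 | 1 |
| VQFHC20 | 0 | 0 | 0 | 7 | 41.57 | 0 | 0 | 41.57 | 0 | 2015/1/1 00:05 | 240 | 41.5 | 0 |
| VQFHC20 | 1 | 0 | 0 | 10 | 61.5 | 0 | 0 | 61.5 | 1 | 2015/1/1 00:05 | 240 | 32 | 1 |
| VQFHC20 | 2 | 0 | 0 | 1 | 56 | 0 | 0 | 56 | 0 | 2015/1/1 00:05 | 240 | 14.75 | 1 |
| VQFHC20 | 3 | 0 | 0 | 10 | 44.1 | 0 | 0 | 44.1 | 1 | 2015/1/1 00:05 | 240 | 32 | 1 |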


96 road segments and 66 VDs are included in our ROI. By unifying the time format of the raw data, string matching can be performed. Weekday data and weekend data are then separated.

II. Ignoring missing data and removing erroneous data due to malfunctioning VD devices.

Erroneous data here are records whose values are obviously unreasonable, for example travel speeds that remain zero for several days even during peak hours, or travel speeds that exceed the speed limit by more than 40%.

III. Constructing the incident chart for different criteria respectively.

A. For LOS C, according to Section 19.6 of the 2011 Taiwan Highway Capacity Manual (Transportation Planning Division, 2011), travel speed can be used to determine the LOS of urban road networks with different speed limits. The complete criteria are shown in Table 3.4. We take LOS C, which transportation management agencies often regard as the standard for light congestion, as our threshold. In this state, in addition to facing more restrictions on lane changes, drivers and motorists also experience a certain degree of tension. In this study, congestion is recorded if the travel speed is lower than 30 km/hr. The differences between the actual travel speed and the LOS C threshold are also calculated.


Table 3.4 LOS Criteria for Urban Road Network with 50 km/hr Speed Limit

| Average Travel Speed V (km/hr) | LOS |
|---|---|
| V ≥ 35 | A |
| 30 ≤ V < 35 | B |
| 25 ≤ V < 30 | C |
| 20 ≤ V < 25 | D |
| 15 ≤ V < 20 | E |
| V < 15 | F |

B. For average speed, a different threshold is adopted. To detect non-recurrent incidents, the normal traffic condition must first be defined, so we construct a baseline for reference. The baseline is the weekly average of travel speed within the week of the targeted time interval. The difference between the actual travel speed and the baseline value is calculated, and records whose speed is lower than 80% of the baseline value are flagged.

IV. As a preparation step for further processing, the data recorded in step III are transformed into a binary data structure. Negative values of the difference between the actual travel speed and the threshold are assigned 1, while all other values are assigned 0. A value of 1 indicates that a VD detected possible congestion or an incident during a given time interval, while 0 indicates an acceptable level of service.
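Steps I to IV can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names and the `peak_hours` argument are hypothetical, the erroneous-data check covers only the speed-limit rule (not the "zero for several days" rule), and the baseline is passed in rather than computed from a week of data.

```python
from datetime import datetime

SPEED_LIMIT_KMH = 50.0       # speed limit of Table 3.4
LOS_C_THRESHOLD_KMH = 30.0   # congestion threshold stated in step III.A
BASELINE_RATIO = 0.8         # share of the baseline speed used in step III.B


def parse_time(text):
    """Unify the raw DATETIME2 format, e.g. '2015/1/1 00:05'."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M")


def in_scope(device_id, when, roi_ids, peak_hours):
    """Step I: keep ROI detectors on weekdays within the peak period."""
    start_hour, end_hour = peak_hours
    return (device_id in roi_ids
            and when.weekday() < 5
            and start_hour <= when.hour < end_hour)


def is_plausible(speed):
    """Step II (partial): reject speeds more than 40% above the speed limit."""
    return 0.0 <= speed <= 1.4 * SPEED_LIMIT_KMH


def to_binary(speeds, thresholds):
    """Steps III-IV: 1 where (speed - threshold) is negative, otherwise 0."""
    return [1 if v - th < 0 else 0 for v, th in zip(speeds, thresholds)]


def los_c_chart(speeds):
    """Criterion A: compare every interval with the LOS C threshold."""
    return to_binary(speeds, [LOS_C_THRESHOLD_KMH] * len(speeds))


def baseline_chart(speeds, baseline):
    """Criterion B: compare every interval with 80% of its baseline speed."""
    return to_binary(speeds, [BASELINE_RATIO * b for b in baseline])
```

For example, `los_c_chart([42, 29.9, 30, 12])` flags the second and fourth intervals, since only those speeds are strictly below 30 km/hr.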


3.3 Analysis Procedure

Based on the road network structure and the data preprocessing, we obtain the 1st and 2nd order adjacency relationships of the road segments and the binary incident chart of the VDs within our ROI. The analysis procedure is explained below and shown in Figure 3.4.

I. Detecting incidents

Based on the binary incident chart obtained from the data preprocessing stage, a cell with value 1 indicates that possible congestion or an incident takes place. For a single VD, a sequence of 1s that lasts for at least 4 time intervals (20 minutes) is defined as a possible congestion incident.

II. Calculating the conditional probability $p_{is}$ that incidents occur on neighboring road segments

The duration of each congestion incident is recorded in step I. During a congestion incident on a certain road segment, the number of consecutive time intervals identified as congested on each neighboring road segment is also recorded. We define the ratio of the latter (on the upstream adjacent road segment) to the former (on the downstream road segment) as the conditional probability that a neighboring road segment is affected by the congested road segment.


III. Calculating the kernel density at each road segment

Based on the result of step II and the adjacency relationships obtained from data preprocessing, the kernel density can be calculated through Equation (3.1). A simple example is provided for illustration. For the road network shown in Figure 3.5, congestion occurs on the target road segment TG, on road segment 1,R in the 1st order right turn relationship with respect to TG, and on road segment 2,SR in the 2nd order straight-right turn relationship with respect to TG. Two congestion incidents were detected on TG: one started at 6:45 AM and ended at 7:15 AM, while the other started at 8:20 PM and ended at 8:50 PM. Both lasted for six time intervals (30 minutes). Figure 3.6(a) and Figure 3.6(b) show how $p_{is}$ is obtained for these two congestion incidents, respectively. On 1,R, 4 and 3 congested intervals were detected during the two incidents on TG, respectively; on 2,SR, 3 and 3 congested intervals were detected. Thus, if $r = 3$ is chosen as the search bandwidth, the kernel density calculation is

$$\lambda(1,R) = \frac{1}{3}\cdot\frac{4}{6}\,k\left(\frac{1}{3}\right) + \frac{1}{3}\cdot\frac{3}{6}\,k\left(\frac{1}{3}\right)$$

for 1,R and

$$\lambda(2,SR) = \frac{1}{3}\cdot\frac{3}{6}\,k\left(\frac{2}{3}\right) + \frac{1}{3}\cdot\frac{3}{6}\,k\left(\frac{2}{3}\right)$$

for 2,SR.
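The three analysis steps can be sketched in Python as follows. The function names and the sample charts are hypothetical, the conditional probability is computed as a simple count of congested intervals within the incident window, and the minimum variance function of Equation (2.10) is used as the kernel only for illustration, since this section does not fix a kernel.

```python
def detect_incidents(chart, min_intervals=4):
    """Step I: (start, end) index pairs, end exclusive, of runs of 1s that
    last at least min_intervals five-minute intervals."""
    incidents, start = [], None
    for t, flag in enumerate(list(chart) + [0]):  # trailing 0 closes an open run
        if flag == 1 and start is None:
            start = t
        elif flag != 1 and start is not None:
            if t - start >= min_intervals:
                incidents.append((start, t))
            start = None
    return incidents


def conditional_probability(neighbor_chart, incident):
    """Step II: share of the incident's intervals congested on the neighbor."""
    start, end = incident
    return sum(neighbor_chart[start:end]) / (end - start)


def adjusted_kde(terms, r, kernel):
    """Step III, Eq. (3.1): sum of (1/r) * p_is * k(adj_is / r) over the terms."""
    return sum(p * kernel(adj / r) / r for p, adj in terms)


def minimum_variance(u):
    """Eq. (2.10) written in u = adj/r; assumed here only for illustration."""
    return 3.0 / 8.0 * (3.0 - 5.0 * u ** 2) if 0 < u <= 1 else 0.0


# Hypothetical charts for the first incident: TG is congested in all six
# intervals and 1,R in four of them (which four is not stated in the text).
tg_chart = [0, 1, 1, 1, 1, 1, 1, 0]
r1_chart = [0, 0, 0, 1, 1, 1, 1, 0]
incident = detect_incidents(tg_chart)[0]
p_first = conditional_probability(r1_chart, incident)  # 4/6

# Worked example with r = 3: 1,R has adj = 1 and 2,SR has adj = 2.
density_1r = adjusted_kde([(p_first, 1), (3 / 6, 1)], r=3, kernel=minimum_variance)
density_2sr = adjusted_kde([(3 / 6, 2), (3 / 6, 2)], r=3, kernel=minimum_variance)
```

Each term passed to `adjusted_kde` corresponds to one congestion incident on TG, which mirrors the two-term sums of the worked example.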


Figure 3.4 Analysis Procedure

Figure 3.5 Example Road Network for KDE

Figure 3.6(a) $p_{is}$ Calculation Example 1 (rows: segments TG, 1,R, and 2,SR; columns: intervals ~6:50 through ~7:15; $p_{is}$ = 4/6 for 1,R and 3/6 for 2,SR)

Figure 3.6(b) $p_{is}$ Calculation Example 2 (rows: segments TG, 1,R, and 2,SR; columns: intervals ~8:25 through ~8:50; $p_{is}$ = 3/6 for 1,R and 3/6 for 2,SR)
