2. Literature Review

2.5 Analyses with GDELT in Other Geographic Contexts

To date, no existing studies have applied GDELT-based analyses to South China Sea tensions. As such, this dissertation represents a first cut at the data and aims to serve as a foundation for future research on the issue. That said, previous studies have used GDELT for analyses related to various other geographic contexts around the world. For the purposes of this dissertation, the studies that use GDELT to analyze or predict conflictive events, such as protests, violent uprisings, armed conflict, and genocides, are the most relevant in terms of purpose, data, and methodology.

77  Scott E. Page, The Model Thinker, University of Michigan, 2015, p. 36.

Two authors, Philip A. Schrodt and James E. Yonamine, have published various articles over the years covering the use of event databases, including GDELT, for prediction, and their studies serve as an indispensable resource for relevant research. In an article published before the public release of GDELT, Yonamine and Schrodt compare the two primary forms of data used in quantitative conflict studies: structural data and event data. Structural databases focus on broad structural aspects of interstate relations, tend to be manually compiled by researchers, are typically aggregated at the state-year level, and have slowly developed since the 1960s. Event databases are typically derived from news reports, contain records of specific events or stories, are coded by date and even time, and have evolved from manual compilation in the mid-1970s to automatic compilation and coding using computer algorithms since the late 1980s, allowing for the inclusion of much more fine-grained data on a theoretically limitless number of issues. They argue that structural databases have been useful for understanding broad questions about international relations and conflict but are limited by their temporal aggregation, unable to shed light on the ongoing interactions between states, and unhelpful for policymakers interested in predicting events at given times. They then discuss various event databases, including their approaches to data collection, challenges faced, and limitations, making relevant suggestions for improvement that seem to have foreshadowed the launch of GDELT.⁷⁸ A follow-up article published two years later by the same authors builds upon this work.⁷⁹

78  James E. Yonamine and Philip A. Schrodt, “A Guide to Event Data: Past, Present, and Future,” November 28, 2011, <http://jayyonamine.com/wp-content/uploads/2012/07/YonamineSchrodt_A_Guide_to_Event_Data.pdf>.

79  Philip A. Schrodt and James E. Yonamine, “A Guide to Event Data: Past, Present, and Future,” All Azimuth 2(2): 5–22, July 2013, <http://dergipark.gov.tr/download/article-file/147447>.

80  James E. Yonamine, “Working with Event Data: A Guide to Aggregation Choices,” 2013, <http://jayyonamine.com/wp-content/uploads/2013/04/Working-with-Event-Data-A-Guide-to-Aggre

Following GDELT’s initial public release, Yonamine describes, in a reference article on working with event data, the three main aggregation choices available to researchers: actors, actions, and time.⁸⁰ The analyses in this dissertation aggregate the data by time into monthly intervals, which are short enough to reflect the effects of individual events and be useful for policymakers yet long enough to include sufficient data for averaging. This dissertation also filters data by location, omitting all records not relevant to the South China Sea, and, for analyses related to RQ1, by state actor, in order to assess the relationship between state involvement and tensions.
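The monthly aggregation and the location and actor filters described above can be illustrated with a minimal sketch. The record layout used here (the keys `day`, `actors`, `goldstein`, and `in_region`) and the sample values are hypothetical and chosen only for illustration; they are not GDELT's actual schema, and the code is not the pipeline used in this dissertation.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean_goldstein(events, actor=None):
    """Average Goldstein values per calendar month.

    events: iterable of dicts with hypothetical keys
        'day' (datetime.date), 'actors' (set of country codes),
        'goldstein' (float), 'in_region' (bool).
    actor: optional country code; if given, keep only events involving it.
    """
    buckets = defaultdict(list)
    for ev in events:
        if not ev["in_region"]:
            continue  # location filter: drop records outside the region
        if actor is not None and actor not in ev["actors"]:
            continue  # actor filter: keep only the requested state
        buckets[(ev["day"].year, ev["day"].month)].append(ev["goldstein"])
    # One averaged value per (year, month), in chronological order
    return {month: mean(vals) for month, vals in sorted(buckets.items())}


# Invented sample records, for illustration only
sample = [
    {"day": date(2014, 5, 2), "actors": {"CHN", "VNM"}, "goldstein": -5.0, "in_region": True},
    {"day": date(2014, 5, 20), "actors": {"CHN", "PHL"}, "goldstein": -1.0, "in_region": True},
    {"day": date(2014, 5, 21), "actors": {"USA", "JPN"}, "goldstein": 3.0, "in_region": False},
    {"day": date(2014, 6, 3), "actors": {"VNM", "PHL"}, "goldstein": 2.0, "in_region": True},
]

all_states = monthly_mean_goldstein(sample)
china_only = monthly_mean_goldstein(sample, actor="CHN")
```

Calling the function without `actor` corresponds to the region-wide series, while passing a country code corresponds to the state-specific series of the kind used for RQ1.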

In the few years since its public release, prediction of conflictive events has been among the most common themes in studies using GDELT. Many of these have focused on predicting protests, violent uprisings, armed conflict, and genocides in Africa and the Middle East. In an early study using GDELT in 2013, Yonamine uses an autoregressive fractionally integrated moving average (ARFIMA) model to forecast levels of violence in districts in Afghanistan. In the study, he also compares existing event databases based on five attributes (broad spatial coverage, density, geocoding, accuracy, and future availability in real time), suggesting that GDELT is the first to satisfy all of the criteria of an ideal dataset.⁸¹ In other research, Yonamine has also used GDELT data to examine two further questions. The first is the effect of violence against Israel on the Tel Aviv stock exchange, where he finds that the two are not significantly correlated but that the conflictive events do affect certain companies included in the exchange. The second is the effect of civil war on interstate war, where he finds that domestic conflicts increase the likelihood of a state becoming involved in interstate conflicts with its neighbors.⁸²

81  James E. Yonamine, “Predicting Future Levels of Violence in Afghanistan Districts using GDELT,” April 2013, p. 2, <http://jayyonamine.com/wp-content/uploads/2013/04/Predicting-Future-Levels-of-Violence-in-Afghanistan-Districts-using-GDELT.pdf>.

82  James E. Yonamine, A Nuanced Study of Political Conflict Using the Global Datasets of Events Location and Tone (GDELT) Dataset, Pennsylvania State University, August 2013, <https://etda.libraries.psu.edu/catalog/18659>.

In another early study from 2013, Arva et al. compare the Integrated Conflict Early Warning System (ICEWS) database, a US government project that was capable of forecasting conflictive events but later became classified, with the GDELT 1.0 Event Database in terms of forecast accuracy using various statistical models. They find that GDELT “performs as well or better than the data in the original ICEWS”, suggesting that it is likely the best publicly available global dataset for making predictions related to conflictive events.⁸³ Moreover, they conclude that GDELT’s “firehose” approach to data collection and inclusion in the resulting dataset could make it less suitable for monitoring but may actually be to its benefit for the purposes of statistical forecasting, as is the focus of the analyses for RQ2 in this dissertation.⁸⁴

Brandt, Freeman, Lin, and Schrodt look at the effect of training sets of different lengths on forecasting with GDELT data. Taking cross-strait relations as a case study, they find that shorter training sets may be as effective as longer ones.⁸⁵ They also suggest that, at certain points in event data, there may be a “clear change in the dynamics of the data,” so inclusion of all available historical data may not be necessary or even desirable.⁸⁶ For the purposes of this dissertation, their conclusions are significant because they demonstrate that it is not necessary to use the entire dataset dating back to 1979 to achieve meaningful results.
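The intuition behind the training-set finding above can be shown with a toy example. The sketch below is not the method of Brandt et al.; it assumes a deliberately simple forecaster (the mean of the last `window` observations) and an invented series with a single abrupt shift, and it compares one-step-ahead errors for a short and a long training window.

```python
from statistics import mean


def window_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return mean(history[-window:])


def rolling_mae(series, window, start):
    """One-step-ahead mean absolute error, evaluated from index `start` onward."""
    errors = [
        abs(window_forecast(series[:t], window) - series[t])
        for t in range(start, len(series))
    ]
    return mean(errors)


# Invented series with a change in dynamics halfway through
series = [0.0] * 12 + [4.0] * 12

short_mae = rolling_mae(series, window=3, start=12)   # short training window
long_mae = rolling_mae(series, window=12, start=12)   # long training window
```

In this constructed case the short window adapts to the new level within a few periods, whereas the long window keeps averaging in observations from before the shift, so its error stays higher for longer. Real event data will not be this clean, which is why the comparison has to be made empirically.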

Abb and Strüver, like this study, draw upon Goldstein values from the GDELT 1.0 Event Database, using them as a measure of the “quality of relations” between two countries. It should be noted that this term is simply the inverse of “tensions” as used in this study, so it essentially measures tensions between two countries, one of which is always China in their article.

83  Bryan Arva, John Beieler, Ben Fisher, Gustavo Lara, Philip A. Schrodt, Wonjun Song, Marsha Sowell, and Sam Stehle, “Improving Forecasts of International Events of Interest,” European Political Science Association 2013 Annual General Conference, July 3, 2013, <https://ssrn.com/abstract=2225130>.

84  Bryan Arva, John Beieler, Ben Fisher, Gustavo Lara, Philip A. Schrodt, Wonjun Song, Marsha Sowell, and Sam Stehle, “Improving Forecasts of International Events of Interest,” European Political Science Association 2013 Annual General Conference, July 3, 2013, p. 57, <https://ssrn.com/abstract=2225130>.

85  Patrick T. Brandt, John R. Freeman, Tse-min Lin, and Philip A. Schrodt, “Forecasting Conflict in the Cross-Straits: Long Term and Short Term Predictions,” Annual Meeting of the American Political Science Association, September 4, 2013, <http://www.utdallas.edu/~pbrandt/Patrick_Brandts_Website/Research_files/ForecastWindows.pdf>.

86  Patrick T. Brandt, John R. Freeman, Tse-min Lin, and Philip A. Schrodt, “Forecasting Conflict in the Cross-Straits: Long Term and Short Term Predictions,” Annual Meeting of the American Political Science Association, September 4, 2013, p. 3, <http://www.utdallas.edu/~pbrandt/Patrick_Brandts_Website/Research_files/ForecastWindows.pdf>.

Whereas the analyses in this dissertation use monthly averages, their data are aggregated into yearly averages in order to match the time interval frequency of their other variables,⁸⁷ including their dependent variable, “global policy alignment”, which is derived from United Nations General Assembly voting records.⁸⁸ They confirm the validity of the Goldstein values data by manually comparing the levels of conflict/cooperation in each time period with real-world events and relevant qualitative literature,⁸⁹ as is done for tensions data in {3.2.2 Linking GDELT 1.0 Event Database Data to Real World Events} and {3.2.5 Linking GDELT 2.0 GKG Data to Real World Events} of this dissertation. Using these data, they conclude that the quality of bilateral relations between China and Southeast Asian countries is strongly correlated with policy alignment with China at the global level.⁹⁰

Davis, Fuchs, and Johnson also incorporate data from the GDELT 1.0 Event Database into their analysis of the effect of bilateral political relations on bilateral trade. As in this dissertation, they use Goldstein values as a measure of “tensions”, noting that the measure “captures the likelihood that the event will impact on the stability of the country”.⁹¹ Furthermore, like this dissertation as well as Abb and Strüver, they manually link real-world events to visible changes in the tensions data to confirm its validity.⁹²

87  Pascal Abb and Georg Strüver, “Regional Linkages and Global Policy Alignment: The Case of China–Southeast Asia Relations,” Issues & Studies 51(4): 33–83, December 2015, p. 54.

88  Pascal Abb and Georg Strüver, “Regional Linkages and Global Policy Alignment: The Case of China–Southeast Asia Relations,” Issues & Studies 51(4): 33–83, December 2015, pp. 49–50.

89  Pascal Abb and Georg Strüver, “Regional Linkages and Global Policy Alignment: The Case of China–Southeast Asia Relations,” Issues & Studies 51(4): 33–83, December 2015, pp. 56–57.

90  Pascal Abb and Georg Strüver, “Regional Linkages and Global Policy Alignment: The Case of China–Southeast Asia Relations,” Issues & Studies 51(4): 33–83, December 2015.

91  Christina Davis, Andreas Fuchs, and Kristina Johnson, “State Control and the Effects of Foreign Relations on Bilateral Trade,” University of Heidelberg Department of Economics Discussion Paper Series 576, November 2014, pp. 21–22, <http://archiv.ub.uni-heidelberg.de/volltextserver/17673/1/davis_fuchs_johnson_2014_dp576.pdf>.

92  Christina Davis, Andreas Fuchs, and Kristina Johnson, “State Control and the Effects of Foreign Relations on Bilateral Trade,” University of Heidelberg Department of Economics Discussion Paper Series 576, November 2014, p. 22, <http://archiv.ub.uni-heidelberg.de/volltextserver/17673/1/davis_fuchs_johnson_2014_dp576.pdf>.

Various other studies related to interstate relations and conflict in other geographic contexts have been conducted using GDELT data. Morgan and Reiter, for example, analyze factors affecting government funding allocation for roads in India, one of which is georeferenced violent events from GDELT.⁹³ Although the variables and approaches used in such studies are not directly applicable to those of this dissertation, they are worth mentioning as relevant examples of how GDELT data have been used in analyses of international relations in other geographic contexts.

To date, no studies have attempted to use GDELT data to analyze the relationship between state involvement and tensions in the South China Sea, explore historical levels of tensions in the maritime area, or predict the escalation and de-escalation of tensions there into the future. The analyses of this dissertation aim to change that. Using two distinct databases and two ways of measuring tensions, and aggregating tensions into monthly averages, they first assess and provide visualizations of historical tensions based on observed data related to the South China Sea. Then, for RQ1, they explore the relationship between state involvement and tensions for eleven countries using two different interpretations of state involvement. For RQ2, predictions are made using four benchmark models and four forecast models for past and future tensions in each time period. These models are then compared based on their respective forecast accuracies to determine their relative performance at predicting South China Sea tensions over time.
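The comparison of benchmark and forecast models by forecast accuracy can be sketched as follows. The specific models and the error metric are not named in this section, so everything here is an assumption made for illustration: a last-value benchmark, a historical-mean benchmark, simple exponential smoothing as a stand-in forecast model, and mean absolute error as the accuracy measure.

```python
from statistics import mean


def naive(history):
    """Benchmark (assumed): repeat the most recent observation."""
    return history[-1]


def historical_mean(history):
    """Benchmark (assumed): mean of all observations so far."""
    return mean(history)


def exp_smoothing(history, alpha=0.5):
    """Stand-in forecast model (assumed): simple exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def rolling_mae(series, model, min_train=3):
    """One-step-ahead MAE, refitting on all data available before each period."""
    errors = [
        abs(model(series[:t]) - series[t])
        for t in range(min_train, len(series))
    ]
    return mean(errors)


def rank_models(series, models, min_train=3):
    """Return (name, MAE) pairs sorted from most to least accurate."""
    scores = {name: rolling_mae(series, fn, min_train) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Invented monthly tension series, for illustration only
tensions = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ranking = rank_models(
    tensions,
    {"naive": naive, "historical_mean": historical_mean, "exp_smoothing": exp_smoothing},
)
```

Evaluating every model on the same periods with the same metric is what makes the ranking meaningful; a forecast model is only informative if it beats the simple benchmarks under these identical conditions.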

Regardless of outcome, the results will serve as an important contribution to discussions of the future of maritime territorial disputes in that they will be a first attempt to apply these relatively recent methodological approaches to South China Sea regional relations. Moreover, they will either support or refute the many claims and analyses arguing that certain states are responsible for heightened tensions, that South China Sea tensions will increase or decrease, or that the maritime area is primed as a flashpoint for armed conflict, claims that invariably lack empirical backing and simply rely on common-sense assumptions and incomplete evidence.

93  Richard Morgan and Dan Reiter, “How War Makes the State: Insurgency, External Threat, and Road Construction in India,” revised version of paper from 2013 Annual Meeting of the American Political Science Association, October 17, 2013, <http://www.cidcm.umd.edu/workshop/papers/reiter.pdf>.
