3. Data and Methodology
3.5. Methodologies
National Chengchi University (國立政治大學)
Twitter Archive website (http://www.trumptwitterarchive.com/archive), contained the date and content. In the 60-month period of study, Trump wrote 266 tweets related to Mexico.
We determined a sentiment for each tweet, evaluated according to Trump’s sentiment towards Mexico. Each tweet was assigned one of three labels (positive, neutral, or negative) according to the tone in which it was worded. The monthly tweet sentiment was then calculated by assigning each label a score (positive: +1, neutral: 0, negative: -1) and summing the scores of all Mexico-related tweets in that month. It is represented by TRUMP in our model.
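The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `monthly_sentiment`, the date format, and the example tweets are assumptions made for the sketch and do not come from the data set used in this research.

```python
from collections import defaultdict

# Score assigned to each sentiment label, as defined in the text.
SCORES = {"positive": 1, "neutral": 0, "negative": -1}

def monthly_sentiment(labelled_tweets):
    """Sum the sentiment scores of all tweets in each month.

    labelled_tweets: iterable of (date string "YYYY-MM-DD", label) pairs.
    Returns a dict mapping "YYYY-MM" to the summed score (the TRUMP value).
    """
    totals = defaultdict(int)
    for date, label in labelled_tweets:
        month = date[:7]  # keep only "YYYY-MM"
        totals[month] += SCORES[label]
    return dict(totals)

# Hypothetical labelled tweets, for illustration only.
example = [
    ("2017-01-05", "negative"),
    ("2017-01-20", "negative"),
    ("2017-01-27", "positive"),
    ("2017-02-03", "neutral"),
]
trump = monthly_sentiment(example)
print(trump)
```

In this sketch a month with no Mexico-related tweets simply does not appear in the result; in a monthly series it would presumably be recorded as 0, the sum over an empty set of tweets.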
3.5. Methodologies
3.5.1. Multiple Linear Regression Model
The first approach this research uses is multiple linear regression analysis, which aims to measure the impact of different factors on the S&P/BMV IPC. The multiple regression equation is:
$\mathrm{ReIPC} = \alpha + \beta_1 \mathrm{ReOIL} + \beta_2 \mathrm{ReSP500} + \beta_3 \mathrm{ReEX} + \beta_4 \mathrm{TRUMP} + \varepsilon$
Where:
ReIPC: Return on S&P/BMV IPC.
$\alpha$: Constant coefficient.
$\beta_1$ - $\beta_4$: Regression coefficient of each independent variable.
ReOIL: Return on the international price of oil.
ReSP500: Return on the S&P 500 index.
ReEX: Return (change) in the USD/MXN exchange rate.
TRUMP: Sentiment of Donald Trump’s tweets towards Mexico.
$\varepsilon$: Error term.
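As an illustration of how the constant and the four regression coefficients in the equation above could be estimated, the following is a minimal ordinary least squares sketch in NumPy. The function name `fit_ols` is hypothetical and the data are synthetic; this is not the software or the data used in this research.

```python
import numpy as np

def fit_ols(y, X):
    """Estimate y = alpha + X @ beta + error by ordinary least squares.

    Returns (alpha, beta), where beta has one entry per column of X.
    """
    design = np.column_stack([np.ones(len(y)), X])  # prepend the constant
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic data for illustration only: 60 months and 4 regressors in the
# order ReOIL, ReSP500, ReEX, TRUMP. None of these numbers come from the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
true_alpha = 0.5
true_beta = np.array([0.2, 0.8, -0.4, 0.1])
y = true_alpha + X @ true_beta  # noise-free, so OLS recovers the true values

alpha, beta = fit_ols(y, X)
print(alpha, beta)
```

Because the synthetic outcome is generated without noise, the estimates coincide with the chosen values up to floating-point error, which makes the sketch easy to check.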
3.5.2. Linear Regression Assumptions
Multiple linear regression makes several key assumptions:
1. Linearity: The predictor variables in the regression have a straight-line relationship with the outcome variable. Scatterplots can show whether the relationship is linear or curvilinear. Linearity can also be assessed indirectly through the normality and homoscedasticity assumptions, for example by checking whether the residuals are normally distributed.
2. Normality: Multiple regression assumes that the residuals are normally distributed.
3. No Multicollinearity: Multiple regression assumes that the independent variables are not highly correlated with each other. This assumption is tested using variance inflation factor (VIF) values. VIF values under 5 indicate that the variables are not highly correlated.
4. Homoscedasticity: This assumption states that the variance of the error terms is similar across the values of the independent variables. A plot of standardized residuals versus predicted values can show whether the points are equally distributed across all values of the independent variables. When the scatter plot shows no pattern, the homoscedasticity assumption is met.
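The VIF check in assumption 3 can be illustrated with a short NumPy sketch. It uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing the j-th predictor on the remaining ones. The function name `vif` and the synthetic regressors are assumptions made for the example.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on the remaining columns plus a constant.
    """
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Synthetic regressors for illustration only.
rng = np.random.default_rng(1)
independent = rng.normal(size=(200, 3))
# A fourth column that nearly duplicates the first one.
collinear = independent[:, 0] + 0.05 * rng.normal(size=200)
X = np.column_stack([independent, collinear])

values = vif(X)
flagged = [int(j) for j in np.where(values > 5)[0]]
print(np.round(values, 2))
print("Columns above the threshold of 5:", flagged)
```

In this example the first and fourth columns are flagged because one is almost a copy of the other, while the two unrelated columns stay close to 1.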
3.5.3. Hypothesis Testing MLR
Ho1: $\beta_1 = 0$; There is no influence of the international price of oil on the S&P/BMV IPC.
Ha1: $\beta_1 \neq 0$; There is influence of the international price of oil on the S&P/BMV IPC.
Ho2: $\beta_2 = 0$; There is no influence of the S&P 500 index on the S&P/BMV IPC.
Ha2: $\beta_2 \neq 0$; There is influence of the S&P 500 index on the S&P/BMV IPC.
Ho3: $\beta_3 = 0$; There is no influence of the USD/MXN exchange rate on the S&P/BMV IPC.
Ha3: $\beta_3 \neq 0$; There is influence of the USD/MXN exchange rate on the S&P/BMV IPC.
Ho4: $\beta_4 = 0$; There is no influence of Trump’s tweets about Mexico on the S&P/BMV IPC.
Ha4: $\beta_4 \neq 0$; There is influence of Trump’s tweets about Mexico on the S&P/BMV IPC.
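Each pair of hypotheses above is a two-sided test on a single regression coefficient, which is commonly carried out with a t-statistic (the estimate divided by its standard error). The sketch below shows the calculation in NumPy on synthetic data. The function name `ols_t_statistics`, the generic regressor names, and the rounded critical value of 2.0 are assumptions made for illustration.

```python
import numpy as np

def ols_t_statistics(y, X):
    """Return OLS coefficients and their t-statistics (constant first)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    dof = n - design.shape[1]              # residual degrees of freedom
    sigma2 = residuals @ residuals / dof   # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    std_err = np.sqrt(np.diag(cov))
    return coef, coef / std_err

# Synthetic data for illustration only. The second regressor has a true
# coefficient of zero, so its null hypothesis holds by construction.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.0 * X[:, 1] + rng.normal(scale=0.5, size=60)

coef, t_stats = ols_t_statistics(y, X)

# Assumption: rounded two-sided 5% critical value for about 57 degrees of freedom.
CRITICAL_T = 2.0
for name, t in zip(["constant", "x1", "x2"], t_stats):
    decision = "reject H0" if abs(t) > CRITICAL_T else "fail to reject H0"
    print(f"{name}: t = {t:.2f} -> {decision}")
```

In practice the decision would normally be based on the exact p-value reported by the statistical package rather than on a rounded critical value.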
3.5.4. ARDL Bounds Cointegration Test
To test the relationship among the variables and measure the impact of different factors on the S&P/BMV IPC, this research also uses ARDL analysis and compares its findings with those of the MLR analysis. Cointegration analysis is used to determine whether the variables are indeed related in the long run. The ARDL approach has several advantages:
1. The ARDL model contains the lagged value(s) of the dependent variable and the current and lagged values of the regressors as explanatory variables.
2. It uses a combination of endogenous and exogenous variables. Explanatory variables are also used, in turn, as dependent variables.
3. Unlike other methods of testing cointegration, the ARDL bounds test can be used irrespective of whether the variables of the study are I(0), I(1), or a combination of both. However, no series may be I(2); otherwise the ARDL bounds test cannot be used.
4. Once the lag order of the model is identified, cointegration can be tested using the bounds test procedure, estimated by ordinary least squares (OLS). If the bounds test indicates that the variables are cointegrated, both the short-run (ARDL) and long-run (VECM) models must be specified. If they are not cointegrated, only the short-run (ARDL) model needs to be specified.
5. This model performs better in small samples and avoids the problem of weak power in modelling the cointegrating relationship (Romilly et al., 2001; Pesaran, 1997).
First, unit root tests (ADF and Phillips-Perron) are applied to each series in levels and in first differences to check the stationarity of the variables and to ensure that none of them is integrated of order two, I(2). In the second step, we specify the ARDL model, selecting the lag order with the Akaike information criterion (AIC). This is followed by bounds testing to check for a cointegrating relationship between the dependent and the explanatory variables.
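The first step, checking each series in levels and in first differences, can be illustrated with a simplified unit root test. The sketch below implements only the basic Dickey-Fuller regression with a constant. The augmented (ADF) and Phillips-Perron tests named above add further corrections, so this is a teaching sketch rather than the procedure used in this research. The function name `dickey_fuller_t`, the critical value, and the synthetic series are assumptions.

```python
import numpy as np

def dickey_fuller_t(series):
    """t-statistic on rho in: diff(y)_t = a + rho * y_{t-1} + e_t.

    This is the basic Dickey-Fuller regression with a constant and no
    lagged differences (the augmented test adds those).
    """
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    design = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(design, dy, rcond=None)
    residuals = dy - design @ coef
    sigma2 = residuals @ residuals / (len(dy) - 2)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef[1] / np.sqrt(cov[1, 1])

# Assumption: approximate asymptotic 5% critical value for the
# constant-only case; finite-sample values are slightly more negative.
CRITICAL_5PCT = -2.86

# Synthetic random walk for illustration only: I(1) by construction.
rng = np.random.default_rng(3)
level = np.cumsum(rng.normal(size=300))

t_level = dickey_fuller_t(level)
t_diff = dickey_fuller_t(np.diff(level))
print(f"level: t = {t_level:.2f}, first difference: t = {t_diff:.2f}")
print("first difference stationary:", bool(t_diff < CRITICAL_5PCT))
```

A series whose level does not reject the unit root but whose first difference does is treated as I(1). If even the first difference failed to reject, the series could be I(2) and the bounds test would not be applicable.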
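The regressions are estimated as follows:
$\Delta \mathrm{IPC}_t = \delta_0 + \delta_1 \mathrm{IPC}_{t-1} + \delta_2 \mathrm{OIL}_{t-1} + \delta_3 \mathrm{SP500}_{t-1} + \delta_4 \mathrm{EX}_{t-1} + \delta_5 \mathrm{TRUMP}_{t-1} + \sum_{i=1}^{n} \alpha_{6,i} \Delta \mathrm{IPC}_{t-i} + \sum_{i=1}^{n} \alpha_{7,i} \Delta \mathrm{OIL}_{t-i} + \sum_{i=1}^{n} \alpha_{8,i} \Delta \mathrm{SP500}_{t-i} + \sum_{i=1}^{n} \alpha_{9,i} \Delta \mathrm{EX}_{t-i} + \sum_{i=1}^{n} \alpha_{10,i} \Delta \mathrm{TRUMP}_{t-i} + \varepsilon_{1t}$
$\Delta \mathrm{OIL}_t = \delta_0 + \delta_1 \mathrm{IPC}_{t-1} + \delta_2 \mathrm{OIL}_{t-1} + \delta_3 \mathrm{SP500}_{t-1} + \delta_4 \mathrm{EX}_{t-1} + \delta_5 \mathrm{TRUMP}_{t-1} + \sum_{i=1}^{n} \beta_{6,i} \Delta \mathrm{IPC}_{t-i} + \sum_{i=1}^{n} \beta_{7,i} \Delta \mathrm{OIL}_{t-i} + \sum_{i=1}^{n} \beta_{8,i} \Delta \mathrm{SP500}_{t-i} + \sum_{i=1}^{n} \beta_{9,i} \Delta \mathrm{EX}_{t-i} + \sum_{i=1}^{n} \beta_{10,i} \Delta \mathrm{TRUMP}_{t-i} + \varepsilon_{2t}$
$\Delta \mathrm{SP500}_t = \delta_0 + \delta_1 \mathrm{IPC}_{t-1} + \delta_2 \mathrm{OIL}_{t-1} + \delta_3 \mathrm{SP500}_{t-1} + \delta_4 \mathrm{EX}_{t-1} + \delta_5 \mathrm{TRUMP}_{t-1} + \sum_{i=1}^{n} \gamma_{6,i} \Delta \mathrm{IPC}_{t-i} + \sum_{i=1}^{n} \gamma_{7,i} \Delta \mathrm{OIL}_{t-i} + \sum_{i=1}^{n} \gamma_{8,i} \Delta \mathrm{SP500}_{t-i} + \sum_{i=1}^{n} \gamma_{9,i} \Delta \mathrm{EX}_{t-i} + \sum_{i=1}^{n} \gamma_{10,i} \Delta \mathrm{TRUMP}_{t-i} + \varepsilon_{4t}$
$\Delta \mathrm{EX}_t = \delta_0 + \delta_1 \mathrm{IPC}_{t-1} + \delta_2 \mathrm{OIL}_{t-1} + \delta_3 \mathrm{SP500}_{t-1} + \delta_4 \mathrm{EX}_{t-1} + \delta_5 \mathrm{TRUMP}_{t-1} + \sum_{i=1}^{n} \mu_{6,i} \Delta \mathrm{IPC}_{t-i} + \sum_{i=1}^{n} \mu_{7,i} \Delta \mathrm{OIL}_{t-i} + \sum_{i=1}^{n} \mu_{8,i} \Delta \mathrm{SP500}_{t-i} + \sum_{i=1}^{n} \mu_{9,i} \Delta \mathrm{EX}_{t-i} + \sum_{i=1}^{n} \mu_{10,i} \Delta \mathrm{TRUMP}_{t-i} + \varepsilon_{3t}$
Where:
IPC: S&P/BMV IPC value
$\Delta \mathrm{IPC}$: Difference of S&P/BMV IPC value
OIL: International price of oil
$\Delta \mathrm{OIL}$: Difference in international price of oil
SP500: S&P 500 value
$\Delta \mathrm{SP500}$: Difference of S&P 500 value
EX: Exchange rate USD/MXN
$\Delta \mathrm{EX}$: Difference of exchange rate USD/MXN
TRUMP: Sentiment of Donald Trump’s tweets towards Mexico
$\Delta \mathrm{TRUMP}$: Difference of sentiment of Donald Trump’s tweets towards Mexico
$\varepsilon$: Error term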
The sentiment of Trump’s tweets is treated as an exogenous variable: it does not respond to the return of IPC, the international oil price, the return of S&P 500, or the USD/MXN exchange rate. For this reason no equation is estimated with TRUMP as the dependent variable.
3.5.5. Diagnostic Tests ARDL
Diagnostic tests for serial correlation, normality, and heteroscedasticity are conducted to check the goodness of fit of the model. The cumulative sum (CUSUM) and cumulative sum of squares (CUSUMSQ) stability tests were applied; the test statistics lie within the 5% critical bounds, indicating that the regression equation is stable.
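3.5.6. Hypothesis Testing ARDL
In each equation, $\delta_1$ to $\delta_5$ denote the coefficients of the lagged level variables.
IPC Equation
Ho1: $\delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = 0$; There is no long run relationship among the variables in the equation.
Ha1: $\delta_1 \neq \delta_2 \neq \delta_3 \neq \delta_4 \neq \delta_5 \neq 0$; There is long run relationship among the variables in the equation.
OIL Equation
Ho1: $\delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = 0$; There is no long run relationship among the variables in the equation.
Ha1: $\delta_1 \neq \delta_2 \neq \delta_3 \neq \delta_4 \neq \delta_5 \neq 0$; There is long run relationship among the variables in the equation.
SP500 Equation
Ho1: $\delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = 0$; There is no long run relationship among the variables in the equation.
Ha1: $\delta_1 \neq \delta_2 \neq \delta_3 \neq \delta_4 \neq \delta_5 \neq 0$; There is long run relationship among the variables in the equation.
EX Equation
Ho1: $\delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = 0$; There is no long run relationship among the variables in the equation.
Ha1: $\delta_1 \neq \delta_2 \neq \delta_3 \neq \delta_4 \neq \delta_5 \neq 0$; There is long run relationship among the variables in the equation.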
To test these hypotheses, a Wald F-test is used. It tests the joint significance of the lagged level variables in each equation and yields an F-statistic that is compared with lower and upper critical values. Evidence of cointegration is found when the F-statistic is above the upper critical value; when it is below the lower critical value, the null hypothesis of no long run relationship cannot be rejected. If the F-statistic lies between the lower and upper bounds, the result is inconclusive. Once cointegration is established among the variables, the cointegrating equation for IPC is estimated using the long-run error, also known as the error correction term in the error correction model.
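The decision rule of the bounds test, and the F-statistic it relies on, can be sketched as follows. The F-statistic is computed in the usual way by comparing the residual sum of squares of the unrestricted regression with that of the regression that drops the lagged level terms. The function names, the synthetic data, and the bounds of 3.0 and 4.0 are placeholders chosen for illustration; in an actual application the bounds come from the tabulated critical values for the relevant number of regressors and deterministic terms.

```python
import numpy as np

def bounds_decision(f_stat, lower, upper):
    """Apply the decision rule of the bounds test to an F-statistic."""
    if f_stat > upper:
        return "cointegration"
    if f_stat < lower:
        return "no cointegration"
    return "inconclusive"

def joint_f_statistic(y, X_unrestricted, restricted_columns):
    """F-statistic for H0: the coefficients on the dropped columns are zero.

    restricted_columns lists the column indices of X_unrestricted that are
    kept under H0 (for the bounds test: everything except the lagged levels).
    """
    def ssr(X):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ coef
        return r @ r

    ssr_u = ssr(X_unrestricted)
    ssr_r = ssr(X_unrestricted[:, restricted_columns])
    q = X_unrestricted.shape[1] - len(restricted_columns)  # number of restrictions
    dof = len(y) - X_unrestricted.shape[1]
    return ((ssr_r - ssr_u) / q) / (ssr_u / dof)

# Synthetic stand-ins for illustration only (not the study's variables).
rng = np.random.default_rng(4)
n = 100
levels = rng.normal(size=(n, 2))        # stand-ins for lagged level terms
lagged_diff = rng.normal(size=(n, 1))   # stand-in for a lagged difference
X = np.column_stack([np.ones(n), levels, lagged_diff])
y = 0.1 - 0.5 * levels[:, 0] + 0.3 * levels[:, 1] + rng.normal(scale=0.5, size=n)

# Keep the constant (column 0) and the lagged difference (column 3) under H0.
f_stat = joint_f_statistic(y, X, restricted_columns=[0, 3])

LOWER, UPPER = 3.0, 4.0  # placeholder bounds, not taken from any table
print(f"F = {f_stat:.2f}:", bounds_decision(f_stat, LOWER, UPPER))
```

Keeping the decision rule separate from the F-statistic makes it straightforward to apply the same rule to each of the four equations with its own statistic.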