Daily VaR - An Empirical Study of Value-at-Risk Methods: The Case of a Life Insurance Company

CHAPTER 3: DATA DESCRIPTION

3.2 Daily VaR

National Chengchi University (國立政治大學)

Figure 2. Daily P&L Distributions

Notes: Histogram of the daily profit and loss (P&L) data reported by the insurance company from 1 January 2011 to 27 February 2014. Data are de-meaned and divided by their standard deviation.

3.2 DAILY VAR

The daily VaR estimates are generated by the insurance company for the purpose of forecast evaluation (back-testing) and are required by regulation to be calculated with the same risk model used for internal measurement of trading risk. Generally, the VaRs are for a one-day-ahead horizon and a 99% confidence level for losses.

However, this paper also tests a 95% confidence level VaR, because there are only a few violations of the 99% VaR. For statistical reasons, we choose the 95% confidence level VaR to obtain more observations. In our case, the insurance company's internal model with a VaR confidence level of 99% has only four exceptions during the sample period, whereas with the 95% VaR the internal model has 19 exceptions.

At the 95% and 99% confidence levels, P&L would be expected to exceed VaR roughly 38 and 7 times in 777 trading days (5% and 1% of the sample). However, the observed numbers of violations are only 19 and 4. In this sense, the internal VaR forecasts turn out to be conservative.
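The expected counts follow from multiplying the tail probability by the number of trading days. A minimal Python sketch of that arithmetic, using only the figures quoted in this section (the variable names are illustrative, not the author's):

```python
# Expected versus observed VaR violations for the sample described in Section 3.2.
T = 777                           # trading days in the sample (from the text)
observed = {0.95: 19, 0.99: 4}    # violations reported for the internal model

# Expected number of violations = tail probability x number of days
expected = {c: (1 - c) * T for c in observed}

for c in sorted(observed):
    ratio = observed[c] / expected[c]
    print(f"{c:.0%} VaR: expected {expected[c]:.2f}, observed {observed[c]}, ratio {ratio:.2f}")
```

The unrounded expectations are 38.85 and 7.77, which the text truncates to 38 and 7; in both cases the observed count is roughly half of what correct coverage would imply.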

We can examine this phenomenon further by looking at the mean violation in Table 3. Column 4 shows that the mean violations of the 95% and 99% VaR are more than one and two standard deviations beyond the VaR, respectively. To get a sense of the size of these violations, we take the Normal distribution as a benchmark. Under a Normal distribution, the probability of a loss just one standard deviation beyond a 99% VaR is 0.04%, and the probability of a loss two standard deviations beyond the 99% VaR is virtually zero.
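The Normal benchmark can be checked with the Python standard library. The sketch below assumes a standardized P&L (mean zero, unit standard deviation), which matches how the data are reported here; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()           # mean 0, standard deviation 1
var_99 = std_normal.inv_cdf(0.01)   # 99% VaR of a standard Normal P&L, about -2.33

# Probability of a loss k standard deviations beyond the 99% VaR
p_one_sd = std_normal.cdf(var_99 - 1.0)
p_two_sd = std_normal.cdf(var_99 - 2.0)

print(f"1 s.d. beyond the 99% VaR: {p_one_sd:.4%}")
print(f"2 s.d. beyond the 99% VaR: {p_two_sd:.6%}")
```

Under these assumptions the first probability is a few hundredths of a percent and the second is several orders of magnitude smaller, which is why violations of the size seen in Table 3 are hard to reconcile with a Normal tail.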

With that in mind, while violations of VaR are infrequent, their magnitudes can be surprisingly large. In Figure 3, we present the time series of the insurance company's P&L and the corresponding one-day-ahead 95^th and 99^th percentile VaR forecasts (expressed in terms of the standard deviation of the insurance company's P&L). This plot tends to confirm that the VaR forecasts are conservative: violations of VaR are relatively few but large.

Table 3. Daily VaR Summary Statistics

Confidence Interval | Mean VaR | Number of Violations | Mean Violation
95% | -2.07 | 19 | -1.03
99% | -3.10 | 4 | -2.78

Notes: Daily VaR data are divided by the sample standard deviation to protect confidentiality. Mean violation refers to the loss in excess of the VaR.


Figure 3. Internal Daily 95% & 99% VaR Models and Actual P&L

Notes: The upper panel shows the model used to forecast the one-day-ahead 95^th percentile of P&L. The lower panel shows the model used to forecast the one-day-ahead 99^th percentile of P&L. Daily P&L is plotted with dotted lines, and VaR with solid lines. Data are expressed in standard deviations.


Figure 4. Violations of Internal 95% & 99% VaR Models

Note: The upper plot shows the daily P&L on days when P&L drops below the forecast 95^th percentile of the internal model, and the lower plot shows the same for the 99^th percentile. Data are expressed in standard deviations.

Chapter 4: Research Method

4.1 VALUE-AT-RISK (VAR)

Jorion (2001) mentioned two ways of computing VaR. They are nonparametric and parametric VaR. In this paper, we use the parametric method to compute VaR.

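We pick a normal distribution to fit the data. First, we need to translate the general distribution $f(w)$ into a standard normal distribution $\Phi(\epsilon)$, where $\epsilon$ has mean zero and standard deviation of unity. We associate $W^*$ with the cutoff return $R^*$ such that $W^* = W_0(1 + R^*)$, where $W_0$ is the initial portfolio value. Generally, $R^*$ is negative and can be written as $-|R^*|$. Further, we can associate $R^*$ with a standard normal deviate $\alpha > 0$ by setting

$$-\alpha = \frac{-|R^*| - \mu}{\sigma} \tag{1-1}$$

It is equivalent to set

$$1 - c = \int_{-\infty}^{W^*} f(w)\,dw = \int_{-\infty}^{-|R^*|} f(r)\,dr = \int_{-\infty}^{-\alpha} \Phi(\epsilon)\,d\epsilon \tag{1-2}$$

Thus the problem of finding VaR is equivalent to finding the deviate $\alpha$ such that the area to the left of $-\alpha$ is equal to $1 - c$. For a defined probability $p$, the deviate $d$ can be found from a table of the cumulative standard normal distribution function, that is,

$$p = N(d) = \int_{-\infty}^{d} \Phi(\epsilon)\,d\epsilon \tag{1-3}$$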

We then retrace our steps, back from the $\alpha$ we just found to the cutoff return $R^*$ and VaR. From equation (1-1), the cutoff return is

$$R^* = -\alpha\sigma + \mu \tag{1-4}$$

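For more generality, assume now that the parameters $\mu$ and $\sigma$ are expressed on an annual basis. The time interval considered is $\Delta t$, in years. We find the VaR relative to the mean as

$$\mathrm{VaR(mean)} = W_0\,\alpha\,\sigma\sqrt{\Delta t} \tag{1-5}$$

In other words, the VaR figure is simply a multiple of the standard deviation of the distribution times an adjustment factor that is related directly to the confidence level and horizon. When VaR is defined as an absolute dollar loss, we have

$$\mathrm{VaR(zero)} = W_0\left(\alpha\,\sigma\sqrt{\Delta t} - \mu\,\Delta t\right) \tag{1-6}$$

Set the return $r_t \sim N(\mu_t, \sigma_t^2)$; the 99% VaR forecast is then given by $\hat{\mu}_t - 2.326\,\hat{\sigma}_t$, and the 95% VaR forecast is given by $\hat{\mu}_t - 1.645\,\hat{\sigma}_t$.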
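As an illustration of the parametric forecast, the following Python sketch computes the cutoff $\hat{\mu}_t - \alpha\hat{\sigma}_t$ for a given confidence level. The function name `parametric_var` and the standardized inputs are assumptions made for this example; the VaR is reported as a (typically negative) P&L level, in line with the sign convention of Table 3.

```python
from statistics import NormalDist

def parametric_var(mu_hat, sigma_hat, confidence):
    """Return the VaR cutoff mu_hat - alpha * sigma_hat as a (usually negative) P&L level."""
    # alpha is the positive standard normal deviate, e.g. about 1.645 at 95%
    alpha = -NormalDist().inv_cdf(1.0 - confidence)
    return mu_hat - alpha * sigma_hat

# Hypothetical standardized inputs: zero mean, unit standard deviation
var_95 = parametric_var(0.0, 1.0, 0.95)
var_99 = parametric_var(0.0, 1.0, 0.99)
print(var_95, var_99)
```

With a time-varying $\hat{\mu}_t$ and $\hat{\sigma}_t$, the same function can be applied day by day.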

4.2 TIME SERIES MODEL

Figure 4 shows that the violations of the 95% VaR tend to be clustered⁵. This suggests that the volatility of P&L may be time-varying to a degree not captured by the internal models. To capture and predict the volatility, we formulate an alternative VaR model based on time series models of the portfolio return. Time series models give us $\hat{\mu}_t$ and $\hat{\sigma}_t$; hence, we can use the delta-normal method to compute the daily VaR for the trading positions.

⁵ Of the 19 violations in total, 8 occurred in July 2011, and 4 of them immediately followed a violation on the previous day.


4.2.1 The ARMA Process

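We need the mean to compute VaR. In this section, we introduce the ARMA process to generate the mean we need. Enders (2010) introduces the ARMA process by starting with white noise. A sequence $\{\epsilon_t\}$ is a white-noise process if each value in the sequence has a mean of zero, has a constant variance, and is uncorrelated with all other realizations. Formally, if the notation $E(x)$ denotes the theoretical mean value of $x$, the sequence $\{\epsilon_t\}$ is a white-noise process if, for each time period $t$,

$$E(\epsilon_t) = E(\epsilon_{t-1}) = \dots = 0$$

$$E(\epsilon_t^2) = E(\epsilon_{t-1}^2) = \dots = \sigma^2 \quad [\text{or } \mathrm{var}(\epsilon_t) = \mathrm{var}(\epsilon_{t-1}) = \dots = \sigma^2]$$

$$E(\epsilon_t \epsilon_{t-s}) = E(\epsilon_{t-j}\epsilon_{t-j-s}) = 0 \text{ for all } j \text{ and all } s \neq 0 \quad [\text{or } \mathrm{cov}(\epsilon_t, \epsilon_{t-s}) = \mathrm{cov}(\epsilon_{t-j}, \epsilon_{t-j-s}) = 0]$$

With white noise in mind, for each period $t$, $x_t$ is constructed by taking the values $\epsilon_t, \epsilon_{t-1}, \dots, \epsilon_{t-q}$ and multiplying each by the associated value of $\beta_i$. A sequence formed in this manner is called a moving average of order $q$ and is denoted by MA($q$).

$$x_t = \sum_{i=0}^{q} \beta_i \epsilon_{t-i} \tag{2-1-1}$$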

It is possible to combine a moving-average process with a linear difference equation to obtain an autoregressive moving-average model. Consider the pth order difference equation

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + x_t \tag{2-1-2}$$

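Now let $\{x_t\}$ be the MA($q$) process given by (2-1-1), so that we can write

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=0}^{q} \beta_i \epsilon_{t-i} \tag{2-1-3}$$

We follow the convention of normalizing units so that $\beta_0$ is always equal to unity. If the characteristic roots of (2-1-3) are all inside the unit circle, $\{y_t\}$ is called an autoregressive moving-average (ARMA) model for $y_t$. Taking ARMA(1,1) as an example, we can write the equation as follows and take its one-step-ahead conditional mean as $\hat{\mu}_t$:

$$y_t = a_0 + a_1 y_{t-1} + \epsilon_t + \beta_1 \epsilon_{t-1}, \qquad \hat{\mu}_t = a_0 + a_1 y_{t-1} + \beta_1 \epsilon_{t-1}$$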
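To make the role of the ARMA(1,1) mean concrete, the sketch below computes one-step-ahead conditional means recursively for given coefficients. It is a hedged illustration, not the estimation procedure used in this study: the coefficients are taken as known, `b1` stands for the moving-average coefficient $\beta_1$, and the start-up values (unconditional mean, zero pre-sample shock) are assumptions of this example.

```python
def arma11_mean_forecasts(y, a0, a1, b1):
    """One-step-ahead conditional means of y_t = a0 + a1*y_{t-1} + e_t + b1*e_{t-1}."""
    mu = []
    prev_y = None
    prev_eps = 0.0  # assumption: the pre-sample shock is set to zero
    for obs in y:
        if prev_y is None:
            m = a0 / (1.0 - a1)  # assumption: start at the unconditional mean
        else:
            m = a0 + a1 * prev_y + b1 * prev_eps
        mu.append(m)
        prev_eps = obs - m       # realized shock once obs is known
        prev_y = obs
    return mu

# Hypothetical standardized P&L values and coefficients
mu_hat = arma11_mean_forecasts([0.5, -0.2, 0.1], a0=0.0, a1=0.5, b1=0.2)
print(mu_hat)
```

Each forecast uses only information available on the previous day, which is what a one-day-ahead VaR requires.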

4.2.2 The ARCH/ GARCH Process

With the mean at hand, we still need the standard deviation to compute daily VaR. In this section, we introduce two time series processes that can generate time-varying volatility for the P&L: the ARCH and GARCH processes.

ARCH

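Engle (1982) let $\{\hat{\epsilon}_t\}$ denote the estimated residuals from the model

$$y_t = a_0 + a_1 y_{t-1} + \epsilon_t$$

so that the conditional variance of $y_{t+1}$ is

$$\mathrm{var}(y_{t+1} \mid y_t) = E_t[(y_{t+1} - a_0 - a_1 y_t)^2] = E_t(\epsilon_{t+1}^2)$$

To this point, we have set $E_t(\epsilon_{t+1}^2)$ equal to the constant $\sigma^2$. Suppose that the conditional variance is not constant. One simple strategy is to forecast the conditional variance as an AR($q$) process using squares of the estimated residuals

$$\hat{\epsilon}_t^2 = \alpha_0 + \alpha_1 \hat{\epsilon}_{t-1}^2 + \alpha_2 \hat{\epsilon}_{t-2}^2 + \dots + \alpha_q \hat{\epsilon}_{t-q}^2 + v_t \tag{2-2-1}$$

where $v_t$ is a white-noise process.

If the values of $\alpha_1, \alpha_2, \dots, \alpha_q$ all equal zero, the estimated variance is simply the constant $\alpha_0$. Otherwise, the conditional variance of $y_t$ evolves according to the autoregressive process given by (2-2-1). As such, you can use (2-2-1) to forecast the conditional variance at $t+1$ as

$$E_t(\hat{\epsilon}_{t+1}^2) = \alpha_0 + \alpha_1 \hat{\epsilon}_t^2 + \alpha_2 \hat{\epsilon}_{t-1}^2 + \dots + \alpha_q \hat{\epsilon}_{t+1-q}^2$$

For this reason, an equation like (2-2-1) is called an autoregressive conditional heteroskedastic (ARCH) model. So, ARCH(1) can be expressed by

$$h_t = E_{t-1}(\epsilon_t^2) = \alpha_0 + \alpha_1 \epsilon_{t-1}^2$$

where $h_t$ denotes the conditional variance, with $\alpha_0 > 0$ and $\alpha_1 \ge 0$.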

GARCH

Bollerslev (1986) extended Engle's original work by developing a technique that allows the conditional variance to be an ARMA process. Let the error process be such that

$$\epsilon_t = v_t\sqrt{h_t}$$

where $\sigma_v^2 = 1$, and

$$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i h_{t-i} \tag{2-2-2}$$

Since $\{v_t\}$ is a white-noise process, the conditional and unconditional means of $\epsilon_t$ are equal to zero. Taking the expected value of $\epsilon_t$, it is easy to verify that

$$E(\epsilon_t) = E[v_t (h_t)^{1/2}] = 0$$

The important point is that the conditional variance of $\epsilon_t$ is given by $E_{t-1}(\epsilon_t^2) = h_t$. Thus, the conditional variance of $\epsilon_t$ is the ARMA process given by the expression for $h_t$ in (2-2-2). This generalized ARCH($p$,$q$) model, called GARCH($p$,$q$), allows for both autoregressive and moving-average components in the heteroskedastic variance. Hence, GARCH(1,1) can be expressed by

$$h_t = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 h_{t-1}$$

with $\alpha_0 > 0$ and $\alpha_1, \beta_1 \ge 0$.

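Now that we understand the ARMA and ARCH/GARCH processes, we can combine the two to generate the mean and standard deviation for computing VaR. Taking ARMA(1,1)-GARCH(1,1) as an example, it can be represented by the following equations

$$y_t = a_0 + a_1 y_{t-1} + \epsilon_t + b_1 \epsilon_{t-1}, \qquad \epsilon_t = v_t\sqrt{h_t}, \qquad h_t = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 h_{t-1}$$

Here the moving-average coefficient is written $b_1$ to keep it distinct from the GARCH coefficient $\beta_1$. $\hat{\mu}_t$ stands for the mean, $\hat{\mu}_t = a_0 + a_1 y_{t-1} + b_1 \epsilon_{t-1}$, and $\hat{\sigma}_t = \sqrt{h_t}$ stands for the standard deviation. With these two series, we can compute the 95% VaR forecast at time $t$ by $\hat{\mu}_t - 1.645\,\hat{\sigma}_t$ and the 99% VaR forecast by $\hat{\mu}_t - 2.326\,\hat{\sigma}_t$.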
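The recursion behind these forecasts can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the estimation code used in the study: the parameters are taken as given (in practice they would be estimated by maximum likelihood), the recursion starts at the unconditional mean and variance, and the function name `arma_garch_var` and all numeric inputs are hypothetical.

```python
import math

def arma_garch_var(y, a0, a1, b1, alpha0, alpha1, beta1, z=1.645):
    """Filter y through ARMA(1,1)-GARCH(1,1); return per-day (mu, sigma, VaR) forecasts."""
    mu_list, sigma_list, var_list = [], [], []
    h = alpha0 / (1.0 - alpha1 - beta1)  # assumption: start at the unconditional variance
    m = a0 / (1.0 - a1)                  # assumption: start at the unconditional mean
    for obs in y:
        sigma = math.sqrt(h)
        mu_list.append(m)
        sigma_list.append(sigma)
        var_list.append(m - z * sigma)   # VaR forecast made before obs is observed
        eps = obs - m                    # realized shock
        m = a0 + a1 * obs + b1 * eps     # next day's conditional mean
        h = alpha0 + alpha1 * eps ** 2 + beta1 * h  # next day's conditional variance
    return mu_list, sigma_list, var_list

# Hypothetical standardized P&L with one large loss on the second day
mu_hat, sigma_hat, var_95 = arma_garch_var(
    [0.0, -2.0, 0.5], a0=0.0, a1=0.1, b1=0.2, alpha0=0.1, alpha1=0.1, beta1=0.8
)
print(var_95)
```

In this toy series the VaR forecast for the third day is more negative than for the second, because the large loss raises the conditional variance; this is the responsiveness to volatility that the time series approach is meant to provide.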


4.3 MODEL SELECTION

After fitting P&L with time series models, we have to pick the best among them. We use the AIC and the BIC (also known as the SBC) as the criteria for assessing the fit of the models.

The AIC and the BIC

Following Enders (2010), for a given sample size T, selecting the values of p and q so as to minimize the AIC (Akaike Information Criterion) is equivalent to selecting p and q so as to minimize the sum:

AIC = T ln(SSR)+2(1+p+q)

Minimizing the value of the AIC implies that each estimated parameter entails a benefit and a cost. Clearly, a benefit of adding another parameter is that the value of SSR is reduced. The cost is that degrees of freedom are reduced and there is added parameter uncertainty. Thus adding additional parameters will decrease ln(SSR) but will increase (1+p+q). The AIC allows you to add parameters until the marginal cost (i.e., the marginal cost is 2 for each parameter estimated) equals the marginal benefit.

The BIC (Schwarz Bayesian Information Criterion) incorporates the larger penalty (1+p+q) ln(T). To use the BIC, select the values of p and q so as to minimize

BIC = T ln(SSR) + (1+p+q) ln(T)

For any reasonable sample size, ln(T) > 2, so that the marginal cost of adding parameters using the BIC exceeds that of the AIC. Hence, the BIC will select a more parsimonious model than the AIC. The BIC has superior large-sample properties: it is possible to prove that the BIC is asymptotically consistent, while the AIC is biased toward selecting an overparameterized model.

However, Monte Carlo studies have shown that in small samples, the AIC can work better than the BIC.
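The two criteria are simple to compute once the sum of squared residuals (SSR) of each candidate model is known. The Python sketch below uses the formulas above; the SSR values are hypothetical and chosen only to show a case where the two criteria disagree.

```python
import math

def aic(ssr, T, p, q):
    """AIC = T ln(SSR) + 2(1 + p + q)."""
    return T * math.log(ssr) + 2 * (1 + p + q)

def bic(ssr, T, p, q):
    """BIC = T ln(SSR) + (1 + p + q) ln(T)."""
    return T * math.log(ssr) + (1 + p + q) * math.log(T)

# Hypothetical SSR values for two candidate models fitted to T = 243 observations
T = 243
candidates = {"ARMA(1,1)": (200.0, 1, 1), "ARMA(3,3)": (190.0, 3, 3)}
scores = {name: (aic(ssr, T, p, q), bic(ssr, T, p, q))
          for name, (ssr, p, q) in candidates.items()}

best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
print(best_aic, best_bic)
```

With these made-up inputs the AIC prefers the larger model while the BIC prefers the smaller one, reflecting the heavier ln(T) penalty per parameter.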


In order to compare the performance of the models, we backtest them to verify the accuracy of the VaR forecasts. Backtesting is a formal statistical framework that consists of verifying that actual losses are in line with projected losses. This involves systematically comparing the history of VaR forecasts with the associated portfolio returns.

4.4.1 Kupiec

Kupiec (1995) develops approximate 95 percent confidence regions for this verification test, which are reported in Table 4. These regions are defined by the tail points of the log-likelihood ratio:

$$LR_{uc} = -2\ln\left[(1-p)^{T-N} p^{N}\right] + 2\ln\left\{[1 - (N/T)]^{T-N} (N/T)^{N}\right\}$$

which is asymptotically (i.e., when $T$ is large) distributed chi-square with one degree of freedom under the null hypothesis that $p$ is the true probability. Thus we would reject the null hypothesis if $LR_{uc} > 3.841$.

Table 4. Model Backtesting, 95% Non-rejection Test Confidence Regions

[Table columns: Probability; Nonrejection Region for Number of Failures N]

Note: N is the number of failures that could be observed in a sample size T without rejecting the null hypothesis that p is the correct probability at the 95 percent level of test confidence.

Source: Adapted from Kupiec (1995)
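The unconditional coverage statistic can be sketched in Python as follows. The function names are illustrative; the inputs are the internal model's violation counts quoted in Section 3.2 (19 and 4 violations in 777 days), and the helper handles the 0 ln(0) = 0 convention so that a sample with no violations does not raise an error.

```python
import math

def _xlogy(x, y):
    """x * ln(y) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def kupiec_lr(N, T, p):
    """Unconditional coverage statistic LR_uc for N failures in T days at failure probability p."""
    phat = N / T
    log_null = _xlogy(T - N, 1 - p) + _xlogy(N, p)       # likelihood under the null
    log_alt = _xlogy(T - N, 1 - phat) + _xlogy(N, phat)  # likelihood at the observed rate
    return -2.0 * log_null + 2.0 * log_alt

lr_95 = kupiec_lr(19, 777, 0.05)  # 95% VaR, internal model
lr_99 = kupiec_lr(4, 777, 0.01)   # 99% VaR, internal model
print(lr_95, lr_99)
```

Each value is compared with the chi-square(1) critical value of 3.841. Note that the test is two-sided: a model is rejected for having too few violations as well as too many.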


4.4.2 Christoffersen

Christoffersen (1998) extends the LR_uc statistic to specify that the deviations must be serially independent. The test is set up as follows:

Deviation indicator = 0 if VaR is not exceeded;

Deviation indicator = 1 otherwise.

Then we define $T_{ij}$ as the number of days in which state $j$ occurred on one day while the state was $i$ on the previous day, and $\pi_i$ as the probability of observing an exception conditional on state $i$ the previous day. Table 5 shows how to construct a table of conditional exceptions.

If today's occurrence of an exception is independent of what happened the previous day, the entries in the second and third columns should be identical. The relevant test statistic is

$$LR_{ind} = -2\ln\left[(1-\pi)^{(T_{00}+T_{10})}\,\pi^{(T_{01}+T_{11})}\right] + 2\ln\left[(1-\pi_0)^{T_{00}}\,\pi_0^{T_{01}}\,(1-\pi_1)^{T_{10}}\,\pi_1^{T_{11}}\right]$$

Here, the first term represents the maximized likelihood under the hypothesis that exceptions are independent across days, or

$$\pi = \pi_0 = \pi_1 = (T_{01} + T_{11})/T$$

The second term is the maximized likelihood for the observed data.

Table 5. Building an Exception Table: Expected Number of Exceptions

Current day | Day before: No exception | Day before: Exception
No exception | T_00 = T_0 (1 - π_0) | T_10 = T_1 (1 - π_1)
Exception | T_01 = T_0 π_0 | T_11 = T_1 π_1
Total | T_0 | T_1


The combined test statistic for conditional coverage then is

LR_cc = LR_uc + LR_ind

Each component is independently distributed as χ²(1) asymptotically, so the sum is distributed as χ²(2). Thus we should reject at the 95 percent test confidence level if LR_cc > 5.991. We would reject independence alone if LR_ind > 3.841.
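The independence statistic can be computed directly from a 0/1 sequence of deviation indicators. The Python sketch below counts the transitions $T_{ij}$ and evaluates LR_ind; the function names and the two example sequences are hypothetical and serve only to contrast clustered with evenly spread violations.

```python
import math

def _xlogy(x, y):
    """x * ln(y) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def christoffersen_lr_ind(hits):
    """LR_ind from a 0/1 sequence of daily deviation indicators."""
    t = [[0, 0], [0, 0]]  # t[i][j]: days in state j that follow a day in state i
    for prev, curr in zip(hits[:-1], hits[1:]):
        t[prev][curr] += 1
    t00, t01, t10, t11 = t[0][0], t[0][1], t[1][0], t[1][1]
    pi = (t01 + t11) / (t00 + t01 + t10 + t11)
    pi0 = t01 / (t00 + t01) if (t00 + t01) else 0.0
    pi1 = t11 / (t10 + t11) if (t10 + t11) else 0.0
    log_null = _xlogy(t00 + t10, 1 - pi) + _xlogy(t01 + t11, pi)
    log_alt = (_xlogy(t00, 1 - pi0) + _xlogy(t01, pi0)
               + _xlogy(t10, 1 - pi1) + _xlogy(t11, pi1))
    return -2.0 * log_null + 2.0 * log_alt

# Hypothetical indicator series, each with four violations in 44 days
clustered = [0] * 20 + [1, 1, 1, 1] + [0] * 20
spread = ([0] * 10 + [1]) * 4
lr_clustered = christoffersen_lr_ind(clustered)
lr_spread = christoffersen_lr_ind(spread)
print(lr_clustered, lr_spread)
```

Both series have the same violation rate, so an unconditional coverage test cannot tell them apart; only the clustered one produces an LR_ind above 3.841. Adding LR_uc to LR_ind gives LR_cc, to be compared with 5.991.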

Chapter 5: Results

The time series models are estimated each day with the data available up to that point. To obtain stable estimates for the model, forecasts for 2011 (days 1 through 243) are in-sample. Rolling out-of-sample forecasts begin in 2012, and the out-of-sample estimates are updated daily.

Here, we adopt two kinds of reduced-form models: the fitted model and the Berkowitz & O'Brien model (BO model). The fitted model uses the first 243 days of data as the in-sample period to fit a time series model, and the BO model follows Berkowitz & O'Brien (2002) in using an ARMA(1,1)-GARCH(1,1) as the time series model.

Given the parameter estimates, we forecast the next day's 95% and 99% VaR.

The forecasts, both in-sample and out-of-sample, are shown in Figure 5 by the grey line, along with P&L (dotted line) and the internal model (solid line). As we can see, the one-day-ahead reduced-form forecasts appear to track the lower tail of P&L well compared to the internal structural model. They tracked the large P&L drop in 2011, which was not caught by the internal model. This shows that the time series model does better at adjusting to changes in volatility through time.


Figure 5. P&L, 95% Internal VaR, Fitted VaR, and BO VaR (panels: 95% VaR of Fitted Model; 95% VaR of BO Model)

Note: Data are expressed in standard deviations.


Figure 6. P&L, 99% Internal VaR, Fitted VaR, and BO VaR (panels: 99% VaR of Fitted Model; 99% VaR of BO Model)

Note: Data are expressed in standard deviations.


Summary statistics and backtests for the three models are presented in Table 6 and Table 7.

The third column of Table 6 shows that the time series models remove first-order persistence successfully. Table 6 also shows that both time series models have lower mean VaR and max VaR for the 99^th percentile VaR. This might be interpreted as the time series models performing better in the fat-tailed situation. For the 95^th percentile VaR, however, that phenomenon disappears.

The higher mean VaRs mean that the internal models are more conservative and should therefore generate lower mean and max violations. However, the mean violation and max violation, shown in columns 8 and 9, show that this is not the case. Even though the internal model is more conservative, the time series models still have lower mean and max violations.

This result indicates a potentially important advantage for the reduced-form model. Since the magnitudes of the VaR forecasts are used to determine economic capital for insurance companies, the reduced-form time series models are able to deliver lower capital requirements without incurring large violations. This reflects the greater responsiveness of the reduced-form models to P&L volatility.

Table 6. Summary Statistics of Three Models

Note: Box-Ljung statistics are for first-order serial correlation. The internal model is calibrated by the insurance company. In column 2, the fitted model is ARMA(1,1)-ARCH(1); in column 3, the model is ARMA(1,1)-GARCH(1,1). The grey-shaded cells indicate better performance compared to the internal model.


results provide little basis to distinguish between the time series models and internal model.

For the 95^th percentile VaR, all methods are rejected in terms of coverage. For independence, only the 95^th percentile VaR of the internal model is rejected. The main reason is that 4 violations in this period immediately follow a previous-day violation. The 99^th percentile VaRs of all methods are not rejected.

Table 7. Backtests for Three Models

Note: P-values are reported in square brackets. The 5% critical value is 3.84, and the 1% critical value is 6.64. * and ** denote significance at the 5 and 1 percent levels, respectively.

To gain a further understanding of the reduced-form models, this paper separates the data into three parts: 2011, 2012, and 2013. The results for these models are listed in Table 8 to Table 13.

We follow the same approach as before, using the fitted model and the B.O. model as reduced-form time series models. For both models, we use the first half of the data as in-sample and the rest as out-of-sample. Taking the 2011 data as an example, we have 243 P&L observations, so we use the first 120 observations to fit the time series models for the fitted model and the B.O. model.


Even though they have lower max violations, they have greater mean violations too.

That means the fitted models do not outperform the internal models. In addition, the backtests in Table 9 show that the 95^th percentile VaR of the fitted model is the only model rejected in the coverage test in 2011.

For the B.O. model in 2011, the result is the same as in the full sample: the 99^th percentile VaR performs better than the internal model, and it passes both the coverage and independence tests.

Table 8. Summary Statistics of Three Models in 2011

Note: Box-Ljung statistics are for first-order serial correlation. The internal models are calibrated by the insurance company. In column 2, the fitted model is ARCH(1); in column 3, the B.O. model is ARMA(1,1)-GARCH(1,1). The grey-shaded cells indicate better performance compared to the internal model.


Table 9. Backtests of Three Models in 2011

[Table columns: Backtests; Violation Rate; Coverage; Conditional Coverage]

Note: P-values are reported in square brackets. The 5% critical value is 3.84, and the 1% critical value is 6.64. * and ** denote significance at the 5 and 1 percent levels, respectively.

In 2012, the summary statistics show a different pattern. In Table 10, the fitted models still barely have violations. This time, compared to the internal models, the fitted model and the B.O. model perform better for the 95^th percentile VaRs in terms of mean VaR, mean violation, and max violation. This pattern is quite different from the full sample and the 2011 sample, in which the 99^th percentile VaRs of most models perform better. Moreover, in Table 11, the B.O. model passes the coverage test for the 95^th percentile VaR.


Table 10. Summary Statistics of Three Models in 2012

Note: Box-Ljung statistics are for first-order serial correlation. The Internal Models are calibrated by insurance company. In column 2, fitted model is ARMA(3,3); in column 3, B.O. Model is
