• 沒有找到結果。

Stochastic Mortality Models with Cox Error Structures

Chapter 4. Mortality Modeling with Non-Gaussian Innovations and Applications to

4.1. Stochastic Mortality Models with Cox Error Structures

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

49 

Chapter 4

Mortality Modeling with

Non-Gaussian Innovations and Applications to the Valuation of Longevity Swaps

4.1 Stochastic Mortality Models with Cox Error Structures

In this section, we first review the RH model, in which the mortality index and cohort effect follow ARIMA models with normal innovations. However, according to the mortality data, the residuals exhibit non-Gaussian distribution. Consequently, we assume that the number of deaths follows a Cox process and that the death rates adhere to the RH model in which the residuals, the mortality indices, and the cohort effects follow three non-Gaussian distributions: JD, VG, and NIG. We also develop an iterative process for calibrating the corresponding parameters of the Cox process with leptokurtic intensity.

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

50 

Renshaw and Haberman’s (2006) Model

We analyze changes in mortality as a function of both age x and time t. For mortality forecasting, the cohort-based extension to the LC model proposed by Renshaw and Haberman (2006) is as follows:

 

x t, x x t x t x x t,

ln m     k   

e

, (4-1)

where

m

x t, is the death rate for age x in calendar year t, defined as running from time to time

t  1

;  describes the average pattern of mortality over an age x

group;

k

t explains the time trend of the general mortality level;  represents x age-specific patterns of mortality change, indicating the sensitivity of the logarithm of the force of mortality at age x to variations in

k

t;

t x is a cohort effect;

x

controls age-specific cohort contributions to the mortality projection; and

e

x t,

represents the error term, which is normally distributed with mean 0 and variance

e2. This structure is designed to capture age–period–cohort effects.

To forecast future mortality dynamics, the mortality index

k

t follows a one-dimensional random walk with drift (Lee and Carter, 1992), as follows:

1

t t t

k k

   

, (4-2)

where

is a drift term and

t is a sequence of independent and identically zero-mean Gaussian random variables. Let the year of birth be equal to c t x  . Following the model setup of Renshaw and Haberman (2006) and Cairns et al. (2010), we model the cohort factor  as an ARIMA(1,1,0) process that is independent of c

k

t:

1

c c zc

     

      , (4-3) t

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

51 

where zc  is a sequence of independent and identically standard normal random variables.

Normality Test for the RH Model

According to table 1 in Dowd et al.’s (2010) article, the residuals of the Renshaw and Haberman (2006) model exhibit leptokurticity. In this subsection, we therefore apply the JB statistic (Jarque and Bera, 1980) to test empirically the normality of the three mortality data sets from England and Wales, France, and Italy for subjects aged 60–89 years during the period 1900–1983. The mortality data came from the Human Mortality Database (HMD) website.7 Table 4-1 contains the results of the JB test for the residuals of the RH model, the first difference of the three countries’ mortality indices, and the corresponding cohort effects from 1900 to 1983. The JB statistic rejects the assumption of normality for the residuals of the RH model and the cohort effects. Therefore, we use the heavy-tailed distributions—JD, VG and NIG—to model the non-Gaussian nature of the error terms of the RH model.

Table 4-1. The Jarque-Bera Test

England and Wales France Italy

Residuals of the RH Model

373.952 1489.534 15531.672 [< 0.001] [< 0.001] [< 0.001]

First Difference in Mortality Indices

1.175 0.638 2.532

[0.477] [0.500] [0.182]

the Residuals of 343.557 43.933 295.214

Cohort Effects [< 0.001] [< 0.001] [< 0.001]

Note: The p-values of the Jarque-Bera test are in brackets.

       

7 See http://www.mortality.org/. 

heavy-tailed distributions: JD, VG and NIG. In the subsequent subsection, we take

,

e

x t as an example to describe the properties of these heavy-tailed distributions;

analogous results are obtained for

t and

z

t x . We refer to these distributions in Chapter 2.

A Cox Process with Leptokurtic Intensity

We assume that

D

x t, , or the number of deaths at age x during year t, adheres to a Cox process, also known as a doubly stochastic Poisson process. That is,

 

where

e

x t, is assumed to be an age- and period-homogeneous heavy-tailed distribution that captures leptokurticity. Let

d

x t, be the corresponding number of deaths actually observed. Conditional on

e

x t,

y

, the number of deaths

D

x t, becomes a Poisson distribution with intensity

E

xtexp

  

xx t

k

 

x t x  . As a

y

result, the log-likelihood function based on the Cox regression model is defined as

, , ,

,

 

Thus, to find the maximum likelihood estimates of the parameters of the Renshaw and Haberman (2006) model, we can maximize Equation (4-5) with respect to

x,

x,

k

t,

x,

t x , and the error term distribution parameters. The closed-form solution of the log-likelihood function in Equation (4-5) is derived as follows:

     

,

 

Appendix C. Note that when

e

x t, is ignored while modeling the number of deaths (i.e.,

M

ex t,

 

1  ), the log-likelihood function defined in Equation (4-7) is precisely 1 the same as that proposed by Wilmoth (1993), Brouhns et al. (2002), and Cairns et al.

(2009).

Similar to the two-step procedures of Lee and Carter (1992), Brouhns et al.

(2002), and Renshaw and Haberman (2006), we first calibrate the parameters

x,

x,

k

t,

x, and

t x- with an updating scheme. Then, we estimate

,

,

,

, and the distribution parameters of the residuals of the mortality indices and cohort effects. In the first step, there are six sets of parameters, namely, the

x,

x,

k

t,

x, and

t x- parameters, as well as the

e

x t, distribution parameters. Following Brouhns et al. (2002) and Renshaw and Haberman (2006), we use the following updating scheme: Let nx be the total number of ages. Starting with

x

 1/ n

x and

LC model, we calibrate the corresponding

e

x t, distribution parameters by maximizing the sample log-likelihood function:

 

Then, with the

e

x t, distribution parameters estimated from Equation (4-8), we employ an iterating method to estimate the corresponding parameters of the RH model according to the elementary Newton method (Goodman, 1979; Brouhns et al., 2002; Renshaw and Haberman, 2006).

Following the estimating procedure of Renshaw and Haberman (2006), the parameters are estimated by iteration. In each iteration step, we update a single set of parameters; the other parameters are fixed at their current estimates using the following updating scheme: Consequently, the updating scheme is as follows:

   

We repeat the updating cycle (Equations (4-8)–(4-14)) and stop when the log-likelihood function in Equation (4-7) converges.8 Model identification can be conveniently achieved with parameter constraints: t 0

t

After obtaining the mortality indices and cohort effects, we can calculate the parameters of Equations (4-2) and (4-3) by maximizing the log-likelihood function, as follows:

In this section, we investigate the goodness-of-fit distributions for the number of deaths, the first differences of the mortality indices, and the cohort effects. Using the        

8 The criterion used to stop the iterative fitting procedure is a very small relative change in the log-likelihood function. We adopt 10−7 as the default value.

9 Similar to Brouhns et al. (2002), after updating the kt parameters, we impose a centering constraint

t 0 estimates for kt by the same number. Following the analogical procedure, the constraints of x and

t x are also achieved. 

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

56 

mortality data from 1900–1983, we fit the residuals of the RH model to four distributions: normal, JD, VG, and NIG. We then fit the mortality indices and cohort effects from the best-fitting model according to the Bayesian information criterion (BIC) to the same four distributions. Finally, we project the subsequent 25-year mortality rates (1984–2008).

In-Sample Goodness of Fit

Using mortality data from 1900 to 1983, we first investigate the goodness-of-fit distributions of the number of deaths for England and Wales, France, and Italy. Table 4-2 presents the LLF, Akaike information criterion (AIC), and BIC statistics10 for the number of deaths at which the residuals of the RH model adhere to the normal, JD, VG, and NIG models. All three criteria indicate that the normal distribution is the worst fitting model for the number of deaths. They also indicate that the VG model is consistently the best model for the number of deaths in the three mortality data sets.

Therefore, we use the mortality indices and cohort effects obtained from the VG model to investigate the pattern of the error terms of the time and cohort effects.

Table 4-2. Goodness-of-fit Measures for the Number of Deaths

Model England and Wales France Italy

LLF AIC BIC LLF AIC BIC LLF AIC BIC

Normal -32727 33011 33839 -30356 30640 31468 -45130 45414 46242 JD -32557 32844 33681 -30237 30524 31361 -43719 44006 44843 VG

-32554 32840 33674 -30084 30370 31204 -42900 43186 44020

NIG -32556 32842 33676 -30236 30522 31356 -44314 44600 45434

The test results for the first difference in mortality indices are in Table 4-3, which contains the LLF, AIC, and BIC statistics for the normal, JD, VG, and NIG distributions. The Gaussian model is the worst according to the LLF criterion, which        

10 AIC LLF NP and BIC LLF0.5NPlog

 

NS , where NP is the effective number of parameters being estimated and NS is the number of observations. 

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

57 

also indicates that the best fit for the three mortality data sets derives from the JD model. The best in-sample goodness of fit for mortality indices changes for the normal distribution of all mortality data, because the BIC introduces a penalty term for the effective number of parameters.

Table 4-3. Goodness-of-fit Tests for the First Difference in Mortality Indices

Model England and Wales France Italy

LLF AIC BIC LLF AIC BIC LLF AIC BIC

Normal -164.87 166.87 169.29 -174.07 176.07 178.49 -179.71 181.71 184.13 JD

-161.89 166.89 172.93 -173.71 178.71 184.75 -178.59 183.59 189.64

VG -162.67

166.67 171.50 -173.85 177.85 182.69 -178.67 182.67 187.51

NIG -163.49 167.49 172.33 -173.84 177.84 182.68 -178.68 182.68 187.52

In Table 4-4 we present the LLF, AIC, and BIC statistics for the normal, JD, VG, and NIG distributions for cohort effects. The LLF, AIC, and BIC statistics consistently indicate that the best fit for Italy derives from the JD model, but for England and Wales, it derives from the NIG model. For France, the best model for cohort effects is the JD model according to LLF, but in terms of the AIC and BIC, the best is the VG model. All three criteria indicate that the normal distribution is the worst fitting model for the cohort effects. Consequently, with mortality data from three countries over the period 1900–1983, in-sample model selection criteria indicate a preference for modeling the RH model with non-Gaussian innovations.

Table 4-4. Goodness-of-fit Tests for the Residuals of Cohort Effects

Model England and Wales France Italy

LLF AIC BIC LLF AIC BIC LLF AIC BIC

Normal -96.11 99.11 103.18 -107.63 110.63 114.69 -125.75 128.75 132.81 JD -84.27 90.27 98.4 -97.57 103.57 111.69 -105.31 111.31 119.44 VG -83.83 88.83 95.6 -97.79 102.79 109.57 -115.01 120.01 126.79 NIG

-83.48 88.48 95.25 -97.86 102.86 109.64 -111.45 116.45 123.22

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

58 

Mortality Projection

To assess out-of-sample performance, we apply the parameters estimated from the time period 1900–1983 to obtain 25-year mortality projections, calculating the mean absolute percentage error (MAPE) as follows:

, (4-16)

where

A

i is the logarithm of the historical mortality rate;

F

i is the natural logarithm of the forecast mortality rate; and n is the number of observations.

By applying the calibrated parameters of the RH model to the VG innovations (the best model according to BIC), we reveal the impact of the different distributions on the mortality projection for MAPE from 1984 to 2008 (Table 4-5). A lower value indicates better predictive power for the distribution. For comparison, we also provide the mortality projection of the original RH model with four forecasting distributions—normal, JD, VG and NIG (the original RH-Normal model corresponds to the M2 model of Cairns et al., 2009). The VG-NIG model11 is the best mortality projection for the mortality data of England and Wales. The VG-VG model provides the best one for the mortality data from France and Italy. As a result, in terms of the MAPE criterion, the RH model with non-Gaussian innovations provides better mortality projection than that obtained from the original RH model with normal innovations.

       

11 A VG-NIG model corresponds to a VG error term in the RH model and to NIG distributions for the time and cohort effects.

1

1 n i i

i i

A F MAPE n A

‧ 國

立 政 治 大 學

N a tio na

l C h engchi U ni ve rs it y

59 

Table 4-5. MAPE of Logarithm of Mortality Projection in 1984-2008 (Unit: %)

Model England and Wales France Italy

Original RH-Normal 4.9344 5.1999 7.2268

Original RH-JD 4.9343 5.1915 7.2822

Original RH-VG 4.9393 5.1000 7.9326

Original RH-NIG 4.9319 5.1891 7.2983

VG-Normal 4.8072 5.1719 7.0612 VG-JD 4.8064 5.1619 7.1135 VG-VG 4.807 5.0595 6.9687

VG-NIG

4.8063

5.1598 7.1293

Note: Original RH-Normal is the same as M2 of Cairns et al. (2009). The X-Y model corresponds to an X error term in the RH model and to Y distributions for the time and cohort effects.