Invariance in the recurrence of large returns and the validation of models of price dynamics


Lo-Bin Chang,¹ Stuart Geman,² Fushing Hsieh,³ and Chii-Ruey Hwang⁴

¹Department of Applied Mathematics, National Chiao Tung University, Hsinchu, Taiwan

²Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912, USA

³Department of Statistics, University of California, Davis, California 95616, USA

⁴Institute of Mathematics, Academia Sinica, Taipei, Taiwan

(Received 11 April 2013; published 9 August 2013)

Starting from a robust, nonparametric definition of large returns (“excursions”), we study the statistics of their occurrences, focusing on the recurrence process. The empirical waiting-time distribution between excursions is remarkably invariant to year, stock, and scale (return interval). This invariance is related to self-similarity of the marginal distributions of returns, but the excursion waiting-time distribution is a function of the entire return process and not just its univariate probabilities. Generalized autoregressive conditional heteroskedasticity (GARCH) models, market-time transformations based on volume or trades, and generalized (Lévy) random-walk models all fail to fit the statistical structure of excursions.

DOI:10.1103/PhysRevE.88.022116 PACS number(s): 02.50.Ey, 02.50.Tt, 05.40.−a

I. INTRODUCTION

Given a sequence of stock prices $s_0, s_1, \ldots$ recorded at fixed intervals, say every 5 min, let

$$r_n \doteq \log\frac{s_n}{s_{n-1}}, \qquad n = 1, 2, \ldots,$$

be the corresponding sequence of returns. Fix $N$ and define an excursion to be a return that is large, in absolute value, relative to the set $\{r_1, r_2, \ldots, r_N\}$. Specifically, following Hsieh et al. [1], define the excursion process $z_1, z_2, \ldots, z_N$:

$$z_n = \begin{cases} 1 & \text{if } r_n \le l \text{ or } r_n \ge u, \\ 0 & \text{if } r_n \in (l, u), \end{cases}$$

where $l$ and $u$ are, respectively, the 10th and 90th percentiles of $\{r_1, \ldots, r_N\}$. We call the event $z_n = 1$ an excursion since it represents a large movement of the stock relative to the chosen set of returns. We will study the distribution of waiting times between large stock returns by studying the distribution of the number of zeros between successive ones of the excursion process. Our motivation includes the following.

(1) An empirical observation (cf. [2]) indicates that this waiting-time distribution is nearly invariant to time scale (e.g., 30-s, 1-min, or 5-min returns), to stock (e.g., IBM or Citigroup), and to year (e.g., 2001 or 2007).

(2) The waiting time to large returns is of obvious interest to investors and much easier to study if, and to the extent that, it is invariant across time scale, stock, and year.

(3) The particular waiting-time distribution found in the data and its invariance to time scale have implications for models of price and volatility movement. For instance, Lévy processes, “market-time” models based on volume or trades, and generalized autoregressive conditional heteroskedasticity (GARCH) models are each, in one way or another, inconsistent with the empirical data.

(4) Overwhelmingly, the evidence for self-similarity comes from studies of the univariate (marginal) return distributions (e.g., evidence for a stable-law distribution), but marginal distributions leave data models underspecified. Waiting-time distributions provide additional, explicitly temporal constraints, and these appear to be nearly universal.

Larger returns can be studied by using more extreme percentiles. Although we have not experimented extensively, the empirical results we will report on appear to be qualitatively robust to the chosen percentiles and hence the definition of “large return.” In general, the upper and lower percentiles index a family of waiting-time distributions that might prove useful to systematically constrain the dynamics of price and volatility models.

In Sec. II, we study the invariance of the empirical waiting-time distribution. Starting with the Lévy-type models, we first make a connection between the model-based distribution and the geometric distribution. To be concrete, let $S(t)$ follow the “Black-Scholes model” (geometric Brownian motion) as an example: $d\log S(t) = \mu\,dt + \sigma\,dw(t)$, where $w(t)$ is a standard Brownian motion. Because of the independent-increments property of Brownian motion $w(t)$, the return sequence under this model is exchangeable (i.e., the distribution of any permutation remains the same). Therefore, the empirical waiting-time distribution under this model is provably invariant to time scale and to time period. More specifically, the probability of getting a large return, with $l$ being the 10th percentile and $u$ being the 90th percentile, is exactly 0.2 at each return interval, and the empirical waiting-time distribution is therefore nearly a geometric distribution with a parameter of 0.2 (see Sec. II A for more details). We emphasize that these considerations apply without modification not just to the geometric Brownian motion but to all of its popular generalizations as geometric Lévy processes.

Not surprisingly (cf. “stochastic volatility”), the actual (i.e., empirical) waiting-time distribution is different from geometric. But what is surprising is the invariance of this distribution across time scale, stock, and year. In Sec. II B we make an exhaustive comparison of empirical waiting-time distributions using trading prices of approximately 300 stocks from the S&P 500 observed over the 8 years from 2001 through 2008. Invariance to time scale is strong in all 8 years; invariance to stock is strong in years 2001–2007 and less strong in 2008, and invariance across years is stronger for pairs of years that do not include 2008. (We have not studied the years since 2008.) In Sec. II C, we will connect waiting-time invariance to self-similarity, being careful to distinguish a self-similar process from a process having self-similar increments (i.e.,


[Figure 1: four panels, (a) to (d). Horizontal axes: 5-minute intervals. Vertical axes: traded price in (a), return in (b) to (d).]

FIG. 1. (Color online) Returns, percentiles, and the excursion process. (a) IBM stock prices, every 5 min, during the 252 trading days in 2005. The opening (9:30 to 9:40 AM) and closing (3:50 to 4:00 PM) prices are excluded, leaving 75 prices per day (9:40 AM, 9:45 AM, . . . , 3:50 PM). (b) Intraday 5-min returns for the prices displayed in (a). There are 252 × 74 = 18 648 data points. (c) Returns, with the 10th and 90th percentiles superimposed. (d) Zoomed portion of (c) with 200 returns. The “excursion process” is the discrete time zero-one process that signals (with ones) returns above or below the selected percentiles.

Which of the state-of-the-art models of price dynamics are consistent with the empirical distribution of the excursion process? The existence of a nearly invariant waiting-time distribution between excursions provides a new tool for evaluating these models, through which questions of consistency with the data can be addressed using statistical measures of fit and hypothesis tests. In general, we will advocate for permutation and other combinatorial statistical approaches that robustly and efficiently exploit symmetries shared by large classes of models, supporting exact hypothesis tests as well as exploratory data analysis. In Sec. III we introduce some combinatorial tools for hypothesis testing and explore the implications of waiting-time distributions to the time scale of volatility clustering. We continue with this approach in Sec. IV with a discussion of stochastic volatility modeling as well as market time and other stochastic time-change models. We conclude in Sec. V with a summary and some proposals for price and volatility modeling.

II. WAITING TIMES BETWEEN LARGE RETURNS

There were 252 trading days in 2005. The traded prices of IBM stock ($s_n$, $n = 0, 1, \ldots, 18\,899$) at every 5-min interval from 9:40 AM to 3:50 PM (75 prices each day) throughout the 252 days are plotted in Fig. 1(a).¹ Often, activities near opening and closing are not representative. To mitigate their influence, we exclude prices in the first 10 min (9:30 to 9:40 AM) and last 10 min (3:50 to 4:00 PM) of each day. The corresponding intraday returns, $r_n \doteq \log(s_n / s_{n-1})$, $n = 1, 2, \ldots, 18\,648$ (74 returns per day), are plotted in Fig. 1(b). Overnight returns are not included.

¹The price at a specified time is defined to be the price of the most recent trade.

We declare a return “rare” if it is rare relative to the interval of study, in this case the calendar year 2005. We might, for instance, choose to study the largest and smallest returns in the interval or the largest 10% and smallest 10%. Figure 1(c) shows the 2005 intraday returns with the 10th and 90th percentiles superimposed. More generally, given any fractions $f, g \in [0,1]$ (e.g., 0.1 and 0.9), define

$$l_f = l_f(r_1, \ldots, r_N) = \inf\{r : \#\{n : r_n \le r,\ 1 \le n \le N\} \ge fN\}, \tag{1}$$

$$u_g = u_g(r_1, \ldots, r_N) = \sup\{r : \#\{n : r_n \ge r,\ 1 \le n \le N\} \ge (1 - g)N\}, \tag{2}$$

where, presently, $N = 18\,648$. The lower and upper lines in Fig. 1(c) are $l_{0.1}$ and $u_{0.9}$, respectively. Figure 1(d) is a magnified view, covering $r_{1001}, \ldots, r_{1200}$, but with $l_{0.1}$ and $u_{0.9}$ still figured as in Eqs. (1) and (2) from the entire set of 18 648 returns.²

The excursion process is the zero-one process that signals large returns, meaning returns that either fall below $l_f$ or above $u_g$:

$$z_n = \mathbf{1}\{r_n \le l_f \text{ or } r_n \ge u_g\}.$$

Hence $z_n = 1$ for at least 20% of $n \in \{1, 2, \ldots, 18\,648\}$ in the example in Fig. 1. Obviously, many generalizations are possible, involving indicators of single-tail excursions (e.g., $f = 0$, $g = 0.9$ or $f = 0.1$, $g = 1$) or many-valued excursion processes (e.g., $z_n$ is 1 if $r_n \le l_f$, 2 if $r_n \ge u_g$, and 0 otherwise). Or we could be more selective by choosing a smaller fraction $f$ and a larger fraction $g$ and thereby move in the direction of truly rare events. (There is then an inevitable tradeoff between the magnitude of the excursions and the sample size; more rare events are studied at the cost of statistical power.) Here we will work with the special case $f = 0.1$ and $g = 0.9$, but a similar exploration could be made of these other excursion processes.
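As an illustration of the construction just described, the following is a minimal Python sketch, not the authors' code. The function names `excursion_process` and `waiting_times` are hypothetical, the price series is synthetic, and `np.quantile` (which interpolates between order statistics) is used as a stand-in for the inf/sup definitions in Eqs. (1) and (2); for large samples without ties the two should agree closely.

```python
import numpy as np

def excursion_process(returns, f=0.1, g=0.9):
    """Zero-one indicator of returns at or below the f-quantile or at or
    above the g-quantile, both computed from the whole sample."""
    r = np.asarray(returns, dtype=float)
    # Assumption: interpolated quantiles approximate l_f and u_g.
    lower = np.quantile(r, f)
    upper = np.quantile(r, g)
    return ((r <= lower) | (r >= upper)).astype(int)

def waiting_times(z):
    """Number of zeros between successive ones of a zero-one sequence."""
    ones = np.flatnonzero(z)
    return np.diff(ones) - 1

# Synthetic prices (a plain Gaussian random walk in log price), only to
# exercise the functions; real data would replace this.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(0.001 * rng.standard_normal(5001)))
returns = np.diff(np.log(prices))

z = excursion_process(returns)
w = waiting_times(z)
```

Because the thresholds are computed from the same sample they are applied to, the fraction of ones is fixed by construction, which is what makes the waiting-time distribution comparable across stocks, years, and return intervals.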

A. The role of the geometric distribution

As with the Black-Scholes model discussed in the Introduction, any stochastic process with stationary and independent increments (i.e., any Lévy process) has exchangeable increments and hence exchangeable returns if used as a model for the log-price distribution. What would the excursion waiting-time distribution look like under a geometric Brownian-motion model or one of its generalizations to geometric Lévy?

²To break ties and to mitigate possible confounding effects from “microstructures,” prices are first perturbed, independently, by a random amount chosen uniformly between −0.005 and +0.005 dollars.

Specifically, assume

$$d\log S(t) = \mu\,dt + \sigma\,dw(t),$$

where $w(t)$ is a Lévy process. Then the return sequence

$$R_k = \log S(t_0 + k\,\delta t) - \log S(t_0 + (k-1)\,\delta t), \qquad k = 1, 2, 3, \ldots, n, \tag{3}$$

is exchangeable. With the particular percentiles used here, the sequence $z_1, z_2, \ldots, z_N$ has 20% 1’s and 80% 0’s. If real returns were exchangeable, then the excursion process would be as well since the percentiles $l_f$ and $u_g$ [Eqs. (1) and (2)] are symmetric functions of the returns. Hence, the probability that a 1 is followed immediately by another 1 (waiting time zero) is very close to 0.2. (Not exactly 0.2, even ignoring edge effects, because there are a finite number of 1’s; the first 1 of the pair uses one of them up.) The probability that exactly one 0 intervenes is very close to (0.8)(0.2) = 0.16, two 0’s is very close to (0.8)(0.8)(0.2) = 0.128, and so forth following the geometric distribution.
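The geometric benchmark can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original study: it draws independent, identically distributed (hence exchangeable) heavy-tailed returns from a Student-t distribution, an arbitrary stand-in chosen here, and compares the first few empirical waiting-time frequencies with $0.2\,(0.8)^k$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
# Assumption: iid Student-t draws stand in for exchangeable returns.
returns = rng.standard_t(df=3, size=n)

# Excursions relative to the sample's own 10th and 90th percentiles.
lower, upper = np.quantile(returns, [0.1, 0.9])
z = (returns <= lower) | (returns >= upper)
waits = np.diff(np.flatnonzero(z)) - 1      # zeros between successive ones

p = 0.2
max_wait = 10
empirical = np.array([(waits == k).mean() for k in range(max_wait + 1)])
geometric = p * (1 - p) ** np.arange(max_wait + 1)

# Largest discrepancy over waiting times 0..max_wait.
max_gap = np.abs(empirical - geometric).max()
```

Any other iid return distribution could be substituted without changing the comparison, since the excursion process depends on the returns only through their ranks.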

In general, the waiting-time distribution for an exchangeable process converges to the geometric distribution as the number of excursions (number of return intervals) goes to infinity [2,3]. In this sense, the Kolmogorov-Smirnov (KS) distance to the geometric distribution is a measure of departure of a return process from exchangeability and can be used as a statistic to calibrate the temporal structure of real price data as well as proposed models of prices and returns (as will be discussed more deeply in Secs. III and IV).³ Figure 2 compares the empirical waiting-time distribution generated by 93 240 one-minute 2005 IBM returns to the geometric distribution with a parameter of 0.20. Obviously, there is a substantial departure, characterized by high probabilities of short and long waits in the real data compared to the geometric distribution. (The slope of the P-P curve is greater than 1 or less than 1 as waiting-time probabilities are, respectively, larger than or smaller than geometric.) Thus, for example, the empirical probability that the waiting time is zero ($z_{n+1} = 1$ given that $z_n = 1$) is about 0.32 instead of 0.20. Indeed, estimates of this probability reliably fall in a narrow range, from about 0.32 to 0.33, independent of the time interval with respect to which returns are defined, the stock from which the returns are derived, and the year from which the data are collected. In fact, the entire empirical waiting-time distribution is nearly invariant to time scale, stock, and year, as we shall now demonstrate.
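One way to compute the KS distance between an empirical waiting-time distribution and the geometric distribution is sketched below. This is a hedged illustration: the function name `ks_to_geometric` is hypothetical, and the waiting times are assumed to be non-negative integers (zeros counted between successive ones), so the geometric cdf is taken as $F(k) = 1 - (1-p)^{k+1}$.

```python
import numpy as np

def ks_to_geometric(waits, p=0.2):
    """Sup-distance between the empirical cdf of integer waiting times and
    the geometric cdf F(k) = 1 - (1 - p)**(k + 1), k = 0, 1, 2, ..."""
    waits = np.asarray(waits, dtype=int)
    k = np.arange(waits.max() + 1)
    empirical_cdf = np.cumsum(np.bincount(waits, minlength=k.size)) / waits.size
    geometric_cdf = 1.0 - (1.0 - p) ** (k + 1)
    # Both cdfs jump only at integers, and beyond the largest observed wait
    # the gap can only shrink, so the integer grid 0..max suffices.
    return np.abs(empirical_cdf - geometric_cdf).max()

rng = np.random.default_rng(0)
# NumPy's geometric sampler counts trials up to and including the first
# success, so subtracting 1 gives the number of failures before it.
iid_waits = rng.geometric(0.2, size=50_000) - 1
d_iid = ks_to_geometric(iid_waits)
```

For waiting times drawn from the geometric distribution itself, `d_iid` should be small and shrink with sample size; clustered excursions, with an excess of zero waits, would push the distance up.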

B. Empirical evidence for invariance

Chang et al. [2] and Hsieh et al. [1] studied the waiting-time distribution between excursions, i.e., the distribution on the number of 0’s between two 1’s. The empirical waiting-time distributions from 2005 for the 18 648 five-minute returns, the 93 240 one-minute returns, and the 186 480 thirty-second returns of IBM are shown in the top row of Fig. 3. They are remarkably similar.

³Given two cumulative distribution functions (cdf’s), $F_1$ and $F_2$, the P-P plot is the two-dimensional curve from (0,0) to (1,1) defined by $\{(F_1(t), F_2(t)) : t \in \mathbb{R}\}$. The Kolmogorov-Smirnov distance is the maximum vertical (and horizontal) distance between the diagonal and the P-P plot, which is also the maximum distance between $F_1$ and $F_2$: $d_{KS}(F_1, F_2) = \sup_t |F_1(t) - F_2(t)|$.

[Figure 2: two panels. Left: log probability vs waiting time, for IBM and Geometric(0.2). Right: P-P plot of the waiting-time cdf of IBM 1-min returns against the geometric cdf, KS = 0.145.]

FIG. 2. (Color online) Geometric (0.2) and empirical waiting times. The empirical waiting-time distribution of 1-min returns of IBM stock in 2005 was compared with the geometric distribution with parameter 0.2. (left) Log plots for the geometric distribution and the empirical waiting-time distribution. The x axis is the waiting times, and the y axis is the log probabilities of the waiting times. (right) P-P plots for the geometric distribution vs the empirical waiting-time distribution. The KS distance is the maximum horizontal (maximum vertical) distance between the P-P curve [shown in blue (dark gray)] and the diagonal [shown in red (light gray)].

Invariance to scale. The bottom row of Fig. 3 has three P-P plots that come from taking the three waiting-time distributions (30 s, 1 min, and 5 min, shown in the top row) two at a time. The KS distances, one for each comparison, are also shown. The distribution of waiting times between excursions for IBM 2005 returns is strikingly invariant to the return interval. (We are using $d_{KS}$ here as a descriptive statistic and not for the purpose of hypothesis testing. These waiting times are not precisely invariant, and many pairs that look well matched will nevertheless have small p values simply because of the large sample sizes.)

The phenomenon is not unique to IBM or to the year 2005. We tested approximately 300 of the S&P 500 stocks for the years 2001 through 2008. The results are summarized in Table I. In this regard, 2008 is not an outlier, as can be seen from the last column of the table and from the three histograms of KS distances, one for each pair of return intervals, over all stocks tested in 2008 (Fig. 4).

[Figure 3: top row, empirical probability vs waiting time between excursions for 30-sec, 1-min, and 5-min returns; bottom row, pairwise P-P plots of the waiting-time cdfs, annotated with KS = 0.016, 0.005, and 0.014.]

FIG. 3. (Color online) Scale invariance. (top) Empirical waiting-time distributions captured from 30-s, 1-min, and 5-min returns of IBM in 2005. (bottom) P-P plots for the three waiting-time distributions taken two at a time and their corresponding Kolmogorov-Smirnov distances.

TABLE I. Scale invariance, aggregate data. Approximately 300 stocks were tested. Median KS distances for pairwise comparisons of three time scales (30 s, 1 min, 5 min) are shown for 2001 through 2008.

| Comparison | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 |
|---|---|---|---|---|---|---|---|---|
| 30 s vs 1 min | 0.0199 | 0.0109 | 0.0148 | 0.0148 | 0.0163 | 0.0128 | 0.0113 | 0.0103 |
| 1 min vs 5 min | 0.0253 | 0.0203 | 0.0197 | 0.017 | 0.0175 | 0.017 | 0.0194 | 0.0143 |
| 30 s vs 5 min | 0.0348 | 0.0247 | 0.0268 | 0.0259 | 0.0264 | 0.0223 | 0.0242 | 0.0172 |

As we will see shortly, self-similar processes have excursion waiting-time distributions that are invariant to scale. It is interesting then to note that the empirical evidence for waiting-time invariance is substantially weaker at larger intervals, e.g., using hourly or daily returns. This same progression is often observed in studies of self-similarity (cf. [4]). Possibly, it can be traced to sample size. Because the return sequences are derived from a single calendar year, larger return intervals have smaller numbers of returns and hence a larger variance of the empirical waiting-time distribution. For example, as a rough estimate, we can expect hourly returns to multiply the spread of a 5-min-return across-stock histogram of empirical KS distances (as in the bottom left panel of Fig. 5) by about $\sqrt{60/5} \approx 3.5$, which would substantially obscure the evidence for invariance. It is also possible that invariance systematically breaks down for larger return intervals. We have not explored either hypothesis.
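The rough factor quoted above can be unpacked as follows, under the heuristic assumption (ours, for illustration) that the spread of an empirical KS distance scales like the inverse square root of the number of returns. Writing $n_5$ and $n_{60}$ for the numbers of 5-min and hourly returns in one year, so that $n_5 / n_{60} = 60/5 = 12$:

```latex
\frac{\text{spread (hourly returns)}}{\text{spread (5-min returns)}}
  \approx \sqrt{\frac{n_5}{n_{60}}}
  = \sqrt{\frac{60}{5}}
  = \sqrt{12}
  \approx 3.46 .
```

This rounds to the factor of about 3.5 used in the text.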

Invariance to stock and year. How do the excursion waiting-time distributions of one stock compare to those of another? For each of the 8 years studied we compared the waiting-time distributions for 5-min returns between all pairs of the 300 or so stocks in our data set. See Fig. 5 and the accompanying table. With the possible exception of 2008, excursion waiting-time distributions are nearly invariant across stocks.

Finally, we examined the change in waiting-time distributions from year to year. For each stock and each return interval (30 s, 1 min, 5 min), we compared distributions between pairs of years. Table II indicates that waiting-time distributions were typically unchanged during the period 2001 to 2007 but considerably different during the financial crisis of 2008.

C. Connections to self-similarity

Recall that $P(t)$, $t \ge 0$, is a self-similar process if there exists $H \ge 0$ (Hurst index) such that

$$\mathcal{L}\{P(\delta t),\ t \ge 0\} = \mathcal{L}\{\delta^H P(t),\ t \ge 0\}$$

for all $\delta \ge 0$, where $\mathcal{L}\{Q(t),\ t \ge 0\}$ denotes the probability distribution (“law”) of the process $Q(\cdot)$. In other words, the joint distributions of $(P(\delta t_1), P(\delta t_2), \ldots, P(\delta t_m))$ and $\delta^H (P(t_1), P(t_2), \ldots, P(t_m))$ are the same for all $m$, $t_1, t_2, \ldots, t_m$, and $\delta$ (e.g., [5]). Let $S(t)$, $t \ge 0$, be the price of a stock at time $t$. Beginning with Mandelbrot [6,7], it has often been observed that the marginal distribution of the (drift-corrected) increments in price, or, more typically, log price, is nearly self-similar, e.g., $\log S(\delta t) - \log S(\delta(t-1))$ has nearly the same distribution as $\delta^H \log S(t) - \delta^H \log S(t-1)$, although different methods for estimating the exponent $H$ give different values. Many authors (e.g., [8,9]) argued that the exponent is not constant (generally decreasing at larger scales) or that there are actually multiple exponents, as in the more general multifractal models. Within the framework of (single-exponent) self-similarity, the estimation method of Mantegna and Stanley [10] is among the most convincing since it focuses on the centers of return distributions rather than their tails. Mantegna and Stanley reported a Hurst index of about 0.71 for the S&P 500, with evidence for self-similarity spanning three orders of magnitude in the return interval, although as they and others (e.g., [11]) pointed out, scaling breaks down at larger intervals.

Additionally, many authors have studied empirical scaling through a variety of statistics that can be derived from, but are not directly equivalent to, self-similarity. For example, Gopikrishnan et al. [12] investigated scaling properties of normalized returns, while Wang and Hui [13] studied scaling phenomena using returns divided by their daily average returns. Gencay et al. [14] explored wavelet variance, Matteo [15] used rescaled range analysis, and Glattfelder et al. [16] described 12 scaling laws in high-frequency foreign exchange data. Wang et al. [17] studied the return interval between big volatilities and showed the persistence of scaling for a range of time resolution scales (δt= 1,5,10,15,30 min).

[Figure 4: three histograms (number of pairs vs KS distance), one each for 30 sec vs 1 min, 1 min vs 5 min, and 30 sec vs 5 min.]

FIG. 4. (Color online) Histogram of KS distances from 2008. Each panel shows the histogram of Kolmogorov-Smirnov distances between excursion waiting-time distributions at different time scales in 2008 for approximately 300 stocks.

[Figure 5: top, P-P plots of the waiting-time cdfs of IBM vs GPS for 2005 (KS = 0.021) and 2008 (KS = 0.059); bottom, histograms (number of pairs vs KS distance) for 2005 and 2008. The accompanying table of yearly medians follows.]

| Year | Median |
|---|---|
| 2001 | 0.0264 |
| 2002 | 0.0293 |
| 2003 | 0.0228 |
| 2004 | 0.0209 |
| 2005 | 0.022 |
| 2006 | 0.0208 |
| 2007 | 0.0293 |
| 2008 | 0.035 |

FIG. 5. (Color online) Invariance to stock. Comparisons of excursion waiting-time distributions for 5-min returns between IBM and GPS in (top left) 2005 and (top right) 2008. (bottom) Histograms of KS distributions for all pairs of stocks show a breakdown of invariance across stocks in 2008 as compared to 2005. The table gives a summary of year-by-year comparisons of waiting-time distributions across stocks. With the exception of 2008, waiting times are nearly invariant to stock.

TABLE II. Year-to-year changes in excursion waiting-time distributions, giving medians of KS distances over all stocks and all pairs of years from 2001 through 2007 and median distances over all stocks from a single pair of years, 2005 and 2008. Waiting-time distributions in 2008 differ substantially from those of previous years.

| Return interval | 2001 to 2007 | 2005 vs 2008 |
|---|---|---|
| 30-s returns | 0.0236 | 0.0623 |
| 1-min returns | 0.0219 | 0.0681 |
| 5-min returns | 0.0228 | 0.0811 |

Here we give a brief explanation of the mathematical relationship between self-similarity and scale invariance of the excursion waiting-time distribution. Assume that the drift-corrected log price $P(\cdot)$ is a self-similar process. Then, as for the return process, at scale $\delta$ with drift coefficient $r$,

$$R_t^{(\delta)} \doteq \log\frac{S(\delta t)}{S(\delta(t-1))} = P(\delta t) - P(\delta(t-1)) + \delta r$$

$$\Rightarrow\quad \mathcal{L}\{R_t^{(\delta)},\ t \ge 1\} = \mathcal{L}\{\delta^H R_t^{(1)} + (\delta - \delta^H) r,\ t \ge 1\} = \mathcal{L}\{G^{(\delta)}(R_t^{(1)}),\ t \ge 1\},$$

where $G^{(\delta)}(x)$ is the monotone function $\delta^H x + (\delta - \delta^H) r$. Now let $Z_n^{(\delta)}$, $n = 1, 2, \ldots, N$, be the excursion process corresponding to the return process $R_n^{(\delta)}$, $n = 1, 2, \ldots, N$, for some scale (interval) $\delta$ (e.g., 30 s or 5 min). Since percentile ranks are unchanged by monotone increasing transformations, it follows that $\mathcal{L}\{Z_n^{(\delta)},\ n = 1, 2, \ldots, N\} = \mathcal{L}\{Z_n^{(1)},\ n = 1, 2, \ldots, N\}$ for all $\delta > 0$. In short, self-similarity of the process $P(t)$, $t \ge 0$, implies that the excursion process, and therefore its waiting-time distribution, is invariant to scale.
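The key step in this argument, that an increasing affine map $G^{(\delta)}$ leaves the excursion process unchanged, can be illustrated on a single sample. The sketch below is only a numerical sanity check under assumed values: `H`, `delta`, and `drift` are arbitrary illustrative numbers, the returns are synthetic, and the helper name `excursions` is hypothetical. It checks the deterministic statement for one realization, whereas the result in the text is an equality in law.

```python
import numpy as np

def excursions(returns, f=0.1, g=0.9):
    """Boolean excursion indicator relative to the sample's own quantiles."""
    r = np.asarray(returns, dtype=float)
    lower, upper = np.quantile(r, [f, g])
    return (r <= lower) | (r >= upper)

rng = np.random.default_rng(2)
r1 = rng.standard_normal(10_000)        # stand-in for scale-1 returns

# Illustrative values only; none of these come from the data in the paper.
H, delta, drift = 0.7, 5.0, 1e-4
r_delta = delta**H * r1 + (delta - delta**H) * drift   # G^(delta) applied to r1

# The map is increasing (delta**H > 0), so ranks, and hence excursions,
# are the same for both sequences.
same = np.array_equal(excursions(r1), excursions(r_delta))
```

Since the two indicator sequences coincide, so do their waiting times, which is the mechanism behind scale invariance for self-similar processes.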

One family of self-similar models for $P$, made popular in finance by Mandelbrot [6], is the family of stable Lévy processes, i.e., processes with stable, stationary, and independent increments. But the corresponding returns, $R_1^{(\delta)}, R_2^{(\delta)}, \ldots$, are then independent and identically distributed (iid) for all $\delta > 0$, and this violates volatility clustering. This shortcoming (already apparent to Mandelbrot [6] in 1963) has led to the consideration of other self-similar models that have stationary and possibly stable, but not necessarily independent, increments. One way to construct such processes is through random time changes of Brownian motion [7,18–21]. We will return to this approach in Sec. IV C. A more direct approach is with fractional Brownian motion (FBM), which we will briefly discuss now as an illustration of the application of the excursion waiting-time distribution in the study of price fluctuations and their models.

The FBMs are a family of self-similar Gaussian processes, one for each Hurst index $H \in (0,1]$. The particular value $H = 1/2$ is the ordinary Brownian motion. Which value of $H$ best describes the 5-min excursion waiting-time distribution of the 2005 IBM data? We explored different values of $H$. For each value, we generated 500 samples of the process $P$ and extracted 18 648 returns, along with the corresponding excursion processes and their waiting-time distributions. (As discussed, in light of the fact that FBM is self-similar, the waiting-time distribution is invariant to $\delta$.) Each waiting-time distribution has a KS distance to the distribution extracted from the real data. The averages of the 500 KS distances for each of $H = 0.76, 0.77, \ldots, 0.89$ are shown in the left panel of Fig. 6. The smallest KS distance over all examined $H$ values was approximately 0.046 at $H = 0.81$. As can be seen from the right panel of Fig. 6, in comparison to real returns the fitted FBM model has too many short and too many long waiting times.

[Figure 6: left, average KS distance vs Hurst index; right, P-P plot of the waiting-time cdf of the best-fitting FBM against the waiting-time cdf of IBM 5-min returns, KS = 0.046.]

FIG. 6. (Color online) Fractional Brownian motion and excursion waiting times. (left) For each Hurst index $H = 0.76, 0.77, \ldots, 0.89$ we generated 500 FBM samples and extracted 18 648 returns, matching the 18 648 returns in the 5-min 2005 IBM data. The average KS distances between the FBM excursion waiting times and the empirical IBM waiting times are plotted. The best fit, with a KS of about 0.046, is at $H = 0.81$. (right) P-P plot of excursion waiting-time distribution for a sample from the best-fitting FBM vs the empirical IBM distribution. FBM overestimates the probabilities of short and very long waiting times.
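The text does not say how the FBM samples were generated. As one possible approach, the sketch below draws FBM increments (fractional Gaussian noise) by Cholesky factorization of their exact covariance matrix. The function name `fgn_cholesky` is hypothetical, the method is a simple choice made here rather than the authors' procedure, and its cubic cost makes it practical only for short paths (much shorter than 18 648 returns); it also requires $H < 1$.

```python
import numpy as np

def fgn_cholesky(n, hurst, rng):
    """Sample n unit-variance increments of fractional Brownian motion via
    Cholesky factorization of the exact covariance (requires 0 < hurst < 1)."""
    k = np.arange(n)
    # Autocovariance of fractional Gaussian noise at lag k.
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst)
                   - 2.0 * k ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst))
    cov = gamma[np.abs(k[:, None] - k[None, :])]   # Toeplitz covariance matrix
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

rng = np.random.default_rng(3)
increments = fgn_cholesky(1000, hurst=0.8, rng=rng)

# Excursions relative to the sample's own 10th and 90th percentiles.
lower, upper = np.quantile(increments, [0.1, 0.9])
z = (increments <= lower) | (increments >= upper)

# Fraction of excursions immediately followed by another excursion
# (the empirical probability of a zero waiting time).
repeat_fraction = (z[1:] & z[:-1]).sum() / z[:-1].sum()
```

The overall scale of the increments is irrelevant here because excursions are defined by percentiles. For `hurst=0.5` the covariance reduces to the identity and the increments are ordinary iid Gaussian returns.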

III. CONDITIONAL INFERENCE, PERMUTATIONS, AND HYPOTHESIS TESTING

Our purpose in this section is to introduce some statistical tools that relate the near-invariance of the excursion waiting-time distribution to the temporal characteristics of the empirical return data, focusing particularly on the time scale of volatility clustering. In Sec. IV these tools will be used to explore some familiar themes in price-dynamics modeling, including implied volatility, GARCH models, and various approaches to stochastic time change, also known as market time. The statistical characterization of price and volatility fluctuations is obviously very complicated. Under that circumstance, model-free statistical methods can be particularly effective tools for probing dynamics and discerning spatial and temporal patterns. The excursion process itself is an example in that it avoids absolute thresholds and model-based parameter estimates. Permutation tests are another example and are particularly suitable for relating the excursion process to the time scales operating in price fluctuations, as we shall now discuss.

A. Permutation tests

Returns are not exchangeable. If they were, there would be no stochastic volatility. Although we anticipated a failure of exchangeability, what is not apparent are the time scales involved in this departure of real dynamics from the basic random-walk models encapsulated by the geometric Lévy processes. Are the 5-min returns of IBM locally exchangeable? What if we were to permute the twelve 5-min returns in each hour; would the price process look any different, either visually or statistically? As for visually, there is certainly no obvious “tell,” judging from a comparison of Figs. 7(b) and 7(c). Figure 7(b) plots the prices of IBM at 5-min intervals from 9:45 AM to 3:45 PM on a randomly selected day in 2005. Figure 7(c) plots a surrogate price sequence, derived from the original [i.e., the trajectory in Fig. 7(b)] by permuting, randomly and independently, each set of 12 returns within each of the 6 h. The surrogate sequence is started at the same price as the original and therefore again has the same price as the original at each ensuing hour. There is no visual clue that separates the real from the surrogate price sequence, and in our experience there never is one.

How about statistically? Can we detect a difference in the dynamics? Is there any indication that separates a real trajectory from its permutation surrogates? If so, how does this separation depend on time scale? We could as easily permute the set of 5-min returns within each week, each day, each hour, or each 30-min interval. At what time scale does exchangeability break down? Put differently, at what time scales does volatility clustering operate? These questions can be systematically and robustly answered through a permutation test and the resulting departure of the excursion waiting times between the permuted and original trajectories as measured through the KS distance.
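A block-permutation test of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, blocks are taken as consecutive runs of the flat return array (ignoring day boundaries), the returns are synthetic with piecewise-constant volatility, and the p value uses the common convention (1 + number of surrogates at least as extreme) / (1 + number of surrogates).

```python
import numpy as np

def ks_excursions_to_geometric(returns, f=0.1, g=0.9):
    """KS distance between excursion waiting times and the geometric cdf."""
    r = np.asarray(returns, dtype=float)
    lower, upper = np.quantile(r, [f, g])
    waits = np.diff(np.flatnonzero((r <= lower) | (r >= upper))) - 1
    k = np.arange(waits.max() + 1)
    ecdf = np.cumsum(np.bincount(waits, minlength=k.size)) / waits.size
    p = f + (1.0 - g)
    return np.abs(ecdf - (1.0 - (1.0 - p) ** (k + 1))).max()

def permute_within_blocks(returns, block, rng):
    """Independently shuffle each consecutive block of `block` returns.
    A trailing partial block, if any, is left in its original order."""
    r = np.array(returns, dtype=float)            # work on a copy
    n_full = (r.size // block) * block
    r[:n_full] = rng.permuted(r[:n_full].reshape(-1, block), axis=1).ravel()
    return r

def permutation_p_value(returns, block, n_perm, rng):
    """One-sided p value: are surrogates as far from geometric as the data?"""
    observed = ks_excursions_to_geometric(returns)
    surrogate = np.array([
        ks_excursions_to_geometric(permute_within_blocks(returns, block, rng))
        for _ in range(n_perm)
    ])
    return (1 + np.sum(surrogate >= observed)) / (1 + n_perm)

# Synthetic returns with volatility held constant over runs of 60 returns.
rng = np.random.default_rng(4)
sigma = np.repeat(np.exp(rng.standard_normal(100)), 60)
returns = sigma * rng.standard_normal(sigma.size)

p_global = permutation_p_value(returns, block=returns.size, n_perm=200, rng=rng)
p_local = permutation_p_value(returns, block=2, n_perm=200, rng=rng)
```

Varying `block` from the whole sample down to pairs of returns probes the time scale at which exchangeability breaks down: permuting everything destroys the clustering in this synthetic example, while permuting within very short blocks leaves most of it intact.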

Let $r_1, r_2, \ldots, r_{18648}$ be the 18 648 five-minute intraday returns, as defined in Sec. II. Consider any statistic $T$ (function of these returns), such as the KS distance between the excursion waiting-time distribution and the geometric distribution, as examined in Fig. 7(a). Also consider the particular “null hypothesis” $H_0$ that $\mathcal{L}\{(R_{\rho(1)}, R_{\rho(2)}, \ldots, R_{\rho(18648)})\}$ is invariant


[Figure 7: six panels, (a) to (f). (a) P-P plot of the geometric cdf vs the waiting-time cdf of IBM 5-min returns, KS = 0.131. (b) IBM stock price vs 5-minute intervals. (c) Stock prices (returns permuted) vs 5-minute intervals. (d) to (f) Histograms of number of permutations vs KS distance to geometric, labeled “permute all returns,” “permute in 20-min intervals,” and “permute in 10-min intervals”; the annotations p<0.0004 and p<0.0002 appear among the histogram panels.]

FIG. 7. (Color online) Exchangeability and time scale. The 5-min returns on IBM stock in 2005 were tested for their departure from exchangeability, as reflected in the excursion waiting-time distribution. (a) P-P plot of the excursion waiting-time distribution of the IBM returns vs the geometric distribution (corresponding to the waiting time between successes in a Bernoulli sequence with a probability of 0.2 of success). The distributions would be nearly identical if the returns were exchangeable. (b) Trajectory of IBM prices from 9:45 AM to 3:45 PM, sampled every 5 min, for a randomly selected day in 2005. (c) Same starting price as in (b), but with the twelve 5-min returns in each of the six 1-h intervals randomly and independently permuted. Since the returns within a given hour are exactly preserved, the stock prices in (b) and (c) are the same at 10:45 AM and at each hour thereafter. The dynamics governing the trajectories in (b) and (c) are not apparently different. (d) Distribution of KS distances to the geometric distribution, obtained from 5000 surrogate return sequences corresponding to 5000 random permutations of the 18 648 IBM 5-min returns. The vertical red line marks the KS distance (0.131) of the original sequence of returns. (e) Test for local exchangeability. Surrogates were produced by independently permuting every disjoint 20-min block of four 5-min returns. The distribution of KS distances was again computed from 5000 surrogates. In general, tests employing larger time intervals produce still lower p values. Thus, despite appearances, the evidence strongly points to a highly significant difference between the trajectories in (b) and (c). (f) The ensemble of surrogates derived from permutations of pairs of returns for every 10-min block are indistinguishable from the original sequence with respect to the departure of their excursion waiting-time distributions from geometric.

the random variables associates with the observed returns. The point is not that we actually believe Ho (among other things,

it violates volatility clustering), but rather that it leads to a measure of departure from exchangeability as determined by the particular statistic being examined and the particular set of permutations . Under the null hypothesis a sequence of M iid permutations, ρ1(·),ρ2(·), . . . ,ρM(·), chosen from the uniform

distribution on the set of permutations in , produces a sequence of M+ 1 conditionally iid T ’s, namely, the observed

Tobs= T (r1,r2, . . . ,r18648) together with one additional value for each permutation:

Tρm = T 

rρm(1),rρm(2), . . . ,rρm(18648) 

, m= 1,2, . . . ,M.

It follows that under Ho

Pr{No.{m = 1,2, . . . ,M : Tρm Tobs}  N} 

N+ 1 M+ 1.

(4) In other words, if N= No.{m = 1,2, . . . ,M : Tρm  Tobs}, then (N+ 1)/(M + 1) is an exact p value for Ho in the

direction of the alternative Hathat Tobsis larger than would be expected under Ho.4

⁴ This is an instance of conditional inference in that the test is conditioned on the particular realization. The correctness of the p value follows from its correctness for any realization.

Figure 7(d) illustrates the test with M = 5000 and Π unrestricted, i.e., the entire permutation group on the sequence 1,2, . . . ,18 648. Since T_obs is larger than any of the values of T evaluated for the surrogate (i.e., permuted) sequences, N = 0, and the test has a p value of 1/5001 ≈ 0.0002. As expected, the waiting-time distribution of real returns is not consistent with exchangeability and in fact produced the largest deviation from geometric among all of the 5001 sequences. Suppose now that we restrict Π to include only local permutations, say within each day or hour or 20-min period. Then selecting from the uniform distribution on Π is the same thing as independently choosing a permutation for each (nonoverlapping) day or hour or 20-min period, providing a mechanism for systematically exploring the time scale of volatility clustering.
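The permutation test, with either unrestricted or block-restricted permutations, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `block_permute`, `permutation_p_value`, and `abs_autocorr` are hypothetical names, and `abs_autocorr` (the lag-1 autocorrelation of absolute returns) is only a simple stand-in for the KS statistic used in the paper. The p value is (N + 1)/(M + 1), as in Eq. (4).

```python
import numpy as np

def block_permute(x, block_len, rng):
    """Shuffle x independently within consecutive blocks of `block_len` entries.
    block_len=None shuffles the whole sequence (unrestricted permutations)."""
    x = np.asarray(x)
    if block_len is None:
        return rng.permutation(x)
    out = x.copy()
    for start in range(0, x.size, block_len):
        out[start:start + block_len] = rng.permutation(x[start:start + block_len])
    return out

def permutation_p_value(x, statistic, block_len=None, n_perm=5000, seed=0):
    """Return (p, t_obs, t_perm) with p = (N + 1) / (M + 1), where N counts
    the surrogates whose statistic is at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    t_obs = statistic(x)
    t_perm = np.array([statistic(block_permute(x, block_len, rng))
                       for _ in range(n_perm)])
    n_ge = int(np.sum(t_perm >= t_obs))
    return (n_ge + 1) / (n_perm + 1), t_obs, t_perm

def abs_autocorr(x):
    """Stand-in statistic: lag-1 autocorrelation of |x| (large under volatility clustering)."""
    a = np.abs(x) - np.mean(np.abs(x))
    return float(np.dot(a[:-1], a[1:]) / np.dot(a, a))
```

Calling `permutation_p_value(returns, statistic, block_len=4)` on 5-min returns would correspond to permuting within 20-min blocks; `block_len=None` corresponds to the global test.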

B. Exploring time scale

Clearly, we cannot treat the entire set of 18 648 IBM 5-min returns from 2005 as exchangeable [Fig. 7(d)]. In practice, traders adjust for changes in volatility, as measured by σ (the standard deviation of logarithmic returns); returns should only be considered exchangeable within a time period. But how often should volatility be updated? Are the returns, at least approximately, exchangeable within days or perhaps within 1- or 0.5-h intervals? In general, consider a partitioning of the index set {1,2, . . . ,18 648} into disjoint intervals of length λ, where λ is a time span, measured in units of 5 min, over which the returns are presumed to be essentially exchangeable. We would use λ = 74 to test exchangeability within single days (recall that the first and last 10 min of each day of prices are excluded) and λ = 12, 6, 4, and 2, respectively, to test exchangeability in 1-h, 30-min, 20-min, and 10-min intervals. By virtue of Eq. (4), these hypotheses can be tested, and exact p values can be computed by generating ensembles of surrogate return sequences from ensembles of random permutations and then comparing the corresponding values of the KS statistic to its observed value. For fixed λ, permutations are drawn iid from the uniform distribution on the set of permutations Π that preserves membership in the designated intervals.

Figures 7(e) and 7(f) show the results of testing for local exchangeability of the excursion process in the 5-min IBM data over 20-min [λ = 4, Fig. 7(e)] and 10-min [λ = 2, Fig. 7(f)] intervals. Intervals longer than 20 min result in smaller p values. Evidently, if time-varying volatility is the source of the breakdown in exchangeability, then it is operating at an extremely high frequency.

In line with the near-invariance of the waiting-time distribution, we find that other intervals, other stocks, and other years lead to similar results.

IV. TIME SCALE AND STOCHASTIC VOLATILITY MODELS

These observations of nongeometric waiting times and remarkably rapid changes in volatility suggest mechanisms for evaluating the validity of models of price and return dynamics. Which models and mechanisms are consistent with the observed properties of the excursion process? Stock dynamics are highly nonstationary, and stochastic volatility is a compelling modeling tool through which nonstationarity can be accommodated. We examined implied volatility, GARCH volatility models, and market-time transformations (trade and volume based) for their consistency with the invariance of excursion waiting times and the empirical characteristics of local and global exchangeability. We were unable to match the data from any one of these points of view, as discussed in the following sections.

A. Implied volatility

One place to look for a nonstationary volatility process that is commensurate with the breakdown of exchangeability is in the volatility implied by the pricing of options. Implied volatilities are forward looking and, as such, not a model for σ → σ_t in the Black-Scholes model. But the question here is not whether they reflect the actual minute-to-minute or hour-to-hour volatilities of their underlying stocks, but rather whether they include sufficiently rapid changes in amplitude to support the lack of global and even local exchangeability in the return process.

Eight days of minute-by-minute Citigroup 2008 stock and option prices were sampled from 9:35 AM until 3:55 PM (381 prices per day) and used to compute the minute-by-minute volatilities implied by the 19 April 2008 put with strike price 22.5 (left panel, Fig. 8). This sequence was used to produce a corresponding return process, from which an empirical excursion waiting-time distribution was extracted.⁵ The volatility trajectory includes substantial fluctuations across multiple time scales, as is evident in Fig. 8, and it would be reasonable to expect a failure of exchangeability in the derived return process. To the contrary, the waiting-time distribution was surprisingly similar to geometric (Fig. 8, middle panel, KS value of 0.02), and in fact the return sequence was indistinguishable from global exchangeability based on the KS statistic and full-interval permutations (right panel). Results for local exchangeability were similar. The experiment again makes the point that extreme high-frequency fluctuations in volatility might be needed to match the properties of the real excursion process in the context of a Black-Scholes model with time-varying σ. Implied volatilities evidently do not take into account these strong intraday volatility fluctuations.
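As a rough illustration of this experiment, the sketch below generates returns from a Brownian model with a time-varying volatility path and extracts the excursion indicators. Several things here are assumptions and not taken from the paper: the volatility path is a made-up sinusoid standing in for the implied volatilities, the drift is ignored, excursions are taken to be returns at or beyond the 10th and 90th sample percentiles, and all function names are hypothetical.

```python
import numpy as np

def simulate_returns(sigma_path, dt, rng):
    """Driftless log returns r_t = sigma_t * sqrt(dt) * eps_t, eps_t iid standard normal."""
    sigma_path = np.asarray(sigma_path, dtype=float)
    return sigma_path * np.sqrt(dt) * rng.standard_normal(sigma_path.size)

def excursion_indicator(r, tail=0.1):
    """Flag returns in the lower or upper `tail` fraction of the sample (assumed definition)."""
    lo, hi = np.quantile(r, [tail, 1.0 - tail])
    return (r <= lo) | (r >= hi)

rng = np.random.default_rng(2)
n = 8 * 380                      # eight days, 381 prices (380 one-minute returns) per day
t = np.arange(n)
sigma = 0.65 + 0.05 * np.sin(2.0 * np.pi * t / 380)   # made-up path, not market data
r = simulate_returns(sigma, dt=1.0 / (252 * 380), rng=rng)
z = excursion_indicator(r)

# Rescaling the returns rescales the percentiles by the same factor,
# so the excursion process is unchanged.
z_scaled = excursion_indicator(4.0 * r)
```

Because the excursion process depends only on the ranks of the returns relative to their sample percentiles, `z` and `z_scaled` coincide, which is the invariance to multiplication of the returns noted in the text.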

B. GARCH

We examined the suitability of Engle’s [22] autoregressive conditional heteroskedasticity (ARCH) model and its generalization, GARCH [23], for producing excursion processes that match the statistics of the excursions of real stock returns. We explored a collection of ARCH and GARCH models by fitting to the 1-min returns from the 2005 IBM stock prices. Over a wide range of values for the moving average and autoregressive orders (q and p, respectively), we found that GARCH(p,q) models provide a nearly perfect fit to empirical waiting-time distributions but fail to match the invariance properties of these distributions across return intervals. We will show results for the particular model GARCH(10,10) but emphasize that virtually identical results were obtained for the more commonly used GARCH(1,1) model, as well as every combination of 1 ≤ p ≤ 10 and 1 ≤ q ≤ 10 that we tested. Given the ample amount of data (93 240 one-minute returns) and given that for 1 ≤ p,q ≤ 10 the GARCH(p,q) model is included in the GARCH(10,10) model, we choose to show the results for GARCH(10,10).

⁵ The scale of the volatility process is irrelevant since the excursion process is invariant to multiplication of the returns.

[Figure 8: three panels. (left) Implied volatility vs minutes. (middle) Geometric cdf vs waiting-time cdf, simulated returns, KS = 0.020. (right) Number of permutations vs KS distance to geometric.]

FIG. 8. (Color online) Implied volatility generates exchangeable returns. Eight days of minute-by-minute 2008 Citigroup stock and put prices (strike price 22.5, maturing on 19 April 2008) were used to calculate the minute-by-minute implied volatility and to generate simulated minute-by-minute returns from a geometric Brownian motion with volatility function (σ = σ_t) equal to the implied volatility. (left) Minute-by-minute implied volatility. (middle) The excursion waiting-time distribution of the simulated returns closely resembles the geometric distribution, unlike the real 1-min returns for which the P-P plot against the geometric is essentially identical to the one shown in Fig. 7(a) (5-min returns of IBM). (right) Simulated returns were not distinguishable from exchangeable returns through the KS statistic, despite substantial fluctuations in the implied volatility at multiple time scales.

After fitting the GARCH parameters (see Table III for estimated parameters and their standard errors), the model was used to produce a full year of simulated 1-min returns. The excursion waiting-time distribution of the simulated data matches the distribution extracted from the real data, as indicated by the P-P plot in the top left panel of Fig. 9 and the small KS distance. Furthermore, as with the real data and in contrast to experiments with implied volatility (Sec. IV A), GARCH simulated returns are not exchangeable, even under permutations confined to 2-min intervals (see top right, bottom left, and bottom right panels, respectively, for results on full exchangeability and 4- and 2-min exchangeability). In general, the match between simulated and actual returns was excellent. On the other hand, real stocks produce excursion waiting times that are nearly scale invariant, as already documented in Sec. II and illustrated in Fig. 3 for the 2005 IBM data. For comparison, the left panel of Fig. 10 reproduces the bottom middle panel of Fig. 3, whereas the right panel shows the corresponding P-P plot for the GARCH simulated data. The KS distance between 1- and 5-min waiting-time distributions for the IBM data is 0.005, whereas the GARCH generated 1-min returns, aggregated to produce 5-min returns, produce a KS distance of 0.05. In general, GARCH models have poor scaling properties, as already noted in the discussion of intraday return intervals in Sec. 4 of Andersen and Bollerslev [24]. In fact, GARCH models, although elegant and apparently suitable for fitting volatility, are inconsistent in the sense that, in general, a process cannot obey a GARCH model for both 1- and k-min returns for any k = 2,3, . . . , as is easily demonstrated analytically.
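The aggregation comparison can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it simulates a GARCH(1,1) process (rather than the fitted GARCH(10,10)) with made-up parameters `omega`, `alpha`, and `beta`, sums blocks of five 1-period returns to form 5-period returns, and measures the KS distance between the two excursion waiting-time distributions. Excursions are assumed to be returns at or beyond the 10th and 90th sample percentiles, and all function names are hypothetical.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, rng):
    """R_t = sigma_t * eps_t with sigma_t^2 = omega + alpha * R_{t-1}^2 + beta * sigma_{t-1}^2."""
    r = np.empty(n)
    var = omega / (1.0 - alpha - beta)       # start at the stationary variance
    for t in range(n):
        r[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * r[t] ** 2 + beta * var
    return r

def aggregate(r, k):
    """Sum consecutive blocks of k log returns to obtain k-period log returns."""
    m = (r.size // k) * k
    return r[:m].reshape(-1, k).sum(axis=1)

def excursion_waits(r, tail=0.1):
    """Sorted waits (non-excursions between consecutive excursions), assumed definition."""
    lo, hi = np.quantile(r, [tail, 1.0 - tail])
    idx = np.flatnonzero((r <= lo) | (r >= hi))
    return np.sort(np.diff(idx) - 1)

def ks_two_sample(w1, w2):
    """Sup distance between two empirical cdfs supported on the nonnegative integers."""
    grid = np.arange(max(w1[-1], w2[-1]) + 1)
    c1 = np.searchsorted(w1, grid, side="right") / w1.size
    c2 = np.searchsorted(w2, grid, side="right") / w2.size
    return float(np.max(np.abs(c1 - c2)))

rng = np.random.default_rng(1)
r1 = simulate_garch11(93_240, omega=1e-8, alpha=0.10, beta=0.85, rng=rng)  # hypothetical parameters
r5 = aggregate(r1, 5)
ks_1_vs_5 = ks_two_sample(excursion_waits(r1), excursion_waits(r5))
print(f"KS distance between 1- and 5-period waiting-time cdfs: {ks_1_vs_5:.3f}")
```

The number printed depends on the chosen parameters and seed and is not meant to reproduce the values of 0.05 and 0.005 reported in the text; the sketch only shows how such a comparison could be set up.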

C. Market time

There is no reason to believe that a good model for the logarithm of stock prices should be homogeneous in time. To the contrary, the random-walk model suggests that the variance of a return should depend on the number or volume of transactions (the number of “steps”) rather than the number of seconds. The compelling idea that market time is measured by accumulated activity rather than the time on the clock seems to have been suggested first by Mandelbrot and Taylor [7] and then worked through, more formally, by Clark [18]. It has been revisited in several influential papers since then; see the discussions by Geman [25] and Shephard [26] for reviews and references.

TABLE III. GARCH parameter estimation. The GARCH(10,10) model ($\sigma_t^2 = \omega + \sum_{i=1}^{10} \alpha_i R_{t-i}^2 + \sum_{i=1}^{10} \beta_i \sigma_{t-i}^2$) was estimated from 93 240 one-minute returns of IBM stock from 2005 ($R_t$, $t = 1, 2, \ldots, 93\,240$), using the University of California, Davis, GARCH MATLAB toolbox. The estimated values and standard errors of the 21 parameters are shown. Zero values are common due to stability and positivity constraints.

Parameter  Estimated value  Standard error    Parameter  Estimated value  Standard error
ω          1.5111           0.0000            β1         0.0646           0.3472
α1         0.1445           0.0044            β2         0                0.3514
α2         0.0758           0.0508            β3         0                0.3200
α3         0.0427           0.0404            β4         0.2385           0.2767
α4         0.0368           0.0406            β5         0                0.2305
α5         0                0.0285            β6         0                0.2140
α6         0                0.0283            β7         0.1892           0.1739
α7         0                0.0260            β8         0                0.1994
α8         0                0.0179            β9         0                0.1862
α9         0                0.0178            β10        0.1613           0.1038
α10        0                0.0163

[Figure 9: four panels. (top left) Waiting-time cdf, GARCH 1-min returns vs waiting-time cdf, IBM 1-min returns, KS = 0.010. (top right, bottom left, bottom right) Number of permutations vs KS distance to geometric, for permuting all returns, permuting within 4-min intervals, and permuting within 2-min intervals; panel annotations include p < 0.0002 and p < 0.0008.]

FIG. 9. (Color online) Simulated 1-min IBM returns using GARCH. One-minute returns on IBM for all of 2005 were used to fit a GARCH model with autoregressive and moving average terms each of order 10 (p = q = 10). (top left) Waiting-time distribution between excursions in the simulated returns was a near-perfect match to the empirical distribution. Results of permutation tests for (top right) global exchangeability and (bottom) local exchangeability (4-min intervals, bottom left, and 2-min intervals, bottom right) were essentially identical to the results for the real returns (not shown).

Here we employ a simple yet definitive test that rules out the possibility that any function of volume or number of transactions can render the return process compatible with a geometric Brownian motion or, for that matter, any of its Lévy generalizations. In particular, time changes based on volume or trade numbers do not transform returns into exchangeable sequences. The key, then, to ruling out these simple market-time transformations lies in the dynamics; it is not enough to simply match the marginal distributions of the returns, as we now demonstrate.

Formally, let $D(t) \doteq \log S(t)$, and start with the customary model $D(t) = \mu t + \sigma w(t)$, where $w$ is a standard Brownian motion or a more general process with stationary and independent increments (i.e., a Lévy process). Volatility clustering is inconsistent with the resulting stationarity and/or independence of the increments of $D$ (and hence the modeled returns). One remedy is to introduce a volatility process, $\sigma \to \sigma(t)$, as in the well-known models of Hull and White [27] and Heston [28], or any of a variety of other models for stochastic volatility (cf. [26]). Another remedy is to introduce a market-time process $\tau(t)$, usually independent of $w$, and write $w(\tau(t))$ in place of $w(t)$. (Actually, the two models are oftentimes equivalent; see, e.g., [21,29].) Depending on the details of the model for $S$ and for $\tau$, $D(t)$ becomes $\mu t + \sigma w(\tau(t))$ or $\mu \tau(t) + \sigma w(\tau(t))$.

Assuming that $\tau$ is independent of $w$, Clark [18] experimented with various functions of the volume as measures of market time:

$$\tau(t) - \tau(s) = f(V(t) - V(s)) \quad \forall\, s \le t, \tag{5}$$

where $V(t)$ is accumulated volume and $f$ is monotone increasing. More recently, Easley et al. [30] provided support for Eq. (5) by demonstrating “partial recovery of normality” using equal-volume returns. On the other hand, Ané and Geman [31] have argued that the number of trades, as opposed to the accumulated volume, is the fundamental determinant of $\tau$ [hence $f(T(t) - T(s))$ in Eq. (5), where $T(t)$ is the accumulated number of transactions]. Mandelbrot and Taylor [7] raised both possibilities.

[Figure 10: two P-P plots. (left) Waiting-time cdf, IBM 1-min returns vs waiting-time cdf, IBM 5-min returns, KS = 0.005. (right) Waiting-time cdf, GARCH 1-min returns vs waiting-time cdf, GARCH 5-min returns, KS = 0.050.]

FIG. 10. (Color online) Failure of the GARCH model to match waiting-time scale invariance of real returns. (left) P-P plot matching excursion waiting-time distributions for the IBM 1-min to the IBM 5-min returns (2005). Distributions are nearly identical. (right) Same comparison using GARCH-generated 1-min returns, aggregated to make a record of 5-min returns. Although there is an excellent fit to the 1-min data (see Fig. 9), the model fails to scale across different return intervals.

The typical test shows that the normal distribution is a better approximation of the distribution of returns when returns are defined by equal intervals of τ rather than equal intervals of “clock time.” But this is a weak test. The marginal distribution of a process carries no information about its temporal statistics. Dynamics are more important but not as easily explored. The excursion waiting-time distribution is fundamentally about dynamics and provides an easy and sensitive test of whether a time-transformed price process is, even approximately, a geometric Lévy process (e.g., geometric Brownian motion).

Whether volume based [e.g., Eq. (5)] or trade based [$V(t) \to T(t)$], let $0 < t_1 < t_2 < \cdots$ be an increasing sequence yielding equal increments of $\tau$: $\tau(t_k) - \tau(t_{k-1}) = \tau(t_l) - \tau(t_{l-1})$ $\forall\, k \ne l$, $k, l > 0$. If $D(t) = \mu \tau(t) + \sigma w(\tau(t))$, then set $R_k = D(\tau_k) - D(\tau_{k-1})$, and otherwise, if $D(t) = \mu t + \sigma w(\tau(t))$, set $R_k = D(\tau_k) - D(\tau_{k-1}) - \mu(t_k - t_{k-1})$. (The difference is negligible for short intervals.) For either model of $D$ and either model of $\tau$ (volume based or trade based), if the market-time corrected process is geometric Brownian motion (or, more generally, Lévy), then the return sequence $R_1, R_2, \ldots$ constructed in this manner is necessarily iid and therefore exchangeable.

Consider, for example, Fig. 11, where we examine equal-market-time returns on IBM 2005 stock under the assumption that $\tau$ is determined by the number of trades. In particular, returns were defined on successive intervals containing 110 trades each (corresponding, on average, to 5 min of clock time). Thus, $R_k = D(\tau_k) - D(\tau_{k-1}) = \log S(t_{\tau_k}) - \log S(t_{\tau_{k-1}})$, where $t_{\tau_k}$ is the time when the $\tau_k$th trade occurred and $\tau_k = 110k$ for all $k = 0, 1, 2, \ldots$. Obviously, the process $R_1, R_2, \ldots$ is far from exchangeable (right panel), and the waiting-time distribution is a poor approximation of the geometric distribution (left panel). We examined all combinations of models for $D$ and $\tau$ (volume based and trade based). Each case produces a figure essentially identical to Fig. 11; these market-time transformations fail to render the returns exchangeable.

By the evidence, neither the number of trades nor the accumulated volume is, in and of itself, a viable measure of market time. The dynamics of the return process, following a volume or trade-based time change, do not resemble those of a geometric Brownian motion or any other Lévy process.
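The construction of equal-market-time returns can be sketched as follows, assuming trade-by-trade arrays of prices and volumes are available. This is an illustration only: the function names and input layout are hypothetical, trades are indexed from 0, and the volume-based version takes f in Eq. (5) to be the identity, which is just one of the choices the text allows.

```python
import numpy as np

def trade_time_returns(trade_prices, trades_per_interval=110):
    """Log returns between every `trades_per_interval`-th trade (trades 0, 110, 220, ...)."""
    p = np.asarray(trade_prices, dtype=float)
    sampled = p[::trades_per_interval]
    return np.diff(np.log(sampled))

def volume_time_returns(trade_prices, trade_volumes, volume_per_interval):
    """Log returns between the trades at which cumulative volume first reaches
    each successive multiple of `volume_per_interval`."""
    p = np.asarray(trade_prices, dtype=float)
    cum = np.cumsum(np.asarray(trade_volumes, dtype=float))
    n_intervals = int(cum[-1] // volume_per_interval)
    targets = volume_per_interval * np.arange(1, n_intervals + 1)
    # First trade index with cumulative volume >= target (clipped for safety).
    idx = np.minimum(np.searchsorted(cum, targets, side="left"), p.size - 1)
    return np.diff(np.log(p[idx]))
```

The resulting return sequences can then be passed to the same excursion waiting-time and permutation tests used for clock-time returns.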

V. SUMMARY AND CONCLUDING REMARKS

We have given empirical evidence for a new invariant in the price movements of stocks. The waiting-time distribution between large returns (excursions) is nearly invariant to scale (length of the return interval), stock, and the year of observation. The clustering of excursions is a manifestation of the well-studied clustering of volatility. The invariance in the clustering of excursions therefore constrains proposed models and mechanisms for volatility clustering. Self-similar (log) price processes have invariant waiting times between excursions, but the evidence for self-similarity is confined to the distributions on log price increments and not the processes themselves. Furthermore, scaling indices estimated from return data vary from study to study [11] and are extremely sensitive to statistical methodology, as might be expected given that most approaches focus on the tail behavior of the return distributions. By contrast, waiting-time distributions rely on percentiles, which are robust and nonparametric and evidently stable given the weight of evidence for invariance presented in Sec. II.

[Figure 11: two panels. (left) Geometric cdf vs waiting-time cdf, corrected for market time, KS = 0.086. (right) Number of permutations vs KS distance to geometric, p < 0.0002.]

FIG. 11. (Color online) Interval time measured by number of trades. In 2005, there was an average of about 110 trades of IBM stock every 5 min. If market time were measured by the number of trades and were adequate to transform prices into a Lévy process, then returns over 110-trade intervals would be exchangeable. (left) Excursion waiting-time distribution for equal-market-time intervals (110 trades) does not match the geometric distribution. (right) Equal-market-time returns are not exchangeable, as evidenced by the distribution of KS values under permutations. Market time measured by volume instead of trades also fails to render returns exchangeable.

We have illustrated the possible utility of excursion waiting times by examining some models for price and volatility dynamics. In general, the failure of even local exchangeability of excursions (and therefore returns) points to rapid changes in volatility. Thus, implied volatility, for example, is much too smooth (despite its appearance; see Fig. 8). ARCH and GARCH models, even of low order, track volatility sufficiently well to produce simulated returns with excursion waiting times that are a near-perfect match to empirical waiting times. But unlike real returns, aggregating the simulated 1-min returns into simulated 5-min returns produces a different waiting-time distribution. This might have been anticipated (though not guaranteed) by the observation that these models themselves lack scale invariance. Finally, we examined the appealing idea of a market-activity-based time change in an effort to remove volatility clustering and restore exchangeability to the random-walk model. Returns were redefined with respect to equal increments of market time, as opposed to clock time, under both volume-based and trade-based measures of market activity. Neither definition of market time rendered an exchangeable sequence of excursions.

The usual caution about the distinction between statistical significance and scientific significance bears repeating here. We have introduced exact hypothesis tests that produce very small p values. In and of themselves, these values are not particularly interesting given the large sample sizes involved (e.g., almost 20 000 five-minute returns on IBM stock from 2005). Our focus, instead, was on the trajectory of p values under a sequence of global-to-local exchangeability tests and on the comparison of p values between data produced by real returns and data simulated from models.

A more subtle statistical issue concerns the use of aggregated data for inference about temporal dynamics, especially scaling properties, such as self-similarity. Consider using a year’s worth of price data ($S(t)$, $t \in [0,T]$) for estimating the joint distribution on successive returns $R_1^{(\delta)}, \ldots, R_n^{(\delta)}$ over intervals of length $\delta$, where

$$R_k^{(\delta)} = \log \frac{S(\delta k)}{S(\delta(k-1))}.$$

(Typically, $n = 1$, and the goal is to study the distribution on returns and its relationship to $\delta$.) To keep things simple, assume that

$$\log S(t) = \int_0^t \sigma(s)\,dw(s),$$

where $w$ is an $\alpha$-stable Lévy process ($\alpha = 2$ when $w$ is Brownian motion), consistent with the basic geometric random-walk framework but accommodating nonconstant volatility.⁶ The $\alpha$-stable Lévy processes are self-similar, with scaling exponent $\alpha \in (0,2]$ (i.e., Hurst index $H = 1/\alpha \in [0.5,\infty)$). However, given the year under study, with its particular sample path of $\sigma(t)$, $t \in [0,T]$, $\log S(t)$ is not self-similar: $\mathcal{L}\{\log S(\delta t)\} \ne \mathcal{L}\{\delta^{1/\alpha} \log S(t)\}$.⁷ Nevertheless, an experimental study, such as Refs. [4,6,32], to name just a few, might well lead to the opposite conclusion, as follows.

⁶ The other general approach to time-varying volatility is through a time change of $w$: $w(t) \to w(\tau(t))$. As mentioned in Sec. IV C, in most models the two approaches, $\sigma\,dw(t) \to \sigma(t)\,dw(t)$ and $w(t) \to w(\tau(t))$, come down to the same thing. For more on conditions of equivalence see Veraart and Winkel [29].

⁷ Here $\log S(t) = \int_0^t \sigma(s)\,dw(s)$, but $\mathcal{L}\{\log S(\delta t)\} = \mathcal{L}\{\delta^{1/\alpha} \int_0^t \sigma(\delta s)\,dw(s)\} \ne \mathcal{L}\{\delta$ …

Assume for the time being that $\sigma(t)$ is independent of $w(t)$ and pathwise smooth enough to have negligible fluctuations in intervals of length $n\delta$, which is reasonable for all $\delta$ sufficiently small. What properties should be expected of the empirical joint distribution $\hat{F}$ on $R_1^{(\delta)}, \ldots, R_n^{(\delta)}$,

$$\hat{F}_{R_1^{(\delta)},\ldots,R_n^{(\delta)}}(r_1,\ldots,r_n) = \frac{\delta}{T} \sum_{k=1}^{T/\delta} \prod_{i=1}^{n} 1_{R_{k+i}^{(\delta)} < r_i},$$

derived from a year of returns? In particular, which, if any, of the scaling properties of $w$ are inherited by the empirical return distribution? Under the smoothness assumption on $\sigma$, a straightforward calculation shows that

$$\hat{F}_{R_1^{(\delta)},\ldots,R_n^{(\delta)}}(r_1,\ldots,r_n) \approx \hat{F}_{R_1^{(1)},\ldots,R_n^{(1)}}\big(\delta^{-1/\alpha}(r_1,\ldots,r_n)\big), \tag{6}$$

which is in fact the property that characterizes the increments of a self-similar process, with scaling index $\alpha$, such as the increments of $w$ itself. The fact that $\sigma = \sigma(t)$ is lost in the aggregation. The returns $R_1^{(\delta)}, \ldots, R_n^{(\delta)}$ appear to come from a self-similar process even though they do not.
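The calculation behind Eq. (6) can be sketched heuristically for the case $n = 1$. This is an informal outline under the stated smoothness and independence assumptions, with $\sigma > 0$, and is not the authors' derivation; the symbol $G$ (the cdf of $w(1)$) is introduced here only for illustration.

```latex
% Step 1: sigma is nearly constant on each interval of length delta, and the
% increments of w are self-similar with exponent 1/alpha:
R_{k+1}^{(\delta)} = \int_{k\delta}^{(k+1)\delta} \sigma(s)\,dw(s)
  \approx \sigma(k\delta)\,\bigl[w((k+1)\delta) - w(k\delta)\bigr]
  \overset{d}{=} \sigma(k\delta)\,\delta^{1/\alpha}\,w(1).

% Step 2: given the path of sigma, average the indicators over the many
% (independent) intervals and replace each by its probability:
\hat{F}_{R_1^{(\delta)}}(r)
  = \frac{\delta}{T}\sum_{k=1}^{T/\delta} 1_{R_{k+1}^{(\delta)} < r}
  \approx \frac{\delta}{T}\sum_{k=1}^{T/\delta}
      G\!\left(\frac{\delta^{-1/\alpha}\, r}{\sigma(k\delta)}\right)
  \approx \frac{1}{T}\int_0^T
      G\!\left(\frac{\delta^{-1/\alpha}\, r}{\sigma(t)}\right) dt .

% Step 3: the right-hand side depends on delta and r only through
% delta^{-1/alpha} r, so
\hat{F}_{R_1^{(\delta)}}(r) \approx \hat{F}_{R_1^{(1)}}\bigl(\delta^{-1/\alpha}\, r\bigr).
```

The last integral is a mixture of rescaled copies of $G$, weighted by the time that $\sigma(t)$ spends at each level, so the time ordering of the volatility path does not enter the approximation.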

The implicit assumption behind aggregation is stationarity. In its absence, the aggregated estimator is a mixture of distributions, each generated by w but mixed with respect to the occupation measure of σ(t) over the yearlong observation t ∈ [0,T].⁸ Chang and Geman [33] demonstrated that the convergence is quite rapid and the approximation in (6) typically holds even when the return interval δ is large relative to the fluctuations of σ. What does the same reasoning say about the empirical waiting-time distribution for excursions, as computed over the same time interval? This is a substantially harder calculation, but in one regard the conclusion is likely to be the same: if we accept the geometric random-walk model, then scale invariance of the empirical waiting-time distribution for all δ sufficiently small is a foregone conclusion. On the other hand, the particular invariant distribution, including, for example, the empirical probability of zero wait between excursions (approximately 0.32), very much depends on the particular occupation measure of σ.

⁸ The distribution of the random variable σ(X) when X is uniform on [0,T].

In light of these observations, empirical evidence for scale invariance in the timing of excursions and for self-similarity of the price process is at least consistent with the geometric random-walk model, if not in fact further support for its basic soundness, whether or not the volatility process is stationary. What is more, the near invariance of the excursion waiting-time distribution across stocks and years points to a volatility-generating process with an occupation measure that is surprisingly reproducible, modulo a constant scale. Notice that if nonconstant market activity were the source of stochastic volatility, then its strong correlations across stocks would begin to explain invariance of waiting times across stocks. Notice also that most days begin and end with relatively high activity, a daily rhythm which might contribute to the invariance from one era to another.

In light of the results in Sec. IV C, however, we would need to look beyond any simple function of trades or volume for the relevant measure of market activity (and hence market time). It might be sensible, for example, to view trades as indicating the time of a step in the random walk and volume as determining the scale of the distribution on the step size (related to the ideas of Gabaix et al. [34]). There is no reason to believe that the relationship between the volume v of a trade and the scale σ = σ(v) of the resulting random step would be linear (though presumably it is monotonic). To the contrary, it would depend on the complexities of supply and demand, as might be reflected in the state and dynamics of the collective order book. In any case, it might be feasible to estimate σ (v) nonparametrically by maximum likelihood. The test of the model would then be the same: are returns over equal market-time intervals exchangeable?

ACKNOWLEDGMENTS

The authors gratefully acknowledge insightful discussions with Hélyette Geman and Matthew Harrison, as well as financial support from the Office of Naval Research under Contract No. N000141010933; the National Science Foundation under Grants No. ITR-0427223, No. DMS-1007593, and No. DMS-1007219; the Defense Advanced Research Projects Agency under Contract No. FA8650-11-1-7151; the Center of Mathematical Modeling and Scientific Computing; the National Center for Theoretical Science, Hsinchu, Taiwan; and the National Science Council under Grant No. 100-2115-M-009-007-MY2.

[1] F. Hsieh, S.-C. Chen, and C.-R. Hwang, Quant. Finance 12, 213 (2012).

[2] L.-B. Chang, A. Goswami, F. Hsieh, and C.-R. Hwang, Bull. Inst. Math. Acad. Sin. 8, 31 (2013).

[3] P. Diaconis and D. Freedman, Ann. Probab. 8, 745 (1980).

[4] R. N. Mantegna and H. E. Stanley, An Introduction to Econophysics (Cambridge University Press, Cambridge, 2000).

[5] P. Embrechts and M. Maejima, Self-Similar Processes (Princeton University Press, Princeton, NJ, 2002).

[6] B. Mandelbrot, J. Bus. 36, 394 (1963).

[7] B. Mandelbrot and H. Taylor, Oper. Res. 15, 1057 (1967).

[8] L. Calvet and A. Fisher, Rev. Econ. Stat. 84, 381 (2002).

[9] Z. Xu and R. Gencay, Physica A 323, 578 (2003).

[10] R. N. Mantegna and H. E. Stanley, Nature (London) 376, 46 (1995).

[11] J.-P. Bouchaud, Quant. Finance 1, 105 (2001).

[12] P. Gopikrishnan, V. Plerou, L. A. N. Amaral, M. Meyer, and H. E. Stanley, Phys. Rev. E 60, 5305 (1999).

[13] B. H. Wang and P. M. Hui, Eur. Phys. J. B 20, 573 (2001).

[14] R. Gencay, F. Selcuk, and B. Whitcher, Physica A 289, 249 (2001).

[15] T. D. Matteo, Quant. Finance 7, 21 (2007).

[16] J. B. Glattfelder, A. Dupuis, and R. B. Olsen, Quant. Finance 11, 599 (2011).

[17] F. Wang, K. Yamasaki, S. Havlin, and H. E. Stanley, Phys. Rev. E 73, 026117 (2006).

[18] P. K. Clark, Econometrica 41, 135 (1973).

[19] T. G. Andersen, J. Finance 51, 169 (1996).

[20] C. C. Heyde, J. Appl. Probab. 36, 1234 (1999).

[21] H. Geman, D. B. Madan, and M. Yor, Math. Finance 11, 79 (2001).

[22] R. F. Engle, Econometrica 50, 987 (1982).

[23] T. Bollerslev, J. Econometrics 31, 307 (1986).

[24] T. G. Andersen and T. Bollerslev, J. Empirical Finance 4, 115 (1997).

[25] H. Geman, J. Banking Finance 29, 2701 (2005).

[26] N. Shephard, Stochastic Volatility (Oxford University Press, Oxford, 2005).

[27] J. Hull and A. White, J. Finance 42, 281 (1987).

[28] S. L. Heston, Rev. Financial Stud. 6, 327 (1993).

[29] A. E. Veraart and M. Winkel, in Encyclopedia of Quantitative Finance, edited by R. Cont (Wiley, Hoboken, NJ, 2010).

[30] D. Easley, M. M. López de Prado, and M. O’Hara, J. Portfolio Manage. 39, 19 (2012).

[31] T. Ané and H. Geman, J. Finance 55, 2259 (2000).

[32] U. A. Muller, M. M. Dacorogna, R. B. Olsen, O. V. Pictet, M. Schwarz, and C. Morgenegg, J. Banking Finance 14, 1189 (1990).

[33] L.-B. Chang and S. Geman, Physica A (2013), doi: 10.1016/j.physa.2013.06.049.

[34] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley, Nature (London) 423, 267 (2003).
