
Theory and Modern Applications

Parameter estimation for discretized geometric fractional Brownian motions with applications in Chinese financial markets

Abstract

It is widely accepted that financial data exhibit a long-memory property or long-range dependence. In a continuous-time setting, the geometric fractional Brownian motion is an important model for characterizing the long-memory property in finance. This paper therefore considers the problem of estimating all unknown parameters in geometric fractional Brownian processes based on discrete observations. The estimation procedure combines the bipower variation with least-squares estimation. However, unlike the commonly used approximations of the likelihood and transition-density methods, we do not require a small sampling interval. The strong consistency of the proposed estimators is established as the sample size increases to infinity for a fixed sampling interval. A simulation study is also conducted to assess the performance of the derived method by comparing it with two existing approaches proposed by Misiran et al. (International Conference on Optimization and Control 2010, pp. 573–586, 2010) and Xiao et al. (J. Stat. Comput. Simul. 85(2):269–283, 2015), respectively. Finally, we apply the proposed estimation approach to the analysis of Chinese financial markets to show its potential applications in realistic contexts.

1 Introduction

Long-memory phenomena have been observed in numerous scientific fields such as hydrology, geophysics, economics, finance, climatology, physics, biology, medicine, music, and telecommunications engineering, among others (see Hurst [23], Hosking [20], Lo [28], Willinger et al. [40], Baillie [1], Lai et al. [26], Hu and Øksendal [22], Granger and Hyung [19], Fleming and Kirby [18], Chronopoulou and Viens [11], Rossi and Fantazzini [35], Nguyen et al. [32], and the references therein). A time series with long-memory behavior has a slowly, hyperbolically declining autocorrelation function or, equivalently, an infinite spectrum at zero frequency. In fact, the best-known and most widely used stochastic model exhibiting a long-memory property in a continuous-time setting is, of course, the fractional Brownian motion (hereafter fBm), which is a suitable generalization of the standard Brownian motion. Consequently, the fBm and stochastic processes driven by it have found many applications in diverse fields including economics, finance, geophysics, biology, oceanography, meteorology, telecommunications engineering, physics, chemistry, medicine, and environmental studies (see, e.g., Doukhan et al. [14], Çağlar [8], Chronopoulou and Viens [10], Chronopoulou and Viens [11], Comte et al. [13], and the references therein). In particular, the well-known geometric fractional Brownian motion (hereafter gfBm) has been extensively used for capturing the fluctuations of stock prices (see Duncan et al. [15], Elliott and Chan [16], Elliott and Van Der Hoek [17], Hu and Øksendal [22], Mishura [30], Rostek [36]).

The development of gfBm naturally leads to studies in statistical inference, which has attracted great practical and theoretical interest. In fact, proper estimation of unknown parameters in stochastic models is of the utmost importance, since these estimators significantly affect risk management, derivatives pricing, and portfolio optimization. In particular, applications in finance often require both speed and accuracy in parameter estimation for small samples in order to facilitate dynamic decision making and risk management (see, e.g., Phillips and Yu [34]). As a consequence, parameter estimation for gfBm, a challenging theoretical problem, has been of great interest in the past decade. For example, the problem of parameter estimation in a simple linear model driven by a fBm was investigated in Bertin et al. [5], Bertin et al. [6], Hu et al. [21], Xiao et al. [41], Brouste and Iacus [7], Liu and Song [27], Xiao et al. [42], Cheng et al. [9], Xiao et al. [46], Sun et al. [37]. In ground-breaking works, Xiao et al. [44], Xiao et al. [45], Tanaka et al. [38], Wang et al. [39] established the asymptotic theory for the estimators of fractional Ornstein–Uhlenbeck processes. As for statistical inference on gfBm observed at discrete intervals, to the best of our knowledge, Kukush et al. [25] first developed an incomplete maximum-likelihood estimation for the drift parameter, which is separated from the estimation of the long-memory parameter and improves on estimation methods designed only for the Hurst parameter, such as R/S analysis and variation analysis. Moreover, Misiran et al. [31] designed a general complete maximum-likelihood-type procedure for discrete data that estimates all the unknown parameters in gfBm, including the drift parameter, the diffusion coefficient, and the Hurst index, although without a proof of asymptotic behavior. In the ground-breaking work of Xiao et al. [43], all the unknown parameters of gfBm were estimated from discrete observations based on the quadratic variation and the maximum-likelihood approach; the asymptotic properties of the estimators were provided there as well. In addition, we refer to a recent monograph, Kubilius et al. [24], for a complete exposition of the different approaches used in statistical inference for fractional diffusions. However, both the complete maximum-likelihood estimators proposed in Misiran et al. [31] and the incomplete maximum-likelihood estimators introduced in Xiao et al. [43] are very time consuming. This is because the complete maximum-likelihood estimation involves numerically solving the profile-likelihood function, while the implementation of the incomplete maximum-likelihood estimation needs to compute the inverse and determinant of the autocovariance matrix of the fBm, which requires excessive computational time and computer memory. For example, a single Monte Carlo sample takes a little less than 20 minutes on a desktop Intel i7 computer, while a more realistic empirical study would take much longer to run. Hence, neither method is well suited to larger data sizes. Consequently, finding less time-consuming estimators for discretely observed gfBm becomes a challenging problem and is of great interest for practical purposes.

Therefore, this paper mainly addresses the demand, arising in practical applications, for less time-consuming estimators for gfBm based on discrete-time data; concretely, we estimate the three parameters involved in gfBm: the drift, the volatility, and the Hurst parameter. Based on discrete-time observations, we study the problem of estimating all the unknown parameters, including the drift, the diffusion, and the Hurst parameter, for gfBm in the setting of \(0< H<1\). The main contribution of this paper is to construct the estimators and to derive the asymptotic theory for these proposed estimators. While our framework is applicable to a wide range of Gaussian processes (e.g., subfractional Brownian motion, bifractional Brownian motion, weighted-fractional Brownian motion), we focus here on the case of fBm, which is widely used in physics, electrical engineering, biophysics, and finance.

The remainder of this paper proceeds as follows. Section 2 introduces the model and addresses the estimators of gfBm, which is observed at discrete points of time. This section proposes the least-squares-type estimators for both the drift and diffusion coefficients and presents the bipower variation estimator for the Hurst parameter. The asymptotic properties of these proposed estimators are also discussed in Sect. 2. Section 3 presents two existing estimation procedures for gfBm. In Sect. 4, we give some simulation examples to show the finite-sample performance of these estimators. The computational tests show favorable results for our proposed estimator even with relatively small sample sizes. The simulation results also demonstrate that our method is computationally simple and asymptotically unbiased. To show how to apply our approach in realistic contexts, Sect. 5 is devoted to presenting our empirical results of four major financial indices in China: the Shanghai Composite index (SHCI), the Shenzhen Component index (SZCI), the CSI Smallcap 500 index (CSI 500), and the CSI 300 index (CSI 300). Concluding remarks are discussed in the final section. All the proofs are collected in the Appendix.

2 Parameter estimation

Since the pioneering work of Mandelbrot and Van Ness [29], fBm has been extensively used to capture long-range dependence, self-similarity, non-Markovianity, subdiffusivity, and superdiffusivity. A crucial problem in the application of stochastic models driven by fBm in practice is how to estimate the unknown values of the parameters involved in these models. In this paper, we consider the problem of estimating all unknown parameters in gfBm, which is widely used for option pricing and is able to capture memory dependence. In fact, there is a key challenge in estimating the parameters of gfBm: the discretely observed data are not Markovian. This means that state-space models and Kalman-filter estimators cannot be applied to estimate the parameters in gfBm. In this paper, the bipower variation and least-squares estimation are thus employed to estimate the unknown parameters in gfBm. In what follows, we first introduce some notation and then present the estimators for gfBm.

2.1 Model simplification

To capture the long-range dependence in financial asset returns, Duncan et al. [15], Elliott and Chan [16], and Hu and Øksendal [22] used gfBm to model the dynamics of stock prices. Thus, the stock price, \(S_{t}\), can be written as

$$ dS_{t} = \mu S_{t}\,dt + \sigma S_{t}\,dB^{H}_{t} ,\quad t\ge 0 , S_{0}=1, $$
(2.1)

where μ is the rate of return, σ is the volatility, and (\(B_{t}^{H}\), \(t\ge 0\)) is a fBm with Hurst parameter \(H\in (\frac{1}{2}, 1)\).

Using Wick integration, Hu and Øksendal [22] showed that the solution of (2.1) can be written as (see Eq. (5.2.6) of Mishura [30]):

$$ S_{t} = \exp \biggl( \mu t -\frac{1}{2}\sigma ^{2} t^{2H}+ \sigma B^{H}_{t} \biggr)= \exp (Y_{t} ),\quad t\ge 0 , $$
(2.2)

where \(Y_{t}= \mu t -\frac{1}{2}\sigma ^{2} t^{2H}+ \sigma B^{H}_{t} \).

Comparing (2.1) with (2.2), no information is lost in transforming the observation \(S_{t}\) into \(Y_{t}\). Hence, estimating the parameters from (2.1) is equivalent to estimating the unknown parameters from \(Y_{t}\). Now, we assume that the process \(Y_{t}\) is observed at discrete-time instants \((t_{1}, t_{2}, \ldots , t_{N})\). The observation vector is \({\mathbf{Y}}=(Y_{t_{1}},Y_{t_{2}},\ldots ,Y_{t_{N}})'\), where the prime (′) denotes vector transposition and all nonprimed vectors are row vectors. In particular, to simplify the notation we assume \(t_{k}=kh\), \(k=1, 2, \ldots , N\), for a fixed step size \(h>0\). Consequently, for \(H\in (0,1)\), the discrete-time observations can be expressed in vector form as

$$ {\mathbf{Y}}=\mu {\mathbf{t}}+\sigma{\mathbf{B}}_{t}^{H} -\frac{1}{2}\sigma ^{2}{ \mathbf{t}}^{2H} , $$
(2.3)

where \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})'\), \({\mathbf{t}}=(h, 2h, \ldots , Nh)'\), \({\mathbf{t}}^{2H}= ( h^{2H}, (2h)^{2H}, \ldots , (Nh)^{2H} )'\), and \({\mathbf{B}}_{t}^{H}=(B_{h}^{H}, \ldots , B_{Nh}^{H})' \). Our aim is to estimate the unknown parameters H, σ, and μ from the observations \(Y_{t_{i}}=Y_{ih}\), \(0\leq i\leq N\), for a fixed interval h, and to study their asymptotic properties as \(N\rightarrow \infty \). Throughout, we will use the notation C for a generic constant, which may change from line to line.

2.2 Estimation procedures

Based on the discrete observations described in Sect. 2.1, we now proceed to estimate the unknown parameters of gfBm.

First, we consider the problem of estimating the Hurst parameter of gfBm. From (2.3), we have

$$\begin{aligned}& Y_{h} = \mu h +\sigma B_{h}^{H}- \frac{1}{2}\sigma ^{2} h^{2H} , \\& Y_{2h} = \mu 2h +\sigma B_{2h}^{H}- \frac{1}{2}\sigma ^{2} (2h)^{2H} , \\& \vdots = \vdots \\& Y_{Nh} = \mu Nh +\sigma B_{Nh}^{H}- \frac{1}{2}\sigma ^{2} (Nh)^{2H} . \end{aligned}$$

Now, let

$$\begin{aligned} Z_{ih} =&Y_{ih}-Y_{(i-1)h} \\ =&\mu h+ \sigma \bigl(B_{ih}^{H}-B_{(i-1)h}^{H} \bigr)-\frac{1}{2} \sigma ^{2} \bigl[ (ih )^{2H}- \bigl( (i-1 )h \bigr)^{2H} \bigr] \\ =&\mu h+\sigma U_{ih}-\frac{1}{2}\sigma ^{2}V_{ih} , \end{aligned}$$

where \(U_{ih}=B_{ih}^{H}-B_{(i-1)h}^{H}\) and \(V_{ih}= (ih )^{2H}- ( (i-1 )h )^{2H}\).

Clearly, \(U_{ih}\) (\(i=1,\ldots ,N\)) are centered Gaussian random variables with variance \(h^{2H}\). Using the formula for the expected absolute value of the product of two jointly normal random variables, we obtain

$$\begin{aligned} \mathbb{E} \vert U_{ih}U_{ (i-1 )h} \vert = \frac{2 h^{2H}}{\pi} \Bigl( \rho _{1}\arcsin \rho _{1}+\sqrt {1-\rho _{1}^{2}} \Bigr) , \end{aligned}$$
(2.4)

where \(\rho _{1}=2^{2H-1}-1 \) is the lag-1 correlation of the increments \(U_{ih}\).

Similarly, a standard calculation yields

$$\begin{aligned} \mathbb{E} \vert U_{ih}U_{ (i-2 )h} \vert = \frac{2 h^{2H}}{\pi} \Bigl( \rho _{2}\arcsin \rho _{2}+\sqrt {1-\rho _{2}^{2}} \Bigr) , \end{aligned}$$
(2.5)

where \(\rho _{2}=\frac{1}{2} (3^{2H}+1-2^{2H+1} )\) is the lag-2 correlation of the increments.

Then, using the bipower variation (see Barndorff-Nielsen and Shephard [4]), we define the ratio function as follows:

$$\begin{aligned} R(H) =& \frac{\frac{1}{N-1}\sum_{i=2}^{N} \vert Z_{ih} \vert \vert Z_{ (i-1 )h} \vert }{ \frac{1}{N-2}\sum_{i=3}^{N} \vert Z_{ih} \vert \vert Z_{ (i-2 )h} \vert } \\ =& \frac{ \mathbb{E} [ \vert Z_{ih} \vert \vert Z_{ (i-1 )h} \vert ]}{ \mathbb{E} [ \vert Z_{ih} \vert \vert Z_{ (i-2 )h} \vert ]} \\ =& \frac{ \mathbb{E} [ \vert \mu h+\sigma U_{ih}-\frac{1}{2}\sigma ^{2}V_{ih} \vert \vert \mu h+\sigma U_{ (i-1 )h}-\frac{1}{2}\sigma ^{2}V_{ (i-1 )h} \vert ]}{\mathbb{E} [ \vert \mu h+\sigma U_{ih}-\frac{1}{2}\sigma ^{2}V_{ih} \vert \vert \mu h+\sigma U_{ (i-2 )h}-\frac{1}{2}\sigma ^{2}V_{ (i-2 )h} \vert ]} \\ =& \frac{ \sigma ^{2} \mathbb{E} \vert U_{ih}U_{ (i-1 )h} \vert +o (h^{2H})}{\sigma ^{2} \mathbb{E} \vert U_{ih}U_{ (i-2 )h} \vert +o (h^{2H})} \\ \sim & \frac{ \mathbb{E} \vert U_{ih}U_{ (i-1 )h} \vert }{ \mathbb{E} \vert U_{ih}U_{ (i-2 )h} \vert } , \end{aligned}$$
(2.6)

where “\(\sim \)” means that the ratio of the left- and right-hand sides converges to one as N tends to infinity.

By combining (2.4) and (2.5) with (2.6), we obtain the following nonlinear function

$$\begin{aligned}& \frac{ (N-2 )\sum_{i=2}^{N} \vert Z_{ih} \vert \vert Z_{ (i-1 )h} \vert }{ (N-1 )\sum_{i=3}^{N} \vert Z_{ih} \vert \vert Z_{ (i-2 )h} \vert } = \frac{\rho _{1}\arcsin \rho _{1}+\sqrt{1-\rho _{1}^{2}}}{\rho _{2}\arcsin \rho _{2}+\sqrt{1-\rho _{2}^{2}} } . \end{aligned}$$
(2.7)

Finally, we can obtain the estimate Ĥ of the Hurst index H by solving the nonlinear equation (2.7).
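Equation (2.7) has no closed-form solution in H, so a numerical root-finding step is needed. The following Python sketch is an illustration only (not the authors' MATLAB implementation); it assumes the dimensionless increment correlations \(\rho _{1}=2^{2H-1}-1\) and \(\rho _{2}=\frac{1}{2}(3^{2H}+1-2^{2H+1})\), and bisects on a subinterval of \((\frac{1}{2},1)\) on which numerical evaluation shows the theoretical ratio to be strictly increasing:

```python
import math

def _f(rho):
    # (pi/2) * E|XY| for a standard bivariate normal pair with correlation rho
    return rho * math.asin(rho) + math.sqrt(1.0 - rho * rho)

def theoretical_ratio(H):
    """Right-hand side of (2.7): ratio of lag-1 to lag-2 absolute moments."""
    rho1 = 2.0 ** (2 * H - 1) - 1.0                           # lag-1 increment correlation
    rho2 = 0.5 * (3.0 ** (2 * H) + 1.0 - 2.0 ** (2 * H + 1))  # lag-2 increment correlation
    return _f(rho1) / _f(rho2)

def invert_ratio(target, lo=0.501, hi=0.84, iters=100):
    """Bisection for H; the ratio is increasing on this subinterval
    (checked numerically), though not over all of (1/2, 1)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if theoretical_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_hurst(z):
    """Plug the empirical bipower ratio (left-hand side of (2.7),
    computed from increments z with len(z) >= 3) into the inversion."""
    a = [abs(v) for v in z]
    n = len(a)
    num = sum(a[i] * a[i - 1] for i in range(1, n)) / (n - 1)
    den = sum(a[i] * a[i - 2] for i in range(2, n)) / (n - 2)
    return invert_ratio(num / den)
```

A quick sanity check is to feed `invert_ratio` the theoretical ratio at a known H and verify that it is recovered.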

Remark 2.1

We can construct other estimators for the Hurst parameter using the realized variation ratio method or change-of-frequency approach. Both methods are based on multipower variations of the higher-order difference of gfBm. We refer to the excellent work of Barndorff-Nielsen et al. [3] or Bardet and Surgailis [2] for details.

In practice, it is impossible to obtain an analytical expression for H from (2.7), so we solve (2.7) numerically. We are now in a position to estimate the drift μ and the volatility σ. The plug-in least-squares technique is employed here because of its low computational cost. Now, from (2.3), we have

$$\begin{aligned} Y_{ih} =&\mu ih +\sigma B_{ih}^{H}- \frac{1}{2}\sigma ^{2} (ih)^{2H} . \end{aligned}$$

The least-squares estimators aim to minimize the following function

$$\begin{aligned} \sum_{i=1}^{N} \biggl\vert Y_{ih}-\mu ih+\frac{1}{2}\sigma ^{2} (ih)^{2H} \biggr\vert ^{2} . \end{aligned}$$

This implies that

$$\begin{aligned} \textstyle\begin{cases} \sum_{i=1}^{N} ih Y_{ih} - \mu \sum_{i=1}^{N} (ih)^{2} + \frac{1}{2}\sigma ^{2}\sum_{i=1}^{N} (ih)^{2H+1} = 0 , \\ \sum_{i=1}^{N} (ih)^{2H}Y_{ih} - \mu \sum_{i=1}^{N} (ih)^{2H+1} + \frac{1}{2}\sigma ^{2}\sum_{i=1}^{N} (ih)^{4H}= 0 . \end{cases}\displaystyle \end{aligned}$$

Let \(a=\sum_{i=1}^{N} (ih)^{2} \), \(b=\sum_{i=1}^{N} (ih)^{2H+1}\), \(c=\sum_{i=1}^{N} (ih)^{4H}\). Then, we have

$$\begin{aligned} \textstyle\begin{cases} \sum_{i=1}^{N} ih Y_{ih} - a\mu + \frac{1}{2}\sigma ^{2} b = 0 , \\ \sum_{i=1}^{N} (ih)^{2H}Y_{ih} - b\mu + \frac{1}{2}\sigma ^{2} c=0 . \end{cases}\displaystyle \end{aligned}$$
(2.8)

As a consequence, we can obtain the estimators μ̂ and \(\hat{\sigma}^{2}\) by solving (2.8):

$$\begin{aligned}& \hat{\mu} = \frac{\hat{b}\sum_{i=1}^{N}(ih)^{2\hat{H}} Y_{ih} - \hat{c}\sum_{i=1}^{N} ih Y_{ih}}{\hat{b}^{2}-\hat{a}\hat{c}} , \end{aligned}$$
(2.9)
$$\begin{aligned}& \hat{\sigma}^{2} = \frac{2 (\hat{a}\sum_{i=1}^{N} (ih)^{2\hat{H}} Y_{ih}-\hat{b}\sum_{i=1}^{N}ihY_{ih} )}{\hat{b}^{2}-\hat{a}\hat{c}} , \end{aligned}$$
(2.10)

where \(\hat{a}=\sum_{i=1}^{N} (ih)^{2} \), \(\hat{b}=\sum_{i=1}^{N} (ih)^{2 \hat{H}+1}\), and \(\hat{c}=\sum_{i=1}^{N}(ih)^{4\hat{H}}\).

The estimators of μ and \(\sigma ^{2}\) in (2.9) and (2.10) are given in closed form, so no numerical optimization is required. Hence, these estimators are computationally efficient.
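Since (2.9) and (2.10) are closed-form, they translate directly into a few lines of code. A minimal Python sketch (an illustration with hypothetical function names, assuming the Hurst estimate has already been obtained):

```python
def ls_estimates(y, h, H):
    """Closed-form least-squares estimates (2.9)-(2.10) of mu and sigma^2,
    given the Hurst estimate H; y[i-1] holds Y_{ih} for i = 1..N."""
    N = len(y)
    t = [i * h for i in range(1, N + 1)]
    a = sum(ti ** 2 for ti in t)
    b = sum(ti ** (2 * H + 1) for ti in t)
    c = sum(ti ** (4 * H) for ti in t)
    s1 = sum(ti * yi for ti, yi in zip(t, y))             # sum of ih * Y_{ih}
    s2 = sum(ti ** (2 * H) * yi for ti, yi in zip(t, y))  # sum of (ih)^{2H} * Y_{ih}
    denom = b * b - a * c                                 # nonzero by Cauchy-Schwarz
    mu_hat = (b * s2 - c * s1) / denom
    sigma2_hat = 2.0 * (a * s2 - b * s1) / denom
    return mu_hat, sigma2_hat
```

With noise-free data \(Y_{ih}=\mu ih-\frac{1}{2}\sigma ^{2}(ih)^{2H}\), the estimates recover μ and \(\sigma ^{2}\) exactly, which is a convenient sanity check.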

Remark 2.2

The parameter-estimation procedure for the gfBm proposed in this paper will proceed as follows:

  1. (i)

    Obtain the estimator of the Hurst parameter by solving the nonlinear function (2.7);

  2. (ii)

    Compute the estimator of μ by (2.9);

  3. (iii)

    Calculate the estimator of \(\sigma ^{2}\) by (2.10).

2.3 Asymptotic properties

In what follows, we turn to study the asymptotic behavior of these estimators defined by (2.7), (2.9), and (2.10), respectively. First, we state the strong consistency of Ĥ.

Theorem 2.3

The estimator Ĥ obtained by solving (2.7) converges to H almost surely as N goes to infinity.

Proof

See the Appendix. □

Now, we consider the strong consistency for both μ̂ and \(\hat{\sigma}^{2}\).

Theorem 2.4

The estimators μ̂ and \(\hat{\sigma}^{2}\) defined by (2.9) and (2.10), respectively, show strong consistency, that is,

$$\begin{aligned}& \hat{\mu} \rightarrow \mu \quad \textit{a.s. as }N\rightarrow \infty, \end{aligned}$$
(2.11)
$$\begin{aligned}& {\hat{\sigma}^{2}} \rightarrow \sigma ^{2} \quad \textit{a.s. as }N \rightarrow \infty . \end{aligned}$$
(2.12)

Proof

See the Appendix. □

Remark 2.5

The estimators proposed in this paper can easily be extended to all \(H\in (0,1)\). Using the same arguments as in Theorems 2.3 and 2.4, we can obtain the strong consistency of Ĥ, μ̂, and \(\hat{\sigma}^{2}\) for all \(H\in (0,1)\).

3 Two alternative procedures

In order to examine the performance of the proposed estimators, in this section we introduce two existing estimation procedures for comparison. The first is the complete maximum-likelihood approach proposed in Misiran et al. [31], and the other is the incomplete maximum-likelihood method provided in Xiao et al. [43].

3.1 The complete maximum-likelihood estimation

As stated in Misiran et al. [31], the Gaussian property of the gfBm makes the process a perfect candidate for maximum-likelihood estimation of all the unknown parameters simultaneously. Thus, from Misiran et al. [31], we obtain the complete maximum-likelihood estimators for μ and \(\sigma ^{2}\) from the observation \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})\) as

$$\begin{aligned}& \widetilde{\sigma}^{2}_{M} = \frac{2{\mathbf{Z}}'\Sigma _{1}^{-1}{\mathbf{Z}} }{ \sqrt{N^{2}+{\mathbf{X}}'_{H}\Sigma _{1}{\mathbf{X}}_{H}{\mathbf{Z}}'\Sigma _{1} {\mathbf{Z}}}+N} , \end{aligned}$$
(3.1)
$$\begin{aligned}& \widetilde{\mu}_{M} = \frac{1}{{\mathbf{1}}'\Sigma _{0}^{-1}{\mathbf{1}}} \Biggl( { \mathbf{1}}' \Sigma _{0}^{-1}{ \mathbf{Z}} +\frac{\widetilde{\sigma}^{2}_{M}}{2} { \mathbf{1}}'\Sigma _{0}^{-1}{\mathbf{X}}_{H} \Biggr) , \end{aligned}$$
(3.2)

where \({\mathbf{Z}}=(Z_{h},Z_{2h},\dots ,Z_{Nh})'\), \({\mathbf{X}}_{H}= (h^{2H}, (2h)^{2H}-h^{2H}, \dots , (Nh)^{2H}-(Nh-h)^{2H} )'\), \(Z_{ih}=Y_{ih}-Y_{(i-1)h}\), \(\Sigma _{1}=\Sigma _{0}^{-1} ({\mathbf{I}}- \frac{{\mathbf{1}}{\mathbf{1}}'\Sigma _{0}^{-1}}{{\mathbf{1}}'\Sigma _{0}^{-1}{\mathbf{1}}} )\), I is an \(N\times N\) identity matrix, and

$$ \Sigma _{0}=\frac{1}{2} h^{2H} \bigl( \vert i-j+1 \vert ^{2H}-2 \vert i-j \vert ^{2H}+ \vert i-j-1 \vert ^{2H} \bigr) _{i,j=1,2, \dots ,N} . $$

Observe that the estimators \(\widetilde{\mu}_{M}\) and \(\widetilde{\sigma}^{2}_{M}\) depend on H, which must also be estimated. Misiran et al. [31] stated that the estimator \(\hat{H}_{M}\) of H can be obtained by minimizing the following profile-likelihood function

$$ \bigl(N \ln \widetilde{\sigma}^{2}_{M}+\ln \vert \Sigma _{0} \vert \bigr)+ \frac{1}{ \widetilde{\sigma}^{2}_{M}} \biggl({\mathbf{Z}}-\widetilde{\mu}_{M} {\mathbf{1}}+ \frac{\widetilde{\sigma}^{2}_{M}}{2} { \mathbf{X}}_{H} \biggr)' \Sigma _{0}^{-1} \biggl({\mathbf{Z}}-\widetilde{\mu}_{M} {\mathbf{1}}+\frac{\widetilde{\sigma}^{2}_{M}}{2}{ \mathbf{X}}_{H} \biggr) . $$
(3.3)

To solve the optimization problem (3.3), we have to rely on numerical methods. The function fminsearch, a standard part of MATLAB, is one candidate tool.
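Outside MATLAB, any derivative-free one-dimensional minimizer can play the role of fminsearch here, since H is the only free variable after profiling out μ and \(\sigma ^{2}\). A minimal golden-section search sketch in Python (the profile-likelihood objective (3.3) is left abstract as `objective`; this generic stand-in is an illustration, not the authors' code):

```python
import math

def golden_section_min(objective, lo, hi, tol=1e-8):
    """Minimize a unimodal function of one variable on [lo, hi]
    by golden-section search; returns the interval midpoint."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, the golden ratio reciprocal
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)  # interior probe points
        d = a + invphi * (b - a)
        if objective(c) < objective(d):
            b = d                 # minimum lies in [a, d]
        else:
            a = c                 # minimum lies in [c, b]
    return 0.5 * (a + b)
```

In the profile-likelihood setting, one would call `golden_section_min` with `objective` evaluating (3.3) at a candidate H and bounds inside (1/2, 1).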

Remark 3.1

The estimation procedure for the gfBm proposed by Misiran et al. [31] will proceed as follows:

  1. (i)

    Minimize (3.3) numerically to obtain the estimator \(\hat{H}_{M}\) of H;

  2. (ii)

    Compute \(\widetilde{\sigma}^{2}_{M}\) by replacing H with \(\hat{H}_{M}\) in (3.1);

  3. (iii)

    Calculate \(\widetilde{\mu}_{M}\) by replacing H with \(\hat{H}_{M}\) in (3.2);

  4. (iv)

    Calculate the estimator \(\hat{\sigma}_{M}^{2}\) of \(\sigma ^{2}\) using the relationship \(\hat{\sigma}_{M}^{2}=h^{-\hat{H}}\widetilde{\sigma}^{2}_{M}\);

  5. (v)

    Calculate the estimator \(\hat{\mu}_{M}\) of μ using the relationship \(\hat{\mu}_{M}=h^{-1}\widetilde{\mu}_{M}\).

Remark 3.2

It should be noted that the accuracy of \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\) depends crucially on the estimator \(\hat{H}_{M}\) of the Hurst parameter H. As a consequence, replacing H with \(\hat{H}_{M}\) in (3.1) and (3.2) could affect the asymptotic behavior of \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\). Intuitively, the more accurate \(\hat{H}_{M}\) is, the more accurate \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\) are. Hence, an optimization method should be used to obtain the optimal value of \(\hat{H}_{M}\) from (3.3).

Remark 3.3

Computationally, the algorithm proposed in this paper is very fast. The major advantage of our method is that its computational cost is markedly lower than that of the approach presented by Misiran et al. [31], which involves the numerical computation of the covariance matrix and the logarithm of its determinant. Our method, by contrast, relies only on a simple result obtained via the variation method.

3.2 The incomplete maximum-likelihood estimation

By contrast, from Xiao et al. [43], we obtain the incomplete maximum-likelihood estimators for μ and \(\sigma ^{2}\) from the observation \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})\) as

$$\begin{aligned}& \hat{H}_{X} = \frac{1}{2}-\frac{1}{2\ln 2}\ln \frac{\sum_{i=1}^{N-1} [\exp (Y_{(i+1)h} ) -\exp (Y_{ih} ) ]^{2}}{\sum_{i=1}^{\lfloor \frac{N}{2}\rfloor -1} [\exp (Y_{2(i+1)h} )-\exp (Y_{2ih} ) ]^{2}} , \end{aligned}$$
(3.4)
$$\begin{aligned}& \hat{\sigma}_{X}^{2} = \frac{\sum_{i=0}^{N-1} (Y_{(i+1)h}-Y_{ih} )^{2}}{ Nh^{2H}} , \end{aligned}$$
(3.5)
$$\begin{aligned}& \hat{\mu}_{X} = \frac{\sigma ^{2} {\mathbf{t}}' \Gamma _{H}^{-1} {\mathbf{t}}^{2H} +2 {\mathbf{Y}}'\Gamma _{H}^{-1} {\mathbf{t}} }{2{\mathbf{t}}'\Gamma _{H}^{-1} {\mathbf{t}}} , \end{aligned}$$
(3.6)

where

$$\begin{aligned} \Gamma _{H}= \bigl[ \operatorname{Cov} \bigl[ B^{H}_{ih}, B^{H}_{jh} \bigr] \bigr]_{i,j=1,2,\ldots ,N} =\frac{h^{2H}}{2} \bigl(i^{2H}+j^{2H}- \vert i-j \vert ^{2H} \bigr)_{i,j=1,2,\ldots ,N} . \end{aligned}$$

Obviously, the maximum-likelihood estimator \(\hat{\mu}_{X}\) involves the numerical computation of the inverse and the determinant of the covariance matrix, which is computationally demanding. However, modern computing resources make it possible to obtain this estimator effectively and efficiently.

Remark 3.4

We would like to mention that the parameter-estimation procedure for the gfBm presented in Xiao et al. [43] proceeds as follows:

  1. (i)

    Calculate the estimator of the Hurst parameter by (3.4);

  2. (ii)

    Compute the estimator of \(\sigma ^{2}\) by replacing H with \(\hat{H}_{X}\) in (3.5);

  3. (iii)

    Obtain the estimator of μ by replacing H with \(\hat{H}_{X}\) and \(\sigma ^{2}\) with \(\hat{\sigma}_{X}^{2}\) in (3.6).

Remark 3.5

Let us also mention that the computational cost of obtaining \(\hat{\mu}_{X}\) is high, since this estimator involves the inverse of the covariance matrix.

4 Simulation study

In this section, we study the finite-sample properties of the proposed estimators. As addressed in the previous sections, the estimators of μ, \(\sigma ^{2}\), and H have desirable large-sample properties, such as strong consistency. In other words, if the observed time series is relatively long, statistical inference can be performed on the estimates. In some cases, however, the observations are irregular and relatively short. Hence, it is of interest to analyze the performance of μ̂, \(\hat{\sigma}^{2}\), and Ĥ in small samples, which is needed to justify the application of the asymptotic results. The small-sample behavior may also affect the choice of sampling interval. In what follows, we conduct Monte Carlo studies for different values of μ, \(\sigma ^{2}\), and H to numerically investigate the efficiency of our estimators. Moreover, we compare the finite-sample properties of our method with the two existing approaches described in Sect. 3.

Actually, the main obstacle in a Monte Carlo study is the simulation of fBm which, in contrast to standard Brownian motion, has dependent increments. In the literature, there are several methods for simulating fBm (see Coeurjolly [12]). In this paper, we apply Paxson’s algorithm (see Paxson [33]). That is, we first generate fractional Gaussian noise by Paxson’s method using the fast Fourier transform. Then, we obtain the fBm as the partial sums of the fractional Gaussian noise. Finally, we obtain the gfBm of Eq. (2.3). For a better understanding of our method, we describe the steps for the simulation of the gfBm together with the calculation of the estimators proposed in this paper. The estimation procedure of this paper by the Monte Carlo simulation method is summarized as follows:

  1. (i)

    Set the sampling interval h and the sampling size N;

  2. (ii)

    Set the values for the three variables μ, H, and σ;

  3. (iii)

    Generate fractional Gaussian noise based on Paxson’s method;

  4. (iv)

    Construct the path of the gfBm;

  5. (v)

    Obtain the estimator of the Hurst parameter by solving the nonlinear function (2.7);

  6. (vi)

    Calculate the drift estimator μ̂ using (2.9);

  7. (vii)

    Calculate the estimator \(\hat{\sigma}^{2}\) by (2.10).
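Steps (iii) and (iv) above can be sketched as follows. For transparency, this illustration uses exact Cholesky simulation of fractional Gaussian noise, an \(O(N^{3})\) method, rather than Paxson's FFT-based algorithm used in the paper, so it is a slow but simple stand-in:

```python
import math
import random

def fgn_cov(N, H, h):
    # autocovariance gamma(k) of fGn increments at step h:
    # gamma(k) = 0.5*h^{2H}*(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})
    g = lambda k: 0.5 * h ** (2 * H) * (
        abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H) + abs(k - 1) ** (2 * H))
    return [[g(i - j) for j in range(N)] for i in range(N)]

def cholesky(M):
    """Lower-triangular L with L L' = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

def simulate_gfbm_log(N, h, mu, sigma, H, rng=random):
    """Path of Y_{ih} = mu*ih - 0.5*sigma^2*(ih)^{2H} + sigma*B^H_{ih}, i = 1..N."""
    L = cholesky(fgn_cov(N, H, h))
    z = [rng.gauss(0.0, 1.0) for _ in range(N)]
    # correlated fGn increments, then cumulative sums give the fBm
    incr = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(N)]
    b, path = 0.0, []
    for i in range(1, N + 1):
        b += incr[i - 1]
        t = i * h
        path.append(mu * t - 0.5 * sigma ** 2 * t ** (2 * H) + sigma * b)
    return path
```

Setting σ = 0 reduces the path to the deterministic drift \(\mu ih\), which is a useful check of the construction.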

In an empirical study, only steps (v) to (vii) are needed. For comparison, steps (v) to (vii) should be replaced with the procedures in Remark 3.1 or Remark 3.4, respectively. The estimation procedures based on the Monte Carlo simulation method are summarized in Fig. 1.

Figure 1: Flow chart of the proposed estimation procedures based on the Monte Carlo simulation method

In addition, for some fixed sampling intervals, we carry out a simulation study to compare the estimators of our method with the complete maximum-likelihood estimation proposed in Misiran et al. [31] (see Remark 3.1) and the incomplete maximum-likelihood estimation presented in Xiao et al. [43] (see Remark 3.4), using datasets generated with different sample sizes N and different sampling intervals h. For each case, 1000 replications are simulated from the true model. All the procedures are coded in Matlab, and the results are obtained using a 3.60 GHz Intel Core i7-4790 CPU with 8 GB of RAM running Windows 10. For a fixed sampling interval \(h=1/12\) (e.g., monthly observations), Table 1 reports the mean and standard deviation (S.Dev.) of the estimators proposed in this paper for different sample sizes, where the true value denotes the parameter value used in the Monte Carlo simulation. Moreover, to show the efficiency of our method, the results from the approaches of Misiran et al. [31] and Xiao et al. [43] are also presented in Table 1, together with the average CPU time (in seconds). Furthermore, in order to test the effect of the sampling interval, Table 2 reports simulation results for the sampling interval \(h=1/52\) (e.g., weekly observations).

Table 1 Estimation results for the sampling interval \(h=1/12\) with different sample sizes
Table 2 Estimation results for the sampling interval \(h=1/52\) with different sample sizes

From the numerical computations, we can see that the biases and the standard deviations of the estimators of μ, H, and σ decrease as the sample size increases. Hence, we can conclude that the estimators of all three methods perform well for Hurst parameters \(H\in (0,1)\). As expected, the simulated means of these estimators converge rapidly to the true values, and the simulated standard deviations decrease toward zero, with a slight positive bias, as the number of observations increases. The results in Tables 1 and 2 show that the three methods are broadly comparable in accuracy; nevertheless, our methodology performs considerably better than the other two. Most of the biases and variances obtained by our method are within an acceptable tolerance. In most cases, our estimates of the Hurst parameter are quite stable and less biased. The performance of the estimators of μ and σ is also fairly satisfactory. We can also see that the larger the sample size N, the better the estimation performs. It is clear from Tables 1 and 2 that both the complete maximum-likelihood estimation of Misiran et al. [31] and the incomplete maximum-likelihood estimation of Xiao et al. [43] provide reasonably accurate estimators, although their biases and standard deviations are greater than those of the estimators proposed in this paper. Moreover, the most important finding is that the computation time required by the estimation procedure of this paper is significantly lower than those of the approaches proposed in Misiran et al. [31] and Xiao et al. [43]. This is mainly due to the fast algorithm proposed in this paper; by contrast, the approach of Misiran et al. [31] requires a one-dimensional search and needs to compute the inverse of the covariance matrix and the logarithm of its determinant, while the method of Xiao et al. [43] needs to calculate the inverse of the covariance matrix. The method provided in this paper is also very convenient, since it relies on a simple result obtained via the variation method. The accuracy of our approach is comparable to that of the estimation procedure of Xiao et al. [43], at a lower computational cost. The method proposed in this paper gives the smallest error among the three approaches, and all the estimators we propose are independent of the sampling interval h, suggesting the advantages of our proposed method over the approaches of Xiao et al. [43] and Misiran et al. [31] in more scenarios.

5 Empirical applications

To better illustrate our proposed method, we apply it to real data. The data used in our empirical investigation are extracted from the GTA Research Service Center and comprise four major Chinese market indexes, SHCI, SZCI, CSI 500, and CSI 300, spanning from 01/04/2010 through 12/31/2019. After excluding the days for which the records are incomplete (e.g., holidays or scheduled stock-exchange closures), the whole dataset contains \(M=2431\) trading days. The index prices are observed at a time interval of \(h=1/250\) (i.e., one observation per trading day), and the returns are calculated from the log-differenced data

$$ r_{ih}=Y_{ih}=\ln S_{(i+1)h}-\ln S_{ih},\quad i=1,2,\ldots ,M. $$
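In code, this transformation from prices to continuously compounded returns is a one-liner; the short price vector below is hypothetical, standing in for the GTA index data:

```python
import numpy as np

def log_returns(prices):
    """Continuously compounded returns r_i = ln S_{(i+1)h} - ln S_{ih}."""
    return np.diff(np.log(np.asarray(prices, dtype=float)))

prices = [100.0, 101.5, 100.8, 102.3]  # hypothetical closing values
r = log_returns(prices)
print(r)  # three returns from four prices
```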

Basic descriptive plots for the financial data in log-return format are presented in the following figures. In particular, Fig. 2 provides some empirical views of SHCI at a daily sampling frequency. Fig. 2(a) shows the daily closing values of SHCI over the sample period. In the original trace, there appears to be no long-run average level, which is evidence of a nonstationary time series. Figure 2(b) illustrates the continuously compounded returns (the log returns) associated with the price series in Fig. 2(a). In contrast to the price series, the log returns appear quite stable over time: the transformation from prices to returns has produced a stationary series. The quantile–quantile (Q–Q) plot for the sample period is presented in Fig. 2(c). If the empirical returns were normally distributed, we would expect a straight line in this plot. This is not the case: for low and high values (the tails of the distribution), the plot departs clearly from the reference line corresponding to a normal distribution, which means that the SHCI log returns do not follow a normal distribution. Figure 2(d) shows an estimate of the probability density function of the SHCI returns. Similarly, the statistical figures for SZCI, CSI 500, and CSI 300 are presented in Figs. 3–5.

Figure 2

Some statistical figures of daily returns for SHCI from January 4th 2010 to December 31st 2019

Figure 3

Some statistical figures of daily returns for SZCI from January 4th 2010 to December 31st 2019

Figure 4

Some statistical figures of daily returns for CSI 500 from January 4th 2010 to December 31st 2019

Figure 5

Some statistical figures of daily returns for CSI 300 from January 4th 2010 to December 31st 2019

To give a brief insight into the properties of the data, Table 3 tabulates the basic descriptive statistics of SHCI, SZCI, CSI 500, and CSI 300 over the full sample period. Names are given in the first column; the second through fifth columns contain the descriptive statistics for the four indices, including skewness and kurtosis. As is well known, the skewness of a symmetric distribution, such as the normal distribution, is zero. However, none of the series appears to be symmetric: as shown in Table 3, all four series have negative skewness, which implies that the distributions have a long left tail. Also, from Table 3 we can see that all four series have a kurtosis exceeding three, the kurtosis of the normal distribution. This means that the distributions are peaked (leptokurtic) relative to the normal distribution. As both descriptive statistics (i.e., skewness and kurtosis) indicate deviations from the normal values, we can expect that the observed distributions are not normally distributed. These results are also confirmed by the quantile–quantile (Q–Q) plots (from Fig. 2(c) to Fig. 5(c)): plotting the quantiles of the chosen series against the quantiles of the normal distribution reveals strong deviations, especially in the tails. From Fig. 2(c) to Fig. 5(c), the plots trace an S-shaped curve, a typical sign of a nonnormal distribution in a financial time series.
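The skewness and kurtosis entries of a table such as Table 3 can be reproduced from a return series with a few lines; the data below are synthetic draws standing in for the index returns:

```python
import numpy as np

def describe(r):
    """Sample skewness and kurtosis as reported in summary tables; a kurtosis
    above three signals a leptokurtic (fat-tailed) distribution."""
    r = np.asarray(r, dtype=float)
    z = (r - r.mean()) / r.std()
    return {"mean": float(r.mean()), "std": float(r.std()),
            "skewness": float(np.mean(z**3)),
            "kurtosis": float(np.mean(z**4))}

rng = np.random.default_rng(1)
heavy = describe(rng.standard_t(df=4, size=5000))   # heavy-tailed toy data: kurtosis typically well above 3
normal = describe(rng.standard_normal(200_000))     # Gaussian benchmark: kurtosis near 3, skewness near 0
```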

Table 3 Summary statistics of daily returns for four indices

After examining the basic statistical properties of the selected time series, we turn to investigate whether these four indices exhibit long-range dependence. Generally speaking, a number of procedures are available in the literature for testing for the presence of long memory in stock-return series (see, for example, the Geweke–Porter–Hudak procedure, the R/S method, the aggregated variance approach, the aggregated absolute value method, and the Whittle approach). We use the ACF plot, which is the simplest of these. Figure 6 plots the sample autocorrelation functions of the daily returns of SHCI, SZCI, CSI 500, and CSI 300 over the sample period. From Fig. 6, we observe that the autocorrelation functions of all four indices decay very slowly. Therefore, we may say that the four indices exhibit long-range dependence.
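The sample ACF and the 95% iid-Gaussian bands used in plots like Fig. 6 can be computed directly. A sketch on white noise of the same length as the dataset (for a long-memory series, the ACF would instead decay slowly and sit outside the bands at many lags):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(k) for k = 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(2)
x = rng.standard_normal(2431)          # iid noise, same length as the dataset
acf = sample_acf(x, 40)
band = 1.96 / np.sqrt(len(x))          # 95% confidence band for iid Gaussian noise
```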

Figure 6

Upper-left: Sample ACF for the SHCI log-returns. Here and in what follows, the horizontal lines in graphs displaying sample ACFs are set as the 95% confidence bands corresponding to the ACF of iid Gaussian white noise. Upper-right: Sample ACF for the SZCI log-returns. Lower-left: Sample ACF for the CSI 500 log-returns. Lower-right: Sample ACF for the CSI 300 log-returns

Finally, we are in a position to estimate the unknown parameters μ, H, and σ from the selected financial series by (2.7), (2.9), and (2.10). Using Eq. (2.2) and the real data, we estimate the desired parameters based on the estimation procedures proposed in this paper (see Remark 2.2). All the estimation results are presented in Table 4. When compared to the real data of SHCI, SZCI, CSI 500, and CSI 300, these estimated parameter values seem reasonable.
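Since Eqs. (2.7), (2.9), and (2.10) are not restated in this section, the following sketch is only a stand-in for the paper's estimators, not a reproduction of them: H is recovered from second-order quadratic variations at two time scales (the dilation ratio of their expectations is \(2^{2H}\)), σ from the finer-scale variation, and μ from the least-squares formula (A.1) of the Appendix with the estimated H plugged in. The demo path and all parameter values are illustrative.

```python
import numpy as np

def fbm_path(N, h, H, rng):
    """Exact fBm sample at t_i = i*h via a Cholesky factor of the covariance
    0.5*(s^{2H} + t^{2H} - |t - s|^{2H}); O(N^3), adequate for moderate N."""
    t = h * np.arange(1, N + 1)
    cov = 0.5 * (t[:, None]**(2 * H) + t[None, :]**(2 * H)
                 - np.abs(t[:, None] - t[None, :])**(2 * H))
    return np.linalg.cholesky(cov) @ rng.standard_normal(N)

def estimate_gfbm(Y, h):
    """Stand-in estimator for (mu, H, sigma) from log-prices Y_{ih}."""
    Y = np.asarray(Y, dtype=float)
    d1 = Y[2:] - 2 * Y[1:-1] + Y[:-2]      # second differences, step h
    d2 = Y[4:] - 2 * Y[2:-2] + Y[:-4]      # second differences, step 2h
    V1, V2 = np.mean(d1**2), np.mean(d2**2)
    # E d2^2 / E d1^2 = 2^{2H}; second differencing removes the linear drift
    # exactly and makes the -0.5*sigma^2*t^{2H} trend negligible at small h.
    H = 0.5 * np.log2(V2 / V1)
    sigma2 = V1 / (h**(2 * H) * (4.0 - 2.0**(2 * H)))  # E d1^2 = sigma^2 h^{2H}(4 - 2^{2H})
    t = h * np.arange(1, len(Y) + 1)
    a, b, c = np.sum(t**2), np.sum(t**(2 * H + 1)), np.sum(t**(4 * H))
    mu = (b * np.sum(t**(2 * H) * Y) - c * np.sum(t * Y)) / (b**2 - a * c)  # Eq. (A.1)
    return mu, H, np.sqrt(sigma2)

rng = np.random.default_rng(7)
N, h, mu0, sigma0, H0 = 1000, 1 / 250, 0.1, 0.3, 0.7
t = h * np.arange(1, N + 1)
Y = mu0 * t + sigma0 * fbm_path(N, h, H0, rng) - 0.5 * sigma0**2 * t**(2 * H0)
mu_hat, H_hat, sigma_hat = estimate_gfbm(Y, h)
print(H_hat, sigma_hat)  # H_hat and sigma_hat should land near H0 and sigma0; mu_hat is noisier at this short horizon
```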

Table 4 Empirical results for SHCI, SZCI, CSI 500 and CSI 300

6 Conclusion

The long-memory feature has become an important part of time-series analysis over the last decades, as researchers in empirical studies have sought "ideal" models for practical applications in network traffic, economics, finance, biology, physics, chemistry, and medicine. One of the fascinations with long-memory processes is their inherent ability to bridge persistent stationary and nonstationary time series. The fBm is the best-known long-memory stochastic model in the continuous-time setting. As a consequence, stochastic models driven by fBm are used by statisticians, econometricians, and researchers in many of the physical sciences who have become aware of the very strong persistence in the autocorrelations and other measures of temporal dependence of some time series. However, a crucial problem in applying fBm in practice is how to estimate the unknown parameters of stochastic processes driven by fBm. In this paper, we extend the notion of fBm into the discrete-time domain and propose an estimation methodology for the gfBm. Employing the bipower variation and the least-squares method, we have constructed a procedure for estimating all unknown parameters in the gfBm, and the strong consistency of the proposed estimators has also been established. To compare our estimators with the complete maximum-likelihood estimation method proposed in Misiran et al. [31] and the incomplete maximum-likelihood estimation provided in Xiao et al. [43], we perform a simulation study illustrating the effectiveness and efficiency of our methodology. The simulation exercise also shows that our proposed estimators work well in practice, even with small sample sizes.
Furthermore, to show how our approach can be used in realistic contexts, an empirical study is given based on the SHCI, SZCI, CSI 500, and CSI 300 of the Chinese stock markets, demonstrating that our method is easy to implement and has a smaller computational cost than the complete maximum-likelihood estimators proposed by Misiran et al. [31] and the incomplete maximum-likelihood estimators provided in Xiao et al. [43]. For future study, estimation schemes with a higher order of convergence could be employed to improve the methodology. We also expect the need for these methods, and for improvements in the statistical machinery available to practitioners, to grow further as the financial industry continues to expand and data sets become richer. The field is therefore of growing importance for both theorists and practitioners.

This study also suggests several directions for future research. The first is to extend the underlying asset-price process to more general processes, such as mixed-exponential jump-diffusion or stochastic-volatility models, which might provide insight into the robustness of the results obtained here. Another direction is to estimate the unknown parameters of a gfBm observed with microstructure noise. Finally, one could consider the problem of estimating the parameters of a gfBm with jumps using statistical methods such as the constrained expectation–maximization or the majorization–minimization algorithm.

Availability of data and materials

The datasets used during the current study are available from the corresponding author on reasonable request.

References

  1. Baillie, R.T.: Long memory processes and fractional integration in econometrics. J. Econom. 73(1), 5–59 (1996)
  2. Bardet, J.M., Surgailis, D.: Measuring the roughness of random paths by increment ratios. Bernoulli 17(2), 749–780 (2011)
  3. Barndorff-Nielsen, O.E., Corcuera, J.M., Podolskij, M.: Limit theorems for functionals of higher order differences of Brownian semi-stationary processes. In: Prokhorov and Contemporary Probability Theory, pp. 69–96. Springer, Berlin (2013)
  4. Barndorff-Nielsen, O.E., Shephard, N.: Power and bipower variation with stochastic volatility and jumps. J. Financ. Econom. 2(1), 1–37 (2004)
  5. Bertin, K., Torres, S., Tudor, C.: Maximum-likelihood estimators and random walks in long memory models. Statistics 45(4), 361–374 (2011)
  6. Bertin, K., Torres, S., Tudor, C.A.: Drift parameter estimation in fractional diffusions driven by perturbed random walks. Stat. Probab. Lett. 81(2), 243–249 (2011)
  7. Brouste, A., Iacus, S.M.: Parameter estimation for the discretely observed fractional Ornstein-Uhlenbeck process and the Yuima R package. Comput. Stat. 28(4), 1529–1547 (2013)
  8. Çağlar, M.: A long-range dependent workload model for packet data traffic. Math. Oper. Res. 29(1), 92–105 (2004)
  9. Cheng, P., Shen, G., Chen, Q.: Parameter estimation for nonergodic Ornstein-Uhlenbeck process driven by the weighted fractional Brownian motion. Adv. Differ. Equ. 2017(1), 1 (2017)
  10. Chronopoulou, A., Viens, F.G.: Estimation and pricing under long-memory stochastic volatility. Ann. Finance 8(2–3), 379–403 (2012)
  11. Chronopoulou, A., Viens, F.G.: Stochastic volatility and option pricing with long-memory in discrete and continuous time. Quant. Finance 12(4), 635–649 (2012)
  12. Coeurjolly, J.: Simulation and identification of the fractional Brownian motion: a bibliographical and comparative study. J. Stat. Softw. 5(7), 1–53 (2000)
  13. Comte, F., Coutin, L., Renault, É.: Affine fractional stochastic volatility models. Ann. Finance 8(2–3), 337–378 (2012)
  14. Doukhan, P., Oppenheim, G., Taqqu, M.S.: Theory and Applications of Long-Range Dependence. Springer, Berlin (2003)
  15. Duncan, T., Hu, Y., Pasik-Duncan, B.: Stochastic calculus for fractional Brownian motion I. Theory. SIAM J. Control Optim. 38(2), 582–612 (2000)
  16. Elliott, R., Chan, L.: Perpetual American options with fractional Brownian motion. Quant. Finance 4(2), 123–128 (2004)
  17. Elliott, R., Van Der Hoek, J.: A general fractional white noise theory and applications to finance. Math. Finance 13(2), 301–330 (2003)
  18. Fleming, J., Kirby, C.: Long memory in volatility and trading volume. J. Bank. Finance 35(7), 1714–1726 (2011)
  19. Granger, C.W., Hyung, N.: Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns. J. Empir. Finance 11(3), 399–421 (2004)
  20. Hosking, J.R.: Fractional differencing. Biometrika 68(1), 165–176 (1981)
  21. Hu, Y., Nualart, D., Xiao, W., Zhang, W.: Exact maximum-likelihood estimator for drift fractional Brownian motion at discrete observation. Acta Math. Sci. 31(5), 1851–1859 (2011)
  22. Hu, Y., Øksendal, B.: Fractional white noise calculus and applications to finance. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6(1), 1–32 (2003)
  23. Hurst, H.E.: Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–808 (1951)
  24. Kubilius, K., Mishura, Y., Ralchenko, K.: Parameter Estimation in Fractional Diffusion Models, vol. 8. Springer, Berlin (2017)
  25. Kukush, A., Mishura, Y., Valkeila, E.: Statistical inference with fractional Brownian motion. Stat. Inference Stoch. Process. 8(1), 71–93 (2005)
  26. Lai, D., Davis, B.R., Hardy, R.J.: Fractional Brownian motion and clinical trials. J. Appl. Stat. 27(1), 103–108 (2000)
  27. Liu, Z., Song, N.: Minimum distance estimation for fractional Ornstein-Uhlenbeck type process. Adv. Differ. Equ. 2014(1), 1 (2014)
  28. Lo, A.: Long-term memory in stock market prices. Econometrica 59(5), 1279–1313 (1991)
  29. Mandelbrot, B.B., Van Ness, J.W.: Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10(4), 422–437 (1968)
  30. Mishura, Y.: Stochastic Calculus for Fractional Brownian Motion and Related Processes. Springer, Berlin (2008)
  31. Misiran, M., Lu, Z., Teo, K.: Fractional Black-Scholes models: complete MLE with application to fractional option pricing. In: International Conference on Optimization and Control 2010, pp. 573–586. Guiyang, China (2010)
  32. Nguyen, D.B.B., Prokopczuk, M., Sibbertsen, P.: The memory of stock return volatility: asset pricing implications. J. Financ. Mark. 47, 100487 (2020)
  33. Paxson, V.: Fast, approximate synthesis of fractional Gaussian noise for generating self-similar network traffic. ACM SIGCOMM Comput. Commun. Rev. 27(5), 5–18 (1997)
  34. Phillips, P.C., Yu, J.: Jackknifing bond option prices. Rev. Financ. Stud. 18(2), 707–742 (2005)
  35. Rossi, E., Fantazzini, D.: Long memory and periodicity in intraday volatility. J. Financ. Econom. 13(4), 922–961 (2015)
  36. Rostek, S.: Option Pricing in Fractional Brownian Markets. Springer, Berlin (2009)
  37. Sun, L., Wang, L., Fu, P.: Maximum likelihood estimators of a long-memory process from discrete observations. Adv. Differ. Equ. 2018(1), 1 (2018)
  38. Tanaka, K., Xiao, W., Yu, J.: Maximum likelihood estimation for the fractional Vasicek model. Econom. 8(3), 32 (2020)
  39. Wang, X., Xiao, W., Yu, J.: Modeling and forecasting realized volatility with the fractional Ornstein-Uhlenbeck process. J. Econom. (2022, in press). https://doi.org/10.1016/j.jeconom.2021.08.001
  40. Willinger, W., Taqqu, M.S., Leland, W.E., Wilson, D.V.: Self-similarity in high-speed packet traffic: analysis and modeling of ethernet traffic measurements. Stat. Sci. 10(1), 67–85 (1995)
  41. Xiao, W., Zhang, W., Xu, W.: Parameter estimation for fractional Ornstein-Uhlenbeck processes at discrete observation. Appl. Math. Model. 35(9), 4196–4207 (2011)
  42. Xiao, W., Zhang, W., Zhang, X.: Parameter identification for drift fractional Brownian motions with application to the Chinese stock markets. Commun. Stat., Simul. Comput. 44(8), 2117–2136 (2015)
  43. Xiao, W., Zhang, W., Zhang, X.: Parameter identification for the discretely observed geometric fractional Brownian motion. J. Stat. Comput. Simul. 85(2), 269–283 (2015)
  44. Xiao, W., Yu, J.: Asymptotic theory for estimating drift parameters in the fractional Vasicek model. Econom. Theory 35(1), 198–231 (2019)
  45. Xiao, W., Yu, J.: Asymptotic theory for rough fractional Vasicek models. Econ. Lett. 177, 26–29 (2019)
  46. Xiao, W., Zhang, X., Zuo, Y.: Least squares estimation for the drift parameters in the sub-fractional Vasicek processes. J. Stat. Plan. Inference 197, 141–155 (2018)

Acknowledgements

This work is partly supported by the Humanities and Social Sciences Research and Planning Fund of the Ministry of Education of China (No. 20YJA630053), the Natural Science Foundation of China (No. 11801590 and No. 61673019), the Tianyuan Foundation of the National Natural Science Foundation of China (No. 12126313), the Fundamental Research Funds for the Central Universities (No. 19lgpy243) and Research Foundation for Young Teachers of Guangdong University of Technology.

Author information

Authors and Affiliations

Authors

Contributions

The main idea of this paper was proposed by LS and JC. XL prepared the manuscript initially and performed all the steps of the proofs in this research. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xianggang Lu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Appendix

A.1 Proof of Theorem 2.3

Proof

Using the ergodic theorem, we can easily obtain the desired result by a straightforward argument. □

A.2 Proof of Theorem 2.4

Proof

We first prove the strong convergence of μ̂. For the sake of convenience, we define

$$\begin{aligned}& \widetilde{\mu} = \frac{b\sum_{i=1}^{N}(ih)^{2H} Y_{ih} - c\sum_{i=1}^{N} ih Y_{ih}}{b^{2}-ac} , \end{aligned}$$
(A.1)
$$\begin{aligned}& \widetilde{\sigma}^{2} = \frac{2 (a\sum_{i=1}^{N} (ih)^{2H} Y_{ih}- b\sum_{i=1}^{N} ih Y_{ih} )}{b^{2}-ac} , \end{aligned}$$
(A.2)

where \(a=\sum_{i=1}^{N} (ih)^{2} \), \(b=\sum_{i=1}^{N} (ih)^{2H+1}\), \(c=\sum_{i=1}^{N}(ih)^{4H}\). For technical reasons, we now deal with the strong convergence of μ̃. Substituting \(Y_{ih}\) by \(Y_{ih}=\mu ih +\sigma B_{ih}^{H}-\frac{1}{2}\sigma ^{2} (ih)^{2H}\) into (A.1), we have

$$\begin{aligned} \widetilde{\mu} =& \frac{b\sum_{i=1}^{N} ( \mu ih +\sigma B_{ih}^{H}- \frac{1}{2}\sigma ^{2} (ih)^{2H} )(ih)^{2H} - c\sum_{i=1}^{N} ( \mu ih +\sigma B_{ih}^{H}- \frac{1}{2}\sigma ^{2} (ih)^{2H} )ih}{b^{2}-ac} \\ =&\mu + \frac{b\sigma \sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H}-c\sigma \sum_{i=1}^{N} ih B_{ih}^{H}}{b^{2}-ac} . \end{aligned}$$
(A.3)

Thus, \(\mathbb{E} [\widetilde{\mu}]=\mu \) and hence μ̃ is unbiased. On the other hand, we have

$$\begin{aligned} \operatorname{Var} [\widetilde{\mu}- \mu ] =& \mathbb{E} [ \widetilde{\mu} - \mu ]^{2} \\ =& \mathbb{E} \biggl[ \frac{b\sigma \sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} - c\sigma \sum_{i=1}^{N} ihB_{ih}^{H}}{b^{2}-ac} \biggr]^{2} \\ =&\frac{\sigma ^{2}}{(b^{2}-ac)^{2}}\mathbb{E} \Biggl[b\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} - c\sum _{i=1}^{N} ih B_{ih}^{H} \Biggr]^{2} \\ =&\frac{\sigma ^{2}}{(b^{2}-ac)^{2}}\mathbb{E} \Biggl[b^{2} \Biggl( \sum _{i=1}^{N} (ih)^{2H}B_{ih}^{H} \Biggr)^{2} + c^{2} \Biggl(\sum_{i=1}^{N} ihB_{ih}^{H} \Biggr)^{2} \\ &{} -2bc\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} \sum_{i=1}^{N} ih B_{ih}^{H} \Biggr] \\ \leq &\frac{\sigma ^{2}}{(b^{2}-ac)^{2}} \Biggl[b^{2}\mathbb{E} \Biggl( \sum _{i=1}^{N} (ih)^{2H}B_{ih}^{H} \Biggr)^{2} + c^{2} \mathbb{E} \Biggl(\sum _{i=1}^{N} ih B_{ih}^{H} \Biggr)^{2} \\ &{}+2bc\mathbb{E} \Biggl(\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H}\sum _{i=1}^{N} ihB_{ih}^{H} \Biggr) \Biggr] \\ \leq &\frac{\sigma ^{2}}{(b^{2}-ac)^{2}} \bigl(b^{2} I_{1} + c^{2} I_{2} +2bc \sqrt{I_{1} I_{2}} \bigr) , \end{aligned}$$
(A.4)

where \(I_{1} = \mathbb{E} (\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} )^{2}\), \(I_{2} = \mathbb{E} (\sum_{i=1}^{N} ih B_{ih}^{H} )^{2}\), and the last inequality follows from the Cauchy–Schwarz inequality.

A standard calculation yields

$$\begin{aligned} I_{1} =& \mathbb{E} \Biggl[\sum _{i=1}^{N} (ih)^{2H}B_{ih}^{H} \Biggr]^{2} \\ =& \mathbb{E} \bigl[h^{2H}B_{h}^{H} + (2h)^{2H}B_{2h}^{H} +(3h)^{2H}B_{3h}^{H}+ \cdots + (Nh)^{2H}B_{Nh}^{H} \bigr]^{2} \\ =&\mathbb{E} \Biggl[\sum_{i=1}^{N}(ih)^{4H} \bigl(B_{ih}^{H} \bigr)^{2} +2\sum _{l> k}(kh)^{2H}(lh)^{2H}B_{kh}^{H} B_{lh}^{H} \Biggr] \\ =&\mathbb{E} \Biggl[\sum_{i=1}^{N} (ih)^{4H} \bigl(B_{ih}^{H} \bigr)^{2}+2 \sum_{l \geq k}(kh)^{2H}(lh)^{2H}B_{kh}^{H} B_{lh}^{H} - 2\sum_{i=1}^{N} (ih)^{4H} \bigl(B_{ih}^{H} \bigr)^{2} \Biggr] \\ =&2\sum_{l\geq k}(kh)^{2H}(lh)^{2H} \mathbb{E} \bigl(B_{kh}^{H} B_{lh}^{H} \bigr)-\sum_{i=1}^{N} (ih)^{6H} \\ \leq &C\sum_{l\geq k} (kh)^{2H}(lh)^{2H}kh (lh)^{2H-1}-\sum_{i=1}^{N} (ih)^{6H} \\ = & C\sum_{l\geq k}(kh)^{2H+1}(lh)^{4H-1} -\sum_{i=1}^{N} (ih)^{6H} \\ = & C\sum_{l=1}^{N}(lh)^{4H-1}\sum _{k=1}^{l} (kh)^{2H+1}-\sum _{i=1}^{N} (ih)^{6H} \\ \sim & C h^{6H+2} N^{6H+2} , \end{aligned}$$
(A.5)

where C is a generic constant, which may change from line to line.
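The growth rate claimed in (A.5) can be checked numerically without simulation, since \(I_{1}\) is a quadratic form \(w^{\top}Kw\) in the fBm covariance matrix K with weights \(w_{i}=(ih)^{2H}\): for fixed h it grows like \(N^{6H+2}\), so doubling N should multiply \(I_{1}\) by roughly \(2^{6H+2}\). A small sketch with illustrative h and H:

```python
import numpy as np

def I1(N, h, H):
    """I1 = E(sum_i (ih)^{2H} B^H_{ih})^2 computed exactly as w' K w,
    where K is the fBm covariance matrix and w_i = (ih)^{2H}."""
    t = h * np.arange(1, N + 1)
    K = 0.5 * (t[:, None]**(2 * H) + t[None, :]**(2 * H)
               - np.abs(t[:, None] - t[None, :])**(2 * H))
    w = t**(2 * H)
    return float(w @ K @ w)

h, H = 1 / 250, 0.7
ratio = I1(400, h, H) / I1(200, h, H)
print(ratio, 2.0**(6 * H + 2))  # the two numbers should be close
```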

With almost no extra effort, we can obtain

$$\begin{aligned} I_{2} =& \mathbb{E} \Biggl[\sum _{i=1}^{N} ih B_{ih}^{H} \Biggr]^{2} \\ =& \mathbb{E} \bigl[h^{2}B_{h}^{H} + (2h)^{2}B_{2h}^{H} +(3h)^{2}B_{3h}^{H}+ \cdots + (Nh)^{2}B_{Nh}^{H} \bigr]^{2} \\ =&\mathbb{E} \Biggl[ \sum_{i=1}^{N} (ih)^{2} \bigl(B_{ih}^{H} \bigr)^{2}+2 \sum_{l> k}kh\,lh B_{kh}^{H} B_{lh}^{H} \Biggr] \\ =&\sum_{i=1}^{N}(ih)^{2+2H} +2\sum _{l>k}kh\,lh \mathbb{E} \bigl[B_{kh}^{H} B_{lh}^{H} \bigr] \\ =&2\sum_{l\geq k}kh\,lh\mathbb{E} \bigl[B_{kh}^{H} B_{lh}^{H} \bigr]- \sum_{i=1}^{N}(ih)^{2+2H} \\ \leq &C\sum_{l\geq k}(kh)^{2}(lh)^{2H}- \sum_{i=1}^{N}(ih)^{2+2H} \\ \sim &C h^{2H+4} N^{2H+4} , \end{aligned}$$
(A.6)

where C is a constant.

Moreover, we can easily obtain that

$$\begin{aligned}& \begin{aligned} &a=\sum_{i=1}^{N} (ih)^{2} \sim (Nh)^{3},\qquad b=\sum_{i=1}^{N} (ih)^{2H+1} \sim (Nh)^{2H+2}, \\ &c=\sum_{i=1}^{N} (ih)^{4H} \sim (Nh)^{4H+1}. \end{aligned} \end{aligned}$$
(A.7)
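The orders in (A.7) are easy to verify numerically: for fixed h, the approximation \(\sum_{i=1}^{N} i^{p} \approx N^{p+1}/(p+1)\) gives \(a/(Nh)^{3}\to 1/(3h)\), \(b/(Nh)^{2H+2}\to 1/((2H+2)h)\), and \(c/(Nh)^{4H+1}\to 1/((4H+1)h)\), i.e., the stated powers of Nh up to constants. A quick check with illustrative h and H:

```python
import numpy as np

h, H = 1 / 250, 0.7
N = 100_000
t = h * np.arange(1, N + 1)
a = np.sum(t**2)            # ~ (Nh)^3 / (3h)
b = np.sum(t**(2 * H + 1))  # ~ (Nh)^{2H+2} / ((2H+2) h)
c = np.sum(t**(4 * H))      # ~ (Nh)^{4H+1} / ((4H+1) h)
print(a / (N * h)**3 * 3 * h,
      b / (N * h)**(2 * H + 2) * (2 * H + 2) * h,
      c / (N * h)**(4 * H + 1) * (4 * H + 1) * h)  # each normalized ratio is close to 1
```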

Inserting these convergency results of (A.5), (A.6), and (A.7) together into (A.4), as N goes to infinity and for \(H\in (0,1)\), we obtain

$$\begin{aligned} \operatorname{Var} [\widetilde{\mu} ] \sim & \frac{C}{N^{2-2H}} , \end{aligned}$$
(A.8)

which converges to zero for fixed h.

Next, we will use the Borel–Cantelli lemma to prove the strong convergence of μ̃. To this end, we will show that

$$ \sum_{N\geq 1}\mathbb{P} \biggl( \vert \widetilde{\mu}-\mu \vert > \frac{1}{N^{\epsilon}} \biggr) < \infty $$
(A.9)

for some \(\epsilon >0\).

Take \(0<\epsilon <1-H\). Then, from Chebyshev’s inequality, the property of the central absolute moments of Gaussian random variables and (A.8), we have

$$\begin{aligned} \mathbb{P} \biggl( \vert \widetilde{\mu}-\mu \vert >\frac{1}{N^{\epsilon}} \biggr) \leq & N^{q\epsilon} \mathbb{E} \bigl[ \vert \widetilde{\mu}-\mu \vert ^{q} \bigr] \\ =& C N^{q\epsilon} \bigl(\mathbb{E} \bigl[ \vert \widetilde{\mu}- \mu \vert ^{2} \bigr] \bigr)^{q/2} \\ \sim & C N^{q\epsilon + (H-1 )q} . \end{aligned}$$

For sufficiently large q, we have \(q\epsilon +(H-1)q<-1\). Thus, (A.9) is proved, which implies

$$ \widetilde{\mu} \overset{\mathrm{a.s.}}{\rightarrow} \mu , $$
(A.10)

as \(N\rightarrow \infty \).

Consequently, using the continuous mapping theorem, the strong consistency of Ĥ and (A.10), we can obtain

$$ \hat{\mu}-\mu = (\hat{\mu}-\widetilde{\mu} ) + ( \widetilde{\mu}-\mu ) \overset{\mathrm{a.s.}}{\rightarrow} 0 , $$

which implies (2.11).

Next, we are interested in the strong consistency of \(\hat{\sigma}^{2}\). Substituting \(Y_{ih}\) by \(Y_{ih}=\mu ih +\sigma B_{ih}^{H}-\frac{1}{2}\sigma ^{2} (ih)^{2H}\) in (A.2), we have

$$\begin{aligned} \widetilde{\sigma}^{2} =& \frac{2 (a\sum_{i=1}^{N} ( \mu ih +\sigma B_{ih}^{H}- \frac{1}{2}\sigma ^{2} (ih)^{2H} )(ih)^{2H}-b\sum_{i=1}^{N} ( \mu ih +\sigma B_{ih}^{H}- \frac{1}{2}\sigma ^{2} (ih)^{2H} ) ih )}{b^{2}-ac} \\ =&\sigma ^{2} + \frac{2a\sigma \sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} - 2b\sigma \sum_{i=1}^{N} ihB_{ih}^{H}}{b^{2}-ac}. \end{aligned}$$
(A.11)

Thus, \(\mathbb{E} [\widetilde{\sigma}^{2} ]=\sigma ^{2}\) and hence \(\widetilde{\sigma}^{2}\) is unbiased. On the other hand, we can easily obtain

$$\begin{aligned} \operatorname{Var} \bigl[\widetilde{\sigma}^{2} \bigr] =& \mathbb{E} \bigl(\widetilde{\sigma}^{2} - \sigma ^{2} \bigr)^{2} \\ =&\frac{4\sigma ^{2}}{ (b^{2}-ac )^{2}}\mathbb{E} \Biggl[a\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} - b \sum _{i=1}^{N} ihB_{ih}^{H} \Biggr]^{2} \\ =& \frac{4\sigma ^{2}}{(b^{2}-ac)^{2}}\mathbb{E} \Biggl[a^{2} \Biggl(\sum _{i=1}^{N} (ih)^{2H}B_{ih}^{H} \Biggr)^{2} + b^{2} \Biggl(\sum_{i=1}^{N} ihB_{ih}^{H} \Biggr)^{2} \\ &{} - 2ab \sum _{i=1}^{N}(ih)^{2H}B_{ih}^{H} \sum_{i=1}^{N} ihB_{ih}^{H} \Biggr] \\ \leq &\frac{4\sigma ^{2}}{(b^{2}-ac)^{2}} \bigl(a^{2}I_{1} + b^{2}I_{2} + 2ab\sqrt{I_{1}I_{2}} \bigr) , \end{aligned}$$

where \(I_{1} = \mathbb{E} (\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} )^{2}\) and \(I_{2} = \mathbb{E} (\sum_{i=1}^{N} ihB_{ih}^{H} )^{2}\).

Using the same argument as (A.8), we obtain

$$\begin{aligned} \operatorname{Var} \bigl[\widetilde{\sigma}^{2} \bigr] \sim & \frac{C}{N^{2H}} , \end{aligned}$$
(A.12)

which converges to zero as N goes to infinity.

Using a similar argument to that leading to (A.10), we obtain

$$ \widetilde{\sigma}^{2} \overset{\mathrm{a.s.}}{\rightarrow} \sigma ^{2} , $$
(A.13)

as \(N\rightarrow \infty \).

Consequently, using the continuous mapping theorem, the strong consistency of Ĥ and (A.13), we can obtain

$$ \hat{\sigma}^{2}-\sigma ^{2}= \bigl(\hat{ \sigma}^{2}- \widetilde{\sigma}^{2} \bigr) + \bigl(\widetilde{ \sigma}^{2}-\sigma ^{2} \bigr) \overset{\mathrm{a.s.}}{\rightarrow} 0 , $$

which implies (2.12). □

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Sun, L., Chen, J. & Lu, X. Parameter estimation for discretized geometric fractional Brownian motions with applications in Chinese financial markets. Adv Cont Discr Mod 2022, 69 (2022). https://doi.org/10.1186/s13662-022-03743-3
