Parameter estimation for discretized geometric fractional Brownian motions with applications in Chinese financial markets
Advances in Continuous and Discrete Models volume 2022, Article number: 69 (2022)
Abstract
It is widely accepted that financial data exhibit a long-memory property, or long-range dependence. In continuous time, the geometric fractional Brownian motion is an important model for characterizing the long-memory property in finance. This paper therefore considers the problem of estimating all unknown parameters in geometric fractional Brownian processes based on discrete observations. The estimation procedure combines the bipower variation with least-squares estimation. Unlike the commonly used methods based on approximations of the likelihood and the transition density, it does not require a small sampling interval. The strong consistency of the proposed estimators is established as the sample size increases to infinity for a fixed sampling interval. A simulation study is also conducted to assess the performance of the derived method by comparing it with two existing approaches proposed by Misiran et al. (International Conference on Optimization and Control 2010, pp. 573–586, 2010) and Xiao et al. (J. Stat. Comput. Simul. 85(2):269–283, 2015), respectively. Finally, we apply the proposed estimation approach to the analysis of Chinese financial markets to show its potential applications in realistic contexts.
1 Introduction
Long-memory phenomena have been observed in numerous scientific fields such as hydrology, geophysics, economics, finance, climatology, physics, biology, medicine, music, and telecommunications engineering, among others (see Hurst [23], Hosking [20], Lo [28], Willinger et al. [40], Baillie [1], Lai et al. [26], Hu and Øksendal [22], Granger and Hyung [19], Fleming and Kirby [18], Chronopoulou and Viens [11], Rossi and Fantazzini [35], Nguyen et al. [32] and the references therein). A time series with long-memory behavior has a slow, hyperbolically declining autocorrelation function or, equivalently, an infinite spectrum at zero frequency. The best-known and most widely used stochastic model that exhibits a long-memory property in continuous time is the fractional Brownian motion (hereafter fBm), which is a suitable generalization of the standard Brownian motion. Consequently, the fBm and stochastic processes driven by it have found many applications in diverse fields including economics, finance, geophysics, biology, oceanography, meteorology, telecommunication engineering, physics, chemistry, medicine, and environmental studies (see, e.g., Doukhan et al. [14], Çağlar [8], Chronopoulou and Viens [10], Chronopoulou and Viens [11], Comte et al. [13] and the references therein). In particular, the well-known geometric fractional Brownian motion (hereafter gfBm) has been extensively used for capturing the fluctuations of stock prices (see Duncan et al. [15], Elliott and Chan [16], Elliott and Van Der Hoek [17], Hu and Øksendal [22], Mishura [30], Rostek [36]).
The development of gfBm naturally leads to studies in statistical inference, which have attracted great practical and theoretical interest. Proper estimation of the unknown parameters in stochastic models is of the utmost importance, since the estimates significantly affect risk management, derivatives pricing, and portfolio optimization. In particular, applications in finance often require both speed and accuracy in parameter estimation for small samples in order to facilitate dynamic decision making and risk management (see, e.g., Phillips and Yu [34]). As a consequence, parameter estimation for gfBm, a challenging theoretical problem, has been of great interest in the past decade. For example, the problem of parameter estimation in a simple linear model driven by an fBm was investigated in Bertin et al. [5], Bertin et al. [6], Hu et al. [21], Xiao et al. [41], Brouste and Iacus [7], Liu and Song [27], Xiao et al. [42], Cheng et al. [9], Xiao et al. [46], Sun et al. [37]. In groundbreaking works, Xiao et al. [44], Xiao et al. [45], Tanaka et al. [38], and Wang et al. [39] established the asymptotic theory for the estimators of fractional Ornstein–Uhlenbeck processes. For statistical inference in gfBm at discrete intervals, to the best of our knowledge, Kukush et al. [25] first developed an incomplete maximum-likelihood estimation for the drift parameter, which is separated from the estimation of the long-memory parameter and improves on estimation methods designed only for the Hurst parameter, such as the R/S analysis and variation analysis. Moreover, Misiran et al. [31] designed a general discrete-data complete maximum-likelihood-type procedure for estimating all the unknown parameters in gfBm, including the drift parameter, the diffusion coefficient, and the Hurst index, but without a proof of the asymptotic behavior. In the groundbreaking work of Xiao et al. [43], all the unknown parameters of gfBm were estimated from discrete observations based on the quadratic variation and the maximum-likelihood approach, and the asymptotic properties of the estimators were also provided. In addition, we refer to the recent monograph of Kubilius et al. [24] for a complete exposition of the different approaches used in statistical inference for fractional diffusions. However, both the complete maximum-likelihood estimators proposed in Misiran et al. [31] and the incomplete maximum-likelihood estimators introduced in Xiao et al. [43] are very time consuming. This is because the complete maximum-likelihood estimation involves numerically optimizing the profile-likelihood function, and the implementation of the incomplete maximum-likelihood estimation needs to compute the inverse and determinant of the autocovariance matrix of the fBm, which requires excessive computational time and computer memory. For example, a simple Monte Carlo sample takes a little less than 20 minutes on a desktop Intel i7 computer, while a more realistic empirical study takes much longer to run. Hence, neither method is well suited to large data sets. Consequently, finding less time-consuming estimators for discretely observed gfBm is a challenging problem of great practical interest.
Therefore, this paper focuses on less time-consuming estimators for gfBm based on discrete-time data, as demanded by practical applications. Based on discrete-time observations, we study the problem of estimating all the unknown parameters of gfBm, namely the drift, the volatility (diffusion coefficient), and the Hurst parameter, in the setting of \(0< H<1\). The main contribution of this paper is to construct the estimators and to derive the asymptotic theory for these proposed estimators. While our framework is applicable to a wide range of Gaussian processes (e.g., sub-fractional Brownian motion, bi-fractional Brownian motion, weighted-fractional Brownian motion), we focus here on the case of fBm, which is widely used in physics, electrical engineering, biophysics, and finance.
The remainder of this paper proceeds as follows. Section 2 introduces the model and addresses the estimators of gfBm, which is observed at discrete points of time. This section proposes the least-squares-type estimators for both the drift and diffusion coefficients and presents the bipower variation estimator for the Hurst parameter. The asymptotic properties of these proposed estimators are also discussed in Sect. 2. Section 3 presents two existing estimation procedures for gfBm. In Sect. 4, we give some simulation examples to show the finite-sample performance of these estimators. The computational tests show favorable results for our proposed estimator even with relatively small sample sizes. The simulation results also demonstrate that our method is computationally simple and asymptotically unbiased. To show how to apply our approach in realistic contexts, Sect. 5 is devoted to presenting our empirical results for four major financial indices in China: the Shanghai Composite index (SHCI), the Shenzhen Component index (SZCI), the CSI Smallcap 500 index (CSI 500), and the CSI 300 index (CSI 300). Concluding remarks are given in the final section. All the proofs are collected in the Appendix.
2 Parameter estimation
Since the pioneering work of Mandelbrot and Van Ness [29], fBm has been extensively used to capture long-range dependence, self-similarity, non-Markovianity, and sub-diffusivity or super-diffusivity. A crucial problem with the applications of these stochastic models driven by fBm in practice is how to estimate the unknown values of the parameters involved in these models. In this paper, we consider the problem of estimating all unknown parameters in gfBm, which is widely used for option pricing and is able to capture the memory dependency. There is a key challenge in estimating parameters in gfBm: the discretely observed data are not Markovian. This means that state-space models and Kalman-filter estimators cannot be applied to estimate the parameters in gfBm. In this paper, the bipower variation and the least-squares estimation are thus employed to estimate the unknown parameters in gfBm. In what follows, we first introduce some notation, and then present the estimators for gfBm.
2.1 Model simplification
To capture the long-range dependence in financial asset returns, Duncan et al. [15], Elliott and Chan [16], Hu and Øksendal [22] used gfBm to capture the dynamics of stock prices. Thus, the stock price, \(S_{t}\), can be written as
\[dS_{t}=\mu S_{t}\,dt+\sigma S_{t}\,dB_{t}^{H}, \tag{2.1}\]
where μ is the rate of return, σ is the volatility, and (\(B_{t}^{H}\), \(t\ge 0\)) is a fBm with Hurst parameter \(H\in (\frac{1}{2}, 1)\).
Using the Wick integration, Hu and Øksendal [22] stated that the solution of (2.1) can be written as (see Eq. (5.2.6) of Mishura [30]):
\[S_{t}=S_{0}\exp (Y_{t} ), \tag{2.2}\]
where \(Y_{t}= \mu t -\frac{1}{2}\sigma ^{2} t^{2H}+ \sigma B^{H}_{t} \).
Comparing (2.1) with (2.2), no information is lost by transforming the observation \(S_{t}\) into \(Y_{t}\). Hence, estimating the parameters from (2.1) is equivalent to estimating the unknown parameters from \(Y_{t}\). Now, we assume that the process \(Y_{t}\) is observed at discrete-time instants \((t_{1}, t_{2}, \ldots , t_{N})\), so that the observation vector is \({\mathbf{Y}}=(Y_{t_{1}},Y_{t_{2}},\ldots ,Y_{t_{N}})'\), where the prime (′) denotes vector transposition and all non-primed vectors are row vectors. In particular, to simplify notation we assume \(t_{k}=kh\), \(k=1, 2, \ldots , N\), for a fixed step size \(h>0\). Consequently, for \(H\in (0,1)\), the discrete-time observations can be expressed in vector form as
\[{\mathbf{Y}}=\mu {\mathbf{t}}-\frac{1}{2}\sigma ^{2}{\mathbf{t}}^{2H}+\sigma {\mathbf{B}}_{t}^{H}, \tag{2.3}\]
where \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})'\), \({\mathbf{t}}=(h, 2h, \ldots , Nh)'\), \({\mathbf{t}}^{2H}= ( h^{2H}, (2h)^{2H}, \ldots , (Nh)^{2H} )'\), and \({\mathbf{B}}_{t}^{H}=(B_{h}^{H}, \ldots , B_{Nh}^{H})' \). Our aim is to estimate the unknown parameters H, σ, and μ from observations \(Y_{t_{i}}=Y_{ih}\), \(0\leq i\leq N\) for a fixed interval h and to study their asymptotic properties as \(N\rightarrow \infty \). Throughout, we use the notation C for a generic constant, which may change from line to line.
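As a minimal sketch of the discretization just described (not part of the original study, which was coded in Matlab), the following Python functions map a given fBm path sampled at \(h, 2h, \ldots , Nh\) to the log-price observations \(Y_{kh}\) and back to prices. The function names and the initial price `s0` are illustrative assumptions.

```python
import math


def log_price_path(fbm_values, mu, sigma, hurst, h):
    """Y_{kh} = mu*kh - 0.5*sigma^2*(kh)^(2H) + sigma*B_{kh} for k = 1..N.

    fbm_values holds B_h, B_2h, ..., B_Nh (a hypothetical input vector).
    """
    path = []
    for k, b_kh in enumerate(fbm_values, start=1):
        t = k * h
        path.append(mu * t - 0.5 * sigma ** 2 * t ** (2.0 * hurst) + sigma * b_kh)
    return path


def price_path(s0, log_path):
    """S_{kh} = S_0 * exp(Y_{kh}); s0 is an assumed initial price."""
    return [s0 * math.exp(y) for y in log_path]


def log_path_from_prices(s0, prices):
    """Inverse map Y_{kh} = log(S_{kh} / S_0), used when only prices are observed."""
    return [math.log(s / s0) for s in prices]
```

Because the map between prices and log-prices is one-to-one, estimation can be carried out entirely on the values returned by `log_path_from_prices`.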
2.2 Estimation procedures
Given the discrete observations described in Sect. 2.1, we now proceed to estimate the unknown parameters of gfBm.
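First, we consider the problem of estimating the Hurst parameter of gfBm. From (2.3), we have
\[Y_{ih}=\mu ih-\frac{1}{2}\sigma ^{2}(ih)^{2H}+\sigma B_{ih}^{H},\quad i=1,\ldots ,N.\]
Now, let
\[Z_{ih}=Y_{ih}-Y_{(i-1)h}=\mu h-\frac{1}{2}\sigma ^{2}V_{ih}+\sigma U_{ih},\]
where \(U_{ih}=B_{ih}^{H}-B_{(i-1)h}^{H}\) and \(V_{ih}= (ih )^{2H}- ( (i-1 )h )^{2H}\).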
Not surprisingly, \(U_{ih}\) (\(i=1,\ldots ,N\)) are normally distributed random variables. Using the known expression for the expected absolute product of a bivariate normal vector, we obtain
where \(\rho _{1}= (2^{2H-1}-1 )h^{2H}\).
Similarly, a standard calculation yields
where \(\rho _{2}=\frac{h^{2H}}{2} (3^{2H}+1-2^{2H+1} )\).
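The constants \(\rho _{1}\) and \(\rho _{2}\) coincide with the lag-1 and lag-2 autocovariances of fBm increments over steps of length h. The sketch below, written in Python for illustration, computes these autocovariances from the general formula and evaluates the standard identity \(E|X||Y|=\frac{2}{\pi}\sigma _{X}\sigma _{Y} (\sqrt{1-r^{2}}+r\arcsin r )\) for zero-mean jointly Gaussian variables with correlation r. The function names are assumptions, and no claim is made that the displayed formulas of the paper are written in exactly this form.

```python
import math


def fgn_autocov(lag, hurst, h):
    """Covariance of two fBm increments of length h that are `lag` steps apart."""
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * h ** two_h * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def expected_abs_product(var_x, var_y, cov_xy):
    """E|X||Y| for zero-mean jointly Gaussian X and Y (standard identity)."""
    sd = math.sqrt(var_x * var_y)
    r = max(-1.0, min(1.0, cov_xy / sd))  # clip rounding noise in the correlation
    return (2.0 / math.pi) * sd * (math.sqrt(1.0 - r * r) + r * math.asin(r))


def expected_lagged_bipower(lag, hurst, h):
    """E|U_{ih}||U_{(i+lag)h}| for fBm increments U, by stationarity of increments."""
    var = fgn_autocov(0, hurst, h)
    return expected_abs_product(var, var, fgn_autocov(lag, hurst, h))
```

For \(H=1/2\) the increments are independent, so `fgn_autocov(1, 0.5, h)` is zero and `expected_lagged_bipower` reduces to \(2h/\pi \).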
Then, using the bipower variation (see Barndorff-Nielsen and Shephard [4]), we define the ratio function as follows:
where “∼” means that the ratio of the left- and right-hand sides converges to one as N tends to infinity.
By combining (2.4) and (2.5) with (2.6), we obtain the following nonlinear function
Finally, we obtain the estimate Ĥ of the Hurst index H by solving the nonlinear equation (2.7).
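Since a nonlinear equation in H generally has to be solved numerically, a simple bracketing method is one option. The Python sketch below implements plain bisection. Equation (2.7) itself is not reproduced in this text, so the example uses a stand-in equation that is explicitly an assumption: it inverts the lag-1 correlation of fBm increments, \(r_{1}(H)=2^{2H-1}-1\), which is strictly increasing in H. In practice one would pass the difference between the two sides of (2.7) as `func`, after checking that it changes sign on the chosen bracket.

```python
def bisect_root(func, lo, hi, tol=1e-10, max_iter=200):
    """Find x in [lo, hi] with func(x) = 0, assuming a sign change on the bracket."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in the left half
        else:
            lo, f_lo = mid, f_mid  # root lies in the right half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def lag1_correlation(hurst):
    """Stand-in moment function (an assumption, not Eq. (2.7)): corr of adjacent fBm increments."""
    return 2.0 ** (2.0 * hurst - 1.0) - 1.0


def hurst_from_lag1_correlation(r1):
    """Solve lag1_correlation(H) = r1 for H in (0, 1) by bisection."""
    return bisect_root(lambda x: lag1_correlation(x) - r1, 1e-6, 1.0 - 1e-6)
```

Bisection is chosen here only for its robustness; any scalar root finder with a valid bracket would serve the same purpose.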
Remark 2.1
We can construct other estimators for the Hurst parameter using the realized variation ratio method or the change-of-frequency approach. Both methods are based on multipower variations of the higher-order differences of gfBm. We refer to the excellent work of Barndorff-Nielsen et al. [3] or Bardet and Surgailis [2] for details.
In practice, it is impossible to obtain an analytical expression for H from (2.7), so we solve (2.7) numerically. In what follows, we are in a position to estimate the drift μ and the volatility σ. The plug-in least-squares technique is employed because it is fast to compute. Now, from (2.3), we have
The least-squares estimators aim to minimize the following function
This implies that
Let \(a=\sum_{i=1}^{N} (ih)^{2} \), \(b=\sum_{i=1}^{N} (ih)^{2H+1}\), \(c=\sum_{i=1}^{N} (ih)^{4H}\). Then, we have
As a consequence, we can obtain the estimators μ̂ and \(\hat{\sigma}^{2}\) by solving (2.8):
where \(\hat{a}=\sum_{i=1}^{N} (ih)^{2} \), \(\hat{b}=\sum_{i=1}^{N} (ih)^{2 \hat{H}+1}\), \(\hat{c}=\sum_{i=1}^{N}(ih)^{4\hat{H}}\).
The estimators of μ and \(\sigma ^{2}\) proposed in this paper are available in closed form once Ĥ has been computed, so they do not require any further numerical solution. Hence, the estimators proposed in (2.9) and (2.10) are computationally efficient.
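To make the plug-in step concrete, the following Python sketch rederives the closed form directly from the least-squares criterion, under the assumption that the criterion is \(\sum_{i=1}^{N} (Y_{ih}-\mu ih+\frac{1}{2}\sigma ^{2}(ih)^{2H} )^{2}\). Writing \(P=\sum ihY_{ih}\) and \(R=\sum (ih)^{2H}Y_{ih}\), the normal equations are \(P=\mu a-\frac{1}{2}\sigma ^{2}b\) and \(R=\mu b-\frac{1}{2}\sigma ^{2}c\), which give \(\hat{\mu}=(cP-bR)/(ac-b^{2})\) and \(\hat{\sigma}^{2}=2(bP-aR)/(ac-b^{2})\). This is a sketch under the stated assumption rather than a transcription of (2.9) and (2.10); the function name and the singularity threshold are illustrative.

```python
def least_squares_mu_sigma2(y, h, hurst):
    """Plug-in least-squares estimates of (mu, sigma^2) from Y_h, ..., Y_Nh.

    `hurst` is the (estimated) Hurst index plugged into the regressors.
    """
    a = b = c = p = r = 0.0
    for i, y_i in enumerate(y, start=1):
        t = i * h
        s = t ** (2.0 * hurst)
        a += t * t      # sum (ih)^2
        b += t * s      # sum (ih)^(2H+1)
        c += s * s      # sum (ih)^(4H)
        p += t * y_i    # sum ih * Y_ih
        r += s * y_i    # sum (ih)^(2H) * Y_ih
    det = a * c - b * b
    if det <= 1e-12 * a * c:
        # Regressors t and t^(2H) are (nearly) collinear, e.g. when H = 1/2.
        raise ValueError("normal equations are singular or ill-conditioned")
    mu_hat = (c * p - b * r) / det
    sigma2_hat = 2.0 * (b * p - a * r) / det
    return mu_hat, sigma2_hat
```

The guard on `det` reflects that the two regressors coincide when the plugged-in Hurst value equals 1/2, in which case μ and \(\sigma ^{2}\) cannot be separated by this regression.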
Remark 2.2
The parameter-estimation procedure for the gfBm proposed in this paper proceeds as follows:

(i)
Obtain the estimator of the Hurst parameter by solving the nonlinear function (2.7);

(ii)
Compute the estimator of μ by (2.9);

(iii)
Calculate the estimator of \(\sigma ^{2}\) by (2.10).
2.3 Asymptotic properties
In what follows, we turn to study the asymptotic behavior of these estimators defined by (2.7), (2.9), and (2.10), respectively. First, we state the strong consistency of Ĥ.
Theorem 2.3
The estimator Ĥ obtained by solving (2.7) converges to H almost surely as N goes to infinity.
Proof
See the Appendix. □
Now, we consider the strong consistency for both μ̂ and \(\hat{\sigma}^{2}\).
Theorem 2.4
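The estimators μ̂ and \(\hat{\sigma}^{2}\) defined by (2.9) and (2.10), respectively, are strongly consistent, that is, \(\hat{\mu}\rightarrow \mu \) and \(\hat{\sigma}^{2}\rightarrow \sigma ^{2}\) almost surely as \(N\rightarrow \infty \).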
Proof
See the Appendix. □
Remark 2.5
The estimators proposed in this paper can be easily extended for all \(H\in (0,1)\). Using the same arguments as Theorem 2.3 and Theorem 2.4, we can obtain the strong consistency of Ĥ, μ̂ and \(\hat{\sigma}^{2}\) for all \(H\in (0,1)\).
3 Two alternative procedures
In order to examine the performance of the proposed estimators, in this section we introduce two existing estimation procedures for the sake of comparison. The first is the complete maximum-likelihood approach proposed in Misiran et al. [31] and the other is the incomplete maximum-likelihood method provided in Xiao et al. [43].
3.1 The complete maximum-likelihood estimation
As stated in Misiran et al. [31], the Gaussian property of the gfBm makes the process a natural candidate for the use of the maximum-likelihood method to estimate all the unknown parameters simultaneously. Thus, from Misiran et al. [31], we obtain the complete maximum-likelihood estimators for μ and \(\sigma ^{2}\) from the observation \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})\) as
where \({\mathbf{Z}}=(Z_{h},Z_{2h},\dots ,Z_{Nh})'\), \({\mathbf{X}}_{H}= (h^{2H}, (2h)^{2H}-h^{2H}, \dots , (Nh)^{2H}-(Nh-h)^{2H} )'\), \(Z_{ih}=Y_{ih}-Y_{(i-1)h}\), \(\Sigma _{1}=\Sigma _{0}^{-1} ({\mathbf{I}}- \frac{{\mathbf{1}}{\mathbf{1}}'\Sigma _{0}^{-1}}{{\mathbf{1}}'\Sigma _{0}^{-1}{\mathbf{1}}} )\), I is an \(N\times N\) identity matrix and
Observe that the estimators \(\widetilde{\mu}_{M}\) and \(\widetilde{\sigma}^{2}_{M}\) depend on H, which should also be estimated. Misiran et al. [31] stated that the estimator \(\hat{H}_{M}\) of H can be obtained by minimizing the following profile-likelihood function
For solving the optimization problem (3.3) we have to rely on a numerical solution. The function fminsearch, which is a standard part of MATLAB, is a candidate tool.
Remark 3.1
The estimation procedure for the gfBm proposed by Misiran et al. [31] will proceed as follows:

(i)
Minimize (3.3) numerically to obtain the estimator \(\hat{H}_{M}\) of H;

(ii)
Compute \(\widetilde{\sigma}^{2}_{M}\) by replacing H with \(\hat{H}_{M}\) in (3.1);

(iii)
Calculate \(\widetilde{\mu}_{M}\) by replacing H with \(\hat{H}_{M}\) in (3.2);

(iv)
Calculate the estimator \(\hat{\sigma}_{M}^{2}\) of \(\sigma ^{2}\) using the relationship \(\hat{\sigma}_{M}^{2}=h^{\hat{H}}\widetilde{\sigma}^{2}_{M}\);

(v)
Calculate the estimator \(\hat{\mu}_{M}\) of μ using the relationship \(\hat{\mu}_{M}=h^{-1}\widetilde{\mu}_{M}\).
Remark 3.2
It should be noted that the accuracies of \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\) depend crucially on the estimator Ĥ of the Hurst parameter H. As a consequence, replacing H with Ĥ in (3.1) and (3.2) could affect the asymptotic behavior of \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\). Intuitively speaking, the more accurate Ĥ is, the more accurate \(\hat{\mu}_{M}\) and \(\hat{\sigma}_{M}^{2}\) are. Hence, we should use a suitable optimization method to obtain the optimum value of Ĥ from (3.3).
Remark 3.3
Computationally, the algorithm proposed in this paper is very fast. The major advantage of our method is that the computational cost is markedly lower than the approach presented by Misiran et al. [31]. This is because the approach presented by Misiran et al. [31] involves the numerical computation of the covariance matrix and the logarithm of its determinant. However, our method just relies on a simple result obtained via the variation method.
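To illustrate where the cost of the likelihood-based approaches comes from, the Python sketch below builds a covariance matrix of fBm increments and computes the logarithm of its determinant through a Cholesky factorization, which takes on the order of \(N^{3}\) operations for a dense \(N\times N\) matrix. It is assumed here, for illustration only, that the matrix in question is the Toeplitz autocovariance matrix of increments over steps of length h; the exact matrix and likelihood used by Misiran et al. [31] are not reproduced in this text.

```python
import math


def fgn_covariance_matrix(n, hurst, h):
    """n x n Toeplitz covariance matrix of fBm increments with step h."""
    two_h = 2.0 * hurst
    gamma = [
        0.5 * h ** two_h * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)
        for k in range(n)
    ]
    return [[gamma[abs(i - j)] for j in range(n)] for i in range(n)]


def cholesky_lower(matrix):
    """Lower-triangular L with L L' = matrix, for a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                diag = matrix[i][i] - partial
                if diag <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(diag)
            else:
                low[i][j] = (matrix[i][j] - partial) / low[j][j]
    return low


def log_determinant(matrix):
    """log det(matrix) = 2 * sum(log L_ii); avoids overflow of the raw determinant."""
    low = cholesky_lower(matrix)
    return 2.0 * sum(math.log(low[i][i]) for i in range(len(low)))
```

A profile-likelihood search repeats this factorization for every trial value of H, which is consistent with the computational burden described above.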
3.2 The incomplete maximum-likelihood estimation
By contrast, from Xiao et al. [43], we obtain the incomplete maximum-likelihood estimators for μ and \(\sigma ^{2}\) from the observation \({\mathbf{Y}}=(Y_{h},Y_{2h},\ldots ,Y_{Nh})\) as
where
Obviously, the maximum-likelihood estimator \(\hat{\mu}_{X}\) involves the numerical computation of the inverse and the determinant of the covariance matrix, which is computationally demanding. However, the development of computer technologies has made it feasible to compute this estimator.
Remark 3.4
We would like to mention that the parameter-estimation procedure for the gfBm presented in Xiao et al. [43] proceeds as follows:

(i)
Calculate the estimator of the Hurst parameter by (3.4);

(ii)
Compute the estimator of \(\sigma ^{2}\) using (3.5), replacing H with \(\hat{H}_{X}\);

(iii)
Obtain the estimator of μ, by replacing H with \(\hat{H}_{X}\) and \(\sigma ^{2}\) with \(\hat{\sigma}_{X}^{2}\), in (3.6).
Remark 3.5
Let us also mention that the computation time required to obtain \(\hat{\mu}_{X}\) is high, since this estimator involves the inverse of the covariance matrix.
4 Simulation study
In this section, we study the finite-sample properties of the proposed estimators. As addressed in the previous sections, the estimators of μ, \(\sigma ^{2}\), and H have desirable asymptotic properties, such as strong consistency, for sufficiently large samples. In other words, if the observed time series is relatively long, statistical inference can be based on the estimates. However, in some cases the observations are irregular and relatively short. Hence, it is of interest to analyze the performance of μ̂, \(\hat{\sigma}^{2}\), and Ĥ in small samples, which is needed to justify the application of asymptotic results. The information in small samples may also affect the choice of sampling interval. In what follows, we conduct Monte Carlo studies for different values of μ, \(\sigma ^{2}\), and H to numerically investigate the efficiency of our estimators. Moreover, we compare the finite-sample properties of our method with the two existing approaches described in Sect. 3.
The main obstacle in the Monte Carlo simulation is the difficulty of simulating fBm, in contrast to Brownian motion. In the literature, there are several methods for simulating fBm (see Coeurjolly [12]). In this paper, we apply Paxson’s algorithm (see Paxson [33]). This means that we first generate the fractional Gaussian noise with Paxson’s method via the fast Fourier transform. Then, we obtain the fBm as the partial sum of the fractional Gaussian noise. Finally, we obtain the gfBm of Eq. (2.3). For a better understanding of our method, we describe the steps for the simulation of the gfBm together with the calculation of the estimators proposed in this paper. The estimation procedure of this paper by the Monte Carlo simulation method is summarized as follows:

(i)
Set the sampling interval h and the sampling size N;

(ii)
Set the values for the three variables μ, H, and σ;

(iii)
Generate fractional Gaussian noise based on Paxson’s method;

(iv)
Construct the path of the gfBm;

(v)
Obtain the estimator of the Hurst parameter by solving the nonlinear function (2.7);

(vi)
Calculate the drift estimator μ̂ using (2.9);

(vii)
Calculate the estimator \(\hat{\sigma}^{2}\) by (2.10).
In the case of an empirical study, we just need to proceed from (v) to (vii). For comparison, we should replace steps (v) to (vii) with Remark 3.1 or Remark 3.4, respectively. Now, we sum up the estimation procedures by Monte Carlo simulation method, which is shown in Fig. 1.
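Steps (i) to (iv) can be sketched in Python as follows. Note that this sketch does not implement Paxson's FFT-based method used in the paper; as a plainly named substitute it generates exact fractional Gaussian noise by a Cholesky factorization of the increment covariance matrix, which is simple but only practical for moderate N. The function name, the `seed` argument, and the use of the log-price \(Y_{kh}\) as output are illustrative assumptions.

```python
import math
import random


def simulate_gfbm_log_path(n, h, mu, sigma, hurst, seed=None):
    """Simulate Y_h, ..., Y_nh with Y_t = mu*t - 0.5*sigma^2*t^(2H) + sigma*B_t^H."""
    rng = random.Random(seed)
    two_h = 2.0 * hurst
    # Autocovariances of fBm increments (fractional Gaussian noise) at lags 0..n-1.
    gamma = [
        0.5 * h ** two_h * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)
        for k in range(n)
    ]
    # Cholesky factor of the Toeplitz covariance matrix (substitute for Paxson's method).
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(gamma[0] - partial)
            else:
                low[i][j] = (gamma[i - j] - partial) / low[j][j]
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]
    # fBm as the partial sum of the noise, then the gfBm log-price.
    path, b = [], 0.0
    for k in range(1, n + 1):
        b += noise[k - 1]
        t = k * h
        path.append(mu * t - 0.5 * sigma ** 2 * t ** two_h + sigma * b)
    return path
```

Prices follow as \(S_{0}\exp (Y_{kh})\) for a chosen initial price, and steps (v) to (vii) are then applied to the simulated path.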
In addition, for some fixed sampling intervals, we carry out a simulation study to compare the estimators of our method with the complete maximum-likelihood estimation proposed in Misiran et al. [31] (see Remark 3.1) and the incomplete maximum-likelihood estimation presented in Xiao et al. [43] (see Remark 3.4), using generated datasets with different sample sizes N and different sampling intervals h. For each case, 1000 replications are simulated from the true model. All the procedures are coded in Matlab and the results were obtained using a 3.60 GHz Intel Core i7-4790 CPU with 8 GB of RAM running Windows 10. For a fixed sampling interval \(h=1/12\) (e.g., monthly observations), Table 1 reports the mean and standard deviation (S.Dev.) of the estimators proposed in this paper for different sample sizes, where the true value denotes the parameter value used in the Monte Carlo simulation. Moreover, to show the efficiency of our method, the results from the approaches of Misiran et al. [31] and Xiao et al. [43] are also presented in Table 1, together with the average CPU time (in seconds). Furthermore, in order to test the effect of the sampling interval, Table 2 reports simulation results for a sampling interval of \(h=1/52\) (e.g., weekly observations).
The numerical results show that the biases and the standard deviations of the estimators of μ, H, and σ decrease as the sample size increases. Hence, we can conclude that the estimators of all three methods perform well for Hurst parameters \(H\in (0,1)\). As expected, the simulated means of these estimators converge rapidly to the true values, with a slight positive bias, and the simulated standard deviations decrease toward zero as the number of observations increases. The results in Tables 1 and 2 show that the three methods give broadly similar estimates, but that our methodology performs better than the other two. Most of the biases and variances obtained with our method are within an acceptable tolerance. In most cases our estimates of the Hurst parameter are quite stable and less biased, and the performance of the estimators of μ and σ is also fairly satisfactory. It is clear from Tables 1 and 2 that both the complete maximum-likelihood estimation of Misiran et al. [31] and the incomplete maximum-likelihood estimation of Xiao et al. [43] provide reasonably precise estimators, although their biases and standard deviations are greater than those of the estimators proposed in this paper. Moreover, the most important finding is that the computation time required by the estimation procedure of this paper is significantly lower than that of the approaches proposed in Misiran et al. [31] and Xiao et al. [43]. This is mainly because the approach of Misiran et al. [31] requires a one-dimensional search and needs to compute the inverse of the covariance matrix and the logarithm of its determinant, and the method of Xiao et al. [43] needs to calculate the inverse of the covariance matrix, whereas the method of this paper relies only on a simple result obtained via the variation method. The performance of the approach proposed in this paper is thus at least comparable to that of the estimation procedure of Xiao et al. [43], at a lower computational cost. In Tables 1 and 2, the proposed method gives the smallest error among the three approaches, and the proposed estimators appear insensitive to the sampling interval h, which suggests advantages over the approaches of Xiao et al. [43] and Misiran et al. [31] in a wider range of scenarios.
5 Empirical applications
To better illustrate our proposed method, we apply it to real data. The data used in our empirical investigation are extracted from the GTA Research Service Center and cover four major market indexes in China, SHCI, SZCI, CSI 500, and CSI 300, from 01/04/2010 through 12/31/2019. After excluding those days for which the records are not complete (e.g., holidays or anticipated stock exchange closures), the whole dataset includes \(M=2431\) trading days. The index prices are observed at a time interval of \(h=1/250\) (e.g., data collected once a day) and the returns are calculated as the first differences of the logarithmic index prices.
Basic descriptive plots for the financial data in log-return format are presented in the following figures. In particular, Fig. 2 provides some empirical data for SHCI at daily sampling frequency. Fig. 2(a) shows the daily closing values of SHCI in the sample period. The original trace appears to have no long-run average level, which is evidence of a nonstationary time series. Figure 2(b) illustrates the continuously compounded returns (the log returns) associated with the price series in Fig. 2(a). In contrast to the price series, the log returns appear to be quite stable over time, so the transformation from prices to returns has produced a stationary time series. The quantile–quantile (Q–Q) plot for the sample period is presented in Fig. 2(c). If the empirical returns were normally distributed, we would expect to observe a straight line in this figure. However, this is not the case: for low and high values (the tails of the distribution) there is a clear departure of the plot from the reference line that corresponds to a normal distribution. This means that the SHCI log return does not follow a normal distribution. Fig. 2(d) shows the probability density function of the SHCI. Similarly, the corresponding figures for SZCI, CSI 500, and CSI 300 are presented in Figs. 3–5.
To give a brief insight into the properties of the data, Table 3 tabulates the basic descriptive statistics of SHCI, SZCI, CSI 500, and CSI 300 in the full sample period. Names are given in the first column. The second, third, fourth, and fifth columns contain the basic descriptive statistics for the four indices. Both skewness and kurtosis are also presented in Table 3. As is known, the skewness of a symmetric distribution, such as the normal distribution, is zero. However, none of the series seems to be symmetric. As shown in Table 3, all four series have negative skewness, which implies that the distributions have a long left tail. Also, from Table 3 we can see that all four series have a kurtosis that exceeds three, which is the kurtosis of the normal distribution. This means that the distributions are peaked (leptokurtic) relative to the normal distribution. As both descriptive statistics (i.e., skewness and kurtosis) indicate deviations from normal values, we can expect that the observed distributions are not normal. These results are also confirmed by the quantile–quantile (Q–Q) plots (from Fig. 2(c) to Fig. 5(c)). It is well known that if we plot the quantiles of the chosen series against the quantiles of the normal distribution, we can detect strong deviations, especially in the tails. From Fig. 2(c) to Fig. 5(c), the plots indicate an S-shaped curve, which is a typical sign of a non-normal distribution in a financial time series.
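For readers who wish to reproduce this kind of summary on their own data, the Python sketch below computes log returns from a price series together with sample skewness and kurtosis. It uses the simple moment (population) versions of both statistics, which is an assumption, since the estimator used for Table 3 is not stated; the function names are illustrative.

```python
import math


def log_returns(prices):
    """First differences of log prices: log(P_t) - log(P_{t-1})."""
    return [math.log(cur) - math.log(prev) for prev, cur in zip(prices, prices[1:])]


def skewness_and_kurtosis(x):
    """Moment-based skewness m3/m2^1.5 and kurtosis m4/m2^2 (normal kurtosis is 3)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

A negative first component indicates a longer left tail, and a second component above three indicates heavier tails than the normal distribution, matching the interpretation given above.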
After examining the basic statistical properties of the selected time series, we now investigate whether these four indices have long-range dependence. Generally speaking, a number of procedures are available in the literature for testing for the presence of long memory in time series of stock returns (for example, the Geweke–Porter–Hudak procedure, the R/S method, the aggregated variance approach, the aggregated absolute value method, and the Whittle approach). We use the ACF plot method, which is the simplest. Figure 6 plots the sample autocorrelation functions of the daily returns of SHCI, SZCI, CSI 500, and CSI 300 in the sample period. From Fig. 6, we observe that the autocorrelation functions decay very slowly. Therefore, we may say that these indices exhibit long-range dependence.
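The ACF plot method relies on the sample autocorrelation function, which can be sketched in Python as below. The estimator divides every lag-k autocovariance by n, a common convention that is assumed here; the band \(\pm 1.96/\sqrt{n}\) is the usual approximate 95% reference band for white noise. Function names are illustrative.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag (autocovariances divided by n)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def white_noise_band(n):
    """Approximate 95% band for the sample ACF of white noise of length n."""
    return 1.96 / math.sqrt(n)
```

Autocorrelations that stay outside the band over many lags and decline slowly are the visual signature of long-range dependence that the ACF plot is meant to reveal.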
Finally, we are in a position to estimate the unknown parameters μ, H, and σ from the selected financial series by (2.7), (2.9), and (2.10). Using Eq. (2.2) and the real data, we estimate the desired parameters based on the estimation procedures proposed in this paper (see Remark 2.2). All the estimation results are presented in Table 4. When compared to the real data of SHCI, SZCI, CSI 500, and CSI 300, these estimated parameter values seem reasonable.
6 Conclusion
The long-memory feature has evolved into an important part of time-series analysis during the last decades, as researchers in empirical studies have sought to use “ideal” models in practical applications of net traffic, economics, finance, biology, physics, chemistry, and medicine. One of the fascinations of long-memory processes is their inherent ability to bridge persistent stationary and nonstationary time series. fBm is the best-known long-memory stochastic model in continuous time. As a consequence, stochastic models driven by fBm are used by statisticians, econometricians, and researchers in many of the physical sciences who have become aware of the very strong persistence in the autocorrelations and other measures of the temporal dependence of some time series. However, a crucial problem with applications of fBm in practice is how to estimate the unknown values of the parameters in stochastic processes driven by fBm. In this paper, we extend the notion of fBm into the discrete-time domain and then propose an estimation methodology for gfBm. Employing the bipower variation and the least-squares method, we have constructed a procedure for estimating all unknown parameters in gfBm. The strong consistency of the proposed estimators has also been established. To compare the estimators from our method with the complete maximum-likelihood estimation method proposed in Misiran et al. [31] and the incomplete maximum-likelihood estimation provided in Xiao et al. [43], we perform a simulation study to illustrate the effectiveness and the efficiency of our methodology. The simulation exercise also shows that our proposed estimators work well in practice, even with small sample sizes.
Furthermore, to show how our approach can be used in realistic contexts, we present an empirical study based on the SHCI, SZCI, CSI 500, and CSI 300 indices of the Chinese stock markets. The study demonstrates that our method is easy to implement and has a lower computational cost than the complete maximum-likelihood estimators proposed by Misiran et al. [31] and the incomplete maximum-likelihood estimators provided in Xiao et al. [43]. To improve the methodology, future work should consider alternative estimation schemes with a higher order of convergence. We also expect the need for these methods, and for improvements in the statistical machinery available to practitioners, to grow as the financial industry continues to expand and data sets become richer. The field is therefore of growing importance for both theorists and practitioners.
This study also suggests several directions for future research. The first is to extend the underlying asset-price process to more general models, such as the mixed-exponential jump-diffusion model or stochastic volatility models, which might provide insight into the robustness of the results obtained herein. The second is to estimate the unknown parameters of gfBm in the presence of microstructure noise. The third is to estimate the parameters of gfBm with jumps using statistical methods such as the constrained expectation-maximization algorithm or the majorization-minimization algorithm.
Availability of data and materials
The datasets used during the current study are available from the corresponding author on reasonable request.
References
1. Baillie, R.T.: Long memory processes and fractional integration in econometrics. J. Econom. 73(1), 5–59 (1996)
2. Bardet, J.M., Surgailis, D.: Measuring the roughness of random paths by increment ratios. Bernoulli 17(2), 749–780 (2011)
3. Barndorff-Nielsen, O.E., Corcuera, J.M., Podolskij, M.: Limit theorems for functionals of higher order differences of Brownian semistationary processes. In: Prokhorov and Contemporary Probability Theory, pp. 69–96. Springer, Berlin (2013)
4. Barndorff-Nielsen, O.E., Shephard, N.: Power and bipower variation with stochastic volatility and jumps. J. Financ. Econom. 2(1), 1–37 (2004)
5. Bertin, K., Torres, S., Tudor, C.: Maximum-likelihood estimators and random walks in long memory models. Statistics 45(4), 361–374 (2011)
6. Bertin, K., Torres, S., Tudor, C.A.: Drift parameter estimation in fractional diffusions driven by perturbed random walks. Stat. Probab. Lett. 81(2), 243–249 (2011)
7. Brouste, A., Iacus, S.M.: Parameter estimation for the discretely observed fractional Ornstein–Uhlenbeck process and the Yuima R package. Comput. Stat. 28(4), 1529–1547 (2013)
8. Çağlar, M.: A long-range dependent workload model for packet data traffic. Math. Oper. Res. 29(1), 92–105 (2004)
9. Cheng, P., Shen, G., Chen, Q.: Parameter estimation for non-ergodic Ornstein–Uhlenbeck process driven by the weighted fractional Brownian motion. Adv. Differ. Equ. 2017(1), 1 (2017)
10. Chronopoulou, A., Viens, F.G.: Estimation and pricing under long-memory stochastic volatility. Ann. Finance 8(2–3), 379–403 (2012)
11. Chronopoulou, A., Viens, F.G.: Stochastic volatility and option pricing with long-memory in discrete and continuous time. Quant. Finance 12(4), 635–649 (2012)
12. Coeurjolly, J.: Simulation and identification of the fractional Brownian motion: a bibliographical and comparative study. J. Stat. Softw. 5(7), 1–53 (2000)
13. Comte, F., Coutin, L., Renault, É.: Affine fractional stochastic volatility models. Ann. Finance 8(2–3), 337–378 (2012)
14. Doukhan, P., Oppenheim, G., Taqqu, M.S.: Theory and Applications of Long-Range Dependence. Springer, Berlin (2003)
15. Duncan, T., Hu, Y., Pasik-Duncan, B.: Stochastic calculus for fractional Brownian motion I. Theory. SIAM J. Control Optim. 38(2), 582–612 (2000)
16. Elliott, R., Chan, L.: Perpetual American options with fractional Brownian motion. Quant. Finance 4(2), 123–128 (2004)
17. Elliott, R., Van Der Hoek, J.: A general fractional white noise theory and applications to finance. Math. Finance 13(2), 301–330 (2003)
18. Fleming, J., Kirby, C.: Long memory in volatility and trading volume. J. Bank. Finance 35(7), 1714–1726 (2011)
19. Granger, C.W., Hyung, N.: Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns. J. Empir. Finance 11(3), 399–421 (2004)
20. Hosking, J.R.: Fractional differencing. Biometrika 68(1), 165–176 (1981)
21. Hu, Y., Nualart, D., Xiao, W., Zhang, W.: Exact maximum-likelihood estimator for drift fractional Brownian motion at discrete observation. Acta Math. Sci. 31(5), 1851–1859 (2011)
22. Hu, Y., Øksendal, B.: Fractional white noise calculus and applications to finance. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6(1), 1–32 (2003)
23. Hurst, H.E.: Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–808 (1951)
24. Kubilius, K., Mishura, Y., Ralchenko, K.: Parameter Estimation in Fractional Diffusion Models, vol. 8. Springer, Berlin (2017)
25. Kukush, A., Mishura, Y., Valkeila, E.: Statistical inference with fractional Brownian motion. Stat. Inference Stoch. Process. 8(1), 71–93 (2005)
26. Lai, D., Davis, B.R., Hardy, R.J.: Fractional Brownian motion and clinical trials. J. Appl. Stat. 27(1), 103–108 (2000)
27. Liu, Z., Song, N.: Minimum distance estimation for fractional Ornstein–Uhlenbeck type process. Adv. Differ. Equ. 2014(1), 1 (2014)
28. Lo, A.: Long-term memory in stock market prices. Econometrica 59(5), 1279–1313 (1991)
29. Mandelbrot, B.B., Van Ness, J.W.: Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10(4), 422–437 (1968)
30. Mishura, Y.: Stochastic Calculus for Fractional Brownian Motion and Related Processes. Springer, Berlin (2008)
31. Misiran, M., Lu, Z., Teo, K.: Fractional Black–Scholes models: complete MLE with application to fractional option pricing. In: International Conference on Optimization and Control 2010, pp. 573–586. Guiyang, China (2010)
32. Nguyen, D.B.B., Prokopczuk, M., Sibbertsen, P.: The memory of stock return volatility: asset pricing implications. J. Financ. Mark. 47, 100487 (2020)
33. Paxson, V.: Fast, approximate synthesis of fractional Gaussian noise for generating self-similar network traffic. ACM SIGCOMM Comput. Commun. Rev. 27(5), 5–18 (1997)
34. Phillips, P.C., Yu, J.: Jackknifing bond option prices. Rev. Financ. Stud. 18(2), 707–742 (2005)
35. Rossi, E., Fantazzini, D.: Long memory and periodicity in intraday volatility. J. Financ. Econom. 13(4), 922–961 (2015)
36. Rostek, S.: Option Pricing in Fractional Brownian Markets. Springer, Berlin (2009)
37. Sun, L., Wang, L., Fu, P.: Maximum likelihood estimators of a long-memory process from discrete observations. Adv. Differ. Equ. 2018(1), 1 (2018)
38. Tanaka, K., Xiao, W., Yu, J.: Maximum likelihood estimation for the fractional Vasicek model. Econom. 8(3), 32 (2020)
39. Wang, X., Xiao, W., Yu, J.: Modeling and forecasting realized volatility with the fractional Ornstein–Uhlenbeck process. J. Econom. (2022, in press). https://doi.org/10.1016/j.jeconom.2021.08.001
40. Willinger, W., Taqqu, M.S., Leland, W.E., Wilson, D.V.: Self-similarity in high-speed packet traffic: analysis and modeling of Ethernet traffic measurements. Stat. Sci. 10(1), 67–85 (1995)
41. Xiao, W., Zhang, W., Xu, W.: Parameter estimation for fractional Ornstein–Uhlenbeck processes at discrete observation. Appl. Math. Model. 35(9), 4196–4207 (2011)
42. Xiao, W., Zhang, W., Zhang, X.: Parameter identification for drift fractional Brownian motions with application to the Chinese stock markets. Commun. Stat., Simul. Comput. 44(8), 2117–2136 (2015)
43. Xiao, W., Zhang, W., Zhang, X.: Parameter identification for the discretely observed geometric fractional Brownian motion. J. Stat. Comput. Simul. 85(2), 269–283 (2015)
44. Xiao, W., Yu, J.: Asymptotic theory for estimating drift parameters in the fractional Vasicek model. Econom. Theory 35(1), 198–231 (2019)
45. Xiao, W., Yu, J.: Asymptotic theory for rough fractional Vasicek models. Econ. Lett. 177, 26–29 (2019)
46. Xiao, W., Zhang, X., Zuo, Y.: Least squares estimation for the drift parameters in the sub-fractional Vasicek processes. J. Stat. Plan. Inference 197, 141–155 (2018)
Acknowledgements
This work is partly supported by the Humanities and Social Sciences Research and Planning Fund of the Ministry of Education of China (No. 20YJA630053), the Natural Science Foundation of China (No. 11801590 and No. 61673019), the Tianyuan Foundation of the National Natural Science Foundation of China (No. 12126313), the Fundamental Research Funds for the Central Universities (No. 19lgpy243), and the Research Foundation for Young Teachers of Guangdong University of Technology.
Author information
Contributions
The main idea of this paper was proposed by LS and JC. XL prepared the manuscript initially and performed all the steps of the proofs in this research. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Appendix
A.1 Proof of Theorem 2.3
Proof
Using the ergodic theorem, we can easily obtain the desired result by a straightforward argument. □
A.2 Proof of Theorem 2.4
Proof
We first prove the convergence of μ̂. For the sake of convenience, we define
where \(a=\sum_{i=1}^{N} (ih)^{2}\), \(b=\sum_{i=1}^{N} (ih)^{2H+1}\), and \(c=\sum_{i=1}^{N}(ih)^{4H}\). For technical reasons, we first deal with the strong convergence of μ̃. Substituting \(Y_{ih}=\mu ih +\sigma B_{ih}^{H}-\frac{1}{2}\sigma ^{2} (ih)^{2H}\) into (A.1), we have
Thus, \(\mathbb{E} [\widetilde{\mu}]=\mu \) and hence μ̃ is unbiased. On the other hand, we have
where \(I_{1} = \mathbb{E} (\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} )^{2}\), \(I_{2} = \mathbb{E} (\sum_{i=1}^{N} ih B_{ih}^{H} )^{2}\).
A standard calculation yields
where C is a generic constant, which may change from line to line.
With almost no extra effort, we can obtain
where C is a constant.
Moreover, we can easily obtain that
Inserting the results (A.5), (A.6), and (A.7) into (A.4), we obtain, as N goes to infinity and for \(H\in (0,1)\),
which converges to zero for fixed h.
Next, we will use the Borel–Cantelli lemma to prove the strong convergence of μ̃. To this end, we will show that
for some \(\epsilon >0\).
Take \(0<\epsilon <1-H\). Then, from Chebyshev’s inequality, the property of the central absolute moments of Gaussian random variables, and (A.8), we have
For sufficiently large q, we have \(q\epsilon +(H-1)q<-1\). Thus, (A.9) is proved, which implies
as \(N\rightarrow \infty \).
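The choice of q can be made explicit with a short worked step. The sketch below assumes that (A.8) yields a variance bound of order \(N^{2H-2}\) for fixed h. The threshold \(\delta >0\) and the constants \(C_{q}\) and C are notation introduced here for illustration only.

```latex
\begin{align*}
\mathbb{P}\bigl(N^{\epsilon}\lvert\widetilde{\mu}-\mu\rvert>\delta\bigr)
  &\le \delta^{-q} N^{q\epsilon}\,\mathbb{E}\lvert\widetilde{\mu}-\mu\rvert^{q}
  && \text{(Chebyshev's inequality)}\\
  &= C_{q}\,\delta^{-q} N^{q\epsilon}
     \bigl(\mathbb{E}\lvert\widetilde{\mu}-\mu\rvert^{2}\bigr)^{q/2}
  && \text{(centered Gaussian moments)}\\
  &\le C\,N^{q\epsilon+(H-1)q}
  && \text{(variance of order } N^{2H-2}\text{)}.
\end{align*}
% The series \sum_{N} N^{q\epsilon+(H-1)q} converges exactly when
%   q\epsilon + (H-1)q < -1,  i.e.  q > 1/(1-H-\epsilon),
% which is possible because 0 < \epsilon < 1-H.
% Example: H = 0.7 and \epsilon = 0.1 require q > 5.
```

Under these assumptions the probabilities are summable in N, so the Borel–Cantelli lemma applies for every \(\delta >0\).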
Consequently, using the continuous mapping theorem, the strong consistency of Ĥ and (A.10), we can obtain
which implies (2.11).
Next, we are interested in the strong consistency of \(\hat{\sigma}^{2}\). Substituting \(Y_{ih}=\mu ih +\sigma B_{ih}^{H}-\frac{1}{2}\sigma ^{2} (ih)^{2H}\) into (A.2), we have
Thus, \(\mathbb{E} [\widetilde{\sigma}^{2} ]=\sigma ^{2}\) and hence \(\widetilde{\sigma}^{2}\) is unbiased. On the other hand, we can easily obtain
where \(I_{1} = \mathbb{E} (\sum_{i=1}^{N} (ih)^{2H}B_{ih}^{H} )^{2}\) and \(I_{2} = \mathbb{E} (\sum_{i=1}^{N} ihB_{ih}^{H} )^{2}\).
Using the same argument as for (A.8), we obtain
which converges to zero as N goes to infinity.
Using an argument similar to that for (A.10), we obtain
as \(N\rightarrow \infty \).
Consequently, using the continuous mapping theorem, the strong consistency of Ĥ and (A.13), we can obtain
which implies (2.12). □
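The least-squares structure behind the two unbiasedness claims can be sketched as follows. The display assumes that \(\widetilde{\mu}\) and \(\widetilde{\sigma}^{2}\) in (A.1) and (A.2) are the least-squares solutions for known H, which is consistent with the constants a, b, and c defined in the proof but is an assumption of this sketch. The sums \(S_{1}\) and \(S_{2}\) are shorthand introduced here.

```latex
\begin{align*}
(\widetilde{\mu},\widetilde{\sigma}^{2})
  &=\arg\min_{\mu,\sigma^{2}}\sum_{i=1}^{N}
    \Bigl(Y_{ih}-\mu\,ih+\tfrac{1}{2}\sigma^{2}(ih)^{2H}\Bigr)^{2},\\
S_{1}&=\sum_{i=1}^{N} ih\,Y_{ih},\qquad
S_{2}=\sum_{i=1}^{N}(ih)^{2H}\,Y_{ih},\\
\widetilde{\mu}&=\frac{cS_{1}-bS_{2}}{ac-b^{2}},\qquad
\widetilde{\sigma}^{2}=\frac{2\,(bS_{1}-aS_{2})}{ac-b^{2}},\\
\mathbb{E}[S_{1}]&=\mu a-\tfrac{1}{2}\sigma^{2}b,\qquad
\mathbb{E}[S_{2}]=\mu b-\tfrac{1}{2}\sigma^{2}c,\\
\mathbb{E}[\widetilde{\mu}]
  &=\frac{c\bigl(\mu a-\tfrac{1}{2}\sigma^{2}b\bigr)
         -b\bigl(\mu b-\tfrac{1}{2}\sigma^{2}c\bigr)}{ac-b^{2}}=\mu,\\
\mathbb{E}[\widetilde{\sigma}^{2}]
  &=\frac{2\bigl[b\bigl(\mu a-\tfrac{1}{2}\sigma^{2}b\bigr)
         -a\bigl(\mu b-\tfrac{1}{2}\sigma^{2}c\bigr)\bigr]}{ac-b^{2}}=\sigma^{2}.
\end{align*}
```

By the Cauchy–Schwarz inequality, \(ac-b^{2}>0\) whenever \(H\neq \frac{1}{2}\) and \(N\geq 2\). At \(H=\frac{1}{2}\) the two regressors ih and \((ih)^{2H}\) coincide and the system degenerates.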
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, L., Chen, J. & Lu, X. Parameter estimation for discretized geometric fractional Brownian motions with applications in Chinese financial markets. Adv Cont Discr Mod 2022, 69 (2022). https://doi.org/10.1186/s13662-022-03743-3
DOI: https://doi.org/10.1186/s13662-022-03743-3