Theory and Modern Applications

# Estimation for incomplete information stochastic systems from discrete observations

## Abstract

This paper is concerned with the estimation problem for incomplete information stochastic systems from discrete observations. The suboptimal estimation of the state is obtained by constructing the extended Kalman filtering equation. The approximate likelihood function is given by using a Riemann sum and an Itô sum to approximate the integrals in the continuous-time likelihood function. The consistency of the maximum likelihood estimator and the asymptotic normality of the error of estimation are proved by applying the martingale moment inequality, Hölder’s inequality, the Chebyshev inequality, the Burkholder–Davis–Gundy inequality and the uniform ergodic theorem. An example is provided to verify the effectiveness of the estimation methods.

## 1 Introduction

Stochastic differential equations have been widely used in many application areas such as engineering, information and medical science [4, 5]. Recently, stochastic differential equations have been applied to describe the dynamics of financial assets, asset portfolios and the term structure of interest rates, for example in the Black–Scholes option pricing model [6], the Vasicek and Cox–Ingersoll–Ross pricing formulas [8, 9, 27], the Chan–Karolyi–Longstaff–Sanders model [10], the Constantinides model [11], and the Aït-Sahalia model [1]. Some parameters in these pricing formulas describe the dynamics of the related assets; however, these parameters are generally unknown. In the past few decades, some authors have studied the parameter estimation problem for economic models. For example, Yu and Phillips [33] used a Gaussian approach to study parameter estimation for continuous-time short-term interest rate models, and Overbeck [21], Rossi [24] and Wei et al. [29] investigated the parameter estimation problem for the Cox–Ingersoll–Ross model by applying the maximum likelihood method, the least-squares method and the Gaussian method, respectively. Moreover, several methods have been used to estimate parameters in general nonlinear stochastic differential equations from continuous-time observations, for instance Bayes estimation [16, 22], maximum likelihood estimation [3, 30, 31] and M-estimation [25, 32]. In practice, however, it is impossible to observe a process continuously in time. Therefore, parametric inference based on sampled data is important in dealing with practical problems; examples include least-squares estimation [17, 18], approximate transition densities [2, 7] and adaptive maximum likelihood type estimation [26].

A variety of stochastic systems are described by stochastic differential equations [23], and sometimes the stochastic systems have incomplete information. Many authors studied the state estimation problem for incomplete information stochastic systems by using Kalman filtering or extended Kalman filtering [13, 15, 19, 28]. Furthermore, sometimes the parameters and states of a stochastic system are unknown at the same time. Therefore, the parameter estimation and state estimation need to be solved simultaneously. In recent years, some authors investigated the parameter estimation problem for incomplete information linear stochastic systems. For example, Deck and Theting [12] used Kalman filtering and the Bayes method to study the linear homogeneous stochastic systems. Kan et al. [14] discussed the linear nonhomogeneous stochastic systems based on the methods used in [12]. Mbalawata et al. [20] applied Kalman filtering and a maximum likelihood estimation to investigate the parameter and state estimation for linear stochastic systems.

As is well known, parameter estimation for incomplete information linear stochastic systems has been studied by some authors [12, 14, 20]. However, the asymptotic property of the parameter estimator has not been discussed in [20], and in [12, 14], only the drift parameter has been considered. In this paper, the parameter estimation problem for incomplete information nonlinear stochastic systems is investigated from discrete observations. Firstly, the suboptimal estimation of the state is obtained by constructing the extended Kalman filtering equation. Secondly, the approximate likelihood function is given by using a Riemann sum and an Itô sum to approximate the integrals in the continuous-time likelihood function. In the approximate likelihood function, both the drift and the diffusion parameters are unknown. Finally, the consistency of the maximum likelihood estimator and the asymptotic normality of the error of estimation are proved by applying the martingale moment inequality, Hölder’s inequality, the Chebyshev inequality, the Burkholder–Davis–Gundy inequality and the uniform ergodic theorem.

This paper is organized as follows. In Sect. 2, the state estimation is derived and the approximate likelihood function is given. In Sect. 3, some important lemmas are proved, and the consistency of the estimator and the asymptotic normality of the error of estimation are discussed. In Sect. 4, an example is provided. The conclusion is given in Sect. 5.

## 2 Problem formulation and preliminaries

In this paper, the estimation problem for incomplete information stochastic system is investigated. The stochastic system is described as follows:

$$\textstyle\begin{cases} d X_{t}= f(X_{t},\theta )\,dt+ d W_{t},\qquad X_{0}\sim u_{\theta }, \\ d Y_{t}= h(X_{t})\,dt+ d V_{t} , \qquad Y_{0}=Y_{0}, \end{cases}$$
(1)

where $$\theta \in \varTheta =\overline{\varTheta }_{0}$$ (the closure of $$\varTheta _{0}$$) with $$\varTheta _{0}$$ being an open bounded convex subset of $$\mathbb{R}$$ is an unknown parameter, $$(W_{t},t\geq 0)$$ and $$(V_{t},t\geq 0)$$ are independent Wiener processes, $$X_{t}$$ is ergodic, $$u_{\theta }$$ is the invariant measure, $$\{Y_{t}\}$$ is observable, while $$\{X_{t}\}$$ is unobservable.
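As a rough illustration of system (1), the following sketch simulates one pair of paths with the Euler–Maruyama scheme. The choices $$f(x,\theta )=-\theta x$$, $$h(x)=x$$ and $$Y_{0}=0$$ are assumptions made only for this example, and the function name `simulate_system` is hypothetical. For this drift, the invariant law of $$X_{t}$$ is $$N(0,1/(2\theta ))$$, which is used to draw $$X_{0}$$.

```python
import math
import random


def simulate_system(theta, n, delta, seed=0):
    """Euler-Maruyama paths of system (1) under the ASSUMED choices
    f(x, theta) = -theta * x, h(x) = x and Y_0 = 0 (illustrative only)."""
    rng = random.Random(seed)
    # Invariant law of dX = -theta X dt + dW is N(0, 1 / (2 theta)).
    x = rng.gauss(0.0, math.sqrt(1.0 / (2.0 * theta)))
    y = 0.0
    xs, ys = [x], [y]
    sd = math.sqrt(delta)  # standard deviation of a Wiener increment
    for _ in range(n):
        dw = rng.gauss(0.0, sd)  # increment of W
        dv = rng.gauss(0.0, sd)  # increment of V, independent of W
        y += x * delta + dv      # observation uses X at the left endpoint
        x += -theta * x * delta + dw
        xs.append(x)
        ys.append(y)
    return xs, ys
```

Only the list `ys` would be available to an observer; `xs` is returned here so that a state estimate can be compared with the hidden path.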

From now on the work is under the assumptions below.

### Assumption 1

$$|f(x,\theta )-f(y,\theta )|\leq K(\theta )|x-y|$$, $$|f(x,\theta )| \leq K_{1}(\theta )(1+|x|)$$, $$\sup \{K(\theta ),K_{1}(\theta )\}<\infty$$, $$\theta \in \varTheta$$, $$x,y \in {\mathbb{R}}$$.

### Assumption 2

$$|h(x)-h(y)|\leq |x-y|$$.

### Assumption 3

$${\mathbb{E}}|X_{0}|^{p}<\infty$$ for each $$p>0$$.

### Assumption 4

$${\mathbb{E}}[f(X_{0},\theta )(f(X_{0},\theta _{0})-\frac{1}{2}f(X_{0}, \theta ))]$$ has the unique maximal value at $$\theta =\theta _{0}$$, where $$\theta _{0}$$ is the true parameter.

The likelihood function cannot be given directly because $$\{X_{t}\}$$ is unobservable. Therefore, the state $$\{X_{t}\}$$ must be estimated first.

The state estimator is designed as follows:

$$\textstyle\begin{cases} d \widehat{X_{t}}= f(\widehat{X_{t}},\theta )\,dt+K_{t}(d Y_{t}-h( \widehat{X_{t}})\,dt), \\ \widehat{X_{0}}= X_{0}. \end{cases}$$
(2)

According to (1) and (2), we have

$$d(X_{t}-\widehat{X_{t}})=\bigl(f(X_{t}, \theta )-f(\widehat{X_{t}},\theta )-K _{t} \bigl(h(X_{t})-h(\widehat{X_{t}})\bigr)\bigr)\,dt+d W_{t}-K_{t}\, d V_{t}.$$
(3)

From the Itô lemma and (3), it can be checked that

\begin{aligned}& d(X_{t}-\widehat{X_{t}})^{2} \\& \quad = 2(X_{t}-\widehat{X_{t}}) \bigl(f(X_{t}, \theta )-f(\widehat{X_{t}},\theta )-K_{t}\bigl(h(X_{t})-h( \widehat{X_{t}})\bigr)\bigr)\,dt \\& \qquad {} + 2(X_{t}-\widehat{X_{t}}) (d W_{t}-K_{t} \, d V_{t})+\bigl(1+K_{t}^{2}\bigr)\,dt \\& \quad = \bigl[2(X_{t}-\widehat{X_{t}}) \bigl(f(X_{t}, \theta )-f(\widehat{X_{t}}, \theta )\bigr) \\& \qquad {} - 2K_{t}(X_{t}-\widehat{X_{t}}) \bigl(h(X_{t})-h(\widehat{X_{t}})\bigr)+1+K_{t} ^{2}\bigr]\,dt \\& \qquad {} + 2(X_{t}-\widehat{X_{t}}) (d W_{t}-K_{t} \, d V_{t}). \end{aligned}
(4)

Taking expectations on both sides of (4) and using Assumption 2, we obtain

\begin{aligned}& d{\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \\& \quad =\bigl[2{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(f(X_{t},\theta )-f( \widehat{X_{t}},\theta )\bigr) \\& \qquad {}-2K_{t}{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(h(X_{t})-h( \widehat{X_{t}})\bigr)+1+K_{t}^{2} \bigr]\,dt \\& \quad =\bigl[K_{t}^{2}-2K_{t}{ \mathbb{E}}(X_{t}-\widehat{X_{t}})^{2}+1+2K_{t} {\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \\& \qquad {}+2{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(f(X_{t},\theta )-f( \widehat{X_{t}},\theta )\bigr)\bigr] \,dt \\& \qquad {}-2K_{t}{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(h(X_{t})-h( \widehat{X_{t}})\bigr)\,dt \\& \quad =\bigl[\bigl(K_{t}-{\mathbb{E}}(X_{t}- \widehat{X_{t}})^{2}\bigr)^{2}-\bigl({\mathbb{E}}(X _{t}-\widehat{X_{t}})^{2}\bigr)^{2}+1 \bigr]\,dt \\& \qquad {}+2K_{t}{\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \,dt-2K_{t}{\mathbb{E}}(X _{t}-\widehat{X_{t}}) \bigl(h(X_{t})-h(\widehat{X_{t}})\bigr)\,dt \\& \qquad {}+2{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(f(X_{t},\theta )-f( \widehat{X_{t}},\theta )\bigr)\,dt \\& \quad \geq \bigl[-\bigl({\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \bigr)^{2}+1+2{\mathbb{E}}(X _{t}-\widehat{X_{t}}) \bigl(f(X_{t},\theta )-f(\widehat{X_{t}},\theta )\bigr)\bigr] \,dt. \end{aligned}
(5)

Therefore, when $$K_{t}={\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2}$$, (5) has the minimum value

$$d{\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2}=\bigl[- \bigl({\mathbb{E}}(X_{t}- \widehat{X_{t}})^{2} \bigr)^{2}+1+2{\mathbb{E}}(X_{t}-\widehat{X_{t}}) \bigl(f(X _{t},\theta )-f(\widehat{X_{t}},\theta )\bigr)\bigr] \,dt.$$
(6)

From Assumption 1, one has

\begin{aligned}& \bigl[-\bigl({\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \bigr)^{2}+1+2{\mathbb{E}}(X _{t}-\widehat{X_{t}}) \bigl(f(X_{t},\theta )-f(\widehat{X_{t}},\theta )\bigr)\bigr] \,dt \\& \quad \leq \bigl[-\bigl({\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \bigr)^{2}+1+2K(\theta ) {\mathbb{E}}(X_{t}- \widehat{X_{t}})^{2}\bigr]\,dt. \end{aligned}

Since $$f(X_{t},\theta )$$ is nonlinear, the optimal state estimation of $$X_{t}$$ cannot be obtained, so a suboptimal state estimation is considered instead.

Consider

$$d{\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} = \bigl(2K(\theta ){\mathbb{E}}(X _{t}-\widehat{X_{t}})^{2}- \bigl({\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2} \bigr)^{2}+1\bigr)\,dt.$$
(7)

Let

$${\mathbb{E}}(X_{t}-\widehat{X_{t}})^{2}=\gamma _{t},$$
(8)

one has

$$d\gamma _{t}=\bigl(2K(\theta )\gamma _{t}-\gamma _{t}^{2}+1\bigr)\,dt.$$
(9)

It is easy to check that

$$\gamma _{t}=\frac{(\sqrt{K^{2}(\theta )+1}+K(\theta ))(1-e^{-2t\sqrt{K ^{2}(\theta )+1}})}{1+(\frac{K(\theta )+\sqrt{K^{2}(\theta )+1}}{\sqrt{K ^{2}(\theta )+1}-K(\theta )})e^{-2t\sqrt{K^{2}(\theta )+1}}}.$$
(10)

Then, as $$t\rightarrow \infty$$, we obtain

$$\gamma _{t}\rightarrow K(\theta )+\sqrt{K^{2}(\theta )+1}= \gamma ( \theta ).$$
(11)
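The closed form (10) and the limit (11) can be checked numerically. The sketch below evaluates (10), integrates the Riccati equation (9) with a classical fourth-order Runge–Kutta scheme from $$\gamma _{0}=0$$ (the value that (10) takes at $$t=0$$), and compares the two. The function names and the sample value $$K(\theta )=1$$ are illustrative assumptions.

```python
import math


def gamma_closed(t, k):
    """Closed-form solution (10) of the Riccati equation (9); k stands for K(theta)."""
    s = math.sqrt(k * k + 1.0)
    decay = math.exp(-2.0 * t * s)
    ratio = (k + s) / (s - k)
    return (s + k) * (1.0 - decay) / (1.0 + ratio * decay)


def gamma_rk4(t_end, k, steps=1000):
    """Integrate d gamma/dt = 2 k gamma - gamma^2 + 1 from gamma(0) = 0 by RK4."""
    def rhs(g):
        return 2.0 * k * g - g * g + 1.0

    h = t_end / steps
    g = 0.0
    for _ in range(steps):
        s1 = rhs(g)
        s2 = rhs(g + 0.5 * h * s1)
        s3 = rhs(g + 0.5 * h * s2)
        s4 = rhs(g + h * s3)
        g += h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return g


if __name__ == "__main__":
    k = 1.0  # assumed sample value of K(theta)
    print(gamma_closed(1.0, k), gamma_rk4(1.0, k))
    print(gamma_closed(50.0, k), k + math.sqrt(k * k + 1.0))
```

For large $$t$$ the exponential terms vanish and the closed form approaches $$K(\theta )+\sqrt{K^{2}(\theta )+1}$$, in agreement with (11).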

Therefore, it is obvious that

$$\textstyle\begin{cases} d \widehat{X_{t}}= f(\widehat{X_{t}},\theta )\,dt+\gamma _{t}(d Y_{t}- \widehat{X_{t}}\,dt), \\ \widehat{X_{0}}= X_{0}. \end{cases}$$
(12)

Let

$$dV^{*}_{t}=d Y_{t}-\widehat{X_{t}}\,dt,$$
(13)

where $$(V^{*}_{t},t\geq 0)$$ is assumed to be a standard Wiener process defined on a complete probability space $$(\varOmega , \mathscr{F}, \{ {\mathscr{F}}_{t}\}_{t\geq 0}, \mathrm{P})$$.

Hence

$$\textstyle\begin{cases} d \widehat{X_{t}}= f(\widehat{X_{t}},\theta )\,dt+\gamma _{t}\, dV^{*}_{t}, \\ \widehat{X_{0}}= X_{0}. \end{cases}$$
(14)

It is assumed that the system (14) reaches the steady state, which means that

$$\textstyle\begin{cases} d \widehat{X_{t}}= f(\widehat{X_{t}},\theta )\,dt+\gamma (\theta )\, dV ^{*}_{t}, \\ \widehat{X_{0}}= X_{0}. \end{cases}$$
(15)

In summary, the suboptimal state estimation of $$X_{t}$$ is (15).
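With discrete observations, the filter recursion (12) has to be evaluated on a time grid. A minimal sketch, assuming a simple Euler discretisation with step `delta` and observation increments `dy[i]` $$=Y_{t_{i+1}}-Y_{t_{i}}$$, is given below. The function name `filter_path`, the callable `gain` (which may return $$\gamma _{t}$$ or the steady-state value $$\gamma (\theta )$$) and the starting value `xhat0` are hypothetical; in particular, $$X_{0}$$ is not observed, so `xhat0` has to be supplied by the user.

```python
def filter_path(dy, delta, theta, f, gain, xhat0):
    """Euler discretisation of the filter (12):
    xhat_{i+1} = xhat_i + f(xhat_i, theta) * delta
                 + gain(t_i) * (dY_i - xhat_i * delta)."""
    xhat = [xhat0]
    for i, dy_i in enumerate(dy):
        x = xhat[-1]
        t_i = i * delta
        innovation = dy_i - x * delta  # discrete analogue of dY_t - xhat_t dt
        xhat.append(x + f(x, theta) * delta + gain(t_i) * innovation)
    return xhat


if __name__ == "__main__":
    # One step with the assumed drift f(x, theta) = -theta * x and constant gain 2.
    path = filter_path([0.3], 0.1, 1.0, lambda x, th: -th * x, lambda t: 2.0, 1.0)
    print(path)
```

The returned list has one more entry than `dy` and can be passed directly to a discretised likelihood.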

The likelihood function obeys the expression

$$\ell _{t}(\theta )= \int ^{t}_{0}\frac{f(\widehat{X_{s}},\theta )}{\gamma ^{2}(\theta )}\, d \widehat{X_{s}}-\frac{1}{2} \int ^{t}_{0}\frac{f^{2}( \widehat{X_{s}},\theta )}{\gamma ^{2}(\theta )}\,ds.$$
(16)

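Then, for observations at the discrete times $$t_{i}$$, $$i=0,1,\ldots ,n$$, with step $$\Delta =t_{i}-t_{i-1}$$, the approximate likelihood function can be written as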

$$\ell _{n}(\theta )=\sum^{n}_{i=1} \frac{f(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}(\theta )}(\widehat{X_{t_{i}}}-\widehat{X_{t_{i-1}}})- \frac{ \Delta }{2}\sum^{n}_{i=1} \frac{f^{2}(\widehat{X_{t_{i-1}}},\theta )}{ \gamma ^{2}(\theta )}.$$
(17)
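The approximate likelihood (17) is straightforward to evaluate from a sampled path of $$\widehat{X}$$. The sketch below computes the Itô sum and the Riemann sum and then maximises over a finite grid of parameter values. The function names, the grid search and the sample drift $$f(x,\theta )=-\theta x$$ are assumptions for illustration; any other one-dimensional optimiser could be used instead.

```python
def approx_loglik(xhat, delta, theta, f, gamma):
    """Approximate log-likelihood (17) for a discretely sampled path xhat."""
    g2 = gamma(theta) ** 2
    ito_sum = 0.0      # sum of f(xhat_{i-1}) * (xhat_i - xhat_{i-1})
    riemann_sum = 0.0  # sum of f(xhat_{i-1})^2
    for x_prev, x_next in zip(xhat[:-1], xhat[1:]):
        drift = f(x_prev, theta)
        ito_sum += drift * (x_next - x_prev)
        riemann_sum += drift * drift
    return ito_sum / g2 - 0.5 * delta * riemann_sum / g2


def grid_mle(xhat, delta, grid, f, gamma):
    """Return the grid point that maximises the approximate log-likelihood."""
    return max(grid, key=lambda th: approx_loglik(xhat, delta, th, f, gamma))


if __name__ == "__main__":
    xhat = [1.0, 0.5, 0.25]          # toy path, not real data
    f = lambda x, th: -th * x        # assumed drift
    grid = [0.5 * i for i in range(21)]
    print(approx_loglik(xhat, 0.1, 1.0, f, lambda th: 1.0))
    print(grid_mle(xhat, 0.1, grid, f, lambda th: 1.0))
```

Passing `gamma` as a function of `theta` keeps the sketch usable when the diffusion coefficient depends on the parameter, as in (15).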

### Remark 1

That system (14) reaches the steady state means that the Riccati equation satisfies $$\frac{d\gamma _{t}}{dt}=0$$. Hence, we obtain $$\gamma _{t}=\gamma (\theta )$$.

### Remark 2

In (15), both the drift term and the diffusion term contain the parameter. Thus, it is difficult to discuss the asymptotic properties of the estimator. This problem is solved in the next section.

## 3 Main results and proofs

The following lemmas are useful to derive our results.

### Lemma 1

Assume that $$\{\widehat{X_{t}}\}$$ is a solution of the stochastic differential equation (14) and Assumptions 1–4 hold. Then, for any integer $$p\geq 1$$ and $$0\leq s\leq t$$,

$${\mathbb{E}} \vert \widehat{X_{t}}-\widehat{X_{s}} \vert ^{2p}=O\bigl( \vert t-s \vert ^{p}\bigr).$$

### Proof

Suppose $$\theta _{0}$$ is the true parameter value; by applying Hölder’s inequality, it follows that

\begin{aligned}& \vert \widehat{X_{t}}-\widehat{X_{s}} \vert ^{2p} \\& \quad = \biggl\vert \int ^{t}_{s}f(\widehat{X_{u}},\theta _{0})\,du+\gamma (\theta _{0}) \int ^{t}_{s}dV^{*}_{u} \biggr\vert ^{2p} \\& \quad \leq 2^{2p-1}\biggl( \biggl\vert \int ^{t}_{s}f(\widehat{X_{u}},\theta _{0})\,du \biggr\vert ^{2p}+\bigl( \gamma (\theta _{0})\bigr)^{2p} \biggl\vert \int ^{t}_{s}dV^{*}_{u} \biggr\vert ^{2p}\biggr) \\& \quad \leq 2^{2p-1}\biggl((t-s)^{2p-1} \int ^{t}_{s} \bigl\vert f(\widehat{X_{u}}, \theta _{0}) \bigr\vert ^{2p}\,du+\bigl( \gamma (\theta _{0})\bigr)^{2p} \biggl\vert \int ^{t}_{s}dV^{*}_{u} \biggr\vert ^{2p}\biggr). \end{aligned}

Since

$$\bigl\vert f(\widehat{X_{u}},\theta _{0}) \bigr\vert ^{2p}\leq K_{1}(\theta _{0})^{2p}2^{2p-1} \bigl(1+ \vert \widehat{X_{u}} \vert ^{2p}\bigr),$$
(18)

from Assumption 3 together with the stationarity of the process, one has

$${\mathbb{E}}\biggl[ \int ^{t}_{s} \bigl\vert f(\widehat{X_{u}}, \theta _{0}) \bigr\vert ^{2p}\,du\biggr]=O\bigl( \vert t-s \vert \bigr).$$
(19)

From the Burkholder–Davis–Gundy inequality, it can be checked that

$${\mathbb{E}}\biggl[ \biggl\vert \int ^{t}_{s}dV^{*}_{u} \biggr\vert ^{2p}\biggr]\leq C_{p}{\mathbb{E}} \biggl\vert \int ^{t}_{s}du \biggr\vert ^{p}=C_{p}(t-s)^{p},$$
(20)

where $$C_{p}$$ is a positive constant depending only on p.

Then we have

$${\mathbb{E}}\biggl[ \biggl\vert \int ^{t}_{s}dV^{*}_{u} \biggr\vert ^{2p}\biggr]=O\bigl( \vert t-s \vert ^{p}\bigr).$$
(21)

From the above analysis, it follows that

$${\mathbb{E}} \vert \widehat{X_{t}}-\widehat{X_{s}} \vert ^{2p}=O\bigl( \vert t-s \vert ^{p}\bigr).$$
(22)

The proof is complete. □
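The order $$O(\vert t-s \vert ^{p})$$ of the stochastic term in (21) can be illustrated for a standard Wiener process, for which the increment $$V^{*}_{t}-V^{*}_{s}$$ is $$N(0,t-s)$$ and therefore $${\mathbb{E}}\vert V^{*}_{t}-V^{*}_{s} \vert ^{2p}=(2p-1)!!\,(t-s)^{p}$$. The sketch below compares this exact Gaussian moment with a Monte Carlo average; the function names, sample size and seed are illustrative assumptions.

```python
import math
import random


def double_factorial_odd(p):
    """(2p - 1)!! = 1 * 3 * ... * (2p - 1)."""
    out = 1
    for j in range(1, 2 * p, 2):
        out *= j
    return out


def exact_moment(dt, p):
    """E|V_t - V_s|^{2p} for a standard Wiener process with t - s = dt."""
    return double_factorial_odd(p) * dt ** p


def mc_moment(dt, p, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same moment from simulated increments."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    total = sum(rng.gauss(0.0, sd) ** (2 * p) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    for dt in (1.0, 0.25, 0.0625):
        print(dt, exact_moment(dt, 2), mc_moment(dt, 2))
```

Dividing `dt` by 4 divides the fourth moment by 16, which is the scaling $$(t-s)^{p}$$ with $$p=2$$.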

### Lemma 2

Under Assumptions 1 and 3, when $$\Delta \rightarrow 0$$, one has

$${\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{s}},\theta _{0})}{\gamma ^{2}(\theta )}\,ds-\sum ^{n}_{i=1}\frac{f(\widehat{X_{t_{i-1}}},\theta )f( \widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{2}(\theta )}\Delta \Biggr\vert \rightarrow 0.$$

### Proof

By applying Hölder’s inequality, it can be checked that

\begin{aligned}& {\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{s}},\theta _{0})}{\gamma ^{2}(\theta )}\,ds-\sum ^{n}_{i=1}\frac{f(\widehat{X_{t_{i-1}}},\theta )f( \widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{2}(\theta )}\Delta \Biggr\vert \\& \quad = {\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta )}\,ds \Biggr\vert \\& \quad \leq \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}{\mathbb{E}} \biggl\vert \frac{f( \widehat{X_{t_{i-1}}},\theta )(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta )} \biggr\vert \,ds \\& \quad \leq \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\biggl({\mathbb{E}}\biggl[ \frac{f( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}(\theta )}\biggr]^{2}\biggr)^{ \frac{1}{2}}\bigl({\mathbb{E}} \bigl[f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0})\bigr]^{2} \bigr)^{\frac{1}{2}}\,ds. \end{aligned}

From Lemma 1 together with Assumptions 1 and 3, we have

$${\mathbb{E}}\bigl[f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}}, \theta _{0})\bigr]^{2}=O(\Delta ),$$
(23)

and $${\mathbb{E}}[\frac{f(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}( \theta )}]^{2}$$ is bounded.

From the above analysis, it follows that

$${\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{s}},\theta _{0})}{\gamma ^{2}(\theta )}\,ds-\sum ^{n}_{i=1}\frac{f(\widehat{X_{t_{i-1}}},\theta )f( \widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{2}(\theta )}\Delta \Biggr\vert \rightarrow 0,$$
(24)

as $$\Delta \rightarrow 0$$.

The proof is complete. □

### Lemma 3

Under Assumptions 1 and 3, we have

$${\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \frac{f^{2}(\widehat{X_{t_{i-1}}},\theta )}{ \gamma ^{2}(\theta )}\Delta - \int ^{t}_{0}\frac{f^{2}(\widehat{X_{s}}, \theta )}{\gamma ^{2}(\theta )}\,ds \Biggr\vert \rightarrow 0,$$

as $$\Delta \rightarrow 0$$.

### Proof

From the Hölder inequality and Assumption 1 together with the stationarity of the process, one has

\begin{aligned}& {\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \frac{f^{2}(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}(\theta )}\Delta - \int ^{t}_{0}\frac{f^{2}( \widehat{X_{s}},\theta )}{\gamma ^{2}(\theta )}\,ds \Biggr\vert \\& \quad = {\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\biggl[\frac{f^{2}( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}(\theta )}- \frac{f^{2}( \widehat{X_{s}},\theta )}{\gamma ^{2}(\theta )}\biggr]\,ds \Biggr\vert \\& \quad = {\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f^{2}( \widehat{X_{t_{i-1}}},\theta )-f^{2}(\widehat{X_{s}},\theta )}{\gamma ^{2}(\theta )}\,ds \Biggr\vert \\& \quad \leq \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}{\mathbb{E}}\biggl[\frac{ \vert f( \widehat{X_{t_{i-1}}},\theta )+f(\widehat{X_{s}},\theta ) \vert \vert f(\widehat{X_{t_{i-1}}}, \theta )-f(\widehat{X_{s}},\theta ) \vert }{\gamma ^{2}(\theta )} \biggr]\,ds \\& \quad \leq \frac{K_{2}(\theta )}{\gamma ^{2}(\theta )}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\bigl({\mathbb{E}} \vert \widehat{X_{t_{i-1}}}- \widehat{X_{s}} \vert ^{2} \bigr)^{\frac{1}{2}}\,ds, \end{aligned}

where $$\sup \{K_{2}(\theta )\}<\infty$$.

According to the above analysis together with Lemma 1, it follows that

$${\mathbb{E}} \Biggl\vert \sum^{n}_{i=1} \frac{f^{2}(\widehat{X_{t_{i-1}}},\theta )}{ \gamma ^{2}(\theta )}\Delta - \int ^{t}_{0}\frac{f^{2}(\widehat{X_{s}}, \theta )}{\gamma ^{2}(\theta )}\,ds \Biggr\vert \rightarrow 0,$$

as $$\Delta \rightarrow 0$$.

The proof is complete. □
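The statement of Lemma 3 can be explored by simulation. The sketch below is a Monte Carlo illustration under assumed choices: the model $$dX=-\theta X\,dt+\sigma \,dV$$ with $$f(x,\theta )=-\theta x$$, a fine-grid sum standing in for the time integral, and hypothetical function and parameter names. It averages the absolute difference between a left Riemann sum with step $$\Delta$$ and the fine-grid value over several simulated paths.

```python
import math
import random


def mean_riemann_error(coarse_every, n_paths=100, t_end=5.0, dt=0.002,
                       theta=1.0, sigma=1.0, seed=0):
    """Average |left Riemann sum - fine-grid integral| of f(x)^2 = (theta x)^2
    along Euler paths of dX = -theta X dt + sigma dV (ASSUMED model).
    The coarse step is Delta = coarse_every * dt."""
    rng = random.Random(seed)
    n_fine = int(round(t_end / dt))
    sd = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        fine = 0.0    # proxy for the integral over [0, t_end]
        coarse = 0.0  # left Riemann sum with step Delta
        for i in range(n_fine):
            g = (theta * x) ** 2
            fine += g * dt
            if i % coarse_every == 0:
                coarse += g * dt * coarse_every
            x += -theta * x * dt + rng.gauss(0.0, sd)
        total += abs(coarse - fine)
    return total / n_paths


if __name__ == "__main__":
    # Delta = 0.5 versus Delta = 0.01 on the same simulated paths.
    print(mean_riemann_error(250), mean_riemann_error(5))
```

Because both calls use the same seed, the two errors are computed on identical paths, so the comparison isolates the effect of the step $$\Delta$$. The value of `coarse_every` should divide the number of fine steps so that the last coarse interval is complete.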

### Remark 3

By employing the Hölder inequality, the Burkholder–Davis–Gundy inequality and the stationarity of the process, the above lemmas have been proved. These lemmas play a key role in the proof of the following main results.

In the following theorem, the consistency in probability of the maximum likelihood estimator is proved by applying a martingale moment inequality, the Chebyshev inequality, the uniform ergodic theorem and the results of Lemmas 1–3.

### Theorem 1

When $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$,

$$\widehat{\theta _{0}}\overset{P}{\rightarrow }\theta _{0}.$$

### Proof

According to the expression of the approximate likelihood function (17) and Eq. (15) with $$\theta =\theta _{0}$$, it follows that

\begin{aligned} \ell _{n}(\theta ) =& \sum^{n}_{i=1}\frac{f(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}( \theta )} \biggl( \int ^{t_{i}}_{t_{i-1}}f(\widehat{X_{s}},\theta _{0})\,ds+ \gamma (\theta _{0}) \int ^{t_{i}}_{t_{i-1}}dV^{*}_{s} \biggr)- \frac{\Delta }{2}\sum^{n}_{i=1} \frac{f^{2}(\widehat{X_{t_{i-1}}}, \theta )}{\gamma ^{2}(\theta )} \\ =& \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta )f(\widehat{X_{s}},\theta _{0})}{\gamma ^{2}(\theta )}\,ds+\sum ^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}},\theta )\gamma (\theta _{0})}{\gamma ^{2}(\theta )} \,dV^{*}_{s} \\ & {} -\frac{1}{2}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f^{2}( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{2}(\theta )}\,ds \\ =& \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta )(f(\widehat{X_{s}},\theta _{0})-\frac{1}{2}f( \widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{2}(\theta )}\,ds \\ & {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta )\gamma (\theta _{0})}{\gamma ^{2}(\theta )} \,dV^{*}_{s}. \end{aligned}

Then we have

\begin{aligned}& \frac{1}{n\Delta }\ell _{n}(\theta ) \\& \quad =\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )(f(\widehat{X_{s}},\theta _{0})- \frac{1}{2}f(\widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{2}(\theta )}\,ds \\& \qquad {}+\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta _{0})}{\gamma ^{2}(\theta )} \,dV^{*}_{s}. \end{aligned}
(25)

From the martingale moment inequality, it can be checked that

\begin{aligned}& {\mathbb{E}} \Biggl\vert \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta _{0})}{\gamma ^{2}(\theta )} \,dV^{*}_{s} \Biggr\vert ^{2} \\& \quad \leq \frac{1}{(n\Delta )^{2}}C\frac{\gamma ^{2}(\theta _{0})}{\gamma ^{4}(\theta )}{\mathbb{E}}\sum ^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\bigl(f( \widehat{X_{t_{i-1}}}, \theta )\bigr)^{2}\,ds \\& \quad \leq \frac{1}{n\Delta }C_{1} \\& \quad \rightarrow 0, \end{aligned}

where C and $$C_{1}$$ are constants.

By applying the Chebyshev inequality, it can be found that

$$\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta _{0})}{\gamma ^{2}(\theta )} \,dV^{*}_{s}\overset{P}{\rightarrow }0,$$
(26)

when $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$.

From Lemmas 2 and 3 together with the uniform ergodic theorem (see e.g. [22]), one has

\begin{aligned}& \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta )(f(\widehat{X_{s}},\theta _{0})- \frac{1}{2}f(\widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{2}(\theta )}\,ds \\& \quad \overset{P}{\rightarrow } {\mathbb{E}}\biggl[\frac{f(x_{0},\theta )(f(x_{0},\theta _{0})-\frac{1}{2}f(x _{0},\theta ))}{\gamma ^{2}(\theta )}\biggr], \end{aligned}

when $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$.

Hence, it leads to the relation

$$\frac{1}{n\Delta }\ell _{n}(\theta )\overset{P}{\rightarrow } { \mathbb{E}}\biggl[\frac{f(x _{0},\theta )(f(x_{0},\theta _{0})-\frac{1}{2}f(x_{0},\theta ))}{\gamma ^{2}(\theta )}\biggr],$$
(27)

when $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$.

From Assumption 4, it is easy to check that

$$\widehat{\theta _{0}}\overset{P}{\rightarrow }\theta _{0},$$
(28)

when $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$.

The proof is complete. □
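The sampling regime of Theorem 1 (small $$\Delta$$, large $$n\Delta$$) can be illustrated with a small simulation. The sketch below is not the general estimator of the paper: it assumes the drift $$f(x,\theta )=-\theta x$$ and a known constant diffusion coefficient, in which case the maximiser of (17) has the closed form $$-\sum \widehat{X_{t_{i-1}}}(\widehat{X_{t_{i}}}-\widehat{X_{t_{i-1}}})/(\Delta \sum \widehat{X_{t_{i-1}}}^{2})$$. The function name and all numerical values are hypothetical.

```python
import math
import random


def estimate_theta(theta0, n, delta, sigma=1.0, seed=0):
    """Simulate dX = -theta0 X dt + sigma dV by Euler-Maruyama and return the
    maximiser of (17) for the ASSUMED drift f(x, theta) = -theta x with a
    KNOWN constant diffusion coefficient sigma."""
    rng = random.Random(seed)
    sd = sigma * math.sqrt(delta)
    x = 0.0
    ito_sum = 0.0      # sum of x_{i-1} * (x_i - x_{i-1})
    riemann_sum = 0.0  # delta * sum of x_{i-1}^2
    for _ in range(n):
        x_next = x - theta0 * x * delta + rng.gauss(0.0, sd)
        ito_sum += x * (x_next - x)
        riemann_sum += x * x * delta
        x = x_next
    return -ito_sum / riemann_sum


if __name__ == "__main__":
    # Same step delta, increasing observation horizon n * delta.
    for n in (2000, 20000):
        print(n * 0.01, estimate_theta(1.0, n, 0.01))
```

With the step fixed at 0.01, the longer horizon would be expected to give an estimate closer to the true value 1.0, although any single simulated path can deviate.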

In the following theorem, the asymptotic normality of the error of estimation is proved by employing the martingale moment inequality, the Chebyshev inequality, the uniform ergodic theorem and the dominated convergence theorem.

### Theorem 2

When $$\Delta \rightarrow 0$$, $$n^{\frac{1}{2}}\Delta \rightarrow 0$$ and $$n\Delta \rightarrow \infty$$ as $$n\rightarrow \infty$$,

$$\sqrt{n\Delta }(\theta _{0}-\widehat{\theta _{0}}) \overset{d}{\rightarrow }N\biggl(0,\frac{\gamma ^{2}(\theta _{0})}{{\mathbb{E}}[f ^{\prime }(x_{0},\theta _{0})\gamma (\theta _{0})-2f(x_{0},\theta _{0})\gamma ^{\prime }(\theta _{0})]^{2}}\biggr).$$

### Proof

Expanding $$\ell '_{n}(\theta _{0})$$ about $$\widehat{\theta _{0}}$$, it follows that

$$\ell '_{n}(\theta _{0})=\ell '_{n}(\widehat{\theta _{0}})+\ell ''_{n}( \widetilde{\theta }) (\theta _{0}-\widehat{\theta _{0}}),$$
(29)

where $$\widetilde{\theta }$$ is between $$\widehat{\theta _{0}}$$ and $$\theta _{0}$$.

In view of Theorem 1, it is well known that $$\ell '_{n}( \widehat{\theta _{0}})=0$$; then

$$\ell '_{n}(\theta _{0})=\ell ''_{n}(\widetilde{\theta }) (\theta _{0}- \widehat{\theta _{0}}).$$
(30)

Since

\begin{aligned}& \ell ''_{n}(\theta ) \\& \quad = \sum^{n}_{i=1}\frac{f''(\widehat{X_{t_{i-1}}},\theta )\gamma ^{2}( \theta )-4\gamma (\theta )\gamma ^{\prime }(\theta )f'(\widehat{X_{t_{i-1}}}, \theta )}{\gamma ^{4}(\theta )}(\widehat{X_{t_{i}}}-\widehat{X_{t_{i-1}}}) \\& \qquad {} + \sum^{n}_{i=1} \frac{6f(\widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }( \theta ))^{2}-2\gamma (\theta )\gamma ^{\prime \prime }(\theta )f( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}(\theta )}(\widehat{X_{t_{i}}}-\widehat{X_{t_{i-1}}}) \\& \qquad {} + \Delta \sum^{n}_{i=1} \frac{4\gamma (\theta )\gamma ^{\prime }(\theta )f'( \widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}(\theta )} \\& \qquad {} - \Delta \sum^{n}_{i=1} \frac{f''(\widehat{X_{t_{i-1}}},\theta )f( \widehat{X_{t_{i-1}}},\theta )\gamma ^{2}(\theta )+3f^{2}( \widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }(\theta ))^{2}}{\gamma ^{4}( \theta )} \\& \qquad {} + \Delta \sum^{n}_{i=1} \frac{\gamma (\theta )\gamma ^{\prime \prime }(\theta )f ^{2}(\widehat{X_{t_{i-1}}},\theta )-\gamma ^{2}(\theta )(f^{\prime }( \widehat{X_{t_{i-1}}},\theta ))^{2}}{\gamma ^{4}(\theta )} \\& \quad = \sum^{n}_{i=1}\frac{f''(\widehat{X_{t_{i-1}}},\theta )\gamma ^{2}( \theta )-4\gamma (\theta )\gamma ^{\prime }(\theta )f'(\widehat{X_{t_{i-1}}}, \theta )}{\gamma ^{4}(\theta )} \int ^{t_{i}}_{t_{i-1}}f(\widehat{X_{s}}, \theta _{0})\,ds \\& \qquad {} + \sum^{n}_{i=1} \frac{f''(\widehat{X_{t_{i-1}}},\theta )\gamma ^{2}( \theta )-4\gamma (\theta )\gamma ^{\prime }(\theta )f'(\widehat{X_{t_{i-1}}}, \theta )}{\gamma ^{4}(\theta )}\gamma (\theta _{0}) \int ^{t_{i}}_{t_{i-1}}dV^{*}_{s} \\& \qquad {} + \sum^{n}_{i=1} \frac{6f(\widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }( \theta ))^{2}-2\gamma (\theta )\gamma ^{\prime \prime }(\theta )f( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}(\theta )} \int ^{t_{i}}_{t_{i-1}}f(\widehat{X_{s}},\theta _{0})\,ds \\& \qquad {} + \sum^{n}_{i=1} \frac{6f(\widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }( \theta ))^{2}-2\gamma (\theta )\gamma ^{\prime \prime }(\theta )f( \widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}(\theta )}\gamma (\theta _{0}) \int ^{t_{i}}_{t_{i-1}}dV^{*}_{s} \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{4\gamma (\theta )\gamma ^{\prime }(\theta )f'(\widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{t_{i-1}}}, \theta )}{\gamma ^{4}(\theta )}\,ds \\& \qquad {} - \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta )f(\widehat{X_{t_{i-1}}},\theta )\gamma ^{2}(\theta )+3f^{2}(\widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }(\theta ))^{2}}{\gamma ^{4}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{\gamma (\theta )\gamma ^{\prime \prime }(\theta )f^{2}(\widehat{X_{t_{i-1}}},\theta )-\gamma ^{2}(\theta )(f^{\prime }(\widehat{X_{t_{i-1}}},\theta ))^{2}}{\gamma ^{4}(\theta )}\,ds \\& \quad = \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta )(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{2}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{4f'( \widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )(f( \widehat{X_{t_{i-1}}},\theta )-f(\widehat{X_{s}},\theta _{0}))}{\gamma ^{3}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{3f(\widehat{X_{t_{i-1}}}, \theta )(\gamma ^{\prime }(\theta ))^{2}(2f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{4}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta )\gamma (\theta )\gamma ^{\prime \prime }(\theta )(f(\widehat{X_{t_{i-1}}}, \theta )-2f(\widehat{X_{s}},\theta _{0}))}{\gamma ^{3}(\theta )}\,ds \\& \qquad {} - \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{(f^{\prime }( \widehat{X_{t_{i-1}}},\theta ))^{2}}{\gamma ^{2}(\theta )}\,ds \\& \qquad {} + \gamma (\theta _{0})\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta )\gamma ^{2}(\theta )-4\gamma (\theta ) \gamma ^{\prime }(\theta )f'(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}( \theta )} \,dV^{*}_{s} \\& \qquad {} + \gamma (\theta _{0})\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{6f( \widehat{X_{t_{i-1}}},\theta )(\gamma ^{\prime }(\theta ))^{2}-2\gamma ( \theta )\gamma ^{\prime \prime }(\theta )f(\widehat{X_{t_{i-1}}},\theta )}{\gamma ^{4}(\theta )} \,dV^{*}_{s} \end{aligned}

we have

\begin{aligned}& \ell ''_{n}(\theta _{0}) \\& \quad = \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta _{0})(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta _{0})}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{4f'( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})(f( \widehat{X_{t_{i-1}}},\theta _{0})-f(\widehat{X_{s}},\theta _{0}))}{ \gamma ^{3}(\theta _{0})}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{3f(\widehat{X_{t_{i-1}}}, \theta _{0})(\gamma ^{\prime }(\theta _{0}))^{2}(2f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{4}(\theta _{0})}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta _{0})\gamma ^{\prime \prime }(\theta _{0})(f(\widehat{X_{t_{i-1}}},\theta _{0})-2f( \widehat{X_{s}},\theta _{0}))}{\gamma ^{2}(\theta _{0})}\,ds \\& \qquad {} - \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{(f^{\prime }( \widehat{X_{t_{i-1}}},\theta _{0}))^{2}}{\gamma ^{2}(\theta _{0})}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta _{0})\gamma (\theta _{0})-4\gamma ^{\prime }( \theta _{0})f'(\widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{2}(\theta _{0})} \,dV^{*}_{s} \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{6f(\widehat{X_{t_{i-1}}}, \theta _{0})(\gamma ^{\prime }(\theta _{0}))^{2}-2\gamma (\theta _{0})\gamma ^{\prime \prime }(\theta _{0})f(\widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{3}( \theta _{0})} \,dV^{*}_{s}. \end{aligned}

From the same method used in Theorem 1, it is easy to check that

$$\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta _{0})\gamma (\theta _{0})-4\gamma ^{\prime }( \theta _{0})f'(\widehat{X_{t_{i-1}}},\theta _{0})}{\gamma ^{2}(\theta _{0})} \,dV^{*}_{s}\overset{P}{\rightarrow }0$$
(31)

and

$$\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{6f( \widehat{X_{t_{i-1}}},\theta _{0})(\gamma ^{\prime }(\theta _{0}))^{2}-2\gamma (\theta _{0})\gamma ^{\prime \prime }(\theta _{0})f(\widehat{X_{t_{i-1}}},\theta _{0})}{ \gamma ^{3}(\theta _{0})} \,dV^{*}_{s}\overset{P}{\rightarrow }0.$$
(32)

By applying the results of Lemmas 2 and 3 and the uniform ergodic theorem, it follows that

\begin{aligned}& \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f''( \widehat{X_{t_{i-1}}},\theta _{0})(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta _{0})}\,ds \overset{P}{ \rightarrow }0, \end{aligned}
(33)
\begin{aligned}& \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{4f'( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})(f( \widehat{X_{t_{i-1}}},\theta _{0})-f(\widehat{X_{s}},\theta _{0}))}{ \gamma ^{3}(\theta _{0})}\,ds\overset{P}{ \rightarrow }0, \\& \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{3f( \widehat{X_{t_{i-1}}},\theta _{0})(\gamma ^{\prime }(\theta _{0}))^{2}(2f( \widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0}))}{ \gamma ^{4}(\theta _{0})}\,ds \\& \quad \overset{P}{\rightarrow }3\frac{(\gamma ^{\prime }(\theta _{0}))^{2}}{\gamma ^{4}(\theta _{0})}{\mathbb{E}}\bigl[f(x _{0},\theta _{0})\bigr]^{2}, \\& \frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime \prime }(\theta _{0})(f( \widehat{X_{t_{i-1}}},\theta _{0})-2f(\widehat{X_{s}},\theta _{0}))}{ \gamma ^{2}(\theta _{0})}\,ds \\& \quad \overset{P}{\rightarrow }-\frac{(\gamma ^{\prime \prime }(\theta _{0}))^{2}}{\gamma ^{2}(\theta _{0})}{\mathbb{E}}\bigl[f(x _{0},\theta _{0})\bigr]^{2}, \end{aligned}
(34)

and

$$\frac{1}{n\Delta }\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{(f^{\prime }( \widehat{X_{t_{i-1}}},\theta _{0}))^{2}}{\gamma ^{2}(\theta _{0})}\,ds \overset{P}{ \rightarrow }\frac{1}{\gamma ^{2}(\theta _{0})}{\mathbb{E}}\bigl[f ^{\prime }(x_{0}, \theta _{0})\bigr]^{2}.$$
(35)

Therefore, we have

\begin{aligned}& \frac{1}{n\Delta }\ell ''_{n}(\theta _{0}) \\& \quad \overset{P}{\rightarrow } {\mathbb{E}}\biggl[\frac{3(\gamma ^{\prime }(\theta _{0}))^{2}-\gamma ^{2}(\theta _{0})(\gamma ^{\prime \prime }(\theta _{0}))^{2}}{\gamma ^{4}(\theta _{0})} \bigl(f(x_{0}, \theta _{0})\bigr)^{2}+ \frac{1}{\gamma ^{2}(\theta _{0})}\bigl(f^{\prime }(x_{0},\theta _{0}) \bigr)^{2}\biggr]. \end{aligned}

According to the expression of $$\ell ''_{n}(\theta )$$ and by employing the martingale moment inequality, the Chebyshev inequality, the uniform ergodic theorem and the dominated convergence theorem, it follows that

$$\frac{1}{n\Delta }\bigl(\ell ''_{n}( \widetilde{\theta })-\ell ''_{n}(\theta _{0})\bigr)\overset{P}{\rightarrow }0.$$
(36)

Hence, it can be found that

$$\frac{1}{n\Delta }\ell ''_{n}(\widetilde{ \theta }) \overset{P}{\rightarrow } {\mathbb{E}}\biggl[\frac{3(\gamma ^{\prime }(\theta _{0}))^{2}- \gamma ^{2}(\theta _{0})(\gamma ^{\prime \prime }(\theta _{0}))^{2}}{\gamma ^{4}(\theta _{0})} \bigl(f(x_{0},\theta _{0})\bigr)^{2}+ \frac{1}{\gamma ^{2}(\theta _{0})}\bigl(f ^{\prime }(x_{0},\theta _{0}) \bigr)^{2}\biggr].$$
(37)

Since

\begin{aligned}& \ell '_{n}(\theta ) \\& \quad = \sum^{n}_{i=1}\frac{f'(\widehat{X_{t_{i-1}}},\theta )\gamma ( \theta )-2f(\widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )}{\gamma ^{3}(\theta )}( \widehat{X_{t_{i}}}-\widehat{X_{t_{i-1}}}) \\& \qquad {} - \Delta \sum^{n}_{i=1} \frac{f(\widehat{X_{t_{i-1}}},\theta )f^{\prime }( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta )-f^{2}( \widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )}{\gamma ^{3}(\theta )} \\& \quad = \sum^{n}_{i=1}\frac{f'(\widehat{X_{t_{i-1}}},\theta )\gamma ( \theta )-2f(\widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )}{\gamma ^{3}(\theta )} \biggl( \int ^{t_{i}}_{t_{i-1}}f(\widehat{X_{s}},\theta _{0})\,ds+ \gamma (\theta _{0}) \int ^{t_{i}}_{t_{i-1}}dV^{*}_{s}\biggr) \\& \qquad {} - \Delta \sum^{n}_{i=1} \frac{f(\widehat{X_{t_{i-1}}},\theta )f^{\prime }( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta )-f^{2}( \widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )}{\gamma ^{3}(\theta )} \\& \quad = \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f'(\widehat{X_{t_{i-1}}}, \theta )(f(\widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta ))}{\gamma ^{2}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f(\widehat{X_{t_{i-1}}}, \theta )\gamma ^{\prime }(\theta )(f(\widehat{X_{t_{i-1}}},\theta )-2f(\widehat{X_{s}}, \theta _{0}))}{\gamma ^{3}(\theta )}\,ds \\& \qquad {} + \sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\gamma (\theta _{0}) \frac{f'( \widehat{X_{t_{i-1}}},\theta )\gamma (\theta )-2f( \widehat{X_{t_{i-1}}},\theta )\gamma ^{\prime }(\theta )}{\gamma ^{3}(\theta )}\,dV^{*}_{s}, \end{aligned}

it follows that

\begin{aligned}& \frac{1}{\sqrt{n\Delta }}\ell '_{n}(\theta _{0}) \\& \quad = \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f'( \widehat{X_{t_{i-1}}},\theta _{0})(f(\widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta _{0})}\,ds \\& \qquad {} + \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})(f(\widehat{X_{t_{i-1}}}, \theta _{0})-2f(\widehat{X_{s}},\theta _{0}))}{\gamma ^{3}(\theta _{0})}\,ds \\& \qquad {} + \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f'( \widehat{X_{t_{i-1}}},\theta _{0})\gamma (\theta _{0})-2f( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})}{\gamma ^{2}( \theta _{0})} \,dV^{*}_{s}. \end{aligned}

As

\begin{aligned}& {\mathbb{E}} \Biggl\vert \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}} _{t_{i-1}}\frac{f'(\widehat{X_{t_{i-1}}},\theta _{0})(f( \widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0}))}{ \gamma ^{2}(\theta _{0})}\,ds \Biggr\vert \\& \quad \leq \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}} {\mathbb{E}} \biggl\vert \frac{f'(\widehat{X_{t_{i-1}}},\theta _{0})(f( \widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0}))}{ \gamma ^{2}(\theta _{0})} \biggr\vert \,ds \\& \quad \leq \frac{\sum^{n}_{i=1}\int ^{t_{i}}_{t_{i-1}}({\mathbb{E}}[f'( \widehat{X_{t_{i-1}}},\theta _{0})]^{2})^{\frac{1}{2}}({\mathbb{E}}[f( \widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0})]^{2})^{ \frac{1}{2}}\,ds}{\gamma ^{2}(\theta _{0})\sqrt{n\Delta }}, \end{aligned}

it is easy to check that $${\mathbb{E}}[f'(\widehat{X_{t_{i-1}}},\theta _{0})]^{2}$$ is bounded.

From Lemma 1 and Assumption 1, we have

$${\mathbb{E}}\bigl[f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}}, \theta _{0})\bigr]^{2}=O(\Delta ).$$
(38)

Then it follows that

$${\mathbb{E}} \Biggl\vert \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}} _{t_{i-1}}\frac{f'(\widehat{X_{t_{i-1}}},\theta _{0})(f( \widehat{X_{s}},\theta _{0})-f(\widehat{X_{t_{i-1}}},\theta _{0}))}{ \gamma ^{2}(\theta _{0})}\,ds \Biggr\vert \rightarrow 0,$$
(39)

when $$\Delta \rightarrow 0$$, $$n^{\frac{1}{2}}\Delta \rightarrow 0$$ and $$n\Delta \rightarrow \infty$$ as $$n\rightarrow \infty$$.
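The rate condition $$n^{\frac{1}{2}}\Delta \rightarrow 0$$ can be read off from the last bound. As a sketch, suppose that $${\mathbb{E}}[f'(\widehat{X_{t_{i-1}}},\theta _{0})]^{2}\leq C_{1}$$ and that the $$O(\Delta )$$ term in (38) is bounded by $$C_{2}\Delta$$ uniformly in $$i$$ and $$s$$; the constants $$C_{1}$$ and $$C_{2}$$ are introduced here only for illustration. Since the sum of the $$n$$ integrals runs over a total length $$n\Delta$$, the bound then becomes

$$\frac{\sum^{n}_{i=1}\int ^{t_{i}}_{t_{i-1}}C_{1}^{\frac{1}{2}}(C_{2}\Delta )^{\frac{1}{2}}\,ds}{\gamma ^{2}(\theta _{0})\sqrt{n\Delta }}=\frac{n\Delta \,(C_{1}C_{2})^{\frac{1}{2}}\Delta ^{\frac{1}{2}}}{\gamma ^{2}(\theta _{0})\sqrt{n\Delta }}=\frac{(C_{1}C_{2})^{\frac{1}{2}}}{\gamma ^{2}(\theta _{0})}\,n^{\frac{1}{2}}\Delta ,$$

which tends to zero when $$n^{\frac{1}{2}}\Delta \rightarrow 0$$.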

By applying the Chebyshev inequality, it can be found that

$$\frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f'( \widehat{X_{t_{i-1}}},\theta _{0})(f(\widehat{X_{s}},\theta _{0})-f( \widehat{X_{t_{i-1}}},\theta _{0}))}{\gamma ^{2}(\theta _{0})}\,ds \overset{P}{ \rightarrow }0.$$
(40)

By applying the same methods, we have

$$\frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})(f(\widehat{X_{t_{i-1}}}, \theta _{0})-2f(\widehat{X_{s}},\theta _{0}))}{\gamma ^{3}(\theta _{0})}\,ds \overset{P}{ \rightarrow }0.$$
(41)

It is obvious that

\begin{aligned}& \frac{1}{\sqrt{n\Delta }}\sum^{n}_{i=1} \int ^{t_{i}}_{t_{i-1}}\frac{f'( \widehat{X_{t_{i-1}}},\theta _{0})\gamma (\theta _{0})-2f( \widehat{X_{t_{i-1}}},\theta _{0})\gamma ^{\prime }(\theta _{0})}{\gamma ^{2}( \theta _{0})} \,dV^{*}_{s} \\& \quad \overset{d}{\rightarrow }N\biggl(0,\frac{1}{\gamma ^{2}(\theta _{0})}{\mathbb{E}} \bigl[f^{\prime }(x_{0},\theta _{0})\gamma (\theta _{0})-2f(x_{0},\theta _{0})\gamma ^{\prime }(\theta _{0})\bigr]^{2}\biggr). \end{aligned}

Hence, we have

$$\frac{1}{\sqrt{n\Delta }}\ell '_{n}(\theta _{0}) \overset{d}{\rightarrow }N\biggl(0,\frac{1}{\gamma ^{2}(\theta _{0})}{\mathbb{E}}\bigl[f ^{\prime }(x_{0},\theta _{0})\gamma (\theta _{0})-2f(x_{0},\theta _{0})\gamma ^{\prime }(\theta _{0})\bigr]^{2}\biggr),$$
(42)

when $$\Delta \rightarrow 0$$, $$n^{\frac{1}{2}}\Delta \rightarrow 0$$ and $$n\Delta \rightarrow \infty$$ as $$n\rightarrow \infty$$.

From the above analysis, it can be checked that

$$\sqrt{n\Delta }(\theta _{0}-\widehat{\theta _{0}}) \overset{d}{\rightarrow }N\biggl(0,\frac{\gamma ^{2}(\theta _{0})}{{\mathbb{E}}[f ^{\prime }(x_{0},\theta _{0})\gamma (\theta _{0})-2f(x_{0},\theta _{0})\gamma ^{\prime }(\theta _{0})]^{2}}\biggr),$$
(43)

when $$\Delta \rightarrow 0$$, $$n^{\frac{1}{2}}\Delta \rightarrow 0$$ and $$n\Delta \rightarrow \infty$$ as $$n\rightarrow \infty$$.

The proof is complete. □

## 4 Example

Consider the incomplete information nonlinear stochastic system described as follows:

$$\textstyle\begin{cases} d X_{t}= \theta \frac{X_{t}}{\sqrt{1+X_{t}^{2}}}\,dt+ d W_{t},\qquad X _{0}\sim u_{\theta }, \\ d Y_{t}= X_{t}\,dt+ d V_{t}, \end{cases}$$

where θ is an unknown parameter, $$(W_{t},t\geq 0)$$ and $$(V_{t},t\geq 0)$$ are independent Wiener processes, $$u_{\theta }$$ is the invariant measure, $$\{Y_{t}\}$$ is observable, and $$\{X_{t}\}$$ is unobservable. The equation for $$X_{t}$$ is called a hyperbolic diffusion equation.

Firstly, it is easy to verify that $$X_{t}$$ is an ergodic diffusion process.

Then, since

\begin{aligned}& \biggl\vert \theta \frac{x}{\sqrt{1+x^{2}}}-\theta \frac{y}{\sqrt{1+y^{2}}} \biggr\vert \leq 2 \vert \theta \vert \vert x-y \vert , \\& \biggl\vert \theta \frac{x}{\sqrt{1+x^{2}}} \biggr\vert \leq \vert \theta \vert \bigl(1+ \vert x \vert \bigr), \end{aligned}

and $${\mathbb{E}}\bigl[\theta \frac{X_{0}}{\sqrt{1+X_{0}^{2}}}\bigl(\theta _{0}\frac{X_{0}}{\sqrt{1+X_{0}^{2}}}-\frac{1}{2}\theta \frac{X_{0}}{\sqrt{1+X_{0}^{2}}}\bigr)\bigr]$$ attains its unique maximum at $$\theta =\theta _{0}$$, it is easy to check that the coefficients satisfy Assumptions 1–4.
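Both checks can be made explicit. The following is a sketch in which the shorthand $$g(x)=x/\sqrt{1+x^{2}}$$ is introduced here for illustration and $$g(X_{0})$$ is assumed not to vanish almost surely. Since $$g'(x)=(1+x^{2})^{-\frac{3}{2}}\in (0,1]$$, the mean value theorem gives $$\vert g(x)-g(y) \vert \leq \vert x-y \vert$$, and $$\vert g(x) \vert \leq 1$$ because $$\vert x \vert \leq \sqrt{1+x^{2}}$$, so the drift is Lipschitz continuous and of at most linear growth. For the identifiability condition,

$${\mathbb{E}}\biggl[\theta g(X_{0})\biggl(\theta _{0}g(X_{0})-\frac{1}{2}\theta g(X_{0})\biggr)\biggr]=\biggl(\theta \theta _{0}-\frac{1}{2}\theta ^{2}\biggr){\mathbb{E}}\bigl[g^{2}(X_{0})\bigr],$$

which is a concave quadratic in θ. Its derivative $$(\theta _{0}-\theta ){\mathbb{E}}[g^{2}(X_{0})]$$ vanishes only at $$\theta =\theta _{0}$$, so the maximum is unique.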

Therefore, it is easy to check that

$$\widehat{\theta _{0}}\overset{P}{\rightarrow }\theta _{0},$$

when $$\Delta \rightarrow 0$$, $$n\rightarrow \infty$$ and $$n\Delta \rightarrow \infty$$.
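As a numerical illustration of this consistency statement, the following Python sketch treats a deliberately simplified setting. It assumes that $$X_{t}$$ is observed directly, so no Kalman filtering step is involved and $$\gamma$$ plays no role; it takes $$\theta _{0}=-1$$ so that the drift is mean-reverting; and it starts the path at 0 rather than from the invariant measure. The function names, step size and sample size are illustrative choices, not taken from the paper. Under these assumptions the approximate likelihood built from an Itô sum and a Riemann sum is quadratic in θ and has a closed-form maximizer.

```python
import math
import numpy as np


def drift_shape(x):
    """g(x) = x / sqrt(1 + x^2), so the drift is theta * g(x)."""
    return x / math.sqrt(1.0 + x * x)


def simulate_hyperbolic(theta, n, delta, seed=0):
    """Euler-Maruyama path of dX = theta * g(X) dt + dW on n steps of size delta."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, math.sqrt(delta), size=n)  # Wiener increments
    path = np.empty(n + 1)
    path[0] = 0.0  # assumption: start at 0 instead of the invariant law
    x_curr = 0.0
    for i in range(n):
        x_curr = x_curr + theta * drift_shape(x_curr) * delta + noise[i]
        path[i + 1] = x_curr
    return path


def estimate_theta(path, delta):
    """Maximizer of sum g(X_{i-1}) * theta * dX_i - 0.5 * delta * sum (theta * g(X_{i-1}))^2."""
    g = path[:-1] / np.sqrt(1.0 + path[:-1] ** 2)
    ito_sum = np.sum(g * np.diff(path))      # approximates the stochastic integral
    riemann_sum = delta * np.sum(g ** 2)     # approximates the time integral
    return ito_sum / riemann_sum


theta0 = -1.0
path = simulate_hyperbolic(theta0, n=200_000, delta=0.01, seed=0)
theta_hat = estimate_theta(path, delta=0.01)
print(f"true theta = {theta0}, estimate = {theta_hat:.3f}")
```

Rerunning with a smaller `delta` and a larger `n`, so that the horizon `n * delta` grows, should move the estimate closer to `theta0`, in line with the conditions $$\Delta \rightarrow 0$$ and $$n\Delta \rightarrow \infty$$. The partially observed case would replace `path` by the filtered state.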

## 5 Conclusion

The aim of this paper is to estimate the parameter of incomplete information stochastic systems from discrete observations. The suboptimal estimation of the state has been obtained by constructing the extended Kalman filtering equation, and the approximate likelihood function has been given. The consistency of the maximum likelihood estimator and the asymptotic normality of the error of estimation have been proved by applying the martingale moment inequality, Hölder’s inequality, the Chebyshev inequality, the Burkholder–Davis–Gundy inequality and the uniform ergodic theorem. Further research topics will include parameter estimation for incomplete information stochastic systems driven by a Lévy process.

## References

1. Aït-Sahalia, Y.: Testing continuous-time models of the spot interest rate. Rev. Financ. Stud. 9, 385–426 (1996)

2. Aït-Sahalia, Y.: Maximum likelihood estimation of discretely sampled diffusions: a closed-form approximation approach. Econometrica 70, 223–262 (2002)

3. Barczy, M., Pap, G.: Asymptotic behavior of maximum likelihood estimator for time inhomogeneous diffusion processes. J. Stat. Plan. Inference 140, 1576–1593 (2010)

4. Øksendal, B.: Stochastic Differential Equations: An Introduction with Applications. Springer, Berlin (2003)

5. Bishwal, J.P.N.: Parameter Estimation in Stochastic Differential Equations. Springer, London (2008)

6. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81, 637–654 (1973)

7. Chang, J.Y., Chen, S.X.: On the approximate maximum likelihood estimation for diffusion processes. Ann. Stat. 39, 2820–2851 (2011)

8. Cox, J., Ingersoll, J., Ross, S.: An intertemporal general equilibrium model of asset prices. Econometrica 53, 363–384 (1985)

9. Cox, J., Ingersoll, J., Ross, S.: A theory of the term structure of interest rates. Econometrica 53, 385–408 (1985)

10. Chan, K.C., Karolyi, G.A., Longstaff, F.A., Sanders, A.B.: An empirical comparison of alternative models of the short-term interest rate. J. Finance 47, 1209–1227 (1992)

11. Constantinides, G.M.: A theory of the nominal term structure of interest rates. Rev. Financ. Stud. 5, 531–552 (1992)

12. Deck, T., Theting, T.G.: Robust parameter estimation for stochastic differential equations. Acta Appl. Math. 84, 279–314 (2004)

13. Dong, H.: On infinity estimation of randomly occurring faults for a class of nonlinear time varying systems with fading channels. IEEE Trans. Autom. Control 61, 479–484 (2016)

14. Kan, X., Shu, H.S., Che, Y.: Asymptotic parameter estimation for a class of linear stochastic systems using Kalman–Bucy filtering. Math. Probl. Eng. 2012, Article ID 342705 (2012). https://doi.org/10.1155/2012/342705

15. Kannan, R.: Orientation estimation based on LKF using differential state. IEEE Sens. J. 15, 6156–6163 (2015)

16. Kutoyants, Y.A.: Statistical Inference for Ergodic Diffusion Processes. Springer, London (2004)

17. Long, H.: Parameter estimation for a class of stochastic differential equations driven by small stable noises from discrete observations. Acta Math. Sci. 30, 645–663 (2010)

18. Long, H., Shimizu, Y., Sun, W.: Least squares estimators for discretely observed stochastic processes driven by small Lévy noises. J. Multivar. Anal. 116, 422–439 (2013)

19. Lu, X., Xie, L., Zhang, H., et al.: Robust Kalman filtering for discrete-time systems with measurement delay. IEEE Trans. Circuits Syst. II, Express Briefs 54, 522–526 (2007)

20. Mbalawata, I.S., Särkkä, S., Haario, H.: Parameter estimation in stochastic differential equations with Markov chain Monte Carlo and non-linear Kalman filtering. Comput. Appl. Stat. 28, 1195–1223 (2013)

21. Overbeck, L., Rydén, T.: Estimation in Cox–Ingersoll–Ross model. Econom. Theory 13, 430–461 (1997)

22. Prakasa Rao, B.L.S.: Statistical Inference for Diffusion Type Processes. Arnold, London (1999)

23. Protter, P.E.: Stochastic Integration and Differential Equations: Stochastic Modelling and Applied Probability, 2nd edn. Applications of Mathematics (New York), vol. 21. Springer, Berlin (2004)

24. Rossi, G.D.: Maximum likelihood estimation of the Cox–Ingersoll–Ross model using particle filters. Comput. Econ. 36, 1–16 (2010)

25. Shimizu, Y.: Estimation of parameters for discretely observed diffusion processes with a variety of rates for information. Ann. Inst. Stat. Math. 64, 545–575 (2012)

26. Uchida, M., Yoshida, N.: Adaptive estimation of an ergodic diffusion process based on sampled data. Stoch. Process. Appl. 122, 2885–2924 (2012)

27. Vasicek, O.: An equilibrium characterization of the term structure. J. Financ. Econ. 5, 177–188 (1977)

28. Wang, Z., Lam, J., Liu, X.H.: Filtering for a class of nonlinear discrete-time stochastic systems with state delays. J. Comput. Appl. Math. 201, 153–163 (2007)

29. Wei, C., Shu, H.S., Liu, Y.R.: Gaussian estimation for discretely observed Cox–Ingersoll–Ross model. Int. J. Gen. Syst. 45, 561–574 (2016)

30. Wei, C., Shu, H.S.: Maximum likelihood estimation for the drift parameter in diffusion processes. Stoch. Int. J. Probab. Stoch. Process. 88, 699–710 (2016)

31. Wen, J.H., Wang, X.J., Mao, S.H., et al.: Maximum likelihood estimation of McKean–Vlasov stochastic differential and its application. Appl. Math. Comput. 274, 237–246 (2015)

32. Yoshida, N.: Asymptotic behavior of M-estimator and related random field for diffusion process. Ann. Inst. Stat. Math. 42, 221–251 (1990)

33. Yu, J., Phillips, P.C.B.: A Gaussian approach for continuous time models of the short-term interest rate. Econom. J. 4, 210–224 (2001)

## Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61403248.

## Author information

### Contributions

The author read and approved the final manuscript.

### Corresponding author

Correspondence to Chao Wei.

## Ethics declarations

### Competing interests

The author declares that they have no competing interests.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Wei, C. Estimation for incomplete information stochastic systems from discrete observations. Adv Differ Equ 2019, 227 (2019). https://doi.org/10.1186/s13662-019-2169-2