Our approach to finding the optimal control is based on the definition of a saddle point, given below as in [20] with slight changes to suit our problem.
Definition 4.1

(i)
A pair ({u}^{\ast}(x(t,\omega )),{v}^{\ast}(x(t,\omega )))\in {U}_{1}\times {U}_{2} is optimal and forms a saddle point of the game over the interval [0,T] with respect to x(t,\omega )\in {\mathbb{R}}^{n} if
\begin{array}{c}J(t,x(t,\omega ),{u}^{\ast}(x(t,\omega )),v(x(t,\omega )))\hfill \\ \phantom{\rule{1em}{0ex}}\le J(t,x(t,\omega ),{u}^{\ast}(x(t,\omega )),{v}^{\ast}(x(t,\omega )))\hfill \\ \phantom{\rule{1em}{0ex}}\le J(t,x(t,\omega ),u(x(t,\omega )),{v}^{\ast}(x(t,\omega )))\hfill \end{array}
for all u(x(t,\omega ))\in {U}_{1} and v(x(t,\omega ))\in {U}_{2}, where {U}_{1} and {U}_{2} are nonempty sets of admissible controls.

(ii)
The upper value of the game at any path x(t,\omega ) and time t\in [0,T] is defined by
{V}^{\ast}(x(t,\omega ))=\underset{u\in {U}_{1}}{inf}\underset{v\in {U}_{2}}{sup}J(t,x(t,\omega ),u(x(t,\omega )),v(x(t,\omega ))),
and the lower value of the game is
{V}_{\ast}(x(t,\omega ))=\underset{v\in {U}_{2}}{sup}\underset{u\in {U}_{1}}{inf}J(t,x(t,\omega ),u(x(t,\omega )),v(x(t,\omega )))
and if
{V}^{\ast}(x(t,\omega ))={V}_{\ast}(x(t,\omega ))\equiv V(x(t,\omega )),
then V(x(t,\omega )) is called the value of the game.
The objective is to find the optimal admissible controls, {u}^{\ast}(x(t,\omega ))\in {U}_{1} and {v}^{\ast}(x(t,\omega ))\in {U}_{2}, such that V(x(t,\omega )) satisfies Definition 4.1 for {U}_{1}\subset {E}^{p} and {U}_{2}\subset {E}^{q}.
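As a concrete numerical illustration of Definition 4.1 (a sketch, not part of the model above), the toy cost J(u,v)=u^2-v^2 has a saddle point at (0,0); evaluating it on a grid confirms that the upper and lower values coincide, so this toy game has a value:

```python
import numpy as np

# Toy illustration of Definition 4.1: for J(u, v) = u^2 - v^2 on a grid,
# the upper value inf_u sup_v J and the lower value sup_v inf_u J coincide,
# so the game has a value (saddle point at u = v = 0).
u = np.linspace(-1.0, 1.0, 201)
v = np.linspace(-1.0, 1.0, 201)
J = u[:, None] ** 2 - v[None, :] ** 2    # J[i, j] = J(u_i, v_j)

upper = J.max(axis=1).min()              # inf_u sup_v J
lower = J.min(axis=0).max()              # sup_v inf_u J
print(upper, lower)                      # both are 0 at the saddle point
```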
Theorem 4.1 (Bellman principle of optimality)
If {u}^{\ast}(x(t,\omega )) is optimal over the interval [0,T] starting at an initial state {x}_{0}(\omega ), then {u}^{\ast}(x(t,\omega )) is necessarily optimal over the subinterval [t,t+dt] for any dt such that T-t\ge dt>0.
For the proof of the above theorem, refer to [15].
Applying Theorem 4.1 and Definition 4.1 to the value of the game V(x(t,\omega )), we have that
\begin{array}{rcl}V(x(t,\omega ))& =& \underset{u}{\min}\underset{v}{\max}{E}^{t}\Big\{\varphi (x(T,\omega ))+{\int}_{t}^{T}{e}^{-\beta \tau}\mathcal{L}(\tau ,x(\tau ,\omega ),u(x(\tau ,\omega )),v(x(\tau ,\omega )))\,d\tau \Big\}\\ & =& {E}^{t}\{\mathcal{L}(t,x(t,\omega ),{u}^{\ast}(x(t,\omega )),{v}^{\ast}(x(t,\omega )))\,dt+{e}^{-\beta \,dt}V(x(t+dt,\omega ))\}\\ & =& \mathcal{L}(t,x(t,\omega ),{u}^{\ast}(x(t,\omega )),{v}^{\ast}(x(t,\omega )))\,dt\\ && {}+(1-\beta \,dt){E}^{t}\left[V(x(t+dt,\omega ))\right].\end{array}
(5)
We need to calculate the expectation of the function V(x(t+dt,\omega )). Approximating the function V(x(\cdot ,\omega )) using Taylor’s formula, we have
\begin{array}{rcl}V(x(t+dt,\omega ))& =& V(x(t,\omega ))+{V}^{\prime}(x(t,\omega ))[x(t+dt,\omega )-x(t,\omega )]\\ && {}+\frac{1}{2}{V}^{\prime\prime}(x(t,\omega )){[x(t+dt,\omega )-x(t,\omega )]}^{2}+\cdots .\end{array}
Ignoring the terms of higher powers and letting dx(t,\omega )=x(t+dt,\omega )-x(t,\omega ), we get
V(x(t+dt,\omega ))=V(x(t,\omega ))+{V}^{\prime}(x(t,\omega ))\,dx(t,\omega )+\frac{1}{2}{V}^{\prime\prime}(x(t,\omega )){[dx(t,\omega )]}^{2}.
(6)
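The step from (6) to (7) relies on the Ito rules E[dW]=0 and {(dW)}^{2}=dt. As an illustration only, a small Monte Carlo sketch confirms these moments numerically for a Brownian increment:

```python
import numpy as np

# Monte Carlo check of the Ito rules used between (6) and (7):
# E[dW] = 0 and E[(dW)^2] = dt for a Brownian increment dW ~ N(0, dt).
rng = np.random.default_rng(0)
dt = 1e-3
dW = rng.normal(0.0, np.sqrt(dt), size=1_000_000)

print(abs(dW.mean()))    # close to 0 (order sqrt(dt / N))
print(dW.var())          # close to dt = 1e-3
```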
Substituting the stochastic equation (4) into equation (6) and using the properties of Ito’s lemma, the function V(x(t+dt,\omega )) becomes
\begin{array}{rcl}V(x(t+dt,\omega ))& =& V(x(t,\omega ))+\big[\big(f(x(t,\omega ))+G(\omega )u(x(t,\omega ))\\ && {}+H(\omega )v(x(t,\omega ))\big){V}^{\prime}(x(t,\omega ))\\ && {}+\frac{1}{2}{\sigma}^{2}(x(t,\omega ),t){V}^{\prime\prime}(x(t,\omega ))\big]\,dt\\ && {}+\sigma (x(t,\omega ),t){V}^{\prime}(x(t,\omega ))\,dW(t).\end{array}
(7)
Taking the expectation of equation (7), we have
\begin{array}{rl}{E}^{t}\left[V(x(t+dt,\omega ))\right]=& V(x(t,\omega ))+\big[\big(f(x(t,\omega ))+G(\omega )u(x(t,\omega ))\\ & {}+H(\omega )v(x(t,\omega ))\big){V}^{\prime}(x(t,\omega ))\\ & {}+\frac{1}{2}{\sigma}^{2}(x(t,\omega ),t){V}^{\prime\prime}(x(t,\omega ))\big]\,dt.\end{array}
(8)
Substituting equation (8) into equation (5) yields
\begin{array}{rcl}\beta V(x(t,\omega ))& =& \mathcal{L}(t,x(t,\omega ),u(x(t,\omega )),v(x(t,\omega )))+[f(x(t,\omega ))\\ && {}+G(\omega )u(x(t,\omega ))+H(\omega )v(x(t,\omega ))]{V}^{\prime}(x(t,\omega ))\\ && {}+\frac{1}{2}Tr\left[\sigma {\sigma}^{T}(x(t,\omega ),t){V}^{\prime\prime}(x(t,\omega ))\right].\end{array}
(9)
The above equation is the Bellman equation, similar to the one in [21]; it is a parabolic differential equation that admits simple solutions only for some simple processes and utility functions. In this paper, instead of solving the Bellman equation directly, which is not always easy, we adopt the idea of [22]. From the Bellman equation we can solve for the optimal values u(x(t,\omega ))\in {U}_{1} and v(x(t,\omega ))\in {U}_{2} by taking the derivatives with respect to u(x(t,\omega )) and v(x(t,\omega )):
\begin{array}{c}0={\mathcal{L}}_{u}+{G}^{T}(\omega ){V}_{x}(x(t,\omega )),\hfill \\ R(t,\omega )u(x(t,\omega ))=-{G}^{T}(\omega ){V}_{x}(x(t,\omega )),\hfill \\ {u}^{\ast}(x(t,\omega ))=-{R}^{-1}(t,\omega ){G}^{T}(\omega ){V}_{x}(x(t,\omega )).\hfill \end{array}
(10)
As for the maximizer v(x(t,\omega )), we have
\begin{array}{c}0={\mathcal{L}}_{v}+{H}^{T}(\omega ){V}_{x}(x(t,\omega )),\hfill \\ S(t,\omega )v(x(t,\omega ))={H}^{T}(\omega ){V}_{x}(x(t,\omega )),\hfill \\ {v}^{\ast}(x(t,\omega ))={S}^{-1}(t,\omega ){H}^{T}(\omega ){V}_{x}(x(t,\omega )).\hfill \end{array}
(11)
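The first-order conditions (10)-(11) reduce to linear solves. The following sketch, with hypothetical weight matrices R, S, G, H and gradient V_x chosen purely for illustration, computes both controls and verifies stationarity:

```python
import numpy as np

# Numerical sketch of the first-order conditions (10)-(11) with
# illustrative (hypothetical) weights and matrices:
#   u* = -R^{-1} G^T V_x,   v* = S^{-1} H^T V_x.
R = np.array([[2.0]])          # minimizer's control weight, R > 0
S = np.array([[4.0]])          # maximizer's control weight, S > 0
G = np.array([[1.0], [0.5]])   # n x p input matrix (n = 2, p = 1)
H = np.array([[0.3], [1.0]])   # n x q input matrix (q = 1)
V_x = np.array([1.0, 2.0])     # value-function gradient at some state x

u_star = -np.linalg.solve(R, G.T @ V_x)
v_star = np.linalg.solve(S, H.T @ V_x)

# Stationarity: R u* + G^T V_x = 0 and S v* - H^T V_x = 0 hold exactly.
print(u_star, v_star)
```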
Substituting the values of {u}^{\ast}(x(t,\omega )) and {v}^{\ast}(x(t,\omega )) into ℒ in equation (9) and collecting like terms yields the expression
\begin{array}{rcl}\beta V(x(t,\omega ))& =& q(x(t,\omega ))\\ && {}-\frac{1}{2}{V}_{x}^{T}(x(t,\omega ))G(\omega ){R}^{-1}(t,\omega ){G}^{T}(\omega ){V}_{x}(x(t,\omega ))\\ && {}+\frac{1}{2}{V}_{x}^{T}(x(t,\omega ))H(\omega ){S}^{-1}(t,\omega ){H}^{T}(\omega ){V}_{x}(x(t,\omega ))\\ && {}+{V}_{x}^{T}(x(t,\omega ))f(x(t,\omega ))\\ && {}+\frac{1}{2}Tr\left[{V}_{xx}(x(t,\omega ))\sigma (x(t,\omega )){\sigma}^{T}(x(t,\omega ))\right].\end{array}
(12)
Equation (12) is a nonlinear second order partial differential equation (PDE), and its solution is challenging because the equation is nonlinear and high-dimensional. As assumed in [13], there is a connection between the controls and the variance of the Brownian noise. Considering the difference in our control weights, we have the following cases:

(i)
H{S}^{-1}{H}^{T}-G{R}^{-1}{G}^{T}<0 implies that more weight is on the minimizing control than on the maximizing control variable.

(ii)
H{S}^{-1}{H}^{T}-G{R}^{-1}{G}^{T}>0 implies more weight on the maximizing control than on the minimizing control variable.

(iii)
H{S}^{-1}{H}^{T}-G{R}^{-1}{G}^{T}=0 implies that the weights of the controls are equivalent; hence it is an ideal situation for a minimax optimal control.
The intuition we get from [13] is that the higher the variance, the lower the weight of the controls, hence ‘cheap’ controls, and vice versa. In our case we want to strike a balance such that both players attain their optima. The variance of the Brownian noise here is given by \sigma {\sigma}^{T}>0, therefore we want to attain a situation whereby \lambda (t)[G{R}^{-1}{G}^{T}-H{S}^{-1}{H}^{T}]=\sigma {\sigma}^{T} for all x\in {\mathbb{R}}^{n} and t\in [0,T], where the difference of the control coefficients equals the variance of the noise. Our assumption on the balancing parameter differs from the one suggested by other authors, as in [13] and [15], where the balancing term is just a constant parameter. In our case, the balancing variable \lambda (t) depends on t so that the equality is attained at every time instant, since the variance term varies with time.
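In the scalar case the balancing condition can be solved for \lambda (t) directly, \lambda (t)=\sigma {\sigma}^{T}/(G{R}^{-1}{G}^{T}-H{S}^{-1}{H}^{T}), provided the control-weight gap is positive. A minimal sketch with hypothetical values:

```python
import numpy as np

# Scalar sketch of the balancing condition
#   lambda(t) [G R^{-1} G^T - H S^{-1} H^T] = sigma sigma^T
# with illustrative (hypothetical) scalars; since sigma varies with t,
# lambda(t) must vary with t as well, as argued above.
G, R, H, S = 1.0, 2.0, 0.5, 4.0
gap = G / R * G - H / S * H        # G R^{-1} G^T - H S^{-1} H^T

t = np.linspace(0.0, 1.0, 5)
sigma = 0.3 + 0.1 * t              # time-varying noise level sigma(t)
lam = sigma ** 2 / gap             # lambda(t) enforcing the equality

print(lam)
```

A positive \lambda (t) requires G{R}^{-1}{G}^{T}>H{S}^{-1}{H}^{T}, i.e. case (i) above.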
Suppose that
V(x(t,\omega ))=-\lambda (t)\log \mathrm{\Phi}(x(t,\omega )).
(13)
We determine all the partial derivatives of the new value function given in equation (13),
{V}_{x}(x(t,\omega ))=-\lambda (t)\frac{1}{\mathrm{\Phi}(x(t,\omega ))}{\mathrm{\Phi}}_{x}(x(t,\omega ))
(14)
and
{V}_{xx}(x(t,\omega ))=-\lambda (t)\frac{{\mathrm{\Phi}}_{xx}(x(t,\omega ))\mathrm{\Phi}(x(t,\omega ))-{\mathrm{\Phi}}_{x}(x(t,\omega )){\mathrm{\Phi}}_{x}^{T}(x(t,\omega ))}{{\mathrm{\Phi}}^{2}(x(t,\omega ))}.
(15)
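The derivatives (14)-(15) can be verified symbolically; the following sketch (scalar case, illustrative only) checks both:

```python
import sympy as sp

# Symbolic check of (14)-(15) in the scalar case: with
# V = -lambda * log Phi(x), the first two derivatives are
# V_x = -lambda Phi_x / Phi and V_xx = -lambda (Phi_xx Phi - Phi_x^2) / Phi^2.
x, lam = sp.symbols('x lambda', positive=True)
Phi = sp.Function('Phi')(x)

V = -lam * sp.log(Phi)
V_x = sp.diff(V, x)
V_xx = sp.diff(V, x, 2)

assert sp.simplify(V_x + lam * sp.diff(Phi, x) / Phi) == 0
assert sp.simplify(
    V_xx + lam * (sp.diff(Phi, x, 2) * Phi - sp.diff(Phi, x) ** 2) / Phi ** 2
) == 0
print("(14) and (15) verified symbolically")
```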
Therefore, substituting (13), (14), (15) and taking into consideration the assumption that \lambda (t)[G{R}^{-1}{G}^{T}-H{S}^{-1}{H}^{T}]=\sigma {\sigma}^{T} for all t\in [0,T] into the nonlinear PDE given in (12), we have
\begin{array}{rcl}\beta \mathrm{\Phi}(x(t,\omega ))\log \mathrm{\Phi}(x(t,\omega ))& =& -\frac{1}{\lambda (t)}q(x(t,\omega ))\mathrm{\Phi}(x(t,\omega ))\\ && {}+{\mathrm{\Phi}}_{x}^{T}(x(t,\omega ))f(x(t,\omega ))\\ && {}+\frac{1}{2}Tr\left[{\mathrm{\Phi}}_{xx}(x(t,\omega ))\sigma (x(t,\omega )){\sigma}^{T}(x(t,\omega ))\right],\end{array}
(16)
which yields a second order quasilinear PDE with the boundary condition given as
\mathrm{\Phi}(x(T,\omega ))=\exp \left(-\frac{1}{\lambda (T)}\varphi (x(T,\omega ))\right).
(17)
If a solution \mathrm{\Phi}(x(t,\omega )) of equation (16) exists, then we have the results given below.
Theorem 4.2 If \mathrm{\Phi}(x(t,\omega )) satisfies equation (16), then the transformed optimal controls are given as
{u}^{\ast}(x(t,\omega ))=\lambda (t){R}^{-1}(t,\omega ){G}^{T}(\omega )\frac{{\mathrm{\Phi}}_{x}(x(t,\omega ))}{\mathrm{\Phi}(x(t,\omega ))}
and
{v}^{\ast}(x(t,\omega ))=-\lambda (t){S}^{-1}(t,\omega ){H}^{T}(\omega )\frac{{\mathrm{\Phi}}_{x}(x(t,\omega ))}{\mathrm{\Phi}(x(t,\omega ))}
for the value
V(x(t,\omega ))=-\lambda (t)\log \mathrm{\Phi}(x(t,\omega )),
where
\lambda (t)
satisfies
\lambda (t)[G{R}^{-1}{G}^{T}-H{S}^{-1}{H}^{T}]=\sigma {\sigma}^{T},\phantom{\rule{1em}{0ex}}\mathrm{\forall}x\in {\mathbb{R}}^{n},t\in [0,T].
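A quick symbolic check (scalar case, illustrative only) that the transformed controls of Theorem 4.2 agree with the first-order conditions (10)-(11) under V=-\lambda \log \mathrm{\Phi}:

```python
import sympy as sp

# Symbolic check that the transformed controls of Theorem 4.2 agree with
# (10)-(11) under V = -lambda log Phi (scalar x, u, v for the sketch).
x, lam, R, S, G, H = sp.symbols('x lambda R S G H', positive=True)
Phi = sp.Function('Phi')(x)
V_x = sp.diff(-lam * sp.log(Phi), x)

u_from_V = -R ** -1 * G * V_x                       # u* = -R^{-1} G^T V_x
u_from_Phi = lam * R ** -1 * G * sp.diff(Phi, x) / Phi
assert sp.simplify(u_from_V - u_from_Phi) == 0

v_from_V = S ** -1 * H * V_x                        # v* = S^{-1} H^T V_x
v_from_Phi = -lam * S ** -1 * H * sp.diff(Phi, x) / Phi
assert sp.simplify(v_from_V - v_from_Phi) == 0
print("Theorem 4.2 controls consistent with (10)-(11)")
```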
One observes that {u}^{\ast}(x(t,\omega )) is now positive while {v}^{\ast}(x(t,\omega )) is negative; this is because the problem has been transformed from a minimax into a maximin problem. The PDE in (16) is still difficult to solve in terms of the independent variables x and t, so in this paper we transform it into an ODE for which, in most cases, a solution can be obtained. Consider a one-dimensional problem, thus n=1, and fix t; the equation then depends on x alone. This leads to a nonlinear ODE, and before solving it we make the following assumptions.
(A:1)

(i)
\mathrm{\Phi}(x(t,\omega ),t), f(x(t,\omega )) and q(x(t,\omega )) are nonnegative functions.

(ii)
\mathrm{\Phi}(x(t,\omega ),t) is Lipschitz continuous for all (t,x)\in ([0,T]\times \mathbb{R}) and \omega \in \mathrm{\Omega}.

(iii)
f(x(t,\omega )) and q(x(t,\omega )) are continuous and bounded functions for all x\in \mathbb{R}.
Let
\sigma (x(t,\omega )){\sigma}^{T}(x(t,\omega ))=\theta (x(t,\omega ))>0.
Multiplying throughout by 2{\theta}^{-1}(x(t,\omega )), we have
\frac{{d}^{2}\mathrm{\Phi}(x(t,\omega ))}{d{x}^{2}}+\tilde{f}(x)\frac{d\mathrm{\Phi}(x(t,\omega ))}{dx}=\frac{2}{\lambda}\tilde{q}(x)\mathrm{\Phi}(x(t,\omega ))+r(x)\mathrm{\Phi}(x(t,\omega ))\log \mathrm{\Phi}(x(t,\omega )),
(18)
where
\begin{array}{c}\tilde{f}(x(t,\omega ))=2f(x(t,\omega )){\theta}^{-1}(x(t,\omega )),\hfill \\ \tilde{q}(x(t,\omega ))=q(x(t,\omega )){\theta}^{-1}(x(t,\omega ))\hfill \end{array}
and
r(x(t,\omega ))=2\beta {\theta}^{-1}(x(t,\omega )).
For transformation and simplicity purposes, we would represent the following functions as \mathcal{U}=\mathrm{\Phi}(x(t,\omega ),t) and \mathcal{V}=\frac{d\mathrm{\Phi}(x(t,\omega ),t)}{dx}.
This yields the following first order ODE:
\{\begin{array}{l}\dot{\mathcal{U}}=\mathcal{V},\\ \dot{\mathcal{V}}=-\tilde{f}(x)\mathcal{V}+\frac{2}{\lambda}\tilde{q}(x)\mathcal{U}+r(x)\mathcal{U}\log \mathcal{U},\end{array}
(19)
which gives the equation
F(x,\mathcal{U},\mathcal{V})=\left(\begin{array}{c}\mathcal{V}\\ -\tilde{f}(x)\mathcal{V}+\frac{2}{\lambda}\tilde{q}(x)\mathcal{U}+r(x)\mathcal{U}\log \mathcal{U}\end{array}\right).
(20)
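For illustration, the system (19)-(20) can be integrated numerically in x once \tilde{f}, \tilde{q}, r and \lambda are specified; the constant coefficients below are hypothetical choices made only to keep the sketch self-contained:

```python
import numpy as np

# Numerical sketch of the first-order system (19) with illustrative
# (hypothetical) constant coefficients f~, q~, r and lambda, integrated
# in x by a classical fourth-order Runge-Kutta scheme.
lam = 1.0
f_t = lambda x: 0.5          # stand-in for f~(x)
q_t = lambda x: 0.2          # stand-in for q~(x)
r = lambda x: 0.1            # stand-in for r(x)

def F(x, Y):
    # Y = (U, V) with U = Phi and V = dPhi/dx; U must stay positive
    # for log U to be defined (assumption (A:1)(i)).
    U, V = Y
    return np.array([V, -f_t(x) * V + (2.0 / lam) * q_t(x) * U
                        + r(x) * U * np.log(U)])

def rk4(F, Y0, x0, x1, n=1000):
    h, Y, x = (x1 - x0) / n, np.array(Y0, float), x0
    for _ in range(n):
        k1 = F(x, Y)
        k2 = F(x + h / 2, Y + h / 2 * k1)
        k3 = F(x + h / 2, Y + h / 2 * k2)
        k4 = F(x + h, Y + h * k3)
        Y, x = Y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4), x + h
    return Y

U1, V1 = rk4(F, [1.0, 0.0], 0.0, 1.0)   # start from Phi = 1, Phi' = 0
print(U1, V1)
```

With these positive coefficients both \mathcal{U} and \mathcal{V} grow away from the starting point, consistent with (A:1).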
Given the following conditions:
(A:2)

(i)
U={U}_{1}\times {U}_{2}\subset {\mathbb{R}}^{m} is a compact set.

(ii)
I\in \mathbb{R} is bounded.

(iii)
H=[a,b] with a>0 and b>0, and
F:U\times I\times H\to {\mathbb{R}}^{2}\phantom{\rule{1em}{0ex}}\left(\text{with the}\ {\parallel \cdot \parallel}_{{\ell}_{1}}\ \text{norm}\right).
By the Lipschitz condition in (A:1), we have
\begin{array}{rcl}\parallel F(x,\mathcal{U},\mathcal{V})-F(x,\tilde{\mathcal{U}},\tilde{\mathcal{V}})\parallel & \le & \parallel \mathcal{V}-\tilde{\mathcal{V}}\parallel +\tilde{f}(x)\parallel \mathcal{V}-\tilde{\mathcal{V}}\parallel \\ && {}+\frac{2}{\lambda}\tilde{q}(x)\parallel \mathcal{U}-\tilde{\mathcal{U}}\parallel +r(x)\parallel \mathcal{U}\log \mathcal{U}-\tilde{\mathcal{U}}\log \tilde{\mathcal{U}}\parallel ,\end{array}
(21)
we know, by the mean value theorem applied to \xi \mapsto \xi \log \xi , that
\begin{array}{rcl}\parallel \mathcal{U}\log \mathcal{U}-\tilde{\mathcal{U}}\log \tilde{\mathcal{U}}\parallel & =& |\log \xi +1|\parallel \mathcal{U}-\tilde{\mathcal{U}}\parallel \phantom{\rule{1em}{0ex}}\text{for some}\ \xi \in [a,b]\\ & \le & \underset{\xi \in [a,b]}{\max}|1+\log \xi |\parallel \mathcal{U}-\tilde{\mathcal{U}}\parallel .\end{array}
(22)
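The mean-value bound (22) can be spot-checked numerically; since \frac{d}{d\xi}(\xi \log \xi )=1+\log \xi is monotone, its absolute maximum on [a,b] is attained at an endpoint (the sketch uses |1+\log \xi | to cover a<1/e):

```python
import numpy as np

# Numerical check of the mean-value bound (22) on [a, b]: since
# d/dxi (xi log xi) = 1 + log xi is monotone increasing, its absolute
# maximum on [a, b] is attained at an endpoint.
a, b = 0.5, 3.0
C = max(abs(1 + np.log(a)), abs(1 + np.log(b)))

rng = np.random.default_rng(1)
U, Ut = rng.uniform(a, b, (2, 10_000))
lhs = np.abs(U * np.log(U) - Ut * np.log(Ut))
rhs = C * np.abs(U - Ut)
print(bool(np.all(lhs <= rhs + 1e-12)))   # True: the bound holds
```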
Setting
X=\left(\begin{array}{c}\mathcal{U}\\ \mathcal{V}\end{array}\right),
the system becomes
\{\begin{array}{l}\dot{X}=F(x,X),\\ X({x}_{0}(\omega ))={X}_{0}(\omega )\in {I}_{0}\times {H}_{0}.\end{array}
(23)
Hence, since F is Lipschitz in X, a solution exists, with the terminal condition given by
\mathrm{\Phi}({x}_{0}(\omega ),t)=\exp \left(-\frac{1}{\lambda (t)}\varphi ({x}_{0}(\omega ))\right).
(24)
In summary we have the following results.
Theorem 4.3
Consider a special case for the equation
\begin{array}{rcl}\beta \mathrm{\Phi}(x(t,\omega ))\log \mathrm{\Phi}(x(t,\omega ))& =& -\frac{1}{\lambda (t)}q(x(t,\omega ))\mathrm{\Phi}(x(t,\omega ))\\ && {}+{\mathrm{\Phi}}_{x}^{T}(x(t,\omega ))f(x(t,\omega ))\\ && {}+\frac{1}{2}Tr\left[{\mathrm{\Phi}}_{xx}(x(t,\omega ))\sigma (x(t,\omega )){\sigma}^{T}(x(t,\omega ))\right]\end{array}
for a onedimensional problem and for
\mathrm{\Phi}(x(t,\omega ),t)=\mathrm{\Phi}(x(t,\omega )).
Then, assuming that (A:1) and (A:2) hold, at least one solution exists.
The solution in (23) is not necessarily unique; to attain uniqueness, more boundary conditions for the ODE must be given. For a one-dimensional problem at least one solution exists, while for n\ge 2 the equation is a PDE, which is difficult to solve.
4.1 Iterative optimal control estimates
From Theorem 4.3, consider the estimated value function to be given as
\begin{array}{rcl}\mathrm{\Phi}(x({t}_{j+1},\omega ))& =& {\int}_{\partial \mathrm{\Upsilon}}\rho (\mathrm{\Upsilon}\mid {x}_{j})\exp \left(-\frac{1}{\lambda (T)}\varphi (x(T,\omega ))\right)\\ && {}\times \exp \left({\int}_{{t}_{j}}^{{t}_{N-1}}({\tilde{f}}_{j}+{\tilde{Q}}_{j})\,dt\right)d\mathrm{\Upsilon},\end{array}
(25)
where
\mathrm{\Upsilon}=(x({t}_{j},\omega ),x({t}_{j+1},\omega ),\dots ,x({t}_{N1},\omega ))
and
d\mathrm{\Upsilon}=(dx({t}_{j},\omega )\phantom{\rule{0.2em}{0ex}}dx({t}_{j+1},\omega )\cdots \phantom{\rule{0.2em}{0ex}}dx({t}_{N1},\omega )).
The expectation of the value function is driven by the stochastic differential equation (4). The function \rho (\mathrm{\Upsilon}\mid {x}_{j}) in equation (25) is the probability density function of the transitions, and the function {\tilde{Q}}_{j} will be defined later.
Because of the noise in the problem, we cannot know future paths or future control values with certainty. This does not mean we must give up: although future paths cannot be known exactly, we may estimate them, and hence the future control values, in order to attain the optima, since the controls depend on the path.
The continuous time interval is divided into small, equal discrete intervals, assuming the trajectory is not distorted in any way; that is, let
{x}_{j+1}(\omega )-{x}_{j}(\omega )=x({t}_{j+1},\omega )-x({t}_{j},\omega )\phantom{\rule{1em}{0ex}}\text{for all}\ {t}_{j}\in [\epsilon ,T-\epsilon ]\ \text{as}\ \epsilon \to 0.
Suppose the transition between the paths is given by
\begin{array}{rcl}\rho (\mathrm{\Upsilon}\mid {x}_{j})& =& \rho ({x}_{N-1},\dots ,{x}_{j+1}\mid {x}_{j})\\ & =& \prod _{j=0}^{N-1}\rho ({x}_{j+1}\mid {x}_{j}),\end{array}
where the transitions are independent and j=0 is the initial state.
(26)
The above equation is the joint probability density function of the sample path from {x}_{j} to {x}_{N-1}. The transitions of the sample paths are Markovian, as they depend only on the current state {x}_{j} at time {t}_{j}. Following the work of [15] closely, we take the noise term to be Gaussian distributed with mean zero and variance \theta (x)=\sigma {\sigma}^{T} as given earlier. Therefore,
\rho (\mathrm{\Upsilon}\mid {x}_{j})=\frac{\exp (-\frac{1}{2}{\sum}_{j=0}^{N-1}{\parallel \frac{{x}_{j+1}(\omega )-{x}_{j}(\omega )}{{\delta}_{j}}-{g}_{j}(\omega )\parallel}^{2}{\delta}_{j}{\theta}_{j}^{-1}(\omega ))}{{\prod}_{j=0}^{N-1}{((2\pi ){\theta}_{j}(\omega ))}^{\frac{1}{2}}}
(27)
for {\delta}_{j}={t}_{j+1}-{t}_{j}, the increment in time t.
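The discretized density (27) is a product of Gaussian increment densities and can be evaluated directly. The sketch below uses a hypothetical scalar drift g and variance \theta ; note that it normalizes each increment by \sqrt{2\pi {\theta}_{j}{\delta}_{j}}, the standard Gaussian normalization for an increment of variance {\theta}_{j}{\delta}_{j}:

```python
import numpy as np

# Sketch of the discretized transition density (27): a product of
# Gaussians for increments x_{j+1} - x_j ~ N(g_j delta_j, theta_j delta_j),
# with illustrative (hypothetical) drift g and variance theta (scalar case).
def path_density(x, t, g, theta):
    """rho(Upsilon | x_0) for a sampled scalar path x at times t."""
    delta = np.diff(t)
    incr = np.diff(x)
    # exponent: -1/2 sum ||(x_{j+1}-x_j)/delta_j - g_j||^2 delta_j / theta_j
    expo = -0.5 * np.sum((incr / delta - g) ** 2 * delta / theta)
    norm = np.prod(np.sqrt(2.0 * np.pi * theta * delta))
    return np.exp(expo) / norm

t = np.linspace(0.0, 1.0, 6)          # 5 increments
x = np.array([0.0, 0.1, 0.25, 0.2, 0.35, 0.4])
g, theta = 0.3, 0.4                   # hypothetical drift and variance
print(path_density(x, t, g, theta))
```

A path that follows the drift exactly maximizes the exponent, so it receives the highest density.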
Hence we have the following results given as a lemma.
Lemma 4.1 From both Theorem 4.2 and Theorem 4.3, and assuming that the transitions are given by equation (27), we give the iterative optimal controls as
{u}^{\ast}(\omega )={R}_{j}^{-1}(\omega ){G}^{T}(\omega )\frac{\exp ({A}_{j}(\omega )+{B}_{j}(\omega ))}{{\int}_{\partial \mathrm{\Upsilon}}\exp ({A}_{j}(\omega )+{B}_{j}(\omega ))\,d\mathrm{\Upsilon}}
and
{v}^{\ast}(\omega )=-{S}_{j}^{-1}(\omega ){H}^{T}(\omega )\frac{\exp ({A}_{j}(\omega )+{B}_{j}(\omega ))}{{\int}_{\partial \mathrm{\Upsilon}}\exp ({A}_{j}(\omega )+{B}_{j}(\omega ))\,d\mathrm{\Upsilon}}
for the estimated value function
\begin{array}{rcl}\mathrm{\Phi}(x({t}_{j+1},\omega ))& =& {\int}_{\partial \mathrm{\Upsilon}}\rho (\mathrm{\Upsilon}\mid {x}_{j})\exp \left(-\frac{1}{\lambda (T)}\varphi (x(T,\omega ))\right)\\ && {}\times \exp \left(\sum _{j=0}^{N-1}({\tilde{f}}_{j}+{\tilde{Q}}_{j}){\delta}_{j}\right)d\mathrm{\Upsilon},\end{array}
where
{A}_{j}(\omega )=-\frac{1}{2}\sum _{j=0}^{N-1}{\parallel \frac{{x}_{j+1}(\omega )-{x}_{j}(\omega )}{{\delta}_{j}}-{g}_{j}(\omega )\parallel}^{2}{\delta}_{j}{\theta}_{j}^{-1}(\omega )
and
{B}_{j}(\omega )=\sum _{j=0}^{N-1}({\tilde{f}}_{j}+{\tilde{Q}}_{j}){\delta}_{j}.
Proof From Theorem 4.2, suppose that the solution is given as an estimated iterative value function in equation (25). Consider the discrete paths of the optimal trajectory given as
\begin{array}{rcl}\mathrm{\Phi}(x({t}_{j+1},\omega ))& =& {\int}_{\partial \mathrm{\Upsilon}}\rho (\mathrm{\Upsilon}\mid {x}_{j})\exp \left(-\frac{1}{\lambda (T)}\varphi (x(T,\omega ))\right)\\ && {}\times \exp \left({\int}_{{t}_{j}}^{{t}_{N-1}}({\tilde{f}}_{j}+{\tilde{Q}}_{j})\,dt\right)d\mathrm{\Upsilon}.\end{array}
(28)
Now, substituting equation (27) into equation (25), we have
\begin{array}{rcl}\mathrm{\Phi}(x({t}_{j},\omega ))& =& {\int}_{\partial \mathrm{\Upsilon}}\frac{\exp (-\frac{1}{\lambda (T)}\varphi (x(T,\omega )))}{{\prod}_{j=0}^{N-1}{((2\pi ){\theta}_{j})}^{\frac{1}{2}}}\\ && {}\times \exp \left(-\frac{1}{2}\sum _{j=0}^{N-1}{\parallel \frac{{x}_{j+1}-{x}_{j}}{{\delta}_{j}}-{g}_{j}\parallel}^{2}{\delta}_{j}{\theta}_{j}^{-1}\right)\\ && {}\times \exp \left(\sum _{j=0}^{N-1}({\tilde{f}}_{j}+{\tilde{Q}}_{j}){\delta}_{j}\right)d\mathrm{\Upsilon}\end{array}
(29)
for
\tilde{Q}(x)=Cr(x)+\frac{2}{\lambda}\tilde{q}(x),
where
C=\underset{\xi \in (a,b)}{max}(1+log\xi ).
We know that
\tilde{f}(x)=2f(x){\theta}^{-1}(x),\phantom{\rule{2em}{0ex}}\tilde{q}(x)=q(x){\theta}^{-1}(x)
and
r(x)=2\beta {\theta}^{-1}(x).
Therefore, we have
\begin{array}{rcl}\mathrm{\Phi}(x({t}_{j},\omega ))& =& {\int}_{\partial \mathrm{\Upsilon}}\frac{\exp (-\frac{1}{\lambda (T)}\varphi (x(T,\omega )))}{{\prod}_{j=0}^{N-1}{((2\pi ){\theta}_{j})}^{\frac{1}{2}}}\\ && {}\times \left[\exp \left(-\frac{1}{2}\sum _{j=0}^{N-1}{\parallel \frac{{x}_{j+1}-{x}_{j}}{{\delta}_{j}}-{g}_{j}\parallel}^{2}{\delta}_{j}{\theta}_{j}^{-1}\right)\right]\\ && {}\times \left[\exp \left(\sum _{j=0}^{N-1}({\tilde{f}}_{j}+{\tilde{Q}}_{j}){\delta}_{j}\right)\right]d\mathrm{\Upsilon}.\end{array}
(30)
Let
\begin{array}{c}{A}_{j}(\omega )=-\frac{1}{2}\sum _{j=0}^{N-1}{\parallel \frac{{x}_{j+1}(\omega )-{x}_{j}(\omega )}{{\delta}_{j}}-{g}_{j}(\omega )\parallel}^{2}{\delta}_{j}{\theta}_{j}^{-1}(\omega ),\hfill \\ {B}_{j}(\omega )=\sum _{j=0}^{N-1}({\tilde{f}}_{j}+{\tilde{Q}}_{j}){\delta}_{j}\hfill \end{array}
and
{f}_{j}(\omega )=f(x({t}_{j},\omega )).
Similarly, extending that representation to other functions, we may express the iterative value of the game as
{\mathrm{\Phi}}_{j}(\omega )={\int}_{\partial \mathrm{\Upsilon}}\frac{\exp (-\frac{1}{\lambda (T)}\varphi (x(T,\omega )))}{{\prod}_{j=0}^{N-1}{((2\pi ){\theta}_{j})}^{\frac{1}{2}}}\left[\exp ({A}_{j}(\omega )+{B}_{j}(\omega ))\right]\,d\mathrm{\Upsilon}.
(31)
Applying Theorem 4.2, we obtain the optimal iterative control estimates, which completes the proof. □
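The estimate (31) lends itself to Monte Carlo evaluation: sampling paths from the Gaussian transitions (27) absorbs the density factor, leaving an average of the remaining integrand. In the sketch below the drift, variance, running term and terminal cost are hypothetical stand-ins chosen for illustration, not the paper's model:

```python
import numpy as np

# Monte Carlo sketch of the iterative value estimate (31): sample paths
# from the Gaussian transitions (27) and average the remaining integrand;
# the density factor exp(A_j) / prod(2 pi theta_j)^{1/2} is absorbed by
# the sampling itself. All model ingredients below are hypothetical.
rng = np.random.default_rng(2)
N, M = 20, 20_000                   # time steps, sample paths
T, lam = 1.0, 0.5
delta = T / N
g, theta = 0.1, 0.3                 # drift and noise variance (illustrative)
fQ = lambda x: 0.2 * x ** 2         # stand-in for (f~_j + Q~_j)
phi = lambda x: x ** 2              # terminal cost (illustrative)

x = np.zeros(M)                     # all paths start at x_0 = 0
B = np.zeros(M)                     # accumulates sum (f~_j + Q~_j) delta_j
for _ in range(N):
    B += fQ(x) * delta
    x += g * delta + rng.normal(0.0, np.sqrt(theta * delta), M)

Phi0 = np.mean(np.exp(-phi(x) / lam) * np.exp(B))
print(Phi0)
```

Each sampled path contributes exp(-\varphi ({x}_{T})/\lambda )\exp ({B}_{j}); averaging over paths approximates the integral against \rho (\mathrm{\Upsilon}\mid {x}_{j}).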