Backward reachability approach to state-constrained stochastic optimal control problem for jump-diffusion models

In this paper, we consider the stochastic optimal control problem for jump-diffusion models with state constraints. In general, the value function of such problems is a discontinuous viscosity solution of the associated Hamilton-Jacobi-Bellman (HJB) equation, since regularity cannot be guaranteed at the boundary of the state constraint. By adapting the stochastic target theory, we obtain an equivalent representation of the original value function by means of the backward reachable set. We then show that this backward reachable set can be characterized by the zero-level set of the auxiliary value function of an unconstrained stochastic control problem, which includes two additional unbounded controls as a consequence of the martingale representation theorem. We prove that the auxiliary value function is the unique continuous viscosity solution of the associated HJB equation, which is a second-order nonlinear partial integro-differential equation (PIDE). Our paper provides an explicit way to characterize the original (possibly discontinuous) value function as the zero-level set of the continuous solution of the auxiliary HJB equation. The proof of existence and uniqueness requires a new technique due to the unbounded control sets and the singularity of the corresponding Lévy measure in the nonlocal operator of the HJB equation.


Introduction
Let B and Ñ be a standard Brownian motion and an E-marked compensated Poisson random measure, respectively, which are mutually independent. The problem studied in this paper is to minimize, over u ∈ U_{t,T}, the objective functional

J(t, a; u) = E[ ∫_t^T l(s, x^{t,a;u}_s, u_s) ds + m(x^{t,a;u}_T) ],   (1.2)

subject to the state constraint (K is a non-empty closed subset of R^n)

x^{t,a;u}_s ∈ K, ∀s ∈ [t, T], P-a.s.   (1.3)

The precise problem formulation is given in Sect. 2. The problem in (1.4) can then be referred to as the stochastic optimal control problem for jump-diffusion systems with state constraints.
The main results of the paper can be summarized as follows:
• The first main result is that the value function in (1.4) can be equivalently represented by the zero-level set of W (see Theorems 3.1 and 3.2), i.e., by (1.5), where R_t is the backward reachable set of the stochastic target problem with state constraints (see (1.7)), and W (defined in (3.7)) is a continuous value function of the auxiliary stochastic control problem that includes the unbounded control sets A_{t,T} × B_{t,T};
• The second main result is that the auxiliary value function W is the unique continuous viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation (1.6) with suitable boundary conditions (see Theorems 4.1 and 5.1), which is a second-order nonlinear partial integro-differential equation (PIDE) that includes two unbounded control variables (α, β) ∈ R^r × G².
• Together, the first and second main results imply that we can characterize the original value function (1.4) using (1.5) and the solution of (1.6).
(Deterministic and stochastic) control problems with state constraints have been studied extensively in the literature; see [1,7,14,15,20,22,25,27,28,33,34] and the references therein. In particular, as discussed in [11,15,28,33], under some conditions, the value function of the state-constrained stochastic control problem is only a discontinuous viscosity solution of the associated constrained HJB equation, with a complex boundary condition at ∂K (the boundary of K), as regularity cannot be guaranteed at ∂K. None of these references studies the equivalent representation of the corresponding value function by a continuous function; moreover, their control spaces are bounded, and they consider only deterministic systems or SDEs in a Brownian setting without jumps.
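In outline, the level-set representation (1.5) takes the following form (a sketch consistent with the description above; the precise statement is given in Theorems 3.1 and 3.2):

```latex
% Sketch of (1.5): V is recovered from the zero-level set of W.
\mathcal{R}_t = \bigl\{ (a,b) \in \mathbb{R}^n \times \mathbb{R} : W(t,a,b) = 0 \bigr\},
\qquad
V(t,a) = \inf \bigl\{ b \in \mathbb{R} : W(t,a,b) = 0 \bigr\}.
```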
Viability theory for deterministic and stochastic systems can be viewed as an alternative approach to state-constrained problems [3-5,19], and its extension to jump-diffusion models was studied in [31,39]. However, these works focus only on the viability property of the state constraint (without optimizing an objective functional), their control spaces are bounded, and some additional technical assumptions (e.g., see [31, (H.3)]) are essentially required.
Recently, the state-constrained problem was studied via the backward reachability approach in [11]. One remarkable feature of [11] is that it provides an explicit method for characterizing the original (possibly discontinuous) value function in terms of the zero-level set of the auxiliary value function, which is continuous, since the auxiliary value function is induced from an unconstrained control problem. However, the model used in [11] is an SDE driven by Brownian motion without jumps, which is a special case of (1.1). Moreover, the HJB equation in [11] is a local equation, which is also a special case of (1.6) without the nonlocal integral term (the second line of (1.6)). The aim of this paper is to generalize the results of [11] to the case of jump-diffusion systems. As discussed below, these generalizations turn out not to be straightforward, due to the jump diffusions in (1.1) and the presence of the nonlocal operator in the HJB equation (1.6).
Our first main result, given in (1.5), is obtained via the stochastic target theory. In particular, using the equivalence between stochastic optimal control and stochastic target problems, we show (1.5) (see Theorems 3.1 and 3.2), where R_t is the backward reachable set with the state constraint, given by

R_t := { (a, b) ∈ R^n × R : ∃(u, α, β) ∈ U_{t,T} × A_{t,T} × B_{t,T} such that y^{u,α,β}_{T;t,a,b} ≥ m(x^{t,a;u}_T), P-a.s., and x^{t,a;u}_s ∈ K, ∀s ∈ [t, T], P-a.s. },   (1.7)

with (y^{u,α,β}_{s;t,a,b})_{s∈[t,T]} being an auxiliary state process controlled by the additional control processes (α, β) ∈ A_{t,T} × B_{t,T}, which take values in unbounded control spaces. The main technical tool used to show the equivalence in (1.5) via (1.7) is the martingale representation theorem for general Lévy processes, through which the additional (unbounded) controls (α, β) ∈ A_{t,T} × B_{t,T} are introduced. It should be mentioned that [11] also used the result of [13] (where only (u, α) ∈ U_{t,T} × A_{t,T} appeared in [11]), and we extend [11] to the case of jump-diffusion models. This extension is not straightforward, since we have to obtain new estimates for SDEs with jump diffusions (Lemmas 2.1 and 3.1) and for the additional control variables (α, β) ∈ A_{t,T} × B_{t,T} (Remark 3.1). We mention that, in addition to the application of the martingale representation theorem, these steps are essential to prove (1.5) and the properties of W in (1.5) (see Theorems 3.1 and 3.2 as well as the results in Sect. 3.3). Moreover, (1.5) in Theorem 3.2 relies on the existence of optimal controls for jump-diffusion systems, and our paper presents a new existence result for general optimal control problems with jump diffusions (see Theorem A.1 in the Appendix), which has not been reported in the existing literature. We note that the results on the existence of optimal controls in [11,38] are applicable only to linear SDEs without jumps.
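Although the display for the auxiliary process is not reproduced above, in the stochastic target literature (e.g., [13]) the process (y_s) controlled by (α, β) typically takes the following form; this is a sketch, and the drift −l is our assumption based on the structure of the objective functional:

```latex
% Typical auxiliary dynamics in stochastic target problems (sketch;
% the drift -l is an assumption, not the paper's exact display).
dy^{u,\alpha,\beta}_{s;t,a,b}
  = - l\bigl(s, x^{t,a;u}_s, u_s\bigr)\,ds
  + \alpha_s^{\top}\,dB_s
  + \int_E \beta_s(e)\,\tilde N(de,ds),
\qquad
y^{u,\alpha,\beta}_{t;t,a,b} = b.
```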
The second main result of this paper shows that the auxiliary value function W is the unique continuous viscosity solution of the HJB equation (1.6) (see Theorems 4.1 and 5.1), where W is defined in Sect. 3.2 (see (3.7)). Therefore, using the solution of (1.6), the explicit characterization of the original value function V in (1.4) can be obtained through (1.5). We mention that the proofs of existence and uniqueness of the viscosity solution in Theorems 4.1 and 5.1 must differ from those for the case without jumps in [11]. Specifically, Theorem 4.1, the existence proof for (1.6), requires the dynamic programming principle and the application of Itô's formula for general Lévy-type stochastic integrals to test functions. In fact, unlike [11, Theorem 4.3], Theorem 4.1 has to deal with two different stochastic integrals (with respect to the Brownian motion and the (compensated) Poisson random measure) and their quadratic variations to obtain the desired inequalities in the definition of viscosity solutions. Such an extended existence analysis is not presented in [11, Theorem 4.3], and our paper provides a different proof in Theorem 4.1.
Regarding the uniqueness proof in Theorem 5.1, the approach developed for the case without jumps in [11, Theorem 4.6] (which also relies on [10,17]) cannot be adopted directly, since the HJB equation (1.6) includes both the local term (the first line of (1.6)) and the nonlocal (integral) operator involving the singular Lévy measure π (the second line of (1.6)), the latter being induced by the jump diffusions. Note also that in classical stochastic optimal control problems with jump diffusions without state constraints (K = R^n), the corresponding control space is assumed to be a compact set [18,31,32,35]. Hence, those approaches cannot be used directly to prove uniqueness of the viscosity solution for the HJB equation (1.6).
Based on the discussion above, we need a new approach to prove uniqueness of the viscosity solution for the HJB equation (1.6). Our strategy in Theorem 5.1 is to use the equivalent definition of viscosity solutions in terms of (super- and sub)jets, where the nonlocal integral operator is decomposed into a singular part evaluated at the test function and a nonsingular part evaluated at the jets (see Lemma 6.3). We then obtain the desired result for the nonlocal singular part with the help of the regularity of test functions and the estimates in Remark 3.1. The unboundedness of β ∈ G² in the nonlocal nonsingular part is resolved through an appropriate construction of the comparison functions in (6.12) and proper doubling-of-variables estimates based on [21, Proposition 3.7]. In addition, we convert the second-order local part (the first line of (1.6)) into an equivalent spectral-radius form, by which the unboundedness with respect to α ∈ R^r can be handled (see Lemma 6.1). Combining all these steps, we obtain the comparison principle for viscosity sub- and supersolutions (see Theorem 5.1), which implies uniqueness of the viscosity solution for (1.6) (see Corollary 5.1).
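To illustrate the spectral-radius conversion mentioned above (our illustrative sketch, not the paper's Lemma 6.1): for a symmetric matrix A ∈ S^r and q ∈ R^r, a supremum over unbounded α is finite only under a sign condition on A, which turns the pointwise supremum into an eigenvalue constraint:

```latex
% Why unbounded alpha forces a spectral condition (illustrative sketch):
\sup_{\alpha \in \mathbb{R}^r}
  \bigl\{ \langle \alpha, A\alpha \rangle + \langle q, \alpha \rangle \bigr\}
  =
  \begin{cases}
    -\tfrac{1}{4}\,\langle q, A^{\dagger} q \rangle < \infty,
      & \text{if } \lambda_{\max}(A) \le 0 \text{ and } q \in \operatorname{range}(A),\\[2pt]
    +\infty, & \text{otherwise},
  \end{cases}
```

so the supremum over α in the HJB equation can be enforced through a sign (spectral) condition on the second-order coefficient matrix rather than a pointwise supremum over an unbounded set.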
The rest of the paper is organized as follows. The notation and the precise problem statement are given in Sect. 2. In Sect. 3, we obtain the equivalent representation of (1.4) given in (1.5). In Sect. 4, we show that the auxiliary value function W is the continuous viscosity solution of the HJB equation in (1.6). The uniqueness of the viscosity solution for (1.6) is presented in Sect. 5, and its proof is provided in Sect. 6. The Appendix provides several different conditions for the existence of optimal controls for jump-diffusion systems.

Notation and problem statement

Notation
Let R^n be the n-dimensional Euclidean space. For x, y ∈ R^n, x^⊤ denotes the transpose of x, ⟨x, y⟩ is the inner product, and |x| := ⟨x, x⟩^{1/2}. Let S^n be the set of n × n symmetric matrices.
Let Tr(A) be the trace of a square matrix A ∈ R^{n×n}. Let ‖·‖_F be the Frobenius norm, i.e., ‖A‖_F := Tr(AA^⊤)^{1/2} for A ∈ R^{n×m}. Let I_n be the n × n identity matrix. In various places of the paper, the exact value of a positive constant C may vary from line to line; it depends mainly on the coefficients in Assumptions 1, 2, and 3, the terminal time T, and the initial condition, but is independent of the specific choice of control.
Let (Ω, F, P) be a complete probability space carrying the natural filtration F := {F_s, t ≤ s ≤ T}, generated by the following two mutually independent stochastic processes and augmented by all the P-null sets in F: (i) an r-dimensional standard Brownian motion B defined on [t, T]; and (ii) an E-marked right-continuous Poisson random measure N on E × [t, T], whose compensated measure Ñ(A, s), s ∈ [t, T], is the associated compensated F_s-martingale (Poisson) measure of N for any A ∈ B(E). Here, π is a σ-finite Lévy measure on (E, B(E)) satisfying ∫_E (1 ∧ |e|²) π(de) < ∞.
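As a concrete illustration of the integrability condition ∫_E (1 ∧ |e|²) π(de) < ∞, the sketch below checks it numerically for the one-sided α-stable measure π(de) = e^{−1−α} de on (0, ∞) — a standard example of our own choosing, not taken from the paper; the function names and truncation parameters are likewise ours:

```python
import math

def levy_integral(alpha, n=200_000, R=1_000.0, m=200_000):
    """Midpoint-rule approximation of  ∫_0^∞ (1 ∧ e^2) π(de)  for the
    one-sided stable measure π(de) = e^{-1-alpha} de, split at e = 1.
    Illustrative only; alpha must lie in (0, 2) for the integral to be finite."""
    # On (0, 1]: (1 ∧ e^2) = e^2, so the integrand is e^{1-alpha}.
    inner = sum(((k + 0.5) / n) ** (1.0 - alpha) for k in range(n)) / n
    # On [1, R]: (1 ∧ e^2) = 1, so the integrand is e^{-1-alpha} (tail beyond R truncated).
    h = (R - 1.0) / m
    outer = sum((1.0 + (k + 0.5) * h) ** (-1.0 - alpha) for k in range(m)) * h
    return inner + outer

def levy_integral_exact(alpha):
    """Closed form: ∫_0^1 e^{1-a} de + ∫_1^∞ e^{-1-a} de = 1/(2-a) + 1/a."""
    return 1.0 / (2.0 - alpha) + 1.0 / alpha
```

The singularity of π at the origin is exactly what the (1 ∧ |e|²) weight controls: near zero the measure is infinite, but the quadratic weight makes the integral converge for α < 2.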
We introduce the following spaces: • G²(E, B(E), π; R^n): the space of square-integrable functions k : E → R^n such that ‖k‖_{G²} := (∫_E |k(e)|² π(de))^{1/2} < ∞.

Problem statement
We consider the following stochastic differential equation (SDE) driven by both B and Ñ:

dx^{t,a;u}_s = f(s, x^{t,a;u}_s, u_s) ds + σ(s, x^{t,a;u}_s, u_s) dB_s + ∫_E χ(s, x^{t,a;u}_{s−}, u_s, e) Ñ(de, ds), x^{t,a;u}_t = a,   (2.1)

where x^{t,a;u}_s ∈ R^n is the value of the state at time s and u_s ∈ U is the value of the control at time s, with U being the space of control values, a compact subset of R^m. The set of admissible controls is denoted by U_{t,T} := L²_F(t, T; U).
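The controlled jump-diffusion dynamics can be illustrated with a simple Euler scheme. The sketch below uses a finite-activity compound-Poisson stand-in for the Poisson random measure and omits the compensator, so it is only a toy discretization; the function name and all coefficient arguments are placeholders mirroring f, σ, χ, not the paper's construction:

```python
import math
import random

def euler_jump_diffusion(a, t, T, u, f, sigma, chi, lam, jump_dist, n_steps=1_000, seed=0):
    """Euler scheme for a scalar jump-diffusion:
    dx = f dt + sigma dB + jumps, where the jump part is approximated by a
    compound Poisson process with intensity lam (a finite-activity stand-in
    for the Poisson random measure; the compensator is omitted in this toy)."""
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    x = a
    for k in range(n_steps):
        s = t + k * dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        x += f(s, x, u) * dt + sigma(s, x, u) * dB
        # A jump arrives on [s, s + dt] with probability ≈ lam * dt.
        if rng.random() < lam * dt:
            x += chi(s, x, u, jump_dist(rng))
    return x
```

With f ≡ 1, σ ≡ 0, and λ = 0 the scheme reduces to the deterministic Euler method, so the terminal value is a + (T − t) up to floating-point error.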
The coefficients satisfy the following conditions with the constant L > 0 for x, x′ ∈ R^n.
The objective functional is given by

J(t, a; u) := E[ ∫_t^T l(s, x^{t,a;u}_s, u_s) ds + m(x^{t,a;u}_T) ].   (2.2)

Let K ⊂ R^n be a non-empty closed set, which captures the state constraint. The state-constrained stochastic control problem for jump-diffusion systems considered in this paper is then

inf_{u ∈ U_{t,T}} J(t, a; u) subject to (2.1) and x^{t,a;u}_s ∈ K, ∀s ∈ [t, T], P-a.s.
We introduce the value function for the above problem:

V(t, a) := inf { J(t, a; u) : u ∈ U_{t,T}, x^{t,a;u}_s ∈ K, ∀s ∈ [t, T], P-a.s. }, a ∈ K.   (2.3)

Note that (2.3) requires a ∈ K for the initial state of the SDE (2.1). The following assumptions are imposed on (2.2).
l and m satisfy the following conditions with the constant L > 0: (i) l and m are Lipschitz continuous in x; (ii) l and m are nonnegative functions, i.e., l, m ≥ 0.

Characterization of V
In this section, we convert the original problem in (2.3) into the stochastic target problem for jump-diffusion systems with state constraints. Then we show that (2.3) can be characterized by the backward reachable set of the stochastic target problem, which is equivalent to the zero-level set of the auxiliary value function.

Equivalent stochastic target problem via backward reachability approach
We first introduce an auxiliary SDE associated with the objective functional in (2.2). Lemma 3.1 Suppose that Assumptions 1 and 2 hold. Then: Proof The proof of parts (i) and (ii)-(a) is analogous to that of Lemma 2.1. We prove part (ii)-(b). Without loss of generality, we assume t′ ≥ t. By Assumptions 1 and 2, and using Lemma 2.1, we obtain the first estimate. Then, from (i) of Assumption 2, the estimates in (ii) of Lemma 2.1, and the fact that Ñ and B are mutually independent, the claim follows. Hence, without loss of generality, we may use controls (α, β) that are square integrable and bounded in the L²_F and G²_F senses.
For any function m : R^n → R, let us define the epigraph of m:

epi(m) := { (x, y) ∈ R^n × R : y ≥ m(x) }.

Then we have the following equivalent expression of the value function (2.3) in terms of the stochastic target problem with state constraints. Below, we drop t, T in U_{t,T}, A_{t,T}, and B_{t,T} to simplify the notation.
Proof of Lemma 3.2 As discussed in [13] and [11], we consider the following two statements. We now introduce the backward reachable set R_t (see (3.5)), defined as in (1.7). Clearly, based on Lemma 3.2, we have the following result: Remark 3.3 From Theorem 3.1, we observe that the value function (2.3) can be characterized by the backward reachable set R_t. In the next subsection, we focus on an explicit characterization of R_t as the zero-level set of the value function of the unconstrained auxiliary stochastic control problem.
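Following the equivalence of [13], the two statements compared in the proof can be paraphrased as follows (our paraphrase, with K denoting the closed constraint set; the paper's exact displays appear in the proof of Lemma 3.2):

```latex
% Paraphrase of the two equivalent statements behind Lemma 3.2.
\text{(S1)}\ \exists\, u \in \mathcal{U}:\quad
  J(t,a;u) \le b
  \ \text{ and }\
  x^{t,a;u}_s \in K,\ \forall s \in [t,T],\ \mathbb{P}\text{-a.s.}
\\[4pt]
\text{(S2)}\ \exists\, (u,\alpha,\beta) \in \mathcal{U}\times\mathcal{A}\times\mathcal{B}:\quad
  y^{u,\alpha,\beta}_{T;t,a,b} \ge m\bigl(x^{t,a;u}_T\bigr)
  \ \text{ and }\
  x^{t,a;u}_s \in K,\ \forall s \in [t,T],\ \mathbb{P}\text{-a.s.}
```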

Characterization of backward reachable set
where we introduce the following distance function from R^n to R_+:

d(a, K) := inf_{y ∈ K} |a − y|.

Then the auxiliary value function W : [0, T] × R^n × R → R can be defined as in (3.7). Clearly, d satisfies Assumption 3.
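As a small illustration of the distance function d(a, K) = inf_{y∈K} |a − y|, the sketch below computes it for a box-shaped constraint set via Euclidean projection (coordinatewise clamping); the function name and the choice of a box are our own, not the paper's:

```python
import math

def dist_to_box(a, lo, hi):
    """d(a, K) = inf_{y in K} |a - y| for the box K = Π_i [lo_i, hi_i],
    computed via the Euclidean projection onto K (clamp each coordinate)."""
    proj = [min(max(ai, l), h) for ai, l, h in zip(a, lo, hi)]
    return math.dist(a, proj)
```

For points inside K the distance is zero, mirroring the fact that d(·, K) penalizes only constraint violations in the auxiliary cost.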
The following theorem shows the equivalent expression of V in terms of the zero-level set of W .
Theorem 3.2 Suppose that Assumptions 1-3 hold and that there exists an optimal control attaining the infimum of the auxiliary optimal control problem in (3.7). Then: (i) the reachable set can be obtained by R_t = R̄_t := { (a, b) ∈ R^n × R : W(t, a, b) = 0 }; (ii) V admits the representation (1.5). Proof of Theorem 3.2 From (3.6) in Theorem 3.1, we see that (ii) follows from (i); hence we prove (i). Recall R_t defined in (3.5). Suppose that (a, b) ∈ R̄_t, i.e., W(t, a, b) = 0. Then, by the assumed existence of an optimal control, the infimum in (3.7) is attained. From the nonnegativity of l, m, and d(x, K) in Assumptions 2 and 3, and the property of the max function, we can see that under the optimal control the target inequality and the state constraint defining R_t hold P-a.s. This shows that R̄_t ⊆ R_t for t ∈ [0, T]. We complete the proof.

Properties of W
We provide some useful properties of W in (3.7).
Proposition 3.1 Assume that Assumptions 1-3 hold. Then, for (a, b) ∈ R^n × R and t, t + τ ∈ [0, T] with τ > 0, the auxiliary value function W satisfies the following dynamic programming principle (DPP): Proof Let us define the corresponding processes for all s ∈ [t + τ, T].
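Schematically, the DPP asserts consistency of W under restarting at t + τ; the exact running term depends on the definition (3.7), so we write it with a generic running cost ℓ̂ (a sketch, not the paper's display):

```latex
% Schematic DPP; \hat{\ell} stands in for the running cost of (3.7).
W(t,a,b) = \inf_{(u,\alpha,\beta)} \mathbb{E}\Bigl[
  \int_t^{t+\tau} \hat{\ell}\bigl(s, x^{t,a;u}_s, y^{u,\alpha,\beta}_{s;t,a,b}\bigr)\,ds
  + W\bigl(t+\tau,\; x^{t,a;u}_{t+\tau},\; y^{u,\alpha,\beta}_{t+\tau;t,a,b}\bigr)
\Bigr].
```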
We complete the proof.
Proof In view of the definition of W, when b ∈ [0, ∞), taking α = 0 and β = 0, the second inequality follows from the fact that l and m are nonnegative by (ii) of Assumption 2. The linear growth of W in a, stated in (i), then follows from Assumptions 1, 2, and 3 and (ii) of Lemma 2.1.
From Assumptions 1, 2, and 3, and using the Hölder inequality, we obtain the estimate above. To obtain the last inequality, we have used (ii) of Lemmas 2.1 and 3.1, the compactness of U, and the fact that the controls (α, β) can be restricted to be bounded in the G²_F and L²_F senses by Remark 3.1. This shows (ii). For the continuity of W in t ∈ [0, T] in (iii), let t, t + τ ∈ [0, T] with τ > 0. Then, by applying a similar technique and using (ii) of Lemma 2.1, we have |W(t + τ, a, b) − W(t, a, b)| → 0 as τ ↓ 0. This completes the proof.
Based on (3.7) (see Remark 3.4) and Lemma 3.4, W satisfies the following boundary conditions. Lemma 3.5 Suppose that Assumptions 1, 2, and 3 hold. Then W satisfies the following boundary conditions:

Characterization of W via viscosity solution of Hamilton-Jacobi-Bellman equation
Based on Theorem 3.2 and Remark 3.6, it is necessary to characterize the auxiliary value function W in (3.7) in order to solve the original state-constrained control problem (2.3). In this section and Sects. 5-6, we characterize W by showing that it is the unique continuous viscosity solution of the associated HJB equation. As seen from (3.7), the auxiliary value function depends on the augmented dynamical system on R^{n+1}. We introduce the following notation: The HJB equation with the boundary conditions (see Lemma 3.5) is introduced below; it is a second-order nonlinear partial integro-differential equation (PIDE) with Hamiltonian H. The notion of viscosity solutions for (4.1) is given as follows [8,9]: Definition 1 A real-valued function W ∈ C(Ō) is said to be a viscosity subsolution (resp. supersolution) of (4.1) if: (i) W(t, a, 0) ≤ W_0(t, a) (resp. W(t, a, 0) ≥ W_0(t, a)) for (t, a) ∈ [0, T) × R^n; (ii) for all test functions φ ∈ C^{1,3}_b(Ō) ∩ C²(Ō), the following inequality holds at any global maximum (resp. minimum) point (t, a, b) ∈ O of W − φ:

(−∂_t φ + H(t, a, b, φ, Dφ, D²φ))(t, a, b) ≤ 0 (resp. ≥ 0).
A real-valued function W ∈ C(Ō) is said to be a viscosity solution of (4.1) if it is both a viscosity subsolution and a viscosity supersolution of (4.1).
The existence of the viscosity solution for (4.1) can be stated as follows. We prove (ii) of Definition 1. Let φ ∈ C^{1,3}_b(Ō) be a test function such that W − φ attains a global maximum at (t, a, b) ∈ O; without loss of generality, we may assume that W(t, a, b) = φ(t, a, b). By the DPP in Proposition 3.1 with t, t + τ ∈ [0, T] and τ > 0, and using the fact that the expectations of the stochastic integrals with respect to B and Ñ are zero (they are F_s-martingales), we obtain the desired inequality. Multiplying by 1/τ and letting τ ↓ 0, we arrive at an expression whose nonlocal part is

∫_E [ φ(t, a + χ(t, a, u, e), b + β(e)) − φ(t, a, b) − ⟨Dφ(t, a, b), χ(t, a, u, e, β)⟩ ] π(de),   (4.2)

where χ(t, a, u, e, β) denotes the jump coefficient of the augmented state (a, b).
Taking the supremum over (u, α, β) ∈ U × R^r × G², in view of the definition of H, we conclude that W is a viscosity subsolution of (4.1). We now prove the supersolution property by contradiction. It is easy to see that W satisfies the boundary inequalities in (i) of Definition 1.
This leads to the desired contradiction, since θ > 0. Hence, W is the viscosity supersolution. This, together with (4.3), shows that W is the continuous viscosity solution of (4.1). This completes the proof.

Uniqueness of viscosity solution
We state the comparison principle for viscosity sub- and supersolutions, whose proof is given in Sect. 6: if W is a viscosity subsolution and W′ is a viscosity supersolution of (4.1), then

W(t, a, b) ≤ W′(t, a, b), ∀(t, a, b) ∈ Ō.
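The comparison principle yields uniqueness in the standard way: a continuous viscosity solution is simultaneously a sub- and a supersolution, so comparison applies in both directions, which is the content of Corollary 5.1:

```latex
% Comparison applied twice gives uniqueness.
W_1 \le W_2 \ \text{on } \bar{\mathcal{O}}
\quad\text{and}\quad
W_2 \le W_1 \ \text{on } \bar{\mathcal{O}}
\;\Longrightarrow\;
W_1 = W_2 \ \text{on } \bar{\mathcal{O}}.
```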

Concluding remarks
We have studied the state-constrained stochastic optimal control problem for jump-diffusion systems. One potential future research direction is the two-player stochastic game framework, for which we would need to generalize Theorem 3.2 using the notion of nonanticipative strategies. The state-constrained problem with general BSDE (backward SDE) type recursive objective functionals would also be an interesting avenue to pursue. Applications to various mathematical finance problems will be studied in the near future.

Proof of Theorem 5.1
This section is devoted to the proof of Theorem 5.1.
The definition of the parabolic superjet and subjet is given as follows [21]: for W ∈ C(O), the superjet of W at the point (t, a) ∈ O is defined by

P^{1,2,+} W(t, a) := { (q, p, P) ∈ R × R^n × S^n : W(s, x) ≤ W(t, a) + q(s − t) + ⟨p, x − a⟩ + (1/2)⟨P(x − a), x − a⟩ + o(|s − t| + |x − a|²) as (s, x) → (t, a) }.
(ii) The closure P̄^{1,2,+} W(t, a) of P^{1,2,+} W(t, a) is defined as the set of (q, p, P) = lim_{n→∞} (q_n, p_n, P_n) with (q_n, p_n, P_n) ∈ P^{1,2,+} W(t_n, a_n) and lim_{n→∞} (t_n, a_n, W(t_n, a_n)) = (t, a, W(t, a)).
Remark 6.2 Lemma 6.3 is introduced due to the singularity of the Lévy measure at zero, which appears in the nonlocal operator H^{(21)}_δ. We will see that, thanks to the regularity of the test function, one can pass to the limit in H^{(21)}_δ around the singular point of the measure.

Strict viscosity subsolution
Lemma 6.4 Suppose that W is a viscosity subsolution of (6.2). Let W^ν be defined as follows for ν > 0. Then W^ν is a strict viscosity subsolution of (6.2), in the sense that ≤ 0 is replaced by ≤ −ν/8 in Definition 1.
Proof We first verify the boundary condition for W^ν. Note that, as b ∈ [0, ∞) and ν > 0, the boundary condition follows by Lemma 3.5.
Then, from (6.2) and Definition 1, it is necessary to show the strict inequality (6.3). It is easy to see that φ ∈ C^{1,3}_b(Ō) and that (t, a, b) is a global maximum point of W^ν − φ.
Since b ∈ [0, ∞) and β ∈ G², it is easy to see the claim with β(e) = 0. In the definition of G^ψ, note that since l and d(a, K) are positive and b ∈ [0, ∞), we have G^{(11)} ≤ −1. Then we can show (6.5), which implies (6.6); together, (6.5) and (6.6) lead to (6.3). We complete the proof.

Appendix: Existence of optimal controls for jump-diffusion systems
In Theorem 3.2, an additional assumption of the existence of optimal controls for the auxiliary optimal control problem in (3.7) is needed. Here, we show that a certain class of stochastic optimal control problems for jump-diffusion systems with unbounded control sets admits an optimal control. The proof of the main result in this appendix (see Theorem A.1) extends the case of SDEs in a Brownian setting without jumps studied in [11, Appendix A] and [38, Theorem 5.2, Chap. 2] to the framework of jump-diffusion systems.

Assumption 4
(i) For ι := f, σ, χ, l with ι = (ι_1, ..., ι_n)^⊤, ι satisfies Assumptions 1 and 2 and is independent of x. Moreover, ι_i, i = 1, ..., n, is convex and Lipschitz continuous in u with the Lipschitz constant L; (ii) ρ_1 and ρ_2 are convex, nondecreasing, and bounded from below; (iii) U ⊂ R^m is a compact and convex set. Proof Since ρ_1 and ρ_2 are bounded from below, (A.1) is well defined. Suppose that {(u_k, α_k, β_k)}_{k≥1} ⊂ U × A × B is a minimizing sequence of controls such that J(t, a, b; u_k, α_k, β_k) → W(t, a, b) as k → ∞. Note that L²_F and G²_F are Hilbert spaces. Also, from Remark 3.1, {(α_k, β_k)}_{k≥1} can be restricted to a sequence of controls bounded in the L²_F and G²_F senses, and U is compact by (iii) of Assumption 4. Hence, in view of [16, Theorem 3.18], we can extract a weakly convergent subsequence. Then, for each ε > 0, there exists i_ε such that for any i ≥ i_ε,