In this section, we collect the definitions, hypotheses, and preliminary facts that are used throughout this paper.
Definition 2.1 (see [7–9])
The fractional integral of order \alpha >0 with the lower limit zero for a function f can be defined as
{I}^{\alpha}f(t)=\frac{1}{\mathrm{\Gamma}(\alpha )}{\int}_{0}^{t}\frac{f(s)}{{(t-s)}^{1-\alpha}}\phantom{\rule{0.2em}{0ex}}ds,\phantom{\rule{1em}{0ex}}t>0,
provided the right-hand side is pointwise defined on [0,\mathrm{\infty}), where Γ is the gamma function.
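As a hedged numerical illustration of Definition 2.1, the following sketch approximates I^{\alpha}f(t) by a midpoint rule after the substitution u=(t-s)^{\alpha}, which removes the weak singularity at s=t. The function name `fractional_integral`, the parameter `steps`, and the choice of quadrature are assumptions made for this sketch and are not taken from [7–9]. For f(s)=s^{\beta} the result can be compared with the classical closed form \Gamma(\beta+1)t^{\alpha+\beta}/\Gamma(\alpha+\beta+1).

```python
import math


def fractional_integral(f, t, alpha, steps=20000):
    """Approximate I^alpha f(t) = 1/Gamma(alpha) * int_0^t (t-s)^(alpha-1) f(s) ds.

    With u = (t-s)^alpha the integral becomes
        (1/alpha) * int_0^{t^alpha} f(t - u^(1/alpha)) du,
    whose integrand is no longer singular at the endpoint.
    """
    upper = t ** alpha
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += f(t - u ** (1.0 / alpha))
    return total * h / (alpha * math.gamma(alpha))


# Compare with the closed form for f(s) = s^beta (illustrative values).
alpha, beta, t = 0.5, 1.0, 2.0
approx = fractional_integral(lambda s: s ** beta, t, alpha)
exact = math.gamma(beta + 1) / math.gamma(alpha + beta + 1) * t ** (alpha + beta)
print(abs(approx - exact))
```

For \alpha=1 the routine reduces to an ordinary midpoint rule for {\int}_{0}^{t}f(s)\,ds, which is a quick sanity check on the substitution.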
Definition 2.2 (see [7–9])
The Caputo derivative of order α with the lower limit zero for a function f can be written as
{}^{c}D^{\alpha}f(t)=\frac{1}{\mathrm{\Gamma}(n-\alpha )}{\int}_{0}^{t}\frac{{f}^{(n)}(s)}{{(t-s)}^{\alpha +1-n}}\phantom{\rule{0.2em}{0ex}}ds={I}^{n-\alpha}{f}^{(n)}(t),\phantom{\rule{1em}{0ex}}t>0,\phantom{\rule{0.5em}{0ex}}0\le n-1<\alpha <n.
If f is an abstract function with values in X, then the integrals appearing in the above definitions are taken in Bochner’s sense.
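As a worked scalar example of the Caputo derivative (a sketch under the assumption 0<\alpha<1, so that n=1, with the illustrative choice f(t)=t^{2}), the inner integral is a Beta integral:

```latex
% Caputo derivative of f(t) = t^2 for 0 < \alpha < 1 (n = 1), so f'(s) = 2s.
{}^{c}D^{\alpha} t^{2}
  = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{2s}{(t-s)^{\alpha}} \, ds
  = \frac{2}{\Gamma(1-\alpha)} \, t^{2-\alpha} \,
    \frac{\Gamma(2)\,\Gamma(1-\alpha)}{\Gamma(3-\alpha)}
  = \frac{2}{\Gamma(3-\alpha)} \, t^{2-\alpha}, \qquad t > 0.
% For a constant function f \equiv c one has f' \equiv 0, hence {}^{c}D^{\alpha} c = 0.
```

As \alpha \to 1 the right-hand side tends to 2t, the ordinary first derivative of t^{2}.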
The operators A:D(A)\subset X\to Y and E:D(E)\subset X\to Y satisfy the following hypotheses:
({H}_{1}) A and E are closed linear operators,
({H}_{2}) D(E)\subset D(A) and E is bijective,
({H}_{3}) {E}^{-1}:Y\to D(E) is continuous.
The hypotheses {H}_{1}, {H}_{2} and the closed graph theorem imply the boundedness of the linear operator A{E}^{-1}:Y\to Y.
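The closed graph argument can be sketched as follows, assuming that X and Y are Banach spaces and using the continuity of the inverse of E. Take a sequence y_n in Y for which both y_n and the images of y_n under the composite operator converge:

```latex
% Sketch: the graph of A E^{-1} is closed and its domain is all of Y.
\begin{aligned}
& y_n \to y \ \text{in } Y, \qquad A E^{-1} y_n \to z \ \text{in } Y, \\
& x_n := E^{-1} y_n \to E^{-1} y \ \text{in } X
    && \text{(continuity of } E^{-1}\text{)}, \\
& x_n \in D(E) \subset D(A), \qquad A x_n \to z
    && \text{(by } (H_2)\text{)}, \\
& E^{-1} y \in D(A), \qquad A E^{-1} y = z
    && \text{(closedness of } A\text{)}.
\end{aligned}
```

Hence the composite operator is closed and defined on all of Y, so the closed graph theorem gives its boundedness.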
({H}_{4}) For each t\in [0,a] and for some \lambda \in \rho (A{E}^{-1}), the resolvent set of A{E}^{-1}, the resolvent R(\lambda ,A{E}^{-1}) is a compact operator.
Lemma 2.1 [10]
Let S(t) be a uniformly continuous semigroup. If the resolvent R(\lambda ;A) of A is compact for every \lambda \in \rho (A), then S(t) is a compact semigroup.
From the above fact, A{E}^{-1} generates a compact semigroup \{T(t),t\ge 0\} in Y, and there exists a constant M>1 such that
\underset{t\in J}{\max}\parallel T(t)\parallel \le M.
(2.1)
Definition 2.3 The system (1.1) is said to be controllable on the interval J if for every {x}_{0},{x}_{1}\in X, there exists a control u\in {L}^{2}(J,U) such that the solution x(\cdot ) of (1.1) satisfies x(a)={x}_{1}.
({H}_{5}) The linear operator W from {L}^{2}(J,U) into X defined by
Wu={\int}_{0}^{a}{E}^{-1}{(a-s)}^{\alpha -1}{T}_{\alpha}(a-s)Bu(s)\phantom{\rule{0.2em}{0ex}}ds
has a bounded inverse operator {W}^{-1} which takes values in {L}^{2}(J,U)/\ker W, where the kernel of W is defined by \ker W=\{x\in {L}^{2}(J,U):Wx=0\}, B is a bounded linear operator, and {T}_{\alpha}(t) is defined later.
({H}_{6}) The function f satisfies the following two conditions:

(i)
For each t\in J, the function f(t,\cdot ):X\to Y is continuous, and for each x\in X, the function f(\cdot ,x):J\to Y is strongly measurable.

(ii)
For each positive number k\in N, there is a positive function {g}_{k}(\cdot ):[0,a]\to {R}^{+} such that
\underset{\parallel x\parallel \le k}{\sup}\parallel f(t,x)\parallel \le {g}_{k}(t),
the function s\to {(t-s)}^{1-\alpha}{g}_{k}(s) belongs to {L}^{1}([0,t],{R}^{+}), and there exists a \beta >0 such that
\underset{k\to \mathrm{\infty}}{\liminf}\frac{{\int}_{0}^{t}{(t-s)}^{1-\alpha}{g}_{k}(s)\phantom{\rule{0.2em}{0ex}}ds}{k}=\beta <\mathrm{\infty},\phantom{\rule{1em}{0ex}}t\in [0,a].
({H}_{7}) For each (t,s)\in J\times J, the function H(t,s,\cdot ):X\to X is continuous, and for each x\in X, the function H(\cdot ,\cdot ,x):J\times J\to X is strongly measurable.
({H}_{8}) The function g satisfies the following two conditions:

(i)
For each (t,s)\in J\times J, the function g(t,s,\cdot ,\cdot ):X\times X\to Y is continuous, and for each x,y\in X, the function g(\cdot ,\cdot ,x,y):J\times J\to Y is strongly measurable.

(ii)
For each positive number k\in N, there is a positive function {h}_{k}(\cdot ):[0,a]\to {R}^{+} such that
\underset{\parallel x\parallel \le k}{\sup}\left\parallel {\int}_{0}^{t}g(t,s,x,{\int}_{0}^{s}H(s,\tau ,x)\phantom{\rule{0.2em}{0ex}}d\tau )\phantom{\rule{0.2em}{0ex}}ds\right\parallel \le {h}_{k}(t),
the function s\to {(t-s)}^{1-\alpha}{h}_{k}(s) belongs to {L}^{1}([0,t],{R}^{+}), and there exists a \gamma >0 such that
\underset{k\to \mathrm{\infty}}{\liminf}\frac{{\int}_{0}^{t}{(t-s)}^{1-\alpha}{h}_{k}(s)\phantom{\rule{0.2em}{0ex}}ds}{k}=\gamma <\mathrm{\infty},\phantom{\rule{1em}{0ex}}t\in [0,a].
According to [11, 12], a solution of equation (1.1) can be represented by
\begin{array}{rcl}x(t)& =& {E}^{-1}{S}_{\alpha}(t)E{x}_{0}+{\int}_{0}^{t}{(t-s)}^{\alpha -1}{T}_{\alpha}(t-s){E}^{-1}f(s,x(s))\phantom{\rule{0.2em}{0ex}}ds\\ & & +{\int}_{0}^{t}{(t-s)}^{\alpha -1}{E}^{-1}{T}_{\alpha}(t-s)Bu(s)\phantom{\rule{0.2em}{0ex}}ds\\ & & +{\int}_{0}^{t}{(t-s)}^{\alpha -1}{E}^{-1}{T}_{\alpha}(t-s)\left\{{\int}_{0}^{s}g(s,\tau ,x(\tau ),R(\tau ))\phantom{\rule{0.2em}{0ex}}d\tau \right\}\phantom{\rule{0.2em}{0ex}}ds,\phantom{\rule{1em}{0ex}}t\in J,\end{array}
(2.2)
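where
{S}_{\alpha}(t)={\int}_{0}^{\mathrm{\infty}}{\xi}_{\alpha}(\theta )T({t}^{\alpha}\theta )\phantom{\rule{0.2em}{0ex}}d\theta ,\phantom{\rule{1em}{0ex}}{T}_{\alpha}(t)=\alpha {\int}_{0}^{\mathrm{\infty}}\theta {\xi}_{\alpha}(\theta )T({t}^{\alpha}\theta )\phantom{\rule{0.2em}{0ex}}d\theta ,
with {\xi}_{\alpha} being a probability density function defined on (0,\mathrm{\infty}), that is, {\xi}_{\alpha}(\theta )\ge 0, \theta \in (0,\mathrm{\infty}) and {\int}_{0}^{\mathrm{\infty}}{\xi}_{\alpha}(\theta )\phantom{\rule{0.2em}{0ex}}d\theta =1.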
Remark {\int}_{0}^{\mathrm{\infty}}\theta {\xi}_{\alpha}(\theta )\phantom{\rule{0.2em}{0ex}}d\theta =\frac{1}{\mathrm{\Gamma}(1+\alpha )}.
Definition 2.4 By a mild solution of the problem (1.1), we mean a function x\in C(J,X) that satisfies the integral equation (2.2).
Lemma 2.2 (see [11])
The operators {S}_{\alpha}(t) and {T}_{\alpha}(t) have the following properties:

(I)
For any fixed x\in X, \parallel {S}_{\alpha}(t)x\parallel \le M\parallel x\parallel, \parallel {T}_{\alpha}(t)x\parallel \le \frac{\alpha M}{\mathrm{\Gamma}(\alpha +1)}\parallel x\parallel;

(II)
\{{S}_{\alpha}(t),t\ge 0\} and \{{T}_{\alpha}(t),t\ge 0\} are strongly continuous;

(III)
For every t>0, {S}_{\alpha}(t) and {T}_{\alpha}(t) are also compact operators if T(t), t>0 is compact.
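The bounds in (I) can be traced back to the Remark on the first moment of \xi_{\alpha}. The following is a sketch only, assuming the subordination-type representation of S_{\alpha}(t) and T_{\alpha}(t) written in the comments below and assuming that the bound \parallel T(t)\parallel \le M holds for all t\ge 0; both assumptions are stated here for illustration.

```latex
% Representation assumed for this sketch:
%   S_\alpha(t) = \int_0^\infty \xi_\alpha(\theta)\, T(t^\alpha \theta)\, d\theta,
%   T_\alpha(t) = \alpha \int_0^\infty \theta\, \xi_\alpha(\theta)\, T(t^\alpha \theta)\, d\theta,
% together with \|T(t)\| \le M for all t \ge 0.
\begin{aligned}
\|S_\alpha(t)x\|
  &\le \int_0^\infty \xi_\alpha(\theta)\, \|T(t^\alpha\theta)x\| \, d\theta
   \le M \|x\| \int_0^\infty \xi_\alpha(\theta)\, d\theta
   = M \|x\|, \\
\|T_\alpha(t)x\|
  &\le \alpha \int_0^\infty \theta\, \xi_\alpha(\theta)\, \|T(t^\alpha\theta)x\| \, d\theta
   \le \alpha M \|x\| \int_0^\infty \theta\, \xi_\alpha(\theta)\, d\theta
   = \frac{\alpha M}{\Gamma(1+\alpha)} \|x\|.
\end{aligned}
```

Since \Gamma(1+\alpha)=\alpha \Gamma(\alpha), the second bound can equivalently be written as M\parallel x\parallel /\Gamma(\alpha).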