Now, we consider that the survival rates \(p_{i}=p\), \(q_{i}=q\), and \(r_{i}=r\) are known. This assumption corresponds to what ecologists call Type II mortality, a constant mortality rate over the entire life span. This pattern is approximated by most birds and some mammals [14], and it is also a good approximation for the survival rate of human populations in the developed world. Furthermore, we suppose that the transfer rates between consecutive compartments \(\sigma_{i}\), \(\mu_{i}\), and \(\nu_{i}\), \(i=1,2,3\), and \(\epsilon_{i}\) and \(\delta_{i}\), \(i=1,2\), are known. Thus, from an observed dataset our goal is to find approximate values of the parameters \(f_{i}\) (from which we obtain \(\alpha_{i}\)) and of the parameters \(\gamma_{i}\), using the mathematical model given by (2). Note that \(\alpha_{i}\) and \(\gamma_{i}\) are the most important rates, since they carry the information about the disease under consideration. That is, for each age range we want an estimated value of the rate of infection of a susceptible individual and of the rate of recovery of an infected individual, which allows us to draw conclusions on the incidence of the disease according to the age of the individual.

From an initial observation \(ob(0)\), we consider an observed dataset

$$\bigl\{ ob(k) \bigr\} _{k=1}^{K}= \bigl\{ \left ( \begin{matrix} S_{1}(k)& S_{2}(k)& S_{3}(k)& I_{1}(k) &I_{2}(k)& I_{3}(k)& R_{1}(k)& R_{2}(k)& R_{3}(k) \end{matrix} \right )^{T} \bigr\} _{k=1}^{K}, $$

in *K* steps, \(K\geq1\), and, on the other hand, we have the fitted mathematical model

$$x(k+1)=A(\mathbf{p})x(k)+B,\qquad x(0)=ob(0),\quad k\geq1, $$

where, from now on, the parameter vector to estimate is

$$\mathbf{p}=\left ( \begin{matrix}f_{1}& f_{2}& f_{3}& \gamma_{1}& \gamma_{2}& \gamma_{3} \end{matrix} \right )^{T}. $$

Rewriting the system (2) we have

$$ x(k+1)= M(k)\mathbf{p}+N(k)=\left ( \begin{matrix} M_{1}(k)&M_{2}(k) \\ M_{3}(k)&M_{4}(k) \\ M_{5}(k)&M_{6}(k) \end{matrix} \right ) \mathbf{p}+N(k), $$

(3)

where

$$\begin{aligned}& M_{1}(k)=\left ( \begin{matrix} -{I_{1}}(k) & 0 & 0 \\ (1-\epsilon_{1}) I_{1}(k) & 0 & 0 \\ 0 & 0 & 0 \end{matrix} \right ), \qquad M_{2}(k)= \left ( \begin{matrix} 0 & 0 & 0 \\ -{I_{1}}(k)&0&0 \\ (1-\delta_{1}) I_{1}(k) & 0 & 0 \end{matrix} \right ), \\& M_{3}(k)=\left ( \begin{matrix} 0&-{I_{2}}(k) & 0 \\ \epsilon_{1} I_{1}(k) & (1-\epsilon_{2}) I_{2}(k) & 0 \\ 0 & 0 & 0 \end{matrix} \right ),\qquad M_{4}(k)=\left ( \begin{matrix} 0 & 0 & 0 \\ 0& -{I_{2}}(k)&0 \\ \delta_{1} I_{1}(k) & (1-\delta_{2}) {I_{2}}(k) & 0 \end{matrix} \right ), \\& M_{5}(k)=\left ( \begin{matrix} 0&0&-{I_{3}}(k) \\ 0&\epsilon_{2} I_{2}(k) & I_{3}(k) \\ 0 & 0 & 0 \end{matrix} \right ),\qquad M_{6}(k)=\left ( \begin{matrix} 0 & 0 & 0 \\ 0&0& -{I_{3}}(k) \\ 0&\delta_{2} I_{2}(k) & I_{3}(k) \end{matrix} \right ), \end{aligned}$$

and \(N(k)=\operatorname{col}(N_{i}(k))_{i=1}^{9}\) where

$$\begin{aligned}& N_{1}(k)=N-p\sum_{i=2}^{3}S_{i}(k)-q \sum_{i=1}^{3}I_{i}(k)-r\sum _{i=1} ^{3}R_{i}(k)- \sigma_{1}S_{1}(k), \\& N_{2}(k)=h_{1}I_{1}(k),\qquad N_{3}(k)=(r-\nu_{1})R_{1}(k), \\& N_{4}(k)= \sigma_{1}S_{1}(k)+ (p- \sigma_{2})S_{2}(k),\qquad N_{5}(k)=\mu _{1}I_{1}(k)+h_{2}I_{2}(k), \\& N_{6}(k)=\nu_{1}R_{1}(k)+(r- \nu_{2})R_{2}(k),\qquad N_{7}(k)= \sigma_{2}S _{2}(k)+pS_{3}(k), \\& N_{8}(k)=\mu_{2}I_{2}(k)+qI_{3}(k), \qquad N_{9}(k)=\nu_{2}R_{2}(k)+rR_{3}(k). \end{aligned}$$
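For numerical work, the six blocks above can be assembled into the \(9\times6\) regressor matrix \(M(k)\) of (3). The following sketch does exactly that; the function name `build_M` and the argument layout are our own choices, not notation from the paper.

```python
import numpy as np

def build_M(I, eps, delta):
    """Assemble the 9x6 regressor matrix M(k) of equation (3) from the
    infected counts I = (I1, I2, I3) at step k and the (assumed known)
    transfer rates eps = (eps1, eps2), delta = (delta1, delta2).
    Columns 0-2 multiply f_1, f_2, f_3; columns 3-5 multiply gamma_1,
    gamma_2, gamma_3."""
    I1, I2, I3 = I
    e1, e2 = eps
    d1, d2 = delta
    M1 = np.array([[-I1, 0, 0], [(1 - e1) * I1, 0, 0], [0, 0, 0]])
    M2 = np.array([[0, 0, 0], [-I1, 0, 0], [(1 - d1) * I1, 0, 0]])
    M3 = np.array([[0, -I2, 0], [e1 * I1, (1 - e2) * I2, 0], [0, 0, 0]])
    M4 = np.array([[0, 0, 0], [0, -I2, 0], [d1 * I1, (1 - d2) * I2, 0]])
    M5 = np.array([[0, 0, -I3], [0, e2 * I2, I3], [0, 0, 0]])
    M6 = np.array([[0, 0, 0], [0, 0, -I3], [0, d2 * I2, I3]])
    return np.block([[M1, M2], [M3, M4], [M5, M6]])
```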

From the *K* data of the observed dataset we want to estimate the value of **p**. Stacking the equations \(ob(k)=M(k-1)\mathbf{p}+N(k-1)\), \(k=1,\ldots,K\), into \(d_{K}=\operatorname{col}(ob(k)-N(k-1))_{k=1}^{K}\) and \(H_{K}=\operatorname{col}(M(k-1))_{k=1}^{K}\), we want to find the parameter vector which minimizes the quadratic function

$$J_{K}(\mathbf{p})=\frac{1}{2}(d_{K}-H_{K} \mathbf{p})^{T}(d_{K}-H_{K} \mathbf{p}). $$

Thus, **p** satisfies

$$\frac{\partial J_{K}(\mathbf{p})}{\partial\mathbf{p}}=H_{K}^{T}H_{K} \mathbf{p}-H_{K}^{T}d_{K}=0. $$

Note that if \(S_{K}=H_{K}^{T}H_{K}\) is nonsingular, then the solution is \(\mathbf{p}=S_{K}^{-1}H_{K}^{T}d_{K}\), and if it is singular, then \(\mathbf{p}=S_{K}^{\dagger}H_{K}^{T}d_{K}\), where † denotes the Moore–Penrose generalized inverse. In this last case, **p** is not identifiable, since infinitely many parameter values produce the same output of the mathematical model.
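The least-squares solution can be computed directly from the stacked arrays. A minimal sketch, assuming `H_K` and `d_K` are already built as NumPy arrays (the helper name `estimate_p` is ours):

```python
import numpy as np

def estimate_p(H_K, d_K):
    """Least-squares estimate of p from the stacked system d_K = H_K p.
    If S_K = H_K^T H_K is nonsingular this equals S_K^{-1} H_K^T d_K;
    otherwise lstsq returns the Moore-Penrose (minimum-norm) solution."""
    p, *_ = np.linalg.lstsq(H_K, d_K, rcond=None)
    return p
```

Using `lstsq` avoids forming and inverting \(S_{K}\) explicitly, which is numerically preferable when \(H_{K}\) is ill-conditioned.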

From the structure of the matrices we can establish the following result.

### Proposition 2

*Let the system be* (3). *The estimation problem has a unique solution if and only if for each*
*i*, \(i=1,2,3\), *there exists*
\(k_{i}\), \(0\leq k_{i}\leq K\)
*such that*
\(I_{i}(k_{i})\neq0\).

### Proof

If for each *i*, \(i=1,2,3\), there exists \(k_{i}\), \(0\leq k_{i}\leq K\), such that \(I_{i}(k_{i})\neq0\), then \(\operatorname{rank}(H_{k_{0}})=6\) for \(k_{0}=\max\{k_{i}\}\), and hence \(\operatorname{rank}(S_{K})=6\). Conversely, if \(\operatorname{rank}(S_{K})=6\), then from \(S_{K}=H^{T}_{K}H_{K}\) and the structure of the matrices \(H_{K}\) and \(M(k)\) the condition follows. Therefore, we can ensure that \(S_{K}\) is positive definite, that is, all its eigenvalues are positive and \(S_{K}^{-1}\) exists for all \(K> k_{0}\). □

Under the above assumption, we can obtain an approximate value of **p**, for instance, using the gradient descent method. That is, from an initial guess \(\mathbf{p}_{0}\), the \((i+1)\)th step gives us

$$\begin{aligned} \mathbf{p}_{i+1} =&\mathbf{p}_{i}-a_{i} \bigl(S_{K}\mathbf{p}_{i}-H_{K}^{T}d _{K}\bigr) \\ =&\mathbf{p}_{i}+a_{i}H_{K}^{T}(d_{K}-H_{K} \mathbf{p}_{i}), \end{aligned}$$

where \(a_{i}\) is the minimizer of the curve \(h(a)=J_{K}(\mathbf{p}_{i}-a(S _{K}\mathbf{p}_{i}-H_{K}^{T}d_{K}))\). Thus, we have a numerical procedure to achieve the best fit between the observed data and the parameters of the model. This algorithm is based on an iterative local search in the downhill direction from the initial point.
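For the quadratic cost \(J_{K}\), the exact line-search step has the closed form \(a_{i}=(g^{T}g)/(g^{T}S_{K}g)\) with \(g=S_{K}\mathbf{p}_{i}-H_{K}^{T}d_{K}\), which is the standard minimizer of \(h(a)\) along the gradient direction. A sketch of the resulting iteration (the function name and stopping rule are our own):

```python
import numpy as np

def gradient_descent(H_K, d_K, p0, n_iter=500):
    """Steepest descent for J_K(p) = 0.5 * ||d_K - H_K p||^2 with the
    exact line-search step a_i = (g^T g) / (g^T S_K g), which minimizes
    h(a) = J_K(p_i - a*g) along the gradient direction g."""
    S = H_K.T @ H_K
    b = H_K.T @ d_K
    p = np.asarray(p0, dtype=float)
    for _ in range(n_iter):
        g = S @ p - b            # gradient of J_K at p
        gSg = g @ S @ g
        if gSg == 0.0:           # zero gradient: p is the minimizer
            break
        p = p - (g @ g) / gSg * g
    return p
```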

The parameter \(\mathbf{p} \geq 0\) chosen in this epidemiological model satisfies \(\Vert \mathbf{p}\Vert _{2}<1\), where \(\Vert \cdot \Vert _{2}\) denotes the spectral norm. In the next result we establish a condition on the observed dataset that preserves this property in the parameter estimation process.

### Proposition 3

*Let the system be* (3). *For each*
*i*, \(i=1,2,3\), *we suppose that there exists*
\(k_{i}\), \(0\leq k_{i}\leq K\)
*such that*
\(I_{i}(k_{i})\neq0\).

$$\textit{If }\Vert d_{K}\Vert _{2}< \frac{1}{\rho(S^{-1}_{K})\sqrt{\rho(S_{K})}} \textit{ then }\Vert \mathbf {p}_{K}\Vert _{2}< 1, $$

*where*
\(\mathbf{p}_{K}=S_{K}^{-1}H_{K}^{T}d_{K}\)
*is the parameter which minimizes the problem associated with*
*K*
*observation data and*
\(\rho(\cdot)\)
*denotes the spectral radius*.

### Proof

From \(\mathbf{p}_{K}=S_{K}^{-1}H_{K}^{T}d_{K}\) and taking into account that \(S_{K}\) is a symmetric matrix,

$$\Vert \mathbf{p}_{K}\Vert _{2}= \bigl\Vert S_{K}^{-1}H_{K}^{T}d_{K} \bigr\Vert _{2}\leq\rho \bigl(S _{K}^{-1} \bigr)\Vert H_{K}\Vert _{2}\Vert d_{K} \Vert _{2}= \rho \bigl(S_{K}^{-1} \bigr)\sqrt{ \rho(S _{K})}\Vert d_{K}\Vert _{2}< 1. $$

□
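The hypothesis of Proposition 3 is easy to check numerically. A sketch, assuming the stacked arrays `H_K` and `d_K` are available (the helper name `bound_holds` is ours); since \(S_{K}\) is symmetric positive definite, its extreme eigenvalues give both spectral radii:

```python
import numpy as np

def bound_holds(H_K, d_K):
    """Check the hypothesis of Proposition 3:
    ||d_K||_2 < 1 / (rho(S_K^{-1}) * sqrt(rho(S_K))), with S_K = H_K^T H_K.
    When it holds, p_K = S_K^{-1} H_K^T d_K satisfies ||p_K||_2 < 1."""
    S = H_K.T @ H_K
    eig = np.linalg.eigvalsh(S)       # ascending eigenvalues of symmetric S
    rho_S = eig[-1]                   # spectral radius of S_K
    rho_S_inv = 1.0 / eig[0]          # spectral radius of S_K^{-1}
    return np.linalg.norm(d_K, 2) < 1.0 / (rho_S_inv * np.sqrt(rho_S))
```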

### Adding more observations. Algorithm

Consider *K* data of the observed dataset such that \(I_{i}(k_{i})\neq0\) for some \(k_{i}\), \(0\leq k_{i}\leq K\), and for each *i*, \(i=1,2,3\). This implies that there exists \(k_{0}\) such that \(H(k_{0})\) has full rank. Now, we want to improve the approximate value of the parameter **p** by adding one observation \(ob(K+1)\) and fitting the mathematical model to the \(K+1\) data of the observed dataset. Using the facts that

$$\begin{aligned}& H^{T}_{K+1}H_{K+1}=H^{T}_{K}H_{K}+M^{T}(K)M(K), \\& H^{T}_{K+1}d_{K+1}=H^{T}_{K}d_{K}+M^{T}(K)d(K+1), \end{aligned}$$

we obtain the following discrete-time variable system representing the dynamics of the parameter vector:

$$ \mathbf{p}_{K+1}=A_{K} \mathbf{p}_{K}+B_{K}, \quad K\geq1, $$

(4)

where \(A_{K}=S_{K+1}^{-1}S_{K}\), \(B_{K}=S_{K+1}^{-1}M^{T}(K)d(K+1)\), and \(S_{K}=H^{T}_{K}H_{K}\).

The solution of this system is

$$\mathbf{p}_{K}=\Phi_{A}(K,k_{0}) \mathbf{p}_{k_{0}}+ \sum_{j=k_{0}+1}^{K-1} \Phi_{A}(K,j+1)B_{j},\quad K> k_{0}, $$

with the monodromy matrix \(\Phi_{A}(K,k_{0})\) defined as \(\Phi_{A}(K,k _{0})=A_{K-1}\cdots A_{k_{0}}\) if \(K>k_{0}\) and \(\Phi_{A}(K,k_{0})=I\) if \(K=k_{0}\).

Note that if \(A_{K}\) is asymptotically stable, that is, \(\rho(A_{K})<1\) for all \(K> k_{0}\), then the monodromy matrix is also asymptotically stable, since \(\rho(\Phi_{A}(K,k_{0}))= \Vert \Phi _{A}(K,k_{0})\Vert _{2} \leq\prod_{k=k_{0}}^{K-1} \rho(A_{k})<1\) (this follows from the symmetry of the matrix \(S_{K}\) for \(K> k_{0}\)). Hence, we can ensure that the sequence of parameter vectors \(\{\mathbf{p} _{K}\}_{K\geq1}\) obtained as the solution of (4) is bounded whenever \(B_{K}\) is bounded.
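One step of the recursion (4) amounts to folding the new observation into the running normal equations. A sketch of this update, assuming \(d(K+1)=ob(K+1)-N(K)\) (consistent with \(x(k+1)=M(k)\mathbf{p}+N(k)\)); the function name `update_p` and the choice to carry \(S_{K}\) and \(H_{K}^{T}d_{K}\) as state are our own:

```python
import numpy as np

def update_p(S_K, HTd_K, M_K, ob_next, N_K):
    """One step of the recursion (4): fold the new observation ob(K+1)
    into the running normal equations and return p_{K+1} together with
    the updated S_{K+1} and H_{K+1}^T d_{K+1}."""
    d_new = ob_next - N_K               # d(K+1) = ob(K+1) - N(K)
    S_next = S_K + M_K.T @ M_K          # S_{K+1} = S_K + M(K)^T M(K)
    HTd_next = HTd_K + M_K.T @ d_new    # H_{K+1}^T d_{K+1}
    p_next = np.linalg.solve(S_next, HTd_next)
    return p_next, S_next, HTd_next
```

This is the standard recursive least-squares update written on the normal equations; in practice one often propagates \(S_{K}^{-1}\) directly via the matrix inversion lemma to avoid solving a linear system at each step.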

Finally, we establish a condition on the new data \(K+1\) in order that the consecutive approximations of the parameter are sufficiently close.

### Proposition 4

*Let the system be* (4). *Suppose that there exists*
\(k_{0}\)
*such that*
\(H(k_{0})\)
*has full rank*, *and*
\(\rho(A_{K})<1\)
*for all*
\(K> k_{0}\). *Consider the observation data such that*
\(\Vert ob(K+1)-x(K+1)\Vert _{2}< \frac{\epsilon}{\rho(S^{-1}_{K+1})\rho(M^{T}(K)M(K))}\), *for some*
\(\epsilon>0\). *Then*

$$\Vert \mathbf{p}_{K+1}-\mathbf{p}_{K}\Vert _{2}< \epsilon. $$

### Proof

Given *K* observation data and \(A_{K}=S_{K+1}^{-1}S_{K}\), \(B_{K}=S_{K+1}^{-1}M^{T}(K)d(K+1)\), we have

$$\begin{aligned} \Vert \mathbf{p}_{K+1}-\mathbf{p}_{K}\Vert _{2} =& \Vert A_{K}\mathbf{p}_{K}+B_{K}- \mathbf{p}_{K}\Vert _{2} \\ \leq&\bigl\Vert S_{K+1}^{-1}\bigr\Vert _{2} \bigl\Vert (S_{K}-S_{K+1})\mathbf{p}_{K}+ M^{T}(K)d(K+1)\bigr\Vert _{2} \\ \leq&\rho\bigl(S^{-1}_{K+1}\bigr)\bigl\Vert M^{T}(K)\bigr\Vert _{2}\bigl\Vert d(K+1)-M(K) \mathbf{p}_{K}\bigr\Vert _{2} \\ =&\rho\bigl(S^{-1}_{K+1}\bigr)\rho\bigl(M^{T}(K)M(K) \bigr)\bigl\Vert ob(K+1)-x(K+1)\bigr\Vert _{2}< \epsilon. \end{aligned}$$

□

### Remark 1

The parameters involved in an epidemiological process are not always known. To obtain sufficiently reliable values, it is necessary to know whether they can be identified from a set of observations of the process, and then to estimate them. In the literature there exist several approaches to the identifiability problem and to the estimation problem. From a set of observations it is usual to consider, for instance in engineering, the transfer matrix; in chemistry, the input–output response and the Markov parameters [4, 9]; or directly the solution of the system. Generally, the estimation approach is based on the gradient algorithm or the least squares algorithm [7, 8]. In our case, we address identifiability through the structure of the matrices, and the estimation process is constructive, using a least squares algorithm. The proposed algorithm can therefore be used to identify and estimate the parameters of other discrete-time age-structured models, taking into account only the structure one has when performing the steps of our algorithm.