The steepest descent of gradient-based iterative method for solving rectangular linear systems with an application to Poisson’s equation
Advances in Difference Equations volume 2020, Article number: 259 (2020)
Abstract
We introduce an effective iterative method for solving rectangular linear systems, based on gradients along with the steepest descent optimization. We show that the proposed method is applicable for any initial vector as long as the coefficient matrix is of full column rank. Convergence analysis produces error estimates and the asymptotic convergence rate of the algorithm, which is governed by the term \(\sqrt {1-\kappa^{-2}}\), where κ is the condition number of the coefficient matrix. Moreover, we apply the proposed method to a sparse linear system arising from a discretization of the one-dimensional Poisson equation. Numerical simulations illustrate the capability and effectiveness of the proposed method in comparison with well-known and recent methods.
1 Introduction
Linear systems play an essential role in modern applied mathematics, including numerical analysis, statistics, mathematical physics/biology, and engineering. In this paper, we develop an effective algorithm for solving rectangular linear systems. The proposed algorithm can be applied to most scientific models involving differential equations. As a model problem, we consider Poisson’s equation, which arises in many applications, for example, electromagnetics, fluid mechanics, heat flow, diffusion, and quantum mechanics.
Let us consider a (square) linear system \(Ax = b\) with given \(A \in M_{n}(\mathbb {R})\) and \(b \in \mathbb {R}^{n}\). Here we denote the set of m-by-n real matrices by \(M_{m,n}(\mathbb {R})\), and for square matrices, we set \(M_{n}(\mathbb {R}) = M_{n,n}(\mathbb {R})\). For solving linear systems, iterative methods have received much attention. In principle, an iterative method creates a sequence of numerical solutions so that, starting from an initial approximation, the iterate becomes an accurate solution after sufficiently many iterations. A group of methods for solving the linear system, called stationary iterative methods, can be expressed in the simple form
$$\begin{aligned} x(k+1) = Bx(k)+c, \end{aligned}$$
where B is the associated iteration matrix derived from the coefficient matrix A, and c is a vector derived from A and b. The Jacobi method, the Gauss–Seidel (GS) method, and the successive over-relaxation (SOR) method are three classical examples (see, e.g., [1, Ch. 10]) derived from the splitting
where D is a diagonal matrix, and L (U) is a lower (upper) triangular matrix. The SOR method has received much attention and has continually evolved into new iterative methods, for example the following (a code sketch of the generic stationary form is given after the list):
- Jacobi over-relaxation (JOR) method [2]:
$$\begin{aligned} B = D^{-1} \bigl( \alpha L+\alpha U+(1-\alpha)D \bigr), \qquad c = \alpha D^{-1}b,\quad\alpha>0. \end{aligned}$$
- Extrapolated SOR (ESOR) method [3]:
$$\begin{aligned}& B = (D-\omega L)^{-1} \bigl( (\tau-\omega)L+\tau U+(1-\tau)D \bigr), \\& c = \tau(D-\omega L)^{-1}b, \quad0< \vert \tau \vert < \omega< 2. \end{aligned}$$
- Accelerated over-relaxation (AOR) method [4]:
$$\begin{aligned}& B = (D+\alpha L)^{-1} \bigl((\alpha-\beta)L-\beta U+(1-\beta )D \bigr), \\& c = \beta(D+\alpha L)^{-1}b, \quad0< \alpha< \beta< 2. \end{aligned}$$
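As an illustration of the stationary form \(x(k+1)=Bx(k)+c\), the following is a minimal MATLAB sketch of the JOR iteration. It assumes the common splitting convention \(A=D-L-U\) with D the diagonal part of A (our assumption; the displayed splitting above is not reproduced here), and the test matrix is an arbitrary diagonally dominant example, not one taken from the paper.

```matlab
% Minimal JOR sketch: equivalent to x(k+1) = B*x(k) + c with
% B = D^{-1}(alpha*L + alpha*U + (1-alpha)*D) and c = alpha*D^{-1}*b,
% assuming the splitting A = D - L - U (a common convention).
A = [4 -1 0; -1 4 -1; 0 -1 4];  b = [1; 2; 3];   % arbitrary diagonally dominant example
alpha = 0.8;                                      % relaxation parameter, alpha > 0
Dinv  = diag(1 ./ diag(A));
x = zeros(3, 1);
for k = 1:100
    x = x + alpha * (Dinv * (b - A*x));           % one JOR sweep
end
disp(norm(A*x - b))                               % residual after 100 sweeps
```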
In the last decade, many researchers have developed gradient-based iterative algorithms for solving matrix equations based on the techniques of hierarchical identification and minimization of associated norm-error functions; see, for example, [5–24]. Convergence analysis for such algorithms relies on the Frobenius norm \(\Vert\cdot\Vert_{F}\), the spectral norm \(\Vert\cdot\Vert_{2}\), and the condition number, respectively defined for each \(A\in M_{m,n}(\mathbb {R})\) by
$$\begin{aligned} \Vert A\Vert_{F} = \sqrt{\operatorname{tr}\bigl(A^{T}A\bigr)}, \qquad \Vert A\Vert_{2} = \sqrt{\lambda _{\max }\bigl(A^{T}A\bigr)}, \qquad \kappa(A) = \sqrt{\lambda _{\max }\bigl(A^{T}A\bigr)/\lambda _{\min }\bigl(A^{T}A\bigr)}. \end{aligned}$$
Moreover, such techniques can be employed for any rectangular linear system of the form
$$\begin{aligned} Ax = b, \quad A\in M_{m,n}(\mathbb {R}), x\in \mathbb {R}^{n}, b\in \mathbb {R}^{m}. \end{aligned}$$ (1)
If A is assumed to be of full column rank, then the consistent system (1) has a unique solution \(x^{*} = (A^{T} A)^{-1}A^{T}b\). The following algorithms are derived from such techniques.
Proposition 1.1
Consider the linear system (1) with a full-column-rank matrix A. Let \(0<\mu< 2/\Vert A \Vert_{2}^{2}\) or \(0<\mu< 2/ \Vert A\Vert^{2}_{F}\). Then the iterative solution \(x(k)\) given by the gradient-based iterative (GI) algorithm
converges to a unique solution for any initial value \(x(0)\).
Proposition 1.2
Consider the linear system (1) with a full-column-rank matrix A. Let \(0<\mu<2\). Then the iterative solution \(x(k)\) given by the least-squares iterative (LS) algorithm
converges to a unique solution for any initial value \(x(0)\).
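The recursions of the GI and LS algorithms are not displayed above. The sketch below uses the forms commonly attributed to them in the cited literature, namely \(x(k+1)=x(k)+\mu A^{T}(b-Ax(k))\) for GI and \(x(k+1)=x(k)+\mu(A^{T}A)^{-1}A^{T}(b-Ax(k))\) for LS; these explicit forms are our assumption, not a quotation of Propositions 1.1 and 1.2.

```matlab
% Hedged sketch of the GI and LS updates in their commonly used forms
% (our assumption; the displayed recursions of Propositions 1.1-1.2 are not
% reproduced in this text).
rng(0);
A = randn(6, 3);  b = A * randn(3, 1);                 % consistent full-column-rank system
mu_gi = 1 / norm(A, 2)^2;                              % satisfies 0 < mu < 2/||A||_2^2
mu_ls = 1;                                             % satisfies 0 < mu < 2
G = inv(A' * A);                                       % (A^T A)^{-1}, used by the LS update
x_gi = zeros(3, 1);  x_ls = zeros(3, 1);
for k = 1:500
    x_gi = x_gi + mu_gi * (A' * (b - A*x_gi));         % assumed GI update
    x_ls = x_ls + mu_ls * (G * (A' * (b - A*x_ls)));   % assumed LS update
end
disp([norm(A*x_gi - b), norm(A*x_ls - b)])             % both residuals approach 0
```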
Another line of study for solving linear systems considers unconstrained convex optimization, where the gradient method along with the steepest descent is used (see, e.g., [27]). The steepest descent is a gradient algorithm in which the step size \(\alpha_{k}\) is chosen at each iteration to achieve the maximum decrease of the objective function. Suppose we would like to minimize a continuously differentiable function f on \(\mathbb {R}^{n}\). To do this, let \(x_{k}\) be the current iterate, and let \(g_{k}=\nabla f(x_{k})\) be the gradient vector at \(x_{k}\). The steepest descent method defines the next iteration by
$$\begin{aligned} x_{k+1} = x_{k}-\alpha^{*}_{k} g_{k}, \end{aligned}$$ (4)
where \(\alpha^{*}_{k}>0\) satisfies
$$\begin{aligned} f\bigl(x_{k}-\alpha^{*}_{k}g_{k}\bigr) = \min_{\alpha\geqslant0} f(x_{k}-\alpha g_{k}). \end{aligned}$$
Barzilai and Borwein [28] proposed step sizes that use information from the previous iteration. For the iterative equation (4), the step size \(\alpha_{k}\) can be chosen as
$$\begin{aligned} \alpha_{k} = \frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}y_{k-1}} \end{aligned}$$ (5)
or
$$\begin{aligned} \alpha_{k} = \frac{s_{k-1}^{T}y_{k-1}}{y_{k-1}^{T}y_{k-1}}, \end{aligned}$$ (6)
where \(s_{k-1}=x_{k}-x_{k-1}\) and \(y_{k-1}=g_{k}-g_{k-1}\). We call such an iterative method the BB method. Convergence analysis of the BB method is provided in [28, 29]. This idea has stimulated much research on the gradient method; see, for example, [30–34].
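A minimal MATLAB sketch of the BB iteration applied to the quadratic \(f(x)=\frac{1}{2}\Vert Ax-b\Vert_{F}^{2}\) is given below; the test matrix is an arbitrary example, and the step-size formulas follow the BB1/BB2 choices stated above.

```matlab
% BB method sketch for f(x) = (1/2)||Ax - b||^2 with the BB1 step size;
% the BB2 alternative is indicated in the comment.
rng(0);
A = randn(6, 3);  b = A * randn(3, 1);         % arbitrary consistent system
grad = @(x) A' * (A*x - b);                    % gradient of f
x_old = zeros(3, 1);  x = 0.1 * ones(3, 1);    % two distinct starting points
for k = 1:100
    s = x - x_old;  y = grad(x) - grad(x_old);
    alpha = (s' * s) / (s' * y);               % BB1; BB2 would be (s'*y)/(y'*y)
    x_old = x;
    x = x - alpha * grad(x);                   % x_{k+1} = x_k - alpha_k g_k
end
disp(norm(A*x - b))                            % residual after 100 iterations
```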
In the present paper, we propose a new gradient-based iterative algorithm with a sequence of optimal convergent factors for solving rectangular linear systems (see Sect. 2). We then carry out a convergence analysis of the proposed algorithm, including the convergence rate and error estimates (see Sect. 3). Numerical experiments are provided to illustrate the capability and effectiveness of the proposed algorithm in comparison with all mentioned algorithms (see Sect. 4). We also apply the algorithm to a sparse linear system arising from a discretization of the one-dimensional Poisson equation (see Sect. 5). Finally, we conclude the paper with some remarks in Sect. 6.
2 Proposing the algorithm
In this section, we introduce a new method for solving rectangular linear systems based on gradients, and we provide an appropriate sequence of convergent factors that minimizes an error at each iteration.
Consider the rectangular linear system (1), where \(A\in M_{m,n}(\mathbb {R})\) is a full-column-rank matrix, \(b\in \mathbb {R}^{m}\) is a known constant vector, and \(x\in \mathbb {R}^{n}\) is an unknown vector. We first define the quadratic norm-error function
$$\begin{aligned} f(x) = \frac{1}{2} \Vert b-Ax \Vert_{F}^{2}. \end{aligned}$$
Since A is of full column rank, the consistent system (1) has a unique solution, and hence an optimal vector \(x^{*}\) of f exists. We start with an arbitrary initial vector \(x(0)\), and then at every step \(k\geqslant0\), we iteratively move to the next vector \(x(k+1)\) in an appropriate direction, namely the negative gradient of f, together with a suitable step size \(\tau_{k+1}\). Thus the gradient-based iterative method can be described through the following recursive rule:
$$\begin{aligned} x(k+1) = x(k)-\tau_{k+1} \nabla f\bigl(x(k)\bigr). \end{aligned}$$
To minimize the function f, we compute its gradient explicitly. Indeed, we get
$$\begin{aligned} \nabla f(x) = -A^{T}(b-Ax) = A^{T}Ax-A^{T}b. \end{aligned}$$
Thus our iterative equation is of the form
$$\begin{aligned} x(k+1) = x(k)+\tau_{k+1} A^{T}\bigl(b-Ax(k)\bigr). \end{aligned}$$
To generate the best step size at each iteration, we recall the technique of the steepest descent, which minimizes the error occurring at each iteration. Thus we consider the error \(f(x(k+1))\) as a function of \(\tau\geqslant0\):
$$\begin{aligned} \phi_{k+1}(\tau) = f\bigl(x(k)+\tau A^{T}\bigl(b-Ax(k)\bigr)\bigr). \end{aligned}$$
Putting \(\tilde {c}= b - Ax(k)\) and \(\tilde {b}= AA^{T} \tilde {c}\), we get
$$\begin{aligned} \phi_{k+1}(\tau) = \frac{1}{2} \Vert \tilde {c}-\tau \tilde {b}\Vert_{F}^{2} = \frac{1}{2} \bigl( \operatorname{tr}\bigl(\tilde {c}\tilde {c}^{T}\bigr)-2\tau\operatorname{tr}\bigl(\tilde {b}\tilde {c}^{T}\bigr)+\tau^{2}\operatorname{tr}\bigl(\tilde {b}\tilde {b}^{T}\bigr) \bigr). \end{aligned}$$
To obtain the critical point, we differentiate with respect to τ and set the derivative to zero:
$$\begin{aligned} \frac{d}{d\tau}\phi_{k+1}(\tau) = -\operatorname{tr}\bigl(\tilde {b}\tilde {c}^{T}\bigr)+\tau\operatorname{tr}\bigl(\tilde {b}\tilde {b}^{T}\bigr) = 0, \end{aligned}$$
which gives \(\tau=\operatorname{tr}(\tilde {b}\tilde {c}^{T})/\operatorname{tr}(\tilde {b}\tilde {b}^{T})\). Note that the second derivative of \(\phi_{k+1}(\tau)\) is given by \(\operatorname{tr}(\tilde {b}\tilde {b}^{T})\), which is positive. Hence the minimizer of the function \(\phi_{k+1}(\tau)\) is
$$\begin{aligned} \tau_{k+1} = \frac{\operatorname{tr}(\tilde {b}\tilde {c}^{T})}{\operatorname{tr}(\tilde {b}\tilde {b}^{T})} = \frac{ \Vert A^{T}(b-Ax(k)) \Vert_{F}^{2}}{ \Vert AA^{T}(b-Ax(k)) \Vert_{F}^{2}}. \end{aligned}$$
We call \(\{\tau_{k+1}\}_{k=0}^{\infty}\) the sequence of optimal convergent factors. Now we summarize the search direction and optimal step size.
Algorithm 2.1
The steepest descent of gradient-based iterative algorithm.
Input step: Input \(A\in M_{m,n}(\mathbb {R})\) and \(b\in \mathbb {R}^{m}\).
Initializing step: Choose an initial vector \(x(0)\in \mathbb {R}^{n}\). Set \(k:=0\). Compute \(c=A^{T}b\), \(C=A^{T}A\), \(d=Ac\), \(D=AC\).
Updating step:
- \(\tau_{k+1}=\sum_{i=1}^{n} ( c_{i}-\sum_{j=1}^{n}C_{ij}x_{j}(k) )^{2}/\sum_{i=1}^{m} ( d_{i}-\sum_{j=1}^{n}D_{ij}x_{j}(k) )^{2}\),
- \(x(k+1)=x(k)+\tau_{k+1} ( c-Cx(k) )\),
- Set \(k:=k+1\) and repeat the updating step.
Here we denote by \(c_{i}\) the ith entry of c and by \(C_{ij}\) the \((i,j)\)th entry of C. To stop the algorithm, a stopping criterion is needed and can be described as \(\Vert b-Ax(k)\Vert_{F} < \epsilon\), where ϵ is a small positive number. Note that we introduce the vectors c, d and the matrices C, D to avoid duplicated computations.
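A direct MATLAB transcription of Algorithm 2.1 in matrix–vector form is sketched below; the function name, the tolerance ϵ, and the maximum number of iterations are illustrative choices, not part of the algorithm statement.

```matlab
function x = sd_gradient_iter(A, b, x0, epsilon, maxit)
% Sketch of Algorithm 2.1 (steepest descent of gradient-based iteration).
% A is assumed to have full column rank; x0 is an arbitrary initial vector.
    c = A' * b;   C = A' * A;          % initializing step: precomputed once
    d = A  * c;   D = A  * C;
    x = x0;
    for k = 1:maxit
        if norm(b - A*x) < epsilon     % stopping criterion from Sect. 2
            break
        end
        u = c - C*x;                   % search direction A^T(b - Ax(k))
        v = d - D*x;                   % equals A*u, used for the step size
        tau = (u' * u) / (v' * v);     % optimal convergent factor tau_{k+1}
        x = x + tau * u;               % update x(k+1)
    end
end
```

For instance, `x = sd_gradient_iter(A, b, zeros(size(A,2),1), 1e-8, 10000)` runs the iteration from the zero vector.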
3 Convergence analysis
In this section, we show that Algorithm 2.1 converges to the exact solution for any initial vector. Moreover, we provide the convergence rate, error estimates, and the iteration number corresponding to a given satisfactory error.
3.1 Convergence of the algorithm
The convergence analysis is based on a matrix partial order and strongly convex functions. Recall that the Löwner partial order ⪯ for real symmetric matrices is defined by \(A\preceq B\) if \(B-A\) is a positive semidefinite matrix or, equivalently, \(x^{T}A x\leqslant x^{T}Bx\) for all \(x\in \mathbb {R}^{n}\). A twice-differentiable convex function \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is said to be strongly convex if there exist constants \(0 \leqslant m < M\) such that for all \(x,y\in \mathbb {R}^{n}\),
$$\begin{aligned} m \Vert y \Vert_{2}^{2} \leqslant y^{T}\nabla^{2}f(x) y \leqslant M \Vert y \Vert_{2}^{2}. \end{aligned}$$
Using the definition of the partial order ⪯, this is equivalent to
$$\begin{aligned} mI \preceq\nabla^{2}f(x) \preceq MI \quad\text{for all } x\in \mathbb {R}^{n}. \end{aligned}$$
In other words, m (resp., M) is a lower (resp., upper) bound for the smallest (resp., largest) eigenvalue of \(\nabla^{2} f(x)\) for all x.
Lemma 3.1
([35])
If f is strongly convex on \(\mathbb {R}^{n}\), then for any \(x,y\in \mathbb {R}^{n}\),
$$\begin{aligned} f(x)+\nabla f(x)^{T}(y-x)+\frac{m}{2} \Vert y-x \Vert_{2}^{2} \leqslant f(y) \leqslant f(x)+\nabla f(x)^{T}(y-x)+\frac{M}{2} \Vert y-x \Vert_{2}^{2}. \end{aligned}$$
Theorem 3.2
If system (1) is consistent and A is of full column rank, then the sequence \(\{x(k)\}\) generated by Algorithm 2.1 converges to a unique solution for any initial vector \(x(0)\).
Proof
The hypothesis implies the existence of a unique solution \(x^{*}\) for the system. We will show that \(x(k)\rightarrow x^{*}\) as \(k\rightarrow\infty\). In case the gradient \(\nabla f(x(k))\) becomes the zero vector for some k, we have \(x(k)=x^{*}\), and the result holds. So assume that \(\nabla f(x(k))\neq0\) for all k. Since \(\nabla^{2}f(x)=A^{T}A\) is a positive semidefinite matrix, we have
Thus f is strongly convex. For convenience, we write \(\lambda _{\min }\) and \(\lambda _{\max }\) instead of \(\lambda _{\min }(A^{T}A)\) and \(\lambda _{\max }(A^{T}A)\), respectively. We consider the function \(\phi_{k+1}(\tau)\) of the step size τ. Applying Lemma 3.1, we obtain
Minimizing this inequality over τ, we see that the right-hand side (RHS) is minimized at \(\tau=1/\lambda _{\max }\), and thus
From the other inequality in Lemma 3.1 we have
We find that \(\tau= 1/\lambda _{\min }\) minimizes the RHS, that is,
Now \(\Vert\nabla f(x(k))\Vert^{2}_{F}\geqslant2\lambda _{\min }f(x(k))\), and hence by (11) we have
Since A is a full-column-rank matrix, the matrix \(A^{T}A\) is positive definite, that is, \(\lambda>0\) for all \(\lambda\in\sigma(A^{T}A)\). It follows that \(c:=1-\lambda _{\min }/\lambda _{\max }<1\) and
By induction we obtain that for any \(k \in \mathbb {N}\),
which shows that \(f(x(k)) \to0\) or, equivalently, \(x(k) \to x^{*}\) as \(k\rightarrow\infty\). □
3.2 Convergence rate and error estimates
From now on, denote \(\kappa= \kappa(A)\), the condition number of A. According to the proof of Theorem 3.2, bounds (12) and (13) give rise to the following estimates:
Theorem 3.3
Assume the hypothesis of Theorem 3.2. The asymptotic convergence rate of Algorithm 2.1 (with respect to the error \(\Vert Ax(k)-b \Vert_{F}\)) is governed by \(\sqrt{1-\kappa^{-2}}\). Moreover, the estimates of the error \(\Vert Ax(k)-b\Vert_{F}\) compared with the previous step and the first step are provided by (14) and (15), respectively. In particular, the relative error at each iteration is smaller than the previous (nonzero) one.
Now we recall the following properties.
Lemma 3.4
(e.g. [1])
For any matrices A and B of proper sizes, we have
- (i) \(\Vert A^{T} \Vert_{2}=\Vert A\Vert_{2}\),
- (ii) \(\Vert A^{T}A\Vert_{2}=\Vert A\Vert^{2}_{2}\),
- (iii) \(\Vert AB\Vert_{F}\leqslant\Vert A\Vert_{2}\Vert B\Vert_{F}\).
Theorem 3.5
Assume the hypothesis of Theorem 3.2. Then the error estimates \(\Vert x(k)-x^{*}\Vert_{F}\) compared with the previous step and the first step of Algorithm 2.1 are given as follows:
In particular, the asymptotic convergence rate of the algorithm is governed by \(\sqrt{1-\kappa^{-2}}\).
Proof
By (15) and Lemma 3.4 we obtain
Since the end behavior of this error depends on the term \((1- \kappa ^{-2})^{\frac{k}{2}} \), the asymptotic rate of convergence for the algorithm is governed by \(\sqrt{1-\kappa^{-2}}\). Similarly, from (14) and Lemma 3.4 we have
and thus we get (16). □
Hence the condition number κ of the coefficient matrix determines the asymptotic rate of convergence, as well as how far our initial point was from the exact solution. As κ gets closer to 1, the algorithm converges faster.
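As a quick illustration of Theorems 3.3 and 3.5, the following MATLAB sketch runs Algorithm 2.1 on an arbitrary randomly generated full-column-rank system (not one of the test problems in Sect. 4) and compares the observed per-step residual ratio with the factor \(\sqrt{1-\kappa^{-2}}\).

```matlab
% Empirical check that ||Ax(k+1)-b|| / ||Ax(k)-b|| stays below sqrt(1-kappa^(-2)).
% The system is an arbitrary random example, not a test problem from the paper.
rng(1);
A = randn(50, 20);  b = A * randn(20, 1);          % consistent rectangular system
kappa = sqrt(max(eig(A'*A)) / min(eig(A'*A)));     % condition number of A
bound = sqrt(1 - kappa^(-2));                      % theoretical per-step factor

c = A' * b;  C = A' * A;  d = A * c;  D = A * C;
x = zeros(20, 1);
for k = 1:30
    rOld = norm(b - A*x);
    u = c - C*x;  v = d - D*x;
    x = x + (norm(u)^2 / norm(v)^2) * u;           % Algorithm 2.1 update
    fprintf('k = %2d: ratio = %.4f, bound = %.4f\n', k, norm(b - A*x)/rOld, bound);
end
```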
Proposition 3.6
Let \(\{x(k)\}_{k=1}^{\infty}\) be the sequence of vectors generated by Algorithm 2.1. For each \(\epsilon>0\), we have \(\Vert Ax(k)-b \Vert_{F} < \epsilon\) after \(k^{*}\) iterations for any
Besides, for each \(\epsilon>0\), we have \(\Vert x(k) - x^{*} \Vert_{F} < \epsilon\) after \(k^{*}\) iterations for any
Proof
From (13) we have \(\Vert Ax(k)-b \Vert_{F} \leqslant (1-\kappa^{-2})^{k} \Vert Ax(0)-b \Vert_{F} \rightarrow0\) as \(k\rightarrow\infty\). This means precisely that for each \(\epsilon>0\), there is a positive integer N such that for all \(k\geqslant N\),
Taking the logarithm on both sides, we obtain (18). Another result can be obtained in a similar manner; here we start with approximation (17). □
From Proposition 3.6 we obtain the iteration number after which the relative error \(\Vert Ax(k)-b \Vert_{F}\) or the exact error \(\Vert x(k)-x^{*} \Vert_{F}\) reaches a prescribed decimal accuracy. Indeed, if p is the desired number of correct decimal digits, then the required iteration number is obtained by substituting \(\epsilon = 0.5\times10^{-p}\).
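The explicit bounds (17)–(18) are not reproduced above. As a hedged illustration, the sketch below computes an iteration count from the generic geometric bound \(\Vert Ax(k)-b\Vert_{F}\leqslant\rho^{k}\Vert Ax(0)-b\Vert_{F}\) with \(\rho=\sqrt{1-\kappa^{-2}}\), which is an assumption in the spirit of Theorem 3.3 rather than the paper's exact formula.

```matlab
% Iteration count for a target accuracy, assuming the geometric residual bound
% ||Ax(k)-b||_F <= rho^k * ||Ax(0)-b||_F with rho = sqrt(1 - kappa^(-2)).
% This is a hedged illustration; the paper's exact formulas (17)-(18) are not used.
kappa = 10;                          % example condition number
r0    = 1;                           % example initial residual ||Ax(0)-b||_F
p     = 6;                           % desired number of correct decimal digits
epsilon = 0.5 * 10^(-p);             % tolerance as in Sect. 3.2
rho   = sqrt(1 - kappa^(-2));
kstar = ceil(log(epsilon / r0) / log(rho));
fprintf('about %d iterations for kappa = %g and epsilon = %g\n', kstar, kappa, epsilon);
```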
Remark 3.7
A sharper bound for error estimation is obtained when the coefficient matrix A is a square matrix. However, the convergence rate is governed by the same value. Indeed, the condition that A is of full column rank is equivalent to the invertibility of A. Using Lemma 3.4, we have the following bound:
Similarly, we get \(\Vert x(k)-x^{*} \Vert_{F} \leqslant \sqrt{\kappa^{2} -1} \Vert x(k-1)-x^{*} \Vert_{F}\). Since \(\kappa\geqslant1\), these bounds are sharper than those in (16) and (17).
4 Numerical simulations for linear systems
In this section, we illustrate the effectiveness and capability of our algorithm. We report the comparison of TauOpt, our proposed algorithm, with the existing algorithms presented in the introduction, that is, GI (Proposition 1.1), LS (Proposition 1.2), BB1 (5), and BB2 (6). All computations were carried out in MATLAB R2018a on a PC with an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60 GHz and 8.00 GB RAM. To measure the computational time taken by each program, we use the tic and toc functions in MATLAB; the measured time is abbreviated CT. Readers should consider all reported results (errors, CTs, and figures) when comparing the performance of the algorithms. For each example, unless otherwise stated, we consider the following error at the kth iteration step:
Example 4.1
We consider the linear system \(Ax=b\) with
We choose an initial vector \(x(0)=10^{-6}[ 1 \ {-}1 ]^{T}\). Running Algorithm 2.1, we see from Table 1 that the approximated solutions converge to the exact solution \(x^{*}= [ -3 \ 4 ]^{T}\). Fig. 1 and Table 2 show the results when running 100 iterations. We can conclude that Algorithm 2.1 gives the fastest convergence.
Error for Ex. 4.1
Example 4.2
In this example, we consider the convergence rate of the algorithm. Let \(a \in \mathbb {R}\) and consider
Thus the condition number of the coefficient matrix depends on a. Taking different values of a, we obtain the results shown in Table 3 and Fig. 2. The simulations reveal that the closer the condition number is to 1, the faster the algorithm converges, which agrees with Theorems 3.3 and 3.5.
Error for Ex. 4.2
Example 4.3
We now consider a larger linear system. We would like to show that, for a coefficient matrix that lacks the properties required by the classical splitting methods, so that the approximated solutions from every other method diverge, our method still converges to the exact solution. Let
As we can see from Fig. 3, the natural logarithm errors \(\log{\Vert x(k)-x^{*}\Vert_{F}}\) for the Jacobi, GS, SOR, ESOR, AOR, and JOR methods diverge, whereas the error of our method continues to decrease toward 0. As a result, the approximated solutions from Algorithm 2.1 converge to the exact solution
with six-decimal accuracy after \(14{,}612\) iterations and CT = 0.3627 seconds.
Natural logarithm errors for Ex. 4.3
Errors for Ex. 4.4
Example 4.4
In this example, we consider a rectangular linear system whose coefficient matrix is of full column rank. We compare Algorithm 2.1 with the GI, LS, and BB algorithms. Let
The exact solution of the system \(Ax=b\) is given by
The results of running 100 iterations are provided in Fig. 4 and Table 4. Both show that Algorithm 2.1 outperforms the GI and LS algorithms. On the other hand, both types 1 and 2 of the BB algorithm are comparable with ours: the BB algorithm performs better per iteration, whereas our algorithm gives a better computation time.
Example 4.5
For this example, we use the sparse \(100 \times100\) matrix
where \(M = \operatorname {tridiag}(-1,2,-1)\), \(N = \operatorname {tridiag}(0.5,0,-0.5)\), \(r = 0.01\), and \(n = 16\) as in [36]. We choose an initial vector \(x(0)=[x_{i}]\in \mathbb {R}^{100}\), where \(x_{i}=10^{-6}\) for all \(i=1,\dots,100\), and take a random vector \(b\in \mathbb {R}^{100}\) with every element in \([-100,100]\). Since the exact solution is not known, it is appropriate to consider the relative error \(\Vert b-Ax(k)\Vert_{F}\). The numerical results after 500 iterations comparing our algorithm with the GI and LS algorithms are shown in Table 5 and Fig. 5. They reveal that our algorithm performs better than the GI and LS methods.
Relative errors for Ex. 4.5
5 Application to one-dimensional Poisson’s equation
We now discuss an application of the proposed algorithm to a sparse linear system arising from the one-dimensional Poisson equation
$$\begin{aligned} -u''(x) = f(x), \quad x\in(\alpha,\beta). \end{aligned}$$ (20)
Here \(u:(\alpha,\beta)\rightarrow \mathbb {R}\) is an unknown function to be approximated, and \(f:(\alpha,\beta)\rightarrow \mathbb {R}\) is a given function. The function u must satisfy the Dirichlet boundary conditions \(u(\alpha) = u(\beta) = 0\). We discretize the problem to approximate the solution at N partition points \(x_{i}\) between α and β: \(x_{i} = \alpha+ ih\), where \(h = (\beta-\alpha )/(N+1)\) and \(0\leqslant i \leqslant N+1\). We denote \(u_{i}=u(x_{i})\) and \(f_{i}=f(x_{i})\). By the centered second-order finite difference approximation we obtain
$$\begin{aligned} -u_{i-1}+2u_{i}-u_{i+1} = h^{2}f_{i}, \quad i=1,\dots,N, \end{aligned}$$
with \(u_{0}=u_{N+1}=0\).
Now we can put it into a linear system \(T_{N} u = h^{2}f\), where \(u = [u_{1} \ u_{2} \ \dots\ u_{N}]\) is an unknown vector, and the coefficient matrix \(T_{N} = \operatorname {tridiag}(-1,2,-1) \in M_{N}(\mathbb {R})\) is a tridiagonal matrix. Now the proposed algorithm for this sparse linear system is presented as follows.
Algorithm 5.1
The steepest descent of gradient-based iterative algorithm for solving one-dimensional Poisson’s equation.
Input step: Input \(N\in \mathbb {N}\) as the number of partition points.
Initializing step: Let \(h=(\beta-\alpha)/(N+1)\). For each \(i=1,\dots,N\), set \(x(i)=\alpha+ ih\) and \(f(i)=f(x(i))\). Compute \(g = h^{2}f\), \(s = T_{N}g\), \(S = T_{N}^{2}\), \(t = T_{N}s\), and \(T = T_{N}S\), where \(T_{N} = \operatorname {tridiag}(-1,2,-1)\in M_{N}(\mathbb {R})\). Choose an initial vector \(u(0)\in \mathbb {R}^{N}\) and set \(k:=0\).
Updating step:
- \(\tau_{k+1}=\sum_{i=1}^{N} ( s_{i}-\sum_{j=1}^{N}S_{ij}u_{j}(k) )^{2}/\sum_{i=1}^{N} ( t_{i}-\sum_{j=1}^{N}T_{ij}u_{j}(k) )^{2}\),
- \(u(k+1) = u(k)+\tau_{k+1} ( s-Su(k) )\),
- Set \(k:=k+1\) and repeat the updating step.
Here the stopping criterion is \(\Vert g-T_{N}u(k)\Vert_{F}<\epsilon\), where ϵ is a small positive number. Since the coefficient matrix \(T_{N}\) is a sparse matrix, the error norm can be described more precisely:
$$\begin{aligned} \Vert g-T_{N}u(k) \Vert_{F} = \Biggl( \sum_{i=1}^{N} \bigl( g_{i}+u_{i-1}(k)-2u_{i}(k)+u_{i+1}(k) \bigr)^{2} \Biggr)^{1/2}, \end{aligned}$$
where we set \(u_{0}(k)=u_{N+1}(k)=0\).
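A sparse MATLAB transcription of Algorithm 5.1 is sketched below; the function name, argument list, tolerance, and iteration cap are illustrative assumptions, and f is passed as a (vectorized) function handle.

```matlab
function u = sd_poisson_1d(f, alpha, beta, N, epsilon, maxit)
% Sketch of Algorithm 5.1 for -u'' = f with zero Dirichlet boundary values,
% using a sparse tridiagonal T_N.  Names and arguments are illustrative.
    h  = (beta - alpha) / (N + 1);
    x  = alpha + (1:N)' * h;                             % interior grid points
    TN = spdiags([-ones(N,1), 2*ones(N,1), -ones(N,1)], -1:1, N, N);
    g  = h^2 * f(x);                                     % right-hand side h^2 f
    s  = TN * g;   S = TN * TN;                          % precomputations of the
    t  = TN * s;   T = TN * S;                           % initializing step
    u  = zeros(N, 1);                                    % initial vector u(0)
    for k = 1:maxit
        if norm(g - TN*u) < epsilon, break; end          % stopping criterion
        p = s - S*u;                                     % search direction
        q = t - T*u;                                     % equals T_N * p
        u = u + ((p'*p) / (q'*q)) * p;                   % optimal step tau_{k+1}
    end
end
```

For example, `u = sd_poisson_1d(@(x) ones(size(x)), 0, 1, 64, 1e-10, 5000)` solves the problem with a constant right-hand side, an arbitrary choice made only for illustration.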
The eigenvalues of \(T_{N}\) are given by \(\lambda_{j}=2( 1-\cos\frac{j\pi }{N+1})\) for \(j=1,\dots,N\); see, for example, [1]. The smallest eigenvalue of \(T_{N}\) can be approximated by the second-order Taylor approximation:
$$\begin{aligned} \lambda _{\min }(T_{N}) = 2 \biggl( 1-\cos\frac{\pi}{N+1} \biggr) \approx \biggl( \frac{\pi}{N+1} \biggr)^{2}. \end{aligned}$$
Thus \(T_{N}\) is positive definite with condition number
$$\begin{aligned} \kappa = \frac{\lambda _{\max }(T_{N})}{\lambda _{\min }(T_{N})} = \frac{1-\cos\frac{N\pi}{N+1}}{1-\cos\frac{\pi}{N+1}} \approx\frac{4(N+1)^{2}}{\pi^{2}}. \end{aligned}$$ (21)
Now, according to Remark 3.7, the convergence analysis of Algorithm 5.1 can be described as follows.
Corollary 5.2
The discretization \(T_{N} u = h^{2} f\) of the Poisson equation (20) can be solved using Algorithm 5.1 so that the approximated solution \(u(k)\) converges to the exact solution \(u^{*}\) for any initial vector \(u(0)\). The asymptotic convergence rate of the algorithm is governed by \(\sqrt {1-{\kappa}^{-2}}\), where κ is given by (21).
Thus the convergence rate of the algorithm depends on the number N of partition points.
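A small MATLAB check of this dependence, comparing the exact condition number of \(T_{N}\) with the approximation \(4(N+1)^{2}/\pi^{2}\) discussed above (the closed-form eigenvalues are those stated before Corollary 5.2):

```matlab
% Compare the exact condition number of T_N = tridiag(-1,2,-1) with the
% approximation 4*(N+1)^2/pi^2; larger N gives a larger kappa and hence
% a convergence factor sqrt(1 - kappa^(-2)) closer to 1.
for N = [16 64 256]
    lambda = 2 * (1 - cos((1:N) * pi / (N+1)));    % eigenvalues of T_N
    kappaExact  = max(lambda) / min(lambda);
    kappaApprox = 4 * (N+1)^2 / pi^2;
    fprintf('N = %4d: kappa = %10.2f, approx = %10.2f, rate = %.6f\n', ...
            N, kappaExact, kappaApprox, sqrt(1 - kappaExact^(-2)));
end
```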
Example 5.3
We consider an application of our algorithm to the one-dimensional Poisson equation (20) with
and \(u=0\) on the boundary of \([0,\pi]\). We choose the initial vector \(u(0)=2\times \operatorname {ones}\), where ones is the vector containing 1 in every entry. We run Algorithm 5.1 with the matrix \(T_{N}\) of size \(64\times 64\). The analytical solution is
Figure 6 shows the result of our algorithm (right) compared to the analytical solution (left) after running 1000 iterations with CT = 0.0112 seconds.
The analytical solution (left) and the numerical solution (right) for Ex. 5.3
6 Conclusion
A new algorithm, the steepest descent of gradient-based iterative algorithm, is proposed for solving rectangular linear systems. The algorithm is applicable to any rectangular linear system and any initial point, with the only requirement that the coefficient matrix be of full column rank. We use an optimization technique to obtain a new formula for the convergent factor, which significantly enhances the convergence performance of the algorithm. The asymptotic rate of convergence is governed by \(\sqrt{1-\kappa ^{-2}}\), where κ is the condition number of the coefficient matrix. The numerical simulations in Sect. 4 illustrate the applicability and efficiency of the algorithm compared with the other algorithms mentioned in this paper. The iteration number and the CT indicate that our algorithm is a good choice for solving linear systems. Moreover, the sparse linear system arising from the one-dimensional Poisson equation can be solved efficiently using this algorithm. In our opinion, the techniques of gradients, steepest descent, and convex optimization might be useful for a class of matrix equations such as the Lyapunov and Sylvester equations. However, these topics require further study and are left for future research.
References
Demmel, J.W.: Applied Numerical Linear Algebra. Society for Industrial and Applied Mathematics, Philadelphia (1997)
Young, D.M.: Iterative Solution of Large Linear Systems. Academic Press, New York (1971)
Albrechtt, P., Klein, M.P.: Extrapolated iterative methods for linear systems. SIAM J. Numer. Anal. 21(1), 192–201 (1984)
Hallett, A.J.H.: The convergence of accelerated overrelaxation iterations. Math. Comput. 47(175), 219–223 (1986). https://doi.org/10.2307/2008090
Ding, F., Chen, T.: Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 50(8), 1216–1221 (2005). https://doi.org/10.1109/TAC.2005.852558
Ding, F., Chen, T.: Hierarchical gradient-based identification of multivariable discrete-time systems. Automatica 41(2), 315–325 (2005). https://doi.org/10.1016/j.automatica.2004.10.010
Ding, F., Chen, T.: Hierarchical least squares identification methods for multivariable systems. IEEE Trans. Autom. Control 50(3), 397–402 (2005). https://doi.org/10.1109/TAC.2005.843856
Ding, F., Liu, P.X., Ding, J.: Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 197(1), 41–50 (2008). https://doi.org/10.1016/j.amc.2007.07.040
Niu, Q., Wang, X., Lu, L.Z.: A relaxed gradient based algorithm for solving Sylvester equation. Asian J. Control 13(3), 461–464 (2011). https://doi.org/10.1002/asjc.328
Wang, X., Dai, L., Liao, D.: A modified gradient based algorithm for solving Sylvester equation. Appl. Math. Comput. 218(9), 5620–5628 (2012). https://doi.org/10.1016/j.amc.2011.11.055
Xie, Y., Ma, C.F.: The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation. Appl. Math. Comput. 273(15), 1257–1269 (2016). https://doi.org/10.1016/j.amc.2015.07.022
Zhang, X., Sheng, X.: The relaxed gradient based iterative algorithm for the symmetric (skew symmetric) solution of the Sylvester equation \({AX+XB=C}\). Math. Probl. Eng. 2017, Article ID 1624969 (2017). https://doi.org/10.1155/2017/1624969
Sheng, X., Sun, W.: The relaxed gradient based iterative algorithm for solving matrix equations. Comput. Math. Appl. 74(3), 597–604 (2017). https://doi.org/10.1016/j.camwa.2017.05.008
Sheng, X.: A relaxed gradient based algorithm for solving generalized coupled Sylvester matrix equations. J. Franklin Inst. 355(10), 4282–4297 (2018). https://doi.org/10.1016/j.jfranklin.2018.04.008
Li, M., Liu, X., Ding, F.: The gradient based iterative estimation algorithms for bilinear systems with autoregressive noise. Circuits Syst. Signal Process. 36(11), 4541–4568 (2017). https://doi.org/10.1007/s00034-017-0527-4
Sun, M., Wang, Y., Liu, J.: Two modified least-squares iterative algorithms for the Lyapunov matrix equations. Adv. Differ. Equ. 2019(1), Article ID 305 (2019). https://doi.org/10.1186/s13662-019-2253-7
Zhu, M.Z., Zhang, G.F., Qi, Y.E.: On single-step HSS iterative method with circulant preconditioner for fractional diffusion equations. Adv. Differ. Equ. 2019(1), Article ID 422 (2019). https://doi.org/10.1186/s13662-019-2353-4
Zhang, H.M., Ding, F.: A property of the eigenvalues of the symmetric positive definite matrix and the iterative algorithm for coupled Sylvester matrix equations. J. Franklin Inst. 351(1), 340–357 (2014). https://doi.org/10.1016/j.jfranklin.2013.08.023
Zhang, H.M., Ding, F.: Iterative algorithms for \({X+A^{T}X^{-1}A=I}\) by using the hierarchical identification principle. J. Franklin Inst. 353(5), 1132–1146 (2016). https://doi.org/10.1016/j.jfranklin.2015.04.003
Ding, F., Zhang, H.: Brief paper – Gradient-based iterative algorithm for a class of the coupled matrix equations related to control systems. IET Control Theory Appl. 8(15), 1588–1595 (2014). https://doi.org/10.1049/iet-cta.2013.1044
Xie, L., Ding, J., Ding, F.: Gradient based iterative solutions for general linear matrix equations. Comput. Math. Appl. 58(7), 1441–1448 (2009). https://doi.org/10.1016/j.camwa.2009.06.047
Xie, L., Liu, Y.J., Yang, H.Z.: Gradient based and least squares based iterative algorithms for matrix equations \({AXB+CX^{T}D=F}\). Appl. Math. Comput. 217(5), 2191–2199 (2010). https://doi.org/10.1016/j.amc.2010.07.019
Ding, F., Lv, L., Pan, J., et al.: Two-stage gradient-based iterative estimation methods for controlled autoregressive systems using the measurement data. Int. J. Control. Autom. Syst. 18, 886–896 (2020). https://doi.org/10.1007/s12555-019-0140-3
Ding, F., Xu, L., Meng, D., et al.: Gradient estimation algorithms for the parameter identification of bilinear systems using the auxiliary model. J. Comput. Appl. Math. 369, Article ID 112575 (2020). https://doi.org/10.1016/j.cam.2019.112575
Ding, F., Chen, T.: Iterative least-squares solutions of coupled Sylvester matrix equations. Syst. Control Lett. 54(2), 95–107 (2005). https://doi.org/10.1016/j.sysconle.2004.06.008
Ding, F., Chen, T.: On iterative solutions of general coupled matrix equations. SIAM J. Control Optim. 44(6), 2269–2284 (2006). https://doi.org/10.1137/S0363012904441350
Chong, E.K.P., Żak, S.H.: An Introduction to Optimization, 2nd edn. Wiley-Interscience, New York (2001)
Barzilai, J., Borwein, J.: Two point step size gradient methods. IMA J. Numer. Anal. 8(1), 141–148 (1988). https://doi.org/10.1093/imanum/8.1.141
Yuan, Y.X.: Step-sizes for the gradient method. AMS/IP Stud. Adv. Math. 42, 785–797 (2008)
Dai, Y.H., Yuan, J.Y., Yuan, Y.: Modified two-point step-size gradient methods for unconstrained optimization. Comput. Optim. Appl. 22, 103–109 (2002). https://doi.org/10.1023/A:1014838419611
Dai, Y.H., Fletcher, R.: On the asymptotic behaviour of some new gradient methods. Numerical analysis report NA/212, Department of Mathematics, University of Dundee, Scotland, UK (2003)
Dai, Y.H., Yuan, Y.: Analysis of monotone gradient methods. J. Ind. Manag. Optim. 1(2), 181–192 (2005). https://doi.org/10.3934/jimo.2005.1.181
Fletcher, R.: On the Barzilai–Borwein method. Research report, University of Dundee, Scotland, UK (2001)
Yuan, Y.: A new stepsize for the steepest descent method. Research report, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, China (2004)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Bai, Z.-Z.: On Hermitian and skew-Hermitian splitting iteration methods for continuous Sylvester equations. J. Comput. Math. 29(2), 185–198 (2011). https://doi.org/10.4208/jcm.1009-m3152
Acknowledgements
This work was supported by Thailand Science Research and Innovation (Thailand Research Fund).
Availability of data and materials
Not applicable.
Funding
The second author expresses his gratitude to Thailand Science Research and Innovation (Thailand Research Fund), Grant No. MRG6280040, for financial support.
Contributions
Both authors contributed equally and significantly in writing this article. Both authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kittisopaporn, A., Chansangiam, P. The steepest descent of gradient-based iterative method for solving rectangular linear systems with an application to Poisson’s equation. Adv Differ Equ 2020, 259 (2020). https://doi.org/10.1186/s13662-020-02715-9
DOI: https://doi.org/10.1186/s13662-020-02715-9
MSC
- 15A12
- 15A60
- 26B25
- 65F10
- 65N22
Keywords
- Rectangular linear system
- Iterative method
- Gradient
- Steepest descent
- Condition number
- Poisson’s equation