# Conjugate Gradient and Quasi-Newton Methods

## Conjugate Gradients for Optimization

It is noted in Section 2.4.2.1 that part of the conjugate gradient method can be derived from the condition that $\boldsymbol{x}_{k+1}$ minimizes the convex quadratic function $\phi(\boldsymbol{x})=\frac{1}{2} \boldsymbol{x}^T A \boldsymbol{x}-\boldsymbol{b}^T \boldsymbol{x}$ over $\boldsymbol{x} \in \operatorname{span}\left\{\boldsymbol{p}_0, \boldsymbol{p}_1, \ldots, \boldsymbol{p}_k\right\}$, together with the property that $\boldsymbol{p}_i^T A \boldsymbol{p}_j=0$ for $i \neq j$ (the conjugacy property). The conjugacy condition implies that $\boldsymbol{x}_{k+1}-\boldsymbol{x}_k \in \operatorname{span}\left\{\boldsymbol{p}_k\right\}$, so that we can find $\boldsymbol{x}_{k+1}=\boldsymbol{x}_k+s_k \boldsymbol{p}_k$ by using an exact line search.

We can generalize this algorithm to functions $f(\boldsymbol{x})$ that are neither quadratic nor convex, and to inexact line search methods. In the process, however, we lose some properties of the method. For convex quadratic objective functions and exact line searches, we have the following properties of the iterates of the conjugate gradient method (recall that $\boldsymbol{r}_i=A \boldsymbol{x}_i-\boldsymbol{b}$):
$$\begin{array}{rll}
\boldsymbol{r}_i^T \boldsymbol{r}_j & =0 & \text { if } i \neq j, \\
\boldsymbol{r}_i^T \boldsymbol{p}_j & =0 & \text { if } j<i, \\
\boldsymbol{p}_i^T A \boldsymbol{p}_j & =0 & \text { if } i \neq j, \\
\operatorname{span}\left\{\boldsymbol{p}_0, \ldots, \boldsymbol{p}_k\right\} & =\operatorname{span}\left\{\boldsymbol{r}_0, \ldots, \boldsymbol{r}_k\right\} & \\
& =\operatorname{span}\left\{\boldsymbol{r}_0, A \boldsymbol{r}_0, \ldots, A^k \boldsymbol{r}_0\right\} & \text { for all } k \geq 0 .
\end{array}$$
In order to avoid flipping pages to compare with the original algorithm, we repeat the unpreconditioned standard linear conjugate gradient algorithm (Algorithm 27) as conjgrad2.
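
The listing of Algorithm 27 is not reproduced in this section. As a stand-in, here is a minimal pure-Python sketch of unpreconditioned linear conjugate gradients, using the sign convention $\boldsymbol{r}=A \boldsymbol{x}-\boldsymbol{b}$ from above. The names `conjgrad_linear`, `dot`, and `matvec` are hypothetical, and the line numbers cited in the text refer to the book's listing, not to this sketch.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product, with A given as a list of rows."""
    return [dot(row, x) for row in A]


def conjgrad_linear(A, b, x0, tol=1e-10, max_iter=None):
    """Sketch of linear CG for symmetric positive definite A.

    Uses the residual convention r = A x - b, so r is the gradient of
    phi(x) = 0.5 x^T A x - b^T x and the first direction is p = -r.
    """
    x = list(x0)
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    p = [-ri for ri in r]
    rr = dot(r, r)
    n_iter = len(b) if max_iter is None else max_iter
    for _ in range(n_iter):
        if rr ** 0.5 <= tol:
            break
        q = matvec(A, p)                  # the only place A is used
        alpha = rr / dot(p, q)            # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri + alpha * qi for ri, qi in zip(r, q)]   # residual update
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [-ri + beta * pi for ri, pi in zip(r, p)]   # new conjugate direction
        rr = rr_new
    return x


# Small illustrative system (made-up data): the solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_sol = conjgrad_linear(A, b, [0.0, 0.0])
```

Note that the matrix $A$ enters only through the product `matvec(A, p)`, which feeds both the step length and the residual update. Those are the two places that have to change when there is no matrix.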

From this, we will see how to derive the conjugate gradient algorithm for optimization. First, if $f(\boldsymbol{x})=\frac{1}{2} \boldsymbol{x}^T A \boldsymbol{x}-\boldsymbol{b}^T \boldsymbol{x}+c$, then the residual is $\boldsymbol{r}=A \boldsymbol{x}-\boldsymbol{b}=\nabla f(\boldsymbol{x})$. So we replace the computation of the residual $\boldsymbol{r}_k$ on line 7 with $\boldsymbol{r}_k \leftarrow \nabla f\left(\boldsymbol{x}_k\right)$. The other thing to remember is that in the general situation, there is no matrix $A$. The closest thing we have to $A$ for a general smooth function $f$ is $\operatorname{Hess} f(\boldsymbol{x})$, the Hessian matrix of $f$ at $\boldsymbol{x}$. But in general, $\operatorname{Hess} f(\boldsymbol{x})$ is neither constant nor easily computable, so we should avoid any explicit reference to $A$. These references occur on line 4 in computing $\boldsymbol{q}_k$, which is used to compute the step length $\alpha_k$ on line 5 and the update of the residual on line 7. For the step length, we simply use a suitable line search algorithm. This gives Algorithm 81 (conjgradoptFR) below, which is known as the Fletcher-Reeves conjugate gradient algorithm.
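
The listing of Algorithm 81 is not included in this section. The following is a minimal sketch of the Fletcher-Reeves iteration in pure Python, not the book's listing. It assumes a simple bisection/expansion line search that targets the strong Wolfe conditions (a simplified stand-in for a production line search), a steepest-descent restart as a defensive safeguard, and a made-up convex test function. All function names here are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def strong_wolfe_search(f, grad, x, p, c1=1e-4, c2=0.1, max_iter=60):
    """Bisection/expansion search for a step meeting the strong Wolfe conditions.

    Returns the step length, or None if none is found within max_iter trials.
    """
    f0, slope0 = f(x), dot(grad(x), p)       # slope0 < 0 for a descent direction
    lo, hi, s = 0.0, math.inf, 1.0
    for _ in range(max_iter):
        x_new = [xi + s * pi for xi, pi in zip(x, p)]
        slope = dot(grad(x_new), p)
        if f(x_new) > f0 + c1 * s * slope0:  # sufficient decrease fails: too long
            hi = s
        elif slope < c2 * slope0:            # still steeply downhill: too short
            lo = s
        elif slope > -c2 * slope0:           # slope too positive: overshoot
            hi = s
        else:
            return s
        s = 2.0 * lo if math.isinf(hi) else 0.5 * (lo + hi)
    return None


def conjgrad_opt_fr(f, grad, x0, tol=1e-6, max_iter=500):
    """Fletcher-Reeves nonlinear CG sketch: r_k = grad f(x_k), no matrix A."""
    x = list(x0)
    r = grad(x)
    p = [-ri for ri in r]
    rr = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rr) <= tol:
            break
        if dot(r, p) >= 0.0:                 # safeguard (an addition): restart
            p = [-ri for ri in r]
        s = strong_wolfe_search(f, grad, x, p)
        if s is None:                        # line search gave up: stop early
            break
        x = [xi + s * pi for xi, pi in zip(x, p)]
        r = grad(x)                          # replaces the linear residual update
        rr_new = dot(r, r)
        beta = rr_new / rr                   # Fletcher-Reeves formula
        p = [-ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Made-up smooth convex test function with minimizer (1, -0.5).
def f_test(x):
    u, v = x[0] - 1.0, x[1] + 0.5
    return u ** 4 + u ** 2 + 2.0 * v ** 2


def grad_test(x):
    u, v = x[0] - 1.0, x[1] + 0.5
    return [4.0 * u ** 3 + 2.0 * u, 4.0 * v]


x_start = [-1.0, 1.0]
x_min = conjgrad_opt_fr(f_test, grad_test, x_start)
```

Compared with the linear iteration, the step length comes from the line search rather than from a formula involving $A$, and the gradient is recomputed at each new point instead of being updated from $A \boldsymbol{p}_k$.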

## Line Search Algorithms for Conjugate Gradient Optimization

Now we need to ask: what makes for a suitable line search algorithm for this generalized conjugate gradient algorithm? All of the line search algorithms we have discussed assume that the direction of the line search (here $\boldsymbol{p}_k$) is a descent direction: $\boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k\right)=\boldsymbol{p}_k^T \boldsymbol{r}_k<0$. This is clearly true for $k=0$ as $\boldsymbol{p}_0=-\boldsymbol{r}_0$. But can we guarantee this will be true for $k=1,2, \ldots$? If we use exact line searches, then $s_k$ minimizes $f\left(\boldsymbol{x}_k+s \boldsymbol{p}_k\right)$ over all $s>0$, and so $\boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k+s_k \boldsymbol{p}_k\right)=\boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_{k+1}\right)=\boldsymbol{p}_k^T \boldsymbol{r}_{k+1}=0$. From line 9, we then have
$$\begin{aligned} \boldsymbol{p}_{k+1}^T \boldsymbol{r}_{k+1} & =\left(-\boldsymbol{r}_{k+1}+\beta_k \boldsymbol{p}_k\right)^T \boldsymbol{r}_{k+1} \\ & =-\boldsymbol{r}_{k+1}^T \boldsymbol{r}_{k+1}<0 \end{aligned}$$
provided $\boldsymbol{r}_{k+1} \neq \mathbf{0}$, as we wanted. But in practice, we can only approximate exact line searches, and attempting something close to an exact line search can be very expensive in terms of function evaluations.

Since the descent direction condition involves gradients, we need a line search method that involves $\nabla f(\boldsymbol{x})$. This leaves line searches based on the Wolfe conditions (8.3.7, 8.3.8) as the only practical way to properly implement conjugate gradient methods for optimization. As you may recall, the Wolfe conditions have two parameters: $c_1$ for the sufficient decrease criterion, and $c_2$ for the curvature condition. In order to guarantee the existence of a step satisfying (8.3.7, 8.3.8), we need $f$ to be smooth and bounded below, and $0<c_1<c_2<1$, with $c_1>0$ usually small.
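
Equations (8.3.7, 8.3.8) are not reproduced in this section. In their usual textbook form, which is assumed here to match them, the Wolfe conditions on a step $s$ are $f\left(\boldsymbol{x}_k+s \boldsymbol{p}_k\right) \leq f\left(\boldsymbol{x}_k\right)+c_1 s\, \boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k\right)$ (sufficient decrease) and $\boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k+s \boldsymbol{p}_k\right) \geq c_2\, \boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k\right)$ (curvature). A small sketch that tests a candidate step against both, with the hypothetical name `wolfe_conditions_hold` and a made-up one-dimensional example:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wolfe_conditions_hold(f, grad, x, p, s, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for step s along direction p.

    Assumes the standard textbook form of the two conditions and 0 < c1 < c2 < 1.
    """
    slope0 = dot(grad(x), p)                       # directional derivative at s = 0
    x_new = [xi + s * pi for xi, pi in zip(x, p)]
    sufficient_decrease = f(x_new) <= f(x) + c1 * s * slope0
    curvature = dot(grad(x_new), p) >= c2 * slope0
    return sufficient_decrease and curvature


# Illustration on f(x) = x^2 from x = 1 along the steepest descent direction p = -2.
def f_quad(x):
    return x[0] ** 2


def grad_quad(x):
    return [2.0 * x[0]]


x0, p0 = [1.0], [-2.0]
ok_exact = wolfe_conditions_hold(f_quad, grad_quad, x0, p0, 0.5)    # lands on the minimizer
ok_short = wolfe_conditions_hold(f_quad, grad_quad, x0, p0, 1e-3)   # fails the curvature test
ok_long = wolfe_conditions_hold(f_quad, grad_quad, x0, p0, 1.0)     # fails sufficient decrease
```

The curvature condition rules out steps that are too short, and the sufficient decrease condition rules out steps that are too long. Both tests need only function and gradient values, with no Hessian information.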

If exact line searches are used, so that $\boldsymbol{p}_k^T \nabla f\left(\boldsymbol{x}_k+s_k \boldsymbol{p}_k\right)=0$, then we can guarantee that $\boldsymbol{p}_{k+1}$ is a descent direction. If we use a Wolfe condition-based line search, what value of $c_2>0$ can guarantee the same property? For the Fletcher-Reeves conjugate gradient method, it turns out that $0<c_2<\frac{1}{2}$ is sufficient to ensure that $\boldsymbol{p}_{k+1}$ is a descent direction.
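
One way to see where the bound $\frac{1}{2}$ comes from is the following induction sketch. It assumes the strong form of the curvature condition, $\left|\boldsymbol{p}_k^T \boldsymbol{r}_{k+1}\right| \leq c_2\left|\boldsymbol{p}_k^T \boldsymbol{r}_k\right|$, which may be stronger than the condition as stated in (8.3.8), together with the Fletcher-Reeves choice $\beta_k=\boldsymbol{r}_{k+1}^T \boldsymbol{r}_{k+1} / \boldsymbol{r}_k^T \boldsymbol{r}_k$. Dividing $\boldsymbol{p}_{k+1}^T \boldsymbol{r}_{k+1}=-\boldsymbol{r}_{k+1}^T \boldsymbol{r}_{k+1}+\beta_k \boldsymbol{p}_k^T \boldsymbol{r}_{k+1}$ by $\left\|\boldsymbol{r}_{k+1}\right\|^2$ gives

$$\frac{\boldsymbol{p}_{k+1}^T \boldsymbol{r}_{k+1}}{\left\|\boldsymbol{r}_{k+1}\right\|^2}=-1+\frac{\boldsymbol{p}_k^T \boldsymbol{r}_{k+1}}{\left\|\boldsymbol{r}_k\right\|^2} .$$

Suppose that $-\frac{1}{1-c_2} \leq \boldsymbol{p}_k^T \boldsymbol{r}_k /\left\|\boldsymbol{r}_k\right\|^2<0$, which holds for $k=0$ where the ratio is $-1$. Then the strong curvature condition gives $\left|\boldsymbol{p}_k^T \boldsymbol{r}_{k+1}\right| \leq-c_2\, \boldsymbol{p}_k^T \boldsymbol{r}_k \leq \frac{c_2}{1-c_2}\left\|\boldsymbol{r}_k\right\|^2$, and so

$$-\frac{1}{1-c_2}=-1-\frac{c_2}{1-c_2} \leq \frac{\boldsymbol{p}_{k+1}^T \boldsymbol{r}_{k+1}}{\left\|\boldsymbol{r}_{k+1}\right\|^2} \leq-1+\frac{c_2}{1-c_2}=\frac{2 c_2-1}{1-c_2} .$$

The upper bound is negative exactly when $c_2<\frac{1}{2}$, so under these assumptions the induction hypothesis carries over to $k+1$ and every $\boldsymbol{p}_k$ is a descent direction.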

