3 Line Search Methods

Each iteration of a line search method computes a search direction \(p_k\) and then decides how far to move along that direction. The iteration is given by

\[x_{k+1} = x_k + \alpha_k p_k\]

where the positive scalar \(\alpha_k\) is called the step length.

Most line search algorithms require \(p_k\) to be a descent direction – one for which \(p_k^\top \nabla f_k < 0\) – because this property guarantees that the function \(f\) can be reduced along this direction. Moreover, the search direction often has the form

\[p_k = - \mathbf{B}_k^{-1} \nabla f_k\]

where \(\mathbf{B}_k\) is a symmetric and nonsingular matrix. In the steepest descent method, \(\mathbf{B}_k\) is simply the identity matrix \(\mathbf{I}\). In Newton’s method, \(\mathbf{B}_k\) is the exact Hessian \(\nabla^2 f(x_k)\). In quasi-Newton methods, \(\mathbf{B}_k\) is an approximation to the Hessian updated by a low-rank formula. When \(\mathbf{B}_k\) is positive definite, we have

\[p_k^\top \nabla f_k = - \nabla f_k^\top \mathbf{B}_k^{-1} \nabla f_k < 0\]

and therefore \(p_k\) is a descent direction. In this chapter, we discuss how to choose \(\alpha_k\) and \(p_k\) to promote convergence from remote starting points. We also study the rate of convergence of steepest descent, quasi-Newton, and Newton methods. We also discuss modifications to Newton methods in Section 3.4.