Coordinate Descent

Main Idea

In Unconstrained Optimization, coordinate descent can be viewed as applying a Preconditioner to Gradient Descent (see Newton's Method). The preconditioner restricts the descent direction to a single coordinate, i.e., one entry of the vector being optimized.

Consider

$$\min_x f(x).$$

Gradient descent iterates as

$$x_{k+1} = x_k - \eta \nabla f(x_k).$$
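
For concreteness, here is a minimal NumPy sketch of this iteration; the quadratic $f(x) = \tfrac{1}{2} x^\top A x - b^\top x$ (with $\nabla f(x) = Ax - b$), the step size, and the iteration count are assumptions chosen for illustration, not part of the original note.

```python
import numpy as np

# Assumed test problem: f(x) = 0.5 x^T A x - b^T x, so grad f(x) = A x - b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A @ x - b

eta = 0.2                    # fixed step size (assumed)
x = np.zeros(2)
for _ in range(200):
    x = x - eta * grad_f(x)  # x_{k+1} = x_k - eta * grad f(x_k)
print(x)                     # approaches the minimizer, i.e. the solution of A x = b
```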

Preconditioning with a matrix $T_k$ (ideally the inverse of the Hessian) gives

$$x_{k+1} = x_k - \eta T_k \nabla f(x_k).$$
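
On the same assumed quadratic, a sketch of one preconditioned step with $T_k$ taken to be the exact inverse Hessian $A^{-1}$ (constant here, since $f$ is quadratic), before $T_k$ is specialized below:

```python
import numpy as np

# Same assumed quadratic as above: grad f(x) = A x - b, Hessian = A.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A @ x - b

T = np.linalg.inv(A)             # ideal preconditioner: inverse Hessian
x = np.zeros(2)
x = x - 1.0 * T @ grad_f(x)      # one step with eta = 1
print(x, np.allclose(A @ x, b))  # on a quadratic, a single such step reaches the minimizer
```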

$T_k$ is chosen so that

$$T_k \nabla f(x_k) = [\nabla f(x_k)]_i \, e_i,$$

which is achieved by $T_k = e_i e_i^\top$, where $e_i$ is the $i$-th standard basis vector: $e_i e_i^\top \nabla f(x_k) = e_i \left( e_i^\top \nabla f(x_k) \right) = [\nabla f(x_k)]_i \, e_i$.
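
Forming $T_k = e_i e_i^\top$ explicitly makes the connection concrete: the preconditioned update changes only the $i$-th entry of $x$. A minimal sketch of cyclic coordinate descent on the same assumed quadratic (the step size and coordinate schedule are again assumptions):

```python
import numpy as np

# Same assumed quadratic as above: grad f(x) = A x - b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A @ x - b

n, eta = 2, 0.3
x = np.zeros(n)
for k in range(100):
    i = k % n                      # cycle through the coordinates
    e_i = np.zeros(n)
    e_i[i] = 1.0
    T_k = np.outer(e_i, e_i)       # T_k = e_i e_i^T
    x = x - eta * T_k @ grad_f(x)  # only entry i of x changes
print(x, np.allclose(A @ x, b))
```

In practice $T_k$ is never formed: the update reduces to `x[i] -= eta * grad_f(x)[i]`, and one would compute only the $i$-th partial derivative rather than the full gradient.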