Verifying gradients and Hessians

If you have computed a gradient or differential and you are not sure whether it is correct, the following functions let you verify it numerically.

Manopt.check_Hessian - Function
check_Hessian(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M; vector_at=p); kwargs...)

Verify numerically whether the Hessian $\operatorname{Hess} f(M,p, X)$ of f(M,p) is correct.

For this either a second-order retraction or a critical point $p$ of f is required. The approximation is then

\[f(\operatorname{retr}_p(tX)) = f(p) + t⟨\operatorname{grad} f(p), X⟩ + \frac{t^2}{2}⟨\operatorname{Hess}f(p)[X], X⟩ + \mathcal O(t^3)\]

or in other words, that the error between the function $f$ and its second-order Taylor expansion behaves like $\mathcal O(t^3)$, which indicates that the Hessian is correct, cf. also [Bou23, Section 6.8].

Note that if the errors are below the given tolerance and the method is exact, no plot is generated.

Keyword arguments

  • check_grad: (true) verify that $\operatorname{grad} f(p) ∈ T_p\mathcal M$.

  • check_linearity: (true) verify that the Hessian is linear, see is_Hessian_linear using a, b, X, and Y

  • check_symmetry: (true) verify that the Hessian is symmetric, see is_Hessian_symmetric

  • check_vector: (false) verify that $\operatorname{Hess} f(p)[X] ∈ T_p\mathcal M$ using is_vector.

  • mode: (:Default) specify the mode for the verification. The default assumes that the retraction provided is of second order. Otherwise, the Hessian can also be verified at a critical point p: set the mode to :CriticalPoint to use gradient_descent to find a critical point. Note: this requires (and evaluates) new tangent vectors X and Y.

  • atol, rtol: (same defaults as isapprox) tolerances that are passed down to all checks

  • a, b: two real values to verify linearity of the Hessian (if check_linearity=true)

  • N: (101) number of points to verify within the log_range default range $[10^{-8},10^{0}]$

  • exactness_tol: (1e-12) if all errors are below this tolerance, the verification is considered to be exact

  • io: (nothing) provide an IO to print the result to

  • gradient: (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly

  • Hessian: (Hess_f(M, p, X)) instead of the Hessian function you can provide the result of $\operatorname{Hess} f(p)[X]$ directly. Note that evaluations of the Hessian might still be necessary for checking linearity and symmetry and/or when using :CriticalPoint mode.

  • limits: ((1e-8,1)) specify the limits in the log_range

  • log_range: (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the Hessian line

  • plot: (false) whether to plot the resulting verification (requires Plots.jl to be loaded). The plot is in log-log-scale. This is returned and can then also be saved.

  • retraction_method: (default_retraction_method(M, typeof(p))) retraction method to use

  • slope_tol: (0.1) tolerance for the slope (global) of the approximation

  • error: (:none) how to handle errors, possible values: :error, :info, :warn

  • window: (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.

The kwargs... are also passed down to the check_vector and the check_gradient call, such that tolerances can easily be set.

Both check_vector and retraction_method are also passed to the inner call to check_gradient. That inner call serves only as an internal verification, so it neither throws an error nor produces a plot itself.
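To make the third-order criterion concrete, the following plain-Julia sketch reproduces the idea without calling Manopt. It evaluates the error of the second-order Taylor model for an illustrative cost f(p) = pᵀAp on the unit sphere along the exponential map, and fits the slope of that error in log-log scale. The matrix A, the helper names exp_sphere and taylor_error, and the sampling range are assumptions chosen for illustration; this is not a description of what check_Hessian does internally.

```julia
using LinearAlgebra

# Illustrative cost on the unit sphere in ℝ³ (an assumption, not from the docs)
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
f(p) = dot(p, A * p)
# Riemannian gradient: Euclidean gradient 2Ap projected onto the tangent space at p
grad_f(p) = 2 * (A * p - dot(p, A * p) * p)
# Riemannian Hessian: projected Euclidean Hessian 2AX minus the curvature term ⟨p, 2Ap⟩X
Hess_f(p, X) = 2 * (A * X - dot(p, A * X) * p) - 2 * dot(p, A * p) * X

# Exponential map on the sphere for a unit tangent vector X (a second-order retraction)
exp_sphere(p, X, t) = cos(t) * p + sin(t) * X

p = normalize([1.0, 2.0, 3.0])
V = [1.0, -1.0, 0.5]
X = normalize(V - dot(p, V) * p)   # project onto the tangent space, then normalize

# Error between f and its second-order Taylor model at step size t
function taylor_error(t)
    model = f(p) + t * dot(grad_f(p), X) + t^2 / 2 * dot(Hess_f(p, X), X)
    return abs(f(exp_sphere(p, X, t)) - model)
end

ts = exp10.(range(-3, -2; length=21))
errors = taylor_error.(ts)

# Least-squares line through (log10 t, log10 error); its slope estimates the order
coeffs = [ones(length(ts)) log10.(ts)] \ log10.(errors)
slope = coeffs[2]
println("estimated slope: ", round(slope; digits=2))
```

A slope close to 3 is what a correct Hessian should produce here. Dropping the curvature term from Hess_f leaves a nonzero second-order error, so the slope would be expected to fall towards 2.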

Manopt.check_differential - Function
check_differential(M, F, dF, p=rand(M), X=rand(M; vector_at=p); kwargs...)

Check numerically whether the differential dF(M,p,X) of F(M,p) is correct.

This implements the method described in [Bou23, Section 4.8].

Note that if the errors are below the given tolerance and the method is exact, no plot is generated.

Keyword arguments

  • exactness_tol: (1e-12) if all errors are below this tolerance, the differential is considered to be exact
  • io: (nothing) provide an IO to print the result to
  • limits: ((1e-8,1)) specify the limits in the log_range
  • log_range: (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the differential line
  • N: (101) number of points to verify within the log_range default range $[10^{-8},10^{0}]$
  • name: ("differential") name to display in the plot
  • plot: (false) whether to plot the result (if Plots.jl is loaded). The plot is in log-log-scale. This is returned and can then also be saved.
  • retraction_method: (default_retraction_method(M, typeof(p))) retraction method to use
  • slope_tol: (0.1) tolerance for the slope (global) of the approximation
  • throw_error: (false) throw an error message if the differential is wrong
  • window: (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.
Manopt.check_gradient - Function
check_gradient(M, F, gradF, p=rand(M), X=rand(M; vector_at=p); kwargs...)

Verify numerically whether the gradient gradF(M,p) of F(M,p) is correct, that is whether

\[f(\operatorname{retr}_p(tX)) = f(p) + t⟨\operatorname{grad} f(p), X⟩ + \mathcal O(t^2)\]

or in other words, that the error between the function $f$ and its first-order Taylor expansion behaves like $\mathcal O(t^2)$, which indicates that the gradient is correct, cf. also [Bou23, Section 4.8].

Note that if the errors are below the given tolerance and the method is exact, no plot is generated.

Keyword arguments

  • check_vector: (true) verify that $\operatorname{grad} f(p) ∈ T_p\mathcal M$ using is_vector.
  • exactness_tol: (1e-12) if all errors are below this tolerance, the gradient is considered to be exact
  • io: (nothing) provide an IO to print the result to
  • gradient: (gradF(M, p)) instead of the gradient function you can also provide the gradient at p directly
  • limits: ((1e-8,1)) specify the limits in the log_range
  • log_range: (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the gradient line
  • N: (101) number of points to verify within the log_range default range $[10^{-8},10^{0}]$
  • plot: (false) whether to plot the result (if Plots.jl is loaded). The plot is in log-log-scale. This is returned and can then also be saved.
  • retraction_method: (default_retraction_method(M, typeof(p))) retraction method to use
  • slope_tol: (0.1) tolerance for the slope (global) of the approximation
  • atol, rtol: (same defaults as isapprox) tolerances that are passed down to is_vector if check_vector is set to true
  • error: (:none) how to handle errors, possible values: :error, :info, :warn
  • window: (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.

The remaining keyword arguments are also passed down to the check_vector call, such that tolerances can easily be set.
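As a usage sketch, the call below checks a gradient on the sphere, once with a correct implementation and once with a deliberately wrong one. It assumes that Manifolds.jl is installed and provides the Sphere manifold; the cost f(p) = pᵀAp, the matrix A, and the faulty half_grad_f are illustrative choices and not part of the documentation above.

```julia
using Manopt, Manifolds, LinearAlgebra

# Illustrative problem (an assumption, not from the docs): f(p) = pᵀAp on the unit sphere
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
M = Sphere(2)

f(M, p) = dot(p, A * p)
# Riemannian gradient: Euclidean gradient 2Ap projected onto the tangent space at p
grad_f(M, p) = 2 * (A * p - dot(p, A * p) * p)
# Deliberately wrong (hypothetical mistake): the factor 2 is missing
half_grad_f(M, p) = A * p - dot(p, A * p) * p

p = normalize([1.0, 2.0, 3.0])
V = [1.0, -1.0, 0.5]
X = V - dot(p, V) * p   # a tangent vector at p

# With the default error=:none and plot=false, both calls return a Bool
gradient_ok = check_gradient(M, f, grad_f, p, X)
half_gradient_ok = check_gradient(M, f, half_grad_f, p, X)
println((gradient_ok, half_gradient_ok))
```

The faulty gradient is still a tangent vector, so it passes check_vector; it should be rejected only because the Taylor error then decays with slope 1 instead of 2. Passing error=:warn or error=:info, as listed above, reports the reason for a failed check.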

Manopt.is_Hessian_linear - Function
is_Hessian_linear(M, Hess_f, p,
    X=rand(M; vector_at=p), Y=rand(M; vector_at=p), a=randn(), b=randn();
    error=:none, io=nothing, kwargs...
)

Verify whether the Hessian function Hess_f fulfills linearity,

\[\operatorname{Hess} f(p)[aX + bY] = a\operatorname{Hess} f(p)[X] + b\operatorname{Hess} f(p)[Y]\]

This is checked using isapprox; the remaining keyword arguments are passed on to that function.

Optional arguments

  • error: (:none) how to handle errors, possible values: :error, :info, :warn
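The identity being tested can be sketched in plain Julia without Manopt. The example compares both sides of the linearity equation with isapprox for a Hessian on the unit sphere and for an affine variant that violates it. The matrix A, the fixed values of a and b, and the helper names tangent, is_linear, and affine_Hess_f are assumptions made for illustration.

```julia
using LinearAlgebra

# Illustrative setting (an assumption): Hessian of f(p) = pᵀAp on the unit sphere in ℝ³
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
Hess_f(p, X) = 2 * (A * X - dot(p, A * X) * p) - 2 * dot(p, A * p) * X
# Hypothetical faulty version: a constant tangent offset makes it affine, not linear
affine_Hess_f(p, X) = Hess_f(p, X) + 0.1 * (A * p - dot(p, A * p) * p)

p = normalize([1.0, 2.0, 3.0])
tangent(V) = V - dot(p, V) * p   # orthogonal projection onto the tangent space at p
X = tangent([1.0, -1.0, 0.5])
Y = tangent([0.0, 2.0, -1.0])
a, b = 0.7, -1.3

# Compare H[aX + bY] with a H[X] + b H[Y], using the default isapprox tolerances
is_linear(H) = isapprox(H(p, a * X + b * Y), a * H(p, X) + b * H(p, Y))
println(is_linear(Hess_f), " ", is_linear(affine_Hess_f))
```

The two sides of the affine variant differ by (1 - a - b) times the offset, so coefficients with a + b = 1 would hide this particular mistake. The random defaults a=randn(), b=randn() in the signature above make that coincidence unlikely.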
Manopt.is_Hessian_symmetric - Function
is_Hessian_symmetric(M, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M; vector_at=p);
    error=:none, io=nothing, atol::Real=0, rtol::Real=atol>0 ? 0 : √eps
)

Verify whether the Hessian function Hess_f fulfills symmetry, which means that

\[⟨\operatorname{Hess} f(p)[X], Y⟩ = ⟨X, \operatorname{Hess} f(p)[Y]⟩\]

which is checked using isapprox and the kwargs... are passed to this function.

Optional arguments

  • atol, rtol: with the same defaults as the usual isapprox
  • error: (:none) how to handle errors, possible values: :error, :info, :warn
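The symmetry condition can likewise be sketched in plain Julia without Manopt. On the unit sphere the Riemannian inner product is the Euclidean dot product restricted to the tangent space, so both sides of the equation reduce to calls to dot. The matrices A and B and the helper names sphere_hessian, tangent, and symmetric_at_p are assumptions made for illustration; building the operator from the non-symmetric B stands in for a faulty implementation.

```julia
using LinearAlgebra

# Illustrative matrices (assumptions): A is symmetric, B is not
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
B = [2.0 1.0 0.0; 0.0 3.0 1.0; 0.0 0.0 4.0]

# Operator of the form used for f(p) = pᵀCp on the unit sphere;
# it is self-adjoint on the tangent space only if C is symmetric
sphere_hessian(C) = (p, X) -> 2 * (C * X - dot(p, C * X) * p) - 2 * dot(p, C * p) * X

p = normalize([1.0, 2.0, 3.0])
tangent(V) = V - dot(p, V) * p   # orthogonal projection onto the tangent space at p
X = tangent([1.0, -1.0, 0.5])
Y = tangent([0.0, 2.0, -1.0])

# Compare ⟨H[X], Y⟩ with ⟨X, H[Y]⟩, using the default isapprox tolerances
symmetric_at_p(H) = isapprox(dot(H(p, X), Y), dot(X, H(p, Y)))
println(symmetric_at_p(sphere_hessian(A)), " ", symmetric_at_p(sphere_hessian(B)))
```

Both inner products are scalars, so a value close to zero on either side makes the relative tolerance very strict; in that case an absolute tolerance atol is the more meaningful setting.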

Literature