Hessian#
This section shows how to compute the Hessian (matrix of second derivatives) using DerivKit.
The Hessian describes the local curvature of a function with respect to the model parameters.
For a function f(theta), the Hessian is the matrix of second derivatives
with respect to the parameters.
Notation
p denotes the number of model parameters (theta has shape (p,)).
Depending on the output type of f(theta), the Hessian has the following shape:
scalar output f(theta) → Hessian shape (p, p)
tensor output f(theta) with shape out_shape → Hessian shape (*out_shape, p, p)
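The shapes above follow directly from the definition. As a short reference in standard notation (the index symbols i, j, and a are introduced here for illustration and are not taken from the DerivKit documentation):

```latex
% Scalar output: one (p, p) matrix
H_{ij}(\theta) = \frac{\partial^2 f(\theta)}{\partial \theta_i \, \partial \theta_j},
\qquad i, j = 1, \dots, p

% Tensor output: one (p, p) matrix per output component a,
% where a runs over all indices of out_shape
H_{a,ij}(\theta) = \frac{\partial^2 f_a(\theta)}{\partial \theta_i \, \partial \theta_j}
```

When f is twice continuously differentiable, the mixed partial derivatives commute, so each (p, p) block is symmetric: H_{ij} = H_{ji}.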
See also Gradient for first derivatives and Jacobian for vector-valued outputs. For more information on hessian(), see CalculusKit.
The primary interface for computing the Hessian is
derivkit.calculus_kit.CalculusKit.hessian().
For advanced usage and backend-specific keyword arguments, see
derivkit.calculus.hessian.build_hessian().
You can choose the derivative backend via method and pass backend-specific
options via **dk_kwargs (forwarded to
derivkit.derivative_kit.DerivativeKit.differentiate()).
Basic usage (scalar-valued function)#
>>> import numpy as np
>>> from derivkit.calculus_kit import CalculusKit
>>> # Define a scalar-valued function
>>> def func(theta):
...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
>>> # Point at which to compute the Hessian
>>> x0 = np.array([0.5, 2.0])
>>> # Create CalculusKit instance and compute Hessian
>>> calc = CalculusKit(func, x0=x0)
>>> hess = calc.hessian()
>>> print(np.round(hess, 6))
[[-0.479426  1.      ]
 [ 1.        2.      ]]
>>> print(hess.shape)
(2, 2)
>>> ref = np.array([
...     [-np.sin(0.5), 1.0],
...     [1.0, 2.0],
... ])
>>> print(np.round(ref, 6))
[[-0.479426  1.      ]
 [ 1.        2.      ]]
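The reference matrix ref used above can be derived by hand. The following worked derivation uses zero-based parameter indices to match theta[0] and theta[1] in the code:

```latex
f(\theta) = \sin\theta_0 + \theta_0 \theta_1 + \theta_1^2

\frac{\partial^2 f}{\partial \theta_0^2} = -\sin\theta_0, \qquad
\frac{\partial^2 f}{\partial \theta_0 \, \partial \theta_1} = 1, \qquad
\frac{\partial^2 f}{\partial \theta_1^2} = 2

H(\theta) =
\begin{pmatrix}
-\sin\theta_0 & 1 \\
1 & 2
\end{pmatrix},
\qquad
H(0.5,\, 2.0) \approx
\begin{pmatrix}
-0.479426 & 1 \\
1 & 2
\end{pmatrix}
```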
Hessian diagonal only#
For large parameter spaces you may only need the diagonal of the Hessian. DerivKit provides a fast helper for this case.
>>> import numpy as np
>>> from derivkit.calculus_kit import CalculusKit
>>> # Define a scalar-valued function
>>> def func(theta):
...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
>>> # Instantiate CalculusKit and compute Hessian diagonal
>>> calc = CalculusKit(func, x0=np.array([0.5, 2.0]))
>>> hess_diag = calc.hessian_diag()
>>> print(np.round(np.asarray(hess_diag).reshape(-1), 6))
[-0.479426  2.      ]
>>> ref = np.array([-np.sin(0.5), 2.0])
>>> np.allclose(hess_diag, ref)
True
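To illustrate why a diagonal-only computation is cheaper, here is a minimal standalone sketch that uses a three-point central difference per parameter. It is not DerivKit's implementation; the helper name hessian_diagonal and the step size h are assumptions made for this example.

```python
import math


def hessian_diagonal(f, theta, h=1e-3):
    """Approximate d^2 f / d theta_i^2 for each parameter i (hypothetical helper)."""
    f0 = f(theta)  # center value, shared by every diagonal entry
    diag = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += h
        minus[i] -= h
        # Three-point central difference for the second derivative
        diag.append((f(plus) - 2.0 * f0 + f(minus)) / h**2)
    return diag


def func(theta):
    return math.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2


diag = hessian_diagonal(func, [0.5, 2.0])
print(diag)
```

This scheme needs 2p + 1 function evaluations, whereas a full Hessian also requires the p(p - 1)/2 mixed partial derivatives.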
Tensor-valued outputs#
If the function returns a tensor, the Hessian is computed independently for each output component.
The result is reshaped back to (*out_shape, p, p).
>>> import numpy as np
>>> from derivkit.calculus_kit import CalculusKit
>>> # Define a tensor-valued function
>>> def func(theta):
...     return np.array([
...         np.sin(theta[0]),
...         theta[0] * theta[1] + theta[1] ** 2,
...     ])
>>> # Point at which to compute the Hessian
>>> x0 = np.array([0.5, 2.0])
>>> # Create CalculusKit instance and compute Hessian
>>> calc = CalculusKit(func, x0=x0)
>>> hess = calc.hessian()
>>> print(hess.shape)
(2, 2, 2)
>>> hess0_ref = np.array([
...     [-np.sin(0.5), 0.0],
...     [0.0, 0.0],
... ])
>>> hess1_ref = np.array([
...     [0.0, 1.0],
...     [1.0, 2.0],
... ])
>>> np.allclose(hess[0], hess0_ref)
True
>>> np.allclose(hess[1], hess1_ref)
True
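The per-component treatment described above can be sketched without DerivKit as follows. This is an illustrative approximation only: scalar_hessian and hessian_per_component are hypothetical helpers, the output is assumed to be one-dimensional for simplicity, and a plain four-point central difference stands in for whatever backend the library actually uses.

```python
import math


def scalar_hessian(g, theta, h=1e-3):
    """Four-point central-difference Hessian of a scalar function g (hypothetical)."""
    p = len(theta)

    def at(i, si, j, sj):
        # Evaluate g with parameters i and j shifted by si*h and sj*h
        point = list(theta)
        point[i] += si * h
        point[j] += sj * h
        return g(point)

    return [
        [
            (at(i, 1, j, 1) - at(i, 1, j, -1) - at(i, -1, j, 1) + at(i, -1, j, -1))
            / (4.0 * h * h)
            for j in range(p)
        ]
        for i in range(p)
    ]


def hessian_per_component(f, theta, h=1e-3):
    """Differentiate each output component as its own scalar function."""
    n_out = len(f(theta))
    # k=k binds the component index at definition time
    return [scalar_hessian(lambda t, k=k: f(t)[k], theta, h) for k in range(n_out)]


def func(theta):
    return [math.sin(theta[0]), theta[0] * theta[1] + theta[1] ** 2]


hess = hessian_per_component(func, [0.5, 2.0])
print(len(hess), len(hess[0]), len(hess[0][0]))
```

The nested list has one (p, p) block per output component, mirroring the (*out_shape, p, p) layout for a one-dimensional out_shape.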
Finite differences (Ridders) via dk_kwargs#
>>> import numpy as np
>>> from derivkit.calculus_kit import CalculusKit
>>> # Define a scalar-valued function
>>> def func(theta):
...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
>>> # Create CalculusKit instance and compute Hessian
>>> calc = CalculusKit(func, x0=np.array([0.5, 2.0]))
>>> hess = calc.hessian(
...     method="finite",
...     n_workers=4,
...     stepsize=1e-2,
...     num_points=5,
...     extrapolation="ridders",
...     levels=4,
... )
... )
>>> print(np.round(hess, 6))
[[-0.479426  1.      ]
 [ 1.        2.      ]]
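For intuition about what step-size extrapolation buys, the sketch below applies a plain Richardson tableau to a one-dimensional second derivative. It is a simplified relative of Ridders' method (which additionally tracks an error estimate and stops early) and is not DerivKit's implementation. The function name is hypothetical, and the h and levels arguments only loosely mirror the stepsize and levels options shown above.

```python
import math


def second_derivative_richardson(g, x, h=1e-2, levels=4):
    """Richardson-extrapolated central difference for g''(x) (hypothetical helper)."""
    table = []
    for i in range(levels):
        step = h / 2**i  # halve the step at each level
        # Column 0: plain three-point central difference, error O(step^2)
        row = [(g(x + step) - 2.0 * g(x) + g(x - step)) / step**2]
        for j in range(1, i + 1):
            # Combine the finer and coarser estimates to cancel the step^(2j) error term
            factor = 4.0**j
            row.append((factor * row[j - 1] - table[i - 1][j - 1]) / (factor - 1.0))
        table.append(row)
    return table[-1][-1]


x = 0.5
exact = -math.sin(x)
plain = (math.sin(x + 1e-2) - 2.0 * math.sin(x) + math.sin(x - 1e-2)) / 1e-2**2
extrapolated = second_derivative_richardson(math.sin, x)
print(abs(plain - exact), abs(extrapolated - exact))
```

With the same starting step, the extrapolated estimate is typically several orders of magnitude closer to the exact value than the plain central difference.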
Notes#
For scalar outputs, only the upper triangle of the Hessian is evaluated and mirrored for efficiency.
For tensor outputs, each component is treated as a scalar internally (flattened, differentiated, and reshaped back).
n_workers parallelizes Hessian tasks (entries and/or components).
When using CalculusKit.hessian_diag(), mixed partial derivatives are skipped for speed.
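The upper-triangle strategy from the first note can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: hessian_upper_mirrored is a hypothetical helper, and simple central differences stand in for the configured backend.

```python
import math


def hessian_upper_mirrored(f, theta, h=1e-3):
    """Estimate only entries with j >= i, then mirror them (hypothetical helper)."""
    p = len(theta)
    hess = [[0.0] * p for _ in range(p)]
    n_entries = 0  # number of entries actually estimated

    def shifted(i, di, j, dj):
        # Evaluate f with parameter i shifted by di and parameter j by dj
        point = list(theta)
        point[i] += di
        point[j] += dj
        return f(point)

    f0 = f(theta)
    for i in range(p):
        for j in range(i, p):  # upper triangle, including the diagonal
            if i == j:
                value = (shifted(i, h, i, 0.0) - 2.0 * f0 + shifted(i, -h, i, 0.0)) / h**2
            else:
                value = (
                    shifted(i, h, j, h) - shifted(i, h, j, -h)
                    - shifted(i, -h, j, h) + shifted(i, -h, j, -h)
                ) / (4.0 * h**2)
            hess[i][j] = value
            hess[j][i] = value  # mirror into the lower triangle
            n_entries += 1
    return hess, n_entries


def func(theta):
    return math.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2


hess, n_entries = hessian_upper_mirrored(func, [0.5, 2.0])
print(hess, n_entries)
```

Only p(p + 1)/2 entries are estimated instead of p², and the result is symmetric by construction.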