.. |dklogo| image:: ../../assets/logos/logo-black.png
   :alt: DerivKit logo black
   :width: 32px

|dklogo| Hessian
================

This section shows how to compute the Hessian (matrix of second derivatives)
using DerivKit.

The Hessian describes the local curvature of a function with respect to the
model parameters.

For a function ``f(theta)``, entry ``(i, j)`` of the Hessian is the second
partial derivative of ``f`` with respect to ``theta[i]`` and ``theta[j]``.
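
In index notation, with :math:`\theta = (\theta_1, \dots, \theta_p)`, the
definition reads as follows. The worked matrix uses the scalar function from
the basic-usage example on this page, assuming ``theta[0]`` and ``theta[1]``
correspond to :math:`\theta_1` and :math:`\theta_2`:

.. math::

   H_{ij}(\theta) = \frac{\partial^2 f}{\partial \theta_i \, \partial \theta_j}(\theta),
   \qquad i, j = 1, \dots, p,

.. math::

   f(\theta) = \sin\theta_1 + \theta_1 \theta_2 + \theta_2^2
   \quad\Longrightarrow\quad
   H(\theta) =
   \begin{pmatrix}
   -\sin\theta_1 & 1 \\
   1 & 2
   \end{pmatrix}.

For twice continuously differentiable ``f`` the mixed partial derivatives
commute, so :math:`H_{ij} = H_{ji}` and the Hessian is symmetric.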

**Notation**

- ``p`` denotes the number of model parameters (``theta`` has shape ``(p,)``).

Depending on the output type of ``f(theta)``, the Hessian has the following shape:

- scalar output ``f(theta)`` → Hessian shape ``(p, p)``
- tensor output ``f(theta)`` with shape ``out_shape`` → Hessian shape
  ``(*out_shape, p, p)``

See also :doc:`gradient` for first derivatives and :doc:`jacobian` for
vector-valued outputs.
For more background on the Hessian, see :doc:`../../about/kits/calculus_kit`.

The primary interface for computing the Hessian is
:meth:`derivkit.calculus_kit.CalculusKit.hessian`.
For advanced usage and backend-specific keyword arguments, see
:func:`derivkit.calculus.hessian.build_hessian`.
You can choose the derivative backend via ``method`` and pass backend-specific
options via ``**dk_kwargs`` (forwarded to
:meth:`derivkit.derivative_kit.DerivativeKit.differentiate`).


Basic usage (scalar-valued function)
------------------------------------

.. doctest:: hessian_basic

   >>> import numpy as np
   >>> from derivkit.calculus_kit import CalculusKit
   >>> # Define a scalar-valued function
   >>> def func(theta):
   ...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
   >>> # Point at which to compute the Hessian
   >>> x0 = np.array([0.5, 2.0])
   >>> # Create CalculusKit instance and compute Hessian
   >>> calc = CalculusKit(func, x0=x0)
   >>> hess = calc.hessian()
   >>> print(np.round(hess, 6))
   [[-0.479426  1.      ]
    [ 1.        2.      ]]
   >>> print(hess.shape)
   (2, 2)
   >>> ref = np.array([
   ...     [-np.sin(0.5), 1.0],
   ...     [1.0, 2.0],
   ... ])
   >>> print(np.round(ref, 6))
   [[-0.479426  1.      ]
    [ 1.        2.      ]]


Hessian diagonal only
---------------------

For large parameter spaces you may only need the diagonal of the Hessian.
:meth:`derivkit.calculus_kit.CalculusKit.hessian_diag` computes only the
diagonal entries and skips the mixed partial derivatives.

.. doctest:: hessian_diag

   >>> import numpy as np
   >>> from derivkit.calculus_kit import CalculusKit
   >>> # Define a scalar-valued function
   >>> def func(theta):
   ...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
   >>> # Instantiate CalculusKit and compute Hessian diagonal
   >>> calc = CalculusKit(func, x0=np.array([0.5, 2.0]))
   >>> hess_diag = calc.hessian_diag()
   >>> print(np.round(np.asarray(hess_diag).reshape(-1), 6))
   [-0.479426  2.      ]
   >>> ref = np.array([-np.sin(0.5), 2.0])
   >>> np.allclose(hess_diag, ref)
   True
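
To illustrate why a diagonal-only computation is cheaper, here is a minimal
sketch using plain second central differences. It is not DerivKit's
implementation: ``fd_hessian_diag``, the step ``h`` and the evaluation counter
are hypothetical names introduced for this example only.

.. code-block:: python

   import numpy as np

   def fd_hessian_diag(f, x0, h=1e-3):
       """Diagonal of the Hessian via second central differences (illustrative)."""
       x0 = np.asarray(x0, dtype=float)
       f0 = f(x0)  # center value, shared by every diagonal entry
       diag = np.empty(x0.size)
       for i in range(x0.size):
           step = np.zeros_like(x0)
           step[i] = h
           diag[i] = (f(x0 + step) - 2.0 * f0 + f(x0 - step)) / h**2
       return diag

   counter = {"calls": 0}

   def func(theta):
       counter["calls"] += 1  # count evaluations to show the 2p + 1 cost
       return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2

   diag = fd_hessian_diag(func, np.array([0.5, 2.0]))
   ref = np.array([-np.sin(0.5), 2.0])
   print(np.allclose(diag, ref, atol=1e-5))  # True
   print(counter["calls"])  # 5, i.e. 2 * p + 1 with p = 2

With this stencil the diagonal costs ``2 * p + 1`` function evaluations, whereas
a full Hessian needs a number of evaluations that grows quadratically in ``p``.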


Tensor-valued outputs
---------------------

If the function returns a tensor, the Hessian is computed independently for
each output component.

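The per-component Hessians are then stacked and reshaped to
``(*out_shape, p, p)``, so indexing the result with an output index returns the
``(p, p)`` Hessian of that component.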

.. doctest:: hessian_tensor

   >>> import numpy as np
   >>> from derivkit.calculus_kit import CalculusKit
   >>> # Define a tensor-valued function
   >>> def func(theta):
   ...     return np.array([
   ...         np.sin(theta[0]),
   ...         theta[0] * theta[1] + theta[1] ** 2,
   ...     ])
   >>> # Point at which to compute the Hessian
   >>> x0 = np.array([0.5, 2.0])
   >>> # Create CalculusKit instance and compute Hessian
   >>> calc = CalculusKit(func, x0=x0)
   >>> hess = calc.hessian()
   >>> print(hess.shape)
   (2, 2, 2)
   >>> hess0_ref = np.array([
   ...     [-np.sin(0.5), 0.0],
   ...     [0.0, 0.0],
   ... ])
   >>> hess1_ref = np.array([
   ...     [0.0, 1.0],
   ...     [1.0, 2.0],
   ... ])
   >>> np.allclose(hess[0], hess0_ref)
   True
   >>> np.allclose(hess[1], hess1_ref)
   True
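
The flatten, differentiate, reshape pattern described above can be sketched
with plain central differences. This is an illustration under stated
assumptions, not DerivKit's code: ``tensor_hessian`` and its step ``h`` are
hypothetical, and the example function returns a ``(2, 2)`` array to make the
reshape visible.

.. code-block:: python

   import numpy as np

   def tensor_hessian(f, x0, h=1e-3):
       """Per-component central-difference Hessian (illustrative helper)."""
       x0 = np.asarray(x0, dtype=float)
       p = x0.size
       out_shape = np.shape(f(x0))
       n_out = int(np.prod(out_shape))  # number of flattened output components
       steps = h * np.eye(p)            # row i is the step along parameter i
       flat = np.empty((n_out, p, p))
       for k in range(n_out):
           # Treat the k-th flattened output component as a scalar function.
           fk = lambda t, k=k: np.ravel(f(t))[k]
           for i in range(p):
               for j in range(p):
                   ei, ej = steps[i], steps[j]
                   flat[k, i, j] = (
                       fk(x0 + ei + ej) - fk(x0 + ei - ej)
                       - fk(x0 - ei + ej) + fk(x0 - ei - ej)
                   ) / (4.0 * h**2)
       # Restore the output axes in front of the two parameter axes.
       return flat.reshape(*out_shape, p, p)

   def func(theta):
       a, b = theta
       return np.array([[np.sin(a), a * b], [b**2, a**2 * b]])

   hess = tensor_hessian(func, np.array([0.5, 2.0]))
   print(hess.shape)  # (2, 2, 2, 2)

Here ``hess[1, 1]`` is the Hessian of the component ``a**2 * b``, which is
``[[2*b, 2*a], [2*a, 0]]`` analytically.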


Finite differences (Ridders) via ``dk_kwargs``
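----------------------------------------------

Here ``method="finite"`` selects the finite-difference backend. The backend
options (``stepsize``, ``num_points``, ``extrapolation``, ``levels``) are
forwarded through ``**dk_kwargs``, while ``n_workers`` controls parallelism of
the Hessian computation.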

.. doctest:: hessian_finite_ridders

   >>> import numpy as np
   >>> from derivkit.calculus_kit import CalculusKit
   >>> # Define a scalar-valued function
   >>> def func(theta):
   ...     return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2
   >>> # Create CalculusKit instance and compute Hessian
   >>> calc = CalculusKit(func, x0=np.array([0.5, 2.0]))
   >>> hess = calc.hessian(
   ...     method="finite",
   ...     n_workers=4,
   ...     stepsize=1e-2,
   ...     num_points=5,
   ...     extrapolation="ridders",
   ...     levels=4,
   ... )
   >>> print(np.round(hess, 6))
   [[-0.479426  1.      ]
    [ 1.        2.      ]]
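
For intuition about what extrapolation buys, the sketch below applies a
simplified Richardson tableau (the idea that Ridders' method builds on, without
its adaptive error control) to a one-dimensional second derivative. The function
name and the parameters ``h``, ``levels`` and ``shrink`` are hypothetical and
need not match the meaning of DerivKit's ``stepsize`` and ``levels`` options.

.. code-block:: python

   import math

   def second_derivative_richardson(f, x, h=1e-2, levels=4, shrink=2.0):
       """Richardson-extrapolated central second difference (illustrative)."""
       # Level 0: plain central differences at steps h, h/shrink, h/shrink**2, ...
       row = [
           (f(x + s) - 2.0 * f(x) + f(x - s)) / s**2
           for s in (h / shrink**k for k in range(levels))
       ]
       # Each pass cancels the leading even-order error term (h**2, h**4, ...).
       for m in range(1, levels):
           factor = shrink ** (2 * m)
           row = [
               (factor * row[k + 1] - row[k]) / (factor - 1.0)
               for k in range(len(row) - 1)
           ]
       return row[0]

   x = 0.5
   exact = -math.sin(x)
   plain = (math.sin(x + 1e-2) - 2.0 * math.sin(x) + math.sin(x - 1e-2)) / 1e-2**2
   extrapolated = second_derivative_richardson(math.sin, x)
   # The extrapolated estimate is closer to the exact value than the plain one.
   print(abs(extrapolated - exact) < abs(plain - exact))  # True

The plain central difference has an error of order ``h**2``; combining
estimates at several step sizes removes successive error terms without having
to push ``h`` into the regime dominated by round-off.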


Notes
-----

- For scalar outputs, only the upper triangle of the Hessian is evaluated and
  mirrored for efficiency.
- For tensor outputs, each component is treated as a scalar internally
  (flattened, differentiated, and reshaped back).
- ``n_workers`` parallelizes Hessian tasks (entries and/or components).
- When using :meth:`derivkit.calculus_kit.CalculusKit.hessian_diag`, mixed
  partial derivatives are skipped for speed.
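
The first and third notes can be sketched together: evaluate only the entries
with ``i <= j``, spread them over workers, and mirror the result. This is a
minimal illustration, not DerivKit's scheduler. The helper names, the step
``h`` and the use of a thread pool are assumptions made for this example.

.. code-block:: python

   from concurrent.futures import ThreadPoolExecutor
   from itertools import combinations_with_replacement

   import numpy as np

   def mixed_partial(f, x0, i, j, h=1e-3):
       """Four-point central difference for d2f / (dtheta_i dtheta_j)."""
       steps = h * np.eye(x0.size)
       ei, ej = steps[i], steps[j]
       return (
           f(x0 + ei + ej) - f(x0 + ei - ej)
           - f(x0 - ei + ej) + f(x0 - ei - ej)
       ) / (4.0 * h**2)

   def symmetric_hessian(f, x0, n_workers=2):
       """Evaluate the upper triangle in parallel, then mirror it."""
       x0 = np.asarray(x0, dtype=float)
       p = x0.size
       # Only p * (p + 1) / 2 index pairs with i <= j are evaluated.
       pairs = list(combinations_with_replacement(range(p), 2))
       with ThreadPoolExecutor(max_workers=n_workers) as pool:
           values = list(pool.map(lambda ij: mixed_partial(f, x0, *ij), pairs))
       hess = np.empty((p, p))
       for (i, j), value in zip(pairs, values):
           hess[i, j] = value
           hess[j, i] = value  # mirror into the lower triangle
       return hess

   def func(theta):
       return np.sin(theta[0]) + theta[0] * theta[1] + theta[1] ** 2

   hess = symmetric_hessian(func, np.array([0.5, 2.0]))
   ref = np.array([[-np.sin(0.5), 1.0], [1.0, 2.0]])
   print(np.allclose(hess, ref, atol=1e-5))  # True

Mirroring makes the result exactly symmetric and roughly halves the number of
entries to compute. Each entry is an independent task, which is why the work
parallelizes naturally.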