derivkit.forecasting.likelihoods module

Basic likelihood functions for forecasting.

derivkit.forecasting.likelihoods.binomial(*args, **kwargs)

This is a placeholder for a Binomial likelihood function.

derivkit.forecasting.likelihoods.build_gaussian_likelihood(data: ArrayLike, model_parameters: ArrayLike, cov: ArrayLike, return_log: bool = False) -> tuple[tuple[NDArray[float64], ...], NDArray[float64]]

Constructs the Gaussian likelihood function.

Parameters:
  • data – a 1D or 2D array representing the given data values. It is expected that axis 0 represents different samples of data while axis 1 represents the data values.

  • model_parameters – a 1D array representing the theoretical values of the model parameters.

  • cov – covariance matrix. May be a scalar, a 1D vector of diagonal variances, or a full 2D covariance matrix. It is symmetrized and normalized internally to ensure compatibility with data and model_parameters.

  • return_log – when set to True, the function will compute the log-likelihood instead.
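The three accepted forms of cov can be illustrated with a minimal sketch. The helper normalize_cov below is hypothetical and is not derivkit's internal routine; it only assumes that a scalar means one shared variance, a 1D vector means per-dimension variances, and a 2D matrix is symmetrized by averaging it with its transpose.

```python
import numpy as np


def normalize_cov(cov, n_dim):
    """Hypothetical helper: expand cov to a symmetric (n_dim, n_dim) matrix."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim == 0:
        # Scalar: the same variance on every dimension, no correlations.
        full = np.eye(n_dim) * cov
    elif cov.ndim == 1:
        # 1D vector: per-dimension variances placed on the diagonal.
        if cov.shape != (n_dim,):
            raise ValueError("diagonal variances must have length n_dim")
        full = np.diag(cov)
    elif cov.ndim == 2:
        if cov.shape != (n_dim, n_dim):
            raise ValueError("covariance matrix must be (n_dim, n_dim)")
        # Symmetrize by averaging the matrix with its transpose.
        full = 0.5 * (cov + cov.T)
    else:
        raise ValueError("cov must be a scalar, a 1D array, or a 2D array")
    if not np.all(np.isfinite(full)):
        raise ValueError("cov contains non-finite values")
    return full


print(normalize_cov(2.0, 2))
print(normalize_cov([1.0, 0.3], 2))
print(normalize_cov([[1.0, 0.2], [0.2, 0.3]], 2))
```

A real implementation would probably also check positive definiteness; the sketch leaves that out to stay short.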

Returns:

A tuple containing (in order):

  • coordinate_grids: tuple of 1D arrays giving the evaluation coordinates for each dimension (one array per dimension), ordered consistently with the first axis of data.

  • probability_density: ndarray with the values of the multivariate Gaussian probability density function evaluated on the Cartesian product of those coordinates.

Raises:

ValueError – raised if:

  • data is not 1D or 2D,

  • model_parameters is not 1D,

  • the number of samples in data does not match the number of model parameters,

  • model_parameters contains non-finite values,

  • cov cannot be normalized to a valid covariance matrix.
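Evaluating a density "on the Cartesian product" of per-dimension coordinates can be sketched with plain NumPy. The helper gaussian_on_grid is hypothetical and only illustrates the idea under the assumption of a full, positive-definite covariance matrix; it is not the library's implementation.

```python
import numpy as np


def gaussian_on_grid(coords, mean, cov, return_log=False):
    """Hypothetical helper: Gaussian density on the Cartesian product of coords."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # indexing="ij" keeps grid axis k tied to coords[k].
    mesh = np.meshgrid(*coords, indexing="ij")
    points = np.stack(mesh, axis=-1)  # shape (n_0, ..., n_{d-1}, d)
    diff = points - mean
    # Quadratic form diff^T cov^{-1} diff at every grid point.
    quad = np.einsum("...i,ij,...j->...", diff, np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    log_density = -0.5 * (mean.size * np.log(2.0 * np.pi) + logdet + quad)
    return log_density if return_log else np.exp(log_density)


x = np.linspace(-10, 10, 30)
y = np.linspace(3, 6, 30)
density = gaussian_on_grid((x, y), [0.0, 4.0], [[1.0, 0.2], [0.2, 0.3]])
print(density.shape)  # (30, 30)
```

Working with the log-density first and exponentiating only at the end is a design choice of this sketch: the return_log branch then costs nothing extra.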

Examples

A 1D Gaussian likelihood:

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from derivkit.forecasting.likelihoods import build_gaussian_likelihood
>>> data = np.linspace(-10, 10, 100)[np.newaxis, :]
>>> model_parameters = np.array([1.0])
>>> cov = np.array([[2.0]])
>>> x_grid, pdf = build_gaussian_likelihood(data, model_parameters, cov)
>>> plt.plot(x_grid[0], pdf[0])

A 2D Gaussian likelihood:

>>> data = np.asarray((np.linspace(-10, 10, 30), np.linspace(3, 6, 30)))
>>> model_parameters = np.array([0.0, 4.0])
>>> cov = np.array([[1.0, 0.2], [0.2, 0.3]])
>>> # Build coordinate arrays and evaluate the probability density on their
>>> # Cartesian product. The indexing ensures the coordinate order matches
>>> # the order in data.
>>> grid, probability_density = build_gaussian_likelihood(data, model_parameters, cov)
>>> plt.contour(*grid, probability_density)

derivkit.forecasting.likelihoods.build_poissonian_likelihood(data: float | ndarray[float], model_parameters: float | ndarray[float], return_log: bool = False) -> tuple[ndarray[float], ndarray[float]]

Constructs the Poissonian likelihood function.

The shape of the data products depends on the shape of model_parameters. The assumption is that model_parameters contains the expectation value of some quantity, which is either uniform for the entire distribution or distributed across a grid of bins. It is uniform for the entire distribution if model_parameters is a scalar.

The function tries to reshape data to align with model_parameters. If model_parameters is a scalar, data is flattened. Otherwise model_parameters is treated as a grid of bins. In principle the grid could have any number of axes, but the number of axes is currently hardcoded to 2, so supplying a higher-dimensional array to model_parameters may produce unexpected results.

This hardcoded limit means that, while it is possible to supply model_parameters along a 1D grid, the output shape will always be a 2D row-major array. See Examples for more details.
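The alignment step can be sketched as follows. The helper align_and_evaluate is hypothetical: it assumes that the data are reshaped to (-1, *model_parameters.shape), which reproduces the output shapes printed in the Examples of this section but is not confirmed to be how the library does it. The probabilities come from scipy.stats.poisson.pmf.

```python
import numpy as np
from scipy.stats import poisson


def align_and_evaluate(data, model_parameters):
    """Hypothetical helper: reshape data onto the grid of means, then apply the PMF."""
    data = np.asarray(data)
    mu = np.asarray(model_parameters, dtype=float)
    if mu.ndim == 0:
        # Scalar mean: one expectation value shared by every data point.
        counts = data.ravel()
    else:
        if data.size % mu.size != 0:
            raise ValueError("data cannot be reshaped to align with model_parameters")
        # Extra data sets end up on a new leading axis.
        counts = data.reshape(-1, *mu.shape)
    # Broadcasting pairs every data set with the same grid of means.
    return counts, poisson.pmf(counts, mu)


x, y = align_and_evaluate([1, 2, 3, 4, 5, 6], [0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(x.shape, y.shape)  # (1, 6) (1, 6)
```

Under this assumption a flattened array of 12 values and a (2, 3) grid of means yield arrays of shape (2, 2, 3).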

Parameters:
  • data – an array representing the given data values.

  • model_parameters – an array representing the means of the data samples.

  • return_log – when set to True, returns the log-likelihood. Defaults to False.
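One reason to request the log-likelihood is numerical range: probabilities of counts far from the mean can underflow to zero while their logarithms remain finite. The sketch below uses scipy.stats.poisson directly rather than build_poissonian_likelihood, and the counts are arbitrary illustrative values.

```python
import numpy as np
from scipy.stats import poisson

counts = np.array([2, 12, 200])  # illustrative values, not from the library
mean = 0.1

pmf = poisson.pmf(counts, mean)
log_pmf = poisson.logpmf(counts, mean)

print(pmf)      # the last entry underflows to 0.0
print(log_pmf)  # every entry stays finite
```

Summing log-probabilities over bins also avoids the product of many small numbers that a joint likelihood would otherwise require.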

Returns:

A tuple of arrays containing (in order):

  • the data, reshaped to align with the model parameters.

  • the values of the Poissonian probability mass function computed from the data and model parameters.

Raises:

ValueError – If any of the model_parameters are negative or non-finite, or the data points cannot be reshaped to align with model_parameters.
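The conditions on model_parameters can be checked with a few NumPy calls. The validator check_poisson_means below is hypothetical and only mirrors the conditions stated above; the error messages are illustrative, not the library's.

```python
import numpy as np


def check_poisson_means(model_parameters):
    """Hypothetical validator: reject negative or non-finite Poisson means."""
    mu = np.asarray(model_parameters, dtype=float)
    # Test finiteness first so that NaN is reported as non-finite.
    if not np.all(np.isfinite(mu)):
        raise ValueError("model_parameters must be finite")
    if np.any(mu < 0):
        raise ValueError("model_parameters must be non-negative")
    return mu


for candidate in (1.4, [0.1, -0.2], [0.1, np.inf], [0.1, np.nan]):
    try:
        check_poisson_means(candidate)
        print(candidate, "accepted")
    except ValueError as err:
        print(candidate, "rejected:", err)
```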

Examples

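The Poissonian probability of 2 events, given that the mean is 1.4 events per unit interval. Note that scalar inputs are returned as 1D arrays:

>>> import numpy as np
>>> from derivkit.forecasting.likelihoods import build_poissonian_likelihood
>>> x, y = build_poissonian_likelihood(2, 1.4)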
>>> print(x, y)
[2] [0.24166502]

The probabilities of several observed counts can be computed for a single expectation value:

>>> data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> model_parameters = 2.4
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[ 1  2  3  4  5  6  7  8  9 10]
>>> print(y)
[2.17723088e-01 2.61267705e-01 2.09014164e-01 1.25408499e-01
 6.01960793e-02 2.40784317e-02 8.25546231e-03 2.47663869e-03
 6.60436985e-04 1.58504876e-04]

Note that the shape of the results is determined by the shape of model_parameters:

>>> data = np.array([1, 2])
>>> model_parameters = np.array([3])
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[[1]
 [2]]
>>> print(y)
[[0.14936121]
 [0.22404181]]

Probabilities computed from values and parameters distributed along a 1D grid of bins:

>>> model_parameters = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[[1 2 3 4 5 6]]
>>> print(y)
[[9.04837418e-02 1.63746151e-02 3.33368199e-03 7.15008049e-04
  1.57950693e-04 3.55629940e-05]]

Probabilities computed from values and parameters distributed across a 2D grid of bins:

>>> data = np.array([[1, 2, 3], [4, 5, 6]])
>>> model_parameters = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[[[1 2 3]
  [4 5 6]]]
>>> print(y)
[[[9.04837418e-02 1.63746151e-02 3.33368199e-03]
  [7.15008049e-04 1.57950693e-04 3.55629940e-05]]]

Combining multiple data values on the same grid with the same Poissonian means:

>>> val1 = np.array([[1, 2, 3], [4, 5, 6]])
>>> val2 = np.array([[7, 8, 9], [10, 11, 12]])
>>> data = np.array([val1, val2])
>>> model_parameters = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
>>> print(y)
[[[9.04837418e-02 1.63746151e-02 3.33368199e-03]
  [7.15008049e-04 1.57950693e-04 3.55629940e-05]]

 [[1.79531234e-11 5.19829050e-11 4.01827740e-11]
  [1.93695302e-11 7.41937101e-12 2.49402815e-12]]]

The same result can be obtained by supplying the data in a flattened array:

>>> data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
>>> x, y = build_poissonian_likelihood(data, model_parameters)
>>> print(x)
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
>>> print(y)
[[[9.04837418e-02 1.63746151e-02 3.33368199e-03]
  [7.15008049e-04 1.57950693e-04 3.55629940e-05]]

 [[1.79531234e-11 5.19829050e-11 4.01827740e-11]
  [1.93695302e-11 7.41937101e-12 2.49402815e-12]]]

derivkit.forecasting.likelihoods.multinomial(*args, **kwargs)

This is a placeholder for a Multinomial likelihood function.

derivkit.forecasting.likelihoods.sellentin_heavens(*args, **kwargs)

This is a placeholder for the Sellentin-Heavens likelihood function.

derivkit.forecasting.likelihoods.student_t(*args, **kwargs)

This is a placeholder for a Student’s t-distribution likelihood function.