Gaussian Likelihood#
This section shows how to evaluate a Gaussian likelihood using
derivkit.likelihood_kit.LikelihoodKit.
A Gaussian likelihood describes the probability of observed data under a normal
noise model with mean mu and covariance cov.
For a data vector data of n samples with model prediction mu and covariance cov, the Gaussian log-likelihood is

\[
\ln \mathcal{L}(\mathbf{d} \mid \boldsymbol{\mu}, \mathbf{C})
= -\frac{1}{2} \left[ (\mathbf{d} - \boldsymbol{\mu})^\top \mathbf{C}^{-1} (\mathbf{d} - \boldsymbol{\mu}) + \ln \det \mathbf{C} + n \ln 2\pi \right].
\]
Notation#
- n denotes the number of data samples.
- data contains n samples (internally treated as a column of samples).
- mu is the Gaussian mean at each sample (model_parameters has shape (n,)).
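As a concrete illustration of the formula above, the log-likelihood can be evaluated directly with plain NumPy. The sketch below is independent of derivkit; the data values and diagonal covariance are illustrative only.

>>> import numpy as np
>>> d = np.array([0.2, -0.1, 0.05])   # data vector
>>> mu = np.zeros(3)                  # Gaussian mean
>>> cov = np.diag([0.1**2] * 3)       # full covariance matrix
>>> r = d - mu                        # residuals
>>> _, logdet = np.linalg.slogdet(cov)
>>> # -1/2 [ r^T C^{-1} r + ln det C + n ln 2 pi ]
>>> loglike = -0.5 * (r @ np.linalg.solve(cov, r) + logdet + d.size * np.log(2 * np.pi))
>>> print(bool(np.isfinite(loglike)))
True

Using slogdet and solve avoids forming the explicit inverse of the covariance, which is both faster and numerically safer than np.linalg.inv.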
The primary interface for evaluating the Gaussian likelihood is
derivkit.likelihood_kit.LikelihoodKit.gaussian().
For advanced usage, see derivkit.likelihoods.gaussian.build_gaussian_likelihood().
For a conceptual overview of likelihoods, see LikelihoodKit.
Gaussian log-likelihood#
For inference, you should almost always work with the log-likelihood for numerical stability.
>>> import numpy as np
>>> from derivkit.likelihood_kit import LikelihoodKit
>>> # Observed data samples
>>> data = np.array([[0.2], [-0.1], [0.05]])
>>> # Model prediction (Gaussian mean at each sample)
>>> mu = np.array([0.0, 0.0, 0.0])
>>> # Diagonal covariance given as variances
>>> cov = np.array([0.1**2, 0.1**2, 0.1**2])
>>> # Create LikelihoodKit instance
>>> lkit = LikelihoodKit(data=data, model_parameters=mu)
>>> # Evaluate Gaussian log-likelihoods
>>> grid, loglike = lkit.gaussian(cov=cov)
>>> print(bool(np.all(np.isfinite(loglike))))
True
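The stability advantage is easy to demonstrate with plain NumPy, independent of derivkit: a joint density built as a product of many small per-sample densities underflows in double precision, while the equivalent sum of logs stays finite. The values below are illustrative.

>>> import numpy as np
>>> per_sample_pdf = np.full(100, 1e-4)  # 100 densities of order 1e-4
>>> print(np.prod(per_sample_pdf))       # product underflows to zero
0.0
>>> print(bool(np.isfinite(np.sum(np.log(per_sample_pdf)))))
True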
Gaussian PDF (small problems only)#
If you explicitly need probability density values (not recommended for large
or high-dimensional problems), set return_log=False.
>>> import numpy as np
>>> from derivkit.likelihood_kit import LikelihoodKit
>>> data = np.array([[0.2], [-0.1], [0.05]])
>>> mu = np.array([0.0, 0.0, 0.0])
>>> cov = np.array([0.1**2, 0.1**2, 0.1**2])
>>> lkit = LikelihoodKit(data=data, model_parameters=mu)
>>> grid, pdf = lkit.gaussian(cov=cov, return_log=False)
>>> print(bool(np.all(np.isfinite(pdf)) and np.all(pdf >= 0.0)))
True
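Since return_log only switches between the log-PDF and the PDF, the two outputs should agree up to a logarithm. A quick consistency check, continuing the session above and assuming both calls evaluate on the same grid:

>>> _, loglike = lkit.gaussian(cov=cov)
>>> print(bool(np.allclose(loglike, np.log(pdf))))
True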
Covariance input forms#
The covariance can be provided in several equivalent forms.
>>> import numpy as np
>>> from derivkit.likelihood_kit import LikelihoodKit
>>> # Observed data samples and model prediction
>>> data = np.array([[0.1], [-0.2]])
>>> mu = np.array([0.0, 0.0])
>>> # Initialize LikelihoodKit
>>> lkit = LikelihoodKit(data=data, model_parameters=mu)
>>> # Scalar variance (applied to every sample)
>>> _, loglike1 = lkit.gaussian(cov=0.05**2)
>>> # Diagonal variances (1D array)
>>> _, loglike2 = lkit.gaussian(cov=np.array([0.05**2, 0.05**2]))
>>> # Full covariance matrix (2D)
>>> cov2d = np.array([
... [0.0025, 0.0],
... [0.0, 0.0025],
... ])
>>> _, loglike3 = lkit.gaussian(cov=cov2d)
>>> print(np.allclose(loglike1, loglike2) and np.allclose(loglike2, loglike3))
True
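Off-diagonal covariance entries can be supplied through the same 2D form to model correlated samples, as the Notes below discuss. This sketch reuses the lkit instance from above; the correlation value is illustrative, and the Cholesky factorization is simply a standard way to confirm positive definiteness.

>>> cov_corr = np.array([
...     [0.0025, 0.0010],
...     [0.0010, 0.0025],
... ])
>>> _ = np.linalg.cholesky(cov_corr)  # raises LinAlgError if not positive definite
>>> _, loglike_corr = lkit.gaussian(cov=cov_corr)
>>> print(bool(np.all(np.isfinite(loglike_corr))))
True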
Returned objects#
The Gaussian likelihood returns a tuple (coordinate_grids, values).
- coordinate_grids is a tuple of 1D arrays, one per data dimension.
- values is either the PDF or log-PDF evaluated on the grid.
>>> import numpy as np
>>> from derivkit.likelihood_kit import LikelihoodKit
>>> data = np.array([[0.1], [-0.1]])
>>> mu = np.array([0.0, 0.0])
>>> cov = np.array([0.05**2, 0.05**2])
>>> lkit = LikelihoodKit(data=data, model_parameters=mu)
>>> grid, loglike = lkit.gaussian(cov=cov)
>>> print(isinstance(grid, tuple))
True
>>> print(bool(np.all(np.isfinite(loglike))))
True
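Given the documented return structure, each entry of coordinate_grids should be a 1D array; a quick shape check, continuing the session above:

>>> print(bool(all(g.ndim == 1 for g in grid)))
True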
Notes#
- By default, the Gaussian likelihood returns the log-likelihood (return_log=True).
- model_parameters must provide one mean value per data sample (mu has shape (n,)).
- cov can be provided as a scalar variance, a 1D array of diagonal variances, or a full 2D covariance matrix.
- For high-dimensional data, working with the PDF directly can lead to numerical underflow; prefer log-likelihoods.
- The covariance matrix must be positive definite to ensure a valid likelihood.
- With a scalar or diagonal covariance, samples are treated as conditionally independent given the model parameters; for correlated data, provide the full covariance matrix to capture dependencies.
- The Gaussian likelihood is appropriate for continuous data; for discrete data, use a Poisson or multinomial likelihood.
- When combining multiple independent likelihood terms, sum log-likelihoods rather than multiplying PDFs; see the sketch below.
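As a minimal illustration of the last point, adding log-likelihoods is mathematically equivalent to multiplying PDFs, but the product underflows long before the sum does. Plain NumPy, with illustrative values:

>>> import numpy as np
>>> loglike_a, loglike_b = -450.0, -520.0   # log-likelihoods of two independent terms
>>> print(bool(np.isfinite(loglike_a + loglike_b)))  # the sum stays finite
True
>>> print(np.exp(loglike_a) * np.exp(loglike_b))     # the PDF product underflows
0.0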