sincei.ExponentialFamily module
- class sincei.ExponentialFamily.ExponentialFamily(family_params=None, **kwargs)
Bases: object
Encodes an exponential family distribution using PyTorch autodiff structures.
ExponentialFamily is the superclass that provides the backbone for subclasses, one per exponential family distribution. Each subclass should define the following methods for its distribution (same notation as in Mourragui et al, 2023):
sufficient_statistics (\(T\))
natural_parametrization (\(\eta\))
log_partition (\(A\))
invert_g (\(g^{-1}\))
initialize_family_parameters: computes parameters used in other methods, e.g., gene-level dispersion for Negative Binomial.
A "base_measure" method is included for completeness, but it is not needed to run GLM-PCA. The log-likelihood and exponential term are defined directly from the aforementioned methods.
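To make the roles of these methods concrete, the sketch below writes a Poisson log-likelihood in the generic exponential-family form log p(x) = η·T(x) − A(η) + log h(x). It uses plain Python rather than PyTorch and does not call sincei. The function names mirror the method names above for illustration only, and reading g as the derivative of A (so that g^{-1} maps a mean back to a natural parameter) is an assumption.

```python
import math

# Illustrative Poisson pieces, written in terms of the natural parameter eta = log(rate).
# These are stand-alone functions, not the sincei implementation.
def sufficient_statistics(x):
    return x                          # T(x) = x

def log_partition(eta):
    return math.exp(eta)              # A(eta) = exp(eta)

def base_measure(x):
    return 1.0 / math.factorial(x)    # h(x) = 1 / x!

def invert_g(mean):
    # Assumption: g = dA/d(eta) = exp(eta), hence g^{-1}(mean) = log(mean).
    return math.log(mean)

def log_likelihood(x, eta):
    # Generic form: eta * T(x) - A(eta) + log h(x)
    return eta * sufficient_statistics(x) - log_partition(eta) + math.log(base_measure(x))

rate = 3.0
eta = invert_g(rate)
x = 4
# Textbook Poisson log-pmf, used as a cross-check.
direct = x * math.log(rate) - rate - math.lgamma(x + 1)
```

The generic form and the textbook log-pmf agree, and the base measure only contributes a term that does not depend on η, which is consistent with it being optional for fitting.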
Parameters
- family_name (str)
Name of the family.
- class sincei.ExponentialFamily.Gaussian(family_params=None, **kwargs)
Bases: ExponentialFamily
Gaussian with standard deviation one.
GLMPCA with the Gaussian family is equivalent to standard PCA.
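The equivalence with PCA follows from the unit-variance Gaussian negative log-likelihood being half the squared error plus a constant, so maximizing the likelihood of a low-rank mean amounts to a least-squares reconstruction. The sketch below checks this identity for a single value; it is a plain-Python illustration with hypothetical function names, not sincei code.

```python
import math

def gaussian_log_likelihood(x, mu):
    # Unit-variance Gaussian in exponential-family form:
    # T(x) = x, eta = mu, A(eta) = eta**2 / 2, log h(x) = -x**2 / 2 - log(2*pi) / 2
    return mu * x - mu ** 2 / 2 - x ** 2 / 2 - 0.5 * math.log(2 * math.pi)

def half_squared_error(x, mu):
    return 0.5 * (x - mu) ** 2

x, mu = 1.7, 0.4
constant = 0.5 * math.log(2 * math.pi)   # does not depend on mu
nll = -gaussian_log_likelihood(x, mu)
```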
- class sincei.ExponentialFamily.Bernoulli(family_params=None, **kwargs)
Bases: ExponentialFamily
Bernoulli distribution
- family_params of interest:
"max_val" (int): max value (replaces infinity). Empirically, values above 10 yield similar results.
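One way to read "replaces infinity": the Bernoulli natural parameter is the logit, which is infinite for observations equal to 0 or 1, so a finite cap is needed. The sketch below assumes max_val caps the logit at ±max_val; this is an interpretation, not the documented sincei behaviour, and `clipped_logit` is a hypothetical helper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def clipped_logit(x, max_val=10):
    # logit(0) = -inf and logit(1) = +inf, so both are capped at +/- max_val.
    if x <= 0:
        return -float(max_val)
    if x >= 1:
        return float(max_val)
    return max(-max_val, min(max_val, math.log(x / (1 - x))))

# Probabilities implied by caps of 10 and 20 are nearly identical.
gap = sigmoid(20) - sigmoid(10)
```

Under this reading, raising the cap beyond 10 changes the implied probability by less than 1e-4, which fits the remark that larger values yield similar results.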
- class sincei.ExponentialFamily.Poisson(family_params=None, **kwargs)
Bases: ExponentialFamily
Poisson distribution
- family_params of interest:
"min_val" (int): min value (replaces 0).
- class sincei.ExponentialFamily.Beta(family_params=None, **kwargs)
Bases: ExponentialFamily
Beta distribution, using a standard formulation.
Original formulation presented in [Mourragui et al, 2023].
- family_params of interest:
"min_val" (int): min data value (replaces 0 and 1).
"n_jobs" (int): number of jobs, specifically for computing the "nu" parameter.
"method" (str): method used to compute the "nu" parameter per feature. Two options: "MLE" and "MM". Defaults to "MLE".
"eps" (float): minimum difference used for inverting the g function. Defaults to 1e-4.
"maxiter" (int): maximum number of iterations for the inversion of the g function. Defaults to 100.
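As an illustration of the "MM" option, the sketch below computes a method-of-moments estimate for one feature. It assumes that "nu" is the Beta precision a + b, which this page does not state, and `beta_precision_mm` is a hypothetical helper rather than a sincei function.

```python
import statistics

def beta_precision_mm(values):
    # Method of moments: for Beta(a, b) with mean m,
    # var = m * (1 - m) / (a + b + 1), hence a + b = m * (1 - m) / var - 1.
    m = statistics.fmean(values)
    v = statistics.pvariance(values, mu=m)
    return m * (1 - m) / v - 1

feature = [0.2, 0.4, 0.5, 0.6, 0.8]   # toy values in ]0,1[
nu_hat = beta_precision_mm(feature)
```

A maximum-likelihood estimate has no closed form for the Beta distribution and needs an iterative solver, which may be why the per-feature computation can be spread over "n_jobs" workers.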
- class sincei.ExponentialFamily.SigmoidBeta(family_params=None, **kwargs)
Bases: Beta
Beta distribution re-parametrized using a Sigmoid.
This distribution is similar to the Beta class above (which it inherits from), but the natural parameter is re-parametrized using a sigmoid. This has been shown experimentally to stabilize the optimisation by removing the ]0,1[ constraint.
- family_params of interest:
"min_val" (int): min data value (replaces 0 and 1).
"n_jobs" (int): number of jobs, specifically for computing the "nu" parameter.
"method" (str): method used to compute the "nu" parameter per feature. Two options: "MLE" and "MM". Defaults to "MLE".
"eps" (float): minimum difference used for inverting the g function. Defaults to 1e-4.
"maxiter" (int): maximum number of iterations for the inversion of the g function. Defaults to 100.
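The effect of a sigmoid re-parametrization can be sketched as follows: an update applied directly to a parameter constrained to ]0,1[ can leave the interval, whereas the same update applied to an unconstrained value and mapped back through the sigmoid cannot. The numbers are arbitrary, and which quantity sincei passes through the sigmoid is not shown on this page.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

theta = 0.95
step = 0.2                       # arbitrary update, e.g. learning rate times gradient

# Direct update on the constrained parameter leaves ]0,1[ here.
theta_direct = theta + step

# Update the unconstrained value z = logit(theta), then map back through the sigmoid.
z = math.log(theta / (1 - theta))
theta_sigmoid = sigmoid(z + step)
```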
- class sincei.ExponentialFamily.Gamma(family_params=None, **kwargs)
Bases: ExponentialFamily
Gamma distribution using a standard formulation.
Original formulation presented in [Mourragui et al, 2023].
- family_params of interest:
"min_val" (int): min data value. Defaults to 1e-5.
"max_val" (int): max data value. Defaults to 1e7.
"n_jobs" (int): number of jobs, specifically for computing the "nu" parameter.
"method" (str): method used to compute the "nu" parameter per feature. Two options: "MLE" and "MM". Defaults to "MLE".
"eps" (float): minimum difference used for inverting the g function. Defaults to 1e-4.
"maxiter" (int): maximum number of iterations for the inversion of the g function. Defaults to 100.
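The "eps" and "maxiter" parameters suggest an iterative inversion of g. The sketch below uses bisection as a stand-in, since the actual algorithm is not described here, and treats "eps" as a tolerance on the bracket width, which is an assumption. The Gamma example uses the textbook parametrization with fixed shape nu and natural parameter eta = -rate, which may differ from the one used in sincei.

```python
def invert_monotone(g, target, lo, hi, eps=1e-4, maxiter=100):
    # Bisection for an increasing function g on [lo, hi].
    # Stops when the bracket is narrower than eps or after maxiter iterations.
    for _ in range(maxiter):
        if hi - lo < eps:
            break
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Gamma with fixed shape nu: A(eta) = -nu * log(-eta) for eta < 0,
# so g(eta) = A'(eta) = -nu / eta, which is the mean and is increasing in eta.
nu = 2.0

def g(eta):
    return -nu / eta

# Closed-form inverse for comparison: eta = -nu / mean = -0.5 for mean 4.
eta_hat = invert_monotone(g, target=4.0, lo=-100.0, hi=-1e-3)
```

In this toy case the inverse is available in closed form, which makes it a convenient check; an iterative scheme matters when g involves special functions and cannot be inverted analytically.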
- class sincei.ExponentialFamily.LogNormal(family_params=None, **kwargs)
Bases: ExponentialFamily
Log-normal distribution using a standard formulation.
Original formulation presented in [Mourragui et al, 2023].
- family_params of interest:
"min_val" (int): min data value. Defaults to 1e-5.
"max_val" (int): max data value. Defaults to 1e7.
"n_jobs" (int): number of jobs, specifically for computing the "nu" parameter.
"method" (str): method used to compute the "nu" parameter per feature. Two options: "MLE" and "MM". Defaults to "MLE".
"eps" (float): minimum difference used for inverting the g function. Defaults to 1e-4.
"maxiter" (int): maximum number of iterations for the inversion of the g function. Defaults to 100.
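A plausible role for "min_val" and "max_val" is to keep the data inside a range where the logarithm is finite. The sketch below clips a value to the documented defaults before evaluating a textbook log-normal log-density. Clipping before the log is an assumption about how these bounds are applied, and both helpers are hypothetical.

```python
import math

def clip(x, min_val=1e-5, max_val=1e7):
    # Keep x inside [min_val, max_val] so that log(x) stays finite.
    return min(max(x, min_val), max_val)

def lognormal_log_density(x, mu, sigma=1.0):
    # Log-normal: log(x) is Gaussian with mean mu and standard deviation sigma.
    log_x = math.log(clip(x))
    return (-log_x - math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - (log_x - mu) ** 2 / (2 * sigma ** 2))

at_zero = lognormal_log_density(0.0, mu=0.0)   # finite thanks to the clipping
at_one = lognormal_log_density(1.0, mu=0.0)
```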