Distributions

Dirichlet

trilearn.distributions.dirichlet.log_norm_constant(alpha)

Log of the normalizing constant in the Dirichlet distribution.

Parameters:alpha (numpy array float) – alpha vector.
Returns:the log of the normalizing constant.
Return type:float
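
As a point of reference, this quantity can be sketched with the standard library alone. The snippet assumes the convention log B(alpha) = sum_i log Gamma(alpha_i) - log Gamma(sum_i alpha_i); the helper name `log_dirichlet_norm_constant` is hypothetical, and the sign convention actually used by trilearn is not confirmed here.

```python
import math


def log_dirichlet_norm_constant(alpha):
    """Return log B(alpha) under the assumed convention (hypothetical helper)."""
    # log B(alpha) = sum_i lgamma(alpha_i) - lgamma(sum_i alpha_i)
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


# alpha = (1, 1, 1) gives the uniform density on the simplex, where B = 1/2.
value = log_dirichlet_norm_constant([1.0, 1.0, 1.0])
print(value)
```

Working with `math.lgamma` rather than `math.gamma` keeps the computation stable when the entries of alpha are large.
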
trilearn.distributions.dirichlet.log_norm_constant_multidim(alpha, beta, levels)

The normalizing constant in a multidimensional multinomial distribution.

Parameters:
  • levels (list) – A list of the number of levels for each random variable, e.g. [2, 2, 3].
  • alpha (dict) – A dictionary of cells and the pseudo counts added for each cell.
  • beta (float) – A constant pseudo count for each cell.
Returns:

the normalizing constant.

Return type:

float
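
The interplay of alpha, beta and levels can be illustrated with a small sketch. It assumes that cells are keyed by tuples of level indices, that every cell receives the pseudo count beta plus its entry in alpha (if any), and that the constant follows the log B(.) convention of a Dirichlet over all cells. The function name `log_norm_constant_cells` is hypothetical and this is not claimed to be the library's implementation.

```python
import itertools
import math


def log_norm_constant_cells(alpha, beta, levels):
    """log B over all cells, each with pseudo count beta + alpha.get(cell, 0)."""
    log_gamma_sum = 0.0
    total = 0.0
    # Enumerate every cell of the contingency table, e.g. (0, 1, 2).
    for cell in itertools.product(*(range(k) for k in levels)):
        a = beta + alpha.get(cell, 0.0)  # sparse alpha, dense constant beta
        log_gamma_sum += math.lgamma(a)
        total += a
    return log_gamma_sum - math.lgamma(total)


# Two binary variables, one extra pseudo count in cell (0, 0).
value = log_norm_constant_cells({(0, 0): 1.0}, 1.0, [2, 2])
print(value)
```

Only the cells that deviate from the constant beta need to be stored in the dictionary, which is the point of splitting the pseudo counts into alpha and beta.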

trilearn.distributions.dirichlet.log_pdf(x, alpha)

Log density function of the Dirichlet distribution.

Parameters:
  • alpha (np.array float) – alpha vector.
  • x (float) – function argument.
Returns:

the log density at x.

Return type:

float
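
For orientation, a Dirichlet log density can be sketched as follows. The sketch assumes x is a point on the probability simplex with strictly positive entries and the same length as alpha; the name `dirichlet_log_pdf` is hypothetical and the code is not taken from trilearn.

```python
import math


def dirichlet_log_pdf(x, alpha):
    """log f(x) = sum_i (alpha_i - 1) * log(x_i) - log B(alpha)."""
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    # Assumes every x_i > 0; math.log raises ValueError otherwise.
    kernel = sum((a - 1.0) * math.log(xi) for xi, a in zip(x, alpha))
    return kernel - log_b


# With alpha = (2, 1) the density at x = (0.3, 0.7) is 2 * 0.3 = 0.6.
value = dirichlet_log_pdf([0.3, 0.7], [2.0, 1.0])
print(math.exp(value))
```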

trilearn.distributions.dirichlet.pdf(x, alpha)

The density function f(x) of a one-dimensional Dirichlet distribution.

Parameters:
  • alpha (np.array float) – alpha vector.
  • x (float) – function argument.
Returns:

the density at x, f(x)

Return type:

float

trilearn.distributions.dirichlet.pdf_multidim(x, alpha, beta, levels)

The density function f(x) of a multi-dimensional Dirichlet distribution.

Parameters:
  • alpha (dict) – a dictionary of cell-specific pseudo counts.
  • beta (float) – a constant added to all pseudo counts, which avoids storing large pseudo-count tables.
Returns:

the density at x, f(x)

Return type:

float

Discrete decomposable log linear

trilearn.distributions.discrete_dec_log_linear.conditional_prob_dec(x, y, dist, cliques, separators)

Conditional probability of x given y, p(x | y).

Parameters:x (dict) –
trilearn.distributions.discrete_dec_log_linear.est_parameters(graph, data, levels, const_alpha)
trilearn.distributions.discrete_dec_log_linear.get_all_counts(graph, data)
trilearn.distributions.discrete_dec_log_linear.hyperconsistent_cliques(clique1, clique1_dist, clique2, levels, constant_alpha)

Returns a distribution for clique2 that is hyper-consistent with clique1_dist.

Parameters:
  • clique1 (set) – A clique
  • clique1_dist (np.array) – A distribution for clique1
  • clique2 (set) – A clique
  • levels (np.array of lists) – levels for all nodes in the full graph
trilearn.distributions.discrete_dec_log_linear.ll_complete_set_ratio(comp, alpha, counts, data, levels, cache)

The ratio of normalizing constants, I(alpha + n) / I(alpha), for a posterior Dirichlet distribution defined over a complete set (clique or separator).

Parameters:
  • comp – Clique or separator.
  • alpha – Pseudo counts for each cell.
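
The ratio I(alpha + n) / I(alpha) can be sketched on the log scale as below. The sketch assumes that I(.) is the Dirichlet normalizing constant B(.) and that n holds the observed cell counts, aligned with alpha; both function names are hypothetical.

```python
import math


def log_norm_constant(alpha):
    """log B(alpha) for a vector of positive pseudo counts."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def log_norm_constant_ratio(alpha, counts):
    """log I(alpha + n) - log I(alpha), with n given by counts."""
    posterior = [a + n for a, n in zip(alpha, counts)]
    return log_norm_constant(posterior) - log_norm_constant(alpha)


# Uniform prior on a binary cell table, two observations in the first cell.
value = log_norm_constant_ratio([1.0, 1.0], [2, 0])
print(math.exp(value))
```

Under these assumptions the ratio is the marginal probability of the observed sequence for that clique or separator, which is why it is a natural quantity to cache per complete set.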

trilearn.distributions.discrete_dec_log_linear.locals_to_joint_prob_table(graph, parameters, levels)

This is way too slow.

trilearn.distributions.discrete_dec_log_linear.log_likelihood_partial(cliques, separators, no_levels, cell_alpha, counts, data, levels, cache)
trilearn.distributions.discrete_dec_log_linear.prob_dec(x, parameters, cliques, separators)

Probability of numpy array x in a decomposable model.
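
In a decomposable model the joint probability factorizes as the product of clique marginals divided by the product of separator marginals. The sketch below illustrates that identity only; the data layout (lists of `(nodes, table)` pairs with tables keyed by tuples of values) and the name `prob_decomposable` are assumptions made for the example and do not describe the `parameters` argument above.

```python
import itertools


def prob_decomposable(x, clique_tables, separator_tables):
    """p(x) = prod_C p_C(x_C) / prod_S p_S(x_S) for consistent marginals."""
    prob = 1.0
    for nodes, table in clique_tables:
        prob *= table[tuple(x[v] for v in nodes)]
    # One entry per junction tree edge, so repeated separators appear repeatedly.
    for nodes, table in separator_tables:
        prob /= table[tuple(x[v] for v in nodes)]
    return prob


# Chain 0 - 1 - 2 with binary variables; both clique tables have marginal
# (0.5, 0.5) on node 1, so they agree on the separator.
cliques = [
    ((0, 1), {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}),
    ((1, 2), {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3}),
]
separators = [((1,), {(0,): 0.5, (1,): 0.5})]

total = sum(
    prob_decomposable(x, cliques, separators)
    for x in itertools.product(range(2), repeat=3)
)
print(total)
```

The sketch does not guard against separator cells with zero probability.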

trilearn.distributions.discrete_dec_log_linear.read_local_hyper_consistent_parameters_from_json_file(filename)
trilearn.distributions.discrete_dec_log_linear.sample(table, n=1)

This is not optimized: it should sample one clique at a time instead of one node at a time.

trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_counts(graph, levels, constant_alpha)

TODO

trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_parameters(graph, constant_alpha, levels)
trilearn.distributions.discrete_dec_log_linear.sample_joint_prob_table(graph, levels, total_counts)
trilearn.distributions.discrete_dec_log_linear.sample_prob_table(graph, levels, total_counts=1.0)

Graph intra-class

The graph intra-class distribution.

trilearn.distributions.g_intra_class.cov_matrix(G, r, s2)

Returns a covariance matrix cov such that the zeros in its inverse, cov.I, are determined by G.

Parameters:
  • G (NetworkX graph) – A decomposable graph.
  • r (float) – Correlation.
  • s2 (float) – Variance.
Returns:

A covariance matrix cov such that the zeros in its inverse are determined by G.

Return type:

Numpy matrix
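
One way to see how G fixes the zero pattern is to assemble the precision matrix from the cliques and separators of a decomposable graph. The sketch below assumes the standard identity K = sum over cliques of the zero-padded inverse clique covariance, minus the same sum over separators, and assumes each clique covariance has s2 on the diagonal and r * s2 elsewhere. All names are hypothetical, and the cliques and separators are passed in directly rather than derived from a NetworkX graph.

```python
def intra_class_inverse(m, r, s2):
    """Closed-form inverse of the m x m matrix s2 * ((1 - r) * I + r * J)."""
    a = 1.0 / (s2 * (1.0 - r))
    b = r / (1.0 + (m - 1) * r)
    return [[a * ((1.0 if i == j else 0.0) - b) for j in range(m)]
            for i in range(m)]


def precision_matrix(p, cliques, separators, r, s2):
    """Precision matrix on p nodes; zeros appear for non-adjacent pairs."""
    K = [[0.0] * p for _ in range(p)]
    for sign, node_sets in ((1.0, cliques), (-1.0, separators)):
        for node_set in node_sets:
            nodes = sorted(node_set)
            inv = intra_class_inverse(len(nodes), r, s2)
            # Add (or subtract) the block, padded with zeros elsewhere.
            for a, i in enumerate(nodes):
                for b, j in enumerate(nodes):
                    K[i][j] += sign * inv[a][b]
    return K


# Path graph 0 - 1 - 2: nodes 0 and 2 are not adjacent.
K = precision_matrix(3, [{0, 1}, {1, 2}], [{1}], r=0.5, s2=1.0)
print(K[0][2])
```

Inverting K, for example with `numpy.linalg.inv`, would then give a covariance matrix with the required property.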

trilearn.distributions.g_intra_class.sample(G, r, s2, n)

Samples from the G-intra-class distribution [1].

Parameters:
  • G (NetworkX graph) – a decomposable graph
  • r (float) – correlation
  • s2 (float) – variance
  • n (int) – number of samples
Returns:

n samples from the G-intra-class distribution in a row matrix.

Return type:

np.matrix

References

[1]
    P. Green and A. Thomas. Sampling decomposable graphs using a Markov chain on junction trees. Biometrika, 2013. https://doi.org/10.1093/biomet/ass052

Graph inverse-Wishart

trilearn.distributions.g_inv_wishart.sample(G, dof, scale)

Samples from the G-inverse Wishart distribution [2].

Parameters:
  • G (networkx graph) – A decomposable graph.
  • scale (numpy matrix) – Scale parameter, a positive definite matrix.
  • dof (float) – Degrees of freedom, a positive real number.
Returns:

A sample from the G-inverse Wishart distribution.

Return type:

numpy matrix

References

[2]
    C. Carvalho, H. Massam, and M. West. Simulation of hyper-inverse Wishart distributions in graphical models. Biometrika, 94(3):647-659, 2007.

Gaussian graphical model

Gaussian graphical model.

trilearn.distributions.gaussian_graphical_model.gaussian_marginal_log_likelihood(S, n, D, delta, cache={})

Marginal log-likelihood of the data x under a zero-mean normal distribution, with the precision matrix marginalized out.

Parameters:
  • S (Numpy matrix) – sum of squares matrix for the full distribution
  • D (Numpy matrix) – location matrix for the full distribution
  • delta (float) – scale parameter
  • n (int) – number of data samples on which S is built
trilearn.distributions.gaussian_graphical_model.log_likelihood(graph, S, n, D, delta, cache={})
Parameters:
  • S (Numpy matrix) – sum of squares matrix for the full distribution
  • D (Numpy matrix) – location matrix for the full distribution
  • delta (float) – scale parameter
  • n (int) – number of data samples on which S is built
trilearn.distributions.gaussian_graphical_model.log_likelihood_partial(S, n, D, delta, cliques, separators, cache={}, idmatrices=None)

Partial log-likelihood of the given cliques and separators. If the cliques and separators are exactly those of a graph g, this is the marginal log-likelihood of g.

Parameters:
  • S (Numpy matrix) – sum of squares matrix for the full distribution
  • D (Numpy matrix) – location matrix for the full distribution
  • delta (float) – scale parameter
  • n (int) – number of data samples on which S is built
  • cliques (list) – list of cliques, represented as frozensets
  • separators (dict) – dict with separators as keys and list of associated edges as values
  • cache (dict) – dict with separators or cliques as keys and partial log-likelihoods as values
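
The structure of such a partial log-likelihood can be sketched independently of the Gaussian details. The sketch assumes the usual decomposable form (clique terms minus separator terms), assumes each separator is counted once per associated edge, and uses a hypothetical `local_score` callable in place of the actual per-set marginal likelihood; `partial_score` is likewise a made-up name.

```python
def partial_score(cliques, separators, local_score, cache=None):
    """Sum of clique scores minus separator scores, with memoisation."""
    if cache is None:
        cache = {}

    def cached(nodes):
        # Each complete set is scored at most once across calls sharing a cache.
        if nodes not in cache:
            cache[nodes] = local_score(nodes)
        return cache[nodes]

    total = sum(cached(c) for c in cliques)
    for sep, edges in separators.items():
        total -= len(edges) * cached(sep)  # assumed: one term per edge
    return total


# Placeholder score, purely for illustration: minus the size of the set.
c1, c2 = frozenset({0, 1}), frozenset({1, 2})
shared_cache = {}
value = partial_score(
    [c1, c2],
    {frozenset({1}): [(c1, c2)]},
    local_score=lambda nodes: -float(len(nodes)),
    cache=shared_cache,
)
print(value)
```

The sketch defaults `cache` to None and creates a fresh dict per call, whereas the signatures above use `cache={}`, which in Python shares one dict across all calls that omit the argument.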

Junction tree clustering

Matrix multivariate normal

trilearn.distributions.matrix_multivariate_normal.sample(M, S, Sigma)

Generates a sample from the multivariate matrix normal distribution.

Multivariate Student's t

Student's t-distribution.

trilearn.distributions.multivariate_students_t.log_pdf(x, mu, T, n)

Sequential junction tree distribution

Junction tree distributions suitable for SMC sampling.

class trilearn.distributions.sequential_junction_tree_distributions.CondUniformGivenSizeJTDistribution(p, size)

Bases: trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

A sequential formulation of

\[P(T) = P(T, G) = P(T|G)P(G),\]

where

\[P(G) = \frac{1}{\text{# decomposable graphs}} \cdot I(\text{size of } G = k)\]

and

\[P(T|G) = \frac{1}{\text{# junction trees for } G}.\]

ll(graph)

Log-likelihood.

Parameters:graph (Networkx graph) – A graph.
log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)

Log-likelihood ratio of new_JT and old_JT.

Parameters:
  • old_cliques ([type]) – [description]
  • old_separators ([type]) – [description]
  • new_cliques ([type]) – [description]
  • new_separators ([type]) – [description]
  • old_JT ([type]) – [description]
  • new_JT ([type]) – [description]
Returns:

[description]

Return type:

[type]

class trilearn.distributions.sequential_junction_tree_distributions.CondUniformJTDistribution(p)

Bases: trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

A sequential formulation of
\[P(T) = P(T|G)P(G),\]

where

\[P(G)= \frac{1}{\text{# decomposable graphs}}\]

and

\[P(T|G) = \frac{1}{\text{#junction trees for G}}.\]
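
To make the factor P(G) concrete, the number of decomposable (chordal) graphs can be counted by brute force for very small p. This is only an illustration of the formula: it enumerates every labelled graph, so it is feasible for a handful of nodes at most, and it is not meant to reflect how the classes here evaluate the distribution. All function names are hypothetical.

```python
import itertools
import math


def is_decomposable(p, edges):
    """Chordality test: repeatedly delete a vertex whose neighbours form a clique."""
    adj = {v: set() for v in range(p)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(range(p))

    def is_simplicial(v):
        nbrs = adj[v] & remaining
        return all(b in adj[a] for a, b in itertools.combinations(nbrs, 2))

    while remaining:
        v = next((u for u in remaining if is_simplicial(u)), None)
        if v is None:
            return False  # no simplicial vertex, so a chordless cycle exists
        remaining.remove(v)
    return True


def count_decomposable_graphs(p):
    """Number of labelled decomposable graphs on p nodes (exhaustive search)."""
    pairs = list(itertools.combinations(range(p), 2))
    return sum(
        is_decomposable(p, edges)
        for k in range(len(pairs) + 1)
        for edges in itertools.combinations(pairs, k)
    )


def log_uniform_graph_prior(p):
    """log P(G) = -log(# decomposable graphs on p nodes)."""
    return -math.log(count_decomposable_graphs(p))


print(count_decomposable_graphs(4), log_uniform_graph_prior(4))
```

On four nodes, the only non-decomposable graphs among the 64 labelled graphs are the three chordless four-cycles.
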
ll(graph)
log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)
class trilearn.distributions.sequential_junction_tree_distributions.GGMJTPosterior

Bases: trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

Posterior of a junction tree for a GGM.

get_json_model()
init_model(X, D, delta, cache={})
init_model_from_json(sd_json)
ll_diff(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)
log_likelihood(graph)
log_likelihood_partial(cliques, separators)
log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)
class trilearn.distributions.sequential_junction_tree_distributions.LogLinearJTPosterior

Bases: trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

Posterior for a log-linear model.

get_json_model()
init_model(X, cell_alpha, levels, cache_complete_set_prob={}, counts={})
Parameters:
  • cell_alpha – the constant number of pseudo counts for each cell in the full distribution.
init_model_from_json(sd_json)
log_likelihood(graph)
log_likelihood_diff(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)

Log-likelihood difference when cliques and separators are added and removed.

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)
class trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

Bases: object

Abstract class of junction tree distributions for SMC sampling.

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)
class trilearn.distributions.sequential_junction_tree_distributions.UniformJTDistribution(p)

Bases: trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution

A sequential formulation of P(T) = P(T|G)P(G), where P(G) = 1/(#decomposable graphs) and P(T|G) = 1/(#junction trees for G).

log_likelihood(jt)
log_likelihood_partial(cliques, separators)
log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)

Wishart distribution

trilearn.distributions.wishart.log_norm_constant(D, delta, cache={})
trilearn.distributions.wishart.logpdf(S, D, delta)
trilearn.distributions.wishart.normalizing_constant(phi, delta)