Distributions

Dirichlet

trilearn.distributions.dirichlet.log_norm_constant(alpha)[source]

Log of the normalizing constant in the Dirichlet distribution.

Parameters:: alpha (numpy array float) – alpha vector.
Returns:: the log of the normalizing constant.
Return type:: float
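As a rough illustration of the quantity involved, the sketch below computes the log of the multivariate beta function using only the standard library. It assumes the convention B(alpha) = prod_i Gamma(alpha_i) / Gamma(sum_i alpha_i); trilearn may use the reciprocal, and `dirichlet_log_norm_constant` is a hypothetical name, not the library function.

```python
import math


def dirichlet_log_norm_constant(alpha):
    """Log of B(alpha) = prod_i Gamma(alpha_i) / Gamma(sum_i alpha_i).

    Hypothetical helper; the sign convention is an assumption.
    """
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


# For alpha = (1, 1, 1): B(alpha) = 1 / Gamma(3) = 1/2.
value = dirichlet_log_norm_constant([1.0, 1.0, 1.0])
```

Working with `lgamma` rather than `gamma` keeps the computation stable for large pseudo counts.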

trilearn.distributions.dirichlet.log_norm_constant_multidim(alpha, beta, levels)[source]

Log of the normalizing constant in a multidimensional multinomial distribution.

Parameters:

levels (list) – A list of the number of levels for each random variable, e.g. [2, 2, 3].
alpha (dict) – A dictionary of cells and the pseudo counts added for each cell.
beta (float) – A constant pseudo count for each cell.

Returns:

the normalizing constant.

Return type:

float
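The interplay between alpha, beta and levels can be sketched as follows. This is not the library code: it assumes that the pseudo count of a cell is `alpha.get(cell, 0) + beta`, that cells are tuples of level indices, and that the constant is the multivariate beta function over all cells. The name `log_norm_constant_cells` is hypothetical.

```python
import itertools
import math


def log_norm_constant_cells(alpha, beta, levels):
    """Log multivariate beta function over every cell of a contingency table.

    Assumption: the pseudo count of a cell is alpha.get(cell, 0) + beta.
    """
    total = 0.0
    log_gamma_sum = 0.0
    # Enumerate every cell, e.g. (0, 0), (0, 1), ... for levels [2, 2].
    for cell in itertools.product(*(range(k) for k in levels)):
        a = alpha.get(cell, 0.0) + beta
        log_gamma_sum += math.lgamma(a)
        total += a
    return log_gamma_sum - math.lgamma(total)


# Two binary variables, one extra pseudo count in cell (0, 0).
result = log_norm_constant_cells({(0, 0): 1.0}, 1.0, [2, 2])
```

Only cells that deviate from the constant beta need an entry in the dictionary, which is what keeps the pseudo count table small.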

trilearn.distributions.dirichlet.log_pdf(x, alpha)[source]

Log density function of the Dirichlet distribution.

Parameters:

alpha (np.array float) – alpha vector.
x (float) – function argument.

Returns:

the log density at x.

Return type:

float
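For orientation, a minimal standard-library sketch of a Dirichlet log density is given below. It takes x as a probability vector on the simplex, which is an assumption (the parameter list above documents x as a float), and `dirichlet_log_pdf` is a hypothetical name.

```python
import math


def dirichlet_log_pdf(x, alpha):
    """Log density of Dirichlet(alpha) at a point x on the simplex.

    Hypothetical helper: log f(x) = sum_i (alpha_i - 1) log x_i - log B(alpha).
    """
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    return sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)) - log_b


# With alpha = (1, 1, 1) the density is flat and equals Gamma(3) = 2.
log_density = dirichlet_log_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
```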

trilearn.distributions.dirichlet.pdf(x, alpha)[source]

The density function f(x) of a one-dimensional Dirichlet distribution.

Parameters:

alpha (np.array float) – alpha vector.
x (float) – function argument.

Returns:

the density at x, f(x).

Return type:

float

trilearn.distributions.dirichlet.pdf_multidim(x, alpha, beta, levels)[source]

The density function f(x) of a multi-dimensional Dirichlet distribution.

Parameters:

alpha (dict) – a dictionary specifying specific pseudo counts.
beta (float) – is added to all pseudo counts. This is to avoid storing large pseudo count tables.

Returns:

the density at x, f(x)

Return type:

float

Discrete decomposable log linear

trilearn.distributions.discrete_dec_log_linear.conditional_prob_dec(x, y, dist, cliques, separators)[source]

Conditional probability of x given y, p(x | y).

Parameters:: x (dict) –

trilearn.distributions.discrete_dec_log_linear.est_parameters(graph, data, levels, const_alpha)[source]

trilearn.distributions.discrete_dec_log_linear.get_all_counts(graph, data)[source]

trilearn.distributions.discrete_dec_log_linear.hyperconsistent_cliques(clique1, clique1_dist, clique2, levels, constant_alpha)[source]

Returns a distribution for clique2 that is hyper-consistent with clique1_dist.

Parameters:

clique1 (set) – A clique
clique1_dist (np.array) – A distribution for clique1
clique2 (set) – A clique
levels (np.array of lists) – levels for all nodes in the full graph

trilearn.distributions.discrete_dec_log_linear.ll_complete_set_ratio(comp, alpha, counts, data, levels, cache)[source]

The ratio of normalizing constants for a posterior Dirichlet distribution defined over a complete set (clique or separator), I(alpha + n) / I(alpha).

Parameters:

comp – Clique or separator.
alpha – Pseudo counts for each cell.
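The ratio I(alpha + n) / I(alpha) can be illustrated on the log scale with a small sketch. It assumes I(.) is the multivariate beta function and that alpha and the counts n are given as aligned lists over the cells of the complete set; both function names are hypothetical.

```python
import math


def log_beta(alpha):
    """Log multivariate beta function, assumed here to play the role of I(alpha)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def log_norm_constant_ratio(alpha, counts):
    """log I(alpha + n) - log I(alpha) for cell-wise pseudo counts and counts."""
    posterior = [a + n for a, n in zip(alpha, counts)]
    return log_beta(posterior) - log_beta(alpha)


# One binary variable, uniform prior, observed counts (3, 1):
# the ratio is B(4, 2) / B(1, 1) = 1/20.
log_ratio = log_norm_constant_ratio([1.0, 1.0], [3, 1])
```

Under these assumptions the ratio equals the marginal likelihood of the observed sequence for that complete set.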

trilearn.distributions.discrete_dec_log_linear.locals_to_joint_prob_table(graph, parameters, levels)[source]: This is way too slow.

trilearn.distributions.discrete_dec_log_linear.log_likelihood_partial(cliques, separators, no_levels, cell_alpha, counts, data, levels, cache)[source]

trilearn.distributions.discrete_dec_log_linear.prob_dec(x, parameters, cliques, separators)[source]: Probability of numpy array x in a decomposable model.
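In a decomposable model the joint probability factorizes as the product of clique marginals divided by the product of separator marginals. The sketch below shows that factorization for the chain A - B - C. The data layout (dictionaries keyed by tuples of variable names) and the name `prob_decomposable` are assumptions made for illustration and do not reflect trilearn's parameter format; separators that occur more than once in a junction tree would need to be counted with multiplicity.

```python
def prob_decomposable(x, clique_tables, separator_tables):
    """p(x) = prod_C p(x_C) / prod_S p(x_S) for a decomposable model.

    x maps variable names to values; each table maps a tuple of values
    (ordered like its key of variable names) to a marginal probability.
    """
    p = 1.0
    for variables, table in clique_tables.items():
        p *= table[tuple(x[v] for v in variables)]
    for variables, table in separator_tables.items():
        p /= table[tuple(x[v] for v in variables)]
    return p


# Chain A - B - C with cliques {A, B}, {B, C} and separator {B}.
# Both clique tables agree on the marginal of B (0.4, 0.6).
cliques = {
    ("A", "B"): {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4},
    ("B", "C"): {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 0.3},
}
separators = {("B",): {(0,): 0.4, (1,): 0.6}}

p = prob_decomposable({"A": 1, "B": 1, "C": 0}, cliques, separators)
```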

trilearn.distributions.discrete_dec_log_linear.read_local_hyper_consistent_parameters_from_json_file(filename)[source]

trilearn.distributions.discrete_dec_log_linear.sample(table, n=1)[source]: This is not optimized: it should sample one clique at a time instead of one node at a time.

trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_counts(graph, levels, constant_alpha)[source]: TODO

trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_parameters(graph, constant_alpha, levels)[source]

trilearn.distributions.discrete_dec_log_linear.sample_joint_prob_table(graph, levels, total_counts)[source]

trilearn.distributions.discrete_dec_log_linear.sample_prob_table(graph, levels, total_counts=1.0)[source]

Graph intra-class

The graph intra-class distribution.

trilearn.distributions.g_intra_class.cov_matrix(G, r, s2)[source]

Returns a covariance matrix cov such that the zeros in cov.I (its inverse) are determined by G.

Parameters:

G (NetworkX graph) – A decomposable graph.
r (float) – Correlation.
s2 (float) – Variance.

Returns:

A covariance matrix cov such that the zeros in its inverse are determined by G.

Return type:

Numpy matrix
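The property that the zeros of the inverse covariance follow the graph can be checked by hand on the chain 0 - 1 - 2. The sketch below is an illustration only, not the library's construction: it assumes each clique has variance s2 on the diagonal and covariance r * s2 off the diagonal, and builds the precision matrix K as the sum of zero-padded inverse clique covariances minus the zero-padded inverse separator covariances.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


s2, r = 1.0, 0.5
clique_cov = [[s2, r * s2], [r * s2, s2]]  # assumed intra-class block of a 2-node clique
clique_prec = inv2(clique_cov)

# Chain 0 - 1 - 2: cliques (0, 1) and (1, 2), separator (1,).
K = [[0.0] * 3 for _ in range(3)]
for clique in [(0, 1), (1, 2)]:
    for i, u in enumerate(clique):
        for j, v in enumerate(clique):
            K[u][v] += clique_prec[i][j]
K[1][1] -= 1.0 / s2  # subtract the inverse variance of the separator node

# Nodes 0 and 2 are not adjacent, so K[0][2] and K[2][0] stay zero.
```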

trilearn.distributions.g_intra_class.sample(G, r, s2, n)[source]

Samples from the G-intra-class distribution [1].

Parameters:

G (NetworkX graph) – a decomposable graph
r (float) – correlation
s2 (float) – variance
n (int) – number of samples

Returns:

n samples from the G-intra-class distribution in a row matrix.

Return type:

np.matrix

References

Graph inverse-Wishart

trilearn.distributions.g_inv_wishart.sample(G, dof, scale)[source]

Sample from G-inverse Wishart distribution [2].

Parameters:

G (networkx graph) – A decomposable graph.
scale (numpy matrix) – Scale parameter, a positive definite matrix.
dof (float) – Degrees of freedom, a positive real number.

Returns:

A sample from the G-inverse Wishart distribution.

Return type:

numpy matrix

References

Gaussian graphical model

Gaussian graphical model.

trilearn.distributions.gaussian_graphical_model.gaussian_marginal_log_likelihood(S, n, D, delta, cache={})[source]

Marginal log-likelihood of the data x under a zero-mean normal distribution, with the precision matrix marginalized out.

Parameters:

S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built

trilearn.distributions.gaussian_graphical_model.log_likelihood(graph, S, n, D, delta, cache={})[source]

Parameters:

S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built

trilearn.distributions.gaussian_graphical_model.log_likelihood_partial(S, n, D, delta, cliques, separators, cache={}, idmatrices=None)[source]

Partial log-likelihood of the given cliques and separators. If these are all the cliques and separators of a graph g, this is the marginal log-likelihood of g.

Parameters:

S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built
cliques (list) – list of cliques, represented as frozensets
separators (dict) – dict with separators as keys and list of associated edges as values
cache (dict) – dict with separators or cliques as keys and partial log-likelihoods as values

Junction tree clustering

Matrix multivariate normal

trilearn.distributions.matrix_multivariate_normal.sample(M, S, Sigma)[source]: Generates a sample from the multivariate matrix normal distribution.

Multivariate Student's t

Student's t-distribution.

trilearn.distributions.multivariate_students_t.log_pdf(x, mu, T, n)[source]

Sequential junction tree distribution

Junction tree distributions suitable for SMC sampling.

class trilearn.distributions.sequential_junction_tree_distributions.CondUniformGivenSizeJTDistribution(p, size)[source]

Bases: SequentialJTDistribution

A sequential formulation of

\[P(T) = P(T, G) = P(T|G)P(G),\]

where

\[P(G) = \frac{1}{\text{# decomposable graphs}} \cdot I(\text{size of } G = k)\]

and

\[P(T|G) = \frac{1}{\text{# junction trees for G}}.\]

ll(graph)[source]

Log-likelihood.

Parameters:: graph (Networkx graph) – A graph.

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

Log-likelihood ratio of new_JT and old_JT.

class trilearn.distributions.sequential_junction_tree_distributions.CondUniformJTDistribution(p)[source]

Bases: SequentialJTDistribution

A sequential formulation of: \[P(T) = P(T|G)P(G),\]

where

\[P(G)= \frac{1}{\text{# decomposable graphs}}\]

and

\[P(T|G) = \frac{1}{\text{#junction trees for G}}.\]

ll(graph)[source]

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

class trilearn.distributions.sequential_junction_tree_distributions.GGMJTPosterior[source]

Bases: SequentialJTDistribution

Posterior of Junction tree for a GGM.

get_json_model()[source]

init_model(X, D, delta, cache={})[source]

init_model_from_json(sd_json)[source]

ll_diff(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

log_likelihood(graph)[source]

log_likelihood_partial(cliques, separators)[source]

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

class trilearn.distributions.sequential_junction_tree_distributions.LogLinearJTPosterior[source]

Bases: SequentialJTDistribution

Posterior for a log-linear model.

get_json_model()[source]

init_model(X, cell_alpha, levels, cache_complete_set_prob={}, counts={})[source]

Parameters:

cell_alpha – the constant number of pseudo counts for each cell in the full distribution.

init_model_from_json(sd_json)[source]

log_likelihood(graph)[source]

log_likelihood_diff(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]: Log-likelihood difference when cliques and separators are added and removed.

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

class trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution[source]

Bases: object

Abstract class of junction tree distributions for SMC sampling.

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

class trilearn.distributions.sequential_junction_tree_distributions.UniformJTDistribution(p)[source]

Bases: SequentialJTDistribution

A sequential formulation of P(T) = P(T|G)P(G), where P(G) = 1/(# decomposable graphs) and P(T|G) = 1/(# junction trees for G).

log_likelihood(jt)[source]

log_likelihood_partial(cliques, separators)[source]

log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]

Wishart distribution

trilearn.distributions.wishart.log_norm_constant(D, delta, cache={})[source]

trilearn.distributions.wishart.logpdf(S, D, delta)[source]

trilearn.distributions.wishart.normalizing_constant(phi, delta)[source]