Distributions
Dirichlet
- trilearn.distributions.dirichlet.log_norm_constant(alpha)[source]
Log of the normalizing constant in the Dirichlet distribution.
- Parameters:
alpha (numpy array float) – alpha vector.
- Returns:
the log of the normalizing constant.
- Return type:
float
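As a rough illustration of the quantity involved, the sketch below computes the log of the multivariate beta function, which is the standard normalizing integral of a Dirichlet density. The helper name `log_multivariate_beta` is hypothetical, and whether `log_norm_constant` returns this value or its negative is an assumption that should be checked against the source.

```python
import math

def log_multivariate_beta(alpha):
    """log B(alpha) = sum_i lgamma(alpha_i) - lgamma(sum_i alpha_i).

    Hypothetical helper for illustration only; not trilearn's implementation.
    """
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

# A flat Dirichlet over three categories has B(1, 1, 1) = 1/2.
value = log_multivariate_beta([1.0, 1.0, 1.0])
```

Working with `lgamma` rather than `gamma` keeps the computation stable for large pseudo counts.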
- trilearn.distributions.dirichlet.log_norm_constant_multidim(alpha, beta, levels)[source]
Log of the normalizing constant in a multidimensional multinomial distribution.
- Parameters:
levels (list) – A list of the number of levels for each random variable, e.g. [2, 2, 3].
alpha (dict) – A dictionary of cells and the pseudo counts added for each cell.
beta (float) – A constant pseudo count for each cell.
- Returns:
the log of the normalizing constant.
- Return type:
float
- trilearn.distributions.dirichlet.log_pdf(x, alpha)[source]
Log density function of the Dirichlet distribution.
- Parameters:
alpha (np.array float) – alpha vector.
x (float) – function argument.
- Returns:
the log density at x.
- Return type:
float
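The textbook Dirichlet log density can be sketched as follows. This is not the library's code: the function name `dirichlet_log_pdf` is hypothetical, and `x` is assumed here to be a probability vector with strictly positive entries summing to one, of the same length as `alpha`.

```python
import math

def dirichlet_log_pdf(x, alpha):
    """log f(x) = sum_i (alpha_i - 1) * log(x_i) - log B(alpha).

    Assumes x lies in the open probability simplex (an assumption of this sketch).
    """
    log_norm = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    return sum((a - 1.0) * math.log(xi) for xi, a in zip(x, alpha)) - log_norm

# With alpha = (1, 1, 1) the density is constant and equal to 2 on the simplex.
flat = dirichlet_log_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
```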
- trilearn.distributions.dirichlet.pdf(x, alpha)[source]
The density function f(x) of a one-dimensional Dirichlet distribution.
- Parameters:
alpha (np.array float) – alpha vector.
x (float) – function argument.
- Returns:
the density at x, f(x).
- Return type:
float
- trilearn.distributions.dirichlet.pdf_multidim(x, alpha, beta, levels)[source]
The density function f(x) of a multi-dimensional Dirichlet distribution.
- Parameters:
alpha (dict) – a dictionary specifying specific pseudo counts.
beta (float) – is added to all pseudo counts. This is to avoid storing large pseudo count tables.
- Returns:
the density at x, f(x)
- Return type:
float
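The split between `alpha` and `beta` can be pictured with a small sketch: only cells with extra pseudo counts are stored in the dictionary, and the constant is added to every cell on lookup. The helper `effective_pseudo_counts` and the choice of tuples as cell keys are assumptions made for illustration.

```python
import itertools

def effective_pseudo_counts(alpha, beta, levels):
    """Expand sparse pseudo counts into a full table (hypothetical helper).

    alpha:  dict mapping a cell (assumed here to be a tuple of level indices)
            to its specific pseudo count.
    beta:   constant added to every cell.
    levels: number of levels per variable, e.g. [2, 2, 3].
    """
    cells = itertools.product(*(range(k) for k in levels))
    return {cell: alpha.get(cell, 0.0) + beta for cell in cells}

table = effective_pseudo_counts({(0, 1, 2): 3.0}, 0.5, [2, 2, 3])
```

Materializing the full table is done here only to show the semantics; the point of `beta` is that such a table never has to be stored.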
Discrete decomposable log linear
- trilearn.distributions.discrete_dec_log_linear.conditional_prob_dec(x, y, dist, cliques, separators)[source]
Conditional probability of x given y, p(x | y).
- Parameters:
x (dict) –
- trilearn.distributions.discrete_dec_log_linear.est_parameters(graph, data, levels, const_alpha)[source]
- trilearn.distributions.discrete_dec_log_linear.hyperconsistent_cliques(clique1, clique1_dist, clique2, levels, constant_alpha)[source]
Returns a distribution for clique2 that is hyper-consistent with clique1_dist.
- Parameters:
clique1 (set) – A clique
clique1_dist (np.array) – A distribution for clique1
clique2 (set) – A clique
levels (np.array of lists) – levels for all nodes in the full graph
- trilearn.distributions.discrete_dec_log_linear.ll_complete_set_ratio(comp, alpha, counts, data, levels, cache)[source]
The ratio of normalizing constants for a posterior Dirichlet distribution defined over a complete set (clique or separator), I(alpha + n) / I(alpha).
- Parameters:
comp – Clique or separator.
alpha – Pseudo counts for each cell.
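A minimal sketch of such a ratio on the log scale, assuming that I(.) denotes the Dirichlet normalizing integral (the multivariate beta function) and that the pseudo counts and data counts are given as flat lists over the cells of one complete set. Both function names are hypothetical.

```python
import math

def log_beta(alpha):
    """log of the multivariate beta function B(alpha)."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))

def log_norm_ratio(alpha, counts):
    """log( I(alpha + n) / I(alpha) ), assuming I is the multivariate beta function."""
    posterior = [a + n for a, n in zip(alpha, counts)]
    return log_beta(posterior) - log_beta(alpha)

# One observation in the first of two cells under a flat prior:
# the marginal probability of that observation is 1/2.
ratio = log_norm_ratio([1.0, 1.0], [1, 0])
```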
- trilearn.distributions.discrete_dec_log_linear.locals_to_joint_prob_table(graph, parameters, levels)[source]
This is way too slow.
- trilearn.distributions.discrete_dec_log_linear.log_likelihood_partial(cliques, separators, no_levels, cell_alpha, counts, data, levels, cache)[source]
- trilearn.distributions.discrete_dec_log_linear.prob_dec(x, parameters, cliques, separators)[source]
Probability of numpy array x in a decomposable model.
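In a decomposable model the joint probability factorizes as the product of clique marginals divided by the product of separator marginals. The sketch below shows that factorization on a toy chain; the function name and the dictionary layout of the tables are assumptions, not the format `prob_dec` expects for `parameters`.

```python
import itertools

def decomposable_prob(x, clique_tables, separator_tables):
    """p(x) = prod_C p_C(x_C) / prod_S p_S(x_S) (illustrative layout only).

    Tables are keyed by a tuple of node indices and map a tuple of levels
    to a marginal probability.
    """
    p = 1.0
    for nodes, table in clique_tables.items():
        p *= table[tuple(x[i] for i in nodes)]
    for nodes, table in separator_tables.items():
        p /= table[tuple(x[i] for i in nodes)]
    return p

# Toy chain 0 - 1 - 2 over binary variables: cliques (0, 1) and (1, 2),
# separator (1,). Both clique tables share the same marginal for node 1.
clique_tables = {
    (0, 1): {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3},
    (1, 2): {(0, 0): 0.3, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.3},
}
separator_tables = {(1,): {(0,): 0.6, (1,): 0.4}}

total = sum(
    decomposable_prob(x, clique_tables, separator_tables)
    for x in itertools.product(range(2), repeat=3)
)
```

Because the clique tables agree on the separator marginal, the probabilities over all eight configurations sum to one.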
- trilearn.distributions.discrete_dec_log_linear.read_local_hyper_consistent_parameters_from_json_file(filename)[source]
- trilearn.distributions.discrete_dec_log_linear.sample(table, n=1)[source]
This is not optimized: it should sample one clique at a time instead of one node at a time.
- trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_counts(graph, levels, constant_alpha)[source]
TODO
- trilearn.distributions.discrete_dec_log_linear.sample_hyper_consistent_parameters(graph, constant_alpha, levels)[source]
Graph intra-class
The graph intra-class distribution.
- trilearn.distributions.g_intra_class.cov_matrix(G, r, s2)[source]
Returns a covariance matrix cov such that the zeros in cov.I are determined by G.
- Parameters:
G (NetworkX graph) – A decomposable graph.
r (float) – Correlation.
s2 (float) – Variance.
- Returns:
A covariance matrix cov such that the zeros in its inverse are determined by G.
- Return type:
Numpy matrix
- trilearn.distributions.g_intra_class.sample(G, r, s2, n)[source]
Samples from the G-intra-class distribution [1].
- Parameters:
G (NetworkX graph) – a decomposable graph
r (float) – correlation
s2 (float) – variance
n (int) – number of samples
- Returns:
n samples from the G-intra-class distribution in a row matrix.
- Return type:
np.matrix
References
Graph inverse-Wishart
- trilearn.distributions.g_inv_wishart.sample(G, dof, scale)[source]
Sample from G-inverse Wishart distribution [2].
- Parameters:
G (networkx graph) – A decomposable graph.
scale (numpy matrix) – Scale parameter, a positive definite matrix.
dof (float) – Degrees of freedom, a positive real number.
- Returns:
A sample from the G-inverse Wishart distribution.
- Return type:
numpy matrix
References
[2] C. M. Carvalho, H. Massam, and M. West. Simulation of hyper-inverse Wishart distributions in graphical models. Biometrika, 94(3):647-659, 2007.
Gaussian graphical model
Gaussian graphical model.
- trilearn.distributions.gaussian_graphical_model.gaussian_marginal_log_likelihood(S, n, D, delta, cache={})[source]
Marginal log-likelihood of the data x under a zero-mean normal distribution, with the precision matrix marginalized out.
- Parameters:
S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built
- trilearn.distributions.gaussian_graphical_model.log_likelihood(graph, S, n, D, delta, cache={})[source]
- Parameters:
S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built
- trilearn.distributions.gaussian_graphical_model.log_likelihood_partial(S, n, D, delta, cliques, separators, cache={}, idmatrices=None)[source]
Partial log-likelihood of the given cliques and separators. If these are exactly the cliques and separators of a graph g, this is the marginal log-likelihood of g.
- Parameters:
S (Numpy matrix) – sum of squares matrix for the full distribution
D (Numpy matrix) – location matrix for the full distribution
delta (float) – scale parameter
n (int) – number of data samples on which S is built
cliques (list) – list of cliques, represented as frozensets
separators (dict) – dict with separators as keys and list of associated edges as values
cache (dict) – dict with separators or cliques as keys and partial log-likelihoods as values
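The way cliques, separators and the cache fit together can be sketched as below: clique terms are added, separator terms are subtracted once per associated junction tree edge, and each complete-set term is computed at most once. This is only a sketch of the standard decomposable factorization. The function `partial_log_likelihood` and the callable `complete_set_ll` are hypothetical stand-ins, not the library's internals.

```python
def partial_log_likelihood(cliques, separators, complete_set_ll, cache=None):
    """Sum of clique terms minus separator terms (illustrative sketch).

    cliques:         list of frozensets.
    separators:      dict mapping a separator (frozenset) to the list of
                     junction tree edges that carry it.
    complete_set_ll: hypothetical callable giving the marginal log-likelihood
                     of one complete set.
    cache:           dict keyed by clique or separator, reused across calls.
    """
    if cache is None:
        cache = {}

    def cached(node_set):
        # Compute each complete-set term once and reuse it afterwards.
        if node_set not in cache:
            cache[node_set] = complete_set_ll(node_set)
        return cache[node_set]

    total = sum(cached(c) for c in cliques)
    for sep, edges in separators.items():
        # A separator counts once for every junction tree edge it labels.
        total -= len(edges) * cached(sep)
    return total
```

Passing the same `cache` dict to successive calls avoids recomputing terms for cliques and separators that are shared between neighbouring graphs.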
Junction tree clustering
Matrix multivariate normal
Multivariate Student's t
Student's t-distribution.
Sequential junction tree distribution
Junction tree distributions suitable for SMC sampling.
- class trilearn.distributions.sequential_junction_tree_distributions.CondUniformGivenSizeJTDistribution(p, size)[source]
Bases:
SequentialJTDistribution
A sequential formulation of
\[P(T) = P(T, G) = P(T|G)P(G),\]
where
\[P(G) = \frac{1}{\text{#decomposable graphs}} \cdot I(\text{size of } G = k)\]
and
\[P(T|G) = \frac{1}{\text{#junction trees for } G}.\]
- log_ratio(old_cliques, old_separators, new_cliques, new_separators, old_JT, new_JT)[source]
Log-likelihood ratio of new_JT and old_JT.
- Parameters:
old_cliques ([type]) – [description]
old_separators ([type]) – [description]
new_cliques ([type]) – [description]
new_separators ([type]) – [description]
old_JT ([type]) – [description]
new_JT ([type]) – [description]
- Returns:
[description]
- Return type:
[type]
- class trilearn.distributions.sequential_junction_tree_distributions.CondUniformJTDistribution(p)[source]
Bases:
SequentialJTDistribution
A sequential formulation of
\[P(T) = P(T|G)P(G),\]
where
\[P(G) = \frac{1}{\text{#decomposable graphs}}\]
and
\[P(T|G) = \frac{1}{\text{#junction trees for } G}.\]
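On the log scale the factorization above is a sum of two negative log counts. The sketch below evaluates it for given counts; the function name is hypothetical, and how the library obtains or approximates the two counts is not shown here.

```python
import math

def log_prob_junction_tree(n_decomposable_graphs, n_junction_trees_for_G):
    """log P(T) = -log(#decomposable graphs) - log(#junction trees for G)."""
    return -math.log(n_decomposable_graphs) - math.log(n_junction_trees_for_G)

# On 3 nodes all 2**3 = 8 graphs are decomposable, and the path 0 - 1 - 2
# has cliques {0, 1} and {1, 2} joined by a single junction tree.
log_p = log_prob_junction_tree(8, 1)
```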
- class trilearn.distributions.sequential_junction_tree_distributions.GGMJTPosterior[source]
Bases:
SequentialJTDistribution
Posterior of a junction tree for a GGM.
- class trilearn.distributions.sequential_junction_tree_distributions.LogLinearJTPosterior[source]
Bases:
SequentialJTDistribution
Posterior for a log-linear model.
- init_model(X, cell_alpha, levels, cache_complete_set_prob={}, counts={})[source]
- Parameters:
cell_alpha – the constant number of pseudo counts for each cell in the full distribution.
- class trilearn.distributions.sequential_junction_tree_distributions.SequentialJTDistribution[source]
Bases:
object
Abstract class of junction tree distributions for SMC sampling.
- class trilearn.distributions.sequential_junction_tree_distributions.UniformJTDistribution(p)[source]
Bases:
SequentialJTDistribution
A sequential formulation of P(T) = P(T|G)P(G), where P(G) = 1/(#decomposable graphs) and P(T|G) = 1/(#junction trees for G).