Trilearn package
Subpackages
Auxiliary functions
-
trilearn.auxiliary_functions.get_marg_counts(full_data, subset) Returns a contingency table in dictionary form.
Parameters: - full_data (np.array) – The data in n x p form.
- subset (list) – The subset of interest
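As an illustration of what a contingency table in dictionary form can look like, here is a minimal sketch in plain Python. The helper name `marginal_counts` and the choice of tuples as dictionary keys are assumptions made for this example; the key format actually returned by `get_marg_counts` is not stated above.

```python
from collections import Counter


def marginal_counts(rows, subset):
    """Count how often each value combination of the `subset` columns occurs."""
    # Each key is the tuple of values a row takes on the selected columns
    # (hypothetical key format, chosen for this sketch only).
    return dict(Counter(tuple(row[j] for j in subset) for row in rows))


# Four observations of three binary variables (n = 4, p = 3).
data = [
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]
counts = marginal_counts(data, [0, 1])
print(counts)
```

The counts always sum to the number of rows, which is a quick sanity check when marginalising over a subset of variables.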
-
trilearn.auxiliary_functions.l1_loss(m1, m2) L1 loss between m1 and m2.
Parameters: - m1 (Numpy array) – A matrix
- m2 (Numpy array) – A matrix
Returns: float
-
trilearn.auxiliary_functions.l2_loss(m1, m2) L2 loss between m1 and m2.
Parameters: - m1 (Numpy array) – A matrix
- m2 (Numpy array) – A matrix
Returns: float
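The two losses can be sketched as follows. This is an assumption-laden illustration, not the trilearn source: it takes the L1 loss to be the sum of absolute element-wise differences and the L2 loss to be the sum of squared element-wise differences (whether the library applies a square root is not stated above). The function names are hypothetical.

```python
def l1_loss_sketch(m1, m2):
    """Sum of absolute element-wise differences between two equally shaped matrices."""
    return float(sum(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2)))


def l2_loss_sketch(m1, m2):
    """Sum of squared element-wise differences (no square root; an assumption)."""
    return float(sum((a - b) ** 2 for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2)))


m1 = [[1.0, 2.0], [3.0, 4.0]]
m2 = [[1.0, 0.0], [3.0, 5.0]]
print(l1_loss_sketch(m1, m2), l2_loss_sketch(m1, m2))
```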
-
trilearn.auxiliary_functions.plot_graph_traj_statistics(graph_traj, write_to_file=False)
-
trilearn.auxiliary_functions.plot_heatmap(heatmap, cbar=False, annot=False, xticklabels=1, yticklabels=1)
-
trilearn.auxiliary_functions.plot_matrix(m, filename, extension, title='Adjmat') Plots a 2-dim numpy array as a heatmap.
Parameters: m (Numpy array) – Matrix to plot.
-
trilearn.auxiliary_functions.plot_multiple_traj_statistics(trajs, burnin_end, write_to_file=False, annot=False, output_directory='./', file_extension='eps')
-
trilearn.auxiliary_functions.random_subset(A) Draws a random subset of the elements in a list; the empty set is a possible outcome.
Parameters: A (list) – The list to draw from.
Returns: Subset of A.
Return type: set
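One simple way to draw a random subset that may be empty is to keep each element independently with probability 1/2, which makes all subsets equally likely. The sketch below assumes this scheme; the sampling distribution used by `random_subset` itself is not documented above, and the name `random_subset_sketch` is hypothetical.

```python
import random


def random_subset_sketch(items, rng=random):
    """Keep each element independently with probability 1/2."""
    # Under this scheme every one of the 2**len(items) subsets,
    # including the empty set, has the same probability.
    return {x for x in items if rng.random() < 0.5}


rng = random.Random(0)
subset = random_subset_sketch([1, 2, 3, 4], rng)
print(subset)
```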
Graph predictive classification
A Bayesian graphical predictive classifier.
-
class trilearn.graph_predictive.GraphPredictive(n_particles=None, n_pgibbs_samples=None, prompt_burnin=False, only_map_graph=False, standard_bayes=False, cta_alpha=0.5, cta_beta=0.5, true_graphs=None, async=False)
Bases: sklearn.base.BaseEstimator
-
fit(x, y, hyper_mu=None, hyper_v=None, hyper_tau=None, hyper_alpha=None, same_graph_groups=None) The hyper parameters are set here in order to avoid a mismatch, since the hyper parameters in classification have to be consistent with those in the structure learning procedure.
Parameters: - x (Numpy matrix) – Matrix of training data
- y (Numpy array) – Array of class correspondence
- hyper_mu (Numpy array) – Array of mean hyper parameter for the normal inverse Wishart density
- hyper_v (float) – Parameter in the covariance matrix in the normal inverse Wishart density
- hyper_tau (Numpy matrix) – Precision matrix in the normal inverse Wishart density
- hyper_alpha (float) – Degrees of freedom in the normal inverse Wishart density
-
gen_gibbs_chains(n_particles, n_pgibbs_samples, smc_radius=None, cta_alpha=0.5, cta_beta=0.5, async=True) If same_graph is True, this generates a single Gibbs graph trajectory common to all classes. Otherwise, it generates one Gibbs graph trajectory for each class.
Parameters: smc_radius – Radius for the SMC algorithm.
-
gibbs_chains_from_json(gibbs_js) Reads a Gibbs trajectory in json format.
Parameters: gibbs_js (dict) – Gibbs trajectory in json format.
-
gibbs_chains_to_json(title, optional={}) Returns the Gibbs trajectory in json format.
Parameters: - title (string) – The json key for the attribute _id
- optional (dict) – Optional information in json format to be included in the json object.
-
predict_likelihoods(x_new) Model:
x_i | M=m, R=r ~ Normal(m, r)
R ~ Wishart(tau, alpha)
M | R=r ~ Normal(mu, r*v)
Args:
- x_new: row matrix with new observations for which the predictive density will be computed. Example with 2-dim x: [[0.4, 0.5], [0.2, 1.4]]
- x: row matrix with training data
- y: vector of classes
- mu: hyper parameter for M
- v: hyper parameter for M
- classes: unique class labels, e.g. [0, 1, 2]
- graph_distribution: dictionary with graph distribution for each class.
-
predict_proba(X) Estimate probability.
Parameters: X (array-like, shape (n_samples, n_features)) – Input data.
Returns: C – Estimated probabilities.
Return type: array, shape (n_samples, n_classes)
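A common way to turn per-class predictive densities into class probabilities is to combine them with class priors and normalise. The sketch below assumes that scheme with a uniform prior by default; how `predict_proba` actually combines the outputs of `predict_likelihoods` is not stated above. The function name and the example log-densities are hypothetical, and the computation is done in log space to avoid underflow.

```python
import math


def class_probabilities(log_likelihoods, log_priors=None):
    """Normalise per-class log predictive densities into probabilities."""
    k = len(log_likelihoods)
    if log_priors is None:
        # Assumption for this sketch: uniform prior over the classes.
        log_priors = [-math.log(k)] * k
    scores = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    # Subtract the maximum before exponentiating (log-sum-exp trick).
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log predictive densities of one observation under three classes.
probs = class_probabilities([-1050.0, -1052.0, -1049.0])
print(probs)
```

Exponentiating values such as -1050 directly would underflow to zero, which is why the maximum is subtracted first.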
-
predictive_pdf(x_new, x, mu, v, tau, alpha, graph_dist) This is the predictive distribution of x_new. It is a multivariate t-distribution where the graph is marginalized out according to graph_dist.
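Marginalizing the graph out amounts to a weighted average of the graph-conditional densities, with weights taken from the graph distribution. The sketch below shows only that averaging step. The univariate normal densities are stand-ins for the graph-conditional multivariate t-densities, and all names (`marginal_predictive`, `toy_pdf`, the graph labels) are hypothetical.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a univariate normal distribution (toy stand-in)."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def marginal_predictive(x_new, graph_dist, pdf_given_graph):
    """Average the graph-conditional densities, weighted by the graph probabilities."""
    return sum(prob * pdf_given_graph(x_new, graph) for graph, prob in graph_dist.items())


# Hypothetical distribution over two graphs and a toy density for each.
graph_dist = {"graph_a": 0.25, "graph_b": 0.75}
variances = {"graph_a": 1.0, "graph_b": 4.0}


def toy_pdf(x, graph):
    return normal_pdf(x, 0.0, variances[graph])


density = marginal_predictive(0.5, graph_dist, toy_pdf)
print(density)
```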
-
set_burnin(true_graphs=None, directory='.', title='') Sets the burn-in period for the class-groups.
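To illustrate what a burn-in period is used for, the following sketch discards the first samples of a graph trajectory and estimates a graph distribution from the remaining ones. It is a generic illustration under stated assumptions, not the trilearn implementation: graphs are represented as frozensets of edges, and the function name and arguments are hypothetical.

```python
from collections import Counter


def empirical_graph_distribution(trajectory, burnin_end):
    """Relative frequency of each graph among the samples after the burn-in."""
    kept = trajectory[burnin_end:]
    if not kept:
        raise ValueError("burn-in removes every sample")
    counts = Counter(kept)
    return {graph: count / float(len(kept)) for graph, count in counts.items()}


# Toy trajectory; each graph is a frozenset of edges (an assumption of this sketch).
g0 = frozenset()
g1 = frozenset({(0, 1)})
g2 = frozenset({(0, 1), (1, 2)})
trajectory = [g0, g1, g1, g2, g1]
dist = empirical_graph_distribution(trajectory, burnin_end=1)
print(dist)
```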
-
set_graph_dists(true_graphs=None, json_trajs=None, directory='.', title='', set_burnins=False)
-
set_hyper_parameters(hyper_mu, hyper_v, hyper_tau, hyper_alpha)
Parameters: - hyper_mu (Numpy array) – Array of mean hyper parameter for the normal inverse Wishart density
- hyper_v (float) – Parameter in the covariance matrix in the normal inverse Wishart density
- hyper_tau (Numpy matrix) – Precision matrix in the normal inverse Wishart density
- hyper_alpha (float) – Degrees of freedom in the normal inverse Wishart density
-
P. Green & A. Thomas MH-sampler
-
trilearn.mh_greenthomas.sample_trajectories_ggm_parallel(dataframe, n_samples, randomize=[1000], D=None, delta=1.0, reps=1, output_directory='.', **args)
-
trilearn.mh_greenthomas.sample_trajectories_ggm_to_file(dataframe, n_samples, randomize=[1000], D=None, delta=1.0, reps=1, output_directory='.', **args)
-
trilearn.mh_greenthomas.sample_trajectories_loglin_parallel(dataframe, n_samples, randomize=[1000], pseudo_obs=[1.0], reps=1, output_directory='.', **args)
-
trilearn.mh_greenthomas.sample_trajectories_loglin_to_file(dataframe, n_samples, randomize=[1000], pseudo_obs=[1.0], reps=1, output_directory='.', **args)
-
trilearn.mh_greenthomas.sample_trajectory_ggm(dataframe, n_samples, randomize=1000, D=None, delta=1.0, cache={}, **args)
-
trilearn.mh_greenthomas.sample_trajectory_loglin(dataframe, n_samples, pseudo_obs=1.0, randomize=1000, cache={}, **args)
-
trilearn.mh_greenthomas.sample_trajectory_uniform(n_samples, randomize=100, graph_size=5, cache={}, **args)
-
trilearn.mh_greenthomas.trajectory_to_file(n_samples, randomize, seqdist, dir='.', reseed=False) Writes the trajectory of graphs generated by particle Gibbs to file.
Parameters: - seqdist (SequentialJTDistributions) – the distribution to be sampled from
- filename_prefix (string) – prefix to the filename
Returns: Markov chain of underlying graphs of the junction trees sampled by pgibbs.
Return type:
-
trilearn.mh_greenthomas.trajectory_to_queue(n_samples, randomize, seqdist, queue, reseed=False) Writes the trajectory of graphs generated by particle Gibbs to a queue.
Parameters: - seqdist (SequentialJTDistributions) – the distribution to be sampled from
- filename_prefix (string) – prefix to the filename
Returns: Markov chain of underlying graphs of the junction trees sampled by pgibbs.
Return type:
Node-driven MH-sampler
Metropolis-Hastings sampler for junction tree distributions.
-
trilearn.mh_nodedriven.accept_proposal_prob(from_tree, reduced_tree, to_tree, moved_node, alpha, beta, seq_dist)
-
trilearn.mh_nodedriven.gen_ggm_trajectory(dataframe, n_samples, D=None, delta=1.0, cache={}, alpha=0.5, beta=0.5, **args)
-
trilearn.mh_nodedriven.log_prop_pdf(from_tree, reduced_tree, to_tree, moved_node, alpha, beta)
-
trilearn.mh_nodedriven.log_prop_ratio(from_tree, reduced_tree, to_tree, moved_node, alpha, beta)
-
trilearn.mh_nodedriven.mh(alpha, beta, traj_length, seq_dist, jt_traj=None, debug=False) A Metropolis-Hastings implementation for approximating distributions over junction trees.
Parameters: - traj_length (int) – Number of Gibbs iterations (samples)
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- seq_dist (SequentialJTDistributions) – the distribution to be sampled from
Returns: Markov chain of the underlying graphs of the junction trees sampled by M-H.
Return type:
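For readers unfamiliar with the accept/reject step that a Metropolis-Hastings sampler relies on, here is a generic sketch on a toy state space of five integers. It does not operate on junction trees and is not the trilearn implementation; the target weights, the random-walk proposal, and all function names are assumptions made for this example.

```python
import math
import random


def metropolis_hastings(log_target, propose, log_proposal_ratio, start, n_samples, rng):
    """Return a chain of n_samples states, starting at `start`."""
    chain = [start]
    current = start
    while len(chain) < n_samples:
        candidate = propose(current, rng)
        # log of target(candidate)/target(current) * q(current|candidate)/q(candidate|current)
        log_accept = (log_target(candidate) - log_target(current)
                      + log_proposal_ratio(current, candidate))
        if rng.random() < math.exp(min(0.0, log_accept)):
            current = candidate
        # A rejected proposal repeats the current state in the chain.
        chain.append(current)
    return chain


WEIGHTS = [1.0, 2.0, 3.0, 2.0, 1.0]  # unnormalised toy target on states 0..4


def log_target(state):
    if 0 <= state < len(WEIGHTS):
        return math.log(WEIGHTS[state])
    return float("-inf")  # states outside the support are always rejected


def propose(state, rng):
    return state + rng.choice([-1, 1])  # symmetric random walk


def log_proposal_ratio(current, candidate):
    return 0.0  # symmetric proposal, so the ratio is 1


chain = metropolis_hastings(log_target, propose, log_proposal_ratio,
                            start=2, n_samples=5000, rng=random.Random(1))
print(len(chain), chain.count(2) / float(len(chain)))
```

With a non-symmetric proposal, such as one that moves a node between cliques, the proposal ratio no longer cancels, which is presumably why the module exposes `log_prop_ratio` separately.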
Particle Gibbs
-
trilearn.pgibbs.sample_trajectories_ggm(dataframe, n_particles, n_samples, D=None, delta=1.0, alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, **args)
-
trilearn.pgibbs.sample_trajectories_ggm_parallel(dataframe, n_particles, n_samples, D=None, delta=1.0, alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, **args)
-
trilearn.pgibbs.sample_trajectories_ggm_to_file(dataframe, n_particles, n_samples, D=None, delta=1.0, alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, output_directory='.', output_filename='trajectory.json', **args)
-
trilearn.pgibbs.sample_trajectories_loglin(dataframe, n_particles, n_samples, pseudo_observations=[1.0], alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, **args)
-
trilearn.pgibbs.sample_trajectories_loglin_parallel(dataframe, n_particles, n_samples, pseudo_observations=[1.0], alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, output_directory='.', **args)
-
trilearn.pgibbs.sample_trajectories_loglin_to_file(dataframe, n_particles, n_samples, pseudo_observations=[1.0], alphas=[0.5], betas=[0.5], radii=[None], reset_cache=True, reps=1, output_directory='.', output_filename='trajectory.json', **args)
-
trilearn.pgibbs.sample_trajectory(smc_N, alpha, beta, radius, n_samples, seq_dist, jt_traj=None, debug=False, reset_cache=True) A particle Gibbs implementation for approximating distributions over junction trees.
Parameters: - smc_N (int) – Number of particles in SMC in each Gibbs iteration
- n_samples (int) – Number of Gibbs iterations (samples)
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- radius (float) – defines the radius within which new nodes are selected
- seq_dist (SequentialJTDistributions) – the distribution to be sampled from
Returns: Markov chain of the underlying graphs of the junction trees sampled by pgibbs.
Return type:
-
trilearn.pgibbs.sample_trajectory_ggm(dataframe, n_particles, n_samples, D=None, delta=1.0, alpha=0.5, beta=0.5, radius=None, reset_cache=True, **args) Particle Gibbs for approximating distributions over Gaussian graphical models.
Parameters: - n_particles (int) – Number of particles in SMC in each Gibbs iteration
- n_samples (int) – Number of Gibbs iterations (samples)
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- radius (float) – defines the radius within which new nodes are selected
- dataframe (np.matrix) – row matrix of data
- D (np.matrix) – matrix parameter for the hyper inverse Wishart prior
- delta (float) – degrees of freedom for the hyper inverse Wishart prior
- cache (dict) – cache for clique likelihoods
Returns: Markov chain of the underlying graphs of the junction trees sampled by pgibbs.
Return type:
-
trilearn.pgibbs.sample_trajectory_loglin(dataframe, n_particles, n_samples, pseudo_obs=1.0, alpha=0.5, beta=0.5, radius=None, reset_cache=True, **args)
-
trilearn.pgibbs.trajectory_to_file(n_particles, n_samples, alpha, beta, radius, seqdist, node_labels, reset_cache=True, dir='.', output_filename='trajectory.csv', reseed=False) Writes the trajectory of graphs generated by particle Gibbs to file.
Parameters: - n_particles (int) – Number of particles in SMC in each Gibbs iteration
- n_samples (int) – Number of Gibbs iterations (samples)
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- radius (float) – defines the radius within which new nodes are selected
- seqdist (SequentialJTDistributions) – the distribution to be sampled from
- filename_prefix (string) – prefix to the filename
Returns: Markov chain of underlying graphs of the junction trees sampled by pgibbs.
Return type:
-
trilearn.pgibbs.trajectory_to_queue(n_particles, n_samples, alpha, beta, radius, seqdist, queue, reset_cache=True, reseed=False) Writes the trajectory of graphs generated by particle Gibbs to a queue.
Parameters: - n_particles (int) – Number of particles in SMC in each Gibbs iteration
- n_samples (int) – Number of Gibbs iterations (samples)
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- radius (float) – defines the radius within which new nodes are selected
- seqdist (SequentialJTDistributions) – the distribution to be sampled from
- filename_prefix (string) – prefix to the filename
Returns: Markov chain of underlying graphs of the junction trees sampled by pgibbs.
Return type:
Stochastic set process
-
trilearn.set_process.backward_order_neigh_log_prob(from_order, to_order, radius, maxradius) Probability of generating order from_order from the larger order to_order under the restriction that no hole greater than radius is created.
-
trilearn.set_process.backward_order_neigh_set(from_order, radius, maxradius) Returns the list of nodes that can be removed from from_order (the greater order).
-
trilearn.set_process.backward_perm_traj_sample(p, radius) Samples a permutation trajectory with maximum p indices.
-
trilearn.set_process.gen_backward_order_neigh(from_order, radius, maxradius) Returns: A permutation with one less element than from_order
-
trilearn.set_process.gen_order_neigh(from_order, radius, total_set) Returns a list with one more element than from_order such that the new element is within the radius and belongs to total_set.
Parameters: - from_order (list) – list of elements
- radius (int) – specifies the radius within which the new element can be taken
- total_set (list) – the full set of elements
Returns: numpy array
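The idea of extending an order by one element within a radius can be sketched as below. The sketch rests on assumptions that are not confirmed by the text above: elements are integers, "within the radius" means an absolute difference of at most `radius` from some element already in the order, and the new element is chosen uniformly among the candidates. It returns a plain list rather than a numpy array, and the name `extend_order` is hypothetical.

```python
import random


def extend_order(from_order, radius, total_set, rng):
    """Return from_order plus one new element of total_set within `radius`."""
    if not from_order:
        candidates = list(total_set)
    else:
        # Assumed notion of distance: absolute difference between integer labels.
        candidates = [x for x in total_set
                      if x not in from_order
                      and any(abs(x - y) <= radius for y in from_order)]
    if not candidates:
        raise ValueError("no element of total_set lies within the radius")
    return list(from_order) + [rng.choice(candidates)]


from_order = [3, 4]
new_order = extend_order(from_order, radius=2, total_set=range(10), rng=random.Random(0))
print(new_order)
```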
SMC
Sequential Monte Carlo sampler for junction tree distributions.
-
trilearn.smc.approximate(N, alpha, beta, radius, seq_dist, debug=False, neig_set_cache={}) Sequential Monte Carlo for junction trees using the Christmas tree algorithm as proposal kernel.
Parameters: - N (int) – Number of particles
- alpha (float) – sparsity parameter for the Christmas tree algorithm
- beta (float) – sparsity parameter for the Christmas tree algorithm
- radius (float) – defines the radius within which new nodes are selected
- seq_dist (SequentialJTDistributions) – the distribution to be sampled from
Returns: (new_trees, log_w)
References:
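Since the sampler returns particles together with log weights, a typical next step in SMC is to normalise the weights and resample. The following is a generic sketch of that step, not taken from trilearn: the particles are placeholder strings, the log weights are made-up numbers, and multinomial resampling is an assumed choice of resampling scheme.

```python
import math
import random


def normalise_log_weights(log_w):
    """Convert unnormalised log weights into probabilities that sum to 1."""
    top = max(log_w)
    # Subtracting the maximum avoids underflow when exponentiating.
    weights = [math.exp(lw - top) for lw in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def resample(particles, log_w, rng):
    """Multinomial resampling: draw len(particles) particles with replacement."""
    weights = normalise_log_weights(log_w)
    return rng.choices(particles, weights=weights, k=len(particles))


particles = ["tree_a", "tree_b", "tree_c"]  # placeholders for sampled junction trees
log_w = [-1000.0, -1001.0, -1002.0]         # hypothetical log weights
weights = normalise_log_weights(log_w)
resampled = resample(particles, log_w, random.Random(0))
print(weights, resampled)
```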
-
trilearn.smc.approximate_cond(N, alpha, beta, radius, seq_dist, T_cond, perm_cond, debug=False, neig_set_cache={}) SMC on junction trees conditioned on the trajectories T_cond and perm_cond.
-
trilearn.smc.est_dec_max_clique_size(order, n_particles, alpha=0.5, beta=0.5, n_smc_estimates=1, debug=False)
-
trilearn.smc.est_log_norm_consts(order, n_particles, sequential_distribution, alpha=0.5, beta=0.5, n_smc_estimates=1, debug=False)
-
trilearn.smc.est_n_dec_graphs(order, n_particles, alpha=0.5, beta=0.5, n_smc_estimates=1, debug=False)
-
trilearn.smc.get_smc_trajs(Is) This method is made for visualizing the collapsing in SMC.