training {BPHO}    R Documentation

Functions related to Markov chain sampling

Description

The models are trained with Markov chain Monte Carlo (MCMC) methods. Slice sampling is used to update `beta's, the regression coefficients for groups, and `log(sigma)', where `sigma' is the width parameter of the prior for `beta'.
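To make the slice sampling step concrete, here is a minimal sketch of a univariate slice update with stepping-out and shrinkage, written in base R. It is not the code used by BPHO: the function slice_update and the toy Gaussian target are illustrative assumptions, and w and m only play the roles that the arguments w_bt, m_bt (and w_sgm, m_sgm) are described as playing under Arguments.

```r
# Illustrative sketch only, not BPHO's implementation.
# log_f: log of an unnormalised density; x0: current value;
# w: width of each stepping-out step; m: limit on the number of steps.
slice_update <- function(log_f, x0, w = 1, m = 50) {
  # Draw the (log) slice level below the density at the current value.
  log_y <- log_f(x0) - rexp(1)

  # Place an interval of width w at random around x0.
  left  <- x0 - w * runif(1)
  right <- left + w

  # Split the stepping-out budget at random between the two sides.
  j <- floor(m * runif(1))
  k <- (m - 1) - j
  while (j > 0 && log_f(left) > log_y) {
    left <- left - w
    j <- j - 1
  }
  while (k > 0 && log_f(right) > log_y) {
    right <- right + w
    k <- k - 1
  }

  # Sample from the interval, shrinking it after each rejected point.
  repeat {
    x1 <- runif(1, left, right)
    if (log_f(x1) > log_y) return(x1)
    if (x1 < x0) left <- x1 else right <- x1
  }
}

# Toy target (an assumption for illustration): standard Gaussian.
set.seed(1)
log_f <- function(x) -0.5 * x^2
draws <- numeric(2000)
x <- 0
for (i in seq_along(draws)) {
  x <- slice_update(log_f, x, w = 1, m = 50)
  draws[i] <- x
}
```

In this sketch a larger w reduces the number of stepping-out steps needed for a wide target, while m caps the interval at m times w.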

The function training carries out the Markov chain sampling, saving the Markov chain samples in a binary file mc_file.

The function display_mc displays the summary information in the file mc_file.

The function read_mc reads the Markov chain samples from the file mc_file at given iterations.
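The way iter_b, forward and n pick out iterations can be sketched in base R. The helper select_iters below is hypothetical and not part of BPHO. It assumes that iterations are numbered from 0 up to some last_iter stored in the file, and that an empty n (the default n=c() shown under Usage) means "no limit"; neither assumption is confirmed by this page.

```r
# Hypothetical helper, not part of BPHO: which iterations a request for
# (iter_b, forward, n) would select from a chain holding 0..last_iter.
select_iters <- function(iter_b, forward, n = c(), last_iter) {
  if (iter_b > last_iter) return(numeric(0))
  iters <- seq(from = iter_b, to = last_iter, by = forward)
  # Assumption: an empty n means "take everything available".
  if (length(n) == 0) return(iters)
  # Keep at most n iterations; fewer if the chain runs out first.
  head(iters, n)
}

select_iters(iter_b = 100, forward = 2, n = 5, last_iter = 1000)
# 100 102 104 106 108
select_iters(iter_b = 990, forward = 5, n = 10, last_iter = 1000)
# 990 995 1000
```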

The function read_betas is based on the function read_mc. It specifically reads the `beta' for given group and class identities.
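Given the storage layout described for group 'betas' under Arguments (the no_cls values for one pattern group followed by those for the next, with the smallest index 0), a group and class pair would map to a single `beta' index roughly as follows. The helper beta_index is hypothetical and not part of BPHO; the formula is one natural reading of that layout rather than a confirmed description of what read_betas does internally.

```r
# Hypothetical helper, not part of BPHO: position of a `beta' within the
# 'betas' group, for a 0-based pattern group ix_g and a 1-based class ix_cls.
beta_index <- function(ix_g, ix_cls, no_cls) {
  stopifnot(ix_g >= 0, ix_cls >= 1, ix_cls <= no_cls)
  ix_g * no_cls + (ix_cls - 1)
}

beta_index(ix_g = 0, ix_cls = 1, no_cls = 3)  # 0, the first stored `beta'
beta_index(ix_g = 2, ix_cls = 3, no_cls = 3)  # 8
```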

The function display_a_beta displays the pattern information for the group associated with the `beta' specified by id_beta, and also returns the full Markov chain samples of this `beta'.

The function calc_medians_betas returns the medians of the Markov chain samples for all `beta's at specified iterations. This function is intended for discovering important interaction patterns: a pattern whose `beta's have medians of large absolute value is likely to be important for predicting the response.
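The idea of ranking `beta's by the absolute value of their posterior medians can be sketched in base R on simulated data. Everything here is an assumption made for illustration: the samples matrix stands in for samples read from mc_file, its orientation (one row per `beta') is a choice of this sketch, and the simulated centres are arbitrary.

```r
# Simulated stand-in for Markov chain samples: one row per `beta',
# one column per retained iteration. Real samples would come from mc_file.
set.seed(42)
n_betas <- 6
n_iters <- 500
centres <- c(0, 0, 2.5, 0, -1.8, 0)  # arbitrary values for illustration
samples <- matrix(rnorm(n_betas * n_iters, mean = centres, sd = 0.5),
                  nrow = n_betas)

# Median of each `beta' over the retained iterations.
medians <- apply(samples, 1, median)

# 0-based indices of the `beta's, largest absolute median first.
ranked_ids <- order(abs(medians), decreasing = TRUE) - 1
head(ranked_ids, 2)
```

In this simulation the two `beta's with non-zero centres come out on top, which is the kind of signal the medians are meant to expose.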

Usage

display_mc(mc_file)
read_mc(mc_file,group,ix, iter_b=0,forward=1,n=c(),quiet=1)
read_betas(mc_file,ix_g,ix_cls,iter_b=0,forward=1,n=c(),quiet=1)
display_a_beta(id_beta,mc_file,ptn_file)
calc_medians_betas(mc_file,iter_b=0,forward=1,n=c())
training(mc_file,ptn_file, train_y,no_cls,
         alpha,log_sigma_widths,
         log_sigma_modes,ini_log_sigmas,
         iters_mc,iters_bt,iters_sgm,
         w_bt,w_sgm,m_bt,m_sgm)

Arguments

mc_file A character string, the name of the binary file to which the Markov chain samples are written.
group A character string giving the group name of values.
It can be one of 'lprobs', 'lsigmas', 'betas', 'evals'.
Group 'lprobs' contains: the log probability of the data given the values of the `beta's (identified by ix=0), the log prior of the `beta's given the `sigma's (identified by ix=1), the log prior of the `log(sigma)'s (identified by ix=2), and the log posterior (identified by ix=3), which is the sum of the previous three values.
Group 'lsigmas' contains: the values of the hyperparameters `log(sigma)', with ix indicating the order, starting from 0.
Group 'betas' contains: the values of the `beta's, with ix indicating the index of the `beta'. Within each iteration, the `beta's are stored so that the no_cls values for pattern group `i' are followed by the no_cls values for pattern group `i+1'. The smallest index is 0.
Group 'evals' contains: the average number of evaluations of the posterior distribution when updating each `beta' with slice sampling (identified by ix=0), and the average rejection rate when updating each `log(sigma)' with Metropolis sampling (identified by ix=1).
ix index of parameters inside each group, as discussed for group above.
ix_g index of pattern group, starting from 0.
ix_cls index of class, ranging from 1 to no_cls.
id_beta index of `beta', starting from 0.
iter_b, forward, n Starting from iteration iter_b, one of every forward Markov chain samples is read. The total number of samples read is at most n, and no more than the number available in the file mc_file.
train_y Discrete response of the training data. Assumed to be coded as 1, 2, ..., no_cls.
no_cls the number of possibilities (classes) of the response; defaults to the maximum value in train_y.
alpha alpha=1 indicates that a Cauchy prior is used, alpha=2 indicates that a Gaussian prior is used.
log_sigma_widths, log_sigma_modes two vectors of length order+1, which are interpreted as follows: the Gaussian distribution with location log_sigma_modes[o] and standard deviation log_sigma_widths[o] is the prior for `log(sigmas[o])', which is the hyperparameter (width parameter of Gaussian distribution or Cauchy distribution) for the regression coefficients (i.e. `beta's) associated with the interactions of order `o'.
ptn_file a character string, the name of the binary file where the compression result is saved. The method of writing to and reading from ptn_file can be found from the documentation for compression.
iters_mc,iters_bt,iters_sgm iters_mc super-transitions are run. Each super-transition consists of iters_bt iterations of updating the `beta's, and for each update of the `beta's the hyperparameters `log(sigma)' are updated iters_sgm times. When iters_mc=0, no Markov chain sampling is run and the other arguments related to Markov chain sampling have no effect.
w_bt,w_sgm, m_bt,m_sgm w_bt is the stepping-out width used when updating `beta' with slice sampling, and m_bt is the maximum number of stepping-out steps. w_sgm and m_sgm are interpreted similarly for sampling `log(sigma)'.
ini_log_sigmas Initial values of `log(sigma)'; defaults to log_sigma_modes.
quiet quiet=1 suppresses the messages printed while reading the file mc_file.
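
The nesting of iters_mc, iters_bt and iters_sgm described above can be made explicit with a short base R sketch that only counts updates and performs no sampling. The function count_updates is hypothetical and not part of BPHO; it follows the wording of the argument description and makes no claim about how often samples are written to mc_file.

```r
# Hypothetical helper, not part of BPHO: counts the updates implied by
# the nested schedule, without doing any sampling.
count_updates <- function(iters_mc, iters_bt, iters_sgm) {
  n_beta <- 0
  n_sigma <- 0
  for (i in seq_len(iters_mc)) {        # super-transitions
    for (j in seq_len(iters_bt)) {      # updates of the `beta's
      n_beta <- n_beta + 1
      for (k in seq_len(iters_sgm)) {   # `log(sigma)' updates per `beta' update
        n_sigma <- n_sigma + 1
      }
    }
  }
  c(beta_updates = n_beta, sigma_updates = n_sigma)
}

count_updates(iters_mc = 100, iters_bt = 10, iters_sgm = 50)
# beta_updates = 1000, sigma_updates = 50000
```

Because the counts multiply, raising iters_sgm is the most expensive of the three in terms of hyperparameter updates, and iters_mc=0 yields no updates at all.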

Value

The function display_mc returns a vector with the names #iters, #class, #groups, order, alpha.
The function read_mc returns the Markov chain samples for a variable at specified iterations.
The function read_betas returns the Markov chain samples for a `beta' at specified iterations.
The function display_a_beta displays the pattern group information for the group associated with the queried `beta', and also returns the Markov chain samples of this `beta'. The method of reading the on-screen messages about a pattern group is documented in compression.
The function calc_medians_betas returns the medians of Markov chain samples of all `beta's at given iterations.
The function training returns no value. Instead, the Markov chain samples are written to the binary file mc_file.

See Also

comp_train_pred,compression,prediction

Examples

## examples are given in comp_train_pred.

[Package BPHO version 1.2-5 Index]