snowFT-cluster {snowFT}    R Documentation

Cluster-Level Functions with Fault Tolerant Features

Description

Functions extending the collection of cluster-level functions of the snow package providing fault tolerance, reproducibility and additional management features. The heart of the package is the function performParallel.

Usage

clusterApplyFT(cl, x, fun, initfun, exitfun, printfun, printargs, printrepl,
               gentype, seed, prngkind, para, mngtfiles, ft_verbose, ...)

performParallel(count, x, fun, initfun, exitfun, printfun, printargs,
                printrepl, cltype, gentype, seed, prngkind, para, mngtfiles,
                ft_verbose, ...)

clusterCallpart(cl, nodes, fun, ...)
clusterEvalQpart(cl, nodes, expr)

Arguments

cl Cluster object.
count Number of cluster nodes.
fun Function or character string naming a function.
x Array whose length determines how many times fun is to be called. x[i] is passed to fun (as its first argument) at the i-th call.
initfun Function or character string naming a function with no arguments that is to be called on each node prior to the computation. Default: NULL.
exitfun Function or character string naming a function with no arguments that is to be called on each node after the computation is completed. Default: NULL.
printfun, printargs, printrepl printfun is a function or character string naming a function that is to be called on the master node after each printrepl completed replicates, and thus it can be used for accessing intermediate results. Arguments passed to printfun are a list (of length |x|) of results (including the non-finished ones), the number of finished results, and printargs. Defaults: printfun=printargs=NULL, printrepl=max(length(x)/10,1).
cltype Character string that specifies cluster type (see makeClusterFT). Default: getClusterOption("type").
gentype Character string that specifies the type of RNG to use. Possible values: "RNGstream" (L'Ecuyer's RNG; default for performParallel), "SPRNG", or "None" (default for clusterApplyFT). See clusterSetupRNG.FT. If gentype="None", no RNG action is taken.
seed, prngkind, para Seed, kind and parameters for the RNG (see clusterSetupRNG.FT). Defaults: seed=rep(123456,6), prngkind="default", para=0.
mngtfiles A character vector of length 3 containing names of management files: mngtfiles[1] for managing the cluster size, mngtfiles[2] for storing the replicates being currently computed, mngtfiles[3] for storing the failed replicates. If any of these files equals an empty string, the corresponding management actions are not performed. If the files already exist, their content is overwritten. Default: c(".clustersize", ".proc", ".proc_fail").
ft_verbose If TRUE, debugging messages are sent to standard output.
expr Expression to evaluate.
nodes Indices of cluster nodes.
... Additional arguments to pass to function fun.
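
To make the printfun contract described above more concrete, here is a minimal base R sketch that can be run without a cluster. The names report_progress, partial and default_printrepl are hypothetical, and the sketch assumes that unfinished entries of the results list are NULL; it only mimics how the master would call printfun and is not the package's own code.

```r
# Hypothetical progress reporter following the documented argument order:
# (list of results, number of finished results, printargs).
report_progress <- function(results, n_finished, printargs = NULL) {
  # Assumption: replicates that have not finished yet are NULL entries.
  done <- Filter(Negate(is.null), results)
  values <- unlist(done)
  label <- if (is.null(printargs)) "progress" else printargs$label
  msg <- sprintf("%s: %d of %d replicates finished, running mean = %.3f",
                 label, n_finished, length(results), mean(values))
  print(msg)
  invisible(msg)
}

# Simulated partial results list of length |x| = 5 with two replicates pending.
partial <- list(c(1, 2), NULL, c(3, 4), NULL, c(5, 6))
msg <- report_progress(partial, n_finished = 3,
                       printargs = list(label = "demo"))

# The documented default reporting interval, printrepl = max(length(x)/10, 1).
default_printrepl <- function(x) max(length(x) / 10, 1)
```

A function of this shape could be passed as printfun, with printargs carrying whatever extra information the reporter needs.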

Details

clusterApplyFT is a fault-tolerant version of clusterApplyLB from the snow package with additional features, such as reproducible results, computation transparency and dynamic cluster resizing. While waiting for results, the master process checks for failed nodes. If failures are detected, the cluster is repaired. All failed computations are restarted (in up to three additional runs) after the replication loop is finished, so the user should not notice any interruptions.

The file mngtfiles[1] is written by the master prior to the computation and contains a single integer value corresponding to the number of cluster nodes. The user can then change this value at any time, as long as the file keeps the same format. The master reads the file while it waits for results. If the value in the file is larger than the current cluster size, new nodes are created and the computation is extended to them. If the value is smaller, nodes are discarded one by one after they finish their current computation. The arguments initfun and exitfun in clusterApplyFT are used only if the cluster changes, i.e. if nodes are added to or removed from the cluster.
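
The resizing protocol described above can be sketched in base R as follows. This is only an illustration under stated assumptions: a temporary file stands in for mngtfiles[1], the file is assumed to hold the integer on a single line, and the names size_file, current, requested and action are hypothetical. The package's internal logic is not reproduced here.

```r
# A temporary file stands in for mngtfiles[1] so that no real file is touched.
size_file <- tempfile("clustersize")

# The master is described as writing the initial number of nodes, here 5.
current <- 5L
writeLines(as.character(current), size_file)

# The user edits the file to request a larger cluster.
writeLines("8", size_file)

# While waiting for results, the master would read the requested size
# and compare it with the current one.
requested <- as.integer(readLines(size_file, n = 1))
action <- if (requested > current) {
  "add nodes"
} else if (requested < current) {
  "remove nodes after they finish"
} else {
  "no change"
}

unlink(size_file)
```

In practice the user would edit the real file (by default .clustersize) from a shell or a second R session while the computation is running.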

The RNG uses the scheme 'one stream per replicate', in contrast to the 'one stream per node' scheme used by clusterApplyLB. With each replicate, the RNG is reset to the corresponding stream (identified by the replicate number). The final results are therefore reproducible, regardless of which node computes which replicate.
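
The effect of the 'one stream per replicate' scheme can be illustrated with a small base R sketch. It is a stand-in only: set.seed with a replicate-dependent seed replaces the RNGstream and SPRNG generators, and simple seed offsets do not provide the statistical guarantees of true parallel streams. The names run_replicate, base_seed, order_a and order_b are hypothetical.

```r
# Stand-in for "reset the RNG to the stream of replicate i":
# each replicate derives its RNG state from its own index only.
run_replicate <- function(i, n, base_seed = 1000L) {
  set.seed(base_seed + i)
  rnorm(n)
}

# Two different processing orders, as could arise from load balancing
# or from restarting failed replicates.
order_a <- 1:6
order_b <- c(4, 1, 6, 2, 5, 3)

res_a <- lapply(order_a, run_replicate, n = 3)
res_b <- lapply(order_b, run_replicate, n = 3)

# Put the second run back into replicate order before comparing.
res_b <- res_b[order(order_b)]

same <- identical(res_a, res_b)
```

Because the random numbers depend only on the replicate index, the two runs agree once the results are put back in replicate order.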

performParallel is a wrapper function for clusterApplyFT, and we recommend using it rather than calling clusterApplyFT directly. It creates a cluster of count nodes, calls initfun on all nodes and initializes the RNG. Then it calls clusterApplyFT. After the computation is finished, it calls exitfun on all nodes and stops the cluster.

clusterCallpart calls a function fun with identical arguments ... on nodes specified by indices nodes in the cluster cl and returns a list of the results.

clusterEvalQpart evaluates a literal expression on nodes specified by indices nodes.

Value

clusterApplyFT returns a list of two elements. The first one is a list (of length |x|) of results, the second one is the (possibly updated) cluster object.
performParallel returns a list of results.

Author(s)

Hana Sevcikova

Examples

  ## Not run: 
# Generates n normally distributed random numbers in each of r replicates
# on p nodes and prints their mean after every r/10 replicates.

printfun <- function(res, n, args=NULL) {
  # unlist() drops NULL entries, so replicates without a result are skipped
  res <- unlist(res)
  print(paste("mean after:", n, "replicates:", mean(res),
              "(from", length(res), "RNs)"))
}

r<-1000; n<-100; p<-5
res <- performParallel(p, rep(n,r), fun=rnorm,
  gentype="RNGstream", seed=rep(1,6), printfun=printfun)
## End(Not run)

[Package snowFT version 0.0-2 Index]