snowFT-cluster {snowFT} | R Documentation |
Functions extending the collection of cluster-level functions of the
snow package providing fault tolerance, reproducibility and additional
management features. The heart of the package is the function
performParallel.
clusterApplyFT(cl, x, fun, initfun, exitfun, printfun, printargs,
               printrepl, gentype, seed, prngkind, para, mngtfiles,
               ft_verbose, ...)
performParallel(count, x, fun, initfun, exitfun, printfun, printargs,
                printrepl, cltype, gentype, seed, prngkind, para,
                mngtfiles, ft_verbose, ...)
clusterCallpart(cl, nodes, fun, ...)
clusterEvalQpart(cl, nodes, expr)
cl |
Cluster object. |
count |
Number of cluster nodes. |
fun |
Function or character string naming a function. |
x |
Array whose length determines how many times fun is to
be called. x[i] is passed to fun (as its first argument)
at the i-th call. |
initfun |
Function or character string naming a
function with no
arguments that is to
be called on each node prior to the computation. Default: NULL . |
exitfun |
Function or character string naming a function with no
arguments that is to
be called on each node after the computation is completed. Default: NULL . |
printfun, printargs, printrepl |
printfun is a function or
character string naming a function that is called on the master
node after every printrepl completed replicates, so it can be used to
access intermediate results. Three arguments are passed to
printfun: a list (of length length(x)) of results, including
the unfinished ones; the number of finished results;
and printargs . Defaults: printfun=printargs=NULL,
printrepl=max(length(x)/10,1) . |
cltype |
Character string that specifies cluster type (see
makeClusterFT ). Default: getClusterOption("type") . |
gentype |
Character string that specifies the type of RNG to
use. Possible values: "RNGstream" (default for performParallel ) - L'Ecuyer's RNG,
"SPRNG", or "None" (default for clusterApplyFT ). See
clusterSetupRNG.FT . If
gentype="None" , no RNG action is taken. |
seed, prngkind, para |
Seed, kind and parameters for the RNG (see
clusterSetupRNG.FT ). Defaults:
seed=rep(123456,6), prngkind="default", para=0 . |
mngtfiles |
A character vector of length 3 containing names of
management files: mngtfiles[1] for managing the
cluster size, mngtfiles[2] for storing the replicates
being currently computed, mngtfiles[3] for storing the failed
replicates. If any of these file names is an empty string, the
corresponding management actions are not performed. If the files
already exist, their content
is overwritten. Default:
c(".clustersize", ".proc", ".proc_fail") . |
ft_verbose |
If TRUE, debugging messages are sent to standard output. |
expr |
Expression to evaluate. |
nodes |
Indices of cluster nodes. |
... |
Additional arguments to pass to function fun . |
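The printfun calling convention described above can be tried by hand before starting a cluster. The following is a minimal sketch in base R: the name progress_report and the hand-built partial list are hypothetical, and it assumes that unfinished replicates appear as NULL entries in the results list, which the argument description does not state explicitly.

```r
# Hypothetical printfun: receives the results list, the number of finished
# results, and printargs (unused here).
progress_report <- function(res, n, args = NULL) {
  # Assumption: unfinished replicates are NULL entries; drop them.
  finished <- Filter(Negate(is.null), res)
  values <- unlist(finished)
  msg <- sprintf("%d of %d replicates done, running mean = %.3f",
                 n, length(res), mean(values))
  message(msg)
  invisible(msg)
}

# Default reporting interval as given in the argument description.
default_printrepl <- function(x) max(length(x) / 10, 1)

# Simulate the state after 2 of 4 replicates have finished.
partial <- vector("list", 4)
partial[[1]] <- c(1, 2)
partial[[3]] <- 3
out <- progress_report(partial, 2)
interval <- default_printrepl(1:1000)
```

A function of this shape could then be supplied as printfun=progress_report to performParallel or clusterApplyFT.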
clusterApplyFT is a fault-tolerant version of clusterApplyLB
of the snow package with additional features, such as reproducible
results, computation transparency and dynamic cluster
resizing. While waiting for results, the master process checks for
failed nodes. If failures are detected, the cluster is
repaired. All failed computations are restarted (in three additional
runs) after the replication
loop is finished, so the user should not notice any
interruptions.
The file mngtfiles[1] is initially written by the master
prior to the computation and contains a single integer value corresponding
to the number of cluster nodes. The user can then change the value
arbitrarily (keeping the same format). The master reads the
file while it waits for results. If the value in this file is larger than
the current
cluster size, new nodes are created and the computation is extended to
them. If, on the other hand, the value is smaller, nodes are
successively discarded after they finish their current
computation.
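The resizing mechanism described above can be driven from a second R session while the computation runs. The sketch below uses base R only. The helper names are hypothetical, and it assumes that a single integer on one line is an acceptable file format and that the helpers are pointed at the file given in mngtfiles[1] (".clustersize" by default) in the master's working directory.

```r
# Hypothetical helpers; only the default file name and the single-integer
# content come from the documentation.
request_cluster_size <- function(n, file = ".clustersize") {
  stopifnot(length(n) == 1, n >= 1, n == round(n))
  writeLines(as.character(as.integer(n)), file)
  invisible(file)
}

read_cluster_size <- function(file = ".clustersize") {
  as.integer(readLines(file, n = 1))
}

# Demonstrated on a temporary file so that no running computation is affected.
f <- tempfile()
request_cluster_size(8, f)
current_size <- read_cluster_size(f)
```

In a real run, request_cluster_size(8) would ask the master to grow or shrink the cluster to 8 nodes the next time it reads the file.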
The arguments initfun and exitfun of clusterApplyFT
are used only when the cluster changes, i.e. when new nodes are added
or nodes are removed from the cluster.
The RNG uses
the scheme 'one stream per replicate', in contrast to 'one stream per
node' used by clusterApplyLB. Therefore with each replicate, the
RNG is reset to the corresponding stream (identified by the replicate
number). Thus, the final results are reproducible.
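The 'one stream per replicate' idea can be illustrated without a cluster. The sketch below is not snowFT's implementation: it uses nextRNGStream from base R's parallel package as a stand-in for the package's RNG setup, and run_replicate is a hypothetical helper. It shows why the result of a replicate depends only on the replicate number and not on the order (or the node) in which it is computed.

```r
library(parallel)  # provides nextRNGStream()

RNGkind("L'Ecuyer-CMRG")
set.seed(1)

# One stream seed per replicate (not per node).
n_rep <- 4
streams <- vector("list", n_rep)
s <- .Random.seed
for (i in seq_len(n_rep)) {
  streams[[i]] <- s
  s <- nextRNGStream(s)
}

# Hypothetical stand-in for one replicate: reset the RNG to the stream
# identified by the replicate number, then draw the random numbers.
run_replicate <- function(i, n = 3) {
  assign(".Random.seed", streams[[i]], envir = globalenv())
  rnorm(n)
}

# Compute the replicates in two different orders, as could happen when
# load balancing assigns them to nodes differently from run to run.
in_order <- lapply(1:n_rep, run_replicate)
reversed <- rev(lapply(n_rep:1, run_replicate))
same <- identical(in_order, reversed)
```

Here same should be TRUE, because each replicate starts from its own stream regardless of what was computed before it.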
performParallel is a wrapper function for clusterApplyFT,
and we recommend using it rather than calling clusterApplyFT
directly. It creates a cluster of count nodes,
calls initfun on all nodes and initializes the RNG. Then it calls
clusterApplyFT. After the computation is finished, it calls
exitfun on all nodes and stops the cluster.
clusterCallpart calls a function fun with identical arguments ...
on nodes specified by indices nodes in the cluster cl
and returns a list of the results.
clusterEvalQpart evaluates a literal expression on nodes
specified by indices nodes.
clusterApplyFT returns a list of two elements. The first
one is a list (of length length(x)) of results, the second one is the
(possibly updated) cluster object.
performParallel returns a list of results.
Hana Sevcikova
## Not run:
# generates n normally distributed random numbers in r replicates
# on p nodes and prints their mean after each r/10 replicate.
printfun <- function(res, n, args=NULL) {
  res <- unlist(res)
  res <- res[!is.null(res)]
  print(paste("mean after:", n, "replicates:", mean(res),
              "(from", length(res), "RNs)"))
}
r <- 1000; n <- 100; p <- 5
res <- performParallel(p, rep(n,r), fun=rnorm, gentype="RNGstream",
                       seed=rep(1,6), printfun=printfun)
## End(Not run)